<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/sys.c, branch v6.8.12</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.8.12</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.8.12'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2024-04-03T13:32:37Z</updated>
<entry>
<title>prctl: generalize PR_SET_MDWE support check to be per-arch</title>
<updated>2024-04-03T13:32:37Z</updated>
<author>
<name>Zev Weiss</name>
<email>zev@bewilderbeest.net</email>
</author>
<published>2024-02-27T01:35:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9699d986251d9845d7fb60dff75e2c741685d96e'/>
<id>urn:sha1:9699d986251d9845d7fb60dff75e2c741685d96e</id>
<content type='text'>
commit d5aad4c2ca057e760a92a9a7d65bd38d72963f27 upstream.

Patch series "ARM: prctl: Reject PR_SET_MDWE where not supported".

I noticed after a recent kernel update that my ARM926 system started
segfaulting on any execve() after calling prctl(PR_SET_MDWE).  After some
investigation it appears that ARMv5 is incapable of providing the
appropriate protections for MDWE, since any readable memory is also
implicitly executable.

The prctl_set_mdwe() function already had some special-case logic added
disabling it on PARISC (commit 793838138c15, "prctl: Disable
prctl(PR_SET_MDWE) on parisc"); this patch series (1) generalizes that
check to use an arch_*() function, and (2) adds a corresponding override
for ARM to disable MDWE on pre-ARMv6 CPUs.

With the series applied, prctl(PR_SET_MDWE) is rejected on ARMv5 and
subsequent execve() calls (as well as mmap(PROT_READ|PROT_WRITE)) can
succeed instead of unconditionally failing; on ARMv6 the prctl works as it
did previously.

[0] https://lore.kernel.org/all/2023112456-linked-nape-bf19@gregkh/


This patch (of 2):

There exist systems other than PARISC where MDWE may not be feasible to
support; rather than cluttering up the generic code with additional
arch-specific logic let's add a generic function for checking MDWE support
and allow each arch to override it as needed.

Link: https://lkml.kernel.org/r/20240227013546.15769-4-zev@bewilderbeest.net
Link: https://lkml.kernel.org/r/20240227013546.15769-5-zev@bewilderbeest.net
Signed-off-by: Zev Weiss &lt;zev@bewilderbeest.net&gt;
Acked-by: Helge Deller &lt;deller@gmx.de&gt;	[parisc]
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Florent Revest &lt;revest@chromium.org&gt;
Cc: "James E.J. Bottomley" &lt;James.Bottomley@HansenPartnership.com&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Cc: Mike Rapoport (IBM) &lt;rppt@kernel.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Ondrej Mosnacek &lt;omosnace@redhat.com&gt;
Cc: Rick Edgecombe &lt;rick.p.edgecombe@intel.com&gt;
Cc: Russell King (Oracle) &lt;linux@armlinux.org.uk&gt;
Cc: Sam James &lt;sam@gentoo.org&gt;
Cc: Stefan Roesch &lt;shr@devkernel.io&gt;
Cc: Yang Shi &lt;yang@os.amperecomputing.com&gt;
Cc: Yin Fengwei &lt;fengwei.yin@intel.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;	[6.3+]
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>getrusage: use sig-&gt;stats_lock rather than lock_task_sighand()</title>
<updated>2024-02-08T05:20:32Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2024-01-22T15:50:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f7ec1cd5cc7ef3ad964b677ba82b8b77f1c93009'/>
<id>urn:sha1:f7ec1cd5cc7ef3ad964b677ba82b8b77f1c93009</id>
<content type='text'>
lock_task_sighand() can trigger a hard lockup. If NR_CPUS threads call
getrusage() at the same time and the process has NR_THREADS, spin_lock_irq
will spin with irqs disabled O(NR_CPUS * NR_THREADS) time.

Change getrusage() to use sig-&gt;stats_lock, it was specifically designed
for this type of use. This way it runs lockless in the likely case.

TODO:
	- Change do_task_stat() to use sig-&gt;stats_lock too, then we can
	  remove spin_lock_irq(siglock) in wait_task_zombie().

	- Turn sig-&gt;stats_lock into seqcount_rwlock_t, this way the
	  readers in the slow mode won't exclude each other. See
	  https://lore.kernel.org/all/20230913154907.GA26210@redhat.com/

	- stats_lock has to disable irqs because -&gt;siglock can be taken
	  in irq context, it would be very nice to change __exit_signal()
	  to avoid the siglock-&gt;stats_lock dependency.

Link: https://lkml.kernel.org/r/20240122155053.GA26214@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reported-by: Dylan Hatch &lt;dylanbhatch@google.com&gt;
Tested-by: Dylan Hatch &lt;dylanbhatch@google.com&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>getrusage: move thread_group_cputime_adjusted() outside of lock_task_sighand()</title>
<updated>2024-02-08T05:20:32Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2024-01-22T15:50:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=daa694e4137571b4ebec330f9a9b4d54aa8b8089'/>
<id>urn:sha1:daa694e4137571b4ebec330f9a9b4d54aa8b8089</id>
<content type='text'>
Patch series "getrusage: use sig-&gt;stats_lock", v2.


This patch (of 2):

thread_group_cputime() does its own locking, we can safely shift
thread_group_cputime_adjusted() which does another for_each_thread loop
outside of -&gt;siglock protected section.

This is also preparation for the next patch which changes getrusage() to
use stats_lock instead of siglock, thread_group_cputime() takes the same
lock.  With the current implementation recursive read_seqbegin_or_lock()
is fine, thread_group_cputime() can't enter the slow mode if the caller
holds stats_lock, yet this looks more safe and better performance-wise.

Link: https://lkml.kernel.org/r/20240122155023.GA26169@redhat.com
Link: https://lkml.kernel.org/r/20240122155050.GA26205@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reported-by: Dylan Hatch &lt;dylanbhatch@google.com&gt;
Tested-by: Dylan Hatch &lt;dylanbhatch@google.com&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>prctl: Disable prctl(PR_SET_MDWE) on parisc</title>
<updated>2023-11-18T18:35:31Z</updated>
<author>
<name>Helge Deller</name>
<email>deller@gmx.de</email>
</author>
<published>2023-11-18T18:33:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=793838138c157d4c49f4fb744b170747e3dabf58'/>
<id>urn:sha1:793838138c157d4c49f4fb744b170747e3dabf58</id>
<content type='text'>
systemd-254 tries to use prctl(PR_SET_MDWE) for it's MemoryDenyWriteExecute
functionality, but fails on parisc which still needs executable stacks in
certain combinations of gcc/glibc/kernel.

Disable prctl(PR_SET_MDWE) by returning -EINVAL for now on parisc, until
userspace has catched up.

Signed-off-by: Helge Deller &lt;deller@gmx.de&gt;
Co-developed-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Reported-by: Sam James &lt;sam@gentoo.org&gt;
Closes: https://github.com/systemd/systemd/issues/29775
Tested-by: Sam James &lt;sam@gentoo.org&gt;
Link: https://lore.kernel.org/all/875y2jro9a.fsf@gentoo.org/
Cc: &lt;stable@vger.kernel.org&gt; # v6.3+
</content>
</entry>
<entry>
<title>Merge tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm</title>
<updated>2023-11-03T06:53:31Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-11-03T06:53:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8f6f76a6a29f36d2f3e4510d0bde5046672f6924'/>
<id>urn:sha1:8f6f76a6a29f36d2f3e4510d0bde5046672f6924</id>
<content type='text'>
Pull non-MM updates from Andrew Morton:
 "As usual, lots of singleton and doubleton patches all over the tree
  and there's little I can say which isn't in the individual changelogs.

  The lengthier patch series are

   - 'kdump: use generic functions to simplify crashkernel reservation
     in arch', from Baoquan He. This is mainly cleanups and
     consolidation of the 'crashkernel=' kernel parameter handling

   - After much discussion, David Laight's 'minmax: Relax type checks in
     min() and max()' is here. Hopefully reduces some typecasting and
     the use of min_t() and max_t()

   - A group of patches from Oleg Nesterov which clean up and slightly
     fix our handling of reads from /proc/PID/task/... and which remove
     task_struct.thread_group"

* tag 'mm-nonmm-stable-2023-11-02-14-08' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (64 commits)
  scripts/gdb/vmalloc: disable on no-MMU
  scripts/gdb: fix usage of MOD_TEXT not defined when CONFIG_MODULES=n
  .mailmap: add address mapping for Tomeu Vizoso
  mailmap: update email address for Claudiu Beznea
  tools/testing/selftests/mm/run_vmtests.sh: lower the ptrace permissions
  .mailmap: map Benjamin Poirier's address
  scripts/gdb: add lx_current support for riscv
  ocfs2: fix a spelling typo in comment
  proc: test ProtectionKey in proc-empty-vm test
  proc: fix proc-empty-vm test with vsyscall
  fs/proc/base.c: remove unneeded semicolon
  do_io_accounting: use sig-&gt;stats_lock
  do_io_accounting: use __for_each_thread()
  ocfs2: replace BUG_ON() at ocfs2_num_free_extents() with ocfs2_error()
  ocfs2: fix a typo in a comment
  scripts/show_delta: add __main__ judgement before main code
  treewide: mark stuff as __ro_after_init
  fs: ocfs2: check status values
  proc: test /proc/${pid}/statm
  compiler.h: move __is_constexpr() to compiler.h
  ...
</content>
</entry>
<entry>
<title>mm: add a NO_INHERIT flag to the PR_SET_MDWE prctl</title>
<updated>2023-10-06T21:44:11Z</updated>
<author>
<name>Florent Revest</name>
<email>revest@chromium.org</email>
</author>
<published>2023-08-28T15:08:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=24e41bf8a6b424c76c5902fb999e9eca61bdf83d'/>
<id>urn:sha1:24e41bf8a6b424c76c5902fb999e9eca61bdf83d</id>
<content type='text'>
This extends the current PR_SET_MDWE prctl arg with a bit to indicate that
the process doesn't want MDWE protection to propagate to children.

To implement this no-inherit mode, the tag in current-&gt;mm-&gt;flags must be
absent from MMF_INIT_MASK.  This means that the encoding for "MDWE but
without inherit" is different in the prctl than in the mm flags.  This
leads to a bit of bit-mangling in the prctl implementation.

Link: https://lkml.kernel.org/r/20230828150858.393570-6-revest@chromium.org
Signed-off-by: Florent Revest &lt;revest@chromium.org&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Alexey Izbyshev &lt;izbyshev@ispras.ru&gt;
Cc: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Cc: Ayush Jain &lt;ayush.jain3@amd.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Greg Thelen &lt;gthelen@google.com&gt;
Cc: Joey Gouly &lt;joey.gouly@arm.com&gt;
Cc: KP Singh &lt;kpsingh@kernel.org&gt;
Cc: Mark Brown &lt;broonie@kernel.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Cc: Szabolcs Nagy &lt;Szabolcs.Nagy@arm.com&gt;
Cc: Topi Miettinen &lt;toiwoton@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>getrusage: use __for_each_thread()</title>
<updated>2023-10-04T17:41:57Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2023-09-09T17:26:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=13b7bc60b5353371460a203df6c38ccd38ad7a3a'/>
<id>urn:sha1:13b7bc60b5353371460a203df6c38ccd38ad7a3a</id>
<content type='text'>
do/while_each_thread should be avoided when possible.

Plus this change allows to avoid lock_task_sighand(), we can use rcu
and/or sig-&gt;stats_lock instead.

Link: https://lkml.kernel.org/r/20230909172629.GA20454@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>getrusage: add the "signal_struct *sig" local variable</title>
<updated>2023-10-04T17:41:57Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2023-09-09T17:25:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c7ac8231ace9b07306d0299969e42073b189c70a'/>
<id>urn:sha1:c7ac8231ace9b07306d0299969e42073b189c70a</id>
<content type='text'>
No functional changes, cleanup/preparation.

Link: https://lkml.kernel.org/r/20230909172554.GA20441@redhat.com
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>prctl: move PR_GET_AUXV out of PR_MCE_KILL</title>
<updated>2023-07-17T19:53:21Z</updated>
<author>
<name>Miguel Ojeda</name>
<email>ojeda@kernel.org</email>
</author>
<published>2023-07-08T23:33:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=636e348353a7cc52609fdba5ff3270065da140d5'/>
<id>urn:sha1:636e348353a7cc52609fdba5ff3270065da140d5</id>
<content type='text'>
Somehow PR_GET_AUXV got added into PR_MCE_KILL's switch when the patch was
applied [1].

Thus move it out of the switch, to the place the patch added it.

In the recently released v6.4 kernel some user could, in principle, be
already using this feature by mapping the right page and passing the
PR_GET_AUXV constant as a pointer:

    prctl(PR_MCE_KILL, PR_GET_AUXV, ...)

So this does change the behavior for users.  We could keep the bug since
the other subcases in PR_MCE_KILL (PR_MCE_KILL_CLEAR and PR_MCE_KILL_SET)
do not overlap.

However, v6.4 may be recent enough (2 weeks old) that moving the lines
(rather than just adding a new case) does not break anybody?  Moreover,
the documentation in man-pages was just committed today [2].

Link: https://lkml.kernel.org/r/20230708233344.361854-1-ojeda@kernel.org
Fixes: ddc65971bb67 ("prctl: add PR_GET_AUXV to copy auxv to userspace")
Link: https://lore.kernel.org/all/d81864a7f7f43bca6afa2a09fc2e850e4050ab42.1680611394.git.josh@joshtriplett.org/ [1]
Link: https://git.kernel.org/pub/scm/docs/man-pages/man-pages.git/commit/?id=8cf0c06bfd3c2b219b044d4151c96f0da50af9ad [2]
Signed-off-by: Miguel Ojeda &lt;ojeda@kernel.org&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>riscv: Add prctl controls for userspace vector management</title>
<updated>2023-06-08T14:16:53Z</updated>
<author>
<name>Andy Chiu</name>
<email>andy.chiu@sifive.com</email>
</author>
<published>2023-06-05T11:07:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1fd96a3e9d5d4febe1a8486590ad52c048d1be77'/>
<id>urn:sha1:1fd96a3e9d5d4febe1a8486590ad52c048d1be77</id>
<content type='text'>
This patch add two riscv-specific prctls, to allow usespace control the
use of vector unit:

 * PR_RISCV_V_SET_CONTROL: control the permission to use Vector at next,
   or all following execve for a thread. Turning off a thread's Vector
   live is not possible since libraries may have registered ifunc that
   may execute Vector instructions.
 * PR_RISCV_V_GET_CONTROL: get the same permission setting for the
   current thread, and the setting for following execve(s).

Signed-off-by: Andy Chiu &lt;andy.chiu@sifive.com&gt;
Reviewed-by: Greentime Hu &lt;greentime.hu@sifive.com&gt;
Reviewed-by: Vincent Chen &lt;vincent.chen@sifive.com&gt;
Link: https://lore.kernel.org/r/20230605110724.21391-22-andy.chiu@sifive.com
Signed-off-by: Palmer Dabbelt &lt;palmer@rivosinc.com&gt;
</content>
</entry>
</feed>
