<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/sys.c, branch v4.18.11</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.18.11</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.18.11'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-09-09T08:32:39Z</updated>
<entry>
<title>sys: don't hold uts_sem while accessing userspace memory</title>
<updated>2018-09-09T08:32:39Z</updated>
<author>
<name>Jann Horn</name>
<email>jannh@google.com</email>
</author>
<published>2018-06-25T16:34:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1497f1e0714accef5334237f5bbe7e300290d799'/>
<id>urn:sha1:1497f1e0714accef5334237f5bbe7e300290d799</id>
<content type='text'>
commit 42a0cc3478584d4d63f68f2f5af021ddbea771fa upstream.

Holding uts_sem as a writer while accessing userspace memory allows a
namespace admin to stall all processes that attempt to take uts_sem.
Instead, move data through stack buffers and don't access userspace memory
while uts_sem is held.

Cc: stable@vger.kernel.org
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm: introduce arg_lock to protect arg_start|end and env_start|end in mm_struct</title>
<updated>2018-06-08T00:34:34Z</updated>
<author>
<name>Yang Shi</name>
<email>yang.shi@linux.alibaba.com</email>
</author>
<published>2018-06-08T00:05:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=88aa7cc688d48ddd84558b41d5905a0db9535c4b'/>
<id>urn:sha1:88aa7cc688d48ddd84558b41d5905a0db9535c4b</id>
<content type='text'>
mmap_sem is on the hot path of kernel, and it very contended, but it is
abused too.  It is used to protect arg_start|end and evn_start|end when
reading /proc/$PID/cmdline and /proc/$PID/environ, but it doesn't make
sense since those proc files just expect to read 4 values atomically and
not related to VM, they could be set to arbitrary values by C/R.

And, the mmap_sem contention may cause unexpected issue like below:

INFO: task ps:14018 blocked for more than 120 seconds.
       Tainted: G            E 4.9.79-009.ali3000.alios7.x86_64 #1
 "echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this
message.
 ps              D    0 14018      1 0x00000004
 Call Trace:
   schedule+0x36/0x80
   rwsem_down_read_failed+0xf0/0x150
   call_rwsem_down_read_failed+0x18/0x30
   down_read+0x20/0x40
   proc_pid_cmdline_read+0xd9/0x4e0
   __vfs_read+0x37/0x150
   vfs_read+0x96/0x130
   SyS_read+0x55/0xc0
   entry_SYSCALL_64_fastpath+0x1a/0xc5

Both Alexey Dobriyan and Michal Hocko suggested to use dedicated lock
for them to mitigate the abuse of mmap_sem.

So, introduce a new spinlock in mm_struct to protect the concurrent
access to arg_start|end, env_start|end and others, as well as replace
write map_sem to read to protect the race condition between prctl and
sys_brk which might break check_data_rlimit(), and makes prctl more
friendly to other VM operations.

This patch just eliminates the abuse of mmap_sem, but it can't resolve
the above hung task warning completely since the later
access_remote_vm() call needs acquire mmap_sem.  The mmap_sem
scalability issue will be solved in the future.

[yang.shi@linux.alibaba.com: add comment about mmap_sem and arg_lock]
  Link: http://lkml.kernel.org/r/1524077799-80690-1-git-send-email-yang.shi@linux.alibaba.com
Link: http://lkml.kernel.org/r/1523730291-109696-1-git-send-email-yang.shi@linux.alibaba.com
Signed-off-by: Yang Shi &lt;yang.shi@linux.alibaba.com&gt;
Reviewed-by: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Alexey Dobriyan &lt;adobriyan@gmail.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Mateusz Guzik &lt;mguzik@redhat.com&gt;
Cc: Kirill Tkhai &lt;ktkhai@virtuozzo.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>kernel/sys.c: fix potential Spectre v1 issue</title>
<updated>2018-05-26T01:12:11Z</updated>
<author>
<name>Gustavo A. R. Silva</name>
<email>gustavo@embeddedor.com</email>
</author>
<published>2018-05-25T21:47:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=23d6aef74da86a33fa6bb75f79565e0a16ee97c2'/>
<id>urn:sha1:23d6aef74da86a33fa6bb75f79565e0a16ee97c2</id>
<content type='text'>
`resource' can be controlled by user-space, hence leading to a potential
exploitation of the Spectre variant 1 vulnerability.

This issue was detected with the help of Smatch:

  kernel/sys.c:1474 __do_compat_sys_old_getrlimit() warn: potential spectre issue 'get_current()-&gt;signal-&gt;rlim' (local cap)
  kernel/sys.c:1455 __do_sys_old_getrlimit() warn: potential spectre issue 'get_current()-&gt;signal-&gt;rlim' (local cap)

Fix this by sanitizing *resource* before using it to index
current-&gt;signal-&gt;rlim

Notice that given that speculation windows are large, the policy is to
kill the speculation on the first load and not worry if it can be
completed with a dependent load/store [1].

[1] https://marc.info/?l=linux-kernel&amp;m=152449131114778&amp;w=2

Link: http://lkml.kernel.org/r/20180515030038.GA11822@embeddedor.com
Signed-off-by: Gustavo A. R. Silva &lt;gustavo@embeddedor.com&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Cc: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>nospec: Allow getting/setting on non-current task</title>
<updated>2018-05-03T11:55:51Z</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2018-05-01T22:19:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7bbf1373e228840bb0295a2ca26d548ef37f448e'/>
<id>urn:sha1:7bbf1373e228840bb0295a2ca26d548ef37f448e</id>
<content type='text'>
Adjust arch_prctl_get/set_spec_ctrl() to operate on tasks other than
current.

This is needed both for /proc/$pid/status queries and for seccomp (since
thread-syncing can trigger seccomp in non-current threads).

Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
</entry>
<entry>
<title>prctl: Add speculation control prctls</title>
<updated>2018-05-03T11:55:50Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2018-04-29T13:20:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b617cfc858161140d69cc0b5cc211996b557a1c7'/>
<id>urn:sha1:b617cfc858161140d69cc0b5cc211996b557a1c7</id>
<content type='text'>
Add two new prctls to control aspects of speculation related vulnerabilites
and their mitigations to provide finer grained control over performance
impacting mitigations.

PR_GET_SPECULATION_CTRL returns the state of the speculation misfeature
which is selected with arg2 of prctl(2). The return value uses bit 0-2 with
the following meaning:

Bit  Define           Description
0    PR_SPEC_PRCTL    Mitigation can be controlled per task by
                      PR_SET_SPECULATION_CTRL
1    PR_SPEC_ENABLE   The speculation feature is enabled, mitigation is
                      disabled
2    PR_SPEC_DISABLE  The speculation feature is disabled, mitigation is
                      enabled

If all bits are 0 the CPU is not affected by the speculation misfeature.

If PR_SPEC_PRCTL is set, then the per task control of the mitigation is
available. If not set, prctl(PR_SET_SPECULATION_CTRL) for the speculation
misfeature will fail.

PR_SET_SPECULATION_CTRL allows to control the speculation misfeature, which
is selected by arg2 of prctl(2) per task. arg3 is used to hand in the
control value, i.e. either PR_SPEC_ENABLE or PR_SPEC_DISABLE.

The common return values are:

EINVAL  prctl is not implemented by the architecture or the unused prctl()
        arguments are not 0
ENODEV  arg2 is selecting a not supported speculation misfeature

PR_SET_SPECULATION_CTRL has these additional return values:

ERANGE  arg3 is incorrect, i.e. it's not either PR_SPEC_ENABLE or PR_SPEC_DISABLE
ENXIO   prctl control of the selected speculation misfeature is disabled

The first supported controlable speculation misfeature is
PR_SPEC_STORE_BYPASS. Add the define so this can be shared between
architectures.

Based on an initial patch from Tim Chen and mostly rewritten.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Reviewed-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
</content>
</entry>
<entry>
<title>kernel: add ksys_setsid() helper; remove in-kernel call to sys_setsid()</title>
<updated>2018-04-02T18:16:06Z</updated>
<author>
<name>Dominik Brodowski</name>
<email>linux@dominikbrodowski.net</email>
</author>
<published>2018-03-16T11:36:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e2aaa9f423367ee03755d632555c242629a08d00'/>
<id>urn:sha1:e2aaa9f423367ee03755d632555c242629a08d00</id>
<content type='text'>
Using this helper allows us to avoid the in-kernel call to the
sys_setsid() syscall. The ksys_ prefix denotes that this function
is meant as a drop-in replacement for the syscall. In particular, it
uses the same calling convention as sys_setsid().

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Dominik Brodowski &lt;linux@dominikbrodowski.net&gt;
</content>
</entry>
<entry>
<title>kernel: provide ksys_*() wrappers for syscalls called by kernel/uid16.c</title>
<updated>2018-04-02T18:15:30Z</updated>
<author>
<name>Dominik Brodowski</name>
<email>linux@dominikbrodowski.net</email>
</author>
<published>2018-03-19T17:09:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e530dca584a9aa4aedf26adf0ed3c1c9b80e2767'/>
<id>urn:sha1:e530dca584a9aa4aedf26adf0ed3c1c9b80e2767</id>
<content type='text'>
Using these helpers allows us to avoid the in-kernel calls to these
syscalls: sys_setregid(), sys_setgid(), sys_setreuid(), sys_setuid(),
sys_setresuid(), sys_setresgid(), sys_setfsuid(), and sys_setfsgid().

The ksys_ prefix denotes that these function are meant as a drop-in
replacement for the syscall. In particular, they use the same calling
convention.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Dominik Brodowski &lt;linux@dominikbrodowski.net&gt;
</content>
</entry>
<entry>
<title>kernel: add do_getpgid() helper; remove internal call to sys_getpgid()</title>
<updated>2018-04-02T18:15:28Z</updated>
<author>
<name>Dominik Brodowski</name>
<email>linux@dominikbrodowski.net</email>
</author>
<published>2018-03-11T10:34:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=192c58073d9abb1c507c89f109da5dc9f130ba70'/>
<id>urn:sha1:192c58073d9abb1c507c89f109da5dc9f130ba70</id>
<content type='text'>
Using the do_getpgid() helper removes an in-kernel call to the
sys_getpgid() syscall.

This patch is part of a series which removes in-kernel calls to syscalls.
On this basis, the syscall entry path can be streamlined. For details, see
http://lkml.kernel.org/r/20180325162527.GA17492@light.dominikbrodowski.net

Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Signed-off-by: Dominik Brodowski &lt;linux@dominikbrodowski.net&gt;
</content>
</entry>
<entry>
<title>fix typo in assignment of fs default overflow gid</title>
<updated>2017-12-14T22:01:45Z</updated>
<author>
<name>Wolffhardt Schwabe</name>
<email>wolffhardt.schwabe@fau.de</email>
</author>
<published>2017-12-14T18:13:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8b2770a4e1c7a837e1889d4f00026e999da5faa0'/>
<id>urn:sha1:8b2770a4e1c7a837e1889d4f00026e999da5faa0</id>
<content type='text'>
The patch remains without practical effect since both macros carry
identical values.  Still, it might become a problem in the future if
(for whatever reason) the default overflow uid and gid differ.  The
DEFAULT_FS_OVERFLOWGID macro was previously unused.

Signed-off-by: Wolffhardt Schwabe &lt;wolffhardt.schwabe@fau.de&gt;
Signed-off-by: Anatoliy Cherepantsev &lt;anatoliy.cherepantsev@fau.de&gt;
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux</title>
<updated>2017-11-15T18:56:56Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-11-15T18:56:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c9b012e5f4a1d01dfa8abc6318211a67ba7d5db2'/>
<id>urn:sha1:c9b012e5f4a1d01dfa8abc6318211a67ba7d5db2</id>
<content type='text'>
Pull arm64 updates from Will Deacon:
 "The big highlight is support for the Scalable Vector Extension (SVE)
  which required extensive ABI work to ensure we don't break existing
  applications by blowing away their signal stack with the rather large
  new vector context (&lt;= 2 kbit per vector register). There's further
  work to be done optimising things like exception return, but the ABI
  is solid now.

  Much of the line count comes from some new PMU drivers we have, but
  they're pretty self-contained and I suspect we'll have more of them in
  future.

  Plenty of acronym soup here:

   - initial support for the Scalable Vector Extension (SVE)

   - improved handling for SError interrupts (required to handle RAS
     events)

   - enable GCC support for 128-bit integer types

   - remove kernel text addresses from backtraces and register dumps

   - use of WFE to implement long delay()s

   - ACPI IORT updates from Lorenzo Pieralisi

   - perf PMU driver for the Statistical Profiling Extension (SPE)

   - perf PMU driver for Hisilicon's system PMUs

   - misc cleanups and non-critical fixes"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (97 commits)
  arm64: Make ARMV8_DEPRECATED depend on SYSCTL
  arm64: Implement __lshrti3 library function
  arm64: support __int128 on gcc 5+
  arm64/sve: Add documentation
  arm64/sve: Detect SVE and activate runtime support
  arm64/sve: KVM: Hide SVE from CPU features exposed to guests
  arm64/sve: KVM: Treat guest SVE use as undefined instruction execution
  arm64/sve: KVM: Prevent guests from using SVE
  arm64/sve: Add sysctl to set the default vector length for new processes
  arm64/sve: Add prctl controls for userspace vector length management
  arm64/sve: ptrace and ELF coredump support
  arm64/sve: Preserve SVE registers around EFI runtime service calls
  arm64/sve: Preserve SVE registers around kernel-mode NEON use
  arm64/sve: Probe SVE capabilities and usable vector lengths
  arm64: cpufeature: Move sys_caps_initialised declarations
  arm64/sve: Backend logic for setting the vector length
  arm64/sve: Signal handling support
  arm64/sve: Support vector length resetting for new processes
  arm64/sve: Core task context handling
  arm64/sve: Low-level CPU setup
  ...
</content>
</entry>
</feed>
