<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/seccomp.c, branch v5.10.32</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.32</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.32'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2021-03-04T10:38:32Z</updated>
<entry>
<title>seccomp: Add missing return in non-void function</title>
<updated>2021-03-04T10:38:32Z</updated>
<author>
<name>Paul Cercueil</name>
<email>paul@crapouillou.net</email>
</author>
<published>2021-01-11T17:28:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b506450ce3d904386c1bd7e33c74f9343548d2d8'/>
<id>urn:sha1:b506450ce3d904386c1bd7e33c74f9343548d2d8</id>
<content type='text'>
commit 04b38d012556199ba4c31195940160e0c44c64f0 upstream.

We don't actually care about the value, since the kernel will panic
before that; but a value should nonetheless be returned, otherwise the
compiler will complain.

Fixes: 8112c4f140fa ("seccomp: remove 2-phase API")
Cc: stable@vger.kernel.org # 4.7+
Signed-off-by: Paul Cercueil &lt;paul@crapouillou.net&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20210111172839.640914-1-paul@crapouillou.net
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>seccomp: Set PF_SUPERPRIV when checking capability</title>
<updated>2020-11-17T20:53:22Z</updated>
<author>
<name>Mickaël Salaün</name>
<email>mic@linux.microsoft.com</email>
</author>
<published>2020-10-30T12:38:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fb14528e443646dd3fd02df4437fcf5265b66baa'/>
<id>urn:sha1:fb14528e443646dd3fd02df4437fcf5265b66baa</id>
<content type='text'>
Replace the use of security_capable(current_cred(), ...) with
ns_capable_noaudit() which set PF_SUPERPRIV.

Since commit 98f368e9e263 ("kernel: Add noaudit variant of
ns_capable()"), a new ns_capable_noaudit() helper is available.  Let's
use it!

Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Tyler Hicks &lt;tyhicks@linux.microsoft.com&gt;
Cc: Will Drewry &lt;wad@chromium.org&gt;
Cc: stable@vger.kernel.org
Fixes: e2cfabdfd075 ("seccomp: add system call filtering using BPF")
Signed-off-by: Mickaël Salaün &lt;mic@linux.microsoft.com&gt;
Reviewed-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20201030123849.770769-3-mic@digikod.net
</content>
</entry>
<entry>
<title>seccomp: Make duplicate listener detection non-racy</title>
<updated>2020-10-08T20:17:47Z</updated>
<author>
<name>Jann Horn</name>
<email>jannh@google.com</email>
</author>
<published>2020-10-05T01:44:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dfe719fef03d752f1682fa8aeddf30ba501c8555'/>
<id>urn:sha1:dfe719fef03d752f1682fa8aeddf30ba501c8555</id>
<content type='text'>
Currently, init_listener() tries to prevent adding a filter with
SECCOMP_FILTER_FLAG_NEW_LISTENER if one of the existing filters already
has a listener. However, this check happens without holding any lock that
would prevent another thread from concurrently installing a new filter
(potentially with a listener) on top of the ones we already have.

Theoretically, this is also a data race: The plain load from
current-&gt;seccomp.filter can race with concurrent writes to the same
location.

Fix it by moving the check into the region that holds the siglock to guard
against concurrent TSYNC.

(The "Fixes" tag points to the commit that introduced the theoretical
data race; concurrent installation of another filter with TSYNC only
became possible later, in commit 51891498f2da ("seccomp: allow TSYNC and
USER_NOTIF together").)

Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
Reviewed-by: Tycho Andersen &lt;tycho@tycho.pizza&gt;
Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201005014401.490175-1-jannh@google.com
</content>
</entry>
<entry>
<title>seccomp: Use current_pt_regs() instead of task_pt_regs(current)</title>
<updated>2020-09-08T23:26:45Z</updated>
<author>
<name>Denis Efremov</name>
<email>efremov@linux.com</email>
</author>
<published>2020-08-24T12:59:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2d9ca267a944c2b6ed5b4d750b1cf0407b6262b4'/>
<id>urn:sha1:2d9ca267a944c2b6ed5b4d750b1cf0407b6262b4</id>
<content type='text'>
As described in commit a3460a59747c ("new helper: current_pt_regs()"):
- arch versions are "optimized versions".
- some architectures have task_pt_regs() working only for traced tasks
  blocked on signal delivery. current_pt_regs() needs to work for *all*
  processes.

In preparation for adding a coccinelle rule for using current_*(), instead
of raw accesses to current members, modify seccomp_do_user_notification(),
__seccomp_filter(), __secure_computing() to use current_pt_regs().

Signed-off-by: Denis Efremov &lt;efremov@linux.com&gt;
Link: https://lore.kernel.org/r/20200824125921.488311-1-efremov@linux.com
[kees: Reworded commit log, add comment to populate_seccomp_data()]
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: kill process instead of thread for unknown actions</title>
<updated>2020-09-08T19:00:49Z</updated>
<author>
<name>Rich Felker</name>
<email>dalias@libc.org</email>
</author>
<published>2020-08-29T01:56:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4d671d922d51907bc41f1f7f2dc737c928ae78fd'/>
<id>urn:sha1:4d671d922d51907bc41f1f7f2dc737c928ae78fd</id>
<content type='text'>
Asynchronous termination of a thread outside of the userspace thread
library's knowledge is an unsafe operation that leaves the process in
an inconsistent, corrupt, and possibly unrecoverable state. In order
to make new actions that may be added in the future safe on kernels
not aware of them, change the default action from
SECCOMP_RET_KILL_THREAD to SECCOMP_RET_KILL_PROCESS.

Signed-off-by: Rich Felker &lt;dalias@libc.org&gt;
Link: https://lore.kernel.org/r/20200829015609.GA32566@brightrain.aerifal.cx
[kees: Fixed up coredump selection logic to match]
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: don't leave dangling -&gt;notif if file allocation fails</title>
<updated>2020-09-08T18:30:16Z</updated>
<author>
<name>Tycho Andersen</name>
<email>tycho@tycho.pizza</email>
</author>
<published>2020-09-02T14:09:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e839317900e9f13c83d8711d684de88c625b307a'/>
<id>urn:sha1:e839317900e9f13c83d8711d684de88c625b307a</id>
<content type='text'>
Christian and Kees both pointed out that this is a bit sloppy to open-code
both places, and Christian points out that we leave a dangling pointer to
-&gt;notif if file allocation fails. Since we check -&gt;notif for null in order
to determine if it's ok to install a filter, this means people won't be
able to install a filter if the file allocation fails for some reason, even
if they subsequently should be able to.

To fix this, let's hoist this free+null into its own little helper and use
it.

Reported-by: Kees Cook &lt;keescook@chromium.org&gt;
Reported-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Signed-off-by: Tycho Andersen &lt;tycho@tycho.pizza&gt;
Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Link: https://lore.kernel.org/r/20200902140953.1201956-1-tycho@tycho.pizza
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: don't leak memory when filter install races</title>
<updated>2020-09-08T18:19:50Z</updated>
<author>
<name>Tycho Andersen</name>
<email>tycho@tycho.pizza</email>
</author>
<published>2020-09-02T01:40:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a566a9012acd7c9a4be7e30dc7acb7a811ec2260'/>
<id>urn:sha1:a566a9012acd7c9a4be7e30dc7acb7a811ec2260</id>
<content type='text'>
In seccomp_set_mode_filter() with TSYNC | NEW_LISTENER, we first initialize
the listener fd, then check to see if we can actually use it later in
seccomp_may_assign_mode(), which can fail if anyone else in our thread
group has installed a filter and caused some divergence. If we can't, we
partially clean up the newly allocated file: we put the fd, put the file,
but don't actually clean up the *memory* that was allocated at
filter-&gt;notif. Let's clean that up too.

To accomplish this, let's hoist the actual "detach a notifier from a
filter" code to its own helper out of seccomp_notify_release(), so that in
case anyone adds stuff to init_listener(), they only have to add the
cleanup code in one spot. This does a bit of extra locking and such on the
failure path when the filter is not attached, but it's a slow failure path
anyway.

Fixes: 51891498f2da ("seccomp: allow TSYNC and USER_NOTIF together")
Reported-by: syzbot+3ad9614a12f80994c32e@syzkaller.appspotmail.com
Signed-off-by: Tycho Andersen &lt;tycho@tycho.pizza&gt;
Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Link: https://lore.kernel.org/r/20200902014017.934315-1-tycho@tycho.pizza
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: Introduce addfd ioctl to seccomp user notifier</title>
<updated>2020-07-14T23:29:42Z</updated>
<author>
<name>Sargun Dhillon</name>
<email>sargun@sargun.me</email>
</author>
<published>2020-06-03T01:10:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7cf97b12545503992020796c74bd84078eb39299'/>
<id>urn:sha1:7cf97b12545503992020796c74bd84078eb39299</id>
<content type='text'>
The current SECCOMP_RET_USER_NOTIF API allows for syscall supervision over
an fd. It is often used in settings where a supervising task emulates
syscalls on behalf of a supervised task in userspace, either to further
restrict the supervisee's syscall abilities or to circumvent kernel
enforced restrictions the supervisor deems safe to lift (e.g. actually
performing a mount(2) for an unprivileged container).

While SECCOMP_RET_USER_NOTIF allows for the interception of any syscall,
only a certain subset of syscalls could be correctly emulated. Over the
last few development cycles, the set of syscalls which can't be emulated
has been reduced due to the addition of pidfd_getfd(2). With this we are
now able to, for example, intercept syscalls that require the supervisor
to operate on file descriptors of the supervisee such as connect(2).

However, syscalls that cause new file descriptors to be installed can not
currently be correctly emulated since there is no way for the supervisor
to inject file descriptors into the supervisee. This patch adds a
new addfd ioctl to remove this restriction by allowing the supervisor to
install file descriptors into the intercepted task. By implementing this
feature via seccomp the supervisor effectively instructs the supervisee
to install a set of file descriptors into its own file descriptor table
during the intercepted syscall. This way it is possible to intercept
syscalls such as open() or accept(), and install (or replace, like
dup2(2)) the supervisor's resulting fd into the supervisee. One
replacement use-case would be to redirect the stdout and stderr of a
supervisee into log file descriptors opened by the supervisor.

The ioctl handling is based on the discussions[1] of how Extensible
Arguments should interact with ioctls. Instead of building size into
the addfd structure, make it a function of the ioctl command (which
is how sizes are normally passed to ioctls). To support forward and
backward compatibility, just mask out the direction and size, and match
everything. The size (and any future direction) checks are done along
with copy_struct_from_user() logic.

As a note, the seccomp_notif_addfd structure is laid out based on 8-byte
alignment without requiring packing as there have been packing issues
with uapi highlighted before[2][3]. Although we could overload the
newfd field and use -1 to indicate that it is not to be used, doing
so requires changing the size of the fd field, and introduces struct
packing complexity.

[1]: https://lore.kernel.org/lkml/87o8w9bcaf.fsf@mid.deneb.enyo.de/
[2]: https://lore.kernel.org/lkml/a328b91d-fd8f-4f27-b3c2-91a9c45f18c0@rasmusvillemoes.dk/
[3]: https://lore.kernel.org/lkml/20200612104629.GA15814@ircssh-2.c.rugged-nimbus-611.internal

Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Cc: Tycho Andersen &lt;tycho@tycho.ws&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Robert Sesek &lt;rsesek@google.com&gt;
Cc: Chris Palmer &lt;palmer@google.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-api@vger.kernel.org
Suggested-by: Matt Denton &lt;mpdenton@google.com&gt;
Link: https://lore.kernel.org/r/20200603011044.7972-4-sargun@sargun.me
Signed-off-by: Sargun Dhillon &lt;sargun@sargun.me&gt;
Reviewed-by: Will Drewry &lt;wad@chromium.org&gt;
Co-developed-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: Use -1 marker for end of mode 1 syscall list</title>
<updated>2020-07-10T23:01:52Z</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2020-06-19T19:20:15Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fe4bfff86ec54773df3db79e8112e3b0f820c799'/>
<id>urn:sha1:fe4bfff86ec54773df3db79e8112e3b0f820c799</id>
<content type='text'>
The terminator for the mode 1 syscalls list was a 0, but that could be
a valid syscall number (e.g. x86_64 __NR_read). By luck, __NR_read was
listed first and the loop construct would not test it, so there was no
bug. However, this is fragile. Replace the terminator with -1 instead,
and make the variable name for mode 1 syscall lists more descriptive.

Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Will Drewry &lt;wad@chromium.org&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: Fix ioctl number for SECCOMP_IOCTL_NOTIF_ID_VALID</title>
<updated>2020-07-10T23:01:52Z</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2020-06-15T22:42:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=47e33c05f9f07cac3de833e531bcac9ae052c7ca'/>
<id>urn:sha1:47e33c05f9f07cac3de833e531bcac9ae052c7ca</id>
<content type='text'>
When SECCOMP_IOCTL_NOTIF_ID_VALID was first introduced it had the wrong
direction flag set. While this isn't a big deal as nothing currently
enforces these bits in the kernel, it should be defined correctly. Fix
the define and provide support for the old command until it is no longer
needed for backward compatibility.

Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
</feed>
