<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/seccomp.c, branch v5.10.170</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.170</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.170'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2021-08-18T06:59:06Z</updated>
<entry>
<title>seccomp: Fix setting loaded filter count during TSYNC</title>
<updated>2021-08-18T06:59:06Z</updated>
<author>
<name>Hsuan-Chi Kuo</name>
<email>hsuanchikuo@gmail.com</email>
</author>
<published>2021-03-04T23:37:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=561d13128bb825f4e628d03c4f64a2cddb958ac8'/>
<id>urn:sha1:561d13128bb825f4e628d03c4f64a2cddb958ac8</id>
<content type='text'>
commit b4d8a58f8dcfcc890f296696cadb76e77be44b5f upstream.

The desired behavior is to set the caller's filter count to thread's.
This value is reported via /proc, so this fixes the inaccurate count
exposed to userspace; it is not used for reference counting, etc.

Signed-off-by: Hsuan-Chi Kuo &lt;hsuanchikuo@gmail.com&gt;
Link: https://lore.kernel.org/r/20210304233708.420597-1-hsuanchikuo@gmail.com
Co-developed-by: Wiktor Garbacz &lt;wiktorg@google.com&gt;
Signed-off-by: Wiktor Garbacz &lt;wiktorg@google.com&gt;
Link: https://lore.kernel.org/lkml/20210810125158.329849-1-wiktorg@google.com
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: stable@vger.kernel.org
Fixes: c818c03b661c ("seccomp: Report number of loaded filters in /proc/$pid/status")
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>seccomp: Refactor notification handler to prepare for new semantics</title>
<updated>2021-06-03T07:00:31Z</updated>
<author>
<name>Sargun Dhillon</name>
<email>sargun@sargun.me</email>
</author>
<published>2021-05-17T19:39:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b71781c58982e9814c926e7e9c71744fbb8a2795'/>
<id>urn:sha1:b71781c58982e9814c926e7e9c71744fbb8a2795</id>
<content type='text'>
commit ddc473916955f7710d1eb17c1273d91c8622a9fe upstream.

This refactors the user notification code to have a do / while loop around
the completion condition. This has a small change in semantic, in that
previously we ignored addfd calls upon wakeup if the notification had been
responded to, but instead with the new change we check for an outstanding
addfd calls prior to returning to userspace.

Rodrigo Campos also identified a bug that can result in addfd causing
an early return, when the supervisor didn't actually handle the
syscall [1].

[1]: https://lore.kernel.org/lkml/20210413160151.3301-1-rodrigo@kinvolk.io/

Fixes: 7cf97b125455 ("seccomp: Introduce addfd ioctl to seccomp user notifier")
Signed-off-by: Sargun Dhillon &lt;sargun@sargun.me&gt;
Acked-by: Tycho Andersen &lt;tycho@tycho.pizza&gt;
Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Tested-by: Rodrigo Campos &lt;rodrigo@kinvolk.io&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20210517193908.3113-3-sargun@sargun.me
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>seccomp: Add missing return in non-void function</title>
<updated>2021-03-04T10:38:32Z</updated>
<author>
<name>Paul Cercueil</name>
<email>paul@crapouillou.net</email>
</author>
<published>2021-01-11T17:28:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b506450ce3d904386c1bd7e33c74f9343548d2d8'/>
<id>urn:sha1:b506450ce3d904386c1bd7e33c74f9343548d2d8</id>
<content type='text'>
commit 04b38d012556199ba4c31195940160e0c44c64f0 upstream.

We don't actually care about the value, since the kernel will panic
before that; but a value should nonetheless be returned, otherwise the
compiler will complain.

Fixes: 8112c4f140fa ("seccomp: remove 2-phase API")
Cc: stable@vger.kernel.org # 4.7+
Signed-off-by: Paul Cercueil &lt;paul@crapouillou.net&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20210111172839.640914-1-paul@crapouillou.net
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>seccomp: Set PF_SUPERPRIV when checking capability</title>
<updated>2020-11-17T20:53:22Z</updated>
<author>
<name>Mickaël Salaün</name>
<email>mic@linux.microsoft.com</email>
</author>
<published>2020-10-30T12:38:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fb14528e443646dd3fd02df4437fcf5265b66baa'/>
<id>urn:sha1:fb14528e443646dd3fd02df4437fcf5265b66baa</id>
<content type='text'>
Replace the use of security_capable(current_cred(), ...) with
ns_capable_noaudit() which set PF_SUPERPRIV.

Since commit 98f368e9e263 ("kernel: Add noaudit variant of
ns_capable()"), a new ns_capable_noaudit() helper is available.  Let's
use it!

Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Tyler Hicks &lt;tyhicks@linux.microsoft.com&gt;
Cc: Will Drewry &lt;wad@chromium.org&gt;
Cc: stable@vger.kernel.org
Fixes: e2cfabdfd075 ("seccomp: add system call filtering using BPF")
Signed-off-by: Mickaël Salaün &lt;mic@linux.microsoft.com&gt;
Reviewed-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Link: https://lore.kernel.org/r/20201030123849.770769-3-mic@digikod.net
</content>
</entry>
<entry>
<title>seccomp: Make duplicate listener detection non-racy</title>
<updated>2020-10-08T20:17:47Z</updated>
<author>
<name>Jann Horn</name>
<email>jannh@google.com</email>
</author>
<published>2020-10-05T01:44:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dfe719fef03d752f1682fa8aeddf30ba501c8555'/>
<id>urn:sha1:dfe719fef03d752f1682fa8aeddf30ba501c8555</id>
<content type='text'>
Currently, init_listener() tries to prevent adding a filter with
SECCOMP_FILTER_FLAG_NEW_LISTENER if one of the existing filters already
has a listener. However, this check happens without holding any lock that
would prevent another thread from concurrently installing a new filter
(potentially with a listener) on top of the ones we already have.

Theoretically, this is also a data race: The plain load from
current-&gt;seccomp.filter can race with concurrent writes to the same
location.

Fix it by moving the check into the region that holds the siglock to guard
against concurrent TSYNC.

(The "Fixes" tag points to the commit that introduced the theoretical
data race; concurrent installation of another filter with TSYNC only
became possible later, in commit 51891498f2da ("seccomp: allow TSYNC and
USER_NOTIF together").)

Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
Reviewed-by: Tycho Andersen &lt;tycho@tycho.pizza&gt;
Signed-off-by: Jann Horn &lt;jannh@google.com&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20201005014401.490175-1-jannh@google.com
</content>
</entry>
<entry>
<title>seccomp: Use current_pt_regs() instead of task_pt_regs(current)</title>
<updated>2020-09-08T23:26:45Z</updated>
<author>
<name>Denis Efremov</name>
<email>efremov@linux.com</email>
</author>
<published>2020-08-24T12:59:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2d9ca267a944c2b6ed5b4d750b1cf0407b6262b4'/>
<id>urn:sha1:2d9ca267a944c2b6ed5b4d750b1cf0407b6262b4</id>
<content type='text'>
As described in commit a3460a59747c ("new helper: current_pt_regs()"):
- arch versions are "optimized versions".
- some architectures have task_pt_regs() working only for traced tasks
  blocked on signal delivery. current_pt_regs() needs to work for *all*
  processes.

In preparation for adding a coccinelle rule for using current_*(), instead
of raw accesses to current members, modify seccomp_do_user_notification(),
__seccomp_filter(), __secure_computing() to use current_pt_regs().

Signed-off-by: Denis Efremov &lt;efremov@linux.com&gt;
Link: https://lore.kernel.org/r/20200824125921.488311-1-efremov@linux.com
[kees: Reworded commit log, add comment to populate_seccomp_data()]
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: kill process instead of thread for unknown actions</title>
<updated>2020-09-08T19:00:49Z</updated>
<author>
<name>Rich Felker</name>
<email>dalias@libc.org</email>
</author>
<published>2020-08-29T01:56:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4d671d922d51907bc41f1f7f2dc737c928ae78fd'/>
<id>urn:sha1:4d671d922d51907bc41f1f7f2dc737c928ae78fd</id>
<content type='text'>
Asynchronous termination of a thread outside of the userspace thread
library's knowledge is an unsafe operation that leaves the process in
an inconsistent, corrupt, and possibly unrecoverable state. In order
to make new actions that may be added in the future safe on kernels
not aware of them, change the default action from
SECCOMP_RET_KILL_THREAD to SECCOMP_RET_KILL_PROCESS.

Signed-off-by: Rich Felker &lt;dalias@libc.org&gt;
Link: https://lore.kernel.org/r/20200829015609.GA32566@brightrain.aerifal.cx
[kees: Fixed up coredump selection logic to match]
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: don't leave dangling -&gt;notif if file allocation fails</title>
<updated>2020-09-08T18:30:16Z</updated>
<author>
<name>Tycho Andersen</name>
<email>tycho@tycho.pizza</email>
</author>
<published>2020-09-02T14:09:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e839317900e9f13c83d8711d684de88c625b307a'/>
<id>urn:sha1:e839317900e9f13c83d8711d684de88c625b307a</id>
<content type='text'>
Christian and Kees both pointed out that this is a bit sloppy to open-code
both places, and Christian points out that we leave a dangling pointer to
-&gt;notif if file allocation fails. Since we check -&gt;notif for null in order
to determine if it's ok to install a filter, this means people won't be
able to install a filter if the file allocation fails for some reason, even
if they subsequently should be able to.

To fix this, let's hoist this free+null into its own little helper and use
it.

Reported-by: Kees Cook &lt;keescook@chromium.org&gt;
Reported-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Signed-off-by: Tycho Andersen &lt;tycho@tycho.pizza&gt;
Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Link: https://lore.kernel.org/r/20200902140953.1201956-1-tycho@tycho.pizza
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: don't leak memory when filter install races</title>
<updated>2020-09-08T18:19:50Z</updated>
<author>
<name>Tycho Andersen</name>
<email>tycho@tycho.pizza</email>
</author>
<published>2020-09-02T01:40:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a566a9012acd7c9a4be7e30dc7acb7a811ec2260'/>
<id>urn:sha1:a566a9012acd7c9a4be7e30dc7acb7a811ec2260</id>
<content type='text'>
In seccomp_set_mode_filter() with TSYNC | NEW_LISTENER, we first initialize
the listener fd, then check to see if we can actually use it later in
seccomp_may_assign_mode(), which can fail if anyone else in our thread
group has installed a filter and caused some divergence. If we can't, we
partially clean up the newly allocated file: we put the fd, put the file,
but don't actually clean up the *memory* that was allocated at
filter-&gt;notif. Let's clean that up too.

To accomplish this, let's hoist the actual "detach a notifier from a
filter" code to its own helper out of seccomp_notify_release(), so that in
case anyone adds stuff to init_listener(), they only have to add the
cleanup code in one spot. This does a bit of extra locking and such on the
failure path when the filter is not attached, but it's a slow failure path
anyway.

Fixes: 51891498f2da ("seccomp: allow TSYNC and USER_NOTIF together")
Reported-by: syzbot+3ad9614a12f80994c32e@syzkaller.appspotmail.com
Signed-off-by: Tycho Andersen &lt;tycho@tycho.pizza&gt;
Acked-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Link: https://lore.kernel.org/r/20200902014017.934315-1-tycho@tycho.pizza
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: Introduce addfd ioctl to seccomp user notifier</title>
<updated>2020-07-14T23:29:42Z</updated>
<author>
<name>Sargun Dhillon</name>
<email>sargun@sargun.me</email>
</author>
<published>2020-06-03T01:10:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7cf97b12545503992020796c74bd84078eb39299'/>
<id>urn:sha1:7cf97b12545503992020796c74bd84078eb39299</id>
<content type='text'>
The current SECCOMP_RET_USER_NOTIF API allows for syscall supervision over
an fd. It is often used in settings where a supervising task emulates
syscalls on behalf of a supervised task in userspace, either to further
restrict the supervisee's syscall abilities or to circumvent kernel
enforced restrictions the supervisor deems safe to lift (e.g. actually
performing a mount(2) for an unprivileged container).

While SECCOMP_RET_USER_NOTIF allows for the interception of any syscall,
only a certain subset of syscalls could be correctly emulated. Over the
last few development cycles, the set of syscalls which can't be emulated
has been reduced due to the addition of pidfd_getfd(2). With this we are
now able to, for example, intercept syscalls that require the supervisor
to operate on file descriptors of the supervisee such as connect(2).

However, syscalls that cause new file descriptors to be installed can not
currently be correctly emulated since there is no way for the supervisor
to inject file descriptors into the supervisee. This patch adds a
new addfd ioctl to remove this restriction by allowing the supervisor to
install file descriptors into the intercepted task. By implementing this
feature via seccomp the supervisor effectively instructs the supervisee
to install a set of file descriptors into its own file descriptor table
during the intercepted syscall. This way it is possible to intercept
syscalls such as open() or accept(), and install (or replace, like
dup2(2)) the supervisor's resulting fd into the supervisee. One
replacement use-case would be to redirect the stdout and stderr of a
supervisee into log file descriptors opened by the supervisor.

The ioctl handling is based on the discussions[1] of how Extensible
Arguments should interact with ioctls. Instead of building size into
the addfd structure, make it a function of the ioctl command (which
is how sizes are normally passed to ioctls). To support forward and
backward compatibility, just mask out the direction and size, and match
everything. The size (and any future direction) checks are done along
with copy_struct_from_user() logic.

As a note, the seccomp_notif_addfd structure is laid out based on 8-byte
alignment without requiring packing as there have been packing issues
with uapi highlighted before[2][3]. Although we could overload the
newfd field and use -1 to indicate that it is not to be used, doing
so requires changing the size of the fd field, and introduces struct
packing complexity.

[1]: https://lore.kernel.org/lkml/87o8w9bcaf.fsf@mid.deneb.enyo.de/
[2]: https://lore.kernel.org/lkml/a328b91d-fd8f-4f27-b3c2-91a9c45f18c0@rasmusvillemoes.dk/
[3]: https://lore.kernel.org/lkml/20200612104629.GA15814@ircssh-2.c.rugged-nimbus-611.internal

Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Cc: Tycho Andersen &lt;tycho@tycho.ws&gt;
Cc: Jann Horn &lt;jannh@google.com&gt;
Cc: Robert Sesek &lt;rsesek@google.com&gt;
Cc: Chris Palmer &lt;palmer@google.com&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-api@vger.kernel.org
Suggested-by: Matt Denton &lt;mpdenton@google.com&gt;
Link: https://lore.kernel.org/r/20200603011044.7972-4-sargun@sargun.me
Signed-off-by: Sargun Dhillon &lt;sargun@sargun.me&gt;
Reviewed-by: Will Drewry &lt;wad@chromium.org&gt;
Co-developed-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
</feed>
