<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/seccomp.c, branch v5.5.13</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.5.13</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.5.13'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2020-01-02T21:03:45Z</updated>
<entry>
<title>seccomp: Check that seccomp_notif is zeroed out by the user</title>
<updated>2020-01-02T21:03:45Z</updated>
<author>
<name>Sargun Dhillon</name>
<email>sargun@sargun.me</email>
</author>
<published>2019-12-29T06:24:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2882d53c9c6f3b8311d225062522f03772cf0179'/>
<id>urn:sha1:2882d53c9c6f3b8311d225062522f03772cf0179</id>
<content type='text'>
This patch is a small change in enforcement of the uapi for
SECCOMP_IOCTL_NOTIF_RECV ioctl. Specifically, the datastructure which
is passed (seccomp_notif) must be zeroed out. Previously any of its
members could be set to nonsense values, and we would ignore it.

This ensures all fields are set to their zero value.

Signed-off-by: Sargun Dhillon &lt;sargun@sargun.me&gt;
Reviewed-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Reviewed-by: Aleksa Sarai &lt;cyphar@cyphar.com&gt;
Acked-by: Tycho Andersen &lt;tycho@tycho.ws&gt;
Link: https://lore.kernel.org/r/20191229062451.9467-2-sargun@sargun.me
Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
Cc: stable@vger.kernel.org
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>seccomp: add SECCOMP_USER_NOTIF_FLAG_CONTINUE</title>
<updated>2019-10-10T21:45:51Z</updated>
<author>
<name>Christian Brauner</name>
<email>christian.brauner@ubuntu.com</email>
</author>
<published>2019-09-20T08:30:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fb3c5386b382d4097476ce9647260fc89b34afdb'/>
<id>urn:sha1:fb3c5386b382d4097476ce9647260fc89b34afdb</id>
<content type='text'>
This allows the seccomp notifier to continue a syscall. A positive
discussion about this feature was triggered by a post to the
ksummit-discuss mailing list (cf. [3]) and took place during KSummit
(cf. [1]) and again at the containers/checkpoint-restore
micro-conference at Linux Plumbers.

Recently we landed seccomp support for SECCOMP_RET_USER_NOTIF (cf. [4])
which enables a process (watchee) to retrieve an fd for its seccomp
filter. This fd can then be handed to another (usually more privileged)
process (watcher). The watcher will then be able to receive seccomp
messages about the syscalls having been performed by the watchee.

This feature is heavily used in some userspace workloads. For example,
it is currently used to intercept mknod() syscalls in user namespaces
aka in containers.
The mknod() syscall can be easily filtered based on dev_t. This allows
us to only intercept a very specific subset of mknod() syscalls.
Furthermore, mknod() is not possible in user namespaces toto coelo and
so intercepting and denying syscalls that are not in the whitelist on
accident is not a big deal. The watchee won't notice a difference.

In contrast to mknod(), a lot of other syscall we intercept (e.g.
setxattr()) cannot be easily filtered like mknod() because they have
pointer arguments. Additionally, some of them might actually succeed in
user namespaces (e.g. setxattr() for all "user.*" xattrs). Since we
currently cannot tell seccomp to continue from a user notifier we are
stuck with performing all of the syscalls in lieu of the container. This
is a huge security liability since it is extremely difficult to
correctly assume all of the necessary privileges of the calling task
such that the syscall can be successfully emulated without escaping
other additional security restrictions (think missing CAP_MKNOD for
mknod(), or MS_NODEV on a filesystem etc.). This can be solved by
telling seccomp to resume the syscall.

One thing that came up in the discussion was the problem that another
thread could change the memory after userspace has decided to let the
syscall continue which is a well known TOCTOU with seccomp which is
present in other ways already.
The discussion showed that this feature is already very useful for any
syscall without pointer arguments. For any accidentally intercepted
non-pointer syscall it is safe to continue.
For syscalls with pointer arguments there is a race but for any cautious
userspace and the main usec cases the race doesn't matter. The notifier
is intended to be used in a scenario where a more privileged watcher
supervises the syscalls of lesser privileged watchee to allow it to get
around kernel-enforced limitations by performing the syscall for it
whenever deemed save by the watcher. Hence, if a user tricks the watcher
into allowing a syscall they will either get a deny based on
kernel-enforced restrictions later or they will have changed the
arguments in such a way that they manage to perform a syscall with
arguments that they would've been allowed to do anyway.
In general, it is good to point out again, that the notifier fd was not
intended to allow userspace to implement a security policy but rather to
work around kernel security mechanisms in cases where the watcher knows
that a given action is safe to perform.

/* References */
[1]: https://linuxplumbersconf.org/event/4/contributions/560
[2]: https://linuxplumbersconf.org/event/4/contributions/477
[3]: https://lore.kernel.org/r/20190719093538.dhyopljyr5ns33qx@brauner.io
[4]: commit 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")

Co-developed-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Christian Brauner &lt;christian.brauner@ubuntu.com&gt;
Reviewed-by: Tycho Andersen &lt;tycho@tycho.ws&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Will Drewry &lt;wad@chromium.org&gt;
CC: Tyler Hicks &lt;tyhicks@canonical.com&gt;
Link: https://lore.kernel.org/r/20190920083007.11475-2-christian.brauner@ubuntu.com
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>signal: Remove the signal number and task parameters from force_sig_info</title>
<updated>2019-05-29T14:31:44Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2019-05-15T15:11:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a89e9b8abf82725e4ac96100e07c8104dbe8a240'/>
<id>urn:sha1:a89e9b8abf82725e4ac96100e07c8104dbe8a240</id>
<content type='text'>
force_sig_info always delivers to the current task and the signal
parameter always matches info.si_signo.  So remove those parameters to
make it a simpler less error prone interface, and to make it clear
that none of the callers are doing anything clever.

This guarantees that force_sig_info will not grow any new buggy
callers that attempt to call force_sig on a non-current task, or that
pass an signal number that does not match info.si_signo.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'audit-pr-20190507' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit</title>
<updated>2019-05-08T02:06:04Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-05-08T02:06:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=02aff8db6438ce29371fd9cd54c57213f4bb4536'/>
<id>urn:sha1:02aff8db6438ce29371fd9cd54c57213f4bb4536</id>
<content type='text'>
Pull audit updates from Paul Moore:
 "We've got a reasonably broad set of audit patches for the v5.2 merge
  window, the highlights are below:

   - The biggest change, and the source of all the arch/* changes, is
     the patchset from Dmitry to help enable some of the work he is
     doing around PTRACE_GET_SYSCALL_INFO.

     To be honest, including this in the audit tree is a bit of a
     stretch, but it does help move audit a little further along towards
     proper syscall auditing for all arches, and everyone else seemed to
     agree that audit was a "good" spot for this to land (or maybe they
     just didn't want to merge it? dunno.).

   - We can now audit time/NTP adjustments.

   - We continue the work to connect associated audit records into a
     single event"

* tag 'audit-pr-20190507' of git://git.kernel.org/pub/scm/linux/kernel/git/pcmoore/audit: (21 commits)
  audit: fix a memory leak bug
  ntp: Audit NTP parameters adjustment
  timekeeping: Audit clock adjustments
  audit: purge unnecessary list_empty calls
  audit: link integrity evm_write_xattrs record to syscall event
  syscall_get_arch: add "struct task_struct *" argument
  unicore32: define syscall_get_arch()
  Move EM_UNICORE to uapi/linux/elf-em.h
  nios2: define syscall_get_arch()
  nds32: define syscall_get_arch()
  Move EM_NDS32 to uapi/linux/elf-em.h
  m68k: define syscall_get_arch()
  hexagon: define syscall_get_arch()
  Move EM_HEXAGON to uapi/linux/elf-em.h
  h8300: define syscall_get_arch()
  c6x: define syscall_get_arch()
  arc: define syscall_get_arch()
  Move EM_ARCOMPACT and EM_ARCV2 to uapi/linux/elf-em.h
  audit: Make audit_log_cap and audit_copy_inode static
  audit: connect LOGIN record to its syscall record
  ...
</content>
</entry>
<entry>
<title>Merge branch 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security</title>
<updated>2019-05-07T15:39:54Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-05-07T15:39:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=78ee8b1b9b2fa1b51c51c42f3cffa0e12ad5f0ab'/>
<id>urn:sha1:78ee8b1b9b2fa1b51c51c42f3cffa0e12ad5f0ab</id>
<content type='text'>
Pull security subsystem updates from James Morris:
 "Just a few bugfixes and documentation updates"

* 'next-general' of git://git.kernel.org/pub/scm/linux/kernel/git/jmorris/linux-security:
  seccomp: fix up grammar in comment
  Revert "security: inode: fix a missing check for securityfs_create_file"
  Yama: mark function as static
  security: inode: fix a missing check for securityfs_create_file
  keys: safe concurrent user-&gt;{session,uid}_keyring access
  security: don't use RCU accessors for cred-&gt;session_keyring
  Yama: mark local symbols as static
  LSM: lsm_hooks.h: fix documentation format
  LSM: fix documentation for the shm_* hooks
  LSM: fix documentation for the sem_* hooks
  LSM: fix documentation for the msg_queue_* hooks
  LSM: fix documentation for the audit_* hooks
  LSM: fix documentation for the path_chmod hook
  LSM: fix documentation for the socket_getpeersec_dgram hook
  LSM: fix documentation for the task_setscheduler hook
  LSM: fix documentation for the socket_post_create hook
  LSM: fix documentation for the syslog hook
  LSM: fix documentation for sb_copy_data hook
</content>
</entry>
<entry>
<title>Merge tag 'seccomp-v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux</title>
<updated>2019-04-29T20:24:34Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-04-29T20:24:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=83a50840e72a5a964b4704fcdc2fbb2d771015ab'/>
<id>urn:sha1:83a50840e72a5a964b4704fcdc2fbb2d771015ab</id>
<content type='text'>
Pull seccomp fixes from Kees Cook:
 "Syzbot found a use-after-free bug in seccomp due to flags that should
  not be allowed to be used together.

  Tycho fixed this, I updated the self-tests, and the syzkaller PoC has
  been running for several days without triggering KASan (before this
  fix, it would reproduce). These patches have also been in -next for
  almost a week, just to be sure.

   - Add logic for making some seccomp flags exclusive (Tycho)

   - Update selftests for exclusivity testing (Kees)"

* tag 'seccomp-v5.1-rc8' of git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux:
  seccomp: Make NEW_LISTENER and TSYNC flags exclusive
  selftests/seccomp: Prepare for exclusive seccomp flags
</content>
</entry>
<entry>
<title>seccomp: Make NEW_LISTENER and TSYNC flags exclusive</title>
<updated>2019-04-25T22:55:58Z</updated>
<author>
<name>Tycho Andersen</name>
<email>tycho@tycho.ws</email>
</author>
<published>2019-03-06T20:14:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7a0df7fbc14505e2e2be19ed08654a09e1ed5bf6'/>
<id>urn:sha1:7a0df7fbc14505e2e2be19ed08654a09e1ed5bf6</id>
<content type='text'>
As the comment notes, the return codes for TSYNC and NEW_LISTENER
conflict, because they both return positive values, one in the case of
success and one in the case of error. So, let's disallow both of these
flags together.

While this is technically a userspace break, all the users I know
of are still waiting on me to land this feature in libseccomp, so I
think it'll be safe. Also, at present my use case doesn't require
TSYNC at all, so this isn't a big deal to disallow. If someone
wanted to support this, a path forward would be to add a new flag like
TSYNC_AND_LISTENER_YES_I_UNDERSTAND_THAT_TSYNC_WILL_JUST_RETURN_EAGAIN,
but the use cases are so different I don't see it really happening.

Finally, it's worth noting that this does actually fix a UAF issue: at the
end of seccomp_set_mode_filter(), we have:

        if (flags &amp; SECCOMP_FILTER_FLAG_NEW_LISTENER) {
                if (ret &lt; 0) {
                        listener_f-&gt;private_data = NULL;
                        fput(listener_f);
                        put_unused_fd(listener);
                } else {
                        fd_install(listener, listener_f);
                        ret = listener;
                }
        }
out_free:
        seccomp_filter_free(prepared);

But if ret &gt; 0 because TSYNC raced, we'll install the listener fd and then
free the filter out from underneath it, causing a UAF when the task closes
it or dies. This patch also switches the condition to be simply if (ret),
so that if someone does add the flag mentioned above, they won't have to
remember to fix this too.

Reported-by: syzbot+b562969adb2e04af3442@syzkaller.appspotmail.com
Fixes: 6a21cc50f0c7 ("seccomp: add a return code to trap to userspace")
CC: stable@vger.kernel.org # v5.0+
Signed-off-by: Tycho Andersen &lt;tycho@tycho.ws&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Acked-by: James Morris &lt;jamorris@linux.microsoft.com&gt;
</content>
</entry>
<entry>
<title>seccomp: fix up grammar in comment</title>
<updated>2019-04-23T23:21:12Z</updated>
<author>
<name>Tycho Andersen</name>
<email>tycho@tycho.ws</email>
</author>
<published>2019-03-06T20:14:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6beff00b79ca0b5caf0ce6fb8e11f57311bd95f8'/>
<id>urn:sha1:6beff00b79ca0b5caf0ce6fb8e11f57311bd95f8</id>
<content type='text'>
This sentence is kind of a train wreck anyway, but at least dropping the
extra pronoun helps somewhat.

Signed-off-by: Tycho Andersen &lt;tycho@tycho.ws&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: James Morris &lt;jamorris@linux.microsoft.com&gt;
</content>
</entry>
<entry>
<title>syscalls: Remove start and number from syscall_get_arguments() args</title>
<updated>2019-04-05T13:26:43Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2016-11-07T21:26:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b35f549df1d7520d37ba1e6d4a8d4df6bd52d136'/>
<id>urn:sha1:b35f549df1d7520d37ba1e6d4a8d4df6bd52d136</id>
<content type='text'>
At Linux Plumbers, Andy Lutomirski approached me and pointed out that the
function call syscall_get_arguments() implemented in x86 was horribly
written and not optimized for the standard case of passing in 0 and 6 for
the starting index and the number of system calls to get. When looking at
all the users of this function, I discovered that all instances pass in only
0 and 6 for these arguments. Instead of having this function handle
different cases that are never used, simply rewrite it to return the first 6
arguments of a system call.

This should help out the performance of tracing system calls by ptrace,
ftrace and perf.

Link: http://lkml.kernel.org/r/20161107213233.754809394@goodmis.org

Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Dominik Brodowski &lt;linux@dominikbrodowski.net&gt;
Cc: Dave Martin &lt;dave.martin@arm.com&gt;
Cc: "Dmitry V. Levin" &lt;ldv@altlinux.org&gt;
Cc: x86@kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: uclinux-h8-devel@lists.sourceforge.jp
Cc: linux-hexagon@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-mips@vger.kernel.org
Cc: nios2-dev@lists.rocketboards.org
Cc: openrisc@lists.librecores.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-riscv@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-um@lists.infradead.org
Cc: linux-xtensa@linux-xtensa.org
Cc: linux-arch@vger.kernel.org
Acked-by: Paul Burton &lt;paul.burton@mips.com&gt; # MIPS parts
Acked-by: Max Filippov &lt;jcmvbkbc@gmail.com&gt; # For xtensa changes
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt; # For the arm64 bits
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt; # for x86
Reviewed-by: Dmitry V. Levin &lt;ldv@altlinux.org&gt;
Reported-by: Andy Lutomirski &lt;luto@amacapital.net&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>syscall_get_arch: add "struct task_struct *" argument</title>
<updated>2019-03-21T01:12:36Z</updated>
<author>
<name>Dmitry V. Levin</name>
<email>ldv@altlinux.org</email>
</author>
<published>2019-03-17T23:30:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=16add411645cff83360086e102daa67b25f1e39a'/>
<id>urn:sha1:16add411645cff83360086e102daa67b25f1e39a</id>
<content type='text'>
This argument is required to extend the generic ptrace API with
PTRACE_GET_SYSCALL_INFO request: syscall_get_arch() is going
to be called from ptrace_request() along with syscall_get_nr(),
syscall_get_arguments(), syscall_get_error(), and
syscall_get_return_value() functions with a tracee as their argument.

The primary intent is that the triple (audit_arch, syscall_nr, arg1..arg6)
should describe what system call is being called and what its arguments
are.

Reverts: 5e937a9ae913 ("syscall_get_arch: remove useless function arguments")
Reverts: 1002d94d3076 ("syscall.h: fix doc text for syscall_get_arch()")
Reviewed-by: Andy Lutomirski &lt;luto@kernel.org&gt; # for x86
Reviewed-by: Palmer Dabbelt &lt;palmer@sifive.com&gt;
Acked-by: Paul Moore &lt;paul@paul-moore.com&gt;
Acked-by: Paul Burton &lt;paul.burton@mips.com&gt; # MIPS parts
Acked-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt; (powerpc)
Acked-by: Kees Cook &lt;keescook@chromium.org&gt; # seccomp parts
Acked-by: Mark Salter &lt;msalter@redhat.com&gt; # for the c6x bit
Cc: Elvira Khabirova &lt;lineprinter@altlinux.org&gt;
Cc: Eugene Syromyatnikov &lt;esyr@redhat.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: x86@kernel.org
Cc: linux-alpha@vger.kernel.org
Cc: linux-snps-arc@lists.infradead.org
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-c6x-dev@linux-c6x.org
Cc: uclinux-h8-devel@lists.sourceforge.jp
Cc: linux-hexagon@vger.kernel.org
Cc: linux-ia64@vger.kernel.org
Cc: linux-m68k@lists.linux-m68k.org
Cc: linux-mips@vger.kernel.org
Cc: nios2-dev@lists.rocketboards.org
Cc: openrisc@lists.librecores.org
Cc: linux-parisc@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-riscv@lists.infradead.org
Cc: linux-s390@vger.kernel.org
Cc: linux-sh@vger.kernel.org
Cc: sparclinux@vger.kernel.org
Cc: linux-um@lists.infradead.org
Cc: linux-xtensa@linux-xtensa.org
Cc: linux-arch@vger.kernel.org
Cc: linux-audit@redhat.com
Signed-off-by: Dmitry V. Levin &lt;ldv@altlinux.org&gt;
Signed-off-by: Paul Moore &lt;paul@paul-moore.com&gt;
</content>
</entry>
</feed>
