<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/syscalls.h, branch v4.2.4</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.2.4</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.2.4'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2015-06-26T21:02:43Z</updated>
<entry>
<title>Merge tag 'trace-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace</title>
<updated>2015-06-26T21:02:43Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-06-26T21:02:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e382608254e06c8109f40044f5e693f2e04f3899'/>
<id>urn:sha1:e382608254e06c8109f40044f5e693f2e04f3899</id>
<content type='text'>
Pull tracing updates from Steven Rostedt:
 "This patch series contains several clean ups and even a new trace
  clock "monitonic raw".  Also some enhancements to make the ring buffer
  even faster.  But the biggest and most noticeable change is the
  renaming of the ftrace* files, structures and variables that have to
  deal with trace events.

  Over the years I've had several developers tell me about their
  confusion with what ftrace is compared to events.  Technically,
  "ftrace" is the infrastructure to do the function hooks, which include
  tracing and also helps with live kernel patching.  But the trace
  events are a separate entity altogether, and the files that affect the
  trace events should not be named "ftrace".  These include:

    include/trace/ftrace.h         -&gt;    include/trace/trace_events.h
    include/linux/ftrace_event.h   -&gt;    include/linux/trace_events.h

  Also, functions that are specific for trace events have also been renamed:

    ftrace_print_*()               -&gt;    trace_print_*()
    (un)register_ftrace_event()    -&gt;    (un)register_trace_event()
    ftrace_event_name()            -&gt;    trace_event_name()
    ftrace_trigger_soft_disabled() -&gt;    trace_trigger_soft_disabled()
    ftrace_define_fields_##call()  -&gt;    trace_define_fields_##call()
    ftrace_get_offsets_##call()    -&gt;    trace_get_offsets_##call()

  Structures have been renamed:

    ftrace_event_file              -&gt;    trace_event_file
    ftrace_event_{call,class}      -&gt;    trace_event_{call,class}
    ftrace_event_buffer            -&gt;    trace_event_buffer
    ftrace_subsystem_dir           -&gt;    trace_subsystem_dir
    ftrace_event_raw_##call        -&gt;    trace_event_raw_##call
    ftrace_event_data_offset_##call-&gt;    trace_event_data_offset_##call
    ftrace_event_type_funcs_##call -&gt;    trace_event_type_funcs_##call

  And a few various variables and flags have also been updated.

  This has been sitting in linux-next for some time, and I have not
  heard a single complaint about this rename breaking anything.  Mostly
  because these functions, variables and structures are mostly internal
  to the tracing system and are seldom (if ever) used by anything
  external to that"

* tag 'trace-v4.2' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (33 commits)
  ring_buffer: Allow to exit the ring buffer benchmark immediately
  ring-buffer-benchmark: Fix the wrong type
  ring-buffer-benchmark: Fix the wrong param in module_param
  ring-buffer: Add enum names for the context levels
  ring-buffer: Remove useless unused tracing_off_permanent()
  ring-buffer: Give NMIs a chance to lock the reader_lock
  ring-buffer: Add trace_recursive checks to ring_buffer_write()
  ring-buffer: Allways do the trace_recursive checks
  ring-buffer: Move recursive check to per_cpu descriptor
  ring-buffer: Add unlikelys to make fast path the default
  tracing: Rename ftrace_get_offsets_##call() to trace_event_get_offsets_##call()
  tracing: Rename ftrace_define_fields_##call() to trace_event_define_fields_##call()
  tracing: Rename ftrace_event_type_funcs_##call to trace_event_type_funcs_##call
  tracing: Rename ftrace_data_offset_##call to trace_event_data_offset_##call
  tracing: Rename ftrace_raw_##call event structures to trace_event_raw_##call
  tracing: Rename ftrace_trigger_soft_disabled() to trace_trigger_soft_disabled()
  tracing: Rename FTRACE_EVENT_FL_* flags to EVENT_FILE_FL_*
  tracing: Rename struct ftrace_subsystem_dir to trace_subsystem_dir
  tracing: Rename ftrace_event_name() to trace_event_name()
  tracing: Rename FTRACE_MAX_EVENT to TRACE_EVENT_TYPE_MAX
  ...
</content>
</entry>
<entry>
<title>clone: support passing tls argument via C rather than pt_regs magic</title>
<updated>2015-06-26T00:00:38Z</updated>
<author>
<name>Josh Triplett</name>
<email>josh@joshtriplett.org</email>
</author>
<published>2015-06-25T22:01:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3033f14ab78c326871a4902591c2518410add24a'/>
<id>urn:sha1:3033f14ab78c326871a4902591c2518410add24a</id>
<content type='text'>
clone has some of the quirkiest syscall handling in the kernel, with a
pile of special cases, historical curiosities, and architecture-specific
calling conventions.  In particular, clone with CLONE_SETTLS accepts a
parameter "tls" that the C entry point completely ignores and some
assembly entry points overwrite; instead, the low-level arch-specific
code pulls the tls parameter out of the arch-specific register captured
as part of pt_regs on entry to the kernel.  That's a massive hack, and
it makes the arch-specific code only work when called via the specific
existing syscall entry points; because of this hack, any new clone-like
system call would have to accept an identical tls argument in exactly
the same arch-specific position, rather than providing a unified system
call entry point across architectures.

The first patch allows architectures to handle the tls argument via
normal C parameter passing, if they opt in by selecting
HAVE_COPY_THREAD_TLS.  The second patch makes 32-bit and 64-bit x86 opt
into this.

These two patches came out of the clone4 series, which isn't ready for
this merge window, but these first two cleanup patches were entirely
uncontroversial and have acks.  I'd like to go ahead and submit these
two so that other architectures can begin building on top of this and
opting into HAVE_COPY_THREAD_TLS.  However, I'm also happy to wait and
send these through the next merge window (along with v3 of clone4) if
anyone would prefer that.

This patch (of 2):

clone with CLONE_SETTLS accepts an argument to set the thread-local
storage area for the new thread.  sys_clone declares an int argument
tls_val in the appropriate point in the argument list (based on the
various CLONE_BACKWARDS variants), but doesn't actually use or pass along
that argument.  Instead, sys_clone calls do_fork, which calls
copy_process, which calls the arch-specific copy_thread, and copy_thread
pulls the corresponding syscall argument out of the pt_regs captured at
kernel entry (knowing what argument of clone that architecture passes tls
in).

Apart from being awful and inscrutable, that also only works because only
one code path into copy_thread can pass the CLONE_SETTLS flag, and that
code path comes from sys_clone with its architecture-specific
argument-passing order.  This prevents introducing a new version of the
clone system call without propagating the same architecture-specific
position of the tls argument.

However, there's no reason to pull the argument out of pt_regs when
sys_clone could just pass it down via C function call arguments.

Introduce a new CONFIG_HAVE_COPY_THREAD_TLS for architectures to opt into,
and a new copy_thread_tls that accepts the tls parameter as an additional
unsigned long (syscall-argument-sized) argument.  Change sys_clone's tls
argument to an unsigned long (which does not change the ABI), and pass
that down to copy_thread_tls.

Architectures that don't opt into copy_thread_tls will continue to ignore
the C argument to sys_clone in favor of the pt_regs captured at kernel
entry, and thus will be unable to introduce new versions of the clone
syscall.

Patch co-authored by Josh Triplett and Thiago Macieira.

Signed-off-by: Josh Triplett &lt;josh@joshtriplett.org&gt;
Acked-by: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Thiago Macieira &lt;thiago.macieira@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>tracing: Rename ftrace_event_{call,class} to trace_event_{call,class}</title>
<updated>2015-05-13T18:06:10Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2015-05-05T15:45:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2425bcb9240f8c97d793cb31c8e8d8d0a843fa29'/>
<id>urn:sha1:2425bcb9240f8c97d793cb31c8e8d8d0a843fa29</id>
<content type='text'>
The name "ftrace" really refers to the function hook infrastructure. It
is not about the trace_events. The structures ftrace_event_call and
ftrace_event_class have nothing to do with the function hooks, and are
really trace_event structures. Rename ftrace_event_* to trace_event_*.

Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>syscalls: Declare sys_*stat64 prototypes if __ARCH_WANT_(COMPAT_)STAT64</title>
<updated>2015-01-27T09:38:00Z</updated>
<author>
<name>Catalin Marinas</name>
<email>catalin.marinas@arm.com</email>
</author>
<published>2015-01-06T16:52:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=54e45c169dbce43cf46d00eb1521b655b6e4f9e9'/>
<id>urn:sha1:54e45c169dbce43cf46d00eb1521b655b6e4f9e9</id>
<content type='text'>
Currently, the sys_stat64, sys_fstat64 and sys_lstat64 prototpyes are
only declared if BITS_PER_LONG == 32. Following commit 0753f70f07fb
(fs: Build sys_stat64() and friends if __ARCH_WANT_COMPAT_STAT64), the
implementation of these functions is allowed on 64-bit systems for
compat support. The patch changes the condition on the prototype
declaration from BITS_PER_LONG == 32 to defined(__ARCH_WANT_STAT64) ||
defined(__ARCH_WANT_COMPAT_STAT64).

In addition, it moves the sys_fstatat64 prototype under the same #if
block

Signed-off-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Acked-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
</content>
</entry>
<entry>
<title>syscalls: implement execveat() system call</title>
<updated>2014-12-13T20:42:51Z</updated>
<author>
<name>David Drysdale</name>
<email>drysdale@google.com</email>
</author>
<published>2014-12-13T00:57:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=51f39a1f0cea1cacf8c787f652f26dfee9611874'/>
<id>urn:sha1:51f39a1f0cea1cacf8c787f652f26dfee9611874</id>
<content type='text'>
This patchset adds execveat(2) for x86, and is derived from Meredydd
Luff's patch from Sept 2012 (https://lkml.org/lkml/2012/9/11/528).

The primary aim of adding an execveat syscall is to allow an
implementation of fexecve(3) that does not rely on the /proc filesystem,
at least for executables (rather than scripts).  The current glibc version
of fexecve(3) is implemented via /proc, which causes problems in sandboxed
or otherwise restricted environments.

Given the desire for a /proc-free fexecve() implementation, HPA suggested
(https://lkml.org/lkml/2006/7/11/556) that an execveat(2) syscall would be
an appropriate generalization.

Also, having a new syscall means that it can take a flags argument without
back-compatibility concerns.  The current implementation just defines the
AT_EMPTY_PATH and AT_SYMLINK_NOFOLLOW flags, but other flags could be
added in future -- for example, flags for new namespaces (as suggested at
https://lkml.org/lkml/2006/7/11/474).

Related history:
 - https://lkml.org/lkml/2006/12/27/123 is an example of someone
   realizing that fexecve() is likely to fail in a chroot environment.
 - http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=514043 covered
   documenting the /proc requirement of fexecve(3) in its manpage, to
   "prevent other people from wasting their time".
 - https://bugzilla.redhat.com/show_bug.cgi?id=241609 described a
   problem where a process that did setuid() could not fexecve()
   because it no longer had access to /proc/self/fd; this has since
   been fixed.

This patch (of 4):

Add a new execveat(2) system call.  execveat() is to execve() as openat()
is to open(): it takes a file descriptor that refers to a directory, and
resolves the filename relative to that.

In addition, if the filename is empty and AT_EMPTY_PATH is specified,
execveat() executes the file to which the file descriptor refers.  This
replicates the functionality of fexecve(), which is a system call in other
UNIXen, but in Linux glibc it depends on opening "/proc/self/fd/&lt;fd&gt;" (and
so relies on /proc being mounted).

The filename fed to the executed program as argv[0] (or the name of the
script fed to a script interpreter) will be of the form "/dev/fd/&lt;fd&gt;"
(for an empty filename) or "/dev/fd/&lt;fd&gt;/&lt;filename&gt;", effectively
reflecting how the executable was found.  This does however mean that
execution of a script in a /proc-less environment won't work; also, script
execution via an O_CLOEXEC file descriptor fails (as the file will not be
accessible after exec).

Based on patches by Meredydd Luff.

Signed-off-by: David Drysdale &lt;drysdale@google.com&gt;
Cc: Meredydd Luff &lt;meredydd@senatehouse.org&gt;
Cc: Shuah Khan &lt;shuah.kh@samsung.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Rich Felker &lt;dalias@aerifal.cx&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Michael Kerrisk &lt;mtk.manpages@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>separate kernel- and userland-side msghdr</title>
<updated>2014-11-19T21:22:59Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2014-04-06T18:03:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=666547ff591cebdedc4679bf6b1b3f3383a8dea3'/>
<id>urn:sha1:666547ff591cebdedc4679bf6b1b3f3383a8dea3</id>
<content type='text'>
Kernel-side struct msghdr is (currently) using the same layout as
userland one, but it's not a one-to-one copy - even without considering
32bit compat issues, we have msg_iov, msg_name and msg_control copied
to kernel[1].  It's fairly localized, so we get away with a few functions
where that knowledge is needed (and we could shrink that set even
more).  Pretty much everything deals with the kernel-side variant and
the few places that want userland one just use a bunch of force-casts
to paper over the differences.

The thing is, kernel-side definition of struct msghdr is *not* exposed
in include/uapi - libc doesn't see it, etc.  So we can add struct user_msghdr,
with proper annotations and let the few places that ever deal with those
beasts use it for userland pointers.  Saner typechecking aside, that will
allow to change the layout of kernel-side msghdr - e.g. replace
msg_iov/msg_iovlen there with struct iov_iter, getting rid of the need
to modify the iovec as we copy data to/from it, etc.

We could introduce kernel_msghdr instead, but that would create much more
noise - the absolute majority of the instances would need to have the
type switched to kernel_msghdr and definition of struct msghdr in
include/linux/socket.h is not going to be seen by userland anyway.

This commit just introduces user_msghdr and switches the few places that
are dealing with userland-side msghdr to it.

[1] actually, it's even trickier than that - we copy msg_control for
sendmsg, but keep the userland address on recvmsg.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>bpf: enable bpf syscall on x64 and i386</title>
<updated>2014-09-26T19:05:14Z</updated>
<author>
<name>Alexei Starovoitov</name>
<email>ast@plumgrid.com</email>
</author>
<published>2014-09-26T07:16:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=749730ce42a2121e1c88350d69478bff3994b10a'/>
<id>urn:sha1:749730ce42a2121e1c88350d69478bff3994b10a</id>
<content type='text'>
done as separate commit to ease conflict resolution

Signed-off-by: Alexei Starovoitov &lt;ast@plumgrid.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>kexec: new syscall kexec_file_load() declaration</title>
<updated>2014-08-08T22:57:32Z</updated>
<author>
<name>Vivek Goyal</name>
<email>vgoyal@redhat.com</email>
</author>
<published>2014-08-08T21:25:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f0895685c7fd8c938c91a9d8a6f7c11f22df58d2'/>
<id>urn:sha1:f0895685c7fd8c938c91a9d8a6f7c11f22df58d2</id>
<content type='text'>
This is the new syscall kexec_file_load() declaration/interface.  I have
reserved the syscall number only for x86_64 so far.  Other architectures
(including i386) can reserve syscall number when they enable the support
for this new syscall.

Signed-off-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Cc: Michael Kerrisk &lt;mtk.manpages@gmail.com&gt;
Cc: Borislav Petkov &lt;bp@suse.de&gt;
Cc: Yinghai Lu &lt;yinghai@kernel.org&gt;
Cc: Eric Biederman &lt;ebiederm@xmission.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Matthew Garrett &lt;mjg59@srcf.ucam.org&gt;
Cc: Greg Kroah-Hartman &lt;greg@kroah.com&gt;
Cc: Dave Young &lt;dyoung@redhat.com&gt;
Cc: WANG Chao &lt;chaowang@redhat.com&gt;
Cc: Baoquan He &lt;bhe@redhat.com&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>shm: add memfd_create() syscall</title>
<updated>2014-08-08T22:57:31Z</updated>
<author>
<name>David Herrmann</name>
<email>dh.herrmann@gmail.com</email>
</author>
<published>2014-08-08T21:25:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9183df25fe7b194563db3fec6dc3202a5855839c'/>
<id>urn:sha1:9183df25fe7b194563db3fec6dc3202a5855839c</id>
<content type='text'>
memfd_create() is similar to mmap(MAP_ANON), but returns a file-descriptor
that you can pass to mmap().  It can support sealing and avoids any
connection to user-visible mount-points.  Thus, it's not subject to quotas
on mounted file-systems, but can be used like malloc()'ed memory, but with
a file-descriptor to it.

memfd_create() returns the raw shmem file, so calls like ftruncate() can
be used to modify the underlying inode.  Also calls like fstat() will
return proper information and mark the file as regular file.  If you want
sealing, you can specify MFD_ALLOW_SEALING.  Otherwise, sealing is not
supported (like on all other regular files).

Compared to O_TMPFILE, it does not require a tmpfs mount-point and is not
subject to a filesystem size limit.  It is still properly accounted to
memcg limits, though, and to the same overcommit or no-overcommit
accounting as all user memory.

Signed-off-by: David Herrmann &lt;dh.herrmann@gmail.com&gt;
Acked-by: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Michael Kerrisk &lt;mtk.manpages@gmail.com&gt;
Cc: Ryan Lortie &lt;desrt@desrt.ca&gt;
Cc: Lennart Poettering &lt;lennart@poettering.net&gt;
Cc: Daniel Mack &lt;zonque@gmail.com&gt;
Cc: Andy Lutomirski &lt;luto@amacapital.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random</title>
<updated>2014-08-06T15:16:24Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-08-06T15:16:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f4f142ed4ef835709c7e6d12eaca10d190bcebed'/>
<id>urn:sha1:f4f142ed4ef835709c7e6d12eaca10d190bcebed</id>
<content type='text'>
Pull randomness updates from Ted Ts'o:
 "Cleanups and bug fixes to /dev/random, add a new getrandom(2) system
  call, which is a superset of OpenBSD's getentropy(2) call, for use
  with userspace crypto libraries such as LibreSSL.

  Also add the ability to have a kernel thread to pull entropy from
  hardware rng devices into /dev/random"

* tag 'random_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/random:
  hwrng: Pass entropy to add_hwgenerator_randomness() in bits, not bytes
  random: limit the contribution of the hw rng to at most half
  random: introduce getrandom(2) system call
  hw_random: fix sparse warning (NULL vs 0 for pointer)
  random: use registers from interrupted code for CPU's w/o a cycle counter
  hwrng: add per-device entropy derating
  hwrng: create filler thread
  random: add_hwgenerator_randomness() for feeding entropy from devices
  random: use an improved fast_mix() function
  random: clean up interrupt entropy accounting for archs w/o cycle counters
  random: only update the last_pulled time if we actually transferred entropy
  random: remove unneeded hash of a portion of the entropy pool
  random: always update the entropy pool under the spinlock
</content>
</entry>
</feed>
