<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/prctl.h, branch v3.5-rc2</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.5-rc2</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.5-rc2'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2012-06-07T21:43:55Z</updated>
<entry>
<title>c/r: prctl: add ability to get clear_tid_address</title>
<updated>2012-06-07T21:43:55Z</updated>
<author>
<name>Cyrill Gorcunov</name>
<email>gorcunov@openvz.org</email>
</author>
<published>2012-06-07T21:21:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=300f786b2683f8bb1ec0afb6e1851183a479c86d'/>
<id>urn:sha1:300f786b2683f8bb1ec0afb6e1851183a479c86d</id>
<content type='text'>
Zero is written at clear_tid_address when the process exits.  This
functionality is used by pthread_join().

We already have sys_set_tid_address() to change this address for the
current task but there is no way to obtain it from user space.

Without the ability to find this address and dump it, we can't restore
multithreaded (pthread) apps which call pthread_join() once they have
been restored.

This patch introduces the PR_GET_TID_ADDRESS prctl option, which allows
the current process to obtain its own clear_tid_address.

This feature is available if and only if CONFIG_CHECKPOINT_RESTORE is set.
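
A minimal sketch of how user space might query the address, assuming a
Linux system whose C library exports prctl().  The helper name
get_clear_tid_address is hypothetical; the option value 40 is taken from
include/linux/prctl.h.

```python
import ctypes
import errno
import os

# Value from include/linux/prctl.h after this patch.
PR_GET_TID_ADDRESS = 40

libc = ctypes.CDLL(None, use_errno=True)
ZERO = ctypes.c_ulong(0)


def get_clear_tid_address():
    """Return the caller's clear_tid_address as an int, or None on EINVAL."""
    addr = ctypes.c_void_p()
    rc = libc.prctl(PR_GET_TID_ADDRESS, ctypes.byref(addr), ZERO, ZERO, ZERO)
    if rc != 0:
        err = ctypes.get_errno()
        if err == errno.EINVAL:
            # Most likely a kernel built without CONFIG_CHECKPOINT_RESTORE.
            return None
        raise OSError(err, os.strerror(err))
    # c_void_p.value is None for a NULL pointer; report that as 0.
    return addr.value or 0


if __name__ == "__main__":
    print(get_clear_tid_address())
```

A checkpoint tool would record this value and hand it back to
set_tid_address() in the restored task.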

[akpm@linux-foundation.org: fix prctl numbering]
Signed-off-by: Andrew Vagin &lt;avagin@openvz.org&gt;
Signed-off-by: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Pedro Alves &lt;palves@redhat.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>c/r: prctl: add ability to set new mm_struct::exe_file</title>
<updated>2012-06-01T00:49:32Z</updated>
<author>
<name>Cyrill Gorcunov</name>
<email>gorcunov@openvz.org</email>
</author>
<published>2012-05-31T23:26:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b32dfe377102ce668775f8b6b1461f7ad428f8b6'/>
<id>urn:sha1:b32dfe377102ce668775f8b6b1461f7ad428f8b6</id>
<content type='text'>
When we restore a process we would like a way to set up its former
mm_struct::exe_file so that /proc/pid/exe points to the original
executable file the process had at checkpoint time.

For this the PR_SET_MM_EXE_FILE code is introduced.  This option takes a
file descriptor which will be used as the source for the new
/proc/$pid/exe symlink.

Note that changing /proc/$pid/exe is allowed if there are no
VM_EXECUTABLE vmas present for the current process, simply because this
feature is specific to C/R and mm::num_exe_file_vmas becomes meaningless
after that.

To minimize the number of transitions the /proc/pid/exe symlink might go
through, this feature is implemented in a one-shot manner: once changed,
the symlink can't be changed again.  This should help sysadmins monitor
the symlinks of all processes running on a system.

In particular, one could take a snapshot of the processes and raise an
alarm if there are unexpected changes of /proc/pid/exe symlinks on the
system.

Note: this feature is available if and only if CONFIG_CHECKPOINT_RESTORE
is set, and the caller must have the CAP_SYS_RESOURCE capability granted;
otherwise the request to change the symlink will be rejected.

Signed-off-by: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Matt Helsley &lt;matthltc@us.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>c/r: prctl: extend PR_SET_MM to set up more mm_struct entries</title>
<updated>2012-06-01T00:49:32Z</updated>
<author>
<name>Cyrill Gorcunov</name>
<email>gorcunov@openvz.org</email>
</author>
<published>2012-05-31T23:26:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fe8c7f5cbf91124987106faa3bdf0c8b955c4cf7'/>
<id>urn:sha1:fe8c7f5cbf91124987106faa3bdf0c8b955c4cf7</id>
<content type='text'>
During checkpoint we dump the whole process memory to a file, and the
dump includes process stack memory.  But besides the stack data itself,
the stack carries additional parameters such as command line arguments,
environment data and the auxiliary vector.

So during restore, once we've restored the stack data itself, we need to
set up mm_struct::arg_start/end and env_start/end so that the restored
process is able to find the command line arguments and environment data
it had at checkpoint time.  The same applies to the auxiliary vector.

For this reason additional PR_SET_MM_(ARG_START | ARG_END | ENV_START |
ENV_END | AUXV) codes are introduced.

Signed-off-by: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Andrew Vagin &lt;avagin@openvz.org&gt;
Cc: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Cc: Vasiliy Kulikov &lt;segoon@openwall.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Michael Kerrisk &lt;mtk.manpages@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Add PR_{GET,SET}_NO_NEW_PRIVS to prevent execve from granting privs</title>
<updated>2012-04-14T01:13:18Z</updated>
<author>
<name>Andy Lutomirski</name>
<email>luto@amacapital.net</email>
</author>
<published>2012-04-12T21:47:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=259e5e6c75a910f3b5e656151dc602f53f9d7548'/>
<id>urn:sha1:259e5e6c75a910f3b5e656151dc602f53f9d7548</id>
<content type='text'>
With this change, calling
  prctl(PR_SET_NO_NEW_PRIVS, 1, 0, 0, 0)
disables privilege-granting operations at execve time.  For example, a
process will not be able to execute a setuid binary to change its uid
or gid if this bit is set.  The same is true for file capabilities.

Additionally, LSM_UNSAFE_NO_NEW_PRIVS is defined to ensure that
LSMs respect the requested behavior.

To determine if the NO_NEW_PRIVS bit is set, a task may call
  prctl(PR_GET_NO_NEW_PRIVS, 0, 0, 0, 0);
It returns 1 if set and 0 if it is not set. If any of the arguments are
non-zero, it will return -1 and set errno to EINVAL.
(PR_SET_NO_NEW_PRIVS behaves similarly.)
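
The calls above can be sketched from user space as follows, assuming a
Linux C library that exports prctl().  The helper names are hypothetical;
the option values 38 and 39 are taken from include/linux/prctl.h.

```python
import ctypes
import errno
import os

# Values from include/linux/prctl.h.
PR_SET_NO_NEW_PRIVS = 38
PR_GET_NO_NEW_PRIVS = 39

libc = ctypes.CDLL(None, use_errno=True)
ZERO = ctypes.c_ulong(0)
ONE = ctypes.c_ulong(1)


def get_no_new_privs():
    """Return 1 if the NO_NEW_PRIVS bit is set for this task, else 0."""
    rc = libc.prctl(PR_GET_NO_NEW_PRIVS, ZERO, ZERO, ZERO, ZERO)
    if rc == -1:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))
    return rc


def set_no_new_privs():
    """Set the bit for the calling task."""
    if libc.prctl(PR_SET_NO_NEW_PRIVS, ONE, ZERO, ZERO, ZERO) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def nonzero_extra_arg_errno():
    """Pass a non-zero third argument and report the resulting errno name."""
    rc = libc.prctl(PR_GET_NO_NEW_PRIVS, ZERO, ONE, ZERO, ZERO)
    if rc != -1:
        return None
    return errno.errorcode.get(ctypes.get_errno(), "unknown")


if __name__ == "__main__":
    print(get_no_new_privs(), nonzero_extra_arg_errno())
```

Note that set_no_new_privs() is deliberately not called in the demo: once
set, the bit cannot be cleared for the task.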

This functionality is desired for the proposed seccomp filter patch
series.  Using PR_SET_NO_NEW_PRIVS allows a task to modify the
system call behavior for itself and its child tasks without being
able to impact the behavior of a more privileged task.

Another potential use is making certain privileged operations
unprivileged.  For example, chroot may be considered "safe" if it cannot
affect privileged tasks.

Note, this patch causes execve to fail when PR_SET_NO_NEW_PRIVS is
set and AppArmor is in use.  It is fixed in a subsequent patch.

Signed-off-by: Andy Lutomirski &lt;luto@amacapital.net&gt;
Signed-off-by: Will Drewry &lt;wad@chromium.org&gt;
Acked-by: Eric Paris &lt;eparis@redhat.com&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;

v18: updated change desc
v17: using new define values as per 3.4
Signed-off-by: James Morris &lt;james.l.morris@oracle.com&gt;
</content>
</entry>
<entry>
<title>prctl: add PR_{SET,GET}_CHILD_SUBREAPER to allow simple process supervision</title>
<updated>2012-03-23T23:58:32Z</updated>
<author>
<name>Lennart Poettering</name>
<email>lennart@poettering.net</email>
</author>
<published>2012-03-23T22:01:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ebec18a6d3aa1e7d84aab16225e87fd25170ec2b'/>
<id>urn:sha1:ebec18a6d3aa1e7d84aab16225e87fd25170ec2b</id>
<content type='text'>
Userspace service managers/supervisors need to track their started
services.  Many services daemonize by double-forking and get implicitly
re-parented to PID 1.  The service manager will no longer be able to
receive the SIGCHLD signals for them, and is no longer in charge of
reaping the children with wait().  All information about the children is
lost at the moment PID 1 cleans up the re-parented processes.

With this prctl, a service manager process can mark itself as a sort of
'sub-init', able to stay as the parent for all orphaned processes
created by the started services.  All SIGCHLD signals will be delivered
to the service manager.
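
A minimal sketch of the double-fork case described above, assuming a
Linux C library that exports prctl().  The helper names and the exit
status 7 are illustrative only; the option value 36 is taken from
include/linux/prctl.h.

```python
import ctypes
import os
import time

# Value from include/linux/prctl.h.
PR_SET_CHILD_SUBREAPER = 36

libc = ctypes.CDLL(None, use_errno=True)
ZERO = ctypes.c_ulong(0)


def set_child_subreaper(enabled):
    """Mark (or unmark) the calling process as a child subreaper."""
    flag = ctypes.c_ulong(1 if enabled else 0)
    if libc.prctl(PR_SET_CHILD_SUBREAPER, flag, ZERO, ZERO, ZERO) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def reap_double_forked_grandchild():
    """Double-fork a grandchild and return the exit status reaped from it."""
    set_child_subreaper(True)
    read_end, write_end = os.pipe()
    child = os.fork()
    if child == 0:
        # Intermediate process: fork the "daemon", report its PID, exit.
        os.close(read_end)
        grandchild = os.fork()
        if grandchild == 0:
            os.close(write_end)
            time.sleep(0.2)
            os._exit(7)  # arbitrary status the supervisor can recognise
        os.write(write_end, str(grandchild).encode())
        os._exit(0)
    os.close(write_end)
    grandchild = int(os.read(read_end, 32).decode())
    os.close(read_end)
    os.waitpid(child, 0)  # the intermediate process is gone now
    # Without the subreaper flag this waitpid() would fail with ECHILD,
    # because the orphan would be re-parented away from this process.
    _, status = os.waitpid(grandchild, 0)
    set_child_subreaper(False)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(reap_double_forked_grandchild())
```

The pipe is only there so the supervisor learns the grandchild's PID; a
real service manager would more likely call wait() on SIGCHLD and accept
whichever child it gets.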

For a service manager, receiving SIGCHLD and calling wait() is much
preferred over any asynchronous notification about specific PIDs,
because the service manager has full access to the child process data
in /proc, and the PID cannot be re-used until the service manager
itself has performed the wait().

As a side effect, the relevant parent PID information does not get lost
by a double-fork, which results in a more elaborate process tree and
'ps' output:

before:
  # ps afx
  253 ?        Ss     0:00 /bin/dbus-daemon --system --nofork
  294 ?        Sl     0:00 /usr/libexec/polkit-1/polkitd
  328 ?        S      0:00 /usr/sbin/modem-manager
  608 ?        Sl     0:00 /usr/libexec/colord
  658 ?        Sl     0:00 /usr/libexec/upowerd
  819 ?        Sl     0:00 /usr/libexec/imsettings-daemon
  916 ?        Sl     0:00 /usr/libexec/udisks-daemon
  917 ?        S      0:00  \_ udisks-daemon: not polling any devices

after:
  # ps afx
  294 ?        Ss     0:00 /bin/dbus-daemon --system --nofork
  426 ?        Sl     0:00  \_ /usr/libexec/polkit-1/polkitd
  449 ?        S      0:00  \_ /usr/sbin/modem-manager
  635 ?        Sl     0:00  \_ /usr/libexec/colord
  705 ?        Sl     0:00  \_ /usr/libexec/upowerd
  959 ?        Sl     0:00  \_ /usr/libexec/udisks-daemon
  960 ?        S      0:00  |   \_ udisks-daemon: not polling any devices
  977 ?        Sl     0:00  \_ /usr/libexec/packagekitd

This prctl is orthogonal to PID namespaces.  PID namespaces are isolated
from each other, while a service management process usually requires the
services to live in the same namespace, to be able to talk to each
other.

Users of this will be the systemd per-user instance, which provides
init-like functionality for the user's login session and D-Bus, which
activates bus services on-demand.  Both need init-like capabilities to
be able to properly keep track of the services they start.

Many thanks to Oleg for several rounds of review and insights.

[akpm@linux-foundation.org: fix comment layout and spelling]
[akpm@linux-foundation.org: add lengthy code comment from Oleg]
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Lennart Poettering &lt;lennart@poettering.net&gt;
Signed-off-by: Kay Sievers &lt;kay.sievers@vrfy.org&gt;
Acked-by: Valdis Kletnieks &lt;Valdis.Kletnieks@vt.edu&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Yama: add PR_SET_PTRACER_ANY</title>
<updated>2012-02-15T23:25:18Z</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2012-02-15T00:48:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bf06189e4d14641c0148bea16e9dd24943862215'/>
<id>urn:sha1:bf06189e4d14641c0148bea16e9dd24943862215</id>
<content type='text'>
For a process to entirely disable Yama ptrace restrictions, it can use
the special PR_SET_PTRACER_ANY pid to indicate that any otherwise allowed
process may ptrace it. This is stronger than calling PR_SET_PTRACER with
pid "1" because it includes processes in external pid namespaces. This is
currently needed by the Chrome renderer, since its crash handler (Breakpad)
runs external to the renderer's pid namespace.
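
A minimal sketch of the call from user space, assuming a Linux C library
that exports prctl().  The helper name allow_any_ptracer is hypothetical;
the constants are taken from include/linux/prctl.h.

```python
import ctypes
import errno
import os

# Values from include/linux/prctl.h.
PR_SET_PTRACER = 0x59616D61
# c_ulong does no overflow checking, so -1 wraps to (unsigned long)-1.
PR_SET_PTRACER_ANY = ctypes.c_ulong(-1)

libc = ctypes.CDLL(None, use_errno=True)
ZERO = ctypes.c_ulong(0)


def allow_any_ptracer():
    """Let any otherwise-allowed process ptrace the caller.

    Returns True on success and False if the kernel rejects the option
    with EINVAL, which is what happens when Yama is not active.
    """
    rc = libc.prctl(PR_SET_PTRACER, PR_SET_PTRACER_ANY, ZERO, ZERO, ZERO)
    if rc == 0:
        return True
    err = ctypes.get_errno()
    if err == errno.EINVAL:
        return False
    raise OSError(err, os.strerror(err))


if __name__ == "__main__":
    print(allow_any_ptracer())
```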

Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: James Morris &lt;jmorris@namei.org&gt;
</content>
</entry>
<entry>
<title>security: Yama LSM</title>
<updated>2012-02-09T22:18:52Z</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2011-12-21T20:17:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2d514487faf188938a4ee4fb3464eeecfbdcf8eb'/>
<id>urn:sha1:2d514487faf188938a4ee4fb3464eeecfbdcf8eb</id>
<content type='text'>
This adds the Yama Linux Security Module to collect DAC security
improvements (specifically just ptrace restrictions for now) that have
existed in various forms over the years and have been carried outside the
mainline kernel by other Linux distributions like Openwall and grsecurity.

Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Acked-by: John Johansen &lt;john.johansen@canonical.com&gt;
Signed-off-by: James Morris &lt;jmorris@namei.org&gt;
</content>
</entry>
<entry>
<title>c/r: prctl: add PR_SET_MM codes to set up mm_struct entries</title>
<updated>2012-01-13T04:13:13Z</updated>
<author>
<name>Cyrill Gorcunov</name>
<email>gorcunov@openvz.org</email>
</author>
<published>2012-01-13T01:20:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=028ee4be34a09a6d48bdf30ab991ae933a7bc036'/>
<id>urn:sha1:028ee4be34a09a6d48bdf30ab991ae933a7bc036</id>
<content type='text'>
When we restore a task we need to set up text, data and data heap sizes
from userspace to the values a task had at checkpoint time.  This patch
adds auxiliary prctl codes for that.

While most of them are statistical in nature (their values are used in
calculating the /proc/&lt;pid&gt;/statm output), the start_brk and brk values
are used to compute the allowed size of program data segment expansion,
which means that arbitrary changes to these values might be dangerous.
So, to restrict access, the following requirements apply to these
prctl calls:

 - The process has to have CAP_SYS_ADMIN capability granted.
 - For all opcodes except the start_brk/brk members, an appropriate
   VMA must exist and must match certain VMA flags,
   such as:
   - code segment must be executable but not writable;
   - data segment must not be executable.

start_brk/brk values must not intersect with data segment and must not
exceed RLIMIT_DATA resource limit.

Still, the main guard is the CAP_SYS_ADMIN capability check.

Note that the kernel must be compiled with CONFIG_CHECKPOINT_RESTORE
support; otherwise these prctl calls will return -EINVAL.
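
A minimal sketch of how the failure modes above look from user space,
assuming a Linux C library that exports prctl().  The helper name
probe_set_mm_brk is hypothetical, and the address 0 is chosen on purpose
so that the request is never expected to succeed; the constants are
taken from include/linux/prctl.h.

```python
import ctypes
import errno

# Values from include/linux/prctl.h.
PR_SET_MM = 35
PR_SET_MM_BRK = 7

libc = ctypes.CDLL(None, use_errno=True)
ZERO = ctypes.c_ulong(0)


def probe_set_mm_brk():
    """Try PR_SET_MM_BRK with a deliberately invalid address of 0.

    Returns the errno name of the failure, or None if the call succeeded
    (not expected for address 0).
    """
    opt = ctypes.c_ulong(PR_SET_MM_BRK)
    rc = libc.prctl(PR_SET_MM, opt, ZERO, ZERO, ZERO)
    if rc == 0:
        return None
    return errno.errorcode.get(ctypes.get_errno(), "unknown")


if __name__ == "__main__":
    print(probe_set_mm_brk())
```

EPERM indicates that the capability check rejected the caller; EINVAL
indicates either a kernel without CONFIG_CHECKPOINT_RESTORE or that the
address itself was refused.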

[akpm@linux-foundation.org: cache current-&gt;mm in a local, saving 200 bytes text]
Signed-off-by: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Reviewed-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Andrew Vagin &lt;avagin@openvz.org&gt;
Cc: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Cc: Vasiliy Kulikov &lt;segoon@openwall.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Michael Kerrisk &lt;mtk.manpages@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>HWPOISON: Clean up PR_MCE_KILL interface</title>
<updated>2009-10-04T01:23:17Z</updated>
<author>
<name>Andi Kleen</name>
<email>ak@linux.intel.com</email>
</author>
<published>2009-10-04T00:20:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1087e9b4ff708976499b4de541d9e1d57d49b60a'/>
<id>urn:sha1:1087e9b4ff708976499b4de541d9e1d57d49b60a</id>
<content type='text'>
While writing the manpage I noticed some shortcomings in the
current interface.

- Define symbolic names for all the different values
- Boundary check the kill mode values
- For symmetry add a get interface too. This allows library
code to get/set the current state.
- For consistency define a PR_MCE_KILL_DEFAULT value
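
A minimal sketch of the get side of the interface, assuming a Linux C
library that exports prctl().  The helper name get_mce_kill_policy and
the policy labels are hypothetical; the constants are taken from
include/linux/prctl.h.

```python
import ctypes
import errno
import os

# Values from include/linux/prctl.h.
PR_MCE_KILL_GET = 34
PR_MCE_KILL_LATE = 0
PR_MCE_KILL_EARLY = 1
PR_MCE_KILL_DEFAULT = 2

POLICY_NAMES = {
    PR_MCE_KILL_LATE: "late",
    PR_MCE_KILL_EARLY: "early",
    PR_MCE_KILL_DEFAULT: "default",
}

libc = ctypes.CDLL(None, use_errno=True)
ZERO = ctypes.c_ulong(0)


def get_mce_kill_policy():
    """Return the caller's kill policy name, or None if unsupported."""
    rc = libc.prctl(PR_MCE_KILL_GET, ZERO, ZERO, ZERO, ZERO)
    if rc == -1:
        err = ctypes.get_errno()
        if err == errno.EINVAL:
            return None  # option not recognised by this kernel
        raise OSError(err, os.strerror(err))
    return POLICY_NAMES.get(rc, "unknown")


if __name__ == "__main__":
    print(get_mce_kill_policy())
```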

Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6</title>
<updated>2009-09-24T14:53:22Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2009-09-24T14:53:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=db16826367fefcb0ddb93d76b66adc52eb4e6339'/>
<id>urn:sha1:db16826367fefcb0ddb93d76b66adc52eb4e6339</id>
<content type='text'>
* 'hwpoison' of git://git.kernel.org/pub/scm/linux/kernel/git/ak/linux-mce-2.6: (21 commits)
  HWPOISON: Enable error_remove_page on btrfs
  HWPOISON: Add simple debugfs interface to inject hwpoison on arbitary PFNs
  HWPOISON: Add madvise() based injector for hardware poisoned pages v4
  HWPOISON: Enable error_remove_page for NFS
  HWPOISON: Enable .remove_error_page for migration aware file systems
  HWPOISON: The high level memory error handler in the VM v7
  HWPOISON: Add PR_MCE_KILL prctl to control early kill behaviour per process
  HWPOISON: shmem: call set_page_dirty() with locked page
  HWPOISON: Define a new error_remove_page address space op for async truncation
  HWPOISON: Add invalidate_inode_page
  HWPOISON: Refactor truncate to allow direct truncating of page v2
  HWPOISON: check and isolate corrupted free pages v2
  HWPOISON: Handle hardware poisoned pages in try_to_unmap
  HWPOISON: Use bitmask/action code for try_to_unmap behaviour
  HWPOISON: x86: Add VM_FAULT_HWPOISON handling to x86 page fault handler v2
  HWPOISON: Add poison check to page fault handling
  HWPOISON: Add basic support for poisoned pages in fault handler v3
  HWPOISON: Add new SIGBUS error codes for hardware poison signals
  HWPOISON: Add support for poison swap entries v2
  HWPOISON: Export some rmap vma locking to outside world
  ...
</content>
</entry>
</feed>
