<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/pid_namespace.c, branch v4.9.99</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.99</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.99'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2017-05-25T13:44:38Z</updated>
<entry>
<title>pid_ns: Sleep in TASK_INTERRUPTIBLE in zap_pid_ns_processes</title>
<updated>2017-05-25T13:44:38Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2017-05-11T23:21:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6dc6a2700b6a0dd755027654b85bd614d8d3d52b'/>
<id>urn:sha1:6dc6a2700b6a0dd755027654b85bd614d8d3d52b</id>
<content type='text'>
commit b9a985db98961ae1ba0be169f19df1c567e4ffe0 upstream.

The code can potentially sleep for an indefinite amount of time in
zap_pid_ns_processes triggering the hung task timeout, and increasing
the system average.  This is undesirable.  Sleep with a task state of
TASK_INTERRUPTIBLE instead of TASK_UNINTERRUPTIBLE to remove these
undesirable side effects.

Apparently under heavy load this has been allowing Chrome to trigger
the hung time task timeout error and cause ChromeOS to reboot.

Reported-by: Vovo Yang &lt;vovoy@google.com&gt;
Reported-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Tested-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Fixes: 6347e9009104 ("pidns: guarantee that the pidns init will be the last pidns process reaped")
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>pid: fix lockdep deadlock warning due to ucount_lock</title>
<updated>2017-01-19T19:18:03Z</updated>
<author>
<name>Andrei Vagin</name>
<email>avagin@openvz.org</email>
</author>
<published>2017-01-05T03:28:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=52fd0ab07676b295abc54eeb1cef7a06d49c9847'/>
<id>urn:sha1:52fd0ab07676b295abc54eeb1cef7a06d49c9847</id>
<content type='text'>
commit add7c65ca426b7a37184dd3d2172394e23d585d6 upstream.

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
4.10.0-rc2-00024-g4aecec9-dirty #118 Tainted: G        W
---------------------------------------------------------
swapper/1/0 just changed the state of lock:
 (&amp;(&amp;sighand-&gt;siglock)-&gt;rlock){-.....}, at: [&lt;ffffffffbd0a1bc6&gt;] __lock_task_sighand+0xb6/0x2c0
but this lock took another, HARDIRQ-unsafe lock in the past:
 (ucounts_lock){+.+...}
and interrupts could create inverse lock ordering between them.
other info that might help us debug this:
Chain exists of:                 &amp;(&amp;sighand-&gt;siglock)-&gt;rlock --&gt; &amp;(&amp;tty-&gt;ctrl_lock)-&gt;rlock --&gt; ucounts_lock
 Possible interrupt unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(ucounts_lock);
                               local_irq_disable();
                               lock(&amp;(&amp;sighand-&gt;siglock)-&gt;rlock);
                               lock(&amp;(&amp;tty-&gt;ctrl_lock)-&gt;rlock);
  &lt;Interrupt&gt;
    lock(&amp;(&amp;sighand-&gt;siglock)-&gt;rlock);

 *** DEADLOCK ***

This patch removes a dependency between rlock and ucount_lock.

Fixes: f333c700c610 ("pidns: Add a limit on the number of pid namespaces")
Signed-off-by: Andrei Vagin &lt;avagin@openvz.org&gt;
Acked-by: Al Viro &lt;viro@ZenIV.linux.org.uk&gt;
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>Merge branch 'nsfs-ioctls' into HEAD</title>
<updated>2016-09-23T01:00:36Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2016-09-23T01:00:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=78725596644be0181c46f55c52aadfb8c70bcdb7'/>
<id>urn:sha1:78725596644be0181c46f55c52aadfb8c70bcdb7</id>
<content type='text'>
From: Andrey Vagin &lt;avagin@openvz.org&gt;

Each namespace has an owning user namespace and now there is not way
to discover these relationships.

Pid and user namepaces are hierarchical. There is no way to discover
parent-child relationships too.

Why we may want to know relationships between namespaces?

One use would be visualization, in order to understand the running
system.  Another would be to answer the question: what capability does
process X have to perform operations on a resource governed by namespace
Y?

One more use-case (which usually called abnormal) is checkpoint/restart.
In CRIU we are going to dump and restore nested namespaces.

There [1] was a discussion about which interface to choose to determing
relationships between namespaces.

Eric suggested to add two ioctl-s [2]:
&gt; Grumble, Grumble.  I think this may actually a case for creating ioctls
&gt; for these two cases.  Now that random nsfs file descriptors are bind
&gt; mountable the original reason for using proc files is not as pressing.
&gt;
&gt; One ioctl for the user namespace that owns a file descriptor.
&gt; One ioctl for the parent namespace of a namespace file descriptor.

Here is an implementaions of these ioctl-s.

$ man man7/namespaces.7
...
Since  Linux  4.X,  the  following  ioctl(2)  calls are supported for
namespace file descriptors.  The correct syntax is:

      fd = ioctl(ns_fd, ioctl_type);

where ioctl_type is one of the following:

NS_GET_USERNS
      Returns a file descriptor that refers to an owning user names‐
      pace.

NS_GET_PARENT
      Returns  a  file descriptor that refers to a parent namespace.
      This ioctl(2) can be used for pid  and  user  namespaces.  For
      user namespaces, NS_GET_PARENT and NS_GET_USERNS have the same
      meaning.

In addition to generic ioctl(2) errors, the following  specific  ones
can occur:

EINVAL NS_GET_PARENT was called for a nonhierarchical namespace.

EPERM  The  requested  namespace  is outside of the current namespace
      scope.

[1] https://lkml.org/lkml/2016/7/6/158
[2] https://lkml.org/lkml/2016/7/9/101

Changes for v2:
* don't return ENOENT for init_user_ns and init_pid_ns. There is nothing
  outside of the init namespace, so we can return EPERM in this case too.
  &gt; The fewer special cases the easier the code is to get
  &gt; correct, and the easier it is to read. // Eric

Changes for v3:
* rename ns-&gt;get_owner() to ns-&gt;owner(). get_* usually means that it
  grabs a reference.

Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: James Bottomley &lt;James.Bottomley@HansenPartnership.com&gt;
Cc: "Michael Kerrisk (man-pages)" &lt;mtk.manpages@gmail.com&gt;
Cc: "W. Trevor King" &lt;wking@tremily.us&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
</content>
</entry>
<entry>
<title>nsfs: add ioctl to get a parent namespace</title>
<updated>2016-09-23T00:59:41Z</updated>
<author>
<name>Andrey Vagin</name>
<email>avagin@openvz.org</email>
</author>
<published>2016-09-06T07:47:15Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a7306ed8d94af729ecef8b6e37506a1c6fc14788'/>
<id>urn:sha1:a7306ed8d94af729ecef8b6e37506a1c6fc14788</id>
<content type='text'>
Pid and user namepaces are hierarchical. There is no way to discover
parent-child relationships.

In a future we will use this interface to dump and restore nested
namespaces.

Acked-by: Serge Hallyn &lt;serge@hallyn.com&gt;
Signed-off-by: Andrei Vagin &lt;avagin@openvz.org&gt;
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>kernel: add a helper to get an owning user namespace for a namespace</title>
<updated>2016-09-23T00:59:39Z</updated>
<author>
<name>Andrey Vagin</name>
<email>avagin@openvz.org</email>
</author>
<published>2016-09-06T07:47:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bcac25a58bfc6bd79191ac5d7afb49bea96da8c9'/>
<id>urn:sha1:bcac25a58bfc6bd79191ac5d7afb49bea96da8c9</id>
<content type='text'>
Return -EPERM if an owning user namespace is outside of a process
current user namespace.

v2: In a first version ns_get_owner returned ENOENT for init_user_ns.
    This special cases was removed from this version. There is nothing
    outside of init_user_ns, so we can return EPERM.
v3: rename ns-&gt;get_owner() to ns-&gt;owner(). get_* usually means that it
grabs a reference.

Acked-by: Serge Hallyn &lt;serge@hallyn.com&gt;
Signed-off-by: Andrei Vagin &lt;avagin@openvz.org&gt;
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>userns: When the per user per user namespace limit is reached return ENOSPC</title>
<updated>2016-09-22T18:25:56Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2016-09-22T18:08:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=df75e7748bae1c7098bfa358485389b897f71305'/>
<id>urn:sha1:df75e7748bae1c7098bfa358485389b897f71305</id>
<content type='text'>
The current error codes returned when a the per user per user
namespace limit are hit (EINVAL, EUSERS, and ENFILE) are wrong.  I
asked for advice on linux-api and it we made clear that those were
the wrong error code, but a correct effor code was not suggested.

The best general error code I have found for hitting a resource limit
is ENOSPC.  It is not perfect but as it is unambiguous it will serve
until someone comes up with a better error code.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>pidns: Add a limit on the number of pid namespaces</title>
<updated>2016-08-08T19:42:01Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2016-08-08T19:08:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f333c700c6100b53050980986be922bb21466e29'/>
<id>urn:sha1:f333c700c6100b53050980986be922bb21466e29</id>
<content type='text'>
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2014-12-16T23:53:03Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-12-16T23:53:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=603ba7e41bf5d405aba22294af5d075d8898176d'/>
<id>urn:sha1:603ba7e41bf5d405aba22294af5d075d8898176d</id>
<content type='text'>
Pull vfs pile #2 from Al Viro:
 "Next pile (and there'll be one or two more).

  The large piece in this one is getting rid of /proc/*/ns/* weirdness;
  among other things, it allows to (finally) make nameidata completely
  opaque outside of fs/namei.c, making for easier further cleanups in
  there"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs:
  coda_venus_readdir(): use file_inode()
  fs/namei.c: fold link_path_walk() call into path_init()
  path_init(): don't bother with LOOKUP_PARENT in argument
  fs/namei.c: new helper (path_cleanup())
  path_init(): store the "base" pointer to file in nameidata itself
  make default -&gt;i_fop have -&gt;open() fail with ENXIO
  make nameidata completely opaque outside of fs/namei.c
  kill proc_ns completely
  take the targets of /proc/*/ns/* symlinks to separate fs
  bury struct proc_ns in fs/proc
  copy address of proc_ns_ops into ns_common
  new helpers: ns_alloc_inum/ns_free_inum
  make proc_ns_operations work with struct ns_common * instead of void *
  switch the rest of proc_ns_operations to working with &amp;...-&gt;ns
  netns: switch -&gt;get()/-&gt;put()/-&gt;install()/-&gt;inum() to working with &amp;net-&gt;ns
  make mntns -&gt;get()/-&gt;put()/-&gt;install()/-&gt;inum() work with &amp;mnt_ns-&gt;ns
  common object embedded into various struct ....ns
</content>
</entry>
<entry>
<title>exit: pidns: fix/update the comments in zap_pid_ns_processes()</title>
<updated>2014-12-11T01:41:18Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2014-12-10T23:55:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a53b831549141aa060a8b54b76e3a42870d74cc0'/>
<id>urn:sha1:a53b831549141aa060a8b54b76e3a42870d74cc0</id>
<content type='text'>
The comments in zap_pid_ns_processes() are not clear, we need to explain
how this code actually works.

1. "Ignore SIGCHLD" looks like optimization but it is not, we also
   need this for correctness.

2. The comment above sys_wait4() could tell more.

   EXIT_ZOMBIE child is only possible if it has exited before we
   ignored SIGCHLD. Or if it is traced from the parent namespace,
   but in this case it will be reaped by debugger after detach,
   sys_wait4() acts as a synchronization point.

3. The comment about TASK_DEAD (EXIT_DEAD in fact) children is
   outdated. Contrary to what it says we do not need to make sure
   they all go away after 0a01f2cc390e "pidns: Make the pidns proc
   mount/umount logic obvious".

   At the same time, we do need to wait for nr_hashed==init_pids,
   but the reasons are quite different and not obvious: setns().

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Aaron Tomlin &lt;atomlin@redhat.com&gt;
Cc: Pavel Emelyanov &lt;xemul@parallels.com&gt;
Cc: Serge Hallyn &lt;serge.hallyn@ubuntu.com&gt;
Cc: Sterling Alexander &lt;stalexan@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>copy address of proc_ns_ops into ns_common</title>
<updated>2014-12-04T19:34:47Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2014-11-01T06:32:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=33c429405a2c8d9e42afb9fee88a63cfb2de1e98'/>
<id>urn:sha1:33c429405a2c8d9e42afb9fee88a63cfb2de1e98</id>
<content type='text'>
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
</feed>
