<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/pid.c, branch v3.10.44</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.10.44</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.10.44'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2013-09-27T00:18:27Z</updated>
<entry>
<title>pidns: Fix hang in zap_pid_ns_processes by sending a potentially extra wakeup</title>
<updated>2013-09-27T00:18:27Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2013-08-29T20:56:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5a48788ca4d6dc78d1856d9ddeea1b1160097cc9'/>
<id>urn:sha1:5a48788ca4d6dc78d1856d9ddeea1b1160097cc9</id>
<content type='text'>
commit a606488513543312805fab2b93070cefe6a3016c upstream.

Serge Hallyn &lt;serge.hallyn@ubuntu.com&gt; writes:

&gt; Since commit af4b8a83add95ef40716401395b44a1b579965f4 it's been
&gt; possible to get into a situation where a pidns reaper is
&gt; &lt;defunct&gt;, reparented to host pid 1, but never reaped.  How to
&gt; reproduce this is documented at
&gt;
&gt; https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1168526
&gt; (and see
&gt; https://bugs.launchpad.net/ubuntu/+source/lxc/+bug/1168526/comments/13)
&gt; In short, run repeated starts of a container whose init is
&gt;
&gt; Process.exit(0);
&gt;
&gt; sysrq-t when such a task is playing zombie shows:
&gt;
&gt; [  131.132978] init            x ffff88011fc14580     0  2084   2039 0x00000000
&gt; [  131.132978]  ffff880116e89ea8 0000000000000002 ffff880116e89fd8 0000000000014580
&gt; [  131.132978]  ffff880116e89fd8 0000000000014580 ffff8801172a0000 ffff8801172a0000
&gt; [  131.132978]  ffff8801172a0630 ffff88011729fff0 ffff880116e14650 ffff88011729fff0
&gt; [  131.132978] Call Trace:
&gt; [  131.132978]  [&lt;ffffffff816f6159&gt;] schedule+0x29/0x70
&gt; [  131.132978]  [&lt;ffffffff81064591&gt;] do_exit+0x6e1/0xa40
&gt; [  131.132978]  [&lt;ffffffff81071eae&gt;] ? signal_wake_up_state+0x1e/0x30
&gt; [  131.132978]  [&lt;ffffffff8106496f&gt;] do_group_exit+0x3f/0xa0
&gt; [  131.132978]  [&lt;ffffffff810649e4&gt;] SyS_exit_group+0x14/0x20
&gt; [  131.132978]  [&lt;ffffffff8170102f&gt;] tracesys+0xe1/0xe6
&gt;
&gt; Further debugging showed that every time this happened, zap_pid_ns_processes()
&gt; started with nr_hashed being 3, while we were expecting it to drop to 2.
&gt; Any time it didn't happen, nr_hashed was 1 or 2.  So the reaper was
&gt; waiting for nr_hashed to become 2, but free_pid() only wakes the reaper
&gt; if nr_hashed hits 1.

The issue is that when the task group leader of an init process exits
before the other tasks of the init process, then when the init process
finally exits, the exiting task is a secondary task sleeping in
zap_pid_ns_processes and waiting to wake up when the number of hashed
pids drops to two.  This case waits forever, as free_pid only sends a
wake up when the number of hashed pids drops to 1.

To correct this, the simple strategy of sending a possibly unnecessary
wake up when the number of hashed pids drops to 2 is adopted.

Sending one extraneous wake up is relatively harmless; at worst we
waste a little cpu time in the rare case when a pid namespace
approaches exiting.

We can detect, race free, in free_pid the case when the pid namespace
drops to just two hashed pids.
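
The wake-up rule described above can be sketched as a toy model. This is
illustrative Python, not the kernel code; the names wakes_reaper,
reaper_hangs and wake_on_two are hypothetical and exist only for this
sketch.

```python
# Toy model of the wake-up rule; not kernel code.
# All names here are hypothetical and used only for illustration.

def wakes_reaper(nr_hashed_after_free, wake_on_two):
    """Return True if freeing a pid sends a wake up to the reaper."""
    if nr_hashed_after_free == 1:
        return True
    # The fix adds a possibly unnecessary wake up when the count hits 2.
    return wake_on_two and nr_hashed_after_free == 2


def reaper_hangs(start, waiting_for, wake_on_two):
    """Free pids one at a time, from start down to waiting_for.

    The reaper proceeds only if a wake up arrives when the count
    equals the value it is waiting for; otherwise it sleeps forever.
    """
    for remaining in range(start - 1, waiting_for - 1, -1):
        if remaining == waiting_for and wakes_reaper(remaining, wake_on_two):
            return False
    return True
```

With the old rule, reaper_hangs(3, 2, False) reports a hang, which matches
the reported case of nr_hashed starting at 3 while the reaper waits for 2.
With the extra wake up, reaper_hangs(3, 2, True) does not hang.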

Dereferencing pid_ns-&gt;child_reaper with the pidmap_lock held is safe
without the tasklist_lock because it is guaranteed that
detach_pid will be called on the child_reaper before it is freed, and
detach_pid calls __change_pid, which calls free_pid, which takes the
pidmap_lock.  __change_pid only calls free_pid if this is the
last use of the pid.  For a thread that is not the thread group leader,
the thread's pid will only ever have one user because a thread's pid
is not allowed to be the pid of a process, of a process group or
a session.  For a thread that is a thread group leader, all of
the other threads of that process will be reaped before
the thread group leader is allowed to be reaped, ensuring there will only
be one user of the thread's pid as a process pid.  Furthermore,
because the thread is the init process of a pid namespace, all of the
other processes in the pid namespace will have also been already freed,
so the pid will not be used as a session pid or
a process group pid for any other running process.

Acked-by: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Tested-by: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Reported-by: Serge Hallyn &lt;serge.hallyn@ubuntu.com&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs</title>
<updated>2013-05-02T00:51:54Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-05-02T00:51:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=20b4fb485227404329e41ad15588afad3df23050'/>
<id>urn:sha1:20b4fb485227404329e41ad15588afad3df23050</id>
<content type='text'>
Pull VFS updates from Al Viro,

Misc cleanups all over the place, mainly wrt /proc interfaces (switch
create_proc_entry to proc_create(), get rid of the deprecated
create_proc_read_entry() in favor of using proc_create_data() and
seq_file etc).

7kloc removed.

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits)
  don't bother with deferred freeing of fdtables
  proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h
  proc: Make the PROC_I() and PDE() macros internal to procfs
  proc: Supply a function to remove a proc entry by PDE
  take cgroup_open() and cpuset_open() to fs/proc/base.c
  ppc: Clean up scanlog
  ppc: Clean up rtas_flash driver somewhat
  hostap: proc: Use remove_proc_subtree()
  drm: proc: Use remove_proc_subtree()
  drm: proc: Use minor-&gt;index to label things, not PDE-&gt;name
  drm: Constify drm_proc_list[]
  zoran: Don't print proc_dir_entry data in debug
  reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show()
  proc: Supply an accessor for getting the data from a PDE's parent
  airo: Use remove_proc_subtree()
  rtl8192u: Don't need to save device proc dir PDE
  rtl8187se: Use a dir under /proc/net/r8180/
  proc: Add proc_mkdir_data()
  proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h}
  proc: Move PDE_NET() to fs/proc/proc_net.c
  ...
</content>
</entry>
<entry>
<title>proc: Split the namespace stuff out into linux/proc_ns.h</title>
<updated>2013-05-01T21:29:39Z</updated>
<author>
<name>David Howells</name>
<email>dhowells@redhat.com</email>
</author>
<published>2013-04-12T00:50:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0bb80f240520c4148b623161e7856858c021696d'/>
<id>urn:sha1:0bb80f240520c4148b623161e7856858c021696d</id>
<content type='text'>
Split the proc namespace stuff out into linux/proc_ns.h.

Signed-off-by: David Howells &lt;dhowells@redhat.com&gt;
cc: netdev@vger.kernel.org
cc: Serge E. Hallyn &lt;serge.hallyn@ubuntu.com&gt;
cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
</content>
</entry>
<entry>
<title>pid_namespace.c/.h: simplify defines</title>
<updated>2013-05-01T00:04:07Z</updated>
<author>
<name>Raphael S.Carvalho</name>
<email>raphael.scarv@gmail.com</email>
</author>
<published>2013-04-30T22:28:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5cc5445164c16d32bab2912fac28356ab07aa8b4'/>
<id>urn:sha1:5cc5445164c16d32bab2912fac28356ab07aa8b4</id>
<content type='text'>
Move BITS_PER_PAGE from pid_namespace.c to pid_namespace.h, since we can
simplify the define PID_MAP_ENTRIES by using the BITS_PER_PAGE.

[akpm@linux-foundation.org: kernel/pid.c:54:1: warning: "BITS_PER_PAGE" redefined]
Signed-off-by: Raphael S.Carvalho &lt;raphael.scarv@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>kernel/pid.c: improve flow of a loop inside alloc_pidmap.</title>
<updated>2013-05-01T00:04:07Z</updated>
<author>
<name>Raphael S. Carvalho</name>
<email>raphael.scarv@gmail.com</email>
</author>
<published>2013-04-30T22:28:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8db049b3d666b3676ff4a976e03c14de302bf9fa'/>
<id>urn:sha1:8db049b3d666b3676ff4a976e03c14de302bf9fa</id>
<content type='text'>
find_next_offset() searches for an available "cleaned bit" in the
respective pid bitmap (page). It returns the offset if one is found;
otherwise it returns a value equal to BITS_PER_PAGE.

For example, if find_next_offset didn't find any available bit, there is
no purpose in calling mk_pid (wasted CPU cycles).

Therefore, it is better to call mk_pid only after the check
(offset &lt; BITS_PER_PAGE) has succeeded. Another point: if (offset
&lt; BITS_PER_PAGE) results in a "failure", then mk_pid would be called
again afterwards.
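
The reordering described above can be sketched as a small model. This is
illustrative Python, not the kernel code: the tiny BITS_PER_PAGE value, the
list-of-booleans bitmap, the simplified mk_pid signature and the
alloc_from_page helper are all assumptions made for this sketch.

```python
# Toy model of "check the offset first, then call mk_pid"; not kernel code.
BITS_PER_PAGE = 8  # tiny illustrative value, not the kernel's


def find_next_offset(bitmap, start):
    """Return the first clear bit at or after start, else BITS_PER_PAGE."""
    for offset in range(start, BITS_PER_PAGE):
        if not bitmap[offset]:
            return offset
    return BITS_PER_PAGE


def mk_pid(page_index, offset):
    """Turn a (page, offset) pair into a pid number (simplified signature)."""
    return page_index * BITS_PER_PAGE + offset


def alloc_from_page(bitmap, page_index, start):
    """Claim the first free bit in one page, or return None if it is full."""
    offset = find_next_offset(bitmap, start)
    if offset == BITS_PER_PAGE:
        return None  # no free bit: mk_pid is never called
    bitmap[offset] = True
    return mk_pid(page_index, offset)
```

In this sketch mk_pid runs only once a free bit is known to exist, which is
the point of the change: a full page costs one search and no pid
computation.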

[akpm@linux-foundation.org: simplify code]
Signed-off-by: Raphael S. Carvalho &lt;raphael.scarv@gmail.com&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>hlist: drop the node parameter from iterators</title>
<updated>2013-02-28T03:10:24Z</updated>
<author>
<name>Sasha Levin</name>
<email>sasha.levin@oracle.com</email>
</author>
<published>2013-02-28T01:06:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b67bfe0d42cac56c512dd5da4b1b347a23f4b70a'/>
<id>urn:sha1:b67bfe0d42cac56c512dd5da4b1b347a23f4b70a</id>
<content type='text'>
I'm not sure why, but the hlist for each entry iterators were conceived
differently from the list ones. The list iterators look like this:

        list_for_each_entry(pos, head, member)

The hlist ones were greedy and wanted an extra parameter:

        hlist_for_each_entry(tpos, pos, head, member)

Why did they need an extra pos parameter? I'm not quite sure. Not only
do they not really need it, it also prevents the iterator from looking
exactly like the list iterator, which is unfortunate.

Besides the semantic patch, there was some manual work required:

 - Fix up the actual hlist iterators in linux/list.h
 - Fix up the declaration of other iterators based on the hlist ones.
 - A very small number of places were using the 'node' parameter; these
 were modified to use 'obj-&gt;member' instead.
 - Coccinelle didn't handle the hlist_for_each_entry_safe iterator
 properly, so those had to be fixed up manually.

The semantic patch which is mostly the work of Peter Senna Tschudin is here:

@@
iterator name hlist_for_each_entry, hlist_for_each_entry_continue, hlist_for_each_entry_from, hlist_for_each_entry_rcu, hlist_for_each_entry_rcu_bh, hlist_for_each_entry_continue_rcu_bh, for_each_busy_worker, ax25_uid_for_each, ax25_for_each, inet_bind_bucket_for_each, sctp_for_each_hentry, sk_for_each, sk_for_each_rcu, sk_for_each_from, sk_for_each_safe, sk_for_each_bound, hlist_for_each_entry_safe, hlist_for_each_entry_continue_rcu, nr_neigh_for_each, nr_neigh_for_each_safe, nr_node_for_each, nr_node_for_each_safe, for_each_gfn_indirect_valid_sp, for_each_gfn_sp, for_each_host;

type T;
expression a,c,d,e;
identifier b;
statement S;
@@

-T b;
    &lt;+... when != b
(
hlist_for_each_entry(a,
- b,
c, d) S
|
hlist_for_each_entry_continue(a,
- b,
c) S
|
hlist_for_each_entry_from(a,
- b,
c) S
|
hlist_for_each_entry_rcu(a,
- b,
c, d) S
|
hlist_for_each_entry_rcu_bh(a,
- b,
c, d) S
|
hlist_for_each_entry_continue_rcu_bh(a,
- b,
c) S
|
for_each_busy_worker(a, c,
- b,
d) S
|
ax25_uid_for_each(a,
- b,
c) S
|
ax25_for_each(a,
- b,
c) S
|
inet_bind_bucket_for_each(a,
- b,
c) S
|
sctp_for_each_hentry(a,
- b,
c) S
|
sk_for_each(a,
- b,
c) S
|
sk_for_each_rcu(a,
- b,
c) S
|
sk_for_each_from
-(a, b)
+(a)
S
+ sk_for_each_from(a) S
|
sk_for_each_safe(a,
- b,
c, d) S
|
sk_for_each_bound(a,
- b,
c) S
|
hlist_for_each_entry_safe(a,
- b,
c, d, e) S
|
hlist_for_each_entry_continue_rcu(a,
- b,
c) S
|
nr_neigh_for_each(a,
- b,
c) S
|
nr_neigh_for_each_safe(a,
- b,
c, d) S
|
nr_node_for_each(a,
- b,
c) S
|
nr_node_for_each_safe(a,
- b,
c, d) S
|
- for_each_gfn_sp(a, c, d, b) S
+ for_each_gfn_sp(a, c, d) S
|
- for_each_gfn_indirect_valid_sp(a, c, d, b) S
+ for_each_gfn_indirect_valid_sp(a, c, d) S
|
for_each_host(a,
- b,
c) S
|
for_each_host_safe(a,
- b,
c, d) S
|
for_each_mesh_entry(a,
- b,
c, d) S
)
    ...+&gt;

[akpm@linux-foundation.org: drop bogus change from net/ipv4/raw.c]
[akpm@linux-foundation.org: drop bogus hunk from net/ipv6/raw.c]
[akpm@linux-foundation.org: checkpatch fixes]
[akpm@linux-foundation.org: fix warnings]
[akpm@linux-foundation.org: redo intrusive kvm changes]
Tested-by: Peter Senna Tschudin &lt;peter.senna@gmail.com&gt;
Acked-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Cc: Wu Fengguang &lt;fengguang.wu@intel.com&gt;
Cc: Marcelo Tosatti &lt;mtosatti@redhat.com&gt;
Cc: Gleb Natapov &lt;gleb@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>kernel/pid.c: reenable interrupts when alloc_pid() fails because init has exited</title>
<updated>2013-02-12T22:34:00Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2013-02-12T21:46:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6e6668845fe593414a938b7726d6359b5570ac5a'/>
<id>urn:sha1:6e6668845fe593414a938b7726d6359b5570ac5a</id>
<content type='text'>
We're forgetting to reenable local interrupts on an error path.

Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Reported-by: Josh Boyer &lt;jwboyer@redhat.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>pidns: Stop pid allocation when init dies</title>
<updated>2012-12-26T00:10:05Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2012-12-22T04:27:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c876ad7682155958d0c9c27afe9017925c230d64'/>
<id>urn:sha1:c876ad7682155958d0c9c27afe9017925c230d64</id>
<content type='text'>
Oleg pointed out that in a pid namespace the sequence:
- pid 1 becomes a zombie
- setns(thepidns), fork,...
- reaping pid 1.
- The injected processes exiting.

can lead to processes attempting to access their child reaper and
instead following a stale pointer.

That waitpid for init can return before all of the processes in
the pid namespace have exited is also unfortunate.

Avoid these problems by disabling the allocation of new pids in a pid
namespace when init dies, instead of when the last process in a pid
namespace is reaped.
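
The policy described above can be sketched as a toy model. This is
illustrative Python, not the kernel code; the ToyPidNamespace class and its
method names are hypothetical and exist only for this sketch.

```python
# Toy model of "stop pid allocation when init dies"; not kernel code.
class ToyPidNamespace:
    def __init__(self):
        self.allocation_enabled = True
        self.next_pid = 1

    def alloc_pid(self):
        """Hand out the next pid, or None once allocation is disabled."""
        if not self.allocation_enabled:
            return None
        pid = self.next_pid
        self.next_pid += 1
        return pid

    def init_dies(self):
        """Disable allocation as soon as init dies, before it is reaped."""
        self.allocation_enabled = False
```

In this sketch a process injected after init has died cannot obtain a pid,
so nothing new can enter the namespace and later follow a stale child
reaper pointer.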

Pointed-out-by:  Oleg Nesterov &lt;oleg@redhat.com&gt;
Reviewed-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'akpm' (Andrew's patch-bomb)</title>
<updated>2012-12-18T04:58:12Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-12-18T04:58:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=848b81415c42ff3dc9a4204749087b015c37ef66'/>
<id>urn:sha1:848b81415c42ff3dc9a4204749087b015c37ef66</id>
<content type='text'>
Merge misc patches from Andrew Morton:
 "Incoming:

   - lots of misc stuff

   - backlight tree updates

   - lib/ updates

   - Oleg's percpu-rwsem changes

   - checkpatch

   - rtc

   - aoe

   - more checkpoint/restart support

  I still have a pile of MM stuff pending - Pekka should be merging
  later today after which that is good to go.  A number of other things
  are twiddling thumbs awaiting maintainer merges."

* emailed patches from Andrew Morton &lt;akpm@linux-foundation.org&gt;: (180 commits)
  scatterlist: don't BUG when we can trivially return a proper error.
  docs: update documentation about /proc/&lt;pid&gt;/fdinfo/&lt;fd&gt; fanotify output
  fs, fanotify: add @mflags field to fanotify output
  docs: add documentation about /proc/&lt;pid&gt;/fdinfo/&lt;fd&gt; output
  fs, notify: add procfs fdinfo helper
  fs, exportfs: add exportfs_encode_inode_fh() helper
  fs, exportfs: escape nil dereference if no s_export_op present
  fs, epoll: add procfs fdinfo helper
  fs, eventfd: add procfs fdinfo helper
  procfs: add ability to plug in auxiliary fdinfo providers
  tools/testing/selftests/kcmp/kcmp_test.c: print reason for failure in kcmp_test
  breakpoint selftests: print failure status instead of cause make error
  kcmp selftests: print fail status instead of cause make error
  kcmp selftests: make run_tests fix
  mem-hotplug selftests: print failure status instead of cause make error
  cpu-hotplug selftests: print failure status instead of cause make error
  mqueue selftests: print failure status instead of cause make error
  vm selftests: print failure status instead of cause make error
  ubifs: use prandom_bytes
  mtd: nandsim: use prandom_bytes
  ...
</content>
</entry>
<entry>
<title>pidns: remove unused is_container_init()</title>
<updated>2012-12-18T01:15:23Z</updated>
<author>
<name>Gao feng</name>
<email>gaofeng@cn.fujitsu.com</email>
</author>
<published>2012-12-18T00:03:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a5ba911ec3792168530d35e16a8ec3b6fc60bcb5'/>
<id>urn:sha1:a5ba911ec3792168530d35e16a8ec3b6fc60bcb5</id>
<content type='text'>
Since commit 1cdcbec1a337 ("CRED: Neuter sys_capset()")
is_container_init() has no callers.

Signed-off-by: Gao feng &lt;gaofeng@cn.fujitsu.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Acked-by: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Cc: James Morris &lt;jmorris@namei.org&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
