<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/cgroup.c, branch v3.12.43</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.12.43</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.12.43'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2015-01-29T14:45:16Z</updated>
<entry>
<title>move d_rcu from overlapping d_child to overlapping d_alias</title>
<updated>2015-01-29T14:45:16Z</updated>
<author>
<name>Al Viro</name>
<email>viro@zeniv.linux.org.uk</email>
</author>
<published>2014-10-26T23:19:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4b2f6663ebde6bed50209a05041b34c203116253'/>
<id>urn:sha1:4b2f6663ebde6bed50209a05041b34c203116253</id>
<content type='text'>
commit 946e51f2bf37f1656916eb75bd0742ba33983c28 upstream.

Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Acked-by: Miklos Szeredi &lt;mszeredi@suse.cz&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>cgroup: protect modifications to cgroup_idr with cgroup_mutex</title>
<updated>2014-03-26T11:24:36Z</updated>
<author>
<name>Li Zefan</name>
<email>lizefan@huawei.com</email>
</author>
<published>2014-02-11T08:05:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=59d9c5f94655a3cc63e2772e847509010598b7f1'/>
<id>urn:sha1:59d9c5f94655a3cc63e2772e847509010598b7f1</id>
<content type='text'>
commit 0ab02ca8f887908152d1a96db5130fc661d36a1e upstream.

Set up cgroupfs like this:
  # mount -t cgroup -o cpuacct xxx /cgroup
  # mkdir /cgroup/sub1
  # mkdir /cgroup/sub2

Then run these two commands:
  # for ((; ;)) { mkdir /cgroup/sub1/tmp &amp;&amp; rmdir /cgroup/sub1/tmp; } &amp;
  # for ((; ;)) { mkdir /cgroup/sub2/tmp &amp;&amp; rmdir /cgroup/sub2/tmp; } &amp;

After a few seconds you may see this warning:

------------[ cut here ]------------
WARNING: CPU: 1 PID: 25243 at lib/idr.c:527 sub_remove+0x87/0x1b0()
idr_remove called for id=6 which is not allocated.
...
Call Trace:
 [&lt;ffffffff8156063c&gt;] dump_stack+0x7a/0x96
 [&lt;ffffffff810591ac&gt;] warn_slowpath_common+0x8c/0xc0
 [&lt;ffffffff81059296&gt;] warn_slowpath_fmt+0x46/0x50
 [&lt;ffffffff81300aa7&gt;] sub_remove+0x87/0x1b0
 [&lt;ffffffff810f3f02&gt;] ? css_killed_work_fn+0x32/0x1b0
 [&lt;ffffffff81300bf5&gt;] idr_remove+0x25/0xd0
 [&lt;ffffffff810f2bab&gt;] cgroup_destroy_css_killed+0x5b/0xc0
 [&lt;ffffffff810f4000&gt;] css_killed_work_fn+0x130/0x1b0
 [&lt;ffffffff8107cdbc&gt;] process_one_work+0x26c/0x550
 [&lt;ffffffff8107eefe&gt;] worker_thread+0x12e/0x3b0
 [&lt;ffffffff81085f96&gt;] kthread+0xe6/0xf0
 [&lt;ffffffff81570bac&gt;] ret_from_fork+0x7c/0xb0
---[ end trace 2d1577ec10cf80d0 ]---

This happens because allocating and removing cgroup IDs is not properly
synchronized.

The bug was introduced when we converted cgroup_ida to cgroup_idr.
While synchronization is already done inside ida_simple_{get,remove}(),
callers are responsible for serializing concurrent calls to
idr_{alloc,remove}().
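
The locking contract described above can be illustrated with a
user-space Python sketch.  It is an analogy under stated assumptions,
not kernel code: IdTable is a hypothetical stand-in for an idr whose
alloc and remove operations do no locking of their own, and table_lock
plays the role of cgroup_mutex.

```python
import threading


class IdTable:
    """Hypothetical stand-in for an idr: no internal locking."""

    def __init__(self):
        self.used = set()

    def alloc(self):
        # Pick the lowest free id; racy unless the caller serializes.
        new_id = 0
        while new_id in self.used:
            new_id += 1
        self.used.add(new_id)
        return new_id

    def remove(self, old_id):
        if old_id not in self.used:
            raise KeyError("remove called for id=%d which is not allocated" % old_id)
        self.used.remove(old_id)


table = IdTable()
table_lock = threading.Lock()  # plays the role of cgroup_mutex
errors = []


def churn(rounds):
    # Mimics the mkdir/rmdir loop: allocate an id, then remove it.
    for _ in range(rounds):
        try:
            with table_lock:
                new_id = table.alloc()
            with table_lock:
                table.remove(new_id)
        except KeyError as exc:
            errors.append(exc)


workers = [threading.Thread(target=churn, args=(2000,)) for _ in range(2)]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
```

With every alloc and remove call made under table_lock, two threads can
never be handed the same id, so no remove ever targets an id that is
not allocated.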

[mhocko@suse.cz: ported to 3.12]
Fixes: 4e96ee8e981b ("cgroup: convert cgroup_ida to cgroup_idr")
Cc: &lt;stable@vger.kernel.org&gt; #3.12+
Reported-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Michal Hocko &lt;mhocko@suse.cz&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>cgroup: update cgroup_enable_task_cg_lists() to grab siglock</title>
<updated>2014-03-05T16:13:43Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2014-02-13T18:29:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2c561a28bfeafc587b9d1ecc0b3c0568c2489ee7'/>
<id>urn:sha1:2c561a28bfeafc587b9d1ecc0b3c0568c2489ee7</id>
<content type='text'>
commit 532de3fc72adc2a6525c4d53c07bf81e1732083d upstream.

Currently, nothing prevents cgroup_enable_task_cg_lists() from missing
an already-set PF_EXITING and racing against cgroup_exit().  Depending
on the timing, cgroup_exit() may finish with the task still linked on
css_set, leading to list corruption.  Fix it by grabbing siglock in
cgroup_enable_task_cg_lists() so that PF_EXITING is guaranteed to be
visible.

This whole on-demand cg_list optimization is extremely fragile and has
ample possibility to lead to bugs which can cause things like
once-a-year oops during boot.  I'm wondering whether the better
approach would be just adding "cgroup_disable=all" handling which
disables the whole cgroup rather than tempting fate with this
on-demand craziness.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>cgroup: fix locking in cgroup_cfts_commit()</title>
<updated>2014-03-05T16:13:43Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2014-02-08T15:26:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3e507fa19fea1d5423f4be675fd211c52f7c12c4'/>
<id>urn:sha1:3e507fa19fea1d5423f4be675fd211c52f7c12c4</id>
<content type='text'>
commit 48573a893303986e3b0b2974d6fb11f3d1bb7064 upstream.

cgroup_cfts_commit() walks the cgroup hierarchy that the target
subsystem is attached to and tries to apply the file changes.  Due to
the convolution with inode locking, it can't keep cgroup_mutex locked
while iterating.  It currently holds only RCU read lock around the
actual iteration and then pins the found cgroup using dget().

Unfortunately, this is incorrect.  Although the iteration does check
cgroup_is_dead() before invoking dget(), there's nothing which
prevents the dentry from going away in between.  Note that this is
different from the usual css iterations where css_tryget() is used to
pin the css - css_tryget() tests whether the css can be pinned and
fails if not.
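
The tryget idea can be sketched in user-space Python.  This is only an
analogy: Pinnable and its methods are hypothetical names, not the
kernel css API.  The point it shows is that the liveness test and the
pin happen in one atomic step, whereas a separate is-dead check
followed by a plain get leaves a window in which the object can die.

```python
import threading


class Pinnable:
    """Hypothetical refcounted object; reaching zero references kills it."""

    def __init__(self):
        self._lock = threading.Lock()
        self._refs = 1  # base reference held by the owner

    def tryget(self):
        # Test and pin in one atomic step; fail once the object is dead.
        with self._lock:
            if self._refs == 0:
                return False
            self._refs += 1
            return True

    def put(self):
        with self._lock:
            self._refs -= 1

    def is_dead(self):
        with self._lock:
            return self._refs == 0


obj = Pinnable()
pinned_while_alive = obj.tryget()   # succeeds: the base reference exists
obj.put()                           # drop the pin
obj.put()                           # owner drops the base ref; object dies
pinned_after_death = obj.tryget()   # refuses to pin a dead object
```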

The problem can be solved by simply holding cgroup_mutex instead of
RCU read lock around the iteration, which actually reduces LOC.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>cgroup: fix error return from cgroup_create()</title>
<updated>2014-03-05T16:13:43Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2014-02-08T15:26:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=35fcf4dd2296b06b0ef042f203246dc73e6e3f02'/>
<id>urn:sha1:35fcf4dd2296b06b0ef042f203246dc73e6e3f02</id>
<content type='text'>
commit b58c89986a77a23658682a100eb15d8edb571ebb upstream.

cgroup_create() was returning 0 after allocation failures.  Fix it.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>cgroup: fix error return value in cgroup_mount()</title>
<updated>2014-03-05T16:13:43Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2014-02-08T15:26:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7737f595e8f566d41ef9f91246816bfbed4b7a87'/>
<id>urn:sha1:7737f595e8f566d41ef9f91246816bfbed4b7a87</id>
<content type='text'>
commit eb46bf89696972b856a9adb6aebd5c7b65c266e4 upstream.

When cgroup_mount() failed to allocate an id for the root, it didn't
set ret before jumping to unlock_drop, and so ended up returning 0
after a failure.  Fix it.
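
The pattern can be sketched in user-space Python.  This is an
illustration only: mount_root and alloc_id are hypothetical names, and
ENOMEM is an assumed error code chosen for the example, not necessarily
the value the kernel returns.

```python
import errno


def mount_root(alloc_id):
    """Hypothetical mount routine with a single exit path."""
    ret = 0
    root_id = alloc_id()
    if root_id is None:
        # The step that was missing: record the failure before bailing out.
        ret = -errno.ENOMEM
    # Shared unlock/cleanup would run here, then the status is returned.
    return ret


ok = mount_root(lambda: 1)          # allocation succeeds
failed = mount_root(lambda: None)   # allocation fails
```

Without the assignment inside the if branch, both calls would return 0
and the caller could not tell the failed mount from a successful one.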

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>cgroup: fix cgroup_create() error handling path</title>
<updated>2014-01-09T20:25:11Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2013-12-06T20:07:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7b8a321513bce5945bc36f18ab27651288975a4c'/>
<id>urn:sha1:7b8a321513bce5945bc36f18ab27651288975a4c</id>
<content type='text'>
commit 266ccd505e8acb98717819cef9d91d66c7b237cc upstream.

ae7f164a09 ("cgroup: move cgroup-&gt;subsys[] assignment to
online_css()") moved cgroup-&gt;subsys[] assignments later in
cgroup_create() but didn't update error handling path accordingly
leading to the following oops and leaking later css's after an
online_css() failure.  The oops is from cgroup destruction path being
invoked on the partially constructed cgroup which is not ready to
handle empty slots in cgrp-&gt;subsys[] array.

  BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
  IP: [&lt;ffffffff810eeaa8&gt;] cgroup_destroy_locked+0x118/0x2f0
  PGD a780a067 PUD aadbe067 PMD 0
  Oops: 0000 [#1] SMP
  Modules linked in:
  CPU: 6 PID: 7360 Comm: mkdir Not tainted 3.13.0-rc2+ #69
  Hardware name:
  task: ffff8800b9dbec00 ti: ffff8800a781a000 task.ti: ffff8800a781a000
  RIP: 0010:[&lt;ffffffff810eeaa8&gt;]  [&lt;ffffffff810eeaa8&gt;] cgroup_destroy_locked+0x118/0x2f0
  RSP: 0018:ffff8800a781bd98  EFLAGS: 00010282
  RAX: ffff880586903878 RBX: ffff880586903800 RCX: ffff880586903820
  RDX: ffff880586903860 RSI: ffff8800a781bdb0 RDI: ffff880586903820
  RBP: ffff8800a781bde8 R08: ffff88060e0b8048 R09: ffffffff811d7bc1
  R10: 000000000000008c R11: 0000000000000001 R12: ffff8800a72286c0
  R13: 0000000000000000 R14: ffffffff81cf7a40 R15: 0000000000000001
  FS:  00007f60ecda57a0(0000) GS:ffff8806272c0000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 0000000000000008 CR3: 00000000a7a03000 CR4: 00000000000007e0
  Stack:
   ffff880586903860 ffff880586903910 ffff8800a72286c0 ffff880586903820
   ffffffff81cf7a40 ffff880586903800 ffff88060e0b8018 ffffffff81cf7a40
   ffff8800b9dbec00 ffff8800b9dbf098 ffff8800a781bec8 ffffffff810ef5bf
  Call Trace:
   [&lt;ffffffff810ef5bf&gt;] cgroup_mkdir+0x55f/0x5f0
   [&lt;ffffffff811c90ae&gt;] vfs_mkdir+0xee/0x140
   [&lt;ffffffff811cb07e&gt;] SyS_mkdirat+0x6e/0xf0
   [&lt;ffffffff811c6a19&gt;] SyS_mkdir+0x19/0x20
   [&lt;ffffffff8169e569&gt;] system_call_fastpath+0x16/0x1b

This patch moves reference bumping inside online_css() loop, clears
css_ar[] as css's are brought online successfully, and updates
err_destroy path so that either a css is fully online and destroyed by
cgroup_destroy_locked() or the error path frees it.  This creates a
duplicate css free logic in the error path but it will be cleaned up
soon.

v2: Li pointed out that cgroup_destroy_locked() would do NULL-deref if
    invoked with a cgroup which doesn't have all css's populated.
    Update cgroup_destroy_locked() so that it skips NULL css's.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Reported-by: Vladimir Davydov &lt;vdavydov@parallels.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cgroup: fix cgroup_subsys_state leak for seq_files</title>
<updated>2013-12-04T19:05:55Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2013-11-27T23:16:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e6af24fef5c9ed17a25f062b3c22c3df44904dc9'/>
<id>urn:sha1:e6af24fef5c9ed17a25f062b3c22c3df44904dc9</id>
<content type='text'>
commit e605b36575e896edd8161534550c9ea021b03bc0 upstream.

If a cgroup file implements either read_map() or read_seq_string(),
such file is served using seq_file by overriding file-&gt;f_op to
cgroup_seqfile_operations, which also overrides the release method to
single_release() from cgroup_file_release().

Because cgroup_file_open() didn't use to acquire any resources, this
used to be fine, but since f7d58818ba42 ("cgroup: pin
cgroup_subsys_state when opening a cgroupfs file"), cgroup_file_open()
pins the css (cgroup_subsys_state) which is put by
cgroup_file_release().  The patch forgot to update the release path
for seq_files and each open/release cycle leaks a css reference.

Fix it by updating cgroup_file_release() to also handle seq_files and
using it for seq_file release path too.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cgroup: use a dedicated workqueue for cgroup destruction</title>
<updated>2013-12-04T19:05:55Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2013-11-22T22:14:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=65d6ec10c7cf2575de2aa9159f8cf43cbc1074fe'/>
<id>urn:sha1:65d6ec10c7cf2575de2aa9159f8cf43cbc1074fe</id>
<content type='text'>
commit e5fca243abae1445afbfceebda5f08462ef869d3 upstream.

Since be44562613851 ("cgroup: remove synchronize_rcu() from
cgroup_diput()"), cgroup destruction path makes use of workqueue.  css
freeing is performed from a work item from that point on and a later
commit, ea15f8ccdb430 ("cgroup: split cgroup destruction into two
steps"), moves css offlining to workqueue too.

As cgroup destruction isn't depended upon for memory reclaim, the
destruction work items were put on the system_wq; unfortunately, some
controllers may block in the destruction path for a considerable
duration while holding cgroup_mutex.  As a large part of the
destruction path is synchronized through cgroup_mutex, when combined
with a high rate of cgroup removals, this has the potential to fill up
system_wq's max_active of 256.

Also, it turns out that memcg's css destruction path ends up queueing
and waiting for work items on system_wq through work_on_cpu().  If
such an operation happens while system_wq is fully occupied by cgroup
destruction work items, work_on_cpu() can't make forward progress
because system_wq is full and other destruction work items on
system_wq can't make forward progress because the work item waiting
for work_on_cpu() is holding cgroup_mutex, leading to deadlock.

This can be fixed by queueing destruction work items on a separate
workqueue.  This patch creates a dedicated workqueue -
cgroup_destroy_wq - for this purpose.  As these work items shouldn't
have inter-dependencies and are mostly serialized by cgroup_mutex anyway,
giving a high concurrency level doesn't buy anything and the workqueue's
@max_active is set to 1 so that destruction work items are executed
one by one on each CPU.
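
The saturation problem and the effect of a separate queue can be
sketched with user-space thread pools in Python.  This is an analogy
under stated assumptions, not kernel code: the pools stand in for
workqueues, run_destroyers is a hypothetical helper, a timeout replaces
the real deadlock so the demo terminates, and the pool sizes are chosen
for the demo and do not model @max_active.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def run_destroyers(destroy_pool, shared_pool, count):
    """Run count destroy tasks that each wait on a helper in shared_pool."""
    barrier = threading.Barrier(count)

    def destroy():
        barrier.wait()  # make sure all destroy tasks are running at once
        helper = shared_pool.submit(lambda: "done")
        try:
            outcome = helper.result(timeout=0.5)
        except TimeoutError:
            outcome = "stuck"
        barrier.wait()  # hold the worker until every task has its outcome
        return outcome

    futures = [destroy_pool.submit(destroy) for _ in range(count)]
    return [f.result() for f in futures]


# Case 1: destroy tasks fill the shared pool, so helpers cannot start.
shared = ThreadPoolExecutor(max_workers=2)
saturated = run_destroyers(shared, shared, 2)
shared.shutdown()

# Case 2: destroy tasks run on their own pool; helpers find free workers.
shared = ThreadPoolExecutor(max_workers=2)
dedicated = ThreadPoolExecutor(max_workers=2)
separated = run_destroyers(dedicated, shared, 2)
dedicated.shutdown()
shared.shutdown()
```

In the first case every worker of the shared pool is occupied by a task
that waits for a helper queued behind it, so each wait times out.  In
the second case the waiting tasks no longer consume the workers their
helpers need.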

Hugh Dickins: Because cgroup_init() is run before init_workqueues(),
cgroup_destroy_wq can't be allocated from cgroup_init().  Do it from a
separate core_initcall().  In the future, we probably want to reorder
so that workqueue init happens before cgroup_init().

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Hugh Dickins &lt;hughd@google.com&gt;
Reported-by: Shawn Bohrer &lt;shawn.bohrer@gmail.com&gt;
Link: http://lkml.kernel.org/r/20131111220626.GA7509@sbohrermbp13-local.rgmadvisors.com
Link: http://lkml.kernel.org/g/alpine.LNX.2.00.1310301606080.2333@eggly.anvils
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>Merge branch 'for-3.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2013-10-22T07:20:34Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-10-22T07:20:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ee7eafc907db64ef4cbe8a17da3a1089cbe50617'/>
<id>urn:sha1:ee7eafc907db64ef4cbe8a17da3a1089cbe50617</id>
<content type='text'>
Pull cgroup fixes from Tejun Heo:
 "Two late fixes for cgroup.

  One fixes descendant walk introduced during this rc1 cycle.  The other
  fixes a post 3.9 bug during task attach which can lead to hang.  Both
  fixes are critical and the fixes are relatively straight-forward"

* 'for-3.12-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: fix to break the while loop in cgroup_attach_task() correctly
  cgroup: fix cgroup post-order descendant walk of empty subtree
</content>
</entry>
</feed>
