<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/cgroup, branch v4.11.8</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.11.8</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.11.8'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2017-06-14T13:07:50Z</updated>
<entry>
<title>cgroup: mark cgroup_get() with __maybe_unused</title>
<updated>2017-06-14T13:07:50Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2017-05-01T19:24:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=695b2bd64be3a3824c264e4ba32b3d3d58b7d7e8'/>
<id>urn:sha1:695b2bd64be3a3824c264e4ba32b3d3d58b7d7e8</id>
<content type='text'>
commit 310b4816a5d8082416b4ab83e5a7b3cb92883a4d upstream.

a590b90d472f ("cgroup: fix spurious warnings on cgroup_is_dead() from
cgroup_sk_alloc()") converted most cgroup_get() usages to
cgroup_get_live() leaving cgroup_sk_alloc() the sole user of
cgroup_get().  When !CONFIG_SOCK_CGROUP_DATA, this ends up triggering
unused warning for cgroup_get().

Silence the warning by adding __maybe_unused to cgroup_get().

Reported-by: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Link: http://lkml.kernel.org/r/20170501145340.17e8ef86@canb.auug.org.au
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cpuset: consider dying css as offline</title>
<updated>2017-06-14T13:07:45Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2017-05-24T16:03:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9be0c9d62dfb1f00de04fceae948fd56e84cdd28'/>
<id>urn:sha1:9be0c9d62dfb1f00de04fceae948fd56e84cdd28</id>
<content type='text'>
commit 41c25707d21716826e3c1f60967f5550610ec1c9 upstream.

In most cases, a cgroup controller don't care about the liftimes of
cgroups.  For the controller, a css becomes online when -&gt;css_online()
is called on it and offline when -&gt;css_offline() is called.

However, cpuset is special in that the user interface it exposes cares
whether certain cgroups exist or not.  Combined with the RCU delay
between cgroup removal and css offlining, this can lead to user
visible behavior oddities where operations which should succeed after
cgroup removals fail for some time period.  The effects of cgroup
removals are delayed when seen from userland.

This patch adds css_is_dying() which tests whether offline is pending
and updates is_cpuset_online() so that the function returns false also
while offline is pending.  This gets rid of the userland visible
delays.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Daniel Jordan &lt;daniel.m.jordan@oracle.com&gt;
Link: http://lkml.kernel.org/r/327ca1f5-7957-fbb9-9e5f-9ba149d40ba2@oracle.com
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cgroup: Prevent kill_css() from being called more than once</title>
<updated>2017-06-14T13:07:44Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2017-05-15T13:34:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f3c1dfa84dbefc138a85930ebc43aa9a875e5718'/>
<id>urn:sha1:f3c1dfa84dbefc138a85930ebc43aa9a875e5718</id>
<content type='text'>
commit 33c35aa4817864e056fd772230b0c6b552e36ea2 upstream.

The kill_css() function may be called more than once under the condition
that the css was killed but not physically removed yet followed by the
removal of the cgroup that is hosting the css. This patch prevents any
harmm from being done when that happens.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cgroup: fix spurious warnings on cgroup_is_dead() from cgroup_sk_alloc()</title>
<updated>2017-05-20T12:49:50Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2017-04-28T19:14:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a5938c093a1827a7262afc925a6a1032842af096'/>
<id>urn:sha1:a5938c093a1827a7262afc925a6a1032842af096</id>
<content type='text'>
commit a590b90d472f2c176c140576ee3ab44df7f67839 upstream.

cgroup_get() expected to be called only on live cgroups and triggers
warning on a dead cgroup; however, cgroup_sk_alloc() may be called
while cloning a socket which is left in an empty and removed cgroup
and thus may legitimately duplicate its reference on a dead cgroup.
This currently triggers the following warning spuriously.

 WARNING: CPU: 14 PID: 0 at kernel/cgroup.c:490 cgroup_get+0x55/0x60
 ...
  [&lt;ffffffff8107e123&gt;] __warn+0xd3/0xf0
  [&lt;ffffffff8107e20e&gt;] warn_slowpath_null+0x1e/0x20
  [&lt;ffffffff810ff465&gt;] cgroup_get+0x55/0x60
  [&lt;ffffffff81106061&gt;] cgroup_sk_alloc+0x51/0xe0
  [&lt;ffffffff81761beb&gt;] sk_clone_lock+0x2db/0x390
  [&lt;ffffffff817cce06&gt;] inet_csk_clone_lock+0x16/0xc0
  [&lt;ffffffff817e8173&gt;] tcp_create_openreq_child+0x23/0x4b0
  [&lt;ffffffff818601a1&gt;] tcp_v6_syn_recv_sock+0x91/0x670
  [&lt;ffffffff817e8b16&gt;] tcp_check_req+0x3a6/0x4e0
  [&lt;ffffffff81861ba3&gt;] tcp_v6_rcv+0x693/0xa00
  [&lt;ffffffff81837429&gt;] ip6_input_finish+0x59/0x3e0
  [&lt;ffffffff81837cb2&gt;] ip6_input+0x32/0xb0
  [&lt;ffffffff81837387&gt;] ip6_rcv_finish+0x57/0xa0
  [&lt;ffffffff81837ac8&gt;] ipv6_rcv+0x318/0x4d0
  [&lt;ffffffff817778c7&gt;] __netif_receive_skb_core+0x2d7/0x9a0
  [&lt;ffffffff81777fa6&gt;] __netif_receive_skb+0x16/0x70
  [&lt;ffffffff81778023&gt;] netif_receive_skb_internal+0x23/0x80
  [&lt;ffffffff817787d8&gt;] napi_gro_frags+0x208/0x270
  [&lt;ffffffff8168a9ec&gt;] mlx4_en_process_rx_cq+0x74c/0xf40
  [&lt;ffffffff8168b270&gt;] mlx4_en_poll_rx_cq+0x30/0x90
  [&lt;ffffffff81778b30&gt;] net_rx_action+0x210/0x350
  [&lt;ffffffff8188c426&gt;] __do_softirq+0x106/0x2c7
  [&lt;ffffffff81082bad&gt;] irq_exit+0x9d/0xa0 [&lt;ffffffff8188c0e4&gt;] do_IRQ+0x54/0xd0
  [&lt;ffffffff8188a63f&gt;] common_interrupt+0x7f/0x7f &lt;EOI&gt;
  [&lt;ffffffff8173d7e7&gt;] cpuidle_enter+0x17/0x20
  [&lt;ffffffff810bdfd9&gt;] cpu_startup_entry+0x2a9/0x2f0
  [&lt;ffffffff8103edd1&gt;] start_secondary+0xf1/0x100

This patch renames the existing cgroup_get() with the dead cgroup
warning to cgroup_get_live() after cgroup_kn_lock_live() and
introduces the new cgroup_get() which doesn't check whether the cgroup
is live or dead.

All existing cgroup_get() users except for cgroup_sk_alloc() are
converted to use cgroup_get_live().

Fixes: d979a39d7242 ("cgroup: duplicate cgroup reference when cloning sockets")
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Reported-by: Chris Mason &lt;clm@fb.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>Merge branch 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2017-04-16T18:48:10Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-04-16T18:48:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=11c994d9a50b6ba5a96a1c1450ce0fa7bd840f1c'/>
<id>urn:sha1:11c994d9a50b6ba5a96a1c1450ce0fa7bd840f1c</id>
<content type='text'>
Pull cgroup fix from Tejun Heo:
 "Unfortunately, the commit to fix the cgroup mount race in the previous
  pull request can lead to hangs.

  The original bug has been around for a while and isn't too likely to
  be triggered in usual use cases. Revert the commit for now"

* 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  Revert "cgroup: avoid attaching a cgroup root to two different superblocks"
</content>
</entry>
<entry>
<title>Revert "cgroup: avoid attaching a cgroup root to two different superblocks"</title>
<updated>2017-04-16T14:17:37Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2017-04-16T14:17:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=330c418638612d7658b6314e6a244fcb5f7efac5'/>
<id>urn:sha1:330c418638612d7658b6314e6a244fcb5f7efac5</id>
<content type='text'>
This reverts commit bfb0b80db5f9dca5ac0a5fd0edb765ee555e5a8e.

Andrei reports CRIU test hangs with the patch applied.  The bug fixed
by the patch isn't too likely to trigger in actual uses.  Revert the
patch for now.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Andrei Vagin &lt;avagin@virtuozzo.com&gt;
Link: http://lkml.kernel.org/r/20170414232737.GC20350@outlook.office365.com
</content>
</entry>
<entry>
<title>Merge branch 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2017-04-12T06:38:16Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-04-12T06:38:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=06ea4c38bce80f3eb99b01d7c17419eb1a49fec2'/>
<id>urn:sha1:06ea4c38bce80f3eb99b01d7c17419eb1a49fec2</id>
<content type='text'>
Pull cgroup fixes from Tejun Heo:
 "This contains fixes for two long standing subtle bugs:

   - kthread_bind() on a new kthread binds it to specific CPUs and
     prevents userland from messing with the affinity or cgroup
     membership. Unfortunately, for cgroup membership, there's a window
     between kthread creation and kthread_bind*() invocation where the
     kthread can be moved into a non-root cgroup by userland.

     Depending on what controllers are in effect, this can assign the
     kthread unexpected attributes. For example, in the reported case,
     workqueue workers ended up in a non-root cpuset cgroups and had
     their CPU affinities overridden. This broke workqueue invariants
     and led to workqueue stalls.

     Fixed by closing the window between kthread creation and
     kthread_bind() as suggested by Oleg.

   - There was a bug in cgroup mount path which could allow two
     competing mount attempts to attach the same cgroup_root to two
     different superblocks.

     This was caused by mishandling return value from kernfs_pin_sb().

     Fixed"

* 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroup: avoid attaching a cgroup root to two different superblocks
  cgroup, kthread: close race window where new kthreads can be migrated to non-root cgroups
</content>
</entry>
<entry>
<title>cgroup: avoid attaching a cgroup root to two different superblocks</title>
<updated>2017-04-11T00:00:57Z</updated>
<author>
<name>Zefan Li</name>
<email>lizefan@huawei.com</email>
</author>
<published>2017-04-07T08:51:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bfb0b80db5f9dca5ac0a5fd0edb765ee555e5a8e'/>
<id>urn:sha1:bfb0b80db5f9dca5ac0a5fd0edb765ee555e5a8e</id>
<content type='text'>
Run this:

    touch file0
    for ((; ;))
    {
        mount -t cpuset xxx file0
    }

And this concurrently:

    touch file1
    for ((; ;))
    {
        mount -t cpuset xxx file1
    }

We'll trigger a warning like this:

 ------------[ cut here ]------------
 WARNING: CPU: 1 PID: 4675 at lib/percpu-refcount.c:317 percpu_ref_kill_and_confirm+0x92/0xb0
 percpu_ref_kill_and_confirm called more than once on css_release!
 CPU: 1 PID: 4675 Comm: mount Not tainted 4.11.0-rc5+ #5
 Hardware name: Bochs Bochs, BIOS Bochs 01/01/2007
 Call Trace:
  dump_stack+0x63/0x84
  __warn+0xd1/0xf0
  warn_slowpath_fmt+0x5f/0x80
  percpu_ref_kill_and_confirm+0x92/0xb0
  cgroup_kill_sb+0x95/0xb0
  deactivate_locked_super+0x43/0x70
  deactivate_super+0x46/0x60
 ...
 ---[ end trace a79f61c2a2633700 ]---

Here's a race:

  Thread A				Thread B

  cgroup1_mount()
    # alloc a new cgroup root
    cgroup_setup_root()
					cgroup1_mount()
					  # no sb yet, returns NULL
					  kernfs_pin_sb()

					  # but succeeds in getting the refcnt,
					  # so re-use cgroup root
					  percpu_ref_tryget_live()
    # alloc sb with cgroup root
    cgroup_do_mount()

  cgroup_kill_sb()
					  # alloc another sb with same root
					  cgroup_do_mount()

					cgroup_kill_sb()

We end up using the same cgroup root for two different superblocks,
so percpu_ref_kill() will be called twice on the same root when the
two superblocks are destroyed.

We should fix to make sure the superblock pinning is really successful.

Cc: stable@vger.kernel.org # 3.16+
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cgroup, kthread: close race window where new kthreads can be migrated to non-root cgroups</title>
<updated>2017-03-17T14:18:47Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2017-03-16T20:54:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=77f88796cee819b9c4562b0b6b44691b3b7755b1'/>
<id>urn:sha1:77f88796cee819b9c4562b0b6b44691b3b7755b1</id>
<content type='text'>
Creation of a kthread goes through a couple interlocked stages between
the kthread itself and its creator.  Once the new kthread starts
running, it initializes itself and wakes up the creator.  The creator
then can further configure the kthread and then let it start doing its
job by waking it up.

In this configuration-by-creator stage, the creator is the only one
that can wake it up but the kthread is visible to userland.  When
altering the kthread's attributes from userland is allowed, this is
fine; however, for cases where CPU affinity is critical,
kthread_bind() is used to first disable affinity changes from userland
and then set the affinity.  This also prevents the kthread from being
migrated into non-root cgroups as that can affect the CPU affinity and
many other things.

Unfortunately, the cgroup side of protection is racy.  While the
PF_NO_SETAFFINITY flag prevents further migrations, userland can win
the race before the creator sets the flag with kthread_bind() and put
the kthread in a non-root cgroup, which can lead to all sorts of
problems including incorrect CPU affinity and starvation.

This bug got triggered by userland which periodically tries to migrate
all processes in the root cpuset cgroup to a non-root one.  Per-cpu
workqueue workers got caught while being created and ended up with
incorrected CPU affinity breaking concurrency management and sometimes
stalling workqueue execution.

This patch adds task-&gt;no_cgroup_migration which disallows the task to
be migrated by userland.  kthreadd starts with the flag set making
every child kthread start in the root cgroup with migration
disallowed.  The flag is cleared after the kthread finishes
initialization by which time PF_NO_SETAFFINITY is set if the kthread
should stay in the root cgroup.

It'd be better to wait for the initialization instead of failing but I
couldn't think of a way of implementing that without adding either a
new PF flag, or sleeping and retrying from waiting side.  Even if
userland depends on changing cgroup membership of a kthread, it either
has to be synchronized with kthread_create() or periodically repeat,
so it's unlikely that this would break anything.

v2: Switch to a simpler implementation using a new task_struct bit
    field suggested by Oleg.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Suggested-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reported-and-debugged-by: Chris Mason &lt;clm@fb.com&gt;
Cc: stable@vger.kernel.org # v4.3+ (we can't close the race on &lt; v4.3)
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2017-03-14T22:11:19Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-03-14T22:11:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=352526f45387cb96671f13b003bdd5b249e509bd'/>
<id>urn:sha1:352526f45387cb96671f13b003bdd5b249e509bd</id>
<content type='text'>
Pull cgroup fixes from Tejun Heo:
 "Three cgroup fixes.  Nothing critical:

   - the pids controller could trigger suspicious RCU warning
     spuriously. Fixed.

   - in the debug controller, %p -&gt; %pK to protect kernel pointer
     from getting exposed.

   - documentation formatting fix"

* 'for-4.11-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup:
  cgroups: censor kernel pointer in debug files
  cgroup/pids: remove spurious suspicious RCU usage warning
  cgroup: Fix indenting in PID controller documentation
</content>
</entry>
</feed>
