<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/cpuset.c, branch v3.12.48</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.12.48</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.12.48'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2015-04-09T11:14:06Z</updated>
<entry>
<title>cpuset: Fix cpuset sched_relax_domain_level</title>
<updated>2015-04-09T11:14:06Z</updated>
<author>
<name>Jason Low</name>
<email>jason.low2@hp.com</email>
</author>
<published>2015-02-13T03:58:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=90b682b6065978253a1c5ea75b26c44a67e1a38a'/>
<id>urn:sha1:90b682b6065978253a1c5ea75b26c44a67e1a38a</id>
<content type='text'>
commit 283cb41f426b723a0255702b761b0fc5d1b53a81 upstream.

The cpuset.sched_relax_domain_level can control how far we do
immediate load balancing on a system. However, it was found on recent
kernels that echo'ing a value into cpuset.sched_relax_domain_level
did not reduce any immediate load balancing.

This occurred because the update_domain_attr_tree() traversal did not
update for the "top_cpuset", so modifying the sched_relax_domain_level
parameter changed nothing.

This patch addresses the problem by having update_domain_attr_tree()
allow updates for the root in the cpuset traversal.
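
The effect of skipping the root in a pre-order walk can be sketched in a
userspace toy model. This is only an illustration, not the kernel code: the
Cpuset class, descendants_pre(), collect_relax_level() and the skip_root flag
are hypothetical names, and keeping the largest level seen is an assumption
about how the levels are combined.

```python
class Cpuset:
    """Toy stand-in for a cpuset node; not the kernel structure."""

    def __init__(self, name, relax_level, children=()):
        self.name = name
        self.relax_level = relax_level
        self.children = list(children)


def descendants_pre(root):
    """Yield the root first, then every descendant in pre-order."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        # Push children reversed so the leftmost child is visited first.
        stack.extend(reversed(node.children))


def collect_relax_level(root, skip_root):
    """Fold relax levels over the walk, keeping the largest one seen."""
    level = -1  # -1 stands for "no request" in this sketch
    for cs in descendants_pre(root):
        if skip_root and cs is root:
            continue  # the behaviour the commit message describes as broken
        level = max(level, cs.relax_level)
    return level


# Only the root carries a relax level, as when writing to the top cpuset.
top = Cpuset("top_cpuset", 3, [Cpuset("a", -1), Cpuset("b", -1)])
buggy = collect_relax_level(top, skip_root=True)
fixed = collect_relax_level(top, skip_root=False)
print(buggy, fixed)
```

In this model the walk that skips the root never sees the value written to
the top cpuset, while the walk that includes the root picks it up.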

Fixes: fc560a26acce ("cpuset: replace cpuset-&gt;stack_list with cpuset_for_each_descendant_pre()")
Signed-off-by: Jason Low &lt;jason.low2@hp.com&gt;
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Tested-by: Serge Hallyn &lt;serge.hallyn@canonical.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>mm: page_alloc: use jump labels to avoid checking number_of_cpusets</title>
<updated>2014-09-26T09:52:02Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2014-08-28T18:35:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ee1760b2b4920841f45b2ad07a1c9f99e08568e7'/>
<id>urn:sha1:ee1760b2b4920841f45b2ad07a1c9f99e08568e7</id>
<content type='text'>
commit 664eeddeef6539247691197c1ac124d4aa872ab6 upstream.

If cpusets are not in use then we still check a global variable on every
page allocation.  Use jump labels to avoid the overhead.

Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Reviewed-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Michal Hocko &lt;mhocko@suse.cz&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Cc: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephen Rothwell &lt;sfr@canb.auug.org.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>mm: optimize put_mems_allowed() usage</title>
<updated>2014-09-26T09:51:56Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2014-08-28T18:34:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=337c9823cf0b93c8f4d1c4654bd93cf24e5b837b'/>
<id>urn:sha1:337c9823cf0b93c8f4d1c4654bd93cf24e5b837b</id>
<content type='text'>
commit d26914d11751b23ca2e8747725f2cae10c2f2c1b upstream.

Since put_mems_allowed() is strictly optional (it's a seqcount retry), we
don't need to evaluate the function if the allocation was in fact
successful, saving an smp_rmb, some loads and comparisons on some
relatively fast paths.

Since the naming, get/put_mems_allowed(), does suggest a mandatory
pairing, rename the interface, as suggested by Mel, to resemble the
seqcount interface.

This gives us: read_mems_allowed_begin() and read_mems_allowed_retry(),
where it is important to note that the return value of the latter call
is inverted from its previous incarnation.
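
The begin/retry pattern described above can be sketched as a single-threaded
userspace toy. It is an illustration under stated assumptions, not the kernel
implementation: SeqCount, alloc_with_retry() and try_alloc() are hypothetical
names, and the toy reader does not spin while a write is in progress.

```python
class SeqCount:
    """Toy sequence counter: the value is odd while a write is in progress."""

    def __init__(self):
        self.sequence = 0

    def write(self, update):
        self.sequence += 1  # odd: write in progress
        update()
        self.sequence += 1  # even again: write finished

    def read_begin(self):
        return self.sequence

    def read_retry(self, cookie):
        # True means the protected data may have changed, so retry.
        return self.sequence != cookie


def alloc_with_retry(seq, try_alloc):
    """Retry only when the allocation failed and a writer ran meanwhile."""
    attempts = 0
    while True:
        cookie = seq.read_begin()
        page = try_alloc()
        attempts += 1
        # Short-circuit: a successful allocation never calls read_retry().
        if page is not None or not seq.read_retry(cookie):
            return page, attempts


seq = SeqCount()
allowed = {"nodes": {0}}
calls = []


def try_alloc():
    calls.append(1)
    if len(calls) == 1:
        # Simulate an update of the allowed nodes racing with a failed attempt.
        seq.write(lambda: allowed["nodes"].add(1))
        return None
    return "page"


page, attempts = alloc_with_retry(seq, try_alloc)
print(page, attempts)
```

The first attempt fails while the counter changes, so the loop retries; the
second attempt succeeds and returns without consulting the counter at all.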

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>cpuset,mempolicy: fix sleeping function called from invalid context</title>
<updated>2014-07-18T13:51:15Z</updated>
<author>
<name>Gu Zheng</name>
<email>guz.fnst@cn.fujitsu.com</email>
</author>
<published>2014-06-25T01:57:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d9e8b4f661a91d29bd641568a30e54c93cebb15d'/>
<id>urn:sha1:d9e8b4f661a91d29bd641568a30e54c93cebb15d</id>
<content type='text'>
commit 391acf970d21219a2a5446282d3b20eace0c0d7a upstream.

When running with the kernel (3.15-rc7+), the following bug occurs:
[ 9969.258987] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:586
[ 9969.359906] in_atomic(): 1, irqs_disabled(): 0, pid: 160655, name: python
[ 9969.441175] INFO: lockdep is turned off.
[ 9969.488184] CPU: 26 PID: 160655 Comm: python Tainted: G       A      3.15.0-rc7+ #85
[ 9969.581032] Hardware name: FUJITSU-SV PRIMEQUEST 1800E/SB, BIOS PRIMEQUEST 1000 Series BIOS Version 1.39 11/16/2012
[ 9969.706052]  ffffffff81a20e60 ffff8803e941fbd0 ffffffff8162f523 ffff8803e941fd18
[ 9969.795323]  ffff8803e941fbe0 ffffffff8109995a ffff8803e941fc58 ffffffff81633e6c
[ 9969.884710]  ffffffff811ba5dc ffff880405c6b480 ffff88041fdd90a0 0000000000002000
[ 9969.974071] Call Trace:
[ 9970.003403]  [&lt;ffffffff8162f523&gt;] dump_stack+0x4d/0x66
[ 9970.065074]  [&lt;ffffffff8109995a&gt;] __might_sleep+0xfa/0x130
[ 9970.130743]  [&lt;ffffffff81633e6c&gt;] mutex_lock_nested+0x3c/0x4f0
[ 9970.200638]  [&lt;ffffffff811ba5dc&gt;] ? kmem_cache_alloc+0x1bc/0x210
[ 9970.272610]  [&lt;ffffffff81105807&gt;] cpuset_mems_allowed+0x27/0x140
[ 9970.344584]  [&lt;ffffffff811b1303&gt;] ? __mpol_dup+0x63/0x150
[ 9970.409282]  [&lt;ffffffff811b1385&gt;] __mpol_dup+0xe5/0x150
[ 9970.471897]  [&lt;ffffffff811b1303&gt;] ? __mpol_dup+0x63/0x150
[ 9970.536585]  [&lt;ffffffff81068c86&gt;] ? copy_process.part.23+0x606/0x1d40
[ 9970.613763]  [&lt;ffffffff810bf28d&gt;] ? trace_hardirqs_on+0xd/0x10
[ 9970.683660]  [&lt;ffffffff810ddddf&gt;] ? monotonic_to_bootbased+0x2f/0x50
[ 9970.759795]  [&lt;ffffffff81068cf0&gt;] copy_process.part.23+0x670/0x1d40
[ 9970.834885]  [&lt;ffffffff8106a598&gt;] do_fork+0xd8/0x380
[ 9970.894375]  [&lt;ffffffff81110e4c&gt;] ? __audit_syscall_entry+0x9c/0xf0
[ 9970.969470]  [&lt;ffffffff8106a8c6&gt;] SyS_clone+0x16/0x20
[ 9971.030011]  [&lt;ffffffff81642009&gt;] stub_clone+0x69/0x90
[ 9971.091573]  [&lt;ffffffff81641c29&gt;] ? system_call_fastpath+0x16/0x1b

The cause is that cpuset_mems_allowed() tries to take
mutex_lock(&amp;callback_mutex) under rcu_read_lock() (which is held in
__mpol_dup()). In cpuset_mems_allowed(), the access to the cpuset is
under rcu_read_lock(), so in __mpol_dup() we can shrink the
rcu_read_lock() region to protect only the access to the cpuset in
current_cpuset_is_being_rebound(), which avoids this bug.

This patch is a temporary solution that addresses only the bug
mentioned above; it does not fix the long-standing issue with cpuset.mems
rebinding on fork():

"When the forker's task_struct is duplicated (which includes
 -&gt;mems_allowed) and it races with an update to cpuset_being_rebound
 in update_tasks_nodemask() then the task's mems_allowed doesn't get
 updated. And the child task's mems_allowed can be wrong if the
 cpuset's nodemask changes before the child has been added to the
 cgroup's tasklist."

Signed-off-by: Gu Zheng &lt;guz.fnst@cn.fujitsu.com&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>cpuset: fix a race condition in __cpuset_node_allowed_softwall()</title>
<updated>2014-03-22T21:01:55Z</updated>
<author>
<name>Li Zefan</name>
<email>lizefan@huawei.com</email>
</author>
<published>2014-02-27T10:19:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=44ba741a0f470c22c691586dea1e219df757fef2'/>
<id>urn:sha1:44ba741a0f470c22c691586dea1e219df757fef2</id>
<content type='text'>
commit 99afb0fd5f05aac467ffa85c36778fec4396209b upstream.

It's not safe to access task's cpuset after releasing task_lock().
Holding callback_mutex won't help.

Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>cpuset: fix a locking issue in cpuset_migrate_mm()</title>
<updated>2014-03-22T21:01:55Z</updated>
<author>
<name>Li Zefan</name>
<email>lizefan@huawei.com</email>
</author>
<published>2014-02-27T10:19:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d1275b0297b1995a15aab2361f4051684c18a2ea'/>
<id>urn:sha1:d1275b0297b1995a15aab2361f4051684c18a2ea</id>
<content type='text'>
commit 4729583006772b9530404bc1bb7c3aa4a10ffd4d upstream.

I can trigger a lockdep warning:

  # mount -t cgroup -o cpuset xxx /cgroup
  # mkdir /cgroup/cpuset
  # mkdir /cgroup/tmp
  # echo 0 &gt; /cgroup/tmp/cpuset.cpus
  # echo 0 &gt; /cgroup/tmp/cpuset.mems
  # echo 1 &gt; /cgroup/tmp/cpuset.memory_migrate
  # echo $$ &gt; /cgroup/tmp/tasks
  # echo 1 &gt; /cgroup/tmp/cpuset.mems

  ===============================
  [ INFO: suspicious RCU usage. ]
  3.14.0-rc1-0.1-default+ #32 Not tainted
  -------------------------------
  include/linux/cgroup.h:682 suspicious rcu_dereference_check() usage!
  ...
    [&lt;ffffffff81582174&gt;] dump_stack+0x72/0x86
    [&lt;ffffffff810b8f01&gt;] lockdep_rcu_suspicious+0x101/0x140
    [&lt;ffffffff81105ba1&gt;] cpuset_migrate_mm+0xb1/0xe0
  ...

We used to hold cgroup_mutex when calling cpuset_migrate_mm(), but now
we hold cpuset_mutex, which causes task_css() to complain.

This is not a false-positive but a real issue.

Holding cpuset_mutex won't prevent a task from migrating to another
cpuset, and it won't prevent the original task-&gt;cgroup from being
destroyed during this change.

Fixes: 5d21cc2db040 (cpuset: replace cgroup_mutex locking with cpuset internal locking)
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>cpuset: Fix memory allocator deadlock</title>
<updated>2013-12-04T19:05:55Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2013-11-26T14:03:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=764d66de37288d6baab42e07060de4f5c97089a2'/>
<id>urn:sha1:764d66de37288d6baab42e07060de4f5c97089a2</id>
<content type='text'>
commit 0fc0287c9ed1ffd3706f8b4d9b314aa102ef1245 upstream.

Juri hit the below lockdep report:

[    4.303391] ======================================================
[    4.303392] [ INFO: SOFTIRQ-safe -&gt; SOFTIRQ-unsafe lock order detected ]
[    4.303394] 3.12.0-dl-peterz+ #144 Not tainted
[    4.303395] ------------------------------------------------------
[    4.303397] kworker/u4:3/689 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire:
[    4.303399]  (&amp;p-&gt;mems_allowed_seq){+.+...}, at: [&lt;ffffffff8114e63c&gt;] new_slab+0x6c/0x290
[    4.303417]
[    4.303417] and this task is already holding:
[    4.303418]  (&amp;(&amp;q-&gt;__queue_lock)-&gt;rlock){..-...}, at: [&lt;ffffffff812d2dfb&gt;] blk_execute_rq_nowait+0x5b/0x100
[    4.303431] which would create a new lock dependency:
[    4.303432]  (&amp;(&amp;q-&gt;__queue_lock)-&gt;rlock){..-...} -&gt; (&amp;p-&gt;mems_allowed_seq){+.+...}
[    4.303436]

[    4.303898] the dependencies between the lock to be acquired and SOFTIRQ-irq-unsafe lock:
[    4.303918] -&gt; (&amp;p-&gt;mems_allowed_seq){+.+...} ops: 2762 {
[    4.303922]    HARDIRQ-ON-W at:
[    4.303923]                     [&lt;ffffffff8108ab9a&gt;] __lock_acquire+0x65a/0x1ff0
[    4.303926]                     [&lt;ffffffff8108cbe3&gt;] lock_acquire+0x93/0x140
[    4.303929]                     [&lt;ffffffff81063dd6&gt;] kthreadd+0x86/0x180
[    4.303931]                     [&lt;ffffffff816ded6c&gt;] ret_from_fork+0x7c/0xb0
[    4.303933]    SOFTIRQ-ON-W at:
[    4.303933]                     [&lt;ffffffff8108abcc&gt;] __lock_acquire+0x68c/0x1ff0
[    4.303935]                     [&lt;ffffffff8108cbe3&gt;] lock_acquire+0x93/0x140
[    4.303940]                     [&lt;ffffffff81063dd6&gt;] kthreadd+0x86/0x180
[    4.303955]                     [&lt;ffffffff816ded6c&gt;] ret_from_fork+0x7c/0xb0
[    4.303959]    INITIAL USE at:
[    4.303960]                    [&lt;ffffffff8108a884&gt;] __lock_acquire+0x344/0x1ff0
[    4.303963]                    [&lt;ffffffff8108cbe3&gt;] lock_acquire+0x93/0x140
[    4.303966]                    [&lt;ffffffff81063dd6&gt;] kthreadd+0x86/0x180
[    4.303969]                    [&lt;ffffffff816ded6c&gt;] ret_from_fork+0x7c/0xb0
[    4.303972]  }

Which reports that we take mems_allowed_seq with interrupts enabled. A
little digging found that this can only be from
cpuset_change_task_nodemask().

This is an actual deadlock because an interrupt doing an allocation will
hit get_mems_allowed()-&gt;...-&gt;__read_seqcount_begin(), which will spin
forever waiting for the write side to complete.

Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Reported-by: Juri Lelli &lt;juri.lelli@gmail.com&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Tested-by: Juri Lelli &lt;juri.lelli@gmail.com&gt;
Acked-by: Li Zefan &lt;lizefan@huawei.com&gt;
Acked-by: Mel Gorman &lt;mgorman@suse.de&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>Merge branch 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup</title>
<updated>2013-09-04T01:25:03Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-09-04T01:25:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=32dad03d164206ea886885d0740284ba215b0970'/>
<id>urn:sha1:32dad03d164206ea886885d0740284ba215b0970</id>
<content type='text'>
Pull cgroup updates from Tejun Heo:
 "A lot of activities on the cgroup front.  Most changes aren't visible
  to userland at all at this point and are laying foundation for the
  planned unified hierarchy.

   - The biggest change is decoupling the lifetime management of css
     (cgroup_subsys_state) from that of cgroup's.  Because controllers
     (cpu, memory, block and so on) will need to be dynamically enabled
     and disabled, css which is the association point between a cgroup
     and a controller may come and go dynamically across the lifetime of
     a cgroup.  Till now, css's were created when the associated cgroup
     was created and stayed till the cgroup got destroyed.

     Assumptions around this tight coupling permeated through cgroup
     core and controllers.  These assumptions are gradually removed,
     which constitutes the bulk of the patches, and css destruction path
     is completely decoupled from cgroup destruction path.  Note that
     decoupling of creation path is relatively easy on top of these
     changes and the patchset is pending for the next window.

   - cgroup has its own event mechanism cgroup.event_control, which is
     only used by memcg.  It is overly complex trying to achieve high
     flexibility whose benefits seem dubious at best.  Going forward,
     new events will simply generate file modified event and the
     existing mechanism is being made specific to memcg.  This pull
     request contains preparatory patches for such change.

   - Various fixes and cleanups"

Fixed up conflict in kernel/cgroup.c as per Tejun.

* 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: (69 commits)
  cgroup: fix cgroup_css() invocation in css_from_id()
  cgroup: make cgroup_write_event_control() use css_from_dir() instead of __d_cgrp()
  cgroup: make cgroup_event hold onto cgroup_subsys_state instead of cgroup
  cgroup: implement CFTYPE_NO_PREFIX
  cgroup: make cgroup_css() take cgroup_subsys * instead and allow NULL subsys
  cgroup: rename cgroup_css_from_dir() to css_from_dir() and update its syntax
  cgroup: fix cgroup_write_event_control()
  cgroup: fix subsystem file accesses on the root cgroup
  cgroup: change cgroup_from_id() to css_from_id()
  cgroup: use css_get() in cgroup_create() to check CSS_ROOT
  cpuset: remove an unncessary forward declaration
  cgroup: RCU protect each cgroup_subsys_state release
  cgroup: move subsys file removal to kill_css()
  cgroup: factor out kill_css()
  cgroup: decouple cgroup_subsys_state destruction from cgroup destruction
  cgroup: replace cgroup-&gt;css_kill_cnt with -&gt;nr_css
  cgroup: bounce cgroup_subsys_state ref kill confirmation to a work item
  cgroup: move cgroup-&gt;subsys[] assignment to online_css()
  cgroup: reorganize css init / exit paths
  cgroup: add __rcu modifier to cgroup-&gt;subsys[]
  ...
</content>
</entry>
<entry>
<title>cpuset: fix a regression in validating config change</title>
<updated>2013-08-21T12:40:27Z</updated>
<author>
<name>Li Zefan</name>
<email>lizefan@huawei.com</email>
</author>
<published>2013-08-21T02:22:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1c09b195d37fa459844036f429a0f378e70c3db6'/>
<id>urn:sha1:1c09b195d37fa459844036f429a0f378e70c3db6</id>
<content type='text'>
It's not allowed to clear masks of a cpuset if there're tasks in it,
but it's broken:

  # mkdir /cgroup/sub
  # echo 0 &gt; /cgroup/sub/cpuset.cpus
  # echo 0 &gt; /cgroup/sub/cpuset.mems
  # echo $$ &gt; /cgroup/sub/tasks
  # echo &gt; /cgroup/sub/cpuset.cpus
  (should fail)

This bug was introduced by commit 88fa523bff295f1d60244a54833480b02f775152
("cpuset: allow to move tasks to empty cpusets").

tj: Dropped temp bool variables and nested the conditionals directly.
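
The rule being restored can be sketched as a small userspace check. This is a
toy model under assumptions, not the kernel function: validate_change() here
is a hypothetical helper taking plain Python sets, and it only models the
"no clearing a populated mask while tasks are attached" rule.

```python
def validate_change(cur_cpus, cur_mems, new_cpus, new_mems, task_count):
    """Return True if the requested masks are acceptable in this toy model."""
    if task_count != 0:
        # With tasks attached, a populated mask must not become empty.
        if cur_cpus and not new_cpus:
            return False
        if cur_mems and not new_mems:
            return False
    return True


# One task attached, then an attempt to clear cpuset.cpus: rejected.
rejected = validate_change({0}, {0}, set(), {0}, task_count=1)
# No tasks attached: clearing the mask is allowed.
accepted = validate_change({0}, {0}, set(), {0}, task_count=0)
# Tasks in a cpuset whose masks were already empty: still allowed.
empty_ok = validate_change(set(), set(), set(), set(), task_count=1)
print(rejected, accepted, empty_ok)
```

The nested conditionals mean the emptiness checks only run when tasks are
present, so an already-empty cpuset that holds tasks is not rejected.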

Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>cpuset: remove an unncessary forward declaration</title>
<updated>2013-08-14T00:23:06Z</updated>
<author>
<name>Li Zefan</name>
<email>lizefan@huawei.com</email>
</author>
<published>2013-08-13T01:17:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ff58ac0d58d51bffe868b239ed8fce7c4a23c5a9'/>
<id>urn:sha1:ff58ac0d58d51bffe868b239ed8fce7c4a23c5a9</id>
<content type='text'>
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
</feed>
