<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/cpuset.h, branch tmp/leds/core</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=tmp%2Fleds%2Fcore</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=tmp%2Fleds%2Fcore'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2009-01-08T16:31:11Z</updated>
<entry>
<title>cpuset: remove remaining pointers to cpumask_t</title>
<updated>2009-01-08T16:31:11Z</updated>
<author>
<name>Li Zefan</name>
<email>lizf@cn.fujitsu.com</email>
</author>
<published>2009-01-08T02:08:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6af866af34a96fed24a55979a78b6f73bd4e8e87'/>
<id>urn:sha1:6af866af34a96fed24a55979a78b6f73bd4e8e87</id>
<content type='text'>
Impact: cleanups, use new cpumask API

Final trivial cleanups: mainly s/cpumask_t/struct cpumask

Note there is a FIXME in generate_sched_domains(). A future patch will
change struct cpumask *doms to struct cpumask *doms[].
(I suppose Rusty will do this.)

Signed-off-by: Li Zefan &lt;lizf@cn.fujitsu.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Acked-by: Mike Travis &lt;travis@sgi.com&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>oom: print triggering task's cpuset and mems allowed</title>
<updated>2009-01-06T23:58:59Z</updated>
<author>
<name>David Rientjes</name>
<email>rientjes@google.com</email>
</author>
<published>2009-01-06T22:39:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=75aa199410359dc5fbcf9025ff7af98a9d20f0d5'/>
<id>urn:sha1:75aa199410359dc5fbcf9025ff7af98a9d20f0d5</id>
<content type='text'>
When cpusets are enabled, it's necessary to print the triggering task's
set of allowable nodes so the subsequently printed meminfo can be
interpreted correctly.

We also print the task's cpuset name for informational purposes.

[rientjes@google.com: task lock current before dereferencing cpuset]
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: Li Zefan &lt;lizf@cn.fujitsu.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>cpuset: update top cpuset's mems after adding a node</title>
<updated>2008-11-20T02:49:58Z</updated>
<author>
<name>Miao Xie</name>
<email>miaox@cn.fujitsu.com</email>
</author>
<published>2008-11-19T23:36:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f481891fdc49d3d1b8a9674a1825d183069a805f'/>
<id>urn:sha1:f481891fdc49d3d1b8a9674a1825d183069a805f</id>
<content type='text'>
After adding a node into the machine, top cpuset's mems isn't updated.

By reviewing the code, we found that the update function

  cpuset_track_online_nodes()

was invoked after node_states[N_ONLINE] changes.  It is wrong because
N_ONLINE just means node has pgdat, and if node has/added memory, we use
N_HIGH_MEMORY.  So, We should invoke the update function after
node_states[N_HIGH_MEMORY] changes, just like its commit says.

This patch fixes it.  And we use notifier of memory hotplug instead of
direct calling of cpuset_track_online_nodes().

Signed-off-by: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Acked-by: Yasunori Goto &lt;y-goto@jp.fujitsu.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Paul Menage &lt;menage@google.com
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>sched: arch_reinit_sched_domains() must destroy domains to force rebuild</title>
<updated>2008-09-06T17:22:15Z</updated>
<author>
<name>Max Krasnyansky</name>
<email>maxk@qualcomm.com</email>
</author>
<published>2008-08-29T20:11:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dfb512ec4834116124da61d6c1ee10fd0aa32bd6'/>
<id>urn:sha1:dfb512ec4834116124da61d6c1ee10fd0aa32bd6</id>
<content type='text'>
What I realized recently is that calling rebuild_sched_domains() in
arch_reinit_sched_domains() by itself is not enough when cpusets are enabled.
partition_sched_domains() code is trying to avoid unnecessary domain rebuilds
and will not actually rebuild anything if new domain masks match the old ones.

What this means is that doing
     echo 1 &gt; /sys/devices/system/cpu/sched_mc_power_savings
on a system with cpusets enabled will not take affect untill something changes
in the cpuset setup (ie new sets created or deleted).

This patch fixes restore correct behaviour where domains must be rebuilt in
order to enable MC powersaving flags.

Test on quad-core Core2 box with both CONFIG_CPUSETS and !CONFIG_CPUSETS.
Also tested on dual-core Core2 laptop. Lockdep is happy and things are working
as expected.

Signed-off-by: Max Krasnyansky &lt;maxk@qualcomm.com&gt;
Tested-by: Vaidyanathan Srinivasan &lt;svaidy@linux.vnet.ibm.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>cpu hotplug, sched: Introduce cpu_active_map and redo sched domain managment (take 2)</title>
<updated>2008-07-18T11:22:25Z</updated>
<author>
<name>Max Krasnyansky</name>
<email>maxk@qualcomm.com</email>
</author>
<published>2008-07-15T11:43:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e761b7725234276a802322549cee5255305a0930'/>
<id>urn:sha1:e761b7725234276a802322549cee5255305a0930</id>
<content type='text'>
This is based on Linus' idea of creating cpu_active_map that prevents
scheduler load balancer from migrating tasks to the cpu that is going
down.

It allows us to simplify domain management code and avoid unecessary
domain rebuilds during cpu hotplug event handling.

Please ignore the cpusets part for now. It needs some more work in order
to avoid crazy lock nesting. Although I did simplfy and unify domain
reinitialization logic. We now simply call partition_sched_domains() in
all the cases. This means that we're using exact same code paths as in
cpusets case and hence the test below cover cpusets too.
Cpuset changes to make rebuild_sched_domains() callable from various
contexts are in the separate patch (right next after this one).

This not only boots but also easily handles
	while true; do make clean; make -j 8; done
and
	while true; do on-off-cpu 1; done
at the same time.
(on-off-cpu 1 simple does echo 0/1 &gt; /sys/.../cpu1/online thing).

Suprisingly the box (dual-core Core2) is quite usable. In fact I'm typing
this on right now in gnome-terminal and things are moving just fine.

Also this is running with most of the debug features enabled (lockdep,
mutex, etc) no BUG_ONs or lockdep complaints so far.

I believe I addressed all of the Dmitry's comments for original Linus'
version. I changed both fair and rt balancer to mask out non-active cpus.
And replaced cpu_is_offline() with !cpu_active() in the main scheduler
code where it made sense (to me).

Signed-off-by: Max Krasnyanskiy &lt;maxk@qualcomm.com&gt;
Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Gregory Haskins &lt;ghaskins@novell.com&gt;
Cc: dmitry.adamushko@gmail.com
Cc: pj@sgi.com
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>mm: filter based on a nodemask as well as a gfp_mask</title>
<updated>2008-04-28T15:58:19Z</updated>
<author>
<name>Mel Gorman</name>
<email>mel@csn.ul.ie</email>
</author>
<published>2008-04-28T09:12:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=19770b32609b6bf97a3dece2529089494cbfc549'/>
<id>urn:sha1:19770b32609b6bf97a3dece2529089494cbfc549</id>
<content type='text'>
The MPOL_BIND policy creates a zonelist that is used for allocations
controlled by that mempolicy.  As the per-node zonelist is already being
filtered based on a zone id, this patch adds a version of __alloc_pages() that
takes a nodemask for further filtering.  This eliminates the need for
MPOL_BIND to create a custom zonelist.

A positive benefit of this is that allocations using MPOL_BIND now use the
local node's distance-ordered zonelist instead of a custom node-id-ordered
zonelist.  I.e., pages will be allocated from the closest allowed node with
available memory.

[Lee.Schermerhorn@hp.com: Mempolicy: update stale documentation and comments]
[Lee.Schermerhorn@hp.com: Mempolicy: make dequeue_huge_page_vma() obey MPOL_BIND nodemask]
[Lee.Schermerhorn@hp.com: Mempolicy: make dequeue_huge_page_vma() obey MPOL_BIND nodemask rework]
Signed-off-by: Mel Gorman &lt;mel@csn.ul.ie&gt;
Acked-by: Christoph Lameter &lt;clameter@sgi.com&gt;
Signed-off-by: Lee Schermerhorn &lt;lee.schermerhorn@hp.com&gt;
Cc: KAMEZAWA Hiroyuki &lt;kamezawa.hiroyu@jp.fujitsu.com&gt;
Cc: Mel Gorman &lt;mel@csn.ul.ie&gt;
Cc: Hugh Dickins &lt;hugh@veritas.com&gt;
Cc: Nick Piggin &lt;nickpiggin@yahoo.com.au&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>cpuset: modify cpuset_set_cpus_allowed to use cpumask pointer</title>
<updated>2008-04-19T17:44:58Z</updated>
<author>
<name>Mike Travis</name>
<email>travis@sgi.com</email>
</author>
<published>2008-04-05T01:11:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f9a86fcbbb1e5542eabf45c9144ac4b6330861a4'/>
<id>urn:sha1:f9a86fcbbb1e5542eabf45c9144ac4b6330861a4</id>
<content type='text'>
  * Modify cpuset_cpus_allowed to return the currently allowed cpuset
    via a pointer argument instead of as the function return value.

  * Use new set_cpus_allowed_ptr function.

  * Cleanup CPU_MASK_ALL and NODE_MASK_ALL uses.

Depends on:
	[sched-devel]: sched: add new set_cpus_allowed_ptr function

Signed-off-by: Mike Travis &lt;travis@sgi.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>mempolicy: silently restrict nodemask to allowed nodes</title>
<updated>2008-02-12T04:48:29Z</updated>
<author>
<name>KOSAKI Motohiro</name>
<email>kosaki.motohiro@jp.fujitsu.com</email>
</author>
<published>2008-02-12T04:30:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=31f1de46b90ad360a16e7af3e277d104961df923'/>
<id>urn:sha1:31f1de46b90ad360a16e7af3e277d104961df923</id>
<content type='text'>
Kosaki Motohito noted that "numactl --interleave=all ..." failed in the
presence of memoryless nodes.  This patch attempts to fix that problem.

Some background:

numactl --interleave=all calls set_mempolicy(2) with a fully populated
[out to MAXNUMNODES] nodemask.  set_mempolicy() [in do_set_mempolicy()]
calls contextualize_policy() which requires that the nodemask be a
subset of the current task's mems_allowed; else EINVAL will be returned.

A task's mems_allowed will always be a subset of node_states[N_HIGH_MEMORY]
i.e., nodes with memory.  So, a fully populated nodemask will be
declared invalid if it includes memoryless nodes.

  NOTE:  the same thing will occur when running in a cpuset
         with restricted mem_allowed--for the same reason:
         node mask contains dis-allowed nodes.

mbind(2), on the other hand, just masks off any nodes in the nodemask
that are not included in the caller's mems_allowed.

In each case [mbind() and set_mempolicy()], mpol_check_policy() will
complain [again, resulting in EINVAL] if the nodemask contains any
memoryless nodes.  This is somewhat redundant as mpol_new() will remove
memoryless nodes for interleave policy, as will bind_zonelist()--called
by mpol_new() for BIND policy.

Proposed fix:

1) modify contextualize_policy logic to:
   a) remember whether the incoming node mask is empty.
   b) if not, restrict the nodemask to allowed nodes, as is
      currently done in-line for mbind().  This guarantees
      that the resulting mask includes only nodes with memory.

      NOTE:  this is a [benign, IMO] change in behavior for
             set_mempolicy().  Dis-allowed nodes will be
             silently ignored, rather than returning an error.

   c) fold this code into mpol_check_policy(), replace 2 calls to
      contextualize_policy() to call mpol_check_policy() directly
      and remove contextualize_policy().

2) In existing mpol_check_policy() logic, after "contextualization":
   a) MPOL_DEFAULT:  require that in coming mask "was_empty"
   b) MPOL_{BIND|INTERLEAVE}:  require that contextualized nodemask
      contains at least one node.
   c) add a case for MPOL_PREFERRED:  if in coming was not empty
      and resulting mask IS empty, user specified invalid nodes.
      Return EINVAL.
   c) remove the now redundant check for memoryless nodes

3) remove the now redundant masking of policy nodes for interleave
   policy from mpol_new().

4) Now that mpol_check_policy() contextualizes the nodemask, remove
   the in-line nodes_and() from sys_mbind().  I believe that this
   restores mbind() to the behavior before the memoryless-nodes
   patch series.  E.g., we'll no longer treat an invalid nodemask
   with MPOL_PREFERRED as local allocation.

[ Patch history:

  v1 -&gt; v2:
   - Communicate whether or not incoming node mask was empty to
     mpol_check_policy() for better error checking.
   - As suggested by David Rientjes, remove the now unused
     cpuset_nodes_subset_current_mems_allowed() from cpuset.h

  v2 -&gt; v3:
   - As suggested by Kosaki Motohito, fold the "contextualization"
     of policy nodemask into mpol_check_policy().  Looks a little
     cleaner. ]

Signed-off-by:  Lee Schermerhorn &lt;lee.schermerhorn@hp.com&gt;
Signed-off-by:  KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Tested-by:      KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Acked-by:       David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>proc: seqfile convert proc_pid_status to properly handle pid namespaces</title>
<updated>2008-02-08T17:22:24Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2008-02-08T12:18:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=df5f8314ca30d6a76735748e5ba4ca9809c0f434'/>
<id>urn:sha1:df5f8314ca30d6a76735748e5ba4ca9809c0f434</id>
<content type='text'>
Currently we possibly lookup the pid in the wrong pid namespace.  So
seq_file convert proc_pid_status which ensures the proper pid namespaces is
passed in.

[akpm@linux-foundation.org: coding-style fixes]
[akpm@linux-foundation.org: build fix]
[akpm@linux-foundation.org: another build fix]
[akpm@linux-foundation.org: s390 build fix]
[akpm@linux-foundation.org: fix task_name() output]
[akpm@linux-foundation.org: fix nommu build]
Signed-off-by: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Andrew Morgan &lt;morgan@kernel.org&gt;
Cc: Serge Hallyn &lt;serue@us.ibm.com&gt;
Cc: Cedric Le Goater &lt;clg@fr.ibm.com&gt;
Cc: Pavel Emelyanov &lt;xemul@openvz.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Paul Menage &lt;menage@google.com&gt;
Cc: Paul Jackson &lt;pj@sgi.com&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>hotplug cpu: migrate a task within its cpuset</title>
<updated>2007-10-19T18:53:44Z</updated>
<author>
<name>Cliff Wickman</name>
<email>cpw@sgi.com</email>
</author>
<published>2007-10-19T06:40:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=470fd646444c65a5d062a371f5ec8dcedee61239'/>
<id>urn:sha1:470fd646444c65a5d062a371f5ec8dcedee61239</id>
<content type='text'>
When a cpu is disabled, move_task_off_dead_cpu() is called for tasks that have
been running on that cpu.

Currently, such a task is migrated:
 1) to any cpu on the same node as the disabled cpu, which is both online
    and among that task's cpus_allowed
 2) to any cpu which is both online and among that task's cpus_allowed

It is typical of a multithreaded application running on a large NUMA system to
have its tasks confined to a cpuset so as to cluster them near the memory that
they share.  Furthermore, it is typical to explicitly place such a task on a
specific cpu in that cpuset.  And in that case the task's cpus_allowed
includes only a single cpu.

This patch would insert a preference to migrate such a task to some cpu within
its cpuset (and set its cpus_allowed to its entire cpuset).

With this patch, migrate the task to:
 1) to any cpu on the same node as the disabled cpu, which is both online
    and among that task's cpus_allowed
 2) to any online cpu within the task's cpuset
 3) to any cpu which is both online and among that task's cpus_allowed

In order to do this, move_task_off_dead_cpu() must make a call to
cpuset_cpus_allowed_locked(), a new subset of cpuset_cpus_allowed(), that will
not block.  (name change - per Oleg's suggestion)

Calls are made to cpuset_lock() and cpuset_unlock() in migration_call() to set
the cpuset mutex during the whole migrate_live_tasks() and
migrate_dead_tasks() procedure.

[akpm@linux-foundation.org: build fix]
[pj@sgi.com: Fix indentation and spacing]
Signed-off-by: Cliff Wickman &lt;cpw@sgi.com&gt;
Cc: Oleg Nesterov &lt;oleg@tv-sign.ru&gt;
Cc: Christoph Lameter &lt;clameter@sgi.com&gt;
Cc: Paul Jackson &lt;pj@sgi.com&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Paul Jackson &lt;pj@sgi.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
