user/sven/linux.git/kernel/cpuset.c, branch v2.6.16.20

[PATCH] cpuset: oops in exit on null cpuset fix

2006-02-15T23:32:21Z

Fix a latent bug in cpuset_exit() handling. If a task tried to allocate memory after calling cpuset_exit(), it oops'd in cpuset_update_task_memory_state() on a NULL cpuset pointer. So set the exiting task's cpuset to the root cpuset instead of to NULL. A distro kernel hit this with an added kernel package that had just such a hook (allocating memory) in the exit code path.

Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
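The fix described above can be sketched in userspace C. This is a minimal model, not the kernel code: the struct layouts and the model_ function names are assumptions made for illustration. Only top_cpuset and the idea of pointing the exiting task at the root cpuset come from the commit text.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct cpuset {
    unsigned long mems_allowed;   /* bitmask of allowed memory nodes */
};

struct task {
    struct cpuset *cpuset;        /* stays valid even after exit in this model */
};

/* Statically allocated root cpuset: always safe to point at. */
static struct cpuset top_cpuset = { .mems_allowed = ~0UL };

/* Exit path: leave the old cpuset but keep a valid pointer. */
static void model_cpuset_exit(struct task *tsk)
{
    tsk->cpuset = &top_cpuset;    /* the buggy version stored NULL here */
}

/* Allocation hook: dereferences tsk->cpuset without a NULL check. */
static unsigned long model_mems_allowed(const struct task *tsk)
{
    return tsk->cpuset->mems_allowed;
}
```

With this arrangement, a late allocation in the exit path reads the root cpuset's mask instead of dereferencing NULL.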

[PATCH] cpuset: fix sparse warning

2006-02-03T16:32:06Z

kernel/cpuset.c:644:38: warning: non-ANSI function declaration of function 'cpuset_update_task_memory_state'

Signed-off-by: Randy Dunlap
Acked-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
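For readers unfamiliar with this sparse warning, the sketch below shows the kind of change that usually silences it. The function and counter names are hypothetical. The actual diff is not shown in this log, so treat this as an illustration of the general pattern only.

```c
#include <assert.h>

static int model_calls;   /* hypothetical counter, for illustration only */

/*
 * Old style, flagged by sparse:  static void model_update_state() { ... }
 * Before C23, empty parentheses do not provide a prototype, so the
 * compiler cannot check the arguments at call sites.
 * Writing (void) gives a real prototype for a function taking no arguments.
 */
static void model_update_state(void)
{
    model_calls++;
}
```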

[PATCH] cpuset oom lock fix

2006-01-15T02:27:10Z

The problem, reported in: http://bugzilla.kernel.org/show_bug.cgi?id=5859 and by various other email messages and lkml posts, is that the cpuset hook in the oom (out of memory) code can try to take a cpuset semaphore while holding the tasklist_lock (a spinlock). One must not sleep while holding a spinlock.

The fix seems easy enough - move the cpuset semaphore region outside the tasklist_lock region.

This required a few lines of mechanism to implement. The oom code where the locking needs to be changed does not have access to the cpuset locks, which are internal to kernel/cpuset.c only. So I provided a couple more cpuset interface routines, available to the rest of the kernel, which simply take and drop the lock needed here (the cpuset callback_sem).

Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
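The lock ordering described above can be modeled in a few lines of userspace C. This is a sketch under stated assumptions: boolean flags stand in for the real locks, and every model_ name is hypothetical rather than the kernel's interface. The model only counts attempts to take the sleeping lock while the spinlock is held.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model: flags stand in for the real locks. */
static bool spin_held;      /* models tasklist_lock (a spinlock) */
static bool sem_held;       /* models the cpuset callback semaphore */
static int  order_errors;   /* sleeping-lock attempts made under the spinlock */

static void model_cpuset_lock(void)
{
    if (spin_held)
        order_errors++;     /* would sleep while holding a spinlock */
    sem_held = true;
}

static void model_cpuset_unlock(void)   { sem_held = false; }
static void model_tasklist_lock(void)   { spin_held = true; }
static void model_tasklist_unlock(void) { spin_held = false; }

/* Fixed ordering: the semaphore region encloses the spinlock region. */
static int model_oom_path(void)
{
    model_cpuset_lock();
    model_tasklist_lock();
    /* ... pick a victim task while both locks are held ... */
    model_tasklist_unlock();
    model_cpuset_unlock();
    return order_errors;
}
```

Calling model_cpuset_lock() after model_tasklist_lock() reproduces the original bug in this model and increments order_errors.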

[PATCH] Unlinline a bunch of other functions

2006-01-15T02:27:06Z

Remove the "inline" keyword from a bunch of big functions in the kernel with the goal of shrinking it by 30kb to 40kb.

Signed-off-by: Arjan van de Ven
Signed-off-by: Ingo Molnar
Acked-by: Jeff Garzik
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds

[PATCH] mutex subsystem, semaphore to mutex: VFS, ->i_sem

2006-01-09T23:59:24Z

This patch converts the inode semaphore to a mutex. I have tested it on XFS and compiled as much as one can consider on an ia64. Anyway your luck with it might be different.

Modified-by: Ingo Molnar (finished the conversion)
Signed-off-by: Jes Sorensen
Signed-off-by: Ingo Molnar

[PATCH] shrink dentry struct

2006-01-09T04:13:58Z

A long time ago, the dentry struct was carefully tuned so that on 32-bit UP, sizeof(struct dentry) was exactly 128, i.e. a power of 2 and a multiple of the memory cache line size. Then RCU was added and the dentry struct grew by two pointers, with nice results for SMP but not so good on UP, because it broke the above tuning (128 + 8 = 136 bytes).

This patch reverts this unwanted side effect by using a union (d_u), in which d_rcu and d_child are placed so that the two fields share the same storage. At the time d_free() is called (and d_rcu is really used), d_child is known to be empty and is not touched by the dentry freeing. Lockless lookups only access d_name, d_parent, d_lock, d_op and d_flags, so the previous content of d_child is not needed if the dentry was unhashed but is still accessed by a CPU because of RCU constraints.

As the dentry cache easily contains millions of entries, the size reduction is worth the extra complexity of the ugly C union.

Signed-off-by: Eric Dumazet
Cc: Dipankar Sarma
Cc: Maneesh Soni
Cc: Miklos Szeredi
Cc: "Paul E. McKenney"
Cc: Ian Kent
Cc: Paul Jackson
Cc: Al Viro
Cc: Christoph Hellwig
Cc: Trond Myklebust
Cc: Neil Brown
Cc: James Morris
Cc: Stephen Smalley
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
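The space saving from the union can be checked with a small model. The structs below are heavily simplified assumptions: only the names d_u, d_child, d_rcu and d_flags come from the commit text, and the real struct dentry has many more fields. The point is only that two fields which are never live at the same time can overlap.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins; the real kernel types have the same two-pointer shape. */
struct list_head { struct list_head *next, *prev; };
struct rcu_head  { struct rcu_head *next; void (*func)(struct rcu_head *); };

/* Before: each field has its own storage. */
struct dentry_before {
    unsigned int     d_flags;
    struct list_head d_child;
    struct rcu_head  d_rcu;
};

/* After: the two fields overlap inside a union. */
struct dentry_after {
    unsigned int d_flags;
    union {
        struct list_head d_child;   /* used while linked into the parent's list */
        struct rcu_head  d_rcu;     /* used only once the dentry is being freed */
    } d_u;
};

/* Bytes saved per dentry in this model. */
static size_t model_bytes_saved(void)
{
    return sizeof(struct dentry_before) - sizeof(struct dentry_after);
}
```

On common platforms the saving equals two pointers, which on a 32-bit build is the 8 bytes mentioned in the commit message.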

[PATCH] cpuset: skip rcu check if task is in root cpuset

2006-01-09T04:13:45Z

For systems that aren't using cpusets, but have CONFIG_CPUSET enabled in their kernel (eventually this may be most distribution kernels), this patch removes even the minimal rcu_read_lock() from the memory page allocation path. Actually, it removes that rcu call for any task that is in the root cpuset (top_cpuset), which on systems not actively using cpusets is all tasks.

We don't need the rcu check for tasks in the top_cpuset, because the top_cpuset is statically allocated, so it is at no risk of being freed out from underneath us.

Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
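The fast path described above might look roughly like the following userspace model. The struct fields, the counter and the model_ functions are assumptions for illustration. The real code uses rcu_read_lock() and rcu_read_unlock(), which are replaced here by stubs that only count how often a read-side section is entered.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct cpuset { int mems_generation; };
struct task   { struct cpuset *cpuset; };

/* Statically allocated, so it can never be freed underneath a reader. */
static struct cpuset top_cpuset = { .mems_generation = 1 };

static int rcu_sections;    /* counts modeled read-side critical sections */

static void model_rcu_read_lock(void)   { rcu_sections++; }
static void model_rcu_read_unlock(void) { }

/* Read the generation of the task's cpuset, skipping RCU for the root. */
static int model_read_generation(const struct task *tsk)
{
    int gen;

    if (tsk->cpuset == &top_cpuset) {
        gen = top_cpuset.mems_generation;     /* no protection needed */
    } else {
        model_rcu_read_lock();                /* cpuset could be freed */
        gen = tsk->cpuset->mems_generation;
        model_rcu_read_unlock();
    }
    return gen;
}
```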

[PATCH] cpuset: mark number_of_cpusets read_mostly

2006-01-09T04:13:45Z

Mark cpuset global 'number_of_cpusets' as __read_mostly.

This global is accessed every time a zone is considered in the zonelist loops beneath __alloc_pages, looking for a free memory page. If number_of_cpusets is just one, then we can short circuit the mems_allowed check.

Since this global is read a lot on a hot path, and written rarely, it is an excellent candidate for __read_mostly.

Thanks to Christoph Lameter for the suggestion.

Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
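The short circuit mentioned above can be sketched as follows. This is a userspace approximation: model_read_mostly is a hypothetical empty macro standing in for the kernel's __read_mostly section attribute, and model_zone_allowed() with its bitmask argument is an assumed simplification of the real mems_allowed check.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's __read_mostly; empty so this builds in userspace. */
#define model_read_mostly

/* Read on every zone check, written only when cpusets are created or removed. */
static int number_of_cpusets model_read_mostly = 1;

/* Return true if memory node 'node' may be used under the given mask. */
static bool model_zone_allowed(unsigned long mems_allowed, int node)
{
    if (number_of_cpusets <= 1)
        return true;                     /* only the root cpuset exists */
    return (mems_allowed >> node) & 1UL; /* full check otherwise */
}
```

Grouping rarely written variables together keeps them away from frequently written data, so the cache line holding them is less likely to bounce between CPUs.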

[PATCH] cpuset: use rcu directly optimization

2006-01-09T04:13:45Z

Optimize the cpuset impact on page allocation, the most performance critical cpuset hook in the kernel.

On each page allocation, the cpuset hook needs to check for a possible change in the current task's cpuset. It can now handle the common case, of no change, without taking any spinlock or semaphore, thanks to RCU.

Convert a spinlock on the current task to an rcu_read_lock(), saving approximately a memory barrier and an atomic op, depending on architecture.

This is done by adding rcu_assign_pointer() and synchronize_rcu() calls to the write side of the task->cpuset pointer, in cpuset.c:attach_task(), to delay freeing up a detached cpuset until after any critical sections referencing that pointer.

Thanks to Andi Kleen, Nick Piggin and Eric Dumazet for ideas.

Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
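The write-side ordering described above (publish the new pointer, wait for readers, then release the old cpuset) can be illustrated with a single-threaded model. Everything here is an assumption for illustration: the released flag, the grace-period counter and the model_ functions are hypothetical, and the stub does not wait for anything the way synchronize_rcu() does in the kernel.

```c
#include <assert.h>

/* Hypothetical, simplified stand-ins for the kernel structures. */
struct cpuset { int id; int released; };
struct task   { struct cpuset *cpuset; };

static int grace_periods;   /* counts modeled calls to synchronize_rcu() */

/* Kernel: blocks until all pre-existing RCU readers have finished. */
static void model_synchronize_rcu(void)
{
    grace_periods++;
}

/* Move a task to a new cpuset, releasing the old one only afterwards. */
static void model_attach_task(struct task *tsk, struct cpuset *newcs)
{
    struct cpuset *oldcs = tsk->cpuset;

    tsk->cpuset = newcs;        /* kernel: rcu_assign_pointer() */
    model_synchronize_rcu();    /* no reader can still hold oldcs after this */
    oldcs->released = 1;        /* only now is it safe to let go of oldcs */
}
```

The order of the three steps is the point of the sketch: releasing the old cpuset before the grace period would let a reader touch freed memory.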

[PATCH] cpuset: remove test for null cpuset from alloc code path

2006-01-09T04:13:44Z

Remove a couple more lines of code from the cpuset hooks in the page allocation code path.

There was a check for a NULL cpuset pointer in the routine cpuset_update_task_memory_state() that was only needed during system boot, after the memory subsystem was initialized, before the cpuset subsystem was initialized, to catch a NULL task->cpuset pointer.

Add a cpuset_init_early() routine, just before the mem_init() call in init/main.c, that sets up just enough of the init task's cpuset structure to render cpuset_update_task_memory_state() calls harmless.

Signed-off-by: Paul Jackson
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds