user/sven/linux.git/include/linux/cpuset.h, branch v4.1.6

cpuset: simplify cpuset_node_allowed API

2014-10-27T15:15:27Z

Current cpuset API for checking if a zone/node is allowed to allocate from looks rather awkward. We have hardwall and softwall versions of cpuset_node_allowed with the softwall version doing literally the same as the hardwall version if __GFP_HARDWALL is passed to it in gfp flags. If it isn't, the softwall version may check the given node against the enclosing hardwall cpuset, which it needs to take the callback lock to do. Such a distinction was introduced by commit 02a0e53d8227 ("cpuset: rework cpuset_zone_allowed api"). Before, we had the only version with the __GFP_HARDWALL flag determining its behavior. The purpose of the commit was to avoid sleep-in-atomic bugs when someone would mistakenly call the function without the __GFP_HARDWALL flag for an atomic allocation. The suffixes introduced were intended to make the callers think before using the function. However, since the callback lock was converted from mutex to spinlock by the previous patch, the softwall check function cannot sleep, and these precautions are no longer necessary. So let's simplify the API back to the single check. Suggested-by: David Rientjes Signed-off-by: Vladimir Davydov Acked-by: Christoph Lameter Acked-by: Zefan Li Signed-off-by: Tejun Heo

Merge branch 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup

2014-10-10T11:24:40Z

Pull cgroup updates from Tejun Heo: "Nothing too interesting. Just a handful of cleanup patches" * 'for-3.18' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup: Revert "cgroup: remove redundant variable in cgroup_mount()" cgroup: remove redundant variable in cgroup_mount() cgroup: fix missing unlock in cgroup_release_agent() cgroup: remove CGRP_RELEASABLE flag perf/cgroup: Remove perf_put_cgroup() cgroup: remove redundant check in cgroup_ino() cpuset: simplify proc_cpuset_show() cgroup: simplify proc_cgroup_show() cgroup: use a per-cgroup work for release agent cgroup: remove bogus comments cgroup: remove redundant code in cgroup_rmdir() cgroup: remove some useless forward declarations cgroup: fix a typo in comment.

cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags

2014-09-25T02:16:06Z

When we change cpuset.memory_spread_{page,slab}, cpuset will flip PF_SPREAD_{PAGE,SLAB} bit of tsk->flags for each task in that cpuset. This should be done using atomic bitops, but currently we don't, which is broken. Tetsuo reported a hard-to-reproduce kernel crash on RHEL6, which happened when one thread tried to clear PF_USED_MATH while at the same time another thread tried to flip PF_SPREAD_PAGE/PF_SPREAD_SLAB. They both operate on the same task. Here's the full report: https://lkml.org/lkml/2014/9/19/230 To fix this, we make PF_SPREAD_PAGE and PF_SPREAD_SLAB atomic flags. v4: - updated mm/slab.c. (Fengguang Wu) - updated Documentation. Cc: Peter Zijlstra Cc: Ingo Molnar Cc: Miao Xie Cc: Kees Cook Fixes: 950592f7b991 ("cpusets: update tasks' page/slab spread flags in time") Cc: # 2.6.31+ Reported-by: Tetsuo Handa Signed-off-by: Zefan Li Signed-off-by: Tejun Heo

cpuset: simplify proc_cpuset_show()

2014-09-18T17:27:23Z

Use the ONE macro instead of REG, and we can simplify proc_cpuset_show(). Signed-off-by: Zefan Li Signed-off-by: Tejun Heo

mm: page_alloc: use jump labels to avoid checking number_of_cpusets

2014-06-04T23:54:08Z

If cpusets are not in use then we still check a global variable on every page allocation. Use jump labels to avoid the overhead. Signed-off-by: Mel Gorman Reviewed-by: Rik van Riel Cc: Johannes Weiner Cc: Vlastimil Babka Cc: Jan Kara Cc: Michal Hocko Cc: Hugh Dickins Cc: Dave Hansen Cc: Theodore Ts'o Cc: "Paul E. McKenney" Cc: Oleg Nesterov Cc: Peter Zijlstra Cc: Stephen Rothwell Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

mm: optimize put_mems_allowed() usage

2014-04-03T23:20:58Z

Since put_mems_allowed() is strictly optional, its a seqcount retry, we don't need to evaluate the function if the allocation was in fact successful, saving a smp_rmb some loads and comparisons on some relative fast-paths. Since the naming, get/put_mems_allowed() does suggest a mandatory pairing, rename the interface, as suggested by Mel, to resemble the seqcount interface. This gives us: read_mems_allowed_begin() and read_mems_allowed_retry(), where it is important to note that the return value of the latter call is inverted from its previous incarnation. Signed-off-by: Peter Zijlstra Signed-off-by: Mel Gorman Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds

cpuset: Fix potential deadlock w/ set_mems_allowed

2013-11-06T11:40:27Z

After adding lockdep support to seqlock/seqcount structures, I started seeing the following warning: [ 1.070907] ====================================================== [ 1.072015] [ INFO: SOFTIRQ-safe -> SOFTIRQ-unsafe lock order detected ] [ 1.073181] 3.11.0+ #67 Not tainted [ 1.073801] ------------------------------------------------------ [ 1.074882] kworker/u4:2/708 [HC0[0]:SC0[0]:HE0:SE1] is trying to acquire: [ 1.076088] (&p->mems_allowed_seq){+.+...}, at: [] new_slab+0x5f/0x280 [ 1.077572] [ 1.077572] and this task is already holding: [ 1.078593] (&(&q->__queue_lock)->rlock){..-...}, at: [] blk_execute_rq_nowait+0x53/0xf0 [ 1.080042] which would create a new lock dependency: [ 1.080042] (&(&q->__queue_lock)->rlock){..-...} -> (&p->mems_allowed_seq){+.+...} [ 1.080042] [ 1.080042] but this new dependency connects a SOFTIRQ-irq-safe lock: [ 1.080042] (&(&q->__queue_lock)->rlock){..-...} [ 1.080042] ... which became SOFTIRQ-irq-safe at: [ 1.080042] [] __lock_acquire+0x5b9/0x1db0 [ 1.080042] [] lock_acquire+0x95/0x130 [ 1.080042] [] _raw_spin_lock+0x41/0x80 [ 1.080042] [] scsi_device_unbusy+0x7e/0xd0 [ 1.080042] [] scsi_finish_command+0x32/0xf0 [ 1.080042] [] scsi_softirq_done+0xa1/0x130 [ 1.080042] [] blk_done_softirq+0x73/0x90 [ 1.080042] [] __do_softirq+0x110/0x2f0 [ 1.080042] [] run_ksoftirqd+0x2d/0x60 [ 1.080042] [] smpboot_thread_fn+0x156/0x1e0 [ 1.080042] [] kthread+0xd6/0xe0 [ 1.080042] [] ret_from_fork+0x7c/0xb0 [ 1.080042] [ 1.080042] to a SOFTIRQ-irq-unsafe lock: [ 1.080042] (&p->mems_allowed_seq){+.+...} [ 1.080042] ... which became SOFTIRQ-irq-unsafe at: [ 1.080042] ... [] __lock_acquire+0x613/0x1db0 [ 1.080042] [] lock_acquire+0x95/0x130 [ 1.080042] [] kthreadd+0x82/0x180 [ 1.080042] [] ret_from_fork+0x7c/0xb0 [ 1.080042] [ 1.080042] other info that might help us debug this: [ 1.080042] [ 1.080042] Possible interrupt unsafe locking scenario: [ 1.080042] [ 1.080042] CPU0 CPU1 [ 1.080042] ---- ---- [ 1.080042] lock(&p->mems_allowed_seq); [ 1.080042] local_irq_disable(); [ 1.080042] lock(&(&q->__queue_lock)->rlock); [ 1.080042] lock(&p->mems_allowed_seq); [ 1.080042] [ 1.080042] lock(&(&q->__queue_lock)->rlock); [ 1.080042] [ 1.080042] *** DEADLOCK *** The issue stems from the kthreadd() function calling set_mems_allowed with irqs enabled. While its possibly unlikely for the actual deadlock to trigger, a fix is fairly simple: disable irqs before taking the mems_allowed_seq lock. Signed-off-by: John Stultz Signed-off-by: Peter Zijlstra Acked-by: Li Zefan Cc: Mathieu Desnoyers Cc: Steven Rostedt Cc: "David S. Miller" Cc: netdev@vger.kernel.org Link: http://lkml.kernel.org/r/1381186321-4906-4-git-send-email-john.stultz@linaro.org Signed-off-by: Ingo Molnar

Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs

2013-05-02T00:51:54Z

Pull VFS updates from Al Viro, Misc cleanups all over the place, mainly wrt /proc interfaces (switch create_proc_entry to proc_create(), get rid of the deprecated create_proc_read_entry() in favor of using proc_create_data() and seq_file etc). 7kloc removed. * 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs: (204 commits) don't bother with deferred freeing of fdtables proc: Move non-public stuff from linux/proc_fs.h to fs/proc/internal.h proc: Make the PROC_I() and PDE() macros internal to procfs proc: Supply a function to remove a proc entry by PDE take cgroup_open() and cpuset_open() to fs/proc/base.c ppc: Clean up scanlog ppc: Clean up rtas_flash driver somewhat hostap: proc: Use remove_proc_subtree() drm: proc: Use remove_proc_subtree() drm: proc: Use minor->index to label things, not PDE->name drm: Constify drm_proc_list[] zoran: Don't print proc_dir_entry data in debug reiserfs: Don't access the proc_dir_entry in r_open(), r_start() r_show() proc: Supply an accessor for getting the data from a PDE's parent airo: Use remove_proc_subtree() rtl8192u: Don't need to save device proc dir PDE rtl8187se: Use a dir under /proc/net/r8180/ proc: Add proc_mkdir_data() proc: Move some bits from linux/proc_fs.h to linux/{of.h,signal.h,tty.h} proc: Move PDE_NET() to fs/proc/proc_net.c ...

take cgroup_open() and cpuset_open() to fs/proc/base.c

2013-05-01T21:29:46Z

Signed-off-by: Al Viro

cpuset: remove include of cgroup.h from cpuset.h

2013-03-05T17:33:26Z

We don't need to include cgroup.h in cpuset.h. Signed-off-by: Li Zefan Signed-off-by: Tejun Heo