<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/workqueue.c, branch v3.12.33</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.12.33</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.12.33'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2014-07-18T13:51:15Z</updated>
<entry>
<title>workqueue: zero cpumask of wq_numa_possible_cpumask on init</title>
<updated>2014-07-18T13:51:15Z</updated>
<author>
<name>Yasuaki Ishimatsu</name>
<email>isimatu.yasuaki@jp.fujitsu.com</email>
</author>
<published>2014-07-07T13:56:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9b8fa806491226ca3540acb9364b5181f1ab4142'/>
<id>urn:sha1:9b8fa806491226ca3540acb9364b5181f1ab4142</id>
<content type='text'>
commit 5a6024f1604eef119cf3a6fa413fe0261a81a8f3 upstream.

When hot-adding and onlining CPU, kernel panic occurs, showing following
call trace.

  BUG: unable to handle kernel paging request at 0000000000001d08
  IP: [&lt;ffffffff8114acfd&gt;] __alloc_pages_nodemask+0x9d/0xb10
  PGD 0
  Oops: 0000 [#1] SMP
  ...
  Call Trace:
   [&lt;ffffffff812b8745&gt;] ? cpumask_next_and+0x35/0x50
   [&lt;ffffffff810a3283&gt;] ? find_busiest_group+0x113/0x8f0
   [&lt;ffffffff81193bc9&gt;] ? deactivate_slab+0x349/0x3c0
   [&lt;ffffffff811926f1&gt;] new_slab+0x91/0x300
   [&lt;ffffffff815de95a&gt;] __slab_alloc+0x2bb/0x482
   [&lt;ffffffff8105bc1c&gt;] ? copy_process.part.25+0xfc/0x14c0
   [&lt;ffffffff810a3c78&gt;] ? load_balance+0x218/0x890
   [&lt;ffffffff8101a679&gt;] ? sched_clock+0x9/0x10
   [&lt;ffffffff81105ba9&gt;] ? trace_clock_local+0x9/0x10
   [&lt;ffffffff81193d1c&gt;] kmem_cache_alloc_node+0x8c/0x200
   [&lt;ffffffff8105bc1c&gt;] copy_process.part.25+0xfc/0x14c0
   [&lt;ffffffff81114d0d&gt;] ? trace_buffer_unlock_commit+0x4d/0x60
   [&lt;ffffffff81085a80&gt;] ? kthread_create_on_node+0x140/0x140
   [&lt;ffffffff8105d0ec&gt;] do_fork+0xbc/0x360
   [&lt;ffffffff8105d3b6&gt;] kernel_thread+0x26/0x30
   [&lt;ffffffff81086652&gt;] kthreadd+0x2c2/0x300
   [&lt;ffffffff81086390&gt;] ? kthread_create_on_cpu+0x60/0x60
   [&lt;ffffffff815f20ec&gt;] ret_from_fork+0x7c/0xb0
   [&lt;ffffffff81086390&gt;] ? kthread_create_on_cpu+0x60/0x60

In my investigation, I found the root cause is wq_numa_possible_cpumask.
All entries of wq_numa_possible_cpumask is allocated by
alloc_cpumask_var_node(). And these entries are used without initializing.
So these entries have wrong value.

When hot-adding and onlining CPU, wq_update_unbound_numa() is called.
wq_update_unbound_numa() calls alloc_unbound_pwq(). And alloc_unbound_pwq()
calls get_unbound_pool(). In get_unbound_pool(), worker_pool-&gt;node is set
as follow:

3592         /* if cpumask is contained inside a NUMA node, we belong to that node */
3593         if (wq_numa_enabled) {
3594                 for_each_node(node) {
3595                         if (cpumask_subset(pool-&gt;attrs-&gt;cpumask,
3596                                            wq_numa_possible_cpumask[node])) {
3597                                 pool-&gt;node = node;
3598                                 break;
3599                         }
3600                 }
3601         }

But wq_numa_possible_cpumask[node] does not have correct cpumask. So, wrong
node is selected. As a result, kernel panic occurs.

By this patch, all entries of wq_numa_possible_cpumask are allocated by
zalloc_cpumask_var_node to initialize them. And the panic disappeared.

Signed-off-by: Yasuaki Ishimatsu &lt;isimatu.yasuaki@jp.fujitsu.com&gt;
Reviewed-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Fixes: bce903809ab3 ("workqueue: add wq_numa_tbl_len and wq_numa_possible_cpumask[]")
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>workqueue: fix dev_set_uevent_suppress() imbalance</title>
<updated>2014-07-18T13:51:14Z</updated>
<author>
<name>Maxime Bizon</name>
<email>mbizon@freebox.fr</email>
</author>
<published>2014-06-23T14:35:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=982170a363d53cf2c65547e79d0a30055268a841'/>
<id>urn:sha1:982170a363d53cf2c65547e79d0a30055268a841</id>
<content type='text'>
commit bddbceb688c6d0decaabc7884fede319d02f96c8 upstream.

Uevents are suppressed during attributes registration, but never
restored, so kobject_uevent() does nothing.

Signed-off-by: Maxime Bizon &lt;mbizon@freebox.fr&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Fixes: 226223ab3c4118ddd10688cc2c131135848371ab
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>workqueue: make rescuer_thread() empty wq-&gt;maydays list before exiting</title>
<updated>2014-06-09T13:53:36Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2014-04-18T15:04:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=aa439c1760a31f68afad736633bc487eb0a4eb15'/>
<id>urn:sha1:aa439c1760a31f68afad736633bc487eb0a4eb15</id>
<content type='text'>
commit 4d595b866d2c653dc90a492b9973a834eabfa354 upstream.

After a @pwq is scheduled for emergency execution, other workers may
consume the affectd work items before the rescuer gets to them.  This
means that a workqueue many have pwqs queued on @wq-&gt;maydays list
while not having any work item pending or in-flight.  If
destroy_workqueue() executes in such condition, the rescuer may exit
without emptying @wq-&gt;maydays.

This currently doesn't cause any actual harm.  destroy_workqueue() can
safely destroy all the involved data structures whether @wq-&gt;maydays
is populated or not as nobody access the list once the rescuer exits.

However, this is nasty and makes future development difficult.  Let's
update rescuer_thread() so that it empties @wq-&gt;maydays after seeing
should_stop to guarantee that the list is empty on rescuer exit.

tj: Updated comment and patch description.

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>workqueue: fix a possible race condition between rescuer and pwq-release</title>
<updated>2014-06-09T13:53:36Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2014-04-18T15:04:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d445003e2c54b48d280f255a40cbad97f0c26019'/>
<id>urn:sha1:d445003e2c54b48d280f255a40cbad97f0c26019</id>
<content type='text'>
commit 77668c8b559e4fe2acf2a0749c7c83cde49a5025 upstream.

There is a race condition between rescuer_thread() and
pwq_unbound_release_workfn().

Even after a pwq is scheduled for rescue, the associated work items
may be consumed by any worker.  If all of them are consumed before the
rescuer gets to them and the pwq's base ref was put due to attribute
change, the pwq may be released while still being linked on
@wq-&gt;maydays list making the rescuer dereference already freed pwq
later.

Make send_mayday() pin the target pwq until the rescuer is done with
it.

tj: Updated comment and patch description.

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>workqueue: fix bugs in wq_update_unbound_numa() failure path</title>
<updated>2014-06-09T13:53:35Z</updated>
<author>
<name>Daeseok Youn</name>
<email>daeseok.youn@gmail.com</email>
</author>
<published>2014-04-16T05:32:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3b3b224978f8426000d0c26ecdde037356f3d873'/>
<id>urn:sha1:3b3b224978f8426000d0c26ecdde037356f3d873</id>
<content type='text'>
commit 77f300b198f93328c26191b52655ce1b62e202cf upstream.

wq_update_unbound_numa() failure path has the following two bugs.

- alloc_unbound_pwq() is called without holding wq-&gt;mutex; however, if
  the allocation fails, it jumps to out_unlock which tries to unlock
  wq-&gt;mutex.

- The function should switch to dfl_pwq on failure but didn't do so
  after alloc_unbound_pwq() failure.

Fix it by regrabbing wq-&gt;mutex and jumping to use_dfl_pwq on
alloc_unbound_pwq() failure.

Signed-off-by: Daeseok Youn &lt;daeseok.youn@gmail.com&gt;
Acked-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Fixes: 4c16bd327c74 ("workqueue: implement NUMA affinity for unbound workqueues")
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>workqueue: ensure @task is valid across kthread_stop()</title>
<updated>2014-03-05T16:13:50Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2014-02-15T14:02:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=33328880843bac26ae1b8f62569dad13dd703bf0'/>
<id>urn:sha1:33328880843bac26ae1b8f62569dad13dd703bf0</id>
<content type='text'>
commit 5bdfff96c69a4d5ab9c49e60abf9e070ecd2acbb upstream.

When a kworker should die, the kworkre is notified through WORKER_DIE
flag instead of kthread_should_stop().  This, IIRC, is primarily to
keep the test synchronized inside worker_pool lock.  WORKER_DIE is
first set while holding pool-&gt;lock, the lock is dropped and
kthread_stop() is called.

Unfortunately, this means that there's a slight chance that the target
kworker may see WORKER_DIE before kthread_stop() finishes and exits
and frees the target task before or during kthread_stop().

Fix it by pinning the target task before setting WORKER_DIE and
putting it after kthread_stop() is done.

tj: Improved patch description and comment.  Moved pinning above
    WORKER_DIE for better signify what it's protecting.

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>workqueue: fix ordered workqueues in NUMA setups</title>
<updated>2013-12-04T19:05:55Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2013-09-05T16:30:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a432a4023f22f8c0aa9fa7262b124354351e6390'/>
<id>urn:sha1:a432a4023f22f8c0aa9fa7262b124354351e6390</id>
<content type='text'>
commit 8a2b75384444488fc4f2cbb9f0921b6a0794838f upstream.

An ordered workqueue implements execution ordering by using single
pool_workqueue with max_active == 1.  On a given pool_workqueue, work
items are processed in FIFO order and limiting max_active to 1
enforces the queued work items to be processed one by one.

Unfortunately, 4c16bd327c ("workqueue: implement NUMA affinity for
unbound workqueues") accidentally broke this guarantee by applying
NUMA affinity to ordered workqueues too.  On NUMA setups, an ordered
workqueue would end up with separate pool_workqueues for different
nodes.  Each pool_workqueue still limits max_active to 1 but multiple
work items may be executed concurrently and out of order depending on
which node they are queued to.

Fix it by using dedicated ordered_wq_attrs[] when creating ordered
workqueues.  The new attrs match the unbound ones except that no_numa
is always set thus forcing all NUMA nodes to share the default
pool_workqueue.

While at it, add sanity check in workqueue creation path which
verifies that an ordered workqueues has only the default
pool_workqueue.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Libin &lt;huawei.libin@huawei.com&gt;
Cc: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial</title>
<updated>2013-09-06T16:36:28Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-09-06T16:36:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2e515bf096c245ba87f20ab4b4ea20f911afaeda'/>
<id>urn:sha1:2e515bf096c245ba87f20ab4b4ea20f911afaeda</id>
<content type='text'>
Pull trivial tree from Jiri Kosina:
 "The usual trivial updates all over the tree -- mostly typo fixes and
  documentation updates"

* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jikos/trivial: (52 commits)
  doc: Documentation/cputopology.txt fix typo
  treewide: Convert retrun typos to return
  Fix comment typo for init_cma_reserved_pageblock
  Documentation/trace: Correcting and extending tracepoint documentation
  mm/hotplug: fix a typo in Documentation/memory-hotplug.txt
  power: Documentation: Update s2ram link
  doc: fix a typo in Documentation/00-INDEX
  Documentation/printk-formats.txt: No casts needed for u64/s64
  doc: Fix typo "is is" in Documentations
  treewide: Fix printks with 0x%#
  zram: doc fixes
  Documentation/kmemcheck: update kmemcheck documentation
  doc: documentation/hwspinlock.txt fix typo
  PM / Hibernate: add section for resume options
  doc: filesystems : Fix typo in Documentations/filesystems
  scsi/megaraid fixed several typos in comments
  ppc: init_32: Fix error typo "CONFIG_START_KERNEL"
  treewide: Add __GFP_NOWARN to k.alloc calls with v.alloc fallbacks
  page_isolation: Fix a comment typo in test_pages_isolated()
  doc: fix a typo about irq affinity
  ...
</content>
</entry>
<entry>
<title>Merge branch 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq</title>
<updated>2013-09-04T01:19:21Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-09-04T01:19:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9ee52a1633a77961cb7b7fb5bd40be682f8412c7'/>
<id>urn:sha1:9ee52a1633a77961cb7b7fb5bd40be682f8412c7</id>
<content type='text'>
Pull workqueue updates from Tejun Heo:
 "Nothing interesting.  All are doc / comment updates"

* 'for-3.12' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: Correct/Drop references to gcwq in Documentation
  workqueue: Fix manage_workers() RETURNS description
  workqueue: Comment correction in file header
  workqueue: mark WQ_NON_REENTRANT deprecated
</content>
</entry>
<entry>
<title>Merge tag 'driver-core-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core</title>
<updated>2013-09-03T18:37:15Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2013-09-03T18:37:15Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=542a086ac72fb193cbc1b996963a572269e57743'/>
<id>urn:sha1:542a086ac72fb193cbc1b996963a572269e57743</id>
<content type='text'>
Pull driver core patches from Greg KH:
 "Here's the big driver core pull request for 3.12-rc1.

  Lots of tiny changes here fixing up the way sysfs attributes are
  created, to try to make drivers simpler, and fix a whole class race
  conditions with creations of device attributes after the device was
  announced to userspace.

  All the various pieces are acked by the different subsystem
  maintainers"

* tag 'driver-core-3.12-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (119 commits)
  firmware loader: fix pending_fw_head list corruption
  drivers/base/memory.c: introduce help macro to_memory_block
  dynamic debug: line queries failing due to uninitialized local variable
  sysfs: sysfs_create_groups returns a value.
  debugfs: provide debugfs_create_x64() when disabled
  rbd: convert bus code to use bus_groups
  firmware: dcdbas: use binary attribute groups
  sysfs: add sysfs_create/remove_groups for when SYSFS is not enabled
  driver core: add #include &lt;linux/sysfs.h&gt; to core files.
  HID: convert bus code to use dev_groups
  Input: serio: convert bus code to use drv_groups
  Input: gameport: convert bus code to use drv_groups
  driver core: firmware: use __ATTR_RW()
  driver core: core: use DEVICE_ATTR_RO
  driver core: bus: use DRIVER_ATTR_WO()
  driver core: create write-only attribute macros for devices and drivers
  sysfs: create __ATTR_WO()
  driver-core: platform: convert bus code to use dev_groups
  workqueue: convert bus code to use dev_groups
  MEI: convert bus code to use dev_groups
  ...
</content>
</entry>
</feed>
