<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/workqueue.c, branch stable/4.8.y</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=stable%2F4.8.y</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=stable%2F4.8.y'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2016-07-29T20:55:30Z</updated>
<entry>
<title>Merge branch 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2016-07-29T20:55:30Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-07-29T20:55:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a6408f6cb63ac0958fee7dbce7861ffb540d8a49'/>
<id>urn:sha1:a6408f6cb63ac0958fee7dbce7861ffb540d8a49</id>
<content type='text'>
Pull smp hotplug updates from Thomas Gleixner:
 "This is the next part of the hotplug rework.

   - Convert all notifiers with a priority assigned

   - Convert all CPU_STARTING/DYING notifiers

     The final removal of the STARTING/DYING infrastructure will happen
     when the merge window closes.

  Another 700 hundred line of unpenetrable maze gone :)"

* 'smp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (70 commits)
  timers/core: Correct callback order during CPU hot plug
  leds/trigger/cpu: Move from CPU_STARTING to ONLINE level
  powerpc/numa: Convert to hotplug state machine
  arm/perf: Fix hotplug state machine conversion
  irqchip/armada: Avoid unused function warnings
  ARC/time: Convert to hotplug state machine
  clocksource/atlas7: Convert to hotplug state machine
  clocksource/armada-370-xp: Convert to hotplug state machine
  clocksource/exynos_mct: Convert to hotplug state machine
  clocksource/arm_global_timer: Convert to hotplug state machine
  rcu: Convert rcutree to hotplug state machine
  KVM/arm/arm64/vgic-new: Convert to hotplug state machine
  smp/cfd: Convert core to hotplug state machine
  x86/x2apic: Convert to CPU hotplug state machine
  profile: Convert to hotplug state machine
  timers/core: Convert to hotplug state machine
  hrtimer: Convert to hotplug state machine
  x86/tboot: Convert to hotplug state machine
  arm64/armv8 deprecated: Convert to hotplug state machine
  hwtracing/coresight-etm4x: Convert to hotplug state machine
  ...
</content>
</entry>
<entry>
<title>Merge branches 'pm-sleep' and 'pm-tools'</title>
<updated>2016-07-25T11:44:32Z</updated>
<author>
<name>Rafael J. Wysocki</name>
<email>rafael.j.wysocki@intel.com</email>
</author>
<published>2016-07-25T11:44:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7f234a4d8aab23e778933808a88bfa47ea4fdc2a'/>
<id>urn:sha1:7f234a4d8aab23e778933808a88bfa47ea4fdc2a</id>
<content type='text'>
* pm-sleep:
  PM / hibernate: Introduce test_resume mode for hibernation
  x86 / hibernate: Use hlt_play_dead() when resuming from hibernation
  PM / hibernate: Image data protection during restoration
  PM / hibernate: Add missing braces in __register_nosave_region()
  PM / hibernate: Clean up comments in snapshot.c
  PM / hibernate: Clean up function headers in snapshot.c
  PM / hibernate: Add missing braces in hibernate_setup()
  PM / hibernate: Recycle safe pages after image restoration
  PM / hibernate: Simplify mark_unsafe_pages()
  PM / hibernate: Do not free preallocated safe pages during image restore
  PM / suspend: show workqueue state in suspend flow
  PM / sleep: make PM notifiers called symmetrically
  PM / sleep: Make pm_prepare_console() return void
  PM / Hibernate: Don't let kasan instrument snapshot.c

* pm-tools:
  PM / tools: scripts: AnalyzeSuspend v4.2
  tools/turbostat: allow user to alter DESTDIR and PREFIX
</content>
</entry>
<entry>
<title>workqueue: Convert to state machine callbacks</title>
<updated>2016-07-14T07:34:43Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-07-13T17:16:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7ee681b25284782ecf380bf5ccf55f13c52fd0ce'/>
<id>urn:sha1:7ee681b25284782ecf380bf5ccf55f13c52fd0ce</id>
<content type='text'>
Get rid of the prio ordering of the separate notifiers and use a proper state
callback pair.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Reviewed-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Lai Jiangshan &lt;jiangshanlai@gmail.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Nicolas Iooss &lt;nicolas.iooss_linux@m4x.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153335.197083890@linutronix.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>PM / suspend: show workqueue state in suspend flow</title>
<updated>2016-07-01T23:42:48Z</updated>
<author>
<name>Roger Lu</name>
<email>roger.lu@mediatek.com</email>
</author>
<published>2016-07-01T03:05:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7b776af66dc462caa7e839cc5c950a61db1f8551'/>
<id>urn:sha1:7b776af66dc462caa7e839cc5c950a61db1f8551</id>
<content type='text'>
If freezable workqueue aborts suspend flow, show
workqueue state for debug purpose.

Signed-off-by: Roger Lu &lt;roger.lu@mediatek.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>workqueue: Fix setting affinity of unbound worker threads</title>
<updated>2016-06-16T19:37:05Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-06-16T12:38:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d945b5e9f0e35cb56a3783d849b5f0f37da0a7f1'/>
<id>urn:sha1:d945b5e9f0e35cb56a3783d849b5f0f37da0a7f1</id>
<content type='text'>
With commit e9d867a67fd03ccc ("sched: Allow per-cpu kernel threads to
run on online &amp;&amp; !active"), __set_cpus_allowed_ptr() expects that only
strict per-cpu kernel threads can have affinity to an online CPU which
is not yet active.

This assumption is currently broken in the CPU_ONLINE notification
handler for the workqueues where restore_unbound_workers_cpumask()
calls set_cpus_allowed_ptr() when the first cpu in the unbound
worker's pool-&gt;attr-&gt;cpumask comes online. Since
set_cpus_allowed_ptr() is called with pool-&gt;attr-&gt;cpumask in which
only one CPU is online which is not yet active, we get the following
WARN_ON during an CPU online operation.

------------[ cut here ]------------
WARNING: CPU: 40 PID: 248 at kernel/sched/core.c:1166
__set_cpus_allowed_ptr+0x228/0x2e0
Modules linked in:
CPU: 40 PID: 248 Comm: cpuhp/40 Not tainted 4.6.0-autotest+ #4
&lt;..snip..&gt;
Call Trace:
[c000000f273ff920] [c00000000010493c] __set_cpus_allowed_ptr+0x2cc/0x2e0 (unreliable)
[c000000f273ffac0] [c0000000000ed4b0] workqueue_cpu_up_callback+0x2c0/0x470
[c000000f273ffb70] [c0000000000f5c58] notifier_call_chain+0x98/0x100
[c000000f273ffbc0] [c0000000000c5ed0] __cpu_notify+0x70/0xe0
[c000000f273ffc00] [c0000000000c6028] notify_online+0x38/0x50
[c000000f273ffc30] [c0000000000c5214] cpuhp_invoke_callback+0x84/0x250
[c000000f273ffc90] [c0000000000c562c] cpuhp_up_callbacks+0x5c/0x120
[c000000f273ffce0] [c0000000000c64d4] cpuhp_thread_fun+0x184/0x1c0
[c000000f273ffd20] [c0000000000fa050] smpboot_thread_fn+0x290/0x2a0
[c000000f273ffd80] [c0000000000f45b0] kthread+0x110/0x130
[c000000f273ffe30] [c000000000009570] ret_from_kernel_thread+0x5c/0x6c
---[ end trace 00f1456578b2a3b2 ]---

This patch fixes this by limiting the mask to the intersection of
the pool affinity and online CPUs.

Changelog-cribbed-from: Gautham R. Shenoy &lt;ego@linux.vnet.ibm.com&gt;
Reported-by: Abdul Haleem &lt;abdhalee@linux.vnet.ibm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>debugobjects: insulate non-fixup logic related to static obj from fixup callbacks</title>
<updated>2016-05-20T02:12:14Z</updated>
<author>
<name>Du, Changbin</name>
<email>changbin.du@intel.com</email>
</author>
<published>2016-05-20T00:09:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b9fdac7f660609abb157500e468d2165b3c9cf08'/>
<id>urn:sha1:b9fdac7f660609abb157500e468d2165b3c9cf08</id>
<content type='text'>
When activating a static object we need make sure that the object is
tracked in the object tracker.  If it is a non-static object then the
activation is illegal.

In previous implementation, each subsystem need take care of this in
their fixup callbacks.  Actually we can put it into debugobjects core.
Thus we can save duplicated code, and have *pure* fixup callbacks.

To achieve this, a new callback "is_static_object" is introduced to let
the type specific code decide whether a object is static or not.  If
yes, we take it into object tracker, otherwise give warning and invoke
fixup callback.

This change has paassed debugobjects selftest, and I also do some test
with all debugobjects supports enabled.

At last, I have a concern about the fixups that can it change the object
which is in incorrect state on fixup? Because the 'addr' may not point
to any valid object if a non-static object is not tracked.  Then Change
such object can overwrite someone's memory and cause unexpected
behaviour.  For example, the timer_fixup_activate bind timer to function
stub_timer.

Link: http://lkml.kernel.org/r/1462576157-14539-1-git-send-email-changbin.du@intel.com
[changbin.du@intel.com: improve code comments where invoke the new is_static_object callback]
  Link: http://lkml.kernel.org/r/1462777431-8171-1-git-send-email-changbin.du@intel.com
Signed-off-by: Du, Changbin &lt;changbin.du@intel.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Josh Triplett &lt;josh@kernel.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>workqueue: update debugobjects fixup callbacks return type</title>
<updated>2016-05-20T02:12:14Z</updated>
<author>
<name>Du, Changbin</name>
<email>changbin.du@intel.com</email>
</author>
<published>2016-05-20T00:09:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=02a982a6ec631d871571f940ca13817551759884'/>
<id>urn:sha1:02a982a6ec631d871571f940ca13817551759884</id>
<content type='text'>
Update the return type to use bool instead of int, corresponding to
change (debugobjects: make fixup functions return bool instead of int)

Signed-off-by: Du, Changbin &lt;changbin.du@intel.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Josh Triplett &lt;josh@kernel.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq</title>
<updated>2016-05-13T23:16:51Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-05-13T23:16:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=da92223908f309d5ba63ce7dd568dfb6a7e5d7d2'/>
<id>urn:sha1:da92223908f309d5ba63ce7dd568dfb6a7e5d7d2</id>
<content type='text'>
Pull workqueue fix from Tejun Heo:
 "CPU hotplug callbacks can invoke DOWN_FAILED w/o preceding
  DOWN_PREPARE which can trigger a WARN_ON() in workqueue.

  The bug has been there for a very long time.  It only triggers if CPU
  down fails at a specific point and I don't think it has adverse
  effects other than the warning messages.  The fix is very low impact"

* 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: fix rebind bound workers warning
</content>
</entry>
<entry>
<title>workqueue: fix rebind bound workers warning</title>
<updated>2016-05-12T16:00:23Z</updated>
<author>
<name>Wanpeng Li</name>
<email>wanpeng.li@hotmail.com</email>
</author>
<published>2016-05-11T09:55:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f7c17d26f43d5cc1b7a6b896cd2fa24a079739b9'/>
<id>urn:sha1:f7c17d26f43d5cc1b7a6b896cd2fa24a079739b9</id>
<content type='text'>
------------[ cut here ]------------
WARNING: CPU: 0 PID: 16 at kernel/workqueue.c:4559 rebind_workers+0x1c0/0x1d0
Modules linked in:
CPU: 0 PID: 16 Comm: cpuhp/0 Not tainted 4.6.0-rc4+ #31
Hardware name: IBM IBM System x3550 M4 Server -[7914IUW]-/00Y8603, BIOS -[D7E128FUS-1.40]- 07/23/2013
 0000000000000000 ffff881037babb58 ffffffff8139d885 0000000000000010
 0000000000000000 0000000000000000 0000000000000000 ffff881037babba8
 ffffffff8108505d ffff881037ba0000 000011cf3e7d6e60 0000000000000046
Call Trace:
 dump_stack+0x89/0xd4
 __warn+0xfd/0x120
 warn_slowpath_null+0x1d/0x20
 rebind_workers+0x1c0/0x1d0
 workqueue_cpu_up_callback+0xf5/0x1d0
 notifier_call_chain+0x64/0x90
 ? trace_hardirqs_on_caller+0xf2/0x220
 ? notify_prepare+0x80/0x80
 __raw_notifier_call_chain+0xe/0x10
 __cpu_notify+0x35/0x50
 notify_down_prepare+0x5e/0x80
 ? notify_prepare+0x80/0x80
 cpuhp_invoke_callback+0x73/0x330
 ? __schedule+0x33e/0x8a0
 cpuhp_down_callbacks+0x51/0xc0
 cpuhp_thread_fun+0xc1/0xf0
 smpboot_thread_fn+0x159/0x2a0
 ? smpboot_create_threads+0x80/0x80
 kthread+0xef/0x110
 ? wait_for_completion+0xf0/0x120
 ? schedule_tail+0x35/0xf0
 ret_from_fork+0x22/0x50
 ? __init_kthread_worker+0x70/0x70
---[ end trace eb12ae47d2382d8f ]---
notify_down_prepare: attempt to take down CPU 0 failed

This bug can be reproduced by below config w/ nohz_full= all cpus:

CONFIG_BOOTPARAM_HOTPLUG_CPU0=y
CONFIG_DEBUG_HOTPLUG_CPU0=y
CONFIG_NO_HZ_FULL=y

As Thomas pointed out:

| If a down prepare callback fails, then DOWN_FAILED is invoked for all
| callbacks which have successfully executed DOWN_PREPARE.
|
| But, workqueue has actually two notifiers. One which handles
| UP/DOWN_FAILED/ONLINE and one which handles DOWN_PREPARE.
|
| Now look at the priorities of those callbacks:
|
| CPU_PRI_WORKQUEUE_UP        = 5
| CPU_PRI_WORKQUEUE_DOWN      = -5
|
| So the call order on DOWN_PREPARE is:
|
| CB 1
| CB ...
| CB workqueue_up() -&gt; Ignores DOWN_PREPARE
| CB ...
| CB X ---&gt; Fails
|
| So we call up to CB X with DOWN_FAILED
|
| CB 1
| CB ...
| CB workqueue_up() -&gt; Handles DOWN_FAILED
| CB ...
| CB X-1
|
| So the problem is that the workqueue stuff handles DOWN_FAILED in the up
| callback, while it should do it in the down callback. Which is not a good idea
| either because it wants to be called early on rollback...
|
| Brilliant stuff, isn't it? The hotplug rework will solve this problem because
| the callbacks become symetric, but for the existing mess, we need some
| workaround in the workqueue code.

The boot CPU handles housekeeping duty(unbound timers, workqueues,
timekeeping, ...) on behalf of full dynticks CPUs. It must remain
online when nohz full is enabled. There is a priority set to every
notifier_blocks:

workqueue_cpu_up &gt; tick_nohz_cpu_down &gt; workqueue_cpu_down

So tick_nohz_cpu_down callback failed when down prepare cpu 0, and
notifier_blocks behind tick_nohz_cpu_down will not be called any
more, which leads to workers are actually not unbound. Then hotplug
state machine will fallback to undo and online cpu 0 again. Workers
will be rebound unconditionally even if they are not unbound and
trigger the warning in this progress.

This patch fix it by catching !DISASSOCIATED to avoid rebind bound
workers.

Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Lai Jiangshan &lt;jiangshanlai@gmail.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Frédéric Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: stable@vger.kernel.org
Suggested-by: Lai Jiangshan &lt;jiangshanlai@gmail.com&gt;
Signed-off-by: Wanpeng Li &lt;wanpeng.li@hotmail.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq</title>
<updated>2016-04-27T19:03:59Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-04-27T19:03:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b75a2bf899b668b1d52de8846aafdbcf81349c73'/>
<id>urn:sha1:b75a2bf899b668b1d52de8846aafdbcf81349c73</id>
<content type='text'>
Pull workqueue fix from Tejun Heo:
 "So, it turns out we had a silly bug in the most fundamental part of
  workqueue for a very long time.  AFAICS, this dates back to pre-git
  era and has quite likely been there from the time workqueue was first
  introduced.

  A work item uses its PENDING bit to synchronize multiple queuers.
  Anyone who wins the PENDING bit owns the pending state of the work
  item.  Whether a queuer wins or loses the race, one thing should be
  guaranteed - there will soon be at least one execution of the work
  item - where "after" means that the execution instance would be able
  to see all the changes that the queuer has made prior to the queueing
  attempt.

  Unfortunately, we were missing a smp_mb() after clearing PENDING for
  execution, so nothing guaranteed visibility of the changes that a
  queueing loser has made, which manifested as a reproducible blk-mq
  stall.

  Lots of kudos to Roman for debugging the problem.  The patch for
  -stable is the minimal one.  For v3.7, Peter is working on a patch to
  make the code path slightly more efficient and less fragile"

* 'for-4.6-fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq:
  workqueue: fix ghost PENDING flag while doing MQ IO
</content>
</entry>
</feed>
