<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/workqueue.c, branch v3.2.45</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.2.45</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.2.45'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2013-03-06T03:22:34Z</updated>
<entry>
<title>workqueue: consider work function when searching for busy work items</title>
<updated>2013-03-06T03:22:34Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-12-18T18:35:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4e392406854be9807fbd51e626ec74530655e8a4'/>
<id>urn:sha1:4e392406854be9807fbd51e626ec74530655e8a4</id>
<content type='text'>
commit a2c1c57be8d9fd5b716113c8991d3d702eeacf77 upstream.

To avoid executing the same work item concurrenlty, workqueue hashes
currently busy workers according to their current work items and looks
up the the table when it wants to execute a new work item.  If there
already is a worker which is executing the new work item, the new item
is queued to the found worker so that it gets executed only after the
current execution finishes.

Unfortunately, a work item may be freed while being executed and thus
recycled for different purposes.  If it gets recycled for a different
work item and queued while the previous execution is still in
progress, workqueue may make the new work item wait for the old one
although the two aren't really related in any way.

In extreme cases, this false dependency may lead to deadlock although
it's extremely unlikely given that there aren't too many self-freeing
work item users and they usually don't wait for other work items.

To alleviate the problem, record the current work function in each
busy worker and match it together with the work item address in
find_worker_executing_work().  While this isn't complete, it ensures
that unrelated work items don't interact with each other and in the
very unlikely case where a twisted wq user triggers it, it's always
onto itself making the culprit easy to spot.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Andrey Isakov &lt;andy51@gmx.ru&gt;
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=51701
[bwh: Backported to 3.2:
 - Adjust context
 - Incorporate earlier logging cleanup in process_one_work() from
   044c782ce3a9 ('workqueue: fix checkpatch issues')]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>workqueue: convert BUG_ON()s in __queue_delayed_work() to WARN_ON_ONCE()s</title>
<updated>2013-01-03T03:32:46Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-12-04T15:40:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7b08c866d7f084a4277921b14c155e029c52a888'/>
<id>urn:sha1:7b08c866d7f084a4277921b14c155e029c52a888</id>
<content type='text'>
commit fc4b514f2727f74a4587c31db87e0e93465518c3 upstream.

8852aac25e ("workqueue: mod_delayed_work_on() shouldn't queue timer on
0 delay") unexpectedly uncovered a very nasty abuse of delayed_work in
megaraid - it allocated work_struct, casted it to delayed_work and
then pass that into queue_delayed_work().

Previously, this was okay because 0 @delay short-circuited to
queue_work() before doing anything with delayed_work.  8852aac25e
moved 0 @delay test into __queue_delayed_work() after sanity check on
delayed_work making megaraid trigger BUG_ON().

Although megaraid is already fixed by c1d390d8e6 ("megaraid: fix
BUG_ON() from incorrect use of delayed work"), this patch converts
BUG_ON()s in __queue_delayed_work() to WARN_ON_ONCE()s so that such
abusers, if there are more, trigger warning but don't crash the
machine.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Xiaotian Feng &lt;xtfeng@gmail.com&gt;
[Shuah Khan: This change is back-ported from upstream change that
 converted BUG_ON()s in __queue_delayed_work() to WARN_ON_ONCE()s.]
Tested on Stable Trees: 3.0.x, 3.4.x, 3.6.x
Signed-off-by: Shuah Khan &lt;shuah.khan@hp.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>workqueue: exit rescuer_thread() as TASK_RUNNING</title>
<updated>2012-12-06T11:20:38Z</updated>
<author>
<name>Mike Galbraith</name>
<email>mgalbraith@suse.de</email>
</author>
<published>2012-11-28T06:17:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=704f66592067ef19abb87da6fd43a59e34823122'/>
<id>urn:sha1:704f66592067ef19abb87da6fd43a59e34823122</id>
<content type='text'>
commit 412d32e6c98527078779e5b515823b2810e40324 upstream.

A rescue thread exiting TASK_INTERRUPTIBLE can lead to a task scheduling
off, never to be seen again.  In the case where this occurred, an exiting
thread hit reiserfs homebrew conditional resched while holding a mutex,
bringing the box to its knees.

PID: 18105  TASK: ffff8807fd412180  CPU: 5   COMMAND: "kdmflush"
 #0 [ffff8808157e7670] schedule at ffffffff8143f489
 #1 [ffff8808157e77b8] reiserfs_get_block at ffffffffa038ab2d [reiserfs]
 #2 [ffff8808157e79a8] __block_write_begin at ffffffff8117fb14
 #3 [ffff8808157e7a98] reiserfs_write_begin at ffffffffa0388695 [reiserfs]
 #4 [ffff8808157e7ad8] generic_perform_write at ffffffff810ee9e2
 #5 [ffff8808157e7b58] generic_file_buffered_write at ffffffff810eeb41
 #6 [ffff8808157e7ba8] __generic_file_aio_write at ffffffff810f1a3a
 #7 [ffff8808157e7c58] generic_file_aio_write at ffffffff810f1c88
 #8 [ffff8808157e7cc8] do_sync_write at ffffffff8114f850
 #9 [ffff8808157e7dd8] do_acct_process at ffffffff810a268f
    [exception RIP: kernel_thread_helper]
    RIP: ffffffff8144a5c0  RSP: ffff8808157e7f58  RFLAGS: 00000202
    RAX: 0000000000000000  RBX: 0000000000000000  RCX: 0000000000000000
    RDX: 0000000000000000  RSI: ffffffff8107af60  RDI: ffff8803ee491d18
    RBP: 0000000000000000   R8: 0000000000000000   R9: 0000000000000000
    R10: 0000000000000000  R11: 0000000000000000  R12: 0000000000000000
    R13: 0000000000000000  R14: 0000000000000000  R15: 0000000000000000
    ORIG_RAX: ffffffffffffffff  CS: 0010  SS: 0018

Signed-off-by: Mike Galbraith &lt;mgalbraith@suse.de&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>workqueue: fix possible stall on try_to_grab_pending() of a delayed work item</title>
<updated>2012-10-17T02:48:33Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2012-09-18T17:40:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=69db9d4382f1a90aa4343e7a4aa4ead72654ca8f'/>
<id>urn:sha1:69db9d4382f1a90aa4343e7a4aa4ead72654ca8f</id>
<content type='text'>
commit 3aa62497594430ea522050b75c033f71f2c60ee6 upstream.

Currently, when try_to_grab_pending() grabs a delayed work item, it
leaves its linked work items alone on the delayed_works.  The linked
work items are always NO_COLOR and will cause future
cwq_activate_first_delayed() increase cwq-&gt;nr_active incorrectly, and
may cause the whole cwq to stall.  For example,

state: cwq-&gt;max_active = 1, cwq-&gt;nr_active = 1
       one work in cwq-&gt;pool, many in cwq-&gt;delayed_works.

step1: try_to_grab_pending() removes a work item from delayed_works
       but leaves its NO_COLOR linked work items on it.

step2: Later on, cwq_activate_first_delayed() activates the linked
       work item increasing -&gt;nr_active.

step3: cwq-&gt;nr_active = 1, but all activated work items of the cwq are
       NO_COLOR.  When they finish, cwq-&gt;nr_active will not be
       decreased due to NO_COLOR, and no further work items will be
       activated from cwq-&gt;delayed_works. the cwq stalls.

Fix it by ensuring the target work item is activated before stealing
PENDING in try_to_grab_pending().  This ensures that all the linked
work items are activated without incorrectly bumping cwq-&gt;nr_active.

tj: Updated comment and description.

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>workqueue: add missing smp_wmb() in process_one_work()</title>
<updated>2012-10-17T02:48:09Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-08-03T17:30:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=563ce0476bddbd8e9818c3c7088186dd01479ffe'/>
<id>urn:sha1:563ce0476bddbd8e9818c3c7088186dd01479ffe</id>
<content type='text'>
commit 959d1af8cffc8fd38ed53e8be1cf4ab8782f9c00 upstream.

WORK_STRUCT_PENDING is used to claim ownership of a work item and
process_one_work() releases it before starting execution.  When
someone else grabs PENDING, all pre-release updates to the work item
should be visible and all updates made by the new owner should happen
afterwards.

Grabbing PENDING uses test_and_set_bit() and thus has a full barrier;
however, clearing doesn't have a matching wmb.  Given the preceding
spin_unlock and use of clear_bit, I don't believe this can be a
problem on an actual machine and there hasn't been any related report
but it still is theretically possible for clear_pending to permeate
upwards and happen before work-&gt;entry update.

Add an explicit smp_wmb() before work_clear_pending().

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>workqueue: reimplement work_on_cpu() using system_wq</title>
<updated>2012-10-10T02:30:49Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-09-18T19:48:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b15bea04d4662a3b8fdf0d039a56f753be151907'/>
<id>urn:sha1:b15bea04d4662a3b8fdf0d039a56f753be151907</id>
<content type='text'>
commit ed48ece27cd3d5ee0354c32bbaec0f3e1d4715c3 upstream.

The existing work_on_cpu() implementation is hugely inefficient.  It
creates a new kthread, execute that single function and then let the
kthread die on each invocation.

Now that system_wq can handle concurrent executions, there's no
advantage of doing this.  Reimplement work_on_cpu() using system_wq
which makes it simpler and way more efficient.

stable: While this isn't a fix in itself, it's needed to fix a
        workqueue related bug in cpufreq/powernow-k8.  AFAICS, this
        shouldn't break other existing users.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Jiri Kosina &lt;jkosina@suse.cz&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Bjorn Helgaas &lt;bhelgaas@google.com&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Rafael J. Wysocki &lt;rjw@sisk.pl&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>workqueue: UNBOUND -&gt; REBIND morphing in rebind_workers() should be atomic</title>
<updated>2012-09-19T14:04:58Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2012-09-01T16:28:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f152c392a7715862818fb671cbd5e0b2545aedae'/>
<id>urn:sha1:f152c392a7715862818fb671cbd5e0b2545aedae</id>
<content type='text'>
commit 96e65306b81351b656835c15931d1d237b252f27 upstream.

The compiler may compile the following code into TWO write/modify
instructions.

	worker-&gt;flags &amp;= ~WORKER_UNBOUND;
	worker-&gt;flags |= WORKER_REBIND;

so the other CPU may temporarily see worker-&gt;flags which doesn't have
either WORKER_UNBOUND or WORKER_REBIND set and perform local wakeup
prematurely.

Fix it by using single explicit assignment via ACCESS_ONCE().

Because idle workers have another WORKER_NOT_RUNNING flag, this bug
doesn't exist for them; however, update it to use the same pattern for
consistency.

tj: Applied the change to idle workers too and updated comments and
    patch description a bit.

stable: Idle worker rebinding doesn't apply for -stable and
        WORKER_UNBOUND used to be WORKER_ROGUE.  Updated accordingly.

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
</content>
</entry>
<entry>
<title>workqueue: perform cpu down operations from low priority cpu_notifier()</title>
<updated>2012-08-02T13:37:51Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-07-17T19:39:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c4f1f3253303865d0fda62c27057009096eeb6ef'/>
<id>urn:sha1:c4f1f3253303865d0fda62c27057009096eeb6ef</id>
<content type='text'>
commit 6575820221f7a4dd6eadecf7bf83cdd154335eda upstream.

Currently, all workqueue cpu hotplug operations run off
CPU_PRI_WORKQUEUE which is higher than normal notifiers.  This is to
ensure that workqueue is up and running while bringing up a CPU before
other notifiers try to use workqueue on the CPU.

Per-cpu workqueues are supposed to remain working and bound to the CPU
for normal CPU_DOWN_PREPARE notifiers.  This holds mostly true even
with workqueue offlining running with higher priority because
workqueue CPU_DOWN_PREPARE only creates a bound trustee thread which
runs the per-cpu workqueue without concurrency management without
explicitly detaching the existing workers.

However, if the trustee needs to create new workers, it creates
unbound workers which may wander off to other CPUs while
CPU_DOWN_PREPARE notifiers are in progress.  Furthermore, if the CPU
down is cancelled, the per-CPU workqueue may end up with workers which
aren't bound to the CPU.

While reliably reproducible with a convoluted artificial test-case
involving scheduling and flushing CPU burning work items from CPU down
notifiers, this isn't very likely to happen in the wild, and, even
when it happens, the effects are likely to be hidden by the following
successful CPU down.

Fix it by using different priorities for up and down notifiers - high
priority for up operations and low priority for down operations.

Workqueue cpu hotplug operations will soon go through further cleanup.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: "Rafael J. Wysocki" &lt;rjw@sisk.pl&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>workqueue: skip nr_running sanity check in worker_enter_idle() if trustee is active</title>
<updated>2012-05-30T23:43:42Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-05-14T22:04:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5d79c6f64a904afc92a329f80abe693e3ae105fe'/>
<id>urn:sha1:5d79c6f64a904afc92a329f80abe693e3ae105fe</id>
<content type='text'>
commit 544ecf310f0e7f51fa057ac2a295fc1b3b35a9d3 upstream.

worker_enter_idle() has WARN_ON_ONCE() which triggers if nr_running
isn't zero when every worker is idle.  This can trigger spuriously
while a cpu is going down due to the way trustee sets %WORKER_ROGUE
and zaps nr_running.

It first sets %WORKER_ROGUE on all workers without updating
nr_running, releases gcwq-&gt;lock, schedules, regrabs gcwq-&gt;lock and
then zaps nr_running.  If the last running worker enters idle
inbetween, it would see stale nr_running which hasn't been zapped yet
and trigger the WARN_ON_ONCE().

Fix it by performing the sanity check iff the trustee is idle.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>Block: use a freezable workqueue for disk-event polling</title>
<updated>2012-03-19T16:02:34Z</updated>
<author>
<name>Alan Stern</name>
<email>stern@rowland.harvard.edu</email>
</author>
<published>2012-03-02T09:51:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2053689f68e19b8c1bb38aa68049c57576eed6e0'/>
<id>urn:sha1:2053689f68e19b8c1bb38aa68049c57576eed6e0</id>
<content type='text'>
commit 62d3c5439c534b0e6c653fc63e6d8c67be3a57b1 upstream.

This patch (as1519) fixes a bug in the block layer's disk-events
polling.  The polling is done by a work routine queued on the
system_nrt_wq workqueue.  Since that workqueue isn't freezable, the
polling continues even in the middle of a system sleep transition.

Obviously, polling a suspended drive for media changes and such isn't
a good thing to do; in the case of USB mass-storage devices it can
lead to real problems requiring device resets and even re-enumeration.

The patch fixes things by creating a new system-wide, non-reentrant,
freezable workqueue and using it for disk-events polling.

Signed-off-by: Alan Stern &lt;stern@rowland.harvard.edu&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Rafael J. Wysocki &lt;rjw@sisk.pl&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
