<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/rcu/tree_exp.h, branch v5.4.55</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.4.55</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.4.55'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2020-03-05T15:43:50Z</updated>
<entry>
<title>rcu: Allow only one expedited GP to run concurrently with wakeups</title>
<updated>2020-03-05T15:43:50Z</updated>
<author>
<name>Neeraj Upadhyay</name>
<email>neeraju@codeaurora.org</email>
</author>
<published>2019-11-19T19:50:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ef0dcab6d21cc2e717ad8d9e4138646c8c7bd886'/>
<id>urn:sha1:ef0dcab6d21cc2e717ad8d9e4138646c8c7bd886</id>
<content type='text'>
commit 4bc6b745e5cbefed92c48071e28a5f41246d0470 upstream.

The current expedited RCU grace-period code expects that a task
requesting an expedited grace period cannot awaken until that grace
period has reached the wakeup phase.  However, it is possible for a long
preemption to result in the waiting task never sleeping.  For example,
consider the following sequence of events:

1.	Task A starts an expedited grace period by invoking
	synchronize_rcu_expedited().  It proceeds normally up to the
	wait_event() near the end of that function, and is then preempted
	(or interrupted or whatever).

2.	The expedited grace period completes, and a kworker task starts
	the awaken phase, having incremented the counter and acquired
	the rcu_state structure's .exp_wake_mutex.  This kworker task
	is then preempted or interrupted or whatever.

3.	Task A resumes and enters wait_event(), which notes that the
	expedited grace period has completed, and thus doesn't sleep.

4.	Task B starts an expedited grace period exactly as did Task A,
	complete with the preemption (or whatever delay) just before
	the call to wait_event().

5.	The expedited grace period completes, and another kworker
	task starts the awaken phase, having incremented the counter.
	However, it blocks when attempting to acquire the rcu_state
	structure's .exp_wake_mutex because step 2's kworker task has
	not yet released it.

6.	Steps 4 and 5 repeat, resulting in overflow of the rcu_node
	structure's -&gt;exp_wq[] array.

In theory, this is harmless.  Tasks waiting on the various -&gt;exp_wq[]
array will just be spuriously awakened, but they will just sleep again
on noting that the rcu_state structure's -&gt;expedited_sequence value has
not advanced far enough.

In practice, this wastes CPU time and is an accident waiting to happen.
This commit therefore moves the rcu_exp_gp_seq_end() call that officially
ends the expedited grace period (along with associate tracing) until
after the -&gt;exp_wake_mutex has been acquired.  This prevents Task A from
awakening prematurely, thus preventing more than one expedited grace
period from being in flight during a previous expedited grace period's
wakeup phase.

Fixes: 3b5f668e715b ("rcu: Overlap wakeups with next expedited grace period")
Signed-off-by: Neeraj Upadhyay &lt;neeraju@codeaurora.org&gt;
[ paulmck: Added updated comment. ]
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>rcu: Fix missed wakeup of exp_wq waiters</title>
<updated>2020-02-24T07:36:22Z</updated>
<author>
<name>Neeraj Upadhyay</name>
<email>neeraju@codeaurora.org</email>
</author>
<published>2019-11-19T03:17:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b7725deb9d611443040807787e17364191d30f9a'/>
<id>urn:sha1:b7725deb9d611443040807787e17364191d30f9a</id>
<content type='text'>
[ Upstream commit fd6bc19d7676a060a171d1cf3dcbf6fd797eb05f ]

Tasks waiting within exp_funnel_lock() for an expedited grace period to
elapse can be starved due to the following sequence of events:

1.	Tasks A and B both attempt to start an expedited grace
	period at about the same time.	This grace period will have
	completed when the lower four bits of the rcu_state structure's
	-&gt;expedited_sequence field are 0b'0100', for example, when the
	initial value of this counter is zero.	Task A wins, and thus
	does the actual work of starting the grace period, including
	acquiring the rcu_state structure's .exp_mutex and sets the
	counter to 0b'0001'.

2.	Because task B lost the race to start the grace period, it
	waits on -&gt;expedited_sequence to reach 0b'0100' inside of
	exp_funnel_lock(). This task therefore blocks on the rcu_node
	structure's -&gt;exp_wq[1] field, keeping in mind that the
	end-of-grace-period value of -&gt;expedited_sequence (0b'0100')
	is shifted down two bits before indexing the -&gt;exp_wq[] field.

3.	Task C attempts to start another expedited grace period,
	but blocks on -&gt;exp_mutex, which is still held by Task A.

4.	The aforementioned expedited grace period completes, so that
	-&gt;expedited_sequence now has the value 0b'0100'.  A kworker task
	therefore acquires the rcu_state structure's -&gt;exp_wake_mutex
	and starts awakening any tasks waiting for this grace period.

5.	One of the first tasks awakened happens to be Task A.  Task A
	therefore releases the rcu_state structure's -&gt;exp_mutex,
	which allows Task C to start the next expedited grace period,
	which causes the lower four bits of the rcu_state structure's
	-&gt;expedited_sequence field to become 0b'0101'.

6.	Task C's expedited grace period completes, so that the lower four
	bits of the rcu_state structure's -&gt;expedited_sequence field now
	become 0b'1000'.

7.	The kworker task from step 4 above continues its wakeups.
	Unfortunately, the wake_up_all() refetches the rcu_state
	structure's .expedited_sequence field:

	wake_up_all(&amp;rnp-&gt;exp_wq[rcu_seq_ctr(rcu_state.expedited_sequence) &amp; 0x3]);

	This results in the wakeup being applied to the rcu_node
	structure's -&gt;exp_wq[2] field, which is unfortunate given that
	Task B is instead waiting on -&gt;exp_wq[1].

On a busy system, no harm is done (or at least no permanent harm is done).
Some later expedited grace period will redo the wakeup.  But on a quiet
system, such as many embedded systems, it might be a good long time before
there was another expedited grace period.  On such embedded systems,
this situation could therefore result in a system hang.

This issue manifested as DPM device timeout during suspend (which
usually qualifies as a quiet time) due to a SCSI device being stuck in
_synchronize_rcu_expedited(), with the following stack trace:

	schedule()
	synchronize_rcu_expedited()
	synchronize_rcu()
	scsi_device_quiesce()
	scsi_bus_suspend()
	dpm_run_callback()
	__device_suspend()

This commit therefore prevents such delays, timeouts, and hangs by
making rcu_exp_wait_wake() use its "s" argument consistently instead of
refetching from rcu_state.expedited_sequence.

Fixes: 3b5f668e715b ("rcu: Overlap wakeups with next expedited grace period")
Signed-off-by: Neeraj Upadhyay &lt;neeraju@codeaurora.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>rcu: Use *_ONCE() to protect lockless -&gt;expmask accesses</title>
<updated>2020-02-11T12:35:08Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@kernel.org</email>
</author>
<published>2019-10-08T01:53:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a523031513b755ab7b3b556b78ff0a385cc3498c'/>
<id>urn:sha1:a523031513b755ab7b3b556b78ff0a385cc3498c</id>
<content type='text'>
commit 15c7c972cd26d89a26788e609c53b5a465324a6c upstream.

The rcu_node structure's -&gt;expmask field is accessed locklessly when
starting a new expedited grace period and when reporting an expedited
RCU CPU stall warning.  This commit therefore handles the former by
taking a snapshot of -&gt;expmask while the lock is held and the latter
by applying READ_ONCE() to lockless reads and WRITE_ONCE() to the
corresponding updates.

Link: https://lore.kernel.org/lkml/CANpmjNNmSOagbTpffHr4=Yedckx9Rm2NuGqC9UqE+AOz5f1-ZQ@mail.gmail.com
Reported-by: syzbot+134336b86f728d6e55a0@syzkaller.appspotmail.com
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Acked-by: Marco Elver &lt;elver@google.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>rcu: Fix spelling mistake "greate"-&gt;"great"</title>
<updated>2019-08-12T18:25:06Z</updated>
<author>
<name>Mukesh Ojha</name>
<email>mojha@codeaurora.org</email>
</author>
<published>2019-07-29T07:55:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=511b44f7598ce602f9efce687ca9eec013967d9b'/>
<id>urn:sha1:511b44f7598ce602f9efce687ca9eec013967d9b</id>
<content type='text'>
This commit fixes a spelling mistake in file tree_exp.h.

Signed-off-by: Mukesh Ojha &lt;mojha@codeaurora.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.ibm.com&gt;
</content>
</entry>
<entry>
<title>rcu: Add destroy_work_on_stack() to match INIT_WORK_ONSTACK()</title>
<updated>2019-08-01T21:05:51Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@linux.ibm.com</email>
</author>
<published>2019-06-19T22:42:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fbad01af8c3bb9618848abde8054ab7e0c2330fe'/>
<id>urn:sha1:fbad01af8c3bb9618848abde8054ab7e0c2330fe</id>
<content type='text'>
The synchronize_rcu_expedited() function has an INIT_WORK_ONSTACK(),
but lacks the corresponding destroy_work_on_stack().  This commit
therefore adds destroy_work_on_stack().

Reported-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.ibm.com&gt;
Acked-by: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
</content>
</entry>
<entry>
<title>Merge branches 'consolidate.2019.05.28a', 'doc.2019.05.28a', 'fixes.2019.06.13a', 'srcu.2019.05.28a', 'sync.2019.05.28a' and 'torture.2019.05.28a' into HEAD</title>
<updated>2019-06-19T16:21:46Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@linux.ibm.com</email>
</author>
<published>2019-06-19T16:21:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=11ca7a9d541d09586fbf89290d1e14619cc40de0'/>
<id>urn:sha1:11ca7a9d541d09586fbf89290d1e14619cc40de0</id>
<content type='text'>
consolidate.2019.05.28a: RCU flavor consolidation cleanups and optmizations.
doc.2019.05.28a: Documentation updates.
fixes.2019.06.13a: Miscellaneous fixes.
srcu.2019.05.28a: SRCU updates.
sync.2019.05.28a: RCU-sync flavor consolidation.
torture.2019.05.28a: Torture-test updates.
</content>
</entry>
<entry>
<title>rcu: Upgrade sync_exp_work_done() to smp_mb()</title>
<updated>2019-06-13T22:33:19Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@linux.ibm.com</email>
</author>
<published>2019-04-20T08:40:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=96050c68be33edef18800ad6748f61f81db81a20'/>
<id>urn:sha1:96050c68be33edef18800ad6748f61f81db81a20</id>
<content type='text'>
The sync_exp_work_done() function uses smp_mb__before_atomic(), but
there is no obvious atomic in the ensuing code.  The ordering is
absolutely required for grace periods to work correctly, so this
commit upgrades the smp_mb__before_atomic() to smp_mb().

Fixes: 6fba2b3767ea ("rcu: Remove deprecated RCU debugfs tracing code")
Reported-by: Andrea Parri &lt;andrea.parri@amarulasolutions.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.ibm.com&gt;
</content>
</entry>
<entry>
<title>rcu: Remove unused rdp local from synchronize_rcu_expedited()</title>
<updated>2019-05-28T15:48:19Z</updated>
<author>
<name>Jiang Biao</name>
<email>benbjiang@tencent.com</email>
</author>
<published>2019-04-23T01:21:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f0b635627395223d3c60a3105372b4349e04772f'/>
<id>urn:sha1:f0b635627395223d3c60a3105372b4349e04772f</id>
<content type='text'>
Because rdp is initialized but never used in synchronize_rcu_expedited(),
this commit removes it.

Signed-off-by: Jiang Biao &lt;benbjiang@tencent.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.ibm.com&gt;
</content>
</entry>
<entry>
<title>rcu: Rename rcu_data's -&gt;deferred_qs to -&gt;exp_deferred_qs</title>
<updated>2019-05-28T15:48:19Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@linux.ibm.com</email>
</author>
<published>2019-03-27T22:51:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1bb336443cde1154600bd147a45a30baa59c57db'/>
<id>urn:sha1:1bb336443cde1154600bd147a45a30baa59c57db</id>
<content type='text'>
The rcu_data structure's -&gt;deferred_qs field is used to indicate that the
current CPU is blocking an expedited grace period (perhaps a future one).
Given that it is used only for expedited grace periods, its current name
is misleading, so this commit renames it to -&gt;exp_deferred_qs.

Signed-off-by: Paul E. McKenney &lt;paulmck@linux.ibm.com&gt;
</content>
</entry>
<entry>
<title>rcu: Avoid self-IPI in sync_sched_exp_online_cleanup()</title>
<updated>2019-05-25T21:50:50Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@linux.ibm.com</email>
</author>
<published>2019-03-27T17:03:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e015a341122024198f57d1f0498a776523137e94'/>
<id>urn:sha1:e015a341122024198f57d1f0498a776523137e94</id>
<content type='text'>
The sync_sched_exp_online_cleanup() is invoked at online time to handle
the case where the start of an expedited grace period ran concurrently
with a CPU being taken offline and then immediately being placed online.
It checks to see if RCU needs an expedited quiescent state from the
incoming CPU, sending it an IPI if so.  However, it is quite possible
that sync_sched_exp_online_cleanup() is running on that CPU, in which
case it is considerably less overhead to simply request the quiescent
state locally instead of simulating a self-IPI.

This commit therefore places the last few lines of rcu_exp_handler()
into a new rcu_exp_need_qs() function, which is invoked both by
rcu_exp_handler() and by sync_sched_exp_online_cleanup() in the self-IPI
case.

This also reduces the rcu_exp_handler() function's state space by
removing the direct call that this smp_call_function_single() uses to
emulate the requested self-IPI.  This in turn will allow tighter error
checking in rcu_is_cpu_rrupt_from_idle().

Signed-off-by: Paul E. McKenney &lt;paulmck@linux.ibm.com&gt;
Reviewed-by: Joel Fernandes (Google) &lt;joel@joelfernandes.org&gt;
</content>
</entry>
</feed>
