<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/srcu.c, branch v3.7.8</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.7.8</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.7.8'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2012-08-20T21:51:24Z</updated>
<entry>
<title>workqueue: deprecate system_nrt[_freezable]_wq</title>
<updated>2012-08-20T21:51:24Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-08-20T21:51:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3b07e9ca26866697616097044f25fbe53dbab693'/>
<id>urn:sha1:3b07e9ca26866697616097044f25fbe53dbab693</id>
<content type='text'>
system_nrt[_freezable]_wq are now spurious.  Mark them deprecated and
convert all users to system[_freezable]_wq.

If you're cc'd and wondering what's going on: Now all workqueues are
non-reentrant, so there's no reason to use system_nrt[_freezable]_wq.
Please use system[_freezable]_wq instead.

This patch doesn't make any functional difference.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-By: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;

Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: David Airlie &lt;airlied@linux.ie&gt;
Cc: Jiri Kosina &lt;jkosina@suse.cz&gt;
Cc: "David S. Miller" &lt;davem@davemloft.net&gt;
Cc: Rusty Russell &lt;rusty@rustcorp.com.au&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
</content>
</entry>
<entry>
<title>rcu: Implement per-domain single-threaded call_srcu() state machine</title>
<updated>2012-04-30T17:48:25Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2012-03-19T08:12:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=931ea9d1a6e06a5e3af03aa4aaaa7c7fd90e163f'/>
<id>urn:sha1:931ea9d1a6e06a5e3af03aa4aaaa7c7fd90e163f</id>
<content type='text'>
This commit implements an SRCU state machine in support of call_srcu().
The state machine is preemptible, light-weight, and single-threaded,
minimizing synchronization overhead.  In particular, there is no longer
any need for synchronize_srcu() to be guarded by a mutex.

Expedited processing is handled, at least in the absence of concurrent
grace-period operations on that same srcu_struct structure, by having
the synchronize_srcu_expedited() thread take on the role of the
workqueue thread for one iteration.

There is a reasonable probability that a given SRCU callback will
be invoked on the same CPU that registered it, however, there is no
guarantee.  Concurrent SRCU grace-period primitives can cause callbacks
to be executed elsewhere, even in absence of CPU-hotplug operations.

Callbacks execute in process context, but under the influence of
local_bh_disable(), so it is illegal to sleep in an SRCU callback
function.

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
</content>
</entry>
<entry>
<title>rcu: Use single value to handle expedited SRCU grace periods</title>
<updated>2012-04-30T17:48:24Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2012-03-19T08:12:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d9792edd7a9a0858a3b1df92cf8beb31e4191e3c'/>
<id>urn:sha1:d9792edd7a9a0858a3b1df92cf8beb31e4191e3c</id>
<content type='text'>
The earlier algorithm used an "expedited" flag combined with a "trycount"
counter to differentiate between normal and expedited SRCU grace periods.
However, the difference can be encoded into a single counter with a cutoff
value and different initial values for expedited and normal SRCU grace
periods.  This commit makes that change.

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;

Conflicts:

	kernel/srcu.c
</content>
</entry>
<entry>
<title>rcu: Improve srcu_readers_active_idx()'s cache locality</title>
<updated>2012-04-30T17:48:24Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2012-03-06T09:57:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dc87917501e324701dbfb249def44054b5220187'/>
<id>urn:sha1:dc87917501e324701dbfb249def44054b5220187</id>
<content type='text'>
Expand the calls to srcu_readers_active_idx() from srcu_readers_active()
inline.  This change improves cache locality by interating over the CPUs
once rather than twice.

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
</content>
</entry>
<entry>
<title>rcu: Implement a variant of Peter's SRCU algorithm</title>
<updated>2012-04-30T17:48:22Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2012-02-27T17:29:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b52ce066c55a6a53cf1f8d71308d74f908e31b99'/>
<id>urn:sha1:b52ce066c55a6a53cf1f8d71308d74f908e31b99</id>
<content type='text'>
This commit implements a variant of Peter's algorithm, which may be found
at https://lkml.org/lkml/2012/2/1/119.

o	Make the checking lock-free to enable parallel checking.
	Parallel checking is required when (1) the original checking
	task is preempted for a long time, (2) sychronize_srcu_expedited()
	starts during an ongoing SRCU grace period, or (3) we wish to
	avoid acquiring a lock.

o	Since the checking is lock-free, we avoid a mutex in state machine
	for call_srcu().

o	Remove the SRCU_REF_MASK and remove the coupling with the flipping.
	This might allow us to remove the preempt_disable() in future
	versions, though such removal will need great care because it
	rescinds the one-old-reader-per-CPU guarantee.

o	Remove a smp_mb(), simplify the comments and make the smp_mb() pairs
	more intuitive.

Inspired-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
</content>
</entry>
<entry>
<title>rcu: Improve SRCU's wait_idx() comments</title>
<updated>2012-04-30T17:48:22Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2012-02-27T17:28:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=18108ebfebe9e871d0a9af830baf8f5df69eb5fc'/>
<id>urn:sha1:18108ebfebe9e871d0a9af830baf8f5df69eb5fc</id>
<content type='text'>
The safety of SRCU is provided byy wait_idx() rather than flipping.
The flipping actually prevents starvation.

This commit therefore updates the comments to more accurately and
precisely describe what is going on.

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
</content>
</entry>
<entry>
<title>rcu: Flip -&gt;completed only once per SRCU grace period</title>
<updated>2012-04-30T17:48:21Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2012-02-23T00:43:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=944ce9af4767ca085d465e4add69df11a8faa9ef'/>
<id>urn:sha1:944ce9af4767ca085d465e4add69df11a8faa9ef</id>
<content type='text'>
This is an optimization of the SRCU grace period.  To guard against
preempted readers with old values of the counter, it suffices to scan the
old counters once more, then flip -&gt;completed only one time.  The reason
this works is that the old readers must have incremented the old set of
counters (if they have not yet incremented, then their critical section
starts after this grace period, so they may be safely ignored).

This commit therefore optimizes the second flip out in favor of a simple
rescan.

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
</content>
</entry>
<entry>
<title>rcu: Increment upper bit only for srcu_read_lock()</title>
<updated>2012-04-30T17:48:20Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2012-02-22T21:29:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=440253c17fc4ed41d778492a7fb44dc0d756eccc'/>
<id>urn:sha1:440253c17fc4ed41d778492a7fb44dc0d756eccc</id>
<content type='text'>
The purpose of the upper bit of SRCU's per-CPU counters is to guarantee
that no reasonable series of srcu_read_lock() and srcu_read_unlock()
operations can return the value of the counter to its original value.
This guarantee is require only after the index has been switched to
the other set of counters, so at most one srcu_read_lock() can affect
a given CPU's counter.  The number of srcu_read_unlock() operations
on a given counter is limited to the number of tasks in the system,
which given the Linux kernel's current structure is limited to far less
than 2^30 on 32-bit systems and far less than 2^62 on 64-bit systems.
(Something about a limited number of bytes in the kernel's address space.)

Therefore, if srcu_read_lock() increments the upper bits, then
srcu_read_unlock() need not do so.  In this case, an srcu_read_lock() and
an srcu_read_unlock() will flip the lower bit of the upper field of the
counter.  An unreasonably large additional number of srcu_read_unlock()
operations would be required to return the counter to its initial value,
thus preserving the guarantee.

This commit takes this approach, which further allows it to shrink
the size of the upper field to one bit, making the number of
srcu_read_unlock() operations required to return the counter to its
initial value even more unreasonable than before.

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
</content>
</entry>
<entry>
<title>rcu: Remove fast check path from __synchronize_srcu()</title>
<updated>2012-04-30T17:48:20Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>laijs@cn.fujitsu.com</email>
</author>
<published>2012-02-22T21:06:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4b7a3e9e32114a09c61995048f055615b5d4c26d'/>
<id>urn:sha1:4b7a3e9e32114a09c61995048f055615b5d4c26d</id>
<content type='text'>
The fastpath in __synchronize_srcu() is designed to handle cases where
there are a large number of concurrent calls for the same srcu_struct
structure.  However, the Linux kernel currently does not use SRCU in
this manner, so remove the fastpath checks for simplicity.

Signed-off-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
</content>
</entry>
<entry>
<title>rcu: Direct algorithmic SRCU implementation</title>
<updated>2012-04-30T17:48:19Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paul.mckenney@linaro.org</email>
</author>
<published>2012-02-05T15:42:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cef50120b61c2af4ce34bc165e19cad66296f93d'/>
<id>urn:sha1:cef50120b61c2af4ce34bc165e19cad66296f93d</id>
<content type='text'>
The current implementation of synchronize_srcu_expedited() can cause
severe OS jitter due to its use of synchronize_sched(), which in turn
invokes try_stop_cpus(), which causes each CPU to be sent an IPI.
This can result in severe performance degradation for real-time workloads
and especially for short-interation-length HPC workloads.  Furthermore,
because only one instance of try_stop_cpus() can be making forward progress
at a given time, only one instance of synchronize_srcu_expedited() can
make forward progress at a time, even if they are all operating on
distinct srcu_struct structures.

This commit, inspired by an earlier implementation by Peter Zijlstra
(https://lkml.org/lkml/2012/1/31/211) and by further offline discussions,
takes a strictly algorithmic bits-in-memory approach.  This has the
disadvantage of requiring one explicit memory-barrier instruction in
each of srcu_read_lock() and srcu_read_unlock(), but on the other hand
completely dispenses with OS jitter and furthermore allows SRCU to be
used freely by CPUs that RCU believes to be idle or offline.

The update-side implementation handles the single read-side memory
barrier by rechecking the per-CPU counters after summing them and
by running through the update-side state machine twice.

This implementation has passed moderate rcutorture testing on both
x86 and Power.  Also updated to use this_cpu_ptr() instead of per_cpu_ptr(),
as suggested by Peter Zijlstra.

Reported-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Paul E. McKenney &lt;paul.mckenney@linaro.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Reviewed-by: Lai Jiangshan &lt;laijs@cn.fujitsu.com&gt;
</content>
</entry>
</feed>
