<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/locking/qspinlock.c, branch v4.9.105</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.105</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.105'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-05-30T05:50:26Z</updated>
<entry>
<title>locking/qspinlock: Ensure node-&gt;count is updated before initialising node</title>
<updated>2018-05-30T05:50:26Z</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2018-02-13T13:22:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c8723ceed341db0e62570f5b996554bd60026af5'/>
<id>urn:sha1:c8723ceed341db0e62570f5b996554bd60026af5</id>
<content type='text'>
[ Upstream commit 11dc13224c975efcec96647a4768a6f1bb7a19a8 ]

When queuing on the qspinlock, the count field for the current CPU's head
node is incremented. This needn't be atomic because locking in e.g. IRQ
context is balanced and so an IRQ will return with node-&gt;count as it
found it.

However, the compiler could in theory reorder the initialisation of
node[idx] before the increment of the head node-&gt;count, causing an
IRQ to overwrite the initialised node and potentially corrupt the lock
state.

Avoid the potential for this harmful compiler reordering by placing a
barrier() between the increment of the head node-&gt;count and the subsequent
node initialisation.
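
For illustration, a minimal sketch of the fix in
queued_spin_lock_slowpath() (paraphrased from the upstream diff, with
the surrounding slowpath code elided):

	node = this_cpu_ptr(&amp;mcs_nodes[0]);
	idx = node-&gt;count++;
	tail = encode_tail(smp_processor_id(), idx);

	node += idx;

	/*
	 * Ensure that we increment the head node-&gt;count before
	 * initialising the actual node; if the compiler reordered
	 * these stores, an IRQ could overwrite our assignments.
	 */
	barrier();

	node-&gt;locked = 0;
	node-&gt;next = NULL;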

Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1518528177-19169-3-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>locking/qspinlock: Use __this_cpu_dec() instead of full-blown this_cpu_dec()</title>
<updated>2016-06-27T09:37:41Z</updated>
<author>
<name>Pan Xinhui</name>
<email>xinhui.pan@linux.vnet.ibm.com</email>
</author>
<published>2016-06-14T06:37:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0dceeaf599e6d9b8bd908ba4bd3dfee84aa26be2'/>
<id>urn:sha1:0dceeaf599e6d9b8bd908ba4bd3dfee84aa26be2</id>
<content type='text'>
queued_spin_lock_slowpath() should not worry about another
queued_spin_lock_slowpath() running in interrupt context and
changing node-&gt;count by accident, because node-&gt;count keeps
the same value every time we enter/leave queued_spin_lock_slowpath().

On some architectures this_cpu_dec() will save/restore irq flags,
which has high overhead. Use the much cheaper __this_cpu_dec() instead.
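
The change itself is a one-liner in the node release path of
queued_spin_lock_slowpath() (sketched from the upstream diff; the
comment is illustrative):

	/*
	 * Release the node. No IRQ can change node-&gt;count under us,
	 * so the non-IRQ-safe decrement is sufficient and cheaper.
	 */
	__this_cpu_dec(mcs_nodes[0].count);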

Signed-off-by: Pan Xinhui &lt;xinhui.pan@linux.vnet.ibm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Waiman.Long@hpe.com
Link: http://lkml.kernel.org/r/1465886247-3773-1-git-send-email-xinhui.pan@linux.vnet.ibm.com
[ Rewrote changelog. ]
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/barriers: Introduce smp_acquire__after_ctrl_dep()</title>
<updated>2016-06-14T09:55:14Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-05-24T11:17:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=33ac279677dcc2441cb93d8cb9cf7a74df62814d'/>
<id>urn:sha1:33ac279677dcc2441cb93d8cb9cf7a74df62814d</id>
<content type='text'>
Introduce smp_acquire__after_ctrl_dep(). This construct is not
uncommon, but the lack of a barrier expressing it is.

Use it to better express smp_rmb() uses in WRITE_ONCE(), the IPC
semaphore code and the qspinlock code.
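
For reference, the primitive upgrades a control dependency to ACQUIRE
ordering with a read barrier. A typical wait loop would use it like
this (ptr and val are illustrative, not from a particular call site):

	/* spin until we observe the store... */
	while (!(val = READ_ONCE(*ptr)))
		cpu_relax();

	/* ...then turn the control dependency into ACQUIRE ordering */
	smp_acquire__after_ctrl_dep();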

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/barriers: Replace smp_cond_acquire() with smp_cond_load_acquire()</title>
<updated>2016-06-14T09:54:27Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-04-04T08:57:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1f03e8d2919270bd6ef64f39a45ce8df8a9f012a'/>
<id>urn:sha1:1f03e8d2919270bd6ef64f39a45ce8df8a9f012a</id>
<content type='text'>
This new form allows using hardware-assisted waiting.

Some hardware (ARM64 and x86) allows monitoring an address for
changes, so by providing a pointer we can replace the cpu_relax()
with hardware-optimized waiting methods in the future.
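
The generic fallback still spins with cpu_relax(); roughly (sketched
from the generic definition, which architectures may override):

	#define smp_cond_load_acquire(ptr, cond_expr) ({	\
		typeof(ptr) __PTR = (ptr);			\
		typeof(*ptr) VAL;				\
		for (;;) {					\
			VAL = READ_ONCE(*__PTR);		\
			if (cond_expr)				\
				break;				\
			cpu_relax();				\
		}						\
		smp_acquire__after_ctrl_dep();			\
		VAL;						\
	})

A caller waits on a condition over the pointed-to value, e.g.:

	val = smp_cond_load_acquire(&amp;lock-&gt;val.counter,
				    !(VAL &amp; _Q_LOCKED_PENDING_MASK));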

Requested-by: Will Deacon &lt;will.deacon@arm.com&gt;
Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/qspinlock: Add comments</title>
<updated>2016-06-08T12:44:01Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-06-08T08:36:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=055ce0fd1b86c204430cbc0887165599d6e15090'/>
<id>urn:sha1:055ce0fd1b86c204430cbc0887165599d6e15090</id>
<content type='text'>
I figured we need to document the spin_is_locked() and
spin_unlock_wait() constraints somewhere.

Ideally 'someone' would rewrite Documentation/atomic_ops.txt and we
could find a place in there. But currently that document is stale to
the point of hardly being useful.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Pan Xinhui &lt;xinhui.pan@linux.vnet.ibm.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Waiman Long &lt;waiman.long@hpe.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/qspinlock: Clarify xchg_tail() ordering</title>
<updated>2016-06-08T12:44:00Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-06-08T07:12:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8d53fa19041ae65c484d81d75179b4a577e6d8e4'/>
<id>urn:sha1:8d53fa19041ae65c484d81d75179b4a577e6d8e4</id>
<content type='text'>
While going over the code I noticed that xchg_tail() is a RELEASE but
had no obvious pairing commented.

It pairs with a somewhat unusual address dependency through
decode_tail().

That is, the store-release of xchg_tail() is paired with the load half
of xchg_tail(), followed by the dereference of the pointer computed
from that load.

The @old -&gt; @prev transformation itself is pure and therefore does
not depend on external state, so it is immaterial w.r.t. ordering.
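
In code, the pairing looks roughly like this (a sketch of the
slowpath, with the waiting elided):

	old = xchg_tail(lock, tail);	/* store-RELEASE of the new tail */

	if (old &amp; _Q_TAIL_MASK) {
		prev = decode_tail(old);	/* pointer computed from the load... */
		WRITE_ONCE(prev-&gt;next, node);	/* ...dereferenced: the address dependency */
	}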

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Pan Xinhui &lt;xinhui.pan@linux.vnet.ibm.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Waiman Long &lt;waiman.long@hpe.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/qspinlock: Fix spin_unlock_wait() some more</title>
<updated>2016-06-08T12:29:08Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-06-08T08:19:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2c610022711675ee908b903d242f0b90e1db661f'/>
<id>urn:sha1:2c610022711675ee908b903d242f0b90e1db661f</id>
<content type='text'>
While this prior commit:

  54cf809b9512 ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()")

... fixes spin_is_locked() and spin_unlock_wait() for the usage
in ipc/sem and netfilter, it does not in fact work right for the
usage in task_work and futex.

So while the two-crossed-locks problem:

	spin_lock(A)		spin_lock(B)
	if (!spin_is_locked(B)) spin_unlock_wait(A)
	  foo()			foo();

... works with the smp_mb() injected by both spin_is_locked() and
spin_unlock_wait(), this is not sufficient for:

	flag = 1;
	smp_mb();		spin_lock()
	spin_unlock_wait()	if (!flag)
				  // add to lockless list
	// iterate lockless list

... because in this scenario, the store from spin_lock() can be delayed
past the load of flag, uncrossing the variables and losing the
guarantee.

This patch reworks spin_is_locked() and spin_unlock_wait() to work in
both cases by exploiting the observation that while the lock byte
store can be delayed, the contender must have registered itself
visibly in other state contained in the word.

It also allows for architectures to override both functions, as PPC
and ARM64 have an additional issue for which we currently have no
generic solution.
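
The reworked generic queued_spin_unlock_wait() then looks roughly like
this (a sketch; the exact upstream version may differ in detail):

	void queued_spin_unlock_wait(struct qspinlock *lock)
	{
		u32 val;

		for (;;) {
			val = atomic_read(&amp;lock-&gt;val);

			if (!val) /* not locked, we're done */
				goto done;

			if (val &amp; _Q_LOCKED_MASK) /* locked, wait for unlock */
				break;

			/* not locked, but pending: wait until we observe the lock */
			cpu_relax();
		}

		/* any unlock is good */
		while (atomic_read(&amp;lock-&gt;val) &amp; _Q_LOCKED_MASK)
			cpu_relax();

	done:
		smp_acquire__after_ctrl_dep();
	}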

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Giovanni Gherdovich &lt;ggherdovich@suse.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Pan Xinhui &lt;xinhui.pan@linux.vnet.ibm.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Waiman Long &lt;waiman.long@hpe.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: stable@vger.kernel.org # v4.2 and later
Fixes: 54cf809b9512 ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()")
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/qspinlock: Use smp_cond_acquire() in pending code</title>
<updated>2016-02-29T09:02:42Z</updated>
<author>
<name>Waiman Long</name>
<email>Waiman.Long@hpe.com</email>
</author>
<published>2015-12-10T20:17:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cb037fdad6772df2d49fe61c97d7c0d8265bc918'/>
<id>urn:sha1:cb037fdad6772df2d49fe61c97d7c0d8265bc918</id>
<content type='text'>
The newly introduced smp_cond_acquire() was used to replace the
slowpath lock acquisition loop. Similarly, the new function can also
be applied to the pending bit locking loop. This patch uses the new
function in that loop.
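
Concretely, the pending-bit wait loop changes roughly as follows
(sketched from the diff):

	/* before: open-coded load-acquire loop */
	while ((val = smp_load_acquire(&amp;lock-&gt;val.counter)) &amp; _Q_LOCKED_MASK)
		cpu_relax();

	/* after: the primitive expresses the wait + ACQUIRE directly */
	smp_cond_acquire(!(atomic_read(&amp;lock-&gt;val) &amp; _Q_LOCKED_MASK));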

Signed-off-by: Waiman Long &lt;Waiman.Long@hpe.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Douglas Hatch &lt;doug.hatch@hpe.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Scott J Norton &lt;scott.norton@hpe.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1449778666-13593-1-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/pvqspinlock: Queue node adaptive spinning</title>
<updated>2015-12-04T10:39:51Z</updated>
<author>
<name>Waiman Long</name>
<email>Waiman.Long@hpe.com</email>
</author>
<published>2015-11-10T00:09:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cd0272fab785077c121aa91ec2401090965bbc37'/>
<id>urn:sha1:cd0272fab785077c121aa91ec2401090965bbc37</id>
<content type='text'>
In an overcommitted guest where some vCPUs have to be halted to make
forward progress in other areas, it is highly likely that a vCPU later
in the spinlock queue will be spinning while the ones earlier in the
queue would have been halted. The spinning in the later vCPUs is then
just a waste of precious CPU cycles, because they are not going to
get the lock any time soon: the earlier ones have to be woken up and
take their turn first.

This patch implements an adaptive spinning mechanism where the vCPU
will call pv_wait() if the previous vCPU is not running.
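
The check is sampled only every few iterations to keep the common
spinning path cheap; a sketch of the helper (names as in the PV code,
details may differ):

	/*
	 * Time to check the previous node's state? If so, report whether
	 * it has stopped running, in which case further spinning here is
	 * likely futile and we should pv_wait() instead.
	 */
	static inline bool pv_wait_early(struct pv_node *prev, int loop)
	{
		if ((loop &amp; PV_PREV_CHECK_MASK) != 0)
			return false;

		return READ_ONCE(prev-&gt;state) != vcpu_running;
	}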

Linux kernel builds were run in KVM guest on an 8-socket, 4
cores/socket Westmere-EX system and a 4-socket, 8 cores/socket
Haswell-EX system. Both systems are configured to have 32 physical
CPUs. The kernel build times before and after the patch were:

                    Westmere                    Haswell
  Patch         32 vCPUs    48 vCPUs    32 vCPUs    48 vCPUs
  -----         --------    --------    --------    --------
  Before patch   3m02.3s     5m00.2s     1m43.7s     3m03.5s
  After patch    3m03.0s     4m37.5s     1m43.0s     2m47.2s

For 32 vCPUs, this patch doesn't cause any noticeable change in
performance. For 48 vCPUs (over-committed), there is about 8%
performance improvement.

Signed-off-by: Waiman Long &lt;Waiman.Long@hpe.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Douglas Hatch &lt;doug.hatch@hpe.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Scott J Norton &lt;scott.norton@hpe.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1447114167-47185-8-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/pvqspinlock: Allow limited lock stealing</title>
<updated>2015-12-04T10:39:51Z</updated>
<author>
<name>Waiman Long</name>
<email>Waiman.Long@hpe.com</email>
</author>
<published>2015-11-10T21:18:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1c4941fd53afb46ab15826628e4819866d008a28'/>
<id>urn:sha1:1c4941fd53afb46ab15826628e4819866d008a28</id>
<content type='text'>
This patch allows one attempt for the lock waiter to steal the lock
when entering the PV slowpath. To prevent lock starvation, the pending
bit will be set by the queue head vCPU when it is in the active lock
spinning loop to disable any lock stealing attempt.  This helps to
reduce the performance penalty caused by lock waiter preemption while
not having many of the downsides of a real unfair lock.
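
The steal attempt itself reduces to a single trylock before queueing,
gated on the lock word being completely free (a sketch; the pending
bit set by the queue head fences off stealers):

	static inline bool pv_queued_spin_steal_lock(struct qspinlock *lock)
	{
		struct __qspinlock *l = (void *)lock;

		return !(atomic_read(&amp;lock-&gt;val) &amp; _Q_LOCKED_PENDING_MASK) &amp;&amp;
			(cmpxchg(&amp;l-&gt;locked, 0, _Q_LOCKED_VAL) == 0);
	}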

The pv_wait_head() function was renamed to pv_wait_head_or_lock()
as it was modified to acquire the lock before returning. This is
necessary because of possible lock stealing attempts from other tasks.

Linux kernel builds were run in KVM guest on an 8-socket, 4
cores/socket Westmere-EX system and a 4-socket, 8 cores/socket
Haswell-EX system. Both systems are configured to have 32 physical
CPUs. The kernel build times before and after the patch were:

                    Westmere                    Haswell
  Patch         32 vCPUs    48 vCPUs    32 vCPUs    48 vCPUs
  -----         --------    --------    --------    --------
  Before patch   3m15.6s    10m56.1s     1m44.1s     5m29.1s
  After patch    3m02.3s     5m00.2s     1m43.7s     3m03.5s

For the overcommitted case (48 vCPUs), this patch is able to reduce
kernel build time by more than 54% for Westmere and 44% for Haswell.

Signed-off-by: Waiman Long &lt;Waiman.Long@hpe.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Douglas Hatch &lt;doug.hatch@hpe.com&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Scott J Norton &lt;scott.norton@hpe.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1447190336-53317-1-git-send-email-Waiman.Long@hpe.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
