<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/locking/qspinlock.c, branch v5.15.16</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.15.16</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.15.16'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2020-07-08T20:21:57Z</updated>
<entry>
<title>x86/kvm: Add "nopvspin" parameter to disable PV spinlocks</title>
<updated>2020-07-08T20:21:57Z</updated>
<author>
<name>Zhenzhong Duan</name>
<email>zhenzhong.duan@oracle.com</email>
</author>
<published>2019-10-23T11:16:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=05eee619ed61c8cd89633954d38c4e5653086845'/>
<id>urn:sha1:05eee619ed61c8cd89633954d38c4e5653086845</id>
<content type='text'>
There are cases where a guest tries to switch spinlocks to bare metal
behavior (e.g. by setting "xen_nopvspin" on XEN platform and
"hv_nopvspin" on HYPER_V).

That feature is missed on KVM, add a new parameter "nopvspin" to disable
PV spinlocks for KVM guest.

The new 'nopvspin' parameter will also replace Xen and Hyper-V specific
parameters in future patches.

Define variable nopvsin as global because it will be used in future
patches as above.

Signed-off-by: Zhenzhong Duan &lt;zhenzhong.duan@oracle.com&gt;
Reviewed-by: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Cc: Jonathan Corbet &lt;corbet@lwn.net&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Cc: Radim Krcmar &lt;rkrcmar@redhat.com&gt;
Cc: Sean Christopherson &lt;sean.j.christopherson@intel.com&gt;
Cc: Vitaly Kuznetsov &lt;vkuznets@redhat.com&gt;
Cc: Wanpeng Li &lt;wanpengli@tencent.com&gt;
Cc: Jim Mattson &lt;jmattson@google.com&gt;
Cc: Joerg Roedel &lt;joro@8bytes.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
</content>
</entry>
<entry>
<title>locking/qspinlock: Fix inaccessible URL of MCS lock paper</title>
<updated>2020-01-17T09:19:30Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2020-01-07T17:49:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=57097124cbbd310cc2b5884189e22e60a3c20514'/>
<id>urn:sha1:57097124cbbd310cc2b5884189e22e60a3c20514</id>
<content type='text'>
It turns out that the URL of the MCS lock paper listed in the source
code is no longer accessible. I did got question about where the paper
was. This patch updates the URL to BZ 206115 which contains a copy of
the paper from

  https://www.cs.rochester.edu/u/scott/papers/1991_TOCS_synch.pdf

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Will Deacon &lt;will@kernel.org&gt;
Link: https://lkml.kernel.org/r/20200107174914.4187-1-longman@redhat.com
</content>
</entry>
<entry>
<title>treewide: Replace GPLv2 boilerplate/reference with SPDX - rule 157</title>
<updated>2019-05-30T18:26:37Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-05-27T06:55:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c942fddf8793b2013be8c901b47d0a8dc02bf99f'/>
<id>urn:sha1:c942fddf8793b2013be8c901b47d0a8dc02bf99f</id>
<content type='text'>
Based on 3 normalized pattern(s):

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version this program is distributed in the
  hope that it will be useful but without any warranty without even
  the implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version [author] [kishon] [vijay] [abraham]
  [i] [kishon]@[ti] [com] this program is distributed in the hope that
  it will be useful but without any warranty without even the implied
  warranty of merchantability or fitness for a particular purpose see
  the gnu general public license for more details

  this program is free software you can redistribute it and or modify
  it under the terms of the gnu general public license as published by
  the free software foundation either version 2 of the license or at
  your option any later version [author] [graeme] [gregory]
  [gg]@[slimlogic] [co] [uk] [author] [kishon] [vijay] [abraham] [i]
  [kishon]@[ti] [com] [based] [on] [twl6030]_[usb] [c] [author] [hema]
  [hk] [hemahk]@[ti] [com] this program is distributed in the hope
  that it will be useful but without any warranty without even the
  implied warranty of merchantability or fitness for a particular
  purpose see the gnu general public license for more details

extracted by the scancode license scanner the SPDX license identifier

  GPL-2.0-or-later

has been chosen to replace the boilerplate/reference in 1105 file(s).

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Allison Randal &lt;allison@lohutok.net&gt;
Reviewed-by: Richard Fontana &lt;rfontana@redhat.com&gt;
Reviewed-by: Kate Stewart &lt;kstewart@linuxfoundation.org&gt;
Cc: linux-spdx@vger.kernel.org
Link: https://lkml.kernel.org/r/20190527070033.202006027@linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>locking/qspinlock_stat: Introduce generic lockevent_*() counting APIs</title>
<updated>2019-04-10T08:56:03Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2019-04-04T17:43:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ad53fa10fa9e816067bbae7109845940f5e6df50'/>
<id>urn:sha1:ad53fa10fa9e816067bbae7109845940f5e6df50</id>
<content type='text'>
The percpu event counts used by qspinlock code can be useful for
other locking code as well. So a new set of lockevent_* counting APIs
is introduced with the lock event names extracted out into the new
lock_events_list.h header file for easier addition in the future.

The existing qstat_inc() calls are replaced by either lockevent_inc() or
lockevent_cond_inc() calls.

The qstat_hop() call is renamed to lockevent_pv_hop(). The "reset_counters"
debugfs file is also renamed to ".reset_counts".

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Davidlohr Bueso &lt;dbueso@suse.de&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Link: http://lkml.kernel.org/r/20190404174320.22416-8-longman@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/qspinlock: Remove unnecessary BUG_ON() call</title>
<updated>2019-02-28T06:55:38Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2019-02-25T01:14:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=733000c7ffd9d9c8c4fdfd82f0d41956c8cf0537'/>
<id>urn:sha1:733000c7ffd9d9c8c4fdfd82f0d41956c8cf0537</id>
<content type='text'>
With the &gt; 4 nesting levels case handled by the commit:

  d682b596d993 ("locking/qspinlock: Handle &gt; 4 slowpath nesting levels")

the BUG_ON() call in encode_tail() will never actually be triggered.

Remove it.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/1551057253-3231-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/qspinlock_stat: Track the no MCS node available case</title>
<updated>2019-02-04T08:03:30Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2019-01-29T21:53:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=412f34a82ccf7dd52f6b197f6450a33f03342523'/>
<id>urn:sha1:412f34a82ccf7dd52f6b197f6450a33f03342523</id>
<content type='text'>
Track the number of slowpath locking operations that are being done
without any MCS node available as well renaming lock_index[123] to make
them more descriptive.

Using these stat counters is one way to find out if a code path is
being exercised.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: James Morse &lt;james.morse@arm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: SRINIVAS &lt;srinivas.eeda@oracle.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Zhenzhong Duan &lt;zhenzhong.duan@oracle.com&gt;
Link: https://lkml.kernel.org/r/1548798828-16156-3-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/qspinlock: Handle &gt; 4 slowpath nesting levels</title>
<updated>2019-02-04T08:03:29Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2019-01-29T21:53:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d682b596d99345ef0000e7017db714ba7f29e017'/>
<id>urn:sha1:d682b596d99345ef0000e7017db714ba7f29e017</id>
<content type='text'>
Four queue nodes per CPU are allocated to enable up to 4 nesting levels
using the per-CPU nodes. Nested NMIs are possible in some architectures.
Still it is very unlikely that we will ever hit more than 4 nested
levels with contention in the slowpath.

When that rare condition happens, however, it is likely that the system
will hang or crash shortly after that. It is not good and we need to
handle this exception case.

This is done by spinning directly on the lock using repeated trylock.
This alternative code path should only be used when there is nested
NMIs. Assuming that the locks used by those NMI handlers will not be
heavily contended, a simple TAS locking should work out.

Suggested-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: H. Peter Anvin &lt;hpa@zytor.com&gt;
Cc: James Morse &lt;james.morse@arm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: SRINIVAS &lt;srinivas.eeda@oracle.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Zhenzhong Duan &lt;zhenzhong.duan@oracle.com&gt;
Link: https://lkml.kernel.org/r/1548798828-16156-2-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/pvqspinlock: Extend node size when pvqspinlock is configured</title>
<updated>2018-10-17T06:37:32Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2018-10-16T13:45:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0fa809ca7f81c47bea6706bc689e941eb25d7e89'/>
<id>urn:sha1:0fa809ca7f81c47bea6706bc689e941eb25d7e89</id>
<content type='text'>
The qspinlock code supports up to 4 levels of slowpath nesting using
four per-CPU mcs_spinlock structures. For 64-bit architectures, they
fit nicely in one 64-byte cacheline.

For para-virtualized (PV) qspinlocks it needs to store more information
in the per-CPU node structure than there is space for. It uses a trick
to use a second cacheline to hold the extra information that it needs.
So PV qspinlock needs to access two extra cachelines for its information
whereas the native qspinlock code only needs one extra cacheline.

Freshly added counter profiling of the qspinlock code, however, revealed
that it was very rare to use more than two levels of slowpath nesting.
So it doesn't make sense to penalize PV qspinlock code in order to have
four mcs_spinlock structures in the same cacheline to optimize for a case
in the native qspinlock code that rarely happens.

Extend the per-CPU node structure to have two more long words when PV
qspinlock locks are configured to hold the extra data that it needs.

As a result, the PV qspinlock code will enjoy the same benefit of using
just one extra cacheline like the native counterpart, for most cases.

[ mingo: Minor changelog edits. ]

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Link: http://lkml.kernel.org/r/1539697507-28084-2-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/qspinlock_stat: Count instances of nested lock slowpaths</title>
<updated>2018-10-17T06:37:31Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2018-10-16T13:45:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1222109a53637f96c581224198b86856d503f892'/>
<id>urn:sha1:1222109a53637f96c581224198b86856d503f892</id>
<content type='text'>
Queued spinlock supports up to 4 levels of lock slowpath nesting -
user context, soft IRQ, hard IRQ and NMI. However, we are not sure how
often the nesting happens.

So add 3 more per-CPU stat counters to track the number of instances where
nesting index goes to 1, 2 and 3 respectively.

On a dual-socket 64-core 128-thread Zen server, the following were the
new stat counter values under different circumstances:

         State                         slowpath   index1   index2   index3
         -----                         --------   ------   ------   -------
  After bootup                         1,012,150    82       0        0
  After parallel build + perf-top    125,195,009    82       0        0

So the chance of having more than 2 levels of nesting is extremely low.

[ mingo: Minor changelog edits. ]

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Link: http://lkml.kernel.org/r/1539697507-28084-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/qspinlock, x86: Provide liveness guarantee</title>
<updated>2018-10-16T15:33:54Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2018-09-26T11:01:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7aa54be2976550f17c11a1c3e3630002dea39303'/>
<id>urn:sha1:7aa54be2976550f17c11a1c3e3630002dea39303</id>
<content type='text'>
On x86 we cannot do fetch_or() with a single instruction and thus end up
using a cmpxchg loop, this reduces determinism. Replace the fetch_or()
with a composite operation: tas-pending + load.

Using two instructions of course opens a window we previously did not
have. Consider the scenario:

	CPU0		CPU1		CPU2

 1)	lock
	  trylock -&gt; (0,0,1)

 2)			lock
			  trylock /* fail */

 3)	unlock -&gt; (0,0,0)

 4)					lock
					  trylock -&gt; (0,0,1)

 5)			  tas-pending -&gt; (0,1,1)
			  load-val &lt;- (0,1,0) from 3

 6)			  clear-pending-set-locked -&gt; (0,0,1)

			  FAIL: _2_ owners

where 5) is our new composite operation. When we consider each part of
the qspinlock state as a separate variable (as we can when
_Q_PENDING_BITS == 8) then the above is entirely possible, because
tas-pending will only RmW the pending byte, so the later load is able
to observe prior tail and lock state (but not earlier than its own
trylock, which operates on the whole word, due to coherence).

To avoid this we need 2 things:

 - the load must come after the tas-pending (obviously, otherwise it
   can trivially observe prior state).

 - the tas-pending must be a full word RmW instruction, it cannot be an XCHGB for
   example, such that we cannot observe other state prior to setting
   pending.

On x86 we can realize this by using "LOCK BTS m32, r32" for
tas-pending followed by a regular load.

Note that observing later state is not a problem:

 - if we fail to observe a later unlock, we'll simply spin-wait for
   that store to become visible.

 - if we observe a later xchg_tail(), there is no difference from that
   xchg_tail() having taken place before the tas-pending.

Suggested-by: Will Deacon &lt;will.deacon@arm.com&gt;
Reported-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: andrea.parri@amarulasolutions.com
Cc: longman@redhat.com
Fixes: 59fb586b4a07 ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath")
Link: https://lkml.kernel.org/r/20181003130957.183726335@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
