<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/locking, branch v4.19.54</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.19.54</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.19.54'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2019-05-22T05:37:34Z</updated>
<entry>
<title>locking/rwsem: Prevent decrement of reader count before increment</title>
<updated>2019-05-22T05:37:34Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2019-04-28T21:25:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7761dbf58d2255601c2f71d853f1802810919a31'/>
<id>urn:sha1:7761dbf58d2255601c2f71d853f1802810919a31</id>
<content type='text'>
[ Upstream commit a9e9bcb45b1525ba7aea26ed9441e8632aeeda58 ]

During my rwsem testing, it was found that after a down_read(), the
reader count may occasionally become 0 or even negative. Consequently,
a writer may steal the lock at that time and execute with the reader
in parallel thus breaking the mutual exclusion guarantee of the write
lock. In other words, both readers and writer can become rwsem owners
simultaneously.

The current reader wakeup code does it in one pass to clear waiter-&gt;task
and put them into wake_q before fully incrementing the reader count.
Once waiter-&gt;task is cleared, the corresponding reader may see it,
finish the critical section and do unlock to decrement the count before
the count is incremented. This is not a problem if there is only one
reader to wake up as the count has been pre-incremented by 1.  It is
a problem if there are more than one readers to be woken up and writer
can steal the lock.

The wakeup was actually done in 2 passes before the following v4.9 commit:

  70800c3c0cc5 ("locking/rwsem: Scan the wait_list for readers only once")

To fix this problem, the wakeup is now done in two passes
again. In the first pass, we collect the readers and count them.
The reader count is then fully incremented. In the second pass, the
waiter-&gt;task is then cleared and they are put into wake_q to be woken
up later.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Borislav Petkov &lt;bp@alien8.de&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: huang ying &lt;huang.ying.caritas@gmail.com&gt;
Fixes: 70800c3c0cc5 ("locking/rwsem: Scan the wait_list for readers only once")
Link: http://lkml.kernel.org/r/20190428212557.13482-2-longman@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Revert "locking/lockdep: Add debug_locks check in __lock_downgrade()"</title>
<updated>2019-04-27T07:36:41Z</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2019-04-25T08:00:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ac54bc121e1f491b70d71ac234b8e0f76f79d7ad'/>
<id>urn:sha1:ac54bc121e1f491b70d71ac234b8e0f76f79d7ad</id>
<content type='text'>
This reverts commit 0e0f7b30721233ce610bde2395a8e58ff7771475 which was
commit 71492580571467fb7177aade19c18ce7486267f5 upstream.

Tetsuo rightly points out that the backport here is incorrect, as it
touches the __lock_set_class function instead of the intended
__lock_downgrade function.

Reported-by: Tetsuo Handa &lt;penguin-kernel@i-love.sakura.ne.jp&gt;
Cc: Waiman Long &lt;longman@redhat.com&gt;
Cc: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>locking/lockdep: Add debug_locks check in __lock_downgrade()</title>
<updated>2019-03-27T05:14:43Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2019-01-10T04:03:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0e0f7b30721233ce610bde2395a8e58ff7771475'/>
<id>urn:sha1:0e0f7b30721233ce610bde2395a8e58ff7771475</id>
<content type='text'>
commit 71492580571467fb7177aade19c18ce7486267f5 upstream.

Tetsuo Handa had reported he saw an incorrect "downgrading a read lock"
warning right after a previous lockdep warning. It is likely that the
previous warning turned off lock debugging causing the lockdep to have
inconsistency states leading to the lock downgrade warning.

Fix that by add a check for debug_locks at the beginning of
__lock_downgrade().

Debugged-by: Tetsuo Handa &lt;penguin-kernel@i-love.sakura.ne.jp&gt;
Reported-by: Tetsuo Handa &lt;penguin-kernel@i-love.sakura.ne.jp&gt;
Reported-by: syzbot+53383ae265fb161ef488@syzkaller.appspotmail.com
Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Link: https://lkml.kernel.org/r/1547093005-26085-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>locking/rwsem: Fix (possible) missed wakeup</title>
<updated>2019-03-05T16:58:49Z</updated>
<author>
<name>Xie Yongji</name>
<email>xieyongji@baidu.com</email>
</author>
<published>2018-11-29T12:50:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9ad6216e8c3c26acb935ca8e1bbe99e9cb510b70'/>
<id>urn:sha1:9ad6216e8c3c26acb935ca8e1bbe99e9cb510b70</id>
<content type='text'>
[ Upstream commit e158488be27b157802753a59b336142dc0eb0380 ]

Because wake_q_add() can imply an immediate wakeup (cmpxchg failure
case), we must not rely on the wakeup being delayed. However, commit:

  e38513905eea ("locking/rwsem: Rework zeroing reader waiter-&gt;task")

relies on exactly that behaviour in that the wakeup must not happen
until after we clear waiter-&gt;task.

[ peterz: Added changelog. ]

Signed-off-by: Xie Yongji &lt;xieyongji@baidu.com&gt;
Signed-off-by: Zhang Yu &lt;zhangyu31@baidu.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: e38513905eea ("locking/rwsem: Rework zeroing reader waiter-&gt;task")
Link: https://lkml.kernel.org/r/1543495830-2644-1-git-send-email-xieyongji@baidu.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>futex: Handle early deadlock return correctly</title>
<updated>2019-02-12T18:47:24Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-01-29T22:15:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ee73954d9a21791d496befcdd004be00415bceda'/>
<id>urn:sha1:ee73954d9a21791d496befcdd004be00415bceda</id>
<content type='text'>
commit 1a1fb985f2e2b85ec0d3dc2e519ee48389ec2434 upstream.

commit 56222b212e8e ("futex: Drop hb-&gt;lock before enqueueing on the
rtmutex") changed the locking rules in the futex code so that the hash
bucket lock is not longer held while the waiter is enqueued into the
rtmutex wait list. This made the lock and the unlock path symmetric, but
unfortunately the possible early exit from __rt_mutex_proxy_start() due to
a detected deadlock was not updated accordingly. That allows a concurrent
unlocker to observe inconsitent state which triggers the warning in the
unlock path.

futex_lock_pi()                         futex_unlock_pi()
  lock(hb-&gt;lock)
  queue(hb_waiter)				lock(hb-&gt;lock)
  lock(rtmutex-&gt;wait_lock)
  unlock(hb-&gt;lock)
                                        // acquired hb-&gt;lock
                                        hb_waiter = futex_top_waiter()
                                        lock(rtmutex-&gt;wait_lock)
  __rt_mutex_proxy_start()
     ---&gt; fail
          remove(rtmutex_waiter);
     ---&gt; returns -EDEADLOCK
  unlock(rtmutex-&gt;wait_lock)
                                        // acquired wait_lock
                                        wake_futex_pi()
                                        rt_mutex_next_owner()
					  --&gt; returns NULL
                                          --&gt; WARN

  lock(hb-&gt;lock)
  unqueue(hb_waiter)

The problem is caused by the remove(rtmutex_waiter) in the failure case of
__rt_mutex_proxy_start() as this lets the unlocker observe a waiter in the
hash bucket but no waiter on the rtmutex, i.e. inconsistent state.

The original commit handles this correctly for the other early return cases
(timeout, signal) by delaying the removal of the rtmutex waiter until the
returning task reacquired the hash bucket lock.

Treat the failure case of __rt_mutex_proxy_start() in the same way and let
the existing cleanup code handle the eventual handover of the rtmutex
gracefully. The regular rt_mutex_proxy_start() gains the rtmutex waiter
removal for the failure case, so that the other callsites are still
operating correctly.

Add proper comments to the code so all these details are fully documented.

Thanks to Peter for helping with the analysis and writing the really
valuable code comments.

Fixes: 56222b212e8e ("futex: Drop hb-&gt;lock before enqueueing on the rtmutex")
Reported-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Co-developed-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: linux-s390@vger.kernel.org
Cc: Stefan Liebler &lt;stli@linux.ibm.com&gt;
Cc: Sebastian Sewior &lt;bigeasy@linutronix.de&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1901292311410.1950@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>locking/qspinlock, x86: Provide liveness guarantee</title>
<updated>2018-12-21T13:15:12Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2018-12-18T17:48:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2658687568cd36cc1250106032d540454c0046c9'/>
<id>urn:sha1:2658687568cd36cc1250106032d540454c0046c9</id>
<content type='text'>
commit 7aa54be2976550f17c11a1c3e3630002dea39303 upstream.

On x86 we cannot do fetch_or() with a single instruction and thus end up
using a cmpxchg loop, this reduces determinism. Replace the fetch_or()
with a composite operation: tas-pending + load.

Using two instructions of course opens a window we previously did not
have. Consider the scenario:

	CPU0		CPU1		CPU2

 1)	lock
	  trylock -&gt; (0,0,1)

 2)			lock
			  trylock /* fail */

 3)	unlock -&gt; (0,0,0)

 4)					lock
					  trylock -&gt; (0,0,1)

 5)			  tas-pending -&gt; (0,1,1)
			  load-val &lt;- (0,1,0) from 3

 6)			  clear-pending-set-locked -&gt; (0,0,1)

			  FAIL: _2_ owners

where 5) is our new composite operation. When we consider each part of
the qspinlock state as a separate variable (as we can when
_Q_PENDING_BITS == 8) then the above is entirely possible, because
tas-pending will only RmW the pending byte, so the later load is able
to observe prior tail and lock state (but not earlier than its own
trylock, which operates on the whole word, due to coherence).

To avoid this we need 2 things:

 - the load must come after the tas-pending (obviously, otherwise it
   can trivially observe prior state).

 - the tas-pending must be a full word RmW instruction, it cannot be an XCHGB for
   example, such that we cannot observe other state prior to setting
   pending.

On x86 we can realize this by using "LOCK BTS m32, r32" for
tas-pending followed by a regular load.

Note that observing later state is not a problem:

 - if we fail to observe a later unlock, we'll simply spin-wait for
   that store to become visible.

 - if we observe a later xchg_tail(), there is no difference from that
   xchg_tail() having taken place before the tas-pending.

Suggested-by: Will Deacon &lt;will.deacon@arm.com&gt;
Reported-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: andrea.parri@amarulasolutions.com
Cc: longman@redhat.com
Fixes: 59fb586b4a07 ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath")
Link: https://lkml.kernel.org/r/20181003130957.183726335@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
[bigeasy: GEN_BINARY_RMWcc macro redo]
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/qspinlock: Re-order code</title>
<updated>2018-12-21T13:15:11Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2018-12-18T17:48:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=150f038c9382e92de446caee1754d65c47127341'/>
<id>urn:sha1:150f038c9382e92de446caee1754d65c47127341</id>
<content type='text'>
commit 53bf57fab7321fb42b703056a4c80fc9d986d170 upstream.

Flip the branch condition after atomic_fetch_or_acquire(_Q_PENDING_VAL)
such that we loose the indent. This also result in a more natural code
flow IMO.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: andrea.parri@amarulasolutions.com
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/20181003130257.156322446@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/lockdep: Fix debug_locks off performance problem</title>
<updated>2018-11-13T19:08:20Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2018-10-19T01:45:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=117d5fbddd39a7de3c5ad400d9564ef70fe669cd'/>
<id>urn:sha1:117d5fbddd39a7de3c5ad400d9564ef70fe669cd</id>
<content type='text'>
[ Upstream commit 9506a7425b094d2f1d9c877ed5a78f416669269b ]

It was found that when debug_locks was turned off because of a problem
found by the lockdep code, the system performance could drop quite
significantly when the lock_stat code was also configured into the
kernel. For instance, parallel kernel build time on a 4-socket x86-64
server nearly doubled.

Further analysis into the cause of the slowdown traced back to the
frequent call to debug_locks_off() from the __lock_acquired() function
probably due to some inconsistent lockdep states with debug_locks
off. The debug_locks_off() function did an unconditional atomic xchg
to write a 0 value into debug_locks which had already been set to 0.
This led to severe cacheline contention in the cacheline that held
debug_locks.  As debug_locks is being referenced in quite a few different
places in the kernel, this greatly slow down the system performance.

To prevent that trashing of debug_locks cacheline, lock_acquired()
and lock_contended() now checks the state of debug_locks before
proceeding. The debug_locks_off() function is also modified to check
debug_locks before calling __debug_locks_off().

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Link: http://lkml.kernel.org/r/1539913518-15598-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>locking/ww_mutex: Fix runtime warning in the WW mutex selftest</title>
<updated>2018-10-03T06:56:31Z</updated>
<author>
<name>Guenter Roeck</name>
<email>linux@roeck-us.net</email>
</author>
<published>2018-10-02T21:48:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e4a02ed2aaf447fa849e3254bfdb3b9b01e1e520'/>
<id>urn:sha1:e4a02ed2aaf447fa849e3254bfdb3b9b01e1e520</id>
<content type='text'>
If CONFIG_WW_MUTEX_SELFTEST=y is enabled, booting an image
in an arm64 virtual machine results in the following
traceback if 8 CPUs are enabled:

  DEBUG_LOCKS_WARN_ON(__owner_task(owner) != current)
  WARNING: CPU: 2 PID: 537 at kernel/locking/mutex.c:1033 __mutex_unlock_slowpath+0x1a8/0x2e0
  ...
  Call trace:
   __mutex_unlock_slowpath()
   ww_mutex_unlock()
   test_cycle_work()
   process_one_work()
   worker_thread()
   kthread()
   ret_from_fork()

If requesting b_mutex fails with -EDEADLK, the error variable
is reassigned to the return value from calling ww_mutex_lock
on a_mutex again. If this call fails, a_mutex is not locked.
It is, however, unconditionally unlocked subsequently, causing
the reported warning. Fix the problem by using two error variables.

With this change, the selftest still fails as follows:

  cyclic deadlock not resolved, ret[7/8] = -35

However, the traceback is gone.

Signed-off-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Cc: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Fixes: d1b42b800e5d0 ("locking/ww_mutex: Add kselftests for resolving ww_mutex cyclic deadlocks")
Link: http://lkml.kernel.org/r/1538516929-9734-1-git-send-email-linux@roeck-us.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/ww_mutex: Fix spelling mistake "cylic" -&gt; "cyclic"</title>
<updated>2018-09-10T12:00:01Z</updated>
<author>
<name>Colin Ian King</name>
<email>colin.king@canonical.com</email>
</author>
<published>2018-08-24T11:22:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0b405c65ad459f5f4d3db1672246172bd19d946d'/>
<id>urn:sha1:0b405c65ad459f5f4d3db1672246172bd19d946d</id>
<content type='text'>
Trivial fix to spelling mistake in pr_err() error message

Signed-off-by: Colin Ian King &lt;colin.king@canonical.com&gt;
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20180824112235.8842-1-colin.king@canonical.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
