<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/locking, branch v4.19.29</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.19.29</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.19.29'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2019-03-05T16:58:49Z</updated>
<entry>
<title>locking/rwsem: Fix (possible) missed wakeup</title>
<updated>2019-03-05T16:58:49Z</updated>
<author>
<name>Xie Yongji</name>
<email>xieyongji@baidu.com</email>
</author>
<published>2018-11-29T12:50:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9ad6216e8c3c26acb935ca8e1bbe99e9cb510b70'/>
<id>urn:sha1:9ad6216e8c3c26acb935ca8e1bbe99e9cb510b70</id>
<content type='text'>
[ Upstream commit e158488be27b157802753a59b336142dc0eb0380 ]

Because wake_q_add() can imply an immediate wakeup (cmpxchg failure
case), we must not rely on the wakeup being delayed. However, commit:

  e38513905eea ("locking/rwsem: Rework zeroing reader waiter-&gt;task")

relies on exactly that behaviour in that the wakeup must not happen
until after we clear waiter-&gt;task.

[ peterz: Added changelog. ]

Signed-off-by: Xie Yongji &lt;xieyongji@baidu.com&gt;
Signed-off-by: Zhang Yu &lt;zhangyu31@baidu.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: e38513905eea ("locking/rwsem: Rework zeroing reader waiter-&gt;task")
Link: https://lkml.kernel.org/r/1543495830-2644-1-git-send-email-xieyongji@baidu.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>futex: Handle early deadlock return correctly</title>
<updated>2019-02-12T18:47:24Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-01-29T22:15:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ee73954d9a21791d496befcdd004be00415bceda'/>
<id>urn:sha1:ee73954d9a21791d496befcdd004be00415bceda</id>
<content type='text'>
commit 1a1fb985f2e2b85ec0d3dc2e519ee48389ec2434 upstream.

commit 56222b212e8e ("futex: Drop hb-&gt;lock before enqueueing on the
rtmutex") changed the locking rules in the futex code so that the hash
bucket lock is not longer held while the waiter is enqueued into the
rtmutex wait list. This made the lock and the unlock path symmetric, but
unfortunately the possible early exit from __rt_mutex_proxy_start() due to
a detected deadlock was not updated accordingly. That allows a concurrent
unlocker to observe inconsitent state which triggers the warning in the
unlock path.

futex_lock_pi()                         futex_unlock_pi()
  lock(hb-&gt;lock)
  queue(hb_waiter)				lock(hb-&gt;lock)
  lock(rtmutex-&gt;wait_lock)
  unlock(hb-&gt;lock)
                                        // acquired hb-&gt;lock
                                        hb_waiter = futex_top_waiter()
                                        lock(rtmutex-&gt;wait_lock)
  __rt_mutex_proxy_start()
     ---&gt; fail
          remove(rtmutex_waiter);
     ---&gt; returns -EDEADLOCK
  unlock(rtmutex-&gt;wait_lock)
                                        // acquired wait_lock
                                        wake_futex_pi()
                                        rt_mutex_next_owner()
					  --&gt; returns NULL
                                          --&gt; WARN

  lock(hb-&gt;lock)
  unqueue(hb_waiter)

The problem is caused by the remove(rtmutex_waiter) in the failure case of
__rt_mutex_proxy_start() as this lets the unlocker observe a waiter in the
hash bucket but no waiter on the rtmutex, i.e. inconsistent state.

The original commit handles this correctly for the other early return cases
(timeout, signal) by delaying the removal of the rtmutex waiter until the
returning task reacquired the hash bucket lock.

Treat the failure case of __rt_mutex_proxy_start() in the same way and let
the existing cleanup code handle the eventual handover of the rtmutex
gracefully. The regular rt_mutex_proxy_start() gains the rtmutex waiter
removal for the failure case, so that the other callsites are still
operating correctly.

Add proper comments to the code so all these details are fully documented.

Thanks to Peter for helping with the analysis and writing the really
valuable code comments.

Fixes: 56222b212e8e ("futex: Drop hb-&gt;lock before enqueueing on the rtmutex")
Reported-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Co-developed-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: linux-s390@vger.kernel.org
Cc: Stefan Liebler &lt;stli@linux.ibm.com&gt;
Cc: Sebastian Sewior &lt;bigeasy@linutronix.de&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1901292311410.1950@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>locking/qspinlock, x86: Provide liveness guarantee</title>
<updated>2018-12-21T13:15:12Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2018-12-18T17:48:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2658687568cd36cc1250106032d540454c0046c9'/>
<id>urn:sha1:2658687568cd36cc1250106032d540454c0046c9</id>
<content type='text'>
commit 7aa54be2976550f17c11a1c3e3630002dea39303 upstream.

On x86 we cannot do fetch_or() with a single instruction and thus end up
using a cmpxchg loop, this reduces determinism. Replace the fetch_or()
with a composite operation: tas-pending + load.

Using two instructions of course opens a window we previously did not
have. Consider the scenario:

	CPU0		CPU1		CPU2

 1)	lock
	  trylock -&gt; (0,0,1)

 2)			lock
			  trylock /* fail */

 3)	unlock -&gt; (0,0,0)

 4)					lock
					  trylock -&gt; (0,0,1)

 5)			  tas-pending -&gt; (0,1,1)
			  load-val &lt;- (0,1,0) from 3

 6)			  clear-pending-set-locked -&gt; (0,0,1)

			  FAIL: _2_ owners

where 5) is our new composite operation. When we consider each part of
the qspinlock state as a separate variable (as we can when
_Q_PENDING_BITS == 8) then the above is entirely possible, because
tas-pending will only RmW the pending byte, so the later load is able
to observe prior tail and lock state (but not earlier than its own
trylock, which operates on the whole word, due to coherence).

To avoid this we need 2 things:

 - the load must come after the tas-pending (obviously, otherwise it
   can trivially observe prior state).

 - the tas-pending must be a full word RmW instruction, it cannot be an XCHGB for
   example, such that we cannot observe other state prior to setting
   pending.

On x86 we can realize this by using "LOCK BTS m32, r32" for
tas-pending followed by a regular load.

Note that observing later state is not a problem:

 - if we fail to observe a later unlock, we'll simply spin-wait for
   that store to become visible.

 - if we observe a later xchg_tail(), there is no difference from that
   xchg_tail() having taken place before the tas-pending.

Suggested-by: Will Deacon &lt;will.deacon@arm.com&gt;
Reported-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: andrea.parri@amarulasolutions.com
Cc: longman@redhat.com
Fixes: 59fb586b4a07 ("locking/qspinlock: Remove unbounded cmpxchg() loop from locking slowpath")
Link: https://lkml.kernel.org/r/20181003130957.183726335@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
[bigeasy: GEN_BINARY_RMWcc macro redo]
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/qspinlock: Re-order code</title>
<updated>2018-12-21T13:15:11Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2018-12-18T17:48:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=150f038c9382e92de446caee1754d65c47127341'/>
<id>urn:sha1:150f038c9382e92de446caee1754d65c47127341</id>
<content type='text'>
commit 53bf57fab7321fb42b703056a4c80fc9d986d170 upstream.

Flip the branch condition after atomic_fetch_or_acquire(_Q_PENDING_VAL)
such that we loose the indent. This also result in a more natural code
flow IMO.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: andrea.parri@amarulasolutions.com
Cc: longman@redhat.com
Link: https://lkml.kernel.org/r/20181003130257.156322446@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/lockdep: Fix debug_locks off performance problem</title>
<updated>2018-11-13T19:08:20Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2018-10-19T01:45:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=117d5fbddd39a7de3c5ad400d9564ef70fe669cd'/>
<id>urn:sha1:117d5fbddd39a7de3c5ad400d9564ef70fe669cd</id>
<content type='text'>
[ Upstream commit 9506a7425b094d2f1d9c877ed5a78f416669269b ]

It was found that when debug_locks was turned off because of a problem
found by the lockdep code, the system performance could drop quite
significantly when the lock_stat code was also configured into the
kernel. For instance, parallel kernel build time on a 4-socket x86-64
server nearly doubled.

Further analysis into the cause of the slowdown traced back to the
frequent call to debug_locks_off() from the __lock_acquired() function
probably due to some inconsistent lockdep states with debug_locks
off. The debug_locks_off() function did an unconditional atomic xchg
to write a 0 value into debug_locks which had already been set to 0.
This led to severe cacheline contention in the cacheline that held
debug_locks.  As debug_locks is being referenced in quite a few different
places in the kernel, this greatly slow down the system performance.

To prevent that trashing of debug_locks cacheline, lock_acquired()
and lock_contended() now checks the state of debug_locks before
proceeding. The debug_locks_off() function is also modified to check
debug_locks before calling __debug_locks_off().

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Link: http://lkml.kernel.org/r/1539913518-15598-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>locking/ww_mutex: Fix runtime warning in the WW mutex selftest</title>
<updated>2018-10-03T06:56:31Z</updated>
<author>
<name>Guenter Roeck</name>
<email>linux@roeck-us.net</email>
</author>
<published>2018-10-02T21:48:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e4a02ed2aaf447fa849e3254bfdb3b9b01e1e520'/>
<id>urn:sha1:e4a02ed2aaf447fa849e3254bfdb3b9b01e1e520</id>
<content type='text'>
If CONFIG_WW_MUTEX_SELFTEST=y is enabled, booting an image
in an arm64 virtual machine results in the following
traceback if 8 CPUs are enabled:

  DEBUG_LOCKS_WARN_ON(__owner_task(owner) != current)
  WARNING: CPU: 2 PID: 537 at kernel/locking/mutex.c:1033 __mutex_unlock_slowpath+0x1a8/0x2e0
  ...
  Call trace:
   __mutex_unlock_slowpath()
   ww_mutex_unlock()
   test_cycle_work()
   process_one_work()
   worker_thread()
   kthread()
   ret_from_fork()

If requesting b_mutex fails with -EDEADLK, the error variable
is reassigned to the return value from calling ww_mutex_lock
on a_mutex again. If this call fails, a_mutex is not locked.
It is, however, unconditionally unlocked subsequently, causing
the reported warning. Fix the problem by using two error variables.

With this change, the selftest still fails as follows:

  cyclic deadlock not resolved, ret[7/8] = -35

However, the traceback is gone.

Signed-off-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Cc: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Fixes: d1b42b800e5d0 ("locking/ww_mutex: Add kselftests for resolving ww_mutex cyclic deadlocks")
Link: http://lkml.kernel.org/r/1538516929-9734-1-git-send-email-linux@roeck-us.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/ww_mutex: Fix spelling mistake "cylic" -&gt; "cyclic"</title>
<updated>2018-09-10T12:00:01Z</updated>
<author>
<name>Colin Ian King</name>
<email>colin.king@canonical.com</email>
</author>
<published>2018-08-24T11:22:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0b405c65ad459f5f4d3db1672246172bd19d946d'/>
<id>urn:sha1:0b405c65ad459f5f4d3db1672246172bd19d946d</id>
<content type='text'>
Trivial fix to spelling mistake in pr_err() error message

Signed-off-by: Colin Ian King &lt;colin.king@canonical.com&gt;
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: kernel-janitors@vger.kernel.org
Link: http://lkml.kernel.org/r/20180824112235.8842-1-colin.king@canonical.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/lockdep: Delete unnecessary #include</title>
<updated>2018-09-10T11:48:25Z</updated>
<author>
<name>Ben Hutchings</name>
<email>ben@decadent.org.uk</email>
</author>
<published>2018-08-28T20:33:15Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dc5591a03f1d6dae6b11cdf1d74b023f7ac0fdbf'/>
<id>urn:sha1:dc5591a03f1d6dae6b11cdf1d74b023f7ac0fdbf</id>
<content type='text'>
Commit:

  c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints and unify their usage")

added the inclusion of &lt;trace/events/preemptirq.h&gt;.

liblockdep doesn't have a stub version of that header so now fails to build.

However, commit:

  bff1b208a5d1 ("tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and unify their usage"")

removed the use of functions declared in that header. So delete the #include.

Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Joel Fernandes &lt;joel@joelfernandes.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Fixes: bff1b208a5d1 ("tracing: Partial revert of "tracing: Centralize ...")
Fixes: c3bc8fd637a9 ("tracing: Centralize preemptirq tracepoints ...")
Link: http://lkml.kernel.org/r/20180828203315.GD18030@decadent.org.uk
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/mutex: Fix mutex debug call and ww_mutex documentation</title>
<updated>2018-09-10T08:05:10Z</updated>
<author>
<name>Thomas Hellstrom</name>
<email>thellstrom@vmware.com</email>
</author>
<published>2018-09-03T14:07:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e13e2366d8415e029fe96a62502955083e272cef'/>
<id>urn:sha1:e13e2366d8415e029fe96a62502955083e272cef</id>
<content type='text'>
The following commit:

  08295b3b5bee ("Implement an algorithm choice for Wound-Wait mutexes")

introduced a reference in the documentation to a function that was
removed in an earlier commit.

It also forgot to remove a call to debug_mutex_add_waiter() which is now
unconditionally called by __mutex_add_waiter().

Fix those bugs.

Signed-off-by: Thomas Hellstrom &lt;thellstrom@vmware.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: dri-devel@lists.freedesktop.org
Fixes: 08295b3b5bee ("Implement an algorithm choice for Wound-Wait mutexes")
Link: http://lkml.kernel.org/r/20180903140708.2401-1-thellstrom@vmware.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'trace-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace</title>
<updated>2018-08-21T01:32:00Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-08-21T01:32:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7140ad3898dd119d993aff76a8752570c4f23871'/>
<id>urn:sha1:7140ad3898dd119d993aff76a8752570c4f23871</id>
<content type='text'>
Pull tracing updates from Steven Rostedt:

 - Restructure of lockdep and latency tracers

   This is the biggest change. Joel Fernandes restructured the hooks
   from irqs and preemption disabling and enabling. He got rid of a lot
   of the preprocessor #ifdef mess that they caused.

   He turned both lockdep and the latency tracers to use trace events
   inserted in the preempt/irqs disabling paths. But unfortunately,
   these started to cause issues in corner cases. Thus, parts of the
   code was reverted back to where lockdep and the latency tracers just
   get called directly (without using the trace events). But because the
   original change cleaned up the code very nicely we kept that, as well
   as the trace events for preempt and irqs disabling, but they are
   limited to not being called in NMIs.

 - Have trace events use SRCU for "rcu idle" calls. This was required
   for the preempt/irqs off trace events. But it also had to not allow
   them to be called in NMI context. Waiting till Paul makes an NMI safe
   SRCU API.

 - New notrace SRCU API to allow trace events to use SRCU.

 - Addition of mcount-nop option support

 - SPDX headers replacing GPL templates.

 - Various other fixes and clean ups.

 - Some fixes are marked for stable, but were not fully tested before
   the merge window opened.

* tag 'trace-v4.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (44 commits)
  tracing: Fix SPDX format headers to use C++ style comments
  tracing: Add SPDX License format tags to tracing files
  tracing: Add SPDX License format to bpf_trace.c
  blktrace: Add SPDX License format header
  s390/ftrace: Add -mfentry and -mnop-mcount support
  tracing: Add -mcount-nop option support
  tracing: Avoid calling cc-option -mrecord-mcount for every Makefile
  tracing: Handle CC_FLAGS_FTRACE more accurately
  Uprobe: Additional argument arch_uprobe to uprobe_write_opcode()
  Uprobes: Simplify uprobe_register() body
  tracepoints: Free early tracepoints after RCU is initialized
  uprobes: Use synchronize_rcu() not synchronize_sched()
  tracing: Fix synchronizing to event changes with tracepoint_synchronize_unregister()
  ftrace: Remove unused pointer ftrace_swapper_pid
  tracing: More reverting of "tracing: Centralize preemptirq tracepoints and unify their usage"
  tracing/irqsoff: Handle preempt_count for different configs
  tracing: Partial revert of "tracing: Centralize preemptirq tracepoints and unify their usage"
  tracing: irqsoff: Account for additional preempt_disable
  trace: Use rcu_dereference_raw for hooks from trace-event subsystem
  tracing/kprobes: Fix within_notrace_func() to check only notrace functions
  ...
</content>
</entry>
</feed>
