<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/locking, branch v4.4.159</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.159</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.159'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-09-19T20:48:56Z</updated>
<entry>
<title>locking/osq_lock: Fix osq_lock queue corruption</title>
<updated>2018-09-19T20:48:56Z</updated>
<author>
<name>Prateek Sood</name>
<email>prsood@codeaurora.org</email>
</author>
<published>2017-07-14T13:47:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d914882c936d9c3a1fa4c10d5950c5f0a7d32d79'/>
<id>urn:sha1:d914882c936d9c3a1fa4c10d5950c5f0a7d32d79</id>
<content type='text'>
commit 50972fe78f24f1cd0b9d7bbf1f87d2be9e4f412e upstream.

Fix ordering of link creation between node-&gt;prev and prev-&gt;next in
osq_lock(). A case in which the status of optimistic spin queue is
CPU6-&gt;CPU2 in which CPU6 has acquired the lock.

        tail
          v
  ,-. &lt;- ,-.
  |6|    |2|
  `-' -&gt; `-'

At this point if CPU0 comes in to acquire osq_lock, it will update the
tail count.

  CPU2			CPU0
  ----------------------------------

				       tail
				         v
			  ,-. &lt;- ,-.    ,-.
			  |6|    |2|    |0|
			  `-' -&gt; `-'    `-'

After tail count update if CPU2 starts to unqueue itself from
optimistic spin queue, it will find an updated tail count with CPU0 and
update CPU2 node-&gt;next to NULL in osq_wait_next().

  unqueue-A

	       tail
	         v
  ,-. &lt;- ,-.    ,-.
  |6|    |2|    |0|
  `-'    `-'    `-'

  unqueue-B

  -&gt;tail != curr &amp;&amp; !node-&gt;next

If reordering of following stores happen then prev-&gt;next where prev
being CPU2 would be updated to point to CPU0 node:

				       tail
				         v
			  ,-. &lt;- ,-.    ,-.
			  |6|    |2|    |0|
			  `-'    `-' -&gt; `-'

  osq_wait_next()
    node-&gt;next &lt;- 0
    xchg(node-&gt;next, NULL)

	       tail
	         v
  ,-. &lt;- ,-.    ,-.
  |6|    |2|    |0|
  `-'    `-'    `-'

  unqueue-C

At this point if next instruction
	WRITE_ONCE(next-&gt;prev, prev);
in CPU2 path is committed before the update of CPU0 node-&gt;prev = prev then
CPU0 node-&gt;prev will point to CPU6 node.

	       tail
    v----------. v
  ,-. &lt;- ,-.    ,-.
  |6|    |2|    |0|
  `-'    `-'    `-'
     `----------^

At this point if CPU0 path's node-&gt;prev = prev is committed resulting
in change of CPU0 prev back to CPU2 node. CPU2 node-&gt;next is NULL
currently,

				       tail
			                 v
			  ,-. &lt;- ,-. &lt;- ,-.
			  |6|    |2|    |0|
			  `-'    `-'    `-'
			     `----------^

so if CPU0 gets into unqueue path of osq_lock it will keep spinning
in infinite loop as condition prev-&gt;next == node will never be true.

Signed-off-by: Prateek Sood &lt;prsood@codeaurora.org&gt;
[ Added pictures, rewrote comments. ]
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: sramana@codeaurora.org
Link: http://lkml.kernel.org/r/1500040076-27626-1-git-send-email-prsood@codeaurora.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Amit Pundir &lt;amit.pundir@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>locking/rwsem-xadd: Fix missed wakeup due to reordering of load</title>
<updated>2018-09-19T20:48:56Z</updated>
<author>
<name>Prateek Sood</name>
<email>prsood@codeaurora.org</email>
</author>
<published>2017-09-07T14:30:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=70cc08c44fb55b587c7485a15549e9f9a12c9405'/>
<id>urn:sha1:70cc08c44fb55b587c7485a15549e9f9a12c9405</id>
<content type='text'>
commit 9c29c31830a4eca724e137a9339137204bbb31be upstream.

If a spinner is present, there is a chance that the load of
rwsem_has_spinner() in rwsem_wake() can be reordered with
respect to decrement of rwsem count in __up_write() leading
to wakeup being missed:

 spinning writer                  up_write caller
 ---------------                  -----------------------
 [S] osq_unlock()                 [L] osq
  spin_lock(wait_lock)
  sem-&gt;count=0xFFFFFFFF00000001
            +0xFFFFFFFF00000000
  count=sem-&gt;count
  MB
                                   sem-&gt;count=0xFFFFFFFE00000001
                                             -0xFFFFFFFF00000001
                                   spin_trylock(wait_lock)
                                   return
 rwsem_try_write_lock(count)
 spin_unlock(wait_lock)
 schedule()

Reordering of atomic_long_sub_return_release() in __up_write()
and rwsem_has_spinner() in rwsem_wake() can cause missing of
wakeup in up_write() context. In spinning writer, sem-&gt;count
and local variable count is 0XFFFFFFFE00000001. It would result
in rwsem_try_write_lock() failing to acquire rwsem and spinning
writer going to sleep in rwsem_down_write_failed().

The smp_rmb() will make sure that the spinner state is
consulted after sem-&gt;count is updated in up_write context.

Signed-off-by: Prateek Sood &lt;prsood@codeaurora.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: dave@stgolabs.net
Cc: longman@redhat.com
Cc: parri.andrea@gmail.com
Cc: sramana@codeaurora.org
Link: http://lkml.kernel.org/r/1504794658-15397-1-git-send-email-prsood@codeaurora.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Amit Pundir &lt;amit.pundir@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>locking/lockdep: Do not record IRQ state within lockdep code</title>
<updated>2018-08-24T11:26:55Z</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2018-04-04T18:06:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c40dc96f7f7e29d7c1e520c46c87178c2d4b1dcc'/>
<id>urn:sha1:c40dc96f7f7e29d7c1e520c46c87178c2d4b1dcc</id>
<content type='text'>
[ Upstream commit fcc784be837714a9173b372ff9fb9b514590dad9 ]

While debugging where things were going wrong with mapping
enabling/disabling interrupts with the lockdep state and actual real
enabling and disabling interrupts, I had to silent the IRQ
disabling/enabling in debug_check_no_locks_freed() because it was
always showing up as it was called before the splat was.

Use raw_local_irq_save/restore() for not only debug_check_no_locks_freed()
but for all internal lockdep functions, as they hide useful information
about where interrupts were used incorrectly last.

Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Link: https://lkml.kernel.org/lkml/20180404140630.3f4f4c7a@gandalf.local.home
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>locking/qspinlock: Ensure node-&gt;count is updated before initialising node</title>
<updated>2018-05-30T05:48:57Z</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2018-02-13T13:22:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=abd9138a1b0a987498296f93be174aab0d41b051'/>
<id>urn:sha1:abd9138a1b0a987498296f93be174aab0d41b051</id>
<content type='text'>
[ Upstream commit 11dc13224c975efcec96647a4768a6f1bb7a19a8 ]

When queuing on the qspinlock, the count field for the current CPU's head
node is incremented. This needn't be atomic because locking in e.g. IRQ
context is balanced and so an IRQ will return with node-&gt;count as it
found it.

However, the compiler could in theory reorder the initialisation of
node[idx] before the increment of the head node-&gt;count, causing an
IRQ to overwrite the initialised node and potentially corrupt the lock
state.

Avoid the potential for this harmful compiler reordering by placing a
barrier() between the increment of the head node-&gt;count and the subsequent
node initialisation.

Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1518528177-19169-3-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>locking/mutex: Allow next waiter lockless wakeup</title>
<updated>2018-01-17T08:35:27Z</updated>
<author>
<name>Davidlohr Bueso</name>
<email>dave@stgolabs.net</email>
</author>
<published>2016-01-25T02:23:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bd44e3f19d14e196fdd2635698ff5612e971dfa5'/>
<id>urn:sha1:bd44e3f19d14e196fdd2635698ff5612e971dfa5</id>
<content type='text'>
commit 1329ce6fbbe4536592dfcfc8d64d61bfeb598fe6 upstream.

Make use of wake-queues and enable the wakeup to occur after releasing the
wait_lock. This is similar to what we do with rtmutex top waiter,
slightly shortening the critical region and allow other waiters to
acquire the wait_lock sooner. In low contention cases it can also help
the recently woken waiter to find the wait_lock available (fastpath)
when it continues execution.

Reviewed-by: Waiman Long &lt;Waiman.Long@hpe.com&gt;
Signed-off-by: Davidlohr Bueso &lt;dbueso@suse.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Ding Tianhong &lt;dingtianhong@huawei.com&gt;
Cc: Jason Low &lt;jason.low2@hp.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Paul E. McKenney &lt;paulmck@us.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Cc: Waiman Long &lt;waiman.long@hpe.com&gt;
Cc: Will Deacon &lt;Will.Deacon@arm.com&gt;
Link: http://lkml.kernel.org/r/20160125022343.GA3322@linux-uzut.site
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>locking/lockdep: Add nest_lock integrity test</title>
<updated>2017-10-21T15:09:03Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2017-03-01T15:23:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=28eab3db727efb7ad4eb17aaa83df59c3d50e330'/>
<id>urn:sha1:28eab3db727efb7ad4eb17aaa83df59c3d50e330</id>
<content type='text'>
[ Upstream commit 7fb4a2cea6b18dab56d609530d077f168169ed6b ]

Boqun reported that hlock-&gt;references can overflow. Add a debug test
for that to generate a clear error when this happens.

Without this, lockdep is likely to report a mysterious failure on
unlock.

Reported-by: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Nicolai Hähnle &lt;Nicolai.Haehnle@amd.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>locktorture: Fix potential memory leak with rw lock test</title>
<updated>2017-09-13T21:09:46Z</updated>
<author>
<name>Yang Shi</name>
<email>yang.shi@linaro.org</email>
</author>
<published>2016-11-10T21:06:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=10863607c242e970cfc14c42b35689737c397fe4'/>
<id>urn:sha1:10863607c242e970cfc14c42b35689737c397fe4</id>
<content type='text'>
commit f4dbba591945dc301c302672adefba9e2ec08dc5 upstream.

When running locktorture module with the below commands with kmemleak enabled:

$ modprobe locktorture torture_type=rw_lock_irq
$ rmmod locktorture

The below kmemleak got caught:

root@10:~# echo scan &gt; /sys/kernel/debug/kmemleak
[  323.197029] kmemleak: 2 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
root@10:~# cat /sys/kernel/debug/kmemleak
unreferenced object 0xffffffc07592d500 (size 128):
  comm "modprobe", pid 368, jiffies 4294924118 (age 205.824s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 c3 7b 02 00 00 00 00 00  .........{......
    00 00 00 00 00 00 00 00 d7 9b 02 00 00 00 00 00  ................
  backtrace:
    [&lt;ffffff80081e5a88&gt;] create_object+0x110/0x288
    [&lt;ffffff80086c6078&gt;] kmemleak_alloc+0x58/0xa0
    [&lt;ffffff80081d5acc&gt;] __kmalloc+0x234/0x318
    [&lt;ffffff80006fa130&gt;] 0xffffff80006fa130
    [&lt;ffffff8008083ae4&gt;] do_one_initcall+0x44/0x138
    [&lt;ffffff800817e28c&gt;] do_init_module+0x68/0x1cc
    [&lt;ffffff800811c848&gt;] load_module+0x1a68/0x22e0
    [&lt;ffffff800811d340&gt;] SyS_finit_module+0xe0/0xf0
    [&lt;ffffff80080836f0&gt;] el0_svc_naked+0x24/0x28
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff
unreferenced object 0xffffffc07592d480 (size 128):
  comm "modprobe", pid 368, jiffies 4294924118 (age 205.824s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 3b 6f 01 00 00 00 00 00  ........;o......
    00 00 00 00 00 00 00 00 23 6a 01 00 00 00 00 00  ........#j......
  backtrace:
    [&lt;ffffff80081e5a88&gt;] create_object+0x110/0x288
    [&lt;ffffff80086c6078&gt;] kmemleak_alloc+0x58/0xa0
    [&lt;ffffff80081d5acc&gt;] __kmalloc+0x234/0x318
    [&lt;ffffff80006fa22c&gt;] 0xffffff80006fa22c
    [&lt;ffffff8008083ae4&gt;] do_one_initcall+0x44/0x138
    [&lt;ffffff800817e28c&gt;] do_init_module+0x68/0x1cc
    [&lt;ffffff800811c848&gt;] load_module+0x1a68/0x22e0
    [&lt;ffffff800811d340&gt;] SyS_finit_module+0xe0/0xf0
    [&lt;ffffff80080836f0&gt;] el0_svc_naked+0x24/0x28
    [&lt;ffffffffffffffff&gt;] 0xffffffffffffffff

It is because cxt.lwsa and cxt.lrsa don't get freed in module_exit, so free
them in lock_torture_cleanup() and free writer_tasks if reader_tasks is
failed at memory allocation.

Signed-off-by: Yang Shi &lt;yang.shi@linaro.org&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Reviewed-by: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: 石洋 &lt;yang.s@alibaba-inc.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>locking/rtmutex: Use READ_ONCE() in rt_mutex_owner()</title>
<updated>2016-12-15T16:49:22Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-11-30T21:04:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c6a5bf4cda12e4a3c097036ce7044ee65e4cf6d9'/>
<id>urn:sha1:c6a5bf4cda12e4a3c097036ce7044ee65e4cf6d9</id>
<content type='text'>
commit 1be5d4fa0af34fb7bafa205aeb59f5c7cc7a089d upstream.

While debugging the rtmutex unlock vs. dequeue race Will suggested to use
READ_ONCE() in rt_mutex_owner() as it might race against the
cmpxchg_release() in unlock_rt_mutex_safe().

Will: "It's a minor thing which will most likely not matter in practice"

Careful search did not unearth an actual problem in todays code, but it's
better to be safe than surprised.

Suggested-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: David Daney &lt;ddaney@caviumnetworks.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Link: http://lkml.kernel.org/r/20161130210030.431379999@linutronix.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>locking/rtmutex: Prevent dequeue vs. unlock race</title>
<updated>2016-12-15T16:49:22Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-11-30T21:04:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b27d9147f24a501d632916964f77c1221246fae9'/>
<id>urn:sha1:b27d9147f24a501d632916964f77c1221246fae9</id>
<content type='text'>
commit dbb26055defd03d59f678cb5f2c992abe05b064a upstream.

David reported a futex/rtmutex state corruption. It's caused by the
following problem:

CPU0		CPU1		CPU2

l-&gt;owner=T1
		rt_mutex_lock(l)
		lock(l-&gt;wait_lock)
		l-&gt;owner = T1 | HAS_WAITERS;
		enqueue(T2)
		boost()
		  unlock(l-&gt;wait_lock)
		schedule()

				rt_mutex_lock(l)
				lock(l-&gt;wait_lock)
				l-&gt;owner = T1 | HAS_WAITERS;
				enqueue(T3)
				boost()
				  unlock(l-&gt;wait_lock)
				schedule()
		signal(-&gt;T2)	signal(-&gt;T3)
		lock(l-&gt;wait_lock)
		dequeue(T2)
		deboost()
		  unlock(l-&gt;wait_lock)
				lock(l-&gt;wait_lock)
				dequeue(T3)
				  ===&gt; wait list is now empty
				deboost()
				 unlock(l-&gt;wait_lock)
		lock(l-&gt;wait_lock)
		fixup_rt_mutex_waiters()
		  if (wait_list_empty(l)) {
		    owner = l-&gt;owner &amp; ~HAS_WAITERS;
		    l-&gt;owner = owner
		     ==&gt; l-&gt;owner = T1
		  }

				lock(l-&gt;wait_lock)
rt_mutex_unlock(l)		fixup_rt_mutex_waiters()
				  if (wait_list_empty(l)) {
				    owner = l-&gt;owner &amp; ~HAS_WAITERS;
cmpxchg(l-&gt;owner, T1, NULL)
 ===&gt; Success (l-&gt;owner = NULL)
				    l-&gt;owner = owner
				     ==&gt; l-&gt;owner = T1
				  }

That means the problem is caused by fixup_rt_mutex_waiters() which does the
RMW to clear the waiters bit unconditionally when there are no waiters in
the rtmutexes rbtree.

This can be fatal: A concurrent unlock can release the rtmutex in the
fastpath because the waiters bit is not set. If the cmpxchg() gets in the
middle of the RMW operation then the previous owner, which just unlocked
the rtmutex is set as the owner again when the write takes place after the
successfull cmpxchg().

The solution is rather trivial: verify that the owner member of the rtmutex
has the waiters bit set before clearing it. This does not require a
cmpxchg() or other atomic operations because the waiters bit can only be
set and cleared with the rtmutex wait_lock held. It's also safe against the
fast path unlock attempt. The unlock attempt via cmpxchg() will either see
the bit set and take the slowpath or see the bit cleared and release it
atomically in the fastpath.

It's remarkable that the test program provided by David triggers on ARM64
and MIPS64 really quick, but it refuses to reproduce on x86-64, while the
problem exists there as well. That refusal might explain that this got not
discovered earlier despite the bug existing from day one of the rtmutex
implementation more than 10 years ago.

Thanks to David for meticulously instrumenting the code and providing the
information which allowed to decode this subtle problem.

Reported-by: David Daney &lt;ddaney@caviumnetworks.com&gt;
Tested-by: David Daney &lt;david.daney@cavium.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Fixes: 23f78d4a03c5 ("[PATCH] pi-futex: rt mutex core")
Link: http://lkml.kernel.org/r/20161130210030.351136722@linutronix.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>locking/qspinlock: Fix spin_unlock_wait() some more</title>
<updated>2016-07-27T16:47:29Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2016-06-08T08:19:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a39e660a55e8ce5ae35a7694835ad464b72666ca'/>
<id>urn:sha1:a39e660a55e8ce5ae35a7694835ad464b72666ca</id>
<content type='text'>
commit 2c610022711675ee908b903d242f0b90e1db661f upstream.

While this prior commit:

  54cf809b9512 ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()")

... fixes spin_is_locked() and spin_unlock_wait() for the usage
in ipc/sem and netfilter, it does not in fact work right for the
usage in task_work and futex.

So while the 2 locks crossed problem:

	spin_lock(A)		spin_lock(B)
	if (!spin_is_locked(B)) spin_unlock_wait(A)
	  foo()			foo();

... works with the smp_mb() injected by both spin_is_locked() and
spin_unlock_wait(), this is not sufficient for:

	flag = 1;
	smp_mb();		spin_lock()
	spin_unlock_wait()	if (!flag)
				  // add to lockless list
	// iterate lockless list

... because in this scenario, the store from spin_lock() can be delayed
past the load of flag, uncrossing the variables and loosing the
guarantee.

This patch reworks spin_is_locked() and spin_unlock_wait() to work in
both cases by exploiting the observation that while the lock byte
store can be delayed, the contender must have registered itself
visibly in other state contained in the word.

It also allows for architectures to override both functions, as PPC
and ARM64 have an additional issue for which we currently have no
generic solution.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Cc: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Giovanni Gherdovich &lt;ggherdovich@suse.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Pan Xinhui &lt;xinhui.pan@linux.vnet.ibm.com&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Waiman Long &lt;waiman.long@hpe.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Fixes: 54cf809b9512 ("locking,qspinlock: Fix spin_is_locked() and spin_unlock_wait()")
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
