<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/locking/rtmutex.c, branch v4.11.3</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.11.3</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.11.3'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2017-03-02T07:42:34Z</updated>
<entry>
<title>sched/headers: Prepare for new header dependencies before moving code to &lt;linux/sched/debug.h&gt;</title>
<updated>2017-03-02T07:42:34Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-02-08T17:51:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b17b01533b719e9949e437abf66436a875739b40'/>
<id>urn:sha1:b17b01533b719e9949e437abf66436a875739b40</id>
<content type='text'>
We are going to split &lt;linux/sched/debug.h&gt; out of &lt;linux/sched.h&gt;, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder &lt;linux/sched/debug.h&gt; file that just
maps to &lt;linux/sched.h&gt; to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/headers: Prepare to move signal wakeup &amp; sigpending methods from &lt;linux/sched.h&gt; into &lt;linux/sched/signal.h&gt;</title>
<updated>2017-03-02T07:42:32Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-02-02T18:15:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=174cd4b1e5fbd0d74c68cf3a74f5bd4923485512'/>
<id>urn:sha1:174cd4b1e5fbd0d74c68cf3a74f5bd4923485512</id>
<content type='text'>
Fix up affected files that include this signal functionality via sched.h.

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/headers: Prepare for new header dependencies before moving code to &lt;linux/sched/wake_q.h&gt;</title>
<updated>2017-03-02T07:42:26Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-02-01T15:36:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=84f001e15737f8214b0f5f0f7dfec0fb1027938f'/>
<id>urn:sha1:84f001e15737f8214b0f5f0f7dfec0fb1027938f</id>
<content type='text'>
We are going to split &lt;linux/sched/wake_q.h&gt; out of &lt;linux/sched.h&gt;, which
will have to be picked up from other headers and a couple of .c files.

Create a trivial placeholder &lt;linux/sched/wake_q.h&gt; file that just
maps to &lt;linux/sched.h&gt; to make this patch obviously correct and
bisectable.

Include the new header in the files that are going to need it.

Acked-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/rtmutex: Flip unlikely() branch to likely() in __rt_mutex_slowlock()</title>
<updated>2017-01-30T10:42:59Z</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2017-01-19T16:32:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4009f4b3a9d8b74547269f293e6a920adf278996'/>
<id>urn:sha1:4009f4b3a9d8b74547269f293e6a920adf278996</id>
<content type='text'>
Running my likely/unlikely profiler for 3 weeks on two production
machines, I discovered that the unlikely() test in
__rt_mutex_slowlock() checking if state is TASK_INTERRUPTIBLE is hit
100% of the time, making it a very likely case.

The reason is, on a vanilla kernel, the majority case of calling
rt_mutex() is from the futex code. This code is always called as
TASK_INTERRUPTIBLE. In the -rt patch, this code is commonly called when
PREEMPT_RT is enabled with TASK_UNINTERRUPTIBLE. But that's not the
likely scenario.

The rt_mutex() code should be optimized for the common vanilla case,
and that is from a futex, with TASK_INTERRUPTIBLE as the state.

Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/20170119113234.1efeedd1@gandalf.local.home
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/rtmutex: Explain locking rules for rt_mutex_proxy_unlock()/init_proxy_locked()</title>
<updated>2016-12-02T10:13:57Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-11-30T21:04:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=84d82ec5b9046ecdf16031d3e93a66ef50257402'/>
<id>urn:sha1:84d82ec5b9046ecdf16031d3e93a66ef50257402</id>
<content type='text'>
While debugging the unlock vs. dequeue race which resulted in state
corruption of futexes the lockless nature of rt_mutex_proxy_unlock()
caused some confusion.

Add commentry to explain why it is safe to do this lockless. Add matching
comments to rt_mutex_init_proxy_locked() for completeness sake.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: David Daney &lt;ddaney@caviumnetworks.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Link: http://lkml.kernel.org/r/20161130210030.591941927@linutronix.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'locking/urgent' into locking/core, to pick up dependent fixes</title>
<updated>2016-12-02T10:13:44Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2016-12-02T10:13:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1b95b1a06cb27badb3e53329fb56af2a2113fd80'/>
<id>urn:sha1:1b95b1a06cb27badb3e53329fb56af2a2113fd80</id>
<content type='text'>
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/rtmutex: Prevent dequeue vs. unlock race</title>
<updated>2016-12-02T10:13:26Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-11-30T21:04:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dbb26055defd03d59f678cb5f2c992abe05b064a'/>
<id>urn:sha1:dbb26055defd03d59f678cb5f2c992abe05b064a</id>
<content type='text'>
David reported a futex/rtmutex state corruption. It's caused by the
following problem:

CPU0		CPU1		CPU2

l-&gt;owner=T1
		rt_mutex_lock(l)
		lock(l-&gt;wait_lock)
		l-&gt;owner = T1 | HAS_WAITERS;
		enqueue(T2)
		boost()
		  unlock(l-&gt;wait_lock)
		schedule()

				rt_mutex_lock(l)
				lock(l-&gt;wait_lock)
				l-&gt;owner = T1 | HAS_WAITERS;
				enqueue(T3)
				boost()
				  unlock(l-&gt;wait_lock)
				schedule()
		signal(-&gt;T2)	signal(-&gt;T3)
		lock(l-&gt;wait_lock)
		dequeue(T2)
		deboost()
		  unlock(l-&gt;wait_lock)
				lock(l-&gt;wait_lock)
				dequeue(T3)
				  ===&gt; wait list is now empty
				deboost()
				 unlock(l-&gt;wait_lock)
		lock(l-&gt;wait_lock)
		fixup_rt_mutex_waiters()
		  if (wait_list_empty(l)) {
		    owner = l-&gt;owner &amp; ~HAS_WAITERS;
		    l-&gt;owner = owner
		     ==&gt; l-&gt;owner = T1
		  }

				lock(l-&gt;wait_lock)
rt_mutex_unlock(l)		fixup_rt_mutex_waiters()
				  if (wait_list_empty(l)) {
				    owner = l-&gt;owner &amp; ~HAS_WAITERS;
cmpxchg(l-&gt;owner, T1, NULL)
 ===&gt; Success (l-&gt;owner = NULL)
				    l-&gt;owner = owner
				     ==&gt; l-&gt;owner = T1
				  }

That means the problem is caused by fixup_rt_mutex_waiters() which does the
RMW to clear the waiters bit unconditionally when there are no waiters in
the rtmutexes rbtree.

This can be fatal: A concurrent unlock can release the rtmutex in the
fastpath because the waiters bit is not set. If the cmpxchg() gets in the
middle of the RMW operation then the previous owner, which just unlocked
the rtmutex is set as the owner again when the write takes place after the
successfull cmpxchg().

The solution is rather trivial: verify that the owner member of the rtmutex
has the waiters bit set before clearing it. This does not require a
cmpxchg() or other atomic operations because the waiters bit can only be
set and cleared with the rtmutex wait_lock held. It's also safe against the
fast path unlock attempt. The unlock attempt via cmpxchg() will either see
the bit set and take the slowpath or see the bit cleared and release it
atomically in the fastpath.

It's remarkable that the test program provided by David triggers on ARM64
and MIPS64 really quick, but it refuses to reproduce on x86-64, while the
problem exists there as well. That refusal might explain that this got not
discovered earlier despite the bug existing from day one of the rtmutex
implementation more than 10 years ago.

Thanks to David for meticulously instrumenting the code and providing the
information which allowed to decode this subtle problem.

Reported-by: David Daney &lt;ddaney@caviumnetworks.com&gt;
Tested-by: David Daney &lt;david.daney@cavium.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: stable@vger.kernel.org
Fixes: 23f78d4a03c5 ("[PATCH] pi-futex: rt mutex core")
Link: http://lkml.kernel.org/r/20161130210030.351136722@linutronix.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/wake_q: Rename WAKE_Q to DEFINE_WAKE_Q</title>
<updated>2016-11-21T09:29:01Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2016-11-17T16:46:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=194a6b5b9cb6b91a5f7d86984165a3bc55188599'/>
<id>urn:sha1:194a6b5b9cb6b91a5f7d86984165a3bc55188599</id>
<content type='text'>
Currently the wake_q data structure is defined by the WAKE_Q() macro.
This macro, however, looks like a function doing something as "wake" is
a verb. Even checkpatch.pl was confused as it reported warnings like

  WARNING: Missing a blank line after declarations
  #548: FILE: kernel/futex.c:3665:
  +	int ret;
  +	WAKE_Q(wake_q);

This patch renames the WAKE_Q() macro to DEFINE_WAKE_Q() which clarifies
what the macro is doing and eliminates the checkpatch.pl warnings.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Acked-by: Davidlohr Bueso &lt;dave@stgolabs.net&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1479401198-1765-1-git-send-email-longman@redhat.com
[ Resolved conflict and added missing rename. ]
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/rtmutex: Only warn once on a trylock from bad context</title>
<updated>2016-06-08T12:22:00Z</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2016-05-27T13:47:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a461d58792d4a46b8b10ae50973ec9b2763b694e'/>
<id>urn:sha1:a461d58792d4a46b8b10ae50973ec9b2763b694e</id>
<content type='text'>
One warning should be enough to get one motivated to fix this. It is
possible that this happens more than once and that starts flooding the
output. Later the prints will be suppressed so we only get half of it.
Depending on the console system used it might not be helpful.

Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1464356838-1755-1-git-send-email-bigeasy@linutronix.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>rtmutex: Make wait_lock irq safe</title>
<updated>2016-01-26T10:08:35Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-01-13T10:25:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b4abf91047cf054f203dcfac97e1038388826937'/>
<id>urn:sha1:b4abf91047cf054f203dcfac97e1038388826937</id>
<content type='text'>
Sasha reported a lockdep splat about a potential deadlock between RCU boosting
rtmutex and the posix timer it_lock.

CPU0					CPU1

rtmutex_lock(&amp;rcu-&gt;rt_mutex)
  spin_lock(&amp;rcu-&gt;rt_mutex.wait_lock)
					local_irq_disable()
					spin_lock(&amp;timer-&gt;it_lock)
					spin_lock(&amp;rcu-&gt;mutex.wait_lock)
--&gt; Interrupt
    spin_lock(&amp;timer-&gt;it_lock)

This is caused by the following code sequence on CPU1

     rcu_read_lock()
     x = lookup();
     if (x)
     	spin_lock_irqsave(&amp;x-&gt;it_lock);
     rcu_read_unlock();
     return x;

We could fix that in the posix timer code by keeping rcu read locked across
the spinlocked and irq disabled section, but the above sequence is common and
there is no reason not to support it.

Taking rt_mutex.wait_lock irq safe prevents the deadlock.

Reported-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
</content>
</entry>
</feed>
