<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/locking, branch v4.15.5</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.15.5</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.15.5'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-01-24T09:00:09Z</updated>
<entry>
<title>locking/lockdep: Avoid triggering hardlockup from debug_show_all_locks()</title>
<updated>2018-01-24T09:00:09Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2018-01-22T22:00:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=88f1c87de11a86d839f4ce5313e552d96709b990'/>
<id>urn:sha1:88f1c87de11a86d839f4ce5313e552d96709b990</id>
<content type='text'>
debug_show_all_locks() iterates all tasks and prints held locks while
holding tasklist_lock.  This can take a while on a slow console device
and may end up triggering the NMI hardlockup detector if someone else
ends up waiting for tasklist_lock.

Touch the NMI watchdog while printing the held locks to avoid
spuriously triggering the hardlockup detector.
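
The shape of the fix is a watchdog touch inside the task iteration
loop; a minimal sketch, assuming the v4.15-era do_each_thread() loop
(not the literal upstream diff):

  static void debug_show_all_locks(void)
  {
          struct task_struct *g, *p;

          pr_warn("\nShowing all locks held in the system:\n");
          read_lock(&amp;tasklist_lock);
          do_each_thread(g, p) {
                  if (p-&gt;lockdep_depth)
                          lockdep_print_held_locks(p);
                  /*
                   * Printing on a slow console can take long enough
                   * to trip the hardlockup detector on a CPU spinning
                   * on tasklist_lock; keep the watchdog fed.
                   */
                  touch_nmi_watchdog();
          } while_each_thread(g, p);
          read_unlock(&amp;tasklist_lock);
  }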

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: kernel-team@fb.com
Link: http://lkml.kernel.org/r/20180122220055.GB1771050@devbig577.frc2.facebook.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>futex: Avoid violating the 10th rule of futex</title>
<updated>2018-01-14T17:49:16Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2017-12-08T12:49:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c1e2f0eaf015fb7076d51a339011f2383e6dd389'/>
<id>urn:sha1:c1e2f0eaf015fb7076d51a339011f2383e6dd389</id>
<content type='text'>
Julia reported futex state corruption in the following scenario:

   waiter                                  waker                                            stealer (prio &gt; waiter)

   futex(WAIT_REQUEUE_PI, uaddr, uaddr2,
         timeout=[N ms])
      futex_wait_requeue_pi()
         futex_wait_queue_me()
            freezable_schedule()
            &lt;scheduled out&gt;
                                           futex(LOCK_PI, uaddr2)
                                           futex(CMP_REQUEUE_PI, uaddr,
                                                 uaddr2, 1, 0)
                                              /* requeues waiter to uaddr2 */
                                           futex(UNLOCK_PI, uaddr2)
                                                 wake_futex_pi()
                                                    cmp_futex_value_locked(uaddr2, waiter)
                                                    wake_up_q()
           &lt;woken by waker&gt;
           &lt;hrtimer_wakeup() fires,
            clears sleeper-&gt;task&gt;
                                                                                           futex(LOCK_PI, uaddr2)
                                                                                              __rt_mutex_start_proxy_lock()
                                                                                                 try_to_take_rt_mutex() /* steals lock */
                                                                                                    rt_mutex_set_owner(lock, stealer)
                                                                                              &lt;preempted&gt;
         &lt;scheduled in&gt;
         rt_mutex_wait_proxy_lock()
            __rt_mutex_slowlock()
               try_to_take_rt_mutex() /* fails, lock held by stealer */
               if (timeout &amp;&amp; !timeout-&gt;task)
                  return -ETIMEDOUT;
            fixup_owner()
               /* lock wasn't acquired, so,
                  fixup_pi_state_owner skipped */

   return -ETIMEDOUT;

   /* At this point, we've returned -ETIMEDOUT to userspace, but the
    * futex word shows waiter to be the owner, and the pi_mutex has
    * stealer as the owner */

   futex_lock(LOCK_PI, uaddr2)
     -&gt; bails with EDEADLK, futex word says we're owner.

And suggested that what commit:

  73d786bd043e ("futex: Rework inconsistent rt_mutex/futex_q state")

removes from fixup_owner() looks to be just what is needed. And indeed
it is -- I completely missed that requeue_pi could also result in this
case. So we need to restore that, except that subsequent patches, like
commit:

  16ffa12d7425 ("futex: Pull rt_mutex_futex_unlock() out from under hb-&gt;lock")

changed all the locking rules. Even without that, the sequence:

-               if (rt_mutex_futex_trylock(&amp;q-&gt;pi_state-&gt;pi_mutex)) {
-                       locked = 1;
-                       goto out;
-               }

-               raw_spin_lock_irq(&amp;q-&gt;pi_state-&gt;pi_mutex.wait_lock);
-               owner = rt_mutex_owner(&amp;q-&gt;pi_state-&gt;pi_mutex);
-               if (!owner)
-                       owner = rt_mutex_next_owner(&amp;q-&gt;pi_state-&gt;pi_mutex);
-               raw_spin_unlock_irq(&amp;q-&gt;pi_state-&gt;pi_mutex.wait_lock);
-               ret = fixup_pi_state_owner(uaddr, q, owner);

already suggests there were races; otherwise we'd never have to look
at next_owner.

So instead of doing 3 consecutive wait_lock sections with who knows
what races, we do it all in a single section. Additionally, the usage
of pi_state-&gt;owner in fixup_owner() was only safe because only the
rt_mutex owner would modify it, which this additional case wrecks.

Luckily the values can only change away from, and not to, the value
we're testing; this means we can do a speculative test and double
check once we have the wait_lock.
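
In code, that speculative-test pattern has roughly the following shape
(a sketch of the idea only, with illustrative error handling, not the
exact upstream diff):

  /*
   * pi_state-&gt;owner can only change away from, never to, the value
   * we test, so a mismatch observed without the lock held is
   * conclusive; a match must be re-checked under wait_lock.
   */
  if (q-&gt;pi_state-&gt;owner != current)      /* speculative test */
          return 0;                       /* nothing to fix up */

  raw_spin_lock_irq(&amp;q-&gt;pi_state-&gt;pi_mutex.wait_lock);
  if (q-&gt;pi_state-&gt;owner != current) {    /* double check */
          raw_spin_unlock_irq(&amp;q-&gt;pi_state-&gt;pi_mutex.wait_lock);
          return 0;
  }
  /* ... all owner fixup now happens in this one section ... */
  raw_spin_unlock_irq(&amp;q-&gt;pi_state-&gt;pi_mutex.wait_lock);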

Fixes: 73d786bd043e ("futex: Rework inconsistent rt_mutex/futex_q state")
Reported-by: Julia Cartwright &lt;julia@ni.com&gt;
Reported-by: Gratian Crisan &lt;gratian.crisan@ni.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Julia Cartwright &lt;julia@ni.com&gt;
Tested-by: Gratian Crisan &lt;gratian.crisan@ni.com&gt;
Cc: Darren Hart &lt;dvhart@infradead.org&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20171208124939.7livp7no2ov65rrc@hirez.programming.kicks-ass.net
</content>
</entry>
<entry>
<title>locking/lockdep: Remove the cross-release locking checks</title>
<updated>2017-12-12T11:38:51Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-12-12T11:31:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e966eaeeb623f09975ef362c2866fae6f86844f9'/>
<id>urn:sha1:e966eaeeb623f09975ef362c2866fae6f86844f9</id>
<content type='text'>
This code (CONFIG_LOCKDEP_CROSSRELEASE=y and CONFIG_LOCKDEP_COMPLETIONS=y),
while it found a number of old bugs initially, was also producing too
many false positives, which led people to disable lockdep - which is
arguably a worse overall outcome.

If we disable cross-release by default but keep the code upstream then
in practice the most likely outcome is that we'll allow the situation
to degrade gradually, by allowing entropy to introduce more and more
false positives, until it overwhelms maintenance capacity.

Another bad side effect was that people were trying to work around
the false positives by uglifying/complicating unrelated code. There's
a marked difference between annotating locking operations and
uglifying good code just due to bad lock debugging code ...

This gradual decrease in quality happened to a number of debugging
facilities in the kernel, and lockdep is pretty complex already,
so we cannot risk this outcome.

Either cross-release checking can be done right with no false positives,
or it should not be included in the upstream kernel.

( Note that it might make sense to maintain it out of tree and go through
  the false positives every now and then and see whether new bugs were
  introduced. )

Cc: Byungchul Park &lt;byungchul.park@lge.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/core: Remove break_lock field when CONFIG_GENERIC_LOCKBREAK=y</title>
<updated>2017-12-12T10:24:01Z</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2017-11-28T18:42:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d89c70356acf11b7cf47ca5cfcafae5062a85451'/>
<id>urn:sha1:d89c70356acf11b7cf47ca5cfcafae5062a85451</id>
<content type='text'>
When CONFIG_GENERIC_LOCKBREAK=y, locking structures grow an extra int -&gt;break_lock
field which is used to implement raw_spin_is_contended() by setting the field
to 1 when waiting on a lock and clearing it to zero when holding a lock.
However, there are a few problems with this approach:

  - There is a write-write race between a CPU successfully taking the lock
    (and subsequently writing break_lock = 0) and a waiter waiting on
    the lock (and subsequently writing break_lock = 1). This could result
    in a contended lock being reported as uncontended and vice-versa.

  - On machines with store buffers, nothing guarantees that the writes
    to break_lock are visible to other CPUs at any particular time.

  - READ_ONCE/WRITE_ONCE are not used, so the field is potentially
    susceptible to harmful compiler optimisations.

Consequently, the usefulness of this field is unclear and we'd be better off
removing it and allowing architectures to implement raw_spin_is_contended() by
providing a definition of arch_spin_is_contended(), as they can when
CONFIG_GENERIC_LOCKBREAK=n.
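
For illustration, a ticket-lock architecture could then provide
something along these lines (a hypothetical example, not part of this
patch):

  /* Contended iff more than one ticket separates taker and owner. */
  static inline int arch_spin_is_contended(arch_spinlock_t *lock)
  {
          struct __raw_tickets t = READ_ONCE(lock-&gt;tickets);

          return (t.next - t.owner) &gt; 1;
  }
  #define arch_spin_is_contended  arch_spin_is_contended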

Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Sebastian Ott &lt;sebott@linux.vnet.ibm.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1511894539-7988-3-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/core: Fix deadlock during boot on systems with GENERIC_LOCKBREAK</title>
<updated>2017-12-12T10:24:01Z</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2017-11-28T18:42:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f87f3a328dbbb3e79dd53e7e889ced9222512649'/>
<id>urn:sha1:f87f3a328dbbb3e79dd53e7e889ced9222512649</id>
<content type='text'>
Commit:

  a8a217c22116 ("locking/core: Remove {read,spin,write}_can_lock()")

removed the definition of raw_spin_can_lock(), causing the GENERIC_LOCKBREAK
spin_lock() routines to poll the -&gt;break_lock field when waiting on a lock.

This has been reported to cause a deadlock during boot on s390, because
the -&gt;break_lock field is also set by the waiters, and can potentially
remain set indefinitely if no other CPUs come in to take the lock after
it has been released.

This patch removes the explicit spinning on -&gt;break_lock from the waiters,
instead relying on the outer trylock() operation to determine when the
lock is available.
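
After the change, the GENERIC_LOCKBREAK slowpath is driven by the
trylock alone; a simplified sketch of the resulting spin_lock() shape
(the real code is generated by the BUILD_LOCK_OPS() macro in
kernel/locking/spinlock.c):

  void __lockfunc __raw_spin_lock(raw_spinlock_t *lock)
  {
          for (;;) {
                  preempt_disable();
                  if (likely(do_raw_spin_trylock(lock)))
                          break;
                  preempt_enable();

                  /* No -&gt;break_lock polling: back off and retry. */
                  arch_spin_relax(&amp;lock-&gt;raw_lock);
          }
  }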

Reported-by: Sebastian Ott &lt;sebott@linux.vnet.ibm.com&gt;
Tested-by: Sebastian Ott &lt;sebott@linux.vnet.ibm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: a8a217c22116 ("locking/core: Remove {read,spin,write}_can_lock()")
Link: http://lkml.kernel.org/r/1511894539-7988-2-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/lockdep: Fix possible NULL deref</title>
<updated>2017-12-06T18:29:56Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2017-12-06T16:32:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5e351ad106997e06b2dc3da9c6b939b95f67fb88'/>
<id>urn:sha1:5e351ad106997e06b2dc3da9c6b939b95f67fb88</id>
<content type='text'>
We can't invalidate xhlocks when we've not yet allocated any.
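
The fix amounts to a guard of roughly this shape before the invalidate
call (a sketch, using the field names from the cross-release code):

  /* Don't invalidate xhlocks that were never allocated. */
  if (current-&gt;xhlocks)
          invalidate_xhlock(&amp;xhlock(current-&gt;xhlock_idx));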

Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-kernel@vger.kernel.org
Fixes: f52be5708076 ("locking/lockdep: Untangle xhlock history save/restore from task independence")
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>kmemcheck: remove annotations</title>
<updated>2017-11-16T02:21:04Z</updated>
<author>
<name>Levin, Alexander (Sasha Levin)</name>
<email>alexander.levin@verizon.com</email>
</author>
<published>2017-11-16T01:35:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4950276672fce5c241857540f8561c440663673d'/>
<id>urn:sha1:4950276672fce5c241857540f8561c440663673d</id>
<content type='text'>
Patch series "kmemcheck: kill kmemcheck", v2.

As discussed at LSF/MM, kill kmemcheck.

KASan is a replacement that is able to work without the limitations of
kmemcheck (single CPU, slow).  KASan is already upstream.

We are also not aware of any users of kmemcheck (or users who don't
consider KASan as a suitable replacement).

The only objection was that, since KASAN wasn't supported by all GCC
versions provided by distros at that time, we should hold off for 2
years and try again.

Now that 2 years have passed, and all distros provide gcc that supports
KASAN, kill kmemcheck again for the very same reasons.

This patch (of 4):

Remove kmemcheck annotations, and calls to kmemcheck from the kernel.

[alexander.levin@verizon.com: correctly remove kmemcheck call from dma_map_sg_attrs]
  Link: http://lkml.kernel.org/r/20171012192151.26531-1-alexander.levin@verizon.com
Link: http://lkml.kernel.org/r/20171007030159.22241-2-alexander.levin@verizon.com
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Cc: Alexander Potapenko &lt;glider@google.com&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: Michal Hocko &lt;mhocko@kernel.org&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Tim Hansen &lt;devtimhansen@gmail.com&gt;
Cc: Vegard Nossum &lt;vegardno@ifi.uio.no&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>locking/pvqspinlock: Implement hybrid PV queued/unfair locks</title>
<updated>2017-11-08T09:10:04Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2017-11-07T21:18:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=11752adb68a388724b1935d57bf543897c34d80b'/>
<id>urn:sha1:11752adb68a388724b1935d57bf543897c34d80b</id>
<content type='text'>
Currently, all the lock waiters entering the slowpath will do one
lock stealing attempt to acquire the lock. That helps performance,
especially in VMs with over-committed vCPUs. However, the current
pvqspinlocks still don't perform as well as unfair locks in many cases.
On the other hand, unfair locks do have the problem of lock starvation
that pvqspinlocks don't have.

This patch combines the best attributes of an unfair lock and a
pvqspinlock into a hybrid lock with 2 modes - queued mode &amp; unfair
mode. A lock waiter goes into the unfair mode when there are waiters
in the wait queue but the pending bit isn't set. Otherwise, it will
go into the queued mode waiting in the queue for its turn.
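
In code, the mode decision reads roughly as follows (a sketch of the
trylock used by the PV waiters, simplified from the patch):

  /*
   * Stay in unfair mode as long as queued-mode waiters are present
   * in the wait queue but the pending bit isn't set.
   */
  static inline bool pv_hybrid_queued_unfair_trylock(struct qspinlock *lock)
  {
          for (;;) {
                  int val = atomic_read(&amp;lock-&gt;val);

                  if (!(val &amp; _Q_LOCKED_PENDING_MASK) &amp;&amp;
                      (cmpxchg(&amp;lock-&gt;locked, 0, _Q_LOCKED_VAL) == 0))
                          return true;    /* lock stolen: unfair mode */
                  if (!(val &amp; _Q_TAIL_MASK) || (val &amp; _Q_PENDING_MASK))
                          break;          /* fall back to queued mode */

                  cpu_relax();
          }

          return false;
  }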

On a 2-socket 36-core E5-2699 v3 system (HT off), a kernel build
(make -j&lt;n&gt;) was done three times in a VM with unpinned vCPUs, with
the best time selected; &lt;n&gt; is the number of vCPUs available. The
build times of the original pvqspinlock, the hybrid pvqspinlock and
the unfair lock with various numbers of vCPUs are as follows:

  vCPUs    pvqlock     hybrid pvqlock    unfair lock
  -----    -------     --------------    -----------
    30      342.1s         329.1s          329.1s
    36      314.1s         305.3s          307.3s
    45      345.0s         302.1s          306.6s
    54      365.4s         308.6s          307.8s
    72      358.9s         293.6s          303.9s
   108      343.0s         285.9s          304.2s

The hybrid pvqspinlock performs better or comparable to the unfair
lock.

With QUEUED_LOCK_STAT turned on, the table below shows the number
of lock acquisitions in unfair mode and queued mode after a kernel
build with various numbers of vCPUs.

  vCPUs    queued mode  unfair mode
  -----    -----------  -----------
    30      9,130,518      294,954
    36     10,856,614      386,809
    45      8,467,264   11,475,373
    54      6,409,987   19,670,855
    72      4,782,063   25,712,180

It can be seen that as the VM becomes more and more over-committed,
the ratio of locks acquired in unfair mode increases. This is all
done automatically to get the best overall performance possible.

Using a kernel locking microbenchmark with the number of locking
threads equal to the number of vCPUs available on the same machine,
the minimum, average and maximum (min/avg/max) numbers of locking
operations done per thread in a 5-second testing interval are shown
below:

  vCPUs         hybrid pvqlock             unfair lock
  -----         --------------             -----------
    36     822,135/881,063/950,363    75,570/313,496/  690,465
    54     542,435/581,664/625,937    35,460/204,280/  457,172
    72     397,500/428,177/499,299    17,933/150,679/  708,001
   108     257,898/288,150/340,871     3,085/181,176/1,257,109

It can be seen that the hybrid pvqspinlocks are both fairer and more
performant than the unfair locks in this test.

The table below shows the kernel build times on a smaller 2-socket
16-core 32-thread E5-2620 v4 system.

  vCPUs    pvqlock     hybrid pvqlock    unfair lock
  -----    -------     --------------    -----------
    16      436.8s         433.4s          435.6s
    36      366.2s         364.8s          364.5s
    48      423.6s         376.3s          370.2s
    64      433.1s         376.6s          376.8s

Again, the performance of the hybrid pvqspinlock was comparable to
that of the unfair lock.

Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Reviewed-by: Juergen Gross &lt;jgross@suse.com&gt;
Reviewed-by: Eduardo Valentin &lt;eduval@amazon.com&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Cc: Radim Krčmář &lt;rkrcmar@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/1510089486-3466-1-git-send-email-longman@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>locking/rwlocks: Fix comments</title>
<updated>2017-11-07T11:22:21Z</updated>
<author>
<name>Cheng Jian</name>
<email>cj.chengjian@huawei.com</email>
</author>
<published>2017-11-03T10:59:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f791dd2589d7e217625d7e411621a5bac71cbf69'/>
<id>urn:sha1:f791dd2589d7e217625d7e411621a5bac71cbf69</id>
<content type='text'>
- fix the list of locking API headers in kernel/locking/spinlock.c
- fix an #endif comment

Signed-off-by: Cheng Jian &lt;cj.chengjian@huawei.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: huawei.libin@huawei.com
Cc: xiexiuqi@huawei.com
Link: http://lkml.kernel.org/r/1509706788-152547-1-git-send-email-cj.chengjian@huawei.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'linus' into locking/core, to resolve conflicts</title>
<updated>2017-11-07T09:32:44Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2017-11-07T09:32:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8c5db92a705d9e2c986adec475980d1120fa07b4'/>
<id>urn:sha1:8c5db92a705d9e2c986adec475980d1120fa07b4</id>
<content type='text'>
Conflicts:
	include/linux/compiler-clang.h
	include/linux/compiler-gcc.h
	include/linux/compiler-intel.h
	include/uapi/linux/stddef.h

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
