user/sven/linux.git/kernel/locking/qrwlock.c, branch v4.14

kernel/locking: Fix compile error with qrwlock.c

2017-05-25T19:06:50Z

Saw these compile errors on SPARC when queued rwlock feature is enabled.

  CC kernel/locking/qrwlock.o
  kernel/locking/qrwlock.c: In function ‘queued_read_lock_slowpath’:
  kernel/locking/qrwlock.c:89: error: implicit declaration of function ‘arch_spin_lock’
  kernel/locking/qrwlock.c:102: error: implicit declaration of function ‘arch_spin_unlock’
  make[4]: *** [kernel/locking/qrwlock.o] Error 1

Include spinlock.h in qrwlock.c to fix it.

Signed-off-by: Babu Moger Reviewed-by: Håkon Bugge Reviewed-by: Jane Chu Reviewed-by: Shannon Nelson Reviewed-by: Vijay Kumar Signed-off-by: David S. Miller

locking/core: Remove cpu_relax_lowlatency() users

2016-11-16T09:15:10Z

With the s390 special case of a yielding cpu_relax() implementation gone, we can now remove all users of cpu_relax_lowlatency() and replace them with cpu_relax(). Signed-off-by: Christian Borntraeger Signed-off-by: Peter Zijlstra (Intel) Cc: Catalin Marinas Cc: Heiko Carstens Cc: Linus Torvalds Cc: Martin Schwidefsky Cc: Nicholas Piggin Cc: Noam Camus Cc: Peter Zijlstra Cc: Russell King Cc: Thomas Gleixner Cc: Will Deacon Cc: linuxppc-dev@lists.ozlabs.org Cc: virtualization@lists.linux-foundation.org Cc: xen-devel@lists.xenproject.org Link: http://lkml.kernel.org/r/1477386195-32736-5-git-send-email-borntraeger@de.ibm.com Signed-off-by: Ingo Molnar
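As a rough userspace illustration of the kind of busy-wait loop this change touches, the sketch below spins on a word and calls a relax hint on each iteration. `sketch_cpu_relax()` and `sketch_spin_until_clear()` are hypothetical names; the stand-in is only a compiler barrier and is not the kernel's cpu_relax(), and the `max_spins` bound exists only so the example terminates.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical userspace stand-in for the kernel's cpu_relax():
 * only a compiler barrier, with no architecture-specific pause hint. */
static inline void sketch_cpu_relax(void)
{
	atomic_signal_fence(memory_order_seq_cst);
}

/* Spin until *word reads as zero or max_spins iterations have passed.
 * Returns the number of iterations spent spinning. */
static unsigned int sketch_spin_until_clear(atomic_uint *word,
					    unsigned int max_spins)
{
	unsigned int spins = 0;

	assert(word);
	while (atomic_load_explicit(word, memory_order_acquire) != 0 &&
	       spins < max_spins) {
		sketch_cpu_relax();	/* single relax hint for every caller */
		spins++;
	}
	return spins;
}
```

With one hint function, callers no longer have to choose between a yielding and a low-latency variant.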

locking/atomic, arch/qrwlock: Employ atomic_fetch_add_acquire()

2016-06-16T08:48:34Z

The only reason for the current code is to make GCC emit only the "LOCK XADD" instruction on x86 (and not do a pointless extra ADD on the result); do so more cleanly with atomic_fetch_add_acquire(). Signed-off-by: Peter Zijlstra (Intel) Acked-by: Waiman Long Cc: Andrew Morton Cc: Linus Torvalds Cc: Paul E. McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: linux-arch@vger.kernel.org Cc: linux-kernel@vger.kernel.org Signed-off-by: Ingo Molnar
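The equivalence this relies on can be sketched in userspace with C11 atomics: a fetch-and-add already returns the value from before the addition, so there is no need to add and then subtract the bias again. `QR_BIAS` and both function names are assumptions made for illustration, and `atomic_fetch_add_explicit()` is used here only as an analogue of the kernel's atomic_fetch_add_acquire().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed stand-in for the kernel's _QR_BIAS: one reader, counted
 * above an 8-bit writer byte. */
#define QR_BIAS (1U << 8)

/* Older idiom: obtain the post-add value, then subtract the bias to
 * recover the pre-add value. C11 has no add_return, so it is emulated
 * here as fetch_add plus the bias. */
static uint32_t old_value_via_add_return(atomic_uint *cnts)
{
	uint32_t new_value;

	assert(cnts);
	new_value = atomic_fetch_add_explicit(cnts, QR_BIAS,
					      memory_order_acquire) + QR_BIAS;
	return new_value - QR_BIAS;
}

/* Newer idiom: fetch_add hands back the pre-add value directly. */
static uint32_t old_value_via_fetch_add(atomic_uint *cnts)
{
	assert(cnts);
	return atomic_fetch_add_explicit(cnts, QR_BIAS, memory_order_acquire);
}
```

Both helpers leave the counter in the same state and return the same value; the second simply states the intent without the add/subtract round trip.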

locking/qrwlock: Rename ->lock to ->wait_lock

2015-09-18T07:27:29Z

... trivial, but reads a little nicer when we name our actual primitive 'lock'. Signed-off-by: Davidlohr Bueso Signed-off-by: Peter Zijlstra (Intel) Cc: Andrew Morton Cc: Linus Torvalds Cc: Paul E. McKenney Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Waiman Long Link: http://lkml.kernel.org/r/1442216244-4409-1-git-send-email-dave@stgolabs.net Signed-off-by: Ingo Molnar

locking/qrwlock: Make use of _{acquire|release|relaxed}() atomics

2015-08-12T09:59:06Z

The qrwlock implementation is slightly heavy in its use of memory barriers, mainly through the use of _cmpxchg() and _return() atomics, which imply full barrier semantics. This patch modifies the qrwlock code to use the more relaxed atomic routines so that we can reduce the unnecessary barrier overhead on weakly-ordered architectures. Signed-off-by: Will Deacon Signed-off-by: Peter Zijlstra (Intel) Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Cc: Waiman.Long@hp.com Cc: paulmck@linux.vnet.ibm.com Link: http://lkml.kernel.org/r/1438880084-18856-7-git-send-email-will.deacon@arm.com Signed-off-by: Ingo Molnar
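A minimal userspace sketch of the idea, using C11 atomics rather than the kernel's own routines: taking the lock only needs acquire ordering and dropping it only needs release ordering, so neither operation has to be a full barrier. `QW_LOCKED` and the `sketch_*` functions are hypothetical and do not reproduce the kernel code.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Assumed value meaning "a writer holds the lock". */
#define QW_LOCKED 0xffU

/* Writer trylock: acquire ordering on success keeps the critical
 * section from being reordered before the lock is taken; a failed
 * attempt needs no ordering at all. */
static bool sketch_write_trylock(atomic_uint *cnts)
{
	unsigned int expected = 0;

	assert(cnts);
	return atomic_compare_exchange_strong_explicit(cnts, &expected,
						       QW_LOCKED,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Writer unlock: release ordering keeps the critical section from
 * being reordered after the lock is dropped. */
static void sketch_write_unlock(atomic_uint *cnts)
{
	assert(cnts);
	atomic_fetch_sub_explicit(cnts, QW_LOCKED, memory_order_release);
}
```

On strongly ordered machines such as x86 the relaxed variants typically compile to the same instructions; the saving is expected on weakly ordered architectures.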

locking/qrwlock: Reduce reader/writer to reader lock transfer latency

2015-08-03T08:57:10Z

Currently, a reader will check first to make sure that the writer mode byte is cleared before incrementing the reader count. That waiting is not really necessary. It increases the latency in the reader/writer to reader transition and reduces reader performance. This patch eliminates that waiting. It also has the side effect of reducing the chance of writer lock stealing and improving the fairness of the lock.

Using a locking microbenchmark, a 10-thread 5M locking loop of mostly readers (RW ratio = 10,000:1) has the following performance numbers on a Haswell-EX box:

  Kernel         Locking Rate (Kops/s)
  ------         ---------------------
  4.1.1          15,063,081
  4.1.1+patch    17,241,552 (+14.4%)

Signed-off-by: Waiman Long Signed-off-by: Peter Zijlstra (Intel) Cc: Andrew Morton Cc: Arnd Bergmann Cc: Douglas Hatch Cc: Linus Torvalds Cc: Paul E. McKenney Cc: Peter Zijlstra Cc: Scott J Norton Cc: Thomas Gleixner Cc: Will Deacon Link: http://lkml.kernel.org/r/1436459543-29126-2-git-send-email-Waiman.Long@hp.com Signed-off-by: Ingo Molnar

locking/qrwlock: Better optimization for interrupt context readers

2015-07-06T12:11:28Z

The qrwlock is fair in the process context, but becomes unfair in the interrupt context to support use cases like the tasklist_lock. The current code doesn't document well what happens in the interrupt context. The rspin_until_writer_unlock() will only spin if the writer has gotten the lock. If the writer is still in the waiting state, the increment in the reader count will cause the writer to remain in the waiting state and the new interrupt context reader will get the lock and return immediately. The current code, however, does an additional read of the lock value, which is not necessary as that information is already available from the fast path. This may sometimes cause an additional cacheline transfer when the lock is highly contended. This patch passes the lock value obtained in the fast path to the slow path to eliminate the additional read. It also documents the action for the interrupt context readers more clearly. Signed-off-by: Waiman Long Signed-off-by: Peter Zijlstra (Intel) Reviewed-by: Will Deacon Cc: Arnd Bergmann Cc: Douglas Hatch Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Scott J Norton Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1434729002-57724-3-git-send-email-Waiman.Long@hp.com Signed-off-by: Ingo Molnar
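The behaviour described here can be sketched in userspace as follows. The constants are modeled on the qrwlock header of that era and should be treated as assumptions, as should every `sketch_*` name; this is a simplified illustration of the interrupt-context reader path only, not the kernel function itself.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed layout: low byte is the writer mode, the rest counts readers. */
#define QW_WAITING 0x01U	/* a writer is waiting */
#define QW_LOCKED  0xffU	/* a writer holds the lock */
#define QW_WMASK   0xffU	/* writer mode mask */
#define QR_BIAS    (1U << 8)	/* one reader */

/* Spin only while a writer actually holds the lock. A writer that is
 * merely waiting does not block the reader. 'seen' is the value the
 * caller already has, so the first check costs no extra read.
 * Returns how many times the lock word had to be re-read. */
static unsigned int sketch_rspin_until_writer_unlock(atomic_uint *cnts,
						     uint32_t seen)
{
	unsigned int rereads = 0;

	while ((seen & QW_WMASK) == QW_LOCKED) {
		seen = atomic_load_explicit(cnts, memory_order_acquire);
		rereads++;
	}
	return rereads;
}

/* Interrupt-context reader: bump the reader count in the fast path and
 * pass the observed value on to the slow path instead of re-reading. */
static unsigned int sketch_irq_read_lock(atomic_uint *cnts)
{
	uint32_t seen;

	assert(cnts);
	seen = atomic_fetch_add_explicit(cnts, QR_BIAS, memory_order_acquire);
	if (!(seen & QW_WMASK))
		return 0;	/* no writer at all: fast path succeeds */
	return sketch_rspin_until_writer_unlock(cnts, seen);
}
```

With the word initially equal to `QW_WAITING`, the reader returns without re-reading the lock word, which is the case the message singles out.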

locking/qrwlock: Rename functions to queued_*()

2015-07-06T12:11:27Z

To sync up with the naming convention used in qspinlock, all the qrwlock functions were renamed to start with "queued" instead of "queue". Signed-off-by: Waiman Long Signed-off-by: Peter Zijlstra (Intel) Cc: Arnd Bergmann Cc: Douglas Hatch Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Scott J Norton Cc: Thomas Gleixner Cc: Will Deacon Link: http://lkml.kernel.org/r/1434729002-57724-2-git-send-email-Waiman.Long@hp.com Signed-off-by: Ingo Molnar

locking/qrwlock: Don't contend with readers when setting _QW_WAITING

2015-06-19T07:45:38Z

The current cmpxchg() loop in setting the _QW_WAITING flag for writers in queue_write_lock_slowpath() will contend with incoming readers, possibly causing extra cmpxchg() operations that are wasteful. This patch changes the code to do a byte cmpxchg() to eliminate contention with new readers.

A multithreaded microbenchmark running a 5M read_lock/write_lock loop on an 8-socket 80-core Westmere-EX machine running a 4.0 based kernel with the qspinlock patch has the following execution times (in ms) with and without the patch:

With R:W ratio = 5:1

  Threads   w/o patch   with patch   % change
  -------   ---------   ----------   --------
     2          990         895        -9.6%
     3         2136        1912       -10.5%
     4         3166        2830       -10.6%
     5         3953        3629        -8.2%
     6         4628        4405        -4.8%
     7         5344        5197        -2.8%
     8         6065        6004        -1.0%
     9         6826        6811        -0.2%
    10         7599        7599         0.0%
    15         9757        9766        +0.1%
    20        13767       13817        +0.4%

With a small number of contending threads, this patch can improve locking performance by up to 10%. With more contending threads, however, the gain diminishes.

Signed-off-by: Waiman Long Signed-off-by: Peter Zijlstra (Intel) Cc: Andrew Morton Cc: Arnd Bergmann Cc: Borislav Petkov Cc: Douglas Hatch Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Paul E. McKenney Cc: Peter Zijlstra Cc: Scott J Norton Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1433863153-30722-3-git-send-email-Waiman.Long@hp.com Signed-off-by: Ingo Molnar
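To illustrate why a byte-wide compare-and-exchange avoids the contention, the userspace sketch below overlays a writer byte on the low 8 bits of a 32-bit count word. The union layout, the constants, and the `sketch_*` names are assumptions made for this example; it relies on the GCC/Clang `__atomic` builtins, since mixed-size atomic access to one word is outside ISO C atomics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define QW_WAITING 0x01U	/* assumed "writer is waiting" value */
#define QR_BIAS    (1U << 8)	/* assumed increment for one reader */

/* Hypothetical layout: bytes[] overlays the 32-bit count word. */
union sketch_qrwlock {
	uint32_t cnts;
	uint8_t bytes[4];
};

/* Index of the least-significant byte, which holds the writer mode. */
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
#define WMODE_IDX 3
#else
#define WMODE_IDX 0
#endif

/* Set the waiting flag with a byte-wide cmpxchg. Only the writer byte
 * is compared, so a changing reader count cannot make it fail. */
static bool sketch_set_waiting(union sketch_qrwlock *l)
{
	uint8_t expected = 0;

	assert(l);
	return __atomic_compare_exchange_n(&l->bytes[WMODE_IDX], &expected,
					   (uint8_t)QW_WAITING, false,
					   __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

/* A new reader bumps the upper bits of the same word. */
static void sketch_reader_arrives(union sketch_qrwlock *l)
{
	assert(l);
	__atomic_fetch_add(&l->cnts, QR_BIAS, __ATOMIC_ACQUIRE);
}
```

A word-wide compare-and-exchange would have to match the exact reader count and retry whenever a reader arrived in between; the byte-wide one succeeds regardless of the upper 24 bits.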

locking/qrwlock: Rename QUEUE_RWLOCK to QUEUED_RWLOCKS

2015-05-12T07:46:00Z

To be consistent with the queued spinlocks which use CONFIG_QUEUED_SPINLOCKS config parameter, the one for the queued rwlocks is now renamed to CONFIG_QUEUED_RWLOCKS. Signed-off-by: Waiman Long Cc: Borislav Petkov Cc: Douglas Hatch Cc: H. Peter Anvin Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Scott J Norton Cc: Thomas Gleixner Link: http://lkml.kernel.org/r/1431367031-36697-1-git-send-email-Waiman.Long@hp.com Signed-off-by: Ingo Molnar