summaryrefslogtreecommitdiff
path: root/include/asm-ppc64/spinlock.h
AgeCommit message (Collapse)Author
2005-11-19powerpc: Merge spinlock.hPaul Mackerras
The result is mostly similar to the original ppc64 version but with some adaptations for 32-bit compilation. include/asm-ppc64 is now empty! Signed-off-by: Paul Mackerras <paulus@samba.org>
2005-11-01merge filename and modify references to iseries/hv_call.hKelly Daly
Signed-off-by: Kelly Daly <kelly@au.ibm.com>
2005-09-10[PATCH] spinlock consolidationIngo Molnar
This patch (written by me and also containing many suggestions of Arjan van de Ven) does a major cleanup of the spinlock code. It does the following things: - consolidates and enhances the spinlock/rwlock debugging code - simplifies the asm/spinlock.h files - encapsulates the raw spinlock type and moves generic spinlock features (such as ->break_lock) into the generic code. - cleans up the spinlock code hierarchy to get rid of the spaghetti. Most notably there's now only a single variant of the debugging code, located in lib/spinlock_debug.c. (previously we had one SMP debugging variant per architecture, plus a separate generic one for UP builds) Also, i've enhanced the rwlock debugging facility, it will now track write-owners. There is new spinlock-owner/CPU-tracking on SMP builds too. All locks have lockup detection now, which will work for both soft and hard spin/rwlock lockups. The arch-level include files now only contain the minimally necessary subset of the spinlock code - all the rest that can be generalized now lives in the generic headers: include/asm-i386/spinlock_types.h | 16 include/asm-x86_64/spinlock_types.h | 16 I have also split up the various spinlock variants into separate files, making it easier to see which does what. The new layout is: SMP | UP ----------------------------|----------------------------------- asm/spinlock_types_smp.h | linux/spinlock_types_up.h linux/spinlock_types.h | linux/spinlock_types.h asm/spinlock_smp.h | linux/spinlock_up.h linux/spinlock_api_smp.h | linux/spinlock_api_up.h linux/spinlock.h | linux/spinlock.h /* * here's the role of the various spinlock/rwlock related include files: * * on SMP builds: * * asm/spinlock_types.h: contains the raw_spinlock_t/raw_rwlock_t and the * initializers * * linux/spinlock_types.h: * defines the generic type and initializers * * asm/spinlock.h: contains the __raw_spin_*()/etc. lowlevel * implementations, mostly inline assembly code * * (also included on UP-debug builds:) * * linux/spinlock_api_smp.h: * contains the prototypes for the _spin_*() APIs. * * linux/spinlock.h: builds the final spin_*() APIs. * * on UP builds: * * linux/spinlock_type_up.h: * contains the generic, simplified UP spinlock type. * (which is an empty structure on non-debug builds) * * linux/spinlock_types.h: * defines the generic type and initializers * * linux/spinlock_up.h: * contains the __raw_spin_*()/etc. version of UP * builds. (which are NOPs on non-debug, non-preempt * builds) * * (included on UP-non-debug builds:) * * linux/spinlock_api_up.h: * builds the _spin_*() APIs. * * linux/spinlock.h: builds the final spin_*() APIs. */ All SMP and UP architectures are converted by this patch. arm, i386, ia64, ppc, ppc64, s390/s390x, x64 was build-tested via crosscompilers. m32r, mips, sh, sparc, have not been tested yet, but should be mostly fine. From: Grant Grundler <grundler@parisc-linux.org> Booted and lightly tested on a500-44 (64-bit, SMP kernel, dual CPU). Builds 32-bit SMP kernel (not booted or tested). I did not try to build non-SMP kernels. That should be trivial to fix up later if necessary. I converted bit ops atomic_hash lock to raw_spinlock_t. Doing so avoids some ugly nesting of linux/*.h and asm/*.h files. Those particular locks are well tested and contained entirely inside arch specific code. I do NOT expect any new issues to arise with them. If someone does ever need to use debug/metrics with them, then they will need to unravel this hairball between spinlocks, atomic ops, and bit ops that exist only because parisc has exactly one atomic instruction: LDCW (load and clear word). From: "Luck, Tony" <tony.luck@intel.com> ia64 fix Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjanv@infradead.org> Signed-off-by: Grant Grundler <grundler@parisc-linux.org> Cc: Matthew Wilcox <willy@debian.org> Signed-off-by: Hirokazu Takata <takata@linux-m32r.org> Signed-off-by: Mikael Pettersson <mikpe@csd.uu.se> Signed-off-by: Benoit Boissinot <benoit.boissinot@ens-lyon.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-01[PATCH] ppc64: reverse prediction on spinlock busy loop codeJake Moilanen
On our raw spinlocks, we currently have an attempt at the lock, and if we do not get it we enter a spin loop. This spinloop will likely continue for awhile, and we pridict likely. Shouldn't we predict that we will get out of the loop so our next instructions are already prefetched. Even when we miss because the lock is still held, it won't matter since we are waiting anyways. I did a couple quick benchmarks, but the results are inconclusive. 16-way 690 running specjbb with original code # ./specjbb 3000 16 1 1 19 30 120 ... Valid run, Score is 59282 16-way 690 running specjbb with unlikely code # ./specjbb 3000 16 1 1 19 30 120 ... Valid run, Score is 59541 I saw a smaller increase on a JS20 (~1.6%) JS20 specjbb w/ original code # ./specjbb 400 2 1 1 19 30 120 ... Valid run, Score is 20460 JS20 specjbb w/ unlikely code # ./specjbb 400 2 1 1 19 30 120 ... Valid run, Score is 20803 Anton said: Mispredicting the spinlock busy loop also means we slow down the rate at which we do the loads which can be good for heavily contended locks. Note: There are some gcc issues with our default build and branch prediction, but a CONFIG_POWER4_ONLY build should emit them correctly. I'm working with Alan Modra on it now. Signed-off-by: Jake Moilanen <moilanen@austin.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-01-19ppc64: rwlock *_can_lock() primitivesLinus Torvalds
2005-01-19Remove broken-as-designed "rwlock_is_locked()" macroLinus Torvalds
rwlocks can't be "locked". They can be "locked for read", or "locked for write", but not both. The confusion this caused is evident in the long discussion about this on linux-kernel ;)
2005-01-11[PATCH] PPC64 had _raw_read_trylock alreadyPaul Mackerras
Ingo presumably didn't notice that ppc64 already had a functional _raw_read_trylock when he added the #define to use the generic version. This just removes the #define so we use the ppc64-specific version again. Signed-off-by: Paul Mackerras <paulus@samba.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-01-07[PATCH] improve preemption on SMPIngo Molnar
SMP locking latencies are one of the last architectural problems that cause millisec-category scheduling delays. CONFIG_PREEMPT tries to solve some of the SMP issues but there are still lots of problems remaining: spinlocks nested at multiple levels, spinning with irqs turned off, and non-nested spinning with preemption turned off permanently. The nesting problem goes like this: if a piece of kernel code (e.g. the MM or ext3's journalling code) does the following: spin_lock(&spinlock_1); ... spin_lock(&spinlock_2); ... then even with CONFIG_PREEMPT enabled, current kernels may spin on spinlock_2 indefinitely. A number of critical sections break their long paths by using cond_resched_lock(), but this does not break the path on SMP, because need_resched() *of the other CPU* is not set so cond_resched_lock() doesnt notice that a reschedule is due. to solve this problem i've introduced a new spinlock field, lock->break_lock, which signals towards the holding CPU that a spinlock-break is requested by another CPU. This field is only set if a CPU is spinning in a spinlock function [at any locking depth], so the default overhead is zero. I've extended cond_resched_lock() to check for this flag - in this case we can also save a reschedule. I've added the lock_need_resched(lock) and need_lockbreak(lock) methods to check for the need to break out of a critical section. Another latency problem was that the stock kernel, even with CONFIG_PREEMPT enabled, didnt have any spin-nicely preemption logic for the following, commonly used SMP locking primitives: read_lock(), spin_lock_irqsave(), spin_lock_irq(), spin_lock_bh(), read_lock_irqsave(), read_lock_irq(), read_lock_bh(), write_lock_irqsave(), write_lock_irq(), write_lock_bh(). Only spin_lock() and write_lock() [the two simplest cases] where covered. In addition to the preemption latency problems, the _irq() variants in the above list didnt do any IRQ-enabling while spinning - possibly resulting in excessive irqs-off sections of code! preempt-smp.patch fixes all these latency problems by spinning irq-nicely (if possible) and by requesting lock-breaks if needed. Two architecture-level changes were necessary for this: the addition of the break_lock field to spinlock_t and rwlock_t, and the addition of the _raw_read_trylock() function. Testing done by Mark H Johnson and myself indicate SMP latencies comparable to the UP kernel - while they were basically indefinitely high without this patch. i successfully test-compiled and test-booted this patch ontop of BK-curr using the following .config combinations: SMP && PREEMPT, !SMP && PREEMPT, SMP && !PREEMPT and !SMP && !PREEMPT on x86, !SMP && !PREEMPT and SMP && PREEMPT on x64. I also test-booted x86 with the generic_read_trylock function to check that it works fine. Essentially the same patch has been in testing as part of the voluntary-preempt patches for some time already. NOTE to architecture maintainers: generic_raw_read_trylock() is a crude version that should be replaced with the proper arch-optimized version ASAP. From: Hugh Dickins <hugh@veritas.com> The i386 and x86_64 _raw_read_trylocks in preempt-smp.patch are too successful: atomic_read() returns a signed integer. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Hugh Dickins <hugh@veritas.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-01-04[PATCH] ppc64: remove StudlyCaps from lppaca structureStephen Rothwell
This patch just renames all the fields (and the structure name) of the lppaca structure to rid us of some more StudyCaps. Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-09-07[PATCH] ppc64: fix __rw_yield prototypeAnton Blanchard
From: Nathan Lynch <nathanl@austin.ibm.com> Hit this in latest bk: include/asm/spinlock.h: In function `_raw_read_lock': include/asm/spinlock.h:198: warning: passing arg 1 of `__rw_yield' from incompatible pointer type include/asm/spinlock.h: In function `_raw_write_lock': include/asm/spinlock.h:255: warning: passing arg 1 of `__rw_yield' from incompatible pointer type This seems to have been broken by the out-of-line spinlocks patch. You won't hit it unless you've enabled CONFIG_PPC_SPLPAR. Use the rwlock_t for the argument type, and move the definition of rwlock_t up next to that of spinlock_t. Signed-off-by: Nathan Lynch <nathanl@austin.ibm.com> Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-09-04[PATCH] out-of-line locks / ppc64Zwane Mwaikambo
Signed-off-by: Zwane Mwaikambo <zwane@fsmlabs.com> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-07-01[PATCH] ppc64: PACA cleanupDavid Gibson
Cleanup the PPC64 PACA structure. It was previously a big mess of unecessary fields, overengineered cache layout and uninformative comments. This is essentially a rewrite of include/asm-pp64/paca.h with associated changes elsewhere. The patch: - Removes unused PACA fields - Removes uneeded #includes - Uses gcc attributes instead of explicit padding to get the desired cacheline layout, also rethinks the layout and comments accordingly. - Better comments where asm or firmware dependencies apply non-obvious layout constraints. - Splits up the pointless STAB structure, letting us move its elements independently. - Uses offsetof instead of hardcoded offset in spinlocks. - Eradicates xStudlyCaps identifiers - Replaces PACA guard page with an explicitly defined emergency stack (removing more than NR_CPUS pages from the initialized data segment). Cc: Paul Mackerras <paulus@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-05-24[PATCH] ppc64: fix inline spinlocksAndrew Morton
From: Anton Blanchard <anton@samba.org> In _raw_spin_lock_flags we were branching to the wrong spot and would restore random stuff to the MSR.
2004-05-20[PATCH] ppc64: fix inline version of _raw_spin_trylockAndrew Morton
From: Paul Mackerras <paulus@samba.org> When I added the out-of-line spinlocks on PPC64, I inadvertently introduced a bug in the inline version of _raw_spin_trylock, where it returns the opposite of what it should return. The patch below fixes it.
2004-05-14[PATCH] ppc64: strengthen I/O and memory barriersAndrew Morton
From: Paul Mackerras <paulus@samba.org> After I sent the recent patch to include/asm-ppc64/io.h which put stronger barriers in the I/O accessor macros, Paul McKenney pointed out to me that a writex/outx could still slide out from inside a spinlocked region. This patch makes the barriers a bit stronger so that this can't happen. It means that we need to use a sync instruction for wmb (a full "heavyweight" sync), since drivers rely on wmb for ordering between writes to system memory and writes to a device. I have left smb_wmb() as a lighter-weight barrier that orders stores, and doesn't impose an ordering between cacheable and non-cacheable accesses (the amusingly-named eieio instruction). I am assuming here that smp_wmb is only used for ordering stores to system memory so that another cpu will see them in order. It can't be used for enforcing any ordering that a device will see, because it is just a gcc barrier on UP. This also changes the spinlock/rwlock unlock code to use lwsync ("light-weight sync") rather than eieio, since eieio doesn't order loads, and we need to ensure that loads stay inside the spinlocked region.
2004-05-10[PATCH] Un-inline spinlocks on ppc64Andrew Morton
From: Paul Mackerras <paulus@samba.org> The patch below moves the ppc64 spinlocks and rwlocks out of line and into arch/ppc64/lib/locks.c, and implements _raw_spin_lock_flags for ppc64. Part of the motivation for moving the spinlocks and rwlocks out of line was that I needed to add code to the slow paths to yield the processor to the hypervisor on systems with shared processors. On these systems, a cpu as seen by the kernel is a virtual processor that is not necessarily running full-time on a real physical cpu. If we are spinning on a lock which is held by another virtual processor which is not running at the moment, we are just wasting time. In such a situation it is better to do a hypervisor call to ask it to give the rest of our time slice to the lock holder so that forward progress can be made. The one problem with out-of-line spinlock routines is that lock contention will show up in profiles in the spin_lock etc. routines rather than in the callers, as it does with inline spinlocks. I have added a CONFIG_SPINLINE config option for people that want to do profiling. In the longer term, Anton is talking about teaching the profiling code to attribute samples in the spin lock routines to the routine's caller. This patch reduces the kernel by about 80kB on my G5. With inline spinlocks selected, the kernel gets about 4kB bigger than without the patch, because _raw_spin_lock_flags is slightly bigger than _raw_spin_lock. This patch depends on the patch from Keith Owens to add _raw_spin_lock_flags.
2004-05-10[PATCH] Allow architectures to reenable interrupts on contended spinlocksAndrew Morton
From: Keith Owens <kaos@sgi.com> As requested by Linus, update all architectures to add the common infrastructure. Tested on ia64 and i386. Enable interrupts while waiting for a disabled spinlock, but only if interrupts were enabled before issuing spin_lock_irqsave(). This patch consists of three sections :- * An architecture independent change to call _raw_spin_lock_flags() instead of _raw_spin_lock() when the flags are available. * An ia64 specific change to implement _raw_spin_lock_flags() and to define _raw_spin_lock(lock) as _raw_spin_lock_flags(lock, 0) for the ASM_SUPPORTED case. * Patches for all other architectures and for ia64 with !ASM_SUPPORTED to map _raw_spin_lock_flags(lock, flags) to _raw_spin_lock(lock). Architecture maintainers can define _raw_spin_lock_flags() to do something useful if they want to enable interrupts while waiting for a disabled spinlock.
2002-09-11ppc64: add rwlock_is_lockedAnton Blanchard
2002-08-13Use HMT hints in cpu_relaxAnton Blanchard
2002-05-02ppc64: Only implement thread priority macros on HMT or iSeries kernelsAnton Blanchard
Drop back to eieio in spinlocks for the moment due to performance issues of sync on power3
2002-03-12ppc64: add large page bit, use lwsync instead of eieio for spinlockAnton Blanchard
release.
2002-02-15Add ppc64 support. This includes both pSeries (RS/6000) andAnton Blanchard
iSeries (AS/400). There are no changes outside of include/asm-ppc64 and arch/ppc64 in this changeset.