summaryrefslogtreecommitdiff
path: root/include/linux/wait.h
AgeCommit message (Collapse)Author
2013-08-29SCSI: zfcp: fix lock imbalance by reworking request queue lockingMartin Peschke
commit d79ff142624e1be080ad8d09101f7004d79c36e1 upstream. This patch adds wait_event_interruptible_lock_irq_timeout(), which is a straight-forward descendant of wait_event_interruptible_timeout() and wait_event_interruptible_lock_irq(). The zfcp driver used to call wait_event_interruptible_timeout() in combination with some intricate and error-prone locking. Using wait_event_interruptible_lock_irq_timeout() as a replacement nicely cleans up that locking. This rework removes a situation that resulted in a locking imbalance in zfcp_qdio_sbal_get(): BUG: workqueue leaked lock or atomic: events/1/0xffffff00/10 last function: zfcp_fc_wka_port_offline+0x0/0xa0 [zfcp] It was introduced by commit c2af7545aaff3495d9bf9a7608c52f0af86fb194 "[SCSI] zfcp: Do not wait for SBALs on stopped queue", which had a new code path related to ZFCP_STATUS_ADAPTER_QDIOUP that took an early exit without a required lock being held. The problem occured when a special, non-SCSI I/O request was being submitted in process context, when the adapter's queues had been torn down. In this case the bug surfaced when the Fibre Channel port connection for a well-known address was closed during a concurrent adapter shut-down procedure, which is a rare constellation. This patch also fixes these warnings from the sparse tool (make C=1): drivers/s390/scsi/zfcp_qdio.c:224:12: warning: context imbalance in 'zfcp_qdio_sbal_check' - wrong count at exit drivers/s390/scsi/zfcp_qdio.c:244:5: warning: context imbalance in 'zfcp_qdio_sbal_get' - unexpected unlock Last but not least, we get rid of that crappy lock-unlock-lock sequence at the beginning of the critical section. It is okay to call zfcp_erp_adapter_reopen() with req_q_lock held. Reported-by: Mikulas Patocka <mpatocka@redhat.com> Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com> Signed-off-by: Martin Peschke <mpeschke@linux.vnet.ibm.com> Signed-off-by: Steffen Maier <maier@linux.vnet.ibm.com> Signed-off-by: James Bottomley <JBottomley@Parallels.com> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2013-06-07wait: fix false timeouts when using wait_event_timeout()Imre Deak
commit 4c663cfc523a88d97a8309b04a089c27dc57fd7e upstream. Many callers of the wait_event_timeout() and wait_event_interruptible_timeout() expect that the return value will be positive if the specified condition becomes true before the timeout elapses. However, at the moment this isn't guaranteed. If the wake-up handler is delayed enough, the time remaining until timeout will be calculated as 0 - and passed back as a return value - even if the condition became true before the timeout has passed. Fix this by returning at least 1 if the condition becomes true. This semantic is in line with what wait_for_condition_timeout() does; see commit bb10ed09 ("sched: fix wait_for_completion_timeout() spurious failure under heavy load"). Daniel said "We have 3 instances of this bug in drm/i915. One case even where we switch between the interruptible and not interruptible wait_event_timeout variants, foolishly presuming they have the same semantics. I very much like this." One such bug is reported at https://bugs.freedesktop.org/show_bug.cgi?id=64133 Signed-off-by: Imre Deak <imre.deak@intel.com> Acked-by: Daniel Vetter <daniel.vetter@ffwll.ch> Acked-by: David Howells <dhowells@redhat.com> Acked-by: Jens Axboe <axboe@kernel.dk> Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> Cc: Dave Jones <davej@redhat.com> Cc: Lukas Czerner <lczerner@redhat.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2012-03-28Remove all #inclusions of asm/system.hDavid Howells
Remove all #inclusions of asm/system.h preparatory to splitting and killing it. Performed with the following command: perl -p -i -e 's!^#\s*include\s*<asm/system[.]h>.*\n!!' `grep -Irl '^#\s*include\s*<asm/system[.]h>' *` Signed-off-by: David Howells <dhowells@redhat.com>
2012-03-01sched/wait: Add __wake_up_all_locked() APIThomas Gleixner
For code which protects the waitqueue itself with another lock it makes no sense to acquire the waitqueue lock for wakeup all. Provide __wake_up_all_locked(). This is an optimization on the vanilla kernel (to be used by the PCI code) and an important semantic distinction on -rt. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Link: http://lkml.kernel.org/n/tip-ux6m4b8jonb9inx8xafh77ds@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2011-12-21lockdep/waitqueues: Add better annotationPeter Zijlstra
-> #2 (&tty->write_wait){-.-...}: is a lot more informative than: -> #2 (key#19){-.....}: Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Link: http://lkml.kernel.org/n/tip-8zpopbny51023rdb0qq67eye@git.kernel.org Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-10-05wait: using uninitialized member of wait queueEvgeny Kuznetsov
The "flags" member of "struct wait_queue_t" is used in several places in the kernel code without beeing initialized by init_wait(). "flags" is used in bitwise operations. If "flags" not initialized then unexpected behaviour may take place. Incorrect flags might used later in code. Added initialization of "wait_queue_t.flags" with zero value into "init_wait". Signed-off-by: Evgeny Kuznetsov <EXT-Eugeny.Kuznetsov@nokia.com> [ The bit we care about does end up being initialized by both prepare_to_wait() and add_to_wait_queue(), so this doesn't seem to cause actual bugs, but is definitely the right thing to do -Linus ] Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-20Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6: (229 commits) USB: remove unused usb_buffer_alloc and usb_buffer_free macros usb: musb: update gfp/slab.h includes USB: ftdi_sio: fix legacy SIO-device header USB: kl5usb105: reimplement using generic framework USB: kl5usb105: minor clean ups USB: kl5usb105: fix memory leak USB: io_ti: use kfifo to implement write buffering USB: io_ti: remove unsused private counter USB: ti_usb: use kfifo to implement write buffering USB: ir-usb: fix incorrect write-buffer length USB: aircable: fix incorrect write-buffer length USB: safe_serial: straighten out read processing USB: safe_serial: reimplement read using generic framework USB: safe_serial: reimplement write using generic framework usb-storage: always print quirks USB: usb-storage: trivial debug improvements USB: oti6858: use port write fifo USB: oti6858: use kfifo to implement write buffering USB: cypress_m8: use kfifo to implement write buffering USB: cypress_m8: remove unused drain define ... Fix up conflicts (due to usb_buffer_alloc/free renaming) in drivers/input/tablet/acecad.c drivers/input/tablet/kbtab.c drivers/input/tablet/wacom_sys.c drivers/media/video/gspca/gspca.c sound/usb/usbaudio.c
2010-05-20wait_event_interruptible_locked() interfaceMichal Nazarewicz
New wait_event_interruptible{,_exclusive}_locked{,_irq} macros added. They work just like versions without _locked* suffix but require the wait queue's lock to be held. Also __wake_up_locked() is now exported as to pair it with the above macros. The use case of this new facility is when one uses wait queue's lock to protect a data structure. This may be advantageous if the structure needs to be protected by a spinlock anyway. In particular, with additional spinlock the following code has to be used to wait for a condition: spin_lock(&data.lock); ... for (ret = 0; !ret && !(condition); ) { spin_unlock(&data.lock); ret = wait_event_interruptible(data.wqh, (condition)); spin_lock(&data.lock); } ... spin_unlock(&data.lock); This looks bizarre plus wait_event_interruptible() locks the wait queue's lock anyway so there is a unlock+lock sequence where it could be avoided. To avoid those problems and benefit from wait queue's lock, a code similar to the following should be used: /* Waiting */ spin_lock(&data.wqh.lock); ... ret = wait_event_interruptible_locked(data.wqh, (condition)); ... spin_unlock(&data.wqh.lock); /* Waiting exclusively */ spin_lock(&data.whq.lock); ... ret = wait_event_interruptible_exclusive_locked(data.whq, (condition)); ... spin_unlock(&data.whq.lock); /* Waking up */ spin_lock(&data.wqh.lock); ... wake_up_locked(&data.wqh); ... spin_unlock(&data.wqh.lock); When spin_lock_irq() is used matching versions of macros need to be used (*_locked_irq()). Signed-off-by: Michal Nazarewicz <m.nazarewicz@samsung.com> Cc: Kyungmin Park <kyungmin.park@samsung.com> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Ingo Molnar <mingo@elte.hu> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Takashi Iwai <tiwai@suse.de> Cc: David Howells <dhowells@redhat.com> Cc: Andreas Herrmann <andreas.herrmann3@amd.com> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: Mike Galbraith <efault@gmx.de> Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2010-05-11sched, wait: Use wrapper functionsChangli Gao
epoll should not touch flags in wait_queue_t. This patch introduces a new function __add_wait_queue_exclusive(), for the users, who use wait queue as a LIFO queue. __add_wait_queue_tail_exclusive() is introduced too instead of add_wait_queue_exclusive_locked(). remove_wait_queue_locked() is removed, as it is a duplicate of __remove_wait_queue(), disliked by users, and with less users. Signed-off-by: Changli Gao <xiaosuo@gmail.com> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Alexander Viro <viro@zeniv.linux.org.uk> Cc: Paul Menage <menage@google.com> Cc: Li Zefan <lizf@cn.fujitsu.com> Cc: Davide Libenzi <davidel@xmailserver.org> Cc: <containers@lists.linux-foundation.org> LKML-Reference: <1273214006-2979-1-git-send-email-xiaosuo@gmail.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-09-15sched: Rename sync argumentsPeter Zijlstra
In order to extend the functions to have more than 1 flag (sync), rename the argument to flags, and explicitly define a WF_ space for individual flags. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <new-submission> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-08-10locking, sched: Give waitqueue spinlocks their own lockdep classesPeter Zijlstra
Give waitqueue spinlocks their own lockdep classes when they are initialised from init_waitqueue_head(). This means that struct wait_queue::func functions can operate other waitqueues. This is used by CacheFiles to catch the page from a backing fs being unlocked and to wake up another thread to take a copy of it. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: David Howells <dhowells@redhat.com> Tested-by: Takashi Iwai <tiwai@suse.de> Cc: linux-cachefs@redhat.com Cc: torvalds@osdl.org Cc: akpm@linux-foundation.org LKML-Reference: <20090810113305.17284.81508.stgit@warthog.procyon.org.uk> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-05-11Merge commit 'v2.6.30-rc5' into sched/coreIngo Molnar
Merge reason: sched/core was on .30-rc1 before, update to latest fixes Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-28net: Avoid extra wakeups of threads blocked in wait_for_packet()Eric Dumazet
In 2.6.25 we added UDP mem accounting. This unfortunatly added a penalty when a frame is transmitted, since we have at TX completion time to call sock_wfree() to perform necessary memory accounting. This calls sock_def_write_space() and utimately scheduler if any thread is waiting on the socket. Thread(s) waiting for an incoming frame was scheduled, then had to sleep again as event was meaningless. (All threads waiting on a socket are using same sk_sleep anchor) This adds lot of extra wakeups and increases latencies, as noted by Christoph Lameter, and slows down softirq handler. Reference : http://marc.info/?l=linux-netdev&m=124060437012283&w=2 Fortunatly, Davide Libenzi recently added concept of keyed wakeups into kernel, and particularly for sockets (see commit 37e5540b3c9d838eb20f2ca8ea2eb8072271e403 epoll keyed wakeups: make sockets use keyed wakeups) Davide goal was to optimize epoll, but this new wakeup infrastructure can help non epoll users as well, if they care to setup an appropriate handler. This patch introduces new DEFINE_WAIT_FUNC() helper and uses it in wait_for_packet(), so that only relevant event can wakeup a thread blocked in this function. Trace of function calls from bnx2 TX completion bnx2_poll_work() is : __kfree_skb() skb_release_head_state() sock_wfree() sock_def_write_space() __wake_up_sync_key() __wake_up_common() receiver_wake_function() : Stops here since thread is waiting for an INPUT Reported-by: Christoph Lameter <cl@linux.com> Signed-off-by: Eric Dumazet <dada1@cosmosbay.com> Signed-off-by: David S. Miller <davem@davemloft.net>
2009-04-14wait: don't use __wake_up_common()Johannes Weiner
'777c6c5 wait: prevent exclusive waiter starvation' made __wake_up_common() global to be used from abort_exclusive_wait(). It was needed to do a wake-up with the waitqueue lock held while passing down a key to the wake-up function. Since '4ede816 epoll keyed wakeups: add __wake_up_locked_key() and __wake_up_sync_key()' there is an appropriate wrapper for this case: __wake_up_locked_key(). Use it here and make __wake_up_common() private to the scheduler again. Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> LKML-Reference: <1239720785-19661-1-git-send-email-hannes@cmpxchg.org> Signed-off-by: Ingo Molnar <mingo@elte.hu>
2009-04-01epoll keyed wakeups: introduce new *_poll() wakeup macrosDavide Libenzi
Introduce new wakeup macros that allow passing an event mask to the wakeup targets. They exactly mimic their non-_poll() counterpart, with the added event mask passing capability. I did add only the ones currently requested, avoiding the _nr() and _all() for the moment. Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: William Lee Irwin III <wli@movementarian.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-04-01epoll keyed wakeups: add __wake_up_locked_key() and __wake_up_sync_key()Davide Libenzi
This patchset introduces wakeup hints for some of the most popular (from epoll POV) devices, so that epoll code can avoid spurious wakeups on its waiters. The problem with epoll is that the callback-based wakeups do not, ATM, carry any information about the events the wakeup is related to. So the only choice epoll has (not being able to call f_op->poll() from inside the callback), is to add the file* to a ready-list and resolve the real events later on, at epoll_wait() (or its own f_op->poll()) time. This can cause spurious wakeups, since the wake_up() itself might be for an event the caller is not interested into. The rate of these spurious wakeup can be pretty high in case of many network sockets being monitored. By allowing devices to report the events the wakeups refer to (at least the two major classes - POLLIN/POLLOUT), we are able to spare useless wakeups by proper handling inside the epoll's poll callback. Epoll will have in any case to call f_op->poll() on the file* later on, since the change to be done in order to have the full event set sent via wakeup, is too invasive for the way our f_op->poll() system works (the full event set is calculated inside the poll function - there are too many of them to even start thinking the change - also poll/select would need change too). Epoll is changed in a way that both devices which send event hints, and the ones that don't, are correctly handled. The former will gain some efficiency though. As a general rule for devices, would be to add an event mask by using key-aware wakeup macros, when making up poll wait queues. I tested it (together with the epoll's poll fix patch Andrew has in -mm) and wakeups for the supported devices are correctly filtered. Test program available here: http://www.xmailserver.org/epoll_test.c This patch: Nothing revolutionary here. Just using the available "key" that our wakeup core already support. The __wake_up_locked_key() was no brainer, since both __wake_up_locked() and __wake_up_locked_key() are thin wrappers around __wake_up_common(). The __wake_up_sync() function had a body, so the choice was between borrowing the body for __wake_up_sync_key() and calling it from __wake_up_sync(), or make an inline and calling it from both. I chose the former since in most archs it all resolves to "mov $0, REG; jmp ADDR". Signed-off-by: Davide Libenzi <davidel@xmailserver.org> Cc: Alan Cox <alan@lxorguk.ukuu.org.uk> Cc: Ingo Molnar <mingo@elte.hu> Cc: David Miller <davem@davemloft.net> Cc: William Lee Irwin III <wli@movementarian.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2009-02-05wait: prevent exclusive waiter starvationJohannes Weiner
With exclusive waiters, every process woken up through the wait queue must ensure that the next waiter down the line is woken when it has finished. Interruptible waiters don't do that when aborting due to a signal. And if an aborting waiter is concurrently woken up through the waitqueue, noone will ever wake up the next waiter. This has been observed with __wait_on_bit_lock() used by lock_page_killable(): the first contender on the queue was aborting when the actual lock holder woke it up concurrently. The aborted contender didn't acquire the lock and therefor never did an unlock followed by waking up the next waiter. Add abort_exclusive_wait() which removes the process' wait descriptor from the waitqueue, iff still queued, or wakes up the next waiter otherwise. It does so under the waitqueue lock. Racing with a wake up means the aborting process is either already woken (removed from the queue) and will wake up the next waiter, or it will remove itself from the queue and the concurrent wake up will apply to the next waiter after it. Use abort_exclusive_wait() in __wait_event_interruptible_exclusive() and __wait_on_bit_lock() when they were interrupted by other means than a wake up through the queue. [akpm@linux-foundation.org: coding-style fixes] Reported-by: Chris Mason <chris.mason@oracle.com> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org> Mentored-by: Oleg Nesterov <oleg@redhat.com> Cc: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Matthew Wilcox <matthew@wil.cx> Cc: Chuck Lever <cel@citi.umich.edu> Cc: Nick Piggin <nickpiggin@yahoo.com.au> Cc: Ingo Molnar <mingo@elte.hu> Cc: <stable@kernel.org> ["after some testing"] Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-10-16wait: kill is_sync_wait()Tejun Heo
is_sync_wait() is used to distinguish between sync and async waits. Basically sync waits are the ones initialized with init_waitqueue_entry() and async ones with init_waitqueue_func_entry(). The sync/async distinction is used only in prepare_to_wait[_exclusive]() and its only function is to skip setting the current task state if the wait is async. This has a few problems. * No one uses it. None of func_entry users use prepare_to_wait() functions, so the code path never gets executed. * The distinction is bogus. Maybe back when func_entry is used only by aio but it's now also used by epoll and in future possibly by 9p and poll/select. * Taking @state as argument and ignoring it silenly depending on how @wait is initialized is just a bad error-prone API. * It prevents func_entry waits from using wait->private for no good reason. This patch kills is_sync_wait() and the associated code paths from prepare_to_wait[_exclusive](). As there was no user of these code paths, this patch doesn't cause any behavior difference. Signed-off-by: Tejun Heo <tj@kernel.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-13include/linux: Remove all users of FASTCALL() macroHarvey Harrison
FASTCALL() is always expanded to empty, remove it. [akpm@linux-foundation.org: coding-style fixes] Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-05lockdep: annotate epollPeter Zijlstra
On Sat, 2008-01-05 at 13:35 -0800, Davide Libenzi wrote: > I remember I talked with Arjan about this time ago. Basically, since 1) > you can drop an epoll fd inside another epoll fd 2) callback-based wakeups > are used, you can see a wake_up() from inside another wake_up(), but they > will never refer to the same lock instance. > Think about: > > dfd = socket(...); > efd1 = epoll_create(); > efd2 = epoll_create(); > epoll_ctl(efd1, EPOLL_CTL_ADD, dfd, ...); > epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...); > > When a packet arrives to the device underneath "dfd", the net code will > issue a wake_up() on its poll wake list. Epoll (efd1) has installed a > callback wakeup entry on that queue, and the wake_up() performed by the > "dfd" net code will end up in ep_poll_callback(). At this point epoll > (efd1) notices that it may have some event ready, so it needs to wake up > the waiters on its poll wait list (efd2). So it calls ep_poll_safewake() > that ends up in another wake_up(), after having checked about the > recursion constraints. That are, no more than EP_MAX_POLLWAKE_NESTS, to > avoid stack blasting. Never hit the same queue, to avoid loops like: > > epoll_ctl(efd2, EPOLL_CTL_ADD, efd1, ...); > epoll_ctl(efd3, EPOLL_CTL_ADD, efd2, ...); > epoll_ctl(efd4, EPOLL_CTL_ADD, efd3, ...); > epoll_ctl(efd1, EPOLL_CTL_ADD, efd4, ...); > > The code "if (tncur->wq == wq || ..." prevents re-entering the same > queue/lock. Since the epoll code is very careful to not nest same instance locks allow the recursion. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Tested-by: Stefan Richter <stefanr@s5r6.in-berlin.de> Acked-by: Davide Libenzi <davidel@xmailserver.org> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-12-06Add wait_event_killableMatthew Wilcox
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2007-12-06wait: Use TASK_NORMALMatthew Wilcox
Also move wake_up_locked() to be with the related functions Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
2007-07-09sched: clean up sleep_on() APIsIngo Molnar
clean up the sleep_on() APIs: - do not use fastcall - replace fragile macro magic with proper inline functions Signed-off-by: Ingo Molnar <mingo@elte.hu>
2006-10-30[PATCH] lockdep: annotate DECLARE_WAIT_QUEUE_HEADPeter Zijlstra
kernel: INFO: trying to register non-static key. kernel: the code is fine but needs lockdep annotation. kernel: turning off the locking correctness validator. kernel: [<c04051ed>] show_trace_log_lvl+0x58/0x16a kernel: [<c04057fa>] show_trace+0xd/0x10 kernel: [<c0405913>] dump_stack+0x19/0x1b kernel: [<c043b1e2>] __lock_acquire+0xf0/0x90d kernel: [<c043bf70>] lock_acquire+0x4b/0x6b kernel: [<c061472f>] _spin_lock_irqsave+0x22/0x32 kernel: [<c04363d3>] prepare_to_wait+0x17/0x4b kernel: [<f89a24b6>] lpfc_do_work+0xdd/0xcc2 [lpfc] kernel: [<c04361b9>] kthread+0xc3/0xf2 kernel: [<c0402005>] kernel_thread_helper+0x5/0xb Another case of non-static lockdep keys; duplicate the paradigm set by DECLARE_COMPLETION_ONSTACK and introduce DECLARE_WAIT_QUEUE_HEAD_ONSTACK. Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Cc: Greg KH <gregkh@suse.de> Cc: Markus Lidel <markus.lidel@shadowconnect.com> Acked-by: Ingo Molnar <mingo@elte.hu> Cc: Arjan van de Ven <arjan@infradead.org> Cc: James Bottomley <James.Bottomley@steeleye.com> Cc: Marcel Holtmann <marcel@holtmann.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-10[PATCH] uninline init_waitqueue_head()Ingo Molnar
allyesconfig vmlinux size delta: text data bss dec filename 20736884 6073834 3075176 29885894 vmlinux.before 20721009 6073966 3075176 29870151 vmlinux.after ~18 bytes per callsite, 15K of text size (~0.1%) saved. (as an added bonus this also removes a lockdep annotation.) Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: annotate waitqueuesIngo Molnar
Create one lock class for all waitqueue locks in the kernel. Has no effect on non-lockdep kernels. Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-03[PATCH] lockdep: locking init debugging improvementIngo Molnar
Locking init improvement: - introduce and use __SPIN_LOCK_UNLOCKED for array initializations, to pass in the name string of locks, used by debugging Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Arjan van de Ven <arjan@linux.intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-04-26Don't include linux/config.h from anywhere else in include/David Woodhouse
Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2005-11-07[PATCH] fix remaining missing includesTim Schmielau
Fix more include file problems that surfaced since I submitted the previous fix-missing-includes.patch. This should now allow not to include sched.h from module.h, which is done by a followup patch. Signed-off-by: Tim Schmielau <tim@physik3.uni-rostock.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-06-23[PATCH] aio: make wait_queue ->task ->privateBenjamin LaHaise
In the upcoming aio_down patch, it is useful to store a private data pointer in the kiocb's wait_queue. Since we provide our own wake up function and do not require the task_struct pointer, it makes sense to convert the task pointer into a generic private pointer. Signed-off-by: Benjamin LaHaise <benjamin.c.lahaise@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-24[PATCH] Cleanup DEFINE_WAITblaisorblade@yahoo.it
Use LIST_HEAD_INIT rather than doing it by hand in DEFINE_WAIT. Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-13[PATCH] Unified spinlock initialization include/linux/wait.hDomen Puncer
Unify the spinlock initialization as far as possible. Signed-off-by: Amit Gud <gud@eth.net> Signed-off-by: Domen Puncer <domen@coderock.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-11[PATCH] docbook: new kernel-doc comments for might_sleep & wait_event_*Martin Waitz
New kernel-doc comments for might_sleep & wait_event_* Signed-off-by: Martin Waitz <tali@admingilde.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-12-16[PATCH] parenthesize init_wait() macro parametersAndrew Morton
Addresses bug #3863, from <daveh@dmh2000.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-18[PATCH] remove wake_up_all_syncChristoph Hellwig
no user in sight Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-18[PATCH] reduce number of parameters to __wait_on_bit() and __wait_on_bit_lock()William Lee Irwin III
Some of the parameters to __wait_on_bit() and __wait_on_bit_lock() are redundant, as the wait_bit_queue parameter holds the flags word and the bit number. This patch updates __wait_on_bit() and __wait_on_bit_lock() to fetch that information from the wait_bit_queue passed to them and so reduce the number of parameters so that -mregparm may be more effective. Incremental atop the complete out-of-lining of the contention cases and the fastcall and wait_on_bit_lock()/test_and_set_bit() fixes. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-18[PATCH] move wait ops' contention case completely out of lineWilliam Lee Irwin III
Move the slow paths of wait_on_bit() and wait_on_bit_lock() out of line. Also uninline wake_up_bit() to reduce the number of callsites generated, and adjust loop startup in __wait_on_bit_lock() to properly reflect its usage in the contention case. Incremental atop the fastcall and wait_on_bit_lock()/test_and_set_bit() fixes. Successfully tested on x86-64. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-18[PATCH] eliminate bh waitqueue hashtableWilliam Lee Irwin III
Eliminate the bh waitqueue hashtable using bit_waitqueue() via wait_on_bit() and wake_up_bit() to locate the waitqueue head associated with a bit. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-18[PATCH] consolidate bit waiting code patternsWilliam Lee Irwin III
Consolidate bit waiting code patterns for page waitqueues using __wait_on_bit() and __wait_on_bit_lock(). Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-10-18[PATCH] standardize bit waiting data typeWilliam Lee Irwin III
Eliminate specialized page and bh waitqueue hashing structures in favor of a standardized structure, using wake_up_bit() to wake waiters using the standardized wait_bit_key structure. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-09-26[PATCH] Add wait_event_timeout()Russell King
There appears to be one case missing from the wait_event() family - the uninterruptible timeout wait. The following patch adds this. This wait is particularly useful when (eg) you wish to pass work off to an interrupt handler to perform, but also want to know if the hardware has decided to go gaga under you. You don't want to sit around waiting for something that'll never happen - you want to go and wack the gremlin which caused the failure and retry. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-30[PATCH] waitid system callRoland McGrath
This patch adds a new system call `waitid'. This is a new POSIX call that subsumes the rest of the wait* family and can do some things the older calls cannot. A minor addition is the ability to select what kinds of status to check for with a mask of independent bits, so you can wait for just stops and not terminations, for example. A more significant improvement is the WNOWAIT flag, which allows for polling child status without reaping. This interface fills in a siginfo_t with the same details that a SIGCHLD for the status change has; some of that info (e.g. si_uid) is not available via wait4 or other calls. I've added a new system call that has the parameter conventions of the POSIX function because that seems like the cleanest thing. This patch includes the actual system call table additions for i386 and x86-64; other architectures will need to assign the system call number, and 64-bit ones may need to implement 32-bit compat support for it as I did for x86-64. The new features could instead be provided by some new kludge inventions in the wait4 system call interface (that's what BSD did). If kludges are preferable to adding a system call, I can work up something different. I added a struct rusage field si_rusage to siginfo_t in the SIGCHLD case (this does not affect the size or layout of the struct). This is not part of the POSIX interface, but it makes it so that `waitid' subsumes all the functionality of `wait4'. Future kernel ABIs (new arch's or whatnot) can have only the `waitid' system call and the rest of the wait* family including wait3 and wait4 can be implemented in user space using waitid. There is nothing in user space as yet that would make use of the new field. Most of the new functionality is implemented purely in the waitid system call itself. POSIX also provides for the WCONTINUED flag to report when a child process had been stopped by job control and then resumed with SIGCONT. Corresponding to this, a SIGCHLD is now generated when a child resumes (unless SA_NOCLDSTOP is set), with the value CLD_CONTINUED in siginfo_t.si_code. To implement this, some additional bookkeeping is required in the signal code handling job control stops. The motivation for this work is to make it possible to implement the POSIX semantics of the `waitid' function in glibc completely and correctly. If changing either the system call interface used to accomplish that, or any details of the kernel implementation work, would improve the chances of getting this incorporated, I am more than happy to work through any issues. Signed-off-by: Roland McGrath <roland@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-23[PATCH] AIO: retry infrastructure fixes and enhancementsSuparna Bhattacharya
From: Daniel McNeil <daniel@osdl.org> From: Chris Mason <mason@suse.com> AIO: retry infrastructure fixes and enhancements Reorganises, comments and fixes the AIO retry logic. Fixes and enhancements include: - Split iocb setup and execution in io_submit (also fixes io_submit error reporting) - Use aio workqueue instead of keventd for retries - Default high level retry methods - Subtle use_mm/unuse_mm fix - Code commenting - Fix aio process hang on EINVAL (Daniel McNeil) - Hold the context lock across unuse_mm - Acquire task_lock in use_mm() - Allow fops to override the retry method with their own - Elevated ref count for AIO retries (Daniel McNeil) - set_fs needed when calling use_mm - Flush workqueue on __put_ioctx (Chris Mason) - Fix io_cancel to work with retries (Chris Mason) - Read-immediate option for socket/pipe retry support Note on default high-level retry methods support ================================================ High-level retry methods allows an AIO request to be executed as a series of non-blocking iterations, where each iteration retries the remaining part of the request from where the last iteration left off, by reissuing the corresponding AIO fop routine with modified arguments representing the remaining I/O. The retries are "kicked" via the AIO waitqueue callback aio_wake_function() which replaces the default wait queue entry used for blocking waits. The high level retry infrastructure is responsible for running the iterations in the mm context (address space) of the caller, and ensures that only one retry instance is active at a given time, thus relieving the fops themselves from having to deal with potential races of that sort. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-06-23[PATCH] Use fancy wakeups in wait.hAndrew Morton
Use the more SMP-friendly prepare_to_wait()/finish_wait() in wait_event() and friends. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-06-20[PATCH] add wait_event_interruptible_exclusive() macroDean Nelson
This patch defines a macro that does exactly what wait_event_interruptible() does except that it adds the current task to the wait queue as an exclusive task (i.e., sets the WQ_FLAG_EXCLUSIVE flag) rather than as a non-exclusive task as wait_event_interruptible() does. This allows one to do a wake_up_nr() to wake up a specific number of tasks. I'm in the process of submitting a patch to linux-ia64 that requires this capability. (Its subject line is "[PATCH 3/4] SGI Altix cross partition functionality".) Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-05-14[PATCH] filtered wakeups: wakeup enhancementsAndrew Morton
From: William Lee Irwin III <wli@holomorphy.com> This patch provides an additional argument to __wake_up_common() so that the information wakefunc.patch made waiters ready to receive may be passed to them by wakers. This is provided as a separate patch so that the overhead of the additional argument to __wake_up_common() can be measured in isolation. No change in performance was observable here.
2004-05-14[PATCH] filtered wakeupsAndrew Morton
From: William Lee Irwin III <wli@holomorphy.com> This patch series is solving the "thundering herd" problem that occurs in the mainline implementation of hashed waitqueues. There are two sources of spurious wakeups in such arrangements: (a) Hash collisions that place waiters on different objects on the same waitqueue, which wakes threads falsely when any of the objects hashed to the same queue receives a wakeup. i.e. loss of information about which object a wakeup event is related to. (b) Loss of information about which object a given waiter is waiting on. This precludes wake-one semantics for mutual exclusion scenarios. For instance, a lock bit may be slept on. If there are any waiters on the object, a lock bit release event must wake at least one of them so as to prevent deadlock. But without information as to which waiter is waiting on which object, we must resort to waking all waiters who could possibly be waiting on it. Now, as the lock bit provides mutual exclusion, only one of the waiters woken can proceed, and the remainder will go back to sleep and wait for another event, creating unnecessary system load. Once wake-one semantics are established, only one of the waiters waiting to acquire a lock bit need to be woken, which measurably reduces system load and improves efficiency (i.e. it's the subject of the benchmarking I've been sending to you). Even beyond the measurable efficiency gains, there are reasons of robustness and responsiveness to motivate addressing the issue of thundering herds. In a real-life scenario I've been personally involved in resolving, the thundering herd issue caused powerful modern SMP machines with fast IO systems to be unresponsive to user input for a minute at a time or more. Analogues of these patches for the distro kernels involved fully resolved the issue to the customer's satisfaction and obviated workarounds to limit the pagecache's size. The latest spin of these patches basically shoves more pieces of the logic into the wakeup functions, with some efficiency gains from sharing the hot codepath with the rest of the kernel, and a slightly larger diff than the patches with the newly-introduced entrypoint. Writing these was motivated by the push to insulate sched.c from more of the details of wakeup semantics by putting more of the logic into the wakeup functions. In order to accomplish this while still solving (b), the wakeup functions grew a new argument for communication about what object a wakeup event is related to to be passed by the waker. ========= This patch provides an additional argument to wakeup functions so that information may be passed from the waker to the waiter. This is provided as a separate patch so that the overhead of the additional argument can be measured in isolation. No change in performance was observable here.
2003-08-31[PATCH] remove add_wait_queue_cond()Andrew Morton
It has no callers, is using the non-existent spin_lock_irqrestore(), and is obviously very untested. Kill.
2003-05-18[PATCH] sync wakeup on UPIngo Molnar
This fixes the scheduler's sync-wakeup code to be consistent on UP as well. Right now there's a behavioral difference between an UP kernel and an SMP kernel running on a UP box: sync wakeups (which are only activated on SMP) can cause a wakeup of a higher prio task, without preemption. On UP kernels this does not happen. This difference in wakeup behavior is bad. This patch activates sync wakeups on UP as well - in the cases sync wakeups are done the waker knows that it will schedule away soon, so this 'delay preemption' decision is correct on UP as well.
2002-11-15[PATCH] Move wait queue handling from sched.h to wait.hMatthew Wilcox
This patch removes all the wait_queue handling code from sched.h and puts it in wait.h with the rest of the wait_queue handling code. Note that sched.h must continue to include wait.h for the wait_queue_head_t embedded in struct task. However there may be files which only need wait.h now.