path: root/kernel/workqueue.c
Age  Commit message  (Author)
2006-01-14  [PATCH] Unlinline a bunch of other functions  (Arjan van de Ven)
Remove the "inline" keyword from a bunch of big functions in the kernel with the goal of shrinking it by 30kb to 40kb. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jeff Garzik <jgarzik@pobox.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-08  [PATCH] fix workqueue oops during cpu offline  (Nathan Lynch)
Use first_cpu(cpu_possible_map) for the single-thread workqueue case. We used to hardcode 0, but that broke on systems where !cpu_possible(0) when workqueue_struct->cpu_workqueue_struct was changed from a static array to alloc_percpu. Commit id bce61dd49d6ba7799be2de17c772e4c701558f14 ("Fix hardcoded cpu=0 in workqueue for per_cpu_ptr() calls") fixed that for Ben's funky sparc64 system, but it regressed my Power5. Offlining cpu 0 oopses upon the next call to queue_work for a single-thread workqueue, because now we try to manipulate per_cpu_ptr(wq->cpu_wq, 1), which is uninitialized. So we need to establish an unchanging "slot" for single-thread workqueues which will have a valid percpu allocation. Since alloc_percpu keys off of cpu_possible_map, which must not change after initialization, make this slot == first_cpu(cpu_possible_map). Signed-off-by: Nathan Lynch <ntl@pobox.com> Cc: <stable@kernel.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
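A minimal sketch of that slot selection, assuming the 2.6.15-era cpumask/percpu interfaces (first_cpu(), cpu_possible_map, per_cpu_ptr()); the singlethread_cpu variable and helper below are illustrative names, not necessarily those used in the actual patch:

    #include <linux/cpumask.h>
    #include <linux/percpu.h>

    /*
     * cpu_possible_map must not change after boot, so the first possible
     * cpu is a slot that alloc_percpu() has always populated - unlike
     * cpu 0, which may be neither possible nor online.
     */
    static int singlethread_cpu;

    static void init_singlethread_slot(void)
    {
        singlethread_cpu = first_cpu(cpu_possible_map);
    }

    /* queue_work()/flush_workqueue() then always use
     *     per_cpu_ptr(wq->cpu_wq, singlethread_cpu)
     * for single-thread workqueues, instead of a hardcoded cpu number. */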
2006-01-08  [PATCH] Unchecked alloc_percpu() return in __create_workqueue()  (Ben Collins)
__create_workqueue() was not checking the return of alloc_percpu(), so a NULL dereference was possible. Signed-off-by: Ben Collins <bcollins@ubuntu.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
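A hedged sketch of the kind of check the fix adds; the surrounding structure is simplified, and the cpu_wq field and cpu_workqueue_struct type are taken from the neighbouring commit messages rather than verified against the file:

    #include <linux/percpu.h>
    #include <linux/slab.h>
    #include <linux/workqueue.h>

    struct workqueue_struct *create_wq_sketch(const char *name)
    {
        struct workqueue_struct *wq;

        wq = kzalloc(sizeof(*wq), GFP_KERNEL);
        if (!wq)
            return NULL;

        wq->cpu_wq = alloc_percpu(struct cpu_workqueue_struct);
        if (!wq->cpu_wq) {      /* previously unchecked: a later per_cpu_ptr() use would oops */
            kfree(wq);
            return NULL;
        }
        /* ... spawn the worker thread(s), register on the workqueue list ... */
        return wq;
    }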
2006-01-08  [PATCH] add schedule_on_each_cpu()  (Christoph Lameter)
swap migration's isolate_lru_page() currently uses an IPI to notify other processors that the lru caches need to be drained if the page cannot be found on the LRU. The IPI interrupt may interrupt a processor that is just processing lru requests and cause a race condition. This patch introduces a new function, schedule_on_each_cpu(), that uses keventd to run the LRU draining on each processor. Processors disable preemption when dealing with the LRU caches (these are per processor) and thus executing LRU draining from another process is safe. Thanks to Lee Schermerhorn <lee.schermerhorn@hp.com> for finding this race condition. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
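Assuming the new function takes a callback plus an opaque info pointer (as keventd-era work functions did), a caller-side sketch with illustrative names might look like this:

    #include <linux/workqueue.h>

    /* Runs once on every online cpu from keventd, i.e. process context,
     * so the per-cpu LRU pagevecs can be drained with preemption disabled
     * instead of from an IPI handler. */
    static void drain_local_lru(void *unused)
    {
        /* lru_add_drain() or an equivalent per-cpu drain goes here */
    }

    static int drain_all_lru_caches(void)
    {
        return schedule_on_each_cpu(drain_local_lru, NULL);
    }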
2005-11-28  [PATCH] Fix hardcoded cpu=0 in workqueue for per_cpu_ptr() calls  (Ben Collins)
Tracked this down on an Ultra Enterprise 3000. It's a 6-way machine. Odd thing about this machine (and it's good for finding bugs like this) is that the CPU IDs are not 0-based. For instance, on my machine the CPUs are 6/7/10/11/14/15. This caused some NULL pointer dereference in kernel/workqueue.c because for single_threaded workqueues, it hardcoded the cpu to 0. I changed the 0's to any_online_cpu(cpu_online_mask), which cpumask.h claims is "First cpu in mask". So this fits the same usage. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-11-07  [PATCH] cpu hoptlug: avoid usage of smp_processor_id() in preemptible code  (Heiko Carstens)
Replace smp_processor_id() with any_online_cpu(cpu_online_map) in order to avoid lots of "BUG: using smp_processor_id() in preemptible [00000001] code:..." messages in case taking a cpu online fails. All the traces start at the last notifier_call_chain(...) in kernel/cpu.c. Since we hold the cpu_control semaphore it shouldn't be any problem to access cpu_online_map. The reason why cpu_up failed is simply that the cpu that was supposed to be taken online wasn't even there. That is because on s390 we never know when a new cpu comes and therefore cpu_possible_map consists of only ones and doesn't reflect reality. Signed-off-by: Heiko Carstens <heiko.carstens@de.ibm.com> Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-10-30  [PATCH] Use alloc_percpu to allocate workqueues locally  (Christoph Lameter)
This patch makes the workqueues use alloc_percpu instead of an array. The workqueues are placed on nodes local to each processor. The workqueue structure can grow to a significant size on a system with lots of processors if this patch is not applied. 64 bit architectures with all debugging features enabled and configured for 512 processors will not be able to boot without this patch. Signed-off-by: Christoph Lameter <clameter@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07  [PATCH] introduce and use kzalloc  (Pekka J Enberg)
This patch introduces a kzalloc wrapper and converts kernel/ to use it. It saves a little program text. Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi> Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-09-07  [PATCH] create_workqueue_thread() signedness fix  (Mika Kukkonen)
With "-W -Wno-unused -Wno-sign-compare" I get the following compile warning:

    CC      kernel/workqueue.o
    kernel/workqueue.c: In function `workqueue_cpu_callback':
    kernel/workqueue.c:504: warning: ordered comparison of pointer with integer zero

On error create_workqueue_thread() returns NULL, not a negative pointer, so the following trivial patch suggests itself. Signed-off-by: Mika Kukkonen <mikukkon@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-08-10  [PATCH] remove name length check in a workqueue  (James Bottomley)
We have a check in there to make sure that the name won't overflow task_struct.comm[], but it's triggering for scsi with lots of HBAs, and only scsi is using single-threaded workqueues, which don't append the "/%d" anyway. All too hard. Just kill the BUG_ON. Cc: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> [ kthread_create() uses vsnprintf() and limits the thing, so no overflow can actually happen regardless ] Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-04-16  [PATCH] re-export cancel_rearming_delayed_workqueue  (James Bottomley)
This was unexported by Arjan because we have no current users. However, during a conversion from tasklets to workqueues of the parisc led functions, we ran across a case where this was needed. In particular, the open coded equivalent of cancel_rearming_delayed_workqueue was implemented incorrectly, which is, I think, all the evidence necessary that this is a useful API. Signed-off-by: James Bottomley <James.Bottomley@SteelEye.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-02-22  Merge nuts.davemloft.net:/disk1/BK/network-2.6.12 into nuts.davemloft.net:/disk1/BK/net-2.6.12  (David S. Miller)
2005-02-15  [PATCH] kthread_bind new worker threads when onlining cpu  (Nathan T. Lynch)
We weren't binding new worker threads to their cpu when onlining. Using preempt and the debug version of smp_processor_id found this. Signed-off-by: Nathan Lynch <ntl@pobox.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-02-14  [WORKQUEUE]: Add cancel_rearming_delayed_work()  (Andrew Morton)
From: Arjan van de Ven <arjan@infradead.org> cancel_rearming_delayed_workqueue() is only used inside workqueue.c; make this function static (the more useful wrapper around it later in that function remains non-static and exported) Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: David S. Miller <davem@davemloft.net>
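The exported wrapper is for work that re-arms itself from its own handler; a driver-side sketch using the three-argument DECLARE_WORK() of that era (the poll_hw/poll_work names are illustrative):

    #include <linux/workqueue.h>

    static void poll_hw(void *data);
    static DECLARE_WORK(poll_work, poll_hw, NULL);

    static void poll_hw(void *data)
    {
        /* ... poll the device ... */
        schedule_delayed_work(&poll_work, HZ);   /* re-arms itself every second */
    }

    static void stop_polling(void)
    {
        /* keeps retrying until the self-rearming work is really off the queue */
        cancel_rearming_delayed_work(&poll_work);
    }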
2005-01-07  [PATCH] Lock initializer cleanup (Core)  (Thomas Gleixner)
Kernel core files converted to use the new lock initializers. Signed-off-by: Thomas Gleixner <tglx@linutronix.de> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-01-07  [PATCH] sched: alter_kthread_prio  (Con Kolivas)
Timeslice proportion has been increased substantially for -niced tasks. As a result of this, kernel threads have much larger timeslices than they previously had. Change kernel threads' nice value to -5 to bring their timeslice back in line with previous behaviour. This means kernel threads will be less likely to cause large latencies under periods of system stress for normal nice 0 tasks. Signed-off-by: Con Kolivas <kernel@kolivas.org> Acked-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
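If the mechanism is the usual one, the change amounts to a single line in each kernel thread's startup path; a sketch, with set_user_nice() assumed to be how the value is applied and the skeleton purely illustrative:

    #include <linux/sched.h>

    static int worker_thread_sketch(void *arg)
    {
        set_user_nice(current, -5);   /* run this kernel thread at nice -5 */
        /* ... normal worker loop ... */
        return 0;
    }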
2004-11-16  Email address update.  (David Woodhouse)
The work address is increasingly unreliable and incompetently run. Time to remove all visible instances of it and rely only on one which isn't run by crack-monkeys. Signed-off-by: David Woodhouse <dwmw2@infradead.org>
2004-09-07  [PATCH] Speed up oprofile buffer drain code  (Anton Blanchard)
I noticed a large machine was doing about 400,000 context switches per second when oprofile was enabled. Upon closer inspection it looks like we were rearming the buffer sync timer without modifying the expire time. Now that we have schedule_delayed_work_on I believe we can remove the timer completely. Each cpu should be offset by 1 jiffy so they don't all fire at the same time. I bumped DEFAULT_TIMER_EXPIRE from 2 to 10 times a second to be sure we reap cpu buffers. With the following patch the same large machine gets about 4000 context switches per second. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
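A sketch of the staggering idea, assuming the schedule_delayed_work_on(cpu, work, delay) primitive added in the entry below; the works[] array and the period definition are illustrative, not oprofile's actual code:

    #include <linux/workqueue.h>
    #include <linux/cpumask.h>

    #define SYNC_PERIOD (HZ / 10)    /* "10 times a second", per the message */

    static void start_sync_work(struct work_struct *works)
    {
        int cpu;

        /* offset each cpu by one jiffy so the syncs don't all fire together */
        for_each_online_cpu(cpu)
            schedule_delayed_work_on(cpu, &works[cpu], SYNC_PERIOD + cpu);
    }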
2004-08-22  [PATCH] Move cache_reap out of timer context  (Dimitri Sivanich)
I'm submitting two patches associated with moving cache_reap functionality out of timer context. Note that these patches do not make any further optimizations to cache_reap at this time. The first patch adds a function similar to schedule_delayed_work to allow work to be scheduled on another cpu. The second patch makes use of schedule_delayed_work_on to schedule cache_reap to run from keventd. Signed-off-by: Dimitri Sivanich <sivanich@sgi.com> Signed-off-by: Manfred Spraul <manfred@colorfullife.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-06-08  [PATCH] flush_workqueue locking simplification  (Andrew Morton)
From: "Anil" <anil.s.keshavamurthy@intel.com> We don't need lock_cpu_hotplug()/unlock_cpu_hotplug for singlethreaded workqueues. Signed-off-by: Anil Keshavamurthy <anil.s.keshavamurthy@intel.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-06-08  [PATCH] speedup flush_workqueue for singlethread_workqueue  (Andrew Morton)
From: "Anil" <anil.s.keshavamurthy@intel.com> In flush_workqueue(), for the single_threaded_workqueue case, the code flushes the same cpu_workqueue_struct for each online cpu. Change things so that we only perform the flush once in this case. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-05-14  [PATCH] create_workqueue locking fix  (Andrew Morton)
Fix some silliness in there.
2004-05-10  [PATCH] worker_thread race fix  (Andrew Morton)
Fix a waitqueue-handling race in worker_thread().
2004-05-10  [PATCH] fix deadlock in create_workqueue()  (Andrew Morton)
Fix bug identified by Srivatsa Vaddagiri <vatsa@in.ibm.com>: There's a deadlock in __create_workqueue when CONFIG_HOTPLUG_CPU is set. This can happen when create_workqueue_thread fails to create a worker thread. In that case, we call destroy_workqueue with cpu hotplug lock held. destroy_workqueue however also attempts to take the same lock.
2004-04-26  [PATCH] create singlethread_workqueue()  (Andrew Morton)
From: Rusty Russell <rusty@rustcorp.com.au>
Workqueues are a great primitive for running things from user context from a completely clean environment. Unfortunately, they currently insist on creating one thread per CPU, which is overkill for many situations, so the more generic keventd workqueue is used for these. Recently deadlocks using keventd were demonstrated, showing that it is not suitable for all uses.
1) Clean up CPU iterators. Always a nice touch.
2) Add __create_workqueue() and create_singlethread_workqueue(), keeping source compatibility.
3) Put workqueues in workqueue list even if !CONFIG_HOTPLUG_CPU (means we need a lock to protect that list). Now we can tell if a wq is single-threaded using list_empty(&wq->list).
4) For single-threaded workqueues, override CPU in queue_work, delayed_work_timer_fn and flush_workqueue to be 0. flush_workqueue now does redundant passes for single-threaded workqueues, but the code remains simple.
5) Make create_workqueue_thread return the thread, so we can easily kthread_bind for multi-threaded workqueues.
akpm fixes:
- Fix up is_single_threaded() handling
- single-threaded wq thread does not have "/0" appended.
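Typical driver-side use of the new single-thread variant, sketched with illustrative names and the era's three-argument work initializer:

    #include <linux/errno.h>
    #include <linux/init.h>
    #include <linux/workqueue.h>

    static struct workqueue_struct *my_wq;

    static void my_handler(void *data)
    {
        /* runs in the single "mywq" worker thread, in process context */
    }

    static DECLARE_WORK(my_work, my_handler, NULL);

    static int __init my_init(void)
    {
        my_wq = create_singlethread_workqueue("mywq");  /* one thread, no "/%d" suffix */
        if (!my_wq)
            return -ENOMEM;
        queue_work(my_wq, &my_work);
        return 0;
    }

    static void my_teardown(void)
    {
        flush_workqueue(my_wq);
        destroy_workqueue(my_wq);
    }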
2004-04-18  [PATCH] Rename PF_IOTHREAD to PF_NOFREEZE  (Andrew Morton)
From: Nigel Cunningham <ncunningham@users.sourceforge.net> A few weeks ago, Pavel and I agreed that PF_IOTHREAD should be renamed to PF_NOFREEZE. This reflects the fact that some threads so marked aren't actually used for IO while suspending, but simply shouldn't be frozen. This patch, against 2.6.5 vanilla, applies that change. In the refrigerator calls, the actual value doesn't matter (so long as it's non-zero) and it makes more sense to use PF_FREEZE so I've used that.
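Usage is unchanged by the rename; a thread that must keep running while the system suspends marks itself roughly like this (a sketch, assuming the flag is still set directly on current->flags and using an illustrative daemonize()-style thread body):

    #include <linux/sched.h>

    static int my_kthread(void *unused)
    {
        daemonize("my_kthread");
        current->flags |= PF_NOFREEZE;   /* formerly PF_IOTHREAD: do not freeze on suspend */

        while (1) {
            /* ... wait for work and handle it ... */
            schedule();
        }
        return 0;   /* not reached */
    }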
2004-03-18  [PATCH] Hotplug CPUs: Workqueue Changes  (Rusty Russell)
Workqueues need to bring up/destroy the per-cpu thread on cpu up/down.
1) Add a global list of workqueues, and keep the name in the structure (to name the newly created thread).
2) Remove BUG_ON in run_workqueue, since thread is dragged off CPU when it goes down.
3) Lock out cpu up/down in flush_workqueue, create_workqueue and destroy_workqueue.
4) Add notifier to add/destroy workqueue threads, and take over work.
2004-03-15  [PATCH] flush_workqueue(): detect excessive nesting  (Andrew Morton)
Add a debug check for workqueues nested more than three deep via the direct-run-workqueue() path.
2004-03-15  [PATCH] flush_scheduled_work() deadlock fix  (Andrew Morton)
Because keventd is a resource which is shared between unrelated parts of the kernel it is possible for one person's workqueue handler to accidentally call another person's flush_scheduled_work(). thockin managed it by calling mntput() from a workqueue handler. It deadlocks. It's simple enough to fix: teach flush_scheduled_work() to go direct when it discovers that the calling thread is the one which should be running the work. Note that this can cause recursion. The depth of that recursion is equal to the number of currently-queued works which themselves want to call flush_scheduled_work(). If this ever exceeds three I'll eat my hat.
2004-03-11  [PATCH] current_is_keventd() speedup  (Andrew Morton)
From: Srivatsa Vaddagiri <vatsa@in.ibm.com> current_is_keventd() doesn't need to search across all the CPUs to identify itself.
2004-03-06  [PATCH] fastcall / regparm fixes  (Andrew Morton)
From: Gerd Knorr <kraxel@suse.de> Current versions of gcc error out if a function's declaration and definition disagree about the register passing convention. The patch adds a new `fastcall' declaration primitive, and uses that in all the FASTCALL functions which we could find. A number of inconsistencies were fixed up along the way.
2004-02-18  [PATCH] Minor workqueue.c cleanup  (Andrew Morton)
From: Rusty Russell <rusty@rustcorp.com.au> Move duplicated code to __queue_work(), and don't set the CPU for queue_delayed_work() until the timer goes off. The second one only has an effect on CONFIG_HOTPLUG_CPU where the CPU goes down and the timer goes off on a different CPU than it was scheduled on.
2004-02-18  [PATCH] kthread primitive  (Andrew Morton)
From: Rusty Russell <rusty@rustcorp.com.au>
These two patches provide the framework for stopping kernel threads to allow hotplug CPU. This one just adds kthread.c and kthread.h, next one uses it. Most importantly, adds a Monty Python quote to the kernel.
Details: The hotplug CPU code introduces two major problems:
1) Threads which previously never stopped (migration thread, ksoftirqd, keventd) have to be stopped cleanly as CPUs go offline.
2) Threads which previously never had to be created now have to be created when a CPU goes online.
Unfortunately, stopping a thread is fairly baroque, involving memory barriers, a completion and spinning until the task is actually dead (for example, complete_and_exit() must be used if inside a module). There are also three problems in starting a thread:
1) Doing it from a random process context risks environment contamination: better to do it from keventd to guarantee a clean environment, a-la call_usermodehelper.
2) Getting the task struct without races is hard: see kernel/sched.c migration_call(), kernel/workqueue.c create_workqueue_thread().
3) There are races in starting a thread for a CPU which is not yet online: migration thread does a complex dance at the moment for a similar reason (there may be no migration thread to migrate us).
Place all this logic in some primitives to make life easier: kthread_create() and kthread_stop(). These primitives require no extra data-structures in the caller: they operate on normal "struct task_struct"s.
Other changes:
- Expose keventd_up(), as keventd and migration threads will use kthread to launch, and kthread normally uses workqueues and must recognize this case.
- Kthreads created at boot before "keventd" are spawned directly. However, this means that they don't have all signals blocked, and hence can be killed. The simplest solution is to always explicitly block all signals in the kthread.
- Change over the migration threads, the workqueue threads and the ksoftirqd threads to use kthread.
- module.c currently spawns threads directly to stop the machine, so a module can be atomically tested for removal.
- Unfortunately, this means that the current task is manipulated (which races with set_cpus_allowed, for example), and it can't set its priority artificially high. Using a kernel thread can solve this cleanly, and with kthread_run, it's simple.
- kthreads use keventd, so they inherit its cpus_allowed mask. Unset it. All current users set it explicitly anyway, but it's nice to fix.
- call_usermode_helper uses keventd, so the process created inherits its cpus_allowed mask. Unset it.
- Prevent errors in boot when cpus_possible() contains a cpu which is not online (ie. a cpu didn't come up). This doesn't happen on x86, since a boot failure makes that CPU no longer possible (hacky, but it works).
- When the cpu fails to come up, some callbacks do kthread_stop(), which doesn't work without keventd (which hasn't started yet). Call it directly, and take care that it restores signal state (note: do_sigaction does a flush on blocked signals, so we don't need to repeat it).
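The resulting usage pattern is small; a sketch against the kthread.h interface described above, with the thread function and names purely illustrative:

    #include <linux/err.h>
    #include <linux/kthread.h>
    #include <linux/sched.h>

    static int my_thread_fn(void *data)
    {
        set_current_state(TASK_INTERRUPTIBLE);
        while (!kthread_should_stop()) {
            schedule();                        /* kthread_stop() wakes us up */
            /* ... do the per-cpu work here ... */
            set_current_state(TASK_INTERRUPTIBLE);
        }
        __set_current_state(TASK_RUNNING);
        return 0;
    }

    static struct task_struct *start_thread_for(int cpu)
    {
        struct task_struct *p;

        p = kthread_create(my_thread_fn, NULL, "mythread/%d", cpu);
        if (IS_ERR(p))
            return p;
        kthread_bind(p, cpu);     /* bind before the first wake_up_process() */
        wake_up_process(p);
        return p;
    }

    /* later, on cpu-down: kthread_stop(p) waits for my_thread_fn() to return */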
2004-01-19  [PATCH] Remove redundant code in workqueue.c  (Andrew Morton)
From: Rusty Russell <rusty@rustcorp.com.au> It turns out that run_workqueue never has signal_pending(), since setting the handler to SIG_IGN means "don't make zombies, I'm ignoring them". Fix the comment, don't allow the signal, and remove the unused waitpid loop. This also allows simpler conversion of workqueues to the kthread mechanism, which uses signals to indicate it's time to stop.
2004-01-18  [PATCH] Use for_each_cpu() Where It's Meant To Be  (Andrew Morton)
From: Rusty Russell <rusty@rustcorp.com.au> Some places use cpu_online() where they should be using cpu_possible, most commonly for tallying statistics. This makes no difference without hotplug CPU. Use the for_each_cpu() macro in those places, providing good examples (and making the external hotplug CPU patch smaller).
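A sketch of the statistics case, assuming the iterator of that era in which for_each_cpu() walks every possible cpu (it was later renamed for_each_possible_cpu()); the per-cpu counter is an illustrative stand-in:

    #include <linux/cpumask.h>
    #include <linux/percpu.h>

    static DEFINE_PER_CPU(unsigned long, event_count);

    static unsigned long total_events(void)
    {
        unsigned long sum = 0;
        int cpu;

        /* a cpu that is currently offline still has counts to report,
         * so iterate over possible cpus, not just online ones */
        for_each_cpu(cpu)
            sum += per_cpu(event_count, cpu);
        return sum;
    }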
2003-08-18  [PATCH] cpumask_t: allow more than BITS_PER_LONG CPUs  (Andrew Morton)
From: William Lee Irwin III <wli@holomorphy.com>
Contributions from: Jan Dittmer <jdittmer@sfhq.hn.org> Arnd Bergmann <arnd@arndb.de> "Bryan O'Sullivan" <bos@serpentine.com> "David S. Miller" <davem@redhat.com> Badari Pulavarty <pbadari@us.ibm.com> "Martin J. Bligh" <mbligh@aracnet.com> Zwane Mwaikambo <zwane@linuxpower.ca>
It has been tested on x86, sparc64, x86_64, ia64 (I think), ppc and ppc64.
cpumask_t enables systems with NR_CPUS > BITS_PER_LONG to utilize all their cpus by creating an abstract data type dedicated to representing cpu bitmasks, similar to fd sets from userspace, and sweeping the appropriate code to update callers to the access API. The fd set-like structure is according to Linus' own suggestion; the macro calling convention to ambiguate representations with minimal code impact is my own invention.
Specifically, a new set of inline functions for manipulating arbitrary-width bitmaps is introduced with a relatively simple implementation, in tandem with a new data type representing bitmaps of width NR_CPUS, cpumask_t, whose accessor functions are defined in terms of the bitmap manipulation inlines. This bitmap ADT found an additional use in i386 arch code handling sparse physical APIC ID's, which was convenient to use in this case as the accounting structure was required to be wider to accommodate the physids consumed by larger numbers of cpus.
For the sake of simplicity and low code impact, these cpu bitmasks are passed primarily by value; however, an additional set of accessors along with an auxiliary data type with const call-by-reference semantics is provided to address performance concerns raised in connection with very large systems, such as SGI's larger models, where copying and call-by-value overhead would be prohibitive. Few (if any) users of the call-by-reference API are immediately introduced.
Also, in order to avoid calling convention overhead on architectures where structures are required to be passed by value, NR_CPUS <= BITS_PER_LONG is special-cased so that cpumask_t falls back to an unsigned long and the accessors perform the usual bit twiddling on unsigned longs as opposed to arrays thereof. Audits were done with the structure overhead in-place, restoring this special-casing only afterward so as to ensure a more complete API conversion while undergoing the majority of its end-user exposure in -mm. More -mm's were shipped after its restoration to be sure that was tested, too.
The immediate users of this functionality are Sun sparc64 systems, SGI mips64 and ia64 systems, and IBM ia32, ppc64, and s390 systems. Of these, only the ppc64 machines needing the functionality have yet to be released; all others have had systems requiring it for full functionality for at least 6 months, and in some cases, since the initial Linux port to the affected architecture.
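In use the type reads much like an fd_set for cpus; a small sketch using the by-value accessor style described (the exact macro set shown is an assumption about that API, not a complete list):

    #include <linux/cpumask.h>
    #include <linux/kernel.h>

    static void cpumask_demo(void)
    {
        cpumask_t mask = CPU_MASK_NONE;
        int cpu;

        cpu_set(3, mask);                      /* mark cpu 3 in the local mask */
        cpus_and(mask, mask, cpu_online_map);  /* keep only cpus that are online */

        for_each_cpu_mask(cpu, mask)
            printk(KERN_INFO "cpu %d selected\n", cpu);
    }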
2003-06-22  [PATCH] Workqueue Exit Neatening  (Rusty Russell)
Jeff Garzik points out that initializing the exit completion at exit time is foolish: we should just initialize it at creation time like everything else in that structure, and avoid the memory barrier.
2003-06-20  [PATCH] workqueue.c subtle fix and core extraction  (Andrew Morton)
From: Rusty Russell <rusty@rustcorp.com.au> A barrier is needed on workqueue shutdown: there's a chance that the thread could see the wq->thread set to NULL before the completion is initialized. Also extracts functions which actually create and destroy workqueues, for use by hotplug CPU patch.
2003-04-14  [PATCH] flush_work_queue() fixes  (Andrew Morton)
The workqueue code currently has a notion of a per-cpu queue being "busy". flush_scheduled_work()'s responsibility is to wait for a queue to be not busy. Problem is, flush_scheduled_work() can easily hang up.
- The workqueue is deemed "busy" when there are pending delayed (timer-based) works. But if someone repeatedly schedules new delayed work in the callback, the queue will never fall idle, and flush_scheduled_work() will not terminate.
- If someone reschedules work (not delayed work) in the work function, that too will cause the queue to never go idle, and flush_scheduled_work() will not terminate.
So what this patch does is:
- Create a new "cancel_delayed_work()" which will try to kill off any timer-based delayed works.
- Change flush_scheduled_work() so that it is immune to people re-adding work in the work callout handler. We can do this by recognising that the caller does *not* want to wait until the workqueue is "empty". The caller merely wants to wait until all works which were pending at the time flush_scheduled_work() was called have completed. The patch uses a couple of sequence numbers for that.
So now, if someone wants to reliably remove delayed work they should do:

    /*
     * Make sure that my work-callback will no longer schedule new work
     */
    my_driver_is_shutting_down = 1;

    /*
     * Kill off any pending delayed work
     */
    cancel_delayed_work(&my_work);

    /*
     * OK, there will be no new works scheduled. But there may be one
     * currently queued or in progress. So wait for that to complete.
     */
    flush_scheduled_work();

The patch also changes the flush_workqueue() sleep to be uninterruptible. We cannot legally bale out if a signal is delivered anyway.
2003-02-10  Sanitize kernel daemon signal handling and process naming.  (Linus Torvalds)
Add a name argument to daemonize() (va_arg) to avoid all the kernel threads having to duplicate the name setting over and over again. Make daemonize() disable all signals by default, and add a "allow_signal()" function to let daemons say they explicitly want to support a signal. Make flush_signal() take the signal lock, so that callers do not need to.
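A sketch of the resulting daemon idiom, assuming the daemonize(name)/allow_signal() pair described above; the daemon body and names are illustrative:

    #include <linux/sched.h>
    #include <linux/signal.h>

    static int my_daemon(void *unused)
    {
        daemonize("mydaemon");       /* names the thread, blocks all signals */
        allow_signal(SIGTERM);       /* opt back in to the one signal we handle */

        while (!signal_pending(current)) {
            set_current_state(TASK_INTERRUPTIBLE);
            schedule();              /* woken by a signal or by whoever queues work */
            /* ... process pending work ... */
        }
        return 0;
    }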
2003-02-06  Split up "struct signal_struct" into "signal" and "sighand" parts.  (Linus Torvalds)
This is required to make the old LinuxThread semantics work together with the fixed-for-POSIX full signal sharing. A traditional CLONE_SIGHAND thread (LinuxThread) will not see any other shared signal state, while a new-style CLONE_THREAD thread will share all of it. This way the two methods don't confuse each other.
2003-02-05  [PATCH] self-unplugging request queues  (Andrew Morton)
The patch teaches a queue to unplug itself: a) if it has four requests OR b) if it has had plugged requests for 3 milliseconds. These numbers may need to be tuned, although doing so doesn't seem to make much difference. 10 msecs works OK, so HZ=100 machines will be fine.
Instrumentation shows that about 5-10% of requests were started due to the three millisecond timeout (during a kernel compile). That's somewhat significant. It means that the kernel is leaving stuff in the queue, plugged, for too long. This testing was with a uniprocessor preemptible kernel, which is particularly vulnerable to unplug latency (submit some IO, get preempted before the unplug).
This patch permits the removal of a lot of rather lame unplugging in page reclaim and in the writeback code, which kicks the queues (globally!) every four megabytes to get writeback underway.
This patch doesn't use blk_run_queues(). It is able to kick just the particular queue. The patch is not expected to make much difference really, except for AIO. AIO needs a blk_run_queues() in its io_submit() call. For each request. This means that AIO has to disable plugging altogether, unless something like this patch does it for it. It means that AIO will unplug *all* queues in the machine for every io_submit(). Even against a socket! This patch was tested by disabling blk_run_queues() completely. The system ran OK.
The 3 milliseconds may be too long. It's OK for the heavy writeback code, but AIO may want less. Or maybe AIO really wants zero (ie: disable plugging). If that is so, we need new code paths by which AIO can communicate the "immediate unplug" information - a global unplug is not good.
To minimise unplug latency due to user CPU load, this patch gives keventd `nice -10'. This is of course completely arbitrary. Really, I think keventd should be SCHED_RR/MAX_RT_PRIO-1, as it has been in -aa kernels for ages.
2002-10-02  [PATCH] timer-2.5.40-F7  (Ingo Molnar)
This does a number of timer subsystem enhancements:
- simplified timer initialization, now it's the cheapest possible thing:

    static inline void init_timer(struct timer_list *timer)
    {
        timer->base = NULL;
    }

  since the timer functions already did a !timer->base check this did not have any effect on their fastpath.
- the rule from now on is that timer->base is set upon activation of the timer, and cleared upon deactivation. This also made it possible to:
- reorganize all the timer handling code to not assume anything about timer->entry.next and timer->entry.prev - this also removed lots of unnecessary cleaning of these fields. Removed lots of unnecessary list operations from the fastpath.
- simplified del_timer_sync(): it now uses del_timer() plus some simple synchronization code. Note that this also fixes a bug: if mod_timer (or add_timer) moves a currently executing timer to another CPU's timer vector, then del_timer_sync() does not synchronize with the handler properly.
- bugfix: moved run_local_timers() from scheduler_tick() into update_process_times() .. scheduler_tick() might be called from the fork code which will not quite have the intended effect ...
- removed the APIC-timer-IRQ shifting done on SMP, Dipankar Sarma's testing shows no negative effects.
- cleaned up include/linux/timer.h:
  - removed the timer_t typedef, and fixes up kernel/workqueue.c to use the 'struct timer_list' name instead.
  - removed unnecessary includes
  - renamed the 'list' field to 'entry' (it's an entry not a list head)
  - exchanged the 'function' and 'data' fields. This, besides being more logical, also unearthed the last few remaining places that initialized timers by assuming some given field ordering, the patch also fixes these places. (fs/xfs/pagebuf/page_buf.c, net/core/profile.c and net/ipv4/inetpeer.c)
  - removed the defunct sync_timers(), timer_enter() and timer_exit() prototypes.
  - added docbook-style comments.
- other kernel/timer.c changes:
  - base->running_timer does not have to be volatile ...
  - added consistent comments to all the important functions.
  - made the sync-waiting in del_timer_sync preempt- and lowpower-friendly.
i've compiled, booted & tested the patched kernel on x86 UP and SMP. I have tried moderately high networking load as well, to make sure the timer changes are correct - they appear to be.
2002-10-01  [PATCH] workqueue flush on destroy  (Christoph Hellwig)
2002-09-30  [PATCH] Workqueue Abstraction  (Ingo Molnar)
This is the next iteration of the workqueue abstraction. The framework includes:
- per-CPU queueing support: on SMP there is a per-CPU worker thread (bound to its CPU) and per-CPU work queues - this feature is completely transparent to workqueue-users. keventd automatically uses this feature. XFS can now update to work-queues and have the same per-CPU performance as it had with its per-CPU worker threads.
- delayed work submission: there's a new queue_delayed_work(wq, work, delay) function and a new schedule_delayed_work(work, delay) function. The latter one is used to correctly fix former tq_timer users. I've reverted those changes in 2.5.40 that changed tq_timer uses to schedule_work() - eg. in the case of random.c or the tty flip queue it was definitely the wrong thing to do. delayed work means a timer embedded in struct work_struct. I considered using split struct work_struct and delayed_work_struct types, but lots of code actively uses task-queues in both delayed and non-delayed mode, so i went for the more generic approach that allows both methods of work submission. Delayed timers do not cause any other overhead in the normal submission path otherwise.
- multithreaded run_workqueue() implementation: the run_workqueue() function can now be called from multiple contexts, and a worker thread will only use up a single entry - this property is used by the flushing code, and can potentially be used in the future to extend the number of per-CPU worker threads.
- more reliable flushing: there's now a 'pending work' counter, which is used to accurately detect when the last work-function has finished execution. It's also used to correctly flush against timed requests. I'm not convinced whether the old keventd implementation got this detail right.
- i switched the arguments of the queueing function(s) per Jeff's suggestion, it's more straightforward this way.
Driver fixes: i have converted almost every affected driver to the new framework. This cleaned up tons of code. I also fixed a number of drivers that were still using BHs (these drivers did not compile in 2.5.40). While this means lots of changes, it might ease the QA decision whether to put this patch into 2.5. The patch converts roughly 80% of all tqueue-using code to workqueues - and all the places that are not converted to workqueues yet are places that do not compile in vanilla 2.5.40 anyway, due to unrelated changes. I've converted a fair number of drivers that do not compile in 2.5.40, and i think i've managed to convert every driver that compiles under 2.5.40.
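Roughly what the new interface looks like to a converted driver, sketched with the era's three-argument DECLARE_WORK() and illustrative names:

    #include <linux/errno.h>
    #include <linux/workqueue.h>

    static struct workqueue_struct *flip_wq;

    static void flip_buffers(void *data)
    {
        /* runs in process context in a per-cpu "flipd" worker thread */
    }

    static DECLARE_WORK(flip_work, flip_buffers, NULL);

    static int flip_setup(void)
    {
        flip_wq = create_workqueue("flipd");   /* one worker thread per cpu on SMP */
        if (!flip_wq)
            return -ENOMEM;
        queue_work(flip_wq, &flip_work);       /* immediate submission */
        return 0;
    }

    static void flip_poke_later(void)
    {
        /* delayed submission, replacing a former tq_timer user */
        queue_delayed_work(flip_wq, &flip_work, HZ / 10);
    }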