This is the next iteration of the workqueue abstraction.
The framework includes:
- per-CPU queueing support.
on SMP there is a per-CPU worker thread (bound to its CPU) and per-CPU
work queues - this feature is completely transparent to workqueue users.
keventd automatically uses this feature. XFS can now switch to workqueues
and keep the same per-CPU performance it had with its own per-CPU worker
threads.
- delayed work submission
there's a new queue_delayed_work(wq, work, delay) function and a new
schedule_delayed_work(work, delay) function. The latter one is used to
correctly fix former tq_timer users. I've reverted those changes in 2.5.40
that changed tq_timer uses to schedule_work() - e.g. in the case of
random.c or the tty flip queue it was definitely the wrong thing to do.
Delayed work means a timer embedded in struct work_struct. I considered
using separate work_struct and delayed_work_struct types, but lots
of code actively uses task-queues in both delayed and non-delayed mode,
so I went for the more generic approach that allows both methods of work
submission. Delayed timers cause no extra overhead in the normal
submission path. (See the usage sketch after this list.)
- multithreaded run_workqueue() implementation
the run_workqueue() function can now be called from multiple contexts, and
a worker thread will only use up a single entry - this property is used
by the flushing code, and can potentially be used in the future to extend
the number of per-CPU worker threads.
- more reliable flushing
there's now a 'pending work' counter, which is used to accurately detect
when the last work-function has finished execution. It's also used to
correctly flush against timed requests. I'm not convinced that the old
keventd implementation got this detail right.
- I switched the arguments of the queueing function(s) per Jeff's
suggestion; it's more straightforward this way.
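To make the new interface concrete, here is a minimal usage sketch,
assuming the interface described above (the "mydrv" name and the
handler are hypothetical):

  #include <linux/init.h>
  #include <linux/module.h>
  #include <linux/workqueue.h>

  static struct workqueue_struct *my_wq;
  static struct work_struct my_work;

  /* runs in process context, in the worker thread of the submitting CPU */
  static void my_work_handler(void *data)
  {
          /* ... deferred processing ... */
  }

  static int __init my_init(void)
  {
          my_wq = create_workqueue("mydrv");  /* one worker thread per CPU */
          INIT_WORK(&my_work, my_work_handler, NULL);

          queue_work(my_wq, &my_work);        /* immediate submission */
          flush_workqueue(my_wq);             /* wait until it has run */
          queue_delayed_work(my_wq, &my_work, HZ); /* run ~1 second from now */
          return 0;
  }
  module_init(my_init);

Code that is happy with keventd's queue can use schedule_work() and
schedule_delayed_work() instead and skip creating its own workqueue.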
Driver fixes:
I have converted almost every affected driver to the new framework. This
cleaned up tons of code. I also fixed a number of drivers that were still
using BHs (these drivers did not compile in 2.5.40).
While this means lots of changes, it might ease the QA decision on whether
to put this patch into 2.5.
The patch converts roughly 80% of all tqueue-using code to workqueues - and
all the places that are not converted to workqueues yet are places that do
not compile in vanilla 2.5.40 anyway, due to unrelated changes. I've
converted a fair number of drivers that do not compile in 2.5.40, and I
think I've managed to convert every driver that compiles under 2.5.40.
This is the smptimers patch plus the removal of old BHs and a rewrite of
task-queue handling.
Basically, with the removal of TIMER_BH, I think the time is right to get
rid of old BHs forever, and to do a massive cleanup of all related
code. The following five basic 'execution context' abstractions are
supported by the kernel:
- hardirq
- softirq
- tasklet
- keventd-driven task-queues
- process contexts
I've done the following cleanups/simplifications to task-queues:
- removed the ability to define your own task-queue; what can be done is
to schedule_task() a given task to keventd, and to flush all pending
tasks (see the sketch below).
This is actually quite an easy transition, since 90% of all task-queue
users in the kernel used BH_IMMEDIATE - which is very similar in
functionality to keventd.
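A minimal sketch of this conversion pattern, assuming the
schedule_task()/flush_scheduled_tasks() interface kept in
kernel/context.c (the my_* names are hypothetical):

  #include <linux/interrupt.h>
  #include <linux/tqueue.h>

  /* runs in process context, executed by keventd */
  static void my_deferred_fn(void *data)
  {
          /* work formerly queued with queue_task(..., &tq_immediate)
             and kicked via mark_bh(IMMEDIATE_BH) */
  }

  static struct tq_struct my_task = {
          .routine = my_deferred_fn,
  };

  static void my_interrupt(int irq, void *dev_id, struct pt_regs *regs)
  {
          /* ... acknowledge the hardware ... */
          schedule_task(&my_task);        /* defer the rest to keventd */
  }

  static void my_cleanup(void)
  {
          flush_scheduled_tasks();        /* wait for all pending tasks */
  }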
I believe task-queues should not be removed from the kernel altogether.
It's true that they were written as a candidate replacement for BHs
originally, but they do make sense in a different way: it's perhaps the
easiest interface for doing deferred processing from IRQ context in
performance-uncritical code areas. They are easier to use than
tasklets.
Code that cares about performance should convert to tasklets - as the
timer code and the serial subsystem have done already. For extreme
performance, softirqs should be used - the net subsystem does this.
And we can do this for 2.6 - there are only a couple of areas left after
fixing all the BH_IMMEDIATE places.
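For contrast, the tasklet variant of the same kind of deferral is
roughly the following (names hypothetical, assuming the standard
tasklet interface):

  #include <linux/interrupt.h>

  /* runs in softirq context; serialized against itself */
  static void my_tasklet_fn(unsigned long data)
  {
          /* performance-critical deferred processing */
  }

  static DECLARE_TASKLET(my_tasklet, my_tasklet_fn, 0);

  static void my_interrupt(int irq, void *dev_id, struct pt_regs *regs)
  {
          tasklet_schedule(&my_tasklet);  /* cheap, safe from IRQ context */
  }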
I have moved all the task-queue handling code into kernel/context.c, and
only kept the basic 'queue a task' definitions in include/linux/tqueue.h.
I've converted three of the most commonly used BH_IMMEDIATE users:
tty_io.c, floppy.c and random.c. [random.c might need more thought
though.]
I've also cleaned up kernel/timer.c over that of the stock smptimers
patch: privatized the timer-vec definitions (nothing needs them;
init_timer() used them mistakenly) and cleaned up the code. Plus I've moved
some code around that does not belong in timer.c, and within timer.c
I've organized data and functions along functionality and further
separated the base timer code from the NTP bits.
net_bh_lock: I have removed it, since it would synchronize against
nothing. The old protocol handlers should still run on UP, and on SMP the
kernel prints a warning upon use. Alexey, is this approach fine with you?
scalable timers: I've further improved the patch ported to 2.5 by wli and
Dipankar. There is only one pending issue I can see, the question of
whether to migrate timers in mod_timer() or not. I'm quite convinced that
they should be migrated, but I might be wrong. It's a 10-line change to
switch between migrating and non-migrating timers, so we can do performance
tests later on. The current, more complex migration code is pretty fast
and has been stable under extremely high networking loads in the past 2
years, but we can immediately switch to the simpler variant if someone
proves it improves performance. (I'd say if non-migrating timers improve
Apache performance on one of the bigger NUMA boxes then the point is
proven; no further thought will be needed.)
This is actually part of the work I've been doing to remove BHs, but it
stands by itself.
- use list_for_each in head_of_free_region
- cleanups from 2.4
- fix for usb
- kill broken queueing
This patch provides the ability for a block driver to signal that it's too
busy to receive more work and temporarily halt the request queue. In
concept it's similar to the networking netif_{start,stop}_queue helpers.
To do this cleanly, I've ripped out the old tq_disk task queue. Instead,
an internal list of plugged queues is maintained, which honors the
current queue state (see the QUEUE_FLAG_STOPPED bit). Execution of
request_fn has been moved to tasklet context. blk_run_queues() provides
similar functionality to the old run_task_queue(&tq_disk). A sketch of
the intended driver-side usage follows.
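A minimal sketch of the pattern, assuming start/stop helpers named
blk_stop_queue()/blk_start_queue() and the usual request_fn calling
convention; the mydrv_* helpers are hypothetical driver internals:

  #include <linux/blkdev.h>

  static int mydrv_hw_full(void);          /* device can't take more work */
  static void mydrv_start_io(struct request *rq); /* dequeue + issue to hw */

  /* the driver's request_fn, run from tasklet context with the
     queue lock held */
  static void mydrv_request_fn(request_queue_t *q)
  {
          struct request *rq;

          while ((rq = elv_next_request(q)) != NULL) {
                  if (mydrv_hw_full()) {
                          blk_stop_queue(q);  /* sets QUEUE_FLAG_STOPPED */
                          break;              /* no more work until restart */
                  }
                  mydrv_start_io(rq);
          }
  }

  /* from the completion interrupt, once the hardware has room again
     (with the appropriate queue locking in place): */
  static void mydrv_unthrottle(request_queue_t *q)
  {
          blk_start_queue(q);     /* clears the flag, reruns request_fn */
  }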
Now, this only works at the request_fn level and not at the
make_request_fn level. This is on purpose: drivers working at the
make_request_fn level are essentially providing a piece of the block
level infrastructure themselves. There are basically two reasons for
doing make_request_fn style setups:
o block remappers. Start/stop functionality will be done at the target
device in this case, which is the level that will signal hardware full
(or continue) anyway.
o drivers that wish to receive single "buffer" entities and not
merged requests etc. These could use the start/stop functionality. I'd
suggest _still_ using a request_fn for these, but setting the queue
options so that no merging ever takes place. This has the added
bonus of providing the usual request-depletion throttling at the block
level.
Here's a combined suspend-to-{RAM,disk} patch for
2.5.17. Suspend-to-disk is pretty stable and was tested in
2.4-ac. Suspend-to-RAM is a little more experimental, but works for me,
and is certainly better than the disk-eating version currently in the
kernel. Major parts are: the process stopper, S3-specific code, and
S4-specific code.
We have to include linux/bitops.h for architectures using the
generic_xxx() helpers. The following patch changes <asm/bitops.h> to
<linux/bitops.h> in include/linux/*.
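The pattern the patch applies is roughly the following (comments are
illustrative only):

  /* before: only the arch-specific bit operations are visible */
  #include <asm/bitops.h>

  /* after: <linux/bitops.h> pulls in <asm/bitops.h> and additionally
     provides the generic_ffs()/generic_hweight*() helpers that some
     architectures' asm headers rely on */
  #include <linux/bitops.h>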
- more bugs found by the automatic Stanford checker, yay!
- Andrew Morton: fix SAK locking bugs by moving it into a process context
- Johannes Erdfelt: USB updates
- Jeff Garzik: merge Hermes driver by David Gibson
- Jens Axboe: cdrom merges, ll_rw_blk proper accounting