| author | Andrew Morton <akpm@digeo.com> | 2002-11-21 19:32:03 -0800 |
|---|---|---|
| committer | Linus Torvalds <torvalds@penguin.transmeta.com> | 2002-11-21 19:32:03 -0800 |
| commit | 5fa9d488fb0dbc8b5583be8e59d8b3091fbcb5e9 | |
| tree | 638c6b68bd5cfff00c92e46c81a7e5a1b3abb202 /include/linux | |
| parent | 40a7fe2f515b764ee9722ff1b20984c82fcd6910 | |

[PATCH] Fix busy-wait with writeback to large queues

blk_congestion_wait() is a utility function which various callers use
to throttle themselves to the rate at which the IO system can retire
writes.

The current implementation refuses to wait if no queues are "congested"
(>75% of requests are in flight).

That doesn't work if the queue is so huge that it can hold more than
40% (dirty_ratio) of memory. The queue simply cannot enter congestion
because the VM refuses to allow more than 40% of memory to be dirtied.
(This spin could happen with a lot of normal-sized queues too.)
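
To make the arithmetic concrete, here is a rough userspace sketch in C (not kernel code). The helper name `queue_can_congest` and the KB-based sizes are hypothetical; the 40% and 75% figures are the ones quoted above. It assumes all dirty memory could be queued against a single device, which is a simplification.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Percentages quoted in the commit message. */
#define DIRTY_RATIO_PCT   40	/* VM caps dirty memory at 40% of RAM */
#define CONGESTION_ON_PCT 75	/* queue counts as congested above 75% in flight */

/*
 * Hypothetical helper: can writeback alone ever push this queue into
 * congestion? Sizes are in KB. Assumes every dirty page could be queued
 * against this one device.
 */
static inline bool queue_can_congest(uint64_t total_mem_kb,
				     uint64_t queue_capacity_kb)
{
	assert(queue_capacity_kb > 0);

	/* Most data the VM will ever allow to be dirty at once. */
	uint64_t max_dirty_kb = total_mem_kb * DIRTY_RATIO_PCT / 100;

	/* Amount that must be in flight before the queue is "congested". */
	uint64_t congest_at_kb = queue_capacity_kb * CONGESTION_ON_PCT / 100;

	return max_dirty_kb > congest_at_kb;
}
```

Under these assumptions, a machine with 1 GB of memory can congest a queue that holds 64 MB, but never one that holds 1 GB, so a caller that only sleeps on congestion would end up spinning.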

So this patch simply changes blk_congestion_wait() to throttle even if
there are no congested queues. It will cause the caller to sleep until
someone puts back a write request against any queue. (Nobody uses
blk_congestion_wait() for read congestion).

The patch adds new state to backing_dev_info->state: a couple of flags
which indicate whether there are _any_ reads or writes in flight
against that queue. This was added to prevent blk_congestion_wait()
from taking a nap when there are no writes at all in flight.
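
The role of the new flags can be sketched in userspace C. This is a simplified model, not the kernel implementation: the bit helpers below are non-atomic stand-ins for the kernel's `set_bit()`/`clear_bit()`/`test_bit()`, the enum lists only the members visible in the diff, and `any_write_active()` is a hypothetical helper illustrating the "don't nap if nothing is in flight" check.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Members as they appear in the diff for include/linux/backing-dev.h. */
enum bdi_state {
	BDI_pdflush,		/* A pdflush thread is working this device */
	BDI_write_congested,	/* The write queue is getting full */
	BDI_read_congested,	/* The read queue is getting full */
	BDI_write_active,	/* There are one or more queued writes */
	BDI_read_active,	/* There are one or more queued reads */
	BDI_unused,		/* Available bits start here */
};

/* Userspace stand-in; the real struct has more fields. */
struct backing_dev_info {
	unsigned long state;
};

/* Non-atomic stand-ins for the kernel bit operations. */
static inline void set_bit(int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}

static inline void clear_bit(int nr, unsigned long *addr)
{
	*addr &= ~(1UL << nr);
}

static inline int test_bit(int nr, const unsigned long *addr)
{
	return (int)((*addr >> nr) & 1UL);
}

/* Same shape as the accessors added by the patch. */
static inline int bdi_read_active(struct backing_dev_info *bdi)
{
	return test_bit(BDI_read_active, &bdi->state);
}

static inline int bdi_write_active(struct backing_dev_info *bdi)
{
	return test_bit(BDI_write_active, &bdi->state);
}

/*
 * Hypothetical policy check: sleeping only makes sense if at least one
 * queue has a write in flight that could later wake the sleeper.
 */
static inline bool any_write_active(struct backing_dev_info *bdis, size_t n)
{
	assert(bdis != NULL);
	for (size_t i = 0; i < n; i++)
		if (bdi_write_active(&bdis[i]))
			return true;
	return false;
}
```

In this model a throttling caller would skip the sleep whenever `any_write_active()` returns false, since no completing write exists to wake it before the timeout.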

But the "are there any reads" info could be used to defer background
writeout from pdflush, to reduce read-vs-write competition. We'll see.

This matters because the large request queues have made a fundamental
change: blocking in get_request_wait() has been the main form of VM
throttling for years. But with large queues it doesn't work any more -
all throttling happens in blk_congestion_wait().

Also, change io_schedule_timeout() to propagate the schedule_timeout()
return value. I was using that in some debug code, but it should have
been like that from day one.
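
As a rough illustration of why the return value is useful, here is a userspace model in C. schedule_timeout() returns the number of jiffies left when the sleeper is woken, or 0 if the whole timeout elapsed. The `fake_*` functions and the `woken_after` parameter are hypothetical stand-ins invented for this sketch; the real io_schedule_timeout() takes only the timeout and does more work around the sleep.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Stand-in for schedule_timeout(): returns the jiffies remaining when the
 * sleeper is woken, or 0 if the full timeout elapsed. A negative
 * woken_after models "nobody woke us".
 */
static inline long fake_schedule_timeout(long timeout, long woken_after)
{
	assert(timeout >= 0);
	if (woken_after < 0 || woken_after >= timeout)
		return 0;
	return timeout - woken_after;
}

/*
 * Stand-in for the patched io_schedule_timeout(): it now hands the
 * remaining time back to the caller instead of returning void.
 */
static inline long fake_io_schedule_timeout(long timeout, long woken_after)
{
	return fake_schedule_timeout(timeout, woken_after);
}

/* Hypothetical debug check: did the sleep end because of a wakeup? */
static inline bool woken_early(long remaining)
{
	return remaining > 0;
}
```

With the value propagated, a caller can tell a real wakeup (non-zero remainder) from a sleep that simply timed out, which is the kind of information the debug code mentioned above would need.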
Diffstat (limited to 'include/linux')
| -rw-r--r-- | include/linux/backing-dev.h | 12 |
| -rw-r--r-- | include/linux/sched.h | 2 |
2 files changed, 13 insertions, 1 deletions
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 94c93c9c5f66..55218964e7ef 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -17,6 +17,8 @@ enum bdi_state {
 	BDI_pdflush,		/* A pdflush thread is working this device */
 	BDI_write_congested,	/* The write queue is getting full */
 	BDI_read_congested,	/* The read queue is getting full */
+	BDI_write_active,	/* There are one or more queued writes */
+	BDI_read_active,	/* There are one or more queued reads */
 	BDI_unused,		/* Available bits start here */
 };
 
@@ -42,4 +44,14 @@ static inline int bdi_write_congested(struct backing_dev_info *bdi)
 	return test_bit(BDI_write_congested, &bdi->state);
 }
 
+static inline int bdi_read_active(struct backing_dev_info *bdi)
+{
+	return test_bit(BDI_read_active, &bdi->state);
+}
+
+static inline int bdi_write_active(struct backing_dev_info *bdi)
+{
+	return test_bit(BDI_write_active, &bdi->state);
+}
+
 #endif /* _LINUX_BACKING_DEV_H */
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 7946bd8cb0ad..facb0f80d0a8 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -150,7 +150,7 @@ extern void show_stack(unsigned long *stack);
 extern void show_regs(struct pt_regs *);
 
 void io_schedule(void);
-void io_schedule_timeout(long timeout);
+long io_schedule_timeout(long timeout);
 
 extern void cpu_init (void);
 extern void trap_init(void);
