author Andrew Morton <akpm@digeo.com> 2002-11-21 19:32:03 -0800
committer Linus Torvalds <torvalds@penguin.transmeta.com> 2002-11-21 19:32:03 -0800
commit 5fa9d488fb0dbc8b5583be8e59d8b3091fbcb5e9
tree 638c6b68bd5cfff00c92e46c81a7e5a1b3abb202 /include/linux
parent 40a7fe2f515b764ee9722ff1b20984c82fcd6910
[PATCH] Fix busy-wait with writeback to large queues
blk_congestion_wait() is a utility function which various callers use to throttle themselves to the rate at which the IO system can retire writes.

The current implementation refuses to wait if no queues are "congested" (>75% of requests are in flight).

That doesn't work if the queue is so huge that it can hold more than 40% (dirty_ratio) of memory. The queue simply cannot enter congestion because the VM refuses to allow more than 40% of memory to be dirtied. (This spin could happen with a lot of normal-sized queues too)

So this patch simply changes blk_congestion_wait() to throttle even if there are no congested queues. It will cause the caller to sleep until someone puts back a write request against any queue. (Nobody uses blk_congestion_wait for read congestion).

The patch adds new state to backing_dev_info->state: a couple of flags which indicate whether there are _any_ reads or writes in flight against that queue. This was added to prevent blk_congestion_wait() from taking a nap when there are no writes at all in flight.

But the "are there any reads" info could be used to defer background writeout from pdflush, to reduce read-vs-write competition. We'll see.

Because the large request queues have made a fundamental change: blocking in get_request_wait() has been the main form of VM throttling for years. But with large queues it doesn't work any more - all throttling happens in blk_congestion_wait().

Also, change io_schedule_timeout() to propagate the schedule_timeout() return value. I was using that in some debug code, but it should have been like that from day one.
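The arithmetic behind "the queue simply cannot enter congestion" can be sketched in userspace C. This is a simplified model and not kernel code: it counts pages rather than requests, and `struct model`, `max_in_flight()` and `can_become_congested()` are hypothetical names introduced only for illustration. The 40% and 75% figures are the ones quoted in the message above.

```c
#include <stdbool.h>

/* Hypothetical model parameters; none of these are kernel symbols. */
struct model {
	unsigned long total_pages;    /* pages of memory in the machine */
	unsigned long queue_capacity; /* pages the queue can hold in flight */
	unsigned int dirty_ratio;     /* percent of memory allowed to be dirty */
	unsigned int congestion_pct;  /* queue counts as congested above this percent */
};

/* In-flight writes are capped by both the queue size and the dirty limit. */
static unsigned long max_in_flight(const struct model *m)
{
	unsigned long dirty_limit = m->total_pages * m->dirty_ratio / 100;

	return dirty_limit < m->queue_capacity ? dirty_limit : m->queue_capacity;
}

/* True if the queue can ever cross its congestion threshold. */
static bool can_become_congested(const struct model *m)
{
	unsigned long threshold = m->queue_capacity * m->congestion_pct / 100;

	return max_in_flight(m) > threshold;
}
```

With 1,000,000 pages of memory, a 50,000-page queue can still cross its 75% threshold, while a 600,000-page queue cannot: the dirty limit caps it at 400,000 pages in flight, below the 450,000-page threshold. In this model, a caller that only waits on congested queues would never sleep against the larger queue.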
Diffstat (limited to 'include/linux')
-rw-r--r-- include/linux/backing-dev.h | 12
-rw-r--r-- include/linux/sched.h | 2
2 files changed, 13 insertions, 1 deletions
diff --git a/include/linux/backing-dev.h b/include/linux/backing-dev.h
index 94c93c9c5f66..55218964e7ef 100644
--- a/include/linux/backing-dev.h
+++ b/include/linux/backing-dev.h
@@ -17,6 +17,8 @@ enum bdi_state {
 	BDI_pdflush,		/* A pdflush thread is working this device */
 	BDI_write_congested,	/* The write queue is getting full */
 	BDI_read_congested,	/* The read queue is getting full */
+	BDI_write_active,	/* There are one or more queued writes */
+	BDI_read_active,	/* There are one or more queued reads */
 	BDI_unused,		/* Available bits start here */
 };
@@ -42,4 +44,14 @@ static inline int bdi_write_congested(struct backing_dev_info *bdi)
 	return test_bit(BDI_write_congested, &bdi->state);
 }
+static inline int bdi_read_active(struct backing_dev_info *bdi)
+{
+	return test_bit(BDI_read_active, &bdi->state);
+}
+
+static inline int bdi_write_active(struct backing_dev_info *bdi)
+{
+	return test_bit(BDI_write_active, &bdi->state);
+}
+
 #endif /* _LINUX_BACKING_DEV_H */
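As a rough illustration of how the two new flags might be consulted, here is a userspace sketch. It assumes a plain `unsigned long` state word and non-atomic stand-ins for the kernel's bit operations. `struct backing_dev_info_model`, the `model_*_bit()` helpers and `worth_waiting_for_writes()` are hypothetical names. The real caller-side change lives outside include/linux and is not shown on this page.

```c
#include <stdbool.h>

/* Bit numbers copied from the enum in the hunk above. */
enum bdi_state {
	BDI_pdflush,
	BDI_write_congested,
	BDI_read_congested,
	BDI_write_active,
	BDI_read_active,
	BDI_unused,
};

/* Hypothetical stand-in for struct backing_dev_info: only the state word. */
struct backing_dev_info_model {
	unsigned long state;
};

/* Non-atomic stand-ins for set_bit(), clear_bit() and test_bit(). */
static void model_set_bit(int nr, unsigned long *addr)
{
	*addr |= 1UL << nr;
}

static void model_clear_bit(int nr, unsigned long *addr)
{
	*addr &= ~(1UL << nr);
}

static int model_test_bit(int nr, const unsigned long *addr)
{
	return (int)((*addr >> nr) & 1UL);
}

/* Same shape as the helpers added by the patch. */
static int bdi_read_active(struct backing_dev_info_model *bdi)
{
	return model_test_bit(BDI_read_active, &bdi->state);
}

static int bdi_write_active(struct backing_dev_info_model *bdi)
{
	return model_test_bit(BDI_write_active, &bdi->state);
}

/* Assumed policy: napping only makes sense if some write can complete. */
static bool worth_waiting_for_writes(struct backing_dev_info_model *bdi)
{
	return bdi_write_active(bdi) != 0;
}
```

In this model, setting only `BDI_read_active` leaves `worth_waiting_for_writes()` false. This matches the message's stated purpose for the flags: no nap when no writes are in flight.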
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 7946bd8cb0ad..facb0f80d0a8 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -150,7 +150,7 @@ extern void show_stack(unsigned long *stack);
 extern void show_regs(struct pt_regs *);
 void io_schedule(void);
-void io_schedule_timeout(long timeout);
+long io_schedule_timeout(long timeout);
 extern void cpu_init (void);
 extern void trap_init(void);
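The prototype change lets callers see what schedule_timeout() reported: the time remaining if the sleeper was woken early, or 0 if the full timeout elapsed. The sketch below models that contract in userspace. All `model_*` names and the `woken_after` parameter are hypothetical; a real caller sleeps, whereas this model only computes the value that would come back.

```c
/*
 * Stand-in for schedule_timeout(): returns the time left if the
 * sleeper was woken early, or 0 if the whole timeout elapsed.
 * `woken_after` is a model-only input saying when the wakeup arrived.
 */
static long model_schedule_timeout(long timeout, long woken_after)
{
	if (woken_after >= timeout)
		return 0;
	return timeout - woken_after;
}

/* New prototype shape: pass the remaining time back to the caller. */
static long model_io_schedule_timeout(long timeout, long woken_after)
{
	return model_schedule_timeout(timeout, woken_after);
}

/* A caller can now tell an expired timeout from an early wakeup. */
static int model_timed_out(long remaining)
{
	return remaining == 0;
}
```

With the old `void` prototype, a caller had no way to make the distinction that `model_timed_out()` draws here.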