[PATCH] buffer_head takedown for bighighmem machines

This patch addresses the excessive consumption of ZONE_NORMAL by buffer_heads on highmem machines. The algorithms which decide which buffers to shoot down are fairly dumb, but they only cut in on machines with large highmem:lowmem ratios and the code footprint is tiny.

The buffer.c change implements the buffer_head accounting - it sets the upper limit on buffer_head memory occupancy to 10% of ZONE_NORMAL.

A possible side-effect of this change is that the kernel will perform more calls to get_block() to map pages to disk. This will only be observed when a file is being repeatedly overwritten - this is the only case in which the "cached get_block result" in the buffers is useful.

I did quite some testing of this back in the delalloc ext2 days, and was not able to come up with a test in which the cached get_block result was measurably useful. That's for ext2, which has a fast get_block().

A desirable side effect of this patch is that the kernel will be able to cache much more blockdev pagecache in ZONE_NORMAL, so there are more ext2/3 indirect blocks in cache, so with some workloads, less I/O will be performed.

In mpage_writepage(): if the number of buffer_heads is excessive then buffers are stripped from pages as they are submitted for writeback. This change is only useful for filesystems which are using the mpage code. That's ext2 and ext3-writeback and JFS. An mpage patch for reiserfs was floating about but seems to have got lost.

There is no need to strip buffers for reads because the mpage code does not attach buffers for reads.

These are perhaps not the most appropriate buffer_heads to toss away. Perhaps something smarter should be done to detect file overwriting, or to toss the 'oldest' buffer_heads first.

In refill_inactive(): if the number of buffer_heads is excessive then strip buffers from pages as they move onto the inactive list. This change is useful for all filesystems. This approach is good because pages which are being repeatedly overwritten will remain on the active list and will retain their buffers, whereas pages which are not being overwritten will be stripped.
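
The accounting and stripping policy described above can be illustrated with a small user-space sketch. This is not the kernel code from this patch: apart from the buffer_heads_over_limit flag, every name here (the sketch_* functions, struct sketch_page, SKETCH_BH_SIZE) is hypothetical, and the 96-byte buffer_head size is an assumption chosen only to make the arithmetic concrete.

```c
#include <assert.h>

/* Assumed size of one buffer_head in bytes; illustrative, not the real value. */
#define SKETCH_BH_SIZE 96UL

static unsigned long max_buffer_heads;   /* cap derived from 10% of lowmem */
static unsigned long nr_buffer_heads;    /* buffer_heads currently allocated */
static int buffer_heads_over_limit;      /* flag consulted by the strip paths */

/* Derive the cap: buffer_heads may occupy at most 10% of ZONE_NORMAL. */
static void sketch_init_limit(unsigned long zone_normal_bytes)
{
	unsigned long budget = zone_normal_bytes / 10;

	max_buffer_heads = budget / SKETCH_BH_SIZE;
}

/* Recompute the flag whenever the count changes. */
static void sketch_recalc(void)
{
	buffer_heads_over_limit = nr_buffer_heads > max_buffer_heads;
}

static void sketch_alloc(unsigned long n)
{
	nr_buffer_heads += n;
	sketch_recalc();
}

static void sketch_free(unsigned long n)
{
	assert(n <= nr_buffer_heads);    /* never free more than was allocated */
	nr_buffer_heads -= n;
	sketch_recalc();
}

/* Hypothetical stand-in for a page that may carry buffers. */
struct sketch_page {
	unsigned long nr_bh;             /* buffer_heads attached to the page */
};

/*
 * Called as a page moves to the inactive list (or is submitted for
 * writeback): strip its buffers only while the system is over the cap.
 * Returns 1 if the page was stripped, 0 otherwise.
 */
static int sketch_strip_if_over_limit(struct sketch_page *page)
{
	if (!buffer_heads_over_limit || page->nr_bh == 0)
		return 0;
	sketch_free(page->nr_bh);
	page->nr_bh = 0;
	return 1;
}
```

In this sketch a page that stays on the active list is never passed to sketch_strip_if_over_limit(), so it keeps its buffers, which mirrors the reasoning given for the refill_inactive() change.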
author: Andrew Morton <akpm@digeo.com> 2002-09-09 21:09:33 -0700
committer: Linus Torvalds <torvalds@penguin.transmeta.com> 2002-09-09 21:09:33 -0700
commit: e182d61263b7d534df77dde8213e40d955a53029 (patch)
tree: 139f5fa4b1f51c01e86538c005cf7c7cf77cedd6 /include
parent: ce92adf354bfe61de4071297d554c49f623e08aa (diff)
2 files changed, 2 insertions, 0 deletions
diff --git a/include/linux/buffer_head.h b/include/linux/buffer_head.h
index f9c9aafdf036..6e98963a7a49 100644
--- a/include/linux/buffer_head.h
+++ b/include/linux/buffer_head.h
@@ -167,6 +167,7 @@ void wakeup_bdflush(void);
 struct buffer_head *alloc_buffer_head(void);
 void free_buffer_head(struct buffer_head * bh);
 void FASTCALL(unlock_buffer(struct buffer_head *bh));
+extern int buffer_heads_over_limit;
 
 /*
  * Generic address_space_operations implementations for buffer_head-backed
diff --git a/include/linux/pagevec.h b/include/linux/pagevec.h
index d7fa36270a90..278689b2fb2a 100644
--- a/include/linux/pagevec.h
+++ b/include/linux/pagevec.h
@@ -20,6 +20,7 @@ void __pagevec_free(struct pagevec *pvec);
 void __pagevec_lru_add(struct pagevec *pvec);
 void lru_add_drain(void);
 void pagevec_deactivate_inactive(struct pagevec *pvec);
+void pagevec_strip(struct pagevec *pvec);
 
 static inline void pagevec_init(struct pagevec *pvec)
 {