author     bill.irwin@oracle.com <bill.irwin@oracle.com>   2005-03-04 17:27:21 -0800
committer  Linus Torvalds <torvalds@ppc970.osdl.org>       2005-03-04 17:27:21 -0800
commit     1eeae0158ecd0535a2bc257a53d3472cc37ceb15 (patch)
tree       e51847da565874b2712dc0d789eb777794334fd8 /include/linux/fs.h
parent     3db29f35016f7bb9f2aa459f6dbca26d4ec606c9 (diff)
[PATCH] make mapping->tree_lock an rwlock
Convert mapping->tree_lock to an rwlock.

with:

dd if=/dev/zero of=foo bs=1 count=2M  0.80s user 4.15s system 99% cpu 4.961 total
dd if=/dev/zero of=foo bs=1 count=2M  0.73s user 4.26s system 100% cpu 4.987 total
dd if=/dev/zero of=foo bs=1 count=2M  0.79s user 4.25s system 100% cpu 5.034 total

dd if=foo of=/dev/null bs=1  0.80s user 3.12s system 99% cpu 3.928 total
dd if=foo of=/dev/null bs=1  0.77s user 3.15s system 100% cpu 3.914 total
dd if=foo of=/dev/null bs=1  0.92s user 3.02s system 100% cpu 3.935 total

(3.926 s average: 1.87 usecs per operation)

without:

dd if=/dev/zero of=foo bs=1 count=2M  0.85s user 3.92s system 99% cpu 4.780 total
dd if=/dev/zero of=foo bs=1 count=2M  0.78s user 4.02s system 100% cpu 4.789 total
dd if=/dev/zero of=foo bs=1 count=2M  0.82s user 3.94s system 99% cpu 4.763 total
dd if=/dev/zero of=foo bs=1 count=2M  0.71s user 4.10s system 99% cpu 4.810 total

dd if=foo of=/dev/null bs=1  0.76s user 2.68s system 100% cpu 3.438 total
dd if=foo of=/dev/null bs=1  0.74s user 2.72s system 99% cpu 3.465 total
dd if=foo of=/dev/null bs=1  0.67s user 2.82s system 100% cpu 3.489 total
dd if=foo of=/dev/null bs=1  0.70s user 2.62s system 99% cpu 3.326 total

(3.430 s average: 1.635 usecs per operation)

So on a P4, the additional cost of the rwlock is ~240 nsecs per
one-byte write().

On the other hand:

From: Peter Chubb <peterc@gelato.unsw.edu.au>

As part of the Gelato scalability focus group, we've been running OSDL's
Re-AIM7 benchmark with an I/O-intensive load at varying numbers of
processors.  The current kernel shows severe contention on the tree_lock
in the address_space structure when running on tmpfs or ext2 on a RAM
disk.

Lockstat output for a 12-way:

SPINLOCKS          HOLD            WAIT
  UTIL    CON   MEAN(  MAX )   MEAN(  MAX )(% CPU)      TOTAL NOWAIT  SPIN RJECT  NAME
          5.5%  0.4us(3177us)   28us(  20ms)(44.2%) 131821954  94.5%  5.5% 0.00%  *TOTAL*
 72.3%   13.1%  0.5us( 9.5us)   29us(  20ms)(42.5%)  50542055  86.9% 13.1%    0%  find_lock_page+0x30
 23.8%      0%  385us(3177us)    0us                    23235   100%    0%    0%  exit_mmap+0x50
 11.5%   0.82%  0.1us( 101us)   17us(5670us)( 1.6%)  50665658  99.2% 0.82%    0%  dnotify_parent+0x70

Replacing the spinlock with a multi-reader lock fixes this problem
without unduly affecting anything else.

Here are the benchmark results (jobs per minute at a 50-client level,
average of 5 runs, standard deviation in parens) on an HP Olympia with
3 cells, 12 processors, and dnotify turned off (after this spinlock,
the spinlock in dnotify_parent is the worst contended for this
workload):

            tmpfs....................    ext2....................
#CPUs   spinlock     rwlock             spinlock    rwlock
  1      7556(15)    7588(17)  +0.42%    3744(20)   3791(16)  +1.25%
  2     13743(31)   13791(33)  +0.35%    6405(30)   6413(24)  +0.12%
  4     23334(111)  22881(154) -2%       9648(51)   9595(50)  -0.55%
  8     33580(240)  36163(190) +7.7%    13183(63)  13070(68)  -0.85%
 12     28748(170)  44064(238) +53%     12681(49)  14504(105) +14%

And on a single-processor Pentium 3:

  1      4177(4)     4169(2)   -0.2%     3811(4)    3820(3)   +0.23%

I'm not sure what's happening in the 4-processor case.  The important
thing to note is that with a spinlock, the benchmark shows worse
performance for a 12-way than for an 8-way box; with the patch, the
12-way performs better, as expected.

We've done some runs with 16-way as well; without the patch, the 16-way
performs worse than the 12-way.

It's a tricky tradeoff, but large-SMP is hurt a lot more by the
spinlocks than small-SMP is by the rwlocks, and I don't think we really
want to implement compile-time either-or locks.

Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
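To make the change concrete, here is a minimal, hypothetical sketch of
the reader-side pattern the conversion enables, loosely modelled on the
find_lock_page() path that dominates the lockstat output above (the
function name cache_lookup and its details are illustrative, not the
kernel's exact code).  Lookups that only walk the radix tree can now
take tree_lock shared via read_lock_irq(), where the spinlock forced
them to serialize:

	/*
	 * Hypothetical reader-side sketch; not the actual kernel code.
	 * Multiple CPUs can hold tree_lock concurrently in read mode.
	 */
	struct page *cache_lookup(struct address_space *mapping,
				  unsigned long offset)
	{
		struct page *page;

		read_lock_irq(&mapping->tree_lock);	/* was: spin_lock_irq() */
		page = radix_tree_lookup(&mapping->page_tree, offset);
		if (page)
			page_cache_get(page);	/* pin page before dropping lock */
		read_unlock_irq(&mapping->tree_lock);

		return page;
	}

This is why the 12-way numbers improve so dramatically: concurrent
find_lock_page() callers no longer spin against each other, only
against writers.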
Diffstat (limited to 'include/linux/fs.h')
 -rw-r--r--  include/linux/fs.h  2
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f07cb9f7977a..c4081935da26 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -335,7 +335,7 @@ struct backing_dev_info;
 struct address_space {
 	struct inode		*host;		/* owner: inode, block_device */
 	struct radix_tree_root	page_tree;	/* radix tree of all pages */
-	spinlock_t		tree_lock;	/* and spinlock protecting it */
+	rwlock_t		tree_lock;	/* and rwlock protecting it */
 	unsigned int		i_mmap_writable;/* count VM_SHARED mappings */
 	struct prio_tree_root	i_mmap;		/* tree of private and shared mappings */
 	struct list_head	i_mmap_nonlinear;/*list VM_NONLINEAR mappings */
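For contrast, a hedged sketch of the writer side under the same scheme:
paths that modify page_tree still need exclusive access, so for them the
conversion only turns spin_lock_irq() into write_lock_irq().  The
function cache_insert below is illustrative, modelled loosely on
add_to_page_cache()-style insertion, not the kernel's exact code:

	/*
	 * Hypothetical writer-side sketch; not the actual kernel code.
	 * Tree modification still excludes all readers and writers.
	 */
	int cache_insert(struct address_space *mapping, struct page *page,
			 unsigned long offset)
	{
		int error;

		write_lock_irq(&mapping->tree_lock);	/* was: spin_lock_irq() */
		error = radix_tree_insert(&mapping->page_tree, offset, page);
		if (!error) {
			page_cache_get(page);	/* the tree holds a reference */
			page->index = offset;
		}
		write_unlock_irq(&mapping->tree_lock);

		return error;
	}

The ~240 nsec per-operation cost measured above is the price of the
heavier rwlock acquire/release on these paths when uncontended.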