| author | bill.irwin@oracle.com <bill.irwin@oracle.com> | 2005-03-04 17:27:21 -0800 |
|---|---|---|
| committer | Linus Torvalds <torvalds@ppc970.osdl.org> | 2005-03-04 17:27:21 -0800 |
| commit | 1eeae0158ecd0535a2bc257a53d3472cc37ceb15 (patch) | |
| tree | e51847da565874b2712dc0d789eb777794334fd8 /include/linux/fs.h | |
| parent | 3db29f35016f7bb9f2aa459f6dbca26d4ec606c9 (diff) | |
[PATCH] make mapping->tree_lock an rwlock
Convert mapping->tree_lock to an rwlock.
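On the reader side, the conversion pattern looks roughly like this (a simplified sketch of the post-patch page-cache lookup path, not the literal patch hunks):

```c
/*
 * Reader side, simplified: a page-cache lookup that previously took
 * mapping->tree_lock exclusively now takes it shared, so concurrent
 * lookups on the same mapping no longer serialize against each other.
 */
struct page *find_get_page(struct address_space *mapping, unsigned long offset)
{
	struct page *page;

	read_lock_irq(&mapping->tree_lock);	/* was: spin_lock_irq() */
	page = radix_tree_lookup(&mapping->page_tree, offset);
	if (page)
		page_cache_get(page);
	read_unlock_irq(&mapping->tree_lock);	/* was: spin_unlock_irq() */
	return page;
}
```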
With the patch (tree_lock as an rwlock):
dd if=/dev/zero of=foo bs=1 count=2M 0.80s user 4.15s system 99% cpu 4.961 total
dd if=/dev/zero of=foo bs=1 count=2M 0.73s user 4.26s system 100% cpu 4.987 total
dd if=/dev/zero of=foo bs=1 count=2M 0.79s user 4.25s system 100% cpu 5.034 total
dd if=foo of=/dev/null bs=1 0.80s user 3.12s system 99% cpu 3.928 total
dd if=foo of=/dev/null bs=1 0.77s user 3.15s system 100% cpu 3.914 total
dd if=foo of=/dev/null bs=1 0.92s user 3.02s system 100% cpu 3.935 total
(average of the reads: 3.926 s, i.e. 1.87 usecs per call)
Without the patch (plain spinlock):
dd if=/dev/zero of=foo bs=1 count=2M 0.85s user 3.92s system 99% cpu 4.780 total
dd if=/dev/zero of=foo bs=1 count=2M 0.78s user 4.02s system 100% cpu 4.789 total
dd if=/dev/zero of=foo bs=1 count=2M 0.82s user 3.94s system 99% cpu 4.763 total
dd if=/dev/zero of=foo bs=1 count=2M 0.71s user 4.10s system 99% cpu 4.810 total
dd if=foo of=/dev/null bs=1 0.76s user 2.68s system 100% cpu 3.438 total
dd if=foo of=/dev/null bs=1 0.74s user 2.72s system 99% cpu 3.465 total
dd if=foo of=/dev/null bs=1 0.67s user 2.82s system 100% cpu 3.489 total
dd if=foo of=/dev/null bs=1 0.70s user 2.62s system 99% cpu 3.326 total
(average of the reads: 3.430 s, i.e. 1.635 usecs per call)
So on a P4, the additional cost of the rwlock is ~240 nsecs (1.87 - 1.635 usecs) per one-byte read(). On the other hand:
From: Peter Chubb <peterc@gelato.unsw.edu.au>
As part of the Gelato scalability focus group, we've been running OSDL's
Re-AIM7 benchmark with an I/O-intensive load across varying numbers of
processors. The current kernel shows severe contention on the tree_lock in
the address_space structure when running on tmpfs or ext2 on a RAM disk.
Lockstat output for a 12-way:
| UTIL | CON | HOLD MEAN (MAX) | WAIT MEAN (MAX) (% CPU) | TOTAL | NOWAIT | SPIN | RJECT | NAME |
|---|---|---|---|---|---|---|---|---|
| | 5.5% | 0.4us (3177us) | 28us (20ms) (44.2%) | 131821954 | 94.5% | 5.5% | 0.00% | *TOTAL* |
| 72.3% | 13.1% | 0.5us (9.5us) | 29us (20ms) (42.5%) | 50542055 | 86.9% | 13.1% | 0% | find_lock_page+0x30 |
| 23.8% | 0% | 385us (3177us) | 0us | 23235 | 100% | 0% | 0% | exit_mmap+0x50 |
| 11.5% | 0.82% | 0.1us (101us) | 17us (5670us) (1.6%) | 50665658 | 99.2% | 0.82% | 0% | dnotify_parent+0x70 |
Replacing the spinlock with a multi-reader lock fixes this problem,
without unduly affecting anything else.
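Only the lookup paths become shared; insertion and removal still take the lock exclusively, so writers serialize exactly as before. Roughly (again a simplified sketch, not the literal patch hunks):

```c
/*
 * Writer side, simplified: inserting a page into the radix tree still
 * requires exclusive access, so the rwlock only helps workloads where
 * lookups dominate -- which the lockstat above shows they do here
 * (find_lock_page accounts for ~50M of the ~130M acquisitions).
 */
int add_to_page_cache(struct page *page, struct address_space *mapping,
		      pgoff_t offset, int gfp_mask)
{
	int error = radix_tree_preload(gfp_mask & ~__GFP_HIGHMEM);

	if (error == 0) {
		write_lock_irq(&mapping->tree_lock);	/* was: spin_lock_irq() */
		error = radix_tree_insert(&mapping->page_tree, offset, page);
		if (error == 0) {
			page_cache_get(page);
			SetPageLocked(page);
			page->mapping = mapping;
			page->index = offset;
			mapping->nrpages++;
			pagecache_acct(1);
		}
		write_unlock_irq(&mapping->tree_lock);	/* was: spin_unlock_irq() */
		radix_tree_preload_end();
	}
	return error;
}
```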
Here are the benchmark results (jobs per minute at a 50-client level; average
of 5 runs, standard deviation in parentheses) on an HP Olympia with 3 cells and
12 processors, with dnotify turned off (after this spinlock, the spinlock in
dnotify_parent is the most heavily contended for this workload).
| #CPUs | tmpfs spinlock | tmpfs rwlock | change | ext2 spinlock | ext2 rwlock | change |
|---|---|---|---|---|---|---|
| 1 | 7556 (15) | 7588 (17) | +0.42% | 3744 (20) | 3791 (16) | +1.25% |
| 2 | 13743 (31) | 13791 (33) | +0.35% | 6405 (30) | 6413 (24) | +0.12% |
| 4 | 23334 (111) | 22881 (154) | -2% | 9648 (51) | 9595 (50) | -0.55% |
| 8 | 33580 (240) | 36163 (190) | +7.7% | 13183 (63) | 13070 (68) | -0.85% |
| 12 | 28748 (170) | 44064 (238) | +53% | 12681 (49) | 14504 (105) | +14% |
And on a single-processor Pentium III:

| #CPUs | tmpfs spinlock | tmpfs rwlock | change | ext2 spinlock | ext2 rwlock | change |
|---|---|---|---|---|---|---|
| 1 | 4177 (4) | 4169 (2) | -0.2% | 3811 (4) | 3820 (3) | +0.23% |
I'm not sure what's happening in the 4-processor case. The important thing to
note is that with a spinlock, the benchmark shows worse performance on a
12-way than on an 8-way box; with the patch, the 12-way performs better, as
expected. We've done some runs on a 16-way as well; without the patch, the
16-way performs worse than the 12-way.
It's a tricky tradeoff, but large-smp is hurt a lot more by the spinlocks than
small-smp is by the rwlocks. And I don't think we really want to implement
compile-time either-or-locks.
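For reference, "compile-time either-or-locks" would amount to something like the following (a hypothetical sketch; the CONFIG_* symbol and wrapper macros below are invented for illustration and were never merged):

```c
/*
 * Hypothetical compile-time either-or lock: pick the lock flavour for
 * mapping->tree_lock at build time.  Everything here is illustrative;
 * no such config option exists.
 */
#ifdef CONFIG_LARGE_SMP
typedef rwlock_t mapping_lock_t;
#define mapping_lock_init(l)		rwlock_init(l)
#define mapping_read_lock(l)		read_lock(l)	/* shared for lookups */
#define mapping_read_unlock(l)		read_unlock(l)
#define mapping_write_lock(l)		write_lock(l)	/* exclusive for updates */
#define mapping_write_unlock(l)		write_unlock(l)
#else
typedef spinlock_t mapping_lock_t;
#define mapping_lock_init(l)		spin_lock_init(l)
#define mapping_read_lock(l)		spin_lock(l)	/* always exclusive */
#define mapping_read_unlock(l)		spin_unlock(l)
#define mapping_write_lock(l)		spin_lock(l)
#define mapping_write_unlock(l)		spin_unlock(l)
#endif
```

Every locking site would then have to go through the wrappers, which is exactly the kind of dual-flavour maintenance burden argued against above.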
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Diffstat (limited to 'include/linux/fs.h')
| -rw-r--r-- | include/linux/fs.h | 2 |
1 files changed, 1 insertions, 1 deletions
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f07cb9f7977a..c4081935da26 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -335,7 +335,7 @@ struct backing_dev_info;
 struct address_space {
 	struct inode		*host;		/* owner: inode, block_device */
 	struct radix_tree_root	page_tree;	/* radix tree of all pages */
-	spinlock_t		tree_lock;	/* and spinlock protecting it */
+	rwlock_t		tree_lock;	/* and rwlock protecting it */
 	unsigned int		i_mmap_writable;/* count VM_SHARED mappings */
 	struct prio_tree_root	i_mmap;		/* tree of private and shared mappings */
 	struct list_head	i_mmap_nonlinear;/*list VM_NONLINEAR mappings */
