The writeback code paths which walk the superblocks and inodes are
being passed an increasing number of arguments.
The patch wraps those args into the new `struct writeback_control',
and uses that instead. There is no functional change.
The new writeback_control structure is passed down through the
writeback paths in the place where the old `nr_to_write' pointer used
to be.
writeback_control will be used to pass new information up and down the
writeback paths, such as whether the writeback should be non-blocking
and whether queue congestion was encountered.
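A minimal user-space sketch of the idea (the field names here are illustrative stand-ins, not the exact kernel definition):

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for `struct writeback_control':
 * it bundles the parameters that were previously passed as separate
 * arguments down the writeback call chain. */
struct writeback_control {
	long nr_to_write;		/* pages left to write; decremented as we go */
	int nonblocking;		/* don't block on congested request queues */
	int encountered_congestion;	/* reported back up to the caller */
};

/* A writeback path now takes one pointer instead of a growing argument
 * list, and can report state back to the caller through the same struct. */
static void writeback_one_page(struct writeback_control *wbc)
{
	wbc->nr_to_write--;
	if (wbc->nonblocking)
		wbc->encountered_congestion = 1;	/* pretend the queue was busy */
}
```

The point is that adding a new knob later means adding one struct field, not touching every function signature on the path.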
generic_writepages() is just a wrapper around mpage_writepages(), so
inline it.
Spot the difference:
aops.readpage
aops.readpages
aops.writepage
aops.writeback_mapping
The patch renames `writeback_mapping' to `writepages'.
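The naming fix can be seen in a simplified operations table (field types here are illustrative, not the real kernel signatures):

```c
#include <assert.h>
#include <stddef.h>

/* Sketch of an address_space operations table after the rename: the
 * four methods now use the consistent forms readpage/readpages/
 * writepage/writepages. */
struct address_space_operations {
	int (*readpage)(void *page);
	int (*readpages)(void *pages, unsigned nr);
	int (*writepage)(void *page);
	int (*writepages)(void *mapping);	/* formerly `writeback_mapping' */
};

static int my_writepages(void *mapping)
{
	(void)mapping;
	return 0;
}

/* A filesystem fills in only the methods it implements. */
static const struct address_space_operations my_aops = {
	.writepages = my_writepages,
};
```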
Multipage BIO writeout from the pagecache.
It's pretty much the same as multipage reads: it falls back to buffers
if things get complex.
The write case is a little more involved because it must handle both
pages which have buffers and pages which do not. If the page doesn't
have buffers, this code does not add them.
Implements BIO-based multipage reads into the pagecache, and turns this
on for ext2.
CPU load for `cat large_file > /dev/null' is reduced by approximately
15%. Similar reductions for tiobench with a single thread. (Earlier
claims of 25% were exaggerated - they were measured with slab debug
enabled. But 15% isn't bad for a load which is dominated by copy_*_user
costs).
With 2, 4 and 8 tiobench threads, throughput is increased as well, which was
unexpected. It's due to request queue weirdness. (Generally the
request queueing is doing bad things under certain workloads - that's a
separate issue.)
BIOs of up to 64 kbytes are assembled and submitted for readahead and
for single-page reads. So the work involved in reading 32 pages has gone
from:
- allocate and attach 32 buffer_heads
- submit 32 buffer_heads
- allocate 32 bios
- submit 32 bios
to:
- allocate 2 bios
- submit 2 bios
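The arithmetic behind the "2 bios" figure can be sketched as follows (constants are the ones stated above: 4k pages, 64k BIOs):

```c
#include <assert.h>

/* Illustrative arithmetic only: with 4 KB pages and BIOs capped at
 * 64 KB, one BIO covers 16 pages, so a 32-page readahead needs just
 * 2 BIO allocations and 2 submissions instead of 32 of each. */
enum { PAGE_SIZE_BYTES = 4096, BIO_MAX_BYTES = 64 * 1024 };

static unsigned bios_needed(unsigned nr_pages)
{
	unsigned pages_per_bio = BIO_MAX_BYTES / PAGE_SIZE_BYTES;

	/* round up: a partial final BIO still costs one allocation */
	return (nr_pages + pages_per_bio - 1) / pages_per_bio;
}
```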
These pages never have buffers attached. Buffers will be attached
later if the application writes to these pages (file overwrite).
The first version of this code (in the "delayed allocation" patches)
tries to handle everything - bios which start mid-page, bios which end
mid-page and pages which are covered by multiple bios. It is very
complex code and in fact appears to be incorrect: out-of-order BIO
completion could cause a page to come unlocked at the wrong time.
This implementation is much simpler: if things get complex, it just
falls back to the buffer-based block_read_full_page(), which isn't
going away, and which understands all that complexity. There's no
point in doing this in two places.
This code will bypass the buffer layer for
- fully-mapped pages which are on-disk contiguous.
- fully unmapped pages (holes)
- partially unmapped pages, where the unmappedness is at the end of
the page (end-of-file).
and everything else falls back to buffers.
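The fast-path test above amounts to checking that a page's blocks are a (possibly empty) mapped run followed by a (possibly empty) unmapped tail. A hedged sketch, with an invented block-state encoding; the real code also checks on-disk contiguity of the mapped run, which is elided here:

```c
#include <assert.h>

enum blk { MAPPED, HOLE };

/* blocks[] describes each block of the page, in file order.
 * Returns 1 if the page can go direct to BIO under the rules above:
 * fully mapped, fully unmapped, or mapped with an unmapped tail.
 * Anything else falls back to block_read_full_page(). */
static int can_go_direct_to_bio(const enum blk *blocks, int nblocks)
{
	int i = 0;

	while (i < nblocks && blocks[i] == MAPPED)
		i++;			/* leading mapped run */
	while (i < nblocks && blocks[i] == HOLE)
		i++;			/* unmapped tail (hole / end-of-file) */
	return i == nblocks;		/* any MAPPED after a HOLE fails */
}
```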
This means that with blocksize == PAGE_CACHE_SIZE, 100% of pages are
handed direct to BIO. With a heavy 10-minute dbench run on 4k
PAGE_CACHE_SIZE and 1k blocks, 95% of pages were handed direct to BIO.
Almost all of the other 5% were passed to block_read_full_page()
because they were already partially uptodate from an earlier sub-page
write(). This ratio will fall if PAGE_CACHE_SIZE/blocksize is greater
than four. But if that's the case, CPU efficiency is far from the main
concern - there are significant seek and bandwidth problems just at 4
blocks per page.
This code will stress out the block layer somewhat - RAID0 doesn't like
multipage BIOs, and there are probably others. RAID0 seems to struggle
along - readahead fails but read falls back to single-page reads, which
succeed. Such problems may be worked around by setting MPAGE_BIO_MAX_SIZE
to PAGE_CACHE_SIZE in fs/mpage.c.
It is trivial to enable multipage reads for many other filesystems. We
can do that after completion of external testing of ext2.