| Age | Commit message (Collapse) | Author |
|
Signed-off-by: David Howells <dhowells@redhat.com>
Acked-by: Arnd Bergmann <arnd@arndb.de>
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Michael Kerrisk <mtk.manpages@gmail.com>
Acked-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Acked-by: Dave Jones <davej@redhat.com>
|
|
To avoid userspace build failures such as:
.../linux/uio.h:37: error: expected `=', `,', `;', `asm' or `__attribute__' before `iov_length'
.../linux/uio.h:47: error: expected declaration specifiers or `...' before `size_t'
move uio functions inside a __KERNEL__ block.
Signed-off-by: Jiri Slaby <jirislaby@gmail.com>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch@infradead.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
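
The `__KERNEL__` guard described in the uio.h fix above can be sketched roughly as follows. This is an illustration only, not the actual header: the `demo_` names are hypothetical, and the helper body is an assumption about what a kernel-only inline function might look like. The point is that a userspace build, which does not define `__KERNEL__`, never sees the guarded helper.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the part of a header that userspace may see. */
struct demo_iovec {
	void *iov_base;
	size_t iov_len;
};

#ifdef __KERNEL__
/*
 * Kernel-only helper (hypothetical). Userspace builds skip this block
 * entirely, so it cannot cause parse errors there.
 */
static inline size_t demo_iov_length(const struct demo_iovec *iov,
				     unsigned long nr_segs)
{
	size_t ret = 0;
	unsigned long seg;

	for (seg = 0; seg < nr_segs; seg++)
		ret += iov[seg].iov_len;
	return ret;
}
#endif /* __KERNEL__ */

/* Reports whether the guarded helper was compiled in for this build. */
static int demo_helpers_visible(void)
{
#ifdef __KERNEL__
	return 1;
#else
	return 0;
#endif
}
```

In a plain userspace compile, `demo_helpers_visible()` returns 0 and only the structure definition is exposed.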
|
|
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Greg KH <greg@kroah.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
- afs and rxrpc switched to kvec; definition of kvec moved to uio.h (duh).
- afs/mntpt.c got missing cast added.
At that point afs is sparse-clean and rxrpc has only one remaining warning
(setsockopt from local variable, protected by set_fs()).
|
|
|
|
verifies declarations against definitions and checks argument
types.
|
|
- writev currently returns -EFAULT if _any_ of the segments has an
invalid address. We should only return -EFAULT if the first segment
has a bad address.
If some of the first segments have valid addresses we need to write
them and return a partial result.
- The current code only checks if the sum-of-lengths is negative. If
individual segments have a negative length but the result is positive
we miss that.
So rework the code to detect this, and to be immune to odd wrapping
situations.
As a bonus, we save one pass across the iovec.
- ditto for readv.
The check for "does any segment have a negative length" has already
been performed in do_readv_writev(), but it's basically free here, and
we need to do it for generic_file_read/write anyway.
This all means that the iov_length() function is unsafe because of
wrap/overflow issues. It should only be used after the
generic_file_read/write or do_readv_writev() checking has been
performed. Its callers have been reviewed and they are OK.
The code now passes LTP testing and has been QA'd by Janet's team.
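
The wrap problem described above can be illustrated with a small userspace sketch. It is not the kernel code: `demo_naive_length()` and `demo_check_segments()` are hypothetical names, and the combined sign test is one possible way to catch both a negative segment length and a wrapped running total in a single pass.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Plain sum of segment lengths: silently wraps on overflow. */
static size_t demo_naive_length(const struct iovec *iov, unsigned long nr_segs)
{
	size_t total = 0;
	unsigned long seg;

	for (seg = 0; seg < nr_segs; seg++)
		total += iov[seg].iov_len;
	return total;
}

/*
 * Single pass that returns the total byte count, or -EINVAL if any
 * segment length or the running total is negative when viewed as ssize_t.
 */
static ssize_t demo_check_segments(const struct iovec *iov,
				   unsigned long nr_segs)
{
	size_t total = 0;
	unsigned long seg;

	for (seg = 0; seg < nr_segs; seg++) {
		size_t len = iov[seg].iov_len;

		total += len;
		/* The top bit of either value marks it as negative. */
		if ((ssize_t)(total | len) < 0)
			return -EINVAL;
	}
	return (ssize_t)total;
}
```

With two segments of length `(size_t)-4096` and `8192`, the naive sum wraps to a plausible-looking 4096, while the checked version rejects the vector.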
|
|
This is Janet Morgan's patch which converts the readv/writev code
to submit all segments for IO before waiting on them, rather than
submitting each segment separately.
This is a critical performance fix for O_DIRECT reads and writes.
Prior to this change, O_DIRECT vectored IO was forced to wait for
completion against each segment of the iovec rather than submitting all
segments and waiting on the lot; i.e., for ten segments, this code will
be ten times faster.
There will also be moderate improvements for buffered IO - smaller code
paths, plus writev() only takes i_sem once.
The patch ended up quite large unfortunately - turned out that the only
sane way to implement this without duplicating significant amounts of
code (the generic_file_write() bounds checking, all the O_DIRECT
handling, etc) was to redo generic_file_read() and generic_file_write()
to take an iovec/nr_segs pair rather than `buf, count'.
New exported functions generic_file_readv() and generic_file_writev()
have been added:
    ssize_t generic_file_readv(struct file *filp, const struct iovec *iov,
                               unsigned long nr_segs, loff_t *ppos);
    ssize_t generic_file_writev(struct file *file, const struct iovec *iov,
                                unsigned long nr_segs, loff_t *ppos);
If a driver does not use these in its file_operations then it will
continue to use the old readv/writev code, which sits in a loop calling
fops->read() or fops->write().
ext2, ext3, JFS and the blockdev driver are currently using this
capability.
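
The dispatch described above (vectored method if the driver provides one, otherwise a per-segment loop) might look roughly like this in a userspace mock-up. Everything here is an assumption made for illustration: `struct demo_file`, `struct demo_file_ops`, and the simplified signatures are hypothetical and omit `loff_t *ppos` and user-pointer handling.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>

struct demo_file;

/* Hypothetical, simplified stand-in for file_operations. */
struct demo_file_ops {
	ssize_t (*read)(struct demo_file *f, char *buf, size_t count);
	ssize_t (*readv)(struct demo_file *f, const struct iovec *iov,
			 unsigned long nr_segs);
};

struct demo_file {
	const struct demo_file_ops *ops;
	const char *data;		/* backing bytes for the mock file */
	size_t size;
	size_t pos;
	unsigned int read_calls;	/* counts per-segment read() calls */
};

/* Use the vectored method when present, else loop over the segments. */
static ssize_t demo_do_readv(struct demo_file *f, const struct iovec *iov,
			     unsigned long nr_segs)
{
	ssize_t total = 0;
	unsigned long seg;

	if (f->ops->readv)
		return f->ops->readv(f, iov, nr_segs);

	for (seg = 0; seg < nr_segs; seg++) {
		ssize_t n = f->ops->read(f, iov[seg].iov_base,
					 iov[seg].iov_len);

		if (n < 0)
			return total ? total : n;
		total += n;
		if ((size_t)n != iov[seg].iov_len)
			break;	/* short read: stop early */
	}
	return total;
}

/* Mock read() that copies from the in-memory backing buffer. */
static ssize_t demo_read(struct demo_file *f, char *buf, size_t count)
{
	size_t left = f->size - f->pos;

	if (count > left)
		count = left;
	memcpy(buf, f->data + f->pos, count);
	f->pos += count;
	f->read_calls++;
	return (ssize_t)count;
}
```

A mock driver that leaves `readv` unset ends up with one `read()` call per segment, which is the cost the new vectored entry points are meant to avoid.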
Some coding cleanups were made in fs/read_write.c. Mainly:
- pass "READ" or "WRITE" around to indicate the direction of the
operation, rather than the (confusing, inverted)
VERIFY_READ/VERIFY_WRITE.
- Use the identifier `nr_segs' everywhere to indicate the iovec
length rather than `count', which is often used to indicate the
number of bytes in the syscall. It was confusing the heck out of me.
- Some cleanups to the raw driver.
- Some additional generality in fs/direct_io.c: the core `struct dio'
used to be a "populate-and-go" thing. Janet has broken that up so
you can initialise a struct dio once, then loop around feeding it
more file segments, then wait on completion against everything.
- In a couple of places we needed to handle the situation where we
knew, a priori, that the user was going to get a short read or write.
File size limit exceeded, read past i_size, etc. We handled that by
shortening the iovec in-place with iov_shorten(). Which is not
particularly pretty, but neither were the alternatives.
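
Shortening an iovec in place, as mentioned in the last item above, could be sketched like this. It is a userspace illustration under assumptions: `demo_iov_shorten()` is a hypothetical name, and the signature and return value (the number of segments still in use) are guesses at a reasonable interface rather than a statement of what the patch contains.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Trim the vector so it describes at most `to` bytes. The segment that
 * crosses the limit is shortened in place; later segments are left
 * untouched and simply fall outside the returned segment count.
 */
static unsigned long demo_iov_shorten(struct iovec *iov,
				      unsigned long nr_segs, size_t to)
{
	unsigned long seg = 0;
	size_t len = 0;

	while (seg < nr_segs) {
		seg++;
		if (len + iov->iov_len >= to) {
			iov->iov_len = to - len;
			break;
		}
		len += iov->iov_len;
		iov++;
	}
	return seg;
}
```

For segments of 100, 200 and 300 bytes and a limit of 250, the second segment is cut to 150 bytes and the caller continues with two segments. The in-place modification is what makes this approach "not particularly pretty": the caller's vector no longer reflects what the user passed in.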
|
|
|