| Age | Commit message (Collapse) | Author |
|
Some of the code has been gradually transitioned to using the proper
struct request_queue, but there's lots left. So do a full sweet of
the kernel and get rid of this typedef and replace its uses with
the proper type.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
No need to warn unregister_blkdev() failure by the callers. (The previous
patch makes unregister_blkdev() print error message in error case)
Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Currently, the freezer treats all tasks as freezable, except for the kernel
threads that explicitly set the PF_NOFREEZE flag for themselves. This
approach is problematic, since it requires every kernel thread to either
set PF_NOFREEZE explicitly, or call try_to_freeze(), even if it doesn't
care for the freezing of tasks at all.
It seems better to only require the kernel threads that want to or need to
be frozen to use some freezer-related code and to remove any
freezer-related code from the other (nonfreezable) kernel threads, which is
done in this patch.
The patch causes all kernel threads to be nonfreezable by default (ie. to
have PF_NOFREEZE set by default) and introduces the set_freezable()
function that should be called by the freezable kernel threads in order to
unset PF_NOFREEZE. It also makes all of the currently freezable kernel
threads call set_freezable(), so it shouldn't cause any (intentional)
change of behaviour to appear. Additionally, it updates documentation to
describe the freezing of tasks more accurately.
[akpm@linux-foundation.org: build fixes]
Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Acked-by: Nigel Cunningham <nigel@nigel.suspend2.net>
Cc: Pavel Machek <pavel@ucw.cz>
Cc: Oleg Nesterov <oleg@tv-sign.ru>
Cc: Gautham R Shenoy <ego@in.ibm.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
The name 'pin' was badly chosen, it doesn't pin a pipe buffer
in the most commonly used sense in the kernel. So change the
name to 'confirm', after debating this issue with Hugh
Dickins a bit.
A good return from ->confirm() means that the buffer is really
there, and that the contents are good.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
We need to move even more stuff into the header so that folks can use
the splice_to_pipe() implementation instead of open-coding a lot of
pipe knowledge (see relay implementation), so move to our own header
file finally.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
This gets rid of the dependency on ->sendfile() for receiving data
and converts loop to ->splice_read() instead.
Also includes an IV offset fix from Hugh Dickins.
Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
|
|
The kernel on-demand loop device instantiation breaks several user space
tools as the tools are not ready to cope with the "on-demand feature". Fix
it by instantiate default 8 loop devices and also reinstate max_loop module
parameter.
Signed-off-by: Ken Chen <kenchen@google.com>
Acked-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
... doh
Jeremy Fitzhardinge noted that the recent loop.c cleanups worked, but
cause lockdep to complain.
Ouch. OK, the deadlock is real and yes, I'm an idiot. Speaking of which,
we probably want to s/lock/pin/ in drivers/base/map.c to avoid such
brainos again. And yes, this stuff needs clear documentation. Will try
to put one together once I get some sleep...
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Cc: Jeremy Fitzhardinge <jeremy@goop.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Acked-by: Ken Chen <kenchen@google.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
It's very common for file systems to need to zero part or all of a page,
the simplist way is just to use kmap_atomic() and memset(). There's
actually a library function in include/linux/highmem.h that does exactly
that, but it's confusingly named memclear_highpage_flush(), which is
descriptive of *how* it does the work rather than what the *purpose* is.
So this patchset renames the function to zero_user_page(), and calls it
from the various places that currently open code it.
This first patch introduces the new function call, and converts all the
core kernel callsites, both the open-coded ones and the old
memclear_highpage_flush() ones. Following this patch is a series of
conversions for each file system individually, per AKPM, and finally a
patch deprecating the old call. The diffstat below shows the entire
patchset.
[akpm@linux-foundation.org: fix a few things]
Signed-off-by: Nate Diller <nate.diller@gmail.com>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Remove artificial maximum 256 loop device that can be created due to a
legacy device number limit. Searching through lkml archive, there are
several instances where users complained about the artificial limit that
the loop driver impose. There is no reason to have such limit.
This patch rid the limit entirely and make loop device and associated block
queue instantiation on demand. With on-demand instantiation, it also gives
the benefit of not wasting memory if these devices are not in use (compare
to current implementation that always create 8 loop devices), a net
improvement in both areas. This version is both tested with creation of
large number of loop devices and is compatible with existing losetup/mount
user land tools.
There are a number of people who worked on this and provided valuable
suggestions, in no particular order, by:
Jens Axboe
Jan Engelhardt
Christoph Hellwig
Thomas M
Signed-off-by: Ken Chen <kenchen@google.com>
Cc: Jan Engelhardt <jengelh@linux01.gwdg.de>
Cc: Christoph Hellwig <hch@lst.de>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Remove the destroy_dirty_buffers argument from invalidate_bdev(), it hasn't
been used in 6 years (so akpm says).
find * -name \*.[ch] | xargs grep -l invalidate_bdev |
while read file; do
quilt add $file;
sed -ie 's/invalidate_bdev(\([^,]*\),[^)]*)/invalidate_bdev(\1)/g' $file;
done
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
|
|
Signed-off-by: Josef Sipek <jsipek@fsl.cs.sunysb.edu>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
[try #6]
Move the loop device ioctl compat stuff from fs/compat_ioctl.c to the loop
driver so that the loop header file doesn't need to be included.
Signed-Off-By: David Howells <dhowells@redhat.com>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
|
|
Forward port of the patch by Solar and ported by Julio.
Compiles, boots, and passes my looptorturetest.sh.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Cc: Julio Auto <mindvortex@gmail.com>
Cc: Solar Designer <solar@openwall.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Convert loop.c from the deprecated kernel_thread to kthread. This patch
simplifies the code quite a bit and passes similar testing to the previous
submission on both emulated x86 and s390.
Changes since last submission:
switched to using a rather simple loop based on
wait_event_interruptible.
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This eliminates the i_blksize field from struct inode. Filesystems that want
to provide a per-inode st_blksize can do so by providing their own getattr
routine instead of using the generic_fillattr() function.
Note that some filesystems were providing pretty much random (and incorrect)
values for i_blksize.
[bunk@stusta.de: cleanup]
[akpm@osdl.org: generic_fillattr() fix]
Signed-off-by: "Theodore Ts'o" <tytso@mit.edu>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de>
Signed-off-by: Adrian Bunk <bunk@stusta.de>
|
|
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/devfs-2.6: (22 commits)
[PATCH] devfs: Remove it from the feature_removal.txt file
[PATCH] devfs: Last little devfs cleanups throughout the kernel tree.
[PATCH] devfs: Rename TTY_DRIVER_NO_DEVFS to TTY_DRIVER_DYNAMIC_DEV
[PATCH] devfs: Remove the tty_driver devfs_name field as it's no longer needed
[PATCH] devfs: Remove the line_driver devfs_name field as it's no longer needed
[PATCH] devfs: Remove the videodevice devfs_name field as it's no longer needed
[PATCH] devfs: Remove the gendisk devfs_name field as it's no longer needed
[PATCH] devfs: Remove the miscdevice devfs_name field as it's no longer needed
[PATCH] devfs: Remove the devfs_fs_kernel.h file from the tree
[PATCH] devfs: Remove devfs_remove() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_cdev() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_bdev() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_symlink() function from the kernel tree
[PATCH] devfs: Remove devfs_mk_dir() function from the kernel tree
[PATCH] devfs: Remove devfs_*_tape() functions from the kernel tree
[PATCH] devfs: Remove devfs support from the sound subsystem
[PATCH] devfs: Remove devfs support from the ide subsystem.
[PATCH] devfs: Remove devfs support from the serial subsystem
[PATCH] devfs: Remove devfs from the init code
[PATCH] devfs: Remove devfs from the partition code
...
|
|
Same as with already do with the file operations: keep them in .rodata and
prevents people from doing runtime patching.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Cc: Steven French <sfrench@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
And remove the now unneeded number field.
Also fixes all drivers that set these fields.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Also fixes up all files that #include it.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Removes the devfs_remove() function and all callers of it.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
Removes the devfs_mk_dir() function and all callers of it.
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
|
|
This reverts commit c7b2eff059fcc2d1b7085ee3d84b79fd657a537b.
Hugh Dickins explains:
"It seems too little tested: "losetup -d /dev/loop0" fails with
EINVAL because nothing sets lo_thread; but even when you patch
loop_thread() to set lo->lo_thread = current, it can't survive
more than a few dozen iterations of the loop below (with a tmpfs
mounted on /tst):
j=0
cp /dev/zero /tst
while :
do
let j=j+1
echo "Doing pass $j"
losetup /dev/loop0 /tst/zero
mkfs -t ext2 -b 1024 /dev/loop0 >/dev/null 2>&1
mount -t ext2 /dev/loop0 /mnt
umount /mnt
losetup -d /dev/loop0
done
it collapses with failed ioctl then BUG_ON(!bio).
I think the original lo_done completion was more subtle and safe
than the kthread conversion has allowed for."
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Update loop.c to use a kthread instead of a deprecated kernel_thread for
loop devices.
[akpm@osdl.org: don't change the thread's name]
Signed-off-by: Serge E. Hallyn <serue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
While writing a version of losetup, I ran into the problem that the loop
device was returning total garbage.
It turns out the problem was that this losetup was only issuing the
LOOP_SET_FD ioctl and not issuing a subsequent LOOP_SET_STATUS ioctl. This
losetup didn't have any special status to set, so it left out the call.
The deeper cause is that loop_set_fd sets the transfer function to NULL,
which causes no transfer to happen lo_do_transfer.
This patch fixes the problem by setting transfer to transfer_none in
loop_set_fd.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Check that kernel_thread() succeeded, so we don't wait for something which
cannot happen.
Signed-off-by: Herbert Poetzl <herbert@13thfloor.at>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Semaphore to mutex conversion.
The conversion was generated via scripts, and the result was validated
automatically via a script as well.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Cc: Jens Axboe <axboe@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
|
|
Remove the "inline" keyword from a bunch of big functions in the kernel with
the goal of shrinking it by 30kb to 40kb
Signed-off-by: Arjan van de Ven <arjan@infradead.org>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
Acked-by: Jeff Garzik <jgarzik@pobox.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
convert the block loop device from semaphores to completions.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
This patch converts the inode semaphore to a mutex. I have tested it on
XFS and compiled as much as one can consider on an ia64. Anyway your
luck with it might be different.
Modified-by: Ingo Molnar <mingo@elte.hu>
(finished the conversion)
Signed-off-by: Jes Sorensen <jes@sgi.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
|
|
readpage(), prepare_write(), and commit_write() callers are updated to
understand the special return code AOP_TRUNCATED_PAGE in the style of
writepage() and WRITEPAGE_ACTIVATE. AOP_TRUNCATED_PAGE tells the caller that
the callee has unlocked the page and that the operation should be tried again
with a new page. OCFS2 uses this to detect and work around a lock inversion in
its aop methods. There should be no change in behaviour for methods that don't
return AOP_TRUNCATED_PAGE.
WRITEPAGE_ACTIVATE is also prepended with AOP_ for consistency and they are
made enums so that kerneldoc can be used to document their semantics.
Signed-off-by: Zach Brown <zach.brown@oracle.com>
|
|
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Looks like locking can be optimised quite a lot. Increase lock widths
slightly so lo_lock is taken fewer times per request. Also it was quite
trivial to cover lo_pending with that lock, and remove the atomic
requirement. This also makes memory ordering explicitly correct, which is
nice (not that I particularly saw any mem ordering bugs).
Test was reading 4 250MB files in parallel on ext2-on-tmpfs filesystem (1K
block size, 4K page size). System is 2 socket Xeon with HT (4 thread).
intel:/home/npiggin# umount /dev/loop0 ; mount /dev/loop0 /mnt/loop ; /usr/bin/time ./mtloop.sh
Before:
0.24user 5.51system 0:02.84elapsed 202%CPU (0avgtext+0avgdata 0maxresident)k
0.19user 5.52system 0:02.88elapsed 198%CPU (0avgtext+0avgdata 0maxresident)k
0.19user 5.57system 0:02.89elapsed 198%CPU (0avgtext+0avgdata 0maxresident)k
0.22user 5.51system 0:02.90elapsed 197%CPU (0avgtext+0avgdata 0maxresident)k
0.19user 5.44system 0:02.91elapsed 193%CPU (0avgtext+0avgdata 0maxresident)k
After:
0.07user 2.34system 0:01.68elapsed 143%CPU (0avgtext+0avgdata 0maxresident)k
0.06user 2.37system 0:01.68elapsed 144%CPU (0avgtext+0avgdata 0maxresident)k
0.06user 2.39system 0:01.68elapsed 145%CPU (0avgtext+0avgdata 0maxresident)k
0.06user 2.36system 0:01.68elapsed 144%CPU (0avgtext+0avgdata 0maxresident)k
0.06user 2.42system 0:01.68elapsed 147%CPU (0avgtext+0avgdata 0maxresident)k
Signed-off-by: Nick Piggin <nickpiggin@yahoo.com.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
This is a megarollup of ~60 patches which give various things static scope.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Implements fallback to file_operations->write in the case that
aops->{prepare,commit}_write are not present on the backing filesystem.
The fallback happens in two different ways:
- For normal loop devices, i.e. ones which do not do transformation on
the data but simply pass it along, we simply call fops->write. This
should be pretty much just as fast as using aops->{prepare,commit}_write
directly.
- For all other loop devices (e.g. xor and cryptoloop), i.e. all the
ones which may be doing transformations on the data, we allocate and map
a page (once for each bio), then for each bio vec we copy the bio vec
page data to our mapped page, apply the loop transformation, and use
fops->write to write out the transformed data from our page. Once all
bio vecs from the bio are done, we unmap and free the page.
This approach is the absolute minimum of overhead I could come up with and
for performance hungry people, as you can see I left the address space
operations method in place for filesystems which implement
aops->{prepare,commit}_write.
I have tested this patch with normal loop devices using
aops->{prepare,commit}_write on the backing filesystem, with normal loop
devices using the fops->write code path and with cryptoloop devices using
the double buffering + fops->write code path.
Signed-off-by: Anton Altaparmakov <aia21@cantab.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
With Andries Brouwer <Andries.Brouwer@cwi.nl>
Fix various recursion scenarios wherein it was possible to mount a loop
device on itself, either directly or via intermediate loops devices.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
Convert loopback device to new module_param to get rid of warning.
Signed-off-by: Stephen Hemminger <shemminger@osdl.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
We have a fun situation with read_descriptor_t - all its instances end
up passed to some actor; these actors use desc->buf as their private
data; there are 5 of them and they expect resp:
struct lo_read_data *
struct svc_rqst *
struct file *
struct rpc_xprt *
char __user *
IOW, there is no type safety whatsoever; the field is essentially untyped,
we rely on the fact that actor is chosen by the same code that sets ->buf
and expect it to put something of the right type there.
Right now desc->buf is declared as char __user *. Moreover, the last
argument of ->sendfile() (what should be stored in ->buf) is void __user *,
even though it's actually _never_ a userland pointer.
If nothing else, ->sendfile() should take void * instead; that alone removes
a bunch of bogus warnings. I went further and replaced desc->buf with a
union of void * and char __user *.
|
|
From: Russell King <rmk+lkml@arm.linux.org.uk>
It appears the loop driver has had one flush_dcache_page() call added for
the case where it writes to the backing device page cache pages.
However, it seems to be missing the call where it writes to its own page
cache pages.
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
|
|
|
|
From: Yury Umanets <torque@ukrpost.net>
I have found small inconsistency in loop_set_fd(). It checks if
->sendfile() is implemented for passed block device file. But in fact,
loop back device driver never calls it. It uses ->sendfile() from backing
store file.
|
|
From: Nigel Cunningham <ncunningham@users.sourceforge.net>
A few weeks ago, Pavel and I agreed that PF_IOTHREAD should be renamed to
PF_NOFREEZE. This reflects the fact that some threads so marked aren't
actually used for IO while suspending, but simply shouldn't be frozen.
This patch, against 2.6.5 vanilla, applies that change. In the
refrigerator calls, the actual value doesn't matter (so long as it's
non-zero) and it makes more sense to use PF_FREEZE so I've used that.
|
|
From: Jens Axboe <axboe@suse.de>,
Chris Mason,
me, others.
The global unplug list causes horrid spinlock contention on many-disk
many-CPU setups - throughput is worse than halved.
The other problem with the global unplugging is of course that it will cause
the unplugging of queues which are unrelated to the I/O upon which the caller
is about to wait.
So what we do to solve these problems is to remove the global unplug and set
up the infrastructure under which the VFS can tell the block layer to unplug
only those queues which are relevant to the page or buffer_head whcih is
about to be waited upon.
We do this via the very appropriate address_space->backing_dev_info structure.
Most of the complexity is in devicemapper, MD and swapper_space, because for
these backing devices, multiple queues may need to be unplugged to complete a
page/buffer I/O. In each case we ensure that data structures are in place to
permit us to identify all the lower-level queues which contribute to the
higher-level backing_dev_info. Each contributing queue is told to unplug in
response to a higher-level unplug.
To simplify things in various places we also introduce the concept of a
"synchronous BIO": it is tagged with BIO_RW_SYNC. The block layer will
perform an immediate unplug when it sees one of these go past.
|
|
From: Chris Mason <mason@suse.com>
I think Andrew and I managed to mismerge the loop setup race fix.
loop_set_fd is using get_capacity() to read the size of the disk and
sending that to bd_set_size.
But, it is doing this before calling set_capacity, so the size being used
is wrong. This should clean things up.
|
|
From: Chris Mason <mason@suse.com>
There's a race in loopback setup, it's easiest to trigger with one or more
procs doing loopback mounts at the same time. The problem is that
fs/block_dev.c:do_open() only calls bdev_set_size on the first open.
Picture two procs:
proc1: mount -o loop file1 mnt1
proc2: mount -o loop file2 mnt2
proc1 proc2
open /dev/loop0 # bd_openers now 1
do_open
bd_set_size(bdev, 0) # loop unbound, so bdev size is 0
open /dev/loop0 # bd_openers now 2
loop_set_fd # disk capacity now correct, but
# bdev not updated
mount /dev/loop0 /mnt
do_open
Because bd_openers != 0 for the last do_open, bd_set_size is not called
again and a size of 0 is used. This eventually leads to an oops when the
loop device is unmounted, because fsync_bdev calls block_write_full_page
who decides every page on the block device is outside i_size and unmaps
them.
When ext2 or reiserfs try to sync a metadata buffer, we get an oops on
because the buffers are no longer mapped.
The patch below changes loop_set_fd and loop_clr_fd to also manipulate the
size of the block device, which fixes things for me.
|