summaryrefslogtreecommitdiff
path: root/drivers/block/rd.c
AgeCommit message (Collapse)Author
2008-02-08rewrite rdNick Piggin
This is a rewrite of the ramdisk block device driver. The old one is really difficult because it effectively implements a block device which serves data out of its own buffer cache. It relies on the dirty bit being set, to pin its backing store in cache, however there are non trivial paths which can clear the dirty bit (eg. try_to_free_buffers()), which had recently lead to data corruption. And in general it is completely wrong for a block device driver to do this. The new one is more like a regular block device driver. It has no idea about vm/vfs stuff. It's backing store is similar to the buffer cache (a simple radix-tree of pages), but it doesn't know anything about page cache (the pages in the radix tree are not pagecache pages). There is one slight downside -- direct block device access and filesystem metadata access goes through an extra copy and gets stored in RAM twice. However, this downside is only slight, because the real buffercache of the device is now reclaimable (because we're not playing crazy games with it), so under memory intensive situations, footprint should effectively be the same -- maybe even a slight advantage to the new driver because it can also reclaim buffer heads. The fact that it now goes through all the regular vm/fs paths makes it much more useful for testing, too. text data bss dec hex filename 2837 849 384 4070 fe6 drivers/block/rd.o 3528 371 12 3911 f47 drivers/block/brd.o Text is larger, but data and bss are smaller, making total size smaller. A few other nice things about it: - Similar structure and layout to the new loop device handlinag. - Dynamic ramdisk creation. - Runtime flexible buffer head size (because it is no longer part of the ramdisk code). - Boot / load time flexible ramdisk size, which could easily be extended to a per-ramdisk runtime changeable size (eg. with an ioctl). - Can use highmem for the backing store. [akpm@linux-foundation.org: fix build] [byron.bbradley@gmail.com: make rd_size non-static] Signed-off-by: Nick Piggin <npiggin@suse.de> Signed-off-by: Byron Bradley <byron.bbradley@gmail.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2008-02-06rd: use is_power_of_2() in drivers/block/rd.c.Robert P. J. Day
Signed-off-by: Robert P. J. Day <rpjday@crashcourse.ca> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-11-14rd: fix data corruption on memory pressureChristian Borntraeger
We have seen ramdisk based install systems, where some pages of mapped libraries and programs were suddendly zeroed under memory pressure. This should not happen, as the ramdisk avoids freeing its pages by keeping them dirty all the time. It turns out that there is a case, where the VM makes a ramdisk page clean, without telling the ramdisk driver. On memory pressure shrink_zone runs and it starts to run shrink_active_list. There is a check for buffer_heads_over_limit, and if true, pagevec_strip is called. pagevec_strip calls try_to_release_page. If the mapping has no releasepage callback, try_to_free_buffers is called. try_to_free_buffers has now a special logic for some file systems to make a dirty page clean, if all buffers are clean. Thats what happened in our test case. The simplest solution is to provide a noop-releasepage callback for the ramdisk driver. This avoids try_to_free_buffers for ramdisk pages. Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com> Acked-by: Nick Piggin <npiggin@suse.de> Cc: "Eric W. Biederman" <ebiederm@xmission.com> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-19Fix misspellings of "system", "controller", "interrupt" and "necessary".Robert P. J. Day
Fix the various misspellings of "system", controller", "interrupt" and "[un]necessary". Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Signed-off-by: Adrian Bunk <bunk@kernel.org>
2007-10-17Remove final traces of long-deprecated "ramdisk" kernel parmRobert P. J. Day
Since the "ramdisk" kernel parameter has been officially deprecated since at least 2.6.18, might as well finally get rid of it. Signed-off-by: Robert P. J. Day <rpjday@mindspring.com> Acked-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-17mm: bdi init hooksPeter Zijlstra
provide BDI constructor/destructor hooks [akpm@linux-foundation.org: compile fix] Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2007-10-10Drop 'size' argument from bio_endio and bi_end_ioNeilBrown
As bi_end_io is only called once when the reqeust is complete, the 'size' argument is now redundant. Remove it. Now there is no need for bio_endio to subtract the size completed from bi_size. So don't do that either. While we are at it, change bi_end_io to return void. Signed-off-by: Neil Brown <neilb@suse.de> Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-07-24[BLOCK] Get rid of request_queue_t typedefJens Axboe
Some of the code has been gradually transitioned to using the proper struct request_queue, but there's lots left. So do a full sweet of the kernel and get rid of this typedef and replace its uses with the proper type. Signed-off-by: Jens Axboe <jens.axboe@oracle.com>
2007-05-09Fix occurrences of "the the "Michael Opdenacker
Signed-off-by: Michael Opdenacker <michael@free-electrons.com> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2007-05-07mm: remove destroy_dirty_buffers from invalidate_bdev()Peter Zijlstra
Remove the destroy_dirty_buffers argument from invalidate_bdev(), it hasn't been used in 6 years (so akpm says). find * -name \*.[ch] | xargs grep -l invalidate_bdev | while read file; do quilt add $file; sed -ie 's/invalidate_bdev(\([^,]*\),[^)]*)/invalidate_bdev(\1)/g' $file; done Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2006-10-17[PATCH] rd: memory leak on rd_init() failureAkinobu Mita
If RAM disk driver initialization fails due to blk_alloc_queue() faulure, the gendisk structs stored in rd_disks[] will not be freed completely. This patch resolves that memory leak case by doing alloc_disk() and blk_alloc_queue() at the same time. Signed-off-by: Akinobu Mita <akinobu.mita@gmail.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-07-14[PATCH] ramdisk blocksize Kconfig entryNathan Scott
Make the ramdisk blocksize configurable at kernel compilation time rather than only at boot or module load time, like a couple of the other ramdisk options. I found this handy awhile back but thought little of it, until recently asked by a few of the testing folks here to be able to do the same thing for their automated test setups. The Kconfig comment is largely lifted from comments in rd.c, and hopefully this will increase the chances of making folks aware that the default value often isn't a great choice here (for increasing values of PAGE_SIZE, even moreso). Signed-off-by: Nathan Scott <nathans@sgi.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-30Remove obsolete #include <linux/config.h>Jörn Engel
Signed-off-by: Jörn Engel <joern@wohnheim.fh-wedel.de> Signed-off-by: Adrian Bunk <bunk@stusta.de>
2006-06-29Merge git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/devfs-2.6Linus Torvalds
* git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/devfs-2.6: (22 commits) [PATCH] devfs: Remove it from the feature_removal.txt file [PATCH] devfs: Last little devfs cleanups throughout the kernel tree. [PATCH] devfs: Rename TTY_DRIVER_NO_DEVFS to TTY_DRIVER_DYNAMIC_DEV [PATCH] devfs: Remove the tty_driver devfs_name field as it's no longer needed [PATCH] devfs: Remove the line_driver devfs_name field as it's no longer needed [PATCH] devfs: Remove the videodevice devfs_name field as it's no longer needed [PATCH] devfs: Remove the gendisk devfs_name field as it's no longer needed [PATCH] devfs: Remove the miscdevice devfs_name field as it's no longer needed [PATCH] devfs: Remove the devfs_fs_kernel.h file from the tree [PATCH] devfs: Remove devfs_remove() function from the kernel tree [PATCH] devfs: Remove devfs_mk_cdev() function from the kernel tree [PATCH] devfs: Remove devfs_mk_bdev() function from the kernel tree [PATCH] devfs: Remove devfs_mk_symlink() function from the kernel tree [PATCH] devfs: Remove devfs_mk_dir() function from the kernel tree [PATCH] devfs: Remove devfs_*_tape() functions from the kernel tree [PATCH] devfs: Remove devfs support from the sound subsystem [PATCH] devfs: Remove devfs support from the ide subsystem. [PATCH] devfs: Remove devfs support from the serial subsystem [PATCH] devfs: Remove devfs from the init code [PATCH] devfs: Remove devfs from the partition code ...
2006-06-28[PATCH] mark address_space_operations constChristoph Hellwig
Same as with already do with the file operations: keep them in .rodata and prevents people from doing runtime patching. Signed-off-by: Christoph Hellwig <hch@lst.de> Cc: Steven French <sfrench@us.ibm.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-06-26[PATCH] devfs: Remove the gendisk devfs_name field as it's no longer neededGreg Kroah-Hartman
And remove the now unneeded number field. Also fixes all drivers that set these fields. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26[PATCH] devfs: Remove the devfs_fs_kernel.h file from the treeGreg Kroah-Hartman
Also fixes up all files that #include it. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26[PATCH] devfs: Remove devfs_remove() function from the kernel treeGreg Kroah-Hartman
Removes the devfs_remove() function and all callers of it. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-06-26[PATCH] devfs: Remove devfs_mk_dir() function from the kernel treeGreg Kroah-Hartman
Removes the devfs_mk_dir() function and all callers of it. Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
2006-03-24[PATCH] set_page_dirty() return value fixesAndrew Morton
We need set_page_dirty() to return true if it actually transitioned the page from a clean to dirty state. This wasn't right in a couple of places. Do a kernel-wide audit, fix things up. This leaves open the possibility of returning a negative errno from set_page_dirty() sometime in the future. But we don't do that at present. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-03-23[PATCH] sem2mutex: blockdev #2Arjan van de Ven
Semaphore to mutex conversion. The conversion was generated via scripts, and the result was validated automatically via a script as well. Signed-off-by: Arjan van de Ven <arjan@infradead.org> Signed-off-by: Ingo Molnar <mingo@elte.hu> Acked-by: Jens Axboe <axboe@suse.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2006-01-03[PATCH] add AOP_TRUNCATED_PAGE, prepend AOP_ to WRITEPAGE_ACTIVATEZach Brown
readpage(), prepare_write(), and commit_write() callers are updated to understand the special return code AOP_TRUNCATED_PAGE in the style of writepage() and WRITEPAGE_ACTIVATE. AOP_TRUNCATED_PAGE tells the caller that the callee has unlocked the page and that the operation should be tried again with a new page. OCFS2 uses this to detect and work around a lock inversion in its aop methods. There should be no change in behaviour for methods that don't return AOP_TRUNCATED_PAGE. WRITEPAGE_ACTIVATE is also prepended with AOP_ for consistency and they are made enums so that kerneldoc can be used to document their semantics. Signed-off-by: Zach Brown <zach.brown@oracle.com>
2005-10-28[PATCH] gfp_t: remaining bits of drivers/*Al Viro
Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-06[PATCH] drivers/block/rd.c: rd_size shouldn't be staticAdrian Bunk
I somehow missed that there is external usage of rd_size on some architectures. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-05-05[PATCH] make some things staticAdrian Bunk
This patch makes some needlessly global identifiers static. Signed-off-by: Adrian Bunk <bunk@stusta.de> Acked-by: Arjan van de Ven <arjanv@infradead.org> Acked-by: Trond Myklebust <trond.myklebust@fys.uio.no> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-30[PATCH] BDI: Provide backing device capability information [try #3]David Howells
The attached patch replaces backing_dev_info::memory_backed with capabilitied bitmap. The capabilities available include: (*) BDI_CAP_NO_ACCT_DIRTY Set if the pages associated with this backing device should not be tracked by the dirty page accounting. (*) BDI_CAP_NO_WRITEBACK Set if dirty pages associated with this backing device should not have writepage() or writepages() invoked upon them to clean them. (*) Capability markers that indicate what a backing device is capable of with regard to memory mapping facilities. These flags indicate whether a device can be mapped directly, whether it can be copied for a mapping, and whether direct mappings can be read, written and/or executed. This information is primarily aimed at improving no-MMU private mapping support. The patch also provides convenience functions for determining the dirty-page capabilities available on backing devices directly or on the backing devices associated with a mapping. These are provided to keep line length down when checking for the capabilities. Signed-Off-By: David Howells <dhowells@redhat.com> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2005-03-11[PATCH] Make lots of things staticAdrian Bunk
This is a megarollup of ~60 patches which give various things static scope. Signed-off-by: Adrian Bunk <bunk@stusta.de> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-12-01[PATCH] make number of ramdisks KconfigurableMarc Leeman
Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-11-01[PATCH] rd: convert to module_param and add module aliasStephen Hemminger
Convert Ramdisk block device to module_param to get rid of warning. Add module alias to cause correct autoloading. Signed-off-by: Stephen Hemminger <shemminger@osdl.org> Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-08-23[PATCH] fix 4K ext2fs support in 2.6 initrd'sEric W. Biederman
The ramdisk_blocksize option has been broken for quite a while in 2.6. Making an initrd with a 4K ext2 filesystem impossible to use. After digging into this, the problem turned out to that rd.c was not setting the hard sector size. There were a few secondary problems like i_blkbits was not being set, and the number KiB in uncompressed ext2 images was not taking into account the block size. I have also corrected the surrounding comments as they were not just incorrect but misleading. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-06-12[PATCH] ramdisk: buffer_uptodate fixAndrew Morton
I waffled over this for ages. On balance, I think it's best to mark those bh's as uptodate. And on reflection, I'm not sure why we go bringing ramdisk blockdev pages uptodate all over the place anyway. But ramdisk is weird and it passes testing. Let those dogs sleep. Signed-off-by: Andrew Morton <akpm@osdl.org> Signed-off-by: Linus Torvalds <torvalds@osdl.org>
2004-05-21[PATCH] trivial: remove duplicated #includesAndrew Morton
From: Rusty Russell <rusty@rustcorp.com.au> From: a.othieno@bluewin.ch (Arthur Othieno) From: Vinay K Nallamothu <vinay-rc@naturesoft.net> Remove various duplicated #includes From: Vinay K Nallamothu <vinay-rc@naturesoft.net> Use mod_timer in drivers_block_floppy98.c From: carbonated beverage <ramune@net-ronin.org> doc update for bk usage bk://... appears to be dead, use http://... instead.
2004-05-21[REDO] ramdisk: separate the blockdev backing_dev_info from the hosted inodesLinus Torvalds
This re-applies the separation of the ramdisk blockdev inode backing store and the files that live in the ramdisk. Cset exclude: torvalds@ppc970.osdl.org|ChangeSet|20040521210025|21437
2004-05-21Undo separate backing_dev_info for ramdisk.c.Linus Torvalds
The code was incomplete. Let's re-do it when the other pieces are in place. Cset exclude: akpm@osdl.org[torvalds]|ChangeSet|20040521202003|24125
2004-05-20[PATCH] ramdisk: separate the blockdev backing_dev_info from the hosted inodes'Andrew Morton
Give appropriate and separate backing_dev_info's to both the ramdisk blockdev inode and to the files which live atop the ramdisk. Everything works now.
2004-05-20[PATCH] ramdisk: implement writepages()Andrew Morton
Implement an empty ->writepages() so that attempts to write back ramdisk pages have less work to do.
2004-05-20[PATCH] ramdisk: fix PageUptodate() handlingAndrew Morton
When a filesystem does getblk() to get a buffer_head against the ramdisk the VFS will allocate a new not-uptodate pagecache page and will attach buffers to it. The filesystem will then bring certain buffer_heads uptodate. But not the whole page. Later, various ramdisk a_ops see the not-uptodate page and wipe the whole thing out, including the parts to which the filesystem wrote! Fix that up by only zapping those parts of the page which are covered by non-uptodate buffers.
2004-05-20[PATCH] ramdisk: use kmap_atomic() in rd_blkdev_pagecache_IO()Andrew Morton
We don't actualy need to kmap the blockdev inode's pages at all, because they're ~__GFP_HIGHMEM. But it's future-safe, and cheap.
2004-05-20[PATCH] ramdisk: lock blockdev pages during "IO".Andrew Morton
There's a race: one CPU writes a 1k block into a ramdisk page which isn't in the blockdev pagecache yet. It memsets the locked page to zeroes. While this is happening, another CPU comes in and tries to write a different 1k block to the "disk". But it doesn't lock the page so it races with the memset and can have its data scribbled over. Fix this up by locking the page even if it already existed in pagecache. Locking a pagecache page in a make_request_fn sounds deadlocky but it is not, because: a) ramdisk_writepage() does nothing but a set_bit(), and cannot recur onto the same page. b) Any higher-level code which holds a page lock is supposed to be allocating its memory with GFP_NOFS, and in 2.6 kernels that's equivalent to GFP_NOIO. (The distinction between GFP_NOIO and GFP_NOFS basically disappeared with the buffer_head LRU, although it was reused for writes to swap).
2004-05-20[PATCH] ramdisk memory allocation fixesAndrew Morton
Allocating pagecache pages within the disk request_fn is deadlocky and prone to page allocation failures, causing write I/O errors. Attempt to improve things by fiddling with gfp masks.
2004-05-20[PATCH] ramdisk fixesAndrew Morton
- Remove the ramdisk special-case in fs-writeback.c - it will soon be unneeded. - Fix rd_ioctl() to return -ENOTTY on invalid ioctl types, not -EINVAL. - Make ramdisk Kconfig friendlier.
2004-04-12[PATCH] per-backing dev unpluggingAndrew Morton
From: Jens Axboe <axboe@suse.de>, Chris Mason, me, others. The global unplug list causes horrid spinlock contention on many-disk many-CPU setups - throughput is worse than halved. The other problem with the global unplugging is of course that it will cause the unplugging of queues which are unrelated to the I/O upon which the caller is about to wait. So what we do to solve these problems is to remove the global unplug and set up the infrastructure under which the VFS can tell the block layer to unplug only those queues which are relevant to the page or buffer_head whcih is about to be waited upon. We do this via the very appropriate address_space->backing_dev_info structure. Most of the complexity is in devicemapper, MD and swapper_space, because for these backing devices, multiple queues may need to be unplugged to complete a page/buffer I/O. In each case we ensure that data structures are in place to permit us to identify all the lower-level queues which contribute to the higher-level backing_dev_info. Each contributing queue is told to unplug in response to a higher-level unplug. To simplify things in various places we also introduce the concept of a "synchronous BIO": it is tagged with BIO_RW_SYNC. The block layer will perform an immediate unplug when it sees one of these go past.
2004-03-04[PATCH] another rd.c leakAndrew Morton
Free the request queues on the rd_init() error recovery path.
2004-03-04Fix ramdisk driver leak on module unload.Jeff Garzik
Noticed by me, fixed by Jens.
2004-02-18[PATCH] ramdisk cleanupAndrew Morton
Fairly pointless coding-style cleanups which I've been sitting on for ages. The ramdisk driver is still buggy: it drops pagecache when unmounted. I still need to fix this. Apparently it also displays data corruption under load even when not unmounted.
2004-02-18[PATCH] fix display of NBD in /proc/partitionsAndrew Morton
The recent change to /proc/partitions which prevents it from displaying removeable media accidentally caused 128 NBD and 16 ramdisk partitions to appear in /proc/partitions instead. So add a specific gendisk flag which says "don't show me in /proc/partitions" and use that.
2004-02-18[PATCH] Remove BDEV_RAW and friendsAndrew Morton
These no longer do anything. This patch changes modules API. It was acked by Arjan@RH and Hubert@Suse.
2003-08-30[PATCH] dev_t handling cleanups (10/12)Alexander Viro
new helper - iminor(inode); defined as minor(inode->i_rdev); lots and lots of places in drivers had been switched to it.
2003-08-30[PATCH] dev_t handling cleanups (9/12)Alexander Viro
struct block_device made the private part of bdevfs inodes; bd_count is gone, we use ->i_count of inode now; separate hash is also gone and we are using iget5_locked()/igrab()/iput() instead.
2003-08-06[PATCH] Proper block queue reference countingJens Axboe
To be able to properly be able to keep references to block queues, we make blk_init_queue() return the queue that it initialized, and let it be independently allocated and then cleaned up on the last reference. I have grepped high and low, and there really shouldn't be any broken uses of blk_init_queue() in the kernel drivers left. The added bonus being blk_init_queue() error checking is explicit now, most of the drivers were broken in this regard (even IDE/SCSI). No drivers have embedded request queue structures. Drivers that don't use blk_init_queue() but blk_queue_make_request(), should allocate the queue with blk_alloc_queue(gfp_mask). I've converted all of them to do that, too. They can call blk_cleanup_queue() now too, using the define blk_put_queue() is probably cleaner though.