path: root/include/linux/raid
Age | Commit message | Author
2003-09-22 | [PATCH] 32-bit dev_t: internal use | Alexander Viro
Starting the conversion:
* internal dev_t made 32bit.
* new helpers - new_encode_dev(), new_decode_dev(), huge_encode_dev(), huge_decode_dev(), new_valid_dev(). They do encoding/decoding of 32bit and 64bit values; for now huge_... are aliases for new_... and new_valid_dev() is always true. We do 12:20 for 32bit; representation is compatible with 16bit one - we have major in bits 19--8 and minor in 31--20,7--0. That's what the userland sees; internally we have (major << 20)|minor, of course.
* MKDEV(), MAJOR() and MINOR() updated.
* several places used to handle Missed'em'V dev_t (14:18 split) manually; that stuff had been taken into common helpers.
Now we can start replacing old_... with new_... and huge_..., depending on the width available. MKDEV() callers should (for now) make sure that major and minor are within 12:20. That's what the next chunk will do.
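The 12:20 layout described above can be sketched in user-space C. This is an illustration only, not the kernel source: the names mkdev(), dev_major(), dev_minor(), encode_dev() and decode_dev() are hypothetical stand-ins for MKDEV(), MAJOR(), MINOR(), new_encode_dev() and new_decode_dev(), and the bit positions are taken directly from the commit message.

```c
#include <assert.h>
#include <stdint.h>

#define MINORBITS 20
#define MINORMASK ((1U << MINORBITS) - 1)

/* Internal form from the commit message: (major << 20) | minor. */
static uint32_t mkdev(unsigned major, unsigned minor)
{
    /* Callers must stay within 12 bits of major and 20 bits of minor. */
    assert(major < (1U << 12) && minor <= MINORMASK);
    return (major << MINORBITS) | minor;
}

static unsigned dev_major(uint32_t dev) { return dev >> MINORBITS; }
static unsigned dev_minor(uint32_t dev) { return dev & MINORMASK; }

/*
 * External (userland) form: major in bits 19..8, the low 8 bits of the
 * minor in bits 7..0 and the remaining 12 minor bits in bits 31..20.
 */
static uint32_t encode_dev(uint32_t dev)
{
    unsigned major = dev_major(dev);
    unsigned minor = dev_minor(dev);

    return (minor & 0xff) | (major << 8) | ((minor & ~0xffU) << 12);
}

static uint32_t decode_dev(uint32_t enc)
{
    unsigned major = (enc & 0xfff00) >> 8;
    unsigned minor = (enc & 0xff) | ((enc >> 12) & 0xfff00);

    return mkdev(major, minor);
}
```

For small numbers the encoded value matches the old 16bit form, which is the compatibility the message refers to: major 8, minor 1 encodes to 0x801, the same as (8 << 8) | 1.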
2003-08-06 | [PATCH] Proper block queue reference counting | Jens Axboe
To be able to properly keep references to block queues, we make blk_init_queue() return the queue that it initialized, and let it be independently allocated and then cleaned up on the last reference. I have grepped high and low, and there really shouldn't be any broken uses of blk_init_queue() in the kernel drivers left. The added bonus is that blk_init_queue() error checking is explicit now; most of the drivers were broken in this regard (even IDE/SCSI). No drivers have embedded request queue structures. Drivers that don't use blk_init_queue() but blk_queue_make_request() should allocate the queue with blk_alloc_queue(gfp_mask). I've converted all of them to do that, too. They can call blk_cleanup_queue() now too, though using the define blk_put_queue() is probably cleaner.
2003-05-26 | [PATCH] md: Replace bdev_partition_name with calls to bdevname | Neil Brown
2003-05-26 | [PATCH] md: Remove dependance on MD_SB_DISKS in linear personality | Neil Brown
Linear uses one array sized by MD_SB_DISKS inside a structure. We move it to the end of the structure, declare it as size 0, and arrange for appropriate extra space to be allocated on structure allocation.
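The trailing-array technique described here can be sketched as follows. The structure and field names (linear_conf_sketch, dev_info_sketch, linear_conf_alloc) are hypothetical, and the sketch uses a C99 flexible array member with calloc() where the commit describes a size-0 array and a kernel allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-device record; the real fields are not shown in the log. */
struct dev_info_sketch {
    unsigned long long size;
    unsigned long long offset;
};

struct linear_conf_sketch {
    int nr_disks;
    /* Must be the last member: storage for it is added at allocation time. */
    struct dev_info_sketch disks[];
};

/* Allocate the header plus room for exactly nr_disks trailing entries. */
static struct linear_conf_sketch *linear_conf_alloc(int nr_disks)
{
    struct linear_conf_sketch *conf;

    if (nr_disks <= 0)
        return NULL;
    conf = calloc(1, sizeof(*conf) + (size_t)nr_disks * sizeof(conf->disks[0]));
    if (conf != NULL)
        conf->nr_disks = nr_disks;
    return conf;
}

/* Bounds-checked accessor for one trailing entry. */
static struct dev_info_sketch *linear_disk(struct linear_conf_sketch *conf, int i)
{
    assert(i >= 0 && i < conf->nr_disks);
    return &conf->disks[i];
}
```

The array no longer has a compile-time bound, so an array with 40 members costs 40 entries and one with 2 members costs 2.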
2003-05-26 | [PATCH] md: Remove MD_SB_DISKS limits from raid1 | Neil Brown
raid1 uses MD_SB_DISKS to size two data structures, but the new version-1 superblock allows for more than this number of disks (and most actual arrays use many fewer). This patch sizes the two arrays dynamically. One becomes a separate kmalloced array. The other is moved to the end of the containing structure and appropriate extra space is allocated. Also, change r1buf_pool_alloc (which allocates buffers for a mempool for doing re-sync) to not get r1bio structures from the r1bio pool (which could exhaust the pool) but instead to allocate them separately.
2003-05-26 | [PATCH] md: Remove dependancy on MD_SB_DISKS from raid0 | Neil Brown
Arrays with a type-1 superblock can have more than MD_SB_DISKS devices, so we remove the dependency on that number from raid0, replacing several fixed-size arrays with one dynamically allocated array.
2003-05-26 | [PATCH] md: Remove dependancy on MD_SB_DISKS from raid5 | Neil Brown
One embedded array gets moved to the end of the structure and sized dynamically.
2003-05-26 | [PATCH] md: Remove dependancy on MD_SB_DISKS from multipath | Neil Brown
Multipath has a dependency on MD_SB_DISKS which is no longer authoritative. We change it to use a separately allocated array.
2003-05-26 | [PATCH] md: Improve raid0 mapping code to simplify and reduce mem usage. | Neil Brown
To cope with a raid0 array with differing sized devices, raid0 divides an array into "strip zones". The first zone covers the start of all devices, up to an offset equal to the size of the smallest device. The second strip zone covers the remaining devices up to the size of the next smallest device, etc. In order to determine which strip zone a given address is in, the array is logically divided into slices the size of the smallest zone, and a 'hash' table is created listing the first and, if relevant, second zone in each slice. As the smallest slice can be very small (imagine an array with a 76G drive and a 75.5G drive) this hash table can be rather large. With this patch, we limit the size of the hash table to one page, at the possible cost of making several probes into the zone list before we find the correct zone. We also cope with the possibility that a zone could be larger than a 32bit sector address would allow.
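A simplified model of the strip-zone lookup may help. This is a sketch under stated assumptions, not the raid0 code: all names are hypothetical, sizes are in arbitrary units, HASH_SLOTS stands in for "one page" of table entries, and each slot stores only the first zone overlapping it, with forward probing from there.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_DEVS   8
#define HASH_SLOTS 4   /* stand-in for the one-page limit on the table */

struct strip_zone {
    uint64_t start;    /* first array address covered by the zone */
    uint64_t size;     /* array addresses covered by the zone */
    int nb_dev;        /* devices still large enough to take part */
};

struct raid0_map {
    int nr_zones;
    struct strip_zone zones[MAX_DEVS];
    uint64_t total;    /* size of the whole array */
    uint64_t spacing;  /* array addresses covered by one hash slot */
    int hash[HASH_SLOTS];
};

static int cmp_u64(const void *a, const void *b)
{
    uint64_t x = *(const uint64_t *)a, y = *(const uint64_t *)b;
    return (x > y) - (x < y);
}

static void build_map(struct raid0_map *m, const uint64_t *dev_sizes, int n)
{
    uint64_t sorted[MAX_DEVS], prev = 0;
    int i, z = 0;

    assert(n > 0 && n <= MAX_DEVS);
    for (i = 0; i < n; i++) {
        assert(dev_sizes[i] > 0);
        sorted[i] = dev_sizes[i];
    }
    qsort(sorted, (size_t)n, sizeof(sorted[0]), cmp_u64);

    /* One zone per distinct device size, striped over the devices left. */
    m->nr_zones = 0;
    m->total = 0;
    for (i = 0; i < n; i++) {
        struct strip_zone *zone;

        if (sorted[i] == prev)
            continue;
        zone = &m->zones[m->nr_zones++];
        zone->start = m->total;
        zone->nb_dev = n - i;
        zone->size = (sorted[i] - prev) * (uint64_t)(n - i);
        m->total += zone->size;
        prev = sorted[i];
    }

    /* Fixed-size table: each slot records the first zone overlapping it. */
    m->spacing = (m->total + HASH_SLOTS - 1) / HASH_SLOTS;
    for (i = 0; i < HASH_SLOTS; i++) {
        uint64_t first = (uint64_t)i * m->spacing;

        while (z + 1 < m->nr_zones &&
               m->zones[z].start + m->zones[z].size <= first)
            z++;
        m->hash[i] = z;
    }
}

/* Start at the hashed zone and probe forward until the address fits. */
static int find_zone(const struct raid0_map *m, uint64_t addr)
{
    int z;

    assert(addr < m->total);
    z = m->hash[addr / m->spacing];
    while (m->zones[z].start + m->zones[z].size <= addr)
        z++;
    return z;
}
```

With devices of size 200, 100 and 200, the first zone stripes over three devices and covers 300 units, and the second stripes over the two larger devices and covers the remaining 200. The table stays at HASH_SLOTS entries however small the smallest zone is; the price is the probing loop in find_zone().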
2003-05-26 | [PATCH] md: Use new single page bio splitting for raid0 and linear | Neil Brown
Sometimes raid0 and linear are required to take a single page bio that spans two devices. We use bio_split to split such a bio into two. At the same time, bio.h is included by linux/raid/md.h so we don't include it elsewhere anymore. We also modify the mergeable_bvec functions to allow a bvec that doesn't fit if it is the first bvec to be added to the bio, and are careful never to return a negative length from a bvec_mergable function.
2003-05-07 | [PATCH] remove partition_name() | Andrew Morton
From: Christoph Hellwig <hch@lst.de>
partition_name() is a variant of __bdevname() that caches results and returns a pointer to kmalloc()ed data instead of printing into a buffer. Due to its caching it gets utterly confused when the name for a dev_t changes (can happen easily now with device mapper and probably in the future with dynamic dev_t users). It's only used by the raid code and most calls are through a wrapper, bdev_partition_name(), which takes a struct block_device * that may be NULL. The patch below changes bdev_partition_name() to call bdevname() if possible, and the other calls, where we really have nothing more than a dev_t, to call __bdevname(). Btw, it would be nice if someone who knows the md code a bit better than me could remove bdev_partition_name() in favour of direct calls to bdevname() where possible - that would also get rid of the returns-pointer-to-string-on-stack issue that this patch can't fix yet.
2003-04-03 | [PATCH] md: Cleanups for md to move device size calculations into personalities | Neil Brown
2003-03-26 | [PATCH] md: Convert md personalities to new module interface | Neil Brown
Thanks to Angus Sawyer <angus.sawyer@dsl.pipex.com> and Daniel McNeil <daniel@osdl.org>
2003-03-14 | [PATCH] md: Add new superblock format for md | Neil Brown
Superblock format '1' resolves a number of issues with superblock format '0'. It is more dense and can support many more sub-devices. It does not contain unneeded redundancy. It adds a few new useful fields.
2003-03-14 | [PATCH] md: Allow components of MD raid array to have data start at offset from start of device. | Neil Brown
Normally the data stored on a component of a RAID array is stored from the start of the device. This patch allows a per-device data_offset so the data can start elsewhere. This will allow RAID arrays where the metadata is at the head of the device rather than the tail.
2003-03-14 | [PATCH] md: Fulltime delayed 'safe_mode' for md | Neil Brown
From: Angus Sawyer <angus.sawyer@dsl.pipex.com>
If there are no writes for 20 milliseconds, write out superblock to mark array as clean. Write out superblock with dirty flag before allowing any further write to succeed. If an md thread gets signaled with SIGKILL, reduce the delay to 0. Also tidy up some printk's and make sure writing the superblock isn't noisy.
2003-03-14 | [PATCH] md: Remove md_recoveryd thread for md | Neil Brown
The md_recoveryd thread is responsible for initiating and cleaning up resync threads. This job can be equally well done by the per-array threads for those arrays which might need it. So the mdrecoveryd thread is gone and the core code that it ran is now run by raid5d, raid1d or multipathd. We add an MD_RECOVERY_NEEDED flag so those daemons don't have to bother trying to lock the md array unless it is likely that something needs to be done. Also modify the names of all threads to include the number of the md device.
2003-03-14 | [PATCH] md: Tidy up recovery_running flags in md | Neil Brown
Md uses ->recovery_running and ->recovery_err to keep track of the status of recovery. This is rather ad hoc and race prone. This patch changes it to ->recovery which has bit flags for various states.
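As an illustration of replacing two separate status fields with one word of bit flags, here is a small sketch. The flag names (REC_RUNNING, REC_ERR, REC_DONE) and helper functions are hypothetical and are not taken from the md headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state bits packed into a single ->recovery word. */
enum {
    REC_RUNNING = 1 << 0,   /* a resync thread is active */
    REC_ERR     = 1 << 1,   /* the resync hit an error */
    REC_DONE    = 1 << 2    /* the thread has finished and can be reaped */
};

struct mddev_sketch {
    unsigned long recovery;
};

static void recovery_start(struct mddev_sketch *m)
{
    m->recovery = REC_RUNNING;
}

/* Called by the resync thread when it stops, successfully or not. */
static void recovery_finish(struct mddev_sketch *m, bool ok)
{
    assert(m->recovery & REC_RUNNING);
    if (!ok)
        m->recovery |= REC_ERR;
    m->recovery |= REC_DONE;
}

/* Success means DONE is set and ERR is not; one masked compare covers both. */
static bool recovery_succeeded(const struct mddev_sketch *m)
{
    return (m->recovery & (REC_DONE | REC_ERR)) == REC_DONE;
}
```

A single word means every state combination is read in one access. In the kernel the bits would be changed with atomic bit operations rather than the plain assignments used here.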
2003-03-14 | [PATCH] md: Convert /proc/mdstat to use seq_file | Neil Brown
From: Angus Sawyer <angus.sawyer@dsl.pipex.com>
Mainly straightforward convert of sprintf -> seq_printf. seq_start and seq_next modelled on /proc/partitions. locking/ref counting as for ITERATE_MDDEV.
pos == 0 -> header
pos == n -> nth mddev
pos == 0x10000 -> tail
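The position scheme listed above can be modelled without the seq_file machinery. The function names and the END_POS sentinel below are hypothetical; only the three position values come from the commit message.

```c
#include <assert.h>

#define TAIL_POS 0x10000L
#define END_POS  (-1L)      /* hypothetical marker for "iteration finished" */

enum mdstat_item { ITEM_HEADER, ITEM_MDDEV, ITEM_TAIL };

/* What is printed at a given position. */
static enum mdstat_item item_at(long pos)
{
    if (pos == 0)
        return ITEM_HEADER;
    if (pos == TAIL_POS)
        return ITEM_TAIL;
    return ITEM_MDDEV;      /* pos == n selects the nth mddev, counting from 1 */
}

/* Advance: header, then each mddev in turn, then the tail, then stop. */
static long next_pos(long pos, long nr_mddevs)
{
    assert(pos >= 0 && nr_mddevs < TAIL_POS);
    if (pos == TAIL_POS)
        return END_POS;
    if (pos < nr_mddevs)
        return pos + 1;
    return TAIL_POS;
}
```

Reserving 0x10000 for the tail keeps it distinct from any device index, so the tail is reached in one step even when no arrays exist.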
2003-02-17 | [PATCH] Provide a 'safe-mode' for soft raid. | Neil Brown
When a raid1 or raid5 array is in 'safe-mode', then the array is marked clean whenever there are no outstanding write requests, and is marked dirty again before allowing any write request to proceed. This means that an unclean shutdown while no write activity is happening will NOT cause a resync to be required. However it does mean extra updates to the superblock. Currently safe-mode is turned on by sending SIGKILL to the raid thread as would happen at a normal shutdown. This should mean that the reboot notifier is no longer needed. After looking more at performance issues I may make safemode be on all the time. I will almost certainly make it on when RAID5 is degraded as an unclean shutdown of a degraded RAID5 means data loss. This code was provided by Angus Sawyer <angus.sawyer@dsl.pipex.com>
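The clean/dirty bookkeeping described above can be sketched as a small state machine. All names are hypothetical, and the sb_updates counter merely stands in for superblock writes so the extra cost mentioned in the message is visible.

```c
#include <assert.h>
#include <stdbool.h>

struct array_state {
    bool safemode;        /* safe-mode switched on */
    bool in_sync;         /* superblock currently says "clean" */
    int pending_writes;   /* outstanding write requests */
    int sb_updates;       /* stand-in for superblock writes */
};

/* Before a write may proceed, the array must be marked dirty on disk. */
static void write_start(struct array_state *a)
{
    if (a->in_sync) {
        a->in_sync = false;
        a->sb_updates++;
    }
    a->pending_writes++;
}

/* When the last outstanding write completes, safe-mode marks the array clean. */
static void write_end(struct array_state *a)
{
    assert(a->pending_writes > 0);
    if (--a->pending_writes == 0 && a->safemode) {
        a->in_sync = true;
        a->sb_updates++;
    }
}
```

Each idle-to-busy-to-idle cycle costs two superblock updates in this model, which is why the later "delayed" variant waits before marking the array clean.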
2003-02-17 | [PATCH] Add name of md device to name of thread managing that device. | Neil Brown
This allows the thread to be easily identified and signalled. The point of signalling will appear in the next patch.
2003-01-05 | [PATCH] md: Record location of incomplete resync at shutdown and restart from there. | Neil Brown
Add a new field to the md superblock, in an unused area, to record where resync was up to on a clean shutdown while resync is active. Restart from this point. The extra field is verified by having a second copy of the event counter. If the second event counter is wrong, we ignore the extra field. This patch thanks to Angus Sawyer <angus.sawyer@dsl.pipex.com>
2002-10-30 | [PATCH] md: factor out MD superblock handling code | Neil Brown
Define an interface for interpreting and updating superblocks so we can more easily define new formats. With this patch, (almost) all superblock layout information is located in a small set of routines dedicated to superblock handling. This will allow us to provide a similar set for a different format. The two exceptions are:
1/ autostart_array where the devices listed in the superblock are searched for.
2/ raid5 'knows' the maximum number of devices for compute_parity.
These will be addressed in a later patch.
2002-10-28 | [PATCH] removed a bunch of gratuitous kdev_t uses | Alexander Viro
2002-10-08 | [PATCH] 64-bit sector_t - driver changes | Andrew Morton
From Peter Chubb
Compaq Smart array sector_t cleanup: prepare for possible 64-bit sector_t
Clean up loop device to allow huge backing files.
MD transition to 64-bit sector_t.
- Hold sizes and offsets as sector_t not int;
- use 64-bit arithmetic if necessary to map block-in-raid to zone and block-in-zone
2002-09-21 | [PATCH] removal of bogus exports | Alexander Viro
partition_name() moved from md.c to partitions/check.c; disk_name() is not exported anymore; partition_name() takes dev_t instead of kdev_t.
2002-09-07 | [PATCH] (25/25) more cleanups of struct gendisk. | Alexander Viro
* we remove partition 0 from ->part[] and put the old contents of ->part[0] into gendisk itself; indexes are shifted, obviously.
* ->part is allocated at add_gendisk() time and freed at del_gendisk() according to value of ->minor_shift; static arrays of hd_struct are gone from drivers, ditto for manual allocations a-la ide. As a matter of fact, none of the drivers know about struct hd_struct now.
2002-08-22 | [PATCH] md: Remove per-personality 'operational' and 'write_only' flags | Neil Brown
raid1, raid5 and multipath maintain their own 'operational' flag. This is equivalent to !rdev->faulty and so isn't needed. Similarly raid1 and raid5 maintain a "write_only" flag that is equivalent to !rdev->in_sync so it isn't needed either. As part of implementing this change, we introduce some extra flag bits in raid5 that are meaningful only inside 'handle_stripe'. Some of these replace the "action" array which recorded what actions were required (and would be performed after the stripe spinlock was released). This has the advantage of reducing our dependence on MD_SB_DISKS which personalities shouldn't need to know about.
2002-08-22 | [PATCH] md: Remove 'alias_device' flag. | Neil Brown
This flag was used by multipath to make sure only one superblock was written, as there is only one real device. The relevant test is now more explicitly dependent on multipath, and the flag is gone.
2002-08-22 | [PATCH] md: Make spare handling simple ... personalities know less | Neil Brown
1/ Personalities only know about raid_disks devices. Some might be not in_sync and so cannot be read from, but must be written to.
- change MD_SB_DISKS to ->raid_disks
- add tests for .write_only
2/ rdev->raid_disk is now -1 for spares. desc_nr is maintained by analyse_sbs and sync_sbs.
3/ spare_inactive method is subsumed into hot_remove_disk. spare_writable is subsumed into hot_add_disk. hot_add_disk decides which slot a new device will hold.
4/ spare_active now finds all non-in_sync devices and marks them in_sync.
5/ faulty devices are removed by the md recovery thread as soon as they are idle. Any spares that are available are then added.
2002-08-22 | [PATCH] md: Remove used_slot field from per-personality info | Neil Brown
This is equivalent to ->rdev != NULL, so it isn't needed.
2002-08-22 | [PATCH] md: Keep track of number of pending requests on each component device on an MD array | Neil Brown
This will allow us to know, in the event of a device failure, when the device is completely unused and so can be disconnected from the array. Currently this isn't a problem as drives aren't normally disconnected until after a replacement has been rebuilt, which is a LONG TIME, but that will change shortly... We always increment the count under a spinlock after checking that it hasn't been disconnected already (rdev != NULL). We disconnect under the same spinlock after checking that the count is zero.
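The locking rule in the last two sentences can be sketched in portable C11. Everything here is a hypothetical model: the names are invented, a tiny atomic_flag spinlock stands in for the kernel spinlock, and the decrement also takes the lock for simplicity, which the commit message does not state.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct rdev_sketch {
    int nr_pending;              /* requests currently using this device */
};

struct disk_slot {
    atomic_flag lock;            /* stand-in for the kernel spinlock */
    struct rdev_sketch *rdev;    /* NULL once the device is disconnected */
};

static void slot_lock(struct disk_slot *s)
{
    while (atomic_flag_test_and_set_explicit(&s->lock, memory_order_acquire))
        ;                        /* spin until the holder releases the lock */
}

static void slot_unlock(struct disk_slot *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/* Take a reference only if the device is still connected. */
static struct rdev_sketch *slot_get(struct disk_slot *s)
{
    struct rdev_sketch *r;

    slot_lock(s);
    r = s->rdev;
    if (r != NULL)
        r->nr_pending++;
    slot_unlock(s);
    return r;
}

/* Drop a reference when the request completes (locking here is an assumption). */
static void slot_put(struct disk_slot *s, struct rdev_sketch *r)
{
    slot_lock(s);
    assert(r->nr_pending > 0);
    r->nr_pending--;
    slot_unlock(s);
}

/* Disconnect only when no request is using the device. */
static bool slot_disconnect(struct disk_slot *s)
{
    bool done = false;

    slot_lock(s);
    if (s->rdev != NULL && s->rdev->nr_pending == 0) {
        s->rdev = NULL;
        done = true;
    }
    slot_unlock(s);
    return done;
}
```

Because the check for NULL and the increment happen under the same lock as the check for zero and the disconnect, a request can never gain a reference to a device that is being removed.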
2002-08-22 | [PATCH] md: MD error handers and md_sync_acct now get rdev instead of bdev | Neil Brown
This simplifies the error handlers slightly, but allows for even more simplification later.
2002-08-22 | [PATCH] md: Store rdev instead of bdev in per-personality status arrays | Neil Brown
Holding the rdev instead of the bdev does cause an extra de-reference, but it is conceptually cleaner and will allow lots more tidying up.
2002-07-23 | [PATCH] MD - Remove get_spare declaration and associated warning | Neil Brown
get_spare recently became static and no-one told md_k.h
2002-07-18 | [PATCH] MD - Get rid of dev in rdev and use bdev exclusively. | Neil Brown
There is an awkwardness here in that userspace sometimes passes down a dev_t (e.g. hot_add_disk) and sometimes a major and a minor (e.g. add_new_disk). Should we convert both to kdev_t as the uniform standard? That is what was being done, but it seemed very clumsy and things were getting converted back and forth a lot. As bdget uses a dev_t, I felt safe in staying with dev_t once I had one rather than converting to kdev_t and back.
2002-07-18 | [PATCH] MD - Remove the sb from the mddev | Neil Brown
Now that all the important information is in mddev, we don't need to have an sb off the mddev. We only keep the per-device ones. Previously we determined if "set_array_info" had been run by checking mddev->sb. Now we check mddev->raid_disks on the assumption that any valid array MUST have a non-zero number of devices.
2002-07-18 | [PATCH] MD - Remove dependance on superblock | Neil Brown
All the remaining fields of interest in the superblock get duplicated in the mddev structure and this is treated as authoritative. The superblock gets completely generated at write time, and all useful information extracted at read time. This means that we can slot in different superblock formats without affecting the bulk of the code.
2002-07-18 | [PATCH] MD - Move persistent from superblock to mddev | Neil Brown
Tidy up calc_dev_sboffset and calc_dev_size on the way.
2002-07-18 | [PATCH] MD - Remove number and raid_disk from personality arrays | Neil Brown
These are redundant: number is not needed any more, and raid_disk never was, as that is the index.
2002-07-18 | [PATCH] MD - nr_disks is gone from multipath/raid1 | Neil Brown
Never used.
2002-07-18 | [PATCH] MD - Remove old_dev field. | Neil Brown
We used to monitor the previous device number of a component device for superblock maintenance. This is not needed any more.
2002-07-18 | [PATCH] MD - Don't maintain disc status in superblock. | Neil Brown
The state is now in rdev so we don't maintain it in the superblock any more. We also no longer test the content of the superblock for disk status. mddev->spare is now an rdev and not a superblock fragment.
2002-07-18 | [PATCH] MD - Add "degraded" field to md device | Neil Brown
This is used to determine if a spare should be added without relying on the superblock.
2002-07-18 | [PATCH] MD - Add in_sync flag to each rdev | Neil Brown
This currently mirrors the MD_DISK_SYNC superblock flag, but soon it will be authoritative and the superblock will only be consulted at start time.
2002-07-18 | [PATCH] MD - Add raid_disk field to rdev | Neil Brown
Also change find_rdev_nr to find based on position in array (raid_disk) not position in superblock (number).
2002-07-18 | [PATCH] MD - Improve handling of spares in md | Neil Brown
- hot_remove_disk is given the raid_disk rather than descriptor number so that it can find the device in internal array directly, no search.
- spare_inactive now uses mddev->spare->raid_disk instead of mddev->spare->number so it can find the device directly without searching
- spare_write does not need number. It can use mddev->spare->raid_disk as above.
- spare_active does not need &mddev->spare. It finds the descriptor directly and fixes it without this pointer
2002-07-18 | [PATCH] MD - Remove concept of 'spare' drive for multipath. | Neil Brown
Multipath now treats all working devices as active and does I/O to the first working one.
2002-07-18 | [PATCH] MD - Move md_update_sb calls | Neil Brown
When a change which requires a superblock update happens at interrupt time, we currently set a flag (sb_dirty) and wake up the per-array thread (raid1d/raid5d/multipathd) to do the actual update. This patch centralises this. The sb_update is now done by the mdrecoveryd thread. As this is always woken up after the error handler is called, we don't need the call to wake up the local thread any more. With this, we don't need "md_update_sb" to lock the array any more and only use __md_update_sb, which is local to md.c. So we rename __md_update_sb back to md_update_sb and stop exporting it.
2002-07-18 | [PATCH] MD - Pass the correct bdev to md_error | Neil Brown
After a call to generic_make_request, bio->bi_bdev can have changed (e.g. by a re-mapper like raid0). So we cannot trust it for reporting the source of an error. This patch takes care to find the correct bdev.