<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/drivers/md/md.c, branch v3.0</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.0</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.0'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2011-06-28T06:59:42Z</updated>
<entry>
<title>md: avoid endless recovery loop when waiting for fail device to complete.</title>
<updated>2011-06-28T06:59:42Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-06-28T06:59:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4274215d24633df7302069e51426659d4759c5ed'/>
<id>urn:sha1:4274215d24633df7302069e51426659d4759c5ed</id>
<content type='text'>
If a device fails in a way that causes pending request to take a while
to complete, md will not be able to immediately remove it from the
array in remove_and_add_spares.
It will then incorrectly look like a spare device and md will try to
recover it even though it is failed.
This leads to a recovery process starting and instantly aborting over
and over again.

We should check if the device is faulty before considering it to be a
spare.  This will avoid trying to start a recovery that cannot
proceed.

This bug was introduced in 2.6.26 so that patch is suitable for any
kernel since then.

Cc: stable@kernel.org
Reported-by: Jim Paradis &lt;james.paradis@stratus.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>md: check -&gt;hot_remove_disk when removing disk</title>
<updated>2011-06-09T01:42:54Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@gmail.com</email>
</author>
<published>2011-06-09T01:42:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=01393f3d5836b7d62e925e6f4658a7eb22b83a11'/>
<id>urn:sha1:01393f3d5836b7d62e925e6f4658a7eb22b83a11</id>
<content type='text'>
Check pers-&gt;hot_remove_disk instead of pers-&gt;hot_add_disk in slot_store()
during disk removal. The linear personality only has -&gt;hot_add_disk and
no -&gt;hot_remove_disk, so that removing disk in the array resulted to
following kernel bug:

$ sudo mdadm --create /dev/md0 --level=linear --raid-devices=4 /dev/loop[0-3]
$ echo none | sudo tee /sys/block/md0/md/dev-loop2/slot
 BUG: unable to handle kernel NULL pointer dereference at           (null)
 IP: [&lt;          (null)&gt;]           (null)
 PGD c9f5d067 PUD 8575a067 PMD 0
 Oops: 0010 [#1] SMP
 CPU 2
 Modules linked in: linear loop bridge stp llc kvm_intel kvm asus_atk0110 sr_mod cdrom sg

 Pid: 10450, comm: tee Not tainted 3.0.0-rc1-leonard+ #173 System manufacturer System Product Name/P5G41TD-M PRO
 RIP: 0010:[&lt;0000000000000000&gt;]  [&lt;          (null)&gt;]           (null)
 RSP: 0018:ffff880085757df0  EFLAGS: 00010282
 RAX: ffffffffa00168e0 RBX: ffff8800d1431800 RCX: 000000000000006e
 RDX: 0000000000000001 RSI: 0000000000000002 RDI: ffff88008543c000
 RBP: ffff880085757e48 R08: 0000000000000002 R09: 000000000000000a
 R10: 0000000000000000 R11: ffff88008543c2e0 R12: 00000000ffffffff
 R13: ffff8800b4641000 R14: 0000000000000005 R15: 0000000000000000
 FS:  00007fe8c9e05700(0000) GS:ffff88011fa00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000000 CR3: 00000000b4502000 CR4: 00000000000406e0
 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
 DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
 Process tee (pid: 10450, threadinfo ffff880085756000, task ffff8800c9f08000)
 Stack:
  ffffffff8138496a ffff8800b4641000 ffff88008543c268 0000000000000000
  ffff8800b4641000 ffff88008543c000 ffff8800d1431868 ffffffff81a78a90
  ffff8800b4641000 ffff88008543c000 ffff8800d1431800 ffff880085757e98
 Call Trace:
  [&lt;ffffffff8138496a&gt;] ? slot_store+0xaa/0x265
  [&lt;ffffffff81384bae&gt;] rdev_attr_store+0x89/0xa8
  [&lt;ffffffff8115a96a&gt;] sysfs_write_file+0x108/0x144
  [&lt;ffffffff81106b87&gt;] vfs_write+0xb1/0x10d
  [&lt;ffffffff8106e6c0&gt;] ? trace_hardirqs_on_caller+0x111/0x135
  [&lt;ffffffff81106cac&gt;] sys_write+0x4d/0x77
  [&lt;ffffffff814fe702&gt;] system_call_fastpath+0x16/0x1b
 Code:  Bad RIP value.
 RIP  [&lt;          (null)&gt;]           (null)
  RSP &lt;ffff880085757df0&gt;
 CR2: 0000000000000000
 ---[ end trace ba5fc64319a826fb ]---

Signed-off-by: Namhyung Kim &lt;namhyung@gmail.com&gt;
Cc: stable@kernel.org
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>md: Using poll  /proc/mdstat can monitor the events of adding a spare disks</title>
<updated>2011-06-09T01:42:48Z</updated>
<author>
<name>马建朋</name>
<email>majianpeng@gmail.com</email>
</author>
<published>2011-06-09T01:42:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9864c0053d3da4c5731ac8a6c4835179310bd40a'/>
<id>urn:sha1:9864c0053d3da4c5731ac8a6c4835179310bd40a</id>
<content type='text'>
Signed-off-by: majianpeng &lt;majianpeng@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>MD: add sync_super to mddev_t struct</title>
<updated>2011-06-08T05:11:31Z</updated>
<author>
<name>Jonathan Brassow</name>
<email>jbrassow@redhat.com</email>
</author>
<published>2011-06-07T22:51:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=076f968b37f0232d883749da8f5031df5dea7ade'/>
<id>urn:sha1:076f968b37f0232d883749da8f5031df5dea7ade</id>
<content type='text'>
Add the 'sync_super' function pointer to MD array structure (struct mddev_s)

If device-mapper (dm-raid.c) is to define its own on-disk superblock and be
able to load it, there must still be a way for MD to initiate superblock
updates.  The simplest way to make this happen is to provide a pointer in
the MD array structure that can be set by device-mapper (or other module)
with a function to do this.  If the function has been set, it will be used;
otherwise, the method with be looked up via 'super_types' as usual.

Signed-off-by: Jonathan Brassow &lt;jbrassow@redhat.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>MD: move thread wakeups into resume</title>
<updated>2011-06-08T05:11:31Z</updated>
<author>
<name>Jonathan Brassow</name>
<email>jbrassow@redhat.com</email>
</author>
<published>2011-06-07T22:49:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0fd018af37dadbb7826850883ad8abfecdb1a00b'/>
<id>urn:sha1:0fd018af37dadbb7826850883ad8abfecdb1a00b</id>
<content type='text'>
Move personality and sync/recovery thread starting outside md_run.

Moving the wakeup's of the personality and sync/recovery threads out of
md_run and into do_md_run and mddev_resume solves two issues:
1) It allows bitmap_load to be called before the sync_thread is run and
2) when MD personalities are used by device-mapper (dm-raid.c), the start-up
of the array is better alligned with device-mapper primatives
(CTR/resume/suspend/DTR).  I/O - in this case, recovery operations - should
not happen until after a resume has taken place.

Signed-off-by: Jonathan Brassow &lt;jbrassow@redhat.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>MD: possible typo</title>
<updated>2011-06-08T05:11:31Z</updated>
<author>
<name>Jonathan Brassow</name>
<email>jbrassow@redhat.com</email>
</author>
<published>2011-06-07T22:48:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ac42450c7c814769bee963ae4b897c149bb0ab53'/>
<id>urn:sha1:ac42450c7c814769bee963ae4b897c149bb0ab53</id>
<content type='text'>
Make message a bit clearer by s/blocks/k/

I chose 'k' vs 'kiB' or 'kB' because it is what is used earlier in the
message.  'k' may be a bit ambigous, but I think it's better than "blocks"
which normally means 512, but means 1024 in MD.

Signed-off-by: Jonathan Brassow &lt;jbrassow@redhat.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>MD: no sync IO while suspended</title>
<updated>2011-06-08T05:10:08Z</updated>
<author>
<name>Jonathan Brassow</name>
<email>jbrassow@f14.redhat.com</email>
</author>
<published>2011-06-08T05:10:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=68866e425be2ef2664aa5c691bb3ab789736acf5'/>
<id>urn:sha1:68866e425be2ef2664aa5c691bb3ab789736acf5</id>
<content type='text'>
Disallow resync I/O while the RAID array is suspended.

Recovery, resync, and metadata I/O should not be allowed while a device is
suspended.

Signed-off-by: Jonathan Brassow &lt;jbrassow@redhat.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>MD: no integrity register if no gendisk</title>
<updated>2011-06-08T05:10:08Z</updated>
<author>
<name>Jonathan Brassow</name>
<email>jbrassow@f14.redhat.com</email>
</author>
<published>2011-06-08T05:10:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=629acb6abac0ef217ee579e14084af2ce7381dbc'/>
<id>urn:sha1:629acb6abac0ef217ee579e14084af2ce7381dbc</id>
<content type='text'>
Don't attempt md_integrity_register if there is no gendisk struct available.

When MD arrays are built via device-mapper, the gendisk structure is not
available via mddev.

Signed-off-by: Jonathan Brassow &lt;jbrassow@redhat.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>md: allow resync_start to be set while an array is active.</title>
<updated>2011-05-11T05:52:21Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-05-11T05:52:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b098636cf04c89db4036fedc778da0acc666ad1a'/>
<id>urn:sha1:b098636cf04c89db4036fedc778da0acc666ad1a</id>
<content type='text'>
The sysfs attribute 'resync_start' (known internally as recovery_cp),
records where a resync is up to.  A value of 0 means the array is
not known to be in-sync at all.  A value of MaxSector means the array
is believed to be fully in-sync.

When the size of member devices of an array (RAID1,RAID4/5/6) is
increased, the array can be increased to match.  This process sets
resync_start to the old end-of-device offset so that the new part of
the array gets resynced.

However with RAID1 (and RAID6) a resync is not technically necessary
and may be undesirable.  So it would be good if the implied resync
after the array is resized could be avoided.

So: change 'resync_start' so the value can be changed while the array
is active, and as a precaution only allow it to be changed while
resync/recovery is 'frozen'.  Changing it once resync has started is
not going to be useful anyway.

This allows the array to be resized without a resync by:
  write 'frozen' to 'sync_action'
  write new size to 'component_size' (this will set resync_start)
  write 'none' to 'resync_start'
  write 'idle' to 'sync_action'.

Also slightly improve some tests on recovery_cp when resizing
raid1/raid5.  Now that an arbitrary value could be set we should be
more careful in our tests.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
<entry>
<title>md: reject a re-add request that cannot be honoured.</title>
<updated>2011-05-11T04:26:20Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2011-05-11T04:26:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bedd86b7773fd97f0d708cc0c371c8963ba7ba9a'/>
<id>urn:sha1:bedd86b7773fd97f0d708cc0c371c8963ba7ba9a</id>
<content type='text'>
The 'add_new_disk' ioctl can be used to add a device either as a
spare, or as an active disk that just needs to be resynced based on
write-intent-bitmap information (re-add)

Currently if a re-add is requested but fails we add as a spare
instead.  This makes it impossible for user-space to check for
failure.

So change to require that a re-add attempt will either succeed or
completely fail.  User-space can then decide what to do next.

Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
</content>
</entry>
</feed>
