<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/drivers/md, branch v3.12.19</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.12.19</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.12.19'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2014-03-24T08:45:00Z</updated>
<entry>
<title>dm cache: fix access beyond end of origin device</title>
<updated>2014-03-24T08:45:00Z</updated>
<author>
<name>Heinz Mauelshagen</name>
<email>heinzm@redhat.com</email>
</author>
<published>2014-03-12T15:13:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b5ef2d0121f7494d8e06fb51e6646ac1b3693c86'/>
<id>urn:sha1:b5ef2d0121f7494d8e06fb51e6646ac1b3693c86</id>
<content type='text'>
commit e893fba90c09f9b57fb97daae204ea9cc2c52fa5 upstream.

In order to avoid wasting cache space a partial block at the end of the
origin device is not cached.  Unfortunately, the check for such a
partial block at the end of the origin device was flawed.

Fix accesses beyond the end of the origin device that occured due to
attempted promotion of an undetected partial block by:

- initializing the per bio data struct to allow cache_end_io to work properly
- recognizing access to the partial block at the end of the origin device
- avoiding out of bounds access to the discard bitset

Otherwise, users can experience errors like the following:

 attempt to access beyond end of device
 dm-5: rw=0, want=20971520, limit=20971456
 ...
 device-mapper: cache: promotion failed; couldn't copy block

Signed-off-by: Heinz Mauelshagen &lt;heinzm@redhat.com&gt;
Acked-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>dm cache: fix truncation bug when copying a block to/from &gt;2TB fast device</title>
<updated>2014-03-24T08:44:59Z</updated>
<author>
<name>Heinz Mauelshagen</name>
<email>heinzm@redhat.com</email>
</author>
<published>2014-03-11T23:40:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=67f1830fcdaefd65371bdd1bba38ca29cf7c2bbf'/>
<id>urn:sha1:67f1830fcdaefd65371bdd1bba38ca29cf7c2bbf</id>
<content type='text'>
commit 8b9d96666529a979acf4825391efcc7c8a3e9f12 upstream.

During demotion or promotion to a cache's &gt;2TB fast device we must not
truncate the cache block's associated sector to 32bits.  The 32bit
temporary result of from_cblock() caused a 32bit multiplication when
calculating the sector of the fast device in issue_copy_real().

Use an intermediate 64bit type to store the 32bit from_cblock() to allow
for proper 64bit multiplication.

Here is an example of how this bug manifests on an ext4 filesystem:

 EXT4-fs error (device dm-0): ext4_mb_generate_buddy:756: group 17136, 32768 clusters in bitmap, 30688 in gd; block bitmap corrupt.
 JBD2: Spotted dirty metadata buffer (dev = dm-0, blocknr = 0). There's a risk of filesystem corruption in case of system crash.

Signed-off-by: Heinz Mauelshagen &lt;heinzm@redhat.com&gt;
Acked-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>dm space map metadata: fix refcount decrement below 0 which caused corruption</title>
<updated>2014-03-24T08:44:59Z</updated>
<author>
<name>Joe Thornber</name>
<email>ejt@redhat.com</email>
</author>
<published>2014-03-07T14:57:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b07b194d5cdc0f8cf63bf0039574afef416fafce'/>
<id>urn:sha1:b07b194d5cdc0f8cf63bf0039574afef416fafce</id>
<content type='text'>
commit cebc2de44d3bce53e46476e774126c298ca2c8a9 upstream.

This has been a relatively long-standing issue that wasn't nailed down
until Teng-Feng Yang's meticulous bug report to dm-devel on 3/7/2014,
see: http://www.redhat.com/archives/dm-devel/2014-March/msg00021.html

From that report:
  "When decreasing the reference count of a metadata block with its
  reference count equals 3, we will call dm_btree_remove() to remove
  this enrty from the B+tree which keeps the reference count info in
  metadata device.

  The B+tree will try to rebalance the entry of the child nodes in each
  node it traversed, and the rebalance process contains the following
  steps.

  (1) Finding the corresponding children in current node (shadow_current(s))
  (2) Shadow the children block (issue BOP_INC)
  (3) redistribute keys among children, and free children if necessary (issue BOP_DEC)

  Since the update of a metadata block's reference count could be
  recursive, we will stash these reference count update operations in
  smm-&gt;uncommitted and then process them in a FILO fashion.

  The problem is that step(3) could free the children which is created
  in step(2), so the BOP_DEC issued in step(3) will be carried out
  before the BOP_INC issued in step(2) since these BOPs will be
  processed in FILO fashion. Once the BOP_DEC from step(3) tries to
  decrease the reference count of newly shadow block, it will report
  failure for its reference equals 0 before decreasing. It looks like we
  can solve this issue by processing these BOPs in a FIFO fashion
  instead of FILO."

Commit 5b564d80 ("dm space map: disallow decrementing a reference count
below zero") changed the code to report an error for this temporary
refcount decrement below zero.  So what was previously a harmless
invalid refcount became a hard failure due to the new error path:

 device-mapper: space map common: unable to decrement a reference count below 0
 device-mapper: thin: 253:6: dm_thin_insert_block() failed: error = -22
 device-mapper: thin: 253:6: switching pool to read-only mode

This bug is in dm persistent-data code that is common to the DM thin and
cache targets.  So any users of those targets should apply this fix.

Fix this by applying recursive space map operations in FIFO order rather
than FILO.

Resolves: https://bugzilla.kernel.org/show_bug.cgi?id=68801

Reported-by: Apollon Oikonomopoulos &lt;apoikos@debian.org&gt;
Reported-by: edwillam1007@gmail.com
Reported-by: Teng-Feng Yang &lt;shinrairis@gmail.com&gt;
Signed-off-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>dm thin: fix the error path for the thin device constructor</title>
<updated>2014-03-05T16:13:55Z</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2014-02-20T01:32:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5da2b53fd5cc451eea0a2d66bc9702724155bd66'/>
<id>urn:sha1:5da2b53fd5cc451eea0a2d66bc9702724155bd66</id>
<content type='text'>
commit 1acacc0784aab45627b6009e0e9224886279ac0b upstream.

dm_pool_close_thin_device() must be called if dm_set_target_max_io_len()
fails in thin_ctr().  Otherwise __pool_destroy() will fail because the
pool will still have an open thin device:

 device-mapper: thin metadata: attempt to close pmd when 1 device(s) are still open
 device-mapper: thin: __pool_destroy: dm_pool_metadata_close() failed.

Also, must establish error code if failing thin_ctr() because the pool
is in fail_io mode.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Acked-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>dm thin: avoid metadata commit if a pool's thin devices haven't changed</title>
<updated>2014-03-05T16:13:55Z</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2014-02-06T11:08:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d4c47a2b64e03a90281875b67a082a577de0ff43'/>
<id>urn:sha1:d4c47a2b64e03a90281875b67a082a577de0ff43</id>
<content type='text'>
commit 4d1662a30dde6e545086fe0e8fd7e474c4e0b639 upstream.

Commit 905e51b ("dm thin: commit outstanding data every second")
introduced a periodic commit.  This commit occurs regardless of whether
any thin devices have made changes.

Fix the periodic commit to check if any of a pool's thin devices have
changed using dm_pool_changed_this_transaction().

Reported-by: Alexander Larsson &lt;alexl@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Acked-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>dm mpath: fix stalls when handling invalid ioctls</title>
<updated>2014-03-05T16:13:55Z</updated>
<author>
<name>Hannes Reinecke</name>
<email>hare@suse.de</email>
</author>
<published>2014-02-26T09:07:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9673d7e741510a68069578a775cd0c49ef8825df'/>
<id>urn:sha1:9673d7e741510a68069578a775cd0c49ef8825df</id>
<content type='text'>
commit a1989b330093578ea5470bea0a00f940c444c466 upstream.

An invalid ioctl will never be valid, irrespective of whether multipath
has active paths or not.  So for invalid ioctls we do not have to wait
for multipath to activate any paths, but can rather return an error
code immediately.  This fix resolves numerous instances of:

 udevd[]: worker [] unexpectedly returned with status 0x0100

that have been seen during testing.

Signed-off-by: Hannes Reinecke &lt;hare@suse.de&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;

</content>
</entry>
<entry>
<title>md/raid5: Fix CPU hotplug callback registration</title>
<updated>2014-02-22T21:32:29Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2014-02-05T22:12:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2d470eab8a6136e063773be8c28794336c1055b8'/>
<id>urn:sha1:2d470eab8a6136e063773be8c28794336c1055b8</id>
<content type='text'>
commit 789b5e0315284463617e106baad360cb9e8db3ac upstream.

Subsystems that want to register CPU hotplug callbacks, as well as perform
initialization for the CPUs that are already online, often do it as shown
below:

	get_online_cpus();

	for_each_online_cpu(cpu)
		init_cpu(cpu);

	register_cpu_notifier(&amp;foobar_cpu_notifier);

	put_online_cpus();

This is wrong, since it is prone to ABBA deadlocks involving the
cpu_add_remove_lock and the cpu_hotplug.lock (when running concurrently
with CPU hotplug operations).

Interestingly, the raid5 code can actually prevent double initialization and
hence can use the following simplified form of callback registration:

	register_cpu_notifier(&amp;foobar_cpu_notifier);

	get_online_cpus();

	for_each_online_cpu(cpu)
		init_cpu(cpu);

	put_online_cpus();

A hotplug operation that occurs between registering the notifier and calling
get_online_cpus(), won't disrupt anything, because the code takes care to
perform the memory allocations only once.

So reorganize the code in raid5 this way to fix the deadlock with callback
registration.

Cc: linux-raid@vger.kernel.org
Fixes: 36d1c6476be51101778882897b315bd928c8c7b5
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
[Srivatsa: Fixed the unregister_cpu_notifier() deadlock, added the
free_scratch_buffer() helper to condense code further and wrote the changelog.]
Signed-off-by: Srivatsa S. Bhat &lt;srivatsa.bhat@linux.vnet.ibm.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md/raid1: restore ability for check and repair to fix read errors.</title>
<updated>2014-02-22T21:32:29Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2014-02-05T01:17:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=93bc05512d3f7bb51e1d0631be065099292708f2'/>
<id>urn:sha1:93bc05512d3f7bb51e1d0631be065099292708f2</id>
<content type='text'>
commit 1877db75589a895bbdc4c4c3f23558e57b521141 upstream.

commit 30bc9b53878a9921b02e3b5bc4283ac1c6de102a
    md/raid1: fix bio handling problems in process_checks()

Move the bio_reset() to a point before where BIO_UPTODATE is checked,
so that check now always report that the bio is uptodate, even if it is not.

This causes process_check() to sometimes treat read-errors as
successful matches so the good data isn't written out.

This patch preserves the flag until it is needed.

Bug was introduced in 3.11, but backported to 3.10-stable (as it fixed
an even worse bug).  So suitable for any -stable since 3.10.

Reported-and-tested-by: Michael Tokarev &lt;mjt@tls.msk.ru&gt;
Fixed: 30bc9b53878a9921b02e3b5bc4283ac1c6de102a
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>dm sysfs: fix a module unload race</title>
<updated>2014-02-13T21:50:22Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2014-01-14T00:37:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1a46dbe17b9771a38d702143d7e03b86e4377f7e'/>
<id>urn:sha1:1a46dbe17b9771a38d702143d7e03b86e4377f7e</id>
<content type='text'>
commit 2995fa78e423d7193f3b57835f6c1c75006a0315 upstream.

This reverts commit be35f48610 ("dm: wait until embedded kobject is
released before destroying a device") and provides an improved fix.

The kobject release code that calls the completion must be placed in a
non-module file, otherwise there is a module unload race (if the process
calling dm_kobject_release is preempted and the DM module unloaded after
the completion is triggered, but before dm_kobject_release returns).

To fix this race, this patch moves the completion code to dm-builtin.c
which is always compiled directly into the kernel if BLK_DEV_DM is
selected.

The patch introduces a new dm_kobject_holder structure, its purpose is
to keep the completion and kobject in one place, so that it can be
accessed from non-module code without the need to export the layout of
struct mapped_device to that code.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;


</content>
</entry>
<entry>
<title>dm space map metadata: fix bug in resizing of thin metadata</title>
<updated>2014-02-13T21:50:18Z</updated>
<author>
<name>Joe Thornber</name>
<email>ejt@redhat.com</email>
</author>
<published>2014-01-21T11:07:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f78c448024f25c7121493203a09b527ee8eeae8a'/>
<id>urn:sha1:f78c448024f25c7121493203a09b527ee8eeae8a</id>
<content type='text'>
commit fca028438fb903852beaf7c3fe1cd326651af57d upstream.

This bug was introduced in commit 7e664b3dec431e ("dm space map metadata:
fix extending the space map").

When extending a dm-thin metadata volume we:

- Switch the space map into a simple bootstrap mode, which allocates
  all space linearly from the newly added space.
- Add new bitmap entries for the new space
- Increment the reference counts for those newly allocated bitmap
  entries
- Commit changes to disk
- Switch back out of bootstrap mode.

But, the disk commit may allocate space itself, if so this fact will be
lost when switching out of bootstrap mode.

The bug exhibited itself as an error when the bitmap_root, with an
erroneous ref count of 0, was subsequently decremented as part of a
later disk commit.  This would cause the disk commit to fail, and thinp
to enter read_only mode.  The metadata was not damaged (thin_check
passed).

The fix is to put the increments + commit into a loop, running until
the commit has not allocated extra space.  In practise this loop only
runs twice.

With this fix the following device mapper testsuite test passes:
 dmtest run --suite thin-provisioning -n thin_remove_works_after_resize

Signed-off-by: Joe Thornber &lt;ejt@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
