<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/drivers/md, branch v4.19.280</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.19.280</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.19.280'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2023-04-05T09:15:39Z</updated>
<entry>
<title>md: avoid signed overflow in slot_store()</title>
<updated>2023-04-05T09:15:39Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.de</email>
</author>
<published>2023-03-05T22:36:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b7de906cf91457c655edb0cb734df5a19f970ed3'/>
<id>urn:sha1:b7de906cf91457c655edb0cb734df5a19f970ed3</id>
<content type='text'>
[ Upstream commit 3bc57292278a0b6ac4656cad94c14f2453344b57 ]

slot_store() uses kstrtouint() to get a slot number, but stores the
result in an "int" variable (by casting a pointer).
This can result in a negative slot number if the unsigned int value is
very large.

A negative number means that the slot is empty, but setting a negative
slot number this way will not remove the device from the array.  I don't
think this is a serious problem, but it could cause confusion and it is
best to fix it.

Reported-by: Dan Carpenter &lt;error27@gmail.com&gt;
Signed-off-by: NeilBrown &lt;neilb@suse.de&gt;
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>dm crypt: add cond_resched() to dmcrypt_write()</title>
<updated>2023-04-05T09:15:38Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2023-03-06T16:17:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7b9f8efb5fc888dd938d2964e705b8e00f1dc0f6'/>
<id>urn:sha1:7b9f8efb5fc888dd938d2964e705b8e00f1dc0f6</id>
<content type='text'>
commit fb294b1c0ba982144ca467a75e7d01ff26304e2b upstream.

The loop in dmcrypt_write may be running for unbounded amount of time,
thus we need cond_resched() in it.

This commit fixes the following warning:

[ 3391.153255][   C12] watchdog: BUG: soft lockup - CPU#12 stuck for 23s! [dmcrypt_write/2:2897]
...
[ 3391.387210][   C12] Call trace:
[ 3391.390338][   C12]  blk_attempt_bio_merge.part.6+0x38/0x158
[ 3391.395970][   C12]  blk_attempt_plug_merge+0xc0/0x1b0
[ 3391.401085][   C12]  blk_mq_submit_bio+0x398/0x550
[ 3391.405856][   C12]  submit_bio_noacct+0x308/0x380
[ 3391.410630][   C12]  dmcrypt_write+0x1e4/0x208 [dm_crypt]
[ 3391.416005][   C12]  kthread+0x130/0x138
[ 3391.419911][   C12]  ret_from_fork+0x10/0x18

Reported-by: yangerkun &lt;yangerkun@huawei.com&gt;
Fixes: dc2676210c42 ("dm crypt: offload writes to thread")
Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>dm stats: check for and propagate alloc_percpu failure</title>
<updated>2023-04-05T09:15:38Z</updated>
<author>
<name>Jiasheng Jiang</name>
<email>jiasheng@iscas.ac.cn</email>
</author>
<published>2023-03-16T06:55:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0d96bd507ed7e7d565b6d53ebd3874686f123b2e'/>
<id>urn:sha1:0d96bd507ed7e7d565b6d53ebd3874686f123b2e</id>
<content type='text'>
commit d3aa3e060c4a80827eb801fc448debc9daa7c46b upstream.

Check alloc_precpu()'s return value and return an error from
dm_stats_init() if it fails. Update alloc_dev() to fail if
dm_stats_init() does.

Otherwise, a NULL pointer dereference will occur in dm_stats_cleanup()
even if dm-stats isn't being actively used.

Fixes: fd2ed4d25270 ("dm: add statistics support")
Cc: stable@vger.kernel.org
Signed-off-by: Jiasheng Jiang &lt;jiasheng@iscas.ac.cn&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>dm thin: fix deadlock when swapping to thin device</title>
<updated>2023-04-05T09:15:37Z</updated>
<author>
<name>Coly Li</name>
<email>colyli@suse.de</email>
</author>
<published>2023-02-27T15:23:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=84e13235e08941ce37aa9ee238b6dd007170f0fc'/>
<id>urn:sha1:84e13235e08941ce37aa9ee238b6dd007170f0fc</id>
<content type='text'>
commit 9bbf5feecc7eab2c370496c1c161bbfe62084028 upstream.

This is an already known issue that dm-thin volume cannot be used as
swap, otherwise a deadlock may happen when dm-thin internal memory
demand triggers swap I/O on the dm-thin volume itself.

But thanks to commit a666e5c05e7c ("dm: fix deadlock when swapping to
encrypted device"), the limit_swap_bios target flag can also be used
for dm-thin to avoid the recursive I/O when it is used as swap.

Fix is to simply set ti-&gt;limit_swap_bios to true in both pool_ctr()
and thin_ctr().

In my test, I create a dm-thin volume /dev/vg/swap and use it as swap
device. Then I run fio on another dm-thin volume /dev/vg/main and use
large --blocksize to trigger swap I/O onto /dev/vg/swap.

The following fio command line is used in my test,
  fio --name recursive-swap-io --lockmem 1 --iodepth 128 \
     --ioengine libaio --filename /dev/vg/main --rw randrw \
    --blocksize 1M --numjobs 32 --time_based --runtime=12h

Without this fix, the whole system can be locked up within 15 seconds.

With this fix, there is no any deadlock or hung task observed after
2 hours of running fio.

Furthermore, if blocksize is changed from 1M to 128M, after around 30
seconds fio has no visible I/O, and the out-of-memory killer message
shows up in kernel message. After around 20 minutes all fio processes
are killed and the whole system is back to being alive.

This is exactly what is expected when recursive I/O happens on dm-thin
volume when it is used as swap.

Depends-on: a666e5c05e7c ("dm: fix deadlock when swapping to encrypted device")
Cc: stable@vger.kernel.org
Signed-off-by: Coly Li &lt;colyli@suse.de&gt;
Acked-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>dm flakey: don't corrupt the zero page</title>
<updated>2023-03-11T15:31:53Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2023-01-22T19:02:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=98e311be44dbe31ad9c42aa067b2359bac451fda'/>
<id>urn:sha1:98e311be44dbe31ad9c42aa067b2359bac451fda</id>
<content type='text'>
commit f50714b57aecb6b3dc81d578e295f86d9c73f078 upstream.

When we need to zero some range on a block device, the function
__blkdev_issue_zero_pages submits a write bio with the bio vector pointing
to the zero page. If we use dm-flakey with corrupt bio writes option, it
will corrupt the content of the zero page which results in crashes of
various userspace programs. Glibc assumes that memory returned by mmap is
zeroed and it uses it for calloc implementation; if the newly mapped
memory is not zeroed, calloc will return non-zeroed memory.

Fix this bug by testing if the page is equal to ZERO_PAGE(0) and
avoiding the corruption in this case.

Cc: stable@vger.kernel.org
Fixes: a00f5276e266 ("dm flakey: Properly corrupt multi-page bios.")
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Reviewed-by: Sweet Tea Dorminy &lt;sweettea-kernel@dorminy.me&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>dm flakey: fix logic when corrupting a bio</title>
<updated>2023-03-11T15:31:52Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2023-01-22T19:03:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ce551c9ba3d9fe7e7e24a1dd863f827b46360aaa'/>
<id>urn:sha1:ce551c9ba3d9fe7e7e24a1dd863f827b46360aaa</id>
<content type='text'>
commit aa56b9b75996ff4c76a0a4181c2fa0206c3d91cc upstream.

If "corrupt_bio_byte" is set to corrupt reads and corrupt_bio_flags is
used, dm-flakey would erroneously return all writes as errors. Likewise,
if "corrupt_bio_byte" is set to corrupt writes, dm-flakey would return
errors for all reads.

Fix the logic so that if fc-&gt;corrupt_bio_byte is non-zero, dm-flakey
will not abort reads on writes with an error.

Cc: stable@vger.kernel.org
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Reviewed-by: Sweet Tea Dorminy &lt;sweettea-kernel@dorminy.me&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>dm cache: add cond_resched() to various workqueue loops</title>
<updated>2023-03-11T15:31:48Z</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@kernel.org</email>
</author>
<published>2023-02-16T20:31:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=11e842656fea0fc54322637bda44aca7d5cdf376'/>
<id>urn:sha1:11e842656fea0fc54322637bda44aca7d5cdf376</id>
<content type='text'>
[ Upstream commit 76227f6dc805e9e960128bcc6276647361e0827c ]

Otherwise on resource constrained systems these workqueues may be too
greedy.

Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>dm thin: add cond_resched() to various workqueue loops</title>
<updated>2023-03-11T15:31:48Z</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@kernel.org</email>
</author>
<published>2023-02-16T20:29:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5eac0da3f27a46096388b3135c7646751f63fd66'/>
<id>urn:sha1:5eac0da3f27a46096388b3135c7646751f63fd66</id>
<content type='text'>
[ Upstream commit e4f80303c2353952e6e980b23914e4214487f2a6 ]

Otherwise on resource constrained systems these workqueues may be too
greedy.

Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>dm: remove flush_scheduled_work() during local_exit()</title>
<updated>2023-03-11T15:31:42Z</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@kernel.org</email>
</author>
<published>2023-02-14T18:06:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4ae9a90d19649e2f9e9ce2d088be4b61c9db9cb3'/>
<id>urn:sha1:4ae9a90d19649e2f9e9ce2d088be4b61c9db9cb3</id>
<content type='text'>
[ Upstream commit 0b22ff5360f5c4e11050b89206370fdf7dc0a226 ]

Commit acfe0ad74d2e1 ("dm: allocate a special workqueue for deferred
device removal") switched from using system workqueue to a single
workqueue local to DM.  But it didn't eliminate the call to
flush_scheduled_work() that was introduced purely for the benefit of
deferred device removal with commit 2c140a246dc ("dm: allow remove to
be deferred").

Since DM core uses its own workqueue (and queue_work) there is no need
to call flush_scheduled_work() from local_exit().  local_exit()'s
destroy_workqueue(deferred_remove_workqueue) handles flushing work
started with queue_work().

Fixes: acfe0ad74d2e1 ("dm: allocate a special workqueue for deferred device removal")
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>dm thin: resume even if in FAIL mode</title>
<updated>2023-01-18T10:30:47Z</updated>
<author>
<name>Luo Meng</name>
<email>luomeng12@huawei.com</email>
</author>
<published>2022-11-30T02:09:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d3c61219a7d396a3b5d31b85a16ded67fdaf9ed7'/>
<id>urn:sha1:d3c61219a7d396a3b5d31b85a16ded67fdaf9ed7</id>
<content type='text'>
[ Upstream commit 19eb1650afeb1aa86151f61900e9e5f1de5d8d02 ]

If a thinpool set fail_io while suspending, resume will fail with:
 device-mapper: resume ioctl on vg-thinpool  failed: Invalid argument

The thin-pool also can't be removed if an in-flight bio is in the
deferred list.

This can be easily reproduced using:

  echo "offline" &gt; /sys/block/sda/device/state
  dd if=/dev/zero of=/dev/mapper/thin bs=4K count=1
  dmsetup suspend /dev/mapper/pool
  mkfs.ext4 /dev/mapper/thin
  dmsetup resume /dev/mapper/pool

The root cause is maybe_resize_data_dev() will check fail_io and return
error before called dm_resume.

Fix this by adding FAIL mode check at the end of pool_preresume().

Cc: stable@vger.kernel.org
Fixes: da105ed5fd7e ("dm thin metadata: introduce dm_pool_abort_metadata")
Signed-off-by: Luo Meng &lt;luomeng12@huawei.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
