<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/block, branch v5.10.32</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.32</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.32'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2021-04-16T09:43:21Z</updated>
<entry>
<title>block: only update parent bi_status when bio fail</title>
<updated>2021-04-16T09:43:21Z</updated>
<author>
<name>Yufen Yu</name>
<email>yuyufen@huawei.com</email>
</author>
<published>2021-03-31T11:53:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1d2310d95fb8e29e69ebfc038919c968fbbdcb64'/>
<id>urn:sha1:1d2310d95fb8e29e69ebfc038919c968fbbdcb64</id>
<content type='text'>
[ Upstream commit 3edf5346e4f2ce2fa0c94651a90a8dda169565ee ]

When a bio is split into multiple bios, the whole I/O should
return an error to the application if any one of the split bios
fails. But we found a race between bio_integrity_verify_fn and
bio completion that can report I/O success to the application
even though one of the split bios failed. The race is as
follows:

split bio(READ)          kworker

nvme_complete_rq
blk_update_request //split error=0
  bio_endio
    bio_integrity_endio
      queue_work(kintegrityd_wq, &amp;bip-&gt;bip_work);

                         bio_integrity_verify_fn
                         bio_endio //split bio
                          __bio_chain_endio
                             if (!parent-&gt;bi_status)

                               &lt;interrupt entry&gt;
                               nvme_irq
                                 blk_update_request //parent error=7
                                 req_bio_endio
                                    bio-&gt;bi_status = 7 //parent bio
                               &lt;interrupt exit&gt;

                               parent-&gt;bi_status = 0
                        parent-&gt;bi_end_io() // return bi_status=0

The bio has been split into two: the split bio and the parent.
When the split bio completes, it relies on a kworker to run its
endio path, and bio_integrity_verify_fn can be interrupted by
the parent bio's completion irq handler. The parent
bio-&gt;bi_status that was set in the irq handler is then
overwritten by the kworker.

In fact, even without the above race, we also need to consider
the concurrency between multiple split bios completing and
updating the same parent bi_status. Normally, multiple split
bios are issued to the same hctx and complete from the same irq
vector. But if the queue map has been updated between multiple
split bios, these bios may complete on different hw queues and
different irq vectors. The concurrent updates of the parent
bi_status may then leave the final status wrong.
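
A minimal sketch of the sort of change this implies (assuming it
lands in __bio_chain_endio() in block/bio.c): only propagate a
child's error when that child actually failed and the parent has
not already recorded one, so a late store can no longer clear a
status set from irq context:

static struct bio *__bio_chain_endio(struct bio *bio)
{
        struct bio *parent = bio-&gt;bi_private;

        /* only record failures; never overwrite the parent's status with 0 */
        if (bio-&gt;bi_status &amp;&amp; !parent-&gt;bi_status)
                parent-&gt;bi_status = bio-&gt;bi_status;
        bio_put(bio);
        return parent;
}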

Suggested-by: Keith Busch &lt;kbusch@kernel.org&gt;
Signed-off-by: Yufen Yu &lt;yuyufen@huawei.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Link: https://lore.kernel.org/r/20210331115359.1125679-1-yuyufen@huawei.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: recalculate segment count for multi-segment discards correctly</title>
<updated>2021-03-30T12:32:06Z</updated>
<author>
<name>David Jeffery</name>
<email>djeffery@redhat.com</email>
</author>
<published>2021-02-11T14:38:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fc062d21c011dc9e9e49f20e26fb5930fa24c720'/>
<id>urn:sha1:fc062d21c011dc9e9e49f20e26fb5930fa24c720</id>
<content type='text'>
[ Upstream commit a958937ff166fc60d1c3a721036f6ff41bfa2821 ]

When a stacked block device inserts a request into another block device
using blk_insert_cloned_request, the request's nr_phys_segments field gets
recalculated by a call to blk_recalc_rq_segments in
blk_cloned_rq_check_limits. But blk_recalc_rq_segments does not know how to
handle multi-segment discards. For disk types which can handle
multi-segment discards like nvme, this results in discard requests which
claim a single segment when they should report several, triggering a
warning in nvme and causing nvme to fail the discard due to the invalid
state.

 WARNING: CPU: 5 PID: 191 at drivers/nvme/host/core.c:700 nvme_setup_discard+0x170/0x1e0 [nvme_core]
 ...
 nvme_setup_cmd+0x217/0x270 [nvme_core]
 nvme_loop_queue_rq+0x51/0x1b0 [nvme_loop]
 __blk_mq_try_issue_directly+0xe7/0x1b0
 blk_mq_request_issue_directly+0x41/0x70
 ? blk_account_io_start+0x40/0x50
 dm_mq_queue_rq+0x200/0x3e0
 blk_mq_dispatch_rq_list+0x10a/0x7d0
 ? __sbitmap_queue_get+0x25/0x90
 ? elv_rb_del+0x1f/0x30
 ? deadline_remove_request+0x55/0xb0
 ? dd_dispatch_request+0x181/0x210
 __blk_mq_do_dispatch_sched+0x144/0x290
 ? bio_attempt_discard_merge+0x134/0x1f0
 __blk_mq_sched_dispatch_requests+0x129/0x180
 blk_mq_sched_dispatch_requests+0x30/0x60
 __blk_mq_run_hw_queue+0x47/0xe0
 __blk_mq_delay_run_hw_queue+0x15b/0x170
 blk_mq_sched_insert_requests+0x68/0xe0
 blk_mq_flush_plug_list+0xf0/0x170
 blk_finish_plug+0x36/0x50
 xlog_cil_committed+0x19f/0x290 [xfs]
 xlog_cil_process_committed+0x57/0x80 [xfs]
 xlog_state_do_callback+0x1e0/0x2a0 [xfs]
 xlog_ioend_work+0x2f/0x80 [xfs]
 process_one_work+0x1b6/0x350
 worker_thread+0x53/0x3e0
 ? process_one_work+0x350/0x350
 kthread+0x11b/0x140
 ? __kthread_bind_mask+0x60/0x60
 ret_from_fork+0x22/0x30

This patch fixes blk_recalc_rq_segments to be aware of devices which can
have multi-segment discards. It calculates the correct discard segment
count by counting the number of bios, as each discard bio is considered
its own segment.
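
A rough sketch of the approach (not necessarily the exact diff), inside
blk_recalc_rq_segments(), where nr_phys_segs is the function's running
segment count:

switch (bio_op(rq-&gt;bio)) {
case REQ_OP_DISCARD:
case REQ_OP_SECURE_ERASE:
        if (queue_max_discard_segments(rq-&gt;q) &gt; 1) {
                struct bio *bio = rq-&gt;bio;

                /* each discard bio counts as its own segment */
                for_each_bio(bio)
                        nr_phys_segs++;
                return nr_phys_segs;
        }
        return 1;
/* ... other op types ... */
}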

Fixes: 1e739730c5b9 ("block: optionally merge discontiguous discard bios into a single request")
Signed-off-by: David Jeffery &lt;djeffery@redhat.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Laurence Oberman &lt;loberman@redhat.com&gt;
Link: https://lore.kernel.org/r/20210211143807.GA115624@redhat
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: Suppress uevent for hidden device when removed</title>
<updated>2021-03-30T12:31:52Z</updated>
<author>
<name>Daniel Wagner</name>
<email>dwagner@suse.de</email>
</author>
<published>2021-03-11T15:19:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=07feac84efc65c7d0a4ad44096334766bbe68dcb'/>
<id>urn:sha1:07feac84efc65c7d0a4ad44096334766bbe68dcb</id>
<content type='text'>
[ Upstream commit 9ec491447b90ad6a4056a9656b13f0b3a1e83043 ]

register_disk() suppresses uevents for devices with the GENHD_FL_HIDDEN
flag but re-enables uevents at the end in order to announce the disk
after possible partitions are created.

When the device is removed, the uevents are still on and userland sees
'remove' messages for devices which were never 'add'ed to the system.

  KERNEL[95481.571887] remove   /devices/virtual/nvme-fabrics/ctl/nvme5/nvme0c5n1 (block)

Let's suppress the uevents for GENHD_FL_HIDDEN by not enabling the
uevents at all.
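
A minimal sketch of that idea (assuming the change sits in
register_disk() in block/genhd.c, where ddev = disk_to_dev(disk)):

if (disk-&gt;flags &amp; GENHD_FL_HIDDEN) {
        /* leave uevents suppressed; no KOBJ_ADD was ever emitted */
        return;
}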

Signed-off-by: Daniel Wagner &lt;dwagner@suse.de&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Martin Wilck &lt;mwilck@suse.com&gt;
Link: https://lore.kernel.org/r/20210311151917.136091-1-dwagner@suse.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: Fix REQ_OP_ZONE_RESET_ALL handling</title>
<updated>2021-03-30T12:31:51Z</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@wdc.com</email>
</author>
<published>2021-03-10T09:09:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d27b0964ade97211fa7a8cd0010ddc8737a054a5'/>
<id>urn:sha1:d27b0964ade97211fa7a8cd0010ddc8737a054a5</id>
<content type='text'>
[ Upstream commit faa44c69daf9ccbd5b8a1aee13e0e0d037c0be17 ]

Similarly to a single zone reset operation (REQ_OP_ZONE_RESET), execute
REQ_OP_ZONE_RESET_ALL operations with REQ_SYNC set.
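
A hedged sketch of where that flag would be applied (helper name and
context assumed from v5.10-era block/blk-zoned.c):

/* Sketch: in blkdev_zone_mgmt(), whole-device reset fast path */
if (op == REQ_OP_ZONE_RESET &amp;&amp;
    blkdev_allow_reset_all_zones(bdev, sector, nr_sectors)) {
        bio-&gt;bi_opf = REQ_OP_ZONE_RESET_ALL | REQ_SYNC;
        break;
}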

Signed-off-by: Damien Le Moal &lt;damien.lemoal@wdc.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>blk-cgroup: Fix the recursive blkg rwstat</title>
<updated>2021-03-30T12:31:48Z</updated>
<author>
<name>Xunlei Pang</name>
<email>xlpang@linux.alibaba.com</email>
</author>
<published>2021-03-05T08:13:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=71b996c9b883313be4320954c902e84031399fd9'/>
<id>urn:sha1:71b996c9b883313be4320954c902e84031399fd9</id>
<content type='text'>
[ Upstream commit 4f44657d74873735e93a50eb25014721a66aac19 ]

The current blkio.throttle.io_service_bytes_recursive doesn't
work correctly.

As an example, for the following blkcg hierarchy:
 (Made 1GB READ in test1, 512MB READ in test2)
     test
    /    \
 test1   test2

$ head -n 1 test/test1/blkio.throttle.io_service_bytes_recursive
8:0 Read 1073684480
$ head -n 1 test/test2/blkio.throttle.io_service_bytes_recursive
8:0 Read 537448448
$ head -n 1 test/blkio.throttle.io_service_bytes_recursive
8:0 Read 537448448

Clearly, the above data for "test" reflects only "test2", not
"test1"+"test2".

Do the summation correctly in blkg_rwstat_recursive_sum().
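
The symptom above, the parent reflecting only the last child visited,
is what you would get if each descendant's counters were assigned to
the sum instead of accumulated. A hedged sketch of the inner loop of
blkg_rwstat_recursive_sum() (field and helper names assumed from
v5.10-era block/blk-cgroup-rwstat.c):

/* accumulate each descendant's counters rather than overwriting */
for (i = 0; i &lt; BLKG_RWSTAT_NR; i++)
        sum-&gt;cnt[i] += blkg_rwstat_read_counter(rwstat, i);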

Signed-off-by: Xunlei Pang &lt;xlpang@linux.alibaba.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: Discard page cache of zone reset target range</title>
<updated>2021-03-17T16:06:27Z</updated>
<author>
<name>Shin'ichiro Kawasaki</name>
<email>shinichiro.kawasaki@wdc.com</email>
</author>
<published>2021-03-11T07:25:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a53477849286c518232231e8983629d33d0499a8'/>
<id>urn:sha1:a53477849286c518232231e8983629d33d0499a8</id>
<content type='text'>
commit e5113505904ea1c1c0e1f92c1cfa91fbf4da1694 upstream.

When a zone reset ioctl and a data read race for the same zone on a
zoned block device, the read leaves stale pages in the page cache even
though the zone reset ioctl zero-clears all the zone data on the
device. To avoid reading non-zero data from the stale page cache after
a zone reset, discard the page cache of the reset target zones in
blkdev_zone_mgmt_ioctl(). Introduce the helper function
blkdev_truncate_zone_range() to discard the page cache. Ensure the page
cache is discarded by calling the helper function before and after the
zone reset, in the same manner as fallocate does.

This patch can be applied back to the stable kernel version v5.10.y.
Rework is needed for older stable kernels.
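
A sketch of what such a helper might look like, reusing the existing
truncate_bdev_range() that fallocate already goes through (exact bounds
checks assumed):

static int blkdev_truncate_zone_range(struct block_device *bdev,
                                      fmode_t mode,
                                      const struct blk_zone_range *zrange)
{
        loff_t start, end;

        if (zrange-&gt;sector + zrange-&gt;nr_sectors &lt;= zrange-&gt;sector ||
            zrange-&gt;sector + zrange-&gt;nr_sectors &gt;
            get_capacity(bdev-&gt;bd_disk))
                return -EINVAL;         /* out of range */

        start = zrange-&gt;sector &lt;&lt; SECTOR_SHIFT;
        end = ((zrange-&gt;sector + zrange-&gt;nr_sectors) &lt;&lt; SECTOR_SHIFT) - 1;

        /* drop page cache pages covering the target zones */
        return truncate_bdev_range(bdev, mode, start, end);
}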

Signed-off-by: Shin'ichiro Kawasaki &lt;shinichiro.kawasaki@wdc.com&gt;
Fixes: 3ed05a987e0f ("blk-zoned: implement ioctls")
Cc: &lt;stable@vger.kernel.org&gt; # 5.10+
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Johannes Thumshirn &lt;johannes.thumshirn@wdc.com&gt;
Link: https://lore.kernel.org/r/20210311072546.678999-1-shinichiro.kawasaki@wdc.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>blk-settings: align max_sectors on "logical_block_size" boundary</title>
<updated>2021-03-04T10:38:22Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2021-02-24T02:25:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=556c513e6bac619b19edbfc63bfa3fc0217a83ea'/>
<id>urn:sha1:556c513e6bac619b19edbfc63bfa3fc0217a83ea</id>
<content type='text'>
commit 97f433c3601a24d3513d06f575a389a2ca4e11e4 upstream.

We get I/O errors when we run md-raid1 on the top of dm-integrity on the
top of ramdisk.
device-mapper: integrity: Bio not aligned on 8 sectors: 0xff00, 0xff
device-mapper: integrity: Bio not aligned on 8 sectors: 0xff00, 0xff
device-mapper: integrity: Bio not aligned on 8 sectors: 0xffff, 0x1
device-mapper: integrity: Bio not aligned on 8 sectors: 0xffff, 0x1
device-mapper: integrity: Bio not aligned on 8 sectors: 0x8048, 0xff
device-mapper: integrity: Bio not aligned on 8 sectors: 0x8147, 0xff
device-mapper: integrity: Bio not aligned on 8 sectors: 0x8246, 0xff
device-mapper: integrity: Bio not aligned on 8 sectors: 0x8345, 0xbb

The ramdisk device has logical_block_size 512 and max_sectors 255. The
dm-integrity device uses logical_block_size 4096 but does not adjust the
"max_sectors" value - thus, it inherits 255 from the ramdisk. So, we have
a device with max_sectors not aligned on logical_block_size.

The md-raid device sees that the underlying leg has max_sectors 255 and
will split the bios on a 255-sector boundary, making the bios unaligned
on logical_block_size.

In order to fix the bug, we round down max_sectors to a multiple of
logical_block_size.
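
A sketch of the rounding (placed where max_sectors is computed, e.g. in
blk_queue_max_hw_sectors() and blk_stack_limits(); round_down() needs a
power-of-two divisor, which the logical block size in sectors is):

max_sectors = round_down(max_sectors,
                         limits-&gt;logical_block_size &gt;&gt; SECTOR_SHIFT);
limits-&gt;max_sectors = max_sectors;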

Cc: stable@vger.kernel.org
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>block: reopen the device in blkdev_reread_part</title>
<updated>2021-03-04T10:38:21Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2021-02-23T15:18:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cc88a819a14c2d2948090b6ddb2db2eee2904efb'/>
<id>urn:sha1:cc88a819a14c2d2948090b6ddb2db2eee2904efb</id>
<content type='text'>
[ Upstream commit 4601b4b130de2329fe06df80ed5d77265f2058e5 ]

Historically the BLKRRPART ioctls called into the now defunct -&gt;revalidate
method, which caused the sd driver to check if any media is present.
When the -&gt;revalidate method was removed this revalidation was lost,
leading to lots of I/O errors when using the eject command.  Fix this by
reopening the device to rescan the partitions, and thus calling the
revalidation logic in the sd driver.
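
A sketch of that reopen-based rescan in blkdev_reread_part() (exact
v5.10 context assumed; GD_NEED_PART_SCAN makes the subsequent open
rescan the partition table):

/* Sketch: replace the direct rescan with a full reopen cycle */
mode &amp;= ~FMODE_EXCL;
set_bit(GD_NEED_PART_SCAN, &amp;bdev-&gt;bd_disk-&gt;state);

tmp = blkdev_get_by_dev(bdev-&gt;bd_dev, mode, NULL);
if (IS_ERR(tmp))
        return PTR_ERR(tmp);
blkdev_put(tmp, mode);
return 0;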

Fixes: 471bd0af544b ("sd: use bdev_check_media_change")
Reported-by: Tom Seewald &lt;tseewald@gmail.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Tested-by: Tom Seewald &lt;tseewald@gmail.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Minwoo Im &lt;minwoo.im.dev@gmail.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bsg: free the request before return error code</title>
<updated>2021-03-04T10:37:42Z</updated>
<author>
<name>Pan Bian</name>
<email>bianpan2016@163.com</email>
</author>
<published>2021-01-19T12:33:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1c7b7d476e6aedd2a72ee08e3e59a14cf120946f'/>
<id>urn:sha1:1c7b7d476e6aedd2a72ee08e3e59a14cf120946f</id>
<content type='text'>
[ Upstream commit 0f7b4bc6bb1e57c48ef14f1818df947c1612b206 ]

Free the request rq before returning an error code.
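
The shape of the fix, sketched against a bsg_sg_io()-style error path
(function and variable names assumed):

ret = q-&gt;bsg_dev.ops-&gt;fill_hdr(rq, &amp;hdr, mode);
if (ret) {
        blk_put_request(rq);    /* don't leak the request on error */
        return ret;
}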

Fixes: 972248e9111e ("scsi: bsg-lib: handle bidi requests without block layer help")
Signed-off-by: Pan Bian &lt;bianpan2016@163.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>bfq: Avoid false bfq queue merging</title>
<updated>2021-03-04T10:37:18Z</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2020-06-05T14:16:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=89e3d1a85df80de70239582f44c91ed943f50006'/>
<id>urn:sha1:89e3d1a85df80de70239582f44c91ed943f50006</id>
<content type='text'>
commit 41e76c85660c022c6bf5713bfb6c21e64a487cec upstream.

bfq_setup_cooperator() uses bfqd-&gt;in_serv_last_pos to detect whether it
makes sense to merge the current bfq queue with the in-service queue.
However, if the in-service queue is freshly scheduled and hasn't
dispatched any requests yet, bfqd-&gt;in_serv_last_pos is stale and holds
a value from the previously scheduled bfq queue, which can result in a
bogus decision that the two queues should be merged. This bug can be
observed, for example, with the following fio jobfile:

[global]
direct=0
ioengine=sync
invalidate=1
size=1g
rw=read

[reader]
numjobs=4
directory=/mnt

where the 4 processes end up in one shared bfq queue although they do
IO to physically very distant files (for some reason I was able to
observe this only with the slice_idle=1ms setting).

Fix the problem by invalidating bfqd-&gt;in_serv_last_pos when switching
in-service queue.
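
A sketch of that invalidation (assuming it lands in
__bfq_set_in_service_queue() in block/bfq-iosched.c):

/* Sketch: when switching the in-service queue */
bfqd-&gt;in_service_queue = bfqq;
bfqd-&gt;in_serv_last_pos = 0;     /* stale position must not be reused */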

Fixes: 058fdecc6de7 ("block, bfq: fix in-service-queue check for queue merging")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Acked-by: Paolo Valente &lt;paolo.valente@linaro.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
