<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/block, branch v6.8.12</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.8.12</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.8.12'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2024-05-30T07:49:03Z</updated>
<entry>
<title>block: support to account io_ticks precisely</title>
<updated>2024-05-30T07:49:03Z</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2024-05-09T12:37:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ac18de98a60f1d162248cb4e362ad18a264594a0'/>
<id>urn:sha1:ac18de98a60f1d162248cb4e362ad18a264594a0</id>
<content type='text'>
[ Upstream commit 99dc422335d8b2bd4d105797241d3e715bae90e9 ]

Currently, io_ticks is accounted based on sampling: update_io_ticks()
always accounts io_ticks by 1 jiffy from
bdev_start_io_acct()/blk_account_io_start(), and the result can be
inaccurate, for example (HZ is 250):

Test script:
fio -filename=/dev/sda -bs=4k -rw=write -direct=1 -name=test -thinktime=4ms

Test result: util is about 90%, while the disk is really idle.

This behaviour was introduced by commit 5b18b5a73760 ("block: delete
part_round_stats and switch to less precise counting"). However, a key
point was missed, namely that the commit also improved performance
a lot:

Before the commit:
part_round_stats:
  if (part-&gt;stamp != now)
   stats |= 1;

  part_in_flight()
  -&gt; there can be lots of tasks here in 1 jiffy.
  part_round_stats_single()
   __part_stat_add()
  part-&gt;stamp = now;

After the commit:
update_io_ticks:
  stamp = part-&gt;bd_stamp;
  if (time_after(now, stamp))
   if (try_cmpxchg())
    __part_stat_add()
    -&gt; only one task can reach here in 1 jiffy.

Hence, in order to account io_ticks precisely, we only need to know
whether there is IO inflight at most once per jiffy. Note that for
rq-based devices, iterating tags should not be used here because
'tags-&gt;lock' is grabbed in blk_mq_find_and_get_req(), hence
part_stat_lock_inc/dec() and part_in_flight() are used to track inflight IO.
The additional overhead is quite small:

 - per cpu add/dec for each IO for rq-based devices;
 - per cpu sum once per jiffy;

It was verified with null-blk that there is no performance degradation
under heavy IO pressure.
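
The difference between the two accounting schemes can be illustrated with
a small single-task model. This is only a sketch, not kernel code: the
function name simulate_util, the 100 us IO duration, and the simplified
start/end hooks are assumptions made for illustration; only HZ=250 and the
4 ms think time come from the test above.

```python
JIFFY_US = 4000  # one jiffy in microseconds when HZ is 250


def simulate_util(n_ios, io_us, think_us, precise):
    """Return accounted io_ticks divided by elapsed jiffies for one task."""
    stamp = 0      # models the bd_stamp field, in jiffies
    ticks = 0      # models the io_ticks counter, in jiffies
    inflight = 0   # models the summed per-cpu inflight counter
    t = 0          # current time in microseconds
    for _ in range(n_ios):
        # IO start: at most one update per jiffy.
        now = t // JIFFY_US
        if now > stamp:
            if not precise:
                ticks += 1              # sampling: always charge 1 jiffy
            elif inflight:
                ticks += now - stamp    # precise: charge only if IO was inflight
            stamp = now
        inflight += 1
        t += io_us
        # IO end: charge the jiffies that elapsed while the IO was running.
        now = t // JIFFY_US
        if now > stamp:
            ticks += now - stamp
            stamp = now
        inflight -= 1
        t += think_us
    return ticks / (t / JIFFY_US)


sampled_util = simulate_util(1000, 100, 4000, precise=False)
precise_util = simulate_util(1000, 100, 4000, precise=True)
print(f"sampled util {sampled_util:.1%}, precise util {precise_util:.1%}")
```

With these assumed timings the sampling model reports the disk as almost
fully utilized, while the precise model stays in the low single digits,
close to the real busy fraction of 100 us out of every 4100 us.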

Fixes: 5b18b5a73760 ("block: delete part_round_stats and switch to less precise counting")
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Link: https://lore.kernel.org/r/20240509123717.3223892-2-yukuai1@huaweicloud.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: fix and simplify blkdevparts= cmdline parsing</title>
<updated>2024-05-30T07:49:03Z</updated>
<author>
<name>INAGAKI Hiroshi</name>
<email>musashino.open@gmail.com</email>
</author>
<published>2024-04-21T07:39:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=df799a71c6cc322197673d93ce472c61168a6e50'/>
<id>urn:sha1:df799a71c6cc322197673d93ce472c61168a6e50</id>
<content type='text'>
[ Upstream commit bc2e07dfd2c49aaa4b52302cf7b55cf94e025f79 ]

Fix the cmdline parsing of the "blkdevparts=" parameter using strsep(),
which makes the code simpler.

Before commit 146afeb235cc ("block: use strscpy() to instead of
strncpy()"), we used a strncpy() to copy a block device name and partition
names. The commit simply replaced a strncpy() and NULL termination with
a strscpy(). It did not update calculations of length passed to strscpy().
While the length passed to strncpy() is just a length of valid characters
without NULL termination ('\0'), strscpy() takes it as a length of the
destination buffer, including a NULL termination.

Since the source buffer is not necessarily NULL terminated, the current
code copies "length - 1" characters and puts a NULL character in the
destination buffer. It replaces the last character with NULL and breaks
the parsing.

As an example, that buffer will be passed to parse_parts() and breaks
parsing sub-partitions due to the missing ')' at the end, like the
following.

example (Check Point V-80 &amp; OpenWrt):

- Linux Kernel 6.6

  [    0.000000] Kernel command line: console=ttyS0,115200 earlycon=uart8250,mmio32,0xf0512000 crashkernel=30M mvpp2x.queue_mode=1 blkdevparts=mmcblk1:48M@10M(kernel-1),1M(dtb-1),720M(rootfs-1),48M(kernel-2),1M(dtb-2),720M(rootfs-2),300M(default_sw),650M(logs),1M(preset_cfg),1M(adsl),-(storage) maxcpus=4
  ...
  [    0.884016] mmc1: new HS200 MMC card at address 0001
  [    0.889951] mmcblk1: mmc1:0001 004GA0 3.69 GiB
  [    0.895043] cmdline partition format is invalid.
  [    0.895704]  mmcblk1: p1
  [    0.903447] mmcblk1boot0: mmc1:0001 004GA0 2.00 MiB
  [    0.908667] mmcblk1boot1: mmc1:0001 004GA0 2.00 MiB
  [    0.913765] mmcblk1rpmb: mmc1:0001 004GA0 512 KiB, chardev (248:0)

  1. "48M@10M(kernel-1),..." is passed to strscpy() with length=17
     from parse_parts()
  2. strscpy() returns -E2BIG and the destination buffer has
     "48M@10M(kernel-1\0"
  3. "48M@10M(kernel-1\0" is passed to parse_subpart()
  4. parse_subpart() fails to find ')' when parsing a partition name,
     and returns error

- Linux Kernel 6.1

  [    0.000000] Kernel command line: console=ttyS0,115200 earlycon=uart8250,mmio32,0xf0512000 crashkernel=30M mvpp2x.queue_mode=1 blkdevparts=mmcblk1:48M@10M(kernel-1),1M(dtb-1),720M(rootfs-1),48M(kernel-2),1M(dtb-2),720M(rootfs-2),300M(default_sw),650M(logs),1M(preset_cfg),1M(adsl),-(storage) maxcpus=4
  ...
  [    0.953142] mmc1: new HS200 MMC card at address 0001
  [    0.959114] mmcblk1: mmc1:0001 004GA0 3.69 GiB
  [    0.964259]  mmcblk1: p1(kernel-1) p2(dtb-1) p3(rootfs-1) p4(kernel-2) p5(dtb-2) 6(rootfs-2) p7(default_sw) p8(logs) p9(preset_cfg) p10(adsl) p11(storage)
  [    0.979174] mmcblk1boot0: mmc1:0001 004GA0 2.00 MiB
  [    0.984674] mmcblk1boot1: mmc1:0001 004GA0 2.00 MiB
  [    0.989926] mmcblk1rpmb: mmc1:0001 004GA0 512 KiB, chardev (248:0)

By the way, strscpy() takes a length of destination buffer and it is
often confusing when copying characters with a specified length. Using
strsep() helps to separate the string by the specified character. Then,
we can use strscpy() naturally with the size of the destination buffer.

Separating the string on the fly also makes it possible to omit the
redundant string copy, reducing memory usage and improving code readability.
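
The truncation described above can be reproduced with a small model of
the strscpy() size semantics. This is a sketch under stated assumptions:
strscpy_model is a hypothetical helper, the shortened command line and the
32-byte destination size are chosen for illustration, and str.split()
merely stands in for strsep().

```python
E2BIG = 7  # errno value used by Linux for "argument list too long"


def strscpy_model(src, size):
    """Model strscpy(): size counts the terminator, so at most size - 1
    characters are copied. Assumes size is at least 1."""
    dest = src[:size - 1]
    truncated = len(src) >= size
    return dest, (-E2BIG if truncated else len(dest))


cmdline = "48M@10M(kernel-1),1M(dtb-1),720M(rootfs-1)"

# Broken variant: pass the number of valid characters (17) as the size.
length = cmdline.index(",")
broken, broken_ret = strscpy_model(cmdline, length)

# Fixed variant: separate the token first, then copy with the real
# destination buffer size (32 is a hypothetical value).
first = cmdline.split(",", 1)[0]
fixed, fixed_ret = strscpy_model(first, 32)

print(repr(broken), broken_ret)
print(repr(fixed), fixed_ret)
```

In the broken variant the closing ')' is lost and the truncation error is
returned, which matches steps 1 and 2 of the Linux Kernel 6.6 example.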

Fixes: 146afeb235cc ("block: use strscpy() to instead of strncpy()")
Suggested-by: Naohiro Aota &lt;naota@elisp.net&gt;
Signed-off-by: INAGAKI Hiroshi &lt;musashino.open@gmail.com&gt;
Reviewed-by: Daniel Golle &lt;daniel@makrotopia.org&gt;
Link: https://lore.kernel.org/r/20240421074005.565-1-musashino.open@gmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: refine the EOF check in blkdev_iomap_begin</title>
<updated>2024-05-30T07:49:03Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2024-05-03T08:10:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=72c54e063c32aeb38d43a2bd897821e6e5a1757d'/>
<id>urn:sha1:72c54e063c32aeb38d43a2bd897821e6e5a1757d</id>
<content type='text'>
[ Upstream commit 0c12028aec837f5a002009bbf68d179d506510e8 ]

blkdev_iomap_begin rounds down the offset to the logical block size
before stashing it in iomap-&gt;offset and checking that it still is
inside the inode size.

Apply the i_size check to the raw pos value so that we don't try a
zero size write if iter-&gt;pos is unaligned.
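
A minimal sketch of the two checks, assuming hypothetical numbers (a
1536-byte device with a 1024-byte logical block size) and hypothetical
helper names; it only models the comparison, not the rest of
blkdev_iomap_begin:

```python
def round_down(value, block_size):
    """Round value down to a multiple of block_size."""
    return value - value % block_size


def old_check_rejects(pos, isize, block_size):
    # Old behaviour: compare the rounded-down offset with the device size.
    return round_down(pos, block_size) >= isize


def new_check_rejects(pos, isize):
    # New behaviour: compare the raw position with the device size.
    return pos >= isize


isize, block_size, pos = 1536, 1024, 1536
print(old_check_rejects(pos, isize, block_size))  # False: request is let through
print(new_check_rejects(pos, isize))              # True: request is rejected
```

Here pos already sits at the end of the device, so no bytes remain to be
written, yet the rounded-down offset (1024) still looks valid to the old
check.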

Fixes: 487c607df790 ("block: use iomap for writes to block devices")
Reported-by: syzbot+0a3683a0a6fecf909244@syzkaller.appspotmail.com
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Tested-by: syzbot+0a3683a0a6fecf909244@syzkaller.appspotmail.com
Link: https://lore.kernel.org/r/20240503081042.2078062-1-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: add a partscan sysfs attribute for disks</title>
<updated>2024-05-25T14:28:40Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2024-05-02T13:00:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=646eb6bb115b826ec5f205b9c35b247c97c6915d'/>
<id>urn:sha1:646eb6bb115b826ec5f205b9c35b247c97c6915d</id>
<content type='text'>
commit a4217c6740dc64a3eb6815868a9260825e8c68c6 upstream.

Userspace had been unknowingly relying on a non-stable interface of
kernel internals to determine if partition scanning is enabled for a
given disk. Provide a stable interface for this purpose instead.

Cc: stable@vger.kernel.org # 6.3+
Depends-on: 140ce28dd3be ("block: add a disk_has_partscan helper")
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/linux-block/ZhQJf8mzq_wipkBH@gardel-login/
Link: https://lore.kernel.org/r/20240502130033.1958492-3-hch@lst.de
[axboe: add links and commit message from Keith]
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>block: add a disk_has_partscan helper</title>
<updated>2024-05-25T14:28:40Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2024-05-02T13:00:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0f40eb57741a6f48ecd91585fce375562239459a'/>
<id>urn:sha1:0f40eb57741a6f48ecd91585fce375562239459a</id>
<content type='text'>
commit 140ce28dd3bee8e53acc27f123ae474d69ef66f0 upstream.

Add a helper to check if partition scanning is enabled instead of
open coding the check in a few places.  This now always checks for
the hidden flag even if all but one of the callers are never reachable
for hidden gendisks.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/20240502130033.1958492-2-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>blk-iocost: do not WARN if iocg was already offlined</title>
<updated>2024-05-17T10:14:50Z</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2024-04-19T09:32:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=14b3275f93d4a0d8ddc02195bc4e9869b7a3700e'/>
<id>urn:sha1:14b3275f93d4a0d8ddc02195bc4e9869b7a3700e</id>
<content type='text'>
[ Upstream commit 01bc4fda9ea0a6b52f12326486f07a4910666cf6 ]

In iocg_pay_debt(), a warning is triggered if 'active_list' is empty, which
is intended to confirm that the iocg is active when it has debt. However, the
warning can be triggered during a blkcg or disk removal if
iocg_waitq_timer_fn() runs at that time:

  WARNING: CPU: 0 PID: 2344971 at block/blk-iocost.c:1402 iocg_pay_debt+0x14c/0x190
  Call trace:
  iocg_pay_debt+0x14c/0x190
  iocg_kick_waitq+0x438/0x4c0
  iocg_waitq_timer_fn+0xd8/0x130
  __run_hrtimer+0x144/0x45c
  __hrtimer_run_queues+0x16c/0x244
  hrtimer_interrupt+0x2cc/0x7b0

The warning in this situation is meaningless. Since this iocg is being
removed, the state of the 'active_list' is irrelevant, and 'waitq_timer'
is canceled after removing 'active_list' in ioc_pd_free(), which ensures
that the iocg is freed after iocg_waitq_timer_fn() returns.

Therefore, add a check for whether the iocg was already offlined to avoid
the warning when removing a blkcg or disk.

Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Reviewed-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Link: https://lore.kernel.org/r/20240419093257.3004211-1-linan666@huaweicloud.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>blk-iocost: avoid out of bounds shift</title>
<updated>2024-05-17T10:14:44Z</updated>
<author>
<name>Rik van Riel</name>
<email>riel@surriel.com</email>
</author>
<published>2024-04-04T16:32:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=488dc6808cb8369685f18cee81e88e7052ac153b'/>
<id>urn:sha1:488dc6808cb8369685f18cee81e88e7052ac153b</id>
<content type='text'>
[ Upstream commit beaa51b36012fad5a4d3c18b88a617aea7a9b96d ]

UBSAN catches undefined behavior in blk-iocost, where sometimes
iocg-&gt;delay is shifted right by a number that is too large,
resulting in undefined behavior on some architectures.

[  186.556576] ------------[ cut here ]------------
UBSAN: shift-out-of-bounds in block/blk-iocost.c:1366:23
shift exponent 64 is too large for 64-bit type 'u64' (aka 'unsigned long long')
CPU: 16 PID: 0 Comm: swapper/16 Tainted: G S          E    N 6.9.0-0_fbk700_debug_rc2_kbuilder_0_gc85af715cac0 #1
Hardware name: Quanta Twin Lakes MP/Twin Lakes Passive MP, BIOS F09_3A23 12/08/2020
Call Trace:
 &lt;IRQ&gt;
 dump_stack_lvl+0x8f/0xe0
 __ubsan_handle_shift_out_of_bounds+0x22c/0x280
 iocg_kick_delay+0x30b/0x310
 ioc_timer_fn+0x2fb/0x1f80
 __run_timer_base+0x1b6/0x250
...

Avoid that undefined behavior by simply taking the
"delay = 0" branch if the shift is too large.

I am not sure what the symptoms of an undefined value
delay will be, but I suspect it could be more than a
little annoying to debug.
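
The guard can be sketched as follows. This is an illustration only:
decayed_delay and BITS are hypothetical names, and because Python integers
have no undefined shifts, the explicit check stands in for what C needs on
a 64-bit type.

```python
BITS = 64  # width of the u64 type that holds the delay


def decayed_delay(delay, shift):
    """Right-shift delay, taking the "delay = 0" branch for large shifts."""
    if shift >= BITS:
        # Shifting a 64-bit value by 64 or more is undefined in C.
        return 0
    return delay >> shift


print(decayed_delay(1024, 3))
print(decayed_delay(1024, 64))
```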

Signed-off-by: Rik van Riel &lt;riel@surriel.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Josef Bacik &lt;josef@toxicpanda.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Link: https://lore.kernel.org/r/20240404123253.0f58010f@imladris.surriel.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: fix overflow in blk_ioctl_discard()</title>
<updated>2024-05-17T10:14:40Z</updated>
<author>
<name>Li Nan</name>
<email>linan122@huawei.com</email>
</author>
<published>2024-03-29T01:23:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=507d526a98c355e6f3fb2c47aacad44a69784bee'/>
<id>urn:sha1:507d526a98c355e6f3fb2c47aacad44a69784bee</id>
<content type='text'>
[ Upstream commit 22d24a544b0d49bbcbd61c8c0eaf77d3c9297155 ]

There is no check for overflow of 'start + len' in blk_ioctl_discard().
A hung task occurs if a discard ioctl is submitted with the following
parameters:
  start = 0x80000000000ff000, len = 0x8000000000fff000;
Add the overflow validation now.
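
The wraparound for the parameters above can be checked with a short model
of 64-bit unsigned addition. The helper name add_overflows is an
assumption for illustration; it is not the kernel's implementation.

```python
U64_LIMIT = 2 ** 64  # one past the largest 64-bit unsigned value


def add_overflows(start, length):
    """Return (overflowed, wrapped_sum) for a 64-bit unsigned addition."""
    total = start + length
    return total >= U64_LIMIT, total % U64_LIMIT


overflowed, wrapped_end = add_overflows(0x80000000000ff000, 0x8000000000fff000)
print(overflowed, hex(wrapped_end))
```

The sum wraps around to a small end offset, which is why a range check
that only looks at the wrapped result cannot catch the bad request.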

Signed-off-by: Li Nan &lt;linan122@huawei.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Link: https://lore.kernel.org/r/20240329012319.2034550-1-linan666@huaweicloud.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: fix module reference leakage from bdev_open_by_dev error path</title>
<updated>2024-05-02T14:35:13Z</updated>
<author>
<name>Yu Kuai</name>
<email>yukuai3@huawei.com</email>
</author>
<published>2024-04-06T09:09:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0e9327c67410b129bf85e5c3a5aaea518328636f'/>
<id>urn:sha1:0e9327c67410b129bf85e5c3a5aaea518328636f</id>
<content type='text'>
[ Upstream commit 9617cd6f24b294552a817f80f5225431ef67b540 ]

By the time bdev_may_open() is called, the module reference has already
been grabbed, hence the module reference should be released if
bdev_may_open() fails.

This problem was found by code review.

Fixes: ed5cc702d311 ("block: Add config option to not allow writing to mounted devices")
Signed-off-by: Yu Kuai &lt;yukuai3@huawei.com&gt;
Link: https://lore.kernel.org/r/20240406090930.2252838-22-yukuai1@huaweicloud.com
Signed-off-by: Christian Brauner &lt;brauner@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: propagate partition scanning errors to the BLKRRPART ioctl</title>
<updated>2024-04-27T15:12:57Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2024-04-17T14:47:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=349240339c664fc8b9c51d0edb4bf3defa684a43'/>
<id>urn:sha1:349240339c664fc8b9c51d0edb4bf3defa684a43</id>
<content type='text'>
[ Upstream commit 752863bddacab6b5c5164b1df8c8b2e3a175ee28 ]

Commit 4601b4b130de ("block: reopen the device in blkdev_reread_part")
lost the propagation of I/O errors from the low-level read of the
partition table to the user space caller of the BLKRRPART.

Apparently some user space relies on it, so restore the propagation.  This
isn't exactly pretty as other block device open calls explicitly do not
care about these errors, so add a new BLK_OPEN_STRICT_SCAN to opt into
the error propagation.

Fixes: 4601b4b130de ("block: reopen the device in blkdev_reread_part")
Reported-by: Saranya Muruganandam &lt;saranyamohan@google.com&gt;
Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Chaitanya Kulkarni &lt;kch@nvidia.com&gt;
Reviewed-by: Shin'ichiro Kawasaki &lt;shinichiro.kawasaki@wdc.com&gt;
Tested-by: Shin'ichiro Kawasaki &lt;shinichiro.kawasaki@wdc.com&gt;
Link: https://lore.kernel.org/r/20240417144743.2277601-1-hch@lst.de
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
