<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/genhd.h, branch v5.0.16</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.0.16</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.0.16'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-12-10T15:30:38Z</updated>
<entry>
<title>block: return just one value from part_in_flight</title>
<updated>2018-12-10T15:30:38Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2018-12-06T16:41:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e016b78201a2d9ff40f3f0da072292689af24c7f'/>
<id>urn:sha1:e016b78201a2d9ff40f3f0da072292689af24c7f</id>
<content type='text'>
The previous patches deleted all the code that needed the second value
returned from part_in_flight - now the kernel only uses the first value.

Consequently, part_in_flight (and blk_mq_in_flight) may be changed so that
it only returns one value.

This patch just refactors the code, there's no functional change.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: switch to per-cpu in-flight counters</title>
<updated>2018-12-10T15:30:37Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2018-12-06T16:41:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1226b8dd0e91331cfab500f305b2c264445a0392'/>
<id>urn:sha1:1226b8dd0e91331cfab500f305b2c264445a0392</id>
<content type='text'>
Now when part_round_stats is gone, we can switch to per-cpu in-flight
counters.

We use the local-atomic type local_t, so that if part_inc_in_flight or
part_dec_in_flight is reentrantly called from an interrupt, the value will
be correct.

The other counters could be corrupted due to reentrant interrupt, but the
corruption only results in slight counter skew - the in_flight counter
must be exact, so it needs local_t.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: delete part_round_stats and switch to less precise counting</title>
<updated>2018-12-10T15:30:37Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2018-12-06T16:41:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5b18b5a737600fd20ba2045f320d5926ebbf341a'/>
<id>urn:sha1:5b18b5a737600fd20ba2045f320d5926ebbf341a</id>
<content type='text'>
We want to convert to per-cpu in_flight counters.

The function part_round_stats needs the in_flight counter every jiffy, it
would be too costly to sum all the percpu variables every jiffy, so it
must be deleted. part_round_stats is used to calculate two counters -
time_in_queue and io_ticks.

time_in_queue can be calculated without part_round_stats, by adding the
duration of the I/O when the I/O ends (the value is almost as exact as the
previously calculated value, except that time for in-progress I/Os is not
counted).

io_ticks can be approximated by increasing the value when I/O is started
or ended and the jiffies value has changed. If the I/Os take less than a
jiffy, the value is as exact as the previously calculated value. If the
I/Os take more than a jiffy, io_ticks can drift behind the previously
calculated value.

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: stop passing 'cpu' to all percpu stats methods</title>
<updated>2018-12-10T15:30:37Z</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2018-12-06T16:41:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=112f158f66cbe25fd561a5dfe9c3826e06abf757'/>
<id>urn:sha1:112f158f66cbe25fd561a5dfe9c3826e06abf757</id>
<content type='text'>
All of part_stat_* and related methods are used with preempt disabled,
so there is no need to pass cpu around to allow of them.  Just call
smp_processor_id() as needed.

Suggested-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: use rcu_work instead of call_rcu to avoid sleep in softirq</title>
<updated>2018-11-28T16:08:27Z</updated>
<author>
<name>Yufen Yu</name>
<email>yuyufen@huawei.com</email>
</author>
<published>2018-11-28T08:42:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=94a2c3a32b62e868dc1e3d854326745a7f1b8c7a'/>
<id>urn:sha1:94a2c3a32b62e868dc1e3d854326745a7f1b8c7a</id>
<content type='text'>
We recently got a stack by syzkaller like this:

BUG: sleeping function called from invalid context at mm/slab.h:361
in_atomic(): 1, irqs_disabled(): 0, pid: 6644, name: blkid
INFO: lockdep is turned off.
CPU: 1 PID: 6644 Comm: blkid Not tainted 4.4.163-514.55.6.9.x86_64+ #76
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
 0000000000000000 5ba6a6b879e50c00 ffff8801f6b07b10 ffffffff81cb2194
 0000000041b58ab3 ffffffff833c7745 ffffffff81cb2080 5ba6a6b879e50c00
 0000000000000000 0000000000000001 0000000000000004 0000000000000000
Call Trace:
 &lt;IRQ&gt;  [&lt;ffffffff81cb2194&gt;] __dump_stack lib/dump_stack.c:15 [inline]
 &lt;IRQ&gt;  [&lt;ffffffff81cb2194&gt;] dump_stack+0x114/0x1a0 lib/dump_stack.c:51
 [&lt;ffffffff8129a981&gt;] ___might_sleep+0x291/0x490 kernel/sched/core.c:7675
 [&lt;ffffffff8129ac33&gt;] __might_sleep+0xb3/0x270 kernel/sched/core.c:7637
 [&lt;ffffffff81794c13&gt;] slab_pre_alloc_hook mm/slab.h:361 [inline]
 [&lt;ffffffff81794c13&gt;] slab_alloc_node mm/slub.c:2610 [inline]
 [&lt;ffffffff81794c13&gt;] slab_alloc mm/slub.c:2692 [inline]
 [&lt;ffffffff81794c13&gt;] kmem_cache_alloc_trace+0x2c3/0x5c0 mm/slub.c:2709
 [&lt;ffffffff81cbe9a7&gt;] kmalloc include/linux/slab.h:479 [inline]
 [&lt;ffffffff81cbe9a7&gt;] kzalloc include/linux/slab.h:623 [inline]
 [&lt;ffffffff81cbe9a7&gt;] kobject_uevent_env+0x2c7/0x1150 lib/kobject_uevent.c:227
 [&lt;ffffffff81cbf84f&gt;] kobject_uevent+0x1f/0x30 lib/kobject_uevent.c:374
 [&lt;ffffffff81cbb5b9&gt;] kobject_cleanup lib/kobject.c:633 [inline]
 [&lt;ffffffff81cbb5b9&gt;] kobject_release+0x229/0x440 lib/kobject.c:675
 [&lt;ffffffff81cbb0a2&gt;] kref_sub include/linux/kref.h:73 [inline]
 [&lt;ffffffff81cbb0a2&gt;] kref_put include/linux/kref.h:98 [inline]
 [&lt;ffffffff81cbb0a2&gt;] kobject_put+0x72/0xd0 lib/kobject.c:692
 [&lt;ffffffff8216f095&gt;] put_device+0x25/0x30 drivers/base/core.c:1237
 [&lt;ffffffff81c4cc34&gt;] delete_partition_rcu_cb+0x1d4/0x2f0 block/partition-generic.c:232
 [&lt;ffffffff813c08bc&gt;] __rcu_reclaim kernel/rcu/rcu.h:118 [inline]
 [&lt;ffffffff813c08bc&gt;] rcu_do_batch kernel/rcu/tree.c:2705 [inline]
 [&lt;ffffffff813c08bc&gt;] invoke_rcu_callbacks kernel/rcu/tree.c:2973 [inline]
 [&lt;ffffffff813c08bc&gt;] __rcu_process_callbacks kernel/rcu/tree.c:2940 [inline]
 [&lt;ffffffff813c08bc&gt;] rcu_process_callbacks+0x59c/0x1c70 kernel/rcu/tree.c:2957
 [&lt;ffffffff8120f509&gt;] __do_softirq+0x299/0xe20 kernel/softirq.c:273
 [&lt;ffffffff81210496&gt;] invoke_softirq kernel/softirq.c:350 [inline]
 [&lt;ffffffff81210496&gt;] irq_exit+0x216/0x2c0 kernel/softirq.c:391
 [&lt;ffffffff82c2cd7b&gt;] exiting_irq arch/x86/include/asm/apic.h:652 [inline]
 [&lt;ffffffff82c2cd7b&gt;] smp_apic_timer_interrupt+0x8b/0xc0 arch/x86/kernel/apic/apic.c:926
 [&lt;ffffffff82c2bc25&gt;] apic_timer_interrupt+0xa5/0xb0 arch/x86/entry/entry_64.S:746
 &lt;EOI&gt;  [&lt;ffffffff814cbf40&gt;] ? audit_kill_trees+0x180/0x180
 [&lt;ffffffff8187d2f7&gt;] fd_install+0x57/0x80 fs/file.c:626
 [&lt;ffffffff8180989e&gt;] do_sys_open+0x45e/0x550 fs/open.c:1043
 [&lt;ffffffff818099c2&gt;] SYSC_open fs/open.c:1055 [inline]
 [&lt;ffffffff818099c2&gt;] SyS_open+0x32/0x40 fs/open.c:1050
 [&lt;ffffffff82c299e1&gt;] entry_SYSCALL_64_fastpath+0x1e/0x9a

In softirq context, we call rcu callback function delete_partition_rcu_cb(),
which may allocate memory by kzalloc with GFP_KERNEL flag. If the
allocation cannot be satisfied, it may sleep. However, That is not allowed
in softirq contex.

Although we found this problem on linux 4.4, the latest kernel version
seems to have this problem as well. And it is very similar to the
previous one:
	https://lkml.org/lkml/2018/7/9/391

Fix it by using RCU workqueue, which allows sleep.

Reviewed-by: Paul E. McKenney &lt;paulmck@linux.ibm.com&gt;
Signed-off-by: Yufen Yu &lt;yuyufen@huawei.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>Merge tag 'v4.19-rc6' into for-4.20/block</title>
<updated>2018-10-01T14:58:57Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2018-10-01T14:58:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c0aac682fa6590cb660cb083dbc09f55e799d2d2'/>
<id>urn:sha1:c0aac682fa6590cb660cb083dbc09f55e799d2d2</id>
<content type='text'>
Merge -rc6 in, for two reasons:

1) Resolve a trivial conflict in the blk-mq-tag.c documentation
2) A few important regression fixes went into upstream directly, so
   they aren't in the 4.20 branch.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;

* tag 'v4.19-rc6': (780 commits)
  Linux 4.19-rc6
  MAINTAINERS: fix reference to moved drivers/{misc =&gt; auxdisplay}/panel.c
  cpufreq: qcom-kryo: Fix section annotations
  perf/core: Add sanity check to deal with pinned event failure
  xen/blkfront: correct purging of persistent grants
  Revert "xen/blkfront: When purging persistent grants, keep them in the buffer"
  selftests/powerpc: Fix Makefiles for headers_install change
  blk-mq: I/O and timer unplugs are inverted in blktrace
  dax: Fix deadlock in dax_lock_mapping_entry()
  x86/boot: Fix kexec booting failure in the SEV bit detection code
  bcache: add separate workqueue for journal_write to avoid deadlock
  drm/amd/display: Fix Edid emulation for linux
  drm/amd/display: Fix Vega10 lightup on S3 resume
  drm/amdgpu: Fix vce work queue was not cancelled when suspend
  Revert "drm/panel: Add device_link from panel device to DRM device"
  xen/blkfront: When purging persistent grants, keep them in the buffer
  clocksource/drivers/timer-atmel-pit: Properly handle error cases
  block: fix deadline elevator drain for zoned block devices
  ACPI / hotplug / PCI: Don't scan for non-hotplug bridges if slot is not bridge
  drm/syncobj: Don't leak fences when WAIT_FOR_SUBMIT is set
  ...

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: genhd: add 'groups' argument to device_add_disk</title>
<updated>2018-09-28T14:30:28Z</updated>
<author>
<name>Hannes Reinecke</name>
<email>hare@suse.de</email>
</author>
<published>2018-09-28T06:17:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fef912bf860e8e7e48a2bfb978a356bba743a8b7'/>
<id>urn:sha1:fef912bf860e8e7e48a2bfb978a356bba743a8b7</id>
<content type='text'>
Update device_add_disk() to take an 'groups' argument so that
individual drivers can register a device with additional sysfs
attributes.
This avoids race condition the driver would otherwise have if these
groups were to be created with sysfs_add_groups().

Signed-off-by: Martin Wilck &lt;martin.wilck@suse.com&gt;
Signed-off-by: Hannes Reinecke &lt;hare@suse.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: use nanosecond resolution for iostat</title>
<updated>2018-09-22T02:26:59Z</updated>
<author>
<name>Omar Sandoval</name>
<email>osandov@fb.com</email>
</author>
<published>2018-09-21T23:44:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b57e99b4b8b0ebdf9707424e7ddc0c392bdc5fe6'/>
<id>urn:sha1:b57e99b4b8b0ebdf9707424e7ddc0c392bdc5fe6</id>
<content type='text'>
Klaus Kusche reported that the I/O busy time in /proc/diskstats was not
updating properly on 4.18. This is because we started using ktime to
track elapsed time, and we convert nanoseconds to jiffies when we update
the partition counter. However, this gets rounded down, so any I/Os that
take less than a jiffy are not accounted for. Previously in this case,
the value of jiffies would sometimes increment while we were doing I/O,
so at least some I/Os were accounted for.

Let's convert the stats to use nanoseconds internally. We still report
milliseconds as before, now more accurately than ever. The value is
still truncated to 32 bits for backwards compatibility.

Fixes: 522a777566f5 ("block: consolidate struct request timestamp fields")
Cc: stable@vger.kernel.org
Reported-by: Klaus Kusche &lt;klaus.kusche@computerix.info&gt;
Signed-off-by: Omar Sandoval &lt;osandov@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Track DISCARD statistics and output them in stat and diskstat</title>
<updated>2018-07-18T14:44:22Z</updated>
<author>
<name>Michael Callahan</name>
<email>michaelcallahan@fb.com</email>
</author>
<published>2018-07-18T11:47:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bdca3c87fb7ad1cc61d231d37eb0d8f90d001e0c'/>
<id>urn:sha1:bdca3c87fb7ad1cc61d231d37eb0d8f90d001e0c</id>
<content type='text'>
Add tracking of REQ_OP_DISCARD ios to the partition statistics and
append them to the various stat files in /sys as well as
/proc/diskstats.  These are tracked with the same four stats as reads
and writes:

Number of discard ios completed.
Number of discard ios merged
Number of discard sectors completed
Milliseconds spent on discard requests

This is done via adding a new STAT_DISCARD define to genhd.h and then
using it to index that stat field for discard requests.

tj: Refreshed on top of v4.17 and other previous updates.

Signed-off-by: Michael Callahan &lt;michaelcallahan@fb.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Andy Newell &lt;newella@fb.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Define and use STAT_READ and STAT_WRITE</title>
<updated>2018-07-18T14:44:18Z</updated>
<author>
<name>Michael Callahan</name>
<email>michaelcallahan@fb.com</email>
</author>
<published>2018-07-18T11:47:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dbae2c551377b6533a00c11fc7ede370100ab404'/>
<id>urn:sha1:dbae2c551377b6533a00c11fc7ede370100ab404</id>
<content type='text'>
Add defines for STAT_READ and STAT_WRITE for indexing the partition
stat entries. This clarifies some fs/ code which has hardcoded 1 for
STAT_WRITE and will make it easier to extend the stats with additional
fields.

tj: Refreshed on top of v4.17.

Signed-off-by: Michael Callahan &lt;michaelcallahan@fb.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: "Theodore Ts'o" &lt;tytso@mit.edu&gt;
Cc: Jaegeuk Kim &lt;jaegeuk@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
