<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/block, branch v4.19.17</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.19.17</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.19.17'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2019-01-22T20:40:35Z</updated>
<entry>
<title>block: use rcu_work instead of call_rcu to avoid sleep in softirq</title>
<updated>2019-01-22T20:40:35Z</updated>
<author>
<name>Yufen Yu</name>
<email>yuyufen@huawei.com</email>
</author>
<published>2018-11-28T08:42:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4cc66cc4f81fb8b1d6e83548fa79005dcc93ee2a'/>
<id>urn:sha1:4cc66cc4f81fb8b1d6e83548fa79005dcc93ee2a</id>
<content type='text'>
commit 94a2c3a32b62e868dc1e3d854326745a7f1b8c7a upstream.

We recently got a stack by syzkaller like this:

BUG: sleeping function called from invalid context at mm/slab.h:361
in_atomic(): 1, irqs_disabled(): 0, pid: 6644, name: blkid
INFO: lockdep is turned off.
CPU: 1 PID: 6644 Comm: blkid Not tainted 4.4.163-514.55.6.9.x86_64+ #76
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1ubuntu1 04/01/2014
 0000000000000000 5ba6a6b879e50c00 ffff8801f6b07b10 ffffffff81cb2194
 0000000041b58ab3 ffffffff833c7745 ffffffff81cb2080 5ba6a6b879e50c00
 0000000000000000 0000000000000001 0000000000000004 0000000000000000
Call Trace:
 &lt;IRQ&gt;  [&lt;ffffffff81cb2194&gt;] __dump_stack lib/dump_stack.c:15 [inline]
 &lt;IRQ&gt;  [&lt;ffffffff81cb2194&gt;] dump_stack+0x114/0x1a0 lib/dump_stack.c:51
 [&lt;ffffffff8129a981&gt;] ___might_sleep+0x291/0x490 kernel/sched/core.c:7675
 [&lt;ffffffff8129ac33&gt;] __might_sleep+0xb3/0x270 kernel/sched/core.c:7637
 [&lt;ffffffff81794c13&gt;] slab_pre_alloc_hook mm/slab.h:361 [inline]
 [&lt;ffffffff81794c13&gt;] slab_alloc_node mm/slub.c:2610 [inline]
 [&lt;ffffffff81794c13&gt;] slab_alloc mm/slub.c:2692 [inline]
 [&lt;ffffffff81794c13&gt;] kmem_cache_alloc_trace+0x2c3/0x5c0 mm/slub.c:2709
 [&lt;ffffffff81cbe9a7&gt;] kmalloc include/linux/slab.h:479 [inline]
 [&lt;ffffffff81cbe9a7&gt;] kzalloc include/linux/slab.h:623 [inline]
 [&lt;ffffffff81cbe9a7&gt;] kobject_uevent_env+0x2c7/0x1150 lib/kobject_uevent.c:227
 [&lt;ffffffff81cbf84f&gt;] kobject_uevent+0x1f/0x30 lib/kobject_uevent.c:374
 [&lt;ffffffff81cbb5b9&gt;] kobject_cleanup lib/kobject.c:633 [inline]
 [&lt;ffffffff81cbb5b9&gt;] kobject_release+0x229/0x440 lib/kobject.c:675
 [&lt;ffffffff81cbb0a2&gt;] kref_sub include/linux/kref.h:73 [inline]
 [&lt;ffffffff81cbb0a2&gt;] kref_put include/linux/kref.h:98 [inline]
 [&lt;ffffffff81cbb0a2&gt;] kobject_put+0x72/0xd0 lib/kobject.c:692
 [&lt;ffffffff8216f095&gt;] put_device+0x25/0x30 drivers/base/core.c:1237
 [&lt;ffffffff81c4cc34&gt;] delete_partition_rcu_cb+0x1d4/0x2f0 block/partition-generic.c:232
 [&lt;ffffffff813c08bc&gt;] __rcu_reclaim kernel/rcu/rcu.h:118 [inline]
 [&lt;ffffffff813c08bc&gt;] rcu_do_batch kernel/rcu/tree.c:2705 [inline]
 [&lt;ffffffff813c08bc&gt;] invoke_rcu_callbacks kernel/rcu/tree.c:2973 [inline]
 [&lt;ffffffff813c08bc&gt;] __rcu_process_callbacks kernel/rcu/tree.c:2940 [inline]
 [&lt;ffffffff813c08bc&gt;] rcu_process_callbacks+0x59c/0x1c70 kernel/rcu/tree.c:2957
 [&lt;ffffffff8120f509&gt;] __do_softirq+0x299/0xe20 kernel/softirq.c:273
 [&lt;ffffffff81210496&gt;] invoke_softirq kernel/softirq.c:350 [inline]
 [&lt;ffffffff81210496&gt;] irq_exit+0x216/0x2c0 kernel/softirq.c:391
 [&lt;ffffffff82c2cd7b&gt;] exiting_irq arch/x86/include/asm/apic.h:652 [inline]
 [&lt;ffffffff82c2cd7b&gt;] smp_apic_timer_interrupt+0x8b/0xc0 arch/x86/kernel/apic/apic.c:926
 [&lt;ffffffff82c2bc25&gt;] apic_timer_interrupt+0xa5/0xb0 arch/x86/entry/entry_64.S:746
 &lt;EOI&gt;  [&lt;ffffffff814cbf40&gt;] ? audit_kill_trees+0x180/0x180
 [&lt;ffffffff8187d2f7&gt;] fd_install+0x57/0x80 fs/file.c:626
 [&lt;ffffffff8180989e&gt;] do_sys_open+0x45e/0x550 fs/open.c:1043
 [&lt;ffffffff818099c2&gt;] SYSC_open fs/open.c:1055 [inline]
 [&lt;ffffffff818099c2&gt;] SyS_open+0x32/0x40 fs/open.c:1050
 [&lt;ffffffff82c299e1&gt;] entry_SYSCALL_64_fastpath+0x1e/0x9a

In softirq context, we call rcu callback function delete_partition_rcu_cb(),
which may allocate memory by kzalloc with GFP_KERNEL flag. If the
allocation cannot be satisfied, it may sleep. However, That is not allowed
in softirq contex.

Although we found this problem on linux 4.4, the latest kernel version
seems to have this problem as well. And it is very similar to the
previous one:
	https://lkml.org/lkml/2018/7/9/391

Fix it by using RCU workqueue, which allows sleep.

Reviewed-by: Paul E. McKenney &lt;paulmck@linux.ibm.com&gt;
Signed-off-by: Yufen Yu &lt;yuyufen@huawei.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>block: mq-deadline: Fix write completion handling</title>
<updated>2019-01-13T08:51:07Z</updated>
<author>
<name>Damien Le Moal</name>
<email>damien.lemoal@wdc.com</email>
</author>
<published>2018-12-17T06:14:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6353c0a03791c5d076d5edacfabc103cc51d879c'/>
<id>urn:sha1:6353c0a03791c5d076d5edacfabc103cc51d879c</id>
<content type='text'>
commit 7211aef86f79583e59b88a0aba0bc830566f7e8e upstream.

For a zoned block device using mq-deadline, if a write request for a
zone is received while another write was already dispatched for the same
zone, dd_dispatch_request() will return NULL and the newly inserted
write request is kept in the scheduler queue waiting for the ongoing
zone write to complete. With this behavior, when no other request has
been dispatched, rq_list in blk_mq_sched_dispatch_requests() is empty
and blk_mq_sched_mark_restart_hctx() not called. This in turn leads to
__blk_mq_free_request() call of blk_mq_sched_restart() to not run the
queue when the already dispatched write request completes. The newly
dispatched request stays stuck in the scheduler queue until eventually
another request is submitted.

This problem does not affect SCSI disk as the SCSI stack handles queue
restart on request completion. However, this problem is can be triggered
the nullblk driver with zoned mode enabled.

Fix this by always requesting a queue restart in dd_dispatch_request()
if no request was dispatched while WRITE requests are queued.

Fixes: 5700f69178e9 ("mq-deadline: Introduce zone locking support")
Cc: &lt;stable@vger.kernel.org&gt;
Signed-off-by: Damien Le Moal &lt;damien.lemoal@wdc.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

Add missing export of blk_mq_sched_restart()

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;

</content>
</entry>
<entry>
<title>block: deactivate blk_stat timer in wbt_disable_default()</title>
<updated>2019-01-13T08:51:06Z</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2018-12-12T11:44:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=69e9b2858b2051b57d738e4143d1938e7de24cb5'/>
<id>urn:sha1:69e9b2858b2051b57d738e4143d1938e7de24cb5</id>
<content type='text'>
commit 544fbd16a461a318cd80537d1331c0df5c6cf930 upstream.

rwb_enabled() can't be changed when there is any inflight IO.

wbt_disable_default() may set rwb-&gt;wb_normal as zero, however the
blk_stat timer may still be pending, and the timer function will update
wrb-&gt;wb_normal again.

This patch introduces blk_stat_deactivate() and applies it in
wbt_disable_default(), then the following IO hang triggered when running
parted &amp; switching io scheduler can be fixed:

[  369.937806] INFO: task parted:3645 blocked for more than 120 seconds.
[  369.938941]       Not tainted 4.20.0-rc6-00284-g906c801e5248 #498
[  369.939797] "echo 0 &gt; /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  369.940768] parted          D    0  3645   3239 0x00000000
[  369.941500] Call Trace:
[  369.941874]  ? __schedule+0x6d9/0x74c
[  369.942392]  ? wbt_done+0x5e/0x5e
[  369.942864]  ? wbt_cleanup_cb+0x16/0x16
[  369.943404]  ? wbt_done+0x5e/0x5e
[  369.943874]  schedule+0x67/0x78
[  369.944298]  io_schedule+0x12/0x33
[  369.944771]  rq_qos_wait+0xb5/0x119
[  369.945193]  ? karma_partition+0x1c2/0x1c2
[  369.945691]  ? wbt_cleanup_cb+0x16/0x16
[  369.946151]  wbt_wait+0x85/0xb6
[  369.946540]  __rq_qos_throttle+0x23/0x2f
[  369.947014]  blk_mq_make_request+0xe6/0x40a
[  369.947518]  generic_make_request+0x192/0x2fe
[  369.948042]  ? submit_bio+0x103/0x11f
[  369.948486]  ? __radix_tree_lookup+0x35/0xb5
[  369.949011]  submit_bio+0x103/0x11f
[  369.949436]  ? blkg_lookup_slowpath+0x25/0x44
[  369.949962]  submit_bio_wait+0x53/0x7f
[  369.950469]  blkdev_issue_flush+0x8a/0xae
[  369.951032]  blkdev_fsync+0x2f/0x3a
[  369.951502]  do_fsync+0x2e/0x47
[  369.951887]  __x64_sys_fsync+0x10/0x13
[  369.952374]  do_syscall_64+0x89/0x149
[  369.952819]  entry_SYSCALL_64_after_hwframe+0x49/0xbe
[  369.953492] RIP: 0033:0x7f95a1e729d4
[  369.953996] Code: Bad RIP value.
[  369.954456] RSP: 002b:00007ffdb570dd48 EFLAGS: 00000246 ORIG_RAX: 000000000000004a
[  369.955506] RAX: ffffffffffffffda RBX: 000055c2139c6be0 RCX: 00007f95a1e729d4
[  369.956389] RDX: 0000000000000001 RSI: 0000000000001261 RDI: 0000000000000004
[  369.957325] RBP: 0000000000000002 R08: 0000000000000000 R09: 000055c2139c6ce0
[  369.958199] R10: 0000000000000000 R11: 0000000000000246 R12: 000055c2139c0380
[  369.959143] R13: 0000000000000004 R14: 0000000000000100 R15: 0000000000000008

Cc: stable@vger.kernel.org
Cc: Paolo Valente &lt;paolo.valente@linaro.org&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>block/bio: Do not zero user pages</title>
<updated>2018-12-19T18:19:50Z</updated>
<author>
<name>Keith Busch</name>
<email>keith.busch@intel.com</email>
</author>
<published>2018-12-10T15:44:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7290c71ded83bad817ed7b4827afd725e81f20d0'/>
<id>urn:sha1:7290c71ded83bad817ed7b4827afd725e81f20d0</id>
<content type='text'>
commit f55adad601c6a97c8c9628195453e0fb23b4a0ae upstream.

We don't need to zero fill the bio if not using kernel allocated pages.

Fixes: f3587d76da05 ("block: Clear kernel memory before copying to user") # v4.20-rc2
Reported-by: Todd Aiken &lt;taiken@mvtech.ca&gt;
Cc: Laurence Oberman &lt;loberman@redhat.com&gt;
Cc: stable@vger.kernel.org
Cc: Bart Van Assche &lt;bvanassche@acm.org&gt;
Tested-by: Laurence Oberman &lt;loberman@redhat.com&gt;
Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>blk-mq: punt failed direct issue to dispatch list</title>
<updated>2018-12-08T11:59:10Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2018-12-07T05:17:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=55cbeea76e769b12cd0f1340132d32781f73a3dc'/>
<id>urn:sha1:55cbeea76e769b12cd0f1340132d32781f73a3dc</id>
<content type='text'>
commit c616cbee97aed4bc6178f148a7240206dcdb85a6 upstream.

After the direct dispatch corruption fix, we permanently disallow direct
dispatch of non read/write requests. This works fine off the normal IO
path, as they will be retried like any other failed direct dispatch
request. But for the blk_insert_cloned_request() that only DM uses to
bypass the bottom level scheduler, we always first attempt direct
dispatch. For some types of requests, that's now a permanent failure,
and no amount of retrying will make that succeed. This results in a
livelock.

Instead of making special cases for what we can direct issue, and now
having to deal with DM solving the livelock while still retaining a BUSY
condition feedback loop, always just add a request that has been through
-&gt;queue_rq() to the hardware queue dispatch list. These are safe to use
as no merging can take place there. Additionally, if requests do have
prepped data from drivers, we aren't dependent on them not sharing space
in the request structure to safely add them to the IO scheduler lists.

This basically reverts ffe81d45322c and is based on a patch from Ming,
but with the list insert case covered as well.

Fixes: ffe81d45322c ("blk-mq: fix corruption with direct issue")
Cc: stable@vger.kernel.org
Suggested-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reported-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Tested-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Acked-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>blk-mq: fix corruption with direct issue</title>
<updated>2018-12-08T11:59:06Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2018-12-05T03:06:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=724ff9cbfe1f7a2adbab969f94ee9c2a2592a757'/>
<id>urn:sha1:724ff9cbfe1f7a2adbab969f94ee9c2a2592a757</id>
<content type='text'>
commit ffe81d45322cc3cb140f0db080a4727ea284661e upstream.

If we attempt a direct issue to a SCSI device, and it returns BUSY, then
we queue the request up normally. However, the SCSI layer may have
already setup SG tables etc for this particular command. If we later
merge with this request, then the old tables are no longer valid. Once
we issue the IO, we only read/write the original part of the request,
not the new state of it.

This causes data corruption, and is most often noticed with the file
system complaining about the just read data being invalid:

[  235.934465] EXT4-fs error (device sda1): ext4_iget:4831: inode #7142: comm dpkg-query: bad extra_isize 24937 (inode size 256)

because most of it is garbage...

This doesn't happen from the normal issue path, as we will simply defer
the request to the hardware queue dispatch list if we fail. Once it's on
the dispatch list, we never merge with it.

Fix this from the direct issue path by flagging the request as
REQ_NOMERGE so we don't change the size of it before issue.

See also:
  https://bugzilla.kernel.org/show_bug.cgi?id=201685

Tested-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Fixes: 6ce3dd6eec1 ("blk-mq: issue directly if hw queue isn't busy in case of 'none'")
Cc: stable@vger.kernel.org
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>block: copy ioprio in __bio_clone_fast() and bounce</title>
<updated>2018-12-01T08:37:32Z</updated>
<author>
<name>Hannes Reinecke</name>
<email>hare@suse.com</email>
</author>
<published>2018-11-12T17:35:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=487d58a9c3e5b2357b8bef6461ae228ad8e73f83'/>
<id>urn:sha1:487d58a9c3e5b2357b8bef6461ae228ad8e73f83</id>
<content type='text'>
[ Upstream commit ca474b73896bf6e0c1eb8787eb217b0f80221610 ]

We need to copy the io priority, too; otherwise the clone will run
with a different priority than the original one.

Fixes: 43b62ce3ff0a ("block: move bio io prio to a new field")
Signed-off-by: Hannes Reinecke &lt;hare@suse.com&gt;
Signed-off-by: Jean Delvare &lt;jdelvare@suse.de&gt;

Fixed up subject, and ordered stores.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: Clear kernel memory before copying to user</title>
<updated>2018-11-27T15:13:05Z</updated>
<author>
<name>Keith Busch</name>
<email>keith.busch@intel.com</email>
</author>
<published>2018-11-07T14:37:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a4da95ea101b2741488c6d84d3ce6e3d0cdffd62'/>
<id>urn:sha1:a4da95ea101b2741488c6d84d3ce6e3d0cdffd62</id>
<content type='text'>
[ Upstream commit f3587d76da05f68098ddb1cb3c98cc6a9e8a402c ]

If the kernel allocates a bounce buffer for user read data, this memory
needs to be cleared before copying it to the user, otherwise it may leak
kernel memory to user space.

Laurence Oberman &lt;loberman@redhat.com&gt;
Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>SCSI: fix queue cleanup race before queue initialization is done</title>
<updated>2018-11-21T08:19:18Z</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2018-11-14T08:25:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=410306a0f2baa5d68970cdcf6763d79c16df5f23'/>
<id>urn:sha1:410306a0f2baa5d68970cdcf6763d79c16df5f23</id>
<content type='text'>
commit 8dc765d438f1e42b3e8227b3b09fad7d73f4ec9a upstream.

c2856ae2f315d ("blk-mq: quiesce queue before freeing queue") has
already fixed this race, however the implied synchronize_rcu()
in blk_mq_quiesce_queue() can slow down LUN probe a lot, so caused
performance regression.

Then 1311326cf4755c7 ("blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()")
tried to quiesce queue for avoiding unnecessary synchronize_rcu()
only when queue initialization is done, because it is usual to see
lots of inexistent LUNs which need to be probed.

However, turns out it isn't safe to quiesce queue only when queue
initialization is done. Because when one SCSI command is completed,
the user of sending command can be waken up immediately, then the
scsi device may be removed, meantime the run queue in scsi_end_request()
is still in-progress, so kernel panic can be caused.

In Red Hat QE lab, there are several reports about this kind of kernel
panic triggered during kernel booting.

This patch tries to address the issue by grabing one queue usage
counter during freeing one request and the following run queue.

Fixes: 1311326cf4755c7 ("blk-mq: avoid to synchronize rcu inside blk_cleanup_queue()")
Cc: Andrew Jones &lt;drjones@redhat.com&gt;
Cc: Bart Van Assche &lt;bart.vanassche@wdc.com&gt;
Cc: linux-scsi@vger.kernel.org
Cc: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: James E.J. Bottomley &lt;jejb@linux.vnet.ibm.com&gt;
Cc: stable &lt;stable@vger.kernel.org&gt;
Cc: jianchao.wang &lt;jianchao.w.wang@oracle.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>block, bfq: correctly charge and reset entity service in all cases</title>
<updated>2018-11-13T19:08:28Z</updated>
<author>
<name>Paolo Valente</name>
<email>paolo.valente@linaro.org</email>
</author>
<published>2018-09-14T14:23:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=668c01c11b7069db36d3bd12738c9ab8487da852'/>
<id>urn:sha1:668c01c11b7069db36d3bd12738c9ab8487da852</id>
<content type='text'>
[ Upstream commit cbeb869a3d1110450186b738199963c5e68c2a71 ]

BFQ schedules entities (which represent either per-process queues or
groups of queues) as a function of their timestamps. In particular, as
a function of their (virtual) finish times. The finish time of an
entity is computed as a function of the budget assigned to the entity,
assuming, tentatively, that the entity, once in service, will receive
an amount of service equal to its budget. Then, when the entity is
expired because it finishes to be served, this finish time is updated
as a function of the actual service received by the entity. This
allows the entity to be correctly charged with only the service
received, and then to be correctly re-scheduled.

Yet an entity may receive service also while not being the entity in
service (in the scheduling environment of its parent entity), for
several reasons. If the entity remains with no backlog while receiving
this 'unofficial' service, then it is expired. Also on such an
expiration, the finish time of the entity should be updated to account
for only the service actually received by the entity. Unfortunately,
such an update is not performed for an entity expiring without being
the entity in service.

In a similar vein, the service counter of the entity in service is
reset when the entity is expired, to be ready to be used for next
service cycle. This reset too should be performed also in case an
entity is expired because it remains empty after receiving service
while not being the entity in service. But in this case the reset is
not performed.

This commit performs the above update of the finish time and reset of
the service received, also for an entity expiring while not being the
entity in service.

Signed-off-by: Paolo Valente &lt;paolo.valente@linaro.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
