<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/block, branch v4.14.149</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.14.149</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.14.149'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2019-10-05T10:47:37Z</updated>
<entry>
<title>blk-mq: move cancel of requeue_work to the front of blk_exit_queue</title>
<updated>2019-10-05T10:47:37Z</updated>
<author>
<name>zhengbin</name>
<email>zhengbin13@huawei.com</email>
</author>
<published>2019-08-12T12:36:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d6e13c8d5cda231e6b0d824804b60395abc1cdcc'/>
<id>urn:sha1:d6e13c8d5cda231e6b0d824804b60395abc1cdcc</id>
<content type='text'>
[ Upstream commit e26cc08265dda37d2acc8394604f220ef412299d ]

blk_exit_queue will free elevator_data, while blk_mq_requeue_work
will access it. Move cancel of requeue_work to the front of
blk_exit_queue to avoid use-after-free.

blk_exit_queue                blk_mq_requeue_work
  __elevator_exit               blk_mq_run_hw_queues
    blk_mq_exit_sched             blk_mq_run_hw_queue
      dd_exit_queue                 blk_mq_hctx_has_pending
        kfree(elevator_data)          blk_mq_sched_has_work
                                        dd_has_work

Fixes: fbc2a15e3433 ("blk-mq: move cancel of requeue_work into blk_mq_release")
Cc: stable@vger.kernel.org
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: zhengbin &lt;zhengbin13@huawei.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block/bio-integrity: fix a memory leak bug</title>
<updated>2019-07-31T05:28:55Z</updated>
<author>
<name>Wenwen Wang</name>
<email>wenwen@cs.uga.edu</email>
</author>
<published>2019-07-11T19:22:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c35f5d31451fa402f2c8ea51609aab5fc8cfc612'/>
<id>urn:sha1:c35f5d31451fa402f2c8ea51609aab5fc8cfc612</id>
<content type='text'>
[ Upstream commit e7bf90e5afe3aa1d1282c1635a49e17a32c4ecec ]

In bio_integrity_prep(), a kernel buffer is allocated through kmalloc() to
hold integrity metadata. Later on, the buffer will be attached to the bio
structure through bio_integrity_add_page(), which returns the number of
bytes of integrity metadata attached. Due to unexpected situations,
bio_integrity_add_page() may return 0. As a result, bio_integrity_prep()
needs to be terminated with 'false' returned to indicate this error.
However, the allocated kernel buffer is not freed on this execution path,
leading to a memory leak.

To fix this issue, free the allocated buffer before returning from
bio_integrity_prep().

Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Acked-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Wenwen Wang &lt;wenwen@cs.uga.edu&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block, bfq: NULL out the bic when it's no longer valid</title>
<updated>2019-07-21T07:04:30Z</updated>
<author>
<name>Douglas Anderson</name>
<email>dianders@chromium.org</email>
</author>
<published>2019-06-28T04:44:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=340a4da3c16aa73f9ce7b409e3e65786b7abb202'/>
<id>urn:sha1:340a4da3c16aa73f9ce7b409e3e65786b7abb202</id>
<content type='text'>
commit dbc3117d4ca9e17819ac73501e914b8422686750 upstream.

In reboot tests on several devices we were seeing a "use after free"
when slub_debug or KASAN was enabled.  The kernel complained about:

  Unable to handle kernel paging request at virtual address 6b6b6c2b

...which is a classic sign of use after free under slub_debug.  The
stack crawl in kgdb looked like:

 0  test_bit (addr=&lt;optimized out&gt;, nr=&lt;optimized out&gt;)
 1  bfq_bfqq_busy (bfqq=&lt;optimized out&gt;)
 2  bfq_select_queue (bfqd=&lt;optimized out&gt;)
 3  __bfq_dispatch_request (hctx=&lt;optimized out&gt;)
 4  bfq_dispatch_request (hctx=&lt;optimized out&gt;)
 5  0xc056ef00 in blk_mq_do_dispatch_sched (hctx=0xed249440)
 6  0xc056f728 in blk_mq_sched_dispatch_requests (hctx=0xed249440)
 7  0xc0568d24 in __blk_mq_run_hw_queue (hctx=0xed249440)
 8  0xc0568d94 in blk_mq_run_work_fn (work=&lt;optimized out&gt;)
 9  0xc024c5c4 in process_one_work (worker=0xec6d4640, work=0xed249480)
 10 0xc024cff4 in worker_thread (__worker=0xec6d4640)

Digging in kgdb, it could be found that, though bfqq looked fine,
bfqq-&gt;bic had been freed.

Through further digging, I postulated that perhaps it is illegal to
access a "bic" (AKA an "icq") after bfq_exit_icq() had been called
because the "bic" can be freed at some point in time after this call
is made.  I confirmed that there certainly were cases where the exact
crashing code path would access the "bic" after bfq_exit_icq() had
been called.  Sspecifically I set the "bfqq-&gt;bic" to (void *)0x7 and
saw that the bic was 0x7 at the time of the crash.

To understand a bit more about why this crash was fairly uncommon (I
saw it only once in a few hundred reboots), you can see that much of
the time bfq_exit_icq_fbqq() fully frees the bfqq and thus it can't
access the -&gt;bic anymore.  The only case it doesn't is if
bfq_put_queue() sees a reference still held.

However, even in the case when bfqq isn't freed, the crash is still
rare.  Why?  I tracked what happened to the "bic" after the exit
routine.  It doesn't get freed right away.  Rather,
put_io_context_active() eventually called put_io_context() which
queued up freeing on a workqueue.  The freeing then actually happened
later than that through call_rcu().  Despite all these delays, some
extra debugging showed that all the hoops could be jumped through in
time and the memory could be freed causing the original crash.  Phew!

To make a long story short, assuming it truly is illegal to access an
icq after the "exit_icq" callback is finished, this patch is needed.

Cc: stable@vger.kernel.org
Reviewed-by: Paolo Valente &lt;paolo.valente@unimore.it&gt;
Signed-off-by: Douglas Anderson &lt;dianders@chromium.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>block: bio_iov_iter_get_pages: pin more pages for multi-segment IOs</title>
<updated>2019-07-03T11:15:58Z</updated>
<author>
<name>Martin Wilck</name>
<email>mwilck@suse.com</email>
</author>
<published>2018-07-25T21:15:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ff325dc466cc4bf77f03abf0763e395b45102c61'/>
<id>urn:sha1:ff325dc466cc4bf77f03abf0763e395b45102c61</id>
<content type='text'>
[ Upstream commit 17d51b10d7773e4618bcac64648f30f12d4078fb ]

bio_iov_iter_get_pages() currently only adds pages for the next non-zero
segment from the iov_iter to the bio. That's suboptimal for callers,
which typically try to pin as many pages as fit into the bio. This patch
converts the current bio_iov_iter_get_pages() into a static helper, and
introduces a new helper that allocates as many pages as

 1) fit into the bio,
 2) are present in the iov_iter,
 3) and can be pinned by MM.

Error is returned only if zero pages could be pinned. Because of 3), a
zero return value doesn't necessarily mean all pages have been pinned.
Callers that have to pin every page in the iov_iter must still call this
function in a loop (this is currently the case).

This change matters most for __blkdev_direct_IO_simple(), which calls
bio_iov_iter_get_pages() only once. If it obtains less pages than
requested, it returns a "short write" or "short read", and
__generic_file_write_iter() falls back to buffered writes, which may
lead to data corruption.

Fixes: 72ecad22d9f1 ("block: support a full bio worth of IO for simplified bdev direct-io")
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Martin Wilck &lt;mwilck@suse.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: add a lower-level bio_add_page interface</title>
<updated>2019-07-03T11:15:58Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2018-06-01T16:03:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=515e2f3e9f1b7d327bdfda1b131b782f8893e19b'/>
<id>urn:sha1:515e2f3e9f1b7d327bdfda1b131b782f8893e19b</id>
<content type='text'>
[ Upstream commit 0aa69fd32a5f766e997ca8ab4723c5a1146efa8b ]

For the upcoming removal of buffer heads in XFS we need to keep track of
the number of outstanding writeback requests per page.  For this we need
to know if bio_add_page merged a region with the previous bvec or not.
Instead of adding additional arguments this refactors bio_add_page to
be implemented using three lower level helpers which users like XFS can
use directly if they care about the merge decisions.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Reviewed-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Reviewed-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block, bfq: increase idling for weight-raised queues</title>
<updated>2019-06-15T09:54:58Z</updated>
<author>
<name>Paolo Valente</name>
<email>paolo.valente@linaro.org</email>
</author>
<published>2019-03-12T08:59:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f88e587c64b73c8e6ace2281b74ae4499bcb502b'/>
<id>urn:sha1:f88e587c64b73c8e6ace2281b74ae4499bcb502b</id>
<content type='text'>
[ Upstream commit 778c02a236a8728bb992de10ed1f12c0be5b7b0e ]

If a sync bfq_queue has a higher weight than some other queue, and
remains temporarily empty while in service, then, to preserve the
bandwidth share of the queue, it is necessary to plug I/O dispatching
until a new request arrives for the queue. In addition, a timeout
needs to be set, to avoid waiting for ever if the process associated
with the queue has actually finished its I/O.

Even with the above timeout, the device is however not fed with new
I/O for a while, if the process has finished its I/O. If this happens
often, then throughput drops and latencies grow. For this reason, the
timeout is kept rather low: 8 ms is the current default.

Unfortunately, such a low value may cause, on the opposite end, a
violation of bandwidth guarantees for a process that happens to issue
new I/O too late. The higher the system load, the higher the
probability that this happens to some process. This is a problem in
scenarios where service guarantees matter more than throughput. One
important case are weight-raised queues, which need to be granted a
very high fraction of the bandwidth.

To address this issue, this commit lower-bounds the plugging timeout
for weight-raised queues to 20 ms. This simple change provides
relevant benefits. For example, on a PLEXTOR PX-256M5S, with which
gnome-terminal starts in 0.6 seconds if there is no other I/O in
progress, the same applications starts in
- 0.8 seconds, instead of 1.2 seconds, if ten files are being read
  sequentially in parallel
- 1 second, instead of 2 seconds, if, in parallel, five files are
  being read sequentially, and five more files are being written
  sequentially

Tested-by: Holger Hoffstätte &lt;holger@applied-asynchrony.com&gt;
Tested-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Signed-off-by: Paolo Valente &lt;paolo.valente@linaro.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>blk-mq: move cancel of requeue_work into blk_mq_release</title>
<updated>2019-06-15T09:54:54Z</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@redhat.com</email>
</author>
<published>2019-04-30T01:52:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=aa331e85a45a28aa62515787892ac43c1484c06f'/>
<id>urn:sha1:aa331e85a45a28aa62515787892ac43c1484c06f</id>
<content type='text'>
[ Upstream commit fbc2a15e3433058582e5635aabe48a3011a644a8 ]

With holding queue's kobject refcount, it is safe for driver
to schedule requeue. However, blk_mq_kick_requeue_list() may
be called after blk_sync_queue() is done because of concurrent
requeue activities, then requeue work may not be completed when
freeing queue, and kernel oops is triggered.

So moving the cancel of requeue_work into blk_mq_release() for
avoiding race between requeue and freeing queue.

Cc: Dongli Zhang &lt;dongli.zhang@oracle.com&gt;
Cc: James Smart &lt;james.smart@broadcom.com&gt;
Cc: Bart Van Assche &lt;bart.vanassche@wdc.com&gt;
Cc: linux-scsi@vger.kernel.org,
Cc: Martin K . Petersen &lt;martin.petersen@oracle.com&gt;,
Cc: Christoph Hellwig &lt;hch@lst.de&gt;,
Cc: James E . J . Bottomley &lt;jejb@linux.vnet.ibm.com&gt;,
Reviewed-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Reviewed-by: Johannes Thumshirn &lt;jthumshirn@suse.de&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Tested-by: James Smart &lt;james.smart@broadcom.com&gt;
Signed-off-by: Ming Lei &lt;ming.lei@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: sed-opal: fix IOC_OPAL_ENABLE_DISABLE_MBR</title>
<updated>2019-05-31T13:47:29Z</updated>
<author>
<name>David Kozub</name>
<email>zub@linux.fjfi.cvut.cz</email>
</author>
<published>2019-02-14T00:15:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4654327bee80808636592c27a0ddca3a1ec0934b'/>
<id>urn:sha1:4654327bee80808636592c27a0ddca3a1ec0934b</id>
<content type='text'>
[ Upstream commit 78bf47353b0041865564deeed257a54f047c2fdc ]

The implementation of IOC_OPAL_ENABLE_DISABLE_MBR handled the value
opal_mbr_data.enable_disable incorrectly: enable_disable is expected
to be one of OPAL_MBR_ENABLE(0) or OPAL_MBR_DISABLE(1). enable_disable
was passed directly to set_mbr_done and set_mbr_enable_disable where
is was interpreted as either OPAL_TRUE(1) or OPAL_FALSE(0). The end
result was that calling IOC_OPAL_ENABLE_DISABLE_MBR with OPAL_MBR_ENABLE
actually disabled the shadow MBR and vice versa.

This patch adds correct conversion from OPAL_MBR_DISABLE/ENABLE to
OPAL_FALSE/TRUE. The change affects existing programs using
IOC_OPAL_ENABLE_DISABLE_MBR but this is typically used only once when
setting up an Opal drive.

Acked-by: Jon Derrick &lt;jonathan.derrick@intel.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Scott Bauer &lt;sbauer@plzdonthack.me&gt;
Signed-off-by: David Kozub &lt;zub@linux.fjfi.cvut.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>block: do not leak memory in bio_copy_user_iov()</title>
<updated>2019-04-17T06:37:53Z</updated>
<author>
<name>Jérôme Glisse</name>
<email>jglisse@redhat.com</email>
</author>
<published>2019-04-10T20:27:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6ec54fc43b5a43c6b456b04843f13a2dded1c4f2'/>
<id>urn:sha1:6ec54fc43b5a43c6b456b04843f13a2dded1c4f2</id>
<content type='text'>
commit a3761c3c91209b58b6f33bf69dd8bb8ec0c9d925 upstream.

When bio_add_pc_page() fails in bio_copy_user_iov() we should free
the page we just allocated otherwise we are leaking it.

Cc: linux-block@vger.kernel.org
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: stable@vger.kernel.org
Reviewed-by: Chaitanya Kulkarni &lt;chaitanya.kulkarni@wdc.com&gt;
Signed-off-by: Jérôme Glisse &lt;jglisse@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>blk-mq: fix a hung issue when fsync</title>
<updated>2019-02-20T09:20:44Z</updated>
<author>
<name>Jianchao Wang</name>
<email>jianchao.w.wang@oracle.com</email>
</author>
<published>2019-01-30T09:01:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=883d561c49ebf1d540c3657892995692118afa91'/>
<id>urn:sha1:883d561c49ebf1d540c3657892995692118afa91</id>
<content type='text'>
[ Upstream commit 85bd6e61f34dffa8ec2dc75ff3c02ee7b2f1cbce ]

Florian reported a io hung issue when fsync(). It should be
triggered by following race condition.

data + post flush         a flush

blk_flush_complete_seq
  case REQ_FSEQ_DATA
    blk_flush_queue_rq
    issued to driver      blk_mq_dispatch_rq_list
                            try to issue a flush req
                            failed due to NON-NCQ command
                            .queue_rq return BLK_STS_DEV_RESOURCE

request completion
  req-&gt;end_io // doesn't check RESTART
  mq_flush_data_end_io
    case REQ_FSEQ_POSTFLUSH
      blk_kick_flush
        do nothing because previous flush
        has not been completed
     blk_mq_run_hw_queue
                              insert rq to hctx-&gt;dispatch
                              due to RESTART is still set, do nothing

To fix this, replace the blk_mq_run_hw_queue in mq_flush_data_end_io
with blk_mq_sched_restart to check and clear the RESTART flag.

Fixes: bd166ef1 (blk-mq-sched: add framework for MQ capable IO schedulers)
Reported-by: Florian Stecker &lt;m19@florianstecker.de&gt;
Tested-by: Florian Stecker &lt;m19@florianstecker.de&gt;
Signed-off-by: Jianchao Wang &lt;jianchao.w.wang@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
