<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/blk-mq.h, branch v4.4.153</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.153</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.153'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2015-11-07T17:40:47Z</updated>
<entry>
<title>block: add block polling support</title>
<updated>2015-11-07T17:40:47Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2015-11-05T17:44:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=05229beeddf7e75e2e616ddaad4b70e7fca9528d'/>
<id>urn:sha1:05229beeddf7e75e2e616ddaad4b70e7fca9528d</id>
<content type='text'>
Add basic support for polling for specific IO to complete. This uses
the cookie that blk-mq passes back, which enables the block layer
to pass this cookie to the driver to spin for a specific request.

This will be combined with request latency tracking, so we can make
qualified decisions about when to poll and when not to. For now, for
benchmark purposes, we add a sysfs file that controls whether polling
is enabled or not.
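
As a rough illustration of spinning for one specific request, the
following minimal Python sketch models a device whose completions are
reaped one at a time until the wanted cookie shows up. FakeDevice,
reap_one, poll_for and max_spins are hypothetical names for this sketch
only and are not part of the block layer API.

```python
from collections import deque


class FakeDevice:
    """Hypothetical device model with a queue of pending completions."""

    def __init__(self, completion_order):
        self._cq = deque(completion_order)  # cookies in completion order
        self.done = set()                   # cookies already reaped

    def reap_one(self):
        """Reap a single completion; return False if none is pending."""
        if not self._cq:
            return False
        self.done.add(self._cq.popleft())
        return True


def poll_for(device, cookie, max_spins=1000):
    """Spin until the request identified by cookie has completed."""
    for _ in range(max_spins):
        if cookie in device.done:
            return True
        if not device.reap_one():
            break  # nothing left to reap in this toy model
    return cookie in device.done


dev = FakeDevice([3, 1, 2])
found = poll_for(dev, 1)    # reaps cookies 3 and 1, then stops
missing = poll_for(dev, 9)  # drains the queue without finding cookie 9
```

The caller stops reaping as soon as its own cookie is seen, which is the
point of passing the cookie down instead of waiting for an interrupt.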

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Keith Busch &lt;keith.busch@intel.com&gt;
</content>
</entry>
<entry>
<title>block: generic request_queue reference counting</title>
<updated>2015-10-21T20:43:41Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2015-10-21T17:20:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3ef28e83ab15799742e55fd13243a5f678b04242'/>
<id>urn:sha1:3ef28e83ab15799742e55fd13243a5f678b04242</id>
<content type='text'>
Allow pmem, and other synchronous/bio-based block drivers, to fall back
on a per-cpu reference count managed by the core for tracking queue
live/dead state.

The existing per-cpu reference count for the blk_mq case is promoted to
be used in all block i/o scenarios.  This involves initializing it by
default, waiting for it to drop to zero at exit, and holding a live
reference over the invocation of q-&gt;make_request_fn() in
generic_make_request().  The blk_mq code continues to take its own
reference per blk_mq request and retains the ability to freeze the
queue, but the check that the queue is frozen is moved to
generic_make_request().

This fixes crash signatures like the following:

 BUG: unable to handle kernel paging request at ffff880140000000
 [..]
 Call Trace:
  [&lt;ffffffff8145e8bf&gt;] ? copy_user_handle_tail+0x5f/0x70
  [&lt;ffffffffa004e1e0&gt;] pmem_do_bvec.isra.11+0x70/0xf0 [nd_pmem]
  [&lt;ffffffffa004e331&gt;] pmem_make_request+0xd1/0x200 [nd_pmem]
  [&lt;ffffffff811c3162&gt;] ? mempool_alloc+0x72/0x1a0
  [&lt;ffffffff8141f8b6&gt;] generic_make_request+0xd6/0x110
  [&lt;ffffffff8141f966&gt;] submit_bio+0x76/0x170
  [&lt;ffffffff81286dff&gt;] submit_bh_wbc+0x12f/0x160
  [&lt;ffffffff81286e62&gt;] submit_bh+0x12/0x20
  [&lt;ffffffff813395bd&gt;] jbd2_write_superblock+0x8d/0x170
  [&lt;ffffffff8133974d&gt;] jbd2_mark_journal_empty+0x5d/0x90
  [&lt;ffffffff813399cb&gt;] jbd2_journal_destroy+0x24b/0x270
  [&lt;ffffffff810bc4ca&gt;] ? put_pwq_unlocked+0x2a/0x30
  [&lt;ffffffff810bc6f5&gt;] ? destroy_workqueue+0x225/0x250
  [&lt;ffffffff81303494&gt;] ext4_put_super+0x64/0x360
  [&lt;ffffffff8124ab1a&gt;] generic_shutdown_super+0x6a/0xf0
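
The reference-holding scheme described above can be sketched in Python
as follows. This is a simplified model under stated assumptions: a plain
counter guarded by a condition variable stands in for the per-cpu
reference count, and QueueRef, enter, exit, kill_and_wait and submit are
hypothetical names, not kernel functions.

```python
import threading


class QueueRef:
    """Hypothetical live/dead tracker for a queue (not the kernel's type)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._count = 0
        self._dying = False

    def enter(self):
        """Take a live reference; refuse once teardown has started."""
        with self._cond:
            if self._dying:
                return False
            self._count += 1
            return True

    def exit(self):
        """Drop a reference and wake a waiter when the count hits zero."""
        with self._cond:
            self._count -= 1
            if self._count == 0:
                self._cond.notify_all()

    def kill_and_wait(self):
        """Mark the queue dead and wait for in-flight users to drain."""
        with self._cond:
            self._dying = True
            while self._count != 0:
                self._cond.wait()


def submit(ref, make_request_fn, bio):
    """Hold a live reference across the driver callback."""
    if not ref.enter():
        return None  # queue is dead; the driver is never called
    try:
        return make_request_fn(bio)
    finally:
        ref.exit()


ref = QueueRef()
first = submit(ref, lambda bio: "done:" + bio, "b0")
ref.kill_and_wait()
second = submit(ref, lambda bio: "done:" + bio, "b1")
```

Because the driver callback only runs while a reference is held, teardown
cannot finish underneath it, which is the property the crash above was
missing.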

Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Keith Busch &lt;keith.busch@intel.com&gt;
Cc: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Suggested-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Tested-by: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blk-mq: factor out a helper to iterate all tags for a request_queue</title>
<updated>2015-10-01T08:10:57Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2015-09-27T19:01:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0bf6cd5b9531bcc29c0a5e504b6ce2984c6fd8d8'/>
<id>urn:sha1:0bf6cd5b9531bcc29c0a5e504b6ce2984c6fd8d8</id>
<content type='text'>
And replace blk_mq_tag_busy_iter with it - the driver use was replaced
with a new helper a while ago, and internal to the block layer we
only need the new version.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blk-mq: fix racy updates of rq-&gt;errors</title>
<updated>2015-10-01T08:10:55Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2015-09-27T19:01:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f4829a9b7a61e159367350008a608b062c4f6840'/>
<id>urn:sha1:f4829a9b7a61e159367350008a608b062c4f6840</id>
<content type='text'>
blk_mq_complete_request may be a no-op if the request has already
been completed by other means (e.g. a timeout or cancellation), but
currently drivers have to set rq-&gt;errors before calling
blk_mq_complete_request, which might leave us with the wrong error value.

Add an error parameter to blk_mq_complete_request so that we can
defer setting rq-&gt;errors until we know we won the race to complete the
request.
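
A minimal Python sketch of the idea, assuming a lock as a stand-in for
whatever atomic mechanism the kernel actually uses: the error value is
passed into the completion call and stored only by the caller that wins
the race. The Request class and its fields are hypothetical and only
mirror the names used in this message.

```python
import threading


class Request:
    """Hypothetical stand-in for a block request."""

    def __init__(self):
        self._lock = threading.Lock()
        self.completed = False
        self.errors = 0

    def complete(self, error):
        """Store error only if this caller is first to complete the request."""
        with self._lock:
            if self.completed:
                # Already completed elsewhere (e.g. by a timeout): no-op.
                return False
            self.completed = True
            self.errors = error
            return True


rq = Request()
timeout_won = rq.complete(-5)  # timeout path completes first
driver_won = rq.complete(0)    # late driver completion is ignored
```

With the old calling convention the late caller would already have
overwritten the error field before finding out that it lost the race.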

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Sagi Grimberg &lt;sagig@mellanox.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blk-mq: fix sysfs registration/unregistration race</title>
<updated>2015-09-29T17:32:45Z</updated>
<author>
<name>Akinobu Mita</name>
<email>akinobu.mita@gmail.com</email>
</author>
<published>2015-09-26T17:09:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4593fdbe7a2f44d5e64c627c715dd0bcec9bdf14'/>
<id>urn:sha1:4593fdbe7a2f44d5e64c627c715dd0bcec9bdf14</id>
<content type='text'>
There is a race between CPU hotplug handling and adding/deleting a
gendisk for blk-mq, where both try to register and unregister
the same sysfs entries.

null_add_dev
    --&gt; blk_mq_init_queue
        --&gt; blk_mq_init_allocated_queue
            --&gt; add to 'all_q_list' (*)
    --&gt; add_disk
        --&gt; blk_register_queue
            --&gt; blk_mq_register_disk (++)

null_del_dev
    --&gt; del_gendisk
        --&gt; blk_unregister_queue
            --&gt; blk_mq_unregister_disk (--)
    --&gt; blk_cleanup_queue
        --&gt; blk_mq_free_queue
            --&gt; del from 'all_q_list' (*)

blk_mq_queue_reinit
    --&gt; blk_mq_sysfs_unregister (-)
    --&gt; blk_mq_sysfs_register (+)

While the request queue is on 'all_q_list' (*),
blk_mq_queue_reinit() can be called for the queue at any time by the CPU
hotplug callback.  But blk_mq_sysfs_unregister (-) and
blk_mq_sysfs_register (+) in blk_mq_queue_reinit must not be called
before blk_mq_register_disk (++) has finished or after
blk_mq_unregister_disk (--) has finished, because '/sys/block/*/mq/'
does not exist at those points.

There is already a BLK_MQ_F_SYSFS_UP flag in hctx-&gt;flags that can
be used to track this sysfs state, but it only fixes the issue
partially.

In order to fix it completely, we need a per-queue flag instead of a
per-hctx flag, with appropriate locking.  So this introduces
q-&gt;mq_sysfs_init_done, which is protected by all_q_mutex.

Also, we need to ensure that blk_mq_map_swqueue() is called with
all_q_mutex held.  hctx-&gt;nr_ctx is reset temporarily and then
updated in blk_mq_map_swqueue(), so we should prevent
blk_mq_register_hctx() from seeing the temporary hctx-&gt;nr_ctx value
during CPU hotplug handling or while adding/deleting a gendisk.
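
The per-queue flag approach can be sketched in Python as below. This is
an illustrative model only: the Queue class, its methods and the
sysfs_events list are hypothetical, and a threading.Lock stands in for
all_q_mutex.

```python
import threading

all_q_mutex = threading.Lock()  # stands in for the mutex named above


class Queue:
    """Hypothetical model of a request queue; only the flag logic is shown."""

    def __init__(self):
        self.mq_sysfs_init_done = False
        self.sysfs_events = []  # records which sysfs steps ran

    def register_disk(self):
        with all_q_mutex:
            self.sysfs_events.append("register_disk")
            self.mq_sysfs_init_done = True

    def unregister_disk(self):
        with all_q_mutex:
            self.sysfs_events.append("unregister_disk")
            self.mq_sysfs_init_done = False

    def reinit(self):
        """Hotplug path: touch sysfs only while the entries exist."""
        with all_q_mutex:
            if not self.mq_sysfs_init_done:
                return False
            self.sysfs_events.append("sysfs_unregister")
            self.sysfs_events.append("sysfs_register")
            return True


q = Queue()
before = q.reinit()   # before register_disk: skipped
q.register_disk()
during = q.reinit()   # entries exist: re-registered
q.unregister_disk()
after = q.reinit()    # after unregister_disk: skipped
```

Checking and updating the flag under the same mutex is what closes the
window that a per-hctx flag without that locking leaves open.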

Signed-off-by: Akinobu Mita &lt;akinobu.mita@gmail.com&gt;
Reviewed-by: Ming Lei &lt;tom.leiming@gmail.com&gt;
Cc: Ming Lei &lt;tom.leiming@gmail.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blk-mq: Shared tag enhancements</title>
<updated>2015-06-01T20:35:56Z</updated>
<author>
<name>Keith Busch</name>
<email>keith.busch@intel.com</email>
</author>
<published>2015-06-01T15:29:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f26cdc8536ad50fb802a0445f836b4f94ca09ae7'/>
<id>urn:sha1:f26cdc8536ad50fb802a0445f836b4f94ca09ae7</id>
<content type='text'>
Storage controllers may expose multiple block devices that share hardware
resources managed by blk-mq. This patch enhances the shared tags so a
low-level driver can access the shared resources not tied to the unshared
h/w contexts. This way the LLD can dynamically add and delete disks and
request queues without having to track all the request_queue hctx's to
iterate outstanding tags.

Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blk-mq: fix iteration of busy bitmap</title>
<updated>2015-04-17T14:31:12Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2015-04-17T14:28:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=569fd0ce96087283866ab8c438dac4bcf1738846'/>
<id>urn:sha1:569fd0ce96087283866ab8c438dac4bcf1738846</id>
<content type='text'>
Commit 889fa31f00b2 was a bit too eager in reducing the loop count,
so we ended up missing queues in some configurations. Ensure that
our division rounds up, so that's not the case.
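
For illustration, a small Python sketch of truncating versus round-up
division when computing how many bitmap words a loop has to visit. The
values of nr_ctx and bits_per_word are made up for this example and are
not taken from the commit.

```python
def div_round_down(n, d):
    """Truncating division: drops a partially filled last word."""
    return n // d


def div_round_up(n, d):
    """Round-up division: counts a partially filled last word too."""
    return (n + d - 1) // d


nr_ctx = 9         # hypothetical number of software queues
bits_per_word = 8  # hypothetical bits tracked per bitmap word

words_truncated = div_round_down(nr_ctx, bits_per_word)  # 1
words_rounded_up = div_round_up(nr_ctx, bits_per_word)   # 2

# With truncation the loop covers only 8 of the 9 queues.
covered_truncated = words_truncated * bits_per_word
```

Whenever the count is not an exact multiple of the word size, the
truncated loop bound skips the queues that live in the last word.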

Reported-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Fixes: 889fa31f00b2 ("blk-mq: reduce unnecessary software queue looping")
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blk-mq: cleanup blk_mq_rq_to_pdu()</title>
<updated>2015-04-09T21:54:05Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2015-04-09T21:54:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2963e3f7e8e3465895897a175560210120b932ac'/>
<id>urn:sha1:2963e3f7e8e3465895897a175560210120b932ac</id>
<content type='text'>
Casting to void and adding the size of the request is "shit code" and
only a "crazy monkey on crack" would write that. So let's clean it up.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blk-mq: export blk_mq_run_hw_queues</title>
<updated>2015-03-13T14:28:33Z</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2015-03-12T03:56:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b94ec296403e99d5ac9a8c48332cec4118d44b94'/>
<id>urn:sha1:b94ec296403e99d5ac9a8c48332cec4118d44b94</id>
<content type='text'>
Rename blk_mq_run_queues to blk_mq_run_hw_queues, add async argument,
and export it.

DM's suspend support must be able to run the queue without starting
stopped hw queues.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blk-mq: add blk_mq_init_allocated_queue and export blk_mq_register_disk</title>
<updated>2015-03-13T14:26:53Z</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2015-03-13T03:56:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b62c21b71f08b7a4bfd025616ff1da2913a82904'/>
<id>urn:sha1:b62c21b71f08b7a4bfd025616ff1da2913a82904</id>
<content type='text'>
Add a variant of blk_mq_init_queue that allows a previously allocated
queue to be initialized.  blk_mq_init_allocated_queue models
blk_init_allocated_queue -- which was also created for DM's use.

DM's approach to device creation requires a placeholder request_queue be
allocated for use with alloc_dev() but the decision about what type of
request_queue will be ultimately created is deferred until all component
devices referenced in the DM table are processed to determine the table
type (request-based, blk-mq request-based, or bio-based).

Also, because of DM's late finalization of the request_queue type
the call to blk_mq_register_disk() doesn't happen during alloc_dev().
Must export blk_mq_register_disk() so that DM can backfill the 'mq' dir
once the blk-mq queue is fully allocated.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Reviewed-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
</feed>
