<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/blk-mq.h, branch v4.6.5</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.6.5</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.6.5'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2016-03-20T15:34:02Z</updated>
<entry>
<title>blk-mq: Use proper cpumask iterator</title>
<updated>2016-03-20T15:34:02Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-03-19T10:30:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=897bb0c7f1ea82d7cc882b19790b5e1df00ffc29'/>
<id>urn:sha1:897bb0c7f1ea82d7cc882b19790b5e1df00ffc29</id>
<content type='text'>
queue_for_each_ctx() iterates over per_cpu variables under the assumption that
the possible cpu mask cannot have holes. That's wrong as all cpumasks can have
holes. In case there are holes the iteration ends up accessing uninitialized
memory and crashing as a result.

Replace the macro by a proper for_each_possible_cpu() loop and drop the unused
macro blk_ctx_sum() which references queue_for_each_ctx().

Reported-by: Xiong Zhou &lt;jencce.kernel@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blk-mq: dynamic h/w context count</title>
<updated>2016-02-09T19:42:17Z</updated>
<author>
<name>Keith Busch</name>
<email>keith.busch@intel.com</email>
</author>
<published>2015-12-18T00:08:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=868f2f0b72068a097508b6e8870a8950fd8eb7ef'/>
<id>urn:sha1:868f2f0b72068a097508b6e8870a8950fd8eb7ef</id>
<content type='text'>
The hardware's provided queue count may change at runtime with resource
provisioning. This patch allows a block driver to alter the number of
h/w queues available when its resource count changes.

The main part is a new blk-mq API to request a new number of h/w queues
for a given live tag set. The new API freezes all queues using that set,
then adjusts the allocated count prior to remapping these to CPUs.

The bulk of the rest just shifts where h/w contexts and all their
artifacts are allocated and freed.

The number of max h/w contexts is capped to the number of possible cpus
since there is no use for more than that. As such, all pre-allocated
memory for pointers need to account for the max possible rather than
the initial number of queues.

A side effect of this is that the blk-mq will proceed successfully as
long as it can allocate at least one h/w context. Previously it would
fail request queue initialization if less than the requested number
was allocated.

Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Tested-by: Jon Derrick &lt;jonathan.derrick@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blk-mq: add a flags parameter to blk_mq_alloc_request</title>
<updated>2015-12-01T17:53:59Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2015-11-26T08:13:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6f3b0e8bcf3cbb87a7459b3ed018d31d918df3f8'/>
<id>urn:sha1:6f3b0e8bcf3cbb87a7459b3ed018d31d918df3f8</id>
<content type='text'>
We already have the reserved flag, and a nowait flag awkwardly encoded as
a gfp_t.  Add a real flags argument to make the scheme more extensible and
allow for a nicer calling convention.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>block: add block polling support</title>
<updated>2015-11-07T17:40:47Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2015-11-05T17:44:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=05229beeddf7e75e2e616ddaad4b70e7fca9528d'/>
<id>urn:sha1:05229beeddf7e75e2e616ddaad4b70e7fca9528d</id>
<content type='text'>
Add basic support for polling for specific IO to complete. This uses
the cookie that blk-mq passes back, which enables the block layer
to pass this cookie to the driver to spin for a specific request.

This will be combined with request latency tracking, so we can make
qualified decisions about when to poll and when not to. For now, for
benchmark purposes, we add a sysfs file that controls whether polling
is enabled or not.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Acked-by: Keith Busch &lt;keith.busch@intel.com&gt;
</content>
</entry>
<entry>
<title>block: generic request_queue reference counting</title>
<updated>2015-10-21T20:43:41Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2015-10-21T17:20:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3ef28e83ab15799742e55fd13243a5f678b04242'/>
<id>urn:sha1:3ef28e83ab15799742e55fd13243a5f678b04242</id>
<content type='text'>
Allow pmem, and other synchronous/bio-based block drivers, to fallback
on a per-cpu reference count managed by the core for tracking queue
live/dead state.

The existing per-cpu reference count for the blk_mq case is promoted to
be used in all block i/o scenarios.  This involves initializing it by
default, waiting for it to drop to zero at exit, and holding a live
reference over the invocation of q-&gt;make_request_fn() in
generic_make_request().  The blk_mq code continues to take its own
reference per blk_mq request and retains the ability to freeze the
queue, but the check that the queue is frozen is moved to
generic_make_request().

This fixes crash signatures like the following:

 BUG: unable to handle kernel paging request at ffff880140000000
 [..]
 Call Trace:
  [&lt;ffffffff8145e8bf&gt;] ? copy_user_handle_tail+0x5f/0x70
  [&lt;ffffffffa004e1e0&gt;] pmem_do_bvec.isra.11+0x70/0xf0 [nd_pmem]
  [&lt;ffffffffa004e331&gt;] pmem_make_request+0xd1/0x200 [nd_pmem]
  [&lt;ffffffff811c3162&gt;] ? mempool_alloc+0x72/0x1a0
  [&lt;ffffffff8141f8b6&gt;] generic_make_request+0xd6/0x110
  [&lt;ffffffff8141f966&gt;] submit_bio+0x76/0x170
  [&lt;ffffffff81286dff&gt;] submit_bh_wbc+0x12f/0x160
  [&lt;ffffffff81286e62&gt;] submit_bh+0x12/0x20
  [&lt;ffffffff813395bd&gt;] jbd2_write_superblock+0x8d/0x170
  [&lt;ffffffff8133974d&gt;] jbd2_mark_journal_empty+0x5d/0x90
  [&lt;ffffffff813399cb&gt;] jbd2_journal_destroy+0x24b/0x270
  [&lt;ffffffff810bc4ca&gt;] ? put_pwq_unlocked+0x2a/0x30
  [&lt;ffffffff810bc6f5&gt;] ? destroy_workqueue+0x225/0x250
  [&lt;ffffffff81303494&gt;] ext4_put_super+0x64/0x360
  [&lt;ffffffff8124ab1a&gt;] generic_shutdown_super+0x6a/0xf0

Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: Keith Busch &lt;keith.busch@intel.com&gt;
Cc: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Suggested-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Tested-by: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blk-mq: factor out a helper to iterate all tags for a request_queue</title>
<updated>2015-10-01T08:10:57Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2015-09-27T19:01:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0bf6cd5b9531bcc29c0a5e504b6ce2984c6fd8d8'/>
<id>urn:sha1:0bf6cd5b9531bcc29c0a5e504b6ce2984c6fd8d8</id>
<content type='text'>
And replace the blk_mq_tag_busy_iter with it - the driver use has been
replaced with a new helper a while ago, and internal to the block we
only need the new version.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blk-mq: fix racy updates of rq-&gt;errors</title>
<updated>2015-10-01T08:10:55Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2015-09-27T19:01:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f4829a9b7a61e159367350008a608b062c4f6840'/>
<id>urn:sha1:f4829a9b7a61e159367350008a608b062c4f6840</id>
<content type='text'>
blk_mq_complete_request may be a no-op if the request has already
been completed by others means (e.g. a timeout or cancellation), but
currently drivers have to set rq-&gt;errors before calling
blk_mq_complete_request, which might leave us with the wrong error value.

Add an error parameter to blk_mq_complete_request so that we can
defer setting rq-&gt;errors until we known we won the race to complete the
request.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Sagi Grimberg &lt;sagig@mellanox.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blk-mq: fix sysfs registration/unregistration race</title>
<updated>2015-09-29T17:32:45Z</updated>
<author>
<name>Akinobu Mita</name>
<email>akinobu.mita@gmail.com</email>
</author>
<published>2015-09-26T17:09:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4593fdbe7a2f44d5e64c627c715dd0bcec9bdf14'/>
<id>urn:sha1:4593fdbe7a2f44d5e64c627c715dd0bcec9bdf14</id>
<content type='text'>
There is a race between cpu hotplug handling and adding/deleting
gendisk for blk-mq, where both are trying to register and unregister
the same sysfs entries.

null_add_dev
    --&gt; blk_mq_init_queue
        --&gt; blk_mq_init_allocated_queue
            --&gt; add to 'all_q_list' (*)
    --&gt; add_disk
        --&gt; blk_register_queue
            --&gt; blk_mq_register_disk (++)

null_del_dev
    --&gt; del_gendisk
        --&gt; blk_unregister_queue
            --&gt; blk_mq_unregister_disk (--)
    --&gt; blk_cleanup_queue
        --&gt; blk_mq_free_queue
            --&gt; del from 'all_q_list' (*)

blk_mq_queue_reinit
    --&gt; blk_mq_sysfs_unregister (-)
    --&gt; blk_mq_sysfs_register (+)

While the request queue is added to 'all_q_list' (*),
blk_mq_queue_reinit() can be called for the queue anytime by CPU
hotplug callback.  But blk_mq_sysfs_unregister (-) and
blk_mq_sysfs_register (+) in blk_mq_queue_reinit must not be called
before blk_mq_register_disk (++) and after blk_mq_unregister_disk (--)
is finished.  Because '/sys/block/*/mq/' is not exists.

There has already been BLK_MQ_F_SYSFS_UP flag in hctx-&gt;flags which can
be used to track these sysfs stuff, but it is only fixing this issue
partially.

In order to fix it completely, we just need per-queue flag instead of
per-hctx flag with appropriate locking.  So this introduces
q-&gt;mq_sysfs_init_done which is properly protected with all_q_mutex.

Also, we need to ensure that blk_mq_map_swqueue() is called with
all_q_mutex is held.  Since hctx-&gt;nr_ctx is reset temporarily and
updated in blk_mq_map_swqueue(), so we should avoid
blk_mq_register_hctx() seeing the temporary hctx-&gt;nr_ctx value
in CPU hotplug handling or adding/deleting gendisk .

Signed-off-by: Akinobu Mita &lt;akinobu.mita@gmail.com&gt;
Reviewed-by: Ming Lei &lt;tom.leiming@gmail.com&gt;
Cc: Ming Lei &lt;tom.leiming@gmail.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blk-mq: Shared tag enhancements</title>
<updated>2015-06-01T20:35:56Z</updated>
<author>
<name>Keith Busch</name>
<email>keith.busch@intel.com</email>
</author>
<published>2015-06-01T15:29:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f26cdc8536ad50fb802a0445f836b4f94ca09ae7'/>
<id>urn:sha1:f26cdc8536ad50fb802a0445f836b4f94ca09ae7</id>
<content type='text'>
Storage controllers may expose multiple block devices that share hardware
resources managed by blk-mq. This patch enhances the shared tags so a
low-level driver can access the shared resources not tied to the unshared
h/w contexts. This way the LLD can dynamically add and delete disks and
request queues without having to track all the request_queue hctx's to
iterate outstanding tags.

Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>blk-mq: fix iteration of busy bitmap</title>
<updated>2015-04-17T14:31:12Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2015-04-17T14:28:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=569fd0ce96087283866ab8c438dac4bcf1738846'/>
<id>urn:sha1:569fd0ce96087283866ab8c438dac4bcf1738846</id>
<content type='text'>
Commit 889fa31f00b2 was a bit too eager in reducing the loop count,
so we ended up missing queues in some configurations. Ensure that
our division rounds up, so that's not the case.

Reported-by: Guenter Roeck &lt;linux@roeck-us.net&gt;
Fixes: 889fa31f00b2 ("blk-mq: reduce unnecessary software queue looping")
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
</feed>
