<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/blkdev.h, branch v3.16.74</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.16.74</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.16.74'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-12-16T22:08:26Z</updated>
<entry>
<title>block: move bio_integrity_{intervals,bytes} into blkdev.h</title>
<updated>2018-12-16T22:08:26Z</updated>
<author>
<name>Greg Edwards</name>
<email>gedwards@ddn.com</email>
</author>
<published>2018-07-25T14:22:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ed1a4f65447b5475afaf2f13a74cddf332e3e5ad'/>
<id>urn:sha1:ed1a4f65447b5475afaf2f13a74cddf332e3e5ad</id>
<content type='text'>
commit 359f642700f2ff05d9c94cd9216c97af7b8e9553 upstream.

This allows bio_integrity_bytes() to be called from drivers instead of
open coding it.

Acked-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Greg Edwards &lt;gedwards@ddn.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
[bwh: Backported to 3.16: bio_integrity_intervals() was called
 bio_integrity_hw_sectors() and had a different implementation.  Move it
 without renaming.]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>block: Fix transfer when chunk sectors exceeds max</title>
<updated>2018-11-20T18:05:29Z</updated>
<author>
<name>Keith Busch</name>
<email>keith.busch@intel.com</email>
</author>
<published>2018-06-26T15:14:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b78fafe4e865f4f0f8345e664cdb4cf170847f4a'/>
<id>urn:sha1:b78fafe4e865f4f0f8345e664cdb4cf170847f4a</id>
<content type='text'>
commit 15bfd21fbc5d35834b9ea383dc458a1f0c9e3434 upstream.

A device may have boundary restrictions where the number of sectors
between boundaries exceeds its max transfer size. In this case, we need
to cap the max size to the smaller of the two limits.

Reported-by: Jitendra Bhivare &lt;jitendra.bhivare@broadcom.com&gt;
Tested-by: Jitendra Bhivare &lt;jitendra.bhivare@broadcom.com&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Keith Busch &lt;keith.busch@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>blk-mq: fix race between timeout and freeing request</title>
<updated>2018-03-03T15:52:29Z</updated>
<author>
<name>Ming Lei</name>
<email>ming.lei@canonical.com</email>
</author>
<published>2015-08-09T07:41:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7acba7c0621efdfb09bb514500ba22f965aba68b'/>
<id>urn:sha1:7acba7c0621efdfb09bb514500ba22f965aba68b</id>
<content type='text'>
commit 0048b4837affd153897ed1222283492070027aa9 upstream.

Inside timeout handler, blk_mq_tag_to_rq() is called
to retrieve the request from one tag. This way is obviously
wrong because the request can be freed any time and some
fiedds of the request can't be trusted, then kernel oops
might be triggered[1].

Currently wrt. blk_mq_tag_to_rq(), the only special case is
that the flush request can share same tag with the request
cloned from, and the two requests can't be active at the same
time, so this patch fixes the above issue by updating tags-&gt;rqs[tag]
with the active request(either flush rq or the request cloned
from) of the tag.

Also blk_mq_tag_to_rq() gets much simplified with this patch.

Given blk_mq_tag_to_rq() is mainly for drivers and the caller must
make sure the request can't be freed, so in bt_for_each() this
helper is replaced with tags-&gt;rqs[tag].

[1] kernel oops log
[  439.696220] BUG: unable to handle kernel NULL pointer dereference at 0000000000000158^M
[  439.697162] IP: [&lt;ffffffff812d89ba&gt;] blk_mq_tag_to_rq+0x21/0x6e^M
[  439.700653] PGD 7ef765067 PUD 7ef764067 PMD 0 ^M
[  439.700653] Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC ^M
[  439.700653] Dumping ftrace buffer:^M
[  439.700653]    (ftrace buffer empty)^M
[  439.700653] Modules linked in: nbd ipv6 kvm_intel kvm serio_raw^M
[  439.700653] CPU: 6 PID: 2779 Comm: stress-ng-sigfd Not tainted 4.2.0-rc5-next-20150805+ #265^M
[  439.730500] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011^M
[  439.730500] task: ffff880605308000 ti: ffff88060530c000 task.ti: ffff88060530c000^M
[  439.730500] RIP: 0010:[&lt;ffffffff812d89ba&gt;]  [&lt;ffffffff812d89ba&gt;] blk_mq_tag_to_rq+0x21/0x6e^M
[  439.730500] RSP: 0018:ffff880819203da0  EFLAGS: 00010283^M
[  439.730500] RAX: ffff880811b0e000 RBX: ffff8800bb465f00 RCX: 0000000000000002^M
[  439.730500] RDX: 0000000000000000 RSI: 0000000000000202 RDI: 0000000000000000^M
[  439.730500] RBP: ffff880819203db0 R08: 0000000000000002 R09: 0000000000000000^M
[  439.730500] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000202^M
[  439.730500] R13: ffff880814104800 R14: 0000000000000002 R15: ffff880811a2ea00^M
[  439.730500] FS:  00007f165b3f5740(0000) GS:ffff880819200000(0000) knlGS:0000000000000000^M
[  439.730500] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b^M
[  439.730500] CR2: 0000000000000158 CR3: 00000007ef766000 CR4: 00000000000006e0^M
[  439.730500] Stack:^M
[  439.730500]  0000000000000008 ffff8808114eed90 ffff880819203e00 ffffffff812dc104^M
[  439.755663]  ffff880819203e40 ffffffff812d9f5e 0000020000000000 ffff8808114eed80^M
[  439.755663] Call Trace:^M
[  439.755663]  &lt;IRQ&gt; ^M
[  439.755663]  [&lt;ffffffff812dc104&gt;] bt_for_each+0x6e/0xc8^M
[  439.755663]  [&lt;ffffffff812d9f5e&gt;] ? blk_mq_rq_timed_out+0x6a/0x6a^M
[  439.755663]  [&lt;ffffffff812d9f5e&gt;] ? blk_mq_rq_timed_out+0x6a/0x6a^M
[  439.755663]  [&lt;ffffffff812dc1b3&gt;] blk_mq_tag_busy_iter+0x55/0x5e^M
[  439.755663]  [&lt;ffffffff812d88b4&gt;] ? blk_mq_bio_to_request+0x38/0x38^M
[  439.755663]  [&lt;ffffffff812d8911&gt;] blk_mq_rq_timer+0x5d/0xd4^M
[  439.755663]  [&lt;ffffffff810a3e10&gt;] call_timer_fn+0xf7/0x284^M
[  439.755663]  [&lt;ffffffff810a3d1e&gt;] ? call_timer_fn+0x5/0x284^M
[  439.755663]  [&lt;ffffffff812d88b4&gt;] ? blk_mq_bio_to_request+0x38/0x38^M
[  439.755663]  [&lt;ffffffff810a46d6&gt;] run_timer_softirq+0x1ce/0x1f8^M
[  439.755663]  [&lt;ffffffff8104c367&gt;] __do_softirq+0x181/0x3a4^M
[  439.755663]  [&lt;ffffffff8104c76e&gt;] irq_exit+0x40/0x94^M
[  439.755663]  [&lt;ffffffff81031482&gt;] smp_apic_timer_interrupt+0x33/0x3e^M
[  439.755663]  [&lt;ffffffff815559a4&gt;] apic_timer_interrupt+0x84/0x90^M
[  439.755663]  &lt;EOI&gt; ^M
[  439.755663]  [&lt;ffffffff81554350&gt;] ? _raw_spin_unlock_irq+0x32/0x4a^M
[  439.755663]  [&lt;ffffffff8106a98b&gt;] finish_task_switch+0xe0/0x163^M
[  439.755663]  [&lt;ffffffff8106a94d&gt;] ? finish_task_switch+0xa2/0x163^M
[  439.755663]  [&lt;ffffffff81550066&gt;] __schedule+0x469/0x6cd^M
[  439.755663]  [&lt;ffffffff8155039b&gt;] schedule+0x82/0x9a^M
[  439.789267]  [&lt;ffffffff8119b28b&gt;] signalfd_read+0x186/0x49a^M
[  439.790911]  [&lt;ffffffff8106d86a&gt;] ? wake_up_q+0x47/0x47^M
[  439.790911]  [&lt;ffffffff811618c2&gt;] __vfs_read+0x28/0x9f^M
[  439.790911]  [&lt;ffffffff8117a289&gt;] ? __fget_light+0x4d/0x74^M
[  439.790911]  [&lt;ffffffff811620a7&gt;] vfs_read+0x7a/0xc6^M
[  439.790911]  [&lt;ffffffff8116292b&gt;] SyS_read+0x49/0x7f^M
[  439.790911]  [&lt;ffffffff81554c17&gt;] entry_SYSCALL_64_fastpath+0x12/0x6f^M
[  439.790911] Code: 48 89 e5 e8 a9 b8 e7 ff 5d c3 0f 1f 44 00 00 55 89
f2 48 89 e5 41 54 41 89 f4 53 48 8b 47 60 48 8b 1c d0 48 8b 7b 30 48 8b
53 38 &lt;48&gt; 8b 87 58 01 00 00 48 85 c0 75 09 48 8b 97 88 0c 00 00 eb 10
^M
[  439.790911] RIP  [&lt;ffffffff812d89ba&gt;] blk_mq_tag_to_rq+0x21/0x6e^M
[  439.790911]  RSP &lt;ffff880819203da0&gt;^M
[  439.790911] CR2: 0000000000000158^M
[  439.790911] ---[ end trace d40af58949325661 ]---^M

Signed-off-by: Ming Lei &lt;ming.lei@canonical.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
[bwh: Backported to 3.16:
 - Flush state is in struct request_queue, not struct blk_flush_queue
 - Flush request cloning is done in blk_mq_clone_flush_request() rather
   than blk_kick_flush()
 - Drop changes in bt{,_tags}_for_each()
 - Adjust filename, context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>blktrace: Fix potential deadlock between delete &amp; sysfs ops</title>
<updated>2018-02-13T18:42:21Z</updated>
<author>
<name>Waiman Long</name>
<email>longman@redhat.com</email>
</author>
<published>2017-09-20T19:12:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f3f6031fd1a4db8f195e4b57b390b3bb5baa4f14'/>
<id>urn:sha1:f3f6031fd1a4db8f195e4b57b390b3bb5baa4f14</id>
<content type='text'>
commit 5acb3cc2c2e9d3020a4fee43763c6463767f1572 upstream.

The lockdep code had reported the following unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(s_active#228);
                               lock(&amp;bdev-&gt;bd_mutex/1);
                               lock(s_active#228);
  lock(&amp;bdev-&gt;bd_mutex);

 *** DEADLOCK ***

The deadlock may happen when one task (CPU1) is trying to delete a
partition in a block device and another task (CPU0) is accessing
tracing sysfs file (e.g. /sys/block/dm-1/trace/act_mask) in that
partition.

The s_active isn't an actual lock. It is a reference count (kn-&gt;count)
on the sysfs (kernfs) file. Removal of a sysfs file, however, require
a wait until all the references are gone. The reference count is
treated like a rwsem using lockdep instrumentation code.

The fact that a thread is in the sysfs callback method or in the
ioctl call means there is a reference to the opended sysfs or device
file. That should prevent the underlying block structure from being
removed.

Instead of using bd_mutex in the block_device structure, a new
blk_trace_mutex is now added to the request_queue structure to protect
access to the blk_trace structure.

Suggested-by: Christoph Hellwig &lt;hch@infradead.org&gt;
Signed-off-by: Waiman Long &lt;longman@redhat.com&gt;
Acked-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;

Fix typo in patch subject line, and prune a comment detailing how
the code used to work.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>blk: rq_data_dir() should not return a boolean</title>
<updated>2017-04-04T21:21:52Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2015-05-27T22:32:15Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5b9e032567e1a762bd86d1c7d4d6cf325b024c33'/>
<id>urn:sha1:5b9e032567e1a762bd86d1c7d4d6cf325b024c33</id>
<content type='text'>
commit 10fbd36e362a0f367e34a7cd876a81295d8fc5ca upstream.

rq_data_dir() returns either READ or WRITE (0 == READ, 1 == WRITE), not
a boolean value.

Now, admittedly the "!= 0" doesn't really change the value (0 stays as
zero, 1 stays as one), but it's not only redundant, it confuses gcc, and
causes gcc to warn about the construct

    switch (rq_data_dir(req)) {
        case READ:
            ...
        case WRITE:
            ...

that we have in a few drivers.

Now, the gcc warning is silly and stupid (it seems to warn not about the
switch value having a different type from the case statements, but about
_any_ boolean switch value), but in this case the code itself is silly
and stupid too, so let's just change it, and get rid of warnings like
this:

  drivers/block/hd.c: In function ‘hd_request’:
  drivers/block/hd.c:630:11: warning: switch condition has boolean value [-Wswitch-bool]
     switch (rq_data_dir(req)) {

The odd '!= 0' came in when "cmd_flags" got turned into a "u64" in
commit 5953316dbf90 ("block: make rq-&gt;cmd_flags be 64-bit") and is
presumably because the old code (that just did a logical 'and' with 1)
would then end up making the type of rq_data_dir() be u64 too.

But if we want to retain the old regular integer type, let's just cast
the result to 'int' rather than use that rather odd '!= 0'.

Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Arnd Bergmann &lt;arnd@arndb.de&gt;
</content>
</entry>
<entry>
<title>block: Always check queue limits for cloned requests</title>
<updated>2016-01-05T11:22:20Z</updated>
<author>
<name>Hannes Reinecke</name>
<email>hare@suse.de</email>
</author>
<published>2015-11-26T07:46:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ad2e17a773450dedc1f779061d07fc07407918a1'/>
<id>urn:sha1:ad2e17a773450dedc1f779061d07fc07407918a1</id>
<content type='text'>
commit bf4e6b4e757488dee1b6a581f49c7ac34cd217f8 upstream.

When a cloned request is retried on other queues it always needs
to be checked against the queue limits of that queue.
Otherwise the calculations for nr_phys_segments might be wrong,
leading to a crash in scsi_init_sgtable().

To clarify this the patch renames blk_rq_check_limits()
to blk_cloned_rq_check_limits() and removes the symbol
export, as the new function should only be used for
cloned requests and never exported.

Cc: Mike Snitzer &lt;snitzer@redhat.com&gt;
Cc: Ewan Milne &lt;emilne@redhat.com&gt;
Cc: Jeff Moyer &lt;jmoyer@redhat.com&gt;
Signed-off-by: Hannes Reinecke &lt;hare@suse.de&gt;
Fixes: e2a60da74 ("block: Clean up special command handling logic")
Acked-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
</entry>
<entry>
<title>block: fix alignment_offset math that assumes io_min is a power-of-2</title>
<updated>2014-11-17T14:11:44Z</updated>
<author>
<name>Mike Snitzer</name>
<email>snitzer@redhat.com</email>
</author>
<published>2014-10-08T22:26:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4296b616583a4cb1d4d1d38e2ddb982738a4b446'/>
<id>urn:sha1:4296b616583a4cb1d4d1d38e2ddb982738a4b446</id>
<content type='text'>
commit b8839b8c55f3fdd60dc36abcda7e0266aff7985c upstream.

The math in both blk_stack_limits() and queue_limit_alignment_offset()
assume that a block device's io_min (aka minimum_io_size) is always a
power-of-2.  Fix the math such that it works for non-power-of-2 io_min.

This issue (of alignment_offset != 0) became apparent when testing
dm-thinp with a thinp blocksize that matches a RAID6 stripesize of
1280K.  Commit fdfb4c8c1 ("dm thin: set minimum_io_size to pool's data
block size") unlocked the potential for alignment_offset != 0 due to
the dm-thin-pool's io_min possibly being a non-power-of-2.

Signed-off-by: Mike Snitzer &lt;snitzer@redhat.com&gt;
Acked-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
</entry>
<entry>
<title>Revert "block: all blk-mq requests are tagged"</title>
<updated>2014-11-13T11:48:49Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2014-10-19T15:13:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2b8ffed36eff00fb2b942096046366d87ce8b549'/>
<id>urn:sha1:2b8ffed36eff00fb2b942096046366d87ce8b549</id>
<content type='text'>
commit e999dbc254044e8d2a5818d92d205f65bae28f37 upstream.

This reverts commit fb3ccb5da71273e7f0d50b50bc879e50cedd60e7.

SCSI-2/SPI actually needs the tagged/untagged flag in the request to
work properly.  Revert this patch and add a follow on to set it in
the right place.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Acked-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Reported-by: Meelis Roos &lt;mroos@linux.ee&gt;
Tested-by: Meelis Roos &lt;mroos@linux.ee&gt;
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
</entry>
<entry>
<title>block: add support for limiting gaps in SG lists</title>
<updated>2014-06-24T22:22:24Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2014-06-24T22:22:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=66cb45aa41315d1d9972cada354fbdf7870d7714'/>
<id>urn:sha1:66cb45aa41315d1d9972cada354fbdf7870d7714</id>
<content type='text'>
Another restriction inherited for NVMe - those devices don't support
SG lists that have "gaps" in them. Gaps refers to cases where the
previous SG entry doesn't end on a page boundary. For NVMe, all SG
entries must start at offset 0 (except the first) and end on a page
boundary (except the last).

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>block: blk_max_size_offset() should check -&gt;max_sectors</title>
<updated>2014-06-18T05:12:02Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2014-06-18T05:09:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=736ed4de766d4f0e8e6142dd4f9d73ef61835ed9'/>
<id>urn:sha1:736ed4de766d4f0e8e6142dd4f9d73ef61835ed9</id>
<content type='text'>
Commit 762380ad9322 inadvertently changed a check for max_sectors
to max_hw_sectors. Revert that part, so we still compare against
max_sectors.

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
</feed>
