<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/fs/block_dev.c, branch v4.9.15</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.15</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.15'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2017-01-09T07:32:21Z</updated>
<entry>
<title>block: protect iterate_bdevs() against concurrent close</title>
<updated>2017-01-09T07:32:21Z</updated>
<author>
<name>Rabin Vincent</name>
<email>rabinv@axis.com</email>
</author>
<published>2016-12-01T08:18:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=11aa5c10102a425dde885b6b056e1fe354bcbb88'/>
<id>urn:sha1:11aa5c10102a425dde885b6b056e1fe354bcbb88</id>
<content type='text'>
commit af309226db916e2c6e08d3eba3fa5c34225200c4 upstream.

If a block device is closed while iterate_bdevs() is handling it, the
following NULL pointer dereference occurs because bdev-&gt;b_disk is NULL
in bdev_get_queue(), which is called from blk_get_backing_dev_info() (in
turn called by the mapping_cap_writeback_dirty() call in
__filemap_fdatawrite_range()):

 BUG: unable to handle kernel NULL pointer dereference at 0000000000000508
 IP: [&lt;ffffffff81314790&gt;] blk_get_backing_dev_info+0x10/0x20
 PGD 9e62067 PUD 9ee8067 PMD 0
 Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
 Modules linked in:
 CPU: 1 PID: 2422 Comm: sync Not tainted 4.5.0-rc7+ #400
 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996)
 task: ffff880009f4d700 ti: ffff880009f5c000 task.ti: ffff880009f5c000
 RIP: 0010:[&lt;ffffffff81314790&gt;]  [&lt;ffffffff81314790&gt;] blk_get_backing_dev_info+0x10/0x20
 RSP: 0018:ffff880009f5fe68  EFLAGS: 00010246
 RAX: 0000000000000000 RBX: ffff88000ec17a38 RCX: ffffffff81a4e940
 RDX: 7fffffffffffffff RSI: 0000000000000000 RDI: ffff88000ec176c0
 RBP: ffff880009f5fe68 R08: 0000000000000000 R09: 0000000000000000
 R10: 0000000000000001 R11: 0000000000000000 R12: ffff88000ec17860
 R13: ffffffff811b25c0 R14: ffff88000ec178e0 R15: ffff88000ec17a38
 FS:  00007faee505d700(0000) GS:ffff88000fb00000(0000) knlGS:0000000000000000
 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
 CR2: 0000000000000508 CR3: 0000000009e8a000 CR4: 00000000000006e0
 Stack:
  ffff880009f5feb8 ffffffff8112e7f5 0000000000000000 7fffffffffffffff
  0000000000000000 0000000000000000 7fffffffffffffff 0000000000000001
  ffff88000ec178e0 ffff88000ec17860 ffff880009f5fec8 ffffffff8112e81f
 Call Trace:
  [&lt;ffffffff8112e7f5&gt;] __filemap_fdatawrite_range+0x85/0x90
  [&lt;ffffffff8112e81f&gt;] filemap_fdatawrite+0x1f/0x30
  [&lt;ffffffff811b25d6&gt;] fdatawrite_one_bdev+0x16/0x20
  [&lt;ffffffff811bc402&gt;] iterate_bdevs+0xf2/0x130
  [&lt;ffffffff811b2763&gt;] sys_sync+0x63/0x90
  [&lt;ffffffff815d4272&gt;] entry_SYSCALL_64_fastpath+0x12/0x76
 Code: 0f 1f 44 00 00 48 8b 87 f0 00 00 00 55 48 89 e5 &lt;48&gt; 8b 80 08 05 00 00 5d
 RIP  [&lt;ffffffff81314790&gt;] blk_get_backing_dev_info+0x10/0x20
  RSP &lt;ffff880009f5fe68&gt;
 CR2: 0000000000000508
 ---[ end trace 2487336ceb3de62d ]---

The crash is easily reproducible by running the following command, if an
msleep(100) is inserted before the call to func() in iterate_devs():

 while :; do head -c1 /dev/nullb0; done &gt; /dev/null &amp; while :; do sync; done

Fix it by holding the bd_mutex across the func() call and only calling
func() if the bdev is opened.

Fixes: 5c0d6b60a0ba ("vfs: Create function for iterating over block devices")
Reported-and-tested-by: Wei Fang &lt;fangwei1@huawei.com&gt;
Signed-off-by: Rabin Vincent &lt;rabinv@axis.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>block_dev: don't test bdev-&gt;bd_contains when it is not stable</title>
<updated>2017-01-06T09:40:12Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.com</email>
</author>
<published>2016-12-12T15:21:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cfa2d65b2622d14b2f1368fbc9a6b4ab85141644'/>
<id>urn:sha1:cfa2d65b2622d14b2f1368fbc9a6b4ab85141644</id>
<content type='text'>
commit bcc7f5b4bee8e327689a4d994022765855c807ff upstream.

bdev-&gt;bd_contains is not stable before calling __blkdev_get().
When __blkdev_get() is called on a parition with -&gt;bd_openers == 0
it sets
  bdev-&gt;bd_contains = bdev;
which is not correct for a partition.
After a call to __blkdev_get() succeeds, -&gt;bd_openers will be &gt; 0
and then -&gt;bd_contains is stable.

When FMODE_EXCL is used, blkdev_get() calls
   bd_start_claiming() -&gt;  bd_prepare_to_claim() -&gt; bd_may_claim()

This call happens before __blkdev_get() is called, so -&gt;bd_contains
is not stable.  So bd_may_claim() cannot safely use -&gt;bd_contains.
It currently tries to use it, and this can lead to a BUG_ON().

This happens when a whole device is already open with a bd_holder (in
use by dm in my particular example) and two threads race to open a
partition of that device for the first time, one opening with O_EXCL and
one without.

The thread that doesn't use O_EXCL gets through blkdev_get() to
__blkdev_get(), gains the -&gt;bd_mutex, and sets bdev-&gt;bd_contains = bdev;

Immediately thereafter the other thread, using FMODE_EXCL, calls
bd_start_claiming() from blkdev_get().  This should fail because the
whole device has a holder, but because bdev-&gt;bd_contains == bdev
bd_may_claim() incorrectly reports success.
This thread continues and blocks on bd_mutex.

The first thread then sets bdev-&gt;bd_contains correctly and drops the mutex.
The thread using FMODE_EXCL then continues and when it calls bd_may_claim()
again in:
			BUG_ON(!bd_may_claim(bdev, whole, holder));
The BUG_ON fires.

Fix this by removing the dependency on -&gt;bd_contains in
bd_may_claim().  As bd_may_claim() has direct access to the whole
device, it can simply test if the target bdev is the whole device.

Fixes: 6b4517a7913a ("block: implement bd_claiming and claiming block")
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>block: implement (some of) fallocate for block devices</title>
<updated>2016-10-11T22:06:30Z</updated>
<author>
<name>Darrick J. Wong</name>
<email>darrick.wong@oracle.com</email>
</author>
<published>2016-10-11T20:51:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=25f4c41415e513f0e9fb1f3fce2ce98fcba8d263'/>
<id>urn:sha1:25f4c41415e513f0e9fb1f3fce2ce98fcba8d263</id>
<content type='text'>
After much discussion, it seems that the fallocate feature flag
FALLOC_FL_ZERO_RANGE maps nicely to SCSI WRITE SAME; and the feature
FALLOC_FL_PUNCH_HOLE maps nicely to the devices that have been whitelisted
for zeroing SCSI UNMAP.  Punch still requires that FALLOC_FL_KEEP_SIZE is
set.  A length that goes past the end of the device will be clamped to the
device size if KEEP_SIZE is set; or will return -EINVAL if not.  Both
start and length must be aligned to the device's logical block size.

Since the semantics of fallocate are fairly well established already, wire
up the two pieces.  The other fallocate variants (collapse range, insert
range, and allocate blocks) are not supported.

Link: http://lkml.kernel.org/r/147518379992.22791.8849838163218235007.stgit@birch.djwong.org
Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Reviewed-by: Hannes Reinecke &lt;hare@suse.com&gt;
Reviewed-by: Bart Van Assche &lt;bart.vanassche@sandisk.com&gt;
Cc: Theodore Ts'o &lt;tytso@mit.edu&gt;
Cc: Martin K. Petersen &lt;martin.petersen@oracle.com&gt;
Cc: Mike Snitzer &lt;snitzer@redhat.com&gt; # tweaked header
Cc: Brian Foster &lt;bfoster@redhat.com&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: Hannes Reinecke &lt;hare@suse.de&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>fs/block_dev.c: return the right error in thaw_bdev()</title>
<updated>2016-10-05T20:35:13Z</updated>
<author>
<name>Pierre Morel</name>
<email>pmorel@linux.vnet.ibm.com</email>
</author>
<published>2016-10-04T08:53:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=997198ba1ed691c09457120576c27dbd953d0557'/>
<id>urn:sha1:997198ba1ed691c09457120576c27dbd953d0557</id>
<content type='text'>
When triggering thaw-filesystems via magic sysrq, the system enters a
loop in do_thaw_one(), as thaw_bdev() still returns success if
bd_fsfreeze_count == 0. To fix this, let thaw_bdev() always return
error (and simplify the code a bit at the same time).

Reviewed-by: Eric Farman &lt;farman@linux.vnet.ibm.com&gt;
Reviewed-by: Cornelia Huck &lt;cornelia.huck@de.ibm.com&gt;
Signed-off-by: Pierre Morel &lt;pmorel@linux.vnet.ibm.com&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>block_dev: remove DAX leftovers</title>
<updated>2016-09-14T14:41:59Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2016-09-14T09:56:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=223757016837d5bc8546c5683e13fbafe6cb374d'/>
<id>urn:sha1:223757016837d5bc8546c5683e13fbafe6cb374d</id>
<content type='text'>
DAX support for block devices was removed in commits 03cdad
("block: disable block device DAX by default") and 99a01cd
("block: remove BLK_DEV_DAX config option"), but we still kept a call to
dax_do_io and some uneeded i_flags manipulations introduced in commit
bbab37 ("block: Add support for DAX reads/writes to block devices").

Remove those leftovers.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Johannes Thumshirn &lt;jthumshirn@suse.de&gt;
Acked-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>fs/block_dev: fix potential NULL ptr deref in freeze_bdev()</title>
<updated>2016-08-25T14:38:26Z</updated>
<author>
<name>Andrey Ryabinin</name>
<email>aryabinin@virtuozzo.com</email>
</author>
<published>2016-08-23T15:55:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5bb53c0fb8e0fc2e34287d5d0fcadc784de913e1'/>
<id>urn:sha1:5bb53c0fb8e0fc2e34287d5d0fcadc784de913e1</id>
<content type='text'>
Calling freeze_bdev() twice on the same block device without mounted
filesystem get_super() will return NULL, which will lead to NULL-ptr
dereference later in drop_super().

Check get_super() result to fix that.

Note, that this is a purely theoretical issue. We have only 3
freeze_bdev() callers. 2 of them are in filesystem code and used on a
device with mounted fs. The third one in lock_fs() has protection in
upper-layer code against freezing block device the second time without
thawing it first.

Signed-off-by: Andrey Ryabinin &lt;aryabinin@virtuozzo.com&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>bdev: fix NULL pointer dereference</title>
<updated>2016-08-22T14:06:15Z</updated>
<author>
<name>Vegard Nossum</name>
<email>vegard.nossum@oracle.com</email>
</author>
<published>2016-08-22T10:47:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e9e5e3fae8da7e237049e00e0bfc9e32fd808fe8'/>
<id>urn:sha1:e9e5e3fae8da7e237049e00e0bfc9e32fd808fe8</id>
<content type='text'>
I got this:

    kasan: GPF could be caused by NULL-ptr deref or user memory access
    general protection fault: 0000 [#1] PREEMPT SMP KASAN
    Dumping ftrace buffer:
       (ftrace buffer empty)
    CPU: 0 PID: 5505 Comm: syz-executor Not tainted 4.8.0-rc2+ #161
    Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
    task: ffff880113415940 task.stack: ffff880118350000
    RIP: 0010:[&lt;ffffffff8172cb32&gt;]  [&lt;ffffffff8172cb32&gt;] bd_mount+0x52/0xa0
    RSP: 0018:ffff880118357ca0  EFLAGS: 00010207
    RAX: dffffc0000000000 RBX: ffffffffffffffff RCX: ffffc90000bb6000
    RDX: 0000000000000018 RSI: ffffffff846d6b20 RDI: 00000000000000c7
    RBP: ffff880118357cb0 R08: ffff880115967c68 R09: 0000000000000000
    R10: 0000000000000000 R11: 0000000000000000 R12: ffff8801188211e8
    R13: ffffffff847baa20 R14: ffff8801139cb000 R15: 0000000000000080
    FS:  00007fa3ff6c0700(0000) GS:ffff88011aa00000(0000) knlGS:0000000000000000
    CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
    CR2: 00007fc1d8cc7e78 CR3: 0000000109f20000 CR4: 00000000000006f0
    DR0: 000000000000001e DR1: 000000000000001e DR2: 0000000000000000
    DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000600
    Stack:
     ffff880112cfd6c0 ffff8801188211e8 ffff880118357cf0 ffffffff8167f207
     ffffffff816d7a1e ffff880112a413c0 ffffffff847baa20 ffff8801188211e8
     0000000000000080 ffff880112cfd6c0 ffff880118357d38 ffffffff816dce0a
    Call Trace:
     [&lt;ffffffff8167f207&gt;] mount_fs+0x97/0x2e0
     [&lt;ffffffff816d7a1e&gt;] ? alloc_vfsmnt+0x55e/0x760
     [&lt;ffffffff816dce0a&gt;] vfs_kern_mount+0x7a/0x300
     [&lt;ffffffff83c3247c&gt;] ? _raw_read_unlock+0x2c/0x50
     [&lt;ffffffff816dfc87&gt;] do_mount+0x3d7/0x2730
     [&lt;ffffffff81235fd4&gt;] ? trace_do_page_fault+0x1f4/0x3a0
     [&lt;ffffffff816df8b0&gt;] ? copy_mount_string+0x40/0x40
     [&lt;ffffffff8161ea81&gt;] ? memset+0x31/0x40
     [&lt;ffffffff816df73e&gt;] ? copy_mount_options+0x1ee/0x320
     [&lt;ffffffff816e2a02&gt;] SyS_mount+0xb2/0x120
     [&lt;ffffffff816e2950&gt;] ? copy_mnt_ns+0x970/0x970
     [&lt;ffffffff81005524&gt;] do_syscall_64+0x1c4/0x4e0
     [&lt;ffffffff83c3282a&gt;] entry_SYSCALL64_slow_path+0x25/0x25
    Code: 83 e8 63 1b fc ff 48 85 c0 48 89 c3 74 4c e8 56 35 d1 ff 48 8d bb c8 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 &lt;80&gt; 3c 02 00 75 36 4c 8b a3 c8 00 00 00 48 b8 00 00 00 00 00 fc
    RIP  [&lt;ffffffff8172cb32&gt;] bd_mount+0x52/0xa0
     RSP &lt;ffff880118357ca0&gt;
    ---[ end trace 13690ad962168b98 ]---

mount_pseudo() returns ERR_PTR(), not NULL, on error.

Fixes: 3684aa7099e0 ("block-dev: enable writeback cgroup support")
Cc: Shaohua Li &lt;shli@fb.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Jens Axboe &lt;axboe@fb.com&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Vegard Nossum &lt;vegard.nossum@oracle.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>block/mm: make bdev_ops-&gt;rw_page() take a bool for read/write</title>
<updated>2016-08-07T20:41:02Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@fb.com</email>
</author>
<published>2016-08-05T14:11:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c11f0c0b5bb949673e4fc16c742f0316ae4ced20'/>
<id>urn:sha1:c11f0c0b5bb949673e4fc16c742f0316ae4ced20</id>
<content type='text'>
Commit abf545484d31 changed it from an 'rw' flags type to the
newer ops based interface, but now we're effectively leaking
some bdev internals to the rest of the kernel. Since we only
care about whether it's a read or a write at that level, just
pass in a bool 'is_write' parameter instead.

Then we can also move op_is_write() and friends back under
CONFIG_BLOCK protection.

Reviewed-by: Mike Christie &lt;mchristi@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>mm/block: convert rw_page users to bio op use</title>
<updated>2016-08-04T20:25:33Z</updated>
<author>
<name>Mike Christie</name>
<email>mchristi@redhat.com</email>
</author>
<published>2016-08-04T20:23:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=abf545484d31b68777a85c5c8f5b4bcde08283eb'/>
<id>urn:sha1:abf545484d31b68777a85c5c8f5b4bcde08283eb</id>
<content type='text'>
The rw_page users were not converted to use bio/req ops. As a result
bdev_write_page is not passing down REQ_OP_WRITE and the IOs will
be sent down as reads.

Signed-off-by: Mike Christie &lt;mchristi@redhat.com&gt;
Fixes: 4e1b2d52a80d ("block, fs, drivers: remove REQ_OP compat defs and related code")

Modified by me to:

1) Drop op_flags passing into -&gt;rw_page(), as we don't use it.
2) Make op_is_write() and friends safe to use for !CONFIG_BLOCK

Signed-off-by: Jens Axboe &lt;axboe@fb.com&gt;
</content>
</entry>
<entry>
<title>block: remove BLK_DEV_DAX config option</title>
<updated>2016-08-04T12:50:07Z</updated>
<author>
<name>Ross Zwisler</name>
<email>ross.zwisler@linux.intel.com</email>
</author>
<published>2016-08-03T20:46:15Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=99a01cdf9d572e91c2805a2b89d459797864e3ca'/>
<id>urn:sha1:99a01cdf9d572e91c2805a2b89d459797864e3ca</id>
<content type='text'>
The functionality for block device DAX was already removed with commit
acc93d30d7d4 ("Revert "block: enable dax for raw block devices"")

However, we still had a config option hanging around that was always
disabled because it depended on CONFIG_BROKEN.  This config option was
introduced in commit 03cdadb04077 ("block: disable block device DAX by
default")

This change reverts that commit, removing the dead config option.

Link: http://lkml.kernel.org/r/20160729182314.6368-1-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler &lt;ross.zwisler@linux.intel.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Acked-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
