user/sven/linux.git/block/blk-core.c, branch v3.2.29

block: add blk_queue_dead()

2012-08-02T13:37:54Z

commit 34f6055c80285e4efb3f602a9119db75239744dc upstream. There are a number of QUEUE_FLAG_DEAD tests. Add blk_queue_dead() macro and use it. This patch doesn't introduce any functional difference. Signed-off-by: Tejun Heo Signed-off-by: Jens Axboe Signed-off-by: Ben Hutchings

block: don't kick empty queue in blk_drain_queue()

2011-12-15T19:03:04Z

While probing, fd sets up queue, probes hardware and tears down the queue if probing fails. In the process, blk_drain_queue() kicks the queue which failed to finish initialization and fd is unhappy about that. floppy0: no floppy controllers found ------------[ cut here ]------------ WARNING: at drivers/block/floppy.c:2929 do_fd_request+0xbf/0xd0() Hardware name: To Be Filled By O.E.M. VFS: do_fd_request called on non-open device Modules linked in: Pid: 1, comm: swapper Not tainted 3.2.0-rc4-00077-g5983fe2 #2 Call Trace: [] warn_slowpath_common+0x7a/0xb0 [] warn_slowpath_fmt+0x41/0x50 [] do_fd_request+0xbf/0xd0 [] blk_drain_queue+0x65/0x80 [] blk_cleanup_queue+0xe3/0x1a0 [] floppy_init+0xdeb/0xe28 [] ? daring+0x6b/0x6b [] do_one_initcall+0x3f/0x170 [] kernel_init+0x9d/0x11e [] ? schedule_tail+0x22/0xa0 [] kernel_thread_helper+0x4/0x10 [] ? start_kernel+0x2be/0x2be [] ? gs_change+0xb/0xb Avoid it by making blk_drain_queue() kick queue iff dispatch queue has something on it. Signed-off-by: Tejun Heo Reported-by: Ralf Hildebrandt Reported-by: Wu Fengguang Tested-by: Sergei Trofimovich Signed-off-by: Jens Axboe

block: initialize request_queue's numa node during

2011-11-23T09:59:13Z

struct request_queue is allocated with __GFP_ZERO so its "node" field is zero before initialization. This causes an oops if node 0 is offline in the page allocator because its zonelists are not initialized. From Dave Young's dmesg: SRAT: Node 1 PXM 2 0-d0000000 SRAT: Node 1 PXM 2 100000000-330000000 SRAT: Node 0 PXM 1 330000000-630000000 Initmem setup node 1 0000000000000000-000000000affb000 ... Built 1 zonelists in Node order, mobility grouping on. ... BUG: unable to handle kernel paging request at 0000000000001c08 IP: [] __alloc_pages_nodemask+0xb5/0x870 and __alloc_pages_nodemask+0xb5 translates to a NULL pointer on zonelist->_zonerefs. The fix is to initialize q->node at the time of allocation so the correct node is passed to the slab allocator later. Since blk_init_allocated_queue_node() is no longer needed, merge it with blk_init_allocated_queue(). [rientjes@google.com: changelog, initializing q->node] Cc: stable@vger.kernel.org [2.6.37+] Reported-by: Dave Young Signed-off-by: Mike Snitzer Signed-off-by: David Rientjes Tested-by: Dave Young Signed-off-by: Jens Axboe

block: add missed trace_block_plug

2011-11-16T08:21:50Z

After flush plug list, the list has no request, so we need to add a trace_block_plug(). Signed-off-by: Shaohua Li Reviewed-by: Namhyung Kim Signed-off-by: Andrew Morton Signed-off-by: Jens Axboe

block: avoid unnecessary plug list flush

2011-11-16T08:21:50Z

get_request_wait() could sleep and flush the plug list. If the list is already flushed, don't flush again. Signed-off-by: Shaohua Li Reviewed-by: Namhyung Kim Signed-off-by: Andrew Morton Signed-off-by: Jens Axboe

block: don't call blk_drain_queue() if elevator is not up

2011-11-03T17:52:11Z

blk_cleanup_queue() may be called before elevator is set up on a queue which triggers the following oops. BUG: unable to handle kernel NULL pointer dereference at (null) IP: [] elv_drain_elevator+0x1c/0x70 ... Pid: 830, comm: kworker/0:2 Not tainted 3.1.0-next-20111025_64+ #1590 Bochs Bochs RIP: 0010:[] [] elv_drain_elevator+0x1c/0x70 ... Call Trace: [] blk_drain_queue+0x42/0x70 [] blk_cleanup_queue+0xd0/0x1c0 [] md_free+0x50/0x70 [] kobject_release+0x8b/0x1d0 [] kref_put+0x36/0xa0 [] kobject_put+0x27/0x60 [] mddev_delayed_delete+0x2f/0x40 [] process_one_work+0x100/0x3b0 [] worker_thread+0x15f/0x3a0 [] kthread+0x87/0x90 [] kernel_thread_helper+0x4/0x10 Fix it by making blk_cleanup_queue() check whether q->elevator is set up before invoking blk_drain_queue. Signed-off-by: Tejun Heo Reported-and-tested-by: Jiri Slaby Signed-off-by: Jens Axboe

Merge branch 'for-linus' into for-3.2/core

2011-10-24T14:24:38Z

blk-flush: move the queue kick into

2011-10-24T14:24:31Z

A dm-multipath user reported[1] a problem when trying to boot a kernel with commit 4853abaae7e4a2af938115ce9071ef8684fb7af4 (block: fix flush machinery for stacking drivers with differring flush flags) applied. It turns out that an empty flush request can be sent into blk_insert_flush. When the BUG_ON was fixed to allow for this, I/O on the underlying device would stall. The reason is that blk_insert_cloned_request does not kick the queue. In the aforementioned commit, I had added a special case to kick the queue if data was sent down but the queue flags did not require a flush. A better solution is to push the queue kick up into blk_insert_cloned_request. This patch, along with a follow-on which fixes the BUG_ON, fixes the issue reported. [1] http://www.redhat.com/archives/dm-devel/2011-September/msg00154.html Reported-by: Christophe Saout Signed-off-by: Jeff Moyer Acked-by: Tejun Heo Stable note: 3.1 Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe

block: Remove the control of complete cpu from bio.

2011-10-24T14:11:30Z

bio originally has the functionality to set the complete cpu, but it is broken. Chirstoph said that "This code is unused, and from the all the discussions lately pretty obviously broken. The only thing keeping it serves is creating more confusion and possibly more bugs." And Jens replied with "We can kill bio_set_completion_cpu(). I'm fine with leaving cpu control to the request based drivers, they are the only ones that can toggle the setting anyway". So this patch tries to remove all the work of controling complete cpu from a bio. Cc: Shaohua Li Cc: Christoph Hellwig Signed-off-by: Tao Ma Signed-off-by: Jens Axboe

block: fix request_queue lifetime handling by making blk_queue_cleanup() properly shutdown

2011-10-19T12:42:16Z

request_queue is refcounted but actually depdends on lifetime management from the queue owner - on blk_cleanup_queue(), block layer expects that there's no request passing through request_queue and no new one will. This is fundamentally broken. The queue owner (e.g. SCSI layer) doesn't have a way to know whether there are other active users before calling blk_cleanup_queue() and other users (e.g. bsg) don't have any guarantee that the queue is and would stay valid while it's holding a reference. With delay added in blk_queue_bio() before queue_lock is grabbed, the following oops can be easily triggered when a device is removed with in-flight IOs. sd 0:0:1:0: [sdb] Stopping disk ata1.01: disabled general protection fault: 0000 [#1] PREEMPT SMP CPU 2 Modules linked in: Pid: 648, comm: test_rawio Not tainted 3.1.0-rc3-work+ #56 Bochs Bochs RIP: 0010:[] [] elv_rqhash_find+0x61/0x100 ... Process test_rawio (pid: 648, threadinfo ffff880019efa000, task ffff880019ef8a80) ... Call Trace: [] elv_merge+0x84/0xe0 [] blk_queue_bio+0xf4/0x400 [] generic_make_request+0xca/0x100 [] submit_bio+0x74/0x100 [] dio_bio_submit+0xbc/0xc0 [] __blockdev_direct_IO+0x92e/0xb40 [] blkdev_direct_IO+0x57/0x60 [] generic_file_aio_read+0x6d5/0x760 [] do_sync_read+0xda/0x120 [] vfs_read+0xc5/0x180 [] sys_pread64+0x9a/0xb0 [] system_call_fastpath+0x16/0x1b This happens because blk_queue_cleanup() destroys the queue and elevator whether IOs are in progress or not and DEAD tests are sprinkled in the request processing path without proper synchronization. Similar problem exists for blk-throtl. On queue cleanup, blk-throtl is shutdown whether it has requests in it or not. Depending on timing, it either oopses or throttled bios are lost putting tasks which are waiting for bio completion into eternal D state. The way it should work is having the usual clear distinction between shutdown and release. Shutdown drains all currently pending requests, marks the queue dead, and performs partial teardown of the now unnecessary part of the queue. Even after shutdown is complete, reference holders are still allowed to issue requests to the queue although they will be immmediately failed. The rest of teardown happens on release. This patch makes the following changes to make blk_queue_cleanup() behave as proper shutdown. * QUEUE_FLAG_DEAD is now set while holding both q->exit_mutex and queue_lock. * Unsynchronized DEAD check in generic_make_request_checks() removed. This couldn't make any meaningful difference as the queue could die after the check. * blk_drain_queue() updated such that it can drain all requests and is now called during cleanup. * blk_throtl updated such that it checks DEAD on grabbing queue_lock, drains all throttled bios during cleanup and free td when queue is released. Signed-off-by: Tejun Heo Cc: Vivek Goyal Signed-off-by: Jens Axboe