<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/block, branch v3.5.7</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.5.7</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.5.7'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2012-08-15T14:52:46Z</updated>
<entry>
<title>block: uninitialized ioc-&gt;nr_tasks triggers WARN_ON</title>
<updated>2012-08-15T14:52:46Z</updated>
<author>
<name>Olof Johansson</name>
<email>olof@lixom.net</email>
</author>
<published>2012-08-01T10:17:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bd60cd238b0b3056cc610fad8f52fa5f0e6bdb38'/>
<id>urn:sha1:bd60cd238b0b3056cc610fad8f52fa5f0e6bdb38</id>
<content type='text'>
commit 4638a83e8615de9c16c39dfed234951d0f468cf1 upstream.

Hi,

I'm using the old-fashioned 'dump' backup tool, and I noticed that it spews the
below warning as of 3.5-rc1 and later (3.4 is fine):

[   10.886893] ------------[ cut here ]------------
[   10.886904] WARNING: at include/linux/iocontext.h:140 copy_process+0x1488/0x1560()
[   10.886905] Hardware name: Bochs
[   10.886906] Modules linked in:
[   10.886908] Pid: 2430, comm: dump Not tainted 3.5.0-rc7+ #27
[   10.886908] Call Trace:
[   10.886911]  [&lt;ffffffff8107ce8a&gt;] warn_slowpath_common+0x7a/0xb0
[   10.886912]  [&lt;ffffffff8107ced5&gt;] warn_slowpath_null+0x15/0x20
[   10.886913]  [&lt;ffffffff8107c088&gt;] copy_process+0x1488/0x1560
[   10.886914]  [&lt;ffffffff8107c244&gt;] do_fork+0xb4/0x340
[   10.886918]  [&lt;ffffffff8108effa&gt;] ? recalc_sigpending+0x1a/0x50
[   10.886919]  [&lt;ffffffff8108f6b2&gt;] ? __set_task_blocked+0x32/0x80
[   10.886920]  [&lt;ffffffff81091afa&gt;] ? __set_current_blocked+0x3a/0x60
[   10.886923]  [&lt;ffffffff81051db3&gt;] sys_clone+0x23/0x30
[   10.886925]  [&lt;ffffffff8179bd73&gt;] stub_clone+0x13/0x20
[   10.886927]  [&lt;ffffffff8179baa2&gt;] ? system_call_fastpath+0x16/0x1b
[   10.886928] ---[ end trace 32a14af7ee6a590b ]---

Reproducing is easy, I can hit it on a KVM system with a very basic
config (x86_64 make defconfig + enable the drivers needed). To hit it,
just install dump (on debian/ubuntu, not sure what the package might be
called on Fedora), and:

dump -o -f /tmp/foo /

You'll see the warning in dmesg once it forks off the I/O process and
starts dumping filesystem contents.

I bisected it down to the following commit:

commit f6e8d01bee036460e03bd4f6a79d014f98ba712e
Author: Tejun Heo &lt;tj@kernel.org&gt;
Date:   Mon Mar 5 13:15:26 2012 -0800

    block: add io_context-&gt;active_ref

    Currently ioc-&gt;nr_tasks is used to decide two things - whether an ioc
    is done issuing IOs and whether it's shared by multiple tasks.  This
    patch separates out the first into ioc-&gt;active_ref, which is acquired
    and released using {get|put}_io_context_active() respectively.

    This will be used to associate bio's with a given task.  This patch
    doesn't introduce any visible behavior change.

    Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
    Cc: Vivek Goyal &lt;vgoyal@redhat.com&gt;
    Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;

It seems like the init of ioc-&gt;nr_tasks was removed in that patch,
so it starts out at 0 instead of 1.

Tejun, is the right thing here to add back the init, or should something else
be done?

The below patch removes the warning, but I haven't done any more extensive
testing on it.

Signed-off-by: Olof Johansson &lt;olof@lixom.net&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>scsi: Silence unnecessary warnings about ioctl to partition</title>
<updated>2012-06-15T10:52:46Z</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2012-06-15T10:52:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6d9359280753d2955f86d6411047516a9431eb51'/>
<id>urn:sha1:6d9359280753d2955f86d6411047516a9431eb51</id>
<content type='text'>
Sometimes, warnings about ioctls to partition happen often enough that they
form the majority of the warnings in the kernel log and users complain. In some
cases warnings are about ioctls such as SG_IO so it's not good to get rid of
the warnings completely as they can ease debugging of userspace problems
when ioctl is refused.

Since I have seen warnings from lots of commands, including some proprietary
userspace applications, I don't think disallowing the ioctls for processes
with CAP_SYS_RAWIO will happen in the near future if ever. So let's just
stop warning for processes with CAP_SYS_RAWIO for which ioctl is allowed.

CC: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
CC: James Bottomley &lt;JBottomley@parallels.com&gt;
CC: linux-scsi@vger.kernel.org
Acked-by: Paolo Bonzini &lt;pbonzini@redhat.com&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Drop dead function blk_abort_queue()</title>
<updated>2012-06-15T06:46:23Z</updated>
<author>
<name>Asias He</name>
<email>asias@redhat.com</email>
</author>
<published>2012-06-14T07:04:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=76aaa5101fffaef12b45b4c01ed0d0528f23dedf'/>
<id>urn:sha1:76aaa5101fffaef12b45b4c01ed0d0528f23dedf</id>
<content type='text'>
This function was only used by btrfs code in btrfs_abort_devices()
(apparently incorrectly).

btrfs_abort_devices() was removed in commit
d07eb9117050c9ed3f78296ebcc06128b52693be, so let's remove the dead code
to avoid any confusion.

Changes in v2: update commit log, btrfs_abort_devices() was removed
already.

Cc: Jens Axboe &lt;axboe@kernel.dk&gt;
Cc: linux-kernel@vger.kernel.org
Cc: Chris Mason &lt;chris.mason@oracle.com&gt;
Cc: linux-btrfs@vger.kernel.org
Cc: David Sterba &lt;dave@jikos.cz&gt;
Signed-off-by: Asias He &lt;asias@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Mitigate lock unbalance caused by lock switching</title>
<updated>2012-06-15T06:46:22Z</updated>
<author>
<name>Asias He</name>
<email>asias@redhat.com</email>
</author>
<published>2012-05-24T15:28:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5e5cfac0c622d42eff4fa308e91b3c9c1884b4f0'/>
<id>urn:sha1:5e5cfac0c622d42eff4fa308e91b3c9c1884b4f0</id>
<content type='text'>
Commit 777eb1bf15b8532c396821774bf6451e563438f5 disconnects externally
supplied queue_lock before blk_drain_queue(). Switching the lock would
introduce lock unbalance because threads which have taken the external
lock might unlock the internal lock during the queue drain. This
patch mitigates this by disconnecting the lock after the queue draining
since queue draining makes a lot of request_queue users go away.

However, please note, this patch only makes the problem less likely to
happen. Anyone who still holds a ref might try to issue a new request on
a dead queue after blk_cleanup_queue() finishes draining, and the lock
unbalance might still happen in this case.

 =====================================
 [ BUG: bad unlock balance detected! ]
 3.4.0+ #288 Not tainted
 -------------------------------------
 fio/17706 is trying to release lock (&amp;(&amp;q-&gt;__queue_lock)-&gt;rlock) at:
 [&lt;ffffffff81329372&gt;] blk_queue_bio+0x2a2/0x380
 but there are no more locks to release!

 other info that might help us debug this:
 1 lock held by fio/17706:
  #0:  (&amp;(&amp;vblk-&gt;lock)-&gt;rlock){......}, at: [&lt;ffffffff81327f1a&gt;] get_request_wait+0x19a/0x250

 stack backtrace:
 Pid: 17706, comm: fio Not tainted 3.4.0+ #288
 Call Trace:
  [&lt;ffffffff81329372&gt;] ? blk_queue_bio+0x2a2/0x380
  [&lt;ffffffff810dea49&gt;] print_unlock_inbalance_bug+0xf9/0x100
  [&lt;ffffffff810dfe4f&gt;] lock_release_non_nested+0x1df/0x330
  [&lt;ffffffff811dae24&gt;] ? dio_bio_end_aio+0x34/0xc0
  [&lt;ffffffff811d6935&gt;] ? bio_check_pages_dirty+0x85/0xe0
  [&lt;ffffffff811daea1&gt;] ? dio_bio_end_aio+0xb1/0xc0
  [&lt;ffffffff81329372&gt;] ? blk_queue_bio+0x2a2/0x380
  [&lt;ffffffff81329372&gt;] ? blk_queue_bio+0x2a2/0x380
  [&lt;ffffffff810e0079&gt;] lock_release+0xd9/0x250
  [&lt;ffffffff81a74553&gt;] _raw_spin_unlock_irq+0x23/0x40
  [&lt;ffffffff81329372&gt;] blk_queue_bio+0x2a2/0x380
  [&lt;ffffffff81328faa&gt;] generic_make_request+0xca/0x100
  [&lt;ffffffff81329056&gt;] submit_bio+0x76/0xf0
  [&lt;ffffffff8115470c&gt;] ? set_page_dirty_lock+0x3c/0x60
  [&lt;ffffffff811d69e1&gt;] ? bio_set_pages_dirty+0x51/0x70
  [&lt;ffffffff811dd1a8&gt;] do_blockdev_direct_IO+0xbf8/0xee0
  [&lt;ffffffff811d8620&gt;] ? blkdev_get_block+0x80/0x80
  [&lt;ffffffff811dd4e5&gt;] __blockdev_direct_IO+0x55/0x60
  [&lt;ffffffff811d8620&gt;] ? blkdev_get_block+0x80/0x80
  [&lt;ffffffff811d92e7&gt;] blkdev_direct_IO+0x57/0x60
  [&lt;ffffffff811d8620&gt;] ? blkdev_get_block+0x80/0x80
  [&lt;ffffffff8114c6ae&gt;] generic_file_aio_read+0x70e/0x760
  [&lt;ffffffff810df7c5&gt;] ? __lock_acquire+0x215/0x5a0
  [&lt;ffffffff811e9924&gt;] ? aio_run_iocb+0x54/0x1a0
  [&lt;ffffffff8114bfa0&gt;] ? grab_cache_page_nowait+0xc0/0xc0
  [&lt;ffffffff811e82cc&gt;] aio_rw_vect_retry+0x7c/0x1e0
  [&lt;ffffffff811e8250&gt;] ? aio_fsync+0x30/0x30
  [&lt;ffffffff811e9936&gt;] aio_run_iocb+0x66/0x1a0
  [&lt;ffffffff811ea9b0&gt;] do_io_submit+0x6f0/0xb80
  [&lt;ffffffff8134de2e&gt;] ? trace_hardirqs_on_thunk+0x3a/0x3f
  [&lt;ffffffff811eae50&gt;] sys_io_submit+0x10/0x20
  [&lt;ffffffff81a7c9e9&gt;] system_call_fastpath+0x16/0x1b

Changes since v2: Update commit log to explain how the code is still
                  broken even if we delay the lock switching after the drain.
Changes since v1: Update commit log as Tejun suggested.

Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Asias He &lt;asias@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: Avoid missed wakeup in request waitqueue</title>
<updated>2012-06-15T06:45:25Z</updated>
<author>
<name>Asias He</name>
<email>asias@redhat.com</email>
</author>
<published>2012-06-15T06:45:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=458f27a9823a0841acb4ca59e0e7f33e181f85e2'/>
<id>urn:sha1:458f27a9823a0841acb4ca59e0e7f33e181f85e2</id>
<content type='text'>
After hot-unplugging a stressed disk, I found that rl-&gt;wait[] is not empty
while rl-&gt;count[] is empty and there are threads still sleeping on
get_request after the queue cleanup. With simple debug code, I found
there are exactly nr_sleep - nr_wakeup threads in D state. So some
wakeups are missed.

  $ dmesg | grep nr_sleep
  [   52.917115] ---&gt; nr_sleep=1046, nr_wakeup=873, delta=173
  $ vmstat 1
  1 173  0 712640  24292  96172 0 0  0  0  419  757  0  0  0 100  0

To quote Tejun:

  Ah, okay, freed_request() wakes up single waiter with the assumption
  that after the wakeup there will at least be one successful allocation
  which in turn will continue the wakeup chain until the wait list is
  empty - ie. waiter wakeup is dependent on successful request
  allocation happening after each wakeup.  With queue marked dead, any
  woken up waiter fails the allocation path, so the wakeup chaining is
  lost and we're left with hung waiters. What we need is wake_up_all()
  after drain completion.

This patch fixes the missed wakeup by waking up all the threads which
are sleeping on the wait queue after queue drain.
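
The lost wakeup chain can be illustrated with a toy user-space model.
This is only a sketch under stated assumptions, not the kernel code:
drain_waiters() and its parameters are hypothetical names, and the model
assumes each successful allocation eventually frees a request and wakes
exactly one more waiter.

```python
from collections import deque


def drain_waiters(num_waiters, queue_dead, wake_all_after_drain):
    """Toy model of the request wait list (hypothetical helper).

    Returns how many waiters are still asleep at the end.
    """
    waiters = deque(range(num_waiters))
    if not waiters:
        return 0

    # freed_request() model: wake exactly one waiter.
    woken = [waiters.popleft()]
    while woken:
        woken.pop()
        if queue_dead:
            # Allocation fails on a dead queue, so no request is freed
            # later and nobody wakes the next waiter: the chain stops.
            continue
        # Allocation succeeded; freeing that request wakes one more waiter.
        if waiters:
            woken.append(waiters.popleft())

    if wake_all_after_drain:
        # wake_up_all() model: every remaining waiter is woken and
        # gives up because the queue is dead.
        waiters.clear()
    return len(waiters)
```

With a dead queue, drain_waiters(5, True, False) leaves four waiters
asleep in this model, while passing wake_all_after_drain=True leaves none.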

Changes in v2: Drop waitqueue_active() optimization

Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Asias He &lt;asias@redhat.com&gt;

Fixed a bug by me, where stacked devices would oops on calling
blk_drain_queue() since -&gt;rq.wait[] do not get initialized unless
it's a full queue setup.

Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blkcg: drop local variable @q from blkg_destroy()</title>
<updated>2012-06-06T06:35:31Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-06-05T11:36:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=27e1f9d1cc87be4e53c6eb7158cafc21c4b85a14'/>
<id>urn:sha1:27e1f9d1cc87be4e53c6eb7158cafc21c4b85a14</id>
<content type='text'>
blkg_destroy() caches @blkg-&gt;q in local variable @q.  While there are
two places which need @blkg-&gt;q, only lockdep_assert_held() used the
local variable, leading to an unused local variable warning if lockdep is
configured out.  Drop the local variable and just use @blkg-&gt;q
directly.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Rakesh Iyer &lt;rni@google.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>blkcg: fix blkg_alloc() failure path</title>
<updated>2012-06-04T08:03:21Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-06-04T06:21:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9b2ea86bc9e940950a088e9795ab28f006e73276'/>
<id>urn:sha1:9b2ea86bc9e940950a088e9795ab28f006e73276</id>
<content type='text'>
When policy data allocation fails in the middle, blkg_alloc() invokes
blkg_free() to destroy the half constructed blkg.  This ends up
calling pd_exit_fn() on policy datas which didn't go through
pd_init_fn().  Fix it by making blkg_alloc() call pd_init_fn()
immediately after each policy data allocation.
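
The ordering problem can be sketched with a toy model. This is not the
blkcg code: alloc_group() and its parameters are hypothetical, and the
model assumes the old code deferred all init hooks until every
allocation had succeeded.

```python
def alloc_group(num_policies, fail_at, init_immediately):
    """Toy model of per-policy data allocation (hypothetical helper).

    Returns how many exit hooks would run on data whose init hook never
    ran. fail_at is the index of the allocation that fails, or None.
    """
    initialized = []  # one flag per successfully allocated policy data
    for i in range(num_policies):
        if i == fail_at:
            # Allocation failed: the free path runs the exit hook on
            # everything allocated so far, initialized or not.
            return sum(1 for ok in initialized if not ok)
        # New behaviour: init right after allocation.
        # Old behaviour (assumed): init deferred until the loop is done.
        initialized.append(init_immediately)
    # Every allocation succeeded, so any deferred init can run now.
    return 0
```

In this model, alloc_group(3, 2, False) reports two exit hooks running on
uninitialized data, and alloc_group(3, 2, True) reports none.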

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Acked-by: Vivek Goyal &lt;vgoyal@redhat.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: blkcg_policy_cfq shouldn't be used if !CONFIG_CFQ_GROUP_IOSCHED</title>
<updated>2012-06-04T08:02:29Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-06-04T08:02:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ffea73fc723a12fdde4c9fb3fcce5d154d1104a1'/>
<id>urn:sha1:ffea73fc723a12fdde4c9fb3fcce5d154d1104a1</id>
<content type='text'>
cfq may be built w/ or w/o blkcg support depending on
CONFIG_CFQ_GROUP_IOSCHED.  If blkcg support is disabled, most of the
related code is ifdef'd out but some part is left dangling -
blkcg_policy_cfq is left zero-filled and blkcg_policy_[un]register()
calls are made on it.

Feeding zero filled policy to blkcg_policy_register() is incorrect and
triggers the following WARN_ON() if CONFIG_BLK_CGROUP &amp;&amp;
!CONFIG_CFQ_GROUP_IOSCHED.

 ------------[ cut here ]------------
 WARNING: at block/blk-cgroup.c:867
 Modules linked in:
 Modules linked in:
 CPU: 3 Not tainted 3.4.0-09547-gfb21aff #1
 Process swapper/0 (pid: 1, task: 000000003ff80000, ksp: 000000003ff7f8b8)
 Krnl PSW : 0704100180000000 00000000003d76ca (blkcg_policy_register+0xca/0xe0)
	    R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:0 CC:1 PM:0 EA:3
 Krnl GPRS: 0000000000000000 00000000014b85ec 00000000014b85b0 0000000000000000
	    000000000096fb60 0000000000000000 00000000009a8e78 0000000000000048
	    000000000099c070 0000000000b6f000 0000000000000000 000000000099c0b8
	    00000000014b85b0 0000000000667580 000000003ff7fd98 000000003ff7fd70
 Krnl Code: 00000000003d76be: a7280001           lhi     %r2,1
	    00000000003d76c2: a7f4ffdf           brc     15,3d7680
	   #00000000003d76c6: a7f40001           brc     15,3d76c8
	   &gt;00000000003d76ca: a7c8ffea           lhi     %r12,-22
	    00000000003d76ce: a7f4ffce           brc     15,3d766a
	    00000000003d76d2: a7f40001           brc     15,3d76d4
	    00000000003d76d6: a7c80000           lhi     %r12,0
	    00000000003d76da: a7f4ffc2           brc     15,3d765e
 Call Trace:
 ([&lt;0000000000b6f000&gt;] initcall_debug+0x0/0x4)
  [&lt;0000000000989e8a&gt;] cfq_init+0x62/0xd4
  [&lt;00000000001000ba&gt;] do_one_initcall+0x3a/0x170
  [&lt;000000000096fb60&gt;] kernel_init+0x214/0x2bc
  [&lt;0000000000623202&gt;] kernel_thread_starter+0x6/0xc
  [&lt;00000000006231fc&gt;] kernel_thread_starter+0x0/0xc
 no locks held by swapper/0/1.
 Last Breaking-Event-Address:
  [&lt;00000000003d76c6&gt;] blkcg_policy_register+0xc6/0xe0
 ---[ end trace b8ef4903fcbf9dd3 ]---

This patch fixes the problem by ensuring all blkcg support code is
inside CONFIG_CFQ_GROUP_IOSCHED.

* blkcg_policy_cfq declaration and blkg_to_cfqg() definition are moved
  inside the first CONFIG_CFQ_GROUP_IOSCHED block.  __maybe_unused is
  dropped from blkcg_policy_cfq decl.

* blkcg_deactivate_policy() invocation is moved inside ifdef.  This
  also makes the activation logic match cfq_init_queue().

* All blkcg_policy_[un]register() invocations are moved inside ifdef.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
LKML-Reference: &lt;20120601112954.GC3535@osiris.boeblingen.de.ibm.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: fix return value on cfq_init() failure</title>
<updated>2012-06-04T08:01:38Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2012-06-04T08:01:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fd7949564ced88385ca7758a4c1f47c274233dd5'/>
<id>urn:sha1:fd7949564ced88385ca7758a4c1f47c274233dd5</id>
<content type='text'>
cfq_init() would return zero after kmem cache creation failure.  Fix
so that it returns -ENOMEM.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>block: avoid infinite loop in get_task_io_context()</title>
<updated>2012-05-31T11:39:05Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2012-05-31T11:39:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3c9c708c9fc967e389f85bc735e4c1f65d67334e'/>
<id>urn:sha1:3c9c708c9fc967e389f85bc735e4c1f65d67334e</id>
<content type='text'>
Calling get_task_io_context() on an exiting task which isn't %current can
loop forever. This triggers at boot time on my dev machine.

BUG: soft lockup - CPU#3 stuck for 22s ! [mountall.1603]

Fix this by making create_task_io_context() return -EBUSY in this case
to break the loop.
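
The retry loop can be sketched with a toy model. This is not the kernel
implementation: the dict-based task, the fixed flag and the max_iters cap
are hypothetical, and the model assumes the unfixed helper reported
success even when it declined to install a context on an exiting task.

```python
EBUSY = 16  # errno value on Linux


def create_task_io_context(task, fixed):
    """Toy model: install an io context unless the task is exiting."""
    if task["exiting"] and not task["is_current"]:
        # No context is installed for another task that is exiting.
        # Unfixed model: still report success. Fixed model: -EBUSY.
        return -EBUSY if fixed else 0
    task["ioc"] = object()
    return 0


def get_task_io_context(task, fixed, max_iters=1000):
    """Toy model of the caller's retry loop, capped so it terminates."""
    for _ in range(max_iters):
        if task["ioc"] is not None:
            return "ioc"
        if create_task_io_context(task, fixed) != 0:
            return None  # creation failed, stop retrying
    return "looping"  # stands in for the soft lockup


exiting = {"ioc": None, "exiting": True, "is_current": False}
unfixed_result = get_task_io_context(dict(exiting), fixed=False)
fixed_result = get_task_io_context(dict(exiting), fixed=True)
```

Here unfixed_result is "looping" and fixed_result is None, which mirrors
the loop being broken by the error return.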

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Alan Cox &lt;alan@linux.intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
