<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/drivers/md, branch v4.9.53</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.53</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.53'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2017-10-05T07:43:59Z</updated>
<entry>
<title>md/raid5: preserve STRIPE_ON_UNPLUG_LIST in break_stripe_batch_list</title>
<updated>2017-10-05T07:43:59Z</updated>
<author>
<name>Dennis Yang</name>
<email>dennisyang@qnap.com</email>
</author>
<published>2017-09-06T03:02:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=49c2b839b743dfd8e3b6332494eba00ef47389a3'/>
<id>urn:sha1:49c2b839b743dfd8e3b6332494eba00ef47389a3</id>
<content type='text'>
commit 184a09eb9a2fe425e49c9538f1604b05ed33cfef upstream.

In release_stripe_plug(), if a stripe_head has its STRIPE_ON_UNPLUG_LIST
set, it indicates that this stripe_head is already in the raid5_plug_cb
list and release_stripe() would be called instead to drop a reference
count. Otherwise, the STRIPE_ON_UNPLUG_LIST bit would be set for this
stripe_head and it will get queued into the raid5_plug_cb list.

Since break_stripe_batch_list() did not preserve STRIPE_ON_UNPLUG_LIST,
A stripe could be re-added to plug list while it is still on that list
in the following situation. If stripe_head A is added to another
stripe_head B's batch list, in this case A will have its
batch_head != NULL and be added into the plug list. After that,
stripe_head B gets handled and called break_stripe_batch_list() to
reset all the batched stripe_head(including A which is still on
the plug list)'s state and reset their batch_head to NULL.
Before the plug list gets processed, if there is another write request
comes in and get stripe_head A, A will have its batch_head == NULL
(cleared by calling break_stripe_batch_list() on B) and be added to
plug list once again.

Signed-off-by: Dennis Yang &lt;dennisyang@qnap.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md/raid5: fix a race condition in stripe batch</title>
<updated>2017-10-05T07:43:59Z</updated>
<author>
<name>Shaohua Li</name>
<email>shli@fb.com</email>
</author>
<published>2017-08-25T17:40:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=648798cc2fd7d748573ba760a66cfa7b561abe77'/>
<id>urn:sha1:648798cc2fd7d748573ba760a66cfa7b561abe77</id>
<content type='text'>
commit 3664847d95e60a9a943858b7800f8484669740fc upstream.

We have a race condition in below scenario, say have 3 continuous stripes, sh1,
sh2 and sh3, sh1 is the stripe_head of sh2 and sh3:

CPU1				CPU2				CPU3
handle_stripe(sh3)
				stripe_add_to_batch_list(sh3)
				-&gt; lock(sh2, sh3)
				-&gt; lock batch_lock(sh1)
				-&gt; add sh3 to batch_list of sh1
				-&gt; unlock batch_lock(sh1)
								clear_batch_ready(sh1)
								-&gt; lock(sh1) and batch_lock(sh1)
								-&gt; clear STRIPE_BATCH_READY for all stripes in batch_list
								-&gt; unlock(sh1) and batch_lock(sh1)
-&gt;clear_batch_ready(sh3)
--&gt;test_and_clear_bit(STRIPE_BATCH_READY, sh3)
---&gt;return 0 as sh-&gt;batch == NULL
				-&gt; sh3-&gt;batch_head = sh1
				-&gt; unlock (sh2, sh3)

In CPU1, handle_stripe will continue handle sh3 even it's in batch stripe list
of sh1. By moving sh3-&gt;batch_head assignment in to batch_lock, we make it
impossible to clear STRIPE_BATCH_READY before batch_head is set.

Thanks Stephane for helping debug this tricky issue.

Reported-and-tested-by: Stephane Thiell &lt;sthiell@stanford.edu&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bcache: fix bch_hprint crash and improve output</title>
<updated>2017-09-27T12:39:25Z</updated>
<author>
<name>Michael Lyle</name>
<email>mlyle@lyle.org</email>
</author>
<published>2017-09-06T06:26:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=08f75f2c525d786b348dca568761dfde8d6b0d5c'/>
<id>urn:sha1:08f75f2c525d786b348dca568761dfde8d6b0d5c</id>
<content type='text'>
commit 9276717b9e297a62d1151a43d1cd286213f68eb7 upstream.

Most importantly, solve a crash where %llu was used to format signed
numbers.  This would cause a buffer overflow when reading sysfs
writeback_rate_debug, as only 20 bytes were allocated for this and
%llu writes 20 characters plus a null.

Always use the units mechanism rather than having different output
paths for simplicity.

Also, correct problems with display output where 1.10 was a larger
number than 1.09, by multiplying by 10 and then dividing by 1024 instead
of dividing by 100.  (Remainders of &gt;= 1000 would print as .10).

Minor changes: Always display the decimal point instead of trying to
omit it based on number of digits shown.  Decide what units to use
based on 1000 as a threshold, not 1024 (in other words, always print
at most 3 digits before the decimal point).

Signed-off-by: Michael Lyle &lt;mlyle@lyle.org&gt;
Reported-by: Dmitry Yu Okunev &lt;dyokunev@ut.mephi.ru&gt;
Acked-by: Kent Overstreet &lt;kent.overstreet@gmail.com&gt;
Reviewed-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bcache: fix for gc and write-back race</title>
<updated>2017-09-27T12:39:25Z</updated>
<author>
<name>Tang Junhui</name>
<email>tang.junhui@zte.com.cn</email>
</author>
<published>2017-09-06T06:25:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=57aa1a6967b28431d62beec299125a969a469f3f'/>
<id>urn:sha1:57aa1a6967b28431d62beec299125a969a469f3f</id>
<content type='text'>
commit 9baf30972b5568d8b5bc8b3c46a6ec5b58100463 upstream.

gc and write-back get raced (see the email "bcache get stucked" I sended
before):
gc thread                               write-back thread
|                                       |bch_writeback_thread()
|bch_gc_thread()                        |
|                                       |==&gt;read_dirty()
|==&gt;bch_btree_gc()                      |
|==&gt;btree_root() //get btree root       |
|                //node write locker    |
|==&gt;bch_btree_gc_root()                 |
|                                       |==&gt;read_dirty_submit()
|                                       |==&gt;write_dirty()
|                                       |==&gt;continue_at(cl,
|                                       |               write_dirty_finish,
|                                       |               system_wq);
|                                       |==&gt;write_dirty_finish()//excute
|                                       |               //in system_wq
|                                       |==&gt;bch_btree_insert()
|                                       |==&gt;bch_btree_map_leaf_nodes()
|                                       |==&gt;__bch_btree_map_nodes()
|                                       |==&gt;btree_root //try to get btree
|                                       |              //root node read
|                                       |              //lock
|                                       |-----stuck here
|==&gt;bch_btree_set_root()
|==&gt;bch_journal_meta()
|==&gt;bch_journal()
|==&gt;journal_try_write()
|==&gt;journal_write_unlocked() //journal_full(&amp;c-&gt;journal)
|                            //condition satisfied
|==&gt;continue_at(cl, journal_write, system_wq); //try to excute
|                               //journal_write in system_wq
|                               //but work queue is excuting
|                               //write_dirty_finish()
|==&gt;closure_sync(); //wait journal_write execute
|                   //over and wake up gc,
|-------------stuck here
|==&gt;release root node write locker

This patch alloc a separate work-queue for write-back thread to avoid such
race.

(Commit log re-organized by Coly Li to pass checkpatch.pl checking)

Signed-off-by: Tang Junhui &lt;tang.junhui@zte.com.cn&gt;
Acked-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bcache: Correct return value for sysfs attach errors</title>
<updated>2017-09-27T12:39:25Z</updated>
<author>
<name>Tony Asleson</name>
<email>tasleson@redhat.com</email>
</author>
<published>2017-09-06T06:25:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fa92ff6b77a1ac597c2c8bb973c8a9c0191c108a'/>
<id>urn:sha1:fa92ff6b77a1ac597c2c8bb973c8a9c0191c108a</id>
<content type='text'>
commit 77fa100f27475d08a569b9d51c17722130f089e7 upstream.

If you encounter any errors in bch_cached_dev_attach it will return
a negative error code.  The variable 'v' which stores the result is
unsigned, thus user space sees a very large value returned for bytes
written which can cause incorrect user space behavior.  Utilize 1
signed variable to use throughout the function to preserve error return
capability.

Signed-off-by: Tony Asleson &lt;tasleson@redhat.com&gt;
Acked-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bcache: correct cache_dirty_target in __update_writeback_rate()</title>
<updated>2017-09-27T12:39:24Z</updated>
<author>
<name>Tang Junhui</name>
<email>tang.junhui@zte.com.cn</email>
</author>
<published>2017-09-06T06:25:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e40cb30162d7f9e9ea8aae9dd1b93e1ff30c1bcd'/>
<id>urn:sha1:e40cb30162d7f9e9ea8aae9dd1b93e1ff30c1bcd</id>
<content type='text'>
commit a8394090a9129b40f9d90dcb7f4a49d60c727ca6 upstream.

__update_write_rate() uses a Proportion-Differentiation Controller
algorithm to control writeback rate. A dirty target number is used in
this PD controller to control writeback rate. A larger target number
will make the writeback rate smaller, on the versus, a smaller target
number will make the writeback rate larger.

bcache uses the following steps to calculate the target number,
1) cache_sectors = all-buckets-of-cache-set * buckets-size
2) cache_dirty_target = cache_sectors * cached-device-writeback_percent
3) target = cache_dirty_target *
(sectors-of-cached-device/sectors-of-all-cached-devices-of-this-cache-set)

The calculation at step 1) for cache_sectors is incorrect, which does
not consider dirty blocks occupied by flash only volume.

A flash only volume can be took as a bcache device without cached
device. All data sectors allocated for it are persistent on cache device
and marked dirty, they are not touched by bcache writeback and garbage
collection code. So data blocks of flash only volume should be ignore
when calculating cache_sectors of cache set.

Current code does not subtract dirty sectors of flash only volume, which
results a larger target number from the above 3 steps. And in sequence
the cache device's writeback rate is smaller then a correct value,
writeback speed is slower on all cached devices.

This patch fixes the incorrect slower writeback rate by subtracting
dirty sectors of flash only volumes in __update_writeback_rate().

(Commit log composed by Coly Li to pass checkpatch.pl checking)

Signed-off-by: Tang Junhui &lt;tang.junhui@zte.com.cn&gt;
Reviewed-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bcache: do not subtract sectors_to_gc for bypassed IO</title>
<updated>2017-09-27T12:39:24Z</updated>
<author>
<name>Tang Junhui</name>
<email>tang.junhui@zte.com.cn</email>
</author>
<published>2017-09-06T06:25:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8f51f38883dcff93f55e132d3d8b1c991e809474'/>
<id>urn:sha1:8f51f38883dcff93f55e132d3d8b1c991e809474</id>
<content type='text'>
commit 69daf03adef5f7bc13e0ac86b4b8007df1767aab upstream.

Since bypassed IOs use no bucket, so do not subtract sectors_to_gc to
trigger gc thread.

Signed-off-by: tang.junhui &lt;tang.junhui@zte.com.cn&gt;
Acked-by: Coly Li &lt;colyli@suse.de&gt;
Reviewed-by: Eric Wheeler &lt;bcache@linux.ewheeler.net&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bcache: Fix leak of bdev reference</title>
<updated>2017-09-27T12:39:24Z</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2017-09-06T06:25:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c234e0e77572d78cbac0de7acee2263343595c25'/>
<id>urn:sha1:c234e0e77572d78cbac0de7acee2263343595c25</id>
<content type='text'>
commit 4b758df21ee7081ab41448d21d60367efaa625b3 upstream.

If blkdev_get_by_path() in register_bcache() fails, we try to lookup the
block device using lookup_bdev() to detect which situation we are in to
properly report error. However we never drop the reference returned to
us from lookup_bdev(). Fix that.

Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Acked-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bcache: initialize dirty stripes in flash_dev_run()</title>
<updated>2017-09-27T12:39:24Z</updated>
<author>
<name>Tang Junhui</name>
<email>tang.junhui@zte.com.cn</email>
</author>
<published>2017-09-06T17:28:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2a9b55742a9fe838ddc86f8b5c75a56867ea1912'/>
<id>urn:sha1:2a9b55742a9fe838ddc86f8b5c75a56867ea1912</id>
<content type='text'>
commit 175206cf9ab63161dec74d9cd7f9992e062491f5 upstream.

bcache uses a Proportion-Differentiation Controller algorithm to control
writeback rate to cached devices. In the PD controller algorithm, dirty
stripes of thin flash device should not be counted in, because flash only
volumes never write back dirty data.

Currently dirty stripe counter for thin flash device is not initialized
when the thin flash device starts. Which means the following calculation
in PD controller will reference an undefined dirty stripes number, and
all cached devices attached to the same cache set where the thin flash
device lies on may have an inaccurate writeback rate.

This patch calles bch_sectors_dirty_init() in flash_dev_run(), to
correctly initialize dirty stripe counter when the thin flash device
starts to run. This patch also does following parameter data type change,
 -void bch_sectors_dirty_init(struct cached_dev *dc);
 +void bch_sectors_dirty_init(struct bcache_device *);
to call this function conveniently in flash_dev_run().

(Commit log is composed by Coly Li)

Signed-off-by: Tang Junhui &lt;tang.junhui@zte.com.cn&gt;
Reviewed-by: Coly Li &lt;colyli@suse.de&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>md/bitmap: disable bitmap_resize for file-backed bitmaps.</title>
<updated>2017-09-27T12:39:21Z</updated>
<author>
<name>NeilBrown</name>
<email>neilb@suse.com</email>
</author>
<published>2017-08-31T00:23:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2cee78081b97366c42bb6349be2a57f89c36df5a'/>
<id>urn:sha1:2cee78081b97366c42bb6349be2a57f89c36df5a</id>
<content type='text'>
commit e8a27f836f165c26f867ece7f31eb5c811692319 upstream.

bitmap_resize() does not work for file-backed bitmaps.
The buffer_heads are allocated and initialized when
the bitmap is read from the file, but resize doesn't
read from the file, it loads from the internal bitmap.
When it comes time to write the new bitmap, the bh is
non-existent and we crash.

The common case when growing an array involves making the array larger,
and that normally means making the bitmap larger.  Doing
that inside the kernel is possible, but would need more code.
It is probably easier to require people who use file-backed
bitmaps to remove them and re-add after a reshape.

So this patch disables the resizing of arrays which have
file-backed bitmaps.  This is better than crashing.

Reported-by: Zhilong Liu &lt;zlliu@suse.com&gt;
Fixes: d60b479d177a ("md/bitmap: add bitmap_resize function to allow bitmap resizing.")
Signed-off-by: NeilBrown &lt;neilb@suse.com&gt;
Signed-off-by: Shaohua Li &lt;shli@fb.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
