<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/buffer_head.h, branch v5.10.178</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.178</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.178'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2022-09-15T09:32:02Z</updated>
<entry>
<title>fs: only do a memory barrier for the first set_buffer_uptodate()</title>
<updated>2022-09-15T09:32:02Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-08-31T16:46:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=13c8f561be38daa37bf57d850d00676e13c779bd'/>
<id>urn:sha1:13c8f561be38daa37bf57d850d00676e13c779bd</id>
<content type='text'>
commit 2f79cdfe58c13949bbbb65ba5926abfe9561d0ec upstream.

Commit d4252071b97d ("add barriers to buffer_uptodate and
set_buffer_uptodate") added proper memory barriers to the buffer head
BH_Uptodate bit, so that anybody who tests a buffer for being up-to-date
will be guaranteed to actually see initialized state.

However, that commit didn't _just_ add the memory barrier, it also ended
up dropping the "was it already set" logic that the BUFFER_FNS() macro
had.

That's conceptually the right thing for a generic "this is a memory
barrier" operation, but in the case of the buffer contents, we really
only care about the memory barrier for the _first_ time we set the bit,
in that the only memory ordering protection we need is to avoid anybody
seeing uninitialized memory contents.

Any other access ordering wouldn't be about the BH_Uptodate bit anyway,
and would require some other proper lock (typically BH_Lock or the folio
lock).  A reader that races with somebody invalidating the buffer head
isn't an issue wrt the memory ordering, it's a serialization issue.

Now, you'd think that the buffer head operations don't matter in this
day and age (and I certainly thought so), but apparently some loads
still end up being heavy users of buffer heads.  In particular, the
kernel test robot reported that not having this bit access optimization
in place caused a noticeable direct IO performance regression on ext4:

  fxmark.ssd_ext4_no_jnl_DWTL_54_directio.works/sec -26.5% regression

although you presumably need a fast disk and a lot of cores to actually
notice.

Link: https://lore.kernel.org/all/Yw8L7HTZ%2FdE2%2Fo9C@xsang-OptiPlex-9020/
Reported-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Tested-by: Fengwei Yin &lt;fengwei.yin@intel.com&gt;
Cc: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: stable@kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>add barriers to buffer_uptodate and set_buffer_uptodate</title>
<updated>2022-08-21T13:15:19Z</updated>
<author>
<name>Mikulas Patocka</name>
<email>mpatocka@redhat.com</email>
</author>
<published>2022-08-09T18:32:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=57f2ee517dec126d091fe11f995ecb75ee39b09d'/>
<id>urn:sha1:57f2ee517dec126d091fe11f995ecb75ee39b09d</id>
<content type='text'>
commit d4252071b97d2027d246f6a82cbee4d52f618b47 upstream.

Let's have a look at this piece of code in __bread_slow:

	get_bh(bh);
	bh-&gt;b_end_io = end_buffer_read_sync;
	submit_bh(REQ_OP_READ, 0, bh);
	wait_on_buffer(bh);
	if (buffer_uptodate(bh))
		return bh;

Neither wait_on_buffer nor buffer_uptodate contain any memory barrier.
Consequently, if someone calls sb_bread and then reads the buffer data,
the read of buffer data may be executed before wait_on_buffer(bh) on
architectures with weak memory ordering and it may return invalid data.

Fix this bug by adding a memory barrier to set_buffer_uptodate and an
acquire barrier to buffer_uptodate (in a similar way as
folio_test_uptodate and folio_mark_uptodate).

Signed-off-by: Mikulas Patocka &lt;mpatocka@redhat.com&gt;
Reviewed-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>fs: move the buffer_heads_over_limit stub to buffer_head.h</title>
<updated>2020-06-24T15:16:02Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2020-06-20T07:16:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d2de7ea48d83195ef1310555f1fdd9e8e1bab0d3'/>
<id>urn:sha1:d2de7ea48d83195ef1310555f1fdd9e8e1bab0d3</id>
<content type='text'>
Move the !CONFIG_BLOCK stub to the same place as the non-stub
declaration.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Bart Van Assche &lt;bvanassche@acm.org&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
<entry>
<title>buffer_head.h: remove attach_page_buffers</title>
<updated>2020-06-02T17:59:08Z</updated>
<author>
<name>Guoqing Jiang</name>
<email>guoqing.jiang@cloud.ionos.com</email>
</author>
<published>2020-06-02T04:48:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7b59435a2afed12dc9b2ec1b930efa2e94f1c397'/>
<id>urn:sha1:7b59435a2afed12dc9b2ec1b930efa2e94f1c397</id>
<content type='text'>
All the callers have replaced attach_page_buffers with the new function
attach_page_private, so remove it.

Signed-off-by: Guoqing Jiang &lt;guoqing.jiang@cloud.ionos.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Reviewed-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Roman Gushchin &lt;guro@fb.com&gt;
Cc: Andreas Dilger &lt;adilger@dilger.ca&gt;
Link: http://lkml.kernel.org/r/20200517214718.468-10-guoqing.jiang@cloud.ionos.com
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>ext4: use non-movable memory for superblock readahead</title>
<updated>2020-04-16T03:58:48Z</updated>
<author>
<name>Roman Gushchin</name>
<email>guro@fb.com</email>
</author>
<published>2020-02-29T00:14:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d87f639258a6a5980183f11876c884931ad93da2'/>
<id>urn:sha1:d87f639258a6a5980183f11876c884931ad93da2</id>
<content type='text'>
Since commit a8ac900b8163 ("ext4: use non-movable memory for the
superblock") buffers for ext4 superblock were allocated using
the sb_bread_unmovable() helper which allocated buffer heads
out of non-movable memory blocks. It was necessarily to not block
page migrations and do not cause cma allocation failures.

However commit 85c8f176a611 ("ext4: preload block group descriptors")
broke this by introducing pre-reading of the ext4 superblock.
The problem is that __breadahead() is using __getblk() underneath,
which allocates buffer heads out of movable memory.

It resulted in page migration failures I've seen on a machine
with an ext4 partition and a preallocated cma area.

Fix this by introducing sb_breadahead_unmovable() and
__breadahead_gfp() helpers which use non-movable memory for buffer
head allocations and use them for the ext4 superblock readahead.

Reviewed-by: Andreas Dilger &lt;adilger@dilger.ca&gt;
Fixes: 85c8f176a611 ("ext4: preload block group descriptors")
Signed-off-by: Roman Gushchin &lt;guro@fb.com&gt;
Link: https://lore.kernel.org/r/20200229001411.128010-1-guro@fb.com
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
</content>
</entry>
<entry>
<title>fs/buffer: Make BH_Uptodate_Lock bit_spin_lock a regular spinlock_t</title>
<updated>2020-03-28T12:21:08Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-11-18T13:28:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f1e67e355c2aafeddf1eac31335709236996d2fe'/>
<id>urn:sha1:f1e67e355c2aafeddf1eac31335709236996d2fe</id>
<content type='text'>
Bit spinlocks are problematic if PREEMPT_RT is enabled, because they
disable preemption, which is undesired for latency reasons and breaks when
regular spinlocks are taken within the bit_spinlock locked region because
regular spinlocks are converted to 'sleeping spinlocks' on RT.

PREEMPT_RT replaced the bit spinlocks with regular spinlocks to avoid this
problem. The replacement was done conditionaly at compile time, but
Christoph requested to do an unconditional conversion.

Jan suggested to move the spinlock into a existing padding hole which
avoids a size increase of struct buffer_head on production kernels.

As a benefit the lock gains lockdep coverage.

[ bigeasy: Remove the wrapper and use always spinlock_t and move it into
           the padding hole ]

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Jan Kara &lt;jack@suse.cz&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Link: https://lkml.kernel.org/r/20191118132824.rclhrbujqh4b4g4d@linutronix.de
</content>
</entry>
<entry>
<title>ext4: convert fault handler to use vm_fault_t type</title>
<updated>2018-10-03T02:20:50Z</updated>
<author>
<name>Souptick Joarder</name>
<email>jrdr.linux@gmail.com</email>
</author>
<published>2018-10-03T02:20:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=401b25aa1a75e7fe4e3202a6336604269697d705'/>
<id>urn:sha1:401b25aa1a75e7fe4e3202a6336604269697d705</id>
<content type='text'>
Return type of ext4_page_mkwrite and ext4_filemap_fault are
changed to use vm_fault_t type.

With this patch all the callers of block_page_mkwrite_return()
are changed to handle vm_fault_t. So converting the return type
of block_page_mkwrite_return() to vm_fault_t.

Signed-off-by: Souptick Joarder &lt;jrdr.linux@gmail.com&gt;
Signed-off-by: Theodore Ts'o &lt;tytso@mit.edu&gt;
Reviewed-by: Matthew Wilcox &lt;willy@infradead.org&gt;
</content>
</entry>
<entry>
<title>fs: move page_cache_seek_hole_data to iomap.c</title>
<updated>2018-06-02T01:37:33Z</updated>
<author>
<name>Christoph Hellwig</name>
<email>hch@lst.de</email>
</author>
<published>2018-06-01T16:04:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8a78cb1f1b98e5ea970674e0f049832d19e76ace'/>
<id>urn:sha1:8a78cb1f1b98e5ea970674e0f049832d19e76ace</id>
<content type='text'>
This function is only used by the iomap code, depends on being called
from it, and will soon stop poking into buffer head internals.

Signed-off-by: Christoph Hellwig &lt;hch@lst.de&gt;
Reviewed-by: Andreas Gruenbacher &lt;agruenba@redhat.com&gt;
Reviewed-by: Dave Chinner &lt;dchinner@redhat.com&gt;
Reviewed-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
Signed-off-by: Darrick J. Wong &lt;darrick.wong@oracle.com&gt;
</content>
</entry>
<entry>
<title>Merge tag 'for-linus-20180204' of git://git.kernel.dk/linux-block</title>
<updated>2018-02-04T19:16:35Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2018-02-04T19:16:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=64b28683deba132f301d1cecfc25c32e295f53a1'/>
<id>urn:sha1:64b28683deba132f301d1cecfc25c32e295f53a1</id>
<content type='text'>
Pull more block updates from Jens Axboe:
 "Most of this is fixes and not new code/features:

   - skd fix from Arnd, fixing a build error dependent on sla allocator
     type.

   - blk-mq scheduler discard merging fixes, one from me and one from
     Keith. This fixes a segment miscalculation for blk-mq-sched, where
     we mistakenly think two segments are physically contigious even
     though the request isn't carrying real data. Also fixes a bio-to-rq
     merge case.

   - Don't re-set a bit on the buffer_head flags, if it's already set.
     This can cause scalability concerns on bigger machines and
     workloads. From Kemi Wang.

   - Add BLK_STS_DEV_RESOURCE return value to blk-mq, allowing us to
     distuingish between a local (device related) resource starvation
     and a global one. The latter might happen without IO being in
     flight, so it has to be handled a bit differently. From Ming"

* tag 'for-linus-20180204' of git://git.kernel.dk/linux-block:
  block: skd: fix incorrect linux/slab_def.h inclusion
  buffer: Avoid setting buffer bits that are already set
  blk-mq-sched: Enable merging discard bio into request
  blk-mq: fix discard merge with scheduler attached
  blk-mq: introduce BLK_STS_DEV_RESOURCE
</content>
</entry>
<entry>
<title>buffer: Avoid setting buffer bits that are already set</title>
<updated>2018-02-02T14:56:30Z</updated>
<author>
<name>Kemi Wang</name>
<email>kemi.wang@intel.com</email>
</author>
<published>2017-10-24T01:16:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=60f91826ca62bcf85d6d5fc90941337282787671'/>
<id>urn:sha1:60f91826ca62bcf85d6d5fc90941337282787671</id>
<content type='text'>
It's expensive to set buffer flags that are already set, because that
causes a costly cache line transition.

A common case is setting the "verified" flag during ext4 writes.
This patch checks for the flag being set first.

With the AIM7/creat-clo benchmark testing on a 48G ramdisk based-on ext4
file system, we see 3.3%(15431-&gt;15936) improvement of aim7.jobs-per-min on
a 2-sockets broadwell platform.

What the benchmark does is: it forks 3000 processes, and each  process do
the following:
a) open a new file
b) close the file
c) delete the file
until loop=100*1000 times.

The original patch is contributed by Andi Kleen.

Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Tested-by: Kemi Wang &lt;kemi.wang@intel.com&gt;
Signed-off-by: Kemi Wang &lt;kemi.wang@intel.com&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
</content>
</entry>
</feed>
