<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/swap.h, branch v6.6.17</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.6.17</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.6.17'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2023-08-24T23:20:28Z</updated>
<entry>
<title>mm/swap: inline folio_set_swap_entry() and folio_swap_entry()</title>
<updated>2023-08-24T23:20:28Z</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2023-08-21T16:08:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3d2c908768877714a354ee6d7bf93e801400d5e2'/>
<id>urn:sha1:3d2c908768877714a354ee6d7bf93e801400d5e2</id>
<content type='text'>
Let's simply work on the folio directly and remove the helpers.

Link: https://lkml.kernel.org/r/20230821160849.531668-4-david@redhat.com
Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Suggested-by: Matthew Wilcox &lt;willy@infradead.org&gt;
Reviewed-by: Chris Li &lt;chrisl@kernel.org&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Dan Streetman &lt;ddstreet@ieee.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Seth Jennings &lt;sjenning@redhat.com&gt;
Cc: Vitaly Wool &lt;vitaly.wool@konsulko.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/swap: use dedicated entry for swap in folio</title>
<updated>2023-08-24T23:20:28Z</updated>
<author>
<name>Matthew Wilcox</name>
<email>willy@infradead.org</email>
</author>
<published>2023-08-21T16:08:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=85a1333417a7561c1d10a77d6c873a37e6ea63a0'/>
<id>urn:sha1:85a1333417a7561c1d10a77d6c873a37e6ea63a0</id>
<content type='text'>
Let's stop working on the private field and use an explicit swap field. 
We have to move the swp_entry_t typedef.

Link: https://lkml.kernel.org/r/20230821160849.531668-3-david@redhat.com
Signed-off-by: Matthew Wilcox &lt;willy@infradead.org&gt;
Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Chris Li &lt;chrisl@kernel.org&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Dan Streetman &lt;ddstreet@ieee.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Seth Jennings &lt;sjenning@redhat.com&gt;
Cc: Vitaly Wool &lt;vitaly.wool@konsulko.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/swap: stop using page-&gt;private on tail pages for THP_SWAP</title>
<updated>2023-08-24T23:20:28Z</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2023-08-21T16:08:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cfeed8ffe55b37fa10286aaaa1369da00cb88440'/>
<id>urn:sha1:cfeed8ffe55b37fa10286aaaa1369da00cb88440</id>
<content type='text'>
Patch series "mm/swap: stop using page-&gt;private on tail pages for THP_SWAP
+ cleanups".

This series stops using page-&gt;private on tail pages for THP_SWAP, replaces
folio-&gt;private by folio-&gt;swap for swapcache folios, and starts using
"new_folio" for tail pages that we are splitting to remove the usage of
page-&gt;private for swapcache handling completely.


This patch (of 4):

Let's stop using page-&gt;private on tail pages, making it possible to just
unconditionally reuse that field in the tail pages of large folios.

The remaining usage of the private field for THP_SWAP is in the THP
splitting code (mm/huge_memory.c), that we'll handle separately later.

Update the THP_SWAP documentation and sanity checks in mm_types.h and
__split_huge_page_tail().

[david@redhat.com: stop using page-&gt;private on tail pages for THP_SWAP]
  Link: https://lkml.kernel.org/r/6f0a82a3-6948-20d9-580b-be1dbf415701@redhat.com
Link: https://lkml.kernel.org/r/20230821160849.531668-1-david@redhat.com
Link: https://lkml.kernel.org/r/20230821160849.531668-2-david@redhat.com
Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;	[arm64]
Reviewed-by: Yosry Ahmed &lt;yosryahmed@google.com&gt;
Cc: Dan Streetman &lt;ddstreet@ieee.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Seth Jennings &lt;sjenning@redhat.com&gt;
Cc: Vitaly Wool &lt;vitaly.wool@konsulko.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: kill frontswap</title>
<updated>2023-08-21T20:37:26Z</updated>
<author>
<name>Johannes Weiner</name>
<email>hannes@cmpxchg.org</email>
</author>
<published>2023-07-17T16:02:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=42c06a0e8ebe95b81e5fb41c6556ff22d9255b0c'/>
<id>urn:sha1:42c06a0e8ebe95b81e5fb41c6556ff22d9255b0c</id>
<content type='text'>
The only user of frontswap is zswap, and has been for a long time.  Have
swap call into zswap directly and remove the indirection.

[hannes@cmpxchg.org: remove obsolete comment, per Yosry]
  Link: https://lkml.kernel.org/r/20230719142832.GA932528@cmpxchg.org
[fengwei.yin@intel.com: don't warn if none swapcache folio is passed to zswap_load]
  Link: https://lkml.kernel.org/r/20230810095652.3905184-1-fengwei.yin@intel.com
Link: https://lkml.kernel.org/r/20230717160227.GA867137@cmpxchg.org
Signed-off-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Signed-off-by: Yin Fengwei &lt;fengwei.yin@intel.com&gt;
Acked-by: Konrad Rzeszutek Wilk &lt;konrad.wilk@oracle.com&gt;
Acked-by: Nhat Pham &lt;nphamcs@gmail.com&gt;
Acked-by: Yosry Ahmed &lt;yosryahmed@google.com&gt;
Acked-by: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Domenico Cerasuolo &lt;cerasuolodomenico@gmail.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Vitaly Wool &lt;vitaly.wool@konsulko.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: remove check_move_unevictable_pages()</title>
<updated>2023-06-23T23:59:29Z</updated>
<author>
<name>Matthew Wilcox (Oracle)</name>
<email>willy@infradead.org</email>
</author>
<published>2023-06-21T16:45:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e0b72c14d8dcc9477e580c261041dae86d4906fe'/>
<id>urn:sha1:e0b72c14d8dcc9477e580c261041dae86d4906fe</id>
<content type='text'>
All callers have now been converted to call
check_move_unevictable_folios().

Link: https://lkml.kernel.org/r/20230621164557.3510324-7-willy@infradead.org
Signed-off-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/swap: swap_vma_readahead() do the pte_offset_map()</title>
<updated>2023-06-19T23:19:19Z</updated>
<author>
<name>Hugh Dickins</name>
<email>hughd@google.com</email>
</author>
<published>2023-06-09T01:52:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4f8fcf4ced0b7184149045818dcc2f9e2689b775'/>
<id>urn:sha1:4f8fcf4ced0b7184149045818dcc2f9e2689b775</id>
<content type='text'>
swap_vma_readahead() has been proceeding in an unconventional way, its
preliminary swap_ra_info() doing the pte_offset_map() and pte_unmap(),
then relying on that pte pointer even after the pte_unmap() - in its
CONFIG_64BIT case (I think !CONFIG_HIGHPTE was intended; whereas 32-bit
copied ptes to stack while they were mapped, but had to limit how many).

Though it would be difficult to construct a failing testcase, accessing
page table after pte_unmap() will become bad practice, even on 64-bit: an
rcu_read_unlock() in pte_unmap() will allow page table to be freed.

Move relevant definitions from include/linux/swap.h to mm/swap_state.c,
nothing else used them.  Delete the CONFIG_64BIT distinction and buffer,
delete all reference to ptes from swap_ra_info(), use pte_offset_map()
repeatedly in swap_vma_readahead(), breaking from the loop if it fails.

(Will the repeated "map" and "unmap" show up as a slowdown anywhere?  If
so, maybe modify __read_swap_cache_async() to do the pte_unmap() only when
it does not find the page already in the swapcache.)

Use ptep_get_lockless(), mainly for its READ_ONCE().  Correctly advance
the address passed down to each call of __read__swap_cache_async().

Link: https://lkml.kernel.org/r/b7c64ab3-9e44-aac0-d2b-c57de578af1c@google.com
Signed-off-by: Hugh Dickins &lt;hughd@google.com&gt;
Reviewed-by: "Huang, Ying" &lt;ying.huang@intel.com&gt;
Cc: Alistair Popple &lt;apopple@nvidia.com&gt;
Cc: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Cc: Axel Rasmussen &lt;axelrasmussen@google.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Cc: Christoph Hellwig &lt;hch@infradead.org&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Ira Weiny &lt;ira.weiny@intel.com&gt;
Cc: Jason Gunthorpe &lt;jgg@ziepe.ca&gt;
Cc: Kirill A. Shutemov &lt;kirill.shutemov@linux.intel.com&gt;
Cc: Lorenzo Stoakes &lt;lstoakes@gmail.com&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Cc: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Mike Rapoport (IBM) &lt;rppt@kernel.org&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Cc: Pavel Tatashin &lt;pasha.tatashin@soleen.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Qi Zheng &lt;zhengqi.arch@bytedance.com&gt;
Cc: Ralph Campbell &lt;rcampbell@nvidia.com&gt;
Cc: Ryan Roberts &lt;ryan.roberts@arm.com&gt;
Cc: SeongJae Park &lt;sj@kernel.org&gt;
Cc: Song Liu &lt;song@kernel.org&gt;
Cc: Steven Price &lt;steven.price@arm.com&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Thomas Hellström &lt;thomas.hellstrom@linux.intel.com&gt;
Cc: Will Deacon &lt;will@kernel.org&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: Yu Zhao &lt;yuzhao@google.com&gt;
Cc: Zack Rusin &lt;zackr@vmware.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: vmscan: mark kswapd_run() and kswapd_stop() __meminit</title>
<updated>2023-06-19T23:19:00Z</updated>
<author>
<name>Miaohe Lin</name>
<email>linmiaohe@huawei.com</email>
</author>
<published>2023-06-06T12:18:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e5797dc011182f8b25420bc977f37cd92fc6e755'/>
<id>urn:sha1:e5797dc011182f8b25420bc977f37cd92fc6e755</id>
<content type='text'>
Add __meminit to kswapd_run() and kswapd_stop() to ensure they're default
to __init when memory hotplug is not enabled.

Link: https://lkml.kernel.org/r/20230606121813.242163-1-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Acked-by: Yu Zhao &lt;yuzhao@google.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>swap: remove __swp_swapcount()</title>
<updated>2023-06-09T23:25:49Z</updated>
<author>
<name>Huang Ying</name>
<email>ying.huang@intel.com</email>
</author>
<published>2023-05-29T06:13:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3ecdeb0f876e91c4a7129ba2ba5baa530aa6c4f9'/>
<id>urn:sha1:3ecdeb0f876e91c4a7129ba2ba5baa530aa6c4f9</id>
<content type='text'>
__swp_swapcount() just encloses the calling to swap_swapcount() with
get/put_swap_device().  It is called in __read_swap_cache_async() only,
which encloses the calling with get/put_swap_device() already.  So,
__read_swap_cache_async() can call swap_swapcount() directly.

Link: https://lkml.kernel.org/r/20230529061355.125791-4-ying.huang@intel.com
Signed-off-by: "Huang, Ying" &lt;ying.huang@intel.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Chris Li (Google) &lt;chrisl@kernel.org&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Matthew Wilcox &lt;willy@infradead.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: Yu Zhao &lt;yuzhao@google.com&gt;
Cc: Yosry Ahmed &lt;yosryahmed@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>workingset: refactor LRU refault to expose refault recency check</title>
<updated>2023-06-09T23:25:16Z</updated>
<author>
<name>Nhat Pham</name>
<email>nphamcs@gmail.com</email>
</author>
<published>2023-05-03T01:36:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ffcb5f5262b756a598eefb11e340bbd027cde037'/>
<id>urn:sha1:ffcb5f5262b756a598eefb11e340bbd027cde037</id>
<content type='text'>
Patch series "cachestat: a new syscall for page cache state of files",
v13.

There is currently no good way to query the page cache statistics of large
files and directory trees.  There is mincore(), but it scales poorly: the
kernel writes out a lot of bitmap data that userspace has to aggregate,
when the user really does not care about per-page information in that
case.  The user also needs to mmap and unmap each file as it goes along,
which can be quite slow as well.

Some use cases where this information could come in handy:
  * Allowing database to decide whether to perform an index scan or direct
    table queries based on the in-memory cache state of the index.
  * Visibility into the writeback algorithm, for performance issues
    diagnostic.
  * Workload-aware writeback pacing: estimating IO fulfilled by page cache
    (and IO to be done) within a range of a file, allowing for more
    frequent syncing when and where there is IO capacity, and batching
    when there is not.
  * Computing memory usage of large files/directory trees, analogous to
    the du tool for disk usage.

More information about these use cases could be found in this thread:
https://lore.kernel.org/lkml/20230315170934.GA97793@cmpxchg.org/

This series of patches introduces a new system call, cachestat, that
summarizes the page cache statistics (number of cached pages, dirty pages,
pages marked for writeback, evicted pages etc.) of a file, in a specified
range of bytes.  It also include a selftest suite that tests some typical
usage.  Currently, the syscall is only wired in for x86 architecture.

This interface is inspired by past discussion and concerns with fincore,
which has a similar design (and as a result, issues) as mincore.  Relevant
links:

https://lkml.indiana.edu/hypermail/linux/kernel/1302.1/04207.html
https://lkml.indiana.edu/hypermail/linux/kernel/1302.1/04209.html


I have also developed a small tool that computes the memory usage of files
and directories, analogous to the du utility.  User can choose between
mincore or cachestat (with cachestat exporting more information than
mincore).  To compare the performance of these two options, I benchmarked
the tool on the root directory of a Meta's server machine, each for five
runs:

Using cachestat
real -- Median: 33.377s, Average: 33.475s, Standard Deviation: 0.3602
user -- Median: 4.08s, Average: 4.1078s, Standard Deviation: 0.0742
sys -- Median: 28.823s, Average: 28.8866s, Standard Deviation: 0.2689

Using mincore:
real -- Median: 102.352s, Average: 102.3442s, Standard Deviation: 0.2059
user -- Median: 10.149s, Average: 10.1482s, Standard Deviation: 0.0162
sys -- Median: 91.186s, Average: 91.2084s, Standard Deviation: 0.2046

I also ran both syscalls on a 2TB sparse file:

Using cachestat:
real    0m0.009s
user    0m0.000s
sys     0m0.009s

Using mincore:
real    0m37.510s
user    0m2.934s
sys     0m34.558s

Very large files like this are the pathological case for mincore.  In
fact, to compute the stats for a single 2TB file, mincore takes as long as
cachestat takes to compute the stats for the entire tree!  This could
easily happen inadvertently when we run it on subdirectories.  Mincore is
clearly not suitable for a general-purpose command line tool.

Regarding security concerns, cachestat() should not pose any additional
issues.  The caller already has read permission to the file itself (since
they need an fd to that file to call cachestat).  This means that the
caller can access the underlying data in its entirety, which is a much
greater source of information (and as a result, a much greater security
risk) than the cache status itself.

The latest API change (in v13 of the patch series) is suggested by Jens
Axboe.  It allows for 64-bit length argument, even on 32-bit architecture
(which is previously not possible due to the limit on the number of
syscall arguments).  Furthermore, it eliminates the need for compatibility
handling - every user can use the same ABI.


This patch (of 4):

In preparation for computing recently evicted pages in cachestat, refactor
workingset_refault and lru_gen_refault to expose a helper function that
would test if an evicted page is recently evicted.

[penguin-kernel@I-love.SAKURA.ne.jp: add missing rcu_read_unlock() in lru_gen_refault()]
  Link: https://lkml.kernel.org/r/610781bc-cf11-fc89-a46f-87cb8235d439@I-love.SAKURA.ne.jp
Link: https://lkml.kernel.org/r/20230503013608.2431726-1-nphamcs@gmail.com
Link: https://lkml.kernel.org/r/20230503013608.2431726-2-nphamcs@gmail.com
Signed-off-by: Nhat Pham &lt;nphamcs@gmail.com&gt;
Signed-off-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Acked-by: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Brian Foster &lt;bfoster@redhat.com&gt;
Cc: Johannes Weiner &lt;hannes@cmpxchg.org&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Michael Kerrisk &lt;mtk.manpages@gmail.com&gt;
Cc: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: correct arg in reclaim_pages()/reclaim_clean_pages_from_list()</title>
<updated>2023-04-21T21:52:02Z</updated>
<author>
<name>Kefeng Wang</name>
<email>wangkefeng.wang@huawei.com</email>
</author>
<published>2023-04-17T11:48:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4bf4f155bfbc77a12ad2c9e12d9f57718adb9a5c'/>
<id>urn:sha1:4bf4f155bfbc77a12ad2c9e12d9f57718adb9a5c</id>
<content type='text'>
Both of them change the arg from page_list to folio_list when convert them
to use a folio, but not the declaration, let's correct it, also move the
reclaim_pages() from swap.h to internal.h as it only used in mm.

Link: https://lkml.kernel.org/r/20230417114807.186786-1-wangkefeng.wang@huawei.com
Signed-off-by: Kefeng Wang &lt;wangkefeng.wang@huawei.com&gt;
Reviwed-by: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Reviewed-by: SeongJae Park &lt;sj@kernel.org&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
