<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/swapops.h, branch v6.1.23</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.1.23</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.1.23'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2022-12-10T02:41:16Z</updated>
<entry>
<title>mm/swap: fix SWP_PFN_BITS with CONFIG_PHYS_ADDR_T_64BIT on 32bit</title>
<updated>2022-12-10T02:41:16Z</updated>
<author>
<name>David Hildenbrand</name>
<email>david@redhat.com</email>
</author>
<published>2022-12-05T15:08:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=630dc25e43dacfb5af94cd41532e77c47ec1caff'/>
<id>urn:sha1:630dc25e43dacfb5af94cd41532e77c47ec1caff</id>
<content type='text'>
We use "unsigned long" to store a PFN in the kernel and phys_addr_t to
store a physical address.

On a 64bit system, both are 64bit wide.  However, on a 32bit system, the
latter might be 64bit wide.  This is, for example, the case on x86 with
PAE: phys_addr_t and PTEs are 64bit wide, while "unsigned long" only spans
32bit.

The current definition of SWP_PFN_BITS without MAX_PHYSMEM_BITS misses
that case, and assumes that the maximum PFN is limited by a 32bit
phys_addr_t.  This implies that SWP_PFN_BITS will currently only be able
to cover 4 GiB - 1 on any 32bit system with 4k page size, which is wrong.

Let's rely on the number of bits in phys_addr_t instead, but make sure to
not exceed the maximum swap offset, to not make the BUILD_BUG_ON() in
is_pfn_swap_entry() unhappy.  Note that swp_entry_t is effectively an
unsigned long and the maximum swap offset shares that value with the swap
type.
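
As an illustration, a minimal sketch of the resulting fallback
definition (hedged: the real swapops.h wraps this in more #ifdef
layers; the min_t() / sizeof(phys_addr_t) / "int" details follow the
bracketed notes further down in this message):

  /* Clamp the PFN bits to what both phys_addr_t and the swp offset allow. */
  #define SWP_PFN_BITS	min_t(int, \
			      sizeof(phys_addr_t) * 8 - PAGE_SHIFT, \
			      SWP_TYPE_SHIFT)
  #define SWP_PFN_MASK	(BIT(SWP_PFN_BITS) - 1)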

For example, on an 8 GiB x86 PAE system with a kernel config based on
Debian 11.5 (-&gt; CONFIG_FLATMEM=y, CONFIG_X86_PAE=y), we will currently
fail removing migration entries (remove_migration_ptes()), because
mm/page_vma_mapped.c:check_pte() will fail to identify a PFN match as
swp_offset_pfn() wrongly masks off PFN bits.  For example,
split_huge_page_to_list()-&gt;...-&gt;remap_page() will leave migration entries
in place and continue to unlock the page.

Later, when we stumble over these migration entries (e.g., via
/proc/self/pagemap), pfn_swap_entry_to_page() will BUG_ON() because these
migration entries shouldn't exist anymore and the page was unlocked.

[   33.067591] kernel BUG at include/linux/swapops.h:497!
[   33.067597] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[   33.067602] CPU: 3 PID: 742 Comm: cow Tainted: G            E      6.1.0-rc8+ #16
[   33.067605] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-1.fc36 04/01/2014
[   33.067606] EIP: pagemap_pmd_range+0x644/0x650
[   33.067612] Code: 00 00 00 00 66 90 89 ce b9 00 f0 ff ff e9 ff fb ff ff 89 d8 31 db e8 48 c6 52 00 e9 23 fb ff ff e8 61 83 56 00 e9 b6 fe ff ff &lt;0f&gt; 0b bf 00 f0 ff ff e9 38 fa ff ff 3e 8d 74 26 00 55 89 e5 57 31
[   33.067615] EAX: ee394000 EBX: 00000002 ECX: ee394000 EDX: 00000000
[   33.067617] ESI: c1b0ded4 EDI: 00024a00 EBP: c1b0ddb4 ESP: c1b0dd68
[   33.067619] DS: 007b ES: 007b FS: 00d8 GS: 0033 SS: 0068 EFLAGS: 00010246
[   33.067624] CR0: 80050033 CR2: b7a00000 CR3: 01bbbd20 CR4: 00350ef0
[   33.067625] Call Trace:
[   33.067628]  ? madvise_free_pte_range+0x720/0x720
[   33.067632]  ? smaps_pte_range+0x4b0/0x4b0
[   33.067634]  walk_pgd_range+0x325/0x720
[   33.067637]  ? mt_find+0x1d6/0x3a0
[   33.067641]  ? mt_find+0x1d6/0x3a0
[   33.067643]  __walk_page_range+0x164/0x170
[   33.067646]  walk_page_range+0xf9/0x170
[   33.067648]  ? __kmem_cache_alloc_node+0x2a8/0x340
[   33.067653]  pagemap_read+0x124/0x280
[   33.067658]  ? default_llseek+0x101/0x160
[   33.067662]  ? smaps_account+0x1d0/0x1d0
[   33.067664]  vfs_read+0x90/0x290
[   33.067667]  ? do_madvise.part.0+0x24b/0x390
[   33.067669]  ? debug_smp_processor_id+0x12/0x20
[   33.067673]  ksys_pread64+0x58/0x90
[   33.067675]  __ia32_sys_ia32_pread64+0x1b/0x20
[   33.067680]  __do_fast_syscall_32+0x4c/0xc0
[   33.067683]  do_fast_syscall_32+0x29/0x60
[   33.067686]  do_SYSENTER_32+0x15/0x20
[   33.067689]  entry_SYSENTER_32+0x98/0xf1

Decrease the indentation level of SWP_PFN_BITS and SWP_PFN_MASK to keep it
readable and consistent.

[david@redhat.com: rely on sizeof(phys_addr_t) and min_t() instead]
  Link: https://lkml.kernel.org/r/20221206105737.69478-1-david@redhat.com
[david@redhat.com: use "int" for comparison, as we're only comparing numbers &lt; 64]
  Link: https://lkml.kernel.org/r/1f157500-2676-7cef-a84e-9224ed64e540@redhat.com
Link: https://lkml.kernel.org/r/20221205150857.167583-1-david@redhat.com
Fixes: 0d206b5d2e0d ("mm/swap: add swp_offset_pfn() to fetch PFN from swap entry")
Signed-off-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Peter Xu &lt;peterx@redhat.com&gt;
Reviewed-by: Yang Shi &lt;shy828301@gmail.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/swap: cache swap migration A/D bits support</title>
<updated>2022-09-27T02:46:05Z</updated>
<author>
<name>Peter Xu</name>
<email>peterx@redhat.com</email>
</author>
<published>2022-08-11T16:13:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5154e607967d3f587fda84a40abbf900275016c9'/>
<id>urn:sha1:5154e607967d3f587fda84a40abbf900275016c9</id>
<content type='text'>
Introduce a variable swap_migration_ad_supported to cache whether the arch
supports swap migration A/D bits.

One thing to mention here is that SWP_MIG_TOTAL_BITS internally
references another macro, MAX_PHYSMEM_BITS, which is a function call on
x86 (and a constant on all other archs).

It's safe to reference it in swapfile_init() because by the time we get
there we're already at initcall level 4, so the 5-level pgtable for
x86_64 must have been initialized (right after early_identify_cpu()
finishes).

- start_kernel
  - setup_arch
    - early_cpu_init
      - get_cpu_cap --&gt; fetch from CPUID (including X86_FEATURE_LA57)
      - early_identify_cpu --&gt; clear X86_FEATURE_LA57 (if early lvl5 not enabled (USE_EARLY_PGTABLE_L5))
  - arch_call_rest_init
    - rest_init
      - kernel_init
        - kernel_init_freeable
          - do_basic_setup
            - do_initcalls --&gt; calls swapfile_init() (initcall level 4)

This should slightly speed up the handling of migration swap entries.
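
A sketch of the caching described above (assuming the variable and
function names from this series; initcall level 4 corresponds to
subsys_initcall()):

  /* mm/swapfile.c */
  #ifdef CONFIG_MIGRATION
  bool swap_migration_ad_supported;
  #endif	/* CONFIG_MIGRATION */

  static int __init swapfile_init(void)
  {
  	swapfile_maximum_size = arch_max_swapfile_size();

  #ifdef CONFIG_MIGRATION
  	if (swapfile_maximum_size &gt;= (1UL &lt;&lt; SWP_MIG_TOTAL_BITS))
  		swap_migration_ad_supported = true;
  #endif	/* CONFIG_MIGRATION */

  	return 0;
  }
  subsys_initcall(swapfile_init);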

Link: https://lkml.kernel.org/r/20220811161331.37055-8-peterx@redhat.com
Signed-off-by: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Alistair Popple &lt;apopple@nvidia.com&gt;
Cc: Andi Kleen &lt;andi.kleen@intel.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Huang Ying &lt;ying.huang@intel.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: "Kirill A . Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Nadav Amit &lt;nadav.amit@gmail.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/swap: cache maximum swapfile size when init swap</title>
<updated>2022-09-27T02:46:05Z</updated>
<author>
<name>Peter Xu</name>
<email>peterx@redhat.com</email>
</author>
<published>2022-08-11T16:13:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=be45a4902c7caa717fee6b2f671e59b396ed395c'/>
<id>urn:sha1:be45a4902c7caa717fee6b2f671e59b396ed395c</id>
<content type='text'>
We used to have max_swapfile_size() fetching the maximum swapfile size
per-arch.

As the callers of max_swapfile_size() grow, this patch introduces a
variable "swapfile_maximum_size" and caches the value of the old
max_swapfile_size(), so that we don't need to calculate the value every
time.

Caching the value in swapfile_init() is safe because by the time we
reach that phase all the relevant information has been initialized.
Here the major arch to take care of is x86, which defines the max
swapfile size based on the L1TF mitigation.

Both X86_BUG_L1TF and l1tf_mitigation should have been set up properly
when swapfile_init() is reached.  As a reference, the code path looks
like this for x86:
this for x86:

- start_kernel
  - setup_arch
    - early_cpu_init
      - early_identify_cpu --&gt; setup X86_BUG_L1TF
  - parse_early_param
    - l1tf_cmdline --&gt; set l1tf_mitigation
  - check_bugs
    - l1tf_select_mitigation --&gt; set l1tf_mitigation
  - arch_call_rest_init
    - rest_init
      - kernel_init
        - kernel_init_freeable
          - do_basic_setup
            - do_initcalls --&gt; calls swapfile_init() (initcall level 4)

The swapfile size only depends on the swp pte format on non-x86 archs,
so caching it is safe there too.

While at it, rename max_swapfile_size() to arch_max_swapfile_size(),
because an arch can define its own function, so it's more
straightforward to have "arch_" as its prefix.  In the meantime, export
swapfile_maximum_size to replace the old usages of max_swapfile_size().
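
A sketch of the resulting shape (hedged: the generic fallback shown
here mirrors the pre-existing generic swapfile-size computation; the
x86 override based on L1TF is not shown):

  /* mm/swapfile.c */
  unsigned long swapfile_maximum_size;

  /* Generic fallback; an arch (e.g. x86 for L1TF) can override this. */
  __weak unsigned long arch_max_swapfile_size(void)
  {
  	return swp_offset(pte_to_swp_entry(
  			swp_entry_to_pte(swp_entry(0, ~0UL)))) + 1;
  }

  static int __init swapfile_init(void)
  {
  	swapfile_maximum_size = arch_max_swapfile_size();
  	return 0;
  }
  subsys_initcall(swapfile_init);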

[peterx@redhat.com: declare arch_max_swapfile_size() in swapfile.h]
  Link: https://lkml.kernel.org/r/YxTh1GuC6ro5fKL5@xz-m1.local
Link: https://lkml.kernel.org/r/20220811161331.37055-7-peterx@redhat.com
Signed-off-by: Peter Xu &lt;peterx@redhat.com&gt;
Reviewed-by: "Huang, Ying" &lt;ying.huang@intel.com&gt;
Cc: Alistair Popple &lt;apopple@nvidia.com&gt;
Cc: Andi Kleen &lt;andi.kleen@intel.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: "Kirill A . Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Nadav Amit &lt;nadav.amit@gmail.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm: remember young/dirty bit for page migrations</title>
<updated>2022-09-27T02:46:05Z</updated>
<author>
<name>Peter Xu</name>
<email>peterx@redhat.com</email>
</author>
<published>2022-08-11T16:13:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2e3468778dbe3ec389a10c21a703bb8e5be5cfbc'/>
<id>urn:sha1:2e3468778dbe3ec389a10c21a703bb8e5be5cfbc</id>
<content type='text'>
When page migration happens, we always ignore the young/dirty bit
settings in the old pgtable, mark the page as old in the new page table
using either pte_mkold() or pmd_mkold(), and keep the pte clean.

That's fine functionally, but it's not friendly to page reclaim because
the page being moved can be actively accessed during the procedure.
Not to mention that having the hardware set the young bit can bring
quite some overhead on some systems, e.g.  x86_64 needs a few hundred
nanoseconds to set the bit.  The same slowdown applies to the dirty bit
when the memory is first written after page migration.

Actually we can easily remember the A/D bit configuration and recover
the information after the page is migrated.  To achieve it, define a
new set of bits in the migration swap offset field to cache the A/D
bits of the old pte.  Then when removing/recovering the migration
entry, we can recover the A/D bits even if the page changed, as
sketched below.
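
A sketch of the bit layout and one of the helpers (hedged: the bit
positions sit above the SWP_PFN_BITS PFN field introduced earlier in
this series, and migration_entry_supports_ad() stands for the runtime
check described in the next paragraph):

  /* include/linux/swapops.h: migration entries keep A/D above the PFN. */
  #define SWP_MIG_YOUNG_BIT	(SWP_PFN_BITS)
  #define SWP_MIG_DIRTY_BIT	(SWP_PFN_BITS + 1)
  #define SWP_MIG_TOTAL_BITS	(SWP_PFN_BITS + 2)

  static inline swp_entry_t make_migration_entry_young(swp_entry_t entry)
  {
  	if (migration_entry_supports_ad())
  		return swp_entry(swp_type(entry),
  				 swp_offset(entry) | BIT(SWP_MIG_YOUNG_BIT));
  	/* Not enough swp offset bits: drop the young bit as before. */
  	return entry;
  }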

One thing to mention is that here we used max_swapfile_size() to detect
how many swp offset bits we have, and we'll only enable this feature if we
know the swp offset is big enough to store both the PFN value and the A/D
bits.  Otherwise the A/D bits are dropped like before.

Link: https://lkml.kernel.org/r/20220811161331.37055-6-peterx@redhat.com
Signed-off-by: Peter Xu &lt;peterx@redhat.com&gt;
Reviewed-by: "Huang, Ying" &lt;ying.huang@intel.com&gt;
Cc: Alistair Popple &lt;apopple@nvidia.com&gt;
Cc: Andi Kleen &lt;andi.kleen@intel.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: "Kirill A . Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Nadav Amit &lt;nadav.amit@gmail.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/swap: add swp_offset_pfn() to fetch PFN from swap entry</title>
<updated>2022-09-27T02:46:05Z</updated>
<author>
<name>Peter Xu</name>
<email>peterx@redhat.com</email>
</author>
<published>2022-08-11T16:13:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0d206b5d2e0d7d7f09ac9540e3ab3e35a34f536e'/>
<id>urn:sha1:0d206b5d2e0d7d7f09ac9540e3ab3e35a34f536e</id>
<content type='text'>
We've got a bunch of special swap entries that store the PFN inside the
swap offset field.  To fetch the PFN, normally the user just calls
swp_offset(), assuming that'll be the PFN.

Add a helper swp_offset_pfn() to fetch the PFN instead.  It fetches
only the maximum possible length of a PFN on the host, while the
BUILD_BUG_ON() in is_pfn_swap_entry() checks against MAX_PHYSMEM_BITS
to make sure the swap offset can always store the PFN properly.
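
A sketch of the helper and the build-time check (hedged: the set of
pfn swap entry types is abridged here):

  static inline bool is_pfn_swap_entry(swp_entry_t entry)
  {
  	/* Make sure the swp offset can always store the needed fields. */
  	BUILD_BUG_ON(SWP_TYPE_SHIFT &lt; SWP_PFN_BITS);

  	return is_migration_entry(entry) || is_device_private_entry(entry);
  }

  static inline unsigned long swp_offset_pfn(swp_entry_t entry)
  {
  	VM_BUG_ON(!is_pfn_swap_entry(entry));
  	return swp_offset(entry) &amp; SWP_PFN_MASK;
  }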

One reason to do so is that we never tried to sanitize whether the swap
offset can really fit a PFN.  At the same time, this patch also
prepares us for the future possibility of storing more information
inside the swp offset field, so the assumption that "swp_offset(entry)"
is the PFN will soon no longer hold.

Replace many of the swp_offset() callers with swp_offset_pfn() where
appropriate.  Note that many of the existing users are not candidates
for the replacement, e.g.:

  (1) When the swap entry is not a pfn swap entry at all, or,
  (2) when we want to keep the whole swp_offset but only change the swp type.

The latter can happen when fork() is triggered on a write-migration
swap entry pte: we may want to only change the migration type from
write to read but keep the rest, so it's not "fetching the PFN" but
"changing the swap type only".  Those callers are left aside so that
when there's more information within the swp offset it'll be carried
over naturally in those cases.

While at it, drop hwpoison_entry_to_pfn(), because that's exactly what
the new swp_offset_pfn() is about.

Link: https://lkml.kernel.org/r/20220811161331.37055-4-peterx@redhat.com
Signed-off-by: Peter Xu &lt;peterx@redhat.com&gt;
Reviewed-by: "Huang, Ying" &lt;ying.huang@intel.com&gt;
Cc: Alistair Popple &lt;apopple@nvidia.com&gt;
Cc: Andi Kleen &lt;andi.kleen@intel.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: "Kirill A . Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Nadav Amit &lt;nadav.amit@gmail.com&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/swap: comment all the ifdef in swapops.h</title>
<updated>2022-09-27T02:46:04Z</updated>
<author>
<name>Peter Xu</name>
<email>peterx@redhat.com</email>
</author>
<published>2022-08-11T16:13:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=eba4d770efc86a3710e36b828190858abfa3bb74'/>
<id>urn:sha1:eba4d770efc86a3710e36b828190858abfa3bb74</id>
<content type='text'>
swapops.h contains quite a few layers of ifdef.  Some of the "else" and
"endif" lines don't carry a comment naming the macro, so it's hard to
follow what they refer to.  Add the comments.
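
For example, the convention being applied (illustrative only):

  #ifdef CONFIG_MIGRATION
  ...
  #else	/* CONFIG_MIGRATION */
  ...
  #endif	/* CONFIG_MIGRATION */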

Link: https://lkml.kernel.org/r/20220811161331.37055-3-peterx@redhat.com
Signed-off-by: Peter Xu &lt;peterx@redhat.com&gt;
Suggested-by: Nadav Amit &lt;nadav.amit@gmail.com&gt;
Reviewed-by: Huang Ying &lt;ying.huang@intel.com&gt;
Reviewed-by: Alistair Popple &lt;apopple@nvidia.com&gt;
Cc: Andi Kleen &lt;andi.kleen@intel.com&gt;
Cc: Andrea Arcangeli &lt;aarcange@redhat.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: "Kirill A . Shutemov" &lt;kirill@shutemov.name&gt;
Cc: Minchan Kim &lt;minchan@kernel.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Dave Hansen &lt;dave.hansen@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, hwpoison: use num_poisoned_pages_sub() to decrease num_poisoned_pages</title>
<updated>2022-09-27T02:46:04Z</updated>
<author>
<name>Miaohe Lin</name>
<email>linmiaohe@huawei.com</email>
</author>
<published>2022-08-30T12:36:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=21c9e90ab9a4c991d21dd15cc5163c99a885d4a8'/>
<id>urn:sha1:21c9e90ab9a4c991d21dd15cc5163c99a885d4a8</id>
<content type='text'>
Use num_poisoned_pages_sub() to combine multiple atomic ops into one.
Also, num_poisoned_pages_dec() can be killed as it has no callers now.
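
For reference, a sketch of the batched helper (assuming the usual
atomic_long_t num_poisoned_pages counter in mm/memory-failure.c):

  /* One atomic op for N pages instead of N separate decrements. */
  static inline void num_poisoned_pages_sub(long i)
  {
  	atomic_long_sub(i, &amp;num_poisoned_pages);
  }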

Link: https://lkml.kernel.org/r/20220830123604.25763-4-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Acked-by: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm, hwpoison: make unpoison aware of raw error info in hwpoisoned hugepage</title>
<updated>2022-08-09T01:06:44Z</updated>
<author>
<name>Naoya Horiguchi</name>
<email>naoya.horiguchi@nec.com</email>
</author>
<published>2022-07-14T04:24:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ac5fcde0a96a18773f06b7c00c5ea081bbdc64b3'/>
<id>urn:sha1:ac5fcde0a96a18773f06b7c00c5ea081bbdc64b3</id>
<content type='text'>
The raw error info list needs to be removed when a hwpoisoned hugetlb
page is unpoisoned, and the unpoison handler needs to know how many
errors there are in the target hugepage.  So add both.

HPageVmemmapOptimized(hpage) and HPageRawHwpUnreliable(hpage) pages
sometimes can't be unpoisoned, so skip them.
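
A rough sketch of the counting side (hedged: this assumes the
per-hugepage raw_hwp_page llist introduced earlier in this series, and
the exact helper name free_raw_hwp_pages() may differ):

  static long free_raw_hwp_pages(struct page *hpage, bool move_flag)
  {
  	struct llist_head *head = raw_hwp_list_head(hpage);
  	struct llist_node *t, *tnode;
  	long count = 0;

  	llist_for_each_safe(tnode, t, head-&gt;first) {
  		struct raw_hwp_page *p;

  		p = container_of(tnode, struct raw_hwp_page, node);
  		/* Optionally move the raw error marker back to the page. */
  		if (move_flag)
  			SetPageHWPoison(p-&gt;page);
  		kfree(p);
  		count++;
  	}
  	llist_del_all(head);
  	return count;
  }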

Link: https://lkml.kernel.org/r/20220714042420.1847125-5-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Reviewed-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Liu Shixin &lt;liushixin2@huawei.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/migration: fix potential pte_unmap on an not mapped pte</title>
<updated>2022-07-04T01:08:37Z</updated>
<author>
<name>Miaohe Lin</name>
<email>linmiaohe@huawei.com</email>
</author>
<published>2022-05-30T11:30:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ad1ac596e8a8c4b06715dfbd89853eb73c9886b2'/>
<id>urn:sha1:ad1ac596e8a8c4b06715dfbd89853eb73c9886b2</id>
<content type='text'>
__migration_entry_wait and migration_entry_wait_on_locked assume the
pte is always mapped by the caller.  But this is not the case when they
are called from migration_entry_wait_huge and follow_huge_pmd.  Add a
hugetlbfs variant that calls hugetlb_migration_entry_wait(ptep == NULL)
to fix this issue.
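
A sketch of the shape of the fix (hedged: simplified from mm/migrate.c;
the point is that the hugetlbfs variant takes the page table lock
itself instead of assuming a mapped pte):

  void __migration_entry_wait_huge(pte_t *ptep, spinlock_t *ptl)
  {
  	pte_t pte;

  	spin_lock(ptl);
  	pte = huge_ptep_get(ptep);

  	if (unlikely(!is_hugetlb_entry_migration(pte)))
  		spin_unlock(ptl);
  	else
  		/* Drops the ptl for us once the entry is stable. */
  		migration_entry_wait_on_locked(pte_to_swp_entry(pte),
  					       NULL, ptl);
  }

  void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte)
  {
  	spinlock_t *ptl = huge_pte_lockptr(hstate_vma(vma),
  					   vma-&gt;vm_mm, pte);

  	__migration_entry_wait_huge(pte, ptl);
  }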

Link: https://lkml.kernel.org/r/20220530113016.16663-5-linmiaohe@huawei.com
Fixes: 30dad30922cc ("mm: migration: add migrate_entry_wait_huge()")
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Suggested-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Alistair Popple &lt;apopple@nvidia.com&gt;
Cc: Christoph Hellwig &lt;hch@lst.de&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Huang Ying &lt;ying.huang@intel.com&gt;
Cc: kernel test robot &lt;lkp@intel.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/swapfile: unuse_pte can map random data if swap read fails</title>
<updated>2022-05-27T16:33:45Z</updated>
<author>
<name>Miaohe Lin</name>
<email>linmiaohe@huawei.com</email>
</author>
<published>2022-05-19T12:50:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9f186f9e5fa9ebdaef909fd45f825a6ce281f13c'/>
<id>urn:sha1:9f186f9e5fa9ebdaef909fd45f825a6ce281f13c</id>
<content type='text'>
Patch series "A few fixup patches for mm", v4.

This series contains a few patches to avoid mapping random data if a
swap read fails and to fix lost swap bits in unuse_pte.  We also free
hwpoison and swapin error entries in madvise_free_pte_range, and so on.
More details can be found in the respective changelogs.


This patch (of 5):

There is a bug in unuse_pte(): when the swap page happens to be
unreadable, a page filled with random data is mapped into the user
address space.  In case of error, a special swap entry indicating that
the swap read failed is set in the page table.  The swapcache page can
then be freed, and the user won't end up with swap permanently occupied
because a sector is bad.  If the page is accessed later, the user
process will be killed so that corrupted data is never consumed.  On
the other hand, if the page is never accessed, the user won't even
notice it.
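
A sketch of the idea in unuse_pte() (hedged: the error path is
simplified, and make_swapin_error_entry() follows the naming used by
this series):

  if (unlikely(!PageUptodate(page))) {
  	pte_t pteval;

  	/* Swap read failed: poison the pte instead of mapping garbage. */
  	dec_mm_counter(vma-&gt;vm_mm, MM_SWAPENTS);
  	pteval = swp_entry_to_pte(make_swapin_error_entry(page));
  	set_pte_at(vma-&gt;vm_mm, addr, pte, pteval);
  	swap_free(entry);
  	ret = 0;
  	goto out;
  }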

Link: https://lkml.kernel.org/r/20220519125030.21486-1-linmiaohe@huawei.com
Link: https://lkml.kernel.org/r/20220519125030.21486-2-linmiaohe@huawei.com
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Hugh Dickins &lt;hughd@google.com&gt;
Cc: Matthew Wilcox (Oracle) &lt;willy@infradead.org&gt;
Cc: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: NeilBrown &lt;neilb@suse.de&gt;
Cc: Alistair Popple &lt;apopple@nvidia.com&gt;
Cc: Suren Baghdasaryan &lt;surenb@google.com&gt;
Cc: Peter Xu &lt;peterx@redhat.com&gt;
Cc: Ralph Campbell &lt;rcampbell@nvidia.com&gt;
Cc: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
