<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/memory.h, branch v6.17.8</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.17.8</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.17.8'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2025-07-13T23:38:17Z</updated>
<entry>
<title>mm,memory_hotplug: drop status_change_nid parameter from memory_notify</title>
<updated>2025-07-13T23:38:17Z</updated>
<author>
<name>Oscar Salvador</name>
<email>osalvador@suse.de</email>
</author>
<published>2025-06-16T13:51:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d2a9721d807de405b198291badcc807700746781'/>
<id>urn:sha1:d2a9721d807de405b198291badcc807700746781</id>
<content type='text'>
There no users left of status_change_nid, so drop it from memory_notify
struct.

Link: https://lkml.kernel.org/r/20250616135158.450136-12-osalvador@suse.de
Signed-off-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Suggested-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Reviewed-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Harry Yoo &lt;harry.yoo@oracle.com&gt;
Cc: Hyeonggon Yoo &lt;42.hyeyoo@gmail.com&gt;
Cc: Joanthan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Cc: Rakie Kim &lt;rakie.kim@sk.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm,memory_hotplug: remove status_change_nid_normal and update documentation</title>
<updated>2025-07-13T23:38:14Z</updated>
<author>
<name>Oscar Salvador</name>
<email>osalvador@suse.de</email>
</author>
<published>2025-06-16T13:51:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8d2882a8edb8621d37fd8931e0686070cc6cc189'/>
<id>urn:sha1:8d2882a8edb8621d37fd8931e0686070cc6cc189</id>
<content type='text'>
Now that the last user of status_change_nid_normal is gone, we can remove
it.  Update documentation accordingly.

Link: https://lkml.kernel.org/r/20250616135158.450136-3-osalvador@suse.de
Signed-off-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Reviewed-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Harry Yoo &lt;harry.yoo@oracle.com&gt;
Cc: Hyeonggon Yoo &lt;42.hyeyoo@gmail.com&gt;
Cc: Joanthan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Cc: Rakie Kim &lt;rakie.kim@sk.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>drivers/base/node: optimize memory block registration to reduce boot time</title>
<updated>2025-07-10T05:41:59Z</updated>
<author>
<name>Donet Tom</name>
<email>donettom@linux.ibm.com</email>
</author>
<published>2025-05-28T17:18:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4f745def815d86eece3614aa0cd10a25cf94fed4'/>
<id>urn:sha1:4f745def815d86eece3614aa0cd10a25cf94fed4</id>
<content type='text'>
Patch series "drivers/base/node.c: optimization and cleanups", v7.


This patch (of 7)

During node device initialization, `memory blocks` are registered under
each NUMA node.  The `memory blocks` to be registered are identified using
the node's start and end PFNs, which are obtained from the node's pg_data

However, not all PFNs within this range necessarily belong to the same
node—some may belong to other nodes.  Additionally, due to the
discontiguous nature of physical memory, certain sections within a `memory
block` may be absent.

As a result, `memory blocks` that fall between a node's start and end PFNs
may span across multiple nodes, and some sections within those blocks may
be missing.  `Memory blocks` have a fixed size, which is architecture
dependent.

Due to these considerations, the memory block registration is currently
performed as follows:

for_each_online_node(nid):
    start_pfn = pgdat-&gt;node_start_pfn;
    end_pfn = pgdat-&gt;node_start_pfn + node_spanned_pages;
    for_each_memory_block_between(PFN_PHYS(start_pfn), PFN_PHYS(end_pfn))
        mem_blk = memory_block_id(pfn_to_section_nr(pfn));
        pfn_mb_start=section_nr_to_pfn(mem_blk-&gt;start_section_nr)
        pfn_mb_end = pfn_start + memory_block_pfns - 1
        for (pfn = pfn_mb_start; pfn &lt; pfn_mb_end; pfn++):
            if (get_nid_for_pfn(pfn) != nid):
                continue;
            else
                do_register_memory_block_under_node(nid, mem_blk,
                                                        MEMINIT_EARLY);

Here, we derive the start and end PFNs from the node's pg_data, then
determine the memory blocks that may belong to the node.  For each `memory
block` in this range, we inspect all PFNs it contains and check their
associated NUMA node ID.  If a PFN within the block matches the current
node, the memory block is registered under that node.

If CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, get_nid_for_pfn() performs
a binary search in the `memblock regions` to determine the NUMA node ID
for a given PFN.  If it is not enabled, the node ID is retrieved directly
from the struct page.

On large systems, this process can become time-consuming, especially since
we iterate over each `memory block` and all PFNs within it until a match
is found.  When CONFIG_DEFERRED_STRUCT_PAGE_INIT is enabled, the
additional overhead of the binary search increases the execution time
significantly, potentially leading to soft lockups during boot.

In this patch, we iterate over `memblock region` to identify the `memory
blocks` that belong to the current NUMA node.  `memblock regions` are
contiguous memory ranges, each associated with a single NUMA node, and
they do not span across multiple nodes.

for_each_memory_region(r): // r =&gt; region
  if (!node_online(r-&gt;nid)):
    continue;
  else
    for_each_memory_block_between(r-&gt;base, r-&gt;base + r-&gt;size - 1):
      do_register_memory_block_under_node(r-&gt;nid, mem_blk, MEMINIT_EARLY);

We iterate over all memblock regions, and if the node associated with the
region is online, we calculate the start and end memory blocks based on
the region's start and end PFNs.  We then register all the memory blocks
within that range under the region node.

Test Results on My system with 32TB RAM
=======================================
1. Boot time with CONFIG_DEFERRED_STRUCT_PAGE_INIT enabled.

Without this patch
------------------
Startup finished in 1min 16.528s (kernel)

With this patch
---------------
Startup finished in 17.236s (kernel) - 78% Improvement

2. Boot time with CONFIG_DEFERRED_STRUCT_PAGE_INIT disabled.

Without this patch
------------------
Startup finished in 28.320s (kernel)

With this patch
---------------
Startup finished in 15.621s (kernel) - 46% Improvement

[donettom@linux.ibm.com: restore removed extra line]
  Link: https://lkml.kernel.org/r/20250609140354.467908-1-donettom@linux.ibm.com
Link: https://lkml.kernel.org/r/2a0a05c2dffc62a742bf1dd030098be4ce99be28.1748452241.git.donettom@linux.ibm.com
Link: https://lkml.kernel.org/r/2a0a05c2dffc62a742bf1dd030098be4ce99be28.1748452241.git.donettom@linux.ibm.com
Signed-off-by: Donet Tom &lt;donettom@linux.ibm.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Acked-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Acked-by: Zi Yan &lt;ziy@nvidia.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>memory: implement memory_block_advise/probe_max_size</title>
<updated>2025-05-12T00:48:07Z</updated>
<author>
<name>Gregory Price</name>
<email>gourry@gourry.net</email>
</author>
<published>2025-01-27T15:34:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=737e9d021993c99377bb61e762c0d6945a37615b'/>
<id>urn:sha1:737e9d021993c99377bb61e762c0d6945a37615b</id>
<content type='text'>
Patch series "memory,x86,acpi: hotplug memory alignment advisement", v8.

When physical address regions are not aligned to memory block size, the
misaligned portion is lost (stranded capacity).

Block size (min/max/selected) is architecture defined.  Most architectures
tend to use the minimum block size or some simplistic heurist.  On x86,
memory block size increases up to 2GB, and is otherwise fitted to the
alignment of non-hotplug (i.e.  not special purpose memory).

CXL exposes its memory for management through the ACPI CEDT (CXL Early
Detection Table) in a field called the CXL Fixed Memory Window.  Per the
CXL specification, this memory must be aligned to at least 256MB.

When a CFMW aligns on a size less than the block size, this causes a loss
of up to 2GB per CFMW on x86.  It is not uncommon for CFMW to be allocated
per-device - though this behavior is BIOS defined.

This patch set provides 3 things:
 1) implement advise/query functions in driverse/base/memory.c to
    report/query architecture agnostic hotplug block alignment advice.
 2) update x86 memblock size logic to consider the hotplug advice
 3) add code in acpi/numa/srat.c to report CFMW alignment advice

The advisement interfaces are design to be called during arch_init code
prior to allocator and smp_init.  start_kernel will call these through
setup_arch() (via acpi and mm/init_64.c on x86), which occurs prior to
mm_core_init and smp_init - so no need for atomics.

There's an attempt to signal callers to advise() that query has already
occurred, but this is predicated on the notion that query actually occurs
(which presently only happens on the x86 arch).  This is to assist
debugging future users.  Otherwise, the advise() call has been marked
__init to help static discovery of bad call times.

Once query is called the first time, it will always return the same value.

Interfaces return -EBUSY and 0 respectively on systems without hotplug.


This patch (of 3):

Hotplug memory sources may have opinions on what the memblock size should
be - usually for alignment purposes.  For example, CXL memory extents can
be 256MB with a matching alignment.  If this size/alignment is smaller
than the block size, it can result in stranded capacity.

Implement memory_block_advise_max_size for use prior to allocator init,
for software to advise the system on the max block size.

Implement memory_block_probe_max_size for use by arch init code to
calculate the best block size.  Use of advice is architecture defined.

The probe value can never change after first probe.  Calls to advise after
probe will return -EBUSY to aid debugging.

On systems without hotplug, always return -ENODEV and 0 respectively.

Link: https://lkml.kernel.org/r/20250127153405.3379117-1-gourry@gourry.net
Link: https://lkml.kernel.org/r/20250127153405.3379117-2-gourry@gourry.net
Signed-off-by: Gregory Price &lt;gourry@gourry.net&gt;
Suggested-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Mike Rapoport (Microsoft) &lt;rppt@kernel.org&gt;
Acked-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Tested-by: Fan Ni &lt;fan.ni@samsung.com&gt;
Reviewed-by: Ira Weiny &lt;ira.weiny@intel.com&gt;
Acked-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Alison Schofield &lt;alison.schofield@intel.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Borislav Betkov &lt;bp@alien8.de&gt;
Cc: Bruno Faccini &lt;bfaccini@nvidia.com&gt;
Cc: Dave Hansen &lt;dave.hansen@linux.intel.com&gt;
Cc: Dave Jiang &lt;dave.jiang@intel.com&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: Haibo Xu &lt;haibo1.xu@intel.com&gt;
Cc: "H. Peter Anvin" &lt;hpa@zytor.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Joanthan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Cc: Robert Richter &lt;rrichter@amd.com&gt;
Cc: Thomas Gleinxer &lt;tglx@linutronix.de&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>drivers/base/memory: correct the field name in the header</title>
<updated>2025-03-18T05:07:02Z</updated>
<author>
<name>Gavin Shan</name>
<email>gshan@redhat.com</email>
</author>
<published>2025-03-11T23:30:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1a24776fca39ed79d7f5e0b0592b5addd784981e'/>
<id>urn:sha1:1a24776fca39ed79d7f5e0b0592b5addd784981e</id>
<content type='text'>
Replace @blocks with @memory_blocks to match with the definition of struct
memory_group.

Link: https://lkml.kernel.org/r/20250311233045.148943-3-gshan@redhat.com
Signed-off-by: Gavin Shan &lt;gshan@redhat.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Danilo Krummrich &lt;dakr@kernel.org&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: "Rafael J. Wysocki" &lt;rafael@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'cxl-for-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl</title>
<updated>2024-03-16T17:04:12Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2024-03-16T17:04:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=02c163e959b72059ce409a8516170dc40193001f'/>
<id>urn:sha1:02c163e959b72059ce409a8516170dc40193001f</id>
<content type='text'>
Pull CXL updates from Dan Williams:
 "CXL has mechanisms to enumerate the performance characteristics of
  memory devices. Those mechanisms allow Linux to build the equivalent
  of ACPI SRAT, SLIT, and HMAT tables dynamically at runtime. That
  capability is necessary because static ACPI can not represent dynamic
  CXL configurations (and reconfigurations).

  So, building on the v6.8 work to add "Quality of Service" enumeration,
  this update plumbs CXL "access coordinates" (read/write access latency
  and bandwidth) in all the same places that ACPI HMAT feeds similar
  data. Follow-on patches from the -mm side can then use that data to
  feed mechanisms like mm/memory-tiers.c. Greg has acked the touch to
  drivers/base/.

  The other feature update this cycle is support for CXL error injection
  via the ACPI EINJ module. That facility enables injection of bus
  protocol errors provided the user knows the magic address values to
  insert in the interface. To hide that magic, and make this easier to
  use, new error injection attributes were added to CXL debugfs. That
  interface injects the errors relative to a CXL object rather than
  require user tooling to know how to lookup and inject RCRB (Root
  Complex Register Block) addresses into the raw EINJ debugfs interface.
  It received some helpful review comments from Tony, but no explicit
  acks from the ACPI side. The primary user visible change for existing
  EINJ users is that they may find that einj.ko was already loaded by
  cxl_core.ko. Previously, einj.ko was only loaded on demand.

  The usual collection of miscellaneous cleanups are also present this
  cycle.

  Summary:

   - Supplement ACPI HMAT reported memory performance with native CXL
     memory performance enumeration

   - Add support for CXL error injection via the ACPI EINJ mechanism

   - Cleanup CXL DOE and CDAT integration

   - Miscellaneous cleanups and fixes"

* tag 'cxl-for-6.9' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (21 commits)
  Documentation/ABI/testing/debugfs-cxl: Fix "Unexpected indentation"
  lib/firmware_table: Provide buffer length argument to cdat_table_parse()
  cxl/pci: Get rid of pointer arithmetic reading CDAT table
  cxl/pci: Rename DOE mailbox handle to doe_mb
  cxl: Fix the incorrect assignment of SSLBIS entry pointer initial location
  cxl/core: Add CXL EINJ debugfs files
  EINJ, Documentation: Update EINJ kernel doc
  EINJ: Add CXL error type support
  EINJ: Migrate to a platform driver
  cxl/region: Deal with numa nodes not enumerated by SRAT
  cxl/region: Add memory hotplug notifier for cxl region
  cxl/region: Add sysfs attribute for locality attributes of CXL regions
  cxl/region: Calculate performance data for a region
  cxl: Set cxlmd-&gt;endpoint before adding port device
  cxl: Move QoS class to be calculated from the nearest CPU
  cxl: Split out host bridge access coordinates
  cxl: Split out combine_coordinates() for common shared usage
  ACPI: HMAT / cxl: Add retrieval of generic port coordinates for both access classes
  ACPI: HMAT: Introduce 2 levels of generic port access class
  base/node / ACPI: Enumerate node access class for 'struct access_coordinate'
  ...
</content>
</entry>
<entry>
<title>cxl/region: Add memory hotplug notifier for cxl region</title>
<updated>2024-03-12T19:34:12Z</updated>
<author>
<name>Dave Jiang</name>
<email>dave.jiang@intel.com</email>
</author>
<published>2024-03-08T21:59:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=067353a46d8ccdac279ebab97c038c3658e97541'/>
<id>urn:sha1:067353a46d8ccdac279ebab97c038c3658e97541</id>
<content type='text'>
When the CXL region is formed, the driver computes the performance data
for the region. However this data is not available at the node data
collection that has been populated by the HMAT during kernel
initialization. Add a memory hotplug notifier to update the access
coordinates to the 'struct memory_target' context kept by the
HMAT_REPORTING code.

Add CXL_CALLBACK_PRI for a memory hotplug callback priority. Set the
priority number to be called before HMAT_CALLBACK_PRI. The CXL update must
happen before hmat_callback().

A new HMAT_REPORTING helper hmat_update_target_coordinates() is added in
order to allow CXL to update the memory_target access coordinates.

A new ext_updated member is added to the memory_target to indicate that
the access coordinates within the memory_target has been updated by an
external agent such as CXL. This prevents data being overwritten by the
hmat_update_target_attrs() triggered by hmat_callback().

Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Rafael J. Wysocki &lt;rafael@kernel.org&gt;
Reviewed-by: Huang, Ying &lt;ying.huang@intel.com&gt;
Reviewed-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Tested-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Signed-off-by: Dave Jiang &lt;dave.jiang@intel.com&gt;
Link: https://lore.kernel.org/r/20240308220055.2172956-12-dave.jiang@intel.com
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
</content>
</entry>
<entry>
<title>mm/memory_hotplug: introduce MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE notifiers</title>
<updated>2024-02-22T00:00:01Z</updated>
<author>
<name>Sumanth Korikkar</name>
<email>sumanthk@linux.ibm.com</email>
</author>
<published>2024-01-08T13:27:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c5f1e2d1890935a734c302b9b8579748222b8e1e'/>
<id>urn:sha1:c5f1e2d1890935a734c302b9b8579748222b8e1e</id>
<content type='text'>
Patch series "implement "memmap on memory" feature on s390".

This series provides "memmap on memory" support on s390 platform.  "memmap
on memory" allows struct pages array to be allocated from the hotplugged
memory range instead of allocating it from main system memory.

s390 currently preallocates struct pages array for all potentially
possible memory, which ensures memory onlining always succeeds, but with
the cost of significant memory consumption from the available system
memory during boottime.  In certain extreme configuration, this could lead
to ipl failure.

"memmap on memory" ensures struct pages array are populated from self
contained hotplugged memory range instead of depleting the available
system memory and this could eliminate ipl failure on s390 platform.

On other platforms, system might go OOM when the physically hotplugged
memory depletes the available memory before it is onlined.  Hence, "memmap
on memory" feature was introduced as described in commit a08a2ae34613
("mm,memory_hotplug: allocate memmap from the added memory range").

Unlike other architectures, s390 memory blocks are not physically
accessible until it is online.  To make it physically accessible two new
memory notifiers MEM_PREPARE_ONLINE / MEM_FINISH_OFFLINE are added and
this notifier lets the hypervisor inform that the memory should be made
physically accessible.  This allows for "memmap on memory" initialization
during memory hotplug onlining phase, which is performed before calling
MEM_GOING_ONLINE notifier.

Patch 1 introduces MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE memory notifiers
to prepare the transition of memory to and from a physically accessible
state.  New mhp_flag MHP_OFFLINE_INACCESSIBLE is introduced to ensure
altmap cannot be written when adding memory - before it is set online. 
This enhancement is crucial for implementing the "memmap on memory"
feature for s390 in a subsequent patch.

Patches 2 allocates vmemmap pages from self-contained memory range for
s390.  It allocates memory map (struct pages array) from the hotplugged
memory range, rather than using system memory by passing altmap to vmemmap
functions.

Patch 3 removes unhandled memory notifier types on s390.

Patch 4 implements MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE memory notifiers
on s390.  MEM_PREPARE_ONLINE memory notifier makes memory block physical
accessible via sclp assign command.  The notifier ensures self-contained
memory maps are accessible and hence enabling the "memmap on memory" on
s390.  MEM_FINISH_OFFLINE memory notifier shifts the memory block to an
inaccessible state via sclp unassign command.

Patch 5 finally enables MHP_MEMMAP_ON_MEMORY on s390.


This patch (of 5):

Introduce MEM_PREPARE_ONLINE/MEM_FINISH_OFFLINE memory notifiers to
prepare the transition of memory to and from a physically accessible
state.  This enhancement is crucial for implementing the "memmap on
memory" feature for s390 in a subsequent patch.

Platforms such as x86 can support physical memory hotplug via ACPI.  When
there is physical memory hotplug, ACPI event leads to the memory addition
with the following callchain:

acpi_memory_device_add()
  -&gt; acpi_memory_enable_device()
     -&gt; __add_memory()

After this, the hotplugged memory is physically accessible, and altmap
support prepared, before the "memmap on memory" initialization in
memory_block_online() is called.

On s390, memory hotplug works in a different way.  The available hotplug
memory has to be defined upfront in the hypervisor, but it is made
physically accessible only when the user sets it online via sysfs,
currently in the MEM_GOING_ONLINE notifier.  This is too late and "memmap
on memory" initialization is performed before calling MEM_GOING_ONLINE
notifier.

During the memory hotplug addition phase, altmap support is prepared and
during the memory onlining phase s390 requires memory to be physically
accessible and then subsequently initiate the "memmap on memory"
initialization process.

The memory provider will handle new MEM_PREPARE_ONLINE /
MEM_FINISH_OFFLINE notifications and make the memory accessible.

The mhp_flag MHP_OFFLINE_INACCESSIBLE is introduced and is relevant when
used along with MHP_MEMMAP_ON_MEMORY, because the altmap cannot be written
(e.g., poisoned) when adding memory -- before it is set online.  This
allows for adding memory with an altmap that is not currently made
available by a hypervisor.  When onlining that memory, the hypervisor can
be instructed to make that memory accessible via the new notifiers and the
onlining phase will not require any memory allocations, which is helpful
in low-memory situations.

All architectures ignore unknown memory notifiers.  Therefore, the
introduction of these new notifiers does not result in any functional
modifications across architectures.

Link: https://lkml.kernel.org/r/20240108132747.3238763-1-sumanthk@linux.ibm.com
Link: https://lkml.kernel.org/r/20240108132747.3238763-2-sumanthk@linux.ibm.com
Signed-off-by: Sumanth Korikkar &lt;sumanthk@linux.ibm.com&gt;
Suggested-by: Gerald Schaefer &lt;gerald.schaefer@linux.ibm.com&gt;
Suggested-by: David Hildenbrand &lt;david@redhat.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Alexander Gordeev &lt;agordeev@linux.ibm.com&gt;
Cc: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Cc: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Cc: Heiko Carstens &lt;hca@linux.ibm.com&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Vasily Gorbik &lt;gor@linux.ibm.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/memory_hotplug: embed vmem_altmap details in memory block</title>
<updated>2023-08-21T20:37:49Z</updated>
<author>
<name>Aneesh Kumar K.V</name>
<email>aneesh.kumar@linux.ibm.com</email>
</author>
<published>2023-08-08T09:15:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1a8c64e110435e44e71bcd50a75663174b575f22'/>
<id>urn:sha1:1a8c64e110435e44e71bcd50a75663174b575f22</id>
<content type='text'>
With memmap on memory, some architecture needs more details w.r.t altmap
such as base_pfn, end_pfn, etc to unmap vmemmap memory.  Instead of
computing them again when we remove a memory block, embed vmem_altmap
details in struct memory_block if we are using memmap on memory block
feature.

[yangyingliang@huawei.com: fix error return code in add_memory_resource()]
  Link: https://lkml.kernel.org/r/20230809081552.1351184-1-yangyingliang@huawei.com
Link: https://lkml.kernel.org/r/20230808091501.287660-7-aneesh.kumar@linux.ibm.com
Signed-off-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Signed-off-by: Yang Yingliang &lt;yangyingliang@huawei.com&gt;
Acked-by: Michal Hocko &lt;mhocko@suse.com&gt;
Acked-by: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Christophe Leroy &lt;christophe.leroy@csgroup.eu&gt;
Cc: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Cc: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Vishal Verma &lt;vishal.l.verma@intel.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
<entry>
<title>mm/hwpoison: introduce per-memory_block hwpoison counter</title>
<updated>2022-11-09T01:37:22Z</updated>
<author>
<name>Naoya Horiguchi</name>
<email>naoya.horiguchi@nec.com</email>
</author>
<published>2022-10-24T06:20:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5033091de814ab4b5623faed2755f3064e19e2d2'/>
<id>urn:sha1:5033091de814ab4b5623faed2755f3064e19e2d2</id>
<content type='text'>
Currently PageHWPoison flag does not behave well when experiencing memory
hotremove/hotplug.  Any data field in struct page is unreliable when the
associated memory is offlined, and the current mechanism can't tell
whether a memory block is onlined because a new memory devices is
installed or because previous failed offline operations are undone. 
Especially if there's a hwpoisoned memory, it's unclear what the best
option is.

So introduce a new mechanism to make struct memory_block remember that a
memory block has hwpoisoned memory inside it.  And make any online event
fail if the onlining memory block contains hwpoison.  struct memory_block
is freed and reallocated over ACPI-based hotremove/hotplug, but not over
sysfs-based hotremove/hotplug.  So the new counter can distinguish these
cases.

Link: https://lkml.kernel.org/r/20221024062012.1520887-5-naoya.horiguchi@linux.dev
Signed-off-by: Naoya Horiguchi &lt;naoya.horiguchi@nec.com&gt;
Reported-by: kernel test robot &lt;lkp@intel.com&gt;
Reviewed-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Cc: David Hildenbrand &lt;david@redhat.com&gt;
Cc: Jane Chu &lt;jane.chu@oracle.com&gt;
Cc: Mike Kravetz &lt;mike.kravetz@oracle.com&gt;
Cc: Muchun Song &lt;songmuchun@bytedance.com&gt;
Cc: Oscar Salvador &lt;osalvador@suse.de&gt;
Cc: Yang Shi &lt;shy828301@gmail.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
</content>
</entry>
</feed>
