| Age | Commit message (Collapse) | Author |
|
Pull CXL updates from Dave Jiang:
- Introduce cxl_memdev_attach and pave way for soft reserved handling,
type2 accelerator enabling, and LSA 2.0 enabling. All these series
require the endpoint driver to settle before continuing the memdev
driver probe.
- Address CXL port error protocol handling and reporting.
The large patch series was split into three parts. The first two
parts are included here with the final part coming later.
The first part consists of a series of code refactoring to PCI AER
sub-system that addresses CXL and also CXL RAS code to prepare for
port error handling.
The second part refactors the CXL code to move management of
component registers to cxl_port objects to allow all CXL AER errors
to be handled through the cxl_port hierarchy.
- Provide AMD Zen5 platform address translation for CXL using ACPI
PRMT. This includes a conventions document to explain why this is
needed and how it's implemented.
- Misc CXL patches of fixes, cleanups, and updates. Including CXL
address translation for unaligned MOD3 regions.
[ TLA service: CXL is "Compute Express Link" ]
* tag 'cxl-for-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (59 commits)
cxl: Disable HPA/SPA translation handlers for Normalized Addressing
cxl/region: Factor out code into cxl_region_setup_poison()
cxl/atl: Lock decoders that need address translation
cxl: Enable AMD Zen5 address translation using ACPI PRMT
cxl/acpi: Prepare use of EFI runtime services
cxl: Introduce callback for HPA address ranges translation
cxl/region: Use region data to get the root decoder
cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos()
cxl/region: Separate region parameter setup and region construction
cxl: Simplify cxl_root_ops allocation and handling
cxl/region: Store HPA range in struct cxl_region
cxl/region: Store root decoder in struct cxl_region
cxl/region: Rename misleading variable name @hpa to @hpa_range
Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
cxl, doc: Moving conventions in separate files
cxl, doc: Remove isonum.txt inclusion
cxl/port: Unify endpoint and switch port lookup
cxl/port: Move endpoint component register management to cxl_port
cxl/port: Map Port RAS registers
cxl/port: Move dport RAS setup to dport add time
...
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Pull MM updates from Andrew Morton:
- "powerpc/64s: do not re-activate batched TLB flush" makes
arch_{enter|leave}_lazy_mmu_mode() nest properly (Alexander Gordeev)
It adds a generic enter/leave layer and switches architectures to use
it. Various hacks were removed in the process.
- "zram: introduce compressed data writeback" implements data
compression for zram writeback (Richard Chang and Sergey Senozhatsky)
- "mm: folio_zero_user: clear page ranges" adds clearing of contiguous
page ranges for hugepages. Large improvements during demand faulting
are demonstrated (David Hildenbrand)
- "memcg cleanups" tidies up some memcg code (Chen Ridong)
- "mm/damon: introduce {,max_}nr_snapshots and tracepoint for damos
stats" improves DAMOS stat's provided information, deterministic
control, and readability (SeongJae Park)
- "selftests/mm: hugetlb cgroup charging: robustness fixes" fixes a few
issues in the hugetlb cgroup charging selftests (Li Wang)
- "Fix va_high_addr_switch.sh test failure - again" addresses several
issues in the va_high_addr_switch test (Chunyu Hu)
- "mm/damon/tests/core-kunit: extend existing test scenarios" improves
the KUnit test coverage for DAMON (Shu Anzai)
- "mm/khugepaged: fix dirty page handling for MADV_COLLAPSE" fixes a
glitch in khugepaged which was causing madvise(MADV_COLLAPSE) to
transiently return -EAGAIN (Shivank Garg)
- "arch, mm: consolidate hugetlb early reservation" reworks and
consolidates a pile of straggly code related to reservation of
hugetlb memory from bootmem and creation of CMA areas for hugetlb
(Mike Rapoport)
- "mm: clean up anon_vma implementation" cleans up the anon_vma
implementation in various ways (Lorenzo Stoakes)
- "tweaks for __alloc_pages_slowpath()" does a little streamlining of
the page allocator's slowpath code (Vlastimil Babka)
- "memcg: separate private and public ID namespaces" cleans up the
memcg ID code and prevents the internal-only private IDs from being
exposed to userspace (Shakeel Butt)
- "mm: hugetlb: allocate frozen gigantic folio" cleans up the
allocation of frozen folios and avoids some atomic refcount
operations (Kefeng Wang)
- "mm/damon: advance DAMOS-based LRU sorting" improves DAMOS's movement
of memory betewwn the active and inactive LRUs and adds auto-tuning
of the ratio-based quotas and of monitoring intervals (SeongJae Park)
- "Support page table check on PowerPC" makes
CONFIG_PAGE_TABLE_CHECK_ENFORCED work on powerpc (Andrew Donnellan)
- "nodemask: align nodes_and{,not} with underlying bitmap ops" makes
nodes_and() and nodes_andnot() propagate the return values from the
underlying bit operations, enabling some cleanup in calling code
(Yury Norov)
- "mm/damon: hide kdamond and kdamond_lock from API callers" cleans up
some DAMON internal interfaces (SeongJae Park)
- "mm/khugepaged: cleanups and scan limit fix" does some cleanup work
in khupaged and fixes a scan limit accounting issue (Shivank Garg)
- "mm: balloon infrastructure cleanups" goes to town on the balloon
infrastructure and its page migration function. Mainly cleanups, also
some locking simplification (David Hildenbrand)
- "mm/vmscan: add tracepoint and reason for kswapd_failures reset" adds
additional tracepoints to the page reclaim code (Jiayuan Chen)
- "Replace wq users and add WQ_PERCPU to alloc_workqueue() users" is
part of Marco's kernel-wide migration from the legacy workqueue APIs
over to the preferred unbound workqueues (Marco Crivellari)
- "Various mm kselftests improvements/fixes" provides various unrelated
improvements/fixes for the mm kselftests (Kevin Brodsky)
- "mm: accelerate gigantic folio allocation" greatly speeds up gigantic
folio allocation, mainly by avoiding unnecessary work in
pfn_range_valid_contig() (Kefeng Wang)
- "selftests/damon: improve leak detection and wss estimation
reliability" improves the reliability of two of the DAMON selftests
(SeongJae Park)
- "mm/damon: cleanup kdamond, damon_call(), damos filter and
DAMON_MIN_REGION" does some cleanup work in the core DAMON code
(SeongJae Park)
- "Docs/mm/damon: update intro, modules, maintainer profile, and misc"
performs maintenance work on the DAMON documentation (SeongJae Park)
- "mm: add and use vma_assert_stabilised() helper" refactors and cleans
up the core VMA code. The main aim here is to be able to use the mmap
write lock's lockdep state to perform various assertions regarding
the locking which the VMA code requires (Lorenzo Stoakes)
- "mm, swap: swap table phase II: unify swapin use" removes some old
swap code (swap cache bypassing and swap synchronization) which
wasn't working very well. Various other cleanups and simplifications
were made. The end result is a 20% speedup in one benchmark (Kairui
Song)
- "enable PT_RECLAIM on more 64-bit architectures" makes PT_RECLAIM
available on 64-bit alpha, loongarch, mips, parisc, and um. Various
cleanups were performed along the way (Qi Zheng)
* tag 'mm-stable-2026-02-11-19-22' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm: (325 commits)
mm/memory: handle non-split locks correctly in zap_empty_pte_table()
mm: move pte table reclaim code to memory.c
mm: make PT_RECLAIM depends on MMU_GATHER_RCU_TABLE_FREE
mm: convert __HAVE_ARCH_TLB_REMOVE_TABLE to CONFIG_HAVE_ARCH_TLB_REMOVE_TABLE config
um: mm: enable MMU_GATHER_RCU_TABLE_FREE
parisc: mm: enable MMU_GATHER_RCU_TABLE_FREE
mips: mm: enable MMU_GATHER_RCU_TABLE_FREE
LoongArch: mm: enable MMU_GATHER_RCU_TABLE_FREE
alpha: mm: enable MMU_GATHER_RCU_TABLE_FREE
mm: change mm/pt_reclaim.c to use asm/tlb.h instead of asm-generic/tlb.h
mm/damon/stat: remove __read_mostly from memory_idle_ms_percentiles
zsmalloc: make common caches global
mm: add SPDX id lines to some mm source files
mm/zswap: use %pe to print error pointers
mm/vmscan: use %pe to print error pointers
mm/readahead: fix typo in comment
mm: khugepaged: fix NR_FILE_PAGES and NR_SHMEM in collapse_file()
mm: refactor vma_map_pages to use vm_insert_pages
mm/damon: unify address range representation with damon_addr_range
mm/cma: replace snprintf with strscpy in cma_new_area
...
|
|
Add support for normalized CXL address translation through ACPI PRM method
to support AMD Zen5 platforms. Including a conventions doc that explains
how the translation is implemented and for future implementations that
need such setup to comply with the current implementation method.
cxl: Disable HPA/SPA translation handlers for Normalized Addressing
cxl/region: Factor out code into cxl_region_setup_poison()
cxl/atl: Lock decoders that need address translation
cxl: Enable AMD Zen5 address translation using ACPI PRMT
cxl/acpi: Prepare use of EFI runtime services
cxl: Introduce callback for HPA address ranges translation
cxl/region: Use region data to get the root decoder
cxl/region: Add @hpa_range argument to function cxl_calc_interleave_pos()
cxl/region: Separate region parameter setup and region construction
cxl: Simplify cxl_root_ops allocation and handling
cxl/region: Store HPA range in struct cxl_region
cxl/region: Store root decoder in struct cxl_region
cxl/region: Rename misleading variable name @hpa to @hpa_range
Documentation/driver-api/cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
cxl, doc: Moving conventions in separate files
cxl, doc: Remove isonum.txt inclusion
|
|
Zen5 enablement
This adds a convention document for the following patch series:
cxl: ACPI PRM Address Translation Support and AMD Zen5 enablement
Version 7 and later:
https://lore.kernel.org/linux-cxl/20251114213931.30754-1-rrichter@amd.com/
Link: https://lore.kernel.org/linux-cxl/20251114213931.30754-1-rrichter@amd.com/
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Link: https://patch.msgid.link/20260203173604.1440334-3-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Moving conventions in separate files.
Cc: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Link: https://patch.msgid.link/20260203173604.1440334-2-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
This patch removes the line to include:: <isonum.txt>. From Jon:
"This include has been cargo-culted around the docs...the only real
use of it is to write |copy| rather than ©, but these docs don't even
do that. It can be taken out."
Cc: Jonathan Corbet <corbet@lwn.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Robert Richter <rrichter@amd.com>
Link: https://patch.msgid.link/20260203173604.1440334-1-rrichter@amd.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Every architecture that supports hugetlb_cma command line parameter
reserves CMA areas for hugetlb during setup_arch().
This obfuscates the ordering of hugetlb CMA initialization with respect to
the rest initialization of the core MM.
Introduce arch_hugetlb_cma_order() callback to allow architectures report
the desired order-per-bit of CMA areas and provide a week implementation
of arch_hugetlb_cma_order() for architectures that don't support hugetlb
with CMA.
Use this callback in hugetlb_cma_reserve() instead if passing the order as
parameter and call hugetlb_cma_reserve() from mm_core_init_early() rather
than have it spread over architecture specific code.
Link: https://lkml.kernel.org/r/20260111082105.290734-28-rppt@kernel.org
Signed-off-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Alex Shi <alexs@kernel.org>
Cc: Andreas Larsson <andreas@gaisler.com>
Cc: "Borislav Petkov (AMD)" <bp@alien8.de>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: David Hildenbrand <david@kernel.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Dinh Nguyen <dinguyen@kernel.org>
Cc: Geert Uytterhoeven <geert@linux-m68k.org>
Cc: Guo Ren <guoren@kernel.org>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Helge Deller <deller@gmx.de>
Cc: Huacai Chen <chenhuacai@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Johannes Berg <johannes@sipsolutions.net>
Cc: John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Klara Modin <klarasmodin@gmail.com>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Magnus Lindholm <linmag7@gmail.com>
Cc: Matt Turner <mattst88@gmail.com>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Michal Simek <monstr@monstr.eu>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Pratyush Yadav <pratyush@kernel.org>
Cc: Richard Weinberger <richard@nod.at>
Cc: "Ritesh Harjani (IBM)" <ritesh.list@gmail.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Stafford Horne <shorne@gmail.com>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Cc: Vineet Gupta <vgupta@kernel.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
|
|
The root document usually has a special :ref:`genindex` link to the
generated index. This is also the case for Documentation/index.rst. The
other index.rst files deeper in the directory hierarchy usually don't.
For SPHINXDIRS builds, the root document isn't Documentation/index.rst,
but some other index.rst in the hierarchy. Currently they have a
".. only::" block to add the index link when doing SPHINXDIRS html
builds.
This is obviously very tedious and repetitive. The link is also added to
all index.rst files in the hierarchy for SPHINXDIRS builds, not just the
root document.
Put the boilerplate in a sphinx-includes/subproject-index.rst file, and
include it at the end of the root document for subproject builds in an
ad-hoc source-read extension defined in conf.py.
For now, keep having the boilerplate in translations, because this
approach currently doesn't cover translated index link headers.
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jani Nikula <jani.nikula@intel.com>
Tested-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
Reviewed-by: Mauro Carvalho Chehab <mchehab+huawei@kernel.org>
[jc: did s/doctree/kern_doc_dir/ ]
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Message-ID: <20260123143149.2024303-1-jani.nikula@intel.com>
|
|
Describe cxl memory device hotplug implications, in particular how the
platform CEDT CFMWS must be described to support successful hot-add of
memory devices.
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alejandro Lucero Palau <alucerop@amd.com>
Link: https://patch.msgid.link/20251219170538.1675743-3-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add a snippet about what Linux expects BIOS/EFI to do (and not
to do) to the BIOS/EFI section.
Suggested-by: Alejandro Lucero Palau <alucerop@amd.com>
Signed-off-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Alejandro Lucero Palau <alucerop@amd.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20251219170538.1675743-2-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
The node/zone quirk section of the cxl documentation is incorrect.
The actual reason for fallback allocation misbehavior in the
described configuration is due to a kswapd/reclaim thrashing scenario
fixed by the linked patch. Remove this section.
Link: https://lore.kernel.org/linux-mm/20250919162134.1098208-1-hannes@cmpxchg.org/
Signed-off-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL updates from Dave Jiang:
"The changes include adding poison injection support, fixing CXL access
coordinates when onlining CXL memory, and delaing the enumeration of
downstream switch ports for CXL hierarchy to ensure that the CXL link
is established at the time of enumeration to address a few issues
observed on AMD and Intel platforms.
Misc changes:
- Use str_plural() instead of open code for emitting strings.
- Use str_enabled_disabled() instead of ternary operator
- Fix emit of type resource_size_t argument for
validate_region_offset()
- Typo fixup in CXL driver-api documentation
- Rename CFMWS coherency restriction defines
- Add convention doc describe dealing with x86 low memory hole
and CXL
Poison Inject support:
- Move hpa_to_spa callback to new reoot decoder ops structure
- Define a SPA to HPA callback for interleave calculation with
XOR math
- Add support for SPA to DPA address translation with XOR
- Add locked variants of poison inject and clear functions
- Add inject and clear poison support by region offset
CXL access coordinates update fix:
- A comment update for hotplug memory callback prority defines
- Add node_update_perf_attrs() for updating perf attrs on a node
- Update cxl_access_coordinates() to use the new node update function
- Remove hmat_update_target_coordinates() and related code
CXL delayed downstream port enumeration and initialization:
- Add helper to detect top of CXL device topology and remove
open coding
- Add helper to delete single dport
- Add a cached copy of target_map to cxl_decoder
- Refactor decoder setup to reduce cxl_test burden
- Defer dport allocation for switch ports
- Add mock version of devm_cxl_add_dport_by_dev() for cxl_test
- Adjust the mock version of devm_cxl_switch_port_decoders_setup()
due to cxl core usage
- Setup target_map for cxl_test decoder initialization
- Change SSLBIS handler to handle single dport
- Move port register setup to when first dport appears"
* tag 'cxl-for-6.18' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl: (25 commits)
cxl: Move port register setup to when first dport appear
cxl: Change sslbis handler to only handle single dport
cxl/test: Setup target_map for cxl_test decoder initialization
cxl/test: Adjust the mock version of devm_cxl_switch_port_decoders_setup()
cxl/test: Add mock version of devm_cxl_add_dport_by_dev()
cxl: Defer dport allocation for switch ports
cxl/test: Refactor decoder setup to reduce cxl_test burden
cxl: Add a cached copy of target_map to cxl_decoder
cxl: Add helper to delete dport
cxl: Add helper to detect top of CXL device topology
cxl: Documentation/driver-api/cxl: Describe the x86 Low Memory Hole solution
cxl/acpi: Rename CFMW coherency restrictions
Documentation/driver-api: Fix typo error in cxl
acpi/hmat: Remove now unused hmat_update_target_coordinates()
cxl, acpi/hmat: Update CXL access coordinates directly instead of through HMAT
drivers/base/node: Add a helper function node_update_perf_attrs()
mm/memory_hotplug: Update comment for hotplug memory callback priorities
cxl: Fix emit of type resource_size_t argument for validate_region_offset()
cxl/region: Add inject and clear poison by region offset
cxl/core: Add locked variants of the poison inject and clear funcs
...
|
|
Add documentation on how to resolve conflicts between CXL Fixed Memory
Windows, Platform Low Memory Holes, intermediate Switch and Endpoint
Decoders.
[dj]: Fixed inconsistent spacing after '.'
[dj]: Fixed subject line from Alison.
[dj]: Removed '::' before table from Bagas.
Reviewed-by: Gregory Price <gourry@gourry.net>
Signed-off-by: Fabio M. De Francesco <fabio.m.de.francesco@linux.intel.com>
Reviewed-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Fixed the following typo errors
intersparsed ==> interspersed
in Documentation/driver-api/cxl/platform/bios-and-efi.rst
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20250818175335.5312-1-rakuram.e96@gmail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Corrected a few spelling mistakes
functionalty ==> functionality
in Documentation/driver-api/cxl/devices/device-types.rst
adjascent ==> adjacent
in Documentation/driver-api/cxl/platform/example-configurations/one-dev-per-hb.rst
succeessful ==> successful
in Documentation/driver-api/thermal/exynos_thermal_emulation.rst
Signed-off-by: Ranganath V N <vnranganath.20@gmail.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Signed-off-by: Jonathan Corbet <corbet@lwn.net>
Link: https://lore.kernel.org/r/20250814184304.20448-1-vnranganath.20@gmail.com
|
|
Add CXL region debugfs attributes to inject and clear poison based
on an offset into the region. These new interfaces allow users to
operate on poison at the region level without needing to resolve
Device Physical Addresses (DPA) or target individual memdevs.
The implementation uses a new helper, region_offset_to_dpa_result()
that applies decoder interleave logic, including XOR-based address
decoding when applicable. Note that XOR decodes rely on driver
internal xormaps which are not exposed to userspace. So, this support
is not only a simplification of poison operations that could be done
using existing per memdev operations, but also it enables this
functionality for XOR interleaved regions for the first time.
New debugfs attributes are added in /sys/kernel/debug/cxl/regionX/:
inject_poison and clear_poison. These are only exposed if all memdevs
participating in the region support both inject and clear commands,
ensuring consistent and reliable behavior across multi-device regions.
If tracing is enabled, these operations are logged as cxl_poison
events in /sys/kernel/tracing/trace.
The ABI documentation warns users of the significant risks that
come with using these capabilities.
A CXL Maturity Map update shows this user flow is now supported.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/f3fd8628ab57ea79704fb2d645902cd499c066af.1754290144.git.alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Fix several typos and improve comment clarity in the CXL device types
docs:
"w/" replaced with "with"
"sill" -> "still"
"The allows" -> "This allows"
"capacity" corrected to "capable"
"more devices" corrected to "more upstream devices" in MLD description
These changes improve readability and enhance the documentation quality.
[ dj: Fix up "one or more hosts" to "one or more upstream devices" from
Gregory ]
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20250616060737.1645393-1-alok.a.tiwari@oracle.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Fix typo 'enumates' to 'enumerate' in CXL driver operation
documentation to improve readability.
Signed-off-by: Nai-Chen Cheng <bleach1827@gmail.com>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Reviewed-by: Li Ming <ming.li@zohomail.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250610173152.33566-1-bleach1827@gmail.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
This patch corrects several typographical issues and improves phrasing
in memory-devices.rst:
- Fixes duplicate word ("1 one") and adjusts phrasing for clarity.
- Adds missing hyphen in "on-device".
- Corrects "a give memory device" to "a given memory device".
- fix singular/plural "decoder resource" -> "decoder resources".
- Clarifies "spans to Host Bridges" -> "spans two Host Bridges".
- change "at a" -> "a"
These changes improve readability and accuracy of the documentation.
Signed-off-by: Alok Tiwari <alok.a.tiwari@oracle.com>
Reviewed-by: Randy Dunlap <rdunlap@infradead.org>
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Jonathan Cameron <jonathan.cameron@huawei.com>
Link: https://patch.msgid.link/20250609171130.2375901-1-alok.a.tiwari@oracle.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
There exists shipping platforms that bend, break, or otherwise lean on
ambiguities in the CXL specification. Without driver changes to accommodate
these deviations, end users are left without CXL subsystem RAS features.
Specifically, provisioning, error translation, and other flows require the
CXL subsystem to understand the platforms CXL topology beyond undecorated
memory address ranges.
Those isolated compatibility problems risk growing into deeper upstream
maintenance burden if different platform vendors arrive at diverging
solutions. For example, there are multiple options for resolving
low-memory-mmio intersecting large-interleave-ways CXL windows. Linux
should only entertain one solution to that problem.
Now, with the ACPI Specification Working Group, situations like this would
be resolved with the "Code First ECN" process to codify Linux expectations
in a specification. In the absence of such a process for the CXL
specification, create a file in Linux documentation to detail the
motivations, assumptions, tradeoffs, and proposals for amending
specification language.
The goal is to capture the issues such that platform vendors arrive at
compatible solutions for these problems and serve as a repository for
potential specification updates. The expectation is to update
conventions.rst along with CXL subsystem code changes to accommodate the
platform topology.
[ dj: Rebased against v6.16-rc1 ]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Robert Richter <rrichter@amd.com>
Link: https://patch.msgid.link/20250603185254.3730099-1-dan.j.williams@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add documentation on how to calculate the access coordinates for a given
CXL region in detail.
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20250515000923.2590820-4-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add description in the SRAT document to describe the Generic Port
Affinity sub-table.
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20250515000923.2590820-3-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add documentation for CDAT structures for CXL usages.
Reviewed-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/20250515000923.2590820-2-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Changes for extended-linear cache, hetero-interleave, and HPA->DPA
address translation.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250512214225.1389484-1-alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
pmem.c regs.c mbox.c identifiers were missing. Add them to
memory-devices.rst following their respective DOC comment includes.
Two acpi.c identifiers were available, but not used in kernel-doc's:
1) Add add_cxl_resources to memory-devices.rst and fix up the Sphinx
complaint on the ascii art by escaping it.
2) Add cxl_acpi_evaluate_qtg_dsm to access-coordinates.rst.
core/features.c is new. Add a "DOC: cxl features" comment to the
source and identifiers to memory_devices.rst.
Signed-off-by: Alison Schofield <alison.schofield@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Link: https://patch.msgid.link/20250513215813.1419645-1-alison.schofield@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add some crosslinks between pages in the CXL docs - mostly to the
ACPI tables.
Suggested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20250512162134.3596150-18-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add docs on how CXL capacity interacts with CMA and HugeTLB allocation
interfaces.
Signed-off-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20250512162134.3596150-17-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Document a bit about how reclaim interacts with various CXL
configurations.
Signed-off-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20250512162134.3596150-16-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Document some interesting interactions that occur when exposing CXL
memory capacity to page allocator.
Signed-off-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20250512162134.3596150-15-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Small example of accessing CXL memory capacity via DAX device
Signed-off-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20250512162134.3596150-14-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add documentation on how the CXL driver surfaces memory through the
DAX driver and memory-hotplug.
Signed-off-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20250512162134.3596150-13-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add documentation on how the CXL driver interacts with the DAX driver.
Signed-off-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20250512162134.3596150-12-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add 4 example configurations:
- single device
- cross-host-bridge interleave
- intra-host-bridge-interleave
- multi-level interleave
Signed-off-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20250512162134.3596150-11-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add docs for the CXL driver that explains the base devices,
decoder types, region types, mailbox interfaces, and decoder
programming.
Signed-off-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20250512162134.3596150-10-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Document __init time configurations that affect CXL driver probe
process and memory region configuration.
Signed-off-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250512162134.3596150-9-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add type-3 device configuration overview that explains the probe
process for a type-3 device from early-boot through memory-hotplug.
Signed-off-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250512162134.3596150-8-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add example ACPI Table configurations for different sample platforms.
Signed-off-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250512162134.3596150-7-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add basic ACPI table information needed to understand the CXL
driver probe process.
Signed-off-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250512162134.3596150-6-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add some docs on CXL configurations done in bios/efi that affect
linux configuration - information vendors may care to consider.
Signed-off-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250512162134.3596150-5-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add a simple device primer sufficient to understand the theory
of operation documentation.
Signed-off-by: Gregory Price <gourry@gourry.net>
Link: https://patch.msgid.link/20250512162134.3596150-4-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Place the hierarchy diagram in access-coordinates.rst in a code block.
Fix a few grammar issues.
Suggested-by: Randy Dunlap <rdunlap@infradead.org>
Suggested-by: Bagas Sanjaya <bagasdotme@gmail.com>
Signed-off-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250512162134.3596150-3-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Restructure the cxl folder to make adding docs per-page cleaner.
Signed-off-by: Gregory Price <gourry@gourry.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Link: https://patch.msgid.link/20250512162134.3596150-2-gourry@gourry.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Similar to how the acpi_nfit driver exports Optane dirty shutdown count,
introduce:
/sys/bus/cxl/devices/nvdimm-bridge0/ndbusX/nmemY/cxl/dirty_shutdown
Under the conditions that 1) dirty shutdown can be set, 2) Device GPF
DVSEC exists, and 3) the count itself can be retrieved.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Dave Jiang <dave.jiang@intel.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20250220220235.276831-4-dave@stgolabs.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add support for GPF flows. It is found that the CXL specification
around this to be a bit too involved from the driver side. And while
this should really all handled by the hardware, this patch takes
things with a grain of salt.
Upon respective port enumeration, both phase timeouts are set to
a max of 20 seconds, which is the NMI watchdog default for lockup
detection. The premise is that the kernel does not have enough
information to set anything better than a max across the board
and hope devices finish their GPF flows within the platform energy
budget.
Timeout detection is based on dirty Shutdown semantics. The driver
will mark it as dirty, expecting that the device clear it upon a
successful GPF event. The admin may consult the device Health and
check the dirty shutdown counter to see if there was a problem
with data integrity.
[ davej: Explicitly set return to 0 in update_gpf_port_dvsec() ]
[ davej: Add spec reference for 'struct cxl_mbox_set_shutdown_state_in ]
[ davej: Fix 0-day reported issue ]
Signed-off-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20250124233533.910535-1-dave@stgolabs.net
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Create a kernel documentation to describe how the CXL shared upstream
link bandwidth is calculated.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Acked-by: Dan Williams <dan.j.williams@intel.com>
Link: https://patch.msgid.link/20240904001316.1688225-4-dave.jiang@intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl
Pull CXL updates from Dave Jiang:
"Core:
- A CXL maturity map has been added to the documentation to detail
the current state of CXL enabling.
It provides the status of the current state of various CXL features
to inform current and future contributors of where things are and
which areas need contribution.
- A notifier handler has been added in order for a newly created CXL
memory region to trigger the abstract distance metrics calculation.
This should bring parity for CXL memory to the same level vs
hotplugged DRAM for NUMA abstract distance calculation. The
abstract distance reflects relative performance used for memory
tiering handling.
- An addition for XOR math has been added to address the CXL DPA to
SPA translation.
CXL address translation did not support address interleave math
with XOR prior to this change.
Fixes:
- Fix to address race condition in the CXL memory hotplug notifier
- Add missing MODULE_DESCRIPTION() for CXL modules
- Fix incorrect vendor debug UUID define
Misc:
- A warning has been added to inform users of an unsupported
configuration when mixing CXL VH and RCH/RCD hierarchies
- The ENXIO error code has been replaced with EBUSY for inject poison
limit reached via debugfs and cxl-test support
- Moving the PCI config read in cxl_dvsec_rr_decode() to avoid
unnecessary PCI config reads
- A refactor to a common struct for DRAM and general media CXL
events"
* tag 'cxl-for-6.11' of git://git.kernel.org/pub/scm/linux/kernel/git/cxl/cxl:
cxl/core/pci: Move reading of control register to immediately before usage
cxl: Remove defunct code calculating host bridge target positions
cxl/region: Verify target positions using the ordered target list
cxl: Restore XOR'd position bits during address translation
cxl/core: Fold cxl_trace_hpa() into cxl_dpa_to_hpa()
cxl/test: Replace ENXIO with EBUSY for inject poison limit reached
cxl/memdev: Replace ENXIO with EBUSY for inject poison limit reached
cxl/acpi: Warn on mixed CXL VH and RCH/RCD Hierarchy
cxl/core: Fix incorrect vendor debug UUID define
Documentation: CXL Maturity Map
cxl/region: Simplify cxl_region_nid()
cxl/region: Support to calculate memory tier abstract distance
cxl/region: Fix a race condition in memory hotplug notifier
cxl: add missing MODULE_DESCRIPTION() macros
cxl/events: Use a common struct for DRAM and General Media events
|
|
Provide a survey of the work-in-progress maturity (implementation
status) of various aspects of the CXL subsystem.
Clarify that in addition to ongoing upkeep relative to specification
updates, there are some long running themes in the driver that respond
to the discovery of new corner cases (bugs) and new use cases (feature
extensions).
The primary audience is distribution maintainers, but it also serves as
a guide for kernel developers to understand what aspects of the CXL
subsystem need more help. It is a landing page to document ongoing
progress, and a guide to discern exposure to work-in-progress features.
Reviewed-by: Adam Manzanares <a.manzanares@samsung.com>
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Reviewed-by: Ira Weiny <ira.weiny@intel.com>
Reviewed-by: Alison Schofield <alison.schofield@intel.com>
Link: https://patch.msgid.link/172005486862.2048248.6668794717827294862.stgit@dwillia2-xfh.jf.intel.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Add the missing files into cxl driver api and fix the compile warning.
Suggested-by: Dan Williams <dan.j.williams@intel.com>
Suggested-by: Alison Schofield <alison.schofield@intel.com>
Signed-off-by: Yao Xingtao <yaoxt.fnst@fujitsu.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://patch.msgid.link/20240614084755.59503-3-yaoxt.fnst@fujitsu.com
Signed-off-by: Dave Jiang <dave.jiang@intel.com>
|
|
Sphinx reported kernel-doc failure warning, pointing to non-existent
drivers/cxl/region.h (which doesn't also exist throughout repo history):
WARNING: kernel-doc './scripts/kernel-doc -rst -enable-lineno -sphinx-version 2.4.4 -no-doc-sections ./drivers/cxl/region.h' failed with return code 1
Above cause error message to be displayed on htmldocs output.
Delete the reference.
Fixes: 779dd20cfb56c5 ("cxl/region: Add region creation support")
Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/20220804075448.98241-4-bagasdotme@gmail.com
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|
|
CXL 2.0 allows for dynamic provisioning of new memory regions (system
physical address resources like "System RAM" and "Persistent Memory").
Whereas DDR and PMEM resources are conveyed statically at boot, CXL
allows for assembling and instantiating new regions from the available
capacity of CXL memory expanders in the system.
Sysfs with an "echo $region_name > $create_region_attribute" interface
is chosen as the mechanism to initiate the provisioning process. This
was chosen over ioctl() and netlink() to keep the configuration
interface entirely in a pseudo-fs interface, and it was chosen over
configfs since, aside from this one creation event, the interface is
read-mostly. I.e. configfs supports cases where an object is designed to
be provisioned each boot, like an iSCSI storage target, and CXL region
creation is mostly for PMEM regions which are created usually once
per-lifetime of a server instance. This is an improvement over nvdimm
that pre-created "seed" devices that tended to confuse users looking to
determine which devices are active and which are idle.
Recall that the major change that CXL brings over previous persistent
memory architectures is the ability to dynamically define new regions.
Compare that to drivers like 'nfit' where the region configuration is
statically defined by platform firmware.
Regions are created as a child of a root decoder that encompasses an
address space with constraints. When created through sysfs, the root
decoder is explicit. When created from an LSA's region structure a root
decoder will possibly need to be inferred by the driver.
Upon region creation through sysfs, a vacant region is created with a
unique name. Regions have a number of attributes that must be configured
before the region can be bound to the driver where HDM decoder program
is completed.
An example of creating a new region:
- Allocate a new region name:
region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region)
- Create a new region by name:
while
region=$(cat /sys/bus/cxl/devices/decoder0.0/create_pmem_region)
! echo $region > /sys/bus/cxl/devices/decoder0.0/create_pmem_region
do true; done
- Region now exists in sysfs:
stat -t /sys/bus/cxl/devices/decoder0.0/$region
- Delete the region, and name:
echo $region > /sys/bus/cxl/devices/decoder0.0/delete_region
Signed-off-by: Ben Widawsky <bwidawsk@kernel.org>
Reviewed-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
Link: https://lore.kernel.org/r/165784333909.1758207.794374602146306032.stgit@dwillia2-xfh.jf.intel.com
[djbw: simplify locking, reword changelog]
Signed-off-by: Dan Williams <dan.j.williams@intel.com>
|