summaryrefslogtreecommitdiff
path: root/arch/x86/kernel/cpu/resctrl
AgeCommit message (Collapse)Author
2025-04-01Merge tag 'driver-core-6.15-rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core Pull driver core updatesk from Greg KH: "Here is the big set of driver core updates for 6.15-rc1. Lots of stuff happened this development cycle, including: - kernfs scaling changes to make it even faster thanks to rcu - bin_attribute constify work in many subsystems - faux bus minor tweaks for the rust bindings - rust binding updates for driver core, pci, and platform busses, making more functionaliy available to rust drivers. These are all due to people actually trying to use the bindings that were in 6.14. - make Rafael and Danilo full co-maintainers of the driver core codebase - other minor fixes and updates" * tag 'driver-core-6.15-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/driver-core: (52 commits) rust: platform: require Send for Driver trait implementers rust: pci: require Send for Driver trait implementers rust: platform: impl Send + Sync for platform::Device rust: pci: impl Send + Sync for pci::Device rust: platform: fix unrestricted &mut platform::Device rust: pci: fix unrestricted &mut pci::Device rust: device: implement device context marker rust: pci: use to_result() in enable_device_mem() MAINTAINERS: driver core: mark Rafael and Danilo as co-maintainers rust/kernel/faux: mark Registration methods inline driver core: faux: only create the device if probe() succeeds rust/faux: Add missing parent argument to Registration::new() rust/faux: Drop #[repr(transparent)] from faux::Registration rust: io: fix devres test with new io accessor functions rust: io: rename `io::Io` accessors kernfs: Move dput() outside of the RCU section. efi: rci2: mark bin_attribute as __ro_after_init rapidio: constify 'struct bin_attribute' firmware: qemu_fw_cfg: constify 'struct bin_attribute' powerpc/perf/hv-24x7: Constify 'struct bin_attribute' ...
2025-03-12x86/resctrl: Move get_{mon,ctrl}_domain_from_cpu() to live with their callersJames Morse
Each of get_{mon,ctrl}_domain_from_cpu() only has one caller. Once the filesystem code is moved to /fs/, there is no equivalent to core.c. Move these functions to each live next to their caller. This allows them to be made static and the header file entries to be removed. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-31-james.morse@arm.com
2025-03-12x86/resctrl: Move get_config_index() to a headerJames Morse
get_config_index() is used by the architecture specific code to map a CLOSID+type pair to an index in the configuration arrays. MPAM needs to do this too to preserve the ABI to user-space, there is no reason to do it differently. Move the helper to a header file to allow all architectures that either use or emulate CDP to use the same pattern of CLOSID values. Moving this to a header file means it must be marked inline, which matches the existing compiler choice for this static function. Co-developed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-30-james.morse@arm.com
2025-03-12x86/resctrl: Handle throttle_mode for SMBA resourcesJames Morse
Now that the visibility of throttle_mode is being managed by resctrl, it should consider resources other than MBA that may have a throttle_mode. SMBA is one such resource. Extend thread_throttle_mode_init() to check SMBA for a throttle_mode. Adding support for multiple resources means it is possible for a platform with both MBA and SMBA, but an undefined throttle_mode on one of them to make the file visible. Add the 'undefined' case to rdt_thread_throttle_mode_show(). Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-29-james.morse@arm.com
2025-03-12x86/resctrl: Move RFTYPE flags to be managed by resctrlJames Morse
resctrl_file_fflags_init() is called from the architecture specific code to make the 'thread_throttle_mode' file visible. The architecture specific code has already set the membw.throttle_mode in the rdt_resource. This forces the RFTYPE flags used by resctrl to be exposed to the architecture specific code. This doesn't need to be specific to the architecture, the throttle_mode can be used by resctrl to determine if the 'thread_throttle_mode' file should be visible. This allows the RFTYPE flags to be private to resctrl. Add thread_throttle_mode_init(), and use it to call resctrl_file_fflags_init() from resctrl_init(). This avoids publishing an extra function between the architecture and filesystem code. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-28-james.morse@arm.com
2025-03-12x86/resctrl: Make resctrl_arch_pseudo_lock_fn() take a plrJames Morse
resctrl_arch_pseudo_lock_fn() has architecture specific behaviour, and takes a struct rdtgroup as an argument. After the filesystem code moves to /fs/, the definition of struct rdtgroup will not be available to the architecture code. The only reason resctrl_arch_pseudo_lock_fn() wants the rdtgroup is for the CLOSID. Embed that in the pseudo_lock_region as a closid, and move the definition of struct pseudo_lock_region to resctrl.h. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-27-james.morse@arm.com
2025-03-12x86/resctrl: Make prefetch_disable_bits belong to the arch codeJames Morse
prefetch_disable_bits is set by rdtgroup_locksetup_enter() from a value provided by the architecture, but is largely read by other architecture helpers. Make resctrl_arch_get_prefetch_disable_bits() set prefetch_disable_bits so that it can be isolated to arch-code from where the other arch-code helpers can use its cached value. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-26-james.morse@arm.com
2025-03-12x86/resctrl: Allow an architecture to disable pseudo lockJames Morse
Pseudo-lock relies on knowledge of the micro-architecture to disable prefetchers etc. On arm64 these controls are typically secure only, meaning Linux can't access them. Arm's cache-lockdown feature works in a very different way. Resctrl's pseudo-lock isn't going to be used on arm64 platforms. Add a Kconfig symbol that can be selected by the architecture. This enables or disables building of the pseudo_lock.c file, and replaces the functions with stubs. An additional IS_ENABLED() check is needed in rdtgroup_mode_write() so that attempting to enable pseudo-lock reports an "Unknown or unsupported mode" to user-space via the last_cmd_status file. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-25-james.morse@arm.com
2025-03-12x86/resctrl: Add resctrl_arch_ prefix to pseudo lock functionsJames Morse
resctrl's pseudo lock has some copy-to-cache and measurement functions that are micro-architecture specific. For example, pseudo_lock_fn() is not at all portable. Label these 'resctrl_arch_' so they stay under /arch/x86. To expose these functions to the filesystem code they need an entry in a header file, and can't be marked static. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-24-james.morse@arm.com
2025-03-12x86/resctrl: Move mbm_cfg_mask to struct rdt_resourceJames Morse
The mbm_cfg_mask field lists the bits that user-space can set when configuring an event. This value is output via the last_cmd_status file. Once the filesystem parts of resctrl are moved to live in /fs/, the struct rdt_hw_resource is inaccessible to the filesystem code. Because this value is output to user-space, it has to be accessible to the filesystem code. Move it to struct rdt_resource. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-23-james.morse@arm.com
2025-03-12x86/resctrl: Move mba_mbps_default_event init to filesystem codeJames Morse
mba_mbps_default_event is initialised based on whether mbm_local or mbm_total is supported. In the case of both, it is initialised to mbm_local. mba_mbps_default_event is initialised in core.c's get_rdt_mon_resources(), while all the readers are in rdtgroup.c. After this code is split into architecture-specific and filesystem code, get_rdt_mon_resources() remains part of the architecture code, which would mean mba_mbps_default_event has to be exposed by the filesystem code. Move the initialisation to the filesystem's resctrl_mon_resource_init(). Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-22-james.morse@arm.com
2025-03-12x86/resctrl: Change mon_event_config_{read,write}() to be arch helpersJames Morse
mon_event_config_{read,write}() are called via IPI and access model specific registers to do their work. To support another architecture, this needs abstracting. Rename mon_event_config_{read,write}() to have a "resctrl_arch_" prefix, and move their struct mon_config_info parameter into <linux/resctrl.h>. This allows another architecture to supply an implementation of these. As struct mon_config_info is now exposed globally, give it a 'resctrl_' prefix. MPAM systems need access to the domain to do this work, add the resource and domain to struct resctrl_mon_config_info. Co-developed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-21-james.morse@arm.com
2025-03-12x86/resctrl: Add resctrl_arch_is_evt_configurable() to abstract BMECJames Morse
When BMEC is supported the resctrl event can be configured in a number of ways. This depends on architecture support. rdt_get_mon_l3_config() modifies the struct mon_evt and calls resctrl_file_fflags_init() to create the files that allow the configuration. Splitting this into separate architecture and filesystem parts would require the struct mon_evt and resctrl_file_fflags_init() to be exposed. Instead, add resctrl_arch_is_evt_configurable(), and use this from resctrl_mon_resource_init() to initialise struct mon_evt and call resctrl_file_fflags_init(). resctrl_arch_is_evt_configurable() calls rdt_cpu_has() so it doesn't obviously benefit from being inlined. Putting it in core.c will allow rdt_cpu_has() to eventually become static. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-20-james.morse@arm.com
2025-03-12x86/resctrl: Move the is_mbm_*_enabled() helpers to asm/resctrl.hJames Morse
The architecture specific parts of resctrl provide helpers like is_mbm_total_enabled() and is_mbm_local_enabled() to hide accesses to the rdt_mon_features bitmap. Exposing a group of helpers between the architecture and filesystem code is preferable to a single unsigned-long like rdt_mon_features. Helpers can be more readable and have a well defined behaviour, while allowing architectures to hide more complex behaviour. Once the filesystem parts of resctrl are moved, these existing helpers can no longer live in internal.h. Move them to include/linux/resctrl.h Once these are exposed to the wider kernel, they should have a 'resctrl_arch_' prefix, to fit the rest of the arch<->fs interface. Move and rename the helpers that touch rdt_mon_features directly. is_mbm_event() and is_mbm_enabled() are only called from rdtgroup.c, so can be moved into that file. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-19-james.morse@arm.com
2025-03-12x86/resctrl: Rewrite and move the for_each_*_rdt_resource() walkersJames Morse
The for_each_*_rdt_resource() helpers walk the architecture's array of structures, using the resctrl visible part as an iterator. These became over-complex when the structures were split into a filesystem and architecture-specific struct. This approach avoided the need to touch every call site, and was done before there was a helper to retrieve a resource by rid. Once the filesystem parts of resctrl are moved to /fs/, both the arch's resource array, and the definition of those structures is no longer accessible. To support resctrl, each architecture would have to provide equally complex macros. Rewrite the macro to make use of resctrl_arch_get_resource(), and move these to include/linux/resctrl.h so existing x86 arch code continues to use them. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-18-james.morse@arm.com
2025-03-12x86/resctrl: Move monitor init work to a resctrl init callJames Morse
rdt_get_mon_l3_config() is called from the arch's resctrl_arch_late_init(), and initialises both architecture specific fields, such as hw_res->mon_scale and resctrl filesystem fields by calling dom_data_init(). To separate the filesystem and architecture parts of resctrl, this function needs splitting up. Add resctrl_mon_resource_init() to do the filesystem specific work, and call it from resctrl_init(). This runs later, but is still before the filesystem is mounted and the rmid_ptrs[] array can be used. [ bp: Massage commit message. ] Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-17-james.morse@arm.com
2025-03-12x86/resctrl: Move monitor exit work to a resctrl exit callJames Morse
rdt_put_mon_l3_config() is called via the architecture's resctrl_arch_exit() call, and appears to free the rmid_ptrs[] and closid_num_dirty_rmid[] arrays. In reality this code is marked __exit, and is removed by the linker as resctrl can't be built as a module. To separate the filesystem and architecture parts of resctrl, this free()ing work needs to be triggered by the filesystem, as these structures belong to the filesystem code. Rename rdt_put_mon_l3_config() to resctrl_mon_resource_exit() and call it from resctrl_exit(). The kfree() is currently dependent on r->mon_capable. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-16-james.morse@arm.com
2025-03-12x86/resctrl: Add an arch helper to reset one resourceJames Morse
On umount(), resctrl resets each resource back to its default configuration. It only ever does this for all resources in one go. reset_all_ctrls() is architecture specific as it works with struct rdt_hw_resource. Make reset_all_ctrls() an arch helper that resets one resource. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-15-james.morse@arm.com
2025-03-12x86/resctrl: Move resctrl types to a separate headerJames Morse
When resctrl is fully factored into core and per-arch code, each arch will need to use some resctrl common definitions in order to define its own specializations and helpers. Following conventional practice, it would be desirable to put the dependent arch definitions in an <asm/resctrl.h> header that is included by the common <linux/resctrl.h> header. However, this can make it awkward to avoid a circular dependency between <linux/resctrl.h> and the arch header. To avoid such dependencies, move the affected common types and constants into a new header that does not need to depend on <linux/resctrl.h> or on the arch headers. The same logic applies to the monitor-configuration defines, move these too. Some kind of enumeration for events is needed between the filesystem and architecture code. Take the x86 definition as its convenient for x86. The definition of enum resctrl_event_id is needed to allow the architecture code to define resctrl_arch_mon_ctx_alloc() and resctrl_arch_mon_ctx_free(). The definition of enum resctrl_res_level is needed to allow the architecture code to define resctrl_arch_set_cdp_enabled() and resctrl_arch_get_cdp_enabled(). The bits for mbm_local_bytes_config et al are ABI, and must be the same on all architectures. These are documented in Documentation/arch/x86/resctrl.rst The maintainers entry for these headers was missed when resctrl.h was created. Add a wildcard entry to match both resctrl.h and resctrl_types.h. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-14-james.morse@arm.com
2025-03-12x86/resctrl: Move rdt_find_domain() to be visible to arch and fs codeJames Morse
rdt_find_domain() finds a domain given a resource and a cache-id. This is used by both the architecture code and the filesystem code. After the filesystem code moves to live in /fs/, this helper is either duplicated by all architectures, or needs exposing by the filesystem code. Add the declaration to the global header file. As it's now globally visible, and has only a handful of callers, swap the 'rdt' for 'resctrl'. Move the function to live with its caller in ctrlmondata.c as the filesystem code will not have anything corresponding to core.c. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-13-james.morse@arm.com
2025-03-12x86/resctrl: Expose resctrl fs's init function to the rest of the kernelJames Morse
rdtgroup_init() needs exposing to the rest of the kernel so that arch code can call it once it lives in core code. As this is one of the few functions exposed, rename it to have "resctrl" in the name. The same goes for the exit call. Rename x86's arch code init functions for RDT to have an arch prefix to make it clear these are part of the architecture code. Co-developed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-12-james.morse@arm.com
2025-03-12x86/resctrl: Remove rdtgroup from update_cpu_closid_rmid()James Morse
update_cpu_closid_rmid() takes a struct rdtgroup as an argument, which it uses to update the local CPUs default pqr values. This is a problem once the resctrl parts move out to /fs/, as the arch code cannot poke around inside struct rdtgroup. Rename update_cpu_closid_rmid() as resctrl_arch_sync_cpus_defaults() to be used as the target of an IPI, and pass the effective CLOSID and RMID in a new struct. Co-developed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-11-james.morse@arm.com
2025-03-12x86/resctrl: Add helper for setting CPU default propertiesJames Morse
rdtgroup_rmdir_ctrl() and rdtgroup_rmdir_mon() set the per-CPU pqr_state for CPUs that were part of the rmdir()'d group. Another architecture might not have a 'pqr_state', its hardware may need the values in a different format. MPAM's equivalent of RMID values are not unique, and always need the CLOSID to be provided too. There is only one caller that modifies a single value, (rdtgroup_rmdir_mon()). MPAM always needs both CLOSID and RMID for the hardware value as these are written to the same system register. As rdtgroup_rmdir_mon() has the CLOSID on hand, only provide a helper to set both values. These values are read by __resctrl_sched_in(), but may be written by a different CPU without any locking, add READ/WRTE_ONCE() to avoid torn values. Co-developed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-10-james.morse@arm.com
2025-03-12x86/resctrl: Generate default_ctrl instead of sharing itJames Morse
The struct rdt_resource default_ctrl is used by both the architecture code for resetting the hardware controls, and sometimes by the filesystem code as the default value for the schema, unless the bandwidth software controller is in use. Having the default exposed by the architecture code causes unnecessary duplication for each architecture as the default value must be specified, but can be derived from other schema properties. Now that the maximum bandwidth is explicitly described, resctrl can derive the default value from the schema format and the other resource properties. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-9-james.morse@arm.com
2025-03-12x86/resctrl: Add max_bw to struct resctrl_membwJames Morse
__rdt_get_mem_config_amd() and __get_mem_config_intel() both use the default_ctrl property as a maximum value. This is because the MBA schema works differently between these platforms. Doing this complicates determining whether the default_ctrl property belongs to the arch code, or can be derived from the schema format. Deriving the maximum or default value from the schema format would avoid the architecture code having to tell resctrl such obvious things as the maximum percentage is 100, and the maximum bitmap is all ones. Maximum bandwidth is always going to vary per platform. Add max_bw as a special case. This is currently used for the maximum MBA percentage on Intel platforms, but can be removed from the architecture code if 'percentage' becomes a schema format resctrl supports directly. This value isn't needed for other schema formats. This will allow the default_ctrl to be generated from the schema properties when it is needed. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-8-james.morse@arm.com
2025-03-12x86/resctrl: Remove data_width and the tabular formatJames Morse
The resctrl architecture code provides a data_width for the controls of each resource. This is used to zero pad all control values in the schemata file so they appear in columns. The same is done with the resource names to complete the visual effect. e.g. | SMBA:0=2048 | L3:0=00ff AMD platforms discover their maximum bandwidth for the MB resource from firmware, but hard-code the data_width to 4. If the maximum bandwidth requires more digits - the tabular format is silently broken. This is also broken when the mba_MBps mount option is used as the field width isn't updated. If new schema are added resctrl will need to be able to determine the maximum width. The benefit of this pretty-printing is questionable. Instead of handling runtime discovery of the data_width for AMD platforms, remove the feature. These fields are always zero padded so should be harmless to remove if the whole field has been treated as a number. In the above example, this would now look like this: | SMBA:0=2048 | L3:0=ff Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-7-james.morse@arm.com
2025-03-12x86/resctrl: Use schema type to determine the schema format stringJames Morse
Resctrl's architecture code gets to specify a format string that is used when printing schema entries. This is expected to be one of two values that the filesystem code supports. Setting this format string allows the architecture code to change the ABI resctrl presents to user-space. Instead, use the schema format enum to choose which format string to use. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-6-james.morse@arm.com
2025-03-12x86/resctrl: Use schema type to determine how to parse schema valuesJames Morse
Resctrl's architecture code gets to specify a function pointer that is used when parsing schema entries. This is expected to be one of two helpers from the filesystem code. Setting this function pointer allows the architecture code to change the ABI resctrl presents to user-space, and forces resctrl to expose these helpers. Instead, add a schema format enum to choose which schema parser to use. This allows the helpers to be made static and the structs used for passing arguments moved out of shared headers. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-5-james.morse@arm.com
2025-03-12x86/resctrl: Remove fflags from struct rdt_resourceJames Morse
The resctrl arch code specifies whether a resource controls a cache or memory using the fflags field. This field is then used by resctrl to determine which files should be exposed in the filesystem. Allowing the architecture to pick this value means the RFTYPE_ flags have to be in a shared header, and allows an architecture to create a combination that resctrl does not support. Remove the fflags field, and pick the value based on the resource id. Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-4-james.morse@arm.com
2025-03-12x86/resctrl: Add a helper to avoid reaching into the arch code resource listJames Morse
Resctrl occasionally wants to know something about a specific resource, in these cases it reaches into the arch code's rdt_resources_all[] array. Once the filesystem parts of resctrl are moved to /fs/, this means it will need visibility of the architecture specific struct rdt_hw_resource definition, and the array of all resources. All architectures would also need a r_resctrl member in this struct. Instead, abstract this via a helper to allow architectures to do different things here. Move the level enum to the resctrl header and add a helper to retrieve the struct rdt_resource by 'rid'. resctrl_arch_get_resource() should not return NULL for any value in the enum, it may instead return a dummy resource that is !alloc_enabled && !mon_enabled. Co-developed-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: Dave Martin <Dave.Martin@arm.com> Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-3-james.morse@arm.com
2025-03-12x86/resctrl: Fix allocation of cleanest CLOSID on platforms with no monitorsJames Morse
Commit 6eac36bb9eb0 ("x86/resctrl: Allocate the cleanest CLOSID by searching closid_num_dirty_rmid") added logic that causes resctrl to search for the CLOSID with the fewest dirty cache lines when creating a new control group, if requested by the arch code. This depends on the values read from the llc_occupancy counters. The logic is applicable to architectures where the CLOSID effectively forms part of the monitoring identifier and so do not allow complete freedom to choose an unused monitoring identifier for a given CLOSID. This support missed that some platforms may not have these counters. This causes a NULL pointer dereference when creating a new control group as the array was not allocated by dom_data_init(). As this feature isn't necessary on platforms that don't have cache occupancy monitors, add this to the check that occurs when a new control group is allocated. Fixes: 6eac36bb9eb0 ("x86/resctrl: Allocate the cleanest CLOSID by searching closid_num_dirty_rmid") Signed-off-by: James Morse <james.morse@arm.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Reviewed-by: David Hildenbrand <david@redhat.com> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com> Reviewed-by: Fenghua Yu <fenghuay@nvidia.com> Reviewed-by: Babu Moger <babu.moger@amd.com> Tested-by: Carl Worth <carl@os.amperecomputing.com> # arm64 Tested-by: Shaopeng Tan <tan.shaopeng@jp.fujitsu.com> Tested-by: Peter Newman <peternewman@google.com> Tested-by: Amit Singh Tomar <amitsinght@marvell.com> # arm64 Tested-by: Shanker Donthineni <sdonthineni@nvidia.com> # arm64 Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20250311183715.16445-2-james.morse@arm.com
2025-02-15kernfs: Use RCU to access kernfs_node::name.Sebastian Andrzej Siewior
Using RCU lifetime rules to access kernfs_node::name can avoid the trouble with kernfs_rename_lock in kernfs_name() and kernfs_path_from_node() if the fs was created with KERNFS_ROOT_INVARIANT_PARENT. This is usefull as it allows to implement kernfs_path_from_node() only with RCU protection and avoiding kernfs_rename_lock. The lock is only required if the __parent node can be changed and the function requires an unchanged hierarchy while it iterates from the node to its parent. The change is needed to allow the lookup of the node's path (kernfs_path_from_node()) from context which runs always with disabled preemption and or interrutps even on PREEMPT_RT. The problem is that kernfs_rename_lock becomes a sleeping lock on PREEMPT_RT. I went through all ::name users and added the required access for the lookup with a few extensions: - rdtgroup_pseudo_lock_create() drops all locks and then uses the name later on. resctrl supports rename with different parents. Here I made a temporal copy of the name while it is used outside of the lock. - kernfs_rename_ns() accepts NULL as new_parent. This simplifies sysfs_move_dir_ns() where it can set NULL in order to reuse the current name. - kernfs_rename_ns() is only using kernfs_rename_lock if the parents are different. All users use either kernfs_rwsem (for stable path view) or just RCU for the lookup. The ::name uses always RCU free. Use RCU lifetime guarantees to access kernfs_node::name. Suggested-by: Tejun Heo <tj@kernel.org> Acked-by: Tejun Heo <tj@kernel.org> Reported-by: syzbot+6ea37e2e6ffccf41a7e6@syzkaller.appspotmail.com Closes: https://lore.kernel.org/lkml/67251dc6.050a0220.529b6.015e.GAE@google.com/ Reported-by: Hillf Danton <hdanton@sina.com> Closes: https://lore.kernel.org/20241102001224.2789-1-hdanton@sina.com Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20250213145023.2820193-7-bigeasy@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-02-15kernfs: Use RCU to access kernfs_node::parent.Sebastian Andrzej Siewior
kernfs_rename_lock is used to obtain stable kernfs_node::{name|parent} pointer. This is a preparation to access kernfs_node::parent under RCU and ensure that the pointer remains stable under the RCU lifetime guarantees. For a complete path, as it is done in kernfs_path_from_node(), the kernfs_rename_lock is still required in order to obtain a stable parent relationship while computing the relevant node depth. This must not change while the nodes are inspected in order to build the path. If the kernfs user never moves the nodes (changes the parent) then the kernfs_rename_lock is not required and the RCU guarantees are sufficient. This "restriction" can be set with KERNFS_ROOT_INVARIANT_PARENT. Otherwise the lock is required. Rename kernfs_node::parent to kernfs_node::__parent to denote the RCU access and use RCU accessor while accessing the node. Make cgroup use KERNFS_ROOT_INVARIANT_PARENT since the parent here can not change. Acked-by: Tejun Heo <tj@kernel.org> Cc: Yonghong Song <yonghong.song@linux.dev> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de> Link: https://lore.kernel.org/r/20250213145023.2820193-6-bigeasy@linutronix.de Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
2025-01-21Merge tag 'x86_cpu_for_v6.14_rc1' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cpuid updates from Borislav Petkov: - Remove the less generic CPU matching infra around struct x86_cpu_desc and use the generic struct x86_cpu_id thing - Remove magic naked numbers for CPUID functions and use proper defines of the prefix CPUID_LEAF_*. Consolidate some of the crazy use around the tree - Smaller cleanups and improvements * tag 'x86_cpu_for_v6.14_rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/cpu: Make all all CPUID leaf names consistent x86/fpu: Remove unnecessary CPUID level check x86/fpu: Move CPUID leaf definitions to common code x86/tsc: Remove CPUID "frequency" leaf magic numbers. x86/tsc: Move away from TSC leaf magic numbers x86/cpu: Move TSC CPUID leaf definition x86/cpu: Refresh DCA leaf reading code x86/cpu: Remove unnecessary MwAIT leaf checks x86/cpu: Use MWAIT leaf definition x86/cpu: Move MWAIT leaf definition to common header x86/cpu: Remove 'x86_cpu_desc' infrastructure x86/cpu: Move AMD erratum 1386 table over to 'x86_cpu_id' x86/cpu: Replace PEBS use of 'x86_cpu_desc' use with 'x86_cpu_id' x86/cpu: Expose only stepping min/max interface x86/cpu: Introduce new microcode matching helper x86/cpufeature: Document cpu_feature_enabled() as the default to use x86/paravirt: Remove the WBINVD callback x86/cpufeatures: Free up unused feature bits
2024-12-12x86/resctrl: Add write option to "mba_MBps_event" fileTony Luck
The "mba_MBps" mount option provides an alternate method to control memory bandwidth. Instead of specifying allowable bandwidth as a percentage of maximum possible, the user provides a MiB/s limit value. There is a file in each CTRL_MON group directory that shows the event currently in use. Allow writing that file to choose a different event. A user can choose any of the memory bandwidth monitoring events listed in /sys/fs/resctrl/info/L3_mon/mon_features independently for each CTRL_MON group by writing to each of the "mba_MBps_event" files. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20241206163148.83828-8-tony.luck@intel.com
2024-12-12x86/resctrl: Add "mba_MBps_event" file to CTRL_MON directoriesTony Luck
The "mba_MBps" mount option provides an alternate method to control memory bandwidth. Instead of specifying allowable bandwidth as a percentage of maximum possible, the user provides a MiB/s limit value. In preparation to allow the user to pick the memory bandwidth monitoring event used as input to the feedback loop, provide a file in each CTRL_MON group directory that shows the event currently in use. Note that this file is only visible when the "mba_MBps" mount option is in use. Suggested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20241206163148.83828-7-tony.luck@intel.com
2024-12-10x86/resctrl: Make mba_sc use total bandwidth if local is not supportedTony Luck
The default input measurement to the mba_sc feedback loop for memory bandwidth control when the user mounts with the "mba_MBps" option is the local bandwidth event. But some systems may not support a local bandwidth event. When local bandwidth event is not supported, check for support of total bandwidth and use that instead. Relax the mount option check to allow use of the "mba_MBps" option for systems when only total bandwidth monitoring is supported. Also update the error message. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20241206163148.83828-6-tony.luck@intel.com
2024-12-10x86/resctrl: Compute memory bandwidth for all supported eventsTony Luck
Switching between local and total memory bandwidth events as the input to the mba_sc feedback loop would be cumbersome and take effect slowly in the current implementation as the bandwidth is only known after two consecutive readings of the same event. Compute the bandwidth for all supported events. This doesn't add significant overhead and will make changing which event is used simple. Suggested-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20241206163148.83828-5-tony.luck@intel.com
2024-12-10x86/resctrl: Modify update_mba_bw() to use per CTRL_MON group eventTony Luck
update_mba_bw() hard codes use of the memory bandwidth local event which prevents more flexible options from being deployed. Change this function to use the event specified in the rdtgroup that is being processed. Mount time checks for the "mba_MBps" option ensure that local memory bandwidth is enabled. So drop the redundant is_mbm_local_enabled() check. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20241206163148.83828-4-tony.luck@intel.com
2024-12-10x86/resctrl: Prepare for per-CTRL_MON group mba_MBps controlTony Luck
Resctrl uses local memory bandwidth event as input to the feedback loop when the mba_MBps mount option is used. This means that this mount option cannot be used on systems that only support monitoring of total bandwidth. Prepare to allow users to choose the input event independently for each CTRL_MON group by adding a global variable "mba_mbps_default_event" used to set the default event for each CTRL_MON group, and a new field "mba_mbps_event" in struct rdtgroup to track which event is used for each CTRL_MON group. Notes: 1) Both of these are only used when the user mounts the filesystem with the "mba_MBps" option. 2) Only check for support of local bandwidth event when initializing mba_mbps_default_event. Support for total bandwidth event can be added after other routines in resctrl have been updated to handle total bandwidth event. [ bp: Move mba_mbps_default_event extern into the arch header. ] Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Tested-by: Babu Moger <babu.moger@amd.com> Link: https://lore.kernel.org/r/20241206163148.83828-3-tony.luck@intel.com
2024-12-09x86/resctrl: Introduce resctrl_file_fflags_init() to initialize fflagsBabu Moger
thread_throttle_mode_init() and mbm_config_rftype_init() both initialize fflags for resctrl files. Adding new files will involve adding another function to initialize the fflags. This can be simplified by adding a new function resctrl_file_fflags_init() and passing the file name and flags to be initialized. Consolidate fflags initialization into resctrl_file_fflags_init() and remove thread_throttle_mode_init() and mbm_config_rftype_init(). [ Tony: Drop __init attribute so resctrl_file_fflags_init() can be used at run time. ] Signed-off-by: Babu Moger <babu.moger@amd.com> Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/r/20241206163148.83828-2-tony.luck@intel.com
2024-12-09x86/resctrl: Use kthread_run_on_cpu()Frederic Weisbecker
Use the proper API instead of open coding it. Signed-off-by: Frederic Weisbecker <frederic@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Acked-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/r/20240807160228.26206-3-frederic@kernel.org
2024-12-06x86/paravirt: Remove the WBINVD callbackJuergen Gross
The pv_ops::cpu.wbinvd paravirt callback is a leftover of lguest times. Today it is no longer needed, as all users use the native WBINVD implementation. Remove the callback and rename native_wbinvd() to wbinvd(). Signed-off-by: Juergen Gross <jgross@suse.com> Signed-off-by: Ingo Molnar <mingo@kernel.org> Link: https://lore.kernel.org/r/20241203071550.26487-1-jgross@suse.com
2024-11-19Merge tag 'x86_cache_for_v6.13' of ↵Linus Torvalds
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip Pull x86 cache resource control updates from Borislav Petkov: - Add support for 6-node sub-NUMA clustering on Intel - Cleanup * tag 'x86_cache_for_v6.13' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: x86/resctrl: Support Sub-NUMA cluster mode SNC6 x86/resctrl: Slightly clean-up mbm_config_show()
2024-11-06x86/resctrl: Support Sub-NUMA cluster mode SNC6Tony Luck
Support Sub-NUMA cluster mode with 6 nodes per L3 cache (SNC6) on some Intel platforms. Signed-off-by: Tony Luck <tony.luck@intel.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/r/20241031220213.17991-1-tony.luck@intel.com
2024-10-14x86/resctrl: Slightly clean-up mbm_config_show()Christophe JAILLET
'mon_info' is already zeroed in the list_for_each_entry() loop below. There is no need to explicitly initialize it here. It just wastes some space and cycles. Remove this un-needed code. On a x86_64, with allmodconfig: Before: ====== text data bss dec hex filename 74967 5103 1880 81950 1401e arch/x86/kernel/cpu/resctrl/rdtgroup.o After: ===== text data bss dec hex filename 74903 5103 1880 81886 13fde arch/x86/kernel/cpu/resctrl/rdtgroup.o Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/r/b2ebc809c8b6c6440d17b12ccf7c2d29aaafd488.1720868538.git.christophe.jaillet@wanadoo.fr
2024-10-08x86/resctrl: Annotate get_mem_config() functions as __initNathan Chancellor
After a recent LLVM change [1] that deduces __cold on functions that only call cold code (such as __init functions), there is a section mismatch warning from __get_mem_config_intel(), which got moved to .text.unlikely. as a result of that optimization: WARNING: modpost: vmlinux: section mismatch in reference: \ __get_mem_config_intel+0x77 (section: .text.unlikely.) -> thread_throttle_mode_init (section: .init.text) Mark __get_mem_config_intel() as __init as well since it is only called from __init code, which clears up the warning. While __rdt_get_mem_config_amd() does not exhibit a warning because it does not call any __init code, it is a similar function that is only called from __init code like __get_mem_config_intel(), so mark it __init as well to keep the code symmetrical. CONFIG_SECTION_MISMATCH_WARN_ONLY=n would turn this into a fatal error. Fixes: 05b93417ce5b ("x86/intel_rdt/mba: Add primary support for Memory Bandwidth Allocation (MBA)") Fixes: 4d05bf71f157 ("x86/resctrl: Introduce AMD QOS feature") Signed-off-by: Nathan Chancellor <nathan@kernel.org> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Cc: <stable@kernel.org> Link: https://github.com/llvm/llvm-project/commit/6b11573b8c5e3d36beee099dbe7347c2a007bf53 [1] Link: https://lore.kernel.org/r/20240917-x86-restctrl-get_mem_config_intel-init-v3-1-10d521256284@kernel.org
2024-10-08x86/resctrl: Avoid overflow in MB settings in bw_validate()Martin Kletzander
The resctrl schemata file supports specifying memory bandwidth associated with the Memory Bandwidth Allocation (MBA) feature via a percentage (this is the default) or bandwidth in MiBps (when resctrl is mounted with the "mba_MBps" option). The allowed range for the bandwidth percentage is from /sys/fs/resctrl/info/MB/min_bandwidth to 100, using a granularity of /sys/fs/resctrl/info/MB/bandwidth_gran. The supported range for the MiBps bandwidth is 0 to U32_MAX. There are two issues with parsing of MiBps memory bandwidth: * The user provided MiBps is mistakenly rounded up to the granularity that is unique to percentage input. * The user provided MiBps is parsed using unsigned long (thus accepting values up to ULONG_MAX), and then assigned to u32 that could result in overflow. Do not round up the MiBps value and parse user provided bandwidth as the u32 it is intended to be. Use the appropriate kstrtou32() that can detect out of range values. Fixes: 8205a078ba78 ("x86/intel_rdt/mba_sc: Add schemata support") Fixes: 6ce1560d35f6 ("x86/resctrl: Switch over to the resctrl mbps_val list") Co-developed-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com> Signed-off-by: Martin Kletzander <nert.pinx@gmail.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Reviewed-by: Tony Luck <tony.luck@intel.com>
2024-09-27[tree-wide] finally take no_llseek outAl Viro
no_llseek had been defined to NULL two years ago, in commit 868941b14441 ("fs: remove no_llseek") To quote that commit, At -rc1 we'll need do a mechanical removal of no_llseek - git grep -l -w no_llseek | grep -v porting.rst | while read i; do sed -i '/\<no_llseek\>/d' $i done would do it. Unfortunately, that hadn't been done. Linus, could you do that now, so that we could finally put that thing to rest? All instances are of the form .llseek = no_llseek, so it's obviously safe. Signed-off-by: Al Viro <viro@zeniv.linux.org.uk> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2024-08-28x86/resctrl: Fix arch_mbm_* array overrun on SNCPeter Newman
When using resctrl on systems with Sub-NUMA Clustering enabled, monitoring groups may be allocated RMID values which would overrun the arch_mbm_{local,total} arrays. This is due to inconsistencies in whether the SNC-adjusted num_rmid value or the unadjusted value in resctrl_arch_system_num_rmid_idx() is used. The num_rmid value for the L3 resource is currently: resctrl_arch_system_num_rmid_idx() / snc_nodes_per_l3_cache As a simple fix, make resctrl_arch_system_num_rmid_idx() return the SNC-adjusted, L3 num_rmid value on x86. Fixes: e13db55b5a0d ("x86/resctrl: Introduce snc_nodes_per_l3_cache") Signed-off-by: Peter Newman <peternewman@google.com> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com> Link: https://lore.kernel.org/r/20240822190212.1848788-1-peternewman@google.com