path: root/arch/x86/events

Age  Commit message  Author

2025-12-14  Merge tag 'perf-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds)

Pull perf event fixes from Ingo Molnar:

 - Fix NULL pointer dereference crash in the Intel PMU driver
 - Fix missing read event generation on task exit
 - Fix AMD uncore driver init error handling
 - Fix whitespace noise

* tag 'perf-urgent-2025-12-12' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  perf/x86/intel: Fix NULL event dereference crash in handle_pmi_common()
  perf/core: Fix missing read event generation on task exit
  perf/x86/amd/uncore: Fix the return value of amd_uncore_df_event_init() on error
  perf/uprobes: Remove <space><Tab> whitespace noise

2025-12-12  perf/x86/intel: Fix NULL event dereference crash in handle_pmi_common()  (Evan Li)

handle_pmi_common() may observe an active bit set in cpuc->active_mask while the corresponding cpuc->events[] entry has already been cleared, which leads to a NULL pointer dereference. This can happen when interrupt throttling stops all events in a group while PEBS processing is still in progress. perf_event_overflow() can trigger perf_event_throttle_group(), which stops the group and clears the cpuc->events[] entry, but the active bit may still be set when handle_pmi_common() iterates over the events.

The following recent fix:

  7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss")

moved the cpuc->events[] clearing from x86_pmu_stop() to x86_pmu_del() and relied on cpuc->active_mask/pebs_enabled checks. However, handle_pmi_common() can still encounter a NULL cpuc->events[] entry despite the active bit being set.

Add an explicit NULL check on the event pointer before using it, to cover this legitimate scenario and avoid the NULL dereference crash. (A sketch of the check follows below.)

Fixes: 7e772a93eb61 ("perf/x86: Fix NULL event access and potential PEBS record loss")
Reported-by: kitta <kitta@linux.alibaba.com>
Co-developed-by: kitta <kitta@linux.alibaba.com>
Signed-off-by: Evan Li <evan.li@linux.alibaba.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Link: https://patch.msgid.link/20251212084943.2124787-1-evan.li@linux.alibaba.com
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=220855

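A minimal sketch of the added check, assuming the usual arch/x86/events naming (cpuc, active_mask, events[]); the surrounding handler code is abbreviated:

    for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) {
            struct perf_event *event = cpuc->events[bit];

            if (!test_bit(bit, cpuc->active_mask))
                    continue;

            /* The fix: tolerate a slot cleared by a racing throttle/x86_pmu_del() */
            if (!event)
                    continue;

            /* ... process the counter overflow for 'event' ... */
    }
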
2025-12-09  perf/x86/amd/uncore: Fix the return value of amd_uncore_df_event_init() on error  (Sandipan Das)

If amd_uncore_event_init() fails, return an error irrespective of the pmu_version. Setting hwc->config should be safe even if there is an error so use this opportunity to simplify the code.

Closes: https://lore.kernel.org/all/aTaI0ci3vZ44lmBn@stanley.mountain/
Fixes: d6389d3ccc13 ("perf/x86/amd/uncore: Refactor uncore management")
Reported-by: Dan Carpenter <dan.carpenter@linaro.org>
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/076935e23a70335d33bd6e23308b75ae0ad35ba2.1765268667.git.sandipan.das@amd.com

2025-12-05  Merge tag 'soc-drivers-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc  (Linus Torvalds)

Pull SoC driver updates from Arnd Bergmann:
 "This is the first half of the driver changes:

  - A treewide interface change to the "syscore" operations for power management, as a preparation for future Tegra specific changes

  - Reset controller updates with added drivers for LAN969x, eic770 and RZ/G3S SoCs

  - Protection of system controller registers on Renesas and Google SoCs, to prevent trivially triggering a system crash from e.g. debugfs access

  - soc_device identification updates on Nvidia, Exynos and Mediatek

  - debugfs support in the ST STM32 firewall driver

  - Minor updates for SoC drivers on AMD/Xilinx, Renesas, Allwinner, TI

  - Cleanups for memory controller support on Nvidia and Renesas"

* tag 'soc-drivers-6.19' of git://git.kernel.org/pub/scm/linux/kernel/git/soc/soc: (114 commits)
  memory: tegra186-emc: Fix missing put_bpmp
  Documentation: reset: Remove reset_controller_add_lookup()
  reset: fix BIT macro reference
  reset: rzg2l-usbphy-ctrl: Fix a NULL vs IS_ERR() bug in probe
  reset: th1520: Support reset controllers in more subsystems
  reset: th1520: Prepare for supporting multiple controllers
  dt-bindings: reset: thead,th1520-reset: Add controllers for more subsys
  dt-bindings: reset: thead,th1520-reset: Remove non-VO-subsystem resets
  reset: remove legacy reset lookup code
  clk: davinci: psc: drop unused reset lookup
  reset: rzg2l-usbphy-ctrl: Add support for RZ/G3S SoC
  reset: rzg2l-usbphy-ctrl: Add support for USB PWRRDY
  dt-bindings: reset: renesas,rzg2l-usbphy-ctrl: Document RZ/G3S support
  reset: eswin: Add eic7700 reset driver
  dt-bindings: reset: eswin: Documentation for eic7700 SoC
  reset: sparx5: add LAN969x support
  dt-bindings: reset: microchip: Add LAN969x support
  soc: rockchip: grf: Add select correct PWM implementation on RK3368
  soc/tegra: pmc: Add USB wake events for Tegra234
  amba: tegra-ahb: Fix device leak on SMMU enable
  ...

2025-12-02  Merge tag 'x86_misc_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds)

Pull misc x86 updates from Dave Hansen:
 "The most significant are some changes to ensure that symbols exported for KVM are used only by KVM modules themselves, along with some related cleanups. In true x86/misc fashion, the other patch is completely unrelated and just enhances an existing pr_warn() to make it clear to users how they have tainted their kernel when something is mucking with MSRs.

  Summary:

  - Make MSR-induced taint easier for users to track down

  - Restrict KVM-specific exports to KVM itself"

* tag 'x86_misc_for_6.19-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  x86: Restrict KVM-induced symbol exports to KVM modules where obvious/possible
  x86/mm: Drop unnecessary export of "ptdump_walk_pgd_level_debugfs"
  x86/mtrr: Drop unnecessary export of "mtrr_state"
  x86/bugs: Drop unnecessary export of "x86_spec_ctrl_base"
  x86/msr: Add CPU_OUT_OF_SPEC taint name to "unrecognized" pr_warn(msg)

2025-12-01  Merge tag 'perf-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds)

Pull performance events updates from Ingo Molnar:

 "Callchain support:

  - Add support for deferred user-space stack unwinding for perf, enabled on x86. (Peter Zijlstra, Steven Rostedt)
  - unwind_user/x86: Enable frame pointer unwinding on x86 (Josh Poimboeuf)

  x86 PMU support and infrastructure:

  - x86/insn: Simplify for_each_insn_prefix() (Peter Zijlstra)
  - x86/insn,uprobes,alternative: Unify insn_is_nop() (Peter Zijlstra)

  Intel PMU driver:

  - Large series to prepare for and implement architectural PEBS support for Intel platforms such as Clearwater Forest (CWF) and Panther Lake (PTL). (Dapeng Mi, Kan Liang)
  - Check dynamic constraints (Kan Liang)
  - Optimize PEBS extended config (Peter Zijlstra)
  - cstates:
     - Remove PC3 support from LunarLake (Zhang Rui)
     - Add Pantherlake support (Zhang Rui)
     - Clearwater Forest support (Zide Chen)

  AMD PMU driver:

  - x86/amd: Check event before enable to avoid GPF (George Kennedy)

  Fixes and cleanups:

  - task_work: Fix NMI race condition (Peter Zijlstra)
  - perf/x86: Fix NULL event access and potential PEBS record loss (Dapeng Mi)
  - Misc other fixes and cleanups (Dapeng Mi, Ingo Molnar, Peter Zijlstra)"

* tag 'perf-core-2025-12-01' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (38 commits)
  perf/x86/intel: Fix and clean up intel_pmu_drain_arch_pebs() type use
  perf/x86/intel: Optimize PEBS extended config
  perf/x86/intel: Check PEBS dyn_constraints
  perf/x86/intel: Add a check for dynamic constraints
  perf/x86/intel: Add counter group support for arch-PEBS
  perf/x86/intel: Setup PEBS data configuration and enable legacy groups
  perf/x86/intel: Update dyn_constraint base on PEBS event precise level
  perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR
  perf/x86/intel: Process arch-PEBS records or record fragments
  perf/x86/intel/ds: Factor out PEBS group processing code to functions
  perf/x86/intel/ds: Factor out PEBS record processing code to functions
  perf/x86/intel: Initialize architectural PEBS
  perf/x86/intel: Correct large PEBS flag check
  perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call
  perf/x86: Fix NULL event access and potential PEBS record loss
  perf/x86: Remove redundant is_x86_event() prototype
  entry,unwind/deferred: Fix unwind_reset_info() placement
  unwind_user/x86: Fix arch=um build
  perf: Support deferred user unwind
  unwind_user/x86: Teach FP unwind about start of function
  ...

2025-11-19  perf/x86/intel/uncore: Remove superfluous check  (Jiri Slaby (SUSE))

The 'pmu' pointer cannot be NULL, as it is taken as a pointer to an array. Remove the superfluous NULL check.

Found by Coverity: CID#1497507.

Signed-off-by: Jiri Slaby (SUSE) <jirislaby@kernel.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Cc: Liang Kan <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://patch.msgid.link/20251119091538.825307-1-jirislaby@kernel.org

2025-11-14  syscore: Pass context data to callbacks  (Thierry Reding)

Several drivers can benefit from registering per-instance data along with the syscore operations. To achieve this, move the modifiable fields out of the syscore_ops structure and into a separate struct syscore that can be registered with the framework. Add a void * driver data field for drivers to store contextual data that will be passed to the syscore ops.

Acked-by: Rafael J. Wysocki (Intel) <rafael@kernel.org>
Signed-off-by: Thierry Reding <treding@nvidia.com>

2025-11-12  x86: Restrict KVM-induced symbol exports to KVM modules where obvious/possible  (Sean Christopherson)

Extend KVM's export macro framework to provide EXPORT_SYMBOL_FOR_KVM(), and use the helper macro to export symbols for KVM throughout x86 if and only if KVM will build one or more modules, and only for those modules. To avoid unnecessary exports when CONFIG_KVM=m but kvm.ko will not be built (because no vendor modules are selected), let arch code #define EXPORT_SYMBOL_FOR_KVM to suppress/override the exports. (A sketch of the pattern follows below.)

Note, the set of symbols to restrict to KVM was generated by manual search and audit; any "misses" are due to human error, not some grand plan.

Signed-off-by: Sean Christopherson <seanjc@google.com>
Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Acked-by: Kai Huang <kai.huang@intel.com>
Tested-by: Kai Huang <kai.huang@intel.com>
Link: https://patch.msgid.link/20251112173944.1380633-5-seanjc%40google.com

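A hedged sketch of the export pattern described above; the module list and the use of EXPORT_SYMBOL_GPL_FOR_MODULES() here are illustrative assumptions, not the committed macro plumbing:

    /* Export only when at least one KVM module will actually be built,
     * and then only to the named modules (illustrative module list). */
    #if IS_MODULE(CONFIG_KVM_INTEL) || IS_MODULE(CONFIG_KVM_AMD)
    #define EXPORT_SYMBOL_FOR_KVM(sym) \
            EXPORT_SYMBOL_GPL_FOR_MODULES(sym, "kvm,kvm-intel,kvm-amd")
    #else
    #define EXPORT_SYMBOL_FOR_KVM(sym)      /* no KVM modules: no export */
    #endif
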
2025-11-12  perf/x86/intel: Fix and clean up intel_pmu_drain_arch_pebs() type use  (Ingo Molnar)

The following commit introduced a build failure on x86-32:

  21954c8a0ff ("perf/x86/intel: Process arch-PEBS records or record fragments")

  arch/x86/events/intel/ds.c:2983:24: error: cast from pointer to integer of different size [-Werror=pointer-to-int-cast]

The forced type conversions to 'u64' and 'void *' are not 32-bit clean, but they are also entirely unnecessary: ->pebs_vaddr is 'void *' already, and integer-compatible pointer arithmetic will work just fine on it. Fix & simplify the code.

Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Fixes: d21954c8a0ff ("perf/x86/intel: Process arch-PEBS records or record fragments")
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Link: https://patch.msgid.link/20251029102136.61364-10-dapeng1.mi@linux.intel.com

2025-11-07  perf/x86/intel: Optimize PEBS extended config  (Peter Zijlstra)

Similar to enable_acr_event, avoid the branch.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

2025-11-07  perf/x86/intel: Check PEBS dyn_constraints  (Peter Zijlstra)

Handle the interaction between ("perf/x86/intel: Update dyn_constraint base on PEBS event precise level") and ("perf/x86/intel: Add a check for dynamic constraints").

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

2025-11-07  perf/x86/intel: Add a check for dynamic constraints  (Kan Liang)

The current event scheduler has a limit: if the counter constraint of an event is not a subset of any other counter constraint with an equal or higher weight, the counters may not be fully utilized. To work around it, commit bc1738f6ee83 ("perf, x86: Fix event scheduler for constraints with overlapping counters") introduced an overlap flag, which is hardcoded on the event constraints that may trigger the limit. It only works for static constraints.

Many features on and after Intel PMON v6 require dynamic constraints. An event constraint is decided by both static and dynamic constraints at runtime. See commit 4dfe3232cc04 ("perf/x86: Add dynamic constraint"). The dynamic constraints come from CPUID enumeration, so it's impossible to hardcode them in advance, and it's not practical to set the overlap flag on all events; that would be harmful to the scheduler.

For the existing Intel platforms, the dynamic constraints don't trigger the limit, so a real fix is not required. However, for virtualization, a VMM may give a weird CPUID enumeration to a guest, and it's impossible to predict what that enumeration will be. A check is introduced which can list the possible breakage if a weird enumeration is used. Check the dynamic constraints enumerated for normal counters, branch counters logging, and auto-counter reload; check both PEBS and non-PEBS constraints. (See the subset-test sketch below.)

Closes: https://lore.kernel.org/lkml/20250416195610.GC38216@noisy.programming.kicks-ass.net/
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20250512175542.2000708-1-kan.liang@linux.intel.com

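A minimal sketch of the scheduler property being verified, under the assumption that constraints reduce to plain counter bitmasks (the helper name is illustrative, not the driver's):

    /* The scheduler copes when every constraint's counter mask is a subset
     * of each constraint with equal or higher weight; a CPUID-enumerated
     * dynamic mask that breaks this property can strand counters. */
    static bool cntr_mask_is_subset(u64 mask, u64 wider_mask)
    {
            return (mask & ~wider_mask) == 0;  /* no counter outside the wider mask */
    }
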
2025-11-07  perf/x86/intel: Add counter group support for arch-PEBS  (Dapeng Mi)

Based on the previous adaptive PEBS counter snapshot support, add counter group support for architectural PEBS. Since arch-PEBS shares the same counter group layout with adaptive PEBS, directly reuse the __setup_pebs_counter_group() helper to process the arch-PEBS counter group.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-13-dapeng1.mi@linux.intel.com

2025-11-07  perf/x86/intel: Setup PEBS data configuration and enable legacy groups  (Dapeng Mi)

Unlike legacy PEBS, arch-PEBS provides per-counter PEBS data configuration by programming the IA32_PMC_GPx/FXx_CFG_C MSRs. This patch obtains the PEBS data configuration from the event attribute, writes it to the IA32_PMC_GPx/FXx_CFG_C MSR, and enables the corresponding PEBS groups.

Note that this patch only enables XMM SIMD register sampling for arch-PEBS; sampling of the other SIMD registers (OPMASK/YMM/ZMM) on arch-PEBS will be supported once PMI-based OPMASK/YMM/ZMM sampling is supported.

Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-12-dapeng1.mi@linux.intel.com

2025-11-07  perf/x86/intel: Update dyn_constraint base on PEBS event precise level  (Dapeng Mi)

arch-PEBS provides CPUID leaves to enumerate which counters support PEBS sampling and precise-distribution PEBS sampling. Thus PEBS constraints should be configured dynamically, based on these counter and precise-distribution bitmaps, instead of being defined statically. Update the event's dyn_constraint based on the PEBS event's precise level.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-11-dapeng1.mi@linux.intel.com

2025-11-07  perf/x86/intel: Allocate arch-PEBS buffer and initialize PEBS_BASE MSR  (Dapeng Mi)

Arch-PEBS introduces a new MSR, IA32_PEBS_BASE, to store the arch-PEBS buffer's physical address. This patch allocates the arch-PEBS buffer and then initializes the IA32_PEBS_BASE MSR with the buffer's physical address.

Co-developed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-10-dapeng1.mi@linux.intel.com

2025-11-07  perf/x86/intel: Process arch-PEBS records or record fragments  (Dapeng Mi)

A significant difference from adaptive PEBS is that an arch-PEBS record supports fragments: a record can be split into several independent fragments, each of which has its own arch-PEBS header. This patch defines the architectural PEBS record layout structures and adds helpers to process arch-PEBS records or fragments. Only legacy PEBS groups like the basic, GPR, XMM and LBR groups are supported in this patch; capturing the newly added YMM/ZMM/OPMASK vector registers will be supported in the future.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-9-dapeng1.mi@linux.intel.com

2025-11-07  perf/x86/intel/ds: Factor out PEBS group processing code to functions  (Dapeng Mi)

Adaptive PEBS and arch-PEBS share a lot of the same code to process PEBS groups, like the basic, GPR and meminfo groups. Extract this shared code into generic functions to avoid duplication.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-8-dapeng1.mi@linux.intel.com

2025-11-07  perf/x86/intel/ds: Factor out PEBS record processing code to functions  (Dapeng Mi)

Aside from some PEBS record layout differences, arch-PEBS can share most of the PEBS record processing code with adaptive PEBS. Thus, factor out this common processing code into independent inline functions, so they can be reused by the subsequent arch-PEBS handler.

Suggested-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-7-dapeng1.mi@linux.intel.com

2025-11-07  perf/x86/intel: Initialize architectural PEBS  (Dapeng Mi)

arch-PEBS leverages the CPUID.23H.4/5 sub-leaves to enumerate the arch-PEBS supported capabilities and counters bitmap. This patch parses these two sub-leaves and initializes the arch-PEBS capabilities and corresponding structures. Since the IA32_PEBS_ENABLE and MSR_PEBS_DATA_CFG MSRs no longer exist with arch-PEBS, there is no need to manipulate them; add a simple pair of __intel_pmu_pebs_enable/disable() callbacks for arch-PEBS instead.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-6-dapeng1.mi@linux.intel.com

2025-11-07  perf/x86/intel: Correct large PEBS flag check  (Dapeng Mi)

The current large PEBS flag check only checks whether sample_regs_user contains unsupported GPRs, but doesn't check whether sample_regs_intr does. Currently PEBS HW supports sampling all perf-supported GPRs, so the missing check doesn't cause a real issue. But that stops being true once subsequent patches add SSP register sampling: SSP sampling is not supported by adaptive PEBS HW and will only be supported starting with arch-PEBS HW. So correct this check. (A sketch of the corrected condition follows below.)

Fixes: a47ba4d77e12 ("perf/x86: Enable free running PEBS for REGS_USER/INTR")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-5-dapeng1.mi@linux.intel.com

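A minimal sketch of the corrected condition, assuming the driver's PEBS_GP_REGS mask of PEBS-sampleable GPRs; the surrounding flag computation is abbreviated:

    /* Large PEBS is only safe when both register sets are fully
     * PEBS-capable; previously only sample_regs_user was checked. */
    if ((event->attr.sample_regs_user & ~PEBS_GP_REGS) ||
        (event->attr.sample_regs_intr & ~PEBS_GP_REGS))
            return false;   /* fall back to single-record PEBS */
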
2025-11-07  perf/x86/intel: Replace x86_pmu.drain_pebs calling with static call  (Dapeng Mi)

Use the x86_pmu_drain_pebs static call to replace calling the x86_pmu.drain_pebs function pointer. (A sketch of the pattern follows below.)

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-4-dapeng1.mi@linux.intel.com

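A hedged sketch of the static-call pattern used here; the declaration and update sites in the real driver may differ slightly:

    /* Declare a NULL-safe static call typed after the drain_pebs pointer ... */
    DEFINE_STATIC_CALL_NULL(x86_pmu_drain_pebs, *x86_pmu.drain_pebs);

    /* ... bind it once during PMU init ... */
    static_call_update(x86_pmu_drain_pebs, x86_pmu.drain_pebs);

    /* ... so call sites avoid the indirect (retpoline-unfriendly) call: */
    static_call(x86_pmu_drain_pebs)(regs, &data);
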
2025-11-07  perf/x86: Fix NULL event access and potential PEBS record loss  (Dapeng Mi)

When intel_pmu_drain_pebs_icl() is called to drain PEBS records, perf_event_overflow() may be called to process the last PEBS record, and perf_event_overflow() can trigger the interrupt throttle and stop all events of the group, as the call chain below shows:

  perf_event_overflow()
   -> __perf_event_overflow()
    -> __perf_event_account_interrupt()
     -> perf_event_throttle_group()
      -> perf_event_throttle()
       -> event->pmu->stop()
        -> x86_pmu_stop()

The side effect of stopping the events is that all corresponding event pointers in the cpuc->events[] array are cleared to NULL. Assume there are two PEBS events (event a and event b) in a group. When intel_pmu_drain_pebs_icl() calls perf_event_overflow() to process the last PEBS record of PEBS event a, interrupt throttling is triggered and all pointers of event a and event b are cleared to NULL. Then intel_pmu_drain_pebs_icl() tries to process the last PEBS record of event b and hits a NULL pointer access.

To avoid this issue, move the cpuc->events[] clearing from x86_pmu_stop() to x86_pmu_del(). This is safe since cpuc->active_mask or cpuc->pebs_enabled is always checked before accessing the event pointer in cpuc->events[]. (A sketch of the move follows below.)

Closes: https://lore.kernel.org/oe-lkp/202507042103.a15d2923-lkp@intel.com
Fixes: 9734e25fbf5a ("perf: Fix the throttle logic for a group")
Reported-by: kernel test robot <oliver.sang@intel.com>
Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-3-dapeng1.mi@linux.intel.com

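A minimal sketch of where the clearing moves, assuming the usual x86 PMU callback shapes (bodies abbreviated):

    static void x86_pmu_stop(struct perf_event *event, int flags)
    {
            struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);

            /* ... stop the counter ... */
            __clear_bit(event->hw.idx, cpuc->active_mask);
            /* cpuc->events[event->hw.idx] is now left intact here */
    }

    static void x86_pmu_del(struct perf_event *event, int flags)
    {
            struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);

            /* ... unschedule the event ... */
            cpuc->events[event->hw.idx] = NULL;     /* clearing moved here */
    }
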
2025-11-07  perf/x86: Remove redundant is_x86_event() prototype  (Dapeng Mi)

Two is_x86_event() prototypes are defined in perf_event.h. Remove the redundant one.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251029102136.61364-2-dapeng1.mi@linux.intel.com

2025-11-05  Revert "perf/x86: Always store regs->ip in perf_callchain_kernel()"  (Jiri Olsa)

This reverts commit 83f44ae0f8afcc9da659799db8693f74847e66b3.

Currently we store the initial stacktrace entry twice for non-HW pt_regs, i.e. for callers that fail the perf_hw_regs(regs) condition in perf_callchain_kernel(). It's easy to reproduce with this bpftrace one-liner:

  # bpftrace -e 'tracepoint:sched:sched_process_exec { print(kstack()); }'
  Attaching 1 probe...

        bprm_execve+1767
        bprm_execve+1767
        do_execveat_common.isra.0+425
        __x64_sys_execve+56
        do_syscall_64+133
        entry_SYSCALL_64_after_hwframe+118

When perf_callchain_kernel() calls unwind_start() with first_frame, AFAICS we do not skip regs->ip; it is added as part of the unwind process. Hence revert the extra perf_callchain_store() in the non-HW regs leg. (A sketch of the two legs follows below.)

I was not able to bisect this, so I'm not really sure why this was needed in v5.2 and why it's not working anymore, but I could see double entries as far back as v5.10. I tested both ORC and frame pointer unwinding with and without this fix, and except for the initial entry the stacktraces are the same.

Acked-by: Song Liu <song@kernel.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Link: https://lore.kernel.org/r/20251104215405.168643-2-jolsa@kernel.org
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Steven Rostedt (Google) <rostedt@goodmis.org>

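A hedged sketch of the two legs in perf_callchain_kernel() after the revert; the unwind loop and exact helper signatures are abbreviated, so treat the shapes as illustrative:

    if (perf_hw_regs(regs)) {
            /* real HW regs: regs->ip is not produced by the unwinder,
             * so it must be stored explicitly */
            if (perf_callchain_store(entry, regs->ip))
                    return;
            unwind_start(&state, current, regs, NULL);
    } else {
            /* synthetic regs: the unwinder already emits the first frame,
             * so the reverted extra perf_callchain_store() duplicated it */
            unwind_start(&state, current, NULL, (unsigned long *)regs->sp);
    }
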
2025-10-29  perf/x86/intel/uncore: Add uncore PMU support for Wildcat Lake  (dongsheng)

WildcatLake (WCL) is a variant of PantherLake (PTL) and shares the same uncore PMU features with PTL. Therefore, directly reuse Pantherlake's uncore PMU enabling code for WildcatLake.

Signed-off-by: dongsheng <dongsheng.x.zhang@intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20250908061639.938105-2-dapeng1.mi@linux.intel.com

2025-10-29  perf/x86/intel: Add PMU support for WildcatLake  (Dapeng Mi)

WildcatLake is a variant of PantherLake and shares the same PMU features, so directly reuse Pantherlake's code to enable the PMU features for WildcatLake.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Link: https://patch.msgid.link/20250908061639.938105-1-dapeng1.mi@linux.intel.com

2025-10-29  unwind_user/x86: Teach FP unwind about start of function  (Peter Zijlstra)

When userspace is interrupted at the start of a function, before we get a chance to complete the frame, the unwinder will miss one caller. x86 has a uprobe-specific fixup for this; add bits to the generic unwinder to support it.

Suggested-by: Jens Remus <jremus@linux.ibm.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251024145156.GM4068168@noisy.programming.kicks-ass.net

2025-10-29  perf/x86/intel/cstate: Add Pantherlake support  (Zhang Rui)

Like Lunarlake, Pantherlake supports CC1/CC6/CC7 and PC2/PC6/PC10.

Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://patch.msgid.link/20251023223754.1743928-4-zide.chen@intel.com

2025-10-29  perf/x86/intel/cstate: Remove PC3 support from LunarLake  (Zhang Rui)

LunarLake doesn't support Package C3. Remove the PC3 residency counter support from LunarLake.

Fixes: 26579860fbd5 ("perf/x86/intel/cstate: Add Lunarlake support")
Signed-off-by: Zhang Rui <rui.zhang@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://patch.msgid.link/20251023223754.1743928-3-zide.chen@intel.com

2025-10-29  perf/x86/intel/cstate: Add Clearwater Forest support  (Zide Chen)

Clearwater Forest is based on the Darkmont Atom microarchitecture. From the perspective of C-state residency profiling, it supports the same residency counters as Sierra Forest: CC1/CC6, PC2/PC6, and MC6.

Please note that the C1E residency counter can only be read via PMT, not MSR. Therefore, tools relying on the perf_event framework cannot access the C1E residency.

Signed-off-by: Zhenyu Wang <zhenyuw.linux@gmail.com>
Signed-off-by: Zide Chen <zide.chen@intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://patch.msgid.link/20251023223754.1743928-2-zide.chen@intel.com

2025-10-29  perf/x86/intel: Fix KASAN global-out-of-bounds warning  (Dapeng Mi)

When running the "perf mem record" command on CWF, the below KASAN global-out-of-bounds warning is seen:

  ==================================================================
  BUG: KASAN: global-out-of-bounds in cmt_latency_data+0x176/0x1b0
  Read of size 4 at addr ffffffffb721d000 by task dtlb/9850
  Call Trace:
   kasan_report+0xb8/0xf0
   cmt_latency_data+0x176/0x1b0
   setup_arch_pebs_sample_data+0xf49/0x2560
   intel_pmu_drain_arch_pebs+0x577/0xb00
   handle_pmi_common+0x6c4/0xc80

The issue is caused by the below code in __grt_latency_data(), which tries to access the x86_hybrid_pmu structure that doesn't exist on non-hybrid platforms like CWF:

  WARN_ON_ONCE(hybrid_pmu(event->pmu)->pmu_type == hybrid_big)

So add an is_hybrid() check before calling this WARN_ON_ONCE to fix the global-out-of-bounds access.

Fixes: 090262439f66 ("perf/x86/intel: Rename model-specific pebs_latency_data functions")
Reported-by: Xudong Hao <xudong.hao@intel.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Zide Chen <zide.chen@intel.com>
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20251028064214.1451968-1-dapeng1.mi@linux.intel.com

2025-10-16  perf/x86/amd: Check event before enable to avoid GPF  (George Kennedy)

On AMD machines cpuc->events[idx] can become NULL in a subtle race condition with NMI->throttle->x86_pmu_stop(). Check the event for NULL in amd_pmu_enable_all() before enabling it, to avoid a GPF. This appears to be an AMD-only issue. (A sketch of the guard follows below.)

Syzkaller reported a GPF in amd_pmu_enable_all:

  INFO: NMI handler (perf_event_nmi_handler) took too long to run: 13.143 msecs
  Oops: general protection fault, probably for non-canonical address 0xdffffc0000000034: 0000 PREEMPT SMP KASAN NOPTI
  KASAN: null-ptr-deref in range [0x00000000000001a0-0x00000000000001a7]
  CPU: 0 UID: 0 PID: 328415 Comm: repro_36674776 Not tainted 6.12.0-rc1-syzk
  RIP: 0010:x86_pmu_enable_event (arch/x86/events/perf_event.h:1195 arch/x86/events/core.c:1430)
  RSP: 0018:ffff888118009d60 EFLAGS: 00010012
  RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000000
  RDX: 0000000000000034 RSI: 0000000000000000 RDI: 00000000000001a0
  RBP: 0000000000000001 R08: 0000000000000000 R09: 0000000000000000
  R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
  R13: ffff88811802a440 R14: ffff88811802a240 R15: ffff8881132d8601
  FS: 00007f097dfaa700(0000) GS:ffff888118000000(0000) GS:0000000000000000
  CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
  CR2: 00000000200001c0 CR3: 0000000103d56000 CR4: 00000000000006f0
  Call Trace:
   <IRQ>
   amd_pmu_enable_all (arch/x86/events/amd/core.c:760 (discriminator 2))
   x86_pmu_enable (arch/x86/events/core.c:1360)
   event_sched_out (kernel/events/core.c:1191 kernel/events/core.c:1186 kernel/events/core.c:2346)
   __perf_remove_from_context (kernel/events/core.c:2435)
   event_function (kernel/events/core.c:259)
   remote_function (kernel/events/core.c:92 (discriminator 1) kernel/events/core.c:72 (discriminator 1))
   __flush_smp_call_function_queue (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./include/trace/events/csd.h:64 kernel/smp.c:135 kernel/smp.c:540)
   __sysvec_call_function_single (./arch/x86/include/asm/jump_label.h:27 ./include/linux/jump_label.h:207 ./arch/x86/include/asm/trace/irq_vectors.h:99 arch/x86/kernel/smp.c:272)
   sysvec_call_function_single (arch/x86/kernel/smp.c:266 (discriminator 47) arch/x86/kernel/smp.c:266 (discriminator 47))
   </IRQ>

Reported-by: syzkaller <syzkaller@googlegroups.com>
Signed-off-by: George Kennedy <george.kennedy@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>

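A minimal sketch of the added guard, assuming an older num_counters-style loop in amd_pmu_enable_all() (the real iteration may differ):

    static void amd_pmu_enable_all(int added)
    {
            struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
            int idx;

            for (idx = 0; idx < x86_pmu.num_counters; idx++) {
                    /* NMI->throttle->x86_pmu_stop() may have NULLed the
                     * slot between scheduling and this enable pass */
                    if (!test_bit(idx, cpuc->active_mask) || !cpuc->events[idx])
                            continue;

                    amd_pmu_enable_event(cpuc->events[idx]);
            }
    }
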
2025-08-21  perf/x86: Print PMU counters bitmap in x86_pmu_show_pmu_cap()  (Dapeng Mi)

Along with the introduction of Perfmon v6, PMU counters can be discontiguous; e.g. for the fixed counters on CWF, only fixed counters 0-3 and 5-7 are supported and there is no fixed counter 4. To accommodate this change, the archPerfmonExt CPUID (0x23) leaves were introduced to enumerate the true view of the counters bitmap.

Current perf code already supports the archPerfmonExt CPUID and uses the counters bitmap to enumerate the counters the HW really supports, but x86_pmu_show_pmu_cap() still only dumps the absolute counter number instead of the true-view bitmap; it's outdated and may mislead readers. So dump the counters' true-view bitmap in x86_pmu_show_pmu_cap(), and opportunistically change the dump sequence and wording.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250820023032.17128-8-dapeng1.mi@linux.intel.com

2025-08-21  perf/x86/intel: Add ICL_FIXED_0_ADAPTIVE bit into INTEL_FIXED_BITS_MASK  (Dapeng Mi)

ICL_FIXED_0_ADAPTIVE was missing from INTEL_FIXED_BITS_MASK; add it. With the help of the new INTEL_FIXED_BITS_MASK, intel_pmu_enable_fixed() can be optimized: the old fixed counter control bits can be unconditionally cleared with INTEL_FIXED_BITS_MASK, and the new control bits then set based on the new configuration. (See the sketch below.)

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
Link: https://lore.kernel.org/r/20250820023032.17128-7-dapeng1.mi@linux.intel.com

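A minimal sketch of the clear-then-set pattern, assuming each fixed counter owns a 4-bit control field in IA32_FIXED_CTR_CTRL (the per-index shift is written out instead of the driver's helper):

    /* each fixed counter's control field lives at bits [4*idx+3 : 4*idx] */
    u64 mask = INTEL_FIXED_BITS_MASK << (idx * 4);
    u64 bits = new_cfg << (idx * 4);        /* enable/PMI/ADAPTIVE bits */

    cpuc->fixed_ctrl_val &= ~mask;          /* clear ALL old control bits */
    cpuc->fixed_ctrl_val |= bits;           /* then set the new configuration */
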
2025-08-21  perf/x86/intel: Change macro GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48)  (Dapeng Mi)

The macro GLOBAL_CTRL_EN_PERF_METRICS is defined as 48 instead of BIT_ULL(48), which is inconsistent with other similar macros and makes it easy to misuse, since users assume it is a bit mask just like its siblings. Thus change GLOBAL_CTRL_EN_PERF_METRICS to BIT_ULL(48) and eliminate the potential misuse.

Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Tested-by: Yi Lai <yi1.lai@intel.com>
Link: https://lore.kernel.org/r/20250820023032.17128-6-dapeng1.mi@linux.intel.com

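The change itself is a one-liner, shown here for clarity:

    /* before: a bare bit index, silently wrong when used as a mask */
    #define GLOBAL_CTRL_EN_PERF_METRICS     48

    /* after: a mask, consistent with the sibling GLOBAL_CTRL_* macros */
    #define GLOBAL_CTRL_EN_PERF_METRICS     BIT_ULL(48)
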
2025-08-21  perf/x86/intel: Fix IA32_PMC_x_CFG_B MSRs access error  (Dapeng Mi)

When running perf_fuzzer on PTL, sometimes the below "unchecked MSR access error" is seen when accessing the IA32_PMC_x_CFG_B MSRs:

  [ 55.611268] unchecked MSR access error: WRMSR to 0x1986 (tried to write 0x0000000200000001) at rIP: 0xffffffffac564b28 (native_write_msr+0x8/0x30)
  [ 55.611280] Call Trace:
  [ 55.611282] <TASK>
  [ 55.611284] ? intel_pmu_config_acr+0x87/0x160
  [ 55.611289] intel_pmu_enable_acr+0x6d/0x80
  [ 55.611291] intel_pmu_enable_event+0xce/0x460
  [ 55.611293] x86_pmu_start+0x78/0xb0
  [ 55.611297] x86_pmu_enable+0x218/0x3a0
  [ 55.611300] ? x86_pmu_enable+0x121/0x3a0
  [ 55.611302] perf_pmu_enable+0x40/0x50
  [ 55.611307] ctx_resched+0x19d/0x220
  [ 55.611309] __perf_install_in_context+0x284/0x2f0
  [ 55.611311] ? __pfx_remote_function+0x10/0x10
  [ 55.611314] remote_function+0x52/0x70
  [ 55.611317] ? __pfx_remote_function+0x10/0x10
  [ 55.611319] generic_exec_single+0x84/0x150
  [ 55.611323] smp_call_function_single+0xc5/0x1a0
  [ 55.611326] ? __pfx_remote_function+0x10/0x10
  [ 55.611329] perf_install_in_context+0xd1/0x1e0
  [ 55.611331] ? __pfx___perf_install_in_context+0x10/0x10
  [ 55.611333] __do_sys_perf_event_open+0xa76/0x1040
  [ 55.611336] __x64_sys_perf_event_open+0x26/0x30
  [ 55.611337] x64_sys_call+0x1d8e/0x20c0
  [ 55.611339] do_syscall_64+0x4f/0x120
  [ 55.611343] entry_SYSCALL_64_after_hwframe+0x76/0x7e

On PTL, GP counters 0 and 1 don't support the auto counter reload feature, so a #GP is triggered when writing 1 to bit 0 of the CFG_B MSR, which asks to enable auto counter reload on GP counter 0.

The root cause is that the check of the auto counter reload (ACR) counter mask from user space in the intel_pmu_acr_late_setup() helper is incorrect. It allows an invalid ACR counter mask from user space to be set into hw.config1 and then written into the CFG_B MSRs, triggering the MSR access warning. E.g., a user may create a perf event with an ACR counter mask (config2=0xcb) while only 1 event is created, so "cpuc->n_events" is 1. The correct check condition should be "i + idx >= cpuc->n_events" instead of "i + idx > cpuc->n_events" (it looks like a typo). Otherwise, the counter mask walks one slot too far and an invalid "cpuc->assign[1]" bit (bit 0) is set into hw.config1, causing the MSR access error. (See the loop sketch below.)

Besides, also check whether the events corresponding to the ACR counter mask are themselves ACR events; if not, filter those counters out of the mask. A non-ACR event could be scheduled onto a HW counter which doesn't support ACR, and it's invalid to add its counter index to the ACR counter mask.

Furthermore, remove the WARN_ON_ONCE(), since it's easily triggered (a user can pass any invalid ACR counter mask) and the warning message could mislead users.

Fixes: ec980e4facef ("perf/x86/intel: Support auto counter reload")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250820023032.17128-3-dapeng1.mi@linux.intel.com

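A minimal sketch of the corrected bound in the late-setup loop; the config2/assign[] usage follows the commit text and the rest is abbreviated:

    /* walk the user-supplied ACR counter mask (attr.config2) */
    for_each_set_bit(idx, (unsigned long *)&event->attr.config2, X86_PMC_IDX_MAX) {
            /* was "i + idx > cpuc->n_events": with n_events == 1 that let
             * idx walk one slot past the last scheduled event */
            if (i + idx >= cpuc->n_events)
                    break;

            /* ... only then fold cpuc->assign[i + idx] into hw.config1 ... */
    }
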
2025-08-21  perf/x86/intel: Use early_initcall() to hook bts_init()  (Dapeng Mi)

After commit d971342d38bf ("perf/x86/intel: Decouple BTS initialization from PEBS initialization"), x86_pmu.bts is initialized in bts_init(), which is hooked by arch_initcall(), whereas init_hw_perf_events() is hooked by early_initcall(). Once the core PMU is initialized, the NMI watchdog initialization is called immediately, before bts_init() runs. As a result the BTS buffer is not really initialized, since bts_init() has not been called and x86_pmu.bts is still false at that time. Worse, the BTS buffer would never be initialized after that unless all core PMU events are freed and reserve_ds_buffers() is called again.

Thus, aligning with init_hw_perf_events(), use early_initcall() to hook bts_init() and ensure x86_pmu.bts is initialized before the NMI watchdog initialization.

Fixes: d971342d38bf ("perf/x86/intel: Decouple BTS initialization from PEBS initialization")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Link: https://lore.kernel.org/r/20250820023032.17128-2-dapeng1.mi@linux.intel.com

2025-07-09  perf/x86/intel/uncore: Add iMC freerunning for Panther Lake  (Kan Liang)

The PTL uncore iMC freerunning counters are the same as on previous HW.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250707201750.616527-5-kan.liang@linux.intel.com

2025-07-09  perf/x86/intel/uncore: Add Panther Lake support  (Kan Liang)

Panther Lake supports the CBOX, MC, sNCU, and HBO uncore PMONs. The CBOX is similar to Lunar Lake's; the only difference is the number of CBOXes. The other three uncore PMONs can be retrieved from the discovery table.

The global control register resides in the sNCU. The global freeze bit is set by default and must be cleared before monitoring any uncore counters.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250707201750.616527-4-kan.liang@linux.intel.com

2025-07-09  perf/x86/intel/uncore: Support customized MMIO map size  (Kan Liang)

For a server platform, the MMIO map size is always 0x4000. However, a client platform may have a smaller map size. Make the map size customizable.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250707201750.616527-3-kan.liang@linux.intel.com

2025-07-09  perf/x86/intel/uncore: Support MSR portal for discovery tables  (Kan Liang)

Starting from Panther Lake, the discovery table mechanism is also supported on client platforms. The difference is that the portal of the global discovery table is retrieved from an MSR, while the layout of the discovery tables is the same as on the server platforms. Factor out __parse_discovery_table() to parse discovery tables.

The uncore PMON is die-scoped, so the discovery tables need to be parsed for each die.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Link: https://lore.kernel.org/r/20250707201750.616527-2-kan.liang@linux.intel.com

2025-06-13  perf/x86/intel: Fix crash in icl_update_topdown_event()  (Kan Liang)

The perf_fuzzer found a hard-lockup crash on a RaptorLake machine:

  Oops: general protection fault, maybe for address 0xffff89aeceab400: 0000
  CPU: 23 UID: 0 PID: 0 Comm: swapper/23 Tainted: [W]=WARN
  Hardware name: Dell Inc. Precision 9660/0VJ762
  RIP: 0010:native_read_pmc+0x7/0x40
  Code: cc e8 8d a9 01 00 48 89 03 5b cd cc cc cc cc 0f 1f ...
  RSP: 000:fffb03100273de8 EFLAGS: 00010046
  ....
  Call Trace:
   <TASK>
   icl_update_topdown_event+0x165/0x190
   ? ktime_get+0x38/0xd0
   intel_pmu_read_event+0xf9/0x210
   __perf_event_read+0xf9/0x210

CPUs 16-23 are E-core CPUs that don't support the perf metrics feature, so icl_update_topdown_event() should not be invoked on them.

It's a regression of commit:

  f9bdf1f95339 ("perf/x86/intel: Avoid disable PMU if !cpuc->enabled in sample read")

The bug introduced by that commit is that the is_topdown_event() function is mistakenly used in place of the is_topdown_count() call to check whether the topdown functions for the perf metrics feature should be invoked. Fix it. (See the sketch below.)

Fixes: f9bdf1f95339 ("perf/x86/intel: Avoid disable PMU if !cpuc->enabled in sample read")
Closes: https://lore.kernel.org/lkml/352f0709-f026-cd45-e60c-60dfd97f73f3@maine.edu/
Reported-by: Vince Weaver <vincent.weaver@maine.edu>
Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Tested-by: Vince Weaver <vincent.weaver@maine.edu>
Cc: stable@vger.kernel.org # v6.15+
Link: https://lore.kernel.org/r/20250612143818.2889040-1-kan.liang@linux.intel.com

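A minimal sketch of the distinction the fix relies on, as described above (helper internals abbreviated):

    /* is_topdown_event(): matches the topdown event encoding, which a
     * fuzzer can create on any CPU, including E-cores.
     * is_topdown_count(): the event actually occupies a perf-metrics
     * counter, which E-cores don't have. */
    if (is_topdown_count(event))            /* was: is_topdown_event(event) */
            icl_update_topdown_event(event);
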
2025-05-31  perf/x86/intel: Fix incorrect MSR index calculations in intel_pmu_config_acr()  (Dapeng Mi)

The MSR offset calculations in intel_pmu_config_acr() are buggy. To calculate fixed counter MSR addresses in intel_pmu_config_acr(), the HW counter index "idx" is subtracted by INTEL_PMC_IDX_FIXED. This leads to the ACR mask value of fixed counters being incorrectly saved to the positions of GP counters in acr_cfg_b[]; e.g. for fixed counter 0 its ACR counter mask should be saved to acr_cfg_b[32], but it's saved to acr_cfg_b[0] incorrectly. Fix this issue. (See the indexing sketch below.)

[ mingo: Clarified & improved the changelog. ]

Fixes: ec980e4facef ("perf/x86/intel: Support auto counter reload")
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Signed-off-by: Ingo Molnar <mingo@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250529080236.2552247-2-dapeng1.mi@linux.intel.com

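A hedged sketch of the indexing bug: the INTEL_PMC_IDX_FIXED offset belongs only in the MSR address computation, not in the acr_cfg_b[] index. The fixed-counter CFG_B MSR macro name below is hypothetical:

    if (idx >= INTEL_PMC_IDX_FIXED) {
            /* the subtraction is right for the MSR address ... */
            msr = MSR_IA32_FIXED_CTR0_CFG_B +       /* hypothetical macro name */
                  (idx - INTEL_PMC_IDX_FIXED);

            /* ... but the cached mask must keep the raw index, so fixed
             * counter 0 lands in acr_cfg_b[INTEL_PMC_IDX_FIXED] (== 32),
             * not on top of GP counter 0's acr_cfg_b[0] slot */
            cpuc->acr_cfg_b[idx] = mask;    /* not acr_cfg_b[idx - INTEL_PMC_IDX_FIXED] */
    }
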
2025-05-26  Merge tag 'x86-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds)

Pull core x86 updates from Ingo Molnar:

 "Boot code changes:

  - A large series of changes to reorganize the x86 boot code into a better isolated and easier to maintain base of PIC early startup code in arch/x86/boot/startup/, by Ard Biesheuvel.

    Motivation & background:

     | Since commit
     |
     |   c88d71508e36 ("x86/boot/64: Rewrite startup_64() in C")
     |
     | dated Jun 6 2017, we have been using C code on the boot path in a way
     | that is not supported by the toolchain, i.e., to execute non-PIC C
     | code from a mapping of memory that is different from the one provided
     | to the linker. It should have been obvious at the time that this was a
     | bad idea, given the need to sprinkle fixup_pointer() calls left and
     | right to manipulate global variables (including non-pointer variables)
     | without crashing.
     |
     | This C startup code has been expanding, and in particular, the SEV-SNP
     | startup code has been expanding over the past couple of years, and
     | grown many of these warts, where the C code needs to use special
     | annotations or helpers to access global objects.

    This tree includes the first phase of this work-in-progress x86 boot code reorganization.

  Scalability enhancements and micro-optimizations:

  - Improve code-patching scalability (Eric Dumazet)
  - Remove MFENCEs for X86_BUG_CLFLUSH_MONITOR (Andrew Cooper)

  CPU features enumeration updates:

  - Thorough reorganization and cleanup of CPUID parsing APIs (Ahmed S. Darwish)
  - Fix, refactor and clean up the cacheinfo code (Ahmed S. Darwish, Thomas Gleixner)
  - Update CPUID bitfields to x86-cpuid-db v2.3 (Ahmed S. Darwish)

  Memory management changes:

  - Allow temporary MMs when IRQs are on (Andy Lutomirski)
  - Opt-in to IRQs-off activate_mm() (Andy Lutomirski)
  - Simplify choose_new_asid() and generate better code (Borislav Petkov)
  - Simplify 32-bit PAE page table handling (Dave Hansen)
  - Always use dynamic memory layout (Kirill A. Shutemov)
  - Make SPARSEMEM_VMEMMAP the only memory model (Kirill A. Shutemov)
  - Make 5-level paging support unconditional (Kirill A. Shutemov)
  - Stop prefetching current->mm->mmap_lock on page faults (Mateusz Guzik)
  - Predict valid_user_address() returning true (Mateusz Guzik)
  - Consolidate initmem_init() (Mike Rapoport)

  FPU support and vector computing:

  - Enable Intel APX support (Chang S. Bae)
  - Reorganize and clean up the xstate code (Chang S. Bae)
  - Make task_struct::thread constant size (Ingo Molnar)
  - Restore fpu_thread_struct_whitelist() to fix CONFIG_HARDENED_USERCOPY=y (Kees Cook)
  - Simplify the switch_fpu_prepare() + switch_fpu_finish() logic (Oleg Nesterov)
  - Always preserve non-user xfeatures/flags in __state_perm (Sean Christopherson)

  Microcode loader changes:

  - Help users notice when running old Intel microcode (Dave Hansen)
  - AMD: Do not return error when microcode update is not necessary (Annie Li)
  - AMD: Clean the cache if update did not load microcode (Boris Ostrovsky)

  Code patching (alternatives) changes:

  - Simplify, reorganize and clean up the x86 text-patching code (Ingo Molnar)
  - Make smp_text_poke_batch_process() subsume smp_text_poke_batch_finish() (Nikolay Borisov)
  - Refactor the {,un}use_temporary_mm() code (Peter Zijlstra)

  Debugging support:

  - Add early IDT and GDT loading to debug relocate_kernel() bugs (David Woodhouse)
  - Print the reason for the last reset on modern AMD CPUs (Yazen Ghannam)
  - Add AMD Zen debugging document (Mario Limonciello)
  - Fix opcode map (!REX2) superscript tags (Masami Hiramatsu)
  - Stop decoding i64 instructions in x86-64 mode at opcode (Masami Hiramatsu)

  CPU bugs and bug mitigations:

  - Remove X86_BUG_MMIO_UNKNOWN (Borislav Petkov)
  - Fix SRSO reporting on Zen1/2 with SMT disabled (Borislav Petkov)
  - Restructure and harmonize the various CPU bug mitigation methods (David Kaplan)
  - Fix spectre_v2 mitigation default on Intel (Pawan Gupta)

  MSR API:

  - Large MSR code and API cleanup (Xin Li)
  - In-kernel MSR API type cleanups and renames (Ingo Molnar)

  PKEYS:

  - Simplify PKRU update in signal frame (Chang S. Bae)

  NMI handling code:

  - Clean up, refactor and simplify the NMI handling code (Sohil Mehta)
  - Improve NMI duration console printouts (Sohil Mehta)

  Paravirt guests interface:

  - Restrict PARAVIRT_XXL to 64-bit only (Kirill A. Shutemov)

  SEV support:

  - Share the sev_secrets_pa value again (Tom Lendacky)

  x86 platform changes:

  - Introduce the <asm/amd/> header namespace (Ingo Molnar)
  - i2c: piix4, x86/platform: Move the SB800 PIIX4 FCH definitions to <asm/amd/fch.h> (Mario Limonciello)

  Fixes and cleanups:

  - x86 assembly code cleanups and fixes (Uros Bizjak)
  - Misc fixes and cleanups (Andi Kleen, Andy Lutomirski, Andy Shevchenko, Ard Biesheuvel, Bagas Sanjaya, Baoquan He, Borislav Petkov, Chang S. Bae, Chao Gao, Dan Williams, Dave Hansen, David Kaplan, David Woodhouse, Eric Biggers, Ingo Molnar, Josh Poimboeuf, Juergen Gross, Malaya Kumar Rout, Mario Limonciello, Nathan Chancellor, Oleg Nesterov, Pawan Gupta, Peter Zijlstra, Shivank Garg, Sohil Mehta, Thomas Gleixner, Uros Bizjak, Xin Li)"

* tag 'x86-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (331 commits)
  x86/bugs: Fix spectre_v2 mitigation default on Intel
  x86/bugs: Restructure ITS mitigation
  x86/xen/msr: Fix uninitialized variable 'err'
  x86/msr: Remove a superfluous inclusion of <asm/asm.h>
  x86/paravirt: Restrict PARAVIRT_XXL to 64-bit only
  x86/mm/64: Make 5-level paging support unconditional
  x86/mm/64: Make SPARSEMEM_VMEMMAP the only memory model
  x86/mm/64: Always use dynamic memory layout
  x86/bugs: Fix indentation due to ITS merge
  x86/cpuid: Rename hypervisor_cpuid_base()/for_each_possible_hypervisor_cpuid_base() to cpuid_base_hypervisor()/for_each_possible_cpuid_base_hypervisor()
  x86/cpu/intel: Rename CPUID(0x2) descriptors iterator parameter
  x86/cacheinfo: Rename CPUID(0x2) descriptors iterator parameter
  x86/cpuid: Rename cpuid_get_leaf_0x2_regs() to cpuid_leaf_0x2()
  x86/cpuid: Rename have_cpuid_p() to cpuid_feature()
  x86/cpuid: Set <asm/cpuid/api.h> as the main CPUID header
  x86/cpuid: Move CPUID(0x2) APIs into <cpuid/api.h>
  x86/msr: Add rdmsrl_on_cpu() compatibility wrapper
  x86/mm: Fix kernel-doc descriptions of various pgtable methods
  x86/asm-offsets: Export certain 'struct cpuinfo_x86' fields for 64-bit asm use too
  x86/boot: Defer initialization of VM space related global variables
  ...

2025-05-26  Merge tag 'perf-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip  (Linus Torvalds)

Pull perf events updates from Ingo Molnar:

 "Core & generic-arch updates:

  - Add support for dynamic constraints and propagate it to the Intel driver (Kan Liang)
  - Fix & enhance driver-specific throttling support (Kan Liang)
  - Record sample last_period before updating on the x86 and PowerPC platforms (Mark Barnett)
  - Make perf_pmu_unregister() usable (Peter Zijlstra)
  - Unify perf_event_free_task() / perf_event_exit_task_context() (Peter Zijlstra)
  - Simplify perf_event_release_kernel() and perf_event_free_task() (Peter Zijlstra)
  - Allocate non-contiguous AUX pages by default (Yabin Cui)

  Uprobes updates:

  - Add support to emulate NOP instructions (Jiri Olsa)
  - selftests/bpf: Add 5-byte NOP uprobe trigger benchmark (Jiri Olsa)

  x86 Intel PMU enhancements:

  - Support Intel Auto Counter Reload [ACR] (Kan Liang)
  - Add PMU support for Clearwater Forest (Dapeng Mi)
  - Arch-PEBS preparatory changes: (Dapeng Mi)
     - Parse CPUID archPerfmonExt leaves for non-hybrid CPUs
     - Decouple BTS initialization from PEBS initialization
     - Introduce pairs of PEBS static calls

  x86 AMD PMU enhancements:

  - Use hrtimer for handling overflows in the AMD uncore driver (Sandipan Das)
  - Prevent UMC counters from saturating (Sandipan Das)

  Fixes and cleanups:

  - Fix put_ctx() ordering (Frederic Weisbecker)
  - Fix irq work dereferencing garbage (Frederic Weisbecker)
  - Misc fixes and cleanups (Changbin Du, Frederic Weisbecker, Ian Rogers, Ingo Molnar, Kan Liang, Peter Zijlstra, Qing Wang, Sandipan Das, Thorsten Blum)"

* tag 'perf-core-2025-05-25' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (60 commits)
  perf/headers: Clean up <linux/perf_event.h> a bit
  perf/uapi: Clean up <uapi/linux/perf_event.h> a bit
  perf/uapi: Fix PERF_RECORD_SAMPLE comments in <uapi/linux/perf_event.h>
  mips/perf: Remove driver-specific throttle support
  xtensa/perf: Remove driver-specific throttle support
  sparc/perf: Remove driver-specific throttle support
  loongarch/perf: Remove driver-specific throttle support
  csky/perf: Remove driver-specific throttle support
  arc/perf: Remove driver-specific throttle support
  alpha/perf: Remove driver-specific throttle support
  perf/apple_m1: Remove driver-specific throttle support
  perf/arm: Remove driver-specific throttle support
  s390/perf: Remove driver-specific throttle support
  powerpc/perf: Remove driver-specific throttle support
  perf/x86/zhaoxin: Remove driver-specific throttle support
  perf/x86/amd: Remove driver-specific throttle support
  perf/x86/intel: Remove driver-specific throttle support
  perf: Only dump the throttle log for the leader
  perf: Fix the throttle logic for a group
  perf/core: Add the is_event_in_freq_mode() helper to simplify the code
  ...

2025-05-21  perf/x86/zhaoxin: Remove driver-specific throttle support  (Kan Liang)

The throttle support has been added in the generic code. Remove the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of event_limit. It already does an inatomic event disable. The pmu->stop is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250520181644.2673067-6-kan.liang@linux.intel.com

2025-05-21  perf/x86/amd: Remove driver-specific throttle support  (Kan Liang)

The throttle support has been added in the generic code. Remove the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of event_limit. It already does an inatomic event disable. The pmu->stop is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Ravi Bangoria <ravi.bangoria@amd.com>
Link: https://lore.kernel.org/r/20250520181644.2673067-5-kan.liang@linux.intel.com

2025-05-21  perf/x86/intel: Remove driver-specific throttle support  (Kan Liang)

The throttle support has been added in the generic code. Remove the driver-specific throttle support.

Besides the throttle, perf_event_overflow may return true because of event_limit. It already does an inatomic event disable. The pmu->stop is not required either.

Signed-off-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://lore.kernel.org/r/20250520181644.2673067-4-kan.liang@linux.intel.com