| Age | Commit message (Collapse) | Author |
|
Add an extra "../" to the relative paths so that the uAPI headers
provided by tools can be found correctly.
Fixes: a724a8fce5e25b45 ("perf kvm stat: Fix build error")
Reported-by: Namhyung Kim <namhyung@kernel.org>
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The "Read backward ring buffer" test crashes on big-endian (e.g. s390x)
due to a NULL dereference when the backward mmap path isn't enabled.
Reproducer:
# ./perf test -F 'Read backward ring buffer'
Segmentation fault (core dumped)
# uname -m
s390x
#
Root cause:
get_config_terms() stores into evsel_config_term::val.val (u64) while later
code reads boolean fields such as evsel_config_term::val.overwrite.
On big-endian the 1-byte boolean is left-aligned, so writing
evsel_config_term::val.val = 1 is read back as
evsel_config_term::val.overwrite = 0,
leaving backward mmap disabled and a NULL map being used.
Store values in the union member that matches the term type, e.g.:
/* for OVERWRITE */
new_term->val.overwrite = 1; /* not new_term->val.val = 1 */
to fix this. Improve add_config_term() and add two more parameters for
string and value. Function add_config_term() now creates a complete node
element of type evsel_config_term and handles all evsel_config_term::val
union members.
Impact:
Enables backward mmap on big-endian and prevents the crash.
No change on little-endian.
Output after:
# ./perf test -Fv 44
--- start ---
Using CPUID IBM,9175,705,ME1,3.8,002f
mmap size 1052672B
mmap size 8192B
---- end ----
44: Read backward ring buffer : Ok
#
Fixes: 159ca97cd97ce8cc ("perf parse-events: Refactor get_config_terms() to remove macros")
Reviewed-by: James Clark <james.clark@linaro.org>
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Use metricgroup__for_each_metric() rather than
pmu_metrics_table__for_each_metric() that combines the
default metric table with, a potentially empty, CPUID table.
Fixes: cee275edcdb1acfd ("perf metricgroup: Don't early exit if no CPUID table exists")
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
perf_event__synthesize_modules() allocates a single union perf_event and
reuses it across every kernel module callback.
After the first module is processed, perf_record_mmap2__read_build_id()
sets PERF_RECORD_MISC_MMAP_BUILD_ID in header.misc and writes that
module's build ID into the event.
On subsequent iterations the callback overwrites start, len, pid, and
filename for the next module but never clears the stale build ID fields
or the MMAP_BUILD_ID flag.
When perf_record_mmap2__read_build_id() runs for the second module it
sees the flag, reads the stale build ID into a dso_id, and
__dso__improve_id() permanently poisons the DSO with the wrong build ID.
Every module after the first therefore receives the first module's build
ID in its MMAP2 record.
On a system with the sunrpc and nfsd modules loaded, this causes perf
script and perf report to show [unknown] for all module symbols.
The latent bug has existed since commit d9f2ecbc5e47fca7 ("perf dso:
Move build_id to dso_id") introduced the PERF_RECORD_MISC_MMAP_BUILD_ID
check in perf_record_mmap2__read_build_id().
Commit 53b00ff358dc75b1 ("perf record: Make --buildid-mmap the default")
then exposed it to all users by making the MMAP2-with-build-ID path the
default. Both commits were merged in the same series.
Clear the MMAP_BUILD_ID flag and zero the build_id union before each
call to perf_record_mmap2__read_build_id() so that every module starts
with a clean slate.
Fixes: d9f2ecbc5e47fca7 ("perf dso: Move build_id to dso_id")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
A copy-paste of a similar issue fixed by Peter Collingbourne in:
https://lore.kernel.org/linux-perf-users/20260304190613.2507582-1-pcc@google.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Collingbourne <pcc@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The hashmap__new() function never returns NULL, it returns error
pointers. Fix the error checking to match.
Additionally, set src->samples to NULL to prevent any later code from
accidentally using the error pointer.
Fixes: d3e7cad6f36d9e80 ("perf annotate: Add a hashmap for symbol histogram")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chen Ni <nichen@iscas.ac.cn>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tianyou Li <tianyou.li@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
These #defines have been removed from the kernel headers in favour of
the string based PMU format attributes. Usages were previously removed
from the recording side of cs-etm in Perf. Finish the removal by
removing usages from the decode side too.
It's a straight replacement of the old #defines with the new register
bit definitions. Except cs_etm__setup_timeless_decoding() which wasn't
looking at the saved metadata and was instead hard coding an access to
'attr.config'. This was vulnerable to the same issue of .config being
moved to .config2 etc that the original removal of ETM_OPT_* tried to
fix. So fix that too.
Signed-off-by: James Clark <james.clark@linaro.org>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Mike Leach <mike.leach@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Will Deacon <will@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If a branch target points to one past the end of a function, the branch
should be treated as a branch to another function.
This can happen e.g. with a tail call to a function that is laid out
immediately after the caller.
Fixes: 751b1783da784299 ("perf annotate: Mark jumps to outher functions with the call arrow")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Peter Collingbourne <pcc@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://linux-review.googlesource.com/id/Ide471112e82d68177e0faf08ca411d9fcf0a7bdf
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools
Pull perf tools updates from Arnaldo Carvalho de Melo:
- Introduce 'perf sched stats' tool with record/report/diff workflows
using schedstat counters
- Add a faster libdw based addr2line implementation and allow selecting
it or its alternatives via 'perf config addr2line.style='
- Data-type profiling fixes and improvements including the ability to
select fields using 'perf report''s -F/-fields, e.g.:
'perf report --fields overhead,type'
- Add 'perf test' regression tests for Data-type profiling with C and
Rust workloads
- Fix srcline printing with inlines in callchains, make sure this has
coverage in 'perf test'
- Fix printing of leaf IP in LBR callchains
- Fix display of metrics without sufficient permission in 'perf stat'
- Print all machines in 'perf kvm report -vvv', not just the host
- Switch from SHA-1 to BLAKE2s for build ID generation, remove SHA-1
code
- Fix 'perf report's histogram entry collapsing with '-F' option
- Use system's cacheline size instead of a hardcoded value in 'perf
report'
- Allow filtering conversion by time range in 'perf data'
- Cover conversion to CTF using 'perf data' in 'perf test'
- Address newer glibc const-correctness (-Werror=discarded-qualifiers)
issues
- Fixes and improvements for ARM's CoreSight support, simplify ARM SPE
event config in 'perf mem', update docs for 'perf c2c' including the
ARM events it can be used with
- Build support for generating metrics from arch specific python
script, add extra AMD, Intel, ARM64 metrics using it
- Add AMD Zen 6 events and metrics
- Add JSON file with OpenHW Risc-V CVA6 hardware counters
- Add 'perf kvm' stats live testing
- Add more 'perf stat' tests to 'perf test'
- Fix segfault in `perf lock contention -b/--use-bpf`
- Fix various 'perf test' cases for s390
- Build system cleanups, bump minimum shellcheck version to 0.7.2
- Support building the capstone based annotation routines as a plugin
- Allow passing extra Clang flags via EXTRA_BPF_FLAGS
* tag 'perf-tools-for-v7.0-1-2026-02-21' of git://git.kernel.org/pub/scm/linux/kernel/git/perf/perf-tools: (255 commits)
perf test script: Add python script testing support
perf test script: Add perl script testing support
perf script: Allow the generated script to be a path
perf test: perf data --to-ctf testing
perf test: Test pipe mode with data conversion --to-json
perf json: Pipe mode --to-ctf support
perf json: Pipe mode --to-json support
perf check: Add libbabeltrace to the listed features
perf build: Allow passing extra Clang flags via EXTRA_BPF_FLAGS
perf test data_type_profiling.sh: Skip just the Rust tests if code_with_type workload is missing
tools build: Fix feature test for rust compiler
perf libunwind: Fix calls to thread__e_machine()
perf stat: Add no-affinity flag
perf evlist: Reduce affinity use and move into iterator, fix no affinity
perf evlist: Missing TPEBS close in evlist__close()
perf evlist: Special map propagation for tool events that read on 1 CPU
perf stat-shadow: In prepare_metric fix guard on reading NULL perf_stat_evsel
Revert "perf tool_pmu: More accurately set the cpus for tool events"
tools build: Emit dependencies file for test-rust.bin
tools build: Make test-rust.bin be removed by the 'clean' target
...
|
|
In pipe mode the environment may not be fully initialized so be robust
to fields being NULL.
Add default handling of attr events, use the feature events to populate
the ctf writer environment.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In pipe mode the environment may not be fully initialized so be robust
to fields being NULL. Add default handling of feature and attr events.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Derek Foreman <derek.foreman@collabora.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add the missing 'e_flags' option to fix the build.
Fixes: 4e66527f8859a661 ("perf thread: Add optional e_flags output argument to thread__e_machine")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip
Pull performance event updates from Ingo Molnar:
"x86 PMU driver updates:
- Add support for the core PMU for Intel Diamond Rapids (DMR) CPUs
(Dapeng Mi)
Compared to previous iterations of the Intel PMU code, there's been
a lot of changes, which center around three main areas:
- Introduce the OFF-MODULE RESPONSE (OMR) facility to replace the
Off-Core Response (OCR) facility
- New PEBS data source encoding layout
- Support the new "RDPMC user disable" feature
- Likewise, a large series adds uncore PMU support for Intel Diamond
Rapids (DMR) CPUs (Zide Chen)
This centers around these four main areas:
- DMR may have two Integrated I/O and Memory Hub (IMH) dies,
separate from the compute tile (CBB) dies. Each CBB and each IMH
die has its own discovery domain.
- Unlike prior CPUs that retrieve the global discovery table
portal exclusively via PCI or MSR, DMR uses PCI for IMH PMON
discovery and MSR for CBB PMON discovery.
- DMR introduces several new PMON types: SCA, HAMVF, D2D_ULA, UBR,
PCIE4, CRS, CPC, ITC, OTC, CMS, and PCIE6.
- IIO free-running counters in DMR are MMIO-based, unlike SPR.
- Also add support for Add missing PMON units for Intel Panther Lake,
and support Nova Lake (NVL), which largely maps to Panther Lake.
(Zide Chen)
- KVM integration: Add support for mediated vPMUs (by Kan Liang and
Sean Christopherson, with fixes and cleanups by Peter Zijlstra,
Sandipan Das and Mingwei Zhang)
- Add Intel cstate driver to support for Wildcat Lake (WCL) CPUs,
which are a low-power variant of Panther Lake (Zide Chen)
- Add core, cstate and MSR PMU support for the Airmont NP Intel CPU
(aka MaxLinear Lightning Mountain), which maps to the existing
Airmont code (Martin Schiller)
Performance enhancements:
- Speed up kexec shutdown by avoiding unnecessary cross CPU calls
(Jan H. Schönherr)
- Fix slow perf_event_task_exit() with LBR callstacks (Namhyung Kim)
User-space stack unwinding support:
- Various cleanups and refactorings in preparation to generalize the
unwinding code for other architectures (Jens Remus)
Uprobes updates:
- Transition from kmap_atomic to kmap_local_page (Keke Ming)
- Fix incorrect lockdep condition in filter_chain() (Breno Leitao)
- Fix XOL allocation failure for 32-bit tasks (Oleg Nesterov)
Misc fixes and cleanups:
- s390: Remove kvm_types.h from Kbuild (Randy Dunlap)
- x86/intel/uncore: Convert comma to semicolon (Chen Ni)
- x86/uncore: Clean up const mismatch (Greg Kroah-Hartman)
- x86/ibs: Fix typo in dc_l2tlb_miss comment (Xiang-Bin Shi)"
* tag 'perf-core-2026-02-09' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (58 commits)
s390: remove kvm_types.h from Kbuild
uprobes: Fix incorrect lockdep condition in filter_chain()
x86/ibs: Fix typo in dc_l2tlb_miss comment
x86/uprobes: Fix XOL allocation failure for 32-bit tasks
perf/x86/intel/uncore: Convert comma to semicolon
perf/x86/intel: Add support for rdpmc user disable feature
perf/x86: Use macros to replace magic numbers in attr_rdpmc
perf/x86/intel: Add core PMU support for Novalake
perf/x86/intel: Add support for PEBS memory auxiliary info field in NVL
perf/x86/intel: Add core PMU support for DMR
perf/x86/intel: Add support for PEBS memory auxiliary info field in DMR
perf/x86/intel: Support the 4 new OMR MSRs introduced in DMR and NVL
perf/core: Fix slow perf_event_task_exit() with LBR callstacks
perf/core: Speed up kexec shutdown by avoiding unnecessary cross CPU calls
uprobes: use kmap_local_page() for temporary page mappings
arm/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
mips/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
arm64/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
riscv/uprobes: use kmap_local_page() in arch_uprobe_copy_ixol()
perf/x86/intel/uncore: Add Nova Lake support
...
|
|
Add flag that disables affinity behavior.
Using sched_setaffinity() to place a perf thread on a CPU can avoid
certain interprocessor interrupts but may introduce a delay due to the
scheduling, particularly on loaded machines.
Add a command line option to disable the behavior.
This behavior is less present in other tools like `perf record`, as it
uses a ring buffer and doesn't make repeated system calls.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The evlist__for_each_cpu iterator will call sched_setaffitinity when
moving between CPUs to avoid IPIs.
If only 1 IPI is saved then this may be unprofitable as the delay to get
scheduled may be considerable.
This may be particularly true if reading an event group in `perf stat`
in interval mode.
Move the affinity handling completely into the iterator so that a single
evlist__use_affinity can determine whether CPU affinities will be used.
For `perf record` the change is minimal as the dummy event and the real
event will always make the use of affinities the thing to do.
In `perf stat`, tool events are ignored and affinities only used if >1
event on the same CPU occur.
Determining if affinities are useful is done by evlist__use_affinity
which tests per-event whether the event's PMU benefits from affinity use
- it is assumed only perf event using PMUs do.
Fix a bug where when there are no affinities that the CPU map iterator
may reference a CPU not present in the initial evsel. Fix by making the
iterator and non-iterator code common.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The libperf evsel close won't close TPEBS events properly.
Add a test to do this. The libperf close routine is used in
evlist__close() for affinity reasons.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Tool events like duration_time don't need a perf_cpu_map that contains
all online CPUs.
Having such a perf_cpu_map causes overheads when iterating between
events for CPU affinity.
During parsing mark events that just read on a single CPU map index as
such, then during map propagation set up the evsel's CPUs and thereby
the evlists accordingly.
The setting cannot be done early in parsing as user CPUs are only fully
known when evlist__create_maps is called.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andres Freund <andres@anarazel.de>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The aggr value is setup to always be non-null creating a redundant
guard for reading from it. Switch to using the perf_stat_evsel (ps)
and narrow the scope of aggr so that it is known valid when used.
Fixes: 3d65f6445fd93e3e ("perf stat-shadow: Read tool events directly")
Reported-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
This reverts commit d8d8a0b3603a9a8fa207cf9e4f292e81dc5d1008.
The setting of a user CPU map can cause an empty intersection when
combined with CPU 0 and the event removed. This later triggers a segv in
the stat-shadow logic. Let's put back a full online CPU map for now by
reverting this patch.
Closes: https://lore.kernel.org/linux-perf-users/cgja46br2smmznxs7kbeabs6zgv3b4olfqgh2fdp5mxk2yom4v@w6jjgov6hdi6/
Reported-by: Andres Freund <andres@anarazel.de>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dr. David Alan Gilbert <linux@treblig.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Test case perftool-testsuite_report fails on s390 for some time
now.
Root cause is a time out which is too tight for large s390 machines.
The time out value addr2line_timeout_ms is per default set to 1 second.
This is the maximum time the function read_addr2line_record() waits for
a reply from the forked off tool addr2line, which is started as a child
in interactive mode.
It reads stdin (an address in hexadecimal) and replies on stdout with
function name, file name and line number. This might take more than one
second.
However one second is not always enough and the reply from addr2line
tool is not received. Function read_addr2line_record() fails and emits
a warning, which is not expected by the test case. It fails.
Output before:
# perf test -F 133
-- [ PASS ] -- perf_report :: setup :: prepare the perf.data file
==================
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.087 MB \
/tmp/perftool-testsuite_report.FHz/perf_report/perf.data.1 \
(207 samples) ]
==================
-- [ PASS ] -- perf_report :: setup :: prepare the perf.data.1 file
## [ PASS ] ## perf_report :: setup SUMMARY
-- [ SKIP ] -- perf_report :: test_basic :: help message :: testcase skipped
Line did not match any pattern: "cmd__addr2line /usr/lib/debug/lib/modules/
6.19.0-20260205.rc8.git366.9845cf73f7db.300.fc43.s390x+next/
vmlinux: could not read first record"
Line did not match any pattern: "cmd__addr2line /usr/lib/debug/lib/modules/
6.19.0-20260205.rc8.git366.9845cf73f7db.300.fc43.s390x+next/
vmlinux: could not read first record"
-- [ FAIL ] -- perf_report :: test_basic :: basic execution
(output regexp parsing)
....
133: perftool-testsuite_report : FAILED!
Output after:
# ./perf test -F 133
-- [ PASS ] -- perf_report :: setup :: prepare the perf.data file
==================
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.087 MB \
/tmp/perftool-testsuite_report.Mlp/perf_report/perf.data.1
(188 samples) ]
==================
-- [ PASS ] -- perf_report :: setup :: prepare the perf.data.1 file
## [ PASS ] ## perf_report :: setup SUMMARY
-- [ SKIP ] -- perf_report :: test_basic :: help message :: testcase skipped
-- [ PASS ] -- perf_report :: test_basic :: basic execution
-- [ PASS ] -- perf_report :: test_basic :: number of samples
-- [ PASS ] -- perf_report :: test_basic :: header
-- [ PASS ] -- perf_report :: test_basic :: header timestamp
-- [ PASS ] -- perf_report :: test_basic :: show CPU utilization
-- [ PASS ] -- perf_report :: test_basic :: pid
-- [ PASS ] -- perf_report :: test_basic :: non-existing symbol
-- [ PASS ] -- perf_report :: test_basic :: symbol filter
-- [ PASS ] -- perf_report :: test_basic :: latency header
-- [ PASS ] -- perf_report :: test_basic :: default report for latency profile
-- [ PASS ] -- perf_report :: test_basic :: latency report for latency profile
-- [ PASS ] -- perf_report :: test_basic :: parallelism histogram
## [ PASS ] ## perf_report :: test_basic SUMMARY
133: perftool-testsuite_report : Ok
#
Fixes: 257046a36750a6db ("perf srcline: Fallback between addr2line implementations")
Reviewed-by: Jan Polensky <japo@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Alexander Gordeev <agordeev@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: linux-s390@vger.kernel.org
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
When run on a kernel without BTF info, perf crashes:
libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
libbpf: failed to find valid kernel BTF
Program received signal SIGSEGV, Segmentation fault.
0x00005555556915b7 in btf.type_cnt ()
(gdb) bt
#0 0x00005555556915b7 in btf.type_cnt ()
#1 0x0000555555691fbc in btf_find_by_name_kind ()
#2 0x00005555556920d0 in btf.find_by_name_kind ()
#3 0x00005555558a1b7c in init_numa_data (con=0x7fffffffd0a0) at util/bpf_lock_contention.c:125
#4 0x00005555558a264b in lock_contention_prepare (con=0x7fffffffd0a0) at util/bpf_lock_contention.c:313
#5 0x0000555555620702 in __cmd_contention (argc=0, argv=0x7fffffffea10) at builtin-lock.c:2084
#6 0x0000555555622c8d in cmd_lock (argc=0, argv=0x7fffffffea10) at builtin-lock.c:2755
#7 0x0000555555651451 in run_builtin (p=0x555556104f00 <commands+576>, argc=3, argv=0x7fffffffea10)
at perf.c:349
#8 0x00005555556516ed in handle_internal_command (argc=3, argv=0x7fffffffea10) at perf.c:401
#9 0x000055555565184e in run_argv (argcp=0x7fffffffe7fc, argv=0x7fffffffe7f0) at perf.c:445
#10 0x0000555555651b9f in main (argc=3, argv=0x7fffffffea10) at perf.c:553
Check if btf loading failed, and don't do anything with it in
init_numa_data(). This leads to the following error message, instead of
just a crash:
libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
libbpf: failed to find valid kernel BTF
libbpf: kernel BTF is missing at '/sys/kernel/btf/vmlinux', was CONFIG_DEBUG_INFO_BTF enabled?
libbpf: failed to find valid kernel BTF
libbpf: Error loading vmlinux BTF: -ESRCH
libbpf: failed to load BPF skeleton 'lock_contention_bpf': -ESRCH
Failed to load lock-contention BPF skeleton
lock contention BPF setup failed
Signed-off-by: Tycho Andersen (AMD) <tycho@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Testing:
- Built perf
- Executed perf mem record and report
Committer notes:
This addresses a TODO and improves the situation where record and
report/c2c are performed on the same machine or in machines with the
same cacheline size, but the proper way is to store the cacheline size
in the perf.data header at 'record' time and then use it at post
processing time.
Signed-off-by: Ricky Ringler <ricky.ringler@proton.me>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20260129004223.26799-1-ricky.ringler@proton.me
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
On data type profiling, it tried to match register name with a partial
string. For example, it allowed to match with "%rbp)" or "%rdi,8)".
But with recent change in the area, it doesn't match anymore and break
the data type profiling.
Let's pass the correct register name by removing the unwanted part.
Add arch__dwarf_regnum() to handle it in a single place.
Closes: 7d3n23li6drroxrdlpxn7ixehdeszkjdftah3zyngjl2qs22ef@yelcjv53v42o
Reported-by: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Zecheng Li <zli94@ncsu.edu>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently, `perf stat` skips or hides metrics when the underlying
hardware events cannot be counted (e.g., due to insufficient permissions
or unsupported events).
In `--metric-only` mode, this often results in missing columns or blank
spaces, making the output difficult to parse.
Modify the logic to ensure metrics are consistently displayed by
propagating NAN (Not a Number) through the expression evaluator.
Specifically:
1. Update `prepare_metric()` in stat-shadow.c to treat uncounted events
(where `run == 0`) as NAN. This leverages the existing math in expr.y
to propagate NAN through metric expressions.
2. Remove the early return in the display logic's `printout()` function
that was previously skipping metrics in `--metric-only` mode for
failed events.
l
3. Simplify `perf_stat__skip_metric_event()` to no longer depend on
event runtime.
Tested:
1. `perf all metrics test` did not crash while paranoid is 2.
2. Multiple combinations with `CPUs_utilized` while paranoid is 2.
$ ./perf stat -M CPUs_utilized -a -- sleep 1
Performance counter stats for 'system wide':
<not supported> msec cpu-clock:u # nan CPUs CPUs_utilized
1,006,356,120 duration_time
1.004375550 seconds time elapsed
$ ./perf stat -M CPUs_utilized -a -j -- sleep 1
{"counter-value" : "<not supported>", "unit" : "msec", "event" : "cpu-clock:u", "event-runtime" : 0, "pcnt-running" : 100.00, "metric-value" : "nan", "metric-unit" : "CPUs CPUs_utilized"}
{"counter-value" : "1006642462.000000", "unit" : "", "event" : "duration_time", "event-runtime" : 1, "pcnt-running" : 100.00}
$ ./perf stat -M CPUs_utilized -a --metric-only -- sleep 1
Performance counter stats for 'system wide':
CPUs CPUs_utilized
nan
1.004424652 seconds time elapsed
$ ./perf stat -M CPUs_utilized -a --metric-only -j -- sleep 1
{"CPUs CPUs_utilized" : "none"}
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Chun-Tse Shao <ctshao@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Li <yang.lee@linux.alibaba.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The current IP of a leaf function when reported from a perf record with
"--call-graph lbr" is the "to" field of the LBR branch stack record.
The sample for the event being recorded may be further into the function
and there may be inlining information associated with it.
Rather than use the branch stack "to" field in this case switch to the
callchain appending the sample->ip and thereby allowing the inline
information to show.
Before this change:
```
$ perf record --call-graph lbr perf test -w inlineloop
...
$ perf script --fields +srcline
...
perf-inlineloop 467586 4649.344493: 950905 cpu_core/cycles/P:
55dfda2829c0 parent+0x0 (perf)
inlineloop.c:31
55dfda282a96 inlineloop+0x86 (perf)
inlineloop.c:47
55dfda236420 run_workload+0x59 (perf)
builtin-test.c:715
55dfda236b03 cmd_test+0x413 (perf)
builtin-test.c:825
...
```
After this change:
```
$ perf record --call-graph lbr perf test -w inlineloop
...
$ perf script --fields +srcline
...
perf-inlineloop 529703 11878.680815: 950905 cpu_core/cycles/P:
555ce86be9e6 leaf+0x26
inlineloop.c:20 (inlined)
555ce86be9e6 middle+0x26
inlineloop.c:27 (inlined)
555ce86be9e6 parent+0x26 (perf)
inlineloop.c:32
555ce86bea96 inlineloop+0x86 (perf)
inlineloop.c:47
555ce8672420 run_workload+0x59 (perf)
builtin-test.c:715
555ce8672b03 cmd_test+0x413 (perf)
builtin-test.c:825
...
```
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Krzysztof Łopatowski <krzysztof.m.lopatowski@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Weilin Wang <weilin.wang@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Since commit ceea279f9376 ("perf kvm stat: Remove use of the arch
directory"), a native build on Arm64 machine reports:
util/kvm-stat-arch/kvm-stat-x86.c:7:10: fatal error: asm/svm.h: No such file or directory
7 | #include <asm/svm.h>
| ^~~~~~~~~~~
compilation terminated.
The build fails to find x86's asm headers when building for Arm64. Fix
this by including asm headers with relative path instead.
Fixes: ceea279f9376 ("perf kvm stat: Remove use of the arch directory")
Signed-off-by: Leo Yan <leo.yan@arm.com>
Link: https://lore.kernel.org/r/20260206-perf_fix_kvm_stat_error-v1-1-ad40115876be@arm.com
Cc: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: James Clark <james.clark@linaro.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
In line with the previous patch, the __weak arch_sdt_arg_parse_op()
function is removed.
Architectural-specific implementations in the arch/ directory are now
converted into sub-functions within the util/perf-regs-arch/ directory.
The perf_sdt_arg_parse_op() function will call these sub-functions based
on the EM_HOST.
This change enables cross-architecture calls to arch_sdt_arg_parse_op().
No functional changes are intended.
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Guo Ren <guoren@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xudong Hao <xudong.hao@intel.com>
Cc: Zide Chen <zide.chen@intel.com>
[ Fixed up somme fuzz with powerpc and x86 Build files wrt removing perf_regs.o ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Currently, some architecture-specific perf-regs functions, such as
arch__intr_reg_mask() and arch__user_reg_mask(), are defined with the
__weak attribute.
This approach ensures that only functions matching the architecture of
the build/run host are compiled and executed, reducing build time and
binary size.
However, this __weak attribute restricts these functions to be called
only on the same architecture, preventing cross-architecture
functionality.
For example, a perf.data file captured on x86 cannot be parsed on an ARM
platform.
To address this limitation, this patch removes the __weak attribute from
these perf-regs functions.
The architecture-specific code is moved from the arch/ directory to the
util/perf-regs-arch/ directory.
The appropriate architectural functions are then called based on the
EM_HOST.
No functional changes are intended.
Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Guo Ren <guoren@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xudong Hao <xudong.hao@intel.com>
Cc: Zide Chen <zide.chen@intel.com>
[ Fixed up somme fuzz with s390 and riscv Build files wrt removing perf_regs.o ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Fix an issue where the `perf` tool aborts unexpectedly when running the
following command:
```
perf record -e cycles -I -- true
Usage: perf record [<options>] [<command>]
or: perf record [<options>] -- <command> [<options>]
-I, --intr-regs[=<any register>]
sample selected machine registers on interrupt, use '-I?' to list register names
```
The usage of the `-I` or `--user-regs` options without specifying any
registers should default to sampling all general-purpose registers.
However, this currently causes an abnormal termination.
The issue was introduced by commit 3d06db9bad1a ("perf regs: Refactor
use of arch__sample_reg_masks() to perf_reg_name()").
This patch resolves the problem, ensuring that the `-I` or `--user-regs`
options work as intended without causing an abort.
Fixes: 3d06db9bad1ad8e6 ("perf regs: Refactor use of arch__sample_reg_masks() to perf_reg_name()")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Guo Ren <guoren@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: linux-arm-kernel@lists.infradead.org
Cc: linux-csky@vger.kernel.org
Cc: linux-riscv@lists.infradead.org
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Xudong Hao <xudong.hao@intel.com>
Cc: Zide Chen <zide.chen@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The failure to find a table of metrics with a CPUID shouldn't early
exit as the metric code will now also consider the default table.
When searching for a metric or metric group,
pmu_metrics_table__for_each_metric() considers all tables and so the
caller doesn't need to switch the table to do this.
Fixes: c7adeb0974f18da4 ("perf jevents: Add set of common metrics based on default ones")
Reviewed-by: Leo Yan <leo.yan@arm.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Leo Yan <leo.yan@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The machine can be calculated from a thread via its maps.
Don't require the machine argument to simplify callers and also to delay
computing the machine until a little later.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quan Zhou <zhouquan@iscas.ac.cn>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Add 64-bits of feature data to record the ELF machine and flags.
This allows readers to initialize based on the data.
For example, `perf kvm stat` wants to initialize based on the kind of
data to be read, but at initialization time there are no threads to base
this data upon and using the host means cross platform support won't
work.
The values in the perf_env also act as a cache for these within the
session.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quan Zhou <zhouquan@iscas.ac.cn>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Allow e_flags as well as e_machine to be computed using the e_machine
helper.
This isn't currently used, the argument is always NULL, but it will be
used for a new header feature.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quan Zhou <zhouquan@iscas.ac.cn>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Pass the e_machine to the kvm functions so that they aren't just wired
to EM_HOST.
In the case of a session move some setup until the session
is created.
As the session isn't fully running the default EM_HOST is returned as no
e_machine can be found in a running machine.
This is, however, some marginal progress to cross platform support.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quan Zhou <zhouquan@iscas.ac.cn>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
`perf kvm stat` supports record and report options.
By using the arch directory a report for a different machine type cannot
be supported.
Move the kvm-stat code out of the arch directory and into
util/kvm-stat-arch following the pattern of perf-regs and dwarf-regs.
Avoid duplicate symbols by renaming functions to have the architecture
name within them.
For global variables, wrap them in an architecture specific function.
Selecting the architecture to use with `perf kvm stat` is selected by
EM_HOST, ie no different than before the change.
Later the ELF machine can be determined from the session or a header
feature (ie EM_HOST at the time of the record).
The build and #define HAVE_KVM_STAT_SUPPORT is now redundant so remove
across Makefiles and in the build.
Opportunistically constify architectural structs and arrays.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Aditya Bodkhe <aditya.b1@linux.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexandre Ghiti <alex@ghiti.fr>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Jones <ajones@ventanamicro.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Anup Patel <anup@brainfault.org>
Cc: Athira Rajeev <atrajeev@linux.ibm.com>
Cc: Blake Jones <blakejones@google.com>
Cc: Chun-Tse Shao <ctshao@google.com>
Cc: Dapeng Mi <dapeng1.mi@linux.intel.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Howard Chu <howardchu95@gmail.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Quan Zhou <zhouquan@iscas.ac.cn>
Cc: Shimin Guo <shimin.guo@skydio.com>
Cc: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yunseong Kim <ysk@kzalloc.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
If perf is built with LIBCAPSTONE_DLOPEN=1, support dlopen-ing
libcapstone.so and then calling the necessary functions by looking them
up using dlsym.
The types come from capstone.h which means the libcapstone feature check
needs to pass, and NO_CAPSTONE=1 hasn't been defined. This will cause
the definition of HAVE_LIBCAPSTONE_SUPPORT.
Earlier versions of this code tried to declare the necessary
capstone.h constants and structs, but they weren't stable and caused
breakages across libcapstone releases.
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Charlie Jenkins <charlie@rivosinc.com>
Cc: Collin Funk <collin.funk1@gmail.com>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Now that the bitfield dependency is resolved, the explicit inclusion of
kernel.h is no longer needed.
Remove the redundant include.
Signed-off-by: Leo Yan <leo.yan@arm.com>
Cc: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
The function cpumask_to_cpulist() allocates memory with calloc() and
stores the result in 'bm', but then incorrectly checks 'cpumask' for
NULL instead of 'bm'.
This means that if the allocation fails, the function will dereference a
NULL pointer when trying to access 'bm'.
Fix the check to test the correct variable 'bm'.
Fixes: d40c68a49f69c9bd ("perf header: Support CPU DOMAIN relation info")
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Cc: Gautham Shenoy <gautham.shenoy@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
cpumask and cpulist from cpu-domain header have hardcoded max_cpus value
of 1024.
Current systems have more cpus than this value. Replace it with
MAX_NR_CPUS.
Also define a macro to represent domain name length.
Fixes: d40c68a49f69c9bd ("perf header: Support CPU DOMAIN relation info")
Reported-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Reviewed-by: Shrikanth Hegde <sshegde@linux.ibm.com>
Signed-off-by: Swapnil Sapkal <swapnil.sapkal@amd.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Anubhav Shelat <ashelat@redhat.com>
Cc: Chen Yu <yu.c.chen@intel.com>
Cc: Gautham Shenoy <gautham.shenoy@amd.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Thomas Falcon <thomas.falcon@intel.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Ian Rogers noticed that 678ed6b707e4b2db ("perf strlist: Don't write to
const memory") breaks the 'Remove thread map' 'perf test' entry, because
it keeps pointers to the temporary string introduced to avoid touching
the const memory.
This is because the thread_map__new_by_[pt]id_str() were the only
methods using the slist->dont_dupstr knob to keep pointers to the
original const string list, as it uses strtol to parse numbers and it
stops at the comma.
As this is the only case of dont_dupstr use, dupstr being the default,
and it gets in the way of getting rid of the last const-correctness,
remove this knob, with it:
$ perf test 37
37: Remove thread map : Ok
$
Fixes: 678ed6b707e4b2db ("perf strlist: Don't write to const memory")
Reported-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
As newer glibcs will propagate the const attribute of the searched table
to its return.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
tables
As newer glibcs will propagate the const attribute of the searched table
to its return.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
const tables
As newer glibcs will propagate the const attribute of the searched table
to its return.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
tables
As newer glibcs will propagate the const attribute of the searched table
to its return.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
As newer glibcs will propagate the const attribute of the searched table
to its return.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
tables
As newer glibcs will propagate the const attribute of the searched table
to its return.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To avoid having more variables, just cast the const variable searched to
non-const since the result will not be modified, its only later that
that variable will be used to modify something, but then its non-const
memory being modified, so using a cast is the cheapest thing here.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
To address const-correctness errors on newer glibcs (-Werror=discarded-qualifiers).
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Since it is freshly allocated just attribute it to a non-const pointer
and then change it via that pointer.
That way we avoid const-correctness warnings in recent glibc versions.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|
|
Do a strdup to the list string and parse from it, free at the end.
This is to deal with newer glibcs const-correctness.
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
|