summaryrefslogtreecommitdiff
path: root/tools/perf/builtin-script.c
AgeCommit message (Collapse)Author
2025-10-02perf capstone: Move capstone functionality into its own fileIan Rogers
Capstone disassembly support was split between disasm.c and print_insn.c. Move support out of these files into capstone.[ch] and remove include capstone/capstone.h from those files. As disassembly routines can fail, make failure the only option without HAVE_LIBCAPSTONE_SUPPORT. For simplicity's sake, duplicate the read_symbol utility function. The intent with moving capstone support into a single file is that dynamic support, using dlopen for libcapstone, can be added in later patches. This can potentially always succeed or fail, so relying on ifdefs isn't sufficient. Using dlopen is a useful option to minimize the perf tools dependencies and potentially size. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Alexandre Ghiti <alexghiti@rivosinc.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.ibm.com> Cc: Bill Wendling <morbo@google.com> Cc: Charlie Jenkins <charlie@rivosinc.com> Cc: Collin Funk <collin.funk1@gmail.com> Cc: Dmitriy Vyukov <dvyukov@google.com> Cc: Dr. David Alan Gilbert <linux@treblig.org> Cc: Eric Biggers <ebiggers@kernel.org> Cc: Haibo Xu <haibo1.xu@intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Justin Stitt <justinstitt@google.com> Cc: Li Huafei <lihuafei1@huawei.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nathan Chancellor <nathan@kernel.org> Cc: Nick Desaulniers <nick.desaulniers+lkml@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Stephen Brennan <stephen.s.brennan@oracle.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-01perf script: Enable to present DTL entriesAthira Rajeev
The process_event() function in "builtin-script.c" invokes perf_sample__fprintf_synth() for displaying PERF_TYPE_SYNTH type events. if (attr->type == PERF_TYPE_SYNTH && PRINT_FIELD(SYNTH)) perf_sample__fprintf_synth(sample, evsel, fp); perf_sample__fprintf_synth() process the sample depending on the value in evsel->core.attr.config. Introduce perf_sample__fprintf_synth_vpadtl() and invoke this for PERF_SYNTH_POWERPC_VPA_DTL Sample output: ./perf record -a -e sched:*,vpa_dtl/dtl_all/ -c 1000000000 sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.300 MB perf.data ] ./perf script perf 13322 [002] 233.835807: sched:sched_switch: perf:13322 [120] R ==> migration/2:27 [0] migration/2 27 [002] 233.835811: sched:sched_migrate_task: comm=perf pid=13322 prio=120 orig_cpu=2 dest_cpu=3 migration/2 27 [002] 233.835818: sched:sched_stat_runtime: comm=migration/2 pid=27 runtime=9214 [ns] migration/2 27 [002] 233.835819: sched:sched_switch: migration/2:27 [0] S ==> swapper/2:0 [120] swapper 0 [002] 233.835822: vpa-dtl: timebase: 338954486062657 dispatch_reason:decrementer_interrupt, preempt_reason:H_CEDE, enqueue_to_dispatch_time:435, ready_to_enqueue_time:0, waiting_to_ready_time:34775058, processor_id: 202 c0000000000f8094 plpar_hcall_norets_notrace+0x18 ([kernel.kallsyms]) swapper 0 [001] 233.835886: vpa-dtl: timebase: 338954486095398 dispatch_reason:priv_doorbell, preempt_reason:H_CEDE, enqueue_to_dispatch_time:542, ready_to_enqueue_time:0, waiting_to_ready_time:1245360, processor_id: 201 c0000000000f8094 plpar_hcall_norets_notrace+0x18 ([kernel.kallsyms]) Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Athira Rajeev <atrajeev@linux.ibm.com> Tested-by: Tejas Manhas <tejas05@linux.ibm.com> Tested-by: Venkat Rao Bagalkote <venkat88@linux.ibm.com> Cc: Aboorva Devarajan <aboorvad@linux.ibm.com> Cc: Aditya Bodkhe <Aditya.Bodkhe1@ibm.com> Cc: Hari Bathini <hbathini@linux.vnet.ibm.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Madhavan Srinivasan <maddy@linux.ibm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Shrikanth Hegde <sshegde@linux.ibm.com> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-10-01perf tools: Fix duplicated words in documentation and commentsMarkus Heidelberg
- "the the" - "in in" - "a a" Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Markus Heidelberg <m.heidelberg@cab.de> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-07-25perf sample: Remove arch notion of sample parsingIan Rogers
By definition arch sample parsing and synthesis will inhibit certain kinds of cross-platform record then analysis (report, script, etc.). Remove arch_perf_parse_sample_weight and arch_perf_synthesize_sample_weight replacing with a common implementation. Combine perf_sample p_stage_cyc and retire_lat as weight3 to capture the differing uses regardless of compiled for architecture. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-21-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-25perf evlist: Change env variable to sessionIan Rogers
The session holds a perf_env pointer env. In UI code container_of is used to turn the env to a session, but this assumes the session header's env is in use. Rather than a dubious container_of, hold the session in the evlist and derive the env from the session with evsel__env, perf_session__env, etc. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-11-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-25perf session: Add accessor for session->header.envIan Rogers
The perf_env from the header in the session is frequently accessed, add an accessor function rather than access directly. Cache the value to avoid repeated calls. No behavioral change. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250724163302.596743-10-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-07-11perf stat: Move metric list from config to evlistIan Rogers
The rblist of metric_event that then have a list of associated metric_expr is moved out of the stat_config and into the evlist. This is done as part of refactoring things for python, having the state split in two places complicates that implementation. The evlist is doing the harder work of enabling and disabling events, the metrics are needed to compute a value and it doesn't seem unreasonable to hang them from the evlist. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250710235126.1086011-7-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-06-20perf tools: display the new PERF_RECORD_BPF_METADATA eventBlake Jones
Here's some example "perf script -D" output for the new event type. The ": unhandled!" message is from tool.c, analogous to other behavior there. I've elided some rows with all NUL characters for brevity, and I wrapped one of the >75-column lines to fit in the commit guidelines. 0x50fc8@perf.data [0x260]: event: 84 . . ... raw event: size 608 bytes . 0000: 54 00 00 00 00 00 60 02 62 70 66 5f 70 72 6f 67 T.....`.bpf_prog . 0010: 5f 31 65 30 61 32 65 33 36 36 65 35 36 66 31 61 _1e0a2e366e56f1a . 0020: 32 5f 70 65 72 66 5f 73 61 6d 70 6c 65 5f 66 69 2_perf_sample_fi . 0030: 6c 74 65 72 00 00 00 00 00 00 00 00 00 00 00 00 lter............ . 0040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [...] . 0110: 74 65 73 74 5f 76 61 6c 75 65 00 00 00 00 00 00 test_value...... . 0120: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [...] . 0150: 34 32 00 00 00 00 00 00 00 00 00 00 00 00 00 00 42.............. . 0160: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [...] 0 0x50fc8 [0x260]: PERF_RECORD_BPF_METADATA \ prog bpf_prog_1e0a2e366e56f1a2_perf_sample_filter entry 0: test_value = 42 : unhandled! Signed-off-by: Blake Jones <blakejones@google.com> Link: https://lore.kernel.org/r/20250612194939.162730-5-blakejones@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-05-05perf script: Display off-cpu samples correctlyHoward Chu
No PERF_SAMPLE_CALLCHAIN in sample_type, but 'perf script' needs to display a callchain, have to specify manually. Also, prefer displaying a callchain: gvfs-afc-volume 2267 [001] 3829232.955656: 1001115340 offcpu-time: 77f05292603f __pselect+0xbf (/usr/lib/x86_64-linux-gnu/libc.so.6) 77f052a1801c [unknown] (/usr/lib/x86_64-linux-gnu/libusbmuxd-2.0.so.6.0.0) 77f052a18d45 [unknown] (/usr/lib/x86_64-linux-gnu/libusbmuxd-2.0.so.6.0.0) 77f05289ca94 start_thread+0x384 (/usr/lib/x86_64-linux-gnu/libc.so.6) 77f052929c3c clone3+0x2c (/usr/lib/x86_64-linux-gnu/libc.so.6) to a raw binary BPF output: BPF output: 0000: dd 08 00 00 db 08 00 00 <DD>...<DB>... 0008: cc ce ab 3b 00 00 00 00 <CC>Ϋ;.... 0010: 06 00 00 00 00 00 00 00 ........ 0018: 00 fe ff ff ff ff ff ff .<FE><FF><FF><FF><FF><FF><FF> 0020: 3f 60 92 52 f0 77 00 00 ?`.R<F0>w.. 0028: 1c 80 a1 52 f0 77 00 00 ..<A1>R<F0>w.. 0030: 45 8d a1 52 f0 77 00 00 E.<A1>R<F0>w.. 0038: 94 ca 89 52 f0 77 00 00 .<CA>.R<F0>w.. 0040: 3c 9c 92 52 f0 77 00 00 <..R<F0>w.. 0048: 00 00 00 00 00 00 00 00 ........ 0050: 00 00 00 00 .... Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Howard Chu <howardchu95@gmail.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Tested-by: Gautam Menghani <gautam@linux.ibm.com> Tested-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20241108204137.2444151-9-howardchu95@gmail.com Link: https://lore.kernel.org/r/20250501022809.449767-8-howardchu95@gmail.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2025-03-05perf script: Fix output type for dynamically allocated core PMU'sThomas Falcon
This patch was originally posted here: https://lore.kernel.org/all/20241213215421.661139-1-thomas.falcon@intel.com/ I have rebased on top of Arnaldo's patch here: https://lore.kernel.org/all/Z2XCi3PgstSrV0SE@x1/ The original commit message: " perf script output may show different fields on different core PMU's that exist on heterogeneous platforms. For example, perf record -e "{cpu_core/mem-loads-aux/,cpu_core/event=0xcd,\ umask=0x01,ldlat=3,name=MEM_UOPS_RETIRED.LOAD_LATENCY/}:upp"\ -c10000 -W -d -a -- sleep 1 perf script: chromium-browse 46572 [002] 544966.882384: 10000 cpu_core/MEM_UOPS_RETIRED.LOAD_LATENCY/: 7ffdf1391b0c 10268100142 \ |OP LOAD|LVL L1 hit|SNP None|TLB L1 or L2 hit|LCK No|BLK N/A 5 7 0 7fad7c47425d [unknown] (/usr/lib64/libglib-2.0.so.0.8000.3) perf record -e cpu_atom/event=0xd0,umask=0x05,ldlat=3,\ name=MEM_UOPS_RETIRED.LOAD_LATENCY/upp -c10000 -W -d -a -- sleep 1 perf script: gnome-control-c 534224 [023] 544951.816227: 10000 cpu_atom/MEM_UOPS_RETIRED.LOAD_LATENCY/: 7f0aaaa0aae0 [unknown] (/usr/lib64/libglib-2.0.so.0.8000.3) Some fields, such as data_src, are not included by default. The cause is that while one PMU may be assigned a type such as PERF_TYPE_RAW, other core PMU's are dynamically allocated at boot time. If this value does not match an existing PERF_TYPE_X value, output_type(perf_event_attr.type) will return OUTPUT_TYPE_OTHER. Instead search for a core PMU with a matching perf_event_attr type and, if one is found, return PERF_TYPE_RAW to match output of other core PMU's. " Suggested-by: Kan Liang <kan.liang@intel.com> Suggested-by: Ian Rogers <irogers@google.com> Signed-off-by: Thomas Falcon <thomas.falcon@intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250305163935.1605312-1-thomas.falcon@intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05perf script: Add not taken event for branch stackLeo Yan
The branch stack has an existed field for printing mispredict, extend the field for printing events and add support not-taken event. Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Link: https://lore.kernel.org/r/20250304111240.3378214-6-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-03-05perf script: Make printing flags reliableLeo Yan
Add a check for the generated string of flags. Print out the raw number if the string generation fails. Use the SAMPLE_FLAGS_STR_ALIGNED_SIZE macro to replace the value '21'. Reviewed-by: Ian Rogers <irogers@google.com> Reviewed-by: James Clark <james.clark@linaro.org> Signed-off-by: Leo Yan <leo.yan@arm.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20250304111240.3378214-2-leo.yan@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2025-02-12perf sample: Make user_regs and intr_regs optionalIan Rogers
The struct dump_regs contains 512 bytes of cache_regs, meaning the two values in perf_sample contribute 1088 bytes of its total 1384 bytes size. Initializing this much memory has a cost reported by Tavian Barnes <tavianator@tavianator.com> as about 2.5% when running `perf script --itrace=i0`: https://lore.kernel.org/lkml/d841b97b3ad2ca8bcab07e4293375fb7c32dfce7.1736618095.git.tavianator@tavianator.com/ Adrian Hunter <adrian.hunter@intel.com> replied that the zero initialization was necessary and couldn't simply be removed. This patch aims to strike a middle ground of still zeroing the perf_sample, but removing 79% of its size by make user_regs and intr_regs optional pointers to zalloc-ed memory. To support the allocation accessors are created for user_regs and intr_regs. To support correct cleanup perf_sample__init and perf_sample__exit functions are created and added throughout the code base. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20250113194345.1537821-1-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-12-20perf script: Cache the output typeArnaldo Carvalho de Melo
Right now every time we need to figure out the type of an evsel for output purposes we do a quick sequence of ifs, but there are new cases where there is a need to do more complex iterations over multiple data structures, sso allow for caching this operation on a hole of 'struct evsel'. This should really be done on the evsel->priv area that 'perf script' sets up, but more work is needed to make sure that it is allocated when we need it, right now it is only used for conditionally, add some comments so that we move this to that 'perf script' specific area when the conditions are in place for that. Acked-by: Thomas Falcon <thomas.falcon@intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Link: https://lore.kernel.org/lkml/Z2XCi3PgstSrV0SE@x1 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18perf script: Move perf_sample__sprintf_flags to trace-event-scripting.cIan Rogers
perf_sample__sprintf_flags is used in the python C code and so needs to be in the util library rather than a builtin. Signed-off-by: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/20241119011644.971342-12-irogers@google.com Cc: Mark Rutland <mark.rutland@arm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Arnaldo Carvalho de Melo <acme@kernel.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18perf script: Move script_fetch_insn to trace-event-scripting.cIan Rogers
Add native_arch as a parameter to script_fetch_insn rather than relying on the builtin-script value that won't be initialized for the dlfilter and python Context use cases. Assume both of those cases are running natively. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20241119011644.971342-11-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18perf script: Move script_spec code to trace-event-scripting.cIan Rogers
The script_spec code is referenced in util/trace-event-scripting but the list was in builtin-script, accessed via a function that required a stub function in python.c. Move all the logic to trace-event-scripting, with lookup and foreach functions exposed for builtin-script's benefit. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20241119011644.971342-10-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18perf stat: Move stat_config into config.cIan Rogers
stat_config is accessed by config.c via helper functions, but declared in builtin-stat. Move to util/config.c so that stub functions aren't needed in python.c which doesn't link against the builtin files. To avoid name conflicts change builtin-script to use the same stat_config as builtin-stat. Rename local variables in tests to avoid shadow declaration warnings. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20241119011644.971342-9-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18perf script: Move find_scripts to browser/scripts.cIan Rogers
The only use of find_scripts is in browser/scripts.c but the definition in builtin causes linking problems requiring a stub in python.c. Move the function to allow the stub to be removed. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20241119011644.971342-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18perf script: Use openat for directory iterationIan Rogers
Rewrite the directory iteration to use openat so that large character arrays aren't needed. The arrays are warned about potential buffer overflows by GCC when the code exists in a single C file. Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20241119011644.971342-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-18perf script: Move scripting_max_stack out of builtinIan Rogers
scripting_max_stack is used in util code which is linked into the python module. Move the variable declaration to util/trace-event-scripting.c to avoid conditional compilation. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dapeng Mi <dapeng1.mi@linux.intel.com> Cc: Howard Chu <howardchu95@gmail.com> Cc: Ilya Leoshkevich <iii@linux.ibm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Josh Poimboeuf <jpoimboe@redhat.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Michael Petlan <mpetlan@redhat.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Veronika Molnarova <vmolnaro@redhat.com> Cc: Weilin Wang <weilin.wang@intel.com> Link: https://lore.kernel.org/r/20241119011644.971342-5-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-12-09perf evsel: Add/use accessor for tp_formatIan Rogers
Add an accessor function for tp_format. Rather than search+replace uses try to use a variable and reuse it. Add additional NULL checks when accessing/using the value. Make sure the PTR_ERR is nulled out on error path in evsel__newtp_idx. Reviewed-by: Namhyung Kim <namhyung@kernel.org> Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dominique Martinet <asmadeus@codewreck.org> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@linaro.org> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Steven Rostedt (VMware) <rostedt@goodmis.org> Cc: Thomas Falcon <thomas.falcon@intel.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Yang Jihong <yangjihong@bytedance.com> Cc: Yang Li <yang.lee@linux.alibaba.com> Cc: Ze Gao <zegao2021@gmail.com> Cc: Zixian Cai <fzczx123@gmail.com> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20241118225345.889810-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-11-08perf build: Include libtraceevent headers directly indicated by pkg-configYicong Yang
Currently the libtraceevent's found by pkg-config, which give the include path as: [root@localhost tmp]# pkg-config --cflags libtraceevent -I/usr/local/include/traceevent So we should include the libtraceevent headers directly without "traceevent/" prefix. Update all the users. Fixes: 0f0e1f445690 ("perf build: Use pkg-config for feature check for libtrace{event,fs}") Suggested-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/linux-perf-users/ZyF5_Hf1iL01kldE@google.com/ Signed-off-by: Yicong Yang <yangyicong@hisilicon.com> Cc: leo.yan@arm.com Cc: amadio@gentoo.org Cc: linuxarm@huawei.com Link: https://lore.kernel.org/r/20241105105649.45399-1-yangyicong@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-29perf arm-spe: Correctly set sample flagsGraham Woodward
Set flags on all synthesized instruction and branch samples. Signed-off-by: Graham Woodward <graham.woodward@arm.com> Reviewed-by: James Clark <james.clark@linaro.org> Tested-by: Leo Yan <leo.yan@arm.com> Cc: nd@arm.com Cc: mike.leach@linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20241025143009.25419-4-graham.woodward@arm.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-10-17perf stat: Change color to threshold in print_metricIan Rogers
Colors don't mean things in CSV and JSON output, switch to a threshold enum value that the standard output can convert to a color. Updating the CSV and JSON output will be later changes. Signed-off-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: Weilin Wang <weilin.wang@intel.com> Cc: Will Deacon <will@kernel.org> Cc: James Clark <james.clark@linaro.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Leo Yan <leo.yan@linux.dev> Cc: Sumanth Korikkar <sumanthk@linux.ibm.com> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: John Garry <john.g.garry@oracle.com> Link: https://lore.kernel.org/r/20241017175356.783793-6-irogers@google.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2024-09-11perf env: Find correct branch counter info on hybridKan Liang
No event is printed in the "Branch Counter" column on hybrid machines. For example, $ perf record -e "{cpu_core/branch-instructions/pp,cpu_core/branches/}:S" -j any,counter $ perf report --total-cycles # Branch counter abbr list: # cpu_core/branch-instructions/pp = A # cpu_core/branches/ = B # '-' No event occurs # '+' Event occurrences may be lost due to branch counter saturated # # Sampled Cycles% Sampled Cycles Avg Cycles% Avg Cycles Branch Counter # ............... .............. ........... .......... .............. 44.54% 727.1K 0.00% 1 |+ |+ | 36.31% 592.7K 0.00% 2 |+ |+ | 17.83% 291.1K 0.00% 1 |+ |+ | The branch counter information (br_cntr_width and br_cntr_nr) in the perf_env is retrieved from the CPU_PMU_CAPS. However, the CPU_PMU_CAPS is not available on hybrid machines. Without the width information, the number of occurrences of an event cannot be calculated. For a hybrid machine, the caps information should be retrieved from the PMU_CAPS, and stored in the perf_env->pmu_caps. Add a perf_env__find_br_cntr_info() to return the correct branch counter information from the corresponding fields. Committer notes: While testing I couldn't s ee those "Branch counter" columns enabled by pressing 'B' on the TUI, after reporting it to the list Kan explained the situation: <quote Kan Liang> For a hybrid client, the "Branch Counter" feature is only supported starting from the just released Lunar Lake. Perf falls back to only "ANY" on your Raptor Lake. The "The branch counter is not available" message is expected. Here is the 'perf evlist' result from my Lunar Lake machine, # perf evlist -v cpu_core/branch-instructions/pp: type: 4 (cpu_core), size: 136, config: 0xc4 (branch-instructions), { sample_period, sample_freq }: 4000, sample_type: IP|TID|TIME|READ|PERIOD|BRANCH_STACK|IDENTIFIER, read_format: ID|GROUP|LOST, disabled: 1, freq: 1, enable_on_exec: 1, precise_ip: 2, sample_id_all: 1, exclude_guest: 1, branch_sample_type: ANY|COUNTERS # </quote> Fixes: 6f9d8d1de2c61288 ("perf script: Add branch counters") Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240909184201.553519-1-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-09-03perf script: Minimize "not reaching sample" for '-F +brstackinsn'Andi Kleen
In some situations 'perf script -F +brstackinsn' sees a lot of "not reaching sample" messages. This happens when the last LBR block before the sample contains a branch that is not in the LBR, and the instruction dumping stops. $ perf record -b emacs -Q --batch '()' [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.396 MB perf.data (443 samples) ] $ perf script -F +brstackinsn ... 00007f0ab2d171a4 insn: 41 0f 94 c0 00007f0ab2d171a8 insn: 83 fa 01 00007f0ab2d171ab insn: 74 d3 # PRED 6 cycles [313] 1.00 IPC 00007f0ab2d17180 insn: 45 84 c0 00007f0ab2d17183 insn: 74 28 ... not reaching sample ... $ perf script -F +brstackinsn | grep -c reach 136 $ This is a problem for further analysis that wants to see the full code upto the sample. There are two common cases where the message is bogus: - The LBR only logs taken branches, but the branch might be a conditional branch that is not taken (that is the most common case actually) - The LBR sampling uses a filter ignoring some branches, but the perf script check checks for all branches. This patch fixes these two conditions, by only checking for conditional branches, as well as checking the perf_event_attr's branch filter attributes. For the test case above it fixes all the messages: $ ./perf script -F +brstackinsn | grep -c reach 0 Note that there are still conditions when the message is hit -- sometimes there can be a unconditional branch that misses the LBR update before the sample -- but they are much more rare now. Signed-off-by: Andi Kleen <ak@linux.intel.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20240229161828.386397-1-ak@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-14perf script: Add branch countersKan Liang
It's useful to print the branch counter information for each jump in the brstackinsn when it's available. Add a new field 'brcntr' to display the branch counter information. By default, the abbreviation will be used to indicate the branch counter. In the verbose mode, the real event name is shown. $ perf script -F +brstackinsn,+brcntr # Branch counter abbr list: # branch-instructions:ppp = A # branch-misses = B # '-' No event occurs # '+' Event occurrences may be lost due to branch counter saturated tchain_edit 332203 3366329.405674: 53030 branch-instructions:ppp: 401781 f3+0x2c (home/sdp/test/tchain_edit) f3+31: 0000000000401774 insn: eb 04 br_cntr: AA # PRED 5 cycles [5] 000000000040177a insn: 81 7d fc 0f 27 00 00 0000000000401781 insn: 7e e3 br_cntr: A # PRED 1 cycles [6] 2.00 IPC 0000000000401766 insn: 8b 45 fc 0000000000401769 insn: 83 e0 01 000000000040176c insn: 85 c0 000000000040176e insn: 74 06 br_cntr: A # PRED 1 cycles [7] 4.00 IPC 0000000000401776 insn: 83 45 fc 01 000000000040177a insn: 81 7d fc 0f 27 00 00 0000000000401781 insn: 7e e3 br_cntr: A # PRED 7 cycles [14] 0.43 IPC $ perf script -F +brstackinsn,+brcntr -v tchain_edit 332203 3366329.405674: 53030 branch-instructions:ppp: 401781 f3+0x2c (/home/sdp/os.linux.perf.test-suite/kernels/lbr_kernel/tchain_edit) f3+31: 0000000000401774 insn: eb 04 br_cntr: branch-instructions:ppp 2 branch-misses 0 # PRED 5 cycles [5] 000000000040177a insn: 81 7d fc 0f 27 00 00 0000000000401781 insn: 7e e3 br_cntr: branch-instructions:ppp 1 branch-misses 0 # PRED 1 cycles [6] 2.00 IPC 0000000000401766 insn: 8b 45 fc 0000000000401769 insn: 83 e0 01 000000000040176c insn: 85 c0 000000000040176e insn: 74 06 br_cntr: branch-instructions:ppp 1 branch-misses 0 # PRED 1 cycles [7] 4.00 IPC 0000000000401776 insn: 83 45 fc 01 000000000040177a insn: 81 7d fc 0f 27 00 00 0000000000401781 insn: 7e e3 br_cntr: branch-instructions:ppp 1 branch-misses 0 # PRED 7 cycles [14] 0.43 IPC Originally-by: Tinghao Zhang <tinghao.zhang@intel.com> Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Ingo Molnar <mingo@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephane Eranian <eranian@google.com> Link: https://lore.kernel.org/r/20240813160208.2493643-9-kan.liang@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-12perf script: Use perf_tool__init()Ian Rogers
Use perf_tool__init() so that more uses of 'struct perf_tool' can be const and not relying on perf_tool__fill_defaults(). Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20240812204720.631678-14-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-12perf tool: Constify tool pointersIan Rogers
The tool pointer (to a struct largely of function pointers) is passed around but is unchanged except at initialization. Change parameter and variable types to be const to lower the possibilities of what could happen with a tool. Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Adrian Hunter <adrian.hunter@intel.com> Tested-by: Leo Yan <leo.yan@arm.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Anshuman Khandual <anshuman.khandual@arm.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Huacai Chen <chenhuacai@kernel.org> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: Jonathan Cameron <jonathan.cameron@huawei.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Nick Desaulniers <ndesaulniers@google.com> Cc: Nick Terrell <terrelln@fb.com> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20240812204720.631678-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-08-12perf script: add --addr2line optionMartin Liška
Similarly to other subcommands (like report, top), it would be handy to provide a path for addr2line command. Signed-off-by: Martin Liska <martin.liska@hey.com> Cc: Ian Rogers <irogers@google.com> Link: https://lore.kernel.org/r/eadc3e36-029d-4848-9d69-272fe5a83a26@foxlink.cz Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-07perf mem-info: Add reference count checkingIan Rogers
Add reference count checking and switch 'struct mem_info' usage to use accessor functions. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-8-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-07perf mem-info: Move mem-info out of mem-events and symbolIan Rogers
Move mem-info to its own header rather than having it split between mem-events and symbol. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kajol Jain <kjain@linux.ibm.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Oliver Upton <oliver.upton@linux.dev> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Tim Chen <tim.c.chen@linux.intel.com> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: Yicong Yang <yangyicong@hisilicon.com> Link: https://lore.kernel.org/r/20240507183545.1236093-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-06perf dso: Add reference count checking and accessor functionsIan Rogers
Add reference count checking to struct dso, this can help with implementing correct reference counting discipline. To avoid RC_CHK_ACCESS everywhere, add accessor functions for the variables in struct dso. The majority of the change is mechanical in nature and not easy to split up. Committer testing: 'perf test' up to this patch shows no regressions. But: util/symbol.c: In function ‘dso__load_bfd_symbols’: util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’ 1683 | dso__set_adjust_symbols(dso); | ^~~~~~~~~~~~~~~~~~~~~~~ In file included from util/symbol.c:21: util/dso.h:268:20: note: declared here 268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val) | ^~~~~~~~~~~~~~~~~~~~~~~ make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1 MKDIR /tmp/tmp.ZWHbQftdN6/tests/workloads/ make[6]: *** Waiting for unfinished jobs.... This was updated: - symbols__fixup_end(&dso->symbols, false); - symbols__fixup_duplicate(&dso->symbols); - dso->adjust_symbols = 1; + symbols__fixup_end(dso__symbols(dso), false); + symbols__fixup_duplicate(dso__symbols(dso)); + dso__set_adjust_symbols(dso); But not build tested with BUILD_NONDISTRO and libbfd devel files installed (binutils-devel on fedora). Add the missing argument: symbols__fixup_end(dso__symbols(dso), false); symbols__fixup_duplicate(dso__symbols(dso)); - dso__set_adjust_symbols(dso); + dso__set_adjust_symbols(dso, true); Signed-off-by: Ian Rogers <irogers@google.com> Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Ben Gainey <ben.gainey@arm.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Chengen Du <chengen.du@canonical.com> Cc: Colin Ian King <colin.i.king@gmail.com> Cc: Dima Kogan <dima@secretsauce.net> Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linux.dev> Cc: Li Dong <lidong@vivo.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Masami Hiramatsu <mhiramat@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Paran Lee <p4ranlee@gmail.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Song Liu <song@kernel.org> Cc: Sun Haiyong <sunhaiyong@loongson.cn> Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Tiezhu Yang <yangtiezhu@loongson.cn> Cc: Yanteng Si <siyanteng@loongson.cn> Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com> Link: https://lore.kernel.org/r/20240504213803.218974-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-08perf script: Consolidate capstone print functionsAdrian Hunter
Consolidate capstone print functions, to reduce duplication. Amend call sites to use a file pointer for output, which is consistent with most perf tools print functions. Add print_opts with an option to print also the hex value of a resolved symbol+offset. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Link: https://lore.kernel.org/r/20240401210925.209671-4-ak@linux.intel.com Signed-off-by: Andi Kleen <ak@linux.intel.com> [ Added missing inttypes.h include to use PRIx64 in util/print_insn.c ] Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-05perf script: Add capstone support for '-F +brstackdisasm'Andi Kleen
Support capstone output for the '-F +brstackinsn' branch dump. The new output is enabled with the new field 'brstackdisasm'. This was possible before with --xed, but now also allow it for users that don't have xed using the builtin capstone support. Before: perf record -b emacs -Q --batch '()' perf script -F +brstackinsn ... emacs 55778 1814366.755945: 151564 cycles:P: 7f0ab2d17192 intel_check_word.constprop.0+0x162 (/usr/lib64/ld-linux-x86-64.s> intel_check_word.constprop.0+237: 00007f0ab2d1711d insn: 75 e6 # PRED 3 cycles [3] 00007f0ab2d17105 insn: 73 51 00007f0ab2d17107 insn: 48 89 c1 00007f0ab2d1710a insn: 48 39 ca 00007f0ab2d1710d insn: 73 96 00007f0ab2d1710f insn: 48 8d 04 11 00007f0ab2d17113 insn: 48 d1 e8 00007f0ab2d17116 insn: 49 8d 34 c1 00007f0ab2d1711a insn: 44 3a 06 00007f0ab2d1711d insn: 75 e6 # PRED 3 cycles [6] 3.00 IPC 00007f0ab2d17105 insn: 73 51 # PRED 1 cycles [7] 1.00 IPC 00007f0ab2d17158 insn: 48 8d 50 01 00007f0ab2d1715c insn: eb 92 # PRED 1 cycles [8] 2.00 IPC 00007f0ab2d170f0 insn: 48 39 ca 00007f0ab2d170f3 insn: 73 b0 # PRED 1 cycles [9] 2.00 IPC After (perf must be compiled with capstone): perf script -F +brstackdisasm ... emacs 55778 1814366.755945: 151564 cycles:P: 7f0ab2d17192 intel_check_word.constprop.0+0x162 (/usr/lib64/ld-linux-x86-64.s> intel_check_word.constprop.0+237: 00007f0ab2d1711d jne intel_check_word.constprop.0+0xd5 # PRED 3 cycles [3] 00007f0ab2d17105 jae intel_check_word.constprop.0+0x128 00007f0ab2d17107 movq %rax, %rcx 00007f0ab2d1710a cmpq %rcx, %rdx 00007f0ab2d1710d jae intel_check_word.constprop.0+0x75 00007f0ab2d1710f leaq (%rcx, %rdx), %rax 00007f0ab2d17113 shrq $1, %rax 00007f0ab2d17116 leaq (%r9, %rax, 8), %rsi 00007f0ab2d1711a cmpb (%rsi), %r8b 00007f0ab2d1711d jne intel_check_word.constprop.0+0xd5 # PRED 3 cycles [6] 3.00 IPC 00007f0ab2d17105 jae intel_check_word.constprop.0+0x128 # PRED 1 cycles [7] 1.00 IPC 00007f0ab2d17158 leaq 1(%rax), %rdx 00007f0ab2d1715c jmp intel_check_word.constprop.0+0xc0 # PRED 1 cycles [8] 2.00 IPC 00007f0ab2d170f0 cmpq %rcx, %rdx 00007f0ab2d170f3 jae intel_check_word.constprop.0+0x75 # PRED 1 cycles [9] 2.00 IPC Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/r/20240401210925.209671-3-ak@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-05perf script: Support 32bit code under 64bit OS with capstoneAndi Kleen
Use the DSO to resolve whether an IP is 32bit or 64bit and use that to configure capstone to the correct mode. This allows to correctly disassemble 32bit code under a 64bit OS. % cat > loop.c volatile int var; int main(void) { int i; for (i = 0; i < 100000; i++) var++; } % gcc -m32 -o loop loop.c % perf record -e cycles:u ./loop % perf script -F +disasm loop 82665 1833176.618023: 1 cycles:u: f7eed500 _start+0x0 (/usr/lib/ld-linux.so.2) movl %esp, %eax loop 82665 1833176.618029: 1 cycles:u: f7eed500 _start+0x0 (/usr/lib/ld-linux.so.2) movl %esp, %eax loop 82665 1833176.618031: 7 cycles:u: f7eed500 _start+0x0 (/usr/lib/ld-linux.so.2) movl %esp, %eax loop 82665 1833176.618034: 91 cycles:u: f7eed500 _start+0x0 (/usr/lib/ld-linux.so.2) movl %esp, %eax loop 82665 1833176.618036: 1242 cycles:u: f7eed500 _start+0x0 (/usr/lib/ld-linux.so.2) movl %esp, %eax Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Thomas Richter <tmricht@linux.ibm.com> Signed-off-by: Andi Kleen <ak@linux.intel.com> Link: https://lore.kernel.org/r/20240401210925.209671-2-ak@linux.intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03perf evsel: Use evsel__name_is() helperYang Jihong
Code cleanup, replace strcmp(evsel__name(evsel, {NAME})) with evsel__name_is() helper. No functional change. Committer notes: Fix this build error: trace.syscalls.events.bpf_output = evlist__last(trace.evlist); - assert(evsel__name_is(trace.syscalls.events.bpf_output), "__augmented_syscalls__"); + assert(evsel__name_is(trace.syscalls.events.bpf_output, "__augmented_syscalls__")); Reviewed-by: Ian Rogers <irogers@google.com> Signed-off-by: Yang Jihong <yangjihong@bytedance.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: James Clark <james.clark@arm.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20240401062724.1006010-3-yangjihong@bytedance.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21perf script: Show also errors for --insn-trace optionAdrian Hunter
The trace could be misleading if trace errors are not taken into account, so display them also by adding the itrace "e" option. Note --call-trace and --call-ret-trace already add the itrace "e" option. Fixes: b585ebdb5912cf14 ("perf script: Add --insn-trace for instruction decoding") Reviewed-by: Andi Kleen <ak@linux.intel.com> Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Ian Rogers <irogers@google.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240315071334.3478-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-02-20perf: script: add raw|disasm arguments to --insn-trace optionChangbin Du
Now '--insn-trace' accept a argument to specify the output format: - raw: display raw instructions. - disasm: display mnemonic instructions (if capstone is installed). $ sudo perf script --insn-trace=raw ls 1443864 [006] 2275506.209908875: 7f216b426100 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) insn: 48 89 e7 ls 1443864 [006] 2275506.209908875: 7f216b426103 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) insn: e8 e8 0c 00 00 ls 1443864 [006] 2275506.209908875: 7f216b426df0 _dl_start+0x0 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) insn: f3 0f 1e fa $ sudo perf script --insn-trace=disasm ls 1443864 [006] 2275506.209908875: 7f216b426100 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) movq %rsp, %rdi ls 1443864 [006] 2275506.209908875: 7f216b426103 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) callq _dl_start+0x0 ls 1443864 [006] 2275506.209908875: 7f216b426df0 _dl_start+0x0 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) illegal instruction ls 1443864 [006] 2275506.209908875: 7f216b426df4 _dl_start+0x4 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) pushq %rbp ls 1443864 [006] 2275506.209908875: 7f216b426df5 _dl_start+0x5 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) movq %rsp, %rbp ls 1443864 [006] 2275506.209908875: 7f216b426df8 _dl_start+0x8 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) pushq %r15 Signed-off-by: Changbin Du <changbin.du@huawei.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: changbin.du@gmail.com Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240217074046.4100789-5-changbin.du@huawei.com
2024-02-20perf: script: add field 'disasm' to display mnemonic instructionsChangbin Du
In addition to the 'insn' field, this adds a new field 'disasm' to display mnemonic instructions instead of the raw code. $ sudo perf script -F +disasm perf-exec 1443864 [006] 2275506.209848: psb: psb offs: 0 0 [unknown] ([unknown]) perf-exec 1443864 [006] 2275506.209848: cbr: cbr: 41 freq: 4100 MHz (114%) 0 [unknown] ([unknown]) ls 1443864 [006] 2275506.209905: 1 branches:uH: 7f216b426100 _start+0x0 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) movq %rsp, %rdi ls 1443864 [006] 2275506.209908: 1 branches:uH: 7f216b426103 _start+0x3 (/usr/lib/x86_64-linux-gnu/ld-2.31.so) callq _dl_start+0x0 Signed-off-by: Changbin Du <changbin.du@huawei.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: changbin.du@gmail.com Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240217074046.4100789-4-changbin.du@huawei.com
2024-02-20perf: util: use capstone disasm engine to show assembly instructionsChangbin Du
Currently, the instructions of samples are shown as raw hex strings which are hard to read. x86 has a special option '--xed' to disassemble the hex string via intel XED tool. Here we use capstone as our disassembler engine to give more friendly instructions. We select libcapstone because capstone can provide more insn details. Perf will fallback to raw instructions if libcapstone is not available. The advantages compared to XED tool: * Support arm, arm64, x86-32, x86_64 (more could be supported), xed only for x86_64. * Immediate address operands are shown as symbol+offs. Signed-off-by: Changbin Du <changbin.du@huawei.com> Reviewed-by: Adrian Hunter <adrian.hunter@intel.com> Cc: changbin.du@gmail.com Cc: Thomas Richter <tmricht@linux.ibm.com> Cc: Andi Kleen <ak@linux.intel.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240217074046.4100789-3-changbin.du@huawei.com
2024-02-08perf tools: Make it possible to see perf's kernel and module memory mappingsAdrian Hunter
Dump kmaps if using 'perf --debug kmaps' or verbose > 2 (e.g. -vvv) for tools 'perf script' and 'perf report' if there is no browser. Example: $ perf --debug kmaps script 2>&1 >/dev/null | grep kvm.intel build id event received for /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko: 0691d75e10e72ebbbd45a44c59f6d00a5604badf [20] Map: 0-3a3 4f5d8 [kvm_intel].modinfo Map: 0-5240 5f280 [kvm_intel]__versions Map: 0-30 64 [kvm_intel].note.Linux Map: 0-14 644c0 [kvm_intel].orc_header Map: 0-5297 43680 [kvm_intel].rodata Map: 0-5bee 3b837 [kvm_intel].text.unlikely Map: 0-7e0 41430 [kvm_intel].noinstr.text Map: 0-2080 713c0 [kvm_intel].bss Map: 0-26 705c8 [kvm_intel].data..read_mostly Map: 0-5888 6a4c0 [kvm_intel].data Map: 0-22 70220 [kvm_intel].data.once Map: 0-40 705f0 [kvm_intel].data..percpu Map: 0-1685 41d20 [kvm_intel].init.text Map: 0-4b8 6fd60 [kvm_intel].init.data Map: 0-380 70248 [kvm_intel]__dyndbg Map: 0-8 70218 [kvm_intel].exit.data Map: 0-438 4f980 [kvm_intel]__param Map: 0-5f5 4ca0f [kvm_intel].rodata.str1.1 Map: 0-3657 493b8 [kvm_intel].rodata.str1.8 Map: 0-e0 70640 [kvm_intel].data..ro_after_init Map: 0-500 70ec0 [kvm_intel].gnu.linkonce.this_module Map: ffffffffc13a7000-ffffffffc1421000 a0 /lib/modules/6.7.2-local/kernel/arch/x86/kvm/kvm-intel.ko The example above shows how the module section mappings are all wrong except for the main .text mapping at 0xffffffffc13a7000. Signed-off-by: Adrian Hunter <adrian.hunter@intel.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Like Xu <like.xu.linux@gmail.com> Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240208085326.13432-2-adrian.hunter@intel.com
2024-02-07perf script: Print source line for each jump in brstackinsnKan Liang
With the srcline option, the perf script only prints a source line at the beginning of a sample with call/ret from functions, but not for each jump in brstackinsn. It's useful to print a source line for each jump in brstackinsn when the end user analyze the full assembler sequences of branch sequences for the sample. The srccode option can also be used to locate the source code line. However, it's printed almost for every line and makes the output less readable. $perf script -F +brstackinsn,+srcline --xed Before the patch, tchain_edit_deb 1463275 15228549.107820: 282495 instructions:u: 401133 f3+0xd (/home/kan/os.li> tchain_edit.c:22 f3+40: tchain_edit.c:20 000000000040114e jle 0x401133 # PRED 6 cycles [6] 0000000000401133 movl -0x4(%rbp), %eax 0000000000401136 and $0x1, %eax 0000000000401139 test %eax, %eax 000000000040113b jz 0x401143 000000000040113d addl $0x1, -0x4(%rbp) 0000000000401141 jmp 0x401147 # PRED 3 cycles [9] 2.00 IPC 0000000000401147 cmpl $0x3e7, -0x4(%rbp) 000000000040114e jle 0x401133 # PRED 6 cycles [15] 0.33 IPC After the patch, tchain_edit_deb 1463275 15228549.107820: 282495 instructions:u: 401133 f3+0xd (/home/kan/os.li> tchain_edit.c:22 f3+40: tchain_edit.c:20 000000000040114e jle 0x401133 srcline: tchain_edit.c:20 # PRED 6 cycles [6] 0000000000401133 movl -0x4(%rbp), %eax 0000000000401136 and $0x1, %eax 0000000000401139 test %eax, %eax 000000000040113b jz 0x401143 000000000040113d addl $0x1, -0x4(%rbp) 0000000000401141 jmp 0x401147 srcline: tchain_edit.c:23 # PRED 3 cycles [9] 2.00 IPC 0000000000401147 cmpl $0x3e7, -0x4(%rbp) 000000000040114e jle 0x401133 srcline: tchain_edit.c:20 # PRED 6 cycles [15] 0.33 IPC Signed-off-by: Kan Liang <kan.liang@linux.intel.com> Reviewed-by: Ian Rogers <irogers@google.com> Cc: ahmad.yasin@intel.com Cc: amiri.khalil@intel.com Cc: ak@linux.intel.com Signed-off-by: Namhyung Kim <namhyung@kernel.org> Link: https://lore.kernel.org/r/20240205145819.1943114-1-kan.liang@linux.intel.com
2023-10-17perf: script: fix missing ',' for fields optionChangbin Du
A comma is missed at the end of line. Signed-off-by: Changbin Du <changbin.du@huawei.com> Link: https://lore.kernel.org/r/20231017015524.797065-1-changbin.du@huawei.com Signed-off-by: Namhyung Kim <namhyung@kernel.org>
2023-08-08perf script: Print "cgroup" field on the same line as "comm"Ivan Babrou
Commit 3fd7a168bf51 ("perf script: Add 'cgroup' field for output") added support for printing cgroup path in perf script output. It was okay if you didn't want any stacks: $ sudo perf script --comms jpegtran:23f4bf -F comm,tid,cpu,time,cgroup jpegtran:23f4bf 3321915 [013] 404718.587488: /idle.slice/polish.service jpegtran:23f4bf 3321915 [031] 404718.592073: /idle.slice/polish.service With stacks it gets messier as cgroup is printed after the stack: $ perf script --comms jpegtran:23f4bf -F comm,tid,cpu,time,cgroup,ip,sym jpegtran:23f4bf 3321915 [013] 404718.587488: 5c554 compress_output 570d9 jpeg_finish_compress 3476e jpegtran_main 330ee jpegtran::main 326e2 core::ops::function::FnOnce::call_once (inlined) 326e2 std::sys_common::backtrace::__rust_begin_short_backtrace /idle.slice/polish.service jpegtran:23f4bf 3321915 [031] 404718.592073: 8474d jsimd_encode_mcu_AC_first_prepare_sse2.PADDING 55af68e62fff [unknown] /idle.slice/polish.service Let's instead print cgroup on the same line as comm: $ perf script --comms jpegtran:23f4bf -F comm,tid,cpu,time,cgroup,ip,sym jpegtran:23f4bf 3321915 [013] 404718.587488: /idle.slice/polish.service 5c554 compress_output 570d9 jpeg_finish_compress 3476e jpegtran_main 330ee jpegtran::main 326e2 core::ops::function::FnOnce::call_once (inlined) 326e2 std::sys_common::backtrace::__rust_begin_short_backtrace jpegtran:23f4bf 3321915 [031] 404718.592073: /idle.slice/polish.service 8474d jsimd_encode_mcu_AC_first_prepare_sse2.PADDING 55af68e62fff [unknown] Fixes: 3fd7a168bf514979 ("perf script: Add 'cgroup' field for output") Signed-off-by: Ivan Babrou <ivan@cloudflare.com> Acked-by: Ian Rogers <irogers@google.com> Acked-by: Namhyung Kim <namhyung@kernel.org> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: kernel-team@cloudflare.com Link: https://lore.kernel.org/r/20230718000737.49077-1-ivan@cloudflare.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-06-12perf script: Remove some large stack allocationsIan Rogers
Some char buffers are stack allocated but in total they come to 24kb. Avoid Wstack-usage warnings by moving the arrays to being dynamically allocated. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Peter Zijlstra <peterz@infradead.org> Link: https://lore.kernel.org/r/20230527034324.2597593-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-06-12perf callchain: Use pthread keys for tls callchain_cursorIan Rogers
Pthread keys are more portable than __thread and allow the association of a destructor with the key. Use the destructor to clean up TLS callchain cursors to aid understanding memory leaks. Committer notes: Had to fixup a series of unconverted places and also check for the return of get_tls_callchain_cursor() as it may fail and return NULL. In that unlikely case we now either print something to a file, if the caller was expecting to print a callchain, or return an error code to state that resolving the callchain isn't possible. In some cases this was made easier because thread__resolve_callchain() already can fail for other reasons, so this new one (cursor == NULL) can be added and the callers don't have to explicitely check for this new condition. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Brian Robbins <brianrob@linux.microsoft.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Fangrui Song <maskray@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Babrou <ivan@cloudflare.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Ye Xingchen <ye.xingchen@zte.com.cn> Cc: Yuan Can <yuancan@huawei.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230608232823.4027869-25-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-06-12perf addr_location: Add init/exit/copy functionsIan Rogers
struct addr_location holds references to multiple reference counted objects. Add init/exit functions to make maintenance of those more consistent with the rest of the code and to try to avoid leaks. Modification of thread reference counts isn't included in this change. Committer notes: I needed to initialize result to sample->ip to make sure is set to something, fixing a compile time error, mostly keeping the previous logic as build_alloc_func_list() already does debugging/error prints about what went wrong if it takes the 'goto out'. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Brian Robbins <brianrob@linux.microsoft.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Fangrui Song <maskray@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Babrou <ivan@cloudflare.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Ye Xingchen <ye.xingchen@zte.com.cn> Cc: Yuan Can <yuancan@huawei.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230608232823.4027869-7-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-06-12perf thread: Add accessor functions for threadIan Rogers
Using accessors will make it easier to add reference count checking in later patches. Committer notes: thread->nsinfo wasn't wrapped as it is used together with nsinfo__zput(), where does a trick to set the field with a refcount being dropped to NULL, and that doesn't work well with using thread__nsinfo(thread), that loses the &thread->nsinfo pointer. When refcount checking is added to 'struct thread', later in this series, nsinfo__zput(RC_CHK_ACCESS(thread)->nsinfo) will be used to check the thread pointer. Signed-off-by: Ian Rogers <irogers@google.com> Cc: Adrian Hunter <adrian.hunter@intel.com> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com> Cc: Ali Saidi <alisaidi@amazon.com> Cc: Andi Kleen <ak@linux.intel.com> Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com> Cc: Brian Robbins <brianrob@linux.microsoft.com> Cc: Changbin Du <changbin.du@huawei.com> Cc: Dmitrii Dolgov <9erthalion6@gmail.com> Cc: Fangrui Song <maskray@google.com> Cc: German Gomez <german.gomez@arm.com> Cc: Ingo Molnar <mingo@redhat.com> Cc: Ivan Babrou <ivan@cloudflare.com> Cc: James Clark <james.clark@arm.com> Cc: Jing Zhang <renyu.zj@linux.alibaba.com> Cc: Jiri Olsa <jolsa@kernel.org> Cc: John Garry <john.g.garry@oracle.com> Cc: K Prateek Nayak <kprateek.nayak@amd.com> Cc: Kan Liang <kan.liang@linux.intel.com> Cc: Leo Yan <leo.yan@linaro.org> Cc: Liam Howlett <liam.howlett@oracle.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Miguel Ojeda <ojeda@kernel.org> Cc: Mike Leach <mike.leach@linaro.org> Cc: Namhyung Kim <namhyung@kernel.org> Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Ravi Bangoria <ravi.bangoria@amd.com> Cc: Sean Christopherson <seanjc@google.com> Cc: Steinar H. Gunderson <sesse@google.com> Cc: Suzuki Poulouse <suzuki.poulose@arm.com> Cc: Wenyu Liu <liuwenyu7@huawei.com> Cc: Will Deacon <will@kernel.org> Cc: Yang Jihong <yangjihong1@huawei.com> Cc: Ye Xingchen <ye.xingchen@zte.com.cn> Cc: Yuan Can <yuancan@huawei.com> Cc: coresight@lists.linaro.org Cc: linux-arm-kernel@lists.infradead.org Link: https://lore.kernel.org/r/20230608232823.4027869-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>