<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/tools/perf, branch v6.6.65</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.6.65</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.6.65'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2024-12-09T09:32:29Z</updated>
<entry>
<title>perf trace: Avoid garbage when not printing a syscall's arguments</title>
<updated>2024-12-09T09:32:29Z</updated>
<author>
<name>Benjamin Peterson</name>
<email>benjamin@engflow.com</email>
</author>
<published>2024-11-07T23:21:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2f465476ff8b147857fbc0aa56925b33f38dde50'/>
<id>urn:sha1:2f465476ff8b147857fbc0aa56925b33f38dde50</id>
<content type='text'>
[ Upstream commit 1302e352b26f34991b619b5d0b621b76d20a3883 ]

syscall__scnprintf_args may not place anything in the output buffer
(e.g., because the arguments are all zero). If that happened in
trace__fprintf_sys_enter, its fprintf would receive an unitialized
buffer leading to garbage output.

Fix the problem by passing the (possibly zero) bounds of the argument
buffer to the output fprintf.

Fixes: a98392bb1e169a04 ("perf trace: Use beautifiers on syscalls:sys_enter_ handlers")
Signed-off-by: Benjamin Peterson &lt;benjamin@engflow.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Tested-by: Howard Chu &lt;howardchu95@gmail.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20241107232128.108981-2-benjamin@engflow.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf trace: Do not lose last events in a race</title>
<updated>2024-12-09T09:32:29Z</updated>
<author>
<name>Benjamin Peterson</name>
<email>benjamin@engflow.com</email>
</author>
<published>2024-11-07T23:21:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fad7be2395123cc9a36cbeb02fd7f13f4980eb9a'/>
<id>urn:sha1:fad7be2395123cc9a36cbeb02fd7f13f4980eb9a</id>
<content type='text'>
[ Upstream commit 3fd7c36973a250e17a4ee305a31545a9426021f4 ]

If a perf trace event selector specifies a maximum number of events to output
(i.e., "/nr=N/" syntax), the event printing handler, trace__event_handler,
disables the event selector after the maximum number events are
printed.

Furthermore, trace__event_handler checked if the event selector was
disabled before doing any work. This avoided exceeding the maximum
number of events to print if more events were in the buffer before the
selector was disabled.

However, the event selector can be disabled for reasons other than
exceeding the maximum number of events. In particular, when the traced
subprocess exits, the main loop disables all event selectors. This meant
the last events of a traced subprocess might be lost to the printing
handler's short-circuiting logic.

This nondeterministic problem could be seen by running the following many times:

  $ perf trace -e syscalls:sys_enter_exit_group true

trace__event_handler should simply check for exceeding the maximum number of
events to print rather than the state of the event selector.

Fixes: a9c5e6c1e9bff42c ("perf trace: Introduce per-event maximum number of events property")
Signed-off-by: Benjamin Peterson &lt;benjamin@engflow.com&gt;
Tested-by: Howard Chu &lt;howardchu95@gmail.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20241107232128.108981-1-benjamin@engflow.com
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf trace: Fix tracing itself, creating feedback loops</title>
<updated>2024-12-09T09:32:28Z</updated>
<author>
<name>Howard Chu</name>
<email>howardchu95@gmail.com</email>
</author>
<published>2024-10-30T05:24:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e3b4b95c2b8bc7cd29cbf744b70bca21694b8889'/>
<id>urn:sha1:e3b4b95c2b8bc7cd29cbf744b70bca21694b8889</id>
<content type='text'>
[ Upstream commit fe4f9b4124967ffb75d66994520831231b779550 ]

There exists a pids_filtered map in augmented_raw_syscalls.bpf.c that
ceases to provide functionality after the BPF skeleton migration done
in:

5e6da6be3082f77b ("perf trace: Migrate BPF augmentation to use a skeleton")

Before the migration, pid_filtered map works, courtesy of Arnaldo
Carvalho de Melo &lt;acme@kernel.org&gt;:

  ⬢ [acme@toolbox perf-tools]$ git log --oneline -5
  6f769c3458b6cf2d (HEAD) perf tests trace+probe_vfs_getname.sh: Accept quotes surrounding the filename
  7777ac3dfe29f55d perf test trace+probe_vfs_getname.sh: Remove stray \ before /
  33d9c5062113a4bd perf script python: Add stub for PMU symbol to the python binding
  e59fea47f83e8a9a perf symbols: Fix DSO kernel load and symbol process to correctly map DSO to its long_name, type and adjust_symbols
  878460e8d0ff84a0 perf build: Remove -Wno-unused-but-set-variable from the flex flags when building with clang &lt; 13.0.0

  root@x1:/home/acme/git/perf-tools# perf trace -e /tmp/augmented_raw_syscalls.o -e write* --max-events=30  &amp;
  [1] 180632
  root@x1:/home/acme/git/perf-tools#      0.000 ( 0.051 ms): NetworkManager/1127 write(fd: 3, buf: 0x7ffeb508ef70, count: 8)                           = 8
       0.115 ( 0.010 ms): NetworkManager/1127 write(fd: 3, buf: 0x7ffeb508ef70, count: 8)                           = 8
       0.916 ( 0.068 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 246)                         = 246
       1.699 ( 0.047 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 121)                         = 121
       2.167 ( 0.041 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 121)                         = 121
       2.739 ( 0.042 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 121)                         = 121
       3.138 ( 0.027 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 121)                         = 121
       3.477 ( 0.027 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 121)                         = 121
       3.738 ( 0.023 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 121)                         = 121
       3.946 ( 0.024 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 121)                         = 121
       4.195 ( 0.024 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 121)                         = 121
       4.212 ( 0.026 ms): NetworkManager/1127 write(fd: 3, buf: 0x7ffeb508ef70, count: 8)                           = 8
       4.285 ( 0.006 ms): NetworkManager/1127 write(fd: 3, buf: 0x7ffeb508ef70, count: 8)                           = 8
       4.445 ( 0.018 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 260)                         = 260
       4.508 ( 0.009 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 124)                         = 124
       4.592 ( 0.010 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 116)                         = 116
       4.666 ( 0.009 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 130)                         = 130
       4.715 ( 0.010 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 95)                          = 95
       4.765 ( 0.007 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 102)                         = 102
       4.815 ( 0.009 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 79)                          = 79
       4.890 ( 0.008 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 57)                          = 57
       4.937 ( 0.007 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 89)                          = 89
       5.009 ( 0.010 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 112)                         = 112
       5.059 ( 0.010 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 112)                         = 112
       5.116 ( 0.007 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 79)                          = 79
       5.152 ( 0.009 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 33)                          = 33
       5.215 ( 0.008 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 37)                          = 37
       5.293 ( 0.010 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 128)                         = 128
       5.339 ( 0.009 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 89)                          = 89
       5.384 ( 0.008 ms): sudo/156867 write(fd: 8, buf: 0x55cb4cd2f650, count: 100)                         = 100

  [1]+  Done                    perf trace -e /tmp/augmented_raw_syscalls.o -e write* --max-events=30
  root@x1:/home/acme/git/perf-tools#

No events for the 'perf trace' (pid 180632), i.e. no feedback loop.

If we leave it running:

  root@x1:/home/acme/git/perf-tools# perf trace -e /tmp/augmented_raw_syscalls.o -e landlock_add_rule &amp;
  [1] 181068
  root@x1:/home/acme/git/perf-tools#

  And then look at what maps it sets up:

  root@x1:/home/acme/git/perf-tools# bpftool map | grep pids_filtered -A3
  1190: hash  name pids_filtered  flags 0x0
          key 4B  value 1B  max_entries 64  memlock 7264B
          btf_id 1613
          pids perf(181068)
  root@x1:/home/acme/git/perf-tools#

  And ask for dumping its contents:

  We see that we are _also_ setting it to filter those:

  root@x1:/home/acme/git/perf-tools# bpftool map dump id 1190
  [{
          "key": 181068,
          "value": 1
      },{
          "key": 156801,
          "value": 1
      }
  ]

Now testing the migration commit:

  perf $ git log
  commit 5e6da6be3082f77be06894a1a94d52a90b4007dc (HEAD)
  Author: Ian Rogers &lt;irogers@google.com&gt;
  Date:   Thu Aug 10 11:48:51 2023 -0700

      perf trace: Migrate BPF augmentation to use a skeleton

  perf $ ./perf trace -e write --max-events=10 &amp; echo #!
  [1] 1808653
  perf $
       0.000 ( 0.010 ms): :1808671/1808671 write(fd: 1, buf: 0x6003f5b26fc0, count: 11) = 11
       0.162 (         ): perf/1808653 write(fd: 2, buf: 0x7fffc2174e50, count: 11)     ...
       0.174 (         ): perf/1808653 write(fd: 2, buf: 0x74ce21804563, count: 1)      ...
       0.184 (         ): perf/1808653 write(fd: 2, buf: 0x57b936589052, count: 5)

The feedback loop is there.

Keep it running, look into the bpf map:

  perf $ bpftool map | grep pids_filtered
  10675: hash  name pids_filtered  flags 0x0

  perf $ bpftool map dump id 10675
  []

The map is empty.

Now, this commit:

  64917f4df048a064 ("perf trace: Use heuristic when deciding if a syscall tracepoint "const char *" field is really a string")

Temporarily fixed the feedback loop for perf trace -e write, that's
because before using the heuristic, write is hooked to sys_enter_openat:

  perf $ git log
  commit 83a0943b1870944612a8aa0049f910826ebfd4f7 (HEAD)
  Author: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
  Date:   Thu Aug 17 12:11:51 2023 -0300

      perf trace: Use the augmented_raw_syscall BPF skel only for tracing syscalls

  perf $ ./perf trace -e write --max-events=10 -v 2&gt;&amp;1 | grep Reusing
  Reusing "openat" BPF sys_enter augmenter for "write"

And after the heuristic fix, it's unaugmented:

  perf $ git log
  commit 64917f4df048a0649ea7901c2321f020e71e6f24 (HEAD)
  Author: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
  Date:   Thu Aug 17 15:14:21 2023 -0300

      perf trace: Use heuristic when deciding if a syscall tracepoint "const char *" field is really a string

  perf $ ./perf trace -e write --max-events=10 -v 2&gt;&amp;1 | grep Reusing
  perf $

After using the heuristic, write is hooked to syscall_unaugmented, which
returns 1.

  SEC("tp/raw_syscalls/sys_enter")
  int syscall_unaugmented(struct syscall_enter_args *args)
  {
  	return 1;
  }

If the BPF program returns 1, the tracepoint filter will filter it
(since the tracepoint filter for perf is correctly set), but before the
heuristic, when it was hooked to a sys_enter_openat(), which is a BPF
program that calls bpf_perf_event_output() and writes to the buffer, it
didn't get filtered, thus creating feedback loop. So switching write to
unaugmented accidentally fixed the problem.

But some syscalls are not so lucky, for example newfstatat:
perf $ ./perf trace -e newfstatat --max-events=100 &amp; echo #!
[1] 2166948

   457.718 (         ): perf/2166948 newfstatat(dfd: CWD, filename: "/proc/self/ns/mnt", statbuf: 0x7fff0132a9f0) ...
   457.749 (         ): perf/2166948 newfstatat(dfd: CWD, filename: "/proc/2166950/ns/mnt", statbuf: 0x7fff0132aa80) ...
   457.962 (         ): perf/2166948 newfstatat(dfd: CWD, filename: "/proc/self/ns/mnt", statbuf: 0x7fff0132a9f0) ...

Currently, write is augmented by the new BTF general augmenter (which
calls bpf_perf_event_output()). The problem, which luckily got fixed,
resurfaced, and that’s how it was discovered.

Fixes: 5e6da6be3082f77b ("perf trace: Migrate BPF augmentation to use a skeleton")
Signed-off-by: Howard Chu &lt;howardchu95@gmail.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ian Rogers &lt;irogers@google.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: James Clark &lt;james.clark@linaro.org&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20241030052431.2220130-1-howardchu95@gmail.com
[ Check if trace-&gt;skel is non-NULL, as it is only initialized if trace-&gt;trace_syscalls is set ]
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf list: Fix topic and pmu_name argument order</title>
<updated>2024-12-09T09:32:28Z</updated>
<author>
<name>Jean-Philippe Romain</name>
<email>jean-philippe.romain@foss.st.com</email>
</author>
<published>2024-11-09T02:58:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f38ab496c7c45ce594f092ab74a03d7644821d66'/>
<id>urn:sha1:f38ab496c7c45ce594f092ab74a03d7644821d66</id>
<content type='text'>
[ Upstream commit d99b3125726aade4f5ec4aae04805134ab4b0abd ]

Fix function definitions to match header file declaration. Fix two
callers to pass the arguments in the right order.

On Intel Tigerlake, before:
```
$ perf list -j|grep "\"Topic\""|sort|uniq
        "Topic": "cache",
        "Topic": "cpu",
        "Topic": "floating point",
        "Topic": "frontend",
        "Topic": "memory",
        "Topic": "other",
        "Topic": "pfm icl",
        "Topic": "pfm ix86arch",
        "Topic": "pfm perf_raw",
        "Topic": "pipeline",
        "Topic": "tool",
        "Topic": "uncore interconnect",
        "Topic": "uncore memory",
        "Topic": "uncore other",
        "Topic": "virtual memory",
$ perf list -j|grep "\"Unit\""|sort|uniq
        "Unit": "cache",
        "Unit": "cpu",
        "Unit": "cstate_core",
        "Unit": "cstate_pkg",
        "Unit": "i915",
        "Unit": "icl",
        "Unit": "intel_bts",
        "Unit": "intel_pt",
        "Unit": "ix86arch",
        "Unit": "msr",
        "Unit": "perf_raw",
        "Unit": "power",
        "Unit": "tool",
        "Unit": "uncore_arb",
        "Unit": "uncore_clock",
        "Unit": "uncore_imc_free_running_0",
        "Unit": "uncore_imc_free_running_1",
```

After:
```
$ perf list -j|grep "\"Topic\""|sort|uniq
        "Topic": "cache",
        "Topic": "floating point",
        "Topic": "frontend",
        "Topic": "memory",
        "Topic": "other",
        "Topic": "pfm icl",
        "Topic": "pfm ix86arch",
        "Topic": "pfm perf_raw",
        "Topic": "pipeline",
        "Topic": "tool",
        "Topic": "uncore interconnect",
        "Topic": "uncore memory",
        "Topic": "uncore other",
        "Topic": "virtual memory",
$ perf list -j|grep "\"Unit\""|sort|uniq
        "Unit": "cpu",
        "Unit": "cstate_core",
        "Unit": "cstate_pkg",
        "Unit": "i915",
        "Unit": "icl",
        "Unit": "intel_bts",
        "Unit": "intel_pt",
        "Unit": "ix86arch",
        "Unit": "msr",
        "Unit": "perf_raw",
        "Unit": "power",
        "Unit": "tool",
        "Unit": "uncore_arb",
        "Unit": "uncore_clock",
        "Unit": "uncore_imc_free_running_0",
        "Unit": "uncore_imc_free_running_1",
```

Fixes: e5c6109f4813246a ("perf list: Reorganize to use callbacks to allow honouring command line options")
Reviewed-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Signed-off-by: Jean-Philippe Romain &lt;jean-philippe.romain@foss.st.com&gt;
Tested-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@kernel.org&gt;
Cc: Junhao He &lt;hejunhao3@huawei.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20241109025801.560378-1-irogers@google.com
[ I fixed the two callers and added it to Jean-Phillippe's original change. ]
Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf trace: avoid garbage when not printing a trace event's arguments</title>
<updated>2024-12-09T09:32:28Z</updated>
<author>
<name>Benjamin Peterson</name>
<email>benjamin@engflow.com</email>
</author>
<published>2024-11-03T20:48:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ad8cf4a18d9945bc03536a568f03524c28cea04d'/>
<id>urn:sha1:ad8cf4a18d9945bc03536a568f03524c28cea04d</id>
<content type='text'>
[ Upstream commit 5fb8e56542a3cf469fdf25d77f50e21cbff3ae7e ]

trace__fprintf_tp_fields may not print any tracepoint arguments. E.g., if the
argument values are all zero. Previously, this would result in a totally
uninitialized buffer being passed to fprintf, which could lead to garbage on the
console. Fix the problem by passing the number of initialized bytes fprintf.

Fixes: f11b2803bb88 ("perf trace: Allow choosing how to augment the tracepoint arguments")
Signed-off-by: Benjamin Peterson &lt;benjamin@engflow.com&gt;
Tested-by: Howard Chu &lt;howardchu95@gmail.com&gt;
Tested-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Link: https://lore.kernel.org/r/20241103204816.7834-1-benjamin@engflow.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf ftrace latency: Fix unit on histogram first entry when using --use-nsec</title>
<updated>2024-12-09T09:32:25Z</updated>
<author>
<name>Arnaldo Carvalho de Melo</name>
<email>acme@kernel.org</email>
</author>
<published>2024-10-29T19:29:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=aac3361f6d22e6a85dcbd68ade16631bcb2d70dc'/>
<id>urn:sha1:aac3361f6d22e6a85dcbd68ade16631bcb2d70dc</id>
<content type='text'>
[ Upstream commit 064d569e20e82c065b1dec9d20c29c7087bb1a00 ]

The use_nsec arg wasn't being taken into account when printing the first
histogram entry, fix it:

  root@number:~# perf ftrace latency --use-nsec -T switch_mm_irqs_off -a sleep 2
  #   DURATION     |      COUNT | GRAPH                                          |
       0 - 1    us |          0 |                                                |
       1 - 2    ns |          0 |                                                |
       2 - 4    ns |          0 |                                                |
       4 - 8    ns |          0 |                                                |
       8 - 16   ns |          0 |                                                |
      16 - 32   ns |          0 |                                                |
      32 - 64   ns |        125 |                                                |
      64 - 128  ns |        335 |                                                |
     128 - 256  ns |       2155 | ####                                           |
     256 - 512  ns |       9996 | ###################                            |
     512 - 1024 ns |       4958 | #########                                      |
       1 - 2    us |       4636 | #########                                      |
       2 - 4    us |       1053 | ##                                             |
       4 - 8    us |         15 |                                                |
       8 - 16   us |          1 |                                                |
      16 - 32   us |          0 |                                                |
      32 - 64   us |          0 |                                                |
      64 - 128  us |          0 |                                                |
     128 - 256  us |          0 |                                                |
     256 - 512  us |          0 |                                                |
     512 - 1024 us |          0 |                                                |
       1 - ...  ms |          0 |                                                |
  root@number:~#

After:

  root@number:~# perf ftrace latency --use-nsec -T switch_mm_irqs_off -a sleep 2
  #   DURATION     |      COUNT | GRAPH                                          |
       0 - 1    ns |          0 |                                                |
       1 - 2    ns |          0 |                                                |
       2 - 4    ns |          0 |                                                |
       4 - 8    ns |          0 |                                                |
       8 - 16   ns |          0 |                                                |
      16 - 32   ns |          0 |                                                |
      32 - 64   ns |         19 |                                                |
      64 - 128  ns |         94 |                                                |
     128 - 256  ns |       2191 | ####                                           |
     256 - 512  ns |       9719 | ####################                           |
     512 - 1024 ns |       5330 | ###########                                    |
       1 - 2    us |       4104 | ########                                       |
       2 - 4    us |        807 | #                                              |
       4 - 8    us |          9 |                                                |
       8 - 16   us |          0 |                                                |
      16 - 32   us |          0 |                                                |
      32 - 64   us |          0 |                                                |
      64 - 128  us |          0 |                                                |
     128 - 256  us |          0 |                                                |
     256 - 512  us |          0 |                                                |
     512 - 1024 us |          0 |                                                |
       1 - ...  ms |          0 |                                                |
  root@number:~#

Fixes: 84005bb6148618cc ("perf ftrace latency: Add -n/--use-nsec option")
Signed-off-by: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Gabriele Monaco &lt;gmonaco@redhat.com&gt;
Link: https://lore.kernel.org/r/ZyE3frB-hMXHCnMO@x1
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf probe: Correct demangled symbols in C++ program</title>
<updated>2024-12-09T09:32:25Z</updated>
<author>
<name>Leo Yan</name>
<email>leo.yan@arm.com</email>
</author>
<published>2024-10-12T14:14:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cbc853c490c986156462b382f604fe489f3c4c72'/>
<id>urn:sha1:cbc853c490c986156462b382f604fe489f3c4c72</id>
<content type='text'>
[ Upstream commit 314909f13cc12d47c468602c37dace512d225eeb ]

An issue can be observed when probe C++ demangled symbol with steps:

  # nm test_cpp_mangle | grep print_data
    0000000000000c94 t _GLOBAL__sub_I__Z10print_datai
    0000000000000afc T _Z10print_datai
    0000000000000b38 T _Z10print_dataR5Point

  # perf probe -x /home/niayan01/test_cpp_mangle -F --demangle
    ...
    print_data(Point&amp;)
    print_data(int)
    ...

  # perf --debug verbose=3 probe -x test_cpp_mangle --add "test=print_data(int)"
    probe-definition(0): test=print_data(int)
    symbol:print_data(int) file:(null) line:0 offset:0 return:0 lazy:(null)
    0 arguments
    Open Debuginfo file: /home/niayan01/test_cpp_mangle
    Try to find probe point from debuginfo.
    Symbol print_data(int) address found : afc
    Matched function: print_data [2ccf]
    Probe point found: print_data+0
    Found 1 probe_trace_events.
    Opening /sys/kernel/tracing//uprobe_events write=1
    Opening /sys/kernel/tracing//README write=0
    Writing event: p:probe_test_cpp_mangle/test /home/niayan01/test_cpp_mangle:0xb38
    ...

When tried to probe symbol "print_data(int)", the log shows:

    Symbol print_data(int) address found : afc

The found address is 0xafc - which is right with verifying the output
result from nm. Afterwards when write event, the command uses offset
0xb38 in the last log, which is a wrong address.

The dwarf_diename() gets a common function name, in above case, it
returns string "print_data". As a result, the tool parses the offset
based on the common name. This leads to probe at the wrong symbol
"print_data(Point&amp;)".

To fix the issue, use the die_get_linkage_name() function to retrieve
the distinct linkage name - this is the mangled name for the C++ case.
Based on this unique name, the tool can get a correct offset for
probing. Based on DWARF doc, it is possible the linkage name is missed
in the DIE, it rolls back to use dwarf_diename().

After:

  # perf --debug verbose=3 probe -x test_cpp_mangle --add "test=print_data(int)"
    probe-definition(0): test=print_data(int)
    symbol:print_data(int) file:(null) line:0 offset:0 return:0 lazy:(null)
    0 arguments
    Open Debuginfo file: /home/niayan01/test_cpp_mangle
    Try to find probe point from debuginfo.
    Symbol print_data(int) address found : afc
    Matched function: print_data [2d06]
    Probe point found: print_data+0
    Found 1 probe_trace_events.
    Opening /sys/kernel/tracing//uprobe_events write=1
    Opening /sys/kernel/tracing//README write=0
    Writing event: p:probe_test_cpp_mangle/test /home/niayan01/test_cpp_mangle:0xafc
    Added new event:
      probe_test_cpp_mangle:test (on print_data(int) in /home/niayan01/test_cpp_mangle)

    You can now use it in all perf tools, such as:

            perf record -e probe_test_cpp_mangle:test -aR sleep 1

  # perf --debug verbose=3 probe -x test_cpp_mangle --add "test2=print_data(Point&amp;)"
    probe-definition(0): test2=print_data(Point&amp;)
    symbol:print_data(Point&amp;) file:(null) line:0 offset:0 return:0 lazy:(null)
    0 arguments
    Open Debuginfo file: /home/niayan01/test_cpp_mangle
    Try to find probe point from debuginfo.
    Symbol print_data(Point&amp;) address found : b38
    Matched function: print_data [2ccf]
    Probe point found: print_data+0
    Found 1 probe_trace_events.
    Opening /sys/kernel/tracing//uprobe_events write=1
    Parsing probe_events: p:probe_test_cpp_mangle/test /home/niayan01/test_cpp_mangle:0x0000000000000afc
    Group:probe_test_cpp_mangle Event:test probe:p
    Opening /sys/kernel/tracing//README write=0
    Writing event: p:probe_test_cpp_mangle/test2 /home/niayan01/test_cpp_mangle:0xb38
    Added new event:
      probe_test_cpp_mangle:test2 (on print_data(Point&amp;) in /home/niayan01/test_cpp_mangle)

    You can now use it in all perf tools, such as:

            perf record -e probe_test_cpp_mangle:test2 -aR sleep 1

Fixes: fb1587d869a3 ("perf probe: List probes with line number and file name")
Signed-off-by: Leo Yan &lt;leo.yan@arm.com&gt;
Acked-by: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Link: https://lore.kernel.org/r/20241012141432.877894-1-leo.yan@arm.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf probe: Fix libdw memory leak</title>
<updated>2024-12-09T09:32:25Z</updated>
<author>
<name>Ian Rogers</name>
<email>irogers@google.com</email>
</author>
<published>2024-10-16T23:56:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2c6f6c3843df3c02dc314e728b490701028156af'/>
<id>urn:sha1:2c6f6c3843df3c02dc314e728b490701028156af</id>
<content type='text'>
[ Upstream commit 4585038b8e186252141ef86e9f0d8e97f11dce8d ]

Add missing dwarf_cfi_end to free memory associated with probe_finder
cfi_eh which is allocated and owned via a call to
dwarf_getcfi_elf. Confusingly cfi_dbg shouldn't be freed as its memory
is owned by the passed in debuginfo struct. Add comments to highlight
this.

This addresses leak sanitizer issues seen in:
tools/perf/tests/shell/test_uprobe_from_different_cu.sh

Fixes: 270bde1e76f4 ("perf probe: Search both .eh_frame and .debug_frame sections for probe location")
Signed-off-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Steinar H. Gunderson &lt;sesse@google.com&gt;
Cc: Alexander Lobakin &lt;aleksander.lobakin@intel.com&gt;
Cc: Masami Hiramatsu (Google) &lt;mhiramat@kernel.org&gt;
Cc: Kajol Jain &lt;kjain@linux.ibm.com&gt;
Cc: Athira Rajeev &lt;atrajeev@linux.vnet.ibm.com&gt;
Cc: Hemant Kumar &lt;hemant@linux.vnet.ibm.com&gt;
Link: https://lore.kernel.org/r/20241016235622.52166-3-irogers@google.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf test attr: Add back missing topdown events</title>
<updated>2024-12-09T09:32:24Z</updated>
<author>
<name>Veronika Molnarova</name>
<email>vmolnaro@redhat.com</email>
</author>
<published>2024-03-11T08:16:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9ac8d66362305642588c974b44a82d87c04d440e'/>
<id>urn:sha1:9ac8d66362305642588c974b44a82d87c04d440e</id>
<content type='text'>
[ Upstream commit 6bff76af9635411214ca44ea38fc2781e78064b6 ]

With the patch 0b6c5371c03c "Add missing topdown metrics events" eight
topdown metric events with numbers ranging from 0x8000 to 0x8700 were
added to the test since they were added as 'perf stat' default events.
Later the patch 951efb9976ce "Update no event/metric expectations" kept
only 4 of those events(0x8000-0x8300).

Currently, the topdown events with numbers 0x8400 to 0x8700 are missing
from the list of expected events resulting in a failure. Add back the
missing topdown events.

Fixes: 951efb9976ce ("perf test attr: Update no event/metric expectations")
Signed-off-by: Veronika Molnarova &lt;vmolnaro@redhat.com&gt;
Tested-by: Ian Rogers &lt;irogers@google.com&gt;
Cc: mpetlan@redhat.com
Link: https://lore.kernel.org/r/20240311081611.7835-1-vmolnaro@redhat.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf trace: Keep exited threads for summary</title>
<updated>2024-12-09T09:32:24Z</updated>
<author>
<name>Michael Petlan</name>
<email>mpetlan@redhat.com</email>
</author>
<published>2024-09-27T15:19:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0c4753453964cc5a416ec4a561ae9664e1bbc032'/>
<id>urn:sha1:0c4753453964cc5a416ec4a561ae9664e1bbc032</id>
<content type='text'>
[ Upstream commit d29d92df410e2fb523f640478b18f70c1823e55e ]

Since 9ffa6c7512ca ("perf machine thread: Remove exited threads by
default") perf cleans exited threads up, but as said, sometimes they
are necessary to be kept. The mentioned commit does not cover all the
cases, we also need the information to construct the summary table in
perf-trace.

Before:
    # perf trace -s true

     Summary of events:

After:
    # perf trace -s -- true

     Summary of events:

     true (383382), 64 events, 91.4%

       syscall            calls  errors  total       min       avg       max       stddev
                                         (msec)    (msec)    (msec)    (msec)        (%)
       --------------- --------  ------ -------- --------- --------- ---------     ------
       mmap                   8      0     0.150     0.013     0.019     0.031     11.90%
       mprotect               3      0     0.045     0.014     0.015     0.017      6.47%
       openat                 2      0     0.014     0.006     0.007     0.007      9.73%
       munmap                 1      0     0.009     0.009     0.009     0.009      0.00%
       access                 1      1     0.009     0.009     0.009     0.009      0.00%
       pread64                4      0     0.006     0.001     0.001     0.002      4.53%
       fstat                  2      0     0.005     0.001     0.002     0.003     37.59%
       arch_prctl             2      1     0.003     0.001     0.002     0.002     25.91%
       read                   1      0     0.003     0.003     0.003     0.003      0.00%
       close                  2      0     0.003     0.001     0.001     0.001      3.86%
       brk                    1      0     0.002     0.002     0.002     0.002      0.00%
       rseq                   1      0     0.001     0.001     0.001     0.001      0.00%
       prlimit64              1      0     0.001     0.001     0.001     0.001      0.00%
       set_robust_list        1      0     0.001     0.001     0.001     0.001      0.00%
       set_tid_address        1      0     0.001     0.001     0.001     0.001      0.00%
       execve                 1      0     0.000     0.000     0.000     0.000      0.00%

[namhyung: simplified the condition]

Fixes: 9ffa6c7512ca ("perf machine thread: Remove exited threads by default")
Reported-by: Veronika Molnarova &lt;vmolnaro@redhat.com&gt;
Signed-off-by: Michael Petlan &lt;mpetlan@redhat.com&gt;
Link: https://lore.kernel.org/r/20240927151926.399474-1-mpetlan@redhat.com
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
