<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/perf_event.h, branch v6.1.152</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.1.152</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.1.152'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2025-06-04T12:40:16Z</updated>
<entry>
<title>perf: Avoid the read if the count is already updated</title>
<updated>2025-06-04T12:40:16Z</updated>
<author>
<name>Peter Zijlstra (Intel)</name>
<email>peterz@infradead.org</email>
</author>
<published>2025-01-21T15:23:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e1c3bfe365f10c3c5cfa53e16ad60201879f74f4'/>
<id>urn:sha1:e1c3bfe365f10c3c5cfa53e16ad60201879f74f4</id>
<content type='text'>
[ Upstream commit 8ce939a0fa194939cc1f92dbd8bc1a7806e7d40a ]

The event may have been updated in the PMU-specific implementation,
e.g., Intel PEBS counters snapshotting. The common code should not
read and overwrite the value.

The PERF_SAMPLE_READ in the data-&gt;sample_type can be used to detect
whether the PMU-specific value is available. If yes, avoid the
pmu-&gt;read() in the common code. Add a new flag, skip_read, to track the
case.

Factor out a perf_pmu_read() to clean up the code.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Kan Liang &lt;kan.liang@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20250121152303.3128733-3-kan.liang@linux.intel.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf: Fix event leak upon exec and file release</title>
<updated>2024-08-03T06:49:41Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2024-06-21T09:16:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ed2c202dac55423a52d7e2290f2888bf08b8ee99'/>
<id>urn:sha1:ed2c202dac55423a52d7e2290f2888bf08b8ee99</id>
<content type='text'>
commit 3a5465418f5fd970e86a86c7f4075be262682840 upstream.

The perf pending task work is never waited upon the matching event
release. In the case of a child event, released via free_event()
directly, this can potentially result in a leaked event, such as in the
following scenario that doesn't even require a weak IRQ work
implementation to trigger:

schedule()
   prepare_task_switch()
=======&gt; &lt;NMI&gt;
      perf_event_overflow()
         event-&gt;pending_sigtrap = ...
         irq_work_queue(&amp;event-&gt;pending_irq)
&lt;======= &lt;/NMI&gt;
      perf_event_task_sched_out()
          event_sched_out()
              event-&gt;pending_sigtrap = 0;
              atomic_long_inc_not_zero(&amp;event-&gt;refcount)
              task_work_add(&amp;event-&gt;pending_task)
   finish_lock_switch()
=======&gt; &lt;IRQ&gt;
   perf_pending_irq()
      //do nothing, rely on pending task work
&lt;======= &lt;/IRQ&gt;

begin_new_exec()
   perf_event_exit_task()
      perf_event_exit_event()
         // If is child event
         free_event()
            WARN(atomic_long_cmpxchg(&amp;event-&gt;refcount, 1, 0) != 1)
            // event is leaked

Similar scenarios can also happen with perf_event_remove_on_exec() or
simply against concurrent perf_event_release().

Fix this with synchonizing against the possibly remaining pending task
work while freeing the event, just like is done with remaining pending
IRQ work. This means that the pending task callback neither need nor
should hold a reference to the event, preventing it from ever beeing
freed.

Fixes: 517e6a301f34 ("perf: Fix perf_pending_task() UaF")
Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240621091601.18227-5-frederic@kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>perf: Disallow mis-matched inherited group reads</title>
<updated>2023-10-25T10:03:15Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2023-10-18T11:56:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f6952655a61264900ed08e9d642adad8222f8e29'/>
<id>urn:sha1:f6952655a61264900ed08e9d642adad8222f8e29</id>
<content type='text'>
commit 32671e3799ca2e4590773fd0e63aaa4229e50c06 upstream.

Because group consistency is non-atomic between parent (filedesc) and children
(inherited) events, it is possible for PERF_FORMAT_GROUP read() to try and sum
non-matching counter groups -- with non-sensical results.

Add group_generation to distinguish the case where a parent group removes and
adds an event and thus has the same number, but a different configuration of
events as inherited groups.

This became a problem when commit fa8c269353d5 ("perf/core: Invert
perf_read_group() loops") flipped the order of child_list and sibling_list.
Previously it would iterate the group (sibling_list) first, and for each
sibling traverse the child_list. In this order, only the group composition of
the parent is relevant. By flipping the order the group composition of the
child (inherited) events becomes an issue and the mis-match in group
composition becomes evident.

That said; even prior to this commit, while reading of a group that is not
equally inherited was not broken, it still made no sense.

(Ab)use ECHILD as error return to indicate issues with child process group
composition.

Fixes: fa8c269353d5 ("perf/core: Invert perf_read_group() loops")
Reported-by: Budimir Markovic &lt;markovicbudimir@gmail.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20231018115654.GK33217@noisy.programming.kicks-ass.net
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>hw_breakpoint: fix single-stepping when using bpf_overflow_handler</title>
<updated>2023-09-23T09:11:00Z</updated>
<author>
<name>Tomislav Novak</name>
<email>tnovak@meta.com</email>
</author>
<published>2023-06-05T19:19:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=dc1d81ee9312dee73544ec88d356e644edd5c413'/>
<id>urn:sha1:dc1d81ee9312dee73544ec88d356e644edd5c413</id>
<content type='text'>
[ Upstream commit d11a69873d9a7435fe6a48531e165ab80a8b1221 ]

Arm platforms use is_default_overflow_handler() to determine if the
hw_breakpoint code should single-step over the breakpoint trigger or
let the custom handler deal with it.

Since bpf_overflow_handler() currently isn't recognized as a default
handler, attaching a BPF program to a PERF_TYPE_BREAKPOINT event causes
it to keep firing (the instruction triggering the data abort exception
is never skipped). For example:

  # bpftrace -e 'watchpoint:0x10000:4:w { print("hit") }' -c ./test
  Attaching 1 probe...
  hit
  hit
  [...]
  ^C

(./test performs a single 4-byte store to 0x10000)

This patch replaces the check with uses_default_overflow_handler(),
which accounts for the bpf_overflow_handler() case by also testing
if one of the perf_event_output functions gets invoked indirectly,
via orig_default_handler.

Signed-off-by: Tomislav Novak &lt;tnovak@meta.com&gt;
Tested-by: Samuel Gosselin &lt;sgosselin@google.com&gt; # arm64
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/linux-arm-kernel/20220923203644.2731604-1-tnovak@fb.com/
Link: https://lore.kernel.org/r/20230605191923.1219974-1-tnovak@meta.com
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf: Fix missing SIGTRAPs</title>
<updated>2022-10-17T14:32:05Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2022-10-06T13:00:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ca6c21327c6af02b7eec31ce4b9a740a18c6c13f'/>
<id>urn:sha1:ca6c21327c6af02b7eec31ce4b9a740a18c6c13f</id>
<content type='text'>
Marco reported:

Due to the implementation of how SIGTRAP are delivered if
perf_event_attr::sigtrap is set, we've noticed 3 issues:

  1. Missing SIGTRAP due to a race with event_sched_out() (more
     details below).

  2. Hardware PMU events being disabled due to returning 1 from
     perf_event_overflow(). The only way to re-enable the event is
     for user space to first "properly" disable the event and then
     re-enable it.

  3. The inability to automatically disable an event after a
     specified number of overflows via PERF_EVENT_IOC_REFRESH.

The worst of the 3 issues is problem (1), which occurs when a
pending_disable is "consumed" by a racing event_sched_out(), observed
as follows:

		CPU0			|	CPU1
	--------------------------------+---------------------------
	__perf_event_overflow()		|
	 perf_event_disable_inatomic()	|
	  pending_disable = CPU0	| ...
					| _perf_event_enable()
					|  event_function_call()
					|   task_function_call()
					|    /* sends IPI to CPU0 */
	&lt;IPI&gt;				| ...
	 __perf_event_enable()		+---------------------------
	  ctx_resched()
	   task_ctx_sched_out()
	    ctx_sched_out()
	     group_sched_out()
	      event_sched_out()
	       pending_disable = -1
	&lt;/IPI&gt;
	&lt;IRQ-work&gt;
	 perf_pending_event()
	  perf_pending_event_disable()
	   /* Fails to send SIGTRAP because no pending_disable! */
	&lt;/IRQ-work&gt;

In the above case, not only is that particular SIGTRAP missed, but also
all future SIGTRAPs because 'event_limit' is not reset back to 1.

To fix, rework pending delivery of SIGTRAP via IRQ-work by introduction
of a separate 'pending_sigtrap', no longer using 'event_limit' and
'pending_disable' for its delivery.

Additionally; and different to Marco's proposed patch:

 - recognise that pending_disable effectively duplicates oncpu for
   the case where it is set. As such, change the irq_work handler to
   use -&gt;oncpu to target the event and use pending_* as boolean toggles.

 - observe that SIGTRAP targets the ctx-&gt;task, so the context switch
   optimization that carries contexts between tasks is invalid. If
   the irq_work were delayed enough to hit after a context switch the
   SIGTRAP would be delivered to the wrong task.

 - observe that if the event gets scheduled out
   (rotation/migration/context-switch/...) the irq-work would be
   insufficient to deliver the SIGTRAP when the event gets scheduled
   back in (the irq-work might still be pending on the old CPU).

   Therefore have event_sched_out() convert the pending sigtrap into a
   task_work which will deliver the signal at return_to_user.

Fixes: 97ba62b27867 ("perf: Add support for SIGTRAP on perf events")
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Debugged-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Reported-by: Marco Elver &lt;elver@google.com&gt;
Debugged-by: Marco Elver &lt;elver@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Marco Elver &lt;elver@google.com&gt;
Tested-by: Marco Elver &lt;elver@google.com&gt;
</content>
</entry>
<entry>
<title>perf: Fix lockdep_assert_event_ctx()</title>
<updated>2022-10-04T11:32:08Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2022-10-04T08:46:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0ce38047e82a02017839b6cae837f13a1383a3a0'/>
<id>urn:sha1:0ce38047e82a02017839b6cae837f13a1383a3a0</id>
<content type='text'>
I'm a flamin' moron; because even after Mark told me it should be '&amp;&amp;'
I still got it wrong in the final commit.

Fixes: f3c0eba28704 ("perf: Add a few assertions")
Reported-by: Borislav Petkov &lt;bp@alien8.de&gt;
Reported-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Tested-by: Borislav Petkov &lt;bp@alien8.de&gt;
Link: https://lkml.kernel.org/r/YvvIWmDBWdIUCMZj@FVFF77S0Q05N
</content>
</entry>
<entry>
<title>perf: Use sample_flags for raw_data</title>
<updated>2022-09-27T20:50:24Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2022-09-21T22:00:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=838d9bb62d132ec3baf1b5aba2e95ef9a7a9a3cd'/>
<id>urn:sha1:838d9bb62d132ec3baf1b5aba2e95ef9a7a9a3cd</id>
<content type='text'>
Use the new sample_flags to indicate whether the raw data field is
filled by the PMU driver.  Although it could check with the NULL,
follow the same rule with other fields.

Remove the raw field from the perf_sample_data_init() to minimize
the number of cache lines touched.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20220921220032.2858517-2-namhyung@kernel.org
</content>
</entry>
<entry>
<title>perf: Use sample_flags for addr</title>
<updated>2022-09-27T20:50:24Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2022-09-21T22:00:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7b084630153152239d84990ac4540c2dd360186f'/>
<id>urn:sha1:7b084630153152239d84990ac4540c2dd360186f</id>
<content type='text'>
Use the new sample_flags to indicate whether the addr field is filled by
the PMU driver.  As most PMU drivers pass 0, it can set the flag only if
it has a non-zero value.  And use 0 in perf_sample_output() if it's not
filled already.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20220921220032.2858517-1-namhyung@kernel.org
</content>
</entry>
<entry>
<title>perf: Add a few assertions</title>
<updated>2022-09-07T19:54:01Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2022-09-02T16:48:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f3c0eba287049237b23d1300376768293eb89e69'/>
<id>urn:sha1:f3c0eba287049237b23d1300376768293eb89e69</id>
<content type='text'>
While auditing 6b959ba22d34 ("perf/core: Fix reentry problem in
perf_output_read_group()") a few spots were found that wanted
assertions.

Notable for_each_sibling_event() relies on exclusion from
modification. This would normally be holding either ctx-&gt;lock or
ctx-&gt;mutex, however due to how things are constructed disabling IRQs
is a valid and sufficient substitute for ctx-&gt;lock.

Another possible site to add assertions would be the various
pmu::{add,del,read,..}() methods, but that's not trivially expressable
in C -- the best option is wrappers, but those are easy enough to
forget.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
</content>
</entry>
<entry>
<title>perf/core: Assert PERF_EVENT_FLAG_ARCH does not overlap with generic flags</title>
<updated>2022-09-07T19:54:00Z</updated>
<author>
<name>Anshuman Khandual</name>
<email>anshuman.khandual@arm.com</email>
</author>
<published>2022-09-07T09:19:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f67dd218fafd9de9a13d095e775b621db76a058f'/>
<id>urn:sha1:f67dd218fafd9de9a13d095e775b621db76a058f</id>
<content type='text'>
This just ensures that PERF_EVENT_FLAG_ARCH does not overlap with generic
hardware event flags.

Signed-off-by: Anshuman Khandual &lt;anshuman.khandual@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: James Clark &lt;james.clark@arm.com&gt;
Link: https://lkml.kernel.org/r/20220907091924.439193-3-anshuman.khandual@arm.com
</content>
</entry>
</feed>
