<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/events, branch v5.15.55</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.15.55</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.15.55'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2022-06-09T08:22:48Z</updated>
<entry>
<title>signal: Deliver SIGTRAP on perf event asynchronously if blocked</title>
<updated>2022-06-09T08:22:48Z</updated>
<author>
<name>Marco Elver</name>
<email>elver@google.com</email>
</author>
<published>2022-04-04T11:12:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=60768ffcede23197efbceb391437c9e22d9c7718'/>
<id>urn:sha1:60768ffcede23197efbceb391437c9e22d9c7718</id>
<content type='text'>
[ Upstream commit 78ed93d72ded679e3caf0758357209887bda885f ]

With SIGTRAP on perf events, we have encountered termination of
processes due to user space attempting to block delivery of SIGTRAP.
Consider this case:

    &lt;set up SIGTRAP on a perf event&gt;
    ...
    sigset_t s;
    sigemptyset(&amp;s);
    sigaddset(&amp;s, SIGTRAP | &lt;and others&gt;);
    sigprocmask(SIG_BLOCK, &amp;s, ...);
    ...
    &lt;perf event triggers&gt;

When the perf event triggers, while SIGTRAP is blocked, force_sig_perf()
will force the signal, but revert back to the default handler, thus
terminating the task.

This makes sense for error conditions, but not so much for explicitly
requested monitoring. However, the expectation is still that signals
generated by perf events are synchronous, which will no longer be the
case if the signal is blocked and delivered later.

To give user space the ability to clearly distinguish synchronous from
asynchronous signals, introduce siginfo_t::si_perf_flags and
TRAP_PERF_FLAG_ASYNC (opted for flags in case more binary information is
required in future).

The resolution to the problem is then to (a) no longer force the signal
(avoiding the terminations), but (b) tell user space via si_perf_flags
if the signal was synchronous or not, so that such signals can be
handled differently (e.g. let user space decide to ignore or consider
the data imprecise).

The alternative of making the kernel ignore SIGTRAP on perf events if
the signal is blocked may work for some usecases, but likely causes
issues in others that then have to revert back to interception of
sigprocmask() (which we want to avoid). [ A concrete example: when using
breakpoint perf events to track data-flow, in a region of code where
signals are blocked, data-flow can no longer be tracked accurately.
When a relevant asynchronous signal is received after unblocking the
signal, the data-flow tracking logic needs to know its state is
imprecise. ]

Fixes: 97ba62b27867 ("perf: Add support for SIGTRAP on perf events")
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Signed-off-by: Marco Elver &lt;elver@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Tested-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Link: https://lore.kernel.org/r/20220404111204.935357-1-elver@google.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf: Fix sys_perf_event_open() race against self</title>
<updated>2022-05-25T07:57:27Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2022-05-20T18:38:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e085354dde254bc6c83ee604ea66c2b36f9f9067'/>
<id>urn:sha1:e085354dde254bc6c83ee604ea66c2b36f9f9067</id>
<content type='text'>
commit 3ac6487e584a1eb54071dbe1212e05b884136704 upstream.

Norbert reported that it's possible to race sys_perf_event_open() such
that the looser ends up in another context from the group leader,
triggering many WARNs.

The move_group case checks for races against itself, but the
!move_group case doesn't, seemingly relying on the previous
group_leader-&gt;ctx == ctx check. However, that check is racy due to not
holding any locks at that time.

Therefore, re-check the result after acquiring locks and bailing
if they no longer match.

Additionally, clarify the not_move_group case from the
move_group-vs-move_group race.

Fixes: f63a8daa5812 ("perf: Fix event-&gt;ctx locking")
Reported-by: Norbert Slusarek &lt;nslusarek@gmx.net&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>perf/core: Fix perf_mmap fail when CONFIG_PERF_USE_VMALLOC enabled</title>
<updated>2022-04-27T12:38:58Z</updated>
<author>
<name>Zhipeng Xie</name>
<email>xiezhipeng1@huawei.com</email>
</author>
<published>2022-02-09T14:54:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=56637084e8a551e7896d846f1e96415681973cbd'/>
<id>urn:sha1:56637084e8a551e7896d846f1e96415681973cbd</id>
<content type='text'>
[ Upstream commit 60490e7966659b26d74bf1fa4aa8693d9a94ca88 ]

This problem can be reproduced with CONFIG_PERF_USE_VMALLOC enabled on
both x86_64 and aarch64 arch when using sysdig -B(using ebpf)[1].
sysdig -B works fine after rebuilding the kernel with
CONFIG_PERF_USE_VMALLOC disabled.

I tracked it down to the if condition event-&gt;rb-&gt;nr_pages != nr_pages
in perf_mmap is true when CONFIG_PERF_USE_VMALLOC is enabled where
event-&gt;rb-&gt;nr_pages = 1 and nr_pages = 2048 resulting perf_mmap to
return -EINVAL. This is because when CONFIG_PERF_USE_VMALLOC is
enabled, rb-&gt;nr_pages is always equal to 1.

Arch with CONFIG_PERF_USE_VMALLOC enabled by default:
	arc/arm/csky/mips/sh/sparc/xtensa

Arch with CONFIG_PERF_USE_VMALLOC disabled by default:
	x86_64/aarch64/...

Fix this problem by using data_page_nr()

[1] https://github.com/draios/sysdig

Fixes: 906010b2134e ("perf_event: Provide vmalloc() based mmap() backing")
Signed-off-by: Zhipeng Xie &lt;xiezhipeng1@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20220209145417.6495-1-xiezhipeng1@huawei.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/core: Inherit event_caps</title>
<updated>2022-04-13T18:59:24Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2022-03-28T20:01:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fcf9ccf9d26a13d4a10ee6b167e429d96db97fe4'/>
<id>urn:sha1:fcf9ccf9d26a13d4a10ee6b167e429d96db97fe4</id>
<content type='text'>
commit e3265a4386428d3d157d9565bb520aabff8b4bf0 upstream.

It was reported that some perf event setup can make fork failed on
ARM64.  It was the case of a group of mixed hw and sw events and it
failed in perf_event_init_task() due to armpmu_event_init().

The ARM PMU code checks if all the events in a group belong to the
same PMU except for software events.  But it didn't set the event_caps
of inherited events and no longer identify them as software events.
Therefore the test failed in a child process.

A simple reproducer is:

  $ perf stat -e '{cycles,cs,instructions}' perf bench sched messaging
  # Running 'sched/messaging' benchmark:
  perf: fork(): Invalid argument

The perf stat was fine but the perf bench failed in fork().  Let's
inherit the event caps from the parent.

Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: https://lkml.kernel.org/r/20220328200112.457740-1-namhyung@kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>perf/core: Fix address filter parser for multiple filters</title>
<updated>2022-04-08T12:23:10Z</updated>
<author>
<name>Adrian Hunter</name>
<email>adrian.hunter@intel.com</email>
</author>
<published>2022-01-31T07:24:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=929d8a87f709cb39f56c0b7afcace940e282117e'/>
<id>urn:sha1:929d8a87f709cb39f56c0b7afcace940e282117e</id>
<content type='text'>
[ Upstream commit d680ff24e9e14444c63945b43a37ede7cd6958f9 ]

Reset appropriate variables in the parser loop between parsing separate
filters, so that they do not interfere with parsing the next filter.

Fixes: 375637bc524952 ("perf/core: Introduce address range filtering")
Signed-off-by: Adrian Hunter &lt;adrian.hunter@intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220131072453.2839535-4-adrian.hunter@intel.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf: Fix list corruption in perf_cgroup_switch()</title>
<updated>2022-02-16T11:56:41Z</updated>
<author>
<name>Song Liu</name>
<email>song@kernel.org</email>
</author>
<published>2022-02-04T00:40:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7969fe91c9830e045901970e9d755b7505881d4a'/>
<id>urn:sha1:7969fe91c9830e045901970e9d755b7505881d4a</id>
<content type='text'>
commit 5f4e5ce638e6a490b976ade4a40017b40abb2da0 upstream.

There's list corruption on cgrp_cpuctx_list. This happens on the
following path:

  perf_cgroup_switch: list_for_each_entry(cgrp_cpuctx_list)
      cpu_ctx_sched_in
         ctx_sched_in
            ctx_pinned_sched_in
              merge_sched_in
                  perf_cgroup_event_disable: remove the event from the list

Use list_for_each_entry_safe() to allow removing an entry during
iteration.

Fixes: 058fe1c0440e ("perf/core: Make cgroup switch visit only cpuctxs with cgroup events")
Signed-off-by: Song Liu &lt;song@kernel.org&gt;
Reviewed-by: Rik van Riel &lt;riel@surriel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20220204004057.2961252-1-song@kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>perf: Always wake the parent event</title>
<updated>2022-02-16T11:56:15Z</updated>
<author>
<name>James Clark</name>
<email>james.clark@arm.com</email>
</author>
<published>2021-12-06T11:38:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a3486ef99a3bd198ef8fca4ff7bc68e97ddea9d9'/>
<id>urn:sha1:a3486ef99a3bd198ef8fca4ff7bc68e97ddea9d9</id>
<content type='text'>
[ Upstream commit 961c39121759ad09a89598ec4ccdd34ae0468a19 ]

When using per-process mode and event inheritance is set to true,
forked processes will create a new perf events via inherit_event() -&gt;
perf_event_alloc(). But these events will not have ring buffers
assigned to them. Any call to wakeup will be dropped if it's called on
an event with no ring buffer assigned because that's the object that
holds the wakeup list.

If the child event is disabled due to a call to
perf_aux_output_begin() or perf_aux_output_end(), the wakeup is
dropped leaving userspace hanging forever on the poll.

Normally the event is explicitly re-enabled by userspace after it
wakes up to read the aux data, but in this case it does not get woken
up so the event remains disabled.

This can be reproduced when using Arm SPE and 'stress' which forks once
before running the workload. By looking at the list of aux buffers read,
it's apparent that they stop after the fork:

  perf record -e arm_spe// -vvv -- stress -c 1

With this patch applied they continue to be printed. This behaviour
doesn't happen when using systemwide or per-cpu mode.

Reported-by: Ruben Ayrapetyan &lt;Ruben.Ayrapetyan@arm.com&gt;
Signed-off-by: James Clark &lt;james.clark@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20211206113840.130802-2-james.clark@arm.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf: Copy perf_event_attr::sig_data on modification</title>
<updated>2022-02-08T17:34:12Z</updated>
<author>
<name>Marco Elver</name>
<email>elver@google.com</email>
</author>
<published>2022-01-31T10:34:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=64e133ce280b78b3d42d45d348484ac3fe0eb7e5'/>
<id>urn:sha1:64e133ce280b78b3d42d45d348484ac3fe0eb7e5</id>
<content type='text'>
[ Upstream commit 3c25fc97f5590060464cabfa25710970ecddbc96 ]

The intent has always been that perf_event_attr::sig_data should also be
modifiable along with PERF_EVENT_IOC_MODIFY_ATTRIBUTES, because it is
observable by user space if SIGTRAP on events is requested.

Currently only PERF_TYPE_BREAKPOINT is modifiable, and explicitly copies
relevant breakpoint-related attributes in hw_breakpoint_copy_attr().
This misses copying perf_event_attr::sig_data.

Since sig_data is not specific to PERF_TYPE_BREAKPOINT, introduce a
helper to copy generic event-type-independent attributes on
modification.

Fixes: 97ba62b27867 ("perf: Add support for SIGTRAP on perf events")
Reported-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Signed-off-by: Marco Elver &lt;elver@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Link: https://lore.kernel.org/r/20220131103407.1971678-1-elver@google.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/core: Fix cgroup event list management</title>
<updated>2022-02-01T16:27:15Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2022-01-24T19:58:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fd26531855e62fa210015dcd493e1c8c3814aea1'/>
<id>urn:sha1:fd26531855e62fa210015dcd493e1c8c3814aea1</id>
<content type='text'>
commit c5de60cd622a2607c043ba65e25a6e9998a369f9 upstream.

The active cgroup events are managed in the per-cpu cgrp_cpuctx_list.
This list is only accessed from current cpu and not protected by any
locks.  But from the commit ef54c1a476ae ("perf: Rework
perf_event_exit_event()"), it's possible to access (actually modify)
the list from another cpu.

In the perf_remove_from_context(), it can remove an event from the
context without an IPI when the context is not active.  This is not
safe with cgroup events which can have some active events in the
context even if ctx-&gt;is_active is 0 at the moment.  The target cpu
might be in the middle of list iteration at the same time.

If the event is enabled when it's about to be closed, it might call
perf_cgroup_event_disable() and list_del() with the cgrp_cpuctx_list
on a different cpu.

This resulted in a crash due to an invalid list pointer access during
the cgroup list traversal on the cpu which the event belongs to.

Let's fallback to IPI to access the cgrp_cpuctx_list from that cpu.
Similarly, perf_install_in_context() should use IPI for the cgroup
events too.

Fixes: ef54c1a476ae ("perf: Rework perf_event_exit_event()")
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20220124195808.2252071-1-namhyung@kernel.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf: Fix perf_event_read_local() time</title>
<updated>2022-02-01T16:27:10Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2021-12-20T12:19:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=767060539ac4fe683300a9004e7956b1f6dbd657'/>
<id>urn:sha1:767060539ac4fe683300a9004e7956b1f6dbd657</id>
<content type='text'>
[ Upstream commit 09f5e7dc7ad705289e1b1ec065439aa3c42951c4 ]

Time readers that cannot take locks (due to NMI etc..) currently make
use of perf_event::shadow_ctx_time, which, for that event gives:

  time' = now + (time - timestamp)

or, alternatively arranged:

  time' = time + (now - timestamp)

IOW, the progression of time since the last time the shadow_ctx_time
was updated.

There's problems with this:

 A) the shadow_ctx_time is per-event, even though the ctx_time it
    reflects is obviously per context. The direct concequence of this
    is that the context needs to iterate all events all the time to
    keep the shadow_ctx_time in sync.

 B) even with the prior point, the context itself might not be active
    meaning its time should not advance to begin with.

 C) shadow_ctx_time isn't consistently updated when ctx_time is

There are 3 users of this stuff, that suffer differently from this:

 - calc_timer_values()
   - perf_output_read()
   - perf_event_update_userpage()	/* A */

 - perf_event_read_local()		/* A,B */

In particular, perf_output_read() doesn't suffer at all, because it's
sample driven and hence only relevant when the event is actually
running.

This same was supposed to be true for perf_event_update_userpage(),
after all self-monitoring implies the context is active *HOWEVER*, as
per commit f79256532682 ("perf/core: fix userpage-&gt;time_enabled of
inactive events") this goes wrong when combined with counter
overcommit, in that case those events that do not get scheduled when
the context becomes active (task events typically) miss out on the
EVENT_TIME update and ENABLED time is inflated (for a little while)
with the time the context was inactive. Once the event gets rotated
in, this gets corrected, leading to a non-monotonic timeflow.

perf_event_read_local() made things even worse, it can request time at
any point, suffering all the problems perf_event_update_userpage()
does and more. Because while perf_event_update_userpage() is limited
by the context being active, perf_event_read_local() users have no
such constraint.

Therefore, completely overhaul things and do away with
perf_event::shadow_ctx_time. Instead have regular context time updates
keep track of this offset directly and provide perf_event_time_now()
to complement perf_event_time().

perf_event_time_now() will, in adition to being context wide, also
take into account if the context is active. For inactive context, it
will not advance time.

This latter property means the cgroup perf_cgroup_info context needs
to grow addition state to track this.

Additionally, since all this is strictly per-cpu, we can use barrier()
to order context activity vs context time.

Fixes: 7d9285e82db5 ("perf/bpf: Extend the perf_event_read_local() interface, a.k.a. "bpf: perf event change needed for subsequent bpf helpers"")
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Tested-by: Song Liu &lt;song@kernel.org&gt;
Tested-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Link: https://lkml.kernel.org/r/YcB06DasOBtU0b00@hirez.programming.kicks-ass.net
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
