user/sven/linux.git/kernel/events/internal.h, branch v6.12.10

perf/aux: Fix AUX buffer serialization

2024-09-04T16:22:56Z

Ole reported that event->mmap_mutex is strictly insufficient to serialize the AUX buffer, add a per RB mutex to fully serialize it. Note that in the lock order comment the perf_event::mmap_mutex order was already wrong, that is, it nesting under mmap_lock is not new with this patch. Fixes: 45bfb2e50471 ("perf: Add AUX area to ring buffer for raw data streams") Reported-by: Ole Signed-off-by: Peter Zijlstra (Intel) Signed-off-by: Ingo Molnar

perf: Move swevent_htable::recursion into task_struct.

2024-07-09T11:26:36Z

The swevent_htable::recursion counter is used to avoid creating an swevent while an event is processed to avoid recursion. The counter is per-CPU and preemption must be disabled to have a stable counter. perf_pending_task() disables preemption to access the counter and then signal. This is problematic on PREEMPT_RT because sending a signal uses a spinlock_t which must not be acquired in atomic on PREEMPT_RT because it becomes a sleeping lock. The atomic context can be avoided by moving the counter into the task_struct. There is a 4 byte hole between futex_state (usually always on) and the following perf pointer (perf_event_ctxp). After the recursion lost some weight it fits perfectly. Move swevent_htable::recursion into task_struct. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Peter Zijlstra (Intel) Tested-by: Marco Elver Link: https://lore.kernel.org/r/20240704170424.1466941-6-bigeasy@linutronix.de

perf: Shrink the size of the recursion counter.

2024-07-09T11:26:35Z

There are four recursion counter, one for each context. The type of the counter is `int' but the counter is used as `bool' since it is only incremented if zero. The main goal here is to shrink the whole struct into 32bit int which can later be added task_struct into an existing hole. Reduce the type of the recursion counter to an unsigned char, keep the increment/ decrement operation. Signed-off-by: Sebastian Andrzej Siewior Signed-off-by: Peter Zijlstra (Intel) Tested-by: Marco Elver Link: https://lore.kernel.org/r/20240704170424.1466941-5-bigeasy@linutronix.de

perf: Fix perf_aux_size() for greater-than 32-bit size

2024-07-04T14:00:22Z

perf_buffer->aux_nr_pages uses a 32-bit type, so a cast is needed to calculate a 64-bit size. Fixes: 45bfb2e50471 ("perf: Add AUX area to ring buffer for raw data streams") Signed-off-by: Adrian Hunter Signed-off-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20240624201101.60186-5-adrian.hunter@intel.com

perf/core: Fix perf_mmap fail when CONFIG_PERF_USE_VMALLOC enabled

2022-04-19T19:15:42Z

This problem can be reproduced with CONFIG_PERF_USE_VMALLOC enabled on both x86_64 and aarch64 arch when using sysdig -B(using ebpf)[1]. sysdig -B works fine after rebuilding the kernel with CONFIG_PERF_USE_VMALLOC disabled. I tracked it down to the if condition event->rb->nr_pages != nr_pages in perf_mmap is true when CONFIG_PERF_USE_VMALLOC is enabled where event->rb->nr_pages = 1 and nr_pages = 2048 resulting perf_mmap to return -EINVAL. This is because when CONFIG_PERF_USE_VMALLOC is enabled, rb->nr_pages is always equal to 1. Arch with CONFIG_PERF_USE_VMALLOC enabled by default: arc/arm/csky/mips/sh/sparc/xtensa Arch with CONFIG_PERF_USE_VMALLOC disabled by default: x86_64/aarch64/... Fix this problem by using data_page_nr() [1] https://github.com/draios/sysdig Fixes: 906010b2134e ("perf_event: Provide vmalloc() based mmap() backing") Signed-off-by: Zhipeng Xie Signed-off-by: Peter Zijlstra (Intel) Link: https://lkml.kernel.org/r/20220209145417.6495-1-xiezhipeng1@huawei.com

tracing/perf: Add interrupt_context_level() helper

2021-10-20T00:33:20Z

Now that there are three different instances of doing the addition trick to the preempt_count() and NMI_MASK, HARDIRQ_MASK and SOFTIRQ_OFFSET macros, it deserves a helper function defined in the preempt.h header. Add the interrupt_context_level() helper and replace the three instances that do that logic with it. Link: https://lore.kernel.org/all/20211015142541.4badd8a9@gandalf.local.home/ Signed-off-by: Steven Rostedt (VMware)

perf: Optimize get_recursion_context()

2020-11-09T17:12:34Z

"Look ma, no branches!" Signed-off-by: Peter Zijlstra (Intel) Acked-by: Jesper Dangaard Brouer Link: https://lkml.kernel.org/r/20201030151955.187580298@infradead.org

perf: Fix get_recursion_context()

2020-11-09T17:12:34Z

One should use in_serving_softirq() to detect SoftIRQ context. Fixes: 96f6d4444302 ("perf_counter: avoid recursion") Signed-off-by: Peter Zijlstra (Intel) Link: https://lkml.kernel.org/r/20201030151955.120572175@infradead.org

perf/core: Replace zero-length array with flexible-array

2020-05-19T18:34:16Z

The current codebase makes use of the zero-length array language extension to the C90 standard, but the preferred mechanism to declare variable-length types such as these ones is a flexible array member[1][2], introduced in C99: struct foo { int stuff; struct boo array[]; }; By making use of the mechanism above, we will get a compiler warning in case the flexible array does not occur last in the structure, which will help us prevent some kind of undefined behavior bugs from being inadvertently introduced[3] to the codebase from now on. Also, notice that, dynamic memory allocations won't be affected by this change: "Flexible array members have incomplete type, and so the sizeof operator may not be applied. As a quirk of the original implementation of zero-length arrays, sizeof evaluates to zero."[1] sizeof(flexible-array-member) triggers a warning because flexible array members have incomplete type[1]. There are some instances of code in which the sizeof operator is being incorrectly/erroneously applied to zero-length arrays and the result is zero. Such instances may be hiding some bugs. So, this work (flexible-array member conversions) will also help to get completely rid of those sorts of issues. This issue was found with the help of Coccinelle. [1] https://gcc.gnu.org/onlinedocs/gcc/Zero-Length.html [2] https://github.com/KSPP/linux/issues/21 [3] commit 76497732932f ("cxgb3/l2t: Fix undefined behaviour") Signed-off-by: Gustavo A. R. Silva Signed-off-by: Peter Zijlstra (Intel) Link: https://lkml.kernel.org/r/20200511201227.GA14041@embeddedor

perf: Make struct ring_buffer less ambiguous

2020-01-13T18:19:38Z

eBPF requires needing to know the size of the perf ring buffer structure. But it unfortunately has the same name as the generic ring buffer used by tracing and oprofile. To make it less ambiguous, rename the perf ring buffer structure to "perf_buffer". As other parts of the ring buffer code has "perf_" as the prefix, it only makes sense to give the ring buffer the "perf_" prefix as well. Link: https://lore.kernel.org/r/20191213153553.GE20583@krava Acked-by: Peter Zijlstra Suggested-by: Alexei Starovoitov Signed-off-by: Steven Rostedt (VMware)