<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/perf_event.h, branch v6.7.9</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.7.9</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.7.9'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2023-11-15T03:18:31Z</updated>
<entry>
<title>perf/core: Fix cpuctx refcounting</title>
<updated>2023-11-15T03:18:31Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2023-06-09T10:34:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=889c58b3155ff4c8e8671c95daef63d6fabbb6b1'/>
<id>urn:sha1:889c58b3155ff4c8e8671c95daef63d6fabbb6b1</id>
<content type='text'>
Audit of the refcounting turned up that perf_pmu_migrate_context()
fails to migrate the ctx refcount.

Fixes: bd2756811766 ("perf: Rewrite core context handling")
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lkml.kernel.org/r/20230612093539.085862001@infradead.org
Cc: &lt;stable@vger.kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'perf-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2023-10-30T23:44:35Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-10-30T23:44:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bceb7accb7b60f9844807c7433af06493ed058b7'/>
<id>urn:sha1:bceb7accb7b60f9844807c7433af06493ed058b7</id>
<content type='text'>
Pull performance event updates from Ingo Molnar:
 - Add AMD Unified Memory Controller (UMC) events introduced with Zen 4
 - Simplify &amp; clean up the uncore management code
 - Fall back from RDPMC to RDMSR on certain uncore PMUs
 - Improve per-package and cstate event reading
 - Extend the Intel ref-cycles event to GP counters
 - Fix Intel MTL event constraints
 - Improve the Intel hybrid CPU handling code
 - Micro-optimize the RAPL code
 - Optimize perf_cgroup_switch()
 - Improve large AUX area error handling
 - Misc fixes and cleanups

* tag 'perf-core-2023-10-28' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (26 commits)
  perf/x86/amd/uncore: Pass through error code for initialization failures, instead of -ENODEV
  perf/x86/amd/uncore: Fix uninitialized return value in amd_uncore_init()
  x86/cpu: Fix the AMD Fam 17h, Fam 19h, Zen2 and Zen4 MSR enumerations
  perf: Optimize perf_cgroup_switch()
  perf/x86/amd/uncore: Add memory controller support
  perf/x86/amd/uncore: Add group exclusivity
  perf/x86/amd/uncore: Use rdmsr if rdpmc is unavailable
  perf/x86/amd/uncore: Move discovery and registration
  perf/x86/amd/uncore: Refactor uncore management
  perf/core: Allow reading package events from perf_event_read_local
  perf/x86/cstate: Allow reading the package statistics from local CPU
  perf/x86/intel/pt: Fix kernel-doc comments
  perf/x86/rapl: Annotate 'struct rapl_pmus' with __counted_by
  perf/core: Rename perf_proc_update_handler() -&gt; perf_event_max_sample_rate_handler(), for readability
  perf/x86/rapl: Fix "Using plain integer as NULL pointer" Sparse warning
  perf/x86/rapl: Use local64_try_cmpxchg in rapl_event_update()
  perf/x86/rapl: Stop doing cpu_relax() in the local64_cmpxchg() loop in rapl_event_update()
  perf/core: Bail out early if the request AUX area is out of bound
  perf/x86/intel: Extend the ref-cycles event to GP counters
  perf/x86/intel: Fix broken fixed event constraints extension
  ...
</content>
</entry>
<entry>
<title>perf: Disallow mis-matched inherited group reads</title>
<updated>2023-10-19T08:09:42Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2023-10-18T11:56:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=32671e3799ca2e4590773fd0e63aaa4229e50c06'/>
<id>urn:sha1:32671e3799ca2e4590773fd0e63aaa4229e50c06</id>
<content type='text'>
Because group consistency is non-atomic between parent (filedesc) and children
(inherited) events, it is possible for PERF_FORMAT_GROUP read() to try and sum
non-matching counter groups -- with non-sensical results.

Add group_generation to distinguish the case where a parent group removes and
adds an event and thus has the same number, but a different configuration of
events as inherited groups.

This became a problem when commit fa8c269353d5 ("perf/core: Invert
perf_read_group() loops") flipped the order of child_list and sibling_list.
Previously it would iterate the group (sibling_list) first, and for each
sibling traverse the child_list. In this order, only the group composition of
the parent is relevant. By flipping the order the group composition of the
child (inherited) events becomes an issue and the mis-match in group
composition becomes evident.

That said; even prior to this commit, while reading of a group that is not
equally inherited was not broken, it still made no sense.

(Ab)use ECHILD as error return to indicate issues with child process group
composition.

Fixes: fa8c269353d5 ("perf/core: Invert perf_read_group() loops")
Reported-by: Budimir Markovic &lt;markovicbudimir@gmail.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20231018115654.GK33217@noisy.programming.kicks-ass.net
</content>
</entry>
<entry>
<title>perf: Optimize perf_cgroup_switch()</title>
<updated>2023-10-12T17:28:38Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2023-10-09T21:04:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f06cc667f79909e9175460b167c277b7c64d3df0'/>
<id>urn:sha1:f06cc667f79909e9175460b167c277b7c64d3df0</id>
<content type='text'>
Namhyung reported that bd2756811766 ("perf: Rewrite core context handling")
regresses context switch overhead when perf-cgroup is in use together
with 'slow' PMUs like uncore.

Specifically, perf_cgroup_switch()'s perf_ctx_disable() /
ctx_sched_out() etc.. all iterate the full list of active PMUs for
that CPU, even if they don't have cgroup events.

Previously there was cgrp_cpuctx_list which linked the relevant PMUs
together, but that got lost in the rework. Instead of re-instruducing
a similar list, let the perf_event_pmu_context iteration skip those
that do not have cgroup events. This avoids growing multiple versions
of the perf_event_pmu_context iteration.

Measured performance (on a slightly different patch):

Before)

  $ taskset -c 0 ./perf bench sched pipe -l 10000 -G AAA,BBB
  # Running 'sched/pipe' benchmark:
  # Executed 10000 pipe operations between two processes

       Total time: 0.901 [sec]

        90.128700 usecs/op
            11095 ops/sec

After)

  $ taskset -c 0 ./perf bench sched pipe -l 10000 -G AAA,BBB
  # Running 'sched/pipe' benchmark:
  # Executed 10000 pipe operations between two processes

       Total time: 0.065 [sec]

         6.560100 usecs/op
           152436 ops/sec

Fixes: bd2756811766 ("perf: Rewrite core context handling")
Reported-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Debugged-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20231009210425.GC6307@noisy.programming.kicks-ass.net
</content>
</entry>
<entry>
<title>perf/core: Rename perf_proc_update_handler() -&gt; perf_event_max_sample_rate_handler(), for readability</title>
<updated>2023-10-05T08:26:50Z</updated>
<author>
<name>Xiu Jianfeng</name>
<email>xiujianfeng@huawei.com</email>
</author>
<published>2023-07-21T09:06:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e6814ec3ba1994561db9b1c05a80227d30cc18fa'/>
<id>urn:sha1:e6814ec3ba1994561db9b1c05a80227d30cc18fa</id>
<content type='text'>
Follow the naming pattern of the other sysctl handlers in perf.

Signed-off-by: Xiu Jianfeng &lt;xiujianfeng@huawei.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/20230721090607.172002-1-xiujianfeng@huawei.com
</content>
</entry>
<entry>
<title>Merge tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux</title>
<updated>2023-09-01T15:09:48Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-09-01T15:09:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e0152e7481c6c63764d6ea8ee41af5cf9dfac5e9'/>
<id>urn:sha1:e0152e7481c6c63764d6ea8ee41af5cf9dfac5e9</id>
<content type='text'>
Pull RISC-V updates from Palmer Dabbelt:

 - Support for the new "riscv,isa-extensions" and "riscv,isa-base"
   device tree interfaces for probing extensions

 - Support for userspace access to the performance counters

 - Support for more instructions in kprobes

 - Crash kernels can be allocated above 4GiB

 - Support for KCFI

 - Support for ELFs in !MMU configurations

 - ARCH_KMALLOC_MINALIGN has been reduced to 8

 - mmap() defaults to sv48-sized addresses, with longer addresses hidden
   behind a hint (similar to Arm and Intel)

 - Also various fixes and cleanups

* tag 'riscv-for-linus-6.6-mw1' of git://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux: (51 commits)
  lib/Kconfig.debug: Restrict DEBUG_INFO_SPLIT for RISC-V
  riscv: support PREEMPT_DYNAMIC with static keys
  riscv: Move create_tmp_mapping() to init sections
  riscv: Mark KASAN tmp* page tables variables as static
  riscv: mm: use bitmap_zero() API
  riscv: enable DEBUG_FORCE_FUNCTION_ALIGN_64B
  riscv: remove redundant mv instructions
  RISC-V: mm: Document mmap changes
  RISC-V: mm: Update pgtable comment documentation
  RISC-V: mm: Add tests for RISC-V mm
  RISC-V: mm: Restrict address space for sv39,sv48,sv57
  riscv: enable DMA_BOUNCE_UNALIGNED_KMALLOC for !dma_coherent
  riscv: allow kmalloc() caches aligned to the smallest value
  riscv: support the elf-fdpic binfmt loader
  binfmt_elf_fdpic: support 64-bit systems
  riscv: Allow CONFIG_CFI_CLANG to be selected
  riscv/purgatory: Disable CFI
  riscv: Add CFI error handling
  riscv: Add ftrace_stub_graph
  riscv: Add types to indirectly called assembly functions
  ...
</content>
</entry>
<entry>
<title>Merge tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux</title>
<updated>2023-08-29T00:34:54Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2023-08-29T00:34:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=542034175ca715d28522a59c1f88a58349f2765c'/>
<id>urn:sha1:542034175ca715d28522a59c1f88a58349f2765c</id>
<content type='text'>
Pull arm64 updates from Will Deacon:
 "I think we have a bit less than usual on the architecture side, but
  that's somewhat balanced out by a large crop of perf/PMU driver
  updates and extensions to our selftests.

  CPU features and system registers:

   - Advertise hinted conditional branch support (FEAT_HBC) to userspace

   - Avoid false positive "SANITY CHECK" warning when xCR registers
     differ outside of the length field

  Documentation:

   - Fix macro name typo in SME documentation

  Entry code:

   - Unmask exceptions earlier on the system call entry path

  Memory management:

   - Don't bother clearing PTE_RDONLY for dirty ptes in pte_wrprotect()
     and pte_modify()

  Perf and PMU drivers:

   - Initial support for Coresight TRBE devices on ACPI systems (the
     coresight driver changes will come later)

   - Fix hw_breakpoint single-stepping when called from bpf

   - Fixes for DDR PMU on i.MX8MP SoC

   - Add NUMA-awareness to Hisilicon PCIe PMU driver

   - Fix locking dependency issue in Arm DMC620 PMU driver

   - Workaround Hisilicon erratum 162001900 in the SMMUv3 PMU driver

   - Add support for Arm CMN-700 r3 parts to the CMN PMU driver

   - Add support for recent Arm Cortex CPU PMUs

   - Update Hisilicon PMU maintainers

  Selftests:

   - Add a bunch of new features to the hwcap test (JSCVT, PMULL, AES,
     SHA1, etc)

   - Fix SSVE test to leave streaming-mode after grabbing the signal
     context

   - Add new test for SVE vector-length changes with SME enabled

  Miscellaneous:

   - Allow compiler to warn on suspicious looking system register
     expressions

   - Work around SDEI firmware bug by aborting any running handlers on a
     kernel crash

   - Fix some harmless warnings when building with W=1

   - Remove some unused function declarations

   - Other minor fixes and cleanup"

* tag 'arm64-upstream' of git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux: (62 commits)
  drivers/perf: hisi: Update HiSilicon PMU maintainers
  arm_pmu: acpi: Add a representative platform device for TRBE
  arm_pmu: acpi: Refactor arm_spe_acpi_register_device()
  kselftest/arm64: Fix hwcaps selftest build
  hw_breakpoint: fix single-stepping when using bpf_overflow_handler
  arm64/sysreg: refactor deprecated strncpy
  kselftest/arm64: add jscvt feature to hwcap test
  kselftest/arm64: add pmull feature to hwcap test
  kselftest/arm64: add AES feature check to hwcap test
  kselftest/arm64: add SHA1 and related features to hwcap test
  arm64: sysreg: Generate C compiler warnings on {read,write}_sysreg_s arguments
  kselftest/arm64: build BTI tests in output directory
  perf/imx_ddr: don't enable counter0 if none of 4 counters are used
  perf/imx_ddr: speed up overflow frequency of cycle
  drivers/perf: hisi: Schedule perf session according to locality
  kselftest/arm64: fix a memleak in zt_regs_run()
  perf/arm-dmc620: Fix dmc620_pmu_irqs_lock/cpu_hotplug_lock circular lock dependency
  perf/smmuv3: Add MODULE_ALIAS for module auto loading
  perf/smmuv3: Enable HiSilicon Erratum 162001900 quirk for HIP08/09
  kselftest/arm64: Size sycall-abi buffers for the actual maximum VL
  ...
</content>
</entry>
<entry>
<title>hw_breakpoint: fix single-stepping when using bpf_overflow_handler</title>
<updated>2023-08-18T16:04:09Z</updated>
<author>
<name>Tomislav Novak</name>
<email>tnovak@meta.com</email>
</author>
<published>2023-06-05T19:19:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d11a69873d9a7435fe6a48531e165ab80a8b1221'/>
<id>urn:sha1:d11a69873d9a7435fe6a48531e165ab80a8b1221</id>
<content type='text'>
Arm platforms use is_default_overflow_handler() to determine if the
hw_breakpoint code should single-step over the breakpoint trigger or
let the custom handler deal with it.

Since bpf_overflow_handler() currently isn't recognized as a default
handler, attaching a BPF program to a PERF_TYPE_BREAKPOINT event causes
it to keep firing (the instruction triggering the data abort exception
is never skipped). For example:

  # bpftrace -e 'watchpoint:0x10000:4:w { print("hit") }' -c ./test
  Attaching 1 probe...
  hit
  hit
  [...]
  ^C

(./test performs a single 4-byte store to 0x10000)

This patch replaces the check with uses_default_overflow_handler(),
which accounts for the bpf_overflow_handler() case by also testing
if one of the perf_event_output functions gets invoked indirectly,
via orig_default_handler.

Signed-off-by: Tomislav Novak &lt;tnovak@meta.com&gt;
Tested-by: Samuel Gosselin &lt;sgosselin@google.com&gt; # arm64
Reviewed-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Link: https://lore.kernel.org/linux-arm-kernel/20220923203644.2731604-1-tnovak@fb.com/
Link: https://lore.kernel.org/r/20230605191923.1219974-1-tnovak@meta.com
Signed-off-by: Will Deacon &lt;will@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf: Fix wrong comment about default event_idx</title>
<updated>2023-08-16T14:28:15Z</updated>
<author>
<name>Alexandre Ghiti</name>
<email>alexghiti@rivosinc.com</email>
</author>
<published>2023-08-02T08:03:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=366d259ff597e81d90639ae21269b3f82cd4ebb7'/>
<id>urn:sha1:366d259ff597e81d90639ae21269b3f82cd4ebb7</id>
<content type='text'>
Since commit c719f56092ad ("perf: Fix and clean up initialization of
pmu::event_idx"), event_idx default implementation has returned 0, not
idx + 1, so fix the comment that can be misleading.

Signed-off-by: Alexandre Ghiti &lt;alexghiti@rivosinc.com&gt;
Reviewed-by: Andrew Jones &lt;ajones@ventanamicro.com&gt;
Reviewed-by: Atish Patra &lt;atishp@rivosinc.com&gt;
</content>
</entry>
<entry>
<title>perf: Remove unused extern declaration arch_perf_get_page_size()</title>
<updated>2023-07-26T10:28:48Z</updated>
<author>
<name>YueHaibing</name>
<email>yuehaibing@huawei.com</email>
</author>
<published>2023-07-25T13:50:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=62af03223785c11a0916df6a854ef4785d2350a5'/>
<id>urn:sha1:62af03223785c11a0916df6a854ef4785d2350a5</id>
<content type='text'>
commit 8af26be06272 ("perf/core: Fix arch_perf_get_page_size()")
left behind this.

Signed-off-by: YueHaibing &lt;yuehaibing@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20230725135038.25060-1-yuehaibing@huawei.com
</content>
</entry>
</feed>
