<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/events/ring_buffer.c, branch v5.5.13</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.5.13</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.5.13'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2019-11-13T10:06:14Z</updated>
<entry>
<title>perf/aux: Allow using AUX data in perf samples</title>
<updated>2019-11-13T10:06:14Z</updated>
<author>
<name>Alexander Shishkin</name>
<email>alexander.shishkin@linux.intel.com</email>
</author>
<published>2019-10-25T14:08:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a4faf00d994c40e64f656805ac375c65e324eefb'/>
<id>urn:sha1:a4faf00d994c40e64f656805ac375c65e324eefb</id>
<content type='text'>
AUX data can be used to annotate perf events such as performance counters
or tracepoints/breakpoints by including it in sample records when
PERF_SAMPLE_AUX flag is set. Such samples would be instrumental in debugging
and profiling by providing, for example, a history of instruction flow
leading up to the event's overflow.

The implementation makes use of grouping an AUX event with all the events
that wish to take samples of the AUX data, such that the former is the
group leader. The samplees should also specify the desired size of the AUX
sample via attr.aux_sample_size.

AUX capable PMUs need to explicitly add support for sampling, because it
relies on a new callback to take a snapshot of the buffer without touching
the event states.

Signed-off-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: David Ahern &lt;dsahern@gmail.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: adrian.hunter@intel.com
Cc: mathieu.poirier@linaro.org
Link: https://lkml.kernel.org/r/20191025140835.53665-2-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/ring_buffer: Matching the memory allocate and free, in rb_alloc()</title>
<updated>2019-10-17T19:31:55Z</updated>
<author>
<name>Yunfeng Ye</name>
<email>yeyunfeng@huawei.com</email>
</author>
<published>2019-10-14T08:15:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d7e78706e43107fa269fe34b1a69e653f5ec9f2c'/>
<id>urn:sha1:d7e78706e43107fa269fe34b1a69e653f5ec9f2c</id>
<content type='text'>
Currently perf_mmap_alloc_page() is used to allocate memory in
rb_alloc(), but using free_page() to free memory in the failure path.

It's better to use perf_mmap_free_page() instead.

Signed-off-by: Yunfeng Ye &lt;yeyunfeng@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: &lt;jolsa@redhat.co&gt;
Cc: &lt;acme@kernel.org&gt;
Cc: &lt;mingo@redhat.com&gt;
Cc: &lt;mark.rutland@arm.com&gt;
Cc: &lt;namhyung@kernel.org&gt;
Cc: &lt;alexander.shishkin@linux.intel.com&gt;
Link: https://lkml.kernel.org/r/575c7e8c-90c7-4e3a-b41d-f894d8cdbd7f@huawei.com
</content>
</entry>
<entry>
<title>perf/ring_buffer: Modify the parameter type of perf_mmap_free_page()</title>
<updated>2019-10-17T19:31:55Z</updated>
<author>
<name>Yunfeng Ye</name>
<email>yeyunfeng@huawei.com</email>
</author>
<published>2019-10-14T08:14:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8a9f91c51ea72b126864e0db616b1bac12261200'/>
<id>urn:sha1:8a9f91c51ea72b126864e0db616b1bac12261200</id>
<content type='text'>
In perf_mmap_free_page(), the unsigned long type is converted to the
pointer type, but where the call is made, the pointer type is converted
to the unsigned long type. There is no need to do these operations.

Modify the parameter type of perf_mmap_free_page() to pointer type.

Signed-off-by: Yunfeng Ye &lt;yeyunfeng@huawei.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: &lt;jolsa@redhat.co&gt;
Cc: &lt;acme@kernel.org&gt;
Cc: &lt;mingo@redhat.com&gt;
Cc: &lt;mark.rutland@arm.com&gt;
Cc: &lt;namhyung@kernel.org&gt;
Cc: &lt;alexander.shishkin@linux.intel.com&gt;
Link: https://lkml.kernel.org/r/e6ae3f0c-d04c-50f9-544a-aee3b30330cd@huawei.com
</content>
</entry>
<entry>
<title>perf/ring-buffer: Use regular variables for nesting</title>
<updated>2019-05-24T07:00:11Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2019-05-17T11:52:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5322ea58a06da2e69c5ef36a9b4d4b9255edd423'/>
<id>urn:sha1:5322ea58a06da2e69c5ef36a9b4d4b9255edd423</id>
<content type='text'>
While the IRQ/NMI will nest, the nest-count will be invariant over the
actual exception, since it will decrement equal to increment.

This means we can -- carefully -- use a regular variable since the
typical LOAD-STORE race doesn't exist (similar to preempt_count).

This optimizes the ring-buffer for all LOAD-STORE architectures, since
they need to use atomic ops to implement local_t.

Suggested-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: acme@kernel.org
Cc: mark.rutland@arm.com
Cc: namhyung@kernel.org
Cc: yabinc@google.com
Link: http://lkml.kernel.org/r/20190517115418.481392777@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/ring-buffer: Always use {READ,WRITE}_ONCE() for rb-&gt;user_page data</title>
<updated>2019-05-24T07:00:11Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2019-05-17T11:52:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4d839dd9e4356bbacf3eb0ab13a549b83b008c21'/>
<id>urn:sha1:4d839dd9e4356bbacf3eb0ab13a549b83b008c21</id>
<content type='text'>
We must use {READ,WRITE}_ONCE() on rb-&gt;user_page data such that
concurrent usage will see whole values. A few key sites were missing
this.

Suggested-by: Yabin Cui &lt;yabinc@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: acme@kernel.org
Cc: mark.rutland@arm.com
Cc: namhyung@kernel.org
Fixes: 7b732a750477 ("perf_counter: new output ABI - part 1")
Link: http://lkml.kernel.org/r/20190517115418.394192145@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/ring_buffer: Add ordering to rb-&gt;nest increment</title>
<updated>2019-05-24T07:00:10Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2019-05-17T11:52:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3f9fbe9bd86c534eba2faf5d840fd44c6049f50e'/>
<id>urn:sha1:3f9fbe9bd86c534eba2faf5d840fd44c6049f50e</id>
<content type='text'>
Similar to how decrementing rb-&gt;next too early can cause data_head to
(temporarily) be observed to go backward, so too can this happen when
we increment too late.

This barrier() ensures the rb-&gt;head load happens after the increment,
both the one in the 'goto again' path, as the one from
perf_output_get_handle() -- albeit very unlikely to matter for the
latter.

Suggested-by: Yabin Cui &lt;yabinc@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: acme@kernel.org
Cc: mark.rutland@arm.com
Cc: namhyung@kernel.org
Fixes: ef60777c9abd ("perf: Optimize the perf_output() path by removing IRQ-disables")
Link: http://lkml.kernel.org/r/20190517115418.309516009@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/ring_buffer: Fix exposing a temporarily decreased data_head</title>
<updated>2019-05-24T07:00:10Z</updated>
<author>
<name>Yabin Cui</name>
<email>yabinc@google.com</email>
</author>
<published>2019-05-17T11:52:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1b038c6e05ff70a1e66e3e571c2e6106bdb75f53'/>
<id>urn:sha1:1b038c6e05ff70a1e66e3e571c2e6106bdb75f53</id>
<content type='text'>
In perf_output_put_handle(), an IRQ/NMI can happen in below location and
write records to the same ring buffer:

	...
	local_dec_and_test(&amp;rb-&gt;nest)
	...                          &lt;-- an IRQ/NMI can happen here
	rb-&gt;user_page-&gt;data_head = head;
	...

In this case, a value A is written to data_head in the IRQ, then a value
B is written to data_head after the IRQ. And A &gt; B. As a result,
data_head is temporarily decreased from A to B. And a reader may see
data_head &lt; data_tail if it read the buffer frequently enough, which
creates unexpected behaviors.

This can be fixed by moving dec(&amp;rb-&gt;nest) to after updating data_head,
which prevents the IRQ/NMI above from updating data_head.

[ Split up by peterz. ]

Signed-off-by: Yabin Cui &lt;yabinc@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: mark.rutland@arm.com
Fixes: ef60777c9abd ("perf: Optimize the perf_output() path by removing IRQ-disables")
Link: http://lkml.kernel.org/r/20190517115418.224478157@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/ring_buffer: Fix AUX software double buffering</title>
<updated>2019-05-03T10:46:10Z</updated>
<author>
<name>Alexander Shishkin</name>
<email>alexander.shishkin@linux.intel.com</email>
</author>
<published>2019-05-03T08:55:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=26ae4f4406f88d82d79c85c11ac5fae18213cd38'/>
<id>urn:sha1:26ae4f4406f88d82d79c85c11ac5fae18213cd38</id>
<content type='text'>
This recent commit:

  5768402fd9c6e87 ("perf/ring_buffer: Use high order allocations for AUX buffers optimistically")

overlooked the fact that the previous one page granularity of the AUX buffer
provided an implicit double buffering capability to the PMU driver, which
went away when the entire buffer became one high-order page.

Always make the full-trace mode AUX allocation at least two-part to preserve
the previous behavior and allow the implicit double buffering to continue.

Reported-by: Ammy Yi &lt;ammy.yi@intel.com&gt;
Signed-off-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Acked-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: adrian.hunter@intel.com
Fixes: 5768402fd9c6e87 ("perf/ring_buffer: Use high order allocations for AUX buffers optimistically")
Link: http://lkml.kernel.org/r/20190503085536.24119-2-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/ring_buffer: Fix AUX record suppression</title>
<updated>2019-04-16T10:13:57Z</updated>
<author>
<name>Alexander Shishkin</name>
<email>alexander.shishkin@linux.intel.com</email>
</author>
<published>2019-03-29T09:13:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=339bc4183596e1f68c2c98a03b87aa124107c317'/>
<id>urn:sha1:339bc4183596e1f68c2c98a03b87aa124107c317</id>
<content type='text'>
The following commit:

  1627314fb54a33e ("perf: Suppress AUX/OVERWRITE records")

has an unintended side-effect of also suppressing all AUX records with no flags
and non-zero size, so all the regular records in the full trace mode.
This breaks some use cases for people.

Fix this by restoring "regular" AUX records.

Reported-by: Ben Gainey &lt;Ben.Gainey@arm.com&gt;
Tested-by: Ben Gainey &lt;Ben.Gainey@arm.com&gt;
Signed-off-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Fixes: 1627314fb54a33e ("perf: Suppress AUX/OVERWRITE records")
Link: https://lkml.kernel.org/r/20190329091338.29999-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/core: Fix perf_event_disable_inatomic() race</title>
<updated>2019-04-12T06:55:55Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2019-04-04T13:03:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1d54ad944074010609562da5c89e4f5df2f4e5db'/>
<id>urn:sha1:1d54ad944074010609562da5c89e4f5df2f4e5db</id>
<content type='text'>
Thomas-Mich Richter reported he triggered a WARN()ing from event_function_local()
on his s390. The problem boils down to:

	CPU-A				CPU-B

	perf_event_overflow()
	  perf_event_disable_inatomic()
	    @pending_disable = 1
	    irq_work_queue();

	sched-out
	  event_sched_out()
	    @pending_disable = 0

					sched-in
					perf_event_overflow()
					  perf_event_disable_inatomic()
					    @pending_disable = 1;
					    irq_work_queue(); // FAILS

	irq_work_run()
	  perf_pending_event()
	    if (@pending_disable)
	      perf_event_disable_local(); // WHOOPS

The problem exists in generic, but s390 is particularly sensitive
because it doesn't implement arch_irq_work_raise(), nor does it call
irq_work_run() from it's PMU interrupt handler (nor would that be
sufficient in this case, because s390 also generates
perf_event_overflow() from pmu::stop). Add to that the fact that s390
is a virtual architecture and (virtual) CPU-A can stall long enough
for the above race to happen, even if it would self-IPI.

Adding a irq_work_sync() to event_sched_in() would work for all hardare
PMUs that properly use irq_work_run() but fails for software PMUs.

Instead encode the CPU number in @pending_disable, such that we can
tell which CPU requested the disable. This then allows us to detect
the above scenario and even redirect the IPI to make up for the failed
queue.

Reported-by: Thomas-Mich Richter &lt;tmricht@linux.ibm.com&gt;
Tested-by: Thomas Richter &lt;tmricht@linux.ibm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Hendrik Brueckner &lt;brueckner@linux.ibm.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
