<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/events/ring_buffer.c, branch v4.14.329</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.14.329</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.14.329'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2019-06-22T06:16:16Z</updated>
<entry>
<title>perf/ring-buffer: Always use {READ,WRITE}_ONCE() for rb-&gt;user_page data</title>
<updated>2019-06-22T06:16:16Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2019-05-17T11:52:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a9a83106b163b50127d85c8a3f0684aa592e07be'/>
<id>urn:sha1:a9a83106b163b50127d85c8a3f0684aa592e07be</id>
<content type='text'>
[ Upstream commit 4d839dd9e4356bbacf3eb0ab13a549b83b008c21 ]

We must use {READ,WRITE}_ONCE() on rb-&gt;user_page data such that
concurrent usage will see whole values. A few key sites were missing
this.

Suggested-by: Yabin Cui &lt;yabinc@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: acme@kernel.org
Cc: mark.rutland@arm.com
Cc: namhyung@kernel.org
Fixes: 7b732a750477 ("perf_counter: new output ABI - part 1")
Link: http://lkml.kernel.org/r/20190517115418.394192145@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/ring_buffer: Add ordering to rb-&gt;nest increment</title>
<updated>2019-06-22T06:16:16Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2019-05-17T11:52:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2bc92a4f8be1427bf9bb9c6a5f9ac3377fd3b886'/>
<id>urn:sha1:2bc92a4f8be1427bf9bb9c6a5f9ac3377fd3b886</id>
<content type='text'>
[ Upstream commit 3f9fbe9bd86c534eba2faf5d840fd44c6049f50e ]

Similar to how decrementing rb-&gt;next too early can cause data_head to
(temporarily) be observed to go backward, so too can this happen when
we increment too late.

This barrier() ensures the rb-&gt;head load happens after the increment,
both the one in the 'goto again' path, as the one from
perf_output_get_handle() -- albeit very unlikely to matter for the
latter.

Suggested-by: Yabin Cui &lt;yabinc@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: acme@kernel.org
Cc: mark.rutland@arm.com
Cc: namhyung@kernel.org
Fixes: ef60777c9abd ("perf: Optimize the perf_output() path by removing IRQ-disables")
Link: http://lkml.kernel.org/r/20190517115418.309516009@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/ring_buffer: Fix exposing a temporarily decreased data_head</title>
<updated>2019-06-22T06:16:16Z</updated>
<author>
<name>Yabin Cui</name>
<email>yabinc@google.com</email>
</author>
<published>2019-05-17T11:52:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9e2de43b538600df9a9bfffdb13730886aff5ceb'/>
<id>urn:sha1:9e2de43b538600df9a9bfffdb13730886aff5ceb</id>
<content type='text'>
[ Upstream commit 1b038c6e05ff70a1e66e3e571c2e6106bdb75f53 ]

In perf_output_put_handle(), an IRQ/NMI can happen in below location and
write records to the same ring buffer:

	...
	local_dec_and_test(&amp;rb-&gt;nest)
	...                          &lt;-- an IRQ/NMI can happen here
	rb-&gt;user_page-&gt;data_head = head;
	...

In this case, a value A is written to data_head in the IRQ, then a value
B is written to data_head after the IRQ. And A &gt; B. As a result,
data_head is temporarily decreased from A to B. And a reader may see
data_head &lt; data_tail if it read the buffer frequently enough, which
creates unexpected behaviors.

This can be fixed by moving dec(&amp;rb-&gt;nest) to after updating data_head,
which prevents the IRQ/NMI above from updating data_head.

[ Split up by peterz. ]

Signed-off-by: Yabin Cui &lt;yabinc@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@kernel.org&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: mark.rutland@arm.com
Fixes: ef60777c9abd ("perf: Optimize the perf_output() path by removing IRQ-disables")
Link: http://lkml.kernel.org/r/20190517115418.224478157@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/core: Fix impossible ring-buffer sizes warning</title>
<updated>2019-02-20T09:20:51Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2019-02-13T06:57:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=23822b55552fcdd3e7c476c53275c5b74c33226b'/>
<id>urn:sha1:23822b55552fcdd3e7c476c53275c5b74c33226b</id>
<content type='text'>
commit 528871b456026e6127d95b1b2bd8e3a003dc1614 upstream.

The following commit:

  9dff0aa95a32 ("perf/core: Don't WARN() for impossible ring-buffer sizes")

results in perf recording failures with larger mmap areas:

  root@skl:/tmp# perf record -g -a
  failed to mmap with 12 (Cannot allocate memory)

The root cause is that the following condition is buggy:

	if (order_base_2(size) &gt;= MAX_ORDER)
		goto fail;

The problem is that @size is in bytes and MAX_ORDER is in pages,
so the right test is:

	if (order_base_2(size) &gt;= PAGE_SHIFT+MAX_ORDER)
		goto fail;

Fix it.

Reported-by: "Jin, Yao" &lt;yao.jin@linux.intel.com&gt;
Bisected-by: Borislav Petkov &lt;bp@alien8.de&gt;
Analyzed-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Julien Thierry &lt;julien.thierry@arm.com&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Fixes: 9dff0aa95a32 ("perf/core: Don't WARN() for impossible ring-buffer sizes")
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>perf/core: Don't WARN() for impossible ring-buffer sizes</title>
<updated>2019-02-12T18:46:13Z</updated>
<author>
<name>Mark Rutland</name>
<email>mark.rutland@arm.com</email>
</author>
<published>2019-01-10T14:27:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bbf850876fcff432c2ed1f6f7c4e144c2c22bd82'/>
<id>urn:sha1:bbf850876fcff432c2ed1f6f7c4e144c2c22bd82</id>
<content type='text'>
commit 9dff0aa95a324e262ffb03f425d00e4751f3294e upstream.

The perf tool uses /proc/sys/kernel/perf_event_mlock_kb to determine how
large its ringbuffer mmap should be. This can be configured to arbitrary
values, which can be larger than the maximum possible allocation from
kmalloc.

When this is configured to a suitably large value (e.g. thanks to the
perf fuzzer), attempting to use perf record triggers a WARN_ON_ONCE() in
__alloc_pages_nodemask():

   WARNING: CPU: 2 PID: 5666 at mm/page_alloc.c:4511 __alloc_pages_nodemask+0x3f8/0xbc8

Let's avoid this by checking that the requested allocation is possible
before calling kzalloc.

Reported-by: Julien Thierry &lt;julien.thierry@arm.com&gt;
Signed-off-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Julien Thierry &lt;julien.thierry@arm.com&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Link: https://lkml.kernel.org/r/20190110142745.25495-1-mark.rutland@arm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>perf/core: Fix possible Spectre-v1 indexing for -&gt;aux_pages[]</title>
<updated>2018-05-16T08:10:31Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2018-04-20T12:03:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7093d5d0caa7a23301168a79be29b145c8b5b627'/>
<id>urn:sha1:7093d5d0caa7a23301168a79be29b145c8b5b627</id>
<content type='text'>
commit 4411ec1d1993e8dbff2898390e3fed280d88e446 upstream.

&gt; kernel/events/ring_buffer.c:871 perf_mmap_to_page() warn: potential spectre issue 'rb-&gt;aux_pages'

Userspace controls @pgoff through the fault address. Sanitize the
array index before doing the array dereference.

Reported-by: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: &lt;stable@kernel.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>perf/aux: Only update -&gt;aux_wakeup in non-overwrite mode</title>
<updated>2017-09-29T08:06:45Z</updated>
<author>
<name>Alexander Shishkin</name>
<email>alexander.shishkin@linux.intel.com</email>
</author>
<published>2017-09-06T16:08:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=441430eb54a00586f95f1aefc48e0801bbd6a923'/>
<id>urn:sha1:441430eb54a00586f95f1aefc48e0801bbd6a923</id>
<content type='text'>
The following commit:

  d9a50b0256 ("perf/aux: Ensure aux_wakeup represents most recent wakeup index")

changed the AUX wakeup position calculation to rounddown(), which causes
a division-by-zero in AUX overwrite mode (aka "snapshot mode").

The zero denominator results from the fact that perf record doesn't set
aux_watermark to anything, in which case the kernel will set it to half
the AUX buffer size, but only for non-overwrite mode. In the overwrite
mode aux_watermark stays zero.

The good news is that, AUX overwrite mode, wakeups don't happen and
related bookkeeping is not relevant, so we can simply forego the whole
wakeup updates.

Signed-off-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: will.deacon@arm.com
Link: http://lkml.kernel.org/r/20170906160811.16510-1-alexander.shishkin@linux.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/aux: Ensure aux_wakeup represents most recent wakeup index</title>
<updated>2017-08-25T09:04:16Z</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2017-08-16T16:18:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d9a50b0256f06bd39a1bed1ba40baec37c356b11'/>
<id>urn:sha1:d9a50b0256f06bd39a1bed1ba40baec37c356b11</id>
<content type='text'>
The aux_watermark member of struct ring_buffer represents the period (in
terms of bytes) at which wakeup events should be generated when data is
written to the aux buffer in non-snapshot mode. On hardware that cannot
generate an interrupt when the aux_head reaches an arbitrary wakeup index
(such as ARM SPE), the aux_head sampled from handle-&gt;head in
perf_aux_output_{skip,end} may in fact be past the wakeup index. This
can lead to wakeup slowly falling behind the head. For example, consider
the case where hardware can only generate an interrupt on a page-boundary
and the aux buffer is initialised as follows:

  // Buffer size is 2 * PAGE_SIZE
  rb-&gt;aux_head = rb-&gt;aux_wakeup = 0
  rb-&gt;aux_watermark = PAGE_SIZE / 2

following the first perf_aux_output_begin call, the handle is
initialised with:

  handle-&gt;head = 0
  handle-&gt;size = 2 * PAGE_SIZE
  handle-&gt;wakeup = PAGE_SIZE / 2

and the hardware will be programmed to generate an interrupt at
PAGE_SIZE.

When the interrupt is raised, the hardware head will be at PAGE_SIZE,
so calling perf_aux_output_end(handle, PAGE_SIZE) puts the ring buffer
into the following state:

  rb-&gt;aux_head = PAGE_SIZE
  rb-&gt;aux_wakeup = PAGE_SIZE / 2
  rb-&gt;aux_watermark = PAGE_SIZE / 2

and then the next call to perf_aux_output_begin will result in:

  handle-&gt;head = handle-&gt;wakeup = PAGE_SIZE

for which the semantics are unclear and, for a smaller aux_watermark
(e.g. PAGE_SIZE / 4), then the wakeup would in fact be behind head at
this point.

This patch fixes the problem by rounding down the aux_head (as sampled
from the handle) to the nearest aux_watermark boundary when updating
rb-&gt;aux_wakeup, therefore taking into account any overruns by the
hardware.

Reported-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1502900297-21839-2-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/aux: Make aux_{head,wakeup} ring_buffer members long</title>
<updated>2017-08-25T09:04:15Z</updated>
<author>
<name>Will Deacon</name>
<email>will.deacon@arm.com</email>
</author>
<published>2017-08-16T16:18:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2ab346cfb0decf01523949e29f5cf542f2304611'/>
<id>urn:sha1:2ab346cfb0decf01523949e29f5cf542f2304611</id>
<content type='text'>
The aux_head and aux_wakeup members of struct ring_buffer are defined
using the local_t type, despite the fact that they are only accessed via
the perf_aux_output_*() functions, which cannot race with each other for a
given ring buffer.

This patch changes the type of the members to long, so we can avoid
using the local_*() API where it isn't needed.

Signed-off-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mark Rutland &lt;mark.rutland@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-arm-kernel@lists.infradead.org
Link: http://lkml.kernel.org/r/1502900297-21839-1-git-send-email-will.deacon@arm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>perf/aux: Correct return code of rb_alloc_aux() if !has_aux(ev)</title>
<updated>2017-06-21T09:58:30Z</updated>
<author>
<name>Hendrik Brueckner</name>
<email>brueckner@linux.vnet.ibm.com</email>
</author>
<published>2017-06-20T10:26:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8a1898db51a3390241cd5fae267dc8aaa9db0f8b'/>
<id>urn:sha1:8a1898db51a3390241cd5fae267dc8aaa9db0f8b</id>
<content type='text'>
If the event for which an AUX area is about to be allocated, does
not support setting up an AUX area, rb_alloc_aux() return -ENOTSUPP.

This error condition is being returned unfiltered to the user space,
and, for example, the perf tools fails with:

  failed to mmap with 524 (INTERNAL ERROR: strerror_r(524, 0x3fff497a1c8, 512)=22)

This error can be easily seen with "perf record -m 128,256 -e cpu-clock".

The 524 error code maps to -ENOTSUPP (in rb_alloc_aux()). The -ENOTSUPP
error code shall be only used within the kernel.  So the correct error
code would then be -EOPNOTSUPP.

With this commit, the perf tool then reports:

  failed to mmap with 95 (Operation not supported)

which is more clear.

Signed-off-by: Hendrik Brueckner &lt;brueckner@linux.vnet.ibm.com&gt;
Acked-by: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Pu Hou &lt;bjhoupu@linux.vnet.ibm.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Thomas-Mich Richter &lt;tmricht@linux.vnet.ibm.com&gt;
Cc: acme@kernel.org
Cc: linux-s390@vger.kernel.org
Link: http://lkml.kernel.org/r/1497954399-6355-1-git-send-email-brueckner@linux.vnet.ibm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
