<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/trace/ring_buffer.c, branch v3.18.67</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.18.67</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.18.67'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2017-04-30T03:49:13Z</updated>
<entry>
<title>ring-buffer: Have ring_buffer_iter_empty() return true when empty</title>
<updated>2017-04-30T03:49:13Z</updated>
<author>
<name>Steven Rostedt (VMware)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2017-04-19T18:29:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0b83d4b7a052ca6cede13077538862bdc75058ea'/>
<id>urn:sha1:0b83d4b7a052ca6cede13077538862bdc75058ea</id>
<content type='text'>
commit 78f7a45dac2a2d2002f98a3a95f7979867868d73 upstream.

I noticed that reading the snapshot file when it is empty no longer gives a
status. It suppose to show the status of the snapshot buffer as well as how
to allocate and use it. For example:

 &gt;# cat snapshot
 # tracer: nop
 #
 #
 # * Snapshot is allocated *
 #
 # Snapshot commands:
 # echo 0 &gt; snapshot : Clears and frees snapshot buffer
 # echo 1 &gt; snapshot : Allocates snapshot buffer, if not already allocated.
 #                      Takes a snapshot of the main buffer.
 # echo 2 &gt; snapshot : Clears snapshot buffer (but does not allocate or free)
 #                      (Doesn't have to be '2' works with any number that
 #                       is not a '0' or '1')

But instead it just showed an empty buffer:

 &gt;# cat snapshot
 # tracer: nop
 #
 # entries-in-buffer/entries-written: 0/0   #P:4
 #
 #                              _-----=&gt; irqs-off
 #                             / _----=&gt; need-resched
 #                            | / _---=&gt; hardirq/softirq
 #                            || / _--=&gt; preempt-depth
 #                            ||| /     delay
 #           TASK-PID   CPU#  ||||    TIMESTAMP  FUNCTION
 #              | |       |   ||||       |         |

What happened was that it was using the ring_buffer_iter_empty() function to
see if it was empty, and if it was, it showed the status. But that function
was returning false when it was empty. The reason was that the iter header
page was on the reader page, and the reader page was empty, but so was the
buffer itself. The check only tested to see if the iter was on the commit
page, but the commit page was no longer pointing to the reader page, but as
all pages were empty, the buffer is also.

Fixes: 651e22f2701b ("ring-buffer: Always reset iterator to reader page")
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>ring-buffer: Fix return value check in test_ringbuffer()</title>
<updated>2017-04-22T05:15:06Z</updated>
<author>
<name>Wei Yongjun</name>
<email>yongjun_wei@trendmicro.com.cn</email>
</author>
<published>2016-06-17T17:33:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=814051bdd74d00cfe02c12526f493823812e4e4b'/>
<id>urn:sha1:814051bdd74d00cfe02c12526f493823812e4e4b</id>
<content type='text'>
commit 62277de758b155dc04b78f195a1cb5208c37b2df upstream.

In case of error, the function kthread_run() returns ERR_PTR()
and never returns NULL. The NULL test in the return value check
should be replaced with IS_ERR().

Link: http://lkml.kernel.org/r/1466184839-14927-1-git-send-email-weiyj_lk@163.com

Fixes: 6c43e554a ("ring-buffer: Add ring buffer startup selftest")
Signed-off-by: Wei Yongjun &lt;yongjun_wei@trendmicro.com.cn&gt;
Signed-off-by: Steven Rostedt (VMware) &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>ring-buffer: Prevent overflow of size in ring_buffer_resize()</title>
<updated>2016-06-06T23:11:05Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2016-05-13T13:34:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=180fbec3621c16c23eb5de917577b9aa5dcb1d57'/>
<id>urn:sha1:180fbec3621c16c23eb5de917577b9aa5dcb1d57</id>
<content type='text'>
[ Upstream commit 59643d1535eb220668692a5359de22545af579f6 ]

If the size passed to ring_buffer_resize() is greater than MAX_LONG - BUF_PAGE_SIZE
then the DIV_ROUND_UP() will return zero.

Here's the details:

  # echo 18014398509481980 &gt; /sys/kernel/debug/tracing/buffer_size_kb

tracing_entries_write() processes this and converts kb to bytes.

 18014398509481980 &lt;&lt; 10 = 18446744073709547520

and this is passed to ring_buffer_resize() as unsigned long size.

 size = DIV_ROUND_UP(size, BUF_PAGE_SIZE);

Where DIV_ROUND_UP(a, b) is (a + b - 1)/b

BUF_PAGE_SIZE is 4080 and here

 18446744073709547520 + 4080 - 1 = 18446744073709551599

where 18446744073709551599 is still smaller than 2^64

 2^64 - 18446744073709551599 = 17

But now 18446744073709551599 / 4080 = 4521260802379792

and size = size * 4080 = 18446744073709551360

This is checked to make sure its still greater than 2 * 4080,
which it is.

Then we convert to the number of buffer pages needed.

 nr_page = DIV_ROUND_UP(size, BUF_PAGE_SIZE)

but this time size is 18446744073709551360 and

 2^64 - (18446744073709551360 + 4080 - 1) = -3823

Thus it overflows and the resulting number is less than 4080, which makes

  3823 / 4080 = 0

an nr_pages is set to this. As we already checked against the minimum that
nr_pages may be, this causes the logic to fail as well, and we crash the
kernel.

There's no reason to have the two DIV_ROUND_UP() (that's just result of
historical code changes), clean up the code and fix this bug.

Cc: stable@vger.kernel.org # 3.5+
Fixes: 83f40318dab00 ("ring-buffer: Make removal of ring buffer pages atomic")
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>ring-buffer: Use long for nr_pages to avoid overflow failures</title>
<updated>2016-06-06T23:11:05Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2016-05-12T15:01:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a2d04c9ec64914e94e1a43827fef6089bfeab0b3'/>
<id>urn:sha1:a2d04c9ec64914e94e1a43827fef6089bfeab0b3</id>
<content type='text'>
[ Upstream commit 9b94a8fba501f38368aef6ac1b30e7335252a220 ]

The size variable to change the ring buffer in ftrace is a long. The
nr_pages used to update the ring buffer based on the size is int. On 64 bit
machines this can cause an overflow problem.

For example, the following will cause the ring buffer to crash:

 # cd /sys/kernel/debug/tracing
 # echo 10 &gt; buffer_size_kb
 # echo 8556384240 &gt; buffer_size_kb

Then you get the warning of:

 WARNING: CPU: 1 PID: 318 at kernel/trace/ring_buffer.c:1527 rb_update_pages+0x22f/0x260

Which is:

  RB_WARN_ON(cpu_buffer, nr_removed);

Note each ring buffer page holds 4080 bytes.

This is because:

 1) 10 causes the ring buffer to have 3 pages.
    (10kb requires 3 * 4080 pages to hold)

 2) (2^31 / 2^10  + 1) * 4080 = 8556384240
    The value written into buffer_size_kb is shifted by 10 and then passed
    to ring_buffer_resize(). 8556384240 * 2^10 = 8761737461760

 3) The size passed to ring_buffer_resize() is then divided by BUF_PAGE_SIZE
    which is 4080. 8761737461760 / 4080 = 2147484672

 4) nr_pages is subtracted from the current nr_pages (3) and we get:
    2147484669. This value is saved in a signed integer nr_pages_to_update

 5) 2147484669 is greater than 2^31 but smaller than 2^32, a signed int
    turns into the value of -2147482627

 6) As the value is a negative number, in update_pages_handler() it is
    negated and passed to rb_remove_pages() and 2147482627 pages will
    be removed, which is much larger than 3 and it causes the warning
    because not all the pages asked to be removed were removed.

Link: https://bugzilla.kernel.org/show_bug.cgi?id=118001

Cc: stable@vger.kernel.org # 2.6.28+
Fixes: 7a8e76a3829f1 ("tracing: unified trace buffer")
Reported-by: Hao Qin &lt;QEver.cn@gmail.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>ring-buffer: Move recursive check to per_cpu descriptor</title>
<updated>2016-06-06T23:11:04Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2015-05-27T14:27:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=654052eea3c5ac394d44388eec6101897e4316a1'/>
<id>urn:sha1:654052eea3c5ac394d44388eec6101897e4316a1</id>
<content type='text'>
[ Upstream commit 58a09ec6e3ec88c9c7e061479f1ef7fe93324a87 ]

Instead of using a global per_cpu variable to perform the recursive
checks into the ring buffer, use the already existing per_cpu descriptor
that is part of the ring buffer itself.

Not only does this simplify the code, it also allows for one ring buffer
to be used within the guts of the use of another ring buffer. For example
trace_printk() can now be used within the ring buffer to record changes
done by an instance into the main ring buffer. The recursion checks
will prevent the trace_printk() itself from causing recursive issues
with the main ring buffer (it is just ignored), but the recursive
checks wont prevent the trace_printk() from recording other ring buffers.

Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>ring-buffer: Remove duplicate use of '&amp;' in recursive code</title>
<updated>2016-06-06T23:11:04Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2015-03-27T21:39:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=090243482940087932c879868ca3b676a6701075'/>
<id>urn:sha1:090243482940087932c879868ca3b676a6701075</id>
<content type='text'>
[ Upstream commit d631c8cceb1d1d06f372878935949d421585186b ]

A clean up of the recursive protection code changed

  val = this_cpu_read(current_context);
  val--;
  val &amp;= this_cpu_read(current_context);

to

  val = this_cpu_read(current_context);
  val &amp;= val &amp; (val - 1);

Which has a duplicate use of '&amp;' as the above is the same as

  val = val &amp; (val - 1);

Actually, it would be best to remove that line altogether and
just add it to where it is used.

And Christoph even mentioned that it can be further compacted to
just a single line:

  __this_cpu_and(current_context, __this_cpu_read(current_context) - 1);

Link: http://lkml.kernel.org/alpine.DEB.2.11.1503271423580.23114@gentwo.org

Suggested-by: Christoph Lameter &lt;cl@linux.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>ring-buffer: Add unlikelys to make fast path the default</title>
<updated>2016-06-06T23:11:04Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2015-05-21T21:39:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4baf9733a655b71db7b436484dfdc04566c3a5a3'/>
<id>urn:sha1:4baf9733a655b71db7b436484dfdc04566c3a5a3</id>
<content type='text'>
[ Upstream commit 3205f8063b6cc54b20d5080fb79dfcbd9c39e93d ]

I was running the trace_event benchmark and noticed that the times
to record a trace_event was all over the place. I looked at the assembly
of the ring_buffer_lock_reserver() and saw this:

 &lt;ring_buffer_lock_reserve&gt;:
       31 c0                   xor    %eax,%eax
       48 83 3d 76 47 bd 00    cmpq   $0x1,0xbd4776(%rip)        # ffffffff81d10d60 &lt;ring_buffer_flags&gt;
       01
       55                      push   %rbp
       48 89 e5                mov    %rsp,%rbp
       75 1d                   jne    ffffffff8113c60d &lt;ring_buffer_lock_reserve+0x2d&gt;
       65 ff 05 69 e3 ec 7e    incl   %gs:0x7eece369(%rip)        # a960 &lt;__preempt_count&gt;
       8b 47 08                mov    0x8(%rdi),%eax
       85 c0                   test   %eax,%eax
 +---- 74 12                   je     ffffffff8113c610 &lt;ring_buffer_lock_reserve+0x30&gt;
 |     65 ff 0d 5b e3 ec 7e    decl   %gs:0x7eece35b(%rip)        # a960 &lt;__preempt_count&gt;
 |     0f 84 85 00 00 00       je     ffffffff8113c690 &lt;ring_buffer_lock_reserve+0xb0&gt;
 |     31 c0                   xor    %eax,%eax
 |     5d                      pop    %rbp
 |     c3                      retq
 |     90                      nop
 +---&gt; 65 44 8b 05 48 e3 ec    mov    %gs:0x7eece348(%rip),%r8d        # a960 &lt;__preempt_count&gt;
       7e
       41 81 e0 ff ff ff 7f    and    $0x7fffffff,%r8d
       b0 08                   mov    $0x8,%al
       65 8b 0d 58 36 ed 7e    mov    %gs:0x7eed3658(%rip),%ecx        # fc80 &lt;current_context&gt;
       41 f7 c0 00 ff 1f 00    test   $0x1fff00,%r8d
       74 1e                   je     ffffffff8113c64f &lt;ring_buffer_lock_reserve+0x6f&gt;
       41 f7 c0 00 00 10 00    test   $0x100000,%r8d
       b0 01                   mov    $0x1,%al
       75 13                   jne    ffffffff8113c64f &lt;ring_buffer_lock_reserve+0x6f&gt;
       41 81 e0 00 00 0f 00    and    $0xf0000,%r8d
       49 83 f8 01             cmp    $0x1,%r8
       19 c0                   sbb    %eax,%eax
       83 e0 02                and    $0x2,%eax
       83 c0 02                add    $0x2,%eax
       85 c8                   test   %ecx,%eax
       75 ab                   jne    ffffffff8113c5fe &lt;ring_buffer_lock_reserve+0x1e&gt;
       09 c8                   or     %ecx,%eax
       65 89 05 24 36 ed 7e    mov    %eax,%gs:0x7eed3624(%rip)        # fc80 &lt;current_context&gt;

The arrow is the fast path.

After adding the unlikely's, the fast path looks a bit better:

 &lt;ring_buffer_lock_reserve&gt;:
       31 c0                   xor    %eax,%eax
       48 83 3d 76 47 bd 00    cmpq   $0x1,0xbd4776(%rip)        # ffffffff81d10d60 &lt;ring_buffer_flags&gt;
       01
       55                      push   %rbp
       48 89 e5                mov    %rsp,%rbp
       75 7b                   jne    ffffffff8113c66b &lt;ring_buffer_lock_reserve+0x8b&gt;
       65 ff 05 69 e3 ec 7e    incl   %gs:0x7eece369(%rip)        # a960 &lt;__preempt_count&gt;
       8b 47 08                mov    0x8(%rdi),%eax
       85 c0                   test   %eax,%eax
       0f 85 9f 00 00 00       jne    ffffffff8113c6a1 &lt;ring_buffer_lock_reserve+0xc1&gt;
       65 8b 0d 57 e3 ec 7e    mov    %gs:0x7eece357(%rip),%ecx        # a960 &lt;__preempt_count&gt;
       81 e1 ff ff ff 7f       and    $0x7fffffff,%ecx
       b0 08                   mov    $0x8,%al
       65 8b 15 68 36 ed 7e    mov    %gs:0x7eed3668(%rip),%edx        # fc80 &lt;current_context&gt;
       f7 c1 00 ff 1f 00       test   $0x1fff00,%ecx
       75 50                   jne    ffffffff8113c670 &lt;ring_buffer_lock_reserve+0x90&gt;
       85 d0                   test   %edx,%eax
       75 7d                   jne    ffffffff8113c6a1 &lt;ring_buffer_lock_reserve+0xc1&gt;
       09 d0                   or     %edx,%eax
       65 89 05 53 36 ed 7e    mov    %eax,%gs:0x7eed3653(%rip)        # fc80 &lt;current_context&gt;
       65 8b 05 fc da ec 7e    mov    %gs:0x7eecdafc(%rip),%eax        # a130 &lt;cpu_number&gt;
       89 c2                   mov    %eax,%edx

Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>ring-buffer: Replace this_cpu_*() with __this_cpu_*()</title>
<updated>2015-05-17T23:11:55Z</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2015-03-17T14:40:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6ec5fc3a8308e298002f81b37f605ff10669a8f9'/>
<id>urn:sha1:6ec5fc3a8308e298002f81b37f605ff10669a8f9</id>
<content type='text'>
[ Upstream commit 80a9b64e2c156b6523e7a01f2ba6e5d86e722814 ]

It has come to my attention that this_cpu_read/write are horrible on
architectures other than x86. Worse yet, they actually disable
preemption or interrupts! This caused some unexpected tracing results
on ARM.

   101.356868: preempt_count_add &lt;-ring_buffer_lock_reserve
   101.356870: preempt_count_sub &lt;-ring_buffer_lock_reserve

The ring_buffer_lock_reserve has recursion protection that requires
accessing a per cpu variable. But since preempt_disable() is traced, it
too got traced while accessing the variable that is suppose to prevent
recursion like this.

The generic version of this_cpu_read() and write() are:

 #define this_cpu_generic_read(pcp)					\
 ({	typeof(pcp) ret__;						\
	preempt_disable();						\
	ret__ = *this_cpu_ptr(&amp;(pcp));					\
	preempt_enable();						\
	ret__;								\
 })

 #define this_cpu_generic_to_op(pcp, val, op)				\
 do {									\
	unsigned long flags;						\
	raw_local_irq_save(flags);					\
	*__this_cpu_ptr(&amp;(pcp)) op val;					\
	raw_local_irq_restore(flags);					\
 } while (0)

Which is unacceptable for locations that know they are within preempt
disabled or interrupt disabled locations.

Paul McKenney stated that __this_cpu_() versions produce much better code on
other architectures than this_cpu_() does, if we know that the call is done in
a preempt disabled location.

I also changed the recursive_unlock() to use two local variables instead
of accessing the per_cpu variable twice.

Link: http://lkml.kernel.org/r/20150317114411.GE3589@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/20150317104038.312e73d1@gandalf.local.home

Cc: stable@vger.kernel.org
Acked-by: Christoph Lameter &lt;cl@linux.com&gt;
Reported-by: Uwe Kleine-Koenig &lt;u.kleine-koenig@pengutronix.de&gt;
Tested-by: Uwe Kleine-Koenig &lt;u.kleine-koenig@pengutronix.de&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
</content>
</entry>
<entry>
<title>ring-buffer: Do not wake up a splice waiter when page is not full</title>
<updated>2015-03-06T22:52:59Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2015-02-11T03:14:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=49c1bf0c2cddecb3494a62a52b40bd0c7e1b9017'/>
<id>urn:sha1:49c1bf0c2cddecb3494a62a52b40bd0c7e1b9017</id>
<content type='text'>
commit 1e0d6714aceb770b04161fbedd7765d0e1fc27bd upstream.

When an application connects to the ring buffer via splice, it can only
read full pages. Splice does not work with partial pages. If there is
not enough data to fill a page, the splice command will either block
or return -EAGAIN (if set to nonblock).

Code was added where if the page is not full, to just sleep again.
The problem is, it will get woken up again on the next event. That
is, when something is written into the ring buffer, if there is a waiter
it will wake it up. The waiter would then check the buffer, see that
it still does not have enough data to fill a page and go back to sleep.
To make matters worse, when the waiter goes back to sleep, it could
cause another event, which would wake it back up again to see it
doesn't have enough data and sleep again. This produces a tremendous
overhead and fills the ring buffer with noise.

For example, recording sched_switch on an idle system for 10 seconds
produces 25,350,475 events!!!

Create another wait queue for those waiters wanting full pages.
When an event is written, it only wakes up waiters if there's a full
page of data. It does not wake up the waiter if the page is not yet
full.

After this change, recording sched_switch on an idle system for 10
seconds produces only 800 events. Getting rid of 25,349,675 useless
events (99.9969% of events!!), is something to take seriously.

Cc: Rabin Vincent &lt;rabin@rab.in&gt;
Fixes: e30f53aad220 "tracing: Do not busy wait in buffer splice"
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>tracing: Do not busy wait in buffer splice</title>
<updated>2014-11-10T21:45:43Z</updated>
<author>
<name>Rabin Vincent</name>
<email>rabin@rab.in</email>
</author>
<published>2014-11-10T18:46:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e30f53aad2202b5526c40c36d8eeac8bf290bde5'/>
<id>urn:sha1:e30f53aad2202b5526c40c36d8eeac8bf290bde5</id>
<content type='text'>
On a !PREEMPT kernel, attempting to use trace-cmd results in a soft
lockup:

 # trace-cmd record -e raw_syscalls:* -F false
 NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [trace-cmd:61]
 ...
 Call Trace:
  [&lt;ffffffff8105b580&gt;] ? __wake_up_common+0x90/0x90
  [&lt;ffffffff81092e25&gt;] wait_on_pipe+0x35/0x40
  [&lt;ffffffff810936e3&gt;] tracing_buffers_splice_read+0x2e3/0x3c0
  [&lt;ffffffff81093300&gt;] ? tracing_stats_read+0x2a0/0x2a0
  [&lt;ffffffff812d10ab&gt;] ? _raw_spin_unlock+0x2b/0x40
  [&lt;ffffffff810dc87b&gt;] ? do_read_fault+0x21b/0x290
  [&lt;ffffffff810de56a&gt;] ? handle_mm_fault+0x2ba/0xbd0
  [&lt;ffffffff81095c80&gt;] ? trace_event_buffer_lock_reserve+0x40/0x80
  [&lt;ffffffff810951e2&gt;] ? trace_buffer_lock_reserve+0x22/0x60
  [&lt;ffffffff81095c80&gt;] ? trace_event_buffer_lock_reserve+0x40/0x80
  [&lt;ffffffff8112415d&gt;] do_splice_to+0x6d/0x90
  [&lt;ffffffff81126971&gt;] SyS_splice+0x7c1/0x800
  [&lt;ffffffff812d1edd&gt;] tracesys_phase2+0xd3/0xd8

The problem is this: tracing_buffers_splice_read() calls
ring_buffer_wait() to wait for data in the ring buffers.  The buffers
are not empty so ring_buffer_wait() returns immediately.  But
tracing_buffers_splice_read() calls ring_buffer_read_page() with full=1,
meaning it only wants to read a full page.  When the full page is not
available, tracing_buffers_splice_read() tries to wait again with
ring_buffer_wait(), which again returns immediately, and so on.

Fix this by adding a "full" argument to ring_buffer_wait() which will
make ring_buffer_wait() wait until the writer has left the reader's
page, i.e.  until full-page reads will succeed.

Link: http://lkml.kernel.org/r/1415645194-25379-1-git-send-email-rabin@rab.in

Cc: stable@vger.kernel.org # 3.16+
Fixes: b1169cc69ba9 ("tracing: Remove mock up poll wait function")
Signed-off-by: Rabin Vincent &lt;rabin@rab.in&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
</feed>
