<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/trace, branch v3.16.3</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.16.3</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.16.3'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2014-09-17T16:22:12Z</updated>
<entry>
<title>ring-buffer: Up rb_iter_peek() loop count to 3</title>
<updated>2014-09-17T16:22:12Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2014-08-06T19:36:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=39ed6dfc36293ce24734ec571569a0921d8b0745'/>
<id>urn:sha1:39ed6dfc36293ce24734ec571569a0921d8b0745</id>
<content type='text'>
commit 021de3d904b88b1771a3a2cfc5b75023c391e646 upstream.

After writting a test to try to trigger the bug that caused the
ring buffer iterator to become corrupted, I hit another bug:

 WARNING: CPU: 1 PID: 5281 at kernel/trace/ring_buffer.c:3766 rb_iter_peek+0x113/0x238()
 Modules linked in: ipt_MASQUERADE sunrpc [...]
 CPU: 1 PID: 5281 Comm: grep Tainted: G        W     3.16.0-rc3-test+ #143
 Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./To be filled by O.E.M., BIOS SDBLI944.86P 05/08/2007
  0000000000000000 ffffffff81809a80 ffffffff81503fb0 0000000000000000
  ffffffff81040ca1 ffff8800796d6010 ffffffff810c138d ffff8800796d6010
  ffff880077438c80 ffff8800796d6010 ffff88007abbe600 0000000000000003
 Call Trace:
  [&lt;ffffffff81503fb0&gt;] ? dump_stack+0x4a/0x75
  [&lt;ffffffff81040ca1&gt;] ? warn_slowpath_common+0x7e/0x97
  [&lt;ffffffff810c138d&gt;] ? rb_iter_peek+0x113/0x238
  [&lt;ffffffff810c138d&gt;] ? rb_iter_peek+0x113/0x238
  [&lt;ffffffff810c14df&gt;] ? ring_buffer_iter_peek+0x2d/0x5c
  [&lt;ffffffff810c6f73&gt;] ? tracing_iter_reset+0x6e/0x96
  [&lt;ffffffff810c74a3&gt;] ? s_start+0xd7/0x17b
  [&lt;ffffffff8112b13e&gt;] ? kmem_cache_alloc_trace+0xda/0xea
  [&lt;ffffffff8114cf94&gt;] ? seq_read+0x148/0x361
  [&lt;ffffffff81132d98&gt;] ? vfs_read+0x93/0xf1
  [&lt;ffffffff81132f1b&gt;] ? SyS_read+0x60/0x8e
  [&lt;ffffffff8150bf9f&gt;] ? tracesys+0xdd/0xe2

Debugging this bug, which triggers when the rb_iter_peek() loops too
many times (more than 2 times), I discovered there's a case that can
cause that function to legitimately loop 3 times!

rb_iter_peek() is different than rb_buffer_peek() as the rb_buffer_peek()
only deals with the reader page (it's for consuming reads). The
rb_iter_peek() is for traversing the buffer without consuming it, and as
such, it can loop for one more reason. That is, if we hit the end of
the reader page or any page, it will go to the next page and try again.

That is, we have this:

 1. iter-&gt;head &gt; iter-&gt;head_page-&gt;page-&gt;commit
    (rb_inc_iter() which moves the iter to the next page)
    try again

 2. event = rb_iter_head_event()
    event-&gt;type_len == RINGBUF_TYPE_TIME_EXTEND
    rb_advance_iter()
    try again

 3. read the event.

But we never get to 3, because the count is greater than 2 and we
cause the WARNING and return NULL.

Up the counter to 3.

Fixes: 69d1b839f7ee "ring-buffer: Bind time extend and data events together"
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>ring-buffer: Always reset iterator to reader page</title>
<updated>2014-09-17T16:22:11Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2014-08-06T18:11:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1cfa896d6e52e3d9d2cb2133a0c495c816556ca9'/>
<id>urn:sha1:1cfa896d6e52e3d9d2cb2133a0c495c816556ca9</id>
<content type='text'>
commit 651e22f2701b4113989237c3048d17337dd2185c upstream.

When performing a consuming read, the ring buffer swaps out a
page from the ring buffer with a empty page and this page that
was swapped out becomes the new reader page. The reader page
is owned by the reader and since it was swapped out of the ring
buffer, writers do not have access to it (there's an exception
to that rule, but it's out of scope for this commit).

When reading the "trace" file, it is a non consuming read, which
means that the data in the ring buffer will not be modified.
When the trace file is opened, a ring buffer iterator is allocated
and writes to the ring buffer are disabled, such that the iterator
will not have issues iterating over the data.

Although the ring buffer disabled writes, it does not disable other
reads, or even consuming reads. If a consuming read happens, then
the iterator is reset and starts reading from the beginning again.

My tests would sometimes trigger this bug on my i386 box:

WARNING: CPU: 0 PID: 5175 at kernel/trace/trace.c:1527 __trace_find_cmdline+0x66/0xaa()
Modules linked in:
CPU: 0 PID: 5175 Comm: grep Not tainted 3.16.0-rc3-test+ #8
Hardware name:                  /DG965MQ, BIOS MQ96510J.86A.0372.2006.0605.1717 06/05/2006
 00000000 00000000 f09c9e1c c18796b3 c1b5d74c f09c9e4c c103a0e3 c1b5154b
 f09c9e78 00001437 c1b5d74c 000005f7 c10bd85a c10bd85a c1cac57c f09c9eb0
 ed0e0000 f09c9e64 c103a185 00000009 f09c9e5c c1b5154b f09c9e78 f09c9e80^M
Call Trace:
 [&lt;c18796b3&gt;] dump_stack+0x4b/0x75
 [&lt;c103a0e3&gt;] warn_slowpath_common+0x7e/0x95
 [&lt;c10bd85a&gt;] ? __trace_find_cmdline+0x66/0xaa
 [&lt;c10bd85a&gt;] ? __trace_find_cmdline+0x66/0xaa
 [&lt;c103a185&gt;] warn_slowpath_fmt+0x33/0x35
 [&lt;c10bd85a&gt;] __trace_find_cmdline+0x66/0xaa^M
 [&lt;c10bed04&gt;] trace_find_cmdline+0x40/0x64
 [&lt;c10c3c16&gt;] trace_print_context+0x27/0xec
 [&lt;c10c4360&gt;] ? trace_seq_printf+0x37/0x5b
 [&lt;c10c0b15&gt;] print_trace_line+0x319/0x39b
 [&lt;c10ba3fb&gt;] ? ring_buffer_read+0x47/0x50
 [&lt;c10c13b1&gt;] s_show+0x192/0x1ab
 [&lt;c10bfd9a&gt;] ? s_next+0x5a/0x7c
 [&lt;c112e76e&gt;] seq_read+0x267/0x34c
 [&lt;c1115a25&gt;] vfs_read+0x8c/0xef
 [&lt;c112e507&gt;] ? seq_lseek+0x154/0x154
 [&lt;c1115ba2&gt;] SyS_read+0x54/0x7f
 [&lt;c188488e&gt;] syscall_call+0x7/0xb
---[ end trace 3f507febd6b4cc83 ]---
&gt;&gt;&gt;&gt; ##### CPU 1 buffer started ####

Which was the __trace_find_cmdline() function complaining about the pid
in the event record being negative.

After adding more test cases, this would trigger more often. Strangely
enough, it would never trigger on a single test, but instead would trigger
only when running all the tests. I believe that was the case because it
required one of the tests to be shutting down via delayed instances while
a new test started up.

After spending several days debugging this, I found that it was caused by
the iterator becoming corrupted. Debugging further, I found out why
the iterator became corrupted. It happened with the rb_iter_reset().

As consuming reads may not read the full reader page, and only part
of it, there's a "read" field to know where the last read took place.
The iterator, must also start at the read position. In the rb_iter_reset()
code, if the reader page was disconnected from the ring buffer, the iterator
would start at the head page within the ring buffer (where writes still
happen). But the mistake there was that it still used the "read" field
to start the iterator on the head page, where it should always start
at zero because readers never read from within the ring buffer where
writes occur.

I originally wrote a patch to have it set the iter-&gt;head to 0 instead
of iter-&gt;head_page-&gt;read, but then I questioned why it wasn't always
setting the iter to point to the reader page, as the reader page is
still valid.  The list_empty(reader_page-&gt;list) just means that it was
successful in swapping out. But the reader_page may still have data.

There was a bug report a long time ago that was not reproducible that
had something about trace_pipe (consuming read) not matching trace
(iterator read). This may explain why that happened.

Anyway, the correct answer to this bug is to always use the reader page
an not reset the iterator to inside the writable ring buffer.

Fixes: d769041f8653 "ring_buffer: implement new locking"
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>tracing: Fix wraparound problems in "uptime" trace clock</title>
<updated>2014-07-21T13:56:12Z</updated>
<author>
<name>Tony Luck</name>
<email>tony.luck@intel.com</email>
</author>
<published>2014-07-18T18:43:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=58d4e21e50ff3cc57910a8abc20d7e14375d2f61'/>
<id>urn:sha1:58d4e21e50ff3cc57910a8abc20d7e14375d2f61</id>
<content type='text'>
The "uptime" trace clock added in:

    commit 8aacf017b065a805d27467843490c976835eb4a5
    tracing: Add "uptime" trace clock that uses jiffies

has wraparound problems when the system has been up more
than 1 hour 11 minutes and 34 seconds. It converts jiffies
to nanoseconds using:
        (u64)jiffies_to_usecs(jiffy) * 1000ULL
but since jiffies_to_usecs() only returns a 32-bit value, it
truncates at 2^32 microseconds.  An additional problem on 32-bit
systems is that the argument is "unsigned long", so fixing the
return value only helps until 2^32 jiffies (49.7 days on a HZ=1000
system).

Avoid these problems by using jiffies_64 as our basis, and
not converting to nanoseconds (we do convert to clock_t because
user facing API must not be dependent on internal kernel
HZ values).

Link: http://lkml.kernel.org/p/99d63c5bfe9b320a3b428d773825a37095bf6a51.1405708254.git.tony.luck@intel.com

Cc: stable@vger.kernel.org # 3.10+
Fixes: 8aacf017b065 "tracing: Add "uptime" trace clock that uses jiffies"
Signed-off-by: Tony Luck &lt;tony.luck@intel.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>ring-buffer: Fix polling on trace_pipe</title>
<updated>2014-07-15T21:07:47Z</updated>
<author>
<name>Martin Lau</name>
<email>kafai@fb.com</email>
</author>
<published>2014-06-10T06:06:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=97b8ee845393701edc06e27ccec2876ff9596019'/>
<id>urn:sha1:97b8ee845393701edc06e27ccec2876ff9596019</id>
<content type='text'>
ring_buffer_poll_wait() should always put the poll_table to its wait_queue
even there is immediate data available.  Otherwise, the following epoll and
read sequence will eventually hang forever:

1. Put some data to make the trace_pipe ring_buffer read ready first
2. epoll_ctl(efd, EPOLL_CTL_ADD, trace_pipe_fd, ee)
3. epoll_wait()
4. read(trace_pipe_fd) till EAGAIN
5. Add some more data to the trace_pipe ring_buffer
6. epoll_wait() -&gt; this epoll_wait() will block forever

~ During the epoll_ctl(efd, EPOLL_CTL_ADD,...) call in step 2,
  ring_buffer_poll_wait() returns immediately without adding poll_table,
  which has poll_table-&gt;_qproc pointing to ep_poll_callback(), to its
  wait_queue.
~ During the epoll_wait() call in step 3 and step 6,
  ring_buffer_poll_wait() cannot add ep_poll_callback() to its wait_queue
  because the poll_table-&gt;_qproc is NULL and it is how epoll works.
~ When there is new data available in step 6, ring_buffer does not know
  it has to call ep_poll_callback() because it is not in its wait queue.
  Hence, block forever.

Other poll implementation seems to call poll_wait() unconditionally as the very
first thing to do.  For example, tcp_poll() in tcp.c.

Link: http://lkml.kernel.org/p/20140610060637.GA14045@devbig242.prn2.facebook.com

Cc: stable@vger.kernel.org # 2.6.27
Fixes: 2a2cc8f7c4d0 "ftrace: allow the event pipe to be polled"
Reviewed-by: Chris Mason &lt;clm@fb.com&gt;
Signed-off-by: Martin Lau &lt;kafai@fb.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing: Add TRACE_ITER_PRINTK flag check in __trace_puts/__trace_bputs</title>
<updated>2014-07-15T15:58:40Z</updated>
<author>
<name>zhangwei(Jovi)</name>
<email>jovi.zhangwei@huawei.com</email>
</author>
<published>2013-07-18T08:31:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f0160a5a2912267c02cfe692eac955c360de5fdf'/>
<id>urn:sha1:f0160a5a2912267c02cfe692eac955c360de5fdf</id>
<content type='text'>
The TRACE_ITER_PRINTK check in __trace_puts/__trace_bputs is missing,
so add it, to be consistent with __trace_printk/__trace_bprintk.
Those functions are all called by the same function: trace_printk().

Link: http://lkml.kernel.org/p/51E7A7D6.8090900@huawei.com

Cc: stable@vger.kernel.org # 3.11+
Signed-off-by: zhangwei(Jovi) &lt;jovi.zhangwei@huawei.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing: Fix graph tracer with stack tracer on other archs</title>
<updated>2014-07-15T15:10:25Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2014-07-15T15:05:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5f8bf2d263a20b986225ae1ed7d6759dc4b93af9'/>
<id>urn:sha1:5f8bf2d263a20b986225ae1ed7d6759dc4b93af9</id>
<content type='text'>
Running my ftrace tests on PowerPC, it failed the test that checks
if function_graph tracer is affected by the stack tracer. It was.
Looking into this, I found that the update_function_graph_func()
must be called even if the trampoline function is not changed.
This is because archs like PowerPC do not support ftrace_ops being
passed by assembly and instead uses a helper function (what the
trampoline function points to). Since this function is not changed
even when multiple ftrace_ops are added to the code, the test that
falls out before calling update_function_graph_func() will miss that
the update must still be done.

Call update_function_graph_function() for all calls to
update_ftrace_function()

Cc: stable@vger.kernel.org # 3.3+
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing: Add ftrace_trace_stack into __trace_puts/__trace_bputs</title>
<updated>2014-07-15T15:04:40Z</updated>
<author>
<name>zhangwei(Jovi)</name>
<email>jovi.zhangwei@huawei.com</email>
</author>
<published>2013-07-18T08:31:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8abfb8727f4a724d31f9ccfd8013fbd16d539445'/>
<id>urn:sha1:8abfb8727f4a724d31f9ccfd8013fbd16d539445</id>
<content type='text'>
Currently trace option stacktrace is not applicable for
trace_printk with constant string argument, the reason is
in __trace_puts/__trace_bputs ftrace_trace_stack is missing.

In contrast, when using trace_printk with non constant string
argument(will call into __trace_printk/__trace_bprintk), then
trace option stacktrace is workable, this inconstant result
will confuses users a lot.

Link: http://lkml.kernel.org/p/51E7A7C9.9040401@huawei.com

Cc: stable@vger.kernel.org # 3.10+
Signed-off-by: zhangwei(Jovi) &lt;jovi.zhangwei@huawei.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing: instance_rmdir() leaks ftrace_event_file-&gt;filter</title>
<updated>2014-07-14T18:41:13Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2014-07-11T19:06:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2448e3493cb3874baa90725c87869455ebf11cd2'/>
<id>urn:sha1:2448e3493cb3874baa90725c87869455ebf11cd2</id>
<content type='text'>
instance_rmdir() path destroys the event files but forgets to free
file-&gt;filter. Change remove_event_file_dir() to free_event_filter().

Link: http://lkml.kernel.org/p/20140711190638.GA19517@redhat.com

Cc: Masami Hiramatsu &lt;masami.hiramatsu.pt@hitachi.com&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Srikar Dronamraju &lt;srikar@linux.vnet.ibm.com&gt;
Cc: Tom Zanussi &lt;tom.zanussi@linux.intel.com&gt;
Cc: "zhangwei(Jovi)" &lt;jovi.zhangwei@huawei.com&gt;
Cc: stable@vger.kernel.org # 3.11+
Fixes: f6a84bdc75b5 "tracing: Introduce remove_event_file_dir()"
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing: Remove ftrace_stop/start() from reading the trace file</title>
<updated>2014-07-01T16:45:54Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2014-06-25T03:50:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=099ed151675cd1d2dbeae1dac697975f6a68716d'/>
<id>urn:sha1:099ed151675cd1d2dbeae1dac697975f6a68716d</id>
<content type='text'>
Disabling reading and writing to the trace file should not be able to
disable all function tracing callbacks. There's other users today
(like kprobes and perf). Reading a trace file should not stop those
from happening.

Cc: stable@vger.kernel.org # 3.0+
Reviewed-by: Masami Hiramatsu &lt;masami.hiramatsu.pt@hitachi.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing/uprobes: Fix the usage of uprobe_buffer_enable() in probe_event_enable()</title>
<updated>2014-06-30T17:22:33Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2014-06-27T17:01:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fb6bab6a5ad46d00b5ffa22268f21df1cd7c59df'/>
<id>urn:sha1:fb6bab6a5ad46d00b5ffa22268f21df1cd7c59df</id>
<content type='text'>
The usage of uprobe_buffer_enable() added by dcad1a20 is very wrong,

1. uprobe_buffer_enable() and uprobe_buffer_disable() are not balanced,
   _enable() should be called only if !enabled.

2. If uprobe_buffer_enable() fails probe_event_enable() should clear
   tp.flags and free event_file_link.

3. If uprobe_register() fails it should do uprobe_buffer_disable().

Link: http://lkml.kernel.org/p/20140627170146.GA18332@redhat.com

Acked-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Acked-by: Srikar Dronamraju &lt;srikar@linux.vnet.ibm.com&gt;
Reviewed-by: Masami Hiramatsu &lt;masami.hiramatsu.pt@hitachi.com&gt;
Fixes: dcad1a204f72 "tracing/uprobes: Fetch args before reserving a ring buffer"
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
</feed>
