<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/trace/ring_buffer.c, branch v4.1.22</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.1.22</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.1.22'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2015-03-30T17:36:31Z</updated>
<entry>
<title>ring-buffer: Remove duplicate use of '&amp;' in recursive code</title>
<updated>2015-03-30T17:36:31Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2015-03-27T21:39:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d631c8cceb1d1d06f372878935949d421585186b'/>
<id>urn:sha1:d631c8cceb1d1d06f372878935949d421585186b</id>
<content type='text'>
A clean up of the recursive protection code changed

  val = this_cpu_read(current_context);
  val--;
  val &amp;= this_cpu_read(current_context);

to

  val = this_cpu_read(current_context);
  val &amp;= val &amp; (val - 1);

Which has a duplicate use of '&amp;' as the above is the same as

  val = val &amp; (val - 1);

Actually, it would be best to remove that line altogether and
just add it to where it is used.

And Christoph even mentioned that it can be further compacted to
just a single line:

  __this_cpu_and(current_context, __this_cpu_read(current_context) - 1);

Link: http://lkml.kernel.org/alpine.DEB.2.11.1503271423580.23114@gentwo.org

Suggested-by: Christoph Lameter &lt;cl@linux.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>ring-buffer: Replace this_cpu_*() with __this_cpu_*()</title>
<updated>2015-03-25T12:56:49Z</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2015-03-17T14:40:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=80a9b64e2c156b6523e7a01f2ba6e5d86e722814'/>
<id>urn:sha1:80a9b64e2c156b6523e7a01f2ba6e5d86e722814</id>
<content type='text'>
It has come to my attention that this_cpu_read/write are horrible on
architectures other than x86. Worse yet, they actually disable
preemption or interrupts! This caused some unexpected tracing results
on ARM.

   101.356868: preempt_count_add &lt;-ring_buffer_lock_reserve
   101.356870: preempt_count_sub &lt;-ring_buffer_lock_reserve

The ring_buffer_lock_reserve has recursion protection that requires
accessing a per cpu variable. But since preempt_disable() is traced, it
too got traced while accessing the variable that is suppose to prevent
recursion like this.

The generic version of this_cpu_read() and write() are:

 #define this_cpu_generic_read(pcp)					\
 ({	typeof(pcp) ret__;						\
	preempt_disable();						\
	ret__ = *this_cpu_ptr(&amp;(pcp));					\
	preempt_enable();						\
	ret__;								\
 })

 #define this_cpu_generic_to_op(pcp, val, op)				\
 do {									\
	unsigned long flags;						\
	raw_local_irq_save(flags);					\
	*__this_cpu_ptr(&amp;(pcp)) op val;					\
	raw_local_irq_restore(flags);					\
 } while (0)

Which is unacceptable for locations that know they are within preempt
disabled or interrupt disabled locations.

Paul McKenney stated that __this_cpu_() versions produce much better code on
other architectures than this_cpu_() does, if we know that the call is done in
a preempt disabled location.

I also changed the recursive_unlock() to use two local variables instead
of accessing the per_cpu variable twice.

Link: http://lkml.kernel.org/r/20150317114411.GE3589@linux.vnet.ibm.com
Link: http://lkml.kernel.org/r/20150317104038.312e73d1@gandalf.local.home

Cc: stable@vger.kernel.org
Acked-by: Christoph Lameter &lt;cl@linux.com&gt;
Reported-by: Uwe Kleine-Koenig &lt;u.kleine-koenig@pengutronix.de&gt;
Tested-by: Uwe Kleine-Koenig &lt;u.kleine-koenig@pengutronix.de&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>ring-buffer: Do not wake up a splice waiter when page is not full</title>
<updated>2015-02-11T12:41:42Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2015-02-11T03:14:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1e0d6714aceb770b04161fbedd7765d0e1fc27bd'/>
<id>urn:sha1:1e0d6714aceb770b04161fbedd7765d0e1fc27bd</id>
<content type='text'>
When an application connects to the ring buffer via splice, it can only
read full pages. Splice does not work with partial pages. If there is
not enough data to fill a page, the splice command will either block
or return -EAGAIN (if set to nonblock).

Code was added where if the page is not full, to just sleep again.
The problem is, it will get woken up again on the next event. That
is, when something is written into the ring buffer, if there is a waiter
it will wake it up. The waiter would then check the buffer, see that
it still does not have enough data to fill a page and go back to sleep.
To make matters worse, when the waiter goes back to sleep, it could
cause another event, which would wake it back up again to see it
doesn't have enough data and sleep again. This produces a tremendous
overhead and fills the ring buffer with noise.

For example, recording sched_switch on an idle system for 10 seconds
produces 25,350,475 events!!!

Create another wait queue for those waiters wanting full pages.
When an event is written, it only wakes up waiters if there's a full
page of data. It does not wake up the waiter if the page is not yet
full.

After this change, recording sched_switch on an idle system for 10
seconds produces only 800 events. Getting rid of 25,349,675 useless
events (99.9969% of events!!), is something to take seriously.

Cc: stable@vger.kernel.org # 3.16+
Cc: Rabin Vincent &lt;rabin@rab.in&gt;
Fixes: e30f53aad220 "tracing: Do not busy wait in buffer splice"
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing: Remove unneeded includes of debugfs.h and fs.h</title>
<updated>2015-01-22T16:19:48Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2015-01-20T16:28:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3efb5f21a36fbddd524cffe36426a84622ce580e'/>
<id>urn:sha1:3efb5f21a36fbddd524cffe36426a84622ce580e</id>
<content type='text'>
The creation of tracing files and directories is for the most part
encapsulated in helper functions in trace.c. Other files do not need to
include debugfs.h or fs.h, as they may have needed to in the past.

Remove them from the files that do not need them.

Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'trace-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace</title>
<updated>2014-12-11T03:58:13Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-12-11T03:58:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1dd7dcb6eaa677b034e7ef63df8320277507ae70'/>
<id>urn:sha1:1dd7dcb6eaa677b034e7ef63df8320277507ae70</id>
<content type='text'>
Pull tracing updates from Steven Rostedt:
 "There was a lot of clean ups and minor fixes.  One of those clean ups
  was to the trace_seq code.  It also removed the return values to the
  trace_seq_*() functions and use trace_seq_has_overflowed() to see if
  the buffer filled up or not.  This is similar to work being done to
  the seq_file code as well in another tree.

  Some of the other goodies include:

   - Added some "!" (NOT) logic to the tracing filter.

   - Fixed the frame pointer logic to the x86_64 mcount trampolines

   - Added the logic for dynamic trampolines on !CONFIG_PREEMPT systems.
     That is, the ftrace trampoline can be dynamically allocated and be
     called directly by functions that only have a single hook to them"

* tag 'trace-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (55 commits)
  tracing: Truncated output is better than nothing
  tracing: Add additional marks to signal very large time deltas
  Documentation: describe trace_buf_size parameter more accurately
  tracing: Allow NOT to filter AND and OR clauses
  tracing: Add NOT to filtering logic
  ftrace/fgraph/x86: Have prepare_ftrace_return() take ip as first parameter
  ftrace/x86: Get rid of ftrace_caller_setup
  ftrace/x86: Have save_mcount_regs macro also save stack frames if needed
  ftrace/x86: Add macro MCOUNT_REG_SIZE for amount of stack used to save mcount regs
  ftrace/x86: Simplify save_mcount_regs on getting RIP
  ftrace/x86: Have save_mcount_regs store RIP in %rdi for first parameter
  ftrace/x86: Rename MCOUNT_SAVE_FRAME and add more detailed comments
  ftrace/x86: Move MCOUNT_SAVE_FRAME out of header file
  ftrace/x86: Have static tracing also use ftrace_caller_setup
  ftrace/x86: Have static function tracing always test for function graph
  kprobes: Add IPMODIFY flag to kprobe_ftrace_ops
  ftrace, kprobes: Support IPMODIFY flag to find IP modify conflict
  kprobes/ftrace: Recover original IP if pre_handler doesn't change it
  tracing/trivial: Fix typos and make an int into a bool
  tracing: Deletion of an unnecessary check before iput()
  ...
</content>
</entry>
<entry>
<title>ring-buffer: Remove check of trace_seq_{puts,printf}() return values</title>
<updated>2014-11-19T20:25:40Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2014-11-12T16:49:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c0cd93aa1640a48038bacbee093695f892ea0130'/>
<id>urn:sha1:c0cd93aa1640a48038bacbee093695f892ea0130</id>
<content type='text'>
Remove checking the return value of all trace_seq_puts(). It was wrong
anyway as only the last return value mattered. But as the trace_seq_puts()
is going to be a void function in the future, we should not be checking
the return value of it anyway.

Just return !trace_seq_has_overflowed() instead.

Reviewed-by: Petr Mladek &lt;pmladek@suse.cz&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing: Do not busy wait in buffer splice</title>
<updated>2014-11-10T21:45:43Z</updated>
<author>
<name>Rabin Vincent</name>
<email>rabin@rab.in</email>
</author>
<published>2014-11-10T18:46:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e30f53aad2202b5526c40c36d8eeac8bf290bde5'/>
<id>urn:sha1:e30f53aad2202b5526c40c36d8eeac8bf290bde5</id>
<content type='text'>
On a !PREEMPT kernel, attempting to use trace-cmd results in a soft
lockup:

 # trace-cmd record -e raw_syscalls:* -F false
 NMI watchdog: BUG: soft lockup - CPU#0 stuck for 22s! [trace-cmd:61]
 ...
 Call Trace:
  [&lt;ffffffff8105b580&gt;] ? __wake_up_common+0x90/0x90
  [&lt;ffffffff81092e25&gt;] wait_on_pipe+0x35/0x40
  [&lt;ffffffff810936e3&gt;] tracing_buffers_splice_read+0x2e3/0x3c0
  [&lt;ffffffff81093300&gt;] ? tracing_stats_read+0x2a0/0x2a0
  [&lt;ffffffff812d10ab&gt;] ? _raw_spin_unlock+0x2b/0x40
  [&lt;ffffffff810dc87b&gt;] ? do_read_fault+0x21b/0x290
  [&lt;ffffffff810de56a&gt;] ? handle_mm_fault+0x2ba/0xbd0
  [&lt;ffffffff81095c80&gt;] ? trace_event_buffer_lock_reserve+0x40/0x80
  [&lt;ffffffff810951e2&gt;] ? trace_buffer_lock_reserve+0x22/0x60
  [&lt;ffffffff81095c80&gt;] ? trace_event_buffer_lock_reserve+0x40/0x80
  [&lt;ffffffff8112415d&gt;] do_splice_to+0x6d/0x90
  [&lt;ffffffff81126971&gt;] SyS_splice+0x7c1/0x800
  [&lt;ffffffff812d1edd&gt;] tracesys_phase2+0xd3/0xd8

The problem is this: tracing_buffers_splice_read() calls
ring_buffer_wait() to wait for data in the ring buffers.  The buffers
are not empty so ring_buffer_wait() returns immediately.  But
tracing_buffers_splice_read() calls ring_buffer_read_page() with full=1,
meaning it only wants to read a full page.  When the full page is not
available, tracing_buffers_splice_read() tries to wait again with
ring_buffer_wait(), which again returns immediately, and so on.

Fix this by adding a "full" argument to ring_buffer_wait() which will
make ring_buffer_wait() wait until the writer has left the reader's
page, i.e.  until full-page reads will succeed.

Link: http://lkml.kernel.org/r/1415645194-25379-1-git-send-email-rabin@rab.in

Cc: stable@vger.kernel.org # 3.16+
Fixes: b1169cc69ba9 ("tracing: Remove mock up poll wait function")
Signed-off-by: Rabin Vincent &lt;rabin@rab.in&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>ring-buffer: Fix infinite spin in reading buffer</title>
<updated>2014-10-02T20:51:18Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2014-10-02T20:51:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=24607f114fd14f2f37e3e0cb3d47bce96e81e848'/>
<id>urn:sha1:24607f114fd14f2f37e3e0cb3d47bce96e81e848</id>
<content type='text'>
Commit 651e22f2701b "ring-buffer: Always reset iterator to reader page"
fixed one bug but in the process caused another one. The reset is to
update the header page, but that fix also changed the way the cached
reads were updated. The cache reads are used to test if an iterator
needs to be updated or not.

A ring buffer iterator, when created, disables writes to the ring buffer
but does not stop other readers or consuming reads from happening.
Although all readers are synchronized via a lock, they are only
synchronized when in the ring buffer functions. Those functions may
be called by any number of readers. The iterator continues down when
its not interrupted by a consuming reader. If a consuming read
occurs, the iterator starts from the beginning of the buffer.

The way the iterator sees that a consuming read has happened since
its last read is by checking the reader "cache". The cache holds the
last counts of the read and the reader page itself.

Commit 651e22f2701b changed what was saved by the cache_read when
the rb_iter_reset() occurred, making the iterator never match the cache.
Then if the iterator calls rb_iter_reset(), it will go into an
infinite loop by checking if the cache doesn't match, doing the reset
and retrying, just to see that the cache still doesn't match! Which
should never happen as the reset is suppose to set the cache to the
current value and there's locks that keep a consuming reader from
having access to the data.

Fixes: 651e22f2701b "ring-buffer: Always reset iterator to reader page"
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>trace: Fix epoll hang when we race with new entries</title>
<updated>2014-08-26T00:18:11Z</updated>
<author>
<name>Josef Bacik</name>
<email>jbacik@fb.com</email>
</author>
<published>2014-08-25T17:59:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4ce97dbf50245227add17c83d87dc838e7ca79d0'/>
<id>urn:sha1:4ce97dbf50245227add17c83d87dc838e7ca79d0</id>
<content type='text'>
Epoll on trace_pipe can sometimes hang in a weird case.  If the ring buffer is
empty when we set waiters_pending but an event shows up exactly at that moment
we can miss being woken up by the ring buffers irq work.  Since
ring_buffer_empty() is inherently racey we will sometimes think that the buffer
is not empty.  So we don't get woken up and we don't think there are any events
even though there were some ready when we added the watch, which makes us hang.
This patch fixes this by making sure that we are actually on the wait list
before we set waiters_pending, and add a memory barrier to make sure
ring_buffer_empty() is going to be correct.

Link: http://lkml.kernel.org/p/1408989581-23727-1-git-send-email-jbacik@fb.com

Cc: stable@vger.kernel.org # 3.10+
Cc: Martin Lau &lt;kafai@fb.com&gt;
Signed-off-by: Josef Bacik &lt;jbacik@fb.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'trace-fixes-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace</title>
<updated>2014-08-10T00:29:36Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2014-08-10T00:29:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fc335c1b68c68f626f07f1819e57d112d666bbba'/>
<id>urn:sha1:fc335c1b68c68f626f07f1819e57d112d666bbba</id>
<content type='text'>
Pull trace file read iterator fixes from Steven Rostedt:
 "This contains a fix for two long standing bugs.  Both of which are
  rarely ever hit, and requires the user to do something that users
  rarely do.  It took a few special test cases to even trigger this bug,
  and one of them was just one test in the process of finishing up as
  another one started.

  Both bugs have to do with the ring buffer iterator rb_iter_peek(), but
  one is more indirect than the other.

  The fist bug fix is simply an increase in the safety net loop counter.
  The counter makes sure that the rb_iter_peek() only iterates the
  number of times we expect it can, and no more.  Well, there was one
  way it could iterate one more than we expected, and that caused the
  ring buffer to shutdown with a nasty warning.  The fix was simply to
  up that counter by one.

  The other bug has to be with rb_iter_reset() (called by
  rb_iter_peek()).  This happens when a user reads both the trace_pipe
  and trace files.  The trace_pipe is a consuming read and does not use
  the ring buffer iterator, but the trace file is not a consuming read
  and does use the ring buffer iterator.  When the trace file is being
  read, if it detects that a consuming read occurred, it resets the
  iterator and starts over.  But the reset code that does this
  (rb_iter_reset()), checks if the reader_page is linked to the ring
  buffer or not, and will look into the ring buffer itself if it is not.
  This is wrong, as it should always try to read the reader page first.
  Not to mention, the code that looked into the ring buffer did it
  wrong, and used the header_page "read" offset to start reading on that
  page.  That offset is bogus for pages in the writable ring buffer, and
  was corrupting the iterator, and it would start returning bogus
  events"

* tag 'trace-fixes-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace:
  ring-buffer: Always reset iterator to reader page
  ring-buffer: Up rb_iter_peek() loop count to 3
</content>
</entry>
</feed>
