<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/trace/trace.h, branch v4.9.5</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.5</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.5'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2016-09-12T13:59:46Z</updated>
<entry>
<title>tracing: Have max_latency be defined for HWLAT_TRACER as well</title>
<updated>2016-09-12T13:59:46Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2016-09-07T16:45:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f971cc9aabc287120bbe7f3f1abe70c13e61ee94'/>
<id>urn:sha1:f971cc9aabc287120bbe7f3f1abe70c13e61ee94</id>
<content type='text'>
The hwlat tracer uses tr-&gt;max_latency, and if it's the only tracer enabled
that uses it, the build will fail. Add max_latency and its file when the
hwlat tracer is enabled.

Link: http://lkml.kernel.org/r/d6c3b7eb-ba95-1ffa-0453-464e1e24262a@infradead.org

Reported-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Tested-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Acked-by: Randy Dunlap &lt;rdunlap@infradead.org&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing: Added hardware latency tracer</title>
<updated>2016-09-02T16:47:51Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2016-06-23T16:45:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e7c15cd8a113335cf7154f027c9c8da1a92238ee'/>
<id>urn:sha1:e7c15cd8a113335cf7154f027c9c8da1a92238ee</id>
<content type='text'>
The hardware latency tracer has been in the PREEMPT_RT patch for some time.
It is used to detect possible SMIs or any other hardware interruptions that
the kernel is unaware of. Note, NMIs may also be detected, but that may be
good to note as well.

The logic is pretty simple. It simply creates a thread that spins on a
single CPU for a specified amount of time (width) within a periodic window
(window). These numbers may be adjusted by their cooresponding names in

   /sys/kernel/tracing/hwlat_detector/

The defaults are window = 1000000 us (1 second)
                 width  =  500000 us (1/2 second)

The loop consists of:

	t1 = trace_clock_local();
	t2 = trace_clock_local();

Where trace_clock_local() is a variant of sched_clock().

The difference of t2 - t1 is recorded as the "inner" timestamp and also the
timestamp  t1 - prev_t2 is recorded as the "outer" timestamp. If either of
these differences are greater than the time denoted in
/sys/kernel/tracing/tracing_thresh then it records the event.

When this tracer is started, and tracing_thresh is zero, it changes to the
default threshold of 10 us.

The hwlat tracer in the PREEMPT_RT patch was originally written by
Jon Masters. I have modified it quite a bit and turned it into a
tracer.

Based-on-code-by: Jon Masters &lt;jcm@redhat.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>ftrace: Reduce size of function graph entries</title>
<updated>2016-07-05T21:28:30Z</updated>
<author>
<name>Namhyung Kim</name>
<email>namhyung@kernel.org</email>
</author>
<published>2016-06-29T10:56:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a4a551b8f1d4c4ebffd0f49dfef44df3128546f8'/>
<id>urn:sha1:a4a551b8f1d4c4ebffd0f49dfef44df3128546f8</id>
<content type='text'>
Currently ftrace_graph_ent{,_entry} and ftrace_graph_ret{,_entry} struct
can have padding bytes at the end due to alignment in 64-bit data type.
As these data are recorded so frequently, those paddings waste
non-negligible space.  As the ring buffer maintains alignment properly
for each architecture, just to remove the extra padding using 'packed'
attribute.

  ftrace_graph_ent_entry:  24 -&gt; 20
  ftrace_graph_ret_entry:  48 -&gt; 44

Also I moved the 'overrun' field in struct ftrace_graph_ret to minimize
the padding in the middle.

Tested on x86_64 only.

Link: http://lkml.kernel.org/r/1467197808-13578-1-git-send-email-namhyung@kernel.org

Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: linux-arch@vger.kernel.org
Signed-off-by: Namhyung Kim &lt;namhyung@kernel.org&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>ftrace: Move toplevel init out of ftrace_init_tracefs()</title>
<updated>2016-07-05T14:47:03Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2016-07-05T14:04:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=501c2375253c0795048f48368e0b3e8b2f6646dc'/>
<id>urn:sha1:501c2375253c0795048f48368e0b3e8b2f6646dc</id>
<content type='text'>
Commit 345ddcc882d8 ("ftrace: Have set_ftrace_pid use the bitmap like events
do") placed ftrace_init_tracefs into the instance creation, and encapsulated
the top level updating with an if conditional, as the top level only gets
updated at boot up. Unfortunately, this triggers section mismatch errors as
the init functions are called from a function that can be called later, and
the section mismatch logic is unaware of the if conditional that would
prevent it from happening at run time.

To make everyone happy, create a separate ftrace_init_tracefs_toplevel()
routine that only gets called by init functions, and this will be what calls
other init functions for the toplevel directory.

Link: http://lkml.kernel.org/r/20160704102139.19cbc0d9@gandalf.local.home

Reported-by: kbuild test robot &lt;fengguang.wu@intel.com&gt;
Reported-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Fixes: 345ddcc882d8 ("ftrace: Have set_ftrace_pid use the bitmap like events do")
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>ftrace: Have set_ftrace_pid use the bitmap like events do</title>
<updated>2016-06-20T13:54:19Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2016-04-22T22:11:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=345ddcc882d8896dcbdcb3e0ee4a415fc23ec8b0'/>
<id>urn:sha1:345ddcc882d8896dcbdcb3e0ee4a415fc23ec8b0</id>
<content type='text'>
Convert set_ftrace_pid to use the bitmap like set_event_pid does. This
allows for instances to use the pid filtering as well, and will allow for
function-fork option to set if the children of a traced function should be
traced or not.

Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing: Move pid_list write processing into its own function</title>
<updated>2016-06-20T13:54:18Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2016-04-21T15:35:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=76c813e26606d35ea9d8d6f96e646b3944c730a9'/>
<id>urn:sha1:76c813e26606d35ea9d8d6f96e646b3944c730a9</id>
<content type='text'>
The addition of PIDs into a pid_list via the write operation of
set_event_pid is a bit complex. The same operation will be needed for
function tracing pids. Move the code into its own generic function in
trace.c, so that we can avoid duplication of this code.

Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing: Move the pid_list seq_file functions to be global</title>
<updated>2016-06-20T13:54:17Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2016-04-20T19:19:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5cc8976bd52153678ca37cc1e3000833b20276f3'/>
<id>urn:sha1:5cc8976bd52153678ca37cc1e3000833b20276f3</id>
<content type='text'>
To allow other aspects of ftrace to use the pid_list logic, we need to reuse
the seq_file functions. Making the generic part into functions that can be
called by other files will help in this regard.

Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing: Make the pid filtering helper functions global</title>
<updated>2016-06-20T13:54:16Z</updated>
<author>
<name>Steven Rostedt</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2016-04-14T11:38:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4e267db135c44d0b18e553899fe7df32b89211a5'/>
<id>urn:sha1:4e267db135c44d0b18e553899fe7df32b89211a5</id>
<content type='text'>
Make the functions used for pid filtering global for tracing, such that the
function tracer can use the pid code as well.

Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing: Use temp buffer when filtering events</title>
<updated>2016-05-03T21:59:24Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2016-05-03T21:15:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0fc1b09ff1ff404ddf753f5ffa5cd0adc8fdcdc9'/>
<id>urn:sha1:0fc1b09ff1ff404ddf753f5ffa5cd0adc8fdcdc9</id>
<content type='text'>
Filtering of events requires the data to be written to the ring buffer
before it can be decided to filter or not. This is because the parameters of
the filter are based on the result that is written to the ring buffer and
not on the parameters that are passed into the trace functions.

The ftrace ring buffer is optimized for writing into the ring buffer and
committing. The discard procedure used when filtering decides the event
should be discarded is much more heavy weight. Thus, using a temporary
filter when filtering events can speed things up drastically.

Without a temp buffer we have:

 # trace-cmd start -p nop
 # perf stat -r 10 hackbench 50
       0.790706626 seconds time elapsed ( +-  0.71% )

 # trace-cmd start -e all
 # perf stat -r 10 hackbench 50
       1.566904059 seconds time elapsed ( +-  0.27% )

 # trace-cmd start -e all -f 'common_preempt_count==20'
 # perf stat -r 10 hackbench 50
       1.690598511 seconds time elapsed ( +-  0.19% )

 # trace-cmd start -e all -f 'common_preempt_count!=20'
 # perf stat -r 10 hackbench 50
       1.707486364 seconds time elapsed ( +-  0.30% )

The first run above is without any tracing, just to get a based figure.
hackbench takes ~0.79 seconds to run on the system.

The second run enables tracing all events where nothing is filtered. This
increases the time by 100% and hackbench takes 1.57 seconds to run.

The third run filters all events where the preempt count will equal "20"
(this should never happen) thus all events are discarded. This takes 1.69
seconds to run. This is 10% slower than just committing the events!

The last run enables all events and filters where the filter will commit all
events, and this takes 1.70 seconds to run. The filtering overhead is
approximately 10%. Thus, the discard and commit of an event from the ring
buffer may be about the same time.

With this patch, the numbers change:

 # trace-cmd start -p nop
 # perf stat -r 10 hackbench 50
       0.778233033 seconds time elapsed ( +-  0.38% )

 # trace-cmd start -e all
 # perf stat -r 10 hackbench 50
       1.582102692 seconds time elapsed ( +-  0.28% )

 # trace-cmd start -e all -f 'common_preempt_count==20'
 # perf stat -r 10 hackbench 50
       1.309230710 seconds time elapsed ( +-  0.22% )

 # trace-cmd start -e all -f 'common_preempt_count!=20'
 # perf stat -r 10 hackbench 50
       1.786001924 seconds time elapsed ( +-  0.20% )

The first run is again the base with no tracing.

The second run is all tracing with no filtering. It is a little slower, but
that may be well within the noise.

The third run shows that discarding all events only took 1.3 seconds. This
is a speed up of 23%! The discard is much faster than even the commit.

The one downside is shown in the last run. Events that are not discarded by
the filter will take longer to add, this is due to the extra copy of the
event.

Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
<entry>
<title>tracing: Have trace_buffer_unlock_commit() call the _regs version with NULL</title>
<updated>2016-04-29T21:44:01Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2016-04-29T21:44:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=33fddff24d05d71f97722cb7deec4964d39d10dc'/>
<id>urn:sha1:33fddff24d05d71f97722cb7deec4964d39d10dc</id>
<content type='text'>
There's no real difference between trace_buffer_unlock_commit() and
trace_buffer_unlock_commit_regs() except that the former passes NULL to
ftrace_stack_trace() instead of regs. Have the former be a static inline of
the latter which passes NULL for regs.

Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
</content>
</entry>
</feed>
