<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/time/timer.c, branch v4.4.2</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.2</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.2'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2015-11-04T19:23:19Z</updated>
<entry>
<title>timers: Use proper base migration in add_timer_on()</title>
<updated>2015-11-04T19:23:19Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2015-11-04T17:15:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=22b886dd1018093920c4250dee2a9a3cb7cff7b8'/>
<id>urn:sha1:22b886dd1018093920c4250dee2a9a3cb7cff7b8</id>
<content type='text'>
Regardless of the previous CPU a timer was on, add_timer_on()
currently simply sets timer-&gt;flags to the new CPU.  As the caller must
be seeing the timer as idle, this is locally fine, but the timer
leaving the old base while unlocked can lead to race conditions as
follows.

Let's say timer was on cpu 0.

  cpu 0					cpu 1
  -----------------------------------------------------------------------------
  del_timer(timer) succeeds
					del_timer(timer)
					  lock_timer_base(timer) locks cpu_0_base
  add_timer_on(timer, 1)
    spin_lock(&amp;cpu_1_base-&gt;lock)
    timer-&gt;flags set to cpu_1_base
    operates on @timer			  operates on @timer

This triggered with mod_delayed_work_on() which contains
"if (del_timer()) add_timer_on()" sequence eventually leading to the
following oops.

  BUG: unable to handle kernel NULL pointer dereference at           (null)
  IP: [&lt;ffffffff810ca6e9&gt;] detach_if_pending+0x69/0x1a0
  ...
  Workqueue: wqthrash wqthrash_workfunc [wqthrash]
  task: ffff8800172ca680 ti: ffff8800172d0000 task.ti: ffff8800172d0000
  RIP: 0010:[&lt;ffffffff810ca6e9&gt;]  [&lt;ffffffff810ca6e9&gt;] detach_if_pending+0x69/0x1a0
  ...
  Call Trace:
   [&lt;ffffffff810cb0b4&gt;] del_timer+0x44/0x60
   [&lt;ffffffff8106e836&gt;] try_to_grab_pending+0xb6/0x160
   [&lt;ffffffff8106e913&gt;] mod_delayed_work_on+0x33/0x80
   [&lt;ffffffffa0000081&gt;] wqthrash_workfunc+0x61/0x90 [wqthrash]
   [&lt;ffffffff8106dba8&gt;] process_one_work+0x1e8/0x650
   [&lt;ffffffff8106e05e&gt;] worker_thread+0x4e/0x450
   [&lt;ffffffff810746af&gt;] kthread+0xef/0x110
   [&lt;ffffffff8185980f&gt;] ret_from_fork+0x3f/0x70

Fix it by updating add_timer_on() to perform proper migration as
__mod_timer() does.

Reported-and-tested-by: Jeff Layton &lt;jlayton@poochiereds.net&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Chris Worley &lt;chris.worley@primarydata.com&gt;
Cc: bfields@fieldses.org
Cc: Michael Skralivetsky &lt;michael.skralivetsky@primarydata.com&gt;
Cc: Trond Myklebust &lt;trond.myklebust@primarydata.com&gt;
Cc: Shaohua Li &lt;shli@fb.com&gt;
Cc: Jeff Layton &lt;jlayton@poochiereds.net&gt;
Cc: kernel-team@fb.com
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20151029103113.2f893924@tlielax.poochiereds.net
Link: http://lkml.kernel.org/r/20151104171533.GI5749@mtj.duckdns.org
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>timers: Use __fls in apply_slack()</title>
<updated>2015-10-11T20:13:46Z</updated>
<author>
<name>Rasmus Villemoes</name>
<email>linux@rasmusvillemoes.dk</email>
</author>
<published>2015-10-02T07:45:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9fc4468d546b6eb55b0aa5b04b0c36238ebf57e7'/>
<id>urn:sha1:9fc4468d546b6eb55b0aa5b04b0c36238ebf57e7</id>
<content type='text'>
In apply_slack(), find_last_bit() is applied to a bitmask consisting
of precisely BITS_PER_LONG bits. Since mask is non-zero, we might as
well eliminate the function call and use __fls() directly. On x86_64,
this shaves 23 bytes of the only caller, mod_timer().

This also gets rid of Coverity CID 1192106, but that is a false
positive: Coverity is not aware that mask != 0 implies that
find_last_bit will not return BITS_PER_LONG.

Signed-off-by: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Link: http://lkml.kernel.org/r/1443771931-6284-1-git-send-email-linux@rasmusvillemoes.dk
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>timers: Fix data race in timer_stats_account_timer()</title>
<updated>2015-09-22T13:43:18Z</updated>
<author>
<name>Dmitry Vyukov</name>
<email>dvyukov@google.com</email>
</author>
<published>2015-09-18T13:54:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3ed769bdb2a2484fd7f9f7f3047413053aacbe21'/>
<id>urn:sha1:3ed769bdb2a2484fd7f9f7f3047413053aacbe21</id>
<content type='text'>
timer_stats_account_timer() reads timer-&gt;start_site, then checks it
for NULL and then re-reads it again, while
timer_stats_timer_clear_start_info() can concurrently reset
timer-&gt;start_site to NULL. This should not lead to crashes, but can
double number of entries in timer stats as start_site is used during
comparison, the doubled entries will have unuseful NULL start_site.

Read timer-&gt;start_site only once in timer_stats_account_timer().

The data race was found with KernelThreadSanitizer (KTSAN).

Signed-off-by: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Cc: andreyknvl@google.com
Cc: glider@google.com
Cc: kcc@google.com
Cc: ktsan@googlegroups.com
Cc: john.stultz@linaro.org
Link: http://lkml.kernel.org/r/1442584463-69553-1-git-send-email-dvyukov@google.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>timer: Write timer-&gt;flags atomically</title>
<updated>2015-08-18T13:31:16Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-08-17T17:18:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d0023a1448abdcc892b8bca631e74bb1888efd02'/>
<id>urn:sha1:d0023a1448abdcc892b8bca631e74bb1888efd02</id>
<content type='text'>
lock_timer_base() cannot prevent the following :

CPU1 ( in __mod_timer()
timer-&gt;flags |= TIMER_MIGRATING;
spin_unlock(&amp;base-&gt;lock);
base = new_base;
spin_lock(&amp;base-&gt;lock);
// The next line clears TIMER_MIGRATING
timer-&gt;flags &amp;= ~TIMER_BASEMASK;
                                  CPU2 (in lock_timer_base())
                                  see timer base is cpu0 base
                                  spin_lock_irqsave(&amp;base-&gt;lock, *flags);
                                  if (timer-&gt;flags == tf)
                                       return base; // oops, wrong base
timer-&gt;flags |= base-&gt;cpu // too late

We must write timer-&gt;flags in one go, otherwise we can fool other cpus.

Fixes: bc7a34b8b9eb ("timer: Reduce timer migration overhead if disabled")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Jon Christopherson &lt;jon@jons.org&gt;
Cc: David Miller &lt;davem@davemloft.net&gt;
Cc: xen-devel@lists.xen.org
Cc: david.vrabel@citrix.com
Cc: Sander Eikelenboom &lt;linux@eikelenboom.it&gt;
Link: http://lkml.kernel.org/r/1439831928.32680.11.camel@edumazet-glaptop2.roam.corp.google.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>timer: Fix hotplug regression</title>
<updated>2015-06-26T20:58:06Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2015-06-26T20:08:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=24bfcb100959c8641a627b5604d967243f8f240c'/>
<id>urn:sha1:24bfcb100959c8641a627b5604d967243f8f240c</id>
<content type='text'>
The recent timer wheel rework removed the get/put_cpu_var() pair in
the hotplug migration code, which results in:

BUG: using smp_processor_id() in preemptible [00000000] code: hib.sh/2845
...
[&lt;ffffffff810d4fa3&gt;] timer_cpu_notify+0x53/0x12

That hunk is a leftover from an earlier iteration and went unnoticed
so far.

Restore the previous code which was obviously correct.

Fixes: 0eeda71bc30d 'timer: Replace timer base by a cpu index'
Reported-and_tested-by: Borislav Petkov &lt;bp@alien8.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>timer: Minimize nohz off overhead</title>
<updated>2015-06-19T13:18:28Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2015-05-26T22:50:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=683be13a284720205228e29207ef11a1c3c322b9'/>
<id>urn:sha1:683be13a284720205228e29207ef11a1c3c322b9</id>
<content type='text'>
If nohz is disabled on the kernel command line the [hr]timer code
still calls wake_up_nohz_cpu() and tick_nohz_full_cpu(), a pretty
pointless exercise. Cache nohz_active in [hr]timer per cpu bases and
avoid the overhead.

Before:
  48.10%  hog       [.] main
  15.25%  [kernel]  [k] _raw_spin_lock_irqsave
   9.76%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.50%  [kernel]  [k] mod_timer
   6.44%  [kernel]  [k] lock_timer_base.isra.38
   3.87%  [kernel]  [k] detach_if_pending
   3.80%  [kernel]  [k] del_timer
   2.67%  [kernel]  [k] internal_add_timer
   1.33%  [kernel]  [k] __internal_add_timer
   0.73%  [kernel]  [k] timerfn
   0.54%  [kernel]  [k] wake_up_nohz_cpu

After:
  48.73%  hog       [.] main
  15.36%  [kernel]  [k] _raw_spin_lock_irqsave
   9.77%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.61%  [kernel]  [k] lock_timer_base.isra.38
   6.42%  [kernel]  [k] mod_timer
   3.90%  [kernel]  [k] detach_if_pending
   3.76%  [kernel]  [k] del_timer
   2.41%  [kernel]  [k] internal_add_timer
   1.39%  [kernel]  [k] __internal_add_timer
   0.76%  [kernel]  [k] timerfn

We probably should have a cached value for nohz full in the per cpu
bases as well to avoid the cpumask check. The base cache line is hot
already, the cpumask not necessarily.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Joonwoo Park &lt;joonwoop@codeaurora.org&gt;
Cc: Wenbo Wang &lt;wenbo.wang@memblaze.com&gt;
Link: http://lkml.kernel.org/r/20150526224512.207378134@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>timer: Reduce timer migration overhead if disabled</title>
<updated>2015-06-19T13:18:28Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2015-05-26T22:50:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bc7a34b8b9ebfb0f4b8a35a72a0b134fd6c5ef50'/>
<id>urn:sha1:bc7a34b8b9ebfb0f4b8a35a72a0b134fd6c5ef50</id>
<content type='text'>
Eric reported that the timer_migration sysctl is not really nice
performance wise as it needs to check at every timer insertion whether
the feature is enabled or not. Further the check does not live in the
timer code, so we have an extra function call which checks an extra
cache line to figure out that it is disabled.

We can do better and store that information in the per cpu (hr)timer
bases. I pondered to use a static key, but that's a nightmare to
update from the nohz code and the timer base cache line is hot anyway
when we select a timer base.

The old logic enabled the timer migration unconditionally if
CONFIG_NO_HZ was set even if nohz was disabled on the kernel command
line.

With this modification, we start off with migration disabled. The user
visible sysctl is still set to enabled. If the kernel switches to NOHZ
migration is enabled, if the user did not disable it via the sysctl
prior to the switch. If nohz=off is on the kernel command line,
migration stays disabled no matter what.

Before:
  47.76%  hog       [.] main
  14.84%  [kernel]  [k] _raw_spin_lock_irqsave
   9.55%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.71%  [kernel]  [k] mod_timer
   6.24%  [kernel]  [k] lock_timer_base.isra.38
   3.76%  [kernel]  [k] detach_if_pending
   3.71%  [kernel]  [k] del_timer
   2.50%  [kernel]  [k] internal_add_timer
   1.51%  [kernel]  [k] get_nohz_timer_target
   1.28%  [kernel]  [k] __internal_add_timer
   0.78%  [kernel]  [k] timerfn
   0.48%  [kernel]  [k] wake_up_nohz_cpu

After:
  48.10%  hog       [.] main
  15.25%  [kernel]  [k] _raw_spin_lock_irqsave
   9.76%  [kernel]  [k] _raw_spin_unlock_irqrestore
   6.50%  [kernel]  [k] mod_timer
   6.44%  [kernel]  [k] lock_timer_base.isra.38
   3.87%  [kernel]  [k] detach_if_pending
   3.80%  [kernel]  [k] del_timer
   2.67%  [kernel]  [k] internal_add_timer
   1.33%  [kernel]  [k] __internal_add_timer
   0.73%  [kernel]  [k] timerfn
   0.54%  [kernel]  [k] wake_up_nohz_cpu


Reported-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Joonwoo Park &lt;joonwoop@codeaurora.org&gt;
Cc: Wenbo Wang &lt;wenbo.wang@memblaze.com&gt;
Link: http://lkml.kernel.org/r/20150526224512.127050787@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>timer: Stats: Simplify the flags handling</title>
<updated>2015-06-19T13:18:27Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2015-05-26T22:50:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c74441a17eb975b604e339ca6c11b9ab9aaca11f'/>
<id>urn:sha1:c74441a17eb975b604e339ca6c11b9ab9aaca11f</id>
<content type='text'>
Simplify the handling of the flag storage for the timer statistics. No
intermediate storage anymore. Just hand over the flags field.

I left the printout of 'deferrable' for now because changing this
would be an ABI update and I have no idea how strong people feel about
that. OTOH, I wonder whether we should kill the whole timer stats
stuff because all of that information can be retrieved via ftrace/perf
as well.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Joonwoo Park &lt;joonwoop@codeaurora.org&gt;
Cc: Wenbo Wang &lt;wenbo.wang@memblaze.com&gt;
Link: http://lkml.kernel.org/r/20150526224512.046626248@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>timer: Replace timer base by a cpu index</title>
<updated>2015-06-19T13:18:27Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2015-05-26T22:50:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0eeda71bc30d74f66f8231f45621d5ace3419186'/>
<id>urn:sha1:0eeda71bc30d74f66f8231f45621d5ace3419186</id>
<content type='text'>
Instead of storing a pointer to the per cpu tvec_base we can simply
cache a CPU index in the timer_list and use that to get hold of the
correct per cpu tvec_base. This is only used in lock_timer_base() and
the slightly larger code is peanuts versus the spinlock operation and
the d-cache foot print of the timer wheel.

Aside of that this allows to get rid of following nuisances:

 - boot_tvec_base

   That statically allocated 4k bss data is just kept around so the
   timer has a home when it gets statically initialized. It serves no
   other purpose.

   With the CPU index we assign the timer to CPU0 at static
   initialization time and therefor can avoid the whole boot_tvec_base
   dance.  That also simplifies the init code, which just can use the
   per cpu base.

   Before:
     text	   data	    bss	    dec	    hex	filename
    17491	   9201	   4160	  30852	   7884	../build/kernel/time/timer.o
   After:
     text	   data	    bss	    dec	    hex	filename
    17440	   9193	      0	  26633	   6809	../build/kernel/time/timer.o

 - Overloading the base pointer with various flags

   The CPU index has enough space to hold the flags (deferrable,
   irqsafe) so we can get rid of the extra masking and bit fiddling
   with the base pointer.

As a benefit we reduce the size of struct timer_list on 64 bit
machines. 4 - 8 bytes, a size reduction up to 15% per struct timer_list,
which is a real win as we have tons of them embedded in other structs.

This changes also the newly added deferrable printout of the timer
start trace point to capture and print all timer-&gt;flags, which allows
us to decode the target cpu of the timer as well.

We might have used bitfields for this, but that would change the
static initializers and the init function for no value to accomodate
big endian bitfields.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Joonwoo Park &lt;joonwoop@codeaurora.org&gt;
Cc: Wenbo Wang &lt;wenbo.wang@memblaze.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Badhri Jagan Sridharan &lt;Badhri@google.com&gt;
Link: http://lkml.kernel.org/r/20150526224511.950084301@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
</entry>
<entry>
<title>timer: Use hlist for the timer wheel hash buckets</title>
<updated>2015-06-19T13:18:27Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2015-05-26T22:50:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1dabbcec2c0a36fe43509d06499b9e512e70a028'/>
<id>urn:sha1:1dabbcec2c0a36fe43509d06499b9e512e70a028</id>
<content type='text'>
This reduces the size of struct tvec_base by 50% and results in
slightly smaller code as well.

Before:
   struct tvec_base: size: 8256, cachelines: 129

   text	   data	    bss	    dec	    hex	filename
  17698	  13297	   8256	  39251	   9953	../build/kernel/time/timer.o

After:
  struct tvec_base: 4160, cachelines: 65

   text	   data	    bss	    dec	    hex	filename
  17491	   9201	   4160	  30852	   7884	../build/kernel/time/timer.o

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Joonwoo Park &lt;joonwoop@codeaurora.org&gt;
Cc: Wenbo Wang &lt;wenbo.wang@memblaze.com&gt;
Link: http://lkml.kernel.org/r/20150526224511.854731214@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
</feed>
