<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/time/timer.c, branch v4.14.143</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.14.143</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.14.143'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-09-19T20:43:39Z</updated>
<entry>
<title>timers: Clear timer_base::must_forward_clk with timer_base::lock held</title>
<updated>2018-09-19T20:43:39Z</updated>
<author>
<name>Gaurav Kohli</name>
<email>gkohli@codeaurora.org</email>
</author>
<published>2018-08-02T08:51:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cb71229f64835c6c570c7697884054e589630bfd'/>
<id>urn:sha1:cb71229f64835c6c570c7697884054e589630bfd</id>
<content type='text'>
[ Upstream commit 363e934d8811d799c88faffc5bfca782fd728334 ]

timer_base::must_forward_clock is indicating that the base clock might be
stale due to a long idle sleep.

The forwarding of the base clock takes place in the timer softirq or when a
timer is enqueued to a base which is idle. If the enqueue of timer to an
idle base happens from a remote CPU, then the following race can happen:

  CPU0					CPU1
  run_timer_softirq			mod_timer

					base = lock_timer_base(timer);
  base-&gt;must_forward_clk = false
					if (base-&gt;must_forward_clk)
				       	    forward(base); -&gt; skipped

					enqueue_timer(base, timer, idx);
					-&gt; idx is calculated high due to
					   stale base
					unlock_timer_base(timer);
  base = lock_timer_base(timer);
  forward(base);

The root cause is that timer_base::must_forward_clk is cleared outside the
timer_base::lock held region, so the remote queuing CPU observes it as
cleared, but the base clock is still stale. This can cause large
granularity values for timers, i.e. the accuracy of the expiry time
suffers.

Prevent this by clearing the flag with timer_base::lock held, so that the
forwarding takes place before the cleared flag is observable by a remote
CPU.

Signed-off-by: Gaurav Kohli &lt;gkohli@codeaurora.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: john.stultz@linaro.org
Cc: sboyd@kernel.org
Cc: linux-arm-msm@vger.kernel.org
Link: https://lkml.kernel.org/r/1533199863-22748-1-git-send-email-gkohli@codeaurora.org
Signed-off-by: Sasha Levin &lt;alexander.levin@microsoft.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>timers: Forward timer base before migrating timers</title>
<updated>2018-03-09T06:41:04Z</updated>
<author>
<name>Lingutla Chandrasekhar</name>
<email>clingutla@codeaurora.org</email>
</author>
<published>2018-01-18T11:50:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6b218ed6bd07857422b13966678af6e6a3509d86'/>
<id>urn:sha1:6b218ed6bd07857422b13966678af6e6a3509d86</id>
<content type='text'>
commit c52232a49e203a65a6e1a670cd5262f59e9364a0 upstream.

On CPU hotunplug the enqueued timers of the unplugged CPU are migrated to a
live CPU. This happens from the control thread which initiated the unplug.

If the CPU on which the control thread runs came out from a longer idle
period then the base clock of that CPU might be stale because the control
thread runs prior to any event which forwards the clock.

In such a case the timers from the unplugged CPU are queued on the live CPU
based on the stale clock which can cause large delays due to increased
granularity of the outer timer wheels which are far away from base:;clock.

But there is a worse problem than that. The following sequence of events
illustrates it:

 - CPU0 timer1 is queued expires = 59969 and base-&gt;clk = 59131.

   The timer is queued at wheel level 2, with resulting expiry time = 60032
   (due to level granularity).

 - CPU1 enters idle @60007, with next timer expiry @60020.

 - CPU0 is hotplugged at @60009

 - CPU1 exits idle and runs the control thread which migrates the
   timers from CPU0

   timer1 is now queued in level 0 for immediate handling in the next
   softirq because the requested expiry time 59969 is before CPU1 base-&gt;clk
   60007

 - CPU1 runs code which forwards the base clock which succeeds because the
   next expiring timer. which was collected at idle entry time is still set
   to 60020.

   So it forwards beyond 60007 and therefore misses to expire the migrated
   timer1. That timer gets expired when the wheel wraps around again, which
   takes between 63 and 630ms depending on the HZ setting.

Address both problems by invoking forward_timer_base() for the control CPUs
timer base. All other places, which might run into a similar problem
(mod_timer()/add_timer_on()) already invoke forward_timer_base() to avoid
that.

[ tglx: Massaged comment and changelog ]

Fixes: a683f390b93f ("timers: Forward the wheel clock whenever possible")
Co-developed-by: Neeraj Upadhyay &lt;neeraju@codeaurora.org&gt;
Signed-off-by: Neeraj Upadhyay &lt;neeraju@codeaurora.org&gt;
Signed-off-by: Lingutla Chandrasekhar &lt;clingutla@codeaurora.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Cc: linux-arm-msm@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180118115022.6368-1-clingutla@codeaurora.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>timers: Unconditionally check deferrable base</title>
<updated>2018-01-23T18:58:12Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2018-01-14T22:19:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4db98c583205ce93771ee16eeb5543185706df0c'/>
<id>urn:sha1:4db98c583205ce93771ee16eeb5543185706df0c</id>
<content type='text'>
commit ed4bbf7910b28ce3c691aef28d245585eaabda06 upstream.

When the timer base is checked for expired timers then the deferrable base
must be checked as well. This was missed when making the deferrable base
independent of base::nohz_active.

Fixes: ced6d5c11d3e ("timers: Use deferrable base independent of base::nohz_active")
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: rt@linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>timers: Reinitialize per cpu bases on hotplug</title>
<updated>2018-01-02T19:31:15Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-12-27T20:37:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6fae6de72ad44e98b5ae58a662d110c58594aad9'/>
<id>urn:sha1:6fae6de72ad44e98b5ae58a662d110c58594aad9</id>
<content type='text'>
commit 26456f87aca7157c057de65c9414b37f1ab881d1 upstream.

The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
them with a potentially stale clk and next_expiry valuem, which can cause
trouble then the CPU is plugged.

Add a prepare callback which forwards the clock, sets next_expiry to far in
the future and reset the control flags to a known state.

Set base-&gt;must_forward_clk so the first timer which is queued will try to
forward the clock to current jiffies.

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Reported-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>timers: Invoke timer_start_debug() where it makes sense</title>
<updated>2018-01-02T19:31:15Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-12-22T14:51:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8f1aa64ab08621cc3d2b1600ab80da5d3128abb6'/>
<id>urn:sha1:8f1aa64ab08621cc3d2b1600ab80da5d3128abb6</id>
<content type='text'>
commit fd45bb77ad682be728d1002431d77b8c73342836 upstream.

The timer start debug function is called before the proper timer base is
set. As a consequence the trace data contains the stale CPU and flags
values.

Call the debug function after setting the new base and flags.

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: rt@linutronix.de
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20171222145337.792907137@linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>timers: Use deferrable base independent of base::nohz_active</title>
<updated>2018-01-02T19:31:15Z</updated>
<author>
<name>Anna-Maria Gleixner</name>
<email>anna-maria@linutronix.de</email>
</author>
<published>2017-12-22T14:51:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e4fb2e7e92ec638f0bd489c56b4b6d9004fd1f40'/>
<id>urn:sha1:e4fb2e7e92ec638f0bd489c56b4b6d9004fd1f40</id>
<content type='text'>
commit ced6d5c11d3e7b342f1a80f908e6756ebd4b8ddd upstream.

During boot and before base::nohz_active is set in the timer bases, deferrable
timers are enqueued into the standard timer base. This works correctly as
long as base::nohz_active is false.

Once it base::nohz_active is set and a timer which was enqueued before that
is accessed the lock selector code choses the lock of the deferred
base. This causes unlocked access to the standard base and in case the
timer is removed it does not clear the pending flag in the standard base
bitmap which causes get_next_timer_interrupt() to return bogus values.

To prevent that, the deferrable timers must be enqueued in the deferrable
base, even when base::nohz_active is not set. Those deferrable timers also
need to be expired unconditional.

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Signed-off-by: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: rt@linutronix.de
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Link: https://lkml.kernel.org/r/20171222145337.633328378@linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>timers: Fix excessive granularity of new timers after a nohz idle</title>
<updated>2017-08-24T09:40:18Z</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2017-08-22T08:43:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2fe59f507a65dbd734b990a11ebc7488f6f87a24'/>
<id>urn:sha1:2fe59f507a65dbd734b990a11ebc7488f6f87a24</id>
<content type='text'>
When a timer base is idle, it is forwarded when a new timer is added
to ensure that granularity does not become excessive. When not idle,
the timer tick is expected to increment the base.

However there are several problems:

- If an existing timer is modified, the base is forwarded only after
  the index is calculated.

- The base is not forwarded by add_timer_on.

- There is a window after a timer is restarted from a nohz idle, after
  it is marked not-idle and before the timer tick on this CPU, where a
  timer may be added but the ancient base does not get forwarded.

These result in excessive granularity (a 1 jiffy timeout can blow out
to 100s of jiffies), which cause the rcu lockup detector to trigger,
among other things.

Fix this by keeping track of whether the timer base has been idle
since it was last run or forwarded, and if so then forward it before
adding a new timer.

There is still a case where mod_timer optimises the case of a pending
timer mod with the same expiry time, where the timer can see excessive
granularity relative to the new, shorter interval. A comment is added,
but it's not changed because it is an important fastpath for
networking.

This has been tested and found to fix the RCU softlockup messages.

Testing was also done with tracing to measure requested versus
achieved wakeup latencies for all non-deferrable timers in an idle
system (with no lockup watchdogs running). Wakeup latency relative to
absolute latency is calculated (note this suffers from round-up skew
at low absolute times) and analysed:

             max     avg      std
upstream   506.0    1.20     4.68
patched      2.0    1.08     0.15

The bug was noticed due to the lockup detector Kconfig changes
dropping it out of people's .configs and resulting in larger base
clk skew When the lockup detectors are enabled, no CPU can go idle for
longer than 4 seconds, which limits the granularity errors.
Sub-optimal timer behaviour is observable on a smaller scale in that
case:

	     max     avg      std
upstream     9.0    1.05     0.19
patched      2.0    1.04     0.11

Fixes: Fixes: a683f390b93f ("timers: Forward the wheel clock whenever possible")
Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Tested-by: David Miller &lt;davem@davemloft.net&gt;
Cc: dzickus@redhat.com
Cc: sfr@canb.auug.org.au
Cc: mpe@ellerman.id.au
Cc: Stephen Boyd &lt;sboyd@codeaurora.org&gt;
Cc: linuxarm@huawei.com
Cc: abdhalee@linux.vnet.ibm.com
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: akpm@linux-foundation.org
Cc: paulmck@linux.vnet.ibm.com
Cc: torvalds@linux-foundation.org
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/20170822084348.21436-1-npiggin@gmail.com

</content>
</entry>
<entry>
<title>timers: Fix overflow in get_next_timer_interrupt</title>
<updated>2017-08-01T12:20:53Z</updated>
<author>
<name>Matija Glavinic Pecotic</name>
<email>matija.glavinic-pecotic.ext@nokia.com</email>
</author>
<published>2017-08-01T07:11:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=34f41c0316ed52b0b44542491d89278efdaa70e4'/>
<id>urn:sha1:34f41c0316ed52b0b44542491d89278efdaa70e4</id>
<content type='text'>
For e.g. HZ=100, timer being 430 jiffies in the future, and 32 bit
unsigned int, there is an overflow on unsigned int right-hand side
of the expression which results with wrong values being returned.

Type cast the multiplier to 64bit to avoid that issue.

Fixes: 46c8f0b077a8 ("timers: Fix get_next_timer_interrupt() computation")
Signed-off-by: Matija Glavinic Pecotic &lt;matija.glavinic-pecotic.ext@nokia.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Alexander Sverdlin &lt;alexander.sverdlin@nokia.com&gt;
Cc: khilman@baylibre.com
Cc: akpm@linux-foundation.org
Cc: stable@vger.kernel.org
Link: http://lkml.kernel.org/r/a7900f04-2a21-c9fd-67be-ab334d459ee5@nokia.com
</content>
</entry>
<entry>
<title>timers: Make the cpu base lock raw</title>
<updated>2017-06-28T22:27:24Z</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2017-06-27T16:15:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2287d8664fe7345ead891017eccd879fc605305e'/>
<id>urn:sha1:2287d8664fe7345ead891017eccd879fc605305e</id>
<content type='text'>
The timers cpu base lock could not be converted to a raw spinlock becaue
the lock held time was non-deterministic due to cascading and long lasting
timer wheel traversals.

The rework of the timer wheel to the new non-cascading model removed also
the wheel traversals and the lock held times are deterministic now. This
allows to make the lock raw and thereby unbreaks NOHz* on preempt-RT.

Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Link: http://lkml.kernel.org/r/20170627161538.30257-1-bigeasy@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
</entry>
<entry>
<title>timers: Fix parameter description of try_to_del_timer_sync()</title>
<updated>2017-06-20T19:33:55Z</updated>
<author>
<name>Peter Meerwald-Stadler</name>
<email>pmeerw@pmeerw.net</email>
</author>
<published>2017-05-30T19:41:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d15bc69affc57d7985a01745ca28eafa0772325b'/>
<id>urn:sha1:d15bc69affc57d7985a01745ca28eafa0772325b</id>
<content type='text'>
Signed-off-by: Peter Meerwald-Stadler &lt;pmeerw@pmeerw.net&gt;
Link: http://lkml.kernel.org/r/20170530194103.7454-1-pmeerw@pmeerw.net
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: trivial@rustcorp.com.au
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
</entry>
</feed>
