<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/time/timer.c, branch v4.9.119</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.119</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.119'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-03-11T15:21:28Z</updated>
<entry>
<title>timers: Forward timer base before migrating timers</title>
<updated>2018-03-11T15:21:28Z</updated>
<author>
<name>Lingutla Chandrasekhar</name>
<email>clingutla@codeaurora.org</email>
</author>
<published>2018-01-18T11:50:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=13e75c74cd69ca460778fad5ab902f0b20869267'/>
<id>urn:sha1:13e75c74cd69ca460778fad5ab902f0b20869267</id>
<content type='text'>
commit c52232a49e203a65a6e1a670cd5262f59e9364a0 upstream.

On CPU hotunplug the enqueued timers of the unplugged CPU are migrated to a
live CPU. This happens from the control thread which initiated the unplug.

If the CPU on which the control thread runs came out from a longer idle
period then the base clock of that CPU might be stale because the control
thread runs prior to any event which forwards the clock.

In such a case the timers from the unplugged CPU are queued on the live CPU
based on the stale clock which can cause large delays due to increased
granularity of the outer timer wheels which are far away from base:;clock.

But there is a worse problem than that. The following sequence of events
illustrates it:

 - CPU0 timer1 is queued expires = 59969 and base-&gt;clk = 59131.

   The timer is queued at wheel level 2, with resulting expiry time = 60032
   (due to level granularity).

 - CPU1 enters idle @60007, with next timer expiry @60020.

 - CPU0 is hotplugged at @60009

 - CPU1 exits idle and runs the control thread which migrates the
   timers from CPU0

   timer1 is now queued in level 0 for immediate handling in the next
   softirq because the requested expiry time 59969 is before CPU1 base-&gt;clk
   60007

 - CPU1 runs code which forwards the base clock which succeeds because the
   next expiring timer. which was collected at idle entry time is still set
   to 60020.

   So it forwards beyond 60007 and therefore misses to expire the migrated
   timer1. That timer gets expired when the wheel wraps around again, which
   takes between 63 and 630ms depending on the HZ setting.

Address both problems by invoking forward_timer_base() for the control CPUs
timer base. All other places, which might run into a similar problem
(mod_timer()/add_timer_on()) already invoke forward_timer_base() to avoid
that.

[ tglx: Massaged comment and changelog ]

Fixes: a683f390b93f ("timers: Forward the wheel clock whenever possible")
Co-developed-by: Neeraj Upadhyay &lt;neeraju@codeaurora.org&gt;
Signed-off-by: Neeraj Upadhyay &lt;neeraju@codeaurora.org&gt;
Signed-off-by: Lingutla Chandrasekhar &lt;clingutla@codeaurora.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Cc: linux-arm-msm@vger.kernel.org
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/20180118115022.6368-1-clingutla@codeaurora.org
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>timers: Unconditionally check deferrable base</title>
<updated>2018-01-23T18:57:04Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2018-01-14T22:19:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=676109b28cadbb9c6e6fedee733f3b2111923f17'/>
<id>urn:sha1:676109b28cadbb9c6e6fedee733f3b2111923f17</id>
<content type='text'>
commit ed4bbf7910b28ce3c691aef28d245585eaabda06 upstream.

When the timer base is checked for expired timers then the deferrable base
must be checked as well. This was missed when making the deferrable base
independent of base::nohz_active.

Fixes: ced6d5c11d3e ("timers: Use deferrable base independent of base::nohz_active")
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: rt@linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>timers: Reinitialize per cpu bases on hotplug</title>
<updated>2018-01-02T19:35:17Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-12-27T20:37:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=249d4a9b3246f4ec92433ba8ea3bae5ceb4dc1ed'/>
<id>urn:sha1:249d4a9b3246f4ec92433ba8ea3bae5ceb4dc1ed</id>
<content type='text'>
commit 26456f87aca7157c057de65c9414b37f1ab881d1 upstream.

The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
them with a potentially stale clk and next_expiry valuem, which can cause
trouble then the CPU is plugged.

Add a prepare callback which forwards the clock, sets next_expiry to far in
the future and reset the control flags to a known state.

Set base-&gt;must_forward_clk so the first timer which is queued will try to
forward the clock to current jiffies.

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Reported-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>timers: Invoke timer_start_debug() where it makes sense</title>
<updated>2018-01-02T19:35:17Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-12-22T14:51:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=574e543ff970ea208d6d97524e0373d3741a6a34'/>
<id>urn:sha1:574e543ff970ea208d6d97524e0373d3741a6a34</id>
<content type='text'>
commit fd45bb77ad682be728d1002431d77b8c73342836 upstream.

The timer start debug function is called before the proper timer base is
set. As a consequence the trace data contains the stale CPU and flags
values.

Call the debug function after setting the new base and flags.

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: rt@linutronix.de
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20171222145337.792907137@linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>timers: Use deferrable base independent of base::nohz_active</title>
<updated>2018-01-02T19:35:17Z</updated>
<author>
<name>Anna-Maria Gleixner</name>
<email>anna-maria@linutronix.de</email>
</author>
<published>2017-12-22T14:51:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d840687aa8a3ca7b8219b1a207a1c55e47c90225'/>
<id>urn:sha1:d840687aa8a3ca7b8219b1a207a1c55e47c90225</id>
<content type='text'>
commit ced6d5c11d3e7b342f1a80f908e6756ebd4b8ddd upstream.

During boot and before base::nohz_active is set in the timer bases, deferrable
timers are enqueued into the standard timer base. This works correctly as
long as base::nohz_active is false.

Once it base::nohz_active is set and a timer which was enqueued before that
is accessed the lock selector code choses the lock of the deferred
base. This causes unlocked access to the standard base and in case the
timer is removed it does not clear the pending flag in the standard base
bitmap which causes get_next_timer_interrupt() to return bogus values.

To prevent that, the deferrable timers must be enqueued in the deferrable
base, even when base::nohz_active is not set. Those deferrable timers also
need to be expired unconditional.

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Signed-off-by: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: rt@linutronix.de
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Link: https://lkml.kernel.org/r/20171222145337.633328378@linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>timer/sysclt: Restrict timer migration sysctl values to 0 and 1</title>
<updated>2017-10-05T07:44:04Z</updated>
<author>
<name>Myungho Jung</name>
<email>mhjungk@gmail.com</email>
</author>
<published>2017-04-19T22:24:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4c00015385faccd992e98dfedfeaa07ac56d7194'/>
<id>urn:sha1:4c00015385faccd992e98dfedfeaa07ac56d7194</id>
<content type='text'>
commit b94bf594cf8ed67cdd0439e70fa939783471597a upstream.

timer_migration sysctl acts as a boolean switch, so the allowed values
should be restricted to 0 and 1.

Add the necessary extra fields to the sysctl table entry to enforce that.

[ tglx: Rewrote changelog ]

Signed-off-by: Myungho Jung &lt;mhjungk@gmail.com&gt;
Link: http://lkml.kernel.org/r/1492640690-3550-1-git-send-email-mhjungk@gmail.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Kazuhiro Hayashi &lt;kazuhiro3.hayashi@toshiba.co.jp&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>timers: Fix excessive granularity of new timers after a nohz idle</title>
<updated>2017-08-30T08:21:51Z</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2017-08-22T08:43:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=70b3fd5ce2ce1dfe8f563e93d31c124b84593af4'/>
<id>urn:sha1:70b3fd5ce2ce1dfe8f563e93d31c124b84593af4</id>
<content type='text'>
commit 2fe59f507a65dbd734b990a11ebc7488f6f87a24 upstream.

When a timer base is idle, it is forwarded when a new timer is added
to ensure that granularity does not become excessive. When not idle,
the timer tick is expected to increment the base.

However there are several problems:

- If an existing timer is modified, the base is forwarded only after
  the index is calculated.

- The base is not forwarded by add_timer_on.

- There is a window after a timer is restarted from a nohz idle, after
  it is marked not-idle and before the timer tick on this CPU, where a
  timer may be added but the ancient base does not get forwarded.

These result in excessive granularity (a 1 jiffy timeout can blow out
to 100s of jiffies), which cause the rcu lockup detector to trigger,
among other things.

Fix this by keeping track of whether the timer base has been idle
since it was last run or forwarded, and if so then forward it before
adding a new timer.

There is still a case where mod_timer optimises the case of a pending
timer mod with the same expiry time, where the timer can see excessive
granularity relative to the new, shorter interval. A comment is added,
but it's not changed because it is an important fastpath for
networking.

This has been tested and found to fix the RCU softlockup messages.

Testing was also done with tracing to measure requested versus
achieved wakeup latencies for all non-deferrable timers in an idle
system (with no lockup watchdogs running). Wakeup latency relative to
absolute latency is calculated (note this suffers from round-up skew
at low absolute times) and analysed:

             max     avg      std
upstream   506.0    1.20     4.68
patched      2.0    1.08     0.15

The bug was noticed due to the lockup detector Kconfig changes
dropping it out of people's .configs and resulting in larger base
clk skew When the lockup detectors are enabled, no CPU can go idle for
longer than 4 seconds, which limits the granularity errors.
Sub-optimal timer behaviour is observable on a smaller scale in that
case:

	     max     avg      std
upstream     9.0    1.05     0.19
patched      2.0    1.04     0.11

Fixes: Fixes: a683f390b93f ("timers: Forward the wheel clock whenever possible")
Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Jonathan Cameron &lt;Jonathan.Cameron@huawei.com&gt;
Tested-by: David Miller &lt;davem@davemloft.net&gt;
Cc: dzickus@redhat.com
Cc: sfr@canb.auug.org.au
Cc: mpe@ellerman.id.au
Cc: Stephen Boyd &lt;sboyd@codeaurora.org&gt;
Cc: linuxarm@huawei.com
Cc: abdhalee@linux.vnet.ibm.com
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: akpm@linux-foundation.org
Cc: paulmck@linux.vnet.ibm.com
Cc: torvalds@linux-foundation.org
Link: http://lkml.kernel.org/r/20170822084348.21436-1-npiggin@gmail.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>timers: Fix overflow in get_next_timer_interrupt</title>
<updated>2017-08-11T15:49:30Z</updated>
<author>
<name>Matija Glavinic Pecotic</name>
<email>matija.glavinic-pecotic.ext@nokia.com</email>
</author>
<published>2017-08-01T07:11:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9ef8b23b94b98ec9b270e6fca5eadb97c96d809a'/>
<id>urn:sha1:9ef8b23b94b98ec9b270e6fca5eadb97c96d809a</id>
<content type='text'>
commit 34f41c0316ed52b0b44542491d89278efdaa70e4 upstream.

For e.g. HZ=100, timer being 430 jiffies in the future, and 32 bit
unsigned int, there is an overflow on unsigned int right-hand side
of the expression which results with wrong values being returned.

Type cast the multiplier to 64bit to avoid that issue.

Fixes: 46c8f0b077a8 ("timers: Fix get_next_timer_interrupt() computation")
Signed-off-by: Matija Glavinic Pecotic &lt;matija.glavinic-pecotic.ext@nokia.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Alexander Sverdlin &lt;alexander.sverdlin@nokia.com&gt;
Cc: khilman@baylibre.com
Cc: akpm@linux-foundation.org
Link: http://lkml.kernel.org/r/a7900f04-2a21-c9fd-67be-ab334d459ee5@nokia.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>timers: Prevent base clock corruption when forwarding</title>
<updated>2016-10-25T14:32:50Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-10-22T11:07:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6bad6bccf2d717f652d37e63cf261eaa23466009'/>
<id>urn:sha1:6bad6bccf2d717f652d37e63cf261eaa23466009</id>
<content type='text'>
When a timer is enqueued we try to forward the timer base clock. This
mechanism has two issues:

1) Forwarding a remote base unlocked

The forwarding function is called from get_target_base() with the current
timer base lock held. But if the new target base is a different base than
the current base (can happen with NOHZ, sigh!) then the forwarding is done
on an unlocked base. This can lead to corruption of base-&gt;clk.

Solution is simple: Invoke the forwarding after the target base is locked.

2) Possible corruption due to jiffies advancing

This is similar to the issue in get_net_timer_interrupt() which was fixed
in the previous patch. jiffies can advance between check and assignement
and therefore advancing base-&gt;clk beyond the next expiry value.

So we need to read jiffies into a local variable once and do the checks and
assignment with the local copy.

Fixes: a683f390b93f("timers: Forward the wheel clock whenever possible")
Reported-by: Ashton Holmes &lt;scoopta@gmail.com&gt;
Reported-by: Michael Thayer &lt;michael.thayer@oracle.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Michal Necasek &lt;michal.necasek@oracle.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: knut.osmundsen@oracle.com
Cc: stable@vger.kernel.org
Cc: stern@rowland.harvard.edu
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20161022110552.253640125@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
</entry>
<entry>
<title>timers: Prevent base clock rewind when forwarding clock</title>
<updated>2016-10-25T14:32:50Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-10-22T11:07:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=041ad7bc758db259bb960ef795197dd14aab19a6'/>
<id>urn:sha1:041ad7bc758db259bb960ef795197dd14aab19a6</id>
<content type='text'>
Ashton and Michael reported, that kernel versions 4.8 and later suffer from
USB timeouts which are caused by the timer wheel rework.

This is caused by a bug in the base clock forwarding mechanism, which leads
to timers expiring early. The scenario which leads to this is:

run_timers()
  while (jiffies &gt;= base-&gt;clk) {
    collect_expired_timers();
    base-&gt;clk++;
    expire_timers();
  }          

So base-&gt;clk = jiffies + 1. Now the cpu goes idle:

idle()
  get_next_timer_interrupt()
    nextevt = __next_time_interrupt();
    if (time_after(nextevt, base-&gt;clk))
       	base-&gt;clk = jiffies;

jiffies has not advanced since run_timers(), so this assignment effectively
decrements base-&gt;clk by one.

base-&gt;clk is the index into the timer wheel arrays. So let's assume the
following state after the base-&gt;clk increment in run_timers():

 jiffies = 0
 base-&gt;clk = 1

A timer gets enqueued with an expiry delta of 63 ticks (which is the case
with the USB timeout and HZ=250) so the resulting bucket index is:

  base-&gt;clk + delta = 1 + 63 = 64

The timer goes into the first wheel level. The array size is 64 so it ends
up in bucket 0, which is correct as it takes 63 ticks to advance base-&gt;clk
to index into bucket 0 again.

If the cpu goes idle before jiffies advance, then the bug in the forwarding
mechanism sets base-&gt;clk back to 0, so the next invocation of run_timers()
at the next tick will index into bucket 0 and therefore expire the timer 62
ticks too early.

Instead of blindly setting base-&gt;clk to jiffies we must make the forwarding
conditional on jiffies &gt; base-&gt;clk, but we cannot use jiffies for this as
we might run into the following issue:

  if (time_after(jiffies, base-&gt;clk) {
    if (time_after(nextevt, base-&gt;clk))
       base-&gt;clk = jiffies;

jiffies can increment between the check and the assigment far enough to
advance beyond nextevt. So we need to use a stable value for checking.

get_next_timer_interrupt() has the basej argument which is the jiffies
value snapshot taken in the calling code. So we can just that.

Thanks to Ashton for bisecting and providing trace data!

Fixes: a683f390b93f ("timers: Forward the wheel clock whenever possible")
Reported-by: Ashton Holmes &lt;scoopta@gmail.com&gt;
Reported-by: Michael Thayer &lt;michael.thayer@oracle.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Michal Necasek &lt;michal.necasek@oracle.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: knut.osmundsen@oracle.com
Cc: stable@vger.kernel.org
Cc: stern@rowland.harvard.edu
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20161022110552.175308322@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
</entry>
</feed>
