<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/time/timer.c, branch v4.9.5</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.5</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.5'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2016-10-25T14:32:50Z</updated>
<entry>
<title>timers: Prevent base clock corruption when forwarding</title>
<updated>2016-10-25T14:32:50Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-10-22T11:07:37Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6bad6bccf2d717f652d37e63cf261eaa23466009'/>
<id>urn:sha1:6bad6bccf2d717f652d37e63cf261eaa23466009</id>
<content type='text'>
When a timer is enqueued we try to forward the timer base clock. This
mechanism has two issues:

1) Forwarding a remote base unlocked

The forwarding function is called from get_target_base() with the current
timer base lock held. But if the new target base is a different base than
the current base (can happen with NOHZ, sigh!) then the forwarding is done
on an unlocked base. This can lead to corruption of base-&gt;clk.

Solution is simple: Invoke the forwarding after the target base is locked.

2) Possible corruption due to jiffies advancing

This is similar to the issue in get_net_timer_interrupt() which was fixed
in the previous patch. jiffies can advance between check and assignement
and therefore advancing base-&gt;clk beyond the next expiry value.

So we need to read jiffies into a local variable once and do the checks and
assignment with the local copy.

Fixes: a683f390b93f("timers: Forward the wheel clock whenever possible")
Reported-by: Ashton Holmes &lt;scoopta@gmail.com&gt;
Reported-by: Michael Thayer &lt;michael.thayer@oracle.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Michal Necasek &lt;michal.necasek@oracle.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: knut.osmundsen@oracle.com
Cc: stable@vger.kernel.org
Cc: stern@rowland.harvard.edu
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20161022110552.253640125@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
</entry>
<entry>
<title>timers: Prevent base clock rewind when forwarding clock</title>
<updated>2016-10-25T14:32:50Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-10-22T11:07:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=041ad7bc758db259bb960ef795197dd14aab19a6'/>
<id>urn:sha1:041ad7bc758db259bb960ef795197dd14aab19a6</id>
<content type='text'>
Ashton and Michael reported, that kernel versions 4.8 and later suffer from
USB timeouts which are caused by the timer wheel rework.

This is caused by a bug in the base clock forwarding mechanism, which leads
to timers expiring early. The scenario which leads to this is:

run_timers()
  while (jiffies &gt;= base-&gt;clk) {
    collect_expired_timers();
    base-&gt;clk++;
    expire_timers();
  }          

So base-&gt;clk = jiffies + 1. Now the cpu goes idle:

idle()
  get_next_timer_interrupt()
    nextevt = __next_time_interrupt();
    if (time_after(nextevt, base-&gt;clk))
       	base-&gt;clk = jiffies;

jiffies has not advanced since run_timers(), so this assignment effectively
decrements base-&gt;clk by one.

base-&gt;clk is the index into the timer wheel arrays. So let's assume the
following state after the base-&gt;clk increment in run_timers():

 jiffies = 0
 base-&gt;clk = 1

A timer gets enqueued with an expiry delta of 63 ticks (which is the case
with the USB timeout and HZ=250) so the resulting bucket index is:

  base-&gt;clk + delta = 1 + 63 = 64

The timer goes into the first wheel level. The array size is 64 so it ends
up in bucket 0, which is correct as it takes 63 ticks to advance base-&gt;clk
to index into bucket 0 again.

If the cpu goes idle before jiffies advance, then the bug in the forwarding
mechanism sets base-&gt;clk back to 0, so the next invocation of run_timers()
at the next tick will index into bucket 0 and therefore expire the timer 62
ticks too early.

Instead of blindly setting base-&gt;clk to jiffies we must make the forwarding
conditional on jiffies &gt; base-&gt;clk, but we cannot use jiffies for this as
we might run into the following issue:

  if (time_after(jiffies, base-&gt;clk) {
    if (time_after(nextevt, base-&gt;clk))
       base-&gt;clk = jiffies;

jiffies can increment between the check and the assigment far enough to
advance beyond nextevt. So we need to use a stable value for checking.

get_next_timer_interrupt() has the basej argument which is the jiffies
value snapshot taken in the calling code. So we can just that.

Thanks to Ashton for bisecting and providing trace data!

Fixes: a683f390b93f ("timers: Forward the wheel clock whenever possible")
Reported-by: Ashton Holmes &lt;scoopta@gmail.com&gt;
Reported-by: Michael Thayer &lt;michael.thayer@oracle.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Michal Necasek &lt;michal.necasek@oracle.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: knut.osmundsen@oracle.com
Cc: stable@vger.kernel.org
Cc: stern@rowland.harvard.edu
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20161022110552.175308322@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
</entry>
<entry>
<title>timers: Lock base for same bucket optimization</title>
<updated>2016-10-25T14:27:39Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-10-24T09:55:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4da9152a4308dcbf611cde399c695c359fc9145f'/>
<id>urn:sha1:4da9152a4308dcbf611cde399c695c359fc9145f</id>
<content type='text'>
Linus stumbled over the unlocked modification of the timer expiry value in
mod_timer() which is an optimization for timers which stay in the same
bucket - due to the bucket granularity - despite their expiry time getting
updated.

The optimization itself still makes sense even if we take the lock, because
in case that the bucket stays the same, we avoid the pointless
queue/enqueue dance.

Make the check and the modification of timer-&gt;expires protected by the base
lock and shuffle the remaining code around so we can keep the lock held
when we actually have to requeue the timer to a different bucket.

Fixes: f00c0afdfa62 ("timers: Implement optimization for same expiry time in mod_timer()")
Reported-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos
Cc: stable@vger.kernel.org
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
</content>
</entry>
<entry>
<title>timers: Plug locking race vs. timer migration</title>
<updated>2016-10-25T14:27:39Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-10-24T09:41:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b831275a3553c32091222ac619cfddd73a5553fb'/>
<id>urn:sha1:b831275a3553c32091222ac619cfddd73a5553fb</id>
<content type='text'>
Linus noticed that lock_timer_base() lacks a READ_ONCE() for accessing the
timer flags. As a consequence the compiler is allowed to reload the flags
between the initial check for TIMER_MIGRATION and the following timer base
computation and the spin lock of the base.

While this has not been observed (yet), we need to make sure that it never
happens.

Fixes: 0eeda71bc30d ("timer: Replace timer base by a cpu index")
Reported-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/alpine.DEB.2.20.1610241711220.4983@nanos
Cc: stable@vger.kernel.org
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
</content>
</entry>
<entry>
<title>latent_entropy: Mark functions with __latent_entropy</title>
<updated>2016-10-10T21:51:45Z</updated>
<author>
<name>Emese Revfy</name>
<email>re.emese@gmail.com</email>
</author>
<published>2016-06-20T18:42:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0766f788eb727e2e330d55d30545db65bcf2623f'/>
<id>urn:sha1:0766f788eb727e2e330d55d30545db65bcf2623f</id>
<content type='text'>
The __latent_entropy gcc attribute can be used only on functions and
variables.  If it is on a function then the plugin will instrument it for
gathering control-flow entropy. If the attribute is on a variable then
the plugin will initialize it with random contents.  The variable must
be an integer, an integer array type or a structure with integer fields.

These specific functions have been selected because they are init
functions (to help gather boot-time entropy), are called at unpredictable
times, or they have variable loops, each of which provide some level of
latent entropy.

Signed-off-by: Emese Revfy &lt;re.emese@gmail.com&gt;
[kees: expanded commit message]
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
</content>
</entry>
<entry>
<title>timers: Fix get_next_timer_interrupt() computation</title>
<updated>2016-08-09T07:31:55Z</updated>
<author>
<name>Chris Metcalf</name>
<email>cmetcalf@mellanox.com</email>
</author>
<published>2016-08-08T20:29:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=46c8f0b077a838eb1f6169bb370aab8ed98f7630'/>
<id>urn:sha1:46c8f0b077a838eb1f6169bb370aab8ed98f7630</id>
<content type='text'>
The tick_nohz_stop_sched_tick() routine is not properly
canceling the sched timer when nothing is pending, because
get_next_timer_interrupt() is no longer returning KTIME_MAX in
that case.  This causes periodic interrupts when none are needed.

When determining the next interrupt time, we first use
__next_timer_interrupt() to get the first expiring timer in the
timer wheel.  If no timer is found, we return the base clock value
plus NEXT_TIMER_MAX_DELTA to indicate there is no timer in the
timer wheel.

Back in get_next_timer_interrupt(), we set the "expires" value
by converting the timer wheel expiry (in ticks) to a nsec value.
But we don't want to do this if the timer wheel expiry value
indicates no timer; we want to return KTIME_MAX.

Prior to commit 500462a9de65 ("timers: Switch to a non-cascading
wheel") we checked base-&gt;active_timers to see if any timers
were active, and if not, we didn't touch the expiry value and so
properly returned KTIME_MAX.  Now we don't have active_timers.

To fix this, we now just check the timer wheel expiry value to
see if it is "now + NEXT_TIMER_MAX_DELTA", and if it is, we don't
try to compute a new value based on it, but instead simply let the
KTIME_MAX value in expires remain.

Fixes: 500462a9de65 "timers: Switch to a non-cascading wheel"
Signed-off-by: Chris Metcalf &lt;cmetcalf@mellanox.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Link: http://lkml.kernel.org/r/1470688147-22287-1-git-send-email-cmetcalf@mellanox.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;

</content>
</entry>
<entry>
<title>timers/core: Convert to hotplug state machine</title>
<updated>2016-07-15T08:41:42Z</updated>
<author>
<name>Richard Cochran</name>
<email>rcochran@linutronix.de</email>
</author>
<published>2016-07-13T17:16:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=24f73b99716a9cd8cbb345c41ced6b3b5ed94006'/>
<id>urn:sha1:24f73b99716a9cd8cbb345c41ced6b3b5ed94006</id>
<content type='text'>
When tearing down, call timers_dead_cpu() before notify_dead().
There is a hidden dependency between:

 - timers
 - block multiqueue
 - rcutree

If timers_dead_cpu() comes later than blk_mq_queue_reinit_notify()
that latter function causes a RCU stall.

Signed-off-by: Richard Cochran &lt;rcochran@linutronix.de&gt;
Signed-off-by: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Reviewed-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rasmus Villemoes &lt;linux@rasmusvillemoes.dk&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160713153337.566790058@linutronix.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'timers/fast-wheel' into timers/core</title>
<updated>2016-07-07T08:35:28Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2016-07-07T08:35:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4b4b20852d1009c5e8bc357b22353b62e3a241c7'/>
<id>urn:sha1:4b4b20852d1009c5e8bc357b22353b62e3a241c7</id>
<content type='text'>
</content>
</entry>
<entry>
<title>timers: Implement optimization for same expiry time in mod_timer()</title>
<updated>2016-07-07T08:35:12Z</updated>
<author>
<name>Anna-Maria Gleixner</name>
<email>anna-maria@linutronix.de</email>
</author>
<published>2016-07-04T09:50:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f00c0afdfa625165a609513bc74164d56752ec3e'/>
<id>urn:sha1:f00c0afdfa625165a609513bc74164d56752ec3e</id>
<content type='text'>
The existing optimization for same expiry time in mod_timer() checks whether
the timer expiry time is the same as the new requested expiry time. In the old
timer wheel implementation this does not take the slack batching into account,
neither does the new implementation evaluate whether the new expiry time will
requeue the timer to the same bucket.

To optimize that, we can calculate the resulting bucket and check if the new
expiry time is different from the current expiry time. This calculation
happens outside the base lock held region. If the resulting bucket is the same
we can avoid taking the base lock and requeueing the timer.

If the timer needs to be requeued then we have to check under the base lock
whether the base time has changed between the lockless calculation and taking
the lock. If it has changed we need to recalculate under the lock.

This optimization takes effect for timers which are enqueued into the less
granular wheel levels (1 and above). With a simple test case the functionality
has been verified:

            Before        After
 Match:       5.5%        86.6%
 Requeue:    94.5%        13.4%
 Recalc:                  &lt;0.01%

In the non optimized case the timer is requeued in 94.5% of the cases. With
the index optimization in place the requeue rate drops to 13.4%. The case
where the lockless index calculation has to be redone is less than 0.01%.

With a real world test case (networking) we observed the following changes:

            Before        After
 Match:      97.8%        99.7%
 Requeue:     2.2%         0.3%
 Recalc:                  &lt;0.001%

That means two percent fewer lock/requeue/unlock operations done in one of
the hot path use cases of timers.

Signed-off-by: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Arjan van de Ven &lt;arjan@infradead.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: George Spelvin &lt;linux@sciencehorizons.net&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.778527749@linutronix.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>timers: Split out index calculation</title>
<updated>2016-07-07T08:35:12Z</updated>
<author>
<name>Anna-Maria Gleixner</name>
<email>anna-maria@linutronix.de</email>
</author>
<published>2016-07-04T09:50:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ffdf047728f8f93df896b58049c7513856027141'/>
<id>urn:sha1:ffdf047728f8f93df896b58049c7513856027141</id>
<content type='text'>
For further optimizations we need to seperate index calculation
from queueing. No functional change.

Signed-off-by: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Arjan van de Ven &lt;arjan@infradead.org&gt;
Cc: Chris Mason &lt;clm@fb.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: George Spelvin &lt;linux@sciencehorizons.net&gt;
Cc: Josh Triplett &lt;josh@joshtriplett.org&gt;
Cc: Len Brown &lt;lenb@kernel.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: rt@linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.691159619@linutronix.de
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
