<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/time, branch v4.15.6</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.15.6</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.15.6'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-01-27T14:12:22Z</updated>
<entry>
<title>hrtimer: Reset hrtimer cpu base proper on CPU hotplug</title>
<updated>2018-01-27T14:12:22Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2018-01-26T13:54:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d5421ea43d30701e03cadc56a38854c36a8b4433'/>
<id>urn:sha1:d5421ea43d30701e03cadc56a38854c36a8b4433</id>
<content type='text'>
The hrtimer interrupt code contains a hang detection and mitigation
mechanism, which prevents that a long delayed hrtimer interrupt causes a
continous retriggering of interrupts which prevent the system from making
progress. If a hang is detected then the timer hardware is programmed with
a certain delay into the future and a flag is set in the hrtimer cpu base
which prevents newly enqueued timers from reprogramming the timer hardware
prior to the chosen delay. The subsequent hrtimer interrupt after the delay
clears the flag and resumes normal operation.

If such a hang happens in the last hrtimer interrupt before a CPU is
unplugged then the hang_detected flag is set and stays that way when the
CPU is plugged in again. At that point the timer hardware is not armed and
it cannot be armed because the hang_detected flag is still active, so
nothing clears that flag. As a consequence the CPU does not receive hrtimer
interrupts and no timers expire on that CPU which results in RCU stalls and
other malfunctions.

Clear the flag along with some other less critical members of the hrtimer
cpu base to ensure starting from a clean state when a CPU is plugged in.

Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the
root cause of that hard to reproduce heisenbug. Once understood it's
trivial and certainly justifies a brown paperbag.

Fixes: 41d2e4949377 ("hrtimer: Tune hrtimer_interrupt hang logic")
Reported-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sebastian Sewior &lt;bigeasy@linutronix.de&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos

</content>
</entry>
<entry>
<title>timers: Unconditionally check deferrable base</title>
<updated>2018-01-14T22:25:33Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2018-01-14T22:19:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ed4bbf7910b28ce3c691aef28d245585eaabda06'/>
<id>urn:sha1:ed4bbf7910b28ce3c691aef28d245585eaabda06</id>
<content type='text'>
When the timer base is checked for expired timers then the deferrable base
must be checked as well. This was missed when making the deferrable base
independent of base::nohz_active.

Fixes: ced6d5c11d3e ("timers: Use deferrable base independent of base::nohz_active")
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: stable@vger.kernel.org
Cc: rt@linutronix.de
</content>
</entry>
<entry>
<title>Merge branch 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2017-12-31T20:30:34Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-12-31T20:30:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cea92e843e40452c08ba313abc39f59efbb4c29c'/>
<id>urn:sha1:cea92e843e40452c08ba313abc39f59efbb4c29c</id>
<content type='text'>
Pull timer fixes from Thomas Gleixner:
 "A pile of fixes for long standing issues with the timer wheel and the
  NOHZ code:

   - Prevent timer base confusion accross the nohz switch, which can
     cause unlocked access and data corruption

   - Reinitialize the stale base clock on cpu hotplug to prevent subtle
     side effects including rollovers on 32bit

   - Prevent an interrupt storm when the timer softirq is already
     pending caused by tick_nohz_stop_sched_tick()

   - Move the timer start tracepoint to a place where it actually makes
     sense

   - Add documentation to timerqueue functions as they caused confusion
     several times now"

* 'timers-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  timerqueue: Document return values of timerqueue_add/del()
  timers: Invoke timer_start_debug() where it makes sense
  nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()
  timers: Reinitialize per cpu bases on hotplug
  timers: Use deferrable base independent of base::nohz_active
</content>
</entry>
<entry>
<title>Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2017-12-31T20:27:19Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-12-31T20:27:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4c470317f91e5e684201f21e237fe444ed47c18c'/>
<id>urn:sha1:4c470317f91e5e684201f21e237fe444ed47c18c</id>
<content type='text'>
Pull scheduler fixes from Thomas Gleixner:
 "Three patches addressing the fallout of the CPU_ISOLATION changes
  especially with NO_HZ_FULL plus documentation of boot parameter
  dependency"

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  sched/isolation: Document boot parameters dependency on CONFIG_CPU_ISOLATION=y
  sched/isolation: Enable CONFIG_CPU_ISOLATION=y by default
  sched/isolation: Make CONFIG_NO_HZ_FULL select CONFIG_CPU_ISOLATION
</content>
</entry>
<entry>
<title>timers: Invoke timer_start_debug() where it makes sense</title>
<updated>2017-12-29T22:13:10Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-12-22T14:51:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fd45bb77ad682be728d1002431d77b8c73342836'/>
<id>urn:sha1:fd45bb77ad682be728d1002431d77b8c73342836</id>
<content type='text'>
The timer start debug function is called before the proper timer base is
set. As a consequence the trace data contains the stale CPU and flags
values.

Call the debug function after setting the new base and flags.

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: stable@vger.kernel.org
Cc: rt@linutronix.de
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20171222145337.792907137@linutronix.de

</content>
</entry>
<entry>
<title>nohz: Prevent a timer interrupt storm in tick_nohz_stop_sched_tick()</title>
<updated>2017-12-29T22:13:10Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-12-22T14:51:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5d62c183f9e9df1deeea0906d099a94e8a43047a'/>
<id>urn:sha1:5d62c183f9e9df1deeea0906d099a94e8a43047a</id>
<content type='text'>
The conditions in irq_exit() to invoke tick_nohz_irq_exit() which
subsequently invokes tick_nohz_stop_sched_tick() are:

  if ((idle_cpu(cpu) &amp;&amp; !need_resched()) || tick_nohz_full_cpu(cpu))

If need_resched() is not set, but a timer softirq is pending then this is
an indication that the softirq code punted and delegated the execution to
softirqd. need_resched() is not true because the current interrupted task
takes precedence over softirqd.

Invoking tick_nohz_irq_exit() in this case can cause an endless loop of
timer interrupts because the timer wheel contains an expired timer, but
softirqs are not yet executed. So it returns an immediate expiry request,
which causes the timer to fire immediately again. Lather, rinse and
repeat....

Prevent that by adding a check for a pending timer soft interrupt to the
conditions in tick_nohz_stop_sched_tick() which avoid calling
get_next_timer_interrupt(). That keeps the tick sched timer on the tick and
prevents a repetitive programming of an already expired timer.

Reported-by: Sebastian Siewior &lt;bigeasy@linutronix.d&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272156050.2431@nanos

</content>
</entry>
<entry>
<title>timers: Reinitialize per cpu bases on hotplug</title>
<updated>2017-12-29T22:13:09Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-12-27T20:37:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=26456f87aca7157c057de65c9414b37f1ab881d1'/>
<id>urn:sha1:26456f87aca7157c057de65c9414b37f1ab881d1</id>
<content type='text'>
The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
them with a potentially stale clk and next_expiry valuem, which can cause
trouble then the CPU is plugged.

Add a prepare callback which forwards the clock, sets next_expiry to far in
the future and reset the control flags to a known state.

Set base-&gt;must_forward_clk so the first timer which is queued will try to
forward the clock to current jiffies.

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Reported-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos

</content>
</entry>
<entry>
<title>timers: Use deferrable base independent of base::nohz_active</title>
<updated>2017-12-29T22:13:09Z</updated>
<author>
<name>Anna-Maria Gleixner</name>
<email>anna-maria@linutronix.de</email>
</author>
<published>2017-12-22T14:51:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ced6d5c11d3e7b342f1a80f908e6756ebd4b8ddd'/>
<id>urn:sha1:ced6d5c11d3e7b342f1a80f908e6756ebd4b8ddd</id>
<content type='text'>
During boot and before base::nohz_active is set in the timer bases, deferrable
timers are enqueued into the standard timer base. This works correctly as
long as base::nohz_active is false.

Once it base::nohz_active is set and a timer which was enqueued before that
is accessed the lock selector code choses the lock of the deferred
base. This causes unlocked access to the standard base and in case the
timer is removed it does not clear the pending flag in the standard base
bitmap which causes get_next_timer_interrupt() to return bogus values.

To prevent that, the deferrable timers must be enqueued in the deferrable
base, even when base::nohz_active is not set. Those deferrable timers also
need to be expired unconditional.

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Signed-off-by: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: stable@vger.kernel.org
Cc: rt@linutronix.de
Cc: Paul McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Link: https://lkml.kernel.org/r/20171222145337.633328378@linutronix.de

</content>
</entry>
<entry>
<title>cpufreq: schedutil: Use idle_calls counter of the remote CPU</title>
<updated>2017-12-28T11:26:54Z</updated>
<author>
<name>Joel Fernandes</name>
<email>joelaf@google.com</email>
</author>
<published>2017-12-21T01:22:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=466a2b42d67644447a1765276259a3ea5531ddff'/>
<id>urn:sha1:466a2b42d67644447a1765276259a3ea5531ddff</id>
<content type='text'>
Since the recent remote cpufreq callback work, its possible that a cpufreq
update is triggered from a remote CPU. For single policies however, the current
code uses the local CPU when trying to determine if the remote sg_cpu entered
idle or is busy. This is incorrect. To remedy this, compare with the nohz tick
idle_calls counter of the remote CPU.

Fixes: 674e75411fc2 (sched: cpufreq: Allow remote cpufreq callbacks)
Acked-by: Viresh Kumar &lt;viresh.kumar@linaro.org&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Joel Fernandes &lt;joelaf@google.com&gt;
Cc: 4.14+ &lt;stable@vger.kernel.org&gt; # 4.14+
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>sched/isolation: Make CONFIG_NO_HZ_FULL select CONFIG_CPU_ISOLATION</title>
<updated>2017-12-18T12:46:42Z</updated>
<author>
<name>Paul E. McKenney</name>
<email>paulmck@linux.vnet.ibm.com</email>
</author>
<published>2017-12-14T18:18:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bf29cb238dc0656e6564b6a94bb82e11d2129437'/>
<id>urn:sha1:bf29cb238dc0656e6564b6a94bb82e11d2129437</id>
<content type='text'>
CONFIG_NO_HZ_FULL doesn't make sense without CONFIG_CPU_ISOLATION. In
fact enabling the first without the second is a regression as nohz_full=
boot parameter gets silently ignored.

Besides this unnatural combination hangs RCU gp kthread when running
rcutorture for reasons that are not yet fully understood:

	rcu_preempt kthread starved for 9974 jiffies! g4294967208
	+c4294967207 f0x0 RCU_GP_WAIT_FQS(3) -&gt;state=0x402 -&gt;cpu=0
	rcu_preempt     I 7464     8      2 0x80000000
	Call Trace:
		__schedule+0x493/0x620
		schedule+0x24/0x40
		schedule_timeout+0x330/0x3b0
		? preempt_count_sub+0xea/0x140
		? collect_expired_timers+0xb0/0xb0
		rcu_gp_kthread+0x6bf/0xef0

This commit therefore makes NO_HZ_FULL select CPU_ISOLATION, which
prevents all these bad behaviours.

Reported-by: kernel test robot &lt;xiaolong.ye@intel.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Cc: Chris Metcalf &lt;cmetcalf@mellanox.com&gt;
Cc: Christoph Lameter &lt;cl@linux.com&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Luiz Capitulino &lt;lcapitulino@redhat.com&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Wanpeng Li &lt;kernellwp@gmail.com&gt;
Fixes: 5c4991e24c69 ("sched/isolation: Split out new CONFIG_CPU_ISOLATION=y config from CONFIG_NO_HZ_FULL")
Link: http://lkml.kernel.org/r/1513275507-29200-2-git-send-email-frederic@kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
</feed>
