<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/cpu.c, branch v4.14.16</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.14.16</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.14.16'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-01-02T19:31:15Z</updated>
<entry>
<title>timers: Reinitialize per cpu bases on hotplug</title>
<updated>2018-01-02T19:31:15Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-12-27T20:37:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6fae6de72ad44e98b5ae58a662d110c58594aad9'/>
<id>urn:sha1:6fae6de72ad44e98b5ae58a662d110c58594aad9</id>
<content type='text'>
commit 26456f87aca7157c057de65c9414b37f1ab881d1 upstream.

The timer wheel bases are not (re)initialized on CPU hotplug. That leaves
them with a potentially stale clk and next_expiry value, which can cause
trouble when the CPU is plugged in again.

Add a prepare callback which forwards the clock, sets next_expiry to far in
the future and resets the control flags to a known state.

Set base-&gt;must_forward_clk so the first timer which is queued will try to
forward the clock to current jiffies.
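
For illustration, the prepare callback ends up looking roughly like this
(a sketch following the description above, not necessarily the literal
patch; field and constant names as in kernel/time/timer.c):

  int timers_prepare_cpu(unsigned int cpu)
  {
	struct timer_base *base;
	int b;

	for (b = 0; b &lt; NR_BASES; b++) {
		base = per_cpu_ptr(&amp;timer_bases[b], cpu);
		/* Forward the clock and push next_expiry far out */
		base-&gt;clk = jiffies;
		base-&gt;next_expiry = base-&gt;clk + NEXT_TIMER_MAX_DELTA;
		/* Reset the control flags to a known state */
		base-&gt;is_idle = false;
		base-&gt;must_forward_clk = true;
	}
	return 0;
  }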

Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Reported-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Sebastian Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Anna-Maria Gleixner &lt;anna-maria@linutronix.de&gt;
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1712272152200.2431@nanos
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>smp/hotplug: Move step CPUHP_AP_SMPCFD_DYING to the correct place</title>
<updated>2017-12-14T08:52:56Z</updated>
<author>
<name>Lai Jiangshan</name>
<email>jiangshanlai@gmail.com</email>
</author>
<published>2017-11-28T13:19:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b68df97ec8d0fd2074c130a64c831b8b760f1228'/>
<id>urn:sha1:b68df97ec8d0fd2074c130a64c831b8b760f1228</id>
<content type='text'>
commit 46febd37f9c758b05cd25feae8512f22584742fe upstream.

Commit 31487f8328f2 ("smp/cfd: Convert core to hotplug state machine")
accidentally put this step in the wrong place: it belongs in
cpuhp_ap_states[] rather than cpuhp_bp_states[].

grep smpcfd /sys/devices/system/cpu/hotplug/states
 40: smpcfd:prepare
129: smpcfd:dying

"smpcfd:dying" was missing before.
So was the invocation of the function smpcfd_dying_cpu().
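
Illustrative sketch of the fix (surrounding entries elided; the entry
simply moves from one table to the other):

  /* kernel/cpu.c */
  static struct cpuhp_step cpuhp_ap_states[] = {
	...
	[CPUHP_AP_SMPCFD_DYING] = {
		.name			= "smpcfd:dying",
		.startup.single		= NULL,
		.teardown.single	= smpcfd_dying_cpu,
	},
	...
  };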

Fixes: 31487f8328f2 ("smp/cfd: Convert core to hotplug state machine")
Signed-off-by: Lai Jiangshan &lt;jiangshanlai@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Link: https://lkml.kernel.org/r/20171128131954.81229-1-jiangshanlai@gmail.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cpu/hotplug: Reset node state after operation</title>
<updated>2017-10-21T14:11:30Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-10-21T14:06:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1f7c70d6b2bc5de301f30456621e1161fddf4242'/>
<id>urn:sha1:1f7c70d6b2bc5de301f30456621e1161fddf4242</id>
<content type='text'>
The recent rework of the cpu hotplug internals changed the usage of the per
cpu state-&gt;node field, but failed to clean it up after use.

So subsequent hotplug operations use the stale pointer from a previous
operation and hand it into the callback functions. The callbacks then
dereference a pointer which either belongs to a different facility or
points to freed and potentially reused memory. In either case data
corruption and crashes are the obvious consequence.

Reset the node and the last pointers in the per cpu state to NULL after the
operation which set them has completed.
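
A sketch of the resulting cleanup (exact placement approximate):

  static int cpuhp_invoke_ap_callback(int cpu, enum cpuhp_state state,
				      bool bringup, struct hlist_node *node)
  {
	...
	/*
	 * Clean up the leftovers so the next hotplug operation starts
	 * from a clean state.
	 */
	st-&gt;node = NULL;
	st-&gt;last = NULL;
	return ret;
  }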

Fixes: 96abb968549c ("smp/hotplug: Allow external multi-instance rollback")
Reported-by: Tvrtko Ursulin &lt;tursulin@ursulin.net&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Cc: Boris Ostrovsky &lt;boris.ostrovsky@oracle.com&gt;
Cc: "Paul E. McKenney" &lt;paulmck@linux.vnet.ibm.com&gt;
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1710211606130.3213@nanos

</content>
</entry>
<entry>
<title>Merge branch 'core-watchdog-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2017-10-06T15:36:41Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2017-10-06T15:36:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=27efed3e8384e4d87fe3c07e7a046c1f43eb0993'/>
<id>urn:sha1:27efed3e8384e4d87fe3c07e7a046c1f43eb0993</id>
<content type='text'>
Pull watchdog clean-up and fixes from Thomas Gleixner:
 "The watchdog (hard/softlockup detector) code is pretty much broken in
  its current state. The patch series addresses this by removing all
  duct tape and refactoring it into a workable state.

  The reasons why I ask for inclusion that late in the cycle are:

   1) The code causes lockdep splats vs. hotplug locking which get
      reported over and over. Unfortunately there is no easy fix.

   2) The risk of breakage is minimal because it's already broken

   3) As 4.14 is a long term stable kernel, I prefer to have working
      watchdog code and the lockdep issues resolved in it. I wouldn't
      ask you to pull if 4.14 weren't an LTS kernel or if the solution
      were easy to backport.

   4) The series was around before the merge window opened, but then got
      delayed due to the UP failure caused by the for_each_cpu()
      surprise which we discussed recently.

  Changes vs. V1:

   - Addressed your review points

   - Addressed the warning in the powerpc code which was discovered late

   - Changed two function names which made sense up to a certain point
     in the series. Now they match what they do in the end.

   - Fixed an 'unused variable' warning which was not detected by the
     Intel robot. I triggered it when trying all possible related config
     combinations manually. Randconfig testing seems not to be random
     enough.

  The changes have been tested and reviewed by Don Zickus and tested
  and acked by Michael Ellerman for powerpc"

* 'core-watchdog-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  watchdog/core: Put softlockup_threads_initialized under ifdef guard
  watchdog/core: Rename some softlockup_* functions
  powerpc/watchdog: Make use of watchdog_nmi_probe()
  watchdog/core, powerpc: Lock cpus across reconfiguration
  watchdog/core, powerpc: Replace watchdog_nmi_reconfigure()
  watchdog/hardlockup/perf: Fix spelling mistake: "permanetely" -&gt; "permanently"
  watchdog/hardlockup/perf: Cure UP damage
  watchdog/hardlockup: Clean up hotplug locking mess
  watchdog/hardlockup/perf: Simplify deferred event destroy
  watchdog/hardlockup/perf: Use new perf CPU enable mechanism
  watchdog/hardlockup/perf: Implement CPU enable replacement
  watchdog/hardlockup/perf: Implement init time detection of perf
  watchdog/hardlockup/perf: Implement init time perf validation
  watchdog/core: Get rid of the racy update loop
  watchdog/core, powerpc: Make watchdog_nmi_reconfigure() two stage
  watchdog/sysctl: Clean up sysctl variable name space
  watchdog/sysctl: Get rid of the #ifdeffery
  watchdog/core: Clean up header mess
  watchdog/core: Further simplify sysctl handling
  watchdog/core: Get rid of the thread teardown/setup dance
  ...
</content>
</entry>
<entry>
<title>smp/hotplug: Hotplug state fail injection</title>
<updated>2017-09-25T20:11:44Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2017-09-20T17:00:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1db49484f21ed0fcdadd0635a3669f5f386546fa'/>
<id>urn:sha1:1db49484f21ed0fcdadd0635a3669f5f386546fa</id>
<content type='text'>
Add a sysfs file to one-time fail a specific state. This can be used
to test the state rollback code paths.

Something like this (hotplug-up.sh):

  #!/bin/bash

  echo 0 &gt; /debug/sched_debug
  echo 1 &gt; /debug/tracing/events/cpuhp/enable

  ALL_STATES=`cat /sys/devices/system/cpu/hotplug/states | cut -d':' -f1`
  STATES=${1:-$ALL_STATES}

  for state in $STATES
  do
	  echo 0 &gt; /sys/devices/system/cpu/cpu1/online
	  echo 0 &gt; /debug/tracing/trace
	  echo Fail state: $state
	  echo $state &gt; /sys/devices/system/cpu/cpu1/hotplug/fail
	  cat /sys/devices/system/cpu/cpu1/hotplug/fail
	  echo 1 &gt; /sys/devices/system/cpu/cpu1/online

	  cat /debug/tracing/trace &gt; hotfail-${state}.trace

	  sleep 1
  done

Can be used to test for all possible rollback (barring multi-instance)
scenarios on CPU-up; CPU-down is a trivial modification of the above.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.972581715@infradead.org


</content>
</entry>
<entry>
<title>smp/hotplug: Differentiate the AP completion between up and down</title>
<updated>2017-09-25T20:11:43Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2017-09-20T17:00:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5ebe7742fff8be5f1359bc50f5d43fb6ff7bd060'/>
<id>urn:sha1:5ebe7742fff8be5f1359bc50f5d43fb6ff7bd060</id>
<content type='text'>
With lockdep-crossrelease we get deadlock reports that span cpu-up and
cpu-down chains. Such deadlocks cannot possibly happen because cpu-up
and cpu-down are globally serialized.

  takedown_cpu()
    irq_lock_sparse()
    wait_for_completion(&amp;st-&gt;done)

                                cpuhp_thread_fun
                                  cpuhp_up_callback
                                    cpuhp_invoke_callback
                                      irq_affinity_online_cpu
                                        irq_lock_sparse()
                                        irq_unlock_sparse()
                                  complete(&amp;st-&gt;done)

Now that we have consistent AP state, we can trivially separate the
AP completion between up and down using st-&gt;bringup.
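
For illustration, the separation boils down to two completions selected
by the direction (a sketch, details elided):

  struct cpuhp_cpu_state {
	...
	struct completion	done_up;
	struct completion	done_down;
  };

  static void complete_ap_thread(struct cpuhp_cpu_state *st, bool bringup)
  {
	struct completion *done = bringup ? &amp;st-&gt;done_up : &amp;st-&gt;done_down;

	complete(done);
  }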

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: max.byungchul.park@gmail.com
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Link: https://lkml.kernel.org/r/20170920170546.872472799@infradead.org


</content>
</entry>
<entry>
<title>smp/hotplug: Differentiate the AP-work lockdep class between up and down</title>
<updated>2017-09-25T20:11:43Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2017-09-20T17:00:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5f4b55e10645b7371322c800a5ec745cab487a6c'/>
<id>urn:sha1:5f4b55e10645b7371322c800a5ec745cab487a6c</id>
<content type='text'>
With lockdep-crossrelease we get deadlock reports that span cpu-up and
cpu-down chains. Such deadlocks cannot possibly happen because cpu-up
and cpu-down are globally serialized.

  CPU0                  CPU1                    CPU2
  cpuhp_up_callbacks:   takedown_cpu:           cpuhp_thread_fun:

  cpuhp_state
                        irq_lock_sparse()
    irq_lock_sparse()
                        wait_for_completion()
                                                cpuhp_state
                                                complete()

Now that we have consistent AP state, we can trivially separate the
AP-work class between up and down using st-&gt;bringup.
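
One way to express such a split at the lockdep level (illustrative only,
not the literal patch; which object gets reclassed is hypothetical here):

  static struct lock_class_key cpuhp_ap_work_up_key;
  static struct lock_class_key cpuhp_ap_work_down_key;

  /* Give the AP work a direction-specific lockdep class */
  lockdep_set_class(lock, st-&gt;bringup ? &amp;cpuhp_ap_work_up_key
					: &amp;cpuhp_ap_work_down_key);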

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: max.byungchul.park@gmail.com
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Link: https://lkml.kernel.org/r/20170920170546.922524234@infradead.org


</content>
</entry>
<entry>
<title>smp/hotplug: Callback vs state-machine consistency</title>
<updated>2017-09-25T20:11:43Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2017-09-20T17:00:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=724a86881d03ee5794148e65142e24ed3621be66'/>
<id>urn:sha1:724a86881d03ee5794148e65142e24ed3621be66</id>
<content type='text'>
While the generic callback functions have an 'int' return and thus
appear to be allowed to return error, this is not true for all states.

Specifically, what used to be STARTING/DYING are run with IRQs
disabled from critical parts of CPU bringup/teardown and are not
allowed to fail. Add WARNs to enforce this rule.

But since some callbacks are indeed allowed to fail, we can end up in a
situation where a state-machine rollback encounters a failure; in that
case we're stuck, since we can't go forward and we can't go back. Also
add a WARN for that case.

AFAICT this is a fundamental 'problem' with no real obvious solution.
We want the 'prepare' callbacks to allow failure on either up or down.
Typically on prepare-up this would be things like -ENOMEM from
resource allocations, and the typical usage in prepare-down would be
something like -EBUSY to avoid CPUs being taken away.
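
A sketch of the enforcement (the range check is approximate):

  /*
   * The former STARTING/DYING states run with IRQs disabled from the
   * critical parts of bringup/teardown and must not fail.
   */
  static bool cpuhp_is_atomic_state(enum cpuhp_state state)
  {
	return CPUHP_AP_IDLE_DEAD &lt;= state &amp;&amp; state &lt; CPUHP_AP_ONLINE;
  }

  ...
  /* Atomic states are not allowed to fail */
  WARN_ON_ONCE(ret &amp;&amp; cpuhp_is_atomic_state(state));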

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.819539119@infradead.org


</content>
</entry>
<entry>
<title>smp/hotplug: Rewrite AP state machine core</title>
<updated>2017-09-25T20:11:42Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2017-09-20T17:00:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4dddfb5faa6118564b0c54a163353d13882299d8'/>
<id>urn:sha1:4dddfb5faa6118564b0c54a163353d13882299d8</id>
<content type='text'>
There is currently no explicit state change on rollback. That is,
st-&gt;bringup, st-&gt;rollback and st-&gt;target are not consistent when doing
the rollback.

Rework the AP state handling to be more coherent. This does mean we
have to do a second AP kick-and-wait for rollback, but since rollback
is the slow path of a slowpath, this really should not matter.

Take this opportunity to simplify the AP thread function to only run a
single callback per invocation. This unifies the three single/up/down
modes it supports. The looping it used to do for up/down is achieved
by retaining should_run and relying on the main smpboot_thread_fn()
loop.
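
Schematically, the thread function becomes (a rough sketch of the
control flow, details elided):

  static void cpuhp_thread_fun(unsigned int cpu)
  {
	struct cpuhp_cpu_state *st = this_cpu_ptr(&amp;cpuhp_state);
	...
	/* Run exactly one callback for the current state ... */
	st-&gt;result = cpuhp_invoke_callback(cpu, st-&gt;state, bringup, ...);

	/*
	 * ... and let the smpboot_thread_fn() loop call us again as
	 * long as there are states left, instead of looping here.
	 */
	st-&gt;should_run = (st-&gt;state != st-&gt;target);
	if (!st-&gt;should_run)
		complete(&amp;st-&gt;done);
  }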

(I have most of a patch that does the same for the BP state handling,
but that's not critical and gets a little complicated because
CPUHP_BRINGUP_CPU does the AP handoff from a callback, which gets
recursive @st usage; I still have to de-fugly that.)

[ tglx: Move cpuhp_down_callbacks() et al. into the HOTPLUG_CPU section to
  	avoid gcc complaining about unused functions. Make the HOTPLUG_CPU
  	one piece instead of having two consecutive ifdef sections of the
  	same type. ]

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.769658088@infradead.org


</content>
</entry>
<entry>
<title>smp/hotplug: Allow external multi-instance rollback</title>
<updated>2017-09-25T20:11:42Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2017-09-20T17:00:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=96abb968549cdefd0964d1f7af0a79f4e6e7f897'/>
<id>urn:sha1:96abb968549cdefd0964d1f7af0a79f4e6e7f897</id>
<content type='text'>
Currently the rollback of multi-instance states is handled inside
cpuhp_invoke_callback(). The problem is that when we want to allow an
explicit state change for rollback, we need to return from the
function without doing the rollback.

Change cpuhp_invoke_callback() to optionally return the multi-instance
state, such that rollback can be done from a subsequent call.
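
Schematically, the function gains an out-parameter recording how far it
got, so the caller can roll back later (a sketch):

  static int cpuhp_invoke_callback(unsigned int cpu, enum cpuhp_state state,
				   bool bringup, struct hlist_node *node,
				   struct hlist_node **lastp)
  {
	...
	/*
	 * Instead of rolling back right here on failure, report the
	 * instance we stopped at; a subsequent call can then perform
	 * the rollback explicitly.
	 */
	if (lastp)
		*lastp = node;
	return ret;
  }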

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: bigeasy@linutronix.de
Cc: efault@gmx.de
Cc: rostedt@goodmis.org
Cc: max.byungchul.park@gmail.com
Link: https://lkml.kernel.org/r/20170920170546.720361181@infradead.org


</content>
</entry>
</feed>
