<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/stop_machine.c, branch v4.18-rc2</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.18-rc2</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.18-rc2'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-05-15T06:10:50Z</updated>
<entry>
<title>Merge tag 'v4.17-rc5' into locking/core, to pick up fixes</title>
<updated>2018-05-15T06:10:50Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2018-05-15T06:10:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=371b3269082500fc418043742467119ab0d224c6'/>
<id>urn:sha1:371b3269082500fc418043742467119ab0d224c6</id>
<content type='text'>
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>stop_machine, sched: Fix migrate_swap() vs. active_balance() deadlock</title>
<updated>2018-05-03T05:38:03Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2018-04-20T09:50:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0b26351b910fb8fe6a056f8a1bbccabe50c0e19f'/>
<id>urn:sha1:0b26351b910fb8fe6a056f8a1bbccabe50c0e19f</id>
<content type='text'>
Matt reported the following deadlock:

CPU0					CPU1

schedule(.prev=migrate/0)		&lt;fault&gt;
  pick_next_task()			  ...
    idle_balance()			    migrate_swap()
      active_balance()			      stop_two_cpus()
						spin_lock(stopper0-&gt;lock)
						spin_lock(stopper1-&gt;lock)
						ttwu(migrate/0)
						  smp_cond_load_acquire() -- waits for schedule()
        stop_one_cpu(1)
	  spin_lock(stopper1-&gt;lock) -- waits for stopper lock

Fix this deadlock by taking the wakeups out from under stopper-&gt;lock.
This allows the active_balance() to queue the stop work and finish the
context switch, which in turn allows the wakeup from migrate_swap() to
observe the context and complete the wakeup.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reported-by: Matt Fleming &lt;matt@codeblueprint.co.uk&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Matt Fleming &lt;matt@codeblueprint.co.uk&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Michal Hocko &lt;mhocko@suse.com&gt;
Cc: Mike Galbraith &lt;umgwanakikbuti@gmail.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/20180420095005.GH4064@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>stop_machine: Use raw spinlocks</title>
<updated>2018-04-27T12:34:51Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2018-04-23T19:16:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=de5b55c1d4e30740009864eb35ce4ed856aac01d'/>
<id>urn:sha1:de5b55c1d4e30740009864eb35ce4ed856aac01d</id>
<content type='text'>
Use raw-locks in stop_machine() to allow locking in irq-off and
preempt-disabled regions on -RT. This also documents the possible locking
context in general.

[bigeasy: update patch description.]
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20180423191635.6014-1-bigeasy@linutronix.de

</content>
</entry>
<entry>
<title>stop_machine: Provide stop_machine_cpuslocked()</title>
<updated>2017-05-26T08:10:36Z</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2017-05-24T08:15:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fe5595c074005bd94f0c7d1644175941149f6768'/>
<id>urn:sha1:fe5595c074005bd94f0c7d1644175941149f6768</id>
<content type='text'>
Some call sites of stop_machine() are within a get_online_cpus() protected
region.

stop_machine() calls get_online_cpus() as well, which is possible in the
current implementation but prevents converting the hotplug locking to a
percpu rwsem.

Provide stop_machine_cpuslocked() to avoid nested calls to get_online_cpus().

Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Acked-by: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Acked-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Link: http://lkml.kernel.org/r/20170524081547.400700852@linutronix.de

</content>
</entry>
<entry>
<title>locking/core, stop_machine: Yield the CPU during stop machine()</title>
<updated>2016-11-16T09:15:09Z</updated>
<author>
<name>Christian Borntraeger</name>
<email>borntraeger@de.ibm.com</email>
</author>
<published>2016-10-25T09:03:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bf0d31c05411f030e7df9f98b5be4d0c6fd5cd39'/>
<id>urn:sha1:bf0d31c05411f030e7df9f98b5be4d0c6fd5cd39</id>
<content type='text'>
Some time ago the following commit:

  57f2ffe14fd125c2 ("s390: remove diag 44 calls from cpu_relax()")

... stopped cpu_relax() on s390 yielding to the hypervisor.

As it turns out this made stop_machine() run really slow on virtualized
overcommited systems. For example the kprobes test during bootup took
several seconds instead of just running unnoticed with large guests.

Therefore, yielding was reintroduced with commit:

  4d92f50249eb ("s390: reintroduce diag 44 calls for cpu_relax()")

... but in fact the stop machine code seems to be the only place where
this yielding was really necessary. This place is probably the most
important one as it makes all but one guest CPUs wait for one guest CPU.

As we now have cpu_relax_yield(), we can use this in multi_cpu_stop().
For now lets only add it here. We can add it later in other places
when necessary.

Signed-off-by: Christian Borntraeger &lt;borntraeger@de.ibm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Martin Schwidefsky &lt;schwidefsky@de.ibm.com&gt;
Cc: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Cc: Noam Camus &lt;noamc@ezchip.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: linuxppc-dev@lists.ozlabs.org
Cc: virtualization@lists.linux-foundation.org
Cc: xen-devel@lists.xenproject.org
Link: http://lkml.kernel.org/r/1477386195-32736-3-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2016-10-03T20:39:00Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2016-10-03T20:39:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=af79ad2b1f337a00aa150b993635b10bc68dc842'/>
<id>urn:sha1:af79ad2b1f337a00aa150b993635b10bc68dc842</id>
<content type='text'>
Pull scheduler changes from Ingo Molnar:
 "The main changes are:

   - irqtime accounting cleanups and enhancements. (Frederic Weisbecker)

   - schedstat debugging enhancements, make it more broadly runtime
     available. (Josh Poimboeuf)

   - More work on asymmetric topology/capacity scheduling. (Morten
     Rasmussen)

   - sched/wait fixes and cleanups. (Oleg Nesterov)

   - PELT (per entity load tracking) improvements. (Peter Zijlstra)

   - Rewrite and enhance select_idle_siblings(). (Peter Zijlstra)

   - sched/numa enhancements/fixes (Rik van Riel)

   - sched/cputime scalability improvements (Stanislaw Gruszka)

   - Load calculation arithmetics fixes. (Dietmar Eggemann)

   - sched/deadline enhancements (Tommaso Cucinotta)

   - Fix utilization accounting when switching to the SCHED_NORMAL
     policy. (Vincent Guittot)

   - ... plus misc cleanups and enhancements"

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (64 commits)
  sched/irqtime: Consolidate irqtime flushing code
  sched/irqtime: Consolidate accounting synchronization with u64_stats API
  u64_stats: Introduce IRQs disabled helpers
  sched/irqtime: Remove needless IRQs disablement on kcpustat update
  sched/irqtime: No need for preempt-safe accessors
  sched/fair: Fix min_vruntime tracking
  sched/debug: Add SCHED_WARN_ON()
  sched/core: Fix set_user_nice()
  sched/fair: Introduce set_curr_task() helper
  sched/core, ia64: Rename set_curr_task()
  sched/core: Fix incorrect utilization accounting when switching to fair class
  sched/core: Optimize SCHED_SMT
  sched/core: Rewrite and improve select_idle_siblings()
  sched/core: Replace sd_busy/nr_busy_cpus with sched_domain_shared
  sched/core: Introduce 'struct sched_domain_shared'
  sched/core: Restructure destroy_sched_domain()
  sched/core: Remove unused @cpu argument from destroy_sched_domain*()
  sched/wait: Introduce init_wait_entry()
  sched/wait: Avoid abort_exclusive_wait() in __wait_on_bit_lock()
  sched/wait: Avoid abort_exclusive_wait() in ___wait_event()
  ...
</content>
</entry>
<entry>
<title>stop_machine: Remove stop_cpus_lock and lg_double_lock/unlock()</title>
<updated>2016-09-22T13:25:55Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2015-11-21T18:11:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e6253970413d99f416f7de8bd516e5f1834d8216'/>
<id>urn:sha1:e6253970413d99f416f7de8bd516e5f1834d8216</id>
<content type='text'>
stop_two_cpus() and stop_cpus() use stop_cpus_lock to avoid the deadlock,
we need to ensure that the stopper functions can't be queued "backwards"
from one another. This doesn't look nice; if we use lglock then we do not
really need stopper-&gt;lock, cpu_stop_queue_work() could use lg_local_lock()
under local_irq_save().

OTOH it would be even better to avoid lglock in stop_machine.c and remove
lg_double_lock(). This patch adds "bool stop_cpus_in_progress" set/cleared
by queue_stop_cpus_work(), and changes cpu_stop_queue_two_works() to busy
wait until it is cleared.

queue_stop_cpus_work() sets stop_cpus_in_progress = T lockless, but after
it queues a work on CPU1 it must be visible to stop_two_cpus(CPU1, CPU2)
which checks it under the same lock. And since stop_two_cpus() holds the
2nd lock too, queue_stop_cpus_work() can not clear stop_cpus_in_progress
if it is also going to queue a work on CPU2, it needs to take that 2nd
lock to do this.

Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/20151121181148.GA433@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>stop_machine: Avoid a sleep and wakeup in stop_one_cpu()</title>
<updated>2016-09-22T12:53:45Z</updated>
<author>
<name>Cheng Chao</name>
<email>cs.os.kernel@gmail.com</email>
</author>
<published>2016-09-14T02:01:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bf89a304722f6904009499a31dc68ab9a5c9742e'/>
<id>urn:sha1:bf89a304722f6904009499a31dc68ab9a5c9742e</id>
<content type='text'>
In case @cpu == smp_proccessor_id(), we can avoid a sleep+wakeup
cycle by doing a preemption.

Callers such as sched_exec() can benefit from this change.

Signed-off-by: Cheng Chao &lt;cs.os.kernel@gmail.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: akpm@linux-foundation.org
Cc: chris@chris-wilson.co.uk
Cc: tj@kernel.org
Link: http://lkml.kernel.org/r/1473818510-6779-1-git-send-email-cs.os.kernel@gmail.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>stop_machine: Touch_nmi_watchdog() after MULTI_STOP_PREPARE</title>
<updated>2016-07-27T09:12:11Z</updated>
<author>
<name>Oleg Nesterov</name>
<email>oleg@redhat.com</email>
</author>
<published>2016-07-26T18:57:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ce4f06dcbb5d6d04d202f1b81ac72d5679dcdfc0'/>
<id>urn:sha1:ce4f06dcbb5d6d04d202f1b81ac72d5679dcdfc0</id>
<content type='text'>
Suppose that stop_machine(fn) hangs because fn() hangs. In this case NMI
hard-lockup can be triggered on another CPU which does nothing wrong and
the trace from nmi_panic() won't help to investigate the problem.

And this change "fixes" the problem we (seem to) hit in practice.

 - stop_two_cpus(0, 1) races with show_state_filter() running on CPU_0.

 - CPU_1 already spins in MULTI_STOP_PREPARE state, it detects the soft
   lockup and tries to report the problem.

 - show_state_filter() enables preemption, CPU_0 calls multi_cpu_stop()
   which goes to MULTI_STOP_DISABLE_IRQ state and disables interrupts.

 - CPU_1 spends more than 10 seconds trying to flush the log buffer to
   the slow serial console.

 - NMI interrupt on CPU_0 (which now waits for CPU_1) calls nmi_panic().

Reported-by: Wang Shu &lt;shuwang@redhat.com&gt;
Signed-off-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Reviewed-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Dave Anderson &lt;anderson@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@linux.vnet.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Tejun Heo &lt;tj@kernel.org&gt;
Link: http://lkml.kernel.org/r/20160726185736.GB4088@redhat.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>kernel/stop_machine.c: remove CONFIG_SMP dependencies</title>
<updated>2016-01-16T19:17:24Z</updated>
<author>
<name>Andrew Morton</name>
<email>akpm@linux-foundation.org</email>
</author>
<published>2016-01-16T00:58:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b493c34309cb4aebc44272897067ebf54cb07271'/>
<id>urn:sha1:b493c34309cb4aebc44272897067ebf54cb07271</id>
<content type='text'>
stop_machine.o is only built if CONFIG_SMP=y, so this ifdef always
evaluates to true.

[akpm@linux-foundation.org: remove now-unneeded ifdef]
Reported-by: Valentin Rothberg &lt;valentinrothberg@gmail.com&gt;
Cc: Chris Wilson &lt;chris@chris-wilson.co.uk&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
</content>
</entry>
</feed>
