<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/cpu.c, branch v5.10.82</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.82</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.82'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2021-07-19T07:44:59Z</updated>
<entry>
<title>cpu/hotplug: Cure the cpusets trainwreck</title>
<updated>2021-07-19T07:44:59Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2021-03-27T21:01:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b5e26be407e642dc0ff00fd09387c48d36725a0a'/>
<id>urn:sha1:b5e26be407e642dc0ff00fd09387c48d36725a0a</id>
<content type='text'>
commit b22afcdf04c96ca58327784e280e10288cfd3303 upstream.

Alexey and Joshua tried to solve a cpusets related hotplug problem which is
user space visible and results in unexpected behaviour for some time after
a CPU has been plugged in and the corresponding uevent was delivered.

cpusets delegate the hotplug work (rebuilding cpumasks etc.) to a
workqueue. This is done because the cpusets code has already a lock
nesting of cgroups_mutex -&gt; cpu_hotplug_lock. A synchronous callback or
waiting for the work to finish with cpu_hotplug_lock held can and will
deadlock because that results in the reverse lock order.

As a consequence the uevent can be delivered before cpusets have consistent
state which means that a user space invocation of sched_setaffinity() to
move a task to the plugged CPU fails up to the point where the scheduled
work has been processed.

The same is true for CPU unplug, but that does not create user observable
failure (yet).

It's still inconsistent to claim that an operation is finished before it
actually is and that's the real issue at hand. uevents just make it
reliably observable.

Obviously the problem should be fixed in cpusets/cgroups, but untangling
that is pretty much impossible because according to the changelog of the
commit which introduced this 8 years ago:

 3a5a6d0c2b03("cpuset: don't nest cgroup_mutex inside get_online_cpus()")

the lock order cgroups_mutex -&gt; cpu_hotplug_lock is a design decision and
the whole code is built around that.

So bite the bullet and invoke the relevant cpuset function, which waits for
the work to finish, in _cpu_up/down() after dropping cpu_hotplug_lock and
only when tasks are not frozen by suspend/hibernate because that would
obviously wait forever.

Waiting there with cpu_add_remove_lock, which is protecting the present
and possible CPU maps, held is not a problem at all because neither work
queues nor cpusets/cgroups have any lockchains related to that lock.

Waiting in the hotplug machinery is not problematic either because there
are already state callbacks which wait for hardware queues to drain. It
makes the operations slightly slower, but hotplug is slow anyway.

This ensures that state is consistent before returning from a hotplug
up/down operation. It's still inconsistent during the operation, but that's
a different story.

Add a large comment which explains why this is done and why this is not a
dump ground for the hack of the day to work around half thought out locking
schemes. Document also the implications vs. hotplug operations and
serialization or the lack of it.

Thanks to Alexy and Joshua for analyzing why this temporary
sched_setaffinity() failure happened.

Fixes: 3a5a6d0c2b03("cpuset: don't nest cgroup_mutex inside get_online_cpus()")
Reported-by: Alexey Klimov &lt;aklimov@redhat.com&gt;
Reported-by: Joshua Baker &lt;jobaker@redhat.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Tested-by: Alexey Klimov &lt;aklimov@redhat.com&gt;
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/87tuowcnv3.ffs@nanos.tec.linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>kernel/cpu: add arch override for clear_tasks_mm_cpumask() mm handling</title>
<updated>2020-11-26T13:10:39Z</updated>
<author>
<name>Nicholas Piggin</name>
<email>npiggin@gmail.com</email>
</author>
<published>2020-11-26T10:25:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8ff00399b153440c1c83e20c43020385b416415b'/>
<id>urn:sha1:8ff00399b153440c1c83e20c43020385b416415b</id>
<content type='text'>
powerpc/64s keeps a counter in the mm which counts bits set in
mm_cpumask as well as other things. This means it can't use generic code
to clear bits out of the mask and doesn't adjust the arch specific
counter.

Add an arch override that allows powerpc/64s to use
clear_tasks_mm_cpumask().

Signed-off-by: Nicholas Piggin &lt;npiggin@gmail.com&gt;
Reviewed-by: Aneesh Kumar K.V &lt;aneesh.kumar@linux.ibm.com&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt;
Link: https://lore.kernel.org/r/20201126102530.691335-4-npiggin@gmail.com
</content>
</entry>
<entry>
<title>Merge tag 'sched-core-2020-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2020-06-03T20:06:42Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-06-03T20:06:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d479c5a1919b4e569dcd3ae9c84ed74a675d0b94'/>
<id>urn:sha1:d479c5a1919b4e569dcd3ae9c84ed74a675d0b94</id>
<content type='text'>
Pull scheduler updates from Ingo Molnar:
 "The changes in this cycle are:

   - Optimize the task wakeup CPU selection logic, to improve
     scalability and reduce wakeup latency spikes

   - PELT enhancements

   - CFS bandwidth handling fixes

   - Optimize the wakeup path by remove rq-&gt;wake_list and replacing it
     with -&gt;ttwu_pending

   - Optimize IPI cross-calls by making flush_smp_call_function_queue()
     process sync callbacks first.

   - Misc fixes and enhancements"

* tag 'sched-core-2020-06-02' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (36 commits)
  irq_work: Define irq_work_single() on !CONFIG_IRQ_WORK too
  sched/headers: Split out open-coded prototypes into kernel/sched/smp.h
  sched: Replace rq::wake_list
  sched: Add rq::ttwu_pending
  irq_work, smp: Allow irq_work on call_single_queue
  smp: Optimize send_call_function_single_ipi()
  smp: Move irq_work_run() out of flush_smp_call_function_queue()
  smp: Optimize flush_smp_call_function_queue()
  sched: Fix smp_call_function_single_async() usage for ILB
  sched/core: Offload wakee task activation if it the wakee is descheduling
  sched/core: Optimize ttwu() spinning on p-&gt;on_cpu
  sched: Defend cfs and rt bandwidth quota against overflow
  sched/cpuacct: Fix charge cpuacct.usage_sys
  sched/fair: Replace zero-length array with flexible-array
  sched/pelt: Sync util/runnable_sum with PELT window when propagating
  sched/cpuacct: Use __this_cpu_add() instead of this_cpu_ptr()
  sched/fair: Optimize enqueue_task_fair()
  sched: Make scheduler_ipi inline
  sched: Clean up scheduler_ipi()
  sched/core: Simplify sched_init()
  ...
</content>
</entry>
<entry>
<title>cpu/hotplug: Remove __freeze_secondary_cpus()</title>
<updated>2020-05-07T13:18:41Z</updated>
<author>
<name>Qais Yousef</name>
<email>qais.yousef@arm.com</email>
</author>
<published>2020-04-30T11:40:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fb7fb84a0c4e8021ddecb157802d58241a3f1a40'/>
<id>urn:sha1:fb7fb84a0c4e8021ddecb157802d58241a3f1a40</id>
<content type='text'>
The refactored function is no longer required as the codepaths that call
freeze_secondary_cpus() are all suspend/resume related now.

Signed-off-by: Qais Yousef &lt;qais.yousef@arm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@rjwysocki.net&gt;
Link: https://lkml.kernel.org/r/20200430114004.17477-2-qais.yousef@arm.com

</content>
</entry>
<entry>
<title>cpu/hotplug: Remove disable_nonboot_cpus()</title>
<updated>2020-05-07T13:18:40Z</updated>
<author>
<name>Qais Yousef</name>
<email>qais.yousef@arm.com</email>
</author>
<published>2020-04-30T11:40:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=565558558985b1d7cd43b21f18c1ad6b232788d0'/>
<id>urn:sha1:565558558985b1d7cd43b21f18c1ad6b232788d0</id>
<content type='text'>
The single user could have called freeze_secondary_cpus() directly.

Since this function was a source of confusion, remove it as it's
just a pointless wrapper.

While at it, rename enable_nonboot_cpus() to thaw_secondary_cpus() to
preserve the naming symmetry.

Done automatically via:

	git grep -l enable_nonboot_cpus | xargs sed -i 's/enable_nonboot_cpus/thaw_secondary_cpus/g'

Signed-off-by: Qais Yousef &lt;qais.yousef@arm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: "Rafael J. Wysocki" &lt;rjw@rjwysocki.net&gt;
Link: https://lkml.kernel.org/r/20200430114004.17477-1-qais.yousef@arm.com

</content>
</entry>
<entry>
<title>sched/core: Fix illegal RCU from offline CPUs</title>
<updated>2020-04-30T18:14:41Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2020-04-01T21:40:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bf2c59fce4074e55d622089b34be3a6bc95484fb'/>
<id>urn:sha1:bf2c59fce4074e55d622089b34be3a6bc95484fb</id>
<content type='text'>
In the CPU-offline process, it calls mmdrop() after idle entry and the
subsequent call to cpuhp_report_idle_dead(). Once execution passes the
call to rcu_report_dead(), RCU is ignoring the CPU, which results in
lockdep complaining when mmdrop() uses RCU from either memcg or
debugobjects below.

Fix it by cleaning up the active_mm state from BP instead. Every arch
which has CONFIG_HOTPLUG_CPU should have already called idle_task_exit()
from AP. The only exception is parisc because it switches them to
&amp;init_mm unconditionally (see smp_boot_one_cpu() and smp_cpu_init()),
but the patch will still work there because it calls mmgrab(&amp;init_mm) in
smp_cpu_init() and then should call mmdrop(&amp;init_mm) in finish_cpu().

  WARNING: suspicious RCU usage
  -----------------------------
  kernel/workqueue.c:710 RCU or wq_pool_mutex should be held!

  other info that might help us debug this:

  RCU used illegally from offline CPU!
  Call Trace:
   dump_stack+0xf4/0x164 (unreliable)
   lockdep_rcu_suspicious+0x140/0x164
   get_work_pool+0x110/0x150
   __queue_work+0x1bc/0xca0
   queue_work_on+0x114/0x120
   css_release+0x9c/0xc0
   percpu_ref_put_many+0x204/0x230
   free_pcp_prepare+0x264/0x570
   free_unref_page+0x38/0xf0
   __mmdrop+0x21c/0x2c0
   idle_task_exit+0x170/0x1b0
   pnv_smp_cpu_kill_self+0x38/0x2e0
   cpu_die+0x48/0x64
   arch_cpu_idle_dead+0x30/0x50
   do_idle+0x2f4/0x470
   cpu_startup_entry+0x38/0x40
   start_secondary+0x7a8/0xa80
   start_secondary_resume+0x10/0x14

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Qian Cai &lt;cai@lca.pw&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Michael Ellerman &lt;mpe@ellerman.id.au&gt; (powerpc)
Link: https://lkml.kernel.org/r/20200401214033.8448-1-cai@lca.pw
</content>
</entry>
<entry>
<title>cpu/hotplug: Fix a typo in comment "broadacasted"-&gt;"broadcasted"</title>
<updated>2020-04-27T20:42:04Z</updated>
<author>
<name>Ethon Paul</name>
<email>ethp@qq.com</email>
</author>
<published>2020-04-17T16:40:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=182e073f68a080a29920f6dca796ccf4806b0329'/>
<id>urn:sha1:182e073f68a080a29920f6dca796ccf4806b0329</id>
<content type='text'>
Signed-off-by: Ethon Paul &lt;ethp@qq.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20200417164008.6541-1-ethp@qq.com

</content>
</entry>
<entry>
<title>Merge tag 'smp-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2020-03-31T01:06:39Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2020-03-31T01:06:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=992a1a3b45b5c0b6e69ecc2a3f32b0d02da28d58'/>
<id>urn:sha1:992a1a3b45b5c0b6e69ecc2a3f32b0d02da28d58</id>
<content type='text'>
Pull core SMP updates from Thomas Gleixner:
 "CPU (hotplug) updates:

   - Support for locked CSD objects in smp_call_function_single_async()
     which allows to simplify callsites in the scheduler core and MIPS

   - Treewide consolidation of CPU hotplug functions which ensures the
     consistency between the sysfs interface and kernel state. The low
     level functions cpu_up/down() are now confined to the core code and
     not longer accessible from random code"

* tag 'smp-core-2020-03-30' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (22 commits)
  cpu/hotplug: Ignore pm_wakeup_pending() for disable_nonboot_cpus()
  cpu/hotplug: Hide cpu_up/down()
  cpu/hotplug: Move bringup of secondary CPUs out of smp_init()
  torture: Replace cpu_up/down() with add/remove_cpu()
  firmware: psci: Replace cpu_up/down() with add/remove_cpu()
  xen/cpuhotplug: Replace cpu_up/down() with device_online/offline()
  parisc: Replace cpu_up/down() with add/remove_cpu()
  sparc: Replace cpu_up/down() with add/remove_cpu()
  powerpc: Replace cpu_up/down() with add/remove_cpu()
  x86/smp: Replace cpu_up/down() with add/remove_cpu()
  arm64: hibernate: Use bringup_hibernate_cpu()
  cpu/hotplug: Provide bringup_hibernate_cpu()
  arm64: Use reboot_cpu instead of hardconding it to 0
  arm64: Don't use disable_nonboot_cpus()
  ARM: Use reboot_cpu instead of hardcoding it to 0
  ARM: Don't use disable_nonboot_cpus()
  ia64: Replace cpu_down() with smp_shutdown_nonboot_cpus()
  cpu/hotplug: Create a new function to shutdown nonboot cpus
  cpu/hotplug: Add new {add,remove}_cpu() functions
  sched/core: Remove rq.hrtick_csd_pending
  ...
</content>
</entry>
<entry>
<title>cpu/hotplug: Ignore pm_wakeup_pending() for disable_nonboot_cpus()</title>
<updated>2020-03-28T10:42:55Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2020-03-27T11:06:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e98eac6ff1b45e4e73f2e6031b37c256ccb5d36b'/>
<id>urn:sha1:e98eac6ff1b45e4e73f2e6031b37c256ccb5d36b</id>
<content type='text'>
A recent change to freeze_secondary_cpus() which added an early abort if a
wakeup is pending missed the fact that the function is also invoked for
shutdown, reboot and kexec via disable_nonboot_cpus().

In case of disable_nonboot_cpus() the wakeup event needs to be ignored as
the purpose is to terminate the currently running kernel.

Add a 'suspend' argument which is only set when the freeze is in context of
a suspend operation. If not set then an eventually pending wakeup event is
ignored.

Fixes: a66d955e910a ("cpu/hotplug: Abort disabling secondary CPUs if wakeup is pending")
Reported-by: Boqun Feng &lt;boqun.feng@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Pavankumar Kondeti &lt;pkondeti@codeaurora.org&gt;
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/874kuaxdiz.fsf@nanos.tec.linutronix.de


</content>
</entry>
<entry>
<title>cpu/hotplug: Hide cpu_up/down()</title>
<updated>2020-03-25T11:59:38Z</updated>
<author>
<name>Qais Yousef</name>
<email>qais.yousef@arm.com</email>
</author>
<published>2020-03-23T13:51:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=33c3736ec88811b9b6f6ce2cc8967f6b97c3db5e'/>
<id>urn:sha1:33c3736ec88811b9b6f6ce2cc8967f6b97c3db5e</id>
<content type='text'>
Use separate functions for the device core to bring a CPU up and down.

Users outside the device core must use add/remove_cpu() which will take
care of extra housekeeping work like keeping sysfs in sync.

Make cpu_up/down() static and replace the extra layer of indirection.

[ tglx: Removed the extra wrapper functions and adjusted function names ]

Signed-off-by: Qais Yousef &lt;qais.yousef@arm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: https://lkml.kernel.org/r/20200323135110.30522-18-qais.yousef@arm.com
</content>
</entry>
</feed>
