<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/smp.c, branch v6.1.57</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.1.57</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.1.57'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2022-10-10T19:49:34Z</updated>
<entry>
<title>Merge tag 'bitmap-6.1-rc1' of https://github.com/norov/linux</title>
<updated>2022-10-10T19:49:34Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-10-10T19:49:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d4013bc4d49f6da8178a340348369bb9920225c9'/>
<id>urn:sha1:d4013bc4d49f6da8178a340348369bb9920225c9</id>
<content type='text'>
Pull bitmap updates from Yury Norov:

 - Fix unsigned comparison to -1 in CPUMAP_FILE_MAX_BYTES (Phil Auld)

 - cleanup nr_cpu_ids vs nr_cpumask_bits mess (me)

   This series cleans that mess and adds new config FORCE_NR_CPUS that
   allows to optimize cpumask subsystem if the number of CPUs is known
   at compile-time.

 - optimize find_bit() functions (me)

   Reworks find_bit() functions based on new FIND_{FIRST,NEXT}_BIT()
   macros.

 - add find_nth_bit() (me)

   Adds find_nth_bit(), which is ~70 times faster than bitcounting with
   for_each() loop:

	for_each_set_bit(bit, mask, size)
		if (n-- == 0)
			return bit;

   Also adds bitmap_weight_and() to let people replace this pattern:

	tmp = bitmap_alloc(nbits);
	bitmap_and(tmp, map1, map2, nbits);
	weight = bitmap_weight(tmp, nbits);
	bitmap_free(tmp);

   with a single bitmap_weight_and() call.

 - repair cpumask_check() (me)

   After switching cpumask to use nr_cpu_ids, cpumask_check() started
   generating many false-positive warnings. This series fixes it.

 - Add for_each_cpu_andnot() and for_each_cpu_andnot() (Valentin
   Schneider)

   Extends the API with one more function and applies it in sched/core.

* tag 'bitmap-6.1-rc1' of https://github.com/norov/linux: (28 commits)
  sched/core: Merge cpumask_andnot()+for_each_cpu() into for_each_cpu_andnot()
  lib/test_cpumask: Add for_each_cpu_and(not) tests
  cpumask: Introduce for_each_cpu_andnot()
  lib/find_bit: Introduce find_next_andnot_bit()
  cpumask: fix checking valid cpu range
  lib/bitmap: add tests for for_each() loops
  lib/find: optimize for_each() macros
  lib/bitmap: introduce for_each_set_bit_wrap() macro
  lib/find_bit: add find_next{,_and}_bit_wrap
  cpumask: switch for_each_cpu{,_not} to use for_each_bit()
  net: fix cpu_max_bits_warn() usage in netif_attrmask_next{,_and}
  cpumask: add cpumask_nth_{,and,andnot}
  lib/bitmap: remove bitmap_ord_to_pos
  lib/bitmap: add tests for find_nth_bit()
  lib: add find_nth{,_and,_andnot}_bit()
  lib/bitmap: add bitmap_weight_and()
  lib/bitmap: don't call __bitmap_weight() in kernel code
  tools: sync find_bit() implementation
  lib/find_bit: optimize find_next_bit() functions
  lib/find_bit: create find_first_zero_bit_le()
  ...
</content>
</entry>
<entry>
<title>lib/cpumask: add FORCE_NR_CPUS config option</title>
<updated>2022-09-20T23:11:44Z</updated>
<author>
<name>Yury Norov</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2022-09-05T23:08:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6f9c07be9d020489326098801f0661f754c7c865'/>
<id>urn:sha1:6f9c07be9d020489326098801f0661f754c7c865</id>
<content type='text'>
The size of cpumasks is hard-limited by compile-time parameter NR_CPUS,
but defined at boot-time when kernel parses ACPI/DT tables, and stored in
nr_cpu_ids. In many practical cases, number of CPUs for a target is known
at compile time, and can be provided with NR_CPUS.

In that case, compiler may be instructed to rely on NR_CPUS as on actual
number of CPUs, not an upper limit. It allows to optimize many cpumask
routines and significantly shrink size of the kernel image.

This patch adds FORCE_NR_CPUS option to teach the compiler to rely on
NR_CPUS and enable corresponding optimizations.

If FORCE_NR_CPUS=y, kernel will not set nr_cpu_ids at boot, but only check
that the actual number of possible CPUs is equal to NR_CPUS, and WARN if
that doesn't hold.

The new option is especially useful in embedded applications because
kernel configurations are unique for each SoC, the number of CPUs is
constant and known well, and memory limitations are typically harder.

For my 4-CPU ARM64 build with NR_CPUS=4, FORCE_NR_CPUS=y saves 46KB:
  add/remove: 3/4 grow/shrink: 46/729 up/down: 652/-46952 (-46300)

Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
</content>
</entry>
<entry>
<title>smp: add set_nr_cpu_ids()</title>
<updated>2022-09-20T00:51:53Z</updated>
<author>
<name>Yury Norov</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2022-09-05T23:08:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=38bef8e57f2149acd2c910a98f57dd6291d2e0ec'/>
<id>urn:sha1:38bef8e57f2149acd2c910a98f57dd6291d2e0ec</id>
<content type='text'>
In preparation to support compile-time nr_cpu_ids, add a setter for
the variable.

This is a no-op for all arches.

Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
</content>
</entry>
<entry>
<title>smp: don't declare nr_cpu_ids if NR_CPUS == 1</title>
<updated>2022-09-20T00:51:53Z</updated>
<author>
<name>Yury Norov</name>
<email>yury.norov@gmail.com</email>
</author>
<published>2022-09-05T23:08:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=53fc190cc6771c5494d782210334d4ebb50c7103'/>
<id>urn:sha1:53fc190cc6771c5494d782210334d4ebb50c7103</id>
<content type='text'>
SMP and NR_CPUS are independent options, hence nr_cpu_ids may be
declared even if NR_CPUS == 1, which is useless.

Signed-off-by: Yury Norov &lt;yury.norov@gmail.com&gt;
</content>
</entry>
<entry>
<title>sched/debug: Try trigger_single_cpu_backtrace(cpu) in dump_cpu_task()</title>
<updated>2022-08-31T12:03:14Z</updated>
<author>
<name>Zhen Lei</name>
<email>thunder.leizhen@huawei.com</email>
</author>
<published>2022-08-04T02:34:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e73dfe30930b75c98746152e7a2f6a8ab6067b51'/>
<id>urn:sha1:e73dfe30930b75c98746152e7a2f6a8ab6067b51</id>
<content type='text'>
The trigger_all_cpu_backtrace() function attempts to send an NMI to the
target CPU, which usually provides much better stack traces than the
dump_cpu_task() function's approach of dumping that stack from some other
CPU.  So much so that most calls to dump_cpu_task() only happen after
a call to trigger_all_cpu_backtrace() has failed.  And the exception to
this rule really should attempt to use trigger_all_cpu_backtrace() first.

Therefore, move the trigger_all_cpu_backtrace() invocation into
dump_cpu_task().

Signed-off-by: Zhen Lei &lt;thunder.leizhen@huawei.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Cc: Dietmar Eggemann &lt;dietmar.eggemann@arm.com&gt;
Cc: Ben Segall &lt;bsegall@google.com&gt;
Cc: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Daniel Bristot de Oliveira &lt;bristot@redhat.com&gt;
Cc: Valentin Schneider &lt;vschneid@redhat.com&gt;
</content>
</entry>
<entry>
<title>locking/csd_lock: Change csdlock_debug from early_param to __setup</title>
<updated>2022-07-19T18:40:00Z</updated>
<author>
<name>Chen Zhongjin</name>
<email>chenzhongjin@huawei.com</email>
</author>
<published>2022-05-10T09:46:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9c9b26b0df270d4f9246e483a44686fca951a29c'/>
<id>urn:sha1:9c9b26b0df270d4f9246e483a44686fca951a29c</id>
<content type='text'>
The csdlock_debug kernel-boot parameter is parsed by the
early_param() function csdlock_debug().  If set, csdlock_debug()
invokes static_branch_enable() to enable csd_lock_wait feature, which
triggers a panic on arm64 for kernels built with CONFIG_SPARSEMEM=y and
CONFIG_SPARSEMEM_VMEMMAP=n.

With CONFIG_SPARSEMEM_VMEMMAP=n, __nr_to_section is called in
static_key_enable() and returns NULL, resulting in a NULL dereference
because mem_section is initialized only later in sparse_init().

This is also a problem for powerpc because early_param() functions
are invoked earlier than jump_label_init(), also resulting in
static_key_enable() failures.  These failures cause the warning "static
key 'xxx' used before call to jump_label_init()".

Thus, early_param is too early for csd_lock_wait to run
static_branch_enable(), so changes it to __setup to fix these.

Fixes: 8d0968cc6b8f ("locking/csd_lock: Add boot parameter for controlling CSD lock debugging")
Cc: stable@vger.kernel.org
Reported-by: Chen jingwen &lt;chenjingwen6@huawei.com&gt;
Signed-off-by: Chen Zhongjin &lt;chenzhongjin@huawei.com&gt;
Signed-off-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge tag 'sched-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2022-05-24T18:11:13Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-05-24T18:11:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6f3f04c19074972ea12edeed23b07a32894e9e03'/>
<id>urn:sha1:6f3f04c19074972ea12edeed23b07a32894e9e03</id>
<content type='text'>
Pull scheduler updates from Ingo Molnar:

 - Updates to scheduler metrics:
     - PELT fixes &amp; enhancements
     - PSI fixes &amp; enhancements
     - Refactor cpu_util_without()

 - Updates to instrumentation/debugging:
     - Remove sched_trace_*() helper functions - can be done via debug
       info
     - Fix double update_rq_clock() warnings

 - Introduce &amp; use "preemption model accessors" to simplify some of the
   Kconfig complexity.

 - Make softirq handling RT-safe.

 - Misc smaller fixes &amp; cleanups.

* tag 'sched-core-2022-05-23' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  topology: Remove unused cpu_cluster_mask()
  sched: Reverse sched_class layout
  sched/deadline: Remove superfluous rq clock update in push_dl_task()
  sched/core: Avoid obvious double update_rq_clock warning
  smp: Make softirq handling RT safe in flush_smp_call_function_queue()
  smp: Rename flush_smp_call_function_from_idle()
  sched: Fix missing prototype warnings
  sched/fair: Remove cfs_rq_tg_path()
  sched/fair: Remove sched_trace_*() helper functions
  sched/fair: Refactor cpu_util_without()
  sched/fair: Revise comment about lb decision matrix
  sched/psi: report zeroes for CPU full at the system level
  sched/fair: Delete useless condition in tg_unthrottle_up()
  sched/fair: Fix cfs_rq_clock_pelt() for throttled cfs_rq
  sched/fair: Move calculate of avg_load to a better location
  mailmap: Update my email address to @redhat.com
  MAINTAINERS: Add myself as scheduler topology reviewer
  psi: Fix trigger being fired unexpectedly at initial
  ftrace: Use preemption model accessors for trace header printout
  kcsan: Use preemption model accessors
</content>
</entry>
<entry>
<title>Merge tag 'rcu.2022.05.19a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu</title>
<updated>2022-05-23T18:46:51Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2022-05-23T18:46:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1e57930e9f4083ad5854ab6eadffe790a8167fb4'/>
<id>urn:sha1:1e57930e9f4083ad5854ab6eadffe790a8167fb4</id>
<content type='text'>
Pull RCU update from Paul McKenney:

 - Documentation updates

 - Miscellaneous fixes

 - Callback-offloading updates, mainly simplifications

 - RCU-tasks updates, including some -rt fixups, handling of systems
   with sparse CPU numbering, and a fix for a boot-time race-condition
   failure

 - Put SRCU on a memory diet in order to reduce the size of the
   srcu_struct structure

 - Torture-test updates fixing some bugs in tests and closing some
   testing holes

 - Torture-test updates for the RCU tasks flavors, most notably ensuring
   that building rcutorture and friends does not change the
   RCU-tasks-related Kconfig options

 - Torture-test scripting updates

 - Expedited grace-period updates, most notably providing
   milliseconds-scale (not all that) soft real-time response from
   synchronize_rcu_expedited().

   This is also the first time in almost 30 years of RCU that someone
   other than me has pushed for a reduction in the RCU CPU stall-warning
   timeout, in this case by more than three orders of magnitude from 21
   seconds to 20 milliseconds. This tighter timeout applies only to
   expedited grace periods

* tag 'rcu.2022.05.19a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: (80 commits)
  rcu: Move expedited grace period (GP) work to RT kthread_worker
  rcu: Introduce CONFIG_RCU_EXP_CPU_STALL_TIMEOUT
  srcu: Drop needless initialization of sdp in srcu_gp_start()
  srcu: Prevent expedited GPs and blocking readers from consuming CPU
  srcu: Add contention check to call_srcu() srcu_data -&gt;lock acquisition
  srcu: Automatically determine size-transition strategy at boot
  rcutorture: Make torture.sh allow for --kasan
  rcutorture: Make torture.sh refscale and rcuscale specify Tasks Trace RCU
  rcutorture: Make kvm.sh allow more memory for --kasan runs
  torture: Save "make allmodconfig" .config file
  scftorture: Remove extraneous "scf" from per_version_boot_params
  rcutorture: Adjust scenarios' Kconfig options for CONFIG_PREEMPT_DYNAMIC
  torture: Enable CSD-lock stall reports for scftorture
  torture: Skip vmlinux check for kvm-again.sh runs
  scftorture: Adjust for TASKS_RCU Kconfig option being selected
  rcuscale: Allow rcuscale without RCU Tasks Rude/Trace
  rcuscale: Allow rcuscale without RCU Tasks
  refscale: Allow refscale without RCU Tasks Rude/Trace
  refscale: Allow refscale without RCU Tasks
  rcutorture: Allow specifying per-scenario stat_interval
  ...
</content>
</entry>
<entry>
<title>Merge tag 'v5.18-rc5' into sched/core to pull in fixes &amp; to resolve a conflict</title>
<updated>2022-05-06T08:21:46Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@kernel.org</email>
</author>
<published>2022-05-06T08:21:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d70522fc541224b8351ac26f4765f2c6268f8d72'/>
<id>urn:sha1:d70522fc541224b8351ac26f4765f2c6268f8d72</id>
<content type='text'>
 - sched/core is on a pretty old -rc1 base - refresh it to include recent fixes.
 - this also allows up to resolve a (trivial) .mailmap conflict

Conflicts:
	.mailmap

Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>smp: Make softirq handling RT safe in flush_smp_call_function_queue()</title>
<updated>2022-05-01T08:03:43Z</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2022-04-13T13:31:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1a90bfd220201fbe050dfc15deaac20ca5f15638'/>
<id>urn:sha1:1a90bfd220201fbe050dfc15deaac20ca5f15638</id>
<content type='text'>
flush_smp_call_function_queue() invokes do_softirq() which is not available
on PREEMPT_RT. flush_smp_call_function_queue() is invoked from the idle
task and the migration task with preemption or interrupts disabled.

So RT kernels cannot process soft interrupts in that context as that has to
acquire 'sleeping spinlocks' which is not possible with preemption or
interrupts disabled and forbidden from the idle task anyway.

The currently known SMP function call which raises a soft interrupt is in
the block layer, but this functionality is not enabled on RT kernels due to
latency and performance reasons.

RT could wake up ksoftirqd unconditionally, but this wants to be avoided if
there were soft interrupts pending already when this is invoked in the
context of the migration task. The migration task might have preempted a
threaded interrupt handler which raised a soft interrupt, but did not reach
the local_bh_enable() to process it. The "running" ksoftirqd might prevent
the handling in the interrupt thread context which is causing latency
issues.

Add a new function which handles this case explicitely for RT and falls
back to do_softirq() on !RT kernels. In the RT case this warns when one of
the flushed SMP function calls raised a soft interrupt so this can be
investigated.

[ tglx: Moved the RT part out of SMP code ]

Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/YgKgL6aPj8aBES6G@linutronix.de
Link: https://lore.kernel.org/r/20220413133024.356509586@linutronix.de

</content>
</entry>
</feed>
