user/sven/linux.git/kernel/smp.c, branch v6.16.1

Merge tag 'csd-lock.2025.01.28a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu

2025-01-28T19:34:03Z

Pull CSD-lock update from Paul McKenney: "Allow runtime modification of the csd_lock_timeout and panic_on_ipistall module parameters" * tag 'csd-lock.2025.01.28a' of git://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu: locking/csd-lock: make CSD lock debug tunables writable in /sys

locking/csd-lock: make CSD lock debug tunables writable in /sys

2024-12-12T04:50:11Z

Currently the CSD lock tunables can only be set at boot time on the kernel command line, but the way these variables are used means there is really no reason not to tune them at runtime through /sys. Make the CSD lock debug tunables writable through /sys. Signed-off-by: Rik van Riel Signed-off-by: Paul E. McKenney

smp/scf: Evaluate local cond_func() before IPI side-effects

2024-12-05T13:25:28Z

In smp_call_function_many_cond(), the local cond_func() is evaluated after triggering the remote CPU IPIs. If cond_func() depends on loading shared state updated by func() in other CPUs' IPI handlers, then triggering the remote CPUs' IPIs before evaluating cond_func() may have unexpected consequences. One example scenario is evaluating a jiffies delay in cond_func(), which is updated by func() in the IPI handlers. This situation can prevent execution of periodic cleanup code on the local CPU. Signed-off-by: Mathieu Desnoyers Signed-off-by: Ingo Molnar Reviewed-by: Rik van Riel Cc: Linus Torvalds Cc: Peter Zijlstra Cc: Thomas Gleixner Link: https://lore.kernel.org/r/20241203163558.3455535-1-mathieu.desnoyers@efficios.com
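
The ordering problem can be modeled in plain userspace C. This is only a sketch under stated assumptions, not the kernel code: `cleanup_due()`, `do_cleanup()` and `call_many_cond()` are hypothetical stand-ins for cond_func(), func() and smp_call_function_many_cond(), and the remote IPI handler is simulated as a synchronous call that completes before the local CPU looks at the shared state.

```c
#include <stdbool.h>

/* Hypothetical shared state: a timestamp that func() refreshes. */
static unsigned long last_cleanup;
/* Counts how often the local CPU actually ran the cleanup. */
static int local_runs;

/* Stand-in for cond_func(): cleanup is due if the timestamp is stale. */
static bool cleanup_due(unsigned long now)
{
	return now - last_cleanup >= 10;
}

/* Stand-in for func(): do the cleanup and refresh the shared timestamp. */
static void do_cleanup(unsigned long now, bool is_local)
{
	last_cleanup = now;
	if (is_local)
		local_runs++;
}

/*
 * Model of one conditional cross-call. With cond_first == false the
 * local condition is checked only after the simulated remote handler
 * has already updated the shared state (the old ordering). With
 * cond_first == true it is checked before any IPI side effects.
 */
static void call_many_cond(unsigned long now, bool cond_first)
{
	bool run_local = false;

	if (cond_first)
		run_local = cleanup_due(now);

	do_cleanup(now, false);		/* simulated remote IPI handler */

	if (!cond_first)
		run_local = cleanup_due(now);

	if (run_local)
		do_cleanup(now, true);
}
```

In this model, the old ordering never runs the cleanup locally once a remote handler has refreshed the timestamp, while evaluating the condition first does.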

locking/csd-lock: Switch from sched_clock() to ktime_get_mono_fast_ns()

2024-10-11T16:31:21Z

Currently, the CONFIG_CSD_LOCK_WAIT_DEBUG code uses sched_clock() to check for excessive CSD-lock wait times. This works, but does not guarantee monotonic timestamps on x86 due to the sched_clock() function's use of the rdtsc instruction, which does not guarantee ordering. This means that, given successive calls to sched_clock(), the second might return an earlier time than the first, that is, time might seem to go backwards. This can (and does!) result in false-positive CSD-lock wait complaints claiming almost 2^64 nanoseconds of delay. Therefore, switch from sched_clock() to ktime_get_mono_fast_ns(), which does guarantee monotonic timestamps via the rdtsc_ordered() function, which, as the name implies, does guarantee ordered timestamps, at least in the absence of calls from NMI handlers, which are not involved in this code path. Signed-off-by: Paul E. McKenney Reviewed-by: Rik van Riel Cc: Neeraj Upadhyay Cc: Leonardo Bras Cc: Thomas Gleixner Cc: "Peter Zijlstra (Intel)"

smp: print only local CPU info when sched_clock goes backward

2024-08-14T18:36:48Z

About 40% of all csd_lock warnings observed in our fleet appear to be due to sched_clock() going backward in time (usually only a little bit), resulting in ts0 being larger than ts2. When the local CPU is at fault, we should print out a message reflecting that, rather than trying to get the remote CPU's stack trace. Signed-off-by: Rik van Riel Tested-by: "Paul E. McKenney" Signed-off-by: Neeraj Upadhyay
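
A minimal userspace sketch of the check described above, assuming 64-bit nanosecond timestamps named ts0 and ts2 as in the message. The helper names are hypothetical and the kernel's actual code may be structured differently. It also shows why an unchecked unsigned subtraction reports an enormous delay when the clock steps back by even one nanosecond.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when the later sample (ts2) is smaller than the earlier one (ts0). */
static bool clock_went_backward(uint64_t ts0, uint64_t ts2)
{
	return ts0 > ts2;
}

/*
 * Unchecked elapsed time. Unsigned arithmetic wraps modulo 2^64, so a
 * clock that went backward by 1 ns yields 2^64 - 1 instead of -1.
 */
static uint64_t elapsed_ns_unchecked(uint64_t ts0, uint64_t ts2)
{
	return ts2 - ts0;
}
```

A caller would test `clock_went_backward()` first and report a local clock problem instead of treating the wrapped difference as a real stall.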

locking/csd-lock: Use backoff for repeated reports of same incident

2024-08-14T18:36:48Z

Currently, the CSD-lock diagnostics in CONFIG_CSD_LOCK_WAIT_DEBUG=y kernels are emitted at five-second intervals. Although this has proven to be a good time interval for the first diagnostic, if the target CPU keeps interrupts disabled for way longer than five seconds, the ratio of useful new information to pointless repetition drops considerably. Therefore, back off the time period for repeated reports of the same incident, increasing linearly with the number of reports and logarithmically with the number of online CPUs. [ paulmck: Apply Dan Carpenter feedback. ] Signed-off-by: Paul E. McKenney Cc: Imran Khan Cc: Ingo Molnar Cc: Leonardo Bras Cc: "Peter Zijlstra (Intel)" Cc: Rik van Riel Reviewed-by: Rik van Riel Signed-off-by: Neeraj Upadhyay
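
One way such a backoff could be computed is sketched below in userspace C. The scaling used here (linear in the number of reports already printed, and `ilog2(ncpus) / 2 + 1` for repeat reports) is an assumption chosen to match the description; the exact expression in kernel/smp.c may differ. `report_threshold_ns()` and `ilog2_u()` are hypothetical names.

```c
#include <stdint.h>

/* Integer base-2 logarithm, rounded down; returns 0 for n <= 1. */
static unsigned int ilog2_u(unsigned int n)
{
	unsigned int log = 0;

	while (n >>= 1)
		log++;
	return log;
}

/*
 * Time that must elapse since the previous check before the next
 * report for the same incident is printed.
 *
 * timeout_ns: base interval (five seconds in the message above)
 * nmessages:  reports already printed for this incident
 * ncpus:      number of online CPUs
 */
static uint64_t report_threshold_ns(uint64_t timeout_ns,
				    unsigned int nmessages,
				    unsigned int ncpus)
{
	/* The first report is never delayed by the CPU-count factor. */
	unsigned int cpu_factor = nmessages ? ilog2_u(ncpus) / 2 + 1 : 1;

	return timeout_ns * (nmessages + 1) * cpu_factor;
}
```

Under these assumptions, a 64-CPU system with a five-second base interval prints its first report after 5 s and requires a further 40 s before the second.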

locking/csd_lock: Provide an indication of ongoing CSD-lock stall

2024-08-14T18:35:39Z

If a CSD-lock stall goes on long enough, it will cause an RCU CPU stall warning. This additional warning provides much additional console-log traffic and little additional information. Therefore, provide a new csd_lock_is_stuck() function that returns true if there is an ongoing CSD-lock stall. This function will be used by the RCU CPU stall warnings to provide a one-line indication of the stall when this function returns true. [ neeraj.upadhyay: Apply Rik van Riel feedback. ] [ neeraj.upadhyay: Apply kernel test robot feedback. ] Signed-off-by: Paul E. McKenney Cc: Imran Khan Cc: Ingo Molnar Cc: Leonardo Bras Cc: "Peter Zijlstra (Intel)" Cc: Rik van Riel Signed-off-by: Neeraj Upadhyay
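
A predicate like this can be backed by a counter of stalls currently in progress. The following userspace sketch uses C11 atomics and assumes that design; `stall_begin()`, `stall_end()` and `csd_lock_is_stuck_sketch()` are hypothetical names, not the kernel's implementation.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Number of CSD-lock stalls currently being reported (assumed design). */
static atomic_int n_stuck;

/* Called when a waiter first decides its CSD lock is stuck. */
static void stall_begin(void)
{
	atomic_fetch_add(&n_stuck, 1);
}

/* Called when that waiter finally acquires the lock. */
static void stall_end(void)
{
	atomic_fetch_sub(&n_stuck, 1);
}

/* Cheap query for other subsystems, such as a stall-warning path. */
static bool csd_lock_is_stuck_sketch(void)
{
	return atomic_load(&n_stuck) > 0;
}
```

A stall-warning path could then print a single line when the predicate is true instead of repeating the full diagnostics.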

locking/csd_lock: Print large numbers as negatives

2024-07-29T02:14:38Z

The CSD-lock-hold diagnostics from CONFIG_CSD_LOCK_WAIT_DEBUG are printed in nanoseconds as unsigned long longs, which is a bit obtuse for human readers when timing bugs result in negative CSD-lock hold times. Yes, there are some people to whom it is immediately obvious that 18446744073709551615 is really -1, but for the rest of us... Therefore, print these numbers as signed long longs, making the negative hold times immediately apparent. Reported-by: Rik van Riel Signed-off-by: Paul E. McKenney Cc: Imran Khan Cc: Ingo Molnar Cc: Leonardo Bras Cc: "Peter Zijlstra (Intel)" Cc: Rik van Riel Reviewed-by: Rik van Riel Signed-off-by: Neeraj Upadhyay
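
The difference is easy to reproduce in userspace. This sketch formats the same 64-bit value both ways; the function names and the " ns" suffix are illustrative, not the kernel's message format.

```c
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Old style: the raw value printed as an unsigned 64-bit integer. */
static int format_hold_unsigned(char *buf, size_t len, uint64_t ns)
{
	return snprintf(buf, len, "%" PRIu64 " ns", ns);
}

/* New style: the same bits reinterpreted and printed as signed. */
static int format_hold_signed(char *buf, size_t len, uint64_t ns)
{
	return snprintf(buf, len, "%" PRId64 " ns", (int64_t)ns);
}
```

For a hold time that underflowed by one nanosecond, the first helper produces "18446744073709551615 ns" and the second "-1 ns".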

smp: Add missing destroy_work_on_stack() call in smp_call_on_cpu()

2024-07-10T20:40:39Z

For CONFIG_DEBUG_OBJECTS_WORK=y kernels, sscs.work, defined by INIT_WORK_ONSTACK(), is initialized by debug_object_init_on_stack() so that the debug check in __init_work() works correctly. But the counterpart that removes the tracked object from debug objects again is missing, which will cause a debug object warning once the stack is freed. Add the missing destroy_work_on_stack() invocation to cure that. [ tglx: Massaged changelog ] Signed-off-by: Zqiang Signed-off-by: Thomas Gleixner Tested-by: Paul E. McKenney Link: https://lore.kernel.org/r/20240704065213.13559-1-qiang.zhang1211@gmail.com

smp: Use str_plural() to fix Coccinelle warnings

2024-06-17T13:17:44Z

Fixes the following two Coccinelle/coccicheck warnings reported by string_choices.cocci: "opportunity for str_plural(num_cpus)" and "opportunity for str_plural(num_nodes)". Signed-off-by: Thorsten Blum Signed-off-by: Thomas Gleixner Acked-by: Paul E. McKenney Link: https://lore.kernel.org/r/20240508154225.309703-2-thorsten.blum@toblux.com
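
For readers unfamiliar with the helper, here is a userspace sketch of what str_plural() does and how it replaces an open-coded ternary. The local definition mirrors the helper's behavior as commonly understood (empty string for exactly one, "s" otherwise); `format_bringup()` and its message text are illustrative assumptions.

```c
#include <stddef.h>
#include <stdio.h>

/* Userspace stand-in: "" for exactly one item, "s" for any other count. */
static const char *str_plural(size_t num)
{
	return num == 1 ? "" : "s";
}

/* Illustrative message; replaces patterns like (n == 1) ? "" : "s". */
static int format_bringup(char *buf, size_t len,
			  size_t num_nodes, size_t num_cpus)
{
	return snprintf(buf, len, "Brought up %zu node%s, %zu CPU%s",
			num_nodes, str_plural(num_nodes),
			num_cpus, str_plural(num_cpus));
}
```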