sched/rt: Skip currently executing CPU in rto_next_cpu()

CPU0 becomes overloaded when hosting a CPU-bound RT task, a non-CPU-bound
RT task, and a CFS task stuck in kernel space. When other CPUs switch from
RT to non-RT tasks, RT load balancing (LB) is triggered; with
HAVE_RT_PUSH_IPI enabled, they send IPIs to CPU0 to drive the execution
of rto_push_irq_work_func(). During push_rt_task() on CPU0,
if next_task->prio < rq->donor->prio, resched_curr() sets NEED_RESCHED,
and after the push operation completes, CPU0 calls rto_next_cpu().
Since only CPU0 is overloaded in this scenario, rto_next_cpu() should
ideally return -1 (no further IPI needed).

However, multiple CPUs invoking tell_cpu_to_push() during LB increment
rd->rto_loop_next. Even when rd->rto_cpu is set to -1, the mismatch between
rd->rto_loop and rd->rto_loop_next forces rto_next_cpu() to restart its
search from -1. With CPU0 remaining overloaded (satisfying rt_nr_migratory
&& rt_nr_total > 1), it gets reselected, causing CPU0 to queue irq_work to
itself and send self-IPIs repeatedly. As long as CPU0 stays overloaded and
other CPUs run pull_rt_task(), it falls into an infinite self-IPI loop,
which triggers a CPU hardlockup due to continuous self-interrupts.

The triggering scenario is as follows:

        cpu0                        cpu1                      cpu2
                                pull_rt_task
                              tell_cpu_to_push
                 <------------irq_work_queue_on
rto_push_irq_work_func
    push_rt_task
    resched_curr(rq)                                      pull_rt_task
    rto_next_cpu                                        tell_cpu_to_push
                 <----------------------------------- atomic_inc(rto_loop_next)
rd->rto_loop != next
    rto_next_cpu
    irq_work_queue_on
rto_push_irq_work_func

Fix redundant self-IPI by filtering the initiating CPU in rto_next_cpu().
This solution has been verified to effectively eliminate spurious self-IPIs
and prevent CPU hardlockup scenarios.

Fixes: 4bdced5c9a29 ("sched/rt: Simplify the IPI based RT balancing logic")
Suggested-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Suggested-by: K Prateek Nayak <kprateek.nayak@amd.com>
Signed-off-by: Chen Jinghuang <chenjinghuang2@huawei.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>
Reviewed-by: Valentin Schneider <vschneid@redhat.com>
Link: https://patch.msgid.link/20260122012533.673768-1-chenjinghuang2@huawei.com
author: Chen Jinghuang <chenjinghuang2@huawei.com> 2026-01-22 01:25:33 +0000
committer: Peter Zijlstra <peterz@infradead.org> 2026-02-03 12:04:19 +0100
commit: 94894c9c477e53bcea052e075c53f89df3d2a33e (patch)
tree: 8d3bcf6bbe62b0cb3810ae2bda19801ee4a703ad /kernel
parent: 505da6689305b1103e9a8ab6636c6a7cf74cd5b1 (diff)
1 files changed, 5 insertions, 0 deletions
diff --git a/kernel/sched/rt.c b/kernel/sched/rt.c
index 0a9b2cd6da72..a7680477fa6f 100644
--- a/kernel/sched/rt.c
+++ b/kernel/sched/rt.c
@@ -2106,6 +2106,7 @@ static void push_rt_tasks(struct rq *rq)
  */
 static int rto_next_cpu(struct root_domain *rd)
 {
+	int this_cpu = smp_processor_id();
 	int next;
 	int cpu;
 
@@ -2129,6 +2130,10 @@ static int rto_next_cpu(struct root_domain *rd)
 
 		rd->rto_cpu = cpu;
 
+		/* Do not send IPI to self */
+		if (cpu == this_cpu)
+			continue;
+
 		if (cpu < nr_cpu_ids)
 			return cpu;
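
The effect of the added check can be illustrated with a small userspace model. This is only a sketch, not kernel code: struct model_rd, model_cpumask_next(), model_rto_next_cpu(), model_scenario() and MODEL_NR_CPUS are hypothetical names invented for the illustration, the overload mask is a plain bitmask, and the acquire-ordered atomic read of rto_loop_next is reduced to an ordinary load. The loop body follows the shape of rto_next_cpu() as far as the hunks above and the commit message describe it.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_NR_CPUS 4	/* hypothetical stand-in for nr_cpu_ids */

/* Hypothetical, simplified stand-in for the fields used from struct root_domain. */
struct model_rd {
	unsigned int rto_mask;	/* bit N set: CPU N is RT-overloaded */
	int rto_cpu;		/* last CPU handed out, -1 when idle */
	int rto_loop;		/* loop generation seen by the IPI chain */
	int rto_loop_next;	/* bumped by each tell_cpu_to_push() caller */
};

/* Next set bit strictly after n, or MODEL_NR_CPUS if there is none. */
static int model_cpumask_next(int n, unsigned int mask)
{
	for (int cpu = n + 1; cpu < MODEL_NR_CPUS; cpu++)
		if (mask & (1u << cpu))
			return cpu;
	return MODEL_NR_CPUS;
}

/* skip_self == false models the old behaviour, true models the patched one. */
static int model_rto_next_cpu(struct model_rd *rd, int this_cpu, bool skip_self)
{
	assert(this_cpu >= 0 && this_cpu < MODEL_NR_CPUS);

	for (;;) {
		int cpu = model_cpumask_next(rd->rto_cpu, rd->rto_mask);
		int next;

		rd->rto_cpu = cpu;

		/* The check added by the patch: do not send an IPI to self. */
		if (skip_self && cpu == this_cpu)
			continue;

		if (cpu < MODEL_NR_CPUS)
			return cpu;

		rd->rto_cpu = -1;

		/* Plain load here; the model ignores memory ordering. */
		next = rd->rto_loop_next;
		if (rd->rto_loop == next)
			break;

		/* Another CPU asked for a new pass: restart the scan from -1. */
		rd->rto_loop = next;
	}

	return -1;
}

/*
 * Scenario from the commit message: only CPU0 is overloaded, CPU0 has just
 * finished its push, and another CPU has bumped rto_loop_next meanwhile.
 */
static int model_scenario(bool skip_self)
{
	struct model_rd rd = {
		.rto_mask = 1u << 0,
		.rto_cpu = 0,
		.rto_loop = 0,
		.rto_loop_next = 1,
	};

	return model_rto_next_cpu(&rd, 0, skip_self);
}
```

In this model, model_scenario(false) returns 0, meaning CPU0 would pick itself as the next IPI target, while model_scenario(true) returns -1 and the chain ends. When another CPU is also set in rto_mask, the patched loop simply moves on to that CPU instead of stopping.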