<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/rcu/tree.c, branch v6.17.9</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.17.9</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.17.9'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2025-08-11T03:13:49Z</updated>
<entry>
<title>rcu: Fix racy re-initialization of irq_work causing hangs</title>
<updated>2025-08-11T03:13:49Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2025-08-08T17:03:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=61399e0c5410567ef60cb1cda34cca42903842e3'/>
<id>urn:sha1:61399e0c5410567ef60cb1cda34cca42903842e3</id>
<content type='text'>
RCU re-initializes the deferred QS irq work everytime before attempting
to queue it. However there are situations where the irq work is
attempted to be queued even though it is already queued. In that case
re-initializing messes-up with the irq work queue that is about to be
handled.

The chances for that to happen are higher when the architecture doesn't
support self-IPIs and irq work are then all lazy, such as with the
following sequence:

1) rcu_read_unlock() is called when IRQs are disabled and there is a
   grace period involving blocked tasks on the node. The irq work
   is then initialized and queued.

2) The related tasks are unblocked and the CPU quiescent state
   is reported. rdp-&gt;defer_qs_iw_pending is reset to DEFER_QS_IDLE,
   allowing the irq work to be requeued in the future (note the previous
   one hasn't fired yet).

3) A new grace period starts and the node has blocked tasks.

4) rcu_read_unlock() is called when IRQs are disabled again. The irq work
   is re-initialized (but it's queued! and its node is cleared) and
   requeued. Which means it's requeued to itself.

5) The irq work finally fires with the tick. But since it was requeued
   to itself, it loops and hangs.

Fix this with initializing the irq work only once before the CPU boots.

Fixes: b41642c87716 ("rcu: Fix rcu_read_unlock() deadloop due to IRQ work")
Reported-by: kernel test robot &lt;oliver.sang@intel.com&gt;
Closes: https://lore.kernel.org/oe-lkp/202508071303.c1134cce-lkp@intel.com
Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Reviewed-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.upadhyay@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branches 'rcu-exp.23.07.2025', 'rcu.22.07.2025', 'torture-scripts.16.07.2025', 'srcu.19.07.2025', 'rcu.nocb.18.07.2025' and 'refscale.07.07.2025' into rcu.merge.23.07.2025</title>
<updated>2025-07-23T16:12:20Z</updated>
<author>
<name>Neeraj Upadhyay (AMD)</name>
<email>neeraj.upadhyay@kernel.org</email>
</author>
<published>2025-07-23T16:12:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cc1d1365f0f414f6522378867baa997642a7e6b2'/>
<id>urn:sha1:cc1d1365f0f414f6522378867baa997642a7e6b2</id>
<content type='text'>
</content>
</entry>
<entry>
<title>rcu: Document concurrent quiescent state reporting for offline CPUs</title>
<updated>2025-07-22T11:40:50Z</updated>
<author>
<name>Joel Fernandes</name>
<email>joelagnelf@nvidia.com</email>
</author>
<published>2025-07-15T20:01:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5d71c2b53f1790c2ca09d03848839c610653d278'/>
<id>urn:sha1:5d71c2b53f1790c2ca09d03848839c610653d278</id>
<content type='text'>
The synchronization of CPU offlining with GP initialization is confusing
to put it mildly (rightfully so as the issue it deals with is complex).
Recent discussions brought up a question -- what prevents the
rcu_implicit_dyntick_qs() from warning about QS reports for offline
CPUs (missing QS reports for offline CPUs causing indefinite hangs).

QS reporting for now-offline CPUs should only happen from:
- gp_init()
- rcutree_cpu_report_dead()

Add some documentation on this and refer to it from comments in the code
explaining how QS reporting is not missed when these functions are
concurrently running.

I referred heavily to this post [1] about the need for the ofl_lock.
[1] https://lore.kernel.org/all/20180924164443.GF4222@linux.ibm.com/

[ Applied paulmck feedback on moving documentation to Requirements.rst ]

Link: https://lore.kernel.org/all/01b4d228-9416-43f8-a62e-124b92e8741a@paulmck-laptop/
Co-developed-by: "Paul E. McKenney" &lt;paulmck@kernel.org&gt;
Signed-off-by: "Paul E. McKenney" &lt;paulmck@kernel.org&gt;
Reviewed-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.upadhyay@kernel.org&gt;
</content>
</entry>
<entry>
<title>rcu: Document separation of rcu_state and rnp's gp_seq</title>
<updated>2025-07-22T11:40:31Z</updated>
<author>
<name>Joel Fernandes</name>
<email>joelagnelf@nvidia.com</email>
</author>
<published>2025-07-15T20:01:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=186779c036468038b0d077ec5333a51512f867e5'/>
<id>urn:sha1:186779c036468038b0d077ec5333a51512f867e5</id>
<content type='text'>
The details of this are subtle and was discussed recently. Add a
quick-quiz about this and refer to it from the code, for more clarity.

Reviewed-by: "Paul E. McKenney" &lt;paulmck@kernel.org&gt;
Signed-off-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.upadhyay@kernel.org&gt;
</content>
</entry>
<entry>
<title>rcu: Document GP init vs hotplug-scan ordering requirements</title>
<updated>2025-07-22T11:39:35Z</updated>
<author>
<name>Joel Fernandes</name>
<email>joelagnelf@nvidia.com</email>
</author>
<published>2025-07-15T20:01:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=30a7806adab5f6b971cf07439ed6a3fac3fd80cf'/>
<id>urn:sha1:30a7806adab5f6b971cf07439ed6a3fac3fd80cf</id>
<content type='text'>
Add detailed comments explaining the critical ordering constraints
during RCU grace period initialization, based on discussions with
Frederic.

Reviewed-by: "Paul E. McKenney" &lt;paulmck@kernel.org&gt;
Co-developed-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.upadhyay@kernel.org&gt;
</content>
</entry>
<entry>
<title>rcu: Enable rcu_normal_wake_from_gp on small systems</title>
<updated>2025-07-16T03:53:56Z</updated>
<author>
<name>Uladzislau Rezki (Sony)</name>
<email>urezki@gmail.com</email>
</author>
<published>2025-07-02T14:59:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=78370df5c3574cdc5c6de7844481b4dc0ef4f172'/>
<id>urn:sha1:78370df5c3574cdc5c6de7844481b4dc0ef4f172</id>
<content type='text'>
Automatically enable the rcu_normal_wake_from_gp parameter on
systems with a small number of CPUs. The activation threshold
is set to 16 CPUs.

This helps to reduce a latency of normal synchronize_rcu() API
by waking up GP-waiters earlier and decoupling synchronize_rcu()
callers from regular callback handling.

A benchmark running 64 parallel jobs(system with 64 CPUs) invoking
synchronize_rcu() demonstrates a notable latency reduction with the
setting enabled.

Latency distribution (microseconds):

&lt;default&gt;
 0      - 9999   : 1
 10000  - 19999  : 4
 20000  - 29999  : 399
 30000  - 39999  : 3197
 40000  - 49999  : 10428
 50000  - 59999  : 17363
 60000  - 69999  : 15529
 70000  - 79999  : 9287
 80000  - 89999  : 4249
 90000  - 99999  : 1915
 100000 - 109999 : 922
 110000 - 119999 : 390
 120000 - 129999 : 187
 ...
&lt;default&gt;

&lt;rcu_normal_wake_from_gp&gt;
 0      - 9999  : 1
 10000  - 19999 : 234
 20000  - 29999 : 6678
 30000  - 39999 : 33463
 40000  - 49999 : 20669
 50000  - 59999 : 2766
 60000  - 69999 : 183
 ...
&lt;rcu_normal_wake_from_gp&gt;

Reviewed-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
Signed-off-by: Uladzislau Rezki (Sony) &lt;urezki@gmail.com&gt;
Reviewed-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.upadhyay@kernel.org&gt;
</content>
</entry>
<entry>
<title>rcu/exp: Warn on QS requested on dying CPU</title>
<updated>2025-07-08T17:51:13Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2025-04-29T13:43:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fc39760cd0f432cd399a209c43dcb887fffdb6dc'/>
<id>urn:sha1:fc39760cd0f432cd399a209c43dcb887fffdb6dc</id>
<content type='text'>
It is not possible to send an IPI to a dying CPU that has passed the
CPUHP_TEARDOWN_CPU stage. Remaining unhandled IPIs are handled later at
CPUHP_AP_SMPCFD_DYING stage by stop machine. This is the last
opportunity for RCU exp handler to request an expedited quiescent state.
And the upcoming final context switch between stop machine and idle must
have reported the requested context switch.

Therefore, it should not be possible to observe a pending requested
expedited quiescent state when RCU finally stops watching the outgoing
CPU. Once IPIs aren't possible anymore, the QS for the target CPU will
be reported on its behalf by the RCU exp kworker.

Provide an assertion to verify those expectations.

Reviewed-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.upadhyay@kernel.org&gt;
</content>
</entry>
<entry>
<title>rcu/exp: Remove needless CPU up quiescent state report</title>
<updated>2025-07-08T17:51:13Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2025-04-29T13:43:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bf0a57445d3fddfb47046aeccf4f5cb938a4e996'/>
<id>urn:sha1:bf0a57445d3fddfb47046aeccf4f5cb938a4e996</id>
<content type='text'>
A CPU coming online checks for an ongoing grace period and reports
a quiescent state accordingly if needed. This special treatment that
shortcuts the expedited IPI finds its origin as an optimization purpose
on the following commit:

	338b0f760e84 (rcu: Better hotplug handling for synchronize_sched_expedited()

The point is to avoid an IPI while waiting for a CPU to become online
or failing to become offline.

However this is pointless and even error prone for several reasons:

* If the CPU has been seen offline in the first round scanning offline
  and idle CPUs, no IPI is even tried and the quiescent state is
  reported on behalf of the CPU.

* This means that if the IPI fails, the CPU just became offline. So
  it's unlikely to become online right away, unless the cpu hotplug
  operation failed and rolled back, which is a rare event that can
  wait a jiffy for a new IPI to be issued.

* But then the "optimization" applying on failing CPU hotplug down only
  applies to !PREEMPT_RCU.

* This force reports a quiescent state even if -&gt;cpu_no_qs.b.exp is not
  set. As a result it can race with remote QS reports on the same rdp.
  Fortunately it happens to be OK but an accident is waiting to happen.

For all those reasons, remove this optimization that doesn't look worthy
to keep around.

Reported-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Reviewed-by: Paul E. McKenney &lt;paulmck@kernel.org&gt;
Signed-off-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.upadhyay@kernel.org&gt;
</content>
</entry>
<entry>
<title>rcu: Robustify rcu_is_cpu_rrupt_from_idle()</title>
<updated>2025-06-25T03:07:17Z</updated>
<author>
<name>Frederic Weisbecker</name>
<email>frederic@kernel.org</email>
</author>
<published>2025-04-29T10:08:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3dfdfaff2d49aa7895ced8a54e7d61755eac6830'/>
<id>urn:sha1:3dfdfaff2d49aa7895ced8a54e7d61755eac6830</id>
<content type='text'>
RCU relies on the context tracking nesting counter in order to determine
if it is running in extended quiescent state.

However the context tracking nesting counter is not completely
synchronized with the actual context tracking state:

* The nesting counter is set to 1 or incremented further _after_ the
  actual state is set to RCU watching.

* The nesting counter is set to 0 or decremented further _before_ the
  actual state is set to RCU not watching.

Therefore it is safe to assume that if ct_nesting() &gt; 0, RCU is
watching. But if ct_nesting() &lt;= 0, RCU is not watching except for tiny
windows.

This hasn't been a problem so far because rcu_is_cpu_rrupt_from_idle()
has only been called from interrupts. However the code is confusing
and abuses the role of the context tracking nesting counter while there
are more accurate indicators available.

Clarify and robustify accordingly.

Signed-off-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Signed-off-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
Signed-off-by: Neeraj Upadhyay (AMD) &lt;neeraj.upadhyay@kernel.org&gt;
</content>
</entry>
<entry>
<title>rcu: Return early if callback is not specified</title>
<updated>2025-06-20T19:31:48Z</updated>
<author>
<name>Uladzislau Rezki (Sony)</name>
<email>urezki@gmail.com</email>
</author>
<published>2025-06-10T17:34:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=33b6a1f155d627f5bd80c7485c598ce45428f74f'/>
<id>urn:sha1:33b6a1f155d627f5bd80c7485c598ce45428f74f</id>
<content type='text'>
Currently the call_rcu() API does not check whether a callback
pointer is NULL. If NULL is passed, rcu_core() will try to invoke
it, resulting in NULL pointer dereference and a kernel crash.

To prevent this and improve debuggability, this patch adds a check
for NULL and emits a kernel stack trace to help identify a faulty
caller.

Signed-off-by: Uladzislau Rezki (Sony) &lt;urezki@gmail.com&gt;
Reviewed-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
Signed-off-by: Joel Fernandes &lt;joelagnelf@nvidia.com&gt;
</content>
</entry>
</feed>
