summaryrefslogtreecommitdiff
path: root/Documentation/RCU/Design/Data-Structures
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2025-07-30 11:01:41 -0700
committerLinus Torvalds <torvalds@linux-foundation.org>2025-07-30 11:01:41 -0700
commit2db4df0c09eeb209726261f43fc556360b38ec99 (patch)
treeec2369ba858fb0ea6f11c90b81800f258587e733 /Documentation/RCU/Design/Data-Structures
parent7dff275c663178e9a12a0c0038e4b3be2f3edcba (diff)
parentcc1d1365f0f414f6522378867baa997642a7e6b2 (diff)
Merge tag 'rcu.release.v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux
Pull RCU updates from Neeraj Upadhyay: "Expedited grace period updates: - Protect against early RCU exp quiescent state reporting during exp grace period initialization - Remove superfluous barrier in task unblock path - Remove the CPU online quiescent state report optimization, which is error prone for certain scenarios - Add warning for unexpected pending requested expedited quiescent state on dying CPU Core: - Robustify rcu_is_cpu_rrupt_from_idle() by using more accurate indicators of the actual context tracking state of a CPU - Handle ->defer_qs_iw_pending field data race - Enable rcu_normal_wake_from_gp by default on systems with <= 16 CPUs - Fix lockup in rcu_read_unlock() due to recursive irq_exit() calls - Refactor expedited handling condition in rcu_read_unlock_special() - Documentation updates for hotplug and GP init scan ordering, separation of rcu_state and rnp's gp_seq states, quiescent state reporting for offline CPUs torture-scripts: - Cleanup and improve scripts : remove superfluous warnings for disabled tests; better handling of kvm.sh --kconfig arg; suppress some confusing diagnostics; tolerate bad kvm.sh args; add new diagnostic for build output; fail allmodconfig testing on warnings - Include RCU_TORTURE_TEST_CHK_RDR_STATE config for KCSAN kernels - Disable default RCU-tasks and clocksource-wdog testing on arm64 - Add EXPERT Kconfig option for arm64 KCSAN runs - Remove SRCU-lite testing rcutorture: - Start torture writer threads creation after reader threads to handle race in SRCU-P scenario - Add SRCU down_read()/up_read() test - Add diagnostics for delayed SRCU up_read(), unmatched up_read(), print number of up/down readers and the number of such readers which migrated to other CPU - Ignore certain unsupported configurations for trivial RCU test - Fix splats in RT kernels due to inaccurate checks for BH-disabled context - Enable checks and logs to capture intentionally exercised unexpected scenarios (too short readers) for BUSTED test - Remove SRCU-lite testing srcu: - Expedite SRCU-fast grace periods - Remove SRCU-lite implementation - Add guards for SRCU-fast readers rcu nocb: - Dump NOCB group leader state on stall detection - Robustify nocb_cb_kthread pointer accesses - Fix delayed execution of hurry callbacks when LAZY_RCU is enabled refscale: - Fix multiplication overflow in "loops" and "nreaders" calculations" * tag 'rcu.release.v6.17' of git://git.kernel.org/pub/scm/linux/kernel/git/rcu/linux: (49 commits) rcu: Document concurrent quiescent state reporting for offline CPUs rcu: Document separation of rcu_state and rnp's gp_seq rcu: Document GP init vs hotplug-scan ordering requirements srcu: Add guards for SRCU-fast readers rcu: Fix delayed execution of hurry callbacks rcu: Refactor expedited handling check in rcu_read_unlock_special() checkpatch: Remove SRCU-lite deprecation srcu: Remove SRCU-lite implementation srcu: Expedite SRCU-fast grace periods rcutorture: Remove support for SRCU-lite rcutorture: Remove SRCU-lite scenarios torture: Remove support for SRCU-lite torture: Make torture.sh --allmodconfig testing fail on warnings torture: Add "ERROR" diagnostic for testing kernel-build output torture: Make torture.sh tolerate runs having bad kvm.sh arguments torture: Add textid.txt file to --do-allmodconfig and --do-rcu-rust runs torture: Extract testid.txt generation to separate script torture: Suppress "find" diagnostics from torture.sh --do-none run torture: Provide EXPERT Kconfig option for arm64 KCSAN torture.sh runs rcu: Fix rcu_read_unlock() deadloop due to IRQ work ...
Diffstat (limited to 'Documentation/RCU/Design/Data-Structures')
-rw-r--r--Documentation/RCU/Design/Data-Structures/Data-Structures.rst33
1 files changed, 33 insertions, 0 deletions
diff --git a/Documentation/RCU/Design/Data-Structures/Data-Structures.rst b/Documentation/RCU/Design/Data-Structures/Data-Structures.rst
index 04e16775c752..1b0aad184dd7 100644
--- a/Documentation/RCU/Design/Data-Structures/Data-Structures.rst
+++ b/Documentation/RCU/Design/Data-Structures/Data-Structures.rst
@@ -286,6 +286,39 @@ in order to detect the beginnings and ends of grace periods in a
distributed fashion. The values flow from ``rcu_state`` to ``rcu_node``
(down the tree from the root to the leaves) to ``rcu_data``.
++-----------------------------------------------------------------------+
+| **Quick Quiz**: |
++-----------------------------------------------------------------------+
+| Given that the root rcu_node structure has a gp_seq field, |
+| why does RCU maintain a separate gp_seq in the rcu_state structure? |
+| Why not just use the root rcu_node's gp_seq as the official record |
+| and update it directly when starting a new grace period? |
++-----------------------------------------------------------------------+
+| **Answer**: |
++-----------------------------------------------------------------------+
+| On single-node RCU trees (where the root node is also a leaf), |
+| updating the root node's gp_seq immediately would create unnecessary |
+| lock contention. Here's why: |
+| |
+| If we did rcu_seq_start() directly on the root node's gp_seq: |
+| |
+| 1. All CPUs would immediately see their node's gp_seq from their rdp's|
+| gp_seq, in rcu_pending(). They would all then invoke the RCU-core. |
+| 2. Which calls note_gp_changes() and try to acquire the node lock. |
+| 3. But rnp->qsmask isn't initialized yet (happens later in |
+| rcu_gp_init()) |
+| 4. So each CPU would acquire the lock, find it can't determine if it |
+| needs to report quiescent state (no qsmask), update rdp->gp_seq, |
+| and release the lock. |
+| 5. Result: Lots of lock acquisitions with no grace period progress |
+| |
+| By having a separate rcu_state.gp_seq, we can increment the official |
+| grace period counter without immediately affecting what CPUs see in |
+| their nodes. The hierarchical propagation in rcu_gp_init() then |
+| updates the root node's gp_seq and qsmask together under the same lock|
+| acquisition, avoiding this useless contention. |
++-----------------------------------------------------------------------+
+
Miscellaneous
'''''''''''''