user/sven/linux.git/kernel/rcu/tree.c, branch v6.5.3

Merge branches 'doc.2023.05.10a', 'fixes.2023.05.11a', 'kvfree.2023.05.10a', 'nocb.2023.05.11a', 'rcu-tasks.2023.05.10a', 'torture.2023.05.15a' and 'rcu-urgent.2023.06.06a' into HEAD

2023-06-07T20:44:06Z

doc.2023.05.10a: Documentation updates fixes.2023.05.11a: Miscellaneous fixes kvfree.2023.05.10a: kvfree_rcu updates nocb.2023.05.11a: Callback-offloading updates rcu-tasks.2023.05.10a: Tasks RCU updates torture.2023.05.15a: Torture-test updates rcu-urgent.2023.06.06a: Urgent SRCU fix

rcu-tasks: Stop rcu_tasks_invoke_cbs() from using never-onlined CPUs

2023-05-11T20:42:39Z

The rcu_tasks_invoke_cbs() function relies on queue_work_on() to silently fall back to WORK_CPU_UNBOUND when the specified CPU is offline. However, the queue_work_on() function's silent fallback mechanism relies on that CPU having been online at some time in the past. When queue_work_on() is passed a CPU that has never been online, workqueue lockups ensue, which can be bad for your kernel's general health and well-being. This commit therefore checks whether a given CPU has ever been online, and, if not substitutes WORK_CPU_UNBOUND in the subsequent call to queue_work_on(). Why not simply omit the queue_work_on() call entirely? Because this function is flooding callback-invocation notifications to all CPUs, and must deal with possibilities that include a sparse cpu_possible_mask. This commit also moves the setting of the rcu_data structure's ->beenonline field to rcu_cpu_starting(), which executes on the incoming CPU before that CPU has ever enabled interrupts. This ensures that the required workqueues are present. In addition, because the incoming CPU has not yet enabled its interrupts, there cannot yet have been any softirq handlers running on this CPU, which means that the WARN_ON_ONCE(!rdp->beenonline) within the RCU_SOFTIRQ handler cannot have triggered yet. Fixes: d363f833c6d88 ("rcu-tasks: Use workqueues for multiple rcu_tasks_invoke_cbs() invocations") Reported-by: Tejun Heo Signed-off-by: Paul E. McKenney

rcu: Make rcu_cpu_starting() rely on interrupts being disabled

2023-05-11T20:42:39Z

Currently, rcu_cpu_starting() is written so that it might be invoked with interrupts enabled. However, it is always called when interrupts are disabled, either by rcu_init(), notify_cpu_starting(), or from a call point prior to the call to notify_cpu_starting(). But why bother requiring that interrupts be disabled? The purpose is to allow the rcu_data structure's ->beenonline flag to be set after all early processing has completed for the incoming CPU, thus allowing this flag to be used to determine when workqueues have been set up for the incoming CPU, while still allowing this flag to be used as a diagnostic within rcu_core(). This commit therefore makes rcu_cpu_starting() rely on interrupts being disabled. Signed-off-by: Paul E. McKenney

rcu: Mark rcu_cpu_kthread() accesses to ->rcu_cpu_has_work

2023-05-11T20:42:39Z

The rcu_data structure's ->rcu_cpu_has_work field can be modified by any CPU attempting to wake up the rcuc kthread. Therefore, this commit marks accesses to this field from the rcu_cpu_kthread() function. This data race was reported by KCSAN. Not appropriate for backporting due to failure being unlikely. Signed-off-by: Paul E. McKenney

rcu: Employ jiffies-based backstop to callback time limit

2023-05-11T20:42:39Z

Currently, if there are more than 100 ready-to-invoke RCU callbacks queued on a given CPU, the rcu_do_batch() function sets a timeout for invocation of the series. This timeout defaulting to three milliseconds, and may be adjusted using the rcutree.rcu_resched_ns kernel boot parameter. This timeout is checked using local_clock(), but the overhead of this function combined with the common-case very small callback-invocation overhead means that local_clock() is checked every 32nd invocation. This works well except for longer-than average callbacks. For example, a series of 500-microsecond-duration callbacks means that local_clock() is checked only once every 16 milliseconds, which makes it difficult to enforce a three-millisecond timeout. This commit therefore adds a Kconfig option RCU_DOUBLE_CHECK_CB_TIME that enables backup timeout checking using the coarser grained but lighter weight jiffies. If the jiffies counter detects a timeout, then local_clock() is consulted even if this is not the 32nd callback. This prevents the aforementioned 16-millisecond latency blow. Reported-by: Domas Mituzas Signed-off-by: Paul E. McKenney

rcu: Check callback-invocation time limit for rcuc kthreads

2023-05-11T20:42:39Z

Currently, a callback-invocation time limit is enforced only for callbacks invoked from the softirq environment, the rationale being that when callbacks are instead invoked from rcuc and rcuoc kthreads, these callbacks cannot be holding up other softirq vectors. Which is in fact true. However, if an rcuc kthread spends too much time invoking callbacks, it can delay quiescent-state reports from its CPU, which can also be a problem. This commit therefore applies the callback-invocation time limit to callback invocation from the rcuc kthreads as well as from softirq. Signed-off-by: Paul E. McKenney

rcu/kvfree: Make drain_page_cache() take early return if cache is disabled

2023-05-10T00:26:21Z

If the rcutree.rcu_min_cached_objs kernel boot parameter is set to zero, then krcp->page_cache_work will never be triggered to fill page cache. In addition, the put_cached_bnode() will not fill page cache. As a result krcp->bkvcache will always be empty, so there is no need to acquire krcp->lock to get page from krcp->bkvcache. This commit therefore makes drain_page_cache() return immediately if the rcu_min_cached_objs is zero. Signed-off-by: Zqiang Reviewed-by: Uladzislau Rezki (Sony) Signed-off-by: Paul E. McKenney

rcu/kvfree: Make fill page cache start from krcp->nr_bkv_objs

2023-05-10T00:26:21Z

When the fill_page_cache_func() function is invoked, it assumes that the cache of pages is completely empty. However, there can be some time between triggering execution of this function and its actual invocation. During this time, kfree_rcu_work() might run, and might fill in part or all of this cache of pages, thus invalidating the fill_page_cache_func() function's assumption. This will not overfill the cache because put_cached_bnode() will reject the extra page. However, it will result in a needless allocation and freeing of one extra page, which might not be helpful under lowish-memory conditions. This commit therefore causes the fill_page_cache_func() to explicitly account for pages that have been placed into the cache shortly before it starts running. Signed-off-by: Zqiang Reviewed-by: Uladzislau Rezki (Sony) Signed-off-by: Paul E. McKenney

rcu/kvfree: Do not run a page work if a cache is disabled

2023-05-10T00:26:21Z

By default the cache size is 5 pages per CPU, but it can be disabled at boot time by setting the rcu_min_cached_objs to zero. When that happens, the current code will uselessly set an hrtimer to schedule refilling this cache with zero pages. This commit therefore streamlines this process by simply refusing the set the hrtimer when rcu_min_cached_objs is zero. Signed-off-by: Uladzislau Rezki (Sony) Signed-off-by: Paul E. McKenney

rcu/kvfree: Use consistent krcp when growing kfree_rcu() page cache

2023-05-10T00:26:21Z

The add_ptr_to_bulk_krc_lock() function is invoked to allocate a new kfree_rcu() page, also known as a kvfree_rcu_bulk_data structure. The kfree_rcu_cpu structure's lock is used to protect this operation, except that this lock must be momentarily dropped when allocating memory. It is clearly important that the lock that is reacquired be the same lock that was acquired initially via krc_this_cpu_lock(). Unfortunately, this same krc_this_cpu_lock() function is used to re-acquire this lock, and if the task migrated to some other CPU during the memory allocation, this will result in the kvfree_rcu_bulk_data structure being added to the wrong CPU's kfree_rcu_cpu structure. This commit therefore replaces that second call to krc_this_cpu_lock() with raw_spin_lock_irqsave() in order to explicitly acquire the lock on the correct kfree_rcu_cpu structure, thus keeping things straight even when the task migrates. Signed-off-by: Zqiang Signed-off-by: Paul E. McKenney