user/sven/linux.git/kernel/sched, branch v6.1.5

sched/psi: Fix possible missing or delayed pending event

2022-12-31T12:31:55Z

[ Upstream commit e38f89af6a13e895805febd3a329a13ab7e66fa4 ] When a pending event exists and growth is less than the threshold, the current logic is to skip this trigger without generating event. However, from e6df4ead85d9 ("psi: fix possible trigger missing in the window"), our purpose is to generate event as long as pending event exists and the rate meets the limit, no matter what growth is. This patch handles this case properly. Fixes: e6df4ead85d9 ("psi: fix possible trigger missing in the window") Signed-off-by: Hao Lee Signed-off-by: Peter Zijlstra (Intel) Acked-by: Suren Baghdasaryan Link: https://lore.kernel.org/r/20220919072356.GA29069@haolee.io Signed-off-by: Sasha Levin

sched/uclamp: Cater for uclamp in find_energy_efficient_cpu()'s early exit condition

2022-12-31T12:31:55Z

[ Upstream commit d81304bc6193554014d4372a01debdf65e1e9a4d ] If the utilization of the woken up task is 0, we skip the energy calculation because it has no impact. But if the task is boosted (uclamp_min != 0) will have an impact on task placement and frequency selection. Only skip if the util is truly 0 after applying uclamp values. Change uclamp_task_cpu() signature to avoid unnecessary additional calls to uclamp_eff_get(). feec() is the only user now. Fixes: 732cd75b8c920 ("sched/fair: Select an energy-efficient CPU on task wake-up") Signed-off-by: Qais Yousef Signed-off-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220804143609.515789-8-qais.yousef@arm.com Signed-off-by: Sasha Levin

sched/uclamp: Make cpu_overutilized() use util_fits_cpu()

2022-12-31T12:31:55Z

[ Upstream commit c56ab1b3506ba0e7a872509964b100912bde165d ] So that it is now uclamp aware. This fixes a major problem of busy tasks capped with UCLAMP_MAX keeping the system in overutilized state which disables EAS and leads to wasting energy in the long run. Without this patch running a busy background activity like JIT compilation on Pixel 6 causes the system to be in overutilized state 74.5% of the time. With this patch this goes down to 9.79%. It also fixes another problem when long running tasks that have their UCLAMP_MIN changed while running such that they need to upmigrate to honour the new UCLAMP_MIN value. The upmigration doesn't get triggered because overutilized state never gets set in this state, hence misfit migration never happens at tick in this case until the task wakes up again. Fixes: af24bde8df202 ("sched/uclamp: Add uclamp support to energy_compute()") Signed-off-by: Qais Yousef Signed-off-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220804143609.515789-7-qais.yousef@arm.com Signed-off-by: Sasha Levin

sched/uclamp: Make asym_fits_capacity() use util_fits_cpu()

2022-12-31T12:31:55Z

[ Upstream commit a2e7f03ed28fce26c78b985f87913b6ce3accf9d ] Use the new util_fits_cpu() to ensure migration margin and capacity pressure are taken into account correctly when uclamp is being used otherwise we will fail to consider CPUs as fitting in scenarios where they should. s/asym_fits_capacity/asym_fits_cpu/ to better reflect what it does now. Fixes: b4c9c9f15649 ("sched/fair: Prefer prev cpu in asymmetric wakeup path") Signed-off-by: Qais Yousef Signed-off-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220804143609.515789-6-qais.yousef@arm.com Signed-off-by: Sasha Levin

sched/uclamp: Make select_idle_capacity() use util_fits_cpu()

2022-12-31T12:31:54Z

[ Upstream commit b759caa1d9f667b94727b2ad12589cbc4ce13a82 ] Use the new util_fits_cpu() to ensure migration margin and capacity pressure are taken into account correctly when uclamp is being used otherwise we will fail to consider CPUs as fitting in scenarios where they should. Fixes: b4c9c9f15649 ("sched/fair: Prefer prev cpu in asymmetric wakeup path") Signed-off-by: Qais Yousef Signed-off-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220804143609.515789-5-qais.yousef@arm.com Signed-off-by: Sasha Levin

sched/uclamp: Fix fits_capacity() check in feec()

2022-12-31T12:31:54Z

[ Upstream commit 244226035a1f9b2b6c326e55ae5188fab4f428cb ] As reported by Yun Hsiang [1], if a task has its uclamp_min >= 0.8 * 1024, it'll always pick the previous CPU because fits_capacity() will always return false in this case. The new util_fits_cpu() logic should handle this correctly for us beside more corner cases where similar failures could occur, like when using UCLAMP_MAX. We open code uclamp_rq_util_with() except for the clamp() part, util_fits_cpu() needs the 'raw' values to be passed to it. Also introduce uclamp_rq_{set, get}() shorthand accessors to get uclamp value for the rq. Makes the code more readable and ensures the right rules (use READ_ONCE/WRITE_ONCE) are respected transparently. [1] https://lists.linaro.org/pipermail/eas-dev/2020-July/001488.html Fixes: 1d42509e475c ("sched/fair: Make EAS wakeup placement consider uclamp restrictions") Reported-by: Yun Hsiang Signed-off-by: Qais Yousef Signed-off-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220804143609.515789-4-qais.yousef@arm.com Signed-off-by: Sasha Levin

sched/uclamp: Make task_fits_capacity() use util_fits_cpu()

2022-12-31T12:31:54Z

[ Upstream commit b48e16a69792b5dc4a09d6807369d11b2970cc36 ] So that the new uclamp rules in regard to migration margin and capacity pressure are taken into account correctly. Fixes: a7008c07a568 ("sched/fair: Make task_fits_capacity() consider uclamp restrictions") Co-developed-by: Vincent Guittot Signed-off-by: Qais Yousef Signed-off-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220804143609.515789-3-qais.yousef@arm.com Signed-off-by: Sasha Levin

sched/uclamp: Fix relationship between uclamp and migration margin

2022-12-31T12:31:54Z

[ Upstream commit 48d5e9daa8b767e75ed9421665b037a49ce4bc04 ] fits_capacity() verifies that a util is within 20% margin of the capacity of a CPU, which is an attempt to speed up upmigration. But when uclamp is used, this 20% margin is problematic because for example if a task is boosted to 1024, then it will not fit on any CPU according to fits_capacity() logic. Or if a task is boosted to capacity_orig_of(medium_cpu). The task will end up on big instead on the desired medium CPU. Similar corner cases exist for uclamp and usage of capacity_of(). Slightest irq pressure on biggest CPU for example will make a 1024 boosted task look like it can't fit. What we really want is for uclamp comparisons to ignore the migration margin and capacity pressure, yet retain them for when checking the _actual_ util signal. For example, task p: p->util_avg = 300 p->uclamp[UCLAMP_MIN] = 1024 Will fit a big CPU. But p->util_avg = 900 p->uclamp[UCLAMP_MIN] = 1024 will not, this should trigger overutilized state because the big CPU is now *actually* being saturated. Similar reasoning applies to capping tasks with UCLAMP_MAX. For example: p->util_avg = 1024 p->uclamp[UCLAMP_MAX] = capacity_orig_of(medium_cpu) Should fit the task on medium cpus without triggering overutilized state. Inlined comments expand more on desired behavior in more scenarios. Introduce new util_fits_cpu() function which encapsulates the new logic. The new function is not used anywhere yet, but will be used to update various users of fits_capacity() in later patches. Fixes: af24bde8df202 ("sched/uclamp: Add uclamp support to energy_compute()") Signed-off-by: Qais Yousef Signed-off-by: Peter Zijlstra (Intel) Link: https://lore.kernel.org/r/20220804143609.515789-2-qais.yousef@arm.com Signed-off-by: Sasha Levin

Revert "cpufreq: schedutil: Move max CPU capacity to sugov_policy"

2022-11-22T18:56:52Z

This reverts commit 6d5afdc97ea71958287364a1f1d07e59ef151b11. On a Pixel 6 device, it is observed that this commit increases latency by approximately 50ms, or 20%, in migrating a task that requires full CPU utilization from a LITTLE CPU to Fmax on a big CPU. Reverting this change restores the latency back to its original baseline value. Fixes: 6d5afdc97ea7 ("cpufreq: schedutil: Move max CPU capacity to sugov_policy") Signed-off-by: Sam Wu Signed-off-by: Rafael J. Wysocki

sched: Fix race in task_call_func()

2022-11-14T08:58:32Z

There is a very narrow race between schedule() and task_call_func(). CPU0 CPU1 __schedule() rq_lock(); prev_state = READ_ONCE(prev->__state); if (... && prev_state) { deactivate_tasl(rq, prev, ...) prev->on_rq = 0; task_call_func() raw_spin_lock_irqsave(p->pi_lock); state = READ_ONCE(p->__state); smp_rmb(); if (... || p->on_rq) // false!!! rq = __task_rq_lock() ret = func(); next = pick_next_task(); rq = context_switch(prev, next) prepare_lock_switch() spin_release(&__rq_lockp(rq)->dep_map...) So while the task is on it's way out, it still holds rq->lock for a little while, and right then task_call_func() comes in and figures it doesn't need rq->lock anymore (because the task is already dequeued -- but still running there) and then the __set_task_frozen() thing observes it's holding rq->lock and yells murder. Avoid this by waiting for p->on_cpu to get cleared, which guarantees the task is fully finished on the old CPU. ( While arguably the fixes tag is 'wrong' -- none of the previous task_call_func() users appears to care for this case. ) Fixes: f5d39b020809 ("freezer,sched: Rewrite core freezer logic") Reported-by: Ville Syrjälä Signed-off-by: Peter Zijlstra (Intel) Tested-by: Ville Syrjälä Link: https://lkml.kernel.org/r/Y1kdRNNfUeAU+FNl@hirez.programming.kicks-ass.net