<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/sched, branch v6.1.5</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.1.5</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.1.5'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2022-12-31T12:31:55Z</updated>
<entry>
<title>sched/psi: Fix possible missing or delayed pending event</title>
<updated>2022-12-31T12:31:55Z</updated>
<author>
<name>Hao Lee</name>
<email>haolee.swjtu@gmail.com</email>
</author>
<published>2022-09-19T07:23:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=44de1adb24841511c82386fe4e3892954d1ccf3b'/>
<id>urn:sha1:44de1adb24841511c82386fe4e3892954d1ccf3b</id>
<content type='text'>
[ Upstream commit e38f89af6a13e895805febd3a329a13ab7e66fa4 ]

When a pending event exists and growth is less than the threshold, the
current logic is to skip this trigger without generating event. However,
from e6df4ead85d9 ("psi: fix possible trigger missing in the window"),
our purpose is to generate event as long as pending event exists and the
rate meets the limit, no matter what growth is.
This patch handles this case properly.

Fixes: e6df4ead85d9 ("psi: fix possible trigger missing in the window")
Signed-off-by: Hao Lee &lt;haolee.swjtu@gmail.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Suren Baghdasaryan &lt;surenb@google.com&gt;
Link: https://lore.kernel.org/r/20220919072356.GA29069@haolee.io
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/uclamp: Cater for uclamp in find_energy_efficient_cpu()'s early exit condition</title>
<updated>2022-12-31T12:31:55Z</updated>
<author>
<name>Qais Yousef</name>
<email>qais.yousef@arm.com</email>
</author>
<published>2022-08-04T14:36:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e7c542d2cfae906aaeef326a9dd368344d2ff5f3'/>
<id>urn:sha1:e7c542d2cfae906aaeef326a9dd368344d2ff5f3</id>
<content type='text'>
[ Upstream commit d81304bc6193554014d4372a01debdf65e1e9a4d ]

If the utilization of the woken up task is 0, we skip the energy
calculation because it has no impact.

But if the task is boosted (uclamp_min != 0) will have an impact on task
placement and frequency selection. Only skip if the util is truly
0 after applying uclamp values.

Change uclamp_task_cpu() signature to avoid unnecessary additional calls
to uclamp_eff_get(). feec() is the only user now.

Fixes: 732cd75b8c920 ("sched/fair: Select an energy-efficient CPU on task wake-up")
Signed-off-by: Qais Yousef &lt;qais.yousef@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220804143609.515789-8-qais.yousef@arm.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/uclamp: Make cpu_overutilized() use util_fits_cpu()</title>
<updated>2022-12-31T12:31:55Z</updated>
<author>
<name>Qais Yousef</name>
<email>qais.yousef@arm.com</email>
</author>
<published>2022-08-04T14:36:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cbd8c040409cd7343db655d4b0fc702bfd9560f6'/>
<id>urn:sha1:cbd8c040409cd7343db655d4b0fc702bfd9560f6</id>
<content type='text'>
[ Upstream commit c56ab1b3506ba0e7a872509964b100912bde165d ]

So that it is now uclamp aware.

This fixes a major problem of busy tasks capped with UCLAMP_MAX keeping
the system in overutilized state which disables EAS and leads to wasting
energy in the long run.

Without this patch running a busy background activity like JIT
compilation on Pixel 6 causes the system to be in overutilized state
74.5% of the time.

With this patch this goes down to  9.79%.

It also fixes another problem when long running tasks that have their
UCLAMP_MIN changed while running such that they need to upmigrate to
honour the new UCLAMP_MIN value. The upmigration doesn't get triggered
because overutilized state never gets set in this state, hence misfit
migration never happens at tick in this case until the task wakes up
again.

Fixes: af24bde8df202 ("sched/uclamp: Add uclamp support to energy_compute()")
Signed-off-by: Qais Yousef &lt;qais.yousef@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220804143609.515789-7-qais.yousef@arm.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/uclamp: Make asym_fits_capacity() use util_fits_cpu()</title>
<updated>2022-12-31T12:31:55Z</updated>
<author>
<name>Qais Yousef</name>
<email>qais.yousef@arm.com</email>
</author>
<published>2022-08-04T14:36:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=324ce6e2c603a8e53c1b5b8f7d66eec41275ff89'/>
<id>urn:sha1:324ce6e2c603a8e53c1b5b8f7d66eec41275ff89</id>
<content type='text'>
[ Upstream commit a2e7f03ed28fce26c78b985f87913b6ce3accf9d ]

Use the new util_fits_cpu() to ensure migration margin and capacity
pressure are taken into account correctly when uclamp is being used
otherwise we will fail to consider CPUs as fitting in scenarios where
they should.

s/asym_fits_capacity/asym_fits_cpu/ to better reflect what it does now.

Fixes: b4c9c9f15649 ("sched/fair: Prefer prev cpu in asymmetric wakeup path")
Signed-off-by: Qais Yousef &lt;qais.yousef@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220804143609.515789-6-qais.yousef@arm.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/uclamp: Make select_idle_capacity() use util_fits_cpu()</title>
<updated>2022-12-31T12:31:54Z</updated>
<author>
<name>Qais Yousef</name>
<email>qais.yousef@arm.com</email>
</author>
<published>2022-08-04T14:36:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3b9c1559de8c3a5b974f7ad931dafc56e8694b2a'/>
<id>urn:sha1:3b9c1559de8c3a5b974f7ad931dafc56e8694b2a</id>
<content type='text'>
[ Upstream commit b759caa1d9f667b94727b2ad12589cbc4ce13a82 ]

Use the new util_fits_cpu() to ensure migration margin and capacity
pressure are taken into account correctly when uclamp is being used
otherwise we will fail to consider CPUs as fitting in scenarios where
they should.

Fixes: b4c9c9f15649 ("sched/fair: Prefer prev cpu in asymmetric wakeup path")
Signed-off-by: Qais Yousef &lt;qais.yousef@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220804143609.515789-5-qais.yousef@arm.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/uclamp: Fix fits_capacity() check in feec()</title>
<updated>2022-12-31T12:31:54Z</updated>
<author>
<name>Qais Yousef</name>
<email>qais.yousef@arm.com</email>
</author>
<published>2022-08-04T14:36:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=75ba48621b33c072d26a6635a32842f8b6ab3338'/>
<id>urn:sha1:75ba48621b33c072d26a6635a32842f8b6ab3338</id>
<content type='text'>
[ Upstream commit 244226035a1f9b2b6c326e55ae5188fab4f428cb ]

As reported by Yun Hsiang [1], if a task has its uclamp_min &gt;= 0.8 * 1024,
it'll always pick the previous CPU because fits_capacity() will always
return false in this case.

The new util_fits_cpu() logic should handle this correctly for us beside
more corner cases where similar failures could occur, like when using
UCLAMP_MAX.

We open code uclamp_rq_util_with() except for the clamp() part,
util_fits_cpu() needs the 'raw' values to be passed to it.

Also introduce uclamp_rq_{set, get}() shorthand accessors to get uclamp
value for the rq. Makes the code more readable and ensures the right
rules (use READ_ONCE/WRITE_ONCE) are respected transparently.

[1] https://lists.linaro.org/pipermail/eas-dev/2020-July/001488.html

Fixes: 1d42509e475c ("sched/fair: Make EAS wakeup placement consider uclamp restrictions")
Reported-by: Yun Hsiang &lt;hsiang023167@gmail.com&gt;
Signed-off-by: Qais Yousef &lt;qais.yousef@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220804143609.515789-4-qais.yousef@arm.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/uclamp: Make task_fits_capacity() use util_fits_cpu()</title>
<updated>2022-12-31T12:31:54Z</updated>
<author>
<name>Qais Yousef</name>
<email>qais.yousef@arm.com</email>
</author>
<published>2022-08-04T14:36:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d355960543a7b8699d73e09f115d5b6825dc4be3'/>
<id>urn:sha1:d355960543a7b8699d73e09f115d5b6825dc4be3</id>
<content type='text'>
[ Upstream commit b48e16a69792b5dc4a09d6807369d11b2970cc36 ]

So that the new uclamp rules in regard to migration margin and capacity
pressure are taken into account correctly.

Fixes: a7008c07a568 ("sched/fair: Make task_fits_capacity() consider uclamp restrictions")
Co-developed-by: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Signed-off-by: Qais Yousef &lt;qais.yousef@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220804143609.515789-3-qais.yousef@arm.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/uclamp: Fix relationship between uclamp and migration margin</title>
<updated>2022-12-31T12:31:54Z</updated>
<author>
<name>Qais Yousef</name>
<email>qais.yousef@arm.com</email>
</author>
<published>2022-08-04T14:36:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0ef8dac2464618afcb1d2e168eba1a62ac5040bf'/>
<id>urn:sha1:0ef8dac2464618afcb1d2e168eba1a62ac5040bf</id>
<content type='text'>
[ Upstream commit 48d5e9daa8b767e75ed9421665b037a49ce4bc04 ]

fits_capacity() verifies that a util is within 20% margin of the
capacity of a CPU, which is an attempt to speed up upmigration.

But when uclamp is used, this 20% margin is problematic because for
example if a task is boosted to 1024, then it will not fit on any CPU
according to fits_capacity() logic.

Or if a task is boosted to capacity_orig_of(medium_cpu). The task will
end up on big instead on the desired medium CPU.

Similar corner cases exist for uclamp and usage of capacity_of().
Slightest irq pressure on biggest CPU for example will make a 1024
boosted task look like it can't fit.

What we really want is for uclamp comparisons to ignore the migration
margin and capacity pressure, yet retain them for when checking the
_actual_ util signal.

For example, task p:

	p-&gt;util_avg = 300
	p-&gt;uclamp[UCLAMP_MIN] = 1024

Will fit a big CPU. But

	p-&gt;util_avg = 900
	p-&gt;uclamp[UCLAMP_MIN] = 1024

will not, this should trigger overutilized state because the big CPU is
now *actually* being saturated.

Similar reasoning applies to capping tasks with UCLAMP_MAX. For example:

	p-&gt;util_avg = 1024
	p-&gt;uclamp[UCLAMP_MAX] = capacity_orig_of(medium_cpu)

Should fit the task on medium cpus without triggering overutilized
state.

Inlined comments expand more on desired behavior in more scenarios.

Introduce new util_fits_cpu() function which encapsulates the new logic.
The new function is not used anywhere yet, but will be used to update
various users of fits_capacity() in later patches.

Fixes: af24bde8df202 ("sched/uclamp: Add uclamp support to energy_compute()")
Signed-off-by: Qais Yousef &lt;qais.yousef@arm.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lore.kernel.org/r/20220804143609.515789-2-qais.yousef@arm.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>Revert "cpufreq: schedutil: Move max CPU capacity to sugov_policy"</title>
<updated>2022-11-22T18:56:52Z</updated>
<author>
<name>Sam Wu</name>
<email>wusamuel@google.com</email>
</author>
<published>2022-11-10T19:57:32Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cdcc5ef26b39c3d02d4e69c0352b007ebe438a22'/>
<id>urn:sha1:cdcc5ef26b39c3d02d4e69c0352b007ebe438a22</id>
<content type='text'>
This reverts commit 6d5afdc97ea71958287364a1f1d07e59ef151b11.

On a Pixel 6 device, it is observed that this commit increases
latency by approximately 50ms, or 20%, in migrating a task
that requires full CPU utilization from a LITTLE CPU to Fmax
on a big CPU. Reverting this change restores the latency back
to its original baseline value.

Fixes: 6d5afdc97ea7 ("cpufreq: schedutil: Move max CPU capacity to sugov_policy")
Signed-off-by: Sam Wu &lt;wusamuel@google.com&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
</content>
</entry>
<entry>
<title>sched: Fix race in task_call_func()</title>
<updated>2022-11-14T08:58:32Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2022-10-26T11:43:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=91dabf33ae5df271da63e87ad7833e5fdb4a44b9'/>
<id>urn:sha1:91dabf33ae5df271da63e87ad7833e5fdb4a44b9</id>
<content type='text'>
There is a very narrow race between schedule() and task_call_func().

  CPU0						CPU1

  __schedule()
    rq_lock();
    prev_state = READ_ONCE(prev-&gt;__state);
    if (... &amp;&amp; prev_state) {
      deactivate_tasl(rq, prev, ...)
        prev-&gt;on_rq = 0;

						task_call_func()
						  raw_spin_lock_irqsave(p-&gt;pi_lock);
						  state = READ_ONCE(p-&gt;__state);
						  smp_rmb();
						  if (... || p-&gt;on_rq) // false!!!
						    rq = __task_rq_lock()

						  ret = func();

    next = pick_next_task();
    rq = context_switch(prev, next)
      prepare_lock_switch()
        spin_release(&amp;__rq_lockp(rq)-&gt;dep_map...)

So while the task is on it's way out, it still holds rq-&gt;lock for a
little while, and right then task_call_func() comes in and figures it
doesn't need rq-&gt;lock anymore (because the task is already dequeued --
but still running there) and then the __set_task_frozen() thing observes
it's holding rq-&gt;lock and yells murder.

Avoid this by waiting for p-&gt;on_cpu to get cleared, which guarantees
the task is fully finished on the old CPU.

( While arguably the fixes tag is 'wrong' -- none of the previous
  task_call_func() users appears to care for this case. )

Fixes: f5d39b020809 ("freezer,sched: Rewrite core freezer logic")
Reported-by: Ville Syrjälä &lt;ville.syrjala@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Tested-by: Ville Syrjälä &lt;ville.syrjala@linux.intel.com&gt;
Link: https://lkml.kernel.org/r/Y1kdRNNfUeAU+FNl@hirez.programming.kicks-ass.net
</content>
</entry>
</feed>
