<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/sched/debug.c, branch v6.7.9</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.7.9</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.7.9'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2023-09-29T08:20:21Z</updated>
<entry>
<title>sched/deadline: Make dl_rq-&gt;pushable_dl_tasks update drive dl_rq-&gt;overloaded</title>
<updated>2023-09-29T08:20:21Z</updated>
<author>
<name>Valentin Schneider</name>
<email>vschneid@redhat.com</email>
</author>
<published>2023-09-28T15:02:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5fe7765997b139e2d922b58359dea181efe618f9'/>
<id>urn:sha1:5fe7765997b139e2d922b58359dea181efe618f9</id>
<content type='text'>
dl_rq-&gt;dl_nr_migratory is increased whenever a DL entity is enqueued and it has
nr_cpus_allowed &gt; 1. Unlike the pushable_dl_tasks tree, dl_rq-&gt;dl_nr_migratory
includes a dl_rq's current task. This means a dl_rq can have a migratable
current, N non-migratable queued tasks, and be flagged as overloaded and have
its CPU set in the dlo_mask, despite having an empty pushable_dl_tasks tree.

Make a dl_rq's overload logic be driven by {enqueue,dequeue}_pushable_dl_task();
in other words, only flag DL RQs as overloaded if they have at least one
runnable-but-not-current migratable task.

 o push_dl_task() is unaffected, as it is a no-op if there are no pushable
   tasks.

 o pull_dl_task() now no longer scans runqueues whose sole migratable task is
   their current one, which it can't do anything about anyway.
   It may also now pull tasks to a DL RQ with dl_nr_running &gt; 1 if only its
   current task is migratable.

Since dl_rq-&gt;dl_nr_migratory becomes unused, remove it.
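
As a plain-C toy model (names illustrative, not the kernel's), the overload
flag now flips only on the pushable tree's empty/non-empty transitions:

```c
/* Toy model (not kernel code): dl_rq overload driven purely by the
 * pushable_dl_tasks tree. The current task never enters the tree, so a
 * lone migratable current task no longer marks the runqueue overloaded.
 */
struct dl_rq_sim {
	int nr_pushable;   /* size of the pushable_dl_tasks tree */
	int overloaded;    /* mirrors this CPU's bit in dlo_mask */
};

static void sim_enqueue_pushable_dl_task(struct dl_rq_sim *rq)
{
	rq->nr_pushable++;
	if (rq->nr_pushable == 1)   /* empty -> non-empty edge */
		rq->overloaded = 1;
}

static void sim_dequeue_pushable_dl_task(struct dl_rq_sim *rq)
{
	rq->nr_pushable--;
	if (rq->nr_pushable == 0)   /* non-empty -> empty edge */
		rq->overloaded = 0;
}
```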

RT had the exact same mechanism (rt_rq-&gt;rt_nr_migratory) which was dropped
in favour of relying on rt_rq-&gt;pushable_tasks, see:

  612f769edd06 ("sched/rt: Make rt_rq-&gt;pushable_tasks updates drive rto_mask")

Signed-off-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Acked-by: Juri Lelli &lt;juri.lelli@redhat.com&gt;
Link: https://lore.kernel.org/r/20230928150251.463109-1-vschneid@redhat.com
</content>
</entry>
<entry>
<title>sched/rt: Make rt_rq-&gt;pushable_tasks updates drive rto_mask</title>
<updated>2023-09-25T08:25:29Z</updated>
<author>
<name>Valentin Schneider</name>
<email>vschneid@redhat.com</email>
</author>
<published>2023-08-11T11:20:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=612f769edd06a6e42f7cd72425488e68ddaeef0a'/>
<id>urn:sha1:612f769edd06a6e42f7cd72425488e68ddaeef0a</id>
<content type='text'>
Sebastian noted that the rto_push_work IRQ work can be queued for a CPU
that has an empty pushable_tasks list, which means nothing useful will be
done in the IPI other than queue the work for the next CPU on the rto_mask.

rto_push_irq_work_func() only operates on tasks in the pushable_tasks list,
but the conditions for that irq_work to be queued (and for a CPU to be
added to the rto_mask) rely on rt_rq-&gt;nr_migratory instead.

nr_migratory is increased whenever an RT task entity is enqueued and it has
nr_cpus_allowed &gt; 1. Unlike the pushable_tasks list, nr_migratory includes an
rt_rq's current task. This means an rt_rq can have a migratable current, N
non-migratable queued tasks, and be flagged as overloaded / have its CPU
set in the rto_mask, despite having an empty pushable_tasks list.

Make an rt_rq's overload logic be driven by {enqueue,dequeue}_pushable_task().
Since rt_rq-&gt;{rt_nr_migratory,rt_nr_total} become unused, remove them.

Note that the case where the current task is pushed away to make way for a
migration-disabled task remains unchanged: the migration-disabled task has
to be in the pushable_tasks list in the first place, which means it has
nr_cpus_allowed &gt; 1.
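
The old and new conditions can be contrasted in a plain-C sketch (names
illustrative, not the kernel's):

```c
/* Toy contrast (not kernel code) between the old nr_migratory-based
 * overload condition and the new pushable_tasks-based one. */

/* old: nr_migratory counted the current task, so a migratable current
 * plus any queued (even non-migratable) task flagged the CPU overloaded */
static int rt_overloaded_old(int nr_migratory, int nr_total)
{
	if (nr_migratory == 0)
		return 0;
	return nr_total > 1;
}

/* new: keyed off the pushable_tasks list, which never holds the current
 * task -- analogous to the kernel's has_pushable_tasks() check */
static int rt_overloaded_new(int nr_pushable)
{
	return nr_pushable != 0;
}
```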

Reported-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Tested-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Link: https://lore.kernel.org/r/20230811112044.3302588-1-vschneid@redhat.com
</content>
</entry>
<entry>
<title>sched/debug: Update stale reference to sched_debug.c</title>
<updated>2023-09-21T06:30:19Z</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2023-09-20T13:00:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=622f0a1d544fa88dda10d27727835e825c84ae0f'/>
<id>urn:sha1:622f0a1d544fa88dda10d27727835e825c84ae0f</id>
<content type='text'>
Since commit:

   8a99b6833c884 ("sched: Move SCHED_DEBUG sysctl to debugfs")

the sched_debug interface has moved from /proc to debugfs. The comment
still mentions the outdated /proc interfaces.

Update the comment to point to the current location of the interface.

Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/20230920130025.412071-3-bigeasy@linutronix.de
</content>
</entry>
<entry>
<title>sched/debug: Remove the /proc/sys/kernel/sched_child_runs_first sysctl</title>
<updated>2023-09-21T06:30:18Z</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2023-09-20T13:00:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=17e7170645e34c519443ba63895264bbdee7beee'/>
<id>urn:sha1:17e7170645e34c519443ba63895264bbdee7beee</id>
<content type='text'>
The /proc/sys/kernel/sched_child_runs_first knob is no longer connected since:

   5e963f2bd4654 ("sched/fair: Commit to EEVDF")

Remove it.

Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/20230920130025.412071-2-bigeasy@linutronix.de
</content>
</entry>
<entry>
<title>sched/debug: Rename sysctl_sched_min_granularity to sysctl_sched_base_slice</title>
<updated>2023-07-19T07:43:59Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2023-05-31T11:58:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e4ec3318a17f5dcf11bc23b2d2c1da4c1c5bb507'/>
<id>urn:sha1:e4ec3318a17f5dcf11bc23b2d2c1da4c1c5bb507</id>
<content type='text'>
EEVDF uses this tunable as the base request/slice -- make sure the
name reflects this.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/20230531124604.205287511@infradead.org
</content>
</entry>
<entry>
<title>sched/fair: Commit to EEVDF</title>
<updated>2023-07-19T07:43:58Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2023-05-31T11:58:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5e963f2bd4654a202a8a05aa3a86cb0300b10e6c'/>
<id>urn:sha1:5e963f2bd4654a202a8a05aa3a86cb0300b10e6c</id>
<content type='text'>
EEVDF is a better-defined scheduling policy; as a result it has fewer
heuristics/tunables. There is no compelling reason to keep CFS around.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/20230531124604.137187212@infradead.org
</content>
</entry>
<entry>
<title>sched/fair: Implement an EEVDF-like scheduling policy</title>
<updated>2023-07-19T07:43:58Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2023-05-31T11:58:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=147f3efaa24182a21706bca15eab2f3f4630b5fe'/>
<id>urn:sha1:147f3efaa24182a21706bca15eab2f3f4630b5fe</id>
<content type='text'>
CFS is currently a WFQ-based scheduler with only a single knob: the
weight. The addition of a second, latency-oriented parameter makes
something WF2Q- or EEVDF-based a much better fit.

Specifically, EEVDF does EDF like scheduling in the left half of the
tree -- those entities that are owed service. Except because this is a
virtual time scheduler, the deadlines are in virtual time as well,
which is what allows over-subscription.

EEVDF has two parameters:

 - weight, or time-slope: which is mapped to nice just as before

 - request size, or slice length: which is used to compute
   the virtual deadline as: vd_i = ve_i + r_i/w_i

Basically, by setting a smaller slice, the deadline will be earlier
and the task will be more eligible and run earlier.

Tick-driven preemption is driven by request/slice completion, while
wakeup preemption is driven by the deadline.

Because the tree is now effectively an interval tree, and the
selection is no longer 'leftmost', over-scheduling is less of a
problem.
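
The deadline rule above can be sketched in plain C (illustrative integer
arithmetic only; the kernel computes this in scaled fixed-point, cf.
sched_prio_to_weight[]):

```c
/* vd_i = ve_i + r_i / w_i: the virtual deadline is the eligible time
 * plus the request size scaled down by the weight. A smaller slice r
 * therefore yields an earlier virtual deadline. */
static long long virtual_deadline(long long ve, long long r, long long w)
{
	return ve + r / w;
}

/* wakeup preemption prefers the earlier virtual deadline */
static int preempts(long long vd_new, long long vd_curr)
{
	return vd_curr > vd_new;
}
```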

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/20230531124603.931005524@infradead.org
</content>
</entry>
<entry>
<title>sched/fair: Add cfs_rq::avg_vruntime</title>
<updated>2023-07-19T07:43:58Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2023-05-31T11:58:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=af4cf40470c22efa3987200fd19478199e08e103'/>
<id>urn:sha1:af4cf40470c22efa3987200fd19478199e08e103</id>
<content type='text'>
In order to move to an eligibility based scheduling policy, we need
to have a better approximation of the ideal scheduler.

Specifically, for a virtual time weighted fair queueing based
scheduler the ideal scheduler will be the weighted average of the
individual virtual runtimes (math in the comment).

As such, compute the weighted average to approximate the ideal
scheduler -- note that the approximation is in the individual task
behaviour, which isn't strictly conformant.

Specifically, consider adding a task with a vruntime left of center: in
this case the average will move backwards in time -- something the
ideal scheduler would of course never do.
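
A direct (non-incremental) computation of that weighted average, in plain C
for illustration -- the kernel instead maintains the sum incrementally,
relative to min_vruntime, rather than recomputing it:

```c
/* Illustrative only: V = (Sum of w_i * v_i) / (Sum of w_i), the
 * weighted average of the individual virtual runtimes. */
static long long avg_vruntime_sim(const long long *v, const long long *w, int n)
{
	long long num = 0, den = 0;
	int i;

	for (i = 0; i != n; i++) {
		num += w[i] * v[i];
		den += w[i];
	}
	return num / den;
}
```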

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Link: https://lore.kernel.org/r/20230531124603.654144274@infradead.org
</content>
</entry>
<entry>
<title>sched/debug: Dump domains' sched group flags</title>
<updated>2023-07-13T13:21:53Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2023-07-07T22:57:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ed74cc4995d314ea6cbf406caf978c442f451fa5'/>
<id>urn:sha1:ed74cc4995d314ea6cbf406caf978c442f451fa5</id>
<content type='text'>
There has been a case where the SD_SHARE_CPUCAPACITY sched group flag
in a parent domain was not set and propagated properly when a degenerate
domain was removed.

Add a dump of a CPU's domain sched group flags to make debugging easier
in the future.

Usage:
cat /debug/sched/domains/cpu0/domain1/groups_flags
to dump cpu0 domain1's sched group flags.

Signed-off-by: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Valentin Schneider &lt;vschneid@redhat.com&gt;
Link: https://lore.kernel.org/r/ed1749262d94d95a8296c86a415999eda90bcfe3.1688770494.git.tim.c.chen@linux.intel.com
</content>
</entry>
<entry>
<title>sched/debug: Correct printing for rq-&gt;nr_uninterruptible</title>
<updated>2023-05-08T08:58:39Z</updated>
<author>
<name>晏艳(采苓)</name>
<email>yanyan.yan@antgroup.com</email>
</author>
<published>2023-05-06T07:42:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a6fcdd8d95f7486150b3faadfea119fc3dfc3b74'/>
<id>urn:sha1:a6fcdd8d95f7486150b3faadfea119fc3dfc3b74</id>
<content type='text'>
Commit e6fe3f422be1 ("sched: Make multiple runqueue task counters
32-bit") changed the type for rq-&gt;nr_uninterruptible from "unsigned
long" to "unsigned int", but left a wrong cast in the print to
/sys/kernel/debug/sched/debug and to the console.

For example, if nr_uninterruptible's value is 0xfffffff7 with type
"unsigned int", then (long)nr_uninterruptible shows 4294967287 while
(int)nr_uninterruptible prints -9. So using an int cast fixes the
wrong printing.
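
The two casts can be demonstrated in a small C sketch (assuming 32-bit int
and two's complement, which is what the kernel targets):

```c
/* The bit pattern 0xfffffff7: widened from unsigned int it reads as
 * 4294967287, but reinterpreted as a 32-bit signed int it is -9, the
 * intended "negative" counter reading. */
static long long widened(unsigned int x)
{
	return (long long)x;   /* value-preserving: 4294967287 */
}

static int as_int(unsigned int x)
{
	return (int)x;         /* wraps to -9 for 0xfffffff7 */
}
```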

Signed-off-by: Yan Yan &lt;yanyan.yan@antgroup.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20230506074253.44526-1-yanyan.yan@antgroup.com
</content>
</entry>
</feed>
