<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/sched/rt.c, branch v5.4.76</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.4.76</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.4.76'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2020-06-22T07:31:07Z</updated>
<entry>
<title>sched: Defend cfs and rt bandwidth quota against overflow</title>
<updated>2020-06-22T07:31:07Z</updated>
<author>
<name>Huaixin Chang</name>
<email>changhuaixin@linux.alibaba.com</email>
</author>
<published>2020-04-25T10:52:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9fa3b0bd9914133cf362a1719af522fa3fddee1e'/>
<id>urn:sha1:9fa3b0bd9914133cf362a1719af522fa3fddee1e</id>
<content type='text'>
[ Upstream commit d505b8af58912ae1e1a211fabc9995b19bd40828 ]

When users write some huge number into cpu.cfs_quota_us or
cpu.rt_runtime_us, overflow might happen during to_ratio() shifts of
schedulable checks.

to_ratio() could be altered to avoid unnecessary internal overflow, but
min_cfs_quota_period is less than 1 &lt;&lt; BW_SHIFT, so a cutoff would still
be needed. Set a cap MAX_BW for cfs_quota_us and rt_runtime_us to
prevent overflow.

Signed-off-by: Huaixin Chang &lt;changhuaixin@linux.alibaba.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Reviewed-by: Ben Segall &lt;bsegall@google.com&gt;
Link: https://lkml.kernel.org/r/20200425105248.60093-1-changhuaixin@linux.alibaba.com
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>sched/core: Further clarify sched_class::set_next_task()</title>
<updated>2020-01-26T09:01:03Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2019-11-08T13:16:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=37bb3c4646818d15089fe2c48424d52700743a93'/>
<id>urn:sha1:37bb3c4646818d15089fe2c48424d52700743a93</id>
<content type='text'>
commit a0e813f26ebcb25c0b5e504498fbd796cca1a4ba upstream.

It turns out there really is something special to the first
set_next_task() invocation. In specific the 'change' pattern really
should not cause balance callbacks.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: bsegall@google.com
Cc: dietmar.eggemann@arm.com
Cc: juri.lelli@redhat.com
Cc: ktkhai@virtuozzo.com
Cc: mgorman@suse.de
Cc: qais.yousef@arm.com
Cc: qperret@google.com
Cc: rostedt@goodmis.org
Cc: valentin.schneider@arm.com
Cc: vincent.guittot@linaro.org
Fixes: f95d4eaee6d0 ("sched/{rt,deadline}: Fix set_next_task vs pick_next_task")
Link: https://lkml.kernel.org/r/20191108131909.775434698@infradead.org
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sched: Fix pick_next_task() vs 'change' pattern race</title>
<updated>2019-11-08T21:34:14Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2019-11-08T10:11:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6e2df0581f569038719cf2bc2b3baa3fcc83cab4'/>
<id>urn:sha1:6e2df0581f569038719cf2bc2b3baa3fcc83cab4</id>
<content type='text'>
Commit 67692435c411 ("sched: Rework pick_next_task() slow-path")
inadvertly introduced a race because it changed a previously
unexplored dependency between dropping the rq-&gt;lock and
sched_class::put_prev_task().

The comments about dropping rq-&gt;lock, in for example
newidle_balance(), only mentions the task being current and -&gt;on_cpu
being set. But when we look at the 'change' pattern (in for example
sched_setnuma()):

	queued = task_on_rq_queued(p); /* p-&gt;on_rq == TASK_ON_RQ_QUEUED */
	running = task_current(rq, p); /* rq-&gt;curr == p */

	if (queued)
		dequeue_task(...);
	if (running)
		put_prev_task(...);

	/* change task properties */

	if (queued)
		enqueue_task(...);
	if (running)
		set_next_task(...);

It becomes obvious that if we do this after put_prev_task() has
already been called on @p, things go sideways. This is exactly what
the commit in question allows to happen when it does:

	prev-&gt;sched_class-&gt;put_prev_task(rq, prev, rf);
	if (!rq-&gt;nr_running)
		newidle_balance(rq, rf);

The newidle_balance() call will drop rq-&gt;lock after we've called
put_prev_task() and that allows the above 'change' pattern to
interleave and mess up the state.

Furthermore, it turns out we lost the RT-pull when we put the last DL
task.

Fix both problems by extracting the balancing from put_prev_task() and
doing a multi-class balance() pass before put_prev_task().

Fixes: 67692435c411 ("sched: Rework pick_next_task() slow-path")
Reported-by: Quentin Perret &lt;qperret@google.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Tested-by: Quentin Perret &lt;qperret@google.com&gt;
Tested-by: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
</content>
</entry>
<entry>
<title>Merge branch 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2019-09-17T19:35:15Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2019-09-17T19:35:15Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7f2444d38f6bbfa12bc15e2533d8f9daa85ca02b'/>
<id>urn:sha1:7f2444d38f6bbfa12bc15e2533d8f9daa85ca02b</id>
<content type='text'>
Pull core timer updates from Thomas Gleixner:
 "Timers and timekeeping updates:

   - A large overhaul of the posix CPU timer code which is a preparation
     for moving the CPU timer expiry out into task work so it can be
     properly accounted on the task/process.

     An update to the bogus permission checks will come later during the
     merge window as feedback was not complete before heading of for
     travel.

   - Switch the timerqueue code to use cached rbtrees and get rid of the
     homebrewn caching of the leftmost node.

   - Consolidate hrtimer_init() + hrtimer_init_sleeper() calls into a
     single function

   - Implement the separation of hrtimers to be forced to expire in hard
     interrupt context even when PREEMPT_RT is enabled and mark the
     affected timers accordingly.

   - Implement a mechanism for hrtimers and the timer wheel to protect
     RT against priority inversion and live lock issues when a (hr)timer
     which should be canceled is currently executing the callback.
     Instead of infinitely spinning, the task which tries to cancel the
     timer blocks on a per cpu base expiry lock which is held and
     released by the (hr)timer expiry code.

   - Enable the Hyper-V TSC page based sched_clock for Hyper-V guests
     resulting in faster access to timekeeping functions.

   - Updates to various clocksource/clockevent drivers and their device
     tree bindings.

   - The usual small improvements all over the place"

* 'timers-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (101 commits)
  posix-cpu-timers: Fix permission check regression
  posix-cpu-timers: Always clear head pointer on dequeue
  hrtimer: Add a missing bracket and hide `migration_base' on !SMP
  posix-cpu-timers: Make expiry_active check actually work correctly
  posix-timers: Unbreak CONFIG_POSIX_TIMERS=n build
  tick: Mark sched_timer to expire in hard interrupt context
  hrtimer: Add kernel doc annotation for HRTIMER_MODE_HARD
  x86/hyperv: Hide pv_ops access for CONFIG_PARAVIRT=n
  posix-cpu-timers: Utilize timerqueue for storage
  posix-cpu-timers: Move state tracking to struct posix_cputimers
  posix-cpu-timers: Deduplicate rlimit handling
  posix-cpu-timers: Remove pointless comparisons
  posix-cpu-timers: Get rid of 64bit divisions
  posix-cpu-timers: Consolidate timer expiry further
  posix-cpu-timers: Get rid of zero checks
  rlimit: Rewrite non-sensical RLIMIT_CPU comment
  posix-cpu-timers: Respect INFINITY for hard RTTIME limit
  posix-cpu-timers: Switch thread group sampling to array
  posix-cpu-timers: Restructure expiry array
  posix-cpu-timers: Remove cputime_expires
  ...
</content>
</entry>
<entry>
<title>posix-cpu-timers: Move expiry cache into struct posix_cputimers</title>
<updated>2019-08-28T09:50:35Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2019-08-21T19:09:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3a245c0f110e2bfcf7f2cd2248a29005c78999e3'/>
<id>urn:sha1:3a245c0f110e2bfcf7f2cd2248a29005c78999e3</id>
<content type='text'>
The expiry cache belongs into the posix_cputimers container where the other
cpu timers information is.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Reviewed-by: Frederic Weisbecker &lt;frederic@kernel.org&gt;
Link: https://lkml.kernel.org/r/20190821192921.014444012@linutronix.de

</content>
</entry>
<entry>
<title>sched: Rework pick_next_task() slow-path</title>
<updated>2019-08-08T07:09:31Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2019-05-29T20:36:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=67692435c411e5c53a1c588ecca2037aebd81f2e'/>
<id>urn:sha1:67692435c411e5c53a1c588ecca2037aebd81f2e</id>
<content type='text'>
Avoid the RETRY_TASK case in the pick_next_task() slow path.

By doing the put_prev_task() early, we get the rt/deadline pull done,
and by testing rq-&gt;nr_running we know if we need newidle_balance().

This then gives a stable state to pick a task from.

Since the fast-path is fair only; it means the other classes will
always have pick_next_task(.prev=NULL, .rf=NULL) and we can simplify.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Aaron Lu &lt;aaron.lwe@gmail.com&gt;
Cc: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Cc: mingo@kernel.org
Cc: Phil Auld &lt;pauld@redhat.com&gt;
Cc: Julien Desfossez &lt;jdesfossez@digitalocean.com&gt;
Cc: Nishanth Aravamudan &lt;naravamudan@digitalocean.com&gt;
Link: https://lkml.kernel.org/r/aa34d24b36547139248f32a30138791ac6c02bd6.1559129225.git.vpillai@digitalocean.com
</content>
</entry>
<entry>
<title>sched: Allow put_prev_task() to drop rq-&gt;lock</title>
<updated>2019-08-08T07:09:31Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2019-05-29T20:36:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5f2a45fc9e89e022233085e6f0f352eb6ff770bb'/>
<id>urn:sha1:5f2a45fc9e89e022233085e6f0f352eb6ff770bb</id>
<content type='text'>
Currently the pick_next_task() loop is convoluted and ugly because of
how it can drop the rq-&gt;lock and needs to restart the picking.

For the RT/Deadline classes, it is put_prev_task() where we do
balancing, and we could do this before the picking loop. Make this
possible.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Cc: Aaron Lu &lt;aaron.lwe@gmail.com&gt;
Cc: mingo@kernel.org
Cc: Phil Auld &lt;pauld@redhat.com&gt;
Cc: Julien Desfossez &lt;jdesfossez@digitalocean.com&gt;
Cc: Nishanth Aravamudan &lt;naravamudan@digitalocean.com&gt;
Link: https://lkml.kernel.org/r/e4519f6850477ab7f3d257062796e6425ee4ba7c.1559129225.git.vpillai@digitalocean.com
</content>
</entry>
<entry>
<title>sched: Add task_struct pointer to sched_class::set_curr_task</title>
<updated>2019-08-08T07:09:31Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2019-05-29T20:36:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=03b7fad167efca3b7abbbb39733933f9df56e79c'/>
<id>urn:sha1:03b7fad167efca3b7abbbb39733933f9df56e79c</id>
<content type='text'>
In preparation of further separating pick_next_task() and
set_curr_task() we have to pass the actual task into it, while there,
rename the thing to better pair with put_prev_task().

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Aaron Lu &lt;aaron.lwe@gmail.com&gt;
Cc: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Cc: mingo@kernel.org
Cc: Phil Auld &lt;pauld@redhat.com&gt;
Cc: Julien Desfossez &lt;jdesfossez@digitalocean.com&gt;
Cc: Nishanth Aravamudan &lt;naravamudan@digitalocean.com&gt;
Link: https://lkml.kernel.org/r/a96d1bcdd716db4a4c5da2fece647a1456c0ed78.1559129225.git.vpillai@digitalocean.com
</content>
</entry>
<entry>
<title>sched/{rt,deadline}: Fix set_next_task vs pick_next_task</title>
<updated>2019-08-08T07:09:30Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2019-05-29T20:36:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f95d4eaee6d0207bff2dc93371133d31227d4cfb'/>
<id>urn:sha1:f95d4eaee6d0207bff2dc93371133d31227d4cfb</id>
<content type='text'>
Because pick_next_task() implies set_curr_task() and some of the
details haven't mattered too much, some of what _should_ be in
set_curr_task() ended up in pick_next_task, correct this.

This prepares the way for a pick_next_task() variant that does not
affect the current state; allowing remote picking.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Aaron Lu &lt;aaron.lwe@gmail.com&gt;
Cc: Valentin Schneider &lt;valentin.schneider@arm.com&gt;
Cc: mingo@kernel.org
Cc: Phil Auld &lt;pauld@redhat.com&gt;
Cc: Julien Desfossez &lt;jdesfossez@digitalocean.com&gt;
Cc: Nishanth Aravamudan &lt;naravamudan@digitalocean.com&gt;
Link: https://lkml.kernel.org/r/38c61d5240553e043c27c5e00b9dd0d184dd6081.1559129225.git.vpillai@digitalocean.com
</content>
</entry>
<entry>
<title>sched: Mark hrtimers to expire in hard interrupt context</title>
<updated>2019-08-01T18:51:19Z</updated>
<author>
<name>Sebastian Andrzej Siewior</name>
<email>bigeasy@linutronix.de</email>
</author>
<published>2019-07-26T18:30:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d5096aa65acd0ef2d18ac8247260ab4481ade399'/>
<id>urn:sha1:d5096aa65acd0ef2d18ac8247260ab4481ade399</id>
<content type='text'>
The scheduler related hrtimers need to expire in hard interrupt context
even on PREEMPT_RT enabled kernels. Mark then as such.

No functional change.

[ tglx: Split out from larger combo patch. Add changelog. ]

Signed-off-by: Sebastian Andrzej Siewior &lt;bigeasy@linutronix.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Link: https://lkml.kernel.org/r/20190726185753.077004842@linutronix.de



</content>
</entry>
</feed>
