<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/sched_rt.c, branch v3.2.79</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.2.79</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.2.79'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2015-03-06T00:39:19Z</updated>
<entry>
<title>sched/rt: Reduce rq lock contention by eliminating locking of non-feasible target</title>
<updated>2015-03-06T00:39:19Z</updated>
<author>
<name>Tim Chen</name>
<email>tim.c.chen@linux.intel.com</email>
</author>
<published>2014-12-12T23:38:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=985504f71bb0ca24ac647c72446cc621980aef9f'/>
<id>urn:sha1:985504f71bb0ca24ac647c72446cc621980aef9f</id>
<content type='text'>
commit 80e3d87b2c5582db0ab5e39610ce3707d97ba409 upstream.

This patch adds checks that prevens futile attempts to move rt tasks
to a CPU with active tasks of equal or higher priority.

This reduces run queue lock contention and improves the performance of
a well known OLTP benchmark by 0.7%.

Signed-off-by: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Shawn Bohrer &lt;sbohrer@rgmadvisors.com&gt;
Cc: Suruchi Kadu &lt;suruchi.a.kadu@intel.com&gt;
Cc: Doug Nelson&lt;doug.nelson@intel.com&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/1421430374.2399.27.camel@schen9-desk2.jf.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>sched/rt: Avoid updating RT entry timeout twice within one tick period</title>
<updated>2014-02-15T19:20:18Z</updated>
<author>
<name>Ying Xue</name>
<email>ying.xue@windriver.com</email>
</author>
<published>2012-07-17T07:03:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b01e0013de67885b61907347ffcdf1e327c4e25e'/>
<id>urn:sha1:b01e0013de67885b61907347ffcdf1e327c4e25e</id>
<content type='text'>
commit 57d2aa00dcec67afa52478730f2b524521af14fb upstream.

The issue below was found in 2.6.34-rt rather than mainline rt
kernel, but the issue still exists upstream as well.

So please let me describe how it was noticed on 2.6.34-rt:

On this version, each softirq has its own thread, it means there
is at least one RT FIFO task per cpu. The priority of these
tasks is set to 49 by default. If user launches an RT FIFO task
with priority lower than 49 of softirq RT tasks, it's possible
there are two RT FIFO tasks enqueued one cpu runqueue at one
moment. By current strategy of balancing RT tasks, when it comes
to RT tasks, we really need to put them off to a CPU that they
can run on as soon as possible. Even if it means a bit of cache
line flushing, we want RT tasks to be run with the least latency.

When the user RT FIFO task which just launched before is
running, the sched timer tick of the current cpu happens. In this
tick period, the timeout value of the user RT task will be
updated once. Subsequently, we try to wake up one softirq RT
task on its local cpu. As the priority of current user RT task
is lower than the softirq RT task, the current task will be
preempted by the higher priority softirq RT task. Before
preemption, we check to see if current can readily move to a
different cpu. If so, we will reschedule to allow the RT push logic
to try to move current somewhere else. Whenever the woken
softirq RT task runs, it first tries to migrate the user FIFO RT
task over to a cpu that is running a task of lesser priority. If
migration is done, it will send a reschedule request to the found
cpu by IPI interrupt. Once the target cpu responds the IPI
interrupt, it will pick the migrated user RT task to preempt its
current task. When the user RT task is running on the new cpu,
the sched timer tick of the cpu fires. So it will tick the user
RT task again. This also means the RT task timeout value will be
updated again. As the migration may be done in one tick period,
it means the user RT task timeout value will be updated twice
within one tick.

If we set a limit on the amount of cpu time for the user RT task
by setrlimit(RLIMIT_RTTIME), the SIGXCPU signal should be posted
upon reaching the soft limit.

But exactly when the SIGXCPU signal should be sent depends on the
RT task timeout value. In fact the timeout mechanism of sending
the SIGXCPU signal assumes the RT task timeout is increased once
every tick.

However, currently the timeout value may be added twice per
tick. So it results in the SIGXCPU signal being sent earlier
than expected.

To solve this issue, we prevent the timeout value from increasing
twice within one tick time by remembering the jiffies value of
last updating the timeout. As long as the RT task's jiffies is
different with the global jiffies value, we allow its timeout to
be updated.

Signed-off-by: Ying Xue &lt;ying.xue@windriver.com&gt;
Signed-off-by: Fan Du &lt;fan.du@windriver.com&gt;
Reviewed-by: Yong Zhang &lt;yong.zhang0@gmail.com&gt;
Acked-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/1342508623-2887-1-git-send-email-ying.xue@windriver.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
[ lizf: backported to 3.4: adjust context ]
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>sched: Unthrottle rt runqueues in __disable_runtime()</title>
<updated>2014-02-15T19:20:18Z</updated>
<author>
<name>Peter Boonstoppel</name>
<email>pboonstoppel@nvidia.com</email>
</author>
<published>2012-08-09T22:34:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4553dab7f3e57b70fee61739325b6ff7bb56ac9b'/>
<id>urn:sha1:4553dab7f3e57b70fee61739325b6ff7bb56ac9b</id>
<content type='text'>
commit a4c96ae319b8047f62dedbe1eac79e321c185749 upstream.

migrate_tasks() uses _pick_next_task_rt() to get tasks from the
real-time runqueues to be migrated. When rt_rq is throttled
_pick_next_task_rt() won't return anything, in which case
migrate_tasks() can't move all threads over and gets stuck in an
infinite loop.

Instead unthrottle rt runqueues before migrating tasks.

Additionally: move unthrottle_offline_cfs_rqs() to rq_offline_fair()

Signed-off-by: Peter Boonstoppel &lt;pboonstoppel@nvidia.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Link: http://lkml.kernel.org/r/5FBF8E85CA34454794F0F7ECBA79798F379D3648B7@HQMAIL04.nvidia.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
[ lizf: backported to 3.4: adjust context ]
Signed-off-by: Li Zefan &lt;lizefan@huawei.com&gt;
[bwh: Backported to 3.2:
 - Adjust filenames
 - unthrottle_offline_cfs_rqs() is already static, but defined in sched.c
   after including sched_fair.c, so add forward declaration
 - unthrottle_offline_cfs_rqs() also needs to be defined for all CONFIG_SMP
   configurations now]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>sched,rt: fix isolated CPUs leaving root_task_group indefinitely throttled</title>
<updated>2014-02-15T19:20:17Z</updated>
<author>
<name>Mike Galbraith</name>
<email>efault@gmx.de</email>
</author>
<published>2012-08-07T08:02:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=aee1f8b87e203b0525f340d2fde94f5a82f6c4bd'/>
<id>urn:sha1:aee1f8b87e203b0525f340d2fde94f5a82f6c4bd</id>
<content type='text'>
commit e221d028bb08b47e624c5f0a31732c642db9d19a upstream.

Root task group bandwidth replenishment must service all CPUs, regardless of
where the timer was last started, and regardless of the isolation mechanism,
lest 'Quoth the Raven, "Nevermore"' become rt scheduling policy.

Signed-off-by: Mike Galbraith &lt;efault@gmx.de&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/1344326558.6968.25.camel@marge.simpson.net
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>sched/rt: Fix SCHED_RR across cgroups</title>
<updated>2014-02-15T19:20:17Z</updated>
<author>
<name>Colin Cross</name>
<email>ccross@android.com</email>
</author>
<published>2012-05-17T04:34:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=13d8ff3f710fae656eb7c5648c1efce749bf8b52'/>
<id>urn:sha1:13d8ff3f710fae656eb7c5648c1efce749bf8b52</id>
<content type='text'>
commit 454c79999f7eaedcdf4c15c449e43902980cbdf5 upstream.

task_tick_rt() has an optimization to only reschedule SCHED_RR tasks
if they were the only element on their rq.  However, with cgroups
a SCHED_RR task could be the only element on its per-cgroup rq but
still be competing with other SCHED_RR tasks in its parent's
cgroup.  In this case, the SCHED_RR task in the child cgroup would
never yield at the end of its timeslice.  If the child cgroup
rt_runtime_us was the same as the parent cgroup rt_runtime_us,
the task in the parent cgroup would starve completely.

Modify task_tick_rt() to check that the task is the only task on its
rq, and that the each of the scheduling entities of its ancestors
is also the only entity on its rq.

Signed-off-by: Colin Cross &lt;ccross@android.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/1337229266-15798-1-git-send-email-ccross@android.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>sched/rt: Fix rq's cpupri leak while enqueue/dequeue child RT entities</title>
<updated>2014-02-15T19:20:14Z</updated>
<author>
<name>Kirill Tkhai</name>
<email>tkhai@yandex.ru</email>
</author>
<published>2013-11-27T15:59:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=069623122342d660364e011d1d3e2dfaad18b75d'/>
<id>urn:sha1:069623122342d660364e011d1d3e2dfaad18b75d</id>
<content type='text'>
commit 757dfcaa41844595964f1220f1d33182dae49976 upstream.

This patch touches the RT group scheduling case.

Functions inc_rt_prio_smp() and dec_rt_prio_smp() change (global) rq's
priority, while rt_rq passed to them may be not the top-level rt_rq.
This is wrong, because changing of priority on a child level does not
guarantee that the priority is the highest all over the rq. So, this
leak makes RT balancing unusable.

The short example: the task having the highest priority among all rq's
RT tasks (no one other task has the same priority) are waking on a
throttle rt_rq.  The rq's cpupri is set to the task's priority
equivalent, but real rq-&gt;rt.highest_prio.curr is less.

The patch below fixes the problem.

Signed-off-by: Kirill Tkhai &lt;tkhai@yandex.ru&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
CC: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Link: http://lkml.kernel.org/r/49231385567953@web4m.yandex.ru
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>sched/rt: Use root_domain of rt_rq not current processor</title>
<updated>2013-02-20T03:15:20Z</updated>
<author>
<name>Shawn Bohrer</name>
<email>sbohrer@rgmadvisors.com</email>
</author>
<published>2013-01-14T17:55:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7fc80853e471b5442bd0e793f4eadf102cdd5faa'/>
<id>urn:sha1:7fc80853e471b5442bd0e793f4eadf102cdd5faa</id>
<content type='text'>
commit aa7f67304d1a03180f463258aa6f15a8b434e77d upstream.

When the system has multiple domains do_sched_rt_period_timer()
can run on any CPU and may iterate over all rt_rq in
cpu_online_mask.  This means when balance_runtime() is run for a
given rt_rq that rt_rq may be in a different rd than the current
processor.  Thus if we use smp_processor_id() to get rd in
do_balance_runtime() we may borrow runtime from a rt_rq that is
not part of our rd.

This changes do_balance_runtime to get the rd from the passed in
rt_rq ensuring that we borrow runtime only from the correct rd
for the given rt_rq.

This fixes a BUG at kernel/sched/rt.c:687! in __disable_runtime
when we try reclaim runtime lent to other rt_rq but runtime has
been lent to a rt_rq in another rd.

Signed-off-by: Shawn Bohrer &lt;sbohrer@rgmadvisors.com&gt;
Acked-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Acked-by: Mike Galbraith &lt;bitbucket@online.de&gt;
Cc: peterz@infradead.org
Link: http://lkml.kernel.org/r/1358186131-29494-1-git-send-email-sbohrer@rgmadvisors.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>sched/rt: Fix task stack corruption under __ARCH_WANT_INTERRUPTS_ON_CTXSW</title>
<updated>2012-02-13T19:16:56Z</updated>
<author>
<name>Chanho Min</name>
<email>chanho0207@gmail.com</email>
</author>
<published>2012-01-05T11:00:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6341f8928cf458016bab6aab444536843083ef0a'/>
<id>urn:sha1:6341f8928cf458016bab6aab444536843083ef0a</id>
<content type='text'>
commit cb297a3e433dbdcf7ad81e0564e7b804c941ff0d upstream.

This issue happens under the following conditions:

 1. preemption is off
 2. __ARCH_WANT_INTERRUPTS_ON_CTXSW is defined
 3. RT scheduling class
 4. SMP system

Sequence is as follows:

 1.suppose current task is A. start schedule()
 2.task A is enqueued pushable task at the entry of schedule()
   __schedule
    prev = rq-&gt;curr;
    ...
    put_prev_task
     put_prev_task_rt
      enqueue_pushable_task
 4.pick the task B as next task.
   next = pick_next_task(rq);
 3.rq-&gt;curr set to task B and context_switch is started.
   rq-&gt;curr = next;
 4.At the entry of context_swtich, release this cpu's rq-&gt;lock.
   context_switch
    prepare_task_switch
     prepare_lock_switch
      raw_spin_unlock_irq(&amp;rq-&gt;lock);
 5.Shortly after rq-&gt;lock is released, interrupt is occurred and start IRQ context
 6.try_to_wake_up() which called by ISR acquires rq-&gt;lock
    try_to_wake_up
     ttwu_remote
      rq = __task_rq_lock(p)
      ttwu_do_wakeup(rq, p, wake_flags);
        task_woken_rt
 7.push_rt_task picks the task A which is enqueued before.
   task_woken_rt
    push_rt_tasks(rq)
     next_task = pick_next_pushable_task(rq)
 8.At find_lock_lowest_rq(), If double_lock_balance() returns 0,
   lowest_rq can be the remote rq.
  (But,If preemption is on, double_lock_balance always return 1 and it
   does't happen.)
   push_rt_task
    find_lock_lowest_rq
     if (double_lock_balance(rq, lowest_rq))..
 9.find_lock_lowest_rq return the available rq. task A is migrated to
   the remote cpu/rq.
   push_rt_task
    ...
    deactivate_task(rq, next_task, 0);
    set_task_cpu(next_task, lowest_rq-&gt;cpu);
    activate_task(lowest_rq, next_task, 0);
 10. But, task A is on irq context at this cpu.
     So, task A is scheduled by two cpus at the same time until restore from IRQ.
     Task A's stack is corrupted.

To fix it, don't migrate an RT task if it's still running.

Signed-off-by: Chanho Min &lt;chanho.min@lge.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Acked-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Link: http://lkml.kernel.org/r/CAOAMb1BHA=5fm7KTewYyke6u-8DP0iUuJMpgQw54vNeXFsGpoQ@mail.gmail.com
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sched, rt: Provide means of disabling cross-cpu bandwidth sharing</title>
<updated>2011-11-14T11:50:40Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2011-10-06T20:39:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4a6184ce7a48c478dee0d8a9ed74c1fa35161858'/>
<id>urn:sha1:4a6184ce7a48c478dee0d8a9ed74c1fa35161858</id>
<content type='text'>
Normally the RT bandwidth scheme will share bandwidth across the
entire root_domain. However sometimes its convenient to disable this
sharing for debug purposes. Provide a simple feature switch to this
end.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>sched: Warn on rt throttling</title>
<updated>2011-10-06T10:47:04Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2011-10-05T11:32:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1c83437e80186832a9a48dbb6b8868d28e40e562'/>
<id>urn:sha1:1c83437e80186832a9a48dbb6b8868d28e40e562</id>
<content type='text'>
The default rt-throttling is a source of never ending questions. Warn
once when we go into throttling so folks have that info in dmesg.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1110051331480.18778@ionos
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
</feed>
