<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/sched/fair.c, branch v3.4.70</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.4.70</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.4.70'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2013-10-01T16:10:51Z</updated>
<entry>
<title>sched/fair: Fix small race where child-&gt;se.parent,cfs_rq might point to invalid ones</title>
<updated>2013-10-01T16:10:51Z</updated>
<author>
<name>Daisuke Nishimura</name>
<email>nishimura@mxp.nes.nec.co.jp</email>
</author>
<published>2013-09-10T09:16:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=62875332eafea9ee150a6037c0a1a20669e02aa1'/>
<id>urn:sha1:62875332eafea9ee150a6037c0a1a20669e02aa1</id>
<content type='text'>
commit 6c9a27f5da9609fca46cb2b183724531b48f71ad upstream.

There is a small race between copy_process() and cgroup_attach_task()
where child-&gt;se.parent,cfs_rq points to invalid (old) ones.

        parent doing fork()      | someone moving the parent to another cgroup
  -------------------------------+---------------------------------------------
    copy_process()
      + dup_task_struct()
        -&gt; parent-&gt;se is copied to child-&gt;se.
           se.parent,cfs_rq of them point to old ones.

                                     cgroup_attach_task()
                                       + cgroup_task_migrate()
                                         -&gt; parent-&gt;cgroup is updated.
                                       + cpu_cgroup_attach()
                                         + sched_move_task()
                                           + task_move_group_fair()
                                             +- set_task_rq()
                                                -&gt; se.parent,cfs_rq of parent
                                                   are updated.

      + cgroup_fork()
        -&gt; parent-&gt;cgroup is copied to child-&gt;cgroup. (*1)
      + sched_fork()
        + task_fork_fair()
          -&gt; se.parent,cfs_rq of child are accessed
             while they point to old ones. (*2)

In the worst case, this bug can lead to "use-after-free" and cause a panic,
because it's new cgroup's refcount that is incremented at (*1),
so the old cgroup(and related data) can be freed before (*2).

In fact, a panic caused by this bug was originally caught in RHEL6.4.

    BUG: unable to handle kernel NULL pointer dereference at (null)
    IP: [&lt;ffffffff81051e3e&gt;] sched_slice+0x6e/0xa0
    [...]
    Call Trace:
     [&lt;ffffffff81051f25&gt;] place_entity+0x75/0xa0
     [&lt;ffffffff81056a3a&gt;] task_fork_fair+0xaa/0x160
     [&lt;ffffffff81063c0b&gt;] sched_fork+0x6b/0x140
     [&lt;ffffffff8106c3c2&gt;] copy_process+0x5b2/0x1450
     [&lt;ffffffff81063b49&gt;] ? wake_up_new_task+0xd9/0x130
     [&lt;ffffffff8106d2f4&gt;] do_fork+0x94/0x460
     [&lt;ffffffff81072a9e&gt;] ? sys_wait4+0xae/0x100
     [&lt;ffffffff81009598&gt;] sys_clone+0x28/0x30
     [&lt;ffffffff8100b393&gt;] stub_clone+0x13/0x20
     [&lt;ffffffff8100b072&gt;] ? system_call_fastpath+0x16/0x1b

Signed-off-by: Daisuke Nishimura &lt;nishimura@mxp.nes.nec.co.jp&gt;
Signed-off-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/039601ceae06$733d3130$59b79390$@mxp.nes.nec.co.jp
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sched: Fix the broken sched_rr_get_interval()</title>
<updated>2013-08-11T22:38:43Z</updated>
<author>
<name>Zhu Yanhai</name>
<email>gaoyang.zyh@taobao.com</email>
</author>
<published>2013-01-08T04:56:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1836cd176170d39f5b1fda9cd47d3c3f0f10dad2'/>
<id>urn:sha1:1836cd176170d39f5b1fda9cd47d3c3f0f10dad2</id>
<content type='text'>
commit a59f4e079d19464eebb9b06513a1d4f55fdae5ba upstream.

The caller of sched_sliced() should pass se.cfs_rq and se as the
arguments, however in sched_rr_get_interval() we gave it
rq.cfs_rq and se, which made the following computation obviously
wrong.

The change was introduced by commit:

  77034937dc45 sched: fix crash in sys_sched_rr_get_interval()

... 5 years ago, while it had been the correct 'cfs_rq_of' before
the commit. The change seems to be irrelevant to the commit
msg, which was to return a 0 timeslice for tasks that are on an
idle runqueue. So I believe that was just a plain typo.

Signed-off-by: Zhu Yanhai &lt;gaoyang.zyh@taobao.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/1357621012-15039-1-git-send-email-gaoyang.zyh@taobao.com
[ Since this is an ABI and an old bug, we'll test this via a
  slow upstream route, to hopefully discover any app breakage. ]
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sched: Fix more load-balancing fallout</title>
<updated>2012-04-26T10:54:52Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2012-04-17T11:38:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=eb95308ee2a69403909e111837b9068c64cfc349'/>
<id>urn:sha1:eb95308ee2a69403909e111837b9068c64cfc349</id>
<content type='text'>
Commits 367456c756a6 ("sched: Ditch per cgroup task lists for
load-balancing") and 5d6523ebd ("sched: Fix load-balance wreckage")
left some more wreckage.

By setting loop_max unconditionally to -&gt;nr_running load-balancing
could take a lot of time on very long runqueues (hackbench!). So keep
the sysctl as max limit of the amount of tasks we'll iterate.

Furthermore, the min load filter for migration completely fails with
cgroups since inequality in per-cpu state can easily lead to such
small loads :/

Furthermore the change to add new tasks to the tail of the queue
instead of the head seems to have some effect.. not quite sure I
understand why.

Combined these fixes solve the huge hackbench regression reported by
Tim when hackbench is ran in a cgroup.

Reported-by: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Acked-by: Tim Chen &lt;tim.c.chen@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/1335365763.28150.267.camel@twins
[ got rid of the CONFIG_PREEMPT tuning and made small readability edits ]
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2012-03-29T21:46:05Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-03-29T21:46:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7fda0412c5f7afdd1a5ff518f98dee5157266d8a'/>
<id>urn:sha1:7fda0412c5f7afdd1a5ff518f98dee5157266d8a</id>
<content type='text'>
Pull scheduler fixes from Ingo Molnar.

* 'sched-urgent-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  cpusets: Remove an unused variable
  sched/rt: Improve pick_next_highest_task_rt()
  sched: Fix select_fallback_rq() vs cpu_active/cpu_online
  sched/x86/smp: Do not enable IRQs over calibrate_delay()
  sched: Fix compiler warning about declared inline after use
  MAINTAINERS: Update email address for SCHEDULER and PERF EVENTS
</content>
</entry>
<entry>
<title>sched: Fix compiler warning about declared inline after use</title>
<updated>2012-03-23T09:39:28Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2012-03-21T20:07:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6c16a6dcb05e51ace340ff7bc6dbe647f1593528'/>
<id>urn:sha1:6c16a6dcb05e51ace340ff7bc6dbe647f1593528</id>
<content type='text'>
kernel/sched/fair.c:420: warning: 'account_cfs_rq_runtime' declared inline after being called
kernel/sched/fair.c:420: warning: previous declaration of 'account_cfs_rq_runtime' was here
kernel/sched/fair.c:1165: warning: 'return_cfs_rq_runtime' declared inlineafter being called
kernel/sched/fair.c:1165: warning: previous declaration of 'return_cfs_rq_runtime' was here

Reported-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Link: http://lkml.kernel.org/r/20120321200717.49BB4A024E@akpm.mtv.corp.google.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
</content>
</entry>
<entry>
<title>Merge branch 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2012-03-20T17:31:44Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-03-20T17:31:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2ba68940c893c8f0bfc8573c041254251bb6aeab'/>
<id>urn:sha1:2ba68940c893c8f0bfc8573c041254251bb6aeab</id>
<content type='text'>
Pull scheduler changes for v3.4 from Ingo Molnar

* 'sched-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip: (27 commits)
  printk: Make it compile with !CONFIG_PRINTK
  sched/x86: Fix overflow in cyc2ns_offset
  sched: Fix nohz load accounting -- again!
  sched: Update yield() docs
  printk/sched: Introduce special printk_sched() for those awkward moments
  sched/nohz: Correctly initialize 'next_balance' in 'nohz' idle balancer
  sched: Cleanup cpu_active madness
  sched: Fix load-balance wreckage
  sched: Clean up parameter passing of proc_sched_autogroup_set_nice()
  sched: Ditch per cgroup task lists for load-balancing
  sched: Rename load-balancing fields
  sched: Move load-balancing arguments into helper struct
  sched/rt: Do not submit new work when PI-blocked
  sched/rt: Prevent idle task boosting
  sched/wait: Add __wake_up_all_locked() API
  sched/rt: Document scheduler related skip-resched-check sites
  sched/rt: Use schedule_preempt_disabled()
  sched/rt: Add schedule_preempt_disabled()
  sched/rt: Do not throttle when PI boosting
  sched/rt: Keep period timer ticking when rt throttling is active
  ...
</content>
</entry>
<entry>
<title>sched/nohz: Correctly initialize 'next_balance' in 'nohz' idle balancer</title>
<updated>2012-03-12T19:43:16Z</updated>
<author>
<name>Diwakar Tundlam</name>
<email>dtundlam@nvidia.com</email>
</author>
<published>2012-03-07T22:44:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=554cecaf733623b327eef9652b65965eb1081b81'/>
<id>urn:sha1:554cecaf733623b327eef9652b65965eb1081b81</id>
<content type='text'>
The 'next_balance' field of 'nohz' idle balancer must be initialized
to jiffies. Since jiffies is initialized to negative 300 seconds the
'nohz' idle balancer does not run for the first 300s (5mins) after
bootup. If no new processes are spawed or no idle cycles happen, the
load on the cpus will remain unbalanced for that duration.

Signed-off-by: Diwakar Tundlam &lt;dtundlam@nvidia.com&gt;
Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/r/1DD7BFEDD3147247B1355BEFEFE4665237994F30EF@HQMAIL04.nvidia.com
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>sched: Fix load-balance wreckage</title>
<updated>2012-03-12T19:43:15Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2012-03-09T23:07:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5d6523ebd2f67de9d23285aad7f3910e7b0aee83'/>
<id>urn:sha1:5d6523ebd2f67de9d23285aad7f3910e7b0aee83</id>
<content type='text'>
Commit 367456c ("sched: Ditch per cgroup task lists for
load-balancing") completely wrecked load-balancing due to
a few silly mistakes.

Correct those and remove more pointless code.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Link: http://lkml.kernel.org/n/tip-zk04ihygwxn7qqrlpaf73b0r@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>Merge branch 'perf/urgent' into perf/core</title>
<updated>2012-03-05T08:20:08Z</updated>
<author>
<name>Ingo Molnar</name>
<email>mingo@elte.hu</email>
</author>
<published>2012-03-05T08:20:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=737f24bda723fdf89ecaacb99fa2bf5683c32799'/>
<id>urn:sha1:737f24bda723fdf89ecaacb99fa2bf5683c32799</id>
<content type='text'>
Conflicts:
	tools/perf/builtin-record.c
	tools/perf/builtin-top.c
	tools/perf/perf.h
	tools/perf/util/top.h

Merge reason: resolve these cherry-picking conflicts.

Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
<entry>
<title>sched: Ditch per cgroup task lists for load-balancing</title>
<updated>2012-03-01T12:08:37Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>a.p.zijlstra@chello.nl</email>
</author>
<published>2012-02-20T20:49:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=367456c756a6b84f493ca9cc5b17b1f5d38ef466'/>
<id>urn:sha1:367456c756a6b84f493ca9cc5b17b1f5d38ef466</id>
<content type='text'>
Per cgroup load-balance has numerous problems, chief amongst them that
there is no real sane order in them. So stop pretending it makes sense
and enqueue all tasks on a single list.

This also allows us to more easily fix the fwd progress issue
uncovered by the lock-break stuff. Rotate the list on failure to
migreate and limit the total iterations to nr_running (which with
releasing the lock isn't strictly accurate but close enough).

Also add a filter that skips very light tasks on the first attempt
around the list, this attempts to avoid shooting whole cgroups around
without affecting over balance.

Signed-off-by: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: pjt@google.com
Link: http://lkml.kernel.org/n/tip-tx8yqydc7eimgq7i4rkc3a4g@git.kernel.org
Signed-off-by: Ingo Molnar &lt;mingo@elte.hu&gt;
</content>
</entry>
</feed>
