<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel, branch v4.4.76</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.76</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.76'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2017-07-05T12:37:21Z</updated>
<entry>
<title>sched/loadavg: Avoid loadavg spikes caused by delayed NO_HZ accounting</title>
<updated>2017-07-05T12:37:21Z</updated>
<author>
<name>Matt Fleming</name>
<email>matt@codeblueprint.co.uk</email>
</author>
<published>2017-02-17T12:07:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6ca11db55f62ea484c4ecf28b1fa9da14536f31e'/>
<id>urn:sha1:6ca11db55f62ea484c4ecf28b1fa9da14536f31e</id>
<content type='text'>
commit 6e5f32f7a43f45ee55c401c0b9585eb01f9629a8 upstream.

If we crossed a sample window while in NO_HZ we will add LOAD_FREQ to
the pending sample window time on exit, setting the next update not
one window into the future, but two.

This situation on exiting NO_HZ is described by:

  this_rq-&gt;calc_load_update &lt; jiffies &lt; calc_load_update

In this scenario, what we should be doing is:

  this_rq-&gt;calc_load_update = calc_load_update		     [ next window ]

But what we actually do is:

  this_rq-&gt;calc_load_update = calc_load_update + LOAD_FREQ   [ next+1 window ]

This has the effect of delaying load average updates for potentially
up to ~9seconds.

This can result in huge spikes in the load average values due to
per-cpu uninterruptible task counts being out of sync when accumulated
across all CPUs.

It's safe to update the per-cpu active count if we wake between sample
windows because any load that we left in 'calc_load_idle' will have
been zero'd when the idle load was folded in calc_global_load().

This issue is easy to reproduce before,

  commit 9d89c257dfb9 ("sched/fair: Rewrite runnable load and utilization average tracking")

just by forking short-lived process pipelines built from ps(1) and
grep(1) in a loop. I'm unable to reproduce the spikes after that
commit, but the bug still seems to be present from code review.

Signed-off-by: Matt Fleming &lt;matt@codeblueprint.co.uk&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Mike Galbraith &lt;efault@gmx.de&gt;
Cc: Mike Galbraith &lt;umgwanakikbuti@gmail.com&gt;
Cc: Morten Rasmussen &lt;morten.rasmussen@arm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vincent Guittot &lt;vincent.guittot@linaro.org&gt;
Fixes: commit 5167e8d ("sched/nohz: Rewrite and fix load-avg computation -- again")
Link: http://lkml.kernel.org/r/20170217120731.11868-2-matt@codeblueprint.co.uk
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>kernel/panic.c: add missing \n</title>
<updated>2017-07-05T12:37:19Z</updated>
<author>
<name>Jiri Slaby</name>
<email>jslaby@suse.cz</email>
</author>
<published>2017-01-24T23:18:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=70f41003b9d18e24ad90b81be5a47a5f32e756d3'/>
<id>urn:sha1:70f41003b9d18e24ad90b81be5a47a5f32e756d3</id>
<content type='text'>
[ Upstream commit ff7a28a074ccbea999dadbb58c46212cf90984c6 ]

When a system panics, the "Rebooting in X seconds.." message is never
printed because it lacks a new line.  Fix it.

Link: http://lkml.kernel.org/r/20170119114751.2724-1-jslaby@suse.cz
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>sysctl: enable strict writes</title>
<updated>2017-07-05T12:37:16Z</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2016-01-20T23:00:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2449a71eb98204fc54ff55ddd6825cc5141ce176'/>
<id>urn:sha1:2449a71eb98204fc54ff55ddd6825cc5141ce176</id>
<content type='text'>
commit 41662f5cc55335807d39404371cfcbb1909304c4 upstream.

SYSCTL_WRITES_WARN was added in commit f4aacea2f5d1 ("sysctl: allow for
strict write position handling"), and released in v3.16 in August of
2014.  Since then I can find only 1 instance of non-zero offset
writing[1], and it was fixed immediately in CRIU[2].  As such, it
appears safe to flip this to the strict state now.

[1] https://www.google.com/search?q="when%20file%20position%20was%20not%200"
[2] http://lists.openvz.org/pipermail/criu/2015-April/019819.html

Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Sumit Semwal &lt;sumit.semwal@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>time: Fix clock-&gt;read(clock) race around clocksource changes</title>
<updated>2017-06-29T10:48:51Z</updated>
<author>
<name>John Stultz</name>
<email>john.stultz@linaro.org</email>
</author>
<published>2017-06-08T23:44:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1fecf3977defb3161ba194e5ddbdeca9be638377'/>
<id>urn:sha1:1fecf3977defb3161ba194e5ddbdeca9be638377</id>
<content type='text'>
commit ceea5e3771ed2378668455fa21861bead7504df5 upstream.

In tests, which excercise switching of clocksources, a NULL
pointer dereference can be observed on AMR64 platforms in the
clocksource read() function:

u64 clocksource_mmio_readl_down(struct clocksource *c)
{
	return ~(u64)readl_relaxed(to_mmio_clksrc(c)-&gt;reg) &amp; c-&gt;mask;
}

This is called from the core timekeeping code via:

	cycle_now = tkr-&gt;read(tkr-&gt;clock);

tkr-&gt;read is the cached tkr-&gt;clock-&gt;read() function pointer.
When the clocksource is changed then tkr-&gt;clock and tkr-&gt;read
are updated sequentially. The code above results in a sequential
load operation of tkr-&gt;read and tkr-&gt;clock as well.

If the store to tkr-&gt;clock hits between the loads of tkr-&gt;read
and tkr-&gt;clock, then the old read() function is called with the
new clock pointer. As a consequence the read() function
dereferences a different data structure and the resulting 'reg'
pointer can point anywhere including NULL.

This problem was introduced when the timekeeping code was
switched over to use struct tk_read_base. Before that, it was
theoretically possible as well when the compiler decided to
reload clock in the code sequence:

     now = tk-&gt;clock-&gt;read(tk-&gt;clock);

Add a helper function which avoids the issue by reading
tk_read_base-&gt;clock once into a local variable clk and then issue
the read function via clk-&gt;read(clk). This guarantees that the
read() function always gets the proper clocksource pointer handed
in.

Since there is now no use for the tkr.read pointer, this patch
also removes it, and to address stopping the fast timekeeper
during suspend/resume, it introduces a dummy clocksource to use
rather then just a dummy read function.

Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Acked-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Prarit Bhargava &lt;prarit@redhat.com&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Stephen Boyd &lt;stephen.boyd@linaro.org&gt;
Cc: Miroslav Lichvar &lt;mlichvar@redhat.com&gt;
Cc: Daniel Mentz &lt;danielmentz@google.com&gt;
Link: http://lkml.kernel.org/r/1496965462-20003-2-git-send-email-john.stultz@linaro.org
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>signal: Only reschedule timers on signals timers have sent</title>
<updated>2017-06-29T10:48:51Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2017-06-13T09:31:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bc7b3e9984a8e83e3256c08a059ca745b5d0935c'/>
<id>urn:sha1:bc7b3e9984a8e83e3256c08a059ca745b5d0935c</id>
<content type='text'>
commit 57db7e4a2d92c2d3dfbca4ef8057849b2682436b upstream.

Thomas Gleixner  wrote:
&gt; The CRIU support added a 'feature' which allows a user space task to send
&gt; arbitrary (kernel) signals to itself. The changelog says:
&gt;
&gt;   The kernel prevents sending of siginfo with positive si_code, because
&gt;   these codes are reserved for kernel.  I think we can allow a task to
&gt;   send such a siginfo to itself.  This operation should not be dangerous.
&gt;
&gt; Quite contrary to that claim, it turns out that it is outright dangerous
&gt; for signals with info-&gt;si_code == SI_TIMER. The following code sequence in
&gt; a user space task allows to crash the kernel:
&gt;
&gt;    id = timer_create(CLOCK_XXX, ..... signo = SIGX);
&gt;    timer_set(id, ....);
&gt;    info-&gt;si_signo = SIGX;
&gt;    info-&gt;si_code = SI_TIMER:
&gt;    info-&gt;_sifields._timer._tid = id;
&gt;    info-&gt;_sifields._timer._sys_private = 2;
&gt;    rt_[tg]sigqueueinfo(..., SIGX, info);
&gt;    sigemptyset(&amp;sigset);
&gt;    sigaddset(&amp;sigset, SIGX);
&gt;    rt_sigtimedwait(sigset, info);
&gt;
&gt; For timers based on CLOCK_PROCESS_CPUTIME_ID, CLOCK_THREAD_CPUTIME_ID this
&gt; results in a kernel crash because sigwait() dequeues the signal and the
&gt; dequeue code observes:
&gt;
&gt;   info-&gt;si_code == SI_TIMER &amp;&amp; info-&gt;_sifields._timer._sys_private != 0
&gt;
&gt; which triggers the following callchain:
&gt;
&gt;  do_schedule_next_timer() -&gt; posix_cpu_timer_schedule() -&gt; arm_timer()
&gt;
&gt; arm_timer() executes a list_add() on the timer, which is already armed via
&gt; the timer_set() syscall. That's a double list add which corrupts the posix
&gt; cpu timer list. As a consequence the kernel crashes on the next operation
&gt; touching the posix cpu timer list.
&gt;
&gt; Posix clocks which are internally implemented based on hrtimers are not
&gt; affected by this because hrtimer_start() can handle already armed timers
&gt; nicely, but it's a reliable way to trigger the WARN_ON() in
&gt; hrtimer_forward(), which complains about calling that function on an
&gt; already armed timer.

This problem has existed since the posix timer code was merged into
2.5.63. A few releases earlier in 2.5.60 ptrace gained the ability to
inject not just a signal (which linux has supported since 1.0) but the
full siginfo of a signal.

The core problem is that the code will reschedule in response to
signals getting dequeued not just for signals the timers sent but
for other signals that happen to a si_code of SI_TIMER.

Avoid this confusion by testing to see if the queued signal was
preallocated as all timer signals are preallocated, and so far
only the timer code preallocates signals.

Move the check for if a timer needs to be rescheduled up into
collect_signal where the preallocation check must be performed,
and pass the result back to dequeue_signal where the code reschedules
timers.   This makes it clear why the code cares about preallocated
timers.

Reported-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
History Tree: https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
Reference: 66dd34ad31e5 ("signal: allow to send any siginfo to itself")
Reference: 1669ce53e2ff ("Add PTRACE_GETSIGINFO and PTRACE_SETSIGINFO")
Fixes: db8b50ba75f2 ("[PATCH] POSIX clocks &amp; timers")
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>alarmtimer: Rate limit periodic intervals</title>
<updated>2017-06-26T05:13:11Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-05-30T21:15:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=26605a06dd921df52e2395af853367e670e6381b'/>
<id>urn:sha1:26605a06dd921df52e2395af853367e670e6381b</id>
<content type='text'>
commit ff86bf0c65f14346bf2440534f9ba5ac232c39a0 upstream.

The alarmtimer code has another source of potentially rearming itself too
fast. Interval timers with a very samll interval have a similar CPU hog
effect as the previously fixed overflow issue.

The reason is that alarmtimers do not implement the normal protection
against this kind of problem which the other posix timer use:

  timer expires -&gt; queue signal -&gt; deliver signal -&gt; rearm timer

This scheme brings the rearming under scheduler control and prevents
permanently firing timers which hog the CPU.

Bringing this scheme to the alarm timer code is a major overhaul because it
lacks all the necessary mechanisms completely.

So for a quick fix limit the interval to one jiffie. This is not
problematic in practice as alarmtimers are usually backed by an RTC for
suspend which have 1 second resolution. It could be therefor argued that
the resolution of this clock should be set to 1 second in general, but
that's outside the scope of this fix.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Kostya Serebryany &lt;kcc@google.com&gt;
Cc: syzkaller &lt;syzkaller@googlegroups.com&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Link: http://lkml.kernel.org/r/20170530211655.896767100@linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;


</content>
</entry>
<entry>
<title>alarmtimer: Prevent overflow of relative timers</title>
<updated>2017-06-26T05:13:10Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2017-05-30T21:15:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=aac7fa215e8fc795287328c2914aa123e7538690'/>
<id>urn:sha1:aac7fa215e8fc795287328c2914aa123e7538690</id>
<content type='text'>
commit f4781e76f90df7aec400635d73ea4c35ee1d4765 upstream.

Andrey reported a alartimer related RCU stall while fuzzing the kernel with
syzkaller.

The reason for this is an overflow in ktime_add() which brings the
resulting time into negative space and causes immediate expiry of the
timer. The following rearm with a small interval does not bring the timer
back into positive space due to the same issue.

This results in a permanent firing alarmtimer which hogs the CPU.

Use ktime_add_safe() instead which detects the overflow and clamps the
result to KTIME_SEC_MAX.

Reported-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Kostya Serebryany &lt;kcc@google.com&gt;
Cc: syzkaller &lt;syzkaller@googlegroups.com&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: Dmitry Vyukov &lt;dvyukov@google.com&gt;
Link: http://lkml.kernel.org/r/20170530211655.802921648@linutronix.de
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>genirq: Release resources in __setup_irq() error path</title>
<updated>2017-06-26T05:13:10Z</updated>
<author>
<name>Heiner Kallweit</name>
<email>hkallweit1@gmail.com</email>
</author>
<published>2017-06-10T22:38:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4d4d501cd7079d9052bb9ea6778f15491508f95e'/>
<id>urn:sha1:4d4d501cd7079d9052bb9ea6778f15491508f95e</id>
<content type='text'>
commit fa07ab72cbb0d843429e61bf179308aed6cbe0dd upstream.

In case __irq_set_trigger() fails the resources requested via
irq_request_resources() are not released.

Add the missing release call into the error handling path.

Fixes: c1bacbae8192 ("genirq: Provide irq_request/release_resources chip callbacks")
Signed-off-by: Heiner Kallweit &lt;hkallweit1@gmail.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Link: http://lkml.kernel.org/r/655538f5-cb20-a892-ff15-fbd2dd1fa4ec@gmail.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>perf/core: Drop kernel samples even though :u is specified</title>
<updated>2017-06-14T11:16:25Z</updated>
<author>
<name>Jin Yao</name>
<email>yao.jin@linux.intel.com</email>
</author>
<published>2017-05-25T10:09:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e582b82c160a63c2acd02bd6aa7b959a8700e96c'/>
<id>urn:sha1:e582b82c160a63c2acd02bd6aa7b959a8700e96c</id>
<content type='text'>
commit cc1582c231ea041fbc68861dfaf957eaf902b829 upstream.

When doing sampling, for example:

  perf record -e cycles:u ...

On workloads that do a lot of kernel entry/exits we see kernel
samples, even though :u is specified. This is due to skid existing.

This might be a security issue because it can leak kernel addresses even
though kernel sampling support is disabled.

The patch drops the kernel samples if exclude_kernel is specified.

For example, test on Haswell desktop:

  perf record -e cycles:u &lt;mgen&gt;
  perf report --stdio

Before patch applied:

    99.77%  mgen     mgen              [.] buf_read
     0.20%  mgen     mgen              [.] rand_buf_init
     0.01%  mgen     [kernel.vmlinux]  [k] apic_timer_interrupt
     0.00%  mgen     mgen              [.] last_free_elem
     0.00%  mgen     libc-2.23.so      [.] __random_r
     0.00%  mgen     libc-2.23.so      [.] _int_malloc
     0.00%  mgen     mgen              [.] rand_array_init
     0.00%  mgen     [kernel.vmlinux]  [k] page_fault
     0.00%  mgen     libc-2.23.so      [.] __random
     0.00%  mgen     libc-2.23.so      [.] __strcasestr
     0.00%  mgen     ld-2.23.so        [.] strcmp
     0.00%  mgen     ld-2.23.so        [.] _dl_start
     0.00%  mgen     libc-2.23.so      [.] sched_setaffinity@@GLIBC_2.3.4
     0.00%  mgen     ld-2.23.so        [.] _start

We can see kernel symbols apic_timer_interrupt and page_fault.

After patch applied:

    99.79%  mgen     mgen           [.] buf_read
     0.19%  mgen     mgen           [.] rand_buf_init
     0.00%  mgen     libc-2.23.so   [.] __random_r
     0.00%  mgen     mgen           [.] rand_array_init
     0.00%  mgen     mgen           [.] last_free_elem
     0.00%  mgen     libc-2.23.so   [.] vfprintf
     0.00%  mgen     libc-2.23.so   [.] rand
     0.00%  mgen     libc-2.23.so   [.] __random
     0.00%  mgen     libc-2.23.so   [.] _int_malloc
     0.00%  mgen     libc-2.23.so   [.] _IO_doallocbuf
     0.00%  mgen     ld-2.23.so     [.] do_lookup_x
     0.00%  mgen     ld-2.23.so     [.] open_verify.constprop.7
     0.00%  mgen     ld-2.23.so     [.] _dl_important_hwcaps
     0.00%  mgen     libc-2.23.so   [.] sched_setaffinity@@GLIBC_2.3.4
     0.00%  mgen     ld-2.23.so     [.] _start

There are only userspace symbols.

Signed-off-by: Jin Yao &lt;yao.jin@linux.intel.com&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Alexander Shishkin &lt;alexander.shishkin@linux.intel.com&gt;
Cc: Arnaldo Carvalho de Melo &lt;acme@redhat.com&gt;
Cc: Jiri Olsa &lt;jolsa@redhat.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Namhyung Kim &lt;namhyung@kernel.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Stephane Eranian &lt;eranian@google.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Vince Weaver &lt;vincent.weaver@maine.edu&gt;
Cc: acme@kernel.org
Cc: jolsa@kernel.org
Cc: kan.liang@intel.com
Cc: mark.rutland@arm.com
Cc: will.deacon@arm.com
Cc: yao.jin@intel.com
Link: http://lkml.kernel.org/r/1495706947-3744-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cpuset: consider dying css as offline</title>
<updated>2017-06-14T11:16:23Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2017-05-24T16:03:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c8acec90d9dd11f9ebae8ab4a70eac5e1339297d'/>
<id>urn:sha1:c8acec90d9dd11f9ebae8ab4a70eac5e1339297d</id>
<content type='text'>
commit 41c25707d21716826e3c1f60967f5550610ec1c9 upstream.

In most cases, a cgroup controller don't care about the liftimes of
cgroups.  For the controller, a css becomes online when -&gt;css_online()
is called on it and offline when -&gt;css_offline() is called.

However, cpuset is special in that the user interface it exposes cares
whether certain cgroups exist or not.  Combined with the RCU delay
between cgroup removal and css offlining, this can lead to user
visible behavior oddities where operations which should succeed after
cgroup removals fail for some time period.  The effects of cgroup
removals are delayed when seen from userland.

This patch adds css_is_dying() which tests whether offline is pending
and updates is_cpuset_online() so that the function returns false also
while offline is pending.  This gets rid of the userland visible
delays.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Reported-by: Daniel Jordan &lt;daniel.m.jordan@oracle.com&gt;
Link: http://lkml.kernel.org/r/327ca1f5-7957-fbb9-9e5f-9ba149d40ba2@oracle.com
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
