<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel, branch v3.2.80</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.2.80</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.2.80'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2016-04-30T22:05:20Z</updated>
<entry>
<title>fs/coredump: prevent fsuid=0 dumps into user-controlled directories</title>
<updated>2016-04-30T22:05:20Z</updated>
<author>
<name>Jann Horn</name>
<email>jann@thejh.net</email>
</author>
<published>2016-03-22T21:25:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b1bf6857ac304ee1c05cb3d804f70312e947887c'/>
<id>urn:sha1:b1bf6857ac304ee1c05cb3d804f70312e947887c</id>
<content type='text'>
commit 378c6520e7d29280f400ef2ceaf155c86f05a71a upstream.

This commit fixes the following security hole affecting systems where
all of the following conditions are fulfilled:

 - The fs.suid_dumpable sysctl is set to 2.
 - The kernel.core_pattern sysctl's value starts with "/". (Systems
   where kernel.core_pattern starts with "|/" are not affected.)
 - Unprivileged user namespace creation is permitted. (This is
   true on Linux &gt;=3.8, but some distributions disallow it by
   default using a distro patch.)

Under these conditions, if a program executes under secure exec rules,
causing it to run with the SUID_DUMP_ROOT flag, then unshares its user
namespace, changes its root directory and crashes, the coredump will be
written using fsuid=0 and a path derived from kernel.core_pattern - but
this path is interpreted relative to the root directory of the process,
allowing the attacker to control where a coredump will be written with
root privileges.

To fix the security issue, always interpret core_pattern for dumps that
are written under SUID_DUMP_ROOT relative to the root directory of init.

Signed-off-by: Jann Horn &lt;jann@thejh.net&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Cc: Oleg Nesterov &lt;oleg@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
[bwh: Backported to 3.2: adjust filename, context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>tracing: Fix crash from reading trace_pipe with sendfile</title>
<updated>2016-04-30T22:05:20Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2016-03-18T19:46:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1719bf67d6c52a9bc84ddf24a3e85d5f69fe3f90'/>
<id>urn:sha1:1719bf67d6c52a9bc84ddf24a3e85d5f69fe3f90</id>
<content type='text'>
commit a29054d9478d0435ab01b7544da4f674ab13f533 upstream.

If tracing contains data and the trace_pipe file is read with sendfile(),
then it can trigger a NULL pointer dereference and various BUG_ON within the
VM code.

There's a patch to fix this in the splice_to_pipe() code, but it's also a
good idea to not let that happen from trace_pipe either.

Link: http://lkml.kernel.org/r/1457641146-9068-1-git-send-email-rabin@rab.in

Reported-by: Rabin Vincent &lt;rabin.vincent@gmail.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>tracing: Have preempt(irqs)off trace preempt disabled functions</title>
<updated>2016-04-30T22:05:20Z</updated>
<author>
<name>Steven Rostedt (Red Hat)</name>
<email>rostedt@goodmis.org</email>
</author>
<published>2016-03-18T16:27:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9e1128ecd9cab0f5af43cb4f8a6ff9091aaf15b8'/>
<id>urn:sha1:9e1128ecd9cab0f5af43cb4f8a6ff9091aaf15b8</id>
<content type='text'>
commit cb86e05390debcc084cfdb0a71ed4c5dbbec517d upstream.

Joel Fernandes reported that the function tracing of preempt disabled
sections was not being reported when running either the preemptirqsoff or
preemptoff tracers. This was due to the fact that the function tracer
callback for those tracers checked if irqs were disabled before tracing. But
this fails when we want to trace preempt off locations as well.

Joel explained that he wanted to see funcitons where interrupts are enabled
but preemption was disabled. The expected output he wanted:

   &lt;...&gt;-2265    1d.h1 3419us : preempt_count_sub &lt;-irq_exit
   &lt;...&gt;-2265    1d..1 3419us : __do_softirq &lt;-irq_exit
   &lt;...&gt;-2265    1d..1 3419us : msecs_to_jiffies &lt;-__do_softirq
   &lt;...&gt;-2265    1d..1 3420us : irqtime_account_irq &lt;-__do_softirq
   &lt;...&gt;-2265    1d..1 3420us : __local_bh_disable_ip &lt;-__do_softirq
   &lt;...&gt;-2265    1..s1 3421us : run_timer_softirq &lt;-__do_softirq
   &lt;...&gt;-2265    1..s1 3421us : hrtimer_run_pending &lt;-run_timer_softirq
   &lt;...&gt;-2265    1..s1 3421us : _raw_spin_lock_irq &lt;-run_timer_softirq
   &lt;...&gt;-2265    1d.s1 3422us : preempt_count_add &lt;-_raw_spin_lock_irq
   &lt;...&gt;-2265    1d.s2 3422us : _raw_spin_unlock_irq &lt;-run_timer_softirq
   &lt;...&gt;-2265    1..s2 3422us : preempt_count_sub &lt;-_raw_spin_unlock_irq
   &lt;...&gt;-2265    1..s1 3423us : rcu_bh_qs &lt;-__do_softirq
   &lt;...&gt;-2265    1d.s1 3423us : irqtime_account_irq &lt;-__do_softirq
   &lt;...&gt;-2265    1d.s1 3423us : __local_bh_enable &lt;-__do_softirq

There's a comment saying that the irq disabled check is because there's a
possible race that tracing_cpu may be set when the function is executed. But
I don't remember that race. For now, I added a check for preemption being
enabled too to not record the function, as there would be no race if that
was the case. I need to re-investigate this, as I'm now thinking that the
tracing_cpu will always be correct. But no harm in keeping the check for
now, except for the slight performance hit.

Link: http://lkml.kernel.org/r/1457770386-88717-1-git-send-email-agnel.joel@gmail.com

Fixes: 5e6d2b9cfa3a "tracing: Use one prologue for the preempt irqs off tracer function tracers"
Cc: stable@vget.kernel.org # 2.6.37+
Reported-by: Joel Fernandes &lt;agnel.joel@gmail.com&gt;
Signed-off-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>sched/cputime: Fix steal time accounting vs. CPU hotplug</title>
<updated>2016-04-30T22:05:16Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-03-04T14:59:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a41e182b6aa057417054cb61bbae7ec891397484'/>
<id>urn:sha1:a41e182b6aa057417054cb61bbae7ec891397484</id>
<content type='text'>
commit e9532e69b8d1d1284e8ecf8d2586de34aec61244 upstream.

On CPU hotplug the steal time accounting can keep a stale rq-&gt;prev_steal_time
value over CPU down and up. So after the CPU comes up again the delta
calculation in steal_account_process_tick() wreckages itself due to the
unsigned math:

	 u64 steal = paravirt_steal_clock(smp_processor_id());

	 steal -= this_rq()-&gt;prev_steal_time;

So if steal is smaller than rq-&gt;prev_steal_time we end up with an insane large
value which then gets added to rq-&gt;prev_steal_time, resulting in a permanent
wreckage of the accounting. As a consequence the per CPU stats in /proc/stat
become stale.

Nice trick to tell the world how idle the system is (100%) while the CPU is
100% busy running tasks. Though we prefer realistic numbers.

None of the accounting values which use a previous value to account for
fractions is reset at CPU hotplug time. update_rq_clock_task() has a sanity
check for prev_irq_time and prev_steal_time_rq, but that sanity check solely
deals with clock warps and limits the /proc/stat visible wreckage. The
prev_time values are still wrong.

Solution is simple: Reset rq-&gt;prev_*_time when the CPU is plugged in again.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Rik van Riel &lt;riel@redhat.com&gt;
Cc: Frederic Weisbecker &lt;fweisbec@gmail.com&gt;
Cc: Glauber Costa &lt;glommer@parallels.com&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Fixes: commit 095c0aa83e52 "sched: adjust scheduler cpu power for stolen time"
Fixes: commit aa483808516c "sched: Remove irq time from available CPU power"
Fixes: commit e6e6685accfa "KVM guest: Steal time accounting"
Link: http://lkml.kernel.org/r/alpine.DEB.2.11.1603041539490.3686@nanos
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
[bwh: Backported to 3.2: adjust filenames]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>kernel/resource.c: fix muxed resource handling in __request_region()</title>
<updated>2016-04-01T00:54:34Z</updated>
<author>
<name>Simon Guinot</name>
<email>simon.guinot@sequanux.org</email>
</author>
<published>2015-09-09T22:15:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5215afa128adcfbf917fabc76bc69d35de4beab8'/>
<id>urn:sha1:5215afa128adcfbf917fabc76bc69d35de4beab8</id>
<content type='text'>
commit 59ceeaaf355fa0fb16558ef7c24413c804932ada upstream.

In __request_region, if a conflict with a BUSY and MUXED resource is
detected, then the caller goes to sleep and waits for the resource to be
released.  A pointer on the conflicting resource is kept.  At wake-up
this pointer is used as a parent to retry to request the region.

A first problem is that this pointer might well be invalid (if for
example the conflicting resource have already been freed).  Another
problem is that the next call to __request_region() fails to detect a
remaining conflict.  The previously conflicting resource is passed as a
parameter and __request_region() will look for a conflict among the
children of this resource and not at the resource itself.  It is likely
to succeed anyway, even if there is still a conflict.

Instead, the parent of the conflicting resource should be passed to
__request_region().

As a fix, this patch doesn't update the parent resource pointer in the
case we have to wait for a muxed region right after.

Reported-and-tested-by: Vincent Pelletier &lt;plr.vincent@gmail.com&gt;
Signed-off-by: Simon Guinot &lt;simon.guinot@sequanux.org&gt;
Tested-by: Vincent Donnefort &lt;vdonnefort@gmail.com&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>sched: fix __sched_setscheduler() vs load balancing race</title>
<updated>2016-02-27T14:28:49Z</updated>
<author>
<name>Mike Galbraith</name>
<email>umgwanakikbuti@gmail.com</email>
</author>
<published>2016-02-17T03:02:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=34bce1998711e65b8a7ef37eb92641cdc368587a'/>
<id>urn:sha1:34bce1998711e65b8a7ef37eb92641cdc368587a</id>
<content type='text'>
__sched_setscheduler() may release rq-&gt;lock in pull_rt_task() as a task is
being changed rt -&gt; fair class.  load balancing may sneak in, move the task
behind __sched_setscheduler()'s back, which explodes in switched_to_fair()
when the passed but no longer valid rq is used.  Tell can_migrate_task() to
say no if -&gt;pi_lock is held.

@stable: Kernels that predate SCHED_DEADLINE can use this simple (and tested)
check in lieu of backport of the full 18 patch mainline treatment.

Signed-off-by: Mike Galbraith &lt;umgwanakikbuti@gmail.com&gt;
[bwh: Backported to 3.2:
 - Adjust numbering in the comment
 - Adjust filename]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Byungchul Park &lt;byungchul.park@lge.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Willy Tarreau &lt;w@1wt.eu&gt;
</content>
</entry>
<entry>
<title>pipe: limit the per-user amount of pages allocated in pipes</title>
<updated>2016-02-27T14:28:49Z</updated>
<author>
<name>Willy Tarreau</name>
<email>w@1wt.eu</email>
</author>
<published>2016-01-18T15:36:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=92375b85b70395c8180991084c05e8d78e55d066'/>
<id>urn:sha1:92375b85b70395c8180991084c05e8d78e55d066</id>
<content type='text'>
commit 759c01142a5d0f364a462346168a56de28a80f52 upstream.

On no-so-small systems, it is possible for a single process to cause an
OOM condition by filling large pipes with data that are never read. A
typical process filling 4000 pipes with 1 MB of data will use 4 GB of
memory. On small systems it may be tricky to set the pipe max size to
prevent this from happening.

This patch makes it possible to enforce a per-user soft limit above
which new pipes will be limited to a single page, effectively limiting
them to 4 kB each, as well as a hard limit above which no new pipes may
be created for this user. This has the effect of protecting the system
against memory abuse without hurting other users, and still allowing
pipes to work correctly though with less data at once.

The limit are controlled by two new sysctls : pipe-user-pages-soft, and
pipe-user-pages-hard. Both may be disabled by setting them to zero. The
default soft limit allows the default number of FDs per process (1024)
to create pipes of the default size (64kB), thus reaching a limit of 64MB
before starting to create only smaller pipes. With 256 processes limited
to 1024 FDs each, this results in 1024*64kB + (256*1024 - 1024) * 4kB =
1084 MB of memory allocated for a user. The hard limit is disabled by
default to avoid breaking existing applications that make intensive use
of pipes (eg: for splicing).

Reported-by: socketpair@gmail.com
Reported-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Mitigates: CVE-2013-4312 (Linux 2.0+)
Suggested-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Willy Tarreau &lt;w@1wt.eu&gt;
Signed-off-by: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
[bwh: Backported to 3.2: adjust context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>itimers: Handle relative timers with CONFIG_TIME_LOW_RES proper</title>
<updated>2016-02-27T14:28:42Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-01-14T16:54:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6841a66225c4857ed74b1b1d227c1e2ac5434321'/>
<id>urn:sha1:6841a66225c4857ed74b1b1d227c1e2ac5434321</id>
<content type='text'>
commit 51cbb5242a41700a3f250ecfb48dcfb7e4375ea4 upstream.

As Helge reported for timerfd we have the same issue in itimers. We return
remaining time larger than the programmed relative time to user space in case
of CONFIG_TIME_LOW_RES=y. Use the proper function to adjust the extra time
added in hrtimer_start_range_ns().

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: linux-m68k@lists.linux-m68k.org
Cc: dhowells@redhat.com
Link: http://lkml.kernel.org/r/20160114164159.528222587@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>posix-timers: Handle relative timers with CONFIG_TIME_LOW_RES proper</title>
<updated>2016-02-27T14:28:42Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-01-14T16:54:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c7f98587dd2eea27cdfe2ba3b60983c9e9e92de9'/>
<id>urn:sha1:c7f98587dd2eea27cdfe2ba3b60983c9e9e92de9</id>
<content type='text'>
commit 572c39172684c3711e4a03c9a7380067e2b0661c upstream.

As Helge reported for timerfd we have the same issue in posix timers. We
return remaining time larger than the programmed relative time to user space
in case of CONFIG_TIME_LOW_RES=y. Use the proper function to adjust the extra
time added in hrtimer_start_range_ns().

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Helge Deller &lt;deller@gmx.de&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: linux-m68k@lists.linux-m68k.org
Cc: dhowells@redhat.com
Link: http://lkml.kernel.org/r/20160114164159.450510905@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
[bwh: Backported to 3.2: adjust filename]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
<entry>
<title>hrtimer: Handle remaining time proper for TIME_LOW_RES</title>
<updated>2016-02-27T14:28:41Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2016-01-14T16:54:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e7989996ab9554b314dd8425c5f1d64a5b8a0a9b'/>
<id>urn:sha1:e7989996ab9554b314dd8425c5f1d64a5b8a0a9b</id>
<content type='text'>
commit 203cbf77de59fc8f13502dcfd11350c6d4a5c95f upstream.

If CONFIG_TIME_LOW_RES is enabled we add a jiffie to the relative timeout to
prevent short sleeps, but we do not account for that in interfaces which
retrieve the remaining time.

Helge observed that timerfd can return a remaining time larger than the
relative timeout. That's not expected and breaks userland test programs.

Store the information that the timer was armed relative and provide functions
to adjust the remaining time. To avoid bloating the hrtimer struct make state
a u8, which as a bonus results in better code on x86 at least.

Reported-and-tested-by: Helge Deller &lt;deller@gmx.de&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: John Stultz &lt;john.stultz@linaro.org&gt;
Cc: linux-m68k@lists.linux-m68k.org
Cc: dhowells@redhat.com
Link: http://lkml.kernel.org/r/20160114164159.273328486@linutronix.de
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
[bwh: Backported to 3.2:
 - Use #ifdef instead of IS_ENABLED() as that doesn't work for config
   symbols that don't exist on the current architecture
 - Use KTIME_LOW_RES directly instead of hrtimer_resolution
 - Use ktime_sub() instead of modifying ktime::tv64 directly
 - Adjust filename, context]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
</content>
</entry>
</feed>
