<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include, branch v3.4.105</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.4.105</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.4.105'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2014-12-01T10:02:42Z</updated>
<entry>
<title>mnt: Only change user settable mount flags in remount</title>
<updated>2014-12-01T10:02:42Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2014-07-28T23:26:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b47d65db8f8e765ef0267d13681c9bf12a148fb5'/>
<id>urn:sha1:b47d65db8f8e765ef0267d13681c9bf12a148fb5</id>
<content type='text'>
commit a6138db815df5ee542d848318e5dae681590fccd upstream.

Kenton Varda &lt;kenton@sandstorm.io&gt; discovered that by remounting a
read-only bind mount read-only in a user namespace, the
MNT_LOCK_READONLY bit would be cleared, allowing an unprivileged user
to remount a read-only mount read-write.

Correct this by replacing the mask of mount flags to preserve
with a mask of mount flags that may be changed, and preserving
all others. This ensures that any future bugs with this mask and
remount will fail in an easy-to-detect way: new mount flags
simply won't change.
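
Schematically, the approach looks like this (a sketch with illustrative
names, not the literal 3.4 patch):

/* Only user-changeable bits are taken from the remount request; everything
 * else, including MNT_LOCK_READONLY, is copied from the existing mount, so
 * a flag missing from the mask is preserved rather than silently cleared. */
#define MNT_USER_CHANGEABLE (MNT_READONLY | MNT_NOSUID | MNT_NODEV | \
                             MNT_NOEXEC | MNT_NOATIME | MNT_NODIRATIME | \
                             MNT_RELATIME)

static unsigned int remount_mnt_flags(unsigned int old_flags,
                                      unsigned int requested)
{
        return (requested &amp; MNT_USER_CHANGEABLE) |
               (old_flags &amp; ~MNT_USER_CHANGEABLE);
}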

Acked-by: Serge E. Hallyn &lt;serge.hallyn@ubuntu.com&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Cc: Francis Moreau &lt;francis.moro@gmail.com&gt;
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
</content>
</entry>
<entry>
<title>cpuset: PF_SPREAD_PAGE and PF_SPREAD_SLAB should be atomic flags</title>
<updated>2014-12-01T10:02:38Z</updated>
<author>
<name>Zefan Li</name>
<email>lizefan@huawei.com</email>
</author>
<published>2014-09-25T01:41:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4fae6ccac642aa30d189dea30ef14306aad4d2d2'/>
<id>urn:sha1:4fae6ccac642aa30d189dea30ef14306aad4d2d2</id>
<content type='text'>
commit 2ad654bc5e2b211e92f66da1d819e47d79a866f0 upstream.

When we change cpuset.memory_spread_{page,slab}, cpuset will flip
PF_SPREAD_{PAGE,SLAB} bit of tsk-&gt;flags for each task in that cpuset.
This should be done using atomic bitops, but currently it is not,
which is broken.

Tetsuo reported a hard-to-reproduce kernel crash on RHEL6, which happened
when one thread tried to clear PF_USED_MATH while at the same time another
thread tried to flip PF_SPREAD_PAGE/PF_SPREAD_SLAB. They both operate on
the same task.

Here's the full report:
https://lkml.org/lkml/2014/9/19/230

To fix this, we make PF_SPREAD_PAGE and PF_SPREAD_SLAB atomic flags.
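
A minimal before/after sketch (kernel context; the real patch goes through
generated task_set_*/task_clear_* helpers rather than open-coded bitops):

/* Before: non-atomic read-modify-write on another task's flags word,
 * racing with any concurrent update of tsk-&gt;flags: */
tsk-&gt;flags |= PF_SPREAD_PAGE;

/* After: a dedicated word touched only with atomic bitops: */
set_bit(PFA_SPREAD_PAGE, &amp;tsk-&gt;atomic_flags);
clear_bit(PFA_SPREAD_PAGE, &amp;tsk-&gt;atomic_flags);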

v4:
- updated mm/slab.c. (Fengguang Wu)
- updated Documentation.

Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Fixes: 950592f7b991 ("cpusets: update tasks' page/slab spread flags in time")
Reported-by: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
[lizf: Backported to 3.4:
 - adjust context
 - check current-&gt;flags &amp; PF_MEMPOLICY rather than current-&gt;mempolicy]
</content>
</entry>
<entry>
<title>sched: add macros to define bitops for task atomic flags</title>
<updated>2014-12-01T10:02:38Z</updated>
<author>
<name>Zefan Li</name>
<email>lizefan@huawei.com</email>
</author>
<published>2014-09-25T01:40:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=699c06b386d592bede77d5a28ed1637c80ab99c0'/>
<id>urn:sha1:699c06b386d592bede77d5a28ed1637c80ab99c0</id>
<content type='text'>
commit e0e5070b20e01f0321f97db4e4e174f3f6b49e50 upstream.

This will simplify code when we add new flags.
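
Roughly, the macros take the following shape (a sketch; see the header for
the exact definitions):

#define TASK_PFA_TEST(name, func)                                   \
        static inline bool task_##func(struct task_struct *p)       \
        { return test_bit(PFA_##name, &amp;p-&gt;atomic_flags); }
#define TASK_PFA_SET(name, func)                                    \
        static inline void task_set_##func(struct task_struct *p)   \
        { set_bit(PFA_##name, &amp;p-&gt;atomic_flags); }
#define TASK_PFA_CLEAR(name, func)                                  \
        static inline void task_clear_##func(struct task_struct *p) \
        { clear_bit(PFA_##name, &amp;p-&gt;atomic_flags); }

/* e.g. no_new_privs only ever gets TEST and SET, never CLEAR: */
TASK_PFA_TEST(NO_NEW_PRIVS, no_new_privs)
TASK_PFA_SET(NO_NEW_PRIVS, no_new_privs)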

v3:
- Kees pointed out that no_new_privs should never be cleared, so we
shouldn't define task_clear_no_new_privs(). We define 3 macros instead
of a single one.

v2:
- updated scripts/tags.sh, suggested by Peter

Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Miao Xie &lt;miaox@cn.fujitsu.com&gt;
Cc: Tetsuo Handa &lt;penguin-kernel@I-love.SAKURA.ne.jp&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Acked-by: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
[lizf: Backported to 3.4:
 - adjust context
 - remove no_new_priv code
 - add atomic_flags to struct task_struct]
</content>
</entry>
<entry>
<title>jiffies: Fix timeval conversion to jiffies</title>
<updated>2014-12-01T10:02:32Z</updated>
<author>
<name>Andrew Hunter</name>
<email>ahh@google.com</email>
</author>
<published>2014-09-04T21:17:16Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=49d4f0197912461fb2939ca51521b47655b04b11'/>
<id>urn:sha1:49d4f0197912461fb2939ca51521b47655b04b11</id>
<content type='text'>
commit d78c9300c51d6ceed9f6d078d4e9366f259de28c upstream.

timeval_to_jiffies tried to round a timeval up to an integral number
of jiffies, but the logic for doing so was incorrect: intervals
corresponding to exactly N jiffies would become N+1. This manifested
itself particularly when repeatedly stopping/starting an itimer:

setitimer(ITIMER_PROF, &amp;val, NULL);
setitimer(ITIMER_PROF, NULL, &amp;val);

would add a full tick to val, _even if it was exactly representable in
terms of jiffies_ (say, the result of a previous rounding.)  Doing
this repeatedly would cause unbounded growth in val.  So fix the math.

Here's what was wrong with the conversion: we essentially computed
(eliding seconds)

jiffies = usec  * (NSEC_PER_USEC/TICK_NSEC)

by using scaling arithmetic: we approximated NSEC_PER_USEC/TICK_NSEC by
the closest fraction x/(2^USEC_JIFFIE_SC) with fixed denominator
2^USEC_JIFFIE_SC, and computed:

jiffies = (usec * x) &gt;&gt; USEC_JIFFIE_SC

and rounded this calculation up in the intermediate form (since we
can't necessarily exactly represent TICK_NSEC in usec.) But the
scaling arithmetic is a (very slight) *over*approximation of the true
value; that is, instead of dividing by (1 usec/ 1 jiffie), we
effectively divided by (1 usec/1 jiffie)-epsilon (rounding
down). This would normally be fine, but we want to round timeouts up,
and we did so by adding 2^USEC_JIFFIE_SC - 1 before the shift; this
would be fine if our division was exact, but dividing this by the
slightly smaller factor was equivalent to adding just _over_ 1 to the
final result (instead of just _under_ 1, as desired.)

In particular, with HZ=1000, we consistently computed that 10000 usec
was 11 jiffies; the same was true for any exact multiple of
TICK_NSEC.

We could possibly still round in the intermediate form, adding
something less than 2^USEC_JIFFIE_SC - 1, but easier still is to
convert usec-&gt;nsec, round in nanoseconds, and then convert using
time*spec*_to_jiffies.  This adds one constant multiplication, and is
not observably slower in microbenchmarks on recent x86 hardware.
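
A sketch of the reworked conversion (simplified, not the literal diff):

/* Convert the usec part to nsec up front, then let timespec_to_jiffies()
 * do the rounding once, in nanoseconds. */
unsigned long timeval_to_jiffies(const struct timeval *value)
{
        struct timespec ts = {
                .tv_sec  = value-&gt;tv_sec,
                .tv_nsec = value-&gt;tv_usec * NSEC_PER_USEC,
        };
        return timespec_to_jiffies(&amp;ts);
}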

Tested: the following program:

#include &lt;stdio.h&gt;    /* printf */
#include &lt;sys/time.h&gt; /* setitimer, struct itimerval */

int main() {
  struct itimerval zero = {{0, 0}, {0, 0}};
  /* Initially set to 10 ms. */
  struct itimerval initial = zero;
  initial.it_interval.tv_usec = 10000;
  setitimer(ITIMER_PROF, &amp;initial, NULL);
  /* Save and restore several times. */
  for (size_t i = 0; i &lt; 10; ++i) {
    struct itimerval prev;
    setitimer(ITIMER_PROF, &amp;zero, &amp;prev);
    /* on old kernels, this goes up by TICK_USEC every iteration */
    printf("previous value: %ld %ld %ld %ld\n",
           prev.it_interval.tv_sec, prev.it_interval.tv_usec,
           prev.it_value.tv_sec, prev.it_value.tv_usec);
    setitimer(ITIMER_PROF, &amp;prev, NULL);
  }
  return 0;
}

Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Paul Turner &lt;pjt@google.com&gt;
Cc: Richard Cochran &lt;richardcochran@gmail.com&gt;
Cc: Prarit Bhargava &lt;prarit@redhat.com&gt;
Reviewed-by: Paul Turner &lt;pjt@google.com&gt;
Reported-by: Aaron Jacobs &lt;jacobsa@google.com&gt;
Signed-off-by: Andrew Hunter &lt;ahh@google.com&gt;
[jstultz: Tweaked to apply to 3.17-rc]
Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
[lizf: Backported to 3.4: adjust filename]
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
</content>
</entry>
<entry>
<title>regulatory: add NUL to alpha2</title>
<updated>2014-12-01T10:02:22Z</updated>
<author>
<name>Eliad Peller</name>
<email>eliad@wizery.com</email>
</author>
<published>2014-06-11T07:23:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=be93eea739598a5bf2562110323d79d80d0e6a33'/>
<id>urn:sha1:be93eea739598a5bf2562110323d79d80d0e6a33</id>
<content type='text'>
commit a5fe8e7695dc3f547e955ad2b662e3e72969e506 upstream.

alpha2 is defined as a 2-char array, but is used in multiple
places as a string (e.g. with nla_put_string calls), which
might leak kernel data.

Solve it by simply adding an extra char for the NULL
terminator, making such operations safe.
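
In essence (illustrative, not the full diff):

/* Before: exactly two bytes, so string helpers such as nla_put_string()
 * read past the field until they happen to find a 0 byte. */
char alpha2[2];

/* After: room for an always-present NUL terminator. */
char alpha2[3];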

Signed-off-by: Eliad Peller &lt;eliadx.peller@intel.com&gt;
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
[lizf: Backported to 3.4: adjust context]
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
</content>
</entry>
<entry>
<title>slab/mempolicy: always use local policy from interrupt context</title>
<updated>2014-09-25T03:49:17Z</updated>
<author>
<name>Andi Kleen</name>
<email>ak@linux.intel.com</email>
</author>
<published>2012-06-09T09:40:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=69db2d4044cc494b03bad01b2e106f7324ef3fdf'/>
<id>urn:sha1:69db2d4044cc494b03bad01b2e106f7324ef3fdf</id>
<content type='text'>
commit e7b691b085fda913830e5280ae6f724b2a63c824 upstream.

slab_node() could access current-&gt;mempolicy from interrupt context.
However there's a race condition during exit where the mempolicy
is first freed and then the pointer zeroed.

Using this from interrupts seems bogus anyway. The interrupt
will interrupt a random process and therefore get a random
mempolicy. Many times, this will be idle's, which no one can change.

Just disable this here and always use local for slab
from interrupts. I also cleaned up the callers of slab_node a bit
which always passed the same argument.
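
Conceptually the guard looks like this (a sketch; the per-policy node
selection is elided):

unsigned slab_node(void)
{
        struct mempolicy *policy;

        if (in_interrupt())
                return numa_node_id();  /* always local from irq context */

        policy = current-&gt;mempolicy;
        if (!policy)
                return numa_node_id();

        /* placeholder for the per-policy cases elided in this sketch */
        return numa_node_id();
}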

I believe the original mempolicy code did that in fact,
so it's likely a regression.

v2: send version with correct logic
v3: simplify. fix typo.
Reported-by: Arun Sharma &lt;asharma@fb.com&gt;
Cc: penberg@kernel.org
Cc: cl@linux.com
Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
[tdmackey@twitter.com: Rework control flow based on feedback from
cl@linux.com, fix logic, and cleanup current task_struct reference]
Acked-by: David Rientjes &lt;rientjes@google.com&gt;
Acked-by: Christoph Lameter &lt;cl@linux.com&gt;
Acked-by: KOSAKI Motohiro &lt;kosaki.motohiro@jp.fujitsu.com&gt;
Signed-off-by: David Mackey &lt;tdmackey@twitter.com&gt;
Signed-off-by: Pekka Enberg &lt;penberg@kernel.org&gt;
Signed-off-by: Zefan Li &lt;lizefan@huawei.com&gt;
</content>
</entry>
<entry>
<title>ip: make IP identifiers less predictable</title>
<updated>2014-08-14T00:42:35Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2014-07-26T06:58:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=509a15a5d6b0cfd3e4e396844615df6335ff4c62'/>
<id>urn:sha1:509a15a5d6b0cfd3e4e396844615df6335ff4c62</id>
<content type='text'>
[ Upstream commit 04ca6973f7c1a0d8537f2d9906a0cf8e69886d75 ]

In "Counting Packets Sent Between Arbitrary Internet Hosts", Jeffrey and
Jedidiah describe ways exploiting linux IP identifier generation to
infer whether two machines are exchanging packets.

With commit 73f156a6e8c1 ("inetpeer: get rid of ip_id_count"), we
changed IP id generation, but this does not really prevent this
side-channel technique.

This patch adds a random amount of perturbation so that IP identifiers
for a given destination [1] are no longer monotonically increasing after
an idle period.

Note that prandom_u32_max(1) returns 0, so if the generator is used at most
once per jiffy, this patch inserts no hole in the ID sequence and does not
increase collision probability.

This is jiffies based, so in the worst case (HZ=1000), the id can
rollover after ~65 seconds of idle time, which should be fine.

We also change the hash used in __ip_select_ident() to not only hash
on daddr, but also saddr and protocol, so that ICMP probes can not be
used to infer information for other protocols.

For IPv6, we add saddr into the hash as well, but not nexthdr.
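
Condensed sketch of the generator (helper layout simplified; the bucket
count is illustrative):

#define IP_IDENTS_SZ 2048

static atomic_t ip_idents[IP_IDENTS_SZ];
static u32      ip_tstamps[IP_IDENTS_SZ];

/* hash = jhash of (daddr, saddr, protocol) with a boot-time secret;
 * after an idle period, add a random extra increment bounded by the
 * number of jiffies the bucket sat unused. */
static u32 ip_idents_reserve(u32 hash, int segs)
{
        u32 *p_tstamp = ip_tstamps + hash % IP_IDENTS_SZ;
        atomic_t *p_id = ip_idents + hash % IP_IDENTS_SZ;
        u32 old = ACCESS_ONCE(*p_tstamp);
        u32 now = (u32)jiffies;
        u32 delta = 0;

        if (old != now &amp;&amp; cmpxchg(p_tstamp, old, now) == old)
                delta = prandom_u32_max(now - old);

        return atomic_add_return(segs + delta, p_id) - segs;
}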

If I ping the patched target, we can see IDs are now hard to predict.

21:57:11.008086 IP (...)
    A &gt; target: ICMP echo request, seq 1, length 64
21:57:11.010752 IP (... id 2081 ...)
    target &gt; A: ICMP echo reply, seq 1, length 64

21:57:12.013133 IP (...)
    A &gt; target: ICMP echo request, seq 2, length 64
21:57:12.015737 IP (... id 3039 ...)
    target &gt; A: ICMP echo reply, seq 2, length 64

21:57:13.016580 IP (...)
    A &gt; target: ICMP echo request, seq 3, length 64
21:57:13.019251 IP (... id 3437 ...)
    target &gt; A: ICMP echo reply, seq 3, length 64

[1] TCP sessions use a per-flow ID generator not changed by this patch.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Jeffrey Knockel &lt;jeffk@cs.unm.edu&gt;
Reported-by: Jedidiah R. Crandall &lt;crandall@cs.unm.edu&gt;
Cc: Willy Tarreau &lt;w@1wt.eu&gt;
Cc: Hannes Frederic Sowa &lt;hannes@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>inetpeer: get rid of ip_id_count</title>
<updated>2014-08-14T00:42:35Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2014-06-02T12:26:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ad52eef552c7896ec6024ee72fc126167fe5c4e2'/>
<id>urn:sha1:ad52eef552c7896ec6024ee72fc126167fe5c4e2</id>
<content type='text'>
[ Upstream commit 73f156a6e8c1074ac6327e0abd1169e95eb66463 ]

Ideally, we would need to generate IP IDs using a per-destination-IP
generator.

Linux kernels used the inet_peer cache for this purpose, but this had a
huge cost on servers that disable MTU discovery.

1) each inet_peer struct consumes 192 bytes

2) inetpeer cache uses a binary tree of inet_peer structs,
   with a nominal size of ~66000 elements under load.

3) lookups in this tree are hitting a lot of cache lines, as tree depth
   is about 20.

4) If the server deals with many TCP flows, we have a high probability of
   not finding the inet_peer, allocating a fresh one, and inserting it in
   the tree with the same initial ip_id_count (cf. secure_ip_id())

5) We garbage collect inet_peer aggressively.

IP ID generation does not have to be 'perfect'.

The goal is to avoid duplicates over a short period of time,
so that reassembly units have a chance to complete reassembly of
fragments belonging to one message before receiving other fragments
with a recycled ID.

We simply use an array of generators, and a Jenkins hash using the dst IP
as a key.
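
Schematically (a sketch; the bucket count, seeding and exact call
signature are simplified):

#define IP_IDENTS_SZ 2048   /* illustrative bucket count */

static atomic_t ip_idents[IP_IDENTS_SZ];
static u32 ip_idents_hashrnd;   /* seeded once with get_random_bytes() */

static void __ip_select_ident_sketch(struct iphdr *iph, int segs)
{
        u32 hash = jhash_1word((__force u32)iph-&gt;daddr, ip_idents_hashrnd);
        atomic_t *id = ip_idents + hash % IP_IDENTS_SZ;

        iph-&gt;id = htons(atomic_add_return(segs, id) - segs);
}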

ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
belongs (it is only used from this file)

secure_ip_id() and secure_ipv6_id() no longer are needed.

Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
unnecessary decrement/increment of the number of segments.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>Revert: "net: ip, ipv6: handle gso skbs in forwarding path"</title>
<updated>2014-08-07T19:00:11Z</updated>
<author>
<name>Greg Kroah-Hartman</name>
<email>gregkh@linuxfoundation.org</email>
</author>
<published>2014-08-05T04:42:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=68199d62034a736cae12fee49b0dad5342612583'/>
<id>urn:sha1:68199d62034a736cae12fee49b0dad5342612583</id>
<content type='text'>
This reverts commit 29a3cd46644ec8098dbe1c12f89643b5c11831a9 which is
commit fe6cc55f3a9a053482a76f5a6b2257cee51b4663 upstream.

Cc: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Cc: Marcelo Ricardo Leitner &lt;mleitner@redhat.com&gt;
Cc: Florian Westphal &lt;fw@strlen.de&gt;
Cc: David S. Miller &lt;davem@davemloft.net&gt;
Cc: Thomas Jarosch &lt;thomas.jarosch@intra2net.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>printk: rename printk_sched to printk_deferred</title>
<updated>2014-08-07T19:00:10Z</updated>
<author>
<name>John Stultz</name>
<email>john.stultz@linaro.org</email>
</author>
<published>2014-06-04T23:11:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f73ff697833654ee578b606ea746d15dc1220aab'/>
<id>urn:sha1:f73ff697833654ee578b606ea746d15dc1220aab</id>
<content type='text'>
commit aac74dc495456412c4130a1167ce4beb6c1f0b38 upstream.

After learning we'll need some sort of deferred printk functionality in
the timekeeping core, Peter suggested we rename the printk_sched function
so it can be reused by the subsystems that need it.

This only changes the function name. No logic changes.

Signed-off-by: John Stultz &lt;john.stultz@linaro.org&gt;
Reviewed-by: Steven Rostedt &lt;rostedt@goodmis.org&gt;
Cc: Jan Kara &lt;jack@suse.cz&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Jiri Bohac &lt;jbohac@suse.cz&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
