<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/net/core, branch v3.10.68</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.10.68</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.10.68'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2014-12-16T17:09:42Z</updated>
<entry>
<title>rtnetlink: release net refcnt on error in do_setlink()</title>
<updated>2014-12-16T17:09:42Z</updated>
<author>
<name>Nicolas Dichtel</name>
<email>nicolas.dichtel@6wind.com</email>
</author>
<published>2014-11-27T09:16:15Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1973ac37c15ab85b170aa166b6192f45c03a3a5d'/>
<id>urn:sha1:1973ac37c15ab85b170aa166b6192f45c03a3a5d</id>
<content type='text'>
[ Upstream commit e0ebde0e131b529fd721b24f62872def5ec3718c ]

rtnl_link_get_net() holds a reference on the 'struct net', we need to release
it in case of error.

CC: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Fixes: b51642f6d77b ("net: Enable a userns root rtnl calls that are safe for unprivilged users")
Signed-off-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Reviewed-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>iovec: make sure the caller actually wants anything in memcpy_fromiovecend</title>
<updated>2014-08-14T01:24:15Z</updated>
<author>
<name>Sasha Levin</name>
<email>sasha.levin@oracle.com</email>
</author>
<published>2014-08-01T03:00:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9d868b94a6f52a17a2a94b34e544800d741b7852'/>
<id>urn:sha1:9d868b94a6f52a17a2a94b34e544800d741b7852</id>
<content type='text'>
[ Upstream commit 06ebb06d49486676272a3c030bfeef4bd969a8e6 ]

Check for cases when the caller requests 0 bytes instead of running off
and dereferencing potentially invalid iovecs.

Signed-off-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: Correctly set segment mac_len in skb_segment().</title>
<updated>2014-08-14T01:24:15Z</updated>
<author>
<name>Vlad Yasevich</name>
<email>vyasevic@redhat.com</email>
</author>
<published>2014-07-31T14:33:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c290a4ef1fe38bd8d29573468a189e49dbaa0681'/>
<id>urn:sha1:c290a4ef1fe38bd8d29573468a189e49dbaa0681</id>
<content type='text'>
[ Upstream commit fcdfe3a7fa4cb74391d42b6a26dc07c20dab1d82 ]

When performing segmentation, the mac_len value is copied right
out of the original skb.  However, this value is not always set correctly
(like when the packet is VLAN-tagged) and we'll end up copying a bad
value.

One way to demonstrate this is to configure a VM which tags
packets internally and turn off VLAN acceleration on the forwarding
bridge port.  The packets show up corrupt like this:
16:18:24.985548 52:54:00:ab:be:25 &gt; 52:54:00:26:ce:a3, ethertype 802.1Q
(0x8100), length 1518: vlan 100, p 0, ethertype 0x05e0,
        0x0000:  8cdb 1c7c 8cdb 0064 4006 b59d 0a00 6402 ...|...d@.....d.
        0x0010:  0a00 6401 9e0d b441 0a5e 64ec 0330 14fa ..d....A.^d..0..
        0x0020:  29e3 01c9 f871 0000 0101 080a 000a e833)....q.........3
        0x0030:  000f 8c75 6e65 7470 6572 6600 6e65 7470 ...unetperf.netp
        0x0040:  6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
        0x0050:  6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
        0x0060:  6572 6600 6e65 7470 6572 6600 6e65 7470 erf.netperf.netp
        ...

This also leads to awful throughput as GSO packets are dropped and
cause retransmissions.

The solution is to set the mac_len using the values already available
in then new skb.  We've already adjusted all of the header offset, so we
might as well correctly figure out the mac_len using skb_reset_mac_len().
After this change, packets are segmented correctly and performance
is restored.

CC: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Vlad Yasevich &lt;vyasevic@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: sendmsg: fix NULL pointer dereference</title>
<updated>2014-08-14T01:24:15Z</updated>
<author>
<name>Andrey Ryabinin</name>
<email>ryabinin.a.a@gmail.com</email>
</author>
<published>2014-07-26T17:26:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=15229fa9d4588b2d0e91ee81954c3a4f3c30dcb8'/>
<id>urn:sha1:15229fa9d4588b2d0e91ee81954c3a4f3c30dcb8</id>
<content type='text'>
[ Upstream commit 40eea803c6b2cfaab092f053248cbeab3f368412 ]

Sasha's report:
	&gt; While fuzzing with trinity inside a KVM tools guest running the latest -next
	&gt; kernel with the KASAN patchset, I've stumbled on the following spew:
	&gt;
	&gt; [ 4448.949424] ==================================================================
	&gt; [ 4448.951737] AddressSanitizer: user-memory-access on address 0
	&gt; [ 4448.952988] Read of size 2 by thread T19638:
	&gt; [ 4448.954510] CPU: 28 PID: 19638 Comm: trinity-c76 Not tainted 3.16.0-rc4-next-20140711-sasha-00046-g07d3099-dirty #813
	&gt; [ 4448.956823]  ffff88046d86ca40 0000000000000000 ffff880082f37e78 ffff880082f37a40
	&gt; [ 4448.958233]  ffffffffb6e47068 ffff880082f37a68 ffff880082f37a58 ffffffffb242708d
	&gt; [ 4448.959552]  0000000000000000 ffff880082f37a88 ffffffffb24255b1 0000000000000000
	&gt; [ 4448.961266] Call Trace:
	&gt; [ 4448.963158] dump_stack (lib/dump_stack.c:52)
	&gt; [ 4448.964244] kasan_report_user_access (mm/kasan/report.c:184)
	&gt; [ 4448.965507] __asan_load2 (mm/kasan/kasan.c:352)
	&gt; [ 4448.966482] ? netlink_sendmsg (net/netlink/af_netlink.c:2339)
	&gt; [ 4448.967541] netlink_sendmsg (net/netlink/af_netlink.c:2339)
	&gt; [ 4448.968537] ? get_parent_ip (kernel/sched/core.c:2555)
	&gt; [ 4448.970103] sock_sendmsg (net/socket.c:654)
	&gt; [ 4448.971584] ? might_fault (mm/memory.c:3741)
	&gt; [ 4448.972526] ? might_fault (./arch/x86/include/asm/current.h:14 mm/memory.c:3740)
	&gt; [ 4448.973596] ? verify_iovec (net/core/iovec.c:64)
	&gt; [ 4448.974522] ___sys_sendmsg (net/socket.c:2096)
	&gt; [ 4448.975797] ? put_lock_stats.isra.13 (./arch/x86/include/asm/preempt.h:98 kernel/locking/lockdep.c:254)
	&gt; [ 4448.977030] ? lock_release_holdtime (kernel/locking/lockdep.c:273)
	&gt; [ 4448.978197] ? lock_release_non_nested (kernel/locking/lockdep.c:3434 (discriminator 1))
	&gt; [ 4448.979346] ? check_chain_key (kernel/locking/lockdep.c:2188)
	&gt; [ 4448.980535] __sys_sendmmsg (net/socket.c:2181)
	&gt; [ 4448.981592] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2600)
	&gt; [ 4448.982773] ? trace_hardirqs_on (kernel/locking/lockdep.c:2607)
	&gt; [ 4448.984458] ? syscall_trace_enter (arch/x86/kernel/ptrace.c:1500 (discriminator 2))
	&gt; [ 4448.985621] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2600)
	&gt; [ 4448.986754] SyS_sendmmsg (net/socket.c:2201)
	&gt; [ 4448.987708] tracesys (arch/x86/kernel/entry_64.S:542)
	&gt; [ 4448.988929] ==================================================================

This reports means that we've come to netlink_sendmsg() with msg-&gt;msg_name == NULL and msg-&gt;msg_namelen &gt; 0.

After this report there was no usual "Unable to handle kernel NULL pointer dereference"
and this gave me a clue that address 0 is mapped and contains valid socket address structure in it.

This bug was introduced in f3d3342602f8bcbf37d7c46641cb9bca7618eb1c
(net: rework recvmsg handler msg_name and msg_namelen logic).
Commit message states that:
	"Set msg-&gt;msg_name = NULL if user specified a NULL in msg_name but had a
	 non-null msg_namelen in verify_iovec/verify_compat_iovec. This doesn't
	 affect sendto as it would bail out earlier while trying to copy-in the
	 address."
But in fact this affects sendto when address 0 is mapped and contains
socket address structure in it. In such case copy-in address will succeed,
verify_iovec() function will successfully exit with msg-&gt;msg_namelen &gt; 0
and msg-&gt;msg_name == NULL.

This patch fixes it by setting msg_namelen to 0 if msg_name == NULL.

Cc: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: &lt;stable@vger.kernel.org&gt;
Reported-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Signed-off-by: Andrey Ryabinin &lt;a.ryabinin@samsung.com&gt;
Acked-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>inetpeer: get rid of ip_id_count</title>
<updated>2014-08-14T01:24:15Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2014-06-02T12:26:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ff1f69a89a613223c57c13190a6c9be928ac4b9d'/>
<id>urn:sha1:ff1f69a89a613223c57c13190a6c9be928ac4b9d</id>
<content type='text'>
[ Upstream commit 73f156a6e8c1074ac6327e0abd1169e95eb66463 ]

Ideally, we would need to generate IP ID using a per destination IP
generator.

linux kernels used inet_peer cache for this purpose, but this had a huge
cost on servers disabling MTU discovery.

1) each inet_peer struct consumes 192 bytes

2) inetpeer cache uses a binary tree of inet_peer structs,
   with a nominal size of ~66000 elements under load.

3) lookups in this tree are hitting a lot of cache lines, as tree depth
   is about 20.

4) If server deals with many tcp flows, we have a high probability of
   not finding the inet_peer, allocating a fresh one, inserting it in
   the tree with same initial ip_id_count, (cf secure_ip_id())

5) We garbage collect inet_peer aggressively.

IP ID generation do not have to be 'perfect'

Goal is trying to avoid duplicates in a short period of time,
so that reassembly units have a chance to complete reassembly of
fragments belonging to one message before receiving other fragments
with a recycled ID.

We simply use an array of generators, and a Jenkin hash using the dst IP
as a key.

ipv6_select_ident() is put back into net/ipv6/ip6_output.c where it
belongs (it is only used from this file)

secure_ip_id() and secure_ipv6_id() no longer are needed.

Rename ip_select_ident_more() to ip_select_ident_segs() to avoid
unnecessary decrement/increment of the number of segments.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ipv4: fix dst race in sk_dst_get()</title>
<updated>2014-07-28T15:00:04Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2014-06-24T17:05:11Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=86e48c03d774e01ccd71ecba4fc4b5c2bc0b5b41'/>
<id>urn:sha1:86e48c03d774e01ccd71ecba4fc4b5c2bc0b5b41</id>
<content type='text'>
[ Upstream commit f88649721268999bdff09777847080a52004f691 ]

When IP route cache had been removed in linux-3.6, we broke assumption
that dst entries were all freed after rcu grace period. DST_NOCACHE
dst were supposed to be freed from dst_release(). But it appears
we want to keep such dst around, either in UDP sockets or tunnels.

In sk_dst_get() we need to make sure dst refcount is not 0
before incrementing it, or else we might end up freeing a dst
twice.

DST_NOCACHE set on a dst does not mean this dst can not be attached
to a socket or a tunnel.

Then, before actual freeing, we need to observe a rcu grace period
to make sure all other cpus can catch the fact the dst is no longer
usable.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Dormando &lt;dormando@rydia.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>skbuff: skb_segment: orphan frags before copying</title>
<updated>2014-07-01T03:09:45Z</updated>
<author>
<name>Michael S. Tsirkin</name>
<email>mst@redhat.com</email>
</author>
<published>2014-03-10T17:28:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b124695c12cb8c4dc0f91cc18eac640f546c6456'/>
<id>urn:sha1:b124695c12cb8c4dc0f91cc18eac640f546c6456</id>
<content type='text'>
commit 1fd819ecb90cc9b822cd84d3056ddba315d3340f upstream.

skb_segment copies frags around, so we need
to copy them carefully to avoid accessing
user memory after reporting completion to userspace
through a callback.

skb_segment doesn't normally happen on datapath:
TSO needs to be disabled - so disabling zero copy
in this case does not look like a big deal.

Signed-off-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Acked-by: Herbert Xu &lt;herbert@gondor.apana.org.au&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
[bwh: Backported to 3.2.  As skb_segment() only supports page-frags *or* a
  frag list, there is no need for the additional frag_skb pointer or the
  preparatory renaming.]
Signed-off-by: Ben Hutchings &lt;ben@decadent.org.uk&gt;
Cc: Eddie Chapman &lt;eddie@ehuk.net&gt; # backported to 3.10
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>rtnetlink: fix userspace API breakage for iproute2 &lt; v3.9.0</title>
<updated>2014-06-26T19:12:38Z</updated>
<author>
<name>Michal Schmidt</name>
<email>mschmidt@redhat.com</email>
</author>
<published>2014-05-28T12:15:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fb0d10bff97d0016a8305d6102699a93afe5011a'/>
<id>urn:sha1:fb0d10bff97d0016a8305d6102699a93afe5011a</id>
<content type='text'>
[ Upstream commit e5eca6d41f53db48edd8cf88a3f59d2c30227f8e ]

When running RHEL6 userspace on a current upstream kernel, "ip link"
fails to show VF information.

The reason is a kernel&lt;-&gt;userspace API change introduced by commit
88c5b5ce5cb57 ("rtnetlink: Call nlmsg_parse() with correct header length"),
after which the kernel does not see iproute2's IFLA_EXT_MASK attribute
in the netlink request.

iproute2 adjusted for the API change in its commit 63338dca4513
("libnetlink: Use ifinfomsg instead of rtgenmsg in rtnl_wilddump_req_filter").

The problem has been noticed before:
http://marc.info/?l=linux-netdev&amp;m=136692296022182&amp;w=2
(Subject: Re: getting VF link info seems to be broken in 3.9-rc8)

We can do better than tell those with old userspace to upgrade. We can
recognize the old iproute2 in the kernel by checking the netlink message
length. Even when including the IFLA_EXT_MASK attribute, its netlink
message is shorter than struct ifinfomsg.

With this patch "ip link" shows VF information in both old and new
iproute2 versions.

Signed-off-by: Michal Schmidt &lt;mschmidt@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: force a list_del() in unregister_netdevice_many()</title>
<updated>2014-06-26T19:12:38Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2014-06-06T13:44:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6a827d8a67f750f3096e668657e16d479363d4b5'/>
<id>urn:sha1:6a827d8a67f750f3096e668657e16d479363d4b5</id>
<content type='text'>
[ Upstream commit 87757a917b0b3c0787e0563c679762152be81312 ]

unregister_netdevice_many() API is error prone and we had too
many bugs because of dangling LIST_HEAD on stacks.

See commit f87e6f47933e3e ("net: dont leave active on stack LIST_HEAD")

In fact, instead of making sure no caller leaves an active list_head,
just force a list_del() in the callee. No one seems to need to access
the list after unregister_netdevice_many()

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: Use netlink_ns_capable to verify the permisions of netlink messages</title>
<updated>2014-06-26T19:12:37Z</updated>
<author>
<name>Eric W. Biederman</name>
<email>ebiederm@xmission.com</email>
</author>
<published>2014-04-23T21:29:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1141a455802884d3bcbcf6b30e1d65d09cf286e1'/>
<id>urn:sha1:1141a455802884d3bcbcf6b30e1d65d09cf286e1</id>
<content type='text'>
[ Upstream commit 90f62cf30a78721641e08737bda787552428061e ]

It is possible by passing a netlink socket to a more privileged
executable and then to fool that executable into writing to the socket
data that happens to be valid netlink message to do something that
privileged executable did not intend to do.

To keep this from happening replace bare capable and ns_capable calls
with netlink_capable, netlink_net_calls and netlink_ns_capable calls.
Which act the same as the previous calls except they verify that the
opener of the socket had the desired permissions as well.

Reported-by: Andy Lutomirski &lt;luto@amacapital.net&gt;
Signed-off-by: "Eric W. Biederman" &lt;ebiederm@xmission.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
