<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/skbuff.h, branch v3.16.42</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.16.42</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.16.42'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2016-03-24T10:01:13Z</updated>
<entry>
<title>mld, igmp: Fix reserved tailroom calculation</title>
<updated>2016-03-24T10:01:13Z</updated>
<author>
<name>Benjamin Poirier</name>
<email>bpoirier@suse.com</email>
</author>
<published>2016-02-29T23:03:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=842ab65a6030186998cf4f2845a34fe139775a32'/>
<id>urn:sha1:842ab65a6030186998cf4f2845a34fe139775a32</id>
<content type='text'>
commit 1837b2e2bcd23137766555a63867e649c0b637f0 upstream.

The current reserved_tailroom calculation fails to take hlen and tlen into
account.

skb:
[__hlen__|__data____________|__tlen___|__extra__]
^                                               ^
head                                            skb_end_offset

In this representation, hlen + data + tlen is the size passed to alloc_skb.
"extra" is the extra space made available in __alloc_skb because of
rounding up by kmalloc. We can reorder the representation like so:

[__hlen__|__data____________|__extra__|__tlen___]
^                                               ^
head                                            skb_end_offset

The maximum space available for ip headers and payload without
fragmentation is min(mtu, data + extra). Therefore,
reserved_tailroom
= data + extra + tlen - min(mtu, data + extra)
= skb_end_offset - hlen - min(mtu, skb_end_offset - hlen - tlen)
= skb_tailroom - min(mtu, skb_tailroom - tlen) ; after skb_reserve(hlen)

Compare the second line to the current expression:
reserved_tailroom = skb_end_offset - min(mtu, skb_end_offset)
and we can see that hlen and tlen are not taken into account.

The min() in the third line can be expanded into:
if mtu &lt; skb_tailroom - tlen:
	reserved_tailroom = skb_tailroom - mtu
else:
	reserved_tailroom = tlen

Depending on hlen, tlen, mtu and the number of multicast address records,
the current code may output skbs that have less tailroom than
dev-&gt;needed_tailroom or it may output more skbs than needed because not all
space available is used.

Fixes: 4c672e4b ("ipv6: mld: fix add_grhead skb_over_panic for devs with large MTUs")
Signed-off-by: Benjamin Poirier &lt;bpoirier@suse.com&gt;
Acked-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
</entry>
<entry>
<title>net:Add sysctl_max_skb_frags</title>
<updated>2016-03-24T10:00:44Z</updated>
<author>
<name>Hans Westgaard Ry</name>
<email>hans.westgaard.ry@oracle.com</email>
</author>
<published>2016-02-03T08:26:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1f3c2130941ac194a3f78e80c65cc73b64c2a2df'/>
<id>urn:sha1:1f3c2130941ac194a3f78e80c65cc73b64c2a2df</id>
<content type='text'>
commit 5f74f82ea34c0da80ea0b49192bb5ea06e063593 upstream.

Devices may have limits on the number of fragments in an skb they support.
Current codebase uses a constant as maximum for number of fragments one
skb can hold and use.
When enabling scatter/gather and running traffic with many small messages
the codebase uses the maximum number of fragments and may thereby violate
the max for certain devices.
The patch introduces a global variable as max number of fragments.

Signed-off-by: Hans Westgaard Ry &lt;hans.westgaard.ry@oracle.com&gt;
Reviewed-by: Håkon Bugge &lt;haakon.bugge@oracle.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
[ luis: backported to 3.16: adjusted context ]
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
</entry>
<entry>
<title>Revert "[stable-only] net: add length argument to skb_copy_and_csum_datagram_iovec"</title>
<updated>2016-01-25T10:44:07Z</updated>
<author>
<name>Luis Henriques</name>
<email>luis.henriques@canonical.com</email>
</author>
<published>2016-01-25T10:30:21Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=0920a7112e0d26d234c7c6c0ad1777b914a72cda'/>
<id>urn:sha1:0920a7112e0d26d234c7c6c0ad1777b914a72cda</id>
<content type='text'>
This reverts commit fa89ae5548ed282f0ceb4660b3b93e4e2ee875f3.

As reported by Michal Kubecek, the backport of commit 89c22d8c3b27
("net: Fix skb csum races when peeking") exposed an upstream bug.
It is being reverted and replaced by a better fix.

Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
</entry>
<entry>
<title>[stable-only] net: add length argument to skb_copy_and_csum_datagram_iovec</title>
<updated>2015-10-28T10:33:22Z</updated>
<author>
<name>Sabrina Dubroca</name>
<email>sd@queasysnail.net</email>
</author>
<published>2015-10-15T12:25:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fa89ae5548ed282f0ceb4660b3b93e4e2ee875f3'/>
<id>urn:sha1:fa89ae5548ed282f0ceb4660b3b93e4e2ee875f3</id>
<content type='text'>
Without this length argument, we can read past the end of the iovec in
memcpy_toiovec because we have no way of knowing the total length of the
iovec's buffers.

This is needed for stable kernels where 89c22d8c3b27 ("net: Fix skb
csum races when peeking") has been backported but that don't have the
ioviter conversion, which is almost all the stable trees &lt;= 3.18.

This also fixes a kernel crash for NFS servers when the client uses
 -onfsvers=3,proto=udp to mount the export.

Signed-off-by: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
Reviewed-by: Hannes Frederic Sowa &lt;hannes@stressinduktion.org&gt;
[ luis: backported to 3.16:
  - dropped changes to net/rxrpc/ar-recvmsg.c ]
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
</entry>
<entry>
<title>skbuff: Fix skb checksum partial check.</title>
<updated>2015-10-28T10:33:18Z</updated>
<author>
<name>Pravin B Shelar</name>
<email>pshelar@nicira.com</email>
</author>
<published>2015-09-29T00:24:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fc3befaacb2c900d239c65562f253b95344448b0'/>
<id>urn:sha1:fc3befaacb2c900d239c65562f253b95344448b0</id>
<content type='text'>
commit 31b33dfb0a144469dd805514c9e63f4993729a48 upstream.

Earlier patch 6ae459bda tried to detect void ckecksum partial
skb by comparing pull length to checksum offset. But it does
not work for all cases since checksum-offset depends on
updates to skb-&gt;data.

Following patch fixes it by validating checksum start offset
after skb-data pointer is updated. Negative value of checksum
offset start means there is no need to checksum.

Fixes: 6ae459bda ("skbuff: Fix skb checksum flag on skb pull")
Reported-by: Andrew Vagin &lt;avagin@odin.com&gt;
Signed-off-by: Pravin B Shelar &lt;pshelar@nicira.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
</entry>
<entry>
<title>skbuff: Fix skb checksum flag on skb pull</title>
<updated>2015-10-28T10:33:18Z</updated>
<author>
<name>Pravin B Shelar</name>
<email>pshelar@nicira.com</email>
</author>
<published>2015-09-22T19:57:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a5162a04e7e2d84d133dde460f7e84c4a70656a9'/>
<id>urn:sha1:a5162a04e7e2d84d133dde460f7e84c4a70656a9</id>
<content type='text'>
commit 6ae459bdaaeebc632b16e54dcbabb490c6931d61 upstream.

VXLAN device can receive skb with checksum partial. But the checksum
offset could be in outer header which is pulled on receive. This results
in negative checksum offset for the skb. Such skb can cause the assert
failure in skb_checksum_help(). Following patch fixes the bug by setting
checksum-none while pulling outer header.

Following is the kernel panic msg from old kernel hitting the bug.

------------[ cut here ]------------
kernel BUG at net/core/dev.c:1906!
RIP: 0010:[&lt;ffffffff81518034&gt;] skb_checksum_help+0x144/0x150
Call Trace:
&lt;IRQ&gt;
[&lt;ffffffffa0164c28&gt;] queue_userspace_packet+0x408/0x470 [openvswitch]
[&lt;ffffffffa016614d&gt;] ovs_dp_upcall+0x5d/0x60 [openvswitch]
[&lt;ffffffffa0166236&gt;] ovs_dp_process_packet_with_key+0xe6/0x100 [openvswitch]
[&lt;ffffffffa016629b&gt;] ovs_dp_process_received_packet+0x4b/0x80 [openvswitch]
[&lt;ffffffffa016c51a&gt;] ovs_vport_receive+0x2a/0x30 [openvswitch]
[&lt;ffffffffa0171383&gt;] vxlan_rcv+0x53/0x60 [openvswitch]
[&lt;ffffffffa01734cb&gt;] vxlan_udp_encap_recv+0x8b/0xf0 [openvswitch]
[&lt;ffffffff8157addc&gt;] udp_queue_rcv_skb+0x2dc/0x3b0
[&lt;ffffffff8157b56f&gt;] __udp4_lib_rcv+0x1cf/0x6c0
[&lt;ffffffff8157ba7a&gt;] udp_rcv+0x1a/0x20
[&lt;ffffffff8154fdbd&gt;] ip_local_deliver_finish+0xdd/0x280
[&lt;ffffffff81550128&gt;] ip_local_deliver+0x88/0x90
[&lt;ffffffff8154fa7d&gt;] ip_rcv_finish+0x10d/0x370
[&lt;ffffffff81550365&gt;] ip_rcv+0x235/0x300
[&lt;ffffffff8151ba1d&gt;] __netif_receive_skb+0x55d/0x620
[&lt;ffffffff8151c360&gt;] netif_receive_skb+0x80/0x90
[&lt;ffffffff81459935&gt;] virtnet_poll+0x555/0x6f0
[&lt;ffffffff8151cd04&gt;] net_rx_action+0x134/0x290
[&lt;ffffffff810683d8&gt;] __do_softirq+0xa8/0x210
[&lt;ffffffff8162fe6c&gt;] call_softirq+0x1c/0x30
[&lt;ffffffff810161a5&gt;] do_softirq+0x65/0xa0
[&lt;ffffffff810687be&gt;] irq_exit+0x8e/0xb0
[&lt;ffffffff81630733&gt;] do_IRQ+0x63/0xe0
[&lt;ffffffff81625f2e&gt;] common_interrupt+0x6e/0x6e

Reported-by: Anupam Chanda &lt;achanda@vmware.com&gt;
Signed-off-by: Pravin B Shelar &lt;pshelar@nicira.com&gt;
Acked-by: Tom Herbert &lt;tom@herbertland.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
</entry>
<entry>
<title>net: fix crash in build_skb()</title>
<updated>2015-04-29T10:39:51Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-04-24T23:05:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fcc9ecaa7dd216d68904bdf910334519fe6ae918'/>
<id>urn:sha1:fcc9ecaa7dd216d68904bdf910334519fe6ae918</id>
<content type='text'>
commit 2ea2f62c8bda242433809c7f4e9eae1c52c40bbe upstream.

When I added pfmemalloc support in build_skb(), I forgot netlink
was using build_skb() with a vmalloc() area.

In this patch I introduce __build_skb() for netlink use,
and build_skb() is a wrapper handling both skb-&gt;head_frag and
skb-&gt;pfmemalloc

This means netlink no longer has to hack skb-&gt;head_frag

[ 1567.700067] kernel BUG at arch/x86/mm/physaddr.c:26!
[ 1567.700067] invalid opcode: 0000 [#1] PREEMPT SMP KASAN
[ 1567.700067] Dumping ftrace buffer:
[ 1567.700067]    (ftrace buffer empty)
[ 1567.700067] Modules linked in:
[ 1567.700067] CPU: 9 PID: 16186 Comm: trinity-c182 Not tainted 4.0.0-next-20150424-sasha-00037-g4796e21 #2167
[ 1567.700067] task: ffff880127efb000 ti: ffff880246770000 task.ti: ffff880246770000
[ 1567.700067] RIP: __phys_addr (arch/x86/mm/physaddr.c:26 (discriminator 3))
[ 1567.700067] RSP: 0018:ffff8802467779d8  EFLAGS: 00010202
[ 1567.700067] RAX: 000041000ed8e000 RBX: ffffc9008ed8e000 RCX: 000000000000002c
[ 1567.700067] RDX: 0000000000000004 RSI: 0000000000000000 RDI: ffffffffb3fd6049
[ 1567.700067] RBP: ffff8802467779f8 R08: 0000000000000019 R09: ffff8801d0168000
[ 1567.700067] R10: ffff8801d01680c7 R11: ffffed003a02d019 R12: ffffc9000ed8e000
[ 1567.700067] R13: 0000000000000f40 R14: 0000000000001180 R15: ffffc9000ed8e000
[ 1567.700067] FS:  00007f2a7da3f700(0000) GS:ffff8801d1000000(0000) knlGS:0000000000000000
[ 1567.700067] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1567.700067] CR2: 0000000000738308 CR3: 000000022e329000 CR4: 00000000000007e0
[ 1567.700067] Stack:
[ 1567.700067]  ffffc9000ed8e000 ffff8801d0168000 ffffc9000ed8e000 ffff8801d0168000
[ 1567.700067]  ffff880246777a28 ffffffffad7c0a21 0000000000001080 ffff880246777c08
[ 1567.700067]  ffff88060d302e68 ffff880246777b58 ffff880246777b88 ffffffffad9a6821
[ 1567.700067] Call Trace:
[ 1567.700067] build_skb (include/linux/mm.h:508 net/core/skbuff.c:316)
[ 1567.700067] netlink_sendmsg (net/netlink/af_netlink.c:1633 net/netlink/af_netlink.c:2329)
[ 1567.774369] ? sched_clock_cpu (kernel/sched/clock.c:311)
[ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273)
[ 1567.774369] ? netlink_unicast (net/netlink/af_netlink.c:2273)
[ 1567.774369] sock_sendmsg (net/socket.c:614 net/socket.c:623)
[ 1567.774369] sock_write_iter (net/socket.c:823)
[ 1567.774369] ? sock_sendmsg (net/socket.c:806)
[ 1567.774369] __vfs_write (fs/read_write.c:479 fs/read_write.c:491)
[ 1567.774369] ? get_lock_stats (kernel/locking/lockdep.c:249)
[ 1567.774369] ? default_llseek (fs/read_write.c:487)
[ 1567.774369] ? vtime_account_user (kernel/sched/cputime.c:701)
[ 1567.774369] ? rw_verify_area (fs/read_write.c:406 (discriminator 4))
[ 1567.774369] vfs_write (fs/read_write.c:539)
[ 1567.774369] SyS_write (fs/read_write.c:586 fs/read_write.c:577)
[ 1567.774369] ? SyS_read (fs/read_write.c:577)
[ 1567.774369] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[ 1567.774369] ? trace_hardirqs_on_caller (kernel/locking/lockdep.c:2594 kernel/locking/lockdep.c:2636)
[ 1567.774369] ? trace_hardirqs_on_thunk (arch/x86/lib/thunk_64.S:42)
[ 1567.774369] system_call_fastpath (arch/x86/kernel/entry_64.S:261)

Fixes: 79930f5892e ("net: do not deplete pfmemalloc reserve")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: Sasha Levin &lt;sasha.levin@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
</entry>
<entry>
<title>net: add skb_checksum_complete_unset</title>
<updated>2015-04-29T10:32:04Z</updated>
<author>
<name>Tom Herbert</name>
<email>tom@herbertland.com</email>
</author>
<published>2015-04-20T21:10:04Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c0fd4b0ef36f8ccbf6ddb953b5fa69af79b65c6c'/>
<id>urn:sha1:c0fd4b0ef36f8ccbf6ddb953b5fa69af79b65c6c</id>
<content type='text'>
commit 4e18b9adf2f910ec4d30b811a74a5b626e6c6125 upstream.

This function changes ip_summed to CHECKSUM_NONE if CHECKSUM_COMPLETE
is set. This is called to discard checksum-complete when packet
is being modified and checksum is not pulled for headers in a layer.

Signed-off-by: Tom Herbert &lt;tom@herbertland.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Luis Henriques &lt;luis.henriques@canonical.com&gt;
</content>
</entry>
<entry>
<title>net: Always untag vlan-tagged traffic on input.</title>
<updated>2014-10-15T10:05:26Z</updated>
<author>
<name>Vlad Yasevich</name>
<email>vyasevic@redhat.com</email>
</author>
<published>2014-08-08T18:42:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=84beb1a9999475ea1d25629345119da137474853'/>
<id>urn:sha1:84beb1a9999475ea1d25629345119da137474853</id>
<content type='text'>
[ Upstream commit 0d5501c1c828fb97d02af50aa9d2b1a5498b94e4 ]

Currently the functionality to untag traffic on input resides
as part of the vlan module and is build only when VLAN support
is enabled in the kernel.  When VLAN is disabled, the function
vlan_untag() turns into a stub and doesn't really untag the
packets.  This seems to create an interesting interaction
between VMs supporting checksum offloading and some network drivers.

There are some drivers that do not allow the user to change
tx-vlan-offload feature of the driver.  These drivers also seem
to assume that any VLAN-tagged traffic they transmit will
have the vlan information in the vlan_tci and not in the vlan
header already in the skb.  When transmitting skbs that already
have tagged data with partial checksum set, the checksum doesn't
appear to be updated correctly by the card thus resulting in a
failure to establish TCP connections.

The following is a packet trace taken on the receiver where a
sender is a VM with a VLAN configued.  The host VM is running on
doest not have VLAN support and the outging interface on the
host is tg3:
10:12:43.503055 52:54:00:ae:42:3f &gt; 28:d2:44:7d:c2:de, ethertype 802.1Q
(0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27243,
offset 0, flags [DF], proto TCP (6), length 60)
    10.0.100.1.58545 &gt; 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect
-&gt; 0x48d9), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val
4294837885 ecr 0,nop,wscale 7], length 0
10:12:44.505556 52:54:00:ae:42:3f &gt; 28:d2:44:7d:c2:de, ethertype 802.1Q
(0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27244,
offset 0, flags [DF], proto TCP (6), length 60)
    10.0.100.1.58545 &gt; 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect
-&gt; 0x44ee), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val
4294838888 ecr 0,nop,wscale 7], length 0

This connection finally times out.

I've only access to the TG3 hardware in this configuration thus have
only tested this with TG3 driver.  There are a lot of other drivers
that do not permit user changes to vlan acceleration features, and
I don't know if they all suffere from a similar issue.

The patch attempt to fix this another way.  It moves the vlan header
stipping code out of the vlan module and always builds it into the
kernel network core.  This way, even if vlan is not supported on
a virtualizatoin host, the virtual machines running on top of such
host will still work with VLANs enabled.

CC: Patrick McHardy &lt;kaber@trash.net&gt;
CC: Nithin Nayak Sujir &lt;nsujir@broadcom.com&gt;
CC: Michael Chan &lt;mchan@broadcom.com&gt;
CC: Jiri Pirko &lt;jiri@resnulli.us&gt;
Signed-off-by: Vladislav Yasevich &lt;vyasevic@redhat.com&gt;
Acked-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: add skb_pop_rcv_encapsulation</title>
<updated>2014-06-15T08:00:50Z</updated>
<author>
<name>Tom Herbert</name>
<email>therbert@google.com</email>
</author>
<published>2014-06-15T06:24:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e5eb4e30a51236079fb22bb9f75fcd31915b03c6'/>
<id>urn:sha1:e5eb4e30a51236079fb22bb9f75fcd31915b03c6</id>
<content type='text'>
This function is used by UDP encapsulation protocols in RX when
crossing encapsulation boundary. If ip_summed is set to
CHECKSUM_UNNECESSARY and encapsulation is not set, change to
CHECKSUM_NONE since the checksum has not been validated within the
encapsulation. Clears csum_valid by the same rationale.

Signed-off-by: Tom Herbert &lt;therbert@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
