<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/net/core/dev.c, branch v3.12.39</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.12.39</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.12.39'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2015-03-12T09:06:27Z</updated>
<entry>
<title>net: reject creation of netdev names with colons</title>
<updated>2015-03-12T09:06:27Z</updated>
<author>
<name>Matthew Thode</name>
<email>mthode@mthode.org</email>
</author>
<published>2015-02-18T00:31:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=eb83542b81fa8bdbc20a0e69edafedce19aeaaf5'/>
<id>urn:sha1:eb83542b81fa8bdbc20a0e69edafedce19aeaaf5</id>
<content type='text'>
[ Upstream commit a4176a9391868bfa87705bcd2e3b49e9b9dd2996 ]

colons are used as a separator in netdev device lookup in dev_ioctl.c

Specific functions are SIOCGIFTXQLEN SIOCETHTOOL SIOCSIFNAME

Signed-off-by: Matthew Thode &lt;mthode@mthode.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>net: rps: fix cpu unplug</title>
<updated>2015-02-10T10:16:48Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2015-01-16T01:04:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=488070c6dbd92ba5431f54fabb2101d09e3a018c'/>
<id>urn:sha1:488070c6dbd92ba5431f54fabb2101d09e3a018c</id>
<content type='text'>
[ Upstream commit ac64da0b83d82abe62f78b3d0e21cca31aea24fa ]

softnet_data.input_pkt_queue is protected by a spinlock that
we must hold when transferring packets from victim queue to an active
one. This is because other cpus could still be trying to enqueue packets
into victim queue.

A second problem is that when we transfert the NAPI poll_list from
victim to current cpu, we absolutely need to special case the percpu
backlog, because we do not want to add complex locking to protect
process_queue : Only owner cpu is allowed to manipulate it, unless cpu
is offline.

Based on initial patch from Prasad Sodagudi &amp; Subash Abhinov
Kasiviswanathan.

This version is better because we do not slow down packet processing,
only make migration safer.

Reported-by: Prasad Sodagudi &lt;psodagud@codeaurora.org&gt;
Reported-by: Subash Abhinov Kasiviswanathan &lt;subashab@codeaurora.org&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Tom Herbert &lt;therbert@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>net/core: Handle csum for CHECKSUM_COMPLETE VXLAN forwarding</title>
<updated>2015-01-26T13:39:19Z</updated>
<author>
<name>Jay Vosburgh</name>
<email>jay.vosburgh@canonical.com</email>
</author>
<published>2014-12-19T23:32:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e5924c9528cbe4baa4b13230addac2a7b52f4fe6'/>
<id>urn:sha1:e5924c9528cbe4baa4b13230addac2a7b52f4fe6</id>
<content type='text'>
[ Upstream commit 2c26d34bbcc0b3f30385d5587aa232289e2eed8e ]

When using VXLAN tunnels and a sky2 device, I have experienced
checksum failures of the following type:

[ 4297.761899] eth0: hw csum failure
[...]
[ 4297.765223] Call Trace:
[ 4297.765224]  &lt;IRQ&gt;  [&lt;ffffffff8172f026&gt;] dump_stack+0x46/0x58
[ 4297.765235]  [&lt;ffffffff8162ba52&gt;] netdev_rx_csum_fault+0x42/0x50
[ 4297.765238]  [&lt;ffffffff8161c1a0&gt;] ? skb_push+0x40/0x40
[ 4297.765240]  [&lt;ffffffff8162325c&gt;] __skb_checksum_complete+0xbc/0xd0
[ 4297.765243]  [&lt;ffffffff8168c602&gt;] tcp_v4_rcv+0x2e2/0x950
[ 4297.765246]  [&lt;ffffffff81666ca0&gt;] ? ip_rcv_finish+0x360/0x360

	These are reliably reproduced in a network topology of:

container:eth0 == host(OVS VXLAN on VLAN) == bond0 == eth0 (sky2) -&gt; switch

	When VXLAN encapsulated traffic is received from a similarly
configured peer, the above warning is generated in the receive
processing of the encapsulated packet.  Note that the warning is
associated with the container eth0.

        The skbs from sky2 have ip_summed set to CHECKSUM_COMPLETE, and
because the packet is an encapsulated Ethernet frame, the checksum
generated by the hardware includes the inner protocol and Ethernet
headers.

	The receive code is careful to update the skb-&gt;csum, except in
__dev_forward_skb, as called by dev_forward_skb.  __dev_forward_skb
calls eth_type_trans, which in turn calls skb_pull_inline(skb, ETH_HLEN)
to skip over the Ethernet header, but does not update skb-&gt;csum when
doing so.

	This patch resolves the problem by adding a call to
skb_postpull_rcsum to update the skb-&gt;csum after the call to
eth_type_trans.

Signed-off-by: Jay Vosburgh &lt;jay.vosburgh@canonical.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>net: Fix stacked vlan offload features computation</title>
<updated>2015-01-26T13:39:16Z</updated>
<author>
<name>Toshiaki Makita</name>
<email>makita.toshiaki@lab.ntt.co.jp</email>
</author>
<published>2014-12-22T10:04:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6e1a8eb4ed3f954059ffae5563710275bc63d4fa'/>
<id>urn:sha1:6e1a8eb4ed3f954059ffae5563710275bc63d4fa</id>
<content type='text'>
[ Upstream commit 796f2da81bead71ffc91ef70912cd8d1827bf756 ]

When vlan tags are stacked, it is very likely that the outer tag is stored
in skb-&gt;vlan_tci and skb-&gt;protocol shows the inner tag's vlan_proto.
Currently netif_skb_features() first looks at skb-&gt;protocol even if there
is the outer tag in vlan_tci, thus it incorrectly retrieves the protocol
encapsulated by the inner vlan instead of the inner vlan protocol.
This allows GSO packets to be passed to HW and they end up being
corrupted.

Fixes: 58e998c6d239 ("offloading: Force software GSO for multiple vlan tags.")
Signed-off-by: Toshiaki Makita &lt;makita.toshiaki@lab.ntt.co.jp&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>net: fix checksum features handling in netif_skb_features()</title>
<updated>2014-10-31T11:14:36Z</updated>
<author>
<name>Michal Kubeček</name>
<email>mkubecek@suse.cz</email>
</author>
<published>2014-08-25T13:16:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=297b3ddd679ac4e4958661df855595bb49c42a18'/>
<id>urn:sha1:297b3ddd679ac4e4958661df855595bb49c42a18</id>
<content type='text'>
commit db115037bb57cdfe97078b13da762213f7980e81 upstream.

This is follow-up to

  da08143b8520 ("vlan: more careful checksum features handling")

which introduced more careful feature intersection in vlan code,
taking into account that HW_CSUM should be considered superset
of IP_CSUM/IPV6_CSUM. The same is needed in netif_skb_features()
in order to avoid offloading mismatch warning when vlan is
created on top of a bond consisting of slaves supporting IP/IPv6
checksumming but not vlan Tx offloading.

Signed-off-by: Michal Kubecek &lt;mkubecek@suse.cz&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>net: Always untag vlan-tagged traffic on input.</title>
<updated>2014-10-17T07:43:10Z</updated>
<author>
<name>Vlad Yasevich</name>
<email>vyasevic@redhat.com</email>
</author>
<published>2014-08-08T18:42:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f55ef858b1beb467609805086a18a0e91857e577'/>
<id>urn:sha1:f55ef858b1beb467609805086a18a0e91857e577</id>
<content type='text'>
[ Upstream commit 0d5501c1c828fb97d02af50aa9d2b1a5498b94e4 ]

Currently the functionality to untag traffic on input resides
as part of the vlan module and is build only when VLAN support
is enabled in the kernel.  When VLAN is disabled, the function
vlan_untag() turns into a stub and doesn't really untag the
packets.  This seems to create an interesting interaction
between VMs supporting checksum offloading and some network drivers.

There are some drivers that do not allow the user to change
tx-vlan-offload feature of the driver.  These drivers also seem
to assume that any VLAN-tagged traffic they transmit will
have the vlan information in the vlan_tci and not in the vlan
header already in the skb.  When transmitting skbs that already
have tagged data with partial checksum set, the checksum doesn't
appear to be updated correctly by the card thus resulting in a
failure to establish TCP connections.

The following is a packet trace taken on the receiver where a
sender is a VM with a VLAN configued.  The host VM is running on
doest not have VLAN support and the outging interface on the
host is tg3:
10:12:43.503055 52:54:00:ae:42:3f &gt; 28:d2:44:7d:c2:de, ethertype 802.1Q
(0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27243,
offset 0, flags [DF], proto TCP (6), length 60)
    10.0.100.1.58545 &gt; 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect
-&gt; 0x48d9), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val
4294837885 ecr 0,nop,wscale 7], length 0
10:12:44.505556 52:54:00:ae:42:3f &gt; 28:d2:44:7d:c2:de, ethertype 802.1Q
(0x8100), length 78: vlan 100, p 0, ethertype IPv4, (tos 0x0, ttl 64, id 27244,
offset 0, flags [DF], proto TCP (6), length 60)
    10.0.100.1.58545 &gt; 10.0.100.10.ircu-2: Flags [S], cksum 0xdc39 (incorrect
-&gt; 0x44ee), seq 1069378582, win 29200, options [mss 1460,sackOK,TS val
4294838888 ecr 0,nop,wscale 7], length 0

This connection finally times out.

I've only access to the TG3 hardware in this configuration thus have
only tested this with TG3 driver.  There are a lot of other drivers
that do not permit user changes to vlan acceleration features, and
I don't know if they all suffere from a similar issue.

The patch attempt to fix this another way.  It moves the vlan header
stipping code out of the vlan module and always builds it into the
kernel network core.  This way, even if vlan is not supported on
a virtualizatoin host, the virtual machines running on top of such
host will still work with VLANs enabled.

CC: Patrick McHardy &lt;kaber@trash.net&gt;
CC: Nithin Nayak Sujir &lt;nsujir@broadcom.com&gt;
CC: Michael Chan &lt;mchan@broadcom.com&gt;
CC: Jiri Pirko &lt;jiri@resnulli.us&gt;
Signed-off-by: Vladislav Yasevich &lt;vyasevic@redhat.com&gt;
Acked-by: Jiri Pirko &lt;jiri@resnulli.us&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>net: Fix NETDEV_CHANGE notifier usage causing spurious arp flush</title>
<updated>2014-07-29T15:01:05Z</updated>
<author>
<name>Loic Prylli</name>
<email>loicp@google.com</email>
</author>
<published>2014-07-02T04:39:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4c824ea2e6272f897012dacbced560b9aa11fd46'/>
<id>urn:sha1:4c824ea2e6272f897012dacbced560b9aa11fd46</id>
<content type='text'>
[ Upstream commit 54951194656e4853e441266fd095f880bc0398f3 ]

A bug was introduced in NETDEV_CHANGE notifier sequence causing the
arp table to be sometimes spuriously cleared (including manual arp
entries marked permanent), upon network link carrier changes.

The changed argument for the notifier was applied only to a single
caller of NETDEV_CHANGE, missing among others netdev_state_change().
So upon net_carrier events induced by the network, which are
triggering a call to netdev_state_change(), arp_netdev_event() would
decide whether to clear or not arp cache based on random/junk stack
values (a kind of read buffer overflow).

Fixes: be9efd365328 ("net: pass changed flags along with NETDEV_CHANGE event")
Fixes: 6c8b4e3ff81b ("arp: flush arp cache on IFF_NOARP change")
Signed-off-by: Loic Prylli &lt;loicp@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>net: Do not enable tx-nocache-copy by default</title>
<updated>2014-06-27T08:25:16Z</updated>
<author>
<name>Benjamin Poirier</name>
<email>bpoirier@suse.de</email>
</author>
<published>2014-01-07T15:11:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9f6e089cb55bdc5a90fe6ec755a20941da9a0b3b'/>
<id>urn:sha1:9f6e089cb55bdc5a90fe6ec755a20941da9a0b3b</id>
<content type='text'>
commit cdb3f4a31b64c3a1c6eef40bc01ebc9594c58a8c upstream.

There are many cases where this feature does not improve performance or even
reduces it.

For example, here are the results from tests that I've run using 3.12.6 on one
Intel Xeon W3565 and one i7 920 connected by ixgbe adapters. The results are
from the Xeon, but they're similar on the i7. All numbers report the
mean±stddev over 10 runs of 10s.

1) latency tests similar to what is described in "c6e1a0d net: Allow no-cache
copy from user on transmit"
There is no statistically significant difference between tx-nocache-copy
on/off.
nic irqs spread out (one queue per cpu)

200x netperf -r 1400,1
tx-nocache-copy off
        692000±1000 tps
        50/90/95/99% latency (us): 275±2/643.8±0.4/799±1/2474.4±0.3
tx-nocache-copy on
        693000±1000 tps
        50/90/95/99% latency (us): 274±1/644.1±0.7/800±2/2474.5±0.7

200x netperf -r 14000,14000
tx-nocache-copy off
        86450±80 tps
        50/90/95/99% latency (us): 334.37±0.02/838±1/2100±20/3990±40
tx-nocache-copy on
        86110±60 tps
        50/90/95/99% latency (us): 334.28±0.01/837±2/2110±20/3990±20

2) single stream throughput tests
tx-nocache-copy leads to higher service demand

                        throughput  cpu0        cpu1        demand
                        (Gb/s)      (Gcycle)    (Gcycle)    (cycle/B)

nic irqs and netperf on cpu0 (1x netperf -T0,0 -t omni -- -d send)

tx-nocache-copy off     9402±5      9.4±0.2                 0.80±0.01
tx-nocache-copy on      9403±3      9.85±0.04               0.838±0.004

nic irqs on cpu0, netperf on cpu1 (1x netperf -T1,1 -t omni -- -d send)

tx-nocache-copy off     9401±5      5.83±0.03   5.0±0.1     0.923±0.007
tx-nocache-copy on      9404±2      5.74±0.03   5.523±0.009 0.958±0.002

As a second example, here are some results from Eric Dumazet with latest
net-next.
tx-nocache-copy also leads to higher service demand

(cpu is Intel(R) Xeon(R) CPU X5660  @ 2.80GHz)

lpq83:~# ./ethtool -K eth0 tx-nocache-copy on
lpq83:~# perf stat ./netperf -H lpq84 -c
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      9407.44   2.50     -1.00    0.522   -1.000

 Performance counter stats for './netperf -H lpq84 -c':

       4282.648396 task-clock                #    0.423 CPUs utilized
             9,348 context-switches          #    0.002 M/sec
                88 CPU-migrations            #    0.021 K/sec
               355 page-faults               #    0.083 K/sec
    11,812,797,651 cycles                    #    2.758 GHz                     [82.79%]
     9,020,522,817 stalled-cycles-frontend   #   76.36% frontend cycles idle    [82.54%]
     4,579,889,681 stalled-cycles-backend    #   38.77% backend  cycles idle    [67.33%]
     6,053,172,792 instructions              #    0.51  insns per cycle
                                             #    1.49  stalled cycles per insn [83.64%]
       597,275,583 branches                  #  139.464 M/sec                   [83.70%]
         8,960,541 branch-misses             #    1.50% of all branches         [83.65%]

      10.128990264 seconds time elapsed

lpq83:~# ./ethtool -K eth0 tx-nocache-copy off
lpq83:~# perf stat ./netperf -H lpq84 -c
MIGRATED TCP STREAM TEST from 0.0.0.0 (0.0.0.0) port 0 AF_INET to lpq84.prod.google.com () port 0 AF_INET
Recv   Send    Send                          Utilization       Service Demand
Socket Socket  Message  Elapsed              Send     Recv     Send    Recv
Size   Size    Size     Time     Throughput  local    remote   local   remote
bytes  bytes   bytes    secs.    10^6bits/s  % S      % U      us/KB   us/KB

 87380  16384  16384    10.00      9412.45   2.15     -1.00    0.449   -1.000

 Performance counter stats for './netperf -H lpq84 -c':

       2847.375441 task-clock                #    0.281 CPUs utilized
            11,632 context-switches          #    0.004 M/sec
                49 CPU-migrations            #    0.017 K/sec
               354 page-faults               #    0.124 K/sec
     7,646,889,749 cycles                    #    2.686 GHz                     [83.34%]
     6,115,050,032 stalled-cycles-frontend   #   79.97% frontend cycles idle    [83.31%]
     1,726,460,071 stalled-cycles-backend    #   22.58% backend  cycles idle    [66.55%]
     2,079,702,453 instructions              #    0.27  insns per cycle
                                             #    2.94  stalled cycles per insn [83.22%]
       363,773,213 branches                  #  127.757 M/sec                   [83.29%]
         4,242,732 branch-misses             #    1.17% of all branches         [83.51%]

      10.128449949 seconds time elapsed

CC: Tom Herbert &lt;therbert@google.com&gt;
Signed-off-by: Benjamin Poirier &lt;bpoirier@suse.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>net: force a list_del() in unregister_netdevice_many()</title>
<updated>2014-06-23T08:28:03Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2014-06-06T13:44:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cbaf35e5ba05b9ff2f43d798fe4082cca8151861'/>
<id>urn:sha1:cbaf35e5ba05b9ff2f43d798fe4082cca8151861</id>
<content type='text'>
[ Upstream commit 87757a917b0b3c0787e0563c679762152be81312 ]

unregister_netdevice_many() API is error prone and we had too
many bugs because of dangling LIST_HEAD on stacks.

See commit f87e6f47933e3e ("net: dont leave active on stack LIST_HEAD")

In fact, instead of making sure no caller leaves an active list_head,
just force a list_del() in the callee. No one seems to need to access
the list after unregister_netdevice_many()

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
<entry>
<title>net-gro: reset skb-&gt;truesize in napi_reuse_skb()</title>
<updated>2014-05-29T09:49:34Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2014-04-03T16:28:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ac993cf25a957b332864e3200644b50e179826fc'/>
<id>urn:sha1:ac993cf25a957b332864e3200644b50e179826fc</id>
<content type='text'>
[ Upstream commit e33d0ba8047b049c9262fdb1fcafb93cb52ceceb ]

Recycling skb always had been very tough...

This time it appears GRO layer can accumulate skb-&gt;truesize
adjustments made by drivers when they attach a fragment to skb.

skb_gro_receive() can only subtract from skb-&gt;truesize the used part
of a fragment.

I spotted this problem seeing TcpExtPruneCalled and
TcpExtTCPRcvCollapsed that were unexpected with a recent kernel, where
TCP receive window should be sized properly to accept traffic coming
from a driver not overshooting skb-&gt;truesize.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Jiri Slaby &lt;jslaby@suse.cz&gt;
</content>
</entry>
</feed>
