<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/net/core/dev.c, branch v4.20.5</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.20.5</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.20.5'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-12-06T00:22:05Z</updated>
<entry>
<title>net: use skb_list_del_init() to remove from RX sublists</title>
<updated>2018-12-06T00:22:05Z</updated>
<author>
<name>Edward Cree</name>
<email>ecree@solarflare.com</email>
</author>
<published>2018-12-04T17:37:57Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=22f6bbb7bcfcef0b373b0502a7ff390275c575dd'/>
<id>urn:sha1:22f6bbb7bcfcef0b373b0502a7ff390275c575dd</id>
<content type='text'>
list_del() leaves the skb-&gt;next pointer poisoned, which can then lead to
 a crash in e.g. OVS forwarding.  For example, setting up an OVS VXLAN
 forwarding bridge on sfc as per:

========
$ ovs-vsctl show
5dfd9c47-f04b-4aaa-aa96-4fbb0a522a30
    Bridge "br0"
        Port "br0"
            Interface "br0"
                type: internal
        Port "enp6s0f0"
            Interface "enp6s0f0"
        Port "vxlan0"
            Interface "vxlan0"
                type: vxlan
                options: {key="1", local_ip="10.0.0.5", remote_ip="10.0.0.4"}
    ovs_version: "2.5.0"
========
(where 10.0.0.5 is an address on enp6s0f1)
and sending traffic across it will lead to the following panic:
========
general protection fault: 0000 [#1] SMP PTI
CPU: 5 PID: 0 Comm: swapper/5 Not tainted 4.20.0-rc3-ehc+ #701
Hardware name: Dell Inc. PowerEdge R710/0M233H, BIOS 6.4.0 07/23/2013
RIP: 0010:dev_hard_start_xmit+0x38/0x200
Code: 53 48 89 fb 48 83 ec 20 48 85 ff 48 89 54 24 08 48 89 4c 24 18 0f 84 ab 01 00 00 48 8d 86 90 00 00 00 48 89 f5 48 89 44 24 10 &lt;4c&gt; 8b 33 48 c7 03 00 00 00 00 48 8b 05 c7 d1 b3 00 4d 85 f6 0f 95
RSP: 0018:ffff888627b437e0 EFLAGS: 00010202
RAX: 0000000000000000 RBX: dead000000000100 RCX: ffff88862279c000
RDX: ffff888614a342c0 RSI: 0000000000000000 RDI: 0000000000000000
RBP: ffff888618a88000 R08: 0000000000000001 R09: 00000000000003e8
R10: 0000000000000000 R11: ffff888614a34140 R12: 0000000000000000
R13: 0000000000000062 R14: dead000000000100 R15: ffff888616430000
FS:  0000000000000000(0000) GS:ffff888627b40000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f6d2bc6d000 CR3: 000000000200a000 CR4: 00000000000006e0
Call Trace:
 &lt;IRQ&gt;
 __dev_queue_xmit+0x623/0x870
 ? masked_flow_lookup+0xf7/0x220 [openvswitch]
 ? ep_poll_callback+0x101/0x310
 do_execute_actions+0xaba/0xaf0 [openvswitch]
 ? __wake_up_common+0x8a/0x150
 ? __wake_up_common_lock+0x87/0xc0
 ? queue_userspace_packet+0x31c/0x5b0 [openvswitch]
 ovs_execute_actions+0x47/0x120 [openvswitch]
 ovs_dp_process_packet+0x7d/0x110 [openvswitch]
 ovs_vport_receive+0x6e/0xd0 [openvswitch]
 ? dst_alloc+0x64/0x90
 ? rt_dst_alloc+0x50/0xd0
 ? ip_route_input_slow+0x19a/0x9a0
 ? __udp_enqueue_schedule_skb+0x198/0x1b0
 ? __udp4_lib_rcv+0x856/0xa30
 ? __udp4_lib_rcv+0x856/0xa30
 ? cpumask_next_and+0x19/0x20
 ? find_busiest_group+0x12d/0xcd0
 netdev_frame_hook+0xce/0x150 [openvswitch]
 __netif_receive_skb_core+0x205/0xae0
 __netif_receive_skb_list_core+0x11e/0x220
 netif_receive_skb_list+0x203/0x460
 ? __efx_rx_packet+0x335/0x5e0 [sfc]
 efx_poll+0x182/0x320 [sfc]
 net_rx_action+0x294/0x3c0
 __do_softirq+0xca/0x297
 irq_exit+0xa6/0xb0
 do_IRQ+0x54/0xd0
 common_interrupt+0xf/0xf
 &lt;/IRQ&gt;
========
So, in all listified-receive handling, instead pull skbs off the lists with
 skb_list_del_init().

Fixes: 9af86f933894 ("net: core: fix use-after-free in __netif_receive_skb_list_core")
Fixes: 7da517a3bc52 ("net: core: Another step of skb receive list processing")
Fixes: a4ca8b7df73c ("net: ipv4: fix drop handling in ip_list_rcv() and ip_list_rcv_finish()")
Fixes: d8269e2cbf90 ("net: ipv6: listify ipv6_rcv() and ip6_rcv_finish()")
Signed-off-by: Edward Cree &lt;ecree@solarflare.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net/core: tidy up an error message</title>
<updated>2018-12-04T00:14:51Z</updated>
<author>
<name>Qian Cai</name>
<email>cai@gmx.us</email>
</author>
<published>2018-12-02T02:11:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bf29e9e9b6d2f09cdbf39b48d028f0b49e944f85'/>
<id>urn:sha1:bf29e9e9b6d2f09cdbf39b48d028f0b49e944f85</id>
<content type='text'>
netif_napi_add() could report an error like this below due to it allows
to pass a format string for wildcarding before calling
dev_get_valid_name(),

"netif_napi_add() called with weight 256 on device eth%d"

For example, hns_enet_drv module does this.

hns_nic_try_get_ae
  hns_nic_init_ring_data
    netif_napi_add
  register_netdev
    dev_get_valid_name

Hence, make it a bit more human-readable by using netdev_err_once()
instead.

Signed-off-by: Qian Cai &lt;cai@gmx.us&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: fix XPS static_key accounting</title>
<updated>2018-11-29T19:06:08Z</updated>
<author>
<name>Sabrina Dubroca</name>
<email>sd@queasysnail.net</email>
</author>
<published>2018-11-29T13:14:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=867d0ad476db89a1e8af3f297af402399a54eea5'/>
<id>urn:sha1:867d0ad476db89a1e8af3f297af402399a54eea5</id>
<content type='text'>
Commit 04157469b7b8 ("net: Use static_key for XPS maps") introduced a
static key for XPS, but the increments/decrements don't match.

First, the static key's counter is incremented once for each queue, but
only decremented once for a whole batch of queues, leading to large
unbalances.

Second, the xps_rxqs_needed key is decremented whenever we reset a batch
of queues, whether they had any rxqs mapping or not, so that if we setup
cpu-XPS on em1 and RXQS-XPS on em2, resetting the queues on em1 would
decrement the xps_rxqs_needed key.

This reworks the accounting scheme so that the xps_needed key is
incremented only once for each type of XPS for all the queues on a
device, and the xps_rxqs_needed key is incremented only once for all
queues. This is sufficient to let us retrieve queues via
get_xps_queue().

This patch introduces a new reset_xps_maps(), which reinitializes and
frees the appropriate map (xps_rxqs_map or xps_cpus_map), and drops a
reference to the needed keys:
 - both xps_needed and xps_rxqs_needed, in case of rxqs maps,
 - only xps_needed, in case of CPU maps.

Now, we also need to call reset_xps_maps() at the end of
__netif_set_xps_queue() when there's no active map left, for example
when writing '00000000,00000000' to all queues' xps_rxqs setting.

Fixes: 04157469b7b8 ("net: Use static_key for XPS maps")
Signed-off-by: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: restore call to netdev_queue_numa_node_write when resetting XPS</title>
<updated>2018-11-29T19:06:08Z</updated>
<author>
<name>Sabrina Dubroca</name>
<email>sd@queasysnail.net</email>
</author>
<published>2018-11-29T13:14:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f28c020fb488e1a8b87469812017044bef88aa2b'/>
<id>urn:sha1:f28c020fb488e1a8b87469812017044bef88aa2b</id>
<content type='text'>
Before commit 80d19669ecd3 ("net: Refactor XPS for CPUs and Rx queues"),
netif_reset_xps_queues() did netdev_queue_numa_node_write() for all the
queues being reset. Now, this is only done when the "active" variable in
clean_xps_maps() is false, ie when on all the CPUs, there's no active
XPS mapping left.

Fixes: 80d19669ecd3 ("net: Refactor XPS for CPUs and Rx queues")
Signed-off-by: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: don't keep lonely packets forever in the gro hash</title>
<updated>2018-11-23T19:18:53Z</updated>
<author>
<name>Paolo Abeni</name>
<email>pabeni@redhat.com</email>
</author>
<published>2018-11-21T17:21:35Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=605108acfe6233b72e2f803aa1cb59a2af3001ca'/>
<id>urn:sha1:605108acfe6233b72e2f803aa1cb59a2af3001ca</id>
<content type='text'>
Eric noted that with UDP GRO and NAPI timeout, we could keep a single
UDP packet inside the GRO hash forever, if the related NAPI instance
calls napi_gro_complete() at an higher frequency than the NAPI timeout.
Willem noted that even TCP packets could be trapped there, till the
next retransmission.
This patch tries to address the issue, flushing the old packets -
those with a NAPI_GRO_CB age before the current jiffy - before scheduling
the NAPI timeout. The rationale is that such a timeout should be
well below a jiffy and we are not flushing packets eligible for sane GRO.

v1  -&gt; v2:
 - clarified the commit message and comment

RFC -&gt; v1:
 - added 'Fixes tags', cleaned-up the wording.

Reported-by: Eric Dumazet &lt;eric.dumazet@gmail.com&gt;
Fixes: 3b47d30396ba ("net: gro: add a per device gro flush timer")
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Acked-by: Willem de Bruijn &lt;willemb@google.com&gt;
Acked-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net-gro: reset skb-&gt;pkt_type in napi_reuse_skb()</title>
<updated>2018-11-18T05:59:18Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2018-11-18T05:57:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=33d9a2c72f086cbf1087b2fd2d1a15aa9df14a7f'/>
<id>urn:sha1:33d9a2c72f086cbf1087b2fd2d1a15aa9df14a7f</id>
<content type='text'>
eth_type_trans() assumes initial value for skb-&gt;pkt_type
is PACKET_HOST.

This is indeed the value right after a fresh skb allocation.

However, it is possible that GRO merged a packet with a different
value (like PACKET_OTHERHOST in case macvlan is used), so
we need to make sure napi-&gt;skb will have pkt_type set back to
PACKET_HOST.

Otherwise, valid packets might be dropped by the stack because
their pkt_type is not PACKET_HOST.

napi_reuse_skb() was added in commit 96e93eab2033 ("gro: Add
internal interfaces for VLAN"), but this bug always has
been there.

Fixes: 96e93eab2033 ("gro: Add internal interfaces for VLAN")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: do not abort bulk send on BQL status</title>
<updated>2018-11-03T22:40:01Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2018-10-31T15:39:13Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fe60faa5063822f2d555f4f326c7dd72a60929bf'/>
<id>urn:sha1:fe60faa5063822f2d555f4f326c7dd72a60929bf</id>
<content type='text'>
Before calling dev_hard_start_xmit(), upper layers tried
to cook optimal skb list based on BQL budget.

Problem is that GSO packets can end up comsuming more than
the BQL budget.

Breaking the loop is not useful, since requeued packets
are ahead of any packets still in the qdisc.

It is also more expensive, since next TX completion will
push these packets later, while skbs are not in cpu caches.

It is also a behavior difference with TSO packets, that can
break the BQL limit by a large amount.

Note that drivers should use __netdev_tx_sent_queue()
in order to have optimal xmit_more support, and avoid
useless atomic operations as shown in the following patch.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Properly unlink GRO packets on overflow.</title>
<updated>2018-10-28T17:35:12Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2018-10-28T17:35:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ece23711dd956cd5053c9cb03e9fe0668f9c8894'/>
<id>urn:sha1:ece23711dd956cd5053c9cb03e9fe0668f9c8894</id>
<content type='text'>
Just like with normal GRO processing, we have to initialize
skb-&gt;next to NULL when we unlink overflow packets from the
GRO hash lists.

Fixes: d4546c2509b1 ("net: Convert GRO SKB handling to list_head.")
Reported-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Tested-by: Oleksandr Natalenko &lt;oleksandr@natalenko.name&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next</title>
<updated>2018-10-16T06:21:07Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2018-10-16T06:21:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e85679511e48168b0f066b6ae585556b5e0d8f5b'/>
<id>urn:sha1:e85679511e48168b0f066b6ae585556b5e0d8f5b</id>
<content type='text'>
Daniel Borkmann says:

====================
pull-request: bpf-next 2018-10-16

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Convert BPF sockmap and kTLS to both use a new sk_msg API and enable
   sk_msg BPF integration for the latter, from Daniel and John.

2) Enable BPF syscall side to indicate for maps that they do not support
   a map lookup operation as opposed to just missing key, from Prashant.

3) Add bpftool map create command which after map creation pins the
   map into bpf fs for further processing, from Jakub.

4) Add bpftool support for attaching programs to maps allowing sock_map
   and sock_hash to be used from bpftool, from John.

5) Improve syscall BPF map update/delete path for map-in-map types to
   wait a RCU grace period for pending references to complete, from Daniel.

6) Couple of follow-up fixes for the BPF socket lookup to get it
   enabled also when IPv6 is compiled as a module, from Joe.

7) Fix a generic-XDP bug to handle the case when the Ethernet header
   was mangled and thus update skb's protocol and data, from Jesper.

8) Add a missing BTF header length check between header copies from
   user space, from Wenwen.

9) Minor fixups in libbpf to use __u32 instead u32 types and include
   proper perf_event.h uapi header instead of perf internal one, from Yonghong.

10) Allow to pass user-defined flags through EXTRA_CFLAGS and EXTRA_LDFLAGS
    to bpftool's build, from Jiri.

11) BPF kselftest tweaks to add LWTUNNEL to config fragment and to install
    with_addr.sh script from flow dissector selftest, from Anders.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>FDDI: defza: Support capturing outgoing SMT traffic</title>
<updated>2018-10-16T04:46:06Z</updated>
<author>
<name>Maciej W. Rozycki</name>
<email>macro@linux-mips.org</email>
</author>
<published>2018-10-09T22:57:49Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9f9a742db40f95f4dc20fc7293de4ea6ddb24e47'/>
<id>urn:sha1:9f9a742db40f95f4dc20fc7293de4ea6ddb24e47</id>
<content type='text'>
DEC FDDIcontroller 700 (DEFZA) uses a Tx/Rx queue pair to communicate
SMT frames with adapter's firmware.  Any SMT frame received from the RMC
via the Rx queue is queued back by the driver to the SMT Rx queue for
the firmware to process.  Similarly the firmware uses the SMT Tx queue
to supply the driver with SMT frames which are queued back to the Tx
queue for the RMC to send to the ring.

When a network tap is attached to an FDDI interface handled by `defza'
any incoming SMT frames captured are queued to our usual processing of
network data received, which in turn delivers them to any listening
taps.

However the outgoing SMT frames produced by the firmware bypass our
network protocol stack and are therefore not delivered to taps.  This in
turn means that taps are missing a part of network traffic sent by the
adapter, which may make it more difficult to track down network problems
or do general traffic analysis.

Call `dev_queue_xmit_nit' then in the SMT Tx path, having checked that
a network tap is attached, with a newly-created `dev_nit_active' helper
wrapping the usual condition used in the transmit path.

Signed-off-by: Maciej W. Rozycki &lt;macro@linux-mips.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
