<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux/netdevice.h, branch v4.12.9</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.12.9</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.12.9'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2017-06-14T19:14:51Z</updated>
<entry>
<title>net: update undefined -&gt;ndo_change_mtu() comment</title>
<updated>2017-06-14T19:14:51Z</updated>
<author>
<name>Magnus Damm</name>
<email>damm+renesas@opensource.se</email>
</author>
<published>2017-06-14T07:15:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=db46a0e1be7eac45d0bb1bdcd438b8d47c920451'/>
<id>urn:sha1:db46a0e1be7eac45d0bb1bdcd438b8d47c920451</id>
<content type='text'>
Update -&gt;ndo_change_mtu() callback comment to remove text
about returning error in case of undefined callback. This
change makes the comment match the existing code behavior.

Signed-off-by: Magnus Damm &lt;damm+renesas@opensource.se&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: ipv6: Release route when device is unregistering</title>
<updated>2017-06-08T15:12:39Z</updated>
<author>
<name>David Ahern</name>
<email>dsahern@gmail.com</email>
</author>
<published>2017-06-07T18:26:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8397ed36b7c585f8d3e06c431f4137309124f78f'/>
<id>urn:sha1:8397ed36b7c585f8d3e06c431f4137309124f78f</id>
<content type='text'>
Roopa reported attempts to delete a bond device that is referenced in a
multipath route is hanging:

$ ifdown bond2    # ifupdown2 command that deletes virtual devices
unregister_netdevice: waiting for bond2 to become free. Usage count = 2

Steps to reproduce:
    echo 1 &gt; /proc/sys/net/ipv6/conf/all/ignore_routes_with_linkdown
    ip link add dev bond12 type bond
    ip link add dev bond13 type bond
    ip addr add 2001:db8:2::0/64 dev bond12
    ip addr add 2001:db8:3::0/64 dev bond13
    ip route add 2001:db8:33::0/64 nexthop via 2001:db8:2::2 nexthop via 2001:db8:3::2
    ip link del dev bond12
    ip link del dev bond13

The root cause is the recent change to keep routes on a linkdown. Update
the check to detect when the device is unregistering and release the
route for that case.

Fixes: a1a22c12060e4 ("net: ipv6: Keep nexthop of multipath route on admin down")
Reported-by: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
Signed-off-by: David Ahern &lt;dsahern@gmail.com&gt;
Acked-by: Roopa Prabhu &lt;roopa@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Fix inconsistent teardown and release of private netdev state.</title>
<updated>2017-06-07T19:53:24Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2017-05-08T16:52:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cf124db566e6b036b8bcbe8decbed740bdfac8c6'/>
<id>urn:sha1:cf124db566e6b036b8bcbe8decbed740bdfac8c6</id>
<content type='text'>
Network devices can allocate reasources and private memory using
netdev_ops-&gt;ndo_init().  However, the release of these resources
can occur in one of two different places.

Either netdev_ops-&gt;ndo_uninit() or netdev-&gt;destructor().

The decision of which operation frees the resources depends upon
whether it is necessary for all netdev refs to be released before it
is safe to perform the freeing.

netdev_ops-&gt;ndo_uninit() presumably can occur right after the
NETDEV_UNREGISTER notifier completes and the unicast and multicast
address lists are flushed.

netdev-&gt;destructor(), on the other hand, does not run until the
netdev references all go away.

Further complicating the situation is that netdev-&gt;destructor()
almost universally does also a free_netdev().

This creates a problem for the logic in register_netdevice().
Because all callers of register_netdevice() manage the freeing
of the netdev, and invoke free_netdev(dev) if register_netdevice()
fails.

If netdev_ops-&gt;ndo_init() succeeds, but something else fails inside
of register_netdevice(), it does call ndo_ops-&gt;ndo_uninit().  But
it is not able to invoke netdev-&gt;destructor().

This is because netdev-&gt;destructor() will do a free_netdev() and
then the caller of register_netdevice() will do the same.

However, this means that the resources that would normally be released
by netdev-&gt;destructor() will not be.

Over the years drivers have added local hacks to deal with this, by
invoking their destructor parts by hand when register_netdevice()
fails.

Many drivers do not try to deal with this, and instead we have leaks.

Let's close this hole by formalizing the distinction between what
private things need to be freed up by netdev-&gt;destructor() and whether
the driver needs unregister_netdevice() to perform the free_netdev().

netdev-&gt;priv_destructor() performs all actions to free up the private
resources that used to be freed by netdev-&gt;destructor(), except for
free_netdev().

netdev-&gt;needs_free_netdev is a boolean that indicates whether
free_netdev() should be done at the end of unregister_netdevice().

Now, register_netdevice() can sanely release all resources after
ndo_ops-&gt;ndo_init() succeeds, by invoking both ndo_ops-&gt;ndo_uninit()
and netdev-&gt;priv_destructor().

And at the end of unregister_netdevice(), we invoke
netdev-&gt;priv_destructor() and optionally call free_netdev().

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>xdp: refine xdp api with regards to generic xdp</title>
<updated>2017-05-12T01:30:57Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2017-05-11T23:04:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d67b9cd28c1d7f82c2e5e727731ea7c89b23a0a8'/>
<id>urn:sha1:d67b9cd28c1d7f82c2e5e727731ea7c89b23a0a8</id>
<content type='text'>
While working on the iproute2 generic XDP frontend, I noticed that
as of right now it's possible to have native *and* generic XDP
programs loaded both at the same time for the case when a driver
supports native XDP.

The intended model for generic XDP from b5cdae3291f7 ("net: Generic
XDP") is, however, that only one out of the two can be present at
once which is also indicated as such in the XDP netlink dump part.
The main rationale for generic XDP is to ease accessibility (in
case a driver does not yet have XDP support) and to generically
provide a semantical model as an example for driver developers
wanting to add XDP support. The generic XDP option for an XDP
aware driver can still be useful for comparing and testing both
implementations.

However, it is not intended to have a second XDP processing stage
or layer with exactly the same functionality of the first native
stage. Only reason could be to have a partial fallback for future
XDP features that are not supported yet in the native implementation
and we probably also shouldn't strive for such fallback and instead
encourage native feature support in the first place. Given there's
currently no such fallback issue or use case, lets not go there yet
if we don't need to.

Therefore, change semantics for loading XDP and bail out if the
user tries to load a generic XDP program when a native one is
present and vice versa. Another alternative to bailing out would
be to handle the transition from one flavor to another gracefully,
but that would require to bring the device down, exchange both
types of programs, and bring it up again in order to avoid a tiny
window where a packet could hit both hooks. Given this complicates
the logic for just a debugging feature in the native case, I went
with the simpler variant.

For the dump, remove IFLA_XDP_FLAGS that was added with b5cdae3291f7
and reuse IFLA_XDP_ATTACHED for indicating the mode. Dumping all
or just a subset of flags that were used for loading the XDP prog
is suboptimal in the long run since not all flags are useful for
dumping and if we start to reuse the same flag definitions for
load and dump, then we'll waste bit space. What we really just
want is to dump the mode for now.

Current IFLA_XDP_ATTACHED semantics are: nothing was installed (0),
a program is running at the native driver layer (1). Thus, add a
mode that says that a program is running at generic XDP layer (2).
Applications will handle this fine in that older binaries will
just indicate that something is attached at XDP layer, effectively
this is similar to IFLA_XDP_FLAGS attr that we would have had
modulo the redundancy.

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>xdp: propagate extended ack to XDP setup</title>
<updated>2017-05-01T14:35:47Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>jakub.kicinski@netronome.com</email>
</author>
<published>2017-05-01T04:46:46Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ddf9f970764f4390aba767e77fddaaced4a6760d'/>
<id>urn:sha1:ddf9f970764f4390aba767e77fddaaced4a6760d</id>
<content type='text'>
Drivers usually have a number of restrictions for running XDP
- most common being buffer sizes, LRO and number of rings.
Even though some drivers try to be helpful and print error
messages experience shows that users don't often consult
kernel logs on netlink errors.  Try to use the new extended
ack mechanism to carry the message back to user space.

Signed-off-by: Jakub Kicinski &lt;jakub.kicinski@netronome.com&gt;
Acked-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: update comment for netif_dormant() function</title>
<updated>2017-04-27T20:23:17Z</updated>
<author>
<name>Zhang Shengju</name>
<email>zhangshengju@cmss.chinamobile.com</email>
</author>
<published>2017-04-26T03:05:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8ecbc40ada116f2f7d6b61cd646802c87b7c5c7d'/>
<id>urn:sha1:8ecbc40ada116f2f7d6b61cd646802c87b7c5c7d</id>
<content type='text'>
This patch updates the comment for netif_dormant() function to reflect
the intended usage.

Signed-off-by: Zhang Shengju &lt;zhangshengju@cmss.chinamobile.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: move xdp_prog field in RX cache lines</title>
<updated>2017-04-25T20:25:36Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2017-04-25T18:36:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7acedaf5c4355f812cfef883ac28bf15f7d9205e'/>
<id>urn:sha1:7acedaf5c4355f812cfef883ac28bf15f7d9205e</id>
<content type='text'>
(struct net_device, xdp_prog) field should be moved in RX cache lines,
reducing latencies when a single packet is received on idle host,
since netif_elide_gro() needs it.

Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: Generic XDP</title>
<updated>2017-04-25T17:33:49Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2017-04-18T19:36:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b5cdae3291f7be7a34e75affe4c0ec1f7f328b64'/>
<id>urn:sha1:b5cdae3291f7be7a34e75affe4c0ec1f7f328b64</id>
<content type='text'>
This provides a generic SKB based non-optimized XDP path which is used
if either the driver lacks a specific XDP implementation, or the user
requests it via a new IFLA_XDP_FLAGS value named XDP_FLAGS_SKB_MODE.

It is arguable that perhaps I should have required something like
this as part of the initial XDP feature merge.

I believe this is critical for two reasons:

1) Accessibility.  More people can play with XDP with less
   dependencies.  Yes I know we have XDP support in virtio_net, but
   that just creates another depedency for learning how to use this
   facility.

   I wrote this to make life easier for the XDP newbies.

2) As a model for what the expected semantics are.  If there is a pure
   generic core implementation, it serves as a semantic example for
   driver folks adding XDP support.

One thing I have not tried to address here is the issue of
XDP_PACKET_HEADROOM, thanks to Daniel for spotting that.  It seems
incredibly expensive to do a skb_cow(skb, XDP_PACKET_HEADROOM) or
whatever even if the XDP program doesn't try to push headers at all.
I think we really need the verifier to somehow propagate whether
certain XDP helpers are used or not.

v5:
 - Handle both negative and positive offset after running prog
 - Fix mac length in XDP_TX case (Alexei)
 - Use rcu_dereference_protected() in free_netdev (kbuild test robot)

v4:
 - Fix MAC header adjustmnet before calling prog (David Ahern)
 - Disable LRO when generic XDP is installed (Michael Chan)
 - Bypass qdisc et al. on XDP_TX and record the event (Alexei)
 - Do not perform generic XDP on reinjected packets (DaveM)

v3:
 - Make sure XDP program sees packet at MAC header, push back MAC
   header if we do XDP_TX.  (Alexei)
 - Elide GRO when generic XDP is in use.  (Alexei)
 - Add XDP_FLAG_SKB_MODE flag which the user can use to request generic
   XDP even if the driver has an XDP implementation.  (Alexei)
 - Report whether SKB mode is in use in rtnl_xdp_fill() via XDP_FLAGS
   attribute.  (Daniel)

v2:
 - Add some "fall through" comments in switch statements based
   upon feedback from Andrew Lunn
 - Use RCU for generic xdp_prog, thanks to Johannes Berg.

Tested-by: Andy Gospodarek &lt;andy@greyhouse.net&gt;
Tested-by: Jesper Dangaard Brouer &lt;brouer@redhat.com&gt;
Tested-by: David Ahern &lt;dsa@cumulusnetworks.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge branch 'master' of git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next</title>
<updated>2017-04-21T19:11:28Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2017-04-21T19:11:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6b633e82b0f902a4cceb9bcdcb5bb31d04ca6264'/>
<id>urn:sha1:6b633e82b0f902a4cceb9bcdcb5bb31d04ca6264</id>
<content type='text'>
Steffen Klassert says:

====================
pull request (net-next): ipsec-next 2017-04-20

This adds the basic infrastructure for IPsec hardware
offloading, it creates a configuration API and adjusts
the packet path.

1) Add the needed netdev features to configure IPsec offloads.

2) Add the IPsec hardware offloading API.

3) Prepare the ESP packet path for hardware offloading.

4) Add gso handlers for esp4 and esp6, this implements
   the software fallback for GSO packets.

5) Add xfrm replay handler functions for offloading.

6) Change ESP to use a synchronous crypto algorithm on
   offloading, we don't have the option for asynchronous
   returns when we handle IPsec at layer2.

7) Add a xfrm validate function to validate_xmit_skb. This
   implements the software fallback for non GSO packets.

8) Set the inner_network and inner_transport members of
   the SKB, as well as encapsulation, to reflect the actual
   positions of these headers, and removes them only once
   encryption is done on the payload.
   From Ilan Tayari.

9) Prepare the ESP GRO codepath for hardware offloading.

10) Fix incorrect null pointer check in esp6.
    From Colin Ian King.

11) Fix for the GSO software fallback path to detect the
    fallback correctly.
    From Ilan Tayari.

Please pull or let me know if there are problems.
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Replace 2 jiffies with sysctl netdev_budget_usecs to enable softirq tuning</title>
<updated>2017-04-21T17:22:34Z</updated>
<author>
<name>Matthew Whitehead</name>
<email>tedheadster@gmail.com</email>
</author>
<published>2017-04-19T16:37:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7acf8a1e8a28b3d7407a8d8061a7d0766cfac2f4'/>
<id>urn:sha1:7acf8a1e8a28b3d7407a8d8061a7d0766cfac2f4</id>
<content type='text'>
Constants used for tuning are generally a bad idea, especially as hardware
changes over time. Replace the constant 2 jiffies with sysctl variable
netdev_budget_usecs to enable sysadmins to tune the softirq processing.
Also document the variable.

For example, a very fast machine might tune this to 1000 microseconds,
while my regression testing 486DX-25 needs it to be 4000 microseconds on
a nearly idle network to prevent time_squeeze from being incremented.

Version 2: changed jiffies to microseconds for predictable units.

Signed-off-by: Matthew Whitehead &lt;tedheadster@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
