diff options
| author | Alexei Starovoitov <ast@kernel.org> | 2018-01-05 15:30:47 -0800 |
|---|---|---|
| committer | Alexei Starovoitov <ast@kernel.org> | 2018-01-05 15:31:20 -0800 |
| commit | 11d16edb04f113348b0c1d0c26cb666e9baaa7d3 (patch) | |
| tree | 3e3d4ae57b0ebd5158afd33be318775883e943fa /include | |
| parent | 5f103c5d4dbadec0f2cacd39b6429e1b8a8cf983 (diff) | |
| parent | 0fca931a6f21c11f675363b92b5a4fe86da59f30 (diff) | |
Merge branch 'xdp_rxq_info'
Jesper Dangaard Brouer says:
====================
V4:
* Added reviewers/acks to patches
* Fix patch desc in i40e that got out-of-sync with code
* Add SPDX license headers for the two new files added in patch 14
V3:
* Fixed bug in virtio_net driver
* Removed export of xdp_rxq_info_init()
V2:
* Changed API exposed to drivers
- Removed invocation of "init" in drivers, and only call "reg"
(Suggested by Saeed)
- Allow "reg" to fail and handle this in drivers
(Suggested by David Ahern)
* Removed the SINKQ qtype, instead allow to register as "unused"
* Also fixed some drivers during testing on actual HW (noted in patches)
There is a need for XDP to know more about the RX-queue a given XDP
frames have arrived on. For both the XDP bpf-prog and kernel side.
Instead of extending struct xdp_buff each time new info is needed,
this patchset takes a different approach. Struct xdp_buff is only
extended with a pointer to a struct xdp_rxq_info (allowing for easier
extending this later). This xdp_rxq_info contains information related
to how the driver have setup the individual RX-queue's. This is
read-mostly information, and all xdp_buff frames (in drivers
napi_poll) point to the same xdp_rxq_info (per RX-queue).
We stress this data/cache-line is for read-mostly info. This is NOT
for dynamic per packet info, use the data_meta for such use-cases.
This patchset start out small, and only expose ingress_ifindex and the
RX-queue index to the XDP/BPF program. Access to tangible info like
the ingress ifindex and RX queue index, is fairly easy to comprehent.
The other future use-cases could allow XDP frames to be recycled back
to the originating device driver, by providing info on RX device and
queue number.
As XDP doesn't have driver feature flags, and eBPF code due to
bpf-tail-calls cannot determine that XDP driver invoke it, this
patchset have to update every driver that support XDP.
For driver developers (review individual driver patches!):
The xdp_rxq_info is tied to the drivers RX-ring(s). Whenever a RX-ring
modification require (temporary) stopping RX frames, then the
xdp_rxq_info should (likely) also be unregistred and re-registered,
especially if reallocating the pages in the ring. Make sure ethtool
set_channels does the right thing. When replacing XDP prog, if and
only if RX-ring need to be changed, then also re-register the
xdp_rxq_info.
I'm Cc'ing the individual driver patches to the registered maintainers.
Testing:
I've only tested the NIC drivers I have hardware for. The general
test procedure is to (DUT = Device Under Test):
(1) run pktgen script pktgen_sample04_many_flows.sh (against DUT)
(2) run samples/bpf program xdp_rxq_info --dev $DEV (on DUT)
(3) runtime modify number of NIC queues via ethtool -L (on DUT)
(4) runtime modify number of NIC ring-size via ethtool -G (on DUT)
Patch based on git tree bpf-next (at commit fb982666e380c1632a):
https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git/
====================
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Diffstat (limited to 'include')
| -rw-r--r-- | include/linux/filter.h | 2 | ||||
| -rw-r--r-- | include/linux/netdevice.h | 2 | ||||
| -rw-r--r-- | include/net/xdp.h | 48 | ||||
| -rw-r--r-- | include/uapi/linux/bpf.h | 3 |
4 files changed, 55 insertions, 0 deletions
diff --git a/include/linux/filter.h b/include/linux/filter.h index 2b0df2703671..425056c7f96c 100644 --- a/include/linux/filter.h +++ b/include/linux/filter.h @@ -20,6 +20,7 @@ #include <linux/set_memory.h> #include <linux/kallsyms.h> +#include <net/xdp.h> #include <net/sch_generic.h> #include <uapi/linux/filter.h> @@ -503,6 +504,7 @@ struct xdp_buff { void *data_end; void *data_meta; void *data_hard_start; + struct xdp_rxq_info *rxq; }; /* Compute the linear packet data range [data, data_end) which diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 49bfc6eec74c..440b000f07f4 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -44,6 +44,7 @@ #include <net/dcbnl.h> #endif #include <net/netprio_cgroup.h> +#include <net/xdp.h> #include <linux/netdev_features.h> #include <linux/neighbour.h> @@ -686,6 +687,7 @@ struct netdev_rx_queue { #endif struct kobject kobj; struct net_device *dev; + struct xdp_rxq_info xdp_rxq; } ____cacheline_aligned_in_smp; /* diff --git a/include/net/xdp.h b/include/net/xdp.h new file mode 100644 index 000000000000..b2362ddfa694 --- /dev/null +++ b/include/net/xdp.h @@ -0,0 +1,48 @@ +/* include/net/xdp.h + * + * Copyright (c) 2017 Jesper Dangaard Brouer, Red Hat Inc. + * Released under terms in GPL version 2. See COPYING. + */ +#ifndef __LINUX_NET_XDP_H__ +#define __LINUX_NET_XDP_H__ + +/** + * DOC: XDP RX-queue information + * + * The XDP RX-queue info (xdp_rxq_info) is associated with the driver + * level RX-ring queues. It is information that is specific to how + * the driver have configured a given RX-ring queue. + * + * Each xdp_buff frame received in the driver carry a (pointer) + * reference to this xdp_rxq_info structure. This provides the XDP + * data-path read-access to RX-info for both kernel and bpf-side + * (limited subset). + * + * For now, direct access is only safe while running in NAPI/softirq + * context. Contents is read-mostly and must not be updated during + * driver NAPI/softirq poll. + * + * The driver usage API is a register and unregister API. + * + * The struct is not directly tied to the XDP prog. A new XDP prog + * can be attached as long as it doesn't change the underlying + * RX-ring. If the RX-ring does change significantly, the NIC driver + * naturally need to stop the RX-ring before purging and reallocating + * memory. In that process the driver MUST call unregistor (which + * also apply for driver shutdown and unload). The register API is + * also mandatory during RX-ring setup. + */ + +struct xdp_rxq_info { + struct net_device *dev; + u32 queue_index; + u32 reg_state; +} ____cacheline_aligned; /* perf critical, avoid false-sharing */ + +int xdp_rxq_info_reg(struct xdp_rxq_info *xdp_rxq, + struct net_device *dev, u32 queue_index); +void xdp_rxq_info_unreg(struct xdp_rxq_info *xdp_rxq); +void xdp_rxq_info_unused(struct xdp_rxq_info *xdp_rxq); +bool xdp_rxq_info_is_reg(struct xdp_rxq_info *xdp_rxq); + +#endif /* __LINUX_NET_XDP_H__ */ diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h index f2f8b36e2ad4..405317f9c064 100644 --- a/include/uapi/linux/bpf.h +++ b/include/uapi/linux/bpf.h @@ -899,6 +899,9 @@ struct xdp_md { __u32 data; __u32 data_end; __u32 data_meta; + /* Below access go though struct xdp_rxq_info */ + __u32 ingress_ifindex; /* rxq->dev->ifindex */ + __u32 rx_queue_index; /* rxq->queue_index */ }; enum sk_action { |
