summaryrefslogtreecommitdiff
path: root/net/ipv6
diff options
context:
space:
mode:
authorLinus Torvalds <torvalds@linux-foundation.org>2026-02-11 19:31:52 -0800
committerLinus Torvalds <torvalds@linux-foundation.org>2026-02-11 19:31:52 -0800
commit37a93dd5c49b5fda807fd204edf2547c3493319c (patch)
treece1ef5a642b9ea3d7242156438eb96dc5607a752 /net/ipv6
parent098b6e44cbaa2d526d06af90c862d13fb414a0ec (diff)
parent83310d613382f74070fc8b402f3f6c2af8439ead (diff)
Merge tag 'net-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-nextipvs/mainipvs/HEADipvs-next/mainipvs-next/HEADdavem/net-next/maindavem/net-next/HEAD
Pull networking updates from Paolo Abeni: "Core & protocols: - A significant effort all around the stack to guide the compiler to make the right choice when inlining code, to avoid unneeded calls for small helper and stack canary overhead in the fast-path. This generates better and faster code with very small or no text size increases, as in many cases the call generated more code than the actual inlined helper. - Extend AccECN implementation so that is now functionally complete, also allow the user-space enabling it on a per network namespace basis. - Add support for memory providers with large (above 4K) rx buffer. Paired with hw-gro, larger rx buffer sizes reduce the number of buffers traversing the stack, dincreasing single stream CPU usage by up to ~30%. - Do not add HBH header to Big TCP GSO packets. This simplifies the RX path, the TX path and the NIC drivers, and is possible because user-space taps can now interpret correctly such packets without the HBH hint. - Allow IPv6 routes to be configured with a gateway address that is resolved out of a different interface than the one specified, aligning IPv6 to IPv4 behavior. - Multi-queue aware sch_cake. This makes it possible to scale the rate shaper of sch_cake across multiple CPUs, while still enforcing a single global rate on the interface. - Add support for the nbcon (new buffer console) infrastructure to netconsole, enabling lock-free, priority-based console operations that are safer in crash scenarios. - Improve the TCP ipv6 output path to cache the flow information, saving cpu cycles, reducing cache line misses and stack use. - Improve netfilter packet tracker to resolve clashes for most protocols, avoiding unneeded drops on rare occasions. - Add IP6IP6 tunneling acceleration to the flowtable infrastructure. - Reduce tcp socket size by one cache line. - Notify neighbour changes atomically, avoiding inconsistencies between the notification sequence and the actual states sequence. - Add vsock namespace support, allowing complete isolation of vsocks across different network namespaces. - Improve xsk generic performances with cache-alignment-oriented optimizations. - Support netconsole automatic target recovery, allowing netconsole to reestablish targets when underlying low-level interface comes back online. Driver API: - Support for switching the working mode (automatic vs manual) of a DPLL device via netlink. - Introduce PHY ports representation to expose multiple front-facing media ports over a single MAC. - Introduce "rx-polarity" and "tx-polarity" device tree properties, to generalize polarity inversion requirements for differential signaling. - Add helper to create, prepare and enable managed clocks. Device drivers: - Add Huawei hinic3 PF etherner driver. - Add DWMAC glue driver for Motorcomm YT6801 PCIe ethernet controller. - Add ethernet driver for MaxLinear MxL862xx switches - Remove parallel-port Ethernet driver. - Convert existing driver timestamp configuration reporting to hwtstamp_get and remove legacy ioctl(). - Convert existing drivers to .get_rx_ring_count(), simplifing the RX ring count retrieval. Also remove the legacy fallback path. - Ethernet high-speed NICs: - Broadcom (bnxt, bng): - bnxt: add FW interface update to support FEC stats histogram and NVRAM defragmentation - bng: add TSO and H/W GRO support - nVidia/Mellanox (mlx5): - improve latency of channel restart operations, reducing the used H/W resources - add TSO support for UDP over GRE over VLAN - add flow counters support for hardware steering (HWS) rules - use a static memory area to store headers for H/W GRO, leading to 12% RX tput improvement - Intel (100G, ice, idpf): - ice: reorganizes layout of Tx and Rx rings for cacheline locality and utilizes __cacheline_group* macros on the new layouts - ice: introduces Synchronous Ethernet (SyncE) support - Meta (fbnic): - adds debugfs for firmware mailbox and tx/rx rings vectors - Ethernet virtual: - geneve: introduce GRO/GSO support for double UDP encapsulation - Ethernet NICs consumer, and embedded: - Synopsys (stmmac): - some code refactoring and cleanups - RealTek (r8169): - add support for RTL8127ATF (10G Fiber SFP) - add dash and LTR support - Airoha: - AN8811HB 2.5 Gbps phy support - Freescale (fec): - add XDP zero-copy support - Thunderbolt: - add get link setting support to allow bonding - Renesas: - add support for RZ/G3L GBETH SoC - Ethernet switches: - Maxlinear: - support R(G)MII slow rate configuration - add support for Intel GSW150 - Motorcomm (yt921x): - add DCB/QoS support - TI: - icssm-prueth: support bridging (STP/RSTP) via the switchdev framework - Ethernet PHYs: - Realtek: - enable SGMII and 2500Base-X in-band auto-negotiation - simplify and reunify C22/C45 drivers - Micrel: convert bindings to DT schema - CAN: - move skb headroom content into skb extensions, making CAN metadata access more robust - CAN drivers: - rcar_canfd: - add support for FD-only mode - add support for the RZ/T2H SoC - sja1000: cleanup the CAN state handling - WiFi: - implement EPPKE/802.1X over auth frames support - split up drop reasons better, removing generic RX_DROP - additional FTM capabilities: 6 GHz support, supported number of spatial streams and supported number of LTF repetitions - better mac80211 iterators to enumerate resources - initial UHR (Wi-Fi 8) support for cfg80211/mac80211 - WiFi drivers: - Qualcomm/Atheros: - ath11k: support for Channel Frequency Response measurement - ath12k: a significant driver refactor to support multi-wiphy devices and and pave the way for future device support in the same driver (rather than splitting to ath13k) - ath12k: support for the QCC2072 chipset - Intel: - iwlwifi: partial Neighbor Awareness Networking (NAN) support - iwlwifi: initial support for U-NII-9 and IEEE 802.11bn - RealTek (rtw89): - preparations for RTL8922DE support - Bluetooth: - implement setsockopt(BT_PHY) to set the connection packet type/PHY - set link_policy on incoming ACL connections - Bluetooth drivers: - btusb: add support for MediaTek7920, Realtek RTL8761BU and 8851BE - btqca: add WCN6855 firmware priority selection feature" * tag 'net-next-7.0' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next: (1254 commits) bnge/bng_re: Add a new HSI net: macb: Fix tx/rx malfunction after phy link down and up af_unix: Fix memleak of newsk in unix_stream_connect(). net: ti: icssg-prueth: Add optional dependency on HSR net: dsa: add basic initial driver for MxL862xx switches net: mdio: add unlocked mdiodev C45 bus accessors net: dsa: add tag format for MxL862xx switches dt-bindings: net: dsa: add MaxLinear MxL862xx selftests: drivers: net: hw: Modify toeplitz.c to poll for packets octeontx2-pf: Unregister devlink on probe failure net: renesas: rswitch: fix forwarding offload statemachine ionic: Rate limit unknown xcvr type messages tcp: inet6_csk_xmit() optimization tcp: populate inet->cork.fl.u.ip6 in tcp_v6_syn_recv_sock() tcp: populate inet->cork.fl.u.ip6 in tcp_v6_connect() ipv6: inet6_csk_xmit() and inet6_csk_update_pmtu() use inet->cork.fl.u.ip6 ipv6: use inet->cork.fl.u.ip6 and np->final in ip6_datagram_dst_update() ipv6: use np->final in inet6_sk_rebuild_header() ipv6: add daddr/final storage in struct ipv6_pinfo net: stmmac: qcom-ethqos: fix qcom_ethqos_serdes_powerup() ...
Diffstat (limited to 'net/ipv6')
-rw-r--r--net/ipv6/Makefile2
-rw-r--r--net/ipv6/addrconf.c23
-rw-r--r--net/ipv6/af_inet6.c61
-rw-r--r--net/ipv6/datagram.c21
-rw-r--r--net/ipv6/exthdrs.c79
-rw-r--r--net/ipv6/icmp.c9
-rw-r--r--net/ipv6/inet6_connection_sock.c65
-rw-r--r--net/ipv6/ip6_fib.c12
-rw-r--r--net/ipv6/ip6_gre.c2
-rw-r--r--net/ipv6/ip6_input.c2
-rw-r--r--net/ipv6/ip6_offload.c79
-rw-r--r--net/ipv6/ip6_output.c122
-rw-r--r--net/ipv6/ip6_tunnel.c33
-rw-r--r--net/ipv6/ipv6_sockglue.c4
-rw-r--r--net/ipv6/output_core.c7
-rw-r--r--net/ipv6/raw.c34
-rw-r--r--net/ipv6/route.c41
-rw-r--r--net/ipv6/sit.c2
-rw-r--r--net/ipv6/tcp_ipv6.c78
-rw-r--r--net/ipv6/tcpv6_offload.c12
-rw-r--r--net/ipv6/udp.c5
-rw-r--r--net/ipv6/udp_offload.c3
22 files changed, 346 insertions, 350 deletions
diff --git a/net/ipv6/Makefile b/net/ipv6/Makefile
index d283c59df4c1..0492f1a0b491 100644
--- a/net/ipv6/Makefile
+++ b/net/ipv6/Makefile
@@ -45,7 +45,7 @@ obj-$(CONFIG_IPV6_FOU) += fou6.o
obj-y += addrconf_core.o exthdrs_core.o ip6_checksum.o ip6_icmp.o
obj-$(CONFIG_INET) += output_core.o protocol.o \
- ip6_offload.o tcpv6_offload.o exthdrs_offload.o
+ ip6_offload.o exthdrs_offload.o
obj-$(subst m,y,$(CONFIG_IPV6)) += inet6_hashtables.o
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 27ab9d7adc64..6db9cf9e2a50 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1013,7 +1013,7 @@ ipv6_link_dev_addr(struct inet6_dev *idev, struct inet6_ifaddr *ifp)
list_for_each(p, &idev->addr_list) {
struct inet6_ifaddr *ifa
= list_entry(p, struct inet6_ifaddr, if_list);
- if (ifp_scope >= ipv6_addr_src_scope(&ifa->addr))
+ if (ifp_scope > ipv6_addr_src_scope(&ifa->addr))
break;
}
@@ -3339,11 +3339,10 @@ static int ipv6_generate_stable_address(struct in6_addr *address,
const struct inet6_dev *idev)
{
static DEFINE_SPINLOCK(lock);
- static __u32 digest[SHA1_DIGEST_WORDS];
- static __u32 workspace[SHA1_WORKSPACE_WORDS];
+ static struct sha1_ctx sha_ctx;
static union {
- char __data[SHA1_BLOCK_SIZE];
+ u8 __data[SHA1_BLOCK_SIZE];
struct {
struct in6_addr secret;
__be32 prefix[2];
@@ -3368,20 +3367,26 @@ static int ipv6_generate_stable_address(struct in6_addr *address,
retry:
spin_lock_bh(&lock);
- sha1_init_raw(digest);
+ sha1_init(&sha_ctx);
+
memset(&data, 0, sizeof(data));
- memset(workspace, 0, sizeof(workspace));
memcpy(data.hwaddr, idev->dev->perm_addr, idev->dev->addr_len);
data.prefix[0] = address->s6_addr32[0];
data.prefix[1] = address->s6_addr32[1];
data.secret = secret;
data.dad_count = dad_count;
- sha1_transform(digest, data.__data, workspace);
+ sha1_update(&sha_ctx, data.__data, sizeof(data));
+ /*
+ * Note that the SHA-1 finalization is omitted here, and the digest is
+ * pulled directly from the internal SHA-1 state (making it incompatible
+ * with standard SHA-1). Unusual, but technically okay since the data
+ * length is fixed and is a multiple of the SHA-1 block size.
+ */
temp = *address;
- temp.s6_addr32[2] = (__force __be32)digest[0];
- temp.s6_addr32[3] = (__force __be32)digest[1];
+ temp.s6_addr32[2] = (__force __be32)sha_ctx.state.h[0];
+ temp.s6_addr32[3] = (__force __be32)sha_ctx.state.h[1];
spin_unlock_bh(&lock);
diff --git a/net/ipv6/af_inet6.c b/net/ipv6/af_inet6.c
index b705751eb73c..31ba677d0442 100644
--- a/net/ipv6/af_inet6.c
+++ b/net/ipv6/af_inet6.c
@@ -224,8 +224,8 @@ lookup_protocol:
inet6_set_bit(MC6_LOOP, sk);
inet6_set_bit(MC6_ALL, sk);
np->pmtudisc = IPV6_PMTUDISC_WANT;
- inet6_assign_bit(REPFLOW, sk, net->ipv6.sysctl.flowlabel_reflect &
- FLOWLABEL_REFLECT_ESTABLISHED);
+ inet6_assign_bit(REPFLOW, sk, READ_ONCE(net->ipv6.sysctl.flowlabel_reflect) &
+ FLOWLABEL_REFLECT_ESTABLISHED);
sk->sk_ipv6only = net->ipv6.sysctl.bindv6only;
sk->sk_txrehash = READ_ONCE(net->core.sysctl_txrehash);
@@ -824,45 +824,42 @@ EXPORT_SYMBOL(inet6_unregister_protosw);
int inet6_sk_rebuild_header(struct sock *sk)
{
struct ipv6_pinfo *np = inet6_sk(sk);
+ struct inet_sock *inet = inet_sk(sk);
+ struct in6_addr *final_p;
struct dst_entry *dst;
+ struct flowi6 *fl6;
dst = __sk_dst_check(sk, np->dst_cookie);
+ if (dst)
+ return 0;
+
+ fl6 = &inet->cork.fl.u.ip6;
+ memset(fl6, 0, sizeof(*fl6));
+ fl6->flowi6_proto = sk->sk_protocol;
+ fl6->daddr = sk->sk_v6_daddr;
+ fl6->saddr = np->saddr;
+ fl6->flowlabel = np->flow_label;
+ fl6->flowi6_oif = sk->sk_bound_dev_if;
+ fl6->flowi6_mark = sk->sk_mark;
+ fl6->fl6_dport = inet->inet_dport;
+ fl6->fl6_sport = inet->inet_sport;
+ fl6->flowi6_uid = sk_uid(sk);
+ security_sk_classify_flow(sk, flowi6_to_flowi_common(fl6));
- if (!dst) {
- struct inet_sock *inet = inet_sk(sk);
- struct in6_addr *final_p, final;
- struct flowi6 fl6;
-
- memset(&fl6, 0, sizeof(fl6));
- fl6.flowi6_proto = sk->sk_protocol;
- fl6.daddr = sk->sk_v6_daddr;
- fl6.saddr = np->saddr;
- fl6.flowlabel = np->flow_label;
- fl6.flowi6_oif = sk->sk_bound_dev_if;
- fl6.flowi6_mark = sk->sk_mark;
- fl6.fl6_dport = inet->inet_dport;
- fl6.fl6_sport = inet->inet_sport;
- fl6.flowi6_uid = sk_uid(sk);
- security_sk_classify_flow(sk, flowi6_to_flowi_common(&fl6));
-
- rcu_read_lock();
- final_p = fl6_update_dst(&fl6, rcu_dereference(np->opt),
- &final);
- rcu_read_unlock();
-
- dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p);
- if (IS_ERR(dst)) {
- sk->sk_route_caps = 0;
- WRITE_ONCE(sk->sk_err_soft, -PTR_ERR(dst));
- return PTR_ERR(dst);
- }
+ rcu_read_lock();
+ final_p = fl6_update_dst(fl6, rcu_dereference(np->opt), &np->final);
+ rcu_read_unlock();
- ip6_dst_store(sk, dst, false, false);
+ dst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_p);
+ if (IS_ERR(dst)) {
+ sk->sk_route_caps = 0;
+ WRITE_ONCE(sk->sk_err_soft, -PTR_ERR(dst));
+ return PTR_ERR(dst);
}
+ ip6_dst_store(sk, dst, false, false);
return 0;
}
-EXPORT_SYMBOL_GPL(inet6_sk_rebuild_header);
bool ipv6_opt_accepted(const struct sock *sk, const struct sk_buff *skb,
const struct inet6_skb_parm *opt)
diff --git a/net/ipv6/datagram.c b/net/ipv6/datagram.c
index 83e03176819c..c564b68a0562 100644
--- a/net/ipv6/datagram.c
+++ b/net/ipv6/datagram.c
@@ -72,12 +72,12 @@ static void ip6_datagram_flow_key_init(struct flowi6 *fl6,
int ip6_datagram_dst_update(struct sock *sk, bool fix_sk_saddr)
{
struct ip6_flowlabel *flowlabel = NULL;
- struct in6_addr *final_p, final;
- struct ipv6_txoptions *opt;
- struct dst_entry *dst;
struct inet_sock *inet = inet_sk(sk);
struct ipv6_pinfo *np = inet6_sk(sk);
- struct flowi6 fl6;
+ struct ipv6_txoptions *opt;
+ struct in6_addr *final_p;
+ struct dst_entry *dst;
+ struct flowi6 *fl6;
int err = 0;
if (inet6_test_bit(SNDFLOW, sk) &&
@@ -86,14 +86,15 @@ int ip6_datagram_dst_update(struct sock *sk, bool fix_sk_saddr)
if (IS_ERR(flowlabel))
return -EINVAL;
}
- ip6_datagram_flow_key_init(&fl6, sk);
+ fl6 = &inet_sk(sk)->cork.fl.u.ip6;
+ ip6_datagram_flow_key_init(fl6, sk);
rcu_read_lock();
opt = flowlabel ? flowlabel->opt : rcu_dereference(np->opt);
- final_p = fl6_update_dst(&fl6, opt, &final);
+ final_p = fl6_update_dst(fl6, opt, &np->final);
rcu_read_unlock();
- dst = ip6_dst_lookup_flow(sock_net(sk), sk, &fl6, final_p);
+ dst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_p);
if (IS_ERR(dst)) {
err = PTR_ERR(dst);
goto out;
@@ -101,17 +102,17 @@ int ip6_datagram_dst_update(struct sock *sk, bool fix_sk_saddr)
if (fix_sk_saddr) {
if (ipv6_addr_any(&np->saddr))
- np->saddr = fl6.saddr;
+ np->saddr = fl6->saddr;
if (ipv6_addr_any(&sk->sk_v6_rcv_saddr)) {
- sk->sk_v6_rcv_saddr = fl6.saddr;
+ sk->sk_v6_rcv_saddr = fl6->saddr;
inet->inet_rcv_saddr = LOOPBACK4_IPV6;
if (sk->sk_prot->rehash)
sk->sk_prot->rehash(sk);
}
}
- ip6_sk_dst_store_flow(sk, dst, &fl6);
+ ip6_sk_dst_store_flow(sk, dst, fl6);
out:
fl6_sock_release(flowlabel);
diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
index a23eb8734e15..209fdf1b1aa9 100644
--- a/net/ipv6/exthdrs.c
+++ b/net/ipv6/exthdrs.c
@@ -314,7 +314,7 @@ fail_and_free:
}
extlen = (skb_transport_header(skb)[1] + 1) << 3;
- if (extlen > net->ipv6.sysctl.max_dst_opts_len)
+ if (extlen > READ_ONCE(net->ipv6.sysctl.max_dst_opts_len))
goto fail_and_free;
opt->lastopt = opt->dst1 = skb_network_header_len(skb);
@@ -322,7 +322,8 @@ fail_and_free:
dstbuf = opt->dst1;
#endif
- if (ip6_parse_tlv(false, skb, net->ipv6.sysctl.max_dst_opts_cnt)) {
+ if (ip6_parse_tlv(false, skb,
+ READ_ONCE(net->ipv6.sysctl.max_dst_opts_cnt))) {
skb->transport_header += extlen;
opt = IP6CB(skb);
#if IS_ENABLED(CONFIG_IPV6_MIP6)
@@ -1049,11 +1050,12 @@ fail_and_free:
}
extlen = (skb_transport_header(skb)[1] + 1) << 3;
- if (extlen > net->ipv6.sysctl.max_hbh_opts_len)
+ if (extlen > READ_ONCE(net->ipv6.sysctl.max_hbh_opts_len))
goto fail_and_free;
opt->flags |= IP6SKB_HOPBYHOP;
- if (ip6_parse_tlv(true, skb, net->ipv6.sysctl.max_hbh_opts_cnt)) {
+ if (ip6_parse_tlv(true, skb,
+ READ_ONCE(net->ipv6.sysctl.max_hbh_opts_cnt))) {
skb->transport_header += extlen;
opt = IP6CB(skb);
opt->nhoff = sizeof(struct ipv6hdr);
@@ -1072,9 +1074,9 @@ fail_and_free:
* for headers.
*/
-static void ipv6_push_rthdr0(struct sk_buff *skb, u8 *proto,
- struct ipv6_rt_hdr *opt,
- struct in6_addr **addr_p, struct in6_addr *saddr)
+static u8 ipv6_push_rthdr0(struct sk_buff *skb, u8 proto,
+ struct ipv6_rt_hdr *opt,
+ struct in6_addr **addr_p, struct in6_addr *saddr)
{
struct rt0_hdr *phdr, *ihdr;
int hops;
@@ -1093,13 +1095,13 @@ static void ipv6_push_rthdr0(struct sk_buff *skb, u8 *proto,
phdr->addr[hops - 1] = **addr_p;
*addr_p = ihdr->addr;
- phdr->rt_hdr.nexthdr = *proto;
- *proto = NEXTHDR_ROUTING;
+ phdr->rt_hdr.nexthdr = proto;
+ return NEXTHDR_ROUTING;
}
-static void ipv6_push_rthdr4(struct sk_buff *skb, u8 *proto,
- struct ipv6_rt_hdr *opt,
- struct in6_addr **addr_p, struct in6_addr *saddr)
+static u8 ipv6_push_rthdr4(struct sk_buff *skb, u8 proto,
+ struct ipv6_rt_hdr *opt,
+ struct in6_addr **addr_p, struct in6_addr *saddr)
{
struct ipv6_sr_hdr *sr_phdr, *sr_ihdr;
int plen, hops;
@@ -1142,58 +1144,61 @@ static void ipv6_push_rthdr4(struct sk_buff *skb, u8 *proto,
}
#endif
- sr_phdr->nexthdr = *proto;
- *proto = NEXTHDR_ROUTING;
+ sr_phdr->nexthdr = proto;
+ return NEXTHDR_ROUTING;
}
-static void ipv6_push_rthdr(struct sk_buff *skb, u8 *proto,
- struct ipv6_rt_hdr *opt,
- struct in6_addr **addr_p, struct in6_addr *saddr)
+static u8 ipv6_push_rthdr(struct sk_buff *skb, u8 proto,
+ struct ipv6_rt_hdr *opt,
+ struct in6_addr **addr_p, struct in6_addr *saddr)
{
switch (opt->type) {
case IPV6_SRCRT_TYPE_0:
case IPV6_SRCRT_STRICT:
case IPV6_SRCRT_TYPE_2:
- ipv6_push_rthdr0(skb, proto, opt, addr_p, saddr);
+ proto = ipv6_push_rthdr0(skb, proto, opt, addr_p, saddr);
break;
case IPV6_SRCRT_TYPE_4:
- ipv6_push_rthdr4(skb, proto, opt, addr_p, saddr);
+ proto = ipv6_push_rthdr4(skb, proto, opt, addr_p, saddr);
break;
default:
break;
}
+ return proto;
}
-static void ipv6_push_exthdr(struct sk_buff *skb, u8 *proto, u8 type, struct ipv6_opt_hdr *opt)
+static u8 ipv6_push_exthdr(struct sk_buff *skb, u8 proto, u8 type, struct ipv6_opt_hdr *opt)
{
struct ipv6_opt_hdr *h = skb_push(skb, ipv6_optlen(opt));
memcpy(h, opt, ipv6_optlen(opt));
- h->nexthdr = *proto;
- *proto = type;
+ h->nexthdr = proto;
+ return type;
}
-void ipv6_push_nfrag_opts(struct sk_buff *skb, struct ipv6_txoptions *opt,
- u8 *proto,
- struct in6_addr **daddr, struct in6_addr *saddr)
+u8 ipv6_push_nfrag_opts(struct sk_buff *skb, struct ipv6_txoptions *opt,
+ u8 proto,
+ struct in6_addr **daddr, struct in6_addr *saddr)
{
if (opt->srcrt) {
- ipv6_push_rthdr(skb, proto, opt->srcrt, daddr, saddr);
+ proto = ipv6_push_rthdr(skb, proto, opt->srcrt, daddr, saddr);
/*
* IPV6_RTHDRDSTOPTS is ignored
* unless IPV6_RTHDR is set (RFC3542).
*/
if (opt->dst0opt)
- ipv6_push_exthdr(skb, proto, NEXTHDR_DEST, opt->dst0opt);
+ proto = ipv6_push_exthdr(skb, proto, NEXTHDR_DEST, opt->dst0opt);
}
if (opt->hopopt)
- ipv6_push_exthdr(skb, proto, NEXTHDR_HOP, opt->hopopt);
+ proto = ipv6_push_exthdr(skb, proto, NEXTHDR_HOP, opt->hopopt);
+ return proto;
}
-void ipv6_push_frag_opts(struct sk_buff *skb, struct ipv6_txoptions *opt, u8 *proto)
+u8 ipv6_push_frag_opts(struct sk_buff *skb, struct ipv6_txoptions *opt, u8 proto)
{
if (opt->dst1opt)
- ipv6_push_exthdr(skb, proto, NEXTHDR_DEST, opt->dst1opt);
+ proto = ipv6_push_exthdr(skb, proto, NEXTHDR_DEST, opt->dst1opt);
+ return proto;
}
EXPORT_SYMBOL(ipv6_push_frag_opts);
@@ -1334,21 +1339,21 @@ struct ipv6_txoptions *__ipv6_fixup_options(struct ipv6_txoptions *opt_space,
EXPORT_SYMBOL_GPL(__ipv6_fixup_options);
/**
- * fl6_update_dst - update flowi destination address with info given
+ * __fl6_update_dst - update flowi destination address with info given
* by srcrt option, if any.
*
* @fl6: flowi6 for which daddr is to be updated
* @opt: struct ipv6_txoptions in which to look for srcrt opt
* @orig: copy of original daddr address if modified
*
- * Returns NULL if no txoptions or no srcrt, otherwise returns orig
+ * Return: NULL if no srcrt or invalid srcrt type, otherwise returns orig
* and initial value of fl6->daddr set in orig
*/
-struct in6_addr *fl6_update_dst(struct flowi6 *fl6,
- const struct ipv6_txoptions *opt,
- struct in6_addr *orig)
+struct in6_addr *__fl6_update_dst(struct flowi6 *fl6,
+ const struct ipv6_txoptions *opt,
+ struct in6_addr *orig)
{
- if (!opt || !opt->srcrt)
+ if (!opt->srcrt)
return NULL;
*orig = fl6->daddr;
@@ -1372,4 +1377,4 @@ struct in6_addr *fl6_update_dst(struct flowi6 *fl6,
return orig;
}
-EXPORT_SYMBOL_GPL(fl6_update_dst);
+EXPORT_SYMBOL_GPL(__fl6_update_dst);
diff --git a/net/ipv6/icmp.c b/net/ipv6/icmp.c
index 9d37e7711bc2..375ecd779fda 100644
--- a/net/ipv6/icmp.c
+++ b/net/ipv6/icmp.c
@@ -958,7 +958,8 @@ static enum skb_drop_reason icmpv6_echo_reply(struct sk_buff *skb)
tmp_hdr.icmp6_type = type;
memset(&fl6, 0, sizeof(fl6));
- if (net->ipv6.sysctl.flowlabel_reflect & FLOWLABEL_REFLECT_ICMPV6_ECHO_REPLIES)
+ if (READ_ONCE(net->ipv6.sysctl.flowlabel_reflect) &
+ FLOWLABEL_REFLECT_ICMPV6_ECHO_REPLIES)
fl6.flowlabel = ip6_flowlabel(ipv6_hdr(skb));
fl6.flowi6_proto = IPPROTO_ICMPV6;
@@ -1066,6 +1067,12 @@ enum skb_drop_reason icmpv6_notify(struct sk_buff *skb, u8 type,
if (reason != SKB_NOT_DROPPED_YET)
goto out;
+ if (nexthdr == IPPROTO_RAW) {
+ /* Add a more specific reason later ? */
+ reason = SKB_DROP_REASON_NOT_SPECIFIED;
+ goto out;
+ }
+
/* BUGGG_FUTURE: we should try to parse exthdrs in this packet.
Without this we will not able f.e. to make source routed
pmtu discovery.
diff --git a/net/ipv6/inet6_connection_sock.c b/net/ipv6/inet6_connection_sock.c
index ea5cf3fdfdd6..11fc2f7de2fe 100644
--- a/net/ipv6/inet6_connection_sock.c
+++ b/net/ipv6/inet6_connection_sock.c
@@ -25,14 +25,14 @@
#include <net/sock_reuseport.h>
struct dst_entry *inet6_csk_route_req(const struct sock *sk,
+ struct dst_entry *dst,
struct flowi6 *fl6,
const struct request_sock *req,
u8 proto)
{
- struct inet_request_sock *ireq = inet_rsk(req);
+ const struct inet_request_sock *ireq = inet_rsk(req);
const struct ipv6_pinfo *np = inet6_sk(sk);
struct in6_addr *final_p, final;
- struct dst_entry *dst;
memset(fl6, 0, sizeof(*fl6));
fl6->flowi6_proto = proto;
@@ -48,25 +48,20 @@ struct dst_entry *inet6_csk_route_req(const struct sock *sk,
fl6->flowi6_uid = sk_uid(sk);
security_req_classify_flow(req, flowi6_to_flowi_common(fl6));
- dst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_p);
- if (IS_ERR(dst))
- return NULL;
-
+ if (!dst) {
+ dst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_p);
+ if (IS_ERR(dst))
+ return NULL;
+ }
return dst;
}
-static inline
-struct dst_entry *__inet6_csk_dst_check(struct sock *sk, u32 cookie)
-{
- return __sk_dst_check(sk, cookie);
-}
-
static struct dst_entry *inet6_csk_route_socket(struct sock *sk,
struct flowi6 *fl6)
{
struct inet_sock *inet = inet_sk(sk);
struct ipv6_pinfo *np = inet6_sk(sk);
- struct in6_addr *final_p, final;
+ struct in6_addr *final_p;
struct dst_entry *dst;
memset(fl6, 0, sizeof(*fl6));
@@ -83,41 +78,41 @@ static struct dst_entry *inet6_csk_route_socket(struct sock *sk,
security_sk_classify_flow(sk, flowi6_to_flowi_common(fl6));
rcu_read_lock();
- final_p = fl6_update_dst(fl6, rcu_dereference(np->opt), &final);
+ final_p = fl6_update_dst(fl6, rcu_dereference(np->opt), &np->final);
rcu_read_unlock();
- dst = __inet6_csk_dst_check(sk, np->dst_cookie);
- if (!dst) {
- dst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_p);
+ dst = ip6_dst_lookup_flow(sock_net(sk), sk, fl6, final_p);
+
+ if (!IS_ERR(dst))
+ ip6_dst_store(sk, dst, false, false);
- if (!IS_ERR(dst))
- ip6_dst_store(sk, dst, false, false);
- }
return dst;
}
int inet6_csk_xmit(struct sock *sk, struct sk_buff *skb, struct flowi *fl_unused)
{
+ struct flowi6 *fl6 = &inet_sk(sk)->cork.fl.u.ip6;
struct ipv6_pinfo *np = inet6_sk(sk);
- struct flowi6 fl6;
struct dst_entry *dst;
int res;
- dst = inet6_csk_route_socket(sk, &fl6);
- if (IS_ERR(dst)) {
- WRITE_ONCE(sk->sk_err_soft, -PTR_ERR(dst));
- sk->sk_route_caps = 0;
- kfree_skb(skb);
- return PTR_ERR(dst);
+ dst = __sk_dst_check(sk, np->dst_cookie);
+ if (unlikely(!dst)) {
+ dst = inet6_csk_route_socket(sk, fl6);
+ if (IS_ERR(dst)) {
+ WRITE_ONCE(sk->sk_err_soft, -PTR_ERR(dst));
+ sk->sk_route_caps = 0;
+ kfree_skb(skb);
+ return PTR_ERR(dst);
+ }
+ /* Restore final destination back after routing done */
+ fl6->daddr = sk->sk_v6_daddr;
}
rcu_read_lock();
skb_dst_set_noref(skb, dst);
- /* Restore final destination back after routing done */
- fl6.daddr = sk->sk_v6_daddr;
-
- res = ip6_xmit(sk, skb, &fl6, sk->sk_mark, rcu_dereference(np->opt),
+ res = ip6_xmit(sk, skb, fl6, sk->sk_mark, rcu_dereference(np->opt),
np->tclass, READ_ONCE(sk->sk_priority));
rcu_read_unlock();
return res;
@@ -126,13 +121,15 @@ EXPORT_SYMBOL_GPL(inet6_csk_xmit);
struct dst_entry *inet6_csk_update_pmtu(struct sock *sk, u32 mtu)
{
- struct flowi6 fl6;
- struct dst_entry *dst = inet6_csk_route_socket(sk, &fl6);
+ struct flowi6 *fl6 = &inet_sk(sk)->cork.fl.u.ip6;
+ struct dst_entry *dst;
+
+ dst = inet6_csk_route_socket(sk, fl6);
if (IS_ERR(dst))
return NULL;
dst->ops->update_pmtu(dst, sk, NULL, mtu, true);
- dst = inet6_csk_route_socket(sk, &fl6);
+ dst = inet6_csk_route_socket(sk, fl6);
return IS_ERR(dst) ? NULL : dst;
}
diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c
index c6439e30e892..9880d608392b 100644
--- a/net/ipv6/ip6_fib.c
+++ b/net/ipv6/ip6_fib.c
@@ -1375,14 +1375,14 @@ static void fib6_start_gc(struct net *net, struct fib6_info *rt)
if (!timer_pending(&net->ipv6.ip6_fib_timer) &&
(rt->fib6_flags & RTF_EXPIRES))
mod_timer(&net->ipv6.ip6_fib_timer,
- jiffies + net->ipv6.sysctl.ip6_rt_gc_interval);
+ jiffies + READ_ONCE(net->ipv6.sysctl.ip6_rt_gc_interval));
}
void fib6_force_start_gc(struct net *net)
{
if (!timer_pending(&net->ipv6.ip6_fib_timer))
mod_timer(&net->ipv6.ip6_fib_timer,
- jiffies + net->ipv6.sysctl.ip6_rt_gc_interval);
+ jiffies + READ_ONCE(net->ipv6.sysctl.ip6_rt_gc_interval));
}
static void __fib6_update_sernum_upto_root(struct fib6_info *rt,
@@ -2414,6 +2414,7 @@ static void fib6_gc_all(struct net *net, struct fib6_gc_args *gc_args)
void fib6_run_gc(unsigned long expires, struct net *net, bool force)
{
struct fib6_gc_args gc_args;
+ int ip6_rt_gc_interval;
unsigned long now;
if (force) {
@@ -2422,8 +2423,8 @@ void fib6_run_gc(unsigned long expires, struct net *net, bool force)
mod_timer(&net->ipv6.ip6_fib_timer, jiffies + HZ);
return;
}
- gc_args.timeout = expires ? (int)expires :
- net->ipv6.sysctl.ip6_rt_gc_interval;
+ ip6_rt_gc_interval = READ_ONCE(net->ipv6.sysctl.ip6_rt_gc_interval);
+ gc_args.timeout = expires ? (int)expires : ip6_rt_gc_interval;
gc_args.more = 0;
fib6_gc_all(net, &gc_args);
@@ -2432,8 +2433,7 @@ void fib6_run_gc(unsigned long expires, struct net *net, bool force)
if (gc_args.more)
mod_timer(&net->ipv6.ip6_fib_timer,
- round_jiffies(now
- + net->ipv6.sysctl.ip6_rt_gc_interval));
+ round_jiffies(now + ip6_rt_gc_interval));
else
timer_delete(&net->ipv6.ip6_fib_timer);
spin_unlock_bh(&net->ipv6.fib6_gc_lock);
diff --git a/net/ipv6/ip6_gre.c b/net/ipv6/ip6_gre.c
index d19d86ed4376..dafcc0dcd77a 100644
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -1057,7 +1057,7 @@ static netdev_tx_t ip6erspan_tunnel_xmit(struct sk_buff *skb,
/* TooBig packet may have updated dst->dev's mtu */
if (!t->parms.collect_md && dst) {
mtu = READ_ONCE(dst_dev(dst)->mtu);
- if (dst_mtu(dst) > mtu)
+ if (dst6_mtu(dst) > mtu)
dst->ops->update_pmtu(dst, NULL, skb, mtu, false);
}
err = ip6_tnl_xmit(skb, dev, dsfield, &fl6, encap_limit, &mtu,
diff --git a/net/ipv6/ip6_input.c b/net/ipv6/ip6_input.c
index 168ec07e31cc..2bcb981c91aa 100644
--- a/net/ipv6/ip6_input.c
+++ b/net/ipv6/ip6_input.c
@@ -262,7 +262,7 @@ static struct sk_buff *ip6_rcv_core(struct sk_buff *skb, struct net_device *dev,
skb->transport_header = skb->network_header + sizeof(*hdr);
IP6CB(skb)->nhoff = offsetof(struct ipv6hdr, nexthdr);
- pkt_len = ntohs(hdr->payload_len);
+ pkt_len = ipv6_payload_len(skb, hdr);
/* pkt_len may be zero if Jumbo payload option is present */
if (pkt_len || hdr->nexthdr != NEXTHDR_HOP) {
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index fce91183797a..bd7f780e37a5 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -19,23 +19,7 @@
#include <net/gso.h>
#include "ip6_offload.h"
-
-/* All GRO functions are always builtin, except UDP over ipv6, which lays in
- * ipv6 module, as it depends on UDPv6 lookup function, so we need special care
- * when ipv6 is built as a module
- */
-#if IS_BUILTIN(CONFIG_IPV6)
-#define INDIRECT_CALL_L4(f, f2, f1, ...) INDIRECT_CALL_2(f, f2, f1, __VA_ARGS__)
-#else
-#define INDIRECT_CALL_L4(f, f2, f1, ...) INDIRECT_CALL_1(f, f2, __VA_ARGS__)
-#endif
-
-#define indirect_call_gro_receive_l4(f2, f1, cb, head, skb) \
-({ \
- unlikely(gro_recursion_inc_test(skb)) ? \
- NAPI_GRO_CB(skb)->flush |= 1, NULL : \
- INDIRECT_CALL_L4(cb, f2, f1, head, skb); \
-})
+#include "tcpv6_offload.c"
static int ipv6_gro_pull_exthdrs(struct sk_buff *skb, int off, int proto)
{
@@ -110,7 +94,7 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
struct sk_buff *segs = ERR_PTR(-EINVAL);
struct ipv6hdr *ipv6h;
const struct net_offload *ops;
- int proto, err;
+ int proto;
struct frag_hdr *fptr;
unsigned int payload_len;
u8 *prevhdr;
@@ -120,9 +104,6 @@ static struct sk_buff *ipv6_gso_segment(struct sk_buff *skb,
bool gso_partial;
skb_reset_network_header(skb);
- err = ipv6_hopopt_jumbo_remove(skb);
- if (err)
- return ERR_PTR(err);
nhoff = skb_network_header(skb) - skb_mac_header(skb);
if (unlikely(!pskb_may_pull(skb, sizeof(*ipv6h))))
goto out;
@@ -298,9 +279,19 @@ not_same_flow:
skb_gro_postpull_rcsum(skb, iph, nlen);
- pp = indirect_call_gro_receive_l4(tcp6_gro_receive, udp6_gro_receive,
- ops->callbacks.gro_receive, head, skb);
+ if (unlikely(gro_recursion_inc_test(skb))) {
+ flush = 1;
+ goto out;
+ }
+ if (likely(proto == IPPROTO_TCP))
+ pp = tcp6_gro_receive(head, skb);
+#if IS_BUILTIN(CONFIG_IPV6)
+ else if (likely(proto == IPPROTO_UDP))
+ pp = udp6_gro_receive(head, skb);
+#endif
+ else
+ pp = ops->callbacks.gro_receive(head, skb);
out:
skb_gro_flush_final(skb, pp, flush);
@@ -342,48 +333,28 @@ INDIRECT_CALLABLE_SCOPE int ipv6_gro_complete(struct sk_buff *skb, int nhoff)
const struct net_offload *ops;
struct ipv6hdr *iph;
int err = -ENOSYS;
- u32 payload_len;
if (skb->encapsulation) {
skb_set_inner_protocol(skb, cpu_to_be16(ETH_P_IPV6));
skb_set_inner_network_header(skb, nhoff);
}
- payload_len = skb->len - nhoff - sizeof(*iph);
- if (unlikely(payload_len > IPV6_MAXPLEN)) {
- struct hop_jumbo_hdr *hop_jumbo;
- int hoplen = sizeof(*hop_jumbo);
-
- /* Move network header left */
- memmove(skb_mac_header(skb) - hoplen, skb_mac_header(skb),
- skb->transport_header - skb->mac_header);
- skb->data -= hoplen;
- skb->len += hoplen;
- skb->mac_header -= hoplen;
- skb->network_header -= hoplen;
- iph = (struct ipv6hdr *)(skb->data + nhoff);
- hop_jumbo = (struct hop_jumbo_hdr *)(iph + 1);
-
- /* Build hop-by-hop options */
- hop_jumbo->nexthdr = iph->nexthdr;
- hop_jumbo->hdrlen = 0;
- hop_jumbo->tlv_type = IPV6_TLV_JUMBO;
- hop_jumbo->tlv_len = 4;
- hop_jumbo->jumbo_payload_len = htonl(payload_len + hoplen);
-
- iph->nexthdr = NEXTHDR_HOP;
- iph->payload_len = 0;
- } else {
- iph = (struct ipv6hdr *)(skb->data + nhoff);
- iph->payload_len = htons(payload_len);
- }
+ iph = (struct ipv6hdr *)(skb->data + nhoff);
+ ipv6_set_payload_len(iph, skb->len - nhoff - sizeof(*iph));
nhoff += sizeof(*iph) + ipv6_exthdrs_len(iph, &ops);
+
+ if (likely(ops == &net_hotdata.tcpv6_offload))
+ return tcp6_gro_complete(skb, nhoff);
+#if IS_BUILTIN(CONFIG_IPV6)
+ if (ops == &net_hotdata.udpv6_offload)
+ return udp6_gro_complete(skb, nhoff);
+#endif
+
if (WARN_ON(!ops || !ops->callbacks.gro_complete))
goto out;
- err = INDIRECT_CALL_L4(ops->callbacks.gro_complete, tcp6_gro_complete,
- udp6_gro_complete, skb, nhoff);
+ err = ops->callbacks.gro_complete(skb, nhoff);
out:
return err;
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
index f904739e99b9..769c39fed1f3 100644
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -80,7 +80,7 @@ static int ip6_finish_output2(struct net *net, struct sock *sk, struct sk_buff *
hdr = ipv6_hdr(skb);
daddr = &hdr->daddr;
- if (ipv6_addr_is_multicast(daddr)) {
+ if (unlikely(ipv6_addr_is_multicast(daddr))) {
if (!(dev->flags & IFF_LOOPBACK) && sk_mc_loop(sk) &&
((mroute6_is_socket(net, skb) &&
!(IP6CB(skb)->flags & IP6SKB_FORWARDED)) ||
@@ -179,8 +179,7 @@ ip6_finish_output_gso_slowpath_drop(struct net *net, struct sock *sk,
static int ip6_finish_output_gso(struct net *net, struct sock *sk,
struct sk_buff *skb, unsigned int mtu)
{
- if (!(IP6CB(skb)->flags & IP6SKB_FAKEJUMBO) &&
- !skb_gso_validate_network_len(skb, mtu))
+ if (unlikely(!skb_gso_validate_network_len(skb, mtu)))
return ip6_finish_output_gso_slowpath_drop(net, sk, skb, mtu);
return ip6_finish_output2(net, sk, skb);
@@ -202,8 +201,8 @@ static int __ip6_finish_output(struct net *net, struct sock *sk, struct sk_buff
if (skb_is_gso(skb))
return ip6_finish_output_gso(net, sk, skb, mtu);
- if (skb->len > mtu ||
- (IP6CB(skb)->frag_max_size && skb->len > IP6CB(skb)->frag_max_size))
+ if (unlikely(skb->len > mtu ||
+ (IP6CB(skb)->frag_max_size && skb->len > IP6CB(skb)->frag_max_size)))
return ip6_fragment(net, sk, skb, ip6_finish_output2);
return ip6_finish_output2(net, sk, skb);
@@ -273,8 +272,6 @@ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
struct in6_addr *first_hop = &fl6->daddr;
struct dst_entry *dst = skb_dst(skb);
struct inet6_dev *idev = ip6_dst_idev(dst);
- struct hop_jumbo_hdr *hop_jumbo;
- int hoplen = sizeof(*hop_jumbo);
struct net *net = sock_net(sk);
unsigned int head_room;
struct net_device *dev;
@@ -287,7 +284,7 @@ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
rcu_read_lock();
dev = dst_dev_rcu(dst);
- head_room = sizeof(struct ipv6hdr) + hoplen + LL_RESERVED_SPACE(dev);
+ head_room = sizeof(struct ipv6hdr) + LL_RESERVED_SPACE(dev);
if (opt)
head_room += opt->opt_nflen + opt->opt_flen;
@@ -301,32 +298,22 @@ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
}
}
- if (opt) {
+ if (unlikely(opt)) {
seg_len += opt->opt_nflen + opt->opt_flen;
if (opt->opt_flen)
- ipv6_push_frag_opts(skb, opt, &proto);
+ proto = ipv6_push_frag_opts(skb, opt, proto);
if (opt->opt_nflen)
- ipv6_push_nfrag_opts(skb, opt, &proto, &first_hop,
- &fl6->saddr);
+ proto = ipv6_push_nfrag_opts(skb, opt, proto,
+ &first_hop,
+ &fl6->saddr);
}
- if (unlikely(seg_len > IPV6_MAXPLEN)) {
- hop_jumbo = skb_push(skb, hoplen);
-
- hop_jumbo->nexthdr = proto;
- hop_jumbo->hdrlen = 0;
- hop_jumbo->tlv_type = IPV6_TLV_JUMBO;
- hop_jumbo->tlv_len = 4;
- hop_jumbo->jumbo_payload_len = htonl(seg_len + hoplen);
-
- proto = IPPROTO_HOPOPTS;
+ if (unlikely(seg_len > IPV6_MAXPLEN))
seg_len = 0;
- IP6CB(skb)->flags |= IP6SKB_FAKEJUMBO;
- }
- skb_push(skb, sizeof(struct ipv6hdr));
+ __skb_push(skb, sizeof(struct ipv6hdr));
skb_reset_network_header(skb);
hdr = ipv6_hdr(skb);
@@ -352,8 +339,8 @@ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
skb->priority = priority;
skb->mark = mark;
- mtu = dst_mtu(dst);
- if ((skb->len <= mtu) || skb->ignore_df || skb_is_gso(skb)) {
+ mtu = dst6_mtu(dst);
+ if (likely((skb->len <= mtu) || skb->ignore_df || skb_is_gso(skb))) {
IP6_INC_STATS(net, idev, IPSTATS_MIB_OUTREQUESTS);
/* if egress device is enslaved to an L3 master device pass the
@@ -382,7 +369,7 @@ int ip6_xmit(const struct sock *sk, struct sk_buff *skb, struct flowi6 *fl6,
ipv6_local_error((struct sock *)sk, EMSGSIZE, fl6, mtu);
IP6_INC_STATS(net, idev, IPSTATS_MIB_FRAGFAILS);
- kfree_skb(skb);
+ kfree_skb_reason(skb, SKB_DROP_REASON_PKT_TOO_BIG);
unlock:
rcu_read_unlock();
return ret;
@@ -653,7 +640,7 @@ int ip6_forward(struct sk_buff *skb)
if (mtu < IPV6_MIN_MTU)
mtu = IPV6_MIN_MTU;
- if (ip6_pkt_too_big(skb, mtu)) {
+ if (unlikely(ip6_pkt_too_big(skb, mtu))) {
/* Again, force OUTPUT device used as source address */
skb->dev = dev;
icmpv6_send(skb, ICMPV6_PKT_TOOBIG, 0, mtu);
@@ -1352,12 +1339,13 @@ static void ip6_append_data_mtu(unsigned int *mtu,
}
static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
- struct inet6_cork *v6_cork, struct ipcm6_cookie *ipc6,
+ struct ipcm6_cookie *ipc6,
struct rt6_info *rt)
{
+ struct ipv6_txoptions *nopt, *opt = ipc6->opt;
+ struct inet6_cork *v6_cork = &cork->base6;
struct ipv6_pinfo *np = inet6_sk(sk);
unsigned int mtu, frag_size;
- struct ipv6_txoptions *nopt, *opt = ipc6->opt;
/* callers pass dst together with a reference, set it first so
* ip6_cork_release() can put it down even in case of an error.
@@ -1367,7 +1355,7 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
/*
* setup for corking
*/
- if (opt) {
+ if (unlikely(opt)) {
if (WARN_ON(v6_cork->opt))
return -EINVAL;
@@ -1402,10 +1390,10 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
v6_cork->dontfrag = ipc6->dontfrag;
if (rt->dst.flags & DST_XFRM_TUNNEL)
mtu = READ_ONCE(np->pmtudisc) >= IPV6_PMTUDISC_PROBE ?
- READ_ONCE(rt->dst.dev->mtu) : dst_mtu(&rt->dst);
+ READ_ONCE(rt->dst.dev->mtu) : dst6_mtu(&rt->dst);
else
mtu = READ_ONCE(np->pmtudisc) >= IPV6_PMTUDISC_PROBE ?
- READ_ONCE(rt->dst.dev->mtu) : dst_mtu(xfrm_dst_path(&rt->dst));
+ READ_ONCE(rt->dst.dev->mtu) : dst6_mtu(xfrm_dst_path(&rt->dst));
frag_size = READ_ONCE(np->frag_size);
if (frag_size && frag_size < mtu)
@@ -1430,17 +1418,17 @@ static int ip6_setup_cork(struct sock *sk, struct inet_cork_full *cork,
static int __ip6_append_data(struct sock *sk,
struct sk_buff_head *queue,
struct inet_cork_full *cork_full,
- struct inet6_cork *v6_cork,
struct page_frag *pfrag,
int getfrag(void *from, char *to, int offset,
int len, int odd, struct sk_buff *skb),
void *from, size_t length, int transhdrlen,
unsigned int flags)
{
- struct sk_buff *skb, *skb_prev = NULL;
+ unsigned int maxfraglen, fragheaderlen, mtu, orig_mtu, pmtu;
+ struct inet6_cork *v6_cork = &cork_full->base6;
struct inet_cork *cork = &cork_full->base;
struct flowi6 *fl6 = &cork_full->fl.u.ip6;
- unsigned int maxfraglen, fragheaderlen, mtu, orig_mtu, pmtu;
+ struct sk_buff *skb, *skb_prev = NULL;
struct ubuf_info *uarg = NULL;
int exthdrlen = 0;
int dst_exthdrlen = 0;
@@ -1843,7 +1831,6 @@ int ip6_append_data(struct sock *sk,
struct rt6_info *rt, unsigned int flags)
{
struct inet_sock *inet = inet_sk(sk);
- struct ipv6_pinfo *np = inet6_sk(sk);
int exthdrlen;
int err;
@@ -1854,7 +1841,7 @@ int ip6_append_data(struct sock *sk,
* setup for corking
*/
dst_hold(&rt->dst);
- err = ip6_setup_cork(sk, &inet->cork, &np->cork,
+ err = ip6_setup_cork(sk, &inet->cork,
ipc6, rt);
if (err)
return err;
@@ -1868,7 +1855,7 @@ int ip6_append_data(struct sock *sk,
}
return __ip6_append_data(sk, &sk->sk_write_queue, &inet->cork,
- &np->cork, sk_page_frag(sk), getfrag,
+ sk_page_frag(sk), getfrag,
from, length, transhdrlen, flags);
}
EXPORT_SYMBOL_GPL(ip6_append_data);
@@ -1881,10 +1868,11 @@ static void ip6_cork_steal_dst(struct sk_buff *skb, struct inet_cork_full *cork)
skb_dst_set(skb, dst);
}
-static void ip6_cork_release(struct inet_cork_full *cork,
- struct inet6_cork *v6_cork)
+static void ip6_cork_release(struct inet_cork_full *cork)
{
- if (v6_cork->opt) {
+ struct inet6_cork *v6_cork = &cork->base6;
+
+ if (unlikely(v6_cork->opt)) {
struct ipv6_txoptions *opt = v6_cork->opt;
kfree(opt->dst0opt);
@@ -1903,15 +1891,14 @@ static void ip6_cork_release(struct inet_cork_full *cork,
struct sk_buff *__ip6_make_skb(struct sock *sk,
struct sk_buff_head *queue,
- struct inet_cork_full *cork,
- struct inet6_cork *v6_cork)
+ struct inet_cork_full *cork)
{
struct sk_buff *skb, *tmp_skb;
struct sk_buff **tail_skb;
struct in6_addr *final_dst;
struct net *net = sock_net(sk);
struct ipv6hdr *hdr;
- struct ipv6_txoptions *opt = v6_cork->opt;
+ struct ipv6_txoptions *opt;
struct rt6_info *rt = dst_rt6_info(cork->base.dst);
struct flowi6 *fl6 = &cork->fl.u.ip6;
unsigned char proto = fl6->flowi6_proto;
@@ -1940,19 +1927,22 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
__skb_pull(skb, skb_network_header_len(skb));
final_dst = &fl6->daddr;
- if (opt && opt->opt_flen)
- ipv6_push_frag_opts(skb, opt, &proto);
- if (opt && opt->opt_nflen)
- ipv6_push_nfrag_opts(skb, opt, &proto, &final_dst, &fl6->saddr);
-
+ opt = cork->base6.opt;
+ if (unlikely(opt)) {
+ if (opt->opt_flen)
+ proto = ipv6_push_frag_opts(skb, opt, proto);
+ if (opt->opt_nflen)
+ proto = ipv6_push_nfrag_opts(skb, opt, proto,
+ &final_dst, &fl6->saddr);
+ }
skb_push(skb, sizeof(struct ipv6hdr));
skb_reset_network_header(skb);
hdr = ipv6_hdr(skb);
- ip6_flow_hdr(hdr, v6_cork->tclass,
+ ip6_flow_hdr(hdr, cork->base6.tclass,
ip6_make_flowlabel(net, skb, fl6->flowlabel,
ip6_autoflowlabel(net, sk), fl6));
- hdr->hop_limit = v6_cork->hop_limit;
+ hdr->hop_limit = cork->base6.hop_limit;
hdr->nexthdr = proto;
hdr->saddr = fl6->saddr;
hdr->daddr = *final_dst;
@@ -1966,7 +1956,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
ip6_cork_steal_dst(skb, cork);
IP6_INC_STATS(net, rt->rt6i_idev, IPSTATS_MIB_OUTREQUESTS);
- if (proto == IPPROTO_ICMPV6) {
+ if (unlikely(proto == IPPROTO_ICMPV6)) {
struct inet6_dev *idev = ip6_dst_idev(skb_dst(skb));
u8 icmp6_type;
@@ -1979,7 +1969,7 @@ struct sk_buff *__ip6_make_skb(struct sock *sk,
ICMP6_INC_STATS(net, idev, ICMP6_MIB_OUTMSGS);
}
- ip6_cork_release(cork, v6_cork);
+ ip6_cork_release(cork);
out:
return skb;
}
@@ -2018,8 +2008,7 @@ EXPORT_SYMBOL_GPL(ip6_push_pending_frames);
static void __ip6_flush_pending_frames(struct sock *sk,
struct sk_buff_head *queue,
- struct inet_cork_full *cork,
- struct inet6_cork *v6_cork)
+ struct inet_cork_full *cork)
{
struct sk_buff *skb;
@@ -2030,13 +2019,13 @@ static void __ip6_flush_pending_frames(struct sock *sk,
kfree_skb(skb);
}
- ip6_cork_release(cork, v6_cork);
+ ip6_cork_release(cork);
}
void ip6_flush_pending_frames(struct sock *sk)
{
__ip6_flush_pending_frames(sk, &sk->sk_write_queue,
- &inet_sk(sk)->cork, &inet6_sk(sk)->cork);
+ &inet_sk(sk)->cork);
}
EXPORT_SYMBOL_GPL(ip6_flush_pending_frames);
@@ -2047,9 +2036,8 @@ struct sk_buff *ip6_make_skb(struct sock *sk,
struct ipcm6_cookie *ipc6, struct rt6_info *rt,
unsigned int flags, struct inet_cork_full *cork)
{
- struct inet6_cork v6_cork;
- struct sk_buff_head queue;
int exthdrlen = (ipc6->opt ? ipc6->opt->opt_flen : 0);
+ struct sk_buff_head queue;
int err;
if (flags & MSG_PROBE) {
@@ -2062,21 +2050,21 @@ struct sk_buff *ip6_make_skb(struct sock *sk,
cork->base.flags = 0;
cork->base.addr = 0;
cork->base.opt = NULL;
- v6_cork.opt = NULL;
- err = ip6_setup_cork(sk, cork, &v6_cork, ipc6, rt);
+ cork->base6.opt = NULL;
+ err = ip6_setup_cork(sk, cork, ipc6, rt);
if (err) {
- ip6_cork_release(cork, &v6_cork);
+ ip6_cork_release(cork);
return ERR_PTR(err);
}
- err = __ip6_append_data(sk, &queue, cork, &v6_cork,
+ err = __ip6_append_data(sk, &queue, cork,
&current->task_frag, getfrag, from,
length + exthdrlen, transhdrlen + exthdrlen,
flags);
if (err) {
- __ip6_flush_pending_frames(sk, &queue, cork, &v6_cork);
+ __ip6_flush_pending_frames(sk, &queue, cork);
return ERR_PTR(err);
}
- return __ip6_make_skb(sk, &queue, cork, &v6_cork);
+ return __ip6_make_skb(sk, &queue, cork);
}
diff --git a/net/ipv6/ip6_tunnel.c b/net/ipv6/ip6_tunnel.c
index c1f39735a236..4c29aa94e86e 100644
--- a/net/ipv6/ip6_tunnel.c
+++ b/net/ipv6/ip6_tunnel.c
@@ -638,7 +638,7 @@ ip4ip6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
/* change mtu on this route */
if (rel_type == ICMP_DEST_UNREACH && rel_code == ICMP_FRAG_NEEDED) {
- if (rel_info > dst_mtu(skb_dst(skb2)))
+ if (rel_info > dst6_mtu(skb_dst(skb2)))
goto out;
skb_dst_update_pmtu_no_confirm(skb2, rel_info);
@@ -1187,7 +1187,7 @@ route_lookup:
t->parms.name);
goto tx_err_dst_release;
}
- mtu = dst_mtu(dst) - eth_hlen - psh_hlen - t->tun_hlen;
+ mtu = dst6_mtu(dst) - eth_hlen - psh_hlen - t->tun_hlen;
if (encap_limit >= 0) {
max_headroom += 8;
mtu -= 8;
@@ -1265,7 +1265,7 @@ route_lookup:
if (encap_limit >= 0) {
init_tel_txopt(&opt, encap_limit);
- ipv6_push_frag_opts(skb, &opt.ops, &proto);
+ proto = ipv6_push_frag_opts(skb, &opt.ops, proto);
}
skb_push(skb, sizeof(struct ipv6hdr));
@@ -1828,6 +1828,32 @@ int ip6_tnl_encap_setup(struct ip6_tnl *t,
}
EXPORT_SYMBOL_GPL(ip6_tnl_encap_setup);
+static int ip6_tnl_fill_forward_path(struct net_device_path_ctx *ctx,
+ struct net_device_path *path)
+{
+ struct ip6_tnl *t = netdev_priv(ctx->dev);
+ struct flowi6 fl6 = {
+ .daddr = t->parms.raddr,
+ };
+ struct dst_entry *dst;
+ int err;
+
+ dst = ip6_route_output(dev_net(ctx->dev), NULL, &fl6);
+ if (!dst->error) {
+ path->type = DEV_PATH_TUN;
+ path->tun.src_v6 = t->parms.laddr;
+ path->tun.dst_v6 = t->parms.raddr;
+ path->tun.l3_proto = IPPROTO_IPV6;
+ path->dev = ctx->dev;
+ ctx->dev = dst->dev;
+ }
+
+ err = dst->error;
+ dst_release(dst);
+
+ return err;
+}
+
static const struct net_device_ops ip6_tnl_netdev_ops = {
.ndo_init = ip6_tnl_dev_init,
.ndo_uninit = ip6_tnl_dev_uninit,
@@ -1836,6 +1862,7 @@ static const struct net_device_ops ip6_tnl_netdev_ops = {
.ndo_change_mtu = ip6_tnl_change_mtu,
.ndo_get_stats64 = dev_get_tstats64,
.ndo_get_iflink = ip6_tnl_get_iflink,
+ .ndo_fill_forward_path = ip6_tnl_fill_forward_path,
};
#define IPXIPX_FEATURES (NETIF_F_SG | \
diff --git a/net/ipv6/ipv6_sockglue.c b/net/ipv6/ipv6_sockglue.c
index a61e742794f9..d784a8644ff2 100644
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -1184,7 +1184,7 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
rcu_read_lock();
dst = __sk_dst_get(sk);
if (dst)
- val = dst_mtu(dst);
+ val = dst6_mtu(dst);
rcu_read_unlock();
if (!val)
return -ENOTCONN;
@@ -1283,7 +1283,7 @@ int do_ipv6_getsockopt(struct sock *sk, int level, int optname,
rcu_read_lock();
dst = __sk_dst_get(sk);
if (dst)
- mtuinfo.ip6m_mtu = dst_mtu(dst);
+ mtuinfo.ip6m_mtu = dst6_mtu(dst);
rcu_read_unlock();
if (!mtuinfo.ip6m_mtu)
return -ENOTCONN;
diff --git a/net/ipv6/output_core.c b/net/ipv6/output_core.c
index 1c9b283a4132..cba1684a3f30 100644
--- a/net/ipv6/output_core.c
+++ b/net/ipv6/output_core.c
@@ -125,12 +125,7 @@ EXPORT_SYMBOL(ip6_dst_hoplimit);
int __ip6_local_out(struct net *net, struct sock *sk, struct sk_buff *skb)
{
- int len;
-
- len = skb->len - sizeof(struct ipv6hdr);
- if (len > IPV6_MAXPLEN)
- len = 0;
- ipv6_hdr(skb)->payload_len = htons(len);
+ ipv6_set_payload_len(ipv6_hdr(skb), skb->len - sizeof(struct ipv6hdr));
IP6CB(skb)->nhoff = offsetof(struct ipv6hdr, nexthdr);
/* if egress device is enslaved to an L3 master device pass the
diff --git a/net/ipv6/raw.c b/net/ipv6/raw.c
index b4cd05dba9b6..27a268059168 100644
--- a/net/ipv6/raw.c
+++ b/net/ipv6/raw.c
@@ -90,23 +90,24 @@ EXPORT_SYMBOL_GPL(raw_v6_match);
* 0 - deliver
* 1 - block
*/
-static int icmpv6_filter(const struct sock *sk, const struct sk_buff *skb)
+static int icmpv6_filter(const struct sock *sk, struct sk_buff *skb)
{
- struct icmp6hdr _hdr;
const struct icmp6hdr *hdr;
+ const __u32 *data;
+ unsigned int type;
/* We require only the four bytes of the ICMPv6 header, not any
* additional bytes of message body in "struct icmp6hdr".
*/
- hdr = skb_header_pointer(skb, skb_transport_offset(skb),
- ICMPV6_HDRLEN, &_hdr);
- if (hdr) {
- const __u32 *data = &raw6_sk(sk)->filter.data[0];
- unsigned int type = hdr->icmp6_type;
+ if (!pskb_may_pull(skb, ICMPV6_HDRLEN))
+ return 1;
- return (data[type >> 5] & (1U << (type & 31))) != 0;
- }
- return 1;
+ hdr = (struct icmp6hdr *)skb->data;
+ type = hdr->icmp6_type;
+
+ data = &raw6_sk(sk)->filter.data[0];
+
+ return (data[type >> 5] & (1U << (type & 31))) != 0;
}
#if IS_ENABLED(CONFIG_IPV6_MIP6)
@@ -141,15 +142,13 @@ EXPORT_SYMBOL(rawv6_mh_filter_unregister);
static bool ipv6_raw_deliver(struct sk_buff *skb, int nexthdr)
{
struct net *net = dev_net(skb->dev);
- const struct in6_addr *saddr;
- const struct in6_addr *daddr;
+ const struct ipv6hdr *ip6h;
struct hlist_head *hlist;
- struct sock *sk;
bool delivered = false;
+ struct sock *sk;
__u8 hash;
- saddr = &ipv6_hdr(skb)->saddr;
- daddr = saddr + 1;
+ ip6h = ipv6_hdr(skb);
hash = raw_hashfunc(net, nexthdr);
hlist = &raw_v6_hashinfo.ht[hash];
@@ -157,7 +156,7 @@ static bool ipv6_raw_deliver(struct sk_buff *skb, int nexthdr)
sk_for_each_rcu(sk, hlist) {
int filtered;
- if (!raw_v6_match(net, sk, nexthdr, daddr, saddr,
+ if (!raw_v6_match(net, sk, nexthdr, &ip6h->daddr, &ip6h->saddr,
inet6_iif(skb), inet6_sdif(skb)))
continue;
@@ -171,6 +170,7 @@ static bool ipv6_raw_deliver(struct sk_buff *skb, int nexthdr)
switch (nexthdr) {
case IPPROTO_ICMPV6:
filtered = icmpv6_filter(sk, skb);
+ ip6h = ipv6_hdr(skb);
break;
#if IS_ENABLED(CONFIG_IPV6_MIP6)
@@ -529,7 +529,7 @@ static int rawv6_push_pending_frames(struct sock *sk, struct flowi6 *fl6,
offset = rp->offset;
total_len = inet_sk(sk)->cork.base.length;
- opt = inet6_sk(sk)->cork.opt;
+ opt = inet_sk(sk)->cork.base6.opt;
total_len -= opt ? opt->opt_flen : 0;
if (offset >= total_len - 1) {
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index e3a260a5564b..c0350d97307e 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2049,6 +2049,8 @@ unlock:
static bool rt6_mtu_change_route_allowed(struct inet6_dev *idev,
struct rt6_info *rt, int mtu)
{
+ u32 dmtu = dst6_mtu(&rt->dst);
+
/* If the new MTU is lower than the route PMTU, this new MTU will be the
* lowest MTU in the path: always allow updating the route PMTU to
* reflect PMTU decreases.
@@ -2059,10 +2061,10 @@ static bool rt6_mtu_change_route_allowed(struct inet6_dev *idev,
* handle this.
*/
- if (dst_mtu(&rt->dst) >= mtu)
+ if (dmtu >= mtu)
return true;
- if (dst_mtu(&rt->dst) == idev->cnf.mtu6)
+ if (dmtu == idev->cnf.mtu6)
return true;
return false;
@@ -2895,7 +2897,7 @@ static void rt6_do_update_pmtu(struct rt6_info *rt, u32 mtu)
dst_metric_set(&rt->dst, RTAX_MTU, mtu);
rt->rt6i_flags |= RTF_MODIFIED;
- rt6_update_expires(rt, net->ipv6.sysctl.ip6_rt_mtu_expires);
+ rt6_update_expires(rt, READ_ONCE(net->ipv6.sysctl.ip6_rt_mtu_expires));
}
static bool rt6_cache_allowed_for_pmtu(const struct rt6_info *rt)
@@ -2932,7 +2934,7 @@ static void __ip6_rt_update_pmtu(struct dst_entry *dst, const struct sock *sk,
if (mtu < IPV6_MIN_MTU)
return;
- if (mtu >= dst_mtu(dst))
+ if (mtu >= dst6_mtu(dst))
return;
if (!rt6_cache_allowed_for_pmtu(rt6)) {
@@ -3248,7 +3250,7 @@ EXPORT_SYMBOL_GPL(ip6_sk_redirect);
static unsigned int ip6_default_advmss(const struct dst_entry *dst)
{
- unsigned int mtu = dst_mtu(dst);
+ unsigned int mtu = dst6_mtu(dst);
struct net *net;
mtu -= sizeof(struct ipv6hdr) + sizeof(struct tcphdr);
@@ -3256,8 +3258,8 @@ static unsigned int ip6_default_advmss(const struct dst_entry *dst)
rcu_read_lock();
net = dst_dev_net_rcu(dst);
- if (mtu < net->ipv6.sysctl.ip6_rt_min_advmss)
- mtu = net->ipv6.sysctl.ip6_rt_min_advmss;
+ mtu = max_t(unsigned int, mtu,
+ READ_ONCE(net->ipv6.sysctl.ip6_rt_min_advmss));
rcu_read_unlock();
@@ -3359,10 +3361,10 @@ out:
static void ip6_dst_gc(struct dst_ops *ops)
{
struct net *net = container_of(ops, struct net, ipv6.ip6_dst_ops);
- int rt_min_interval = net->ipv6.sysctl.ip6_rt_gc_min_interval;
- int rt_elasticity = net->ipv6.sysctl.ip6_rt_gc_elasticity;
- int rt_gc_timeout = net->ipv6.sysctl.ip6_rt_gc_timeout;
- unsigned long rt_last_gc = net->ipv6.ip6_rt_last_gc;
+ int rt_min_interval = READ_ONCE(net->ipv6.sysctl.ip6_rt_gc_min_interval);
+ int rt_elasticity = READ_ONCE(net->ipv6.sysctl.ip6_rt_gc_elasticity);
+ int rt_gc_timeout = READ_ONCE(net->ipv6.sysctl.ip6_rt_gc_timeout);
+ unsigned long rt_last_gc = READ_ONCE(net->ipv6.ip6_rt_last_gc);
unsigned int val;
int entries;
@@ -3419,11 +3421,8 @@ static int ip6_route_check_nh_onlink(struct net *net,
err = ip6_nh_lookup_table(net, cfg, gw_addr, tbid, 0, &res);
if (!err && !(res.fib6_flags & RTF_REJECT) &&
- /* ignore match if it is the default route */
- !ipv6_addr_any(&res.f6i->fib6_dst.addr) &&
- (res.fib6_type != RTN_UNICAST || dev != res.nh->fib_nh_dev)) {
- NL_SET_ERR_MSG(extack,
- "Nexthop has invalid gateway or device mismatch");
+ res.fib6_type != RTN_UNICAST) {
+ NL_SET_ERR_MSG(extack, "Nexthop has invalid gateway");
err = -EINVAL;
}
@@ -5008,7 +5007,7 @@ void rt6_sync_down_dev(struct net_device *dev, unsigned long event)
};
struct net *net = dev_net(dev);
- if (net->ipv6.sysctl.skip_notify_on_dev_down)
+ if (READ_ONCE(net->ipv6.sysctl.skip_notify_on_dev_down))
fib6_clean_all_skip_notify(net, fib6_ifdown, &arg);
else
fib6_clean_all(net, fib6_ifdown, &arg);
@@ -6408,6 +6407,7 @@ errout:
void fib6_info_hw_flags_set(struct net *net, struct fib6_info *f6i,
bool offload, bool trap, bool offload_failed)
{
+ u8 fib_notify_on_flag_change;
struct sk_buff *skb;
int err;
@@ -6419,8 +6419,9 @@ void fib6_info_hw_flags_set(struct net *net, struct fib6_info *f6i,
WRITE_ONCE(f6i->offload, offload);
WRITE_ONCE(f6i->trap, trap);
+ fib_notify_on_flag_change = READ_ONCE(net->ipv6.sysctl.fib_notify_on_flag_change);
/* 2 means send notifications only if offload_failed was changed. */
- if (net->ipv6.sysctl.fib_notify_on_flag_change == 2 &&
+ if (fib_notify_on_flag_change == 2 &&
READ_ONCE(f6i->offload_failed) == offload_failed)
return;
@@ -6432,7 +6433,7 @@ void fib6_info_hw_flags_set(struct net *net, struct fib6_info *f6i,
*/
return;
- if (!net->ipv6.sysctl.fib_notify_on_flag_change)
+ if (!fib_notify_on_flag_change)
return;
skb = nlmsg_new(rt6_nlmsg_size(f6i), GFP_KERNEL);
@@ -6529,7 +6530,7 @@ static int ipv6_sysctl_rtcache_flush(const struct ctl_table *ctl, int write,
return ret;
net = (struct net *)ctl->extra1;
- delay = net->ipv6.sysctl.flush_delay;
+ delay = READ_ONCE(net->ipv6.sysctl.flush_delay);
fib6_run_gc(delay <= 0 ? 0 : (unsigned long)delay, net, delay > 0);
return 0;
}
diff --git a/net/ipv6/sit.c b/net/ipv6/sit.c
index cf37ad9686e6..439c8a1c6625 100644
--- a/net/ipv6/sit.c
+++ b/net/ipv6/sit.c
@@ -962,7 +962,7 @@ static netdev_tx_t ipip6_tunnel_xmit(struct sk_buff *skb,
}
if (df) {
- mtu = dst_mtu(&rt->dst) - t_hlen;
+ mtu = dst4_mtu(&rt->dst) - t_hlen;
if (mtu < IPV4_MIN_MTU) {
DEV_STATS_INC(dev, collisions);
diff --git a/net/ipv6/tcp_ipv6.c b/net/ipv6/tcp_ipv6.c
index 280fe5978559..d10487b4e5bf 100644
--- a/net/ipv6/tcp_ipv6.c
+++ b/net/ipv6/tcp_ipv6.c
@@ -138,15 +138,15 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr_unsized *uaddr,
{
struct sockaddr_in6 *usin = (struct sockaddr_in6 *) uaddr;
struct inet_connection_sock *icsk = inet_csk(sk);
- struct in6_addr *saddr = NULL, *final_p, final;
struct inet_timewait_death_row *tcp_death_row;
struct ipv6_pinfo *np = tcp_inet6_sk(sk);
+ struct in6_addr *saddr = NULL, *final_p;
struct inet_sock *inet = inet_sk(sk);
struct tcp_sock *tp = tcp_sk(sk);
struct net *net = sock_net(sk);
struct ipv6_txoptions *opt;
struct dst_entry *dst;
- struct flowi6 fl6;
+ struct flowi6 *fl6;
int addr_type;
int err;
@@ -156,14 +156,15 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr_unsized *uaddr,
if (usin->sin6_family != AF_INET6)
return -EAFNOSUPPORT;
- memset(&fl6, 0, sizeof(fl6));
+ fl6 = &inet_sk(sk)->cork.fl.u.ip6;
+ memset(fl6, 0, sizeof(*fl6));
if (inet6_test_bit(SNDFLOW, sk)) {
- fl6.flowlabel = usin->sin6_flowinfo&IPV6_FLOWINFO_MASK;
- IP6_ECN_flow_init(fl6.flowlabel);
- if (fl6.flowlabel&IPV6_FLOWLABEL_MASK) {
+ fl6->flowlabel = usin->sin6_flowinfo & IPV6_FLOWINFO_MASK;
+ IP6_ECN_flow_init(fl6->flowlabel);
+ if (fl6->flowlabel & IPV6_FLOWLABEL_MASK) {
struct ip6_flowlabel *flowlabel;
- flowlabel = fl6_sock_lookup(sk, fl6.flowlabel);
+ flowlabel = fl6_sock_lookup(sk, fl6->flowlabel);
if (IS_ERR(flowlabel))
return -EINVAL;
fl6_sock_release(flowlabel);
@@ -212,7 +213,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr_unsized *uaddr,
}
sk->sk_v6_daddr = usin->sin6_addr;
- np->flow_label = fl6.flowlabel;
+ np->flow_label = fl6->flowlabel;
/*
* TCP over IPv4
@@ -260,24 +261,24 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr_unsized *uaddr,
if (!ipv6_addr_any(&sk->sk_v6_rcv_saddr))
saddr = &sk->sk_v6_rcv_saddr;
- fl6.flowi6_proto = IPPROTO_TCP;
- fl6.daddr = sk->sk_v6_daddr;
- fl6.saddr = saddr ? *saddr : np->saddr;
- fl6.flowlabel = ip6_make_flowinfo(np->tclass, np->flow_label);
- fl6.flowi6_oif = sk->sk_bound_dev_if;
- fl6.flowi6_mark = sk->sk_mark;
- fl6.fl6_dport = usin->sin6_port;
- fl6.fl6_sport = inet->inet_sport;
- if (IS_ENABLED(CONFIG_IP_ROUTE_MULTIPATH) && !fl6.fl6_sport)
- fl6.flowi6_flags = FLOWI_FLAG_ANY_SPORT;
- fl6.flowi6_uid = sk_uid(sk);
+ fl6->flowi6_proto = IPPROTO_TCP;
+ fl6->daddr = sk->sk_v6_daddr;
+ fl6->saddr = saddr ? *saddr : np->saddr;
+ fl6->flowlabel = ip6_make_flowinfo(np->tclass, np->flow_label);
+ fl6->flowi6_oif = sk->sk_bound_dev_if;
+ fl6->flowi6_mark = sk->sk_mark;
+ fl6->fl6_dport = usin->sin6_port;
+ fl6->fl6_sport = inet->inet_sport;
+ if (IS_ENABLED(CONFIG_IP_ROUTE_MULTIPATH) && !fl6->fl6_sport)
+ fl6->flowi6_flags = FLOWI_FLAG_ANY_SPORT;
+ fl6->flowi6_uid = sk_uid(sk);
opt = rcu_dereference_protected(np->opt, lockdep_sock_is_held(sk));
- final_p = fl6_update_dst(&fl6, opt, &final);
+ final_p = fl6_update_dst(fl6, opt, &np->final);
- security_sk_classify_flow(sk, flowi6_to_flowi_common(&fl6));
+ security_sk_classify_flow(sk, flowi6_to_flowi_common(fl6));
- dst = ip6_dst_lookup_flow(net, sk, &fl6, final_p);
+ dst = ip6_dst_lookup_flow(net, sk, fl6, final_p);
if (IS_ERR(dst)) {
err = PTR_ERR(dst);
goto failure;
@@ -287,7 +288,7 @@ static int tcp_v6_connect(struct sock *sk, struct sockaddr_unsized *uaddr,
tcp_death_row = &sock_net(sk)->ipv4.tcp_death_row;
if (!saddr) {
- saddr = &fl6.saddr;
+ saddr = &fl6->saddr;
err = inet_bhash2_update_saddr(sk, saddr, AF_INET6);
if (err)
@@ -351,7 +352,7 @@ failure:
static void tcp_v6_mtu_reduced(struct sock *sk)
{
struct dst_entry *dst;
- u32 mtu;
+ u32 mtu, dmtu;
if ((1 << sk->sk_state) & (TCPF_LISTEN | TCPF_CLOSE))
return;
@@ -368,8 +369,9 @@ static void tcp_v6_mtu_reduced(struct sock *sk)
if (!dst)
return;
- if (inet_csk(sk)->icsk_pmtu_cookie > dst_mtu(dst)) {
- tcp_sync_mss(sk, dst_mtu(dst));
+ dmtu = dst6_mtu(dst);
+ if (inet_csk(sk)->icsk_pmtu_cookie > dmtu) {
+ tcp_sync_mss(sk, dmtu);
tcp_simple_retransmit(sk);
}
}
@@ -537,7 +539,7 @@ static int tcp_v6_send_synack(const struct sock *sk, struct dst_entry *dst,
u8 tclass;
/* First, grab a route. */
- if (!dst && (dst = inet6_csk_route_req(sk, fl6, req,
+ if (!dst && (dst = inet6_csk_route_req(sk, NULL, fl6, req,
IPPROTO_TCP)) == NULL)
goto done;
@@ -787,7 +789,7 @@ static struct dst_entry *tcp_v6_route_req(const struct sock *sk,
if (security_inet_conn_request(sk, skb, req))
return NULL;
- return inet6_csk_route_req(sk, &fl->u.ip6, req, IPPROTO_TCP);
+ return inet6_csk_route_req(sk, NULL, &fl->u.ip6, req, IPPROTO_TCP);
}
struct request_sock_ops tcp6_request_sock_ops __read_mostly = {
@@ -1085,7 +1087,8 @@ static void tcp_v6_send_reset(const struct sock *sk, struct sk_buff *skb,
txhash = inet_twsk(sk)->tw_txhash;
}
} else {
- if (net->ipv6.sysctl.flowlabel_reflect & FLOWLABEL_REFLECT_TCP_RESET)
+ if (READ_ONCE(net->ipv6.sysctl.flowlabel_reflect) &
+ FLOWLABEL_REFLECT_TCP_RESET)
label = ip6_flowlabel(ipv6h);
}
@@ -1315,12 +1318,12 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
struct request_sock *req_unhash,
bool *own_req)
{
- struct inet_request_sock *ireq;
- struct ipv6_pinfo *newnp;
const struct ipv6_pinfo *np = tcp_inet6_sk(sk);
+ struct inet_request_sock *ireq;
struct ipv6_txoptions *opt;
struct inet_sock *newinet;
bool found_dup_sk = false;
+ struct ipv6_pinfo *newnp;
struct tcp_sock *newtp;
struct sock *newsk;
#ifdef CONFIG_TCP_MD5SIG
@@ -1389,11 +1392,9 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
if (sk_acceptq_is_full(sk))
goto exit_overflow;
- if (!dst) {
- dst = inet6_csk_route_req(sk, &fl6, req, IPPROTO_TCP);
- if (!dst)
- goto exit;
- }
+ dst = inet6_csk_route_req(sk, dst, &fl6, req, IPPROTO_TCP);
+ if (!dst)
+ goto exit;
newsk = tcp_create_openreq_child(sk, req, skb);
if (!newsk)
@@ -1409,6 +1410,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
inet6_sk_rx_dst_set(newsk, skb);
newinet = inet_sk(newsk);
+ newinet->cork.fl.u.ip6 = fl6;
newinet->pinet6 = tcp_inet6_sk(newsk);
newinet->ipv6_fl_list = NULL;
newinet->inet_opt = NULL;
@@ -1466,7 +1468,7 @@ static struct sock *tcp_v6_syn_recv_sock(const struct sock *sk, struct sk_buff *
tcp_ca_openreq_child(newsk, dst);
- tcp_sync_mss(newsk, dst_mtu(dst));
+ tcp_sync_mss(newsk, dst6_mtu(dst));
newtp->advmss = tcp_mss_clamp(tcp_sk(sk), dst_metric_advmss(dst));
tcp_initialize_rcv_mss(newsk);
@@ -2331,6 +2333,8 @@ struct proto tcpv6_prot = {
.sysctl_rmem_offset = offsetof(struct net, ipv4.sysctl_tcp_rmem),
.max_header = MAX_TCP_HEADER,
.obj_size = sizeof(struct tcp6_sock),
+ .freeptr_offset = offsetof(struct tcp6_sock,
+ tcp.inet_conn.icsk_inet.sk.sk_freeptr),
.ipv6_pinfo_offset = offsetof(struct tcp6_sock, inet6),
.slab_flags = SLAB_TYPESAFE_BY_RCU,
.twsk_prot = &tcp6_timewait_sock_ops,
diff --git a/net/ipv6/tcpv6_offload.c b/net/ipv6/tcpv6_offload.c
index 5670d32c27f8..f2a659cd6183 100644
--- a/net/ipv6/tcpv6_offload.c
+++ b/net/ipv6/tcpv6_offload.c
@@ -24,9 +24,6 @@ static void tcp6_check_fraglist_gro(struct list_head *head, struct sk_buff *skb,
struct net *net;
int iif, sdif;
- if (likely(!(skb->dev->features & NETIF_F_GRO_FRAGLIST)))
- return;
-
p = tcp_gro_lookup(head, th);
if (p) {
NAPI_GRO_CB(skb)->is_flist = NAPI_GRO_CB(p)->is_flist;
@@ -45,8 +42,8 @@ static void tcp6_check_fraglist_gro(struct list_head *head, struct sk_buff *skb,
#endif /* IS_ENABLED(CONFIG_IPV6) */
}
-INDIRECT_CALLABLE_SCOPE
-struct sk_buff *tcp6_gro_receive(struct list_head *head, struct sk_buff *skb)
+static __always_inline struct sk_buff *tcp6_gro_receive(struct list_head *head,
+ struct sk_buff *skb)
{
struct tcphdr *th;
@@ -60,7 +57,8 @@ struct sk_buff *tcp6_gro_receive(struct list_head *head, struct sk_buff *skb)
if (!th)
goto flush;
- tcp6_check_fraglist_gro(head, skb, th);
+ if (unlikely(skb->dev->features & NETIF_F_GRO_FRAGLIST))
+ tcp6_check_fraglist_gro(head, skb, th);
return tcp_gro_receive(head, skb, th);
@@ -69,7 +67,7 @@ flush:
return NULL;
}
-INDIRECT_CALLABLE_SCOPE int tcp6_gro_complete(struct sk_buff *skb, int thoff)
+static __always_inline int tcp6_gro_complete(struct sk_buff *skb, int thoff)
{
const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
const struct ipv6hdr *iph = (struct ipv6hdr *)(skb->data + offset);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index 794c13674e8a..010b909275dd 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -875,7 +875,8 @@ static int udpv6_queue_rcv_one_skb(struct sock *sk, struct sk_buff *skb)
/*
* UDP-Lite specific tests, ignored on UDP sockets (see net/ipv4/udp.c).
*/
- if (udp_test_bit(UDPLITE_RECV_CC, sk) && UDP_SKB_CB(skb)->partial_cov) {
+ if (unlikely(udp_test_bit(UDPLITE_RECV_CC, sk) &&
+ UDP_SKB_CB(skb)->partial_cov)) {
u16 pcrlen = READ_ONCE(up->pcrlen);
if (pcrlen == 0) { /* full coverage was set */
@@ -1439,7 +1440,7 @@ csum_partial:
send:
err = ip6_send_skb(skb);
- if (err) {
+ if (unlikely(err)) {
if (err == -ENOBUFS && !inet6_test_bit(RECVERR6, sk)) {
UDP6_INC_STATS(sock_net(sk),
UDP_MIB_SNDBUFERRORS, is_udplite);
diff --git a/net/ipv6/udp_offload.c b/net/ipv6/udp_offload.c
index 046f13b1d77a..e003b8494dc0 100644
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -132,7 +132,6 @@ static struct sock *udp6_gro_lookup_skb(struct sk_buff *skb, __be16 sport,
sdif, net->ipv4.udp_table, NULL);
}
-INDIRECT_CALLABLE_SCOPE
struct sk_buff *udp6_gro_receive(struct list_head *head, struct sk_buff *skb)
{
struct udphdr *uh = udp_gro_udphdr(skb);
@@ -165,7 +164,7 @@ flush:
return NULL;
}
-INDIRECT_CALLABLE_SCOPE int udp6_gro_complete(struct sk_buff *skb, int nhoff)
+int udp6_gro_complete(struct sk_buff *skb, int nhoff)
{
const u16 offset = NAPI_GRO_CB(skb)->network_offsets[skb->encapsulation];
const struct ipv6hdr *ipv6h = (struct ipv6hdr *)(skb->data + offset);