<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/net/openvswitch, branch v5.10.221</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.221</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v5.10.221'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2024-06-16T11:32:27Z</updated>
<entry>
<title>openvswitch: Set the skbuff pkt_type for proper pmtud support.</title>
<updated>2024-06-16T11:32:27Z</updated>
<author>
<name>Aaron Conole</name>
<email>aconole@redhat.com</email>
</author>
<published>2024-05-16T20:09:41Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8cae65ace421e098d39455512b3366a20ab6e5fe'/>
<id>urn:sha1:8cae65ace421e098d39455512b3366a20ab6e5fe</id>
<content type='text'>
[ Upstream commit 30a92c9e3d6b073932762bef2ac66f4ee784c657 ]

Open vSwitch is originally intended to switch at layer 2, only dealing with
Ethernet frames.  With the introduction of l3 tunnels support, it crossed
into the realm of needing to care a bit about some routing details when
making forwarding decisions.  If an oversized packet would need to be
fragmented during this forwarding decision, there is a chance for pmtu
to get involved and generate a routing exception.  This is gated by the
skbuff-&gt;pkt_type field.

When a flow is already loaded into the openvswitch module this field is
set up and transitioned properly as a packet moves from one port to
another.  In the case that a packet execute is invoked after a flow is
newly installed this field is not properly initialized.  This causes the
pmtud mechanism to omit sending the required exception messages across
the tunnel boundary and a second attempt needs to be made to make sure
that the routing exception is properly setup.  To fix this, we set the
outgoing packet's pkt_type to PACKET_OUTGOING, since it can only get
to the openvswitch module via a port device or packet command.

Even for bridge ports as users, the pkt_type needs to be reset when
doing the transmit as the packet is truly outgoing and routing needs
to get involved post packet transformations, in the case of
VXLAN/GENEVE/udp-tunnel packets.  In general, the pkt_type on output
gets ignored, since we go straight to the driver, but in the case of
tunnel ports they go through IP routing layer.

This issue is periodically encountered in complex setups, such as large
openshift deployments, where multiple sets of tunnel traversal occurs.
A way to recreate this is with the ovn-heater project that can setup
a networking environment which mimics such large deployments.  We need
larger environments for this because we need to ensure that flow
misses occur.  In these environment, without this patch, we can see:

  ./ovn_cluster.sh start
  podman exec ovn-chassis-1 ip r a 170.168.0.5/32 dev eth1 mtu 1200
  podman exec ovn-chassis-1 ip netns exec sw01p1 ip r flush cache
  podman exec ovn-chassis-1 ip netns exec sw01p1 \
         ping 21.0.0.3 -M do -s 1300 -c2
  PING 21.0.0.3 (21.0.0.3) 1300(1328) bytes of data.
  From 21.0.0.3 icmp_seq=2 Frag needed and DF set (mtu = 1142)

  --- 21.0.0.3 ping statistics ---
  ...

Using tcpdump, we can also see the expected ICMP FRAG_NEEDED message is not
sent into the server.

With this patch, setting the pkt_type, we see the following:

  podman exec ovn-chassis-1 ip netns exec sw01p1 \
         ping 21.0.0.3 -M do -s 1300 -c2
  PING 21.0.0.3 (21.0.0.3) 1300(1328) bytes of data.
  From 21.0.0.3 icmp_seq=1 Frag needed and DF set (mtu = 1222)
  ping: local error: message too long, mtu=1222

  --- 21.0.0.3 ping statistics ---
  ...

In this case, the first ping request receives the FRAG_NEEDED message and
a local routing exception is created.

Tested-by: Jaime Caamano &lt;jcaamano@redhat.com&gt;
Reported-at: https://issues.redhat.com/browse/FDP-164
Fixes: 58264848a5a7 ("openvswitch: Add vxlan tunneling support.")
Signed-off-by: Aaron Conole &lt;aconole@redhat.com&gt;
Acked-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Link: https://lore.kernel.org/r/20240516200941.16152-1-aconole@redhat.com
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: openvswitch: fix overwriting ct original tuple for ICMPv6</title>
<updated>2024-06-16T11:32:09Z</updated>
<author>
<name>Ilya Maximets</name>
<email>i.maximets@ovn.org</email>
</author>
<published>2024-05-09T09:38:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5ab6aecbede080b44b8e34720ab72050bf1e6982'/>
<id>urn:sha1:5ab6aecbede080b44b8e34720ab72050bf1e6982</id>
<content type='text'>
[ Upstream commit 7c988176b6c16c516474f6fceebe0f055af5eb56 ]

OVS_PACKET_CMD_EXECUTE has 3 main attributes:
 - OVS_PACKET_ATTR_KEY - Packet metadata in a netlink format.
 - OVS_PACKET_ATTR_PACKET - Binary packet content.
 - OVS_PACKET_ATTR_ACTIONS - Actions to execute on the packet.

OVS_PACKET_ATTR_KEY is parsed first to populate sw_flow_key structure
with the metadata like conntrack state, input port, recirculation id,
etc.  Then the packet itself gets parsed to populate the rest of the
keys from the packet headers.

Whenever the packet parsing code starts parsing the ICMPv6 header, it
first zeroes out fields in the key corresponding to Neighbor Discovery
information even if it is not an ND packet.

It is an 'ipv6.nd' field.  However, the 'ipv6' is a union that shares
the space between 'nd' and 'ct_orig' that holds the original tuple
conntrack metadata parsed from the OVS_PACKET_ATTR_KEY.

ND packets should not normally have conntrack state, so it's fine to
share the space, but normal ICMPv6 Echo packets or maybe other types of
ICMPv6 can have the state attached and it should not be overwritten.

The issue results in all but the last 4 bytes of the destination
address being wiped from the original conntrack tuple leading to
incorrect packet matching and potentially executing wrong actions
in case this packet recirculates within the datapath or goes back
to userspace.

ND fields should not be accessed in non-ND packets, so not clearing
them should be fine.  Executing memset() only for actual ND packets to
avoid the issue.

Initializing the whole thing before parsing is needed because ND packet
may not contain all the options.

The issue only affects the OVS_PACKET_CMD_EXECUTE path and doesn't
affect packets entering OVS datapath from network interfaces, because
in this case CT metadata is populated from skb after the packet is
already parsed.

Fixes: 9dd7f8907c37 ("openvswitch: Add original direction conntrack tuple to sw_flow_key.")
Reported-by: Antonin Bas &lt;antonin.bas@broadcom.com&gt;
Closes: https://github.com/openvswitch/ovs-issues/issues/327
Signed-off-by: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Acked-by: Aaron Conole &lt;aconole@redhat.com&gt;
Acked-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Link: https://lore.kernel.org/r/20240509094228.1035477-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: openvswitch: Fix Use-After-Free in ovs_ct_exit</title>
<updated>2024-05-02T14:23:41Z</updated>
<author>
<name>Hyunwoo Kim</name>
<email>v4bel@theori.io</email>
</author>
<published>2024-04-22T09:37:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=35880c3fa6f8fe281a19975d2992644588ca33d3'/>
<id>urn:sha1:35880c3fa6f8fe281a19975d2992644588ca33d3</id>
<content type='text'>
[ Upstream commit 5ea7b72d4fac2fdbc0425cd8f2ea33abe95235b2 ]

Since kfree_rcu, which is called in the hlist_for_each_entry_rcu traversal
of ovs_ct_limit_exit, is not part of the RCU read critical section, it
is possible that the RCU grace period will pass during the traversal and
the key will be free.

To prevent this, it should be changed to hlist_for_each_entry_safe.

Fixes: 11efd5cb04a1 ("openvswitch: Support conntrack zone limit")
Signed-off-by: Hyunwoo Kim &lt;v4bel@theori.io&gt;
Reviewed-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reviewed-by: Aaron Conole &lt;aconole@redhat.com&gt;
Link: https://lore.kernel.org/r/ZiYvzQN/Ry5oeFQW@v4bel-B760M-AORUS-ELITE-AX
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: openvswitch: fix unwanted error log on timeout policy probing</title>
<updated>2024-05-02T14:23:33Z</updated>
<author>
<name>Ilya Maximets</name>
<email>i.maximets@ovn.org</email>
</author>
<published>2024-04-03T20:38:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=583b7b856f7f3e99658642533defcecd39faf73f'/>
<id>urn:sha1:583b7b856f7f3e99658642533defcecd39faf73f</id>
<content type='text'>
[ Upstream commit 4539f91f2a801c0c028c252bffae56030cfb2cae ]

On startup, ovs-vswitchd probes different datapath features including
support for timeout policies.  While probing, it tries to execute
certain operations with OVS_PACKET_ATTR_PROBE or OVS_FLOW_ATTR_PROBE
attributes set.  These attributes tell the openvswitch module to not
log any errors when they occur as it is expected that some of the
probes will fail.

For some reason, setting the timeout policy ignores the PROBE attribute
and logs a failure anyway.  This is causing the following kernel log
on each re-start of ovs-vswitchd:

  kernel: Failed to associated timeout policy `ovs_test_tp'

Fix that by using the same logging macro that all other messages are
using.  The message will still be printed at info level when needed
and will be rate limited, but with a net rate limiter instead of
generic printk one.

The nf_ct_set_timeout() itself will still print some info messages,
but at least this change makes logging in openvswitch module more
consistent.

Fixes: 06bd2bdf19d2 ("openvswitch: Add timeout support to ct action")
Signed-off-by: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Acked-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Link: https://lore.kernel.org/r/20240403203803.2137962-1-i.maximets@ovn.org
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: openvswitch: limit the number of recursions from action sets</title>
<updated>2024-02-23T07:42:23Z</updated>
<author>
<name>Aaron Conole</name>
<email>aconole@redhat.com</email>
</author>
<published>2024-02-07T13:24:15Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=55cfccb658fc142d7fbfeae2d0496b7841d128c3'/>
<id>urn:sha1:55cfccb658fc142d7fbfeae2d0496b7841d128c3</id>
<content type='text'>
[ Upstream commit 6e2f90d31fe09f2b852de25125ca875aabd81367 ]

The ovs module allows for some actions to recursively contain an action
list for complex scenarios, such as sampling, checking lengths, etc.
When these actions are copied into the internal flow table, they are
evaluated to validate that such actions make sense, and these calls
happen recursively.

The ovs-vswitchd userspace won't emit more than 16 recursion levels
deep.  However, the module has no such limit and will happily accept
limits larger than 16 levels nested.  Prevent this by tracking the
number of recursions happening and manually limiting it to 16 levels
nested.

The initial implementation of the sample action would track this depth
and prevent more than 3 levels of recursion, but this was removed to
support the clone use case, rather than limited at the current userspace
limit.

Fixes: 798c166173ff ("openvswitch: Optimize sample action for the clone use cases")
Signed-off-by: Aaron Conole &lt;aconole@redhat.com&gt;
Reviewed-by: Simon Horman &lt;horms@kernel.org&gt;
Link: https://lore.kernel.org/r/20240207132416.1488485-2-aconole@redhat.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: openvswitch: fix possible memory leak in ovs_meter_cmd_set()</title>
<updated>2023-02-22T11:55:57Z</updated>
<author>
<name>Hangyu Hua</name>
<email>hbh25y@gmail.com</email>
</author>
<published>2023-02-10T02:05:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c0f65ee0a3329eb4b94beaef0268633696e2d0c6'/>
<id>urn:sha1:c0f65ee0a3329eb4b94beaef0268633696e2d0c6</id>
<content type='text'>
commit 2fa28f5c6fcbfc794340684f36d2581b4f2d20b5 upstream.

old_meter needs to be free after it is detached regardless of whether
the new meter is successfully attached.

Fixes: c7c4c44c9a95 ("net: openvswitch: expand the meters supported number")
Signed-off-by: Hangyu Hua &lt;hbh25y@gmail.com&gt;
Acked-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Reviewed-by: Simon Horman &lt;simon.horman@corigine.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: openvswitch: fix flow memory leak in ovs_flow_cmd_new</title>
<updated>2023-02-15T16:22:14Z</updated>
<author>
<name>Fedor Pchelkin</name>
<email>pchelkin@ispras.ru</email>
</author>
<published>2023-02-01T21:02:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=70154489f531587996f3e9d7cceeee65cff0001d'/>
<id>urn:sha1:70154489f531587996f3e9d7cceeee65cff0001d</id>
<content type='text'>
[ Upstream commit 0c598aed445eb45b0ee7ba405f7ece99ee349c30 ]

Syzkaller reports a memory leak of new_flow in ovs_flow_cmd_new() as it is
not freed when an allocation of a key fails.

BUG: memory leak
unreferenced object 0xffff888116668000 (size 632):
  comm "syz-executor231", pid 1090, jiffies 4294844701 (age 18.871s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [&lt;00000000defa3494&gt;] kmem_cache_zalloc include/linux/slab.h:654 [inline]
    [&lt;00000000defa3494&gt;] ovs_flow_alloc+0x19/0x180 net/openvswitch/flow_table.c:77
    [&lt;00000000c67d8873&gt;] ovs_flow_cmd_new+0x1de/0xd40 net/openvswitch/datapath.c:957
    [&lt;0000000010a539a8&gt;] genl_family_rcv_msg_doit+0x22d/0x330 net/netlink/genetlink.c:739
    [&lt;00000000dff3302d&gt;] genl_family_rcv_msg net/netlink/genetlink.c:783 [inline]
    [&lt;00000000dff3302d&gt;] genl_rcv_msg+0x328/0x590 net/netlink/genetlink.c:800
    [&lt;000000000286dd87&gt;] netlink_rcv_skb+0x153/0x430 net/netlink/af_netlink.c:2515
    [&lt;0000000061fed410&gt;] genl_rcv+0x24/0x40 net/netlink/genetlink.c:811
    [&lt;000000009dc0f111&gt;] netlink_unicast_kernel net/netlink/af_netlink.c:1313 [inline]
    [&lt;000000009dc0f111&gt;] netlink_unicast+0x545/0x7f0 net/netlink/af_netlink.c:1339
    [&lt;000000004a5ee816&gt;] netlink_sendmsg+0x8e7/0xde0 net/netlink/af_netlink.c:1934
    [&lt;00000000482b476f&gt;] sock_sendmsg_nosec net/socket.c:651 [inline]
    [&lt;00000000482b476f&gt;] sock_sendmsg+0x152/0x190 net/socket.c:671
    [&lt;00000000698574ba&gt;] ____sys_sendmsg+0x70a/0x870 net/socket.c:2356
    [&lt;00000000d28d9e11&gt;] ___sys_sendmsg+0xf3/0x170 net/socket.c:2410
    [&lt;0000000083ba9120&gt;] __sys_sendmsg+0xe5/0x1b0 net/socket.c:2439
    [&lt;00000000c00628f8&gt;] do_syscall_64+0x30/0x40 arch/x86/entry/common.c:46
    [&lt;000000004abfdcf4&gt;] entry_SYSCALL_64_after_hwframe+0x61/0xc6

To fix this the patch rearranges the goto labels to reflect the order of
object allocations and adds appropriate goto statements on the error
paths.

Found by Linux Verification Center (linuxtesting.org) with Syzkaller.

Fixes: 68bb10101e6b ("openvswitch: Fix flow lookup to use unmasked key")
Signed-off-by: Fedor Pchelkin &lt;pchelkin@ispras.ru&gt;
Signed-off-by: Alexey Khoroshilov &lt;khoroshilov@ispras.ru&gt;
Acked-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Reviewed-by: Simon Horman &lt;simon.horman@corigine.com&gt;
Link: https://lore.kernel.org/r/20230201210218.361970-1-pchelkin@ispras.ru
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>openvswitch: Fix flow lookup to use unmasked key</title>
<updated>2023-01-14T09:16:12Z</updated>
<author>
<name>Eelco Chaudron</name>
<email>echaudro@redhat.com</email>
</author>
<published>2022-12-15T14:46:33Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6736b61ecf230dd656464de0f514bdeadb384f20'/>
<id>urn:sha1:6736b61ecf230dd656464de0f514bdeadb384f20</id>
<content type='text'>
[ Upstream commit 68bb10101e6b0a6bb44e9c908ef795fc4af99eae ]

The commit mentioned below causes the ovs_flow_tbl_lookup() function
to be called with the masked key. However, it's supposed to be called
with the unmasked key. This due to the fact that the datapath supports
installing wider flows, and OVS relies on this behavior. For example
if ipv4(src=1.1.1.1/192.0.0.0, dst=1.1.1.2/192.0.0.0) exists, a wider
flow (smaller mask) of ipv4(src=192.1.1.1/128.0.0.0,dst=192.1.1.2/
128.0.0.0) is allowed to be added.

However, if we try to add a wildcard rule, the installation fails:

$ ovs-appctl dpctl/add-flow system@myDP "in_port(1),eth_type(0x0800), \
  ipv4(src=1.1.1.1/192.0.0.0,dst=1.1.1.2/192.0.0.0,frag=no)" 2
$ ovs-appctl dpctl/add-flow system@myDP "in_port(1),eth_type(0x0800), \
  ipv4(src=192.1.1.1/0.0.0.0,dst=49.1.1.2/0.0.0.0,frag=no)" 2
ovs-vswitchd: updating flow table (File exists)

The reason is that the key used to determine if the flow is already
present in the system uses the original key ANDed with the mask.
This results in the IP address not being part of the (miniflow) key,
i.e., being substituted with an all-zero value. When doing the actual
lookup, this results in the key wrongfully matching the first flow,
and therefore the flow does not get installed.

This change reverses the commit below, but rather than having the key
on the stack, it's allocated.

Fixes: 190aa3e77880 ("openvswitch: Fix Frame-size larger than 1024 bytes warning.")

Signed-off-by: Eelco Chaudron &lt;echaudro@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>netfilter: conntrack: Fix data-races around ct mark</title>
<updated>2022-12-02T16:40:00Z</updated>
<author>
<name>Daniel Xu</name>
<email>dxu@dxuuu.xyz</email>
</author>
<published>2022-11-09T19:39:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5c97af75f53c626283afd8a800a4bd57614f761f'/>
<id>urn:sha1:5c97af75f53c626283afd8a800a4bd57614f761f</id>
<content type='text'>
[ Upstream commit 52d1aa8b8249ff477aaa38b6f74a8ced780d079c ]

nf_conn:mark can be read from and written to in parallel. Use
READ_ONCE()/WRITE_ONCE() for reads and writes to prevent unwanted
compiler optimizations.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Daniel Xu &lt;dxu@dxuuu.xyz&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>openvswitch: switch from WARN to pr_warn</title>
<updated>2022-11-03T14:57:53Z</updated>
<author>
<name>Aaron Conole</name>
<email>aconole@redhat.com</email>
</author>
<published>2022-10-25T10:50:17Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=79631daa5a5166c6aa4e25f447e8c08ee030637a'/>
<id>urn:sha1:79631daa5a5166c6aa4e25f447e8c08ee030637a</id>
<content type='text'>
[ Upstream commit fd954cc1919e35cb92f78671cab6e42d661945a3 ]

As noted by Paolo Abeni, pr_warn doesn't generate any splat and can still
preserve the warning to the user that feature downgrade occurred.  We
likely cannot introduce other kinds of checks / enforcement here because
syzbot can generate different genl versions to the datapath.

Reported-by: syzbot+31cde0bef4bbf8ba2d86@syzkaller.appspotmail.com
Fixes: 44da5ae5fbea ("openvswitch: Drop user features if old user space attempted to create datapath")
Cc: Thomas Graf &lt;tgraf@suug.ch&gt;
Signed-off-by: Aaron Conole &lt;aconole@redhat.com&gt;
Acked-by: Ilya Maximets &lt;i.maximets@ovn.org&gt;
Signed-off-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
</feed>
