<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/net/ip_vs.h, branch v6.1.4</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.1.4</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v6.1.4'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2022-12-31T12:32:26Z</updated>
<entry>
<title>ipvs: use u64_stats_t for the per-cpu counters</title>
<updated>2022-12-31T12:32:26Z</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2022-11-22T16:46:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cbde8b3acbc8d098240d8f423512689ffa84e80c'/>
<id>urn:sha1:cbde8b3acbc8d098240d8f423512689ffa84e80c</id>
<content type='text'>
[ Upstream commit 1dbd8d9a82e3f26b9d063292d47ece673f48fce2 ]

Use the provided u64_stats_t type to avoid
load/store tearing.

Fixes: 316580b69d0a ("u64_stats: provide u64_stats_t type")
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Cc: yunhong-cgl jiang &lt;xintian1976@gmail.com&gt;
Cc: "dust.li" &lt;dust.li@linux.alibaba.com&gt;
Reviewed-by: Jiri Wiesner &lt;jwiesner@suse.de&gt;
Tested-by: Jiri Wiesner &lt;jwiesner@suse.de&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>ipvs: add sysctl_run_estimation to support disable estimation</title>
<updated>2021-10-07T17:52:58Z</updated>
<author>
<name>Dust Li</name>
<email>dust.li@linux.alibaba.com</email>
</author>
<published>2021-08-20T05:37:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2232642ec3fb4aad6ae4da1e109f55a0e7f2d204'/>
<id>urn:sha1:2232642ec3fb4aad6ae4da1e109f55a0e7f2d204</id>
<content type='text'>
estimation_timer will iterate the est_list to do estimation
for each ipvs stats. When there are lots of services, the
list can be very large.
We found that estimation_timer() run for more then 200ms on a
machine with 104 CPU and 50K services.

yunhong-cgl jiang report the same phenomenon before:
https://www.spinics.net/lists/lvs-devel/msg05426.html

In some cases(for example a large K8S cluster with many ipvs services),
ipvs estimation may not be needed. So adding a sysctl blob to allow
users to disable this completely.

Default is: 1 (enable)

Cc: yunhong-cgl jiang &lt;xintian1976@gmail.com&gt;
Signed-off-by: Dust Li &lt;dust.li@linux.alibaba.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Acked-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>netfilter: move handlers to net/ip_vs.h</title>
<updated>2021-02-05T02:37:57Z</updated>
<author>
<name>Leon Romanovsky</name>
<email>leonro@nvidia.com</email>
</author>
<published>2021-02-03T13:51:12Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=edf597da02a01edb26bddf06890fb81eee3d82cf'/>
<id>urn:sha1:edf597da02a01edb26bddf06890fb81eee3d82cf</id>
<content type='text'>
Fix the following compilation warnings:
net/netfilter/ipvs/ip_vs_proto_tcp.c:147:1: warning: no previous prototype for 'tcp_snat_handler' [-Wmissing-prototypes]
  147 | tcp_snat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
      | ^~~~~~~~~~~~~~~~
net/netfilter/ipvs/ip_vs_proto_udp.c:136:1: warning: no previous prototype for 'udp_snat_handler' [-Wmissing-prototypes]
  136 | udp_snat_handler(struct sk_buff *skb, struct ip_vs_protocol *pp,
      | ^~~~~~~~~~~~~~~~

Signed-off-by: Leon Romanovsky &lt;leonro@nvidia.com&gt;
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
</content>
</entry>
<entry>
<title>ipvs: remove dependency on ip6_tables</title>
<updated>2020-08-31T21:06:51Z</updated>
<author>
<name>Yaroslav Bolyukin</name>
<email>iam@lach.pw</email>
</author>
<published>2020-08-29T13:59:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=144b0a0e608690d46e9a77819249bdd8d23bdcb6'/>
<id>urn:sha1:144b0a0e608690d46e9a77819249bdd8d23bdcb6</id>
<content type='text'>
This dependency was added because ipv6_find_hdr was in iptables specific
code but is no longer required

Fixes: f8f626754ebe ("ipv6: Move ipv6_find_hdr() out of Netfilter code.")
Fixes: 63dca2c0b0e7 ("ipvs: Fix faulty IPv6 extension header handling in IPVS")
Signed-off-by: Yaroslav Bolyukin &lt;iam@lach.pw&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>ipvs: queue delayed work to expire no destination connections if expire_nodest_conn=1</title>
<updated>2020-07-21T23:17:59Z</updated>
<author>
<name>Andrew Sy Kim</name>
<email>kim.andrewsy@gmail.com</email>
</author>
<published>2020-07-08T16:16:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=35dfb013149f74c2be1ff9c78f14e6a3cd1539d1'/>
<id>urn:sha1:35dfb013149f74c2be1ff9c78f14e6a3cd1539d1</id>
<content type='text'>
When expire_nodest_conn=1 and a destination is deleted, IPVS does not
expire the existing connections until the next matching incoming packet.
If there are many connection entries from a single client to a single
destination, many packets may get dropped before all the connections are
expired (more likely with lots of UDP traffic). An optimization can be
made where upon deletion of a destination, IPVS queues up delayed work
to immediately expire any connections with a deleted destination. This
ensures any reused source ports from a client (within the IPVS timeouts)
are scheduled to new real servers instead of silently dropped.

Signed-off-by: Andrew Sy Kim &lt;kim.andrewsy@gmail.com&gt;
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>ipvs: allow connection reuse for unconfirmed conntrack</title>
<updated>2020-07-03T23:18:37Z</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2020-07-01T15:17:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f'/>
<id>urn:sha1:f0a5e4d7a594e0fe237d3dfafb069bb82f80f42f</id>
<content type='text'>
YangYuxi is reporting that connection reuse
is causing one-second delay when SYN hits
existing connection in TIME_WAIT state.
Such delay was added to give time to expire
both the IPVS connection and the corresponding
conntrack. This was considered a rare case
at that time but it is causing problem for
some environments such as Kubernetes.

As nf_conntrack_tcp_packet() can decide to
release the conntrack in TIME_WAIT state and
to replace it with a fresh NEW conntrack, we
can use this to allow rescheduling just by
tuning our check: if the conntrack is
confirmed we can not schedule it to different
real server and the one-second delay still
applies but if new conntrack was created,
we are free to select new real server without
any delays.

YangYuxi lists some of the problem reports:

- One second connection delay in masquerading mode:
https://marc.info/?t=151683118100004&amp;r=1&amp;w=2

- IPVS low throughput #70747
https://github.com/kubernetes/kubernetes/issues/70747

- Apache Bench can fill up ipvs service proxy in seconds #544
https://github.com/cloudnativelabs/kube-router/issues/544

- Additional 1s latency in `host -&gt; service IP -&gt; pod`
https://github.com/kubernetes/kubernetes/issues/90854

Fixes: f719e3754ee2 ("ipvs: drop first packet to redirect conntrack")
Co-developed-by: YangYuxi &lt;yx.atom1@gmail.com&gt;
Signed-off-by: YangYuxi &lt;yx.atom1@gmail.com&gt;
Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Reviewed-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>ipvs: register hooks only with services</title>
<updated>2020-06-30T16:37:39Z</updated>
<author>
<name>Julian Anastasov</name>
<email>ja@ssi.bg</email>
</author>
<published>2020-06-21T15:40:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=857ca89711de3dbcc674d58a6d7d297ee0bd34e1'/>
<id>urn:sha1:857ca89711de3dbcc674d58a6d7d297ee0bd34e1</id>
<content type='text'>
Keep the IPVS hooks registered in Netfilter only
while there are configured virtual services. This
saves CPU cycles while IPVS is loaded but not used.

Signed-off-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Reviewed-by: Simon Horman &lt;horms@verge.net.au&gt;
Signed-off-by: Pablo Neira Ayuso &lt;pablo@netfilter.org&gt;
</content>
</entry>
<entry>
<title>Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net</title>
<updated>2019-11-02T20:54:56Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2019-11-02T20:12:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d31e95585ca697fb31440c6fe30113adc85ecfbd'/>
<id>urn:sha1:d31e95585ca697fb31440c6fe30113adc85ecfbd</id>
<content type='text'>
The only slightly tricky merge conflict was the netdevsim because the
mutex locking fix overlapped a lot of driver reload reorganization.

The rest were (relatively) trivial in nature.

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipvs: move old_secure_tcp into struct netns_ipvs</title>
<updated>2019-10-24T09:56:02Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2019-10-23T16:53:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c24b75e0f9239e78105f81c5f03a751641eb07ef'/>
<id>urn:sha1:c24b75e0f9239e78105f81c5f03a751641eb07ef</id>
<content type='text'>
syzbot reported the following issue :

BUG: KCSAN: data-race in update_defense_level / update_defense_level

read to 0xffffffff861a6260 of 4 bytes by task 3006 on cpu 1:
 update_defense_level+0x621/0xb30 net/netfilter/ipvs/ip_vs_ctl.c:177
 defense_work_handler+0x3d/0xd0 net/netfilter/ipvs/ip_vs_ctl.c:225
 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
 worker_thread+0xa0/0x800 kernel/workqueue.c:2415
 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352

write to 0xffffffff861a6260 of 4 bytes by task 7333 on cpu 0:
 update_defense_level+0xa62/0xb30 net/netfilter/ipvs/ip_vs_ctl.c:205
 defense_work_handler+0x3d/0xd0 net/netfilter/ipvs/ip_vs_ctl.c:225
 process_one_work+0x3d4/0x890 kernel/workqueue.c:2269
 worker_thread+0xa0/0x800 kernel/workqueue.c:2415
 kthread+0x1d4/0x200 drivers/block/aoe/aoecmd.c:1253
 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:352

Reported by Kernel Concurrency Sanitizer on:
CPU: 0 PID: 7333 Comm: kworker/0:5 Not tainted 5.4.0-rc3+ #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Workqueue: events defense_work_handler

Indeed, old_secure_tcp is currently a static variable, while it
needs to be a per netns variable.

Fixes: a0840e2e165a ("IPVS: netns, ip_vs_ctl local vars moved to ipvs struct.")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Reported-by: syzbot &lt;syzkaller@googlegroups.com&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
<entry>
<title>ipvs: batch __ip_vs_cleanup</title>
<updated>2019-10-08T09:28:33Z</updated>
<author>
<name>Haishuang Yan</name>
<email>yanhaishuang@cmss.chinamobile.com</email>
</author>
<published>2019-09-27T04:54:50Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5d5a0815f854a5b0e21d97e16cfadad69ce5fb04'/>
<id>urn:sha1:5d5a0815f854a5b0e21d97e16cfadad69ce5fb04</id>
<content type='text'>
It's better to batch __ip_vs_cleanup to speedup ipvs
connections dismantle.

Signed-off-by: Haishuang Yan &lt;yanhaishuang@cmss.chinamobile.com&gt;
Acked-by: Julian Anastasov &lt;ja@ssi.bg&gt;
Signed-off-by: Simon Horman &lt;horms@verge.net.au&gt;
</content>
</entry>
</feed>
