<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/net/core, branch v4.4.266</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.266</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.266'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2021-03-30T12:45:01Z</updated>
<entry>
<title>can: dev: Move device back to init netns on owning netns delete</title>
<updated>2021-03-30T12:45:01Z</updated>
<author>
<name>Martin Willi</name>
<email>martin@strongswan.org</email>
</author>
<published>2021-03-02T12:24:23Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4c4af8157ea92684aa648723f64895325f3cbfbb'/>
<id>urn:sha1:4c4af8157ea92684aa648723f64895325f3cbfbb</id>
<content type='text'>
commit 3a5ca857079ea022e0b1b17fc154f7ad7dbc150f upstream.

When a non-initial netns is destroyed, the usual policy is to delete
all virtual network interfaces contained, but move physical interfaces
back to the initial netns. This keeps the physical interface visible
on the system.

CAN devices are somewhat special, as they define rtnl_link_ops even
if they are physical devices. If a CAN interface is moved into a
non-initial netns, destroying that netns lets the interface vanish
instead of moving it back to the initial netns. default_device_exit()
skips CAN interfaces due to having rtnl_link_ops set. Reproducer:

  ip netns add foo
  ip link set can0 netns foo
  ip netns delete foo

WARNING: CPU: 1 PID: 84 at net/core/dev.c:11030 ops_exit_list+0x38/0x60
CPU: 1 PID: 84 Comm: kworker/u4:2 Not tainted 5.10.19 #1
Workqueue: netns cleanup_net
[&lt;c010e700&gt;] (unwind_backtrace) from [&lt;c010a1d8&gt;] (show_stack+0x10/0x14)
[&lt;c010a1d8&gt;] (show_stack) from [&lt;c086dc10&gt;] (dump_stack+0x94/0xa8)
[&lt;c086dc10&gt;] (dump_stack) from [&lt;c086b938&gt;] (__warn+0xb8/0x114)
[&lt;c086b938&gt;] (__warn) from [&lt;c086ba10&gt;] (warn_slowpath_fmt+0x7c/0xac)
[&lt;c086ba10&gt;] (warn_slowpath_fmt) from [&lt;c0629f20&gt;] (ops_exit_list+0x38/0x60)
[&lt;c0629f20&gt;] (ops_exit_list) from [&lt;c062a5c4&gt;] (cleanup_net+0x230/0x380)
[&lt;c062a5c4&gt;] (cleanup_net) from [&lt;c0142c20&gt;] (process_one_work+0x1d8/0x438)
[&lt;c0142c20&gt;] (process_one_work) from [&lt;c0142ee4&gt;] (worker_thread+0x64/0x5a8)
[&lt;c0142ee4&gt;] (worker_thread) from [&lt;c0148a98&gt;] (kthread+0x148/0x14c)
[&lt;c0148a98&gt;] (kthread) from [&lt;c0100148&gt;] (ret_from_fork+0x14/0x2c)

To properly restore physical CAN devices to the initial netns on owning
netns exit, introduce a flag on rtnl_link_ops that can be set by drivers.
For CAN devices setting this flag, default_device_exit() considers them
non-virtual, applying the usual namespace move.

The issue was introduced in the commit mentioned below, as at that time
CAN devices did not have a dellink() operation.

Fixes: e008b5fc8dc7 ("net: Simplfy default_device_exit and improve batching.")
Link: https://lore.kernel.org/r/20210302122423.872326-1-martin@strongswan.org
Signed-off-by: Martin Willi &lt;martin@strongswan.org&gt;
Signed-off-by: Marc Kleine-Budde &lt;mkl@pengutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>pktgen: fix misuse of BUG_ON() in pktgen_thread_worker()</title>
<updated>2021-03-07T10:24:21Z</updated>
<author>
<name>Di Zhu</name>
<email>zhudi21@huawei.com</email>
</author>
<published>2021-01-25T12:42:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6c193272c4195ce13f66a4b97d91938337cac90b'/>
<id>urn:sha1:6c193272c4195ce13f66a4b97d91938337cac90b</id>
<content type='text'>
[ Upstream commit 275b1e88cabb34dbcbe99756b67e9939d34a99b6 ]

pktgen create threads for all online cpus and bond these threads to
relevant cpu repecivtily. when this thread firstly be woken up, it
will compare cpu currently running with the cpu specified at the time
of creation and if the two cpus are not equal, BUG_ON() will take effect
causing panic on the system.
Notice that these threads could be migrated to other cpus before start
running because of the cpu hotplug after these threads have created. so the
BUG_ON() used here seems unreasonable and we can replace it with WARN_ON()
to just printf a warning other than panic the system.

Signed-off-by: Di Zhu &lt;zhudi21@huawei.com&gt;
Link: https://lore.kernel.org/r/20210125124229.19334-1-zhudi21@huawei.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: fix up truesize of cloned skb in skb_prepare_for_shift()</title>
<updated>2021-03-07T10:24:20Z</updated>
<author>
<name>Marco Elver</name>
<email>elver@google.com</email>
</author>
<published>2021-02-01T16:04:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=7118945cdf0d4b5a76eb4f5f330ac6f48d372025'/>
<id>urn:sha1:7118945cdf0d4b5a76eb4f5f330ac6f48d372025</id>
<content type='text'>
commit 097b9146c0e26aabaa6ff3e5ea536a53f5254a79 upstream.

Avoid the assumption that ksize(kmalloc(S)) == ksize(kmalloc(S)): when
cloning an skb, save and restore truesize after pskb_expand_head(). This
can occur if the allocator decides to service an allocation of the same
size differently (e.g. use a different size class, or pass the
allocation on to KFENCE).

Because truesize is used for bookkeeping (such as sk_wmem_queued), a
modified truesize of a cloned skb may result in corrupt bookkeeping and
relevant warnings (such as in sk_stream_kill_queues()).

Link: https://lkml.kernel.org/r/X9JR/J6dMMOy1obu@elver.google.com
Reported-by: syzbot+7b99aafdcc2eedea6178@syzkaller.appspotmail.com
Suggested-by: Eric Dumazet &lt;edumazet@google.com&gt;
Signed-off-by: Marco Elver &lt;elver@google.com&gt;
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Link: https://lore.kernel.org/r/20210201160420.2826895-1-elver@google.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>skbuff: back tiny skbs with kmalloc() in __netdev_alloc_skb() too</title>
<updated>2021-01-30T12:25:57Z</updated>
<author>
<name>Alexander Lobakin</name>
<email>alobakin@pm.me</email>
</author>
<published>2021-01-15T15:04:40Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c29efd706d7c2b7c85c86ddf05b999b0ee8cda1d'/>
<id>urn:sha1:c29efd706d7c2b7c85c86ddf05b999b0ee8cda1d</id>
<content type='text'>
commit 66c556025d687dbdd0f748c5e1df89c977b6c02a upstream.

Commit 3226b158e67c ("net: avoid 32 x truesize under-estimation for
tiny skbs") ensured that skbs with data size lower than 1025 bytes
will be kmalloc'ed to avoid excessive page cache fragmentation and
memory consumption.
However, the fix adressed only __napi_alloc_skb() (primarily for
virtio_net and napi_get_frags()), but the issue can still be achieved
through __netdev_alloc_skb(), which is still used by several drivers.
Drivers often allocate a tiny skb for headers and place the rest of
the frame to frags (so-called copybreak).
Mirror the condition to __netdev_alloc_skb() to handle this case too.

Since v1 [0]:
 - fix "Fixes:" tag;
 - refine commit message (mention copybreak usecase).

[0] https://lore.kernel.org/netdev/20210114235423.232737-1-alobakin@pm.me

Fixes: a1c7fff7e18f ("net: netdev_alloc_skb() use build_skb()")
Signed-off-by: Alexander Lobakin &lt;alobakin@pm.me&gt;
Link: https://lore.kernel.org/r/20210115150354.85967-1-alobakin@pm.me
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>net: avoid 32 x truesize under-estimation for tiny skbs</title>
<updated>2021-01-23T14:36:56Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2021-01-13T16:18:19Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4f0fca0e17b174a00580ae0411fbb64b81d1e967'/>
<id>urn:sha1:4f0fca0e17b174a00580ae0411fbb64b81d1e967</id>
<content type='text'>
[ Upstream commit 3226b158e67cfaa677fd180152bfb28989cb2fac ]

Both virtio net and napi_get_frags() allocate skbs
with a very small skb-&gt;head

While using page fragments instead of a kmalloc backed skb-&gt;head might give
a small performance improvement in some cases, there is a huge risk of
under estimating memory usage.

For both GOOD_COPY_LEN and GRO_MAX_HEAD, we can fit at least 32 allocations
per page (order-3 page in x86), or even 64 on PowerPC

We have been tracking OOM issues on GKE hosts hitting tcp_mem limits
but consuming far more memory for TCP buffers than instructed in tcp_mem[2]

Even if we force napi_alloc_skb() to only use order-0 pages, the issue
would still be there on arches with PAGE_SIZE &gt;= 32768

This patch makes sure that small skb head are kmalloc backed, so that
other objects in the slab page can be reused instead of being held as long
as skbs are sitting in socket queues.

Note that we might in the future use the sk_buff napi cache,
instead of going through a more expensive __alloc_skb()

Another idea would be to use separate page sizes depending
on the allocated length (to never have more than 4 frags per page)

I would like to thank Greg Thelen for his precious help on this matter,
analysing crash dumps is always a time consuming task.

Fixes: fd11a83dd363 ("net: Pull out core bits of __netdev_alloc_skb and add __napi_alloc_skb")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Paolo Abeni &lt;pabeni@redhat.com&gt;
Cc: Greg Thelen &lt;gthelen@google.com&gt;
Reviewed-by: Alexander Duyck &lt;alexanderduyck@fb.com&gt;
Acked-by: Michael S. Tsirkin &lt;mst@redhat.com&gt;
Link: https://lore.kernel.org/r/20210113161819.1155526-1-eric.dumazet@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: drop bogus skb with CHECKSUM_PARTIAL and offset beyond end of trimmed packet</title>
<updated>2021-01-17T12:55:14Z</updated>
<author>
<name>Vasily Averin</name>
<email>vvs@virtuozzo.com</email>
</author>
<published>2020-12-14T19:07:39Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2555bb2a5163e3741d5dd5916f3a9f0228750aca'/>
<id>urn:sha1:2555bb2a5163e3741d5dd5916f3a9f0228750aca</id>
<content type='text'>
commit 54970a2fbb673f090b7f02d7f57b10b2e0707155 upstream.

syzbot reproduces BUG_ON in skb_checksum_help():
tun creates (bogus) skb with huge partial-checksummed area and
small ip packet inside. Then ip_rcv trims the skb based on size
of internal ip packet, after that csum offset points beyond of
trimmed skb. Then checksum_tg() called via netfilter hook
triggers BUG_ON:

        offset = skb_checksum_start_offset(skb);
        BUG_ON(offset &gt;= skb_headlen(skb));

To work around the problem this patch forces pskb_trim_rcsum_slow()
to return -EINVAL in described scenario. It allows its callers to
drop such kind of packets.

Link: https://syzkaller.appspot.com/bug?id=b419a5ca95062664fe1a60b764621eb4526e2cd0
Reported-by: syzbot+7010af67ced6105e5ab6@syzkaller.appspotmail.com
Signed-off-by: Vasily Averin &lt;vvs@virtuozzo.com&gt;
Acked-by: Willem de Bruijn &lt;willemb@google.com&gt;
Link: https://lore.kernel.org/r/1b2494af-2c56-8ee2-7bc0-923fcad1cdf8@virtuozzo.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>net: Have netpoll bring-up DSA management interface</title>
<updated>2020-11-24T11:48:10Z</updated>
<author>
<name>Florian Fainelli</name>
<email>f.fainelli@gmail.com</email>
</author>
<published>2020-11-17T03:52:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b2892feb8aa81cc2c6eac1d441a5d8ef24d2ec0f'/>
<id>urn:sha1:b2892feb8aa81cc2c6eac1d441a5d8ef24d2ec0f</id>
<content type='text'>
[ Upstream commit 1532b9778478577152201adbafa7738b1e844868 ]

DSA network devices rely on having their DSA management interface up and
running otherwise their ndo_open() will return -ENETDOWN. Without doing
this it would not be possible to use DSA devices as netconsole when
configured on the command line. These devices also do not utilize the
upper/lower linking so the check about the netpoll device having upper
is not going to be a problem.

The solution adopted here is identical to the one done for
net/ipv4/ipconfig.c with 728c02089a0e ("net: ipv4: handle DSA enabled
master network devices"), with the network namespace scope being
restricted to that of the process configuring netpoll.

Fixes: 04ff53f96a93 ("net: dsa: Add netconsole support")
Tested-by: Vladimir Oltean &lt;olteanv@gmail.com&gt;
Signed-off-by: Florian Fainelli &lt;f.fainelli@gmail.com&gt;
Link: https://lore.kernel.org/r/20201117035236.22658-1-f.fainelli@gmail.com
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>neigh_stat_seq_next() should increase position index</title>
<updated>2020-10-01T09:11:51Z</updated>
<author>
<name>Vasily Averin</name>
<email>vvs@virtuozzo.com</email>
</author>
<published>2020-01-23T07:11:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=2fa70e389417da2cc54e1bb77aca749f2407ceac'/>
<id>urn:sha1:2fa70e389417da2cc54e1bb77aca749f2407ceac</id>
<content type='text'>
[ Upstream commit 1e3f9f073c47bee7c23e77316b07bc12338c5bba ]

if seq_file .next fuction does not change position index,
read after some lseek can generate unexpected output.

https://bugzilla.kernel.org/show_bug.cgi?id=206283
Signed-off-by: Vasily Averin &lt;vvs@virtuozzo.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Sasha Levin &lt;sashal@kernel.org&gt;
</content>
</entry>
<entry>
<title>net: disable netpoll on fresh napis</title>
<updated>2020-09-12T09:45:33Z</updated>
<author>
<name>Jakub Kicinski</name>
<email>kuba@kernel.org</email>
</author>
<published>2020-08-26T19:40:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=3fe69cc4436810cbbf07b1191af05bd3ca18837d'/>
<id>urn:sha1:3fe69cc4436810cbbf07b1191af05bd3ca18837d</id>
<content type='text'>
[ Upstream commit 96e97bc07e90f175a8980a22827faf702ca4cb30 ]

napi_disable() makes sure to set the NAPI_STATE_NPSVC bit to prevent
netpoll from accessing rings before init is complete. However, the
same is not done for fresh napi instances in netif_napi_add(),
even though we expect NAPI instances to be added as disabled.

This causes crashes during driver reconfiguration (enabling XDP,
changing the channel count) - if there is any printk() after
netif_napi_add() but before napi_enable().

To ensure memory ordering is correct we need to use RCU accessors.

Reported-by: Rob Sherwood &lt;rsher@fb.com&gt;
Fixes: 2d8bff12699a ("netpoll: Close race condition between poll_one_napi and napi_disable")
Signed-off-by: Jakub Kicinski &lt;kuba@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: Fix potential wrong skb-&gt;protocol in skb_vlan_untag()</title>
<updated>2020-09-03T09:19:22Z</updated>
<author>
<name>Miaohe Lin</name>
<email>linmiaohe@huawei.com</email>
</author>
<published>2020-08-15T08:44:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c40fc891ca6c7ada3b10e5c16593e6469d4dfa22'/>
<id>urn:sha1:c40fc891ca6c7ada3b10e5c16593e6469d4dfa22</id>
<content type='text'>
[ Upstream commit 55eff0eb7460c3d50716ed9eccf22257b046ca92 ]

We may access the two bytes after vlan_hdr in vlan_set_encap_proto(). So
we should pull VLAN_HLEN + sizeof(unsigned short) in skb_vlan_untag() or
we may access the wrong data.

Fixes: 0d5501c1c828 ("net: Always untag vlan-tagged traffic on input.")
Signed-off-by: Miaohe Lin &lt;linmiaohe@huawei.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
