<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/net, branch v4.15.9</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.15.9</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.15.9'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-03-09T06:47:35Z</updated>
<entry>
<title>udplite: fix partial checksum initialization</title>
<updated>2018-03-09T06:47:35Z</updated>
<author>
<name>Alexey Kodanev</name>
<email>alexey.kodanev@oracle.com</email>
</author>
<published>2018-02-15T17:18:43Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1e4661b4d10739bbe3fb3efc4c46e03334149601'/>
<id>urn:sha1:1e4661b4d10739bbe3fb3efc4c46e03334149601</id>
<content type='text'>
[ Upstream commit 15f35d49c93f4fa9875235e7bf3e3783d2dd7a1b ]

Since UDP-Lite is always using checksum, the following path is
triggered when calculating pseudo header for it:

  udp4_csum_init() or udp6_csum_init()
    skb_checksum_init_zero_check()
      __skb_checksum_validate_complete()

The problem can appear if skb-&gt;len is less than CHECKSUM_BREAK. In
this particular case __skb_checksum_validate_complete() also invokes
__skb_checksum_complete(skb). If UDP-Lite is using partial checksum
that covers only part of a packet, the function will return bad
checksum and the packet will be dropped.

It can be fixed if we skip skb_checksum_init_zero_check() and only
set the required pseudo header checksum for UDP-Lite with partial
checksum before udp4_csum_init()/udp6_csum_init() functions return.

Fixes: ed70fcfcee95 ("net: Call skb_checksum_init in IPv4")
Fixes: e4f45b7f40bd ("net: Call skb_checksum_init in IPv6")
Signed-off-by: Alexey Kodanev &lt;alexey.kodanev@oracle.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net_sched: get rid of rcu_barrier() in tcf_block_put_ext()</title>
<updated>2018-02-12T06:07:21Z</updated>
<author>
<name>Cong Wang</name>
<email>xiyou.wangcong@gmail.com</email>
</author>
<published>2017-12-04T18:48:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=41551c14bf0d519f464daf4ff81a4109b5b92f09'/>
<id>urn:sha1:41551c14bf0d519f464daf4ff81a4109b5b92f09</id>
<content type='text'>
[ Upstream commit efbf78973978b0d25af59bc26c8013a942af6e64 ]

Both Eric and Paolo noticed the rcu_barrier() we use in
tcf_block_put_ext() could be a performance bottleneck when
we have a lot of tc classes.

Paolo provided the following to demonstrate the issue:

tc qdisc add dev lo root htb
for I in `seq 1 1000`; do
        tc class add dev lo parent 1: classid 1:$I htb rate 100kbit
        tc qdisc add dev lo parent 1:$I handle $((I + 1)): htb
        for J in `seq 1 10`; do
                tc filter add dev lo parent $((I + 1)): u32 match ip src 1.1.1.$J
        done
done
time tc qdisc del dev root

real    0m54.764s
user    0m0.023s
sys     0m0.000s

The rcu_barrier() there is to ensure we free the block after all chains
are gone, that is, to queue tcf_block_put_final() at the tail of workqueue.
We can achieve this ordering requirement by refcnt'ing tcf block instead,
that is, the tcf block is freed only when the last chain in this block is
gone. This also simplifies the code.

Paolo reported after this patch we get:

real    0m0.017s
user    0m0.000s
sys     0m0.017s

Tested-by: Paolo Abeni &lt;pabeni@redhat.com&gt;
Cc: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Jiri Pirko &lt;jiri@mellanox.com&gt;
Cc: Jamal Hadi Salim &lt;jhs@mojatatu.com&gt;
Signed-off-by: Cong Wang &lt;xiyou.wangcong@gmail.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: don't call update_pmtu unconditionally</title>
<updated>2018-01-25T21:27:34Z</updated>
<author>
<name>Nicolas Dichtel</name>
<email>nicolas.dichtel@6wind.com</email>
</author>
<published>2018-01-25T18:03:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f15ca723c1ebe6c1a06bc95fda6b62cd87b44559'/>
<id>urn:sha1:f15ca723c1ebe6c1a06bc95fda6b62cd87b44559</id>
<content type='text'>
Some dst_ops (e.g. md_dst_ops)) doesn't set this handler. It may result to:
"BUG: unable to handle kernel NULL pointer dereference at           (null)"

Let's add a helper to check if update_pmtu is available before calling it.

Fixes: 52a589d51f10 ("geneve: update skb dst pmtu on tx path")
Fixes: a93bf0ff4490 ("vxlan: update skb dst pmtu on tx path")
CC: Roman Kapl &lt;code@rkapl.cz&gt;
CC: Xin Long &lt;lucien.xin@gmail.com&gt;
Signed-off-by: Nicolas Dichtel &lt;nicolas.dichtel@6wind.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: tcp: close sock if net namespace is exiting</title>
<updated>2018-01-25T15:56:45Z</updated>
<author>
<name>Dan Streetman</name>
<email>ddstreet@ieee.org</email>
</author>
<published>2018-01-18T21:14:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=4ee806d51176ba7b8ff1efd81f271d7252e03a1d'/>
<id>urn:sha1:4ee806d51176ba7b8ff1efd81f271d7252e03a1d</id>
<content type='text'>
When a tcp socket is closed, if it detects that its net namespace is
exiting, close immediately and do not wait for FIN sequence.

For normal sockets, a reference is taken to their net namespace, so it will
never exit while the socket is open.  However, kernel sockets do not take a
reference to their net namespace, so it may begin exiting while the kernel
socket is still open.  In this case if the kernel socket is a tcp socket,
it will stay open trying to complete its close sequence.  The sock's dst(s)
hold a reference to their interface, which are all transferred to the
namespace's loopback interface when the real interfaces are taken down.
When the namespace tries to take down its loopback interface, it hangs
waiting for all references to the loopback interface to release, which
results in messages like:

unregister_netdevice: waiting for lo to become free. Usage count = 1

These messages continue until the socket finally times out and closes.
Since the net namespace cleanup holds the net_mutex while calling its
registered pernet callbacks, any new net namespace initialization is
blocked until the current net namespace finishes exiting.

After this change, the tcp socket notices the exiting net namespace, and
closes immediately, releasing its dst(s) and their reference to the
loopback interface, which lets the net namespace continue exiting.

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811
Signed-off-by: Dan Streetman &lt;ddstreet@canonical.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net: sched: fix TCF_LAYER_LINK case in tcf_get_base_ptr</title>
<updated>2018-01-24T19:52:48Z</updated>
<author>
<name>Wolfgang Bumiller</name>
<email>w.bumiller@proxmox.com</email>
</author>
<published>2018-01-18T10:32:36Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d3303a65a00c94372ddab831570647488e6c06e2'/>
<id>urn:sha1:d3303a65a00c94372ddab831570647488e6c06e2</id>
<content type='text'>
TCF_LAYER_LINK and TCF_LAYER_NETWORK returned the same pointer as
skb-&gt;data points to the network header.
Use skb_mac_header instead.

Signed-off-by: Wolfgang Bumiller &lt;w.bumiller@proxmox.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL</title>
<updated>2018-01-24T00:53:24Z</updated>
<author>
<name>Ben Hutchings</name>
<email>ben.hutchings@codethink.co.uk</email>
</author>
<published>2018-01-22T20:06:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e9191ffb65d8e159680ce0ad2224e1acbde6985c'/>
<id>urn:sha1:e9191ffb65d8e159680ce0ad2224e1acbde6985c</id>
<content type='text'>
Commit 513674b5a2c9 ("net: reevalulate autoflowlabel setting after
sysctl setting") removed the initialisation of
ipv6_pinfo::autoflowlabel and added a second flag to indicate
whether this field or the net namespace default should be used.

The getsockopt() handling for this case was not updated, so it
currently returns 0 for all sockets for which IPV6_AUTOFLOWLABEL is
not explicitly enabled.  Fix it to return the effective value, whether
that has been set at the socket or net namespace level.

Fixes: 513674b5a2c9 ("net: reevalulate autoflowlabel setting after sysctl ...")
Signed-off-by: Ben Hutchings &lt;ben.hutchings@codethink.co.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net, sched: fix panic when updating miniq {b,q}stats</title>
<updated>2018-01-16T20:02:36Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2018-01-15T22:12:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=81d947e2b8dd2394586c3eaffdd2357797d3bf59'/>
<id>urn:sha1:81d947e2b8dd2394586c3eaffdd2357797d3bf59</id>
<content type='text'>
While working on fixing another bug, I ran into the following panic
on arm64 by simply attaching clsact qdisc, adding a filter and running
traffic on ingress to it:

  [...]
  [  178.188591] Unable to handle kernel read from unreadable memory at virtual address 810fb501f000
  [  178.197314] Mem abort info:
  [  178.200121]   ESR = 0x96000004
  [  178.203168]   Exception class = DABT (current EL), IL = 32 bits
  [  178.209095]   SET = 0, FnV = 0
  [  178.212157]   EA = 0, S1PTW = 0
  [  178.215288] Data abort info:
  [  178.218175]   ISV = 0, ISS = 0x00000004
  [  178.222019]   CM = 0, WnR = 0
  [  178.224997] user pgtable: 4k pages, 48-bit VAs, pgd = 0000000023cb3f33
  [  178.231531] [0000810fb501f000] *pgd=0000000000000000
  [  178.236508] Internal error: Oops: 96000004 [#1] SMP
  [...]
  [  178.311855] CPU: 73 PID: 2497 Comm: ping Tainted: G        W        4.15.0-rc7+ #5
  [  178.319413] Hardware name: FOXCONN R2-1221R-A4/C2U4N_MB, BIOS G31FB18A 03/31/2017
  [  178.326887] pstate: 60400005 (nZCv daif +PAN -UAO)
  [  178.331685] pc : __netif_receive_skb_core+0x49c/0xac8
  [  178.336728] lr : __netif_receive_skb+0x28/0x78
  [  178.341161] sp : ffff00002344b750
  [  178.344465] x29: ffff00002344b750 x28: ffff810fbdfd0580
  [  178.349769] x27: 0000000000000000 x26: ffff000009378000
  [...]
  [  178.418715] x1 : 0000000000000054 x0 : 0000000000000000
  [  178.424020] Process ping (pid: 2497, stack limit = 0x000000009f0a3ff4)
  [  178.430537] Call trace:
  [  178.432976]  __netif_receive_skb_core+0x49c/0xac8
  [  178.437670]  __netif_receive_skb+0x28/0x78
  [  178.441757]  process_backlog+0x9c/0x160
  [  178.445584]  net_rx_action+0x2f8/0x3f0
  [...]

Reason is that sch_ingress and sch_clsact are doing mini_qdisc_pair_init()
which sets up miniq pointers to cpu_{b,q}stats from the underlying qdisc.
Problem is that this cannot work since they are actually set up right after
the qdisc -&gt;init() callback in qdisc_create(), so first packet going into
sch_handle_ingress() tries to call mini_qdisc_bstats_cpu_update() and we
therefore panic.

In order to fix this, allocation of {b,q}stats needs to happen before we
call into -&gt;init(). In net-next, there's already such option through commit
d59f5ffa59d8 ("net: sched: a dflt qdisc may be used with per cpu stats").
However, the bug needs to be fixed in net still for 4.15. Thus, include
these bits to reduce any merge churn and reuse the static_flags field to
set TCQ_F_CPUSTATS, and remove the allocation from qdisc_create() since
there is no other user left. Prashant Bhole ran into the same issue but
for net-next, thus adding him below as well as co-author. Same issue was
also reported by Sandipan Das when using bcc.

Fixes: 46209401f8f6 ("net: core: introduce mini_Qdisc and eliminate usage of tp-&gt;q for clsact fastpath")
Reference: https://lists.iovisor.org/pipermail/iovisor-dev/2018-January/001190.html
Reported-by: Sandipan Das &lt;sandipan@linux.vnet.ibm.com&gt;
Co-authored-by: Prashant Bhole &lt;bhole_prashant_q7@lab.ntt.co.jp&gt;
Co-authored-by: John Fastabend &lt;john.fastabend@gmail.com&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Jiri Pirko &lt;jiri@resnulli.us&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>Merge tag 'mac80211-for-davem-2018-01-15' of git://git.kernel.org/pub/scm/linux/kernel/git/jberg/mac80211</title>
<updated>2018-01-16T19:28:14Z</updated>
<author>
<name>David S. Miller</name>
<email>davem@davemloft.net</email>
</author>
<published>2018-01-16T19:28:14Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=161f72ed6dbe7fb176585091d3b797125d310399'/>
<id>urn:sha1:161f72ed6dbe7fb176585091d3b797125d310399</id>
<content type='text'>
Johannes Berg says:

====================
More fixes:
 * hwsim:
    - properly flush deletion works at module unload
    - validate # of channels passed from userspace
 * cfg80211:
    - fix RCU locking regression
    - initialize on-stack channel data for nl80211 event
    - check dev_set_name() return value
====================

Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY</title>
<updated>2018-01-15T19:53:43Z</updated>
<author>
<name>Jim Westfall</name>
<email>jwestfall@surrealistic.net</email>
</author>
<published>2018-01-14T12:18:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cd9ff4de0107c65d69d02253bb25d6db93c3dbc1'/>
<id>urn:sha1:cd9ff4de0107c65d69d02253bb25d6db93c3dbc1</id>
<content type='text'>
Map all lookup neigh keys to INADDR_ANY for loopback/point-to-point devices
to avoid making an entry for every remote ip the device needs to talk to.

This used the be the old behavior but became broken in a263b3093641f
(ipv4: Make neigh lookups directly in output packet path) and later removed
in 0bb4087cbec0 (ipv4: Fix neigh lookup keying over loopback/point-to-point
devices) because it was broken.

Signed-off-by: Jim Westfall &lt;jwestfall@surrealistic.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
<entry>
<title>net/tls: Fix inverted error codes to avoid endless loop</title>
<updated>2018-01-15T19:21:57Z</updated>
<author>
<name>r.hering@avm.de</name>
<email>r.hering@avm.de</email>
</author>
<published>2018-01-12T14:42:06Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=30be8f8dba1bd2aff73e8447d59228471233a3d4'/>
<id>urn:sha1:30be8f8dba1bd2aff73e8447d59228471233a3d4</id>
<content type='text'>
sendfile() calls can hang endless with using Kernel TLS if a socket error occurs.
Socket error codes must be inverted by Kernel TLS before returning because
they are stored with positive sign. If returned non-inverted they are
interpreted as number of bytes sent, causing endless looping of the
splice mechanic behind sendfile().

Signed-off-by: Robert Hering &lt;r.hering@avm.de&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
</content>
</entry>
</feed>
