<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include/linux, branch v4.17.9</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.17.9</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.17.9'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-07-22T13:16:09Z</updated>
<entry>
<title>bpf: undo prog rejection on read-only lock failure</title>
<updated>2018-07-22T13:16:09Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2018-06-28T21:34:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b1a4a5d0b0cba805445ecefb1ea24581fd868aa1'/>
<id>urn:sha1:b1a4a5d0b0cba805445ecefb1ea24581fd868aa1</id>
<content type='text'>
commit 85782e037f8aba8922dadb24a1523ca0b82ab8bc upstream.

Partially undo commit 9facc336876f ("bpf: reject any prog that failed
read-only lock") since it caused a regression, that is, syzkaller was
able to manage to cause a panic via fault injection deep in set_memory_ro()
path by letting an allocation fail: In x86's __change_page_attr_set_clr()
it was able to change the attributes of the primary mapping but not in
the alias mapping via cpa_process_alias(), so the second, inner call
to the __change_page_attr() via __change_page_attr_set_clr() had to split
a larger page and failed in the alloc_pages() with the artifically triggered
allocation error which is then propagated down to the call site.

Thus, for set_memory_ro() this means that it returned with an error, but
from debugging a probe_kernel_write() revealed EFAULT on that memory since
the primary mapping succeeded to get changed. Therefore the subsequent
hdr-&gt;locked = 0 reset triggered the panic as it was performed on read-only
memory, so call-site assumptions were infact wrong to assume that it would
either succeed /or/ not succeed at all since there's no such rollback in
set_memory_*() calls from partial change of mappings, in other words, we're
left in a state that is "half done". A later undo via set_memory_rw() is
succeeding though due to matching permissions on that part (aka due to the
try_preserve_large_page() succeeding). While reproducing locally with
explicitly triggering this error, the initial splitting only happens on
rare occasions and in real world it would additionally need oom conditions,
but that said, it could partially fail. Therefore, it is definitely wrong
to bail out on set_memory_ro() error and reject the program with the
set_memory_*() semantics we have today. Shouldn't have gone the extra mile
since no other user in tree today infact checks for any set_memory_*()
errors, e.g. neither module_enable_ro() / module_disable_ro() for module
RO/NX handling which is mostly default these days nor kprobes core with
alloc_insn_page() / free_insn_page() as examples that could be invoked long
after bootup and original 314beb9bcabf ("x86: bpf_jit_comp: secure bpf jit
against spraying attacks") did neither when it got first introduced to BPF
so "improving" with bailing out was clearly not right when set_memory_*()
cannot handle it today.

Kees suggested that if set_memory_*() can fail, we should annotate it with
__must_check, and all callers need to deal with it gracefully given those
set_memory_*() markings aren't "advisory", but they're expected to actually
do what they say. This might be an option worth to move forward in future
but would at the same time require that set_memory_*() calls from supporting
archs are guaranteed to be "atomic" in that they provide rollback if part
of the range fails, once that happened, the transition from RW -&gt; RO could
be made more robust that way, while subsequent RO -&gt; RW transition /must/
continue guaranteeing to always succeed the undo part.

Reported-by: syzbot+a4eb8c7766952a1ca872@syzkaller.appspotmail.com
Reported-by: syzbot+d866d1925855328eac3b@syzkaller.appspotmail.com
Fixes: 9facc336876f ("bpf: reject any prog that failed read-only lock")
Cc: Laura Abbott &lt;labbott@redhat.com&gt;
Cc: Kees Cook &lt;keescook@chromium.org&gt;
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bpf: enforce correct alignment for instructions</title>
<updated>2018-07-22T13:16:08Z</updated>
<author>
<name>Eric Dumazet</name>
<email>edumazet@google.com</email>
</author>
<published>2018-06-21T00:24:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d3aaff449e24bd254db7a44e049461a127137114'/>
<id>urn:sha1:d3aaff449e24bd254db7a44e049461a127137114</id>
<content type='text'>
commit 9262478220eac908ae6e168c3df2c453c87e2da3 upstream.

After commit 9facc336876f ("bpf: reject any prog that failed read-only lock")
offsetof(struct bpf_binary_header, image) became 3 instead of 4,
breaking powerpc BPF badly, since instructions need to be word aligned.

Fixes: 9facc336876f ("bpf: reject any prog that failed read-only lock")
Signed-off-by: Eric Dumazet &lt;edumazet@google.com&gt;
Cc: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Cc: Martin KaFai Lau &lt;kafai@fb.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>arm64: Call ARCH_WORKAROUND_2 on transitions between EL0 and EL1</title>
<updated>2018-07-22T13:16:07Z</updated>
<author>
<name>Marc Zyngier</name>
<email>marc.zyngier@arm.com</email>
</author>
<published>2018-07-20T09:47:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=55e72ae8b2dd30c369f678828bbbc3586b729b07'/>
<id>urn:sha1:55e72ae8b2dd30c369f678828bbbc3586b729b07</id>
<content type='text'>
commit 8e2906245f1e3b0d027169d9f2e55ce0548cb96e upstream.

In order for the kernel to protect itself, let's call the SSBD mitigation
implemented by the higher exception level (either hypervisor or firmware)
on each transition between userspace and kernel.

We must take the PSCI conduit into account in order to target the
right exception level, hence the introduction of a runtime patching
callback.

Reviewed-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Reviewed-by: Julien Grall &lt;julien.grall@arm.com&gt;
Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Signed-off-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>arm/arm64: smccc: Add SMCCC-specific return codes</title>
<updated>2018-07-22T13:16:07Z</updated>
<author>
<name>Marc Zyngier</name>
<email>marc.zyngier@arm.com</email>
</author>
<published>2018-07-20T09:47:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e27fa3220ca328c5cc090e9bd7a54016e3a00d17'/>
<id>urn:sha1:e27fa3220ca328c5cc090e9bd7a54016e3a00d17</id>
<content type='text'>
commit eff0e9e1078ea7dc1d794dc50e31baef984c46d7 upstream.

We've so far used the PSCI return codes for SMCCC because they
were extremely similar. But with the new ARM DEN 0070A specification,
"NOT_REQUIRED" (-2) is clashing with PSCI's "PSCI_RET_INVALID_PARAMS".

Let's bite the bullet and add SMCCC specific return codes. Users
can be repainted as and when required.

Acked-by: Will Deacon &lt;will.deacon@arm.com&gt;
Reviewed-by: Mark Rutland &lt;mark.rutland@arm.com&gt;
Signed-off-by: Marc Zyngier &lt;marc.zyngier@arm.com&gt;
Signed-off-by: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>block: don't use blocking queue entered for recursive bio submits</title>
<updated>2018-07-22T13:16:06Z</updated>
<author>
<name>Jens Axboe</name>
<email>axboe@kernel.dk</email>
</author>
<published>2018-06-02T20:04:07Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cc4cc98785daf355e9b6405bb5adee52d121b1b3'/>
<id>urn:sha1:cc4cc98785daf355e9b6405bb5adee52d121b1b3</id>
<content type='text'>
commit cd4a4ae4683dc2e09380118e205e057896dcda2b upstream.

If we end up splitting a bio and the queue goes away between
the initial submission and the later split submission, then we
can block forever in blk_queue_enter() waiting for the reference
to drop to zero. This will never happen, since we already hold
a reference.

Mark a split bio as already having entered the queue, so we can
just use the live non-blocking queue enter variant.

Thanks to Tetsuo Handa for the analysis.

Reported-by: syzbot+c4f9cebf9d651f6e54de@syzkaller.appspotmail.com
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bpf: reject any prog that failed read-only lock</title>
<updated>2018-07-22T13:16:06Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2018-06-15T00:30:48Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ab2acc0f4a9f8591c5d82421ca2b0810f17bafba'/>
<id>urn:sha1:ab2acc0f4a9f8591c5d82421ca2b0810f17bafba</id>
<content type='text'>
commit 9facc336876f7ecf9edba4c67b90426fde4ec898 upstream.

We currently lock any JITed image as read-only via bpf_jit_binary_lock_ro()
as well as the BPF image as read-only through bpf_prog_lock_ro(). In
the case any of these would fail we throw a WARN_ON_ONCE() in order to
yell loudly to the log. Perhaps, to some extend, this may be comparable
to an allocation where __GFP_NOWARN is explicitly not set.

Added via 65869a47f348 ("bpf: improve read-only handling"), this behavior
is slightly different compared to any of the other in-kernel set_memory_ro()
users who do not check the return code of set_memory_ro() and friends /at
all/ (e.g. in the case of module_enable_ro() / module_disable_ro()). Given
in BPF this is mandatory hardening step, we want to know whether there
are any issues that would leave both BPF data writable. So it happens
that syzkaller enabled fault injection and it triggered memory allocation
failure deep inside x86's change_page_attr_set_clr() which was triggered
from set_memory_ro().

Now, there are two options: i) leaving everything as is, and ii) reworking
the image locking code in order to have a final checkpoint out of the
central bpf_prog_select_runtime() which probes whether any of the calls
during prog setup weren't successful, and then bailing out with an error.
Option ii) is a better approach since this additional paranoia avoids
altogether leaving any potential W+X pages from BPF side in the system.
Therefore, lets be strict about it, and reject programs in such unlikely
occasion. While testing I noticed also that one bpf_prog_lock_ro()
call was missing on the outer dummy prog in case of calls, e.g. in the
destructor we call bpf_prog_free_deferred() on the main prog where we
try to bpf_prog_unlock_free() the program, and since we go via
bpf_prog_select_runtime() do that as well.

Reported-by: syzbot+3b889862e65a98317058@syzkaller.appspotmail.com
Reported-by: syzbot+9e762b52dd17e616a7a5@syzkaller.appspotmail.com
Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Acked-by: Martin KaFai Lau &lt;kafai@fb.com&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bdi: Fix another oops in wb_workfn()</title>
<updated>2018-07-22T13:16:06Z</updated>
<author>
<name>Jan Kara</name>
<email>jack@suse.cz</email>
</author>
<published>2018-06-18T13:46:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=69a7cd51a13cb64bd2baf69107748b2f34a35545'/>
<id>urn:sha1:69a7cd51a13cb64bd2baf69107748b2f34a35545</id>
<content type='text'>
commit 3ee7e8697d5860b173132606d80a9cd35e7113ee upstream.

syzbot is reporting NULL pointer dereference at wb_workfn() [1] due to
wb-&gt;bdi-&gt;dev being NULL. And Dmitry confirmed that wb-&gt;state was
WB_shutting_down after wb-&gt;bdi-&gt;dev became NULL. This indicates that
unregister_bdi() failed to call wb_shutdown() on one of wb objects.

The problem is in cgwb_bdi_unregister() which does cgwb_kill() and thus
drops bdi's reference to wb structures before going through the list of
wbs again and calling wb_shutdown() on each of them. This way the loop
iterating through all wbs can easily miss a wb if that wb has already
passed through cgwb_remove_from_bdi_list() called from wb_shutdown()
from cgwb_release_workfn() and as a result fully shutdown bdi although
wb_workfn() for this wb structure is still running. In fact there are
also other ways cgwb_bdi_unregister() can race with
cgwb_release_workfn() leading e.g. to use-after-free issues:

CPU1                            CPU2
                                cgwb_bdi_unregister()
                                  cgwb_kill(*slot);

cgwb_release()
  queue_work(cgwb_release_wq, &amp;wb-&gt;release_work);
cgwb_release_workfn()
                                  wb = list_first_entry(&amp;bdi-&gt;wb_list, ...)
                                  spin_unlock_irq(&amp;cgwb_lock);
  wb_shutdown(wb);
  ...
  kfree_rcu(wb, rcu);
                                  wb_shutdown(wb); -&gt; oops use-after-free

We solve these issues by synchronizing writeback structure shutdown from
cgwb_bdi_unregister() with cgwb_release_workfn() using a new mutex. That
way we also no longer need synchronization using WB_shutting_down as the
mutex provides it for CONFIG_CGROUP_WRITEBACK case and without
CONFIG_CGROUP_WRITEBACK wb_shutdown() can be called only once from
bdi_unregister().

Reported-by: syzbot &lt;syzbot+4a7438e774b21ddd8eca@syzkaller.appspotmail.com&gt;
Acked-by: Tejun Heo &lt;tj@kernel.org&gt;
Signed-off-by: Jan Kara &lt;jack@suse.cz&gt;
Signed-off-by: Jens Axboe &lt;axboe@kernel.dk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>net/mlx5: E-Switch, Avoid setup attempt if not being e-switch manager</title>
<updated>2018-07-22T13:16:00Z</updated>
<author>
<name>Or Gerlitz</name>
<email>ogerlitz@mellanox.com</email>
</author>
<published>2018-05-31T08:16:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ac1131c5d89d56d53f99a60163c7a00397124a45'/>
<id>urn:sha1:ac1131c5d89d56d53f99a60163c7a00397124a45</id>
<content type='text'>
[ Upstream commit 0efc8562491b7d36f6bbc4fbc8f3348cb6641e9c ]

In smartnic env, the host (PF) driver might not be an e-switch
manager, hence the FW will err on driver attempts to deal with
setting/unsetting the eswitch and as a result the overall setup
of sriov will fail.

Fix that by avoiding the operation if e-switch management is not
allowed for this driver instance. While here, move to use the
correct name for the esw manager capability name.

Fixes: 81848731ff40 ('net/mlx5: E-Switch, Add SR-IOV (FDB) support')
Signed-off-by: Or Gerlitz &lt;ogerlitz@mellanox.com&gt;
Reported-by: Guy Kushnir &lt;guyk@mellanox.com&gt;
Reviewed-by: Eli Cohen &lt;eli@melloanox.com&gt;
Tested-by: Eli Cohen &lt;eli@melloanox.com&gt;
Signed-off-by: Saeed Mahameed &lt;saeedm@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: fix use-after-free in GRO with ESP</title>
<updated>2018-07-22T13:15:59Z</updated>
<author>
<name>Sabrina Dubroca</name>
<email>sd@queasysnail.net</email>
</author>
<published>2018-06-30T15:38:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=beff4d81c5472ba11f9ed0db0e273a4cf9314583'/>
<id>urn:sha1:beff4d81c5472ba11f9ed0db0e273a4cf9314583</id>
<content type='text'>
[ Upstream commit 603d4cf8fe095b1ee78f423d514427be507fb513 ]

Since the addition of GRO for ESP, gro_receive can consume the skb and
return -EINPROGRESS. In that case, the lower layer GRO handler cannot
touch the skb anymore.

Commit 5f114163f2f5 ("net: Add a skb_gro_flush_final helper.") converted
some of the gro_receive handlers that can lead to ESP's gro_receive so
that they wouldn't access the skb when -EINPROGRESS is returned, but
missed other spots, mainly in tunneling protocols.

This patch finishes the conversion to using skb_gro_flush_final(), and
adds a new helper, skb_gro_flush_final_remcsum(), used in VXLAN and
GUE.

Fixes: 5f114163f2f5 ("net: Add a skb_gro_flush_final helper.")
Signed-off-by: Sabrina Dubroca &lt;sd@queasysnail.net&gt;
Reviewed-by: Stefano Brivio &lt;sbrivio@redhat.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>atm: Preserve value of skb-&gt;truesize when accounting to vcc</title>
<updated>2018-07-22T13:15:58Z</updated>
<author>
<name>David Woodhouse</name>
<email>dwmw2@infradead.org</email>
</author>
<published>2018-06-16T10:55:44Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=9993992284836cee7d3a8a78c239aab22bbbe724'/>
<id>urn:sha1:9993992284836cee7d3a8a78c239aab22bbbe724</id>
<content type='text'>
[ Upstream commit 9bbe60a67be5a1c6f79b3c9be5003481a50529ff ]

ATM accounts for in-flight TX packets in sk_wmem_alloc of the VCC on
which they are to be sent. But it doesn't take ownership of those
packets from the sock (if any) which originally owned them. They should
remain owned by their actual sender until they've left the box.

There's a hack in pskb_expand_head() to avoid adjusting skb-&gt;truesize
for certain skbs, precisely to avoid messing up sk_wmem_alloc
accounting. Ideally that hack would cover the ATM use case too, but it
doesn't — skbs which aren't owned by any sock, for example PPP control
frames, still get their truesize adjusted when the low-level ATM driver
adds headroom.

This has always been an issue, it seems. The truesize of a packet
increases, and sk_wmem_alloc on the VCC goes negative. But this wasn't
for normal traffic, only for control frames. So I think we just got away
with it, and we probably needed to send 2GiB of LCP echo frames before
the misaccounting would ever have caused a problem and caused
atm_may_send() to start refusing packets.

Commit 14afee4b609 ("net: convert sock.sk_wmem_alloc from atomic_t to
refcount_t") did exactly what it was intended to do, and turned this
mostly-theoretical problem into a real one, causing PPPoATM to fail
immediately as sk_wmem_alloc underflows and atm_may_send() *immediately*
starts refusing to allow new packets.

The least intrusive solution to this problem is to stash the value of
skb-&gt;truesize that was accounted to the VCC, in a new member of the
ATM_SKB(skb) structure. Then in atm_pop_raw() subtract precisely that
value instead of the then-current value of skb-&gt;truesize.

Fixes: 158f323b9868 ("net: adjust skb-&gt;truesize in pskb_expand_head()")
Signed-off-by: David Woodhouse &lt;dwmw2@infradead.org&gt;
Tested-by: Kevin Darbyshire-Bryant &lt;ldir@darbyshire-bryant.me.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
