<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include, branch v4.9.81</title>
<subtitle>Linux Kernel</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.81</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.81'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2018-02-13T11:36:01Z</updated>
<entry>
<title>x86/retpoline: Avoid retpolines for built-in __init functions</title>
<updated>2018-02-13T11:36:01Z</updated>
<author>
<name>David Woodhouse</name>
<email>dwmw@amazon.co.uk</email>
</author>
<published>2018-02-01T11:27:20Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fe4333893936daf7c02c3b8afc379bc600b5e286'/>
<id>urn:sha1:fe4333893936daf7c02c3b8afc379bc600b5e286</id>
<content type='text'>
(cherry picked from commit 66f793099a636862a71c59d4a6ba91387b155e0c)

There's no point in building init code with retpolines, since it runs before
any potentially hostile userspace does; and, for much of it, before the
retpoline is actually ALTERNATIVEd into place.
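
A minimal sketch of the mechanism, assuming the upstream patch shape
(RETPOLINE is defined by the build when the compiler supports retpolines):

  /* Tell GCC to keep ordinary indirect branches in __init code
   * instead of converting them to retpoline thunks.
   */
  #ifdef RETPOLINE
  #define __noretpoline __attribute__((indirect_branch("keep")))
  #else
  #define __noretpoline
  #endif

  /* Modules can run and be unloaded long after init, so the
   * exception only applies to built-in init code.
   */
  #ifdef MODULE
  #define __noinitretpoline
  #else
  #define __noinitretpoline __noretpoline
  #endif

  /* ...and __init picks the attribute up: */
  #define __init __section(.init.text) __cold notrace __noinitretpoline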

Signed-off-by: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: karahmed@amazon.de
Cc: peterz@infradead.org
Cc: bp@alien8.de
Link: https://lkml.kernel.org/r/1517484441-1420-2-git-send-email-dwmw@amazon.co.uk
Signed-off-by: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>vfs, fdtable: Prevent bounds-check bypass via speculative execution</title>
<updated>2018-02-13T11:36:00Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2018-01-30T01:03:05Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c26ceec69576cb61157d2487812fb2776e125260'/>
<id>urn:sha1:c26ceec69576cb61157d2487812fb2776e125260</id>
<content type='text'>
(cherry picked from commit 56c30ba7b348b90484969054d561f711ba196507)

'fd' is a user-controlled value that is used as a data dependency to
read from the 'fdt-&gt;fd' array.  In order to avoid potential leaks of
kernel memory values, block speculative execution of the instruction
stream that could issue reads based on an invalid 'file *' returned from
__fcheck_files.
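
A minimal sketch of the hardened lookup (roughly the shape of the
upstream change, abridged):

  static inline struct file *__fcheck_files(struct files_struct *files,
                                            unsigned int fd)
  {
          struct fdtable *fdt = rcu_dereference_raw(files-&gt;fdt);

          if (fd &lt; fdt-&gt;max_fds) {
                  /* Clamp 'fd' under speculation so a mis-predicted
                   * bounds check cannot index out of the array.
                   */
                  fd = array_index_nospec(fd, fdt-&gt;max_fds);
                  return rcu_dereference_raw(fdt-&gt;fd[fd]);
          }
          return NULL;
  }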

Co-developed-by: Elena Reshetova &lt;elena.reshetova@intel.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-arch@vger.kernel.org
Cc: kernel-hardening@lists.openwall.com
Cc: gregkh@linuxfoundation.org
Cc: Al Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: torvalds@linux-foundation.org
Cc: alan@linux.intel.com
Link: https://lkml.kernel.org/r/151727418500.33451.17392199002892248656.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>array_index_nospec: Sanitize speculative array de-references</title>
<updated>2018-02-13T11:35:59Z</updated>
<author>
<name>Dan Williams</name>
<email>dan.j.williams@intel.com</email>
</author>
<published>2018-01-30T01:02:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=579ef9ea20d67a80a0bc5d093259ff6e97087d59'/>
<id>urn:sha1:579ef9ea20d67a80a0bc5d093259ff6e97087d59</id>
<content type='text'>
(cherry picked from commit f3804203306e098dae9ca51540fcd5eb700d7f40)

array_index_nospec() is proposed as a generic mechanism to mitigate
Spectre-variant-1 attacks, i.e. attacks that bypass boundary checks
via speculative execution. The array_index_nospec() implementation is
expected to be safe for current-generation CPUs across multiple
architectures (ARM, x86).

Based on an original implementation by Linus Torvalds, tweaked to remove
speculative flows by Alexei Starovoitov, and tweaked again by Linus to
introduce an x86 assembly implementation for the mask generation.
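
In rough outline (a simplified sketch of the generic helper; the real
one adds build-time type checks):

  /* Evaluates to ~0UL when index &lt; size and to 0 otherwise, using
   * only arithmetic, so there is no branch to mis-speculate: when
   * index &lt; size, index | (size - 1 - index) has a clear sign bit,
   * its complement has the sign bit set, and the arithmetic shift
   * smears that bit into an all-ones mask.
   */
  static inline unsigned long array_index_mask_nospec(unsigned long index,
                                                      unsigned long size)
  {
          return ~(long)(index | (size - 1UL - index)) &gt;&gt; (BITS_PER_LONG - 1);
  }

  /* array_index_nospec(index, size) then ANDs the mask in: the index
   * is unchanged on the architectural in-bounds path and forced to 0
   * on a mis-speculated out-of-bounds path.
   */
  #define array_index_nospec(index, size)                         \
  ({                                                              \
          typeof(index) _i = (index);                             \
          typeof(size) _s = (size);                               \
          unsigned long _mask = array_index_mask_nospec(_i, _s);  \
                                                                  \
          (typeof(_i)) (_i &amp; _mask);                              \
  })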

Co-developed-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Co-developed-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Suggested-by: Cyril Novikov &lt;cnovikov@lynx.com&gt;
Signed-off-by: Dan Williams &lt;dan.j.williams@intel.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: linux-arch@vger.kernel.org
Cc: kernel-hardening@lists.openwall.com
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Catalin Marinas &lt;catalin.marinas@arm.com&gt;
Cc: Will Deacon &lt;will.deacon@arm.com&gt;
Cc: Russell King &lt;linux@armlinux.org.uk&gt;
Cc: gregkh@linuxfoundation.org
Cc: torvalds@linux-foundation.org
Cc: alan@linux.intel.com
Link: https://lkml.kernel.org/r/151727414229.33451.18411580953862676575.stgit@dwillia2-desk3.amr.corp.intel.com
Signed-off-by: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>module/retpoline: Warn about missing retpoline in module</title>
<updated>2018-02-13T11:35:58Z</updated>
<author>
<name>Andi Kleen</name>
<email>ak@linux.intel.com</email>
</author>
<published>2018-01-25T23:50:28Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a1745ad92f50e95b6c2d101cd9b77253969ddccc'/>
<id>urn:sha1:a1745ad92f50e95b6c2d101cd9b77253969ddccc</id>
<content type='text'>
(cherry picked from commit caf7501a1b4ec964190f31f9c3f163de252273b8)

There's a risk that a kernel which has full retpoline mitigations becomes
vulnerable when a module gets loaded that hasn't been compiled with the
right compiler or the right option.

To enable detection of that mismatch at module load time, add a module info
string "retpoline" at build time when the module was compiled with
retpoline support. This only covers compiled C source; assembler source
and prebuilt object files are not checked.

If a retpoline-enabled kernel detects a non-retpoline-protected module at
load time, print a warning and report it in the sysfs vulnerability file.
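
In outline (a sketch assuming the upstream mechanism: modpost emits the
tag, the module loader checks it):

  /* Emitted by modpost into each module's generated *.mod.c when the
   * kernel build defines RETPOLINE:
   */
  #ifdef RETPOLINE
  MODULE_INFO(retpoline, "Y");
  #endif

  /* Checked by the module loader at load time: */
  static void check_modinfo_retpoline(struct module *mod,
                                      struct load_info *info)
  {
          if (retpoline_module_ok(!!get_modinfo(info, "retpoline")))
                  return;

          pr_warn("%s: loading module not compiled with retpoline compiler.\n",
                  mod-&gt;name);
  }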

[ tglx: Massaged changelog ]

Signed-off-by: Andi Kleen &lt;ak@linux.intel.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: David Woodhouse &lt;dwmw2@infradead.org&gt;
Cc: gregkh@linuxfoundation.org
Cc: torvalds@linux-foundation.org
Cc: jeyu@kernel.org
Cc: arjan@linux.intel.com
Link: https://lkml.kernel.org/r/20180125235028.31211-1-andi@firstfloor.org
Signed-off-by: David Woodhouse &lt;dwmw@amazon.co.uk&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>tty: fix data race between tty_init_dev and flush of buf</title>
<updated>2018-02-03T16:05:41Z</updated>
<author>
<name>Gaurav Kohli</name>
<email>gkohli@codeaurora.org</email>
</author>
<published>2018-01-23T07:46:34Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=55eaecffe3d663d02084023b9fc06d5f39b97389'/>
<id>urn:sha1:55eaecffe3d663d02084023b9fc06d5f39b97389</id>
<content type='text'>
commit b027e2298bd588d6fa36ed2eda97447fb3eac078 upstream.

There can be a race if a receive_buf call comes in before tty
initialization completes in n_tty_open; in that window tty-&gt;disc_data
may still be NULL.

CPU0					CPU1
----					----
 000|n_tty_receive_buf_common()   	n_tty_open()
-001|n_tty_receive_buf2()		tty_ldisc_open.isra.3()
-002|tty_ldisc_receive_buf(inline)	tty_ldisc_setup()

Hold the ldisc semaphore lock in tty_init_dev until disc_data is
completely initialized.
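
A sketch of the change in tty_init_dev() (abridged, error paths left
schematic): take the ldisc write lock before publishing the tty, and
only release it once the ldisc is fully set up, so the receive path
cannot run against a half-initialised disc_data.

          retval = tty_ldisc_lock(tty, 5 * HZ);
          if (retval)
                  goto err_release_lock;
          tty-&gt;port-&gt;itty = tty;

          retval = tty_ldisc_setup(tty, tty-&gt;link);
          if (retval)
                  goto err_release_tty;
          tty_ldisc_unlock(tty);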

Signed-off-by: Gaurav Kohli &lt;gkohli@codeaurora.org&gt;
Reviewed-by: Alan Cox &lt;alan@linux.intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>net/mlx5: Define interface bits for fencing UMR wqe</title>
<updated>2018-02-03T16:05:33Z</updated>
<author>
<name>Max Gurtovoy</name>
<email>maxg@mellanox.com</email>
</author>
<published>2017-05-28T07:53:10Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=20e6f5bdf542cb2510bd54e69eed6fe395f6c026'/>
<id>urn:sha1:20e6f5bdf542cb2510bd54e69eed6fe395f6c026</id>
<content type='text'>
commit 1410a90ae449061b7e1ae19d275148f36948801b upstream.

HW can implement UMR wqe re-transmission in various ways.
Thus, add an HCA capability to distinguish which fence is needed for
UMR, making sure the wqe won't fail on mkey checks.
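
A sketch of the new bits (values as in the upstream commit; the
corresponding umr_fence capability field lives in
struct mlx5_ifc_cmd_hca_cap_bits):

  /* Which fence strength a UMR wqe needs on this device. */
  enum {
          MLX5_CAP_UMR_FENCE_STRONG       = 0x0,
          MLX5_CAP_UMR_FENCE_SMALL        = 0x1,
          MLX5_CAP_UMR_FENCE_NONE         = 0x2,
  };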

Signed-off-by: Max Gurtovoy &lt;maxg@mellanox.com&gt;
Acked-by: Leon Romanovsky &lt;leon@kernel.org&gt;
Reviewed-by: Christoph Hellwig &lt;hch@lst.de&gt;
Signed-off-by: Doug Ledford &lt;dledford@redhat.com&gt;
Cc: Marta Rybczynska &lt;mrybczyn@kalray.eu&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>bpf: avoid false sharing of map refcount with max_entries</title>
<updated>2018-01-31T11:55:57Z</updated>
<author>
<name>Daniel Borkmann</name>
<email>daniel@iogearbox.net</email>
</author>
<published>2018-01-29T01:48:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5cb917aa1f1e03df9a4c29b363e3900d73508fa8'/>
<id>urn:sha1:5cb917aa1f1e03df9a4c29b363e3900d73508fa8</id>
<content type='text'>
[ upstream commit be95a845cc4402272994ce290e3ad928aff06cb9 ]

In addition to commit b2157399cc98 ("bpf: prevent out-of-bounds
speculation"), also change the layout of struct bpf_map such that
false sharing of fast-path members like max_entries is avoided
when the map's reference counter is altered. To that end, enforce
that they are placed in separate cachelines.

pahole dump after change:

  struct bpf_map {
        const struct bpf_map_ops  * ops;                 /*     0     8 */
        struct bpf_map *           inner_map_meta;       /*     8     8 */
        void *                     security;             /*    16     8 */
        enum bpf_map_type          map_type;             /*    24     4 */
        u32                        key_size;             /*    28     4 */
        u32                        value_size;           /*    32     4 */
        u32                        max_entries;          /*    36     4 */
        u32                        map_flags;            /*    40     4 */
        u32                        pages;                /*    44     4 */
        u32                        id;                   /*    48     4 */
        int                        numa_node;            /*    52     4 */
        bool                       unpriv_array;         /*    56     1 */

        /* XXX 7 bytes hole, try to pack */

        /* --- cacheline 1 boundary (64 bytes) --- */
        struct user_struct *       user;                 /*    64     8 */
        atomic_t                   refcnt;               /*    72     4 */
        atomic_t                   usercnt;              /*    76     4 */
        struct work_struct         work;                 /*    80    32 */
        char                       name[16];             /*   112    16 */
        /* --- cacheline 2 boundary (128 bytes) --- */

        /* size: 128, cachelines: 2, members: 17 */
        /* sum members: 121, holes: 1, sum holes: 7 */
  };

Now all entries in the first cacheline are read-only throughout the
lifetime of the map and are set up once during map creation. Overall
struct size and the number of cachelines don't change with the
reordering. struct bpf_map is usually embedded as the first member in
the map structs of the specific map implementations, so also avoid
placing these members at the end, where they could share a cacheline
with the first map values (e.g. in the array map): remote CPUs can
trigger map updates just as well for those, intentionally dirtying
members like max_entries while keeping subsequent values in cache.

Quoting from Google's Project Zero blog [1]:

  Additionally, at least on the Intel machine on which this was
  tested, bouncing modified cache lines between cores is slow,
  apparently because the MESI protocol is used for cache coherence
  [8]. Changing the reference counter of an eBPF array on one
  physical CPU core causes the cache line containing the reference
  counter to be bounced over to that CPU core, making reads of the
  reference counter on all other CPU cores slow until the changed
  reference counter has been written back to memory. Because the
  length and the reference counter of an eBPF array are stored in
  the same cache line, this also means that changing the reference
  counter on one physical CPU core causes reads of the eBPF array's
  length to be slow on other physical CPU cores (intentional false
  sharing).

While this doesn't 'control' the out-of-bounds speculation by masking
the index as in commit b2157399cc98, triggering a manipulation of the
map's reference counter is really trivial, so let's not allow
max_entries to be easily affected through it.

Splitting into separate cachelines also generally makes sense from a
performance perspective, in that the fast path won't take a cache miss
when the map gets pinned, reused in other progs, etc. from the control
path; it thus also avoids unintentional false sharing.
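
In code, the split is a pair of ____cacheline_aligned annotations
(a sketch matching the pahole dump above):

  struct bpf_map {
          /* 1st cacheline: read-mostly members, written once at map
           * creation, also used in the fast path (ops, max_entries).
           */
          const struct bpf_map_ops *ops ____cacheline_aligned;
          struct bpf_map *inner_map_meta;
          void *security;
          enum bpf_map_type map_type;
          u32 key_size;
          u32 value_size;
          u32 max_entries;
          u32 map_flags;
          u32 pages;
          u32 id;
          int numa_node;
          bool unpriv_array;
          /* 2nd cacheline: members written during the map's lifetime,
           * aligned so refcount bouncing stays off the line above.
           */
          struct user_struct *user ____cacheline_aligned;
          atomic_t refcnt;
          atomic_t usercnt;
          struct work_struct work;
          char name[16];
  };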

  [1] https://googleprojectzero.blogspot.ch/2018/01/reading-privileged-memory-with-side.html

Signed-off-by: Daniel Borkmann &lt;daniel@iogearbox.net&gt;
Signed-off-by: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY</title>
<updated>2018-01-31T11:55:55Z</updated>
<author>
<name>Jim Westfall</name>
<email>jwestfall@surrealistic.net</email>
</author>
<published>2018-01-14T12:18:51Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=260eb694b5a4bd3424a7c445128fb5a784e95e3d'/>
<id>urn:sha1:260eb694b5a4bd3424a7c445128fb5a784e95e3d</id>
<content type='text'>
[ Upstream commit cd9ff4de0107c65d69d02253bb25d6db93c3dbc1 ]

Map all lookup neigh keys to INADDR_ANY for loopback/point-to-point devices
to avoid making an entry for every remote ip the device needs to talk to.

This used to be the old behavior, but it was broken by a263b3093641f
(ipv4: Make neigh lookups directly in output packet path) and later
removed in 0bb4087cbec0 (ipv4: Fix neigh lookup keying over
loopback/point-to-point devices) because it was broken.
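
A sketch of the restored mapping in the lookup fast path (roughly the
shape of the upstream change to include/net/arp.h):

  static inline struct neighbour *__ipv4_neigh_lookup_noref(struct net_device *dev,
                                                            u32 key)
  {
          /* Loopback/point-to-point devices share one neigh entry
           * keyed on INADDR_ANY instead of one per remote ip.
           */
          if (dev-&gt;flags &amp; (IFF_LOOPBACK | IFF_POINTOPOINT))
                  key = INADDR_ANY;

          return ___neigh_lookup_noref(&amp;arp_tbl, neigh_key_eq32, arp_hashfn,
                                       &amp;key, dev);
  }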

Signed-off-by: Jim Westfall &lt;jwestfall@surrealistic.net&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>net: tcp: close sock if net namespace is exiting</title>
<updated>2018-01-31T11:55:54Z</updated>
<author>
<name>Dan Streetman</name>
<email>ddstreet@ieee.org</email>
</author>
<published>2018-01-18T21:14:26Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cf67be7a1a21c4d15593d7bacff5a59e30749b74'/>
<id>urn:sha1:cf67be7a1a21c4d15593d7bacff5a59e30749b74</id>
<content type='text'>
[ Upstream commit 4ee806d51176ba7b8ff1efd81f271d7252e03a1d ]

When a tcp socket is closed, if it detects that its net namespace is
exiting, close immediately and do not wait for FIN sequence.

For normal sockets, a reference is taken to their net namespace, so it will
never exit while the socket is open.  However, kernel sockets do not take a
reference to their net namespace, so it may begin exiting while the kernel
socket is still open.  In this case if the kernel socket is a tcp socket,
it will stay open trying to complete its close sequence.  The sock's dst(s)
hold references to their interfaces; these are all transferred to the
namespace's loopback interface when the real interfaces are taken down.
When the namespace tries to take down its loopback interface, it hangs
waiting for all references to the loopback interface to release, which
results in messages like:

unregister_netdevice: waiting for lo to become free. Usage count = 1

These messages continue until the socket finally times out and closes.
Since the net namespace cleanup holds the net_mutex while calling its
registered pernet callbacks, any new net namespace initialization is
blocked until the current net namespace finishes exiting.

After this change, the tcp socket notices the exiting net namespace, and
closes immediately, releasing its dst(s) and their reference to the
loopback interface, which lets the net namespace continue exiting.
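
The exit check itself is a small helper on the namespace refcount (a
sketch of the helper added to include/net/net_namespace.h, plus its
call site in rough form):

  /* A namespace that has begun exiting has dropped its last
   * reference, so a zero count means: do not wait, close now.
   */
  static inline int check_net(const struct net *net)
  {
          return atomic_read(&amp;net-&gt;count) != 0;
  }

  /* ...used from the tcp close paths, roughly: */
          if (!check_net(sock_net(sk))) {
                  /* Not possible to send reset; just close */
                  tcp_set_state(sk, TCP_CLOSE);
          }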

Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811
Signed-off-by: Dan Streetman &lt;ddstreet@canonical.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL</title>
<updated>2018-01-31T11:55:53Z</updated>
<author>
<name>Ben Hutchings</name>
<email>ben.hutchings@codethink.co.uk</email>
</author>
<published>2018-01-22T20:06:42Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=8b0d3e81cdecb92b12c9d7924749a6f62f855790'/>
<id>urn:sha1:8b0d3e81cdecb92b12c9d7924749a6f62f855790</id>
<content type='text'>
[ Upstream commit e9191ffb65d8e159680ce0ad2224e1acbde6985c ]

Commit 513674b5a2c9 ("net: reevalulate autoflowlabel setting after
sysctl setting") removed the initialisation of
ipv6_pinfo::autoflowlabel and added a second flag to indicate
whether this field or the net namespace default should be used.

The getsockopt() handling for this case was not updated, so it
currently returns 0 for all sockets for which IPV6_AUTOFLOWLABEL is
not explicitly enabled.  Fix it to return the effective value, whether
that has been set at the socket or net namespace level.
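
A sketch of the fix: route getsockopt() through a helper that falls
back to the namespace default when the option was never set on the
socket (helper shape approximating the upstream commit):

  static bool ip6_autoflowlabel(struct net *net, const struct ipv6_pinfo *np)
  {
          /* Report the effective value, not the raw socket field. */
          if (!np-&gt;autoflowlabel_set)
                  return ip6_default_np_autolabel(net);
          else
                  return np-&gt;autoflowlabel;
  }

  /* ...and in the IPV6_AUTOFLOWLABEL case of ipv6_getsockopt(): */
          case IPV6_AUTOFLOWLABEL:
                  val = ip6_autoflowlabel(sock_net(sk), np);
                  break;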

Fixes: 513674b5a2c9 ("net: reevalulate autoflowlabel setting after sysctl ...")
Signed-off-by: Ben Hutchings &lt;ben.hutchings@codethink.co.uk&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
