<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/include, branch v4.9.43</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.43</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.9.43'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2017-08-11T15:49:37Z</updated>
<entry>
<title>workqueue: implicit ordered attribute should be overridable</title>
<updated>2017-08-11T15:49:37Z</updated>
<author>
<name>Tejun Heo</name>
<email>tj@kernel.org</email>
</author>
<published>2017-07-23T12:36:15Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f9636c9bdd5828f29cdfaf620e3a424a5f8cc221'/>
<id>urn:sha1:f9636c9bdd5828f29cdfaf620e3a424a5f8cc221</id>
<content type='text'>
commit 0a94efb5acbb6980d7c9ab604372d93cd507e4d8 upstream.

5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be
ordered") automatically enabled ordered attribute for unbound
workqueues w/ max_active == 1.  Because ordered workqueues reject
max_active and some attribute changes, this implicit ordered mode
broke cases where the user creates an unbound workqueue w/ max_active
== 1 and later explicitly changes the related attributes.

This patch distinguishes explicit and implicit ordered setting and
overrides from attribute changes if implict.

Signed-off-by: Tejun Heo &lt;tj@kernel.org&gt;
Fixes: 5c0338c68706 ("workqueue: restore WQ_UNBOUND/max_active==1 to be ordered")
Cc: Holger Hoffstätte &lt;holger@applied-asynchrony.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>signal: protect SIGNAL_UNKILLABLE from unintentional clearing.</title>
<updated>2017-08-11T15:49:36Z</updated>
<author>
<name>Jamie Iles</name>
<email>jamie.iles@oracle.com</email>
</author>
<published>2017-01-11T00:57:54Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=916a05b90d8322f38ea133feb3a194617e507c3c'/>
<id>urn:sha1:916a05b90d8322f38ea133feb3a194617e507c3c</id>
<content type='text'>
[ Upstream commit 2d39b3cd34e6d323720d4c61bd714f5ae202c022 ]

Since commit 00cd5c37afd5 ("ptrace: permit ptracing of /sbin/init") we
can now trace init processes.  init is initially protected with
SIGNAL_UNKILLABLE which will prevent fatal signals such as SIGSTOP, but
there are a number of paths during tracing where SIGNAL_UNKILLABLE can
be implicitly cleared.

This can result in init becoming stoppable/killable after tracing.  For
example, running:

  while true; do kill -STOP 1; done &amp;
  strace -p 1

and then stopping strace and the kill loop will result in init being
left in state TASK_STOPPED.  Sending SIGCONT to init will resume it, but
init will now respond to future SIGSTOP signals rather than ignoring
them.

Make sure that when setting SIGNAL_STOP_CONTINUED/SIGNAL_STOP_STOPPED
that we don't clear SIGNAL_UNKILLABLE.

Link: http://lkml.kernel.org/r/20170104122017.25047-1-jamie.iles@oracle.com
Signed-off-by: Jamie Iles &lt;jamie.iles@oracle.com&gt;
Acked-by: Oleg Nesterov &lt;oleg@redhat.com&gt;
Cc: Alexander Viro &lt;viro@zeniv.linux.org.uk&gt;
Cc: Ingo Molnar &lt;mingo@redhat.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>mm, slab: make sure that KMALLOC_MAX_SIZE will fit into MAX_ORDER</title>
<updated>2017-08-11T15:49:36Z</updated>
<author>
<name>Michal Hocko</name>
<email>mhocko@suse.com</email>
</author>
<published>2017-01-11T00:57:27Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c736011052cf2fdc6e310195ce0fd0805ddbb0b6'/>
<id>urn:sha1:c736011052cf2fdc6e310195ce0fd0805ddbb0b6</id>
<content type='text'>
[ Upstream commit bb1107f7c6052c863692a41f78c000db792334bf ]

Andrey Konovalov has reported the following warning triggered by the
syzkaller fuzzer.

  WARNING: CPU: 1 PID: 9935 at mm/page_alloc.c:3511 __alloc_pages_nodemask+0x159c/0x1e20
  Kernel panic - not syncing: panic_on_warn set ...
  CPU: 1 PID: 9935 Comm: syz-executor0 Not tainted 4.9.0-rc7+ #34
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
  Call Trace:
    __alloc_pages_slowpath mm/page_alloc.c:3511
    __alloc_pages_nodemask+0x159c/0x1e20 mm/page_alloc.c:3781
    alloc_pages_current+0x1c7/0x6b0 mm/mempolicy.c:2072
    alloc_pages include/linux/gfp.h:469
    kmalloc_order+0x1f/0x70 mm/slab_common.c:1015
    kmalloc_order_trace+0x1f/0x160 mm/slab_common.c:1026
    kmalloc_large include/linux/slab.h:422
    __kmalloc+0x210/0x2d0 mm/slub.c:3723
    kmalloc include/linux/slab.h:495
    ep_write_iter+0x167/0xb50 drivers/usb/gadget/legacy/inode.c:664
    new_sync_write fs/read_write.c:499
    __vfs_write+0x483/0x760 fs/read_write.c:512
    vfs_write+0x170/0x4e0 fs/read_write.c:560
    SYSC_write fs/read_write.c:607
    SyS_write+0xfb/0x230 fs/read_write.c:599
    entry_SYSCALL_64_fastpath+0x1f/0xc2

The issue is caused by a lack of size check for the request size in
ep_write_iter which should be fixed.  It, however, points to another
problem, that SLUB defines KMALLOC_MAX_SIZE too large because the its
KMALLOC_SHIFT_MAX is (MAX_ORDER + PAGE_SHIFT) which means that the
resulting page allocator request might be MAX_ORDER which is too large
(see __alloc_pages_slowpath).

The same applies to the SLOB allocator which allows even larger sizes.
Make sure that they are capped properly and never request more than
MAX_ORDER order.

Link: http://lkml.kernel.org/r/20161220130659.16461-2-mhocko@kernel.org
Signed-off-by: Michal Hocko &lt;mhocko@suse.com&gt;
Reported-by: Andrey Konovalov &lt;andreyknvl@google.com&gt;
Acked-by: Christoph Lameter &lt;cl@linux.com&gt;
Cc: Alexei Starovoitov &lt;ast@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Sasha Levin &lt;alexander.levin@verizon.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>wext: handle NULL extra data in iwe_stream_add_point better</title>
<updated>2017-08-11T15:49:34Z</updated>
<author>
<name>Arnd Bergmann</name>
<email>arnd@arndb.de</email>
</author>
<published>2017-01-11T14:35:25Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b87145215abe415dc3a31d0ff0e187117fe8e1b6'/>
<id>urn:sha1:b87145215abe415dc3a31d0ff0e187117fe8e1b6</id>
<content type='text'>
commit 93be2b74279c15c2844684b1a027fdc71dd5d9bf upstream.

gcc-7 complains that wl3501_cs passes NULL into a function that
then uses the argument as the input for memcpy:

drivers/net/wireless/wl3501_cs.c: In function 'wl3501_get_scan':
include/net/iw_handler.h:559:3: error: argument 2 null where non-null expected [-Werror=nonnull]
   memcpy(stream + point_len, extra, iwe-&gt;u.data.length);

This works fine here because iwe-&gt;u.data.length is guaranteed to be 0
and the memcpy doesn't actually have an effect.

Making the length check explicit avoids the warning and should have
no other effect here.

Also check the pointer itself, since otherwise we get warnings
elsewhere in the code.

Signed-off-by: Arnd Bergmann &lt;arnd@arndb.de&gt;
Signed-off-by: Johannes Berg &lt;johannes.berg@intel.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>sctp: fix the check for _sctp_walk_params and _sctp_walk_errors</title>
<updated>2017-08-11T15:49:33Z</updated>
<author>
<name>Xin Long</name>
<email>lucien.xin@gmail.com</email>
</author>
<published>2017-07-26T08:24:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=df32d08293ea7db4fe7e030bd47c71b7f63fc05f'/>
<id>urn:sha1:df32d08293ea7db4fe7e030bd47c71b7f63fc05f</id>
<content type='text'>
[ Upstream commit 6b84202c946cd3da3a8daa92c682510e9ed80321 ]

Commit b1f5bfc27a19 ("sctp: don't dereference ptr before leaving
_sctp_walk_{params, errors}()") tried to fix the issue that it
may overstep the chunk end for _sctp_walk_{params, errors} with
'chunk_end &gt; offset(length) + sizeof(length)'.

But it introduced a side effect: When processing INIT, it verifies
the chunks with 'param.v == chunk_end' after iterating all params
by sctp_walk_params(). With the check 'chunk_end &gt; offset(length)
+ sizeof(length)', it would return when the last param is not yet
accessed. Because the last param usually is fwdtsn supported param
whose size is 4 and 'chunk_end == offset(length) + sizeof(length)'

This is a badly issue even causing sctp couldn't process 4-shakes.
Client would always get abort when connecting to server, due to
the failure of INIT chunk verification on server.

The patch is to use 'chunk_end &lt;= offset(length) + sizeof(length)'
instead of 'chunk_end &lt; offset(length) + sizeof(length)' for both
_sctp_walk_params and _sctp_walk_errors.

Fixes: b1f5bfc27a19 ("sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()")
Signed-off-by: Xin Long &lt;lucien.xin@gmail.com&gt;
Acked-by: Neil Horman &lt;nhorman@tuxdriver.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>sctp: don't dereference ptr before leaving _sctp_walk_{params, errors}()</title>
<updated>2017-08-11T15:49:33Z</updated>
<author>
<name>Alexander Potapenko</name>
<email>glider@google.com</email>
</author>
<published>2017-07-14T16:32:45Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=cc6f1486f2cb053a647db3a51cba5fbeabaa4a3f'/>
<id>urn:sha1:cc6f1486f2cb053a647db3a51cba5fbeabaa4a3f</id>
<content type='text'>
[ Upstream commit b1f5bfc27a19f214006b9b4db7b9126df2dfdf5a ]

If the length field of the iterator (|pos.p| or |err|) is past the end
of the chunk, we shouldn't access it.

This bug has been detected by KMSAN. For the following pair of system
calls:

  socket(PF_INET6, SOCK_STREAM, 0x84 /* IPPROTO_??? */) = 3
  sendto(3, "A", 1, MSG_OOB, {sa_family=AF_INET6, sin6_port=htons(0),
         inet_pton(AF_INET6, "::1", &amp;sin6_addr), sin6_flowinfo=0,
         sin6_scope_id=0}, 28) = 1

the tool has reported a use of uninitialized memory:

  ==================================================================
  BUG: KMSAN: use of uninitialized memory in sctp_rcv+0x17b8/0x43b0
  CPU: 1 PID: 2940 Comm: probe Not tainted 4.11.0-rc5+ #2926
  Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs
  01/01/2011
  Call Trace:
   &lt;IRQ&gt;
   __dump_stack lib/dump_stack.c:16
   dump_stack+0x172/0x1c0 lib/dump_stack.c:52
   kmsan_report+0x12a/0x180 mm/kmsan/kmsan.c:927
   __msan_warning_32+0x61/0xb0 mm/kmsan/kmsan_instr.c:469
   __sctp_rcv_init_lookup net/sctp/input.c:1074
   __sctp_rcv_lookup_harder net/sctp/input.c:1233
   __sctp_rcv_lookup net/sctp/input.c:1255
   sctp_rcv+0x17b8/0x43b0 net/sctp/input.c:170
   sctp6_rcv+0x32/0x70 net/sctp/ipv6.c:984
   ip6_input_finish+0x82f/0x1ee0 net/ipv6/ip6_input.c:279
   NF_HOOK ./include/linux/netfilter.h:257
   ip6_input+0x239/0x290 net/ipv6/ip6_input.c:322
   dst_input ./include/net/dst.h:492
   ip6_rcv_finish net/ipv6/ip6_input.c:69
   NF_HOOK ./include/linux/netfilter.h:257
   ipv6_rcv+0x1dbd/0x22e0 net/ipv6/ip6_input.c:203
   __netif_receive_skb_core+0x2f6f/0x3a20 net/core/dev.c:4208
   __netif_receive_skb net/core/dev.c:4246
   process_backlog+0x667/0xba0 net/core/dev.c:4866
   napi_poll net/core/dev.c:5268
   net_rx_action+0xc95/0x1590 net/core/dev.c:5333
   __do_softirq+0x485/0x942 kernel/softirq.c:284
   do_softirq_own_stack+0x1c/0x30 arch/x86/entry/entry_64.S:902
   &lt;/IRQ&gt;
   do_softirq kernel/softirq.c:328
   __local_bh_enable_ip+0x25b/0x290 kernel/softirq.c:181
   local_bh_enable+0x37/0x40 ./include/linux/bottom_half.h:31
   rcu_read_unlock_bh ./include/linux/rcupdate.h:931
   ip6_finish_output2+0x19b2/0x1cf0 net/ipv6/ip6_output.c:124
   ip6_finish_output+0x764/0x970 net/ipv6/ip6_output.c:149
   NF_HOOK_COND ./include/linux/netfilter.h:246
   ip6_output+0x456/0x520 net/ipv6/ip6_output.c:163
   dst_output ./include/net/dst.h:486
   NF_HOOK ./include/linux/netfilter.h:257
   ip6_xmit+0x1841/0x1c00 net/ipv6/ip6_output.c:261
   sctp_v6_xmit+0x3b7/0x470 net/sctp/ipv6.c:225
   sctp_packet_transmit+0x38cb/0x3a20 net/sctp/output.c:632
   sctp_outq_flush+0xeb3/0x46e0 net/sctp/outqueue.c:885
   sctp_outq_uncork+0xb2/0xd0 net/sctp/outqueue.c:750
   sctp_side_effects net/sctp/sm_sideeffect.c:1773
   sctp_do_sm+0x6962/0x6ec0 net/sctp/sm_sideeffect.c:1147
   sctp_primitive_ASSOCIATE+0x12c/0x160 net/sctp/primitive.c:88
   sctp_sendmsg+0x43e5/0x4f90 net/sctp/socket.c:1954
   inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
   sock_sendmsg_nosec net/socket.c:633
   sock_sendmsg net/socket.c:643
   SYSC_sendto+0x608/0x710 net/socket.c:1696
   SyS_sendto+0x8a/0xb0 net/socket.c:1664
   do_syscall_64+0xe6/0x130 arch/x86/entry/common.c:285
   entry_SYSCALL64_slow_path+0x25/0x25 arch/x86/entry/entry_64.S:246
  RIP: 0033:0x401133
  RSP: 002b:00007fff6d99cd38 EFLAGS: 00000246 ORIG_RAX: 000000000000002c
  RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000401133
  RDX: 0000000000000001 RSI: 0000000000494088 RDI: 0000000000000003
  RBP: 00007fff6d99cd90 R08: 00007fff6d99cd50 R09: 000000000000001c
  R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000000
  R13: 00000000004063d0 R14: 0000000000406460 R15: 0000000000000000
  origin:
   save_stack_trace+0x37/0x40 arch/x86/kernel/stacktrace.c:59
   kmsan_save_stack_with_flags mm/kmsan/kmsan.c:302
   kmsan_internal_poison_shadow+0xb1/0x1a0 mm/kmsan/kmsan.c:198
   kmsan_poison_shadow+0x6d/0xc0 mm/kmsan/kmsan.c:211
   slab_alloc_node mm/slub.c:2743
   __kmalloc_node_track_caller+0x200/0x360 mm/slub.c:4351
   __kmalloc_reserve net/core/skbuff.c:138
   __alloc_skb+0x26b/0x840 net/core/skbuff.c:231
   alloc_skb ./include/linux/skbuff.h:933
   sctp_packet_transmit+0x31e/0x3a20 net/sctp/output.c:570
   sctp_outq_flush+0xeb3/0x46e0 net/sctp/outqueue.c:885
   sctp_outq_uncork+0xb2/0xd0 net/sctp/outqueue.c:750
   sctp_side_effects net/sctp/sm_sideeffect.c:1773
   sctp_do_sm+0x6962/0x6ec0 net/sctp/sm_sideeffect.c:1147
   sctp_primitive_ASSOCIATE+0x12c/0x160 net/sctp/primitive.c:88
   sctp_sendmsg+0x43e5/0x4f90 net/sctp/socket.c:1954
   inet_sendmsg+0x498/0x670 net/ipv4/af_inet.c:762
   sock_sendmsg_nosec net/socket.c:633
   sock_sendmsg net/socket.c:643
   SYSC_sendto+0x608/0x710 net/socket.c:1696
   SyS_sendto+0x8a/0xb0 net/socket.c:1664
   do_syscall_64+0xe6/0x130 arch/x86/entry/common.c:285
   return_from_SYSCALL_64+0x0/0x6a arch/x86/entry/entry_64.S:246
  ==================================================================

Signed-off-by: Alexander Potapenko &lt;glider@google.com&gt;
Signed-off-by: David S. Miller &lt;davem@davemloft.net&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>iscsi-target: Fix initial login PDU asynchronous socket close OOPs</title>
<updated>2017-08-11T15:49:31Z</updated>
<author>
<name>Nicholas Bellinger</name>
<email>nab@linux-iscsi.org</email>
</author>
<published>2017-05-25T04:47:09Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bdabf097f05b9ef78c58622e3dfdec39b11a2ee5'/>
<id>urn:sha1:bdabf097f05b9ef78c58622e3dfdec39b11a2ee5</id>
<content type='text'>
commit 25cdda95fda78d22d44157da15aa7ea34be3c804 upstream.

This patch fixes a OOPs originally introduced by:

   commit bb048357dad6d604520c91586334c9c230366a14
   Author: Nicholas Bellinger &lt;nab@linux-iscsi.org&gt;
   Date:   Thu Sep 5 14:54:04 2013 -0700

   iscsi-target: Add sk-&gt;sk_state_change to cleanup after TCP failure

which would trigger a NULL pointer dereference when a TCP connection
was closed asynchronously via iscsi_target_sk_state_change(), but only
when the initial PDU processing in iscsi_target_do_login() from iscsi_np
process context was blocked waiting for backend I/O to complete.

To address this issue, this patch makes the following changes.

First, it introduces some common helper functions used for checking
socket closing state, checking login_flags, and atomically checking
socket closing state + setting login_flags.

Second, it introduces a LOGIN_FLAGS_INITIAL_PDU bit to know when a TCP
connection has dropped via iscsi_target_sk_state_change(), but the
initial PDU processing within iscsi_target_do_login() in iscsi_np
context is still running.  For this case, it sets LOGIN_FLAGS_CLOSED,
but doesn't invoke schedule_delayed_work().

The original NULL pointer dereference case reported by MNC is now handled
by iscsi_target_do_login() doing a iscsi_target_sk_check_close() before
transitioning to FFP to determine when the socket has already closed,
or iscsi_target_start_negotiation() if the login needs to exchange
more PDUs (eg: iscsi_target_do_login returned 0) but the socket has
closed.  For both of these cases, the cleanup up of remaining connection
resources will occur in iscsi_target_start_negotiation() from iscsi_np
process context once the failure is detected.

Finally, to handle to case where iscsi_target_sk_state_change() is
called after the initial PDU procesing is complete, it now invokes
conn-&gt;login_work -&gt; iscsi_target_do_login_rx() to perform cleanup once
existing iscsi_target_sk_check_close() checks detect connection failure.
For this case, the cleanup of remaining connection resources will occur
in iscsi_target_do_login_rx() from delayed workqueue process context
once the failure is detected.

Reported-by: Mike Christie &lt;mchristi@redhat.com&gt;
Reviewed-by: Mike Christie &lt;mchristi@redhat.com&gt;
Tested-by: Mike Christie &lt;mchristi@redhat.com&gt;
Cc: Mike Christie &lt;mchristi@redhat.com&gt;
Reported-by: Hannes Reinecke &lt;hare@suse.com&gt;
Cc: Hannes Reinecke &lt;hare@suse.com&gt;
Cc: Sagi Grimberg &lt;sagi@grimberg.me&gt;
Cc: Varun Prakash &lt;varun@chelsio.com&gt;
Signed-off-by: Nicholas Bellinger &lt;nab@linux-iscsi.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>cpuset: fix a deadlock due to incomplete patching of cpusets_enabled()</title>
<updated>2017-08-11T15:49:29Z</updated>
<author>
<name>Dima Zavin</name>
<email>dmitriyz@waymo.com</email>
</author>
<published>2017-08-02T20:32:18Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=45a636ec1849f842feb0716c7e25176a8b53af69'/>
<id>urn:sha1:45a636ec1849f842feb0716c7e25176a8b53af69</id>
<content type='text'>
commit 89affbf5d9ebb15c6460596822e8857ea2f9e735 upstream.

In codepaths that use the begin/retry interface for reading
mems_allowed_seq with irqs disabled, there exists a race condition that
stalls the patch process after only modifying a subset of the
static_branch call sites.

This problem manifested itself as a deadlock in the slub allocator,
inside get_any_partial.  The loop reads mems_allowed_seq value (via
read_mems_allowed_begin), performs the defrag operation, and then
verifies the consistency of mem_allowed via the read_mems_allowed_retry
and the cookie returned by xxx_begin.

The issue here is that both begin and retry first check if cpusets are
enabled via cpusets_enabled() static branch.  This branch can be
rewritted dynamically (via cpuset_inc) if a new cpuset is created.  The
x86 jump label code fully synchronizes across all CPUs for every entry
it rewrites.  If it rewrites only one of the callsites (specifically the
one in read_mems_allowed_retry) and then waits for the
smp_call_function(do_sync_core) to complete while a CPU is inside the
begin/retry section with IRQs off and the mems_allowed value is changed,
we can hang.

This is because begin() will always return 0 (since it wasn't patched
yet) while retry() will test the 0 against the actual value of the seq
counter.

The fix is to use two different static keys: one for begin
(pre_enable_key) and one for retry (enable_key).  In cpuset_inc(), we
first bump the pre_enable key to ensure that cpuset_mems_allowed_begin()
always return a valid seqcount if are enabling cpusets.  Similarly, when
disabling cpusets via cpuset_dec(), we first ensure that callers of
cpuset_mems_allowed_retry() will start ignoring the seqcount value
before we let cpuset_mems_allowed_begin() return 0.

The relevant stack traces of the two stuck threads:

  CPU: 1 PID: 1415 Comm: mkdir Tainted: G L  4.9.36-00104-g540c51286237 #4
  Hardware name: Default string Default string/Hardware, BIOS 4.29.1-20170526215256 05/26/2017
  task: ffff8817f9c28000 task.stack: ffffc9000ffa4000
  RIP: smp_call_function_many+0x1f9/0x260
  Call Trace:
    smp_call_function+0x3b/0x70
    on_each_cpu+0x2f/0x90
    text_poke_bp+0x87/0xd0
    arch_jump_label_transform+0x93/0x100
    __jump_label_update+0x77/0x90
    jump_label_update+0xaa/0xc0
    static_key_slow_inc+0x9e/0xb0
    cpuset_css_online+0x70/0x2e0
    online_css+0x2c/0xa0
    cgroup_apply_control_enable+0x27f/0x3d0
    cgroup_mkdir+0x2b7/0x420
    kernfs_iop_mkdir+0x5a/0x80
    vfs_mkdir+0xf6/0x1a0
    SyS_mkdir+0xb7/0xe0
    entry_SYSCALL_64_fastpath+0x18/0xad

  ...

  CPU: 2 PID: 1 Comm: init Tainted: G L  4.9.36-00104-g540c51286237 #4
  Hardware name: Default string Default string/Hardware, BIOS 4.29.1-20170526215256 05/26/2017
  task: ffff8818087c0000 task.stack: ffffc90000030000
  RIP: int3+0x39/0x70
  Call Trace:
    &lt;#DB&gt; ? ___slab_alloc+0x28b/0x5a0
    &lt;EOE&gt; ? copy_process.part.40+0xf7/0x1de0
    __slab_alloc.isra.80+0x54/0x90
    copy_process.part.40+0xf7/0x1de0
    copy_process.part.40+0xf7/0x1de0
    kmem_cache_alloc_node+0x8a/0x280
    copy_process.part.40+0xf7/0x1de0
    _do_fork+0xe7/0x6c0
    _raw_spin_unlock_irq+0x2d/0x60
    trace_hardirqs_on_caller+0x136/0x1d0
    entry_SYSCALL_64_fastpath+0x5/0xad
    do_syscall_64+0x27/0x350
    SyS_clone+0x19/0x20
    do_syscall_64+0x60/0x350
    entry_SYSCALL64_slow_path+0x25/0x25

Link: http://lkml.kernel.org/r/20170731040113.14197-1-dmitriyz@waymo.com
Fixes: 46e700abc44c ("mm, page_alloc: remove unnecessary taking of a seqlock when cpusets are disabled")
Signed-off-by: Dima Zavin &lt;dmitriyz@waymo.com&gt;
Reported-by: Cliff Spradlin &lt;cspradlin@waymo.com&gt;
Acked-by: Vlastimil Babka &lt;vbabka@suse.cz&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Christopher Lameter &lt;cl@linux.com&gt;
Cc: Li Zefan &lt;lizefan@huawei.com&gt;
Cc: Pekka Enberg &lt;penberg@kernel.org&gt;
Cc: David Rientjes &lt;rientjes@google.com&gt;
Cc: Joonsoo Kim &lt;iamjoonsoo.kim@lge.com&gt;
Cc: Mel Gorman &lt;mgorman@techsingularity.net&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>mm, mprotect: flush TLB if potentially racing with a parallel reclaim leaving stale TLB entries</title>
<updated>2017-08-11T15:49:29Z</updated>
<author>
<name>Mel Gorman</name>
<email>mgorman@suse.de</email>
</author>
<published>2017-08-02T20:31:52Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5a1eef71aa2aaa46892a07a0cfc94dd38d24b040'/>
<id>urn:sha1:5a1eef71aa2aaa46892a07a0cfc94dd38d24b040</id>
<content type='text'>
commit 3ea277194daaeaa84ce75180ec7c7a2075027a68 upstream.

Nadav Amit identified a theoritical race between page reclaim and
mprotect due to TLB flushes being batched outside of the PTL being held.

He described the race as follows:

        CPU0                            CPU1
        ----                            ----
                                        user accesses memory using RW PTE
                                        [PTE now cached in TLB]
        try_to_unmap_one()
        ==&gt; ptep_get_and_clear()
        ==&gt; set_tlb_ubc_flush_pending()
                                        mprotect(addr, PROT_READ)
                                        ==&gt; change_pte_range()
                                        ==&gt; [ PTE non-present - no flush ]

                                        user writes using cached RW PTE
        ...

        try_to_unmap_flush()

The same type of race exists for reads when protecting for PROT_NONE and
also exists for operations that can leave an old TLB entry behind such
as munmap, mremap and madvise.

For some operations like mprotect, it's not necessarily a data integrity
issue but it is a correctness issue as there is a window where an
mprotect that limits access still allows access.  For munmap, it's
potentially a data integrity issue although the race is massive as an
munmap, mmap and return to userspace must all complete between the
window when reclaim drops the PTL and flushes the TLB.  However, it's
theoritically possible so handle this issue by flushing the mm if
reclaim is potentially currently batching TLB flushes.

Other instances where a flush is required for a present pte should be ok
as either the page lock is held preventing parallel reclaim or a page
reference count is elevated preventing a parallel free leading to
corruption.  In the case of page_mkclean there isn't an obvious path
that userspace could take advantage of without using the operations that
are guarded by this patch.  Other users such as gup as a race with
reclaim looks just at PTEs.  huge page variants should be ok as they
don't race with reclaim.  mincore only looks at PTEs.  userfault also
should be ok as if a parallel reclaim takes place, it will either fault
the page back in or read some of the data before the flush occurs
triggering a fault.

Note that a variant of this patch was acked by Andy Lutomirski but this
was for the x86 parts on top of his PCID work which didn't make the 4.13
merge window as expected.  His ack is dropped from this version and
there will be a follow-on patch on top of PCID that will include his
ack.

[akpm@linux-foundation.org: tweak comments]
[akpm@linux-foundation.org: fix spello]
Link: http://lkml.kernel.org/r/20170717155523.emckq2esjro6hf3z@suse.de
Reported-by: Nadav Amit &lt;nadav.amit@gmail.com&gt;
Signed-off-by: Mel Gorman &lt;mgorman@suse.de&gt;
Cc: Andy Lutomirski &lt;luto@kernel.org&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>device property: Make dev_fwnode() public</title>
<updated>2017-08-11T15:49:29Z</updated>
<author>
<name>Sakari Ailus</name>
<email>sakari.ailus@linux.intel.com</email>
</author>
<published>2017-03-28T07:52:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=1f32e67adac40508e9c8753151effbe0bbf06bdb'/>
<id>urn:sha1:1f32e67adac40508e9c8753151effbe0bbf06bdb</id>
<content type='text'>
commit e44bb0cbdc88686c21e2175a990b40bf6db5d005 upstream.

The function to obtain a fwnode related to a struct device is useful for
drivers that use the fwnode property API: it allows not being aware of the
underlying firmware implementation.

Signed-off-by: Sakari Ailus &lt;sakari.ailus@linux.intel.com&gt;
Reviewed-by: Mika Westerberg &lt;mika.westerberg@linux.intel.com&gt;
Signed-off-by: Rafael J. Wysocki &lt;rafael.j.wysocki@intel.com&gt;
Cc: Chris Metcalf &lt;cmetcalf@mellanox.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
</feed>
