<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/futex.c, branch v4.4.265</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.265</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v4.4.265'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2021-03-17T15:07:19Z</updated>
<entry>
<title>futex: fix dead code in attach_to_pi_owner()</title>
<updated>2021-03-17T15:07:19Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2021-03-11T03:26:00Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=96f3135d3e5d3921c8b6eb9bf0f841ec4457f8b5'/>
<id>urn:sha1:96f3135d3e5d3921c8b6eb9bf0f841ec4457f8b5</id>
<content type='text'>
This patch comes directly from the original patch (commit
91509e84949fc97e7424521c32a9e227746e0b85) in v4.9.
It is part of a full patch which was originally back-ported
to v4.14 as commit e6e00df182908f34360c3c9f2d13cc719362e9c0.

The handle_exit_race() function, as defined in commit 9c3f39860367
("futex: Cure exit race"), never returns -EBUSY. This leaves a small
piece of dead code in the attach_to_pi_owner() function:

	int ret = handle_exit_race(uaddr, uval, p); /* Never returns -EBUSY */
	...
	if (ret == -EBUSY)
		*exiting = p; /* dead code */

The return value -EBUSY was added to handle_exit_race() in upstream
commit ac31c7ff8624409 ("futex: Provide distinct return value when
owner is exiting"). That commit was incorporated into v4.9.255, before
the function handle_exit_race() was introduced, and therefore without
the modification to handle_exit_race().

To fix the dead code, extract the handle_exit_race() change from
commit ac31c7ff8624409 ("futex: Provide distinct return value when owner
is exiting") and re-incorporate it.
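
The dead branch can be illustrated with a rough user-space model. This is
only a sketch, not the kernel code: the MODEL_* constants and the three
model_* functions are hypothetical stand-ins, and it assumes that the
function as first back-ported returned -EAGAIN for an owner that is still
exiting.

```c
/* Hypothetical stand-ins for the kernel errno values (Linux numbering). */
enum { MODEL_ESRCH = 3, MODEL_EAGAIN = 11, MODEL_EBUSY = 16 };

/*
 * Model of handle_exit_race() without the -EBUSY change (assumption):
 * an owner that has not finished exiting yields -EAGAIN.
 */
static int model_exit_race_old(int exit_done, int uval_changed)
{
	if (!exit_done)
		return -MODEL_EAGAIN;
	return uval_changed ? -MODEL_EAGAIN : -MODEL_ESRCH;
}

/* Same model with the -EBUSY change applied. */
static int model_exit_race_fixed(int exit_done, int uval_changed)
{
	if (!exit_done)
		return -MODEL_EBUSY;
	return uval_changed ? -MODEL_EAGAIN : -MODEL_ESRCH;
}

/*
 * Model of the caller's check in attach_to_pi_owner(): returns 1 when
 * the "*exiting = p" branch would be taken, 0 otherwise.
 */
static int model_exiting_branch_taken(int ret)
{
	return ret == -MODEL_EBUSY;
}
```

In this model no input to model_exit_race_old() makes
model_exiting_branch_taken() return 1, which is what "dead code" means
here. With model_exit_race_fixed() the branch is reachable again.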

Lee writes:

This commit takes the remaining functional snippet of:

 ac31c7ff8624409 ("futex: Provide distinct return value when owner is exiting")

... and is the correct fix for this issue.

Fixes: 9c3f39860367 ("futex: Cure exit race")
Cc: stable@vger.kernel.org # v4.9.258
Signed-off-by: Xiaoming Ni &lt;nixiaoming@huawei.com&gt;
Reviewed-by: Lee Jones &lt;lee.jones@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>futex: Cure exit race</title>
<updated>2021-03-17T15:07:19Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2021-03-11T03:25:59Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=6fb7304f4526736417c48af821a85d1b94d634dd'/>
<id>urn:sha1:6fb7304f4526736417c48af821a85d1b94d634dd</id>
<content type='text'>
commit da791a667536bf8322042e38ca85d55a78d3c273 upstream.

This patch comes directly from the original patch (commit
9c3f3986036760c48a92f04b36774aa9f63673f80) in v4.9.

Stefan reported that the glibc tst-robustpi4 test case fails
occasionally. That case creates the following race between
sys_exit() and sys_futex_lock_pi():

 CPU0				CPU1

 sys_exit()			sys_futex()
  do_exit()			 futex_lock_pi()
   exit_signals(tsk)		  No waiters:
    tsk-&gt;flags |= PF_EXITING;	  *uaddr == 0x00000PID
  mm_release(tsk)		  Set waiter bit
   exit_robust_list(tsk) {	  *uaddr = 0x80000PID;
      Set owner died		  attach_to_pi_owner() {
    *uaddr = 0xC0000000;	   tsk = get_task(PID);
   }				   if (!(tsk-&gt;flags &amp; PF_EXITING)) {
  ...				     attach();
  tsk-&gt;flags |= PF_EXITPIDONE;	   } else {
				     if (!(tsk-&gt;flags &amp; PF_EXITPIDONE))
				       return -EAGAIN;
				     return -ESRCH; &lt;--- FAIL
				   }

ESRCH is returned all the way to user space, which triggers the glibc test
case assert. Returning ESRCH unconditionally is wrong here because the user
space value has been changed by the exiting task to 0xC0000000, i.e. the
FUTEX_OWNER_DIED bit is set and the futex PID value has been cleared. This
is a valid state and the kernel has to handle it, i.e. taking the futex.

Cure it by rereading the user space value when PF_EXITING and PF_EXITPIDONE
are set in the task which 'owns' the futex. If the value has changed, let
the kernel retry the operation, which includes all regular sanity checks
and correctly handles the FUTEX_OWNER_DIED case.

If it hasn't changed, then return ESRCH as there is no way to distinguish
this case from malfunctioning user space. This happens when the exiting
task did not have a robust list, the robust list was corrupted or the user
space value in the futex was simply bogus.
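
The reread decision can be sketched in user space as follows. This is a
simplified model under stated assumptions, not the kernel implementation:
sketch_owner_gone() and the SKETCH_* names are hypothetical, the two bit
values mirror FUTEX_WAITERS and FUTEX_OWNER_DIED from the futex ABI, and
the reread from user space is assumed to have succeeded already.

```c
/* Hypothetical stand-ins for the kernel errno values (Linux numbering). */
enum { SKETCH_ESRCH = 3, SKETCH_EAGAIN = 11 };

/* Bit values as in the futex ABI (FUTEX_WAITERS, FUTEX_OWNER_DIED). */
#define SKETCH_FUTEX_WAITERS	0x80000000u
#define SKETCH_FUTEX_OWNER_DIED	0x40000000u

/*
 * Decide what to do once the alleged owner has both PF_EXITING and
 * PF_EXITPIDONE set. uval_seen is the value read before the task
 * lookup, uval_now is the value reread from user space.
 */
static int sketch_owner_gone(unsigned int uval_seen, unsigned int uval_now)
{
	if (uval_now != uval_seen)
		return -SKETCH_EAGAIN;	/* value moved on: retry the operation */
	return -SKETCH_ESRCH;		/* unchanged: treated as bogus user space */
}
```

With a hypothetical PID of 0x1234, a waiter that saw 0x80001234 and
rereads 0xC0000000 gets the retry result, and the retry then handles the
FUTEX_OWNER_DIED state through the regular checks.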

Reported-by: Stefan Liebler &lt;stli@linux.ibm.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Heiko Carstens &lt;heiko.carstens@de.ibm.com&gt;
Cc: Darren Hart &lt;dvhart@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@kernel.org&gt;
Cc: Sasha Levin &lt;sashal@kernel.org&gt;
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=200467
Link: https://lkml.kernel.org/r/20181210152311.986181245@linutronix.de
Signed-off-by: Sudip Mukherjee &lt;sudipm.mukherjee@gmail.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
[Lee: Required to satisfy functional dependency from futex back-port.
 Re-add the missing handle_exit_race() parts from:
 3d4775df0a89 ("futex: Replace PF_EXITPIDONE with a state")]
Signed-off-by: Lee Jones &lt;lee.jones@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>futex: Change locking rules</title>
<updated>2021-03-17T15:07:19Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2021-03-11T03:25:58Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d88fa1a42f4a6780fb7df38a28b594a19aa71f3a'/>
<id>urn:sha1:d88fa1a42f4a6780fb7df38a28b594a19aa71f3a</id>
<content type='text'>
commit 734009e96d1983ad739e5b656e03430b3660c913 upstream.

This patch comes directly from the original patch (commit
dc3f2ff11740159080f2e8e359ae0ab57c8e74b6) in v4.9.

Currently futex-pi relies on hb-&gt;lock to serialize everything. But hb-&gt;lock
creates another set of problems, especially priority inversions on RT where
hb-&gt;lock becomes a rt_mutex itself.

The rt_mutex::wait_lock is the most obvious protection for keeping the
futex user space value and the kernel internal pi_state in sync.

Rework and document the locking so rt_mutex::wait_lock is held across all
operations which modify the user space value and the pi state.

This allows invoking rt_mutex_unlock() (including deboost) without holding
hb-&gt;lock as a next step.

Nothing yet relies on the new locking rules.

Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: juri.lelli@arm.com
Cc: bigeasy@linutronix.de
Cc: xlpang@redhat.com
Cc: rostedt@goodmis.org
Cc: mathieu.desnoyers@efficios.com
Cc: jdesfossez@efficios.com
Cc: dvhart@infradead.org
Cc: bristot@redhat.com
Link: http://lkml.kernel.org/r/20170322104151.751993333@infradead.org
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
[Lee: Back-ported in support of a previous futex back-port attempt]
Signed-off-by: Lee Jones &lt;lee.jones@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
Signed-off-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>futex: fix spin_lock() / spin_unlock_irq() imbalance</title>
<updated>2021-03-11T12:46:35Z</updated>
<author>
<name>Thomas Schoebel-Theuer</name>
<email>tst@1und1.de</email>
</author>
<published>2021-03-07T07:23:22Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ad83307d1e625704cbf8e88de2c66dc8b175899e'/>
<id>urn:sha1:ad83307d1e625704cbf8e88de2c66dc8b175899e</id>
<content type='text'>
This patch and problem analysis are specific to 4.4 LTS, due to incomplete
backporting of other fixes. Later LTS series have different backports.

The following is obviously incorrect:

static int wake_futex_pi(u32 __user *uaddr, u32 uval, struct futex_q *this,
             struct futex_hash_bucket *hb)
{
[...]
	raw_spin_lock(&amp;pi_state-&gt;pi_mutex.wait_lock);
[...]
	raw_spin_unlock_irq(&amp;pi_state-&gt;pi_mutex.wait_lock);
[...]
}
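
Why this pairing is wrong can be shown with a small user-space model. It
is only a sketch: model_irqs_enabled and the model_* functions are
hypothetical and merely track whether interrupts would be enabled. They
do not implement real locking.

```c
/* Hypothetical model state: 1 while interrupts are enabled on this CPU. */
static int model_irqs_enabled = 1;

/* Like raw_spin_lock(): the irq state is left untouched. */
static void model_lock(void) { }

/* Like raw_spin_lock_irq(): interrupts are disabled. */
static void model_lock_irq(void) { model_irqs_enabled = 0; }

/* Like raw_spin_unlock_irq(): interrupts are enabled unconditionally. */
static void model_unlock_irq(void) { model_irqs_enabled = 1; }

/* irq state observed inside the critical section and after it. */
struct model_trace {
	int inside;
	int after;
};

static struct model_trace model_run(int irqs_on_entry, int use_lock_irq)
{
	struct model_trace t;

	model_irqs_enabled = irqs_on_entry;
	if (use_lock_irq)
		model_lock_irq();
	else
		model_lock();
	t.inside = model_irqs_enabled;
	model_unlock_irq();
	t.after = model_irqs_enabled;
	return t;
}
```

In this model, model_run(1, 0) holds the lock with interrupts still
enabled, and model_run(0, 0) returns with interrupts enabled although the
caller entered with them disabled. Only the lock_irq/unlock_irq pairing
keeps interrupts off for the whole critical section.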

The 4.4-specific fix should probably go in the direction of
b4abf91047c, making everything irq-safe.

Backporting b4abf91047c to 4.4 LTS could thus be another good idea.

However, this might involve some more 4.4-specific work and
require thorough testing:

&gt; git log --oneline v4.4..b4abf91047c -- kernel/futex.c kernel/locking/rtmutex.c | wc -l
10

So this patch is just an obvious quickfix for now.

Hint: the lock order is documented in 4.9.y and later. Similar
documentation is missing in 4.4.y. Could somebody please either backport
it or, if there are differences I cannot easily see at the moment, write
a new description? Without reliable docs, inspecting the locking
correctness may become a pain.
 
Signed-off-by: Thomas Schoebel-Theuer &lt;tst@1und1.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Lee Jones &lt;lee.jones@linaro.org&gt;
Fixes: 394fc4981426 ("futex: Rework inconsistent rt_mutex/futex_q state")
Fixes: 6510e4a2d04f ("futex,rt_mutex: Provide futex specific rt_mutex API")
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>futex: fix irq self-deadlock and satisfy assertion</title>
<updated>2021-03-11T12:46:35Z</updated>
<author>
<name>Thomas Schoebel-Theuer</name>
<email>tst@1und1.de</email>
</author>
<published>2021-03-07T07:26:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d68eefc0f6050e64dc57aefc0638eac7bb441050'/>
<id>urn:sha1:d68eefc0f6050e64dc57aefc0638eac7bb441050</id>
<content type='text'>
This patch and problem analysis are specific to 4.4 LTS, due to incomplete
backporting of other fixes. Later LTS series have different backports.

Since v4.4.257, with CONFIG_PROVE_LOCKING=y, the following triggers right
after reboot on our pre-live systems, which mirror our production setup:

Mar 03 11:27:33 icpu-test-bap10 kernel: =================================
Mar 03 11:27:33 icpu-test-bap10 kernel: [ INFO: inconsistent lock state ]
Mar 03 11:27:33 icpu-test-bap10 kernel: 4.4.259-rc1-grsec+ #730 Not tainted
Mar 03 11:27:33 icpu-test-bap10 kernel: ---------------------------------
Mar 03 11:27:33 icpu-test-bap10 kernel: inconsistent {IN-HARDIRQ-W} -&gt; {HARDIRQ-ON-W} usage.
Mar 03 11:27:33 icpu-test-bap10 kernel: apache2-ssl/9310 [HC0[0]:SC0[0]:HE1:SE1] takes:
Mar 03 11:27:33 icpu-test-bap10 kernel:  (&amp;p-&gt;pi_lock){?.-.-.}, at: [&lt;ffffffff810abb68&gt;] pi_state_update_owner+0x51/0xd7
Mar 03 11:27:33 icpu-test-bap10 kernel: {IN-HARDIRQ-W} state was registered at:
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81088c4a&gt;] __lock_acquire+0x3a7/0xe4a
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81089b01&gt;] lock_acquire+0x18d/0x1bc
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff8170151c&gt;] _raw_spin_lock_irqsave+0x3e/0x50
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff810719a5&gt;] try_to_wake_up+0x2c/0x210
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81071bf3&gt;] default_wake_function+0xd/0xf
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81083588&gt;] autoremove_wake_function+0x11/0x35
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff810830b2&gt;] __wake_up_common+0x48/0x7c
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff8108311a&gt;] __wake_up+0x34/0x46
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff814c2a23&gt;] megasas_complete_int_cmd+0x31/0x33
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff814c60a0&gt;] megasas_complete_cmd+0x570/0x57b
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff814d05bc&gt;] complete_cmd_fusion+0x23e/0x33d
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff814d0768&gt;] megasas_isr_fusion+0x67/0x74
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81091ae5&gt;] handle_irq_event_percpu+0x134/0x311
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81091cf5&gt;] handle_irq_event+0x33/0x51
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff810948b9&gt;] handle_edge_irq+0xa3/0xc2
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81005f7b&gt;] handle_irq+0xf9/0x101
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81005700&gt;] do_IRQ+0x80/0xf5
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81702228&gt;] ret_from_intr+0x0/0x20
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff8100cab0&gt;] arch_cpu_idle+0xa/0xc
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81083a5a&gt;] default_idle_call+0x1e/0x20
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81083b9d&gt;] cpu_startup_entry+0x141/0x22f
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff816fb853&gt;] rest_init+0x135/0x13b
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81d5ce99&gt;] start_kernel+0x3fa/0x40a
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81d5c2af&gt;] x86_64_start_reservations+0x2a/0x2c
Mar 03 11:27:33 icpu-test-bap10 kernel:   [&lt;ffffffff81d5c3d0&gt;] x86_64_start_kernel+0x11f/0x12c
Mar 03 11:27:33 icpu-test-bap10 kernel: irq event stamp: 1457
Mar 03 11:27:33 icpu-test-bap10 kernel: hardirqs last  enabled at (1457): [&lt;ffffffff81042a69&gt;] get_user_pages_fast+0xeb/0x14f
Mar 03 11:27:33 icpu-test-bap10 kernel: hardirqs last disabled at (1456): [&lt;ffffffff810429dd&gt;] get_user_pages_fast+0x5f/0x14f
Mar 03 11:27:33 icpu-test-bap10 kernel: softirqs last  enabled at (1446): [&lt;ffffffff815e127d&gt;] release_sock+0x142/0x14d
Mar 03 11:27:33 icpu-test-bap10 kernel: softirqs last disabled at (1444): [&lt;ffffffff815e116f&gt;] release_sock+0x34/0x14d
Mar 03 11:27:33 icpu-test-bap10 kernel:
                                        other info that might help us debug this:
Mar 03 11:27:33 icpu-test-bap10 kernel:  Possible unsafe locking scenario:
Mar 03 11:27:33 icpu-test-bap10 kernel:        CPU0
Mar 03 11:27:33 icpu-test-bap10 kernel:        ----
Mar 03 11:27:33 icpu-test-bap10 kernel:   lock(&amp;p-&gt;pi_lock);
Mar 03 11:27:33 icpu-test-bap10 kernel:   &lt;Interrupt&gt;
Mar 03 11:27:33 icpu-test-bap10 kernel:     lock(&amp;p-&gt;pi_lock);
Mar 03 11:27:33 icpu-test-bap10 kernel:
                                         *** DEADLOCK ***
Mar 03 11:27:33 icpu-test-bap10 kernel: 2 locks held by apache2-ssl/9310:
Mar 03 11:27:33 icpu-test-bap10 kernel:  #0:  (&amp;(&amp;(__futex_data.queues)[i].lock)-&gt;rlock){+.+...}, at: [&lt;ffffffff810ae4e6&gt;] do
Mar 03 11:27:33 icpu-test-bap10 kernel:  #1:  (&amp;lock-&gt;wait_lock){+.+...}, at: [&lt;ffffffff810ae53a&gt;] do_futex+0x639/0x809
Mar 03 11:27:33 icpu-test-bap10 kernel:
                                        stack backtrace:
Mar 03 11:27:33 icpu-test-bap10 kernel: CPU: 13 PID: 9310 UID: 99 Comm: apache2-ssl Not tainted 4.4.259-rc1-grsec+ #730
Mar 03 11:27:33 icpu-test-bap10 kernel: Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.11.0 11/02/2019
Mar 03 11:27:33 icpu-test-bap10 kernel:  0000000000000000 ffff883fb79bfc00 ffffffff816f8fc2 ffff883ffa66d300
Mar 03 11:27:33 icpu-test-bap10 kernel:  ffffffff8eaa71f0 ffff883fb79bfc50 ffffffff81088484 0000000000000000
Mar 03 11:27:33 icpu-test-bap10 kernel:  0000000000000001 0000000000000001 0000000000000002 ffff883ffa66db58
Mar 03 11:27:33 icpu-test-bap10 kernel: Call Trace:
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff816f8fc2&gt;] dump_stack+0x94/0xca
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81088484&gt;] print_usage_bug+0x1bc/0x1d1
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81087d76&gt;] ? check_usage_forwards+0x98/0x98
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810885a5&gt;] mark_lock+0x10c/0x203
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81088cb9&gt;] __lock_acquire+0x416/0xe4a
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810abb68&gt;] ? pi_state_update_owner+0x51/0xd7
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81089b01&gt;] lock_acquire+0x18d/0x1bc
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81089b01&gt;] ? lock_acquire+0x18d/0x1bc
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810abb68&gt;] ? pi_state_update_owner+0x51/0xd7
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81700d12&gt;] _raw_spin_lock+0x2a/0x39
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810abb68&gt;] ? pi_state_update_owner+0x51/0xd7
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810abb68&gt;] pi_state_update_owner+0x51/0xd7
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810ae5af&gt;] do_futex+0x6ae/0x809
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff810ae83d&gt;] SyS_futex+0x133/0x143
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff8100158a&gt;] ? syscall_trace_enter_phase2+0x1a2/0x1bb
Mar 03 11:27:33 icpu-test-bap10 kernel:  [&lt;ffffffff81701848&gt;] tracesys_phase2+0x90/0x95

Bisecting points to 47e452fcf2f in the above specific scenario using
apache-ssl, but apparently the missing *_irq() was introduced in
34c8e1c2c02.

However, just reverting the old _irq() variants to a state similar to
the one before 34c8e1c2c02, or using _irqsave() / _irqrestore() as some
other backports do in various places, would not really help.

The fundamental problem is the following violation of the assertion
lockdep_assert_held(&amp;pi_state-&gt;pi_mutex.wait_lock) in pi_state_update_owner():

Mar 03 12:50:03 icpu-test-bap10 kernel: ------------[ cut here ]------------
Mar 03 12:50:03 icpu-test-bap10 kernel: WARNING: CPU: 37 PID: 8488 at kernel/futex.c:844 pi_state_update_owner+0x3d/0xd7()
Mar 03 12:50:03 icpu-test-bap10 kernel: Modules linked in: xt_time xt_connlimit xt_connmark xt_NFLOG xt_limit xt_hashlimit veth ip_set_bitmap_port xt_DSCP xt_multiport ip_set_hash_ip xt_owner xt_set ip_set_hash_net xt_state xt_conntrack nf_conntrack_ftp mars lz4_decompress lz4_compress ipmi_devintf x86_pkg_temp_thermal coretemp crct10dif_pclmul crc32_pclmul hed ipmi_si ipmi_msghandler processor crc32c_intel ehci_pci ehci_hcd usbcore i40e usb_common
Mar 03 12:50:03 icpu-test-bap10 kernel: CPU: 37 PID: 8488 UID: 99 Comm: apache2-ssl Not tainted 4.4.259-rc1-grsec+ #737
Mar 03 12:50:03 icpu-test-bap10 kernel: Hardware name: Dell Inc. PowerEdge R630/02C2CP, BIOS 2.11.0 11/02/2019
Mar 03 12:50:03 icpu-test-bap10 kernel:  0000000000000000 ffff883f863f7c70 ffffffff816f9002 0000000000000000
Mar 03 12:50:03 icpu-test-bap10 kernel:  0000000000000009 ffff883f863f7ca8 ffffffff8104cda2 ffffffff810abac7
Mar 03 12:50:03 icpu-test-bap10 kernel:  ffff883ffbfe5e80 0000000000000000 ffff883f82ed4bc0 00007fc01c9bf000
Mar 03 12:50:03 icpu-test-bap10 kernel: Call Trace:
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff816f9002&gt;] dump_stack+0x94/0xca
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff8104cda2&gt;] warn_slowpath_common+0x94/0xad
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810abac7&gt;] ? pi_state_update_owner+0x3d/0xd7
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff8104ce5f&gt;] warn_slowpath_null+0x15/0x17
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810abac7&gt;] pi_state_update_owner+0x3d/0xd7
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810abea8&gt;] free_pi_state+0x2d/0x73
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810abf0b&gt;] unqueue_me_pi+0x1d/0x31
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810ad735&gt;] futex_lock_pi+0x27a/0x2e8
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff81088bca&gt;] ? __lock_acquire+0x327/0xe4a
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810ae6a9&gt;] do_futex+0x784/0x809
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810cfa9a&gt;] ? seccomp_phase1+0xde/0x1e7
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810a4503&gt;] ? current_kernel_time64+0xb/0x31
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810d23c3&gt;] ? current_kernel_time+0xb/0xf
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff810ae861&gt;] SyS_futex+0x133/0x143
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff8100158a&gt;] ? syscall_trace_enter_phase2+0x1a2/0x1bb
Mar 03 12:50:03 icpu-test-bap10 kernel:  [&lt;ffffffff81701888&gt;] tracesys_phase2+0x90/0x95
Mar 03 12:50:03 icpu-test-bap10 kernel: ---[ end trace 968f95a458dea951 ]---

In order to both (1) prevent the self-deadlock and (2) satisfy the assertion
in pi_state_update_owner(), some locking with irqs disabled is needed,
at least in this specific call stack.

Interestingly, similar locking existed just before f08a4af5ccb.

This is just a quick hotfix, resurrecting some previous
locks at the old places, but now using -&gt;wait_lock in place
of the previous -&gt;pi_lock (which was in place before
f08a4af5ccb).

The -&gt;pi_lock is now also taken by the new code
which was introduced in 34c8e1c2c02.

With this patch applied, neither of the above splats
triggers any longer on my pre-live machines.

Without this patch, I cannot ensure stable production at
1&amp;1 Ionos.

Hint for further work: I have not yet tested other call paths,
since I am under time pressure for security reasons.

Hint for further hardening of 4.4.y and probably some more LTS series:
more systematic testing with CONFIG_PROVE_LOCKING (and probably some
more options) would be worthwhile in order to make the 4.4 LTS series
really "stable" again.

Signed-off-by: Thomas Schoebel-Theuer &lt;tst@1und1.de&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Lee Jones &lt;lee.jones@linaro.org&gt;
Fixes: f08a4af5ccb2 ("futex: Use pi_state_update_owner() in put_pi_state()")
Fixes: 34c8e1c2c025 ("futex: Provide and use pi_state_update_owner()")
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>futex: Ensure the correct return value from futex_lock_pi()</title>
<updated>2021-03-07T10:24:19Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2021-01-20T15:00:24Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=074e7d5157830ebd69e4abceba938367c6933ce9'/>
<id>urn:sha1:074e7d5157830ebd69e4abceba938367c6933ce9</id>
<content type='text'>
commit 12bb3f7f1b03d5913b3f9d4236a488aa7774dfe9 upstream.

If futex_lock_pi() was aborted by a signal or a timeout and the task
returned without acquiring the rtmutex, but is the designated owner of
the futex due to a concurrent futex_unlock_pi(), fixup_owner() is invoked to
establish consistent state. In that case it invokes fixup_pi_state_owner(),
which in turn tries to acquire the rtmutex again. If that succeeds, it
does not propagate this success to fixup_owner(), and futex_lock_pi()
returns -EINTR or -ETIMEDOUT despite having the futex locked.

Return success from fixup_pi_state_owner() in all cases where the current
task owns the rtmutex, and therefore the futex, and propagate it correctly
through fixup_owner(). Fix up the other call site, which does not expect a
positive return value.
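
The intended propagation can be sketched as a small user-space function.
This is an illustration under assumptions, not the kernel code:
demo_merge_result() and the DEMO_* constants are hypothetical, and the
fixup result is assumed to be 1 when the lock was acquired, 0 when
nothing had to be done, and a negative errno on failure.

```c
/* Hypothetical stand-ins for the kernel errno values (Linux numbering). */
enum { DEMO_EINTR = 4, DEMO_EFAULT = 14, DEMO_ETIMEDOUT = 110 };

/*
 * Combine the result of the aborted wait (ret) with the result of the
 * fixup (res), using the encoding assumed above.
 */
static int demo_merge_result(int ret, int res)
{
	if (res == 1)
		return 0;	/* lock is held: drop -EINTR or -ETIMEDOUT */
	if (res != 0)
		return res;	/* fixup failed: propagate its error */
	return ret;		/* nothing fixed up: keep the original result */
}
```

For example, demo_merge_result(-DEMO_EINTR, 1) yields 0, so in this model
a task that ends up owning the futex no longer sees -EINTR.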

Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
[Sharan: Backported patch for kernel 4.4.y. Also folded in is a part
 of the cleanup patch d7c5ed73b19c ("futex: Remove needless goto's")]
Signed-off-by: Sharan Turlapati &lt;sturlapati@vmware.com&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>futex: Fix OWNER_DEAD fixup</title>
<updated>2021-03-03T15:44:24Z</updated>
<author>
<name>Peter Zijlstra</name>
<email>peterz@infradead.org</email>
</author>
<published>2018-01-22T10:39:47Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=e68489bc827dbb9ae28f3e082b147d303599151b'/>
<id>urn:sha1:e68489bc827dbb9ae28f3e082b147d303599151b</id>
<content type='text'>
commit a97cb0e7b3f4c6297fd857055ae8e895f402f501 upstream.

Both Geert and DaveJ reported that the recent futex commit:

  c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")

introduced a problem with setting OWNER_DEAD. We set the bit on an
uninitialized variable and then entirely optimize it away as a
dead-store.

Move the setting of the bit to where it is more useful.

Reported-by: Geert Uytterhoeven &lt;geert@linux-m68k.org&gt;
Reported-by: Dave Jones &lt;davej@codemonkey.org.uk&gt;
Signed-off-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Cc: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Cc: Paul E. McKenney &lt;paulmck@us.ibm.com&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Fixes: c1e2f0eaf015 ("futex: Avoid violating the 10th rule of futex")
Link: http://lkml.kernel.org/r/20180122103947.GD2228@hirez.programming.kicks-ass.net
Signed-off-by: Ingo Molnar &lt;mingo@kernel.org&gt;
Signed-off-by: Zheng Yejian &lt;zhengyejian1@huawei.com&gt;
Reviewed-by: Lee Jones &lt;lee.jones@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>futex: Handle faults correctly for PI futexes</title>
<updated>2021-02-10T08:07:26Z</updated>
<author>
<name>Lee Jones</name>
<email>lee.jones@linaro.org</email>
</author>
<published>2021-02-04T17:29:03Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=46358277b2da868763517f79aa0ac25ce78c4f68'/>
<id>urn:sha1:46358277b2da868763517f79aa0ac25ce78c4f68</id>
<content type='text'>
From: Thomas Gleixner &lt;tglx@linutronix.de&gt;

fixup_pi_state_owner() tries to ensure that the state of the rtmutex,
pi_state and the user space value related to the PI futex are consistent
before returning to user space. If the user space value update
faults and the fault cannot be resolved by faulting the page in via
fault_in_user_writeable(), the function returns -EFAULT and leaves
the rtmutex and pi_state owner state inconsistent.

A subsequent futex_unlock_pi() operates on the inconsistent pi_state and
releases the rtmutex despite not owning it, which can corrupt the RB tree of
the rtmutex and cause a subsequent kernel stack use after free.

It was suggested to loop forever in fixup_pi_state_owner() if the fault
cannot be resolved, but that results in runaway tasks, which is especially
undesirable when the problem happens due to a programming error and not due
to malice.

As the user space value cannot be fixed up, the proper solution is to make
the rtmutex and the pi_state consistent so both have the same owner. This
leaves the user space value out of sync. Any subsequent operation on the
futex will fail because the 10th rule of PI futexes (pi_state owner and
user space value are consistent) has been violated.

As a consequence this removes the inept attempts of 'fixing' the situation
in case that the current task owns the rtmutex when returning with an
unresolvable fault by unlocking the rtmutex which left pi_state::owner and
rtmutex::owner out of sync in a different and only slightly less dangerous
way.

Fixes: 1b7558e457ed ("futexes: fix fault handling in futex_lock_pi")
Reported-by: gzobqq@gmail.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Lee Jones &lt;lee.jones@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>futex: Simplify fixup_pi_state_owner()</title>
<updated>2021-02-10T08:07:26Z</updated>
<author>
<name>Lee Jones</name>
<email>lee.jones@linaro.org</email>
</author>
<published>2021-02-04T17:29:02Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=47e452fcf2f50645247aa0240ae26e0d14d6d3ad'/>
<id>urn:sha1:47e452fcf2f50645247aa0240ae26e0d14d6d3ad</id>
<content type='text'>
From: Thomas Gleixner &lt;tglx@linutronix.de&gt;

[ Upstream commit f2dac39d93987f7de1e20b3988c8685523247ae2 ]

Too many gotos already and an upcoming fix would make it even more
unreadable.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Lee Jones &lt;lee.jones@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
<entry>
<title>futex: Use pi_state_update_owner() in put_pi_state()</title>
<updated>2021-02-10T08:07:26Z</updated>
<author>
<name>Lee Jones</name>
<email>lee.jones@linaro.org</email>
</author>
<published>2021-02-04T17:29:01Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=f08a4af5ccb2a907a225dfd3d882656e5473da4a'/>
<id>urn:sha1:f08a4af5ccb2a907a225dfd3d882656e5473da4a</id>
<content type='text'>
From: Thomas Gleixner &lt;tglx@linutronix.de&gt;

[ Upstream commit 6ccc84f917d33312eb2846bd7b567639f585ad6d ]

No point in open coding it. This way it gains the extra sanity checks.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Acked-by: Peter Zijlstra (Intel) &lt;peterz@infradead.org&gt;
Cc: stable@vger.kernel.org
Signed-off-by: Lee Jones &lt;lee.jones@linaro.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;
</content>
</entry>
</feed>
