<feed xmlns='http://www.w3.org/2005/Atom'>
<title>user/sven/linux.git/kernel/futex.c, branch v3.4.60</title>
<subtitle>Linux Kernel
</subtitle>
<id>https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.4.60</id>
<link rel='self' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/atom?h=v3.4.60'/>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/'/>
<updated>2013-08-20T15:26:28Z</updated>
<entry>
<title>futex: Take hugepages into account when generating futex_key</title>
<updated>2013-08-20T15:26:28Z</updated>
<author>
<name>Zhang Yi</name>
<email>wetpzy@gmail.com</email>
</author>
<published>2013-06-25T13:19:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=a42efb79d54d9a13c8f68df122c832bca08b74ae'/>
<id>urn:sha1:a42efb79d54d9a13c8f68df122c832bca08b74ae</id>
<content type='text'>
commit 13d60f4b6ab5b702dc8d2ee20999f98a93728aec upstream.

The futex_keys of process shared futexes are generated from the page
offset, the mapping host and the mapping index of the futex user space
address. This should result in a unique identifier for each futex.

However, this is not true when futexes are located in different
subpages of a hugepage, because the mapping index for all those
futexes evaluates to the index of the base page of the hugetlbfs
mapping. So a futex at offset 0 of the hugepage mapping and another
one at offset PAGE_SIZE of the same hugepage mapping have identical
futex_keys. This happens because the futex code blindly uses
page-&gt;index.

Steps to reproduce the bug:

1. Map a file from hugetlbfs. Initialize pthread_mutex1 at offset 0
   and pthread_mutex2 at offset PAGE_SIZE of the hugetlbfs
   mapping.

   The mutexes must be initialized as PTHREAD_PROCESS_SHARED because
   PTHREAD_PROCESS_PRIVATE mutexes are not affected by this issue as
   their keys solely depend on the user space address.

2. Lock mutex1 and mutex2

3. Create thread1 and in the thread function lock mutex1, which
   results in thread1 blocking on the locked mutex1.

4. Create thread2 and in the thread function lock mutex2, which
   results in thread2 blocking on the locked mutex2.

5. Unlock mutex2. Despite the fact that mutex2 got unlocked, thread2
   still blocks on mutex2 because the futex_key points to mutex1.

To solve this issue we need to take the normal page index of the page
which contains the futex into account, if the futex is in a hugetlbfs
mapping. In other words, we calculate the normal page mapping index of
the subpage in the hugetlbfs mapping.

Mappings which are not based on hugetlbfs are not affected and still
use page-&gt;index.
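
The collision can be sketched in Python rather than kernel C. This is a
toy model only: it assumes 4 KiB base pages and 2 MiB huge pages, treats
a shared futex key as a (mapping, index, offset) tuple, and the function
names and index units are illustrative assumptions, not the kernel's code.

```python
PAGE_SIZE = 4096                    # assumed base page size
HUGE_PAGE_SIZE = 2 * 1024 * 1024    # assumed huge page size
PAGES_PER_HUGE = HUGE_PAGE_SIZE // PAGE_SIZE


def key_from_head_index(mapping, file_offset):
    """Old behaviour: every subpage reports the index of the base page."""
    head_index = (file_offset // HUGE_PAGE_SIZE) * PAGES_PER_HUGE
    return (mapping, head_index, file_offset % PAGE_SIZE)


def key_from_subpage_index(mapping, file_offset):
    """Fixed behaviour: use the normal page index of the subpage itself."""
    return (mapping, file_offset // PAGE_SIZE, file_offset % PAGE_SIZE)


mutex1_offset = 0
mutex2_offset = PAGE_SIZE

# Both mutexes collapse onto one key when the head page index is used.
collides = (key_from_head_index("hugetlbfs-file", mutex1_offset)
            == key_from_head_index("hugetlbfs-file", mutex2_offset))

# With the subpage index the two keys differ again.
distinct = (key_from_subpage_index("hugetlbfs-file", mutex1_offset)
            != key_from_subpage_index("hugetlbfs-file", mutex2_offset))

print("head index collides:", collides)
print("subpage index distinct:", distinct)
```

In this model the only difference between the two keys is the index
term, which is why the fix is confined to how that index is computed.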

Thanks to Mel Gorman who provided a patch for adding proper evaluation
functions to the hugetlbfs code to avoid exposing hugetlbfs specific
details to the futex code.

[ tglx: Massaged changelog ]

Signed-off-by: Zhang Yi &lt;zhang.yi20@zte.com.cn&gt;
Reviewed-by: Jiang Biao &lt;jiang.biao2@zte.com.cn&gt;
Tested-by: Ma Chenggong &lt;ma.chenggong@zte.com.cn&gt;
Reviewed-by: 'Mel Gorman' &lt;mgorman@suse.de&gt;
Acked-by: 'Darren Hart' &lt;dvhart@linux.intel.com&gt;
Cc: 'Peter Zijlstra' &lt;peterz@infradead.org&gt;
Link: http://lkml.kernel.org/r/000101ce71a6%24a83c5880%24f8b50980%24@com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Mike Galbraith &lt;mgalbraith@suse.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;


</content>
</entry>
<entry>
<title>futex: Revert "futex: Mark get_robust_list as deprecated"</title>
<updated>2013-02-28T14:59:01Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2013-02-18T08:52:08Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=803437207a3e6fef7791adeb7a0c2adb4b012459'/>
<id>urn:sha1:803437207a3e6fef7791adeb7a0c2adb4b012459</id>
<content type='text'>
commit fe2b05f7ca9f906be61dced5489f63b8b4d7c770 upstream.

This reverts commit ec0c4274e33c0373e476b73e01995c53128f1257.

get_robust_list() is in use and a removal would break existing user
space. With the permission checks in place it's no longer a security
hole. Remove the deprecation warnings.

Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Cyrill Gorcunov &lt;gorcunov@openvz.org&gt;
Cc: Richard Weinberger &lt;richard@nod.at&gt;
Cc: akpm@linux-foundation.org
Cc: paul.gortmaker@windriver.com
Cc: davej@redhat.com
Cc: keescook@chromium.org
Cc: ebiederm@xmission.com
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>futex: avoid wake_futex() for a PI futex_q</title>
<updated>2012-12-03T19:47:07Z</updated>
<author>
<name>Darren Hart</name>
<email>dvhart@linux.intel.com</email>
</author>
<published>2012-11-27T00:29:56Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=fceca5e72e787dd0a8ea29e22a874e363389356c'/>
<id>urn:sha1:fceca5e72e787dd0a8ea29e22a874e363389356c</id>
<content type='text'>
commit aa10990e028cac3d5e255711fb9fb47e00700e35 upstream.

Dave Jones reported a bug with futex_lock_pi() that his trinity test
exposed.  Sometime between queue_me() and taking the q.lock_ptr, the
lock_ptr became NULL, resulting in a crash.

While futex_wake() is careful to not call wake_futex() on futex_q's with
a pi_state or an rt_waiter (which are either waiting for a
futex_unlock_pi() or a PI futex_requeue()), futex_wake_op() and
futex_requeue() do not perform the same test.

Update futex_wake_op() and futex_requeue() to test for q.pi_state and
q.rt_waiter and abort with -EINVAL if detected.  To ensure any future
breakage is caught, add a WARN() to wake_futex() if the same condition
is true.
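
A minimal sketch of the guard described above, written in Python as a
toy model. The FutexQ class and the wake_matching() helper are
hypothetical stand-ins and carry only the two fields the test looks at;
the real kernel structures and wake paths are considerably more involved.

```python
import errno


class FutexQ:
    """Toy stand-in for a queued waiter; not the kernel's futex_q."""

    def __init__(self, pi_state=None, rt_waiter=None):
        self.pi_state = pi_state
        self.rt_waiter = rt_waiter
        self.woken = False


def wake_matching(queue, nr_wake):
    """Wake up to nr_wake waiters, refusing to touch PI waiters."""
    woken = 0
    for q in queue[:nr_wake]:
        if q.pi_state is not None or q.rt_waiter is not None:
            # A PI waiter has to be released by the PI unlock or PI
            # requeue path, so abort instead of waking it here.
            return -errno.EINVAL
        q.woken = True
        woken += 1
    return woken


plain_waiters = [FutexQ(), FutexQ()]
mixed_waiters = [FutexQ(), FutexQ(pi_state=object())]

print(wake_matching(plain_waiters, 2))
print(wake_matching(mixed_waiters, 2) == -errno.EINVAL)
```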

This fix has seen 3 hours of testing with "trinity -c futex" on an
x86_64 VM with 4 CPUs.

[akpm@linux-foundation.org: tidy up the WARN()]
Signed-off-by: Darren Hart &lt;dvhart@linux.intel.com&gt;
Reported-by: Dave Jones &lt;davej@redhat.com&gt;
Cc: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Cc: Peter Zijlstra &lt;peterz@infradead.org&gt;
Cc: Ingo Molnar &lt;mingo@elte.hu&gt;
Cc: John Kacur &lt;jkacur@redhat.com&gt;
Signed-off-by: Andrew Morton &lt;akpm@linux-foundation.org&gt;
Signed-off-by: Linus Torvalds &lt;torvalds@linux-foundation.org&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>futex: Handle futex_pi OWNER_DIED take over correctly</title>
<updated>2012-11-17T21:16:22Z</updated>
<author>
<name>Thomas Gleixner</name>
<email>tglx@linutronix.de</email>
</author>
<published>2012-10-23T20:29:38Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=c4cbedfda2227df82126c9dd5e7593565bf45d21'/>
<id>urn:sha1:c4cbedfda2227df82126c9dd5e7593565bf45d21</id>
<content type='text'>
commit 59fa6245192159ab5e1e17b8e31f15afa9cff4bf upstream.

Siddhesh analyzed a failure in the take over of pi futexes in case the
owner died and provided a workaround.
See: http://sourceware.org/bugzilla/show_bug.cgi?id=14076

The detailed problem analysis shows:

Futex F is initialized with PTHREAD_PRIO_INHERIT and
PTHREAD_MUTEX_ROBUST_NP attributes.

T1 lock_futex_pi(F);

T2 lock_futex_pi(F);
   --&gt; T2 blocks on the futex and creates pi_state which is associated
       to T1.

T1 exits
   --&gt; exit_robust_list() runs
       --&gt; Futex F userspace value TID field is set to 0 and
           FUTEX_OWNER_DIED bit is set.

T3 lock_futex_pi(F);
   --&gt; Succeeds due to the check for F's userspace TID field == 0
   --&gt; Claims ownership of the futex and sets its own TID into the
       userspace TID field of futex F
   --&gt; returns to user space

T1 --&gt; exit_pi_state_list()
       --&gt; Transfers pi_state to waiter T2 and wakes T2 via
           rt_mutex_unlock(&amp;pi_state-&gt;mutex)

T2 --&gt; acquires pi_state-&gt;mutex and gains real ownership of the
       pi_state
   --&gt; Claims ownership of the futex and sets its own TID into the
       userspace TID field of futex F
   --&gt; returns to user space

T3 --&gt; observes inconsistent state

This problem is independent of UP/SMP, preemptible/non preemptible
kernels, or process shared vs. private. The only difference is that
certain configurations are more likely to expose it.

So, as Siddhesh correctly analyzed, the following check in
futex_lock_pi_atomic() is the culprit:

	if (unlikely(ownerdied || !(curval &amp; FUTEX_TID_MASK))) {

We check the userspace value for a TID value of 0 and take over the
futex unconditionally if that's true.

AFAICT this check is there as it is correct for a different corner
case of futexes: the WAITERS bit became stale.

Now the proposed change

-	if (unlikely(ownerdied || !(curval &amp; FUTEX_TID_MASK))) {
+       if (unlikely(ownerdied ||
+                       !(curval &amp; (FUTEX_TID_MASK | FUTEX_WAITERS)))) {

solves the problem, but it's not obvious why and it wrecks the
"stale WAITERS bit" case.

What happens is that, because the WAITERS bit is set (T2 is blocked
on that futex), T3 is forced to go through lookup_pi_state(), which
in the above case returns an existing pi_state and therefore forces T3
to legitimately fight with T2 over the ownership of the pi_state (via
pi_state-&gt;mutex). Problem solved!

Though that does not work for the "WAITERS bit is stale" problem
because if lookup_pi_state() does not find an existing pi_state it
returns -ESRCH (due to TID == 0) which causes futex_lock_pi() to
return -ESRCH to user space because the OWNER_DIED bit is not set.

Now there is a different solution to that problem. Do not look at the
user space value at all and enforce a lookup of possibly available
pi_state. If pi_state can be found, then the new incoming locker T3
blocks on that pi_state and legitimately races with T2 to acquire the
rt_mutex and the pi_state and therefore the proper ownership of the
user space futex.

lookup_pi_state() has the correct order of checks. It first tries to
find a pi_state associated with the user space futex and only if that
fails it checks for futex TID value = 0. If no pi_state is available
nothing can create new state at that point because this happens with
the hash bucket lock held.
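
The order of checks can be illustrated with a toy model in Python. The
decide_lock_path() function, its return strings and the bucket layout
are assumptions made for illustration; only the ordering (kernel state
first, user space TID second) is taken from the description above.

```python
FUTEX_WAITERS = 0x80000000
FUTEX_OWNER_DIED = 0x40000000
TID_SPAN = 0x40000000  # the TID lives in the low 30 bits of the futex word


def decide_lock_path(bucket, key, uval):
    """Toy decision: 'block', 'take_over' or 'attach_to_owner'."""
    # Step 1: kernel state first. An existing pi_state for this key wins,
    # whatever the user space value claims.
    for queued_key, pi_state in bucket:
        if queued_key == key and pi_state is not None:
            return "block"
    # Step 2: only now look at the TID field of the user space value.
    tid = uval % TID_SPAN  # same result as masking with FUTEX_TID_MASK
    if tid == 0:
        return "take_over"
    return "attach_to_owner"


# T3 arrives after exit_robust_list() cleared the TID, while T2 still waits.
uval_after_exit = FUTEX_WAITERS + FUTEX_OWNER_DIED
bucket_with_t2 = [("F", "pi_state of F")]
print(decide_lock_path(bucket_with_t2, "F", uval_after_exit))

# Stale WAITERS bit: no pi_state is queued and the TID field is 0.
print(decide_lock_path([], "F", FUTEX_WAITERS))
```

In the model T3 blocks as long as T2's pi_state is queued, and a take
over only happens when no kernel state exists for the futex.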

So the above scenario changes to:

T1 lock_futex_pi(F);

T2 lock_futex_pi(F);
   --&gt; T2 blocks on the futex and creates pi_state which is associated
       to T1.

T1 exits
   --&gt; exit_robust_list() runs
       --&gt; Futex F userspace value TID field is set to 0 and
           FUTEX_OWNER_DIED bit is set.

T3 lock_futex_pi(F);
   --&gt; Finds pi_state and blocks on pi_state-&gt;rt_mutex

T1 --&gt; exit_pi_state_list()
       --&gt; Transfers pi_state to waiter T2 and wakes it via
           rt_mutex_unlock(&amp;pi_state-&gt;mutex)

T2 --&gt; acquires pi_state-&gt;mutex and gains ownership of the pi_state
   --&gt; Claims ownership of the futex and sets its own TID into the
       userspace TID field of futex F
   --&gt; returns to user space

This covers all gazillion points on which T3 might come in between
T1's exit_robust_list() clearing the TID field and T2 fixing it up. It
also solves the "WAITERS bit stale" problem by forcing the take over.

Another benefit of changing the code this way is that it makes it less
dependent on untrusted user space values and therefore minimizes the
possible wreckage which might be inflicted.

As usual after staring for too long at the futex code my brain hurts
so much that I really want to ditch that whole optimization of
avoiding the syscall for the non contended case for PI futexes and rip
out the maze of corner case handling code. Unfortunately we can't as
user space relies on that existing behaviour, but at least thinking
about it helps me to preserve my mental sanity. Maybe we should
nevertheless :)

Reported-and-tested-by: Siddhesh Poyarekar &lt;siddhesh.poyarekar@gmail.com&gt;
Link: http://lkml.kernel.org/r/alpine.LFD.2.02.1210232138540.2756@ionos
Acked-by: Darren Hart &lt;dvhart@linux.intel.com&gt;
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>futex: Forbid uaddr == uaddr2 in futex_wait_requeue_pi()</title>
<updated>2012-08-09T15:31:53Z</updated>
<author>
<name>Darren Hart</name>
<email>dvhart@linux.intel.com</email>
</author>
<published>2012-07-20T18:53:31Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=b3f9576e98e0dfb4f9be87618da4c5f6e8640ee0'/>
<id>urn:sha1:b3f9576e98e0dfb4f9be87618da4c5f6e8640ee0</id>
<content type='text'>
commit 6f7b0a2a5c0fb03be7c25bd1745baa50582348ef upstream.

If uaddr == uaddr2, then we have broken the rule of only requeueing
from a non-pi futex to a pi futex with this call. If we attempt this,
as the trinity test suite manages to do, we miss early wakeups as
q.key is equal to key2 (because they are the same uaddr). We will then
attempt to dereference the pi_mutex (which would exist had the futex_q
been properly requeued to a pi futex) and trigger a NULL pointer
dereference.
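
As a rough sketch, the new rule amounts to an early argument check. The
Python helper below is hypothetical and only models the comparison of
the two user space addresses; returning -EINVAL is an assumption about
the error code, which this changelog does not state.

```python
import errno


def check_requeue_pi_args(uaddr, uaddr2):
    """Reject a wait-requeue-pi call whose source and target are the same."""
    if uaddr == uaddr2:
        # Requeueing a futex onto itself can never move the waiter from a
        # non-pi futex to a pi futex, so refuse the call up front.
        return -errno.EINVAL
    return 0


print(check_requeue_pi_args(0x1000, 0x2000))
print(check_requeue_pi_args(0x1000, 0x1000) == -errno.EINVAL)
```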

Signed-off-by: Darren Hart &lt;dvhart@linux.intel.com&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Link: http://lkml.kernel.org/r/ad82bfe7f7d130247fbe2b5b4275654807774227.1342809673.git.dvhart@linux.intel.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>futex: Fix bug in WARN_ON for NULL q.pi_state</title>
<updated>2012-08-09T15:31:53Z</updated>
<author>
<name>Darren Hart</name>
<email>dvhart@linux.intel.com</email>
</author>
<published>2012-07-20T18:53:30Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=47b6ff731a701d898c732e2f2dd67c5178fc0960'/>
<id>urn:sha1:47b6ff731a701d898c732e2f2dd67c5178fc0960</id>
<content type='text'>
commit f27071cb7fe3e1d37a9dbe6c0dfc5395cd40fa43 upstream.

The WARN_ON in futex_wait_requeue_pi() for a NULL q.pi_state was testing
the address (&amp;q.pi_state) of the pointer instead of the value
(q.pi_state) of the pointer. Correct it accordingly.

Signed-off-by: Darren Hart &lt;dvhart@linux.intel.com&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Link: http://lkml.kernel.org/r/1c85d97f6e5f79ec389a4ead3e367363c74bd09a.1342809673.git.dvhart@linux.intel.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>futex: Test for pi_mutex on fault in futex_wait_requeue_pi()</title>
<updated>2012-08-09T15:31:53Z</updated>
<author>
<name>Darren Hart</name>
<email>dvhart@linux.intel.com</email>
</author>
<published>2012-07-20T18:53:29Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=d48c1ba2979634ecbbe344a8bd65035f32777f1b'/>
<id>urn:sha1:d48c1ba2979634ecbbe344a8bd65035f32777f1b</id>
<content type='text'>
commit b6070a8d9853eda010a549fa9a09eb8d7269b929 upstream.

If fixup_pi_state_owner() faults, pi_mutex may be NULL. Test
for pi_mutex != NULL before testing the owner against current
and possibly unlocking it.

Signed-off-by: Darren Hart &lt;dvhart@linux.intel.com&gt;
Cc: Dave Jones &lt;davej@redhat.com&gt;
Cc: Dan Carpenter &lt;dan.carpenter@oracle.com&gt;
Link: http://lkml.kernel.org/r/dc59890338fc413606f04e5c5b131530734dae3d.1342809673.git.dvhart@linux.intel.com
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Greg Kroah-Hartman &lt;gregkh@linuxfoundation.org&gt;

</content>
</entry>
<entry>
<title>futex: Mark get_robust_list as deprecated</title>
<updated>2012-03-29T09:37:17Z</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2012-03-23T19:08:55Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=ec0c4274e33c0373e476b73e01995c53128f1257'/>
<id>urn:sha1:ec0c4274e33c0373e476b73e01995c53128f1257</id>
<content type='text'>
Notify get_robust_list users that the syscall is going away.

Suggested-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Randy Dunlap &lt;rdunlap@xenotime.net&gt;
Cc: Darren Hart &lt;dvhart@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Jiri Kosina &lt;jkosina@suse.cz&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Serge E. Hallyn &lt;serge.hallyn@canonical.com&gt;
Cc: kernel-hardening@lists.openwall.com
Cc: spender@grsecurity.net
Link: http://lkml.kernel.org/r/20120323190855.GA27213@www.outflux.net
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>futex: Do not leak robust list to unprivileged process</title>
<updated>2012-03-29T09:37:17Z</updated>
<author>
<name>Kees Cook</name>
<email>keescook@chromium.org</email>
</author>
<published>2012-03-19T23:12:53Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=bdbb776f882f5ad431aa1e694c69c1c3d6a4a5b8'/>
<id>urn:sha1:bdbb776f882f5ad431aa1e694c69c1c3d6a4a5b8</id>
<content type='text'>
It was possible to extract the robust list head address from a setuid
process if it had used set_robust_list(), allowing an ASLR info leak. This
changes the permission checks to be the same as those used for similar
info that comes out of /proc.

Running a setuid program that uses robust futexes would have had:
  cred-&gt;euid != pcred-&gt;euid
  cred-&gt;euid == pcred-&gt;uid
so the old permissions check would allow it. I'm not aware of any setuid
programs that use robust futexes, so this is just a preventative measure.
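
The effect of the old check on a setuid target can be sketched in
Python. The Cred tuple and both functions are illustrative assumptions:
old_check_allows() follows the two comparisons quoted above, while
stricter_check_allows() is only a simplified stand-in for a /proc-style
check, whose exact form is not spelled out in this message.

```python
from collections import namedtuple

Cred = namedtuple("Cred", ["uid", "euid"])


def old_check_allows(caller, target):
    """Allowed when the caller's euid matches the target's euid or uid."""
    return caller.euid == target.euid or caller.euid == target.uid


def stricter_check_allows(caller, target):
    """Assumed simplification: the target's uid and euid must both match."""
    return caller.uid == target.uid and caller.uid == target.euid


user = Cred(uid=1000, euid=1000)
setuid_root_program = Cred(uid=1000, euid=0)  # started by uid 1000

print(old_check_allows(user, setuid_root_program))
print(stricter_check_allows(user, setuid_root_program))
```

In this model the old check passes through the uid comparison alone,
which is the case the changelog describes.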

(This patch is based on changes from grsecurity.)

Signed-off-by: Kees Cook &lt;keescook@chromium.org&gt;
Cc: Darren Hart &lt;dvhart@linux.intel.com&gt;
Cc: Peter Zijlstra &lt;a.p.zijlstra@chello.nl&gt;
Cc: Jiri Kosina &lt;jkosina@suse.cz&gt;
Cc: Eric W. Biederman &lt;ebiederm@xmission.com&gt;
Cc: David Howells &lt;dhowells@redhat.com&gt;
Cc: Serge E. Hallyn &lt;serge.hallyn@canonical.com&gt;
Cc: kernel-hardening@lists.openwall.com
Cc: spender@grsecurity.net
Link: http://lkml.kernel.org/r/20120319231253.GA20893@www.outflux.net
Signed-off-by: Thomas Gleixner &lt;tglx@linutronix.de&gt;
</content>
</entry>
<entry>
<title>Merge branch 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip</title>
<updated>2012-03-20T00:11:15Z</updated>
<author>
<name>Linus Torvalds</name>
<email>torvalds@linux-foundation.org</email>
</author>
<published>2012-03-20T00:11:15Z</published>
<link rel='alternate' type='text/html' href='https://git.stealer.net/cgit.cgi/user/sven/linux.git/commit/?id=5ed59af85077d28875a3a137b21933aaf1b4cd50'/>
<id>urn:sha1:5ed59af85077d28875a3a137b21933aaf1b4cd50</id>
<content type='text'>
Pull core/locking changes for v3.4 from Ingo Molnar

* 'core-locking-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip:
  futex: Simplify return logic
  futex: Cover all PI opcodes with cmpxchg enabled check
</content>
</entry>
</feed>
