LKML Archive on lore.kernel.org
 help / color / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: LKML <linux-kernel@vger.kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Darren Hart <darren@dvhart.com>, Ingo Molnar <mingo@kernel.org>,
	Davidlohr Bueso <davidlohr@hp.com>, Kees Cook <kees@outflux.net>,
	wad@chromium.org
Subject: [patch 1/5] futex: Make unlock_pi more robust
Date: Wed, 11 Jun 2014 20:45:38 -0000
Message-ID: <20140611204237.016987332@linutronix.de> (raw)
In-Reply-To: <20140611202744.676528190@linutronix.de>


[-- Attachment #0: futex-make-unlock-pi-more-robust.patch --]
[-- Type: text/plain, Size: 4267 bytes --]

The kernel tries to atomically unlock the futex without checking
whether there is kernel state associated to the futex.

So if user space manipulated the user space value, this will leave
kernel internal state around associated to the owner task. 

For robustness sake, lookup first whether there are waiters on the
futex. If there are waiters, wake the top priority waiter with all the
proper sanity checks applied.

If there are no waiters, do the atomic release. We do not have to
preserve the waiters bit in this case, because a potentially incoming
waiter is blocked on the hb->lock and will acquire the futex
atomically. We neither have to preserve the owner died bit. The caller
is the owner and it was supposed to cleanup the mess.

Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---
 kernel/futex.c |   78 +++++++++++++++++++--------------------------------------
 1 file changed, 26 insertions(+), 52 deletions(-)

Index: linux/kernel/futex.c
===================================================================
--- linux.orig/kernel/futex.c
+++ linux/kernel/futex.c
@@ -1186,22 +1186,6 @@ static int wake_futex_pi(u32 __user *uad
 	return 0;
 }
 
-static int unlock_futex_pi(u32 __user *uaddr, u32 uval)
-{
-	u32 uninitialized_var(oldval);
-
-	/*
-	 * There is no waiter, so we unlock the futex. The owner died
-	 * bit has not to be preserved here. We are the owner:
-	 */
-	if (cmpxchg_futex_value_locked(&oldval, uaddr, uval, 0))
-		return -EFAULT;
-	if (oldval != uval)
-		return -EAGAIN;
-
-	return 0;
-}
-
 /*
  * Express the locking dependencies for lockdep:
  */
@@ -2401,10 +2385,10 @@ uaddr_faulted:
  */
 static int futex_unlock_pi(u32 __user *uaddr, unsigned int flags)
 {
-	struct futex_hash_bucket *hb;
-	struct futex_q *this, *next;
+	u32 uninitialized_var(curval), uval, vpid = task_pid_vnr(current);
 	union futex_key key = FUTEX_KEY_INIT;
-	u32 uval, vpid = task_pid_vnr(current);
+	struct futex_hash_bucket *hb;
+	struct futex_q *match;
 	int ret;
 
 retry:
@@ -2417,57 +2401,47 @@ retry:
 		return -EPERM;
 
 	ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &key, VERIFY_WRITE);
-	if (unlikely(ret != 0))
-		goto out;
+	if (ret)
+		return ret;
 
 	hb = hash_futex(&key);
 	spin_lock(&hb->lock);
 
 	/*
-	 * To avoid races, try to do the TID -> 0 atomic transition
-	 * again. If it succeeds then we can return without waking
-	 * anyone else up. We only try this if neither the waiters nor
-	 * the owner died bit are set.
-	 */
-	if (!(uval & ~FUTEX_TID_MASK) &&
-	    cmpxchg_futex_value_locked(&uval, uaddr, vpid, 0))
-		goto pi_faulted;
-	/*
-	 * Rare case: we managed to release the lock atomically,
-	 * no need to wake anyone else up:
-	 */
-	if (unlikely(uval == vpid))
-		goto out_unlock;
-
-	/*
-	 * Ok, other tasks may need to be woken up - check waiters
-	 * and do the wakeup if necessary:
-	 */
-	plist_for_each_entry_safe(this, next, &hb->chain, list) {
-		if (!match_futex (&this->key, &key))
-			continue;
-		ret = wake_futex_pi(uaddr, uval, this);
+	 * Check waiters first. We do not trust user space values at
+	 * all and we at least want to know if user space fiddled
+	 * with the futex value instead of blindly unlocking.
+	 */
+	match = futex_top_waiter(hb, &key);
+	if (match) {
+		ret = wake_futex_pi(uaddr, uval, match);
 		/*
-		 * The atomic access to the futex value
-		 * generated a pagefault, so retry the
-		 * user-access and the wakeup:
+		 * The atomic access to the futex value generated a
+		 * pagefault, so retry the user-access and the wakeup:
 		 */
 		if (ret == -EFAULT)
 			goto pi_faulted;
 		goto out_unlock;
 	}
+
 	/*
-	 * No waiters - kernel unlocks the futex:
+	 * We have no kernel internal state, i.e. no waiters in the
+	 * kernel. Waiters which are about to queue themself are stuck
+	 * on hb->lock. So we can safely ignore them. We do neither
+	 * preserve the WAITERS bit not the OWNER_DIED one. We are the
+	 * owner.
 	 */
-	ret = unlock_futex_pi(uaddr, uval);
-	if (ret == -EFAULT)
+	if (cmpxchg_futex_value_locked(&curval, uaddr, uval, 0))
 		goto pi_faulted;
 
+	/*
+	 * If uval has changed, let user space handle it.
+	 */
+	ret = (curval == uval) ? 0 : -EAGAIN;
+
 out_unlock:
 	spin_unlock(&hb->lock);
 	put_futex_key(&key);
-
-out:
 	return ret;
 
 pi_faulted:



  reply index

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-11 20:45 [patch 0/5] futex: More robustness tweaks Thomas Gleixner
2014-06-11 20:45 ` Thomas Gleixner [this message]
2014-06-16 16:18   ` [patch 1/5] futex: Make unlock_pi more robust Darren Hart
2014-06-16 22:15     ` Thomas Gleixner
2014-06-16 22:28       ` Thomas Gleixner
2014-06-16 22:49         ` Darren Hart
2014-06-16 22:39       ` Darren Hart
2014-06-21 20:33   ` [tip:locking/core] " tip-bot for Thomas Gleixner
2014-06-11 20:45 ` [patch 3/5] futex: Split out the waiter check from lookup_pi_state() Thomas Gleixner
2014-06-16 18:12   ` Darren Hart
2014-06-21 20:33   ` [tip:locking/core] " tip-bot for Thomas Gleixner
2014-06-11 20:45 ` [patch 2/5] futex: Use futex_top_waiter() in lookup_pi_state() Thomas Gleixner
2014-06-16 16:51   ` Darren Hart
2014-06-21 20:33   ` [tip:locking/core] " tip-bot for Thomas Gleixner
2014-06-11 20:45 ` [patch 4/5] futex: Split out the first waiter attachment from lookup_pi_state() Thomas Gleixner
2014-06-16 18:19   ` Darren Hart
2014-06-21 20:33   ` [tip:locking/core] " tip-bot for Thomas Gleixner
2014-06-11 20:45 ` [patch 5/5] futex: Simplify futex_lock_pi_atomic() and make it more robust Thomas Gleixner
2014-06-13  5:46   ` Darren Hart
2014-06-13  8:34     ` Thomas Gleixner
2014-06-13  9:36       ` Thomas Gleixner
2014-06-13  9:44         ` [patch V2 " Thomas Gleixner
2014-06-13 20:51           ` Davidlohr Bueso
2014-06-16 20:36           ` Darren Hart
2014-06-17  7:20             ` Thomas Gleixner
2014-06-21 20:34           ` [tip:locking/core] " tip-bot for Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140611204237.016987332@linutronix.de \
    --to=tglx@linutronix.de \
    --cc=darren@dvhart.com \
    --cc=davidlohr@hp.com \
    --cc=kees@outflux.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=wad@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LKML Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/lkml/0 lkml/git/0.git
	git clone --mirror https://lore.kernel.org/lkml/1 lkml/git/1.git
	git clone --mirror https://lore.kernel.org/lkml/2 lkml/git/2.git
	git clone --mirror https://lore.kernel.org/lkml/3 lkml/git/3.git
	git clone --mirror https://lore.kernel.org/lkml/4 lkml/git/4.git
	git clone --mirror https://lore.kernel.org/lkml/5 lkml/git/5.git
	git clone --mirror https://lore.kernel.org/lkml/6 lkml/git/6.git
	git clone --mirror https://lore.kernel.org/lkml/7 lkml/git/7.git
	git clone --mirror https://lore.kernel.org/lkml/8 lkml/git/8.git
	git clone --mirror https://lore.kernel.org/lkml/9 lkml/git/9.git
	git clone --mirror https://lore.kernel.org/lkml/10 lkml/git/10.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 lkml lkml/ https://lore.kernel.org/lkml \
		linux-kernel@vger.kernel.org
	public-inbox-index lkml

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-kernel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git