linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Darren Hart <darren@dvhart.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@kernel.org>,
	Davidlohr Bueso <davidlohr@hp.com>, Kees Cook <kees@outflux.net>,
	wad@chromium.org
Subject: Re: [patch 1/5] futex: Make unlock_pi more robust
Date: Mon, 16 Jun 2014 09:18:48 -0700	[thread overview]
Message-ID: <1402935528.15603.14.camel@rage> (raw)
In-Reply-To: <20140611204237.016987332@linutronix.de>

On Wed, 2014-06-11 at 20:45 +0000, Thomas Gleixner wrote:
> The kernel tries to atomically unlock the futex without checking
> whether there is kernel state associated to the futex.
> 
> So if user space manipulated the user space value, this will leave
> kernel internal state around associated to the owner task. 
> 
> For robustness sake, lookup first whether there are waiters on the
> futex. If there are waiters, wake the top priority waiter with all the
> proper sanity checks applied.
> 
> If there are no waiters, do the atomic release. We do not have to
> preserve the waiters bit in this case, because a potentially incoming
> waiter is blocked on the hb->lock and will acquire the futex
> atomically. We neither have to preserve the owner died bit. The caller
> is the owner and it was supposed to cleanup the mess.
> 
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
>  kernel/futex.c |   78
> +++++++++++++++++++--------------------------------------
>  1 file changed, 26 insertions(+), 52 deletions(-)
> 
> Index: linux/kernel/futex.c
> ===================================================================
> --- linux.orig/kernel/futex.c
> +++ linux/kernel/futex.c
> @@ -1186,22 +1186,6 @@ static int wake_futex_pi(u32 __user *uad
>         return 0;
>  }
>  
> -static int unlock_futex_pi(u32 __user *uaddr, u32 uval)
> -{
> -       u32 uninitialized_var(oldval);
> -
> -       /*
> -        * There is no waiter, so we unlock the futex. The owner died
> -        * bit has not to be preserved here. We are the owner:
> -        */
> -       if (cmpxchg_futex_value_locked(&oldval, uaddr, uval, 0))
> -               return -EFAULT;
> -       if (oldval != uval)
> -               return -EAGAIN;
> -
> -       return 0;
> -}

I thought we used that in multiple places, but apparently not. Worth
consolidating.

> -
>  /*
>   * Express the locking dependencies for lockdep:
>   */
> @@ -2401,10 +2385,10 @@ uaddr_faulted:
>   */
>  static int futex_unlock_pi(u32 __user *uaddr, unsigned int flags)
>  {
> -       struct futex_hash_bucket *hb;
> -       struct futex_q *this, *next;
> +       u32 uninitialized_var(curval), uval, vpid =
> task_pid_vnr(current);
>         union futex_key key = FUTEX_KEY_INIT;
> -       u32 uval, vpid = task_pid_vnr(current);
> +       struct futex_hash_bucket *hb;
> +       struct futex_q *match;
>         int ret;
>  
>  retry:
> @@ -2417,57 +2401,47 @@ retry:
>                 return -EPERM;
>  
>         ret = get_futex_key(uaddr, flags & FLAGS_SHARED, &key,
> VERIFY_WRITE);
> -       if (unlikely(ret != 0))
> -               goto out;
> +       if (ret)
> +               return ret;

Looks like you're also trying to move away from a single exit point to
multiple exit points. I prefer the single exit (which you've probably
noticed :-), but it's a subjective thing, and so long as we are not
duplicating cleanup logic, I guess it's fine either way. This change was
not mentioned in the commit message though.

>  
>         hb = hash_futex(&key);
>         spin_lock(&hb->lock);
>  
>         /*
> -        * To avoid races, try to do the TID -> 0 atomic transition
> -        * again. If it succeeds then we can return without waking
> -        * anyone else up. We only try this if neither the waiters nor
> -        * the owner died bit are set.
> -        */
> -       if (!(uval & ~FUTEX_TID_MASK) &&
> -           cmpxchg_futex_value_locked(&uval, uaddr, vpid, 0))
> -               goto pi_faulted;
> -       /*
> -        * Rare case: we managed to release the lock atomically,
> -        * no need to wake anyone else up:
> -        */
> -       if (unlikely(uval == vpid))
> -               goto out_unlock;
> -
> -       /*
> -        * Ok, other tasks may need to be woken up - check waiters
> -        * and do the wakeup if necessary:
> -        */
> -       plist_for_each_entry_safe(this, next, &hb->chain, list) {
> -               if (!match_futex (&this->key, &key))
> -                       continue;
> -               ret = wake_futex_pi(uaddr, uval, this);
> +        * Check waiters first. We do not trust user space values at
> +        * all and we at least want to know if user space fiddled
> +        * with the futex value instead of blindly unlocking.
> +        */
> +       match = futex_top_waiter(hb, &key);
> +       if (match) {
> +               ret = wake_futex_pi(uaddr, uval, match);
>                 /*
> -                * The atomic access to the futex value
> -                * generated a pagefault, so retry the
> -                * user-access and the wakeup:
> +                * The atomic access to the futex value generated a
> +                * pagefault, so retry the user-access and the wakeup:
>                  */
>                 if (ret == -EFAULT)
>                         goto pi_faulted;
>                 goto out_unlock;
>         }
> +
>         /*
> -        * No waiters - kernel unlocks the futex:
> +        * We have no kernel internal state, i.e. no waiters in the
> +        * kernel. Waiters which are about to queue themself are stuck

themselves

> +        * on hb->lock. So we can safely ignore them. We do neither
> +        * preserve the WAITERS bit not the OWNER_DIED one. We are the

We preserve neither the WAITERS bit nor the OWNER_DIED bit.
(the above use of "do" and "not" is incorrect and could easily be
misinterpreted).

> +        * owner.

In wake_futex_pi we verify ownership by matching pi_state->owner ==
current, but here the only test is the TID value, which is set by
userspace - which we don't trust...

I'm trying to determine if it matters in this case... if there are no
waiters, is the pi_state still around? If so, it does indeed matter, and
we should be verifying.


>          */
> -       ret = unlock_futex_pi(uaddr, uval);
> -       if (ret == -EFAULT)
> +       if (cmpxchg_futex_value_locked(&curval, uaddr, uval, 0))
>                 goto pi_faulted;
>  

This refactoring seems like it would be best done as a prequel patch so
as not to confuse cleanup with functional change. At least that is what
you and others have beaten into me over the years ;-)

> +       /*
> +        * If uval has changed, let user space handle it.
> +        */
> +       ret = (curval == uval) ? 0 : -EAGAIN;
> +
>  out_unlock:
>         spin_unlock(&hb->lock);
>         put_futex_key(&key);
> -
> -out:
>         return ret;
>  

By dropping this you won't return ret, but rather fall through into
pi_faulted... which certainly isn't what you wanted.

The need for better test coverage is very evident now :-)

--
Darren

>  pi_faulted:
> 



  reply	other threads:[~2014-06-16 16:11 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-06-11 20:45 [patch 0/5] futex: More robustness tweaks Thomas Gleixner
2014-06-11 20:45 ` [patch 1/5] futex: Make unlock_pi more robust Thomas Gleixner
2014-06-16 16:18   ` Darren Hart [this message]
2014-06-16 22:15     ` Thomas Gleixner
2014-06-16 22:28       ` Thomas Gleixner
2014-06-16 22:49         ` Darren Hart
2014-06-16 22:39       ` Darren Hart
2014-06-21 20:33   ` [tip:locking/core] " tip-bot for Thomas Gleixner
2014-06-11 20:45 ` [patch 3/5] futex: Split out the waiter check from lookup_pi_state() Thomas Gleixner
2014-06-16 18:12   ` Darren Hart
2014-06-21 20:33   ` [tip:locking/core] " tip-bot for Thomas Gleixner
2014-06-11 20:45 ` [patch 2/5] futex: Use futex_top_waiter() in lookup_pi_state() Thomas Gleixner
2014-06-16 16:51   ` Darren Hart
2014-06-21 20:33   ` [tip:locking/core] " tip-bot for Thomas Gleixner
2014-06-11 20:45 ` [patch 4/5] futex: Split out the first waiter attachment from lookup_pi_state() Thomas Gleixner
2014-06-16 18:19   ` Darren Hart
2014-06-21 20:33   ` [tip:locking/core] " tip-bot for Thomas Gleixner
2014-06-11 20:45 ` [patch 5/5] futex: Simplify futex_lock_pi_atomic() and make it more robust Thomas Gleixner
2014-06-13  5:46   ` Darren Hart
2014-06-13  8:34     ` Thomas Gleixner
2014-06-13  9:36       ` Thomas Gleixner
2014-06-13  9:44         ` [patch V2 " Thomas Gleixner
2014-06-13 20:51           ` Davidlohr Bueso
2014-06-16 20:36           ` Darren Hart
2014-06-17  7:20             ` Thomas Gleixner
2014-06-21 20:34           ` [tip:locking/core] " tip-bot for Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1402935528.15603.14.camel@rage \
    --to=darren@dvhart.com \
    --cc=davidlohr@hp.com \
    --cc=kees@outflux.net \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    --cc=wad@chromium.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).