linux-kernel.vger.kernel.org archive mirror
From: Davidlohr Bueso <dave@stgolabs.net>
To: tglx@linutronix.de, mingo@redhat.com
Cc: peterz@infradead.org, dvhart@infradead.org,
	linux-kernel@vger.kernel.org, dave@stgolabs.net,
	Davidlohr Bueso <dbueso@suse.de>
Subject: [PATCH 2/2] futex: Leave the pi lock stealer in a consistent state upon successful fault
Date: Sun, 14 Mar 2021 22:02:24 -0700
Message-ID: <20210315050224.107056-3-dave@stgolabs.net>
In-Reply-To: <20210315050224.107056-1-dave@stgolabs.net>

Before 34b1a1ce145 (futex: Handle faults correctly for PI futexes) any
concurrent pi_state->owner fixup would assume that the task that fixed
things on our behalf also correctly updated the userspace value. This
is no longer always the case, and can result in a lock stealer returning
from a successful FUTEX_LOCK_PI operation after having raced, during a
fault, with an enqueued top waiter in an immutable state, such that the
uval TID was never updated for the stealer. This breaks otherwise
expected (and valid) semantics and confuses the stealer task.

Consider the following, starting with pi_state->owner == victim:

victim							stealer
futex_lock_pi() {
  queue_me(&q, hb);
  rt_mutex_timed_futex_lock() {
							futex_lock_pi() {
							  // lock steal
							  rt_mutex_timed_futex_lock();
    // timeout
  }

  spin_lock(q.lock_ptr);
  fixup_owner(!locked) {
    fixup_pi_state_owner(NULL) {
      oldowner = pi_state->owner
      newowner = stealer;
      handle_err:
      //drop locks

      ret = fault_in_user_writeable() {			spin_lock(q.lock_ptr);
							fixup_owner(locked) {
      } // -EFAULT					    fixup_pi_state_owner(current) {
							      oldowner = pi_state->owner
							      newowner = current;
							      handle_err:
							      // drop locks
							      ret = fault_in_user_writeable() {

      // take locks
      if (pi_state->owner != oldowner) // false

      pi_state_update_owner(rt_mutex_owner());

							       } // SUCCESS
   }
   // all locks dropped					       // take locks
							       if (pi_state->owner != oldowner) // success
}								 return 1;

This leaves: (pi_state->owner == pi_mutex owner == stealer) AND (uval TID == victim).
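
The mismatch above can be expressed as a small userspace check. This is
only a sketch for illustration, not kernel code: the MODEL_ constants
mirror the values of FUTEX_WAITERS and FUTEX_TID_MASK in <linux/futex.h>
but are redefined locally, and model_uval_matches_owner() is a
hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same values as FUTEX_WAITERS / FUTEX_TID_MASK, redefined for the sketch. */
#define MODEL_FUTEX_WAITERS	0x80000000u
#define MODEL_FUTEX_TID_MASK	0x3fffffffu

/*
 * Hypothetical helper: true when the TID stored in the futex word names
 * the task that the kernel side considers the owner.
 */
static bool model_uval_matches_owner(uint32_t uval, uint32_t owner_tid)
{
	/* Assumption of the sketch: a TID always fits in the TID bits. */
	assert((owner_tid & ~MODEL_FUTEX_TID_MASK) == 0);

	return (uval & MODEL_FUTEX_TID_MASK) == owner_tid;
}
```

In the scenario above the check would fail for the stealer: the futex
word still carries the victim's TID although the stealer owns the lock.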

This patch proposes that the lock stealer retry when it sees that someone
changed pi_state->owner while all locks were dropped, provided the fault
was successful. This allows the stealer to fix up the user state of the
lock itself, albeit in the reverse of the traditional order, which updates
userspace before pi_state; but this is an extraordinary scenario.
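
As a rough illustration of the proposed decision once the locks have been
retaken, here is a userspace model. It is a sketch under assumptions, not
the kernel function: the enum, the function name and its parameters are
all hypothetical, with is_stealer standing in for (argowner == current)
and owner_changed for (pi_state->owner != oldowner).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical outcomes of the check done after retaking the locks. */
enum model_action {
	MODEL_RETRY,			/* redo the uval update */
	MODEL_RETURN_LOCKED,		/* return 1: caller owns the lock */
	MODEL_RETURN_NOT_LOCKED,	/* return 0 */
	MODEL_FALLBACK,			/* fault failed, nobody fixed it up */
};

static enum model_action model_after_relock(int err, bool owner_changed,
					    bool is_stealer)
{
	/* Kernel-style error codes: 0 on success, negative errno otherwise. */
	assert(err <= 0);

	if (owner_changed) {
		/* Someone fixed pi_state; a stealer with a good fault fixes uval. */
		if (!err && is_stealer)
			return MODEL_RETRY;

		return is_stealer ? MODEL_RETURN_LOCKED
				  : MODEL_RETURN_NOT_LOCKED;
	}

	/* Retry if err was -EAGAIN or the fault-in succeeded. */
	if (err == -EAGAIN || !err)
		return MODEL_RETRY;

	return MODEL_FALLBACK;
}
```

The only behavioural difference this models is the first branch: a
stealer whose fault succeeded goes back to retry rather than returning
success with a stale uval.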

For normal fixups this does add some unnecessary overhead, by having to
deal with the userspace value when things are already ok, but this case is
pretty rare and we have already given up every inch of performance by
releasing all locks in order to fault/block.

Signed-off-by: Davidlohr Bueso <dbueso@suse.de>
---
 kernel/futex.c | 16 +++++++++++++---
 1 file changed, 13 insertions(+), 3 deletions(-)

diff --git a/kernel/futex.c b/kernel/futex.c
index ded7af2ba87f..95ce10c4e33d 100644
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2460,7 +2460,6 @@ static int __fixup_pi_state_owner(u32 __user *uaddr, struct futex_q *q,
 
 	case -EAGAIN:
 		cond_resched();
-		err = 0;
 		break;
 
 	default:
@@ -2474,11 +2473,22 @@ static int __fixup_pi_state_owner(u32 __user *uaddr, struct futex_q *q,
 	/*
 	 * Check if someone else fixed it for us:
 	 */
-	if (pi_state->owner != oldowner)
+	if (pi_state->owner != oldowner) {
+		/*
+		 * The change might have come from the rare immutable
+		 * state below, which leaves the userspace value out of
+		 * sync. But if we are the lock stealer and can update
+		 * the uval, do so, instead of reporting a successful
+		 * lock operation with an invalid user state.
+		 */
+		if (!err && argowner == current)
+			goto retry;
+
 		return argowner == current;
+	}
 
 	/* Retry if err was -EAGAIN or the fault in succeeded */
-	if (!err)
+	if (err == -EAGAIN || !err)
 		goto retry;
 
 	/*
-- 
2.26.2

