All of lore.kernel.org
 help / color / mirror / Atom feed
From: Julia Cartwright <julia@ni.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>
Cc: Gratian Crisan <gratian.crisan@ni.com>,
	<linux-kernel@vger.kernel.org>,
	Darren Hart <dvhart@infradead.org>,
	Ingo Molnar <mingo@redhat.com>
Subject: PI futexes + lock stealing woes
Date: Wed, 29 Nov 2017 11:56:05 -0600	[thread overview]
Message-ID: <20171129175605.GA863@jcartwri.amer.corp.natinst.com> (raw)

Hey Thomas, Peter-

Gratian and I have been debugging into a nasty and difficult race w/
futexes seemingly the culprit.  The original symptom we were seeing
was a seemingly spurious -EDEADLK from a futex(LOCK_PI) operation.

On further analysis, however, it appears the thread which gets the
spurious -EDEADLK has observed a weird futex state: a prior
futex(WAIT_REQUEUE_PI) operation has returned -ETIMEDOUT, but the uaddr2
futex word owner field indicates that it's the owner.

Here's an attempt to boil down this situation into a pseudo trace; I'm
happy to forward along the full traces as well, if that would be
helpful:

   waiter                                  waker                                            stealer (prio > waiter)

   futex(WAIT_REQUEUE_PI, uaddr, uaddr2,
         timeout=[N ms])
      futex_wait_requeue_pi()
         futex_wait_queue_me()
            freezable_schedule()
            <scheduled out>
                                           futex(LOCK_PI, uaddr2)
                                           futex(CMP_REQUEUE_PI, uaddr,
                                                 uaddr2, 1, 0)
                                              /* requeues waiter to uaddr2 */
                                           futex(UNLOCK_PI, uaddr2)
                                                 wake_futex_pi()
                                                    cmp_futex_value_locked(uaddr, waiter)
                                                    wake_up_q()
           <woken by waker>
           <hrtimer_wakeup() fires,
            clears sleeper->task>
                                                                                           futex(LOCK_PI, uaddr2)
                                                                                              __rt_mutex_start_proxy_lock()
                                                                                                 try_to_take_rt_mutex() /* steals lock */
                                                                                                    rt_mutex_set_owner(lock, stealer)
                                                                                              <preempted>
         <scheduled in>
         rt_mutex_wait_proxy_lock()
            __rt_mutex_slowlock()
               try_to_take_rt_mutex() /* fails, lock held by stealer */
               if (timeout && !timeout->task)
                  return -ETIMEDOUT;
            fixup_owner()
               /* lock wasn't acquired, so,
                  fixup_pi_state_owner skipped */
   return -ETIMEDOUT;

   /* At this point, we've returned -ETIMEDOUT to userspace, but the
    * futex word shows waiter to be the owner, and the pi_mutex has
    * stealer as the owner */

   futex_lock(LOCK_PI, uaddr2)
     -> bails with EDEADLK, futex word says we're owner.

At some later point in execution, the stealer gets scheduled back in and
will do fixup_owner() which fixes up the futex word, but at that point
it's too late: the waiter has already observed the wonky state.

fixup_owner() used to have additional seemingly relevant checks in place
that were removed 73d786bd043eb ("futex: Rework inconsistent
rt_mutex/futex_q state").

The actual kernel we've been testing is 4.9.33-rt23, w/ 153fbd1226fb3
("futex: Fix more put_pi_state() vs. exit_pi_state_list() races")
cherry-picked w/ PREEMPT_RT_FULL.  However, it appears that this issue
may affect v4.15-rc1?

Thoughts on how to move forward?

Nasty.

   Julia

             reply	other threads:[~2017-11-29 17:57 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-11-29 17:56 Julia Cartwright [this message]
2017-12-01 20:11 ` PI futexes + lock stealing woes Darren Hart
2017-12-01 21:49   ` Julia Cartwright
2017-12-02  0:32     ` Darren Hart
2017-12-06 23:46 ` Peter Zijlstra
2017-12-07  2:09   ` Gratian Crisan
2017-12-07 10:45     ` Peter Zijlstra
2017-12-07 14:26       ` Peter Zijlstra
2017-12-07 14:57         ` Gratian Crisan
2017-12-07 19:50           ` Julia Cartwright
2017-12-07 23:02             ` Gratian Crisan
2017-12-08 12:49               ` [PATCH] futex: Avoid violating the 10th rule of futex Peter Zijlstra
2017-12-08 16:04                 ` Gratian Crisan
2018-01-08 21:09                 ` Julia Cartwright
2018-01-14 18:06                 ` [tip:locking/urgent] " tip-bot for Peter Zijlstra

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20171129175605.GA863@jcartwri.amer.corp.natinst.com \
    --to=julia@ni.com \
    --cc=dvhart@infradead.org \
    --cc=gratian.crisan@ni.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.