From: Peter Zijlstra <peterz@infradead.org>
To: tglx@linutronix.de
Cc: mingo@kernel.org, juri.lelli@arm.com, rostedt@goodmis.org,
xlpang@redhat.com, bigeasy@linutronix.de,
linux-kernel@vger.kernel.org, mathieu.desnoyers@efficios.com,
jdesfossez@efficios.com, bristot@redhat.com,
dvhart@infradead.org, peterz@infradead.org
Subject: [PATCH -v6 13/13] futex: futex_lock_pi() vs PREEMPT_RT_FULL
Date: Wed, 22 Mar 2017 11:36:00 +0100
Message-ID: <20170322104152.161341537@infradead.org>
In-Reply-To: 20170322103547.756091212@infradead.org
[-- Attachment #1: peterz-futex-pi-unlock-12.patch --]
[-- Type: text/plain, Size: 6343 bytes --]
When PREEMPT_RT_FULL does the spinlock -> rt_mutex substitution the PI
chain code will (falsely) report a deadlock and BUG.

The problem is that we hold hb->lock (now an rt_mutex) while doing
task_blocks_on_rt_mutex on the futex's pi_state::rtmutex. This, when
interleaved just right with futex_unlock_pi() leads it to believe we
have an AB-BA deadlock.

  Task1 (holds rt_mutex,        Task2 (does FUTEX_LOCK_PI)
         does FUTEX_UNLOCK_PI)

                                lock hb->lock
                                lock rt_mutex (as per start_proxy)
  lock hb->lock

Which is a trivial AB-BA.

It is not an actual deadlock, because we won't be holding hb->lock by
the time we actually block on rt_mutex, but the chainwalk code doesn't
know that.

To avoid this problem, do the same thing we do in futex_unlock_pi()
and drop hb->lock after acquiring wait_lock. This still fully
serializes against futex_unlock_pi(), since adding to the wait_list
does the very same lock dance, and removing it holds both locks.

Reported-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
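
For readers who want to see the lock dance from the changelog in
isolation, here is a minimal userspace sketch. It is not the kernel
code: hb_lock, wait_lock and nr_waiters are hypothetical stand-ins for
hb->lock, pi_mutex.wait_lock and the rt_mutex wait list, and plain
pthread mutexes replace the kernel primitives. Both sides take
wait_lock while still holding hb_lock and only then drop hb_lock, so
there is never a point where a thread holds neither.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_THREADS 4
#define NR_ITERS   1000

/* Hypothetical stand-ins for hb->lock, wait_lock and the wait list. */
static pthread_mutex_t hb_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t wait_lock = PTHREAD_MUTEX_INITIALIZER;
static int nr_waiters;

/* Lock side: hand over from hb_lock to wait_lock, then enqueue. */
static void enqueue_waiter(void)
{
	pthread_mutex_lock(&hb_lock);
	pthread_mutex_lock(&wait_lock);		/* hb_lock still held here */
	pthread_mutex_unlock(&hb_lock);		/* now only wait_lock is held */
	nr_waiters++;
	pthread_mutex_unlock(&wait_lock);
}

/* Unlock side: the same handoff, then inspect the list under wait_lock. */
static int dequeue_waiter(void)
{
	int had_waiter;

	pthread_mutex_lock(&hb_lock);
	pthread_mutex_lock(&wait_lock);
	pthread_mutex_unlock(&hb_lock);
	had_waiter = nr_waiters > 0;
	if (had_waiter)
		nr_waiters--;
	pthread_mutex_unlock(&wait_lock);
	return had_waiter;
}

static void *locker(void *arg)
{
	(void)arg;
	for (int i = 0; i < NR_ITERS; i++)
		enqueue_waiter();
	return NULL;
}

static void *unlocker(void *arg)
{
	int *woken = arg;

	/* Spin until every enqueued waiter has been accounted for. */
	while (*woken < NR_THREADS * NR_ITERS)
		if (dequeue_waiter())
			(*woken)++;
	return NULL;
}

/* Returns the number of waiters the unlock side observed, or -1. */
int run_handoff_demo(void)
{
	pthread_t lockers[NR_THREADS], ul;
	int woken = 0;

	if (pthread_create(&ul, NULL, unlocker, &woken))
		return -1;
	for (int i = 0; i < NR_THREADS; i++)
		if (pthread_create(&lockers[i], NULL, locker, NULL))
			return -1;
	for (int i = 0; i < NR_THREADS; i++)
		pthread_join(lockers[i], NULL);
	pthread_join(ul, NULL);
	return woken;
}
```

Calling run_handoff_demo() from a main() and building with -pthread
should account for every enqueued waiter. The sketch only shows the
ordering of the two locks; it does not model priority inheritance or
the chain walk.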
kernel/futex.c | 30 +++++++++++++++++-------
kernel/locking/rtmutex.c | 49 ++++++++++++++++++++++------------------
kernel/locking/rtmutex_common.h | 3 ++
3 files changed, 52 insertions(+), 30 deletions(-)
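
The rtmutex.c part of the diff below splits one function into a variant
that expects wait_lock to be held and a wrapper that takes it. A rough
userspace sketch of that split, under loud assumptions: struct toy_lock
and both function names are invented for illustration, the `_locked`
suffix replaces the kernel's `__` prefix (reserved for the
implementation in userspace C), and the return convention (1 acquired,
0 queued) merely mirrors the kernel-doc in the patch.

```c
#include <assert.h>
#include <pthread.h>

/* Toy lock: wait_lock protects owner and nr_waiters. */
struct toy_lock {
	pthread_mutex_t wait_lock;
	int owner;		/* 0 means unowned, otherwise a task id */
	int nr_waiters;
};

/*
 * Caller must already hold lock->wait_lock.
 * Returns 1 if the lock was acquired for @task, 0 if @task was queued.
 */
static int toy_start_proxy_lock_locked(struct toy_lock *lock, int task)
{
	if (lock->owner == 0) {
		lock->owner = task;
		return 1;
	}
	lock->nr_waiters++;
	return 0;
}

/* Convenience wrapper for callers that do not hold wait_lock yet. */
static int toy_start_proxy_lock(struct toy_lock *lock, int task)
{
	int ret;

	pthread_mutex_lock(&lock->wait_lock);
	ret = toy_start_proxy_lock_locked(lock, task);
	pthread_mutex_unlock(&lock->wait_lock);
	return ret;
}
```

A caller that has to drop some outer lock between taking wait_lock and
enqueueing, as futex_lock_pi() does with hb->lock, would use the
`_locked` variant; everyone else keeps calling the wrapper.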
--- a/kernel/futex.c
+++ b/kernel/futex.c
@@ -2654,20 +2654,33 @@ static int futex_lock_pi(u32 __user *uad
goto no_block;
}
+ rt_mutex_init_waiter(&rt_waiter);
+
/*
- * We must add ourselves to the rt_mutex waitlist while holding hb->lock
- * such that the hb and rt_mutex wait lists match.
+ * On PREEMPT_RT_FULL, when hb->lock becomes an rt_mutex, we must not
+ * hold it while doing rt_mutex_start_proxy_lock(), because then it will
+ * include hb->lock in the blocking chain, even though we'll not in
+ * fact hold it while blocking. This will lead it to report -EDEADLK
+ * and BUG when futex_unlock_pi() interleaves with this.
+ *
+ * Therefore acquire wait_lock while holding hb->lock, but drop the
+ * latter before calling rt_mutex_start_proxy_lock(). This still fully
+ * serializes against futex_unlock_pi() as that does the exact same
+ * lock handoff sequence.
*/
- rt_mutex_init_waiter(&rt_waiter);
- ret = rt_mutex_start_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter, current);
+ raw_spin_lock_irq(&q.pi_state->pi_mutex.wait_lock);
+ spin_unlock(q.lock_ptr);
+ ret = __rt_mutex_start_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter, current);
+ raw_spin_unlock_irq(&q.pi_state->pi_mutex.wait_lock);
+
if (ret) {
if (ret == 1)
ret = 0;
+ spin_lock(q.lock_ptr);
goto no_block;
}
- spin_unlock(q.lock_ptr);
if (unlikely(to))
hrtimer_start_expires(&to->timer, HRTIMER_MODE_ABS);
@@ -2680,6 +2693,9 @@ static int futex_lock_pi(u32 __user *uad
* first acquire the hb->lock before removing the lock from the
* rt_mutex waitqueue, such that we can keep the hb and rt_mutex
* wait lists consistent.
+ *
+ * In particular, it is important that futex_unlock_pi() cannot
+ * observe this inconsistency.
*/
if (ret && !rt_mutex_cleanup_proxy_lock(&q.pi_state->pi_mutex, &rt_waiter))
ret = 0;
@@ -2791,10 +2807,6 @@ static int futex_unlock_pi(u32 __user *u
get_pi_state(pi_state);
/*
- * Since modifying the wait_list is done while holding both
- * hb->lock and wait_lock, holding either is sufficient to
- * observe it.
- *
* By taking wait_lock while still holding hb->lock, we ensure
* there is no point where we hold neither; and therefore
* wake_futex_pi() must observe a state consistent with what we
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1669,31 +1669,14 @@ void rt_mutex_proxy_unlock(struct rt_mut
rt_mutex_set_owner(lock, NULL);
}
-/**
- * rt_mutex_start_proxy_lock() - Start lock acquisition for another task
- * @lock: the rt_mutex to take
- * @waiter: the pre-initialized rt_mutex_waiter
- * @task: the task to prepare
- *
- * Returns:
- * 0 - task blocked on lock
- * 1 - acquired the lock for task, caller should wake it up
- * <0 - error
- *
- * Special API call for FUTEX_REQUEUE_PI support.
- */
-int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
+int __rt_mutex_start_proxy_lock(struct rt_mutex *lock,
struct rt_mutex_waiter *waiter,
struct task_struct *task)
{
int ret;
- raw_spin_lock_irq(&lock->wait_lock);
-
- if (try_to_take_rt_mutex(lock, task, NULL)) {
- raw_spin_unlock_irq(&lock->wait_lock);
+ if (try_to_take_rt_mutex(lock, task, NULL))
return 1;
- }
/* We enforce deadlock detection for futexes */
ret = task_blocks_on_rt_mutex(lock, waiter, task,
@@ -1712,12 +1695,36 @@ int rt_mutex_start_proxy_lock(struct rt_
if (unlikely(ret))
remove_waiter(lock, waiter);
- raw_spin_unlock_irq(&lock->wait_lock);
-
debug_rt_mutex_print_deadlock(waiter);
return ret;
}
+
+/**
+ * rt_mutex_start_proxy_lock() - Start lock acquisition for another task
+ * @lock: the rt_mutex to take
+ * @waiter: the pre-initialized rt_mutex_waiter
+ * @task: the task to prepare
+ *
+ * Returns:
+ * 0 - task blocked on lock
+ * 1 - acquired the lock for task, caller should wake it up
+ * <0 - error
+ *
+ * Special API call for FUTEX_REQUEUE_PI support.
+ */
+int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
+ struct rt_mutex_waiter *waiter,
+ struct task_struct *task)
+{
+ int ret;
+
+ raw_spin_lock_irq(&lock->wait_lock);
+ ret = __rt_mutex_start_proxy_lock(lock, waiter, task);
+ raw_spin_unlock_irq(&lock->wait_lock);
+
+ return ret;
+}
/**
* rt_mutex_next_owner - return the next owner of the lock
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -104,6 +104,9 @@ extern void rt_mutex_init_proxy_locked(s
extern void rt_mutex_proxy_unlock(struct rt_mutex *lock,
struct task_struct *proxy_owner);
extern void rt_mutex_init_waiter(struct rt_mutex_waiter *waiter);
+extern int __rt_mutex_start_proxy_lock(struct rt_mutex *lock,
+ struct rt_mutex_waiter *waiter,
+ struct task_struct *task);
extern int rt_mutex_start_proxy_lock(struct rt_mutex *lock,
struct rt_mutex_waiter *waiter,
struct task_struct *task);