* [bug report] locking/rtmutex: Return success on deadlock for ww_mutex waiters
@ 2021-08-31 8:21 Dan Carpenter
2021-09-01 8:09 ` Peter Zijlstra
0 siblings, 1 reply; 5+ messages in thread
From: Dan Carpenter @ 2021-08-31 8:21 UTC (permalink / raw)
To: peterz; +Cc: kernel-janitors
Hello Peter Zijlstra,
This is a semi-automatic email about new static checker warnings.
The patch a055fcc132d4: "locking/rtmutex: Return success on deadlock
for ww_mutex waiters" from Aug 26, 2021, leads to the following
Smatch complaint:
kernel/locking/rtmutex.c:756 rt_mutex_adjust_prio_chain()
error: we previously assumed 'orig_waiter' could be null (see line 644)
kernel/locking/rtmutex.c
643 */
644 if (orig_waiter && !rt_mutex_owner(orig_lock))
^^^^^^^^^^^
A lot of this code assumes "orig_waiter" can be NULL.
645 goto out_unlock_pi;
646
647 /*
648 * We dropped all locks after taking a refcount on @task, so
649 * the task might have moved on in the lock chain or even left
650 * the chain completely and blocks now on an unrelated lock or
651 * on @orig_lock.
652 *
653 * We stored the lock on which @task was blocked in @next_lock,
654 * so we can detect the chain change.
655 */
656 if (next_lock != waiter->lock)
657 goto out_unlock_pi;
658
659 /*
660 * There could be 'spurious' loops in the lock graph due to ww_mutex,
661 * consider:
662 *
663 * P1: A, ww_A, ww_B
664 * P2: ww_B, ww_A
665 * P3: A
666 *
667 * P3 should not return -EDEADLK because it gets trapped in the cycle
668 * created by P1 and P2 (which will resolve -- and runs into
669 * max_lock_depth above). Therefore disable detect_deadlock such that
670 * the below termination condition can trigger once all relevant tasks
671 * are boosted.
672 *
673 * Even when we start with ww_mutex we can disable deadlock detection,
674 * since we would suppress a ww_mutex induced deadlock at [6] anyway.
675 * Suppressing it here however is not sufficient since we might still
676 * hit [6] due to adjustment driven iteration.
677 *
678 * NOTE: if someone were to create a deadlock between 2 ww_classes we'd
679 * utterly fail to report it; lockdep should.
680 */
681 if (IS_ENABLED(CONFIG_PREEMPT_RT) && waiter->ww_ctx && detect_deadlock)
682 detect_deadlock = false;
683
684 /*
685 * Drop out, when the task has no waiters. Note,
686 * top_waiter can be NULL, when we are in the deboosting
687 * mode!
688 */
689 if (top_waiter) {
690 if (!task_has_pi_waiters(task))
691 goto out_unlock_pi;
692 /*
693 * If deadlock detection is off, we stop here if we
694 * are not the top pi waiter of the task. If deadlock
695 * detection is enabled we continue, but stop the
696 * requeueing in the chain walk.
697 */
698 if (top_waiter != task_top_pi_waiter(task)) {
699 if (!detect_deadlock)
700 goto out_unlock_pi;
701 else
702 requeue = false;
703 }
704 }
705
706 /*
707 * If the waiter priority is the same as the task priority
708 * then there is no further priority adjustment necessary. If
709 * deadlock detection is off, we stop the chain walk. If it's
710 * enabled we continue, but stop the requeueing in the chain
711 * walk.
712 */
713 if (rt_mutex_waiter_equal(waiter, task_to_waiter(task))) {
714 if (!detect_deadlock)
715 goto out_unlock_pi;
716 else
717 requeue = false;
718 }
719
720 /*
721 * [4] Get the next lock
722 */
723 lock = waiter->lock;
724 /*
725 * [5] We need to trylock here as we are holding task->pi_lock,
726 * which is the reverse lock order versus the other rtmutex
727 * operations.
728 */
729 if (!raw_spin_trylock(&lock->wait_lock)) {
730 raw_spin_unlock_irq(&task->pi_lock);
731 cpu_relax();
732 goto retry;
733 }
734
735 /*
736 * [6] check_exit_conditions_2() protected by task->pi_lock and
737 * lock->wait_lock.
738 *
739 * Deadlock detection. If the lock is the same as the original
740 * lock which caused us to walk the lock chain or if the
741 * current lock is owned by the task which initiated the chain
742 * walk, we detected a deadlock.
743 */
744 if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {
^^^^^^^^^^^^^^^^^
This might mean it's a false positive, but Smatch isn't clever enough to
figure it out. And I'm stupid too! Plus lazy... and ugly.
745 ret = -EDEADLK;
746
747 /*
748 * When the deadlock is due to ww_mutex; also see above. Don't
749 * report the deadlock and instead let the ww_mutex wound/die
750 * logic pick which of the contending threads gets -EDEADLK.
751 *
752 * NOTE: assumes the cycle only contains a single ww_class; any
753 * other configuration and we fail to report; also, see
754 * lockdep.
755 */
756 if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter->ww_ctx)
^^^^^^^^^^^^^^^^^^^
Unchecked dereference.
757 ret = 0;
758
regards,
dan carpenter
* Re: [bug report] locking/rtmutex: Return success on deadlock for ww_mutex waiters
2021-08-31 8:21 [bug report] locking/rtmutex: Return success on deadlock for ww_mutex waiters Dan Carpenter
@ 2021-09-01 8:09 ` Peter Zijlstra
2021-09-01 9:44 ` [PATCH] locking/rtmutex: Fix ww_mutex deadlock check Peter Zijlstra
0 siblings, 1 reply; 5+ messages in thread
From: Peter Zijlstra @ 2021-09-01 8:09 UTC (permalink / raw)
To: Dan Carpenter
Cc: kernel-janitors, Thomas Gleixner, linux-kernel,
Sebastian Andrzej Siewior
On Tue, Aug 31, 2021 at 11:21:52AM +0300, Dan Carpenter wrote:
> Hello Peter Zijlstra,
Hi Dan :-)
> This is a semi-automatic email about new static checker warnings.
>
> The patch a055fcc132d4: "locking/rtmutex: Return success on deadlock
> for ww_mutex waiters" from Aug 26, 2021, leads to the following
> Smatch complaint:
>
> kernel/locking/rtmutex.c:756 rt_mutex_adjust_prio_chain()
> error: we previously assumed 'orig_waiter' could be null (see line 644)
>
> kernel/locking/rtmutex.c
> 643 */
> 644 if (orig_waiter && !rt_mutex_owner(orig_lock))
> ^^^^^^^^^^^
> A lot of this code assumes "orig_waiter" can be NULL.
>
> 735 /*
> 736 * [6] check_exit_conditions_2() protected by task->pi_lock and
> 737 * lock->wait_lock.
> 738 *
> 739 * Deadlock detection. If the lock is the same as the original
> 740 * lock which caused us to walk the lock chain or if the
> 741 * current lock is owned by the task which initiated the chain
> 742 * walk, we detected a deadlock.
> 743 */
> 744 if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {
> ^^^^^^^^^^^^^^^^^
> This might mean it's a false positive, but Smatch isn't clever enough to
> figure it out. And I'm stupid too! Plus lazy... and ugly.
>
> 745 ret = -EDEADLK;
> 746
> 747 /*
> 748 * When the deadlock is due to ww_mutex; also see above. Don't
> 749 * report the deadlock and instead let the ww_mutex wound/die
> 750 * logic pick which of the contending threads gets -EDEADLK.
> 751 *
> 752 * NOTE: assumes the cycle only contains a single ww_class; any
> 753 * other configuration and we fail to report; also, see
> 754 * lockdep.
> 755 */
> 756 if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter->ww_ctx)
> ^^^^^^^^^^^^^^^^^^^
> Unchecked dereference.
This is difficult... and I'm glad you flagged it. The normal de-boost
path is through rt_mutex_adjust_prio() and that has: .orig_lock == NULL
&& .orig_waiter == NULL. And as such it would never trigger the above
case.
However, there is remove_waiter() which is called on rt_mutex_lock()'s
failure paths and that doesn't have .orig_lock == NULL, and as such
*could* conceivably trigger this.
Let me figure out what the right thing to do is.
Thanks!
* [PATCH] locking/rtmutex: Fix ww_mutex deadlock check
2021-09-01 8:09 ` Peter Zijlstra
@ 2021-09-01 9:44 ` Peter Zijlstra
2021-09-06 16:51 ` Sebastian Andrzej Siewior
2021-09-09 8:34 ` [tip: locking/urgent] " tip-bot2 for Peter Zijlstra
0 siblings, 2 replies; 5+ messages in thread
From: Peter Zijlstra @ 2021-09-01 9:44 UTC (permalink / raw)
To: Dan Carpenter
Cc: kernel-janitors, Thomas Gleixner, linux-kernel,
Sebastian Andrzej Siewior
On Wed, Sep 01, 2021 at 10:09:43AM +0200, Peter Zijlstra wrote:
> On Tue, Aug 31, 2021 at 11:21:52AM +0300, Dan Carpenter wrote:
> > Hello Peter Zijlstra,
>
> Hi Dan :-)
>
> > This is a semi-automatic email about new static checker warnings.
> >
> > The patch a055fcc132d4: "locking/rtmutex: Return success on deadlock
> > for ww_mutex waiters" from Aug 26, 2021, leads to the following
> > Smatch complaint:
> >
> > kernel/locking/rtmutex.c:756 rt_mutex_adjust_prio_chain()
> > error: we previously assumed 'orig_waiter' could be null (see line 644)
> >
> > kernel/locking/rtmutex.c
> > 643 */
> > 644 if (orig_waiter && !rt_mutex_owner(orig_lock))
> > ^^^^^^^^^^^
> > A lot of this code assumes "orig_waiter" can be NULL.
> >
>
> > 735 /*
> > 736 * [6] check_exit_conditions_2() protected by task->pi_lock and
> > 737 * lock->wait_lock.
> > 738 *
> > 739 * Deadlock detection. If the lock is the same as the original
> > 740 * lock which caused us to walk the lock chain or if the
> > 741 * current lock is owned by the task which initiated the chain
> > 742 * walk, we detected a deadlock.
> > 743 */
> > 744 if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {
> > ^^^^^^^^^^^^^^^^^
> > This might mean it's a false positive, but Smatch isn't clever enough to
> > figure it out. And I'm stupid too! Plus lazy... and ugly.
> >
> > 745 ret = -EDEADLK;
> > 746
> > 747 /*
> > 748 * When the deadlock is due to ww_mutex; also see above. Don't
> > 749 * report the deadlock and instead let the ww_mutex wound/die
> > 750 * logic pick which of the contending threads gets -EDEADLK.
> > 751 *
> > 752 * NOTE: assumes the cycle only contains a single ww_class; any
> > 753 * other configuration and we fail to report; also, see
> > 754 * lockdep.
> > 755 */
> > 756 if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter->ww_ctx)
> > ^^^^^^^^^^^^^^^^^^^
> > Unchecked dereference.
>
>
> This is difficult... and I'm glad you flagged it. The normal de-boost
> path is through rt_mutex_adjust_prio() and that has: .orig_lock == NULL
> && .orig_waiter == NULL. And as such it would never trigger the above
> case.
>
> However, there is remove_waiter() which is called on rt_mutex_lock()'s
> failure paths and that doesn't have .orig_lock == NULL, and as such
> *could* conceivably trigger this.
>
> Let me figure out what the right thing to do is.
>
> Thanks!
I think something like this ought to do.
---
Subject: locking/rtmutex: Fix ww_mutex deadlock check
Dan reported that rt_mutex_adjust_prio_chain() can be called with
.orig_waiter == NULL however commit a055fcc132d4 ("locking/rtmutex:
Return success on deadlock for ww_mutex waiters") unconditionally
dereferences it.
Since both call-sites that have .orig_waiter == NULL don't care for the
return value, simply disable the deadlock squash by adding the NULL
check.
Notably, both callers use the deadlock condition as a termination
condition for the iteration; once detected, we're sure (de)boosting is
done. Arguably [3] would be a more natural termination point, but I'm
not sure adding a third deadlock detection state would improve the code.
Fixes: a055fcc132d4 ("locking/rtmutex: Return success on deadlock for ww_mutex waiters")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/locking/rtmutex.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 8eabdc79602b..6bb116c559b4 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -753,7 +753,7 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
* other configuration and we fail to report; also, see
* lockdep.
*/
- if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter->ww_ctx)
+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter && orig_waiter->ww_ctx)
ret = 0;
raw_spin_unlock(&lock->wait_lock);
* Re: [PATCH] locking/rtmutex: Fix ww_mutex deadlock check
2021-09-01 9:44 ` [PATCH] locking/rtmutex: Fix ww_mutex deadlock check Peter Zijlstra
@ 2021-09-06 16:51 ` Sebastian Andrzej Siewior
2021-09-09 8:34 ` [tip: locking/urgent] " tip-bot2 for Peter Zijlstra
1 sibling, 0 replies; 5+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-09-06 16:51 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Dan Carpenter, kernel-janitors, Thomas Gleixner, linux-kernel
On 2021-09-01 11:44:11 [+0200], Peter Zijlstra wrote:
> Subject: locking/rtmutex: Fix ww_mutex deadlock check
>
> Dan reported that rt_mutex_adjust_prio_chain() can be called with
> .orig_waiter == NULL however commit a055fcc132d4 ("locking/rtmutex:
> Return success on deadlock for ww_mutex waiters") unconditionally
> dereferences it.
>
> Since both call-sites that have .orig_waiter == NULL don't care for the
> return value, simply disable the deadlock squash by adding the NULL
> check.
>
> Notably, both callers use the deadlock condition as a termination
> condition for the iteration; once detected, we're sure (de)boosting is
> done. Arguably [3] would be a more natural termination point, but I'm
> not sure adding a third deadlock detection state would improve the code.
>
> Fixes: a055fcc132d4 ("locking/rtmutex: Return success on deadlock for ww_mutex waiters")
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
It sounds reasonable and I don't see any fallout in the testsuite so
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> kernel/locking/rtmutex.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
> index 8eabdc79602b..6bb116c559b4 100644
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -753,7 +753,7 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
> * other configuration and we fail to report; also, see
> * lockdep.
> */
> - if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter->ww_ctx)
> + if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter && orig_waiter->ww_ctx)
> ret = 0;
>
> raw_spin_unlock(&lock->wait_lock);
Sebastian
* [tip: locking/urgent] locking/rtmutex: Fix ww_mutex deadlock check
2021-09-01 9:44 ` [PATCH] locking/rtmutex: Fix ww_mutex deadlock check Peter Zijlstra
2021-09-06 16:51 ` Sebastian Andrzej Siewior
@ 2021-09-09 8:34 ` tip-bot2 for Peter Zijlstra
1 sibling, 0 replies; 5+ messages in thread
From: tip-bot2 for Peter Zijlstra @ 2021-09-09 8:34 UTC (permalink / raw)
To: linux-tip-commits
Cc: Dan Carpenter, Peter Zijlstra (Intel),
Thomas Gleixner, Sebastian Andrzej Siewior, x86, linux-kernel
The following commit has been merged into the locking/urgent branch of tip:
Commit-ID: e5480572706da1b2c2dc2c6484eab64f92b9263b
Gitweb: https://git.kernel.org/tip/e5480572706da1b2c2dc2c6484eab64f92b9263b
Author: Peter Zijlstra <peterz@infradead.org>
AuthorDate: Wed, 01 Sep 2021 11:44:11 +02:00
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 09 Sep 2021 10:31:22 +02:00
locking/rtmutex: Fix ww_mutex deadlock check
Dan reported that rt_mutex_adjust_prio_chain() can be called with
.orig_waiter == NULL however commit a055fcc132d4 ("locking/rtmutex: Return
success on deadlock for ww_mutex waiters") unconditionally dereferences it.
Since both call-sites that have .orig_waiter == NULL don't care for the
return value, simply disable the deadlock squash by adding the NULL check.
Notably, both callers use the deadlock condition as a termination condition
for the iteration; once detected, it is sure that (de)boosting is done.
Arguably step [3] would be a more natural termination point, but it's
dubious whether adding a third deadlock detection state would improve the
code.
Fixes: a055fcc132d4 ("locking/rtmutex: Return success on deadlock for ww_mutex waiters")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Link: https://lore.kernel.org/r/YS9La56fHMiCCo75@hirez.programming.kicks-ass.net
---
kernel/locking/rtmutex.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 8eabdc7..6bb116c 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -753,7 +753,7 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
* other configuration and we fail to report; also, see
* lockdep.
*/
- if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter->ww_ctx)
+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter && orig_waiter->ww_ctx)
ret = 0;
raw_spin_unlock(&lock->wait_lock);