* [bug report] locking/rtmutex: Return success on deadlock for ww_mutex waiters
@ 2021-08-31 8:21 Dan Carpenter
2021-09-01 8:09 ` Peter Zijlstra
0 siblings, 1 reply; 4+ messages in thread
From: Dan Carpenter @ 2021-08-31 8:21 UTC (permalink / raw)
To: peterz; +Cc: kernel-janitors
Hello Peter Zijlstra,
This is a semi-automatic email about new static checker warnings.
The patch a055fcc132d4: "locking/rtmutex: Return success on deadlock
for ww_mutex waiters" from Aug 26, 2021, leads to the following
Smatch complaint:
kernel/locking/rtmutex.c:756 rt_mutex_adjust_prio_chain()
error: we previously assumed 'orig_waiter' could be null (see line 644)
kernel/locking/rtmutex.c
643 */
644 if (orig_waiter && !rt_mutex_owner(orig_lock))
^^^^^^^^^^^
A lot of this code assumes "orig_waiter" can be NULL.
645 goto out_unlock_pi;
646
647 /*
648 * We dropped all locks after taking a refcount on @task, so
649 * the task might have moved on in the lock chain or even left
650 * the chain completely and blocks now on an unrelated lock or
651 * on @orig_lock.
652 *
653 * We stored the lock on which @task was blocked in @next_lock,
654 * so we can detect the chain change.
655 */
656 if (next_lock != waiter->lock)
657 goto out_unlock_pi;
658
659 /*
660 * There could be 'spurious' loops in the lock graph due to ww_mutex,
661 * consider:
662 *
663 * P1: A, ww_A, ww_B
664 * P2: ww_B, ww_A
665 * P3: A
666 *
667 * P3 should not return -EDEADLK because it gets trapped in the cycle
668 * created by P1 and P2 (which will resolve -- and runs into
669 * max_lock_depth above). Therefore disable detect_deadlock such that
670 * the below termination condition can trigger once all relevant tasks
671 * are boosted.
672 *
673 * Even when we start with ww_mutex we can disable deadlock detection,
674 * since we would suppress a ww_mutex-induced deadlock at [6] anyway.
675 * Suppressing it here however is not sufficient since we might still
676 * hit [6] due to adjustment driven iteration.
677 *
678 * NOTE: if someone were to create a deadlock between 2 ww_classes we'd
679 * utterly fail to report it; lockdep should.
680 */
681 if (IS_ENABLED(CONFIG_PREEMPT_RT) && waiter->ww_ctx && detect_deadlock)
682 detect_deadlock = false;
683
684 /*
685 * Drop out, when the task has no waiters. Note,
686 * top_waiter can be NULL, when we are in the deboosting
687 * mode!
688 */
689 if (top_waiter) {
690 if (!task_has_pi_waiters(task))
691 goto out_unlock_pi;
692 /*
693 * If deadlock detection is off, we stop here if we
694 * are not the top pi waiter of the task. If deadlock
695 * detection is enabled we continue, but stop the
696 * requeueing in the chain walk.
697 */
698 if (top_waiter != task_top_pi_waiter(task)) {
699 if (!detect_deadlock)
700 goto out_unlock_pi;
701 else
702 requeue = false;
703 }
704 }
705
706 /*
707 * If the waiter priority is the same as the task priority
708 * then there is no further priority adjustment necessary. If
709 * deadlock detection is off, we stop the chain walk. If it's
710 * enabled we continue, but stop the requeueing in the chain
711 * walk.
712 */
713 if (rt_mutex_waiter_equal(waiter, task_to_waiter(task))) {
714 if (!detect_deadlock)
715 goto out_unlock_pi;
716 else
717 requeue = false;
718 }
719
720 /*
721 * [4] Get the next lock
722 */
723 lock = waiter->lock;
724 /*
725 * [5] We need to trylock here as we are holding task->pi_lock,
726 * which is the reverse lock order versus the other rtmutex
727 * operations.
728 */
729 if (!raw_spin_trylock(&lock->wait_lock)) {
730 raw_spin_unlock_irq(&task->pi_lock);
731 cpu_relax();
732 goto retry;
733 }
734
735 /*
736 * [6] check_exit_conditions_2() protected by task->pi_lock and
737 * lock->wait_lock.
738 *
739 * Deadlock detection. If the lock is the same as the original
740 * lock which caused us to walk the lock chain or if the
741 * current lock is owned by the task which initiated the chain
742 * walk, we detected a deadlock.
743 */
744 if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {
^^^^^^^^^^^^^^^^^
This might mean it's a false positive, but Smatch isn't clever enough to
figure it out. And I'm stupid too! Plus lazy... and ugly.
745 ret = -EDEADLK;
746
747 /*
748 * When the deadlock is due to ww_mutex; also see above. Don't
749 * report the deadlock and instead let the ww_mutex wound/die
750 * logic pick which of the contending threads gets -EDEADLK.
751 *
752 * NOTE: assumes the cycle only contains a single ww_class; any
753 * other configuration and we fail to report; also, see
754 * lockdep.
755 */
756 if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter->ww_ctx)
^^^^^^^^^^^^^^^^^^^
Unchecked dereference.
757 ret = 0;
758
regards,
dan carpenter
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [bug report] locking/rtmutex: Return success on deadlock for ww_mutex waiters
2021-08-31 8:21 [bug report] locking/rtmutex: Return success on deadlock for ww_mutex waiters Dan Carpenter
@ 2021-09-01 8:09 ` Peter Zijlstra
2021-09-01 9:44 ` [PATCH] locking/rtmutex: Fix ww_mutex deadlock check Peter Zijlstra
0 siblings, 1 reply; 4+ messages in thread
From: Peter Zijlstra @ 2021-09-01 8:09 UTC (permalink / raw)
To: Dan Carpenter
Cc: kernel-janitors, Thomas Gleixner, linux-kernel,
Sebastian Andrzej Siewior
On Tue, Aug 31, 2021 at 11:21:52AM +0300, Dan Carpenter wrote:
> Hello Peter Zijlstra,
Hi Dan :-)
> This is a semi-automatic email about new static checker warnings.
>
> The patch a055fcc132d4: "locking/rtmutex: Return success on deadlock
> for ww_mutex waiters" from Aug 26, 2021, leads to the following
> Smatch complaint:
>
> kernel/locking/rtmutex.c:756 rt_mutex_adjust_prio_chain()
> error: we previously assumed 'orig_waiter' could be null (see line 644)
>
> kernel/locking/rtmutex.c
> 643 */
> 644 if (orig_waiter && !rt_mutex_owner(orig_lock))
> ^^^^^^^^^^^
> A lot of this code assumes "orig_waiter" can be NULL.
>
> 735 /*
> 736 * [6] check_exit_conditions_2() protected by task->pi_lock and
> 737 * lock->wait_lock.
> 738 *
> 739 * Deadlock detection. If the lock is the same as the original
> 740 * lock which caused us to walk the lock chain or if the
> 741 * current lock is owned by the task which initiated the chain
> 742 * walk, we detected a deadlock.
> 743 */
> 744 if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {
> ^^^^^^^^^^^^^^^^^
> This might mean it's a false positive, but Smatch isn't clever enough to
> figure it out. And I'm stupid too! Plus lazy... and ugly.
>
> 745 ret = -EDEADLK;
> 746
> 747 /*
> 748 * When the deadlock is due to ww_mutex; also see above. Don't
> 749 * report the deadlock and instead let the ww_mutex wound/die
> 750 * logic pick which of the contending threads gets -EDEADLK.
> 751 *
> 752 * NOTE: assumes the cycle only contains a single ww_class; any
> 753 * other configuration and we fail to report; also, see
> 754 * lockdep.
> 755 */
> 756 if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter->ww_ctx)
> ^^^^^^^^^^^^^^^^^^^
> Unchecked dereference.
This is difficult... and I'm glad you flagged it. The normal de-boost
path is through rt_mutex_adjust_prio() and that has: .orig_lock == NULL
&& .orig_waiter == NULL. And as such it would never trigger the above
case.
However, there is remove_waiter() which is called on rt_mutex_lock()'s
failure paths and that doesn't have .orig_lock == NULL, and as such
*could* conceivably trigger this.
Let me figure out what the right thing to do is.
Thanks!
* [PATCH] locking/rtmutex: Fix ww_mutex deadlock check
2021-09-01 8:09 ` Peter Zijlstra
@ 2021-09-01 9:44 ` Peter Zijlstra
2021-09-06 16:51 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 4+ messages in thread
From: Peter Zijlstra @ 2021-09-01 9:44 UTC (permalink / raw)
To: Dan Carpenter
Cc: kernel-janitors, Thomas Gleixner, linux-kernel,
Sebastian Andrzej Siewior
On Wed, Sep 01, 2021 at 10:09:43AM +0200, Peter Zijlstra wrote:
> On Tue, Aug 31, 2021 at 11:21:52AM +0300, Dan Carpenter wrote:
> > Hello Peter Zijlstra,
>
> Hi Dan :-)
>
> > This is a semi-automatic email about new static checker warnings.
> >
> > The patch a055fcc132d4: "locking/rtmutex: Return success on deadlock
> > for ww_mutex waiters" from Aug 26, 2021, leads to the following
> > Smatch complaint:
> >
> > kernel/locking/rtmutex.c:756 rt_mutex_adjust_prio_chain()
> > error: we previously assumed 'orig_waiter' could be null (see line 644)
> >
> > kernel/locking/rtmutex.c
> > 643 */
> > 644 if (orig_waiter && !rt_mutex_owner(orig_lock))
> > ^^^^^^^^^^^
> > A lot of this code assumes "orig_waiter" can be NULL.
> >
>
> > 735 /*
> > 736 * [6] check_exit_conditions_2() protected by task->pi_lock and
> > 737 * lock->wait_lock.
> > 738 *
> > 739 * Deadlock detection. If the lock is the same as the original
> > 740 * lock which caused us to walk the lock chain or if the
> > 741 * current lock is owned by the task which initiated the chain
> > 742 * walk, we detected a deadlock.
> > 743 */
> > 744 if (lock == orig_lock || rt_mutex_owner(lock) == top_task) {
> > ^^^^^^^^^^^^^^^^^
> > This might mean it's a false positive, but Smatch isn't clever enough to
> > figure it out. And I'm stupid too! Plus lazy... and ugly.
> >
> > 745 ret = -EDEADLK;
> > 746
> > 747 /*
> > 748 * When the deadlock is due to ww_mutex; also see above. Don't
> > 749 * report the deadlock and instead let the ww_mutex wound/die
> > 750 * logic pick which of the contending threads gets -EDEADLK.
> > 751 *
> > 752 * NOTE: assumes the cycle only contains a single ww_class; any
> > 753 * other configuration and we fail to report; also, see
> > 754 * lockdep.
> > 755 */
> > 756 if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter->ww_ctx)
> > ^^^^^^^^^^^^^^^^^^^
> > Unchecked dereference.
>
>
> This is difficult... and I'm glad you flagged it. The normal de-boost
> path is through rt_mutex_adjust_prio() and that has: .orig_lock == NULL
> && .orig_waiter == NULL. And as such it would never trigger the above
> case.
>
> However, there is remove_waiter() which is called on rt_mutex_lock()'s
> failure paths and that doesn't have .orig_lock == NULL, and as such
> *could* conceivably trigger this.
>
> Let me figure out what the right thing to do is.
>
> Thanks!
I think something like this ought to do.
---
Subject: locking/rtmutex: Fix ww_mutex deadlock check
Dan reported that rt_mutex_adjust_prio_chain() can be called with
.orig_waiter == NULL however commit a055fcc132d4 ("locking/rtmutex:
Return success on deadlock for ww_mutex waiters") unconditionally
dereferences it.
Since both call-sites that have .orig_waiter == NULL don't care for the
return value, simply disable the deadlock squash by adding the NULL
check.
Notably, both callers use the deadlock condition as a termination
condition for the iteration; once detected, we're sure (de)boosting is
done. Arguably [3] would be a more natural termination point, but I'm
not sure adding a third deadlock detection state would improve the code.
Fixes: a055fcc132d4 ("locking/rtmutex: Return success on deadlock for ww_mutex waiters")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
kernel/locking/rtmutex.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 8eabdc79602b..6bb116c559b4 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -753,7 +753,7 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
* other configuration and we fail to report; also, see
* lockdep.
*/
- if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter->ww_ctx)
+ if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter && orig_waiter->ww_ctx)
ret = 0;
raw_spin_unlock(&lock->wait_lock);
* Re: [PATCH] locking/rtmutex: Fix ww_mutex deadlock check
2021-09-01 9:44 ` [PATCH] locking/rtmutex: Fix ww_mutex deadlock check Peter Zijlstra
@ 2021-09-06 16:51 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 4+ messages in thread
From: Sebastian Andrzej Siewior @ 2021-09-06 16:51 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Dan Carpenter, kernel-janitors, Thomas Gleixner, linux-kernel
On 2021-09-01 11:44:11 [+0200], Peter Zijlstra wrote:
> Subject: locking/rtmutex: Fix ww_mutex deadlock check
>
> Dan reported that rt_mutex_adjust_prio_chain() can be called with
> .orig_waiter == NULL however commit a055fcc132d4 ("locking/rtmutex:
> Return success on deadlock for ww_mutex waiters") unconditionally
> dereferences it.
>
> Since both call-sites that have .orig_waiter == NULL don't care for the
> return value, simply disable the deadlock squash by adding the NULL
> check.
>
> Notably, both callers use the deadlock condition as a termination
> condition for the iteration; once detected, we're sure (de)boosting is
> done. Arguably [3] would be a more natural termination point, but I'm
> not sure adding a third deadlock detection state would improve the code.
>
> Fixes: a055fcc132d4 ("locking/rtmutex: Return success on deadlock for ww_mutex waiters")
> Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
It sounds reasonable and I don't see any fallout in the testsuite so
Acked-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> kernel/locking/rtmutex.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
> index 8eabdc79602b..6bb116c559b4 100644
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -753,7 +753,7 @@ static int __sched rt_mutex_adjust_prio_chain(struct task_struct *task,
> * other configuration and we fail to report; also, see
> * lockdep.
> */
> - if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter->ww_ctx)
> + if (IS_ENABLED(CONFIG_PREEMPT_RT) && orig_waiter && orig_waiter->ww_ctx)
> ret = 0;
>
> raw_spin_unlock(&lock->wait_lock);
Sebastian