* [PATCH] kernel/rtmutex: Handle non enqueued waiters gracefully in remove_waiter()
@ 2018-03-12 14:28 Sebastian Andrzej Siewior
2018-03-16 12:20 ` Peter Zijlstra
0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-03-12 14:28 UTC (permalink / raw)
To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner
In -RT task_blocks_on_rt_mutex() may return with -EAGAIN due to
(->pi_blocked_on == PI_WAKEUP_INPROGRESS) before it added itself as a
waiter. In such a case we must not call remove_waiter() because without
a waiter it will trigger the BUG_ON() statement.
This was initially reported by Yimin Deng. Thomas Gleixner fixed it then
with an explicit check for waiters before calling remove_waiter() with
the following note:
| Guard it with rt_mutex_has_waiters(). This is a quick fix which is
| easy to backport. The proper fix is to have a central check in
| remove_waiter() so we can call it unconditionally.
This is the suggested change.
Now that it is possible to call remove_waiter() unconditionally I also
remove that check from rt_mutex_slowlock().
Link: https://lkml.kernel.org/r/CAAh1qt=DCL9aUXNxanP5BKtiPp3m+qj4yB+gDohhXPVFCxWwzg@mail.gmail.com
Reported-and-debugged-by: Yimin Deng <yimin11.deng@gmail.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/locking/rtmutex.c | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 65cc0cb984e6..57d28d8f5280 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1068,12 +1068,15 @@ static void mark_wakeup_next_waiter(struct wake_q_head *wake_q,
static void remove_waiter(struct rt_mutex *lock,
struct rt_mutex_waiter *waiter)
{
- bool is_top_waiter = (waiter == rt_mutex_top_waiter(lock));
+ bool is_top_waiter = false;
struct task_struct *owner = rt_mutex_owner(lock);
struct rt_mutex *next_lock;
lockdep_assert_held(&lock->wait_lock);
+ if (rt_mutex_has_waiters(lock))
+ is_top_waiter = waiter == rt_mutex_top_waiter(lock);
+
raw_spin_lock(¤t->pi_lock);
rt_mutex_dequeue(lock, waiter);
current->pi_blocked_on = NULL;
@@ -1268,8 +1271,7 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
if (unlikely(ret)) {
__set_current_state(TASK_RUNNING);
- if (rt_mutex_has_waiters(lock))
- remove_waiter(lock, &waiter);
+ remove_waiter(lock, &waiter);
rt_mutex_handle_deadlock(ret, chwalk, &waiter);
}
--
2.16.2
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] kernel/rtmutex: Handle non enqueued waiters gracefully in remove_waiter()
2018-03-12 14:28 [PATCH] kernel/rtmutex: Handle non enqueued waiters gracefully in remove_waiter() Sebastian Andrzej Siewior
@ 2018-03-16 12:20 ` Peter Zijlstra
2018-03-19 15:11 ` Thomas Gleixner
0 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2018-03-16 12:20 UTC (permalink / raw)
To: Sebastian Andrzej Siewior; +Cc: linux-kernel, Ingo Molnar, Thomas Gleixner
On Mon, Mar 12, 2018 at 03:28:45PM +0100, Sebastian Andrzej Siewior wrote:
> In -RT task_blocks_on_rt_mutex() may return with -EAGAIN due to
> (->pi_blocked_on == PI_WAKEUP_INPROGRESS) before it added itself as a
> waiter. In such a case we must not call remove_waiter() because without
> a waiter it will trigger the BUG_ON() statement.
>
> This was initially reported by Yimin Deng. Thomas Gleixner fixed it then
> with an explicit check for waiters before calling remove_waiter() with
> the following note:
>
> | Guard it with rt_mutex_has_waiters(). This is a quick fix which is
> | easy to backport. The proper fix is to have a central check in
> | remove_waiter() so we can call it unconditionally.
>
> This is the suggested change.
> Now that it is possible to call remove_waiter() unconditionally I also
> remove that check from rt_mutex_slowlock().
>
> Link: https://lkml.kernel.org/r/CAAh1qt=DCL9aUXNxanP5BKtiPp3m+qj4yB+gDohhXPVFCxWwzg@mail.gmail.com
> Reported-and-debugged-by: Yimin Deng <yimin11.deng@gmail.com>
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> ---
> kernel/locking/rtmutex.c | 8 +++++---
> 1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
> index 65cc0cb984e6..57d28d8f5280 100644
> --- a/kernel/locking/rtmutex.c
> +++ b/kernel/locking/rtmutex.c
> @@ -1068,12 +1068,15 @@ static void mark_wakeup_next_waiter(struct wake_q_head *wake_q,
> static void remove_waiter(struct rt_mutex *lock,
> struct rt_mutex_waiter *waiter)
> {
> - bool is_top_waiter = (waiter == rt_mutex_top_waiter(lock));
I'm a little confused, but isn't it easier to make rt_mutex_top_waiter()
return NULL if there aren't in fact any waiters?
diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
index 68686b3ec3c1..70bcafc385c4 100644
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -52,11 +52,13 @@ static inline int rt_mutex_has_waiters(struct rt_mutex *lock)
static inline struct rt_mutex_waiter *
rt_mutex_top_waiter(struct rt_mutex *lock)
{
- struct rt_mutex_waiter *w;
+ struct rb_node *leftmost = rb_first_cached(&lock->waiters);
+ struct rt_mutex_waiter *w = NULL;
- w = rb_entry(lock->waiters.rb_leftmost,
- struct rt_mutex_waiter, tree_entry);
- BUG_ON(w->lock != lock);
+ if (leftmost) {
+ w = rb_entry(leftmost, struct rt_mutex_waiter, tree_entry);
+ BUG_ON(w->lock != lock);
+ }
return w;
}
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] kernel/rtmutex: Handle non enqueued waiters gracefully in remove_waiter()
2018-03-16 12:20 ` Peter Zijlstra
@ 2018-03-19 15:11 ` Thomas Gleixner
2018-03-27 12:14 ` [PATCH v2] locking/rtmutex: " Sebastian Andrzej Siewior
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2018-03-19 15:11 UTC (permalink / raw)
To: Peter Zijlstra; +Cc: Sebastian Andrzej Siewior, linux-kernel, Ingo Molnar
On Fri, 16 Mar 2018, Peter Zijlstra wrote:
> On Mon, Mar 12, 2018 at 03:28:45PM +0100, Sebastian Andrzej Siewior wrote:
> > In -RT task_blocks_on_rt_mutex() may return with -EAGAIN due to
> > (->pi_blocked_on == PI_WAKEUP_INPROGRESS) before it added itself as a
> > waiter. In such a case we must not call remove_waiter() because without
> > a waiter it will trigger the BUG_ON() statement.
> >
> > This was initially reported by Yimin Deng. Thomas Gleixner fixed it then
> > with an explicit check for waiters before calling remove_waiter() with
> > the following note:
> >
> > | Guard it with rt_mutex_has_waiters(). This is a quick fix which is
> > | easy to backport. The proper fix is to have a central check in
> > | remove_waiter() so we can call it unconditionally.
> >
> > This is the suggested change.
> > Now that it is possible to call remove_waiter() unconditionally I also
> > remove that check from rt_mutex_slowlock().
> >
> > Link: https://lkml.kernel.org/r/CAAh1qt=DCL9aUXNxanP5BKtiPp3m+qj4yB+gDohhXPVFCxWwzg@mail.gmail.com
> > Reported-and-debugged-by: Yimin Deng <yimin11.deng@gmail.com>
> > Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> > Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> > ---
> > kernel/locking/rtmutex.c | 8 +++++---
> > 1 file changed, 5 insertions(+), 3 deletions(-)
> >
> > diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
> > index 65cc0cb984e6..57d28d8f5280 100644
> > --- a/kernel/locking/rtmutex.c
> > +++ b/kernel/locking/rtmutex.c
> > @@ -1068,12 +1068,15 @@ static void mark_wakeup_next_waiter(struct wake_q_head *wake_q,
> > static void remove_waiter(struct rt_mutex *lock,
> > struct rt_mutex_waiter *waiter)
> > {
> > - bool is_top_waiter = (waiter == rt_mutex_top_waiter(lock));
>
> I'm a little confused, but isn't it easier to make rt_mutex_top_waiter()
> return NULL if there aren't in fact any waiters?
That works as well.
Thanks,
tglx
> diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
> index 68686b3ec3c1..70bcafc385c4 100644
> --- a/kernel/locking/rtmutex_common.h
> +++ b/kernel/locking/rtmutex_common.h
> @@ -52,11 +52,13 @@ static inline int rt_mutex_has_waiters(struct rt_mutex *lock)
> static inline struct rt_mutex_waiter *
> rt_mutex_top_waiter(struct rt_mutex *lock)
> {
> - struct rt_mutex_waiter *w;
> + struct rb_node *leftmost = rb_first_cached(&lock->waiters);
> + struct rt_mutex_waiter *w = NULL;
>
> - w = rb_entry(lock->waiters.rb_leftmost,
> - struct rt_mutex_waiter, tree_entry);
> - BUG_ON(w->lock != lock);
> + if (leftmost) {
> + w = rb_entry(leftmost, struct rt_mutex_waiter, tree_entry);
> + BUG_ON(w->lock != lock);
> + }
>
> return w;
> }
>
^ permalink raw reply [flat|nested] 9+ messages in thread
* [PATCH v2] locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter()
2018-03-19 15:11 ` Thomas Gleixner
@ 2018-03-27 12:14 ` Sebastian Andrzej Siewior
2018-03-28 21:07 ` [tip:locking/core] " tip-bot for Peter Zijlstra
0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-03-27 12:14 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: Peter Zijlstra, linux-kernel, Ingo Molnar
From: Peter Zijlstra <peterz@infradead.org>
In -RT task_blocks_on_rt_mutex() may return with -EAGAIN due to
(->pi_blocked_on == PI_WAKEUP_INPROGRESS) before it added itself as a
waiter. In such a case we must not call remove_waiter() because without
a waiter it will trigger the BUG_ON() statement.
This was initially reported by Yimin Deng. Thomas Gleixner fixed it then
with an explicit check for waiters before calling remove_waiter().
Instead of an explicit check before calling rt_mutex_top_waiter() we
could make it return NULL if there are no waiters.
Now that it is possible to call remove_waiter() unconditionally I also
remove that check from rt_mutex_slowlock().
Link: https://lkml.kernel.org/CAAh1qt=DCL9aUXNxanP5BKtiPp3m+qj4yB+gDohhXPVFCxWwzg@mail.gmail.com
Reported-and-debugged-by: Yimin Deng <yimin11.deng@gmail.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
kernel/locking/rtmutex.c | 3 +--
kernel/locking/rtmutex_common.h | 11 ++++++-----
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 65cc0cb984e6..355716d03b1a 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1268,8 +1268,7 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
if (unlikely(ret)) {
__set_current_state(TASK_RUNNING);
- if (rt_mutex_has_waiters(lock))
- remove_waiter(lock, &waiter);
+ remove_waiter(lock, &waiter);
rt_mutex_handle_deadlock(ret, chwalk, &waiter);
}
diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
index 68686b3ec3c1..d1d62f942be2 100644
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -52,12 +52,13 @@ static inline int rt_mutex_has_waiters(struct rt_mutex *lock)
static inline struct rt_mutex_waiter *
rt_mutex_top_waiter(struct rt_mutex *lock)
{
- struct rt_mutex_waiter *w;
-
- w = rb_entry(lock->waiters.rb_leftmost,
- struct rt_mutex_waiter, tree_entry);
- BUG_ON(w->lock != lock);
+ struct rb_node *leftmost = rb_first_cached(&lock->waiters);
+ struct rt_mutex_waiter *w = NULL;
+ if (leftmost) {
+ w = rb_entry(leftmost, struct rt_mutex_waiter, tree_entry);
+ BUG_ON(w->lock != lock);
+ }
return w;
}
--
2.16.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* [tip:locking/core] locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter()
2018-03-27 12:14 ` [PATCH v2] locking/rtmutex: " Sebastian Andrzej Siewior
@ 2018-03-28 21:07 ` tip-bot for Peter Zijlstra
0 siblings, 0 replies; 9+ messages in thread
From: tip-bot for Peter Zijlstra @ 2018-03-28 21:07 UTC (permalink / raw)
To: linux-tip-commits
Cc: linux-kernel, yimin11.deng, mingo, peterz, tglx, hpa, bigeasy
Commit-ID: c28d62cf52d791ba5f6db7ce525ed06b86291c82
Gitweb: https://git.kernel.org/tip/c28d62cf52d791ba5f6db7ce525ed06b86291c82
Author: Peter Zijlstra <peterz@infradead.org>
AuthorDate: Tue, 27 Mar 2018 14:14:38 +0200
Committer: Thomas Gleixner <tglx@linutronix.de>
CommitDate: Wed, 28 Mar 2018 23:01:30 +0200
locking/rtmutex: Handle non enqueued waiters gracefully in remove_waiter()
In -RT task_blocks_on_rt_mutex() may return with -EAGAIN due to
(->pi_blocked_on == PI_WAKEUP_INPROGRESS) before it added itself as a
waiter. In such a case remove_waiter() must not be called because without a
waiter it will trigger the BUG_ON() statement.
This was initially reported by Yimin Deng. Thomas Gleixner fixed it then
with an explicit check for waiters before calling remove_waiter().
Instead of an explicit NULL check before calling rt_mutex_top_waiter() make
the function return NULL if there are no waiters. With that fixed the now
pointless NULL check is removed from rt_mutex_slowlock().
Reported-and-debugged-by: Yimin Deng <yimin11.deng@gmail.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/CAAh1qt=DCL9aUXNxanP5BKtiPp3m+qj4yB+gDohhXPVFCxWwzg@mail.gmail.com
Link: https://lkml.kernel.org/r/20180327121438.sss7hxg3crqy4ecd@linutronix.de
---
kernel/locking/rtmutex.c | 3 +--
kernel/locking/rtmutex_common.h | 11 ++++++-----
2 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 940633c63254..4f014be7a4b8 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1268,8 +1268,7 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
if (unlikely(ret)) {
__set_current_state(TASK_RUNNING);
- if (rt_mutex_has_waiters(lock))
- remove_waiter(lock, &waiter);
+ remove_waiter(lock, &waiter);
rt_mutex_handle_deadlock(ret, chwalk, &waiter);
}
diff --git a/kernel/locking/rtmutex_common.h b/kernel/locking/rtmutex_common.h
index 68686b3ec3c1..d1d62f942be2 100644
--- a/kernel/locking/rtmutex_common.h
+++ b/kernel/locking/rtmutex_common.h
@@ -52,12 +52,13 @@ static inline int rt_mutex_has_waiters(struct rt_mutex *lock)
static inline struct rt_mutex_waiter *
rt_mutex_top_waiter(struct rt_mutex *lock)
{
- struct rt_mutex_waiter *w;
-
- w = rb_entry(lock->waiters.rb_leftmost,
- struct rt_mutex_waiter, tree_entry);
- BUG_ON(w->lock != lock);
+ struct rb_node *leftmost = rb_first_cached(&lock->waiters);
+ struct rt_mutex_waiter *w = NULL;
+ if (leftmost) {
+ w = rb_entry(leftmost, struct rt_mutex_waiter, tree_entry);
+ BUG_ON(w->lock != lock);
+ }
return w;
}
^ permalink raw reply related [flat|nested] 9+ messages in thread
* kernel BUG at kernel/rtmutex_common.h:75
@ 2015-11-04 14:35 Yimin Deng
2015-11-06 14:41 ` Thomas Gleixner
0 siblings, 1 reply; 9+ messages in thread
From: Yimin Deng @ 2015-11-04 14:35 UTC (permalink / raw)
To: linux-rt-users
I encountered “kernel BUG” which was reported in the
rt_mutex_top_waiter() at kernel/rtmutex_common.h:75.
Linux version: 3.12.37-rt51; CONFIG_PREEMPT_RT_FULL is disabled.
Architecture: PowerPC
We ported an application from pSOS RTOS to Linux using the
Xenomai-Mercury (=library to map pSOS task to POSIX threads). And We
have several threads running in the real-time priority domain.
ThreadA: running at prio -59. pthread_mutex_lock() +
pthread_cond_timedwait() + pthread_mutex_unlock()
ThreadB: running at prio -84. pthread_mutex_lock() +
pthread_cond_signal() + pthread_mutex_unlock()
ThreadA:
------ ------
futex_wait_requeue_pi()
futex_wait_queue_me()
<timed out>
raw_spin_lock_irq(¤t->pi_lock);
if (current->pi_blocked_on) {
raw_spin_unlock_irq(¤t->pi_lock);
} else {
current->pi_blocked_on = PI_WAKEUP_INPROGRESS;
raw_spin_unlock_irq(¤t->pi_lock);
<-- ThreadA was interrupted and preempted!
spin_lock(&hb->lock);
ThreadB:
------ ------
rt_mutex_start_proxy_lock();
task_blocks_on_rt_mutex(); <-- return "-EAGAIN" due to
"task->pi_blocked_on == PI_WAKEUP_INPROGRESS"
...
if (unlikely(ret))
remove_waiter(lock, waiter);
int first = (waiter == rt_mutex_top_waiter(lock)); <--
BUG_ON(w->lock != lock);
It seems that the purpose to call the remove_waiter() is to remove the
waiter added by “plist_add(&waiter->list_entry, &lock->wait_list);” in
the task_blocks_on_rt_mutex(). But in the scenario above there's no
waiter on the lock yet and
the waiter has not been added into the wait list of the lock in the
task_blocks_on_rt_mutex() due to the failure “-EAGAIN”. So it reported
kernel BUG in the rt_mutex_top_waiter().
I modified it as below and the issue seems disappear.
- if (unlikely(ret))
+ if (unlikely(ret && (-EAGAIN != ret)))
remove_waiter(lock, waiter);
Could the scenario above be possible? If so, how to resolve this issue?
Thanks!
B.R.
Yimin Deng
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: kernel BUG at kernel/rtmutex_common.h:75
2015-11-04 14:35 kernel BUG at kernel/rtmutex_common.h:75 Yimin Deng
@ 2015-11-06 14:41 ` Thomas Gleixner
2015-11-07 18:09 ` Thomas Gleixner
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2015-11-06 14:41 UTC (permalink / raw)
To: Yimin Deng; +Cc: linux-rt-users
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1857 bytes --]
B1;2802;0cOn Wed, 4 Nov 2015, Yimin Deng wrote:
> It seems that the purpose to call the remove_waiter() is to remove the
> waiter added by “plist_add(&waiter->list_entry, &lock->wait_list);” in
> the task_blocks_on_rt_mutex(). But in the scenario above there's no
> waiter on the lock yet and
> the waiter has not been added into the wait list of the lock in the
> task_blocks_on_rt_mutex() due to the failure “-EAGAIN”. So it reported
> kernel BUG in the rt_mutex_top_waiter().
>
> I modified it as below and the issue seems disappear.
> - if (unlikely(ret))
> + if (unlikely(ret && (-EAGAIN != ret)))
> remove_waiter(lock, waiter);
>
> Could the scenario above be possible? If so, how to resolve this issue?
> Thanks!
Yes it is possible. Nice detective work!
Your solution is correct, but actually it's not sufficient, because we
have another possibility to return early without being queued
(-EDEADLOCK). Find the full solution below.
Thanks for tracking that down!
tglx
---
diff --git a/kernel/rtmutex.c b/kernel/rtmutex.c
index 7601c1332a88..0e6505d5ce4a 100644
--- a/kernel/rtmutex.c
+++ b/kernel/rtmutex.c
@@ -1003,11 +1003,18 @@ static void wakeup_next_waiter(struct rt_mutex *lock)
static void remove_waiter(struct rt_mutex *lock,
struct rt_mutex_waiter *waiter)
{
- bool is_top_waiter = (waiter == rt_mutex_top_waiter(lock));
struct task_struct *owner = rt_mutex_owner(lock);
struct rt_mutex *next_lock = NULL;
+ bool is_top_waiter = false;
unsigned long flags;
+ /*
+ * @waiter might be not queued when task_blocks_on_rt_mutex()
+ * returned early so @lock might not have any waiters.
+ */
+ if (rt_mutex_has_waiters())
+ is_top_waiter = (waiter == rt_mutex_top_waiter(lock));
+
raw_spin_lock_irqsave(¤t->pi_lock, flags);
rt_mutex_dequeue(lock, waiter);
current->pi_blocked_on = NULL;
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: kernel BUG at kernel/rtmutex_common.h:75
2015-11-06 14:41 ` Thomas Gleixner
@ 2015-11-07 18:09 ` Thomas Gleixner
2015-11-08 3:31 ` Yimin Deng
0 siblings, 1 reply; 9+ messages in thread
From: Thomas Gleixner @ 2015-11-07 18:09 UTC (permalink / raw)
To: Yimin Deng; +Cc: linux-rt-users
[-- Attachment #1: Type: TEXT/PLAIN, Size: 1184 bytes --]
On Fri, 6 Nov 2015, Thomas Gleixner wrote:
> On Wed, 4 Nov 2015, Yimin Deng wrote:
> > It seems that the purpose to call the remove_waiter() is to remove the
> > waiter added by “plist_add(&waiter->list_entry, &lock->wait_list);” in
> > the task_blocks_on_rt_mutex(). But in the scenario above there's no
> > waiter on the lock yet and
> > the waiter has not been added into the wait list of the lock in the
> > task_blocks_on_rt_mutex() due to the failure “-EAGAIN”. So it reported
> > kernel BUG in the rt_mutex_top_waiter().
> >
> > I modified it as below and the issue seems disappear.
> > - if (unlikely(ret))
> > + if (unlikely(ret && (-EAGAIN != ret)))
> > remove_waiter(lock, waiter);
> >
> > Could the scenario above be possible? If so, how to resolve this issue?
> > Thanks!
>
> Yes it is possible. Nice detective work!
>
> Your solution is correct, but actually it's not sufficient, because we
> have another possibility to return early without being queued
> (-EDEADLOCK). Find the full solution below.
>
> Thanks for tracking that down!
Btw, please update to 3.12.48-rt66. It contains quite some bugfixes in
the area of futex/rtmutex.
Thanks,
tglx
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: kernel BUG at kernel/rtmutex_common.h:75
2015-11-07 18:09 ` Thomas Gleixner
@ 2015-11-08 3:31 ` Yimin Deng
0 siblings, 0 replies; 9+ messages in thread
From: Yimin Deng @ 2015-11-08 3:31 UTC (permalink / raw)
To: Thomas Gleixner; +Cc: linux-rt-users
2015-11-08 2:09 GMT+08:00 Thomas Gleixner <tglx@linutronix.de>:
> On Fri, 6 Nov 2015, Thomas Gleixner wrote:
> Btw, please update to 3.12.48-rt66. It contains quite some bugfixes in
> the area of futex/rtmutex.
>
> Thanks,
>
> tglx
On Fri, 6 Nov 2015, Thomas Gleixner wrote:
>> diff --git a/kernel/rtmutex.c b/kernel/rtmutex.c
>> index 7601c1332a88..0e6505d5ce4a 100644
>> --- a/kernel/rtmutex.c
>> +++ b/kernel/rtmutex.c
>> @@ -1003,11 +1003,18 @@ static void wakeup_next_waiter(struct rt_mutex *lock)
>> static void remove_waiter(struct rt_mutex *lock,
>> struct rt_mutex_waiter *waiter)
>> {
>> - bool is_top_waiter = (waiter == rt_mutex_top_waiter(lock));
>> struct task_struct *owner = rt_mutex_owner(lock);
>> struct rt_mutex *next_lock = NULL;
>> + bool is_top_waiter = false;
>> unsigned long flags;
>>
>> + /*
>> + * @waiter might be not queued when task_blocks_on_rt_mutex()
>> + * returned early so @lock might not have any waiters.
>> + */
>> + if (rt_mutex_has_waiters())
>> + is_top_waiter = (waiter == rt_mutex_top_waiter(lock));
>> +
>> raw_spin_lock_irqsave(¤t->pi_lock, flags);
>> rt_mutex_dequeue(lock, waiter);
>> current->pi_blocked_on = NULL;
Sincerely appreciate for your answers!
Could it be modified as below? It seems not necessary to call
rt_mutex_dequeue() and clear the current->pi_blocked_on if there's no
waiter on the lock. And the waiter->task is not the 'current' when the
remove_waiter() is called in the function rt_mutex_start_proxy_lock().
static void remove_waiter(struct rt_mutex *lock,
struct rt_mutex_waiter *waiter)
{
- bool is_top_waiter = (waiter == rt_mutex_top_waiter(lock));
struct task_struct *owner = rt_mutex_owner(lock);
struct rt_mutex *next_lock = NULL;
+ bool is_top_waiter;
+ struct task_struct *task = waiter->task;
unsigned long flags;
+ /*
+ * @waiter might be not queued when task_blocks_on_rt_mutex()
+ * returned early so @lock might not have any waiters.
+ */
+ if (unlikely(!rt_mutex_has_waiters()))
+ return;
+
+ is_top_waiter = (waiter == rt_mutex_top_waiter(lock));
+
- raw_spin_lock_irqsave(¤t->pi_lock, flags);
- rt_mutex_dequeue(lock, waiter);
- current->pi_blocked_on = NULL;
- raw_spin_unlock_irqrestore(¤t->pi_lock, flags);
+ raw_spin_lock_irqsave(&task->pi_lock, flags);
+ rt_mutex_dequeue(lock, waiter);
+ task->pi_blocked_on = NULL;
+ __rt_mutex_adjust_prio(task); <-- I'm not sure if it is necessary.
+ raw_spin_unlock_irqrestore(&task->pi_lock, flags);
......
- rt_mutex_adjust_prio_chain(owner, 0, lock, next_lock, NULL, current);
+ rt_mutex_adjust_prio_chain(owner, 0, lock, next_lock, NULL, task);
......
}
I'm sorry for so many questions.
Thanks ahead!
B.R.
Yimin Deng
^ permalink raw reply [flat|nested] 9+ messages in thread
end of thread, other threads:[~2018-03-28 21:07 UTC | newest]
Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-03-12 14:28 [PATCH] kernel/rtmutex: Handle non enqueued waiters gracefully in remove_waiter() Sebastian Andrzej Siewior
2018-03-16 12:20 ` Peter Zijlstra
2018-03-19 15:11 ` Thomas Gleixner
2018-03-27 12:14 ` [PATCH v2] locking/rtmutex: " Sebastian Andrzej Siewior
2018-03-28 21:07 ` [tip:locking/core] " tip-bot for Peter Zijlstra
-- strict thread matches above, loose matches on Subject: below --
2015-11-04 14:35 kernel BUG at kernel/rtmutex_common.h:75 Yimin Deng
2015-11-06 14:41 ` Thomas Gleixner
2015-11-07 18:09 ` Thomas Gleixner
2015-11-08 3:31 ` Yimin Deng
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.