All of lore.kernel.org
 help / color / mirror / Atom feed
* kernel BUG at kernel/rtmutex_common.h:75
@ 2015-11-04 14:35 Yimin Deng
  2015-11-06 14:41 ` Thomas Gleixner
  0 siblings, 1 reply; 9+ messages in thread
From: Yimin Deng @ 2015-11-04 14:35 UTC (permalink / raw)
  To: linux-rt-users

I encountered “kernel BUG” which was reported in the
rt_mutex_top_waiter() at kernel/rtmutex_common.h:75.

Linux version: 3.12.37-rt51; CONFIG_PREEMPT_RT_FULL is disabled.
Architecture: PowerPC

We ported an application from pSOS RTOS to Linux using the
Xenomai-Mercury (=library to map pSOS task to POSIX threads). And We
have several threads running in the real-time priority domain.
ThreadA: running at prio -59. pthread_mutex_lock() +
pthread_cond_timedwait() + pthread_mutex_unlock()
ThreadB: running at prio -84. pthread_mutex_lock() +
pthread_cond_signal() + pthread_mutex_unlock()

ThreadA:
------ ------
futex_wait_requeue_pi()
    futex_wait_queue_me()
    <timed out>

    raw_spin_lock_irq(&current->pi_lock);
    if (current->pi_blocked_on) {
        raw_spin_unlock_irq(&current->pi_lock);
    } else {
        current->pi_blocked_on = PI_WAKEUP_INPROGRESS;
        raw_spin_unlock_irq(&current->pi_lock);
            <-- ThreadA was interrupted and preempted!
        spin_lock(&hb->lock);


ThreadB:
------ ------
rt_mutex_start_proxy_lock();
    task_blocks_on_rt_mutex();  <-- return "-EAGAIN" due to
"task->pi_blocked_on == PI_WAKEUP_INPROGRESS"
    ...
    if (unlikely(ret))
        remove_waiter(lock, waiter);
            int first = (waiter == rt_mutex_top_waiter(lock));  <--
BUG_ON(w->lock != lock);

It seems that the purpose to call the remove_waiter() is to remove the
waiter added by “plist_add(&waiter->list_entry, &lock->wait_list);” in
the task_blocks_on_rt_mutex(). But in the scenario above there's no
waiter on the lock yet and
the waiter has not been added into the wait list of the lock in the
task_blocks_on_rt_mutex() due to the failure “-EAGAIN”. So it reported
kernel BUG in the rt_mutex_top_waiter().

I modified it as below and the issue seems disappear.
- if (unlikely(ret))
+ if (unlikely(ret && (-EAGAIN != ret)))
       remove_waiter(lock, waiter);

Could the scenario above be possible? If so, how to resolve this issue?
Thanks!

B.R.
Yimin Deng
--
To unsubscribe from this list: send the line "unsubscribe linux-rt-users" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 9+ messages in thread
* [PATCH] kernel/rtmutex: Handle non enqueued waiters gracefully in remove_waiter()
@ 2018-03-12 14:28 Sebastian Andrzej Siewior
  2018-03-16 12:20 ` Peter Zijlstra
  0 siblings, 1 reply; 9+ messages in thread
From: Sebastian Andrzej Siewior @ 2018-03-12 14:28 UTC (permalink / raw)
  To: linux-kernel; +Cc: Peter Zijlstra, Ingo Molnar, Thomas Gleixner

In -RT task_blocks_on_rt_mutex() may return with -EAGAIN due to
(->pi_blocked_on == PI_WAKEUP_INPROGRESS) before it added itself as a
waiter. In such a case we must not call remove_waiter() because without
a waiter it will trigger the BUG_ON() statement.

This was initially reported by Yimin Deng. Thomas Gleixner fixed it then
with an explicit check for waiters before calling remove_waiter() with
the following note:

| Guard it with rt_mutex_has_waiters(). This is a quick fix which is
| easy to backport. The proper fix is to have a central check in
| remove_waiter() so we can call it unconditionally.

This is the suggested change.
Now that it is possible to call remove_waiter() unconditionally I also
remove that check from rt_mutex_slowlock().

Link: https://lkml.kernel.org/r/CAAh1qt=DCL9aUXNxanP5BKtiPp3m+qj4yB+gDohhXPVFCxWwzg@mail.gmail.com
Reported-and-debugged-by: Yimin Deng <yimin11.deng@gmail.com>
Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 kernel/locking/rtmutex.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/kernel/locking/rtmutex.c b/kernel/locking/rtmutex.c
index 65cc0cb984e6..57d28d8f5280 100644
--- a/kernel/locking/rtmutex.c
+++ b/kernel/locking/rtmutex.c
@@ -1068,12 +1068,15 @@ static void mark_wakeup_next_waiter(struct wake_q_head *wake_q,
 static void remove_waiter(struct rt_mutex *lock,
 			  struct rt_mutex_waiter *waiter)
 {
-	bool is_top_waiter = (waiter == rt_mutex_top_waiter(lock));
+	bool is_top_waiter = false;
 	struct task_struct *owner = rt_mutex_owner(lock);
 	struct rt_mutex *next_lock;
 
 	lockdep_assert_held(&lock->wait_lock);
 
+	if (rt_mutex_has_waiters(lock))
+		is_top_waiter = waiter == rt_mutex_top_waiter(lock);
+
 	raw_spin_lock(&current->pi_lock);
 	rt_mutex_dequeue(lock, waiter);
 	current->pi_blocked_on = NULL;
@@ -1268,8 +1271,7 @@ rt_mutex_slowlock(struct rt_mutex *lock, int state,
 
 	if (unlikely(ret)) {
 		__set_current_state(TASK_RUNNING);
-		if (rt_mutex_has_waiters(lock))
-			remove_waiter(lock, &waiter);
+		remove_waiter(lock, &waiter);
 		rt_mutex_handle_deadlock(ret, chwalk, &waiter);
 	}
 
-- 
2.16.2

^ permalink raw reply related	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2018-03-28 21:07 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-04 14:35 kernel BUG at kernel/rtmutex_common.h:75 Yimin Deng
2015-11-06 14:41 ` Thomas Gleixner
2015-11-07 18:09   ` Thomas Gleixner
2015-11-08  3:31     ` Yimin Deng
2018-03-12 14:28 [PATCH] kernel/rtmutex: Handle non enqueued waiters gracefully in remove_waiter() Sebastian Andrzej Siewior
2018-03-16 12:20 ` Peter Zijlstra
2018-03-19 15:11   ` Thomas Gleixner
2018-03-27 12:14     ` [PATCH v2] locking/rtmutex: " Sebastian Andrzej Siewior
2018-03-28 21:07       ` [tip:locking/core] " tip-bot for Peter Zijlstra

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.