All of lore.kernel.org
 help / color / mirror / Atom feed
From: Darren Hart <dvhltc@us.ibm.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: Mike Galbraith <efault@gmx.de>,
	linux-rt-users <linux-rt-users@vger.kernel.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Steven Rostedt <rostedt@goodmis.org>,
	gowrishankar <gowrishankar.m@linux.vnet.ibm.com>
Subject: Re: 2.6.33.[56]-rt23: howto create repeatable explosion in wakeup_next_waiter()
Date: Thu, 08 Jul 2010 19:11:49 -0700	[thread overview]
Message-ID: <4C368565.3020806@us.ibm.com> (raw)
In-Reply-To: <alpine.LFD.2.00.1007071356480.2604@localhost.localdomain>

On 07/07/2010 04:57 AM, Thomas Gleixner wrote:
> Cc'ing Darren.
> 
> On Wed, 7 Jul 2010, Mike Galbraith wrote:
> 
>> Greetings,
>>
>> Stress testing, looking to trigger RCU stalls, I've managed to find a
>> way to repeatably create fireworks.  (got RCU stall, see attached)

<snip>
embarrassing ltp realtime/perf/latency/pthread_cond_many breakage </snip>

>> 4. run it.
>>
>> What happens here is we hit WARN_ON(pendowner->pi_blocked_on != waiter),
>> this does not make it to consoles (poking sysrq-foo doesn't either).
>> Next comes WARN_ON(!pendowner->pi_blocked_on), followed by the NULL
>> explosion, which does make it to consoles.

So the WARN_ON sequence is obviously wrong, if it's critical it should
be a BUG(), if not we shouldn't dereference what we know to be null. The
following patch avoids the NULL pointer dereference in the WARN_ON. With
this patch the NULL WARN_ON makes it to the console, and test runs to
completion with no obvious negative side effects. I'm only posting for
reference at this point, as while this may be necessary, it isn't the
right "solution".

Some other data points from what time I could spend on this today.

FC12 kernel (2.6.31 based) has Requeue PI support, but does not exhibit
this behavior.

2.6.33.5-rt23 without CONFIG_PREEMPT_RT does NOT exhibit this behavior.

2.6.33.5-rt23 does exhibit this behavior.

The minimal tracing I attempted (a handful of trace_printk's and run
with the nop plugin) all prevented the crash from happening.

There appears to be no correlation to pi_blocked_on being NULL and the
next pointer being NULL (I saw a roughly equivalent mix of NULL and
valid pointers for next when pi_blocked_on was NULL).

Tonight/Tomorrow I'll review the rtmutex and futex code to try and fully
understand (again) the usage of pi_blocked_on and if we need to avoid
this scenario, or if we need to handle it "gracefully".

>From fa6a6bee6e467d12d3774612c838703acd265ea6 Mon Sep 17 00:00:00 2001
From: Darren Hart <dvhltc@us.ibm.com>
Date: Thu, 8 Jul 2010 19:44:35 -0400
Subject: [PATCH] rtmutex: avoid warnon bug

Signed-off-by: Darren Hart <dvhltc@us.ibm.com>
---
 kernel/rtmutex.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/rtmutex.c b/kernel/rtmutex.c
index 23dd443..a2fcaa5 100644
--- a/kernel/rtmutex.c
+++ b/kernel/rtmutex.c
@@ -579,9 +579,9 @@ static void wakeup_next_waiter(struct rt_mutex *lock, int savestate)
 
 	raw_spin_lock(&pendowner->pi_lock);
 
-	WARN_ON(!pendowner->pi_blocked_on);
 	WARN_ON(pendowner->pi_blocked_on != waiter);
-	WARN_ON(pendowner->pi_blocked_on->lock != lock);
+	if (!WARN_ON(!pendowner->pi_blocked_on))
+		WARN_ON(pendowner->pi_blocked_on->lock != lock);
 
 	pendowner->pi_blocked_on = NULL;
 
-- 
1.6.5.2


-- 
Darren Hart
IBM Linux Technology Center
Real-Time Linux Team

  parent reply	other threads:[~2010-07-09  2:12 UTC|newest]

Thread overview: 29+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2010-07-07  4:46 2.6.33.[56]-rt23: howto create repeatable explosion in wakeup_next_waiter() Mike Galbraith
2010-07-07  8:03 ` Mike Galbraith
2010-07-07 11:57   ` Thomas Gleixner
2010-07-07 12:50     ` Mike Galbraith
2010-07-07 11:57 ` Thomas Gleixner
2010-07-07 14:03   ` Darren Hart
2010-07-07 14:17     ` Mike Galbraith
2010-07-08 12:05     ` Mike Galbraith
2010-07-08 14:12       ` Darren Hart
2010-07-09  2:11   ` Darren Hart [this message]
2010-07-09  4:32     ` Mike Galbraith
     [not found]     ` <4C36CD83.6070809@us.ibm.com>
2010-07-09  8:13       ` Mike Galbraith
2010-07-09 13:58       ` Mike Galbraith
2010-07-09 14:51         ` Mike Galbraith
2010-07-09 16:35         ` Darren Hart
2010-07-09 19:34           ` Mike Galbraith
2010-07-09 20:05   ` Darren Hart
2010-07-13  8:03   ` [PATCH][RT] futex: protect against pi_blocked_on corruption during requeue PI Darren Hart
2010-07-13  9:25     ` Thomas Gleixner
2010-07-13 10:28       ` Thomas Gleixner
2010-07-13 11:52         ` [PATCH][RT] futex: protect against pi_blocked_on corruption during requeue PI -V2 Thomas Gleixner
2010-07-13 15:57           ` Mike Galbraith
2010-07-13 18:59           ` Darren Hart
2010-07-18  8:32           ` Mike Galbraith
2010-07-13  9:58     ` [PATCH][RT] futex: protect against pi_blocked_on corruption during requeue PI Thomas Gleixner
2010-07-07 14:11 ` 2.6.33.[56]-rt23: howto create repeatable explosion in wakeup_next_waiter() gowrishankar
2010-07-07 14:31   ` Mike Galbraith
2010-07-07 15:05     ` Darren Hart
2010-07-07 17:45       ` Mike Galbraith

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=4C368565.3020806@us.ibm.com \
    --to=dvhltc@us.ibm.com \
    --cc=a.p.zijlstra@chello.nl \
    --cc=efault@gmx.de \
    --cc=gowrishankar.m@linux.vnet.ibm.com \
    --cc=linux-rt-users@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.