linux-kernel.vger.kernel.org archive mirror
From: Steven Rostedt <rostedt@goodmis.org>
To: linux-kernel@vger.kernel.org,
	linux-rt-users <linux-rt-users@vger.kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
	Carsten Emde <C.Emde@osadl.org>,
	Sebastian Andrzej Siewior <bigeasy@linutronix.de>,
	John Kacur <jkacur@redhat.com>,
	Paul Gortmaker <paul.gortmaker@windriver.com>,
	<stable-rt@vger.kernel.org>
Subject: [PATCH RT 07/14] rt: Make cpu_chill() use hrtimer instead of msleep()
Date: Fri, 28 Feb 2014 22:52:16 -0500
Message-ID: <20140301035236.823382489@goodmis.org>
In-Reply-To: 20140301035209.031474616@goodmis.org

[-- Attachment #1: 0007-rt-Make-cpu_chill-use-hrtimer-instead-of-msleep.patch --]
[-- Type: text/plain, Size: 3760 bytes --]

3.10.32-rt31-rc2 stable review patch.
If anyone has any objections, please let me know.

------------------

From: Steven Rostedt <rostedt@goodmis.org>

Ulrich Obergfell pointed out that cpu_chill() calls msleep(), which is woken
up by ksoftirqd running the TIMER softirq. But as cpu_chill() is called from
softirq context, it may block ksoftirqd from running, in which case nothing
ever wakes up the msleep() and we deadlock.

I checked the vmcore, and irq/74-qla2xxx is stuck in the msleep() call,
running on CPU 8. The one ksoftirqd that is stuck happens to be the one that
runs on CPU 8, and it is blocked on a lock held by irq/74-qla2xxx. As that
ksoftirqd is the one that would wake up irq/74-qla2xxx, and it is blocked on
a lock that irq/74-qla2xxx holds, we have our deadlock.

The solution is not to convert cpu_chill() back to a cpu_relax(), as that
would re-create the possible livelock that cpu_chill() fixed earlier, and may
also leave this bug open in other softirqs. The fix is to remove the
dependency on ksoftirqd from cpu_chill(). That is, instead of calling
msleep(), which requires ksoftirqd to wake it up, use the
hrtimer_nanosleep() code, which does the wakeup from hard irq context.

|Looks to be the lock of the block softirq. I don't have the core dump
|anymore, but from what I could tell the ksoftirqd was blocked on the
|block softirq lock, where the block softirq handler did a msleep
|(called by the qla2xxx interrupt handler).
|
|Looking at trigger_softirq() in block/blk-softirq.c, it can do a
|smp_callfunction() to another cpu to run the block softirq. If that
|happens to be the cpu where the qla2xxx irq handler is doing the block
|softirq and is in the middle of a msleep(), I believe the ksoftirqd will
|try to run the softirq. If it does that, then BOOM, it's deadlocked
|because the ksoftirqd will never run the timer softirq either.

|I should have also stated that it was only one lock that was involved.
|But the lock owner was doing a msleep() that requires a wakeup by
|ksoftirqd to continue. If ksoftirqd happens to be blocked on a lock
|held by the msleep() caller, then you have your deadlock.
|
|It's best not to have any softirqs going to sleep requiring another
|softirq to wake it up. Note, if we ever require a timer softirq to do a
|cpu_chill() it will most definitely hit this deadlock.

Cc: stable-rt@vger.kernel.org
Found-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
[bigeasy: add the 4 | chapters from email]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
 include/linux/delay.h |  2 +-
 kernel/hrtimer.c      | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/include/linux/delay.h b/include/linux/delay.h
index e23a7c0..37caab3 100644
--- a/include/linux/delay.h
+++ b/include/linux/delay.h
@@ -53,7 +53,7 @@ static inline void ssleep(unsigned int seconds)
 }
 
 #ifdef CONFIG_PREEMPT_RT_FULL
-# define cpu_chill()	msleep(1)
+extern void cpu_chill(void);
 #else
 # define cpu_chill()	cpu_relax()
 #endif
diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index a7e90b2..b569c6d 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1887,6 +1887,21 @@ SYSCALL_DEFINE2(nanosleep, struct timespec __user *, rqtp,
 	return hrtimer_nanosleep(&tu, rmtp, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
 }
 
+#ifdef CONFIG_PREEMPT_RT_FULL
+/*
+ * Sleep for 1 ms in hope whoever holds what we want will let it go.
+ */
+void cpu_chill(void)
+{
+	struct timespec tu = {
+		.tv_nsec = NSEC_PER_MSEC,
+	};
+
+	hrtimer_nanosleep(&tu, NULL, HRTIMER_MODE_REL, CLOCK_MONOTONIC);
+}
+EXPORT_SYMBOL(cpu_chill);
+#endif
+
 /*
  * Functions related to boot-time initialization:
  */
-- 
1.8.5.3


