From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Thomas Gleixner <tglx@linutronix.de>
Cc: LKML <linux-kernel@vger.kernel.org>,
Sebastian Sewior <bigeasy@linutronix.de>,
Anna-Maria Gleixner <anna-maria@linutronix.de>,
Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>
Subject: Re: [PATCH] hrtimer: Reset hrtimer cpu base proper on CPU hotplug
Date: Fri, 26 Jan 2018 14:09:17 -0800
Message-ID: <20180126220917.GI3741@linux.vnet.ibm.com>
In-Reply-To: <alpine.DEB.2.20.1801261447590.2067@nanos>

On Fri, Jan 26, 2018 at 02:54:32PM +0100, Thomas Gleixner wrote:
> The hrtimer interrupt code contains a hang detection and mitigation
> mechanism, which prevents a long delayed hrtimer interrupt from causing a
> continuous retriggering of interrupts which prevents the system from making
> progress. If a hang is detected then the timer hardware is programmed with
> a certain delay into the future and a flag is set in the hrtimer cpu base
> which prevents newly enqueued timers from reprogramming the timer hardware
> prior to the chosen delay. The subsequent hrtimer interrupt after the delay
> clears the flag and resumes normal operation.
>
> If such a hang happens in the last hrtimer interrupt before a CPU is
> unplugged then the hang_detected flag is set and stays that way when the
> CPU is plugged in again. At that point the timer hardware is not armed and
> it cannot be armed because the hang_detected flag is still active, so
> nothing clears that flag. As a consequence the CPU does not receive hrtimer
> interrupts and no timers expire on that CPU which results in RCU stalls and
> other malfunctions.
>
> Clear the flag along with some other less critical members of the hrtimer
> cpu base to ensure starting from a clean state when a CPU is plugged in.
>
> Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the
> root cause of that hard to reproduce heisenbug. Once understood it's
> trivial and certainly justifies a brown paperbag.

Thank you very much, and I do know that feeling! After reading the
commit log, I feel significantly less incompetent for having failed to
find this one. ;-) But it did pass rcutorture testing for a great many
years, didn't it? :-/

I have started an eight-hour seven-way test on the dreaded rcutorture
TREE01 scenario. In the meantime, off to the train!

							Thanx, Paul

> Fixes: 41d2e4949377 ("hrtimer: Tune hrtimer_interrupt hang logic")
> Reported-by: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Cc: stable@vger.kernel.org
> ---
> kernel/time/hrtimer.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> --- a/kernel/time/hrtimer.c
> +++ b/kernel/time/hrtimer.c
> @@ -655,7 +655,9 @@ static void hrtimer_reprogram(struct hrt
>  static inline void hrtimer_init_hres(struct hrtimer_cpu_base *base)
>  {
>  	base->expires_next = KTIME_MAX;
> +	base->hang_detected = 0;
>  	base->hres_active = 0;
> +	base->next_timer = NULL;
>  }
>
> /*
> @@ -1589,6 +1591,7 @@ int hrtimers_prepare_cpu(unsigned int cp
>  		timerqueue_init_head(&cpu_base->clock_base[i].active);
>  	}
>  
> +	cpu_base->active_bases = 0;
>  	cpu_base->cpu = cpu;
>  	hrtimer_init_hres(cpu_base);
>  	return 0;
>
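
For readers following along, the failure mode in the commit log can be
modelled in a few lines of userspace C. This is only a rough sketch, not
the kernel code: the model_* names, the hw_armed field and the numeric
times are hypothetical, and only expires_next, hang_detected and
hres_active are taken from the patch above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_KTIME_MAX INT64_MAX	/* stand-in for KTIME_MAX */

/* Hypothetical, heavily reduced model of the per-CPU hrtimer base. */
struct model_cpu_base {
	int64_t expires_next;
	bool hang_detected;
	bool hres_active;
	bool hw_armed;		/* assumption: models "timer hardware is armed" */
};

/* A hang was detected: program a delay and block further reprogramming. */
static void model_hang(struct model_cpu_base *b, int64_t now, int64_t delay)
{
	b->expires_next = now + delay;
	b->hang_detected = true;
	b->hw_armed = true;
}

/* The delayed interrupt clears the flag and resumes normal operation. */
static void model_interrupt(struct model_cpu_base *b)
{
	b->hang_detected = false;
	b->hw_armed = false;
	b->expires_next = MODEL_KTIME_MAX;
}

/* Enqueue a timer: the hardware is not touched while hang_detected is set. */
static void model_enqueue(struct model_cpu_base *b, int64_t expires)
{
	if (b->hang_detected)
		return;
	if (expires < b->expires_next) {
		b->expires_next = expires;
		b->hw_armed = true;
	}
}

/* Unplug: the hardware event is gone, the software state is left alone. */
static void model_unplug(struct model_cpu_base *b)
{
	b->hw_armed = false;
}

/* Plug-in preparation; with_fix selects whether hang_detected is reset. */
static void model_prepare_cpu(struct model_cpu_base *b, bool with_fix)
{
	b->expires_next = MODEL_KTIME_MAX;
	if (with_fix)
		b->hang_detected = false;
	b->hres_active = false;
}

/* Returns true if a timer enqueued after replug arms the hardware. */
static bool model_replug_arms_timer(bool with_fix)
{
	struct model_cpu_base b = { .expires_next = MODEL_KTIME_MAX };

	model_hang(&b, 1000, 100);	/* hang in the last interrupt ... */
	model_unplug(&b);		/* ... right before the CPU goes away */
	model_prepare_cpu(&b, with_fix);
	model_enqueue(&b, 5000);
	return b.hw_armed;
}

/* Normal case: the delayed interrupt clears the flag, enqueue works again. */
static void model_demo(void)
{
	struct model_cpu_base b = { .expires_next = MODEL_KTIME_MAX };

	model_hang(&b, 1000, 100);
	model_interrupt(&b);
	model_enqueue(&b, 5000);
	assert(b.hw_armed);
}
```

In this model, model_replug_arms_timer(false) leaves the hardware
unarmed because nothing ever clears the stale flag, while
model_replug_arms_timer(true) arms it again, which is the effect the
patch aims for.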
Thread overview: 9+ messages
2018-01-26 13:54 [PATCH] hrtimer: Reset hrtimer cpu base proper on CPU hotplug Thomas Gleixner
2018-01-26 22:09 ` Paul E. McKenney [this message]
2018-01-28 0:53 ` Paul E. McKenney
2018-01-29 8:20 ` Sebastian Sewior
2018-01-29 9:57 ` Paul E. McKenney
2018-01-29 23:43 ` Paul E. McKenney
2018-01-30 21:03 ` Thomas Gleixner
2018-01-31 0:52 ` Paul E. McKenney
2018-01-27 14:31 ` [tip:timers/urgent] " tip-bot for Thomas Gleixner