Re: [PATCH] kernel/watchdog: flush all printk nmi buffers when hardlockup detected

From: Petr Mladek <pmladek@suse.com>
To: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Kirill Tkhai <ktkhai@virtuozzo.com>,
	Peter Zijlstra <peterz@infradead.org>,
	linux-kernel@vger.kernel.org,
	Steven Rostedt <rostedt@goodmis.org>,
	Sergey Senozhatsky <sergey.senozhatsky@gmail.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dmitry Monakhov <dmtrmonakhov@yandex-team.ru>
Subject: Re: [PATCH] kernel/watchdog: flush all printk nmi buffers when hardlockup detected
Date: Wed, 12 Feb 2020 15:54:41 +0100	[thread overview]
Message-ID: <20200212145441.ukhvil6tpljsivvl@pathway.suse.cz> (raw)
In-Reply-To: <c3ce3cee-c757-f76e-8d43-33e9b9ae80ba@yandex-team.ru>

On Tue 2020-02-11 15:36:02, Konstantin Khlebnikov wrote:
> On 11/02/2020 11.14, Kirill Tkhai wrote:
> > Hi, Konstantin,
> > 
> > On 10.02.2020 12:48, Konstantin Khlebnikov wrote:
> > > In NMI context printk() could save messages into per-cpu buffers and
> > > schedule flush by irq_work when IRQ are unblocked. This means message
> > > about hardlockup appears in kernel log only when/if lockup is gone.
> > > 
> > > Comment in irq_work_queue_on() states that remote IPI aren't NMI safe
> > > thus printk() cannot schedule flush work to another cpu.
> > > 
> > > This patch adds simple atomic counter of detected hardlockups and
> > > flushes all per-cpu printk buffers in context softlockup watchdog
> > > at any other cpu when it sees changes of this counter.
> > > 
> > > Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> > > ---
> > >   include/linux/nmi.h   |    1 +
> > >   kernel/watchdog.c     |   22 ++++++++++++++++++++++
> > >   kernel/watchdog_hld.c |    1 +
> > >   3 files changed, 24 insertions(+)
> > > 
> > > diff --git a/include/linux/nmi.h b/include/linux/nmi.h
> > > index 9003e29cde46..8406df72ae5a 100644
> > > --- a/include/linux/nmi.h
> > > +++ b/include/linux/nmi.h
> > > @@ -84,6 +84,7 @@ static inline void reset_hung_task_detector(void) { }
> > >   #if defined(CONFIG_HARDLOCKUP_DETECTOR)
> > >   extern void hardlockup_detector_disable(void);
> > >   extern unsigned int hardlockup_panic;
> > > +extern atomic_t hardlockup_detected;
> > >   #else
> > >   static inline void hardlockup_detector_disable(void) {}
> > >   #endif
> > > diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> > > index b6b1f54a7837..9f5c68fababe 100644
> > > --- a/kernel/watchdog.c
> > > +++ b/kernel/watchdog.c
> > > @@ -92,6 +92,26 @@ static int __init hardlockup_all_cpu_backtrace_setup(char *str)
> > >   }
> > >   __setup("hardlockup_all_cpu_backtrace=", hardlockup_all_cpu_backtrace_setup);
> > >   # endif /* CONFIG_SMP */
> > > +
> > > +atomic_t hardlockup_detected = ATOMIC_INIT(0);
> > > +
> > > +static inline void flush_hardlockup_messages(void)
> > > +{
> > > +	static atomic_t flushed = ATOMIC_INIT(0);
> > > +
> > > +	/* flush messages from hard lockup detector */
> > > +	if (atomic_read(&hardlockup_detected) != atomic_read(&flushed)) {
> > > +		atomic_set(&flushed, atomic_read(&hardlockup_detected));
> > > +		printk_safe_flush();
> > > +	}
> > > +}
> > 
> > Do we really need two variables here? They may come into two different
> > cache lines, and there will be double cache pollution just because of
> > this simple check. Why not the below?
> 
> I don't think anybody could notice read-only access to second variable.
> This executes once in several seconds.
> 
> Watchdogs already use same pattern (monotonic counter + snapshot) in
> couple places. So code looks more clean in this way.

It is not only about speed. It is also about code complexity
and correctness. Using two variables is more complex.
Calling atomic_read(&hardlockup_detected) twice does
not look like a correct pattern.

The single variable patter is used for similar things there
as well, for example, see hard_watchdog_warn,
hardlockup_allcpu_dumped.

I would call the variable "hardlockup_dump_flush" and
use the same logic as for hardlockup_allcpu_dumped.

Note that simple READ_ONCE(), WRITE_ONCE() are not enough
because they do not provide smp barriers.

Best Regards,
Petr