linux-watchdog.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Ivan Mironov <mironov.ivan@gmail.com>
To: Jerry.Hoemann@hpe.com
Cc: linux-watchdog@vger.kernel.org, linux-kernel@vger.kernel.org,
	Wim Van Sebroeck <wim@linux-watchdog.org>,
	Guenter Roeck <linux@roeck-us.net>
Subject: Re: [RFC PATCH 1/4] watchdog: hpwdt: Don't disable watchdog on NMI
Date: Sat, 02 Feb 2019 09:55:29 +0500	[thread overview]
Message-ID: <84950aa0d194f28389f3bb209154ddc26e96c6de.camel@gmail.com> (raw)
In-Reply-To: <20190116022731.GD18342@anatevka>

On Tue, 2019-01-15 at 19:27 -0700, Jerry Hoemann wrote:
> On Mon, Jan 14, 2019 at 07:36:14AM +0500, Ivan Mironov wrote:
> > Existing code disables watchdog on NMI right before completely hanging
> > the system.
> > 
> > There are two problems here:
> > 
> >  * First, watchdog is expected to reset the system in a case of such
> >    failure, no matter what.
> 
> Documentation/watchdog/watchdog-api.txt
> 
> explicitly allows for pretimeout NMI and generation of kernel crash dumps.
> 
> By removing hpwdt_stop the system will likely fail to crash dump
> as there is only 9 seconds between receipt of a NMI and the iLO
> resetting the system.
> 
> Unfortunately, kdump is not without issues and can also be difficult
> to properly configure either of which can result in failure to dump
> and reset.
> 
> Customers who value availability over kdump collection, the pretimeout
> NMI can be disabled and hardware will not issue the pretimeout NMI
> and will only do reset.
> 
> A middle ground for those who want tombstones but not kdump, would
> be to leave the pretimeout NMI enabled and add "panic=N" to the
> Linux command line.  That way after the panic, the tombstone is
> printed and the system resets after N seconds.
> 
> 

Somehow I missed the whole pretimout thing when reading about the
watchdog API. Thanks for clarification, now code makes much more sense
=).

Still, I do not really understand the point of enabling of kdump
support in hpwdt driver by default while kdump is not enabled by
default.

Also, existing code may call hpwdt_stop() (and thus break watchdog)
even if pretimout is disabled.

Also, "panic=N" option is not providing a way to *not* panic on NMI
unrelated with iLO. This could be circumvented by blacklisting the
hpwdt module entirely, but normal watchdog functionality would be lost
then.

It is possible to rebuild kernel without HPWDT_NMI_DECODING (which is
enabled in Fedora, for example). But it is nearly impossible to come to
this solution without examining the source code, because description of
this option does not mention that it is really about pretimout support
and panics and not about something else...

I would say that current default behavior of hpwd is slightly confusing
in multiple different ways.

> 
> >  * Second, this code has no effect if there are more than one watchdog.
> 
> That is correct.  Hpwdt will not turn off any other WDT.
> 
> I don't see a current method of notifying other watchdogs
> that a given watchdog is going to take the system down.
> 
> The closest I hook see is watchdog_notify_pretimeout, but I don't
> see that notifying other WDT.  Its not clear to me that it should.
> (e.g. the second WDT could be of longer duration and protect against
> kdump hanging. This would need to be thought through.)
> 
> 
> 
> > Signed-off-by: Ivan Mironov <mironov.ivan@gmail.com>
> > ---
> >  drivers/watchdog/hpwdt.c | 2 --
> >  1 file changed, 2 deletions(-)
> > 
> > diff --git a/drivers/watchdog/hpwdt.c b/drivers/watchdog/hpwdt.c
> > index ef30c7e9728d..2467e6bc25c2 100644
> > --- a/drivers/watchdog/hpwdt.c
> > +++ b/drivers/watchdog/hpwdt.c
> > @@ -170,8 +170,6 @@ static int hpwdt_pretimeout(unsigned int ulReason, struct pt_regs *regs)
> >  	if (ilo5 && !pretimeout && !mynmi)
> >  		return NMI_DONE;
> >  
> > -	hpwdt_stop();
> > -
> >  	hex_byte_pack(panic_msg, mynmi);
> >  	nmi_panic(regs, panic_msg);
> >  
> > -- 
> > 2.20.1


  parent reply	other threads:[~2019-02-02  4:55 UTC|newest]

Thread overview: 15+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-01-14  2:36 [RFC PATCH 0/4] watchdog: hpwdt: Fix NMI-related behaviour when CONFIG_HPWDT_NMI_DECODING is enabled Ivan Mironov
2019-01-14  2:36 ` [RFC PATCH 1/4] watchdog: hpwdt: Don't disable watchdog on NMI Ivan Mironov
2019-01-16  2:27   ` Jerry Hoemann
2019-01-16  2:52     ` Guenter Roeck
2019-02-02  4:55     ` Ivan Mironov [this message]
2019-02-08  1:26       ` Jerry Hoemann
2019-02-08  4:17         ` Guenter Roeck
2019-02-14 19:49         ` Ivan Mironov
2019-01-14  2:36 ` [RFC PATCH 2/4] watchdog: hpwdt: Don't panic on foreign NMI Ivan Mironov
2019-01-14  2:36 ` [RFC PATCH 3/4] watchdog: hpwdt: Add more information into message Ivan Mironov
2019-01-14  2:36 ` [RFC PATCH 4/4] watchdog: hpwdt: Make panic behaviour configurable Ivan Mironov
2019-01-16  2:30   ` Jerry Hoemann
2019-02-02  5:13     ` Ivan Mironov
2019-01-16  2:22 ` [RFC PATCH 0/4] watchdog: hpwdt: Fix NMI-related behaviour when CONFIG_HPWDT_NMI_DECODING is enabled Jerry Hoemann
2019-02-02  6:24   ` Ivan Mironov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=84950aa0d194f28389f3bb209154ddc26e96c6de.camel@gmail.com \
    --to=mironov.ivan@gmail.com \
    --cc=Jerry.Hoemann@hpe.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-watchdog@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=wim@linux-watchdog.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).