linux-watchdog.vger.kernel.org archive mirror
From: Guenter Roeck <linux@roeck-us.net>
To: Julia Cartwright <julia@ni.com>
Cc: Tim Sander <tim@krieglstein.org>,
	Steffen Trumtrar <s.trumtrar@pengutronix.de>,
	"linux-watchdog@vger.kernel.org" <linux-watchdog@vger.kernel.org>,
	Wim Van Sebroeck <wim@linux-watchdog.org>,
	Christophe Leroy <christophe.leroy@c-s.fr>,
	"linux-rt-users@vger.kernel.org" <linux-rt-users@vger.kernel.org>
Subject: Re: [BUG] dw_wdt watchdog on linux-rt 4.18.5-rt4 not triggering
Date: Fri, 21 Sep 2018 13:21:29 -0700
Message-ID: <20180921202129.GA30613@roeck-us.net>
In-Reply-To: <20180921164200.GS8562@jcartwri.amer.corp.natinst.com>

On Fri, Sep 21, 2018 at 04:42:04PM +0000, Julia Cartwright wrote:
> On Fri, Sep 21, 2018 at 06:34:24AM -0700, Guenter Roeck wrote:
> > On 09/20/2018 01:48 PM, Julia Cartwright wrote:
> > > On Wed, Sep 19, 2018 at 12:43:03PM -0700, Guenter Roeck wrote:
> [..]
> > > > Overall, we have a number of possibilities to consider:
> > > > 
> > > > - The kernel watchdog timer thread is not triggered at all under some
> > > >    circumstances, meaning it is not set properly. So far we have no real
> > > >    indication that this is the case (since the code works fine unless some
> > > >    userspace task takes all available CPU time).
> > > 
> > > What do you mean by "not triggered".  Do you mean woken-up/activated
> > > from a scheduling perspective?  In the case I identified in my other
> > > email, the watchdogd thread wakeup doesn't even occur, even when the
> > > periodic ping timer expires, because ktimersoftd has been starved.
> > > 
> > 
> > Sorry for not using the correct term. Sometimes I am a bit sloppy.
> > Yes, I meant "woken-up/activated from a scheduling perspective".
> 
> Thanks for the clarification.  I think we're on the same page. :)
> 
> > > I suspect that's what's going on for Steffen, but am not yet sure.
> > > 
> > > > - The watchdog device is closed. The kernel watchdog timer thread is
> > > >    starved and does not get to run. The question is what to do in this
> > > >    situation. In a real time system, this is almost always a fatal
> > > >    condition. Should the system really be kept alive in this situation ?
> > > 
> > > Sometimes it's the right decision, sometimes it's not.  The only sensible
> > > thing to do is to allow the user to make the decision that's right for
> > > their application needs by allowing the relative prioritization of
> > > watchdogd and their application threads.
> >
> > Agreed, but that doesn't help if the watchdog daemon is not open or if the
> > hardware watchdog interval is too small and the kernel mechanism is needed
> > to ping the watchdog.
> 
> Makes sense.
> 
> > > ...which they can do now, but it's not effective on RT because of the
> > > timer deferral through ktimersoftd.
> > > 
> > > The solution, in my mind, and like I mentioned in my other email, is to
> > > opt-out of the ktimersoftd-deferral mechanism.  This requires some
> > > tweaking with the kthread_worker bits to ensure safety in hardirq
> > > context, but that seems straightforward.  See the below.
> >
> > Makes sense to me, though I have no idea what it would take to push
> > the necessary changes into the core kernel.
> 
> As of now, this bug doesn't exist in mainline because the hrtimer
> deferral bits haven't landed yet, as you note below.
> 
> > However, I must be missing something: Looking into the kernel code,
> > it seems to me that the spin_lock functions call the respective raw_
> > spinlock functions right away. With that in mind, why would the kernel
> > code change be necessary ? Also, I don't see HRTIMER_MODE_REL_HARD
> > defined anywhere. Is this RT specific ?
> 
> Yes, there is no functional difference in mainline currently between a
> spinlock_t and a raw_spinlock_t.  There is also no
> HRTIMER_MODE_REL_HARD, as mentioned before.  These are
> features/concepts currently only in the RT tree, but should be making
> their way into mainline soon.
> 
> As for the path forward, I'd like to get some confirmation from Steffen
> and/or Tim that the proposed patch fixes their issue, then I'll cook
> some proper patches; the kthread_worker bits could go mainline now
> because there is no dependency, but the watchdog change will need to be
> RT-only for now.
> 
SGTM.

Thanks,
Guenter
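
For reference, the "relative prioritization of watchdogd and their
application threads" discussed above can be done from user space with
sched_setscheduler(2). What follows is only a minimal sketch under stated
assumptions: the helper names clamp_fifo_priority() and set_fifo_priority()
are hypothetical, the caller is assumed to already know the PID of the
watchdogd kthread, and the call needs CAP_SYS_NICE or root. As noted in the
thread, on RT this alone is not effective while the ping timer is deferred
through ktimersoftd.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: clamp a requested priority into the range the
 * running kernel accepts for SCHED_FIFO.
 */
int clamp_fifo_priority(int prio)
{
	int lo = sched_get_priority_min(SCHED_FIFO);
	int hi = sched_get_priority_max(SCHED_FIFO);

	/* Both calls return -1 on error; lo > hi would be equally unusable. */
	assert(lo != -1 && hi != -1 && lo <= hi);

	if (prio < lo)
		return lo;
	if (prio > hi)
		return hi;
	return prio;
}

/*
 * Hypothetical helper: move task `pid` (e.g. the watchdogd kthread, or an
 * application thread) to SCHED_FIFO at the given priority.
 * Returns 0 on success or a negative errno value on failure.
 */
int set_fifo_priority(pid_t pid, int prio)
{
	struct sched_param sp;

	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = clamp_fifo_priority(prio);

	if (sched_setscheduler(pid, SCHED_FIFO, &sp) != 0)
		return -errno;
	return 0;
}
```

A caller would pick the two priorities relative to each other, for example
set_fifo_priority(watchdogd_pid, 60) and set_fifo_priority(app_pid, 50) if
the watchdog should keep being pinged while the application hogs the CPU,
or the reverse ordering if a starved watchdog should reset the system. Both
PIDs and both priority values here are placeholders.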
