From: Guenter Roeck <linux@roeck-us.net>
To: Julia Cartwright <julia@ni.com>
Cc: Tim Sander <tim@krieglstein.org>,
Steffen Trumtrar <s.trumtrar@pengutronix.de>,
"linux-watchdog@vger.kernel.org" <linux-watchdog@vger.kernel.org>,
Wim Van Sebroeck <wim@linux-watchdog.org>,
Christophe Leroy <christophe.leroy@c-s.fr>,
"linux-rt-users@vger.kernel.org" <linux-rt-users@vger.kernel.org>
Subject: Re: [BUG] dw_wdt watchdog on linux-rt 4.18.5-rt4 not triggering
Date: Fri, 21 Sep 2018 13:21:29 -0700
Message-ID: <20180921202129.GA30613@roeck-us.net>
In-Reply-To: <20180921164200.GS8562@jcartwri.amer.corp.natinst.com>
On Fri, Sep 21, 2018 at 04:42:04PM +0000, Julia Cartwright wrote:
> On Fri, Sep 21, 2018 at 06:34:24AM -0700, Guenter Roeck wrote:
> > On 09/20/2018 01:48 PM, Julia Cartwright wrote:
> > > On Wed, Sep 19, 2018 at 12:43:03PM -0700, Guenter Roeck wrote:
> [..]
> > > > Overall, we have a number of possibilities to consider:
> > > >
> > > > - The kernel watchdog timer thread is not triggered at all under some
> > > > circumstances, meaning it is not set properly. So far we have no real
> > > > indication that this is the case (since the code works fine unless some
> > > > userspace task takes all available CPU time).
> > >
> > > What do you mean by "not triggered"? Do you mean woken-up/activated
> > > from a scheduling perspective? In the case I identified in my other
> > > email, the watchdogd thread wakeup doesn't even occur, even when the
> > > periodic ping timer expires, because ktimersoftd has been starved.
> > >
> >
> > Sorry for not using the correct term. Sometimes I am a bit sloppy.
> > Yes, I meant "woken-up/activated from a scheduling perspective".
>
> Thanks for the clarification. I think we're on the same page. :)
>
> > > I suspect that's what's going on for Steffen, but am not yet sure.
> > >
> > > > - The watchdog device is closed. The kernel watchdog timer thread is
> > > > starved and does not get to run. The question is what to do in this
> > > > situation. In a real time system, this is almost always a fatal
> > > > condition. Should the system really be kept alive in this situation ?
> > >
> > > Sometimes it's the right decision, sometimes it's not. The only sensible
> > > thing to do is to allow the user to make the decision that's right for
> > > their application needs by allowing the relative prioritization of
> > > watchdogd and their application threads.
> >
> > Agreed, but that doesn't help if the watchdog daemon is not open or if the
> > hardware watchdog interval is too small and the kernel mechanism is needed
> > to ping the watchdog.
>
> Makes sense.
>
> > > ...which they can do now, but it's not effective on RT because of the
> > > timer deferral through ktimersoftd.
> > >
> > > The solution, in my mind, and like I mentioned in my other email, is to
> > > opt-out of the ktimersoftd-deferral mechanism. This requires some
> > > tweaking with the kthread_worker bits to ensure safety in hardirq
> > > context, but that seems straightforward. See the below.
> >
> > Makes sense to me, though I have no idea what it would take to push
> > the necessary changes into the core kernel.
>
> As of now, this bug doesn't exist in mainline because the hrtimer
> deferral bits haven't landed yet, as you note below.
>
> > However, I must be missing something: Looking into the kernel code,
> > it seems to me that the spin_lock functions call the respective raw_
> > spinlock functions right away. With that in mind, why would the kernel
> > code change be necessary ? Also, I don't see HRTIMER_MODE_REL_HARD
> > defined anywhere. Is this RT specific ?
>
> Yes, there is no functional difference in mainline currently between a
> spinlock_t and a raw_spinlock_t. There is also no
> HRTIMER_MODE_REL_HARD like mentioned before. These are
> features/concepts currently only in the RT tree, but should be making
> their way into mainline soon.
>
> As far as path forward, I'd like to get some confirmation from Steffen
> and/or Tim that the proposed patch fixes their issue, then I'll cook
> some proper patches; the kthread_worker bits could go mainline now
> because there is no dependency, but the watchdog change will need to be
> RT-only for now.
>
SGTM.

Thanks,
Guenter
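
For reference, the "relative prioritization of watchdogd and their
application threads" mentioned in the quoted text can be done from
userspace with sched_setscheduler(2). The following is only a minimal
sketch under stated assumptions: the helper name set_fifo_priority is
hypothetical, the PID of the watchdogd kernel thread has to be looked up
separately, and the caller needs sufficient privileges (typically
CAP_SYS_NICE). It is not taken from the proposed patch.

```c
#include <errno.h>
#include <sched.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of any patch in this thread):
 * move the task identified by pid to SCHED_FIFO at priority prio.
 * Returns 0 on success, -1 with errno set on failure.
 */
int set_fifo_priority(pid_t pid, int prio)
{
	struct sched_param sp;
	int lo = sched_get_priority_min(SCHED_FIFO);
	int hi = sched_get_priority_max(SCHED_FIFO);

	if (lo < 0 || hi < 0)
		return -1;		/* errno set by the failing call */

	/* Reject out-of-range priorities before touching the target task. */
	if (prio < lo || prio > hi) {
		errno = EINVAL;
		return -1;
	}

	memset(&sp, 0, sizeof(sp));
	sp.sched_priority = prio;

	/* Fails with EPERM if the caller lacks the required privileges. */
	return sched_setscheduler(pid, SCHED_FIFO, &sp);
}
```

A caller would pass the PID of the watchdogd thread and a priority above
that of the CPU-bound application threads. As discussed in the quoted
text, on RT this by itself is not effective as long as the ping timer
expiry is deferred through ktimersoftd.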