linux-watchdog.vger.kernel.org archive mirror
From: "Mantas Mikulėnas" <grawity@gmail.com>
To: Jan Kiszka <jan.kiszka@siemens.com>
Cc: Guenter Roeck <linux@roeck-us.net>,
	Wim Van Sebroeck <wim@linux-watchdog.org>,
	linux-watchdog@vger.kernel.org
Subject: Re: iTCO_wdt regression on Dell laptop
Date: Mon, 26 Jul 2021 19:54:54 +0300
Message-ID: <CAPWNY8VZ4dzub_7PD5qxD0sm1_DmNCznX6af_GKdHu6Y44OBmw@mail.gmail.com>
In-Reply-To: <1d07f96c-a8c9-06e5-69ec-2c099df7b1f3@siemens.com>

On Mon, Jul 26, 2021 at 12:45 PM Jan Kiszka <jan.kiszka@siemens.com> wrote:
>
> On 26.07.21 11:40, Jan Kiszka wrote:
> > On 26.07.21 11:19, Mantas Mikulėnas wrote:
> >> Hello,
> >>
> >> I have a Dell Inspiron 15-5547 laptop, with systemd configured to set
> >> the watchdog to a 2-minute expiry (due to reasons):
> >>
> >> # /etc/systemd/system.conf
> >> [Manager]
> >> RuntimeWatchdogSec=2min
> >>
> >> So far this setting has worked without problems (including kernels
> >> 5.12.15 and 5.13.1); however, with kernel 5.13.4 the system inevitably
> >> reboots after a few minutes of uptime.
> >>
> >> I have tracked the issue down to commit 5e65819a006e "watchdog:
> >> iTCO_wdt: Account for rebooting on second timeout" in the 5.13.x
> >> branch (commit cb011044e34c upstream). There are no unexpected reboots
> >> when running 5.13.4 with this commit reverted.
> >>
> >> Indeed with the original 5.13.4 kernel, `wdctl` always reports
> >> "Timeleft:" counting down from 60 seconds (sometimes very nearly
> >> reaching 0), even though "Timeout" is still reported to be 120.
> >>
> >> (systemd pokes the watchdog as part of its main loop, trying to do so
> >> approximately "between 1/4 and 1/2" of the configured interval.
> >> According to wdctl these pings usually happen every 35-50 seconds but
> >> sometimes nearly at the 60-second mark, and thanks to the kernel now
> >> also dividing the requested expiry by 2, which systemd is unaware of,
> >> sometimes this ends up being a *very* close race to 0.)
> >>
> >> This is a Haswell-era machine (i7-4510U) and seems to have a "version
> >> 0" watchdog:
> >>
> >> Jul 26 11:34:04 archlinux kernel: Linux version 5.13.4-arch2-1
> >> (linux@archlinux) (gcc (GCC) 11.1.0, GNU ld (GNU Binutils) 2.36.1) #1
> >> SMP PREEMPT Thu, 22 Jul 2021 20:46:28 +0000
> >> Jul 26 11:34:14 frost kernel: iTCO_vendor_support: vendor-support=0
> >> Jul 26 11:34:14 frost kernel: iTCO_wdt iTCO_wdt.3.auto: Found a Lynx
> >> Point_LP TCO device (Version=2, TCOBASE=0x1860)
> >> Jul 26 11:34:14 frost systemd[1]: Using hardware watchdog 'iTCO_wdt',
> >> version 0, device /dev/watchdog
> >> Jul 26 11:34:14 frost systemd[1]: Set hardware watchdog to 2min.
> >> Jul 26 11:34:14 frost kernel: iTCO_wdt iTCO_wdt.3.auto: initialized.
> >> heartbeat=30 sec (nowayout=0)
> >>
> >
> > Could you printk SMI_EN(p) in iTCO_wdt_set_timeout()
> > (drivers/watchdog/iTCO_wdt.c)? This is where we decide whether SMIs are
> > working, thus the countdown will only run once. Apparently, something is
> > wrong with the detection on this system.
> >
>
> Wait, found it:
>
> diff --git a/drivers/watchdog/iTCO_wdt.c b/drivers/watchdog/iTCO_wdt.c
> index b3f604669e2c..643c6c2d0b72 100644
> --- a/drivers/watchdog/iTCO_wdt.c
> +++ b/drivers/watchdog/iTCO_wdt.c
> @@ -362,7 +362,7 @@ static int iTCO_wdt_set_timeout(struct watchdog_device *wd_dev, unsigned int t)
>          * Otherwise, the BIOS generally reboots when the SMI triggers.
>          */
>         if (p->smi_res &&
> -           (SMI_EN(p) & (TCO_EN | GBL_SMI_EN)) != (TCO_EN | GBL_SMI_EN))
> +           (inl(SMI_EN(p)) & (TCO_EN | GBL_SMI_EN)) != (TCO_EN | GBL_SMI_EN))
>                 tmrval /= 2;
>
>         /* from the specs: */

Rebuilt with this and it fixes the issue, thanks.
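
For anyone reading the archive later, here is a minimal userspace sketch of
why the one-line change matters. In the driver, SMI_EN(p) expands to the I/O
port address of the register, not its contents, so the original test masked
the address itself. The bit values mirror the driver's TCO_EN and GBL_SMI_EN
definitions as I understand them; the port address 0x1830 and both helper
names are assumptions for illustration only, and the register contents are
passed in directly instead of being read with inl().

```c
#include <stdint.h>

#define TCO_EN      (1u << 13)   /* SMI_EN bit: TCO SMIs enabled */
#define GBL_SMI_EN  (1u << 0)    /* SMI_EN bit: SMIs globally enabled */
#define SMI_MASK    (TCO_EN | GBL_SMI_EN)

/*
 * Buggy form: the argument is the port ADDRESS (hypothetical value such
 * as 0x1830), so the result depends on the address bits, not on whether
 * SMIs are actually enabled.
 * Returns 1 if the driver would halve the timer value.
 */
static int timeout_halved_buggy(uint32_t smi_en_port)
{
    return (smi_en_port & SMI_MASK) != SMI_MASK;
}

/*
 * Fixed form: the argument stands in for the value inl(SMI_EN(p)) would
 * return, i.e. the register CONTENTS.
 * Returns 1 if the driver would halve the timer value.
 */
static int timeout_halved_fixed(uint32_t smi_en_contents)
{
    return (smi_en_contents & SMI_MASK) != SMI_MASK;
}
```

With an address like 0x1830, neither bit 13 nor bit 0 is set, so the buggy
form reports "halve" no matter what the register holds. That would match
what I saw: a 120-second request counting down from 60.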

-- 
Mantas Mikulėnas
