* Can't query Intel's iTCO watchdog reboot reason
@ 2016-07-15 23:21 Ezequiel Garcia
  2016-07-16 17:18 ` Guenter Roeck
From: Ezequiel Garcia @ 2016-07-15 23:21 UTC
  To: dvhart, Guenter Roeck, wim; +Cc: linux-watchdog

Hi everyone,

A large portion of my Intel-based products suffer from
a nasty hardware freeze [1], so I'm currently working around it by
enabling the iTCO watchdog, which is a good idea to have enabled
in any case.

So, it would be interesting to find out on each boot if the machine was
rebooted due to a watchdog timeout, but so far I'm not having any luck.

As per Intel's appnote [2] the BIOS should update the WDDT ACPI table,
so I added something like this to the iTCO driver:

       status = acpi_get_table(ACPI_SIG_WDDT, 1,
                               (struct acpi_table_header **) &buf);
       if (ACPI_FAILURE(status) || buf->header.length < sizeof(*buf)) {
               pr_err(FW_BUG "failed to get WDDT ACPI table\n");
               return;
       }

But it doesn't find the table. Strangely, reading TCO1_STS
and TCO2_STS always gives 0x0.
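
As a quick cross-check from userspace, one could test whether the firmware
exposes a WDDT table at all. The following is a minimal sketch, assuming the
kernel exports ACPI tables as one file per signature under
/sys/firmware/acpi/tables; the helper name acpi_table_present() is
hypothetical and not part of any kernel or library API.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Return 1 if a file named after the table signature exists in dir,
 * 0 otherwise (including when the combined path would not fit).
 * On Linux, dir would normally be "/sys/firmware/acpi/tables".
 */
static int acpi_table_present(const char *dir, const char *sig)
{
	char path[256];
	int n = snprintf(path, sizeof(path), "%s/%s", dir, sig);

	if (n < 0 || (size_t)n >= sizeof(path))
		return 0;

	/* F_OK only checks existence, not read permission. */
	return access(path, F_OK) == 0;
}
```

Calling acpi_table_present("/sys/firmware/acpi/tables", "WDDT") and getting 0
would suggest the BIOS does not publish the table, in which case
acpi_get_table() cannot be expected to succeed either.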

Tests were done on a Lynx Point:

[    7.131502] iTCO_wdt: Found a Lynx Point TCO device (Version=2, TCOBASE=0x1860)

Any ideas? Is it possible to get this information or should I just give up?

[1] https://bugzilla.kernel.org/show_bug.cgi?id=109051
[2] http://download.intel.com/design/chipsets/applnots/29227301.pdf

Thanks!
-- 
Ezequiel García, VanguardiaSur
www.vanguardiasur.com.ar


* Re: Can't query Intel's iTCO watchdog reboot reason
  2016-07-15 23:21 Can't query Intel's iTCO watchdog reboot reason Ezequiel Garcia
@ 2016-07-16 17:18 ` Guenter Roeck
  2016-07-17 22:13   ` Ezequiel Garcia
From: Guenter Roeck @ 2016-07-16 17:18 UTC
  To: Ezequiel Garcia, dvhart, wim; +Cc: linux-watchdog

On 07/15/2016 04:21 PM, Ezequiel Garcia wrote:
> Hi everyone,
>
> A large portion of my Intel-based products suffer from
> a nasty hardware freeze [1], so I'm currently working around it by
> enabling the iTCO watchdog, which is a good idea to have enabled
> in any case.
>
> So, it would be interesting to find out on each boot if the machine was
> rebooted due to a watchdog timeout, but so far I'm not having any luck.
>
> As per Intel's appnote [2] the BIOS should update the WDDT ACPI table,
> so I added something like this to the iTCO driver:
>
>         status = acpi_get_table(ACPI_SIG_WDDT, 1,
>                                 (struct acpi_table_header **) &buf);
>         if (ACPI_FAILURE(status) || buf->header.length < sizeof(*buf)) {
>                 pr_err(FW_BUG "failed to get WDDT ACPI table\n");
>                 return;
>         }
>
> But it doesn't find the table. Strangely, reading TCO1_STS
> and TCO2_STS always gives 0x0.
>

That sounds like either the BIOS resets those bits, or the reboots
are not caused by the watchdog. Are you sure that you see reboots
that are caused by the watchdog ?
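
For reference, the timeout bits that iTCO_wdt.c clears at probe time can be
decoded as in the sketch below. The masks (0x0008 in TCO1_STS for the timeout
bit, 0x0002 in TCO2_STS for SECOND_TO_STS) follow the driver's own comments;
the enum and function names are made up for illustration, and how a given
BIOS treats these bits across a reset is an assumption that has to be checked
on the actual hardware.

```c
#include <stdint.h>

#define TCO1_TIMEOUT_BIT   0x0008  /* TCO1_STS: timer expired once */
#define TCO2_SECOND_TO_STS 0x0002  /* TCO2_STS: timer expired a second time */

/* Hypothetical classification of what the status registers suggest. */
enum tco_hint {
	TCO_NO_TIMEOUT,      /* neither timeout bit is set */
	TCO_FIRST_TIMEOUT,   /* only the first timeout was recorded */
	TCO_SECOND_TIMEOUT   /* second timeout recorded: watchdog reset likely */
};

static enum tco_hint tco_decode(uint16_t tco1_sts, uint16_t tco2_sts)
{
	/* The second timeout is the stronger signal, so test it first. */
	if (tco2_sts & TCO2_SECOND_TO_STS)
		return TCO_SECOND_TIMEOUT;
	if (tco1_sts & TCO1_TIMEOUT_BIT)
		return TCO_FIRST_TIMEOUT;
	return TCO_NO_TIMEOUT;
}
```

With the values reported in this thread, tco_decode(0x0, 0x0) yields
TCO_NO_TIMEOUT, which fits either explanation above: the bits were cleared
before the driver read them, or no watchdog timeout occurred.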

Guenter

> Tests were done on a Lynx Point:
>
> [    7.131502] iTCO_wdt: Found a Lynx Point TCO device (Version=2, TCOBASE=0x1860)
>
> Any ideas? Is it possible to get this information or should I just give up?
>
> [1] https://bugzilla.kernel.org/show_bug.cgi?id=109051
> [2] http://download.intel.com/design/chipsets/applnots/29227301.pdf
>
> Thanks!
>



* Re: Can't query Intel's iTCO watchdog reboot reason
  2016-07-16 17:18 ` Guenter Roeck
@ 2016-07-17 22:13   ` Ezequiel Garcia
  2016-07-19 16:21     ` Ezequiel Garcia
From: Ezequiel Garcia @ 2016-07-17 22:13 UTC
  To: Guenter Roeck; +Cc: dvhart, wim, linux-watchdog

Hi Guenter,

Thanks a lot for the quick reply.

On 16 Jul 10:18 AM, Guenter Roeck wrote:
> On 07/15/2016 04:21 PM, Ezequiel Garcia wrote:
> > Hi everyone,
> > 
> > A large portion of my Intel-based products suffer from
> > a nasty hardware freeze [1], so I'm currently working around it by
> > enabling the iTCO watchdog, which is a good idea to have enabled
> > in any case.
> > 
> > So, it would be interesting to find out on each boot if the machine was
> > rebooted due to a watchdog timeout, but so far I'm not having any luck.
> > 
> > As per Intel's appnote [2] the BIOS should update the WDDT ACPI table,
> > so I added something like this to the iTCO driver:
> > 
> >         status = acpi_get_table(ACPI_SIG_WDDT, 1,
> >                                 (struct acpi_table_header **) &buf);
> >         if (ACPI_FAILURE(status) || buf->header.length < sizeof(*buf)) {
> >                 pr_err(FW_BUG "failed to get WDDT ACPI table\n");
> >                 return;
> >         }
> > 
> > But it doesn't find the table. Strangely, reading TCO1_STS
> > and TCO2_STS always gives 0x0.
> > 
> 
> That sounds like either the BIOS resets those bits, or the reboots
> are not caused by the watchdog. Are you sure that you see reboots
> that are caused by the watchdog ?
> 

Yes, I'm testing my patch by forcing watchdog reboots. At least on my i5
development machine, I haven't found any way of querying the reboot
reason. I'm not even sure this is supposed to work.

I'll see if I can test on other Intel machines with TCO watchdogs.

FWIW, here's my (ugly) hack:

diff --git a/drivers/watchdog/iTCO_wdt.c b/drivers/watchdog/iTCO_wdt.c
index 0acc6c5f729d..0374f90b5050 100644
--- a/drivers/watchdog/iTCO_wdt.c
+++ b/drivers/watchdog/iTCO_wdt.c
@@ -421,12 +421,31 @@ static void iTCO_wdt_cleanup(void)
 	iTCO_wdt_private.gcs_pmc = NULL;
 }
 
+static void iTCO_check_table(void)
+{
+	struct acpi_table_wddt *buf;
+	acpi_status status;
+	u16 wdt_status;
+
+	status = acpi_get_table(ACPI_SIG_WDDT, 1,
+				(struct acpi_table_header **) &buf);
+	if (ACPI_FAILURE(status) || buf->header.length < sizeof(*buf)) {
+		pr_err(FW_BUG "failed to get WDDT ACPI table\n");
+		return;
+	}
+
+	wdt_status = buf->status;
+	pr_err("ACPI watchdog status: 0x%x\n", wdt_status);
+}
+
 static int iTCO_wdt_probe(struct platform_device *dev)
 {
 	int ret = -ENODEV;
 	unsigned long val32;
 	struct itco_wdt_platform_data *pdata = dev_get_platdata(&dev->dev);
 
+	iTCO_check_table();
+
 	if (!pdata)
 		goto out;
 
@@ -510,6 +529,9 @@ static int iTCO_wdt_probe(struct platform_device *dev)
 	pr_info("Found a %s TCO device (Version=%d, TCOBASE=0x%04llx)\n",
 		pdata->name, pdata->version, (u64)TCOBASE);
 
+	pr_info("TCO1 status 0x%x\n", inw(TCO1_STS));
+	pr_info("TCO2 status 0x%x\n", inw(TCO2_STS));
+
 	/* Clear out the (probably old) status */
 	switch (iTCO_wdt_private.iTCO_version) {
 	case 4:

Thanks,
-- 
Ezequiel Garcia, VanguardiaSur
www.vanguardiasur.com.ar


* Re: Can't query Intel's iTCO watchdog reboot reason
  2016-07-17 22:13   ` Ezequiel Garcia
@ 2016-07-19 16:21     ` Ezequiel Garcia
From: Ezequiel Garcia @ 2016-07-19 16:21 UTC
  To: Guenter Roeck; +Cc: dvhart, wim, linux-watchdog

On 17 July 2016 at 19:13, Ezequiel Garcia <ezequiel@vanguardiasur.com.ar> wrote:
> Hi Guenter,
>
> Thanks a lot for the quick reply.
>
> On 16 Jul 10:18 AM, Guenter Roeck wrote:
>> On 07/15/2016 04:21 PM, Ezequiel Garcia wrote:
>> > Hi everyone,
>> >
>> > A large portion of my Intel-based products suffer from
>> > a nasty hardware freeze [1], so I'm currently working around it by
>> > enabling the iTCO watchdog, which is a good idea to have enabled
>> > in any case.
>> >
>> > So, it would be interesting to find out on each boot if the machine was
>> > rebooted due to a watchdog timeout, but so far I'm not having any luck.
>> >
>> > As per Intel's appnote [2] the BIOS should update the WDDT ACPI table,
>> > so I added something like this to the iTCO driver:
>> >
>> >         status = acpi_get_table(ACPI_SIG_WDDT, 1,
>> >                                 (struct acpi_table_header **) &buf);
>> >         if (ACPI_FAILURE(status) || buf->header.length < sizeof(*buf)) {
>> >                 pr_err(FW_BUG "failed to get WDDT ACPI table\n");
>> >                 return;
>> >         }
>> >
>> > But it doesn't find the table. Strangely, reading TCO1_STS
>> > and TCO2_STS always gives 0x0.
>> >
>>
>> That sounds like either the BIOS resets those bits, or the reboots
>> are not caused by the watchdog. Are you sure that you see reboots
>> that are caused by the watchdog ?
>>
>
> Yes, I'm testing my patch by forcing watchdog reboots. At least on my i5
> development machine, I haven't found any way of querying the reboot
> reason. I'm not even sure this is supposed to work.
>
> I'll see if I can test on other Intel machines with TCO watchdogs.
>

FWIW, I repeated this test on a Bay Trail SoC TCO device, with no luck.

Thanks,
-- 
Ezequiel García, VanguardiaSur
www.vanguardiasur.com.ar
