Linux-Watchdog Archive on lore.kernel.org
From: Guenter Roeck <linux@roeck-us.net>
To: Muni Sekhar <munisekharrms@gmail.com>
Cc: linux-watchdog@vger.kernel.org, linux-pci@vger.kernel.org,
	wim@linux-watchdog.org
Subject: Re: watchdog: how to enable?
Date: Mon, 18 Nov 2019 06:10:27 -0800
Message-ID: <da120ac6-062a-3dcc-e635-979fdd021592@roeck-us.net>
In-Reply-To: <CAHhAz+gGPaNTO1VR2iBBDFEdJ+cJx6+CNoAneLj6yTW0hgEfkA@mail.gmail.com>

On 11/18/19 1:52 AM, Muni Sekhar wrote:
> On Sun, Nov 17, 2019 at 3:12 AM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> On 11/16/19 10:34 AM, Muni Sekhar wrote:
>>> On Sat, Nov 16, 2019 at 9:31 PM Guenter Roeck <linux@roeck-us.net> wrote:
>>>>
>>>> On 11/15/19 7:03 PM, Muni Sekhar wrote:
>>>> [ ... ]
>>>>>>
>>>>>> Another possibility, of course, might be to enable a hardware watchdog
>>>>>> in your system (assuming it supports one). I personally would not trust
>>>>>> the NMI watchdog to detect a system hang because, after all, there are
>>>>>> situations where even NMIs no longer work.
>>>>>
>>>>> From dmesg, is it possible to know whether my system supports a
>>>>> hardware watchdog or not?
>>>>> Assuming that my system supports a hardware watchdog, how do I
>>>>> enable it to debug the system freeze issues?
>>>>>
>>>>
>>>> Hardware watchdog support really depends on the board type. Most PC
>>>> mainboards support a watchdog in the Super-IO chip, but on some it is
>>>> not wired correctly. On embedded boards it is often built into the SoC.
>>>> The easiest way to see if you have a watchdog would be to check for the
>>>> existence of /dev/watchdog. However, on a PC that would most likely
>>>> not be there because the necessary module is not auto-loaded.
>>>> If you tell us your board type, or better the Super-IO chip on the board,
>>>> we might be able to help.
>>>
>>> I have two identically configured systems. On one I installed the
>>> vanilla kernel, and I see the /dev/watchdog and /dev/watchdog0
>>> nodes. On the other I’m running the Ubuntu distribution kernel,
>>> and I don’t see any watchdog device node. So it looks like I need to
>>> manually load the kernel module in the distro kernel. Is there a way to
>>> find out which kernel module corresponds to the /dev/watchdog node?
>>>
>>> # ls -l /dev/watchdog*
>>> crw------- 1 root root  10, 130 Nov 15 17:15 /dev/watchdog
>>> crw------- 1 root root 248,   0 Nov 15 17:15 /dev/watchdog0
>>>
>>> # ps -ax | grep watchdog
>>>     678 ?        S      0:00 [watchdogd]
>>>
>>> Regarding the Super-IO chip, how do I find out its model?
>>>
>> You could try to run sensors-detect (from the "sensors" package).
>>
>> If you can boot a system with /dev/watchdog0, you should see the type
>> in /sys/class/watchdog/watchdog0/identity.
> I could not find the /sys/class/watchdog/watchdog0/identity and
> /sys/class/watchdog/watchdog0/timeout files.
> $ ls -l /sys/class/watchdog/watchdog0/
> total 0
> -r--r--r-- 1 root root 4096 Nov 18 15:12 dev
> lrwxrwxrwx 1 root root    0 Nov 18 15:12 device -> ../../../iTCO_wdt.0.auto
> drwxr-xr-x 2 root root    0 Nov 18 15:12 power
> lrwxrwxrwx 1 root root    0 Nov 18 14:53 subsystem -> ../../../../../../class/watchdog
> -rw-r--r-- 1 root root 4096 Nov 18 14:53 uevent
> 

Presumably CONFIG_WATCHDOG_SYSFS is not enabled in your configuration.
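
One way to confirm this is to look at the configuration file of the running
kernel. Below is a minimal Python sketch, assuming the distribution installs
the config as /boot/config-<release> (an assumption; that path is not
mentioned anywhere in this thread). The helper name config_option_state is
made up for illustration.

```python
import os
import re


def config_option_state(config_text, option):
    """Return the value of a kernel config option ('y', 'm', ...) or None.

    A line such as '# CONFIG_FOO is not set' counts as not set (None).
    """
    pattern = re.compile(r"^%s=(.*)$" % re.escape(option), re.MULTILINE)
    match = pattern.search(config_text)
    return match.group(1) if match else None


if __name__ == "__main__":
    # Assumed location of the running kernel's config file.
    path = "/boot/config-" + os.uname().release
    if os.path.exists(path):
        with open(path) as f:
            state = config_option_state(f.read(), "CONFIG_WATCHDOG_SYSFS")
        print("CONFIG_WATCHDOG_SYSFS:", state or "not set")
    else:
        print("no config file found at", path)
```

If this reports "not set", the missing identity and timeout attributes are
expected.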

>>
>> Also, you can test if the watchdog works with "sudo cat /dev/watchdog",
>> assuming the watchdog daemon is not running. The watchdog works if the
>> system reboots after the watchdog times out (/sys/class/watchdog/watchdog0/timeout
>> is the timeout in seconds).
> sudo cat /dev/watchdog rebooted my system as expected. I don't see the
> timeout node, so how do I configure the timeout value?

sudo apt-get install watchdog
man watchdog

should tell you. Alternatively, enable CONFIG_WATCHDOG_SYSFS.

>>
>>>>
>>>> Note though that this won't help to debug the problem. A hardware
>>>> watchdog resets the system. It helps to recover, but it is not intended
>>>> to help with debugging.
>>> How do I use the hardware watchdog to reset my system when the system is
>>> frozen? It would help me collect a crashdump and ultimately find
>>> the root cause of the system freeze.
>>>
>> There won't be a crashdump. It just hard-resets the system.
> So is there any other solution to capture a crashdump or trigger
> a soft reboot once the kernel is locked up?

Not that I know of. I suspect, though, that you either have a hard lockup
where even NMI is non-operational, or NMI doesn't work in your system
to start with.

If you have nmi_watchdog=1 in your kernel command line, /proc/interrupts
should show a non-zero number of NMI interrupts. Do you see that on your system?
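
If it is easier than reading the table by eye, the per-CPU NMI counters can
be summed with a few lines of Python. This is a rough sketch that assumes the
usual x86 layout of /proc/interrupts, where the row label is "NMI:" followed
by one counter per CPU and then a description. The function name nmi_count is
hypothetical.

```python
def nmi_count(interrupts_text):
    """Sum the per-CPU counters on the 'NMI:' row.

    Returns None if the text has no NMI row at all.
    """
    for line in interrupts_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "NMI:":
            total = 0
            for field in fields[1:]:
                if not field.isdigit():
                    break  # reached the textual description
                total += int(field)
            return total
    return None


if __name__ == "__main__":
    with open("/proc/interrupts") as f:
        print("NMI total:", nmi_count(f.read()))
```

Running it twice a few seconds apart shows whether the count is increasing.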

Guenter
