linux-pci.vger.kernel.org archive mirror
* watchdog: how to enable?
@ 2019-11-16  0:35 Muni Sekhar
  2019-11-16  1:04 ` Guenter Roeck
  0 siblings, 1 reply; 8+ messages in thread
From: Muni Sekhar @ 2019-11-16  0:35 UTC
  To: linux-watchdog, linux-pci, wim, linux

[ Please keep me in CC as I'm not subscribed to the list]

Hi All,

My kernel is built with the following options:

$ cat /boot/config-5.0.1 | grep NO_HZ
CONFIG_NO_HZ_COMMON=y
CONFIG_NO_HZ_IDLE=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NO_HZ=y
CONFIG_RCU_FAST_NO_HZ=y

I booted with the NMI watchdog enabled (nmi_watchdog=1), using this command line:

BOOT_IMAGE=/boot/vmlinuz-5.0.1
root=UUID=f65454ae-3f1d-4b9e-b4be-74a29becbe1e ro debug
ignore_loglevel console=ttyUSB0,115200 console=tty0 console=tty1
console=ttyS2,115200 memmap=1M!1023M nmi_watchdog=1
crashkernel=384M-:128M

When the system freezes or the kernel locks up (in this state the
kernel does not respond to Alt-SysRq-<command key>), the watchdog is
not triggered. I would like to understand how to enable the watchdog
timer and how to verify basic watchdog functionality.

Any pointers on this will be greatly appreciated.

--
Thanks,
Sekhar


* Re: watchdog: how to enable?
  2019-11-16  0:35 watchdog: how to enable? Muni Sekhar
@ 2019-11-16  1:04 ` Guenter Roeck
  2019-11-16  3:03   ` Muni Sekhar
  0 siblings, 1 reply; 8+ messages in thread
From: Guenter Roeck @ 2019-11-16  1:04 UTC
  To: Muni Sekhar, linux-watchdog, linux-pci, wim

On 11/15/19 4:35 PM, Muni Sekhar wrote:
> [ Please keep me in CC as I'm not subscribed to the list]
> 
> Hi All,
> 
> My kernel is built with the following options:
> 
> $ cat /boot/config-5.0.1 | grep NO_HZ
> CONFIG_NO_HZ_COMMON=y
> CONFIG_NO_HZ_IDLE=y
> # CONFIG_NO_HZ_FULL is not set
> CONFIG_NO_HZ=y
> CONFIG_RCU_FAST_NO_HZ=y
> 
> I booted with watchdog enabled(nmi_watchdog=1) as given below:
> 
> BOOT_IMAGE=/boot/vmlinuz-5.0.1
> root=UUID=f65454ae-3f1d-4b9e-b4be-74a29becbe1e ro debug
> ignore_loglevel console=ttyUSB0,115200 console=tty0 console=tty1
> console=ttyS2,115200 memmap=1M!1023M nmi_watchdog=1
> crashkernel=384M-:128M
> 
> When the system is frozen or the kernel is locked up(I noticed that in
> this state kernel is not responding for ALT-SysRq-<command key>) but
> watchdog is not triggered. So I want to understand how to enable the
> watchdog timer and how to verify the basic watchdog functionality
> behavior?
>
> Any pointers on this will be greatly appreciated.
> 
Sorry, I do not have an answer. Please note that you are talking about
the NMI watchdog, which is completely unrelated to hardware watchdogs
and is not handled by the watchdog subsystem. I would suggest sending
your question to the Linux kernel mailing list and clearly stating
that you are talking about the NMI watchdog.

Please note that, for the NMI watchdog to do anything, you must have
CONFIG_HARDLOCKUP_DETECTOR enabled in your kernel configuration. I don't
know what, if anything, the configuration options you listed above have
to do with the NMI watchdog.
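
One way to check for the option named above is to grep the kernel
config file. This is only a minimal sketch: the helper name
check_lockup_config is made up for illustration, and the default path
/boot/config-$(uname -r) is an assumption that holds on many
distributions but not all.

```shell
# Report whether the lockup detector options are enabled in a kernel config.
# Usage: check_lockup_config /boot/config-5.0.1
check_lockup_config() {
    # Assumed default location; pass the path explicitly if yours differs.
    cfg=${1:-/boot/config-$(uname -r)}
    if [ ! -r "$cfg" ]; then
        echo "cannot read $cfg" >&2
        return 1
    fi
    for opt in CONFIG_HARDLOCKUP_DETECTOR CONFIG_SOFTLOCKUP_DETECTOR; do
        # Match only an exact "OPTION=y" line, so that OPTION_PERF=y or
        # "# OPTION is not set" do not count as enabled.
        if grep -q "^${opt}=y\$" "$cfg"; then
            echo "$opt: enabled"
        else
            echo "$opt: not enabled"
        fi
    done
}
```

The exact-match pattern matters because options such as
CONFIG_HARDLOCKUP_DETECTOR_PERF share the same prefix.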

Another possibility, of course, might be to enable a hardware watchdog
in your system (assuming it supports one). I personally would not trust
the NMI watchdog to detect a system hang; after all, there are
situations where even NMIs no longer work.

Guenter


* Re: watchdog: how to enable?
  2019-11-16  1:04 ` Guenter Roeck
@ 2019-11-16  3:03   ` Muni Sekhar
  2019-11-16 16:01     ` Guenter Roeck
  0 siblings, 1 reply; 8+ messages in thread
From: Muni Sekhar @ 2019-11-16  3:03 UTC
  To: Guenter Roeck; +Cc: linux-watchdog, linux-pci, wim

On Sat, Nov 16, 2019 at 6:34 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 11/15/19 4:35 PM, Muni Sekhar wrote:
> > [ Please keep me in CC as I'm not subscribed to the list]
> >
> > Hi All,
> >
> > My kernel is built with the following options:
> >
> > $ cat /boot/config-5.0.1 | grep NO_HZ
> > CONFIG_NO_HZ_COMMON=y
> > CONFIG_NO_HZ_IDLE=y
> > # CONFIG_NO_HZ_FULL is not set
> > CONFIG_NO_HZ=y
> > CONFIG_RCU_FAST_NO_HZ=y
> >
> > I booted with watchdog enabled(nmi_watchdog=1) as given below:
> >
> > BOOT_IMAGE=/boot/vmlinuz-5.0.1
> > root=UUID=f65454ae-3f1d-4b9e-b4be-74a29becbe1e ro debug
> > ignore_loglevel console=ttyUSB0,115200 console=tty0 console=tty1
> > console=ttyS2,115200 memmap=1M!1023M nmi_watchdog=1
> > crashkernel=384M-:128M
> >
> > When the system is frozen or the kernel is locked up(I noticed that in
> > this state kernel is not responding for ALT-SysRq-<command key>) but
> > watchdog is not triggered. So I want to understand how to enable the
> > watchdog timer and how to verify the basic watchdog functionality
> > behavior?
> >
> > Any pointers on this will be greatly appreciated.
> >
> Sorry, I do not have an answer. Please note that you are talking about
> the NMI watchdog, which is completely unrelated to hardware watchdogs
> and not handled by the watchdog subsystem. I would suggest to send
> your question to the Linux kernel mailing list and clearly state
> that you are talking about the NMI watchdog.
>
> Please note that, for the NMI watchdog to do anything, you must have
> CONFIG_HARDLOCKUP_DETECTOR enabled in your kernel configuration. I don't
> know what if anything the configuration options you listed above have
> to do with the NMI watchdog.

Thank you for your response. I enabled the hard and soft lockup
detector config options. My kernel is built with the following .config
options:

CONFIG_HAVE_HARDLOCKUP_DETECTOR_PERF=y
CONFIG_HARDLOCKUP_DETECTOR_PERF=y
CONFIG_HARDLOCKUP_CHECK_TIMESTAMP=y
CONFIG_HARDLOCKUP_DETECTOR=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC=y
CONFIG_BOOTPARAM_HARDLOCKUP_PANIC_VALUE=1
CONFIG_SOFTLOCKUP_DETECTOR=y
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC=y
CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=1

I also set the following sysctls under /proc/sys/:

kernel.softlockup_panic = 1
kernel.hardlockup_panic = 1
kernel.unknown_nmi_panic = 1
kernel.softlockup_all_cpu_backtrace = 1
kernel.hardlockup_all_cpu_backtrace = 1
kernel.panic = 3
kernel.panic_on_io_nmi = 1
kernel.panic_on_oops = 1
kernel.panic_on_stackoverflow = 1
kernel.panic_on_unrecovered_nmi = 1
kernel.panic_on_rcu_stall = 1
kernel.panic_print = 31
kernel.sysrq=0x1FF


The documentation at
https://www.kernel.org/doc/Documentation/lockup-watchdogs.txt
says: “By default, the watchdog runs on all online cores.  However, on a
kernel configured with NO_HZ_FULL, by default the watchdog runs only
on the housekeeping cores, not the cores specified in the "nohz_full"
boot argument.” That is why I listed my kernel's CONFIG_NO_HZ* options.
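
Besides the panic-related settings listed above, the lockup detectors
have their own runtime switches under /proc/sys/kernel. The sketch
below prints them; the helper name show_lockup_sysctls is hypothetical,
and which of these files exist depends on the kernel configuration
(nmi_watchdog, for example, is expected only when the hard lockup
detector is built in).

```shell
# Print the lockup-detector sysctls, or note the ones this kernel lacks.
# The optional argument overrides the directory (handy for testing).
show_lockup_sysctls() {
    dir=${1:-/proc/sys/kernel}
    for name in watchdog nmi_watchdog soft_watchdog watchdog_thresh; do
        if [ -r "$dir/$name" ]; then
            echo "$name = $(cat "$dir/$name")"
        else
            echo "$name: not present"
        fi
    done
}
```

A value of 1 in nmi_watchdog means the hard lockup detector is
switched on; watchdog_thresh is the hard lockup threshold in seconds.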

>
> Another possibility, of course, might be to enable a hardware watchdog
> in your system (assuming it supports one). I personally would not trust
> the NMI watchdog because to detect a system hang, after all, there are
> situations where even NMIs no longer work.

Is it possible to tell from dmesg whether my system supports a
hardware watchdog? Assuming it does, how do I enable the hardware
watchdog to debug the system freeze issues?


>
> Guenter



-- 
Thanks,
Sekhar


* Re: watchdog: how to enable?
  2019-11-16  3:03   ` Muni Sekhar
@ 2019-11-16 16:01     ` Guenter Roeck
  2019-11-16 18:34       ` Muni Sekhar
  0 siblings, 1 reply; 8+ messages in thread
From: Guenter Roeck @ 2019-11-16 16:01 UTC
  To: Muni Sekhar; +Cc: linux-watchdog, linux-pci, wim

On 11/15/19 7:03 PM, Muni Sekhar wrote:
[ ... ]
>>
>> Another possibility, of course, might be to enable a hardware watchdog
>> in your system (assuming it supports one). I personally would not trust
>> the NMI watchdog because to detect a system hang, after all, there are
>> situations where even NMIs no longer work.
> 
> From dmesg , Is it possible to know whether my system supports
> hardware watchdog or not?
> I assume that my system supports the hardware watchdog , then how to
> enable the hardware watchdog to debug the system freeze issues?
> 

Hardware watchdog support really depends on the board type. Most PC
mainboards support a watchdog in the Super-IO chip, but on some it is
not wired correctly. On embedded boards it is often built into the SoC.
The easiest way to see if you have a watchdog would be to check for the
existence of /dev/watchdog. However, on a PC that would most likely
not be there because the necessary module is not auto-loaded.
If you tell us your board type, or better the Super-IO chip on the board,
we might be able to help.

Note though that this won't help to debug the problem. A hardware
watchdog resets the system. It helps to recover, but it is not intended
to help with debugging.

Guenter


* Re: watchdog: how to enable?
  2019-11-16 16:01     ` Guenter Roeck
@ 2019-11-16 18:34       ` Muni Sekhar
  2019-11-16 21:42         ` Guenter Roeck
  0 siblings, 1 reply; 8+ messages in thread
From: Muni Sekhar @ 2019-11-16 18:34 UTC
  To: Guenter Roeck; +Cc: linux-watchdog, linux-pci, wim

On Sat, Nov 16, 2019 at 9:31 PM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 11/15/19 7:03 PM, Muni Sekhar wrote:
> [ ... ]
> >>
> >> Another possibility, of course, might be to enable a hardware watchdog
> >> in your system (assuming it supports one). I personally would not trust
> >> the NMI watchdog because to detect a system hang, after all, there are
> >> situations where even NMIs no longer work.
> >
> > From dmesg , Is it possible to know whether my system supports
> > hardware watchdog or not?
> > I assume that my system supports the hardware watchdog , then how to
> > enable the hardware watchdog to debug the system freeze issues?
> >
>
> Hardware watchdog support really depends on the board type. Most PC
> mainboards support a watchdog in the Super-IO chip, but on some it is
> not wired correctly. On embedded boards it is often built into the SoC.
> The easiest way to see if you have a watchdog would be to check for the
> existence of /dev/watchdog. However, on a PC that would most likely
> not be there because the necessary module is not auto-loaded.
> If you tell us your board type, or better the Super-IO chip on the board,
> we might be able to help.

I have two identically configured systems. On one I installed a
vanilla kernel, and I see the /dev/watchdog and /dev/watchdog0 nodes.
The other runs the Ubuntu distribution kernel, and there I don't see
any watchdog device node. So it looks like I need to load the kernel
module manually on the distro kernel. Is there a way to find out which
kernel module corresponds to the /dev/watchdog node?

# ls -l /dev/watchdog*
crw------- 1 root root  10, 130 Nov 15 17:15 /dev/watchdog
crw------- 1 root root 248,   0 Nov 15 17:15 /dev/watchdog0

# ps -ax | grep watchdog
  678 ?        S      0:00 [watchdogd]

Regarding the Super-IO chip: how do I find out its model?

>
> Note though that this won't help to debug the problem. A hardware
> watchdog resets the system. It helps to recover, but it is not intended
> to help with debugging.
How do I use the hardware watchdog to reset my system when it is
frozen? That would let me collect a crashdump and ultimately find the
root cause of the freeze.

>
> Guenter



-- 
Thanks,
Sekhar


* Re: watchdog: how to enable?
  2019-11-16 18:34       ` Muni Sekhar
@ 2019-11-16 21:42         ` Guenter Roeck
  2019-11-18  9:52           ` Muni Sekhar
  0 siblings, 1 reply; 8+ messages in thread
From: Guenter Roeck @ 2019-11-16 21:42 UTC
  To: Muni Sekhar; +Cc: linux-watchdog, linux-pci, wim

On 11/16/19 10:34 AM, Muni Sekhar wrote:
> On Sat, Nov 16, 2019 at 9:31 PM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> On 11/15/19 7:03 PM, Muni Sekhar wrote:
>> [ ... ]
>>>>
>>>> Another possibility, of course, might be to enable a hardware watchdog
>>>> in your system (assuming it supports one). I personally would not trust
>>>> the NMI watchdog because to detect a system hang, after all, there are
>>>> situations where even NMIs no longer work.
>>>
>>> From dmesg , Is it possible to know whether my system supports
>>> hardware watchdog or not?
>>> I assume that my system supports the hardware watchdog , then how to
>>> enable the hardware watchdog to debug the system freeze issues?
>>>
>>
>> Hardware watchdog support really depends on the board type. Most PC
>> mainboards support a watchdog in the Super-IO chip, but on some it is
>> not wired correctly. On embedded boards it is often built into the SoC.
>> The easiest way to see if you have a watchdog would be to check for the
>> existence of /dev/watchdog. However, on a PC that would most likely
>> not be there because the necessary module is not auto-loaded.
>> If you tell us your board type, or better the Super-IO chip on the board,
>> we might be able to help.
> 
> I’m having two same configuration systems, in one system I installed
> the Vanilla kernel and I see the /dev/watchdog and /dev/watchdog0
> nodes. In other system I’m running with ubuntu distribution kernel,
> but I don’t see any watchdog device node. So it looks like I need to
> manually load the kernel module in distro kernel. Is there a way to
> know what is the corresponding kernel module for  /dev/watchdog node?
> 
> # ls -l /dev/watchdog*
> crw------- 1 root root  10, 130 Nov 15 17:15 /dev/watchdog
> crw------- 1 root root 248,   0 Nov 15 17:15 /dev/watchdog0
> 
> # ps -ax | grep watchdog
>    678 ?        S      0:00 [watchdogd]
> 
> Regarding Super-IO chip, how to find out the Super-IO chip model?
> 
You could try to run sensors-detect (from the "sensors" package).

If you can boot a system with /dev/watchdog0, you should see the type
in /sys/class/watchdog/watchdog0/identity.
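
As a rough sketch, the lookup can be scripted so that it still says
something useful when the identity attribute is absent (it is only
there if the kernel exposes watchdog attributes through sysfs). In
that case the device/driver symlink names the driver bound to the
watchdog device. The helper name wdt_describe is made up for
illustration.

```shell
# Print the watchdog's identity string if sysfs exposes it, otherwise fall
# back to the name of the driver bound to the underlying device.
# The optional argument overrides the sysfs directory (handy for testing).
wdt_describe() {
    wd=${1:-/sys/class/watchdog/watchdog0}
    if [ -r "$wd/identity" ]; then
        echo "identity: $(cat "$wd/identity")"
    elif [ -e "$wd/device/driver" ]; then
        # device/driver is a symlink into .../drivers/<driver name>
        echo "driver: $(basename "$(readlink -f "$wd/device/driver")")"
    else
        echo "no identity or driver information under $wd"
    fi
}
```

The driver name is usually also the name of the module to load on a
system where the device node is missing, but that is a convention, not
a guarantee.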

Also, you can test if the watchdog works with "sudo cat /dev/watchdog",
assuming the watchdog daemon is not running. The watchdog works if the
system reboots after the watchdog times out (/sys/class/watchdog/watchdog0/timeout
is the timeout in seconds).
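
The "cat" test works because opening the device starts the countdown
and nothing ever pings or stops it afterwards. For comparison, here is
a hedged sketch of an open that should not end in a reboot. It assumes
the driver supports the "magic close" feature of the watchdog API and
was not loaded with nowayout set; if either assumption is wrong, the
system will still reset after the timeout. The helper name
wdt_ping_and_disarm is hypothetical, and it needs root to open the
real device.

```shell
# Open the watchdog device, ping it once, then ask for a clean stop.
# The optional argument overrides the device path (handy for testing).
wdt_ping_and_disarm() {
    dev=${1:-/dev/watchdog}
    {
        printf '.'   # any write acts as a keepalive ping
        printf 'V'   # magic character: stop the timer when the file is closed
    } > "$dev"       # opening the device is what starts the countdown
}
```

A watchdog daemon does essentially the first half of this in a loop:
it keeps the device open and writes to it more often than the timeout.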

>>
>> Note though that this won't help to debug the problem. A hardware
>> watchdog resets the system. It helps to recover, but it is not intended
>> to help with debugging.
> How do I use the hardware watchdog to reset my system when system is
> frozen? It helps me to collect the crashdump and finally helps me to
> find the root cause for the system frozen issue.
> 
There won't be a crashdump. It just hard-resets the system.

Guenter


* Re: watchdog: how to enable?
  2019-11-16 21:42         ` Guenter Roeck
@ 2019-11-18  9:52           ` Muni Sekhar
  2019-11-18 14:10             ` Guenter Roeck
  0 siblings, 1 reply; 8+ messages in thread
From: Muni Sekhar @ 2019-11-18  9:52 UTC
  To: Guenter Roeck; +Cc: linux-watchdog, linux-pci, wim

On Sun, Nov 17, 2019 at 3:12 AM Guenter Roeck <linux@roeck-us.net> wrote:
>
> On 11/16/19 10:34 AM, Muni Sekhar wrote:
> > On Sat, Nov 16, 2019 at 9:31 PM Guenter Roeck <linux@roeck-us.net> wrote:
> >>
> >> On 11/15/19 7:03 PM, Muni Sekhar wrote:
> >> [ ... ]
> >>>>
> >>>> Another possibility, of course, might be to enable a hardware watchdog
> >>>> in your system (assuming it supports one). I personally would not trust
> >>>> the NMI watchdog because to detect a system hang, after all, there are
> >>>> situations where even NMIs no longer work.
> >>>
> >>> From dmesg , Is it possible to know whether my system supports
> >>> hardware watchdog or not?
> >>> I assume that my system supports the hardware watchdog , then how to
> >>> enable the hardware watchdog to debug the system freeze issues?
> >>>
> >>
> >> Hardware watchdog support really depends on the board type. Most PC
> >> mainboards support a watchdog in the Super-IO chip, but on some it is
> >> not wired correctly. On embedded boards it is often built into the SoC.
> >> The easiest way to see if you have a watchdog would be to check for the
> >> existence of /dev/watchdog. However, on a PC that would most likely
> >> not be there because the necessary module is not auto-loaded.
> >> If you tell us your board type, or better the Super-IO chip on the board,
> >> we might be able to help.
> >
> > I’m having two same configuration systems, in one system I installed
> > the Vanilla kernel and I see the /dev/watchdog and /dev/watchdog0
> > nodes. In other system I’m running with ubuntu distribution kernel,
> > but I don’t see any watchdog device node. So it looks like I need to
> > manually load the kernel module in distro kernel. Is there a way to
> > know what is the corresponding kernel module for  /dev/watchdog node?
> >
> > # ls -l /dev/watchdog*
> > crw------- 1 root root  10, 130 Nov 15 17:15 /dev/watchdog
> > crw------- 1 root root 248,   0 Nov 15 17:15 /dev/watchdog0
> >
> > # ps -ax | grep watchdog
> >    678 ?        S      0:00 [watchdogd]
> >
> > Regarding Super-IO chip, how to find out the Super-IO chip model?
> >
> You could try to run sensors-detect (from the "sensors" package).
>
> If you can boot a system with /dev/watchdog0, you should see the type
> in /sys/class/watchdog/watchdog0/identity.
I could not find the /sys/class/watchdog/watchdog0/identity and
/sys/class/watchdog/watchdog0/timeout files.
$ ls -l /sys/class/watchdog/watchdog0/
total 0
-r--r--r-- 1 root root 4096 Nov 18 15:12 dev
lrwxrwxrwx 1 root root    0 Nov 18 15:12 device -> ../../../iTCO_wdt.0.auto
drwxr-xr-x 2 root root    0 Nov 18 15:12 power
lrwxrwxrwx 1 root root    0 Nov 18 14:53 subsystem -> ../../../../../../class/watchdog
-rw-r--r-- 1 root root 4096 Nov 18 14:53 uevent

>
> Also, you can test if the watchdog works with "sudo cat /dev/watchdog",
> assuming the watchdog daemon is not running. The watchdog works if the
> system reboots after the watchdog times out (/sys/class/watchdog/watchdog0/timeout
> is the timeout in seconds).
"sudo cat /dev/watchdog" rebooted my system as expected. I don't see a
timeout node; how do I configure the timeout value?
>
> >>
> >> Note though that this won't help to debug the problem. A hardware
> >> watchdog resets the system. It helps to recover, but it is not intended
> >> to help with debugging.
> > How do I use the hardware watchdog to reset my system when system is
> > frozen? It helps me to collect the crashdump and finally helps me to
> > find the root cause for the system frozen issue.
> >
> There won't be a crashdump. It just hard-resets the system.
So is there any other way to capture a crashdump or trigger a soft
reboot once the kernel is locked up?
>
> Guenter



-- 
Thanks,
Sekhar


* Re: watchdog: how to enable?
  2019-11-18  9:52           ` Muni Sekhar
@ 2019-11-18 14:10             ` Guenter Roeck
  0 siblings, 0 replies; 8+ messages in thread
From: Guenter Roeck @ 2019-11-18 14:10 UTC
  To: Muni Sekhar; +Cc: linux-watchdog, linux-pci, wim

On 11/18/19 1:52 AM, Muni Sekhar wrote:
> On Sun, Nov 17, 2019 at 3:12 AM Guenter Roeck <linux@roeck-us.net> wrote:
>>
>> On 11/16/19 10:34 AM, Muni Sekhar wrote:
>>> On Sat, Nov 16, 2019 at 9:31 PM Guenter Roeck <linux@roeck-us.net> wrote:
>>>>
>>>> On 11/15/19 7:03 PM, Muni Sekhar wrote:
>>>> [ ... ]
>>>>>>
>>>>>> Another possibility, of course, might be to enable a hardware watchdog
>>>>>> in your system (assuming it supports one). I personally would not trust
>>>>>> the NMI watchdog because to detect a system hang, after all, there are
>>>>>> situations where even NMIs no longer work.
>>>>>
>>>>> From dmesg , Is it possible to know whether my system supports
>>>>> hardware watchdog or not?
>>>>> I assume that my system supports the hardware watchdog , then how to
>>>>> enable the hardware watchdog to debug the system freeze issues?
>>>>>
>>>>
>>>> Hardware watchdog support really depends on the board type. Most PC
>>>> mainboards support a watchdog in the Super-IO chip, but on some it is
>>>> not wired correctly. On embedded boards it is often built into the SoC.
>>>> The easiest way to see if you have a watchdog would be to check for the
>>>> existence of /dev/watchdog. However, on a PC that would most likely
>>>> not be there because the necessary module is not auto-loaded.
>>>> If you tell us your board type, or better the Super-IO chip on the board,
>>>> we might be able to help.
>>>
>>> I’m having two same configuration systems, in one system I installed
>>> the Vanilla kernel and I see the /dev/watchdog and /dev/watchdog0
>>> nodes. In other system I’m running with ubuntu distribution kernel,
>>> but I don’t see any watchdog device node. So it looks like I need to
>>> manually load the kernel module in distro kernel. Is there a way to
>>> know what is the corresponding kernel module for  /dev/watchdog node?
>>>
>>> # ls -l /dev/watchdog*
>>> crw------- 1 root root  10, 130 Nov 15 17:15 /dev/watchdog
>>> crw------- 1 root root 248,   0 Nov 15 17:15 /dev/watchdog0
>>>
>>> # ps -ax | grep watchdog
>>>     678 ?        S      0:00 [watchdogd]
>>>
>>> Regarding Super-IO chip, how to find out the Super-IO chip model?
>>>
>> You could try to run sensors-detect (from the "sensors" package).
>>
>> If you can boot a system with /dev/watchdog0, you should see the type
>> in /sys/class/watchdog/watchdog0/identity.
> I could not find the /sys/class/watchdog/watchdog0/identity and
> /sys/class/watchdog/watchdog0/timeout files.
> $ ls -l /sys/class/watchdog/watchdog0/
> total 0
> -r--r--r-- 1 root root 4096 Nov 18 15:12 dev
> lrwxrwxrwx 1 root root    0 Nov 18 15:12 device -> ../../../iTCO_wdt.0.auto
> drwxr-xr-x 2 root root    0 Nov 18 15:12 power
> lrwxrwxrwx 1 root root    0 Nov 18 14:53 subsystem -> ../../../../../../class/watchdog
> -rw-r--r-- 1 root root 4096 Nov 18 14:53 uevent
> 

Presumably CONFIG_WATCHDOG_SYSFS is not enabled in your configuration.

>>
>> Also, you can test if the watchdog works with "sudo cat /dev/watchdog",
>> assuming the watchdog daemon is not running. The watchdog works if the
>> system reboots after the watchdog times out (/sys/class/watchdog/watchdog0/timeout
>> is the timeout in seconds).
> sudo cat /dev/watchdog perfectly rebooted my system. I don't see
> timeout node, how do I configure the timeout value?

sudo apt-get install watchdog
man watchdog

should tell you. Alternatively, enable CONFIG_WATCHDOG_SYSFS.

>>
>>>>
>>>> Note though that this won't help to debug the problem. A hardware
>>>> watchdog resets the system. It helps to recover, but it is not intended
>>>> to help with debugging.
>>> How do I use the hardware watchdog to reset my system when system is
>>> frozen? It helps me to collect the crashdump and finally helps me to
>>> find the root cause for the system frozen issue.
>>>
>> There won't be a crashdump. It just hard-resets the system.
> So is there any other solution to capture the crashdump or trigger
> soft reboot once kernel is lockedup?

Not that I know of. I suspect, though, that you either have a hard lockup
where even NMI is non-operational, or NMI doesn't work in your system
to start with.

If you have nmi_watchdog=1 in your kernel command line, /proc/interrupts
should show a non-zero number of NMI interrupts. Do you see that on your
system?
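
A small sketch for reading that number: it sums the per-CPU counters
on the NMI line of /proc/interrupts. The helper name nmi_total is made
up for illustration, and the "NMI:" row label is what x86 kernels
print; other architectures may label the row differently or not have
it at all.

```shell
# Sum the per-CPU counters on the NMI line of /proc/interrupts.
# The optional argument overrides the file (handy for testing).
nmi_total() {
    awk '$1 == "NMI:" {
             # Columns after the label are per-CPU counts, followed by a
             # text description; only the purely numeric fields are added.
             for (i = 2; i <= NF; i++)
                 if ($i ~ /^[0-9]+$/) sum += $i
         }
         END { print sum + 0 }' "${1:-/proc/interrupts}"
}
```

Running it twice a few seconds apart shows whether the count is still
changing; a total of 0 suggests that no NMIs are being delivered.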

Guenter


Thread overview: 8+ messages
2019-11-16  0:35 watchdog: how to enable? Muni Sekhar
2019-11-16  1:04 ` Guenter Roeck
2019-11-16  3:03   ` Muni Sekhar
2019-11-16 16:01     ` Guenter Roeck
2019-11-16 18:34       ` Muni Sekhar
2019-11-16 21:42         ` Guenter Roeck
2019-11-18  9:52           ` Muni Sekhar
2019-11-18 14:10             ` Guenter Roeck
