* [PATCH 0/2] watchdog: Provide user control over WDOG_STOP_ON_REBOOT
From: Dmitry Safonov @ 2020-02-13 17:59 UTC
To: linux-kernel
Cc: Dmitry Safonov, Dmitry Safonov, Guenter Roeck, Wim Van Sebroeck,
    linux-watchdog

Add WDIOS_RUN_ON_REBOOT and WDIOS_STOP_ON_REBOOT to control the
watchdog's behavior across reboot.

Changes since RFC:
o rebase over v5.6
o fixed return code for ioctl()

I sent the RFC a while ago, and it was probably too late in the release
cycle to catch any attention:
https://lkml.kernel.org/r/20200121162145.166334-1-dima@arista.com

While waiting for rc1, I changed my mind about it being RFC material
and am sending it as PATCHv1 instead.

Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: linux-watchdog@vger.kernel.org

Dmitry Safonov (2):
  watchdog: Check WDOG_STOP_ON_REBOOT in reboot notifier
  watchdog/uapi: Add WDIOS_{RUN,STOP}_ON_REBOOT

 drivers/watchdog/watchdog_core.c | 27 +++++++++++++--------------
 drivers/watchdog/watchdog_dev.c  | 12 ++++++++++++
 include/linux/watchdog.h         |  6 ++++++
 include/uapi/linux/watchdog.h    |  3 ++-
 4 files changed, 33 insertions(+), 15 deletions(-)

-- 
2.25.0

* [PATCH 1/2] watchdog: Check WDOG_STOP_ON_REBOOT in reboot notifier
From: Dmitry Safonov @ 2020-02-13 17:59 UTC
To: linux-kernel
Cc: Dmitry Safonov, Dmitry Safonov, Guenter Roeck, Wim Van Sebroeck,
    linux-watchdog

Many watchdog drivers use the watchdog_stop_on_reboot() helper to stop
the watchdog on system reboot. Unfortunately, this logic is coded in
the driver's probe function and doesn't allow the user to decide what
to do during shutdown/reboot.

On the other hand, the Xen and Qemu watchdog drivers (xen_wdt and
i6300esb) may be configured to either send an NMI or turn off/reboot
the VM as the watchdog action. As the kernel may get stuck in any
state, sending NMIs can't reliably reboot the VM.

At Arista, we benefited from the following set-up: the emulated
watchdogs trigger a VM reset and softdog is set to catch less severe
conditions to generate a vmcore. Just before reboot, the watchdog's
timeout is increased to some good-enough value (3 mins). That keeps the
watchdog always running and guarantees that the VM doesn't get stuck.

As preparation for moving the decision on whether to stop the watchdog
on reboot to userspace, allow WDOG_STOP_ON_REBOOT to be set at runtime,
not only during driver probing. Always register the reboot notifier and
check WDOG_STOP_ON_REBOOT inside it (on actual reboot).

Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 drivers/watchdog/watchdog_core.c | 27 +++++++++++++--------------
 1 file changed, 13 insertions(+), 14 deletions(-)

diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
index 861daf4f37b2..ebf80ff3e8ce 100644
--- a/drivers/watchdog/watchdog_core.c
+++ b/drivers/watchdog/watchdog_core.c
@@ -153,6 +153,10 @@ static int watchdog_reboot_notifier(struct notifier_block *nb,
 	struct watchdog_device *wdd;
 
 	wdd = container_of(nb, struct watchdog_device, reboot_nb);
+
+	if (!test_bit(WDOG_STOP_ON_REBOOT, &wdd->status))
+		return NOTIFY_DONE;
+
 	if (code == SYS_DOWN || code == SYS_HALT) {
 		if (watchdog_active(wdd)) {
 			int ret;
@@ -254,17 +258,14 @@ static int __watchdog_register_device(struct watchdog_device *wdd)
 		}
 	}
 
-	if (test_bit(WDOG_STOP_ON_REBOOT, &wdd->status)) {
-		wdd->reboot_nb.notifier_call = watchdog_reboot_notifier;
-
-		ret = register_reboot_notifier(&wdd->reboot_nb);
-		if (ret) {
-			pr_err("watchdog%d: Cannot register reboot notifier (%d)\n",
-			       wdd->id, ret);
-			watchdog_dev_unregister(wdd);
-			ida_simple_remove(&watchdog_ida, id);
-			return ret;
-		}
+	wdd->reboot_nb.notifier_call = watchdog_reboot_notifier;
+	ret = register_reboot_notifier(&wdd->reboot_nb);
+	if (ret) {
+		pr_err("watchdog%d: Cannot register reboot notifier (%d)\n",
+		       wdd->id, ret);
+		watchdog_dev_unregister(wdd);
+		ida_simple_remove(&watchdog_ida, id);
+		return ret;
 	}
 
 	if (wdd->ops->restart) {
@@ -321,9 +322,7 @@ static void __watchdog_unregister_device(struct watchdog_device *wdd)
 	if (wdd->ops->restart)
 		unregister_restart_handler(&wdd->restart_nb);
 
-	if (test_bit(WDOG_STOP_ON_REBOOT, &wdd->status))
-		unregister_reboot_notifier(&wdd->reboot_nb);
-
+	unregister_reboot_notifier(&wdd->reboot_nb);
 	watchdog_dev_unregister(wdd);
 	ida_simple_remove(&watchdog_ida, wdd->id);
 }
-- 
2.25.0

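Editorial sketch: the behavioral change in this patch is that the
notifier is always registered and the flag is consulted only when a
reboot actually happens, so flipping the flag after registration takes
effect. The following is a minimal userspace model of that idea, not
kernel code. The struct wdd_model type and reboot_notifier_model() are
hypothetical stand-ins, and the real notifier's error path for a failed
stop is left out.

```c
#include <stdbool.h>

/* Stand-ins for the kernel's notifier constants (model only). */
enum { NOTIFY_DONE = 0, SYS_DOWN = 1, SYS_HALT = 2, SYS_POWER_OFF = 3 };

/* Hypothetical model of the two pieces of watchdog state that matter. */
struct wdd_model {
	bool stop_on_reboot;	/* models the WDOG_STOP_ON_REBOOT status bit */
	bool active;		/* models watchdog_active() */
};

/*
 * Models the patched notifier: it is called on every reboot event and
 * decides at that moment whether the watchdog should be stopped.
 */
static int reboot_notifier_model(struct wdd_model *wdd, int code)
{
	/* Flag checked at reboot time, not at registration time. */
	if (!wdd->stop_on_reboot)
		return NOTIFY_DONE;

	/* Only SYS_DOWN and SYS_HALT stop the watchdog, as in the patch. */
	if ((code == SYS_DOWN || code == SYS_HALT) && wdd->active)
		wdd->active = false;	/* stands in for stopping the hardware */

	return NOTIFY_DONE;
}
```

In this model, a device with stop_on_reboot cleared stays active across
a SYS_DOWN event, and setting the flag afterwards makes the next
SYS_DOWN stop it, which is the runtime behavior the patch prepares for.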
* Re: [PATCH 1/2] watchdog: Check WDOG_STOP_ON_REBOOT in reboot notifier
From: Guenter Roeck @ 2020-02-13 19:12 UTC
To: Dmitry Safonov
Cc: linux-kernel, Dmitry Safonov, Wim Van Sebroeck, linux-watchdog

On Thu, Feb 13, 2020 at 05:59:57PM +0000, Dmitry Safonov wrote:
> Many watchdog drivers use the watchdog_stop_on_reboot() helper to stop
> the watchdog on system reboot. Unfortunately, this logic is coded in
> the driver's probe function and doesn't allow the user to decide what
> to do during shutdown/reboot.
>
> On the other hand, the Xen and Qemu watchdog drivers (xen_wdt and
> i6300esb) may be configured to either send an NMI or turn off/reboot
> the VM as the watchdog action. As the kernel may get stuck in any
> state, sending NMIs can't reliably reboot the VM.
>
> At Arista, we benefited from the following set-up: the emulated
> watchdogs trigger a VM reset and softdog is set to catch less severe
> conditions to generate a vmcore. Just before reboot, the watchdog's
> timeout is increased to some good-enough value (3 mins). That keeps the
> watchdog always running and guarantees that the VM doesn't get stuck.
>
> As preparation for moving the decision on whether to stop the watchdog
> on reboot to userspace, allow WDOG_STOP_ON_REBOOT to be set at runtime,
> not only during driver probing. Always register the reboot notifier and
> check WDOG_STOP_ON_REBOOT inside it (on actual reboot).
>

Does that really have to be decided at runtime, by the user?
How about doing it with a module parameter?

Also, I am not sure if an ioctl is the best means to do this, if it
indeed makes sense to decide it at runtime. ioctl implies an open
watchdog device, which interferes with the watchdog daemon. This means
that the watchdog daemon would have to be modified to support this,
making this quite an expensive change. It also implies that the action
would have to be known when the watchdog daemon is started, suggesting
that a module parameter should be sufficient.

Guenter

> Signed-off-by: Dmitry Safonov <dima@arista.com>
> ---
>  drivers/watchdog/watchdog_core.c | 27 +++++++++++++--------------
>  1 file changed, 13 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
> index 861daf4f37b2..ebf80ff3e8ce 100644
> --- a/drivers/watchdog/watchdog_core.c
> +++ b/drivers/watchdog/watchdog_core.c
> @@ -153,6 +153,10 @@ static int watchdog_reboot_notifier(struct notifier_block *nb,
>  	struct watchdog_device *wdd;
>  
>  	wdd = container_of(nb, struct watchdog_device, reboot_nb);
> +
> +	if (!test_bit(WDOG_STOP_ON_REBOOT, &wdd->status))
> +		return NOTIFY_DONE;
> +
>  	if (code == SYS_DOWN || code == SYS_HALT) {
>  		if (watchdog_active(wdd)) {
>  			int ret;
> @@ -254,17 +258,14 @@ static int __watchdog_register_device(struct watchdog_device *wdd)
>  		}
>  	}
>  
> -	if (test_bit(WDOG_STOP_ON_REBOOT, &wdd->status)) {
> -		wdd->reboot_nb.notifier_call = watchdog_reboot_notifier;
> -
> -		ret = register_reboot_notifier(&wdd->reboot_nb);
> -		if (ret) {
> -			pr_err("watchdog%d: Cannot register reboot notifier (%d)\n",
> -			       wdd->id, ret);
> -			watchdog_dev_unregister(wdd);
> -			ida_simple_remove(&watchdog_ida, id);
> -			return ret;
> -		}
> +	wdd->reboot_nb.notifier_call = watchdog_reboot_notifier;
> +	ret = register_reboot_notifier(&wdd->reboot_nb);
> +	if (ret) {
> +		pr_err("watchdog%d: Cannot register reboot notifier (%d)\n",
> +		       wdd->id, ret);
> +		watchdog_dev_unregister(wdd);
> +		ida_simple_remove(&watchdog_ida, id);
> +		return ret;
>  	}
>  
>  	if (wdd->ops->restart) {
> @@ -321,9 +322,7 @@ static void __watchdog_unregister_device(struct watchdog_device *wdd)
>  	if (wdd->ops->restart)
>  		unregister_restart_handler(&wdd->restart_nb);
>  
> -	if (test_bit(WDOG_STOP_ON_REBOOT, &wdd->status))
> -		unregister_reboot_notifier(&wdd->reboot_nb);
> -
> +	unregister_reboot_notifier(&wdd->reboot_nb);
>  	watchdog_dev_unregister(wdd);
>  	ida_simple_remove(&watchdog_ida, wdd->id);
>  }
> -- 
> 2.25.0
>

* Re: [PATCH 1/2] watchdog: Check WDOG_STOP_ON_REBOOT in reboot notifier
From: Dmitry Safonov @ 2020-02-13 20:23 UTC
To: Guenter Roeck
Cc: linux-kernel, Dmitry Safonov, Wim Van Sebroeck, linux-watchdog

Hi Guenter,

On 2/13/20 7:12 PM, Guenter Roeck wrote:
> Does that really have to be decided at runtime, by the user?
> How about doing it with a module parameter?
>
> Also, I am not sure if an ioctl is the best means to do this, if it
> indeed makes sense to decide it at runtime. ioctl implies an open
> watchdog device, which interferes with the watchdog daemon. This means
> that the watchdog daemon would have to be modified to support this,
> making this quite an expensive change. It also implies that the action
> would have to be known when the watchdog daemon is started, suggesting
> that a module parameter should be sufficient.

Yes, fair points.
I went with ioctl() because the timeout can be changed at runtime.
But you're right, I'll look into making it a module parameter instead.

Thanks for the review and your time,
          Dmitry

* [PATCH 2/2] watchdog/uapi: Add WDIOS_{RUN,STOP}_ON_REBOOT
From: Dmitry Safonov @ 2020-02-13 17:59 UTC
To: linux-kernel
Cc: Dmitry Safonov, Dmitry Safonov, Guenter Roeck, Wim Van Sebroeck,
    linux-watchdog

Many watchdog drivers use the watchdog_stop_on_reboot() helper to stop
the watchdog on system reboot. Unfortunately, this logic is coded in
the driver's probe function and doesn't allow the user to decide what
to do during shutdown/reboot.

On the other hand, the Xen and Qemu watchdog drivers (xen_wdt and
i6300esb) may be configured to either send an NMI or turn off/reboot
the VM as the watchdog action. As the kernel may get stuck in any
state, sending NMIs can't reliably reboot the VM.

At Arista, we benefited from the following set-up: the emulated
watchdogs trigger a VM reset and softdog is set to catch less severe
conditions to generate a vmcore. Just before reboot, the watchdog's
timeout is increased to some good-enough value (3 mins). That keeps the
watchdog always running and guarantees that the VM doesn't get stuck.

Provide new WDIOS_RUN_ON_REBOOT and WDIOS_STOP_ON_REBOOT ioctl options
to select the strategy on reboot.

Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 drivers/watchdog/watchdog_dev.c | 12 ++++++++++++
 include/linux/watchdog.h        |  6 ++++++
 include/uapi/linux/watchdog.h   |  3 ++-
 3 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 8b5c742f24e8..c854cd0245db 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -753,6 +753,18 @@ static long watchdog_ioctl(struct file *file, unsigned int cmd,
 		}
 		if (val & WDIOS_ENABLECARD)
 			err = watchdog_start(wdd);
+
+		if (val & WDIOS_RUN_ON_REBOOT) {
+			if (val & WDIOS_STOP_ON_REBOOT) {
+				err = -EINVAL;
+				break;
+			}
+			watchdog_run_on_reboot(wdd);
+			err = 0;
+		} else if (val & WDIOS_STOP_ON_REBOOT) {
+			watchdog_stop_on_reboot(wdd);
+			err = 0;
+		}
 		break;
 	case WDIOC_KEEPALIVE:
 		if (!(wdd->info->options & WDIOF_KEEPALIVEPING)) {
diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
index 417d9f37077a..9e2ca7754631 100644
--- a/include/linux/watchdog.h
+++ b/include/linux/watchdog.h
@@ -150,6 +150,12 @@ static inline void watchdog_stop_on_reboot(struct watchdog_device *wdd)
 	set_bit(WDOG_STOP_ON_REBOOT, &wdd->status);
 }
 
+/* Use the following function to keep the watchdog running on reboot */
+static inline void watchdog_run_on_reboot(struct watchdog_device *wdd)
+{
+	clear_bit(WDOG_STOP_ON_REBOOT, &wdd->status);
+}
+
 /* Use the following function to stop the watchdog when unregistering it */
 static inline void watchdog_stop_on_unregister(struct watchdog_device *wdd)
 {
diff --git a/include/uapi/linux/watchdog.h b/include/uapi/linux/watchdog.h
index b15cde5c9054..bf19a5d3c987 100644
--- a/include/uapi/linux/watchdog.h
+++ b/include/uapi/linux/watchdog.h
@@ -53,6 +53,7 @@ struct watchdog_info {
 #define	WDIOS_DISABLECARD	0x0001	/* Turn off the watchdog timer */
 #define	WDIOS_ENABLECARD	0x0002	/* Turn on the watchdog timer */
 #define	WDIOS_TEMPPANIC		0x0004	/* Kernel panic on temperature trip */
-
+#define	WDIOS_RUN_ON_REBOOT	0x0008	/* Keep watchdog enabled on reboot */
+#define	WDIOS_STOP_ON_REBOOT	0x0010	/* Turn off the watchdog on reboot */
 
 #endif /* _UAPI_LINUX_WATCHDOG_H */
-- 
2.25.0

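Editorial sketch: from userspace, the options proposed above would be
passed through the existing WDIOC_SETOPTIONS ioctl on an open watchdog
device. The following is a minimal illustration under that assumption,
not part of the series. The helpers wdt_reboot_option() and
wdt_set_reboot_policy() are hypothetical names, and the two WDIOS_*
values are copied from this patch because headers without it do not
define them.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/* Values proposed by this patch; defined here in case the headers lack them. */
#ifndef WDIOS_RUN_ON_REBOOT
#define WDIOS_RUN_ON_REBOOT	0x0008
#endif
#ifndef WDIOS_STOP_ON_REBOOT
#define WDIOS_STOP_ON_REBOOT	0x0010
#endif

/*
 * Hypothetical helper: pick exactly one of the two flags, since the
 * patched ioctl rejects both being set at once with -EINVAL.
 */
static int wdt_reboot_option(int keep_running)
{
	return keep_running ? WDIOS_RUN_ON_REBOOT : WDIOS_STOP_ON_REBOOT;
}

/*
 * Hypothetical helper: apply the policy to an already-open watchdog fd.
 * Returns 0 on success or a negative errno value on failure.
 */
static int wdt_set_reboot_policy(int fd, int keep_running)
{
	int options = wdt_reboot_option(keep_running);

	/* WDIOC_SETOPTIONS takes a pointer to an int holding WDIOS_* flags. */
	if (ioctl(fd, WDIOC_SETOPTIONS, &options) < 0)
		return -errno;
	return 0;
}
```

A caller would pass the descriptor it already holds for /dev/watchdog.
A kernel without this series does not know these flags, so callers
should be prepared for the ioctl to fail there. The need for an open
device is also the concern raised in the review of patch 1/2.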