linux-kernel.vger.kernel.org archive mirror
* [RFC 0/2] watchdog: Provide user control over WDOG_STOP_ON_REBOOT
@ 2020-01-21 16:21 Dmitry Safonov
  2020-01-21 16:21 ` [RFC 1/2] watchdog: Check WDOG_STOP_ON_REBOOT in reboot notifier Dmitry Safonov
  2020-01-21 16:21 ` [RFC 2/2] watchdog/uapi: Add WDIOS_{RUN,STOP}_ON_REBOOT Dmitry Safonov
  0 siblings, 2 replies; 3+ messages in thread
From: Dmitry Safonov @ 2020-01-21 16:21 UTC
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Guenter Roeck, Wim Van Sebroeck,
	linux-watchdog

Add WDIOS_RUN_ON_REBOOT and WDIOS_STOP_ON_REBOOT to let userspace control
whether the watchdog keeps running or is stopped on reboot.
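
For illustration, userspace could select the policy roughly as follows.
This is a minimal sketch and not part of the series:
wdt_set_reboot_policy() is a hypothetical helper, and the two WDIOS_*
values are copied from patch 2/2 on the assumption that the installed
uapi headers do not define them yet.

```c
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/watchdog.h>

/* Proposed by this RFC; values taken from patch 2/2 (assumption: the
 * installed <linux/watchdog.h> does not provide them). */
#ifndef WDIOS_RUN_ON_REBOOT
#define WDIOS_RUN_ON_REBOOT	0x0008
#endif
#ifndef WDIOS_STOP_ON_REBOOT
#define WDIOS_STOP_ON_REBOOT	0x0010
#endif

/*
 * Hypothetical helper: ask the watchdog behind @fd to keep running
 * (keep_running != 0) or to stop (keep_running == 0) on reboot.
 * Returns 0 on success or a negative errno value.
 */
static int wdt_set_reboot_policy(int fd, int keep_running)
{
	int opts = keep_running ? WDIOS_RUN_ON_REBOOT : WDIOS_STOP_ON_REBOOT;

	/* WDIOC_SETOPTIONS takes a pointer to an int holding WDIOS_* flags. */
	if (ioctl(fd, WDIOC_SETOPTIONS, &opts) < 0)
		return -errno;

	return 0;
}
```

A caller would open the watchdog device node (for example /dev/watchdog0),
pass the descriptor to the helper and check the return value; a kernel
without this series is expected to ignore the unknown bits rather than
apply them.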

Cc: Guenter Roeck <linux@roeck-us.net>
Cc: Wim Van Sebroeck <wim@linux-watchdog.org>
Cc: linux-watchdog@vger.kernel.org

Dmitry Safonov (2):
  watchdog: Check WDOG_STOP_ON_REBOOT in reboot notifier
  watchdog/uapi: Add WDIOS_{RUN,STOP}_ON_REBOOT

 drivers/watchdog/watchdog_dev.c | 29 ++++++++++++++++++++---------
 include/linux/watchdog.h        |  6 ++++++
 include/uapi/linux/watchdog.h   |  3 ++-
 3 files changed, 28 insertions(+), 10 deletions(-)

-- 
2.25.0



* [RFC 1/2] watchdog: Check WDOG_STOP_ON_REBOOT in reboot notifier
  2020-01-21 16:21 [RFC 0/2] watchdog: Provide user control over WDOG_STOP_ON_REBOOT Dmitry Safonov
@ 2020-01-21 16:21 ` Dmitry Safonov
  2020-01-21 16:21 ` [RFC 2/2] watchdog/uapi: Add WDIOS_{RUN,STOP}_ON_REBOOT Dmitry Safonov
  1 sibling, 0 replies; 3+ messages in thread
From: Dmitry Safonov @ 2020-01-21 16:21 UTC
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Guenter Roeck, Wim Van Sebroeck,
	linux-watchdog

Many watchdog drivers use the watchdog_stop_on_reboot() helper to stop
the watchdog on system reboot. Unfortunately, this logic is coded in the
driver's probe function and doesn't allow the user to decide what to do
during shutdown/reboot.

On the other hand, the Xen and QEMU watchdog drivers (xen_wdt and
i6300esb) may be configured to either send an NMI or power off/reboot
the VM as the watchdog action. As the kernel may get stuck in any state,
sending NMIs can't reliably reboot the VM.

At Arista, we benefited from the following setup: the emulated watchdogs
trigger a VM reset, and softdog is set to catch less severe conditions
and generate a vmcore. Just before reboot, the watchdog's timeout is
increased to some good-enough value (3 mins). That keeps the watchdog
always running and guarantees that the VM doesn't get stuck.

As preparation for moving the decision of whether to stop the watchdog
on reboot to userspace, allow WDOG_STOP_ON_REBOOT to be set at runtime,
not only during driver probing. Always register the reboot notifier and
check WDOG_STOP_ON_REBOOT inside it (on actual reboot).

Signed-off-by: Dmitry Safonov <dima@arista.com>
---
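Not for the commit log: the effect of checking the flag at notification
time can be modelled in plain C as below. This is only a sketch under
stated assumptions: struct wdt_model, the MODEL_* names and
model_reboot_notify() are hypothetical stand-ins, and the model assumes
that the notifier stops an active watchdog on SYS_DOWN/SYS_HALT.

```c
#include <stdbool.h>

#define MODEL_STOP_ON_REBOOT_BIT	0	/* stand-in for WDOG_STOP_ON_REBOOT */

enum model_code { MODEL_SYS_DOWN, MODEL_SYS_HALT, MODEL_SYS_POWER_OFF };

/* Hypothetical, reduced view of a watchdog device. */
struct wdt_model {
	unsigned long status;	/* flag bits, may change after "probe" */
	bool active;		/* is the watchdog currently running? */
};

/*
 * Model of the reboot notifier: the notifier is always called, and the
 * flag is evaluated at reboot time. Returns true if the watchdog was
 * stopped by this call.
 */
static bool model_reboot_notify(struct wdt_model *w, enum model_code code)
{
	if (!(w->status & (1UL << MODEL_STOP_ON_REBOOT_BIT)))
		return false;	/* keep running across reboot */

	if (code != MODEL_SYS_DOWN && code != MODEL_SYS_HALT)
		return false;

	if (!w->active)
		return false;

	w->active = false;
	return true;
}
```

Because the bit is read inside the notifier, setting or clearing it
after registration changes the outcome of the next reboot, which is
what the following patch relies on.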
 drivers/watchdog/watchdog_dev.c | 19 ++++++++++---------
 1 file changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 4b2a85438478..8766dd93028f 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -1103,6 +1103,10 @@ static int watchdog_reboot_notifier(struct notifier_block *nb,
 	struct watchdog_device *wdd;
 
 	wdd = container_of(nb, struct watchdog_device, reboot_nb);
+
+	if (!test_bit(WDOG_STOP_ON_REBOOT, &wdd->status))
+		return NOTIFY_DONE;
+
 	if (code == SYS_DOWN || code == SYS_HALT) {
 		if (watchdog_active(wdd)) {
 			int ret;
@@ -1139,16 +1143,13 @@ int watchdog_dev_register(struct watchdog_device *wdd)
 		return ret;
 	}
 
-	if (test_bit(WDOG_STOP_ON_REBOOT, &wdd->status)) {
-		wdd->reboot_nb.notifier_call = watchdog_reboot_notifier;
+	wdd->reboot_nb.notifier_call = watchdog_reboot_notifier;
 
-		ret = devm_register_reboot_notifier(&wdd->wd_data->dev,
-						    &wdd->reboot_nb);
-		if (ret) {
-			pr_err("watchdog%d: Cannot register reboot notifier (%d)\n",
-			       wdd->id, ret);
-			watchdog_dev_unregister(wdd);
-		}
+	ret = devm_register_reboot_notifier(&wdd->wd_data->dev, &wdd->reboot_nb);
+	if (ret) {
+		pr_err("watchdog%d: Cannot register reboot notifier (%d)\n",
+		       wdd->id, ret);
+		watchdog_dev_unregister(wdd);
 	}
 
 	return ret;
-- 
2.25.0



* [RFC 2/2] watchdog/uapi: Add WDIOS_{RUN,STOP}_ON_REBOOT
  2020-01-21 16:21 [RFC 0/2] watchdog: Provide user control over WDOG_STOP_ON_REBOOT Dmitry Safonov
  2020-01-21 16:21 ` [RFC 1/2] watchdog: Check WDOG_STOP_ON_REBOOT in reboot notifier Dmitry Safonov
@ 2020-01-21 16:21 ` Dmitry Safonov
  1 sibling, 0 replies; 3+ messages in thread
From: Dmitry Safonov @ 2020-01-21 16:21 UTC
  To: linux-kernel
  Cc: Dmitry Safonov, Dmitry Safonov, Guenter Roeck, Wim Van Sebroeck,
	linux-watchdog

Many watchdog drivers use the watchdog_stop_on_reboot() helper to stop
the watchdog on system reboot. Unfortunately, this logic is coded in the
driver's probe function and doesn't allow the user to decide what to do
during shutdown/reboot.

On the other hand, the Xen and QEMU watchdog drivers (xen_wdt and
i6300esb) may be configured to either send an NMI or power off/reboot
the VM as the watchdog action. As the kernel may get stuck in any state,
sending NMIs can't reliably reboot the VM.

At Arista, we benefited from the following setup: the emulated watchdogs
trigger a VM reset, and softdog is set to catch less severe conditions
and generate a vmcore. Just before reboot, the watchdog's timeout is
increased to some good-enough value (3 mins). That keeps the watchdog
always running and guarantees that the VM doesn't get stuck.

Provide new WDIOS_RUN_ON_REBOOT and WDIOS_STOP_ON_REBOOT options for the
WDIOC_SETOPTIONS ioctl to select the watchdog's behavior on reboot.
Setting both options in one call is rejected with -EINVAL.

Signed-off-by: Dmitry Safonov <dima@arista.com>
---
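Not for the commit log: the option handling added to the ioctl can be
summarised by the small model below. It is a sketch only; the MODEL_*
names and model_set_reboot_opts() are hypothetical, with the flag values
copied from the uapi hunk of this patch.

```c
#include <errno.h>

#define MODEL_RUN_ON_REBOOT	0x0008	/* mirrors WDIOS_RUN_ON_REBOOT */
#define MODEL_STOP_ON_REBOOT	0x0010	/* mirrors WDIOS_STOP_ON_REBOOT */
#define MODEL_STOP_BIT		0	/* stand-in for WDOG_STOP_ON_REBOOT */

/*
 * Apply the reboot-related option bits in @val to @status.
 * Returns 0 on success, -EINVAL if both options are requested at once.
 * With neither option set, @status is left untouched.
 */
static int model_set_reboot_opts(unsigned long *status, int val)
{
	if ((val & MODEL_RUN_ON_REBOOT) && (val & MODEL_STOP_ON_REBOOT))
		return -EINVAL;

	if (val & MODEL_RUN_ON_REBOOT)
		*status &= ~(1UL << MODEL_STOP_BIT);
	else if (val & MODEL_STOP_ON_REBOOT)
		*status |= 1UL << MODEL_STOP_BIT;

	return 0;
}
```

Using two separate options rather than a single toggle means that a
call which sets neither of them keeps the driver's default.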
 drivers/watchdog/watchdog_dev.c | 10 ++++++++++
 include/linux/watchdog.h        |  6 ++++++
 include/uapi/linux/watchdog.h   |  3 ++-
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 8766dd93028f..00135e698946 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -754,6 +754,16 @@ static long watchdog_ioctl(struct file *file, unsigned int cmd,
 		}
 		if (val & WDIOS_ENABLECARD)
 			err = watchdog_start(wdd);
+
+		if (val & WDIOS_RUN_ON_REBOOT) {
+			if (val & WDIOS_STOP_ON_REBOOT) {
+				err = -EINVAL;
+				break;
+			}
+			watchdog_run_on_reboot(wdd);
+		} else if (val & WDIOS_STOP_ON_REBOOT) {
+			watchdog_stop_on_reboot(wdd);
+		}
 		break;
 	case WDIOC_KEEPALIVE:
 		if (!(wdd->info->options & WDIOF_KEEPALIVEPING)) {
diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
index 417d9f37077a..9e2ca7754631 100644
--- a/include/linux/watchdog.h
+++ b/include/linux/watchdog.h
@@ -150,6 +150,12 @@ static inline void watchdog_stop_on_reboot(struct watchdog_device *wdd)
 	set_bit(WDOG_STOP_ON_REBOOT, &wdd->status);
 }
 
+/* Use the following function to keep the watchdog running on reboot */
+static inline void watchdog_run_on_reboot(struct watchdog_device *wdd)
+{
+	clear_bit(WDOG_STOP_ON_REBOOT, &wdd->status);
+}
+
 /* Use the following function to stop the watchdog when unregistering it */
 static inline void watchdog_stop_on_unregister(struct watchdog_device *wdd)
 {
diff --git a/include/uapi/linux/watchdog.h b/include/uapi/linux/watchdog.h
index b15cde5c9054..bf19a5d3c987 100644
--- a/include/uapi/linux/watchdog.h
+++ b/include/uapi/linux/watchdog.h
@@ -53,6 +53,7 @@ struct watchdog_info {
 #define	WDIOS_DISABLECARD	0x0001	/* Turn off the watchdog timer */
 #define	WDIOS_ENABLECARD	0x0002	/* Turn on the watchdog timer */
 #define	WDIOS_TEMPPANIC		0x0004	/* Kernel panic on temperature trip */
-
+#define	WDIOS_RUN_ON_REBOOT	0x0008	/* Keep watchdog enabled on reboot */
+#define	WDIOS_STOP_ON_REBOOT	0x0010	/* Turn off the watchdog on reboot */
 
 #endif /* _UAPI_LINUX_WATCHDOG_H */
-- 
2.25.0

