linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/8] watchdog: Add support for keepalives triggered by infrastructure
@ 2015-08-08  5:02 Guenter Roeck
  2015-08-08  5:02 ` [PATCH v2 1/8] watchdog: watchdog_dev: Use single variable name for struct watchdog_device Guenter Roeck
                   ` (7 more replies)
  0 siblings, 8 replies; 17+ messages in thread
From: Guenter Roeck @ 2015-08-08  5:02 UTC (permalink / raw)
  To: linux-watchdog
  Cc: Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	Uwe Kleine-König, linux-doc, Jonathan Corbet, Guenter Roeck

The watchdog infrastructure is currently purely passive, meaning
it only passes information from user space to drivers and vice versa.

Since watchdog hardware tends to have its own quirks, this can result
in quite complex watchdog drivers. A number of scanarios are especially common.

- A watchdog is always active and can not be disabled, or can not be disabled
  once enabled. To support such hardware, watchdog drivers have to implement
  their own timers and use those timers to trigger watchdog keepalives while
  the watchdog device is not or not yet opened.
- A variant of this is the desire to enable a watchdog as soon as its driver
  has been instantiated, to protect the system while it is still booting up,
  but the watchdog daemon is not yet running.
- Some watchdogs have a very short maximum timeout, in the range of just a few
  seconds. Such low timeouts are difficult if not impossible to support from
  user space. Drivers supporting such watchdog hardware need to implement
  a timer function to augment heartbeats from user space.

This patch set solves the above problems while keeping changes to the
watchdog core minimal.

- A new status flag, WDOG_RUNNING, informs the watchdog subsystem that a
  watchdog is running, and that the watchdog subsystem needs to generate
  heartbeat requests while the associated watchdog device is closed.
- A new parameter in the watchdog data structure, max_hw_timeout_ms, informs
  the watchdog subsystem about a maximum hardware timeout. The watchdog
  subsystem uses this information together with the configured timeout
  and the maximum permitted timeout to determine if it needs to generate
  additional heartbeat requests.

As part of this patchset, the semantics of the 'timeout' variable and of
the WDOG_ACTIVE flag are changed slightly.

Per the current watchdog kernel API, the 'timeout' variable is supposed
to reflect the actual hardware watcdog timeout. WDOG_ACTIVE is supposed
to reflect if the hardware watchdog is running or not.

Unfortunately, this does not always reflect reality. In drivers which solve
the above mentioned problems internally, 'timeout' is the watchdog timeout
as seen from user space, and WDOG_ACTIVE reflects that user space is expected
to send keepalive requests to the watchdog driver.

After this patch set is applied, this so far inofficial interpretation
is the 'official' semantics for the timeout variable and the WDOG_ACTIVE
flag. In other words, both values no longer reflect the hardware watchdog
status, but its status as seen from user space.

Patch #1 is a preparatory patch.

Patch #2 adds timer functionality to the watchdog core. It solves the problem
of short maximum hardware timeouts by augmenting heartbeats triggered from
user space with internally triggered heartbeats.

Patch #3 adds functionality to generate heartbeats while the watchdog device is
closed. It handles situation where where the watchdog is running after
the driver has been instantiated, but the device is not yet opened,
and post-close situations necessary if a watchdog can not be stopped.

Patch #4 makes the set_timeout function optional. This is now possible since
timeout changes can now be completely handled in the watchdog core, for
example if the hardware watchdog timeout is fixed.

Patch #5 to #8 are example conversions of some watchdog drivers.
Those patches will require testing.

The patch set is also available in branch watchdog-timer of
git://git.kernel.org/pub/scm/linux/kernel/git/groeck/linux-staging.git.

This patch set does not solve all limitations of the watchdog subsystem.
Specifically, it does not add support for the following features.

- It is desirable to be able to specify a maximum early timeout,
  from booting the system to opening the watchdog device.
- Some watchdogs may require a minimum period of time between
  heartbeats. Examples are DA9062 and possibly AT91SAM9x.

This and other features will be addressed with subsequent patches.

The patch set is inspired by an earlier patch set from Timo Kokonnen.

v2:
- Rebased to v4.2-rc5
- Improved and hopefully clarified documentation.
- Rearranged variables in struct watchdog_device such that internal variables
  come last.
- The code now ensures that the watchdog times out <timeout> seconds after
  the most recent keepalive sent from user space.
- The internal keepalive now stops silently and no longer generates a
  warning message. Reason is that it will now stop early, while there
  may still be a substantial amount of time for keepalives from user space
  to arrive. If such keepalives arrive late (for example if user space
  is configured to send keepalives just a few seconds before the watchdog
  times out), the message would just be noise and not provide any value.

^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v2 1/8] watchdog: watchdog_dev: Use single variable name for struct watchdog_device
  2015-08-08  5:02 [PATCH v2 0/8] watchdog: Add support for keepalives triggered by infrastructure Guenter Roeck
@ 2015-08-08  5:02 ` Guenter Roeck
  2015-08-08  5:02 ` [PATCH v2 2/8] watchdog: Introduce hardware maximum timeout in watchdog core Guenter Roeck
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Guenter Roeck @ 2015-08-08  5:02 UTC (permalink / raw)
  To: linux-watchdog
  Cc: Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	Uwe Kleine-König, linux-doc, Jonathan Corbet, Guenter Roeck

The current code uses 'wdd', wddev', and 'watchdog' as variable names
for struct watchdog_device. This is confusing and makes it difficult
to enhance the code. Replace it all with 'wdd'.

Cc: Timo Kokkonen <timo.kokkonen@offcode.fi>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>

---
v2: No changes
---
 drivers/watchdog/watchdog_dev.c | 151 ++++++++++++++++++++--------------------
 1 file changed, 75 insertions(+), 76 deletions(-)

diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 6aaefbad303e..06171c73daf5 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -51,7 +51,7 @@ static struct watchdog_device *old_wdd;
 
 /*
  *	watchdog_ping: ping the watchdog.
- *	@wddev: the watchdog device to ping
+ *	@wdd: the watchdog device to ping
  *
  *	If the watchdog has no own ping operation then it needs to be
  *	restarted via the start operation. This wrapper function does
@@ -59,65 +59,65 @@ static struct watchdog_device *old_wdd;
  *	We only ping when the watchdog device is running.
  */
 
-static int watchdog_ping(struct watchdog_device *wddev)
+static int watchdog_ping(struct watchdog_device *wdd)
 {
 	int err = 0;
 
-	mutex_lock(&wddev->lock);
+	mutex_lock(&wdd->lock);
 
-	if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
+	if (test_bit(WDOG_UNREGISTERED, &wdd->status)) {
 		err = -ENODEV;
 		goto out_ping;
 	}
 
-	if (!watchdog_active(wddev))
+	if (!watchdog_active(wdd))
 		goto out_ping;
 
-	if (wddev->ops->ping)
-		err = wddev->ops->ping(wddev);  /* ping the watchdog */
+	if (wdd->ops->ping)
+		err = wdd->ops->ping(wdd);	/* ping the watchdog */
 	else
-		err = wddev->ops->start(wddev); /* restart watchdog */
+		err = wdd->ops->start(wdd);	/* restart watchdog */
 
 out_ping:
-	mutex_unlock(&wddev->lock);
+	mutex_unlock(&wdd->lock);
 	return err;
 }
 
 /*
  *	watchdog_start: wrapper to start the watchdog.
- *	@wddev: the watchdog device to start
+ *	@wdd: the watchdog device to start
  *
  *	Start the watchdog if it is not active and mark it active.
  *	This function returns zero on success or a negative errno code for
  *	failure.
  */
 
-static int watchdog_start(struct watchdog_device *wddev)
+static int watchdog_start(struct watchdog_device *wdd)
 {
 	int err = 0;
 
-	mutex_lock(&wddev->lock);
+	mutex_lock(&wdd->lock);
 
-	if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
+	if (test_bit(WDOG_UNREGISTERED, &wdd->status)) {
 		err = -ENODEV;
 		goto out_start;
 	}
 
-	if (watchdog_active(wddev))
+	if (watchdog_active(wdd))
 		goto out_start;
 
-	err = wddev->ops->start(wddev);
+	err = wdd->ops->start(wdd);
 	if (err == 0)
-		set_bit(WDOG_ACTIVE, &wddev->status);
+		set_bit(WDOG_ACTIVE, &wdd->status);
 
 out_start:
-	mutex_unlock(&wddev->lock);
+	mutex_unlock(&wdd->lock);
 	return err;
 }
 
 /*
  *	watchdog_stop: wrapper to stop the watchdog.
- *	@wddev: the watchdog device to stop
+ *	@wdd: the watchdog device to stop
  *
  *	Stop the watchdog if it is still active and unmark it active.
  *	This function returns zero on success or a negative errno code for
@@ -125,155 +125,154 @@ out_start:
  *	If the 'nowayout' feature was set, the watchdog cannot be stopped.
  */
 
-static int watchdog_stop(struct watchdog_device *wddev)
+static int watchdog_stop(struct watchdog_device *wdd)
 {
 	int err = 0;
 
-	mutex_lock(&wddev->lock);
+	mutex_lock(&wdd->lock);
 
-	if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
+	if (test_bit(WDOG_UNREGISTERED, &wdd->status)) {
 		err = -ENODEV;
 		goto out_stop;
 	}
 
-	if (!watchdog_active(wddev))
+	if (!watchdog_active(wdd))
 		goto out_stop;
 
-	if (test_bit(WDOG_NO_WAY_OUT, &wddev->status)) {
-		dev_info(wddev->dev, "nowayout prevents watchdog being stopped!\n");
+	if (test_bit(WDOG_NO_WAY_OUT, &wdd->status)) {
+		dev_info(wdd->dev, "nowayout prevents watchdog being stopped!\n");
 		err = -EBUSY;
 		goto out_stop;
 	}
 
-	err = wddev->ops->stop(wddev);
+	err = wdd->ops->stop(wdd);
 	if (err == 0)
-		clear_bit(WDOG_ACTIVE, &wddev->status);
+		clear_bit(WDOG_ACTIVE, &wdd->status);
 
 out_stop:
-	mutex_unlock(&wddev->lock);
+	mutex_unlock(&wdd->lock);
 	return err;
 }
 
 /*
  *	watchdog_get_status: wrapper to get the watchdog status
- *	@wddev: the watchdog device to get the status from
+ *	@wdd: the watchdog device to get the status from
  *	@status: the status of the watchdog device
  *
  *	Get the watchdog's status flags.
  */
 
-static int watchdog_get_status(struct watchdog_device *wddev,
+static int watchdog_get_status(struct watchdog_device *wdd,
 							unsigned int *status)
 {
 	int err = 0;
 
 	*status = 0;
-	if (!wddev->ops->status)
+	if (!wdd->ops->status)
 		return -EOPNOTSUPP;
 
-	mutex_lock(&wddev->lock);
+	mutex_lock(&wdd->lock);
 
-	if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
+	if (test_bit(WDOG_UNREGISTERED, &wdd->status)) {
 		err = -ENODEV;
 		goto out_status;
 	}
 
-	*status = wddev->ops->status(wddev);
+	*status = wdd->ops->status(wdd);
 
 out_status:
-	mutex_unlock(&wddev->lock);
+	mutex_unlock(&wdd->lock);
 	return err;
 }
 
 /*
  *	watchdog_set_timeout: set the watchdog timer timeout
- *	@wddev: the watchdog device to set the timeout for
+ *	@wdd: the watchdog device to set the timeout for
  *	@timeout: timeout to set in seconds
  */
 
-static int watchdog_set_timeout(struct watchdog_device *wddev,
+static int watchdog_set_timeout(struct watchdog_device *wdd,
 							unsigned int timeout)
 {
 	int err;
 
-	if ((wddev->ops->set_timeout == NULL) ||
-	    !(wddev->info->options & WDIOF_SETTIMEOUT))
+	if (!wdd->ops->set_timeout || !(wdd->info->options & WDIOF_SETTIMEOUT))
 		return -EOPNOTSUPP;
 
-	if (watchdog_timeout_invalid(wddev, timeout))
+	if (watchdog_timeout_invalid(wdd, timeout))
 		return -EINVAL;
 
-	mutex_lock(&wddev->lock);
+	mutex_lock(&wdd->lock);
 
-	if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
+	if (test_bit(WDOG_UNREGISTERED, &wdd->status)) {
 		err = -ENODEV;
 		goto out_timeout;
 	}
 
-	err = wddev->ops->set_timeout(wddev, timeout);
+	err = wdd->ops->set_timeout(wdd, timeout);
 
 out_timeout:
-	mutex_unlock(&wddev->lock);
+	mutex_unlock(&wdd->lock);
 	return err;
 }
 
 /*
  *	watchdog_get_timeleft: wrapper to get the time left before a reboot
- *	@wddev: the watchdog device to get the remaining time from
+ *	@wdd: the watchdog device to get the remaining time from
  *	@timeleft: the time that's left
  *
  *	Get the time before a watchdog will reboot (if not pinged).
  */
 
-static int watchdog_get_timeleft(struct watchdog_device *wddev,
+static int watchdog_get_timeleft(struct watchdog_device *wdd,
 							unsigned int *timeleft)
 {
 	int err = 0;
 
 	*timeleft = 0;
-	if (!wddev->ops->get_timeleft)
+	if (!wdd->ops->get_timeleft)
 		return -EOPNOTSUPP;
 
-	mutex_lock(&wddev->lock);
+	mutex_lock(&wdd->lock);
 
-	if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
+	if (test_bit(WDOG_UNREGISTERED, &wdd->status)) {
 		err = -ENODEV;
 		goto out_timeleft;
 	}
 
-	*timeleft = wddev->ops->get_timeleft(wddev);
+	*timeleft = wdd->ops->get_timeleft(wdd);
 
 out_timeleft:
-	mutex_unlock(&wddev->lock);
+	mutex_unlock(&wdd->lock);
 	return err;
 }
 
 /*
  *	watchdog_ioctl_op: call the watchdog drivers ioctl op if defined
- *	@wddev: the watchdog device to do the ioctl on
+ *	@wdd: the watchdog device to do the ioctl on
  *	@cmd: watchdog command
  *	@arg: argument pointer
  */
 
-static int watchdog_ioctl_op(struct watchdog_device *wddev, unsigned int cmd,
+static int watchdog_ioctl_op(struct watchdog_device *wdd, unsigned int cmd,
 							unsigned long arg)
 {
 	int err;
 
-	if (!wddev->ops->ioctl)
+	if (!wdd->ops->ioctl)
 		return -ENOIOCTLCMD;
 
-	mutex_lock(&wddev->lock);
+	mutex_lock(&wdd->lock);
 
-	if (test_bit(WDOG_UNREGISTERED, &wddev->status)) {
+	if (test_bit(WDOG_UNREGISTERED, &wdd->status)) {
 		err = -ENODEV;
 		goto out_ioctl;
 	}
 
-	err = wddev->ops->ioctl(wddev, cmd, arg);
+	err = wdd->ops->ioctl(wdd, cmd, arg);
 
 out_ioctl:
-	mutex_unlock(&wddev->lock);
+	mutex_unlock(&wdd->lock);
 	return err;
 }
 
@@ -513,43 +512,43 @@ static struct miscdevice watchdog_miscdev = {
 
 /*
  *	watchdog_dev_register: register a watchdog device
- *	@watchdog: watchdog device
+ *	@wdd: watchdog device
  *
  *	Register a watchdog device including handling the legacy
  *	/dev/watchdog node. /dev/watchdog is actually a miscdevice and
  *	thus we set it up like that.
  */
 
-int watchdog_dev_register(struct watchdog_device *watchdog)
+int watchdog_dev_register(struct watchdog_device *wdd)
 {
 	int err, devno;
 
-	if (watchdog->id == 0) {
-		old_wdd = watchdog;
-		watchdog_miscdev.parent = watchdog->parent;
+	if (wdd->id == 0) {
+		old_wdd = wdd;
+		watchdog_miscdev.parent = wdd->parent;
 		err = misc_register(&watchdog_miscdev);
 		if (err != 0) {
 			pr_err("%s: cannot register miscdev on minor=%d (err=%d).\n",
-				watchdog->info->identity, WATCHDOG_MINOR, err);
+				wdd->info->identity, WATCHDOG_MINOR, err);
 			if (err == -EBUSY)
 				pr_err("%s: a legacy watchdog module is probably present.\n",
-					watchdog->info->identity);
+					wdd->info->identity);
 			old_wdd = NULL;
 			return err;
 		}
 	}
 
 	/* Fill in the data structures */
-	devno = MKDEV(MAJOR(watchdog_devt), watchdog->id);
-	cdev_init(&watchdog->cdev, &watchdog_fops);
-	watchdog->cdev.owner = watchdog->ops->owner;
+	devno = MKDEV(MAJOR(watchdog_devt), wdd->id);
+	cdev_init(&wdd->cdev, &watchdog_fops);
+	wdd->cdev.owner = wdd->ops->owner;
 
 	/* Add the device */
-	err  = cdev_add(&watchdog->cdev, devno, 1);
+	err  = cdev_add(&wdd->cdev, devno, 1);
 	if (err) {
 		pr_err("watchdog%d unable to add device %d:%d\n",
-			watchdog->id,  MAJOR(watchdog_devt), watchdog->id);
-		if (watchdog->id == 0) {
+			wdd->id,  MAJOR(watchdog_devt), wdd->id);
+		if (wdd->id == 0) {
 			misc_deregister(&watchdog_miscdev);
 			old_wdd = NULL;
 		}
@@ -564,14 +563,14 @@ int watchdog_dev_register(struct watchdog_device *watchdog)
  *	Unregister the watchdog and if needed the legacy /dev/watchdog device.
  */
 
-int watchdog_dev_unregister(struct watchdog_device *watchdog)
+int watchdog_dev_unregister(struct watchdog_device *wdd)
 {
-	mutex_lock(&watchdog->lock);
-	set_bit(WDOG_UNREGISTERED, &watchdog->status);
-	mutex_unlock(&watchdog->lock);
+	mutex_lock(&wdd->lock);
+	set_bit(WDOG_UNREGISTERED, &wdd->status);
+	mutex_unlock(&wdd->lock);
 
-	cdev_del(&watchdog->cdev);
-	if (watchdog->id == 0) {
+	cdev_del(&wdd->cdev);
+	if (wdd->id == 0) {
 		misc_deregister(&watchdog_miscdev);
 		old_wdd = NULL;
 	}
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 2/8] watchdog: Introduce hardware maximum timeout in watchdog core
  2015-08-08  5:02 [PATCH v2 0/8] watchdog: Add support for keepalives triggered by infrastructure Guenter Roeck
  2015-08-08  5:02 ` [PATCH v2 1/8] watchdog: watchdog_dev: Use single variable name for struct watchdog_device Guenter Roeck
@ 2015-08-08  5:02 ` Guenter Roeck
  2015-08-13 21:13   ` Uwe Kleine-König
  2015-08-14 11:23   ` Uwe Kleine-König
  2015-08-08  5:02 ` [PATCH v2 3/8] watchdog: Introduce WDOG_RUNNING flag Guenter Roeck
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 17+ messages in thread
From: Guenter Roeck @ 2015-08-08  5:02 UTC (permalink / raw)
  To: linux-watchdog
  Cc: Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	Uwe Kleine-König, linux-doc, Jonathan Corbet, Guenter Roeck

Introduce an optional hardware maximum timeout in the watchdog core.
The hardware maximum timeout can be lower than the maximum timeout.

Drivers can set the maximum hardware timeout value in the watchdog data
structure. If the configured timeout exceeds the maximum hardware timeout,
the watchdog core enables a timer function to assist sending keepalive
requests to the watchdog driver.

Cc: Timo Kokkonen <timo.kokkonen@offcode.fi>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
v2:
- Improved and hopefully clarified documentation.
- Rearranged variables in struct watchdog_device such that internal variables
  come last.
- The code now ensures that the watchdog times out <timeout> seconds after
  the most recent keepalive sent from user space.
- The internal keepalive now stops silently and no longer generates a
  warning message. Reason is that it will now stop early, while there
  may still be a substantial amount of time for keepalives from user space
  to arrive. If such keepalives arrive late (for example if user space
  is configured to send keepalives just a few seconds before the watchdog
  times out), the message would just be noise and not provide any value.
---
 Documentation/watchdog/watchdog-kernel-api.txt |  23 +++-
 drivers/watchdog/watchdog_dev.c                | 140 ++++++++++++++++++++++---
 include/linux/watchdog.h                       |  26 +++--
 3 files changed, 163 insertions(+), 26 deletions(-)

diff --git a/Documentation/watchdog/watchdog-kernel-api.txt b/Documentation/watchdog/watchdog-kernel-api.txt
index d8b0d3367706..25b00b878a7b 100644
--- a/Documentation/watchdog/watchdog-kernel-api.txt
+++ b/Documentation/watchdog/watchdog-kernel-api.txt
@@ -53,9 +53,12 @@ struct watchdog_device {
 	unsigned int timeout;
 	unsigned int min_timeout;
 	unsigned int max_timeout;
+	unsigned int max_hw_timeout_ms;
 	void *driver_data;
-	struct mutex lock;
 	unsigned long status;
+	struct mutex lock;
+	unsigned long last_keepalive;
+	struct delayed_work work;
 	struct list_head deferred;
 };
 
@@ -73,18 +76,28 @@ It contains following fields:
   additional information about the watchdog timer itself. (Like it's unique name)
 * ops: a pointer to the list of watchdog operations that the watchdog supports.
 * timeout: the watchdog timer's timeout value (in seconds).
+  This is the time after which the system will reboot if user space does
+  not send a heartbeat request if WDOG_ACTIVE is set.
 * min_timeout: the watchdog timer's minimum timeout value (in seconds).
 * max_timeout: the watchdog timer's maximum timeout value (in seconds).
+* max_hw_timeout_ms: Maximum hardware timeout, in milli-seconds. May differ
+  from max_timeout. If set to a value larger than max_timeout, the
+  infrastructure will send a heartbeat to the watchdog driver if 'timeout'
+  is larger than 'max_hw_timeout / 2', unless WDOG_ACTIVE is set and user
+  space failed to send a heartbeat for at least 'timeout' seconds.
 * bootstatus: status of the device after booting (reported with watchdog
   WDIOF_* status bits).
 * driver_data: a pointer to the drivers private data of a watchdog device.
   This data should only be accessed via the watchdog_set_drvdata and
   watchdog_get_drvdata routines.
-* lock: Mutex for WatchDog Timer Driver Core internal use only.
 * status: this field contains a number of status bits that give extra
   information about the status of the device (Like: is the watchdog timer
   running/active, is the nowayout bit set, is the device opened via
   the /dev/watchdog interface or not, ...).
+* lock: Mutex for WatchDog Timer Driver Core internal use only.
+* last_keepalive: Time of most recent keepalive triggered from user space,
+  in jiffies.
+* work: Worker data structure for WatchDog Timer Driver Core internal use only.
 * deferred: entry in wtd_deferred_reg_list which is used to
   register early initialized watchdogs.
 
@@ -160,7 +173,11 @@ they are supported. These optional routines/operations are:
   and -EIO for "could not write value to the watchdog". On success this
   routine should set the timeout value of the watchdog_device to the
   achieved timeout value (which may be different from the requested one
-  because the watchdog does not necessarily has a 1 second resolution).
+  because the watchdog does not necessarily have a 1 second resolution).
+  Drivers implementing hw_max_timeout_ms set the hardware watchdog timeout
+  to the minimum of timeout and hw_max_timeout_ms. Those drivers set the
+  timeout value of the watchdog_device either to the requested timeout value
+  (if it is larger than hw_max_timeout_ms), or to the achieved timeout value.
   (Note: the WDIOF_SETTIMEOUT needs to be set in the options field of the
   watchdog's info structure).
 * get_timeleft: this routines returns the time that's left before a reset.
diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 06171c73daf5..c04ba1a98cc8 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -37,7 +37,9 @@
 #include <linux/errno.h>	/* For the -ENODEV/... values */
 #include <linux/kernel.h>	/* For printk/panic/... */
 #include <linux/fs.h>		/* For file operations */
+#include <linux/jiffies.h>	/* For timeout functions */
 #include <linux/watchdog.h>	/* For watchdog specific items */
+#include <linux/workqueue.h>	/* For workqueue */
 #include <linux/miscdevice.h>	/* For handling misc devices */
 #include <linux/init.h>		/* For __init/__exit/... */
 #include <linux/uaccess.h>	/* For copy_to_user/put_user/... */
@@ -49,6 +51,78 @@ static dev_t watchdog_devt;
 /* the watchdog device behind /dev/watchdog */
 static struct watchdog_device *old_wdd;
 
+static struct workqueue_struct *watchdog_wq;
+
+static inline bool watchdog_need_worker(struct watchdog_device *wdd)
+{
+	unsigned int hm = wdd->max_hw_timeout_ms;
+	unsigned int m = wdd->max_timeout * 1000;
+	unsigned int t = wdd->timeout * 1000;
+
+	return watchdog_active(wdd) && hm && (!m || hm < m) && t > hm;
+}
+
+static unsigned long watchdog_next_keepalive(struct watchdog_device *wdd)
+{
+	unsigned int t = wdd->timeout * 1000;
+	unsigned long to = wdd->last_keepalive + msecs_to_jiffies(t);
+	unsigned long last, max_t, tj;
+
+	if (wdd->max_hw_timeout_ms && t > wdd->max_hw_timeout_ms)
+		t = wdd->max_hw_timeout_ms;
+
+	tj = msecs_to_jiffies(t / 2);
+
+	/*
+	 * Ensure that the watchdog times out wdd->timeout seconds
+	 * after the most recent keepalive sent from user space.
+	 */
+	last = to - msecs_to_jiffies(t);
+	if (time_is_after_jiffies(last))
+		max_t = last - jiffies;
+	else
+		max_t = 0;
+
+	if (max_t < tj)
+		tj = max_t;
+
+	return tj;
+}
+
+static inline void watchdog_update_worker(struct watchdog_device *wdd,
+					  bool cancel, bool sync)
+{
+	if (watchdog_need_worker(wdd)) {
+		unsigned long t = watchdog_next_keepalive(wdd);
+
+		if (t)
+			mod_delayed_work(watchdog_wq, &wdd->work, t);
+	} else if (cancel) {
+		if (sync)
+			cancel_delayed_work_sync(&wdd->work);
+		else
+			cancel_delayed_work(&wdd->work);
+	}
+}
+
+static int _watchdog_ping(struct watchdog_device *wdd)
+{
+	int err;
+
+	if (test_bit(WDOG_UNREGISTERED, &wdd->status))
+		return -ENODEV;
+
+	if (!watchdog_active(wdd))
+		return 0;
+
+	if (wdd->ops->ping)
+		err = wdd->ops->ping(wdd);  /* ping the watchdog */
+	else
+		err = wdd->ops->start(wdd); /* restart watchdog */
+
+	return err;
+}
+
 /*
  *	watchdog_ping: ping the watchdog.
  *	@wdd: the watchdog device to ping
@@ -61,26 +135,27 @@ static struct watchdog_device *old_wdd;
 
 static int watchdog_ping(struct watchdog_device *wdd)
 {
-	int err = 0;
+	int err;
 
 	mutex_lock(&wdd->lock);
+	err = _watchdog_ping(wdd);
+	wdd->last_keepalive = jiffies;
+	watchdog_update_worker(wdd, false, false);
+	mutex_unlock(&wdd->lock);
 
-	if (test_bit(WDOG_UNREGISTERED, &wdd->status)) {
-		err = -ENODEV;
-		goto out_ping;
-	}
+	return err;
+}
 
-	if (!watchdog_active(wdd))
-		goto out_ping;
+static void watchdog_ping_work(struct work_struct *work)
+{
+	struct watchdog_device *wdd;
 
-	if (wdd->ops->ping)
-		err = wdd->ops->ping(wdd);	/* ping the watchdog */
-	else
-		err = wdd->ops->start(wdd);	/* restart watchdog */
+	wdd = container_of(to_delayed_work(work), struct watchdog_device, work);
 
-out_ping:
+	mutex_lock(&wdd->lock);
+	_watchdog_ping(wdd);
+	watchdog_update_worker(wdd, false, false);
 	mutex_unlock(&wdd->lock);
-	return err;
 }
 
 /*
@@ -107,8 +182,11 @@ static int watchdog_start(struct watchdog_device *wdd)
 		goto out_start;
 
 	err = wdd->ops->start(wdd);
-	if (err == 0)
+	if (err == 0) {
 		set_bit(WDOG_ACTIVE, &wdd->status);
+		wdd->last_keepalive = jiffies;
+		watchdog_update_worker(wdd, false, false);
+	}
 
 out_start:
 	mutex_unlock(&wdd->lock);
@@ -146,8 +224,10 @@ static int watchdog_stop(struct watchdog_device *wdd)
 	}
 
 	err = wdd->ops->stop(wdd);
-	if (err == 0)
+	if (err == 0) {
 		clear_bit(WDOG_ACTIVE, &wdd->status);
+		watchdog_update_worker(wdd, true, false);
+	}
 
 out_stop:
 	mutex_unlock(&wdd->lock);
@@ -211,6 +291,8 @@ static int watchdog_set_timeout(struct watchdog_device *wdd,
 
 	err = wdd->ops->set_timeout(wdd, timeout);
 
+	watchdog_update_worker(wdd, true, false);
+
 out_timeout:
 	mutex_unlock(&wdd->lock);
 	return err;
@@ -483,6 +565,8 @@ static int watchdog_release(struct inode *inode, struct file *file)
 		watchdog_ping(wdd);
 	}
 
+	cancel_delayed_work_sync(&wdd->work);
+
 	/* Allow the owner module to be unloaded again */
 	module_put(wdd->ops->owner);
 
@@ -523,6 +607,14 @@ int watchdog_dev_register(struct watchdog_device *wdd)
 {
 	int err, devno;
 
+	if (!watchdog_wq)
+		return -ENODEV;
+
+	INIT_DELAYED_WORK(&wdd->work, watchdog_ping_work);
+
+	if (!wdd->max_hw_timeout_ms)
+		wdd->max_hw_timeout_ms = wdd->max_timeout * 1000;
+
 	if (wdd->id == 0) {
 		old_wdd = wdd;
 		watchdog_miscdev.parent = wdd->parent;
@@ -574,6 +666,9 @@ int watchdog_dev_unregister(struct watchdog_device *wdd)
 		misc_deregister(&watchdog_miscdev);
 		old_wdd = NULL;
 	}
+
+	cancel_delayed_work_sync(&wdd->work);
+
 	return 0;
 }
 
@@ -585,9 +680,21 @@ int watchdog_dev_unregister(struct watchdog_device *wdd)
 
 int __init watchdog_dev_init(void)
 {
-	int err = alloc_chrdev_region(&watchdog_devt, 0, MAX_DOGS, "watchdog");
+	int err;
+
+	watchdog_wq = alloc_workqueue("watchdogd",
+				      WQ_HIGHPRI | WQ_MEM_RECLAIM, 0);
+	if (!watchdog_wq) {
+		pr_err("Failed to create watchdog workqueue\n");
+		err = -ENOMEM;
+		goto abort;
+	}
+
+	err = alloc_chrdev_region(&watchdog_devt, 0, MAX_DOGS, "watchdog");
 	if (err < 0)
 		pr_err("watchdog: unable to allocate char dev region\n");
+
+abort:
 	return err;
 }
 
@@ -600,4 +707,5 @@ int __init watchdog_dev_init(void)
 void __exit watchdog_dev_exit(void)
 {
 	unregister_chrdev_region(watchdog_devt, MAX_DOGS);
+	destroy_workqueue(watchdog_wq);
 }
diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
index f47feada5b42..0e0cf36485c1 100644
--- a/include/linux/watchdog.h
+++ b/include/linux/watchdog.h
@@ -12,6 +12,7 @@
 #include <linux/bitops.h>
 #include <linux/device.h>
 #include <linux/cdev.h>
+#include <linux/workqueue.h>
 #include <uapi/linux/watchdog.h>
 
 struct watchdog_ops;
@@ -59,14 +60,21 @@ struct watchdog_ops {
  * @info:	Pointer to a watchdog_info structure.
  * @ops:	Pointer to the list of watchdog operations.
  * @bootstatus:	Status of the watchdog device at boot.
- * @timeout:	The watchdog devices timeout value.
- * @min_timeout:The watchdog devices minimum timeout value.
- * @max_timeout:The watchdog devices maximum timeout value.
+ * @timeout:	The watchdog devices timeout value, in seconds.
+ * @min_timeout:The watchdog devices minimum timeout value, in seconds.
+ * @max_timeout:The watchdog devices maximum timeout value, in seconds.
+ * @max_hw_timeout_ms:
+ *		Hardware limit for maximum timeout, in milli-seconds,
+ *		if different from max_timeout.
  * @driver-data:Pointer to the drivers private data.
- * @lock:	Lock for watchdog core internal use only.
  * @status:	Field that contains the devices internal status bits.
- * @deferred: entry in wtd_deferred_reg_list which is used to
- *			   register early initialized watchdogs.
+ * @lock:	Lock for watchdog core internal use only.
+ * @last_keepalive:
+ *		Time of most recent keepalive triggered from user space,
+ *		in jiffies (watchdog core internal).
+ * @work:	Data structure for worker function (watchdog core internal).
+ * @deferred:	entry in wtd_deferred_reg_list which is used to
+ *		register early initialized watchdogs.
  *
  * The watchdog_device structure contains all information about a
  * watchdog timer device.
@@ -88,8 +96,8 @@ struct watchdog_device {
 	unsigned int timeout;
 	unsigned int min_timeout;
 	unsigned int max_timeout;
+	unsigned int max_hw_timeout_ms;
 	void *driver_data;
-	struct mutex lock;
 	unsigned long status;
 /* Bit numbers for status flags */
 #define WDOG_ACTIVE		0	/* Is the watchdog running/active */
@@ -97,6 +105,10 @@ struct watchdog_device {
 #define WDOG_ALLOW_RELEASE	2	/* Did we receive the magic char ? */
 #define WDOG_NO_WAY_OUT		3	/* Is 'nowayout' feature set ? */
 #define WDOG_UNREGISTERED	4	/* Has the device been unregistered */
+	/* the following variables are for internal use only */
+	struct mutex lock;
+	unsigned long last_keepalive;
+	struct delayed_work work;
 	struct list_head deferred;
 };
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 3/8] watchdog: Introduce WDOG_RUNNING flag
  2015-08-08  5:02 [PATCH v2 0/8] watchdog: Add support for keepalives triggered by infrastructure Guenter Roeck
  2015-08-08  5:02 ` [PATCH v2 1/8] watchdog: watchdog_dev: Use single variable name for struct watchdog_device Guenter Roeck
  2015-08-08  5:02 ` [PATCH v2 2/8] watchdog: Introduce hardware maximum timeout in watchdog core Guenter Roeck
@ 2015-08-08  5:02 ` Guenter Roeck
  2015-08-14 19:04   ` Uwe Kleine-König
  2015-08-08  5:02 ` [PATCH v2 4/8] watchdog: Make set_timeout function optional Guenter Roeck
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Guenter Roeck @ 2015-08-08  5:02 UTC (permalink / raw)
  To: linux-watchdog
  Cc: Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	Uwe Kleine-König, linux-doc, Jonathan Corbet, Guenter Roeck

The WDOG_RUNNING flag is expected to be set by watchdog drivers if
the hardware watchdog is running. If the flag is set, the watchdog
subsystem will ping the watchdog even if the watchdog device is closed.

The watchdog driver stop function is now optional and may be omitted
if the watchdog can not be stopped. If stopping the watchdog is not
possible but the driver implements a stop function, it is responsible
to set the WDOG_RUNNING flag in its stop function.

Cc: Timo Kokkonen <timo.kokkonen@offcode.fi>
Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
v2: Improved documentation
---
 Documentation/watchdog/watchdog-kernel-api.txt | 29 ++++++++++++-------
 drivers/watchdog/watchdog_core.c               |  2 +-
 drivers/watchdog/watchdog_dev.c                | 40 ++++++++++++++++++++------
 include/linux/watchdog.h                       |  7 +++++
 4 files changed, 59 insertions(+), 19 deletions(-)

diff --git a/Documentation/watchdog/watchdog-kernel-api.txt b/Documentation/watchdog/watchdog-kernel-api.txt
index 25b00b878a7b..6a54dc15a556 100644
--- a/Documentation/watchdog/watchdog-kernel-api.txt
+++ b/Documentation/watchdog/watchdog-kernel-api.txt
@@ -143,17 +143,18 @@ are:
   device.
   The routine needs a pointer to the watchdog timer device structure as a
   parameter. It returns zero on success or a negative errno code for failure.
-* stop: with this routine the watchdog timer device is being stopped.
-  The routine needs a pointer to the watchdog timer device structure as a
-  parameter. It returns zero on success or a negative errno code for failure.
-  Some watchdog timer hardware can only be started and not be stopped. The
-  driver supporting this hardware needs to make sure that a start and stop
-  routine is being provided. This can be done by using a timer in the driver
-  that regularly sends a keepalive ping to the watchdog timer hardware.
 
 Not all watchdog timer hardware supports the same functionality. That's why
 all other routines/operations are optional. They only need to be provided if
 they are supported. These optional routines/operations are:
+* stop: with this routine the watchdog timer device is being stopped.
+  The routine needs a pointer to the watchdog timer device structure as a
+  parameter. It returns zero on success or a negative errno code for failure.
+  Some watchdog timer hardware can only be started and not be stopped. A
+  driver supporting such hardware does not have to implement the stop routine.
+  If a driver has no stop function, the watchdog core will set WDOG_RUNNING and
+  start calling the driver's keepalive pings function after the watchdog device
+  is closed.
 * ping: this is the routine that sends a keepalive ping to the watchdog timer
   hardware.
   The routine needs a pointer to the watchdog timer device structure as a
@@ -193,9 +194,12 @@ they are supported. These optional routines/operations are:
 The status bits should (preferably) be set with the set_bit and clear_bit alike
 bit-operations. The status bits that are defined are:
 * WDOG_ACTIVE: this status bit indicates whether or not a watchdog timer device
-  is active or not. When the watchdog is active after booting, then you should
-  set this status bit (Note: when you register the watchdog timer device with
-  this bit set, then opening /dev/watchdog will skip the start operation)
+  is active or not from user perspective. User space is expected to send
+  heartbeat requests to the driver while this flag is set. If the watchdog
+  is active after booting, and you don't want the infrastructure to send
+  heartbeats to the watchdog driver, then you should set this status bit.
+  Note: when you register the watchdog timer device with this bit set,
+  then opening /dev/watchdog will skip the start operation.
 * WDOG_DEV_OPEN: this status bit shows whether or not the watchdog device
   was opened via /dev/watchdog.
   (This bit should only be used by the WatchDog Timer Driver Core).
@@ -209,6 +213,11 @@ bit-operations. The status bits that are defined are:
   any watchdog_ops, so that you can be sure that no operations (other then
   unref) will get called after unregister, even if userspace still holds a
   reference to /dev/watchdog
+* WDOG_RUNNING: Set by the watchdog driver if the hardware watchdog is running.
+  The bit must be set if the watchdog timer hardware can not be stopped.
+  The bit may also be set if the watchdog timer is running aftyer booting,
+  before the watchdog device is opened. If set, the watchdog infrastructure
+  will send keepalives to the watchdog hardware while WDOG_ACTIVE is not set.
 
   To set the WDOG_NO_WAY_OUT status bit (before registering your watchdog
   timer device) you can either:
diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
index 1a8059455413..b38d1b7ae10e 100644
--- a/drivers/watchdog/watchdog_core.c
+++ b/drivers/watchdog/watchdog_core.c
@@ -145,7 +145,7 @@ static int __watchdog_register_device(struct watchdog_device *wdd)
 		return -EINVAL;
 
 	/* Mandatory operations need to be supported */
-	if (wdd->ops->start == NULL || wdd->ops->stop == NULL)
+	if (!wdd->ops->start)
 		return -EINVAL;
 
 	watchdog_check_min_max_timeout(wdd);
diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index c04ba1a98cc8..676e233d5e7b 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -59,7 +59,8 @@ static inline bool watchdog_need_worker(struct watchdog_device *wdd)
 	unsigned int m = wdd->max_timeout * 1000;
 	unsigned int t = wdd->timeout * 1000;
 
-	return watchdog_active(wdd) && hm && (!m || hm < m) && t > hm;
+	return (watchdog_active(wdd) && hm && (!m || hm < m) && t > hm) ||
+	       (t && !watchdog_active(wdd) && watchdog_running(wdd));
 }
 
 static unsigned long watchdog_next_keepalive(struct watchdog_device *wdd)
@@ -73,6 +74,9 @@ static unsigned long watchdog_next_keepalive(struct watchdog_device *wdd)
 
 	tj = msecs_to_jiffies(t / 2);
 
+	if (!watchdog_active(wdd))
+		return tj;
+
 	/*
 	 * Ensure that the watchdog times out wdd->timeout seconds
 	 * after the most recent keepalive sent from user space.
@@ -112,7 +116,7 @@ static int _watchdog_ping(struct watchdog_device *wdd)
 	if (test_bit(WDOG_UNREGISTERED, &wdd->status))
 		return -ENODEV;
 
-	if (!watchdog_active(wdd))
+	if (!watchdog_active(wdd) && !watchdog_running(wdd))
 		return 0;
 
 	if (wdd->ops->ping)
@@ -223,7 +227,11 @@ static int watchdog_stop(struct watchdog_device *wdd)
 		goto out_stop;
 	}
 
-	err = wdd->ops->stop(wdd);
+	if (wdd->ops->stop)
+		err = wdd->ops->stop(wdd);
+	else
+		set_bit(WDOG_RUNNING, &wdd->status);
+
 	if (err == 0) {
 		clear_bit(WDOG_ACTIVE, &wdd->status);
 		watchdog_update_worker(wdd, true, false);
@@ -508,7 +516,7 @@ static int watchdog_open(struct inode *inode, struct file *file)
 	 * If the /dev/watchdog device is open, we don't want the module
 	 * to be unloaded.
 	 */
-	if (!try_module_get(wdd->ops->owner))
+	if (!watchdog_running(wdd) && !try_module_get(wdd->ops->owner))
 		goto out;
 
 	err = watchdog_start(wdd);
@@ -565,10 +573,15 @@ static int watchdog_release(struct inode *inode, struct file *file)
 		watchdog_ping(wdd);
 	}
 
-	cancel_delayed_work_sync(&wdd->work);
+	watchdog_update_worker(wdd, true, true);
 
-	/* Allow the owner module to be unloaded again */
-	module_put(wdd->ops->owner);
+	/*
+	 * Allow the owner module to be unloaded again unless the watchdog
+	 * is still running. If the watchdog is still running, it can not
+	 * be stopped, and its driver must not be unloaded.
+	 */
+	if (!watchdog_running(wdd))
+		module_put(wdd->ops->owner);
 
 	/* make sure that /dev/watchdog can be re-opened */
 	clear_bit(WDOG_DEV_OPEN, &wdd->status);
@@ -644,8 +657,19 @@ int watchdog_dev_register(struct watchdog_device *wdd)
 			misc_deregister(&watchdog_miscdev);
 			old_wdd = NULL;
 		}
+		return err;
 	}
-	return err;
+
+	/*
+	 * If the watchdog is running, prevent its driver from being unloaded,
+	 * and schedule an immediate ping.
+	 */
+	if (watchdog_running(wdd)) {
+		__module_get(wdd->ops->owner);
+		queue_delayed_work(watchdog_wq, &wdd->work, 0);
+	}
+
+	return 0;
 }
 
 /*
diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
index 0e0cf36485c1..5b21bf8a8a32 100644
--- a/include/linux/watchdog.h
+++ b/include/linux/watchdog.h
@@ -105,6 +105,7 @@ struct watchdog_device {
 #define WDOG_ALLOW_RELEASE	2	/* Did we receive the magic char ? */
 #define WDOG_NO_WAY_OUT		3	/* Is 'nowayout' feature set ? */
 #define WDOG_UNREGISTERED	4	/* Has the device been unregistered */
+#define WDOG_RUNNING		5	/* True if HW watchdog running */
 	/* the following variables are for internal use only */
 	struct mutex lock;
 	unsigned long last_keepalive;
@@ -121,6 +122,12 @@ static inline bool watchdog_active(struct watchdog_device *wdd)
 	return test_bit(WDOG_ACTIVE, &wdd->status);
 }
 
+/* Use the following function to check whether or not the watchdog is running */
+static inline bool watchdog_running(struct watchdog_device *wdd)
+{
+	return test_bit(WDOG_RUNNING, &wdd->status);
+}
+
 /* Use the following function to set the nowayout feature */
 static inline void watchdog_set_nowayout(struct watchdog_device *wdd, bool nowayout)
 {
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 4/8] watchdog: Make set_timeout function optional
  2015-08-08  5:02 [PATCH v2 0/8] watchdog: Add support for keepalives triggered by infrastructure Guenter Roeck
                   ` (2 preceding siblings ...)
  2015-08-08  5:02 ` [PATCH v2 3/8] watchdog: Introduce WDOG_RUNNING flag Guenter Roeck
@ 2015-08-08  5:02 ` Guenter Roeck
  2015-08-14 19:05   ` Uwe Kleine-König
  2015-08-08  5:02 ` [PATCH v2 5/8] watchdog: imx2: Convert to use infrastructure triggered keepalives Guenter Roeck
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 17+ messages in thread
From: Guenter Roeck @ 2015-08-08  5:02 UTC (permalink / raw)
  To: linux-watchdog
  Cc: Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	Uwe Kleine-König, linux-doc, Jonathan Corbet, Guenter Roeck

For some watchdogs, the hardware timeout is fixed, and the
watchdog driver depends on the watchdog core to handle the
actual timeout. In this situation, the watchdog driver might
only set the 'timeout' variable but do nothing else.
This can as well be handled by the infrastructure, so make
the set_timeout callback optional. If WDIOF_SETTIMEOUT is
configured but the .set_timeout callback is not available,
update the timeout variable in the infrastructure code.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
v2: No changes
---
 Documentation/watchdog/watchdog-kernel-api.txt | 5 +++++
 drivers/watchdog/watchdog_dev.c                | 9 ++++++---
 2 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/Documentation/watchdog/watchdog-kernel-api.txt b/Documentation/watchdog/watchdog-kernel-api.txt
index 6a54dc15a556..49dce3a5477b 100644
--- a/Documentation/watchdog/watchdog-kernel-api.txt
+++ b/Documentation/watchdog/watchdog-kernel-api.txt
@@ -181,6 +181,11 @@ they are supported. These optional routines/operations are:
   (if it is larger than hw_max_timeout_ms), or to the achieved timeout value.
   (Note: the WDIOF_SETTIMEOUT needs to be set in the options field of the
   watchdog's info structure).
+  If the watchdog driver does not have to perform any action but setting the
+  watchdog_device.timeout, this callback can be omitted.
+  If set_timeout is not provided but, WDIOF_SETTIMEOUT is set, the watchdog
+  infrastructure updates the timeout value of the watchdog_device internally
+  to the requested value.
 * get_timeleft: this routines returns the time that's left before a reset.
 * ref: the operation that calls kref_get on the kref of a dynamically
   allocated watchdog_device struct.
diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 676e233d5e7b..752de264a244 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -282,9 +282,9 @@ out_status:
 static int watchdog_set_timeout(struct watchdog_device *wdd,
 							unsigned int timeout)
 {
-	int err;
+	int err = 0;
 
-	if (!wdd->ops->set_timeout || !(wdd->info->options & WDIOF_SETTIMEOUT))
+	if (!(wdd->info->options & WDIOF_SETTIMEOUT))
 		return -EOPNOTSUPP;
 
 	if (watchdog_timeout_invalid(wdd, timeout))
@@ -297,7 +297,10 @@ static int watchdog_set_timeout(struct watchdog_device *wdd,
 		goto out_timeout;
 	}
 
-	err = wdd->ops->set_timeout(wdd, timeout);
+	if (wdd->ops->set_timeout)
+		err = wdd->ops->set_timeout(wdd, timeout);
+	else
+		wdd->timeout = timeout;
 
 	watchdog_update_worker(wdd, true, false);
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 5/8] watchdog: imx2: Convert to use infrastructure triggered keepalives
  2015-08-08  5:02 [PATCH v2 0/8] watchdog: Add support for keepalives triggered by infrastructure Guenter Roeck
                   ` (3 preceding siblings ...)
  2015-08-08  5:02 ` [PATCH v2 4/8] watchdog: Make set_timeout function optional Guenter Roeck
@ 2015-08-08  5:02 ` Guenter Roeck
  2015-08-08  5:02 ` [PATCH v2 6/8] watchdog: retu: " Guenter Roeck
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 17+ messages in thread
From: Guenter Roeck @ 2015-08-08  5:02 UTC (permalink / raw)
  To: linux-watchdog
  Cc: Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	Uwe Kleine-König, linux-doc, Jonathan Corbet, Guenter Roeck

The watchdog infrastructure now supports handling watchdog keepalive
if the watchdog is running while the watchdog device is closed.
Convert the driver to use this infrastructure.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
v2: No changes
---
 drivers/watchdog/imx2_wdt.c | 72 ++++++++-------------------------------------
 1 file changed, 12 insertions(+), 60 deletions(-)

diff --git a/drivers/watchdog/imx2_wdt.c b/drivers/watchdog/imx2_wdt.c
index 0bb1a1d1b170..66feef254661 100644
--- a/drivers/watchdog/imx2_wdt.c
+++ b/drivers/watchdog/imx2_wdt.c
@@ -25,7 +25,6 @@
 #include <linux/delay.h>
 #include <linux/init.h>
 #include <linux/io.h>
-#include <linux/jiffies.h>
 #include <linux/kernel.h>
 #include <linux/module.h>
 #include <linux/moduleparam.h>
@@ -34,7 +33,6 @@
 #include <linux/platform_device.h>
 #include <linux/reboot.h>
 #include <linux/regmap.h>
-#include <linux/timer.h>
 #include <linux/watchdog.h>
 
 #define DRIVER_NAME "imx2-wdt"
@@ -62,7 +60,6 @@
 struct imx2_wdt_device {
 	struct clk *clk;
 	struct regmap *regmap;
-	struct timer_list timer;	/* Pings the watchdog when closed */
 	struct watchdog_device wdog;
 	struct notifier_block restart_handler;
 };
@@ -151,16 +148,6 @@ static int imx2_wdt_ping(struct watchdog_device *wdog)
 	return 0;
 }
 
-static void imx2_wdt_timer_ping(unsigned long arg)
-{
-	struct watchdog_device *wdog = (struct watchdog_device *)arg;
-	struct imx2_wdt_device *wdev = watchdog_get_drvdata(wdog);
-
-	/* ping it every wdog->timeout / 2 seconds to prevent reboot */
-	imx2_wdt_ping(wdog);
-	mod_timer(&wdev->timer, jiffies + wdog->timeout * HZ / 2);
-}
-
 static int imx2_wdt_set_timeout(struct watchdog_device *wdog,
 				unsigned int new_timeout)
 {
@@ -177,40 +164,19 @@ static int imx2_wdt_start(struct watchdog_device *wdog)
 {
 	struct imx2_wdt_device *wdev = watchdog_get_drvdata(wdog);
 
-	if (imx2_wdt_is_running(wdev)) {
-		/* delete the timer that pings the watchdog after close */
-		del_timer_sync(&wdev->timer);
+	if (imx2_wdt_is_running(wdev))
 		imx2_wdt_set_timeout(wdog, wdog->timeout);
-	} else
+	else
 		imx2_wdt_setup(wdog);
 
-	return imx2_wdt_ping(wdog);
-}
-
-static int imx2_wdt_stop(struct watchdog_device *wdog)
-{
-	/*
-	 * We don't need a clk_disable, it cannot be disabled once started.
-	 * We use a timer to ping the watchdog while /dev/watchdog is closed
-	 */
-	imx2_wdt_timer_ping((unsigned long)wdog);
-	return 0;
-}
-
-static inline void imx2_wdt_ping_if_active(struct watchdog_device *wdog)
-{
-	struct imx2_wdt_device *wdev = watchdog_get_drvdata(wdog);
+	set_bit(WDOG_RUNNING, &wdog->status);
 
-	if (imx2_wdt_is_running(wdev)) {
-		imx2_wdt_set_timeout(wdog, wdog->timeout);
-		imx2_wdt_timer_ping((unsigned long)wdog);
-	}
+	return imx2_wdt_ping(wdog);
 }
 
 static const struct watchdog_ops imx2_wdt_ops = {
 	.owner = THIS_MODULE,
 	.start = imx2_wdt_start,
-	.stop = imx2_wdt_stop,
 	.ping = imx2_wdt_ping,
 	.set_timeout = imx2_wdt_set_timeout,
 };
@@ -277,9 +243,10 @@ static int __init imx2_wdt_probe(struct platform_device *pdev)
 	watchdog_set_nowayout(wdog, nowayout);
 	watchdog_init_timeout(wdog, timeout, &pdev->dev);
 
-	setup_timer(&wdev->timer, imx2_wdt_timer_ping, (unsigned long)wdog);
-
-	imx2_wdt_ping_if_active(wdog);
+	if (imx2_wdt_is_running(wdev)) {
+		imx2_wdt_set_timeout(wdog, wdog->timeout);
+		set_bit(WDOG_RUNNING, &wdog->status);
+	}
 
 	/*
 	 * Disable the watchdog power down counter at boot. Otherwise the power
@@ -320,7 +287,6 @@ static int __exit imx2_wdt_remove(struct platform_device *pdev)
 	watchdog_unregister_device(wdog);
 
 	if (imx2_wdt_is_running(wdev)) {
-		del_timer_sync(&wdev->timer);
 		imx2_wdt_ping(wdog);
 		dev_crit(&pdev->dev, "Device removed: Expect reboot!\n");
 	}
@@ -334,10 +300,9 @@ static void imx2_wdt_shutdown(struct platform_device *pdev)
 
 	if (imx2_wdt_is_running(wdev)) {
 		/*
-		 * We are running, we need to delete the timer but will
-		 * give max timeout before reboot will take place
+		 * We are running, configure max timeout before reboot
+		 * will take place.
 		 */
-		del_timer_sync(&wdev->timer);
 		imx2_wdt_set_timeout(wdog, IMX2_WDT_MAX_TIME);
 		imx2_wdt_ping(wdog);
 		dev_crit(&pdev->dev, "Device shutdown: Expect reboot!\n");
@@ -355,10 +320,6 @@ static int imx2_wdt_suspend(struct device *dev)
 	if (imx2_wdt_is_running(wdev)) {
 		imx2_wdt_set_timeout(wdog, IMX2_WDT_MAX_TIME);
 		imx2_wdt_ping(wdog);
-
-		/* The watchdog is not active */
-		if (!watchdog_active(wdog))
-			del_timer_sync(&wdev->timer);
 	}
 
 	clk_disable_unprepare(wdev->clk);
@@ -384,19 +345,10 @@ static int imx2_wdt_resume(struct device *dev)
 		 * watchdog again.
 		 */
 		imx2_wdt_setup(wdog);
+	}
+	if (imx2_wdt_is_running(wdev)) {
 		imx2_wdt_set_timeout(wdog, wdog->timeout);
 		imx2_wdt_ping(wdog);
-	} else if (imx2_wdt_is_running(wdev)) {
-		/* Resuming from non-deep sleep state. */
-		imx2_wdt_set_timeout(wdog, wdog->timeout);
-		imx2_wdt_ping(wdog);
-		/*
-		 * But the watchdog is not active, then start
-		 * the timer again.
-		 */
-		if (!watchdog_active(wdog))
-			mod_timer(&wdev->timer,
-				  jiffies + wdog->timeout * HZ / 2);
 	}
 
 	return 0;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 6/8] watchdog: retu: Convert to use infrastructure triggered keepalives
  2015-08-08  5:02 [PATCH v2 0/8] watchdog: Add support for keepalives triggered by infrastructure Guenter Roeck
                   ` (4 preceding siblings ...)
  2015-08-08  5:02 ` [PATCH v2 5/8] watchdog: imx2: Convert to use infrastructure triggered keepalives Guenter Roeck
@ 2015-08-08  5:02 ` Guenter Roeck
  2015-08-08  5:02 ` [PATCH v2 7/8] watchdog: gpio_wdt: " Guenter Roeck
  2015-08-08  5:02 ` [PATCH v2 8/8] watchdog: at91sam9: " Guenter Roeck
  7 siblings, 0 replies; 17+ messages in thread
From: Guenter Roeck @ 2015-08-08  5:02 UTC (permalink / raw)
  To: linux-watchdog
  Cc: Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	Uwe Kleine-König, linux-doc, Jonathan Corbet, Guenter Roeck

The watchdog infrastructure now supports handling watchdog keepalive
if the watchdog is running while the watchdog device is closed.
Convert the driver to use this infrastructure.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
v2: No changes
---
 drivers/watchdog/retu_wdt.c | 78 ++++-----------------------------------------
 1 file changed, 7 insertions(+), 71 deletions(-)

diff --git a/drivers/watchdog/retu_wdt.c b/drivers/watchdog/retu_wdt.c
index b7c68e275aeb..ce2982a7670c 100644
--- a/drivers/watchdog/retu_wdt.c
+++ b/drivers/watchdog/retu_wdt.c
@@ -28,69 +28,22 @@
 /* Watchdog timer values in seconds */
 #define RETU_WDT_MAX_TIMER	63
 
-struct retu_wdt_dev {
-	struct retu_dev		*rdev;
-	struct device		*dev;
-	struct delayed_work	ping_work;
-};
-
-/*
- * Since Retu watchdog cannot be disabled in hardware, we must kick it
- * with a timer until userspace watchdog software takes over. If
- * CONFIG_WATCHDOG_NOWAYOUT is set, we never start the feeding.
- */
-static void retu_wdt_ping_enable(struct retu_wdt_dev *wdev)
-{
-	retu_write(wdev->rdev, RETU_REG_WATCHDOG, RETU_WDT_MAX_TIMER);
-	schedule_delayed_work(&wdev->ping_work,
-			round_jiffies_relative(RETU_WDT_MAX_TIMER * HZ / 2));
-}
-
-static void retu_wdt_ping_disable(struct retu_wdt_dev *wdev)
-{
-	retu_write(wdev->rdev, RETU_REG_WATCHDOG, RETU_WDT_MAX_TIMER);
-	cancel_delayed_work_sync(&wdev->ping_work);
-}
-
-static void retu_wdt_ping_work(struct work_struct *work)
-{
-	struct retu_wdt_dev *wdev = container_of(to_delayed_work(work),
-						struct retu_wdt_dev, ping_work);
-	retu_wdt_ping_enable(wdev);
-}
-
 static int retu_wdt_start(struct watchdog_device *wdog)
 {
-	struct retu_wdt_dev *wdev = watchdog_get_drvdata(wdog);
+	struct retu_dev *rdev = watchdog_get_drvdata(wdog);
 
-	retu_wdt_ping_disable(wdev);
+	set_bit(WDOG_RUNNING, &wdog->status);
 
-	return retu_write(wdev->rdev, RETU_REG_WATCHDOG, wdog->timeout);
-}
-
-static int retu_wdt_stop(struct watchdog_device *wdog)
-{
-	struct retu_wdt_dev *wdev = watchdog_get_drvdata(wdog);
-
-	retu_wdt_ping_enable(wdev);
-
-	return 0;
-}
-
-static int retu_wdt_ping(struct watchdog_device *wdog)
-{
-	struct retu_wdt_dev *wdev = watchdog_get_drvdata(wdog);
-
-	return retu_write(wdev->rdev, RETU_REG_WATCHDOG, wdog->timeout);
+	return retu_write(rdev, RETU_REG_WATCHDOG, wdog->timeout);
 }
 
 static int retu_wdt_set_timeout(struct watchdog_device *wdog,
 				unsigned int timeout)
 {
-	struct retu_wdt_dev *wdev = watchdog_get_drvdata(wdog);
+	struct retu_dev *rdev = watchdog_get_drvdata(wdog);
 
 	wdog->timeout = timeout;
-	return retu_write(wdev->rdev, RETU_REG_WATCHDOG, wdog->timeout);
+	return retu_write(rdev, RETU_REG_WATCHDOG, wdog->timeout);
 }
 
 static const struct watchdog_info retu_wdt_info = {
@@ -101,8 +54,6 @@ static const struct watchdog_info retu_wdt_info = {
 static const struct watchdog_ops retu_wdt_ops = {
 	.owner		= THIS_MODULE,
 	.start		= retu_wdt_start,
-	.stop		= retu_wdt_stop,
-	.ping		= retu_wdt_ping,
 	.set_timeout	= retu_wdt_set_timeout,
 };
 
@@ -111,39 +62,26 @@ static int retu_wdt_probe(struct platform_device *pdev)
 	struct retu_dev *rdev = dev_get_drvdata(pdev->dev.parent);
 	bool nowayout = WATCHDOG_NOWAYOUT;
 	struct watchdog_device *retu_wdt;
-	struct retu_wdt_dev *wdev;
 	int ret;
 
 	retu_wdt = devm_kzalloc(&pdev->dev, sizeof(*retu_wdt), GFP_KERNEL);
 	if (!retu_wdt)
 		return -ENOMEM;
 
-	wdev = devm_kzalloc(&pdev->dev, sizeof(*wdev), GFP_KERNEL);
-	if (!wdev)
-		return -ENOMEM;
-
 	retu_wdt->info		= &retu_wdt_info;
 	retu_wdt->ops		= &retu_wdt_ops;
 	retu_wdt->timeout	= RETU_WDT_MAX_TIMER;
 	retu_wdt->min_timeout	= 0;
 	retu_wdt->max_timeout	= RETU_WDT_MAX_TIMER;
 
-	watchdog_set_drvdata(retu_wdt, wdev);
+	watchdog_set_drvdata(retu_wdt, rdev);
 	watchdog_set_nowayout(retu_wdt, nowayout);
 
-	wdev->rdev		= rdev;
-	wdev->dev		= &pdev->dev;
-
-	INIT_DELAYED_WORK(&wdev->ping_work, retu_wdt_ping_work);
-
 	ret = watchdog_register_device(retu_wdt);
 	if (ret < 0)
 		return ret;
 
-	if (nowayout)
-		retu_wdt_ping(retu_wdt);
-	else
-		retu_wdt_ping_enable(wdev);
+	retu_wdt_start(retu_wdt);
 
 	platform_set_drvdata(pdev, retu_wdt);
 
@@ -153,10 +91,8 @@ static int retu_wdt_probe(struct platform_device *pdev)
 static int retu_wdt_remove(struct platform_device *pdev)
 {
 	struct watchdog_device *wdog = platform_get_drvdata(pdev);
-	struct retu_wdt_dev *wdev = watchdog_get_drvdata(wdog);
 
 	watchdog_unregister_device(wdog);
-	cancel_delayed_work_sync(&wdev->ping_work);
 
 	return 0;
 }
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 7/8] watchdog: gpio_wdt: Convert to use infrastructure triggered keepalives
  2015-08-08  5:02 [PATCH v2 0/8] watchdog: Add support for keepalives triggered by infrastructure Guenter Roeck
                   ` (5 preceding siblings ...)
  2015-08-08  5:02 ` [PATCH v2 6/8] watchdog: retu: " Guenter Roeck
@ 2015-08-08  5:02 ` Guenter Roeck
  2015-08-08  5:02 ` [PATCH v2 8/8] watchdog: at91sam9: " Guenter Roeck
  7 siblings, 0 replies; 17+ messages in thread
From: Guenter Roeck @ 2015-08-08  5:02 UTC (permalink / raw)
  To: linux-watchdog
  Cc: Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	Uwe Kleine-König, linux-doc, Jonathan Corbet, Guenter Roeck

The watchdog infrastructure now supports handling watchdog keepalive
if the watchdog is running while the watchdog device is closed.
The infrastructure now also supports generating additional heartbeats
if the maximum hardware timeout is smaller than or close to the
configured timeout. Convert the driver to use this infrastructure.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
v2: No changes
---
 drivers/watchdog/gpio_wdt.c | 65 ++++++++-------------------------------------
 1 file changed, 11 insertions(+), 54 deletions(-)

diff --git a/drivers/watchdog/gpio_wdt.c b/drivers/watchdog/gpio_wdt.c
index 1687cc2d7122..cbbdae440bfa 100644
--- a/drivers/watchdog/gpio_wdt.c
+++ b/drivers/watchdog/gpio_wdt.c
@@ -32,12 +32,8 @@ struct gpio_wdt_priv {
 	bool			active_low;
 	bool			state;
 	bool			always_running;
-	bool			armed;
 	unsigned int		hw_algo;
-	unsigned int		hw_margin;
-	unsigned long		last_jiffies;
 	struct notifier_block	notifier;
-	struct timer_list	timer;
 	struct watchdog_device	wdd;
 };
 
@@ -50,20 +46,12 @@ static void gpio_wdt_disable(struct gpio_wdt_priv *priv)
 		gpio_direction_input(priv->gpio);
 }
 
-static void gpio_wdt_start_impl(struct gpio_wdt_priv *priv)
-{
-	priv->state = priv->active_low;
-	gpio_direction_output(priv->gpio, priv->state);
-	priv->last_jiffies = jiffies;
-	mod_timer(&priv->timer, priv->last_jiffies + priv->hw_margin);
-}
-
 static int gpio_wdt_start(struct watchdog_device *wdd)
 {
 	struct gpio_wdt_priv *priv = watchdog_get_drvdata(wdd);
 
-	gpio_wdt_start_impl(priv);
-	priv->armed = true;
+	priv->state = priv->active_low;
+	gpio_direction_output(priv->gpio, priv->state);
 
 	return 0;
 }
@@ -72,10 +60,9 @@ static int gpio_wdt_stop(struct watchdog_device *wdd)
 {
 	struct gpio_wdt_priv *priv = watchdog_get_drvdata(wdd);
 
-	priv->armed = false;
 	if (!priv->always_running) {
-		mod_timer(&priv->timer, 0);
 		gpio_wdt_disable(priv);
+		clear_bit(WDOG_RUNNING, &priv->wdd.status);
 	}
 
 	return 0;
@@ -85,32 +72,6 @@ static int gpio_wdt_ping(struct watchdog_device *wdd)
 {
 	struct gpio_wdt_priv *priv = watchdog_get_drvdata(wdd);
 
-	priv->last_jiffies = jiffies;
-
-	return 0;
-}
-
-static int gpio_wdt_set_timeout(struct watchdog_device *wdd, unsigned int t)
-{
-	wdd->timeout = t;
-
-	return gpio_wdt_ping(wdd);
-}
-
-static void gpio_wdt_hwping(unsigned long data)
-{
-	struct watchdog_device *wdd = (struct watchdog_device *)data;
-	struct gpio_wdt_priv *priv = watchdog_get_drvdata(wdd);
-
-	if (priv->armed && time_after(jiffies, priv->last_jiffies +
-				      msecs_to_jiffies(wdd->timeout * 1000))) {
-		dev_crit(wdd->dev, "Timer expired. System will reboot soon!\n");
-		return;
-	}
-
-	/* Restart timer */
-	mod_timer(&priv->timer, jiffies + priv->hw_margin);
-
 	switch (priv->hw_algo) {
 	case HW_ALGO_TOGGLE:
 		/* Toggle output pin */
@@ -124,6 +85,8 @@ static void gpio_wdt_hwping(unsigned long data)
 		gpio_set_value_cansleep(priv->gpio, priv->active_low);
 		break;
 	}
+
+	return 0;
 }
 
 static int gpio_wdt_notify_sys(struct notifier_block *nb, unsigned long code,
@@ -132,12 +95,10 @@ static int gpio_wdt_notify_sys(struct notifier_block *nb, unsigned long code,
 	struct gpio_wdt_priv *priv = container_of(nb, struct gpio_wdt_priv,
 						  notifier);
 
-	mod_timer(&priv->timer, 0);
-
 	switch (code) {
 	case SYS_HALT:
 	case SYS_POWER_OFF:
-		gpio_wdt_disable(priv);
+		gpio_wdt_stop(&priv->wdd);
 		break;
 	default:
 		break;
@@ -157,7 +118,6 @@ static const struct watchdog_ops gpio_wdt_ops = {
 	.start		= gpio_wdt_start,
 	.stop		= gpio_wdt_stop,
 	.ping		= gpio_wdt_ping,
-	.set_timeout	= gpio_wdt_set_timeout,
 };
 
 static int gpio_wdt_probe(struct platform_device *pdev)
@@ -205,9 +165,6 @@ static int gpio_wdt_probe(struct platform_device *pdev)
 	if (hw_margin < 2 || hw_margin > 65535)
 		return -EINVAL;
 
-	/* Use safe value (1/2 of real timeout) */
-	priv->hw_margin = msecs_to_jiffies(hw_margin / 2);
-
 	priv->always_running = of_property_read_bool(pdev->dev.of_node,
 						     "always-running");
 
@@ -217,11 +174,15 @@ static int gpio_wdt_probe(struct platform_device *pdev)
 	priv->wdd.ops		= &gpio_wdt_ops;
 	priv->wdd.min_timeout	= SOFT_TIMEOUT_MIN;
 	priv->wdd.max_timeout	= SOFT_TIMEOUT_MAX;
+	priv->wdd.max_hw_timeout_ms = hw_margin;
 
 	if (watchdog_init_timeout(&priv->wdd, 0, &pdev->dev) < 0)
 		priv->wdd.timeout = SOFT_TIMEOUT_DEF;
 
-	setup_timer(&priv->timer, gpio_wdt_hwping, (unsigned long)&priv->wdd);
+	if (priv->always_running) {
+		set_bit(WDOG_RUNNING, &priv->wdd.status);
+		gpio_wdt_start(&priv->wdd);
+	}
 
 	ret = watchdog_register_device(&priv->wdd);
 	if (ret)
@@ -232,9 +193,6 @@ static int gpio_wdt_probe(struct platform_device *pdev)
 	if (ret)
 		goto error_unregister;
 
-	if (priv->always_running)
-		gpio_wdt_start_impl(priv);
-
 	return 0;
 
 error_unregister:
@@ -246,7 +204,6 @@ static int gpio_wdt_remove(struct platform_device *pdev)
 {
 	struct gpio_wdt_priv *priv = platform_get_drvdata(pdev);
 
-	del_timer_sync(&priv->timer);
 	unregister_reboot_notifier(&priv->notifier);
 	watchdog_unregister_device(&priv->wdd);
 
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v2 8/8] watchdog: at91sam9: Convert to use infrastructure triggered keepalives
  2015-08-08  5:02 [PATCH v2 0/8] watchdog: Add support for keepalives triggered by infrastructure Guenter Roeck
                   ` (6 preceding siblings ...)
  2015-08-08  5:02 ` [PATCH v2 7/8] watchdog: gpio_wdt: " Guenter Roeck
@ 2015-08-08  5:02 ` Guenter Roeck
  7 siblings, 0 replies; 17+ messages in thread
From: Guenter Roeck @ 2015-08-08  5:02 UTC (permalink / raw)
  To: linux-watchdog
  Cc: Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	Uwe Kleine-König, linux-doc, Jonathan Corbet, Guenter Roeck

The watchdog infrastructure now supports handling watchdog keepalive
if the watchdog is running while the watchdog device is closed.
The infrastructure now also supports generating additional heartbeats
if the maximum hardware timeout is smaller than or close to the
configured timeout. Convert the driver to use this
infrastructure.

Signed-off-by: Guenter Roeck <linux@roeck-us.net>
---
v2: No changes
---
 drivers/watchdog/at91sam9_wdt.c | 102 +++++-----------------------------------
 1 file changed, 11 insertions(+), 91 deletions(-)

diff --git a/drivers/watchdog/at91sam9_wdt.c b/drivers/watchdog/at91sam9_wdt.c
index e4698f7c5f93..0de39b52962c 100644
--- a/drivers/watchdog/at91sam9_wdt.c
+++ b/drivers/watchdog/at91sam9_wdt.c
@@ -29,7 +29,6 @@
 #include <linux/types.h>
 #include <linux/watchdog.h>
 #include <linux/jiffies.h>
-#include <linux/timer.h>
 #include <linux/bitops.h>
 #include <linux/uaccess.h>
 #include <linux/of.h>
@@ -48,8 +47,8 @@
  * use this to convert a watchdog
  * value from/to milliseconds.
  */
-#define ticks_to_hz_rounddown(t)	((((t) + 1) * HZ) >> 8)
-#define ticks_to_hz_roundup(t)		(((((t) + 1) * HZ) + 255) >> 8)
+#define ticks_to_ms_rounddown(t)	((((t) + 1) * 1000) >> 8)
+#define ticks_to_ms_roundup(t)		(((((t) + 1) * 1000) + 255) >> 8)
 #define ticks_to_secs(t)		(((t) + 1) >> 8)
 #define secs_to_ticks(s)		((s) ? (((s) << 8) - 1) : 0)
 
@@ -64,9 +63,6 @@
 /* Hardware timeout in seconds */
 #define WDT_HW_TIMEOUT 2
 
-/* Timer heartbeat (500ms) */
-#define WDT_TIMEOUT	(HZ/2)
-
 /* User land timeout */
 #define WDT_HEARTBEAT 15
 static int heartbeat;
@@ -83,11 +79,8 @@ MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started "
 struct at91wdt {
 	struct watchdog_device wdd;
 	void __iomem *base;
-	unsigned long next_heartbeat;	/* the next_heartbeat for the timer */
-	struct timer_list timer;	/* The timer that pings the watchdog */
 	u32 mr;
 	u32 mr_mask;
-	unsigned long heartbeat;	/* WDT heartbeat in jiffies */
 	bool nowayout;
 	unsigned int irq;
 };
@@ -107,47 +100,13 @@ static irqreturn_t wdt_interrupt(int irq, void *dev_id)
 	return IRQ_HANDLED;
 }
 
-/*
- * Reload the watchdog timer.  (ie, pat the watchdog)
- */
-static inline void at91_wdt_reset(struct at91wdt *wdt)
-{
-	wdt_write(wdt, AT91_WDT_CR, AT91_WDT_KEY | AT91_WDT_WDRSTT);
-}
-
-/*
- * Timer tick
- */
-static void at91_ping(unsigned long data)
-{
-	struct at91wdt *wdt = (struct at91wdt *)data;
-	if (time_before(jiffies, wdt->next_heartbeat) ||
-	    !watchdog_active(&wdt->wdd)) {
-		at91_wdt_reset(wdt);
-		mod_timer(&wdt->timer, jiffies + wdt->heartbeat);
-	} else {
-		pr_crit("I will reset your machine !\n");
-	}
-}
-
 static int at91_wdt_start(struct watchdog_device *wdd)
 {
 	struct at91wdt *wdt = to_wdt(wdd);
-	/* calculate when the next userspace timeout will be */
-	wdt->next_heartbeat = jiffies + wdd->timeout * HZ;
-	return 0;
-}
 
-static int at91_wdt_stop(struct watchdog_device *wdd)
-{
-	/* The watchdog timer hardware can not be stopped... */
-	return 0;
-}
+	wdt_write(wdt, AT91_WDT_CR, AT91_WDT_KEY | AT91_WDT_WDRSTT);
 
-static int at91_wdt_set_timeout(struct watchdog_device *wdd, unsigned int new_timeout)
-{
-	wdd->timeout = new_timeout;
-	return at91_wdt_start(wdd);
+	return 0;
 }
 
 static int at91_wdt_init(struct platform_device *pdev, struct at91wdt *wdt)
@@ -157,8 +116,8 @@ static int at91_wdt_init(struct platform_device *pdev, struct at91wdt *wdt)
 	u32 value;
 	int err;
 	u32 mask = wdt->mr_mask;
-	unsigned long min_heartbeat = 1;
-	unsigned long max_heartbeat;
+	unsigned int min_timeout = jiffies_to_msecs(1);
+	unsigned int hw_timeout;
 	struct device *dev = &pdev->dev;
 
 	tmp = wdt_read(wdt, AT91_WDT_MR);
@@ -180,31 +139,15 @@ static int at91_wdt_init(struct platform_device *pdev, struct at91wdt *wdt)
 	delta = (tmp & AT91_WDT_WDD) >> 16;
 
 	if (delta < value)
-		min_heartbeat = ticks_to_hz_roundup(value - delta);
+		min_timeout = ticks_to_ms_roundup(value - delta);
 
-	max_heartbeat = ticks_to_hz_rounddown(value);
-	if (!max_heartbeat) {
+	hw_timeout = ticks_to_ms_rounddown(value);
+	if (hw_timeout < min_timeout * 2) {
 		dev_err(dev,
 			"heartbeat is too small for the system to handle it correctly\n");
 		return -EINVAL;
 	}
-
-	/*
-	 * Try to reset the watchdog counter 4 or 2 times more often than
-	 * actually requested, to avoid spurious watchdog reset.
-	 * If this is not possible because of the min_heartbeat value, reset
-	 * it at the min_heartbeat period.
-	 */
-	if ((max_heartbeat / 4) >= min_heartbeat)
-		wdt->heartbeat = max_heartbeat / 4;
-	else if ((max_heartbeat / 2) >= min_heartbeat)
-		wdt->heartbeat = max_heartbeat / 2;
-	else
-		wdt->heartbeat = min_heartbeat;
-
-	if (max_heartbeat < min_heartbeat + 4)
-		dev_warn(dev,
-			 "min heartbeat and max heartbeat might be too close for the system to handle it correctly\n");
+	wdt->wdd.max_hw_timeout_ms = hw_timeout;
 
 	if ((tmp & AT91_WDT_WDFIEN) && wdt->irq) {
 		err = request_irq(wdt->irq, wdt_interrupt,
@@ -220,32 +163,12 @@ static int at91_wdt_init(struct platform_device *pdev, struct at91wdt *wdt)
 			 "watchdog already configured differently (mr = %x expecting %x)\n",
 			 tmp & wdt->mr_mask, wdt->mr & wdt->mr_mask);
 
-	setup_timer(&wdt->timer, at91_ping, (unsigned long)wdt);
-
-	/*
-	 * Use min_heartbeat the first time to avoid spurious watchdog reset:
-	 * we don't know for how long the watchdog counter is running, and
-	 *  - resetting it right now might trigger a watchdog fault reset
-	 *  - waiting for heartbeat time might lead to a watchdog timeout
-	 *    reset
-	 */
-	mod_timer(&wdt->timer, jiffies + min_heartbeat);
-
 	/* Try to set timeout from device tree first */
 	if (watchdog_init_timeout(&wdt->wdd, 0, dev))
 		watchdog_init_timeout(&wdt->wdd, heartbeat, dev);
 	watchdog_set_nowayout(&wdt->wdd, wdt->nowayout);
-	err = watchdog_register_device(&wdt->wdd);
-	if (err)
-		goto out_stop_timer;
-
-	wdt->next_heartbeat = jiffies + wdt->wdd.timeout * HZ;
-
-	return 0;
 
-out_stop_timer:
-	del_timer(&wdt->timer);
-	return err;
+	return watchdog_register_device(&wdt->wdd);
 }
 
 /* ......................................................................... */
@@ -259,8 +182,6 @@ static const struct watchdog_info at91_wdt_info = {
 static const struct watchdog_ops at91_wdt_ops = {
 	.owner =	THIS_MODULE,
 	.start =	at91_wdt_start,
-	.stop =		at91_wdt_stop,
-	.set_timeout =	at91_wdt_set_timeout,
 };
 
 #if defined(CONFIG_OF)
@@ -376,7 +297,6 @@ static int __exit at91wdt_remove(struct platform_device *pdev)
 	watchdog_unregister_device(&wdt->wdd);
 
 	pr_warn("I quit now, hardware will probably reboot!\n");
-	del_timer(&wdt->timer);
 
 	return 0;
 }
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/8] watchdog: Introduce hardware maximum timeout in watchdog core
  2015-08-08  5:02 ` [PATCH v2 2/8] watchdog: Introduce hardware maximum timeout in watchdog core Guenter Roeck
@ 2015-08-13 21:13   ` Uwe Kleine-König
  2015-08-15 23:17     ` Guenter Roeck
  2015-08-14 11:23   ` Uwe Kleine-König
  1 sibling, 1 reply; 17+ messages in thread
From: Uwe Kleine-König @ 2015-08-13 21:13 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-watchdog, Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	linux-doc, Jonathan Corbet

Hello Guenter,

On Fri, Aug 07, 2015 at 10:02:41PM -0700, Guenter Roeck wrote:
> Introduce an optional hardware maximum timeout in the watchdog core.
> The hardware maximum timeout can be lower than the maximum timeout.
> 
> Drivers can set the maximum hardware timeout value in the watchdog data
> structure. If the configured timeout exceeds the maximum hardware timeout,
> the watchdog core enables a timer function to assist sending keepalive
> requests to the watchdog driver.
> 
> Cc: Timo Kokkonen <timo.kokkonen@offcode.fi>
> Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> ---
> v2:
> - Improved and hopefully clarified documentation.
> - Rearranged variables in struct watchdog_device such that internal variables
>   come last.
> - The code now ensures that the watchdog times out <timeout> seconds after
>   the most recent keepalive sent from user space.
> - The internal keepalive now stops silently and no longer generates a
>   warning message. Reason is that it will now stop early, while there
>   may still be a substantial amount of time for keepalives from user space
>   to arrive. If such keepalives arrive late (for example if user space
>   is configured to send keepalives just a few seconds before the watchdog
>   times out), the message would just be noise and not provide any value.
> ---
>  Documentation/watchdog/watchdog-kernel-api.txt |  23 +++-
>  drivers/watchdog/watchdog_dev.c                | 140 ++++++++++++++++++++++---
>  include/linux/watchdog.h                       |  26 +++--
>  3 files changed, 163 insertions(+), 26 deletions(-)
> 
> diff --git a/Documentation/watchdog/watchdog-kernel-api.txt b/Documentation/watchdog/watchdog-kernel-api.txt
> index d8b0d3367706..25b00b878a7b 100644
> --- a/Documentation/watchdog/watchdog-kernel-api.txt
> +++ b/Documentation/watchdog/watchdog-kernel-api.txt
> @@ -53,9 +53,12 @@ struct watchdog_device {
>  	unsigned int timeout;
>  	unsigned int min_timeout;
>  	unsigned int max_timeout;
> +	unsigned int max_hw_timeout_ms;
>  	void *driver_data;
> -	struct mutex lock;
>  	unsigned long status;
> +	struct mutex lock;
> +	unsigned long last_keepalive;
> +	struct delayed_work work;
>  	struct list_head deferred;
>  };
>  
> @@ -73,18 +76,28 @@ It contains following fields:
>    additional information about the watchdog timer itself. (Like it's unique name)
>  * ops: a pointer to the list of watchdog operations that the watchdog supports.
>  * timeout: the watchdog timer's timeout value (in seconds).
> +  This is the time after which the system will reboot if user space does
> +  not send a heartbeat request if WDOG_ACTIVE is set.
>  * min_timeout: the watchdog timer's minimum timeout value (in seconds).
>  * max_timeout: the watchdog timer's maximum timeout value (in seconds).
> +* max_hw_timeout_ms: Maximum hardware timeout, in milli-seconds. May differ
> +  from max_timeout. If set to a value larger than max_timeout, the
> +  infrastructure will send a heartbeat to the watchdog driver if 'timeout'
> +  is larger than 'max_hw_timeout / 2', unless WDOG_ACTIVE is set and user
> +  space failed to send a heartbeat for at least 'timeout' seconds.
To properly understand this it would be necessary to know what
max_timeout is. Maybe clearify that, too?

>  * bootstatus: status of the device after booting (reported with watchdog
>    WDIOF_* status bits).
>  * driver_data: a pointer to the drivers private data of a watchdog device.
>    This data should only be accessed via the watchdog_set_drvdata and
>    watchdog_get_drvdata routines.
> -* lock: Mutex for WatchDog Timer Driver Core internal use only.
>  * status: this field contains a number of status bits that give extra
>    information about the status of the device (Like: is the watchdog timer
>    running/active, is the nowayout bit set, is the device opened via
>    the /dev/watchdog interface or not, ...).
> +* lock: Mutex for WatchDog Timer Driver Core internal use only.
> +* last_keepalive: Time of most recent keepalive triggered from user space,
> +  in jiffies.
> +* work: Worker data structure for WatchDog Timer Driver Core internal use only.
>  * deferred: entry in wtd_deferred_reg_list which is used to
>    register early initialized watchdogs.
>  
> @@ -160,7 +173,11 @@ they are supported. These optional routines/operations are:
>    and -EIO for "could not write value to the watchdog". On success this
>    routine should set the timeout value of the watchdog_device to the
>    achieved timeout value (which may be different from the requested one
> -  because the watchdog does not necessarily has a 1 second resolution).
> +  because the watchdog does not necessarily have a 1 second resolution).
> +  Drivers implementing hw_max_timeout_ms set the hardware watchdog timeout
> +  to the minimum of timeout and hw_max_timeout_ms. Those drivers set the
> +  timeout value of the watchdog_device either to the requested timeout value
> +  (if it is larger than hw_max_timeout_ms), or to the achieved timeout value.
>    (Note: the WDIOF_SETTIMEOUT needs to be set in the options field of the
>    watchdog's info structure).
>  * get_timeleft: this routines returns the time that's left before a reset.
> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
> index 06171c73daf5..c04ba1a98cc8 100644
> --- a/drivers/watchdog/watchdog_dev.c
> +++ b/drivers/watchdog/watchdog_dev.c
> @@ -37,7 +37,9 @@
>  #include <linux/errno.h>	/* For the -ENODEV/... values */
>  #include <linux/kernel.h>	/* For printk/panic/... */
>  #include <linux/fs.h>		/* For file operations */
> +#include <linux/jiffies.h>	/* For timeout functions */
>  #include <linux/watchdog.h>	/* For watchdog specific items */
> +#include <linux/workqueue.h>	/* For workqueue */
>  #include <linux/miscdevice.h>	/* For handling misc devices */
>  #include <linux/init.h>		/* For __init/__exit/... */
>  #include <linux/uaccess.h>	/* For copy_to_user/put_user/... */
> @@ -49,6 +51,78 @@ static dev_t watchdog_devt;
>  /* the watchdog device behind /dev/watchdog */
>  static struct watchdog_device *old_wdd;
>  
> +static struct workqueue_struct *watchdog_wq;
> +
> +static inline bool watchdog_need_worker(struct watchdog_device *wdd)
> +{

Maybe add:
	/* all variables in milli seconds */

> +	unsigned int hm = wdd->max_hw_timeout_ms;
> +	unsigned int m = wdd->max_timeout * 1000;
> +	unsigned int t = wdd->timeout * 1000;
> +
> +	return watchdog_active(wdd) && hm && (!m || hm < m) && t > hm;

ok, this means:

	watchdog_active(wdd): Userspace thinks the hardware is running
	hm: the driver is aware of framework pinging
	!m: no artificial limit
	hm < m: some artificial limit (does this make sense to have
		max_timeout > DIV_ROUND_UP(max_hw_timeout_ms, 1000), or
		should we better enforce max_timeout = 0 for "good"
		drivers?)
	t > hm: userspace requests longer timeout than hardware can
		handle.

looks good.

> +}
> +
> +static unsigned long watchdog_next_keepalive(struct watchdog_device *wdd)
> +{
> +	unsigned int t = wdd->timeout * 1000;
> +	unsigned long to = wdd->last_keepalive + msecs_to_jiffies(t);
> +	unsigned long last, max_t, tj;
> +
> +	if (wdd->max_hw_timeout_ms && t > wdd->max_hw_timeout_ms)
> +		t = wdd->max_hw_timeout_ms;

watchdog_next_keepalive is only called after watchdog_need_worker
returned true, so there shouldn't be a need to repeat this test, right?

> +	tj = msecs_to_jiffies(t / 2);
> +
> +	/*
> +	 * Ensure that the watchdog times out wdd->timeout seconds
> +	 * after the most recent keepalive sent from user space.
> +	 */
> +	last = to - msecs_to_jiffies(t);
> +	if (time_is_after_jiffies(last))
> +		max_t = last - jiffies;
> +	else
> +		max_t = 0;
> +
> +	if (max_t < tj)
> +		tj = max_t;
> +
> +	return tj;
I find this hard to understand. Maybe the variable names could be
improved, something like:

	t -> hw_timeout_ms
	tj -> keep_alive_interval
	to -> virt_timeout
	last -> last_hw_ping

And I'd do:

	/*
	 * To ensure that the watchdog times out wdd->timeout seconds
	 * after the most recent ping from userspace, the last
	 * worker ping has to come in hw_timeout_ms before this timeout.
	 */
	last_hw_ping = virt_timeout - msecs_to_jiffies(hw_timeout_ms);

	/*
	 * Worker pings usually come at intervals of
	 * max_hw_timeout_ms / 2. This interval is shortend if
	 * virt_timeout is near (or already over).
	 */
	return min_t(long, last_hw_ping - jiffies, keep_alive_interval);

then you have to change the return type to long and ...

> +}
> +
> +static inline void watchdog_update_worker(struct watchdog_device *wdd,
> +					  bool cancel, bool sync)
> +{
> +	if (watchdog_need_worker(wdd)) {
> +		unsigned long t = watchdog_next_keepalive(wdd);
> +
> +		if (t)

... test for t > 0 here.

> +			mod_delayed_work(watchdog_wq, &wdd->work, t);
> +	} else if (cancel) {
> +		if (sync)

Up to now sync is always false. Does this change later? Should this
parameter be introduced only later, too?

> +			cancel_delayed_work_sync(&wdd->work);
> +		else
> +			cancel_delayed_work(&wdd->work);
> +	}
> +}

This function looks artificial, i.e. I don't understand why you would
want to modify the timer or stop it. Probably that is because the timer
does more than increasing the virtual timeout later?

> +static int _watchdog_ping(struct watchdog_device *wdd)
> +{
> +	int err;
> +
> +	if (test_bit(WDOG_UNREGISTERED, &wdd->status))
> +		return -ENODEV;
> +
> +	if (!watchdog_active(wdd))
> +		return 0;
> +
> +	if (wdd->ops->ping)
> +		err = wdd->ops->ping(wdd);  /* ping the watchdog */
> +	else
> +		err = wdd->ops->start(wdd); /* restart watchdog */
> +
> +	return err;
> +}
> +
>  /*
>   *	watchdog_ping: ping the watchdog.
>   *	@wdd: the watchdog device to ping
> @@ -61,26 +135,27 @@ static struct watchdog_device *old_wdd;
>  
>  static int watchdog_ping(struct watchdog_device *wdd)
>  {
> -	int err = 0;
> +	int err;
>  
>  	mutex_lock(&wdd->lock);
> +	err = _watchdog_ping(wdd);
> +	wdd->last_keepalive = jiffies;
I'd switch these two lines. Consider wdd->ops->ping takes some time then
the next ping should timed relative to before the ping, not after it.

> +	watchdog_update_worker(wdd, false, false);
> +	mutex_unlock(&wdd->lock);
>  
> -	if (test_bit(WDOG_UNREGISTERED, &wdd->status)) {
> -		err = -ENODEV;
> -		goto out_ping;
> -	}
> +	return err;
> +}
>  
> -	if (!watchdog_active(wdd))
> -		goto out_ping;
> +static void watchdog_ping_work(struct work_struct *work)
> +{
> +	struct watchdog_device *wdd;
>  
> -	if (wdd->ops->ping)
> -		err = wdd->ops->ping(wdd);	/* ping the watchdog */
> -	else
> -		err = wdd->ops->start(wdd);	/* restart watchdog */
> +	wdd = container_of(to_delayed_work(work), struct watchdog_device, work);
>  
> -out_ping:
> +	mutex_lock(&wdd->lock);
> +	_watchdog_ping(wdd);
> +	watchdog_update_worker(wdd, false, false);
>  	mutex_unlock(&wdd->lock);
> -	return err;
>  }
>  
>  /*
> @@ -107,8 +182,11 @@ static int watchdog_start(struct watchdog_device *wdd)
>  		goto out_start;
>  
>  	err = wdd->ops->start(wdd);
> -	if (err == 0)
> +	if (err == 0) {
>  		set_bit(WDOG_ACTIVE, &wdd->status);
> +		wdd->last_keepalive = jiffies;
> +		watchdog_update_worker(wdd, false, false);
> +	}
>  
>  out_start:
>  	mutex_unlock(&wdd->lock);
> @@ -146,8 +224,10 @@ static int watchdog_stop(struct watchdog_device *wdd)
>  	}
>  
>  	err = wdd->ops->stop(wdd);
> -	if (err == 0)
> +	if (err == 0) {
>  		clear_bit(WDOG_ACTIVE, &wdd->status);
> +		watchdog_update_worker(wdd, true, false);

Regarding my concern about watchdog_update_worker above: After
clear_bit(WDOG_ACTIVE, ...) it's an unconditional cancel_delayed_work
and I'd prefer to not have that disguised. Maybe it would be easier to
understand if there were two different functions, one with the semantic
of watchdog_update_worker(cancel=true) and one with
watchdog_update_worker(cancel=false)?

It's to late here now to review the rest of your patch. Will follow-up
after sleeping.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/8] watchdog: Introduce hardware maximum timeout in watchdog core
  2015-08-08  5:02 ` [PATCH v2 2/8] watchdog: Introduce hardware maximum timeout in watchdog core Guenter Roeck
  2015-08-13 21:13   ` Uwe Kleine-König
@ 2015-08-14 11:23   ` Uwe Kleine-König
  2015-08-15 23:21     ` Guenter Roeck
  1 sibling, 1 reply; 17+ messages in thread
From: Uwe Kleine-König @ 2015-08-14 11:23 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-watchdog, Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	linux-doc, Jonathan Corbet

Hello Guenter,

On Fri, Aug 07, 2015 at 10:02:41PM -0700, Guenter Roeck wrote:
> [...]
> @@ -61,26 +135,27 @@ static struct watchdog_device *old_wdd;
>  
>  static int watchdog_ping(struct watchdog_device *wdd)
>  {
> -	int err = 0;
> +	int err;
>  
>  	mutex_lock(&wdd->lock);
> +	err = _watchdog_ping(wdd);
> +	wdd->last_keepalive = jiffies;
> +	watchdog_update_worker(wdd, false, false);
> +	mutex_unlock(&wdd->lock);
>  
> -	if (test_bit(WDOG_UNREGISTERED, &wdd->status)) {
> -		err = -ENODEV;
> -		goto out_ping;
> -	}
> +	return err;
> +}
>  
> -	if (!watchdog_active(wdd))
> -		goto out_ping;
> +static void watchdog_ping_work(struct work_struct *work)
> +{
> +	struct watchdog_device *wdd;
>  
> -	if (wdd->ops->ping)
> -		err = wdd->ops->ping(wdd);	/* ping the watchdog */
> -	else
> -		err = wdd->ops->start(wdd);	/* restart watchdog */
> +	wdd = container_of(to_delayed_work(work), struct watchdog_device, work);
>  
> -out_ping:
> +	mutex_lock(&wdd->lock);
> +	_watchdog_ping(wdd);
> +	watchdog_update_worker(wdd, false, false);
>  	mutex_unlock(&wdd->lock);
> -	return err;
>  }
>  
>  /*
> @@ -107,8 +182,11 @@ static int watchdog_start(struct watchdog_device *wdd)
>  		goto out_start;
>  
>  	err = wdd->ops->start(wdd);
> -	if (err == 0)
> +	if (err == 0) {
>  		set_bit(WDOG_ACTIVE, &wdd->status);
> +		wdd->last_keepalive = jiffies;
> +		watchdog_update_worker(wdd, false, false);
> +	}
>  
>  out_start:
>  	mutex_unlock(&wdd->lock);
> @@ -146,8 +224,10 @@ static int watchdog_stop(struct watchdog_device *wdd)
>  	}
>  
>  	err = wdd->ops->stop(wdd);
> -	if (err == 0)
> +	if (err == 0) {
>  		clear_bit(WDOG_ACTIVE, &wdd->status);
> +		watchdog_update_worker(wdd, true, false);
> +	}
>  
>  out_stop:
>  	mutex_unlock(&wdd->lock);
> @@ -211,6 +291,8 @@ static int watchdog_set_timeout(struct watchdog_device *wdd,
>  
>  	err = wdd->ops->set_timeout(wdd, timeout);
>  
> +	watchdog_update_worker(wdd, true, false);

I still try to wrap my head around this function. You pass cancel=true
for stop and set_timeout to ensure that the worker doesn't continue to
run. That's fine.

For watchdog_start you pass cancel=false. I guess the background is that
after one of the next patches the worker might already run for handling
the watchdog being unstoppable. Maybe it's easier to grasp the logic if
you don't try to be too clever here: stop the worker on start
unconditionally and possibly restart it if the hardware needs extra
poking to fulfil the timeout set?

> +	if (!watchdog_wq)
> +		return -ENODEV;
> +
> +	INIT_DELAYED_WORK(&wdd->work, watchdog_ping_work);
> +
> +	if (!wdd->max_hw_timeout_ms)
> +		wdd->max_hw_timeout_ms = wdd->max_timeout * 1000;

With this (and assuming wdd->max_timeout > 0) the check for
max_hw_timeout_ms != 0 is not necessary, is it?

> +
>  	if (wdd->id == 0) {
>  		old_wdd = wdd;
>  		watchdog_miscdev.parent = wdd->parent;
> [...]
> @@ -585,9 +680,21 @@ int watchdog_dev_unregister(struct watchdog_device *wdd)
>  
>  int __init watchdog_dev_init(void)
>  {
> -	int err = alloc_chrdev_region(&watchdog_devt, 0, MAX_DOGS, "watchdog");
> +	int err;
> +
> +	watchdog_wq = alloc_workqueue("watchdogd",
> +				      WQ_HIGHPRI | WQ_MEM_RECLAIM, 0);
> +	if (!watchdog_wq) {
> +		pr_err("Failed to create watchdog workqueue\n");
> +		err = -ENOMEM;
> +		goto abort;

Why not return -ENOMEM directly?

> +	}
> +
> +	err = alloc_chrdev_region(&watchdog_devt, 0, MAX_DOGS, "watchdog");
>  	if (err < 0)
>  		pr_err("watchdog: unable to allocate char dev region\n");
> +
> +abort:
>  	return err;

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 3/8] watchdog: Introduce WDOG_RUNNING flag
  2015-08-08  5:02 ` [PATCH v2 3/8] watchdog: Introduce WDOG_RUNNING flag Guenter Roeck
@ 2015-08-14 19:04   ` Uwe Kleine-König
  2015-08-15 23:38     ` Guenter Roeck
  0 siblings, 1 reply; 17+ messages in thread
From: Uwe Kleine-König @ 2015-08-14 19:04 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-watchdog, Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	linux-doc, Jonathan Corbet

Hello Guenter,

> diff --git a/Documentation/watchdog/watchdog-kernel-api.txt b/Documentation/watchdog/watchdog-kernel-api.txt
> index 25b00b878a7b..6a54dc15a556 100644
> --- a/Documentation/watchdog/watchdog-kernel-api.txt
> +++ b/Documentation/watchdog/watchdog-kernel-api.txt
> [...]
> @@ -193,9 +194,12 @@ they are supported. These optional routines/operations are:
>  The status bits should (preferably) be set with the set_bit and clear_bit alike
>  bit-operations. The status bits that are defined are:
>  * WDOG_ACTIVE: this status bit indicates whether or not a watchdog timer device
> -  is active or not. When the watchdog is active after booting, then you should
> -  set this status bit (Note: when you register the watchdog timer device with
> -  this bit set, then opening /dev/watchdog will skip the start operation)
> +  is active or not from user perspective. User space is expected to send
> +  heartbeat requests to the driver while this flag is set. If the watchdog
> +  is active after booting, and you don't want the infrastructure to send
> +  heartbeats to the watchdog driver, then you should set this status bit.

IMHO this should not be the driver author's choice! If you implement
policy in the kernel it should at least be implemented in the framework
and preferably easily changeable. (At least with Kconfig, but better use
a kernel parameter (or both, the latter overriding the former).)

> +  Note: when you register the watchdog timer device with this bit set,
> +  then opening /dev/watchdog will skip the start operation.
>  * WDOG_DEV_OPEN: this status bit shows whether or not the watchdog device
>    was opened via /dev/watchdog.
>    (This bit should only be used by the WatchDog Timer Driver Core).
> @@ -209,6 +213,11 @@ bit-operations. The status bits that are defined are:
>    any watchdog_ops, so that you can be sure that no operations (other then
>    unref) will get called after unregister, even if userspace still holds a
>    reference to /dev/watchdog
> +* WDOG_RUNNING: Set by the watchdog driver if the hardware watchdog is running.
> +  The bit must be set if the watchdog timer hardware can not be stopped.
> +  The bit may also be set if the watchdog timer is running aftyer booting,
> +  before the watchdog device is opened. If set, the watchdog infrastructure
> +  will send keepalives to the watchdog hardware while WDOG_ACTIVE is not set.
>  
>    To set the WDOG_NO_WAY_OUT status bit (before registering your watchdog
>    timer device) you can either:
> [...]
> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
> index c04ba1a98cc8..676e233d5e7b 100644
> --- a/drivers/watchdog/watchdog_dev.c
> +++ b/drivers/watchdog/watchdog_dev.c
> @@ -59,7 +59,8 @@ static inline bool watchdog_need_worker(struct watchdog_device *wdd)
>  	unsigned int m = wdd->max_timeout * 1000;
>  	unsigned int t = wdd->timeout * 1000;
>  
> -	return watchdog_active(wdd) && hm && (!m || hm < m) && t > hm;
> +	return (watchdog_active(wdd) && hm && (!m || hm < m) && t > hm) ||
> +	       (t && !watchdog_active(wdd) && watchdog_running(wdd));

What is the meaning of

	!t && !watchdog_active(wdd) && watchdog_running(wdd)

? Can this happen at all? If not, drop "t && "?

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/8] watchdog: Make set_timeout function optional
  2015-08-08  5:02 ` [PATCH v2 4/8] watchdog: Make set_timeout function optional Guenter Roeck
@ 2015-08-14 19:05   ` Uwe Kleine-König
  2015-08-15 23:41     ` Guenter Roeck
  0 siblings, 1 reply; 17+ messages in thread
From: Uwe Kleine-König @ 2015-08-14 19:05 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: linux-watchdog, Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	linux-doc, Jonathan Corbet

Hello Guenter,

On Fri, Aug 07, 2015 at 10:02:43PM -0700, Guenter Roeck wrote:
> For some watchdogs, the hardware timeout is fixed, and the
> watchdog driver depends on the watchdog core to handle the
> actual timeout. In this situation, the watchdog driver might
> only set the 'timeout' variable but do nothing else.
> This can as well be handled by the infrastructure, so make
> the set_timeout callback optional. If WDIOF_SETTIMEOUT is
> configured but the .set_timeout callback is not available,
> update the timeout variable in the infrastructure code.
> 
> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>

Thanks
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/8] watchdog: Introduce hardware maximum timeout in watchdog core
  2015-08-13 21:13   ` Uwe Kleine-König
@ 2015-08-15 23:17     ` Guenter Roeck
  0 siblings, 0 replies; 17+ messages in thread
From: Guenter Roeck @ 2015-08-15 23:17 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: linux-watchdog, Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	linux-doc, Jonathan Corbet

Hi Uwe,

On 08/13/2015 02:13 PM, Uwe Kleine-König wrote:
> Hello Guenter,
>
> On Fri, Aug 07, 2015 at 10:02:41PM -0700, Guenter Roeck wrote:
>> Introduce an optional hardware maximum timeout in the watchdog core.
>> The hardware maximum timeout can be lower than the maximum timeout.
>>
>> Drivers can set the maximum hardware timeout value in the watchdog data
>> structure. If the configured timeout exceeds the maximum hardware timeout,
>> the watchdog core enables a timer function to assist sending keepalive
>> requests to the watchdog driver.
>>
>> Cc: Timo Kokkonen <timo.kokkonen@offcode.fi>
>> Cc: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
>> ---
>> v2:
>> - Improved and hopefully clarified documentation.
>> - Rearranged variables in struct watchdog_device such that internal variables
>>    come last.
>> - The code now ensures that the watchdog times out <timeout> seconds after
>>    the most recent keepalive sent from user space.
>> - The internal keepalive now stops silently and no longer generates a
>>    warning message. Reason is that it will now stop early, while there
>>    may still be a substantial amount of time for keepalives from user space
>>    to arrive. If such keepalives arrive late (for example if user space
>>    is configured to send keepalives just a few seconds before the watchdog
>>    times out), the message would just be noise and not provide any value.
>> ---
>>   Documentation/watchdog/watchdog-kernel-api.txt |  23 +++-
>>   drivers/watchdog/watchdog_dev.c                | 140 ++++++++++++++++++++++---
>>   include/linux/watchdog.h                       |  26 +++--
>>   3 files changed, 163 insertions(+), 26 deletions(-)
>>
>> diff --git a/Documentation/watchdog/watchdog-kernel-api.txt b/Documentation/watchdog/watchdog-kernel-api.txt
>> index d8b0d3367706..25b00b878a7b 100644
>> --- a/Documentation/watchdog/watchdog-kernel-api.txt
>> +++ b/Documentation/watchdog/watchdog-kernel-api.txt
>> @@ -53,9 +53,12 @@ struct watchdog_device {
>>   	unsigned int timeout;
>>   	unsigned int min_timeout;
>>   	unsigned int max_timeout;
>> +	unsigned int max_hw_timeout_ms;
>>   	void *driver_data;
>> -	struct mutex lock;
>>   	unsigned long status;
>> +	struct mutex lock;
>> +	unsigned long last_keepalive;
>> +	struct delayed_work work;
>>   	struct list_head deferred;
>>   };
>>
>> @@ -73,18 +76,28 @@ It contains following fields:
>>     additional information about the watchdog timer itself. (Like it's unique name)
>>   * ops: a pointer to the list of watchdog operations that the watchdog supports.
>>   * timeout: the watchdog timer's timeout value (in seconds).
>> +  This is the time after which the system will reboot if user space does
>> +  not send a heartbeat request if WDOG_ACTIVE is set.
>>   * min_timeout: the watchdog timer's minimum timeout value (in seconds).
>>   * max_timeout: the watchdog timer's maximum timeout value (in seconds).
>> +* max_hw_timeout_ms: Maximum hardware timeout, in milli-seconds. May differ
>> +  from max_timeout. If set to a value larger than max_timeout, the
>> +  infrastructure will send a heartbeat to the watchdog driver if 'timeout'
>> +  is larger than 'max_hw_timeout / 2', unless WDOG_ACTIVE is set and user
>> +  space failed to send a heartbeat for at least 'timeout' seconds.
> To properly understand this it would be necessary to know what
> max_timeout is. Maybe clearify that, too?
>

I think I found a better solution: Declare that max_timeout won't be used
if max_hw_timeout_ms is provided. This also simplifies the code a lot.

>>   * bootstatus: status of the device after booting (reported with watchdog
>>     WDIOF_* status bits).
>>   * driver_data: a pointer to the drivers private data of a watchdog device.
>>     This data should only be accessed via the watchdog_set_drvdata and
>>     watchdog_get_drvdata routines.
>> -* lock: Mutex for WatchDog Timer Driver Core internal use only.
>>   * status: this field contains a number of status bits that give extra
>>     information about the status of the device (Like: is the watchdog timer
>>     running/active, is the nowayout bit set, is the device opened via
>>     the /dev/watchdog interface or not, ...).
>> +* lock: Mutex for WatchDog Timer Driver Core internal use only.
>> +* last_keepalive: Time of most recent keepalive triggered from user space,
>> +  in jiffies.
>> +* work: Worker data structure for WatchDog Timer Driver Core internal use only.
>>   * deferred: entry in wtd_deferred_reg_list which is used to
>>     register early initialized watchdogs.
>>
>> @@ -160,7 +173,11 @@ they are supported. These optional routines/operations are:
>>     and -EIO for "could not write value to the watchdog". On success this
>>     routine should set the timeout value of the watchdog_device to the
>>     achieved timeout value (which may be different from the requested one
>> -  because the watchdog does not necessarily has a 1 second resolution).
>> +  because the watchdog does not necessarily have a 1 second resolution).
>> +  Drivers implementing hw_max_timeout_ms set the hardware watchdog timeout
>> +  to the minimum of timeout and hw_max_timeout_ms. Those drivers set the
>> +  timeout value of the watchdog_device either to the requested timeout value
>> +  (if it is larger than hw_max_timeout_ms), or to the achieved timeout value.
>>     (Note: the WDIOF_SETTIMEOUT needs to be set in the options field of the
>>     watchdog's info structure).
>>   * get_timeleft: this routines returns the time that's left before a reset.
>> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
>> index 06171c73daf5..c04ba1a98cc8 100644
>> --- a/drivers/watchdog/watchdog_dev.c
>> +++ b/drivers/watchdog/watchdog_dev.c
>> @@ -37,7 +37,9 @@
>>   #include <linux/errno.h>	/* For the -ENODEV/... values */
>>   #include <linux/kernel.h>	/* For printk/panic/... */
>>   #include <linux/fs.h>		/* For file operations */
>> +#include <linux/jiffies.h>	/* For timeout functions */
>>   #include <linux/watchdog.h>	/* For watchdog specific items */
>> +#include <linux/workqueue.h>	/* For workqueue */
>>   #include <linux/miscdevice.h>	/* For handling misc devices */
>>   #include <linux/init.h>		/* For __init/__exit/... */
>>   #include <linux/uaccess.h>	/* For copy_to_user/put_user/... */
>> @@ -49,6 +51,78 @@ static dev_t watchdog_devt;
>>   /* the watchdog device behind /dev/watchdog */
>>   static struct watchdog_device *old_wdd;
>>
>> +static struct workqueue_struct *watchdog_wq;
>> +
>> +static inline bool watchdog_need_worker(struct watchdog_device *wdd)
>> +{
>
> Maybe add:
> 	/* all variables in milli seconds */
>
Done.

>> +	unsigned int hm = wdd->max_hw_timeout_ms;
>> +	unsigned int m = wdd->max_timeout * 1000;
>> +	unsigned int t = wdd->timeout * 1000;
>> +
>> +	return watchdog_active(wdd) && hm && (!m || hm < m) && t > hm;
>
> ok, this means:
>
> 	watchdog_active(wdd): Userspace thinks the hardware is running
> 	hm: the driver is aware of framework pinging
> 	!m: no artificial limit
> 	hm < m: some artificial limit (does this make sense to have
> 		max_timeout > DIV_ROUND_UP(max_hw_timeout_ms, 1000), or
> 		should we better enforce max_timeout = 0 for "good"
> 		drivers?)
> 	t > hm: userspace requests longer timeout than hardware can
> 		handle.

I added this to the comments, with a slight modification. Specifically,
I don't check max_timeout anymore if max_hw_timeout_ms is specified.

>
> looks good.
>
>> +}
>> +
>> +static unsigned long watchdog_next_keepalive(struct watchdog_device *wdd)
>> +{
>> +	unsigned int t = wdd->timeout * 1000;
>> +	unsigned long to = wdd->last_keepalive + msecs_to_jiffies(t);
>> +	unsigned long last, max_t, tj;
>> +
>> +	if (wdd->max_hw_timeout_ms && t > wdd->max_hw_timeout_ms)
>> +		t = wdd->max_hw_timeout_ms;
>
> watchdog_next_keepalive is only called after watchdog_need_worker
> returned true, so there shouldn't be a need to repeat this test, right?
>

Yes, the check if wdd->max_hw_timeout_ms != 0 can be dropped.

>> +	tj = msecs_to_jiffies(t / 2);
>> +
>> +	/*
>> +	 * Ensure that the watchdog times out wdd->timeout seconds
>> +	 * after the most recent keepalive sent from user space.
>> +	 */
>> +	last = to - msecs_to_jiffies(t);
>> +	if (time_is_after_jiffies(last))
>> +		max_t = last - jiffies;
>> +	else
>> +		max_t = 0;
>> +
>> +	if (max_t < tj)
>> +		tj = max_t;
>> +
>> +	return tj;
> I find this hard to understand. Maybe the variable names could be
> improved, something like:
>
> 	t -> hw_timeout_ms
> 	tj -> keep_alive_interval
> 	to -> virt_timeout
> 	last -> last_hw_ping
>
> And I'd do:
>
> 	/*
> 	 * To ensure that the watchdog times out wdd->timeout seconds
> 	 * after the most recent ping from userspace, the last
> 	 * worker ping has to come in hw_timeout_ms before this timeout.
> 	 */
> 	last_hw_ping = virt_timeout - msecs_to_jiffies(hw_timeout_ms);
>
> 	/*
> 	 * Worker pings usually come at intervals of
> 	 * max_hw_timeout_ms / 2. This interval is shortend if
> 	 * virt_timeout is near (or already over).
> 	 */
> 	return min_t(long, last_hw_ping - jiffies, keep_alive_interval);
>
> then you have to change the return type to long and ...
>
>> +}
>> +
>> +static inline void watchdog_update_worker(struct watchdog_device *wdd,
>> +					  bool cancel, bool sync)
>> +{
>> +	if (watchdog_need_worker(wdd)) {
>> +		unsigned long t = watchdog_next_keepalive(wdd);
>> +
>> +		if (t)
>
> ... test for t > 0 here.
>
Makes sense, done.

>> +			mod_delayed_work(watchdog_wq, &wdd->work, t);
>> +	} else if (cancel) {
>> +		if (sync)
>
> Up to now sync is always false. Does this change later? Should this
> parameter be introduced only later, too?
>
>> +			cancel_delayed_work_sync(&wdd->work);
>> +		else
>> +			cancel_delayed_work(&wdd->work);
>> +	}
>> +}
>
> This function looks artificial, i.e. I don't understand why you would
> want to modify the timer or stop it. Probably that is because the timer
> does more than increasing the virtual timeout later?
>

I dropped the sync part. This will be needed with the next patch of the series
(if I recall correctly) so I'll introduce it there.

cancel is needed because the timer does not have to be canceled if called
from the timer function itself.

>> +static int _watchdog_ping(struct watchdog_device *wdd)
>> +{
>> +	int err;
>> +
>> +	if (test_bit(WDOG_UNREGISTERED, &wdd->status))
>> +		return -ENODEV;
>> +
>> +	if (!watchdog_active(wdd))
>> +		return 0;
>> +
>> +	if (wdd->ops->ping)
>> +		err = wdd->ops->ping(wdd);  /* ping the watchdog */
>> +	else
>> +		err = wdd->ops->start(wdd); /* restart watchdog */
>> +
>> +	return err;
>> +}
>> +
>>   /*
>>    *	watchdog_ping: ping the watchdog.
>>    *	@wdd: the watchdog device to ping
>> @@ -61,26 +135,27 @@ static struct watchdog_device *old_wdd;
>>
>>   static int watchdog_ping(struct watchdog_device *wdd)
>>   {
>> -	int err = 0;
>> +	int err;
>>
>>   	mutex_lock(&wdd->lock);
>> +	err = _watchdog_ping(wdd);
>> +	wdd->last_keepalive = jiffies;
> I'd switch these two lines. Consider wdd->ops->ping takes some time then
> the next ping should timed relative to before the ping, not after it.
>
Ok.

>> +	watchdog_update_worker(wdd, false, false);
>> +	mutex_unlock(&wdd->lock);
>>
>> -	if (test_bit(WDOG_UNREGISTERED, &wdd->status)) {
>> -		err = -ENODEV;
>> -		goto out_ping;
>> -	}
>> +	return err;
>> +}
>>
>> -	if (!watchdog_active(wdd))
>> -		goto out_ping;
>> +static void watchdog_ping_work(struct work_struct *work)
>> +{
>> +	struct watchdog_device *wdd;
>>
>> -	if (wdd->ops->ping)
>> -		err = wdd->ops->ping(wdd);	/* ping the watchdog */
>> -	else
>> -		err = wdd->ops->start(wdd);	/* restart watchdog */
>> +	wdd = container_of(to_delayed_work(work), struct watchdog_device, work);
>>
>> -out_ping:
>> +	mutex_lock(&wdd->lock);
>> +	_watchdog_ping(wdd);
>> +	watchdog_update_worker(wdd, false, false);
>>   	mutex_unlock(&wdd->lock);
>> -	return err;
>>   }
>>
>>   /*
>> @@ -107,8 +182,11 @@ static int watchdog_start(struct watchdog_device *wdd)
>>   		goto out_start;
>>
>>   	err = wdd->ops->start(wdd);
>> -	if (err == 0)
>> +	if (err == 0) {
>>   		set_bit(WDOG_ACTIVE, &wdd->status);
>> +		wdd->last_keepalive = jiffies;
>> +		watchdog_update_worker(wdd, false, false);
>> +	}
>>
>>   out_start:
>>   	mutex_unlock(&wdd->lock);
>> @@ -146,8 +224,10 @@ static int watchdog_stop(struct watchdog_device *wdd)
>>   	}
>>
>>   	err = wdd->ops->stop(wdd);
>> -	if (err == 0)
>> +	if (err == 0) {
>>   		clear_bit(WDOG_ACTIVE, &wdd->status);
>> +		watchdog_update_worker(wdd, true, false);
>
> Regarding my concern about watchdog_update_worker above: After
> clear_bit(WDOG_ACTIVE, ...) it's an unconditional cancel_delayed_work
> and I'd prefer to not have that disguised. Maybe it would be easier to
> understand if there were two different functions, one with the semantic
> of watchdog_update_worker(cancel=true) and one with
> watchdog_update_worker(cancel=false)?
>
I replaced it with cancel_delayed_work() in this patch. It needs to be
the update function after the next patch, but I'll introduce it there.

Reason for doing it here was to reduce the number of changes needed
in the next patch, but I now think that is just adding confusion.

> It's to late here now to review the rest of your patch. Will follow-up
> after sleeping.
>
Thanks!

Guenter


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 2/8] watchdog: Introduce hardware maximum timeout in watchdog core
  2015-08-14 11:23   ` Uwe Kleine-König
@ 2015-08-15 23:21     ` Guenter Roeck
  0 siblings, 0 replies; 17+ messages in thread
From: Guenter Roeck @ 2015-08-15 23:21 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: linux-watchdog, Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	linux-doc, Jonathan Corbet

On 08/14/2015 04:23 AM, Uwe Kleine-König wrote:
> Hello Guenter,
>
> On Fri, Aug 07, 2015 at 10:02:41PM -0700, Guenter Roeck wrote:
>> [...]
>> @@ -61,26 +135,27 @@ static struct watchdog_device *old_wdd;
>>
>>   static int watchdog_ping(struct watchdog_device *wdd)
>>   {
>> -	int err = 0;
>> +	int err;
>>
>>   	mutex_lock(&wdd->lock);
>> +	err = _watchdog_ping(wdd);
>> +	wdd->last_keepalive = jiffies;
>> +	watchdog_update_worker(wdd, false, false);
>> +	mutex_unlock(&wdd->lock);
>>
>> -	if (test_bit(WDOG_UNREGISTERED, &wdd->status)) {
>> -		err = -ENODEV;
>> -		goto out_ping;
>> -	}
>> +	return err;
>> +}
>>
>> -	if (!watchdog_active(wdd))
>> -		goto out_ping;
>> +static void watchdog_ping_work(struct work_struct *work)
>> +{
>> +	struct watchdog_device *wdd;
>>
>> -	if (wdd->ops->ping)
>> -		err = wdd->ops->ping(wdd);	/* ping the watchdog */
>> -	else
>> -		err = wdd->ops->start(wdd);	/* restart watchdog */
>> +	wdd = container_of(to_delayed_work(work), struct watchdog_device, work);
>>
>> -out_ping:
>> +	mutex_lock(&wdd->lock);
>> +	_watchdog_ping(wdd);
>> +	watchdog_update_worker(wdd, false, false);
>>   	mutex_unlock(&wdd->lock);
>> -	return err;
>>   }
>>
>>   /*
>> @@ -107,8 +182,11 @@ static int watchdog_start(struct watchdog_device *wdd)
>>   		goto out_start;
>>
>>   	err = wdd->ops->start(wdd);
>> -	if (err == 0)
>> +	if (err == 0) {
>>   		set_bit(WDOG_ACTIVE, &wdd->status);
>> +		wdd->last_keepalive = jiffies;
>> +		watchdog_update_worker(wdd, false, false);
>> +	}
>>
>>   out_start:
>>   	mutex_unlock(&wdd->lock);
>> @@ -146,8 +224,10 @@ static int watchdog_stop(struct watchdog_device *wdd)
>>   	}
>>
>>   	err = wdd->ops->stop(wdd);
>> -	if (err == 0)
>> +	if (err == 0) {
>>   		clear_bit(WDOG_ACTIVE, &wdd->status);
>> +		watchdog_update_worker(wdd, true, false);
>> +	}
>>
>>   out_stop:
>>   	mutex_unlock(&wdd->lock);
>> @@ -211,6 +291,8 @@ static int watchdog_set_timeout(struct watchdog_device *wdd,
>>
>>   	err = wdd->ops->set_timeout(wdd, timeout);
>>
>> +	watchdog_update_worker(wdd, true, false);
>
> I still try to wrap my head around this function. You pass cancel=true
> for stop and set_timeout to ensure that the worker doesn't continue to
> run. That's fine.
>
> For watchdog_start you pass cancel=false. I guess the background is that
> after one of the next patches the worker might already run for handling
> the watchdog being unstoppable. Maybe it's easier to grasp the logic if
> you don't try to be too clever here: stop the worker on start
> unconditionally and possibly restart it if the hardware needs extra
> poking to fulfil the timeout set?
>
I thought it would reduce the amount of code, and I thought it would be
more confusing and complicated to first call cancel the worker followed
by a (conditional) start. No strong opinion, though; I'll be happy to
make that change in exchange for a Reviewed-by:.


>> +	if (!watchdog_wq)
>> +		return -ENODEV;
>> +
>> +	INIT_DELAYED_WORK(&wdd->work, watchdog_ping_work);
>> +
>> +	if (!wdd->max_hw_timeout_ms)
>> +		wdd->max_hw_timeout_ms = wdd->max_timeout * 1000;
>
> With this (and assuming wdd->max_timeout > 0) the check for
> max_hw_timeout_ms != 0 is not necessary, is it?
>
With the logical change I am making, to ignore max_timeout if max_hw_timeout_ms
is configured, it is indeed no longer necessary (nor desirable).

>> +
>>   	if (wdd->id == 0) {
>>   		old_wdd = wdd;
>>   		watchdog_miscdev.parent = wdd->parent;
>> [...]
>> @@ -585,9 +680,21 @@ int watchdog_dev_unregister(struct watchdog_device *wdd)
>>
>>   int __init watchdog_dev_init(void)
>>   {
>> -	int err = alloc_chrdev_region(&watchdog_devt, 0, MAX_DOGS, "watchdog");
>> +	int err;
>> +
>> +	watchdog_wq = alloc_workqueue("watchdogd",
>> +				      WQ_HIGHPRI | WQ_MEM_RECLAIM, 0);
>> +	if (!watchdog_wq) {
>> +		pr_err("Failed to create watchdog workqueue\n");
>> +		err = -ENOMEM;
>> +		goto abort;
>
> Why not return -ENOMEM directly?
>
No idea. Changed.

Thanks,
Guenter


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 3/8] watchdog: Introduce WDOG_RUNNING flag
  2015-08-14 19:04   ` Uwe Kleine-König
@ 2015-08-15 23:38     ` Guenter Roeck
  0 siblings, 0 replies; 17+ messages in thread
From: Guenter Roeck @ 2015-08-15 23:38 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: linux-watchdog, Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	linux-doc, Jonathan Corbet

On 08/14/2015 12:04 PM, Uwe Kleine-König wrote:
> Hello Guenter,
>
>> diff --git a/Documentation/watchdog/watchdog-kernel-api.txt b/Documentation/watchdog/watchdog-kernel-api.txt
>> index 25b00b878a7b..6a54dc15a556 100644
>> --- a/Documentation/watchdog/watchdog-kernel-api.txt
>> +++ b/Documentation/watchdog/watchdog-kernel-api.txt
>> [...]
>> @@ -193,9 +194,12 @@ they are supported. These optional routines/operations are:
>>   The status bits should (preferably) be set with the set_bit and clear_bit alike
>>   bit-operations. The status bits that are defined are:
>>   * WDOG_ACTIVE: this status bit indicates whether or not a watchdog timer device
>> -  is active or not. When the watchdog is active after booting, then you should
>> -  set this status bit (Note: when you register the watchdog timer device with
>> -  this bit set, then opening /dev/watchdog will skip the start operation)
>> +  is active or not from user perspective. User space is expected to send
>> +  heartbeat requests to the driver while this flag is set. If the watchdog
>> +  is active after booting, and you don't want the infrastructure to send
>> +  heartbeats to the watchdog driver, then you should set this status bit.
>
> IMHO this should not be the driver author's choice! If you implement
> policy in the kernel it should at least be implemented in the framework
> and preferably easily changeable. (At least with Kconfig, but better use
> a kernel parameter (or both, the latter overriding the former).)
>
Agreed. I'll change that and simply drop that part. After all,
we now have WDOG_RUNNING to indicate that the watchdog is running.
I'll just have to make sure that there are no drivers which set
WDOG_ACTIVE at boot (afaics there are none).

>> +  Note: when you register the watchdog timer device with this bit set,
>> +  then opening /dev/watchdog will skip the start operation.
>>   * WDOG_DEV_OPEN: this status bit shows whether or not the watchdog device
>>     was opened via /dev/watchdog.
>>     (This bit should only be used by the WatchDog Timer Driver Core).
>> @@ -209,6 +213,11 @@ bit-operations. The status bits that are defined are:
>>     any watchdog_ops, so that you can be sure that no operations (other then
>>     unref) will get called after unregister, even if userspace still holds a
>>     reference to /dev/watchdog
>> +* WDOG_RUNNING: Set by the watchdog driver if the hardware watchdog is running.
>> +  The bit must be set if the watchdog timer hardware can not be stopped.
>> +  The bit may also be set if the watchdog timer is running aftyer booting,
>> +  before the watchdog device is opened. If set, the watchdog infrastructure
>> +  will send keepalives to the watchdog hardware while WDOG_ACTIVE is not set.
>>
>>     To set the WDOG_NO_WAY_OUT status bit (before registering your watchdog
>>     timer device) you can either:
>> [...]
>> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
>> index c04ba1a98cc8..676e233d5e7b 100644
>> --- a/drivers/watchdog/watchdog_dev.c
>> +++ b/drivers/watchdog/watchdog_dev.c
>> @@ -59,7 +59,8 @@ static inline bool watchdog_need_worker(struct watchdog_device *wdd)
>>   	unsigned int m = wdd->max_timeout * 1000;
>>   	unsigned int t = wdd->timeout * 1000;
>>
>> -	return watchdog_active(wdd) && hm && (!m || hm < m) && t > hm;
>> +	return (watchdog_active(wdd) && hm && (!m || hm < m) && t > hm) ||
>> +	       (t && !watchdog_active(wdd) && watchdog_running(wdd));
>
> What is the meaning of
>
> 	!t && !watchdog_active(wdd) && watchdog_running(wdd)
>
> ? Can this happen at all? If not, drop "t && "?
>
t can be 0, meaning "the watchdog timeout is unknown", unless a driver sets min_timeout
to a value larger than 0. Unfortunately, that is not explicitly specified.

Thanks,
Guenter


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v2 4/8] watchdog: Make set_timeout function optional
  2015-08-14 19:05   ` Uwe Kleine-König
@ 2015-08-15 23:41     ` Guenter Roeck
  0 siblings, 0 replies; 17+ messages in thread
From: Guenter Roeck @ 2015-08-15 23:41 UTC (permalink / raw)
  To: Uwe Kleine-König
  Cc: linux-watchdog, Wim Van Sebroeck, linux-kernel, Timo Kokkonen,
	linux-doc, Jonathan Corbet

Hi Uwe,

On 08/14/2015 12:05 PM, Uwe Kleine-König wrote:
> Hello Guenter,
>
> On Fri, Aug 07, 2015 at 10:02:43PM -0700, Guenter Roeck wrote:
>> For some watchdogs, the hardware timeout is fixed, and the
>> watchdog driver depends on the watchdog core to handle the
>> actual timeout. In this situation, the watchdog driver might
>> only set the 'timeout' variable but do nothing else.
>> This can as well be handled by the infrastructure, so make
>> the set_timeout callback optional. If WDIOF_SETTIMEOUT is
>> configured but the .set_timeout callback is not available,
>> update the timeout variable in the infrastructure code.
>>
>> Signed-off-by: Guenter Roeck <linux@roeck-us.net>
> Acked-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>

Thanks!
Guenter



^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2015-08-15 23:41 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-08  5:02 [PATCH v2 0/8] watchdog: Add support for keepalives triggered by infrastructure Guenter Roeck
2015-08-08  5:02 ` [PATCH v2 1/8] watchdog: watchdog_dev: Use single variable name for struct watchdog_device Guenter Roeck
2015-08-08  5:02 ` [PATCH v2 2/8] watchdog: Introduce hardware maximum timeout in watchdog core Guenter Roeck
2015-08-13 21:13   ` Uwe Kleine-König
2015-08-15 23:17     ` Guenter Roeck
2015-08-14 11:23   ` Uwe Kleine-König
2015-08-15 23:21     ` Guenter Roeck
2015-08-08  5:02 ` [PATCH v2 3/8] watchdog: Introduce WDOG_RUNNING flag Guenter Roeck
2015-08-14 19:04   ` Uwe Kleine-König
2015-08-15 23:38     ` Guenter Roeck
2015-08-08  5:02 ` [PATCH v2 4/8] watchdog: Make set_timeout function optional Guenter Roeck
2015-08-14 19:05   ` Uwe Kleine-König
2015-08-15 23:41     ` Guenter Roeck
2015-08-08  5:02 ` [PATCH v2 5/8] watchdog: imx2: Convert to use infrastructure triggered keepalives Guenter Roeck
2015-08-08  5:02 ` [PATCH v2 6/8] watchdog: retu: " Guenter Roeck
2015-08-08  5:02 ` [PATCH v2 7/8] watchdog: gpio_wdt: " Guenter Roeck
2015-08-08  5:02 ` [PATCH v2 8/8] watchdog: at91sam9: " Guenter Roeck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).