All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC 0/3] watchdog: introduce open deadline
@ 2016-07-14  9:16 Rasmus Villemoes
  2016-07-14  9:16 ` [RFC 1/3] watchdog: change watchdog_need_worker logic Rasmus Villemoes
                   ` (3 more replies)
  0 siblings, 4 replies; 28+ messages in thread
From: Rasmus Villemoes @ 2016-07-14  9:16 UTC (permalink / raw)
  To: linux-watchdog, linux-kernel; +Cc: Rasmus Villemoes, Guenter Roeck

If a watchdog driver tells the framework that the device is running,
the framework takes care of feeding the watchdog until userspace opens
the device. If the userspace application which is supposed to do that
never comes up properly, the watchdog is fed indefinitely by the
kernel. This can be especially problematic for embedded devices.

These patches allow one to set a maximum time for which the kernel
will feed the watchdog, thus ensuring that either userspace has come
up, or the board gets reset. This allows fallback logic in the
bootloader to attempt some recovery (for example, if an automatic
update is in progress, it could roll back to the previous version).

The patches have been tested on a Raspberry Pi 2 (with a suitably
modified driver for setting WDOG_HW_RUNNING - patches will be
submitted seperately) and a Wandboard.

Rasmus Villemoes (3):
  watchdog: change watchdog_need_worker logic
  watchdog: introduce watchdog_worker_should_ping helper
  watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE

 drivers/watchdog/Kconfig        | 21 +++++++++++++++
 drivers/watchdog/watchdog_dev.c | 60 ++++++++++++++++++++++++++++++++++++-----
 2 files changed, 74 insertions(+), 7 deletions(-)

-- 
2.5.0

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC 1/3] watchdog: change watchdog_need_worker logic
  2016-07-14  9:16 [RFC 0/3] watchdog: introduce open deadline Rasmus Villemoes
@ 2016-07-14  9:16 ` Rasmus Villemoes
  2016-07-14 20:45   ` Guenter Roeck
                     ` (2 more replies)
  2016-07-14  9:16 ` [RFC 2/3] watchdog: introduce watchdog_worker_should_ping helper Rasmus Villemoes
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 28+ messages in thread
From: Rasmus Villemoes @ 2016-07-14  9:16 UTC (permalink / raw)
  To: Wim Van Sebroeck, Guenter Roeck
  Cc: Rasmus Villemoes, linux-watchdog, linux-kernel

If the driver indicates that the watchdog is running, the framework
should feed it until userspace opens the device, regardless of whether
the driver has set max_hw_heartbeat_ms.

This patch only affects the case where wdd->max_hw_heartbeat_ms is
zero, wdd->timeout is non-zero, the watchdog is not active and the
hardware device is running (*):

- If wdd->timeout is zero, watchdog_need_worker() returns false both
before and after this patch, and watchdog_next_keepalive() is not
called.

- If watchdog_active(wdd), the return value from watchdog_need_worker
is also the same as before (namely, hm && t > hm). Hence in that case,
watchdog_next_keepalive() is only called if hm == max_hw_heartbeat_ms
is non-zero, so the change to min_not_zero there is a no-op.

- If the watchdog is not active and the device is not running, we
return false from watchdog_need_worker just as before.

That leaves the watchdog_hw_running(wdd) && !watchdog_active(wdd) &&
wdd->timeout case. Again, it's easy to see that if
wdd->max_hw_heartbeat_ms is non-zero, we return true from
watchdog_need_worker with and without this patch, and the logic in
watchdog_next_keepalive is unchanged. Finally, if
wdd->max_hw_heartbeat_ms is 0, we used to end up in the
cancel_delayed_work branch, whereas with this patch we end up
scheduling a ping timeout_ms/2 from now.

(*) This should imply that no current kernel drivers are affected,
since the only drivers which explicitly set WDOG_HW_RUNNING are
imx2_wdt.c and dw_wdt.c, both of which also provide a non-zero value
for max_hw_heartbeat_ms. The watchdog core also sets WDOG_HW_RUNNING,
but only when the driver doesn't provide ->stop, in which case it
must, according to Documentation/watchdog/watchdog-kernel-api.txt, set
max_hw_heartbeat_ms.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 drivers/watchdog/watchdog_dev.c | 10 +++++++---
 1 file changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 3595cff..14f8a92 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -92,9 +92,13 @@ static inline bool watchdog_need_worker(struct watchdog_device *wdd)
 	 *   thus is aware that the framework supports generating heartbeat
 	 *   requests.
 	 * - Userspace requests a longer timeout than the hardware can handle.
+	 *
+	 * Alternatively, if userspace has not opened the watchdog
+	 * device, we take care of feeding the watchdog if it is
+	 * running.
 	 */
-	return hm && ((watchdog_active(wdd) && t > hm) ||
-		      (t && !watchdog_active(wdd) && watchdog_hw_running(wdd)));
+	return (hm && watchdog_active(wdd) && t > hm) ||
+		(t && !watchdog_active(wdd) && watchdog_hw_running(wdd));
 }
 
 static long watchdog_next_keepalive(struct watchdog_device *wdd)
@@ -107,7 +111,7 @@ static long watchdog_next_keepalive(struct watchdog_device *wdd)
 	unsigned int hw_heartbeat_ms;
 
 	virt_timeout = wd_data->last_keepalive + msecs_to_jiffies(timeout_ms);
-	hw_heartbeat_ms = min(timeout_ms, wdd->max_hw_heartbeat_ms);
+	hw_heartbeat_ms = min_not_zero(timeout_ms, wdd->max_hw_heartbeat_ms);
 	keepalive_interval = msecs_to_jiffies(hw_heartbeat_ms / 2);
 
 	if (!watchdog_active(wdd))
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [RFC 2/3] watchdog: introduce watchdog_worker_should_ping helper
  2016-07-14  9:16 [RFC 0/3] watchdog: introduce open deadline Rasmus Villemoes
  2016-07-14  9:16 ` [RFC 1/3] watchdog: change watchdog_need_worker logic Rasmus Villemoes
@ 2016-07-14  9:16 ` Rasmus Villemoes
  2016-07-14  9:16 ` [RFC 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE Rasmus Villemoes
  2016-12-12  9:17 ` [PATCH v2 0/2] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
  3 siblings, 0 replies; 28+ messages in thread
From: Rasmus Villemoes @ 2016-07-14  9:16 UTC (permalink / raw)
  To: Wim Van Sebroeck, Guenter Roeck
  Cc: Rasmus Villemoes, linux-watchdog, linux-kernel

This will be useful when the condition becomes slightly more
complicated in the next patch.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 drivers/watchdog/watchdog_dev.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 14f8a92..da549e5 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -188,18 +188,23 @@ static int watchdog_ping(struct watchdog_device *wdd)
 	return __watchdog_ping(wdd);
 }
 
+static bool watchdog_worker_should_ping(struct watchdog_core_data *wd_data)
+{
+	struct watchdog_device *wdd = wd_data->wdd;
+
+	return wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd));
+}
+
 static void watchdog_ping_work(struct work_struct *work)
 {
 	struct watchdog_core_data *wd_data;
-	struct watchdog_device *wdd;
 
 	wd_data = container_of(to_delayed_work(work), struct watchdog_core_data,
 			       work);
 
 	mutex_lock(&wd_data->lock);
-	wdd = wd_data->wdd;
-	if (wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd)))
-		__watchdog_ping(wdd);
+	if (watchdog_worker_should_ping(wd_data))
+		__watchdog_ping(wd_data->wdd);
 	mutex_unlock(&wd_data->lock);
 }
 
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [RFC 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE
  2016-07-14  9:16 [RFC 0/3] watchdog: introduce open deadline Rasmus Villemoes
  2016-07-14  9:16 ` [RFC 1/3] watchdog: change watchdog_need_worker logic Rasmus Villemoes
  2016-07-14  9:16 ` [RFC 2/3] watchdog: introduce watchdog_worker_should_ping helper Rasmus Villemoes
@ 2016-07-14  9:16 ` Rasmus Villemoes
  2016-07-14 14:42   ` Guenter Roeck
  2016-12-12  9:17 ` [PATCH v2 0/2] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
  3 siblings, 1 reply; 28+ messages in thread
From: Rasmus Villemoes @ 2016-07-14  9:16 UTC (permalink / raw)
  To: Wim Van Sebroeck, Guenter Roeck
  Cc: Rasmus Villemoes, linux-watchdog, linux-kernel

The watchdog framework takes care of feeding a hardware watchdog until
userspace opens /dev/watchdogN. If that never happens for some reason
(buggy init script, corrupt root filesystem or whatnot) but the kernel
itself is fine, the machine stays up indefinitely. This patch allows
setting an upper limit for how long the kernel will take care of the
watchdog, thus ensuring that the watchdog will eventually reset the
machine.

This is particularly useful for embedded devices where some fallback
logic is implemented in the bootloader (e.g., use a different root
partition, boot from network, ...).

The open timeout is also used as a maximum time for an application to
re-open /dev/watchdogN after closing it.

Setting a timeout of 0 (either on the kernel command line or via
sysfs) makes the system behave as if WATCHDOG_OPEN_DEADLINE=n.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 drivers/watchdog/Kconfig        | 21 +++++++++++++++++++++
 drivers/watchdog/watchdog_dev.c | 39 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 59 insertions(+), 1 deletion(-)

diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index b4b3e25..2674ddf 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -53,6 +53,27 @@ config WATCHDOG_SYSFS
 	  Say Y here if you want to enable watchdog device status read through
 	  sysfs attributes.
 
+config WATCHDOG_OPEN_DEADLINE
+	bool "Allow deadline for opening watchdog device"
+	help
+	  If a watchdog driver indicates that to the framework that
+	  the hardware watchdog is running, the framework takes care
+	  of pinging the watchdog until userspace opens
+	  /dev/watchdogN. By selecting this option, you can set a
+	  maximum time for which the kernel will do this after the
+	  device has been registered.
+
+config WATCHDOG_OPEN_TIMEOUT
+	int "Timeout value for opening watchdog device"
+	depends on WATCHDOG_OPEN_DEADLINE
+	default 120000
+	help
+	  The maximum time, in milliseconds, for which the watchdog
+	  framework takes care of pinging a watchdog device. A value
+	  of 0 means infinite. The value set here can be overridden by
+	  the commandline parameter "watchdog.open_timeout" or through
+	  sysfs.
+
 #
 # General Watchdog drivers
 #
diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index da549e5..584c8a5 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -65,6 +65,9 @@ struct watchdog_core_data {
 	struct mutex lock;
 	unsigned long last_keepalive;
 	unsigned long last_hw_keepalive;
+#ifdef CONFIG_WATCHDOG_OPEN_DEADLINE
+	unsigned long open_deadline;
+#endif
 	struct delayed_work work;
 	unsigned long status;		/* Internal status bits */
 #define _WDOG_DEV_OPEN		0	/* Opened ? */
@@ -78,6 +81,32 @@ static struct watchdog_core_data *old_wd_data;
 
 static struct workqueue_struct *watchdog_wq;
 
+#ifdef CONFIG_WATCHDOG_OPEN_DEADLINE
+static unsigned open_timeout = CONFIG_WATCHDOG_OPEN_TIMEOUT;
+module_param(open_timeout, uint, 0644);
+
+static bool watchdog_past_open_deadline(struct watchdog_core_data *data)
+{
+	if (!open_timeout)
+		return false;
+	return time_is_before_jiffies(data->open_deadline);
+}
+
+static void watchdog_set_open_deadline(struct watchdog_core_data *data)
+{
+	data->open_deadline = jiffies + msecs_to_jiffies(open_timeout);
+}
+#else
+static bool watchdog_past_open_deadline(struct watchdog_core_data *data)
+{
+	return false;
+}
+
+static void watchdog_set_open_deadline(struct watchdog_core_data *data)
+{
+}
+#endif
+
 static inline bool watchdog_need_worker(struct watchdog_device *wdd)
 {
 	/* All variables in milli-seconds */
@@ -192,7 +221,13 @@ static bool watchdog_worker_should_ping(struct watchdog_core_data *wd_data)
 {
 	struct watchdog_device *wdd = wd_data->wdd;
 
-	return wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd));
+	if (!wdd)
+		return false;
+
+	if (watchdog_active(wdd))
+		return true;
+
+	return watchdog_hw_running(wdd) && !watchdog_past_open_deadline(wd_data);
 }
 
 static void watchdog_ping_work(struct work_struct *work)
@@ -745,6 +780,7 @@ static int watchdog_release(struct inode *inode, struct file *file)
 		watchdog_ping(wdd);
 	}
 
+	watchdog_set_open_deadline(wd_data);
 	watchdog_update_worker(wdd);
 
 	/* make sure that /dev/watchdog can be re-opened */
@@ -843,6 +879,7 @@ static int watchdog_cdev_register(struct watchdog_device *wdd, dev_t devno)
 
 	/* Record time of most recent heartbeat as 'just before now'. */
 	wd_data->last_hw_keepalive = jiffies - 1;
+	watchdog_set_open_deadline(wd_data);
 
 	/*
 	 * If the watchdog is running, prevent its driver from being unloaded,
-- 
2.5.0

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [RFC 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE
  2016-07-14  9:16 ` [RFC 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE Rasmus Villemoes
@ 2016-07-14 14:42   ` Guenter Roeck
  2016-07-15  7:32     ` Rasmus Villemoes
  0 siblings, 1 reply; 28+ messages in thread
From: Guenter Roeck @ 2016-07-14 14:42 UTC (permalink / raw)
  To: Rasmus Villemoes, Wim Van Sebroeck; +Cc: linux-watchdog, linux-kernel

On 07/14/2016 02:16 AM, Rasmus Villemoes wrote:
> The watchdog framework takes care of feeding a hardware watchdog until
> userspace opens /dev/watchdogN. If that never happens for some reason
> (buggy init script, corrupt root filesystem or whatnot) but the kernel
> itself is fine, the machine stays up indefinitely. This patch allows
> setting an upper limit for how long the kernel will take care of the
> watchdog, thus ensuring that the watchdog will eventually reset the
> machine.
>
> This is particularly useful for embedded devices where some fallback
> logic is implemented in the bootloader (e.g., use a different root
> partition, boot from network, ...).
>
> The open timeout is also used as a maximum time for an application to
> re-open /dev/watchdogN after closing it.
>
> Setting a timeout of 0 (either on the kernel command line or via
> sysfs) makes the system behave as if WATCHDOG_OPEN_DEADLINE=n.
>
> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> ---
>   drivers/watchdog/Kconfig        | 21 +++++++++++++++++++++
>   drivers/watchdog/watchdog_dev.c | 39 ++++++++++++++++++++++++++++++++++++++-
>   2 files changed, 59 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
> index b4b3e25..2674ddf 100644
> --- a/drivers/watchdog/Kconfig
> +++ b/drivers/watchdog/Kconfig
> @@ -53,6 +53,27 @@ config WATCHDOG_SYSFS
>   	  Say Y here if you want to enable watchdog device status read through
>   	  sysfs attributes.
>
> +config WATCHDOG_OPEN_DEADLINE
> +	bool "Allow deadline for opening watchdog device"
> +	help
> +	  If a watchdog driver indicates that to the framework that
> +	  the hardware watchdog is running, the framework takes care
> +	  of pinging the watchdog until userspace opens
> +	  /dev/watchdogN. By selecting this option, you can set a
> +	  maximum time for which the kernel will do this after the
> +	  device has been registered.
> +
> +config WATCHDOG_OPEN_TIMEOUT
> +	int "Timeout value for opening watchdog device"
> +	depends on WATCHDOG_OPEN_DEADLINE
> +	default 120000
> +	help
> +	  The maximum time, in milliseconds, for which the watchdog
> +	  framework takes care of pinging a watchdog device. A value
> +	  of 0 means infinite. The value set here can be overridden by
> +	  the commandline parameter "watchdog.open_timeout" or through
> +	  sysfs.
> +

I like the basic idea, and we always thought about implementing it,
though as "initial timeout" (I personally preferred that term).
However, implementing it as configuration option diminishes its
value substantially, since it means that using it in multi-platform
images (such as multi_v7_defconfig) becomes impossible.

The initial timeout should be specified as module option or as
devicetree parameter, and there should be no additional configuration
option.

Where does the module parameter in watchdog_dev.c end up ?

Thanks,
Guenter

>   #
>   # General Watchdog drivers
>   #
> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
> index da549e5..584c8a5 100644
> --- a/drivers/watchdog/watchdog_dev.c
> +++ b/drivers/watchdog/watchdog_dev.c
> @@ -65,6 +65,9 @@ struct watchdog_core_data {
>   	struct mutex lock;
>   	unsigned long last_keepalive;
>   	unsigned long last_hw_keepalive;
> +#ifdef CONFIG_WATCHDOG_OPEN_DEADLINE
> +	unsigned long open_deadline;
> +#endif
>   	struct delayed_work work;
>   	unsigned long status;		/* Internal status bits */
>   #define _WDOG_DEV_OPEN		0	/* Opened ? */
> @@ -78,6 +81,32 @@ static struct watchdog_core_data *old_wd_data;
>
>   static struct workqueue_struct *watchdog_wq;
>
> +#ifdef CONFIG_WATCHDOG_OPEN_DEADLINE
> +static unsigned open_timeout = CONFIG_WATCHDOG_OPEN_TIMEOUT;
> +module_param(open_timeout, uint, 0644);
> +
> +static bool watchdog_past_open_deadline(struct watchdog_core_data *data)
> +{
> +	if (!open_timeout)
> +		return false;
> +	return time_is_before_jiffies(data->open_deadline);
> +}
> +
> +static void watchdog_set_open_deadline(struct watchdog_core_data *data)
> +{
> +	data->open_deadline = jiffies + msecs_to_jiffies(open_timeout);
> +}
> +#else
> +static bool watchdog_past_open_deadline(struct watchdog_core_data *data)
> +{
> +	return false;
> +}
> +
> +static void watchdog_set_open_deadline(struct watchdog_core_data *data)
> +{
> +}
> +#endif
> +
>   static inline bool watchdog_need_worker(struct watchdog_device *wdd)
>   {
>   	/* All variables in milli-seconds */
> @@ -192,7 +221,13 @@ static bool watchdog_worker_should_ping(struct watchdog_core_data *wd_data)
>   {
>   	struct watchdog_device *wdd = wd_data->wdd;
>
> -	return wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd));
> +	if (!wdd)
> +		return false;
> +
> +	if (watchdog_active(wdd))
> +		return true;
> +
> +	return watchdog_hw_running(wdd) && !watchdog_past_open_deadline(wd_data);
>   }
>
>   static void watchdog_ping_work(struct work_struct *work)
> @@ -745,6 +780,7 @@ static int watchdog_release(struct inode *inode, struct file *file)
>   		watchdog_ping(wdd);
>   	}
>
> +	watchdog_set_open_deadline(wd_data);
>   	watchdog_update_worker(wdd);
>
>   	/* make sure that /dev/watchdog can be re-opened */
> @@ -843,6 +879,7 @@ static int watchdog_cdev_register(struct watchdog_device *wdd, dev_t devno)
>
>   	/* Record time of most recent heartbeat as 'just before now'. */
>   	wd_data->last_hw_keepalive = jiffies - 1;
> +	watchdog_set_open_deadline(wd_data);
>
>   	/*
>   	 * If the watchdog is running, prevent its driver from being unloaded,
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 1/3] watchdog: change watchdog_need_worker logic
  2016-07-14  9:16 ` [RFC 1/3] watchdog: change watchdog_need_worker logic Rasmus Villemoes
@ 2016-07-14 20:45   ` Guenter Roeck
  2016-07-17 19:24   ` Wim Van Sebroeck
  2016-07-17 20:33   ` Wim Van Sebroeck
  2 siblings, 0 replies; 28+ messages in thread
From: Guenter Roeck @ 2016-07-14 20:45 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: Wim Van Sebroeck, linux-watchdog, linux-kernel

On Thu, Jul 14, 2016 at 11:16:26AM +0200, Rasmus Villemoes wrote:
> If the driver indicates that the watchdog is running, the framework
> should feed it until userspace opens the device, regardless of whether
> the driver has set max_hw_heartbeat_ms.
> 
> This patch only affects the case where wdd->max_hw_heartbeat_ms is
> zero, wdd->timeout is non-zero, the watchdog is not active and the
> hardware device is running (*):
> 
> - If wdd->timeout is zero, watchdog_need_worker() returns false both
> before and after this patch, and watchdog_next_keepalive() is not
> called.
> 
> - If watchdog_active(wdd), the return value from watchdog_need_worker
> is also the same as before (namely, hm && t > hm). Hence in that case,
> watchdog_next_keepalive() is only called if hm == max_hw_heartbeat_ms
> is non-zero, so the change to min_not_zero there is a no-op.
> 
> - If the watchdog is not active and the device is not running, we
> return false from watchdog_need_worker just as before.
> 
> That leaves the watchdog_hw_running(wdd) && !watchdog_active(wdd) &&
> wdd->timeout case. Again, it's easy to see that if
> wdd->max_hw_heartbeat_ms is non-zero, we return true from
> watchdog_need_worker with and without this patch, and the logic in
> watchdog_next_keepalive is unchanged. Finally, if
> wdd->max_hw_heartbeat_ms is 0, we used to end up in the
> cancel_delayed_work branch, whereas with this patch we end up
> scheduling a ping timeout_ms/2 from now.
> 
> (*) This should imply that no current kernel drivers are affected,
> since the only drivers which explicitly set WDOG_HW_RUNNING are
> imx2_wdt.c and dw_wdt.c, both of which also provide a non-zero value
> for max_hw_heartbeat_ms. The watchdog core also sets WDOG_HW_RUNNING,
> but only when the driver doesn't provide ->stop, in which case it
> must, according to Documentation/watchdog/watchdog-kernel-api.txt, set
> max_hw_heartbeat_ms.
> 
> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>

LGTM.

Reviewed-by: Guenter Roeck <linux@roeck-us.net>

> ---
>  drivers/watchdog/watchdog_dev.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
> index 3595cff..14f8a92 100644
> --- a/drivers/watchdog/watchdog_dev.c
> +++ b/drivers/watchdog/watchdog_dev.c
> @@ -92,9 +92,13 @@ static inline bool watchdog_need_worker(struct watchdog_device *wdd)
>  	 *   thus is aware that the framework supports generating heartbeat
>  	 *   requests.
>  	 * - Userspace requests a longer timeout than the hardware can handle.
> +	 *
> +	 * Alternatively, if userspace has not opened the watchdog
> +	 * device, we take care of feeding the watchdog if it is
> +	 * running.
>  	 */
> -	return hm && ((watchdog_active(wdd) && t > hm) ||
> -		      (t && !watchdog_active(wdd) && watchdog_hw_running(wdd)));
> +	return (hm && watchdog_active(wdd) && t > hm) ||
> +		(t && !watchdog_active(wdd) && watchdog_hw_running(wdd));
>  }
>  
>  static long watchdog_next_keepalive(struct watchdog_device *wdd)
> @@ -107,7 +111,7 @@ static long watchdog_next_keepalive(struct watchdog_device *wdd)
>  	unsigned int hw_heartbeat_ms;
>  
>  	virt_timeout = wd_data->last_keepalive + msecs_to_jiffies(timeout_ms);
> -	hw_heartbeat_ms = min(timeout_ms, wdd->max_hw_heartbeat_ms);
> +	hw_heartbeat_ms = min_not_zero(timeout_ms, wdd->max_hw_heartbeat_ms);
>  	keepalive_interval = msecs_to_jiffies(hw_heartbeat_ms / 2);
>  
>  	if (!watchdog_active(wdd))
> -- 
> 2.5.0
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-watchdog" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE
  2016-07-14 14:42   ` Guenter Roeck
@ 2016-07-15  7:32     ` Rasmus Villemoes
  2016-07-15 14:29       ` Guenter Roeck
  0 siblings, 1 reply; 28+ messages in thread
From: Rasmus Villemoes @ 2016-07-15  7:32 UTC (permalink / raw)
  To: Guenter Roeck, Wim Van Sebroeck; +Cc: linux-watchdog, linux-kernel

On 2016-07-14 16:42, Guenter Roeck wrote:
> On 07/14/2016 02:16 AM, Rasmus Villemoes wrote:
>>
>> +config WATCHDOG_OPEN_DEADLINE
>> +    bool "Allow deadline for opening watchdog device"
>> +    help
>> +      If a watchdog driver indicates that to the framework that
>> +      the hardware watchdog is running, the framework takes care
>> +      of pinging the watchdog until userspace opens
>> +      /dev/watchdogN. By selecting this option, you can set a
>> +      maximum time for which the kernel will do this after the
>> +      device has been registered.
>> +
>> +config WATCHDOG_OPEN_TIMEOUT
>> +    int "Timeout value for opening watchdog device"
>> +    depends on WATCHDOG_OPEN_DEADLINE
>> +    default 120000
>> +    help
>> +      The maximum time, in milliseconds, for which the watchdog
>> +      framework takes care of pinging a watchdog device. A value
>> +      of 0 means infinite. The value set here can be overridden by
>> +      the commandline parameter "watchdog.open_timeout" or through
>> +      sysfs.
>> +
>
> I like the basic idea, and we always thought about implementing it,
> though as "initial timeout" (I personally preferred that term).

I also used WATCHDOG_INIT_TIMEOUT in my first few drafts, and my helper 
watchdog_set_open_deadline was called watchdog_set_init_timeout. But 
then I stumbled on watchdog_init_timeout in watchdog_core.c, and thought 
that might end up being quite confusing. I think having 'open' part of 
the name is quite natural, but I don't really have strong feelings about 
the naming of this thing.

> However, implementing it as configuration option diminishes its
> value substantially, since it means that using it in multi-platform
> images (such as multi_v7_defconfig) becomes impossible.

If one wants to allow this feature in an existing _defconfig, one can 
set OPEN_DEADLINE=y and OPEN_TIMEOUT=0; we could change the default for 
the latter to that. (I thought about just having that single config 
option with a default of 0, but it wasn't much more code to allow this 
thing to be compiled out completely.) Platforms without a running 
watchdog won't be affected, and for those with, having a non-zero 
deadline requires opt-in anyway.

> The initial timeout should be specified as module option or as
> devicetree parameter, and there should be no additional configuration
> option.

I was under the impression that device tree was exclusively for 
describing hardware, and this certainly is not that. I also wanted to 
avoid having to modify each driver, which would seem to be necessary if 
it was module parameter/DT - the only thing required of a driver now is 
that it correctly reports WDOG_HW_RUNNING.

Regardless of implementation, it's not something that any "distro" 
kernel is going to enable (with non-zero deadline), so to use it will 
require some customization - which could be a .config tweak, passing a 
command line parameter, a custom dtb, and probably other options. ISTM 
that the first two are the most generic and require the least repeated 
work across platforms.

> Where does the module parameter in watchdog_dev.c end up ?

# cat /sys/module/watchdog/parameters/open_timeout
120000

Rasmus

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE
  2016-07-15  7:32     ` Rasmus Villemoes
@ 2016-07-15 14:29       ` Guenter Roeck
  2016-07-20 22:08         ` Rasmus Villemoes
  0 siblings, 1 reply; 28+ messages in thread
From: Guenter Roeck @ 2016-07-15 14:29 UTC (permalink / raw)
  To: Rasmus Villemoes, Wim Van Sebroeck; +Cc: linux-watchdog, linux-kernel

On 07/15/2016 12:32 AM, Rasmus Villemoes wrote:
> On 2016-07-14 16:42, Guenter Roeck wrote:
>> On 07/14/2016 02:16 AM, Rasmus Villemoes wrote:
>>>
>>> +config WATCHDOG_OPEN_DEADLINE
>>> +    bool "Allow deadline for opening watchdog device"
>>> +    help
>>> +      If a watchdog driver indicates that to the framework that
>>> +      the hardware watchdog is running, the framework takes care
>>> +      of pinging the watchdog until userspace opens
>>> +      /dev/watchdogN. By selecting this option, you can set a
>>> +      maximum time for which the kernel will do this after the
>>> +      device has been registered.
>>> +
>>> +config WATCHDOG_OPEN_TIMEOUT
>>> +    int "Timeout value for opening watchdog device"
>>> +    depends on WATCHDOG_OPEN_DEADLINE
>>> +    default 120000
>>> +    help
>>> +      The maximum time, in milliseconds, for which the watchdog
>>> +      framework takes care of pinging a watchdog device. A value
>>> +      of 0 means infinite. The value set here can be overridden by
>>> +      the commandline parameter "watchdog.open_timeout" or through
>>> +      sysfs.
>>> +
>>
>> I like the basic idea, and we always thought about implementing it,
>> though as "initial timeout" (I personally preferred that term).
>
> I also used WATCHDOG_INIT_TIMEOUT in my first few drafts, and my helper watchdog_set_open_deadline was called watchdog_set_init_timeout. But then I stumbled on watchdog_init_timeout in watchdog_core.c, and thought that might end up being quite confusing. I think having 'open' part of the name is quite natural, but I don't really have strong feelings about the naming of this thing.
>
You could use "INITIAL".

>> However, implementing it as configuration option diminishes its
>> value substantially, since it means that using it in multi-platform
>> images (such as multi_v7_defconfig) becomes impossible.
>
> If one wants to allow this feature in an existing _defconfig, one can set OPEN_DEADLINE=y and OPEN_TIMEOUT=0; we could change the default for the latter to that. (I thought about just having that single config option with a default of 0, but it wasn't much more code to allow this thing to be compiled out completely.) Platforms without a running watchdog won't be affected, and for those with, having a non-zero deadline requires opt-in anyway.
>

Ok, off to Wim to decide then. I don't see the value of WATCHDOG_OPEN_DEADLINE,
and for sure don't like the WATCHDOG_OPEN_TIMEOUT parameter.

>> The initial timeout should be specified as module option or as
>> devicetree parameter, and there should be no additional configuration
>> option.
>
> I was under the impression that device tree was exclusively for describing hardware, and this certainly is not that. I also wanted to avoid having to modify each driver, which would seem to be necessary if it was module parameter/DT - the only thing required of a driver now is that it correctly reports WDOG_HW_RUNNING.

What is "hardware" ? It is supposed to describe the system, isn't it ? Part of that system is its clock rate,
and the means how the OS is loaded, and both have impact on the initial timeout (and the regular timeout).

You might as well argue that clock rates should not be in devicetree either. Clock rates are, after all,
just reflecting the _use_ of the hardware, not the hardware itself.

Devicetree could be handled in the core, with a function to set the initial timeout,
or possibly even with the watchdog registration itself.

>
> Regardless of implementation, it's not something that any "distro" kernel is going to enable (with non-zero deadline), so to use it will require some customization - which could be a .config tweak, passing a command line parameter, a custom dtb, and probably other options. ISTM that the first two are the most generic and require the least repeated work across platforms.
>

Each system running a specific kernel will require its own dtb, so I don't see your point there.

Guenter

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 1/3] watchdog: change watchdog_need_worker logic
  2016-07-14  9:16 ` [RFC 1/3] watchdog: change watchdog_need_worker logic Rasmus Villemoes
  2016-07-14 20:45   ` Guenter Roeck
@ 2016-07-17 19:24   ` Wim Van Sebroeck
  2016-07-17 19:49     ` Guenter Roeck
  2016-07-17 20:33   ` Wim Van Sebroeck
  2 siblings, 1 reply; 28+ messages in thread
From: Wim Van Sebroeck @ 2016-07-17 19:24 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: Guenter Roeck, linux-watchdog, linux-kernel

Hi Rasmus,

> If the driver indicates that the watchdog is running, the framework
> should feed it until userspace opens the device, regardless of whether
> the driver has set max_hw_heartbeat_ms.
> 
> This patch only affects the case where wdd->max_hw_heartbeat_ms is
> zero, wdd->timeout is non-zero, the watchdog is not active and the
> hardware device is running (*):
> 
> - If wdd->timeout is zero, watchdog_need_worker() returns false both
> before and after this patch, and watchdog_next_keepalive() is not
> called.
> 
> - If watchdog_active(wdd), the return value from watchdog_need_worker
> is also the same as before (namely, hm && t > hm). Hence in that case,
> watchdog_next_keepalive() is only called if hm == max_hw_heartbeat_ms
> is non-zero, so the change to min_not_zero there is a no-op.
> 
> - If the watchdog is not active and the device is not running, we
> return false from watchdog_need_worker just as before.
> 
> That leaves the watchdog_hw_running(wdd) && !watchdog_active(wdd) &&
> wdd->timeout case. Again, it's easy to see that if
> wdd->max_hw_heartbeat_ms is non-zero, we return true from
> watchdog_need_worker with and without this patch, and the logic in
> watchdog_next_keepalive is unchanged. Finally, if
> wdd->max_hw_heartbeat_ms is 0, we used to end up in the
> cancel_delayed_work branch, whereas with this patch we end up
> scheduling a ping timeout_ms/2 from now.
> 
> (*) This should imply that no current kernel drivers are affected,
> since the only drivers which explicitly set WDOG_HW_RUNNING are
> imx2_wdt.c and dw_wdt.c, both of which also provide a non-zero value
> for max_hw_heartbeat_ms. The watchdog core also sets WDOG_HW_RUNNING,
> but only when the driver doesn't provide ->stop, in which case it
> must, according to Documentation/watchdog/watchdog-kernel-api.txt, set
> max_hw_heartbeat_ms.

This isn't completely true. We will have the following in the linux-watchdog tree:
drivers/watchdog/aspeed_wdt.c:		set_bit(WDOG_HW_RUNNING, &wdt->wdd.status);
drivers/watchdog/dw_wdt.c:	set_bit(WDOG_HW_RUNNING, &wdd->status);
drivers/watchdog/dw_wdt.c:		set_bit(WDOG_HW_RUNNING, &wdd->status);
drivers/watchdog/imx2_wdt.c:	set_bit(WDOG_HW_RUNNING, &wdog->status);
drivers/watchdog/imx2_wdt.c:		set_bit(WDOG_HW_RUNNING, &wdog->status);
drivers/watchdog/max77620_wdt.c:		set_bit(WDOG_HW_RUNNING, &wdt_dev->status);
drivers/watchdog/sbsa_gwdt.c:		set_bit(WDOG_HW_RUNNING, &wdd->status);
drivers/watchdog/tangox_wdt.c:		set_bit(WDOG_HW_RUNNING, &dev->wdt.status);

I checked the ones that aren't mentioned and aspeed_wdt, max77620_wdt and sbsa_gwdt.c
also have a non-zero value for max_hw_heartbeat_ms. But tangox_wdt.c doesn't set it.
This one will need to be looked at closer.

> 
> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> ---
>  drivers/watchdog/watchdog_dev.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
> index 3595cff..14f8a92 100644
> --- a/drivers/watchdog/watchdog_dev.c
> +++ b/drivers/watchdog/watchdog_dev.c
> @@ -92,9 +92,13 @@ static inline bool watchdog_need_worker(struct watchdog_device *wdd)
>  	 *   thus is aware that the framework supports generating heartbeat
>  	 *   requests.
>  	 * - Userspace requests a longer timeout than the hardware can handle.
> +	 *
> +	 * Alternatively, if userspace has not opened the watchdog
> +	 * device, we take care of feeding the watchdog if it is
> +	 * running.
>  	 */
> -	return hm && ((watchdog_active(wdd) && t > hm) ||
> -		      (t && !watchdog_active(wdd) && watchdog_hw_running(wdd)));
> +	return (hm && watchdog_active(wdd) && t > hm) ||
> +		(t && !watchdog_active(wdd) && watchdog_hw_running(wdd));
>  }
>  
>  static long watchdog_next_keepalive(struct watchdog_device *wdd)
> @@ -107,7 +111,7 @@ static long watchdog_next_keepalive(struct watchdog_device *wdd)
>  	unsigned int hw_heartbeat_ms;
>  
>  	virt_timeout = wd_data->last_keepalive + msecs_to_jiffies(timeout_ms);
> -	hw_heartbeat_ms = min(timeout_ms, wdd->max_hw_heartbeat_ms);
> +	hw_heartbeat_ms = min_not_zero(timeout_ms, wdd->max_hw_heartbeat_ms);
>  	keepalive_interval = msecs_to_jiffies(hw_heartbeat_ms / 2);
>  
>  	if (!watchdog_active(wdd))
> -- 
> 2.5.0
> 

Kind regards,
Wim.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 1/3] watchdog: change watchdog_need_worker logic
  2016-07-17 19:24   ` Wim Van Sebroeck
@ 2016-07-17 19:49     ` Guenter Roeck
  2016-07-17 20:30       ` Wim Van Sebroeck
  0 siblings, 1 reply; 28+ messages in thread
From: Guenter Roeck @ 2016-07-17 19:49 UTC (permalink / raw)
  To: Wim Van Sebroeck, Rasmus Villemoes; +Cc: linux-watchdog, linux-kernel

On 07/17/2016 12:24 PM, Wim Van Sebroeck wrote:
> Hi Rasmus,
>
>> If the driver indicates that the watchdog is running, the framework
>> should feed it until userspace opens the device, regardless of whether
>> the driver has set max_hw_heartbeat_ms.
>>
>> This patch only affects the case where wdd->max_hw_heartbeat_ms is
>> zero, wdd->timeout is non-zero, the watchdog is not active and the
>> hardware device is running (*):
>>
>> - If wdd->timeout is zero, watchdog_need_worker() returns false both
>> before and after this patch, and watchdog_next_keepalive() is not
>> called.
>>
>> - If watchdog_active(wdd), the return value from watchdog_need_worker
>> is also the same as before (namely, hm && t > hm). Hence in that case,
>> watchdog_next_keepalive() is only called if hm == max_hw_heartbeat_ms
>> is non-zero, so the change to min_not_zero there is a no-op.
>>
>> - If the watchdog is not active and the device is not running, we
>> return false from watchdog_need_worker just as before.
>>
>> That leaves the watchdog_hw_running(wdd) && !watchdog_active(wdd) &&
>> wdd->timeout case. Again, it's easy to see that if
>> wdd->max_hw_heartbeat_ms is non-zero, we return true from
>> watchdog_need_worker with and without this patch, and the logic in
>> watchdog_next_keepalive is unchanged. Finally, if
>> wdd->max_hw_heartbeat_ms is 0, we used to end up in the
>> cancel_delayed_work branch, whereas with this patch we end up
>> scheduling a ping timeout_ms/2 from now.
>>
>> (*) This should imply that no current kernel drivers are affected,
>> since the only drivers which explicitly set WDOG_HW_RUNNING are
>> imx2_wdt.c and dw_wdt.c, both of which also provide a non-zero value
>> for max_hw_heartbeat_ms. The watchdog core also sets WDOG_HW_RUNNING,
>> but only when the driver doesn't provide ->stop, in which case it
>> must, according to Documentation/watchdog/watchdog-kernel-api.txt, set
>> max_hw_heartbeat_ms.
>
> This isn't completely true. We will have the following in the linux-watchdog tree:
> drivers/watchdog/aspeed_wdt.c:		set_bit(WDOG_HW_RUNNING, &wdt->wdd.status);
> drivers/watchdog/dw_wdt.c:	set_bit(WDOG_HW_RUNNING, &wdd->status);
> drivers/watchdog/dw_wdt.c:		set_bit(WDOG_HW_RUNNING, &wdd->status);
> drivers/watchdog/imx2_wdt.c:	set_bit(WDOG_HW_RUNNING, &wdog->status);
> drivers/watchdog/imx2_wdt.c:		set_bit(WDOG_HW_RUNNING, &wdog->status);
> drivers/watchdog/max77620_wdt.c:		set_bit(WDOG_HW_RUNNING, &wdt_dev->status);
> drivers/watchdog/sbsa_gwdt.c:		set_bit(WDOG_HW_RUNNING, &wdd->status);
> drivers/watchdog/tangox_wdt.c:		set_bit(WDOG_HW_RUNNING, &dev->wdt.status);
>
> I checked the ones that aren't mentioned and aspeed_wdt, max77620_wdt and sbsa_gwdt.c
> also have a non-zero value for max_hw_heartbeat_ms. But tangox_wdt.c doesn't set it.
> This one will need to be looked at closer.
>

I had a brief look; the tangox_wdt problem is my fault. I overlooked that with
my commit 'watchdog: tangox: Mark running watchdog correctly'.

We have a number of options: Set max_hw_heartbeat_ms in tangox_wdt.c,
accept this patch, or both. I think we should accept this patch.

Thanks,
Guenter

>>
>> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
>> ---
>>   drivers/watchdog/watchdog_dev.c | 10 +++++++---
>>   1 file changed, 7 insertions(+), 3 deletions(-)
>>
>> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
>> index 3595cff..14f8a92 100644
>> --- a/drivers/watchdog/watchdog_dev.c
>> +++ b/drivers/watchdog/watchdog_dev.c
>> @@ -92,9 +92,13 @@ static inline bool watchdog_need_worker(struct watchdog_device *wdd)
>>   	 *   thus is aware that the framework supports generating heartbeat
>>   	 *   requests.
>>   	 * - Userspace requests a longer timeout than the hardware can handle.
>> +	 *
>> +	 * Alternatively, if userspace has not opened the watchdog
>> +	 * device, we take care of feeding the watchdog if it is
>> +	 * running.
>>   	 */
>> -	return hm && ((watchdog_active(wdd) && t > hm) ||
>> -		      (t && !watchdog_active(wdd) && watchdog_hw_running(wdd)));
>> +	return (hm && watchdog_active(wdd) && t > hm) ||
>> +		(t && !watchdog_active(wdd) && watchdog_hw_running(wdd));
>>   }
>>
>>   static long watchdog_next_keepalive(struct watchdog_device *wdd)
>> @@ -107,7 +111,7 @@ static long watchdog_next_keepalive(struct watchdog_device *wdd)
>>   	unsigned int hw_heartbeat_ms;
>>
>>   	virt_timeout = wd_data->last_keepalive + msecs_to_jiffies(timeout_ms);
>> -	hw_heartbeat_ms = min(timeout_ms, wdd->max_hw_heartbeat_ms);
>> +	hw_heartbeat_ms = min_not_zero(timeout_ms, wdd->max_hw_heartbeat_ms);
>>   	keepalive_interval = msecs_to_jiffies(hw_heartbeat_ms / 2);
>>
>>   	if (!watchdog_active(wdd))
>> --
>> 2.5.0
>>
>
> Kind regards,
> Wim.
>
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 1/3] watchdog: change watchdog_need_worker logic
  2016-07-17 19:49     ` Guenter Roeck
@ 2016-07-17 20:30       ` Wim Van Sebroeck
  0 siblings, 0 replies; 28+ messages in thread
From: Wim Van Sebroeck @ 2016-07-17 20:30 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Rasmus Villemoes, linux-watchdog, linux-kernel

Hi Guenter,

> On 07/17/2016 12:24 PM, Wim Van Sebroeck wrote:
> >Hi Rasmus,
> >
> >>If the driver indicates that the watchdog is running, the framework
> >>should feed it until userspace opens the device, regardless of whether
> >>the driver has set max_hw_heartbeat_ms.
> >>
> >>This patch only affects the case where wdd->max_hw_heartbeat_ms is
> >>zero, wdd->timeout is non-zero, the watchdog is not active and the
> >>hardware device is running (*):
> >>
> >>- If wdd->timeout is zero, watchdog_need_worker() returns false both
> >>before and after this patch, and watchdog_next_keepalive() is not
> >>called.
> >>
> >>- If watchdog_active(wdd), the return value from watchdog_need_worker
> >>is also the same as before (namely, hm && t > hm). Hence in that case,
> >>watchdog_next_keepalive() is only called if hm == max_hw_heartbeat_ms
> >>is non-zero, so the change to min_not_zero there is a no-op.
> >>
> >>- If the watchdog is not active and the device is not running, we
> >>return false from watchdog_need_worker just as before.
> >>
> >>That leaves the watchdog_hw_running(wdd) && !watchdog_active(wdd) &&
> >>wdd->timeout case. Again, it's easy to see that if
> >>wdd->max_hw_heartbeat_ms is non-zero, we return true from
> >>watchdog_need_worker with and without this patch, and the logic in
> >>watchdog_next_keepalive is unchanged. Finally, if
> >>wdd->max_hw_heartbeat_ms is 0, we used to end up in the
> >>cancel_delayed_work branch, whereas with this patch we end up
> >>scheduling a ping timeout_ms/2 from now.
> >>
> >>(*) This should imply that no current kernel drivers are affected,
> >>since the only drivers which explicitly set WDOG_HW_RUNNING are
> >>imx2_wdt.c and dw_wdt.c, both of which also provide a non-zero value
> >>for max_hw_heartbeat_ms. The watchdog core also sets WDOG_HW_RUNNING,
> >>but only when the driver doesn't provide ->stop, in which case it
> >>must, according to Documentation/watchdog/watchdog-kernel-api.txt, set
> >>max_hw_heartbeat_ms.
> >
> >This isn't completely true. We will have the following in the 
> >linux-watchdog tree:
> >drivers/watchdog/aspeed_wdt.c:		set_bit(WDOG_HW_RUNNING, 
> >&wdt->wdd.status);
> >drivers/watchdog/dw_wdt.c:	set_bit(WDOG_HW_RUNNING, &wdd->status);
> >drivers/watchdog/dw_wdt.c:		set_bit(WDOG_HW_RUNNING, 
> >&wdd->status);
> >drivers/watchdog/imx2_wdt.c:	set_bit(WDOG_HW_RUNNING, &wdog->status);
> >drivers/watchdog/imx2_wdt.c:		set_bit(WDOG_HW_RUNNING, 
> >&wdog->status);
> >drivers/watchdog/max77620_wdt.c:		set_bit(WDOG_HW_RUNNING, 
> >&wdt_dev->status);
> >drivers/watchdog/sbsa_gwdt.c:		set_bit(WDOG_HW_RUNNING, 
> >&wdd->status);
> >drivers/watchdog/tangox_wdt.c:		set_bit(WDOG_HW_RUNNING, 
> >&dev->wdt.status);
> >
> >I checked the ones that aren't mentioned and aspeed_wdt, max77620_wdt and 
> >sbsa_gwdt.c
> >also have a non-zero value for max_hw_heartbeat_ms. But tangox_wdt.c 
> >doesn't set it.
> >This one will need to be looked at closer.
> >
> 
> I had a brief look; the tangox_wdt problem is my fault. I overlooked that 
> with
> my commit 'watchdog: tangox: Mark running watchdog correctly'.
> 
> We have a number of options: Set max_hw_heartbeat_ms in tangox_wdt.c,
> accept this patch, or both. I think we should accept this patch.

We accept this patch and add a fix for tangox_wdt.c .

> 
> Thanks,
> Guenter
> 
> >>
> >>Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> >>---
> >>  drivers/watchdog/watchdog_dev.c | 10 +++++++---
> >>  1 file changed, 7 insertions(+), 3 deletions(-)
> >>
> >>diff --git a/drivers/watchdog/watchdog_dev.c 
> >>b/drivers/watchdog/watchdog_dev.c
> >>index 3595cff..14f8a92 100644
> >>--- a/drivers/watchdog/watchdog_dev.c
> >>+++ b/drivers/watchdog/watchdog_dev.c
> >>@@ -92,9 +92,13 @@ static inline bool watchdog_need_worker(struct 
> >>watchdog_device *wdd)
> >>  	 *   thus is aware that the framework supports generating heartbeat
> >>  	 *   requests.
> >>  	 * - Userspace requests a longer timeout than the hardware can 
> >>  	 handle.
> >>+	 *
> >>+	 * Alternatively, if userspace has not opened the watchdog
> >>+	 * device, we take care of feeding the watchdog if it is
> >>+	 * running.
> >>  	 */
> >>-	return hm && ((watchdog_active(wdd) && t > hm) ||
> >>-		      (t && !watchdog_active(wdd) && 
> >>watchdog_hw_running(wdd)));
> >>+	return (hm && watchdog_active(wdd) && t > hm) ||
> >>+		(t && !watchdog_active(wdd) && watchdog_hw_running(wdd));
> >>  }
> >>
> >>  static long watchdog_next_keepalive(struct watchdog_device *wdd)
> >>@@ -107,7 +111,7 @@ static long watchdog_next_keepalive(struct 
> >>watchdog_device *wdd)
> >>  	unsigned int hw_heartbeat_ms;
> >>
> >>  	virt_timeout = wd_data->last_keepalive + 
> >>  	msecs_to_jiffies(timeout_ms);
> >>-	hw_heartbeat_ms = min(timeout_ms, wdd->max_hw_heartbeat_ms);
> >>+	hw_heartbeat_ms = min_not_zero(timeout_ms, wdd->max_hw_heartbeat_ms);
> >>  	keepalive_interval = msecs_to_jiffies(hw_heartbeat_ms / 2);
> >>
> >>  	if (!watchdog_active(wdd))
> >>--
> >>2.5.0
> >>
> >
> >Kind regards,
> >Wim.
> >
> >
> 

Kind regards,
Wim.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 1/3] watchdog: change watchdog_need_worker logic
  2016-07-14  9:16 ` [RFC 1/3] watchdog: change watchdog_need_worker logic Rasmus Villemoes
  2016-07-14 20:45   ` Guenter Roeck
  2016-07-17 19:24   ` Wim Van Sebroeck
@ 2016-07-17 20:33   ` Wim Van Sebroeck
  2 siblings, 0 replies; 28+ messages in thread
From: Wim Van Sebroeck @ 2016-07-17 20:33 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: Guenter Roeck, linux-watchdog, linux-kernel

Hi Rasmus,

> If the driver indicates that the watchdog is running, the framework
> should feed it until userspace opens the device, regardless of whether
> the driver has set max_hw_heartbeat_ms.
> 
> This patch only affects the case where wdd->max_hw_heartbeat_ms is
> zero, wdd->timeout is non-zero, the watchdog is not active and the
> hardware device is running (*):
> 
> - If wdd->timeout is zero, watchdog_need_worker() returns false both
> before and after this patch, and watchdog_next_keepalive() is not
> called.
> 
> - If watchdog_active(wdd), the return value from watchdog_need_worker
> is also the same as before (namely, hm && t > hm). Hence in that case,
> watchdog_next_keepalive() is only called if hm == max_hw_heartbeat_ms
> is non-zero, so the change to min_not_zero there is a no-op.
> 
> - If the watchdog is not active and the device is not running, we
> return false from watchdog_need_worker just as before.
> 
> That leaves the watchdog_hw_running(wdd) && !watchdog_active(wdd) &&
> wdd->timeout case. Again, it's easy to see that if
> wdd->max_hw_heartbeat_ms is non-zero, we return true from
> watchdog_need_worker with and without this patch, and the logic in
> watchdog_next_keepalive is unchanged. Finally, if
> wdd->max_hw_heartbeat_ms is 0, we used to end up in the
> cancel_delayed_work branch, whereas with this patch we end up
> scheduling a ping timeout_ms/2 from now.
> 
> (*) This should imply that no current kernel drivers are affected,
> since the only drivers which explicitly set WDOG_HW_RUNNING are
> imx2_wdt.c and dw_wdt.c, both of which also provide a non-zero value
> for max_hw_heartbeat_ms. The watchdog core also sets WDOG_HW_RUNNING,
> but only when the driver doesn't provide ->stop, in which case it
> must, according to Documentation/watchdog/watchdog-kernel-api.txt, set
> max_hw_heartbeat_ms.
> 
> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> ---
>  drivers/watchdog/watchdog_dev.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
> index 3595cff..14f8a92 100644
> --- a/drivers/watchdog/watchdog_dev.c
> +++ b/drivers/watchdog/watchdog_dev.c
> @@ -92,9 +92,13 @@ static inline bool watchdog_need_worker(struct watchdog_device *wdd)
>  	 *   thus is aware that the framework supports generating heartbeat
>  	 *   requests.
>  	 * - Userspace requests a longer timeout than the hardware can handle.
> +	 *
> +	 * Alternatively, if userspace has not opened the watchdog
> +	 * device, we take care of feeding the watchdog if it is
> +	 * running.
>  	 */
> -	return hm && ((watchdog_active(wdd) && t > hm) ||
> -		      (t && !watchdog_active(wdd) && watchdog_hw_running(wdd)));
> +	return (hm && watchdog_active(wdd) && t > hm) ||
> +		(t && !watchdog_active(wdd) && watchdog_hw_running(wdd));
>  }
>  
>  static long watchdog_next_keepalive(struct watchdog_device *wdd)
> @@ -107,7 +111,7 @@ static long watchdog_next_keepalive(struct watchdog_device *wdd)
>  	unsigned int hw_heartbeat_ms;
>  
>  	virt_timeout = wd_data->last_keepalive + msecs_to_jiffies(timeout_ms);
> -	hw_heartbeat_ms = min(timeout_ms, wdd->max_hw_heartbeat_ms);
> +	hw_heartbeat_ms = min_not_zero(timeout_ms, wdd->max_hw_heartbeat_ms);
>  	keepalive_interval = msecs_to_jiffies(hw_heartbeat_ms / 2);
>  
>  	if (!watchdog_active(wdd))
> -- 
> 2.5.0
> 

This patch has been added to linux-watchdog-next.

Kind regards,
Wim.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE
  2016-07-15 14:29       ` Guenter Roeck
@ 2016-07-20 22:08         ` Rasmus Villemoes
  2016-07-21  0:31           ` Guenter Roeck
  0 siblings, 1 reply; 28+ messages in thread
From: Rasmus Villemoes @ 2016-07-20 22:08 UTC (permalink / raw)
  To: Guenter Roeck, Wim Van Sebroeck; +Cc: linux-watchdog, linux-kernel

On 2016-07-15 16:29, Guenter Roeck wrote:
> On 07/15/2016 12:32 AM, Rasmus Villemoes wrote:
>
>>> The initial timeout should be specified as module option or as
>>> devicetree parameter, and there should be no additional configuration
>>> option.
>>
>> I was under the impression that device tree was exclusively for
>> describing hardware, and this certainly is not that. I also wanted to
>> avoid having to modify each driver, which would seem to be necessary
>> if it was module parameter/DT - the only thing required of a driver
>> now is that it correctly reports WDOG_HW_RUNNING.
>
> What is "hardware" ? It is supposed to describe the system, isn't it ?
> Part of that system is its clock rate,
> and the means how the OS is loaded, and both have impact on the initial
> timeout (and the regular timeout).
>
> You might as well argue that clock rates should not be in devicetree
> either. Clock rates are, after all,
> just reflecting the _use_ of the hardware, not the hardware itself.

But they are used to configure hardware. The init timeout is not a 
property of any particular device - it configures how the kernel 
behaves, and as such I find it quite natural to have it in the kernel's 
.config (and overridable on command line and via sysfs).

> Devicetree could be handled in the core, with a function to set the
> initial timeout,
> or possibly even with the watchdog registration itself.

But where in the device tree would you put this value? I'd really prefer 
not having to modify the node representing each individual watchdog 
device I might use.

Rasmus

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE
  2016-07-20 22:08         ` Rasmus Villemoes
@ 2016-07-21  0:31           ` Guenter Roeck
  2016-07-27 20:17             ` Rasmus Villemoes
  0 siblings, 1 reply; 28+ messages in thread
From: Guenter Roeck @ 2016-07-21  0:31 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: Wim Van Sebroeck, linux-watchdog, linux-kernel

On Thu, Jul 21, 2016 at 12:08:52AM +0200, Rasmus Villemoes wrote:
> On 2016-07-15 16:29, Guenter Roeck wrote:
> >On 07/15/2016 12:32 AM, Rasmus Villemoes wrote:
> >
> >>>The initial timeout should be specified as module option or as
> >>>devicetree parameter, and there should be no additional configuration
> >>>option.
> >>
> >>I was under the impression that device tree was exclusively for
> >>describing hardware, and this certainly is not that. I also wanted to
> >>avoid having to modify each driver, which would seem to be necessary
> >>if it was module parameter/DT - the only thing required of a driver
> >>now is that it correctly reports WDOG_HW_RUNNING.
> >
> >What is "hardware" ? It is supposed to describe the system, isn't it ?
> >Part of that system is its clock rate,
> >and the means how the OS is loaded, and both have impact on the initial
> >timeout (and the regular timeout).
> >
> >You might as well argue that clock rates should not be in devicetree
> >either. Clock rates are, after all,
> >just reflecting the _use_ of the hardware, not the hardware itself.
> 
> But they are used to configure hardware. The init timeout is not a property
> of any particular device - it configures how the kernel behaves, and as such
> I find it quite natural to have it in the kernel's .config (and overridable
> on command line and via sysfs).
> 

I hear you. "configure hardware" is a slippery term, though. After all,
one would typically configure the initial timeout in hardware, just as
any "normal" timeout. In many cases, this will actually already be the
case (and should be), since the watchdog should be enabled by the
ROMMON or BIOS before control is passed to the kernel. As such, the
initial timeout should already be set when the driver is loaded.

Also, I would want to be able to use the same kernel, and the same
root file system, on different machines without having to bother about
system variants, even more so if I was responsible for potentially dozens
of different variants with subtle differences in hardware. One ends
up having to maintain configuration files which happen to closely look like
devicetree files, plus an entire infrastructure to detect and configure
hardware variants. Such configuration files may be part of the root file
system, or have to be maintained separately. Either creates additional
overhead.

Unfortunately, those configuration files can only be read after the kernel
passed control to user space, which typically happens after the driver
to be controlled was already loaded. Using sysfs is therefore pretty much
useless - one might as well start the watchdog daemon as early as possible
after passing control to user space.

This leaves module parameters, which have to be passed on the kernel command
line to be useful. This is not really desirable either in most situations,
since now the system variant specific configuration has to be implemented
somewhere in the ROMMON, BIOS, or bootloader configuration file. Another level
of complexity added, since the per-variant boot parameters have to be managed
somewhere. Plus, as mentioned above, if the initial timeout has to be passed
as parameter to the kernel, the passing entity might as well program it
into the hardware directly, and start the watchdog, thereby fixing the gap
between hand-off to kernel and loading the watchdog driver.

So what is left ? Situations where the hardware does not tell software what
its configured timeout is, and situations where the maximum hardware timeout
is smaller than the required initial timeout. Both would, in my opinion, warrant
the use of a devicetree property, but I understand that others may have a
different opinion.

Overall, my conclusion is that if a devicetree property is not acceptable
for some reason, we should drop the notion of supporting an initial or
"open" timeout entirely, and leave it up to the BIOS/ROMMON as well as the
respective watchdog driver to set an acceptable default or initial timeout.
This initial timeout can (and will) be overwritten with the desired runtime
timeout by the watchdog daemon after it opened the watchdog device.

> >Devicetree could be handled in the core, with a function to set the
> >initial timeout,
> >or possibly even with the watchdog registration itself.
> 
> But where in the device tree would you put this value? I'd really prefer not
> having to modify the node representing each individual watchdog device I
> might use.
> 
The existing "timeout-sec" property is in the watchdog node as well.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE
  2016-07-21  0:31           ` Guenter Roeck
@ 2016-07-27 20:17             ` Rasmus Villemoes
  2016-07-31 22:17               ` Guenter Roeck
  0 siblings, 1 reply; 28+ messages in thread
From: Rasmus Villemoes @ 2016-07-27 20:17 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Wim Van Sebroeck, linux-watchdog, linux-kernel

On 2016-07-21 02:31, Guenter Roeck wrote:
> On Thu, Jul 21, 2016 at 12:08:52AM +0200, Rasmus Villemoes wrote:
>
> I hear you. "configure hardware" is a slippery term, though. After all,
> one would typically configure the initial timeout in hardware, just as
> any "normal" timeout. In many cases, this will actually already be the
> case (and should be), since the watchdog should be enabled by the
> ROMMON or BIOS before control is passed to the kernel. As such, the
> initial timeout should already be set when the driver is loaded.

So this suggests to me that you've misunderstood how I imagine the 
"open-deadline" (let me keep that name for now) to work. I do certainly 
expect the bootloader to have configured and started the watchdog device 
before control is handed to the kernel, preferably with a rather large 
timeout, so that the watchdog doesn't reset the board before the kernel 
gets around to loading the appropriate driver and detecting that there's 
a dog to be fed. In fact, if the watchdog isn't running when the kernel 
starts, my patch does nothing.

> Overall, my conclusion is that if a devicetree property is not acceptable
> for some reason, we should drop the notion of supporting an initial or
> "open" timeout entirely, and leave it up to the BIOS/ROMMON as well as the
> respective watchdog driver to set an acceptable default or initial timeout.
> This initial timeout can (and will) be overwritten with the desired runtime
> timeout by the watchdog daemon after it opened the watchdog device.
>
>>> Devicetree could be handled in the core, with a function to set the
>>> initial timeout,
>>> or possibly even with the watchdog registration itself.
>>
>> But where in the device tree would you put this value? I'd really prefer not
>> having to modify the node representing each individual watchdog device I
>> might use.
>>
> The existing "timeout-sec" property is in the watchdog node as well.

Yes, but my "timeout" is not used in any way for configuring the 
watchdog device nor the driver for it. Nor does an appropriate value 
depend on which watchdog hardware one has.

I'm interested in, to borrow your words, "fixing the gap" between 
handing over control from the kernel to user-space, by making sure that 
the kernel only feeds the watchdog for a finite amount of time - if 
userspace hasn't come up and opened the watchdog device by then, the 
board reboots. I set a rather generous default of two minutes; it could 
be (and may in some cases need to be) more, but the important thing is 
that the kernel doesn't feed the watchdog indefinitely.

Rasmus

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE
  2016-07-27 20:17             ` Rasmus Villemoes
@ 2016-07-31 22:17               ` Guenter Roeck
  2016-08-01 11:10                 ` Wim Van Sebroeck
  0 siblings, 1 reply; 28+ messages in thread
From: Guenter Roeck @ 2016-07-31 22:17 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: Wim Van Sebroeck, linux-watchdog, linux-kernel

On 07/27/2016 01:17 PM, Rasmus Villemoes wrote:
> On 2016-07-21 02:31, Guenter Roeck wrote:
>> On Thu, Jul 21, 2016 at 12:08:52AM +0200, Rasmus Villemoes wrote:
>>
>> I hear you. "configure hardware" is a slippery term, though. After all,
>> one would typically configure the initial timeout in hardware, just as
>> any "normal" timeout. In many cases, this will actually already be the
>> case (and should be), since the watchdog should be enabled by the
>> ROMMON or BIOS before control is passed to the kernel. As such, the
>> initial timeout should already be set when the driver is loaded.
>
> So this suggests to me that you've misunderstood how I imagine the "open-deadline" (let me keep that name for now) to work. I do certainly expect the bootloader to have configured and started the watchdog device before control is handed to the kernel, preferably with a rather large timeout, so that the watchdog doesn't reset the board before the kernel gets around to loading the appropriate driver and detecting that there's a dog to be fed. In fact, if the watchdog isn't running when the kernel starts, my patch does nothing.
>

No, I did not misunderstand it. We just have a different perspective how to fix it.

>> Overall, my conclusion is that if a devicetree property is not acceptable
>> for some reason, we should drop the notion of supporting an initial or
>> "open" timeout entirely, and leave it up to the BIOS/ROMMON as well as the
>> respective watchdog driver to set an acceptable default or initial timeout.
>> This initial timeout can (and will) be overwritten with the desired runtime
>> timeout by the watchdog daemon after it opened the watchdog device.
>>
>>>> Devicetree could be handled in the core, with a function to set the
>>>> initial timeout,
>>>> or possibly even with the watchdog registration itself.
>>>
>>> But where in the device tree would you put this value? I'd really prefer not
>>> having to modify the node representing each individual watchdog device I
>>> might use.
>>>
>> The existing "timeout-sec" property is in the watchdog node as well.
>
> Yes, but my "timeout" is not used in any way for configuring the watchdog device nor the driver for it. Nor does an appropriate value depend on which watchdog hardware one has.
>
> I'm interested in, to borrow your words, "fixing the gap" between handing over control from the kernel to user-space, by making sure that the kernel only feeds the watchdog for a finite amount of time - if userspace hasn't come up and opened the watchdog device by then, the board reboots. I set a rather generous default of two minutes; it could be (and may in some cases need to be) more, but the important thing is that the kernel doesn't feed the watchdog indefinitely.
>

I don't think we are making progress. You are absolutely opposed to introducing a
devicetree property, and I am absolutely opposed to introducing a configuration option.
The only alternatives I can see would be the old fashioned way of introducing platform
data, and/or to live with a boot parameter. Or we can wait for Wim to chime in
and share his opinion.

Guenter

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE
  2016-07-31 22:17               ` Guenter Roeck
@ 2016-08-01 11:10                 ` Wim Van Sebroeck
  0 siblings, 0 replies; 28+ messages in thread
From: Wim Van Sebroeck @ 2016-08-01 11:10 UTC (permalink / raw)
  To: Guenter Roeck; +Cc: Rasmus Villemoes, linux-watchdog, linux-kernel

Hi All,

> On 07/27/2016 01:17 PM, Rasmus Villemoes wrote:
> >On 2016-07-21 02:31, Guenter Roeck wrote:
> >>On Thu, Jul 21, 2016 at 12:08:52AM +0200, Rasmus Villemoes wrote:
> >>
> >>I hear you. "configure hardware" is a slippery term, though. After all,
> >>one would typically configure the initial timeout in hardware, just as
> >>any "normal" timeout. In many cases, this will actually already be the
> >>case (and should be), since the watchdog should be enabled by the
> >>ROMMON or BIOS before control is passed to the kernel. As such, the
> >>initial timeout should already be set when the driver is loaded.
> >
> >So this suggests to me that you've misunderstood how I imagine the 
> >"open-deadline" (let me keep that name for now) to work. I do certainly 
> >expect the bootloader to have configured and started the watchdog device 
> >before control is handed to the kernel, preferably with a rather large 
> >timeout, so that the watchdog doesn't reset the board before the kernel 
> >gets around to loading the appropriate driver and detecting that there's a 
> >dog to be fed. In fact, if the watchdog isn't running when the kernel 
> >starts, my patch does nothing.
> >
> 
> No, I did not misunderstand it. We just have a different perspective how to 
> fix it.
> 
> >>Overall, my conclusion is that if a devicetree property is not acceptable
> >>for some reason, we should drop the notion of supporting an initial or
> >>"open" timeout entirely, and leave it up to the BIOS/ROMMON as well as the
> >>respective watchdog driver to set an acceptable default or initial 
> >>timeout.
> >>This initial timeout can (and will) be overwritten with the desired 
> >>runtime
> >>timeout by the watchdog daemon after it opened the watchdog device.
> >>
> >>>>Devicetree could be handled in the core, with a function to set the
> >>>>initial timeout,
> >>>>or possibly even with the watchdog registration itself.
> >>>
> >>>But where in the device tree would you put this value? I'd really prefer 
> >>>not
> >>>having to modify the node representing each individual watchdog device I
> >>>might use.
> >>>
> >>The existing "timeout-sec" property is in the watchdog node as well.
> >
> >Yes, but my "timeout" is not used in any way for configuring the watchdog 
> >device nor the driver for it. Nor does an appropriate value depend on 
> >which watchdog hardware one has.
> >
> >I'm interested in, to borrow your words, "fixing the gap" between handing 
> >over control from the kernel to user-space, by making sure that the kernel 
> >only feeds the watchdog for a finite amount of time - if userspace hasn't 
> >come up and opened the watchdog device by then, the board reboots. I set a 
> >rather generous default of two minutes; it could be (and may in some cases 
> >need to be) more, but the important thing is that the kernel doesn't feed 
> >the watchdog indefinitely.
> >
> 
> I don't think we are making progress. You are absolutely opposed to 
> introducing a
> devicetree property, and I am absolutely opposed to introducing a 
> configuration option.
> The only alternatives I can see would be the old fashioned way of 
> introducing platform
> data, and/or to live with a boot parameter. Or we can wait for Wim to chime 
> in
> and share his opinion.

I would go for a device tree property. If that isn't there then we take a standard value which should imho not be bigger then 5 minutes.

Kind regards,
Wim.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 0/2] watchdog: allow setting deadline for opening /dev/watchdogN
  2016-07-14  9:16 [RFC 0/3] watchdog: introduce open deadline Rasmus Villemoes
                   ` (2 preceding siblings ...)
  2016-07-14  9:16 ` [RFC 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE Rasmus Villemoes
@ 2016-12-12  9:17 ` Rasmus Villemoes
  2016-12-12  9:17   ` [PATCH v2 1/2] watchdog: introduce watchdog_worker_should_ping helper Rasmus Villemoes
                     ` (2 more replies)
  3 siblings, 3 replies; 28+ messages in thread
From: Rasmus Villemoes @ 2016-12-12  9:17 UTC (permalink / raw)
  To: linux-watchdog, linux-kernel; +Cc: Rasmus Villemoes, Guenter Roeck

If a watchdog driver tells the framework that the device is running,
the framework takes care of feeding the watchdog until userspace opens
the device. If the userspace application which is supposed to do that
never comes up properly, the watchdog is fed indefinitely by the
kernel. This can be especially problematic for embedded devices.

These patches allow one to set a maximum time for which the kernel
will feed the watchdog, thus ensuring that either userspace has come
up, or the board gets reset. This allows fallback logic in the
bootloader to attempt some recovery (for example, if an automatic
update is in progress, it could roll back to the previous version).

The patches have been tested on a Raspberry Pi 2 (with a suitably
modified driver for setting WDOG_HW_RUNNING) and a Wandboard.

Changes since the initial RFC (https://lkml.org/lkml/2016/7/14/254):
take the timeout value from the device tree node rather than a
watchdog module parameter.

Rasmus Villemoes (2):
  watchdog: introduce watchdog_worker_should_ping helper
  watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE

 drivers/watchdog/Kconfig         | 19 +++++++++++++++++++
 drivers/watchdog/watchdog_core.c | 17 +++++++++++++++++
 drivers/watchdog/watchdog_dev.c  | 38 +++++++++++++++++++++++++++++++++++++-
 include/linux/watchdog.h         |  9 +++++++++
 4 files changed, 82 insertions(+), 1 deletion(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v2 1/2] watchdog: introduce watchdog_worker_should_ping helper
  2016-12-12  9:17 ` [PATCH v2 0/2] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
@ 2016-12-12  9:17   ` Rasmus Villemoes
  2016-12-12  9:17   ` [PATCH v2 2/2] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE Rasmus Villemoes
  2016-12-14 13:37   ` [PATCH v3 0/2] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
  2 siblings, 0 replies; 28+ messages in thread
From: Rasmus Villemoes @ 2016-12-12  9:17 UTC (permalink / raw)
  To: Wim Van Sebroeck, Guenter Roeck
  Cc: Rasmus Villemoes, linux-watchdog, linux-kernel

This will be useful when the condition becomes slightly more
complicated in the next patch.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 drivers/watchdog/watchdog_dev.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 32930a0..ca0a000 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -192,6 +192,11 @@ static int watchdog_ping(struct watchdog_device *wdd)
 	return __watchdog_ping(wdd);
 }
 
+static bool watchdog_worker_should_ping(struct watchdog_device *wdd)
+{
+	return wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd));
+}
+
 static void watchdog_ping_work(struct work_struct *work)
 {
 	struct watchdog_core_data *wd_data;
@@ -202,7 +207,7 @@ static void watchdog_ping_work(struct work_struct *work)
 
 	mutex_lock(&wd_data->lock);
 	wdd = wd_data->wdd;
-	if (wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd)))
+	if (watchdog_worker_should_ping(wdd))
 		__watchdog_ping(wdd);
 	mutex_unlock(&wd_data->lock);
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v2 2/2] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE
  2016-12-12  9:17 ` [PATCH v2 0/2] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
  2016-12-12  9:17   ` [PATCH v2 1/2] watchdog: introduce watchdog_worker_should_ping helper Rasmus Villemoes
@ 2016-12-12  9:17   ` Rasmus Villemoes
  2016-12-12 16:59     ` Guenter Roeck
  2016-12-14 13:37   ` [PATCH v3 0/2] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
  2 siblings, 1 reply; 28+ messages in thread
From: Rasmus Villemoes @ 2016-12-12  9:17 UTC (permalink / raw)
  To: Wim Van Sebroeck, Guenter Roeck
  Cc: Rasmus Villemoes, linux-watchdog, linux-kernel

The watchdog framework takes care of feeding a hardware watchdog until
userspace opens /dev/watchdogN. If that never happens for some reason
(buggy init script, corrupt root filesystem or whatnot) but the kernel
itself is fine, the machine stays up indefinitely. This patch allows
setting an upper limit for how long the kernel will take care of the
watchdog, thus ensuring that the watchdog will eventually reset the
machine if userspace fails to come up.

This is particularly useful for embedded devices where some fallback
logic is implemented in the bootloader (e.g., use a different root
partition, boot from network, ...).

The open timeout is also used as a maximum time for an application to
re-open /dev/watchdogN after closing it.

The open timeout is taken from the device tree
property "open-timeout", and if that is not present, defaults to
CONFIG_WATCHDOG_DEFAULT_OPEN_TIMEOUT (whose default value itself
is five minutes).

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 drivers/watchdog/Kconfig         | 19 +++++++++++++++++++
 drivers/watchdog/watchdog_core.c | 17 +++++++++++++++++
 drivers/watchdog/watchdog_dev.c  | 33 ++++++++++++++++++++++++++++++++-
 include/linux/watchdog.h         |  9 +++++++++
 4 files changed, 77 insertions(+), 1 deletion(-)

diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index 3eb58cb..908bb3f 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -52,6 +52,25 @@ config WATCHDOG_SYSFS
 	  Say Y here if you want to enable watchdog device status read through
 	  sysfs attributes.
 
+config WATCHDOG_OPEN_DEADLINE
+	bool "Allow deadline for opening watchdog device"
+	help
+	  If a watchdog driver indicates that to the framework that
+	  the hardware watchdog is running, the framework takes care
+	  of pinging the watchdog until userspace opens
+	  /dev/watchdogN. By selecting this option, the open-timeout
+	  device tree property is used as an upper bound for which the
+	  kernel does this - thus, if userspace hasn't opened the
+	  device within this time, the board reboots.
+
+config WATCHDOG_DEFAULT_OPEN_TIMEOUT
+	int "Default timeout value for opening watchdog device"
+	depends on WATCHDOG_OPEN_DEADLINE
+	default 300
+	help
+	  The default value used when the watchdog's device tree node
+	  does not have the "open-timeout" property.
+
 #
 # General Watchdog drivers
 #
diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
index 74265b2..31294b2 100644
--- a/drivers/watchdog/watchdog_core.c
+++ b/drivers/watchdog/watchdog_core.c
@@ -191,6 +191,21 @@ void watchdog_set_restart_priority(struct watchdog_device *wdd, int priority)
 }
 EXPORT_SYMBOL_GPL(watchdog_set_restart_priority);
 
+static void
+watchdog_set_open_timeout(struct watchdog_device *wdd)
+{
+#ifdef CONFIG_WATCHDOG_OPEN_DEADLINE
+	u32 t;
+	struct device *dev;
+
+	dev = wdd->parent;
+	if (dev && !of_property_read_u32(dev->of_node, "open-timeout", &t))
+		wdd->open_timeout = t;
+	else
+		wdd->open_timeout = CONFIG_WATCHDOG_DEFAULT_OPEN_TIMEOUT;
+#endif
+}
+
 static int __watchdog_register_device(struct watchdog_device *wdd)
 {
 	int ret, id = -1;
@@ -225,6 +240,8 @@ static int __watchdog_register_device(struct watchdog_device *wdd)
 		return id;
 	wdd->id = id;
 
+	watchdog_set_open_timeout(wdd);
+
 	ret = watchdog_dev_register(wdd);
 	if (ret) {
 		ida_simple_remove(&watchdog_ida, id);
diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index ca0a000..f725e0b 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -80,6 +80,29 @@ static struct watchdog_core_data *old_wd_data;
 
 static struct workqueue_struct *watchdog_wq;
 
+#ifdef CONFIG_WATCHDOG_OPEN_DEADLINE
+static bool watchdog_past_open_deadline(struct watchdog_device *wdd)
+{
+	if (!wdd->open_timeout)
+		return false;
+	return time_is_before_jiffies(wdd->open_deadline);
+}
+
+static void watchdog_set_open_deadline(struct watchdog_device *wdd)
+{
+	wdd->open_deadline = jiffies + msecs_to_jiffies(1000 * wdd->open_timeout);
+}
+#else
+static bool watchdog_past_open_deadline(struct watchdog_device *wdd)
+{
+	return false;
+}
+
+static void watchdog_set_open_deadline(struct watchdog_device *wdd)
+{
+}
+#endif
+
 static inline bool watchdog_need_worker(struct watchdog_device *wdd)
 {
 	/* All variables in milli-seconds */
@@ -194,7 +217,13 @@ static int watchdog_ping(struct watchdog_device *wdd)
 
 static bool watchdog_worker_should_ping(struct watchdog_device *wdd)
 {
-	return wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd));
+	if (!wdd)
+		return false;
+
+	if (watchdog_active(wdd))
+		return true;
+
+	return watchdog_hw_running(wdd) && !watchdog_past_open_deadline(wdd);
 }
 
 static void watchdog_ping_work(struct work_struct *work)
@@ -857,6 +886,7 @@ static int watchdog_release(struct inode *inode, struct file *file)
 		watchdog_ping(wdd);
 	}
 
+	watchdog_set_open_deadline(wdd);
 	watchdog_update_worker(wdd);
 
 	/* make sure that /dev/watchdog can be re-opened */
@@ -955,6 +985,7 @@ static int watchdog_cdev_register(struct watchdog_device *wdd, dev_t devno)
 
 	/* Record time of most recent heartbeat as 'just before now'. */
 	wd_data->last_hw_keepalive = jiffies - 1;
+	watchdog_set_open_deadline(wdd);
 
 	/*
 	 * If the watchdog is running, prevent its driver from being unloaded,
diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
index 35a4d81..c4e7ff8 100644
--- a/include/linux/watchdog.h
+++ b/include/linux/watchdog.h
@@ -76,6 +76,11 @@ struct watchdog_ops {
  * @max_hw_heartbeat_ms:
  *		Hardware limit for maximum timeout, in milli-seconds.
  *		Replaces max_timeout if specified.
+ * @open_timeout:
+ *              The maximum time for which the kernel will ping the
+ *              device after registration.
+ * @open_deadline:
+ *              Set to jiffies + @open_timeout at registration.
  * @reboot_nb:	The notifier block to stop watchdog on reboot.
  * @restart_nb:	The notifier block to register a restart function.
  * @driver_data:Pointer to the drivers private data.
@@ -107,6 +112,10 @@ struct watchdog_device {
 	unsigned int max_timeout;
 	unsigned int min_hw_heartbeat_ms;
 	unsigned int max_hw_heartbeat_ms;
+#ifdef CONFIG_WATCHDOG_OPEN_DEADLINE
+	unsigned long open_timeout;
+	unsigned long open_deadline;
+#endif
 	struct notifier_block reboot_nb;
 	struct notifier_block restart_nb;
 	void *driver_data;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v2 2/2] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE
  2016-12-12  9:17   ` [PATCH v2 2/2] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE Rasmus Villemoes
@ 2016-12-12 16:59     ` Guenter Roeck
  0 siblings, 0 replies; 28+ messages in thread
From: Guenter Roeck @ 2016-12-12 16:59 UTC (permalink / raw)
  To: Rasmus Villemoes; +Cc: Wim Van Sebroeck, linux-watchdog, linux-kernel

Hi,

On Mon, Dec 12, 2016 at 10:17:53AM +0100, Rasmus Villemoes wrote:
> The watchdog framework takes care of feeding a hardware watchdog until
> userspace opens /dev/watchdogN. If that never happens for some reason
> (buggy init script, corrupt root filesystem or whatnot) but the kernel
> itself is fine, the machine stays up indefinitely. This patch allows
> setting an upper limit for how long the kernel will take care of the
> watchdog, thus ensuring that the watchdog will eventually reset the
> machine if userspace fails to come up.
> 
> This is particularly useful for embedded devices where some fallback
> logic is implemented in the bootloader (e.g., use a different root
> partition, boot from network, ...).
> 
> The open timeout is also used as a maximum time for an application to
> re-open /dev/watchdogN after closing it.
> 
> The open timeout is taken from the device tree
> property "open-timeout", and if that is not present, defaults to
> CONFIG_WATCHDOG_DEFAULT_OPEN_TIMEOUT (whose default value itself
> is five minutes).
> 

You'll needd to submit the devicetree change as separate patch.

For this patch itself, I am in general not a friend of adding configuration
options, especially if there are other means to achieve the same goal and
the added code is minimal. This is true even more so in this case, where a
devicetree property could be present. The configuration option here really
means "ignore this devicetree property", and I don't think this would be
a good idea.

The timeout should be set through through a device property (ie use
device_property_read_u32() instead of of_property_read_u32()), to support
non-devicetree systems (we should probably make a similar change for the
timeout itself).

I don't think that a default should be set if the property is not present.
It should mean "no timeout", as with the current code.

For the configuration option, you'll have to convince Wim to accept it.

Thanks,
Guenter

> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> ---
>  drivers/watchdog/Kconfig         | 19 +++++++++++++++++++
>  drivers/watchdog/watchdog_core.c | 17 +++++++++++++++++
>  drivers/watchdog/watchdog_dev.c  | 33 ++++++++++++++++++++++++++++++++-
>  include/linux/watchdog.h         |  9 +++++++++
>  4 files changed, 77 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
> index 3eb58cb..908bb3f 100644
> --- a/drivers/watchdog/Kconfig
> +++ b/drivers/watchdog/Kconfig
> @@ -52,6 +52,25 @@ config WATCHDOG_SYSFS
>  	  Say Y here if you want to enable watchdog device status read through
>  	  sysfs attributes.
>  
> +config WATCHDOG_OPEN_DEADLINE
> +	bool "Allow deadline for opening watchdog device"
> +	help
> +	  If a watchdog driver indicates that to the framework that
> +	  the hardware watchdog is running, the framework takes care
> +	  of pinging the watchdog until userspace opens
> +	  /dev/watchdogN. By selecting this option, the open-timeout
> +	  device tree property is used as an upper bound for which the
> +	  kernel does this - thus, if userspace hasn't opened the
> +	  device within this time, the board reboots.
> +
> +config WATCHDOG_DEFAULT_OPEN_TIMEOUT
> +	int "Default timeout value for opening watchdog device"
> +	depends on WATCHDOG_OPEN_DEADLINE
> +	default 300
> +	help
> +	  The default value used when the watchdog's device tree node
> +	  does not have the "open-timeout" property.
> +
>  #
>  # General Watchdog drivers
>  #
> diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
> index 74265b2..31294b2 100644
> --- a/drivers/watchdog/watchdog_core.c
> +++ b/drivers/watchdog/watchdog_core.c
> @@ -191,6 +191,21 @@ void watchdog_set_restart_priority(struct watchdog_device *wdd, int priority)
>  }
>  EXPORT_SYMBOL_GPL(watchdog_set_restart_priority);
>  
> +static void
> +watchdog_set_open_timeout(struct watchdog_device *wdd)
> +{
> +#ifdef CONFIG_WATCHDOG_OPEN_DEADLINE
> +	u32 t;
> +	struct device *dev;
> +
> +	dev = wdd->parent;
> +	if (dev && !of_property_read_u32(dev->of_node, "open-timeout", &t))
> +		wdd->open_timeout = t;
> +	else
> +		wdd->open_timeout = CONFIG_WATCHDOG_DEFAULT_OPEN_TIMEOUT;
> +#endif
> +}
> +
>  static int __watchdog_register_device(struct watchdog_device *wdd)
>  {
>  	int ret, id = -1;
> @@ -225,6 +240,8 @@ static int __watchdog_register_device(struct watchdog_device *wdd)
>  		return id;
>  	wdd->id = id;
>  
> +	watchdog_set_open_timeout(wdd);
> +
>  	ret = watchdog_dev_register(wdd);
>  	if (ret) {
>  		ida_simple_remove(&watchdog_ida, id);
> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
> index ca0a000..f725e0b 100644
> --- a/drivers/watchdog/watchdog_dev.c
> +++ b/drivers/watchdog/watchdog_dev.c
> @@ -80,6 +80,29 @@ static struct watchdog_core_data *old_wd_data;
>  
>  static struct workqueue_struct *watchdog_wq;
>  
> +#ifdef CONFIG_WATCHDOG_OPEN_DEADLINE
> +static bool watchdog_past_open_deadline(struct watchdog_device *wdd)
> +{
> +	if (!wdd->open_timeout)
> +		return false;
> +	return time_is_before_jiffies(wdd->open_deadline);
> +}
> +
> +static void watchdog_set_open_deadline(struct watchdog_device *wdd)
> +{
> +	wdd->open_deadline = jiffies + msecs_to_jiffies(1000 * wdd->open_timeout);
> +}
> +#else
> +static bool watchdog_past_open_deadline(struct watchdog_device *wdd)
> +{
> +	return false;
> +}
> +
> +static void watchdog_set_open_deadline(struct watchdog_device *wdd)
> +{
> +}
> +#endif
> +
>  static inline bool watchdog_need_worker(struct watchdog_device *wdd)
>  {
>  	/* All variables in milli-seconds */
> @@ -194,7 +217,13 @@ static int watchdog_ping(struct watchdog_device *wdd)
>  
>  static bool watchdog_worker_should_ping(struct watchdog_device *wdd)
>  {
> -	return wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd));
> +	if (!wdd)
> +		return false;
> +
> +	if (watchdog_active(wdd))
> +		return true;
> +
> +	return watchdog_hw_running(wdd) && !watchdog_past_open_deadline(wdd);
>  }
>  
>  static void watchdog_ping_work(struct work_struct *work)
> @@ -857,6 +886,7 @@ static int watchdog_release(struct inode *inode, struct file *file)
>  		watchdog_ping(wdd);
>  	}
>  
> +	watchdog_set_open_deadline(wdd);
>  	watchdog_update_worker(wdd);
>  
>  	/* make sure that /dev/watchdog can be re-opened */
> @@ -955,6 +985,7 @@ static int watchdog_cdev_register(struct watchdog_device *wdd, dev_t devno)
>  
>  	/* Record time of most recent heartbeat as 'just before now'. */
>  	wd_data->last_hw_keepalive = jiffies - 1;
> +	watchdog_set_open_deadline(wdd);
>  
>  	/*
>  	 * If the watchdog is running, prevent its driver from being unloaded,
> diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
> index 35a4d81..c4e7ff8 100644
> --- a/include/linux/watchdog.h
> +++ b/include/linux/watchdog.h
> @@ -76,6 +76,11 @@ struct watchdog_ops {
>   * @max_hw_heartbeat_ms:
>   *		Hardware limit for maximum timeout, in milli-seconds.
>   *		Replaces max_timeout if specified.
> + * @open_timeout:
> + *              The maximum time for which the kernel will ping the
> + *              device after registration.
> + * @open_deadline:
> + *              Set to jiffies + @open_timeout at registration.
>   * @reboot_nb:	The notifier block to stop watchdog on reboot.
>   * @restart_nb:	The notifier block to register a restart function.
>   * @driver_data:Pointer to the drivers private data.
> @@ -107,6 +112,10 @@ struct watchdog_device {
>  	unsigned int max_timeout;
>  	unsigned int min_hw_heartbeat_ms;
>  	unsigned int max_hw_heartbeat_ms;
> +#ifdef CONFIG_WATCHDOG_OPEN_DEADLINE
> +	unsigned long open_timeout;
> +	unsigned long open_deadline;
> +#endif
>  	struct notifier_block reboot_nb;
>  	struct notifier_block restart_nb;
>  	void *driver_data;
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-watchdog" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v3 0/2] watchdog: allow setting deadline for opening /dev/watchdogN
  2016-12-12  9:17 ` [PATCH v2 0/2] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
  2016-12-12  9:17   ` [PATCH v2 1/2] watchdog: introduce watchdog_worker_should_ping helper Rasmus Villemoes
  2016-12-12  9:17   ` [PATCH v2 2/2] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE Rasmus Villemoes
@ 2016-12-14 13:37   ` Rasmus Villemoes
  2016-12-14 13:37     ` [PATCH v3 1/2] watchdog: introduce watchdog_worker_should_ping helper Rasmus Villemoes
                       ` (2 more replies)
  2 siblings, 3 replies; 28+ messages in thread
From: Rasmus Villemoes @ 2016-12-14 13:37 UTC (permalink / raw)
  To: linux-watchdog, linux-kernel; +Cc: Rasmus Villemoes, Guenter Roeck

If a watchdog driver tells the framework that the device is running,
the framework takes care of feeding the watchdog until userspace opens
the device. If the userspace application which is supposed to do that
never comes up properly, the watchdog is fed indefinitely by the
kernel. This can be especially problematic for embedded devices.

These patches allow one to set a maximum time for which the kernel
will feed the watchdog, thus ensuring that either userspace has come
up, or the board gets reset. This allows fallback logic in the
bootloader to attempt some recovery (for example, if an automatic
update is in progress, it could roll back to the previous version).

The patches have been tested on a Raspberry Pi 2 (with a suitably
modified driver for setting WDOG_HW_RUNNING) and a Wandboard.

Changes since v2:

 - Use device_property_read_u32 rather than of_property_read_u32

  - Remove config option to ifdef this completely out, instead set the
    default timeout to "infinite" to preserve existing behaviour.

Changes since v1:

 - Take the timeout value from the device tree node rather than a
   watchdog module parameter.


Rasmus Villemoes (2):
  watchdog: introduce watchdog_worker_should_ping helper
  watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT

 drivers/watchdog/Kconfig         | 14 ++++++++++++++
 drivers/watchdog/watchdog_core.c | 16 ++++++++++++++++
 drivers/watchdog/watchdog_dev.c  | 27 ++++++++++++++++++++++++++-
 include/linux/watchdog.h         |  7 +++++++
 4 files changed, 63 insertions(+), 1 deletion(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH v3 1/2] watchdog: introduce watchdog_worker_should_ping helper
  2016-12-14 13:37   ` [PATCH v3 0/2] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
@ 2016-12-14 13:37     ` Rasmus Villemoes
  2016-12-14 13:37     ` [PATCH v3 2/2] watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT Rasmus Villemoes
  2017-01-02  8:04     ` [PATCH v3 0/2] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
  2 siblings, 0 replies; 28+ messages in thread
From: Rasmus Villemoes @ 2016-12-14 13:37 UTC (permalink / raw)
  To: Wim Van Sebroeck, Guenter Roeck
  Cc: Rasmus Villemoes, linux-watchdog, linux-kernel

This will be useful when the condition becomes slightly more
complicated in the next patch.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 drivers/watchdog/watchdog_dev.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 32930a0..ca0a000 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -192,6 +192,11 @@ static int watchdog_ping(struct watchdog_device *wdd)
 	return __watchdog_ping(wdd);
 }
 
+static bool watchdog_worker_should_ping(struct watchdog_device *wdd)
+{
+	return wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd));
+}
+
 static void watchdog_ping_work(struct work_struct *work)
 {
 	struct watchdog_core_data *wd_data;
@@ -202,7 +207,7 @@ static void watchdog_ping_work(struct work_struct *work)
 
 	mutex_lock(&wd_data->lock);
 	wdd = wd_data->wdd;
-	if (wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd)))
+	if (watchdog_worker_should_ping(wdd))
 		__watchdog_ping(wdd);
 	mutex_unlock(&wd_data->lock);
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v3 2/2] watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT
  2016-12-14 13:37   ` [PATCH v3 0/2] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
  2016-12-14 13:37     ` [PATCH v3 1/2] watchdog: introduce watchdog_worker_should_ping helper Rasmus Villemoes
@ 2016-12-14 13:37     ` Rasmus Villemoes
  2017-01-02 15:22       ` Guenter Roeck
  2017-01-02  8:04     ` [PATCH v3 0/2] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
  2 siblings, 1 reply; 28+ messages in thread
From: Rasmus Villemoes @ 2016-12-14 13:37 UTC (permalink / raw)
  To: Wim Van Sebroeck, Guenter Roeck
  Cc: Rasmus Villemoes, linux-watchdog, linux-kernel

The watchdog framework takes care of feeding a hardware watchdog until
userspace opens /dev/watchdogN. If that never happens for some reason
(buggy init script, corrupt root filesystem or whatnot) but the kernel
itself is fine, the machine stays up indefinitely. This patch allows
setting an upper limit for how long the kernel will take care of the
watchdog, thus ensuring that the watchdog will eventually reset the
machine if userspace fails to come up.

This is particularly useful for embedded devices where some fallback
logic is implemented in the bootloader (e.g., use a different root
partition, boot from network, ...).

The open timeout is also used as a maximum time for an application to
re-open /dev/watchdogN after closing it.

The open timeout is taken from the device property "open-timeout", and
if that is not present, defaults to CONFIG_WATCHDOG_OPEN_TIMEOUT, which
in turn defaults to 0, which means there is no upper limit (as must be
the default, since there may be existing systems out there relying on
the kernel taking care of the watchdog forever).

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 drivers/watchdog/Kconfig         | 14 ++++++++++++++
 drivers/watchdog/watchdog_core.c | 16 ++++++++++++++++
 drivers/watchdog/watchdog_dev.c  | 22 +++++++++++++++++++++-
 include/linux/watchdog.h         |  7 +++++++
 4 files changed, 58 insertions(+), 1 deletion(-)

diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index 3eb58cb..890418c 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -52,6 +52,20 @@ config WATCHDOG_SYSFS
 	  Say Y here if you want to enable watchdog device status read through
 	  sysfs attributes.
 
+config WATCHDOG_OPEN_TIMEOUT
+	int "Default timeout value for opening watchdog device (seconds)"
+	default 0
+	help
+	  If a watchdog driver indicates to the framework that the
+	  hardware watchdog is running, the framework takes care of
+	  pinging the watchdog until userspace opens
+	  /dev/watchdogN. This value (overridden by the device's
+	  "open-timeout" property if present) sets an upper bound for
+	  how long the kernel does this - thus, if userspace hasn't
+	  opened the device within the timeout, the board reboots. A
+	  value of 0 means there is no timeout.
+
+
 #
 # General Watchdog drivers
 #
diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
index 74265b2..d968e0f 100644
--- a/drivers/watchdog/watchdog_core.c
+++ b/drivers/watchdog/watchdog_core.c
@@ -38,6 +38,7 @@
 #include <linux/idr.h>		/* For ida_* macros */
 #include <linux/err.h>		/* For IS_ERR macros */
 #include <linux/of.h>		/* For of_get_timeout_sec */
+#include <linux/property.h>	/* For device_property_read_u32 */
 
 #include "watchdog_core.h"	/* For watchdog_dev_register/... */
 
@@ -191,6 +192,19 @@ void watchdog_set_restart_priority(struct watchdog_device *wdd, int priority)
 }
 EXPORT_SYMBOL_GPL(watchdog_set_restart_priority);
 
+static void
+watchdog_set_open_timeout(struct watchdog_device *wdd)
+{
+	u32 t;
+	struct device *dev;
+
+	dev = wdd->parent;
+	if (dev && !device_property_read_u32(dev, "open-timeout", &t))
+		wdd->open_timeout = t;
+	else
+		wdd->open_timeout = CONFIG_WATCHDOG_OPEN_TIMEOUT;
+}
+
 static int __watchdog_register_device(struct watchdog_device *wdd)
 {
 	int ret, id = -1;
@@ -225,6 +239,8 @@ static int __watchdog_register_device(struct watchdog_device *wdd)
 		return id;
 	wdd->id = id;
 
+	watchdog_set_open_timeout(wdd);
+
 	ret = watchdog_dev_register(wdd);
 	if (ret) {
 		ida_simple_remove(&watchdog_ida, id);
diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index ca0a000..f153091 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -80,6 +80,18 @@ static struct watchdog_core_data *old_wd_data;
 
 static struct workqueue_struct *watchdog_wq;
 
+static bool watchdog_past_open_deadline(struct watchdog_device *wdd)
+{
+	if (!wdd->open_timeout)
+		return false;
+	return time_is_before_jiffies(wdd->open_deadline);
+}
+
+static void watchdog_set_open_deadline(struct watchdog_device *wdd)
+{
+	wdd->open_deadline = jiffies + msecs_to_jiffies(1000 * wdd->open_timeout);
+}
+
 static inline bool watchdog_need_worker(struct watchdog_device *wdd)
 {
 	/* All variables in milli-seconds */
@@ -194,7 +206,13 @@ static int watchdog_ping(struct watchdog_device *wdd)
 
 static bool watchdog_worker_should_ping(struct watchdog_device *wdd)
 {
-	return wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd));
+	if (!wdd)
+		return false;
+
+	if (watchdog_active(wdd))
+		return true;
+
+	return watchdog_hw_running(wdd) && !watchdog_past_open_deadline(wdd);
 }
 
 static void watchdog_ping_work(struct work_struct *work)
@@ -857,6 +875,7 @@ static int watchdog_release(struct inode *inode, struct file *file)
 		watchdog_ping(wdd);
 	}
 
+	watchdog_set_open_deadline(wdd);
 	watchdog_update_worker(wdd);
 
 	/* make sure that /dev/watchdog can be re-opened */
@@ -955,6 +974,7 @@ static int watchdog_cdev_register(struct watchdog_device *wdd, dev_t devno)
 
 	/* Record time of most recent heartbeat as 'just before now'. */
 	wd_data->last_hw_keepalive = jiffies - 1;
+	watchdog_set_open_deadline(wdd);
 
 	/*
 	 * If the watchdog is running, prevent its driver from being unloaded,
diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
index 35a4d81..a89a293 100644
--- a/include/linux/watchdog.h
+++ b/include/linux/watchdog.h
@@ -76,6 +76,11 @@ struct watchdog_ops {
  * @max_hw_heartbeat_ms:
  *		Hardware limit for maximum timeout, in milli-seconds.
  *		Replaces max_timeout if specified.
+ * @open_timeout:
+ *              The maximum time for which the kernel will ping the
+ *              device after registration.
+ * @open_deadline:
+ *              Set to jiffies + @open_timeout at registration.
  * @reboot_nb:	The notifier block to stop watchdog on reboot.
  * @restart_nb:	The notifier block to register a restart function.
  * @driver_data:Pointer to the drivers private data.
@@ -107,6 +112,8 @@ struct watchdog_device {
 	unsigned int max_timeout;
 	unsigned int min_hw_heartbeat_ms;
 	unsigned int max_hw_heartbeat_ms;
+	unsigned int open_timeout;
+	unsigned long open_deadline;
 	struct notifier_block reboot_nb;
 	struct notifier_block restart_nb;
 	void *driver_data;
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 0/2] watchdog: allow setting deadline for opening /dev/watchdogN
  2016-12-14 13:37   ` [PATCH v3 0/2] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
  2016-12-14 13:37     ` [PATCH v3 1/2] watchdog: introduce watchdog_worker_should_ping helper Rasmus Villemoes
  2016-12-14 13:37     ` [PATCH v3 2/2] watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT Rasmus Villemoes
@ 2017-01-02  8:04     ` Rasmus Villemoes
  2 siblings, 0 replies; 28+ messages in thread
From: Rasmus Villemoes @ 2017-01-02  8:04 UTC (permalink / raw)
  To: linux-watchdog, linux-kernel; +Cc: Guenter Roeck, Sylvain Lemieux

On 2016-12-14 14:37, Rasmus Villemoes wrote:

> If a watchdog driver tells the framework that the device is running,
> the framework takes care of feeding the watchdog until userspace opens
> the device. If the userspace application which is supposed to do that
> never comes up properly, the watchdog is fed indefinitely by the
> kernel. This can be especially problematic for embedded devices.
>
> These patches allow one to set a maximum time for which the kernel
> will feed the watchdog, thus ensuring that either userspace has come
> up, or the board gets reset.

Ping.

Rasmus

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/2] watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT
  2016-12-14 13:37     ` [PATCH v3 2/2] watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT Rasmus Villemoes
@ 2017-01-02 15:22       ` Guenter Roeck
  2017-01-03 15:52         ` Rasmus Villemoes
  0 siblings, 1 reply; 28+ messages in thread
From: Guenter Roeck @ 2017-01-02 15:22 UTC (permalink / raw)
  To: Rasmus Villemoes, Wim Van Sebroeck; +Cc: linux-watchdog, linux-kernel

On 12/14/2016 05:37 AM, Rasmus Villemoes wrote:
> The watchdog framework takes care of feeding a hardware watchdog until
> userspace opens /dev/watchdogN. If that never happens for some reason
> (buggy init script, corrupt root filesystem or whatnot) but the kernel
> itself is fine, the machine stays up indefinitely. This patch allows
> setting an upper limit for how long the kernel will take care of the
> watchdog, thus ensuring that the watchdog will eventually reset the
> machine if userspace fails to come up.
>
> This is particularly useful for embedded devices where some fallback
> logic is implemented in the bootloader (e.g., use a different root
> partition, boot from network, ...).
>
> The open timeout is also used as a maximum time for an application to
> re-open /dev/watchdogN after closing it.
>
> The open timeout is taken from the device property "open-timeout", and
> if that is not present, defaults to CONFIG_WATCHDOG_OPEN_TIMEOUT, which
> in turn defaults to 0, which means there is no upper limit (as must be
> the default, since there may be existing systems out there relying on
> the kernel taking care of the watchdog forever).
>
> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> ---
>  drivers/watchdog/Kconfig         | 14 ++++++++++++++
>  drivers/watchdog/watchdog_core.c | 16 ++++++++++++++++
>  drivers/watchdog/watchdog_dev.c  | 22 +++++++++++++++++++++-
>  include/linux/watchdog.h         |  7 +++++++
>  4 files changed, 58 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
> index 3eb58cb..890418c 100644
> --- a/drivers/watchdog/Kconfig
> +++ b/drivers/watchdog/Kconfig
> @@ -52,6 +52,20 @@ config WATCHDOG_SYSFS
>  	  Say Y here if you want to enable watchdog device status read through
>  	  sysfs attributes.
>
> +config WATCHDOG_OPEN_TIMEOUT
> +	int "Default timeout value for opening watchdog device (seconds)"
> +	default 0
> +	help
> +	  If a watchdog driver indicates to the framework that the
> +	  hardware watchdog is running, the framework takes care of
> +	  pinging the watchdog until userspace opens
> +	  /dev/watchdogN. This value (overridden by the device's
> +	  "open-timeout" property if present) sets an upper bound for
> +	  how long the kernel does this - thus, if userspace hasn't
> +	  opened the device within the timeout, the board reboots. A
> +	  value of 0 means there is no timeout.
> +
> +

Dual empty line. Also, as mentioned before, I am not a friend of such
configuration variables. It forces distribution vendors to make the decision
for everyone. Please consider defining a command line parameter such as
watchdog_open_timeout.

>  #
>  # General Watchdog drivers
>  #
> diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
> index 74265b2..d968e0f 100644
> --- a/drivers/watchdog/watchdog_core.c
> +++ b/drivers/watchdog/watchdog_core.c
> @@ -38,6 +38,7 @@
>  #include <linux/idr.h>		/* For ida_* macros */
>  #include <linux/err.h>		/* For IS_ERR macros */
>  #include <linux/of.h>		/* For of_get_timeout_sec */
> +#include <linux/property.h>	/* For device_property_read_u32 */
>
>  #include "watchdog_core.h"	/* For watchdog_dev_register/... */
>
> @@ -191,6 +192,19 @@ void watchdog_set_restart_priority(struct watchdog_device *wdd, int priority)
>  }
>  EXPORT_SYMBOL_GPL(watchdog_set_restart_priority);
>
> +static void
> +watchdog_set_open_timeout(struct watchdog_device *wdd)
> +{
> +	u32 t;
> +	struct device *dev;
> +
> +	dev = wdd->parent;
> +	if (dev && !device_property_read_u32(dev, "open-timeout", &t))
> +		wdd->open_timeout = t;

The check if dev == NULL is unnecessary.

New devicetree properties need to be documented and approved by devicetree maintainers.

> +	else
> +		wdd->open_timeout = CONFIG_WATCHDOG_OPEN_TIMEOUT;
> +}
> +
>  static int __watchdog_register_device(struct watchdog_device *wdd)
>  {
>  	int ret, id = -1;
> @@ -225,6 +239,8 @@ static int __watchdog_register_device(struct watchdog_device *wdd)
>  		return id;
>  	wdd->id = id;
>
> +	watchdog_set_open_timeout(wdd);
> +
Consider moving this call to watchdog_dev_register().

>  	ret = watchdog_dev_register(wdd);
>  	if (ret) {
>  		ida_simple_remove(&watchdog_ida, id);
> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
> index ca0a000..f153091 100644
> --- a/drivers/watchdog/watchdog_dev.c
> +++ b/drivers/watchdog/watchdog_dev.c
> @@ -80,6 +80,18 @@ static struct watchdog_core_data *old_wd_data;
>
>  static struct workqueue_struct *watchdog_wq;
>
> +static bool watchdog_past_open_deadline(struct watchdog_device *wdd)
> +{
> +	if (!wdd->open_timeout)
> +		return false;
> +	return time_is_before_jiffies(wdd->open_deadline);
> +}
> +
> +static void watchdog_set_open_deadline(struct watchdog_device *wdd)
> +{
> +	wdd->open_deadline = jiffies + msecs_to_jiffies(1000 * wdd->open_timeout);
> +}
> +
>  static inline bool watchdog_need_worker(struct watchdog_device *wdd)
>  {
>  	/* All variables in milli-seconds */
> @@ -194,7 +206,13 @@ static int watchdog_ping(struct watchdog_device *wdd)
>
>  static bool watchdog_worker_should_ping(struct watchdog_device *wdd)
>  {
> -	return wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd));
> +	if (!wdd)
> +		return false;
> +
> +	if (watchdog_active(wdd))
> +		return true;
> +
> +	return watchdog_hw_running(wdd) && !watchdog_past_open_deadline(wdd);
>  }
>
>  static void watchdog_ping_work(struct work_struct *work)
> @@ -857,6 +875,7 @@ static int watchdog_release(struct inode *inode, struct file *file)
>  		watchdog_ping(wdd);
>  	}
>
> +	watchdog_set_open_deadline(wdd);
>  	watchdog_update_worker(wdd);
>
>  	/* make sure that /dev/watchdog can be re-opened */
> @@ -955,6 +974,7 @@ static int watchdog_cdev_register(struct watchdog_device *wdd, dev_t devno)
>
>  	/* Record time of most recent heartbeat as 'just before now'. */
>  	wd_data->last_hw_keepalive = jiffies - 1;
> +	watchdog_set_open_deadline(wdd);
>
>  	/*
>  	 * If the watchdog is running, prevent its driver from being unloaded,
> diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
> index 35a4d81..a89a293 100644
> --- a/include/linux/watchdog.h
> +++ b/include/linux/watchdog.h
> @@ -76,6 +76,11 @@ struct watchdog_ops {
>   * @max_hw_heartbeat_ms:
>   *		Hardware limit for maximum timeout, in milli-seconds.
>   *		Replaces max_timeout if specified.
> + * @open_timeout:
> + *              The maximum time for which the kernel will ping the
> + *              device after registration.
> + * @open_deadline:
> + *              Set to jiffies + @open_timeout at registration.
>   * @reboot_nb:	The notifier block to stop watchdog on reboot.
>   * @restart_nb:	The notifier block to register a restart function.
>   * @driver_data:Pointer to the drivers private data.
> @@ -107,6 +112,8 @@ struct watchdog_device {
>  	unsigned int max_timeout;
>  	unsigned int min_hw_heartbeat_ms;
>  	unsigned int max_hw_heartbeat_ms;
> +	unsigned int open_timeout;
> +	unsigned long open_deadline;

None of those need to be visible in drivers. I don't see why the variables should be
defined here and not in struct watchdog_core_data.

>  	struct notifier_block reboot_nb;
>  	struct notifier_block restart_nb;
>  	void *driver_data;
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/2] watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT
  2017-01-02 15:22       ` Guenter Roeck
@ 2017-01-03 15:52         ` Rasmus Villemoes
  2017-01-03 18:59           ` Guenter Roeck
  0 siblings, 1 reply; 28+ messages in thread
From: Rasmus Villemoes @ 2017-01-03 15:52 UTC (permalink / raw)
  To: Guenter Roeck, Wim Van Sebroeck
  Cc: linux-watchdog, linux-kernel, Sylvain Lemieux

On 2017-01-02 16:22, Guenter Roeck wrote:
> On 12/14/2016 05:37 AM, Rasmus Villemoes wrote:
>> +config WATCHDOG_OPEN_TIMEOUT
>> +    int "Default timeout value for opening watchdog device (seconds)"
>> +    default 0
>> +    help
>> +      If a watchdog driver indicates to the framework that the
>> +      hardware watchdog is running, the framework takes care of
>> +      pinging the watchdog until userspace opens
>> +      /dev/watchdogN. This value (overridden by the device's
>> +      "open-timeout" property if present) sets an upper bound for
>> +      how long the kernel does this - thus, if userspace hasn't
>> +      opened the device within the timeout, the board reboots. A
>> +      value of 0 means there is no timeout.
>> +
>> +
>
> Dual empty line. Also, as mentioned before, I am not a friend of such
> configuration variables. It forces distribution vendors to make the
> decision
> for everyone.

Well, unless the distribution can guarantee that something opens the 
watchdog device in a timely manner, the only sensible decision is to 
keep the default infinite timeout, so I don't see how this is in any way 
a burden for general distributions.

> Please consider defining a command line parameter such as
> watchdog_open_timeout.

I'd be happy to, and that was exactly what I did in the original 
proposal, where it could also be tweaked after boot via sysfs. But I'd 
still like the default value of that command line parameter to be 
settable via Kconfig. That's a lot easier to maintain than having to do 
parts of the kernel configuration in u-boot or whatever bootloader one uses.

>> diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
>> index 35a4d81..a89a293 100644
>> --- a/include/linux/watchdog.h
>> +++ b/include/linux/watchdog.h
>> @@ -76,6 +76,11 @@ struct watchdog_ops {
>>   * @max_hw_heartbeat_ms:
>>   *        Hardware limit for maximum timeout, in milli-seconds.
>>   *        Replaces max_timeout if specified.
>> + * @open_timeout:
>> + *              The maximum time for which the kernel will ping the
>> + *              device after registration.
>> + * @open_deadline:
>> + *              Set to jiffies + @open_timeout at registration.
>>   * @reboot_nb:    The notifier block to stop watchdog on reboot.
>>   * @restart_nb:    The notifier block to register a restart function.
>>   * @driver_data:Pointer to the drivers private data.
>> @@ -107,6 +112,8 @@ struct watchdog_device {
>>      unsigned int max_timeout;
>>      unsigned int min_hw_heartbeat_ms;
>>      unsigned int max_hw_heartbeat_ms;
>> +    unsigned int open_timeout;
>> +    unsigned long open_deadline;
>
> None of those need to be visible in drivers. I don't see why the
> variables should be
> defined here and not in struct watchdog_core_data.

I completely agree, these values have nothing to do with individual 
drivers, but as I tried to convince you, they also, IMO, have nothing to 
do with individual devices.

So, since

 > New devicetree properties need to be documented and approved by
 > devicetree maintainers.

I'd like to start with implementing this _just_ as a command line 
parameter (if I can't convince you to accept the Kconfig part we can 
always maintain that trivial addition internally).  If anyone down the 
road feels strongly about the possibility of setting the deadline via a 
device property, it can be added later if they do that work.

How about that?

-- 
Rasmus Villemoes
Software Developer
Prevas A/S
Hedeager 1
DK-8200 Aarhus N
+45 51210274
rasmus.villemoes@prevas.dk
www.prevas.dk

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v3 2/2] watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT
  2017-01-03 15:52         ` Rasmus Villemoes
@ 2017-01-03 18:59           ` Guenter Roeck
  0 siblings, 0 replies; 28+ messages in thread
From: Guenter Roeck @ 2017-01-03 18:59 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Wim Van Sebroeck, linux-watchdog, linux-kernel, Sylvain Lemieux

On Tue, Jan 03, 2017 at 04:52:14PM +0100, Rasmus Villemoes wrote:
> On 2017-01-02 16:22, Guenter Roeck wrote:
> >On 12/14/2016 05:37 AM, Rasmus Villemoes wrote:
> >>+config WATCHDOG_OPEN_TIMEOUT
> >>+    int "Default timeout value for opening watchdog device (seconds)"
> >>+    default 0
> >>+    help
> >>+      If a watchdog driver indicates to the framework that the
> >>+      hardware watchdog is running, the framework takes care of
> >>+      pinging the watchdog until userspace opens
> >>+      /dev/watchdogN. This value (overridden by the device's
> >>+      "open-timeout" property if present) sets an upper bound for
> >>+      how long the kernel does this - thus, if userspace hasn't
> >>+      opened the device within the timeout, the board reboots. A
> >>+      value of 0 means there is no timeout.
> >>+
> >>+
> >
> >Dual empty line. Also, as mentioned before, I am not a friend of such
> >configuration variables. It forces distribution vendors to make the
> >decision
> >for everyone.
> 
> Well, unless the distribution can guarantee that something opens the
> watchdog device in a timely manner, the only sensible decision is to keep
> the default infinite timeout, so I don't see how this is in any way a burden
> for general distributions.
> 
> >Please consider defining a command line parameter such as
> >watchdog_open_timeout.
> 
> I'd be happy to, and that was exactly what I did in the original proposal,
> where it could also be tweaked after boot via sysfs. But I'd still like the
> default value of that command line parameter to be settable via Kconfig.
> That's a lot easier to maintain than having to do parts of the kernel
> configuration in u-boot or whatever bootloader one uses.
> 
A sysfs attribute is a bit different to a command line parameter.

> >>diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
> >>index 35a4d81..a89a293 100644
> >>--- a/include/linux/watchdog.h
> >>+++ b/include/linux/watchdog.h
> >>@@ -76,6 +76,11 @@ struct watchdog_ops {
> >>  * @max_hw_heartbeat_ms:
> >>  *        Hardware limit for maximum timeout, in milli-seconds.
> >>  *        Replaces max_timeout if specified.
> >>+ * @open_timeout:
> >>+ *              The maximum time for which the kernel will ping the
> >>+ *              device after registration.
> >>+ * @open_deadline:
> >>+ *              Set to jiffies + @open_timeout at registration.
> >>  * @reboot_nb:    The notifier block to stop watchdog on reboot.
> >>  * @restart_nb:    The notifier block to register a restart function.
> >>  * @driver_data:Pointer to the drivers private data.
> >>@@ -107,6 +112,8 @@ struct watchdog_device {
> >>     unsigned int max_timeout;
> >>     unsigned int min_hw_heartbeat_ms;
> >>     unsigned int max_hw_heartbeat_ms;
> >>+    unsigned int open_timeout;
> >>+    unsigned long open_deadline;
> >
> >None of those need to be visible in drivers. I don't see why the
> >variables should be
> >defined here and not in struct watchdog_core_data.
> 
> I completely agree, these values have nothing to do with individual drivers,
> but as I tried to convince you, they also, IMO, have nothing to do with
> individual devices.
> 

This is not about device vs. driver, it is about variable visibility.
The variables are only used in and by the watchdog core and thus only need
to be visible there. As for devices vs. system property, that really depends
on the means to set the variables. If there ever is a devicetree property,
it might well be per device (unless the property is, for example, attached
to the 'chosen' node), and might well be different for different watchdog
devices in the same system (just like two watchdogs in the same system can
have different timeouts).

Ultimately I would not mind to define the variables as static in the watchdog
core. What I object to is to have the variables in struct watchdog_device.

> So, since
> 
> > New devicetree properties need to be documented and approved by
> > devicetree maintainers.
> 
> I'd like to start with implementing this _just_ as a command line parameter
> (if I can't convince you to accept the Kconfig part we can always maintain
> that trivial addition internally).  If anyone down the road feels strongly
> about the possibility of setting the deadline via a device property, it can
> be added later if they do that work.
> 

Fine with me, though as I mentioned before, you'll have to convince Wim,
not me, to accept a Kconfig parameter.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2017-01-03 19:00 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-14  9:16 [RFC 0/3] watchdog: introduce open deadline Rasmus Villemoes
2016-07-14  9:16 ` [RFC 1/3] watchdog: change watchdog_need_worker logic Rasmus Villemoes
2016-07-14 20:45   ` Guenter Roeck
2016-07-17 19:24   ` Wim Van Sebroeck
2016-07-17 19:49     ` Guenter Roeck
2016-07-17 20:30       ` Wim Van Sebroeck
2016-07-17 20:33   ` Wim Van Sebroeck
2016-07-14  9:16 ` [RFC 2/3] watchdog: introduce watchdog_worker_should_ping helper Rasmus Villemoes
2016-07-14  9:16 ` [RFC 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE Rasmus Villemoes
2016-07-14 14:42   ` Guenter Roeck
2016-07-15  7:32     ` Rasmus Villemoes
2016-07-15 14:29       ` Guenter Roeck
2016-07-20 22:08         ` Rasmus Villemoes
2016-07-21  0:31           ` Guenter Roeck
2016-07-27 20:17             ` Rasmus Villemoes
2016-07-31 22:17               ` Guenter Roeck
2016-08-01 11:10                 ` Wim Van Sebroeck
2016-12-12  9:17 ` [PATCH v2 0/2] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
2016-12-12  9:17   ` [PATCH v2 1/2] watchdog: introduce watchdog_worker_should_ping helper Rasmus Villemoes
2016-12-12  9:17   ` [PATCH v2 2/2] watchdog: introduce CONFIG_WATCHDOG_OPEN_DEADLINE Rasmus Villemoes
2016-12-12 16:59     ` Guenter Roeck
2016-12-14 13:37   ` [PATCH v3 0/2] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
2016-12-14 13:37     ` [PATCH v3 1/2] watchdog: introduce watchdog_worker_should_ping helper Rasmus Villemoes
2016-12-14 13:37     ` [PATCH v3 2/2] watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT Rasmus Villemoes
2017-01-02 15:22       ` Guenter Roeck
2017-01-03 15:52         ` Rasmus Villemoes
2017-01-03 18:59           ` Guenter Roeck
2017-01-02  8:04     ` [PATCH v3 0/2] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.