All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/3] watchdog: allow setting deadline for opening /dev/watchdogN
@ 2017-01-09 15:02 Rasmus Villemoes
  2017-01-09 15:02 ` [PATCH v4 1/3] watchdog: introduce watchdog_worker_should_ping helper Rasmus Villemoes
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Rasmus Villemoes @ 2017-01-09 15:02 UTC (permalink / raw)
  To: linux-watchdog, linux-doc, linux-kernel
  Cc: Sylvain Lemieux, Rasmus Villemoes, Guenter Roeck

If a watchdog driver tells the framework that the device is running,
the framework takes care of feeding the watchdog until userspace opens
the device. If the userspace application which is supposed to do that
never comes up properly, the watchdog is fed indefinitely by the
kernel. This can be especially problematic for embedded devices.

These patches allow one to set a maximum time for which the kernel
will feed the watchdog, thus ensuring that either userspace has come
up, or the board gets reset. This allows fallback logic in the
bootloader to attempt some recovery (for example, if an automatic
update is in progress, it could roll back to the previous version).

The patches have been tested on a Raspberry Pi 2 and a Wandboard.

v4 is mostly identical to v1. The differences are that the ability to
compile out this feature is removed, and the ability to set the
default value for the watchdog.open_timeout command line parameter via
Kconfig is split into a separate patch.

Compared to v2/v3, this drops the ability to set the open_timeout via
a device property; I'll leave implementing that to those who actually
need it.

Rasmus Villemoes (3):
  watchdog: introduce watchdog_worker_should_ping helper
  watchdog: introduce watchdog.open_timeout commandline parameter
  watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT

 Documentation/watchdog/watchdog-parameters.txt | 10 +++++++
 drivers/watchdog/Kconfig                       |  9 +++++++
 drivers/watchdog/watchdog_dev.c                | 37 +++++++++++++++++++++++---
 3 files changed, 52 insertions(+), 4 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH v4 1/3] watchdog: introduce watchdog_worker_should_ping helper
  2017-01-09 15:02 [PATCH v4 0/3] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
@ 2017-01-09 15:02 ` Rasmus Villemoes
  2017-01-09 15:02 ` [PATCH v4 2/3] watchdog: introduce watchdog.open_timeout commandline parameter Rasmus Villemoes
  2017-01-09 15:02 ` [PATCH v4 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT Rasmus Villemoes
  2 siblings, 0 replies; 9+ messages in thread
From: Rasmus Villemoes @ 2017-01-09 15:02 UTC (permalink / raw)
  To: Wim Van Sebroeck, Guenter Roeck
  Cc: Sylvain Lemieux, Rasmus Villemoes, linux-watchdog, linux-kernel

This will be useful when the condition becomes slightly more
complicated in the next patch.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 drivers/watchdog/watchdog_dev.c | 13 +++++++++----
 1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 32930a0..21809e4 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -192,18 +192,23 @@ static int watchdog_ping(struct watchdog_device *wdd)
 	return __watchdog_ping(wdd);
 }
 
+static bool watchdog_worker_should_ping(struct watchdog_core_data *wd_data)
+{
+	struct watchdog_device *wdd = wd_data->wdd;
+
+	return wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd));
+}
+
 static void watchdog_ping_work(struct work_struct *work)
 {
 	struct watchdog_core_data *wd_data;
-	struct watchdog_device *wdd;
 
 	wd_data = container_of(to_delayed_work(work), struct watchdog_core_data,
 			       work);
 
 	mutex_lock(&wd_data->lock);
-	wdd = wd_data->wdd;
-	if (wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd)))
-		__watchdog_ping(wdd);
+	if (watchdog_worker_should_ping(wd_data))
+		__watchdog_ping(wd_data->wdd);
 	mutex_unlock(&wd_data->lock);
 }
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 2/3] watchdog: introduce watchdog.open_timeout commandline parameter
  2017-01-09 15:02 [PATCH v4 0/3] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
  2017-01-09 15:02 ` [PATCH v4 1/3] watchdog: introduce watchdog_worker_should_ping helper Rasmus Villemoes
@ 2017-01-09 15:02 ` Rasmus Villemoes
  2017-01-10 18:08   ` Guenter Roeck
  2017-01-09 15:02 ` [PATCH v4 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT Rasmus Villemoes
  2 siblings, 1 reply; 9+ messages in thread
From: Rasmus Villemoes @ 2017-01-09 15:02 UTC (permalink / raw)
  To: Wim Van Sebroeck, Guenter Roeck, Jonathan Corbet
  Cc: Sylvain Lemieux, Rasmus Villemoes, linux-watchdog, linux-doc,
	linux-kernel

The watchdog framework takes care of feeding a hardware watchdog until
userspace opens /dev/watchdogN. If that never happens for some reason
(buggy init script, corrupt root filesystem or whatnot) but the kernel
itself is fine, the machine stays up indefinitely. This patch allows
setting an upper limit for how long the kernel will take care of the
watchdog, thus ensuring that the watchdog will eventually reset the
machine.

This is particularly useful for embedded devices where some fallback
logic is implemented in the bootloader (e.g., use a different root
partition, boot from network, ...).

The open timeout is also used as a maximum time for an application to
re-open /dev/watchdogN after closing it.

A value of 0 (the default) means infinite timeout, preserving the
current behaviour.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 Documentation/watchdog/watchdog-parameters.txt |  9 +++++++++
 drivers/watchdog/watchdog_dev.c                | 26 +++++++++++++++++++++++++-
 2 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/Documentation/watchdog/watchdog-parameters.txt b/Documentation/watchdog/watchdog-parameters.txt
index e21850e..2ae0fdf 100644
--- a/Documentation/watchdog/watchdog-parameters.txt
+++ b/Documentation/watchdog/watchdog-parameters.txt
@@ -8,6 +8,15 @@ See Documentation/admin-guide/kernel-parameters.rst for information on
 providing kernel parameters for builtin drivers versus loadable
 modules.
 
+The watchdog core currently understands one parameter,
+watchdog.open_timeout. This is the maximum time, in milliseconds, for
+which the watchdog framework will take care of pinging a hardware
+watchdog until userspace opens the corresponding /dev/watchdogN
+device. A value of 0 (the default) means an infinite timeout. Setting
+this to a non-zero value can be useful to ensure that either userspace
+comes up properly, or the board gets reset and allows fallback logic
+in the bootloader to try something else.
+
 
 -------------------------------------------------
 acquirewdt:
diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 21809e4..8bc9f24 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -66,6 +66,7 @@ struct watchdog_core_data {
 	struct mutex lock;
 	unsigned long last_keepalive;
 	unsigned long last_hw_keepalive;
+	unsigned long open_deadline;
 	struct delayed_work work;
 	unsigned long status;		/* Internal status bits */
 #define _WDOG_DEV_OPEN		0	/* Opened ? */
@@ -80,6 +81,21 @@ static struct watchdog_core_data *old_wd_data;
 
 static struct workqueue_struct *watchdog_wq;
 
+static unsigned open_timeout;
+module_param(open_timeout, uint, 0644);
+
+static bool watchdog_past_open_deadline(struct watchdog_core_data *data)
+{
+	if (!open_timeout)
+		return false;
+	return time_is_before_jiffies(data->open_deadline);
+}
+
+static void watchdog_set_open_deadline(struct watchdog_core_data *data)
+{
+	data->open_deadline = jiffies + msecs_to_jiffies(open_timeout);
+}
+
 static inline bool watchdog_need_worker(struct watchdog_device *wdd)
 {
 	/* All variables in milli-seconds */
@@ -196,7 +212,13 @@ static bool watchdog_worker_should_ping(struct watchdog_core_data *wd_data)
 {
 	struct watchdog_device *wdd = wd_data->wdd;
 
-	return wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd));
+	if (!wdd)
+		return false;
+
+	if (watchdog_active(wdd))
+		return true;
+
+	return watchdog_hw_running(wdd) && !watchdog_past_open_deadline(wd_data);
 }
 
 static void watchdog_ping_work(struct work_struct *work)
@@ -857,6 +879,7 @@ static int watchdog_release(struct inode *inode, struct file *file)
 		watchdog_ping(wdd);
 	}
 
+	watchdog_set_open_deadline(wd_data);
 	watchdog_update_worker(wdd);
 
 	/* make sure that /dev/watchdog can be re-opened */
@@ -955,6 +978,7 @@ static int watchdog_cdev_register(struct watchdog_device *wdd, dev_t devno)
 
 	/* Record time of most recent heartbeat as 'just before now'. */
 	wd_data->last_hw_keepalive = jiffies - 1;
+	watchdog_set_open_deadline(wd_data);
 
 	/*
 	 * If the watchdog is running, prevent its driver from being unloaded,
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH v4 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT
  2017-01-09 15:02 [PATCH v4 0/3] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
  2017-01-09 15:02 ` [PATCH v4 1/3] watchdog: introduce watchdog_worker_should_ping helper Rasmus Villemoes
  2017-01-09 15:02 ` [PATCH v4 2/3] watchdog: introduce watchdog.open_timeout commandline parameter Rasmus Villemoes
@ 2017-01-09 15:02 ` Rasmus Villemoes
  2 siblings, 0 replies; 9+ messages in thread
From: Rasmus Villemoes @ 2017-01-09 15:02 UTC (permalink / raw)
  To: Wim Van Sebroeck, Guenter Roeck, Jonathan Corbet
  Cc: Sylvain Lemieux, Rasmus Villemoes, linux-watchdog, linux-doc,
	linux-kernel

This allows setting a default value for the watchdog.open_timeout
commandline parameter via Kconfig.

Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
---
 Documentation/watchdog/watchdog-parameters.txt | 9 +++++----
 drivers/watchdog/Kconfig                       | 9 +++++++++
 drivers/watchdog/watchdog_dev.c                | 2 +-
 3 files changed, 15 insertions(+), 5 deletions(-)

diff --git a/Documentation/watchdog/watchdog-parameters.txt b/Documentation/watchdog/watchdog-parameters.txt
index 2ae0fdf..a688028 100644
--- a/Documentation/watchdog/watchdog-parameters.txt
+++ b/Documentation/watchdog/watchdog-parameters.txt
@@ -12,10 +12,11 @@ The watchdog core currently understands one parameter,
 watchdog.open_timeout. This is the maximum time, in milliseconds, for
 which the watchdog framework will take care of pinging a hardware
 watchdog until userspace opens the corresponding /dev/watchdogN
-device. A value of 0 (the default) means an infinite timeout. Setting
-this to a non-zero value can be useful to ensure that either userspace
-comes up properly, or the board gets reset and allows fallback logic
-in the bootloader to try something else.
+device. The defalt value is CONFIG_WATCHDOG_OPEN_TIMEOUT. A value of 0
+means an infinite timeout. Setting this to a non-zero value can be
+useful to ensure that either userspace comes up properly, or the board
+gets reset and allows fallback logic in the bootloader to try
+something else.
 
 
 -------------------------------------------------
diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index acb00b5..8a847f7 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -52,6 +52,15 @@ config WATCHDOG_SYSFS
 	  Say Y here if you want to enable watchdog device status read through
 	  sysfs attributes.
 
+config WATCHDOG_OPEN_TIMEOUT
+	int "Timeout value for opening watchdog device"
+	default 0
+	help
+	  The maximum time, in milliseconds, for which the watchdog
+	  framework takes care of pinging a hardware watchdog. A value
+	  of 0 means infinite. The value set here can be overridden by
+	  the commandline parameter "watchdog.open_timeout".
+
 #
 # General Watchdog drivers
 #
diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 8bc9f24..cc26c4d 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -81,7 +81,7 @@ static struct watchdog_core_data *old_wd_data;
 
 static struct workqueue_struct *watchdog_wq;
 
-static unsigned open_timeout;
+static unsigned open_timeout = CONFIG_WATCHDOG_OPEN_TIMEOUT;
 module_param(open_timeout, uint, 0644);
 
 static bool watchdog_past_open_deadline(struct watchdog_core_data *data)
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 2/3] watchdog: introduce watchdog.open_timeout commandline parameter
  2017-01-09 15:02 ` [PATCH v4 2/3] watchdog: introduce watchdog.open_timeout commandline parameter Rasmus Villemoes
@ 2017-01-10 18:08   ` Guenter Roeck
  2017-01-11  8:11     ` Rasmus Villemoes
  0 siblings, 1 reply; 9+ messages in thread
From: Guenter Roeck @ 2017-01-10 18:08 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Wim Van Sebroeck, Jonathan Corbet, Sylvain Lemieux,
	linux-watchdog, linux-doc, linux-kernel

On Mon, Jan 09, 2017 at 04:02:32PM +0100, Rasmus Villemoes wrote:
> The watchdog framework takes care of feeding a hardware watchdog until
> userspace opens /dev/watchdogN. If that never happens for some reason
> (buggy init script, corrupt root filesystem or whatnot) but the kernel
> itself is fine, the machine stays up indefinitely. This patch allows
> setting an upper limit for how long the kernel will take care of the
> watchdog, thus ensuring that the watchdog will eventually reset the
> machine.
> 
> This is particularly useful for embedded devices where some fallback
> logic is implemented in the bootloader (e.g., use a different root
> partition, boot from network, ...).
> 
> The open timeout is also used as a maximum time for an application to
> re-open /dev/watchdogN after closing it.
> 
> A value of 0 (the default) means infinite timeout, preserving the
> current behaviour.
> 
> Signed-off-by: Rasmus Villemoes <rasmus.villemoes@prevas.dk>
> ---
>  Documentation/watchdog/watchdog-parameters.txt |  9 +++++++++
>  drivers/watchdog/watchdog_dev.c                | 26 +++++++++++++++++++++++++-
>  2 files changed, 34 insertions(+), 1 deletion(-)
> 
> diff --git a/Documentation/watchdog/watchdog-parameters.txt b/Documentation/watchdog/watchdog-parameters.txt
> index e21850e..2ae0fdf 100644
> --- a/Documentation/watchdog/watchdog-parameters.txt
> +++ b/Documentation/watchdog/watchdog-parameters.txt
> @@ -8,6 +8,15 @@ See Documentation/admin-guide/kernel-parameters.rst for information on
>  providing kernel parameters for builtin drivers versus loadable
>  modules.
>  
> +The watchdog core currently understands one parameter,
> +watchdog.open_timeout. This is the maximum time, in milliseconds, for
> +which the watchdog framework will take care of pinging a hardware
> +watchdog until userspace opens the corresponding /dev/watchdogN
> +device. A value of 0 (the default) means an infinite timeout. Setting
> +this to a non-zero value can be useful to ensure that either userspace
> +comes up properly, or the board gets reset and allows fallback logic
> +in the bootloader to try something else.
> +
>  
>  -------------------------------------------------
>  acquirewdt:
> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
> index 21809e4..8bc9f24 100644
> --- a/drivers/watchdog/watchdog_dev.c
> +++ b/drivers/watchdog/watchdog_dev.c
> @@ -66,6 +66,7 @@ struct watchdog_core_data {
>  	struct mutex lock;
>  	unsigned long last_keepalive;
>  	unsigned long last_hw_keepalive;
> +	unsigned long open_deadline;
>  	struct delayed_work work;
>  	unsigned long status;		/* Internal status bits */
>  #define _WDOG_DEV_OPEN		0	/* Opened ? */
> @@ -80,6 +81,21 @@ static struct watchdog_core_data *old_wd_data;
>  
>  static struct workqueue_struct *watchdog_wq;
>  
> +static unsigned open_timeout;
> +module_param(open_timeout, uint, 0644);
> +
> +static bool watchdog_past_open_deadline(struct watchdog_core_data *data)
> +{
> +	if (!open_timeout)
> +		return false;
> +	return time_is_before_jiffies(data->open_deadline);

Doesn't this return true if the time is _before_ the open deadline ?
Should it be time_is_after_jiffies() ?

> +}
> +
> +static void watchdog_set_open_deadline(struct watchdog_core_data *data)
> +{
> +	data->open_deadline = jiffies + msecs_to_jiffies(open_timeout);

The open deadline as defined applies to the time after the device was
instantiated, not to the time since boot. Would it be better to make it
"time since boot" ?

Also, are you sure about using milli-seconds (instead of seconds) ?
I can not really imagine a situation where this would be needed
(especially and even more so in the context of using "time after
instantiating").

Thanks,
Guenter

> +}
> +
>  static inline bool watchdog_need_worker(struct watchdog_device *wdd)
>  {
>  	/* All variables in milli-seconds */
> @@ -196,7 +212,13 @@ static bool watchdog_worker_should_ping(struct watchdog_core_data *wd_data)
>  {
>  	struct watchdog_device *wdd = wd_data->wdd;
>  
> -	return wdd && (watchdog_active(wdd) || watchdog_hw_running(wdd));
> +	if (!wdd)
> +		return false;
> +
> +	if (watchdog_active(wdd))
> +		return true;
> +
> +	return watchdog_hw_running(wdd) && !watchdog_past_open_deadline(wd_data);
>  }
>  
>  static void watchdog_ping_work(struct work_struct *work)
> @@ -857,6 +879,7 @@ static int watchdog_release(struct inode *inode, struct file *file)
>  		watchdog_ping(wdd);
>  	}
>  
> +	watchdog_set_open_deadline(wd_data);
>  	watchdog_update_worker(wdd);
>  
>  	/* make sure that /dev/watchdog can be re-opened */
> @@ -955,6 +978,7 @@ static int watchdog_cdev_register(struct watchdog_device *wdd, dev_t devno)
>  
>  	/* Record time of most recent heartbeat as 'just before now'. */
>  	wd_data->last_hw_keepalive = jiffies - 1;
> +	watchdog_set_open_deadline(wd_data);
>  
>  	/*
>  	 * If the watchdog is running, prevent its driver from being unloaded,
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 2/3] watchdog: introduce watchdog.open_timeout commandline parameter
  2017-01-10 18:08   ` Guenter Roeck
@ 2017-01-11  8:11     ` Rasmus Villemoes
  2017-01-11 11:02       ` Guenter Roeck
  0 siblings, 1 reply; 9+ messages in thread
From: Rasmus Villemoes @ 2017-01-11  8:11 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Wim Van Sebroeck, Jonathan Corbet, Sylvain Lemieux,
	linux-watchdog, linux-doc, linux-kernel

On 2017-01-10 19:08, Guenter Roeck wrote:
> On Mon, Jan 09, 2017 at 04:02:32PM +0100, Rasmus Villemoes wrote:
>>
>> +static unsigned open_timeout;
>> +module_param(open_timeout, uint, 0644);
>> +
>> +static bool watchdog_past_open_deadline(struct watchdog_core_data *data)
>> +{
>> +	if (!open_timeout)
>> +		return false;
>> +	return time_is_before_jiffies(data->open_deadline);
>
> Doesn't this return true if the time is _before_ the open deadline ?
> Should it be time_is_after_jiffies() ?

Yes, time_is_before_jiffies(foo) means foo < jiffies, and that is what 
we want when we're querying whether we've passed the deadline.

>> +}
>> +
>> +static void watchdog_set_open_deadline(struct watchdog_core_data *data)
>> +{
>> +	data->open_deadline = jiffies + msecs_to_jiffies(open_timeout);
>
> The open deadline as defined applies to the time after the device was
> instantiated, not to the time since boot. Would it be better to make it
> "time since boot" ?

I don't have a strong opinion on that, but two small things made me do 
it this way: (1) In case a hardware watchdog is somehow hotplugged long 
after boot and userspace is supposed to detect that and start feeding 
it, it wouldn't make sense for the framework not to take care of it for 
a while. (2) The open_timeout also applies to the case where the 
userspace app gracefully closes the watchdog device (e.g. because it's 
been instructed to restart to load a new configuration or whatnot) but 
the hardware cannot be stopped. In that case, the framework also takes 
over, and the same logic as after boot should apply - if the app fails 
to come up again, the framework should not feed the dog indefinitely, 
but OTOH it clearly doesn't make sense to have a boot-time based deadline.

> Also, are you sure about using milli-seconds (instead of seconds) ?
> I can not really imagine a situation where this would be needed
> (especially and even more so in the context of using "time after
> instantiating").

I went back and forth on this. I did milli-seconds because that should 
cover more use cases. Yes, wanting an open timeout of .7 seconds with 
1.0 seconds being unacceptable is unusual, but I know of some customers 
with very strict requirements. Also, even if one cannot make userspace 
start that fast, one can boot with a somewhat generous open_timeout and 
then write 700 to /sys/module/watchdog/parameters/open_timeout for use 
in case (2) above.

Thanks,
Rasmus

-- 
Rasmus Villemoes
Software Developer
Prevas A/S
Hedeager 1
DK-8200 Aarhus N
+45 51210274
rasmus.villemoes@prevas.dk
www.prevas.dk

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 2/3] watchdog: introduce watchdog.open_timeout commandline parameter
  2017-01-11  8:11     ` Rasmus Villemoes
@ 2017-01-11 11:02       ` Guenter Roeck
  2017-01-13  9:11         ` Rasmus Villemoes
  0 siblings, 1 reply; 9+ messages in thread
From: Guenter Roeck @ 2017-01-11 11:02 UTC (permalink / raw)
  To: Rasmus Villemoes
  Cc: Wim Van Sebroeck, Jonathan Corbet, Sylvain Lemieux,
	linux-watchdog, linux-doc, linux-kernel

On 01/11/2017 12:11 AM, Rasmus Villemoes wrote:
> On 2017-01-10 19:08, Guenter Roeck wrote:
>> On Mon, Jan 09, 2017 at 04:02:32PM +0100, Rasmus Villemoes wrote:
>>>
>>> +static unsigned open_timeout;
>>> +module_param(open_timeout, uint, 0644);
>>> +
>>> +static bool watchdog_past_open_deadline(struct watchdog_core_data *data)
>>> +{
>>> +    if (!open_timeout)
>>> +        return false;
>>> +    return time_is_before_jiffies(data->open_deadline);
>>
>> Doesn't this return true if the time is _before_ the open deadline ?
>> Should it be time_is_after_jiffies() ?
>
> Yes, time_is_before_jiffies(foo) means foo < jiffies, and that is what we want when we're querying whether we've passed the deadline.
>
So you want the function to return true if the timeout has _not_ expired ?

>>> +}
>>> +
>>> +static void watchdog_set_open_deadline(struct watchdog_core_data *data)
>>> +{
>>> +    data->open_deadline = jiffies + msecs_to_jiffies(open_timeout);
>>
>> The open deadline as defined applies to the time after the device was
>> instantiated, not to the time since boot. Would it be better to make it
>> "time since boot" ?
>
> I don't have a strong opinion on that, but two small things made me do it this way: (1) In case a hardware watchdog

is somehow hotplugged long after boot and userspace is supposed to detect that and start feeding it, it wouldn't make

sense for the framework not to take care of it for a while. (2) The open_timeout also applies to the case where the

userspace app gracefully closes the watchdog device (e.g. because it's been instructed to restart to load a new configuration

or whatnot) but the hardware cannot be stopped. In that case, the framework also takes over, and the same logic as after boot should apply

- if the app fails to come up again, the framework should not feed the dog indefinitely, but OTOH it clearly doesn't make sense to have a boot-time based deadline.
>

[ Can you try to work with line wraps ? ]

Uuh, no. I didn't realize that you apply that case to the "userspace app gracefully
closes the watchdog device" case as well. This is clearly a separate use case.

Looks like there are now three use cases for 'open timeout'.
- time after boot
- timer after loading the watchdog device
- time after closing watchdog device and before reopening it

I would have thought the first use case is the important one, and the other ones are,
at best, secondary. Given that, we are clearly not there yet. This will require input
from others on the semantics.

Thanks,
Guenter

>> Also, are you sure about using milli-seconds (instead of seconds) ?
>> I can not really imagine a situation where this would be needed
>> (especially and even more so in the context of using "time after
>> instantiating").
>
> I went back and forth on this. I did milli-seconds because that should cover more use cases. Yes, wanting an open timeout of .7 seconds with 1.0 seconds being unacceptable is unusual, but I know of some customers with very strict requirements. Also, even if one cannot make userspace start that fast, one can boot with a somewhat generous open_timeout and then write 700 to /sys/module/watchdog/parameters/open_timeout for use in case (2) above.
>
> Thanks,
> Rasmus
>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 2/3] watchdog: introduce watchdog.open_timeout commandline parameter
  2017-01-11 11:02       ` Guenter Roeck
@ 2017-01-13  9:11         ` Rasmus Villemoes
  2017-01-30 10:33           ` Rasmus Villemoes
  0 siblings, 1 reply; 9+ messages in thread
From: Rasmus Villemoes @ 2017-01-13  9:11 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Wim Van Sebroeck, Jonathan Corbet, Sylvain Lemieux,
	linux-watchdog, linux-doc, linux-kernel

On 2017-01-11 12:02, Guenter Roeck wrote:
> On 01/11/2017 12:11 AM, Rasmus Villemoes wrote:
>> On 2017-01-10 19:08, Guenter Roeck wrote:
>>> On Mon, Jan 09, 2017 at 04:02:32PM +0100, Rasmus Villemoes wrote:
>>>>
>>>> +static unsigned open_timeout;
>>>> +module_param(open_timeout, uint, 0644);
>>>> +
>>>> +static bool watchdog_past_open_deadline(struct watchdog_core_data
>>>> *data)
>>>> +{
>>>> +    if (!open_timeout)
>>>> +        return false;
>>>> +    return time_is_before_jiffies(data->open_deadline);
>>>
>>> Doesn't this return true if the time is _before_ the open deadline ?
>>> Should it be time_is_after_jiffies() ?
>>
>> Yes, time_is_before_jiffies(foo) means foo < jiffies, and that is what
>> we want when we're querying whether we've passed the deadline.
>>
> So you want the function to return true if the timeout has _not_ expired ?

No, I want it to return true if the timeout has expired, just as its
name hopefully suggests ("are we now past the deadline for opening").

time_is_before_jiffies(foo) expands to time_after(jiffies, foo), which
in turn expands to (aside from some type checking)

(long)((foo) - (jiffies)) < 0

which is true precisely when jiffies > foo, i.e., when the current time
is later than foo. [Yes, just as every other user of the time_* helpers
this only works if the times being compared are within LONG_MAX jiffies
of each other.]

> 
> [ Can you try to work with line wraps ? ]

Sorry about that, I hope I've now managed to make Thunderbird not mess
it up.

> Uuh, no. I didn't realize that you apply that case to the "userspace app
> gracefully
> closes the watchdog device" case as well. This is clearly a separate use
> case.
> 
> Looks like there are now three use cases for 'open timeout'.
> - time after boot
> - timer after loading the watchdog device
> - time after closing watchdog device and before reopening it
> 
> I would have thought the first use case is the important one, and the
> other ones are,
> at best, secondary. Given that, we are clearly not there yet. This will
> require input
> from others on the semantics.

I agree, it would be nice to have others chime in on whether they'd even
find this useful, and what semantics they'd like.

But for the record, for us, both the "time after boot" and "time after
closing" are important use cases. In practice, approximating boot time
with time of first registration effectively just pushes the deadline a
little further out, which I think is acceptable (boot time anyway means
"when timekeeping began", and even when the deadline is passed, it takes
up to whatever the hardware is configured to before the board actually
resets, so there's already some slack involved). If one really wants to
measure with respect to boot time, I think one can just make
set_open_deadline take an additional "from" parameter, passing
INITIAL_JIFFIES from watchdog_cdev_register and jiffies from
watchdog_release.

(That won't work for the hotplug case, but please ignore that; it was a
bad example, and certainly not a use case we actually have in mind).


Thanks,
Rasmus


-- 
Rasmus Villemoes
Software Developer
Prevas A/S
Hedeager 1
DK-8200 Aarhus N
+45 51210274
rasmus.villemoes@prevas.dk
www.prevas.dk

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH v4 2/3] watchdog: introduce watchdog.open_timeout commandline parameter
  2017-01-13  9:11         ` Rasmus Villemoes
@ 2017-01-30 10:33           ` Rasmus Villemoes
  0 siblings, 0 replies; 9+ messages in thread
From: Rasmus Villemoes @ 2017-01-30 10:33 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: Wim Van Sebroeck, Jonathan Corbet, Sylvain Lemieux,
	linux-watchdog, linux-doc, linux-kernel

On 2017-01-13 10:11, Rasmus Villemoes wrote:
> On 2017-01-11 12:02, Guenter Roeck wrote:
>> This will require input from others on the semantics.
> 
> I agree, it would be nice to have others chime in on whether they'd even
> find this useful, and what semantics they'd like.

Ping.

-- 
Rasmus Villemoes
Software Developer
Prevas A/S
Hedeager 1
DK-8200 Aarhus N
+45 51210274
rasmus.villemoes@prevas.dk
www.prevas.dk

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2017-01-30 10:33 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-01-09 15:02 [PATCH v4 0/3] watchdog: allow setting deadline for opening /dev/watchdogN Rasmus Villemoes
2017-01-09 15:02 ` [PATCH v4 1/3] watchdog: introduce watchdog_worker_should_ping helper Rasmus Villemoes
2017-01-09 15:02 ` [PATCH v4 2/3] watchdog: introduce watchdog.open_timeout commandline parameter Rasmus Villemoes
2017-01-10 18:08   ` Guenter Roeck
2017-01-11  8:11     ` Rasmus Villemoes
2017-01-11 11:02       ` Guenter Roeck
2017-01-13  9:11         ` Rasmus Villemoes
2017-01-30 10:33           ` Rasmus Villemoes
2017-01-09 15:02 ` [PATCH v4 3/3] watchdog: introduce CONFIG_WATCHDOG_OPEN_TIMEOUT Rasmus Villemoes

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.