All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Uwe Kleine-König" <u.kleine-koenig@pengutronix.de>
To: Guenter Roeck <linux@roeck-us.net>
Cc: linux-watchdog@vger.kernel.org, Wim Van Sebroeck <wim@iguana.be>,
	linux-kernel@vger.kernel.org,
	Timo Kokkonen <timo.kokkonen@offcode.fi>,
	linux-doc@vger.kernel.org, Jonathan Corbet <corbet@lwn.net>
Subject: Re: [PATCH v5 1/8] watchdog: Introduce hardware maximum timeout in watchdog core
Date: Mon, 23 Nov 2015 19:26:08 +0100	[thread overview]
Message-ID: <20151123182608.GA19888@pengutronix.de> (raw)
In-Reply-To: <56533B80.4030400@roeck-us.net>

Hello Guenter,

On Mon, Nov 23, 2015 at 08:14:56AM -0800, Guenter Roeck wrote:
> On 11/22/2015 11:53 PM, Uwe Kleine-König wrote:
> >On Sun, Nov 22, 2015 at 07:20:58PM -0800, Guenter Roeck wrote:
> >>@@ -160,7 +176,11 @@ they are supported. These optional routines/operations are:
> >>    and -EIO for "could not write value to the watchdog". On success this
> >>    routine should set the timeout value of the watchdog_device to the
> >>    achieved timeout value (which may be different from the requested one
> >>-  because the watchdog does not necessarily has a 1 second resolution).
> >>+  because the watchdog does not necessarily have a 1 second resolution).
> >>+  Drivers implementing hw_max_timeout_ms set the hardware watchdog timeout
> >>+  to the minimum of timeout and hw_max_timeout_ms. Those drivers set the
> >
> >Actually this is something that the wdg core could abstract for drivers.
> >Oh well, apart from hw_max_timeout_ms having ms accuracy.
> >
> Not that sure. about the abstraction. The actual timeout to set depends on
> the hardware, and may have an unknown granularity. The watchdog core can

What I wanted to say is that the driver core can do (ignoring scaling
due to different units):

	if (timeout > wdd->hw_max_timeout_ms)
		wdd->set_timeout(wdd->hw_max_timeout_ms);
	else
		wdd->set_timeout(timeout)

So that the device specific driver can assume that timeout passed to its
set_timeout callback is always less than or equal to hw_max_timeout_ms.
And so it doesn't need to compare against hw_max_timeout_ms again.

> not predict what that granularity would be. We can play with it, and try
> if it can be done, but I really would like this to be a separate patch.
> 
> hw_max_timeout is in ms because some watchdogs have a very low maximum
> HW timeout.
> 
> >>+  timeout value of the watchdog_device either to the requested timeout value
> >>+  (if it is larger than hw_max_timeout_ms), or to the achieved timeout value.
> >>    (Note: the WDIOF_SETTIMEOUT needs to be set in the options field of the
> >>    watchdog's info structure).
> >>  * get_timeleft: this routines returns the time that's left before a reset.
> >>diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
> >>index 56a649e66eb2..1dba3f57dba3 100644
> >>--- a/drivers/watchdog/watchdog_dev.c
> >>+++ b/drivers/watchdog/watchdog_dev.c
> >>[...]
> >>+static long watchdog_next_keepalive(struct watchdog_device *wdd)
> >>+{
> >>+	unsigned int timeout_ms = wdd->timeout * 1000;
> >>+	unsigned long keepalive_interval;
> >>+	unsigned long last_heartbeat;
> >>+	unsigned long virt_timeout;
> >>+	unsigned int hw_timeout_ms;
> >>+
> >>+	virt_timeout = wdd->last_keepalive + msecs_to_jiffies(timeout_ms);
> >
> >I think it's sensible to store
> >
> >	last_keepalive + timeout
> >
> >(i.e. the expected expiration time) in struct watchdog_device instead of
> >last_keepalive. This moves the (maybe expensive) calculation to a
> >context that has userspace interaction anyhow. On the other hand this
> >complicates the set_timeout call. Hmm.
> >
> 
> I would rather keep the code simple. It is not as if this is called
> all the time. Also, I need last_keepalive later in the series
> to determine if the minimum timeout between hardware pings has elapsed,
> so we would need both.

I'm not sure my suggestion improves the situation, but I will keep my
idea in mind and check once the dust settled.

> >>+	hw_timeout_ms = min(timeout_ms, wdd->max_hw_timeout_ms);
> >>+	keepalive_interval = msecs_to_jiffies(hw_timeout_ms / 2);
> >>+
> >>+	/*
> >>+	 * To ensure that the watchdog times out wdd->timeout seconds
> >>+	 * after the most recent ping from userspace, the last
> >>+	 * worker ping has to come in hw_timeout_ms before this timeout.
> >>+	 */
> >>+	last_heartbeat = virt_timeout - msecs_to_jiffies(hw_timeout_ms);
> >>+	return min_t(long, last_heartbeat - jiffies, keepalive_interval);
> >>+}
> >>[...]
> >>@@ -61,26 +137,25 @@ static struct watchdog_device *old_wdd;
> >>
> >>  static int watchdog_ping(struct watchdog_device *wdd)
> >>  {
> >>-	int err = 0;
> >>+	int err;
> >>
> >>  	mutex_lock(&wdd->lock);
> >>+	wdd->last_keepalive = jiffies;
> >>+	err = _watchdog_ping(wdd);
> >>+	mutex_unlock(&wdd->lock);
> >>
> >>-	if (test_bit(WDOG_UNREGISTERED, &wdd->status)) {
> >>-		err = -ENODEV;
> >>-		goto out_ping;
> >>-	}
> >>+	return err;
> >>+}
> >>
> >>-	if (!watchdog_active(wdd))
> >>-		goto out_ping;
> >>+static void watchdog_ping_work(struct work_struct *work)
> >>+{
> >>+	struct watchdog_device *wdd;
> >>
> >>-	if (wdd->ops->ping)
> >>-		err = wdd->ops->ping(wdd);	/* ping the watchdog */
> >>-	else
> >>-		err = wdd->ops->start(wdd);	/* restart watchdog */
> >>+	wdd = container_of(to_delayed_work(work), struct watchdog_device, work);
> >>
> >>-out_ping:
> >>+	mutex_lock(&wdd->lock);
> >>+	_watchdog_ping(wdd);
> >>  	mutex_unlock(&wdd->lock);
> >>-	return err;
> >
> >Calling this function might come after last_keepalive + timeout in which
> >case the watchdog shouldn't be pinged.
> >
> Unless the code is wrong, the last time this function is called should be
> at (timeout - max_hw_timeout), ie well before the timeout elapsed.
> Given that, I don't think this is something to be concerned about.

Depends on max_hw_timeout and the load of the machine if that can
happen. And thinking again, if the ping didn't come in until
last_keepalive + timeout, the machine is reset anyhow ...

> >>  }
> >>
> >>  /*
> >>@@ -107,8 +182,11 @@ static int watchdog_start(struct watchdog_device *wdd)
> >>  		goto out_start;
> >>
> >>  	err = wdd->ops->start(wdd);
> >>-	if (err == 0)
> >>+	if (err == 0) {
> >>  		set_bit(WDOG_ACTIVE, &wdd->status);
> >>+		wdd->last_keepalive = jiffies;
> >>+		watchdog_update_worker(wdd, true);
> >>+	}
> >
> >I think it's more correct to sample jiffies before calling .start.
> >Something like:
> >
> >	unsigned long started_at = jiffies;
> >
> >	err = wdd->ops->start(wdd);
> >	if (err == 0)
> >		wdd->last_keepalive = jiffies;
> >
> I assume you mean
> 		wdd->last_keepalive = started_at;

right.

Best regards
Uwe

-- 
Pengutronix e.K.                           | Uwe Kleine-König            |
Industrial Linux Solutions                 | http://www.pengutronix.de/  |

  reply	other threads:[~2015-11-23 18:26 UTC|newest]

Thread overview: 24+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-23  3:20 [PATCH v5 0/8] watchdog: Add support for keepalives triggered by infrastructure Guenter Roeck
2015-11-23  3:20 ` [PATCH v5 1/8] watchdog: Introduce hardware maximum timeout in watchdog core Guenter Roeck
2015-11-23  7:53   ` Uwe Kleine-König
2015-11-23 16:14     ` Guenter Roeck
2015-11-23 18:26       ` Uwe Kleine-König [this message]
2015-11-23 21:55         ` Guenter Roeck
2015-11-24  7:16   ` Uwe Kleine-König
2015-11-24 15:03     ` Guenter Roeck
2015-11-24 16:11       ` Uwe Kleine-König
2015-11-24 16:45         ` Guenter Roeck
2015-11-24 21:14           ` Uwe Kleine-König
2015-11-23  3:20 ` [PATCH v5 2/8] watchdog: Introduce WDOG_RUNNING flag Guenter Roeck
2015-11-23 16:30   ` Alexander Stein
2015-11-23 16:42     ` Guenter Roeck
2015-11-23 19:26   ` Uwe Kleine-König
2015-11-23 21:48     ` Guenter Roeck
2015-11-23  3:21 ` [PATCH v5 3/8] watchdog: Make set_timeout function optional Guenter Roeck
2015-11-23  3:21 ` [PATCH v5 4/8] watchdog: Add support for minimum time between heartbeats Guenter Roeck
2015-11-23  3:21 ` [PATCH v5 5/8] watchdog: Simplify update_worker Guenter Roeck
2015-11-24  7:13   ` Uwe Kleine-König
2015-11-24  7:25     ` Guenter Roeck
2015-11-23  3:21 ` [RFT PATCH v5 6/8] watchdog: imx2: Convert to use infrastructure triggered keepalives Guenter Roeck
2015-11-23  3:21 ` [RFT PATCH v5 7/8] watchdog: retu: " Guenter Roeck
2015-11-23  3:21 ` [RFT PATCH v5 8/8] watchdog: at91sam9: " Guenter Roeck

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20151123182608.GA19888@pengutronix.de \
    --to=u.kleine-koenig@pengutronix.de \
    --cc=corbet@lwn.net \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-watchdog@vger.kernel.org \
    --cc=linux@roeck-us.net \
    --cc=timo.kokkonen@offcode.fi \
    --cc=wim@iguana.be \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.