All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC v1 0/2] Add driver for PAPR watchdog timers
@ 2022-04-13 16:51 Scott Cheloha
  2022-04-13 16:51 ` [RFC v1 1/2] powerpc/pseries: hvcall.h: add definitions for H_WATCHDOG hypercall Scott Cheloha
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: Scott Cheloha @ 2022-04-13 16:51 UTC (permalink / raw)
  To: linux-watchdog; +Cc: bjking, nathanl, aik, npiggin, vaishnavi, wvoigt

This series adds a driver for PAPR hypercall-based watchdog timers,
tentatively named "pseries-wdt".

I wanted to get some clarification on a few things before submitting
the series as a patch, hence the RFC.  The first patch adding the
hypercall to hvcall.h is straightforward, but I have questions about
the second patch (the driver).  In particular:

- In pseries_wdt_probe() we register the watchdog device with
  devm_watchdog_register_device().  However, in pseries_wdt_remove(),
  calling watchdog_unregister_devce() causes a kernel panic later,
  so I assume this is the wrong thing to do.

  Do we need to do anything to clean up the watchdog device during
  pseries_wdt_remove()?  Or does devm_watchdog_register_device()
  ensure the cleanup is handled transparently?

- In pseries_wdt_probe(), is it incorrect to devm_kfree() my
  allocation in the event that devm_watchdog_register_device()
  fails?

- The enormous hypercall input/output comment is mostly for my
  edification.  It seems like the sort of thing that will rot over time.
  I intend to remove most of it.  However, as far as I know the PAPR
  revision containing these details is not published yet.  Should I
  leave the comment in to ease review for now and remove it later?
  Or should I omit it from the initial commit entirely?

- Should we print something to the console when probing/removing the
  watchdog0 device or is that just noise?

  Most drivers (as distinct from devices) seem to print something
  during initialization, so that's what I've done in
  pseries_wdt_module_init() when the capability query succeeds.

- The timeout action is currently hardcoded to a hard reset.  This
  could be made configurable through a module parameter.  I intend
  to do this in a later patch unless someone needs it included
  in the initial patch.

- We set EIO if the hypercall fails in pseries_wdt_start() or
  pseries_wdt_stop().  There is nothing userspace can do if this
  happens.  All hypercall failures in these contexts are unexpected.

  Given all of that, is there is a more appropriate errno than EIO?

- The H_WATCHDOG spec indicates that H_BUSY is possible.  Is it
  probable, though?  Should we spin and retry the hypercall in
  the event that we see it?  Or is that pointless?



^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFC v1 1/2] powerpc/pseries: hvcall.h: add definitions for H_WATCHDOG hypercall
  2022-04-13 16:51 [RFC v1 0/2] Add driver for PAPR watchdog timers Scott Cheloha
@ 2022-04-13 16:51 ` Scott Cheloha
  2022-04-13 16:51 ` [RFC v1 2/2] watchdog: pseries-wdt: initial support for PAPR virtual watchdog timers Scott Cheloha
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: Scott Cheloha @ 2022-04-13 16:51 UTC (permalink / raw)
  To: linux-watchdog
  Cc: bjking, nathanl, aik, npiggin, vaishnavi, wvoigt, Scott Cheloha

A future revision of the PAPR will define a new hypercall, H_WATCHDOG.
The hypercall will permit guest control of one or more virtual
watchdog timers.

Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com>
---
 arch/powerpc/include/asm/hvcall.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index d92a20a85395..992fc4b88cb0 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -324,7 +324,8 @@
 #define H_RPT_INVALIDATE	0x448
 #define H_SCM_FLUSH		0x44C
 #define H_GET_ENERGY_SCALE_INFO	0x450
-#define MAX_HCALL_OPCODE	H_GET_ENERGY_SCALE_INFO
+#define H_WATCHDOG		0x45C
+#define MAX_HCALL_OPCODE	H_WATCHDOG
 
 /* Scope args for H_SCM_UNBIND_ALL */
 #define H_UNBIND_SCOPE_ALL (0x1)
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC v1 2/2] watchdog: pseries-wdt: initial support for PAPR virtual watchdog timers
  2022-04-13 16:51 [RFC v1 0/2] Add driver for PAPR watchdog timers Scott Cheloha
  2022-04-13 16:51 ` [RFC v1 1/2] powerpc/pseries: hvcall.h: add definitions for H_WATCHDOG hypercall Scott Cheloha
@ 2022-04-13 16:51 ` Scott Cheloha
  2022-04-14  3:20   ` Tzung-Bi Shih
  2022-04-14  2:23 ` [RFC v1 0/2] Add driver for PAPR " Guenter Roeck
  2022-04-19  8:49 ` Alexey Kardashevskiy
  3 siblings, 1 reply; 10+ messages in thread
From: Scott Cheloha @ 2022-04-13 16:51 UTC (permalink / raw)
  To: linux-watchdog
  Cc: bjking, nathanl, aik, npiggin, vaishnavi, wvoigt, Scott Cheloha

A future revision of the PAPR will define a new hypercall, H_WATCHDOG.
The hypercall will permit guest control of one or more virtual
watchdog timers.

This patch adds a new watchdog driver for these timers, "pseries-wdt".
For now, the driver only exposes a single timer.  In the future it
could expose additional timers when more than one is available.

Signed-off-by: Scott Cheloha <cheloha@linux.ibm.com>
---
 drivers/watchdog/Kconfig       |   8 +
 drivers/watchdog/Makefile      |   1 +
 drivers/watchdog/pseries-wdt.c | 337 +++++++++++++++++++++++++++++++++
 3 files changed, 346 insertions(+)
 create mode 100644 drivers/watchdog/pseries-wdt.c

diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
index c4e82a8d863f..f3e6db5bed74 100644
--- a/drivers/watchdog/Kconfig
+++ b/drivers/watchdog/Kconfig
@@ -1941,6 +1941,14 @@ config WATCHDOG_RTAS
 	  To compile this driver as a module, choose M here. The module
 	  will be called wdrtas.
 
+config PSERIES_WDT
+	tristate "POWER Architecture Platform Watchdog Timer"
+	depends on PPC_PSERIES
+	select WATCHDOG_CORE
+	help
+	  Driver for virtual watchdog timers provided by PAPR
+	  hypervisors (e.g. pHyp, KVM).
+
 # S390 Architecture
 
 config DIAG288_WATCHDOG
diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
index f7da867e8782..3ae1f7d9f1ec 100644
--- a/drivers/watchdog/Makefile
+++ b/drivers/watchdog/Makefile
@@ -185,6 +185,7 @@ obj-$(CONFIG_MEN_A21_WDT) += mena21_wdt.o
 
 # PPC64 Architecture
 obj-$(CONFIG_WATCHDOG_RTAS) += wdrtas.o
+obj-$(CONFIG_PSERIES_WDT) += pseries-wdt.o
 
 # S390 Architecture
 obj-$(CONFIG_DIAG288_WATCHDOG) += diag288_wdt.o
diff --git a/drivers/watchdog/pseries-wdt.c b/drivers/watchdog/pseries-wdt.c
new file mode 100644
index 000000000000..0d22ddf12a7f
--- /dev/null
+++ b/drivers/watchdog/pseries-wdt.c
@@ -0,0 +1,337 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#define DRV_NAME "pseries-wdt"
+#define pr_fmt(fmt) DRV_NAME ": " fmt
+
+#include <linux/bitops.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/moduleparam.h>
+#include <linux/platform_device.h>
+#include <linux/watchdog.h>
+
+/*
+ * The PAPR's MSB->LSB bit ordering is is 0->63.  These macros
+ * simplify defining bitfields as described in the PAPR without
+ * needing to transpose values to the more C-like 63->0 ordering.
+ */
+#define SETFIELD(_v, _b, _e)	\
+    (((unsigned long)(_v) << PPC_BITLSHIFT(_e)) & PPC_BITMASK(_b, _e))
+#define GETFIELD(_v, _b, _e)	\
+    (((unsigned long)(_v) & PPC_BITMASK(_b, _e)) >> PPC_BITLSHIFT(_e))
+
+/*
+ * H_WATCHDOG Hypercall Input
+ *
+ * R4: "flags":
+ *
+ *     A 64-bit value structured as follows:
+ *
+ *         Bits 0-46: Reserved (must be zero).
+ */
+#define PSERIES_WDTF_RESERVED	PPC_BITMASK(0, 46)
+
+/*
+ *         Bit 47: "leaveOtherWatchdogsRunningOnTimeout"
+ *
+ *             0  Stop outstanding watchdogs on timeout.
+ *             1  Leave outstanding watchdogs running on timeout.
+ */
+#define PSERIES_WDTF_LEAVE_OTHER	PPC_BIT(47)
+
+/*
+ *         Bits 48-55: "operation"
+ *
+ *             0x01  Start Watchdog
+ *             0x02  Stop Watchdog
+ *             0x03  Query Watchdog Capabilities
+ *             0x04  Query Watchdog LPM Requirement
+ */
+#define PSERIES_WDTF_OP(op)		SETFIELD((op), 48, 55)
+#define PSERIES_WDTF_OP_START		PSERIES_WDTF_OP(0x1)
+#define PSERIES_WDTF_OP_STOP		PSERIES_WDTF_OP(0x2)
+#define PSERIES_WDTF_OP_QUERY		PSERIES_WDTF_OP(0x3)
+#define PSERIES_WDTF_OP_QUERY_LPM	PSERIES_WDTF_OP(0x4)
+
+/*
+ *         Bits 56-63: "timeoutAction"
+ *
+ *             0x01  Hard poweroff
+ *             0x02  Hard restart
+ *             0x03  Dump restart
+ */
+#define PSERIES_WDTF_ACTION(ac)			SETFIELD(ac, 56, 63)
+#define PSERIES_WDTF_ACTION_HARD_POWEROFF	PSERIES_WDTF_ACTION(0x1)
+#define PSERIES_WDTF_ACTION_HARD_RESTART	PSERIES_WDTF_ACTION(0x2)
+#define PSERIES_WDTF_ACTION_DUMP_RESTART	PSERIES_WDTF_ACTION(0x3)
+
+/*
+ * R5: "watchdogNumber":
+ *
+ *     The target watchdog.  Watchdog numbers are 1-based.  The
+ *     maximum supported watchdog number may be obtained via the
+ *     "Query Watchdog Capabilities" operation.
+ *
+ *     This input is ignored for the "Query Watchdog Capabilities"
+ *     operation.
+ *
+ * R6: "timeoutInMs":
+ *
+ *     The timeout in milliseconds.  The minimum supported timeout may
+ *     be obtained via the "Query Watchdog Capabilities" operation.
+ *
+ *     This input is ignored for the "Stop Watchdog", "Query Watchdog
+ *     Capabilities", and "Query LPM Requirement" operations.
+ */
+
+/*
+ * H_WATCHDOG Hypercall Output
+ *
+ * R3: Return code
+ *
+ *     H_SUCCESS    The operation completed.
+ *
+ *     H_BUSY	    The hypervisor is too busy; retry the operation.
+ *
+ *     H_PARAMETER  The given "flags" are somehow invalid.  Either the
+ *                  "operation" or "timeoutAction" is invalid, or a
+ *                  reserved bit is set.
+ *
+ *     H_P2         The given "watchdogNumber" is zero or exceeds the
+ *                  supported maximum value.
+ *
+ *     H_P3         The given "timeoutInMs" is below the supported
+ *                  minimum value.
+ *
+ *     H_NOOP       The given "watchdogNumber" is already stopped.
+ *
+ *     H_HARDWARE   The operation failed for ineffable reasons.
+ *
+ *     H_FUNCTION   The H_WATCHDOG hypercall is not supported by this
+ *                  hypervisor.
+ *
+ * R4:
+ *
+ * - For the "Query Watchdog Capabilities" operation, a 64-bit
+ *   structure defined as:
+ *
+ *       Bits  0-15: The minimum supported timeout in milliseconds.
+ *       Bits 16-31: The number of watchdogs supported.
+ *       Bits 32-63: Reserved.
+ */
+#define PSERIES_WDTQ_MIN_TIMEOUT(cap)	GETFIELD((cap), 0, 15)
+#define PSERIES_WDTQ_MAX_NUMBER(cap)	GETFIELD((cap), 16, 31)
+#define PSERIES_WDTQ_RESERVED		PPC_BITMASK(32, 63)
+
+/*
+ * - For the "Query Watchdog LPM Requirement" operation:
+ *
+ *       1  The given "watchdogNumber" must be stopped prior to
+ *          suspending.
+ *
+ *       2  The given "watchdogNumber" does not have to be stopped
+ *          prior to suspending.
+ */
+#define PSERIES_WDTQL_MUST_STOP       	1
+#define PSERIES_WDTQL_NEED_NOT_STOP	2
+
+static struct platform_device *pseries_wdt_pdev;
+static unsigned long action = PSERIES_WDTF_ACTION_HARD_RESTART;
+static unsigned int min_timeout = 0;
+
+static bool nowayout = WATCHDOG_NOWAYOUT;
+module_param(nowayout, bool, S_IRUGO);
+MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started "
+	"(default=" __MODULE_STRING(WATCHDOG_NOWAYOUT) ")");
+
+#define DEFAULT_TIMEOUT	60
+static unsigned int timeout = DEFAULT_TIMEOUT;
+module_param(timeout, uint, S_IRUGO);
+MODULE_PARM_DESC(timeout, "Initial watchdog timeout in seconds "
+	"(default=" __MODULE_STRING(WATCHDOG_TIMEOUT) ")");
+
+struct pseries_wdt {
+	struct watchdog_device wd;
+	unsigned long num;		/* NB: Watchdog numbers are 1-based */
+};
+#define wd_to_pseries_wdt(ptr)	container_of(ptr, struct pseries_wdt, wd)
+
+static int pseries_wdt_start(struct watchdog_device *wdd)
+{
+	struct pseries_wdt *pw = wd_to_pseries_wdt(wdd);
+	unsigned long flags, msecs;
+	long rc;
+
+	flags = PSERIES_WDTF_OP_START | action;
+	msecs = wdd->timeout * 1000UL;
+	rc = plpar_hcall_norets(H_WATCHDOG, flags, pw->num, msecs);
+	if (rc != H_SUCCESS) {
+		pr_crit("H_WATCHDOG: %ld: failed to start timer %lu",
+			rc, pw->num);
+		return -EIO;	/* XXX What is the right errno here? */
+	}
+	return 0;
+}
+
+static int pseries_wdt_stop(struct watchdog_device *wdd)
+{
+	struct pseries_wdt *pw = wd_to_pseries_wdt(wdd);
+	long rc;
+
+	rc = plpar_hcall_norets(H_WATCHDOG, PSERIES_WDTF_OP_STOP, pw->num);
+	if (rc != H_SUCCESS) {
+		pr_crit("H_WATCHDOG: %ld: failed to stop timer %lu",
+			rc, pw->num);
+		return -EIO;
+	}
+	return 0;
+}
+
+static struct watchdog_info pseries_wdt_info = {
+	.identity = DRV_NAME,
+	.options = WDIOF_KEEPALIVEPING | WDIOF_MAGICCLOSE | WDIOF_SETTIMEOUT \
+	    | WDIOF_PRETIMEOUT,
+};
+
+static const struct watchdog_ops pseries_wdt_ops = {
+	.owner = THIS_MODULE,
+	.ping = pseries_wdt_start,
+	.start = pseries_wdt_start,
+	.stop = pseries_wdt_stop,
+};
+
+static int pseries_wdt_probe(struct platform_device *pdev)
+{
+	struct device *dev = &pdev->dev;
+	struct pseries_wdt *pw;
+	int err = 0;
+
+	pw = devm_kzalloc(dev, sizeof *pw, GFP_KERNEL);
+	if (pw == NULL)
+		return -ENOMEM;
+
+	pw->num = 1;
+
+	pw->wd.info = &pseries_wdt_info;
+	pw->wd.ops = &pseries_wdt_ops;
+	pw->wd.min_timeout = min_timeout;
+	watchdog_init_timeout(&pw->wd, timeout, NULL);
+	watchdog_set_nowayout(&pw->wd, nowayout);
+	watchdog_stop_on_reboot(&pw->wd);
+	watchdog_stop_on_unregister(&pw->wd);
+	watchdog_set_drvdata(&pw->wd, pw);
+
+	/* XXX Is it appropriate to call devm_kfree() on registration error? */
+	err = devm_watchdog_register_device(dev, &pw->wd);
+	if (err) {
+		devm_kfree(dev, pw);
+		return err;
+	}
+
+	platform_set_drvdata(pdev, pw);
+
+	/*
+	 * XXX Should we print something to the console about the new
+	 * pseudo-device?  If so, what?
+	 */
+	pr_info("watchdog%d probed\n", pw->wd.id);
+	return 0;
+}
+
+static int pseries_wdt_remove(struct platform_device *pdev)
+{
+	struct pseries_wdt *pw = platform_get_drvdata(pdev);
+
+	/* XXX Should we say something about removing the pseudo-device? */
+	pr_info("watchdog%d removed\n", pw->wd.id);
+
+	/*
+	 * XXX Calling watchdog_unregister_device() here causes the kernel
+	 * to panic later.  What is the proper way to clean up a watchdog
+	 * device at module unload time?
+	 */
+#if 0
+	watchdog_unregister_device(&pw->wd);
+#endif
+	return 0;
+}
+
+static int pseries_wdt_suspend(struct platform_device *pdev, pm_message_t state)
+{
+	struct pseries_wdt *w;
+
+	w = platform_get_drvdata(pdev);
+	return pseries_wdt_stop(&w->wd);
+}
+
+static int pseries_wdt_resume(struct platform_device *pdev)
+{
+	struct pseries_wdt *w;
+
+	w = platform_get_drvdata(pdev);
+	return pseries_wdt_start(&w->wd);
+}
+
+static struct platform_driver pseries_wdt_driver = {
+	.probe = pseries_wdt_probe,
+	.remove	= pseries_wdt_remove,
+	.suspend = pseries_wdt_suspend,
+	.resume = pseries_wdt_resume,
+	.driver = {
+		.name = DRV_NAME,
+		.owner = THIS_MODULE,
+	},
+};
+
+static int __init pseries_wdt_init_module(void)
+{
+	unsigned long ret[PLPAR_HCALL_BUFSIZE] = { 0 };
+	unsigned long cap;
+	long rc;
+	int err;
+
+	rc = plpar_hcall(H_WATCHDOG, ret, PSERIES_WDTF_OP_QUERY);
+	if (rc != H_SUCCESS) {
+		if (rc == H_FUNCTION) {
+			pr_info("hypervisor does not support H_WATCHDOG");
+			return -ENODEV;
+		}
+		pr_err("H_WATCHDOG: %ld: capability query failed", rc);
+		return -EIO;
+	}
+	cap = ret[0];
+	min_timeout = roundup(PSERIES_WDTQ_MIN_TIMEOUT(cap), 1000) / 1000;
+	pr_info("hypervisor supports %lu timer(s), %lums minimum timeout",
+		PSERIES_WDTQ_MAX_NUMBER(cap), PSERIES_WDTQ_MIN_TIMEOUT(cap));
+
+	err = platform_driver_register(&pseries_wdt_driver);
+	if (err)
+		return err;
+
+	/*
+	 * For the moment we only expose the first timer to userspace.
+	 * In the future we could expose more.
+	 */
+	pseries_wdt_pdev = platform_device_register_simple(DRV_NAME,
+							   -1, NULL, 0);
+	if (IS_ERR(pseries_wdt_pdev)) {
+		platform_driver_unregister(&pseries_wdt_driver);
+		return PTR_ERR(pseries_wdt_pdev);
+	}
+
+	return 0;
+}
+
+static void __exit pseries_wdt_cleanup_module(void)
+{
+	platform_device_unregister(pseries_wdt_pdev);
+	platform_driver_unregister(&pseries_wdt_driver);
+}
+
+module_init(pseries_wdt_init_module);
+module_exit(pseries_wdt_cleanup_module);
+
+MODULE_AUTHOR("Alexey Kardashevskiy <aik@ozlabs.ru>");
+MODULE_AUTHOR("Scott Cheloha <cheloha@linux.ibm.com>");
+MODULE_DESCRIPTION("POWER Architecture Platform Watchdog Timer Driver");
+MODULE_LICENSE("GPL");
-- 
2.27.0


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [RFC v1 0/2] Add driver for PAPR watchdog timers
  2022-04-13 16:51 [RFC v1 0/2] Add driver for PAPR watchdog timers Scott Cheloha
  2022-04-13 16:51 ` [RFC v1 1/2] powerpc/pseries: hvcall.h: add definitions for H_WATCHDOG hypercall Scott Cheloha
  2022-04-13 16:51 ` [RFC v1 2/2] watchdog: pseries-wdt: initial support for PAPR virtual watchdog timers Scott Cheloha
@ 2022-04-14  2:23 ` Guenter Roeck
  2022-04-14 12:39   ` Nathan Lynch
  2022-04-19  8:49 ` Alexey Kardashevskiy
  3 siblings, 1 reply; 10+ messages in thread
From: Guenter Roeck @ 2022-04-14  2:23 UTC (permalink / raw)
  To: Scott Cheloha, linux-watchdog
  Cc: bjking, nathanl, aik, npiggin, vaishnavi, wvoigt

On 4/13/22 09:51, Scott Cheloha wrote:
> This series adds a driver for PAPR hypercall-based watchdog timers,
> tentatively named "pseries-wdt".
> 
> I wanted to get some clarification on a few things before submitting
> the series as a patch, hence the RFC.  The first patch adding the
> hypercall to hvcall.h is straightforward, but I have questions about
> the second patch (the driver).  In particular:
> 
> - In pseries_wdt_probe() we register the watchdog device with
>    devm_watchdog_register_device().  However, in pseries_wdt_remove(),
>    calling watchdog_unregister_devce() causes a kernel panic later,
>    so I assume this is the wrong thing to do.
> 

The whole point of using devm_ functions is to handle cleanup (or removal)
automatically. I would suggest to make yourself familiar with the concept.

>    Do we need to do anything to clean up the watchdog device during
>    pseries_wdt_remove()?  Or does devm_watchdog_register_device()
>    ensure the cleanup is handled transparently?
> 
> - In pseries_wdt_probe(), is it incorrect to devm_kfree() my
>    allocation in the event that devm_watchdog_register_device()
>    fails?
> 

No. Same thing.

> - The enormous hypercall input/output comment is mostly for my
>    edification.  It seems like the sort of thing that will rot over time.
>    I intend to remove most of it.  However, as far as I know the PAPR
>    revision containing these details is not published yet.  Should I
>    leave the comment in to ease review for now and remove it later?
>    Or should I omit it from the initial commit entirely?
> 
> - Should we print something to the console when probing/removing the
>    watchdog0 device or is that just noise?
> 
It is just noise, but some developers insist on it.

>    Most drivers (as distinct from devices) seem to print something
>    during initialization, so that's what I've done in
>    pseries_wdt_module_init() when the capability query succeeds.
> 
No. If you have to print something, print it during probe. module init
noise is even worse. And those error messages in the init function are
completely unacceptable.

Anyway, doesn't pseries support devicetree ? Why is this driver not
instantiated through a devicetree node ?

Guenter

> - The timeout action is currently hardcoded to a hard reset.  This
>    could be made configurable through a module parameter.  I intend
>    to do this in a later patch unless someone needs it included
>    in the initial patch.
> 
> - We set EIO if the hypercall fails in pseries_wdt_start() or
>    pseries_wdt_stop().  There is nothing userspace can do if this
>    happens.  All hypercall failures in these contexts are unexpected.
> 
>    Given all of that, is there is a more appropriate errno than EIO?
> 
> - The H_WATCHDOG spec indicates that H_BUSY is possible.  Is it
>    probable, though?  Should we spin and retry the hypercall in
>    the event that we see it?  Or is that pointless?
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC v1 2/2] watchdog: pseries-wdt: initial support for PAPR virtual watchdog timers
  2022-04-13 16:51 ` [RFC v1 2/2] watchdog: pseries-wdt: initial support for PAPR virtual watchdog timers Scott Cheloha
@ 2022-04-14  3:20   ` Tzung-Bi Shih
  2022-04-14  3:48     ` Guenter Roeck
  0 siblings, 1 reply; 10+ messages in thread
From: Tzung-Bi Shih @ 2022-04-14  3:20 UTC (permalink / raw)
  To: Scott Cheloha
  Cc: linux-watchdog, bjking, nathanl, aik, npiggin, vaishnavi, wvoigt

On Wed, Apr 13, 2022 at 11:51:04AM -0500, Scott Cheloha wrote:
> diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
> index c4e82a8d863f..f3e6db5bed74 100644
> --- a/drivers/watchdog/Kconfig
> +++ b/drivers/watchdog/Kconfig
> @@ -1941,6 +1941,14 @@ config WATCHDOG_RTAS
>  	  To compile this driver as a module, choose M here. The module
>  	  will be called wdrtas.
>  
> +config PSERIES_WDT
> +	tristate "POWER Architecture Platform Watchdog Timer"
> +	depends on PPC_PSERIES
> +	select WATCHDOG_CORE
> +	help
> +	  Driver for virtual watchdog timers provided by PAPR
> +	  hypervisors (e.g. pHyp, KVM).
> +

To maintain alphabetical order, the option should be prior to WATCHDOG_RTAS.

> diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
> index f7da867e8782..3ae1f7d9f1ec 100644
> --- a/drivers/watchdog/Makefile
> +++ b/drivers/watchdog/Makefile
> @@ -185,6 +185,7 @@ obj-$(CONFIG_MEN_A21_WDT) += mena21_wdt.o
>  
>  # PPC64 Architecture
>  obj-$(CONFIG_WATCHDOG_RTAS) += wdrtas.o
> +obj-$(CONFIG_PSERIES_WDT) += pseries-wdt.o

Same here.

> diff --git a/drivers/watchdog/pseries-wdt.c b/drivers/watchdog/pseries-wdt.c
> new file mode 100644
> index 000000000000..0d22ddf12a7f
> --- /dev/null
> +++ b/drivers/watchdog/pseries-wdt.c
> @@ -0,0 +1,337 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +
> +#define DRV_NAME "pseries-wdt"
> +#define pr_fmt(fmt) DRV_NAME ": " fmt

`pr_fmt` is unused.

> +/*
> + * The PAPR's MSB->LSB bit ordering is is 0->63.  These macros
> + * simplify defining bitfields as described in the PAPR without
> + * needing to transpose values to the more C-like 63->0 ordering.
> + */
> +#define SETFIELD(_v, _b, _e)	\
> +    (((unsigned long)(_v) << PPC_BITLSHIFT(_e)) & PPC_BITMASK(_b, _e))
> +#define GETFIELD(_v, _b, _e)	\
> +    (((unsigned long)(_v) & PPC_BITMASK(_b, _e)) >> PPC_BITLSHIFT(_e))

Would it be better to enclose parentheses for _b and _e in PPC_BITMASK()?

> +/*
> + * R5: "watchdogNumber":
> + *
> + *     The target watchdog.  Watchdog numbers are 1-based.  The
> + *     maximum supported watchdog number may be obtained via the
> + *     "Query Watchdog Capabilities" operation.
> + *
> + *     This input is ignored for the "Query Watchdog Capabilities"
> + *     operation.
> + *
> + * R6: "timeoutInMs":
> + *
> + *     The timeout in milliseconds.  The minimum supported timeout may
> + *     be obtained via the "Query Watchdog Capabilities" operation.
> + *
> + *     This input is ignored for the "Stop Watchdog", "Query Watchdog
> + *     Capabilities", and "Query LPM Requirement" operations.
> + */
> +
> +/*
> + * H_WATCHDOG Hypercall Output
> + *
> + * R3: Return code
> + *
> + *     H_SUCCESS    The operation completed.
> + *
> + *     H_BUSY	    The hypervisor is too busy; retry the operation.
> + *
> + *     H_PARAMETER  The given "flags" are somehow invalid.  Either the
> + *                  "operation" or "timeoutAction" is invalid, or a
> + *                  reserved bit is set.
> + *
> + *     H_P2         The given "watchdogNumber" is zero or exceeds the
> + *                  supported maximum value.
> + *
> + *     H_P3         The given "timeoutInMs" is below the supported
> + *                  minimum value.
> + *
> + *     H_NOOP       The given "watchdogNumber" is already stopped.
> + *
> + *     H_HARDWARE   The operation failed for ineffable reasons.
> + *
> + *     H_FUNCTION   The H_WATCHDOG hypercall is not supported by this
> + *                  hypervisor.

The above registers/bits have no corresponding macros for manipulating.  Are
they constructing?

> +static struct platform_device *pseries_wdt_pdev;

I wonder if it could be dropped.  See below.

> +static unsigned long action = PSERIES_WDTF_ACTION_HARD_RESTART;

It looks like `action` can only be in the scope of pseries_wdt_start().

> +static unsigned int min_timeout = 0;

I wonder if it could be dropped.  See below.

> +struct pseries_wdt {
> +	struct watchdog_device wd;
> +	unsigned long num;		/* NB: Watchdog numbers are 1-based */
> +};
> +#define wd_to_pseries_wdt(ptr)	container_of(ptr, struct pseries_wdt, wd)

Given that it already calls watchdog_set_drvdata() in pseries_wdt_probe(), it
doesn't need the container_of() to get the struct pseries_wdt.

> +static int pseries_wdt_start(struct watchdog_device *wdd)
> +{
> +	struct pseries_wdt *pw = wd_to_pseries_wdt(wdd);

Use watchdog_get_drvdata().

> +	rc = plpar_hcall_norets(H_WATCHDOG, flags, pw->num, msecs);
> +	if (rc != H_SUCCESS) {
> +		pr_crit("H_WATCHDOG: %ld: failed to start timer %lu",
> +			rc, pw->num);

If it really needs to print something, save &pdev->dev in struct pseries_wdt
in pseries_wdt_probe() and use dev_err() here.

> +		return -EIO;	/* XXX What is the right errno here? */

I think it is fine as long as the errno makes sense for the context.  Watchdog
framework doesn't propagate the errno[1].

[1]: https://elixir.bootlin.com/linux/v5.18-rc2/source/drivers/watchdog/watchdog_core.c#L166

> +static int pseries_wdt_stop(struct watchdog_device *wdd)
> +{
> +	struct pseries_wdt *pw = wd_to_pseries_wdt(wdd);

Use watchdog_get_drvdata().

> +	rc = plpar_hcall_norets(H_WATCHDOG, PSERIES_WDTF_OP_STOP, pw->num);
> +	if (rc != H_SUCCESS) {
> +		pr_crit("H_WATCHDOG: %ld: failed to stop timer %lu",
> +			rc, pw->num);

Ditto.

> +static int pseries_wdt_probe(struct platform_device *pdev)
> +{
> +	struct device *dev = &pdev->dev;
> +	struct pseries_wdt *pw;
> +	int err = 0;

The initialization is pointless.  `err` is going to be overriden soon.

> +	pw = devm_kzalloc(dev, sizeof *pw, GFP_KERNEL);
> +	if (pw == NULL)

To be concise, use !pw.

> +	/* XXX Is it appropriate to call devm_kfree() on registration error? */
> +	err = devm_watchdog_register_device(dev, &pw->wd);
> +	if (err) {
> +		devm_kfree(dev, pw);

No.  devm_* delegate the responsibility to device.  It doesn't need to call
devm_kfree().

> +	/*
> +	 * XXX Should we print something to the console about the new
> +	 * pseudo-device?  If so, what?
> +	 */
> +	pr_info("watchdog%d probed\n", pw->wd.id);

Up to it.  Use dev_info() or dev_dbg() here if it really needs to print
something.

> +static int pseries_wdt_remove(struct platform_device *pdev)
> +{
> +	struct pseries_wdt *pw = platform_get_drvdata(pdev);
> +
> +	/* XXX Should we say something about removing the pseudo-device? */
> +	pr_info("watchdog%d removed\n", pw->wd.id);

Ditto.

> +	/*
> +	 * XXX Calling watchdog_unregister_device() here causes the kernel
> +	 * to panic later.  What is the proper way to clean up a watchdog
> +	 * device at module unload time?
> +	 */
> +#if 0
> +	watchdog_unregister_device(&pw->wd);
> +#endif

It doesn't need to call watchdog_unregister_device().  The life cycle is
already delegated to the device if it calls devm_watchdog_register_device().

> +static int pseries_wdt_suspend(struct platform_device *pdev, pm_message_t state)
> +{
> +	struct pseries_wdt *w;
> +
> +	w = platform_get_drvdata(pdev);

To be concise, inline the assignment.  I.e.
struct pseries_wdt *w = platform_get_drvdata(pdev);

> +	return pseries_wdt_stop(&w->wd);

Taking other watchdog drivers as examples[2][3], doesn't it need to check by
watchdog_active()?

[2]: https://elixir.bootlin.com/linux/v5.18-rc2/source/drivers/watchdog/mtk_wdt.c#L399
[3]: https://elixir.bootlin.com/linux/v5.18-rc2/source/drivers/watchdog/imx_sc_wdt.c#L220

> +static int pseries_wdt_resume(struct platform_device *pdev)
> +{
> +	struct pseries_wdt *w;
> +
> +	w = platform_get_drvdata(pdev);

Ditto.

> +	return pseries_wdt_start(&w->wd);

Share the same concern for pseries_wdt_suspend().

> +static struct platform_driver pseries_wdt_driver = {
> +	.probe = pseries_wdt_probe,
> +	.remove	= pseries_wdt_remove,
> +	.suspend = pseries_wdt_suspend,
> +	.resume = pseries_wdt_resume,
> +	.driver = {
> +		.name = DRV_NAME,
> +		.owner = THIS_MODULE,
> +	},
> +};

Taking other watchdog drivers as examples[4][5], their .suspend and .resume ops
bound to the struct device_driver instead of struct platform_driver.

I have no idea what could be the difference.  Maybe others on the mailing list
could help to answer.

[4]: https://elixir.bootlin.com/linux/v5.18-rc2/source/drivers/watchdog/mtk_wdt.c#L437
[5]: https://elixir.bootlin.com/linux/v5.18-rc2/source/drivers/watchdog/imx_sc_wdt.c#L250

> +static int __init pseries_wdt_init_module(void)
> +{
> +	unsigned long ret[PLPAR_HCALL_BUFSIZE] = { 0 };
> +	unsigned long cap;
> +	long rc;
> +	int err;
> +
> +	rc = plpar_hcall(H_WATCHDOG, ret, PSERIES_WDTF_OP_QUERY);
> +	if (rc != H_SUCCESS) {
> +		if (rc == H_FUNCTION) {
> +			pr_info("hypervisor does not support H_WATCHDOG");
> +			return -ENODEV;
> +		}
> +		pr_err("H_WATCHDOG: %ld: capability query failed", rc);
> +		return -EIO;
> +	}
> +	cap = ret[0];
> +	min_timeout = roundup(PSERIES_WDTQ_MIN_TIMEOUT(cap), 1000) / 1000;
> +	pr_info("hypervisor supports %lu timer(s), %lums minimum timeout",
> +		PSERIES_WDTQ_MAX_NUMBER(cap), PSERIES_WDTQ_MIN_TIMEOUT(cap));

Could these bits be in pseries_wdt_probe()?

> +
> +	err = platform_driver_register(&pseries_wdt_driver);
> +	if (err)
> +		return err;
> +
> +	/*
> +	 * For the moment we only expose the first timer to userspace.
> +	 * In the future we could expose more.
> +	 */
> +	pseries_wdt_pdev = platform_device_register_simple(DRV_NAME,
> +							   -1, NULL, 0);
> +	if (IS_ERR(pseries_wdt_pdev)) {
> +		platform_driver_unregister(&pseries_wdt_driver);
> +		return PTR_ERR(pseries_wdt_pdev);
> +	}
> +
> +	return 0;
> +}
> +
> +static void __exit pseries_wdt_cleanup_module(void)
> +{
> +	platform_device_unregister(pseries_wdt_pdev);
> +	platform_driver_unregister(&pseries_wdt_driver);
> +}
> +
> +module_init(pseries_wdt_init_module);
> +module_exit(pseries_wdt_cleanup_module);

If the plpar_hcall() check and `min_timeout` initialzation could be in
pseries_wdt_probe(), the whole blocks can be replaced by
module_platform_driver().

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC v1 2/2] watchdog: pseries-wdt: initial support for PAPR virtual watchdog timers
  2022-04-14  3:20   ` Tzung-Bi Shih
@ 2022-04-14  3:48     ` Guenter Roeck
  0 siblings, 0 replies; 10+ messages in thread
From: Guenter Roeck @ 2022-04-14  3:48 UTC (permalink / raw)
  To: Tzung-Bi Shih
  Cc: Scott Cheloha, linux-watchdog, bjking, nathanl, aik, npiggin,
	vaishnavi, wvoigt

On Thu, Apr 14, 2022 at 11:20:08AM +0800, Tzung-Bi Shih wrote:
> On Wed, Apr 13, 2022 at 11:51:04AM -0500, Scott Cheloha wrote:
> > diff --git a/drivers/watchdog/Kconfig b/drivers/watchdog/Kconfig
> > index c4e82a8d863f..f3e6db5bed74 100644
> > --- a/drivers/watchdog/Kconfig
> > +++ b/drivers/watchdog/Kconfig
> > @@ -1941,6 +1941,14 @@ config WATCHDOG_RTAS
> >  	  To compile this driver as a module, choose M here. The module
> >  	  will be called wdrtas.
> >  
> > +config PSERIES_WDT
> > +	tristate "POWER Architecture Platform Watchdog Timer"
> > +	depends on PPC_PSERIES
> > +	select WATCHDOG_CORE
> > +	help
> > +	  Driver for virtual watchdog timers provided by PAPR
> > +	  hypervisors (e.g. pHyp, KVM).
> > +
> 
> To maintain alphabetical order, the option should be prior to WATCHDOG_RTAS.
> 
> > diff --git a/drivers/watchdog/Makefile b/drivers/watchdog/Makefile
> > index f7da867e8782..3ae1f7d9f1ec 100644
> > --- a/drivers/watchdog/Makefile
> > +++ b/drivers/watchdog/Makefile
> > @@ -185,6 +185,7 @@ obj-$(CONFIG_MEN_A21_WDT) += mena21_wdt.o
> >  
> >  # PPC64 Architecture
> >  obj-$(CONFIG_WATCHDOG_RTAS) += wdrtas.o
> > +obj-$(CONFIG_PSERIES_WDT) += pseries-wdt.o
> 
> Same here.
> 
> > diff --git a/drivers/watchdog/pseries-wdt.c b/drivers/watchdog/pseries-wdt.c
> > new file mode 100644
> > index 000000000000..0d22ddf12a7f
> > --- /dev/null
> > +++ b/drivers/watchdog/pseries-wdt.c
> > @@ -0,0 +1,337 @@
> > +/* SPDX-License-Identifier: GPL-2.0-or-later */
> > +
> > +#define DRV_NAME "pseries-wdt"
> > +#define pr_fmt(fmt) DRV_NAME ": " fmt
> 
> `pr_fmt` is unused.
> 

That is just a neat trick to get subsequent pr_xxx to print the driver name
as prefix.

> 
> > +static int __init pseries_wdt_init_module(void)
> > +{
> > +	unsigned long ret[PLPAR_HCALL_BUFSIZE] = { 0 };
> > +	unsigned long cap;
> > +	long rc;
> > +	int err;
> > +
> > +	rc = plpar_hcall(H_WATCHDOG, ret, PSERIES_WDTF_OP_QUERY);
> > +	if (rc != H_SUCCESS) {
> > +		if (rc == H_FUNCTION) {
> > +			pr_info("hypervisor does not support H_WATCHDOG");
> > +			return -ENODEV;
> > +		}
> > +		pr_err("H_WATCHDOG: %ld: capability query failed", rc);
> > +		return -EIO;

The init function should not print any messages. Loading the driver on
a platform not supporting it should have no effect and print no messages.

> > +	}
> > +	cap = ret[0];
> > +	min_timeout = roundup(PSERIES_WDTQ_MIN_TIMEOUT(cap), 1000) / 1000;
> > +	pr_info("hypervisor supports %lu timer(s), %lums minimum timeout",
> > +		PSERIES_WDTQ_MAX_NUMBER(cap), PSERIES_WDTQ_MIN_TIMEOUT(cap));
> 
> Could these bits be in pseries_wdt_probe()?

Yes. Also, if values have to be passed to the driver, the init code should
pass it as platform data. There should be no static variables to pass
values to the probe function.

Thanks,
Guenter

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC v1 0/2] Add driver for PAPR watchdog timers
  2022-04-14  2:23 ` [RFC v1 0/2] Add driver for PAPR " Guenter Roeck
@ 2022-04-14 12:39   ` Nathan Lynch
  0 siblings, 0 replies; 10+ messages in thread
From: Nathan Lynch @ 2022-04-14 12:39 UTC (permalink / raw)
  To: Guenter Roeck
  Cc: bjking, aik, npiggin, vaishnavi, Scott Cheloha, linux-watchdog

Guenter Roeck <linux@roeck-us.net> writes:
> Anyway, doesn't pseries support devicetree ? Why is this driver not
> instantiated through a devicetree node ?

It's not ideal, but this facility doesn't have a device tree
representation specified in the platform architecture. It has to be
discovered through hypervisor calls.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC v1 0/2] Add driver for PAPR watchdog timers
  2022-04-13 16:51 [RFC v1 0/2] Add driver for PAPR watchdog timers Scott Cheloha
                   ` (2 preceding siblings ...)
  2022-04-14  2:23 ` [RFC v1 0/2] Add driver for PAPR " Guenter Roeck
@ 2022-04-19  8:49 ` Alexey Kardashevskiy
  2022-04-19 13:55   ` Guenter Roeck
  3 siblings, 1 reply; 10+ messages in thread
From: Alexey Kardashevskiy @ 2022-04-19  8:49 UTC (permalink / raw)
  To: Scott Cheloha, linux-watchdog; +Cc: bjking, nathanl, npiggin, vaishnavi, wvoigt



On 14/04/2022 02:51, Scott Cheloha wrote:
> This series adds a driver for PAPR hypercall-based watchdog timers,
> tentatively named "pseries-wdt".
> 
> I wanted to get some clarification on a few things before submitting
> the series as a patch, hence the RFC.  The first patch adding the
> hypercall to hvcall.h is straightforward, but I have questions about
> the second patch (the driver).  In particular:
> 
> - In pseries_wdt_probe() we register the watchdog device with
>    devm_watchdog_register_device().  However, in pseries_wdt_remove(),
>    calling watchdog_unregister_devce() causes a kernel panic later,
>    so I assume this is the wrong thing to do.


It should have been devm_watchdog_unregister_device() (no difference 
though) and what was the backtrace? Most watchdog drivers do it this way 
  :-/


>    Do we need to do anything to clean up the watchdog device during
>    pseries_wdt_remove()?  Or does devm_watchdog_register_device()
>    ensure the cleanup is handled transparently?
> 
> - In pseries_wdt_probe(), is it incorrect to devm_kfree() my
>    allocation in the event that devm_watchdog_register_device()
>    fails?

I am pretty sure nothing is going to free the memory you allocated in 
devm_kzalloc() as you do not even pass the allocated pointer to 
devm_watchdog_register_device(), it is an offset. The only reason 
devm_kfree(&pw->wd) won't barf1 is @wd is the first member of the 
pseries_wdt struct.


> - The enormous hypercall input/output comment is mostly for my
>    edification.  It seems like the sort of thing that will rot over time.
>    I intend to remove most of it.  However, as far as I know the PAPR
>    revision containing these details is not published yet.  Should I
>    leave the comment in to ease review for now and remove it later?
>    Or should I omit it from the initial commit entirely?

I'd probably remove some empty lines and add shorter comments inline, like:

+/* Bits 56-63: "timeoutAction" */
+#define PSERIES_WDTF_ACTION(ac)			SETFIELD(ac, 56, 63)
+#define PSERIES_WDTF_ACTION_HARD_POWEROFF	PSERIES_WDTF_ACTION(0x1) // 
"Hard poweroff"
+#define PSERIES_WDTF_ACTION_HARD_RESTART	PSERIES_WDTF_ACTION(0x2) // 
"Hard restart"
+#define PSERIES_WDTF_ACTION_DUMP_RESTART	PSERIES_WDTF_ACTION(0x3) // 
"Dump restart"


The quoted text would tell what to search literally for in the PAPR spec 
when it is updated.


> - Should we print something to the console when probing/removing the
>    watchdog0 device or is that just noise?
> 
>    Most drivers (as distinct from devices) seem to print something
>    during initialization, so that's what I've done in
>    pseries_wdt_module_init() when the capability query succeeds.


I'd say it is noise but since the watchdog is not represented in the 
device tree, there is really no other way of knowing if it is running 
(unless it is a module?).

One line message in pseries_wdt_probe() with 
PSERIES_WDTQ_MAX_NUMBER/PSERIES_WDTQ_MIN_TIMEOUT should do.


> - The timeout action is currently hardcoded to a hard reset.  This
>    could be made configurable through a module parameter.  I intend
>    to do this in a later patch unless someone needs it included
>    in the initial patch.

Make it in the initial patch, it is just a few lines.


> - We set EIO if the hypercall fails in pseries_wdt_start() or
>    pseries_wdt_stop().  There is nothing userspace can do if this
>    happens.  All hypercall failures in these contexts are unexpected.

The userspace can log the event, send an email, "sync && reboot", dunno.

>    Given all of that, is there is a more appropriate errno than EIO?
> 
> - The H_WATCHDOG spec indicates that H_BUSY is possible.  Is it
>    probable, though?  Should we spin and retry the hypercall in
>    the event that we see it?  Or is that pointless?


Looks like the other parts of pseries do retry after calling cond_resched().

> 
> 

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC v1 0/2] Add driver for PAPR watchdog timers
  2022-04-19  8:49 ` Alexey Kardashevskiy
@ 2022-04-19 13:55   ` Guenter Roeck
  0 siblings, 0 replies; 10+ messages in thread
From: Guenter Roeck @ 2022-04-19 13:55 UTC (permalink / raw)
  To: Alexey Kardashevskiy, Scott Cheloha, linux-watchdog
  Cc: bjking, nathanl, npiggin, vaishnavi, wvoigt

On 4/19/22 01:49, Alexey Kardashevskiy wrote:
> 
> 
> On 14/04/2022 02:51, Scott Cheloha wrote:
>> This series adds a driver for PAPR hypercall-based watchdog timers,
>> tentatively named "pseries-wdt".
>>
>> I wanted to get some clarification on a few things before submitting
>> the series as a patch, hence the RFC.  The first patch adding the
>> hypercall to hvcall.h is straightforward, but I have questions about
>> the second patch (the driver).  In particular:
>>
>> - In pseries_wdt_probe() we register the watchdog device with
>>    devm_watchdog_register_device().  However, in pseries_wdt_remove(),
>>    calling watchdog_unregister_devce() causes a kernel panic later,
>>    so I assume this is the wrong thing to do.
> 
> 
> It should have been devm_watchdog_unregister_device() (no difference though) and what was the backtrace? Most watchdog drivers do it this way  :-/
> 

Please make yourself familiar with devm_ functions and their use.
There is no exported devm_watchdog_unregister_device() because it is
not needed.

> 
>>    Do we need to do anything to clean up the watchdog device during
>>    pseries_wdt_remove()?  Or does devm_watchdog_register_device()
>>    ensure the cleanup is handled transparently?
>>
>> - In pseries_wdt_probe(), is it incorrect to devm_kfree() my
>>    allocation in the event that devm_watchdog_register_device()
>>    fails?
> 
> I am pretty sure nothing is going to free the memory you allocated in devm_kzalloc() as you do not even pass the allocated pointer to devm_watchdog_register_device(), it is an offset. The only reason devm_kfree(&pw->wd) won't barf1 is @wd is the first member of the pseries_wdt struct.
> 

Again, please make yourself familiar with devm_ functions
and their use.

Guenter

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFC v1 0/2] Add driver for PAPR watchdog timers
@ 2022-04-13 16:48 Scott Cheloha
  0 siblings, 0 replies; 10+ messages in thread
From: Scott Cheloha @ 2022-04-13 16:48 UTC (permalink / raw)
  To: linux-watchdog; +Cc: bjking, nlynch, aik, npiggin, vaishnavi, wvoigt

This series adds a driver for PAPR hypercall-based watchdog timers,
tentatively named "pseries-wdt".

I wanted to get some clarification on a few things before submitting
the series as a patch, hence the RFC.  The first patch adding the
hypercall to hvcall.h is straightforward, but I have questions about
the second patch (the driver).  In particular:

- In pseries_wdt_probe() we register the watchdog device with
  devm_watchdog_register_device().  However, in pseries_wdt_remove(),
  calling watchdog_unregister_devce() causes a kernel panic later,
  so I assume this is the wrong thing to do.

  Do we need to do anything to clean up the watchdog device during
  pseries_wdt_remove()?  Or does devm_watchdog_register_device()
  ensure the cleanup is handled transparently?

- In pseries_wdt_probe(), is it incorrect to devm_kfree() my
  allocation in the event that devm_watchdog_register_device()
  fails?

- The enormous hypercall input/output comment is mostly for my
  edification.  It seems like the sort of thing that will rot over time.
  I intend to remove most of it.  However, as far as I know the PAPR
  revision containing these details is not published yet.  Should I
  leave the comment in to ease review for now and remove it later?
  Or should I omit it from the initial commit entirely?

- Should we print something to the console when probing/removing the
  watchdog0 device or is that just noise?

  Most drivers (as distinct from devices) seem to print something
  during initialization, so that's what I've done in
  pseries_wdt_module_init() when the capability query succeeds.

- The timeout action is currently hardcoded to a hard reset.  This
  could be made configurable through a module parameter.  I intend
  to do this in a later patch unless someone needs it included
  in the initial patch.

- We set EIO if the hypercall fails in pseries_wdt_start() or
  pseries_wdt_stop().  There is nothing userspace can do if this
  happens.  All hypercall failures in these contexts are unexpected.

  Given all of that, is there is a more appropriate errno than EIO?

- The H_WATCHDOG spec indicates that H_BUSY is possible.  Is it
  probable, though?  Should we spin and retry the hypercall in
  the event that we see it?  Or is that pointless?



^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-04-19 13:57 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-13 16:51 [RFC v1 0/2] Add driver for PAPR watchdog timers Scott Cheloha
2022-04-13 16:51 ` [RFC v1 1/2] powerpc/pseries: hvcall.h: add definitions for H_WATCHDOG hypercall Scott Cheloha
2022-04-13 16:51 ` [RFC v1 2/2] watchdog: pseries-wdt: initial support for PAPR virtual watchdog timers Scott Cheloha
2022-04-14  3:20   ` Tzung-Bi Shih
2022-04-14  3:48     ` Guenter Roeck
2022-04-14  2:23 ` [RFC v1 0/2] Add driver for PAPR " Guenter Roeck
2022-04-14 12:39   ` Nathan Lynch
2022-04-19  8:49 ` Alexey Kardashevskiy
2022-04-19 13:55   ` Guenter Roeck
  -- strict thread matches above, loose matches on Subject: below --
2022-04-13 16:48 Scott Cheloha

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.