linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v9 0/2] Detect stalls on guest vCPUS
@ 2022-07-01 14:40 Sebastian Ene
  2022-07-01 14:40 ` [PATCH v9 1/2] dt-bindings: vcpu_stall_detector: Add qemu,vcpu-stall-detector compatible Sebastian Ene
  2022-07-01 14:40 ` [PATCH v9 2/2] misc: Add a mechanism to detect stalls on guest vCPUs Sebastian Ene
  0 siblings, 2 replies; 10+ messages in thread
From: Sebastian Ene @ 2022-07-01 14:40 UTC (permalink / raw)
  To: Rob Herring, Greg Kroah-Hartman, Arnd Bergmann, Dragan Cvetic
  Cc: linux-kernel, devicetree, maz, will, vdonnefort, Guenter Roeck,
	Sebastian Ene

This adds a mechanism to detect stalls on the guest vCPUS by creating a
per CPU hrtimer which periodically 'pets' the host backend driver.
On a conventional watchdog-core driver, the userspace is responsible for
delivering the 'pet' events by writing to the particular /dev/watchdogN node.
In this case we require a strong thread affinity to be able to
account for lost time on a per vCPU basis.

This device driver acts as a soft lockup detector by relying on the host
backend driver to measure the elapesed time between subsequent 'pet' events.
If the elapsed time doesn't match an expected value, the backend driver
decides that the guest vCPU is locked and resets the guest. The host
backend driver takes into account the time that the guest is not
running. The communication with the backend driver is done through MMIO
and the register layout of the virtual watchdog is described as part of
the backend driver changes.

The host backend driver is implemented as part of:
https://chromium-review.googlesource.com/c/chromiumos/platform/crosvm/+/3548817

Changelog v9:
 - make the driver depend on CONFIG_OF
 - remove the platform_(set|get)_drvdata calls and keep a per-cpu static
   variable `vm_stall_detect` as suggested by Guenter on the (v8) series
 - improve commit description and fix styling

Changelog v8:
 - fix the yamlint dtschema warning caused by the missing 'reg' property 

Changelog v7:
 - fix the dtschema warnings for 'timeout-sec' property
 - rename vcpu_stall_detector.yaml to qemu,vcpu_stall_detector.yaml and
   place the file under misc
 - improve the Kconfig description for the driver by making it KVM
   specific

Changelog v6:
 - fix issues reported by lkp@intel robot:
     building for ARCH=h8300 incorrect type in assignment
     (different address spaces)

Sebastian Ene (2):
  dt-bindings: vcpu_stall_detector: Add qemu,vcpu-stall-detector
    compatible
  misc: Add a mechanism to detect stalls on guest vCPUs

 .../misc/qemu,vcpu-stall-detector.yaml        |  51 +++++
 drivers/misc/Kconfig                          |  13 ++
 drivers/misc/Makefile                         |   1 +
 drivers/misc/vcpu_stall_detector.c            | 212 ++++++++++++++++++
 4 files changed, 277 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/misc/qemu,vcpu-stall-detector.yaml
 create mode 100644 drivers/misc/vcpu_stall_detector.c

-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH v9 1/2] dt-bindings: vcpu_stall_detector: Add qemu,vcpu-stall-detector compatible
  2022-07-01 14:40 [PATCH v9 0/2] Detect stalls on guest vCPUS Sebastian Ene
@ 2022-07-01 14:40 ` Sebastian Ene
  2022-07-01 21:01   ` Rob Herring
  2022-07-01 14:40 ` [PATCH v9 2/2] misc: Add a mechanism to detect stalls on guest vCPUs Sebastian Ene
  1 sibling, 1 reply; 10+ messages in thread
From: Sebastian Ene @ 2022-07-01 14:40 UTC (permalink / raw)
  To: Rob Herring, Greg Kroah-Hartman, Arnd Bergmann, Dragan Cvetic
  Cc: linux-kernel, devicetree, maz, will, vdonnefort, Guenter Roeck,
	Sebastian Ene

The VCPU stall detection mechanism allows to configure the expiration
duration and the internal counter clock frequency measured in Hz.
Add these properties in the schema.

While this is a memory mapped virtual device, it is expected to be loaded
when the DT contains the compatible: "qemu,vcpu-stall-detector" node.
In a protected VM we trust the generated DT nodes and we don't rely on
the host to present the hardware peripherals.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 .../misc/qemu,vcpu-stall-detector.yaml        | 51 +++++++++++++++++++
 1 file changed, 51 insertions(+)
 create mode 100644 Documentation/devicetree/bindings/misc/qemu,vcpu-stall-detector.yaml

diff --git a/Documentation/devicetree/bindings/misc/qemu,vcpu-stall-detector.yaml b/Documentation/devicetree/bindings/misc/qemu,vcpu-stall-detector.yaml
new file mode 100644
index 000000000000..1aebeb696ee0
--- /dev/null
+++ b/Documentation/devicetree/bindings/misc/qemu,vcpu-stall-detector.yaml
@@ -0,0 +1,51 @@
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/misc/qemu,vcpu-stall-detector.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: VCPU stall detector
+
+description:
+  This binding describes a CPU stall detector mechanism for virtual CPUs
+  which is accessed through MMIO.
+
+maintainers:
+  - Sebastian Ene <sebastianene@google.com>
+
+properties:
+  compatible:
+    enum:
+      - qemu,vcpu-stall-detector
+
+  reg:
+    maxItems: 1
+
+  clock-frequency:
+    $ref: /schemas/types.yaml#/definitions/uint32
+    description: |
+      The internal clock of the stall detector peripheral measure in Hz used
+      to decrement its internal counter register on each tick.
+      Defaults to 10 if unset.
+    default: 10
+
+  timeout-sec:
+    description: |
+      The stall detector expiration timeout measured in seconds.
+      Defaults to 8 if unset. Please note that it also takes into account the
+      time spent while the VCPU is not running.
+    default: 8
+
+required:
+  - compatible
+
+additionalProperties: false
+
+examples:
+  - |
+    vmwdt@9030000 {
+      compatible = "qemu,vcpu-stall-detector";
+      reg = <0x9030000 0x10000>;
+      clock-frequency = <10>;
+      timeout-sec = <8>;
+    };
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH v9 2/2] misc: Add a mechanism to detect stalls on guest vCPUs
  2022-07-01 14:40 [PATCH v9 0/2] Detect stalls on guest vCPUS Sebastian Ene
  2022-07-01 14:40 ` [PATCH v9 1/2] dt-bindings: vcpu_stall_detector: Add qemu,vcpu-stall-detector compatible Sebastian Ene
@ 2022-07-01 14:40 ` Sebastian Ene
  2022-07-01 14:52   ` Guenter Roeck
  2022-07-06 15:21   ` Will Deacon
  1 sibling, 2 replies; 10+ messages in thread
From: Sebastian Ene @ 2022-07-01 14:40 UTC (permalink / raw)
  To: Rob Herring, Greg Kroah-Hartman, Arnd Bergmann, Dragan Cvetic
  Cc: linux-kernel, devicetree, maz, will, vdonnefort, Guenter Roeck,
	Sebastian Ene

This driver creates per-cpu hrtimers which are required to do the
periodic 'pet' operation. On a conventional watchdog-core driver, the
userspace is responsible for delivering the 'pet' events by writing to
the particular /dev/watchdogN node. In this case we require a strong
thread affinity to be able to account for lost time on a per vCPU.

This part of the driver is the 'frontend' which is reponsible for
delivering the periodic 'pet' events, configuring the virtual peripheral
and listening for cpu hotplug events. The other part of the driver is
an emulated MMIO device which is part of the KVM virtual machine
monitor and this part accounts for lost time by looking at the
/proc/{}/task/{}/stat entries.

Signed-off-by: Sebastian Ene <sebastianene@google.com>
---
 drivers/misc/Kconfig               |  13 ++
 drivers/misc/Makefile              |   1 +
 drivers/misc/vcpu_stall_detector.c | 212 +++++++++++++++++++++++++++++
 3 files changed, 226 insertions(+)
 create mode 100644 drivers/misc/vcpu_stall_detector.c

diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 41d2bb0ae23a..83afb41a85cf 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -483,6 +483,19 @@ config OPEN_DICE
 
 	  If unsure, say N.
 
+config VCPU_STALL_DETECTOR
+	tristate "VCPU stall detector"
+	select LOCKUP_DETECTOR
+	depends on OF
+	help
+	  Detect CPU locks on a kvm virtual machine. This driver relies on
+	  the hrtimers which are CPU-binded to do the 'pet' operation. When a
+	  vCPU has to do a 'pet', it exits the guest through MMIO write and
+	  the backend driver takes into account the lost ticks for this
+	  particular CPU.
+	  To compile this driver as a module, choose M here: the
+	  module will be called vcpu_stall_detector.
+
 source "drivers/misc/c2port/Kconfig"
 source "drivers/misc/eeprom/Kconfig"
 source "drivers/misc/cb710/Kconfig"
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 70e800e9127f..2be8542616dd 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -60,3 +60,4 @@ obj-$(CONFIG_XILINX_SDFEC)	+= xilinx_sdfec.o
 obj-$(CONFIG_HISI_HIKEY_USB)	+= hisi_hikey_usb.o
 obj-$(CONFIG_HI6421V600_IRQ)	+= hi6421v600-irq.o
 obj-$(CONFIG_OPEN_DICE)		+= open-dice.o
+obj-$(CONFIG_VCPU_STALL_DETECTOR)	+= vcpu_stall_detector.o
\ No newline at end of file
diff --git a/drivers/misc/vcpu_stall_detector.c b/drivers/misc/vcpu_stall_detector.c
new file mode 100644
index 000000000000..039ac54564c1
--- /dev/null
+++ b/drivers/misc/vcpu_stall_detector.c
@@ -0,0 +1,212 @@
+// SPDX-License-Identifier: GPL-2.0
+//
+// VCPU stall detector.
+//  Copyright (C) Google, 2022
+
+#include <linux/cpu.h>
+#include <linux/init.h>
+#include <linux/io.h>
+#include <linux/kernel.h>
+
+#include <linux/device.h>
+#include <linux/interrupt.h>
+#include <linux/module.h>
+#include <linux/nmi.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/param.h>
+#include <linux/percpu.h>
+#include <linux/platform_device.h>
+#include <linux/slab.h>
+
+#define REG_STATUS		(0x00)
+#define REG_LOAD_CNT		(0x04)
+#define REG_CURRENT_CNT		(0x08)
+#define REG_CLOCK_FREQ_HZ	(0x0C)
+#define REG_LEN			(0x10)
+
+#define DEFAULT_CLOCK_HZ	(10)
+#define DEFAULT_TIMEOT_SEC	(8)
+
+struct vm_stall_detect_s {
+	void __iomem *membase;
+	u32 clock_freq;
+	u32 expiration_sec;
+	u32 ping_timeout_ms;
+	struct hrtimer per_cpu_hrtimer;
+	struct platform_device *dev;
+};
+
+#define vcpu_stall_detect_reg_write(stall_detect, reg, value)	\
+	iowrite32((value), (stall_detect)->membase + (reg))
+#define vcpu_stall_detect_reg_read(stall_detect, reg)		\
+	io32read((stall_detect)->membase + (reg))
+
+static struct vm_stall_detect_s __percpu *vm_stall_detect;
+
+static enum hrtimer_restart
+vcpu_stall_detect_timer_fn(struct hrtimer *hrtimer)
+{
+	struct vm_stall_detect_s *cpu_stall_detect;
+	u32 ticks;
+
+	cpu_stall_detect = container_of(hrtimer, struct vm_stall_detect_s,
+					per_cpu_hrtimer);
+	ticks = cpu_stall_detect->clock_freq * cpu_stall_detect->expiration_sec;
+	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_LOAD_CNT, ticks);
+	hrtimer_forward_now(hrtimer,
+			    ms_to_ktime(cpu_stall_detect->ping_timeout_ms));
+
+	return HRTIMER_RESTART;
+}
+
+static void vcpu_stall_detect_start(void *arg)
+{
+	u32 ticks;
+	struct vm_stall_detect_s *cpu_stall_detect = arg;
+	struct hrtimer *hrtimer = &cpu_stall_detect->per_cpu_hrtimer;
+
+	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_CLOCK_FREQ_HZ,
+				    cpu_stall_detect->clock_freq);
+
+	/* Compute the number of ticks required for the stall detector counter
+	 * register based on the internal clock frequency and the timeout
+	 * value given from the device tree.
+	 */
+	ticks = cpu_stall_detect->clock_freq *
+		cpu_stall_detect->expiration_sec;
+	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_LOAD_CNT, ticks);
+
+	/* Enable the internal clock and start the stall detector */
+	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_STATUS, 1);
+
+	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+	hrtimer->function = vcpu_stall_detect_timer_fn;
+	hrtimer_start(hrtimer, ms_to_ktime(cpu_stall_detect->ping_timeout_ms),
+		      HRTIMER_MODE_REL_PINNED);
+}
+
+static void vcpu_stall_detect_stop(void *arg)
+{
+	struct vm_stall_detect_s *cpu_stall_detect = arg;
+	struct hrtimer *hrtimer = &cpu_stall_detect->per_cpu_hrtimer;
+
+	hrtimer_cancel(hrtimer);
+
+	/* Disable the stall detector */
+	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_STATUS, 0);
+}
+
+static int start_stall_detector_on_cpu(unsigned int cpu)
+{
+	vcpu_stall_detect_start(this_cpu_ptr(vm_stall_detect));
+	return 0;
+}
+
+static int stop_stall_detector_on_cpu(unsigned int cpu)
+{
+	vcpu_stall_detect_stop(this_cpu_ptr(vm_stall_detect));
+	return 0;
+}
+
+static int vcpu_stall_detect_probe(struct platform_device *dev)
+{
+	int cpu, ret, err;
+	void __iomem *membase;
+	struct resource *r;
+	u32 stall_detect_clock, stall_detect_timeout_sec = 0;
+
+	r = platform_get_resource(dev, IORESOURCE_MEM, 0);
+	if (r == NULL)
+		return -ENODEV;
+
+	vm_stall_detect = alloc_percpu(typeof(struct vm_stall_detect_s));
+	if (!vm_stall_detect)
+		return -ENOMEM;
+
+	membase = ioremap(r->start, resource_size(r));
+	if (!membase) {
+		ret = -ENOMEM;
+		goto err_withmem;
+	}
+
+	if (of_property_read_u32(dev->dev.of_node, "clock-frequency",
+				 &stall_detect_clock))
+		stall_detect_clock = DEFAULT_CLOCK_HZ;
+
+	if (of_property_read_u32(dev->dev.of_node, "timeout-sec",
+				 &stall_detect_timeout_sec))
+		stall_detect_timeout_sec = DEFAULT_TIMEOT_SEC;
+
+	for_each_cpu_and(cpu, cpu_online_mask, &watchdog_cpumask) {
+		struct vm_stall_detect_s *cpu_stall_detect;
+
+		cpu_stall_detect = per_cpu_ptr(vm_stall_detect, cpu);
+		cpu_stall_detect->membase = membase + cpu * REG_LEN;
+		cpu_stall_detect->clock_freq = stall_detect_clock;
+		cpu_stall_detect->expiration_sec = stall_detect_timeout_sec;
+
+		/* Pet the stall detector at half of its expiration timeout
+		 * to prevent spurios resets.
+		 */
+		cpu_stall_detect->ping_timeout_ms = stall_detect_timeout_sec *
+			MSEC_PER_SEC / 2;
+		smp_call_function_single(cpu, vcpu_stall_detect_start,
+					 cpu_stall_detect, true);
+	}
+
+	err = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
+					"virt/vcpu_stall_detector:online",
+					start_stall_detector_on_cpu,
+					stop_stall_detector_on_cpu);
+	if (err < 0) {
+		dev_err(&dev->dev, "failed to install cpu hotplug");
+		ret = err;
+		goto err_withmem;
+	}
+
+	return 0;
+
+err_withmem:
+	free_percpu(vm_stall_detect);
+	return ret;
+}
+
+static int vcpu_stall_detect_remove(struct platform_device *dev)
+{
+	int cpu;
+
+	for_each_cpu_and(cpu, cpu_online_mask, &watchdog_cpumask) {
+		struct vm_stall_detect_s *cpu_stall_detect;
+
+		cpu_stall_detect = per_cpu_ptr(vm_stall_detect, cpu);
+		smp_call_function_single(cpu, vcpu_stall_detect_stop,
+					 cpu_stall_detect, true);
+	}
+
+	free_percpu(vm_stall_detect);
+	vm_stall_detect = NULL;
+	return 0;
+}
+
+static const struct of_device_id vcpu_stall_detect_of_match[] = {
+	{ .compatible = "qemu,vcpu-stall-detector", },
+	{}
+};
+
+MODULE_DEVICE_TABLE(of, vcpu_stall_detect_of_match);
+
+static struct platform_driver vcpu_stall_detect_driver = {
+	.probe  = vcpu_stall_detect_probe,
+	.remove = vcpu_stall_detect_remove,
+	.driver = {
+		.name           = KBUILD_MODNAME,
+		.of_match_table = vcpu_stall_detect_of_match,
+	},
+};
+
+module_platform_driver(vcpu_stall_detect_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("Sebastian Ene <sebastianene@google.com>");
+MODULE_DESCRIPTION("VCPU stall detector");
-- 
2.37.0.rc0.161.g10f37bed90-goog


^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v9 2/2] misc: Add a mechanism to detect stalls on guest vCPUs
  2022-07-01 14:40 ` [PATCH v9 2/2] misc: Add a mechanism to detect stalls on guest vCPUs Sebastian Ene
@ 2022-07-01 14:52   ` Guenter Roeck
  2022-07-06 15:21   ` Will Deacon
  1 sibling, 0 replies; 10+ messages in thread
From: Guenter Roeck @ 2022-07-01 14:52 UTC (permalink / raw)
  To: Sebastian Ene, Rob Herring, Greg Kroah-Hartman, Arnd Bergmann,
	Dragan Cvetic
  Cc: linux-kernel, devicetree, maz, will, vdonnefort

On 7/1/22 07:40, Sebastian Ene wrote:
> This driver creates per-cpu hrtimers which are required to do the
> periodic 'pet' operation. On a conventional watchdog-core driver, the
> userspace is responsible for delivering the 'pet' events by writing to
> the particular /dev/watchdogN node. In this case we require a strong
> thread affinity to be able to account for lost time on a per vCPU.
> 
> This part of the driver is the 'frontend' which is reponsible for
> delivering the periodic 'pet' events, configuring the virtual peripheral
> and listening for cpu hotplug events. The other part of the driver is
> an emulated MMIO device which is part of the KVM virtual machine
> monitor and this part accounts for lost time by looking at the
> /proc/{}/task/{}/stat entries.
> 
> Signed-off-by: Sebastian Ene <sebastianene@google.com>

Reviewed-by: Guenter Roeck <linux@roeck-us.net>

> ---
>   drivers/misc/Kconfig               |  13 ++
>   drivers/misc/Makefile              |   1 +
>   drivers/misc/vcpu_stall_detector.c | 212 +++++++++++++++++++++++++++++
>   3 files changed, 226 insertions(+)
>   create mode 100644 drivers/misc/vcpu_stall_detector.c
> 
> diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
> index 41d2bb0ae23a..83afb41a85cf 100644
> --- a/drivers/misc/Kconfig
> +++ b/drivers/misc/Kconfig
> @@ -483,6 +483,19 @@ config OPEN_DICE
>   
>   	  If unsure, say N.
>   
> +config VCPU_STALL_DETECTOR
> +	tristate "VCPU stall detector"
> +	select LOCKUP_DETECTOR
> +	depends on OF
> +	help
> +	  Detect CPU locks on a kvm virtual machine. This driver relies on
> +	  the hrtimers which are CPU-binded to do the 'pet' operation. When a
> +	  vCPU has to do a 'pet', it exits the guest through MMIO write and
> +	  the backend driver takes into account the lost ticks for this
> +	  particular CPU.
> +	  To compile this driver as a module, choose M here: the
> +	  module will be called vcpu_stall_detector.
> +
>   source "drivers/misc/c2port/Kconfig"
>   source "drivers/misc/eeprom/Kconfig"
>   source "drivers/misc/cb710/Kconfig"
> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> index 70e800e9127f..2be8542616dd 100644
> --- a/drivers/misc/Makefile
> +++ b/drivers/misc/Makefile
> @@ -60,3 +60,4 @@ obj-$(CONFIG_XILINX_SDFEC)	+= xilinx_sdfec.o
>   obj-$(CONFIG_HISI_HIKEY_USB)	+= hisi_hikey_usb.o
>   obj-$(CONFIG_HI6421V600_IRQ)	+= hi6421v600-irq.o
>   obj-$(CONFIG_OPEN_DICE)		+= open-dice.o
> +obj-$(CONFIG_VCPU_STALL_DETECTOR)	+= vcpu_stall_detector.o
> \ No newline at end of file
> diff --git a/drivers/misc/vcpu_stall_detector.c b/drivers/misc/vcpu_stall_detector.c
> new file mode 100644
> index 000000000000..039ac54564c1
> --- /dev/null
> +++ b/drivers/misc/vcpu_stall_detector.c
> @@ -0,0 +1,212 @@
> +// SPDX-License-Identifier: GPL-2.0
> +//
> +// VCPU stall detector.
> +//  Copyright (C) Google, 2022
> +
> +#include <linux/cpu.h>
> +#include <linux/init.h>
> +#include <linux/io.h>
> +#include <linux/kernel.h>
> +
> +#include <linux/device.h>
> +#include <linux/interrupt.h>
> +#include <linux/module.h>
> +#include <linux/nmi.h>
> +#include <linux/of.h>
> +#include <linux/of_device.h>
> +#include <linux/param.h>
> +#include <linux/percpu.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +
> +#define REG_STATUS		(0x00)
> +#define REG_LOAD_CNT		(0x04)
> +#define REG_CURRENT_CNT		(0x08)
> +#define REG_CLOCK_FREQ_HZ	(0x0C)
> +#define REG_LEN			(0x10)
> +
> +#define DEFAULT_CLOCK_HZ	(10)
> +#define DEFAULT_TIMEOT_SEC	(8)
> +
> +struct vm_stall_detect_s {
> +	void __iomem *membase;
> +	u32 clock_freq;
> +	u32 expiration_sec;
> +	u32 ping_timeout_ms;
> +	struct hrtimer per_cpu_hrtimer;
> +	struct platform_device *dev;
> +};
> +
> +#define vcpu_stall_detect_reg_write(stall_detect, reg, value)	\
> +	iowrite32((value), (stall_detect)->membase + (reg))
> +#define vcpu_stall_detect_reg_read(stall_detect, reg)		\
> +	io32read((stall_detect)->membase + (reg))
> +
> +static struct vm_stall_detect_s __percpu *vm_stall_detect;
> +
> +static enum hrtimer_restart
> +vcpu_stall_detect_timer_fn(struct hrtimer *hrtimer)
> +{
> +	struct vm_stall_detect_s *cpu_stall_detect;
> +	u32 ticks;
> +
> +	cpu_stall_detect = container_of(hrtimer, struct vm_stall_detect_s,
> +					per_cpu_hrtimer);
> +	ticks = cpu_stall_detect->clock_freq * cpu_stall_detect->expiration_sec;
> +	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_LOAD_CNT, ticks);
> +	hrtimer_forward_now(hrtimer,
> +			    ms_to_ktime(cpu_stall_detect->ping_timeout_ms));
> +
> +	return HRTIMER_RESTART;
> +}
> +
> +static void vcpu_stall_detect_start(void *arg)
> +{
> +	u32 ticks;
> +	struct vm_stall_detect_s *cpu_stall_detect = arg;
> +	struct hrtimer *hrtimer = &cpu_stall_detect->per_cpu_hrtimer;
> +
> +	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_CLOCK_FREQ_HZ,
> +				    cpu_stall_detect->clock_freq);
> +
> +	/* Compute the number of ticks required for the stall detector counter
> +	 * register based on the internal clock frequency and the timeout
> +	 * value given from the device tree.
> +	 */
> +	ticks = cpu_stall_detect->clock_freq *
> +		cpu_stall_detect->expiration_sec;
> +	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_LOAD_CNT, ticks);
> +
> +	/* Enable the internal clock and start the stall detector */
> +	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_STATUS, 1);
> +
> +	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +	hrtimer->function = vcpu_stall_detect_timer_fn;
> +	hrtimer_start(hrtimer, ms_to_ktime(cpu_stall_detect->ping_timeout_ms),
> +		      HRTIMER_MODE_REL_PINNED);
> +}
> +
> +static void vcpu_stall_detect_stop(void *arg)
> +{
> +	struct vm_stall_detect_s *cpu_stall_detect = arg;
> +	struct hrtimer *hrtimer = &cpu_stall_detect->per_cpu_hrtimer;
> +
> +	hrtimer_cancel(hrtimer);
> +
> +	/* Disable the stall detector */
> +	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_STATUS, 0);
> +}
> +
> +static int start_stall_detector_on_cpu(unsigned int cpu)
> +{
> +	vcpu_stall_detect_start(this_cpu_ptr(vm_stall_detect));
> +	return 0;
> +}
> +
> +static int stop_stall_detector_on_cpu(unsigned int cpu)
> +{
> +	vcpu_stall_detect_stop(this_cpu_ptr(vm_stall_detect));
> +	return 0;
> +}
> +
> +static int vcpu_stall_detect_probe(struct platform_device *dev)
> +{
> +	int cpu, ret, err;
> +	void __iomem *membase;
> +	struct resource *r;
> +	u32 stall_detect_clock, stall_detect_timeout_sec = 0;
> +
> +	r = platform_get_resource(dev, IORESOURCE_MEM, 0);
> +	if (r == NULL)
> +		return -ENODEV;
> +
> +	vm_stall_detect = alloc_percpu(typeof(struct vm_stall_detect_s));
> +	if (!vm_stall_detect)
> +		return -ENOMEM;
> +
> +	membase = ioremap(r->start, resource_size(r));
> +	if (!membase) {
> +		ret = -ENOMEM;
> +		goto err_withmem;
> +	}
> +
> +	if (of_property_read_u32(dev->dev.of_node, "clock-frequency",
> +				 &stall_detect_clock))
> +		stall_detect_clock = DEFAULT_CLOCK_HZ;
> +
> +	if (of_property_read_u32(dev->dev.of_node, "timeout-sec",
> +				 &stall_detect_timeout_sec))
> +		stall_detect_timeout_sec = DEFAULT_TIMEOT_SEC;
> +
> +	for_each_cpu_and(cpu, cpu_online_mask, &watchdog_cpumask) {
> +		struct vm_stall_detect_s *cpu_stall_detect;
> +
> +		cpu_stall_detect = per_cpu_ptr(vm_stall_detect, cpu);
> +		cpu_stall_detect->membase = membase + cpu * REG_LEN;
> +		cpu_stall_detect->clock_freq = stall_detect_clock;
> +		cpu_stall_detect->expiration_sec = stall_detect_timeout_sec;
> +
> +		/* Pet the stall detector at half of its expiration timeout
> +		 * to prevent spurios resets.
> +		 */
> +		cpu_stall_detect->ping_timeout_ms = stall_detect_timeout_sec *
> +			MSEC_PER_SEC / 2;
> +		smp_call_function_single(cpu, vcpu_stall_detect_start,
> +					 cpu_stall_detect, true);
> +	}
> +
> +	err = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
> +					"virt/vcpu_stall_detector:online",
> +					start_stall_detector_on_cpu,
> +					stop_stall_detector_on_cpu);
> +	if (err < 0) {
> +		dev_err(&dev->dev, "failed to install cpu hotplug");
> +		ret = err;
> +		goto err_withmem;
> +	}
> +
> +	return 0;
> +
> +err_withmem:
> +	free_percpu(vm_stall_detect);
> +	return ret;
> +}
> +
> +static int vcpu_stall_detect_remove(struct platform_device *dev)
> +{
> +	int cpu;
> +
> +	for_each_cpu_and(cpu, cpu_online_mask, &watchdog_cpumask) {
> +		struct vm_stall_detect_s *cpu_stall_detect;
> +
> +		cpu_stall_detect = per_cpu_ptr(vm_stall_detect, cpu);
> +		smp_call_function_single(cpu, vcpu_stall_detect_stop,
> +					 cpu_stall_detect, true);
> +	}
> +
> +	free_percpu(vm_stall_detect);
> +	vm_stall_detect = NULL;
> +	return 0;
> +}
> +
> +static const struct of_device_id vcpu_stall_detect_of_match[] = {
> +	{ .compatible = "qemu,vcpu-stall-detector", },
> +	{}
> +};
> +
> +MODULE_DEVICE_TABLE(of, vcpu_stall_detect_of_match);
> +
> +static struct platform_driver vcpu_stall_detect_driver = {
> +	.probe  = vcpu_stall_detect_probe,
> +	.remove = vcpu_stall_detect_remove,
> +	.driver = {
> +		.name           = KBUILD_MODNAME,
> +		.of_match_table = vcpu_stall_detect_of_match,
> +	},
> +};
> +
> +module_platform_driver(vcpu_stall_detect_driver);
> +
> +MODULE_LICENSE("GPL");
> +MODULE_AUTHOR("Sebastian Ene <sebastianene@google.com>");
> +MODULE_DESCRIPTION("VCPU stall detector");


^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v9 1/2] dt-bindings: vcpu_stall_detector: Add qemu,vcpu-stall-detector compatible
  2022-07-01 14:40 ` [PATCH v9 1/2] dt-bindings: vcpu_stall_detector: Add qemu,vcpu-stall-detector compatible Sebastian Ene
@ 2022-07-01 21:01   ` Rob Herring
  0 siblings, 0 replies; 10+ messages in thread
From: Rob Herring @ 2022-07-01 21:01 UTC (permalink / raw)
  To: Sebastian Ene
  Cc: Guenter Roeck, Rob Herring, Arnd Bergmann, devicetree, maz,
	vdonnefort, Greg Kroah-Hartman, will, Dragan Cvetic,
	linux-kernel

On Fri, 01 Jul 2022 14:40:13 +0000, Sebastian Ene wrote:
> The VCPU stall detection mechanism allows to configure the expiration
> duration and the internal counter clock frequency measured in Hz.
> Add these properties in the schema.
> 
> While this is a memory mapped virtual device, it is expected to be loaded
> when the DT contains the compatible: "qemu,vcpu-stall-detector" node.
> In a protected VM we trust the generated DT nodes and we don't rely on
> the host to present the hardware peripherals.
> 
> Signed-off-by: Sebastian Ene <sebastianene@google.com>
> ---
>  .../misc/qemu,vcpu-stall-detector.yaml        | 51 +++++++++++++++++++
>  1 file changed, 51 insertions(+)
>  create mode 100644 Documentation/devicetree/bindings/misc/qemu,vcpu-stall-detector.yaml
> 

Reviewed-by: Rob Herring <robh@kernel.org>

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v9 2/2] misc: Add a mechanism to detect stalls on guest vCPUs
  2022-07-01 14:40 ` [PATCH v9 2/2] misc: Add a mechanism to detect stalls on guest vCPUs Sebastian Ene
  2022-07-01 14:52   ` Guenter Roeck
@ 2022-07-06 15:21   ` Will Deacon
  2022-07-06 15:50     ` Greg Kroah-Hartman
  2022-07-07 13:17     ` Sebastian Ene
  1 sibling, 2 replies; 10+ messages in thread
From: Will Deacon @ 2022-07-06 15:21 UTC (permalink / raw)
  To: Sebastian Ene
  Cc: Rob Herring, Greg Kroah-Hartman, Arnd Bergmann, Dragan Cvetic,
	linux-kernel, devicetree, maz, vdonnefort, Guenter Roeck

Hi Sebastian,

Thanks for working on this and apologies it's taken so long for me to
review your changes.

On Fri, Jul 01, 2022 at 02:40:14PM +0000, Sebastian Ene wrote:
> This driver creates per-cpu hrtimers which are required to do the
> periodic 'pet' operation. On a conventional watchdog-core driver, the
> userspace is responsible for delivering the 'pet' events by writing to
> the particular /dev/watchdogN node. In this case we require a strong
> thread affinity to be able to account for lost time on a per vCPU.
> 
> This part of the driver is the 'frontend' which is reponsible for
> delivering the periodic 'pet' events, configuring the virtual peripheral
> and listening for cpu hotplug events. The other part of the driver is
> an emulated MMIO device which is part of the KVM virtual machine
> monitor and this part accounts for lost time by looking at the
> /proc/{}/task/{}/stat entries.
> 
> Signed-off-by: Sebastian Ene <sebastianene@google.com>
> ---
>  drivers/misc/Kconfig               |  13 ++
>  drivers/misc/Makefile              |   1 +
>  drivers/misc/vcpu_stall_detector.c | 212 +++++++++++++++++++++++++++++
>  3 files changed, 226 insertions(+)
>  create mode 100644 drivers/misc/vcpu_stall_detector.c
> 
> diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
> index 41d2bb0ae23a..83afb41a85cf 100644
> --- a/drivers/misc/Kconfig
> +++ b/drivers/misc/Kconfig
> @@ -483,6 +483,19 @@ config OPEN_DICE
>  
>  	  If unsure, say N.
>  
> +config VCPU_STALL_DETECTOR
> +	tristate "VCPU stall detector"

I'd probably stick "Guest" in here somewhere, as this driver won't be
useful on the host. So something like "Guest vCPU stall detector".

> +	select LOCKUP_DETECTOR
> +	depends on OF
> +	help
> +	  Detect CPU locks on a kvm virtual machine. This driver relies on
> +	  the hrtimers which are CPU-binded to do the 'pet' operation. When a
> +	  vCPU has to do a 'pet', it exits the guest through MMIO write and
> +	  the backend driver takes into account the lost ticks for this
> +	  particular CPU.
> +	  To compile this driver as a module, choose M here: the
> +	  module will be called vcpu_stall_detector.

I suggest rewording this along the lines of:

  When this driver is bound inside a KVM guest, it will periodically
  "pet" an MMIO stall detector device from each vCPU and allow the host
  to detect vCPU stalls.

  To compile this driver as a module, choose M here: the module will
  be called vcpu_stall_detector.

  If you do not intend to run this kernel as a guest, say N.

but up to you.

>  source "drivers/misc/c2port/Kconfig"
>  source "drivers/misc/eeprom/Kconfig"
>  source "drivers/misc/cb710/Kconfig"
> diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> index 70e800e9127f..2be8542616dd 100644
> --- a/drivers/misc/Makefile
> +++ b/drivers/misc/Makefile
> @@ -60,3 +60,4 @@ obj-$(CONFIG_XILINX_SDFEC)	+= xilinx_sdfec.o
>  obj-$(CONFIG_HISI_HIKEY_USB)	+= hisi_hikey_usb.o
>  obj-$(CONFIG_HI6421V600_IRQ)	+= hi6421v600-irq.o
>  obj-$(CONFIG_OPEN_DICE)		+= open-dice.o
> +obj-$(CONFIG_VCPU_STALL_DETECTOR)	+= vcpu_stall_detector.o
> \ No newline at end of file
> diff --git a/drivers/misc/vcpu_stall_detector.c b/drivers/misc/vcpu_stall_detector.c
> new file mode 100644
> index 000000000000..039ac54564c1
> --- /dev/null
> +++ b/drivers/misc/vcpu_stall_detector.c
> @@ -0,0 +1,212 @@
> +// SPDX-License-Identifier: GPL-2.0

This should probably be "GPL-2.0-only" for the avoidance of doubt.

> +//
> +// VCPU stall detector.
> +//  Copyright (C) Google, 2022
> +
> +#include <linux/cpu.h>
> +#include <linux/init.h>
> +#include <linux/io.h>
> +#include <linux/kernel.h>
> +
> +#include <linux/device.h>
> +#include <linux/interrupt.h>
> +#include <linux/module.h>
> +#include <linux/nmi.h>
> +#include <linux/of.h>
> +#include <linux/of_device.h>
> +#include <linux/param.h>
> +#include <linux/percpu.h>
> +#include <linux/platform_device.h>
> +#include <linux/slab.h>
> +
> +#define REG_STATUS		(0x00)
> +#define REG_LOAD_CNT		(0x04)
> +#define REG_CURRENT_CNT		(0x08)
> +#define REG_CLOCK_FREQ_HZ	(0x0C)
> +#define REG_LEN			(0x10)
> +
> +#define DEFAULT_CLOCK_HZ	(10)
> +#define DEFAULT_TIMEOT_SEC	(8)

TIMEOUT instead of TIMEOT?

Generally, I'd probably also prefix these definitions with VCPU_STALL_
or something to avoid the potential for conflicting with the headers.

> +struct vm_stall_detect_s {
> +	void __iomem *membase;
> +	u32 clock_freq;
> +	u32 expiration_sec;
> +	u32 ping_timeout_ms;
> +	struct hrtimer per_cpu_hrtimer;
> +	struct platform_device *dev;
> +};
> +
> +#define vcpu_stall_detect_reg_write(stall_detect, reg, value)	\
> +	iowrite32((value), (stall_detect)->membase + (reg))

You can use writel_relaxed() here as you don't require ordering guarantees
against DMA.

> +#define vcpu_stall_detect_reg_read(stall_detect, reg)		\
> +	io32read((stall_detect)->membase + (reg))

io32read() doesn't exist, but you also don't use this macro so just remove
it :)

> +
> +static struct vm_stall_detect_s __percpu *vm_stall_detect;
> +
> +static enum hrtimer_restart
> +vcpu_stall_detect_timer_fn(struct hrtimer *hrtimer)
> +{
> +	struct vm_stall_detect_s *cpu_stall_detect;
> +	u32 ticks;
> +
> +	cpu_stall_detect = container_of(hrtimer, struct vm_stall_detect_s,
> +					per_cpu_hrtimer);
> +	ticks = cpu_stall_detect->clock_freq * cpu_stall_detect->expiration_sec;
> +	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_LOAD_CNT, ticks);
> +	hrtimer_forward_now(hrtimer,
> +			    ms_to_ktime(cpu_stall_detect->ping_timeout_ms));
> +
> +	return HRTIMER_RESTART;
> +}
> +
> +static void vcpu_stall_detect_start(void *arg)
> +{
> +	u32 ticks;
> +	struct vm_stall_detect_s *cpu_stall_detect = arg;
> +	struct hrtimer *hrtimer = &cpu_stall_detect->per_cpu_hrtimer;
> +
> +	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_CLOCK_FREQ_HZ,
> +				    cpu_stall_detect->clock_freq);
> +
> +	/* Compute the number of ticks required for the stall detector counter
> +	 * register based on the internal clock frequency and the timeout
> +	 * value given from the device tree.
> +	 */
> +	ticks = cpu_stall_detect->clock_freq *
> +		cpu_stall_detect->expiration_sec;
> +	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_LOAD_CNT, ticks);
> +
> +	/* Enable the internal clock and start the stall detector */
> +	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_STATUS, 1);
> +
> +	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> +	hrtimer->function = vcpu_stall_detect_timer_fn;
> +	hrtimer_start(hrtimer, ms_to_ktime(cpu_stall_detect->ping_timeout_ms),
> +		      HRTIMER_MODE_REL_PINNED);
> +}
> +
> +static void vcpu_stall_detect_stop(void *arg)
> +{
> +	struct vm_stall_detect_s *cpu_stall_detect = arg;
> +	struct hrtimer *hrtimer = &cpu_stall_detect->per_cpu_hrtimer;
> +
> +	hrtimer_cancel(hrtimer);
> +
> +	/* Disable the stall detector */
> +	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_STATUS, 0);
> +}
> +
> +static int start_stall_detector_on_cpu(unsigned int cpu)
> +{
> +	vcpu_stall_detect_start(this_cpu_ptr(vm_stall_detect));
> +	return 0;
> +}
> +
> +static int stop_stall_detector_on_cpu(unsigned int cpu)
> +{
> +	vcpu_stall_detect_stop(this_cpu_ptr(vm_stall_detect));
> +	return 0;
> +}
> +
> +static int vcpu_stall_detect_probe(struct platform_device *dev)
> +{
> +	int cpu, ret, err;
> +	void __iomem *membase;
> +	struct resource *r;
> +	u32 stall_detect_clock, stall_detect_timeout_sec = 0;
> +
> +	r = platform_get_resource(dev, IORESOURCE_MEM, 0);
> +	if (r == NULL)
> +		return -ENODEV;
> +
> +	vm_stall_detect = alloc_percpu(typeof(struct vm_stall_detect_s));
> +	if (!vm_stall_detect)
> +		return -ENOMEM;
> +
> +	membase = ioremap(r->start, resource_size(r));
> +	if (!membase) {
> +		ret = -ENOMEM;
> +		goto err_withmem;
> +	}

Can you use devm_platform_get_and_ioremap_resource() to avoid calling
both platform_get_resource() and ioremap() explicitly? I think the devm_
handling there also means you don't need to worry about the iounmap(),
which seems to be missing here anyway.

> +
> +	if (of_property_read_u32(dev->dev.of_node, "clock-frequency",
> +				 &stall_detect_clock))
> +		stall_detect_clock = DEFAULT_CLOCK_HZ;
> +
> +	if (of_property_read_u32(dev->dev.of_node, "timeout-sec",
> +				 &stall_detect_timeout_sec))
> +		stall_detect_timeout_sec = DEFAULT_TIMEOT_SEC;
> +
> +	for_each_cpu_and(cpu, cpu_online_mask, &watchdog_cpumask) {

What prevents CPUs from coming up and down during this loop?

I also don't think we should be using 'watchdog_cpumask' here -- that's
part of the watchdog core; instead we should either take the mask in the
DT description or rely on the device exposing an interface for every CPU
in the system.

> +		struct vm_stall_detect_s *cpu_stall_detect;
> +
> +		cpu_stall_detect = per_cpu_ptr(vm_stall_detect, cpu);
> +		cpu_stall_detect->membase = membase + cpu * REG_LEN;
> +		cpu_stall_detect->clock_freq = stall_detect_clock;
> +		cpu_stall_detect->expiration_sec = stall_detect_timeout_sec;
> +
> +		/* Pet the stall detector at half of its expiration timeout
> +		 * to prevent spurios resets.

typo: spurious

> +		 */
> +		cpu_stall_detect->ping_timeout_ms = stall_detect_timeout_sec *
> +			MSEC_PER_SEC / 2;

All of these fields are the same for each core apart from the membase,
which a CPU can compute locally using its smp_processor_id() so I don't
think you need to include these in the per-cpu structure. It would be
simpler if you tracked these invariants in a separate structure (either
a global static structure or indirected by a per-cpu pointer).

Then the only thing left for the per-cpu structure would be the hrtimer.

> +		smp_call_function_single(cpu, vcpu_stall_detect_start,
> +					 cpu_stall_detect, true);
> +	}
> +
> +	err = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
> +					"virt/vcpu_stall_detector:online",
> +					start_stall_detector_on_cpu,
> +					stop_stall_detector_on_cpu);

If you drop the "_nocalls" variant then I think you can avoid the explicit
previous call to smp_call_function_single() entirely and just rely on the
notifiers to do the right thing. That also solves the hotplug race I
mentioned.

> +	if (err < 0) {
> +		dev_err(&dev->dev, "failed to install cpu hotplug");
> +		ret = err;
> +		goto err_withmem;
> +	}
> +
> +	return 0;
> +
> +err_withmem:
> +	free_percpu(vm_stall_detect);

Looks like there's a devm_alloc_percpu() which might make this a bit
simpler.

> +	return ret;
> +}
> +
> +static int vcpu_stall_detect_remove(struct platform_device *dev)
> +{
> +	int cpu;
> +
> +	for_each_cpu_and(cpu, cpu_online_mask, &watchdog_cpumask) {
> +		struct vm_stall_detect_s *cpu_stall_detect;
> +
> +		cpu_stall_detect = per_cpu_ptr(vm_stall_detect, cpu);
> +		smp_call_function_single(cpu, vcpu_stall_detect_stop,
> +					 cpu_stall_detect, true);
> +	}

I don't think you need to cross-call here, and it looks racy again with
hotplug; you should be able to do something like:

  1. Unregister the hotplug notifier (looks like you're missing this?)
  2. Call hrtimer_cancel() for each percpu timer
  3. Disable the watchdog for each CPU

> +	free_percpu(vm_stall_detect);
> +	vm_stall_detect = NULL;
> +	return 0;
> +}
> +
> +static const struct of_device_id vcpu_stall_detect_of_match[] = {
> +	{ .compatible = "qemu,vcpu-stall-detector", },

I'm not sure why we're mentioning qemu here; something as boring as "virt"
(like you use for the cpuhp notifier) or "dummy" might be better, but given
that Rob is happy with the binding then I won't complain further. We can
always add extra strings later if we need to.

> +	{}
> +};
> +
> +MODULE_DEVICE_TABLE(of, vcpu_stall_detect_of_match);
> +
> +static struct platform_driver vcpu_stall_detect_driver = {
> +	.probe  = vcpu_stall_detect_probe,
> +	.remove = vcpu_stall_detect_remove,
> +	.driver = {
> +		.name           = KBUILD_MODNAME,
> +		.of_match_table = vcpu_stall_detect_of_match,
> +	},
> +};
> +
> +module_platform_driver(vcpu_stall_detect_driver);
> +
> +MODULE_LICENSE("GPL");

This needs to be "GPL v2".

Will

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v9 2/2] misc: Add a mechanism to detect stalls on guest vCPUs
  2022-07-06 15:21   ` Will Deacon
@ 2022-07-06 15:50     ` Greg Kroah-Hartman
  2022-07-06 16:10       ` Will Deacon
  2022-07-07 13:17     ` Sebastian Ene
  1 sibling, 1 reply; 10+ messages in thread
From: Greg Kroah-Hartman @ 2022-07-06 15:50 UTC (permalink / raw)
  To: Will Deacon
  Cc: Sebastian Ene, Rob Herring, Arnd Bergmann, Dragan Cvetic,
	linux-kernel, devicetree, maz, vdonnefort, Guenter Roeck

On Wed, Jul 06, 2022 at 04:21:01PM +0100, Will Deacon wrote:
> > +MODULE_LICENSE("GPL");
> 
> This needs to be "GPL v2".

Nope, that is what this is saying, please see the huge comment in
include/linux/module.h right above MODULE_LICENSE().  So all is good
here.

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v9 2/2] misc: Add a mechanism to detect stalls on guest vCPUs
  2022-07-06 15:50     ` Greg Kroah-Hartman
@ 2022-07-06 16:10       ` Will Deacon
  2022-07-06 17:35         ` Greg Kroah-Hartman
  0 siblings, 1 reply; 10+ messages in thread
From: Will Deacon @ 2022-07-06 16:10 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Sebastian Ene, Rob Herring, Arnd Bergmann, Dragan Cvetic,
	linux-kernel, devicetree, maz, vdonnefort, Guenter Roeck

On Wed, Jul 06, 2022 at 05:50:14PM +0200, Greg Kroah-Hartman wrote:
> On Wed, Jul 06, 2022 at 04:21:01PM +0100, Will Deacon wrote:
> > > +MODULE_LICENSE("GPL");
> > 
> > This needs to be "GPL v2".
> 
> Nope, that is what this is saying, please see the huge comment in
> include/linux/module.h right above MODULE_LICENSE().  So all is good
> here.

Thanks, I stand corrected, although I personally still find it clearer to
spell it out!

Will

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v9 2/2] misc: Add a mechanism to detect stalls on guest vCPUs
  2022-07-06 16:10       ` Will Deacon
@ 2022-07-06 17:35         ` Greg Kroah-Hartman
  0 siblings, 0 replies; 10+ messages in thread
From: Greg Kroah-Hartman @ 2022-07-06 17:35 UTC (permalink / raw)
  To: Will Deacon
  Cc: Sebastian Ene, Rob Herring, Arnd Bergmann, Dragan Cvetic,
	linux-kernel, devicetree, maz, vdonnefort, Guenter Roeck

On Wed, Jul 06, 2022 at 05:10:50PM +0100, Will Deacon wrote:
> On Wed, Jul 06, 2022 at 05:50:14PM +0200, Greg Kroah-Hartman wrote:
> > On Wed, Jul 06, 2022 at 04:21:01PM +0100, Will Deacon wrote:
> > > > +MODULE_LICENSE("GPL");
> > > 
> > > This needs to be "GPL v2".
> > 
> > Nope, that is what this is saying, please see the huge comment in
> > include/linux/module.h right above MODULE_LICENSE().  So all is good
> > here.
> 
> Thanks, I stand corrected, although I personally still find it clearer to
> spell it out!

I totally agree, but there is no "you must do this" directive here :)

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [PATCH v9 2/2] misc: Add a mechanism to detect stalls on guest vCPUs
  2022-07-06 15:21   ` Will Deacon
  2022-07-06 15:50     ` Greg Kroah-Hartman
@ 2022-07-07 13:17     ` Sebastian Ene
  1 sibling, 0 replies; 10+ messages in thread
From: Sebastian Ene @ 2022-07-07 13:17 UTC (permalink / raw)
  To: Will Deacon
  Cc: Rob Herring, Greg Kroah-Hartman, Arnd Bergmann, Dragan Cvetic,
	linux-kernel, devicetree, maz, vdonnefort, Guenter Roeck

On Wed, Jul 06, 2022 at 04:21:01PM +0100, Will Deacon wrote:

Hi Will,

> Hi Sebastian,
> 
> Thanks for working on this and apologies it's taken so long for me to
> review your changes.
> 
> On Fri, Jul 01, 2022 at 02:40:14PM +0000, Sebastian Ene wrote:
> > This driver creates per-cpu hrtimers which are required to do the
> > periodic 'pet' operation. On a conventional watchdog-core driver, the
> > userspace is responsible for delivering the 'pet' events by writing to
> > the particular /dev/watchdogN node. In this case we require a strong
> > thread affinity to be able to account for lost time on a per vCPU.
> > 
> > This part of the driver is the 'frontend' which is reponsible for
> > delivering the periodic 'pet' events, configuring the virtual peripheral
> > and listening for cpu hotplug events. The other part of the driver is
> > an emulated MMIO device which is part of the KVM virtual machine
> > monitor and this part accounts for lost time by looking at the
> > /proc/{}/task/{}/stat entries.
> > 
> > Signed-off-by: Sebastian Ene <sebastianene@google.com>
> > ---
> >  drivers/misc/Kconfig               |  13 ++
> >  drivers/misc/Makefile              |   1 +
> >  drivers/misc/vcpu_stall_detector.c | 212 +++++++++++++++++++++++++++++
> >  3 files changed, 226 insertions(+)
> >  create mode 100644 drivers/misc/vcpu_stall_detector.c
> > 
> > diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
> > index 41d2bb0ae23a..83afb41a85cf 100644
> > --- a/drivers/misc/Kconfig
> > +++ b/drivers/misc/Kconfig
> > @@ -483,6 +483,19 @@ config OPEN_DICE
> >  
> >  	  If unsure, say N.
> >  
> > +config VCPU_STALL_DETECTOR
> > +	tristate "VCPU stall detector"
> 
> I'd probably stick "Guest" in here somewhere, as this driver won't be
> useful on the host. So something like "Guest vCPU stall detector".
> 

I improved the description with your suggesion.

> > +	select LOCKUP_DETECTOR
> > +	depends on OF
> > +	help
> > +	  Detect CPU locks on a kvm virtual machine. This driver relies on
> > +	  the hrtimers which are CPU-binded to do the 'pet' operation. When a
> > +	  vCPU has to do a 'pet', it exits the guest through MMIO write and
> > +	  the backend driver takes into account the lost ticks for this
> > +	  particular CPU.
> > +	  To compile this driver as a module, choose M here: the
> > +	  module will be called vcpu_stall_detector.
> 
> I suggest rewording this along the lines of:
> 
>   When this driver is bound inside a KVM guest, it will periodically
>   "pet" an MMIO stall detector device from each vCPU and allow the host
>   to detect vCPU stalls.
> 
>   To compile this driver as a module, choose M here: the module will
>   be called vcpu_stall_detector.
> 
>   If you do not intend to run this kernel as a guest, say N.
> 
> but up to you.
> 

I reworded this, thanks !

> >  source "drivers/misc/c2port/Kconfig"
> >  source "drivers/misc/eeprom/Kconfig"
> >  source "drivers/misc/cb710/Kconfig"
> > diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
> > index 70e800e9127f..2be8542616dd 100644
> > --- a/drivers/misc/Makefile
> > +++ b/drivers/misc/Makefile
> > @@ -60,3 +60,4 @@ obj-$(CONFIG_XILINX_SDFEC)	+= xilinx_sdfec.o
> >  obj-$(CONFIG_HISI_HIKEY_USB)	+= hisi_hikey_usb.o
> >  obj-$(CONFIG_HI6421V600_IRQ)	+= hi6421v600-irq.o
> >  obj-$(CONFIG_OPEN_DICE)		+= open-dice.o
> > +obj-$(CONFIG_VCPU_STALL_DETECTOR)	+= vcpu_stall_detector.o
> > \ No newline at end of file
> > diff --git a/drivers/misc/vcpu_stall_detector.c b/drivers/misc/vcpu_stall_detector.c
> > new file mode 100644
> > index 000000000000..039ac54564c1
> > --- /dev/null
> > +++ b/drivers/misc/vcpu_stall_detector.c
> > @@ -0,0 +1,212 @@
> > +// SPDX-License-Identifier: GPL-2.0
> 
> This should probably be "GPL-2.0-only" for the avoidance of doubt.
> 

I updated the license.

> > +//
> > +// VCPU stall detector.
> > +//  Copyright (C) Google, 2022
> > +
> > +#include <linux/cpu.h>
> > +#include <linux/init.h>
> > +#include <linux/io.h>
> > +#include <linux/kernel.h>
> > +
> > +#include <linux/device.h>
> > +#include <linux/interrupt.h>
> > +#include <linux/module.h>
> > +#include <linux/nmi.h>
> > +#include <linux/of.h>
> > +#include <linux/of_device.h>
> > +#include <linux/param.h>
> > +#include <linux/percpu.h>
> > +#include <linux/platform_device.h>
> > +#include <linux/slab.h>
> > +
> > +#define REG_STATUS		(0x00)
> > +#define REG_LOAD_CNT		(0x04)
> > +#define REG_CURRENT_CNT		(0x08)
> > +#define REG_CLOCK_FREQ_HZ	(0x0C)
> > +#define REG_LEN			(0x10)
> > +
> > +#define DEFAULT_CLOCK_HZ	(10)
> > +#define DEFAULT_TIMEOT_SEC	(8)
> 
> TIMEOUT instead of TIMEOT?
> 

I fixed the typo.

> Generally, I'd probably also prefix these definitions with VCPU_STALL_
> or something to avoid the potential for conflicting with the headers.
> 

I prefixed the definitions with VCPU_STALL_.

> > +struct vm_stall_detect_s {
> > +	void __iomem *membase;
> > +	u32 clock_freq;
> > +	u32 expiration_sec;
> > +	u32 ping_timeout_ms;
> > +	struct hrtimer per_cpu_hrtimer;
> > +	struct platform_device *dev;
> > +};
> > +
> > +#define vcpu_stall_detect_reg_write(stall_detect, reg, value)	\
> > +	iowrite32((value), (stall_detect)->membase + (reg))
> 
> You can use writel_relaxed() here as you don't require ordering guarantees
> against DMA.
> 

I updated it to use writel_relaxed(..) as suggested.

> > +#define vcpu_stall_detect_reg_read(stall_detect, reg)		\
> > +	io32read((stall_detect)->membase + (reg))
> 
> io32read() doesn't exist, but you also don't use this macro so just remove
> it :)
> 

I removed this unused definition.

> > +
> > +static struct vm_stall_detect_s __percpu *vm_stall_detect;
> > +
> > +static enum hrtimer_restart
> > +vcpu_stall_detect_timer_fn(struct hrtimer *hrtimer)
> > +{
> > +	struct vm_stall_detect_s *cpu_stall_detect;
> > +	u32 ticks;
> > +
> > +	cpu_stall_detect = container_of(hrtimer, struct vm_stall_detect_s,
> > +					per_cpu_hrtimer);
> > +	ticks = cpu_stall_detect->clock_freq * cpu_stall_detect->expiration_sec;
> > +	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_LOAD_CNT, ticks);
> > +	hrtimer_forward_now(hrtimer,
> > +			    ms_to_ktime(cpu_stall_detect->ping_timeout_ms));
> > +
> > +	return HRTIMER_RESTART;
> > +}
> > +
> > +static void vcpu_stall_detect_start(void *arg)
> > +{
> > +	u32 ticks;
> > +	struct vm_stall_detect_s *cpu_stall_detect = arg;
> > +	struct hrtimer *hrtimer = &cpu_stall_detect->per_cpu_hrtimer;
> > +
> > +	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_CLOCK_FREQ_HZ,
> > +				    cpu_stall_detect->clock_freq);
> > +
> > +	/* Compute the number of ticks required for the stall detector counter
> > +	 * register based on the internal clock frequency and the timeout
> > +	 * value given from the device tree.
> > +	 */
> > +	ticks = cpu_stall_detect->clock_freq *
> > +		cpu_stall_detect->expiration_sec;
> > +	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_LOAD_CNT, ticks);
> > +
> > +	/* Enable the internal clock and start the stall detector */
> > +	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_STATUS, 1);
> > +
> > +	hrtimer_init(hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
> > +	hrtimer->function = vcpu_stall_detect_timer_fn;
> > +	hrtimer_start(hrtimer, ms_to_ktime(cpu_stall_detect->ping_timeout_ms),
> > +		      HRTIMER_MODE_REL_PINNED);
> > +}
> > +
> > +static void vcpu_stall_detect_stop(void *arg)
> > +{
> > +	struct vm_stall_detect_s *cpu_stall_detect = arg;
> > +	struct hrtimer *hrtimer = &cpu_stall_detect->per_cpu_hrtimer;
> > +
> > +	hrtimer_cancel(hrtimer);
> > +
> > +	/* Disable the stall detector */
> > +	vcpu_stall_detect_reg_write(cpu_stall_detect, REG_STATUS, 0);
> > +}
> > +
> > +static int start_stall_detector_on_cpu(unsigned int cpu)
> > +{
> > +	vcpu_stall_detect_start(this_cpu_ptr(vm_stall_detect));
> > +	return 0;
> > +}
> > +
> > +static int stop_stall_detector_on_cpu(unsigned int cpu)
> > +{
> > +	vcpu_stall_detect_stop(this_cpu_ptr(vm_stall_detect));
> > +	return 0;
> > +}
> > +
> > +static int vcpu_stall_detect_probe(struct platform_device *dev)
> > +{
> > +	int cpu, ret, err;
> > +	void __iomem *membase;
> > +	struct resource *r;
> > +	u32 stall_detect_clock, stall_detect_timeout_sec = 0;
> > +
> > +	r = platform_get_resource(dev, IORESOURCE_MEM, 0);
> > +	if (r == NULL)
> > +		return -ENODEV;
> > +
> > +	vm_stall_detect = alloc_percpu(typeof(struct vm_stall_detect_s));
> > +	if (!vm_stall_detect)
> > +		return -ENOMEM;
> > +
> > +	membase = ioremap(r->start, resource_size(r));
> > +	if (!membase) {
> > +		ret = -ENOMEM;
> > +		goto err_withmem;
> > +	}
> 
> Can you use devm_platform_get_and_ioremap_resource() to avoid calling
> both platform_get_resource() and ioremap() explicitly? I think the devm_
> handling there also means you don't need to worry about the iounmap(),
> which seems to be missing here anyway.
> 

Thanks, I updated the code to use the
devm_platform_get_and_ioremap_resource(..) API and I modified the
Kconfig so that the driver would depend on HAS_IOMEM.

> > +
> > +	if (of_property_read_u32(dev->dev.of_node, "clock-frequency",
> > +				 &stall_detect_clock))
> > +		stall_detect_clock = DEFAULT_CLOCK_HZ;
> > +
> > +	if (of_property_read_u32(dev->dev.of_node, "timeout-sec",
> > +				 &stall_detect_timeout_sec))
> > +		stall_detect_timeout_sec = DEFAULT_TIMEOT_SEC;
> > +
> > +	for_each_cpu_and(cpu, cpu_online_mask, &watchdog_cpumask) {
> 
> What prevents CPUs from coming up and down during this loop?
> 
> I also don't think we should be using 'watchdog_cpumask' here -- that's
> part of the watchdog core; instead we should either take the mask in the
> DT description or rely on the device exposing an interface for every CPU
> in the system.
> 
> > +		struct vm_stall_detect_s *cpu_stall_detect;
> > +
> > +		cpu_stall_detect = per_cpu_ptr(vm_stall_detect, cpu);
> > +		cpu_stall_detect->membase = membase + cpu * REG_LEN;
> > +		cpu_stall_detect->clock_freq = stall_detect_clock;
> > +		cpu_stall_detect->expiration_sec = stall_detect_timeout_sec;
> > +
> > +		/* Pet the stall detector at half of its expiration timeout
> > +		 * to prevent spurios resets.
> 
> typo: spurious
> 

Fixed the typo.

> > +		 */
> > +		cpu_stall_detect->ping_timeout_ms = stall_detect_timeout_sec *
> > +			MSEC_PER_SEC / 2;
> 
> All of these fields are the same for each core apart from the membase,
> which a CPU can compute locally using its smp_processor_id() so I don't
> think you need to include these in the per-cpu structure. It would be
> simpler if you tracked these invariants in a separate structure (either
> a global static structure or indirected by a per-cpu pointer).
> 
> Then the only thing left for the per-cpu structure would be the hrtimer.
>

I removed the fields which are not needed per-cpu and I only kept the
hrtimers. I created a separate global config structure which holds the
removed data.

> > +		smp_call_function_single(cpu, vcpu_stall_detect_start,
> > +					 cpu_stall_detect, true);
> > +	}
> > +
> > +	err = cpuhp_setup_state_nocalls(CPUHP_AP_ONLINE_DYN,
> > +					"virt/vcpu_stall_detector:online",
> > +					start_stall_detector_on_cpu,
> > +					stop_stall_detector_on_cpu);
> 
> If you drop the "_nocalls" variant then I think you can avoid the explicit
> previous call to smp_call_function_single() entirely and just rely on the
> notifiers to do the right thing. That also solves the hotplug race I
> mentioned.
> 

I updated the code to use the cpuhp_setup_state API.

> > +	if (err < 0) {
> > +		dev_err(&dev->dev, "failed to install cpu hotplug");
> > +		ret = err;
> > +		goto err_withmem;
> > +	}
> > +
> > +	return 0;
> > +
> > +err_withmem:
> > +	free_percpu(vm_stall_detect);
> 
> Looks like there's a devm_alloc_percpu() which might make this a bit
> simpler.
>

I updated the code to use devm_alloc_percpu(..) and I got rid of the
free_percpu(..).

> > +	return ret;
> > +}
> > +
> > +static int vcpu_stall_detect_remove(struct platform_device *dev)
> > +{
> > +	int cpu;
> > +
> > +	for_each_cpu_and(cpu, cpu_online_mask, &watchdog_cpumask) {
> > +		struct vm_stall_detect_s *cpu_stall_detect;
> > +
> > +		cpu_stall_detect = per_cpu_ptr(vm_stall_detect, cpu);
> > +		smp_call_function_single(cpu, vcpu_stall_detect_stop,
> > +					 cpu_stall_detect, true);
> > +	}
> 
> I don't think you need to cross-call here, and it looks racy again with
> hotplug; you should be able to do something like:
> 
>   1. Unregister the hotplug notifier (looks like you're missing this?)
>   2. Call hrtimer_cancel() for each percpu timer
>   3. Disable the watchdog for each CPU
>

Thanks for pointing our ! I restructured the code on the teardown path.

> > +	free_percpu(vm_stall_detect);
> > +	vm_stall_detect = NULL;
> > +	return 0;
> > +}
> > +
> > +static const struct of_device_id vcpu_stall_detect_of_match[] = {
> > +	{ .compatible = "qemu,vcpu-stall-detector", },
> 
> I'm not sure why we're mentioning qemu here; something as boring as "virt"
> (like you use for the cpuhp notifier) or "dummy" might be better, but given
> that Rob is happy with the binding then I won't complain further. We can
> always add extra strings later if we need to.
>
I used this because I originaly designed this for qemu and then I ported
it to crovm.

> > +	{}
> > +};
> > +
> > +MODULE_DEVICE_TABLE(of, vcpu_stall_detect_of_match);
> > +
> > +static struct platform_driver vcpu_stall_detect_driver = {
> > +	.probe  = vcpu_stall_detect_probe,
> > +	.remove = vcpu_stall_detect_remove,
> > +	.driver = {
> > +		.name           = KBUILD_MODNAME,
> > +		.of_match_table = vcpu_stall_detect_of_match,
> > +	},
> > +};
> > +
> > +module_platform_driver(vcpu_stall_detect_driver);
> > +
> > +MODULE_LICENSE("GPL");
> 
> This needs to be "GPL v2".
>
I think this should stay as is.

> Will

Thanks for reviewing my code !
Seb

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2022-07-07 13:17 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-07-01 14:40 [PATCH v9 0/2] Detect stalls on guest vCPUS Sebastian Ene
2022-07-01 14:40 ` [PATCH v9 1/2] dt-bindings: vcpu_stall_detector: Add qemu,vcpu-stall-detector compatible Sebastian Ene
2022-07-01 21:01   ` Rob Herring
2022-07-01 14:40 ` [PATCH v9 2/2] misc: Add a mechanism to detect stalls on guest vCPUs Sebastian Ene
2022-07-01 14:52   ` Guenter Roeck
2022-07-06 15:21   ` Will Deacon
2022-07-06 15:50     ` Greg Kroah-Hartman
2022-07-06 16:10       ` Will Deacon
2022-07-06 17:35         ` Greg Kroah-Hartman
2022-07-07 13:17     ` Sebastian Ene

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).