linux-kernel.vger.kernel.org archive mirror
* [PATCHv3 0/4] watchdog_hld cleanup and async model for arm64
@ 2021-10-14  2:41 Pingfan Liu
  2021-10-14  2:41 ` [PATCHv3 1/4] kernel/watchdog: trivial cleanups Pingfan Liu
                   ` (4 more replies)
  0 siblings, 5 replies; 7+ messages in thread
From: Pingfan Liu @ 2021-10-14  2:41 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: Pingfan Liu, Sumit Garg, Catalin Marinas, Will Deacon,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Marc Zyngier,
	Kees Cook, Masahiro Yamada, Sami Tolvanen, Petr Mladek,
	Andrew Morton, Wang Qing, Peter Zijlstra (Intel),
	Santosh Sivaraj

The hard lockup detector is helpful for diagnosing unpaired irq
enable/disable. But the current watchdog framework cannot easily cope with
the arm64 hw perf event.

On arm64, when lockup_detector_init()->watchdog_nmi_probe() runs, the PMU is
not ready; it only comes up at device_initcall(armv8_pmu_driver_init), which
is deeply integrated with the driver model and cpuhp. Hence it is hard to
move the initialization done by armv8_pmu_driver_init() before smp_init().

But it is easy to take the opposite approach and let watchdog_hld acquire
the PMU capability asynchronously.
The async model is achieved by extending watchdog_nmi_probe() with a new
-EBUSY return value, plus a re-initializing work_struct which waits on a
wait_queue_head until the arch code signals that the PMU is ready.

In this series, [1-2/4] are trivial cleanups, and [3-4/4] implement the
async model (sketched below).
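
For reference, a minimal sketch of that handshake, written as if inside
kernel/watchdog.c (simplified from [3-4/4]: the real code uses a tri-state
enum rather than a bool, marks the work and wait queue __initdata, and
flushes the work from a late initcall; arm64_hld_pmu_ready() is only an
illustrative name for the wake-up done at the end of armv8_pmu_driver_init()):

#include <linux/init.h>
#include <linux/nmi.h>
#include <linux/wait.h>
#include <linux/workqueue.h>

/*
 * Sketch only; nmi_watchdog_available and lockup_detector_setup() are the
 * existing statics of kernel/watchdog.c.
 */
static DECLARE_WAIT_QUEUE_HEAD(hld_detector_wait);
static bool hld_pmu_ready;

static void lockup_detector_delay_init(struct work_struct *work)
{
	/* Sleep in process context until the arch PMU driver wakes us. */
	wait_event(hld_detector_wait, hld_pmu_ready);
	if (!watchdog_nmi_probe()) {		/* retry the probe */
		nmi_watchdog_available = true;
		lockup_detector_setup();
	}
}
static DECLARE_WORK(hld_detector_work, lockup_detector_delay_init);

void __init lockup_detector_init(void)
{
	int ret = watchdog_nmi_probe();

	if (!ret)
		nmi_watchdog_available = true;
	else if (ret == -EBUSY)			/* PMU not up yet */
		queue_work(system_wq, &hld_detector_work);

	lockup_detector_setup();
}

/* Arch side, e.g. invoked at the end of armv8_pmu_driver_init(): */
void __init arm64_hld_pmu_ready(void)
{
	hld_pmu_ready = true;
	wake_up(&hld_detector_wait);
}

The point of the -EBUSY contract is that the core never needs to know when,
or whether, a given arch's PMU comes up; the arch wakes the wait queue once,
and the probe is simply retried in process context.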

v2 -> v3:
    check that the delayed work has been woken up, and flush the work
    before its __initdata is freed
    improve the commit log of [4/4]
    rebase onto v5.15-rc5

v1 -> v2:
    lift the async model from the hard lockup layer to the watchdog layer.
The benefit is simpler code; the drawback is that re-initialization means a
wasted alloc/free.
    
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Wang Qing <wangqing@vivo.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Santosh Sivaraj <santosh@fossix.org>
To: linux-arm-kernel@lists.infradead.org
To: linux-kernel@vger.kernel.org

Pingfan Liu (3):
  kernel/watchdog: trivial cleanups
  kernel/watchdog_hld: Ensure CPU-bound context when creating hardlockup
    detector event
  kernel/watchdog: Adapt the watchdog_hld interface for async model

Sumit Garg (1):
  arm64: Enable perf events based hard lockup detector

 arch/arm64/Kconfig               |  2 ++
 arch/arm64/kernel/Makefile       |  1 +
 arch/arm64/kernel/perf_event.c   | 11 ++++--
 arch/arm64/kernel/watchdog_hld.c | 36 +++++++++++++++++++
 arch/sparc/kernel/nmi.c          |  8 ++---
 drivers/perf/arm_pmu.c           |  5 +++
 include/linux/nmi.h              | 11 +++++-
 include/linux/perf/arm_pmu.h     |  2 ++
 kernel/watchdog.c                | 62 ++++++++++++++++++++++++++++----
 kernel/watchdog_hld.c            |  5 ++-
 10 files changed, 129 insertions(+), 14 deletions(-)
 create mode 100644 arch/arm64/kernel/watchdog_hld.c

-- 
2.31.1



* [PATCHv3 1/4] kernel/watchdog: trivial cleanups
  2021-10-14  2:41 [PATCHv3 0/4] watchdog_hld cleanup and async model for arm64 Pingfan Liu
@ 2021-10-14  2:41 ` Pingfan Liu
  2021-10-14  2:41 ` [PATCHv3 2/4] kernel/watchdog_hld: Ensure CPU-bound context when creating hardlockup detector event Pingfan Liu
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Pingfan Liu @ 2021-10-14  2:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Pingfan Liu, Petr Mladek, Andrew Morton, Wang Qing,
	Peter Zijlstra (Intel),
	Santosh Sivaraj, linux-arm-kernel

There is no reference to WATCHDOG_DEFAULT, so remove it.

And since nobody cares about the return value of watchdog_nmi_enable(),
change its prototype to void.

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Wang Qing <wangqing@vivo.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Santosh Sivaraj <santosh@fossix.org>
Cc: linux-arm-kernel@lists.infradead.org
To: linux-kernel@vger.kernel.org
---
 arch/sparc/kernel/nmi.c | 8 ++++----
 include/linux/nmi.h     | 2 +-
 kernel/watchdog.c       | 5 +----
 3 files changed, 6 insertions(+), 9 deletions(-)

diff --git a/arch/sparc/kernel/nmi.c b/arch/sparc/kernel/nmi.c
index 060fff95a305..8dc0f4e820b0 100644
--- a/arch/sparc/kernel/nmi.c
+++ b/arch/sparc/kernel/nmi.c
@@ -282,11 +282,11 @@ __setup("nmi_watchdog=", setup_nmi_watchdog);
  * sparc specific NMI watchdog enable function.
  * Enables watchdog if it is not enabled already.
  */
-int watchdog_nmi_enable(unsigned int cpu)
+void watchdog_nmi_enable(unsigned int cpu)
 {
 	if (atomic_read(&nmi_active) == -1) {
 		pr_warn("NMI watchdog cannot be enabled or disabled\n");
-		return -1;
+		return;
 	}
 
 	/*
@@ -295,11 +295,11 @@ int watchdog_nmi_enable(unsigned int cpu)
 	 * process first.
 	 */
 	if (!nmi_init_done)
-		return 0;
+		return;
 
 	smp_call_function_single(cpu, start_nmi_watchdog, NULL, 1);
 
-	return 0;
+	return;
 }
 /*
  * sparc specific NMI watchdog disable function.
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 750c7f395ca9..b7bcd63c36b4 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -119,7 +119,7 @@ static inline int hardlockup_detector_perf_init(void) { return 0; }
 void watchdog_nmi_stop(void);
 void watchdog_nmi_start(void);
 int watchdog_nmi_probe(void);
-int watchdog_nmi_enable(unsigned int cpu);
+void watchdog_nmi_enable(unsigned int cpu);
 void watchdog_nmi_disable(unsigned int cpu);
 
 /**
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index ad912511a0c0..6e6dd5f0bc3e 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -30,10 +30,8 @@
 static DEFINE_MUTEX(watchdog_mutex);
 
 #if defined(CONFIG_HARDLOCKUP_DETECTOR) || defined(CONFIG_HAVE_NMI_WATCHDOG)
-# define WATCHDOG_DEFAULT	(SOFT_WATCHDOG_ENABLED | NMI_WATCHDOG_ENABLED)
 # define NMI_WATCHDOG_DEFAULT	1
 #else
-# define WATCHDOG_DEFAULT	(SOFT_WATCHDOG_ENABLED)
 # define NMI_WATCHDOG_DEFAULT	0
 #endif
 
@@ -95,10 +93,9 @@ __setup("nmi_watchdog=", hardlockup_panic_setup);
  * softlockup watchdog start and stop. The arch must select the
  * SOFTLOCKUP_DETECTOR Kconfig.
  */
-int __weak watchdog_nmi_enable(unsigned int cpu)
+void __weak watchdog_nmi_enable(unsigned int cpu)
 {
 	hardlockup_detector_perf_enable();
-	return 0;
 }
 
 void __weak watchdog_nmi_disable(unsigned int cpu)
-- 
2.31.1



* [PATCHv3 2/4] kernel/watchdog_hld: Ensure CPU-bound context when creating hardlockup detector event
  2021-10-14  2:41 [PATCHv3 0/4] watchdog_hld cleanup and async model for arm64 Pingfan Liu
  2021-10-14  2:41 ` [PATCHv3 1/4] kernel/watchdog: trivial cleanups Pingfan Liu
@ 2021-10-14  2:41 ` Pingfan Liu
  2021-10-14  2:41 ` [PATCHv3 3/4] kernel/watchdog: Adapt the watchdog_hld interface for async model Pingfan Liu
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 7+ messages in thread
From: Pingfan Liu @ 2021-10-14  2:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Pingfan Liu, Petr Mladek, Andrew Morton, Wang Qing,
	Peter Zijlstra (Intel),
	Santosh Sivaraj, linux-arm-kernel

hardlockup_detector_event_create() should create the perf_event on the
current CPU. Preemption cannot be disabled around it, because
perf_event_create_kernel_counter() allocates memory and may sleep. Instead,
CPU locality is achieved by running the code in a per-CPU bound kthread.

Add a check to catch mistakes if the code is ever called from another
code path.
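
For illustration only, a sketch of the calling-context rule this patch
enforces (example_event_create_on_this_cpu() is a made-up name; the real
check is the one added to hardlockup_detector_event_create() in the hunk
below):

#include <linux/bug.h>
#include <linux/errno.h>
#include <linux/printk.h>
#include <linux/sched.h>
#include <linux/smp.h>

/*
 * The creation path may sleep (perf_event_create_kernel_counter() uses
 * GFP_KERNEL allocations), so the CPU cannot be pinned with
 * preempt_disable(). Being called from a per-CPU (smpboot) kthread, which
 * never migrates, keeps the CPU id stable across the sleep.
 */
static int example_event_create_on_this_cpu(void)
{
	unsigned int cpu;

	if (WARN_ON(!is_percpu_thread()))
		return -EPERM;

	/* Stable without disabling preemption: the thread is CPU-bound. */
	cpu = raw_smp_processor_id();

	pr_info("would create the hard lockup perf event on CPU%u\n", cpu);
	return 0;
}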

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Wang Qing <wangqing@vivo.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Santosh Sivaraj <santosh@fossix.org>
Cc: linux-arm-kernel@lists.infradead.org
To: linux-kernel@vger.kernel.org
---
 kernel/watchdog_hld.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index 247bf0b1582c..df010df76576 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -165,10 +165,13 @@ static void watchdog_overflow_callback(struct perf_event *event,
 
 static int hardlockup_detector_event_create(void)
 {
-	unsigned int cpu = smp_processor_id();
+	unsigned int cpu;
 	struct perf_event_attr *wd_attr;
 	struct perf_event *evt;
 
+	/* This function is expected to run in a CPU-bound kthread */
+	WARN_ON(!is_percpu_thread());
+	cpu = raw_smp_processor_id();
 	wd_attr = &wd_hw_attr;
 	wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
 
-- 
2.31.1



* [PATCHv3 3/4] kernel/watchdog: Adapt the watchdog_hld interface for async model
  2021-10-14  2:41 [PATCHv3 0/4] watchdog_hld cleanup and async model for arm64 Pingfan Liu
  2021-10-14  2:41 ` [PATCHv3 1/4] kernel/watchdog: trivial cleanups Pingfan Liu
  2021-10-14  2:41 ` [PATCHv3 2/4] kernel/watchdog_hld: Ensure CPU-bound context when creating hardlockup detector event Pingfan Liu
@ 2021-10-14  2:41 ` Pingfan Liu
  2021-10-14  2:41 ` [PATCHv3 4/4] arm64: Enable perf events based hard lockup detector Pingfan Liu
  2022-01-17 10:19 ` [PATCHv3 0/4] watchdog_hld cleanup and async model for arm64 Lecopzer Chen
  4 siblings, 0 replies; 7+ messages in thread
From: Pingfan Liu @ 2021-10-14  2:41 UTC (permalink / raw)
  To: linux-kernel
  Cc: Pingfan Liu, Sumit Garg, Catalin Marinas, Will Deacon,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Marc Zyngier,
	Kees Cook, Masahiro Yamada, Sami Tolvanen, Petr Mladek,
	Andrew Morton, Wang Qing, Peter Zijlstra (Intel),
	Santosh Sivaraj, linux-arm-kernel

When lockup_detector_init()->watchdog_nmi_probe() runs, the PMU may not be
ready yet. E.g. on arm64, the PMU is not ready until
device_initcall(armv8_pmu_driver_init), which is deeply integrated with the
driver model and cpuhp. Hence it is hard to move this initialization before
smp_init().

But it is easy to take the opposite approach and let watchdog_hld acquire
the PMU capability asynchronously.

The async model is achieved by extending watchdog_nmi_probe() with a new
-EBUSY return value, plus a re-initializing work_struct which waits on a
wait_queue_head until the arch code signals that the PMU is ready.

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Wang Qing <wangqing@vivo.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Santosh Sivaraj <santosh@fossix.org>
Cc: linux-arm-kernel@lists.infradead.org
To: linux-kernel@vger.kernel.org
---
 include/linux/nmi.h |  9 +++++++
 kernel/watchdog.c   | 57 +++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 64 insertions(+), 2 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index b7bcd63c36b4..9def85c00bd8 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -118,6 +118,15 @@ static inline int hardlockup_detector_perf_init(void) { return 0; }
 
 void watchdog_nmi_stop(void);
 void watchdog_nmi_start(void);
+
+enum hld_detector_state {
+	DELAY_INIT_NOP,
+	DELAY_INIT_WAIT,
+	DELAY_INIT_READY
+};
+
+extern enum hld_detector_state detector_delay_init_state;
+extern struct wait_queue_head hld_detector_wait;
 int watchdog_nmi_probe(void);
 void watchdog_nmi_enable(unsigned int cpu);
 void watchdog_nmi_disable(unsigned int cpu);
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 6e6dd5f0bc3e..2f267d21a7a1 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -103,7 +103,11 @@ void __weak watchdog_nmi_disable(unsigned int cpu)
 	hardlockup_detector_perf_disable();
 }
 
-/* Return 0, if a NMI watchdog is available. Error code otherwise */
+/*
+ * Arch-specific API. Returns 0 if an NMI watchdog is available, -EBUSY if it
+ * is not ready yet (in which case the arch code must wake up hld_detector_wait
+ * once it is ready), or another negative value if it is not supported.
+ */
 int __weak __init watchdog_nmi_probe(void)
 {
 	return hardlockup_detector_perf_init();
@@ -739,15 +743,64 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write,
 }
 #endif /* CONFIG_SYSCTL */
 
+static void lockup_detector_delay_init(struct work_struct *work);
+enum hld_detector_state detector_delay_init_state __initdata;
+
+struct wait_queue_head hld_detector_wait __initdata =
+		__WAIT_QUEUE_HEAD_INITIALIZER(hld_detector_wait);
+
+static struct work_struct detector_work __initdata =
+		__WORK_INITIALIZER(detector_work, lockup_detector_delay_init);
+
+static void __init lockup_detector_delay_init(struct work_struct *work)
+{
+	int ret;
+
+	wait_event(hld_detector_wait,
+			detector_delay_init_state == DELAY_INIT_READY);
+	ret = watchdog_nmi_probe();
+	if (!ret) {
+		nmi_watchdog_available = true;
+		lockup_detector_setup();
+	} else {
+		WARN_ON(ret == -EBUSY);
+		pr_info("Perf NMI watchdog permanently disabled\n");
+	}
+}
+
+/* Ensure the check is called after the PMU driver has been initialized */
+static int __init lockup_detector_check(void)
+{
+	if (detector_delay_init_state < DELAY_INIT_WAIT)
+		return 0;
+
+	if (WARN_ON(detector_delay_init_state == DELAY_INIT_WAIT)) {
+		detector_delay_init_state = DELAY_INIT_READY;
+		wake_up(&hld_detector_wait);
+	}
+	flush_work(&detector_work);
+	return 0;
+}
+late_initcall_sync(lockup_detector_check);
+
+
 void __init lockup_detector_init(void)
 {
+	int ret;
+
 	if (tick_nohz_full_enabled())
 		pr_info("Disabling watchdog on nohz_full cores by default\n");
 
 	cpumask_copy(&watchdog_cpumask,
 		     housekeeping_cpumask(HK_FLAG_TIMER));
 
-	if (!watchdog_nmi_probe())
+	ret = watchdog_nmi_probe();
+	if (!ret)
 		nmi_watchdog_available = true;
+	else if (ret == -EBUSY) {
+		detector_delay_init_state = DELAY_INIT_WAIT;
+		queue_work_on(smp_processor_id(), system_wq, &detector_work);
+	}
+
 	lockup_detector_setup();
 }
-- 
2.31.1



* [PATCHv3 4/4] arm64: Enable perf events based hard lockup detector
  2021-10-14  2:41 [PATCHv3 0/4] watchdog_hld cleanup and async model for arm64 Pingfan Liu
                   ` (2 preceding siblings ...)
  2021-10-14  2:41 ` [PATCHv3 3/4] kernel/watchdog: Adapt the watchdog_hld interface for async model Pingfan Liu
@ 2021-10-14  2:41 ` Pingfan Liu
  2022-01-17 10:19 ` [PATCHv3 0/4] watchdog_hld cleanup and async model for arm64 Lecopzer Chen
  4 siblings, 0 replies; 7+ messages in thread
From: Pingfan Liu @ 2021-10-14  2:41 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Sumit Garg, Pingfan Liu, Catalin Marinas, Will Deacon,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Marc Zyngier,
	Kees Cook, Masahiro Yamada, Sami Tolvanen, Petr Mladek,
	Andrew Morton, Wang Qing, Peter Zijlstra (Intel),
	Santosh Sivaraj, linux-kernel

From: Sumit Garg <sumit.garg@linaro.org>

With the recently added ability for perf events to use pseudo NMIs as
interrupts on platforms which support GICv3 or later, it is now possible to
enable the hard lockup detector (or NMI watchdog) on arm64 platforms. So
enable the corresponding support.

One thing to note here is that the lockup detector is normally initialized
just after the early initcalls, but the PMU on arm64 comes up much later as a
device_initcall(). To cope with that, override watchdog_nmi_probe() to let
the watchdog framework know that the PMU is not ready yet, and have the
framework re-initialize lockup detection once the PMU has been initialized.

[1]: http://lore.kernel.org/linux-arm-kernel/1610712101-14929-1-git-send-email-sumit.garg@linaro.org

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
(Pingfan: adapt it to watchdog_hld async model based on [1])
Co-developed-by: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Wang Qing <wangqing@vivo.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Santosh Sivaraj <santosh@fossix.org>
Cc: linux-kernel@vger.kernel.org
To: linux-arm-kernel@lists.infradead.org
---
 arch/arm64/Kconfig               |  2 ++
 arch/arm64/kernel/Makefile       |  1 +
 arch/arm64/kernel/perf_event.c   | 11 ++++++++--
 arch/arm64/kernel/watchdog_hld.c | 36 ++++++++++++++++++++++++++++++++
 drivers/perf/arm_pmu.c           |  5 +++++
 include/linux/perf/arm_pmu.h     |  2 ++
 6 files changed, 55 insertions(+), 2 deletions(-)
 create mode 100644 arch/arm64/kernel/watchdog_hld.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index fee914c716aa..762500f27aec 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -189,6 +189,8 @@ config ARM64
 	select HAVE_NMI
 	select HAVE_PATA_PLATFORM
 	select HAVE_PERF_EVENTS
+	select HAVE_PERF_EVENTS_NMI if ARM64_PSEUDO_NMI
+	select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && HAVE_PERF_EVENTS_NMI
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_REGS_AND_STACK_ACCESS_API
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 3f1490bfb938..789c2fe5bb90 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -46,6 +46,7 @@ obj-$(CONFIG_MODULES)			+= module.o
 obj-$(CONFIG_ARM64_MODULE_PLTS)		+= module-plts.o
 obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o perf_callchain.o
 obj-$(CONFIG_HW_PERF_EVENTS)		+= perf_event.o
+obj-$(CONFIG_HARDLOCKUP_DETECTOR_PERF)	+= watchdog_hld.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= hw_breakpoint.o
 obj-$(CONFIG_CPU_PM)			+= sleep.o suspend.o
 obj-$(CONFIG_CPU_IDLE)			+= cpuidle.o
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index b4044469527e..8e4c39f1db52 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -23,6 +23,7 @@
 #include <linux/platform_device.h>
 #include <linux/sched_clock.h>
 #include <linux/smp.h>
+#include <linux/nmi.h>
 
 /* ARMv8 Cortex-A53 specific event types. */
 #define ARMV8_A53_PERFCTR_PREF_LINEFILL				0xC2
@@ -1284,10 +1285,16 @@ static struct platform_driver armv8_pmu_driver = {
 
 static int __init armv8_pmu_driver_init(void)
 {
+	int ret;
+
 	if (acpi_disabled)
-		return platform_driver_register(&armv8_pmu_driver);
+		ret = platform_driver_register(&armv8_pmu_driver);
 	else
-		return arm_pmu_acpi_probe(armv8_pmuv3_init);
+		ret = arm_pmu_acpi_probe(armv8_pmuv3_init);
+
+	detector_delay_init_state = DELAY_INIT_READY;
+	wake_up(&hld_detector_wait);
+	return ret;
 }
 device_initcall(armv8_pmu_driver_init)
 
diff --git a/arch/arm64/kernel/watchdog_hld.c b/arch/arm64/kernel/watchdog_hld.c
new file mode 100644
index 000000000000..85536906a186
--- /dev/null
+++ b/arch/arm64/kernel/watchdog_hld.c
@@ -0,0 +1,36 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/nmi.h>
+#include <linux/cpufreq.h>
+#include <linux/perf/arm_pmu.h>
+
+/*
+ * Safe maximum CPU frequency in case a particular platform doesn't implement
+ * a cpufreq driver. The architecture doesn't put any restriction on the
+ * maximum frequency, but 5 GHz seems to be a safe maximum given the available
+ * Arm CPUs on the market, which are clocked well below 5 GHz. On the other
+ * hand, we can't make it much higher as it would lead to a large hard-lockup
+ * detection timeout on parts which run slower (e.g. 1 GHz on Developerbox)
+ * and don't possess a cpufreq driver.
+ */
+#define SAFE_MAX_CPU_FREQ	5000000000UL // 5 GHz
+u64 hw_nmi_get_sample_period(int watchdog_thresh)
+{
+	unsigned int cpu = smp_processor_id();
+	unsigned long max_cpu_freq;
+
+	max_cpu_freq = cpufreq_get_hw_max_freq(cpu) * 1000UL;
+	if (!max_cpu_freq)
+		max_cpu_freq = SAFE_MAX_CPU_FREQ;
+
+	return (u64)max_cpu_freq * watchdog_thresh;
+}
+
+int __init watchdog_nmi_probe(void)
+{
+	if (detector_delay_init_state != DELAY_INIT_READY)
+		return -EBUSY;
+	else if (!arm_pmu_irq_is_nmi())
+		return -ENODEV;
+
+	return hardlockup_detector_perf_init();
+}
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 295cc7952d0e..e77f4897fca2 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -697,6 +697,11 @@ static int armpmu_get_cpu_irq(struct arm_pmu *pmu, int cpu)
 	return per_cpu(hw_events->irq, cpu);
 }
 
+bool arm_pmu_irq_is_nmi(void)
+{
+	return has_nmi;
+}
+
 /*
  * PMU hardware loses all context when a CPU goes offline.
  * When a CPU is hotplugged back in, since some hardware registers are
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 2512e2f9cd4e..9325d01adc3e 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -169,6 +169,8 @@ void kvm_host_pmu_init(struct arm_pmu *pmu);
 #define kvm_host_pmu_init(x)	do { } while(0)
 #endif
 
+bool arm_pmu_irq_is_nmi(void);
+
 /* Internal functions only for core arm_pmu code */
 struct arm_pmu *armpmu_alloc(void);
 struct arm_pmu *armpmu_alloc_atomic(void);
-- 
2.31.1



* Re: [PATCHv3 0/4] watchdog_hld cleanup and async model for arm64
  2021-10-14  2:41 [PATCHv3 0/4] watchdog_hld cleanup and async model for arm64 Pingfan Liu
                   ` (3 preceding siblings ...)
  2021-10-14  2:41 ` [PATCHv3 4/4] arm64: Enable perf events based hard lockup detector Pingfan Liu
@ 2022-01-17 10:19 ` Lecopzer Chen
  2022-01-24  1:02   ` Pingfan Liu
  4 siblings, 1 reply; 7+ messages in thread
From: Lecopzer Chen @ 2022-01-17 10:19 UTC (permalink / raw)
  To: kernelfans
  Cc: acme, akpm, alexander.shishkin, catalin.marinas, jolsa, keescook,
	linux-arm-kernel, linux-kernel, mark.rutland, masahiroy, maz,
	mingo, namhyung, peterz, pmladek, samitolvanen, santosh,
	sumit.garg, wangqing, will, yj.chiang, lecopzer.chen

Hi Pingfan,

Is this thread still in progress?
We are looking for an upstream solution for the arm64 hard lockup detector.

I'd appreciate it if someone keeps working on it;
if not, I can take it over.



thanks!

-Lecopzer




* Re: [PATCHv3 0/4] watchdog_hld cleanup and async model for arm64
  2022-01-17 10:19 ` [PATCHv3 0/4] watchdog_hld cleanup and async model for arm64 Lecopzer Chen
@ 2022-01-24  1:02   ` Pingfan Liu
  0 siblings, 0 replies; 7+ messages in thread
From: Pingfan Liu @ 2022-01-24  1:02 UTC (permalink / raw)
  To: Lecopzer Chen
  Cc: Arnaldo Carvalho de Melo, Andrew Morton, Alexander Shishkin,
	Catalin Marinas, Jiri Olsa, Kees Cook, Linux ARM, LKML,
	Mark Rutland, Masahiro Yamada, Marc Zyngier, Ingo Molnar,
	Namhyung Kim, Peter Zijlstra, Petr Mladek, Sami Tolvanen,
	Santosh Sivaraj, Sumit Garg, Wang Qing, Will Deacon, yj.chiang

On Mon, Jan 17, 2022 at 6:19 PM Lecopzer Chen
<lecopzer.chen@mediatek.com> wrote:
>
> Hi Pingfan,
>
> Is this thread still in progress?

No, I am working on another topic at present, and this is not in my
queue for the near future.
> We are looking for an upstream solution for the arm64 hard lockup detector.
>
> I'd appreciate it if someone keeps working on it;
> if not, I can take it over.
>
Be my guest, and I hope the work goes well. We badly want a hard lockup
detector on arm64.

Best Regards,

Pingfan


end of thread, other threads:[~2022-01-24  1:02 UTC | newest]

Thread overview: 7+ messages
2021-10-14  2:41 [PATCHv3 0/4] watchdog_hld cleanup and async model for arm64 Pingfan Liu
2021-10-14  2:41 ` [PATCHv3 1/4] kernel/watchdog: trivial cleanups Pingfan Liu
2021-10-14  2:41 ` [PATCHv3 2/4] kernel/watchdog_hld: Ensure CPU-bound context when creating hardlockup detector event Pingfan Liu
2021-10-14  2:41 ` [PATCHv3 3/4] kernel/watchdog: Adapt the watchdog_hld interface for async model Pingfan Liu
2021-10-14  2:41 ` [PATCHv3 4/4] arm64: Enable perf events based hard lockup detector Pingfan Liu
2022-01-17 10:19 ` [PATCHv3 0/4] watchdog_hld cleanup and async model for arm64 Lecopzer Chen
2022-01-24  1:02   ` Pingfan Liu
