linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCHv2 0/4] watchdog_hld cleanup and async model for arm64
@ 2021-09-23 14:09 Pingfan Liu
  2021-09-23 14:09 ` [PATCHv2 1/4] kernel/watchdog: trival cleanups Pingfan Liu
                   ` (3 more replies)
  0 siblings, 4 replies; 15+ messages in thread
From: Pingfan Liu @ 2021-09-23 14:09 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: Pingfan Liu, Sumit Garg, Catalin Marinas, Will Deacon,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Marc Zyngier,
	Kees Cook, Masahiro Yamada, Sami Tolvanen, Petr Mladek,
	Andrew Morton, Wang Qing, Peter Zijlstra (Intel),
	Santosh Sivaraj

hard lockup detector is helpful to diagnose unpaired irq enable/disable.
Sumit has tried with a series, and the last one is V5 [1].
Since it lasts a long time without any update, I takes a retry, which
addresses the delay intialization of watchdog_hld.
( To: Sumit, I think the main body of [4/4] is contributed from you,so I
keep you as the author, please let me know if you dislike it and my
modification.)

There is an obstacle to integrate arm64 hw perf event into watchdog_hld.
When lockup_detector_init()->watchdog_nmi_probe(), on arm64, PMU is not
ready until device_initcall(armv8_pmu_driver_init).  And it is deeply
integrated with the driver model and cpuhp. Hence it is hard to push
the initialization of armv8_pmu_driver_init() before smp_init().

But it is easy to take an opposite approach by enabling watchdog_hld to
get the capability of PMU async. 
The async model is achieved by expanding watchdog_nmi_probe() with
-EBUSY, and a re-initializing work_struct which waits on a
wait_queue_head.

In this series, [1-2/4] are trivial cleanup. [3-4/4] is for this async
model.


v1 > v2:
    uplift the async model from hard lockup layer to watchdog layter.
The benefit is simpler code, the drawback is re-initialize means wasted
alloc/free.
    
[1]: http://lore.kernel.org/linux-arm-kernel/1610712101-14929-1-git-send-email-sumit.garg@linaro.org

Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Wang Qing <wangqing@vivo.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Santosh Sivaraj <santosh@fossix.org>
To: linux-arm-kernel@lists.infradead.org
To: linux-kernel@vger.kernel.org

*** BLURB HERE ***

Pingfan Liu (3):
  kernel/watchdog: trival cleanups
  kernel/watchdog_hld: clarify the condition in
    hardlockup_detector_event_create()
  kernel/watchdog: adapt the watchdog_hld interface for async model

Sumit Garg (1):
  arm64: Enable perf events based hard lockup detector

 arch/arm64/Kconfig               |  2 ++
 arch/arm64/kernel/Makefile       |  1 +
 arch/arm64/kernel/perf_event.c   | 11 +++++++--
 arch/arm64/kernel/watchdog_hld.c | 36 +++++++++++++++++++++++++++
 drivers/perf/arm_pmu.c           |  5 ++++
 include/linux/nmi.h              |  5 +++-
 include/linux/perf/arm_pmu.h     |  2 ++
 kernel/watchdog.c                | 42 +++++++++++++++++++++++++++-----
 kernel/watchdog_hld.c            |  5 +++-
 9 files changed, 99 insertions(+), 10 deletions(-)
 create mode 100644 arch/arm64/kernel/watchdog_hld.c

-- 
2.31.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCHv2 1/4] kernel/watchdog: trival cleanups
  2021-09-23 14:09 [PATCHv2 0/4] watchdog_hld cleanup and async model for arm64 Pingfan Liu
@ 2021-09-23 14:09 ` Pingfan Liu
  2021-10-04  9:32   ` Petr Mladek
  2021-09-23 14:09 ` [PATCHv2 2/4] kernel/watchdog_hld: clarify the condition in hardlockup_detector_event_create() Pingfan Liu
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 15+ messages in thread
From: Pingfan Liu @ 2021-09-23 14:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Pingfan Liu, Petr Mladek, Andrew Morton, Wang Qing,
	Peter Zijlstra (Intel),
	Santosh Sivaraj, linux-arm-kernel

No reference to WATCHDOG_DEFAULT, remove it.

And nobody cares about the return value of watchdog_nmi_enable(),
changing its prototype to void.

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Wang Qing <wangqing@vivo.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Santosh Sivaraj <santosh@fossix.org>
Cc: linux-arm-kernel@lists.infradead.org
To: linux-kernel@vger.kernel.org
---
 include/linux/nmi.h | 2 +-
 kernel/watchdog.c   | 5 +----
 2 files changed, 2 insertions(+), 5 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 750c7f395ca9..b7bcd63c36b4 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -119,7 +119,7 @@ static inline int hardlockup_detector_perf_init(void) { return 0; }
 void watchdog_nmi_stop(void);
 void watchdog_nmi_start(void);
 int watchdog_nmi_probe(void);
-int watchdog_nmi_enable(unsigned int cpu);
+void watchdog_nmi_enable(unsigned int cpu);
 void watchdog_nmi_disable(unsigned int cpu);
 
 /**
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index ad912511a0c0..6e6dd5f0bc3e 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -30,10 +30,8 @@
 static DEFINE_MUTEX(watchdog_mutex);
 
 #if defined(CONFIG_HARDLOCKUP_DETECTOR) || defined(CONFIG_HAVE_NMI_WATCHDOG)
-# define WATCHDOG_DEFAULT	(SOFT_WATCHDOG_ENABLED | NMI_WATCHDOG_ENABLED)
 # define NMI_WATCHDOG_DEFAULT	1
 #else
-# define WATCHDOG_DEFAULT	(SOFT_WATCHDOG_ENABLED)
 # define NMI_WATCHDOG_DEFAULT	0
 #endif
 
@@ -95,10 +93,9 @@ __setup("nmi_watchdog=", hardlockup_panic_setup);
  * softlockup watchdog start and stop. The arch must select the
  * SOFTLOCKUP_DETECTOR Kconfig.
  */
-int __weak watchdog_nmi_enable(unsigned int cpu)
+void __weak watchdog_nmi_enable(unsigned int cpu)
 {
 	hardlockup_detector_perf_enable();
-	return 0;
 }
 
 void __weak watchdog_nmi_disable(unsigned int cpu)
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCHv2 2/4] kernel/watchdog_hld: clarify the condition in hardlockup_detector_event_create()
  2021-09-23 14:09 [PATCHv2 0/4] watchdog_hld cleanup and async model for arm64 Pingfan Liu
  2021-09-23 14:09 ` [PATCHv2 1/4] kernel/watchdog: trival cleanups Pingfan Liu
@ 2021-09-23 14:09 ` Pingfan Liu
  2021-10-04 12:32   ` Petr Mladek
  2021-09-23 14:09 ` [PATCHv2 3/4] kernel/watchdog: adapt the watchdog_hld interface for async model Pingfan Liu
  2021-09-23 14:09 ` [PATCHv2 4/4] arm64: Enable perf events based hard lockup detector Pingfan Liu
  3 siblings, 1 reply; 15+ messages in thread
From: Pingfan Liu @ 2021-09-23 14:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Pingfan Liu, Petr Mladek, Andrew Morton, Wang Qing,
	Peter Zijlstra (Intel),
	Santosh Sivaraj, linux-arm-kernel

As for the context, there are two arguments to change
debug_smp_processor_id() to is_percpu_thread().

  -1. watchdog_ev is percpu, and migration will frustrate the attempt
which try to bind a watchdog_ev to a cpu by protecting this func inside
the pair of preempt_disable()/preempt_enable().

  -2. hardlockup_detector_event_create() indirectly calls
kmem_cache_alloc_node(), which is blockable.

So here, spelling out the really planned context "is_percpu_thread()".

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Wang Qing <wangqing@vivo.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Santosh Sivaraj <santosh@fossix.org>
Cc: linux-arm-kernel@lists.infradead.org
To: linux-kernel@vger.kernel.org
---
 kernel/watchdog_hld.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index 247bf0b1582c..df010df76576 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -165,10 +165,13 @@ static void watchdog_overflow_callback(struct perf_event *event,
 
 static int hardlockup_detector_event_create(void)
 {
-	unsigned int cpu = smp_processor_id();
+	unsigned int cpu;
 	struct perf_event_attr *wd_attr;
 	struct perf_event *evt;
 
+	/* This function plans to execute in cpu bound kthread */
+	WARN_ON(!is_percpu_thread());
+	cpu = raw_smp_processor_id();
 	wd_attr = &wd_hw_attr;
 	wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCHv2 3/4] kernel/watchdog: adapt the watchdog_hld interface for async model
  2021-09-23 14:09 [PATCHv2 0/4] watchdog_hld cleanup and async model for arm64 Pingfan Liu
  2021-09-23 14:09 ` [PATCHv2 1/4] kernel/watchdog: trival cleanups Pingfan Liu
  2021-09-23 14:09 ` [PATCHv2 2/4] kernel/watchdog_hld: clarify the condition in hardlockup_detector_event_create() Pingfan Liu
@ 2021-09-23 14:09 ` Pingfan Liu
  2021-10-05  7:03   ` Petr Mladek
  2021-09-23 14:09 ` [PATCHv2 4/4] arm64: Enable perf events based hard lockup detector Pingfan Liu
  3 siblings, 1 reply; 15+ messages in thread
From: Pingfan Liu @ 2021-09-23 14:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Pingfan Liu, Sumit Garg, Catalin Marinas, Will Deacon,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Marc Zyngier,
	Kees Cook, Masahiro Yamada, Sami Tolvanen, Petr Mladek,
	Andrew Morton, Wang Qing, Peter Zijlstra (Intel),
	Santosh Sivaraj, linux-arm-kernel

When lockup_detector_init()->watchdog_nmi_probe(), PMU may be not ready
yet. E.g. on arm64, PMU is not ready until
device_initcall(armv8_pmu_driver_init).  And it is deeply integrated
with the driver model and cpuhp. Hence it is hard to push this
initialization before smp_init().

But it is easy to take an opposite approach by enabling watchdog_hld to
get the capability of PMU async.

The async model is achieved by expanding watchdog_nmi_probe() with
-EBUSY, and a re-initializing work_struct which waits on a wait_queue_head.

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Sumit Garg <sumit.garg@linaro.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Wang Qing <wangqing@vivo.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Santosh Sivaraj <santosh@fossix.org>
Cc: linux-arm-kernel@lists.infradead.org
To: linux-kernel@vger.kernel.org
---
 include/linux/nmi.h |  3 +++
 kernel/watchdog.c   | 37 +++++++++++++++++++++++++++++++++++--
 2 files changed, 38 insertions(+), 2 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index b7bcd63c36b4..270d440fe4b7 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -118,6 +118,9 @@ static inline int hardlockup_detector_perf_init(void) { return 0; }
 
 void watchdog_nmi_stop(void);
 void watchdog_nmi_start(void);
+
+extern bool hld_detector_delay_initialized;
+extern struct wait_queue_head hld_detector_wait;
 int watchdog_nmi_probe(void);
 void watchdog_nmi_enable(unsigned int cpu);
 void watchdog_nmi_disable(unsigned int cpu);
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 6e6dd5f0bc3e..bd4ae1839b72 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -103,7 +103,10 @@ void __weak watchdog_nmi_disable(unsigned int cpu)
 	hardlockup_detector_perf_disable();
 }
 
-/* Return 0, if a NMI watchdog is available. Error code otherwise */
+/*
+ * Return 0, if a NMI watchdog is available. -EBUSY if not ready.
+ * Other negative value if not support.
+ */
 int __weak __init watchdog_nmi_probe(void)
 {
 	return hardlockup_detector_perf_init();
@@ -739,15 +742,45 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write,
 }
 #endif /* CONFIG_SYSCTL */
 
+static void lockup_detector_delay_init(struct work_struct *work);
+bool hld_detector_delay_initialized __initdata;
+
+struct wait_queue_head hld_detector_wait __initdata =
+		__WAIT_QUEUE_HEAD_INITIALIZER(hld_detector_wait);
+
+static struct work_struct detector_work __initdata =
+		__WORK_INITIALIZER(detector_work, lockup_detector_delay_init);
+
+static void __init lockup_detector_delay_init(struct work_struct *work)
+{
+	int ret;
+
+	wait_event(hld_detector_wait, hld_detector_delay_initialized);
+	ret = watchdog_nmi_probe();
+	if (!ret) {
+		nmi_watchdog_available = true;
+		lockup_detector_setup();
+	} else {
+		WARN_ON(ret == -EBUSY);
+		pr_info("Perf NMI watchdog permanently disabled\n");
+	}
+}
+
 void __init lockup_detector_init(void)
 {
+	int ret;
+
 	if (tick_nohz_full_enabled())
 		pr_info("Disabling watchdog on nohz_full cores by default\n");
 
 	cpumask_copy(&watchdog_cpumask,
 		     housekeeping_cpumask(HK_FLAG_TIMER));
 
-	if (!watchdog_nmi_probe())
+	ret = watchdog_nmi_probe();
+	if (!ret)
 		nmi_watchdog_available = true;
+	else if (ret == -EBUSY)
+		queue_work_on(smp_processor_id(), system_wq, &detector_work);
+
 	lockup_detector_setup();
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCHv2 4/4] arm64: Enable perf events based hard lockup detector
  2021-09-23 14:09 [PATCHv2 0/4] watchdog_hld cleanup and async model for arm64 Pingfan Liu
                   ` (2 preceding siblings ...)
  2021-09-23 14:09 ` [PATCHv2 3/4] kernel/watchdog: adapt the watchdog_hld interface for async model Pingfan Liu
@ 2021-09-23 14:09 ` Pingfan Liu
  2021-09-23 14:29   ` Pingfan Liu
  3 siblings, 1 reply; 15+ messages in thread
From: Pingfan Liu @ 2021-09-23 14:09 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Sumit Garg, Pingfan Liu, Catalin Marinas, Will Deacon,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Marc Zyngier,
	Kees Cook, Masahiro Yamada, Sami Tolvanen, Petr Mladek,
	Andrew Morton, Wang Qing, Peter Zijlstra (Intel),
	Santosh Sivaraj, linux-kernel

From: Sumit Garg <sumit.garg@linaro.org>

With the recent feature added to enable perf events to use pseudo NMIs
as interrupts on platforms which support GICv3 or later, its now been
possible to enable hard lockup detector (or NMI watchdog) on arm64
platforms. So enable corresponding support.

One thing to note here is that normally lockup detector is initialized
just after the early initcalls but PMU on arm64 comes up much later as
device_initcall(). So we need to re-initialize lockup detection once
PMU has been initialized.

[1]: http://lore.kernel.org/linux-arm-kernel/1610712101-14929-1-git-send-email-sumit.garg@linaro.org

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
(Pingfan: adapt it to watchdog_hld async model based on [1])
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>
Cc: Kees Cook <keescook@chromium.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Sami Tolvanen <samitolvanen@google.com>
Cc: Petr Mladek <pmladek@suse.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Wang Qing <wangqing@vivo.com>
Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
Cc: Santosh Sivaraj <santosh@fossix.org>
Cc: linux-kernel@vger.kernel.org
To: linux-arm-kernel@lists.infradead.org
---
 arch/arm64/Kconfig               |  2 ++
 arch/arm64/kernel/Makefile       |  1 +
 arch/arm64/kernel/perf_event.c   | 11 ++++++++--
 arch/arm64/kernel/watchdog_hld.c | 36 ++++++++++++++++++++++++++++++++
 drivers/perf/arm_pmu.c           |  5 +++++
 include/linux/perf/arm_pmu.h     |  2 ++
 6 files changed, 55 insertions(+), 2 deletions(-)
 create mode 100644 arch/arm64/kernel/watchdog_hld.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 5c7ae4c3954b..8287e9e1d28d 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -189,6 +189,8 @@ config ARM64
 	select HAVE_NMI
 	select HAVE_PATA_PLATFORM
 	select HAVE_PERF_EVENTS
+	select HAVE_PERF_EVENTS_NMI if ARM64_PSEUDO_NMI
+	select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && HAVE_PERF_EVENTS_NMI
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_REGS_AND_STACK_ACCESS_API
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 3f1490bfb938..789c2fe5bb90 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -46,6 +46,7 @@ obj-$(CONFIG_MODULES)			+= module.o
 obj-$(CONFIG_ARM64_MODULE_PLTS)		+= module-plts.o
 obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o perf_callchain.o
 obj-$(CONFIG_HW_PERF_EVENTS)		+= perf_event.o
+obj-$(CONFIG_HARDLOCKUP_DETECTOR_PERF)	+= watchdog_hld.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= hw_breakpoint.o
 obj-$(CONFIG_CPU_PM)			+= sleep.o suspend.o
 obj-$(CONFIG_CPU_IDLE)			+= cpuidle.o
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index b4044469527e..a34343d0f418 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -23,6 +23,7 @@
 #include <linux/platform_device.h>
 #include <linux/sched_clock.h>
 #include <linux/smp.h>
+#include <linux/nmi.h>
 
 /* ARMv8 Cortex-A53 specific event types. */
 #define ARMV8_A53_PERFCTR_PREF_LINEFILL				0xC2
@@ -1284,10 +1285,16 @@ static struct platform_driver armv8_pmu_driver = {
 
 static int __init armv8_pmu_driver_init(void)
 {
+	int ret;
+
 	if (acpi_disabled)
-		return platform_driver_register(&armv8_pmu_driver);
+		ret = platform_driver_register(&armv8_pmu_driver);
 	else
-		return arm_pmu_acpi_probe(armv8_pmuv3_init);
+		ret = arm_pmu_acpi_probe(armv8_pmuv3_init);
+
+	hld_detector_delay_initialized = true;
+	wake_up(&hld_detector_wait);
+	return ret;
 }
 device_initcall(armv8_pmu_driver_init)
 
diff --git a/arch/arm64/kernel/watchdog_hld.c b/arch/arm64/kernel/watchdog_hld.c
new file mode 100644
index 000000000000..379743e0d001
--- /dev/null
+++ b/arch/arm64/kernel/watchdog_hld.c
@@ -0,0 +1,36 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/nmi.h>
+#include <linux/cpufreq.h>
+#include <linux/perf/arm_pmu.h>
+
+/*
+ * Safe maximum CPU frequency in case a particular platform doesn't implement
+ * cpufreq driver. Although, architecture doesn't put any restrictions on
+ * maximum frequency but 5 GHz seems to be safe maximum given the available
+ * Arm CPUs in the market which are clocked much less than 5 GHz. On the other
+ * hand, we can't make it much higher as it would lead to a large hard-lockup
+ * detection timeout on parts which are running slower (eg. 1GHz on
+ * Developerbox) and doesn't possess a cpufreq driver.
+ */
+#define SAFE_MAX_CPU_FREQ	5000000000UL // 5 GHz
+u64 hw_nmi_get_sample_period(int watchdog_thresh)
+{
+	unsigned int cpu = smp_processor_id();
+	unsigned long max_cpu_freq;
+
+	max_cpu_freq = cpufreq_get_hw_max_freq(cpu) * 1000UL;
+	if (!max_cpu_freq)
+		max_cpu_freq = SAFE_MAX_CPU_FREQ;
+
+	return (u64)max_cpu_freq * watchdog_thresh;
+}
+
+int __init watchdog_nmi_probe(void)
+{
+	if (!hld_detector_delay_initialized)
+		return -EBUSY;
+	else if (!arm_pmu_irq_is_nmi())
+		return -ENODEV;
+
+	return hardlockup_detector_perf_init();
+}
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 3cbc3baf087f..2aecb0c34290 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -697,6 +697,11 @@ static int armpmu_get_cpu_irq(struct arm_pmu *pmu, int cpu)
 	return per_cpu(hw_events->irq, cpu);
 }
 
+bool arm_pmu_irq_is_nmi(void)
+{
+	return has_nmi;
+}
+
 /*
  * PMU hardware loses all context when a CPU goes offline.
  * When a CPU is hotplugged back in, since some hardware registers are
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 505480217cf1..bf7966776c55 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -163,6 +163,8 @@ int arm_pmu_acpi_probe(armpmu_init_fn init_fn);
 static inline int arm_pmu_acpi_probe(armpmu_init_fn init_fn) { return 0; }
 #endif
 
+bool arm_pmu_irq_is_nmi(void);
+
 /* Internal functions only for core arm_pmu code */
 struct arm_pmu *armpmu_alloc(void);
 struct arm_pmu *armpmu_alloc_atomic(void);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 4/4] arm64: Enable perf events based hard lockup detector
  2021-09-23 14:09 ` [PATCHv2 4/4] arm64: Enable perf events based hard lockup detector Pingfan Liu
@ 2021-09-23 14:29   ` Pingfan Liu
  2021-09-24  5:18     ` Sumit Garg
  0 siblings, 1 reply; 15+ messages in thread
From: Pingfan Liu @ 2021-09-23 14:29 UTC (permalink / raw)
  To: Linux ARM
  Cc: Sumit Garg, Catalin Marinas, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Marc Zyngier, Kees Cook,
	Masahiro Yamada, Sami Tolvanen, Petr Mladek, Andrew Morton,
	Wang Qing, Peter Zijlstra (Intel),
	Santosh Sivaraj, LKML

On Thu, Sep 23, 2021 at 10:10 PM Pingfan Liu <kernelfans@gmail.com> wrote:
>
> From: Sumit Garg <sumit.garg@linaro.org>
>
To Sumit, I think credits should go to you and keep you as the author.

Please let me know if you dislike it.

Thanks,

Pingfan
> With the recent feature added to enable perf events to use pseudo NMIs
> as interrupts on platforms which support GICv3 or later, its now been
> possible to enable hard lockup detector (or NMI watchdog) on arm64
> platforms. So enable corresponding support.
>
> One thing to note here is that normally lockup detector is initialized
> just after the early initcalls but PMU on arm64 comes up much later as
> device_initcall(). So we need to re-initialize lockup detection once
> PMU has been initialized.
>
> [1]: http://lore.kernel.org/linux-arm-kernel/1610712101-14929-1-git-send-email-sumit.garg@linaro.org
>
> Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> (Pingfan: adapt it to watchdog_hld async model based on [1])
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Masahiro Yamada <masahiroy@kernel.org>
> Cc: Sami Tolvanen <samitolvanen@google.com>
> Cc: Petr Mladek <pmladek@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Wang Qing <wangqing@vivo.com>
> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> Cc: Santosh Sivaraj <santosh@fossix.org>
> Cc: linux-kernel@vger.kernel.org
> To: linux-arm-kernel@lists.infradead.org
> ---
>  arch/arm64/Kconfig               |  2 ++
>  arch/arm64/kernel/Makefile       |  1 +
>  arch/arm64/kernel/perf_event.c   | 11 ++++++++--
>  arch/arm64/kernel/watchdog_hld.c | 36 ++++++++++++++++++++++++++++++++
>  drivers/perf/arm_pmu.c           |  5 +++++
>  include/linux/perf/arm_pmu.h     |  2 ++
>  6 files changed, 55 insertions(+), 2 deletions(-)
>  create mode 100644 arch/arm64/kernel/watchdog_hld.c
>
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 5c7ae4c3954b..8287e9e1d28d 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -189,6 +189,8 @@ config ARM64
>         select HAVE_NMI
>         select HAVE_PATA_PLATFORM
>         select HAVE_PERF_EVENTS
> +       select HAVE_PERF_EVENTS_NMI if ARM64_PSEUDO_NMI
> +       select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && HAVE_PERF_EVENTS_NMI
>         select HAVE_PERF_REGS
>         select HAVE_PERF_USER_STACK_DUMP
>         select HAVE_REGS_AND_STACK_ACCESS_API
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index 3f1490bfb938..789c2fe5bb90 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -46,6 +46,7 @@ obj-$(CONFIG_MODULES)                 += module.o
>  obj-$(CONFIG_ARM64_MODULE_PLTS)                += module-plts.o
>  obj-$(CONFIG_PERF_EVENTS)              += perf_regs.o perf_callchain.o
>  obj-$(CONFIG_HW_PERF_EVENTS)           += perf_event.o
> +obj-$(CONFIG_HARDLOCKUP_DETECTOR_PERF) += watchdog_hld.o
>  obj-$(CONFIG_HAVE_HW_BREAKPOINT)       += hw_breakpoint.o
>  obj-$(CONFIG_CPU_PM)                   += sleep.o suspend.o
>  obj-$(CONFIG_CPU_IDLE)                 += cpuidle.o
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index b4044469527e..a34343d0f418 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -23,6 +23,7 @@
>  #include <linux/platform_device.h>
>  #include <linux/sched_clock.h>
>  #include <linux/smp.h>
> +#include <linux/nmi.h>
>
>  /* ARMv8 Cortex-A53 specific event types. */
>  #define ARMV8_A53_PERFCTR_PREF_LINEFILL                                0xC2
> @@ -1284,10 +1285,16 @@ static struct platform_driver armv8_pmu_driver = {
>
>  static int __init armv8_pmu_driver_init(void)
>  {
> +       int ret;
> +
>         if (acpi_disabled)
> -               return platform_driver_register(&armv8_pmu_driver);
> +               ret = platform_driver_register(&armv8_pmu_driver);
>         else
> -               return arm_pmu_acpi_probe(armv8_pmuv3_init);
> +               ret = arm_pmu_acpi_probe(armv8_pmuv3_init);
> +
> +       hld_detector_delay_initialized = true;
> +       wake_up(&hld_detector_wait);
> +       return ret;
>  }
>  device_initcall(armv8_pmu_driver_init)
>
> diff --git a/arch/arm64/kernel/watchdog_hld.c b/arch/arm64/kernel/watchdog_hld.c
> new file mode 100644
> index 000000000000..379743e0d001
> --- /dev/null
> +++ b/arch/arm64/kernel/watchdog_hld.c
> @@ -0,0 +1,36 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/nmi.h>
> +#include <linux/cpufreq.h>
> +#include <linux/perf/arm_pmu.h>
> +
> +/*
> + * Safe maximum CPU frequency in case a particular platform doesn't implement
> + * cpufreq driver. Although, architecture doesn't put any restrictions on
> + * maximum frequency but 5 GHz seems to be safe maximum given the available
> + * Arm CPUs in the market which are clocked much less than 5 GHz. On the other
> + * hand, we can't make it much higher as it would lead to a large hard-lockup
> + * detection timeout on parts which are running slower (eg. 1GHz on
> + * Developerbox) and doesn't possess a cpufreq driver.
> + */
> +#define SAFE_MAX_CPU_FREQ      5000000000UL // 5 GHz
> +u64 hw_nmi_get_sample_period(int watchdog_thresh)
> +{
> +       unsigned int cpu = smp_processor_id();
> +       unsigned long max_cpu_freq;
> +
> +       max_cpu_freq = cpufreq_get_hw_max_freq(cpu) * 1000UL;
> +       if (!max_cpu_freq)
> +               max_cpu_freq = SAFE_MAX_CPU_FREQ;
> +
> +       return (u64)max_cpu_freq * watchdog_thresh;
> +}
> +
> +int __init watchdog_nmi_probe(void)
> +{
> +       if (!hld_detector_delay_initialized)
> +               return -EBUSY;
> +       else if (!arm_pmu_irq_is_nmi())
> +               return -ENODEV;
> +
> +       return hardlockup_detector_perf_init();
> +}
> diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> index 3cbc3baf087f..2aecb0c34290 100644
> --- a/drivers/perf/arm_pmu.c
> +++ b/drivers/perf/arm_pmu.c
> @@ -697,6 +697,11 @@ static int armpmu_get_cpu_irq(struct arm_pmu *pmu, int cpu)
>         return per_cpu(hw_events->irq, cpu);
>  }
>
> +bool arm_pmu_irq_is_nmi(void)
> +{
> +       return has_nmi;
> +}
> +
>  /*
>   * PMU hardware loses all context when a CPU goes offline.
>   * When a CPU is hotplugged back in, since some hardware registers are
> diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> index 505480217cf1..bf7966776c55 100644
> --- a/include/linux/perf/arm_pmu.h
> +++ b/include/linux/perf/arm_pmu.h
> @@ -163,6 +163,8 @@ int arm_pmu_acpi_probe(armpmu_init_fn init_fn);
>  static inline int arm_pmu_acpi_probe(armpmu_init_fn init_fn) { return 0; }
>  #endif
>
> +bool arm_pmu_irq_is_nmi(void);
> +
>  /* Internal functions only for core arm_pmu code */
>  struct arm_pmu *armpmu_alloc(void);
>  struct arm_pmu *armpmu_alloc_atomic(void);
> --
> 2.31.1
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 4/4] arm64: Enable perf events based hard lockup detector
  2021-09-23 14:29   ` Pingfan Liu
@ 2021-09-24  5:18     ` Sumit Garg
  2021-09-24 13:31       ` Pingfan Liu
  0 siblings, 1 reply; 15+ messages in thread
From: Sumit Garg @ 2021-09-24  5:18 UTC (permalink / raw)
  To: Pingfan Liu
  Cc: Linux ARM, Catalin Marinas, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Marc Zyngier, Kees Cook,
	Masahiro Yamada, Sami Tolvanen, Petr Mladek, Andrew Morton,
	Wang Qing, Peter Zijlstra (Intel),
	Santosh Sivaraj, LKML

Hi Pingfan,

On Thu, 23 Sept 2021 at 19:59, Pingfan Liu <kernelfans@gmail.com> wrote:
>
> On Thu, Sep 23, 2021 at 10:10 PM Pingfan Liu <kernelfans@gmail.com> wrote:
> >
> > From: Sumit Garg <sumit.garg@linaro.org>
> >
> To Sumit, I think credits should go to you and keep you as the author.
>

Thanks, I am fine with it. If you like then you can add your
"Co-developed-by" as well.

> Please let me know if you dislike it.
>
> Thanks,
>
> Pingfan
> > With the recent feature added to enable perf events to use pseudo NMIs
> > as interrupts on platforms which support GICv3 or later, its now been
> > possible to enable hard lockup detector (or NMI watchdog) on arm64
> > platforms. So enable corresponding support.
> >
> > One thing to note here is that normally lockup detector is initialized
> > just after the early initcalls but PMU on arm64 comes up much later as
> > device_initcall(). So we need to re-initialize lockup detection once
> > PMU has been initialized.

This needs to be updated to reflect delayed initialization instead.

-Sumit

> >
> > [1]: http://lore.kernel.org/linux-arm-kernel/1610712101-14929-1-git-send-email-sumit.garg@linaro.org
> >
> > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > (Pingfan: adapt it to watchdog_hld async model based on [1])
> > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > Cc: Catalin Marinas <catalin.marinas@arm.com>
> > Cc: Will Deacon <will@kernel.org>
> > Cc: Ingo Molnar <mingo@redhat.com>
> > Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> > Cc: Mark Rutland <mark.rutland@arm.com>
> > Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> > Cc: Jiri Olsa <jolsa@redhat.com>
> > Cc: Namhyung Kim <namhyung@kernel.org>
> > Cc: Marc Zyngier <maz@kernel.org>
> > Cc: Kees Cook <keescook@chromium.org>
> > Cc: Masahiro Yamada <masahiroy@kernel.org>
> > Cc: Sami Tolvanen <samitolvanen@google.com>
> > Cc: Petr Mladek <pmladek@suse.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Wang Qing <wangqing@vivo.com>
> > Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> > Cc: Santosh Sivaraj <santosh@fossix.org>
> > Cc: linux-kernel@vger.kernel.org
> > To: linux-arm-kernel@lists.infradead.org
> > ---
> >  arch/arm64/Kconfig               |  2 ++
> >  arch/arm64/kernel/Makefile       |  1 +
> >  arch/arm64/kernel/perf_event.c   | 11 ++++++++--
> >  arch/arm64/kernel/watchdog_hld.c | 36 ++++++++++++++++++++++++++++++++
> >  drivers/perf/arm_pmu.c           |  5 +++++
> >  include/linux/perf/arm_pmu.h     |  2 ++
> >  6 files changed, 55 insertions(+), 2 deletions(-)
> >  create mode 100644 arch/arm64/kernel/watchdog_hld.c
> >
> > diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> > index 5c7ae4c3954b..8287e9e1d28d 100644
> > --- a/arch/arm64/Kconfig
> > +++ b/arch/arm64/Kconfig
> > @@ -189,6 +189,8 @@ config ARM64
> >         select HAVE_NMI
> >         select HAVE_PATA_PLATFORM
> >         select HAVE_PERF_EVENTS
> > +       select HAVE_PERF_EVENTS_NMI if ARM64_PSEUDO_NMI
> > +       select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && HAVE_PERF_EVENTS_NMI
> >         select HAVE_PERF_REGS
> >         select HAVE_PERF_USER_STACK_DUMP
> >         select HAVE_REGS_AND_STACK_ACCESS_API
> > diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> > index 3f1490bfb938..789c2fe5bb90 100644
> > --- a/arch/arm64/kernel/Makefile
> > +++ b/arch/arm64/kernel/Makefile
> > @@ -46,6 +46,7 @@ obj-$(CONFIG_MODULES)                 += module.o
> >  obj-$(CONFIG_ARM64_MODULE_PLTS)                += module-plts.o
> >  obj-$(CONFIG_PERF_EVENTS)              += perf_regs.o perf_callchain.o
> >  obj-$(CONFIG_HW_PERF_EVENTS)           += perf_event.o
> > +obj-$(CONFIG_HARDLOCKUP_DETECTOR_PERF) += watchdog_hld.o
> >  obj-$(CONFIG_HAVE_HW_BREAKPOINT)       += hw_breakpoint.o
> >  obj-$(CONFIG_CPU_PM)                   += sleep.o suspend.o
> >  obj-$(CONFIG_CPU_IDLE)                 += cpuidle.o
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index b4044469527e..a34343d0f418 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -23,6 +23,7 @@
> >  #include <linux/platform_device.h>
> >  #include <linux/sched_clock.h>
> >  #include <linux/smp.h>
> > +#include <linux/nmi.h>
> >
> >  /* ARMv8 Cortex-A53 specific event types. */
> >  #define ARMV8_A53_PERFCTR_PREF_LINEFILL                                0xC2
> > @@ -1284,10 +1285,16 @@ static struct platform_driver armv8_pmu_driver = {
> >
> >  static int __init armv8_pmu_driver_init(void)
> >  {
> > +       int ret;
> > +
> >         if (acpi_disabled)
> > -               return platform_driver_register(&armv8_pmu_driver);
> > +               ret = platform_driver_register(&armv8_pmu_driver);
> >         else
> > -               return arm_pmu_acpi_probe(armv8_pmuv3_init);
> > +               ret = arm_pmu_acpi_probe(armv8_pmuv3_init);
> > +
> > +       hld_detector_delay_initialized = true;
> > +       wake_up(&hld_detector_wait);
> > +       return ret;
> >  }
> >  device_initcall(armv8_pmu_driver_init)
> >
> > diff --git a/arch/arm64/kernel/watchdog_hld.c b/arch/arm64/kernel/watchdog_hld.c
> > new file mode 100644
> > index 000000000000..379743e0d001
> > --- /dev/null
> > +++ b/arch/arm64/kernel/watchdog_hld.c
> > @@ -0,0 +1,36 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <linux/nmi.h>
> > +#include <linux/cpufreq.h>
> > +#include <linux/perf/arm_pmu.h>
> > +
> > +/*
> > + * Safe maximum CPU frequency in case a particular platform doesn't implement
> > + * cpufreq driver. Although, architecture doesn't put any restrictions on
> > + * maximum frequency but 5 GHz seems to be safe maximum given the available
> > + * Arm CPUs in the market which are clocked much less than 5 GHz. On the other
> > + * hand, we can't make it much higher as it would lead to a large hard-lockup
> > + * detection timeout on parts which are running slower (eg. 1GHz on
> > + * Developerbox) and doesn't possess a cpufreq driver.
> > + */
> > +#define SAFE_MAX_CPU_FREQ      5000000000UL // 5 GHz
> > +u64 hw_nmi_get_sample_period(int watchdog_thresh)
> > +{
> > +       unsigned int cpu = smp_processor_id();
> > +       unsigned long max_cpu_freq;
> > +
> > +       max_cpu_freq = cpufreq_get_hw_max_freq(cpu) * 1000UL;
> > +       if (!max_cpu_freq)
> > +               max_cpu_freq = SAFE_MAX_CPU_FREQ;
> > +
> > +       return (u64)max_cpu_freq * watchdog_thresh;
> > +}
> > +
> > +int __init watchdog_nmi_probe(void)
> > +{
> > +       if (!hld_detector_delay_initialized)
> > +               return -EBUSY;
> > +       else if (!arm_pmu_irq_is_nmi())
> > +               return -ENODEV;
> > +
> > +       return hardlockup_detector_perf_init();
> > +}
> > diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
> > index 3cbc3baf087f..2aecb0c34290 100644
> > --- a/drivers/perf/arm_pmu.c
> > +++ b/drivers/perf/arm_pmu.c
> > @@ -697,6 +697,11 @@ static int armpmu_get_cpu_irq(struct arm_pmu *pmu, int cpu)
> >         return per_cpu(hw_events->irq, cpu);
> >  }
> >
> > +bool arm_pmu_irq_is_nmi(void)
> > +{
> > +       return has_nmi;
> > +}
> > +
> >  /*
> >   * PMU hardware loses all context when a CPU goes offline.
> >   * When a CPU is hotplugged back in, since some hardware registers are
> > diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
> > index 505480217cf1..bf7966776c55 100644
> > --- a/include/linux/perf/arm_pmu.h
> > +++ b/include/linux/perf/arm_pmu.h
> > @@ -163,6 +163,8 @@ int arm_pmu_acpi_probe(armpmu_init_fn init_fn);
> >  static inline int arm_pmu_acpi_probe(armpmu_init_fn init_fn) { return 0; }
> >  #endif
> >
> > +bool arm_pmu_irq_is_nmi(void);
> > +
> >  /* Internal functions only for core arm_pmu code */
> >  struct arm_pmu *armpmu_alloc(void);
> >  struct arm_pmu *armpmu_alloc_atomic(void);
> > --
> > 2.31.1
> >

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 4/4] arm64: Enable perf events based hard lockup detector
  2021-09-24  5:18     ` Sumit Garg
@ 2021-09-24 13:31       ` Pingfan Liu
  0 siblings, 0 replies; 15+ messages in thread
From: Pingfan Liu @ 2021-09-24 13:31 UTC (permalink / raw)
  To: Sumit Garg
  Cc: Linux ARM, Catalin Marinas, Will Deacon, Ingo Molnar,
	Arnaldo Carvalho de Melo, Mark Rutland, Alexander Shishkin,
	Jiri Olsa, Namhyung Kim, Marc Zyngier, Kees Cook,
	Masahiro Yamada, Sami Tolvanen, Petr Mladek, Andrew Morton,
	Wang Qing, Peter Zijlstra (Intel),
	Santosh Sivaraj, LKML

On Fri, Sep 24, 2021 at 1:19 PM Sumit Garg <sumit.garg@linaro.org> wrote:
>
> Hi Pingfan,
>
> On Thu, 23 Sept 2021 at 19:59, Pingfan Liu <kernelfans@gmail.com> wrote:
> >
> > On Thu, Sep 23, 2021 at 10:10 PM Pingfan Liu <kernelfans@gmail.com> wrote:
> > >
> > > From: Sumit Garg <sumit.garg@linaro.org>
> > >
> > To Sumit, I think credits should go to you and keep you as the author.
> >
>
> Thanks, I am fine with it. If you like then you can add your
> "Co-developed-by" as well.
>
OK, I will.

> > > One thing to note here is that normally lockup detector is initialized
> > > just after the early initcalls but PMU on arm64 comes up much later as
> > > device_initcall(). So we need to re-initialize lockup detection once
> > > PMU has been initialized.
>
> This needs to be updated to reflect delayed initialization instead.
>
I will update the commit log accordingly.

Thanks,

Pingfan

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 1/4] kernel/watchdog: trival cleanups
  2021-09-23 14:09 ` [PATCHv2 1/4] kernel/watchdog: trival cleanups Pingfan Liu
@ 2021-10-04  9:32   ` Petr Mladek
  2021-10-08  4:04     ` Pingfan Liu
  0 siblings, 1 reply; 15+ messages in thread
From: Petr Mladek @ 2021-10-04  9:32 UTC (permalink / raw)
  To: Pingfan Liu
  Cc: linux-kernel, Andrew Morton, Wang Qing, Peter Zijlstra (Intel),
	Santosh Sivaraj, linux-arm-kernel

On Thu 2021-09-23 22:09:48, Pingfan Liu wrote:
> No reference to WATCHDOG_DEFAULT, remove it.
> 
> And nobody cares about the return value of watchdog_nmi_enable(),
> changing its prototype to void.
> 
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Petr Mladek <pmladek@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Wang Qing <wangqing@vivo.com>
> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> Cc: Santosh Sivaraj <santosh@fossix.org>
> Cc: linux-arm-kernel@lists.infradead.org
> To: linux-kernel@vger.kernel.org
> ---
>  include/linux/nmi.h | 2 +-
>  kernel/watchdog.c   | 5 +----
>  2 files changed, 2 insertions(+), 5 deletions(-)
> 
> diff --git a/include/linux/nmi.h b/include/linux/nmi.h
> index 750c7f395ca9..b7bcd63c36b4 100644
> --- a/include/linux/nmi.h
> +++ b/include/linux/nmi.h
> @@ -119,7 +119,7 @@ static inline int hardlockup_detector_perf_init(void) { return 0; }
>  void watchdog_nmi_stop(void);
>  void watchdog_nmi_start(void);
>  int watchdog_nmi_probe(void);
> -int watchdog_nmi_enable(unsigned int cpu);
> +void watchdog_nmi_enable(unsigned int cpu);
>  void watchdog_nmi_disable(unsigned int cpu);
>  
>  /**
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index ad912511a0c0..6e6dd5f0bc3e 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -30,10 +30,8 @@
>  static DEFINE_MUTEX(watchdog_mutex);
>  
>  #if defined(CONFIG_HARDLOCKUP_DETECTOR) || defined(CONFIG_HAVE_NMI_WATCHDOG)
> -# define WATCHDOG_DEFAULT	(SOFT_WATCHDOG_ENABLED | NMI_WATCHDOG_ENABLED)
>  # define NMI_WATCHDOG_DEFAULT	1
>  #else
> -# define WATCHDOG_DEFAULT	(SOFT_WATCHDOG_ENABLED)
>  # define NMI_WATCHDOG_DEFAULT	0
>  #endif
>  
> @@ -95,10 +93,9 @@ __setup("nmi_watchdog=", hardlockup_panic_setup);
>   * softlockup watchdog start and stop. The arch must select the
>   * SOFTLOCKUP_DETECTOR Kconfig.
>   */
> -int __weak watchdog_nmi_enable(unsigned int cpu)
> +void __weak watchdog_nmi_enable(unsigned int cpu)

It is __weak. spart specific implementation is in
arch/sparc/kernel/nmi.c. It has to be updated as well.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 2/4] kernel/watchdog_hld: clarify the condition in hardlockup_detector_event_create()
  2021-09-23 14:09 ` [PATCHv2 2/4] kernel/watchdog_hld: clarify the condition in hardlockup_detector_event_create() Pingfan Liu
@ 2021-10-04 12:32   ` Petr Mladek
  2021-10-08  4:11     ` Pingfan Liu
  0 siblings, 1 reply; 15+ messages in thread
From: Petr Mladek @ 2021-10-04 12:32 UTC (permalink / raw)
  To: Pingfan Liu
  Cc: linux-kernel, Andrew Morton, Wang Qing, Peter Zijlstra (Intel),
	Santosh Sivaraj, linux-arm-kernel

On Thu 2021-09-23 22:09:49, Pingfan Liu wrote:
> As for the context, there are two arguments to change
> debug_smp_processor_id() to is_percpu_thread().
> 
>   -1. watchdog_ev is percpu, and migration will frustrate the attempt
> which try to bind a watchdog_ev to a cpu by protecting this func inside
> the pair of preempt_disable()/preempt_enable().
> 
>   -2. hardlockup_detector_event_create() indirectly calls
> kmem_cache_alloc_node(), which is blockable.
> 
> So here, spelling out the really planned context "is_percpu_thread()".

The description is pretty hard to understand. I would suggest
something like:

Subject: kernel/watchdog_hld: Ensure CPU-bound context when creating
hardlockup detector event

hardlockup_detector_event_create() should create perf_event on the
current CPU. Preemption could not get disabled because
perf_event_create_kernel_counter() allocates memory. Instead,
the CPU locality is achieved by processing the code in a per-CPU
bound kthread.

Add a check to prevent mistakes when calling the code in another
code path.

> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Petr Mladek <pmladek@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Wang Qing <wangqing@vivo.com>
> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> Cc: Santosh Sivaraj <santosh@fossix.org>
> Cc: linux-arm-kernel@lists.infradead.org
> To: linux-kernel@vger.kernel.org
> ---
>  kernel/watchdog_hld.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
> index 247bf0b1582c..df010df76576 100644
> --- a/kernel/watchdog_hld.c
> +++ b/kernel/watchdog_hld.c
> @@ -165,10 +165,13 @@ static void watchdog_overflow_callback(struct perf_event *event,
>  
>  static int hardlockup_detector_event_create(void)
>  {
> -	unsigned int cpu = smp_processor_id();
> +	unsigned int cpu;
>  	struct perf_event_attr *wd_attr;
>  	struct perf_event *evt;
>  
> +	/* This function plans to execute in cpu bound kthread */

This does not explain why it is needed. I suggest something like:

	/*
	 * Preemption is not disabled because memory will be allocated.
	 * Ensure CPU-locality by calling this in per-CPU kthread.
	 */


> +	WARN_ON(!is_percpu_thread());
> +	cpu = raw_smp_processor_id();
>  	wd_attr = &wd_hw_attr;
>  	wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
>  

Othrewise the change looks good to me.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 3/4] kernel/watchdog: adapt the watchdog_hld interface for async model
  2021-09-23 14:09 ` [PATCHv2 3/4] kernel/watchdog: adapt the watchdog_hld interface for async model Pingfan Liu
@ 2021-10-05  7:03   ` Petr Mladek
  2021-10-08  5:53     ` Pingfan Liu
  0 siblings, 1 reply; 15+ messages in thread
From: Petr Mladek @ 2021-10-05  7:03 UTC (permalink / raw)
  To: Pingfan Liu
  Cc: linux-kernel, Sumit Garg, Catalin Marinas, Will Deacon,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Marc Zyngier,
	Kees Cook, Masahiro Yamada, Sami Tolvanen, Andrew Morton,
	Wang Qing, Peter Zijlstra (Intel),
	Santosh Sivaraj, linux-arm-kernel

On Thu 2021-09-23 22:09:50, Pingfan Liu wrote:
> When lockup_detector_init()->watchdog_nmi_probe(), PMU may be not ready
> yet. E.g. on arm64, PMU is not ready until
> device_initcall(armv8_pmu_driver_init).  And it is deeply integrated
> with the driver model and cpuhp. Hence it is hard to push this
> initialization before smp_init().
> 
> But it is easy to take an opposite approach by enabling watchdog_hld to
> get the capability of PMU async.
> 
> The async model is achieved by expanding watchdog_nmi_probe() with
> -EBUSY, and a re-initializing work_struct which waits on a wait_queue_head.
> 
> Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> Cc: Sumit Garg <sumit.garg@linaro.org>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
> Cc: Jiri Olsa <jolsa@redhat.com>
> Cc: Namhyung Kim <namhyung@kernel.org>
> Cc: Marc Zyngier <maz@kernel.org>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Masahiro Yamada <masahiroy@kernel.org>
> Cc: Sami Tolvanen <samitolvanen@google.com>
> Cc: Petr Mladek <pmladek@suse.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Wang Qing <wangqing@vivo.com>
> Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> Cc: Santosh Sivaraj <santosh@fossix.org>
> Cc: linux-arm-kernel@lists.infradead.org
> To: linux-kernel@vger.kernel.org
> ---
>  include/linux/nmi.h |  3 +++
>  kernel/watchdog.c   | 37 +++++++++++++++++++++++++++++++++++--
>  2 files changed, 38 insertions(+), 2 deletions(-)
> 
> diff --git a/include/linux/nmi.h b/include/linux/nmi.h
> index b7bcd63c36b4..270d440fe4b7 100644
> --- a/include/linux/nmi.h
> +++ b/include/linux/nmi.h
> @@ -118,6 +118,9 @@ static inline int hardlockup_detector_perf_init(void) { return 0; }
>  
>  void watchdog_nmi_stop(void);
>  void watchdog_nmi_start(void);
> +
> +extern bool hld_detector_delay_initialized;
> +extern struct wait_queue_head hld_detector_wait;
>  int watchdog_nmi_probe(void);
>  void watchdog_nmi_enable(unsigned int cpu);
>  void watchdog_nmi_disable(unsigned int cpu);
> diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> index 6e6dd5f0bc3e..bd4ae1839b72 100644
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -103,7 +103,10 @@ void __weak watchdog_nmi_disable(unsigned int cpu)
>  	hardlockup_detector_perf_disable();
>  }
>  
> -/* Return 0, if a NMI watchdog is available. Error code otherwise */
> +/*
> + * Return 0, if a NMI watchdog is available. -EBUSY if not ready.
> + * Other negative value if not support.
> + */
>  int __weak __init watchdog_nmi_probe(void)
>  {
>  	return hardlockup_detector_perf_init();
> @@ -739,15 +742,45 @@ int proc_watchdog_cpumask(struct ctl_table *table, int write,
>  }
>  #endif /* CONFIG_SYSCTL */
>  
> +static void lockup_detector_delay_init(struct work_struct *work);
> +bool hld_detector_delay_initialized __initdata;
> +
> +struct wait_queue_head hld_detector_wait __initdata =
> +		__WAIT_QUEUE_HEAD_INITIALIZER(hld_detector_wait);
> +
> +static struct work_struct detector_work __initdata =
> +		__WORK_INITIALIZER(detector_work, lockup_detector_delay_init);
> +
> +static void __init lockup_detector_delay_init(struct work_struct *work)
> +{
> +	int ret;
> +
> +	wait_event(hld_detector_wait, hld_detector_delay_initialized);
> +	ret = watchdog_nmi_probe();
> +	if (!ret) {
> +		nmi_watchdog_available = true;
> +		lockup_detector_setup();

Is it really safe to call the entire lockup_detector_setup()
later?

It manipulates also softlockup detector. And more importantly,
the original call is before smp_init(). It means that it was
running when only single CPU was on.

It seems that x86 has some problem with hardlockup detector as
well. It later manipulates only the hardlockup detector. Also it uses
cpus_read_lock() to prevent races with CPU hotplug, see
fixup_ht_bug().


> +	} else {
> +		WARN_ON(ret == -EBUSY);
> +		pr_info("Perf NMI watchdog permanently disabled\n");
> +	}
> +}
> +
>  void __init lockup_detector_init(void)
>  {
> +	int ret;
> +
>  	if (tick_nohz_full_enabled())
>  		pr_info("Disabling watchdog on nohz_full cores by default\n");
>  
>  	cpumask_copy(&watchdog_cpumask,
>  		     housekeeping_cpumask(HK_FLAG_TIMER));
>  
> -	if (!watchdog_nmi_probe())
> +	ret = watchdog_nmi_probe();
> +	if (!ret)
>  		nmi_watchdog_available = true;
> +	else if (ret == -EBUSY)
> +		queue_work_on(smp_processor_id(), system_wq, &detector_work);

IMHO, this is not acceptable. It will block one worker until someone
wakes it. Only arm64 will have a code to wake up the work and only
when pmu is successfully initialized. In all other cases, the worker
will stay blocked forever.

The right solution is to do it the other way. Queue the work
from arm64-specific code when armv8_pmu_driver_init() succeeded.

Also I suggest to flush the work to make sure that it is finished
before __init code gets removed.


The open question is what code the work will call. As mentioned
above, I am not sure that lockup_detector_delay_init() is safe.
IMHO, we need to manipulate only hardlockup detector and
we have to serialize it against CPU hotplug.

Best Regards,
Petr

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 1/4] kernel/watchdog: trival cleanups
  2021-10-04  9:32   ` Petr Mladek
@ 2021-10-08  4:04     ` Pingfan Liu
  0 siblings, 0 replies; 15+ messages in thread
From: Pingfan Liu @ 2021-10-08  4:04 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Pingfan Liu, linux-kernel, Andrew Morton, Wang Qing,
	Peter Zijlstra (Intel),
	Santosh Sivaraj, linux-arm-kernel

On Mon, Oct 04, 2021 at 11:32:41AM +0200, Petr Mladek wrote:
> On Thu 2021-09-23 22:09:48, Pingfan Liu wrote:
> > No reference to WATCHDOG_DEFAULT, remove it.
> > 
> > And nobody cares about the return value of watchdog_nmi_enable(),
> > changing its prototype to void.
> > 
> > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > Cc: Petr Mladek <pmladek@suse.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Wang Qing <wangqing@vivo.com>
> > Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> > Cc: Santosh Sivaraj <santosh@fossix.org>
> > Cc: linux-arm-kernel@lists.infradead.org
> > To: linux-kernel@vger.kernel.org
> > ---
> >  include/linux/nmi.h | 2 +-
> >  kernel/watchdog.c   | 5 +----
> >  2 files changed, 2 insertions(+), 5 deletions(-)
> > 
> > diff --git a/include/linux/nmi.h b/include/linux/nmi.h
> > index 750c7f395ca9..b7bcd63c36b4 100644
> > --- a/include/linux/nmi.h
> > +++ b/include/linux/nmi.h
> > @@ -119,7 +119,7 @@ static inline int hardlockup_detector_perf_init(void) { return 0; }
> >  void watchdog_nmi_stop(void);
> >  void watchdog_nmi_start(void);
> >  int watchdog_nmi_probe(void);
> > -int watchdog_nmi_enable(unsigned int cpu);
> > +void watchdog_nmi_enable(unsigned int cpu);
> >  void watchdog_nmi_disable(unsigned int cpu);
> >  
> >  /**
> > diff --git a/kernel/watchdog.c b/kernel/watchdog.c
> > index ad912511a0c0..6e6dd5f0bc3e 100644
> > --- a/kernel/watchdog.c
> > +++ b/kernel/watchdog.c
> > @@ -30,10 +30,8 @@
> >  static DEFINE_MUTEX(watchdog_mutex);
> >  
> >  #if defined(CONFIG_HARDLOCKUP_DETECTOR) || defined(CONFIG_HAVE_NMI_WATCHDOG)
> > -# define WATCHDOG_DEFAULT	(SOFT_WATCHDOG_ENABLED | NMI_WATCHDOG_ENABLED)
> >  # define NMI_WATCHDOG_DEFAULT	1
> >  #else
> > -# define WATCHDOG_DEFAULT	(SOFT_WATCHDOG_ENABLED)
> >  # define NMI_WATCHDOG_DEFAULT	0
> >  #endif
> >  
> > @@ -95,10 +93,9 @@ __setup("nmi_watchdog=", hardlockup_panic_setup);
> >   * softlockup watchdog start and stop. The arch must select the
> >   * SOFTLOCKUP_DETECTOR Kconfig.
> >   */
> > -int __weak watchdog_nmi_enable(unsigned int cpu)
> > +void __weak watchdog_nmi_enable(unsigned int cpu)
> 
> It is __weak. spart specific implementation is in
> arch/sparc/kernel/nmi.c. It has to be updated as well.
> 
Oops, I will fix it.

Thanks,

	Pingfan


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 2/4] kernel/watchdog_hld: clarify the condition in hardlockup_detector_event_create()
  2021-10-04 12:32   ` Petr Mladek
@ 2021-10-08  4:11     ` Pingfan Liu
  0 siblings, 0 replies; 15+ messages in thread
From: Pingfan Liu @ 2021-10-08  4:11 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Pingfan Liu, linux-kernel, Andrew Morton, Wang Qing,
	Peter Zijlstra (Intel),
	Santosh Sivaraj, linux-arm-kernel

On Mon, Oct 04, 2021 at 02:32:47PM +0200, Petr Mladek wrote:
> On Thu 2021-09-23 22:09:49, Pingfan Liu wrote:
> > As for the context, there are two arguments to change
> > debug_smp_processor_id() to is_percpu_thread().
> > 
> >   -1. watchdog_ev is percpu, and migration will frustrate the attempt
> > which try to bind a watchdog_ev to a cpu by protecting this func inside
> > the pair of preempt_disable()/preempt_enable().
> > 
> >   -2. hardlockup_detector_event_create() indirectly calls
> > kmem_cache_alloc_node(), which is blockable.
> > 
> > So here, spelling out the really planned context "is_percpu_thread()".
> 
> The description is pretty hard to understand. I would suggest
> something like:
> 
> Subject: kernel/watchdog_hld: Ensure CPU-bound context when creating
> hardlockup detector event
> 
> hardlockup_detector_event_create() should create perf_event on the
> current CPU. Preemption could not get disabled because
> perf_event_create_kernel_counter() allocates memory. Instead,
> the CPU locality is achieved by processing the code in a per-CPU
> bound kthread.
> 
> Add a check to prevent mistakes when calling the code in another
> code path.
> 
Appreciate for that. I will use it.

> > Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
> > Cc: Petr Mladek <pmladek@suse.com>
> > Cc: Andrew Morton <akpm@linux-foundation.org>
> > Cc: Wang Qing <wangqing@vivo.com>
> > Cc: "Peter Zijlstra (Intel)" <peterz@infradead.org>
> > Cc: Santosh Sivaraj <santosh@fossix.org>
> > Cc: linux-arm-kernel@lists.infradead.org
> > To: linux-kernel@vger.kernel.org
> > ---
> >  kernel/watchdog_hld.c | 5 ++++-
> >  1 file changed, 4 insertions(+), 1 deletion(-)
> > 
> > diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
> > index 247bf0b1582c..df010df76576 100644
> > --- a/kernel/watchdog_hld.c
> > +++ b/kernel/watchdog_hld.c
> > @@ -165,10 +165,13 @@ static void watchdog_overflow_callback(struct perf_event *event,
> >  
> >  static int hardlockup_detector_event_create(void)
> >  {
> > -	unsigned int cpu = smp_processor_id();
> > +	unsigned int cpu;
> >  	struct perf_event_attr *wd_attr;
> >  	struct perf_event *evt;
> >  
> > +	/* This function plans to execute in cpu bound kthread */
> 
> This does not explain why it is needed. I suggest something like:
> 
> 	/*
> 	 * Preemption is not disabled because memory will be allocated.
> 	 * Ensure CPU-locality by calling this in per-CPU kthread.
> 	 */
> 
It sounds good. I will use it.

> 
> > +	WARN_ON(!is_percpu_thread());
> > +	cpu = raw_smp_processor_id();
> >  	wd_attr = &wd_hw_attr;
> >  	wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
> >  
> 
> Othrewise the change looks good to me.
> 
Thank for your help.

Regards,

	Pingfan
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 3/4] kernel/watchdog: adapt the watchdog_hld interface for async model
  2021-10-05  7:03   ` Petr Mladek
@ 2021-10-08  5:53     ` Pingfan Liu
  2021-10-08 15:10       ` Pingfan Liu
  0 siblings, 1 reply; 15+ messages in thread
From: Pingfan Liu @ 2021-10-08  5:53 UTC (permalink / raw)
  To: Petr Mladek
  Cc: Pingfan Liu, linux-kernel, Sumit Garg, Catalin Marinas,
	Will Deacon, Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Marc Zyngier,
	Kees Cook, Masahiro Yamada, Sami Tolvanen, Andrew Morton,
	Wang Qing, Peter Zijlstra (Intel),
	Santosh Sivaraj, linux-arm-kernel

On Tue, Oct 05, 2021 at 09:03:17AM +0200, Petr Mladek wrote:
[...]
> > +static void lockup_detector_delay_init(struct work_struct *work);
> > +bool hld_detector_delay_initialized __initdata;
> > +
> > +struct wait_queue_head hld_detector_wait __initdata =
> > +		__WAIT_QUEUE_HEAD_INITIALIZER(hld_detector_wait);
> > +
> > +static struct work_struct detector_work __initdata =
> > +		__WORK_INITIALIZER(detector_work, lockup_detector_delay_init);
> > +
> > +static void __init lockup_detector_delay_init(struct work_struct *work)
> > +{
> > +	int ret;
> > +
> > +	wait_event(hld_detector_wait, hld_detector_delay_initialized);
> > +	ret = watchdog_nmi_probe();
> > +	if (!ret) {
> > +		nmi_watchdog_available = true;
> > +		lockup_detector_setup();
> 
> Is it really safe to call the entire lockup_detector_setup()
> later?
> 
> It manipulates also softlockup detector. And more importantly,
> the original call is before smp_init(). It means that it was
> running when only single CPU was on.
> 
For the race analysis, lockup_detector_reconfigure() is on the centre stage.
Since proc_watchdog_update() can call lockup_detector_reconfigure() to
re-initialize both soft and hard lockup detector, so the race issue
should be already taken into consideration.

> It seems that x86 has some problem with hardlockup detector as
> well. It later manipulates only the hardlockup detector. Also it uses
> cpus_read_lock() to prevent races with CPU hotplug, see
> fixup_ht_bug().
> 
Yes. But hardlockup_detector_perf_{stop,start}() can not meet the
requirement, since no perf_event is created yet. So there is no handy
interface to re-initialize hardlockup detector directly.

> 
> > +	} else {
> > +		WARN_ON(ret == -EBUSY);
> > +		pr_info("Perf NMI watchdog permanently disabled\n");
> > +	}
> > +}
> > +
> >  void __init lockup_detector_init(void)
> >  {
> > +	int ret;
> > +
> >  	if (tick_nohz_full_enabled())
> >  		pr_info("Disabling watchdog on nohz_full cores by default\n");
> >  
> >  	cpumask_copy(&watchdog_cpumask,
> >  		     housekeeping_cpumask(HK_FLAG_TIMER));
> >  
> > -	if (!watchdog_nmi_probe())
> > +	ret = watchdog_nmi_probe();
> > +	if (!ret)
> >  		nmi_watchdog_available = true;
> > +	else if (ret == -EBUSY)
> > +		queue_work_on(smp_processor_id(), system_wq, &detector_work);
> 
> IMHO, this is not acceptable. It will block one worker until someone
> wakes it. Only arm64 will have a code to wake up the work and only
> when pmu is successfully initialized. In all other cases, the worker
> will stay blocked forever.
> 
What about consider -EBUSY and hld_detector_delay_initialized as a unit?
If watchdog_nmi_probe() returns -EBUSY, then
set the state of ld_detector_delay_initialized as "waiting", and then moved to state "finished".

And at the end of do_initcalls(), check the state is "finished". If not,
then throw a warning and wake up the worker.

> The right solution is to do it the other way. Queue the work
> from arm64-specific code when armv8_pmu_driver_init() succeeded.
> 
Could it be better if watchdog can provide a common framework for future
extension instead of arch specific? The 2nd argument is to avoid the
message "Perf NMI watchdog permanently disabled" while later enabling
it.  (Please see
lockup_detector_init()->watchdog_nmi_probe()->hardlockup_detector_perf_init(),
but if providing arch specific probe method, it can be avoided)

> Also I suggest to flush the work to make sure that it is finished
> before __init code gets removed.
> 
Good point, and very interesting. I will look into it.

> 
> The open question is what code the work will call. As mentioned
> above, I am not sure that lockup_detector_delay_init() is safe.
> IMHO, we need to manipulate only hardlockup detector and
> we have to serialize it against CPU hotplug.
> 
As explained ahead, it has already consider the race against CPU
hotplug.

Thanks,

	Pingfan


^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCHv2 3/4] kernel/watchdog: adapt the watchdog_hld interface for async model
  2021-10-08  5:53     ` Pingfan Liu
@ 2021-10-08 15:10       ` Pingfan Liu
  0 siblings, 0 replies; 15+ messages in thread
From: Pingfan Liu @ 2021-10-08 15:10 UTC (permalink / raw)
  To: Petr Mladek
  Cc: linux-kernel, Sumit Garg, Catalin Marinas, Will Deacon,
	Ingo Molnar, Arnaldo Carvalho de Melo, Mark Rutland,
	Alexander Shishkin, Jiri Olsa, Namhyung Kim, Marc Zyngier,
	Kees Cook, Masahiro Yamada, Sami Tolvanen, Andrew Morton,
	Wang Qing, Peter Zijlstra (Intel),
	Santosh Sivaraj, linux-arm-kernel

On Fri, Oct 08, 2021 at 01:53:45PM +0800, Pingfan Liu wrote:
> On Tue, Oct 05, 2021 at 09:03:17AM +0200, Petr Mladek wrote:
> [...]
> > > +static void lockup_detector_delay_init(struct work_struct *work);
> > > +bool hld_detector_delay_initialized __initdata;
> > > +
> > > +struct wait_queue_head hld_detector_wait __initdata =
> > > +		__WAIT_QUEUE_HEAD_INITIALIZER(hld_detector_wait);
> > > +
> > > +static struct work_struct detector_work __initdata =
> > > +		__WORK_INITIALIZER(detector_work, lockup_detector_delay_init);
> > > +
> > > +static void __init lockup_detector_delay_init(struct work_struct *work)
> > > +{
> > > +	int ret;
> > > +
> > > +	wait_event(hld_detector_wait, hld_detector_delay_initialized);
> > > +	ret = watchdog_nmi_probe();
> > > +	if (!ret) {
> > > +		nmi_watchdog_available = true;
> > > +		lockup_detector_setup();
> > 
> > Is it really safe to call the entire lockup_detector_setup()
> > later?
> > 
> > It manipulates also softlockup detector. And more importantly,
> > the original call is before smp_init(). It means that it was
> > running when only single CPU was on.
> > 
> For the race analysis, lockup_detector_reconfigure() is on the centre stage.
> Since proc_watchdog_update() can call lockup_detector_reconfigure() to
> re-initialize both soft and hard lockup detector, so the race issue
> should be already taken into consideration.
> 
> > It seems that x86 has some problem with hardlockup detector as
> > well. It later manipulates only the hardlockup detector. Also it uses
> > cpus_read_lock() to prevent races with CPU hotplug, see
> > fixup_ht_bug().
> > 
> Yes. But hardlockup_detector_perf_{stop,start}() can not meet the
> requirement, since no perf_event is created yet. So there is no handy
> interface to re-initialize hardlockup detector directly.
> 
> > 
> > > +	} else {
> > > +		WARN_ON(ret == -EBUSY);
> > > +		pr_info("Perf NMI watchdog permanently disabled\n");
> > > +	}
> > > +}
> > > +
> > >  void __init lockup_detector_init(void)
> > >  {
> > > +	int ret;
> > > +
> > >  	if (tick_nohz_full_enabled())
> > >  		pr_info("Disabling watchdog on nohz_full cores by default\n");
> > >  
> > >  	cpumask_copy(&watchdog_cpumask,
> > >  		     housekeeping_cpumask(HK_FLAG_TIMER));
> > >  
> > > -	if (!watchdog_nmi_probe())
> > > +	ret = watchdog_nmi_probe();
> > > +	if (!ret)
> > >  		nmi_watchdog_available = true;
> > > +	else if (ret == -EBUSY)
> > > +		queue_work_on(smp_processor_id(), system_wq, &detector_work);
> > 
> > IMHO, this is not acceptable. It will block one worker until someone
> > wakes it. Only arm64 will have a code to wake up the work and only
> > when pmu is successfully initialized. In all other cases, the worker
> > will stay blocked forever.
> > 
> What about consider -EBUSY and hld_detector_delay_initialized as a unit?
                                                                     ^^^
								     unity
> If watchdog_nmi_probe() returns -EBUSY, then
> set the state of ld_detector_delay_initialized as "waiting", and then moved to state "finished".
> 
> And at the end of do_initcalls(), check the state is "finished". If not,
> then throw a warning and wake up the worker.
> 
> > The right solution is to do it the other way. Queue the work
> > from arm64-specific code when armv8_pmu_driver_init() succeeded.
> > 
> Could it be better if watchdog can provide a common framework for future
> extension instead of arch specific? The 2nd argument is to avoid the
> message "Perf NMI watchdog permanently disabled" while later enabling
> it.  (Please see
> lockup_detector_init()->watchdog_nmi_probe()->hardlockup_detector_perf_init(),
> but if providing arch specific probe method, it can be avoided)
> 
Sorry for poor expression. I have not explained it completely for the
second point.

Since using arch specific watchdog_nmi_probe() to avoid misleading
message "Perf NMI watchdog permanently disabled", then -EBUSY should be
returned. And from watchdog level, it should know how to handle error,
that is to say queue_work_on(smp_processor_id(), system_wq, &detector_work).

Thanks,

	Pingfan

> > Also I suggest to flush the work to make sure that it is finished
> > before __init code gets removed.
> > 
> Good point, and very interesting. I will look into it.
> 
> > 
> > The open question is what code the work will call. As mentioned
> > above, I am not sure that lockup_detector_delay_init() is safe.
> > IMHO, we need to manipulate only hardlockup detector and
> > we have to serialize it against CPU hotplug.
> > 
> As explained ahead, it has already consider the race against CPU
> hotplug.
> 
> Thanks,
> 
> 	Pingfan
> 

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2021-10-08 15:10 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-23 14:09 [PATCHv2 0/4] watchdog_hld cleanup and async model for arm64 Pingfan Liu
2021-09-23 14:09 ` [PATCHv2 1/4] kernel/watchdog: trival cleanups Pingfan Liu
2021-10-04  9:32   ` Petr Mladek
2021-10-08  4:04     ` Pingfan Liu
2021-09-23 14:09 ` [PATCHv2 2/4] kernel/watchdog_hld: clarify the condition in hardlockup_detector_event_create() Pingfan Liu
2021-10-04 12:32   ` Petr Mladek
2021-10-08  4:11     ` Pingfan Liu
2021-09-23 14:09 ` [PATCHv2 3/4] kernel/watchdog: adapt the watchdog_hld interface for async model Pingfan Liu
2021-10-05  7:03   ` Petr Mladek
2021-10-08  5:53     ` Pingfan Liu
2021-10-08 15:10       ` Pingfan Liu
2021-09-23 14:09 ` [PATCHv2 4/4] arm64: Enable perf events based hard lockup detector Pingfan Liu
2021-09-23 14:29   ` Pingfan Liu
2021-09-24  5:18     ` Sumit Garg
2021-09-24 13:31       ` Pingfan Liu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).