linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v2 0/5] Support hld based on Pseudo-NMI for arm64
@ 2022-03-07 15:47 Lecopzer Chen
  2022-03-07 15:47 ` [PATCH v2 1/5] kernel/watchdog: remove WATCHDOG_DEFAULT Lecopzer Chen
                   ` (4 more replies)
  0 siblings, 5 replies; 10+ messages in thread
From: Lecopzer Chen @ 2022-03-07 15:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: lecopzer.chen, acme, akpm, alexander.shishkin, catalin.marinas,
	davem, jolsa, jthierry, keescook, kernelfans, linux-arm-kernel,
	linux-mediatek, linux-perf-users, mark.rutland, masahiroy,
	matthias.bgg, maz, mcgrof, mingo, namhyung, nixiaoming, peterz,
	pmladek, sparclinux, sumit.garg, wangqing, will, yj.chiang

We have been using hld internally on arm64 since 2020, but there is
still no proper upstream commit for it, and we badly need one.

This series is reworked from [1] on top of 5.17-rc7. The original authors are
Pingfan Liu <kernelfans@gmail.com>
Sumit Garg <sumit.garg@linaro.org>

Quote from [1]:

> Hard lockup detector is helpful to diagnose unpaired irq enable/disable.
> But the current watchdog framework can not cope with arm64 hw perf event
> easily.

> On arm64, when lockup_detector_init()->watchdog_nmi_probe(), PMU is not
> ready until device_initcall(armv8_pmu_driver_init).  And it is deeply
> integrated with the driver model and cpuhp. Hence it is hard to push the
> initialization of armv8_pmu_driver_init() before smp_init().

> But it is easy to take an opposite approach by enabling watchdog_hld to
> get the capability of PMU async. 
> The async model is achieved by expanding watchdog_nmi_probe() with
> -EBUSY, and a re-initializing work_struct which waits on a
> wait_queue_head.

[1] https://lore.kernel.org/lkml/20211014024155.15253-1-kernelfans@gmail.com/
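
In outline, the async model quoted above works as follows (a condensed,
illustrative sketch; patch 4 of this series carries the real code):

```c
/* Condensed sketch of the async model (see patch 4 for the real code). */
static void __init lockup_detector_delay_init(struct work_struct *work)
{
	/* Sleep until the arch PMU init code wakes us up... */
	wait_event(hld_detector_wait, !lockup_detector_pending_init);

	/* ...then retry the probe that returned -EBUSY during early boot. */
	if (!watchdog_nmi_probe()) {
		nmi_watchdog_available = true;
		lockup_detector_setup();
	}
}

void __init lockup_detector_init(void)
{
	int ret = watchdog_nmi_probe();

	if (!ret)
		nmi_watchdog_available = true;
	else if (ret == -EBUSY) {
		/* PMU not ready yet: defer the probe to a workqueue. */
		lockup_detector_pending_init = true;
		queue_work_on(smp_processor_id(), system_wq, &detector_work);
	}
	lockup_detector_setup();
}
```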

v2:
  1. Tweak commit messages in patches 01/02/04/05
  2. Remove verbose WARN in patch 04 within the watchdog core.
  3. Change from the three-state variable detector_delay_init_state to
     the two-state variable lockup_detector_pending_init

     Thanks to Petr Mladek <pmladek@suse.com> for the idea.
     > 1.  lockup_detector_work() called before lockup_detector_check().
     >     In this case, wait_event() will wait until lockup_detector_check()
     >     clears detector_delay_pending_init and calls wake_up().

     > 2. lockup_detector_check() called before lockup_detector_work().
     >    In this case, wait_event() will immediately continue because
     >    it will see cleared detector_delay_pending_init.
  4. Add comments in the code in patches 04/05 for the two-state variable change.


Lecopzer Chen (4):
  kernel/watchdog: remove WATCHDOG_DEFAULT
  kernel/watchdog: change watchdog_nmi_enable() to void
  kernel/watchdog: Adapt the watchdog_hld interface for async model
  arm64: Enable perf events based hard lockup detector

Pingfan Liu (1):
  kernel/watchdog_hld: Ensure CPU-bound context when creating hardlockup
    detector event

 arch/arm64/Kconfig             |  2 +
 arch/arm64/kernel/Makefile     |  1 +
 arch/arm64/kernel/perf_event.c | 12 +++++-
 arch/sparc/kernel/nmi.c        |  8 ++--
 drivers/perf/arm_pmu.c         |  5 +++
 include/linux/nmi.h            |  5 ++-
 include/linux/perf/arm_pmu.h   |  2 +
 kernel/watchdog.c              | 67 +++++++++++++++++++++++++++++++---
 kernel/watchdog_hld.c          |  8 +++-
 9 files changed, 95 insertions(+), 15 deletions(-)

-- 
2.25.1


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* [PATCH v2 1/5] kernel/watchdog: remove WATCHDOG_DEFAULT
  2022-03-07 15:47 [PATCH v2 0/5] Support hld based on Pseudo-NMI for arm64 Lecopzer Chen
@ 2022-03-07 15:47 ` Lecopzer Chen
  2022-03-07 15:47 ` [PATCH v2 2/5] kernel/watchdog: change watchdog_nmi_enable() to void Lecopzer Chen
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Lecopzer Chen @ 2022-03-07 15:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: lecopzer.chen, acme, akpm, alexander.shishkin, catalin.marinas,
	davem, jolsa, jthierry, keescook, kernelfans, linux-arm-kernel,
	linux-mediatek, linux-perf-users, mark.rutland, masahiroy,
	matthias.bgg, maz, mcgrof, mingo, namhyung, nixiaoming, peterz,
	pmladek, sparclinux, sumit.garg, wangqing, will, yj.chiang

There is no reference to WATCHDOG_DEFAULT, so remove it.

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
---
 kernel/watchdog.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 99afb88d2e85..db451e943a04 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -30,10 +30,8 @@
 static DEFINE_MUTEX(watchdog_mutex);
 
 #if defined(CONFIG_HARDLOCKUP_DETECTOR) || defined(CONFIG_HAVE_NMI_WATCHDOG)
-# define WATCHDOG_DEFAULT	(SOFT_WATCHDOG_ENABLED | NMI_WATCHDOG_ENABLED)
 # define NMI_WATCHDOG_DEFAULT	1
 #else
-# define WATCHDOG_DEFAULT	(SOFT_WATCHDOG_ENABLED)
 # define NMI_WATCHDOG_DEFAULT	0
 #endif
 
-- 
2.25.1




* [PATCH v2 2/5] kernel/watchdog: change watchdog_nmi_enable() to void
  2022-03-07 15:47 [PATCH v2 0/5] Support hld based on Pseudo-NMI for arm64 Lecopzer Chen
  2022-03-07 15:47 ` [PATCH v2 1/5] kernel/watchdog: remove WATCHDOG_DEFAULT Lecopzer Chen
@ 2022-03-07 15:47 ` Lecopzer Chen
  2022-03-07 15:47 ` [PATCH v2 3/5] kernel/watchdog_hld: Ensure CPU-bound context when creating hardlockup detector event Lecopzer Chen
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 10+ messages in thread
From: Lecopzer Chen @ 2022-03-07 15:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: lecopzer.chen, acme, akpm, alexander.shishkin, catalin.marinas,
	davem, jolsa, jthierry, keescook, kernelfans, linux-arm-kernel,
	linux-mediatek, linux-perf-users, mark.rutland, masahiroy,
	matthias.bgg, maz, mcgrof, mingo, namhyung, nixiaoming, peterz,
	pmladek, sparclinux, sumit.garg, wangqing, will, yj.chiang

Nobody cares about the return value of watchdog_nmi_enable(),
so change its prototype to void.

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
---
 arch/sparc/kernel/nmi.c | 8 +++-----
 include/linux/nmi.h     | 2 +-
 kernel/watchdog.c       | 3 +--
 3 files changed, 5 insertions(+), 8 deletions(-)

diff --git a/arch/sparc/kernel/nmi.c b/arch/sparc/kernel/nmi.c
index 060fff95a305..5dcf31f7e81f 100644
--- a/arch/sparc/kernel/nmi.c
+++ b/arch/sparc/kernel/nmi.c
@@ -282,11 +282,11 @@ __setup("nmi_watchdog=", setup_nmi_watchdog);
  * sparc specific NMI watchdog enable function.
  * Enables watchdog if it is not enabled already.
  */
-int watchdog_nmi_enable(unsigned int cpu)
+void watchdog_nmi_enable(unsigned int cpu)
 {
 	if (atomic_read(&nmi_active) == -1) {
 		pr_warn("NMI watchdog cannot be enabled or disabled\n");
-		return -1;
+		return;
 	}
 
 	/*
@@ -295,11 +295,9 @@ int watchdog_nmi_enable(unsigned int cpu)
 	 * process first.
 	 */
 	if (!nmi_init_done)
-		return 0;
+		return;
 
 	smp_call_function_single(cpu, start_nmi_watchdog, NULL, 1);
-
-	return 0;
 }
 /*
  * sparc specific NMI watchdog disable function.
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 750c7f395ca9..b7bcd63c36b4 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -119,7 +119,7 @@ static inline int hardlockup_detector_perf_init(void) { return 0; }
 void watchdog_nmi_stop(void);
 void watchdog_nmi_start(void);
 int watchdog_nmi_probe(void);
-int watchdog_nmi_enable(unsigned int cpu);
+void watchdog_nmi_enable(unsigned int cpu);
 void watchdog_nmi_disable(unsigned int cpu);
 
 /**
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index db451e943a04..b71d434cf648 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -93,10 +93,9 @@ __setup("nmi_watchdog=", hardlockup_panic_setup);
  * softlockup watchdog start and stop. The arch must select the
  * SOFTLOCKUP_DETECTOR Kconfig.
  */
-int __weak watchdog_nmi_enable(unsigned int cpu)
+void __weak watchdog_nmi_enable(unsigned int cpu)
 {
 	hardlockup_detector_perf_enable();
-	return 0;
 }
 
 void __weak watchdog_nmi_disable(unsigned int cpu)
-- 
2.25.1




* [PATCH v2 3/5] kernel/watchdog_hld: Ensure CPU-bound context when creating hardlockup detector event
  2022-03-07 15:47 [PATCH v2 0/5] Support hld based on Pseudo-NMI for arm64 Lecopzer Chen
  2022-03-07 15:47 ` [PATCH v2 1/5] kernel/watchdog: remove WATCHDOG_DEFAULT Lecopzer Chen
  2022-03-07 15:47 ` [PATCH v2 2/5] kernel/watchdog: change watchdog_nmi_enable() to void Lecopzer Chen
@ 2022-03-07 15:47 ` Lecopzer Chen
  2022-03-07 15:47 ` [PATCH v2 4/5] kernel/watchdog: Adapt the watchdog_hld interface for async model Lecopzer Chen
  2022-03-07 15:47 ` [PATCH v2 5/5] arm64: Enable perf events based hard lockup detector Lecopzer Chen
  4 siblings, 0 replies; 10+ messages in thread
From: Lecopzer Chen @ 2022-03-07 15:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: lecopzer.chen, acme, akpm, alexander.shishkin, catalin.marinas,
	davem, jolsa, jthierry, keescook, kernelfans, linux-arm-kernel,
	linux-mediatek, linux-perf-users, mark.rutland, masahiroy,
	matthias.bgg, maz, mcgrof, mingo, namhyung, nixiaoming, peterz,
	pmladek, sparclinux, sumit.garg, wangqing, will, yj.chiang

From: Pingfan Liu <kernelfans@gmail.com>

hardlockup_detector_event_create() should create the perf_event on the
current CPU. Preemption cannot be disabled because
perf_event_create_kernel_counter() allocates memory. Instead,
CPU locality is achieved by running the code in a per-CPU-bound
kthread.

Add a check to prevent mistakes when calling the code in another
code path.

Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Co-developed-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Reviewed-by: Petr Mladek <pmladek@suse.com>
---
 kernel/watchdog_hld.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/kernel/watchdog_hld.c b/kernel/watchdog_hld.c
index 247bf0b1582c..96b717205952 100644
--- a/kernel/watchdog_hld.c
+++ b/kernel/watchdog_hld.c
@@ -165,10 +165,16 @@ static void watchdog_overflow_callback(struct perf_event *event,
 
 static int hardlockup_detector_event_create(void)
 {
-	unsigned int cpu = smp_processor_id();
+	unsigned int cpu;
 	struct perf_event_attr *wd_attr;
 	struct perf_event *evt;
 
+	/*
+	 * Preemption is not disabled because memory will be allocated.
+	 * Ensure CPU-locality by calling this in per-CPU kthread.
+	 */
+	WARN_ON(!is_percpu_thread());
+	cpu = raw_smp_processor_id();
 	wd_attr = &wd_hw_attr;
 	wd_attr->sample_period = hw_nmi_get_sample_period(watchdog_thresh);
 
-- 
2.25.1




* [PATCH v2 4/5] kernel/watchdog: Adapt the watchdog_hld interface for async model
  2022-03-07 15:47 [PATCH v2 0/5] Support hld based on Pseudo-NMI for arm64 Lecopzer Chen
                   ` (2 preceding siblings ...)
  2022-03-07 15:47 ` [PATCH v2 3/5] kernel/watchdog_hld: Ensure CPU-bound context when creating hardlockup detector event Lecopzer Chen
@ 2022-03-07 15:47 ` Lecopzer Chen
  2022-03-18 10:40   ` Petr Mladek
  2022-03-07 15:47 ` [PATCH v2 5/5] arm64: Enable perf events based hard lockup detector Lecopzer Chen
  4 siblings, 1 reply; 10+ messages in thread
From: Lecopzer Chen @ 2022-03-07 15:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: lecopzer.chen, acme, akpm, alexander.shishkin, catalin.marinas,
	davem, jolsa, jthierry, keescook, kernelfans, linux-arm-kernel,
	linux-mediatek, linux-perf-users, mark.rutland, masahiroy,
	matthias.bgg, maz, mcgrof, mingo, namhyung, nixiaoming, peterz,
	pmladek, sparclinux, sumit.garg, wangqing, will, yj.chiang

When lockup_detector_init()->watchdog_nmi_probe() runs, the PMU may not
be ready yet. E.g. on arm64, the PMU is not ready until
device_initcall(armv8_pmu_driver_init).  And it is deeply integrated
with the driver model and cpuhp. Hence it is hard to push this
initialization before smp_init().

But it is easy to take an opposite approach by enabling watchdog_hld to
get the capability of PMU async.

The async model is achieved by expanding watchdog_nmi_probe() with
-EBUSY, and a re-initializing work_struct which waits on a wait_queue_head.

Co-developed-by: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
Suggested-by: Petr Mladek <pmladek@suse.com>
---
 include/linux/nmi.h |  3 +++
 kernel/watchdog.c   | 62 +++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 63 insertions(+), 2 deletions(-)

diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index b7bcd63c36b4..cc7df31be9db 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -118,6 +118,9 @@ static inline int hardlockup_detector_perf_init(void) { return 0; }
 
 void watchdog_nmi_stop(void);
 void watchdog_nmi_start(void);
+
+extern bool lockup_detector_pending_init;
+extern struct wait_queue_head hld_detector_wait;
 int watchdog_nmi_probe(void);
 void watchdog_nmi_enable(unsigned int cpu);
 void watchdog_nmi_disable(unsigned int cpu);
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index b71d434cf648..49bdcaf5bd8f 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -103,7 +103,11 @@ void __weak watchdog_nmi_disable(unsigned int cpu)
 	hardlockup_detector_perf_disable();
 }
 
-/* Return 0, if a NMI watchdog is available. Error code otherwise */
+/*
+ * Arch specific API. Return 0, if a NMI watchdog is available. -EBUSY if not
+ * ready, and arch code should wake up hld_detector_wait when ready. Other
+ * negative value if not support.
+ */
 int __weak __init watchdog_nmi_probe(void)
 {
 	return hardlockup_detector_perf_init();
@@ -839,16 +843,70 @@ static void __init watchdog_sysctl_init(void)
 #define watchdog_sysctl_init() do { } while (0)
 #endif /* CONFIG_SYSCTL */
 
+static void lockup_detector_delay_init(struct work_struct *work);
+bool lockup_detector_pending_init __initdata;
+
+struct wait_queue_head hld_detector_wait __initdata =
+		__WAIT_QUEUE_HEAD_INITIALIZER(hld_detector_wait);
+
+static struct work_struct detector_work __initdata =
+		__WORK_INITIALIZER(detector_work, lockup_detector_delay_init);
+
+static void __init lockup_detector_delay_init(struct work_struct *work)
+{
+	int ret;
+
+	wait_event(hld_detector_wait,
+			lockup_detector_pending_init == false);
+
+	/*
+	 * Here, we know the PMU should be ready, so set pending to true to
+	 * inform watchdog_nmi_probe() that it shouldn't return -EBUSY again.
+	 */
+	lockup_detector_pending_init = true;
+	ret = watchdog_nmi_probe();
+	if (ret) {
+		pr_info("Delayed init of the lockup detector failed: %d\n", ret);
+		pr_info("Perf NMI watchdog permanently disabled\n");
+		return;
+	}
+
+	nmi_watchdog_available = true;
+	lockup_detector_setup();
+	lockup_detector_pending_init = false;
+}
+
+/* Ensure the check is called after the initialization of PMU driver */
+static int __init lockup_detector_check(void)
+{
+	if (!lockup_detector_pending_init)
+		return 0;
+
+	pr_info("Delayed init checking failed, retry for once.\n");
+	lockup_detector_pending_init = false;
+	wake_up(&hld_detector_wait);
+	return 0;
+}
+late_initcall_sync(lockup_detector_check);
+
 void __init lockup_detector_init(void)
 {
+	int ret;
+
 	if (tick_nohz_full_enabled())
 		pr_info("Disabling watchdog on nohz_full cores by default\n");
 
 	cpumask_copy(&watchdog_cpumask,
 		     housekeeping_cpumask(HK_FLAG_TIMER));
 
-	if (!watchdog_nmi_probe())
+	ret = watchdog_nmi_probe();
+	if (!ret)
 		nmi_watchdog_available = true;
+	else if (ret == -EBUSY) {
+		lockup_detector_pending_init = true;
+		queue_work_on(smp_processor_id(), system_wq, &detector_work);
+	}
+
 	lockup_detector_setup();
 	watchdog_sysctl_init();
 }
-- 
2.25.1




* [PATCH v2 5/5] arm64: Enable perf events based hard lockup detector
  2022-03-07 15:47 [PATCH v2 0/5] Support hld based on Pseudo-NMI for arm64 Lecopzer Chen
                   ` (3 preceding siblings ...)
  2022-03-07 15:47 ` [PATCH v2 4/5] kernel/watchdog: Adapt the watchdog_hld interface for async model Lecopzer Chen
@ 2022-03-07 15:47 ` Lecopzer Chen
  4 siblings, 0 replies; 10+ messages in thread
From: Lecopzer Chen @ 2022-03-07 15:47 UTC (permalink / raw)
  To: linux-kernel
  Cc: lecopzer.chen, acme, akpm, alexander.shishkin, catalin.marinas,
	davem, jolsa, jthierry, keescook, kernelfans, linux-arm-kernel,
	linux-mediatek, linux-perf-users, mark.rutland, masahiroy,
	matthias.bgg, maz, mcgrof, mingo, namhyung, nixiaoming, peterz,
	pmladek, sparclinux, sumit.garg, wangqing, will, yj.chiang

With the recent feature that enables perf events to use pseudo NMIs
as interrupts on platforms which support GICv3 or later, it is now
possible to enable the hard lockup detector (or NMI watchdog) on arm64
platforms. So enable the corresponding support.

One thing to note here is that normally the lockup detector is initialized
just after the early initcalls, but the PMU on arm64 comes up much later,
as a device_initcall(). To cope with that, override watchdog_nmi_probe() to
let the watchdog framework know the PMU is not ready, and inform the
framework to re-initialize lockup detection once the PMU has been
initialized.

[1]: http://lore.kernel.org/linux-arm-kernel/1610712101-14929-1-git-send-email-sumit.garg@linaro.org

Co-developed-by: Sumit Garg <sumit.garg@linaro.org>
Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Co-developed-by: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Pingfan Liu <kernelfans@gmail.com>
Signed-off-by: Lecopzer Chen <lecopzer.chen@mediatek.com>
---
 arch/arm64/Kconfig             |  2 ++
 arch/arm64/kernel/Makefile     |  1 +
 arch/arm64/kernel/perf_event.c | 12 ++++++++++--
 drivers/perf/arm_pmu.c         |  5 +++++
 include/linux/perf/arm_pmu.h   |  2 ++
 5 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 09b885cc4db5..df6fed8327ba 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -190,6 +190,8 @@ config ARM64
 	select HAVE_NMI
 	select HAVE_PATA_PLATFORM
 	select HAVE_PERF_EVENTS
+	select HAVE_PERF_EVENTS_NMI if ARM64_PSEUDO_NMI
+	select HAVE_HARDLOCKUP_DETECTOR_PERF if PERF_EVENTS && HAVE_PERF_EVENTS_NMI
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
 	select HAVE_REGS_AND_STACK_ACCESS_API
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 88b3e2a21408..3e62a8877ed7 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -46,6 +46,7 @@ obj-$(CONFIG_MODULES)			+= module.o
 obj-$(CONFIG_ARM64_MODULE_PLTS)		+= module-plts.o
 obj-$(CONFIG_PERF_EVENTS)		+= perf_regs.o perf_callchain.o
 obj-$(CONFIG_HW_PERF_EVENTS)		+= perf_event.o
+obj-$(CONFIG_HARDLOCKUP_DETECTOR_PERF)	+= watchdog_hld.o
 obj-$(CONFIG_HAVE_HW_BREAKPOINT)	+= hw_breakpoint.o
 obj-$(CONFIG_CPU_PM)			+= sleep.o suspend.o
 obj-$(CONFIG_CPU_IDLE)			+= cpuidle.o
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index cab678ed6618..77eaefee13ea 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -23,6 +23,7 @@
 #include <linux/platform_device.h>
 #include <linux/sched_clock.h>
 #include <linux/smp.h>
+#include <linux/nmi.h>
 
 /* ARMv8 Cortex-A53 specific event types. */
 #define ARMV8_A53_PERFCTR_PREF_LINEFILL				0xC2
@@ -1380,10 +1381,17 @@ static struct platform_driver armv8_pmu_driver = {
 
 static int __init armv8_pmu_driver_init(void)
 {
+	int ret;
+
 	if (acpi_disabled)
-		return platform_driver_register(&armv8_pmu_driver);
+		ret = platform_driver_register(&armv8_pmu_driver);
 	else
-		return arm_pmu_acpi_probe(armv8_pmuv3_pmu_init);
+		ret = arm_pmu_acpi_probe(armv8_pmuv3_pmu_init);
+
+	/* Inform watchdog core we are ready to probe hld by delayed init. */
+	lockup_detector_pending_init = false;
+	wake_up(&hld_detector_wait);
+	return ret;
 }
 device_initcall(armv8_pmu_driver_init)
 
diff --git a/drivers/perf/arm_pmu.c b/drivers/perf/arm_pmu.c
index 295cc7952d0e..e77f4897fca2 100644
--- a/drivers/perf/arm_pmu.c
+++ b/drivers/perf/arm_pmu.c
@@ -697,6 +697,11 @@ static int armpmu_get_cpu_irq(struct arm_pmu *pmu, int cpu)
 	return per_cpu(hw_events->irq, cpu);
 }
 
+bool arm_pmu_irq_is_nmi(void)
+{
+	return has_nmi;
+}
+
 /*
  * PMU hardware loses all context when a CPU goes offline.
  * When a CPU is hotplugged back in, since some hardware registers are
diff --git a/include/linux/perf/arm_pmu.h b/include/linux/perf/arm_pmu.h
index 2512e2f9cd4e..9325d01adc3e 100644
--- a/include/linux/perf/arm_pmu.h
+++ b/include/linux/perf/arm_pmu.h
@@ -169,6 +169,8 @@ void kvm_host_pmu_init(struct arm_pmu *pmu);
 #define kvm_host_pmu_init(x)	do { } while(0)
 #endif
 
+bool arm_pmu_irq_is_nmi(void);
+
 /* Internal functions only for core arm_pmu code */
 struct arm_pmu *armpmu_alloc(void);
 struct arm_pmu *armpmu_alloc_atomic(void);
-- 
2.25.1
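
The arm64 watchdog_hld.c that the Makefile hunk above adds is not shown in
this message. Purely as an illustration (not the actual patch; pmu_ready()
is a hypothetical readiness check, not a real kernel API), the probe
override it implies might look like:

```c
/* Hypothetical sketch only: the real arch/arm64/kernel/watchdog_hld.c is
 * not part of this message. pmu_ready() stands in for however the arch
 * code tracks whether armv8_pmu_driver_init() has completed. */
int __init watchdog_nmi_probe(void)
{
	/* PMU driver not probed yet: ask the watchdog core to defer. */
	if (!pmu_ready())
		return -EBUSY;

	/* The PMU irq must be a (pseudo-)NMI for a hard lockup detector. */
	if (!arm_pmu_irq_is_nmi())
		return -ENODEV;

	return hardlockup_detector_perf_init();
}
```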




* Re: [PATCH v2 4/5] kernel/watchdog: Adapt the watchdog_hld interface for async model
  2022-03-07 15:47 ` [PATCH v2 4/5] kernel/watchdog: Adapt the watchdog_hld interface for async model Lecopzer Chen
@ 2022-03-18 10:40   ` Petr Mladek
  2022-03-19  8:18     ` Lecopzer Chen
  0 siblings, 1 reply; 10+ messages in thread
From: Petr Mladek @ 2022-03-18 10:40 UTC (permalink / raw)
  To: Lecopzer Chen
  Cc: linux-kernel, acme, akpm, alexander.shishkin, catalin.marinas,
	davem, jolsa, jthierry, keescook, kernelfans, linux-arm-kernel,
	linux-mediatek, linux-perf-users, mark.rutland, masahiroy,
	matthias.bgg, maz, mcgrof, mingo, namhyung, nixiaoming, peterz,
	sparclinux, sumit.garg, wangqing, will, yj.chiang

On Mon 2022-03-07 23:47:28, Lecopzer Chen wrote:
> When lockup_detector_init()->watchdog_nmi_probe(), PMU may be not ready
> yet. E.g. on arm64, PMU is not ready until
> device_initcall(armv8_pmu_driver_init).  And it is deeply integrated
> with the driver model and cpuhp. Hence it is hard to push this
> initialization before smp_init().

The above is clear.

> But it is easy to take an opposite approach by enabling watchdog_hld to
> get the capability of PMU async.
> 
> The async model is achieved by expanding watchdog_nmi_probe() with
> -EBUSY, and a re-initializing work_struct which waits on a wait_queue_head.

These two paragraphs are a bit confusing to me. It might be just a
problem with translation. I am not a native speaker. Anyway, I wonder
if the following is more clear:

<proposal>
But it is easy to take an opposite approach and try to initialize
the watchdog once again later.

The delayed probe is called using workqueues. It needs to allocate
memory and must proceed in a normal context.

The delayed probe is queued only when the early one returns -EBUSY.
It is the return code returned when PMU is not ready yet.
</proposal>

> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -103,7 +103,11 @@ void __weak watchdog_nmi_disable(unsigned int cpu)
>  	hardlockup_detector_perf_disable();
>  }
>  
> -/* Return 0, if a NMI watchdog is available. Error code otherwise */
> +/*
> + * Arch specific API. Return 0, if a NMI watchdog is available. -EBUSY if not
> + * ready, and arch code should wake up hld_detector_wait when ready. Other
> + * negative value if not support.
> + */

I wonder if the following is slightly more clear:

 /*
 * Arch specific API.
 *
 * Return 0 when NMI watchdog is available, negative value otherwise.
 * The error code -EBUSY is special. It means that a deferred probe
 * might succeed later.
 */

>  int __weak __init watchdog_nmi_probe(void)
>  {
>  	return hardlockup_detector_perf_init();
> @@ -839,16 +843,70 @@ static void __init watchdog_sysctl_init(void)
>  #define watchdog_sysctl_init() do { } while (0)
>  #endif /* CONFIG_SYSCTL */
>  
> +static void lockup_detector_delay_init(struct work_struct *work);
> +bool lockup_detector_pending_init __initdata;
> +
> +struct wait_queue_head hld_detector_wait __initdata =
> +		__WAIT_QUEUE_HEAD_INITIALIZER(hld_detector_wait);
> +
> +static struct work_struct detector_work __initdata =
> +		__WORK_INITIALIZER(detector_work, lockup_detector_delay_init);
> +
> +static void __init lockup_detector_delay_init(struct work_struct *work)
> +{
> +	int ret;
> +
> +	wait_event(hld_detector_wait,
> +			lockup_detector_pending_init == false);
> +
> +	/*
> +	 * Here, we know the PMU should be ready, so set pending to true to
> +	 * inform watchdog_nmi_probe() that it shouldn't return -EBUSY again.
> +	 */
> +	lockup_detector_pending_init = true;

This does not make sense to me. We are here only when:

   1. lockup_detector_init() queued this work.

   2. Someone cleared @lockup_detector_pending_init and woke the
      worker via wait_queue. It might be either PMU init code
      or the late lockup_detector_check().

watchdog_nmi_probe() might still return -EBUSY when PMU init failed.

If you wanted to try the delayed probe once again (3rd attempt) from
lockup_detector_check(), you would need to queue the work once again.
But you need to be sure that lockup_detector_check() was not called
yet. Otherwise, the 2nd work might wait forever.

IMHO, it is not worth the complexity.

> +	ret = watchdog_nmi_probe();
> +	if (ret) {
> +		pr_info("Delayed init of the lockup detector failed: %d\n", ret);
> +		pr_info("Perf NMI watchdog permanently disabled\n");
> +		return;
> +	}
> +
> +	nmi_watchdog_available = true;
> +	lockup_detector_setup();
> +	lockup_detector_pending_init = false;
> +}

Otherwise, it looks good to me.

Best Regards,
Petr



* Re: [PATCH v2 4/5] kernel/watchdog: Adapt the watchdog_hld interface for async model
  2022-03-18 10:40   ` Petr Mladek
@ 2022-03-19  8:18     ` Lecopzer Chen
  2022-03-21 17:37       ` Petr Mladek
  0 siblings, 1 reply; 10+ messages in thread
From: Lecopzer Chen @ 2022-03-19  8:18 UTC (permalink / raw)
  To: pmladek
  Cc: acme, akpm, alexander.shishkin, catalin.marinas, davem, jolsa,
	jthierry, keescook, kernelfans, lecopzer.chen, linux-arm-kernel,
	linux-kernel, linux-mediatek, linux-perf-users, mark.rutland,
	masahiroy, matthias.bgg, maz, mcgrof, mingo, namhyung,
	nixiaoming, peterz, sparclinux, sumit.garg, wangqing, will,
	yj.chiang

> On Mon 2022-03-07 23:47:28, Lecopzer Chen wrote:
> > When lockup_detector_init()->watchdog_nmi_probe(), PMU may be not ready
> > yet. E.g. on arm64, PMU is not ready until
> > device_initcall(armv8_pmu_driver_init).  And it is deeply integrated
> > with the driver model and cpuhp. Hence it is hard to push this
> > initialization before smp_init().
> 
> The above is clear.
> 
> > But it is easy to take an opposite approach by enabling watchdog_hld to
> > get the capability of PMU async.
> > 
> > The async model is achieved by expanding watchdog_nmi_probe() with
> > -EBUSY, and a re-initializing work_struct which waits on a wait_queue_head.
> 
> These two paragraphs are a bit confusing to me. It might be just a
> problem with translation. I am not a native speaker. Anyway, I wonder
> if the following is more clear:
> 
> <proposal>
> But it is easy to take an opposite approach and try to initialize
> the watchdog once again later.
> 
> The delayed probe is called using workqueues. It need to allocate
> memory and must be proceed in a normal context.
> 
> The delayed probe is queued only when the early one returns -EBUSY.
> It is the return code returned when PMU is not ready yet.
> </proposal>

Of course, the original description only briefly described the functionality.
So I think it makes sense to explain it if anyone thinks it's unclear.
Also, I'm not a native speaker either, but I don't feel anything weird in the
description.

I'll use the description you provided, thanks.


> 
> > --- a/kernel/watchdog.c
> > +++ b/kernel/watchdog.c
> > @@ -103,7 +103,11 @@ void __weak watchdog_nmi_disable(unsigned int cpu)
> >  	hardlockup_detector_perf_disable();
> >  }
> >  
> > -/* Return 0, if a NMI watchdog is available. Error code otherwise */
> > +/*
> > + * Arch specific API. Return 0, if a NMI watchdog is available. -EBUSY if not
> > + * ready, and arch code should wake up hld_detector_wait when ready. Other
> > + * negative value if not support.
> > + */
> 
> I wonder if the following is slightly more clear:
> 
>  /*
>  * Arch specific API.
>  *
>  * Return 0 when NMI watchdog is available, negative value otherwise.
>  * The error code -EBUSY is special. It means that a deferred probe
>  * might succeed later.
>  */
> 

Yes, this should be clearer.
Abstracting `hld_detector_wait` as a `deferred probe` is a good idea.

Thanks, I'll take this.

> >  int __weak __init watchdog_nmi_probe(void)
> >  {
> >  	return hardlockup_detector_perf_init();
> > @@ -839,16 +843,70 @@ static void __init watchdog_sysctl_init(void)
> >  #define watchdog_sysctl_init() do { } while (0)
> >  #endif /* CONFIG_SYSCTL */
> >  
> > +static void lockup_detector_delay_init(struct work_struct *work);
> > +bool lockup_detector_pending_init __initdata;
> > +
> > +struct wait_queue_head hld_detector_wait __initdata =
> > +		__WAIT_QUEUE_HEAD_INITIALIZER(hld_detector_wait);
> > +
> > +static struct work_struct detector_work __initdata =
> > +		__WORK_INITIALIZER(detector_work, lockup_detector_delay_init);
> > +
> > +static void __init lockup_detector_delay_init(struct work_struct *work)
> > +{
> > +	int ret;
> > +
> > +	wait_event(hld_detector_wait,
> > +			lockup_detector_pending_init == false);
> > +
> > +	/*
> > +	 * Here, we know the PMU should be ready, so set pending to true to
> > +	 * inform watchdog_nmi_probe() that it shouldn't return -EBUSY again.
> > +	 */
> > +	lockup_detector_pending_init = true;
> 
> This does not make sense to me. We are here only when:
> 
>    1. lockup_detector_init() queued this work.
> 
>    2. Someone cleared @lockup_detector_pending_init and woke the
>       worker via wait_queue. IT might be either PMU init code
>       or the late lockup_detector_check().
> 
> watchdog_nmi_probe() might still return -EBUSY when PMU init failed.
> 
> If you wanted to try the delayed probe once again (3rd attempt) from
> lockup_detector_check(), you would need to queue the work once again.
> But you need to be sure that lockup_detector_check() was not called
> yet. Otherwise, the 2nd work might wait forewer.
> 
> IMHO, it is not worth the complexity.

The original assumption is: nobody should use the delayed probe after
lockup_detector_check() (which has the __init attribute).

That is, everything, including the PMU and the delayed probe of the lockup
detector, must finish before do_initcalls(), which means the delayed probe
can't support external PMU module init.

Also,
  1. lockup_detector_check() is registered with late_initcall_sync(), so it'd
     be called last among the initcalls.

  2. watchdog_nmi_probe() and all the delayed-init-related functions and
     variables have the __init attribute; nobody should ever use them after
     the __init section is freed.

The only problematic case is when the PMU probe function is also late_initcall_sync().


How about this one:
  1. Wrap the wake-up code to reduce the complexity on the caller side.

  2. Remove the wait queue.
     Instead of queueing the work in lockup_detector_init(), queue the
     delayed probe work when the arch PMU code finishes probing.

and the flow turns to

  1. lockup_detector_init() gets -EBUSY and sets lockup_detector_pending_init=true.

  2. The arch PMU code finishes init and calls lockup_detector_queue_work().

  3. lockup_detector_queue_work() queues the work only when
     lockup_detector_pending_init=true, which means nobody should call
     it before lockup_detector_init().

  4. The work function lockup_detector_delay_init() runs without a wait event.
     If the probe succeeds, it sets lockup_detector_pending_init=false.

  5. At late_initcall_sync() time, lockup_detector_check() calls flush_work()
     first, in case work queued by a previous lockup_detector_queue_work()
     has not been scheduled yet. It then tests lockup_detector_pending_init:
     if it is still true, the delayed init is unfinished, so it forcibly
     queues the work again and calls flush_work() to make sure the __init
     section won't be freed before the work is done.
 
This removes the complexity of the wait event we discussed.
A draft of the diff (against this series) is shown below.


diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
index 77eaefee13ea..c776618fbfa8 100644
--- a/arch/arm64/kernel/perf_event.c
+++ b/arch/arm64/kernel/perf_event.c
@@ -1388,9 +1388,7 @@ static int __init armv8_pmu_driver_init(void)
 	else
 		ret = arm_pmu_acpi_probe(armv8_pmuv3_pmu_init);
 
-	/* Inform watchdog core we are ready to probe hld by delayed init. */
-	lockup_detector_pending_init = false;
-	wake_up(&hld_detector_wait);
+	lockup_detector_queue_work();
 	return ret;
 }
 device_initcall(armv8_pmu_driver_init)
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index cc7df31be9db..98060a86fac6 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -120,7 +120,7 @@ void watchdog_nmi_stop(void);
 void watchdog_nmi_start(void);
 
 extern bool lockup_detector_pending_init;
-extern struct wait_queue_head hld_detector_wait;
+void lockup_detector_queue_work(void);
 int watchdog_nmi_probe(void);
 void watchdog_nmi_enable(unsigned int cpu);
 void watchdog_nmi_disable(unsigned int cpu);
diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index 49bdcaf5bd8f..acaa9f3ac162 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -846,9 +846,6 @@ static void __init watchdog_sysctl_init(void)
 static void lockup_detector_delay_init(struct work_struct *work);
 bool lockup_detector_pending_init __initdata;
 
-struct wait_queue_head hld_detector_wait __initdata =
-		__WAIT_QUEUE_HEAD_INITIALIZER(hld_detector_wait);
-
 static struct work_struct detector_work __initdata =
 		__WORK_INITIALIZER(detector_work, lockup_detector_delay_init);
 
@@ -856,14 +853,6 @@ static void __init lockup_detector_delay_init(struct work_struct *work)
 {
 	int ret;
 
-	wait_event(hld_detector_wait,
-			lockup_detector_pending_init == false);
-
-	/*
-	 * Here, we know the PMU should be ready, so set pending to true to
-	 * inform watchdog_nmi_probe() that it shouldn't return -EBUSY again.
-	 */
-	lockup_detector_pending_init = true;
 	ret = watchdog_nmi_probe();
 	if (ret) {
 		pr_info("Delayed init of the lockup detector failed: %d\n", ret);
@@ -876,15 +865,27 @@ static void __init lockup_detector_delay_init(struct work_struct *work)
 	lockup_detector_pending_init = false;
 }
 
+/* Must be called after lockup_detector_init(), when we need the delayed probe */
+void __init lockup_detector_queue_work(void)
+{
+	if (!lockup_detector_pending_init)
+		return;
+
+	queue_work_on(__smp_processor_id(), system_wq, &detector_work);
+}
+
 /* Ensure the check is called after the initialization of PMU driver */
 static int __init lockup_detector_check(void)
 {
+	/* Make sure no work is pending. */
+	flush_work(&detector_work);
+
 	if (!lockup_detector_pending_init)
 		return 0;
 
 	pr_info("Delayed init checking failed, retry for once.\n");
-	lockup_detector_pending_init = false;
-	wake_up(&hld_detector_wait);
+	lockup_detector_queue_work();
+	flush_work(&detector_work);
 	return 0;
 }
 late_initcall_sync(lockup_detector_check);
@@ -902,10 +903,8 @@ void __init lockup_detector_init(void)
 	ret = watchdog_nmi_probe();
 	if (!ret)
 		nmi_watchdog_available = true;
-	else if (ret == -EBUSY) {
+	else if (ret == -EBUSY)
 		lockup_detector_pending_init = true;
-		queue_work_on(smp_processor_id(), system_wq, &detector_work);
-	}
 
 	lockup_detector_setup();
 	watchdog_sysctl_init();

_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [PATCH v2 4/5] kernel/watchdog: Adapt the watchdog_hld interface for async model
  2022-03-19  8:18     ` Lecopzer Chen
@ 2022-03-21 17:37       ` Petr Mladek
  2022-03-24 12:55         ` Lecopzer Chen
  0 siblings, 1 reply; 10+ messages in thread
From: Petr Mladek @ 2022-03-21 17:37 UTC (permalink / raw)
  To: Lecopzer Chen
  Cc: acme, akpm, alexander.shishkin, catalin.marinas, davem, jolsa,
	jthierry, keescook, kernelfans, linux-arm-kernel, linux-kernel,
	linux-mediatek, linux-perf-users, mark.rutland, masahiroy,
	matthias.bgg, maz, mcgrof, mingo, namhyung, nixiaoming, peterz,
	sparclinux, sumit.garg, wangqing, will, yj.chiang

On Sat 2022-03-19 16:18:22, Lecopzer Chen wrote:
> > On Mon 2022-03-07 23:47:28, Lecopzer Chen wrote:
> > > When lockup_detector_init()->watchdog_nmi_probe(), PMU may be not ready
> > > yet. E.g. on arm64, PMU is not ready until
> > > device_initcall(armv8_pmu_driver_init).  And it is deeply integrated
> > > with the driver model and cpuhp. Hence it is hard to push this
> > > initialization before smp_init().
> > 
> > > --- a/kernel/watchdog.c
> > > +++ b/kernel/watchdog.c
> > > @@ -839,16 +843,70 @@ static void __init watchdog_sysctl_init(void)
> > >  #define watchdog_sysctl_init() do { } while (0)
> > >  #endif /* CONFIG_SYSCTL */
> > >  
> > > +static void lockup_detector_delay_init(struct work_struct *work);
> > > +bool lockup_detector_pending_init __initdata;
> > > +
> > > +struct wait_queue_head hld_detector_wait __initdata =
> > > +		__WAIT_QUEUE_HEAD_INITIALIZER(hld_detector_wait);
> > > +
> > > +static struct work_struct detector_work __initdata =
> > > +		__WORK_INITIALIZER(detector_work, lockup_detector_delay_init);
> > > +
> > > +static void __init lockup_detector_delay_init(struct work_struct *work)
> > > +{
> > > +	int ret;
> > > +
> > > +	wait_event(hld_detector_wait,
> > > +			lockup_detector_pending_init == false);
> > > +
> > > +	/*
> > > +	 * Here, we know the PMU should be ready, so set pending to true to
> > > +	 * inform watchdog_nmi_probe() that it shouldn't return -EBUSY again.
> > > +	 */
> > > +	lockup_detector_pending_init = true;
> > 
> > This does not make sense to me. We are here only when:
> > 
> >    1. lockup_detector_init() queued this work.
> > 
> >    2. Someone cleared @lockup_detector_pending_init and woke the
> >       worker via wait_queue. It might be either the PMU init code
> >       or the late lockup_detector_check().
> > 
> > watchdog_nmi_probe() might still return -EBUSY when PMU init failed.
> > 
> > If you wanted to try the delayed probe once again (3rd attempt) from
> > lockup_detector_check(), you would need to queue the work once again.
> > But you need to be sure that lockup_detector_check() was not called
> > yet. Otherwise, the 2nd work might wait forever.
> > 
> > IMHO, it is not worth the complexity.
> 
> The original assumption is: nobody should use the delayed probe after
> lockup_detector_check() (which has the __init attribute).

Good point. It makes perfect sense.

But it was not mentioned anywhere. And the code did not work this way.

> 
> That is, everything including the PMU and the delayed probe of the lockup
> detector must finish before do_initcalls(), which means the delayed probe
> can't support PMU init from an external module.
> 
> Also,
>   1. lockup_detector_check() is registered with late_initcall_sync(), so it'd
>      be called last among the initcalls.
> 
>   2. watchdog_nmi_probe() and all the delayed-init-related functions and
>      variables have the __init attribute; nobody should ever use them after
>      the __init section is freed.
> 
> The only problematic case is when the PMU probe function is also late_initcall_sync().

This is the case for the PMU. But the API for delayed init is generic and
should be safe even for other users.


> How about this one:
>   1. Wrap the wake-up code to reduce the complexity on the caller side.
> 
>   2. Remove the wait queue.
>      Instead of queueing the work in lockup_detector_init(), queue the
>      delayed probe work when the arch PMU code finishes probing.
> 
> and the flow turns to
> 
>   1. lockup_detector_init() gets -EBUSY and sets lockup_detector_pending_init=true.
> 
>   2. The arch PMU code finishes init and calls lockup_detector_queue_work().
> 
>   3. lockup_detector_queue_work() queues the work only when
>      lockup_detector_pending_init=true, which means nobody should call
>      it before lockup_detector_init().
> 
>   4. The work function lockup_detector_delay_init() runs without a wait event.
>      If the probe succeeds, it sets lockup_detector_pending_init=false.
> 
>   5. At late_initcall_sync() time, lockup_detector_check() calls flush_work()
>      first, in case work queued by a previous lockup_detector_queue_work()
>      has not been scheduled yet. It then tests lockup_detector_pending_init:
>      if it is still true, the delayed init is unfinished, so it forcibly
>      queues the work again and calls flush_work() to make sure the __init
>      section won't be freed before the work is done.

Nice, I like it.

> This removes the complexity of the wait event we discussed.
> A draft of the diff (against this series) is shown below.
> 
> 
> diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> index 77eaefee13ea..c776618fbfa8 100644
> --- a/arch/arm64/kernel/perf_event.c
> +++ b/arch/arm64/kernel/perf_event.c
> @@ -1388,9 +1388,7 @@ static int __init armv8_pmu_driver_init(void)
>  	else
>  		ret = arm_pmu_acpi_probe(armv8_pmuv3_pmu_init);
>  
> -	/* Inform watchdog core we are ready to probe hld by delayed init. */
> -	lockup_detector_pending_init = false;
> -	wake_up(&hld_detector_wait);
> +	lockup_detector_queue_work();

The name is strange. The fact that it uses workqueues is an
implementation detail. I would call it
retry_lockup_detector_init() so that it is more obvious what it does.

>  	return ret;
>  }
>  device_initcall(armv8_pmu_driver_init)
> --- a/kernel/watchdog.c
> +++ b/kernel/watchdog.c
> @@ -876,15 +865,27 @@ static void __init lockup_detector_delay_init(struct work_struct *work)
>  	lockup_detector_pending_init = false;
>  }
>  
> +/* Must be called after lockup_detector_init(), when we need the delayed probe */
> +void __init lockup_detector_queue_work(void)
> +{
> +	if (!lockup_detector_pending_init)
> +		return;
> +
> +	queue_work_on(__smp_processor_id(), system_wq, &detector_work);
> +}
> +
>  /* Ensure the check is called after the initialization of PMU driver */
>  static int __init lockup_detector_check(void)
>  {
> +	/* Make sure no work is pending. */
> +	flush_work(&detector_work);
> +
>  	if (!lockup_detector_pending_init)
>  		return 0;
>  
>  	pr_info("Delayed init checking failed, retry for once.\n");
> -	lockup_detector_pending_init = false;
> -	wake_up(&hld_detector_wait);
> +	lockup_detector_queue_work();

I would do here

	lockup_detector_pending_init = false;

to make sure that lockup_detector_queue_work() will no longer
queue the work after the final flush.

Maybe, we could rename the variable to allow_lockup_detector_init_retry.

> +	flush_work(&detector_work);
>
>	return 0;
>  }
>  late_initcall_sync(lockup_detector_check);

Best Regards,
Petr


* Re: [PATCH v2 4/5] kernel/watchdog: Adapt the watchdog_hld interface for async model
  2022-03-21 17:37       ` Petr Mladek
@ 2022-03-24 12:55         ` Lecopzer Chen
  0 siblings, 0 replies; 10+ messages in thread
From: Lecopzer Chen @ 2022-03-24 12:55 UTC (permalink / raw)
  To: pmladek
  Cc: acme, akpm, alexander.shishkin, catalin.marinas, davem, jolsa,
	jthierry, keescook, kernelfans, lecopzer.chen, linux-arm-kernel,
	linux-kernel, linux-mediatek, linux-perf-users, mark.rutland,
	masahiroy, matthias.bgg, maz, mcgrof, mingo, namhyung,
	nixiaoming, peterz, sparclinux, sumit.garg, wangqing, will,
	yj.chiang

> On Sat 2022-03-19 16:18:22, Lecopzer Chen wrote:
> > > On Mon 2022-03-07 23:47:28, Lecopzer Chen wrote:
> > > > When lockup_detector_init()->watchdog_nmi_probe(), PMU may be not ready
> > > > yet. E.g. on arm64, PMU is not ready until
> > > > device_initcall(armv8_pmu_driver_init).  And it is deeply integrated
> > > > with the driver model and cpuhp. Hence it is hard to push this
> > > > initialization before smp_init().
> > > 
> > > > --- a/kernel/watchdog.c
> > > > +++ b/kernel/watchdog.c
> > > > @@ -839,16 +843,70 @@ static void __init watchdog_sysctl_init(void)
> > > >  #define watchdog_sysctl_init() do { } while (0)
> > > >  #endif /* CONFIG_SYSCTL */
> > > >  
> > > > +static void lockup_detector_delay_init(struct work_struct *work);
> > > > +bool lockup_detector_pending_init __initdata;
> > > > +
> > > > +struct wait_queue_head hld_detector_wait __initdata =
> > > > +		__WAIT_QUEUE_HEAD_INITIALIZER(hld_detector_wait);
> > > > +
> > > > +static struct work_struct detector_work __initdata =
> > > > +		__WORK_INITIALIZER(detector_work, lockup_detector_delay_init);
> > > > +
> > > > +static void __init lockup_detector_delay_init(struct work_struct *work)
> > > > +{
> > > > +	int ret;
> > > > +
> > > > +	wait_event(hld_detector_wait,
> > > > +			lockup_detector_pending_init == false);
> > > > +
> > > > +	/*
> > > > +	 * Here, we know the PMU should be ready, so set pending to true to
> > > > +	 * inform watchdog_nmi_probe() that it shouldn't return -EBUSY again.
> > > > +	 */
> > > > +	lockup_detector_pending_init = true;
> > > 
> > > This does not make sense to me. We are here only when:
> > > 
> > >    1. lockup_detector_init() queued this work.
> > > 
> > >    2. Someone cleared @lockup_detector_pending_init and woke the
> > >       worker via wait_queue. It might be either the PMU init code
> > >       or the late lockup_detector_check().
> > > 
> > > watchdog_nmi_probe() might still return -EBUSY when PMU init failed.
> > > 
> > > If you wanted to try the delayed probe once again (3rd attempt) from
> > > lockup_detector_check(), you would need to queue the work once again.
> > > But you need to be sure that lockup_detector_check() was not called
> > > yet. Otherwise, the 2nd work might wait forewer.
> > > 
> > > IMHO, it is not worth the complexity.
> > 
> > The original assumption is: nobody should use the delayed probe after
> > lockup_detector_check() (which has the __init attribute).
> 
> Good point. It makes perfect sense.
> 
> But it was not mentioned anywhere. And the code did not work this way.
> 
> > 
> > That is, everything including the PMU and the delayed probe of the lockup
> > detector must finish before do_initcalls(), which means the delayed probe
> > can't support PMU init from an external module.
> > 
> > Also,
> >   1. lockup_detector_check() is registered with late_initcall_sync(), so it'd
> >      be called last among the initcalls.
> > 
> >   2. watchdog_nmi_probe() and all the delayed-init-related functions and
> >      variables have the __init attribute; nobody should ever use them after
> >      the __init section is freed.
> > 
> > The only problematic case is when the PMU probe function is also late_initcall_sync().
> 
> This is the case for the PMU. But the API for delayed init is generic and
> should be safe even for other users.
> 

I think this can be fixed with the suggestion you provided below:
set lockup_detector_pending_init=false at the end of lockup_detector_check(),
so nobody can ever queue another work after lockup_detector_check().

> 
> > How about this one:
> >   1. Wrap the wake-up code to reduce the complexity on the caller side.
> > 
> >   2. Remove the wait queue.
> >      Instead of queueing the work in lockup_detector_init(), queue the
> >      delayed probe work when the arch PMU code finishes probing.
> > 
> > and the flow turns to
> > 
> >   1. lockup_detector_init() gets -EBUSY and sets lockup_detector_pending_init=true.
> > 
> >   2. The arch PMU code finishes init and calls lockup_detector_queue_work().
> > 
> >   3. lockup_detector_queue_work() queues the work only when
> >      lockup_detector_pending_init=true, which means nobody should call
> >      it before lockup_detector_init().
> > 
> >   4. The work function lockup_detector_delay_init() runs without a wait event.
> >      If the probe succeeds, it sets lockup_detector_pending_init=false.
> > 
> >   5. At late_initcall_sync() time, lockup_detector_check() calls flush_work()
> >      first, in case work queued by a previous lockup_detector_queue_work()
> >      has not been scheduled yet. It then tests lockup_detector_pending_init:
> >      if it is still true, the delayed init is unfinished, so it forcibly
> >      queues the work again and calls flush_work() to make sure the __init
> >      section won't be freed before the work is done.
> 
> Nice, I like it.
> 
> > This removes the complexity of the wait event we discussed.
> > A draft of the diff (against this series) is shown below.
> > 
> > 
> > diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
> > index 77eaefee13ea..c776618fbfa8 100644
> > --- a/arch/arm64/kernel/perf_event.c
> > +++ b/arch/arm64/kernel/perf_event.c
> > @@ -1388,9 +1388,7 @@ static int __init armv8_pmu_driver_init(void)
> >  	else
> >  		ret = arm_pmu_acpi_probe(armv8_pmuv3_pmu_init);
> >  
> > -	/* Inform watchdog core we are ready to probe hld by delayed init. */
> > -	lockup_detector_pending_init = false;
> > -	wake_up(&hld_detector_wait);
> > +	lockup_detector_queue_work();
> 
> The name is strange. The fact that it uses workqueues is an
> implementation detail. I would call it
> retry_lockup_detector_init() so that it is more obvious what it does.

Okay, I don't have good taste in naming.
I'll include this in the next version of the patches.


> 
> >  	return ret;
> >  }
> >  device_initcall(armv8_pmu_driver_init)
> > --- a/kernel/watchdog.c
> > +++ b/kernel/watchdog.c
> > @@ -876,15 +865,27 @@ static void __init lockup_detector_delay_init(struct work_struct *work)
> >  	lockup_detector_pending_init = false;
> >  }
> >  
> > +/* Must be called after lockup_detector_init(), when we need the delayed probe */
> > +void __init lockup_detector_queue_work(void)
> > +{
> > +	if (!lockup_detector_pending_init)
> > +		return;
> > +
> > +	queue_work_on(__smp_processor_id(), system_wq, &detector_work);
> > +}
> > +
> >  /* Ensure the check is called after the initialization of PMU driver */
> >  static int __init lockup_detector_check(void)
> >  {
> > +	/* Make sure no work is pending. */
> > +	flush_work(&detector_work);
> > +
> >  	if (!lockup_detector_pending_init)
> >  		return 0;
> >  
> >  	pr_info("Delayed init checking failed, retry for once.\n");
> > -	lockup_detector_pending_init = false;
> > -	wake_up(&hld_detector_wait);
> > +	lockup_detector_queue_work();
> 
> I would do here
> 
> 	lockup_detector_pending_init = false;
> 
> to make sure that lockup_detector_queue_work() will no longer
> queue the work after the final flush.
> 
> Maybe, we could rename the variable to allow_lockup_detector_init_retry.

Okay, I'm preparing the next version of the patches; I'll include it there.

Thanks a lot for all of the suggestions.


BRs,
Lecopzer


end of thread, other threads:[~2022-03-24 13:00 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-07 15:47 [PATCH v2 0/5] Suppot hld based on Pseudo-NMI for arm64 Lecopzer Chen
2022-03-07 15:47 ` [PATCH v2 1/5] kernel/watchdog: remove WATCHDOG_DEFAULT Lecopzer Chen
2022-03-07 15:47 ` [PATCH v2 2/5] kernel/watchdog: change watchdog_nmi_enable() to void Lecopzer Chen
2022-03-07 15:47 ` [PATCH v2 3/5] kernel/watchdog_hld: Ensure CPU-bound context when creating hardlockup detector event Lecopzer Chen
2022-03-07 15:47 ` [PATCH v2 4/5] kernel/watchdog: Adapt the watchdog_hld interface for async model Lecopzer Chen
2022-03-18 10:40   ` Petr Mladek
2022-03-19  8:18     ` Lecopzer Chen
2022-03-21 17:37       ` Petr Mladek
2022-03-24 12:55         ` Lecopzer Chen
2022-03-07 15:47 ` [PATCH v2 5/5] arm64: Enable perf events based hard lockup detector Lecopzer Chen
