* [PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI
@ 2023-04-19 22:55 ` Douglas Anderson
  0 siblings, 0 replies; 46+ messages in thread
From: Douglas Anderson @ 2023-04-19 22:55 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, Mark Rutland
  Cc: ito-yuichi, kgdb-bugreport, Chen-Yu Tsai, Masayoshi Mizuma,
	Peter Zijlstra, Ard Biesheuvel, Rafael J . Wysocki,
	linux-arm-kernel, Stephen Boyd, Lecopzer Chen, Thomas Gleixner,
	linux-perf-users, Douglas Anderson, Alexandru Elisei,
	Andrey Konovalov, Ben Dooks, Borislav Petkov, Christophe Leroy,
	Darrick J. Wong, Dave Hansen, David S. Miller, Eric W. Biederman,
	Frederic Weisbecker, Gaosheng Cui, Gautham R. Shenoy,
	Greg Kroah-Hartman, Guilherme G. Piccoli, Guo Ren,
	H. Peter Anvin, Huacai Chen, Ingo Molnar, Ingo Molnar,
	Jason A. Donenfeld, Jason Wessel, Jianmin Lv, Jiaxun Yang,
	Jinyang He, Joey Gouly, Kees Cook, Laurent Dufour,
	Masahiro Yamada, Masayoshi Mizuma, Michael Ellerman,
	Nicholas Piggin, Paul E. McKenney, Philippe Mathieu-Daudé,
	Pierre Gondois, Qing Zhang, Russell King (Oracle),
	Russell King, Thomas Bogendoerfer, Ulf Hansson, WANG Xuerui,
	linux-kernel, linux-mips, linuxppc-dev, loongarch, sparclinux,
	x86

This is an attempt to resurrect Sumit's old patch series [1] that
allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
also to round up CPUs in kdb/kgdb. The last post from Sumit that I
could find was v7, so I called this series v8. I haven't copied all of
his old changelogs here, but you can find them via the link.

Since v7, I have:
* Addressed the small amount of feedback that was there for v7.
* Rebased.
* Added a new patch that prevents us from spamming the logs with idle
  tasks.
* Added an extra patch to gracefully fall back to regular IPIs if
  pseudo-NMIs aren't there.

Since there appear to be a few different patch series related to
being able to use NMIs to get stack traces of crashed systems, let me
try to organize them to the best of my understanding:

a) This series. On its own, a) will (among other things) enable stack
   traces of all running processes with the soft lockup detector if
   you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
   its own, a) doesn't give a hard lockup detector.

b) A different recently-posted series [2] that adds a hard lockup
   detector based on perf. On its own, b) gives a stack crawl of the
   locked up CPU but no stack crawls of other CPUs (even if they're
   locked too). Together, a) + b) give us everything (full lockup
   detection, full ability to get stack crawls).

c) The old Android "buddy" hard lockup detector [3] that I'm
   considering trying to upstream. If b) lands then I believe c) would
   be redundant (at least for arm64). c) on its own is really only
   useful on arm64 for platforms that can print CPU_DBGPCSR somehow
   (see [4]). a) + c) is roughly as good as a) + b).

[1] https://lore.kernel.org/linux-arm-kernel/1604317487-14543-1-git-send-email-sumit.garg@linaro.org/
[2] https://lore.kernel.org/linux-arm-kernel/20220903093415.15850-1-lecopzer.chen@mediatek.com/
[3] https://issuetracker.google.com/172213097
[4] https://issuetracker.google.com/172213129

Changes in v8:
- dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param
- Add loongarch support, too
- Removed "#ifdef CONFIG_SMP" since arm64 is always SMP
- "Tag the arm64 idle functions as __cpuidle" new for v8
- "Provide a stub kgdb_nmicallback() if !CONFIG_KGDB" new for v8
- "Fallback to a regular IPI if NMI isn't enabled" new for v8

Douglas Anderson (3):
  arm64: idle: Tag the arm64 idle functions as __cpuidle
  kgdb: Provide a stub kgdb_nmicallback() if !CONFIG_KGDB
  arm64: ipi_nmi: Fallback to a regular IPI if NMI isn't enabled

Sumit Garg (7):
  arm64: Add framework to turn IPI as NMI
  irqchip/gic-v3: Enable support for SGIs to act as NMIs
  arm64: smp: Assign and setup an IPI as NMI
  nmi: backtrace: Allow runtime arch specific override
  arm64: ipi_nmi: Add support for NMI backtrace
  kgdb: Expose default CPUs roundup fallback mechanism
  arm64: kgdb: Roundup cpus using IPI as NMI

 arch/arm/include/asm/irq.h       |   2 +-
 arch/arm/kernel/smp.c            |   3 +-
 arch/arm64/include/asm/irq.h     |   4 ++
 arch/arm64/include/asm/nmi.h     |  17 +++++
 arch/arm64/kernel/Makefile       |   2 +-
 arch/arm64/kernel/idle.c         |   4 +-
 arch/arm64/kernel/ipi_nmi.c      | 103 +++++++++++++++++++++++++++++++
 arch/arm64/kernel/kgdb.c         |  18 ++++++
 arch/arm64/kernel/smp.c          |   8 +++
 arch/loongarch/include/asm/irq.h |   2 +-
 arch/loongarch/kernel/process.c  |   3 +-
 arch/mips/include/asm/irq.h      |   2 +-
 arch/mips/kernel/process.c       |   3 +-
 arch/powerpc/include/asm/nmi.h   |   2 +-
 arch/powerpc/kernel/stacktrace.c |   3 +-
 arch/sparc/include/asm/irq_64.h  |   2 +-
 arch/sparc/kernel/process_64.c   |   4 +-
 arch/x86/include/asm/irq.h       |   2 +-
 arch/x86/kernel/apic/hw_nmi.c    |   3 +-
 drivers/irqchip/irq-gic-v3.c     |  29 ++++++---
 include/linux/kgdb.h             |  13 ++++
 include/linux/nmi.h              |  12 ++--
 kernel/debug/debug_core.c        |   8 ++-
 23 files changed, 217 insertions(+), 32 deletions(-)
 create mode 100644 arch/arm64/include/asm/nmi.h
 create mode 100644 arch/arm64/kernel/ipi_nmi.c

-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply	[flat|nested] 46+ messages in thread

* [PATCH v8 01/10] arm64: Add framework to turn IPI as NMI
  2023-04-19 22:55 ` Douglas Anderson
@ 2023-04-19 22:55   ` Douglas Anderson
  -1 siblings, 0 replies; 46+ messages in thread
From: Douglas Anderson @ 2023-04-19 22:55 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, Mark Rutland
  Cc: ito-yuichi, kgdb-bugreport, Chen-Yu Tsai, Masayoshi Mizuma,
	Peter Zijlstra, Ard Biesheuvel, Rafael J . Wysocki,
	linux-arm-kernel, Stephen Boyd, Lecopzer Chen, Thomas Gleixner,
	linux-perf-users, Masayoshi Mizuma, Douglas Anderson,
	Andrey Konovalov, Joey Gouly, Masahiro Yamada, linux-kernel

From: Sumit Garg <sumit.garg@linaro.org>

Introduce a framework to turn an IPI into an NMI using pseudo NMIs. The main
motivation for this feature is to have an IPI that can be leveraged to
invoke NMI functions on other CPUs.

The current prospective users are NMI backtrace and the KGDB CPU round-up,
support for which is added in subsequent patches.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Tested-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v8:
- dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param

 arch/arm64/include/asm/nmi.h | 17 ++++++++++
 arch/arm64/kernel/Makefile   |  2 +-
 arch/arm64/kernel/ipi_nmi.c  | 65 ++++++++++++++++++++++++++++++++++++
 3 files changed, 83 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/nmi.h
 create mode 100644 arch/arm64/kernel/ipi_nmi.c

diff --git a/arch/arm64/include/asm/nmi.h b/arch/arm64/include/asm/nmi.h
new file mode 100644
index 000000000000..2cc4b4d4090e
--- /dev/null
+++ b/arch/arm64/include/asm/nmi.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_NMI_H
+#define __ASM_NMI_H
+
+#ifndef __ASSEMBLER__
+
+#include <linux/cpumask.h>
+
+extern bool arm64_supports_nmi(void);
+extern void arm64_send_nmi(cpumask_t *mask);
+
+void set_smp_dynamic_ipi(int ipi);
+void dynamic_ipi_setup(void);
+void dynamic_ipi_teardown(void);
+
+#endif /* !__ASSEMBLER__ */
+#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index ceba6792f5b3..d57c8f99ca69 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -34,7 +34,7 @@ obj-y			:= debug-monitors.o entry.o irq.o fpsimd.o		\
 			   cpufeature.o alternative.o cacheinfo.o		\
 			   smp.o smp_spin_table.o topology.o smccc-call.o	\
 			   syscall.o proton-pack.o idreg-override.o idle.o	\
-			   patching.o
+			   patching.o ipi_nmi.o
 
 obj-$(CONFIG_COMPAT)			+= sys32.o signal32.o			\
 					   sys_compat.o
diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
new file mode 100644
index 000000000000..712411eed949
--- /dev/null
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * NMI support for IPIs
+ *
+ * Copyright (C) 2020 Linaro Limited
+ * Author: Sumit Garg <sumit.garg@linaro.org>
+ */
+
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/smp.h>
+
+#include <asm/nmi.h>
+
+static struct irq_desc *ipi_nmi_desc __read_mostly;
+static int ipi_nmi_id __read_mostly;
+
+bool arm64_supports_nmi(void)
+{
+	if (ipi_nmi_desc)
+		return true;
+
+	return false;
+}
+
+void arm64_send_nmi(cpumask_t *mask)
+{
+	if (WARN_ON_ONCE(!ipi_nmi_desc))
+		return;
+
+	__ipi_send_mask(ipi_nmi_desc, mask);
+}
+
+static irqreturn_t ipi_nmi_handler(int irq, void *data)
+{
+	/* nop, NMI handlers for special features can be added here. */
+
+	return IRQ_NONE;
+}
+
+void dynamic_ipi_setup(void)
+{
+	if (!ipi_nmi_desc)
+		return;
+
+	if (!prepare_percpu_nmi(ipi_nmi_id))
+		enable_percpu_nmi(ipi_nmi_id, IRQ_TYPE_NONE);
+}
+
+void dynamic_ipi_teardown(void)
+{
+	if (!ipi_nmi_desc)
+		return;
+
+	disable_percpu_nmi(ipi_nmi_id);
+	teardown_percpu_nmi(ipi_nmi_id);
+}
+
+void __init set_smp_dynamic_ipi(int ipi)
+{
+	if (!request_percpu_nmi(ipi, ipi_nmi_handler, "IPI", &cpu_number)) {
+		ipi_nmi_desc = irq_to_desc(ipi);
+		ipi_nmi_id = ipi;
+	}
+}
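
[ Editor's note, not part of the patch: below is a minimal, hypothetical
  sketch of how a consumer of this framework might use arm64_supports_nmi()
  and arm64_send_nmi(). The function name and the "kick every other online
  CPU" policy are illustrative only, and smp_processor_id() assumes the
  caller runs with preemption disabled. ]

	#include <linux/cpumask.h>
	#include <linux/smp.h>

	#include <asm/nmi.h>

	/* Hypothetical caller: raise the NMI IPI on every other online CPU. */
	static void example_nmi_kick_other_cpus(void)
	{
		cpumask_t mask;

		/* False when no SGI could be set up as a pseudo-NMI. */
		if (!arm64_supports_nmi())
			return;

		cpumask_copy(&mask, cpu_online_mask);
		cpumask_clear_cpu(smp_processor_id(), &mask);

		if (!cpumask_empty(&mask))
			arm64_send_nmi(&mask);
	}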
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 02/10] irqchip/gic-v3: Enable support for SGIs to act as NMIs
  2023-04-19 22:55 ` Douglas Anderson
@ 2023-04-19 22:55   ` Douglas Anderson
  -1 siblings, 0 replies; 46+ messages in thread
From: Douglas Anderson @ 2023-04-19 22:55 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, Mark Rutland
  Cc: ito-yuichi, kgdb-bugreport, Chen-Yu Tsai, Masayoshi Mizuma,
	Peter Zijlstra, Ard Biesheuvel, Rafael J . Wysocki,
	linux-arm-kernel, Stephen Boyd, Lecopzer Chen, Thomas Gleixner,
	linux-perf-users, Masayoshi Mizuma, Douglas Anderson,
	linux-kernel

From: Sumit Garg <sumit.garg@linaro.org>

Add support to handle SGIs as pseudo NMIs. Since SGIs (IPIs) default to a
special flow handler, handle_percpu_devid_fasteoi_ipi(), skip the NMI
handler update in the case of SGIs.

Also, enable NMI support prior to gic_smp_init(), since the allocation of
SGIs as IRQs/NMIs happens as part of that routine.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Tested-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

(no changes since v1)

 drivers/irqchip/irq-gic-v3.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index fd134e1f481a..b402a81fea59 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -482,6 +482,7 @@ static u32 gic_get_ppi_index(struct irq_data *d)
 static int gic_irq_nmi_setup(struct irq_data *d)
 {
 	struct irq_desc *desc = irq_to_desc(d->irq);
+	u32 idx;
 
 	if (!gic_supports_nmi())
 		return -EINVAL;
@@ -499,16 +500,22 @@ static int gic_irq_nmi_setup(struct irq_data *d)
 		return -EINVAL;
 
 	/* desc lock should already be held */
-	if (gic_irq_in_rdist(d)) {
-		u32 idx = gic_get_ppi_index(d);
+	switch (get_intid_range(d)) {
+	case SGI_RANGE:
+		break;
+	case PPI_RANGE:
+	case EPPI_RANGE:
+		idx = gic_get_ppi_index(d);
 
 		/* Setting up PPI as NMI, only switch handler for first NMI */
 		if (!refcount_inc_not_zero(&ppi_nmi_refs[idx])) {
 			refcount_set(&ppi_nmi_refs[idx], 1);
 			desc->handle_irq = handle_percpu_devid_fasteoi_nmi;
 		}
-	} else {
+		break;
+	default:
 		desc->handle_irq = handle_fasteoi_nmi;
+		break;
 	}
 
 	gic_irq_set_prio(d, GICD_INT_NMI_PRI);
@@ -519,6 +526,7 @@ static int gic_irq_nmi_setup(struct irq_data *d)
 static void gic_irq_nmi_teardown(struct irq_data *d)
 {
 	struct irq_desc *desc = irq_to_desc(d->irq);
+	u32 idx;
 
 	if (WARN_ON(!gic_supports_nmi()))
 		return;
@@ -536,14 +544,20 @@ static void gic_irq_nmi_teardown(struct irq_data *d)
 		return;
 
 	/* desc lock should already be held */
-	if (gic_irq_in_rdist(d)) {
-		u32 idx = gic_get_ppi_index(d);
+	switch (get_intid_range(d)) {
+	case SGI_RANGE:
+		break;
+	case PPI_RANGE:
+	case EPPI_RANGE:
+		idx = gic_get_ppi_index(d);
 
 		/* Tearing down NMI, only switch handler for last NMI */
 		if (refcount_dec_and_test(&ppi_nmi_refs[idx]))
 			desc->handle_irq = handle_percpu_devid_irq;
-	} else {
+		break;
+	default:
 		desc->handle_irq = handle_fasteoi_irq;
+		break;
 	}
 
 	gic_irq_set_prio(d, GICD_INT_DEF_PRI);
@@ -1867,6 +1881,7 @@ static int __init gic_init_bases(void __iomem *dist_base,
 
 	gic_dist_init();
 	gic_cpu_init();
+	gic_enable_nmi_support();
 	gic_smp_init();
 	gic_cpu_pm_init();
 
@@ -1879,8 +1894,6 @@ static int __init gic_init_bases(void __iomem *dist_base,
 			gicv2m_init(handle, gic_data.domain);
 	}
 
-	gic_enable_nmi_support();
-
 	return 0;
 
 out_free:
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 03/10] arm64: smp: Assign and setup an IPI as NMI
  2023-04-19 22:55 ` Douglas Anderson
@ 2023-04-19 22:55   ` Douglas Anderson
  -1 siblings, 0 replies; 46+ messages in thread
From: Douglas Anderson @ 2023-04-19 22:55 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, Mark Rutland
  Cc: ito-yuichi, kgdb-bugreport, Chen-Yu Tsai, Masayoshi Mizuma,
	Peter Zijlstra, Ard Biesheuvel, Rafael J . Wysocki,
	linux-arm-kernel, Stephen Boyd, Lecopzer Chen, Thomas Gleixner,
	linux-perf-users, Masayoshi Mizuma, Douglas Anderson, Ben Dooks,
	Pierre Gondois, linux-kernel

From: Sumit Garg <sumit.garg@linaro.org>

Assign an unused IPI which can be turned into an NMI using the ipi_nmi
framework. Also, invoke the corresponding dynamic IPI setup/teardown APIs.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Tested-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v8:
- dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param

 arch/arm64/kernel/smp.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 4e8327264255..94ff063527c6 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -43,6 +43,7 @@
 #include <asm/daifflags.h>
 #include <asm/kvm_mmu.h>
 #include <asm/mmu_context.h>
+#include <asm/nmi.h>
 #include <asm/numa.h>
 #include <asm/processor.h>
 #include <asm/smp_plat.h>
@@ -938,6 +939,8 @@ static void ipi_setup(int cpu)
 
 	for (i = 0; i < nr_ipi; i++)
 		enable_percpu_irq(ipi_irq_base + i, 0);
+
+	dynamic_ipi_setup();
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
@@ -950,6 +953,8 @@ static void ipi_teardown(int cpu)
 
 	for (i = 0; i < nr_ipi; i++)
 		disable_percpu_irq(ipi_irq_base + i);
+
+	dynamic_ipi_teardown();
 }
 #endif
 
@@ -971,6 +976,9 @@ void __init set_smp_ipi_range(int ipi_base, int n)
 		irq_set_status_flags(ipi_base + i, IRQ_HIDDEN);
 	}
 
+	if (n > nr_ipi)
+		set_smp_dynamic_ipi(ipi_base + nr_ipi);
+
 	ipi_irq_base = ipi_base;
 
 	/* Setup the boot CPU immediately */
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 04/10] nmi: backtrace: Allow runtime arch specific override
  2023-04-19 22:55 ` Douglas Anderson
@ 2023-04-19 22:55   ` Douglas Anderson
  -1 siblings, 0 replies; 46+ messages in thread
From: Douglas Anderson @ 2023-04-19 22:55 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, Mark Rutland
  Cc: ito-yuichi, kgdb-bugreport, Chen-Yu Tsai, Masayoshi Mizuma,
	Peter Zijlstra, Ard Biesheuvel, Rafael J . Wysocki,
	linux-arm-kernel, Stephen Boyd, Lecopzer Chen, Thomas Gleixner,
	linux-perf-users, Douglas Anderson, Ben Dooks, Borislav Petkov,
	Christophe Leroy, Darrick J. Wong, Dave Hansen, David S. Miller,
	Eric W. Biederman, Gaosheng Cui, Gautham R. Shenoy,
	Greg Kroah-Hartman, Guilherme G. Piccoli, Guo Ren,
	H. Peter Anvin, Huacai Chen, Ingo Molnar, Jason A. Donenfeld,
	Jianmin Lv, Jiaxun Yang, Jinyang He, Kees Cook, Laurent Dufour,
	Michael Ellerman, Nicholas Piggin, Paul E. McKenney,
	Philippe Mathieu-Daudé, Qing Zhang, Russell King (Oracle),
	Russell King, Thomas Bogendoerfer, Ulf Hansson, WANG Xuerui,
	linux-kernel, linux-mips, linuxppc-dev, loongarch, sparclinux,
	x86

From: Sumit Garg <sumit.garg@linaro.org>

Add a boolean return to arch_trigger_cpumask_backtrace() to support a
use-case where a particular architecture detects at runtime whether it
supports NMI backtrace or would rather fall back to the default
implementation using SMP cross-calls.

One such architecture is arm64, whose pseudo-NMI feature is only available
on platforms with support for GICv3 or later.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Tested-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v8:
- Add loongarch support, too

 arch/arm/include/asm/irq.h       |  2 +-
 arch/arm/kernel/smp.c            |  3 ++-
 arch/loongarch/include/asm/irq.h |  2 +-
 arch/loongarch/kernel/process.c  |  3 ++-
 arch/mips/include/asm/irq.h      |  2 +-
 arch/mips/kernel/process.c       |  3 ++-
 arch/powerpc/include/asm/nmi.h   |  2 +-
 arch/powerpc/kernel/stacktrace.c |  3 ++-
 arch/sparc/include/asm/irq_64.h  |  2 +-
 arch/sparc/kernel/process_64.c   |  4 +++-
 arch/x86/include/asm/irq.h       |  2 +-
 arch/x86/kernel/apic/hw_nmi.c    |  3 ++-
 include/linux/nmi.h              | 12 ++++--------
 13 files changed, 23 insertions(+), 20 deletions(-)

diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index a7c2337b0c7d..e6b62c7d6f0e 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -32,7 +32,7 @@ void init_IRQ(void);
 #ifdef CONFIG_SMP
 #include <linux/cpumask.h>
 
-extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
+extern bool arch_trigger_cpumask_backtrace(const cpumask_t *mask,
 					   bool exclude_self);
 #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
 #endif
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 0b8c25763adc..acb97d9219b1 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -849,7 +849,8 @@ static void raise_nmi(cpumask_t *mask)
 	__ipi_send_mask(ipi_desc[IPI_CPU_BACKTRACE], mask);
 }
 
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 {
 	nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_nmi);
+	return true;
 }
diff --git a/arch/loongarch/include/asm/irq.h b/arch/loongarch/include/asm/irq.h
index a115e8999c69..c7a152d6bf0c 100644
--- a/arch/loongarch/include/asm/irq.h
+++ b/arch/loongarch/include/asm/irq.h
@@ -40,7 +40,7 @@ void spurious_interrupt(void);
 #define NR_IRQS_LEGACY 16
 
 #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
-void arch_trigger_cpumask_backtrace(const struct cpumask *mask, bool exclude_self);
+bool arch_trigger_cpumask_backtrace(const struct cpumask *mask, bool exclude_self);
 
 #define MAX_IO_PICS 2
 #define NR_IRQS	(64 + (256 * MAX_IO_PICS))
diff --git a/arch/loongarch/kernel/process.c b/arch/loongarch/kernel/process.c
index fa2443c7afb2..8f7f818f5c4e 100644
--- a/arch/loongarch/kernel/process.c
+++ b/arch/loongarch/kernel/process.c
@@ -339,9 +339,10 @@ static void raise_backtrace(cpumask_t *mask)
 	}
 }
 
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 {
 	nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace);
+	return true;
 }
 
 #ifdef CONFIG_64BIT
diff --git a/arch/mips/include/asm/irq.h b/arch/mips/include/asm/irq.h
index 44f9824c1d8c..daf16173486a 100644
--- a/arch/mips/include/asm/irq.h
+++ b/arch/mips/include/asm/irq.h
@@ -77,7 +77,7 @@ extern int cp0_fdc_irq;
 
 extern int get_c0_fdc_int(void);
 
-void arch_trigger_cpumask_backtrace(const struct cpumask *mask,
+bool arch_trigger_cpumask_backtrace(const struct cpumask *mask,
 				    bool exclude_self);
 #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
 
diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index 093dbbd6b843..7d538571830a 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -750,9 +750,10 @@ static void raise_backtrace(cpumask_t *mask)
 	}
 }
 
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 {
 	nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace);
+	return true;
 }
 
 int mips_get_process_fp_mode(struct task_struct *task)
diff --git a/arch/powerpc/include/asm/nmi.h b/arch/powerpc/include/asm/nmi.h
index c3c7adef74de..135f65adcf63 100644
--- a/arch/powerpc/include/asm/nmi.h
+++ b/arch/powerpc/include/asm/nmi.h
@@ -12,7 +12,7 @@ static inline void watchdog_nmi_set_timeout_pct(u64 pct) {}
 #endif
 
 #ifdef CONFIG_NMI_IPI
-extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
+extern bool arch_trigger_cpumask_backtrace(const cpumask_t *mask,
 					   bool exclude_self);
 #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
 #endif
diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index 5de8597eaab8..0fee4bded7ba 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -221,8 +221,9 @@ static void raise_backtrace_ipi(cpumask_t *mask)
 	}
 }
 
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 {
 	nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace_ipi);
+	return true;
 }
 #endif /* defined(CONFIG_PPC_BOOK3S_64) && defined(CONFIG_NMI_IPI) */
diff --git a/arch/sparc/include/asm/irq_64.h b/arch/sparc/include/asm/irq_64.h
index 154df2cf19f4..00a0051a9da0 100644
--- a/arch/sparc/include/asm/irq_64.h
+++ b/arch/sparc/include/asm/irq_64.h
@@ -87,7 +87,7 @@ static inline unsigned long get_softint(void)
 	return retval;
 }
 
-void arch_trigger_cpumask_backtrace(const struct cpumask *mask,
+bool arch_trigger_cpumask_backtrace(const struct cpumask *mask,
 				    bool exclude_self);
 #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
 
diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c
index 91c2b8124527..f9aea1df3adf 100644
--- a/arch/sparc/kernel/process_64.c
+++ b/arch/sparc/kernel/process_64.c
@@ -236,7 +236,7 @@ static void __global_reg_poll(struct global_reg_snapshot *gp)
 	}
 }
 
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 {
 	struct thread_info *tp = current_thread_info();
 	struct pt_regs *regs = get_irq_regs();
@@ -291,6 +291,8 @@ void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 	memset(global_cpu_snapshot, 0, sizeof(global_cpu_snapshot));
 
 	spin_unlock_irqrestore(&global_cpu_snapshot_lock, flags);
+
+	return true;
 }
 
 #ifdef CONFIG_MAGIC_SYSRQ
diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h
index 768aa234cbb4..f731638cc38e 100644
--- a/arch/x86/include/asm/irq.h
+++ b/arch/x86/include/asm/irq.h
@@ -43,7 +43,7 @@ extern void init_ISA_irqs(void);
 extern void __init init_IRQ(void);
 
 #ifdef CONFIG_X86_LOCAL_APIC
-void arch_trigger_cpumask_backtrace(const struct cpumask *mask,
+bool arch_trigger_cpumask_backtrace(const struct cpumask *mask,
 				    bool exclude_self);
 
 #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 34a992e275ef..e7dcd28bc824 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -34,10 +34,11 @@ static void nmi_raise_cpu_backtrace(cpumask_t *mask)
 	apic->send_IPI_mask(mask, NMI_VECTOR);
 }
 
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 {
 	nmi_trigger_cpumask_backtrace(mask, exclude_self,
 				      nmi_raise_cpu_backtrace);
+	return true;
 }
 
 static int nmi_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 048c0b9aa623..7d8a77cd1e03 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -145,26 +145,22 @@ static inline void touch_nmi_watchdog(void)
 #ifdef arch_trigger_cpumask_backtrace
 static inline bool trigger_all_cpu_backtrace(void)
 {
-	arch_trigger_cpumask_backtrace(cpu_online_mask, false);
-	return true;
+	return arch_trigger_cpumask_backtrace(cpu_online_mask, false);
 }
 
 static inline bool trigger_allbutself_cpu_backtrace(void)
 {
-	arch_trigger_cpumask_backtrace(cpu_online_mask, true);
-	return true;
+	return arch_trigger_cpumask_backtrace(cpu_online_mask, true);
 }
 
 static inline bool trigger_cpumask_backtrace(struct cpumask *mask)
 {
-	arch_trigger_cpumask_backtrace(mask, false);
-	return true;
+	return arch_trigger_cpumask_backtrace(mask, false);
 }
 
 static inline bool trigger_single_cpu_backtrace(int cpu)
 {
-	arch_trigger_cpumask_backtrace(cpumask_of(cpu), false);
-	return true;
+	return arch_trigger_cpumask_backtrace(cpumask_of(cpu), false);
 }
 
 /* generic implementation */
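
[ Editor's note, not part of the patch: with the boolean return, generic
  callers of the trigger_*_backtrace() wrappers above can tell whether the
  architecture actually raised an NMI backtrace and degrade gracefully. A
  hypothetical sketch follows; the function name and fallback message are
  illustrative only. ]

	#include <linux/nmi.h>
	#include <linux/printk.h>

	static void example_dump_all_cpus(void)
	{
		/* Returns false when the arch cannot do an NMI backtrace. */
		if (!trigger_all_cpu_backtrace())
			pr_warn("NMI backtrace not available on this platform\n");
	}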
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 05/10] arm64: ipi_nmi: Add support for NMI backtrace
  2023-04-19 22:55 ` Douglas Anderson
@ 2023-04-19 22:55   ` Douglas Anderson
  -1 siblings, 0 replies; 46+ messages in thread
From: Douglas Anderson @ 2023-04-19 22:55 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, Mark Rutland
  Cc: ito-yuichi, kgdb-bugreport, Chen-Yu Tsai, Masayoshi Mizuma,
	Peter Zijlstra, Ard Biesheuvel, Rafael J . Wysocki,
	linux-arm-kernel, Stephen Boyd, Lecopzer Chen, Thomas Gleixner,
	linux-perf-users, Masayoshi Mizuma, Douglas Anderson,
	linux-kernel

From: Sumit Garg <sumit.garg@linaro.org>

Enable NMI backtrace support on arm64 using an IPI turned into an NMI,
leveraging pseudo-NMI support. It is now possible for users to get a
backtrace of a CPU stuck in a hard lockup using magic SysRq.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Tested-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v8:
- Removed "#ifdef CONFIG_SMP" since arm64 is always SMP

 arch/arm64/include/asm/irq.h |  4 ++++
 arch/arm64/kernel/ipi_nmi.c  | 18 ++++++++++++++++--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index fac08e18bcd5..dc35b9d23a81 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -6,6 +6,10 @@
 
 #include <asm-generic/irq.h>
 
+extern bool arch_trigger_cpumask_backtrace(const cpumask_t *mask,
+					   bool exclude_self);
+#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
+
 struct pt_regs;
 
 int set_handle_irq(void (*handle_irq)(struct pt_regs *));
diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
index 712411eed949..c592e92b8cbf 100644
--- a/arch/arm64/kernel/ipi_nmi.c
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -8,6 +8,7 @@
 
 #include <linux/interrupt.h>
 #include <linux/irq.h>
+#include <linux/nmi.h>
 #include <linux/smp.h>
 
 #include <asm/nmi.h>
@@ -31,11 +32,24 @@ void arm64_send_nmi(cpumask_t *mask)
 	__ipi_send_mask(ipi_nmi_desc, mask);
 }
 
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+{
+	if (!ipi_nmi_desc)
+		return false;
+
+	nmi_trigger_cpumask_backtrace(mask, exclude_self, arm64_send_nmi);
+
+	return true;
+}
+
 static irqreturn_t ipi_nmi_handler(int irq, void *data)
 {
-	/* nop, NMI handlers for special features can be added here. */
+	irqreturn_t ret = IRQ_NONE;
+
+	if (nmi_cpu_backtrace(get_irq_regs()))
+		ret = IRQ_HANDLED;
 
-	return IRQ_NONE;
+	return ret;
 }
 
 void dynamic_ipi_setup(void)
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 05/10] arm64: ipi_nmi: Add support for NMI backtrace
@ 2023-04-19 22:55   ` Douglas Anderson
  0 siblings, 0 replies; 46+ messages in thread
From: Douglas Anderson @ 2023-04-19 22:55 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, Mark Rutland
  Cc: ito-yuichi, kgdb-bugreport, Chen-Yu Tsai, Masayoshi Mizuma,
	Peter Zijlstra, Ard Biesheuvel, Rafael J . Wysocki,
	linux-arm-kernel, Stephen Boyd, Lecopzer Chen, Thomas Gleixner,
	linux-perf-users, Masayoshi Mizuma, Douglas Anderson,
	linux-kernel

From: Sumit Garg <sumit.garg@linaro.org>

Enable NMI backtrace support on arm64 using IPI turned as an NMI
leveraging pseudo NMIs support. It is now possible for users to get a
backtrace of a CPU stuck in hard-lockup using magic SYSRQ.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Tested-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
Tested-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v8:
- Removed "#ifdef CONFIG_SMP" since arm64 is always SMP

 arch/arm64/include/asm/irq.h |  4 ++++
 arch/arm64/kernel/ipi_nmi.c  | 18 ++++++++++++++++--
 2 files changed, 20 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index fac08e18bcd5..dc35b9d23a81 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -6,6 +6,10 @@
 
 #include <asm-generic/irq.h>
 
+extern bool arch_trigger_cpumask_backtrace(const cpumask_t *mask,
+					   bool exclude_self);
+#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
+
 struct pt_regs;
 
 int set_handle_irq(void (*handle_irq)(struct pt_regs *));
diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
index 712411eed949..c592e92b8cbf 100644
--- a/arch/arm64/kernel/ipi_nmi.c
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -8,6 +8,7 @@
 
 #include <linux/interrupt.h>
 #include <linux/irq.h>
+#include <linux/nmi.h>
 #include <linux/smp.h>
 
 #include <asm/nmi.h>
@@ -31,11 +32,24 @@ void arm64_send_nmi(cpumask_t *mask)
 	__ipi_send_mask(ipi_nmi_desc, mask);
 }
 
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+{
+	if (!ipi_nmi_desc)
+		return false;
+
+	nmi_trigger_cpumask_backtrace(mask, exclude_self, arm64_send_nmi);
+
+	return true;
+}
+
 static irqreturn_t ipi_nmi_handler(int irq, void *data)
 {
-	/* nop, NMI handlers for special features can be added here. */
+	irqreturn_t ret = IRQ_NONE;
+
+	if (nmi_cpu_backtrace(get_irq_regs()))
+		ret = IRQ_HANDLED;
 
-	return IRQ_NONE;
+	return ret;
 }
 
 void dynamic_ipi_setup(void)
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 06/10] arm64: idle: Tag the arm64 idle functions as __cpuidle
  2023-04-19 22:55 ` Douglas Anderson
@ 2023-04-19 22:56   ` Douglas Anderson
  -1 siblings, 0 replies; 46+ messages in thread
From: Douglas Anderson @ 2023-04-19 22:56 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, Mark Rutland
  Cc: ito-yuichi, kgdb-bugreport, Chen-Yu Tsai, Masayoshi Mizuma,
	Peter Zijlstra, Ard Biesheuvel, Rafael J . Wysocki,
	linux-arm-kernel, Stephen Boyd, Lecopzer Chen, Thomas Gleixner,
	linux-perf-users, Douglas Anderson, Frederic Weisbecker,
	Gautham R. Shenoy, Ingo Molnar, linux-kernel

As per the (somewhat recent) comment before the definition of
`__cpuidle`, the tag is like `noinstr` but also marks a function so it
can be identified by cpu_in_idle(). Let's add this tag to the arm64
idle functions.

After doing this, when we dump stack traces of all processors using
nmi_cpu_backtrace(), instead of getting useless backtraces we get
lines like:

  NMI backtrace for cpu N skipped: idling at cpu_do_idle+0x94/0x98
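
For reference, the reason the tag matters here: __cpuidle places a
function in the .cpuidle.text section, and cpu_in_idle() just checks
whether a PC falls inside that section. A simplified sketch of that
check (modeled on kernel/sched/idle.c; details may vary by kernel
version):

  /* Section boundaries provided by the linker script. */
  extern char __cpuidle_text_start[], __cpuidle_text_end[];

  bool cpu_in_idle(unsigned long pc)
  {
          return pc >= (unsigned long)__cpuidle_text_start &&
                 pc < (unsigned long)__cpuidle_text_end;
  }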

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v8:
- "Tag the arm64 idle functions as __cpuidle" new for v8

 arch/arm64/kernel/idle.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/idle.c b/arch/arm64/kernel/idle.c
index c1125753fe9b..05cfb347ec26 100644
--- a/arch/arm64/kernel/idle.c
+++ b/arch/arm64/kernel/idle.c
@@ -20,7 +20,7 @@
  *	ensure that interrupts are not masked at the PMR (because the core will
  *	not wake up if we block the wake up signal in the interrupt controller).
  */
-void noinstr cpu_do_idle(void)
+void __cpuidle cpu_do_idle(void)
 {
 	struct arm_cpuidle_irq_context context;
 
@@ -35,7 +35,7 @@ void noinstr cpu_do_idle(void)
 /*
  * This is our default idle handler.
  */
-void noinstr arch_cpu_idle(void)
+void __cpuidle arch_cpu_idle(void)
 {
 	/*
 	 * This should do all the clock switching and wait for interrupt
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 07/10] kgdb: Expose default CPUs roundup fallback mechanism
  2023-04-19 22:55 ` Douglas Anderson
@ 2023-04-19 22:56   ` Douglas Anderson
  -1 siblings, 0 replies; 46+ messages in thread
From: Douglas Anderson @ 2023-04-19 22:56 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, Mark Rutland
  Cc: ito-yuichi, kgdb-bugreport, Chen-Yu Tsai, Masayoshi Mizuma,
	Peter Zijlstra, Ard Biesheuvel, Rafael J . Wysocki,
	linux-arm-kernel, Stephen Boyd, Lecopzer Chen, Thomas Gleixner,
	linux-perf-users, Douglas Anderson, Jason Wessel, linux-kernel

From: Sumit Garg <sumit.garg@linaro.org>

Add a new API, kgdb_smp_call_nmi_hook(), to expose the default CPU
roundup mechanism to architectures, so that an architecture which
detects at runtime that it cannot do an NMI-based roundup can fall
back to it.

One such architecture is arm64, where the pseudo-NMI feature is only
available on platforms with GICv3 or a later GIC version.
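
A minimal sketch of the intended usage (this is essentially what the
arm64 patch later in this series does; arch_can_do_nmi_roundup() and
arch_send_debug_nmi() are made-up placeholders for whatever the
architecture actually provides):

  #include <linux/kgdb.h>

  void kgdb_roundup_cpus(void)
  {
          /* Placeholder: "can we do an NMI-based roundup right now?" */
          if (!arch_can_do_nmi_roundup()) {
                  /* No: use the generic CSD/IPI-based roundup. */
                  kgdb_smp_call_nmi_hook();
                  return;
          }

          /* Yes: send the architecture's NMI to all other CPUs. */
          arch_send_debug_nmi();
  }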

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Tested-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

(no changes since v1)

 include/linux/kgdb.h      | 12 ++++++++++++
 kernel/debug/debug_core.c |  8 +++++++-
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
index 258cdde8d356..87713bd390f3 100644
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -199,6 +199,18 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
 
 extern void kgdb_call_nmi_hook(void *ignored);
 
+/**
+ *	kgdb_smp_call_nmi_hook - Provide default fallback mechanism to
+ *				 round-up CPUs
+ *
+ *	If you're using the default implementation of kgdb_roundup_cpus()
+ *	this function will be called.  And if an arch detects at runtime to
+ *	not support NMI based roundup then it can fallback to default
+ *	mechanism using this API.
+ */
+
+extern void kgdb_smp_call_nmi_hook(void);
+
 /**
  *	kgdb_roundup_cpus - Get other CPUs into a holding pattern
  *
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index d5e9ccde3ab8..14d40a7d6a4b 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -238,7 +238,7 @@ NOKPROBE_SYMBOL(kgdb_call_nmi_hook);
 static DEFINE_PER_CPU(call_single_data_t, kgdb_roundup_csd) =
 	CSD_INIT(kgdb_call_nmi_hook, NULL);
 
-void __weak kgdb_roundup_cpus(void)
+void kgdb_smp_call_nmi_hook(void)
 {
 	call_single_data_t *csd;
 	int this_cpu = raw_smp_processor_id();
@@ -269,6 +269,12 @@ void __weak kgdb_roundup_cpus(void)
 			kgdb_info[cpu].rounding_up = false;
 	}
 }
+NOKPROBE_SYMBOL(kgdb_smp_call_nmi_hook);
+
+void __weak kgdb_roundup_cpus(void)
+{
+	kgdb_smp_call_nmi_hook();
+}
 NOKPROBE_SYMBOL(kgdb_roundup_cpus);
 
 #endif
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 08/10] kgdb: Provide a stub kgdb_nmicallback() if !CONFIG_KGDB
  2023-04-19 22:55 ` Douglas Anderson
@ 2023-04-19 22:56   ` Douglas Anderson
  -1 siblings, 0 replies; 46+ messages in thread
From: Douglas Anderson @ 2023-04-19 22:56 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, Mark Rutland
  Cc: ito-yuichi, kgdb-bugreport, Chen-Yu Tsai, Masayoshi Mizuma,
	Peter Zijlstra, Ard Biesheuvel, Rafael J . Wysocki,
	linux-arm-kernel, Stephen Boyd, Lecopzer Chen, Thomas Gleixner,
	linux-perf-users, Douglas Anderson, Jason Wessel, linux-kernel

To save architectures from needing to wrap the call in #ifdefs, add a
stub no-op version of kgdb_nmicallback(). It returns 1, meaning
"nothing was handled", since without CONFIG_KGDB there is no debugger
to hand the CPU to.
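
A minimal sketch of the kind of caller this helps, loosely modeled on
the arm64 NMI handler later in this series (example_nmi_handler() is
a made-up name); without the stub, the kgdb_nmicallback() call would
need an #ifdef CONFIG_KGDB in every such handler:

  #include <linux/interrupt.h>
  #include <linux/kgdb.h>
  #include <linux/smp.h>
  #include <asm/irq_regs.h>

  static irqreturn_t example_nmi_handler(int irq, void *data)
  {
          /* Returns 0 when kgdb pulled this CPU into the debugger. */
          if (!kgdb_nmicallback(smp_processor_id(), get_irq_regs()))
                  return IRQ_HANDLED;

          return IRQ_NONE;
  }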

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

Changes in v8:
- "Provide a stub kgdb_nmicallback() if !CONFIG_KGDB" new for v8

 include/linux/kgdb.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
index 87713bd390f3..9ce628ee47cc 100644
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -377,5 +377,6 @@ extern void kgdb_free_init_mem(void);
 #define dbg_late_init()
 static inline void kgdb_panic(const char *msg) {}
 static inline void kgdb_free_init_mem(void) { }
+static inline int kgdb_nmicallback(int cpu, void *regs) { return 1; }
 #endif /* ! CONFIG_KGDB */
 #endif /* _KGDB_H_ */
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 09/10] arm64: kgdb: Roundup cpus using IPI as NMI
  2023-04-19 22:55 ` Douglas Anderson
@ 2023-04-19 22:56   ` Douglas Anderson
  -1 siblings, 0 replies; 46+ messages in thread
From: Douglas Anderson @ 2023-04-19 22:56 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, Mark Rutland
  Cc: ito-yuichi, kgdb-bugreport, Chen-Yu Tsai, Masayoshi Mizuma,
	Peter Zijlstra, Ard Biesheuvel, Rafael J . Wysocki,
	linux-arm-kernel, Stephen Boyd, Lecopzer Chen, Thomas Gleixner,
	linux-perf-users, Douglas Anderson, Alexandru Elisei,
	Masayoshi Mizuma, linux-kernel

From: Sumit Garg <sumit.garg@linaro.org>

arm64 platforms with GICv3 or later support pseudo-NMIs, which can be
leveraged to round up CPUs that are stuck in a hard lockup with
interrupts disabled; that wouldn't be possible with a normal IPI.

So switch to rounding up CPUs using an IPI turned into an NMI. If a
particular arm64 platform doesn't support pseudo-NMIs, it falls back
to the default kgdb CPU roundup mechanism.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
Tested-by: Chen-Yu Tsai <wens@csie.org>
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---

(no changes since v1)

 arch/arm64/kernel/ipi_nmi.c |  5 +++++
 arch/arm64/kernel/kgdb.c    | 18 ++++++++++++++++++
 2 files changed, 23 insertions(+)

diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
index c592e92b8cbf..2adaaf1519e5 100644
--- a/arch/arm64/kernel/ipi_nmi.c
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -8,6 +8,7 @@
 
 #include <linux/interrupt.h>
 #include <linux/irq.h>
+#include <linux/kgdb.h>
 #include <linux/nmi.h>
 #include <linux/smp.h>
 
@@ -45,10 +46,14 @@ bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 static irqreturn_t ipi_nmi_handler(int irq, void *data)
 {
 	irqreturn_t ret = IRQ_NONE;
+	unsigned int cpu = smp_processor_id();
 
 	if (nmi_cpu_backtrace(get_irq_regs()))
 		ret = IRQ_HANDLED;
 
+	if (!kgdb_nmicallback(cpu, get_irq_regs()))
+		ret = IRQ_HANDLED;
+
 	return ret;
 }
 
diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
index cda9c1e9864f..2c85bc1df013 100644
--- a/arch/arm64/kernel/kgdb.c
+++ b/arch/arm64/kernel/kgdb.c
@@ -17,6 +17,7 @@
 
 #include <asm/debug-monitors.h>
 #include <asm/insn.h>
+#include <asm/nmi.h>
 #include <asm/patching.h>
 #include <asm/traps.h>
 
@@ -354,3 +355,20 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
 	return aarch64_insn_write((void *)bpt->bpt_addr,
 			*(u32 *)bpt->saved_instr);
 }
+
+void kgdb_roundup_cpus(void)
+{
+	struct cpumask mask;
+
+	if (!arm64_supports_nmi()) {
+		kgdb_smp_call_nmi_hook();
+		return;
+	}
+
+	cpumask_copy(&mask, cpu_online_mask);
+	cpumask_clear_cpu(raw_smp_processor_id(), &mask);
+	if (cpumask_empty(&mask))
+		return;
+
+	arm64_send_nmi(&mask);
+}
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* [PATCH v8 10/10] arm64: ipi_nmi: Fallback to a regular IPI if NMI isn't enabled
  2023-04-19 22:55 ` Douglas Anderson
@ 2023-04-19 22:56   ` Douglas Anderson
  -1 siblings, 0 replies; 46+ messages in thread
From: Douglas Anderson @ 2023-04-19 22:56 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, Mark Rutland
  Cc: ito-yuichi, kgdb-bugreport, Chen-Yu Tsai, Masayoshi Mizuma,
	Peter Zijlstra, Ard Biesheuvel, Rafael J . Wysocki,
	linux-arm-kernel, Stephen Boyd, Lecopzer Chen, Thomas Gleixner,
	linux-perf-users, Douglas Anderson, Masayoshi Mizuma,
	linux-kernel

The current ipi_nmi implementation relies on the arm64 pseudo-NMI
support. This needs to be enabled both in the kernel config with
CONFIG_ARM64_PSEUDO_NMI and on the kernel command line with
"irqchip.gicv3_pseudo_nmi=1".

Let's add a fallback to a regular IPI if the NMI isn't enabled. Using
a regular IPI as the fallback matches what arm32 does all the time,
since there is no NMI there.

The reason for doing this is to make the trigger_all_cpu_backtrace()
class of functions work. While those functions all return a bool
telling the caller whether it should try a fallback upon failure, an
inspection of the callers shows that nearly nobody implements such a
fallback. It's better to at least provide something here.
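
As a rough illustration of the runtime condition involved (a sketch,
not part of the patch): the existing arm64 helper
system_uses_irq_prio_masking() is only true when the Kconfig option,
the command-line switch and a GICv3+ interrupt controller are all
present, which is when the NMI path can actually be used:

  #include <asm/cpufeature.h>

  /* Sketch: true only when pseudo-NMIs are actually usable. */
  static bool example_pseudo_nmi_active(void)
  {
          return system_uses_irq_prio_masking();
  }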

Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
I dunno what people think of this patch. If it's great, we could
actually drop some of the patches out of this series since some of
them are to account for the fact that we might not be able to register
an "ipi_nmi". If it's awful, it could simply be dropped.

Changes in v8:
- "Fallback to a regular IPI if NMI isn't enabled" new for v8

 arch/arm64/kernel/ipi_nmi.c | 31 +++++++++++++++++++++++++------
 1 file changed, 25 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
index 2adaaf1519e5..02868752845c 100644
--- a/arch/arm64/kernel/ipi_nmi.c
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -16,6 +16,7 @@
 
 static struct irq_desc *ipi_nmi_desc __read_mostly;
 static int ipi_nmi_id __read_mostly;
+static bool is_nmi;
 
 bool arm64_supports_nmi(void)
 {
@@ -62,8 +63,12 @@ void dynamic_ipi_setup(void)
 	if (!ipi_nmi_desc)
 		return;
 
-	if (!prepare_percpu_nmi(ipi_nmi_id))
-		enable_percpu_nmi(ipi_nmi_id, IRQ_TYPE_NONE);
+	if (is_nmi) {
+		if (!prepare_percpu_nmi(ipi_nmi_id))
+			enable_percpu_nmi(ipi_nmi_id, IRQ_TYPE_NONE);
+	} else {
+		enable_percpu_irq(ipi_nmi_id, IRQ_TYPE_NONE);
+	}
 }
 
 void dynamic_ipi_teardown(void)
@@ -71,14 +76,28 @@ void dynamic_ipi_teardown(void)
 	if (!ipi_nmi_desc)
 		return;
 
-	disable_percpu_nmi(ipi_nmi_id);
-	teardown_percpu_nmi(ipi_nmi_id);
+	if (is_nmi) {
+		disable_percpu_nmi(ipi_nmi_id);
+		teardown_percpu_nmi(ipi_nmi_id);
+	} else {
+		disable_percpu_irq(ipi_nmi_id);
+	}
 }
 
 void __init set_smp_dynamic_ipi(int ipi)
 {
+	int err;
+
 	if (!request_percpu_nmi(ipi, ipi_nmi_handler, "IPI", &cpu_number)) {
-		ipi_nmi_desc = irq_to_desc(ipi);
-		ipi_nmi_id = ipi;
+		is_nmi = true;
+	} else {
+		err = request_percpu_irq(ipi, ipi_nmi_handler, "IPI", &cpu_number);
+		if (WARN_ON(err))
+			return;
+
+		irq_set_status_flags(ipi, IRQ_HIDDEN);
 	}
+
+	ipi_nmi_desc = irq_to_desc(ipi);
+	ipi_nmi_id = ipi;
 }
-- 
2.40.0.634.g4ca3ef3211-goog


^ permalink raw reply related	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 03/10] arm64: smp: Assign and setup an IPI as NMI
  2023-04-19 22:55   ` Douglas Anderson
@ 2023-04-24 17:53     ` Doug Anderson
  -1 siblings, 0 replies; 46+ messages in thread
From: Doug Anderson @ 2023-04-24 17:53 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, Mark Rutland
  Cc: ito-yuichi, kgdb-bugreport, Chen-Yu Tsai, Masayoshi Mizuma,
	Peter Zijlstra, Ard Biesheuvel, Rafael J . Wysocki,
	linux-arm-kernel, Stephen Boyd, Lecopzer Chen, Thomas Gleixner,
	linux-perf-users, Masayoshi Mizuma, Ben Dooks, Pierre Gondois,
	linux-kernel

Hi,

On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <dianders@chromium.org> wrote:
>
> From: Sumit Garg <sumit.garg@linaro.org>
>
> Assign an unused IPI which can be turned as NMI using ipi_nmi framework.
> Also, invoke corresponding dynamic IPI setup/teardown APIs.
>
> Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> Reviewed-by: Masayoshi Mizuma <m.mizuma@jp.fujitsu.com>
> Tested-by: Chen-Yu Tsai <wens@csie.org>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
>
> Changes in v8:
> - dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param
>
>  arch/arm64/kernel/smp.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
> index 4e8327264255..94ff063527c6 100644
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -43,6 +43,7 @@
>  #include <asm/daifflags.h>
>  #include <asm/kvm_mmu.h>
>  #include <asm/mmu_context.h>
> +#include <asm/nmi.h>
>  #include <asm/numa.h>
>  #include <asm/processor.h>
>  #include <asm/smp_plat.h>
> @@ -938,6 +939,8 @@ static void ipi_setup(int cpu)
>
>         for (i = 0; i < nr_ipi; i++)
>                 enable_percpu_irq(ipi_irq_base + i, 0);
> +
> +       dynamic_ipi_setup();
>  }
>
>  #ifdef CONFIG_HOTPLUG_CPU
> @@ -950,6 +953,8 @@ static void ipi_teardown(int cpu)
>
>         for (i = 0; i < nr_ipi; i++)
>                 disable_percpu_irq(ipi_irq_base + i);
> +
> +       dynamic_ipi_teardown();
>  }
>  #endif
>
> @@ -971,6 +976,9 @@ void __init set_smp_ipi_range(int ipi_base, int n)
>                 irq_set_status_flags(ipi_base + i, IRQ_HIDDEN);
>         }
>
> +       if (n > nr_ipi)
> +               set_smp_dynamic_ipi(ipi_base + nr_ipi);

From thinking about this patch more, I'm guessing that the biggest
objection someone could have is that this uses up the "last" IPI slot
in any systems that are stuck with only 8. Is that the reason that
this patch series stagnated in the past, or was it something else?

If this is truly the concern that people have, it doesn't look like it
would be terribly hard to combine the existing IPI_CPU_STOP and
IPI_CPU_CRASH_STOP. Presumably we could just get rid of the "crash
stop" IPI and have the normal "stop" IPI do the crash if
"waiting_for_crash_ipi" is non-zero. If that's the thing blocking the
series from moving forward then I can add that to the series, or we
could just wait until someone actually needs another IPI. ;-)
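
A rough sketch of that idea (placeholder function name; the
CONFIG_KEXEC_CORE ifdef-ery is omitted), assuming the existing
local_cpu_stop(), ipi_cpu_crash_stop() and waiting_for_crash_ipi in
arch/arm64/kernel/smp.c:

  static void ipi_stop_maybe_crash(unsigned int cpu, struct pt_regs *regs)
  {
          if (atomic_read(&waiting_for_crash_ipi))
                  ipi_cpu_crash_stop(cpu, regs);  /* does not return */
          else
                  local_cpu_stop();
  }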

-Doug

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI
  2023-04-19 22:55 ` Douglas Anderson
@ 2023-05-10 15:28   ` Doug Anderson
  -1 siblings, 0 replies; 46+ messages in thread
From: Doug Anderson @ 2023-05-10 15:28 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, Mark Rutland
  Cc: ito-yuichi, kgdb-bugreport, Chen-Yu Tsai, Masayoshi Mizuma,
	Peter Zijlstra, Ard Biesheuvel, Rafael J . Wysocki,
	linux-arm-kernel, Stephen Boyd, Lecopzer Chen, Thomas Gleixner,
	linux-perf-users, Alexandru Elisei, Andrey Konovalov, Ben Dooks,
	Borislav Petkov, Christophe Leroy, Darrick J. Wong, Dave Hansen,
	David S. Miller, Eric W. Biederman, Frederic Weisbecker,
	Gaosheng Cui, Gautham R. Shenoy, Greg Kroah-Hartman,
	Guilherme G. Piccoli, Guo Ren, H. Peter Anvin, Huacai Chen,
	Ingo Molnar, Ingo Molnar, Jason A. Donenfeld, Jason Wessel,
	Jianmin Lv, Jiaxun Yang, Jinyang He, Joey Gouly, Kees Cook,
	Laurent Dufour, Masahiro Yamada, Masayoshi Mizuma,
	Michael Ellerman, Nicholas Piggin, Paul E. McKenney,
	Philippe Mathieu-Daudé,
	Pierre Gondois, Qing Zhang, Russell King (Oracle),
	Russell King, Thomas Bogendoerfer, Ulf Hansson, WANG Xuerui,
	linux-kernel, linux-mips, linuxppc-dev, loongarch, sparclinux,
	x86

Hi,

On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <dianders@chromium.org> wrote:
>
> This is an attempt to resurrect Sumit's old patch series [1] that
> allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
> also to round up CPUs in kdb/kgdb. The last post from Sumit that I
> could find was v7, so I called this series v8. I haven't copied all of
> his old changelongs here, but you can find them from the link.
>
> Since v7, I have:
> * Addressed the small amount of feedback that was there for v7.
> * Rebased.
> * Added a new patch that prevents us from spamming the logs with idle
>   tasks.
> * Added an extra patch to gracefully fall back to regular IPIs if
>   pseudo-NMIs aren't there.
>
> Since there appear to be a few different patches series related to
> being able to use NMIs to get stack traces of crashed systems, let me
> try to organize them to the best of my understanding:
>
> a) This series. On its own, a) will (among other things) enable stack
>    traces of all running processes with the soft lockup detector if
>    you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
>    its own, a) doesn't give a hard lockup detector.
>
> b) A different recently-posted series [2] that adds a hard lockup
>    detector based on perf. On its own, b) gives a stack crawl of the
>    locked up CPU but no stack crawls of other CPUs (even if they're
>    locked too). Together with a) + b) we get everything (full lockup
>    detect, full ability to get stack crawls).
>
> c) The old Android "buddy" hard lockup detector [3] that I'm
>    considering trying to upstream. If b) lands then I believe c) would
>    be redundant (at least for arm64). c) on its own is really only
>    useful on arm64 for platforms that can print CPU_DBGPCSR somehow
>    (see [4]). a) + c) is roughly as good as a) + b).
>
> [1] https://lore.kernel.org/linux-arm-kernel/1604317487-14543-1-git-send-email-sumit.garg@linaro.org/
> [2] https://lore.kernel.org/linux-arm-kernel/20220903093415.15850-1-lecopzer.chen@mediatek.com/
> [3] https://issuetracker.google.com/172213097
> [4] https://issuetracker.google.com/172213129
>
> Changes in v8:
> - dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param
> - dynamic_ipi_setup() and dynamic_ipi_teardown() no longer take cpu param
> - Add loongarch support, too
> - Removed "#ifdef CONFIG_SMP" since arm64 is always SMP
> - "Tag the arm64 idle functions as __cpuidle" new for v8
> - "Provide a stub kgdb_nmicallback() if !CONFIG_KGDB" new for v8
> - "Fallback to a regular IPI if NMI isn't enabled" new for v8
>
> Douglas Anderson (3):
>   arm64: idle: Tag the arm64 idle functions as __cpuidle
>   kgdb: Provide a stub kgdb_nmicallback() if !CONFIG_KGDB
>   arm64: ipi_nmi: Fallback to a regular IPI if NMI isn't enabled
>
> Sumit Garg (7):
>   arm64: Add framework to turn IPI as NMI
>   irqchip/gic-v3: Enable support for SGIs to act as NMIs
>   arm64: smp: Assign and setup an IPI as NMI
>   nmi: backtrace: Allow runtime arch specific override
>   arm64: ipi_nmi: Add support for NMI backtrace
>   kgdb: Expose default CPUs roundup fallback mechanism
>   arm64: kgdb: Roundup cpus using IPI as NMI
>
>  arch/arm/include/asm/irq.h       |   2 +-
>  arch/arm/kernel/smp.c            |   3 +-
>  arch/arm64/include/asm/irq.h     |   4 ++
>  arch/arm64/include/asm/nmi.h     |  17 +++++
>  arch/arm64/kernel/Makefile       |   2 +-
>  arch/arm64/kernel/idle.c         |   4 +-
>  arch/arm64/kernel/ipi_nmi.c      | 103 +++++++++++++++++++++++++++++++
>  arch/arm64/kernel/kgdb.c         |  18 ++++++
>  arch/arm64/kernel/smp.c          |   8 +++
>  arch/loongarch/include/asm/irq.h |   2 +-
>  arch/loongarch/kernel/process.c  |   3 +-
>  arch/mips/include/asm/irq.h      |   2 +-
>  arch/mips/kernel/process.c       |   3 +-
>  arch/powerpc/include/asm/nmi.h   |   2 +-
>  arch/powerpc/kernel/stacktrace.c |   3 +-
>  arch/sparc/include/asm/irq_64.h  |   2 +-
>  arch/sparc/kernel/process_64.c   |   4 +-
>  arch/x86/include/asm/irq.h       |   2 +-
>  arch/x86/kernel/apic/hw_nmi.c    |   3 +-
>  drivers/irqchip/irq-gic-v3.c     |  29 ++++++---
>  include/linux/kgdb.h             |  13 ++++
>  include/linux/nmi.h              |  12 ++--
>  kernel/debug/debug_core.c        |   8 ++-
>  23 files changed, 217 insertions(+), 32 deletions(-)

It's been 3 weeks and I haven't heard a peep on this series. That
means nobody has any objections and it's all good to land, right?
Right? :-P

Seriously, though, I really think we should figure out how to get this
landed. There's obviously interest from several different parties and
I'm chomping at the bit waiting to spin this series according to your
wishes. I also don't think there's anything super scary/ugly here. IMO
the ideal situation is that folks are OK with the idea presented in
the last patch in the series ("arm64: ipi_nmi: Fallback to a regular
IPI if NMI isn't enabled") and then I can re-spin this series to be
_much_ simpler. That being said, I'm also OK with dropping that patch
and starting the discussion for how to land the rest of the patches in
the series.

Please let me know!

-Doug

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI
  2023-05-10 15:28   ` Doug Anderson
@ 2023-05-10 16:30     ` Mark Rutland
  -1 siblings, 0 replies; 46+ messages in thread
From: Mark Rutland @ 2023-05-10 16:30 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, ito-yuichi, kgdb-bugreport, Chen-Yu Tsai,
	Masayoshi Mizuma, Peter Zijlstra, Ard Biesheuvel,
	Rafael J . Wysocki, linux-arm-kernel, Stephen Boyd,
	Lecopzer Chen, Thomas Gleixner, linux-perf-users,
	Alexandru Elisei, Andrey Konovalov, Ben Dooks, Borislav Petkov,
	Christophe Leroy, Darrick J. Wong, Dave Hansen, David S. Miller,
	Eric W. Biederman, Frederic Weisbecker, Gaosheng Cui,
	Gautham R. Shenoy, Greg Kroah-Hartman, Guilherme G. Piccoli,
	Guo Ren, H. Peter Anvin, Huacai Chen, Ingo Molnar, Ingo Molnar,
	Jason A. Donenfeld, Jason Wessel, Jianmin Lv, Jiaxun Yang,
	Jinyang He, Joey Gouly, Kees Cook, Laurent Dufour,
	Masahiro Yamada, Masayoshi Mizuma, Michael Ellerman,
	Nicholas Piggin, Paul E. McKenney, Philippe Mathieu-Daudé,
	Pierre Gondois, Qing Zhang, Russell King (Oracle),
	Russell King, Thomas Bogendoerfer, Ulf Hansson, WANG Xuerui,
	linux-kernel, linux-mips, linuxppc-dev, loongarch, sparclinux,
	x86

On Wed, May 10, 2023 at 08:28:17AM -0700, Doug Anderson wrote:
> Hi,

Hi Doug,

> On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <dianders@chromium.org> wrote:
> > This is an attempt to resurrect Sumit's old patch series [1] that
> > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
> > also to round up CPUs in kdb/kgdb. The last post from Sumit that I
> > could find was v7, so I called this series v8. I haven't copied all of
> > his old changelongs here, but you can find them from the link.
> >
> > Since v7, I have:
> > * Addressed the small amount of feedback that was there for v7.
> > * Rebased.
> > * Added a new patch that prevents us from spamming the logs with idle
> >   tasks.
> > * Added an extra patch to gracefully fall back to regular IPIs if
> >   pseudo-NMIs aren't there.
> >
> > Since there appear to be a few different patches series related to
> > being able to use NMIs to get stack traces of crashed systems, let me
> > try to organize them to the best of my understanding:
> >
> > a) This series. On its own, a) will (among other things) enable stack
> >    traces of all running processes with the soft lockup detector if
> >    you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
> >    its own, a) doesn't give a hard lockup detector.
> >
> > b) A different recently-posted series [2] that adds a hard lockup
> >    detector based on perf. On its own, b) gives a stack crawl of the
> >    locked up CPU but no stack crawls of other CPUs (even if they're
> >    locked too). Together with a) + b) we get everything (full lockup
> >    detect, full ability to get stack crawls).
> >
> > c) The old Android "buddy" hard lockup detector [3] that I'm
> >    considering trying to upstream. If b) lands then I believe c) would
> >    be redundant (at least for arm64). c) on its own is really only
> >    useful on arm64 for platforms that can print CPU_DBGPCSR somehow
> >    (see [4]). a) + c) is roughly as good as a) + b).

> It's been 3 weeks and I haven't heard a peep on this series. That
> means nobody has any objections and it's all good to land, right?
> Right? :-P

FWIW, there are still longstanding soundness issues in the arm64 pseudo-NMI
support (and fixing that requires an overhaul of our DAIF / IRQ flag
management, which I've been chipping away at for a number of releases), so I
hadn't looked at this in detail yet because the foundations are still somewhat
dodgy.

I appreciate that this has been around for a while, and it's on my queue to
look at.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI
  2023-05-10 16:30     ` Mark Rutland
@ 2023-05-10 16:42       ` Doug Anderson
  -1 siblings, 0 replies; 46+ messages in thread
From: Doug Anderson @ 2023-05-10 16:42 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, ito-yuichi, kgdb-bugreport, Chen-Yu Tsai,
	Masayoshi Mizuma, Peter Zijlstra, Ard Biesheuvel,
	Rafael J . Wysocki, linux-arm-kernel, Stephen Boyd,
	Lecopzer Chen, Thomas Gleixner, linux-perf-users,
	Alexandru Elisei, Andrey Konovalov, Ben Dooks, Borislav Petkov,
	Christophe Leroy, Darrick J. Wong, Dave Hansen, David S. Miller,
	Eric W. Biederman, Frederic Weisbecker, Gaosheng Cui,
	Gautham R. Shenoy, Greg Kroah-Hartman, Guilherme G. Piccoli,
	Guo Ren, H. Peter Anvin, Huacai Chen, Ingo Molnar, Ingo Molnar,
	Jason A. Donenfeld, Jason Wessel, Jianmin Lv, Jiaxun Yang,
	Jinyang He, Joey Gouly, Kees Cook, Laurent Dufour,
	Masahiro Yamada, Masayoshi Mizuma, Michael Ellerman,
	Nicholas Piggin, Paul E. McKenney, Philippe Mathieu-Daudé,
	Pierre Gondois, Qing Zhang, Russell King (Oracle),
	Russell King, Thomas Bogendoerfer, Ulf Hansson, WANG Xuerui,
	linux-kernel, linux-mips, linuxppc-dev, loongarch, sparclinux,
	x86

Hi,

On Wed, May 10, 2023 at 9:30 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Wed, May 10, 2023 at 08:28:17AM -0700, Doug Anderson wrote:
> > Hi,
>
> Hi Doug,
>
> > On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > This is an attempt to resurrect Sumit's old patch series [1] that
> > > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
> > > also to round up CPUs in kdb/kgdb. The last post from Sumit that I
> > > could find was v7, so I called this series v8. I haven't copied all of
> > > his old changelongs here, but you can find them from the link.
> > >
> > > Since v7, I have:
> > > * Addressed the small amount of feedback that was there for v7.
> > > * Rebased.
> > > * Added a new patch that prevents us from spamming the logs with idle
> > >   tasks.
> > > * Added an extra patch to gracefully fall back to regular IPIs if
> > >   pseudo-NMIs aren't there.
> > >
> > > Since there appear to be a few different patches series related to
> > > being able to use NMIs to get stack traces of crashed systems, let me
> > > try to organize them to the best of my understanding:
> > >
> > > a) This series. On its own, a) will (among other things) enable stack
> > >    traces of all running processes with the soft lockup detector if
> > >    you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
> > >    its own, a) doesn't give a hard lockup detector.
> > >
> > > b) A different recently-posted series [2] that adds a hard lockup
> > >    detector based on perf. On its own, b) gives a stack crawl of the
> > >    locked up CPU but no stack crawls of other CPUs (even if they're
> > >    locked too). Together with a) + b) we get everything (full lockup
> > >    detect, full ability to get stack crawls).
> > >
> > > c) The old Android "buddy" hard lockup detector [3] that I'm
> > >    considering trying to upstream. If b) lands then I believe c) would
> > >    be redundant (at least for arm64). c) on its own is really only
> > >    useful on arm64 for platforms that can print CPU_DBGPCSR somehow
> > >    (see [4]). a) + c) is roughly as good as a) + b).
>
> > It's been 3 weeks and I haven't heard a peep on this series. That
> > means nobody has any objections and it's all good to land, right?
> > Right? :-P
>
> FWIW, there are still longstanding soundness issues in the arm64 pseudo-NMI
> support (and fixing that requires an overhaul of our DAIF / IRQ flag
> management, which I've been chipping away at for a number of releases), so I
> hadn't looked at this in detail yet because the foundations are still somewhat
> dodgy.
>
> I appreciate that this has been around for a while, and it's on my queue to
> look at.

Ah, thanks for the heads up! We've been thinking about turning this on
in production in ChromeOS because it will help us track down a whole
class of field-generated crash reports that are otherwise opaque to
us. It sounds as if maybe that's not a good idea quite yet? Do you
have any idea of how much farther along this needs to go? ...of
course, we've also run into issues with Mediatek devices because they
don't save/restore GICR registers properly [1]. In theory, we might be
able to work around that in the kernel.

In any case, even if there are bugs that would prevent turning this on
for production, it still seems like we could land this series. It
simply wouldn't do anything until someone turned on pseudo-NMIs, which
wouldn't happen until the kinks are worked out.

...actually, I guess I should say that if all the patches in the
current series land then it _would_ still do something, even without
pseudo-NMI. Assuming the last patch looks OK, it would at least start
falling back to using regular IPIs to do backtraces. That wouldn't get
backtraces on hard-locked-up CPUs, but it would be better than what we
have today, where we don't get any backtraces. This would get arm64 on
par with arm32...
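
Purely to illustrate that fallback idea, here's a hypothetical sketch
(not the actual patch #10 from this series; ipi_nmi_supported(),
arm64_send_nmi_ipi() and arm64_send_irq_ipi() are made-up names, and
only nmi_trigger_cpumask_backtrace() is the real generic helper):

  bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
  {
          /* Prefer the pseudo-NMI IPI so even hard-locked-up CPUs respond. */
          if (ipi_nmi_supported())
                  nmi_trigger_cpumask_backtrace(mask, exclude_self,
                                                arm64_send_nmi_ipi);
          else
                  /* A plain IPI still covers any CPU with IRQs enabled. */
                  nmi_trigger_cpumask_backtrace(mask, exclude_self,
                                                arm64_send_irq_ipi);
          return true;
  }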

[1] https://issuetracker.google.com/281831288

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 06/10] arm64: idle: Tag the arm64 idle functions as __cpuidle
  2023-04-19 22:56   ` Douglas Anderson
  (?)
@ 2023-05-10 16:43   ` Mark Rutland
  2023-05-10 21:13     ` Doug Anderson
  -1 siblings, 1 reply; 46+ messages in thread
From: Mark Rutland @ 2023-05-10 16:43 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, ito-yuichi, kgdb-bugreport, Chen-Yu Tsai,
	Masayoshi Mizuma, Peter Zijlstra, Ard Biesheuvel,
	Rafael J . Wysocki, linux-arm-kernel, Stephen Boyd,
	Lecopzer Chen, Thomas Gleixner, linux-perf-users,
	Frederic Weisbecker, Gautham R. Shenoy, Ingo Molnar,
	linux-kernel

On Wed, Apr 19, 2023 at 03:56:00PM -0700, Douglas Anderson wrote:
> As per the (somewhat recent) comment before the definition of
> `__cpuidle`, the tag is like `noinstr` but also marks a function so it
> can be identified by cpu_in_idle(). Let'a add this.
> 
> After doing this then when we dump stack traces of all processors
> using nmi_cpu_backtrace() then instead of getting useless backtraces
> we get things like:
> 
>   NMI backtrace for cpu N skipped: idling at cpu_do_idle+0x94/0x98

As a heads-up, this is only going to work in the trivial case where a CPU is
within the default cpu_do_idle(), and not for anything using PSCI
cpu_suspend() (which I'd *really* hope is the common case).

That doesn't get inlined, and the invocation is shared with other SMCCC users,
so we probably need more work there if culling idle backtraces is important.
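
For context, the culling here is just a PC-range check against the
.cpuidle.text section that the __cpuidle tag populates. Roughly the
following, paraphrased from kernel/sched/idle.c and lib/nmi_backtrace.c
rather than copied verbatim (the report_idle_skip() wrapper name is
made up):

  /* kernel/sched/idle.c: __cpuidle code is placed in .cpuidle.text */
  bool cpu_in_idle(unsigned long pc)
  {
          return pc >= (unsigned long)__cpuidle_text_start &&
                  pc < (unsigned long)__cpuidle_text_end;
  }

  /* lib/nmi_backtrace.c only skips the dump when the interrupted PC is
   * inside that section, so PSCI/SMCCC code (neither tagged nor inlined
   * into a tagged caller) is never culled. */
  static bool report_idle_skip(struct pt_regs *regs, int cpu)
  {
          if (!regs || !cpu_in_idle(instruction_pointer(regs)))
                  return false;
          pr_warn("NMI backtrace for cpu %d skipped: idling at %pS\n",
                  cpu, (void *)instruction_pointer(regs));
          return true;
  }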

I'm not averse to this change itself.

Mark.

> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> 
> Changes in v8:
> - "Tag the arm64 idle functions as __cpuidle" new for v8
> 
>  arch/arm64/kernel/idle.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kernel/idle.c b/arch/arm64/kernel/idle.c
> index c1125753fe9b..05cfb347ec26 100644
> --- a/arch/arm64/kernel/idle.c
> +++ b/arch/arm64/kernel/idle.c
> @@ -20,7 +20,7 @@
>   *	ensure that interrupts are not masked at the PMR (because the core will
>   *	not wake up if we block the wake up signal in the interrupt controller).
>   */
> -void noinstr cpu_do_idle(void)
> +void __cpuidle cpu_do_idle(void)
>  {
>  	struct arm_cpuidle_irq_context context;
>  
> @@ -35,7 +35,7 @@ void noinstr cpu_do_idle(void)
>  /*
>   * This is our default idle handler.
>   */
> -void noinstr arch_cpu_idle(void)
> +void __cpuidle arch_cpu_idle(void)
>  {
>  	/*
>  	 * This should do all the clock switching and wait for interrupt
> -- 
> 2.40.0.634.g4ca3ef3211-goog
> 

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 06/10] arm64: idle: Tag the arm64 idle functions as __cpuidle
  2023-05-10 16:43   ` Mark Rutland
@ 2023-05-10 21:13     ` Doug Anderson
  0 siblings, 0 replies; 46+ messages in thread
From: Doug Anderson @ 2023-05-10 21:13 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, ito-yuichi, kgdb-bugreport, Chen-Yu Tsai,
	Masayoshi Mizuma, Peter Zijlstra, Ard Biesheuvel,
	Rafael J . Wysocki, linux-arm-kernel, Stephen Boyd,
	Lecopzer Chen, Thomas Gleixner, linux-perf-users,
	Frederic Weisbecker, Gautham R. Shenoy, Ingo Molnar,
	linux-kernel

Hi,

On Wed, May 10, 2023 at 9:43 AM Mark Rutland <mark.rutland@arm.com> wrote:
>
> On Wed, Apr 19, 2023 at 03:56:00PM -0700, Douglas Anderson wrote:
> > As per the (somewhat recent) comment before the definition of
> > `__cpuidle`, the tag is like `noinstr` but also marks a function so it
> > can be identified by cpu_in_idle(). Let'a add this.
> >
> > After doing this then when we dump stack traces of all processors
> > using nmi_cpu_backtrace() then instead of getting useless backtraces
> > we get things like:
> >
> >   NMI backtrace for cpu N skipped: idling at cpu_do_idle+0x94/0x98
>
> As a heads-up, this is only going to work in the trivial case where a CPU is
> within the default cpu_do_idle(), and not for anything using PSCI
> cpu_suspend() (which I'd *really* hope is the common case).

Thanks for the review and the heads-up! Right. It only catches shallow
idle. Specifically, this was the stack trace when it was "caught" on a
sc7180-trogdor device:

 cpu_do_idle+0x94/0x98
 psci_enter_idle_state+0x58/0x70
 cpuidle_enter_state+0xb8/0x260
 cpuidle_enter+0x44/0x5c
 do_idle+0x188/0x30c
 ...

I checked in kgdb and saw that "psci_enter_idle_state()" had "idx" as
0, which makes sense since __CPU_PM_CPU_IDLE_ENTER will directly call
cpu_do_idle() when idx is 0.
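
For reference, the relevant part of __CPU_PM_CPU_IDLE_ENTER() from
include/linux/cpuidle.h, paraphrased rather than copied verbatim:

  #define __CPU_PM_CPU_IDLE_ENTER(low_level_idle_enter, idx, state, is_retention) \
  ({                                                                  \
          int __ret = 0;                                              \
                                                                      \
          if (!idx) {                                                 \
                  cpu_do_idle();  /* idx 0: plain WFI, __cpuidle code */ \
                  return idx;                                         \
          }                                                           \
                                                                      \
          /* idx > 0 goes through cpu_pm_enter() and the PSCI/SMCCC   \
           * firmware call, none of which is in .cpuidle.text, hence  \
           * backtraces like the deeper-idle one below aren't culled. */ \
          ...                                                         \
  })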

I agree that it doesn't catch every case. Certainly it's not too hard
to see a CPU here:

 gic_cpu_sys_reg_init+0x1f8/0x314
 gic_cpu_pm_notifier+0x40/0x78
 raw_notifier_call_chain+0x5c/0x134
 cpu_pm_notify+0x38/0x64
 cpu_pm_exit+0x20/0x2c
 psci_enter_idle_state+0x48/0x70
 cpuidle_enter_state+0xb8/0x260
 cpuidle_enter+0x44/0x5c
 do_idle+0x188/0x30c
 ...

...and kgdb showed "idx" was 3.

That being said, catching some of the cases is better than catching
none of the cases. ;-)

In reality, I've seen cases where it catches most CPUs. For instance,
running soon after bootup on my sc7180-trogdor device:

echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT

...and then having the "buddy" hardlockup detector [1] catch the
crash, the idle check caught all of the CPUs other than the one that
was locked up and the one that detected it. Specifically:

NMI backtrace for cpu 2 skipped: idling at cpu_do_idle+0x94/0x98
NMI backtrace for cpu 1 skipped: idling at cpu_do_idle+0x94/0x98
NMI backtrace for cpu 0 skipped: idling at cpu_do_idle+0x94/0x98
NMI backtrace for cpu 3 skipped: idling at cpu_do_idle+0x94/0x98
NMI backtrace for cpu 4 skipped: idling at cpu_do_idle+0x94/0x98
NMI backtrace for cpu 7 skipped: idling at cpu_do_idle+0x94/0x98

I haven't analyzed it, but I guess it's possible that when the buddy
hardlockup detector triggers, the other CPUs are more likely to be in
a shallow idle state? Certainly I seem to catch a lot more CPUs in a
shallow idle in buddy lockup reports...


[1] https://lore.kernel.org/r/20230504221349.1535669-1-dianders@chromium.org


> That doesn't get inlined, and the invocation is shared with other SMCCC users,
> so we probably need more work there if culling idle backtraces is important.

At this point I'd say that we should just take the low-hanging fruit
(this patch). If someone later wants to try to do better, they can.


> I'm not averse to this change itself.

Any chance you'd be willing to give any tags to it? :-P Do you need me
to add any of the above caveats to the commit message?

I also certainly wouldn't object to this patch landing even if others
aren't ready. It has no dependencies on any other patches and just
makes the debug messages prettier in some cases.

-Doug

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 08/10] kgdb: Provide a stub kgdb_nmicallback() if !CONFIG_KGDB
  2023-04-19 22:56   ` Douglas Anderson
  (?)
@ 2023-05-11 14:34   ` Doug Anderson
  2023-05-12 13:52     ` Daniel Thompson
  -1 siblings, 1 reply; 46+ messages in thread
From: Doug Anderson @ 2023-05-11 14:34 UTC (permalink / raw)
  To: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, Mark Rutland
  Cc: ito-yuichi, kgdb-bugreport, Chen-Yu Tsai, Masayoshi Mizuma,
	Peter Zijlstra, Ard Biesheuvel, Rafael J . Wysocki,
	linux-arm-kernel, Stephen Boyd, Lecopzer Chen, Thomas Gleixner,
	linux-perf-users, Jason Wessel, linux-kernel

Hi,

On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <dianders@chromium.org> wrote:
>
> To save architectures from needing to wrap the call in #ifdefs, add a
> stub no-op version of kgdb_nmicallback(), which returns 1 if it didn't
> handle anything.
>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
>
> Changes in v8:
> - "Provide a stub kgdb_nmicallback() if !CONFIG_KGDB" new for v8
>
>  include/linux/kgdb.h | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> index 87713bd390f3..9ce628ee47cc 100644
> --- a/include/linux/kgdb.h
> +++ b/include/linux/kgdb.h
> @@ -377,5 +377,6 @@ extern void kgdb_free_init_mem(void);
>  #define dbg_late_init()
>  static inline void kgdb_panic(const char *msg) {}
>  static inline void kgdb_free_init_mem(void) { }
> +static int kgdb_nmicallback(int cpu, void *regs) { return 1; }

FWIW: I just realized that the above needs an "inline" to keep the
compiler from complaining. I'm still hoping for more feedback on the
series, but I plan to fix that in the next spin.
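
(i.e. presumably just:

  static inline int kgdb_nmicallback(int cpu, void *regs) { return 1; }

...matching the other !CONFIG_KGDB stubs right above it in the hunk.)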

-Doug

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 07/10] kgdb: Expose default CPUs roundup fallback mechanism
  2023-04-19 22:56   ` Douglas Anderson
  (?)
@ 2023-05-12 13:48   ` Daniel Thompson
  2023-05-15 23:21       ` Doug Anderson
  -1 siblings, 1 reply; 46+ messages in thread
From: Daniel Thompson @ 2023-05-12 13:48 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Catalin Marinas, Will Deacon, Sumit Garg, Marc Zyngier,
	Mark Rutland, ito-yuichi, kgdb-bugreport, Chen-Yu Tsai,
	Masayoshi Mizuma, Peter Zijlstra, Ard Biesheuvel,
	Rafael J . Wysocki, linux-arm-kernel, Stephen Boyd,
	Lecopzer Chen, Thomas Gleixner, linux-perf-users, Jason Wessel,
	linux-kernel

On Wed, Apr 19, 2023 at 03:56:01PM -0700, Douglas Anderson wrote:
> From: Sumit Garg <sumit.garg@linaro.org>
>
> Add a new API kgdb_smp_call_nmi_hook() to expose default CPUs roundup
> mechanism to a particular archichecture as a runtime fallback if it
> detects to not support NMI roundup.
>
> Currently such an architecture example is arm64 supporting pseudo NMIs
> feature which is only available on platforms which have support for GICv3
> or later version.
>
> Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> Tested-by: Chen-Yu Tsai <wens@csie.org>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
>
> (no changes since v1)
>
>  include/linux/kgdb.h      | 12 ++++++++++++
>  kernel/debug/debug_core.c |  8 +++++++-
>  2 files changed, 19 insertions(+), 1 deletion(-)
>
> diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> index 258cdde8d356..87713bd390f3 100644
> --- a/include/linux/kgdb.h
> +++ b/include/linux/kgdb.h
> @@ -199,6 +199,18 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
>
>  extern void kgdb_call_nmi_hook(void *ignored);
>
> +/**
> + *	kgdb_smp_call_nmi_hook - Provide default fallback mechanism to
> + *				 round-up CPUs
> + *
> + *	If you're using the default implementation of kgdb_roundup_cpus()
> + *	this function will be called.  And if an arch detects at runtime to
> + *	not support NMI based roundup then it can fallback to default
> + *	mechanism using this API.
> + */
> +
> +extern void kgdb_smp_call_nmi_hook(void);

Concept looks sensible but this is a terrible name for a command to
round up the CPUs using smp_call... functions. Whilst it is true that
kgdb_roundup_cpus() does use kgdb_call_nmi_hook() internally, that
doesn't mean we should name functions after it. They should be named
after what they do, not how they do it.

Something more like kgdb_roundup_cpus_with_smp_call() would be a much
better name.


Daniel.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 08/10] kgdb: Provide a stub kgdb_nmicallback() if !CONFIG_KGDB
  2023-05-11 14:34   ` Doug Anderson
@ 2023-05-12 13:52     ` Daniel Thompson
  0 siblings, 0 replies; 46+ messages in thread
From: Daniel Thompson @ 2023-05-12 13:52 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Catalin Marinas, Will Deacon, Sumit Garg, Marc Zyngier,
	Mark Rutland, ito-yuichi, kgdb-bugreport, Chen-Yu Tsai,
	Masayoshi Mizuma, Peter Zijlstra, Ard Biesheuvel,
	Rafael J . Wysocki, linux-arm-kernel, Stephen Boyd,
	Lecopzer Chen, Thomas Gleixner, linux-perf-users, Jason Wessel,
	linux-kernel

On Thu, May 11, 2023 at 07:34:30AM -0700, Doug Anderson wrote:
> Hi,
>
> On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <dianders@chromium.org> wrote:
> >
> > To save architectures from needing to wrap the call in #ifdefs, add a
> > stub no-op version of kgdb_nmicallback(), which returns 1 if it didn't
> > handle anything.
> >
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > ---
> >
> > Changes in v8:
> > - "Provide a stub kgdb_nmicallback() if !CONFIG_KGDB" new for v8
> >
> >  include/linux/kgdb.h | 1 +
> >  1 file changed, 1 insertion(+)
> >
> > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> > index 87713bd390f3..9ce628ee47cc 100644
> > --- a/include/linux/kgdb.h
> > +++ b/include/linux/kgdb.h
> > @@ -377,5 +377,6 @@ extern void kgdb_free_init_mem(void);
> >  #define dbg_late_init()
> >  static inline void kgdb_panic(const char *msg) {}
> >  static inline void kgdb_free_init_mem(void) { }
> > +static int kgdb_nmicallback(int cpu, void *regs) { return 1; }
>
> FWIW: I just realized that the above needs an "inline" to make the
> compiler not complain. I'm still hoping for more feedback on the
> series, but I'll plan to fix that in the next spin.

On the next spin feel free to add:
Reviewed-by: Daniel Thompson <daniel.thompson@linaro.org>


Daniel.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 09/10] arm64: kgdb: Roundup cpus using IPI as NMI
  2023-04-19 22:56   ` Douglas Anderson
  (?)
@ 2023-05-12 14:00   ` Daniel Thompson
  2023-05-15 23:11       ` Doug Anderson
  -1 siblings, 1 reply; 46+ messages in thread
From: Daniel Thompson @ 2023-05-12 14:00 UTC (permalink / raw)
  To: Douglas Anderson
  Cc: Catalin Marinas, Will Deacon, Sumit Garg, Marc Zyngier,
	Mark Rutland, ito-yuichi, kgdb-bugreport, Chen-Yu Tsai,
	Masayoshi Mizuma, Peter Zijlstra, Ard Biesheuvel,
	Rafael J . Wysocki, linux-arm-kernel, Stephen Boyd,
	Lecopzer Chen, Thomas Gleixner, linux-perf-users,
	Alexandru Elisei, Masayoshi Mizuma, linux-kernel

On Wed, Apr 19, 2023 at 03:56:03PM -0700, Douglas Anderson wrote:
> From: Sumit Garg <sumit.garg@linaro.org>
>
> arm64 platforms with GICv3 or later supports pseudo NMIs which can be
> leveraged to roundup CPUs which are stuck in hard lockup state with
> interrupts disabled that wouldn't be possible with a normal IPI.
>
> So instead switch to roundup CPUs using IPI turned as NMI. And in
> case a particular arm64 platform doesn't supports pseudo NMIs,
> it will switch back to default kgdb CPUs roundup mechanism.
>
> Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> Tested-by: Chen-Yu Tsai <wens@csie.org>
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
>
> (no changes since v1)
>
>  arch/arm64/kernel/ipi_nmi.c |  5 +++++
>  arch/arm64/kernel/kgdb.c    | 18 ++++++++++++++++++
>  2 files changed, 23 insertions(+)
>
> diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
> index c592e92b8cbf..2adaaf1519e5 100644
> --- a/arch/arm64/kernel/ipi_nmi.c
> +++ b/arch/arm64/kernel/ipi_nmi.c
> @@ -8,6 +8,7 @@
>
>  #include <linux/interrupt.h>
>  #include <linux/irq.h>
> +#include <linux/kgdb.h>
>  #include <linux/nmi.h>
>  #include <linux/smp.h>
>
> @@ -45,10 +46,14 @@ bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
>  static irqreturn_t ipi_nmi_handler(int irq, void *data)
>  {
>  	irqreturn_t ret = IRQ_NONE;
> +	unsigned int cpu = smp_processor_id();

Does this play nice with CONFIG_DEBUG_PREEMPT? I may have missed
something about the NMI entry, but a quick scan of the arm64
arch_irqs_disabled() suggests that debug_smp_processor_id() will issue
warnings at this point...

Other than that, I didn't see anything I don't like here.


Daniel.

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 09/10] arm64: kgdb: Roundup cpus using IPI as NMI
  2023-05-12 14:00   ` Daniel Thompson
@ 2023-05-15 23:11       ` Doug Anderson
  0 siblings, 0 replies; 46+ messages in thread
From: Doug Anderson @ 2023-05-15 23:11 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Catalin Marinas, Will Deacon, Sumit Garg, Marc Zyngier,
	Mark Rutland, ito-yuichi, kgdb-bugreport, Chen-Yu Tsai,
	Masayoshi Mizuma, Peter Zijlstra, Ard Biesheuvel,
	Rafael J . Wysocki, linux-arm-kernel, Stephen Boyd,
	Lecopzer Chen, Thomas Gleixner, linux-perf-users,
	Alexandru Elisei, Masayoshi Mizuma, linux-kernel

Hi,

On Fri, May 12, 2023 at 7:00 AM Daniel Thompson
<daniel.thompson@linaro.org> wrote:
>
> On Wed, Apr 19, 2023 at 03:56:03PM -0700, Douglas Anderson wrote:
> > From: Sumit Garg <sumit.garg@linaro.org>
> >
> > arm64 platforms with GICv3 or later supports pseudo NMIs which can be
> > leveraged to roundup CPUs which are stuck in hard lockup state with
> > interrupts disabled that wouldn't be possible with a normal IPI.
> >
> > So instead switch to roundup CPUs using IPI turned as NMI. And in
> > case a particular arm64 platform doesn't supports pseudo NMIs,
> > it will switch back to default kgdb CPUs roundup mechanism.
> >
> > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > Tested-by: Chen-Yu Tsai <wens@csie.org>
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > ---
> >
> > (no changes since v1)
> >
> >  arch/arm64/kernel/ipi_nmi.c |  5 +++++
> >  arch/arm64/kernel/kgdb.c    | 18 ++++++++++++++++++
> >  2 files changed, 23 insertions(+)
> >
> > diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
> > index c592e92b8cbf..2adaaf1519e5 100644
> > --- a/arch/arm64/kernel/ipi_nmi.c
> > +++ b/arch/arm64/kernel/ipi_nmi.c
> > @@ -8,6 +8,7 @@
> >
> >  #include <linux/interrupt.h>
> >  #include <linux/irq.h>
> > +#include <linux/kgdb.h>
> >  #include <linux/nmi.h>
> >  #include <linux/smp.h>
> >
> > @@ -45,10 +46,14 @@ bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
> >  static irqreturn_t ipi_nmi_handler(int irq, void *data)
> >  {
> >       irqreturn_t ret = IRQ_NONE;
> > +     unsigned int cpu = smp_processor_id();
>
> Does this play nice with CONFIG_DEBUG_PREEMPT? I may have missed
> something about the NMI entry but a quick scan of the arm64
> arch_irq_disabled() suggests that debug_smp_processor_id() will issue
> warnings at this point...
>
> Other than I didn't see anything I don't like here.

Good question. It seems to, at least on the sc7180-trogdor-based
system I tested.

Just to confirm, I printed out the values of some of the stuff that's
checked. When this function was called, I saw:

irqs_disabled() => true
raw_local_save_flags() => 0x000000f0
__irqflags_uses_pmr() => 1

The "__irqflags_uses_pmr" is the thing that gets set when we try to
enable pseudo-NMIs and they're actually there. That causes us to call
__pmr_irqs_disabled_flags() which will compare the flags (0xf0) to
GIC_PRIO_IRQON (0xe0). If flags is not the same as GIC_PRIO_IRQON then
interrupts are disabled.

...so, assuming I understood everything, I think we're OK.

I also tried to see what happened with the whole "fallback to use
regular IPIs if we don't have pseudo-NMIs enabled" (AKA patch #10 in
this series). In that case, I had:

irqs_disabled() => true
raw_local_save_flags() => 0x000000c0
__irqflags_uses_pmr() => 0

...in this case we end up in __daif_irqs_disabled_flags(). That checks
whether the flags (0xc0) have the "I bit" (0x80) set. It is set, so
interrupts are disabled.
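
For reference, here's an approximation of both checks, paraphrased
from arch/arm64/include/asm/irqflags.h and lib/smp_processor_id.c
rather than copied verbatim:

  /* PMR world: anything other than GIC_PRIO_IRQON (0xe0) counts as
   * masked, so the observed 0xf0 reports "disabled". */
  static __always_inline bool __pmr_irqs_disabled_flags(unsigned long flags)
  {
          return flags != GIC_PRIO_IRQON;
  }

  /* DAIF world: the I bit (0x80) being set counts as masked, so the
   * observed 0xc0 also reports "disabled". */
  static __always_inline bool __daif_irqs_disabled_flags(unsigned long flags)
  {
          return flags & PSR_I_BIT;
  }

  /* check_preemption_disabled() in lib/smp_processor_id.c bails out
   * before warning when irqs_disabled() is true, which is why
   * CONFIG_DEBUG_PREEMPT stays quiet in both cases above. */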

-Doug

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 07/10] kgdb: Expose default CPUs roundup fallback mechanism
  2023-05-12 13:48   ` Daniel Thompson
@ 2023-05-15 23:21       ` Doug Anderson
  0 siblings, 0 replies; 46+ messages in thread
From: Doug Anderson @ 2023-05-15 23:21 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Catalin Marinas, Will Deacon, Sumit Garg, Marc Zyngier,
	Mark Rutland, ito-yuichi, kgdb-bugreport, Chen-Yu Tsai,
	Masayoshi Mizuma, Peter Zijlstra, Ard Biesheuvel,
	Rafael J . Wysocki, linux-arm-kernel, Stephen Boyd,
	Lecopzer Chen, Thomas Gleixner, linux-perf-users, Jason Wessel,
	linux-kernel

Hi,

On Fri, May 12, 2023 at 6:49 AM Daniel Thompson
<daniel.thompson@linaro.org> wrote:
>
> On Wed, Apr 19, 2023 at 03:56:01PM -0700, Douglas Anderson wrote:
> > From: Sumit Garg <sumit.garg@linaro.org>
> >
> > Add a new API kgdb_smp_call_nmi_hook() to expose default CPUs roundup
> > mechanism to a particular archichecture as a runtime fallback if it
> > detects to not support NMI roundup.
> >
> > Currently such an architecture example is arm64 supporting pseudo NMIs
> > feature which is only available on platforms which have support for GICv3
> > or later version.
> >
> > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > Tested-by: Chen-Yu Tsai <wens@csie.org>
> > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > ---
> >
> > (no changes since v1)
> >
> >  include/linux/kgdb.h      | 12 ++++++++++++
> >  kernel/debug/debug_core.c |  8 +++++++-
> >  2 files changed, 19 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> > index 258cdde8d356..87713bd390f3 100644
> > --- a/include/linux/kgdb.h
> > +++ b/include/linux/kgdb.h
> > @@ -199,6 +199,18 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
> >
> >  extern void kgdb_call_nmi_hook(void *ignored);
> >
> > +/**
> > + *   kgdb_smp_call_nmi_hook - Provide default fallback mechanism to
> > + *                            round-up CPUs
> > + *
> > + *   If you're using the default implementation of kgdb_roundup_cpus()
> > + *   this function will be called.  And if an arch detects at runtime to
> > + *   not support NMI based roundup then it can fallback to default
> > + *   mechanism using this API.
> > + */
> > +
> > +extern void kgdb_smp_call_nmi_hook(void);
>
> Concept looks sensible but this is a terrible name for aa command to
> round up the CPUs using smp_call... functions. Whilst it is true it that
> kgdb_roundup_cpus() does use kgdb_call_nmi_hook() internally that
> doesn't mean we should name functions after it. They should be named
> after what they are do, not how they do it.
>
> Something more like kgdb_roundup_cpus_with_smp_call() would be a much
> better name.

Sounds good. I'm happy to spin with this rename, though I was kinda
hoping to drop ${SUBJECT} patch if folks were OK with patch #10 in
this series [1]. I personally think that's the right way to go but
it's unclear to me if arm64 maintainers will think it's a hack
(despite the fact that arm32 implements the "nmi" functions with
regular IPIs).

For now, maybe I'll think positive thoughts and hope that folks will
have the time to review the series soon. If another few weeks go by
then I'll send a v9 with just your comments addressed. If nothing
else, maybe you can land the kgdb parts in a tree targeting v6.5 and
then, when the arm64 folks have the bandwidth, it will be easier to
get the rest landed.

[1] https://lore.kernel.org/r/20230419155341.v8.10.Ic3659997d6243139d0522fc3afcdfd88d7a5f030@changeid


-Doug

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI
  2023-05-10 16:42       ` Doug Anderson
@ 2023-05-16 10:09         ` Sumit Garg
  -1 siblings, 0 replies; 46+ messages in thread
From: Sumit Garg @ 2023-05-16 10:09 UTC (permalink / raw)
  To: Doug Anderson
  Cc: Mark Rutland, Catalin Marinas, Will Deacon, Daniel Thompson,
	Marc Zyngier, ito-yuichi, kgdb-bugreport, Chen-Yu Tsai,
	Masayoshi Mizuma, Peter Zijlstra, Ard Biesheuvel,
	Rafael J . Wysocki, linux-arm-kernel, Stephen Boyd,
	Lecopzer Chen, Thomas Gleixner, linux-perf-users,
	Alexandru Elisei, Andrey Konovalov, Ben Dooks, Borislav Petkov,
	Christophe Leroy, Darrick J. Wong, Dave Hansen, David S. Miller,
	Eric W. Biederman, Frederic Weisbecker, Gaosheng Cui,
	Gautham R. Shenoy, Greg Kroah-Hartman, Guilherme G. Piccoli,
	Guo Ren, H. Peter Anvin, Huacai Chen, Ingo Molnar, Ingo Molnar,
	Jason A. Donenfeld, Jason Wessel, Jianmin Lv, Jiaxun Yang,
	Jinyang He, Joey Gouly, Kees Cook, Laurent Dufour,
	Masahiro Yamada, Masayoshi Mizuma, Michael Ellerman,
	Nicholas Piggin, Paul E. McKenney, Philippe Mathieu-Daudé,
	Pierre Gondois, Qing Zhang, Russell King (Oracle),
	Russell King, Thomas Bogendoerfer, Ulf Hansson, WANG Xuerui,
	linux-kernel, linux-mips, linuxppc-dev, loongarch, sparclinux,
	x86

On Wed, 10 May 2023 at 22:20, Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Wed, May 10, 2023 at 9:30 AM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Wed, May 10, 2023 at 08:28:17AM -0700, Doug Anderson wrote:
> > > Hi,
> >
> > Hi Doug,
> >
> > > On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > > This is an attempt to resurrect Sumit's old patch series [1] that
> > > > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
> > > > also to round up CPUs in kdb/kgdb. The last post from Sumit that I
> > > > could find was v7, so I called this series v8. I haven't copied all of
> > > > his old changelongs here, but you can find them from the link.
> > > >

Thanks Doug for picking up this work and for all your additions/improvements.

> > > > Since v7, I have:
> > > > * Addressed the small amount of feedback that was there for v7.
> > > > * Rebased.
> > > > * Added a new patch that prevents us from spamming the logs with idle
> > > >   tasks.
> > > > * Added an extra patch to gracefully fall back to regular IPIs if
> > > >   pseudo-NMIs aren't there.
> > > >
> > > > Since there appear to be a few different patches series related to
> > > > being able to use NMIs to get stack traces of crashed systems, let me
> > > > try to organize them to the best of my understanding:
> > > >
> > > > a) This series. On its own, a) will (among other things) enable stack
> > > >    traces of all running processes with the soft lockup detector if
> > > >    you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
> > > >    its own, a) doesn't give a hard lockup detector.
> > > >
> > > > b) A different recently-posted series [2] that adds a hard lockup
> > > >    detector based on perf. On its own, b) gives a stack crawl of the
> > > >    locked up CPU but no stack crawls of other CPUs (even if they're
> > > >    locked too). Together with a) + b) we get everything (full lockup
> > > >    detect, full ability to get stack crawls).
> > > >
> > > > c) The old Android "buddy" hard lockup detector [3] that I'm
> > > >    considering trying to upstream. If b) lands then I believe c) would
> > > >    be redundant (at least for arm64). c) on its own is really only
> > > >    useful on arm64 for platforms that can print CPU_DBGPCSR somehow
> > > >    (see [4]). a) + c) is roughly as good as a) + b).
> >
> > > It's been 3 weeks and I haven't heard a peep on this series. That
> > > means nobody has any objections and it's all good to land, right?
> > > Right? :-P

For me it was months of waiting without any feedback. So I think you
are lucky :) or at least better than me at poking arm64 maintainers.

> >
> > FWIW, there are still longstanding soundness issues in the arm64 pseudo-NMI
> > support (and fixing that requires an overhaul of our DAIF / IRQ flag
> > management, which I've been chipping away at for a number of releases), so I
> > hadn't looked at this in detail yet because the foundations are still somewhat
> > dodgy.
> >
> > I appreciate that this has been around for a while, and it's on my queue to
> > look at.
>
> Ah, thanks for the heads up! We've been thinking about turning this on
> in production in ChromeOS because it will help us track down a whole
> class of field-generated crash reports that are otherwise opaque to
> us. It sounds as if maybe that's not a good idea quite yet? Do you
> have any idea of how much farther along this needs to go? ...of
> course, we've also run into issues with Mediatek devices because they
> don't save/restore GICR registers properly [1]. In theory, we might be
> able to work around that in the kernel.
>
> In any case, even if there are bugs that would prevent turning this on
> for production, it still seems like we could still land this series.
> It simply wouldn't do anything until someone turned on pseudo NMIs,
> which wouldn't happen till the kinks are worked out.

I agree here. We should be able to make the foundations robust later
on. IMHO, until we turn on the features surrounding pseudo-NMIs, I am
not sure how we can have true confidence in the underlying robustness.

-Sumit

>
> ...actually, I guess I should say that if all the patches of the
> current series do land then it actually _would_ still do something,
> even without pseudo-NMI. Assuming the last patch looks OK, it would at
> least start falling back to using regular IPIs to do backtraces. That
> wouldn't get backtraces on hard locked up CPUs but it would be better
> than what we have today where we don't get any backtraces. This would
> get arm64 on par with arm32...
>
> [1] https://issuetracker.google.com/281831288

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI
  2023-05-10 16:42       ` Doug Anderson
@ 2023-06-01 21:46         ` Doug Anderson
  -1 siblings, 0 replies; 46+ messages in thread
From: Doug Anderson @ 2023-06-01 21:46 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Catalin Marinas, Will Deacon, Sumit Garg, Daniel Thompson,
	Marc Zyngier, ito-yuichi, kgdb-bugreport, Chen-Yu Tsai,
	Masayoshi Mizuma, Peter Zijlstra, Ard Biesheuvel,
	Rafael J . Wysocki, linux-arm-kernel, Stephen Boyd,
	Lecopzer Chen, Thomas Gleixner, linux-perf-users,
	Alexandru Elisei, Andrey Konovalov, Ben Dooks, Borislav Petkov,
	Christophe Leroy, Darrick J. Wong, Dave Hansen, David S. Miller,
	Eric W. Biederman, Frederic Weisbecker, Gaosheng Cui,
	Gautham R. Shenoy, Greg Kroah-Hartman, Guilherme G. Piccoli,
	Guo Ren, H. Peter Anvin, Huacai Chen, Ingo Molnar, Ingo Molnar,
	Jason A. Donenfeld, Jason Wessel, Jianmin Lv, Jiaxun Yang,
	Jinyang He, Joey Gouly, Kees Cook, Laurent Dufour,
	Masahiro Yamada, Masayoshi Mizuma, Michael Ellerman,
	Nicholas Piggin, Paul E. McKenney, Philippe Mathieu-Daudé,
	Pierre Gondois, Qing Zhang, Russell King (Oracle),
	Russell King, Thomas Bogendoerfer, Ulf Hansson, WANG Xuerui,
	linux-kernel, linux-mips, linuxppc-dev, loongarch, sparclinux,
	x86

Hi,

On Wed, May 10, 2023 at 9:42 AM Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Wed, May 10, 2023 at 9:30 AM Mark Rutland <mark.rutland@arm.com> wrote:
> >
> > On Wed, May 10, 2023 at 08:28:17AM -0700, Doug Anderson wrote:
> > > Hi,
> >
> > Hi Doug,
> >
> > > On Wed, Apr 19, 2023 at 3:57 PM Douglas Anderson <dianders@chromium.org> wrote:
> > > > This is an attempt to resurrect Sumit's old patch series [1] that
> > > > allowed us to use the arm64 pseudo-NMI to get backtraces of CPUs and
> > > > also to round up CPUs in kdb/kgdb. The last post from Sumit that I
> > > > could find was v7, so I called this series v8. I haven't copied all of
> > > > his old changelongs here, but you can find them from the link.
> > > >
> > > > Since v7, I have:
> > > > * Addressed the small amount of feedback that was there for v7.
> > > > * Rebased.
> > > > * Added a new patch that prevents us from spamming the logs with idle
> > > >   tasks.
> > > > * Added an extra patch to gracefully fall back to regular IPIs if
> > > >   pseudo-NMIs aren't there.
> > > >
> > > > Since there appear to be a few different patches series related to
> > > > being able to use NMIs to get stack traces of crashed systems, let me
> > > > try to organize them to the best of my understanding:
> > > >
> > > > a) This series. On its own, a) will (among other things) enable stack
> > > >    traces of all running processes with the soft lockup detector if
> > > >    you've enabled the sysctl "kernel.softlockup_all_cpu_backtrace". On
> > > >    its own, a) doesn't give a hard lockup detector.
> > > >
> > > > b) A different recently-posted series [2] that adds a hard lockup
> > > >    detector based on perf. On its own, b) gives a stack crawl of the
> > > >    locked up CPU but no stack crawls of other CPUs (even if they're
> > > >    locked too). Together with a) + b) we get everything (full lockup
> > > >    detect, full ability to get stack crawls).
> > > >
> > > > c) The old Android "buddy" hard lockup detector [3] that I'm
> > > >    considering trying to upstream. If b) lands then I believe c) would
> > > >    be redundant (at least for arm64). c) on its own is really only
> > > >    useful on arm64 for platforms that can print CPU_DBGPCSR somehow
> > > >    (see [4]). a) + c) is roughly as good as a) + b).
> >
> > > It's been 3 weeks and I haven't heard a peep on this series. That
> > > means nobody has any objections and it's all good to land, right?
> > > Right? :-P
> >
> > FWIW, there are still longstanding soundness issues in the arm64 pseudo-NMI
> > support (and fixing that requires an overhaul of our DAIF / IRQ flag
> > management, which I've been chipping away at for a number of releases), so I
> > hadn't looked at this in detail yet because the foundations are still somewhat
> > dodgy.
> >
> > I appreciate that this has been around for a while, and it's on my queue to
> > look at.
>
> Ah, thanks for the heads up! We've been thinking about turning this on
> in production in ChromeOS because it will help us track down a whole
> class of field-generated crash reports that are otherwise opaque to
> us. It sounds as if maybe that's not a good idea quite yet? Do you
> have any idea of how much farther along this needs to go?

I'm still very interested in this topic. Do you have anything concrete
that is broken? Your reply gives me the vibe that a bunch of bugs have
been found recently and, while there are no known issues, you're
worried that something might still be lingering. Is that correct, or
is there something concrete that's broken? Is this anything others
could help with? I could make an attempt, or we might be able to rope
in some of the others who are interested in this topic and more
GIC-knowledgeable.

I still have a goal of turning this on in production but obviously
don't want to make devices crashier because of it.

Also: if this really has known bugs, should we put a big WARN_ON splat
if anyone tries to turn on pseudo NMIs? Without that, it feels like
it's a bit of a footgun.


> ...of
> course, we've also run into issues with Mediatek devices because they
> don't save/restore GICR registers properly [1]. In theory, we might be
> able to work around that in the kernel.

To follow up on this, we've formulated a plan for dealing with
Mediatek Chromebooks. I doubt anyone is truly interested, but if
anyone is, the details are in the public Google bug tracker [1]. None
of that should block landing the NMI backtrace series.


> In any case, even if there are bugs that would prevent turning this on
> for production, it still seems like we could still land this series.
> It simply wouldn't do anything until someone turned on pseudo NMIs,
> which wouldn't happen till the kinks are worked out.
>
> ...actually, I guess I should say that if all the patches of the
> current series do land then it actually _would_ still do something,
> even without pseudo-NMI. Assuming the last patch looks OK, it would at
> least start falling back to using regular IPIs to do backtraces. That
> wouldn't get backtraces on hard locked up CPUs but it would be better
> than what we have today where we don't get any backtraces. This would
> get arm64 on par with arm32...

The more I thought about it, the more I liked the idea of going whole
hog on the approach in patch #10. I've posted v9 [2] where I've really
embraced the idea of falling back to a regular IPI if NMI isn't
available. Hopefully it looks good.

[1] https://issuetracker.google.com/281831288
[2] https://lore.kernel.org/r/20230601213440.2488667-1-dianders@chromium.org

-Doug

^ permalink raw reply	[flat|nested] 46+ messages in thread

* Re: [PATCH v8 07/10] kgdb: Expose default CPUs roundup fallback mechanism
  2023-05-15 23:21       ` Doug Anderson
@ 2023-06-01 21:47         ` Doug Anderson
  -1 siblings, 0 replies; 46+ messages in thread
From: Doug Anderson @ 2023-06-01 21:47 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Catalin Marinas, Will Deacon, Sumit Garg, Marc Zyngier,
	Mark Rutland, ito-yuichi, kgdb-bugreport, Chen-Yu Tsai,
	Masayoshi Mizuma, Peter Zijlstra, Ard Biesheuvel,
	Rafael J . Wysocki, linux-arm-kernel, Stephen Boyd,
	Lecopzer Chen, Thomas Gleixner, linux-perf-users, Jason Wessel,
	linux-kernel

Hi,

On Mon, May 15, 2023 at 4:21 PM Doug Anderson <dianders@chromium.org> wrote:
>
> Hi,
>
> On Fri, May 12, 2023 at 6:49 AM Daniel Thompson
> <daniel.thompson@linaro.org> wrote:
> >
> > On Wed, Apr 19, 2023 at 03:56:01PM -0700, Douglas Anderson wrote:
> > > From: Sumit Garg <sumit.garg@linaro.org>
> > >
> > > Add a new API, kgdb_smp_call_nmi_hook(), to expose the default CPUs
> > > roundup mechanism to a particular architecture as a runtime fallback
> > > if it detects that it doesn't support NMI roundup.
> > >
> > > Currently an example of such an architecture is arm64 with the
> > > pseudo-NMI feature, which is only available on platforms with GICv3
> > > or a later version.
> > >
> > > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > > Tested-by: Chen-Yu Tsai <wens@csie.org>
> > > Signed-off-by: Douglas Anderson <dianders@chromium.org>
> > > ---
> > >
> > > (no changes since v1)
> > >
> > >  include/linux/kgdb.h      | 12 ++++++++++++
> > >  kernel/debug/debug_core.c |  8 +++++++-
> > >  2 files changed, 19 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> > > index 258cdde8d356..87713bd390f3 100644
> > > --- a/include/linux/kgdb.h
> > > +++ b/include/linux/kgdb.h
> > > @@ -199,6 +199,18 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
> > >
> > >  extern void kgdb_call_nmi_hook(void *ignored);
> > >
> > > +/**
> > > + *   kgdb_smp_call_nmi_hook - Provide default fallback mechanism to
> > > + *                            round-up CPUs
> > > + *
> > > + *   If you're using the default implementation of kgdb_roundup_cpus(),
> > > + *   this function will be called.  If an arch detects at runtime that
> > > + *   it doesn't support NMI-based roundup, it can fall back to the
> > > + *   default mechanism using this API.
> > > + */
> > > +
> > > +extern void kgdb_smp_call_nmi_hook(void);
> >
> > Concept looks sensible but this is a terrible name for a command to
> > round up the CPUs using smp_call... functions. Whilst it is true that
> > kgdb_roundup_cpus() does use kgdb_call_nmi_hook() internally, that
> > doesn't mean we should name functions after it. They should be named
> > after what they do, not how they do it.
> >
> > Something more like kgdb_roundup_cpus_with_smp_call() would be a much
> > better name.
>
> Sounds good. I'm happy to spin with this rename, though I was kinda
> hoping to drop ${SUBJECT} patch if folks were OK with patch #10 in
> this series [1]. I personally think that's the right way to go but
> it's unclear to me if arm64 maintainers will think it's a hack
> (despite the fact that arm32 implements the "nmi" functions with
> regular IPIs).
>
> For now, maybe I'll think positive thoughts and hope that folks will
> have the time to review the series soon. If another few weeks go by
> then I'll send a v9 with just your comments addressed. If nothing
> else, maybe you can land the kgdb parts in a tree targeting v6.5, and
> then when the arm64 folks have the bandwidth it will be easier to get
> the rest landed.
>
> [1] https://lore.kernel.org/r/20230419155341.v8.10.Ic3659997d6243139d0522fc3afcdfd88d7a5f030@changeid

Breadcrumbs: I've dropped this patch and several others in v9 [1] by
embracing the idea of using a normal IPI as a fallback.

[1] https://lore.kernel.org/r/20230601213440.2488667-1-dianders@chromium.org/
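
For context on the fallback mechanism discussed above, here is a
minimal sketch of how an architecture could have consumed the hook
(illustrative only; the arch-side helpers are invented names, and v9
ultimately dropped this approach in favor of the plain-IPI fallback):

	/*
	 * Hypothetical arch override of kgdb_roundup_cpus(): use an NMI
	 * IPI when the platform supports it, otherwise fall back to the
	 * generic smp_call-based roundup exposed by this patch.
	 * arch_nmi_ipi_available() and arch_send_kgdb_nmi_ipi() are
	 * assumed names, not real kernel functions.
	 */
	void kgdb_roundup_cpus(void)
	{
		if (arch_nmi_ipi_available())
			arch_send_kgdb_nmi_ipi();
		else
			kgdb_smp_call_nmi_hook();
	}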

^ permalink raw reply	[flat|nested] 46+ messages in thread

end of thread, other threads:[~2023-06-01 23:39 UTC | newest]

Thread overview: 46+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-04-19 22:55 [PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI Douglas Anderson
2023-04-19 22:55 ` Douglas Anderson
2023-04-19 22:55 ` [PATCH v8 01/10] arm64: Add framework to turn " Douglas Anderson
2023-04-19 22:55   ` Douglas Anderson
2023-04-19 22:55 ` [PATCH v8 02/10] irqchip/gic-v3: Enable support for SGIs to act as NMIs Douglas Anderson
2023-04-19 22:55   ` Douglas Anderson
2023-04-19 22:55 ` [PATCH v8 03/10] arm64: smp: Assign and setup an IPI as NMI Douglas Anderson
2023-04-19 22:55   ` Douglas Anderson
2023-04-24 17:53   ` Doug Anderson
2023-04-24 17:53     ` Doug Anderson
2023-04-19 22:55 ` [PATCH v8 04/10] nmi: backtrace: Allow runtime arch specific override Douglas Anderson
2023-04-19 22:55   ` Douglas Anderson
2023-04-19 22:55 ` [PATCH v8 05/10] arm64: ipi_nmi: Add support for NMI backtrace Douglas Anderson
2023-04-19 22:55   ` Douglas Anderson
2023-04-19 22:56 ` [PATCH v8 06/10] arm64: idle: Tag the arm64 idle functions as __cpuidle Douglas Anderson
2023-04-19 22:56   ` Douglas Anderson
2023-05-10 16:43   ` Mark Rutland
2023-05-10 21:13     ` Doug Anderson
2023-04-19 22:56 ` [PATCH v8 07/10] kgdb: Expose default CPUs roundup fallback mechanism Douglas Anderson
2023-04-19 22:56   ` Douglas Anderson
2023-05-12 13:48   ` Daniel Thompson
2023-05-15 23:21     ` Doug Anderson
2023-05-15 23:21       ` Doug Anderson
2023-06-01 21:47       ` Doug Anderson
2023-06-01 21:47         ` Doug Anderson
2023-04-19 22:56 ` [PATCH v8 08/10] kgdb: Provide a stub kgdb_nmicallback() if !CONFIG_KGDB Douglas Anderson
2023-04-19 22:56   ` Douglas Anderson
2023-05-11 14:34   ` Doug Anderson
2023-05-12 13:52     ` Daniel Thompson
2023-04-19 22:56 ` [PATCH v8 09/10] arm64: kgdb: Roundup cpus using IPI as NMI Douglas Anderson
2023-04-19 22:56   ` Douglas Anderson
2023-05-12 14:00   ` Daniel Thompson
2023-05-15 23:11     ` Doug Anderson
2023-05-15 23:11       ` Doug Anderson
2023-04-19 22:56 ` [PATCH v8 10/10] arm64: ipi_nmi: Fallback to a regular IPI if NMI isn't enabled Douglas Anderson
2023-04-19 22:56   ` Douglas Anderson
2023-05-10 15:28 ` [PATCH v8 00/10] arm64: Add framework to turn an IPI as NMI Doug Anderson
2023-05-10 15:28   ` Doug Anderson
2023-05-10 16:30   ` Mark Rutland
2023-05-10 16:30     ` Mark Rutland
2023-05-10 16:42     ` Doug Anderson
2023-05-10 16:42       ` Doug Anderson
2023-05-16 10:09       ` Sumit Garg
2023-05-16 10:09         ` Sumit Garg
2023-06-01 21:46       ` Doug Anderson
2023-06-01 21:46         ` Doug Anderson
