linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v6 0/7] arm64: Add framework to turn an IPI as NMI
@ 2020-10-29 14:56 Sumit Garg
  2020-10-29 14:56 ` [PATCH v6 1/7] arm64: Add framework to turn " Sumit Garg
                   ` (6 more replies)
  0 siblings, 7 replies; 14+ messages in thread
From: Sumit Garg @ 2020-10-29 14:56 UTC (permalink / raw)
  To: maz, catalin.marinas, will
  Cc: linux-arm-kernel, tglx, jason, linux, tsbogend, mpe, davem,
	mingo, bp, x86, mark.rutland, julien.thierry.kdev, dianders,
	daniel.thompson, jason.wessel, msys.mizuma, ito-yuichi,
	kgdb-bugreport, linux-kernel, Sumit Garg

With pseudo NMIs support available its possible to configure SGIs to be
triggered as pseudo NMIs running in NMI context. And kernel features
such as:
- NMI backtrace can leverage IPI turned as NMI to get a backtrace of CPU
  stuck in hard lockup using magic SYSRQ.
- kgdb relies on NMI support to round up CPUs which are stuck in hard
  lockup state with interrupts disabled.

This patch-set adds framework to turn an IPI as NMI which can be triggered
as a pseudo NMI which in turn invokes registered NMI handlers.

After this patch-set we should be able to get a backtrace for a CPU
stuck in HARDLOCKUP. Have a look at an examples below from a hard lockup
testcase run on Developerbox:

$ echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT

NMI backtrace:
==============

# Issue Magic SysRq to dump backtrace

[  376.894502] NMI backtrace for cpu 8
[  376.894506] CPU: 8 PID: 555 Comm: bash Not tainted 5.9.0-rc3-00740-g06ff047-dirty #242
[  376.894510] Hardware name: Socionext SynQuacer E-series DeveloperBox, BIOS build #73 Apr  6 2020
[  376.894514] pstate: 40000005 (nZcv daif -PAN -UAO BTYPE=--)
[  376.894517] pc : lkdtm_HARDLOCKUP+0x8/0x18
[  376.894520] lr : lkdtm_do_action+0x24/0x30
[  376.894524] sp : ffff800012cebd20
[  376.894527] pmr_save: 00000060
[  376.894530] x29: ffff800012cebd20 x28: ffff000875ae8000
[  376.894540] x27: 0000000000000000 x26: 0000000000000000
[  376.894550] x25: 000000000000001a x24: ffff800012cebe40
[  376.894560] x23: 000000000000000b x22: ffff800010fc5040
[  376.894569] x21: ffff000878b61000 x20: ffff8000113b2870
[  376.894579] x19: 000000000000001b x18: 0000000000000010
[  376.894588] x17: 0000000000000000 x16: 0000000000000000
[  376.894598] x15: ffff000875ae8470 x14: 00000000000002ad
[  376.894613] x13: 0000000000000000 x12: 0000000000000000
[  376.894622] x11: 0000000000000007 x10: 00000000000009c0
[  376.894631] x9 : ffff800012ceba80 x8 : ffff000875ae8a20
[  376.894641] x7 : ffff00087f6b3280 x6 : ffff00087f6b3200
[  376.894651] x5 : 0000000000000000 x4 : ffff00087f6a91f8
[  376.894660] x3 : ffff00087f6b0120 x2 : 1aa310cec69eb500
[  376.894670] x1 : 0000000000000000 x0 : 0000000000000060
[  376.894679] Call trace:
[  376.894683]  lkdtm_HARDLOCKUP+0x8/0x18
[  376.894686]  direct_entry+0x124/0x1c0
[  376.894689]  full_proxy_write+0x60/0xb0
[  376.894693]  vfs_write+0xf0/0x230
[  376.894696]  ksys_write+0x6c/0xf8
[  376.894699]  __arm64_sys_write+0x1c/0x28
[  376.894703]  el0_svc_common.constprop.0+0x74/0x1f0
[  376.894707]  do_el0_svc+0x24/0x90
[  376.894710]  el0_sync_handler+0x180/0x2f8
[  376.894713]  el0_sync+0x158/0x180

KGDB:
=====

# Enter kdb via Magic SysRq

[6]kdb> btc
btc: cpu status: Currently on cpu 6
Available cpus: 0-5(I), 6, 7(I), 8, 9-23(I)
<snip>
Stack traceback for pid 555
0xffff000875ae8000      555      554  1    8   R  0xffff000875ae89c0  bash
CPU: 8 PID: 555 Comm: bash Not tainted 5.9.0-rc3-00740-g06ff047-dirty #242
Hardware name: Socionext SynQuacer E-series DeveloperBox, BIOS build #73 Apr  6 2020
Call trace:
 dump_backtrace+0x0/0x1a0
 show_stack+0x18/0x28
 dump_stack+0xc0/0x11c
 kgdb_cpu_enter+0x648/0x660
 kgdb_nmicallback+0xa0/0xa8
 ipi_kgdb_nmicallback+0x24/0x30
 ipi_nmi_handler+0x48/0x60
 handle_percpu_devid_fasteoi_ipi+0x74/0x88
 generic_handle_irq+0x30/0x48
 handle_domain_nmi+0x48/0x80
 gic_handle_irq+0x18c/0x34c
 el1_irq+0xcc/0x180 
 lkdtm_HARDLOCKUP+0x8/0x18
 direct_entry+0x124/0x1c0
 full_proxy_write+0x60/0xb0
 vfs_write+0xf0/0x230
 ksys_write+0x6c/0xf8
 __arm64_sys_write+0x1c/0x28
 el0_svc_common.constprop.0+0x74/0x1f0
 do_el0_svc+0x24/0x90
 el0_sync_handler+0x180/0x2f8
 el0_sync+0x158/0x180
<snip>

Changes in v6:
- Two new patches: #4 and #6 which adds runtime fallback framework for
  sysrq backtrace and kgdb roundup features.
- Reversed order of NMI backtrace and kgdb roundup feaure patches.
- Addressed other misc. comments from Marc.
- I haven't picked any tags from v5 since I think there is major rework
  involved. Masayoshi, could you please confirm if these features still
  work for you?

Changes in v5:
- Rebased to head of upstream master.
- Remove redundant invocation of ipi_nmi_setup().
- Addressed misc. comments.

Changes in v4:
- Move IPI NMI framework to a separate file.
- Get rid of hard-coded IPI_CALL_NMI_FUNC allocation.
- Add NMI backtrace support leveraged via magic SYSRQ.

Changes in v3:
- Rebased to Marc's latest IPIs patch-set [1].

[1] https://lkml.org/lkml/2020/9/1/603

Changes since RFC version [1]:
- Switch to use generic interrupt framework to turn an IPI as NMI.
- Dependent on Marc's patch-set [2] which turns IPIs into normal
  interrupts.
- Addressed misc. comments from Doug on patch #4.
- Posted kgdb NMI printk() fixup separately which has evolved since
  to be solved using different approach via changing kgdb interception
  of printk() in common printk() code (see patch [3]).

[1] https://lkml.org/lkml/2020/4/24/328
[2] https://lkml.org/lkml/2020/5/19/710
[3] https://lkml.org/lkml/2020/5/20/418

Sumit Garg (7):
  arm64: Add framework to turn IPI as NMI
  irqchip/gic-v3: Enable support for SGIs to act as NMIs
  arm64: smp: Assign and setup an IPI as NMI
  nmi: backtrace: Allow runtime arch specific override
  arm64: ipi_nmi: Add support for NMI backtrace
  kgdb: roundup: Allow runtime arch specific override
  arm64: kgdb: Roundup cpus using IPI as NMI

 arch/arm/include/asm/irq.h       |  2 +-
 arch/arm/kernel/smp.c            |  3 +-
 arch/arm64/include/asm/irq.h     |  6 +++
 arch/arm64/include/asm/kgdb.h    |  9 +++++
 arch/arm64/include/asm/nmi.h     | 17 ++++++++
 arch/arm64/kernel/Makefile       |  2 +-
 arch/arm64/kernel/ipi_nmi.c      | 84 ++++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/kgdb.c         | 35 +++++++++++++++++
 arch/arm64/kernel/smp.c          |  8 ++++
 arch/mips/include/asm/irq.h      |  2 +-
 arch/mips/kernel/process.c       |  3 +-
 arch/powerpc/include/asm/nmi.h   |  2 +-
 arch/powerpc/kernel/kgdb.c       |  3 +-
 arch/powerpc/kernel/stacktrace.c |  3 +-
 arch/sparc/include/asm/irq_64.h  |  2 +-
 arch/sparc/kernel/process_64.c   |  4 +-
 arch/sparc/kernel/smp_64.c       |  3 +-
 arch/x86/include/asm/irq.h       |  2 +-
 arch/x86/kernel/apic/hw_nmi.c    |  3 +-
 arch/x86/kernel/kgdb.c           |  6 ++-
 drivers/irqchip/irq-gic-v3.c     | 29 ++++++++++----
 include/linux/kgdb.h             |  5 ++-
 include/linux/nmi.h              | 12 ++----
 kernel/debug/debug_core.c        | 10 ++++-
 24 files changed, 221 insertions(+), 34 deletions(-)
 create mode 100644 arch/arm64/include/asm/nmi.h
 create mode 100644 arch/arm64/kernel/ipi_nmi.c

-- 
2.7.4


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v6 1/7] arm64: Add framework to turn IPI as NMI
  2020-10-29 14:56 [PATCH v6 0/7] arm64: Add framework to turn an IPI as NMI Sumit Garg
@ 2020-10-29 14:56 ` Sumit Garg
  2020-10-29 14:56 ` [PATCH v6 2/7] irqchip/gic-v3: Enable support for SGIs to act as NMIs Sumit Garg
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Sumit Garg @ 2020-10-29 14:56 UTC (permalink / raw)
  To: maz, catalin.marinas, will
  Cc: linux-arm-kernel, tglx, jason, linux, tsbogend, mpe, davem,
	mingo, bp, x86, mark.rutland, julien.thierry.kdev, dianders,
	daniel.thompson, jason.wessel, msys.mizuma, ito-yuichi,
	kgdb-bugreport, linux-kernel, Sumit Garg

Introduce framework to turn an IPI as NMI using pseudo NMIs. The main
motivation for this feature is to have an IPI that can be leveraged to
invoke NMI functions on other CPUs.

And current prospective users are NMI backtrace and KGDB CPUs round-up
whose support is added via future patches.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
---
 arch/arm64/include/asm/nmi.h | 17 ++++++++++++
 arch/arm64/kernel/Makefile   |  2 +-
 arch/arm64/kernel/ipi_nmi.c  | 65 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 83 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/include/asm/nmi.h
 create mode 100644 arch/arm64/kernel/ipi_nmi.c

diff --git a/arch/arm64/include/asm/nmi.h b/arch/arm64/include/asm/nmi.h
new file mode 100644
index 0000000..4cd14b6
--- /dev/null
+++ b/arch/arm64/include/asm/nmi.h
@@ -0,0 +1,17 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_NMI_H
+#define __ASM_NMI_H
+
+#ifndef __ASSEMBLER__
+
+#include <linux/cpumask.h>
+
+extern bool arm64_supports_nmi(void);
+extern void arm64_send_nmi(cpumask_t *mask);
+
+void set_smp_dynamic_ipi(int ipi);
+void dynamic_ipi_setup(int cpu);
+void dynamic_ipi_teardown(int cpu);
+
+#endif /* !__ASSEMBLER__ */
+#endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index bbaf0bc..525a1e0 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -17,7 +17,7 @@ obj-y			:= debug-monitors.o entry.o irq.o fpsimd.o		\
 			   return_address.o cpuinfo.o cpu_errata.o		\
 			   cpufeature.o alternative.o cacheinfo.o		\
 			   smp.o smp_spin_table.o topology.o smccc-call.o	\
-			   syscall.o proton-pack.o
+			   syscall.o proton-pack.o ipi_nmi.o
 
 targets			+= efi-entry.o
 
diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
new file mode 100644
index 0000000..a945dcf
--- /dev/null
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -0,0 +1,65 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * NMI support for IPIs
+ *
+ * Copyright (C) 2020 Linaro Limited
+ * Author: Sumit Garg <sumit.garg@linaro.org>
+ */
+
+#include <linux/interrupt.h>
+#include <linux/irq.h>
+#include <linux/smp.h>
+
+#include <asm/nmi.h>
+
+static struct irq_desc *ipi_nmi_desc __read_mostly;
+static int ipi_nmi_id __read_mostly;
+
+bool arm64_supports_nmi(void)
+{
+	if (ipi_nmi_desc)
+		return true;
+
+	return false;
+}
+
+void arm64_send_nmi(cpumask_t *mask)
+{
+	if (WARN_ON_ONCE(!ipi_nmi_desc))
+		return;
+
+	__ipi_send_mask(ipi_nmi_desc, mask);
+}
+
+static irqreturn_t ipi_nmi_handler(int irq, void *data)
+{
+	/* nop, NMI handlers for special features can be added here. */
+
+	return IRQ_NONE;
+}
+
+void dynamic_ipi_setup(int cpu)
+{
+	if (!ipi_nmi_desc)
+		return;
+
+	if (!prepare_percpu_nmi(ipi_nmi_id))
+		enable_percpu_nmi(ipi_nmi_id, IRQ_TYPE_NONE);
+}
+
+void dynamic_ipi_teardown(int cpu)
+{
+	if (!ipi_nmi_desc)
+		return;
+
+	disable_percpu_nmi(ipi_nmi_id);
+	teardown_percpu_nmi(ipi_nmi_id);
+}
+
+void __init set_smp_dynamic_ipi(int ipi)
+{
+	if (!request_percpu_nmi(ipi, ipi_nmi_handler, "IPI", &cpu_number)) {
+		ipi_nmi_desc = irq_to_desc(ipi);
+		ipi_nmi_id = ipi;
+	}
+}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 2/7] irqchip/gic-v3: Enable support for SGIs to act as NMIs
  2020-10-29 14:56 [PATCH v6 0/7] arm64: Add framework to turn an IPI as NMI Sumit Garg
  2020-10-29 14:56 ` [PATCH v6 1/7] arm64: Add framework to turn " Sumit Garg
@ 2020-10-29 14:56 ` Sumit Garg
  2020-10-29 14:56 ` [PATCH v6 3/7] arm64: smp: Assign and setup an IPI as NMI Sumit Garg
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Sumit Garg @ 2020-10-29 14:56 UTC (permalink / raw)
  To: maz, catalin.marinas, will
  Cc: linux-arm-kernel, tglx, jason, linux, tsbogend, mpe, davem,
	mingo, bp, x86, mark.rutland, julien.thierry.kdev, dianders,
	daniel.thompson, jason.wessel, msys.mizuma, ito-yuichi,
	kgdb-bugreport, linux-kernel, Sumit Garg

Add support to handle SGIs as pseudo NMIs. As SGIs or IPIs default to a
special flow handler: handle_percpu_devid_fasteoi_ipi(), so skip NMI
handler update in case of SGIs.

Also, enable NMI support prior to gic_smp_init() as allocation of SGIs
as IRQs/NMIs happen as part of this routine.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
---
 drivers/irqchip/irq-gic-v3.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index 16fecc0..7010ae2 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -461,6 +461,7 @@ static u32 gic_get_ppi_index(struct irq_data *d)
 static int gic_irq_nmi_setup(struct irq_data *d)
 {
 	struct irq_desc *desc = irq_to_desc(d->irq);
+	u32 idx;
 
 	if (!gic_supports_nmi())
 		return -EINVAL;
@@ -478,16 +479,22 @@ static int gic_irq_nmi_setup(struct irq_data *d)
 		return -EINVAL;
 
 	/* desc lock should already be held */
-	if (gic_irq_in_rdist(d)) {
-		u32 idx = gic_get_ppi_index(d);
+	switch (get_intid_range(d)) {
+	case SGI_RANGE:
+		break;
+	case PPI_RANGE:
+	case EPPI_RANGE:
+		idx = gic_get_ppi_index(d);
 
 		/* Setting up PPI as NMI, only switch handler for first NMI */
 		if (!refcount_inc_not_zero(&ppi_nmi_refs[idx])) {
 			refcount_set(&ppi_nmi_refs[idx], 1);
 			desc->handle_irq = handle_percpu_devid_fasteoi_nmi;
 		}
-	} else {
+		break;
+	default:
 		desc->handle_irq = handle_fasteoi_nmi;
+		break;
 	}
 
 	gic_irq_set_prio(d, GICD_INT_NMI_PRI);
@@ -498,6 +505,7 @@ static int gic_irq_nmi_setup(struct irq_data *d)
 static void gic_irq_nmi_teardown(struct irq_data *d)
 {
 	struct irq_desc *desc = irq_to_desc(d->irq);
+	u32 idx;
 
 	if (WARN_ON(!gic_supports_nmi()))
 		return;
@@ -515,14 +523,20 @@ static void gic_irq_nmi_teardown(struct irq_data *d)
 		return;
 
 	/* desc lock should already be held */
-	if (gic_irq_in_rdist(d)) {
-		u32 idx = gic_get_ppi_index(d);
+	switch (get_intid_range(d)) {
+	case SGI_RANGE:
+		break;
+	case PPI_RANGE:
+	case EPPI_RANGE:
+		idx = gic_get_ppi_index(d);
 
 		/* Tearing down NMI, only switch handler for last NMI */
 		if (refcount_dec_and_test(&ppi_nmi_refs[idx]))
 			desc->handle_irq = handle_percpu_devid_irq;
-	} else {
+		break;
+	default:
 		desc->handle_irq = handle_fasteoi_irq;
+		break;
 	}
 
 	gic_irq_set_prio(d, GICD_INT_DEF_PRI);
@@ -1708,6 +1722,7 @@ static int __init gic_init_bases(void __iomem *dist_base,
 
 	gic_dist_init();
 	gic_cpu_init();
+	gic_enable_nmi_support();
 	gic_smp_init();
 	gic_cpu_pm_init();
 
@@ -1719,8 +1734,6 @@ static int __init gic_init_bases(void __iomem *dist_base,
 			gicv2m_init(handle, gic_data.domain);
 	}
 
-	gic_enable_nmi_support();
-
 	return 0;
 
 out_free:
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 3/7] arm64: smp: Assign and setup an IPI as NMI
  2020-10-29 14:56 [PATCH v6 0/7] arm64: Add framework to turn an IPI as NMI Sumit Garg
  2020-10-29 14:56 ` [PATCH v6 1/7] arm64: Add framework to turn " Sumit Garg
  2020-10-29 14:56 ` [PATCH v6 2/7] irqchip/gic-v3: Enable support for SGIs to act as NMIs Sumit Garg
@ 2020-10-29 14:56 ` Sumit Garg
  2020-10-29 14:56 ` [PATCH v6 4/7] nmi: backtrace: Allow runtime arch specific override Sumit Garg
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Sumit Garg @ 2020-10-29 14:56 UTC (permalink / raw)
  To: maz, catalin.marinas, will
  Cc: linux-arm-kernel, tglx, jason, linux, tsbogend, mpe, davem,
	mingo, bp, x86, mark.rutland, julien.thierry.kdev, dianders,
	daniel.thompson, jason.wessel, msys.mizuma, ito-yuichi,
	kgdb-bugreport, linux-kernel, Sumit Garg

Assign an unused IPI which can be turned as NMI using ipi_nmi framework.
Also, invoke corresponding dynamic IPI setup/teardown APIs.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
---
 arch/arm64/kernel/smp.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 82e75fc..2e118e2 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -43,6 +43,7 @@
 #include <asm/daifflags.h>
 #include <asm/kvm_mmu.h>
 #include <asm/mmu_context.h>
+#include <asm/nmi.h>
 #include <asm/numa.h>
 #include <asm/processor.h>
 #include <asm/smp_plat.h>
@@ -962,6 +963,8 @@ static void ipi_setup(int cpu)
 
 	for (i = 0; i < nr_ipi; i++)
 		enable_percpu_irq(ipi_irq_base + i, 0);
+
+	dynamic_ipi_setup(cpu);
 }
 
 #ifdef CONFIG_HOTPLUG_CPU
@@ -974,6 +977,8 @@ static void ipi_teardown(int cpu)
 
 	for (i = 0; i < nr_ipi; i++)
 		disable_percpu_irq(ipi_irq_base + i);
+
+	dynamic_ipi_teardown(cpu);
 }
 #endif
 
@@ -995,6 +1000,9 @@ void __init set_smp_ipi_range(int ipi_base, int n)
 		irq_set_status_flags(ipi_base + i, IRQ_HIDDEN);
 	}
 
+	if (n > nr_ipi)
+		set_smp_dynamic_ipi(ipi_base + nr_ipi);
+
 	ipi_irq_base = ipi_base;
 
 	/* Setup the boot CPU immediately */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 4/7] nmi: backtrace: Allow runtime arch specific override
  2020-10-29 14:56 [PATCH v6 0/7] arm64: Add framework to turn an IPI as NMI Sumit Garg
                   ` (2 preceding siblings ...)
  2020-10-29 14:56 ` [PATCH v6 3/7] arm64: smp: Assign and setup an IPI as NMI Sumit Garg
@ 2020-10-29 14:56 ` Sumit Garg
  2020-10-29 14:56 ` [PATCH v6 5/7] arm64: ipi_nmi: Add support for NMI backtrace Sumit Garg
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 14+ messages in thread
From: Sumit Garg @ 2020-10-29 14:56 UTC (permalink / raw)
  To: maz, catalin.marinas, will
  Cc: linux-arm-kernel, tglx, jason, linux, tsbogend, mpe, davem,
	mingo, bp, x86, mark.rutland, julien.thierry.kdev, dianders,
	daniel.thompson, jason.wessel, msys.mizuma, ito-yuichi,
	kgdb-bugreport, linux-kernel, Sumit Garg

Add a boolean return to arch_trigger_cpumask_backtrace() to support a
use-case where a particular architecture detects at runtime if it supports
NMI backtrace or it would like to fallback to default implementation using
SMP cross-calls.

Currently such an architecture example is arm64 supporting pseudo NMIs
feature which is only available on platforms which have support for GICv3
or later version.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
---
 arch/arm/include/asm/irq.h       |  2 +-
 arch/arm/kernel/smp.c            |  3 ++-
 arch/mips/include/asm/irq.h      |  2 +-
 arch/mips/kernel/process.c       |  3 ++-
 arch/powerpc/include/asm/nmi.h   |  2 +-
 arch/powerpc/kernel/stacktrace.c |  3 ++-
 arch/sparc/include/asm/irq_64.h  |  2 +-
 arch/sparc/kernel/process_64.c   |  4 +++-
 arch/x86/include/asm/irq.h       |  2 +-
 arch/x86/kernel/apic/hw_nmi.c    |  3 ++-
 include/linux/nmi.h              | 12 ++++--------
 11 files changed, 20 insertions(+), 18 deletions(-)

diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index 46d4114..54b0180 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -31,7 +31,7 @@ void handle_IRQ(unsigned int, struct pt_regs *);
 void init_IRQ(void);
 
 #ifdef CONFIG_SMP
-extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
+extern bool arch_trigger_cpumask_backtrace(const cpumask_t *mask,
 					   bool exclude_self);
 #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
 #endif
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 48099c6e..bb20a43 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -856,7 +856,8 @@ static void raise_nmi(cpumask_t *mask)
 	__ipi_send_mask(ipi_desc[IPI_CPU_BACKTRACE], mask);
 }
 
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 {
 	nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_nmi);
+	return true;
 }
diff --git a/arch/mips/include/asm/irq.h b/arch/mips/include/asm/irq.h
index c5d3517..34f3b42 100644
--- a/arch/mips/include/asm/irq.h
+++ b/arch/mips/include/asm/irq.h
@@ -78,7 +78,7 @@ extern int cp0_fdc_irq;
 
 extern int get_c0_fdc_int(void);
 
-void arch_trigger_cpumask_backtrace(const struct cpumask *mask,
+bool arch_trigger_cpumask_backtrace(const struct cpumask *mask,
 				    bool exclude_self);
 #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
 
diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index 75ebd8d..d19e672 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -735,9 +735,10 @@ static void raise_backtrace(cpumask_t *mask)
 	}
 }
 
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 {
 	nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace);
+	return true;
 }
 
 int mips_get_process_fp_mode(struct task_struct *task)
diff --git a/arch/powerpc/include/asm/nmi.h b/arch/powerpc/include/asm/nmi.h
index 84b4cfe..a5eb3e2 100644
--- a/arch/powerpc/include/asm/nmi.h
+++ b/arch/powerpc/include/asm/nmi.h
@@ -9,7 +9,7 @@ static inline void arch_touch_nmi_watchdog(void) {}
 #endif
 
 #if defined(CONFIG_NMI_IPI) && defined(CONFIG_STACKTRACE)
-extern void arch_trigger_cpumask_backtrace(const cpumask_t *mask,
+extern bool arch_trigger_cpumask_backtrace(const cpumask_t *mask,
 					   bool exclude_self);
 #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
 #endif
diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index b644065..22b112a 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -264,8 +264,9 @@ static void raise_backtrace_ipi(cpumask_t *mask)
 	}
 }
 
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 {
 	nmi_trigger_cpumask_backtrace(mask, exclude_self, raise_backtrace_ipi);
+	return true;
 }
 #endif /* defined(CONFIG_PPC_BOOK3S_64) && defined(CONFIG_NMI_IPI) */
diff --git a/arch/sparc/include/asm/irq_64.h b/arch/sparc/include/asm/irq_64.h
index 4d748e9..35c01ff 100644
--- a/arch/sparc/include/asm/irq_64.h
+++ b/arch/sparc/include/asm/irq_64.h
@@ -87,7 +87,7 @@ static inline unsigned long get_softint(void)
 	return retval;
 }
 
-void arch_trigger_cpumask_backtrace(const struct cpumask *mask,
+bool arch_trigger_cpumask_backtrace(const struct cpumask *mask,
 				    bool exclude_self);
 #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
 
diff --git a/arch/sparc/kernel/process_64.c b/arch/sparc/kernel/process_64.c
index a75093b..9182001 100644
--- a/arch/sparc/kernel/process_64.c
+++ b/arch/sparc/kernel/process_64.c
@@ -248,7 +248,7 @@ static void __global_reg_poll(struct global_reg_snapshot *gp)
 	}
 }
 
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 {
 	struct thread_info *tp = current_thread_info();
 	struct pt_regs *regs = get_irq_regs();
@@ -303,6 +303,8 @@ void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 	memset(global_cpu_snapshot, 0, sizeof(global_cpu_snapshot));
 
 	spin_unlock_irqrestore(&global_cpu_snapshot_lock, flags);
+
+	return true;
 }
 
 #ifdef CONFIG_MAGIC_SYSRQ
diff --git a/arch/x86/include/asm/irq.h b/arch/x86/include/asm/irq.h
index 528c8a7..b7668e0 100644
--- a/arch/x86/include/asm/irq.h
+++ b/arch/x86/include/asm/irq.h
@@ -47,7 +47,7 @@ extern void init_ISA_irqs(void);
 extern void __init init_IRQ(void);
 
 #ifdef CONFIG_X86_LOCAL_APIC
-void arch_trigger_cpumask_backtrace(const struct cpumask *mask,
+bool arch_trigger_cpumask_backtrace(const struct cpumask *mask,
 				    bool exclude_self);
 
 #define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
diff --git a/arch/x86/kernel/apic/hw_nmi.c b/arch/x86/kernel/apic/hw_nmi.c
index 34a992e..e7dcd28 100644
--- a/arch/x86/kernel/apic/hw_nmi.c
+++ b/arch/x86/kernel/apic/hw_nmi.c
@@ -34,10 +34,11 @@ static void nmi_raise_cpu_backtrace(cpumask_t *mask)
 	apic->send_IPI_mask(mask, NMI_VECTOR);
 }
 
-void arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 {
 	nmi_trigger_cpumask_backtrace(mask, exclude_self,
 				      nmi_raise_cpu_backtrace);
+	return true;
 }
 
 static int nmi_cpu_backtrace_handler(unsigned int cmd, struct pt_regs *regs)
diff --git a/include/linux/nmi.h b/include/linux/nmi.h
index 750c7f3..cedbfc1 100644
--- a/include/linux/nmi.h
+++ b/include/linux/nmi.h
@@ -143,26 +143,22 @@ static inline void touch_nmi_watchdog(void)
 #ifdef arch_trigger_cpumask_backtrace
 static inline bool trigger_all_cpu_backtrace(void)
 {
-	arch_trigger_cpumask_backtrace(cpu_online_mask, false);
-	return true;
+	return arch_trigger_cpumask_backtrace(cpu_online_mask, false);
 }
 
 static inline bool trigger_allbutself_cpu_backtrace(void)
 {
-	arch_trigger_cpumask_backtrace(cpu_online_mask, true);
-	return true;
+	return arch_trigger_cpumask_backtrace(cpu_online_mask, true);
 }
 
 static inline bool trigger_cpumask_backtrace(struct cpumask *mask)
 {
-	arch_trigger_cpumask_backtrace(mask, false);
-	return true;
+	return arch_trigger_cpumask_backtrace(mask, false);
 }
 
 static inline bool trigger_single_cpu_backtrace(int cpu)
 {
-	arch_trigger_cpumask_backtrace(cpumask_of(cpu), false);
-	return true;
+	return arch_trigger_cpumask_backtrace(cpumask_of(cpu), false);
 }
 
 /* generic implementation */
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 5/7] arm64: ipi_nmi: Add support for NMI backtrace
  2020-10-29 14:56 [PATCH v6 0/7] arm64: Add framework to turn an IPI as NMI Sumit Garg
                   ` (3 preceding siblings ...)
  2020-10-29 14:56 ` [PATCH v6 4/7] nmi: backtrace: Allow runtime arch specific override Sumit Garg
@ 2020-10-29 14:56 ` Sumit Garg
  2020-10-29 14:56 ` [PATCH v6 6/7] kgdb: roundup: Allow runtime arch specific override Sumit Garg
  2020-10-29 14:56 ` [PATCH v6 7/7] arm64: kgdb: Roundup cpus using IPI as NMI Sumit Garg
  6 siblings, 0 replies; 14+ messages in thread
From: Sumit Garg @ 2020-10-29 14:56 UTC (permalink / raw)
  To: maz, catalin.marinas, will
  Cc: linux-arm-kernel, tglx, jason, linux, tsbogend, mpe, davem,
	mingo, bp, x86, mark.rutland, julien.thierry.kdev, dianders,
	daniel.thompson, jason.wessel, msys.mizuma, ito-yuichi,
	kgdb-bugreport, linux-kernel, Sumit Garg

Enable NMI backtrace support on arm64 using IPI turned as an NMI
leveraging pseudo NMIs support. It is now possible for users to get a
backtrace of a CPU stuck in hard-lockup using magic SYSRQ.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
---
 arch/arm64/include/asm/irq.h |  6 ++++++
 arch/arm64/kernel/ipi_nmi.c  | 18 ++++++++++++++++--
 2 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index b2b0c64..ef018a8 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -6,6 +6,12 @@
 
 #include <asm-generic/irq.h>
 
+#ifdef CONFIG_SMP
+extern bool arch_trigger_cpumask_backtrace(const cpumask_t *mask,
+					   bool exclude_self);
+#define arch_trigger_cpumask_backtrace arch_trigger_cpumask_backtrace
+#endif
+
 struct pt_regs;
 
 static inline int nr_legacy_irqs(void)
diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
index a945dcf..597dcf7 100644
--- a/arch/arm64/kernel/ipi_nmi.c
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -8,6 +8,7 @@
 
 #include <linux/interrupt.h>
 #include <linux/irq.h>
+#include <linux/nmi.h>
 #include <linux/smp.h>
 
 #include <asm/nmi.h>
@@ -31,11 +32,24 @@ void arm64_send_nmi(cpumask_t *mask)
 	__ipi_send_mask(ipi_nmi_desc, mask);
 }
 
+bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
+{
+	if (!ipi_nmi_desc)
+		return false;
+
+	nmi_trigger_cpumask_backtrace(mask, exclude_self, arm64_send_nmi);
+
+	return true;
+}
+
 static irqreturn_t ipi_nmi_handler(int irq, void *data)
 {
-	/* nop, NMI handlers for special features can be added here. */
+	irqreturn_t ret = IRQ_NONE;
+
+	if (nmi_cpu_backtrace(get_irq_regs()))
+		ret = IRQ_HANDLED;
 
-	return IRQ_NONE;
+	return ret;
 }
 
 void dynamic_ipi_setup(int cpu)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 6/7] kgdb: roundup: Allow runtime arch specific override
  2020-10-29 14:56 [PATCH v6 0/7] arm64: Add framework to turn an IPI as NMI Sumit Garg
                   ` (4 preceding siblings ...)
  2020-10-29 14:56 ` [PATCH v6 5/7] arm64: ipi_nmi: Add support for NMI backtrace Sumit Garg
@ 2020-10-29 14:56 ` Sumit Garg
  2020-10-29 15:21   ` Daniel Thompson
  2020-10-29 14:56 ` [PATCH v6 7/7] arm64: kgdb: Roundup cpus using IPI as NMI Sumit Garg
  6 siblings, 1 reply; 14+ messages in thread
From: Sumit Garg @ 2020-10-29 14:56 UTC (permalink / raw)
  To: maz, catalin.marinas, will
  Cc: linux-arm-kernel, tglx, jason, linux, tsbogend, mpe, davem,
	mingo, bp, x86, mark.rutland, julien.thierry.kdev, dianders,
	daniel.thompson, jason.wessel, msys.mizuma, ito-yuichi,
	kgdb-bugreport, linux-kernel, Sumit Garg

Add a new API kgdb_arch_roundup_cpus() for a particular archichecture to
override default kgdb roundup and if it detects at runtime to not support
NMI roundup then it can fallback to default implementation using async
SMP cross-calls.

Currently such an architecture example is arm64 supporting pseudo NMIs
feature which is only available on platforms which have support for GICv3
or later version.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
---
 arch/powerpc/kernel/kgdb.c |  3 ++-
 arch/sparc/kernel/smp_64.c |  3 ++-
 arch/x86/kernel/kgdb.c     |  6 ++++--
 include/linux/kgdb.h       |  5 +++--
 kernel/debug/debug_core.c  | 10 +++++++++-
 5 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/kgdb.c b/arch/powerpc/kernel/kgdb.c
index 4090802..126575d 100644
--- a/arch/powerpc/kernel/kgdb.c
+++ b/arch/powerpc/kernel/kgdb.c
@@ -125,9 +125,10 @@ static int kgdb_debugger_ipi(struct pt_regs *regs)
 }
 
 #ifdef CONFIG_SMP
-void kgdb_roundup_cpus(void)
+bool kgdb_arch_roundup_cpus(void)
 {
 	smp_send_debugger_break();
+	return true;
 }
 #endif
 
diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
index e38d8bf..c459c83 100644
--- a/arch/sparc/kernel/smp_64.c
+++ b/arch/sparc/kernel/smp_64.c
@@ -1014,9 +1014,10 @@ void flush_dcache_page_all(struct mm_struct *mm, struct page *page)
 }
 
 #ifdef CONFIG_KGDB
-void kgdb_roundup_cpus(void)
+bool kgdb_arch_roundup_cpus(void)
 {
 	smp_cross_call(&xcall_kgdb_capture, 0, 0, 0);
+	return true;
 }
 #endif
 
diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
index ff7878d..1b756d9 100644
--- a/arch/x86/kernel/kgdb.c
+++ b/arch/x86/kernel/kgdb.c
@@ -404,7 +404,8 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
 
 #ifdef CONFIG_SMP
 /**
- *	kgdb_roundup_cpus - Get other CPUs into a holding pattern
+ *	kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
+ *				 in an architectural specific manner
  *
  *	On SMP systems, we need to get the attention of the other CPUs
  *	and get them be in a known state.  This should do what is needed
@@ -414,9 +415,10 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
  *
  *	On non-SMP systems, this is not called.
  */
-void kgdb_roundup_cpus(void)
+bool kgdb_arch_roundup_cpus(void)
 {
 	apic_send_IPI_allbutself(NMI_VECTOR);
+	return true;
 }
 #endif
 
diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
index 0d6cf64..f9db5b8 100644
--- a/include/linux/kgdb.h
+++ b/include/linux/kgdb.h
@@ -200,7 +200,8 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
 extern void kgdb_call_nmi_hook(void *ignored);
 
 /**
- *	kgdb_roundup_cpus - Get other CPUs into a holding pattern
+ *	kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
+ *				 in an architectural specific manner
  *
  *	On SMP systems, we need to get the attention of the other CPUs
  *	and get them into a known state.  This should do what is needed
@@ -210,7 +211,7 @@ extern void kgdb_call_nmi_hook(void *ignored);
  *
  *	On non-SMP systems, this is not called.
  */
-extern void kgdb_roundup_cpus(void);
+extern bool kgdb_arch_roundup_cpus(void);
 
 /**
  *	kgdb_arch_set_pc - Generic call back to the program counter
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
index 1e75a89..27e401c 100644
--- a/kernel/debug/debug_core.c
+++ b/kernel/debug/debug_core.c
@@ -241,13 +241,21 @@ void __weak kgdb_call_nmi_hook(void *ignored)
 }
 NOKPROBE_SYMBOL(kgdb_call_nmi_hook);
 
-void __weak kgdb_roundup_cpus(void)
+bool __weak kgdb_arch_roundup_cpus(void)
+{
+	return false;
+}
+
+static void kgdb_roundup_cpus(void)
 {
 	call_single_data_t *csd;
 	int this_cpu = raw_smp_processor_id();
 	int cpu;
 	int ret;
 
+	if (kgdb_arch_roundup_cpus())
+		return;
+
 	for_each_online_cpu(cpu) {
 		/* No need to roundup ourselves */
 		if (cpu == this_cpu)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH v6 7/7] arm64: kgdb: Roundup cpus using IPI as NMI
  2020-10-29 14:56 [PATCH v6 0/7] arm64: Add framework to turn an IPI as NMI Sumit Garg
                   ` (5 preceding siblings ...)
  2020-10-29 14:56 ` [PATCH v6 6/7] kgdb: roundup: Allow runtime arch specific override Sumit Garg
@ 2020-10-29 14:56 ` Sumit Garg
  2020-10-29 16:22   ` Daniel Thompson
  6 siblings, 1 reply; 14+ messages in thread
From: Sumit Garg @ 2020-10-29 14:56 UTC (permalink / raw)
  To: maz, catalin.marinas, will
  Cc: linux-arm-kernel, tglx, jason, linux, tsbogend, mpe, davem,
	mingo, bp, x86, mark.rutland, julien.thierry.kdev, dianders,
	daniel.thompson, jason.wessel, msys.mizuma, ito-yuichi,
	kgdb-bugreport, linux-kernel, Sumit Garg

arm64 platforms with GICv3 or later supports pseudo NMIs which can be
leveraged to roundup CPUs which are stuck in hard lockup state with
interrupts disabled that wouldn't be possible with a normal IPI.

So instead switch to roundup CPUs using IPI turned as NMI. And in
case a particular arm64 platform doesn't supports pseudo NMIs,
it will switch back to default kgdb CPUs roundup mechanism.

Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
---
 arch/arm64/include/asm/kgdb.h |  9 +++++++++
 arch/arm64/kernel/ipi_nmi.c   |  5 +++++
 arch/arm64/kernel/kgdb.c      | 35 +++++++++++++++++++++++++++++++++++
 3 files changed, 49 insertions(+)

diff --git a/arch/arm64/include/asm/kgdb.h b/arch/arm64/include/asm/kgdb.h
index 21fc85e..c3d2425 100644
--- a/arch/arm64/include/asm/kgdb.h
+++ b/arch/arm64/include/asm/kgdb.h
@@ -24,6 +24,15 @@ static inline void arch_kgdb_breakpoint(void)
 extern void kgdb_handle_bus_error(void);
 extern int kgdb_fault_expected;
 
+#ifdef CONFIG_KGDB
+extern bool kgdb_ipi_nmicallback(int cpu, void *regs);
+#else
+static inline bool kgdb_ipi_nmicallback(int cpu, void *regs)
+{
+	return false;
+}
+#endif
+
 #endif /* !__ASSEMBLY__ */
 
 /*
diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
index 597dcf7..6ace182 100644
--- a/arch/arm64/kernel/ipi_nmi.c
+++ b/arch/arm64/kernel/ipi_nmi.c
@@ -8,6 +8,7 @@
 
 #include <linux/interrupt.h>
 #include <linux/irq.h>
+#include <linux/kgdb.h>
 #include <linux/nmi.h>
 #include <linux/smp.h>
 
@@ -45,10 +46,14 @@ bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
 static irqreturn_t ipi_nmi_handler(int irq, void *data)
 {
 	irqreturn_t ret = IRQ_NONE;
+	unsigned int cpu = smp_processor_id();
 
 	if (nmi_cpu_backtrace(get_irq_regs()))
 		ret = IRQ_HANDLED;
 
+	if (kgdb_ipi_nmicallback(cpu, get_irq_regs()))
+		ret = IRQ_HANDLED;
+
 	return ret;
 }
 
diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
index 1a157ca3..c26e710 100644
--- a/arch/arm64/kernel/kgdb.c
+++ b/arch/arm64/kernel/kgdb.c
@@ -17,6 +17,7 @@
 
 #include <asm/debug-monitors.h>
 #include <asm/insn.h>
+#include <asm/nmi.h>
 #include <asm/traps.h>
 
 struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
@@ -353,3 +354,37 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
 	return aarch64_insn_write((void *)bpt->bpt_addr,
 			*(u32 *)bpt->saved_instr);
 }
+
+bool kgdb_ipi_nmicallback(int cpu, void *regs)
+{
+	if (atomic_read(&kgdb_active) != -1) {
+		kgdb_nmicallback(cpu, regs);
+		return true;
+	}
+
+	return false;
+}
+
+static void kgdb_smp_callback(void *data)
+{
+	unsigned int cpu = smp_processor_id();
+
+	if (atomic_read(&kgdb_active) != -1)
+		kgdb_nmicallback(cpu, get_irq_regs());
+}
+
+bool kgdb_arch_roundup_cpus(void)
+{
+	struct cpumask mask;
+
+	if (!arm64_supports_nmi())
+		return false;
+
+	cpumask_copy(&mask, cpu_online_mask);
+	cpumask_clear_cpu(raw_smp_processor_id(), &mask);
+	if (cpumask_empty(&mask))
+		return false;
+
+	arm64_send_nmi(&mask);
+	return true;
+}
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 6/7] kgdb: roundup: Allow runtime arch specific override
  2020-10-29 14:56 ` [PATCH v6 6/7] kgdb: roundup: Allow runtime arch specific override Sumit Garg
@ 2020-10-29 15:21   ` Daniel Thompson
  2020-11-02  6:18     ` Sumit Garg
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Thompson @ 2020-10-29 15:21 UTC (permalink / raw)
  To: Sumit Garg
  Cc: maz, catalin.marinas, will, linux-arm-kernel, tglx, jason, linux,
	tsbogend, mpe, davem, mingo, bp, x86, mark.rutland,
	julien.thierry.kdev, dianders, jason.wessel, msys.mizuma,
	ito-yuichi, kgdb-bugreport, linux-kernel

On Thu, Oct 29, 2020 at 08:26:26PM +0530, Sumit Garg wrote:
> Add a new API kgdb_arch_roundup_cpus() for a particular archichecture to
> override default kgdb roundup and if it detects at runtime to not support
> NMI roundup then it can fallback to default implementation using async
> SMP cross-calls.
> 
> Currently such an architecture example is arm64 supporting pseudo NMIs
> feature which is only available on platforms which have support for GICv3
> or later version.
> 
> Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> ---
>  arch/powerpc/kernel/kgdb.c |  3 ++-
>  arch/sparc/kernel/smp_64.c |  3 ++-
>  arch/x86/kernel/kgdb.c     |  6 ++++--
>  include/linux/kgdb.h       |  5 +++--
>  kernel/debug/debug_core.c  | 10 +++++++++-
>  5 files changed, 20 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/kgdb.c b/arch/powerpc/kernel/kgdb.c
> index 4090802..126575d 100644
> --- a/arch/powerpc/kernel/kgdb.c
> +++ b/arch/powerpc/kernel/kgdb.c
> @@ -125,9 +125,10 @@ static int kgdb_debugger_ipi(struct pt_regs *regs)
>  }
>  
>  #ifdef CONFIG_SMP
> -void kgdb_roundup_cpus(void)
> +bool kgdb_arch_roundup_cpus(void)
>  {
>  	smp_send_debugger_break();
> +	return true;
>  }
>  #endif
>  
> diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
> index e38d8bf..c459c83 100644
> --- a/arch/sparc/kernel/smp_64.c
> +++ b/arch/sparc/kernel/smp_64.c
> @@ -1014,9 +1014,10 @@ void flush_dcache_page_all(struct mm_struct *mm, struct page *page)
>  }
>  
>  #ifdef CONFIG_KGDB
> -void kgdb_roundup_cpus(void)
> +bool kgdb_arch_roundup_cpus(void)
>  {
>  	smp_cross_call(&xcall_kgdb_capture, 0, 0, 0);
> +	return true;
>  }
>  #endif
>  
> diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
> index ff7878d..1b756d9 100644
> --- a/arch/x86/kernel/kgdb.c
> +++ b/arch/x86/kernel/kgdb.c
> @@ -404,7 +404,8 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
>  
>  #ifdef CONFIG_SMP
>  /**
> - *	kgdb_roundup_cpus - Get other CPUs into a holding pattern
> + *	kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
> + *				 in an architectural specific manner
>   *
>   *	On SMP systems, we need to get the attention of the other CPUs
>   *	and get them be in a known state.  This should do what is needed
> @@ -414,9 +415,10 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
>   *
>   *	On non-SMP systems, this is not called.
>   */
> -void kgdb_roundup_cpus(void)
> +bool kgdb_arch_roundup_cpus(void)
>  {
>  	apic_send_IPI_allbutself(NMI_VECTOR);
> +	return true;
>  }
>  #endif
>  
> diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> index 0d6cf64..f9db5b8 100644
> --- a/include/linux/kgdb.h
> +++ b/include/linux/kgdb.h
> @@ -200,7 +200,8 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
>  extern void kgdb_call_nmi_hook(void *ignored);
>  
>  /**
> - *	kgdb_roundup_cpus - Get other CPUs into a holding pattern
> + *	kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
> + *				 in an architectural specific manner
>   *
>   *	On SMP systems, we need to get the attention of the other CPUs
>   *	and get them into a known state.  This should do what is needed
> @@ -210,7 +211,7 @@ extern void kgdb_call_nmi_hook(void *ignored);
>   *
>   *	On non-SMP systems, this is not called.
>   */
> -extern void kgdb_roundup_cpus(void);
> +extern bool kgdb_arch_roundup_cpus(void);
>  
>  /**
>   *	kgdb_arch_set_pc - Generic call back to the program counter
> diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> index 1e75a89..27e401c 100644
> --- a/kernel/debug/debug_core.c
> +++ b/kernel/debug/debug_core.c
> @@ -241,13 +241,21 @@ void __weak kgdb_call_nmi_hook(void *ignored)
>  }
>  NOKPROBE_SYMBOL(kgdb_call_nmi_hook);
>  
> -void __weak kgdb_roundup_cpus(void)
> +bool __weak kgdb_arch_roundup_cpus(void)
> +{
> +	return false;

Do we really need to introduce all these boolean return values just to
call a bit of library code when one of the architectures wants to
fallback to a generic implementation?

Why not something more like:

void kgdb_smp_call_nmi_hook(void)
{
    /* original weak version of kgdb_roundup_cpus goes here */
}

void __weak kgdb_roundup_cpus(void)
{
    kgdb_smp_call_nmi_hook();
}

arm64 can now simply call the new library function if a fallback is needed. 

Note that same question applies to the backtrace changes...


Daniel.


> +}
> +
> +static void kgdb_roundup_cpus(void)
>  {
>  	call_single_data_t *csd;
>  	int this_cpu = raw_smp_processor_id();
>  	int cpu;
>  	int ret;
>  
> +	if (kgdb_arch_roundup_cpus())
> +		return;
> +
>  	for_each_online_cpu(cpu) {
>  		/* No need to roundup ourselves */
>  		if (cpu == this_cpu)
> -- 
> 2.7.4
> 

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 7/7] arm64: kgdb: Roundup cpus using IPI as NMI
  2020-10-29 14:56 ` [PATCH v6 7/7] arm64: kgdb: Roundup cpus using IPI as NMI Sumit Garg
@ 2020-10-29 16:22   ` Daniel Thompson
  2020-10-29 16:39     ` Daniel Thompson
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Thompson @ 2020-10-29 16:22 UTC (permalink / raw)
  To: Sumit Garg
  Cc: maz, catalin.marinas, will, linux-arm-kernel, tglx, jason, linux,
	tsbogend, mpe, davem, mingo, bp, x86, mark.rutland,
	julien.thierry.kdev, dianders, jason.wessel, msys.mizuma,
	ito-yuichi, kgdb-bugreport, linux-kernel

On Thu, Oct 29, 2020 at 08:26:27PM +0530, Sumit Garg wrote:
> arm64 platforms with GICv3 or later supports pseudo NMIs which can be
> leveraged to roundup CPUs which are stuck in hard lockup state with
> interrupts disabled that wouldn't be possible with a normal IPI.
> 
> So instead switch to roundup CPUs using IPI turned as NMI. And in
> case a particular arm64 platform doesn't supports pseudo NMIs,
> it will switch back to default kgdb CPUs roundup mechanism.
> 
> Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> ---
>  arch/arm64/include/asm/kgdb.h |  9 +++++++++
>  arch/arm64/kernel/ipi_nmi.c   |  5 +++++
>  arch/arm64/kernel/kgdb.c      | 35 +++++++++++++++++++++++++++++++++++
>  3 files changed, 49 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/kgdb.h b/arch/arm64/include/asm/kgdb.h
> index 21fc85e..c3d2425 100644
> --- a/arch/arm64/include/asm/kgdb.h
> +++ b/arch/arm64/include/asm/kgdb.h
> @@ -24,6 +24,15 @@ static inline void arch_kgdb_breakpoint(void)
>  extern void kgdb_handle_bus_error(void);
>  extern int kgdb_fault_expected;
>  
> +#ifdef CONFIG_KGDB
> +extern bool kgdb_ipi_nmicallback(int cpu, void *regs);
> +#else
> +static inline bool kgdb_ipi_nmicallback(int cpu, void *regs)
> +{
> +	return false;
> +}
> +#endif
> +
>  #endif /* !__ASSEMBLY__ */
>  
>  /*
> diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
> index 597dcf7..6ace182 100644
> --- a/arch/arm64/kernel/ipi_nmi.c
> +++ b/arch/arm64/kernel/ipi_nmi.c
> @@ -8,6 +8,7 @@
>  
>  #include <linux/interrupt.h>
>  #include <linux/irq.h>
> +#include <linux/kgdb.h>
>  #include <linux/nmi.h>
>  #include <linux/smp.h>
>  
> @@ -45,10 +46,14 @@ bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
>  static irqreturn_t ipi_nmi_handler(int irq, void *data)
>  {
>  	irqreturn_t ret = IRQ_NONE;
> +	unsigned int cpu = smp_processor_id();
>  
>  	if (nmi_cpu_backtrace(get_irq_regs()))
>  		ret = IRQ_HANDLED;
>  
> +	if (kgdb_ipi_nmicallback(cpu, get_irq_regs()))
> +		ret = IRQ_HANDLED;
> +
>  	return ret;

It would be better to declare existing return value for
kgdb_nmicallback() to be dangerously stupid and fix it so it returns an
irqreturn_t (that's easy since most callers do not need to check the
return value).

Then this code simply becomes:

	return kgdb_nmicallback(cpu, get_irq_regs());


>  }
>  
> diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
> index 1a157ca3..c26e710 100644
> --- a/arch/arm64/kernel/kgdb.c
> +++ b/arch/arm64/kernel/kgdb.c
> @@ -17,6 +17,7 @@
>  
>  #include <asm/debug-monitors.h>
>  #include <asm/insn.h>
> +#include <asm/nmi.h>
>  #include <asm/traps.h>
>  
>  struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
> @@ -353,3 +354,37 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
>  	return aarch64_insn_write((void *)bpt->bpt_addr,
>  			*(u32 *)bpt->saved_instr);
>  }
> +
> +bool kgdb_ipi_nmicallback(int cpu, void *regs)
> +{
> +	if (atomic_read(&kgdb_active) != -1) {
> +		kgdb_nmicallback(cpu, regs);
> +		return true;
> +	}
> +
> +	return false;
> +}

I *really* don't like this function.

If the return code of kgdb_nmicallback() is broken then fix it, don't
just wrap it and invent a new criteria for the return code.

To be honest I don't actually think the logic in kgdb_nmicallback() is
broken. As mentioned above the return value has a weird definition (0
for "handled it OK" and 1 for "nothing for me to do") but the logic to
calculate the return code looks OK.


> +
> +static void kgdb_smp_callback(void *data)
> +{
> +	unsigned int cpu = smp_processor_id();
> +
> +	if (atomic_read(&kgdb_active) != -1)
> +		kgdb_nmicallback(cpu, get_irq_regs());
> +}

This is Unused. I presume it is litter from a previous revision of the
code and can be deleted?


> +
> +bool kgdb_arch_roundup_cpus(void)
> +{
> +	struct cpumask mask;
> +
> +	if (!arm64_supports_nmi())
> +		return false;
> +
> +	cpumask_copy(&mask, cpu_online_mask);
> +	cpumask_clear_cpu(raw_smp_processor_id(), &mask);
> +	if (cpumask_empty(&mask))
> +		return false;

Why do we need to fallback if there is no work to do? There will still
be no work to do when we call the fallback.


Daniel.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 7/7] arm64: kgdb: Roundup cpus using IPI as NMI
  2020-10-29 16:22   ` Daniel Thompson
@ 2020-10-29 16:39     ` Daniel Thompson
  2020-11-02  6:59       ` Sumit Garg
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Thompson @ 2020-10-29 16:39 UTC (permalink / raw)
  To: Sumit Garg
  Cc: maz, catalin.marinas, will, linux-arm-kernel, tglx, jason, linux,
	tsbogend, mpe, davem, mingo, bp, x86, mark.rutland,
	julien.thierry.kdev, dianders, jason.wessel, msys.mizuma,
	ito-yuichi, kgdb-bugreport, linux-kernel

On Thu, Oct 29, 2020 at 04:22:34PM +0000, Daniel Thompson wrote:
> On Thu, Oct 29, 2020 at 08:26:27PM +0530, Sumit Garg wrote:
> > arm64 platforms with GICv3 or later supports pseudo NMIs which can be
> > leveraged to roundup CPUs which are stuck in hard lockup state with
> > interrupts disabled that wouldn't be possible with a normal IPI.
> > 
> > So instead switch to roundup CPUs using IPI turned as NMI. And in
> > case a particular arm64 platform doesn't supports pseudo NMIs,
> > it will switch back to default kgdb CPUs roundup mechanism.
> > 
> > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > ---
> >  arch/arm64/include/asm/kgdb.h |  9 +++++++++
> >  arch/arm64/kernel/ipi_nmi.c   |  5 +++++
> >  arch/arm64/kernel/kgdb.c      | 35 +++++++++++++++++++++++++++++++++++
> >  3 files changed, 49 insertions(+)
> > 
> > diff --git a/arch/arm64/include/asm/kgdb.h b/arch/arm64/include/asm/kgdb.h
> > index 21fc85e..c3d2425 100644
> > --- a/arch/arm64/include/asm/kgdb.h
> > +++ b/arch/arm64/include/asm/kgdb.h
> > @@ -24,6 +24,15 @@ static inline void arch_kgdb_breakpoint(void)
> >  extern void kgdb_handle_bus_error(void);
> >  extern int kgdb_fault_expected;
> >  
> > +#ifdef CONFIG_KGDB
> > +extern bool kgdb_ipi_nmicallback(int cpu, void *regs);
> > +#else
> > +static inline bool kgdb_ipi_nmicallback(int cpu, void *regs)
> > +{
> > +	return false;
> > +}
> > +#endif
> > +
> >  #endif /* !__ASSEMBLY__ */
> >  
> >  /*
> > diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
> > index 597dcf7..6ace182 100644
> > --- a/arch/arm64/kernel/ipi_nmi.c
> > +++ b/arch/arm64/kernel/ipi_nmi.c
> > @@ -8,6 +8,7 @@
> >  
> >  #include <linux/interrupt.h>
> >  #include <linux/irq.h>
> > +#include <linux/kgdb.h>
> >  #include <linux/nmi.h>
> >  #include <linux/smp.h>
> >  
> > @@ -45,10 +46,14 @@ bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
> >  static irqreturn_t ipi_nmi_handler(int irq, void *data)
> >  {
> >  	irqreturn_t ret = IRQ_NONE;
> > +	unsigned int cpu = smp_processor_id();
> >  
> >  	if (nmi_cpu_backtrace(get_irq_regs()))
> >  		ret = IRQ_HANDLED;
> >  
> > +	if (kgdb_ipi_nmicallback(cpu, get_irq_regs()))
> > +		ret = IRQ_HANDLED;
> > +
> >  	return ret;
> 
> It would be better to declare existing return value for
> kgdb_nmicallback() to be dangerously stupid and fix it so it returns an
> irqreturn_t (that's easy since most callers do not need to check the
> return value).
> 
> Then this code simply becomes:
> 
> 	return kgdb_nmicallback(cpu, get_irq_regs());

Actually, reflecting on this maybe it is better to keep kgdb_nmicallin()
and kgdb_nmicallback() aligned w.r.t. return codes (even if they are a
little unusual).

I'm still not sure why we'd keep kgdb_ipi_nmicallback() though.
kgdb_nmicallback() is intended to be called from arch code...


Daniel.


> 
> 
> >  }
> >  
> > diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
> > index 1a157ca3..c26e710 100644
> > --- a/arch/arm64/kernel/kgdb.c
> > +++ b/arch/arm64/kernel/kgdb.c
> > @@ -17,6 +17,7 @@
> >  
> >  #include <asm/debug-monitors.h>
> >  #include <asm/insn.h>
> > +#include <asm/nmi.h>
> >  #include <asm/traps.h>
> >  
> >  struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
> > @@ -353,3 +354,37 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
> >  	return aarch64_insn_write((void *)bpt->bpt_addr,
> >  			*(u32 *)bpt->saved_instr);
> >  }
> > +
> > +bool kgdb_ipi_nmicallback(int cpu, void *regs)
> > +{
> > +	if (atomic_read(&kgdb_active) != -1) {
> > +		kgdb_nmicallback(cpu, regs);
> > +		return true;
> > +	}
> > +
> > +	return false;
> > +}
> 
> I *really* don't like this function.
> 
> If the return code of kgdb_nmicallback() is broken then fix it, don't
> just wrap it and invent a new criteria for the return code.
> 
> To be honest I don't actually think the logic in kgdb_nmicallback() is
> broken. As mentioned above the return value has a weird definition (0
> for "handled it OK" and 1 for "nothing for me to do") but the logic to
> calculate the return code looks OK.
> 
> 
> > +
> > +static void kgdb_smp_callback(void *data)
> > +{
> > +	unsigned int cpu = smp_processor_id();
> > +
> > +	if (atomic_read(&kgdb_active) != -1)
> > +		kgdb_nmicallback(cpu, get_irq_regs());
> > +}
> 
> This is Unused. I presume it is litter from a previous revision of the
> code and can be deleted?
> 
> 
> > +
> > +bool kgdb_arch_roundup_cpus(void)
> > +{
> > +	struct cpumask mask;
> > +
> > +	if (!arm64_supports_nmi())
> > +		return false;
> > +
> > +	cpumask_copy(&mask, cpu_online_mask);
> > +	cpumask_clear_cpu(raw_smp_processor_id(), &mask);
> > +	if (cpumask_empty(&mask))
> > +		return false;
> 
> Why do we need to fallback if there is no work to do? There will still
> be no work to do when we call the fallback.
> 
> 
> Daniel.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 6/7] kgdb: roundup: Allow runtime arch specific override
  2020-10-29 15:21   ` Daniel Thompson
@ 2020-11-02  6:18     ` Sumit Garg
  2020-11-02  9:19       ` Daniel Thompson
  0 siblings, 1 reply; 14+ messages in thread
From: Sumit Garg @ 2020-11-02  6:18 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, linux-arm-kernel,
	Thomas Gleixner, Jason Cooper, Russell King - ARM Linux admin,
	tsbogend, mpe, David S. Miller, mingo, bp, x86, Mark Rutland,
	julien.thierry.kdev, Douglas Anderson, Jason Wessel,
	Masayoshi Mizuma, ito-yuichi, kgdb-bugreport,
	Linux Kernel Mailing List

On Thu, 29 Oct 2020 at 20:51, Daniel Thompson
<daniel.thompson@linaro.org> wrote:
>
> On Thu, Oct 29, 2020 at 08:26:26PM +0530, Sumit Garg wrote:
> > Add a new API kgdb_arch_roundup_cpus() for a particular archichecture to
> > override default kgdb roundup and if it detects at runtime to not support
> > NMI roundup then it can fallback to default implementation using async
> > SMP cross-calls.
> >
> > Currently such an architecture example is arm64 supporting pseudo NMIs
> > feature which is only available on platforms which have support for GICv3
> > or later version.
> >
> > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > ---
> >  arch/powerpc/kernel/kgdb.c |  3 ++-
> >  arch/sparc/kernel/smp_64.c |  3 ++-
> >  arch/x86/kernel/kgdb.c     |  6 ++++--
> >  include/linux/kgdb.h       |  5 +++--
> >  kernel/debug/debug_core.c  | 10 +++++++++-
> >  5 files changed, 20 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/kgdb.c b/arch/powerpc/kernel/kgdb.c
> > index 4090802..126575d 100644
> > --- a/arch/powerpc/kernel/kgdb.c
> > +++ b/arch/powerpc/kernel/kgdb.c
> > @@ -125,9 +125,10 @@ static int kgdb_debugger_ipi(struct pt_regs *regs)
> >  }
> >
> >  #ifdef CONFIG_SMP
> > -void kgdb_roundup_cpus(void)
> > +bool kgdb_arch_roundup_cpus(void)
> >  {
> >       smp_send_debugger_break();
> > +     return true;
> >  }
> >  #endif
> >
> > diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
> > index e38d8bf..c459c83 100644
> > --- a/arch/sparc/kernel/smp_64.c
> > +++ b/arch/sparc/kernel/smp_64.c
> > @@ -1014,9 +1014,10 @@ void flush_dcache_page_all(struct mm_struct *mm, struct page *page)
> >  }
> >
> >  #ifdef CONFIG_KGDB
> > -void kgdb_roundup_cpus(void)
> > +bool kgdb_arch_roundup_cpus(void)
> >  {
> >       smp_cross_call(&xcall_kgdb_capture, 0, 0, 0);
> > +     return true;
> >  }
> >  #endif
> >
> > diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
> > index ff7878d..1b756d9 100644
> > --- a/arch/x86/kernel/kgdb.c
> > +++ b/arch/x86/kernel/kgdb.c
> > @@ -404,7 +404,8 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
> >
> >  #ifdef CONFIG_SMP
> >  /**
> > - *   kgdb_roundup_cpus - Get other CPUs into a holding pattern
> > + *   kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
> > + *                            in an architectural specific manner
> >   *
> >   *   On SMP systems, we need to get the attention of the other CPUs
> >   *   and get them be in a known state.  This should do what is needed
> > @@ -414,9 +415,10 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
> >   *
> >   *   On non-SMP systems, this is not called.
> >   */
> > -void kgdb_roundup_cpus(void)
> > +bool kgdb_arch_roundup_cpus(void)
> >  {
> >       apic_send_IPI_allbutself(NMI_VECTOR);
> > +     return true;
> >  }
> >  #endif
> >
> > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> > index 0d6cf64..f9db5b8 100644
> > --- a/include/linux/kgdb.h
> > +++ b/include/linux/kgdb.h
> > @@ -200,7 +200,8 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
> >  extern void kgdb_call_nmi_hook(void *ignored);
> >
> >  /**
> > - *   kgdb_roundup_cpus - Get other CPUs into a holding pattern
> > + *   kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
> > + *                            in an architectural specific manner
> >   *
> >   *   On SMP systems, we need to get the attention of the other CPUs
> >   *   and get them into a known state.  This should do what is needed
> > @@ -210,7 +211,7 @@ extern void kgdb_call_nmi_hook(void *ignored);
> >   *
> >   *   On non-SMP systems, this is not called.
> >   */
> > -extern void kgdb_roundup_cpus(void);
> > +extern bool kgdb_arch_roundup_cpus(void);
> >
> >  /**
> >   *   kgdb_arch_set_pc - Generic call back to the program counter
> > diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> > index 1e75a89..27e401c 100644
> > --- a/kernel/debug/debug_core.c
> > +++ b/kernel/debug/debug_core.c
> > @@ -241,13 +241,21 @@ void __weak kgdb_call_nmi_hook(void *ignored)
> >  }
> >  NOKPROBE_SYMBOL(kgdb_call_nmi_hook);
> >
> > -void __weak kgdb_roundup_cpus(void)
> > +bool __weak kgdb_arch_roundup_cpus(void)
> > +{
> > +     return false;
>
> Do we really need to introduce all these boolean return values just to
> call a bit of library code when one of the architectures wants to
> fallback to a generic implementation?
>
> Why not something more like:
>
> void kgdb_smp_call_nmi_hook(void)
> {
>     /* original weak version of kgdb_roundup_cpus goes here */
> }
>
> void __weak kgdb_roundup_cpus(void)
> {
>     kgdb_smp_call_nmi_hook();
> }
>
> arm64 can now simply call the new library function if a fallback is needed.
>

Makes sense, I will use this approach instead.

> Note that same question applies to the backtrace changes...

In case of backtrace, there are already multiple APIs wrapping
arch_trigger_cpumask_backtrace() as follows:

- trigger_all_cpu_backtrace()
- trigger_allbutself_cpu_backtrace()
- trigger_cpumask_backtrace()
- trigger_single_cpu_backtrace()

And each of them already return a boolean and have corresponding
different fallback mechanisms. So we can't have a common fallback API
from arch specific code and instead need to extend that boolean return
to arch specific code that is implemented as part of patch #4.

If you do have any better ideas then do let me know.

-Sumit

>
>
> Daniel.
>
>
> > +}
> > +
> > +static void kgdb_roundup_cpus(void)
> >  {
> >       call_single_data_t *csd;
> >       int this_cpu = raw_smp_processor_id();
> >       int cpu;
> >       int ret;
> >
> > +     if (kgdb_arch_roundup_cpus())
> > +             return;
> > +
> >       for_each_online_cpu(cpu) {
> >               /* No need to roundup ourselves */
> >               if (cpu == this_cpu)
> > --
> > 2.7.4
> >

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 7/7] arm64: kgdb: Roundup cpus using IPI as NMI
  2020-10-29 16:39     ` Daniel Thompson
@ 2020-11-02  6:59       ` Sumit Garg
  0 siblings, 0 replies; 14+ messages in thread
From: Sumit Garg @ 2020-11-02  6:59 UTC (permalink / raw)
  To: Daniel Thompson
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, linux-arm-kernel,
	Thomas Gleixner, Jason Cooper, Russell King - ARM Linux admin,
	tsbogend, mpe, David S. Miller, mingo, bp, x86, Mark Rutland,
	julien.thierry.kdev, Douglas Anderson, Jason Wessel,
	Masayoshi Mizuma, ito-yuichi, kgdb-bugreport,
	Linux Kernel Mailing List

On Thu, 29 Oct 2020 at 22:09, Daniel Thompson
<daniel.thompson@linaro.org> wrote:
>
> On Thu, Oct 29, 2020 at 04:22:34PM +0000, Daniel Thompson wrote:
> > On Thu, Oct 29, 2020 at 08:26:27PM +0530, Sumit Garg wrote:
> > > arm64 platforms with GICv3 or later supports pseudo NMIs which can be
> > > leveraged to roundup CPUs which are stuck in hard lockup state with
> > > interrupts disabled that wouldn't be possible with a normal IPI.
> > >
> > > So instead switch to roundup CPUs using IPI turned as NMI. And in
> > > case a particular arm64 platform doesn't supports pseudo NMIs,
> > > it will switch back to default kgdb CPUs roundup mechanism.
> > >
> > > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > > ---
> > >  arch/arm64/include/asm/kgdb.h |  9 +++++++++
> > >  arch/arm64/kernel/ipi_nmi.c   |  5 +++++
> > >  arch/arm64/kernel/kgdb.c      | 35 +++++++++++++++++++++++++++++++++++
> > >  3 files changed, 49 insertions(+)
> > >
> > > diff --git a/arch/arm64/include/asm/kgdb.h b/arch/arm64/include/asm/kgdb.h
> > > index 21fc85e..c3d2425 100644
> > > --- a/arch/arm64/include/asm/kgdb.h
> > > +++ b/arch/arm64/include/asm/kgdb.h
> > > @@ -24,6 +24,15 @@ static inline void arch_kgdb_breakpoint(void)
> > >  extern void kgdb_handle_bus_error(void);
> > >  extern int kgdb_fault_expected;
> > >
> > > +#ifdef CONFIG_KGDB
> > > +extern bool kgdb_ipi_nmicallback(int cpu, void *regs);
> > > +#else
> > > +static inline bool kgdb_ipi_nmicallback(int cpu, void *regs)
> > > +{
> > > +   return false;
> > > +}
> > > +#endif
> > > +
> > >  #endif /* !__ASSEMBLY__ */
> > >
> > >  /*
> > > diff --git a/arch/arm64/kernel/ipi_nmi.c b/arch/arm64/kernel/ipi_nmi.c
> > > index 597dcf7..6ace182 100644
> > > --- a/arch/arm64/kernel/ipi_nmi.c
> > > +++ b/arch/arm64/kernel/ipi_nmi.c
> > > @@ -8,6 +8,7 @@
> > >
> > >  #include <linux/interrupt.h>
> > >  #include <linux/irq.h>
> > > +#include <linux/kgdb.h>
> > >  #include <linux/nmi.h>
> > >  #include <linux/smp.h>
> > >
> > > @@ -45,10 +46,14 @@ bool arch_trigger_cpumask_backtrace(const cpumask_t *mask, bool exclude_self)
> > >  static irqreturn_t ipi_nmi_handler(int irq, void *data)
> > >  {
> > >     irqreturn_t ret = IRQ_NONE;
> > > +   unsigned int cpu = smp_processor_id();
> > >
> > >     if (nmi_cpu_backtrace(get_irq_regs()))
> > >             ret = IRQ_HANDLED;
> > >
> > > +   if (kgdb_ipi_nmicallback(cpu, get_irq_regs()))
> > > +           ret = IRQ_HANDLED;
> > > +
> > >     return ret;
> >
> > It would be better to declare existing return value for
> > kgdb_nmicallback() to be dangerously stupid and fix it so it returns an
> > irqreturn_t (that's easy since most callers do not need to check the
> > return value).
> >
> > Then this code simply becomes:
> >
> >       return kgdb_nmicallback(cpu, get_irq_regs());
>
> Actually, reflecting on this maybe it is better to keep kgdb_nmicallin()
> and kgdb_nmicallback() aligned w.r.t. return codes (even if they are a
> little unusual).
>
> I'm still not sure why we'd keep kgdb_ipi_nmicallback() though.
> kgdb_nmicallback() is intended to be called from arch code...
>

I added kgdb_ipi_nmicallback() just to add a check for "kgdb_active"
prior to entry into kgdb as here we are sharing NMI among backtrace
and kgdb.

But after your comments, I looked carefully into kgdb_nmicallback()
and I see the "raw_spin_is_locked(&dbg_master_lock)" check as well. So
it looked sufficient to me for calling kgdb_nmicallback() directly
from the arch code and hence I will remove kgdb_ipi_nmicallback() in
the next version.

>
> Daniel.
>
>
> >
> >
> > >  }
> > >
> > > diff --git a/arch/arm64/kernel/kgdb.c b/arch/arm64/kernel/kgdb.c
> > > index 1a157ca3..c26e710 100644
> > > --- a/arch/arm64/kernel/kgdb.c
> > > +++ b/arch/arm64/kernel/kgdb.c
> > > @@ -17,6 +17,7 @@
> > >
> > >  #include <asm/debug-monitors.h>
> > >  #include <asm/insn.h>
> > > +#include <asm/nmi.h>
> > >  #include <asm/traps.h>
> > >
> > >  struct dbg_reg_def_t dbg_reg_def[DBG_MAX_REG_NUM] = {
> > > @@ -353,3 +354,37 @@ int kgdb_arch_remove_breakpoint(struct kgdb_bkpt *bpt)
> > >     return aarch64_insn_write((void *)bpt->bpt_addr,
> > >                     *(u32 *)bpt->saved_instr);
> > >  }
> > > +
> > > +bool kgdb_ipi_nmicallback(int cpu, void *regs)
> > > +{
> > > +   if (atomic_read(&kgdb_active) != -1) {
> > > +           kgdb_nmicallback(cpu, regs);
> > > +           return true;
> > > +   }
> > > +
> > > +   return false;
> > > +}
> >
> > I *really* don't like this function.
> >
> > If the return code of kgdb_nmicallback() is broken then fix it, don't
> > just wrap it and invent a new criteria for the return code.
> >
> > To be honest I don't actually think the logic in kgdb_nmicallback() is
> > broken. As mentioned above the return value has a weird definition (0
> > for "handled it OK" and 1 for "nothing for me to do") but the logic to
> > calculate the return code looks OK.
> >

Makes sense, will remove it instead.

> >
> > > +
> > > +static void kgdb_smp_callback(void *data)
> > > +{
> > > +   unsigned int cpu = smp_processor_id();
> > > +
> > > +   if (atomic_read(&kgdb_active) != -1)
> > > +           kgdb_nmicallback(cpu, get_irq_regs());
> > > +}
> >
> > This is Unused. I presume it is litter from a previous revision of the
> > code and can be deleted?
> >

Yeah.

> >
> > > +
> > > +bool kgdb_arch_roundup_cpus(void)
> > > +{
> > > +   struct cpumask mask;
> > > +
> > > +   if (!arm64_supports_nmi())
> > > +           return false;
> > > +
> > > +   cpumask_copy(&mask, cpu_online_mask);
> > > +   cpumask_clear_cpu(raw_smp_processor_id(), &mask);
> > > +   if (cpumask_empty(&mask))
> > > +           return false;
> >
> > Why do we need to fallback if there is no work to do? There will still
> > be no work to do when we call the fallback.

Okay, won't switch back to fallback mode here.

-Sumit

> >
> >
> > Daniel.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH v6 6/7] kgdb: roundup: Allow runtime arch specific override
  2020-11-02  6:18     ` Sumit Garg
@ 2020-11-02  9:19       ` Daniel Thompson
  0 siblings, 0 replies; 14+ messages in thread
From: Daniel Thompson @ 2020-11-02  9:19 UTC (permalink / raw)
  To: Sumit Garg
  Cc: Marc Zyngier, Catalin Marinas, Will Deacon, linux-arm-kernel,
	Thomas Gleixner, Jason Cooper, Russell King - ARM Linux admin,
	tsbogend, mpe, David S. Miller, mingo, bp, x86, Mark Rutland,
	julien.thierry.kdev, Douglas Anderson, Jason Wessel,
	Masayoshi Mizuma, ito-yuichi, kgdb-bugreport,
	Linux Kernel Mailing List

On Mon, Nov 02, 2020 at 11:48:53AM +0530, Sumit Garg wrote:
> On Thu, 29 Oct 2020 at 20:51, Daniel Thompson
> <daniel.thompson@linaro.org> wrote:
> >
> > On Thu, Oct 29, 2020 at 08:26:26PM +0530, Sumit Garg wrote:
> > > Add a new API kgdb_arch_roundup_cpus() for a particular archichecture to
> > > override default kgdb roundup and if it detects at runtime to not support
> > > NMI roundup then it can fallback to default implementation using async
> > > SMP cross-calls.
> > >
> > > Currently such an architecture example is arm64 supporting pseudo NMIs
> > > feature which is only available on platforms which have support for GICv3
> > > or later version.
> > >
> > > Signed-off-by: Sumit Garg <sumit.garg@linaro.org>
> > > ---
> > >  arch/powerpc/kernel/kgdb.c |  3 ++-
> > >  arch/sparc/kernel/smp_64.c |  3 ++-
> > >  arch/x86/kernel/kgdb.c     |  6 ++++--
> > >  include/linux/kgdb.h       |  5 +++--
> > >  kernel/debug/debug_core.c  | 10 +++++++++-
> > >  5 files changed, 20 insertions(+), 7 deletions(-)
> > >
> > > diff --git a/arch/powerpc/kernel/kgdb.c b/arch/powerpc/kernel/kgdb.c
> > > index 4090802..126575d 100644
> > > --- a/arch/powerpc/kernel/kgdb.c
> > > +++ b/arch/powerpc/kernel/kgdb.c
> > > @@ -125,9 +125,10 @@ static int kgdb_debugger_ipi(struct pt_regs *regs)
> > >  }
> > >
> > >  #ifdef CONFIG_SMP
> > > -void kgdb_roundup_cpus(void)
> > > +bool kgdb_arch_roundup_cpus(void)
> > >  {
> > >       smp_send_debugger_break();
> > > +     return true;
> > >  }
> > >  #endif
> > >
> > > diff --git a/arch/sparc/kernel/smp_64.c b/arch/sparc/kernel/smp_64.c
> > > index e38d8bf..c459c83 100644
> > > --- a/arch/sparc/kernel/smp_64.c
> > > +++ b/arch/sparc/kernel/smp_64.c
> > > @@ -1014,9 +1014,10 @@ void flush_dcache_page_all(struct mm_struct *mm, struct page *page)
> > >  }
> > >
> > >  #ifdef CONFIG_KGDB
> > > -void kgdb_roundup_cpus(void)
> > > +bool kgdb_arch_roundup_cpus(void)
> > >  {
> > >       smp_cross_call(&xcall_kgdb_capture, 0, 0, 0);
> > > +     return true;
> > >  }
> > >  #endif
> > >
> > > diff --git a/arch/x86/kernel/kgdb.c b/arch/x86/kernel/kgdb.c
> > > index ff7878d..1b756d9 100644
> > > --- a/arch/x86/kernel/kgdb.c
> > > +++ b/arch/x86/kernel/kgdb.c
> > > @@ -404,7 +404,8 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
> > >
> > >  #ifdef CONFIG_SMP
> > >  /**
> > > - *   kgdb_roundup_cpus - Get other CPUs into a holding pattern
> > > + *   kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
> > > + *                            in an architectural specific manner
> > >   *
> > >   *   On SMP systems, we need to get the attention of the other CPUs
> > >   *   and get them be in a known state.  This should do what is needed
> > > @@ -414,9 +415,10 @@ static void kgdb_disable_hw_debug(struct pt_regs *regs)
> > >   *
> > >   *   On non-SMP systems, this is not called.
> > >   */
> > > -void kgdb_roundup_cpus(void)
> > > +bool kgdb_arch_roundup_cpus(void)
> > >  {
> > >       apic_send_IPI_allbutself(NMI_VECTOR);
> > > +     return true;
> > >  }
> > >  #endif
> > >
> > > diff --git a/include/linux/kgdb.h b/include/linux/kgdb.h
> > > index 0d6cf64..f9db5b8 100644
> > > --- a/include/linux/kgdb.h
> > > +++ b/include/linux/kgdb.h
> > > @@ -200,7 +200,8 @@ kgdb_arch_handle_qxfer_pkt(char *remcom_in_buffer,
> > >  extern void kgdb_call_nmi_hook(void *ignored);
> > >
> > >  /**
> > > - *   kgdb_roundup_cpus - Get other CPUs into a holding pattern
> > > + *   kgdb_arch_roundup_cpus - Get other CPUs into a holding pattern
> > > + *                            in an architectural specific manner
> > >   *
> > >   *   On SMP systems, we need to get the attention of the other CPUs
> > >   *   and get them into a known state.  This should do what is needed
> > > @@ -210,7 +211,7 @@ extern void kgdb_call_nmi_hook(void *ignored);
> > >   *
> > >   *   On non-SMP systems, this is not called.
> > >   */
> > > -extern void kgdb_roundup_cpus(void);
> > > +extern bool kgdb_arch_roundup_cpus(void);
> > >
> > >  /**
> > >   *   kgdb_arch_set_pc - Generic call back to the program counter
> > > diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c
> > > index 1e75a89..27e401c 100644
> > > --- a/kernel/debug/debug_core.c
> > > +++ b/kernel/debug/debug_core.c
> > > @@ -241,13 +241,21 @@ void __weak kgdb_call_nmi_hook(void *ignored)
> > >  }
> > >  NOKPROBE_SYMBOL(kgdb_call_nmi_hook);
> > >
> > > -void __weak kgdb_roundup_cpus(void)
> > > +bool __weak kgdb_arch_roundup_cpus(void)
> > > +{
> > > +     return false;
> >
> > Do we really need to introduce all these boolean return values just to
> > call a bit of library code when one of the architectures wants to
> > fallback to a generic implementation?
> >
> > Why not something more like:
> >
> > void kgdb_smp_call_nmi_hook(void)
> > {
> >     /* original weak version of kgdb_roundup_cpus goes here */
> > }
> >
> > void __weak kgdb_roundup_cpus(void)
> > {
> >     kgdb_smp_call_nmi_hook();
> > }
> >
> > arm64 can now simply call the new library function if a fallback is needed.
> >
> 
> Makes sense, I will use this approach instead.
> 
> > Note that same question applies to the backtrace changes...
> 
> In case of backtrace, there are already multiple APIs wrapping
> arch_trigger_cpumask_backtrace() as follows:
> 
> - trigger_all_cpu_backtrace()
> - trigger_allbutself_cpu_backtrace()
> - trigger_cpumask_backtrace()
> - trigger_single_cpu_backtrace()
> 
> And each of them already return a boolean and have corresponding
> different fallback mechanisms. So we can't have a common fallback API
> from arch specific code and instead need to extend that boolean return
> to arch specific code that is implemented as part of patch #4.

Understood. Thanks.


Daniel.

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2020-11-02  9:19 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-29 14:56 [PATCH v6 0/7] arm64: Add framework to turn an IPI as NMI Sumit Garg
2020-10-29 14:56 ` [PATCH v6 1/7] arm64: Add framework to turn " Sumit Garg
2020-10-29 14:56 ` [PATCH v6 2/7] irqchip/gic-v3: Enable support for SGIs to act as NMIs Sumit Garg
2020-10-29 14:56 ` [PATCH v6 3/7] arm64: smp: Assign and setup an IPI as NMI Sumit Garg
2020-10-29 14:56 ` [PATCH v6 4/7] nmi: backtrace: Allow runtime arch specific override Sumit Garg
2020-10-29 14:56 ` [PATCH v6 5/7] arm64: ipi_nmi: Add support for NMI backtrace Sumit Garg
2020-10-29 14:56 ` [PATCH v6 6/7] kgdb: roundup: Allow runtime arch specific override Sumit Garg
2020-10-29 15:21   ` Daniel Thompson
2020-11-02  6:18     ` Sumit Garg
2020-11-02  9:19       ` Daniel Thompson
2020-10-29 14:56 ` [PATCH v6 7/7] arm64: kgdb: Roundup cpus using IPI as NMI Sumit Garg
2020-10-29 16:22   ` Daniel Thompson
2020-10-29 16:39     ` Daniel Thompson
2020-11-02  6:59       ` Sumit Garg

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).