linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH FYI 0/4] FIQ-based backtracing support
@ 2014-09-05  9:40 Russell King - ARM Linux
  2014-09-05  9:41 ` [PATCH FYI 1/4] ARM: remove extraneous newline in show_regs() Russell King
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Russell King - ARM Linux @ 2014-09-05  9:40 UTC (permalink / raw)
  To: linux-arm-kernel

This is a prototype series for FIQ backtracing support, which is most
useful when trying to debug things like a spinlock lockup on a SMP
system, or a deadlock inside an IRQs-off region.

Some of this is based upon work by Daniel Thompson and others.

This is a work in progress - especially the final patch.

-- 
FTTC broadband for 0.8mile line: currently at 9.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH FYI 1/4] ARM: remove extraneous newline in show_regs()
  2014-09-05  9:40 [PATCH FYI 0/4] FIQ-based backtracing support Russell King - ARM Linux
@ 2014-09-05  9:41 ` Russell King
  2014-09-05  9:41 ` [PATCH FYI 2/4] ARM: remove unused do_unexp_fiq() function Russell King
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Russell King @ 2014-09-05  9:41 UTC (permalink / raw)
  To: linux-arm-kernel

Remove an unnecessary newline in show_regs().

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/kernel/process.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 81ef686a91ca..52a21dafb27e 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -306,7 +306,6 @@ void __show_regs(struct pt_regs *regs)
 
 void show_regs(struct pt_regs * regs)
 {
-	printk("\n");
 	__show_regs(regs);
 	dump_stack();
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH FYI 2/4] ARM: remove unused do_unexp_fiq() function
  2014-09-05  9:40 [PATCH FYI 0/4] FIQ-based backtracing support Russell King - ARM Linux
  2014-09-05  9:41 ` [PATCH FYI 1/4] ARM: remove extraneous newline in show_regs() Russell King
@ 2014-09-05  9:41 ` Russell King
  2014-09-05  9:41 ` [PATCH FYI 3/4] ARM: add basic support for on-demand backtrace of other CPUs Russell King
  2014-09-05  9:41 ` [PATCH FYI 4/4] ARM: cobble together FIQ backtracing Russell King
  3 siblings, 0 replies; 9+ messages in thread
From: Russell King @ 2014-09-05  9:41 UTC (permalink / raw)
  To: linux-arm-kernel

do_unexp_fiq() has never been called by any code in the last 10 years,
it's about time it was removed!

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/kernel/traps.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index c8e4bb714944..a447dcc4d2f7 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -460,12 +460,6 @@ asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
 	arm_notify_die("Oops - undefined instruction", regs, &info, 0, 6);
 }
 
-asmlinkage void do_unexp_fiq (struct pt_regs *regs)
-{
-	printk("Hmm.  Unexpected FIQ received, but trying to continue\n");
-	printk("You may have a hardware problem...\n");
-}
-
 /*
  * bad_mode handles the impossible case in the vectors.  If you see one of
  * these, then it's extremely serious, and could mean you have buggy hardware.
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH FYI 3/4] ARM: add basic support for on-demand backtrace of other CPUs
  2014-09-05  9:40 [PATCH FYI 0/4] FIQ-based backtracing support Russell King - ARM Linux
  2014-09-05  9:41 ` [PATCH FYI 1/4] ARM: remove extraneous newline in show_regs() Russell King
  2014-09-05  9:41 ` [PATCH FYI 2/4] ARM: remove unused do_unexp_fiq() function Russell King
@ 2014-09-05  9:41 ` Russell King
  2014-09-05 10:10   ` Russell King - ARM Linux
                     ` (2 more replies)
  2014-09-05  9:41 ` [PATCH FYI 4/4] ARM: cobble together FIQ backtracing Russell King
  3 siblings, 3 replies; 9+ messages in thread
From: Russell King @ 2014-09-05  9:41 UTC (permalink / raw)
  To: linux-arm-kernel

Add basic infrastructure for triggering a backtrace of other CPUs
via an IPI, preferably at FIQ level.  It is intended that this shall
be used for cases where we have detected that something has already
failed in the kernel.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/include/asm/irq.h |  5 ++++
 arch/arm/kernel/smp.c      | 62 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 67 insertions(+)

diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
index 53c15dec7af6..be1d07d59ee9 100644
--- a/arch/arm/include/asm/irq.h
+++ b/arch/arm/include/asm/irq.h
@@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *);
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 #endif
 
+#ifdef CONFIG_SMP
+extern void arch_trigger_all_cpu_backtrace(bool);
+#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x)
+#endif
+
 #endif
 
 #endif
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 9388a3d479e1..94959f977b82 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -72,8 +72,12 @@ enum ipi_msg_type {
 	IPI_CPU_STOP,
 	IPI_IRQ_WORK,
 	IPI_COMPLETION,
+	IPI_CPU_BACKTRACE,
 };
 
+/* For reliability, we're prepared to waste bits here. */
+static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
+
 static DECLARE_COMPLETION(cpu_running);
 
 static struct smp_operations smp_ops;
@@ -539,6 +543,21 @@ static void ipi_cpu_stop(unsigned int cpu)
 		cpu_relax();
 }
 
+static void ipi_cpu_backtrace(struct pt_regs *regs)
+{
+	int cpu = smp_processor_id();
+
+	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
+		static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED;
+
+		arch_spin_lock(&lock);
+		printk(KERN_WARNING "FIQ backtrace for cpu %d\n", cpu);
+		show_regs(regs);
+		arch_spin_unlock(&lock);
+		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+	}
+}
+
 static DEFINE_PER_CPU(struct completion *, cpu_completion);
 
 int register_ipi_completion(struct completion *completion, int cpu)
@@ -618,6 +637,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
 		irq_exit();
 		break;
 
+	case IPI_CPU_BACKTRACE:
+		irq_enter();
+		ipi_cpu_backtrace(regs);
+		irq_exit();
+		break;
+
 	default:
 		printk(KERN_CRIT "CPU%u: Unknown IPI message 0x%x\n",
 		       cpu, ipinr);
@@ -712,3 +737,40 @@ static int __init register_cpufreq_notifier(void)
 core_initcall(register_cpufreq_notifier);
 
 #endif
+
+void arch_trigger_all_cpu_backtrace(bool include_self)
+{
+	static unsigned long backtrace_flag;
+	int i, cpu = get_cpu();
+
+	if (test_and_set_bit(0, &backtrace_flag)) {
+		/*
+		 * If there is already a trigger_all_cpu_backtrace() in progress
+		 * (backtrace_flag == 1), don't output double cpu dump infos.
+		 */
+		put_cpu();
+		return;
+	}
+
+	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
+	if (!include_self)
+		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
+
+	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
+		pr_info("Sending FIQ to %s CPUs:\n",
+			(include_self ? "all" : "other"));
+		smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE);
+	}
+
+	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
+	for (i = 0; i < 10 * 1000; i++) {
+		if (cpumask_empty(to_cpumask(backtrace_mask)))
+			break;
+
+		mdelay(1);
+	}
+
+	clear_bit(0, &backtrace_flag);
+	smp_mb__after_atomic();
+	put_cpu();
+}
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH FYI 4/4] ARM: cobble together FIQ backtracing
  2014-09-05  9:40 [PATCH FYI 0/4] FIQ-based backtracing support Russell King - ARM Linux
                   ` (2 preceding siblings ...)
  2014-09-05  9:41 ` [PATCH FYI 3/4] ARM: add basic support for on-demand backtrace of other CPUs Russell King
@ 2014-09-05  9:41 ` Russell King
  3 siblings, 0 replies; 9+ messages in thread
From: Russell King @ 2014-09-05  9:41 UTC (permalink / raw)
  To: linux-arm-kernel

Cobble the FIQ backtracing together, some of this patch based on work
which David Thompson has done.

Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
---
 arch/arm/kernel/entry-armv.S | 96 ++++++++++++++++++++++++++++++++++++++------
 arch/arm/kernel/fiq.c        | 11 ++++-
 arch/arm/kernel/setup.c      |  8 +++-
 arch/arm/kernel/smp.c        |  5 +++
 drivers/irqchip/irq-gic.c    | 45 ++++++++++++++++++---
 5 files changed, 144 insertions(+), 21 deletions(-)

diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 36276cdccfbc..bd1ad0076937 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -146,7 +146,7 @@ ENDPROC(__und_invalid)
 #define SPFIX(code...)
 #endif
 
-	.macro	svc_entry, stack_hole=0
+	.macro	svc_entry, stack_hole=0, trace=1
  UNWIND(.fnstart		)
  UNWIND(.save {r0 - pc}		)
 	sub	sp, sp, #(S_FRAME_SIZE + \stack_hole - 4)
@@ -182,9 +182,11 @@ ENDPROC(__und_invalid)
 	@
 	stmia	r7, {r2 - r6}
 
+	.if	\trace
 #ifdef CONFIG_TRACE_IRQFLAGS
 	bl	trace_hardirqs_off
 #endif
+	.endif
 	.endm
 
 	.align	5
@@ -314,7 +316,7 @@ ENDPROC(__pabt_svc)
 #error "sizeof(struct pt_regs) must be a multiple of 8"
 #endif
 
-	.macro	usr_entry
+	.macro	usr_entry, trace=1
  UNWIND(.fnstart	)
  UNWIND(.cantunwind	)	@ don't unwind the user space
 	sub	sp, sp, #S_FRAME_SIZE
@@ -351,10 +353,12 @@ ENDPROC(__pabt_svc)
 	@
 	zero_fp
 
+	.if	\trace
 #ifdef CONFIG_IRQSOFF_TRACER
 	bl	trace_hardirqs_off
 #endif
 	ct_user_exit save = 0
+	.endif
 	.endm
 
 	.macro	kuser_cmpxchg_check
@@ -683,6 +687,61 @@ ENTRY(ret_from_exception)
 ENDPROC(__pabt_usr)
 ENDPROC(ret_from_exception)
 
+	.macro  svc_exit_via_fiq, rpsr
+	mov     r0, sp
+	ldmib   r0, {r1 - r14}  @ abort is deadly from here onward (it will
+				@ clobber state restored below)
+	msr     cpsr_c, #FIQ_MODE | PSR_I_BIT | PSR_F_BIT
+	add     r8, r0, #S_PC
+	ldr     r9, [r0, #S_PSR]
+	msr     spsr_cxsf, r9
+	ldr     r0, [r0, #S_R0]
+	ldmia   r8, {pc}^
+	.endm
+
+	.macro	fiq_handler
+	bl	__fiq_handle
+	.endm
+
+	.align 5
+__fiq_usr:
+	usr_entry trace=0
+	kuser_cmpxchg_check
+	mov	r0, sp
+	fiq_handler
+	get_thread_info tsk
+	restore_user_regs fast = 0, offset = 0
+ UNWIND(.fnend)
+ UNWIND(.cantunwind)
+ENDPROC(__fiq_usr)
+
+__fiq_abt:
+	svc_entry trace=0
+	msr     cpsr_c, #ABT_MODE | PSR_I_BIT | PSR_F_BIT
+	mov     r2, lr          @ Save lr_abt
+	mrs     r3, spsr        @ Save spsr_abt, abort is now safe
+	msr     cpsr_c, #SVC_MODE | PSR_I_BIT | PSR_F_BIT
+	mov	r0, sp
+	push    {r2 - r3}
+	fiq_handler
+	pop     {r0 - r1}
+	msr     cpsr_c, #ABT_MODE | PSR_I_BIT | PSR_F_BIT
+	mov     lr, r0          @ Restore lr_abt, abort is unsafe
+	msr     spsr_cxsf, r1   @ Restore spsr_abt
+	msr     cpsr_c, #SVC_MODE | PSR_I_BIT | PSR_F_BIT
+	svc_exit_via_fiq r5
+ UNWIND(.fnend)
+ENDPROC(__fiq_abt)
+
+__fiq_svc:
+	svc_entry trace=0
+	mov	r0, sp
+	fiq_handler
+	svc_exit_via_fiq r5
+ UNWIND(.fnend)
+ENDPROC(__fiq_svc)
+
+
 /*
  * Register switch for ARMv3 and ARMv4 processors
  * r0 = previous task_struct, r1 = previous thread_info, r2 = next thread_info
@@ -1117,18 +1176,29 @@ vector_rst:
 vector_addrexcptn:
 	b	vector_addrexcptn
 
-/*=============================================================================
- * Undefined FIQs
- *-----------------------------------------------------------------------------
- * Enter in FIQ mode, spsr = ANY CPSR, lr = ANY PC
- * MUST PRESERVE SVC SPSR, but need to switch to SVC mode to show our msg.
- * Basically to switch modes, we *HAVE* to clobber one register...  brain
- * damage alert!  I don't think that we can execute any code in here in any
- * other mode than FIQ...  Ok you can switch to another mode, but you can't
- * get out of that mode without clobbering one register.
+/*
+ * FIQs.  We provide a default stub here, which is used by the kernel
+ * when something fails (eg, spinlock lockup or deadlock, rcu stall,
+ * etc.)
  */
-vector_fiq:
-	subs	pc, lr, #4
+	vector_stub	fiq, FIQ_MODE, 4
+
+	.long	__fiq_usr
+	.long	__fiq_svc
+	.long	__fiq_svc
+	.long	__fiq_svc
+	.long	__fiq_svc
+	.long	__fiq_svc
+	.long	__fiq_svc
+	.long	__fiq_abt
+	.long	__fiq_svc
+	.long	__fiq_svc
+	.long	__fiq_svc
+	.long	__fiq_svc
+	.long	__fiq_svc
+	.long	__fiq_svc
+	.long	__fiq_svc
+	.long	__fiq_svc
 
 	.globl	vector_fiq_offset
 	.equ	vector_fiq_offset, vector_fiq
diff --git a/arch/arm/kernel/fiq.c b/arch/arm/kernel/fiq.c
index 918875d96d5d..1743049c433b 100644
--- a/arch/arm/kernel/fiq.c
+++ b/arch/arm/kernel/fiq.c
@@ -53,6 +53,7 @@
 	})
 
 static unsigned long no_fiq_insn;
+static struct pt_regs def_fiq_regs;
 
 /* Default reacquire function
  * - we always relinquish FIQ control
@@ -60,8 +61,15 @@ static unsigned long no_fiq_insn;
  */
 static int fiq_def_op(void *ref, int relinquish)
 {
-	if (!relinquish)
+	if (!relinquish) {
+		/* Restore default handler and registers */
+		local_fiq_disable();
+		set_fiq_regs(&dfl_fiq_regs);
 		set_fiq_handler(&no_fiq_insn, sizeof(no_fiq_insn));
+		local_fiq_enable();
+
+		/* FIXME: notify irq controller to standard enable FIQs */
+	}
 
 	return 0;
 }
@@ -151,5 +159,6 @@ void __init init_FIQ(int start)
 {
 	unsigned offset = FIQ_OFFSET;
 	no_fiq_insn = *(unsigned long *)(0xffff0000 + offset);
+	get_fiq_regs(&dfl_fiq_regs);
 	fiq_start = start;
 }
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 84db893dedc2..22e56f6ff446 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -423,6 +423,8 @@ static void __init elf_hwcap_fixup(void)
 		elf_hwcap &= ~HWCAP_SWP;
 }
 
+static unsigned int fiq_stack[4][1024];
+
 /*
  * cpu_init - initialise one CPU.
  *
@@ -470,7 +472,9 @@ void notrace cpu_init(void)
 	"msr	cpsr_c, %5\n\t"
 	"add	r14, %0, %6\n\t"
 	"mov	sp, r14\n\t"
-	"msr	cpsr_c, %7"
+	"msr	cpsr_c, %7\n\t"
+	"mov	sp, %8\n\t"
+	"msr	cpsr_c, %9"
 	    :
 	    : "r" (stk),
 	      PLC (PSR_F_BIT | PSR_I_BIT | IRQ_MODE),
@@ -479,6 +483,8 @@ void notrace cpu_init(void)
 	      "I" (offsetof(struct stack, abt[0])),
 	      PLC (PSR_F_BIT | PSR_I_BIT | UND_MODE),
 	      "I" (offsetof(struct stack, und[0])),
+	      PLC (PSR_F_BIT | PSR_I_BIT | FIQ_MODE),
+	      "r" (&fiq_stack[cpu][1024]),
 	      PLC (PSR_F_BIT | PSR_I_BIT | SVC_MODE)
 	    : "r14");
 #endif
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 94959f977b82..ce30ed9f8e1a 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -774,3 +774,8 @@ void arch_trigger_all_cpu_backtrace(bool include_self)
 	smp_mb__after_atomic();
 	put_cpu();
 }
+
+void __fiq_handle(struct pt_regs *regs)
+{
+	ipi_cpu_backtrace(regs);
+}
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index 4b959e606fe8..ec7be8eaab3a 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -371,9 +371,19 @@ static void __init gic_dist_init(struct gic_chip_data *gic)
 	for (i = 32; i < gic_irqs; i += 4)
 		writel_relaxed(cpumask, base + GIC_DIST_TARGET + i * 4 / 4);
 
+	/*
+	 * Optionally set all global interrupts to be group 1.
+	 */
+	for (i = 32; i < gic_irqs; i += 32)
+		writel_relaxed(0xffffffff, base + GIC_DIST_IGROUP + i * 4 / 32);
+
 	gic_dist_config(base, gic_irqs, NULL);
 
-	writel_relaxed(1, base + GIC_DIST_CTRL);
+	/*
+	 * Set EnableGrp1/EnableGrp0 (bit 1 and 0) or EnableGrp (bit 0 only,
+	 * bit 1 ignored)
+	 */
+	writel_relaxed(3, base + GIC_DIST_CTRL);
 }
 
 static void gic_cpu_init(struct gic_chip_data *gic)
@@ -400,8 +410,17 @@ static void gic_cpu_init(struct gic_chip_data *gic)
 
 	gic_cpu_config(dist_base, NULL);
 
+	/*
+	 * Set all PPI and SGI interrupts to be group 1.
+	 *
+	 * If grouping is not available (not implemented or prohibited by
+	 * security mode) these registers are read-as-zero/write-ignored.
+	 */
+	writel_relaxed(0xfffffeff, dist_base + GIC_DIST_IGROUP + 0);
+	writel_relaxed(0xa0a0a000, dist_base + GIC_DIST_PRI + 8);
+
 	writel_relaxed(0xf0, base + GIC_CPU_PRIMASK);
-	writel_relaxed(1, base + GIC_CPU_CTRL);
+	writel_relaxed(0x1f, base + GIC_CPU_CTRL);
 }
 
 void gic_cpu_if_down(void)
@@ -485,7 +504,7 @@ static void gic_dist_restore(unsigned int gic_nr)
 		writel_relaxed(gic_data[gic_nr].saved_spi_enable[i],
 			dist_base + GIC_DIST_ENABLE_SET + i * 4);
 
-	writel_relaxed(1, dist_base + GIC_DIST_CTRL);
+	writel_relaxed(3, dist_base + GIC_DIST_CTRL);
 }
 
 static void gic_cpu_save(unsigned int gic_nr)
@@ -542,7 +561,7 @@ static void gic_cpu_restore(unsigned int gic_nr)
 		writel_relaxed(0xa0a0a0a0, dist_base + GIC_DIST_PRI + i * 4);
 
 	writel_relaxed(0xf0, cpu_base + GIC_CPU_PRIMASK);
-	writel_relaxed(1, cpu_base + GIC_CPU_CTRL);
+	writel_relaxed(0x1f, cpu_base + GIC_CPU_CTRL);
 }
 
 static int gic_notifier(struct notifier_block *self, unsigned long cmd,	void *v)
@@ -600,10 +619,18 @@ static void __init gic_pm_init(struct gic_chip_data *gic)
 #endif
 
 #ifdef CONFIG_SMP
+static bool sgi_is_nonsecure(int irq, struct gic_chip_data *gic)
+{
+	/* FIXME: this should be done in a more generic way */
+	return irq != 8;
+}
+
 static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 {
-	int cpu;
+	struct gic_chip_data *gic = &gic_data[0];
 	unsigned long flags, map = 0;
+	unsigned int softirq;
+	int cpu;
 
 	raw_spin_lock_irqsave(&irq_controller_lock, flags);
 
@@ -617,8 +644,14 @@ static void gic_raise_softirq(const struct cpumask *mask, unsigned int irq)
 	 */
 	dmb(ishst);
 
+	softirq = map << 16 | irq;
+
+	/* SATT only has effect if we are running in the secure world */
+	if (sgi_is_nonsecure(irq, gic))
+		softirq |= 0x8000;
+
 	/* this always happens on GIC0 */
-	writel_relaxed(map << 16 | irq, gic_data_dist_base(&gic_data[0]) + GIC_DIST_SOFTINT);
+	writel_relaxed(softirq, gic_data_dist_base(gic) + GIC_DIST_SOFTINT);
 
 	raw_spin_unlock_irqrestore(&irq_controller_lock, flags);
 }
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH FYI 3/4] ARM: add basic support for on-demand backtrace of other CPUs
  2014-09-05  9:41 ` [PATCH FYI 3/4] ARM: add basic support for on-demand backtrace of other CPUs Russell King
@ 2014-09-05 10:10   ` Russell King - ARM Linux
  2014-09-05 14:31   ` Daniel Thompson
  2014-09-05 15:15   ` Daniel Thompson
  2 siblings, 0 replies; 9+ messages in thread
From: Russell King - ARM Linux @ 2014-09-05 10:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 05, 2014 at 10:41:38AM +0100, Russell King wrote:
> Add basic infrastructure for triggering a backtrace of other CPUs
> via an IPI, preferably at FIQ level.  It is intended that this shall
> be used for cases where we have detected that something has already
> failed in the kernel.
> 
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>

As I've said previously, I believe that most of the code below should
be generic code rather than arch code - I see nothing here which is
arch specific apart from the "how do we send a NMI/FIQ", i.o.w. the
smp_cross_call() call.

I'd propose changing linux/nmi.h such that arch_trigger_all_cpu_backtrace()
becomes "bool trigger_cpu_backtrace(bool include_self)" and have the
presence of the NMI/FIQ trigger function indicate whether we can do
this - and moving the code below into lib/ or kernel/.

> ---
>  arch/arm/include/asm/irq.h |  5 ++++
>  arch/arm/kernel/smp.c      | 62 ++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 67 insertions(+)
> 
> diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
> index 53c15dec7af6..be1d07d59ee9 100644
> --- a/arch/arm/include/asm/irq.h
> +++ b/arch/arm/include/asm/irq.h
> @@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *);
>  extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
>  #endif
>  
> +#ifdef CONFIG_SMP
> +extern void arch_trigger_all_cpu_backtrace(bool);
> +#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x)
> +#endif
> +
>  #endif
>  
>  #endif
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 9388a3d479e1..94959f977b82 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -72,8 +72,12 @@ enum ipi_msg_type {
>  	IPI_CPU_STOP,
>  	IPI_IRQ_WORK,
>  	IPI_COMPLETION,
> +	IPI_CPU_BACKTRACE,
>  };
>  
> +/* For reliability, we're prepared to waste bits here. */
> +static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
> +
>  static DECLARE_COMPLETION(cpu_running);
>  
>  static struct smp_operations smp_ops;
> @@ -539,6 +543,21 @@ static void ipi_cpu_stop(unsigned int cpu)
>  		cpu_relax();
>  }
>  
> +static void ipi_cpu_backtrace(struct pt_regs *regs)
> +{
> +	int cpu = smp_processor_id();
> +
> +	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
> +		static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED;
> +
> +		arch_spin_lock(&lock);
> +		printk(KERN_WARNING "FIQ backtrace for cpu %d\n", cpu);
> +		show_regs(regs);
> +		arch_spin_unlock(&lock);
> +		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
> +	}
> +}
> +
>  static DEFINE_PER_CPU(struct completion *, cpu_completion);
>  
>  int register_ipi_completion(struct completion *completion, int cpu)
> @@ -618,6 +637,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
>  		irq_exit();
>  		break;
>  
> +	case IPI_CPU_BACKTRACE:
> +		irq_enter();
> +		ipi_cpu_backtrace(regs);
> +		irq_exit();
> +		break;
> +
>  	default:
>  		printk(KERN_CRIT "CPU%u: Unknown IPI message 0x%x\n",
>  		       cpu, ipinr);
> @@ -712,3 +737,40 @@ static int __init register_cpufreq_notifier(void)
>  core_initcall(register_cpufreq_notifier);
>  
>  #endif
> +
> +void arch_trigger_all_cpu_backtrace(bool include_self)
> +{
> +	static unsigned long backtrace_flag;
> +	int i, cpu = get_cpu();
> +
> +	if (test_and_set_bit(0, &backtrace_flag)) {
> +		/*
> +		 * If there is already a trigger_all_cpu_backtrace() in progress
> +		 * (backtrace_flag == 1), don't output double cpu dump infos.
> +		 */
> +		put_cpu();
> +		return;
> +	}
> +
> +	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
> +	if (!include_self)
> +		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
> +
> +	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
> +		pr_info("Sending FIQ to %s CPUs:\n",
> +			(include_self ? "all" : "other"));
> +		smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE);
> +	}
> +
> +	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
> +	for (i = 0; i < 10 * 1000; i++) {
> +		if (cpumask_empty(to_cpumask(backtrace_mask)))
> +			break;
> +
> +		mdelay(1);
> +	}
> +
> +	clear_bit(0, &backtrace_flag);
> +	smp_mb__after_atomic();
> +	put_cpu();
> +}
> -- 
> 1.8.3.1
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

-- 
FTTC broadband for 0.8mile line: currently at 9.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH FYI 3/4] ARM: add basic support for on-demand backtrace of other CPUs
  2014-09-05  9:41 ` [PATCH FYI 3/4] ARM: add basic support for on-demand backtrace of other CPUs Russell King
  2014-09-05 10:10   ` Russell King - ARM Linux
@ 2014-09-05 14:31   ` Daniel Thompson
  2014-09-05 17:40     ` Russell King - ARM Linux
  2014-09-05 15:15   ` Daniel Thompson
  2 siblings, 1 reply; 9+ messages in thread
From: Daniel Thompson @ 2014-09-05 14:31 UTC (permalink / raw)
  To: linux-arm-kernel

On 05/09/14 10:41, Russell King wrote:
> Add basic infrastructure for triggering a backtrace of other CPUs
> via an IPI, preferably at FIQ level.  It is intended that this shall
> be used for cases where we have detected that something has already
> failed in the kernel.
> 
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> ---
>  arch/arm/include/asm/irq.h |  5 ++++
>  arch/arm/kernel/smp.c      | 62 ++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 67 insertions(+)
> 
> diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
> index 53c15dec7af6..be1d07d59ee9 100644
> --- a/arch/arm/include/asm/irq.h
> +++ b/arch/arm/include/asm/irq.h
> @@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *);
>  extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
>  #endif
>  
> +#ifdef CONFIG_SMP
> +extern void arch_trigger_all_cpu_backtrace(bool);
> +#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x)
> +#endif
> +
>  #endif
>  
>  #endif
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 9388a3d479e1..94959f977b82 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -72,8 +72,12 @@ enum ipi_msg_type {
>  	IPI_CPU_STOP,
>  	IPI_IRQ_WORK,
>  	IPI_COMPLETION,
> +	IPI_CPU_BACKTRACE,

I'd prefer this to use a more generic name to allow the IPI to be used
for debug/profiling purposes.

The backtrace_mask bitmap already makes it safe to call
ipi_cpu_backtrace() without any additional demux.


>  };
>  
> +/* For reliability, we're prepared to waste bits here. */
> +static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
> +

Both checkpatch and the comments in linux/thread.h imply using
CONFIG_NR_CPUS is better here.


>  static DECLARE_COMPLETION(cpu_running);
>  
>  static struct smp_operations smp_ops;
> @@ -539,6 +543,21 @@ static void ipi_cpu_stop(unsigned int cpu)
>  		cpu_relax();
>  }
>  
> +static void ipi_cpu_backtrace(struct pt_regs *regs)
> +{
> +	int cpu = smp_processor_id();
> +
> +	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
> +		static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED;
> +
> +		arch_spin_lock(&lock);
> +		printk(KERN_WARNING "FIQ backtrace for cpu %d\n", cpu);

How about some indication showing degree of platform support currently
available?

printk(KERN_WARNING "%s backtrace for cpu %d\n",
       in_nmi() ? "FIQ" : "Non-FIQ", cpu);


Daniel.

> +		show_regs(regs);
> +		arch_spin_unlock(&lock);
> +		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
> +	}
> +}
> +
>  static DEFINE_PER_CPU(struct completion *, cpu_completion);
>  
>  int register_ipi_completion(struct completion *completion, int cpu)
> @@ -618,6 +637,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
>  		irq_exit();
>  		break;
>  
> +	case IPI_CPU_BACKTRACE:
> +		irq_enter();
> +		ipi_cpu_backtrace(regs);
> +		irq_exit();
> +		break;
> +
>  	default:
>  		printk(KERN_CRIT "CPU%u: Unknown IPI message 0x%x\n",
>  		       cpu, ipinr);
> @@ -712,3 +737,40 @@ static int __init register_cpufreq_notifier(void)
>  core_initcall(register_cpufreq_notifier);
>  
>  #endif
> +
> +void arch_trigger_all_cpu_backtrace(bool include_self)
> +{
> +	static unsigned long backtrace_flag;
> +	int i, cpu = get_cpu();
> +
> +	if (test_and_set_bit(0, &backtrace_flag)) {
> +		/*
> +		 * If there is already a trigger_all_cpu_backtrace() in progress
> +		 * (backtrace_flag == 1), don't output double cpu dump infos.
> +		 */
> +		put_cpu();
> +		return;
> +	}
> +
> +	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
> +	if (!include_self)
> +		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
> +
> +	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
> +		pr_info("Sending FIQ to %s CPUs:\n",
> +			(include_self ? "all" : "other"));
> +		smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE);
> +	}
> +
> +	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
> +	for (i = 0; i < 10 * 1000; i++) {
> +		if (cpumask_empty(to_cpumask(backtrace_mask)))
> +			break;
> +
> +		mdelay(1);
> +	}
> +
> +	clear_bit(0, &backtrace_flag);
> +	smp_mb__after_atomic();
> +	put_cpu();
> +}
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH FYI 3/4] ARM: add basic support for on-demand backtrace of other CPUs
  2014-09-05  9:41 ` [PATCH FYI 3/4] ARM: add basic support for on-demand backtrace of other CPUs Russell King
  2014-09-05 10:10   ` Russell King - ARM Linux
  2014-09-05 14:31   ` Daniel Thompson
@ 2014-09-05 15:15   ` Daniel Thompson
  2 siblings, 0 replies; 9+ messages in thread
From: Daniel Thompson @ 2014-09-05 15:15 UTC (permalink / raw)
  To: linux-arm-kernel

On 05/09/14 10:41, Russell King wrote:
> Add basic infrastructure for triggering a backtrace of other CPUs
> via an IPI, preferably at FIQ level.  It is intended that this shall
> be used for cases where we have detected that something has already
> failed in the kernel.
> 
> Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
> ---
>  arch/arm/include/asm/irq.h |  5 ++++
>  arch/arm/kernel/smp.c      | 62 ++++++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 67 insertions(+)
> 
> diff --git a/arch/arm/include/asm/irq.h b/arch/arm/include/asm/irq.h
> index 53c15dec7af6..be1d07d59ee9 100644
> --- a/arch/arm/include/asm/irq.h
> +++ b/arch/arm/include/asm/irq.h
> @@ -35,6 +35,11 @@ extern void (*handle_arch_irq)(struct pt_regs *);
>  extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
>  #endif
>  
> +#ifdef CONFIG_SMP
> +extern void arch_trigger_all_cpu_backtrace(bool);
> +#define arch_trigger_all_cpu_backtrace(x) arch_trigger_all_cpu_backtrace(x)
> +#endif
> +
>  #endif
>  
>  #endif
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 9388a3d479e1..94959f977b82 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -72,8 +72,12 @@ enum ipi_msg_type {
>  	IPI_CPU_STOP,
>  	IPI_IRQ_WORK,
>  	IPI_COMPLETION,
> +	IPI_CPU_BACKTRACE,
>  };
>  
> +/* For reliability, we're prepared to waste bits here. */
> +static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
> +
>  static DECLARE_COMPLETION(cpu_running);
>  
>  static struct smp_operations smp_ops;
> @@ -539,6 +543,21 @@ static void ipi_cpu_stop(unsigned int cpu)
>  		cpu_relax();
>  }
>  
> +static void ipi_cpu_backtrace(struct pt_regs *regs)
> +{
> +	int cpu = smp_processor_id();
> +
> +	if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) {
> +		static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED;
> +
> +		arch_spin_lock(&lock);
> +		printk(KERN_WARNING "FIQ backtrace for cpu %d\n", cpu);
> +		show_regs(regs);
> +		arch_spin_unlock(&lock);
> +		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
> +	}
> +}
> +
>  static DEFINE_PER_CPU(struct completion *, cpu_completion);
>  
>  int register_ipi_completion(struct completion *completion, int cpu)
> @@ -618,6 +637,12 @@ void handle_IPI(int ipinr, struct pt_regs *regs)
>  		irq_exit();
>  		break;
>  
> +	case IPI_CPU_BACKTRACE:
> +		irq_enter();
> +		ipi_cpu_backtrace(regs);
> +		irq_exit();
> +		break;
> +
>  	default:
>  		printk(KERN_CRIT "CPU%u: Unknown IPI message 0x%x\n",
>  		       cpu, ipinr);
> @@ -712,3 +737,40 @@ static int __init register_cpufreq_notifier(void)
>  core_initcall(register_cpufreq_notifier);
>  
>  #endif
> +
> +void arch_trigger_all_cpu_backtrace(bool include_self)
> +{
> +	static unsigned long backtrace_flag;
> +	int i, cpu = get_cpu();
> +
> +	if (test_and_set_bit(0, &backtrace_flag)) {
> +		/*
> +		 * If there is already a trigger_all_cpu_backtrace() in progress
> +		 * (backtrace_flag == 1), don't output double cpu dump infos.
> +		 */
> +		put_cpu();
> +		return;
> +	}
> +
> +	cpumask_copy(to_cpumask(backtrace_mask), cpu_online_mask);
> +	if (!include_self)
> +		cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask));
> +
> +	if (!cpumask_empty(to_cpumask(backtrace_mask))) {
> +		pr_info("Sending FIQ to %s CPUs:\n",
> +			(include_self ? "all" : "other"));
> +		smp_cross_call(to_cpumask(backtrace_mask), IPI_CPU_BACKTRACE);
> +	}
> +
> +	/* Wait for up to 10 seconds for all CPUs to do the backtrace */
> +	for (i = 0; i < 10 * 1000; i++) {
> +		if (cpumask_empty(to_cpumask(backtrace_mask)))
> +			break;
> +
> +		mdelay(1);
> +	}


I am seeing timeout problems when I test using SysRq-l to call
arch_trigger_all_cpu_backtrace() with FIQ support *disabled* (i.e. in
secure mode).

I've not investigated yet but it feels like interrupts are masked (which
makes sense given the calling context will be the UART ISR). Basically
the console locks up and the backtrace for CPU-0 is delayed for 10 seconds.

Once the backtrace does come out things return to normal...


Daniel.

> +
> +	clear_bit(0, &backtrace_flag);
> +	smp_mb__after_atomic();
> +	put_cpu();
> +}
> 

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH FYI 3/4] ARM: add basic support for on-demand backtrace of other CPUs
  2014-09-05 14:31   ` Daniel Thompson
@ 2014-09-05 17:40     ` Russell King - ARM Linux
  0 siblings, 0 replies; 9+ messages in thread
From: Russell King - ARM Linux @ 2014-09-05 17:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Sep 05, 2014 at 03:31:01PM +0100, Daniel Thompson wrote:
> On 05/09/14 10:41, Russell King wrote:
> > diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> > index 9388a3d479e1..94959f977b82 100644
> > --- a/arch/arm/kernel/smp.c
> > +++ b/arch/arm/kernel/smp.c
> > @@ -72,8 +72,12 @@ enum ipi_msg_type {
> >  	IPI_CPU_STOP,
> >  	IPI_IRQ_WORK,
> >  	IPI_COMPLETION,
> > +	IPI_CPU_BACKTRACE,
> 
> I'd prefer this to use a more generic name to allow the IPI to be used
> for debug/profiling purposes.

However, for profiling purposes, you want the profiling code to have
as smaller impact as possible, so you wouldn't want to call lots of
unnecessary functions based on:

> The backtrace_mask bitmap already makes it safe to call
> ipi_cpu_backtrace() without any additional demux.

The reason is quite simple - if you spend longer in the profiling code,
you perturb the timings of the code being profiled more, which can
change the profiled code's behaviour.  Take a look at things like the
network receive softirq (or any softirq) where there's a limited amount
of work which is done on the tail-end of a normal interrupt, before
bouncing it across to ksoftirqd.

Let's say you're chasing a performance problem with networking, trying
to work out what's going on, but running perf causes the performance
problem to vanish, because the timings have changed soo much that the
softirq stays scheduled on ksoftirqd.  What do you do then?

We should always aim for profiling and debugging to hasve as low
overhead as is possible.

> >  };
> >  
> > +/* For reliability, we're prepared to waste bits here. */
> > +static DECLARE_BITMAP(backtrace_mask, NR_CPUS) __read_mostly;
> > +
> 
> Both checkpatch and the comments in linux/thread.h imply using
> CONFIG_NR_CPUS is better here.

This comes from x86 code.  I don't wish to introduce an additional delta
between x86 and ARM, especially as we should look towards moving the x86
version out of arch/x86 into a generic location.

> How about some indication showing degree of platform support currently
> available?
> 
> printk(KERN_WARNING "%s backtrace for cpu %d\n",
>        in_nmi() ? "FIQ" : "Non-FIQ", cpu);

We could do that.

-- 
FTTC broadband for 0.8mile line: currently at 9.5Mbps down 400kbps up
according to speedtest.net.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2014-09-05 17:40 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-09-05  9:40 [PATCH FYI 0/4] FIQ-based backtracing support Russell King - ARM Linux
2014-09-05  9:41 ` [PATCH FYI 1/4] ARM: remove extraneous newline in show_regs() Russell King
2014-09-05  9:41 ` [PATCH FYI 2/4] ARM: remove unused do_unexp_fiq() function Russell King
2014-09-05  9:41 ` [PATCH FYI 3/4] ARM: add basic support for on-demand backtrace of other CPUs Russell King
2014-09-05 10:10   ` Russell King - ARM Linux
2014-09-05 14:31   ` Daniel Thompson
2014-09-05 17:40     ` Russell King - ARM Linux
2014-09-05 15:15   ` Daniel Thompson
2014-09-05  9:41 ` [PATCH FYI 4/4] ARM: cobble together FIQ backtracing Russell King

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).