All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v8 0/4] arm64: Add support for IRQ stack
@ 2015-12-04 11:02 James Morse
  2015-12-04 11:02 ` [PATCH v8 1/4] arm64: Store struct task_info in sp_el0 James Morse
                   ` (8 more replies)
  0 siblings, 9 replies; 44+ messages in thread
From: James Morse @ 2015-12-04 11:02 UTC (permalink / raw)
  To: linux-arm-kernel

This consolidated series adds a per-cpu irq_stack, which will eventually allow
us to reduce the size of task stacks.

Code in entry.S switches to the per-cpu irq_stack when irq_count is zero.
This counter is updated in do_softirq_own_stack(), which is called before
__do_softirq() re-enables interrupts, which could cause recursive use of the
irq stack.

sp_el0 is used as a scratch register to store struct thread_info during el1
execution, this means code to find it by masking the stack pointer no longer
needs to be inlined. This also lets us remove the alignment requirements for
the irq stack, (task stacks still need to be naturally aligned).

Patch 2 is a combination of Akashi Takahiro's 'arm64: unwind_frame for
interrupt stack' and 'arm64: fix dump_backtrace() to show correct pt_regs at
interrupt', both of which need to be present before irq_stack is enabled in
Patch 3.

Patch 4 is new, following Catalin's comments, but I don't think it is
necessary unless we also decrease the stack size, which I don't think we
should do immediatly - it would be good to collect some max_stack_size values
for various workloads first.

This series is based on the v4.4-rc3
The series can be pulled from git://linux-arm.org/linux-jm.git -b irq_stack/v8

Comments welcome,


James

Changes since v7 [0]:
 * Removed independent irq_stack_ptr and irq_count values. The lowest irq_stack
   value is used as irq_count, irq_stack_entry adds IRQ_STACK_START_SP to this
   when switching stack.
 * Moved irq_stack definition from patch 1 to patch 2.
 * Added patch 4

Changes since v6 [1]:
 * Add irq_count and do_softirq_own_stack().
 * Removed requirement that the irq_stack is naturally aligned.
 * Made irq_stack per_cpu regardless of page size.


[0] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-November/385337.html
[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-November/382238.html


AKASHI Takahiro (1):
  arm64: Modify stack trace and dump for use with irq_stack

James Morse (2):
  arm64: Add do_softirq_own_stack() and enable irq_stacks
  arm64: switch to irq_stack during softirq

Jungseok Lee (1):
  arm64: Store struct task_info in sp_el0

 arch/arm64/include/asm/irq.h         | 36 ++++++++++++++++
 arch/arm64/include/asm/thread_info.h | 10 ++++-
 arch/arm64/kernel/entry.S            | 82 +++++++++++++++++++++++++++++++++---
 arch/arm64/kernel/head.S             |  5 +++
 arch/arm64/kernel/irq.c              | 37 ++++++++++++++++
 arch/arm64/kernel/sleep.S            |  3 ++
 arch/arm64/kernel/stacktrace.c       | 29 ++++++++++++-
 arch/arm64/kernel/traps.c            | 14 +++++-
 8 files changed, 206 insertions(+), 10 deletions(-)

-- 
2.6.2

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 1/4] arm64: Store struct task_info in sp_el0
  2015-12-04 11:02 [PATCH v8 0/4] arm64: Add support for IRQ stack James Morse
@ 2015-12-04 11:02 ` James Morse
  2015-12-04 13:27   ` Catalin Marinas
  2015-12-04 11:02 ` [PATCH v8 2/4] arm64: Modify stack trace and dump for use with irq_stack James Morse
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 44+ messages in thread
From: James Morse @ 2015-12-04 11:02 UTC (permalink / raw)
  To: linux-arm-kernel

From: Jungseok Lee <jungseoklee85@gmail.com>

There is need for figuring out how to manage struct thread_info data when
IRQ stack is introduced. struct thread_info information should be copied
to IRQ stack under the current thread_info calculation logic whenever
context switching is invoked. This is too expensive to keep supporting
the approach.

Instead, this patch pays attention to sp_el0 which is an unused scratch
register in EL1 context. sp_el0 utilization not only simplifies the
management, but also prevents text section size from being increased
largely due to static allocated IRQ stack as removing masking operation
using THREAD_SIZE in many places.

Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/thread_info.h | 10 ++++++++--
 arch/arm64/kernel/entry.S            | 15 ++++++++++++---
 arch/arm64/kernel/head.S             |  5 +++++
 arch/arm64/kernel/sleep.S            |  3 +++
 4 files changed, 28 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 90c7ff233735..abd64bd1f6d9 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -73,10 +73,16 @@ register unsigned long current_stack_pointer asm ("sp");
  */
 static inline struct thread_info *current_thread_info(void) __attribute_const__;
 
+/*
+ * struct thread_info can be accessed directly via sp_el0.
+ */
 static inline struct thread_info *current_thread_info(void)
 {
-	return (struct thread_info *)
-		(current_stack_pointer & ~(THREAD_SIZE - 1));
+	unsigned long sp_el0;
+
+	asm ("mrs %0, sp_el0" : "=r" (sp_el0));
+
+	return (struct thread_info *)sp_el0;
 }
 
 #define thread_saved_pc(tsk)	\
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 7ed3d75f6304..fc87373d3f88 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -88,7 +88,8 @@
 
 	.if	\el == 0
 	mrs	x21, sp_el0
-	get_thread_info tsk			// Ensure MDSCR_EL1.SS is clear,
+	mov	tsk, sp
+	and	tsk, tsk, #~(THREAD_SIZE - 1)	// Ensure MDSCR_EL1.SS is clear,
 	ldr	x19, [tsk, #TI_FLAGS]		// since we can unmask debug
 	disable_step_tsk x19, x20		// exceptions when scheduling.
 	.else
@@ -108,6 +109,13 @@
 	.endif
 
 	/*
+	 * Set sp_el0 to current thread_info.
+	 */
+	.if	\el == 0
+	msr	sp_el0, tsk
+	.endif
+
+	/*
 	 * Registers that may be useful after this macro is invoked:
 	 *
 	 * x21 - aborted SP
@@ -164,8 +172,7 @@ alternative_endif
 	.endm
 
 	.macro	get_thread_info, rd
-	mov	\rd, sp
-	and	\rd, \rd, #~(THREAD_SIZE - 1)	// top of stack
+	mrs	\rd, sp_el0
 	.endm
 
 /*
@@ -599,6 +606,8 @@ ENTRY(cpu_switch_to)
 	ldp	x29, x9, [x8], #16
 	ldr	lr, [x8]
 	mov	sp, x9
+	and	x9, x9, #~(THREAD_SIZE - 1)
+	msr	sp_el0, x9
 	ret
 ENDPROC(cpu_switch_to)
 
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 23cfc08fc8ba..b363f340f2c7 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -424,6 +424,9 @@ __mmap_switched:
 	b	1b
 2:
 	adr_l	sp, initial_sp, x4
+	mov	x4, sp
+	and	x4, x4, #~(THREAD_SIZE - 1)
+	msr	sp_el0, x4			// Save thread_info
 	str_l	x21, __fdt_pointer, x5		// Save FDT pointer
 	str_l	x24, memstart_addr, x6		// Save PHYS_OFFSET
 	mov	x29, #0
@@ -606,6 +609,8 @@ ENDPROC(secondary_startup)
 ENTRY(__secondary_switched)
 	ldr	x0, [x21]			// get secondary_data.stack
 	mov	sp, x0
+	and	x0, x0, #~(THREAD_SIZE - 1)
+	msr	sp_el0, x0			// save thread_info
 	mov	x29, #0
 	b	secondary_start_kernel
 ENDPROC(__secondary_switched)
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index f586f7c875e2..e33fe33876ab 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -173,6 +173,9 @@ ENTRY(cpu_resume)
 	/* load physical address of identity map page table in x1 */
 	adrp	x1, idmap_pg_dir
 	mov	sp, x2
+	/* save thread_info */
+	and	x2, x2, #~(THREAD_SIZE - 1)
+	msr	sp_el0, x2
 	/*
 	 * cpu_do_resume expects x0 to contain context physical address
 	 * pointer and x1 to contain physical address of 1:1 page tables
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 2/4] arm64: Modify stack trace and dump for use with irq_stack
  2015-12-04 11:02 [PATCH v8 0/4] arm64: Add support for IRQ stack James Morse
  2015-12-04 11:02 ` [PATCH v8 1/4] arm64: Store struct task_info in sp_el0 James Morse
@ 2015-12-04 11:02 ` James Morse
  2015-12-04 12:21   ` Jungseok Lee
  2015-12-04 14:31   ` Catalin Marinas
  2015-12-04 11:02 ` [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks James Morse
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 44+ messages in thread
From: James Morse @ 2015-12-04 11:02 UTC (permalink / raw)
  To: linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

This patch allows unwind_frame() to traverse from interrupt stack to task
stack correctly. It requires data from a dummy stack frame, created
during irq_stack_entry(), added by a later patch.

A similar approach is taken to modify dump_backtrace(), which expects to
find struct pt_regs underneath any call to functions marked __exception.
When on an irq_stack, the struct pt_regs is stored on the old task stack,
the location of which is stored in the dummy stack frame.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
[merged two patches, reworked for per_cpu irq_stacks, and no alignment
 guarantees, added irq_stack definitions]
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/irq.h   | 32 ++++++++++++++++++++++++++++++++
 arch/arm64/kernel/irq.c        |  3 +++
 arch/arm64/kernel/stacktrace.c | 29 +++++++++++++++++++++++++++--
 arch/arm64/kernel/traps.c      | 14 +++++++++++++-
 4 files changed, 75 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index 8e8d30684392..e2f3f135a3bc 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -1,10 +1,32 @@
 #ifndef __ASM_IRQ_H
 #define __ASM_IRQ_H
 
+#define IRQ_STACK_SIZE			THREAD_SIZE
+#define IRQ_STACK_START_SP		THREAD_START_SP
+
+#ifndef __ASSEMBLER__
+
+#include <linux/percpu.h>
+
 #include <asm-generic/irq.h>
+#include <asm/thread_info.h>
 
 struct pt_regs;
 
+DECLARE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
+
+/*
+ * The highest address on the stack, and the first to be used. Used to
+ * find the dummy-stack frame put down by el?_irq() in entry.S.
+ */
+#define IRQ_STACK_PTR(cpu) ((unsigned long)per_cpu(irq_stack, cpu) + IRQ_STACK_START_SP)
+
+/*
+ * The offset from irq_stack_ptr where entry.S will store the original
+ * stack pointer. Used by unwind_frame() and dump_backtrace().
+ */
+#define IRQ_STACK_TO_TASK_STACK(ptr) *((unsigned long *)(ptr - 0x10));
+
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 
 static inline int nr_legacy_irqs(void)
@@ -12,4 +34,14 @@ static inline int nr_legacy_irqs(void)
 	return 0;
 }
 
+static inline bool on_irq_stack(unsigned long sp, int cpu)
+{
+	/* variable names the same as kernel/stacktrace.c */
+	unsigned long low = (unsigned long)per_cpu(irq_stack, cpu);
+	unsigned long high = low + IRQ_STACK_START_SP;
+
+	return (low <= sp && sp <= high);
+}
+
+#endif /* !__ASSEMBLER__ */
 #endif
diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 9f17ec071ee0..1e3cef578e21 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -30,6 +30,9 @@
 
 unsigned long irq_err_count;
 
+/* irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned */
+DEFINE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack) __aligned(16);
+
 int arch_show_interrupts(struct seq_file *p, int prec)
 {
 	show_ipi_list(p, prec);
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index ccb6078ed9f2..b947eeffa5b2 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -20,6 +20,7 @@
 #include <linux/sched.h>
 #include <linux/stacktrace.h>
 
+#include <asm/irq.h>
 #include <asm/stacktrace.h>
 
 /*
@@ -39,17 +40,41 @@ int notrace unwind_frame(struct stackframe *frame)
 {
 	unsigned long high, low;
 	unsigned long fp = frame->fp;
+	unsigned long irq_stack_ptr;
+
+	/*
+	 * Use raw_smp_processor_id() to avoid false-positives from
+	 * CONFIG_DEBUG_PREEMPT. get_wchan() calls unwind_frame() on sleeping
+	 * task stacks, we can be pre-empted in this case, so
+	 * {raw_,}smp_processor_id() may give us the wrong value. Sleeping
+	 * tasks can't ever be on an interrupt stack, so regardless of cpu,
+	 * the checks will always fail.
+	 */
+	irq_stack_ptr = IRQ_STACK_PTR(raw_smp_processor_id());
 
 	low  = frame->sp;
-	high = ALIGN(low, THREAD_SIZE);
+	/* irq stacks are not THREAD_SIZE aligned */
+	if (on_irq_stack(frame->sp, raw_smp_processor_id()))
+		high = irq_stack_ptr;
+	else
+		high = ALIGN(low, THREAD_SIZE) - 0x20;
 
-	if (fp < low || fp > high - 0x18 || fp & 0xf)
+	if (fp < low || fp > high || fp & 0xf)
 		return -EINVAL;
 
 	frame->sp = fp + 0x10;
 	frame->fp = *(unsigned long *)(fp);
 	frame->pc = *(unsigned long *)(fp + 8);
 
+	/*
+	 * Check whether we are going to walk through from interrupt stack
+	 * to task stack.
+	 * If we reach the end of the stack - and its an interrupt stack,
+	 * read the original task stack pointer from the dummy frame.
+	 */
+	if (frame->sp == irq_stack_ptr)
+		frame->sp = IRQ_STACK_TO_TASK_STACK(irq_stack_ptr);
+
 	return 0;
 }
 
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
index e9b9b5364393..8a0084541f84 100644
--- a/arch/arm64/kernel/traps.c
+++ b/arch/arm64/kernel/traps.c
@@ -146,6 +146,7 @@ static void dump_instr(const char *lvl, struct pt_regs *regs)
 static void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk)
 {
 	struct stackframe frame;
+	unsigned long irq_stack_ptr = IRQ_STACK_PTR(smp_processor_id());
 
 	pr_debug("%s(regs = %p tsk = %p)\n", __func__, regs, tsk);
 
@@ -180,9 +181,20 @@ static void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk)
 		if (ret < 0)
 			break;
 		stack = frame.sp;
-		if (in_exception_text(where))
+		if (in_exception_text(where)) {
+			/*
+			 * If we switched to the irq_stack before calling this
+			 * exception handler, then the pt_regs will be on the
+			 * task stack. The easiest way to tell is if the large
+			 * pt_regs would overlap with the end of the irq_stack.
+			 */
+			if (stack < irq_stack_ptr &&
+			    (stack + sizeof(struct pt_regs)) > irq_stack_ptr)
+				stack = IRQ_STACK_TO_TASK_STACK(irq_stack_ptr);
+
 			dump_mem("", "Exception stack", stack,
 				 stack + sizeof(struct pt_regs), false);
+		}
 	}
 }
 
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks
  2015-12-04 11:02 [PATCH v8 0/4] arm64: Add support for IRQ stack James Morse
  2015-12-04 11:02 ` [PATCH v8 1/4] arm64: Store struct task_info in sp_el0 James Morse
  2015-12-04 11:02 ` [PATCH v8 2/4] arm64: Modify stack trace and dump for use with irq_stack James Morse
@ 2015-12-04 11:02 ` James Morse
  2015-12-04 13:46   ` Catalin Marinas
                     ` (2 more replies)
  2015-12-04 11:02 ` [PATCH v8 4/4] arm64: switch to irq_stack during softirq James Morse
                   ` (5 subsequent siblings)
  8 siblings, 3 replies; 44+ messages in thread
From: James Morse @ 2015-12-04 11:02 UTC (permalink / raw)
  To: linux-arm-kernel

entry.S is modified to switch to the per_cpu irq_stack during el{0,1}_irq.
irq_count is used to detect recursive interrupts on the irq_stack, it is
updated late by do_softirq_own_stack(), when called on the irq_stack, before
__do_softirq() re-enables interrupts to process softirqs.

do_softirq_own_stack() is added by this patch, but does not yet switch
stack.

This patch adds the dummy stack frame and data needed by the previous
stack tracing patches.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/irq.h |  2 ++
 arch/arm64/kernel/entry.S    | 42 ++++++++++++++++++++++++++++++++++++++++--
 arch/arm64/kernel/irq.c      | 38 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 79 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index e2f3f135a3bc..fa2a8d0e4792 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -11,6 +11,8 @@
 #include <asm-generic/irq.h>
 #include <asm/thread_info.h>
 
+#define __ARCH_HAS_DO_SOFTIRQ
+
 struct pt_regs;
 
 DECLARE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index fc87373d3f88..81cc5380977d 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -27,6 +27,7 @@
 #include <asm/cpufeature.h>
 #include <asm/errno.h>
 #include <asm/esr.h>
+#include <asm/irq.h>
 #include <asm/thread_info.h>
 #include <asm/unistd.h>
 
@@ -175,6 +176,42 @@ alternative_endif
 	mrs	\rd, sp_el0
 	.endm
 
+	.macro	irq_stack_entry, dummy_lr
+	mov	x19, sp			// preserve the original sp
+
+	adr_l	x25, irq_stack
+	mrs	x26, tpidr_el1
+	add	x25, x25, x26
+
+	/*
+	 * Check the lowest address on irq_stack for the irq_count value,
+	 * incremented by do_softirq_own_stack if we have re-enabled irqs
+	 * while on the irq_stack.
+	 */
+	ldr	x26, [x25]
+	cbnz	x26, 9998f		// recursive use?
+
+	/* switch to the irq stack */
+	mov	x26, #IRQ_STACK_START_SP
+	add	x26, x25, x26
+	mov	sp, x26
+
+	/* Add a dummy stack frame */
+	stp     x29, \dummy_lr, [sp, #-16]!           // dummy stack frame
+	mov	x29, sp
+	stp     xzr, x19, [sp, #-16]!
+
+9998:
+	.endm
+
+	/*
+	 * x19 should be preserved between irq_stack_entry and
+	 * irq_stack_exit.
+	 */
+	.macro	irq_stack_exit
+	mov	sp, x19
+	.endm
+
 /*
  * These are the registers used in the syscall handler, and allow us to
  * have in theory up to 7 arguments to a function - x0 to x6.
@@ -190,10 +227,11 @@ tsk	.req	x28		// current thread_info
  * Interrupt handling.
  */
 	.macro	irq_handler
-	adrp	x1, handle_arch_irq
-	ldr	x1, [x1, #:lo12:handle_arch_irq]
+	ldr_l	x1, handle_arch_irq
 	mov	x0, sp
+	irq_stack_entry x22
 	blr	x1
+	irq_stack_exit
 	.endm
 
 	.text
diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index 1e3cef578e21..ff7ebb710e51 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -25,14 +25,24 @@
 #include <linux/irq.h>
 #include <linux/smp.h>
 #include <linux/init.h>
+#include <linux/interrupt.h>
 #include <linux/irqchip.h>
 #include <linux/seq_file.h>
 
 unsigned long irq_err_count;
 
-/* irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned */
+/*
+ * irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned.
+ * irq_stack[0] is used as irq_count, a non-zero value indicates the stack
+ * is in use, and el?_irq() shouldn't switch to it. This is used to detect
+ * recursive use of the irq_stack, it is lazily updated by
+ * do_softirq_own_stack(), which is called on the irq_stack, before
+ * re-enabling interrupts to process softirqs.
+ */
 DEFINE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack) __aligned(16);
 
+#define IRQ_COUNT()	(*per_cpu(irq_stack, smp_processor_id()))
+
 int arch_show_interrupts(struct seq_file *p, int prec)
 {
 	show_ipi_list(p, prec);
@@ -56,3 +66,29 @@ void __init init_IRQ(void)
 	if (!handle_arch_irq)
 		panic("No interrupt controller found.");
 }
+
+/*
+ * do_softirq_own_stack() is called from irq_exit() before __do_softirq()
+ * re-enables interrupts, at which point we may re-enter el?_irq(). We
+ * increase irq_count here so that el1_irq() knows that it is already on the
+ * irq stack.
+ *
+ * Called with interrupts disabled, so we don't worry about moving cpu, or
+ * being interrupted while modifying irq_count.
+ *
+ * This function doesn't actually switch stack.
+ */
+void do_softirq_own_stack(void)
+{
+	int cpu = smp_processor_id();
+
+	WARN_ON_ONCE(!irqs_disabled());
+
+	if (on_irq_stack(current_stack_pointer, cpu)) {
+		IRQ_COUNT()++;
+		__do_softirq();
+		IRQ_COUNT()--;
+	} else {
+		__do_softirq();
+	}
+}
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 4/4] arm64: switch to irq_stack during softirq
  2015-12-04 11:02 [PATCH v8 0/4] arm64: Add support for IRQ stack James Morse
                   ` (2 preceding siblings ...)
  2015-12-04 11:02 ` [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks James Morse
@ 2015-12-04 11:02 ` James Morse
  2015-12-04 14:01   ` Catalin Marinas
  2015-12-04 12:17 ` [PATCH v8 0/4] arm64: Add support for IRQ stack Jungseok Lee
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 44+ messages in thread
From: James Morse @ 2015-12-04 11:02 UTC (permalink / raw)
  To: linux-arm-kernel

do_softirq_own_stack() was added in a previous patch, but was only used
to increase the irq_count value before __do_softirq() re-enables interrupts.

This patch adds a helper to call __do_softirq() on the irq stack. This means
do_softirq() can be called by a task that is running out of stack space,
as the work will be done on the irq_stack.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/irq.h |  2 ++
 arch/arm64/kernel/entry.S    | 25 +++++++++++++++++++++++++
 arch/arm64/kernel/irq.c      |  4 +---
 3 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index fa2a8d0e4792..6fd25ab534ca 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -31,6 +31,8 @@ DECLARE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
 
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 
+void __do_softirq_on_irqstack(void);
+
 static inline int nr_legacy_irqs(void)
 {
 	return 0;
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 81cc5380977d..2963050e5e9e 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -779,3 +779,28 @@ ENTRY(sys_rt_sigreturn_wrapper)
 	mov	x0, sp
 	b	sys_rt_sigreturn
 ENDPROC(sys_rt_sigreturn_wrapper)
+
+ENTRY(__do_softirq_on_irqstack)
+	push	x19, lr
+	push	x25, x26
+
+	irq_stack_entry lr
+
+	/* irq_stack_entry leaves irq_count in x25 */
+	ldr	x1, [x25]
+	add	x1, x1, #1
+	str	x1, [x25]
+
+	bl	__do_softirq
+
+	ldr	x1, [x25]
+	sub	x1, x1, #1
+	str	x1, [x25]
+
+	irq_stack_exit
+
+	pop x25, x26
+	pop x19, lr
+	ret
+ENDPROC(__do_softirq_on_irqstack)
+
diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index ff7ebb710e51..cbf1f00993ef 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -75,8 +75,6 @@ void __init init_IRQ(void)
  *
  * Called with interrupts disabled, so we don't worry about moving cpu, or
  * being interrupted while modifying irq_count.
- *
- * This function doesn't actually switch stack.
  */
 void do_softirq_own_stack(void)
 {
@@ -89,6 +87,6 @@ void do_softirq_own_stack(void)
 		__do_softirq();
 		IRQ_COUNT()--;
 	} else {
-		__do_softirq();
+		__do_softirq_on_irqstack();
 	}
 }
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 0/4] arm64: Add support for IRQ stack
  2015-12-04 11:02 [PATCH v8 0/4] arm64: Add support for IRQ stack James Morse
                   ` (3 preceding siblings ...)
  2015-12-04 11:02 ` [PATCH v8 4/4] arm64: switch to irq_stack during softirq James Morse
@ 2015-12-04 12:17 ` Jungseok Lee
  2015-12-06 13:56   ` Jungseok Lee
  2015-12-04 13:57 ` Catalin Marinas
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 44+ messages in thread
From: Jungseok Lee @ 2015-12-04 12:17 UTC (permalink / raw)
  To: linux-arm-kernel

On Dec 4, 2015, at 8:02 PM, James Morse wrote:

Hi James,

> This consolidated series adds a per-cpu irq_stack, which will eventually allow
> us to reduce the size of task stacks.
> 
> Code in entry.S switches to the per-cpu irq_stack when irq_count is zero.
> This counter is updated in do_softirq_own_stack(), which is called before
> __do_softirq() re-enables interrupts, which could cause recursive use of the
> irq stack.
> 
> sp_el0 is used as a scratch register to store struct thread_info during el1
> execution, this means code to find it by masking the stack pointer no longer
> needs to be inlined. This also lets us remove the alignment requirements for
> the irq stack, (task stacks still need to be naturally aligned).
> 
> Patch 2 is a combination of Akashi Takahiro's 'arm64: unwind_frame for
> interrupt stack' and 'arm64: fix dump_backtrace() to show correct pt_regs at
> interrupt', both of which need to be present before irq_stack is enabled in
> Patch 3.
> 
> Patch 4 is new, following Catalin's comments, but I don't think it is
> necessary unless we also decrease the stack size, which I don't think we
> should do immediatly - it would be good to collect some max_stack_size values
> for various workloads first.
> 
> This series is based on the v4.4-rc3
> The series can be pulled from git://linux-arm.org/linux-jm.git -b irq_stack/v8
> 
> Comments welcome,

At first glance, this series looks mature ;) I will leave feedbacks after playing
with this.

> James
> 
> Changes since v7 [0]:
> * Removed independent irq_stack_ptr and irq_count values. The lowest irq_stack
>   value is used as irq_count, irq_stack_entry adds IRQ_STACK_START_SP to this
>   when switching stack.
> * Moved irq_stack definition from patch 1 to patch 2.
> * Added patch 4
> 
> Changes since v6 [1]:
> * Add irq_count and do_softirq_own_stack().
> * Removed requirement that the irq_stack is naturally aligned.
> * Made irq_stack per_cpu regardless of page size.
> 
> 
> [0] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-November/385337.html
> [1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-November/382238.html
> 
> 
> AKASHI Takahiro (1):
>  arm64: Modify stack trace and dump for use with irq_stack
> 
> James Morse (2):
>  arm64: Add do_softirq_own_stack() and enable irq_stacks
>  arm64: switch to irq_stack during softirq
> 
> Jungseok Lee (1):
>  arm64: Store struct task_info in sp_el0

s/task_info/thread_info would be clearer.

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 2/4] arm64: Modify stack trace and dump for use with irq_stack
  2015-12-04 11:02 ` [PATCH v8 2/4] arm64: Modify stack trace and dump for use with irq_stack James Morse
@ 2015-12-04 12:21   ` Jungseok Lee
  2015-12-04 14:31   ` Catalin Marinas
  1 sibling, 0 replies; 44+ messages in thread
From: Jungseok Lee @ 2015-12-04 12:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Dec 4, 2015, at 8:02 PM, James Morse wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
> 
> This patch allows unwind_frame() to traverse from interrupt stack to task
> stack correctly. It requires data from a dummy stack frame, created
> during irq_stack_entry(), added by a later patch.
> 
> A similar approach is taken to modify dump_backtrace(), which expects to
> find struct pt_regs underneath any call to functions marked __exception.
> When on an irq_stack, the struct pt_regs is stored on the old task stack,
> the location of which is stored in the dummy stack frame.
> 
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> [merged two patches, reworked for per_cpu irq_stacks, and no alignment
> guarantees, added irq_stack definitions]
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> arch/arm64/include/asm/irq.h   | 32 ++++++++++++++++++++++++++++++++
> arch/arm64/kernel/irq.c        |  3 +++
> arch/arm64/kernel/stacktrace.c | 29 +++++++++++++++++++++++++++--
> arch/arm64/kernel/traps.c      | 14 +++++++++++++-
> 4 files changed, 75 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
> index 8e8d30684392..e2f3f135a3bc 100644
> --- a/arch/arm64/include/asm/irq.h
> +++ b/arch/arm64/include/asm/irq.h
> @@ -1,10 +1,32 @@
> #ifndef __ASM_IRQ_H
> #define __ASM_IRQ_H
> 
> +#define IRQ_STACK_SIZE			THREAD_SIZE

I agree that it is unnecessary to reduce a process stack size at this stage.
But, it would be no issue to set IRQ_STACK_SIZE to 8KB now, which is aligned
with Catalin's comment [1].

Best Regards
Jungseok Lee

[1] http://www.spinics.net/lists/arm-kernel/msg463868.html

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 1/4] arm64: Store struct task_info in sp_el0
  2015-12-04 11:02 ` [PATCH v8 1/4] arm64: Store struct task_info in sp_el0 James Morse
@ 2015-12-04 13:27   ` Catalin Marinas
  2015-12-04 14:55     ` James Morse
  2015-12-06 13:15     ` Jungseok Lee
  0 siblings, 2 replies; 44+ messages in thread
From: Catalin Marinas @ 2015-12-04 13:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 04, 2015 at 11:02:25AM +0000, James Morse wrote:
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
[...]
> @@ -599,6 +606,8 @@ ENTRY(cpu_switch_to)
>  	ldp	x29, x9, [x8], #16
>  	ldr	lr, [x8]
>  	mov	sp, x9
> +	and	x9, x9, #~(THREAD_SIZE - 1)
> +	msr	sp_el0, x9
>  	ret
>  ENDPROC(cpu_switch_to)

At the beginning of the cpu_switch_to function, could we do
"mrs x9, sp_el0" instead to avoid the "and ... ~(THREAD_SIZE-1)"?

Otherwise:

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks
  2015-12-04 11:02 ` [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks James Morse
@ 2015-12-04 13:46   ` Catalin Marinas
  2015-12-04 13:47     ` Catalin Marinas
  2015-12-07 22:48   ` Catalin Marinas
  2015-12-09 13:45   ` Will Deacon
  2 siblings, 1 reply; 44+ messages in thread
From: Catalin Marinas @ 2015-12-04 13:46 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 04, 2015 at 11:02:27AM +0000, James Morse wrote:
> +/*
> + * do_softirq_own_stack() is called from irq_exit() before __do_softirq()
> + * re-enables interrupts, at which point we may re-enter el?_irq(). We
> + * increase irq_count here so that el1_irq() knows that it is already on the
> + * irq stack.
> + *
> + * Called with interrupts disabled, so we don't worry about moving cpu, or
> + * being interrupted while modifying irq_count.
> + *
> + * This function doesn't actually switch stack.
> + */
> +void do_softirq_own_stack(void)
> +{
> +	int cpu = smp_processor_id();
> +
> +	WARN_ON_ONCE(!irqs_disabled());
> +
> +	if (on_irq_stack(current_stack_pointer, cpu)) {
> +		IRQ_COUNT()++;
> +		__do_softirq();
> +		IRQ_COUNT()--;
> +	} else {
> +		__do_softirq();
> +	}

Following your and my reply on the previous series, should we not
_always_ switch to the irqstack here?

-- 
Catalin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks
  2015-12-04 13:46   ` Catalin Marinas
@ 2015-12-04 13:47     ` Catalin Marinas
  0 siblings, 0 replies; 44+ messages in thread
From: Catalin Marinas @ 2015-12-04 13:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 04, 2015 at 01:46:36PM +0000, Catalin Marinas wrote:
> On Fri, Dec 04, 2015 at 11:02:27AM +0000, James Morse wrote:
> > +/*
> > + * do_softirq_own_stack() is called from irq_exit() before __do_softirq()
> > + * re-enables interrupts, at which point we may re-enter el?_irq(). We
> > + * increase irq_count here so that el1_irq() knows that it is already on the
> > + * irq stack.
> > + *
> > + * Called with interrupts disabled, so we don't worry about moving cpu, or
> > + * being interrupted while modifying irq_count.
> > + *
> > + * This function doesn't actually switch stack.
> > + */
> > +void do_softirq_own_stack(void)
> > +{
> > +	int cpu = smp_processor_id();
> > +
> > +	WARN_ON_ONCE(!irqs_disabled());
> > +
> > +	if (on_irq_stack(current_stack_pointer, cpu)) {
> > +		IRQ_COUNT()++;
> > +		__do_softirq();
> > +		IRQ_COUNT()--;
> > +	} else {
> > +		__do_softirq();
> > +	}
> 
> Following your and my reply on the previous series, should we not
> _always_ switch to the irqstack here?

Ah, sorry, I now noticed that this is done in patch 4.

-- 
Catalin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 0/4] arm64: Add support for IRQ stack
  2015-12-04 11:02 [PATCH v8 0/4] arm64: Add support for IRQ stack James Morse
                   ` (4 preceding siblings ...)
  2015-12-04 12:17 ` [PATCH v8 0/4] arm64: Add support for IRQ stack Jungseok Lee
@ 2015-12-04 13:57 ` Catalin Marinas
  2015-12-06 13:33   ` Jungseok Lee
  2015-12-10 10:22 ` [PATCH v8 5/4] arm64: Fix off-by-one in stack tracing when stepping off irq stack James Morse
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 44+ messages in thread
From: Catalin Marinas @ 2015-12-04 13:57 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 04, 2015 at 11:02:24AM +0000, James Morse wrote:
> Patch 4 is new, following Catalin's comments, but I don't think it is
> necessary unless we also decrease the stack size,

That's the whole aim of this patchset, to try to to get back to 8KB
stacks.

> which I don't think we should do immediatly - it would be good to
> collect some max_stack_size values for various workloads first.

We could make it configurable though, maybe defaulting to 16KB for a
while.

IIRC, the stack overflows we had in the past were triggered when running
specweb but I don't remember whether it was hard or soft IRQs that
caused the issues (could have been both). We could run some tests again
and see how close we get to the 8KB limit.

BTW, what's the disadvantage of always switching to the IRQ stack in
do_softirq?

-- 
Catalin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 4/4] arm64: switch to irq_stack during softirq
  2015-12-04 11:02 ` [PATCH v8 4/4] arm64: switch to irq_stack during softirq James Morse
@ 2015-12-04 14:01   ` Catalin Marinas
  2015-12-04 14:39     ` James Morse
  0 siblings, 1 reply; 44+ messages in thread
From: Catalin Marinas @ 2015-12-04 14:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 04, 2015 at 11:02:28AM +0000, James Morse wrote:
> +ENTRY(__do_softirq_on_irqstack)
> +	push	x19, lr
> +	push	x25, x26
> +
> +	irq_stack_entry lr
> +
> +	/* irq_stack_entry leaves irq_count in x25 */
> +	ldr	x1, [x25]
> +	add	x1, x1, #1
> +	str	x1, [x25]
> +
> +	bl	__do_softirq
> +
> +	ldr	x1, [x25]
> +	sub	x1, x1, #1
> +	str	x1, [x25]
> +
> +	irq_stack_exit
> +
> +	pop x25, x26
> +	pop x19, lr
> +	ret
> +ENDPROC(__do_softirq_on_irqstack)

I was thinking of doing do_softirq_own_stack() entirely in assembly
without the need to check for on_irq_stack() check in C, just a test of
the irq_count value.

But this means collapsing patches 3 and 4 ;).

-- 
Catalin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 2/4] arm64: Modify stack trace and dump for use with irq_stack
  2015-12-04 11:02 ` [PATCH v8 2/4] arm64: Modify stack trace and dump for use with irq_stack James Morse
  2015-12-04 12:21   ` Jungseok Lee
@ 2015-12-04 14:31   ` Catalin Marinas
  1 sibling, 0 replies; 44+ messages in thread
From: Catalin Marinas @ 2015-12-04 14:31 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 04, 2015 at 11:02:26AM +0000, James Morse wrote:
> diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
> index 8e8d30684392..e2f3f135a3bc 100644
> --- a/arch/arm64/include/asm/irq.h
> +++ b/arch/arm64/include/asm/irq.h
[...]
> +static inline bool on_irq_stack(unsigned long sp, int cpu)
> +{
> +	/* variable names the same as kernel/stacktrace.c */
> +	unsigned long low = (unsigned long)per_cpu(irq_stack, cpu);
> +	unsigned long high = low + IRQ_STACK_START_SP;
> +
> +	return (low <= sp && sp <= high);
> +}
> +
> +#endif /* !__ASSEMBLER__ */
>  #endif
> diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
> index 9f17ec071ee0..1e3cef578e21 100644
> --- a/arch/arm64/kernel/irq.c
> +++ b/arch/arm64/kernel/irq.c
[...]
> @@ -39,17 +40,41 @@ int notrace unwind_frame(struct stackframe *frame)
>  {
>  	unsigned long high, low;
>  	unsigned long fp = frame->fp;
> +	unsigned long irq_stack_ptr;
> +
> +	/*
> +	 * Use raw_smp_processor_id() to avoid false-positives from
> +	 * CONFIG_DEBUG_PREEMPT. get_wchan() calls unwind_frame() on sleeping
> +	 * task stacks, we can be pre-empted in this case, so
> +	 * {raw_,}smp_processor_id() may give us the wrong value. Sleeping
> +	 * tasks can't ever be on an interrupt stack, so regardless of cpu,
> +	 * the checks will always fail.
> +	 */
> +	irq_stack_ptr = IRQ_STACK_PTR(raw_smp_processor_id());
>  
>  	low  = frame->sp;
> -	high = ALIGN(low, THREAD_SIZE);
> +	/* irq stacks are not THREAD_SIZE aligned */
> +	if (on_irq_stack(frame->sp, raw_smp_processor_id()))
> +		high = irq_stack_ptr;
> +	else
> +		high = ALIGN(low, THREAD_SIZE) - 0x20;

More of a nitpick: if you rework patch 4, you can remove on_irq_stack()
here, just expand it inline (and avoid IRQ_STACK_PTR twice, though
that's pretty cheap).

Either way:

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 4/4] arm64: switch to irq_stack during softirq
  2015-12-04 14:01   ` Catalin Marinas
@ 2015-12-04 14:39     ` James Morse
  2015-12-04 18:40       ` Catalin Marinas
  2015-12-06 13:51       ` Jungseok Lee
  0 siblings, 2 replies; 44+ messages in thread
From: James Morse @ 2015-12-04 14:39 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Catalin,

On 04/12/15 14:01, Catalin Marinas wrote:
> On Fri, Dec 04, 2015 at 11:02:28AM +0000, James Morse wrote:
>> +ENTRY(__do_softirq_on_irqstack)
>> +	push	x19, lr
>> +	push	x25, x26
>> +
>> +	irq_stack_entry lr
>> +
>> +	/* irq_stack_entry leaves irq_count in x25 */
>> +	ldr	x1, [x25]
>> +	add	x1, x1, #1
>> +	str	x1, [x25]
>> +
>> +	bl	__do_softirq
>> +
>> +	ldr	x1, [x25]
>> +	sub	x1, x1, #1
>> +	str	x1, [x25]
>> +
>> +	irq_stack_exit
>> +
>> +	pop x25, x26
>> +	pop x19, lr
>> +	ret
>> +ENDPROC(__do_softirq_on_irqstack)
> 
> I was thinking of doing do_softirq_own_stack() entirely in assembly
> without the need to check for on_irq_stack() check in C, just a test of
> the irq_count value.

I tried that, but couldn't get it to work - maybe I'm missing a trick:

There are at least two ways into do_softirq_own_stack():
el1_irq() -> __irq_exit() -> do_softirq_own_stack(),
as well as:
cpu_switch_to(ksoftirqd) -> do_softirq_own_stack()

In both cases irq_count == 0 because do_softirq_own_stack() is where we
update irq_count [0]. I can only see two ways to tell them apart,
increment irq_count in el1_irq(), (which we don't like), or have that
bounds checking.


(We could increase hardirq_count() in el?_irq(), which would mean
in_irq()/in_interrupt() always returns true when on the irq_stack, which
would stop softirqs being processed here... but that is a fairly
significant change in behaviour.)


Thanks,

James

[0] http://article.gmane.org/gmane.linux.kernel/2041877

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 1/4] arm64: Store struct task_info in sp_el0
  2015-12-04 13:27   ` Catalin Marinas
@ 2015-12-04 14:55     ` James Morse
  2015-12-04 16:18       ` Catalin Marinas
  2015-12-06 13:15     ` Jungseok Lee
  1 sibling, 1 reply; 44+ messages in thread
From: James Morse @ 2015-12-04 14:55 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Catalin,

On 04/12/15 13:27, Catalin Marinas wrote:
> On Fri, Dec 04, 2015 at 11:02:25AM +0000, James Morse wrote:
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
> [...]
>> @@ -599,6 +606,8 @@ ENTRY(cpu_switch_to)
>>  	ldp	x29, x9, [x8], #16
>>  	ldr	lr, [x8]
>>  	mov	sp, x9
>> +	and	x9, x9, #~(THREAD_SIZE - 1)
>> +	msr	sp_el0, x9
>>  	ret
>>  ENDPROC(cpu_switch_to)
> 
> At the beginning of the cpu_switch_to function, could we do
> "mrs x9, sp_el0" instead to avoid the "and ... ~(THREAD_SIZE-1)"?

I'm not sure I follow - are you suggesting to store struct thread_info
in the thread_cpu_context?

This would change a ldr to a ldp, and save the 'and', so its definitely
fewer instructions.



James

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 1/4] arm64: Store struct task_info in sp_el0
  2015-12-04 14:55     ` James Morse
@ 2015-12-04 16:18       ` Catalin Marinas
  0 siblings, 0 replies; 44+ messages in thread
From: Catalin Marinas @ 2015-12-04 16:18 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 04, 2015 at 02:55:12PM +0000, James Morse wrote:
> Hi Catalin,
> 
> On 04/12/15 13:27, Catalin Marinas wrote:
> > On Fri, Dec 04, 2015 at 11:02:25AM +0000, James Morse wrote:
> >> --- a/arch/arm64/kernel/entry.S
> >> +++ b/arch/arm64/kernel/entry.S
> > [...]
> >> @@ -599,6 +606,8 @@ ENTRY(cpu_switch_to)
> >>  	ldp	x29, x9, [x8], #16
> >>  	ldr	lr, [x8]
> >>  	mov	sp, x9
> >> +	and	x9, x9, #~(THREAD_SIZE - 1)
> >> +	msr	sp_el0, x9
> >>  	ret
> >>  ENDPROC(cpu_switch_to)
> > 
> > At the beginning of the cpu_switch_to function, could we do
> > "mrs x9, sp_el0" instead to avoid the "and ... ~(THREAD_SIZE-1)"?
> 
> I'm not sure I follow - are you suggesting to store struct thread_info
> in the thread_cpu_context?

I was thinking of context switching sp_el0 as well but see below.

> This would change a ldr to a ldp, and save the 'and', so its definitely
> fewer instructions.

I think we end up with an additional memory access, which is usually
more expensive than arithmetic ops. So just leave it as it is.

-- 
Catalin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 4/4] arm64: switch to irq_stack during softirq
  2015-12-04 14:39     ` James Morse
@ 2015-12-04 18:40       ` Catalin Marinas
  2015-12-08 10:29         ` James Morse
  2015-12-06 13:51       ` Jungseok Lee
  1 sibling, 1 reply; 44+ messages in thread
From: Catalin Marinas @ 2015-12-04 18:40 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 04, 2015 at 02:39:44PM +0000, James Morse wrote:
> On 04/12/15 14:01, Catalin Marinas wrote:
> > On Fri, Dec 04, 2015 at 11:02:28AM +0000, James Morse wrote:
> >> +ENTRY(__do_softirq_on_irqstack)
> >> +	push	x19, lr
> >> +	push	x25, x26
> >> +
> >> +	irq_stack_entry lr
> >> +
> >> +	/* irq_stack_entry leaves irq_count in x25 */
> >> +	ldr	x1, [x25]
> >> +	add	x1, x1, #1
> >> +	str	x1, [x25]
> >> +
> >> +	bl	__do_softirq
> >> +
> >> +	ldr	x1, [x25]
> >> +	sub	x1, x1, #1
> >> +	str	x1, [x25]
> >> +
> >> +	irq_stack_exit
> >> +
> >> +	pop x25, x26
> >> +	pop x19, lr
> >> +	ret
> >> +ENDPROC(__do_softirq_on_irqstack)
> > 
> > I was thinking of doing do_softirq_own_stack() entirely in assembly
> > without the need to check for on_irq_stack() check in C, just a test of
> > the irq_count value.
> 
> I tried that, but couldn't get it to work - maybe I'm missing a trick:
> 
> There are at least two ways into do_softirq_own_stack():
> el1_irq() -> __irq_exit() -> do_softirq_own_stack(),
> as well as:
> cpu_switch_to(ksoftirqd) -> do_softirq_own_stack()

If it was only these two cases, I think it would have been easier as in
both scenarios the stack is nearly empty. But there is another call to
do_softirq() from netif_rx_ni() and I'm not sure how the stack looks
like at this point. That's why I suggested that we always switch to a
different stack for softirqs.

> In both cases irq_count == 0 because do_softirq_own_stack() is where we
> update irq_count [0]. I can only see two ways to tell them apart,
> increment irq_count in el1_irq(), (which we don't like), or have that
> bounds checking.

I now see the problem. Looking at x86, they increment irq_count for
every interrupt (as per Jungseok's early patch).

> (We could increase hardirq_count() in el?_irq(), which would mean
> in_irq()/in_interrupt() always returns true when on the irq_stack, which
> would stop softirqs being processed here... but that is a fairly
> significant change in behaviour.)

So I think the options are (a) irq_count increment in el?_irq or (b)
check the current stack before switching. I'll think about it until
Monday ;).

BTW, at which point does in_irq() return true for an interrupt (and when
it ceases to do so)?

-- 
Catalin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 1/4] arm64: Store struct task_info in sp_el0
  2015-12-04 13:27   ` Catalin Marinas
  2015-12-04 14:55     ` James Morse
@ 2015-12-06 13:15     ` Jungseok Lee
  1 sibling, 0 replies; 44+ messages in thread
From: Jungseok Lee @ 2015-12-06 13:15 UTC (permalink / raw)
  To: linux-arm-kernel

On Dec 4, 2015, at 10:27 PM, Catalin Marinas wrote:

Hi Catalin,

> On Fri, Dec 04, 2015 at 11:02:25AM +0000, James Morse wrote:
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
> [...]
>> @@ -599,6 +606,8 @@ ENTRY(cpu_switch_to)
>> 	ldp	x29, x9, [x8], #16
>> 	ldr	lr, [x8]
>> 	mov	sp, x9
>> +	and	x9, x9, #~(THREAD_SIZE - 1)
>> +	msr	sp_el0, x9
>> 	ret
>> ENDPROC(cpu_switch_to)
> 
> At the beginning of the cpu_switch_to function, could we do
> "mrs x9, sp_el0" instead to avoid the "and ... ~(THREAD_SIZE-1)"?
> 
> Otherwise:
> 
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

Thanks for reviewing this one!

If this one is picked up without re-spin, it would be better to change
the word 'task_info' in the subject to 'thread_info' for clarification.

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 0/4] arm64: Add support for IRQ stack
  2015-12-04 13:57 ` Catalin Marinas
@ 2015-12-06 13:33   ` Jungseok Lee
  0 siblings, 0 replies; 44+ messages in thread
From: Jungseok Lee @ 2015-12-06 13:33 UTC (permalink / raw)
  To: linux-arm-kernel

On Dec 4, 2015, at 10:57 PM, Catalin Marinas wrote:
> On Fri, Dec 04, 2015 at 11:02:24AM +0000, James Morse wrote:
>> Patch 4 is new, following Catalin's comments, but I don't think it is
>> necessary unless we also decrease the stack size,
> 
> That's the whole aim of this patchset, to try to to get back to 8KB
> stacks.
> 
>> which I don't think we should do immediatly - it would be good to
>> collect some max_stack_size values for various workloads first.
> 
> We could make it configurable though, maybe defaulting to 16KB for a
> while.
> 
> IIRC, the stack overflows we had in the past were triggered when running
> specweb but I don't remember whether it was hard or soft IRQs that
> caused the issues (could have been both). We could run some tests again
> and see how close we get to the 8KB limit.
> 
> BTW, what's the disadvantage of always switching to the IRQ stack in
> do_softirq?

As we know well, additional operations, such as IRQ count management and
stack pointer update, are the only items I'm aware of.

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 4/4] arm64: switch to irq_stack during softirq
  2015-12-04 14:39     ` James Morse
  2015-12-04 18:40       ` Catalin Marinas
@ 2015-12-06 13:51       ` Jungseok Lee
  1 sibling, 0 replies; 44+ messages in thread
From: Jungseok Lee @ 2015-12-06 13:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Dec 4, 2015, at 11:39 PM, James Morse wrote:

Hi James,

> Hi Catalin,
> 
> On 04/12/15 14:01, Catalin Marinas wrote:
>> On Fri, Dec 04, 2015 at 11:02:28AM +0000, James Morse wrote:
>>> +ENTRY(__do_softirq_on_irqstack)
>>> +	push	x19, lr
>>> +	push	x25, x26
>>> +
>>> +	irq_stack_entry lr
>>> +
>>> +	/* irq_stack_entry leaves irq_count in x25 */
>>> +	ldr	x1, [x25]
>>> +	add	x1, x1, #1
>>> +	str	x1, [x25]
>>> +
>>> +	bl	__do_softirq
>>> +
>>> +	ldr	x1, [x25]
>>> +	sub	x1, x1, #1
>>> +	str	x1, [x25]
>>> +
>>> +	irq_stack_exit
>>> +
>>> +	pop x25, x26
>>> +	pop x19, lr
>>> +	ret
>>> +ENDPROC(__do_softirq_on_irqstack)
>> 
>> I was thinking of doing do_softirq_own_stack() entirely in assembly
>> without the need to check for on_irq_stack() check in C, just a test of
>> the irq_count value.
> 
> I tried that, but couldn't get it to work - maybe I'm missing a trick:
> 
> There are at least two ways into do_softirq_own_stack():
> el1_irq() -> __irq_exit() -> do_softirq_own_stack(),
> as well as:
> cpu_switch_to(ksoftirqd) -> do_softirq_own_stack()
> 
> In both cases irq_count == 0 because do_softirq_own_stack() is where we
> update irq_count [0]. I can only see two ways to tell them apart,
> increment irq_count in el1_irq(), (which we don't like), or have that
> bounds checking.
> 
> 
> (We could increase hardirq_count() in el?_irq(), which would mean
> in_irq()/in_interrupt() always returns true when on the irq_stack, which
> would stop softirqs being processed here... but that is a fairly
> significant change in behavior.)

It is not simple to utilize preempt_count which is based on task for stack
switching based on core. Unluckily, I cannot give a specific challenge I've
experienced.

>From implementation perspective, I think it is an advantage to use tpidr_el1
for percpu variables. But, preempt_count based approach might not utilize it.

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 0/4] arm64: Add support for IRQ stack
  2015-12-04 12:17 ` [PATCH v8 0/4] arm64: Add support for IRQ stack Jungseok Lee
@ 2015-12-06 13:56   ` Jungseok Lee
  0 siblings, 0 replies; 44+ messages in thread
From: Jungseok Lee @ 2015-12-06 13:56 UTC (permalink / raw)
  To: linux-arm-kernel

On Dec 4, 2015, at 9:17 PM, Jungseok Lee wrote:
> On Dec 4, 2015, at 8:02 PM, James Morse wrote:
> 
> Hi James,
> 
>> This consolidated series adds a per-cpu irq_stack, which will eventually allow
>> us to reduce the size of task stacks.
>> 
>> Code in entry.S switches to the per-cpu irq_stack when irq_count is zero.
>> This counter is updated in do_softirq_own_stack(), which is called before
>> __do_softirq() re-enables interrupts, which could cause recursive use of the
>> irq stack.
>> 
>> sp_el0 is used as a scratch register to store struct thread_info during el1
>> execution, this means code to find it by masking the stack pointer no longer
>> needs to be inlined. This also lets us remove the alignment requirements for
>> the irq stack, (task stacks still need to be naturally aligned).
>> 
>> Patch 2 is a combination of Akashi Takahiro's 'arm64: unwind_frame for
>> interrupt stack' and 'arm64: fix dump_backtrace() to show correct pt_regs at
>> interrupt', both of which need to be present before irq_stack is enabled in
>> Patch 3.
>> 
>> Patch 4 is new, following Catalin's comments, but I don't think it is
>> necessary unless we also decrease the stack size, which I don't think we
>> should do immediatly - it would be good to collect some max_stack_size values
>> for various workloads first.
>> 
>> This series is based on the v4.4-rc3
>> The series can be pulled from git://linux-arm.org/linux-jm.git -b irq_stack/v8
>> 
>> Comments welcome,
> 
> At first glance, this series looks mature ;) I will leave feedbacks after playing
> with this.

I will hold off until the IRQ count management scheme is determined, which is
an important design point.

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks
  2015-12-04 11:02 ` [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks James Morse
  2015-12-04 13:46   ` Catalin Marinas
@ 2015-12-07 22:48   ` Catalin Marinas
  2015-12-08 11:43     ` Will Deacon
  2015-12-09 13:45   ` Will Deacon
  2 siblings, 1 reply; 44+ messages in thread
From: Catalin Marinas @ 2015-12-07 22:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Dec 04, 2015 at 11:02:27AM +0000, James Morse wrote:
> entry.S is modified to switch to the per_cpu irq_stack during el{0,1}_irq.
> irq_count is used to detect recursive interrupts on the irq_stack, it is
> updated late by do_softirq_own_stack(), when called on the irq_stack, before
> __do_softirq() re-enables interrupts to process softirqs.
> 
> do_softirq_own_stack() is added by this patch, but does not yet switch
> stack.
> 
> This patch adds the dummy stack frame and data needed by the previous
> stack tracing patches.
> 
> Signed-off-by: James Morse <james.morse@arm.com>

In the interest of getting things moving on this series:

Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>

I propose that we skip patch 4 for now and, as James suggested, keep the
THREAD_SIZE to 16KB for one more release cycle. It would be good to get
some statistics on stack usage.

Patches 1-3 look fine to me on their own.

-- 
Catalin

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 4/4] arm64: switch to irq_stack during softirq
  2015-12-04 18:40       ` Catalin Marinas
@ 2015-12-08 10:29         ` James Morse
  0 siblings, 0 replies; 44+ messages in thread
From: James Morse @ 2015-12-08 10:29 UTC (permalink / raw)
  To: linux-arm-kernel

On 04/12/15 18:40, Catalin Marinas wrote:
> BTW, at which point does in_irq() return true for an interrupt (and when
> it ceases to do so)?

It uses hardirq_count(), so its true between irq_enter() and irq_exit().

The calls out from handle_IPI() are wrapped by these (except
scheduler_ipi(), which calls them itself).

The other route looks like:
gic_handle_irq() -> __handle_domain_irq()
which wraps generic_handle_irq() in irq_enter()/irq_exit().


I don't know how chained_irq_enter() changes/affects this.


Thanks,

James

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks
  2015-12-07 22:48   ` Catalin Marinas
@ 2015-12-08 11:43     ` Will Deacon
  2015-12-08 16:02       ` Jungseok Lee
  0 siblings, 1 reply; 44+ messages in thread
From: Will Deacon @ 2015-12-08 11:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 07, 2015 at 10:48:07PM +0000, Catalin Marinas wrote:
> On Fri, Dec 04, 2015 at 11:02:27AM +0000, James Morse wrote:
> > entry.S is modified to switch to the per_cpu irq_stack during el{0,1}_irq.
> > irq_count is used to detect recursive interrupts on the irq_stack, it is
> > updated late by do_softirq_own_stack(), when called on the irq_stack, before
> > __do_softirq() re-enables interrupts to process softirqs.
> > 
> > do_softirq_own_stack() is added by this patch, but does not yet switch
> > stack.
> > 
> > This patch adds the dummy stack frame and data needed by the previous
> > stack tracing patches.
> > 
> > Signed-off-by: James Morse <james.morse@arm.com>
> 
> In the interest of getting things moving on this series:
> 
> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
> 
> I propose that we skip patch 4 for now and, as James suggested, keep the
> THREAD_SIZE to 16KB for one more release cycle. It would be good to get
> some statistics on stack usage.
> 
> Patches 1-3 look fine to me on their own.

I'll pick them up and see if they survive testing.

Will

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks
  2015-12-08 11:43     ` Will Deacon
@ 2015-12-08 16:02       ` Jungseok Lee
  2015-12-08 17:23         ` James Morse
  0 siblings, 1 reply; 44+ messages in thread
From: Jungseok Lee @ 2015-12-08 16:02 UTC (permalink / raw)
  To: linux-arm-kernel

On Dec 8, 2015, at 8:43 PM, Will Deacon wrote:
> On Mon, Dec 07, 2015 at 10:48:07PM +0000, Catalin Marinas wrote:
>> On Fri, Dec 04, 2015 at 11:02:27AM +0000, James Morse wrote:
>>> entry.S is modified to switch to the per_cpu irq_stack during el{0,1}_irq.
>>> irq_count is used to detect recursive interrupts on the irq_stack, it is
>>> updated late by do_softirq_own_stack(), when called on the irq_stack, before
>>> __do_softirq() re-enables interrupts to process softirqs.
>>> 
>>> do_softirq_own_stack() is added by this patch, but does not yet switch
>>> stack.
>>> 
>>> This patch adds the dummy stack frame and data needed by the previous
>>> stack tracing patches.
>>> 
>>> Signed-off-by: James Morse <james.morse@arm.com>
>> 
>> In the interest of getting things moving on this series:
>> 
>> Reviewed-by: Catalin Marinas <catalin.marinas@arm.com>
>> 
>> I propose that we skip patch 4 for now and, as James suggested, keep the
>> THREAD_SIZE to 16KB for one more release cycle. It would be good to get
>> some statistics on stack usage.
>> 
>> Patches 1-3 look fine to me on their own.
> 
> I'll pick them up and see if they survive testing.

Dear All

I've seen the following BUG log with CONFIG_DEBUG_SPINLOCK=y kernel.
 
 BUG: spinlock lockup suspected on CPU#1

Under that option, I cannot even complete a single kernel build successfully.
I hope I'm the only person to experience it. My Android machine is running
well for over 12 hours now with the below change.

----8<----

diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index fa2a8d0..db90a69 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -27,7 +27,7 @@ DECLARE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
  * The offset from irq_stack_ptr where entry.S will store the original
  * stack pointer. Used by unwind_frame() and dump_backtrace().
  */
-#define IRQ_STACK_TO_TASK_STACK(ptr) *((unsigned long *)(ptr - 0x10));
+#define IRQ_STACK_TO_TASK_STACK(ptr) *((unsigned long *)(ptr - 0x20));
 
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 81cc538..7d54c26 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -199,7 +199,7 @@ alternative_endif
        /* Add a dummy stack frame */
        stp     x29, \dummy_lr, [sp, #-16]!           // dummy stack frame
        mov     x29, sp
-       stp     xzr, x19, [sp, #-16]!
+       stp     x19, xzr, [sp, #-16]!
 
 9998:
        .endm

----8<----

If I read the patches correctly, the dummy stack frame looks as follows.

  top ------------ <- irq_stack_ptr
      | dummy_lr |
      ------------
      |   x29    |
      ------------ <- new frame pointer (x29)
      |   x19    |
      ------------
      |   xzr    |
      ------------

So, we should refer to x19 in order to retrieve frame->sp. But, frame->sp is
xzr under the current implementation. I suspect this causes spinlock lockup.

Please correct me if I'm wrong.

Best Regards
Jungseok Lee

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks
  2015-12-08 16:02       ` Jungseok Lee
@ 2015-12-08 17:23         ` James Morse
  2015-12-08 17:27           ` Will Deacon
                             ` (2 more replies)
  0 siblings, 3 replies; 44+ messages in thread
From: James Morse @ 2015-12-08 17:23 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/12/15 16:02, Jungseok Lee wrote:
> I've seen the following BUG log with CONFIG_DEBUG_SPINLOCK=y kernel.
>  
>  BUG: spinlock lockup suspected on CPU#1
> 
> Under that option, I cannot even complete a single kernel build successfully.
> I hope I'm the only person to experience it. My Android machine is running
> well for over 12 hours now with the below change.

I can't reproduce this, can you send me your .config file (off-list)? Do
you have any other patches in your tree? What hardware are you using?


> If I read the patches correctly, the dummy stack frame looks as follows.
> 
>   top ------------ <- irq_stack_ptr
>       | dummy_lr |
>       ------------
>       |   x29    |
>       ------------ <- new frame pointer (x29)
>       |   x19    |
>       ------------
>       |   xzr    |
>       ------------
> 
> So, we should refer to x19 in order to retrieve frame->sp. But, frame->sp is
> xzr under the current implementation. I suspect this causes spinlock lockup.

This is the sort of place where it is too easy to make an off-by-one
error, I will go through it all with the debugger again tomorrow.


I'm not seeing this when testing on Juno. This would only affect the
tracing code, are you running perf or ftrace at the same time?

I've just re-tested this with defconfig, and the following hack:
=======%<=======
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index b947eeffa5b2..686086e4d870 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -72,8 +72,10 @@ int notrace unwind_frame(struct stackframe *frame)
         * If we reach the end of the stack - and its an interrupt stack,
         * read the original task stack pointer from the dummy frame.
         */
-       if (frame->sp == irq_stack_ptr)
+       if (frame->sp == irq_stack_ptr) {
                frame->sp = IRQ_STACK_TO_TASK_STACK(irq_stack_ptr);
+               BUG_ON(frame->sp == 0);
+       }

        return 0;
 }
=======%<=======

While running:
> sudo ./perf record -e mem:<address of __do_softirq>:x -ag -- sleep 180

and

> dd if=/dev/sda of=/dev/null bs=512 iflag=direct;

This should cause lots of interrupts from /dev/sda, and cause the
tracing code to step between irq_stack and the original task stack
frequently. The BUG_ON() doesn't fire, and the perf trace output looks
correct.


My only theory is that there is an off by one, and its reading what was
x29 instead. This wouldn't show up in these tests, but might be a
problem for aarch32 user-space, as presumably x29==0 when it switches to
aarch64 mode for el0_irq(). I will try this tomorrow.



Thanks,

James

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks
  2015-12-08 17:23         ` James Morse
@ 2015-12-08 17:27           ` Will Deacon
  2015-12-08 23:13           ` Jungseok Lee
  2015-12-09  9:47           ` James Morse
  2 siblings, 0 replies; 44+ messages in thread
From: Will Deacon @ 2015-12-08 17:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Dec 08, 2015 at 05:23:32PM +0000, James Morse wrote:
> On 08/12/15 16:02, Jungseok Lee wrote:
> > I've seen the following BUG log with CONFIG_DEBUG_SPINLOCK=y kernel.
> >  
> >  BUG: spinlock lockup suspected on CPU#1
> > 
> > Under that option, I cannot even complete a single kernel build successfully.
> > I hope I'm the only person to experience it. My Android machine is running
> > well for over 12 hours now with the below change.
> 
> I can't reproduce this, can you send me your .config file (off-list)? Do
> you have any other patches in your tree? What hardware are you using?

FWIW, I tried to reproduce it and hit something that looks slightly
different. Crash log below.

> > If I read the patches correctly, the dummy stack frame looks as follows.
> > 
> >   top ------------ <- irq_stack_ptr
> >       | dummy_lr |
> >       ------------
> >       |   x29    |
> >       ------------ <- new frame pointer (x29)
> >       |   x19    |
> >       ------------
> >       |   xzr    |
> >       ------------
> > 
> > So, we should refer to x19 in order to retrieve frame->sp. But, frame->sp is
> > xzr under the current implementation. I suspect this causes spinlock lockup.
> 
> This is the sort of place where it is too easy to make an off-by-one
> error, I will go through it all with the debugger again tomorrow.

Ok; I'll hold off pushing this into linux-next until we've worked out
what's going wrong.

> I'm not seeing this when testing on Juno. This would only affect the
> tracing code, are you running perf or ftrace at the same time?

I just set CONFIG_PROVE_LOCKING=y, but that likely turns on a bunch of
the tracing infrastructure.

Will

--->8

Unable to handle kernel paging request at virtual address 7fff6dd050
pgd = ffffffc0601de000
[7fff6dd050] *pgd=00000000f905b003, *pud=00000000f905b003, *pmd=00000000f90cd003, *pte=00a00009f52aabd3
Internal error: Oops: 9600000b [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 1365 Comm: networking Not tainted 4.4.0-rc3+ #1
Hardware name: ARM Juno development board (r0) (DT)
task: ffffffc0792be400 ti: ffffffc0790dc000 task.ti: ffffffc0790dc000
PC is at unwind_frame+0x74/0xa0
LR is at walk_stackframe+0x28/0x50
pc : [<ffffffc0000896bc>] lr : [<ffffffc000089710>] pstate: a00001c5
sp : ffffffc97fe5b8a0
x29: ffffffc97fe5b8a0 x28: ffffffc0792be400
x27: ffffffc000994000 x26: ffffffc000c0f000
x25: ffffffc0792beab0 x24: ffffffc0792beb00
x23: ffffffc000c18518 x22: ffffffc0016aa9a0
x21: ffffffc0000895a0 x20: ffffffc97fe5b8f8
x19: ffffffc97fe5b908 x18: 000000000000381b
x17: ffffffc0014b0130 x16: ffffffc00168f9d8
x15: 000000000000381a x14: ffffffc00136f8d8
x13: 0000000000000002 x12: 0000000000000000
x11: ffffffc975de1d00 x10: ffffffc001696000
x9 : 0000000000000000 x8 : 0000000000000001
x7 : ffffffc000bff8d8 x6 : 0000000000000000
x5 : 0000007fff6dced0 x4 : 0000007fff6dd060
x3 : 0000000000000000 x2 : 0000007fff6dd050
x1 : ffffffc97fe58060 x0 : ffffffc97fe5b908

Process networking (pid: 1365, stack limit = 0xffffffc0790dc020)
Stack: (0xffffffc97fe5b8a0 to 0xffffffc0790e0000)
Call trace:
[<ffffffc0000896bc>] unwind_frame+0x74/0xa0
[<ffffffc000089794>] save_stack_trace_tsk+0x5c/0xa8
[<ffffffc0000897f8>] save_stack_trace+0x18/0x20
[<ffffffc0000f9600>] save_trace+0x48/0x100
[<ffffffc0000fd690>] __lock_acquire+0x1bc8/0x1c48
[<ffffffc0000fdedc>] lock_acquire+0x4c/0x68
[<ffffffc00064f2b0>] _raw_spin_lock+0x40/0x58
[<ffffffc0001af9a8>] unfreeze_partials.isra.23+0x78/0x2c0
[<ffffffc0001afefc>] put_cpu_partial+0x16c/0x200
[<ffffffc0001b14c4>] __slab_free+0x2e4/0x430
[<ffffffc0001b1e1c>] kfree+0x1d4/0x1e8
[<ffffffc00013c9dc>] put_css_set_locked+0x114/0x168
[<ffffffc00013cd7c>] put_css_set+0xac/0xc0
[<ffffffc000143544>] cgroup_free+0x9c/0x108
[<ffffffc0000b41c0>] __put_task_struct+0x38/0x110
[<ffffffc0000b7b60>] delayed_put_task_struct+0x40/0x50
[<ffffffc000115d50>] rcu_process_callbacks+0x2f8/0x5f8
[<ffffffc0000bb0c4>] __do_softirq+0x13c/0x278
[<ffffffc0000860d4>] do_softirq_own_stack+0x84/0xc8
[<ffffffc0000bb3c0>] irq_exit+0xa0/0xd8
[<ffffffc000107b60>] __handle_domain_irq+0x60/0xb8
[<ffffffc0000824f0>] gic_handle_irq+0x58/0xa8
Exception stack(0x0000007fff6dc3e0 to 0x0000007fff6dc500)
c3e0: 0000007fff6dc420 000000558c8d6df8 000000558c8ed058 000000558c8ec000
c400: 0000000000000000 00000055c65b3105 00000055c65bed30 0000000000000000
c420: 0000007fff6dc440 000000558c8caef8 0000000000000000 0000007fff6dc47f
c440: 0000007fff6dc5f0 000000558c8cb3c4 0000000000000001 0000000000000000
c460: 0000000000000000 00000055c65b3105 00000055c65bed30 000000558c8d1b00
c480: 0000000000000000 0000000100000000 000000558c8ec548 00000055c65bed30
c4a0: 0000007fff6dc4b0 0000007fff6ddf82 0000007fff6dc7a8 0000000000000001
c4c0: 0000000000000000 0000000000000000 00000055c65b3105 00000055c65bed30
c4e0: 0000000000000000 00000055c65b77f8 000000558c8ec000 0000007fff6dc6c8
[<ffffffc000085b0c>] el0_irq_naked+0x4c/0x60
[<000000558c8d6d88>] 0x558c8d6d88
[<000000558c8d6df8>] 0x558c8d6df8
[<000000558c8caef8>] 0x558c8caef8
[<000000558c8cb3c4>] 0x558c8cb3c4
[<000000558c8ca2e4>] 0x558c8ca2e4
[<000000558c8ca2ac>] 0x558c8ca2ac
[<000000558c8ca924>] 0x558c8ca924
[<000000558c8cb5a0>] 0x558c8cb5a0
[<000000558c8ca2e4>] 0x558c8ca2e4
[<000000558c8cb738>] 0x558c8cb738
[<000000558c8ce92c>] 0x558c8ce92c
[<000000558c8ceb80>] 0x558c8ceb80
[<000000558c8cb1c0>] 0x558c8cb1c0
[<000000558c8ca2e4>] 0x558c8ca2e4
[<000000558c8ca2ac>] 0x558c8ca2ac
[<000000558c8ca3f0>] 0x558c8ca3f0
[<000000558c8ca3f0>] 0x558c8ca3f0
[<000000558c8ca52c>] 0x558c8ca52c
[<000000558c8ca2ac>] 0x558c8ca2ac
[<000000558c8d18b0>] 0x558c8d18b0
[<000000558c8c8604>] 0x558c8c8604
[<0000007f968fe9bc>] 0x7f968fe9bc

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks
  2015-12-08 17:23         ` James Morse
  2015-12-08 17:27           ` Will Deacon
@ 2015-12-08 23:13           ` Jungseok Lee
  2015-12-09  9:47           ` James Morse
  2 siblings, 0 replies; 44+ messages in thread
From: Jungseok Lee @ 2015-12-08 23:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Dec 9, 2015, at 2:23 AM, James Morse wrote:

Hi James,

> On 08/12/15 16:02, Jungseok Lee wrote:
>> I've seen the following BUG log with CONFIG_DEBUG_SPINLOCK=y kernel.
>> 
>> BUG: spinlock lockup suspected on CPU#1
>> 
>> Under that option, I cannot even complete a single kernel build successfully.
>> I hope I'm the only person to experience it. My Android machine is running
>> well for over 12 hours now with the below change.
> 
> I can't reproduce this, can you send me your .config file (off-list)? Do
> you have any other patches in your tree? What hardware are you using?

No additional patches with dragonboard410c [1]. I don't have Juno unfortunately.
Please refer to 'Lock Debugging' and  'RCU Debugging' of .config. 

#
# Lock Debugging (spinlocks, mutexes, etc...)
#
# CONFIG_DEBUG_RT_MUTEXES is not set
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
# CONFIG_DEBUG_WW_MUTEX_SLOWPATH is not set
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
# CONFIG_LOCK_STAT is not set
# CONFIG_DEBUG_LOCKDEP is not set
# CONFIG_DEBUG_ATOMIC_SLEEP is not set
# CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set
# CONFIG_LOCK_TORTURE_TEST is not set
CONFIG_TRACE_IRQFLAGS=y
CONFIG_STACKTRACE=y
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_HAVE_DEBUG_BUGVERBOSE=y
CONFIG_DEBUG_BUGVERBOSE=y
# CONFIG_DEBUG_LIST is not set
# CONFIG_DEBUG_PI_LIST is not set
# CONFIG_DEBUG_SG is not set
# CONFIG_DEBUG_NOTIFIERS is not set
# CONFIG_DEBUG_CREDENTIALS is not set

#
# RCU Debugging
#
CONFIG_PROVE_RCU=y
# CONFIG_PROVE_RCU_REPEATEDLY is not set
# CONFIG_SPARSE_RCU_POINTER is not set
# CONFIG_TORTURE_TEST is not set
# CONFIG_RCU_TORTURE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=21
# CONFIG_RCU_TRACE is not set
# CONFIG_RCU_EQS_DEBUG is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_NOTIFIER_ERROR_INJECTION is not set
# CONFIG_FAULT_INJECTION is not set
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_SYSCALL_TRACEPOINTS=y
CONFIG_HAVE_C_RECORDMCOUNT=y
CONFIG_TRACING_SUPPORT=y
# CONFIG_FTRACE is not set

>> If I read the patches correctly, the dummy stack frame looks as follows.
>> 
>>  top ------------ <- irq_stack_ptr
>>      | dummy_lr |
>>      ------------
>>      |   x29    |
>>      ------------ <- new frame pointer (x29)
>>      |   x19    |
>>      ------------
>>      |   xzr    |
>>      ------------
>> 
>> So, we should refer to x19 in order to retrieve frame->sp. But, frame->sp is
>> xzr under the current implementation. I suspect this causes spinlock lockup.
> 
> This is the sort of place where it is too easy to make an off-by-one
> error, I will go through it all with the debugger again tomorrow.
> 
> 
> I'm not seeing this when testing on Juno. This would only affect the
> tracing code, are you running perf or ftrace at the same time?

No. I've got the same call trace Will reported when only trying kernel build.  

> I've just re-tested this with defconfig, and the following hack:
> =======%<=======
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index b947eeffa5b2..686086e4d870 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -72,8 +72,10 @@ int notrace unwind_frame(struct stackframe *frame)
>         * If we reach the end of the stack - and its an interrupt stack,
>         * read the original task stack pointer from the dummy frame.
>         */
> -       if (frame->sp == irq_stack_ptr)
> +       if (frame->sp == irq_stack_ptr) {
>                frame->sp = IRQ_STACK_TO_TASK_STACK(irq_stack_ptr);
> +               BUG_ON(frame->sp == 0);
> +       }
> 
>        return 0;
> }
> =======%<=======
> 
> While running:
>> sudo ./perf record -e mem:<address of __do_softirq>:x -ag -- sleep 180
> 
> and
> 
>> dd if=/dev/sda of=/dev/null bs=512 iflag=direct;
> 
> This should cause lots of interrupts from /dev/sda, and cause the
> tracing code to step between irq_stack and the original task stack
> frequently. The BUG_ON() doesn't fire, and the perf trace output looks
> correct.
> 
> 
> My only theory is that there is an off by one, and its reading what was
> x29 instead. This wouldn't show up in these tests, but might be a
> problem for aarch32 user-space, as presumably x29==0 when it switches to
> aarch64 mode for el0_irq(). I will try this tomorrow.

My description on dummy stack frame was incorrect. frame->sp is retrieved from
x29, not xzr currently. That is why 0x20 is needed instead of 0x10.

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks
  2015-12-08 17:23         ` James Morse
  2015-12-08 17:27           ` Will Deacon
  2015-12-08 23:13           ` Jungseok Lee
@ 2015-12-09  9:47           ` James Morse
  2015-12-09 11:38             ` Will Deacon
  2 siblings, 1 reply; 44+ messages in thread
From: James Morse @ 2015-12-09  9:47 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/12/15 17:23, James Morse wrote:
> My only theory is that there is an off by one, and its reading what was
> x29 instead. This wouldn't show up in these tests, but might be a
> problem for aarch32 user-space, as presumably x29==0 when it switches to
> aarch64 mode for el0_irq(). I will try this tomorrow.

Yup, this is what is happening. Its an off-by-one due to broken thinking
about how the stack works. My broken thinking was:

>   top ------------
>       | dummy_lr | <- irq_stack_ptr
>       ------------
>       |   x29    |
>       ------------
>       |   x19    | <- irq_stack_ptr - 0x10
>       ------------
>       |   xzr    |
>       ------------

But the stack-pointer is decreased before use. So it actually looks like
this:

>       ------------
>       |          |  <- irq_stack_ptr
>   top ------------
>       | dummy_lr |
>       ------------
>       |   x29    | <- irq_stack_ptr - 0x10
>       ------------
>       |   x19    |
>       ------------
>       |   xzr    | <- irq_stack_ptr - 0x20
>       ------------

The value being used as the original stack is x29, which in all the
tests is sp but without the current frames data, hence there are no
missing frames in the output.

Jungseok Lee picked it up with a 32bit user space because aarch32 can't
use x29, so it remains 0 forever. The fix he posted is correct.

Will: do you want to take Jungseok Lee's patch as a 'Fixes:', or is it
easier if I repost the series?


Thanks,

James

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks
  2015-12-09  9:47           ` James Morse
@ 2015-12-09 11:38             ` Will Deacon
  0 siblings, 0 replies; 44+ messages in thread
From: Will Deacon @ 2015-12-09 11:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 09, 2015 at 09:47:03AM +0000, James Morse wrote:
> On 08/12/15 17:23, James Morse wrote:
> > My only theory is that there is an off by one, and its reading what was
> > x29 instead. This wouldn't show up in these tests, but might be a
> > problem for aarch32 user-space, as presumably x29==0 when it switches to
> > aarch64 mode for el0_irq(). I will try this tomorrow.
> 
> Yup, this is what is happening. Its an off-by-one due to broken thinking
> about how the stack works. My broken thinking was:
> 
> >   top ------------
> >       | dummy_lr | <- irq_stack_ptr
> >       ------------
> >       |   x29    |
> >       ------------
> >       |   x19    | <- irq_stack_ptr - 0x10
> >       ------------
> >       |   xzr    |
> >       ------------
> 
> But the stack-pointer is decreased before use. So it actually looks like
> this:
> 
> >       ------------
> >       |          |  <- irq_stack_ptr
> >   top ------------
> >       | dummy_lr |
> >       ------------
> >       |   x29    | <- irq_stack_ptr - 0x10
> >       ------------
> >       |   x19    |
> >       ------------
> >       |   xzr    | <- irq_stack_ptr - 0x20
> >       ------------
> 
> The value being used as the original stack is x29, which in all the
> tests is sp but without the current frames data, hence there are no
> missing frames in the output.
> 
> Jungseok Lee picked it up with a 32bit user space because aarch32 can't
> use x29, so it remains 0 forever. The fix he posted is correct.
> 
> Will: do you want to take Jungseok Lee's patch as a 'Fixes:', or is it
> easier if I repost the series?

I'll take it as a fix on top, but I still want to get to the bottom of
why unwind_frame appeared to be exploding. We really shouldn't be relying
on the frame layout to provide us with safe addresses in there.

Will

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks
  2015-12-04 11:02 ` [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks James Morse
  2015-12-04 13:46   ` Catalin Marinas
  2015-12-07 22:48   ` Catalin Marinas
@ 2015-12-09 13:45   ` Will Deacon
  2015-12-09 14:36     ` James Morse
  2 siblings, 1 reply; 44+ messages in thread
From: Will Deacon @ 2015-12-09 13:45 UTC (permalink / raw)
  To: linux-arm-kernel

Hi James,

On Fri, Dec 04, 2015 at 11:02:27AM +0000, James Morse wrote:
> entry.S is modified to switch to the per_cpu irq_stack during el{0,1}_irq.
> irq_count is used to detect recursive interrupts on the irq_stack, it is
> updated late by do_softirq_own_stack(), when called on the irq_stack, before
> __do_softirq() re-enables interrupts to process softirqs.
> 
> do_softirq_own_stack() is added by this patch, but does not yet switch
> stack.
> 
> This patch adds the dummy stack frame and data needed by the previous
> stack tracing patches.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  arch/arm64/include/asm/irq.h |  2 ++
>  arch/arm64/kernel/entry.S    | 42 ++++++++++++++++++++++++++++++++++++++++--
>  arch/arm64/kernel/irq.c      | 38 +++++++++++++++++++++++++++++++++++++-
>  3 files changed, 79 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
> index e2f3f135a3bc..fa2a8d0e4792 100644
> --- a/arch/arm64/include/asm/irq.h
> +++ b/arch/arm64/include/asm/irq.h
> @@ -11,6 +11,8 @@
>  #include <asm-generic/irq.h>
>  #include <asm/thread_info.h>
>  
> +#define __ARCH_HAS_DO_SOFTIRQ
> +
>  struct pt_regs;
>  
>  DECLARE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index fc87373d3f88..81cc5380977d 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -27,6 +27,7 @@
>  #include <asm/cpufeature.h>
>  #include <asm/errno.h>
>  #include <asm/esr.h>
> +#include <asm/irq.h>
>  #include <asm/thread_info.h>
>  #include <asm/unistd.h>
>  
> @@ -175,6 +176,42 @@ alternative_endif
>  	mrs	\rd, sp_el0
>  	.endm
>  
> +	.macro	irq_stack_entry, dummy_lr
> +	mov	x19, sp			// preserve the original sp
> +
> +	adr_l	x25, irq_stack
> +	mrs	x26, tpidr_el1
> +	add	x25, x25, x26

Perhaps we could add a macro to assembler.h to correspond to __my_cpu_offset
in percpu.h?

> +
> +	/*
> +	 * Check the lowest address on irq_stack for the irq_count value,
> +	 * incremented by do_softirq_own_stack if we have re-enabled irqs
> +	 * while on the irq_stack.
> +	 */
> +	ldr	x26, [x25]
> +	cbnz	x26, 9998f		// recursive use?
> +
> +	/* switch to the irq stack */
> +	mov	x26, #IRQ_STACK_START_SP
> +	add	x26, x25, x26
> +	mov	sp, x26
> +
> +	/* Add a dummy stack frame */
> +	stp     x29, \dummy_lr, [sp, #-16]!           // dummy stack frame
> +	mov	x29, sp
> +	stp     xzr, x19, [sp, #-16]!

Hmm. I'm not sure we necessarily want to push a frame when the interrupt
was taken from userspace. The unwinder will either explode (which should
be fixed separately) or truncate the walk anyway.

If we changed this so that we only push a frame when taking an interrupt
from EL1, could we then avoid pushing x19 as well and get the unwinder
to walk back through the pushed fp like it usually would?

For the case where we've come from EL0, we want to zero fp. I don't
*think* we need to push anything at all.

Thoughts?

Will

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks
  2015-12-09 13:45   ` Will Deacon
@ 2015-12-09 14:36     ` James Morse
  0 siblings, 0 replies; 44+ messages in thread
From: James Morse @ 2015-12-09 14:36 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

On 09/12/15 13:45, Will Deacon wrote:
>> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
>> index fc87373d3f88..81cc5380977d 100644
>> --- a/arch/arm64/kernel/entry.S
>> +++ b/arch/arm64/kernel/entry.S
>> @@ -27,6 +27,7 @@
>>  #include <asm/cpufeature.h>
>>  #include <asm/errno.h>
>>  #include <asm/esr.h>
>> +#include <asm/irq.h>
>>  #include <asm/thread_info.h>
>>  #include <asm/unistd.h>
>>  
>> @@ -175,6 +176,42 @@ alternative_endif
>>  	mrs	\rd, sp_el0
>>  	.endm
>>  
>> +	.macro	irq_stack_entry, dummy_lr
>> +	mov	x19, sp			// preserve the original sp
>> +
>> +	adr_l	x25, irq_stack
>> +	mrs	x26, tpidr_el1
>> +	add	x25, x25, x26
> 
> Perhaps we could add a macro to assembler.h to correspond to __my_cpu_offset
> in percpu.h?

Is it worth going all the way and having a this_cpu_ptr() macro? I can't
think of any other use for reading tpidr_el1 from asm.


>> +
>> +	/*
>> +	 * Check the lowest address on irq_stack for the irq_count value,
>> +	 * incremented by do_softirq_own_stack if we have re-enabled irqs
>> +	 * while on the irq_stack.
>> +	 */
>> +	ldr	x26, [x25]
>> +	cbnz	x26, 9998f		// recursive use?
>> +
>> +	/* switch to the irq stack */
>> +	mov	x26, #IRQ_STACK_START_SP
>> +	add	x26, x25, x26
>> +	mov	sp, x26
>> +
>> +	/* Add a dummy stack frame */
>> +	stp     x29, \dummy_lr, [sp, #-16]!           // dummy stack frame
>> +	mov	x29, sp
>> +	stp     xzr, x19, [sp, #-16]!
> 
> Hmm. I'm not sure we necessarily want to push a frame when the interrupt
> was taken from userspace. The unwinder will either explode (which should
> be fixed separately) or truncate the walk anyway.

I think the exploding is due to insufficient sanity checks round the:
>???????if (frame->sp == irq_stack_ptr)
>??????????????frame->sp = IRQ_STACK_TO_TASK_STACK(irq_stack_ptr);

This is effectively allowing the maybe-bogus-fp value at the bottom of
the irq_stack to pass the bounds checking, without doing any checking of
the IRQ_STACK_TO_TASK_STACK(irq_stack_ptr) value.

An extra thing we can do here is check that both this new frame->sp and
frame->fp point into current->stack.


> If we changed this so that we only push a frame when taking an interrupt
> from EL1, could we then avoid pushing x19 as well and get the unwinder
> to walk back through the pushed fp like it usually would?

x19 is also struct pt_regs, which dump_backtrace() wants to print the
registers that were passed into gic_handle_irq(). I think this still
needs to be found at a known location on the stack.


> For the case where we've come from EL0, we want to zero fp.

That sounds better - I will see how it behaves.



Thanks!


James

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 5/4] arm64: Fix off-by-one in stack tracing when stepping off irq stack
  2015-12-04 11:02 [PATCH v8 0/4] arm64: Add support for IRQ stack James Morse
                   ` (5 preceding siblings ...)
  2015-12-04 13:57 ` Catalin Marinas
@ 2015-12-10 10:22 ` James Morse
  2015-12-10 10:22   ` [PATCH v8 6/4] arm64: Add this_cpu_ptr() assembler macro for use in entry.S James Morse
                     ` (3 more replies)
  2015-12-15 11:21 ` [PATCH v8 9/4] arm64: reduce stack use in irq_handler James Morse
  2015-12-18 16:01 ` [PATCH v8 9/4] arm64: remove irq_count and do_softirq_own_stack() James Morse
  8 siblings, 4 replies; 44+ messages in thread
From: James Morse @ 2015-12-10 10:22 UTC (permalink / raw)
  To: linux-arm-kernel

From: Jungseok Lee <jungseoklee85@gmail.com>

Signed-off-by: Jungseok Lee <jungseoklee85@gmail.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
I believe Will already has this patch, its included here for completeness.

 arch/arm64/include/asm/irq.h | 2 +-
 arch/arm64/kernel/entry.S    | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index 6fd25ab534ca..cdc2916a00f0 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -27,7 +27,7 @@ DECLARE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
  * The offset from irq_stack_ptr where entry.S will store the original
  * stack pointer. Used by unwind_frame() and dump_backtrace().
  */
-#define IRQ_STACK_TO_TASK_STACK(ptr) *((unsigned long *)(ptr - 0x10));
+#define IRQ_STACK_TO_TASK_STACK(ptr) *((unsigned long *)(ptr - 0x20));
 
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 2963050e5e9e..76e803118f57 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -199,7 +199,7 @@ alternative_endif
 	/* Add a dummy stack frame */
 	stp     x29, \dummy_lr, [sp, #-16]!           // dummy stack frame
 	mov	x29, sp
-	stp     xzr, x19, [sp, #-16]!
+	stp     x19, xzr, [sp, #-16]!
 
 9998:
 	.endm
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 6/4] arm64: Add this_cpu_ptr() assembler macro for use in entry.S
  2015-12-10 10:22 ` [PATCH v8 5/4] arm64: Fix off-by-one in stack tracing when stepping off irq stack James Morse
@ 2015-12-10 10:22   ` James Morse
  2015-12-10 10:22   ` [PATCH v8 7/4] arm64: when walking onto the task stack, check sp & fp are in current->stack James Morse
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 44+ messages in thread
From: James Morse @ 2015-12-10 10:22 UTC (permalink / raw)
  To: linux-arm-kernel

irq_stack is a per_cpu variable, that needs to be access from entry.S.
Use an assembler macro instead of the unreadable details.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/assembler.h | 11 +++++++++++
 arch/arm64/kernel/entry.S          |  4 +---
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 12eff928ef8b..bb7b72734c24 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -193,6 +193,17 @@ lr	.req	x30		// link register
 	str	\src, [\tmp, :lo12:\sym]
 	.endm
 
+	/*
+	 * @sym: The name of the per-cpu variable
+	 * @reg: Result of per_cpu(sym, smp_processor_id())
+	 * @tmp: scratch register
+	 */
+	.macro this_cpu_ptr, sym, reg, tmp
+	adr_l	\reg, \sym
+	mrs	\tmp, tpidr_el1
+	add	\reg, \reg, \tmp
+	.endm
+
 /*
  * Annotate a function as position independent, i.e., safe to be called before
  * the kernel virtual mapping is activated.
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 76e803118f57..9a0450344335 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -179,9 +179,7 @@ alternative_endif
 	.macro	irq_stack_entry, dummy_lr
 	mov	x19, sp			// preserve the original sp
 
-	adr_l	x25, irq_stack
-	mrs	x26, tpidr_el1
-	add	x25, x25, x26
+	this_cpu_ptr irq_stack, x25, x26
 
 	/*
 	 * Check the lowest address on irq_stack for the irq_count value,
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 7/4] arm64: when walking onto the task stack, check sp & fp are in current->stack
  2015-12-10 10:22 ` [PATCH v8 5/4] arm64: Fix off-by-one in stack tracing when stepping off irq stack James Morse
  2015-12-10 10:22   ` [PATCH v8 6/4] arm64: Add this_cpu_ptr() assembler macro for use in entry.S James Morse
@ 2015-12-10 10:22   ` James Morse
  2015-12-10 10:22   ` [PATCH v8 8/4] arm64: don't call C code with el0's fp register James Morse
  2015-12-10 14:03   ` [PATCH v8 5/4] arm64: Fix off-by-one in stack tracing when stepping off irq stack Jungseok Lee
  3 siblings, 0 replies; 44+ messages in thread
From: James Morse @ 2015-12-10 10:22 UTC (permalink / raw)
  To: linux-arm-kernel

When unwind_frame() reaches the bottom of the irq_stack, the last fp
points to the original task stack. unwind_frame() uses
IRQ_STACK_TO_TASK_STACK() to find the sp value. If either values is
wrong, we may end up walking a corrupt stack.

Check these values are sane by testing if they are both on the stack
pointed to by current->stack.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/kernel/stacktrace.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index b947eeffa5b2..d916d5b6aef6 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -71,9 +71,17 @@ int notrace unwind_frame(struct stackframe *frame)
 	 * to task stack.
 	 * If we reach the end of the stack - and its an interrupt stack,
 	 * read the original task stack pointer from the dummy frame.
+	 *
+	 * Check the frame->fp we read from the bottom of the irq_stack,
+	 * and the original task stack pointer are both in current->stack.
 	 */
-	if (frame->sp == irq_stack_ptr)
-		frame->sp = IRQ_STACK_TO_TASK_STACK(irq_stack_ptr);
+	if (frame->sp == irq_stack_ptr) {
+		unsigned long orig_sp = IRQ_STACK_TO_TASK_STACK(irq_stack_ptr);
+
+		if(object_is_on_stack((void *)orig_sp) &&
+		   object_is_on_stack((void *)frame->fp))
+			frame->sp = orig_sp;
+	}
 
 	return 0;
 }
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 8/4] arm64: don't call C code with el0's fp register
  2015-12-10 10:22 ` [PATCH v8 5/4] arm64: Fix off-by-one in stack tracing when stepping off irq stack James Morse
  2015-12-10 10:22   ` [PATCH v8 6/4] arm64: Add this_cpu_ptr() assembler macro for use in entry.S James Morse
  2015-12-10 10:22   ` [PATCH v8 7/4] arm64: when walking onto the task stack, check sp & fp are in current->stack James Morse
@ 2015-12-10 10:22   ` James Morse
  2015-12-10 14:03   ` [PATCH v8 5/4] arm64: Fix off-by-one in stack tracing when stepping off irq stack Jungseok Lee
  3 siblings, 0 replies; 44+ messages in thread
From: James Morse @ 2015-12-10 10:22 UTC (permalink / raw)
  To: linux-arm-kernel

On entry from el0, we save all the registers on the kernel stack, and
restore them before returning. x29 remains unchanged when we call out
to C code, which will store x29 as the frame-pointer on the stack.

Instead, write 0 into x29 after entry from el0, to avoid any risk of
tracing into user space.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/kernel/entry.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 9a0450344335..8dd009b4cab2 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -93,6 +93,8 @@
 	and	tsk, tsk, #~(THREAD_SIZE - 1)	// Ensure MDSCR_EL1.SS is clear,
 	ldr	x19, [tsk, #TI_FLAGS]		// since we can unmask debug
 	disable_step_tsk x19, x20		// exceptions when scheduling.
+
+	mov	x29, xzr			// fp pointed to user-space
 	.else
 	add	x21, sp, #S_FRAME_SIZE
 	.endif
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 5/4] arm64: Fix off-by-one in stack tracing when stepping off irq stack
  2015-12-10 10:22 ` [PATCH v8 5/4] arm64: Fix off-by-one in stack tracing when stepping off irq stack James Morse
                     ` (2 preceding siblings ...)
  2015-12-10 10:22   ` [PATCH v8 8/4] arm64: don't call C code with el0's fp register James Morse
@ 2015-12-10 14:03   ` Jungseok Lee
  3 siblings, 0 replies; 44+ messages in thread
From: Jungseok Lee @ 2015-12-10 14:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Dec 10, 2015, at 7:22 PM, James Morse wrote:

Hi James,

> From: Jungseok Lee <jungseoklee85@gmail.com>

I'm satisfied with spotting the frame->sp issue ;) According to for-next/core
branch, Will have already worked on this change with an informative commit msg.

Thanks for your care!

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 9/4] arm64: reduce stack use in irq_handler
  2015-12-04 11:02 [PATCH v8 0/4] arm64: Add support for IRQ stack James Morse
                   ` (6 preceding siblings ...)
  2015-12-10 10:22 ` [PATCH v8 5/4] arm64: Fix off-by-one in stack tracing when stepping off irq stack James Morse
@ 2015-12-15 11:21 ` James Morse
  2015-12-18 16:01 ` [PATCH v8 9/4] arm64: remove irq_count and do_softirq_own_stack() James Morse
  8 siblings, 0 replies; 44+ messages in thread
From: James Morse @ 2015-12-15 11:21 UTC (permalink / raw)
  To: linux-arm-kernel

The code for switching to irq_stack stores three pieces of information on
the stack, fp+lr, as a fake stack frame (that lets us walk back onto the
interrupted tasks stack frame), and the address of the struct pt_regs that
contains the register values from kernel entry. (which dump_backtrace()
will print in any stack trace).

To reduce this, we store fp, and the pointer to the struct pt_regs.
unwind_frame() can recognise this as the irq_stack dummy frame, (as it only
appears at the top of the irq_stack), and use the struct pt_regs values
to find the missing interrupted link-register.

Suggested-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
This patch also silently removes the other stupid mistakes I made in
IRQ_STACK_TO_TASK_STACK().

Based on for-next/core


 arch/arm64/include/asm/irq.h   | 11 ++++-------
 arch/arm64/kernel/entry.S      | 12 +++++++-----
 arch/arm64/kernel/stacktrace.c | 17 +++++++++++++++--
 3 files changed, 26 insertions(+), 14 deletions(-)

diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index 877c7e358384..3bece4379bd9 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -25,16 +25,13 @@ DECLARE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
  *       ------------
  *       |          |  <- irq_stack_ptr
  *   top ------------
- *       |  elr_el1 |
+ *       |   x19    | <- irq_stack_ptr - 0x08
  *       ------------
  *       |   x29    | <- irq_stack_ptr - 0x10
  *       ------------
- *       |   xzr    |
- *       ------------
- *       |   x19    | <- irq_stack_ptr - 0x20
- *       ------------
  *
- * where x19 holds a copy of the task stack pointer.
+ * where x19 holds a copy of the task stack pointer where the struct pt_regs
+ * from kernel_entry can be found.
  *
  */
 #define IRQ_STACK_PTR(cpu) ((unsigned long)per_cpu(irq_stack, cpu) + IRQ_STACK_START_SP)
@@ -43,7 +40,7 @@ DECLARE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
  * The offset from irq_stack_ptr where entry.S will store the original
  * stack pointer. Used by unwind_frame() and dump_backtrace().
  */
-#define IRQ_STACK_TO_TASK_STACK(ptr) *((unsigned long *)(ptr - 0x20));
+#define IRQ_STACK_TO_TASK_STACK(ptr) (*((unsigned long *)((ptr) - 0x08)))
 
 extern void set_handle_irq(void (*handle_irq)(struct pt_regs *));
 
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 2284c296e3f7..0667fb7d8bb1 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -178,7 +178,7 @@ alternative_endif
 	mrs	\rd, sp_el0
 	.endm
 
-	.macro	irq_stack_entry, dummy_lr
+	.macro	irq_stack_entry
 	mov	x19, sp			// preserve the original sp
 
 	this_cpu_ptr irq_stack, x25, x26
@@ -196,10 +196,12 @@ alternative_endif
 	add	x26, x25, x26
 	mov	sp, x26
 
-	/* Add a dummy stack frame */
-	stp     x29, \dummy_lr, [sp, #-16]!           // dummy stack frame
+	/*
+	 * Add a dummy stack frame, this non-standard format is fixed up
+	 * by unwind_frame()
+	 */
+	stp     x29, x19, [sp, #-16]!
 	mov	x29, sp
-	stp     x19, xzr, [sp, #-16]!
 
 9998:
 	.endm
@@ -229,7 +231,7 @@ tsk	.req	x28		// current thread_info
 	.macro	irq_handler
 	ldr_l	x1, handle_arch_irq
 	mov	x0, sp
-	irq_stack_entry x22
+	irq_stack_entry
 	blr	x1
 	irq_stack_exit
 	.endm
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index d916d5b6aef6..82659e1df761 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -70,17 +70,30 @@ int notrace unwind_frame(struct stackframe *frame)
 	 * Check whether we are going to walk through from interrupt stack
 	 * to task stack.
 	 * If we reach the end of the stack - and its an interrupt stack,
-	 * read the original task stack pointer from the dummy frame.
+	 * unpack the dummy frame to find the original elr.
 	 *
 	 * Check the frame->fp we read from the bottom of the irq_stack,
 	 * and the original task stack pointer are both in current->stack.
 	 */
 	if (frame->sp == irq_stack_ptr) {
+		struct pt_regs *irq_args;
 		unsigned long orig_sp = IRQ_STACK_TO_TASK_STACK(irq_stack_ptr);
 
 		if(object_is_on_stack((void *)orig_sp) &&
-		   object_is_on_stack((void *)frame->fp))
+		   object_is_on_stack((void *)frame->fp)) {
 			frame->sp = orig_sp;
+
+			/* orig_sp is the saved pt_regs, find the elr */
+			irq_args = (struct pt_regs *)orig_sp;
+			frame->pc = irq_args->pc;
+		} else {
+			/*
+			 * This frame has a non-standard format, and we
+			 * didn't fix it, because the data looked wrong.
+			 * Refuse to output this frame.
+			 */
+			return -EINVAL;
+		}
 	}
 
 	return 0;
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 9/4] arm64: remove irq_count and do_softirq_own_stack()
  2015-12-04 11:02 [PATCH v8 0/4] arm64: Add support for IRQ stack James Morse
                   ` (7 preceding siblings ...)
  2015-12-15 11:21 ` [PATCH v8 9/4] arm64: reduce stack use in irq_handler James Morse
@ 2015-12-18 16:01 ` James Morse
  2015-12-20 11:07   ` Jungseok Lee
  8 siblings, 1 reply; 44+ messages in thread
From: James Morse @ 2015-12-18 16:01 UTC (permalink / raw)
  To: linux-arm-kernel

sysrq_handle_reboot() re-enables interrupts while on the irq stack. The
irq_stack implementation wrongly assumed this would only ever happen
via the softirq path, allowing it to update irq_count late, in
do_softirq_own_stack().

This means if an irq occurs in sysrq_handle_reboot(), during
emergency_restart() the stack will be corrupted, as irq_count wasn't
updated.

Lose the optimisation, and instead of moving the adding/subtracting of
irq_count into irq_stack_entry/irq_stack_exit, remove it, and compare
sp_el0 (struct task_info) with sp & ~(THREAD_SIZE - 1). This tells us if
we are on a task stack, if so, we can safely switch to the irq stack.
Finally, remove do_softirq_own_stack(), we don't need it anymore.

Reported-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/irq.h |  2 --
 arch/arm64/kernel/entry.S    | 17 +++++++++--------
 arch/arm64/kernel/irq.c      | 38 +-------------------------------------
 3 files changed, 10 insertions(+), 47 deletions(-)

diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
index 3bece4379bd9..b77197d941fc 100644
--- a/arch/arm64/include/asm/irq.h
+++ b/arch/arm64/include/asm/irq.h
@@ -11,8 +11,6 @@
 #include <asm-generic/irq.h>
 #include <asm/thread_info.h>
 
-#define __ARCH_HAS_DO_SOFTIRQ
-
 struct pt_regs;
 
 DECLARE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 0667fb7d8bb1..6745a9041f99 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -181,19 +181,20 @@ alternative_endif
 	.macro	irq_stack_entry
 	mov	x19, sp			// preserve the original sp
 
-	this_cpu_ptr irq_stack, x25, x26
-
 	/*
-	 * Check the lowest address on irq_stack for the irq_count value,
-	 * incremented by do_softirq_own_stack if we have re-enabled irqs
-	 * while on the irq_stack.
+	 * Compare sp and sp_el0, if the top ~(THREAD_SIZE - 1) bits match,
+	 * we are on a task stack, and should switch to the irq stack.
 	 */
-	ldr	x26, [x25]
-	cbnz	x26, 9998f		// recursive use?
+	mrs	x26, sp_el0		// already masked
+	and	x25, x19, #~(THREAD_SIZE - 1)
+	cmp	x25, x26
+	b.ne	9998f
 
-	/* switch to the irq stack */
+	this_cpu_ptr irq_stack, x25, x26
 	mov	x26, #IRQ_STACK_START_SP
 	add	x26, x25, x26
+
+	/* switch to the irq stack */
 	mov	sp, x26
 
 	/*
diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
index ff7ebb710e51..2386b26c0712 100644
--- a/arch/arm64/kernel/irq.c
+++ b/arch/arm64/kernel/irq.c
@@ -25,24 +25,14 @@
 #include <linux/irq.h>
 #include <linux/smp.h>
 #include <linux/init.h>
-#include <linux/interrupt.h>
 #include <linux/irqchip.h>
 #include <linux/seq_file.h>
 
 unsigned long irq_err_count;
 
-/*
- * irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned.
- * irq_stack[0] is used as irq_count, a non-zero value indicates the stack
- * is in use, and el?_irq() shouldn't switch to it. This is used to detect
- * recursive use of the irq_stack, it is lazily updated by
- * do_softirq_own_stack(), which is called on the irq_stack, before
- * re-enabling interrupts to process softirqs.
- */
+/* irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned. */
 DEFINE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack) __aligned(16);
 
-#define IRQ_COUNT()	(*per_cpu(irq_stack, smp_processor_id()))
-
 int arch_show_interrupts(struct seq_file *p, int prec)
 {
 	show_ipi_list(p, prec);
@@ -66,29 +56,3 @@ void __init init_IRQ(void)
 	if (!handle_arch_irq)
 		panic("No interrupt controller found.");
 }
-
-/*
- * do_softirq_own_stack() is called from irq_exit() before __do_softirq()
- * re-enables interrupts, at which point we may re-enter el?_irq(). We
- * increase irq_count here so that el1_irq() knows that it is already on the
- * irq stack.
- *
- * Called with interrupts disabled, so we don't worry about moving cpu, or
- * being interrupted while modifying irq_count.
- *
- * This function doesn't actually switch stack.
- */
-void do_softirq_own_stack(void)
-{
-	int cpu = smp_processor_id();
-
-	WARN_ON_ONCE(!irqs_disabled());
-
-	if (on_irq_stack(current_stack_pointer, cpu)) {
-		IRQ_COUNT()++;
-		__do_softirq();
-		IRQ_COUNT()--;
-	} else {
-		__do_softirq();
-	}
-}
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 9/4] arm64: remove irq_count and do_softirq_own_stack()
  2015-12-18 16:01 ` [PATCH v8 9/4] arm64: remove irq_count and do_softirq_own_stack() James Morse
@ 2015-12-20 11:07   ` Jungseok Lee
  2015-12-21 11:30     ` Will Deacon
  0 siblings, 1 reply; 44+ messages in thread
From: Jungseok Lee @ 2015-12-20 11:07 UTC (permalink / raw)
  To: linux-arm-kernel

On Dec 19, 2015, at 1:01 AM, James Morse wrote:

Hi James,

> sysrq_handle_reboot() re-enables interrupts while on the irq stack. The
> irq_stack implementation wrongly assumed this would only ever happen
> via the softirq path, allowing it to update irq_count late, in
> do_softirq_own_stack().
> 
> This means if an irq occurs in sysrq_handle_reboot(), during
> emergency_restart() the stack will be corrupted, as irq_count wasn't
> updated.
> 
> Lose the optimisation, and instead of moving the adding/subtracting of
> irq_count into irq_stack_entry/irq_stack_exit, remove it, and compare
> sp_el0 (struct task_info) with sp & ~(THREAD_SIZE - 1). This tells us if

Nit: s/task_info/thread_info

> we are on a task stack, if so, we can safely switch to the irq stack.
> Finally, remove do_softirq_own_stack(), we don't need it anymore.
> 
> Reported-by: Will Deacon <will.deacon@arm.com>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> arch/arm64/include/asm/irq.h |  2 --
> arch/arm64/kernel/entry.S    | 17 +++++++++--------
> arch/arm64/kernel/irq.c      | 38 +-------------------------------------
> 3 files changed, 10 insertions(+), 47 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
> index 3bece4379bd9..b77197d941fc 100644
> --- a/arch/arm64/include/asm/irq.h
> +++ b/arch/arm64/include/asm/irq.h
> @@ -11,8 +11,6 @@
> #include <asm-generic/irq.h>
> #include <asm/thread_info.h>
> 
> -#define __ARCH_HAS_DO_SOFTIRQ
> -
> struct pt_regs;
> 
> DECLARE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack);
> diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> index 0667fb7d8bb1..6745a9041f99 100644
> --- a/arch/arm64/kernel/entry.S
> +++ b/arch/arm64/kernel/entry.S
> @@ -181,19 +181,20 @@ alternative_endif
> 	.macro	irq_stack_entry
> 	mov	x19, sp			// preserve the original sp
> 
> -	this_cpu_ptr irq_stack, x25, x26
> -
> 	/*
> -	 * Check the lowest address on irq_stack for the irq_count value,
> -	 * incremented by do_softirq_own_stack if we have re-enabled irqs
> -	 * while on the irq_stack.
> +	 * Compare sp and sp_el0, if the top ~(THREAD_SIZE - 1) bits match,
> +	 * we are on a task stack, and should switch to the irq stack.
> 	 */
> -	ldr	x26, [x25]
> -	cbnz	x26, 9998f		// recursive use?
> +	mrs	x26, sp_el0		// already masked

Nit: How about using 'get_thread_info x26'?

> +	and	x25, x19, #~(THREAD_SIZE - 1)
> +	cmp	x25, x26
> +	b.ne	9998f
> 
> -	/* switch to the irq stack */
> +	this_cpu_ptr irq_stack, x25, x26
> 	mov	x26, #IRQ_STACK_START_SP
> 	add	x26, x25, x26
> +
> +	/* switch to the irq stack */
> 	mov	sp, x26
> 
> 	/*
> diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
> index ff7ebb710e51..2386b26c0712 100644
> --- a/arch/arm64/kernel/irq.c
> +++ b/arch/arm64/kernel/irq.c
> @@ -25,24 +25,14 @@
> #include <linux/irq.h>
> #include <linux/smp.h>
> #include <linux/init.h>
> -#include <linux/interrupt.h>
> #include <linux/irqchip.h>
> #include <linux/seq_file.h>
> 
> unsigned long irq_err_count;
> 
> -/*
> - * irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned.
> - * irq_stack[0] is used as irq_count, a non-zero value indicates the stack
> - * is in use, and el?_irq() shouldn't switch to it. This is used to detect
> - * recursive use of the irq_stack, it is lazily updated by
> - * do_softirq_own_stack(), which is called on the irq_stack, before
> - * re-enabling interrupts to process softirqs.
> - */
> +/* irq stack only needs to be 16 byte aligned - not IRQ_STACK_SIZE aligned. */
> DEFINE_PER_CPU(unsigned long [IRQ_STACK_SIZE/sizeof(long)], irq_stack) __aligned(16);
> 
> -#define IRQ_COUNT()	(*per_cpu(irq_stack, smp_processor_id()))
> -
> int arch_show_interrupts(struct seq_file *p, int prec)
> {
> 	show_ipi_list(p, prec);
> @@ -66,29 +56,3 @@ void __init init_IRQ(void)
> 	if (!handle_arch_irq)
> 		panic("No interrupt controller found.");
> }
> -
> -/*
> - * do_softirq_own_stack() is called from irq_exit() before __do_softirq()
> - * re-enables interrupts, at which point we may re-enter el?_irq(). We
> - * increase irq_count here so that el1_irq() knows that it is already on the
> - * irq stack.
> - *
> - * Called with interrupts disabled, so we don't worry about moving cpu, or
> - * being interrupted while modifying irq_count.
> - *
> - * This function doesn't actually switch stack.
> - */
> -void do_softirq_own_stack(void)
> -{
> -	int cpu = smp_processor_id();
> -
> -	WARN_ON_ONCE(!irqs_disabled());
> -
> -	if (on_irq_stack(current_stack_pointer, cpu)) {
> -		IRQ_COUNT()++;
> -		__do_softirq();
> -		IRQ_COUNT()--;
> -	} else {
> -		__do_softirq();
> -	}
> -}

The approach, removal of the counter, looks good to me from a high level point
of view although I have not played with this change seriously yet.

If I spot something strange, I will report it.

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 9/4] arm64: remove irq_count and do_softirq_own_stack()
  2015-12-20 11:07   ` Jungseok Lee
@ 2015-12-21 11:30     ` Will Deacon
  2015-12-21 12:19       ` James Morse
  0 siblings, 1 reply; 44+ messages in thread
From: Will Deacon @ 2015-12-21 11:30 UTC (permalink / raw)
  To: linux-arm-kernel

On Sun, Dec 20, 2015 at 08:07:46PM +0900, Jungseok Lee wrote:
> On Dec 19, 2015, at 1:01 AM, James Morse wrote:
> > diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
> > index 0667fb7d8bb1..6745a9041f99 100644
> > --- a/arch/arm64/kernel/entry.S
> > +++ b/arch/arm64/kernel/entry.S
> > @@ -181,19 +181,20 @@ alternative_endif
> > 	.macro	irq_stack_entry
> > 	mov	x19, sp			// preserve the original sp
> > 
> > -	this_cpu_ptr irq_stack, x25, x26
> > -
> > 	/*
> > -	 * Check the lowest address on irq_stack for the irq_count value,
> > -	 * incremented by do_softirq_own_stack if we have re-enabled irqs
> > -	 * while on the irq_stack.
> > +	 * Compare sp and sp_el0, if the top ~(THREAD_SIZE - 1) bits match,
> > +	 * we are on a task stack, and should switch to the irq stack.
> > 	 */
> > -	ldr	x26, [x25]
> > -	cbnz	x26, 9998f		// recursive use?
> > +	mrs	x26, sp_el0		// already masked
> 
> Nit: How about using 'get_thread_info x26'?

Something like the following?

Will

--->8

diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
index 6745a9041f99..c0db321db7e1 100644
--- a/arch/arm64/kernel/entry.S
+++ b/arch/arm64/kernel/entry.S
@@ -182,12 +182,12 @@ alternative_endif
 	mov	x19, sp			// preserve the original sp
 
 	/*
-	 * Compare sp and sp_el0, if the top ~(THREAD_SIZE - 1) bits match,
-	 * we are on a task stack, and should switch to the irq stack.
+	 * Compare sp with the current thread_info, if the top
+	 * ~(THREAD_SIZE - 1) bits match, we are on a task stack, and
+	 * should switch to the irq stack.
 	 */
-	mrs	x26, sp_el0		// already masked
 	and	x25, x19, #~(THREAD_SIZE - 1)
-	cmp	x25, x26
+	cmp	x25, tsk
 	b.ne	9998f
 
 	this_cpu_ptr irq_stack, x25, x26
@@ -406,10 +406,10 @@ el1_irq:
 	bl	trace_hardirqs_off
 #endif
 
+	get_thread_info tsk
 	irq_handler
 
 #ifdef CONFIG_PREEMPT
-	get_thread_info tsk
 	ldr	w24, [tsk, #TI_PREEMPT]		// get preempt count
 	cbnz	w24, 1f				// preempt count != 0
 	ldr	x0, [tsk, #TI_FLAGS]		// get flags

^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v8 9/4] arm64: remove irq_count and do_softirq_own_stack()
  2015-12-21 11:30     ` Will Deacon
@ 2015-12-21 12:19       ` James Morse
  2015-12-21 12:21         ` Will Deacon
  0 siblings, 1 reply; 44+ messages in thread
From: James Morse @ 2015-12-21 12:19 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Will,

Will Deacon wrote:
>> > +   mrs     x26, sp_el0             // already masked
>>
>> Nit: How about using 'get_thread_info x26'?
>
> Something like the following?

Looks fine - I guess you're cleverly leaving the the el0_irq case to be covered by get_thread_info in kernel_entry?


Thanks,

James


(Sorry if this message formatting is weird - I'm briefly stuck with webmail)
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 9/4] arm64: remove irq_count and do_softirq_own_stack()
  2015-12-21 12:19       ` James Morse
@ 2015-12-21 12:21         ` Will Deacon
  2015-12-21 14:06           ` Jungseok Lee
  0 siblings, 1 reply; 44+ messages in thread
From: Will Deacon @ 2015-12-21 12:21 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Dec 21, 2015 at 12:19:32PM +0000, James Morse wrote:
> Will Deacon wrote:
> >> > +   mrs     x26, sp_el0             // already masked
> >>
> >> Nit: How about using 'get_thread_info x26'?
> > 
> > Something like the following?
> 
> Looks fine - I guess you're cleverly leaving the the el0_irq case to be
> covered by get_thread_info in kernel_entry?

Yup. tsk is always valid for exceptions taken from el0, afaict.

Will

^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v8 9/4] arm64: remove irq_count and do_softirq_own_stack()
  2015-12-21 12:21         ` Will Deacon
@ 2015-12-21 14:06           ` Jungseok Lee
  0 siblings, 0 replies; 44+ messages in thread
From: Jungseok Lee @ 2015-12-21 14:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Dec 21, 2015, at 9:21 PM, Will Deacon wrote:
> On Mon, Dec 21, 2015 at 12:19:32PM +0000, James Morse wrote:
>> Will Deacon wrote:
>>>>> +   mrs     x26, sp_el0             // already masked
>>>> 
>>>> Nit: How about using 'get_thread_info x26'?
>>> 
>>> Something like the following?
>> 
>> Looks fine - I guess you're cleverly leaving the the el0_irq case to be
>> covered by get_thread_info in kernel_entry?
> 
> Yup. tsk is always valid for exceptions taken from el0, afaict.


Good idea! I like the change.

Best Regards
Jungseok Lee

^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2015-12-21 14:06 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-12-04 11:02 [PATCH v8 0/4] arm64: Add support for IRQ stack James Morse
2015-12-04 11:02 ` [PATCH v8 1/4] arm64: Store struct task_info in sp_el0 James Morse
2015-12-04 13:27   ` Catalin Marinas
2015-12-04 14:55     ` James Morse
2015-12-04 16:18       ` Catalin Marinas
2015-12-06 13:15     ` Jungseok Lee
2015-12-04 11:02 ` [PATCH v8 2/4] arm64: Modify stack trace and dump for use with irq_stack James Morse
2015-12-04 12:21   ` Jungseok Lee
2015-12-04 14:31   ` Catalin Marinas
2015-12-04 11:02 ` [PATCH v8 3/4] arm64: Add do_softirq_own_stack() and enable irq_stacks James Morse
2015-12-04 13:46   ` Catalin Marinas
2015-12-04 13:47     ` Catalin Marinas
2015-12-07 22:48   ` Catalin Marinas
2015-12-08 11:43     ` Will Deacon
2015-12-08 16:02       ` Jungseok Lee
2015-12-08 17:23         ` James Morse
2015-12-08 17:27           ` Will Deacon
2015-12-08 23:13           ` Jungseok Lee
2015-12-09  9:47           ` James Morse
2015-12-09 11:38             ` Will Deacon
2015-12-09 13:45   ` Will Deacon
2015-12-09 14:36     ` James Morse
2015-12-04 11:02 ` [PATCH v8 4/4] arm64: switch to irq_stack during softirq James Morse
2015-12-04 14:01   ` Catalin Marinas
2015-12-04 14:39     ` James Morse
2015-12-04 18:40       ` Catalin Marinas
2015-12-08 10:29         ` James Morse
2015-12-06 13:51       ` Jungseok Lee
2015-12-04 12:17 ` [PATCH v8 0/4] arm64: Add support for IRQ stack Jungseok Lee
2015-12-06 13:56   ` Jungseok Lee
2015-12-04 13:57 ` Catalin Marinas
2015-12-06 13:33   ` Jungseok Lee
2015-12-10 10:22 ` [PATCH v8 5/4] arm64: Fix off-by-one in stack tracing when stepping off irq stack James Morse
2015-12-10 10:22   ` [PATCH v8 6/4] arm64: Add this_cpu_ptr() assembler macro for use in entry.S James Morse
2015-12-10 10:22   ` [PATCH v8 7/4] arm64: when walking onto the task stack, check sp & fp are in current->stack James Morse
2015-12-10 10:22   ` [PATCH v8 8/4] arm64: don't call C code with el0's fp register James Morse
2015-12-10 14:03   ` [PATCH v8 5/4] arm64: Fix off-by-one in stack tracing when stepping off irq stack Jungseok Lee
2015-12-15 11:21 ` [PATCH v8 9/4] arm64: reduce stack use in irq_handler James Morse
2015-12-18 16:01 ` [PATCH v8 9/4] arm64: remove irq_count and do_softirq_own_stack() James Morse
2015-12-20 11:07   ` Jungseok Lee
2015-12-21 11:30     ` Will Deacon
2015-12-21 12:19       ` James Morse
2015-12-21 12:21         ` Will Deacon
2015-12-21 14:06           ` Jungseok Lee

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.