linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/5] MIPS: Add per-cpu IRQ stack
@ 2016-12-02 13:39 Matt Redfearn
  2016-12-02 13:39 ` [PATCH 1/5] MIPS: Introduce irq_stack Matt Redfearn
                   ` (5 more replies)
  0 siblings, 6 replies; 15+ messages in thread
From: Matt Redfearn @ 2016-12-02 13:39 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: linux-mips, Jason A . Donenfeld, Thomas Gleixner, Matt Redfearn,
	Paolo Bonzini, Marcin Nowakowski, Chris Metcalf, Petr Mladek,
	linux-kernel, Andrea Gelmini, James Hogan, Paul Burton,
	Jiri Slaby, Maciej W. Rozycki, Aaron Tomlin, Andrew Morton


This series adds a separate stack for each CPU wihin the system to use
when handling IRQs. Previously IRQs were handled on the kernel stack of
the current task. If that task was deep down a call stack at the point
of the interrupt, and handling the interrupt required a deep IRQ stack,
then there was a likelihood of stack overflow. Since the kernel stack
is in normal unmapped memory, overflowing it can lead to silent
corruption of other kernel data, with weird and wonderful results.

Before this patch series, ftracing the maximum stack size of a v4.9-rc6
kernel running on a Ci40 board gave:
4996

And with this series:
4084

Handling interrupts on a separate stack reduces the maximum kernel stack
usage in this configuration by ~900 bytes.

Since do_IRQ is now invoked on a separate stack, we select
HAVE_IRQ_EXIT_ON_IRQ_STACK so that softirqs will also be executed on the
irq stack rather than attempting to switch with do_softirq_own_stack().

This series has been tested on MIPS Boston, Malta and SEAD3 platforms,
Pistachio on the Creator Ci40 board and Cavium Octeon III.



Matt Redfearn (5):
  MIPS: Introduce irq_stack
  MIPS: Stack unwinding while on IRQ stack
  MIPS: Only change $28 to thread_info if coming from user mode
  MIPS: Switch to the irq_stack in interrupts
  MIPS: Select HAVE_IRQ_EXIT_ON_IRQ_STACK

 arch/mips/Kconfig                  |  1 +
 arch/mips/include/asm/irq.h        | 12 ++++++
 arch/mips/include/asm/stackframe.h | 10 +++++
 arch/mips/kernel/asm-offsets.c     |  1 +
 arch/mips/kernel/genex.S           | 81 +++++++++++++++++++++++++++++++++++---
 arch/mips/kernel/irq.c             | 11 ++++++
 arch/mips/kernel/process.c         | 15 ++++++-
 7 files changed, 125 insertions(+), 6 deletions(-)

-- 
2.7.4

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH 1/5] MIPS: Introduce irq_stack
  2016-12-02 13:39 [PATCH 0/5] MIPS: Add per-cpu IRQ stack Matt Redfearn
@ 2016-12-02 13:39 ` Matt Redfearn
  2016-12-06 22:13   ` Jason A. Donenfeld
  2016-12-02 13:39 ` [PATCH 2/5] MIPS: Stack unwinding while on IRQ stack Matt Redfearn
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Matt Redfearn @ 2016-12-02 13:39 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: linux-mips, Jason A . Donenfeld, Thomas Gleixner, Matt Redfearn,
	Paolo Bonzini, Chris Metcalf, Petr Mladek, linux-kernel,
	James Hogan, Paul Burton, Aaron Tomlin, Andrew Morton

Allocate a per-cpu irq stack for use within interrupt handlers.

Also add a utility function on_irq_stack to determine if a given stack
pointer is within the irq stack for that cpu.

Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>

---

 arch/mips/include/asm/irq.h    | 12 ++++++++++++
 arch/mips/kernel/asm-offsets.c |  1 +
 arch/mips/kernel/irq.c         | 11 +++++++++++
 3 files changed, 24 insertions(+)

diff --git a/arch/mips/include/asm/irq.h b/arch/mips/include/asm/irq.h
index 6bf10e796553..956db6e201d1 100644
--- a/arch/mips/include/asm/irq.h
+++ b/arch/mips/include/asm/irq.h
@@ -17,6 +17,18 @@
 
 #include <irq.h>
 
+#define IRQ_STACK_SIZE			THREAD_SIZE
+
+extern void *irq_stack[NR_CPUS];
+
+static inline bool on_irq_stack(int cpu, unsigned long sp)
+{
+	unsigned long low = (unsigned long)irq_stack[cpu];
+	unsigned long high = low + IRQ_STACK_SIZE;
+
+	return (low <= sp && sp <= high);
+}
+
 #ifdef CONFIG_I8259
 static inline int irq_canonicalize(int irq)
 {
diff --git a/arch/mips/kernel/asm-offsets.c b/arch/mips/kernel/asm-offsets.c
index fae2f9447792..4be2763f835d 100644
--- a/arch/mips/kernel/asm-offsets.c
+++ b/arch/mips/kernel/asm-offsets.c
@@ -102,6 +102,7 @@ void output_thread_info_defines(void)
 	OFFSET(TI_REGS, thread_info, regs);
 	DEFINE(_THREAD_SIZE, THREAD_SIZE);
 	DEFINE(_THREAD_MASK, THREAD_MASK);
+	DEFINE(_IRQ_STACK_SIZE, IRQ_STACK_SIZE);
 	BLANK();
 }
 
diff --git a/arch/mips/kernel/irq.c b/arch/mips/kernel/irq.c
index f25f7eab7307..2b0a371b42af 100644
--- a/arch/mips/kernel/irq.c
+++ b/arch/mips/kernel/irq.c
@@ -25,6 +25,8 @@
 #include <linux/atomic.h>
 #include <asm/uaccess.h>
 
+void *irq_stack[NR_CPUS];
+
 /*
  * 'what should we do if we get a hw irq event on an illegal vector'.
  * each architecture has to answer this themselves.
@@ -58,6 +60,15 @@ void __init init_IRQ(void)
 		clear_c0_status(ST0_IM);
 
 	arch_init_irq();
+
+	for_each_possible_cpu(i) {
+		int irq_pages = IRQ_STACK_SIZE / PAGE_SIZE;
+		void *s = (void *)__get_free_pages(GFP_KERNEL, irq_pages);
+
+		irq_stack[i] = s;
+		pr_debug("CPU%d IRQ stack at 0x%p - 0x%p\n", i,
+			irq_stack[i], irq_stack[i] + IRQ_STACK_SIZE);
+	}
 }
 
 #ifdef CONFIG_DEBUG_STACKOVERFLOW
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 2/5] MIPS: Stack unwinding while on IRQ stack
  2016-12-02 13:39 [PATCH 0/5] MIPS: Add per-cpu IRQ stack Matt Redfearn
  2016-12-02 13:39 ` [PATCH 1/5] MIPS: Introduce irq_stack Matt Redfearn
@ 2016-12-02 13:39 ` Matt Redfearn
  2016-12-02 13:39 ` [PATCH 3/5] MIPS: Only change $28 to thread_info if coming from user mode Matt Redfearn
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 15+ messages in thread
From: Matt Redfearn @ 2016-12-02 13:39 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: linux-mips, Jason A . Donenfeld, Thomas Gleixner, Matt Redfearn,
	Marcin Nowakowski, Maciej W. Rozycki, Jiri Slaby, Chris Metcalf,
	linux-kernel, Andrea Gelmini, James Hogan, Paul Burton,
	Andrew Morton

Within unwind stack, check if the stack pointer being unwound is within
the CPU's irq_stack and if so use that page rather than the task's stack
page.

Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
---

 arch/mips/kernel/process.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

diff --git a/arch/mips/kernel/process.c b/arch/mips/kernel/process.c
index 9514e5f2209f..4fdbf7078071 100644
--- a/arch/mips/kernel/process.c
+++ b/arch/mips/kernel/process.c
@@ -33,6 +33,7 @@
 #include <asm/dsemul.h>
 #include <asm/dsp.h>
 #include <asm/fpu.h>
+#include <asm/irq.h>
 #include <asm/msa.h>
 #include <asm/pgtable.h>
 #include <asm/mipsregs.h>
@@ -511,7 +512,19 @@ EXPORT_SYMBOL(unwind_stack_by_address);
 unsigned long unwind_stack(struct task_struct *task, unsigned long *sp,
 			   unsigned long pc, unsigned long *ra)
 {
-	unsigned long stack_page = (unsigned long)task_stack_page(task);
+	unsigned long stack_page = 0;
+	int cpu;
+
+	for_each_possible_cpu(cpu) {
+		if (on_irq_stack(cpu, *sp)) {
+			stack_page = (unsigned long)irq_stack[cpu];
+			break;
+		}
+	}
+
+	if (!stack_page)
+		stack_page = (unsigned long)task_stack_page(task);
+
 	return unwind_stack_by_address(stack_page, sp, pc, ra);
 }
 #endif
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 3/5] MIPS: Only change $28 to thread_info if coming from user mode
  2016-12-02 13:39 [PATCH 0/5] MIPS: Add per-cpu IRQ stack Matt Redfearn
  2016-12-02 13:39 ` [PATCH 1/5] MIPS: Introduce irq_stack Matt Redfearn
  2016-12-02 13:39 ` [PATCH 2/5] MIPS: Stack unwinding while on IRQ stack Matt Redfearn
@ 2016-12-02 13:39 ` Matt Redfearn
  2016-12-05 16:20   ` Maciej W. Rozycki
  2016-12-02 13:39 ` [PATCH 4/5] MIPS: Switch to the irq_stack in interrupts Matt Redfearn
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 15+ messages in thread
From: Matt Redfearn @ 2016-12-02 13:39 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: linux-mips, Jason A . Donenfeld, Thomas Gleixner, Matt Redfearn,
	linux-kernel, James Hogan, Paul Burton

The SAVE_SOME macro is used to save the execution context on all
exceptions.
If an exception occurs while executing user code, the stack is switched
to the kernel's stack for the current task, and register $28 is switched
to point to the current_thread_info, which is at the bottom of the stack
region.
If the exception occurs while executing kernel code, the stack is left,
and this change ensures that register $28 is not updated. This is the
correct behaviour when the kernel can be executing on the separate irq
stack, because the thread_info will not be at the base of it.

With this change, register $28 is only switched to it's kernel
conventional usage of the currrent thread info pointer at the point at
which execution enters kernel space. Doing it on every exception was
redundant, but OK without an IRQ stack, but will be erroneous once that
is introduced.

Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
---

 arch/mips/include/asm/stackframe.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/arch/mips/include/asm/stackframe.h b/arch/mips/include/asm/stackframe.h
index eebf39549606..5782fa3d63be 100644
--- a/arch/mips/include/asm/stackframe.h
+++ b/arch/mips/include/asm/stackframe.h
@@ -216,12 +216,22 @@
 		LONG_S	$25, PT_R25(sp)
 		LONG_S	$28, PT_R28(sp)
 		LONG_S	$31, PT_R31(sp)
+
+		/* Set thread_info if we're coming from user mode */
+		.set	reorder
+		mfc0	k0, CP0_STATUS
+		sll	k0, 3		/* extract cu0 bit */
+		.set	noreorder
+		bltz	k0, 9f
+		 nop
+
 		ori	$28, sp, _THREAD_MASK
 		xori	$28, _THREAD_MASK
 #ifdef CONFIG_CPU_CAVIUM_OCTEON
 		.set    mips64
 		pref    0, 0($28)       /* Prefetch the current pointer */
 #endif
+9:
 		.set	pop
 		.endm
 
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 4/5] MIPS: Switch to the irq_stack in interrupts
  2016-12-02 13:39 [PATCH 0/5] MIPS: Add per-cpu IRQ stack Matt Redfearn
                   ` (2 preceding siblings ...)
  2016-12-02 13:39 ` [PATCH 3/5] MIPS: Only change $28 to thread_info if coming from user mode Matt Redfearn
@ 2016-12-02 13:39 ` Matt Redfearn
  2016-12-02 13:39 ` [PATCH 5/5] MIPS: Select HAVE_IRQ_EXIT_ON_IRQ_STACK Matt Redfearn
  2016-12-06 22:09 ` [PATCH 0/5] MIPS: Add per-cpu IRQ stack Jason A. Donenfeld
  5 siblings, 0 replies; 15+ messages in thread
From: Matt Redfearn @ 2016-12-02 13:39 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: linux-mips, Jason A . Donenfeld, Thomas Gleixner, Matt Redfearn,
	linux-kernel, James Hogan, Paul Burton

When enterring interrupt context via handle_int or except_vec_vi, switch
to the irq_stack of the current CPU if it is not already in use.

The current stack pointer is masked with the thread size and compared to
the base or the irq stack. If it does not match then the stack pointer
is set to the top of that stack, otherwise this is a nested irq being
handled on the irq stack so the stack pointer should be left as it was.

The in-use stack pointer is placed in the callee saved register s1. It
will be saved to the stack when plat_irq_dispatch is invoked and can be
restored once control returns here.

Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
---

 arch/mips/kernel/genex.S | 81 +++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 76 insertions(+), 5 deletions(-)

diff --git a/arch/mips/kernel/genex.S b/arch/mips/kernel/genex.S
index dc0b29612891..0a7ba4b2f687 100644
--- a/arch/mips/kernel/genex.S
+++ b/arch/mips/kernel/genex.S
@@ -187,9 +187,44 @@ NESTED(handle_int, PT_SIZE, sp)
 
 	LONG_L	s0, TI_REGS($28)
 	LONG_S	sp, TI_REGS($28)
-	PTR_LA	ra, ret_from_irq
-	PTR_LA	v0, plat_irq_dispatch
-	jr	v0
+
+	/*
+	 * SAVE_ALL ensures we are using a valid kernel stack for the thread.
+	 * Check if we are already using the IRQ stack.
+	 */
+	move	s1, sp # Preserve the sp
+
+	/* Get IRQ stack for this CPU */
+	ASM_CPUID_MFC0	k0, ASM_SMP_CPUID_REG
+#if defined(CONFIG_32BIT) || defined(KBUILD_64BIT_SYM32)
+	lui	k1, %hi(irq_stack)
+#else
+	lui	k1, %highest(irq_stack)
+	daddiu	k1, %higher(irq_stack)
+	dsll	k1, 16
+	daddiu	k1, %hi(irq_stack)
+	dsll	k1, 16
+#endif
+	LONG_SRL	k0, SMP_CPUID_PTRSHIFT
+	LONG_ADDU	k1, k0
+	LONG_L	t0, %lo(irq_stack)(k1)
+
+	# Check if already on IRQ stack
+	PTR_LI	t1, ~(_THREAD_SIZE-1)
+	and	t1, t1, sp
+	beq	t0, t1, 2f
+
+	/* Switch to IRQ stack */
+	li	t1, _IRQ_STACK_SIZE
+	PTR_ADD sp, t0, t1
+
+2:
+	jal	plat_irq_dispatch
+
+	/* Restore sp */
+	move	sp, s1
+
+	j	ret_from_irq
 #ifdef CONFIG_CPU_MICROMIPS
 	nop
 #endif
@@ -262,8 +297,44 @@ NESTED(except_vec_vi_handler, 0, sp)
 
 	LONG_L	s0, TI_REGS($28)
 	LONG_S	sp, TI_REGS($28)
-	PTR_LA	ra, ret_from_irq
-	jr	v0
+
+	/*
+	 * SAVE_ALL ensures we are using a valid kernel stack for the thread.
+	 * Check if we are already using the IRQ stack.
+	 */
+	move	s1, sp # Preserve the sp
+
+	/* Get IRQ stack for this CPU */
+	ASM_CPUID_MFC0	k0, ASM_SMP_CPUID_REG
+#if defined(CONFIG_32BIT) || defined(KBUILD_64BIT_SYM32)
+	lui	k1, %hi(irq_stack)
+#else
+	lui	k1, %highest(irq_stack)
+	daddiu	k1, %higher(irq_stack)
+	dsll	k1, 16
+	daddiu	k1, %hi(irq_stack)
+	dsll	k1, 16
+#endif
+	LONG_SRL	k0, SMP_CPUID_PTRSHIFT
+	LONG_ADDU	k1, k0
+	LONG_L	t0, %lo(irq_stack)(k1)
+
+	# Check if already on IRQ stack
+	PTR_LI	t1, ~(_THREAD_SIZE-1)
+	and	t1, t1, sp
+	beq	t0, t1, 2f
+
+	/* Switch to IRQ stack */
+	li	t1, _IRQ_STACK_SIZE
+	PTR_ADD sp, t0, t1
+
+2:
+	jal	plat_irq_dispatch
+
+	/* Restore sp */
+	move	sp, s1
+
+	j	ret_from_irq
 	END(except_vec_vi_handler)
 
 /*
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH 5/5] MIPS: Select HAVE_IRQ_EXIT_ON_IRQ_STACK
  2016-12-02 13:39 [PATCH 0/5] MIPS: Add per-cpu IRQ stack Matt Redfearn
                   ` (3 preceding siblings ...)
  2016-12-02 13:39 ` [PATCH 4/5] MIPS: Switch to the irq_stack in interrupts Matt Redfearn
@ 2016-12-02 13:39 ` Matt Redfearn
  2016-12-06 22:09 ` [PATCH 0/5] MIPS: Add per-cpu IRQ stack Jason A. Donenfeld
  5 siblings, 0 replies; 15+ messages in thread
From: Matt Redfearn @ 2016-12-02 13:39 UTC (permalink / raw)
  To: Ralf Baechle
  Cc: linux-mips, Jason A . Donenfeld, Thomas Gleixner, Matt Redfearn,
	linux-kernel

Since do_IRQ is now invoked on a separate IRQ stack, we select
HAVE_IRQ_EXIT_ON_IRQ_STACK so that softirq's may be invoked directly
from irq_exit(), rather than requiring do_softirq_own_stack.

Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
---

 arch/mips/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index b3c5bde43d34..80832aa8e4fb 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -9,6 +9,7 @@ config MIPS
 	select HAVE_CONTEXT_TRACKING
 	select HAVE_GENERIC_DMA_COHERENT
 	select HAVE_IDE
+	select HAVE_IRQ_EXIT_ON_IRQ_STACK
 	select HAVE_OPROFILE
 	select HAVE_PERF_EVENTS
 	select PERF_USE_VMALLOC
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] MIPS: Only change $28 to thread_info if coming from user mode
  2016-12-02 13:39 ` [PATCH 3/5] MIPS: Only change $28 to thread_info if coming from user mode Matt Redfearn
@ 2016-12-05 16:20   ` Maciej W. Rozycki
  2016-12-05 16:52     ` Matt Redfearn
  2016-12-05 17:27     ` Paul Burton
  0 siblings, 2 replies; 15+ messages in thread
From: Maciej W. Rozycki @ 2016-12-05 16:20 UTC (permalink / raw)
  To: Matt Redfearn
  Cc: Ralf Baechle, linux-mips, Jason A . Donenfeld, Thomas Gleixner,
	linux-kernel, James Hogan, Paul Burton

On Fri, 2 Dec 2016, Matt Redfearn wrote:

> diff --git a/arch/mips/include/asm/stackframe.h b/arch/mips/include/asm/stackframe.h
> index eebf39549606..5782fa3d63be 100644
> --- a/arch/mips/include/asm/stackframe.h
> +++ b/arch/mips/include/asm/stackframe.h
> @@ -216,12 +216,22 @@
>  		LONG_S	$25, PT_R25(sp)
>  		LONG_S	$28, PT_R28(sp)
>  		LONG_S	$31, PT_R31(sp)
> +
> +		/* Set thread_info if we're coming from user mode */
> +		.set	reorder
> +		mfc0	k0, CP0_STATUS
> +		sll	k0, 3		/* extract cu0 bit */
> +		.set	noreorder
> +		bltz	k0, 9f
> +		 nop

 This code is already `.set reorder', although a badly applied CONFIG_EVA 
change made things slightly less obvious.  So why do you need this `.set 
reorder' in the first place, and then why do you switch code that follows 
to `.set noreorder'?

 Overall I think all <asm/stackframe.h> code should be using the (default) 
`.set reorder' mode, perhaps forced explicitly in case these macros are 
pasted into `.set noreorder' code, to make it easier to avoid subtle data 
dependency bugs, and also to make R6 porting easier.  Except maybe for the 
RFE sequence, for readability's sake, although even there GAS will do the 
right thing.  Surely the BLTZ/MOVE piece does not have to be `.set 
noreorder' as GAS will schedule that delay slot automatically if allowed 
to.

  Maciej

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] MIPS: Only change $28 to thread_info if coming from user mode
  2016-12-05 16:20   ` Maciej W. Rozycki
@ 2016-12-05 16:52     ` Matt Redfearn
  2016-12-05 17:22       ` Maciej W. Rozycki
  2016-12-05 17:27     ` Paul Burton
  1 sibling, 1 reply; 15+ messages in thread
From: Matt Redfearn @ 2016-12-05 16:52 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Ralf Baechle, linux-mips, Jason A . Donenfeld, Thomas Gleixner,
	linux-kernel, James Hogan, Paul Burton


Hi Maciej,

On 05/12/16 16:20, Maciej W. Rozycki wrote:
> On Fri, 2 Dec 2016, Matt Redfearn wrote:
>
>> diff --git a/arch/mips/include/asm/stackframe.h b/arch/mips/include/asm/stackframe.h
>> index eebf39549606..5782fa3d63be 100644
>> --- a/arch/mips/include/asm/stackframe.h
>> +++ b/arch/mips/include/asm/stackframe.h
>> @@ -216,12 +216,22 @@
>>   		LONG_S	$25, PT_R25(sp)
>>   		LONG_S	$28, PT_R28(sp)
>>   		LONG_S	$31, PT_R31(sp)
>> +
>> +		/* Set thread_info if we're coming from user mode */
>> +		.set	reorder
>> +		mfc0	k0, CP0_STATUS
>> +		sll	k0, 3		/* extract cu0 bit */
>> +		.set	noreorder
>> +		bltz	k0, 9f
>> +		 nop
>   This code is already `.set reorder', although a badly applied CONFIG_EVA
> change made things slightly less obvious.  So why do you need this `.set
> reorder' in the first place, and then why do you switch code that follows
> to `.set noreorder'?

Ah yes, I missed the .set reorder above the EVA ifdef and just included 
the .set reorder as the similar snippet here:
http://lxr.free-electrons.com/source/arch/mips/include/asm/stackframe.h#L149

I'll revise that in V2.

Thanks,
Matt

>
>   Overall I think all <asm/stackframe.h> code should be using the (default)
> `.set reorder' mode, perhaps forced explicitly in case these macros are
> pasted into `.set noreorder' code, to make it easier to avoid subtle data
> dependency bugs, and also to make R6 porting easier.  Except maybe for the
> RFE sequence, for readability's sake, although even there GAS will do the
> right thing.  Surely the BLTZ/MOVE piece does not have to be `.set
> noreorder' as GAS will schedule that delay slot automatically if allowed
> to.
>
>    Maciej

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] MIPS: Only change $28 to thread_info if coming from user mode
  2016-12-05 16:52     ` Matt Redfearn
@ 2016-12-05 17:22       ` Maciej W. Rozycki
  0 siblings, 0 replies; 15+ messages in thread
From: Maciej W. Rozycki @ 2016-12-05 17:22 UTC (permalink / raw)
  To: Matt Redfearn
  Cc: Ralf Baechle, linux-mips, Jason A . Donenfeld, Thomas Gleixner,
	linux-kernel, James Hogan, Paul Burton

On Mon, 5 Dec 2016, Matt Redfearn wrote:

> Ah yes, I missed the .set reorder above the EVA ifdef and just included the
> .set reorder as the similar snippet here:
> http://lxr.free-electrons.com/source/arch/mips/include/asm/stackframe.h#L149

 That's a global `.set reorder' for the whole of SAVE_SOME, just as I 
suggested for all the macros here.

 Then a temporary `.set noreorder' override follows just for the manually 
scheduled BLTZ/MOVE sequence, where someone has later inserted a whole 
CONFIG_EVA block.  As I noted this temporary override is unnecessary.  
Incidentally it does not harm the CONFIG_EVA block either, because EVA 
processors have MFC0 interlocked.

 NB I think the MFC0 macro ought to be called REG_MFC0, to avoid confusion 
and for consistency with the rest of such macros.

  Maciej

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] MIPS: Only change $28 to thread_info if coming from user mode
  2016-12-05 16:20   ` Maciej W. Rozycki
  2016-12-05 16:52     ` Matt Redfearn
@ 2016-12-05 17:27     ` Paul Burton
  2016-12-05 17:41       ` Maciej W. Rozycki
  1 sibling, 1 reply; 15+ messages in thread
From: Paul Burton @ 2016-12-05 17:27 UTC (permalink / raw)
  To: Maciej W. Rozycki
  Cc: Matt Redfearn, Ralf Baechle, linux-mips, Jason A . Donenfeld,
	Thomas Gleixner, linux-kernel, James Hogan

[-- Attachment #1: Type: text/plain, Size: 934 bytes --]

Hi Maciej,

>  Overall I think all <asm/stackframe.h> code should be using the (default)
> `.set reorder' mode, perhaps forced explicitly in case these macros are
> pasted into `.set noreorder' code, to make it easier to avoid subtle data
> dependency bugs, and also to make R6 porting easier.  Except maybe for the
> RFE sequence, for readability's sake, although even there GAS will do the
> right thing.  Surely the BLTZ/MOVE piece does not have to be `.set
> noreorder' as GAS will schedule that delay slot automatically if allowed
> to.

Agreed we ought to use .set reorder (or rather, not use .set noreorder) 
wherever possible but FYI one thing I've only noticed recently is that we 
don't actually get any reordering anyway, presumably because we don't provide 
any -O flags when building asm source. I haven't yet done the legwork or 
figuring out whether we've ever had optimisation & if so what changed...

Thanks,
    Paul

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 3/5] MIPS: Only change $28 to thread_info if coming from user mode
  2016-12-05 17:27     ` Paul Burton
@ 2016-12-05 17:41       ` Maciej W. Rozycki
  0 siblings, 0 replies; 15+ messages in thread
From: Maciej W. Rozycki @ 2016-12-05 17:41 UTC (permalink / raw)
  To: Paul Burton
  Cc: Matt Redfearn, Ralf Baechle, linux-mips, Jason A . Donenfeld,
	Thomas Gleixner, linux-kernel, James Hogan

On Mon, 5 Dec 2016, Paul Burton wrote:

> Agreed we ought to use .set reorder (or rather, not use .set noreorder) 
> wherever possible but FYI one thing I've only noticed recently is that we 
> don't actually get any reordering anyway, presumably because we don't provide 
> any -O flags when building asm source. I haven't yet done the legwork or 
> figuring out whether we've ever had optimisation & if so what changed...

 Reordering or `-O2' is the default for GAS, you actually have to pass a 
different `-O*' option explicitly to GAS to disable it.  NB it's *not* the 
same as passing one of the `-O*' options to GCC, you'd need e.g. `-Wa,-O'.

 So perhaps one of our Makefiles or a particular configuration of the GCC 
driver does the wrong thing?

  Maciej

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/5] MIPS: Add per-cpu IRQ stack
  2016-12-02 13:39 [PATCH 0/5] MIPS: Add per-cpu IRQ stack Matt Redfearn
                   ` (4 preceding siblings ...)
  2016-12-02 13:39 ` [PATCH 5/5] MIPS: Select HAVE_IRQ_EXIT_ON_IRQ_STACK Matt Redfearn
@ 2016-12-06 22:09 ` Jason A. Donenfeld
  2016-12-07  9:30   ` Matt Redfearn
  5 siblings, 1 reply; 15+ messages in thread
From: Jason A. Donenfeld @ 2016-12-06 22:09 UTC (permalink / raw)
  To: Matt Redfearn
  Cc: Ralf Baechle, linux-mips, Thomas Gleixner, Paolo Bonzini,
	Marcin Nowakowski, Chris Metcalf, Petr Mladek, LKML,
	Andrea Gelmini, James Hogan, Paul Burton, Jiri Slaby,
	Maciej W. Rozycki, Aaron Tomlin, Andrew Morton

Hi Matt,

Thanks for submitting this. A happy OpenWRT/WireGuard user has
reported to me that this patch set frees ~1300 bytes of stack on a
small MIPS router. This kind of savings should allow me to reintroduce
my crypto operations to be on the stack, rather than the conditional
MIPS kmallocing, which will be a performance win.

By the way, it looks like x86 and SPARC have separately allocated hard
IRQ and soft IRQ stacks. ARM has only one for both, similar to this
MIPS patchset.

Jason

On Fri, Dec 2, 2016 at 2:39 PM, Matt Redfearn <matt.redfearn@imgtec.com> wrote:
>
> This series adds a separate stack for each CPU wihin the system to use
> when handling IRQs. Previously IRQs were handled on the kernel stack of
> the current task. If that task was deep down a call stack at the point
> of the interrupt, and handling the interrupt required a deep IRQ stack,
> then there was a likelihood of stack overflow. Since the kernel stack
> is in normal unmapped memory, overflowing it can lead to silent
> corruption of other kernel data, with weird and wonderful results.
>
> Before this patch series, ftracing the maximum stack size of a v4.9-rc6
> kernel running on a Ci40 board gave:
> 4996
>
> And with this series:
> 4084
>
> Handling interrupts on a separate stack reduces the maximum kernel stack
> usage in this configuration by ~900 bytes.
>
> Since do_IRQ is now invoked on a separate stack, we select
> HAVE_IRQ_EXIT_ON_IRQ_STACK so that softirqs will also be executed on the
> irq stack rather than attempting to switch with do_softirq_own_stack().
>
> This series has been tested on MIPS Boston, Malta and SEAD3 platforms,
> Pistachio on the Creator Ci40 board and Cavium Octeon III.
>
>
>
> Matt Redfearn (5):
>   MIPS: Introduce irq_stack
>   MIPS: Stack unwinding while on IRQ stack
>   MIPS: Only change $28 to thread_info if coming from user mode
>   MIPS: Switch to the irq_stack in interrupts
>   MIPS: Select HAVE_IRQ_EXIT_ON_IRQ_STACK
>
>  arch/mips/Kconfig                  |  1 +
>  arch/mips/include/asm/irq.h        | 12 ++++++
>  arch/mips/include/asm/stackframe.h | 10 +++++
>  arch/mips/kernel/asm-offsets.c     |  1 +
>  arch/mips/kernel/genex.S           | 81 +++++++++++++++++++++++++++++++++++---
>  arch/mips/kernel/irq.c             | 11 ++++++
>  arch/mips/kernel/process.c         | 15 ++++++-
>  7 files changed, 125 insertions(+), 6 deletions(-)
>
> --
> 2.7.4
>

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] MIPS: Introduce irq_stack
  2016-12-02 13:39 ` [PATCH 1/5] MIPS: Introduce irq_stack Matt Redfearn
@ 2016-12-06 22:13   ` Jason A. Donenfeld
  2016-12-07  9:25     ` Matt Redfearn
  0 siblings, 1 reply; 15+ messages in thread
From: Jason A. Donenfeld @ 2016-12-06 22:13 UTC (permalink / raw)
  To: Matt Redfearn
  Cc: Ralf Baechle, linux-mips, Thomas Gleixner, Paolo Bonzini,
	Chris Metcalf, Petr Mladek, LKML, James Hogan, Paul Burton,
	Aaron Tomlin, Andrew Morton

On Fri, Dec 2, 2016 at 2:39 PM, Matt Redfearn <matt.redfearn@imgtec.com> wrote:
> +void *irq_stack[NR_CPUS];

I'm curious why you implemented it this way rather than using
DEFINE_PER_CPU and the related percpu helper functions.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 1/5] MIPS: Introduce irq_stack
  2016-12-06 22:13   ` Jason A. Donenfeld
@ 2016-12-07  9:25     ` Matt Redfearn
  0 siblings, 0 replies; 15+ messages in thread
From: Matt Redfearn @ 2016-12-07  9:25 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Ralf Baechle, linux-mips, Thomas Gleixner, Paolo Bonzini,
	Chris Metcalf, Petr Mladek, LKML, James Hogan, Paul Burton,
	Aaron Tomlin, Andrew Morton

Hi Jason,


On 06/12/16 22:13, Jason A. Donenfeld wrote:
> On Fri, Dec 2, 2016 at 2:39 PM, Matt Redfearn <matt.redfearn@imgtec.com> wrote:
>> +void *irq_stack[NR_CPUS];
> I'm curious why you implemented it this way rather than using
> DEFINE_PER_CPU and the related percpu helper functions.
Because in the IRQ entry point in assembler we have to look up the 
address of this CPU's IRQ stack. Doing so with a simple array can be 
done with fewer instructions than a per-cpu variable. The kernel stack 
pointer for each CPU is held in a similar array.

MIPS does not have a particularly optimized per-cpu implementation with 
the per-cpu offset being held somewhere easily accessible, so right now 
it has be looked up from the __per_cpu_offset array, and then applied to 
the per-cpu pointer. Obviously doing the first lookup is analogous to 
what I am doing, and the subsequent application to the per-cpu pointer 
would be additional instructions.

Thanks,
Matt

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH 0/5] MIPS: Add per-cpu IRQ stack
  2016-12-06 22:09 ` [PATCH 0/5] MIPS: Add per-cpu IRQ stack Jason A. Donenfeld
@ 2016-12-07  9:30   ` Matt Redfearn
  0 siblings, 0 replies; 15+ messages in thread
From: Matt Redfearn @ 2016-12-07  9:30 UTC (permalink / raw)
  To: Jason A. Donenfeld
  Cc: Ralf Baechle, linux-mips, Thomas Gleixner, Paolo Bonzini,
	Marcin Nowakowski, Chris Metcalf, Petr Mladek, LKML,
	Andrea Gelmini, James Hogan, Paul Burton, Jiri Slaby,
	Maciej W. Rozycki, Aaron Tomlin, Andrew Morton

Hi Jason,


On 06/12/16 22:09, Jason A. Donenfeld wrote:
> Hi Matt,
>
> Thanks for submitting this. A happy OpenWRT/WireGuard user has
> reported to me that this patch set frees ~1300 bytes of stack on a
> small MIPS router. This kind of savings should allow me to reintroduce
> my crypto operations to be on the stack, rather than the conditional
> MIPS kmallocing, which will be a performance win.

Excellent.

>
> By the way, it looks like x86 and SPARC have separately allocated hard
> IRQ and soft IRQ stacks. ARM has only one for both, similar to this
> MIPS patchset.

The separate stack for soft-IRQs could be added as well, but of course 
there is a tradeoff against the additional memory consumed by it. I 
believe that most soft IRQs will now be handled on the new irq stack, 
because HAVE_IRQ_EXIT_ON_IRQ_STACK causes them to be invoked within 
irq_exit, while still executing on the irq stack

Thanks,
Matt

>
> Jason
>
> On Fri, Dec 2, 2016 at 2:39 PM, Matt Redfearn <matt.redfearn@imgtec.com> wrote:
>> This series adds a separate stack for each CPU wihin the system to use
>> when handling IRQs. Previously IRQs were handled on the kernel stack of
>> the current task. If that task was deep down a call stack at the point
>> of the interrupt, and handling the interrupt required a deep IRQ stack,
>> then there was a likelihood of stack overflow. Since the kernel stack
>> is in normal unmapped memory, overflowing it can lead to silent
>> corruption of other kernel data, with weird and wonderful results.
>>
>> Before this patch series, ftracing the maximum stack size of a v4.9-rc6
>> kernel running on a Ci40 board gave:
>> 4996
>>
>> And with this series:
>> 4084
>>
>> Handling interrupts on a separate stack reduces the maximum kernel stack
>> usage in this configuration by ~900 bytes.
>>
>> Since do_IRQ is now invoked on a separate stack, we select
>> HAVE_IRQ_EXIT_ON_IRQ_STACK so that softirqs will also be executed on the
>> irq stack rather than attempting to switch with do_softirq_own_stack().
>>
>> This series has been tested on MIPS Boston, Malta and SEAD3 platforms,
>> Pistachio on the Creator Ci40 board and Cavium Octeon III.
>>
>>
>>
>> Matt Redfearn (5):
>>    MIPS: Introduce irq_stack
>>    MIPS: Stack unwinding while on IRQ stack
>>    MIPS: Only change $28 to thread_info if coming from user mode
>>    MIPS: Switch to the irq_stack in interrupts
>>    MIPS: Select HAVE_IRQ_EXIT_ON_IRQ_STACK
>>
>>   arch/mips/Kconfig                  |  1 +
>>   arch/mips/include/asm/irq.h        | 12 ++++++
>>   arch/mips/include/asm/stackframe.h | 10 +++++
>>   arch/mips/kernel/asm-offsets.c     |  1 +
>>   arch/mips/kernel/genex.S           | 81 +++++++++++++++++++++++++++++++++++---
>>   arch/mips/kernel/irq.c             | 11 ++++++
>>   arch/mips/kernel/process.c         | 15 ++++++-
>>   7 files changed, 125 insertions(+), 6 deletions(-)
>>
>> --
>> 2.7.4
>>

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2016-12-07  9:30 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-12-02 13:39 [PATCH 0/5] MIPS: Add per-cpu IRQ stack Matt Redfearn
2016-12-02 13:39 ` [PATCH 1/5] MIPS: Introduce irq_stack Matt Redfearn
2016-12-06 22:13   ` Jason A. Donenfeld
2016-12-07  9:25     ` Matt Redfearn
2016-12-02 13:39 ` [PATCH 2/5] MIPS: Stack unwinding while on IRQ stack Matt Redfearn
2016-12-02 13:39 ` [PATCH 3/5] MIPS: Only change $28 to thread_info if coming from user mode Matt Redfearn
2016-12-05 16:20   ` Maciej W. Rozycki
2016-12-05 16:52     ` Matt Redfearn
2016-12-05 17:22       ` Maciej W. Rozycki
2016-12-05 17:27     ` Paul Burton
2016-12-05 17:41       ` Maciej W. Rozycki
2016-12-02 13:39 ` [PATCH 4/5] MIPS: Switch to the irq_stack in interrupts Matt Redfearn
2016-12-02 13:39 ` [PATCH 5/5] MIPS: Select HAVE_IRQ_EXIT_ON_IRQ_STACK Matt Redfearn
2016-12-06 22:09 ` [PATCH 0/5] MIPS: Add per-cpu IRQ stack Jason A. Donenfeld
2016-12-07  9:30   ` Matt Redfearn

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).