* [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP
@ 2021-11-26 10:10 Ard Biesheuvel
  2021-11-26 10:10 ` [RFC PATCH 1/6] ARM: entry: preserve thread_info pointer in switch_to Ard Biesheuvel
                   ` (7 more replies)
  0 siblings, 8 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-26 10:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann,
	Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers,
	Tony Lindgren

Enable the use of the TLS register to hold the 'current' pointer for all
configurations that can support it, including non-SMP ones that target
v6k or later CPUs, and multi-platform SMP ones that also support v6
based UP systems.

The remaining configurations are all strictly UP, which means we can
switch to a global variable to hold the current pointer. By doing this,
we can enable THREAD_INFO_IN_TASK, which moves thread info off the
stack, protecting it from overflows. It also permits us to enable IRQ
stacks and vmap'ed stacks for UP configurations as well.

Supporting v6 cores without SMP extensions in SMP configurations (e.g.,
omap2plus_defconfig or imx_v6_v7_defconfig) makes this a bit tricky, and
this is a feature we may consider dropping entirely in the future. But
for the time being, we can support this mode as well.

Accesses to the global variable holding 'current' are constructed so
that no literal pool accesses (and the associated D-cache misses) are
needed, unless the access is from a module and module PLTs are enabled.
Accessing 'current' therefore costs about the same as before: it used to
require some arithmetic involving the stack pointer plus a load from the
thread_info::task field, and it still requires only a single load.

However, accessing thread_info itself now also involves a load, although
it should be noted that all thread_info and current accesses now go via
the same variable, which is therefore expected to be hot in the caches
at all times.
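
To make the trade-off above concrete, here is a minimal user-space sketch
in C of the two lookup schemes. It is an illustration only: THREAD_SIZE is
assumed to be 8 KiB, the structures are cut down to a few fields, and
'current_task' is a hypothetical stand-in for the global variable that
holds 'current'. None of the helper names below exist in the kernel.

```c
#include <stdint.h>

#define THREAD_SIZE 8192u	/* assumption: 8 KiB kernel stacks */

struct thread_info {
	unsigned long flags;
	int preempt_count;
};

/* With THREAD_INFO_IN_TASK, thread_info is the first member of the task. */
struct task_struct {
	struct thread_info thread_info;
	int pid;
};

/*
 * Old scheme: thread_info lives at the base of the stack, so it is found
 * by masking the stack pointer. No load is needed, only arithmetic.
 */
static uintptr_t thread_info_from_sp(uintptr_t sp)
{
	return sp & ~(uintptr_t)(THREAD_SIZE - 1);
}

/* New scheme on UP: a single global tracks the running task. */
static struct task_struct *current_task;	/* stand-in for the global */

/* Both 'current' and thread_info now come from one load of the global. */
static struct thread_info *thread_info_from_current(void)
{
	return &current_task->thread_info;
}
```

In the old scheme thread_info costs only arithmetic while 'current' needs
a load; in the new scheme both lookups go through the same load of the
global variable.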

Cc: Russell King <linux@armlinux.org.uk>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Keith Packard <keithpac@amazon.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Tony Lindgren <tony@atomide.com>

Ard Biesheuvel (6):
  ARM: entry: preserve thread_info pointer in switch_to
  ARM: module: implement support for PC-relative group relocations
  ARM: percpu: add SMP_ON_UP support
  ARM: smp: defer TPIDRURO update for SMP v6 configurations too
  ARM: use TLS register for 'current' on !SMP as well
  ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems

 arch/arm/Kconfig                   |  10 +-
 arch/arm/include/asm/assembler.h   | 136 +++++++++++++++-----
 arch/arm/include/asm/current.h     |  56 +++++++-
 arch/arm/include/asm/elf.h         |   3 +
 arch/arm/include/asm/percpu.h      |  29 ++++-
 arch/arm/include/asm/switch_to.h   |   3 +-
 arch/arm/include/asm/thread_info.h |  27 ----
 arch/arm/include/asm/tls.h         |   4 +-
 arch/arm/kernel/asm-offsets.c      |   3 -
 arch/arm/kernel/entry-armv.S       |  26 ++--
 arch/arm/kernel/entry-header.S     |   8 +-
 arch/arm/kernel/head-common.S      |   4 +-
 arch/arm/kernel/module.c           |  63 +++++++++
 arch/arm/kernel/process.c          |   7 +-
 arch/arm/kernel/traps.c            |   4 +
 arch/arm/mm/Kconfig                |   1 +
 16 files changed, 289 insertions(+), 95 deletions(-)

-- 
2.30.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* [RFC PATCH 1/6] ARM: entry: preserve thread_info pointer in switch_to
  2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
@ 2021-11-26 10:10 ` Ard Biesheuvel
  2021-11-26 10:10 ` [RFC PATCH 2/6] ARM: module: implement support for PC-relative group relocations Ard Biesheuvel
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-26 10:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann,
	Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers,
	Tony Lindgren

Tweak the UP stack protector handling code so that the thread info
pointer is preserved in R7 until set_current is called. This is needed
for a subsequent patch that implements THREAD_INFO_IN_TASK and
set_current for UP as well.

This also means we will prefer the per-task protector on UP systems that
implement the thread ID registers, so tweak the preprocessor
conditionals to reflect this.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/entry-armv.S | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 5fb7465d14d9..1e26d69ebbf1 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -814,16 +814,16 @@ ENTRY(__switch_to)
 	ldr	r6, [r2, #TI_CPU_DOMAIN]
 #endif
 	switch_tls r1, r4, r5, r3, r7
-#if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_SMP)
-	ldr	r7, [r2, #TI_TASK]
+#if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_SMP) && \
+    !defined(CONFIG_STACKPROTECTOR_PER_TASK)
+	ldr	r9, [r2, #TI_TASK]
 	ldr	r8, =__stack_chk_guard
 	.if (TSK_STACK_CANARY > IMM12_MASK)
-	add	r7, r7, #TSK_STACK_CANARY & ~IMM12_MASK
+	add	r9, r9, #TSK_STACK_CANARY & ~IMM12_MASK
 	.endif
-	ldr	r7, [r7, #TSK_STACK_CANARY & IMM12_MASK]
-#elif defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO)
-	mov	r7, r2				@ Preserve 'next'
+	ldr	r9, [r9, #TSK_STACK_CANARY & IMM12_MASK]
 #endif
+	mov	r7, r2				@ Preserve 'next'
 #ifdef CONFIG_CPU_USE_DOMAINS
 	mcr	p15, 0, r6, c3, c0, 0		@ Set domain register
 #endif
@@ -832,8 +832,9 @@ ENTRY(__switch_to)
 	ldr	r0, =thread_notify_head
 	mov	r1, #THREAD_NOTIFY_SWITCH
 	bl	atomic_notifier_call_chain
-#if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_SMP)
-	str	r7, [r8]
+#if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_SMP) && \
+    !defined(CONFIG_STACKPROTECTOR_PER_TASK)
+	str	r9, [r8]
 #endif
 	mov	r0, r5
 #if !defined(CONFIG_THUMB2_KERNEL) && !defined(CONFIG_VMAP_STACK)
-- 
2.30.2



* [RFC PATCH 2/6] ARM: module: implement support for PC-relative group relocations
  2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
  2021-11-26 10:10 ` [RFC PATCH 1/6] ARM: entry: preserve thread_info pointer in switch_to Ard Biesheuvel
@ 2021-11-26 10:10 ` Ard Biesheuvel
  2021-11-26 10:10 ` [RFC PATCH 3/6] ARM: percpu: add SMP_ON_UP support Ard Biesheuvel
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-26 10:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann,
	Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers,
	Tony Lindgren

Add support for the R_ARM_ALU_PC_Gn_NC and R_ARM_LDR_PC_G2 group
relocations so we can use them in modules. These will be used to load
the current task pointer from a global variable without having to rely
on a literal pool entry to carry the address of this variable, which
would be slightly less efficient.
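
As a rough illustration of what the group relocations encode, the sketch
below splits a non-negative PC-relative offset into the three pieces
carried by an ADD/ADD/LDR sequence: two ALU immediates (each an 8-bit
value at an even bit position) and a 12-bit LDR displacement. It follows
the same idea as get_group_rem() in the diff below but is not that code;
take_alu_chunk(), split_offset() and struct group_split are hypothetical
names, and __builtin_clz() assumes GCC or Clang.

```c
#include <stdint.h>

struct group_split {
	uint32_t g0;	/* immediate for the first ALU instruction */
	uint32_t g1;	/* immediate for the second ALU instruction */
	uint32_t ldr;	/* what is left for the LDR displacement */
};

/*
 * Peel off the most significant 8-bit window of *residual, aligned to an
 * even bit position, and leave the lower bits behind in *residual.
 */
static uint32_t take_alu_chunk(uint32_t *residual)
{
	uint32_t val = *residual;
	uint32_t low_mask;
	unsigned int lz;

	if (val == 0)
		return 0;

	/* leading zero count, rounded down to an even number */
	lz = (unsigned int)__builtin_clz(val) & ~1u;
	/* bits that lie below the 8-bit window */
	low_mask = 0x00ffffffu >> lz;

	*residual = val & low_mask;
	return val & ~low_mask;
}

static struct group_split split_offset(uint32_t offset)
{
	struct group_split s;

	s.g0 = take_alu_chunk(&offset);
	s.g1 = take_alu_chunk(&offset);
	s.ldr = offset;		/* must not exceed 0xfff to be encodable */
	return s;
}
```

For an offset of 0x01234567 this yields 0x01200000, 0x00034400 and 0x167,
which sum back to the original value. Two 8-bit chunks plus a 12-bit
displacement cover 28 bits of magnitude, which is where a range of
-/+ 256 MiB comes from.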

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/include/asm/elf.h |  3 +
 arch/arm/kernel/module.c   | 63 ++++++++++++++++++++
 2 files changed, 66 insertions(+)

diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
index b8102a6ddf16..d68101655b74 100644
--- a/arch/arm/include/asm/elf.h
+++ b/arch/arm/include/asm/elf.h
@@ -61,6 +61,9 @@ typedef struct user_fp elf_fpregset_t;
 #define R_ARM_MOVT_ABS		44
 #define R_ARM_MOVW_PREL_NC	45
 #define R_ARM_MOVT_PREL		46
+#define R_ARM_ALU_PC_G0_NC	57
+#define R_ARM_ALU_PC_G1_NC	59
+#define R_ARM_LDR_PC_G2		63
 
 #define R_ARM_THM_CALL		10
 #define R_ARM_THM_JUMP24	30
diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index c2354990290b..0c7f218f9012 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -68,6 +68,20 @@ bool module_exit_section(const char *name)
 		strstarts(name, ".ARM.exidx.exit");
 }
 
+static u32 get_group_rem(u32 group, u32 *offset)
+{
+	u32 val = *offset;
+	u32 shift;
+	do {
+		shift = val ? (31 - __fls(val)) & ~1 : 32;
+		*offset = val;
+		if (!val)
+			break;
+		val &= 0xffffff >> shift;
+	} while (group--);
+	return shift;
+}
+
 int
 apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
 	       unsigned int relindex, struct module *module)
@@ -82,6 +96,7 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
 		unsigned long loc;
 		Elf32_Sym *sym;
 		const char *symname;
+		u32 shift, group = 1;
 		s32 offset;
 		u32 tmp;
 #ifdef CONFIG_THUMB2_KERNEL
@@ -212,6 +227,54 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
 			*(u32 *)loc = __opcode_to_mem_arm(tmp);
 			break;
 
+		case R_ARM_ALU_PC_G0_NC:
+			group = 0;
+			fallthrough;
+		case R_ARM_ALU_PC_G1_NC:
+			tmp = __mem_to_opcode_arm(*(u32 *)loc);
+			offset = ror32(tmp & 0xff, (tmp & 0xf00) >> 7);
+			if (tmp & BIT(22))
+				offset = -offset;
+			offset += sym->st_value - loc;
+			if (offset < 0) {
+				offset = -offset;
+				tmp = (tmp & ~BIT(23)) | BIT(22); // SUB opcode
+			} else {
+				tmp = (tmp & ~BIT(22)) | BIT(23); // ADD opcode
+			}
+
+			shift = get_group_rem(group, &offset);
+			if (shift < 24) {
+				offset >>= 24 - shift;
+				offset |= (shift + 8) << 7;
+			}
+			*(u32 *)loc = __opcode_to_mem_arm((tmp & ~0xfff) | offset);
+			break;
+
+		case R_ARM_LDR_PC_G2:
+			tmp = __mem_to_opcode_arm(*(u32 *)loc);
+			offset = tmp & 0xfff;
+			if (~tmp & BIT(23))		// U bit cleared?
+				offset = -offset;
+			offset += sym->st_value - loc;
+			if (offset < 0) {
+				offset = -offset;
+				tmp &= ~BIT(23);	// clear U bit
+			} else {
+				tmp |= BIT(23);		// set U bit
+			}
+			get_group_rem(2, &offset);
+
+			if (offset > 0xfff) {
+				pr_err("%s: section %u reloc %u sym '%s': relocation %u out of range (%#lx -> %#x)\n",
+				       module->name, relindex, i, symname,
+				       ELF32_R_TYPE(rel->r_info), loc,
+				       sym->st_value);
+				return -ENOEXEC;
+			}
+			*(u32 *)loc = __opcode_to_mem_arm((tmp & ~0xfff) | offset);
+			break;
+
 #ifdef CONFIG_THUMB2_KERNEL
 		case R_ARM_THM_CALL:
 		case R_ARM_THM_JUMP24:
-- 
2.30.2



* [RFC PATCH 3/6] ARM: percpu: add SMP_ON_UP support
  2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
  2021-11-26 10:10 ` [RFC PATCH 1/6] ARM: entry: preserve thread_info pointer in switch_to Ard Biesheuvel
  2021-11-26 10:10 ` [RFC PATCH 2/6] ARM: module: implement support for PC-relative group relocations Ard Biesheuvel
@ 2021-11-26 10:10 ` Ard Biesheuvel
  2021-11-26 10:10 ` [RFC PATCH 4/6] ARM: smp: defer TPIDRURO update for SMP v6 configurations too Ard Biesheuvel
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-26 10:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann,
	Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers,
	Tony Lindgren

Permit the use of the TPIDRPRW system register for carrying the per-CPU
offset in generic SMP configurations that also target non-SMP capable
ARMv6 cores. This uses the SMP_ON_UP code patching framework to turn all
TPIDRPRW accesses into reads/writes of entry #0 in the __per_cpu_offset
array.
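
The intended behaviour can be modelled in plain C as follows. This is a
user-space sketch under stated assumptions, not kernel code: a variable
stands in for the TPIDRPRW register, a boolean stands in for the outcome
of the SMP_ON_UP boot-time patching, and all 'model_' names are
hypothetical. In the kernel the read path is selected by patching the
instruction stream once at boot rather than by a branch on every access.

```c
#include <stdbool.h>

#define MODEL_NR_CPUS 4		/* assumption: arbitrary small CPU count */

static unsigned long model_per_cpu_offset[MODEL_NR_CPUS];
static unsigned long model_tpidrprw;	/* stand-in for the register */
static bool model_cpu_is_smp;		/* stand-in for the patching result */

static void model_set_my_cpu_offset(unsigned long off)
{
	if (!model_cpu_is_smp) {
		/* UP hardware: no TPIDRPRW, so keep the offset in entry #0 */
		model_per_cpu_offset[0] = off;
		return;
	}
	model_tpidrprw = off;
}

static unsigned long model_my_cpu_offset(void)
{
	/* UP hardware reads entry #0; SMP hardware reads the register */
	return model_cpu_is_smp ? model_tpidrprw : model_per_cpu_offset[0];
}
```

On a CPU without SMP extensions only entry #0 of the array is ever used,
which is sufficient because such a system has exactly one CPU.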

While at it, switch over some existing direct TPIDRPRW accesses in asm
code to invocations of a new helper that is patched in the same way when
necessary.

Note that CPU_V6+SMP without SMP_ON_UP results in a kernel that does not
boot on v6 CPUs without SMP extensions, so add this dependency to
Kconfig as well.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/include/asm/assembler.h | 21 ++++++++++++++
 arch/arm/include/asm/percpu.h    | 29 ++++++++++++++++++--
 arch/arm/kernel/entry-armv.S     |  4 +--
 arch/arm/mm/Kconfig              |  1 +
 4 files changed, 50 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 1b9d4df331aa..c4c1d5b2edf5 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -312,6 +312,27 @@ THUMB(	fpreg	.req	r7	)
 #define ALT_UP_B(label) b label
 #endif
 
+	/*
+	 * this_cpu_offset - load the per-CPU offset of this CPU into
+	 * 		     register 'rd'
+	 */
+	.macro	this_cpu_offset, rd:req
+#ifdef CONFIG_SMP
+ALT_SMP(mrc p15, 0, \rd, c13, c0, 4)
+#ifdef CONFIG_CPU_V6
+ALT_UP_B(.L1_\@)
+.L0_\@:
+	.subsection 1
+.L1_\@: ldr	\rd, =__per_cpu_offset
+	ldr	\rd, [\rd]
+	b	.L0_\@
+	.previous
+#endif
+#else
+	mov	\rd, #0
+#endif
+	.endm
+
 /*
  * Instruction barrier
  */
diff --git a/arch/arm/include/asm/percpu.h b/arch/arm/include/asm/percpu.h
index e2fcb3cfd3de..7b984352e402 100644
--- a/arch/arm/include/asm/percpu.h
+++ b/arch/arm/include/asm/percpu.h
@@ -5,15 +5,25 @@
 #ifndef _ASM_ARM_PERCPU_H_
 #define _ASM_ARM_PERCPU_H_
 
+#include <linux/threads.h>
+
 register unsigned long current_stack_pointer asm ("sp");
 
 /*
  * Same as asm-generic/percpu.h, except that we store the per cpu offset
  * in the TPIDRPRW. TPIDRPRW only exists on V6K and V7
  */
-#if defined(CONFIG_SMP) && !defined(CONFIG_CPU_V6)
+#ifdef CONFIG_SMP
+extern unsigned long __per_cpu_offset[NR_CPUS];
+extern unsigned int smp_on_up;
+
 static inline void set_my_cpu_offset(unsigned long off)
 {
+	if (IS_ENABLED(CONFIG_CPU_V6) && !smp_on_up) {
+		__per_cpu_offset[0] = off;
+		return;
+	}
+
 	/* Set TPIDRPRW */
 	asm volatile("mcr p15, 0, %0, c13, c0, 4" : : "r" (off) : "memory");
 }
@@ -27,8 +37,21 @@ static inline unsigned long __my_cpu_offset(void)
 	 * We want to allow caching the value, so avoid using volatile and
 	 * instead use a fake stack read to hazard against barrier().
 	 */
-	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off)
-		: "Q" (*(const unsigned long *)current_stack_pointer));
+	asm("0:	mrc p15, 0, %0, c13, c0, 4			\n\t"
+#ifdef CONFIG_CPU_V6
+	    "1:							\n\t"
+	    "	.subsection 1					\n\t"
+	    "2: ldr	%0, =__per_cpu_offset			\n\t"
+	    "	ldr	%0, [%0]				\n\t"
+	    "	b	1b					\n\t"
+	    "	.previous					\n\t"
+	    "	.pushsection \".alt.smp.init\", \"a\"		\n\t"
+	    "	.long	0b - .					\n\t"
+	    "	b	. + (2b - 0b)				\n\t"
+	    "	.popsection					\n\t"
+#endif
+	     : "=r" (off)
+	     : "Q" (*(const unsigned long *)current_stack_pointer));
 
 	return off;
 }
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 1e26d69ebbf1..09a9fe501094 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -41,7 +41,7 @@
 	mov	r0, sp
 #ifdef CONFIG_IRQSTACKS
 	mov_l	r2, irq_stack_ptr	@ Take base address
-	mrc	p15, 0, r3, c13, c0, 4	@ Get CPU offset
+	this_cpu_offset r3
 #ifdef CONFIG_UNWINDER_ARM
 	mov	fpreg, sp		@ Preserve original SP
 #else
@@ -884,7 +884,7 @@ __bad_stack:
 THUMB(	bx	pc		)
 THUMB(	nop			)
 THUMB(	.arm			)
-	mrc	p15, 0, ip, c13, c0, 4		@ Get per-CPU offset
+	this_cpu_offset ip
 
 	.globl	overflow_stack_ptr
 	.reloc	0f, R_ARM_ALU_PC_G0_NC, overflow_stack_ptr
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 58afba346729..a91ff22c6c2e 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -386,6 +386,7 @@ config CPU_V6
 	select CPU_PABRT_V6
 	select CPU_THUMB_CAPABLE
 	select CPU_TLB_V6 if MMU
+	select SMP_ON_UP if SMP
 
 # ARMv6k
 config CPU_V6K
-- 
2.30.2



* [RFC PATCH 4/6] ARM: smp: defer TPIDRURO update for SMP v6 configurations too
  2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
                   ` (2 preceding siblings ...)
  2021-11-26 10:10 ` [RFC PATCH 3/6] ARM: percpu: add SMP_ON_UP support Ard Biesheuvel
@ 2021-11-26 10:10 ` Ard Biesheuvel
  2021-11-26 10:10 ` [RFC PATCH 5/6] ARM: use TLS register for 'current' on !SMP as well Ard Biesheuvel
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-26 10:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann,
	Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers,
	Tony Lindgren

Defer TPIDRURO updates for user space until exit for CPU_V6+SMP
configurations as well, so that we can decide at runtime whether to use
it to carry the current pointer, provided that we are running on a CPU
that actually implements this register. This is needed for
THREAD_INFO_IN_TASK support for UP systems, which requires that all SMP
capable systems use the TPIDRURO based access to 'current', as the only
remaining alternative will be a global variable, which only works on UP.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/include/asm/tls.h     | 4 +++-
 arch/arm/kernel/entry-header.S | 8 +++++++-
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/tls.h b/arch/arm/include/asm/tls.h
index c3296499176c..9c0965c14a21 100644
--- a/arch/arm/include/asm/tls.h
+++ b/arch/arm/include/asm/tls.h
@@ -24,7 +24,9 @@
 	tst	\tmp1, #HWCAP_TLS		@ hardware TLS available?
 	streq	\tp, [\tmp2, #-15]		@ set TLS value at 0xffff0ff0
 	mrcne	p15, 0, \tmp2, c13, c0, 2	@ get the user r/w register
+#ifndef CONFIG_SMP
 	mcrne	p15, 0, \tp, c13, c0, 3		@ yes, set TLS register
+#endif
 	mcrne	p15, 0, \tpuser, c13, c0, 2	@ set user r/w register
 	strne	\tmp2, [\base, #TI_TP_VALUE + 4] @ save it
 	.endm
@@ -43,7 +45,7 @@
 #elif defined(CONFIG_CPU_V6)
 #define tls_emu		0
 #define has_tls_reg		(elf_hwcap & HWCAP_TLS)
-#define defer_tls_reg_update	0
+#define defer_tls_reg_update	IS_ENABLED(CONFIG_SMP)
 #define switch_tls	switch_tls_v6
 #elif defined(CONFIG_CPU_32v6K)
 #define tls_emu		0
diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index 81df2a3561ca..aea716c8b97c 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -292,12 +292,18 @@
 
 
 	.macro	restore_user_regs, fast = 0, offset = 0
-#if defined(CONFIG_CPU_32v6K) && !defined(CONFIG_CPU_V6)
+#if defined(CONFIG_CPU_32v6K) || \
+    (defined(CONFIG_CPU_V6) && defined(CONFIG_SMP))
 	@ The TLS register update is deferred until return to user space so we
 	@ can use it for other things while running in the kernel
+#ifdef CONFIG_CPU_V6
+ALT_SMP(nop)
+ALT_UP_B(.L0_\@)
+#endif
 	get_thread_info r1
 	ldr	r1, [r1, #TI_TP_VALUE]
 	mcr	p15, 0, r1, c13, c0, 3		@ set TLS register
+.L0_\@:
 #endif
 
 	uaccess_enable r1, isb=0
-- 
2.30.2



* [RFC PATCH 5/6] ARM: use TLS register for 'current' on !SMP as well
  2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
                   ` (3 preceding siblings ...)
  2021-11-26 10:10 ` [RFC PATCH 4/6] ARM: smp: defer TPIDRURO update for SMP v6 configurations too Ard Biesheuvel
@ 2021-11-26 10:10 ` Ard Biesheuvel
  2021-11-26 10:10 ` [RFC PATCH 6/6] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems Ard Biesheuvel
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-26 10:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann,
	Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers,
	Tony Lindgren

Enable the use of the TLS register to hold the 'current' pointer also on
non-SMP configurations that target v6k or later CPUs. This will permit
the use of THREAD_INFO_IN_TASK as well as IRQ stacks and vmap'ed stacks
for such configurations.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index e2ab72f2bf4a..61fc5cc03042 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1165,7 +1165,7 @@ config SMP_ON_UP
 
 config CURRENT_POINTER_IN_TPIDRURO
 	def_bool y
-	depends on SMP && CPU_32v6K && !CPU_V6
+	depends on CPU_32v6K && !CPU_V6
 
 config IRQSTACKS
 	def_bool y
-- 
2.30.2



* [RFC PATCH 6/6] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems
  2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
                   ` (4 preceding siblings ...)
  2021-11-26 10:10 ` [RFC PATCH 5/6] ARM: use TLS register for 'current' on !SMP as well Ard Biesheuvel
@ 2021-11-26 10:10 ` Ard Biesheuvel
  2021-11-26 22:32   ` Arnd Bergmann
  2021-11-27  0:20 ` [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Linus Walleij
  2021-11-29 16:32 ` Nicolas Pitre
  7 siblings, 1 reply; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-26 10:10 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann,
	Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers,
	Tony Lindgren

On UP systems, only a single task can be 'current' at any given time,
which means we can use a global variable to track it. This means we can
also enable THREAD_INFO_IN_TASK for those systems, as in that case,
thread_info is accessed via current rather than the other way around,
removing the need to store thread_info at the base of the task stack.
This, in turn, permits us to enable IRQ stacks and vmap'ed stacks on UP
systems as well.

To partially mitigate the performance overhead of this arrangement, use
an ADD/ADD/LDR sequence with the appropriate PC-relative group
relocations to load the value of current when needed. This means that
accessing current will still only require a single load as before,
avoiding the need for a literal to carry the address of the global
variable in each function. However, accessing thread_info will now
require this load as well.
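
The placeholder immediates in the three-instruction sequence (#8, #4 and
#0 in the load_current macro below) act as relocation addends. The sketch
below checks the arithmetic in user space, assuming the usual S + A - P
form for PC-relative relocations and the ARM-mode rule that an instruction
reads PC as its own address plus 8. The helper names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Value of a PC-relative relocation: symbol + addend - place. */
static int64_t pcrel_value(uint32_t sym, int32_t addend, uint32_t place)
{
	return (int64_t)sym + addend - (int64_t)place;
}

/* In ARM mode, an instruction at 'insn' reads PC as insn + 8. */
static uint32_t arm_pc_read(uint32_t insn)
{
	return insn + 8;
}

/*
 * 'l0' is the address of the first instruction of the sequence. The three
 * relocations sit at l0, l0 + 4 and l0 + 8 with addends -8, -4 and 0, so
 * each one should resolve to the same distance: from the PC value read by
 * the first instruction to the symbol.
 */
static bool load_current_addends_agree(uint32_t l0, uint32_t sym)
{
	int64_t want = (int64_t)sym - (int64_t)arm_pc_read(l0);

	return pcrel_value(sym, -8, l0) == want &&
	       pcrel_value(sym, -4, l0 + 4) == want &&
	       pcrel_value(sym, 0, l0 + 8) == want;
}
```

Because all three relocations see the same total offset, each instruction
can take its own group's share of it, and the sign of the offset decides
whether the patched instructions add or subtract.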

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/Kconfig                   |   8 +-
 arch/arm/include/asm/assembler.h   | 115 ++++++++++++++------
 arch/arm/include/asm/current.h     |  56 ++++++++--
 arch/arm/include/asm/switch_to.h   |   3 +-
 arch/arm/include/asm/thread_info.h |  27 -----
 arch/arm/kernel/asm-offsets.c      |   3 -
 arch/arm/kernel/entry-armv.S       |  11 +-
 arch/arm/kernel/head-common.S      |   4 +-
 arch/arm/kernel/process.c          |   7 +-
 arch/arm/kernel/traps.c            |   4 +
 10 files changed, 156 insertions(+), 82 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 61fc5cc03042..3d7476ca4d94 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -126,8 +126,8 @@ config ARM
 	select PERF_USE_VMALLOC
 	select RTC_LIB
 	select SYS_SUPPORTS_APM_EMULATION
-	select THREAD_INFO_IN_TASK if CURRENT_POINTER_IN_TPIDRURO
-	select HAVE_ARCH_VMAP_STACK if MMU && THREAD_INFO_IN_TASK && (!LD_IS_LLD || LLD_VERSION >= 140000)
+	select THREAD_INFO_IN_TASK
+	select HAVE_ARCH_VMAP_STACK if MMU && (!LD_IS_LLD || LLD_VERSION >= 140000)
 	select TRACE_IRQFLAGS_SUPPORT if !CPU_V7M
 	# Above selects are sorted alphabetically; please add new ones
 	# according to that.  Thanks.
@@ -1169,7 +1169,7 @@ config CURRENT_POINTER_IN_TPIDRURO
 
 config IRQSTACKS
 	def_bool y
-	depends on GENERIC_IRQ_MULTI_HANDLER && THREAD_INFO_IN_TASK
+	depends on GENERIC_IRQ_MULTI_HANDLER
 	select HAVE_IRQ_EXIT_ON_IRQ_STACK
 	select HAVE_SOFTIRQ_ON_OWN_STACK
 
@@ -1619,7 +1619,7 @@ config CC_HAVE_STACKPROTECTOR_TLS
 
 config STACKPROTECTOR_PER_TASK
 	bool "Use a unique stack canary value for each task"
-	depends on STACKPROTECTOR && THREAD_INFO_IN_TASK && !XIP_DEFLATED_DATA
+	depends on STACKPROTECTOR && CURRENT_POINTER_IN_TPIDRURO && !XIP_DEFLATED_DATA
 	depends on GCC_PLUGINS || CC_HAVE_STACKPROTECTOR_TLS
 	select GCC_PLUGIN_ARM_SSP_PER_TASK if !CC_HAVE_STACKPROTECTOR_TLS
 	default y
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index c4c1d5b2edf5..978fdaaac680 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -203,43 +203,12 @@ THUMB(	fpreg	.req	r7	)
 	.endm
 	.endr
 
-	.macro	get_current, rd
-#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRURO
-	mrc	p15, 0, \rd, c13, c0, 3		@ get TPIDRURO register
-#else
-	get_thread_info \rd
-	ldr	\rd, [\rd, #TI_TASK]
-#endif
-	.endm
-
-	.macro	set_current, rn
-#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRURO
-	mcr	p15, 0, \rn, c13, c0, 3		@ set TPIDRURO register
-#endif
-	.endm
-
-	.macro	reload_current, t1:req, t2:req
-#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRURO
-	adr_l	\t1, __entry_task		@ get __entry_task base address
-	mrc	p15, 0, \t2, c13, c0, 4		@ get per-CPU offset
-	ldr	\t1, [\t1, \t2]			@ load variable
-	mcr	p15, 0, \t1, c13, c0, 3		@ store in TPIDRURO
-#endif
-	.endm
-
 /*
  * Get current thread_info.
  */
 	.macro	get_thread_info, rd
-#ifdef CONFIG_THREAD_INFO_IN_TASK
 	/* thread_info is the first member of struct task_struct */
 	get_current \rd
-#else
- ARM(	mov	\rd, sp, lsr #THREAD_SIZE_ORDER + PAGE_SHIFT	)
- THUMB(	mov	\rd, sp			)
- THUMB(	lsr	\rd, \rd, #THREAD_SIZE_ORDER + PAGE_SHIFT	)
-	mov	\rd, \rd, lsl #THREAD_SIZE_ORDER + PAGE_SHIFT
-#endif
 	.endm
 
 /*
@@ -333,6 +302,90 @@ ALT_UP_B(.L1_\@)
 #endif
 	.endm
 
+	/*
+	 * load_current - load the current task pointer from the global
+	 * 		  variable '__current'
+	 */
+	.macro	load_current, rd
+#if defined(CONFIG_THUMB2_KERNEL) || \
+    (defined(MODULE) && defined(CONFIG_ARM_MODULE_PLTS)) || \
+    (defined(CONFIG_LD_IS_LLD) && CONFIG_LLD_VERSION < 140000)
+	mov_l	\rd, __current
+	ldr	\rd, [\rd]
+#else
+	/*
+	 * Avoid a literal load, by emitting a sequence of ADD/LDR instructions
+	 * with the appropriate relocations. The combined sequence has a range
+	 * of -/+ 256 MiB, which should be sufficient for the core kernel and
+	 * for modules loaded into the module region.
+	 */
+	.globl	__current
+	.reloc	.L0_\@, R_ARM_ALU_PC_G0_NC, __current
+	.reloc	.L1_\@, R_ARM_ALU_PC_G1_NC, __current
+	.reloc	.L2_\@, R_ARM_LDR_PC_G2, __current
+.L0_\@: sub	\rd, pc, #8
+.L1_\@: sub	\rd, \rd, #4
+.L2_\@: ldr	\rd, [\rd, #0]
+#endif
+	.endm
+
+	/*
+	 * set_current - store the task pointer of this CPU's current task
+	 */
+	.macro	set_current, rn:req, tmp:req
+#if defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) || defined(CONFIG_SMP)
+9998:	mcr p15, 0, \rn, c13, c0, 3		@ set TPIDRURO register
+#ifdef CONFIG_CPU_V6
+ALT_UP_B(.L1_\@)
+.L0_\@:
+	.subsection 1
+.L1_\@: ldr	\tmp, =__current
+	str	\rn, [\tmp]
+	b	.L0_\@
+	.previous
+#endif
+#else
+	str_l	\rn, __current, \tmp
+#endif
+	.endm
+
+	/*
+	 * get_current - load the task pointer of this CPU's current task
+	 */
+	.macro	get_current, rd
+#if defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) || defined(CONFIG_SMP)
+9998:	mrc p15, 0, \rd, c13, c0, 3		@ get TPIDRURO register
+#ifdef CONFIG_CPU_V6
+ALT_UP_B(.L1_\@)
+.L0_\@:
+	.subsection 1
+.L1_\@: load_current \rd
+	b	.L0_\@
+	.previous
+#endif
+#else
+	load_current \rd
+#endif
+	.endm
+
+	/*
+	 * reload_current - reload the task pointer of this CPU's current task
+	 *		    into the TLS register
+	 */
+	.macro	reload_current, t1:req, t2:req
+#if defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) || defined(CONFIG_SMP)
+#ifdef CONFIG_CPU_V6
+ALT_SMP(nop)
+ALT_UP_B(.L0_\@)
+#endif
+	adr_l	\t1, __entry_task		@ get __entry_task base address
+	mrc	p15, 0, \t2, c13, c0, 4		@ get per-CPU offset
+	ldr	\t1, [\t1, \t2]			@ load variable
+	mcr	p15, 0, \t1, c13, c0, 3		@ store in TPIDRURO
+.L0_\@:
+#endif
+	.endm
+
 /*
  * Instruction barrier
  */
diff --git a/arch/arm/include/asm/current.h b/arch/arm/include/asm/current.h
index 6bf0aad672c3..68d6907c9d54 100644
--- a/arch/arm/include/asm/current.h
+++ b/arch/arm/include/asm/current.h
@@ -11,22 +11,50 @@
 
 struct task_struct;
 
+extern struct task_struct *__current;
+extern unsigned int smp_on_up;
+
 static inline void set_current(struct task_struct *cur)
 {
-	if (!IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO))
+	if (!IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO) &&
+	    !(IS_ENABLED(CONFIG_SMP) &&
+	      IS_ENABLED(CONFIG_SMP_ON_UP) &&
+	      smp_on_up)) {
+		__current = cur;
 		return;
+	}
 
 	/* Set TPIDRURO */
 	asm("mcr p15, 0, %0, c13, c0, 3" :: "r"(cur) : "memory");
 }
 
-#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRURO
+/*
+ * Avoid a literal load by emitting a sequence of ADD/LDR instructions with the
+ * appropriate relocations. The combined sequence has a range of -/+ 256 MiB,
+ * which should be sufficient for the core kernel as well as modules loaded
+ * into the module region. (Not supported by LLD before release 14)
+ */
+#if !defined(CONFIG_LD_IS_LLD) || CONFIG_LLD_VERSION >= 140000
+#define LOAD_CURRENT							\
+	"	.globl	__current				\n\t"	\
+	"	.reloc	10f, R_ARM_ALU_PC_G0_NC, __current	\n\t"	\
+	"	.reloc	11f, R_ARM_ALU_PC_G1_NC, __current	\n\t"	\
+	"	.reloc	12f, R_ARM_LDR_PC_G2, __current		\n\t"	\
+	"10:	sub	%0, pc, #8				\n\t"	\
+	"11:	sub	%0, %0, #4				\n\t"	\
+	"12:	ldr	%0, [%0, #0]				\n\t"
+#else
+#define LOAD_CURRENT							\
+	"	ldr	%0, =__current				\n\t"	\
+	"	ldr	%0, [%0]				\n\t"
+#endif
 
-static inline struct task_struct *get_current(void)
+static inline __attribute_const__ struct task_struct *get_current(void)
 {
 	struct task_struct *cur;
 
 #if __has_builtin(__builtin_thread_pointer) && \
+    defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) && \
     !(defined(CONFIG_THUMB2_KERNEL) && \
       defined(CONFIG_CC_IS_CLANG) && CONFIG_CLANG_VERSION < 130001)
 	/*
@@ -39,16 +67,30 @@ static inline struct task_struct *get_current(void)
 	 * https://github.com/ClangBuiltLinux/linux/issues/1485
 	 */
 	cur = __builtin_thread_pointer();
+#elif defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) || defined(CONFIG_SMP)
+	asm("0:	mrc p15, 0, %0, c13, c0, 3			\n\t"
+#ifdef CONFIG_CPU_V6
+	    "1:							\n\t"
+	    "	.subsection 1					\n\t"
+	    "2: " LOAD_CURRENT
+	    "	b	1b					\n\t"
+	    "	.previous					\n\t"
+	    "	.pushsection \".alt.smp.init\", \"a\"		\n\t"
+	    "	.long	0b - .					\n\t"
+	    "	b	. + (2b - 0b)				\n\t"
+	    "	.popsection					\n\t"
+#endif
+	    : "=r"(cur));
+#elif defined(CONFIG_THUMB2_KERNEL) || \
+      (defined(MODULE) && defined(CONFIG_ARM_MODULE_PLTS))
+	cur = __current;
 #else
-	asm("mrc p15, 0, %0, c13, c0, 3" : "=r"(cur));
+	asm(LOAD_CURRENT : "=r"(cur));
 #endif
 	return cur;
 }
 
 #define current get_current()
-#else
-#include <asm-generic/current.h>
-#endif /* CONFIG_CURRENT_POINTER_IN_TPIDRURO */
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/arm/include/asm/switch_to.h b/arch/arm/include/asm/switch_to.h
index b55c7b2755e4..a482c99934ff 100644
--- a/arch/arm/include/asm/switch_to.h
+++ b/arch/arm/include/asm/switch_to.h
@@ -40,7 +40,8 @@ static inline void set_ti_cpu(struct task_struct *p)
 do {									\
 	__complete_pending_tlbi();					\
 	set_ti_cpu(next);						\
-	if (IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO))		\
+	if (IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO) ||		\
+	    IS_ENABLED(CONFIG_SMP))					\
 		__this_cpu_write(__entry_task, next);			\
 	last = __switch_to(prev,task_thread_info(prev), task_thread_info(next));	\
 } while (0)
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 004b89d86224..aecc403b2880 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -62,9 +62,6 @@ struct cpu_context_save {
 struct thread_info {
 	unsigned long		flags;		/* low level flags */
 	int			preempt_count;	/* 0 => preemptable, <0 => bug */
-#ifndef CONFIG_THREAD_INFO_IN_TASK
-	struct task_struct	*task;		/* main task structure */
-#endif
 	__u32			cpu;		/* cpu */
 	__u32			cpu_domain;	/* cpu domain */
 	struct cpu_context_save	cpu_context;	/* cpu context */
@@ -80,39 +77,15 @@ struct thread_info {
 
 #define INIT_THREAD_INFO(tsk)						\
 {									\
-	INIT_THREAD_INFO_TASK(tsk)					\
 	.flags		= 0,						\
 	.preempt_count	= INIT_PREEMPT_COUNT,				\
 }
 
-#ifdef CONFIG_THREAD_INFO_IN_TASK
-#define INIT_THREAD_INFO_TASK(tsk)
-
 static inline struct task_struct *thread_task(struct thread_info* ti)
 {
 	return (struct task_struct *)ti;
 }
 
-#else
-#define INIT_THREAD_INFO_TASK(tsk)	.task = &(tsk),
-
-static inline struct task_struct *thread_task(struct thread_info* ti)
-{
-	return ti->task;
-}
-
-/*
- * how to get the thread information struct from C
- */
-static inline struct thread_info *current_thread_info(void) __attribute_const__;
-
-static inline struct thread_info *current_thread_info(void)
-{
-	return (struct thread_info *)
-		(current_stack_pointer & ~(THREAD_SIZE - 1));
-}
-#endif
-
 #define thread_saved_pc(tsk)	\
 	((unsigned long)(task_thread_info(tsk)->cpu_context.pc))
 #define thread_saved_sp(tsk)	\
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 645845e4982a..2c8d76fd7c66 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -43,9 +43,6 @@ int main(void)
   BLANK();
   DEFINE(TI_FLAGS,		offsetof(struct thread_info, flags));
   DEFINE(TI_PREEMPT,		offsetof(struct thread_info, preempt_count));
-#ifndef CONFIG_THREAD_INFO_IN_TASK
-  DEFINE(TI_TASK,		offsetof(struct thread_info, task));
-#endif
   DEFINE(TI_CPU,		offsetof(struct thread_info, cpu));
   DEFINE(TI_CPU_DOMAIN,		offsetof(struct thread_info, cpu_domain));
   DEFINE(TI_CPU_SAVE,		offsetof(struct thread_info, cpu_context));
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 09a9fe501094..5f3b882d53b7 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -816,12 +816,13 @@ ENTRY(__switch_to)
 	switch_tls r1, r4, r5, r3, r7
 #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_SMP) && \
     !defined(CONFIG_STACKPROTECTOR_PER_TASK)
-	ldr	r9, [r2, #TI_TASK]
 	ldr	r8, =__stack_chk_guard
 	.if (TSK_STACK_CANARY > IMM12_MASK)
-	add	r9, r9, #TSK_STACK_CANARY & ~IMM12_MASK
-	.endif
+	add	r9, r2, #TSK_STACK_CANARY & ~IMM12_MASK
 	ldr	r9, [r9, #TSK_STACK_CANARY & IMM12_MASK]
+	.else
+	ldr	r9, [r2, #TSK_STACK_CANARY & IMM12_MASK]
+	.endif
 #endif
 	mov	r7, r2				@ Preserve 'next'
 #ifdef CONFIG_CPU_USE_DOMAINS
@@ -838,7 +839,7 @@ ENTRY(__switch_to)
 #endif
 	mov	r0, r5
 #if !defined(CONFIG_THUMB2_KERNEL) && !defined(CONFIG_VMAP_STACK)
-	set_current r7
+	set_current r7, r8
 	ldmia	r4, {r4 - sl, fp, sp, pc}	@ Load all regs saved previously
 #else
 	mov	r1, r7
@@ -860,7 +861,7 @@ ENTRY(__switch_to)
 	@ switches us to another stack, with few other side effects. In order
 	@ to prevent this distinction from causing any inconsistencies, let's
 	@ keep the 'set_current' call as close as we can to the update of SP.
-	set_current r1
+	set_current r1, r2
 	mov	sp, ip
 	ret	lr
 #endif
diff --git a/arch/arm/kernel/head-common.S b/arch/arm/kernel/head-common.S
index da18e0a17dc2..42cae73fcc19 100644
--- a/arch/arm/kernel/head-common.S
+++ b/arch/arm/kernel/head-common.S
@@ -105,10 +105,8 @@ __mmap_switched:
 	mov	r1, #0
 	bl	__memset			@ clear .bss
 
-#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRURO
 	adr_l	r0, init_task			@ get swapper task_struct
-	set_current r0
-#endif
+	set_current r0, r1
 
 	ldmia	r4, {r0, r1, r2, r3}
 	str	r9, [r0]			@ Save processor ID
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index d47159f3791c..0617af11377f 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -36,7 +36,7 @@
 
 #include "signal.h"
 
-#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRURO
+#if defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) || defined(CONFIG_SMP)
 DEFINE_PER_CPU(struct task_struct *, __entry_task);
 #endif
 
@@ -46,6 +46,11 @@ unsigned long __stack_chk_guard __read_mostly;
 EXPORT_SYMBOL(__stack_chk_guard);
 #endif
 
+#ifndef CONFIG_CURRENT_POINTER_IN_TPIDRURO
+asmlinkage struct task_struct *__current;
+EXPORT_SYMBOL(__current);
+#endif
+
 static const char *processor_modes[] __maybe_unused = {
   "USER_26", "FIQ_26" , "IRQ_26" , "SVC_26" , "UK4_26" , "UK5_26" , "UK6_26" , "UK7_26" ,
   "UK8_26" , "UK9_26" , "UK10_26", "UK11_26", "UK12_26", "UK13_26", "UK14_26", "UK15_26",
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index b28a705c49cb..3f38357efc46 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -865,7 +865,9 @@ early_initcall(allocate_overflow_stacks);
 asmlinkage void handle_bad_stack(struct pt_regs *regs)
 {
 	unsigned long tsk_stk = (unsigned long)current->stack;
+#ifdef CONFIG_IRQSTACKS
 	unsigned long irq_stk = (unsigned long)this_cpu_read(irq_stack_ptr);
+#endif
 	unsigned long ovf_stk = (unsigned long)this_cpu_read(overflow_stack_ptr);
 
 	console_verbose();
@@ -873,8 +875,10 @@ asmlinkage void handle_bad_stack(struct pt_regs *regs)
 
 	pr_emerg("Task stack:     [0x%08lx..0x%08lx]\n",
 		 tsk_stk, tsk_stk + THREAD_SIZE);
+#ifdef CONFIG_IRQSTACKS
 	pr_emerg("IRQ stack:      [0x%08lx..0x%08lx]\n",
 		 irq_stk - THREAD_SIZE, irq_stk);
+#endif
 	pr_emerg("Overflow stack: [0x%08lx..0x%08lx]\n",
 		 ovf_stk - OVERFLOW_STACK_SIZE, ovf_stk);
 
-- 
2.30.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel


* Re: [RFC PATCH 6/6] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems
  2021-11-26 10:10 ` [RFC PATCH 6/6] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems Ard Biesheuvel
@ 2021-11-26 22:32   ` Arnd Bergmann
  2021-11-30  8:00     ` Ard Biesheuvel
  0 siblings, 1 reply; 11+ messages in thread
From: Arnd Bergmann @ 2021-11-26 22:32 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Linux ARM, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook,
	Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren

On Fri, Nov 26, 2021 at 11:10 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> @@ -1169,7 +1169,7 @@ config CURRENT_POINTER_IN_TPIDRURO
>
>  config IRQSTACKS
>         def_bool y
> -       depends on GENERIC_IRQ_MULTI_HANDLER && THREAD_INFO_IN_TASK
> +       depends on GENERIC_IRQ_MULTI_HANDLER
>         select HAVE_IRQ_EXIT_ON_IRQ_STACK
>         select HAVE_SOFTIRQ_ON_OWN_STACK

Side note: after this, we might want to investigate finishing off
GENERIC_IRQ_MULTI_HANDLER for all architectures. The
currently missing platforms are ARM_SINGLE_ARMV7M,
ARCH_FOOTBRIDGE, ARCH_IOP32X and ARCH_RPC.

These are a bit tricky (presumably this is why they are not converted
yet), but it should be possible to do.

>  static inline void set_current(struct task_struct *cur)
>  {
> -       if (!IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO))
> +       if (!IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO) &&
> +           !(IS_ENABLED(CONFIG_SMP) &&
> +             IS_ENABLED(CONFIG_SMP_ON_UP) &&
> +             smp_on_up)) {

I think you can just use is_smp() here to simplify the condition. You might
need to move the definition to a different header if that causes an #include
loop.
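
As a rough illustration of the suggestion above (not code from the patch),
the sketch below models both forms of the condition in plain userspace C and
counts the inputs on which they disagree. Everything here is an assumption
made for illustration: the boolean parameters stand in for the
IS_ENABLED(CONFIG_...) tests, model_is_smp() mirrors what is_smp() is
understood to return (false on !SMP, the smp_on_up flag with SMP_ON_UP, true
otherwise), and all function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of is_smp(); the real helper lives in the kernel. */
static bool model_is_smp(bool smp, bool smp_on_up_cfg, unsigned int smp_on_up)
{
	if (!smp)
		return false;		/* UP-only build */
	if (smp_on_up_cfg)
		return smp_on_up != 0;	/* decided at boot */
	return true;			/* SMP-only build */
}

/* The condition as written in the quoted hunk. */
static bool quoted_cond(bool tpidruro, bool smp, bool smp_on_up_cfg,
			unsigned int smp_on_up)
{
	return !tpidruro && !(smp && smp_on_up_cfg && smp_on_up);
}

/* The condition rewritten around the is_smp() model. */
static bool simplified_cond(bool tpidruro, bool smp, bool smp_on_up_cfg,
			    unsigned int smp_on_up)
{
	return !tpidruro && !model_is_smp(smp, smp_on_up_cfg, smp_on_up);
}

/* Count the input combinations on which the two conditions differ. */
static int count_mismatches(void)
{
	int n = 0;

	for (int i = 0; i < 16; i++) {
		bool tp = i & 1, smp = i & 2, cfg = i & 4;
		unsigned int flag = (i & 8) ? 1 : 0;

		if (quoted_cond(tp, smp, cfg, flag) !=
		    simplified_cond(tp, smp, cfg, flag))
			n++;
	}
	return n;
}

/* The only mismatches are SMP && !SMP_ON_UP && !TPIDRURO (flag 0 or 1). */
void check_model(void)
{
	assert(count_mismatches() == 2);
}
```

In this model the two forms differ only for an SMP build without SMP_ON_UP
that also lacks CURRENT_POINTER_IN_TPIDRURO. Whether Kconfig ever permits
that combination is not shown in this thread, so it would need checking
before the simplification is treated as equivalent.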

> @@ -39,16 +67,30 @@ static inline struct task_struct *get_current(void)
>          * https://github.com/ClangBuiltLinux/linux/issues/1485
>          */
>         cur = __builtin_thread_pointer();
> +#elif defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) || defined(CONFIG_SMP)
> +       asm("0: mrc p15, 0, %0, c13, c0, 3                      \n\t"
> +#ifdef CONFIG_CPU_V6
> +           "1:                                                 \n\t"
> +           "   .subsection 1                                   \n\t"
> +           "2: " LOAD_CURRENT
> +           "   b       1b                                      \n\t"
> +           "   .previous                                       \n\t"
> +           "   .pushsection \".alt.smp.init\", \"a\"           \n\t"
> +           "   .long   0b - .                                  \n\t"
> +           "   b       . + (2b - 0b)                           \n\t"
> +           "   .popsection                                     \n\t"
> +#endif

You mentioned earlier that this gets ugly with SMP_ON_UP on ARMv6, now
I see what you meant ;-)

I can see an increasing number of reasons for no longer supporting this
option. As we recently discussed on IRC, this would affect omap2plus_defconfig,
imx_v6_v7_defconfig and realview_defconfig, which would all have to drop
either CPU_V6 or SMP. Since you got it working already, this also seems
better left as a cleanup for another time once we can build consensus on it,
but my guess is that at this point the benefits of removing it outweigh those
of keeping it.

        Arnd


* Re: [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP
  2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
                   ` (5 preceding siblings ...)
  2021-11-26 10:10 ` [RFC PATCH 6/6] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems Ard Biesheuvel
@ 2021-11-27  0:20 ` Linus Walleij
  2021-11-29 16:32 ` Nicolas Pitre
  7 siblings, 0 replies; 11+ messages in thread
From: Linus Walleij @ 2021-11-27  0:20 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, Russell King, Nicolas Pitre, Arnd Bergmann,
	Kees Cook, Keith Packard, Nick Desaulniers, Tony Lindgren

On Fri, Nov 26, 2021 at 11:10 AM Ard Biesheuvel <ardb@kernel.org> wrote:

> Enable the use of the TLS register to hold the 'current' pointer for all
> configurations that can support it, including non-SMP ones that target
> v6k or later CPUs, and multi-platform SMP ones that also support v6
> based UP systems.
>
> The remaining configurations are all strictly UP, which means we can
> switch to a global variable to hold the current pointer. By doing this,
> we can enable THREAD_INFO_IN_TASK, which moves thread info off the
> stack, protecting it from overflows. It also permits us to enable IRQ
> stacks and vmap'ed stacks for UP configurations as well.

I really like what I see here!

I glanced over it but sadly do not have sufficient time to read every
detail of it, but I certainly trust you to get things right and iron out any
rough corners so FWIW:
Acked-by: Linus Walleij <linus.walleij@linaro.org>
on this patch set.

> Supporting v6 cores without SMP extensions in SMP configurations (e.g.,
> omap2plus_defconfig or imx_v6_v7_defconfig) makes this a bit tricky, and
> this is a feature we may consider dropping entirely in the future. But
> for the time being, we can support this mode as well.

Hmmm yes these will look odd and I can see it really makes the
patch hairy too. I think people might be using them though :/

Yours,
Linus Walleij


* Re: [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP
  2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
                   ` (6 preceding siblings ...)
  2021-11-27  0:20 ` [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Linus Walleij
@ 2021-11-29 16:32 ` Nicolas Pitre
  7 siblings, 0 replies; 11+ messages in thread
From: Nicolas Pitre @ 2021-11-29 16:32 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: linux-arm-kernel, Russell King, Arnd Bergmann, Kees Cook,
	Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren

On Fri, 26 Nov 2021, Ard Biesheuvel wrote:

> Enable the use of the TLS register to hold the 'current' pointer for all
> configurations that can support it, including non-SMP ones that target
> v6k or later CPUs, and multi-platform SMP ones that also support v6
> based UP systems.
> 
> The remaining configurations are all strictly UP, which means we can
> switch to a global variable to hold the current pointer. By doing this,
> we can enable THREAD_INFO_IN_TASK, which moves thread info off the
> stack, protecting it from overflows. It also permits us to enable IRQ
> stacks and vmap'ed stacks for UP configurations as well.
> 
> Supporting v6 cores without SMP extensions in SMP configurations (e.g.,
> omap2plus_defconfig or imx_v6_v7_defconfig) makes this a bit tricky, and
> this is a feature we may consider dropping entirely in the future. But
> for the time being, we can support this mode as well.
> 
> The accesses to the global variable holding 'current' are constructed in
> a way that ensures that no literal pool accesses (and associated D-cache
> misses) are needed unless the access is from a module and module PLTs
> are enabled. This means that accessing 'current' is just as costly as
> before, as it used to require some arithmetic involving the stack
> pointer and a load from the thread_info::task field.
> 
> However, accessing thread_info itself now also involves a load, although
> it should be noted that all thread_info and current accesses now go via
> the same variable, which is therefore expected to be hot in the caches
> at all times.

Looks nice overall.

Acked-by: Nicolas Pitre <nico@fluxnic.net>

> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Nicolas Pitre <nico@fluxnic.net>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Keith Packard <keithpac@amazon.com>
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Cc: Tony Lindgren <tony@atomide.com>
> 
> Ard Biesheuvel (6):
>   ARM: entry: preserve thread_info pointer in switch_to
>   ARM: module: implement support for PC-relative group relocations
>   ARM: percpu: add SMP_ON_UP support
>   ARM: smp: defer TPIDRURO update for SMP v6 configurations too
>   ARM: use TLS register for 'current' on !SMP as well
>   ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems
> 
>  arch/arm/Kconfig                   |  10 +-
>  arch/arm/include/asm/assembler.h   | 136 +++++++++++++++-----
>  arch/arm/include/asm/current.h     |  56 +++++++-
>  arch/arm/include/asm/elf.h         |   3 +
>  arch/arm/include/asm/percpu.h      |  29 ++++-
>  arch/arm/include/asm/switch_to.h   |   3 +-
>  arch/arm/include/asm/thread_info.h |  27 ----
>  arch/arm/include/asm/tls.h         |   4 +-
>  arch/arm/kernel/asm-offsets.c      |   3 -
>  arch/arm/kernel/entry-armv.S       |  26 ++--
>  arch/arm/kernel/entry-header.S     |   8 +-
>  arch/arm/kernel/head-common.S      |   4 +-
>  arch/arm/kernel/module.c           |  63 +++++++++
>  arch/arm/kernel/process.c          |   7 +-
>  arch/arm/kernel/traps.c            |   4 +
>  arch/arm/mm/Kconfig                |   1 +
>  16 files changed, 289 insertions(+), 95 deletions(-)
> 
> -- 
> 2.30.2
> 
> 


* Re: [RFC PATCH 6/6] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems
  2021-11-26 22:32   ` Arnd Bergmann
@ 2021-11-30  8:00     ` Ard Biesheuvel
  0 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-30  8:00 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Linux ARM, Russell King, Nicolas Pitre, Kees Cook, Keith Packard,
	Linus Walleij, Nick Desaulniers, Tony Lindgren

On Fri, 26 Nov 2021 at 23:32, Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, Nov 26, 2021 at 11:10 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> > @@ -1169,7 +1169,7 @@ config CURRENT_POINTER_IN_TPIDRURO
> >
> >  config IRQSTACKS
> >         def_bool y
> > -       depends on GENERIC_IRQ_MULTI_HANDLER && THREAD_INFO_IN_TASK
> > +       depends on GENERIC_IRQ_MULTI_HANDLER
> >         select HAVE_IRQ_EXIT_ON_IRQ_STACK
> >         select HAVE_SOFTIRQ_ON_OWN_STACK
>
> Side note: after this, we might want to investigate finishing off
> GENERIC_IRQ_MULTI_HANDLER for all architectures. The
> currently missing platforms are ARM_SINGLE_ARMV7M,
> ARCH_FOOTBRIDGE, ARCH_IOP32X and ARCH_RPC.
>
> These are a bit tricky (presumably this is why they are not converted
> yet), but it should be possible to do.
>
> >  static inline void set_current(struct task_struct *cur)
> >  {
> > -       if (!IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO))
> > +       if (!IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO) &&
> > +           !(IS_ENABLED(CONFIG_SMP) &&
> > +             IS_ENABLED(CONFIG_SMP_ON_UP) &&
> > +             smp_on_up)) {
>
> I think you can just use is_smp() here to simplify the condition. You might
> need to move the definition to a different header if that causes an #include
> loop.
>

OK

> > @@ -39,16 +67,30 @@ static inline struct task_struct *get_current(void)
> >          * https://github.com/ClangBuiltLinux/linux/issues/1485
> >          */
> >         cur = __builtin_thread_pointer();
> > +#elif defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) || defined(CONFIG_SMP)
> > +       asm("0: mrc p15, 0, %0, c13, c0, 3                      \n\t"
> > +#ifdef CONFIG_CPU_V6
> > +           "1:                                                 \n\t"
> > +           "   .subsection 1                                   \n\t"
> > +           "2: " LOAD_CURRENT
> > +           "   b       1b                                      \n\t"
> > +           "   .previous                                       \n\t"
> > +           "   .pushsection \".alt.smp.init\", \"a\"           \n\t"
> > +           "   .long   0b - .                                  \n\t"
> > +           "   b       . + (2b - 0b)                           \n\t"
> > +           "   .popsection                                     \n\t"
> > +#endif
>
> You mentioned earlier that this gets ugly with SMP_ON_UP on ARMv6, now
> I see what you meant ;-)
>
> I can see an increasing number of reasons for no longer supporting this
> option. As we recently discussed on IRC, this would affect omap2plus_defconfig,
> imx_v6_v7_defconfig and realview_defconfig, which would all have to drop
> either CPU_V6 or SMP. Since you got it working already, this also seems
> better left as a cleanup for another time once we can build consensus on it,
> but my guess is that at this point the benefits of removing it outweigh those
> of keeping it.
>

Agreed.


end of thread, other threads:[~2021-11-30  8:04 UTC | newest]

Thread overview: 11+ messages
2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
2021-11-26 10:10 ` [RFC PATCH 1/6] ARM: entry: preserve thread_info pointer in switch_to Ard Biesheuvel
2021-11-26 10:10 ` [RFC PATCH 2/6] ARM: module: implement support for PC-relative group relocations Ard Biesheuvel
2021-11-26 10:10 ` [RFC PATCH 3/6] ARM: percpu: add SMP_ON_UP support Ard Biesheuvel
2021-11-26 10:10 ` [RFC PATCH 4/6] ARM: smp: defer TPIDRURO update for SMP v6 configurations too Ard Biesheuvel
2021-11-26 10:10 ` [RFC PATCH 5/6] ARM: use TLS register for 'current' on !SMP as well Ard Biesheuvel
2021-11-26 10:10 ` [RFC PATCH 6/6] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems Ard Biesheuvel
2021-11-26 22:32   ` Arnd Bergmann
2021-11-30  8:00     ` Ard Biesheuvel
2021-11-27  0:20 ` [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Linus Walleij
2021-11-29 16:32 ` Nicolas Pitre
