* [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP
@ 2021-11-26 10:10 Ard Biesheuvel
2021-11-26 10:10 ` [RFC PATCH 1/6] ARM: entry: preserve thread_info pointer in switch_to Ard Biesheuvel
` (7 more replies)
0 siblings, 8 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-26 10:10 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann,
Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers,
Tony Lindgren
Enable the use of the TLS register to hold the 'current' pointer for all
configurations that can support it, including non-SMP ones that target
v6k or later CPUs, and multi-platform SMP ones that also support v6
based UP systems.
The remaining configurations are all strictly UP, which means we can
switch to a global variable to hold the current pointer. By doing this,
we can enable THREAD_INFO_IN_TASK, which moves thread info off the
stack, protecting it from overflows. It also permits us to enable IRQ
stacks and vmap'ed stacks for UP configurations as well.
Supporting v6 cores without SMP extensions in SMP configurations (e.g.,
omap2plus_defconfig or imx_v6_v7_defconfig) makes this a bit tricky, and
this is a feature we may consider dropping entirely in the future. But
for the time being, we can support this mode as well.
The accesses to the global variable holding 'current' are constructed in
a way that ensures that no literal pool accesses (and associated D-cache
misses) are needed unless the access is from a module and module PLTs
are enabled. This means that accessing 'current' is just as costly as
before, as it used to require some arithmetic involving the stack
pointer and a load from the thread_info::task field.
However, accessing thread_info itself now also involves a load, although
it should be noted that all thread_info and current accesses now go via
the same variable, which is therefore expected to be hot in the caches
at all times.
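For reference, the stack pointer arithmetic that the old scheme relied on can be sketched in plain C as below. This is an illustrative model rather than kernel code: SKETCH_THREAD_SIZE (8 KiB here) and the function name are assumptions made for the example.

```c
#include <stdint.h>

/* Assumed stack size for illustration only; the kernel derives THREAD_SIZE from its configuration. */
#define SKETCH_THREAD_SIZE 8192u

/*
 * Old scheme: thread_info sits at the base of the size-aligned task
 * stack, so masking the low bits of SP yields its address. Reading
 * 'current' then takes one more load, from thread_info::task.
 */
static uintptr_t thread_info_from_sp(uintptr_t sp)
{
	return sp & ~(uintptr_t)(SKETCH_THREAD_SIZE - 1);
}
```

With this series, 'current' and thread_info are both derived from one pointer instead, so thread_info no longer has to live on the stack.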
Cc: Russell King <linux@armlinux.org.uk>
Cc: Nicolas Pitre <nico@fluxnic.net>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Kees Cook <keescook@chromium.org>
Cc: Keith Packard <keithpac@amazon.com>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Tony Lindgren <tony@atomide.com>
Ard Biesheuvel (6):
ARM: entry: preserve thread_info pointer in switch_to
ARM: module: implement support for PC-relative group relocations
ARM: percpu: add SMP_ON_UP support
ARM: smp: defer TPIDRURO update for SMP v6 configurations too
ARM: use TLS register for 'current' on !SMP as well
ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems
arch/arm/Kconfig | 10 +-
arch/arm/include/asm/assembler.h | 136 +++++++++++++++-----
arch/arm/include/asm/current.h | 56 +++++++-
arch/arm/include/asm/elf.h | 3 +
arch/arm/include/asm/percpu.h | 29 ++++-
arch/arm/include/asm/switch_to.h | 3 +-
arch/arm/include/asm/thread_info.h | 27 ----
arch/arm/include/asm/tls.h | 4 +-
arch/arm/kernel/asm-offsets.c | 3 -
arch/arm/kernel/entry-armv.S | 26 ++--
arch/arm/kernel/entry-header.S | 8 +-
arch/arm/kernel/head-common.S | 4 +-
arch/arm/kernel/module.c | 63 +++++++++
arch/arm/kernel/process.c | 7 +-
arch/arm/kernel/traps.c | 4 +
arch/arm/mm/Kconfig | 1 +
16 files changed, 289 insertions(+), 95 deletions(-)
--
2.30.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* [RFC PATCH 1/6] ARM: entry: preserve thread_info pointer in switch_to
2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
@ 2021-11-26 10:10 ` Ard Biesheuvel
2021-11-26 10:10 ` [RFC PATCH 2/6] ARM: module: implement support for PC-relative group relocations Ard Biesheuvel
` (6 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-26 10:10 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann,
Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers,
Tony Lindgren
Tweak the UP stack protector handling code so that the thread info
pointer is preserved in R7 until set_current is called. This is needed
for a subsequent patch that implements THREAD_INFO_IN_TASK and
set_current for UP as well.
This also means we will prefer the per-task protector on UP systems that
implement the thread ID registers, so tweak the preprocessor
conditionals to reflect this.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm/kernel/entry-armv.S | 17 +++++++++--------
1 file changed, 9 insertions(+), 8 deletions(-)
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 5fb7465d14d9..1e26d69ebbf1 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -814,16 +814,16 @@ ENTRY(__switch_to)
ldr r6, [r2, #TI_CPU_DOMAIN]
#endif
switch_tls r1, r4, r5, r3, r7
-#if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_SMP)
- ldr r7, [r2, #TI_TASK]
+#if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_SMP) && \
+ !defined(CONFIG_STACKPROTECTOR_PER_TASK)
+ ldr r9, [r2, #TI_TASK]
ldr r8, =__stack_chk_guard
.if (TSK_STACK_CANARY > IMM12_MASK)
- add r7, r7, #TSK_STACK_CANARY & ~IMM12_MASK
+ add r9, r9, #TSK_STACK_CANARY & ~IMM12_MASK
.endif
- ldr r7, [r7, #TSK_STACK_CANARY & IMM12_MASK]
-#elif defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO)
- mov r7, r2 @ Preserve 'next'
+ ldr r9, [r9, #TSK_STACK_CANARY & IMM12_MASK]
#endif
+ mov r7, r2 @ Preserve 'next'
#ifdef CONFIG_CPU_USE_DOMAINS
mcr p15, 0, r6, c3, c0, 0 @ Set domain register
#endif
@@ -832,8 +832,9 @@ ENTRY(__switch_to)
ldr r0, =thread_notify_head
mov r1, #THREAD_NOTIFY_SWITCH
bl atomic_notifier_call_chain
-#if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_SMP)
- str r7, [r8]
+#if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_SMP) && \
+ !defined(CONFIG_STACKPROTECTOR_PER_TASK)
+ str r9, [r8]
#endif
mov r0, r5
#if !defined(CONFIG_THUMB2_KERNEL) && !defined(CONFIG_VMAP_STACK)
--
2.30.2
* [RFC PATCH 2/6] ARM: module: implement support for PC-relative group relocations
2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
2021-11-26 10:10 ` [RFC PATCH 1/6] ARM: entry: preserve thread_info pointer in switch_to Ard Biesheuvel
@ 2021-11-26 10:10 ` Ard Biesheuvel
2021-11-26 10:10 ` [RFC PATCH 3/6] ARM: percpu: add SMP_ON_UP support Ard Biesheuvel
` (5 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-26 10:10 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann,
Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers,
Tony Lindgren
Add support for the R_ARM_ALU_PC_Gn_NC and R_ARM_LDR_PC_G2 group
relocations so we can use them in modules. These will be used to load
the current task pointer from a global variable without having to rely
on a literal pool entry to carry the address of this variable, which
would be slightly less efficient.
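As a rough illustration of how group relocations carve up an offset, the sketch below splits a non-negative displacement into two 8-bit chunks (for the two ALU instructions) and a residual for the LDR. It is a simplified model and not the code from this patch: the helper names are hypothetical, the sign is assumed to be handled separately (ADD versus SUB, or the U bit of the LDR), and the chunks are returned as plain values rather than encoded rotated immediates.

```c
#include <stdint.h>

/* Index of the most significant set bit; v must be non-zero. */
static unsigned int top_bit(uint32_t v)
{
	unsigned int n = 0;

	while (v >>= 1)
		n++;
	return n;
}

/*
 * Remove the most significant 8-bit chunk from *rem and return it.
 * The chunk starts at an even bit position because ARM data-processing
 * immediates are an 8-bit value rotated by an even amount.
 */
static uint32_t take_group(uint32_t *rem)
{
	uint32_t val = *rem, mask;

	if (!val)
		return 0;
	mask = 0xff000000u >> ((31 - top_bit(val)) & ~1u);
	*rem = val & ~mask;
	return val & mask;
}

/*
 * Split an offset into G0 and G1 chunks plus a residual for the LDR.
 * Returns -1 if the residual does not fit the 12-bit LDR immediate.
 */
static int split_offset(uint32_t offset, uint32_t out[3])
{
	out[0] = take_group(&offset);
	out[1] = take_group(&offset);
	out[2] = offset;
	return offset > 0xfff ? -1 : 0;
}
```

Two 8-bit chunks plus a 12-bit residual cover 28 bits of magnitude, which is where a reach of roughly 256 MiB in either direction comes from.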
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm/include/asm/elf.h | 3 +
arch/arm/kernel/module.c | 63 ++++++++++++++++++++
2 files changed, 66 insertions(+)
diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
index b8102a6ddf16..d68101655b74 100644
--- a/arch/arm/include/asm/elf.h
+++ b/arch/arm/include/asm/elf.h
@@ -61,6 +61,9 @@ typedef struct user_fp elf_fpregset_t;
#define R_ARM_MOVT_ABS 44
#define R_ARM_MOVW_PREL_NC 45
#define R_ARM_MOVT_PREL 46
+#define R_ARM_ALU_PC_G0_NC 57
+#define R_ARM_ALU_PC_G1_NC 59
+#define R_ARM_LDR_PC_G2 63
#define R_ARM_THM_CALL 10
#define R_ARM_THM_JUMP24 30
diff --git a/arch/arm/kernel/module.c b/arch/arm/kernel/module.c
index c2354990290b..0c7f218f9012 100644
--- a/arch/arm/kernel/module.c
+++ b/arch/arm/kernel/module.c
@@ -68,6 +68,20 @@ bool module_exit_section(const char *name)
strstarts(name, ".ARM.exidx.exit");
}
+static u32 get_group_rem(u32 group, u32 *offset)
+{
+ u32 val = *offset;
+ u32 shift;
+ do {
+ shift = val ? (31 - __fls(val)) & ~1 : 32;
+ *offset = val;
+ if (!val)
+ break;
+ val &= 0xffffff >> shift;
+ } while (group--);
+ return shift;
+}
+
int
apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
unsigned int relindex, struct module *module)
@@ -82,6 +96,7 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
unsigned long loc;
Elf32_Sym *sym;
const char *symname;
+ u32 shift, group = 1;
s32 offset;
u32 tmp;
#ifdef CONFIG_THUMB2_KERNEL
@@ -212,6 +227,54 @@ apply_relocate(Elf32_Shdr *sechdrs, const char *strtab, unsigned int symindex,
*(u32 *)loc = __opcode_to_mem_arm(tmp);
break;
+ case R_ARM_ALU_PC_G0_NC:
+ group = 0;
+ fallthrough;
+ case R_ARM_ALU_PC_G1_NC:
+ tmp = __mem_to_opcode_arm(*(u32 *)loc);
+ offset = ror32(tmp & 0xff, (tmp & 0xf00) >> 7);
+ if (tmp & BIT(22))
+ offset = -offset;
+ offset += sym->st_value - loc;
+ if (offset < 0) {
+ offset = -offset;
+ tmp = (tmp & ~BIT(23)) | BIT(22); // SUB opcode
+ } else {
+ tmp = (tmp & ~BIT(22)) | BIT(23); // ADD opcode
+ }
+
+ shift = get_group_rem(group, &offset);
+ if (shift < 24) {
+ offset >>= 24 - shift;
+ offset |= (shift + 8) << 7;
+ }
+ *(u32 *)loc = __opcode_to_mem_arm((tmp & ~0xfff) | offset);
+ break;
+
+ case R_ARM_LDR_PC_G2:
+ tmp = __mem_to_opcode_arm(*(u32 *)loc);
+ offset = tmp & 0xfff;
+ if (~tmp & BIT(23)) // U bit cleared?
+ offset = -offset;
+ offset += sym->st_value - loc;
+ if (offset < 0) {
+ offset = -offset;
+ tmp &= ~BIT(23); // clear U bit
+ } else {
+ tmp |= BIT(23); // set U bit
+ }
+ get_group_rem(2, &offset);
+
+ if (offset > 0xfff) {
+ pr_err("%s: section %u reloc %u sym '%s': relocation %u out of range (%#lx -> %#x)\n",
+ module->name, relindex, i, symname,
+ ELF32_R_TYPE(rel->r_info), loc,
+ sym->st_value);
+ return -ENOEXEC;
+ }
+ *(u32 *)loc = __opcode_to_mem_arm((tmp & ~0xfff) | offset);
+ break;
+
#ifdef CONFIG_THUMB2_KERNEL
case R_ARM_THM_CALL:
case R_ARM_THM_JUMP24:
--
2.30.2
* [RFC PATCH 3/6] ARM: percpu: add SMP_ON_UP support
2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
2021-11-26 10:10 ` [RFC PATCH 1/6] ARM: entry: preserve thread_info pointer in switch_to Ard Biesheuvel
2021-11-26 10:10 ` [RFC PATCH 2/6] ARM: module: implement support for PC-relative group relocations Ard Biesheuvel
@ 2021-11-26 10:10 ` Ard Biesheuvel
2021-11-26 10:10 ` [RFC PATCH 4/6] ARM: smp: defer TPIDRURO update for SMP v6 configurations too Ard Biesheuvel
` (4 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-26 10:10 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann,
Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers,
Tony Lindgren
Permit the use of the TPIDRPRW system register for carrying the per-CPU
offset in generic SMP configurations that also target non-SMP capable
ARMv6 cores. This uses the SMP_ON_UP code patching framework to turn all
TPIDRPRW accesses into reads/writes of entry #0 in the __per_cpu_offset
array.
While at it, switch over some existing direct TPIDRPRW accesses in asm
code to invocations of a new helper that is patched in the same way when
necessary.
Note that CPU_V6+SMP without SMP_ON_UP results in a kernel that does not
boot on v6 CPUs without SMP extensions, so add this dependency to
Kconfig as well.
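The intended behaviour can be modelled in plain C roughly as follows. This is only a sketch: the real implementation patches instructions at boot via the SMP_ON_UP framework rather than testing a flag at run time, and every identifier below (including the variable standing in for the TPIDRPRW register) is hypothetical.

```c
#define SKETCH_NR_CPUS 4

static unsigned long sketch_per_cpu_offset[SKETCH_NR_CPUS]; /* models __per_cpu_offset[] */
static unsigned long sketch_tpidrprw;	/* models the TPIDRPRW register */
static int sketch_cpu_is_smp;		/* models the boot-time patching decision */

static void sketch_set_cpu_offset(unsigned long off)
{
	if (!sketch_cpu_is_smp) {
		/* UP fallback: only CPU 0 exists, so entry #0 is "this CPU". */
		sketch_per_cpu_offset[0] = off;
		return;
	}
	sketch_tpidrprw = off;
}

static unsigned long sketch_get_cpu_offset(void)
{
	return sketch_cpu_is_smp ? sketch_tpidrprw : sketch_per_cpu_offset[0];
}
```

On a core without the SMP extensions the register is never touched, which is the property the patched code needs to preserve.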
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm/include/asm/assembler.h | 21 ++++++++++++++
arch/arm/include/asm/percpu.h | 29 ++++++++++++++++++--
arch/arm/kernel/entry-armv.S | 4 +--
arch/arm/mm/Kconfig | 1 +
4 files changed, 50 insertions(+), 5 deletions(-)
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index 1b9d4df331aa..c4c1d5b2edf5 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -312,6 +312,27 @@ THUMB( fpreg .req r7 )
#define ALT_UP_B(label) b label
#endif
+ /*
+ * this_cpu_offset - load the per-CPU offset of this CPU into
+ * register 'rd'
+ */
+ .macro this_cpu_offset, rd:req
+#ifdef CONFIG_SMP
+ALT_SMP(mrc p15, 0, \rd, c13, c0, 4)
+#ifdef CONFIG_CPU_V6
+ALT_UP_B(.L1_\@)
+.L0_\@:
+ .subsection 1
+.L1_\@: ldr \rd, =__per_cpu_offset
+ ldr \rd, [\rd]
+ b .L0_\@
+ .previous
+#endif
+#else
+ mov \rd, #0
+#endif
+ .endm
+
/*
* Instruction barrier
*/
diff --git a/arch/arm/include/asm/percpu.h b/arch/arm/include/asm/percpu.h
index e2fcb3cfd3de..7b984352e402 100644
--- a/arch/arm/include/asm/percpu.h
+++ b/arch/arm/include/asm/percpu.h
@@ -5,15 +5,25 @@
#ifndef _ASM_ARM_PERCPU_H_
#define _ASM_ARM_PERCPU_H_
+#include <linux/threads.h>
+
register unsigned long current_stack_pointer asm ("sp");
/*
* Same as asm-generic/percpu.h, except that we store the per cpu offset
* in the TPIDRPRW. TPIDRPRW only exists on V6K and V7
*/
-#if defined(CONFIG_SMP) && !defined(CONFIG_CPU_V6)
+#ifdef CONFIG_SMP
+extern unsigned long __per_cpu_offset[NR_CPUS];
+extern unsigned int smp_on_up;
+
static inline void set_my_cpu_offset(unsigned long off)
{
+ if (IS_ENABLED(CONFIG_CPU_V6) && !smp_on_up) {
+ __per_cpu_offset[0] = off;
+ return;
+ }
+
/* Set TPIDRPRW */
asm volatile("mcr p15, 0, %0, c13, c0, 4" : : "r" (off) : "memory");
}
@@ -27,8 +37,21 @@ static inline unsigned long __my_cpu_offset(void)
* We want to allow caching the value, so avoid using volatile and
* instead use a fake stack read to hazard against barrier().
*/
- asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off)
- : "Q" (*(const unsigned long *)current_stack_pointer));
+ asm("0: mrc p15, 0, %0, c13, c0, 4 \n\t"
+#ifdef CONFIG_CPU_V6
+ "1: \n\t"
+ " .subsection 1 \n\t"
+ "2: ldr %0, =__per_cpu_offset \n\t"
+ " ldr %0, [%0] \n\t"
+ " b 1b \n\t"
+ " .previous \n\t"
+ " .pushsection \".alt.smp.init\", \"a\" \n\t"
+ " .long 0b - . \n\t"
+ " b . + (2b - 0b) \n\t"
+ " .popsection \n\t"
+#endif
+ : "=r" (off)
+ : "Q" (*(const unsigned long *)current_stack_pointer));
return off;
}
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 1e26d69ebbf1..09a9fe501094 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -41,7 +41,7 @@
mov r0, sp
#ifdef CONFIG_IRQSTACKS
mov_l r2, irq_stack_ptr @ Take base address
- mrc p15, 0, r3, c13, c0, 4 @ Get CPU offset
+ this_cpu_offset r3
#ifdef CONFIG_UNWINDER_ARM
mov fpreg, sp @ Preserve original SP
#else
@@ -884,7 +884,7 @@ __bad_stack:
THUMB( bx pc )
THUMB( nop )
THUMB( .arm )
- mrc p15, 0, ip, c13, c0, 4 @ Get per-CPU offset
+ this_cpu_offset ip
.globl overflow_stack_ptr
.reloc 0f, R_ARM_ALU_PC_G0_NC, overflow_stack_ptr
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 58afba346729..a91ff22c6c2e 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -386,6 +386,7 @@ config CPU_V6
select CPU_PABRT_V6
select CPU_THUMB_CAPABLE
select CPU_TLB_V6 if MMU
+ select SMP_ON_UP if SMP
# ARMv6k
config CPU_V6K
--
2.30.2
* [RFC PATCH 4/6] ARM: smp: defer TPIDRURO update for SMP v6 configurations too
2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
` (2 preceding siblings ...)
2021-11-26 10:10 ` [RFC PATCH 3/6] ARM: percpu: add SMP_ON_UP support Ard Biesheuvel
@ 2021-11-26 10:10 ` Ard Biesheuvel
2021-11-26 10:10 ` [RFC PATCH 5/6] ARM: use TLS register for 'current' on !SMP as well Ard Biesheuvel
` (3 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-26 10:10 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann,
Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers,
Tony Lindgren
Defer TPIDRURO updates for user space until exit for CPU_V6+SMP
configurations as well, so that we can decide at runtime whether to use
it to carry the current pointer, provided that we are running on a CPU
that actually implements this register. This is needed for
THREAD_INFO_IN_TASK support for UP systems, which requires that all SMP
capable systems use the TPIDRURO based access to 'current', as the only
remaining alternative will be a global variable, which only works on UP.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm/include/asm/tls.h | 4 +++-
arch/arm/kernel/entry-header.S | 8 +++++++-
2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/arch/arm/include/asm/tls.h b/arch/arm/include/asm/tls.h
index c3296499176c..9c0965c14a21 100644
--- a/arch/arm/include/asm/tls.h
+++ b/arch/arm/include/asm/tls.h
@@ -24,7 +24,9 @@
tst \tmp1, #HWCAP_TLS @ hardware TLS available?
streq \tp, [\tmp2, #-15] @ set TLS value at 0xffff0ff0
mrcne p15, 0, \tmp2, c13, c0, 2 @ get the user r/w register
+#ifndef CONFIG_SMP
mcrne p15, 0, \tp, c13, c0, 3 @ yes, set TLS register
+#endif
mcrne p15, 0, \tpuser, c13, c0, 2 @ set user r/w register
strne \tmp2, [\base, #TI_TP_VALUE + 4] @ save it
.endm
@@ -43,7 +45,7 @@
#elif defined(CONFIG_CPU_V6)
#define tls_emu 0
#define has_tls_reg (elf_hwcap & HWCAP_TLS)
-#define defer_tls_reg_update 0
+#define defer_tls_reg_update IS_ENABLED(CONFIG_SMP)
#define switch_tls switch_tls_v6
#elif defined(CONFIG_CPU_32v6K)
#define tls_emu 0
diff --git a/arch/arm/kernel/entry-header.S b/arch/arm/kernel/entry-header.S
index 81df2a3561ca..aea716c8b97c 100644
--- a/arch/arm/kernel/entry-header.S
+++ b/arch/arm/kernel/entry-header.S
@@ -292,12 +292,18 @@
.macro restore_user_regs, fast = 0, offset = 0
-#if defined(CONFIG_CPU_32v6K) && !defined(CONFIG_CPU_V6)
+#if defined(CONFIG_CPU_32v6K) || \
+ (defined(CONFIG_CPU_V6) && defined(CONFIG_SMP))
@ The TLS register update is deferred until return to user space so we
@ can use it for other things while running in the kernel
+#ifdef CONFIG_CPU_V6
+ALT_SMP(nop)
+ALT_UP_B(.L0_\@)
+#endif
get_thread_info r1
ldr r1, [r1, #TI_TP_VALUE]
mcr p15, 0, r1, c13, c0, 3 @ set TLS register
+.L0_\@:
#endif
uaccess_enable r1, isb=0
--
2.30.2
* [RFC PATCH 5/6] ARM: use TLS register for 'current' on !SMP as well
2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
` (3 preceding siblings ...)
2021-11-26 10:10 ` [RFC PATCH 4/6] ARM: smp: defer TPIDRURO update for SMP v6 configurations too Ard Biesheuvel
@ 2021-11-26 10:10 ` Ard Biesheuvel
2021-11-26 10:10 ` [RFC PATCH 6/6] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems Ard Biesheuvel
` (2 subsequent siblings)
7 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-26 10:10 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann,
Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers,
Tony Lindgren
Enable the use of the TLS register to hold the 'current' pointer also on
non-SMP configurations that target v6k or later CPUs. This will permit
the use of THREAD_INFO_IN_TASK as well as IRQ stacks and vmap'ed stacks
for such configurations.
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index e2ab72f2bf4a..61fc5cc03042 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1165,7 +1165,7 @@ config SMP_ON_UP
config CURRENT_POINTER_IN_TPIDRURO
def_bool y
- depends on SMP && CPU_32v6K && !CPU_V6
+ depends on CPU_32v6K && !CPU_V6
config IRQSTACKS
def_bool y
--
2.30.2
* [RFC PATCH 6/6] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems
2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
` (4 preceding siblings ...)
2021-11-26 10:10 ` [RFC PATCH 5/6] ARM: use TLS register for 'current' on !SMP as well Ard Biesheuvel
@ 2021-11-26 10:10 ` Ard Biesheuvel
2021-11-26 22:32 ` Arnd Bergmann
2021-11-27 0:20 ` [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Linus Walleij
2021-11-29 16:32 ` Nicolas Pitre
7 siblings, 1 reply; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-26 10:10 UTC (permalink / raw)
To: linux-arm-kernel
Cc: Ard Biesheuvel, Russell King, Nicolas Pitre, Arnd Bergmann,
Kees Cook, Keith Packard, Linus Walleij, Nick Desaulniers,
Tony Lindgren
On UP systems, only a single task can be 'current' at the same time,
which means we can use a global variable to track it. This means we can
also enable THREAD_INFO_IN_TASK for those systems, as in that case,
thread_info is accessed via current rather than the other way around,
removing the need to store thread_info at the base of the task stack.
This, in turn, permits us to enable IRQ stacks and vmap'ed stacks on UP
systems as well.
To partially mitigate the performance overhead of this arrangement, use
an ADD/ADD/LDR sequence with the appropriate PC-relative group
relocations to load the value of current when needed. This means that
accessing current will still only require a single load as before,
avoiding the need for a literal to carry the address of the global
variable in each function. However, accessing thread_info will now
require this load as well.
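A minimal C model of the arrangement described above might look like this. The struct layouts and all sketch_* names are assumptions for illustration; only the idea (one global pointer, with thread_info as the first member of the task) is taken from the description.

```c
struct sketch_thread_info {
	unsigned long flags;
	int preempt_count;
};

struct sketch_task {
	struct sketch_thread_info thread_info;	/* must stay the first member */
	int pid;
};

static struct sketch_task *sketch_current;	/* models the global '__current' */

static void sketch_set_current(struct sketch_task *next)
{
	sketch_current = next;
}

static struct sketch_task *sketch_get_current(void)
{
	return sketch_current;	/* the single load */
}

static struct sketch_thread_info *sketch_current_thread_info(void)
{
	/* Same address as the task, since thread_info is at offset 0. */
	return &sketch_get_current()->thread_info;
}
```

Because both accessors go through the same variable, a thread_info lookup costs the same load as a lookup of current, instead of pure stack pointer arithmetic.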
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
arch/arm/Kconfig | 8 +-
arch/arm/include/asm/assembler.h | 115 ++++++++++++++------
arch/arm/include/asm/current.h | 56 ++++++++--
arch/arm/include/asm/switch_to.h | 3 +-
arch/arm/include/asm/thread_info.h | 27 -----
arch/arm/kernel/asm-offsets.c | 3 -
arch/arm/kernel/entry-armv.S | 11 +-
arch/arm/kernel/head-common.S | 4 +-
arch/arm/kernel/process.c | 7 +-
arch/arm/kernel/traps.c | 4 +
10 files changed, 156 insertions(+), 82 deletions(-)
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 61fc5cc03042..3d7476ca4d94 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -126,8 +126,8 @@ config ARM
select PERF_USE_VMALLOC
select RTC_LIB
select SYS_SUPPORTS_APM_EMULATION
- select THREAD_INFO_IN_TASK if CURRENT_POINTER_IN_TPIDRURO
- select HAVE_ARCH_VMAP_STACK if MMU && THREAD_INFO_IN_TASK && (!LD_IS_LLD || LLD_VERSION >= 140000)
+ select THREAD_INFO_IN_TASK
+ select HAVE_ARCH_VMAP_STACK if MMU && (!LD_IS_LLD || LLD_VERSION >= 140000)
select TRACE_IRQFLAGS_SUPPORT if !CPU_V7M
# Above selects are sorted alphabetically; please add new ones
# according to that. Thanks.
@@ -1169,7 +1169,7 @@ config CURRENT_POINTER_IN_TPIDRURO
config IRQSTACKS
def_bool y
- depends on GENERIC_IRQ_MULTI_HANDLER && THREAD_INFO_IN_TASK
+ depends on GENERIC_IRQ_MULTI_HANDLER
select HAVE_IRQ_EXIT_ON_IRQ_STACK
select HAVE_SOFTIRQ_ON_OWN_STACK
@@ -1619,7 +1619,7 @@ config CC_HAVE_STACKPROTECTOR_TLS
config STACKPROTECTOR_PER_TASK
bool "Use a unique stack canary value for each task"
- depends on STACKPROTECTOR && THREAD_INFO_IN_TASK && !XIP_DEFLATED_DATA
+ depends on STACKPROTECTOR && CURRENT_POINTER_IN_TPIDRURO && !XIP_DEFLATED_DATA
depends on GCC_PLUGINS || CC_HAVE_STACKPROTECTOR_TLS
select GCC_PLUGIN_ARM_SSP_PER_TASK if !CC_HAVE_STACKPROTECTOR_TLS
default y
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index c4c1d5b2edf5..978fdaaac680 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -203,43 +203,12 @@ THUMB( fpreg .req r7 )
.endm
.endr
- .macro get_current, rd
-#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRURO
- mrc p15, 0, \rd, c13, c0, 3 @ get TPIDRURO register
-#else
- get_thread_info \rd
- ldr \rd, [\rd, #TI_TASK]
-#endif
- .endm
-
- .macro set_current, rn
-#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRURO
- mcr p15, 0, \rn, c13, c0, 3 @ set TPIDRURO register
-#endif
- .endm
-
- .macro reload_current, t1:req, t2:req
-#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRURO
- adr_l \t1, __entry_task @ get __entry_task base address
- mrc p15, 0, \t2, c13, c0, 4 @ get per-CPU offset
- ldr \t1, [\t1, \t2] @ load variable
- mcr p15, 0, \t1, c13, c0, 3 @ store in TPIDRURO
-#endif
- .endm
-
/*
* Get current thread_info.
*/
.macro get_thread_info, rd
-#ifdef CONFIG_THREAD_INFO_IN_TASK
/* thread_info is the first member of struct task_struct */
get_current \rd
-#else
- ARM( mov \rd, sp, lsr #THREAD_SIZE_ORDER + PAGE_SHIFT )
- THUMB( mov \rd, sp )
- THUMB( lsr \rd, \rd, #THREAD_SIZE_ORDER + PAGE_SHIFT )
- mov \rd, \rd, lsl #THREAD_SIZE_ORDER + PAGE_SHIFT
-#endif
.endm
/*
@@ -333,6 +302,90 @@ ALT_UP_B(.L1_\@)
#endif
.endm
+ /*
+ * load_current - load the current task pointer from the global
+ * variable '__current'
+ */
+ .macro load_current, rd
+#if defined(CONFIG_THUMB2_KERNEL) || \
+ (defined(MODULE) && defined(CONFIG_ARM_MODULE_PLTS)) || \
+ (defined(CONFIG_LD_IS_LLD) && CONFIG_LLD_VERSION < 140000)
+ mov_l \rd, __current
+ ldr \rd, [\rd]
+#else
+ /*
+ * Avoid a literal load, by emitting a sequence of ADD/LDR instructions
+ * with the appropriate relocations. The combined sequence has a range
+ * of -/+ 256 MiB, which should be sufficient for the core kernel and
+ * for modules loaded into the module region.
+ */
+ .globl __current
+ .reloc .L0_\@, R_ARM_ALU_PC_G0_NC, __current
+ .reloc .L1_\@, R_ARM_ALU_PC_G1_NC, __current
+ .reloc .L2_\@, R_ARM_LDR_PC_G2, __current
+.L0_\@: sub \rd, pc, #8
+.L1_\@: sub \rd, \rd, #4
+.L2_\@: ldr \rd, [\rd, #0]
+#endif
+ .endm
+
+ /*
+ * set_current - store the task pointer of this CPU's current task
+ */
+ .macro set_current, rn:req, tmp:req
+#if defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) || defined(CONFIG_SMP)
+9998: mcr p15, 0, \rn, c13, c0, 3 @ set TPIDRURO register
+#ifdef CONFIG_CPU_V6
+ALT_UP_B(.L1_\@)
+.L0_\@:
+ .subsection 1
+.L1_\@: ldr \tmp, =__current
+ str \rn, [\tmp]
+ b .L0_\@
+ .previous
+#endif
+#else
+ str_l \rn, __current, \tmp
+#endif
+ .endm
+
+ /*
+ * get_current - load the task pointer of this CPU's current task
+ */
+ .macro get_current, rd
+#if defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) || defined(CONFIG_SMP)
+9998: mrc p15, 0, \rd, c13, c0, 3 @ get TPIDRURO register
+#ifdef CONFIG_CPU_V6
+ALT_UP_B(.L1_\@)
+.L0_\@:
+ .subsection 1
+.L1_\@: load_current \rd
+ b .L0_\@
+ .previous
+#endif
+#else
+ load_current \rd
+#endif
+ .endm
+
+ /*
+ * reload_current - reload the task pointer of this CPU's current task
+ * into the TLS register
+ */
+ .macro reload_current, t1:req, t2:req
+#if defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) || defined(CONFIG_SMP)
+#ifdef CONFIG_CPU_V6
+ALT_SMP(nop)
+ALT_UP_B(.L0_\@)
+#endif
+ adr_l \t1, __entry_task @ get __entry_task base address
+ mrc p15, 0, \t2, c13, c0, 4 @ get per-CPU offset
+ ldr \t1, [\t1, \t2] @ load variable
+ mcr p15, 0, \t1, c13, c0, 3 @ store in TPIDRURO
+.L0_\@:
+#endif
+ .endm
+
/*
* Instruction barrier
*/
diff --git a/arch/arm/include/asm/current.h b/arch/arm/include/asm/current.h
index 6bf0aad672c3..68d6907c9d54 100644
--- a/arch/arm/include/asm/current.h
+++ b/arch/arm/include/asm/current.h
@@ -11,22 +11,50 @@
struct task_struct;
+extern struct task_struct *__current;
+extern unsigned int smp_on_up;
+
static inline void set_current(struct task_struct *cur)
{
- if (!IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO))
+ if (!IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO) &&
+ !(IS_ENABLED(CONFIG_SMP) &&
+ IS_ENABLED(CONFIG_SMP_ON_UP) &&
+ smp_on_up)) {
+ __current = cur;
return;
+ }
/* Set TPIDRURO */
asm("mcr p15, 0, %0, c13, c0, 3" :: "r"(cur) : "memory");
}
-#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRURO
+/*
+ * Avoid a literal load by emitting a sequence of ADD/LDR instructions with the
+ * appropriate relocations. The combined sequence has a range of -/+ 256 MiB,
+ * which should be sufficient for the core kernel as well as modules loaded
+ * into the module region. (Not supported by LLD before release 14)
+ */
+#if !defined(CONFIG_LD_IS_LLD) || CONFIG_LLD_VERSION >= 140000
+#define LOAD_CURRENT \
+ " .globl __current \n\t" \
+ " .reloc 10f, R_ARM_ALU_PC_G0_NC, __current \n\t" \
+ " .reloc 11f, R_ARM_ALU_PC_G1_NC, __current \n\t" \
+ " .reloc 12f, R_ARM_LDR_PC_G2, __current \n\t" \
+ "10: sub %0, pc, #8 \n\t" \
+ "11: sub %0, %0, #4 \n\t" \
+ "12: ldr %0, [%0, #0] \n\t"
+#else
+#define LOAD_CURRENT \
+ " ldr %0, =__current \n\t" \
+ " ldr %0, [%0] \n\t"
+#endif
-static inline struct task_struct *get_current(void)
+static inline __attribute_const__ struct task_struct *get_current(void)
{
struct task_struct *cur;
#if __has_builtin(__builtin_thread_pointer) && \
+ defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) && \
!(defined(CONFIG_THUMB2_KERNEL) && \
defined(CONFIG_CC_IS_CLANG) && CONFIG_CLANG_VERSION < 130001)
/*
@@ -39,16 +67,30 @@ static inline struct task_struct *get_current(void)
* https://github.com/ClangBuiltLinux/linux/issues/1485
*/
cur = __builtin_thread_pointer();
+#elif defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) || defined(CONFIG_SMP)
+ asm("0: mrc p15, 0, %0, c13, c0, 3 \n\t"
+#ifdef CONFIG_CPU_V6
+ "1: \n\t"
+ " .subsection 1 \n\t"
+ "2: " LOAD_CURRENT
+ " b 1b \n\t"
+ " .previous \n\t"
+ " .pushsection \".alt.smp.init\", \"a\" \n\t"
+ " .long 0b - . \n\t"
+ " b . + (2b - 0b) \n\t"
+ " .popsection \n\t"
+#endif
+ : "=r"(cur));
+#elif defined(CONFIG_THUMB2_KERNEL) || \
+ (defined(MODULE) && defined(CONFIG_ARM_MODULE_PLTS))
+ cur = __current;
#else
- asm("mrc p15, 0, %0, c13, c0, 3" : "=r"(cur));
+ asm(LOAD_CURRENT : "=r"(cur));
#endif
return cur;
}
#define current get_current()
-#else
-#include <asm-generic/current.h>
-#endif /* CONFIG_CURRENT_POINTER_IN_TPIDRURO */
#endif /* __ASSEMBLY__ */
diff --git a/arch/arm/include/asm/switch_to.h b/arch/arm/include/asm/switch_to.h
index b55c7b2755e4..a482c99934ff 100644
--- a/arch/arm/include/asm/switch_to.h
+++ b/arch/arm/include/asm/switch_to.h
@@ -40,7 +40,8 @@ static inline void set_ti_cpu(struct task_struct *p)
do { \
__complete_pending_tlbi(); \
set_ti_cpu(next); \
- if (IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO)) \
+ if (IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO) || \
+ IS_ENABLED(CONFIG_SMP)) \
__this_cpu_write(__entry_task, next); \
last = __switch_to(prev,task_thread_info(prev), task_thread_info(next)); \
} while (0)
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 004b89d86224..aecc403b2880 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -62,9 +62,6 @@ struct cpu_context_save {
struct thread_info {
unsigned long flags; /* low level flags */
int preempt_count; /* 0 => preemptable, <0 => bug */
-#ifndef CONFIG_THREAD_INFO_IN_TASK
- struct task_struct *task; /* main task structure */
-#endif
__u32 cpu; /* cpu */
__u32 cpu_domain; /* cpu domain */
struct cpu_context_save cpu_context; /* cpu context */
@@ -80,39 +77,15 @@ struct thread_info {
#define INIT_THREAD_INFO(tsk) \
{ \
- INIT_THREAD_INFO_TASK(tsk) \
.flags = 0, \
.preempt_count = INIT_PREEMPT_COUNT, \
}
-#ifdef CONFIG_THREAD_INFO_IN_TASK
-#define INIT_THREAD_INFO_TASK(tsk)
-
static inline struct task_struct *thread_task(struct thread_info* ti)
{
return (struct task_struct *)ti;
}
-#else
-#define INIT_THREAD_INFO_TASK(tsk) .task = &(tsk),
-
-static inline struct task_struct *thread_task(struct thread_info* ti)
-{
- return ti->task;
-}
-
-/*
- * how to get the thread information struct from C
- */
-static inline struct thread_info *current_thread_info(void) __attribute_const__;
-
-static inline struct thread_info *current_thread_info(void)
-{
- return (struct thread_info *)
- (current_stack_pointer & ~(THREAD_SIZE - 1));
-}
-#endif
-
#define thread_saved_pc(tsk) \
((unsigned long)(task_thread_info(tsk)->cpu_context.pc))
#define thread_saved_sp(tsk) \
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 645845e4982a..2c8d76fd7c66 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -43,9 +43,6 @@ int main(void)
BLANK();
DEFINE(TI_FLAGS, offsetof(struct thread_info, flags));
DEFINE(TI_PREEMPT, offsetof(struct thread_info, preempt_count));
-#ifndef CONFIG_THREAD_INFO_IN_TASK
- DEFINE(TI_TASK, offsetof(struct thread_info, task));
-#endif
DEFINE(TI_CPU, offsetof(struct thread_info, cpu));
DEFINE(TI_CPU_DOMAIN, offsetof(struct thread_info, cpu_domain));
DEFINE(TI_CPU_SAVE, offsetof(struct thread_info, cpu_context));
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 09a9fe501094..5f3b882d53b7 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -816,12 +816,13 @@ ENTRY(__switch_to)
switch_tls r1, r4, r5, r3, r7
#if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_SMP) && \
!defined(CONFIG_STACKPROTECTOR_PER_TASK)
- ldr r9, [r2, #TI_TASK]
ldr r8, =__stack_chk_guard
.if (TSK_STACK_CANARY > IMM12_MASK)
- add r9, r9, #TSK_STACK_CANARY & ~IMM12_MASK
- .endif
+ add r9, r2, #TSK_STACK_CANARY & ~IMM12_MASK
ldr r9, [r9, #TSK_STACK_CANARY & IMM12_MASK]
+ .else
+ ldr r9, [r2, #TSK_STACK_CANARY & IMM12_MASK]
+ .endif
#endif
mov r7, r2 @ Preserve 'next'
#ifdef CONFIG_CPU_USE_DOMAINS
@@ -838,7 +839,7 @@ ENTRY(__switch_to)
#endif
mov r0, r5
#if !defined(CONFIG_THUMB2_KERNEL) && !defined(CONFIG_VMAP_STACK)
- set_current r7
+ set_current r7, r8
ldmia r4, {r4 - sl, fp, sp, pc} @ Load all regs saved previously
#else
mov r1, r7
@@ -860,7 +861,7 @@ ENTRY(__switch_to)
@ switches us to another stack, with few other side effects. In order
@ to prevent this distinction from causing any inconsistencies, let's
@ keep the 'set_current' call as close as we can to the update of SP.
- set_current r1
+ set_current r1, r2
mov sp, ip
ret lr
#endif
diff --git a/arch/arm/kernel/head-common.S b/arch/arm/kernel/head-common.S
index da18e0a17dc2..42cae73fcc19 100644
--- a/arch/arm/kernel/head-common.S
+++ b/arch/arm/kernel/head-common.S
@@ -105,10 +105,8 @@ __mmap_switched:
mov r1, #0
bl __memset @ clear .bss
-#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRURO
adr_l r0, init_task @ get swapper task_struct
- set_current r0
-#endif
+ set_current r0, r1
ldmia r4, {r0, r1, r2, r3}
str r9, [r0] @ Save processor ID
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index d47159f3791c..0617af11377f 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -36,7 +36,7 @@
#include "signal.h"
-#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRURO
+#if defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) || defined(CONFIG_SMP)
DEFINE_PER_CPU(struct task_struct *, __entry_task);
#endif
@@ -46,6 +46,11 @@ unsigned long __stack_chk_guard __read_mostly;
EXPORT_SYMBOL(__stack_chk_guard);
#endif
+#ifndef CONFIG_CURRENT_POINTER_IN_TPIDRURO
+asmlinkage struct task_struct *__current;
+EXPORT_SYMBOL(__current);
+#endif
+
static const char *processor_modes[] __maybe_unused = {
"USER_26", "FIQ_26" , "IRQ_26" , "SVC_26" , "UK4_26" , "UK5_26" , "UK6_26" , "UK7_26" ,
"UK8_26" , "UK9_26" , "UK10_26", "UK11_26", "UK12_26", "UK13_26", "UK14_26", "UK15_26",
diff --git a/arch/arm/kernel/traps.c b/arch/arm/kernel/traps.c
index b28a705c49cb..3f38357efc46 100644
--- a/arch/arm/kernel/traps.c
+++ b/arch/arm/kernel/traps.c
@@ -865,7 +865,9 @@ early_initcall(allocate_overflow_stacks);
asmlinkage void handle_bad_stack(struct pt_regs *regs)
{
unsigned long tsk_stk = (unsigned long)current->stack;
+#ifdef CONFIG_IRQSTACKS
unsigned long irq_stk = (unsigned long)this_cpu_read(irq_stack_ptr);
+#endif
unsigned long ovf_stk = (unsigned long)this_cpu_read(overflow_stack_ptr);
console_verbose();
@@ -873,8 +875,10 @@ asmlinkage void handle_bad_stack(struct pt_regs *regs)
pr_emerg("Task stack: [0x%08lx..0x%08lx]\n",
tsk_stk, tsk_stk + THREAD_SIZE);
+#ifdef CONFIG_IRQSTACKS
pr_emerg("IRQ stack: [0x%08lx..0x%08lx]\n",
irq_stk - THREAD_SIZE, irq_stk);
+#endif
pr_emerg("Overflow stack: [0x%08lx..0x%08lx]\n",
ovf_stk - OVERFLOW_STACK_SIZE, ovf_stk);
--
2.30.2
_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
* Re: [RFC PATCH 6/6] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems
2021-11-26 10:10 ` [RFC PATCH 6/6] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems Ard Biesheuvel
@ 2021-11-26 22:32 ` Arnd Bergmann
2021-11-30 8:00 ` Ard Biesheuvel
0 siblings, 1 reply; 11+ messages in thread
From: Arnd Bergmann @ 2021-11-26 22:32 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: Linux ARM, Russell King, Nicolas Pitre, Arnd Bergmann, Kees Cook,
Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren
On Fri, Nov 26, 2021 at 11:10 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> @@ -1169,7 +1169,7 @@ config CURRENT_POINTER_IN_TPIDRURO
>
> config IRQSTACKS
> def_bool y
> - depends on GENERIC_IRQ_MULTI_HANDLER && THREAD_INFO_IN_TASK
> + depends on GENERIC_IRQ_MULTI_HANDLER
> select HAVE_IRQ_EXIT_ON_IRQ_STACK
> select HAVE_SOFTIRQ_ON_OWN_STACK
Side note: after this, we might want to investigate finishing off
GENERIC_IRQ_MULTI_HANDLER for all architectures. The
currently missing platforms are ARM_SINGLE_ARMV7M,
ARCH_FOOTBRIDGE, ARCH_IOP32X and ARCH_RPC.
These are a bit tricky (presumably this is why they are not converted
yet), but it should be possible to do.
> static inline void set_current(struct task_struct *cur)
> {
> - if (!IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO))
> + if (!IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO) &&
> + !(IS_ENABLED(CONFIG_SMP) &&
> + IS_ENABLED(CONFIG_SMP_ON_UP) &&
> + smp_on_up)) {
I think you can just use is_smp() here to simplify the condition. You might
need to move the definition to a different header if that causes an #include
loop.
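A minimal userspace sketch of the simplification suggested above, not kernel code. The `struct kconfig` fields and `model_is_smp()` are hypothetical stand-ins for the Kconfig symbols and for `is_smp()`. The helper assumes `is_smp()` returns false on !SMP builds, the runtime `smp_on_up` value when SMP_ON_UP is enabled, and true otherwise. Under that assumption the two conditions differ only for an SMP build without SMP_ON_UP and without CURRENT_POINTER_IN_TPIDRURO, a combination this sketch assumes Kconfig rules out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the Kconfig symbols used in the condition. */
struct kconfig {
	bool tpidruro;      /* CONFIG_CURRENT_POINTER_IN_TPIDRURO */
	bool smp;           /* CONFIG_SMP */
	bool smp_on_up_opt; /* CONFIG_SMP_ON_UP */
};

/* Condition as posted: true when 'current' lives in the global variable. */
static bool use_global_posted(const struct kconfig *c, unsigned int smp_on_up)
{
	return !c->tpidruro &&
	       !(c->smp && c->smp_on_up_opt && smp_on_up);
}

/* Assumed behaviour of is_smp(); modelled here, not copied from the kernel. */
static bool model_is_smp(const struct kconfig *c, unsigned int smp_on_up)
{
	if (!c->smp)
		return false;
	if (c->smp_on_up_opt)
		return smp_on_up != 0;
	return true;
}

/* Condition rewritten around the is_smp() model. */
static bool use_global_simplified(const struct kconfig *c, unsigned int smp_on_up)
{
	return !c->tpidruro && !model_is_smp(c, smp_on_up);
}

/* Count the configurations in which the two conditions disagree. */
static int count_mismatches(void)
{
	int mismatches = 0;

	for (int bits = 0; bits < 16; bits++) {
		struct kconfig c = { bits & 1, bits & 2, bits & 4 };
		unsigned int smp_on_up = (bits & 8) ? 1 : 0;

		/* Skip the combination assumed to be unreachable via Kconfig. */
		if (c.smp && !c.smp_on_up_opt && !c.tpidruro)
			continue;
		if (use_global_posted(&c, smp_on_up) !=
		    use_global_simplified(&c, smp_on_up))
			mismatches++;
	}
	return mismatches;
}

/* Aborts if the two forms disagree anywhere outside the skipped case. */
static void check_equivalence(void)
{
	assert(count_mismatches() == 0);
}
```

If the skipped combination can in fact be selected, the two forms are not interchangeable and the posted condition would have to stay.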
> @@ -39,16 +67,30 @@ static inline struct task_struct *get_current(void)
> * https://github.com/ClangBuiltLinux/linux/issues/1485
> */
> cur = __builtin_thread_pointer();
> +#elif defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) || defined(CONFIG_SMP)
> + asm("0: mrc p15, 0, %0, c13, c0, 3 \n\t"
> +#ifdef CONFIG_CPU_V6
> + "1: \n\t"
> + " .subsection 1 \n\t"
> + "2: " LOAD_CURRENT
> + " b 1b \n\t"
> + " .previous \n\t"
> + " .pushsection \".alt.smp.init\", \"a\" \n\t"
> + " .long 0b - . \n\t"
> + " b . + (2b - 0b) \n\t"
> + " .popsection \n\t"
> +#endif
You mentioned earlier that this gets ugly with SMP_ON_UP on ARMv6, now
I see what you meant ;-)
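For readers unfamiliar with the construct in the quoted asm, here is a rough userspace model of what the `.alt.smp.init` entry appears to record: a self-relative offset to the instruction at label `0` plus a replacement instruction. The sketch assumes that boot code walks this table on a UP system and overwrites each recorded site with its replacement. The array layout, the function names and the placeholder instruction words are all hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define MEM_WORDS 16

/*
 * Apply a table of (self-relative byte offset, replacement word) pairs that
 * lives in the same word array as the code it patches. This only models the
 * idea behind .alt.smp.init; it is not the kernel's fixup routine.
 */
static void apply_up_fixups(uint32_t *mem, int table_start, int table_end)
{
	for (int i = table_start; i + 1 < table_end; i += 2) {
		int32_t rel = (int32_t)mem[i];  /* models ".long 0b - ." */
		int target = i + rel / 4;       /* word index of the patch site */

		assert(target >= 0 && target < MEM_WORDS);
		mem[target] = mem[i + 1];       /* models the "b ..." replacement */
	}
}

/* Build a tiny image, patch it, and return the word at the patch site. */
static uint32_t demo_patch(void)
{
	uint32_t mem[MEM_WORDS] = { 0 };
	const int site = 2;   /* where the "mrc" placeholder lives */
	const int entry = 8;  /* where the table entry lives */

	mem[site] = 0x11111111u;                      /* placeholder for the mrc */
	mem[entry] = (uint32_t)((site - entry) * 4);  /* offset relative to entry */
	mem[entry + 1] = 0x22222222u;                 /* placeholder for the branch */

	apply_up_fixups(mem, entry, entry + 2);
	return mem[site];
}
```

Storing the offset relative to the entry itself keeps the table position independent, which is presumably why the asm uses `0b - .` rather than an absolute address.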
I can see an increasing number of reasons for no longer supporting this
option. As we recently discussed on IRC, this would affect omap2plus_defconfig,
imx_v6_v7_defconfig and realview_defconfig, which would all have to drop
either CPU_V6 or SMP. Since you got it working already, this also seems
better left as a cleanup for another time once we can build consensus on it,
but my guess is that at this point the benefits of removing it outweigh those
of keeping it.
Arnd
* Re: [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP
2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
` (5 preceding siblings ...)
2021-11-26 10:10 ` [RFC PATCH 6/6] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems Ard Biesheuvel
@ 2021-11-27 0:20 ` Linus Walleij
2021-11-29 16:32 ` Nicolas Pitre
7 siblings, 0 replies; 11+ messages in thread
From: Linus Walleij @ 2021-11-27 0:20 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, Russell King, Nicolas Pitre, Arnd Bergmann,
Kees Cook, Keith Packard, Nick Desaulniers, Tony Lindgren
On Fri, Nov 26, 2021 at 11:10 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> Enable the use of the TLS register to hold the 'current' pointer for all
> configurations that can support it, including non-SMP ones that target
> v6k or later CPUs, and multi-platform SMP ones that also support v6
> based UP systems.
>
> The remaining configurations are all strictly UP, which means we can
> switch to a global variable to hold the current pointer. By doing this,
> we can enable THREAD_INFO_IN_TASK, which moves thread info off the
> stack, protecting it from overflows. It also permits us to enable IRQ
> stacks and vmap'ed stacks for UP configurations as well.
I really like what I see here!
I glanced over it but sadly do not have sufficient time to read every
detail of it, but I certainly trust you to get things right and iron out any
rough corners so FWIW:
Acked-by: Linus Walleij <linus.walleij@linaro.org>
on this patch set.
> Supporting v6 cores without SMP extensions in SMP configurations (e.g.,
> omap2plus_defconfig or imx_v6_v7_defconfig) makes this a bit tricky, and
> this is a feature we may consider dropping entirely in the future. But
> for the time being, we can support this mode as well.
Hmmm yes these will look odd and I can see it really makes the
patch hairy too. I think people might be using them though :/
Yours,
Linus Walleij
* Re: [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP
2021-11-26 10:10 [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Ard Biesheuvel
` (6 preceding siblings ...)
2021-11-27 0:20 ` [RFC PATCH 0/6] ARM: enable IRQ stacks and vmap'ed stacks for UP Linus Walleij
@ 2021-11-29 16:32 ` Nicolas Pitre
7 siblings, 0 replies; 11+ messages in thread
From: Nicolas Pitre @ 2021-11-29 16:32 UTC (permalink / raw)
To: Ard Biesheuvel
Cc: linux-arm-kernel, Russell King, Arnd Bergmann, Kees Cook,
Keith Packard, Linus Walleij, Nick Desaulniers, Tony Lindgren
On Fri, 26 Nov 2021, Ard Biesheuvel wrote:
> Enable the use of the TLS register to hold the 'current' pointer for all
> configurations that can support it, including non-SMP ones that target
> v6k or later CPUs, and multi-platform SMP ones that also support v6
> based UP systems.
>
> The remaining configurations are all strictly UP, which means we can
> switch to a global variable to hold the current pointer. By doing this,
> we can enable THREAD_INFO_IN_TASK, which moves thread info off the
> stack, protecting it from overflows. It also permits us to enable IRQ
> stacks and vmap'ed stacks for UP configurations as well.
>
> Supporting v6 cores without SMP extensions in SMP configurations (e.g.,
> omap2plus_defconfig or imx_v6_v7_defconfig) makes this a bit tricky, and
> this is a feature we may consider dropping entirely in the future. But
> for the time being, we can support this mode as well.
>
> The accesses to the global variable holding 'current' are constructed in
> a way that ensures that no literal pool accesses (and associated D-cache
> misses) are needed unless the access is from a module and module PLTs
> are enabled. This means that accessing 'current' is just as costly as
> before, as it used to require some arithmetic involving the stack
> pointer and a load from the thread_info::task field.
>
> However, accessing thread_info itself now also involves a load, although
> it should be noted that all thread_info and current accesses now go via
> the same variable, which is therefore expected to be hot in the caches
> at all times.
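The cost comparison in the quoted paragraphs can be sketched in plain C. This is a userspace model under stated assumptions, not the kernel implementation: `THREAD_SIZE` is given an illustrative value, and the `*_old` / `*_new` types, `model_current` and the demo functions are hypothetical names. The old scheme masks the stack pointer to find `thread_info` and then loads its `task` field; the new UP scheme loads one global and finds `thread_info` inside the task.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define THREAD_SIZE 8192u /* illustrative value, assumed for this sketch */

/* Old layout: thread_info at the base of the stack, pointing at the task. */
struct task_old { int pid; };
struct thread_info_old { struct task_old *task; };

static struct task_old *old_get_current(uintptr_t sp)
{
	struct thread_info_old *ti = (struct thread_info_old *)
		(sp & ~(uintptr_t)(THREAD_SIZE - 1)); /* mask the stack pointer */

	return ti->task;                              /* load thread_info::task */
}

/* New UP layout: thread_info embedded in the task, one global for 'current'. */
struct thread_info_new { unsigned long flags; };
struct task_new { struct thread_info_new ti; int pid; };

static struct task_new *model_current;

static struct thread_info_new *new_current_thread_info(void)
{
	return &model_current->ti; /* one load of the global, then an offset */
}

/* Place a thread_info at the base of an aligned stack and look it up. */
static int demo_old_lookup(void)
{
	struct task_old t = { .pid = 42 };
	unsigned char *stack = aligned_alloc(THREAD_SIZE, THREAD_SIZE);
	int pid;

	if (!stack)
		return -1;
	((struct thread_info_old *)stack)->task = &t;
	/* Any address inside the stack masks down to the same base. */
	pid = old_get_current((uintptr_t)stack + THREAD_SIZE - 64)->pid;
	free(stack);
	return pid;
}

/* Point the global at a task and read its embedded thread_info. */
static unsigned long demo_new_lookup(void)
{
	static struct task_new t = { .ti = { .flags = 7 }, .pid = 42 };

	model_current = &t;
	assert((void *)new_current_thread_info() == (void *)model_current);
	return new_current_thread_info()->flags;
}
```

In the old layout a stack overflow can run into `thread_info` at the stack base; in the new layout nothing task-related lives on the stack, which is the protection the cover letter refers to.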
Looks nice overall.
Acked-by: Nicolas Pitre <nico@fluxnic.net>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Nicolas Pitre <nico@fluxnic.net>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Keith Packard <keithpac@amazon.com>
> Cc: Linus Walleij <linus.walleij@linaro.org>
> Cc: Nick Desaulniers <ndesaulniers@google.com>
> Cc: Tony Lindgren <tony@atomide.com>
>
> Ard Biesheuvel (6):
> ARM: entry: preserve thread_info pointer in switch_to
> ARM: module: implement support for PC-relative group relocations
> ARM: percpu: add SMP_ON_UP support
> ARM: smp: defer TPIDRURO update for SMP v6 configurations too
> ARM: use TLS register for 'current' on !SMP as well
> ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems
>
> arch/arm/Kconfig | 10 +-
> arch/arm/include/asm/assembler.h | 136 +++++++++++++++-----
> arch/arm/include/asm/current.h | 56 +++++++-
> arch/arm/include/asm/elf.h | 3 +
> arch/arm/include/asm/percpu.h | 29 ++++-
> arch/arm/include/asm/switch_to.h | 3 +-
> arch/arm/include/asm/thread_info.h | 27 ----
> arch/arm/include/asm/tls.h | 4 +-
> arch/arm/kernel/asm-offsets.c | 3 -
> arch/arm/kernel/entry-armv.S | 26 ++--
> arch/arm/kernel/entry-header.S | 8 +-
> arch/arm/kernel/head-common.S | 4 +-
> arch/arm/kernel/module.c | 63 +++++++++
> arch/arm/kernel/process.c | 7 +-
> arch/arm/kernel/traps.c | 4 +
> arch/arm/mm/Kconfig | 1 +
> 16 files changed, 289 insertions(+), 95 deletions(-)
>
> --
> 2.30.2
>
>
* Re: [RFC PATCH 6/6] ARM: implement THREAD_INFO_IN_TASK for uniprocessor systems
2021-11-26 22:32 ` Arnd Bergmann
@ 2021-11-30 8:00 ` Ard Biesheuvel
0 siblings, 0 replies; 11+ messages in thread
From: Ard Biesheuvel @ 2021-11-30 8:00 UTC (permalink / raw)
To: Arnd Bergmann
Cc: Linux ARM, Russell King, Nicolas Pitre, Kees Cook, Keith Packard,
Linus Walleij, Nick Desaulniers, Tony Lindgren
On Fri, 26 Nov 2021 at 23:32, Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Fri, Nov 26, 2021 at 11:10 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> > @@ -1169,7 +1169,7 @@ config CURRENT_POINTER_IN_TPIDRURO
> >
> > config IRQSTACKS
> > def_bool y
> > - depends on GENERIC_IRQ_MULTI_HANDLER && THREAD_INFO_IN_TASK
> > + depends on GENERIC_IRQ_MULTI_HANDLER
> > select HAVE_IRQ_EXIT_ON_IRQ_STACK
> > select HAVE_SOFTIRQ_ON_OWN_STACK
>
> Side note: after this, we might want to investigate finishing off
> GENERIC_IRQ_MULTI_HANDLER for all architectures. The
> currently missing platforms are ARM_SINGLE_ARMV7M,
> ARCH_FOOTBRIDGE, ARCH_IOP32X and ARCH_RPC.
>
> These are a bit tricky (presumably this is why they are not converted
> yet), but it should be possible to do.
>
> > static inline void set_current(struct task_struct *cur)
> > {
> > - if (!IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO))
> > + if (!IS_ENABLED(CONFIG_CURRENT_POINTER_IN_TPIDRURO) &&
> > + !(IS_ENABLED(CONFIG_SMP) &&
> > + IS_ENABLED(CONFIG_SMP_ON_UP) &&
> > + smp_on_up)) {
>
> I think you can just use is_smp() here to simplify the condition. You might
> need to move the definition to a different header if that causes an #include
> loop.
>
OK
> > @@ -39,16 +67,30 @@ static inline struct task_struct *get_current(void)
> > * https://github.com/ClangBuiltLinux/linux/issues/1485
> > */
> > cur = __builtin_thread_pointer();
> > +#elif defined(CONFIG_CURRENT_POINTER_IN_TPIDRURO) || defined(CONFIG_SMP)
> > + asm("0: mrc p15, 0, %0, c13, c0, 3 \n\t"
> > +#ifdef CONFIG_CPU_V6
> > + "1: \n\t"
> > + " .subsection 1 \n\t"
> > + "2: " LOAD_CURRENT
> > + " b 1b \n\t"
> > + " .previous \n\t"
> > + " .pushsection \".alt.smp.init\", \"a\" \n\t"
> > + " .long 0b - . \n\t"
> > + " b . + (2b - 0b) \n\t"
> > + " .popsection \n\t"
> > +#endif
>
> You mentioned earlier that this gets ugly with SMP_ON_UP on ARMv6, now
> I see what you meant ;-)
>
> I can see an increasing number of reasons for no longer supporting this
> option. As we recently discussed on IRC, this would affect omap2plus_defconfig,
> imx_v6_v7_defconfig and realview_defconfig, which would all have to drop
> either CPU_V6 or SMP. Since you got it working already, this also seems
> better left as a cleanup for another time once we can build consensus on it,
> but my guess is that at this point the benefits of removing it outweigh those
> of keeping it.
>
Agreed.