linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK
@ 2021-09-02 15:54 Keith Packard
  2021-09-02 15:54 ` [PATCH 1/2] ARM: Add per-cpu variable holding cpu number Keith Packard
                   ` (4 more replies)
  0 siblings, 5 replies; 36+ messages in thread
From: Keith Packard @ 2021-09-02 15:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Abbott Liu, Alexander Sverdlin, Al Viro, Andrew Morton,
	Anshuman Khandual, Ard Biesheuvel, Arnd Bergmann,
	Bjorn Andersson, Florian Fainelli, Geert Uytterhoeven,
	Hartley Sweeten, Jens Axboe, Jian Cai, Joe Perches,
	Keith Packard, Linus Walleij, linux-arm-kernel, Maninder Singh,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Mike Rapoport, Nick Desaulniers, Nick Desaulniers, Nicolas Pitre,
	Peter Zijlstra, Russell King, Thomas Gleixner,
	Uwe Kleine-König, Valentin Schneider, Vaneet Narang,
	Wolfram Sang (Renesas),
	YiFei Zhu, Keith Packard

Placing thread_info in the kernel stack leaves it vulnerable to stack
overflow attacks. This short series addresses that by using the
existing THREAD_INFO_IN_TASK infrastructure.
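
For reference, a rough sketch of the difference (simplified and
illustrative, not the exact kernel code; the stack-based lookup shown
here is the one removed by patch 2/2):

/* Today: thread_info sits at the base of the kernel stack, so a deep
 * enough overflow can overwrite it (flags, addr_limit, ...). */
static inline struct thread_info *stack_thread_info(void)
{
	return (struct thread_info *)
		(current_stack_pointer & ~(THREAD_SIZE - 1));
}

/* With THREAD_INFO_IN_TASK: thread_info becomes the first member of
 * task_struct, which is allocated separately from the kernel stack. */
struct task_struct {
	struct thread_info	thread_info;	/* must remain first */
	/* ... */
};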

As this is my first patch in this part of the kernel, I'm looking for
feedback about the general approach as well as specific comments on
places where I've missed something.

I've only run this on armhf running under qemu, so while I've tried to
make patches for other code paths, I haven't been able to test those.

(yes, I know checkpatch.pl complains about whitespace in asm-offsets.c, I
decided to leave the existing whitespace alone)

Signed-off-by: Keith Packard <keithpac@amazon.com>



* [PATCH 1/2] ARM: Add per-cpu variable holding cpu number
  2021-09-02 15:54 [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK Keith Packard
@ 2021-09-02 15:54 ` Keith Packard
  2021-09-02 15:54 ` [PATCH 2/2] ARM: Move thread_info into task_struct Keith Packard
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 36+ messages in thread
From: Keith Packard @ 2021-09-02 15:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Abbott Liu, Alexander Sverdlin, Al Viro, Andrew Morton,
	Anshuman Khandual, Ard Biesheuvel, Arnd Bergmann,
	Bjorn Andersson, Florian Fainelli, Geert Uytterhoeven,
	Hartley Sweeten, Jens Axboe, Jian Cai, Joe Perches,
	Keith Packard, Linus Walleij, linux-arm-kernel, Maninder Singh,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Mike Rapoport, Nick Desaulniers, Nick Desaulniers, Nicolas Pitre,
	Peter Zijlstra, Russell King, Thomas Gleixner,
	Uwe Kleine-König, Valentin Schneider, Vaneet Narang,
	Wolfram Sang (Renesas),
	YiFei Zhu, Keith Packard

To help move thread_info into task_struct, stop using the cpu number
contained in the thread_info block in C code and use a per-cpu
variable instead. This value will be initialized long before the
task_struct cpu value for the various idle threads is set, which
avoids ordering issues during CPU startup.

Signed-off-by: Keith Packard <keithpac@amazon.com>
---
 arch/arm/include/asm/smp.h |  5 ++++-
 arch/arm/kernel/smp.c      | 14 ++++++++++++++
 2 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 5d508f5d56c4..3aca2c2089bc 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -15,7 +15,10 @@
 # error "<asm/smp.h> included in non-SMP build"
 #endif
 
-#define raw_smp_processor_id() (current_thread_info()->cpu)
+#define raw_smp_processor_id() this_cpu_read(cpu_number)
+#define __smp_processor_id() __this_cpu_read(cpu_number)
+
+DECLARE_PER_CPU_READ_MOSTLY(unsigned int, cpu_number);
 
 struct seq_file;
 
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 74679240a9d8..0457e25109c6 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -51,6 +51,9 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/ipi.h>
 
+DEFINE_PER_CPU_READ_MOSTLY(unsigned int, cpu_number);
+EXPORT_PER_CPU_SYMBOL(cpu_number);
+
 /*
  * as from 2.5, kernels no longer have an init_tasks structure
  * so we need some other way of telling a new secondary core
@@ -495,6 +498,7 @@ void __init smp_prepare_boot_cpu(void)
 void __init smp_prepare_cpus(unsigned int max_cpus)
 {
 	unsigned int ncores = num_possible_cpus();
+	unsigned int cpu;
 
 	init_cpu_topology();
 
@@ -505,6 +509,16 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 	 */
 	if (max_cpus > ncores)
 		max_cpus = ncores;
+
+	/*
+	 * Initialize the cpu_number value for each cpu before we
+	 * start it. This ensures that the value is valid during cpu
+	 * initialization, even before the idle task_struct cpu member
+	 * is set
+	 */
+	for_each_possible_cpu(cpu)
+		per_cpu(cpu_number, cpu) = cpu;
+
 	if (ncores > 1 && max_cpus) {
 		/*
 		 * Initialise the present map, which describes the set of CPUs
-- 
2.33.0



* [PATCH 2/2] ARM: Move thread_info into task_struct
  2021-09-02 15:54 [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK Keith Packard
  2021-09-02 15:54 ` [PATCH 1/2] ARM: Add per-cpu variable holding cpu number Keith Packard
@ 2021-09-02 15:54 ` Keith Packard
  2021-09-02 16:07 ` [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK Kees Cook
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 36+ messages in thread
From: Keith Packard @ 2021-09-02 15:54 UTC (permalink / raw)
  To: linux-kernel
  Cc: Abbott Liu, Alexander Sverdlin, Al Viro, Andrew Morton,
	Anshuman Khandual, Ard Biesheuvel, Arnd Bergmann,
	Bjorn Andersson, Florian Fainelli, Geert Uytterhoeven,
	Hartley Sweeten, Jens Axboe, Jian Cai, Joe Perches,
	Keith Packard, Linus Walleij, linux-arm-kernel, Maninder Singh,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Mike Rapoport, Nick Desaulniers, Nick Desaulniers, Nicolas Pitre,
	Peter Zijlstra, Russell King, Thomas Gleixner,
	Uwe Kleine-König, Valentin Schneider, Vaneet Narang,
	Wolfram Sang (Renesas),
	YiFei Zhu, Keith Packard

This avoids many stack overflow attacks that modify the thread_info
structure, by moving that structure into the task_struct as is done in
almost all other architectures.

The task_struct is now referenced by a per-cpu variable,
'current_task', allowing the current one on each cpu to be located
from both C and assembly.

This also involved removing the 'cpu' member from the thread_info
struct and using the one found in the task_struct instead. That member
is mostly used in asm code, where it might reasonably be replaced with
the recently added per-cpu 'cpu_number' variable.

Signed-off-by: Keith Packard <keithpac@amazon.com>
---
 arch/arm/Kconfig                   |  1 +
 arch/arm/include/asm/assembler.h   | 63 ++++++++++++++++++++++++------
 arch/arm/include/asm/current.h     | 36 +++++++++++++++++
 arch/arm/include/asm/smp.h         |  1 +
 arch/arm/include/asm/thread_info.h | 14 -------
 arch/arm/include/asm/tls.h         |  4 +-
 arch/arm/include/asm/uaccess-asm.h |  6 +--
 arch/arm/kernel/asm-offsets.c      | 31 ++++++++-------
 arch/arm/kernel/entry-armv.S       | 45 ++++++++++-----------
 arch/arm/kernel/entry-common.S     | 20 +++++-----
 arch/arm/kernel/entry-v7m.S        | 12 +++---
 arch/arm/kernel/iwmmxt.S           | 14 +++----
 arch/arm/kernel/process.c          |  1 +
 arch/arm/kernel/setup.c            |  9 ++---
 arch/arm/kernel/smp.c              | 12 +++++-
 arch/arm/lib/copy_from_user.S      |  4 +-
 arch/arm/lib/copy_to_user.S        |  4 +-
 arch/arm/mach-ep93xx/crunch-bits.S |  2 +-
 arch/arm/vfp/entry.S               |  4 +-
 arch/arm/vfp/vfpmodule.c           | 29 +++++++-------
 20 files changed, 193 insertions(+), 119 deletions(-)
 create mode 100644 arch/arm/include/asm/current.h

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 24804f11302d..1be16c7d7360 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -128,6 +128,7 @@ config ARM
 	select RTC_LIB
 	select SET_FS
 	select SYS_SUPPORTS_APM_EMULATION
+	select THREAD_INFO_IN_TASK
 	# Above selects are sorted alphabetically; please add new ones
 	# according to that.  Thanks.
 	help
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index e2b1fd558bf3..7723beed5cd9 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -200,13 +200,54 @@
 	.endr
 
 /*
- * Get current thread_info.
+ * Get per-CPU offset
+ * rd: Destination register
  */
-	.macro	get_thread_info, rd
- ARM(	mov	\rd, sp, lsr #THREAD_SIZE_ORDER + PAGE_SHIFT	)
- THUMB(	mov	\rd, sp			)
- THUMB(	lsr	\rd, \rd, #THREAD_SIZE_ORDER + PAGE_SHIFT	)
-	mov	\rd, \rd, lsl #THREAD_SIZE_ORDER + PAGE_SHIFT
+	.macro this_cpu_offset rd:req
+	mrc p15, 0, \rd, c13, c0, 4
+	.endm
+
+/*
+ * Load a per-cpu variable
+ * @dst: Destination register to receive value
+ * @sym: The name of the per-cpu variable
+ * @tmp: scratch register
+ */
+	.macro ldr_this_cpu dst : req, sym : req, tmp : req
+	movw	\dst, #:lower16:\sym
+	movt	\dst, #:upper16:\sym
+	this_cpu_offset \tmp
+	ldr	\dst, [\dst, \tmp]
+	.endm
+
+/*
+ * Store a value in a per-cpu variable
+ * @src: Source register with value
+ * @sym: The name of the per-cpu variable
+ * @t1: scratch register 1
+ * @t2: scratch register 2
+ */
+	.macro str_this_cpu src : req, sym : req, t1 : req, t2 : req
+	movw	\t1, #:lower16:\sym
+	movt	\t1, #:upper16:\sym
+	this_cpu_offset \t2
+	str	\src, [\t1, \t2]
+	.endm
+	
+/*
+ * Get current task_info
+ * @dst: Destination register to receive current task
+ * @tmp: scratch register
+ */
+	.macro get_current, dst : req, tmp : req
+	ldr_this_cpu \dst, current_task, \tmp
+	.endm
+
+/*
+ * Set current task info
+ */
+	.macro	set_current_task rs : req, t1 : req, t2 : req
+	str_this_cpu \rs, current_task, \t1, \t2
 	.endm
 
 /*
@@ -214,19 +255,19 @@
  */
 #ifdef CONFIG_PREEMPT_COUNT
 	.macro	inc_preempt_count, ti, tmp
-	ldr	\tmp, [\ti, #TI_PREEMPT]	@ get preempt count
+	ldr	\tmp, [\ti, #TSK_TI_PREEMPT]	@ get preempt count
 	add	\tmp, \tmp, #1			@ increment it
-	str	\tmp, [\ti, #TI_PREEMPT]
+	str	\tmp, [\ti, #TSK_TI_PREEMPT]
 	.endm
 
 	.macro	dec_preempt_count, ti, tmp
-	ldr	\tmp, [\ti, #TI_PREEMPT]	@ get preempt count
+	ldr	\tmp, [\ti, #TSK_TI_PREEMPT]	@ get preempt count
 	sub	\tmp, \tmp, #1			@ decrement it
-	str	\tmp, [\ti, #TI_PREEMPT]
+	str	\tmp, [\ti, #TSK_TI_PREEMPT]
 	.endm
 
 	.macro	dec_preempt_count_ti, ti, tmp
-	get_thread_info \ti
+	get_task_info \ti, \tmp
 	dec_preempt_count \ti, \tmp
 	.endm
 #else
diff --git a/arch/arm/include/asm/current.h b/arch/arm/include/asm/current.h
new file mode 100644
index 000000000000..b389c4c0005c
--- /dev/null
+++ b/arch/arm/include/asm/current.h
@@ -0,0 +1,36 @@
+/*
+ * Copyright © 2021 Keith Packard <keithp@keithp.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+#ifndef _ASM_ARM_CURRENT_H
+#define _ASM_ARM_CURRENT_H
+
+#include <linux/compiler.h>
+#include <linux/thread_info.h>
+#include <asm/percpu.h>
+
+#ifndef __ASSEMBLY__
+struct task_struct;
+
+DECLARE_PER_CPU(struct task_struct *, current_task);
+
+static __always_inline struct task_struct *get_current(void)
+{
+	return raw_cpu_read(current_task);
+}
+
+#define current get_current()
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_ARM_CURRENT_H */
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 3aca2c2089bc..02989d094fd8 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -64,6 +64,7 @@ struct secondary_data {
 	};
 	unsigned long swapper_pg_dir;
 	void *stack;
+	unsigned int cpu;
 };
 extern struct secondary_data secondary_data;
 extern void secondary_startup(void);
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 70d4cbc49ae1..995ca0bcb7b1 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -55,8 +55,6 @@ struct thread_info {
 	unsigned long		flags;		/* low level flags */
 	int			preempt_count;	/* 0 => preemptable, <0 => bug */
 	mm_segment_t		addr_limit;	/* address limit */
-	struct task_struct	*task;		/* main task structure */
-	__u32			cpu;		/* cpu */
 	__u32			cpu_domain;	/* cpu domain */
 #ifdef CONFIG_STACKPROTECTOR_PER_TASK
 	unsigned long		stack_canary;
@@ -77,23 +75,11 @@ struct thread_info {
 
 #define INIT_THREAD_INFO(tsk)						\
 {									\
-	.task		= &tsk,						\
 	.flags		= 0,						\
 	.preempt_count	= INIT_PREEMPT_COUNT,				\
 	.addr_limit	= KERNEL_DS,					\
 }
 
-/*
- * how to get the thread information struct from C
- */
-static inline struct thread_info *current_thread_info(void) __attribute_const__;
-
-static inline struct thread_info *current_thread_info(void)
-{
-	return (struct thread_info *)
-		(current_stack_pointer & ~(THREAD_SIZE - 1));
-}
-
 #define thread_saved_pc(tsk)	\
 	((unsigned long)(task_thread_info(tsk)->cpu_context.pc))
 #define thread_saved_sp(tsk)	\
diff --git a/arch/arm/include/asm/tls.h b/arch/arm/include/asm/tls.h
index 5a66c3b13c92..309e870e70a3 100644
--- a/arch/arm/include/asm/tls.h
+++ b/arch/arm/include/asm/tls.h
@@ -14,7 +14,7 @@
 	mrc	p15, 0, \tmp2, c13, c0, 2	@ get the user r/w register
 	mcr	p15, 0, \tp, c13, c0, 3		@ set TLS register
 	mcr	p15, 0, \tpuser, c13, c0, 2	@ and the user r/w register
-	str	\tmp2, [\base, #TI_TP_VALUE + 4] @ save it
+	str	\tmp2, [\base, #TSK_TI_TP_VALUE + 4] @ save it
 	.endm
 
 	.macro switch_tls_v6, base, tp, tpuser, tmp1, tmp2
@@ -26,7 +26,7 @@
 	mrcne	p15, 0, \tmp2, c13, c0, 2	@ get the user r/w register
 	mcrne	p15, 0, \tp, c13, c0, 3		@ yes, set TLS register
 	mcrne	p15, 0, \tpuser, c13, c0, 2	@ set user r/w register
-	strne	\tmp2, [\base, #TI_TP_VALUE + 4] @ save it
+	strne	\tmp2, [\base, #TSK_TI_TP_VALUE + 4] @ save it
 	.endm
 
 	.macro switch_tls_software, base, tp, tpuser, tmp1, tmp2
diff --git a/arch/arm/include/asm/uaccess-asm.h b/arch/arm/include/asm/uaccess-asm.h
index e6eb7a2aaf1e..1c79d851a396 100644
--- a/arch/arm/include/asm/uaccess-asm.h
+++ b/arch/arm/include/asm/uaccess-asm.h
@@ -84,9 +84,9 @@
 	 * if \disable is set.
 	 */
 	.macro	uaccess_entry, tsk, tmp0, tmp1, tmp2, disable
-	ldr	\tmp1, [\tsk, #TI_ADDR_LIMIT]
+	ldr	\tmp1, [\tsk, #TSK_TI_ADDR_LIMIT]
 	ldr	\tmp2, =TASK_SIZE
-	str	\tmp2, [\tsk, #TI_ADDR_LIMIT]
+	str	\tmp2, [\tsk, #TSK_TI_ADDR_LIMIT]
  DACR(	mrc	p15, 0, \tmp0, c3, c0, 0)
  DACR(	str	\tmp0, [sp, #SVC_DACR])
 	str	\tmp1, [sp, #SVC_ADDR_LIMIT]
@@ -108,7 +108,7 @@
 	.macro	uaccess_exit, tsk, tmp0, tmp1
 	ldr	\tmp1, [sp, #SVC_ADDR_LIMIT]
  DACR(	ldr	\tmp0, [sp, #SVC_DACR])
-	str	\tmp1, [\tsk, #TI_ADDR_LIMIT]
+	str	\tmp1, [\tsk, #TSK_TI_ADDR_LIMIT]
  DACR(	mcr	p15, 0, \tmp0, c3, c0, 0)
 	.endm
 
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 70993af22d80..9de799481fb2 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -41,33 +41,34 @@ int main(void)
   DEFINE(TSK_STACK_CANARY,	offsetof(struct task_struct, stack_canary));
 #endif
   BLANK();
-  DEFINE(TI_FLAGS,		offsetof(struct thread_info, flags));
-  DEFINE(TI_PREEMPT,		offsetof(struct thread_info, preempt_count));
-  DEFINE(TI_ADDR_LIMIT,		offsetof(struct thread_info, addr_limit));
-  DEFINE(TI_TASK,		offsetof(struct thread_info, task));
-  DEFINE(TI_CPU,		offsetof(struct thread_info, cpu));
-  DEFINE(TI_CPU_DOMAIN,		offsetof(struct thread_info, cpu_domain));
-  DEFINE(TI_CPU_SAVE,		offsetof(struct thread_info, cpu_context));
-  DEFINE(TI_USED_CP,		offsetof(struct thread_info, used_cp));
-  DEFINE(TI_TP_VALUE,		offsetof(struct thread_info, tp_value));
-  DEFINE(TI_FPSTATE,		offsetof(struct thread_info, fpstate));
+  DEFINE(TSK_TI_FLAGS,		offsetof(struct task_struct, thread_info.flags));
+  DEFINE(TSK_TI_PREEMPT,	offsetof(struct task_struct, thread_info.preempt_count));
+  DEFINE(TSK_TI_ADDR_LIMIT,	offsetof(struct task_struct, thread_info.addr_limit));
+  DEFINE(TSK_TI_CPU_DOMAIN,	offsetof(struct task_struct, thread_info.cpu_domain));
+  DEFINE(TSK_TI_CPU_SAVE,	offsetof(struct task_struct, thread_info.cpu_context));
+  DEFINE(TSK_TI_USED_CP,	offsetof(struct task_struct, thread_info.used_cp));
+  DEFINE(TSK_TI_TP_VALUE,	offsetof(struct task_struct, thread_info.tp_value));
+  DEFINE(TSK_TI_FPSTATE,	offsetof(struct task_struct, thread_info.fpstate));
+#ifdef CONFIG_SMP
+  DEFINE(TSK_CPU,		offsetof(struct task_struct, cpu));
+#endif
 #ifdef CONFIG_VFP
-  DEFINE(TI_VFPSTATE,		offsetof(struct thread_info, vfpstate));
+  DEFINE(TSK_TI_VFPSTATE,	offsetof(struct task_struct, thread_info.vfpstate));
 #ifdef CONFIG_SMP
   DEFINE(VFP_CPU,		offsetof(union vfp_state, hard.cpu));
 #endif
 #endif
 #ifdef CONFIG_ARM_THUMBEE
-  DEFINE(TI_THUMBEE_STATE,	offsetof(struct thread_info, thumbee_state));
+  DEFINE(TSK_TI_THUMBEE_STATE,	offsetof(struct task_struct, thread_info.thumbee_state));
 #endif
 #ifdef CONFIG_IWMMXT
-  DEFINE(TI_IWMMXT_STATE,	offsetof(struct thread_info, fpstate.iwmmxt));
+  DEFINE(TSK_TI_IWMMXT_STATE,	offsetof(struct task_struct, thread_info.fpstate.iwmmxt));
 #endif
 #ifdef CONFIG_CRUNCH
-  DEFINE(TI_CRUNCH_STATE,	offsetof(struct thread_info, crunchstate));
+  DEFINE(TSK_TI_CRUNCH_STATE,	offsetof(struct task_struct, thread_info.crunchstate));
 #endif
 #ifdef CONFIG_STACKPROTECTOR_PER_TASK
-  DEFINE(TI_STACK_CANARY,	offsetof(struct thread_info, stack_canary));
+  DEFINE(TSK_TI_STACK_CANARY,	offsetof(struct task_struct, thread_info.stack_canary));
 #endif
   DEFINE(THREAD_SZ_ORDER,	THREAD_SIZE_ORDER);
   BLANK();
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 0ea8529a4872..2196e4dd7877 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -179,7 +179,7 @@ ENDPROC(__und_invalid)
 	@
 	stmia	r7, {r2 - r6}
 
-	get_thread_info tsk
+	get_current tsk, r0
 	uaccess_entry tsk, r0, r1, r2, \uaccess
 
 	.if \trace
@@ -205,8 +205,8 @@ __irq_svc:
 	irq_handler
 
 #ifdef CONFIG_PREEMPTION
-	ldr	r8, [tsk, #TI_PREEMPT]		@ get preempt count
-	ldr	r0, [tsk, #TI_FLAGS]		@ get flags
+	ldr	r8, [tsk, #TSK_TI_PREEMPT]	@ get preempt count
+	ldr	r0, [tsk, #TSK_TI_FLAGS]	@ get flags
 	teq	r8, #0				@ if preempt count != 0
 	movne	r0, #0				@ force flags to 0
 	tst	r0, #_TIF_NEED_RESCHED
@@ -223,7 +223,7 @@ ENDPROC(__irq_svc)
 svc_preempt:
 	mov	r8, lr
 1:	bl	preempt_schedule_irq		@ irq en/disable is done inside
-	ldr	r0, [tsk, #TI_FLAGS]		@ get new tasks TI_FLAGS
+	ldr	r0, [tsk, #TSK_TI_FLAGS]	@ get new tasks TI_FLAGS
 	tst	r0, #_TIF_NEED_RESCHED
 	reteq	r8				@ go again
 	b	1b
@@ -260,7 +260,7 @@ __und_svc:
 	bl	__und_fault
 
 __und_svc_finish:
-	get_thread_info tsk
+	get_current tsk, r5
 	ldr	r5, [sp, #S_PSR]		@ Get SVC cpsr
 	svc_exit r5				@ return from exception
  UNWIND(.fnend		)
@@ -428,7 +428,7 @@ __irq_usr:
 	usr_entry
 	kuser_cmpxchg_check
 	irq_handler
-	get_thread_info tsk
+	get_current tsk, why
 	mov	why, #0
 	b	ret_to_user_from_irq
  UNWIND(.fnend		)
@@ -572,12 +572,12 @@ ENDPROC(__und_usr)
 	@ Fall-through from Thumb-2 __und_usr
 	@
 #ifdef CONFIG_NEON
-	get_thread_info r10			@ get current thread
+	get_current r10, r6			@ get current thread
 	adr	r6, .LCneon_thumb_opcodes
 	b	2f
 #endif
 call_fpe:
-	get_thread_info r10			@ get current thread
+	get_current r10, r6			@ get current thread
 #ifdef CONFIG_NEON
 	adr	r6, .LCneon_arm_opcodes
 2:	ldr	r5, [r6], #4			@ mask value
@@ -588,8 +588,8 @@ call_fpe:
 	cmp	r8, r7				@ NEON instruction?
 	bne	2b
 	mov	r7, #1
-	strb	r7, [r10, #TI_USED_CP + 10]	@ mark CP#10 as used
-	strb	r7, [r10, #TI_USED_CP + 11]	@ mark CP#11 as used
+	strb	r7, [r10, #TSK_TI_USED_CP + 10]	@ mark CP#10 as used
+	strb	r7, [r10, #TSK_TI_USED_CP + 11]	@ mark CP#11 as used
 	b	do_vfp				@ let VFP handler handle this
 1:
 #endif
@@ -599,12 +599,12 @@ call_fpe:
 	and	r8, r0, #0x00000f00		@ mask out CP number
  THUMB(	lsr	r8, r8, #8		)
 	mov	r7, #1
-	add	r6, r10, #TI_USED_CP
+	add	r6, r10, #TSK_TI_USED_CP
  ARM(	strb	r7, [r6, r8, lsr #8]	)	@ set appropriate used_cp[]
  THUMB(	strb	r7, [r6, r8]		)	@ set appropriate used_cp[]
 #ifdef CONFIG_IWMMXT
 	@ Test if we need to give access to iWMMXt coprocessors
-	ldr	r5, [r10, #TI_FLAGS]
+	ldr	r5, [r10, #TSK_TI_FLAGS]
 	rsbs	r7, r8, #(1 << 8)		@ CP 0 or 1 only
 	movscs	r7, r5, lsr #(TIF_USING_IWMMXT + 1)
 	bcs	iwmmxt_task_enable
@@ -674,7 +674,7 @@ call_fpe:
 
 do_fpe:
 	ldr	r4, .LCfp
-	add	r10, r10, #TI_FPSTATE		@ r10 = workspace
+	add	r10, r10, #TSK_TI_FPSTATE	@ r10 = workspace
 	ldr	pc, [r4]			@ Call FP module USR entry point
 
 /*
@@ -722,7 +722,7 @@ __pabt_usr:
 ENTRY(ret_from_exception)
  UNWIND(.fnstart	)
  UNWIND(.cantunwind	)
-	get_thread_info tsk
+	get_current tsk, why
 	mov	why, #0
 	b	ret_to_user
  UNWIND(.fnend		)
@@ -735,7 +735,7 @@ __fiq_usr:
 	kuser_cmpxchg_check
 	mov	r0, sp				@ struct pt_regs *regs
 	bl	handle_fiq_as_nmi
-	get_thread_info tsk
+	get_current tsk, r0
 	restore_user_regs fast = 0, offset = 0
  UNWIND(.fnend		)
 ENDPROC(__fiq_usr)
@@ -748,21 +748,21 @@ ENDPROC(__fiq_usr)
 ENTRY(__switch_to)
  UNWIND(.fnstart	)
  UNWIND(.cantunwind	)
-	add	ip, r1, #TI_CPU_SAVE
+	add	ip, r1, #TSK_TI_CPU_SAVE
  ARM(	stmia	ip!, {r4 - sl, fp, sp, lr} )	@ Store most regs on stack
  THUMB(	stmia	ip!, {r4 - sl, fp}	   )	@ Store most regs on stack
  THUMB(	str	sp, [ip], #4		   )
  THUMB(	str	lr, [ip], #4		   )
-	ldr	r4, [r2, #TI_TP_VALUE]
-	ldr	r5, [r2, #TI_TP_VALUE + 4]
+	ldr	r4, [r2, #TSK_TI_TP_VALUE]
+	ldr	r5, [r2, #TSK_TI_TP_VALUE + 4]
 #ifdef CONFIG_CPU_USE_DOMAINS
 	mrc	p15, 0, r6, c3, c0, 0		@ Get domain register
-	str	r6, [r1, #TI_CPU_DOMAIN]	@ Save old domain register
-	ldr	r6, [r2, #TI_CPU_DOMAIN]
+	str	r6, [r1, #TSK_TI_CPU_DOMAIN]	@ Save old domain register
+	ldr	r6, [r2, #TSK_TI_CPU_DOMAIN]
 #endif
 	switch_tls r1, r4, r5, r3, r7
 #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_SMP)
-	ldr	r7, [r2, #TI_TASK]
+	ldr	r7, [r2, #TSK_TI_TASK]
 	ldr	r8, =__stack_chk_guard
 	.if (TSK_STACK_CANARY > IMM12_MASK)
 	add	r7, r7, #TSK_STACK_CANARY & ~IMM12_MASK
@@ -772,8 +772,9 @@ ENTRY(__switch_to)
 #ifdef CONFIG_CPU_USE_DOMAINS
 	mcr	p15, 0, r6, c3, c0, 0		@ Set domain register
 #endif
+	set_current_task r2, r4, r5
 	mov	r5, r0
-	add	r4, r2, #TI_CPU_SAVE
+	add	r4, r2, #TSK_TI_CPU_SAVE
 	ldr	r0, =thread_notify_head
 	mov	r1, #THREAD_NOTIFY_SWITCH
 	bl	atomic_notifier_call_chain
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 7f0b7aba1498..7a5044dbfc24 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -49,11 +49,11 @@ __ret_fast_syscall:
  UNWIND(.fnstart	)
  UNWIND(.cantunwind	)
 	disable_irq_notrace			@ disable interrupts
-	ldr	r2, [tsk, #TI_ADDR_LIMIT]
+	ldr	r2, [tsk, #TSK_TI_ADDR_LIMIT]
 	ldr	r1, =TASK_SIZE
 	cmp	r2, r1
 	blne	addr_limit_check_failed
-	ldr	r1, [tsk, #TI_FLAGS]		@ re-check for syscall tracing
+	ldr	r1, [tsk, #TSK_TI_FLAGS]	@ re-check for syscall tracing
 	movs	r1, r1, lsl #16
 	bne	fast_work_pending
 
@@ -87,11 +87,11 @@ __ret_fast_syscall:
 	bl	do_rseq_syscall
 #endif
 	disable_irq_notrace			@ disable interrupts
-	ldr	r2, [tsk, #TI_ADDR_LIMIT]
+	ldr	r2, [tsk, #TSK_TI_ADDR_LIMIT]
 	ldr     r1, =TASK_SIZE
 	cmp     r2, r1
 	blne	addr_limit_check_failed
-	ldr	r1, [tsk, #TI_FLAGS]		@ re-check for syscall tracing
+	ldr	r1, [tsk, #TSK_TI_FLAGS]	@ re-check for syscall tracing
 	movs	r1, r1, lsl #16
 	beq	no_work_pending
  UNWIND(.fnend		)
@@ -129,11 +129,11 @@ ret_slow_syscall:
 #endif
 	disable_irq_notrace			@ disable interrupts
 ENTRY(ret_to_user_from_irq)
-	ldr	r2, [tsk, #TI_ADDR_LIMIT]
+	ldr	r2, [tsk, #TSK_TI_ADDR_LIMIT]
 	ldr     r1, =TASK_SIZE
 	cmp	r2, r1
 	blne	addr_limit_check_failed
-	ldr	r1, [tsk, #TI_FLAGS]
+	ldr	r1, [tsk, #TSK_TI_FLAGS]
 	movs	r1, r1, lsl #16
 	bne	slow_work_pending
 no_work_pending:
@@ -156,7 +156,7 @@ ENTRY(ret_from_fork)
 	movne	r0, r4
 	badrne	lr, 1f
 	retne	r5
-1:	get_thread_info tsk
+1:	get_current tsk, r1
 	b	ret_slow_syscall
 ENDPROC(ret_from_fork)
 
@@ -243,7 +243,7 @@ ENTRY(vector_swi)
 	bic	scno, scno, #0xff000000		@ mask off SWI op-code
 	eor	scno, scno, #__NR_SYSCALL_BASE	@ check OS number
 #endif
-	get_thread_info tsk
+	get_current tsk, r10
 	/*
 	 * Reload the registers that may have been corrupted on entry to
 	 * the syscall assembly (by tracing or context tracking.)
@@ -251,7 +251,7 @@ ENTRY(vector_swi)
  TRACE(	ldmia	sp, {r0 - r3}		)
 
 local_restart:
-	ldr	r10, [tsk, #TI_FLAGS]		@ check for syscall tracing
+	ldr	r10, [tsk, #TSK_TI_FLAGS]	@ check for syscall tracing
 	stmdb	sp!, {r4, r5}			@ push fifth and sixth args
 
 	tst	r10, #_TIF_SYSCALL_WORK		@ are we tracing syscalls?
@@ -278,7 +278,7 @@ local_restart:
 9001:
 	sub	lr, saved_pc, #4
 	str	lr, [sp, #S_PC]
-	get_thread_info tsk
+	get_current tsk, r1
 	b	ret_fast_syscall
 #endif
 ENDPROC(vector_swi)
diff --git a/arch/arm/kernel/entry-v7m.S b/arch/arm/kernel/entry-v7m.S
index d0e898608d30..4de5ee2a432d 100644
--- a/arch/arm/kernel/entry-v7m.S
+++ b/arch/arm/kernel/entry-v7m.S
@@ -57,8 +57,8 @@ __irq_entry:
 	tst	r0, V7M_SCB_ICSR_RETTOBASE
 	beq	2f
 
-	get_thread_info tsk
-	ldr	r2, [tsk, #TI_FLAGS]
+	get_current tsk, r2
+	ldr	r2, [tsk, #TSK_TI_FLAGS]
 	movs	r2, r2, lsl #16
 	beq	2f			@ no work pending
 	mov	r0, #V7M_SCB_ICSR_PENDSVSET
@@ -83,7 +83,7 @@ __pendsv_entry:
 	str	r0, [r1, V7M_SCB_ICSR]	@ clear PendSV
 
 	@ execute the pending work, including reschedule
-	get_thread_info tsk
+	get_current tsk, why
 	mov	why, #0
 	b	ret_to_user_from_irq
 ENDPROC(__pendsv_entry)
@@ -96,17 +96,19 @@ ENDPROC(__pendsv_entry)
 ENTRY(__switch_to)
 	.fnstart
 	.cantunwind
-	add	ip, r1, #TI_CPU_SAVE
+	add	ip, r1, #TSK_TI_CPU_SAVE
 	stmia	ip!, {r4 - r11}		@ Store most regs on stack
 	str	sp, [ip], #4
 	str	lr, [ip], #4
 	mov	r5, r0
-	add	r4, r2, #TI_CPU_SAVE
+	add	r4, r2, #TSK_TI_CPU_SAVE
 	ldr	r0, =thread_notify_head
 	mov	r1, #THREAD_NOTIFY_SWITCH
 	bl	atomic_notifier_call_chain
 	mov	ip, r4
 	mov	r0, r5
+	ldr	r7, [r2, #TSK_TI_TASK]
+	set_current_task r7, r4, r5
 	ldmia	ip!, {r4 - r11}		@ Load all regs saved previously
 	ldr	sp, [ip]
 	ldr	pc, [ip, #4]!
diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
index d2b4ac06e4ed..f709914b9029 100644
--- a/arch/arm/kernel/iwmmxt.S
+++ b/arch/arm/kernel/iwmmxt.S
@@ -84,7 +84,7 @@ ENTRY(iwmmxt_task_enable)
 	PJ4(mcr	p15, 0, r2, c1, c0, 2)
 
 	ldr	r3, =concan_owner
-	add	r0, r10, #TI_IWMMXT_STATE	@ get task Concan save area
+	add	r0, r10, #TSK_TI_IWMMXT_STATE	@ get task Concan save area
 	ldr	r2, [sp, #60]			@ current task pc value
 	ldr	r1, [r3]			@ get current Concan owner
 	str	r0, [r3]			@ this task now owns Concan regs
@@ -96,7 +96,7 @@ ENTRY(iwmmxt_task_enable)
 	bl	concan_save
 
 #ifdef CONFIG_PREEMPT_COUNT
-	get_thread_info r10
+	get_current r10, r3
 #endif
 4:	dec_preempt_count r10, r3
 	ret	r9				@ normal exit from exception
@@ -199,7 +199,7 @@ ENTRY(iwmmxt_task_disable)
 	msr	cpsr_c, r2
 
 	ldr	r3, =concan_owner
-	add	r2, r0, #TI_IWMMXT_STATE	@ get task Concan save area
+	add	r2, r0, #TSK_TI_IWMMXT_STATE	@ get task Concan save area
 	ldr	r1, [r3]			@ get current Concan owner
 	teq	r1, #0				@ any current owner?
 	beq	1f				@ no: quit
@@ -251,7 +251,7 @@ ENTRY(iwmmxt_task_copy)
 	msr	cpsr_c, r2
 
 	ldr	r3, =concan_owner
-	add	r2, r0, #TI_IWMMXT_STATE	@ get task Concan save area
+	add	r2, r0, #TSK_TI_IWMMXT_STATE	@ get task Concan save area
 	ldr	r3, [r3]			@ get current Concan owner
 	teq	r2, r3				@ does this task own it...
 	beq	1f
@@ -289,7 +289,7 @@ ENTRY(iwmmxt_task_restore)
 	msr	cpsr_c, r2
 
 	ldr	r3, =concan_owner
-	add	r2, r0, #TI_IWMMXT_STATE	@ get task Concan save area
+	add	r2, r0, #TSK_TI_IWMMXT_STATE	@ get task Concan save area
 	ldr	r3, [r3]			@ get current Concan owner
 	bic	r2, r2, #0x7			@ 64-bit alignment
 	teq	r2, r3				@ does this task own it...
@@ -328,7 +328,7 @@ ENTRY(iwmmxt_task_switch)
 	bne	1f				@ yes: block them for next task
 
 	ldr	r2, =concan_owner
-	add	r3, r0, #TI_IWMMXT_STATE	@ get next task Concan save area
+	add	r3, r0, #TSK_TI_IWMMXT_STATE	@ get next task Concan save area
 	ldr	r2, [r2]			@ get current Concan owner
 	teq	r2, r3				@ next task owns it?
 	retne	lr				@ no: leave Concan disabled
@@ -355,7 +355,7 @@ ENTRY(iwmmxt_task_release)
 	orr	ip, r2, #PSR_I_BIT		@ disable interrupts
 	msr	cpsr_c, ip
 	ldr	r3, =concan_owner
-	add	r0, r0, #TI_IWMMXT_STATE	@ get task Concan save area
+	add	r0, r0, #TSK_TI_IWMMXT_STATE	@ get task Concan save area
 	ldr	r1, [r3]			@ get current Concan owner
 	eors	r0, r0, r1			@ if equal...
 	streq	r0, [r3]			@ then clear ownership
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 6324f4db9b02..c7539f319f96 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -27,6 +27,7 @@
 #include <linux/random.h>
 #include <linux/hw_breakpoint.h>
 #include <linux/leds.h>
+#include <linux/percpu.h>
 
 #include <asm/processor.h>
 #include <asm/thread_notify.h>
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 73ca7797b92f..a5a52817948a 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -532,12 +532,6 @@ void notrace cpu_init(void)
 		BUG();
 	}
 
-	/*
-	 * This only works on resume and secondary cores. For booting on the
-	 * boot cpu, smp_prepare_boot_cpu is called after percpu area setup.
-	 */
-	set_my_cpu_offset(per_cpu_offset(cpu));
-
 	cpu_proc_init();
 
 	/*
@@ -733,6 +727,9 @@ static void __init setup_processor(void)
 	elf_hwcap_fixup();
 
 	cacheid_init();
+
+	set_my_cpu_offset(per_cpu_offset(0));
+
 	cpu_init();
 }
 
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 0457e25109c6..9b584811baad 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -51,6 +51,10 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/ipi.h>
 
+DEFINE_PER_CPU(struct task_struct *, current_task) ____cacheline_aligned =
+	&init_task;
+EXPORT_PER_CPU_SYMBOL(current_task);
+
 DEFINE_PER_CPU_READ_MOSTLY(unsigned int, cpu_number);
 EXPORT_PER_CPU_SYMBOL(cpu_number);
 
@@ -156,8 +160,11 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 	secondary_data.pgdir = virt_to_phys(idmap_pgd);
 	secondary_data.swapper_pg_dir = get_arch_pgd(swapper_pg_dir);
 #endif
+	secondary_data.cpu = cpu;
 	sync_cache_w(&secondary_data);
 
+	per_cpu(current_task, cpu) = idle;
+
 	/*
 	 * Now bring the CPU into our world.
 	 */
@@ -406,7 +413,9 @@ static void smp_store_cpu_info(unsigned int cpuid)
 asmlinkage void secondary_start_kernel(void)
 {
 	struct mm_struct *mm = &init_mm;
-	unsigned int cpu;
+	unsigned int cpu = secondary_data.cpu;
+
+	set_my_cpu_offset(per_cpu_offset(cpu));
 
 	secondary_biglittle_init();
 
@@ -423,7 +432,6 @@ asmlinkage void secondary_start_kernel(void)
 	 * All kernel threads share the same mm context; grab a
 	 * reference and switch to it.
 	 */
-	cpu = smp_processor_id();
 	mmgrab(mm);
 	current->active_mm = mm;
 	cpumask_set_cpu(cpu, mm_cpumask(mm));
diff --git a/arch/arm/lib/copy_from_user.S b/arch/arm/lib/copy_from_user.S
index f8016e3db65d..4ed04f91ed0d 100644
--- a/arch/arm/lib/copy_from_user.S
+++ b/arch/arm/lib/copy_from_user.S
@@ -109,8 +109,8 @@
 
 ENTRY(arm_copy_from_user)
 #ifdef CONFIG_CPU_SPECTRE
-	get_thread_info r3
-	ldr	r3, [r3, #TI_ADDR_LIMIT]
+	get_current r3, ip
+	ldr	r3, [r3, #TSK_TI_ADDR_LIMIT]
 	uaccess_mask_range_ptr r1, r2, r3, ip
 #endif
 
diff --git a/arch/arm/lib/copy_to_user.S b/arch/arm/lib/copy_to_user.S
index ebfe4cb3d912..229061cfb5b0 100644
--- a/arch/arm/lib/copy_to_user.S
+++ b/arch/arm/lib/copy_to_user.S
@@ -109,8 +109,8 @@
 ENTRY(__copy_to_user_std)
 WEAK(arm_copy_to_user)
 #ifdef CONFIG_CPU_SPECTRE
-	get_thread_info r3
-	ldr	r3, [r3, #TI_ADDR_LIMIT]
+	get_current r3, ip
+	ldr	r3, [r3, #TSK_TI_ADDR_LIMIT]
 	uaccess_mask_range_ptr r0, r2, r3, ip
 #endif
 
diff --git a/arch/arm/mach-ep93xx/crunch-bits.S b/arch/arm/mach-ep93xx/crunch-bits.S
index fb2dbf76f09e..dad62e3d4b03 100644
--- a/arch/arm/mach-ep93xx/crunch-bits.S
+++ b/arch/arm/mach-ep93xx/crunch-bits.S
@@ -192,7 +192,7 @@ crunch_load:
 
 1:
 #ifdef CONFIG_PREEMPT_COUNT
-	get_thread_info r10
+	get_task_info r10, r3
 #endif
 2:	dec_preempt_count r10, r3
 	ret	lr
diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
index 27b0a1f27fbd..003554f8393d 100644
--- a/arch/arm/vfp/entry.S
+++ b/arch/arm/vfp/entry.S
@@ -24,8 +24,8 @@
 ENTRY(do_vfp)
 	inc_preempt_count r10, r4
  	ldr	r4, .LCvfp
-	ldr	r11, [r10, #TI_CPU]	@ CPU number
-	add	r10, r10, #TI_VFPSTATE	@ r10 = workspace
+	ldr	r11, [r10, #TSK_CPU]	@ CPU number
+	add	r10, r10, #TSK_TI_VFPSTATE	@ r10 = workspace
 	ldr	pc, [r4]		@ call VFP entry point
 ENDPROC(do_vfp)
 
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 2cb355c1b5b7..4e2707fba8fd 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -143,22 +143,22 @@ static void vfp_thread_copy(struct thread_info *thread)
  *  THREAD_NOFTIFY_SWTICH:
  *   - the previously running thread will not be scheduled onto another CPU.
  *   - the next thread to be run (v) will not be running on another CPU.
- *   - thread->cpu is the local CPU number
+ *   - tsk->cpu is the local CPU number
  *   - not preemptible as we're called in the middle of a thread switch
  *  THREAD_NOTIFY_FLUSH:
  *   - the thread (v) will be running on the local CPU, so
- *	v === current_thread_info()
- *   - thread->cpu is the local CPU number at the time it is accessed,
+ *	v === current
+ *   - tsk->cpu is the local CPU number at the time it is accessed,
  *	but may change at any time.
  *   - we could be preempted if tree preempt rcu is enabled, so
- *	it is unsafe to use thread->cpu.
+ *	it is unsafe to use tsk->cpu.
  *  THREAD_NOTIFY_EXIT
  *   - we could be preempted if tree preempt rcu is enabled, so
- *	it is unsafe to use thread->cpu.
+ *	it is unsafe to use tsk->cpu.
  */
 static int vfp_notifier(struct notifier_block *self, unsigned long cmd, void *v)
 {
-	struct thread_info *thread = v;
+	struct task_struct *tsk = v;
 	u32 fpexc;
 #ifdef CONFIG_SMP
 	unsigned int cpu;
@@ -169,7 +169,7 @@ static int vfp_notifier(struct notifier_block *self, unsigned long cmd, void *v)
 		fpexc = fmrx(FPEXC);
 
 #ifdef CONFIG_SMP
-		cpu = thread->cpu;
+		cpu = tsk->cpu;
 
 		/*
 		 * On SMP, if VFP is enabled, save the old state in
@@ -188,15 +188,15 @@ static int vfp_notifier(struct notifier_block *self, unsigned long cmd, void *v)
 		break;
 
 	case THREAD_NOTIFY_FLUSH:
-		vfp_thread_flush(thread);
+		vfp_thread_flush(&tsk->thread_info);
 		break;
 
 	case THREAD_NOTIFY_EXIT:
-		vfp_thread_exit(thread);
+		vfp_thread_exit(&tsk->thread_info);
 		break;
 
 	case THREAD_NOTIFY_COPY:
-		vfp_thread_copy(thread);
+		vfp_thread_copy(&tsk->thread_info);
 		break;
 	}
 
@@ -448,26 +448,25 @@ void __init vfp_disable(void)
 #ifdef CONFIG_CPU_PM
 static int vfp_pm_suspend(void)
 {
-	struct thread_info *ti = current_thread_info();
 	u32 fpexc = fmrx(FPEXC);
 
 	/* if vfp is on, then save state for resumption */
 	if (fpexc & FPEXC_EN) {
 		pr_debug("%s: saving vfp state\n", __func__);
-		vfp_save_state(&ti->vfpstate, fpexc);
+		vfp_save_state(&current->thread_info.vfpstate, fpexc);
 
 		/* disable, just in case */
 		fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
-	} else if (vfp_current_hw_state[ti->cpu]) {
+	} else if (vfp_current_hw_state[current->cpu]) {
 #ifndef CONFIG_SMP
 		fmxr(FPEXC, fpexc | FPEXC_EN);
-		vfp_save_state(vfp_current_hw_state[ti->cpu], fpexc);
+		vfp_save_state(vfp_current_hw_state[current->cpu], fpexc);
 		fmxr(FPEXC, fpexc);
 #endif
 	}
 
 	/* clear any information we had about last context state */
-	vfp_current_hw_state[ti->cpu] = NULL;
+	vfp_current_hw_state[current->cpu] = NULL;
 
 	return 0;
 }
-- 
2.33.0



* Re: [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK
  2021-09-02 15:54 [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK Keith Packard
  2021-09-02 15:54 ` [PATCH 1/2] ARM: Add per-cpu variable holding cpu number Keith Packard
  2021-09-02 15:54 ` [PATCH 2/2] ARM: Move thread_info into task_struct Keith Packard
@ 2021-09-02 16:07 ` Kees Cook
  2021-09-02 16:18   ` Ard Biesheuvel
  2021-09-02 16:54   ` Russell King (Oracle)
  2021-09-02 16:53 ` Russell King (Oracle)
  2021-09-04  6:09 ` [PATCH 0/2] ARM: support THREAD_INFO_IN_TASK (v7 only) (v2) Keith Packard
  4 siblings, 2 replies; 36+ messages in thread
From: Kees Cook @ 2021-09-02 16:07 UTC (permalink / raw)
  To: Keith Packard
  Cc: linux-kernel, Abbott Liu, Alexander Sverdlin, Al Viro,
	Andrew Morton, Anshuman Khandual, Ard Biesheuvel, Arnd Bergmann,
	Bjorn Andersson, Florian Fainelli, Geert Uytterhoeven,
	Hartley Sweeten, Jens Axboe, Jian Cai, Joe Perches,
	Linus Walleij, linux-arm-kernel, Maninder Singh,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Mike Rapoport, Nick Desaulniers, Nick Desaulniers, Nicolas Pitre,
	Peter Zijlstra, Russell King, Thomas Gleixner,
	Uwe Kleine-König, Valentin Schneider, Vaneet Narang,
	Wolfram Sang (Renesas),
	YiFei Zhu, Keith Packard, linux-hardening

On Thu, Sep 02, 2021 at 08:54:26AM -0700, Keith Packard wrote:
> Placing thread_info in the kernel stack leaves it vulnerable to stack
> overflow attacks. This short series addresses that by using the
> existing THREAD_INFO_IN_TASK infrastructure.

Very cool! Thanks for working on this. If you want, you can refer to the
KSPP bug for this too:
https://github.com/KSPP/linux/issues/1

(Anyone want to do MIPS?)

-Kees

-- 
Kees Cook


* Re: [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK
  2021-09-02 16:07 ` [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK Kees Cook
@ 2021-09-02 16:18   ` Ard Biesheuvel
  2021-09-02 17:37     ` Kees Cook
  2021-09-02 16:54   ` Russell King (Oracle)
  1 sibling, 1 reply; 36+ messages in thread
From: Ard Biesheuvel @ 2021-09-02 16:18 UTC (permalink / raw)
  To: Kees Cook
  Cc: Keith Packard, Linux Kernel Mailing List, Abbott Liu,
	Alexander Sverdlin, Al Viro, Andrew Morton, Anshuman Khandual,
	Arnd Bergmann, Bjorn Andersson, Florian Fainelli,
	Geert Uytterhoeven, Hartley Sweeten, Jens Axboe, Jian Cai,
	Joe Perches, Linus Walleij, Linux ARM, Maninder Singh,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Mike Rapoport, Nick Desaulniers, Nick Desaulniers, Nicolas Pitre,
	Peter Zijlstra, Russell King, Thomas Gleixner,
	Uwe Kleine-König, Valentin Schneider, Vaneet Narang,
	Wolfram Sang (Renesas),
	YiFei Zhu, Keith Packard, linux-hardening

On Thu, 2 Sept 2021 at 18:07, Kees Cook <keescook@chromium.org> wrote:
>
> On Thu, Sep 02, 2021 at 08:54:26AM -0700, Keith Packard wrote:
> > Placing thread_info in the kernel stack leaves it vulnerable to stack
> > overflow attacks. This short series addresses that by using the
> > existing THREAD_INFO_IN_TASK infrastructure.
>
> Very cool! Thanks for working on this. If you want, you can refer to the
> KSPP bug for this too:
> https://github.com/KSPP/linux/issues/1
>
> (Anyone want to do MIPS?)
>

I take it this breaks the GCC plugin based per-task stack protector,
given that it emits code to mask the stack pointer and apply an offset
to the resulting value.
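
(Roughly what the generated code computes today -- a C rendering with
an illustrative helper name, assuming the current stack-derived
thread_info and its stack_canary member:)

static inline unsigned long per_task_canary(unsigned long sp)
{
	struct thread_info *ti;

	/* mask the stack pointer ... */
	ti = (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
	/* ... then apply the stack_canary (TI_STACK_CANARY) offset */
	return ti->stack_canary;
}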

It would be nice if we could replace this with something suitable for
THREAD_INFO_IN_TASK, and if it is suitable enough, try and get the
GCC/Clang folks to adopt it as well (which was never going to happen
for the stack pointer mask/offset approach)

Where can I find these patches? I don't see them on linux-arm-kernel@


* Re: [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK
  2021-09-02 15:54 [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK Keith Packard
                   ` (2 preceding siblings ...)
  2021-09-02 16:07 ` [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK Kees Cook
@ 2021-09-02 16:53 ` Russell King (Oracle)
  2021-09-02 17:35   ` Kees Cook
  2021-09-02 17:58   ` Keith Packard
  2021-09-04  6:09 ` [PATCH 0/2] ARM: support THREAD_INFO_IN_TASK (v7 only) (v2) Keith Packard
  4 siblings, 2 replies; 36+ messages in thread
From: Russell King (Oracle) @ 2021-09-02 16:53 UTC (permalink / raw)
  To: Keith Packard
  Cc: linux-kernel, Abbott Liu, Alexander Sverdlin, Al Viro,
	Andrew Morton, Anshuman Khandual, Ard Biesheuvel, Arnd Bergmann,
	Bjorn Andersson, Florian Fainelli, Geert Uytterhoeven,
	Hartley Sweeten, Jens Axboe, Jian Cai, Joe Perches,
	Linus Walleij, linux-arm-kernel, Maninder Singh,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Mike Rapoport, Nick Desaulniers, Nick Desaulniers, Nicolas Pitre,
	Peter Zijlstra, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Vaneet Narang, Wolfram Sang (Renesas),
	YiFei Zhu, Keith Packard

On Thu, Sep 02, 2021 at 08:54:26AM -0700, Keith Packard wrote:
> Placing thread_info in the kernel stack leaves it vulnerable to stack
> overflow attacks. This short series addresses that by using the
> existing THREAD_INFO_IN_TASK infrastructure.
> 
> As this is my first patch in this part of the kernel, I'm looking for
> feedback about the general approach as well as specific comments on
> places where I've missed something.
> 
> I've only run this on armhf running under qemu, so while I've tried to
> make patches for other code paths, I haven't been able to test those.
> 
> (yes, I know checkpatch.pl complains about whitespace in asm-offsets.c, I
> decided to leave the existing whitespace alone)
> 
> Signed-off-by: Keith Packard <keithpac@amazon.com>

I think you're introducing a circular dependency with this for
certain kernel configurations:

E.g. Have you tried running this with CONFIG_CPU_V6 enabled?

+#define raw_smp_processor_id() this_cpu_read(cpu_number)
+#define __smp_processor_id() __this_cpu_read(cpu_number)
+
+DECLARE_PER_CPU_READ_MOSTLY(unsigned int, cpu_number);

this_cpu_read() is defined as:

#define this_cpu_read(pcp) __pcpu_size_call_return(this_cpu_read_, pcp)
(which will call this_cpu_read_4)

#define this_cpu_read_4(pcp)            this_cpu_generic_read(pcp)
=> __this_cpu_generic_read_nopreempt()
=>         ___ret = READ_ONCE(*raw_cpu_ptr(&(pcp)));

#define raw_cpu_ptr(ptr)                                                \
({                                                                      \
        __verify_pcpu_ptr(ptr);                                         \
        arch_raw_cpu_ptr(ptr);                                          \
})

#ifndef arch_raw_cpu_ptr
#define arch_raw_cpu_ptr(ptr) SHIFT_PERCPU_PTR(ptr, __my_cpu_offset)
#endif

#ifndef __my_cpu_offset
#define __my_cpu_offset per_cpu_offset(raw_smp_processor_id())
#endif

... which then leads back to a use of raw_smp_processor_id(), thereby
creating a circular loop of preprocessor definitions that the compiler
can't resolve.
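
A stripped-down, self-contained illustration of that loop, with
simplified macro bodies standing in for the real kernel definitions:

#define per_cpu_offset(x)	(__per_cpu_offset[(x)])
#define __my_cpu_offset		per_cpu_offset(raw_smp_processor_id())
#define SHIFT_PERCPU_PTR(p, off) \
	((typeof(p))((unsigned long)(p) + (off)))
#define this_cpu_read(pcp)	(*SHIFT_PERCPU_PTR(&(pcp), __my_cpu_offset))
#define raw_smp_processor_id()	this_cpu_read(cpu_number)

extern unsigned long __per_cpu_offset[];
extern unsigned int cpu_number;

unsigned int whoami(void)
{
	/* Expanding raw_smp_processor_id() reaches __my_cpu_offset,
	 * which expands back into raw_smp_processor_id(); the
	 * preprocessor stops re-expanding it there, leaving a call to
	 * an undeclared function, so the build fails. */
	return raw_smp_processor_id();
}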

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


* Re: [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK
  2021-09-02 16:07 ` [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK Kees Cook
  2021-09-02 16:18   ` Ard Biesheuvel
@ 2021-09-02 16:54   ` Russell King (Oracle)
  1 sibling, 0 replies; 36+ messages in thread
From: Russell King (Oracle) @ 2021-09-02 16:54 UTC (permalink / raw)
  To: Kees Cook
  Cc: Keith Packard, linux-kernel, Abbott Liu, Alexander Sverdlin,
	Al Viro, Andrew Morton, Anshuman Khandual, Ard Biesheuvel,
	Arnd Bergmann, Bjorn Andersson, Florian Fainelli,
	Geert Uytterhoeven, Hartley Sweeten, Jens Axboe, Jian Cai,
	Joe Perches, Linus Walleij, linux-arm-kernel, Maninder Singh,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Mike Rapoport, Nick Desaulniers, Nick Desaulniers, Nicolas Pitre,
	Peter Zijlstra, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Vaneet Narang, Wolfram Sang (Renesas),
	YiFei Zhu, Keith Packard, linux-hardening

On Thu, Sep 02, 2021 at 09:07:42AM -0700, Kees Cook wrote:
> On Thu, Sep 02, 2021 at 08:54:26AM -0700, Keith Packard wrote:
> > Placing thread_info in the kernel stack leaves it vulnerable to stack
> > overflow attacks. This short series addresses that by using the
> > existing THREAD_INFO_IN_TASK infrastructure.
> 
> Very cool! Thanks for working on this. If you want, you can refer to the
> KSPP bug for this too:
> https://github.com/KSPP/linux/issues/1

Not so fast. It's buggy. I've rejected this "solution" before.

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!


* Re: [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK
  2021-09-02 16:53 ` Russell King (Oracle)
@ 2021-09-02 17:35   ` Kees Cook
  2021-09-02 17:58   ` Keith Packard
  1 sibling, 0 replies; 36+ messages in thread
From: Kees Cook @ 2021-09-02 17:35 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: Keith Packard, linux-kernel, Abbott Liu, Alexander Sverdlin,
	Al Viro, Andrew Morton, Anshuman Khandual, Ard Biesheuvel,
	Arnd Bergmann, Bjorn Andersson, Florian Fainelli,
	Geert Uytterhoeven, Hartley Sweeten, Jens Axboe, Jian Cai,
	Joe Perches, Linus Walleij, linux-arm-kernel, Maninder Singh,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Mike Rapoport, Nick Desaulniers, Nicolas Pitre, Peter Zijlstra,
	Thomas Gleixner, Uwe Kleine-König, Valentin Schneider,
	Vaneet Narang, Wolfram Sang (Renesas),
	YiFei Zhu, Keith Packard

On Thu, Sep 02, 2021 at 05:53:53PM +0100, Russell King (Oracle) wrote:
> On Thu, Sep 02, 2021 at 08:54:26AM -0700, Keith Packard wrote:
> > Placing thread_info in the kernel stack leaves it vulnerable to stack
> > overflow attacks. This short series addresses that by using the
> > existing THREAD_INFO_IN_TASK infrastructure.
> > 
> > As this is my first patch in this part of the kernel, I'm looking for
> > feedback about the general approach as well as specific comments on
> > places where I've missed something.
> > 
> > I've only run this on armhf running under qemu, so while I've tried to
> > make patches for other code paths, I haven't been able to test those.
> > 
> > (yes, I know checkpatch.pl complains about whitespace in asm-offsets.c, I
> > decided to leave the existing whitespace alone)
> > 
> > Signed-off-by: Keith Packard <keithpac@amazon.com>
> 
> I think you're introducing a circular dependency with this for
> certain kernel configurations:
> 
> E.g. Have you tried running this with CONFIG_CPU_V6 enabled?
> 
> +#define raw_smp_processor_id() this_cpu_read(cpu_number)
> +#define __smp_processor_id() __this_cpu_read(cpu_number)
> +
> +DECLARE_PER_CPU_READ_MOSTLY(unsigned int, cpu_number);
> 
> this_cpu_read() is defined as:
> 
> #define this_cpu_read(pcp) __pcpu_size_call_return(this_cpu_read_, pcp)
> (which will call this_cpu_read_4)
> 
> #define this_cpu_read_4(pcp)            this_cpu_generic_read(pcp)
> => __this_cpu_generic_read_nopreempt()
> =>         ___ret = READ_ONCE(*raw_cpu_ptr(&(pcp)));
> 
> #define raw_cpu_ptr(ptr)                                                \
> ({                                                                      \
>         __verify_pcpu_ptr(ptr);                                         \
>         arch_raw_cpu_ptr(ptr);                                          \
> })
> 
> #ifndef arch_raw_cpu_ptr
> #define arch_raw_cpu_ptr(ptr) SHIFT_PERCPU_PTR(ptr, __my_cpu_offset)
> #endif
> 
> #ifndef __my_cpu_offset
> #define __my_cpu_offset per_cpu_offset(raw_smp_processor_id())
> #endif
> 
> ... which then leads back to a use of raw_smp_processor_id(), thereby
> creating a circular loop of preprocessor definitions that the compiler
> can't resolve.

If this isn't easy to fix, perhaps this can be a V7-only feature?

-- 
Kees Cook


* Re: [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK
  2021-09-02 16:18   ` Ard Biesheuvel
@ 2021-09-02 17:37     ` Kees Cook
  0 siblings, 0 replies; 36+ messages in thread
From: Kees Cook @ 2021-09-02 17:37 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Keith Packard, Linux Kernel Mailing List, Abbott Liu,
	Alexander Sverdlin, Al Viro, Andrew Morton, Anshuman Khandual,
	Arnd Bergmann, Bjorn Andersson, Florian Fainelli,
	Geert Uytterhoeven, Hartley Sweeten, Jens Axboe, Jian Cai,
	Joe Perches, Linus Walleij, Linux ARM, Maninder Singh,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Mike Rapoport, Nick Desaulniers, Nick Desaulniers, Nicolas Pitre,
	Peter Zijlstra, Russell King, Thomas Gleixner,
	Uwe Kleine-König, Valentin Schneider, Vaneet Narang,
	Wolfram Sang (Renesas),
	YiFei Zhu, Keith Packard, linux-hardening

On Thu, Sep 02, 2021 at 06:18:29PM +0200, Ard Biesheuvel wrote:
> On Thu, 2 Sept 2021 at 18:07, Kees Cook <keescook@chromium.org> wrote:
> >
> > On Thu, Sep 02, 2021 at 08:54:26AM -0700, Keith Packard wrote:
> > > Placing thread_info in the kernel stack leaves it vulnerable to stack
> > > overflow attacks. This short series addresses that by using the
> > > existing THREAD_INFO_IN_TASK infrastructure.
> >
> > Very cool! Thanks for working on this. If you want, you can refer to the
> > KSPP bug for this too:
> > https://github.com/KSPP/linux/issues/1
> >
> > (Anyone want to do MIPS?)
> >
> 
> I take it this breaks the GCC plugin based per-task stack protector,
> given that it emits code to mask the stack pointer and apply an offset
> to the resulting value.
> 
> It would be nice if we could replace this with something suitable for
> THREAD_INFO_IN_TASK, and if it is suitable enough, try and get the
> GCC/Clang folks to adopt it as well (which was never going to happen
> for the stack pointer mask/offset approach)

I'd love to see the native GCC offset stuff work on arm32, but it's not
clear to me how much work that would be. It's implemented for several
architectures already. I've tried to capture the matrix here:
https://github.com/KSPP/linux/issues/29

-Kees

-- 
Kees Cook


* RE: [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK
  2021-09-02 16:53 ` Russell King (Oracle)
  2021-09-02 17:35   ` Kees Cook
@ 2021-09-02 17:58   ` Keith Packard
  1 sibling, 0 replies; 36+ messages in thread
From: Keith Packard @ 2021-09-02 17:58 UTC (permalink / raw)
  To: Russell King (Oracle)
  Cc: linux-kernel, Abbott Liu, Alexander Sverdlin, Al Viro,
	Andrew Morton, Anshuman Khandual, Ard Biesheuvel, Arnd Bergmann,
	Bjorn Andersson, Florian Fainelli, Geert Uytterhoeven,
	Hartley Sweeten, Jens Axboe, Jian Cai, Joe Perches,
	Linus Walleij, linux-arm-kernel, Maninder Singh,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Mike Rapoport, Nick Desaulniers, Nick Desaulniers, Nicolas Pitre,
	Peter Zijlstra, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Vaneet Narang, Wolfram Sang (Renesas),
	YiFei Zhu


"Russell King (Oracle)" <linux@armlinux.org.uk> writes:

> I think you're introducing a circular dependency with this for
> certain kernel configurations:
>
> E.g. Have you tried running this with CONFIG_CPU_V6 enabled?

That's very useful feedback -- no, I hadn't ever tried this
configuration (presumably the compiler would have complained).

> #define __my_cpu_offset per_cpu_offset(raw_smp_processor_id())

I've read through the v6 architecture reference manual (again) and
cannot find any place to store either a per_cpu_offset or even a small
processor id, which I didn't find surprising as I would have expected it
to already be used for this purpose.

I'll re-spin the changes making them conditional on !CONFIG_CPU_V6,
unless someone has an idea of how to identify the current core in a
multi-core ARMv6 system.

-- 
-keith



* [PATCH 0/2] ARM: support THREAD_INFO_IN_TASK (v7 only) (v2)
  2021-09-02 15:54 [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK Keith Packard
                   ` (3 preceding siblings ...)
  2021-09-02 16:53 ` Russell King (Oracle)
@ 2021-09-04  6:09 ` Keith Packard
  2021-09-04  6:09   ` [PATCH 1/3] ARM: Pass cpu number to secondary_start_kernel Keith Packard
                     ` (3 more replies)
  4 siblings, 4 replies; 36+ messages in thread
From: Keith Packard @ 2021-09-04  6:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Abbott Liu, Alexander Sverdlin, Andrew Morton, Anshuman Khandual,
	Ard Biesheuvel, Arnd Bergmann, Bjorn Andersson, Florian Fainelli,
	Geert Uytterhoeven, Hartley Sweeten, Jens Axboe, Jian Cai,
	Joe Perches, Kees Cook, Keith Packard, Krzysztof Kozlowski,
	Linus Walleij, linux-arm-kernel, Manivannan Sadhasivam,
	Marc Zyngier, Masahiro Yamada, Miguel Ojeda, Mike Rapoport,
	Nathan Chancellor, Nick Desaulniers, Nicolas Pitre, Rob Herring,
	Russell King, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu, Keith Packard

Placing thread_info in the kernel stack leaves it vulnerable to stack
overflow attacks. This short series addresses that by using the
existing THREAD_INFO_IN_TASK infrastructure.

This is the second version of this series. In this version the changes
are restricted to v7 hardware, which offers a way to identify each cpu
in the system without reference to the stack it is using.

The series is broken into three pieces:

 1) Change the secondary_start_kernel API to pass the cpu number to
    this function. This is required for the following patch because the
    raw_smp_processor_id() macro will use the per_cpu_offset value, which
    needs the cpu number in order to produce the right value.

 2) Enable THREAD_INFO_IN_TASK by creating a new per-cpu variable,
    current_task, just like the x86 architecture. The largest changes
    are in the assembly code where fetching the current_task value
    requires a temporary register. Fortunately, each location in the
    code performing this had a reasonably obvious register to use.

 3) Optimize access to the cpu number using another new per-cpu
    variable. This is not functionally necessary, but avoids
    de-referencing through two pointers at modest memory cost.

Signed-off-by: Keith Packard <keithpac@amazon.com>




* [PATCH 1/3] ARM: Pass cpu number to secondary_start_kernel
  2021-09-04  6:09 ` [PATCH 0/2] ARM: support THREAD_INFO_IN_TASK (v7 only) (v2) Keith Packard
@ 2021-09-04  6:09   ` Keith Packard
  2021-09-05 20:25     ` Ard Biesheuvel
  2021-09-04  6:09   ` [PATCH 2/3] ARM: Move thread_info into task_struct (v7 only) Keith Packard
                     ` (2 subsequent siblings)
  3 siblings, 1 reply; 36+ messages in thread
From: Keith Packard @ 2021-09-04  6:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Abbott Liu, Alexander Sverdlin, Andrew Morton, Anshuman Khandual,
	Ard Biesheuvel, Arnd Bergmann, Bjorn Andersson, Florian Fainelli,
	Geert Uytterhoeven, Hartley Sweeten, Jens Axboe, Jian Cai,
	Joe Perches, Kees Cook, Keith Packard, Krzysztof Kozlowski,
	Linus Walleij, linux-arm-kernel, Manivannan Sadhasivam,
	Marc Zyngier, Masahiro Yamada, Miguel Ojeda, Mike Rapoport,
	Nathan Chancellor, Nick Desaulniers, Nicolas Pitre, Rob Herring,
	Russell King, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu, Keith Packard

Instead of pulling the CPU value out of the thread_info struct, pass
it as an argument. When first initializing secondary processors, this
is done by stashing the value in the secondary_data struct. When
restarting idle processors, the previous CPU value is passed.

Because the cpu is now known at the top of secondary_start_kernel, the
per_cpu_offset value can now be set at this point, instead of in
cpu_init where it was also incorrectly setting the per_cpu_offset for
the boot processor before that value had been computed.

Signed-off-by: Keith Packard <keithpac@amazon.com>
---
 arch/arm/include/asm/smp.h   |  3 ++-
 arch/arm/kernel/head-nommu.S |  1 +
 arch/arm/kernel/head.S       |  1 +
 arch/arm/kernel/setup.c      |  6 ------
 arch/arm/kernel/smp.c        | 14 +++++++++-----
 5 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 5d508f5d56c4..86a7fd721556 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -48,7 +48,7 @@ extern void set_smp_ipi_range(int ipi_base, int nr_ipi);
  * Called from platform specific assembly code, this is the
  * secondary CPU entry point.
  */
-asmlinkage void secondary_start_kernel(void);
+asmlinkage void secondary_start_kernel(unsigned int cpu);
 
 
 /*
@@ -61,6 +61,7 @@ struct secondary_data {
 	};
 	unsigned long swapper_pg_dir;
 	void *stack;
+	unsigned int cpu;
 };
 extern struct secondary_data secondary_data;
 extern void secondary_startup(void);
diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S
index 0fc814bbc34b..5aa8ef42717f 100644
--- a/arch/arm/kernel/head-nommu.S
+++ b/arch/arm/kernel/head-nommu.S
@@ -114,6 +114,7 @@ ENTRY(secondary_startup)
 	add	r12, r12, r10
 	ret	r12
 1:	bl	__after_proc_init
+	ldr	r0, [r7, #16]			@ set up cpu number
 	ldr	sp, [r7, #12]			@ set up the stack pointer
 	mov	fp, #0
 	b	secondary_start_kernel
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 7f62c5eccdf3..0e541af738e2 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -394,6 +394,7 @@ ENDPROC(secondary_startup_arm)
 
 ENTRY(__secondary_switched)
 	ldr_l	r7, secondary_data + 12		@ get secondary_data.stack
+	ldr_l	r0, secondary_data + 16		@ get secondary_data.cpu
 	mov	sp, r7
 	mov	fp, #0
 	b	secondary_start_kernel
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 73ca7797b92f..ca0201635fac 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -532,12 +532,6 @@ void notrace cpu_init(void)
 		BUG();
 	}
 
-	/*
-	 * This only works on resume and secondary cores. For booting on the
-	 * boot cpu, smp_prepare_boot_cpu is called after percpu area setup.
-	 */
-	set_my_cpu_offset(per_cpu_offset(cpu));
-
 	cpu_proc_init();
 
 	/*
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 74679240a9d8..55cb1689a4b3 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -153,6 +153,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 	secondary_data.pgdir = virt_to_phys(idmap_pgd);
 	secondary_data.swapper_pg_dir = get_arch_pgd(swapper_pg_dir);
 #endif
+	secondary_data.cpu = cpu;
 	sync_cache_w(&secondary_data);
 
 	/*
@@ -373,11 +374,14 @@ void arch_cpu_idle_dead(void)
 	 * cpu initialisation.  There's some initialisation which needs
 	 * to be repeated to undo the effects of taking the CPU offline.
 	 */
-	__asm__("mov	sp, %0\n"
+	__asm__("mov	r0, %1\n"
+	"	mov	sp, %0\n"
 	"	mov	fp, #0\n"
 	"	b	secondary_start_kernel"
 		:
-		: "r" (task_stack_page(current) + THREAD_SIZE - 8));
+		: "r" (task_stack_page(current) + THREAD_SIZE - 8),
+		  "r" (cpu)
+		: "r0");
 }
 #endif /* CONFIG_HOTPLUG_CPU */
 
@@ -400,10 +404,11 @@ static void smp_store_cpu_info(unsigned int cpuid)
  * This is the secondary CPU boot entry.  We're using this CPUs
  * idle thread stack, but a set of temporary page tables.
  */
-asmlinkage void secondary_start_kernel(void)
+asmlinkage void secondary_start_kernel(unsigned int cpu)
 {
 	struct mm_struct *mm = &init_mm;
-	unsigned int cpu;
+
+	set_my_cpu_offset(per_cpu_offset(cpu));
 
 	secondary_biglittle_init();
 
@@ -420,7 +425,6 @@ asmlinkage void secondary_start_kernel(void)
 	 * All kernel threads share the same mm context; grab a
 	 * reference and switch to it.
 	 */
-	cpu = smp_processor_id();
 	mmgrab(mm);
 	current->active_mm = mm;
 	cpumask_set_cpu(cpu, mm_cpumask(mm));
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 2/3] ARM: Move thread_info into task_struct (v7 only)
  2021-09-04  6:09 ` [PATCH 0/2] ARM: support THREAD_INFO_IN_TASK (v7 only) (v2) Keith Packard
  2021-09-04  6:09   ` [PATCH 1/3] ARM: Pass cpu number to secondary_start_kernel Keith Packard
@ 2021-09-04  6:09   ` Keith Packard
  2021-09-05 20:56     ` Ard Biesheuvel
  2021-09-04  6:09   ` [PATCH 3/3] ARM: Add per-cpu variable cpu_number " Keith Packard
  2021-09-07 22:00   ` [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3) Keith Packard
  3 siblings, 1 reply; 36+ messages in thread
From: Keith Packard @ 2021-09-04  6:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Abbott Liu, Alexander Sverdlin, Andrew Morton, Anshuman Khandual,
	Ard Biesheuvel, Arnd Bergmann, Bjorn Andersson, Florian Fainelli,
	Geert Uytterhoeven, Hartley Sweeten, Jens Axboe, Jian Cai,
	Joe Perches, Kees Cook, Keith Packard, Krzysztof Kozlowski,
	Linus Walleij, linux-arm-kernel, Manivannan Sadhasivam,
	Marc Zyngier, Masahiro Yamada, Miguel Ojeda, Mike Rapoport,
	Nathan Chancellor, Nick Desaulniers, Nicolas Pitre, Rob Herring,
	Russell King, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu, Keith Packard

This defeats many stack overflow attacks that modify the thread_info
structure, by moving that structure into the task_struct as is done in
almost all other architectures.

The task_struct is now referenced by a per-cpu variable,
'current_task', allowing the current one on each cpu to be located
from both C and assembly.

This also involved removing the 'cpu' member from the thread_info
struct and instead using the one added to the task_struct by the
THREAD_INFO_IN_TASK code.

This code is currently enabled only for v7 hardware, as other ARM
architectures do not make the c13 register in the p15 co-processor
available for general OS use; there, identifying the current cpu is
left to the 'cpu' field in the thread_info located off the stack
pointer.
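
From the C side, the new plumbing amounts to the following simplified
sketch (the real code is in the patch below); the assembly side updates
the same per-cpu variable via the new set_current_task macro in
__switch_to:

DEFINE_PER_CPU(struct task_struct *, current_task) = &init_task;

static __always_inline struct task_struct *get_current(void)
{
        return raw_cpu_read(current_task);
}

#define current get_current()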

Signed-off-by: Keith Packard <keithpac@amazon.com>
---
 arch/arm/Kconfig                   |  1 +
 arch/arm/Makefile                  |  8 ++++
 arch/arm/include/asm/assembler.h   | 66 ++++++++++++++++++++++++++----
 arch/arm/include/asm/current.h     | 41 +++++++++++++++++++
 arch/arm/include/asm/smp.h         | 18 ++++++++
 arch/arm/include/asm/thread_info.h | 17 ++++++++
 arch/arm/kernel/asm-offsets.c      |  4 ++
 arch/arm/kernel/entry-armv.S       | 17 ++++----
 arch/arm/kernel/entry-common.S     |  6 +--
 arch/arm/kernel/entry-v7m.S        |  7 +++-
 arch/arm/kernel/iwmmxt.S           |  2 +-
 arch/arm/kernel/smp.c              | 12 ++++++
 arch/arm/lib/copy_from_user.S      |  2 +-
 arch/arm/lib/copy_to_user.S        |  2 +-
 arch/arm/mach-ep93xx/crunch-bits.S |  2 +-
 arch/arm/mm/proc-macros.S          |  2 +-
 arch/arm/vfp/entry.S               |  4 ++
 arch/arm/vfp/vfpmodule.c           | 25 +++++++----
 18 files changed, 204 insertions(+), 32 deletions(-)
 create mode 100644 arch/arm/include/asm/current.h

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 24804f11302d..1c1ded500a2b 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -128,6 +128,7 @@ config ARM
 	select RTC_LIB
 	select SET_FS
 	select SYS_SUPPORTS_APM_EMULATION
+	select THREAD_INFO_IN_TASK if CPU_V7
 	# Above selects are sorted alphabetically; please add new ones
 	# according to that.  Thanks.
 	help
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 415c3514573a..71a2ba4549d3 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -284,6 +284,14 @@ stack_protector_prepare: prepare0
 	$(eval GCC_PLUGINS_CFLAGS += $(SSP_PLUGIN_CFLAGS))
 endif
 
+ifdef CONFIG_SMP
+prepare: task_cpu_prepare
+
+PHONY += task_cpu_prepare
+task_cpu_prepare: prepare0
+	$(eval KBUILD_CFLAGS += -D_TSK_CPU=$(shell awk '{if ($$2 == "TSK_CPU") print $$3;}' include/generated/asm-offsets.h))
+endif
+
 all:	$(notdir $(KBUILD_IMAGE))
 
 
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index e2b1fd558bf3..2ce7403f9298 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -199,16 +199,68 @@
 	.endm
 	.endr
 
+#ifdef CONFIG_THREAD_INFO_IN_TASK
 /*
- * Get current thread_info.
+ * Get per-CPU offset
+ * rd: Destination register
  */
-	.macro	get_thread_info, rd
- ARM(	mov	\rd, sp, lsr #THREAD_SIZE_ORDER + PAGE_SHIFT	)
- THUMB(	mov	\rd, sp			)
- THUMB(	lsr	\rd, \rd, #THREAD_SIZE_ORDER + PAGE_SHIFT	)
-	mov	\rd, \rd, lsl #THREAD_SIZE_ORDER + PAGE_SHIFT
+	.macro this_cpu_offset rd:req
+	mrc p15, 0, \rd, c13, c0, 4
 	.endm
 
+/*
+ * Load a per-cpu variable
+ * @dst: Destination register to receive value
+ * @sym: The name of the per-cpu variable
+ * @tmp: scratch register
+ */
+	.macro ldr_this_cpu dst : req, sym : req, tmp : req
+	movw	\dst, #:lower16:\sym
+	movt	\dst, #:upper16:\sym
+	this_cpu_offset \tmp
+	ldr	\dst, [\dst, \tmp]
+	.endm
+
+/*
+ * Store a value in a per-cpu variable
+ * @src: Source register with value
+ * @sym: The name of the per-cpu variable
+ * @t1: scratch register 1
+ * @t2: scratch register 2
+ */
+	.macro str_this_cpu src : req, sym : req, t1 : req, t2 : req
+	movw	\t1, #:lower16:\sym
+	movt	\t1, #:upper16:\sym
+	this_cpu_offset \t2
+	str	\src, [\t1, \t2]
+	.endm
+#endif
+
+/*
+ * Get current thread_info
+ * @dst: Destination register to receive current thread info
+ * @tmp: scratch register
+ */
+	.macro get_current, dst : req, tmp : req
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+	ldr_this_cpu \dst, current_task, \tmp
+#else
+ ARM(	mov	\dst, sp, lsr #THREAD_SIZE_ORDER + PAGE_SHIFT	)
+ THUMB(	mov	\dst, sp			)
+ THUMB(	lsr	\dst, \dst, #THREAD_SIZE_ORDER + PAGE_SHIFT	)
+	mov	\dst, \dst, lsl #THREAD_SIZE_ORDER + PAGE_SHIFT	
+#endif
+	.endm
+
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+/*
+ * Set current task info
+ */
+	.macro	set_current_task rs : req, t1 : req, t2 : req
+	str_this_cpu \rs, current_task, \t1, \t2
+	.endm
+#endif
+
 /*
  * Increment/decrement the preempt count.
  */
@@ -226,7 +278,7 @@
 	.endm
 
 	.macro	dec_preempt_count_ti, ti, tmp
-	get_thread_info \ti
+	get_current \ti, \tmp
 	dec_preempt_count \ti, \tmp
 	.endm
 #else
diff --git a/arch/arm/include/asm/current.h b/arch/arm/include/asm/current.h
new file mode 100644
index 000000000000..7160315eb569
--- /dev/null
+++ b/arch/arm/include/asm/current.h
@@ -0,0 +1,41 @@
+/*
+ * Copyright © 2021 Keith Packard <keithp@keithp.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation, either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ */
+
+#ifndef _ASM_ARM_CURRENT_H
+#define _ASM_ARM_CURRENT_H
+
+#include <linux/compiler.h>
+#include <linux/thread_info.h>
+#include <asm/percpu.h>
+
+#ifndef __ASSEMBLY__
+
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+struct task_struct;
+
+DECLARE_PER_CPU(struct task_struct *, current_task);
+
+static __always_inline struct task_struct *get_current(void)
+{
+	return raw_cpu_read(current_task);
+}
+
+#define current get_current()
+#else
+#include <asm-generic/current.h>
+#endif
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_ARM_CURRENT_H */
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 86a7fd721556..1c38d1fde641 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -15,7 +15,25 @@
 # error "<asm/smp.h> included in non-SMP build"
 #endif
 
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+/*
+ * This is particularly ugly: it appears we can't actually get the definition
+ * of task_struct here, but we need access to the CPU this task is running on.
+ * Instead of using task_struct we're using TSK_CPU which is extracted from
+ * asm-offsets.h by kbuild to get the current processor ID.
+ *
+ * This also needs to be safeguarded when building asm-offsets.s because at
+ * that time TSK_CPU is not defined yet.
+ */
+#ifndef _TSK_CPU
+#define raw_smp_processor_id()		(0)
+#else
+#define raw_smp_processor_id()		(*(unsigned int *)((void *)current + _TSK_CPU))
+#endif
+
+#else
 #define raw_smp_processor_id() (current_thread_info()->cpu)
+#endif
 
 struct seq_file;
 
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 70d4cbc49ae1..cbcd476a095b 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -55,8 +55,10 @@ struct thread_info {
 	unsigned long		flags;		/* low level flags */
 	int			preempt_count;	/* 0 => preemptable, <0 => bug */
 	mm_segment_t		addr_limit;	/* address limit */
+#ifndef CONFIG_THREAD_INFO_IN_TASK
 	struct task_struct	*task;		/* main task structure */
 	__u32			cpu;		/* cpu */
+#endif
 	__u32			cpu_domain;	/* cpu domain */
 #ifdef CONFIG_STACKPROTECTOR_PER_TASK
 	unsigned long		stack_canary;
@@ -75,6 +77,17 @@ struct thread_info {
 #endif
 };
 
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+
+#define INIT_THREAD_INFO(tsk)						\
+{									\
+	.flags		= 0,						\
+	.preempt_count	= INIT_PREEMPT_COUNT,				\
+	.addr_limit	= KERNEL_DS,					\
+}
+
+#else
+
 #define INIT_THREAD_INFO(tsk)						\
 {									\
 	.task		= &tsk,						\
@@ -83,6 +96,9 @@ struct thread_info {
 	.addr_limit	= KERNEL_DS,					\
 }
 
+#endif
+
+#ifndef CONFIG_THREAD_INFO_IN_TASK
 /*
  * how to get the thread information struct from C
  */
@@ -93,6 +109,7 @@ static inline struct thread_info *current_thread_info(void)
 	return (struct thread_info *)
 		(current_stack_pointer & ~(THREAD_SIZE - 1));
 }
+#endif
 
 #define thread_saved_pc(tsk)	\
 	((unsigned long)(task_thread_info(tsk)->cpu_context.pc))
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 70993af22d80..c91dcfe34bdd 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -44,8 +44,12 @@ int main(void)
   DEFINE(TI_FLAGS,		offsetof(struct thread_info, flags));
   DEFINE(TI_PREEMPT,		offsetof(struct thread_info, preempt_count));
   DEFINE(TI_ADDR_LIMIT,		offsetof(struct thread_info, addr_limit));
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+  DEFINE(TSK_CPU,		offsetof(struct task_struct, cpu));
+#else
   DEFINE(TI_TASK,		offsetof(struct thread_info, task));
   DEFINE(TI_CPU,		offsetof(struct thread_info, cpu));
+#endif
   DEFINE(TI_CPU_DOMAIN,		offsetof(struct thread_info, cpu_domain));
   DEFINE(TI_CPU_SAVE,		offsetof(struct thread_info, cpu_context));
   DEFINE(TI_USED_CP,		offsetof(struct thread_info, used_cp));
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 0ea8529a4872..a5d71b0d8897 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -179,7 +179,7 @@ ENDPROC(__und_invalid)
 	@
 	stmia	r7, {r2 - r6}
 
-	get_thread_info tsk
+	get_current tsk, r0
 	uaccess_entry tsk, r0, r1, r2, \uaccess
 
 	.if \trace
@@ -260,7 +260,7 @@ __und_svc:
 	bl	__und_fault
 
 __und_svc_finish:
-	get_thread_info tsk
+	get_current tsk, r5
 	ldr	r5, [sp, #S_PSR]		@ Get SVC cpsr
 	svc_exit r5				@ return from exception
  UNWIND(.fnend		)
@@ -428,7 +428,7 @@ __irq_usr:
 	usr_entry
 	kuser_cmpxchg_check
 	irq_handler
-	get_thread_info tsk
+	get_current tsk, why
 	mov	why, #0
 	b	ret_to_user_from_irq
  UNWIND(.fnend		)
@@ -572,12 +572,12 @@ ENDPROC(__und_usr)
 	@ Fall-through from Thumb-2 __und_usr
 	@
 #ifdef CONFIG_NEON
-	get_thread_info r10			@ get current thread
+	get_current r10, r6			@ get current thread
 	adr	r6, .LCneon_thumb_opcodes
 	b	2f
 #endif
 call_fpe:
-	get_thread_info r10			@ get current thread
+	get_current r10, r6			@ get current thread
 #ifdef CONFIG_NEON
 	adr	r6, .LCneon_arm_opcodes
 2:	ldr	r5, [r6], #4			@ mask value
@@ -722,7 +722,7 @@ __pabt_usr:
 ENTRY(ret_from_exception)
  UNWIND(.fnstart	)
  UNWIND(.cantunwind	)
-	get_thread_info tsk
+	get_current tsk, why
 	mov	why, #0
 	b	ret_to_user
  UNWIND(.fnend		)
@@ -735,7 +735,7 @@ __fiq_usr:
 	kuser_cmpxchg_check
 	mov	r0, sp				@ struct pt_regs *regs
 	bl	handle_fiq_as_nmi
-	get_thread_info tsk
+	get_current tsk, r0
 	restore_user_regs fast = 0, offset = 0
  UNWIND(.fnend		)
 ENDPROC(__fiq_usr)
@@ -771,6 +771,9 @@ ENTRY(__switch_to)
 #endif
 #ifdef CONFIG_CPU_USE_DOMAINS
 	mcr	p15, 0, r6, c3, c0, 0		@ Set domain register
+#endif
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+	set_current_task r2, r4, r5
 #endif
 	mov	r5, r0
 	add	r4, r2, #TI_CPU_SAVE
diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
index 7f0b7aba1498..52580ee463fe 100644
--- a/arch/arm/kernel/entry-common.S
+++ b/arch/arm/kernel/entry-common.S
@@ -156,7 +156,7 @@ ENTRY(ret_from_fork)
 	movne	r0, r4
 	badrne	lr, 1f
 	retne	r5
-1:	get_thread_info tsk
+1:	get_current tsk, r1
 	b	ret_slow_syscall
 ENDPROC(ret_from_fork)
 
@@ -243,7 +243,7 @@ ENTRY(vector_swi)
 	bic	scno, scno, #0xff000000		@ mask off SWI op-code
 	eor	scno, scno, #__NR_SYSCALL_BASE	@ check OS number
 #endif
-	get_thread_info tsk
+	get_current tsk, r10
 	/*
 	 * Reload the registers that may have been corrupted on entry to
 	 * the syscall assembly (by tracing or context tracking.)
@@ -278,7 +278,7 @@ local_restart:
 9001:
 	sub	lr, saved_pc, #4
 	str	lr, [sp, #S_PC]
-	get_thread_info tsk
+	get_current tsk, r1
 	b	ret_fast_syscall
 #endif
 ENDPROC(vector_swi)
diff --git a/arch/arm/kernel/entry-v7m.S b/arch/arm/kernel/entry-v7m.S
index d0e898608d30..f36a7a876085 100644
--- a/arch/arm/kernel/entry-v7m.S
+++ b/arch/arm/kernel/entry-v7m.S
@@ -57,7 +57,7 @@ __irq_entry:
 	tst	r0, V7M_SCB_ICSR_RETTOBASE
 	beq	2f
 
-	get_thread_info tsk
+	get_current tsk, r2
 	ldr	r2, [tsk, #TI_FLAGS]
 	movs	r2, r2, lsl #16
 	beq	2f			@ no work pending
@@ -83,7 +83,7 @@ __pendsv_entry:
 	str	r0, [r1, V7M_SCB_ICSR]	@ clear PendSV
 
 	@ execute the pending work, including reschedule
-	get_thread_info tsk
+	get_current tsk, why
 	mov	why, #0
 	b	ret_to_user_from_irq
 ENDPROC(__pendsv_entry)
@@ -107,6 +107,9 @@ ENTRY(__switch_to)
 	bl	atomic_notifier_call_chain
 	mov	ip, r4
 	mov	r0, r5
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+	set_current_task r2, r4, r5
+#endif
 	ldmia	ip!, {r4 - r11}		@ Load all regs saved previously
 	ldr	sp, [ip]
 	ldr	pc, [ip, #4]!
diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
index d2b4ac06e4ed..781f7c7fca90 100644
--- a/arch/arm/kernel/iwmmxt.S
+++ b/arch/arm/kernel/iwmmxt.S
@@ -96,7 +96,7 @@ ENTRY(iwmmxt_task_enable)
 	bl	concan_save
 
 #ifdef CONFIG_PREEMPT_COUNT
-	get_thread_info r10
+	get_current r10, r3
 #endif
 4:	dec_preempt_count r10, r3
 	ret	r9				@ normal exit from exception
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 55cb1689a4b3..be0ede16dbb1 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -51,6 +51,13 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/ipi.h>
 
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+DEFINE_PER_CPU(struct task_struct *, current_task) ____cacheline_aligned =
+	&init_task;
+EXPORT_PER_CPU_SYMBOL(current_task);
+
+#endif
+
 /*
  * as from 2.5, kernels no longer have an init_tasks structure
  * so we need some other way of telling a new secondary core
@@ -156,6 +163,10 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 	secondary_data.cpu = cpu;
 	sync_cache_w(&secondary_data);
 
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+	per_cpu(current_task, cpu) = idle;
+#endif
+
 	/*
 	 * Now bring the CPU into our world.
 	 */
@@ -509,6 +520,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 	 */
 	if (max_cpus > ncores)
 		max_cpus = ncores;
+
 	if (ncores > 1 && max_cpus) {
 		/*
 		 * Initialise the present map, which describes the set of CPUs
diff --git a/arch/arm/lib/copy_from_user.S b/arch/arm/lib/copy_from_user.S
index f8016e3db65d..5320950100a2 100644
--- a/arch/arm/lib/copy_from_user.S
+++ b/arch/arm/lib/copy_from_user.S
@@ -109,7 +109,7 @@
 
 ENTRY(arm_copy_from_user)
 #ifdef CONFIG_CPU_SPECTRE
-	get_thread_info r3
+	get_current r3, ip
 	ldr	r3, [r3, #TI_ADDR_LIMIT]
 	uaccess_mask_range_ptr r1, r2, r3, ip
 #endif
diff --git a/arch/arm/lib/copy_to_user.S b/arch/arm/lib/copy_to_user.S
index ebfe4cb3d912..e7e61c87893a 100644
--- a/arch/arm/lib/copy_to_user.S
+++ b/arch/arm/lib/copy_to_user.S
@@ -109,7 +109,7 @@
 ENTRY(__copy_to_user_std)
 WEAK(arm_copy_to_user)
 #ifdef CONFIG_CPU_SPECTRE
-	get_thread_info r3
+	get_current r3, ip
 	ldr	r3, [r3, #TI_ADDR_LIMIT]
 	uaccess_mask_range_ptr r0, r2, r3, ip
 #endif
diff --git a/arch/arm/mach-ep93xx/crunch-bits.S b/arch/arm/mach-ep93xx/crunch-bits.S
index fb2dbf76f09e..d0bb34b3d973 100644
--- a/arch/arm/mach-ep93xx/crunch-bits.S
+++ b/arch/arm/mach-ep93xx/crunch-bits.S
@@ -192,7 +192,7 @@ crunch_load:
 
 1:
 #ifdef CONFIG_PREEMPT_COUNT
-	get_thread_info r10
+	get_current r10, r3
 #endif
 2:	dec_preempt_count r10, r3
 	ret	lr
diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
index e2c743aa2eb2..a782c025fdf3 100644
--- a/arch/arm/mm/proc-macros.S
+++ b/arch/arm/mm/proc-macros.S
@@ -30,7 +30,7 @@
  * act_mm - get current->active_mm
  */
 	.macro	act_mm, rd
-	get_thread_info \rd
+	get_current \rd, unused			@ won't build if THREAD_INFO_IN_TASK
 	ldr	\rd, [\rd, #TI_TASK]
 	.if (TSK_ACTIVE_MM > IMM12_MASK)
 	add	\rd, \rd, #TSK_ACTIVE_MM & ~IMM12_MASK
diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
index 27b0a1f27fbd..48cb40a3b72d 100644
--- a/arch/arm/vfp/entry.S
+++ b/arch/arm/vfp/entry.S
@@ -24,7 +24,11 @@
 ENTRY(do_vfp)
 	inc_preempt_count r10, r4
  	ldr	r4, .LCvfp
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+	ldr	r11, [r10, #TSK_CPU]	@ CPU number
+#else
 	ldr	r11, [r10, #TI_CPU]	@ CPU number
+#endif
 	add	r10, r10, #TI_VFPSTATE	@ r10 = workspace
 	ldr	pc, [r4]		@ call VFP entry point
 ENDPROC(do_vfp)
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 2cb355c1b5b7..f929a26a05a5 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -143,22 +143,27 @@ static void vfp_thread_copy(struct thread_info *thread)
  *  THREAD_NOFTIFY_SWTICH:
  *   - the previously running thread will not be scheduled onto another CPU.
  *   - the next thread to be run (v) will not be running on another CPU.
- *   - thread->cpu is the local CPU number
+ *   - tsk->cpu is the local CPU number
  *   - not preemptible as we're called in the middle of a thread switch
  *  THREAD_NOTIFY_FLUSH:
  *   - the thread (v) will be running on the local CPU, so
- *	v === current_thread_info()
- *   - thread->cpu is the local CPU number at the time it is accessed,
+ *	v === current
+ *   - tsk->cpu is the local CPU number at the time it is accessed,
  *	but may change at any time.
  *   - we could be preempted if tree preempt rcu is enabled, so
- *	it is unsafe to use thread->cpu.
+ *	it is unsafe to use tsk->cpu.
  *  THREAD_NOTIFY_EXIT
  *   - we could be preempted if tree preempt rcu is enabled, so
- *	it is unsafe to use thread->cpu.
+ *	it is unsafe to use tsk->cpu.
  */
 static int vfp_notifier(struct notifier_block *self, unsigned long cmd, void *v)
 {
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+	struct task_struct *tsk = v;
+	struct thread_info *thread = &tsk->thread_info;
+#else
 	struct thread_info *thread = v;
+#endif
 	u32 fpexc;
 #ifdef CONFIG_SMP
 	unsigned int cpu;
@@ -169,7 +174,11 @@ static int vfp_notifier(struct notifier_block *self, unsigned long cmd, void *v)
 		fpexc = fmrx(FPEXC);
 
 #ifdef CONFIG_SMP
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+		cpu = tsk->cpu;
+#else
 		cpu = thread->cpu;
+#endif
 
 		/*
 		 * On SMP, if VFP is enabled, save the old state in
@@ -458,16 +467,16 @@ static int vfp_pm_suspend(void)
 
 		/* disable, just in case */
 		fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
-	} else if (vfp_current_hw_state[ti->cpu]) {
+	} else if (vfp_current_hw_state[smp_processor_id()]) {
 #ifndef CONFIG_SMP
 		fmxr(FPEXC, fpexc | FPEXC_EN);
-		vfp_save_state(vfp_current_hw_state[ti->cpu], fpexc);
+		vfp_save_state(vfp_current_hw_state[smp_processor_id()], fpexc);
 		fmxr(FPEXC, fpexc);
 #endif
 	}
 
 	/* clear any information we had about last context state */
-	vfp_current_hw_state[ti->cpu] = NULL;
+	vfp_current_hw_state[smp_processor_id()] = NULL;
 
 	return 0;
 }
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 3/3] ARM: Add per-cpu variable cpu_number (v7 only)
  2021-09-04  6:09 ` [PATCH 0/2] ARM: support THREAD_INFO_IN_TASK (v7 only) (v2) Keith Packard
  2021-09-04  6:09   ` [PATCH 1/3] ARM: Pass cpu number to secondary_start_kernel Keith Packard
  2021-09-04  6:09   ` [PATCH 2/3] ARM: Move thread_info into task_struct (v7 only) Keith Packard
@ 2021-09-04  6:09   ` Keith Packard
  2021-09-07 22:00   ` [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3) Keith Packard
  3 siblings, 0 replies; 36+ messages in thread
From: Keith Packard @ 2021-09-04  6:09 UTC (permalink / raw)
  To: linux-kernel
  Cc: Abbott Liu, Alexander Sverdlin, Andrew Morton, Anshuman Khandual,
	Ard Biesheuvel, Arnd Bergmann, Bjorn Andersson, Florian Fainelli,
	Geert Uytterhoeven, Hartley Sweeten, Jens Axboe, Jian Cai,
	Joe Perches, Kees Cook, Keith Packard, Krzysztof Kozlowski,
	Linus Walleij, linux-arm-kernel, Manivannan Sadhasivam,
	Marc Zyngier, Masahiro Yamada, Miguel Ojeda, Mike Rapoport,
	Nathan Chancellor, Nick Desaulniers, Nicolas Pitre, Rob Herring,
	Russell King, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu, Keith Packard

This per-cpu variable holds the cpu number for each cpu, making access
to that value more efficient than fetching the current task struct and
pulling the cpu value from there.

This code is only enabled when THREAD_INFO_IN_TASK is selected, which
is currently only enabled for v7 hardware.
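
The difference in access pattern is roughly (sketch):

        /* via the task struct: load 'current', then load its cpu member */
        cpu = raw_cpu_read(current_task)->cpu;

        /* via the dedicated per-cpu variable: a single per-cpu load */
        cpu = this_cpu_read(cpu_number);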

Signed-off-by: Keith Packard <keithpac@amazon.com>
---
 arch/arm/Makefile          |  8 --------
 arch/arm/include/asm/smp.h | 17 +++--------------
 arch/arm/kernel/smp.c      | 16 ++++++++++++++++
 3 files changed, 19 insertions(+), 22 deletions(-)

diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 71a2ba4549d3..415c3514573a 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -284,14 +284,6 @@ stack_protector_prepare: prepare0
 	$(eval GCC_PLUGINS_CFLAGS += $(SSP_PLUGIN_CFLAGS))
 endif
 
-ifdef CONFIG_SMP
-prepare: task_cpu_prepare
-
-PHONY += task_cpu_prepare
-task_cpu_prepare: prepare0
-	$(eval KBUILD_CFLAGS += -D_TSK_CPU=$(shell awk '{if ($$2 == "TSK_CPU") print $$3;}' include/generated/asm-offsets.h))
-endif
-
 all:	$(notdir $(KBUILD_IMAGE))
 
 
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 1c38d1fde641..67d21233bdfe 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -16,21 +16,10 @@
 #endif
 
 #ifdef CONFIG_THREAD_INFO_IN_TASK
-/*
- * This is particularly ugly: it appears we can't actually get the definition
- * of task_struct here, but we need access to the CPU this task is running on.
- * Instead of using task_struct we're using TSK_CPU which is extracted from
- * asm-offsets.h by kbuild to get the current processor ID.
- *
- * This also needs to be safeguarded when building asm-offsets.s because at
- * that time TSK_CPU is not defined yet.
- */
-#ifndef _TSK_CPU
-#define raw_smp_processor_id()		(0)
-#else
-#define raw_smp_processor_id()		(*(unsigned int *)((void *)current + _TSK_CPU))
-#endif
+#define raw_smp_processor_id() this_cpu_read(cpu_number)
+#define __smp_processor_id() __this_cpu_read(cpu_number)
 
+DECLARE_PER_CPU_READ_MOSTLY(unsigned int, cpu_number);
 #else
 #define raw_smp_processor_id() (current_thread_info()->cpu)
 #endif
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index be0ede16dbb1..a33397618f1e 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -56,6 +56,8 @@ DEFINE_PER_CPU(struct task_struct *, current_task) ____cacheline_aligned =
 	&init_task;
 EXPORT_PER_CPU_SYMBOL(current_task);
 
+DEFINE_PER_CPU_READ_MOSTLY(unsigned int, cpu_number);
+EXPORT_PER_CPU_SYMBOL(cpu_number);
 #endif
 
 /*
@@ -510,6 +512,9 @@ void __init smp_prepare_boot_cpu(void)
 void __init smp_prepare_cpus(unsigned int max_cpus)
 {
 	unsigned int ncores = num_possible_cpus();
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+	unsigned int cpu;
+#endif
 
 	init_cpu_topology();
 
@@ -521,6 +526,17 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
 	if (max_cpus > ncores)
 		max_cpus = ncores;
 
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+	/*
+	 * Initialize the cpu_number value for each cpu before we
+	 * start it. This ensures that the value is valid during cpu
+	 * initialization, even before the idle task_struct cpu member
+	 * is set
+	 */
+	for_each_possible_cpu(cpu)
+		per_cpu(cpu_number, cpu) = cpu;
+#endif
+
 	if (ncores > 1 && max_cpus) {
 		/*
 		 * Initialise the present map, which describes the set of CPUs
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 1/3] ARM: Pass cpu number to secondary_start_kernel
  2021-09-04  6:09   ` [PATCH 1/3] ARM: Pass cpu number to secondary_start_kernel Keith Packard
@ 2021-09-05 20:25     ` Ard Biesheuvel
  0 siblings, 0 replies; 36+ messages in thread
From: Ard Biesheuvel @ 2021-09-05 20:25 UTC (permalink / raw)
  To: Keith Packard
  Cc: Linux Kernel Mailing List, Abbott Liu, Alexander Sverdlin,
	Andrew Morton, Anshuman Khandual, Arnd Bergmann, Bjorn Andersson,
	Florian Fainelli, Geert Uytterhoeven, Hartley Sweeten,
	Jens Axboe, Jian Cai, Joe Perches, Kees Cook,
	Krzysztof Kozlowski, Linus Walleij, Linux ARM,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Miguel Ojeda, Mike Rapoport, Nathan Chancellor, Nick Desaulniers,
	Nicolas Pitre, Rob Herring, Russell King, Thomas Gleixner,
	Uwe Kleine-König, Valentin Schneider, Viresh Kumar,
	Wolfram Sang (Renesas),
	YiFei Zhu, Keith Packard

Hi Keith,

This looks like a worthwhile cleanup to me regardless of the following patches.

On Sat, 4 Sept 2021 at 08:09, Keith Packard <keithp@keithp.com> wrote:
>
> Instead of pulling the CPU value out of the thread_info struct, pass
> it as an argument. When first initializing secondary processors, this
> is done by stashing the value in the secondary_data struct. When
> restarting idle processors, the previous CPU value is passed.
>
> Because the cpu is now known at the top of secondary_start_kernel, the
> per_cpu_offset value can now be set at this point, instead of in
> cpu_init where it was also incorrectly setting the per_cpu_offset for
> the boot processor before that value had been computed.
>
> Signed-off-by: Keith Packard <keithpac@amazon.com>

Reviewed-by: Ard Biesheuvel <ardb@kernel.org>

> ---
>  arch/arm/include/asm/smp.h   |  3 ++-
>  arch/arm/kernel/head-nommu.S |  1 +
>  arch/arm/kernel/head.S       |  1 +
>  arch/arm/kernel/setup.c      |  6 ------
>  arch/arm/kernel/smp.c        | 14 +++++++++-----
>  5 files changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
> index 5d508f5d56c4..86a7fd721556 100644
> --- a/arch/arm/include/asm/smp.h
> +++ b/arch/arm/include/asm/smp.h
> @@ -48,7 +48,7 @@ extern void set_smp_ipi_range(int ipi_base, int nr_ipi);
>   * Called from platform specific assembly code, this is the
>   * secondary CPU entry point.
>   */
> -asmlinkage void secondary_start_kernel(void);
> +asmlinkage void secondary_start_kernel(unsigned int cpu);
>
>
>  /*
> @@ -61,6 +61,7 @@ struct secondary_data {
>         };
>         unsigned long swapper_pg_dir;
>         void *stack;
> +       unsigned int cpu;
>  };
>  extern struct secondary_data secondary_data;
>  extern void secondary_startup(void);
> diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S
> index 0fc814bbc34b..5aa8ef42717f 100644
> --- a/arch/arm/kernel/head-nommu.S
> +++ b/arch/arm/kernel/head-nommu.S
> @@ -114,6 +114,7 @@ ENTRY(secondary_startup)
>         add     r12, r12, r10
>         ret     r12
>  1:     bl      __after_proc_init
> +       ldr     r0, [r7, #16]                   @ set up cpu number
>         ldr     sp, [r7, #12]                   @ set up the stack pointer
>         mov     fp, #0
>         b       secondary_start_kernel
> diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
> index 7f62c5eccdf3..0e541af738e2 100644
> --- a/arch/arm/kernel/head.S
> +++ b/arch/arm/kernel/head.S
> @@ -394,6 +394,7 @@ ENDPROC(secondary_startup_arm)
>
>  ENTRY(__secondary_switched)
>         ldr_l   r7, secondary_data + 12         @ get secondary_data.stack
> +       ldr_l   r0, secondary_data + 16         @ get secondary_data.cpu
>         mov     sp, r7
>         mov     fp, #0
>         b       secondary_start_kernel
> diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
> index 73ca7797b92f..ca0201635fac 100644
> --- a/arch/arm/kernel/setup.c
> +++ b/arch/arm/kernel/setup.c
> @@ -532,12 +532,6 @@ void notrace cpu_init(void)
>                 BUG();
>         }
>
> -       /*
> -        * This only works on resume and secondary cores. For booting on the
> -        * boot cpu, smp_prepare_boot_cpu is called after percpu area setup.
> -        */
> -       set_my_cpu_offset(per_cpu_offset(cpu));
> -
>         cpu_proc_init();
>
>         /*
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 74679240a9d8..55cb1689a4b3 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -153,6 +153,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>         secondary_data.pgdir = virt_to_phys(idmap_pgd);
>         secondary_data.swapper_pg_dir = get_arch_pgd(swapper_pg_dir);
>  #endif
> +       secondary_data.cpu = cpu;
>         sync_cache_w(&secondary_data);
>
>         /*
> @@ -373,11 +374,14 @@ void arch_cpu_idle_dead(void)
>          * cpu initialisation.  There's some initialisation which needs
>          * to be repeated to undo the effects of taking the CPU offline.
>          */
> -       __asm__("mov    sp, %0\n"
> +       __asm__("mov    r0, %1\n"
> +       "       mov     sp, %0\n"
>         "       mov     fp, #0\n"
>         "       b       secondary_start_kernel"
>                 :
> -               : "r" (task_stack_page(current) + THREAD_SIZE - 8));
> +               : "r" (task_stack_page(current) + THREAD_SIZE - 8),
> +                 "r" (cpu)
> +               : "r0");
>  }
>  #endif /* CONFIG_HOTPLUG_CPU */
>
> @@ -400,10 +404,11 @@ static void smp_store_cpu_info(unsigned int cpuid)
>   * This is the secondary CPU boot entry.  We're using this CPUs
>   * idle thread stack, but a set of temporary page tables.
>   */
> -asmlinkage void secondary_start_kernel(void)
> +asmlinkage void secondary_start_kernel(unsigned int cpu)
>  {
>         struct mm_struct *mm = &init_mm;
> -       unsigned int cpu;
> +
> +       set_my_cpu_offset(per_cpu_offset(cpu));
>
>         secondary_biglittle_init();
>
> @@ -420,7 +425,6 @@ asmlinkage void secondary_start_kernel(void)
>          * All kernel threads share the same mm context; grab a
>          * reference and switch to it.
>          */
> -       cpu = smp_processor_id();
>         mmgrab(mm);
>         current->active_mm = mm;
>         cpumask_set_cpu(cpu, mm_cpumask(mm));
> --
> 2.33.0
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] ARM: Move thread_info into task_struct (v7 only)
  2021-09-04  6:09   ` [PATCH 2/3] ARM: Move thread_info into task_struct (v7 only) Keith Packard
@ 2021-09-05 20:56     ` Ard Biesheuvel
  2021-09-06  6:14       ` Keith Packard
  2021-09-06  6:20       ` Keith Packard
  0 siblings, 2 replies; 36+ messages in thread
From: Ard Biesheuvel @ 2021-09-05 20:56 UTC (permalink / raw)
  To: Keith Packard
  Cc: Linux Kernel Mailing List, Abbott Liu, Alexander Sverdlin,
	Andrew Morton, Anshuman Khandual, Arnd Bergmann, Bjorn Andersson,
	Florian Fainelli, Geert Uytterhoeven, Hartley Sweeten,
	Jens Axboe, Jian Cai, Joe Perches, Kees Cook,
	Krzysztof Kozlowski, Linus Walleij, Linux ARM,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Miguel Ojeda, Mike Rapoport, Nathan Chancellor, Nick Desaulniers,
	Nicolas Pitre, Rob Herring, Russell King, Thomas Gleixner,
	Uwe Kleine-König, Valentin Schneider, Viresh Kumar,
	Wolfram Sang (Renesas),
	YiFei Zhu, Keith Packard

On Sat, 4 Sept 2021 at 08:09, Keith Packard <keithp@keithp.com> wrote:
>
> This avoids many stack overflow attacks which modified the thread_info
> structure by moving that into the task_struct as is done is almost all
> other architectures.
>
> The task_struct is now referenced by a per-cpu variable,
> 'current_task', allowing the current one on each cpu to be located
> from both C and assembly.
>
> This also involved removing the 'cpu' member from the thread_info
> struct and using the one added to the task_struct instead by the
> THREAD_INFO_IN_TASK code.
>
> This code is currently enabled only for v7 hardware as other ARM
> architectures do not make the c13 register in the p15 co-processor
> available for general OS use, leaving identifying the current cpu to
> the 'cpu' field in the thread_info located off the stack pointer.
>

c13 is not a register; it is a value in one of the dimensions of the
ARM sysreg space, and it probably covers other system registers that
pre-v7 cores do implement.

Better to use its architectural name (TPIDRPRW) and clarify that
pre-v6k/v7 cores simply don't implement it.
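
For reference, this is the access the patch performs, with the
architectural name spelled out (just a sketch; read_tpidrprw is an
illustrative name, not an existing helper):

static inline unsigned long read_tpidrprw(void)
{
        unsigned long off;

        /* CP15 c13/c0/4 is TPIDRPRW, the PL1-only thread ID register */
        asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off));
        return off;
}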

> Signed-off-by: Keith Packard <keithpac@amazon.com>
> ---
>  arch/arm/Kconfig                   |  1 +
>  arch/arm/Makefile                  |  8 ++++
>  arch/arm/include/asm/assembler.h   | 66 ++++++++++++++++++++++++++----
>  arch/arm/include/asm/current.h     | 41 +++++++++++++++++++
>  arch/arm/include/asm/smp.h         | 18 ++++++++
>  arch/arm/include/asm/thread_info.h | 17 ++++++++
>  arch/arm/kernel/asm-offsets.c      |  4 ++
>  arch/arm/kernel/entry-armv.S       | 17 ++++----
>  arch/arm/kernel/entry-common.S     |  6 +--
>  arch/arm/kernel/entry-v7m.S        |  7 +++-
>  arch/arm/kernel/iwmmxt.S           |  2 +-
>  arch/arm/kernel/smp.c              | 12 ++++++
>  arch/arm/lib/copy_from_user.S      |  2 +-
>  arch/arm/lib/copy_to_user.S        |  2 +-
>  arch/arm/mach-ep93xx/crunch-bits.S |  2 +-
>  arch/arm/mm/proc-macros.S          |  2 +-
>  arch/arm/vfp/entry.S               |  4 ++
>  arch/arm/vfp/vfpmodule.c           | 25 +++++++----
>  18 files changed, 204 insertions(+), 32 deletions(-)
>  create mode 100644 arch/arm/include/asm/current.h
>

Could we split this up?

> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 24804f11302d..1c1ded500a2b 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -128,6 +128,7 @@ config ARM
>         select RTC_LIB
>         select SET_FS
>         select SYS_SUPPORTS_APM_EMULATION
> +       select THREAD_INFO_IN_TASK if CPU_V7

CPU_V6K also supports this

>         # Above selects are sorted alphabetically; please add new ones
>         # according to that.  Thanks.
>         help
> diff --git a/arch/arm/Makefile b/arch/arm/Makefile
> index 415c3514573a..71a2ba4549d3 100644
> --- a/arch/arm/Makefile
> +++ b/arch/arm/Makefile
> @@ -284,6 +284,14 @@ stack_protector_prepare: prepare0
>         $(eval GCC_PLUGINS_CFLAGS += $(SSP_PLUGIN_CFLAGS))
>  endif
>
> +ifdef CONFIG_SMP
> +prepare: task_cpu_prepare
> +
> +PHONY += task_cpu_prepare
> +task_cpu_prepare: prepare0
> +       $(eval KBUILD_CFLAGS += -D_TSK_CPU=$(shell awk '{if ($$2 == "TSK_CPU") print $$3;}' include/generated/asm-offsets.h))
> +endif
> +
>  all:   $(notdir $(KBUILD_IMAGE))
>

This is rather horrid, and removed again in the next patch. Can we
omit it entirely?

>
> diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
> index e2b1fd558bf3..2ce7403f9298 100644
> --- a/arch/arm/include/asm/assembler.h
> +++ b/arch/arm/include/asm/assembler.h
> @@ -199,16 +199,68 @@
>         .endm
>         .endr
>
> +#ifdef CONFIG_THREAD_INFO_IN_TASK

No need for this #ifdef - it only guards macros that will have no
effect if they are never instantiated (another case below)

>  /*
> - * Get current thread_info.
> + * Get per-CPU offset
> + * rd: Destination register
>   */
> -       .macro  get_thread_info, rd
> - ARM(  mov     \rd, sp, lsr #THREAD_SIZE_ORDER + PAGE_SHIFT    )
> - THUMB(        mov     \rd, sp                 )
> - THUMB(        lsr     \rd, \rd, #THREAD_SIZE_ORDER + PAGE_SHIFT       )
> -       mov     \rd, \rd, lsl #THREAD_SIZE_ORDER + PAGE_SHIFT
> +       .macro this_cpu_offset rd:req
> +       mrc p15, 0, \rd, c13, c0, 4
>         .endm
>
> +/*
> + * Load a per-cpu variable
> + * @dst: Destination register to receive value
> + * @sym: The name of the per-cpu variable
> + * @tmp: scratch register
> + */
> +       .macro ldr_this_cpu dst : req, sym : req, tmp : req
> +       movw    \dst, #:lower16:\sym
> +       movt    \dst, #:upper16:\sym
> +       this_cpu_offset \tmp
> +       ldr     \dst, [\dst, \tmp]
> +       .endm
> +
> +/*
> + * Store a value in a per-cpu variable
> + * @src: Source register with value
> + * @sym: The name of the per-cpu variable
> + * @t1: scratch register 1
> + * @t2: scratch register 2
> + */
> +       .macro str_this_cpu src : req, sym : req, t1 : req, t2 : req
> +       movw    \t1, #:lower16:\sym
> +       movt    \t1, #:upper16:\sym
> +       this_cpu_offset \t2
> +       str     \src, [\t1, \t2]
> +       .endm
> +#endif
> +
> +/*
> + * Get current thread_info
> + * @dst: Destination register to receive current thread info
> + * @tmp: scratch register
> + */
> +       .macro get_current, dst : req, tmp : req
> +#ifdef CONFIG_THREAD_INFO_IN_TASK
> +       ldr_this_cpu \dst, current_task, \tmp
> +#else
> + ARM(  mov     \dst, sp, lsr #THREAD_SIZE_ORDER + PAGE_SHIFT   )
> + THUMB(        mov     \dst, sp                        )
> + THUMB(        lsr     \dst, \dst, #THREAD_SIZE_ORDER + PAGE_SHIFT     )
> +       mov     \dst, \dst, lsl #THREAD_SIZE_ORDER + PAGE_SHIFT
> +#endif
> +       .endm
> +
> +#ifdef CONFIG_THREAD_INFO_IN_TASK
> +/*
> + * Set current task info
> + */
> +       .macro  set_current_task rs : req, t1 : req, t2 : req
> +       str_this_cpu \rs, current_task, \t1, \t2
> +       .endm
> +#endif
> +
>  /*
>   * Increment/decrement the preempt count.
>   */
> @@ -226,7 +278,7 @@
>         .endm
>
>         .macro  dec_preempt_count_ti, ti, tmp
> -       get_thread_info \ti
> +       get_current \ti, \tmp
>         dec_preempt_count \ti, \tmp
>         .endm
>  #else
> diff --git a/arch/arm/include/asm/current.h b/arch/arm/include/asm/current.h
> new file mode 100644
> index 000000000000..7160315eb569
> --- /dev/null
> +++ b/arch/arm/include/asm/current.h
> @@ -0,0 +1,41 @@
> +/*
> + * Copyright © 2021 Keith Packard <keithp@keithp.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation, either version 2 of the License, or
> + * (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful, but
> + * WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.

Please drop the boilerplate and use a SPDX header instead.
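
i.e. something like this at the top of the file, matching the "version
2 or later" text above:

/* SPDX-License-Identifier: GPL-2.0-or-later */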

> + */
> +
> +#ifndef _ASM_ARM_CURRENT_H
> +#define _ASM_ARM_CURRENT_H
> +
> +#include <linux/compiler.h>
> +#include <linux/thread_info.h>
> +#include <asm/percpu.h>
> +
> +#ifndef __ASSEMBLY__
> +
> +#ifdef CONFIG_THREAD_INFO_IN_TASK
> +struct task_struct;
> +
> +DECLARE_PER_CPU(struct task_struct *, current_task);
> +
> +static __always_inline struct task_struct *get_current(void)
> +{
> +       return raw_cpu_read(current_task);

This needs to run with preemption disabled, or we might get migrated
to another CPU halfway through, and produce the wrong result (i.e.,
the current task of the CPU we migrated from). However, it appears
that manipulating the preempt count will create a circular dependency
here.

Instead of using a per-CPU variable for current, I think it might be
better to borrow another system register (TPIDRURO or TPIDRURW) to
carry 'current' when THREAD_INFO_IN_TASK is in effect, similar to how
arm64 uses SP_EL0 - those registers could be preserved/restored in the
entry/exit from/to user space paths rather than in the context switch
path, and then be used any way we like while running in the kernel.
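
Something along these lines, as a rough and untested sketch of that
idea (TPIDRURO is c13/c0/3; its user space value would need to be
saved and restored on kernel entry/exit):

static __always_inline struct task_struct *get_current(void)
{
        unsigned long cur;

        /* TPIDRURO carries 'current' while running in the kernel */
        asm("mrc p15, 0, %0, c13, c0, 3" : "=r" (cur));
        return (struct task_struct *)cur;
}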


> +}
> +
> +#define current get_current()
> +#else
> +#include <asm-generic/current.h>
> +#endif
> +
> +#endif /* __ASSEMBLY__ */
> +
> +#endif /* _ASM_ARM_CURRENT_H */
> diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
> index 86a7fd721556..1c38d1fde641 100644
> --- a/arch/arm/include/asm/smp.h
> +++ b/arch/arm/include/asm/smp.h
> @@ -15,7 +15,25 @@
>  # error "<asm/smp.h> included in non-SMP build"
>  #endif
>
> +#ifdef CONFIG_THREAD_INFO_IN_TASK
> +/*
> + * This is particularly ugly: it appears we can't actually get the definition
> + * of task_struct here, but we need access to the CPU this task is running on.
> + * Instead of using task_struct we're using TSK_CPU which is extracted from
> + * asm-offsets.h by kbuild to get the current processor ID.
> + *
> + * This also needs to be safeguarded when building asm-offsets.s because at
> + * that time TSK_CPU is not defined yet.
> + */
> +#ifndef _TSK_CPU
> +#define raw_smp_processor_id()         (0)
> +#else
> +#define raw_smp_processor_id()         (*(unsigned int *)((void *)current + _TSK_CPU))
> +#endif
> +
> +#else
>  #define raw_smp_processor_id() (current_thread_info()->cpu)
> +#endif
>
>  struct seq_file;
>
> diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
> index 70d4cbc49ae1..cbcd476a095b 100644
> --- a/arch/arm/include/asm/thread_info.h
> +++ b/arch/arm/include/asm/thread_info.h
> @@ -55,8 +55,10 @@ struct thread_info {
>         unsigned long           flags;          /* low level flags */
>         int                     preempt_count;  /* 0 => preemptable, <0 => bug */
>         mm_segment_t            addr_limit;     /* address limit */
> +#ifndef CONFIG_THREAD_INFO_IN_TASK
>         struct task_struct      *task;          /* main task structure */
>         __u32                   cpu;            /* cpu */
> +#endif
>         __u32                   cpu_domain;     /* cpu domain */
>  #ifdef CONFIG_STACKPROTECTOR_PER_TASK
>         unsigned long           stack_canary;
> @@ -75,6 +77,17 @@ struct thread_info {
>  #endif
>  };
>
> +#ifdef CONFIG_THREAD_INFO_IN_TASK
> +
> +#define INIT_THREAD_INFO(tsk)                                          \
> +{                                                                      \
> +       .flags          = 0,                                            \
> +       .preempt_count  = INIT_PREEMPT_COUNT,                           \
> +       .addr_limit     = KERNEL_DS,                                    \
> +}
> +
> +#else
> +
>  #define INIT_THREAD_INFO(tsk)                                          \
>  {                                                                      \
>         .task           = &tsk,                                         \
> @@ -83,6 +96,9 @@ struct thread_info {
>         .addr_limit     = KERNEL_DS,                                    \
>  }
>
> +#endif
> +
> +#ifndef CONFIG_THREAD_INFO_IN_TASK
>  /*
>   * how to get the thread information struct from C
>   */
> @@ -93,6 +109,7 @@ static inline struct thread_info *current_thread_info(void)
>         return (struct thread_info *)
>                 (current_stack_pointer & ~(THREAD_SIZE - 1));
>  }
> +#endif
>
>  #define thread_saved_pc(tsk)   \
>         ((unsigned long)(task_thread_info(tsk)->cpu_context.pc))
> diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
> index 70993af22d80..c91dcfe34bdd 100644
> --- a/arch/arm/kernel/asm-offsets.c
> +++ b/arch/arm/kernel/asm-offsets.c
> @@ -44,8 +44,12 @@ int main(void)
>    DEFINE(TI_FLAGS,             offsetof(struct thread_info, flags));
>    DEFINE(TI_PREEMPT,           offsetof(struct thread_info, preempt_count));
>    DEFINE(TI_ADDR_LIMIT,                offsetof(struct thread_info, addr_limit));
> +#ifdef CONFIG_THREAD_INFO_IN_TASK
> +  DEFINE(TSK_CPU,              offsetof(struct task_struct, cpu));
> +#else
>    DEFINE(TI_TASK,              offsetof(struct thread_info, task));
>    DEFINE(TI_CPU,               offsetof(struct thread_info, cpu));
> +#endif
>    DEFINE(TI_CPU_DOMAIN,                offsetof(struct thread_info, cpu_domain));
>    DEFINE(TI_CPU_SAVE,          offsetof(struct thread_info, cpu_context));
>    DEFINE(TI_USED_CP,           offsetof(struct thread_info, used_cp));
> diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
> index 0ea8529a4872..a5d71b0d8897 100644
> --- a/arch/arm/kernel/entry-armv.S
> +++ b/arch/arm/kernel/entry-armv.S
> @@ -179,7 +179,7 @@ ENDPROC(__und_invalid)
>         @
>         stmia   r7, {r2 - r6}
>
> -       get_thread_info tsk
> +       get_current tsk, r0
>         uaccess_entry tsk, r0, r1, r2, \uaccess
>
>         .if \trace
> @@ -260,7 +260,7 @@ __und_svc:
>         bl      __und_fault
>
>  __und_svc_finish:
> -       get_thread_info tsk
> +       get_current tsk, r5
>         ldr     r5, [sp, #S_PSR]                @ Get SVC cpsr
>         svc_exit r5                             @ return from exception
>   UNWIND(.fnend         )
> @@ -428,7 +428,7 @@ __irq_usr:
>         usr_entry
>         kuser_cmpxchg_check
>         irq_handler
> -       get_thread_info tsk
> +       get_current tsk, why
>         mov     why, #0
>         b       ret_to_user_from_irq
>   UNWIND(.fnend         )
> @@ -572,12 +572,12 @@ ENDPROC(__und_usr)
>         @ Fall-through from Thumb-2 __und_usr
>         @
>  #ifdef CONFIG_NEON
> -       get_thread_info r10                     @ get current thread
> +       get_current r10, r6                     @ get current thread
>         adr     r6, .LCneon_thumb_opcodes
>         b       2f
>  #endif
>  call_fpe:
> -       get_thread_info r10                     @ get current thread
> +       get_current r10, r6                     @ get current thread
>  #ifdef CONFIG_NEON
>         adr     r6, .LCneon_arm_opcodes
>  2:     ldr     r5, [r6], #4                    @ mask value
> @@ -722,7 +722,7 @@ __pabt_usr:
>  ENTRY(ret_from_exception)
>   UNWIND(.fnstart       )
>   UNWIND(.cantunwind    )
> -       get_thread_info tsk
> +       get_current tsk, why
>         mov     why, #0
>         b       ret_to_user
>   UNWIND(.fnend         )
> @@ -735,7 +735,7 @@ __fiq_usr:
>         kuser_cmpxchg_check
>         mov     r0, sp                          @ struct pt_regs *regs
>         bl      handle_fiq_as_nmi
> -       get_thread_info tsk
> +       get_current tsk, r0
>         restore_user_regs fast = 0, offset = 0
>   UNWIND(.fnend         )
>  ENDPROC(__fiq_usr)
> @@ -771,6 +771,9 @@ ENTRY(__switch_to)
>  #endif
>  #ifdef CONFIG_CPU_USE_DOMAINS
>         mcr     p15, 0, r6, c3, c0, 0           @ Set domain register
> +#endif
> +#ifdef CONFIG_THREAD_INFO_IN_TASK
> +       set_current_task r2, r4, r5
>  #endif
>         mov     r5, r0
>         add     r4, r2, #TI_CPU_SAVE
> diff --git a/arch/arm/kernel/entry-common.S b/arch/arm/kernel/entry-common.S
> index 7f0b7aba1498..52580ee463fe 100644
> --- a/arch/arm/kernel/entry-common.S
> +++ b/arch/arm/kernel/entry-common.S
> @@ -156,7 +156,7 @@ ENTRY(ret_from_fork)
>         movne   r0, r4
>         badrne  lr, 1f
>         retne   r5
> -1:     get_thread_info tsk
> +1:     get_current tsk, r1
>         b       ret_slow_syscall
>  ENDPROC(ret_from_fork)
>
> @@ -243,7 +243,7 @@ ENTRY(vector_swi)
>         bic     scno, scno, #0xff000000         @ mask off SWI op-code
>         eor     scno, scno, #__NR_SYSCALL_BASE  @ check OS number
>  #endif
> -       get_thread_info tsk
> +       get_current tsk, r10
>         /*
>          * Reload the registers that may have been corrupted on entry to
>          * the syscall assembly (by tracing or context tracking.)
> @@ -278,7 +278,7 @@ local_restart:
>  9001:
>         sub     lr, saved_pc, #4
>         str     lr, [sp, #S_PC]
> -       get_thread_info tsk
> +       get_current tsk, r1
>         b       ret_fast_syscall
>  #endif
>  ENDPROC(vector_swi)
> diff --git a/arch/arm/kernel/entry-v7m.S b/arch/arm/kernel/entry-v7m.S
> index d0e898608d30..f36a7a876085 100644
> --- a/arch/arm/kernel/entry-v7m.S
> +++ b/arch/arm/kernel/entry-v7m.S
> @@ -57,7 +57,7 @@ __irq_entry:
>         tst     r0, V7M_SCB_ICSR_RETTOBASE
>         beq     2f
>
> -       get_thread_info tsk
> +       get_current tsk, r2
>         ldr     r2, [tsk, #TI_FLAGS]
>         movs    r2, r2, lsl #16
>         beq     2f                      @ no work pending
> @@ -83,7 +83,7 @@ __pendsv_entry:
>         str     r0, [r1, V7M_SCB_ICSR]  @ clear PendSV
>
>         @ execute the pending work, including reschedule
> -       get_thread_info tsk
> +       get_current tsk, why
>         mov     why, #0
>         b       ret_to_user_from_irq
>  ENDPROC(__pendsv_entry)
> @@ -107,6 +107,9 @@ ENTRY(__switch_to)
>         bl      atomic_notifier_call_chain
>         mov     ip, r4
>         mov     r0, r5
> +#ifdef CONFIG_THREAD_INFO_IN_TASK
> +       set_current_task r2, r4, r5
> +#endif
>         ldmia   ip!, {r4 - r11}         @ Load all regs saved previously
>         ldr     sp, [ip]
>         ldr     pc, [ip, #4]!
> diff --git a/arch/arm/kernel/iwmmxt.S b/arch/arm/kernel/iwmmxt.S
> index d2b4ac06e4ed..781f7c7fca90 100644
> --- a/arch/arm/kernel/iwmmxt.S
> +++ b/arch/arm/kernel/iwmmxt.S
> @@ -96,7 +96,7 @@ ENTRY(iwmmxt_task_enable)
>         bl      concan_save
>
>  #ifdef CONFIG_PREEMPT_COUNT
> -       get_thread_info r10
> +       get_current r10, r3
>  #endif
>  4:     dec_preempt_count r10, r3
>         ret     r9                              @ normal exit from exception
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 55cb1689a4b3..be0ede16dbb1 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -51,6 +51,13 @@
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/ipi.h>
>
> +#ifdef CONFIG_THREAD_INFO_IN_TASK
> +DEFINE_PER_CPU(struct task_struct *, current_task) ____cacheline_aligned =
> +       &init_task;
> +EXPORT_PER_CPU_SYMBOL(current_task);
> +
> +#endif
> +
>  /*
>   * as from 2.5, kernels no longer have an init_tasks structure
>   * so we need some other way of telling a new secondary core
> @@ -156,6 +163,10 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
>         secondary_data.cpu = cpu;
>         sync_cache_w(&secondary_data);
>
> +#ifdef CONFIG_THREAD_INFO_IN_TASK
> +       per_cpu(current_task, cpu) = idle;
> +#endif
> +
>         /*
>          * Now bring the CPU into our world.
>          */
> @@ -509,6 +520,7 @@ void __init smp_prepare_cpus(unsigned int max_cpus)
>          */
>         if (max_cpus > ncores)
>                 max_cpus = ncores;
> +
>         if (ncores > 1 && max_cpus) {
>                 /*
>                  * Initialise the present map, which describes the set of CPUs
> diff --git a/arch/arm/lib/copy_from_user.S b/arch/arm/lib/copy_from_user.S
> index f8016e3db65d..5320950100a2 100644
> --- a/arch/arm/lib/copy_from_user.S
> +++ b/arch/arm/lib/copy_from_user.S
> @@ -109,7 +109,7 @@
>
>  ENTRY(arm_copy_from_user)
>  #ifdef CONFIG_CPU_SPECTRE
> -       get_thread_info r3
> +       get_current r3, ip
>         ldr     r3, [r3, #TI_ADDR_LIMIT]
>         uaccess_mask_range_ptr r1, r2, r3, ip
>  #endif
> diff --git a/arch/arm/lib/copy_to_user.S b/arch/arm/lib/copy_to_user.S
> index ebfe4cb3d912..e7e61c87893a 100644
> --- a/arch/arm/lib/copy_to_user.S
> +++ b/arch/arm/lib/copy_to_user.S
> @@ -109,7 +109,7 @@
>  ENTRY(__copy_to_user_std)
>  WEAK(arm_copy_to_user)
>  #ifdef CONFIG_CPU_SPECTRE
> -       get_thread_info r3
> +       get_current r3, ip
>         ldr     r3, [r3, #TI_ADDR_LIMIT]
>         uaccess_mask_range_ptr r0, r2, r3, ip
>  #endif
> diff --git a/arch/arm/mach-ep93xx/crunch-bits.S b/arch/arm/mach-ep93xx/crunch-bits.S
> index fb2dbf76f09e..d0bb34b3d973 100644
> --- a/arch/arm/mach-ep93xx/crunch-bits.S
> +++ b/arch/arm/mach-ep93xx/crunch-bits.S
> @@ -192,7 +192,7 @@ crunch_load:
>
>  1:
>  #ifdef CONFIG_PREEMPT_COUNT
> -       get_thread_info r10
> +       get_current r10, r3
>  #endif
>  2:     dec_preempt_count r10, r3
>         ret     lr
> diff --git a/arch/arm/mm/proc-macros.S b/arch/arm/mm/proc-macros.S
> index e2c743aa2eb2..a782c025fdf3 100644
> --- a/arch/arm/mm/proc-macros.S
> +++ b/arch/arm/mm/proc-macros.S
> @@ -30,7 +30,7 @@
>   * act_mm - get current->active_mm
>   */
>         .macro  act_mm, rd
> -       get_thread_info \rd
> +       get_current \rd, unused                 @ won't build if THREAD_INFO_IN_TASK
>         ldr     \rd, [\rd, #TI_TASK]
>         .if (TSK_ACTIVE_MM > IMM12_MASK)
>         add     \rd, \rd, #TSK_ACTIVE_MM & ~IMM12_MASK
> diff --git a/arch/arm/vfp/entry.S b/arch/arm/vfp/entry.S
> index 27b0a1f27fbd..48cb40a3b72d 100644
> --- a/arch/arm/vfp/entry.S
> +++ b/arch/arm/vfp/entry.S
> @@ -24,7 +24,11 @@
>  ENTRY(do_vfp)
>         inc_preempt_count r10, r4
>         ldr     r4, .LCvfp
> +#ifdef CONFIG_THREAD_INFO_IN_TASK
> +       ldr     r11, [r10, #TSK_CPU]    @ CPU number
> +#else
>         ldr     r11, [r10, #TI_CPU]     @ CPU number
> +#endif
>         add     r10, r10, #TI_VFPSTATE  @ r10 = workspace
>         ldr     pc, [r4]                @ call VFP entry point
>  ENDPROC(do_vfp)
> diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
> index 2cb355c1b5b7..f929a26a05a5 100644
> --- a/arch/arm/vfp/vfpmodule.c
> +++ b/arch/arm/vfp/vfpmodule.c
> @@ -143,22 +143,27 @@ static void vfp_thread_copy(struct thread_info *thread)
>   *  THREAD_NOFTIFY_SWTICH:
>   *   - the previously running thread will not be scheduled onto another CPU.
>   *   - the next thread to be run (v) will not be running on another CPU.
> - *   - thread->cpu is the local CPU number
> + *   - tsk->cpu is the local CPU number
>   *   - not preemptible as we're called in the middle of a thread switch
>   *  THREAD_NOTIFY_FLUSH:
>   *   - the thread (v) will be running on the local CPU, so
> - *     v === current_thread_info()
> - *   - thread->cpu is the local CPU number at the time it is accessed,
> + *     v === current
> + *   - tsk->cpu is the local CPU number at the time it is accessed,
>   *     but may change at any time.
>   *   - we could be preempted if tree preempt rcu is enabled, so
> - *     it is unsafe to use thread->cpu.
> + *     it is unsafe to use tsk->cpu.
>   *  THREAD_NOTIFY_EXIT
>   *   - we could be preempted if tree preempt rcu is enabled, so
> - *     it is unsafe to use thread->cpu.
> + *     it is unsafe to use tsk->cpu.
>   */
>  static int vfp_notifier(struct notifier_block *self, unsigned long cmd, void *v)
>  {
> +#ifdef CONFIG_THREAD_INFO_IN_TASK
> +       struct task_struct *tsk = v;
> +       struct thread_info *thread = &tsk->thread_info;
> +#else
>         struct thread_info *thread = v;
> +#endif
>         u32 fpexc;
>  #ifdef CONFIG_SMP
>         unsigned int cpu;
> @@ -169,7 +174,11 @@ static int vfp_notifier(struct notifier_block *self, unsigned long cmd, void *v)
>                 fpexc = fmrx(FPEXC);
>
>  #ifdef CONFIG_SMP
> +#ifdef CONFIG_THREAD_INFO_IN_TASK
> +               cpu = tsk->cpu;
> +#else
>                 cpu = thread->cpu;
> +#endif
>
>                 /*
>                  * On SMP, if VFP is enabled, save the old state in
> @@ -458,16 +467,16 @@ static int vfp_pm_suspend(void)
>
>                 /* disable, just in case */
>                 fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
> -       } else if (vfp_current_hw_state[ti->cpu]) {
> +       } else if (vfp_current_hw_state[smp_processor_id()]) {
>  #ifndef CONFIG_SMP
>                 fmxr(FPEXC, fpexc | FPEXC_EN);
> -               vfp_save_state(vfp_current_hw_state[ti->cpu], fpexc);
> +               vfp_save_state(vfp_current_hw_state[smp_processor_id()], fpexc);
>                 fmxr(FPEXC, fpexc);
>  #endif
>         }
>
>         /* clear any information we had about last context state */
> -       vfp_current_hw_state[ti->cpu] = NULL;
> +       vfp_current_hw_state[smp_processor_id()] = NULL;
>
>         return 0;
>  }
> --
> 2.33.0
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] ARM: Move thread_info into task_struct (v7 only)
  2021-09-05 20:56     ` Ard Biesheuvel
@ 2021-09-06  6:14       ` Keith Packard
  2021-09-06  7:49         ` Ard Biesheuvel
  2021-09-06  6:20       ` Keith Packard
  1 sibling, 1 reply; 36+ messages in thread
From: Keith Packard @ 2021-09-06  6:14 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Linux Kernel Mailing List, Abbott Liu, Alexander Sverdlin,
	Andrew Morton, Anshuman Khandual, Arnd Bergmann, Bjorn Andersson,
	Florian Fainelli, Geert Uytterhoeven, Hartley Sweeten,
	Jens Axboe, Jian Cai, Joe Perches, Kees Cook,
	Krzysztof Kozlowski, Linus Walleij, Linux ARM,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Miguel Ojeda, Mike Rapoport, Nathan Chancellor, Nick Desaulniers,
	Nicolas Pitre, Rob Herring, Russell King, Thomas Gleixner,
	Uwe Kleine-König, Valentin Schneider, Viresh Kumar,
	Wolfram Sang (Renesas),
	YiFei Zhu

[-- Attachment #1: Type: text/plain, Size: 2504 bytes --]

Ard Biesheuvel <ardb@kernel.org> writes:

> c13 is not a register, it is a value in one of the dimensions of the
> ARM sysreg space, and probably covers other system registers that
> pre-v7 cores do implement.

> Better to use its architectural name (TPIDRPRW) and clarify that
> pre-v6k/v7 cores simply don't implement it.

Thanks, I'll reword the commit message

> Could we split this up?

I could split up the assembler macro changes which add a temporary
register to the calls getting the current thread_info/task_struct value?
If that would help get this reviewed, I'd be happy to do
that. Otherwise, it's pretty much all-or-nothing as enabling
THREAD_INFO_IN_TASK requires a bunch of synchronized changes.

>> +#ifdef CONFIG_THREAD_INFO_IN_TASK
>
> No need for this #ifdef - it only guards macros that will have no
> effect if they are never instantiated (another case below)

Sure, the resulting code is a bit less noisy, which seems good.

>> +DECLARE_PER_CPU(struct task_struct *, current_task);
>> +
>> +static __always_inline struct task_struct *get_current(void)
>> +{
>> +       return raw_cpu_read(current_task);
>
> This needs to run with preemption disabled, or we might get migrated
> to another CPU halfway through, and produce the wrong result (i.e.,
> the current task of the CPU we migrated from). However, it appears
> that manipulating the preempt count will create a circular dependency
> here.

Yeah, I really don't understand the restrictions of this API well. Any
code fetching the current task pointer had better not be subject to
preemption, or that value will be stale at some point after it was
computed. Do you know if get_current() could ever be run in a context
that allows preemption?
>
> Instead of using a per-CPU variable for current, I think it might be
> better to borrow another system register (TPIDRURO or TPIDRURW) to
> carry 'current' when THREAD_INFO_IN_TASK is in effect, similar to how
> arm64 uses SP_EL0 - those registers could be preserved/restored in the
> entry/exit from/to user space paths rather than in the context switch
> path, and then be used any way we like while running in the kernel.

Making sure these values don't leak through to user space somehow? I'm
not excited by that prospect at all. But perhaps someone can help me
understand the conditions under which the current task value can be
read with preemption enabled without that being a problem for the
enclosing code?

-- 
-keith

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] ARM: Move thread_info into task_struct (v7 only)
  2021-09-05 20:56     ` Ard Biesheuvel
  2021-09-06  6:14       ` Keith Packard
@ 2021-09-06  6:20       ` Keith Packard
  1 sibling, 0 replies; 36+ messages in thread
From: Keith Packard @ 2021-09-06  6:20 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Linux Kernel Mailing List, Abbott Liu, Alexander Sverdlin,
	Andrew Morton, Anshuman Khandual, Arnd Bergmann, Bjorn Andersson,
	Florian Fainelli, Geert Uytterhoeven, Hartley Sweeten,
	Jens Axboe, Jian Cai, Joe Perches, Kees Cook,
	Krzysztof Kozlowski, Linus Walleij, Linux ARM,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Miguel Ojeda, Mike Rapoport, Nathan Chancellor, Nick Desaulniers,
	Nicolas Pitre, Rob Herring, Russell King, Thomas Gleixner,
	Uwe Kleine-König, Valentin Schneider, Viresh Kumar,
	Wolfram Sang (Renesas),
	YiFei Zhu

[-- Attachment #1: Type: text/plain, Size: 2119 bytes --]

Ard Biesheuvel <ardb@kernel.org> writes:

(sorry, missed a couple of comments, and also neglected to thank you for
your review!)

>> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
>> index 24804f11302d..1c1ded500a2b 100644
>> --- a/arch/arm/Kconfig
>> +++ b/arch/arm/Kconfig
>> @@ -128,6 +128,7 @@ config ARM
>>         select RTC_LIB
>>         select SET_FS
>>         select SYS_SUPPORTS_APM_EMULATION
>> +       select THREAD_INFO_IN_TASK if CPU_V7
>
> CPU_V6K also supports this

I've only tested on V7; help with testing on V6K so we could enable
that as well would be greatly appreciated.

>> diff --git a/arch/arm/Makefile b/arch/arm/Makefile
>> index 415c3514573a..71a2ba4549d3 100644
>> --- a/arch/arm/Makefile
>> +++ b/arch/arm/Makefile
>> @@ -284,6 +284,14 @@ stack_protector_prepare: prepare0
>>         $(eval GCC_PLUGINS_CFLAGS += $(SSP_PLUGIN_CFLAGS))
>>  endif
>>
>> +ifdef CONFIG_SMP
>> +prepare: task_cpu_prepare
>> +
>> +PHONY += task_cpu_prepare
>> +task_cpu_prepare: prepare0
>> +       $(eval KBUILD_CFLAGS += -D_TSK_CPU=$(shell awk '{if ($$2 == "TSK_CPU") print $$3;}' include/generated/asm-offsets.h))
>> +endif
>> +
>>  all:   $(notdir $(KBUILD_IMAGE))
>>
>
> This is rather horrid, and removed again in the next patch. Can we
> omit it entirely?

Yeah, this did get rather ugly -- I wanted to use the task_struct in the
raw_smp_processor_id() macro, but discovered that I couldn't include
linux/sched.h in that file.

I found this lovely hack in the powerpc kernel code and just lifted it
from there. In terms of code changes, this one is "smaller" than
including the new per-cpu variable, "cpu_number", but in terms of
ugliness, it's hard to argue which is cleaner.
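
For the curious, the generated define gets consumed along these lines
(a sketch of the idea, not the exact macro from the patch):

	#ifndef _TSK_CPU
	#define raw_smp_processor_id()	(0)	/* while generating asm-offsets.s */
	#else
	#define raw_smp_processor_id() \
		(*(unsigned int *)((void *)current + _TSK_CPU))
	#endif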

I was going for the fewest logical changes in structure with this
patch, rather than the cleanest result, to try to make it slightly
easier to review.

Happy to squash these two patches together if you'd prefer; I initially
wrote the code with the cpu_number variable but then discovered that I
didn't need it if I used the cpu value in the task_struct...

-- 
-keith

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] ARM: Move thread_info into task_struct (v7 only)
  2021-09-06  6:14       ` Keith Packard
@ 2021-09-06  7:49         ` Ard Biesheuvel
  2021-09-07 15:24           ` Keith Packard
  0 siblings, 1 reply; 36+ messages in thread
From: Ard Biesheuvel @ 2021-09-06  7:49 UTC (permalink / raw)
  To: Keith Packard
  Cc: Linux Kernel Mailing List, Abbott Liu, Alexander Sverdlin,
	Andrew Morton, Anshuman Khandual, Arnd Bergmann, Bjorn Andersson,
	Florian Fainelli, Geert Uytterhoeven, Hartley Sweeten,
	Jens Axboe, Jian Cai, Joe Perches, Kees Cook,
	Krzysztof Kozlowski, Linus Walleij, Linux ARM,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Miguel Ojeda, Mike Rapoport, Nathan Chancellor, Nick Desaulniers,
	Nicolas Pitre, Rob Herring, Russell King, Thomas Gleixner,
	Uwe Kleine-König, Valentin Schneider, Viresh Kumar,
	Wolfram Sang (Renesas),
	YiFei Zhu

On Mon, 6 Sept 2021 at 08:14, Keith Packard <keithp@keithp.com> wrote:
>
> Ard Biesheuvel <ardb@kernel.org> writes:
>
> > c13 is not a register, it is a value in one of the dimensions of the
> > ARM sysreg space, and probably covers other system registers that
> > pre-v7 cores do implement.
>
> > Better to use its architectural name (TPIDRPRW) and clarify that
> > pre-v6k/v7 cores simply don't implement it.
>
> Thanks, I'll reword the commit message
>
> > Could we split this up?
>
> I could split up the assembler macro changes which add a temporary
> register to the calls getting the current thread_info/task_struct value?
> If that would help get this reviewed, I'd be happy to do
> that. Otherwise, it's pretty much all-or-nothing as enabling
> THREAD_INFO_IN_TASK requires a bunch of synchronized changes.
>

Sure, so it is precisely for that reason that it is better to isolate
changes that can be isolated.

...
> >> +DECLARE_PER_CPU(struct task_struct *, current_task);
> >> +
> >> +static __always_inline struct task_struct *get_current(void)
> >> +{
> >> +       return raw_cpu_read(current_task);
> >
> > This needs to run with preemption disabled, or we might get migrated
> > to another CPU halfway through, and produce the wrong result (i.e.,
> > the current task of the CPU we migrated from). However, it appears
> > that manipulating the preempt count will create a circular dependency
> > here.
>
> Yeah, I really don't understand the restrictions of this API well. Any
> code fetching the current task pointer better not be subject to
> preemption or that value will be stale at some point after it was
> computed. Do you know if it could ever be run in a context allowing
> preemption?

All the time. 'current' essentially never changes value from the POV
of code running in task context, so there is usually no reason to care
about preemption/migration when referring to it. Using per-CPU
variables is what creates the problem here.
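
To make the failure mode concrete, a minimal sketch (this_cpu_offset()
is a stand-in for the real accessor, not code from either series):

	struct task_struct *read_current_racy(void)	/* sketch only */
	{
		unsigned long off = this_cpu_offset();	/* offset for CPU0 */
		/* ... preempted here and migrated to CPU1 ... */
		return *(struct task_struct **)((void *)&current_task + off);
		/* result is CPU0's 'current', which is no longer this task */
	}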

> >
> > Instead of using a per-CPU variable for current, I think it might be
> > better to borrow another system register (TPIDRURO or TPIDRURW) to
> > carry 'current' when THREAD_INFO_IN_TASK is in effect, similar to how
> > arm64 uses SP_EL0 - those registers could be preserved/restored in the
> > entry/exit from/to user space paths rather than in the context switch
> > path, and then be used any way we like while running in the kernel.
>
> Making sure these values don't leak through to user space somehow? I'm
> not excited by that prospect at all.

Moving the code that pokes the right user space value into these
registers from the context switch path to the user space exit path
should be rather straightforward, so I am not too concerned about
security issues here (famous last words)

> But, perhaps someone can help me
> understand the conditions under which the current task value can be
> computed where preemption was enabled and have that not be a problem for
> the enclosing code?
>

In principle, it should be sufficient to run the per-CPU variable load
sequence with preemption disabled. For instance, your asm version

       movw    \dst, #:lower16:\sym
       movt    \dst, #:upper16:\sym
       this_cpu_offset \tmp
       ldr     \dst, [\dst, \tmp]

must not be preempted and migrated right before the ldr instruction,
because that would end up accessing another CPU's value for 'current'.

However, the preempt_count itself is stored in thread_info as well, so
I'm not sure there is a way we can safely disable preemption for this
sequence to begin with, unless we run the above with interrupts
disabled.
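
Spelling out the circularity (sketch only, not real code):

	/* what we would like to write: */
	preempt_disable();	/* bumps current_thread_info()->preempt_count */
	tsk = raw_cpu_read(current_task);
	preempt_enable();

	/* ...but with THREAD_INFO_IN_TASK, current_thread_info() is
	 * &current->thread_info, i.e. it needs the very value that
	 * raw_cpu_read() is supposed to produce. */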

Given that we are already relying on the MP extensions for this
anyway, I personally think that using another thread ID register to
carry 'current' is a reasonable approach as well, since it would also
allow us to get per-task stack protector support into the compiler.
But I would like to hear from some other folks on cc as well.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] ARM: Move thread_info into task_struct (v7 only)
  2021-09-06  7:49         ` Ard Biesheuvel
@ 2021-09-07 15:24           ` Keith Packard
  2021-09-07 16:05             ` Ard Biesheuvel
  0 siblings, 1 reply; 36+ messages in thread
From: Keith Packard @ 2021-09-07 15:24 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Linux Kernel Mailing List, Abbott Liu, Alexander Sverdlin,
	Andrew Morton, Anshuman Khandual, Arnd Bergmann, Bjorn Andersson,
	Florian Fainelli, Geert Uytterhoeven, Hartley Sweeten,
	Jens Axboe, Jian Cai, Joe Perches, Kees Cook,
	Krzysztof Kozlowski, Linus Walleij, Linux ARM,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Miguel Ojeda, Mike Rapoport, Nathan Chancellor, Nick Desaulniers,
	Nicolas Pitre, Rob Herring, Russell King, Thomas Gleixner,
	Uwe Kleine-König, Valentin Schneider, Viresh Kumar,
	Wolfram Sang (Renesas),
	YiFei Zhu

[-- Attachment #1: Type: text/plain, Size: 1505 bytes --]

Ard Biesheuvel <ardb@kernel.org> writes:

> Sure, so it is precisely for that reason that it is better to isolate
> changes that can be isolated.

I'll go ahead and split this apart then; that is how I did development,
after all.

> All the time. 'current' essentially never changes value from the POV
> of code running in task context, so there is usually no reason to care
> about preemption/migration when referring to it. Using per-CPU
> variables is what creates the problem here.

Thanks for helping me -- I just got the wrong model stuck in my head
over the weekend somehow.

If I do have this figured out, we should be able to stick the
per_cpu_offset value in thread_info and use TPIDRPRW to hold 'current',
since code using per_cpu_offset should already be disabling
preemption. That should be an easier change than putting a kernel
pointer in a user-visible register.
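
Roughly what I have in mind; the per_cpu_offset field in thread_info is
hypothetical, it isn't in any posted patch yet:

	static inline unsigned long __my_cpu_offset(void)
	{
		/* TPIDRPRW holds 'current'; callers of per_cpu_offset
		 * already run with preemption disabled, so a plain load
		 * through thread_info is fine. */
		return current_thread_info()->per_cpu_offset;
	}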

> Given that we are already relying on the MP extensions for this
> anyway, I personally think that using another thread ID register to
> carry 'current' is a reasonable approach as well, since it would also
> allow us to get per-task stack protector support into the compiler.
> But I would like to hear from some other folks on cc as well.

That would be awesome; I assume that doesn't require leaving
per_cpu_offset in a thread ID register?

In any case, I'll give my plan a try, and then see about trying your
plan as well so I can compare the complexity of the two solutions.

-- 
-keith

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] ARM: Move thread_info into task_struct (v7 only)
  2021-09-07 15:24           ` Keith Packard
@ 2021-09-07 16:05             ` Ard Biesheuvel
  2021-09-07 22:17               ` Keith Packard
  0 siblings, 1 reply; 36+ messages in thread
From: Ard Biesheuvel @ 2021-09-07 16:05 UTC (permalink / raw)
  To: Keith Packard
  Cc: Linux Kernel Mailing List, Abbott Liu, Alexander Sverdlin,
	Andrew Morton, Anshuman Khandual, Arnd Bergmann, Bjorn Andersson,
	Florian Fainelli, Geert Uytterhoeven, Hartley Sweeten,
	Jens Axboe, Jian Cai, Joe Perches, Kees Cook,
	Krzysztof Kozlowski, Linus Walleij, Linux ARM,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Miguel Ojeda, Mike Rapoport, Nathan Chancellor, Nick Desaulniers,
	Nicolas Pitre, Rob Herring, Russell King, Thomas Gleixner,
	Uwe Kleine-König, Valentin Schneider, Viresh Kumar,
	Wolfram Sang (Renesas),
	YiFei Zhu

On Tue, 7 Sept 2021 at 17:24, Keith Packard <keithp@keithp.com> wrote:
>
> Ard Biesheuvel <ardb@kernel.org> writes:
>
> > Sure, so it is precisely for that reason that it is better to isolate
> > changes that can be isolated.
>
> I'll go ahead and split this apart then; that is how I did development,
> after all.
>
> > All the time. 'current' essentially never changes value from the POV
> > of code running in task context, so there is usually no reason to care
> > about preemption/migration when referring to it. Using per-CPU
> > variables is what creates the problem here.
>
> Thanks for helping me -- I just got the wrong model stuck in my head
> over the weekend somehow.
>
> If I do have this figured out, we should be able to stick the
> per_cpu_offset value in thread_info and use TPIDRPRW to hold 'current'
> as code using per_cpu_offset should already be disabling
> preemption. That should be an easier change than putting a kernel
> pointer in a user-visible register.
>

That is still a bit tricky, given that you now have to swap out the
per-CPU offset when you migrate a task, and the generic code is not
really set up for that.

> > Given that we are already relying on the MP extensions for this
> > anyway, I personally think that using another thread ID register to
> > carry 'current' is a reasonable approach as well, since it would also
> > allow us to get per-task stack protector support into the compiler.
> > But I would like to hear from some other folks on cc as well.
>
> That would be awesome; I assume that doesn't require leaving
> per_cpu_offset in a thread ID register?
>
> In any case, I'll give my plan a try, and then see about trying your
> plan as well so I can compare the complexity of the two solutions.
>

I had a stab at this myself today (long boring day with no Internet connection).

https://android-kvm.googlesource.com/linux/+log/refs/heads/ardb/arm32-ti-in-task

It resembles your code in some places - I suppose we went on the same
journey in a sense :-) We'll fix up the credits before this gets
resubmitted.

Fixing the per-task stack protector plugin on top of these changes
should be trivial but I need a coffee first.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3)
  2021-09-04  6:09 ` [PATCH 0/2] ARM: support THREAD_INFO_IN_TASK (v7 only) (v2) Keith Packard
                     ` (2 preceding siblings ...)
  2021-09-04  6:09   ` [PATCH 3/3] ARM: Add per-cpu variable cpu_number " Keith Packard
@ 2021-09-07 22:00   ` Keith Packard
  2021-09-07 22:00     ` [PATCH 1/7] ARM: Pass cpu number to secondary_start_kernel Keith Packard
                       ` (7 more replies)
  3 siblings, 8 replies; 36+ messages in thread
From: Keith Packard @ 2021-09-07 22:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Abbott Liu, Andrew Morton, Andrey Ryabinin, Anshuman Khandual,
	Ard Biesheuvel, Arnd Bergmann, Bjorn Andersson,
	Christoph Lameter, Dennis Zhou, Geert Uytterhoeven, Jens Axboe,
	Joe Perches, Kees Cook, Keith Packard, Krzysztof Kozlowski,
	Linus Walleij, linux-arm-kernel, linux-mm, Manivannan Sadhasivam,
	Marc Zyngier, Masahiro Yamada, Mike Rapoport, Nathan Chancellor,
	Nick Desaulniers, Nick Desaulniers, Nicolas Pitre, Russell King,
	Tejun Heo, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu

Placing thread_info in the kernel stack leaves it vulnerable to stack
overflow attacks. This short series addresses that by using the
existing THREAD_INFO_IN_TASK infrastructure.

This is the third version of this series; in this version the changes
are restricted to hardware which provides the TPIDRPRW register. This
register is repurposed from holding the per_cpu_offset value to
holding the 'current' value, as that allows fetching the value
atomically so that it can be used in a preemptible context.

The series is broken into seven pieces:

 1) Change the secondary_start_kernel API to receive the cpu
    number. This avoids needing to be able to find this value independently in
    future patches.

 2) Change the secondary_start_kernel API to also receive the 'task'
    value. Passing the value to this function also avoids needing to
    be able to discover it independently.

 3) A cleanup which avoids assuming that THREAD_INFO_IN_TASK is not set.

 4) A hack, borrowed from the powerpc arch, which allows locating the 'cpu'
    field in either thread_info or task_struct, without requiring linux/sched.h
    to be included in asm/smp.h

 5) Disable the optimization storing per_cpu_offset in TPIDRPRW. This leaves
    the register free to hold 'current' instead.

 6) Use TPIDRPRW for 'current'. This is enabled for either CPU_V6K or CPU_V7,
    but not if CPU_V6 is also enabled.

 7) Enable THREAD_INFO_IN_TASK whenever TPIDRPRW is used to hold 'current'.

Signed-off-by: Keith Packard <keithpac@amazon.com>



^ permalink raw reply	[flat|nested] 36+ messages in thread

* [PATCH 1/7] ARM: Pass cpu number to secondary_start_kernel
  2021-09-07 22:00   ` [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3) Keith Packard
@ 2021-09-07 22:00     ` Keith Packard
  2021-09-07 22:00     ` [PATCH 2/7] ARM: Pass task " Keith Packard
                       ` (6 subsequent siblings)
  7 siblings, 0 replies; 36+ messages in thread
From: Keith Packard @ 2021-09-07 22:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Abbott Liu, Andrew Morton, Andrey Ryabinin, Anshuman Khandual,
	Ard Biesheuvel, Arnd Bergmann, Bjorn Andersson,
	Christoph Lameter, Dennis Zhou, Geert Uytterhoeven, Jens Axboe,
	Joe Perches, Kees Cook, Keith Packard, Krzysztof Kozlowski,
	Linus Walleij, linux-arm-kernel, linux-mm, Manivannan Sadhasivam,
	Marc Zyngier, Masahiro Yamada, Mike Rapoport, Nathan Chancellor,
	Nick Desaulniers, Nick Desaulniers, Nicolas Pitre, Russell King,
	Tejun Heo, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu

Instead of pulling the CPU value out of the thread_info struct, pass
it as an argument. When first initializing secondary processors, this
is done by stashing the value in the secondary_data struct. When
restarting idle processors, the previous CPU value is passed.

Because the cpu is now known at the top of secondary_start_kernel, the
per_cpu_offset value can now be set at this point, instead of in
cpu_init, which was also incorrectly setting the per_cpu_offset for the
boot processor before that value had been computed.

Signed-off-by: Keith Packard <keithpac@amazon.com>
Reviewed-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/include/asm/smp.h   |  3 ++-
 arch/arm/kernel/head-nommu.S |  1 +
 arch/arm/kernel/head.S       |  1 +
 arch/arm/kernel/setup.c      |  6 ------
 arch/arm/kernel/smp.c        | 14 +++++++++-----
 5 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 5d508f5d56c4..86a7fd721556 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -48,7 +48,7 @@ extern void set_smp_ipi_range(int ipi_base, int nr_ipi);
  * Called from platform specific assembly code, this is the
  * secondary CPU entry point.
  */
-asmlinkage void secondary_start_kernel(void);
+asmlinkage void secondary_start_kernel(unsigned int cpu);
 
 
 /*
@@ -61,6 +61,7 @@ struct secondary_data {
 	};
 	unsigned long swapper_pg_dir;
 	void *stack;
+	unsigned int cpu;
 };
 extern struct secondary_data secondary_data;
 extern void secondary_startup(void);
diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S
index 0fc814bbc34b..5aa8ef42717f 100644
--- a/arch/arm/kernel/head-nommu.S
+++ b/arch/arm/kernel/head-nommu.S
@@ -114,6 +114,7 @@ ENTRY(secondary_startup)
 	add	r12, r12, r10
 	ret	r12
 1:	bl	__after_proc_init
+	ldr	r0, [r7, #16]			@ set up cpu number
 	ldr	sp, [r7, #12]			@ set up the stack pointer
 	mov	fp, #0
 	b	secondary_start_kernel
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 7f62c5eccdf3..0e541af738e2 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -394,6 +394,7 @@ ENDPROC(secondary_startup_arm)
 
 ENTRY(__secondary_switched)
 	ldr_l	r7, secondary_data + 12		@ get secondary_data.stack
+	ldr_l	r0, secondary_data + 16		@ get secondary_data.cpu
 	mov	sp, r7
 	mov	fp, #0
 	b	secondary_start_kernel
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 73ca7797b92f..ca0201635fac 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -532,12 +532,6 @@ void notrace cpu_init(void)
 		BUG();
 	}
 
-	/*
-	 * This only works on resume and secondary cores. For booting on the
-	 * boot cpu, smp_prepare_boot_cpu is called after percpu area setup.
-	 */
-	set_my_cpu_offset(per_cpu_offset(cpu));
-
 	cpu_proc_init();
 
 	/*
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 74679240a9d8..55cb1689a4b3 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -153,6 +153,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 	secondary_data.pgdir = virt_to_phys(idmap_pgd);
 	secondary_data.swapper_pg_dir = get_arch_pgd(swapper_pg_dir);
 #endif
+	secondary_data.cpu = cpu;
 	sync_cache_w(&secondary_data);
 
 	/*
@@ -373,11 +374,14 @@ void arch_cpu_idle_dead(void)
 	 * cpu initialisation.  There's some initialisation which needs
 	 * to be repeated to undo the effects of taking the CPU offline.
 	 */
-	__asm__("mov	sp, %0\n"
+	__asm__("mov	r0, %1\n"
+	"	mov	sp, %0\n"
 	"	mov	fp, #0\n"
 	"	b	secondary_start_kernel"
 		:
-		: "r" (task_stack_page(current) + THREAD_SIZE - 8));
+		: "r" (task_stack_page(current) + THREAD_SIZE - 8),
+		  "r" (cpu)
+		: "r0");
 }
 #endif /* CONFIG_HOTPLUG_CPU */
 
@@ -400,10 +404,11 @@ static void smp_store_cpu_info(unsigned int cpuid)
  * This is the secondary CPU boot entry.  We're using this CPUs
  * idle thread stack, but a set of temporary page tables.
  */
-asmlinkage void secondary_start_kernel(void)
+asmlinkage void secondary_start_kernel(unsigned int cpu)
 {
 	struct mm_struct *mm = &init_mm;
-	unsigned int cpu;
+
+	set_my_cpu_offset(per_cpu_offset(cpu));
 
 	secondary_biglittle_init();
 
@@ -420,7 +425,6 @@ asmlinkage void secondary_start_kernel(void)
 	 * All kernel threads share the same mm context; grab a
 	 * reference and switch to it.
 	 */
-	cpu = smp_processor_id();
 	mmgrab(mm);
 	current->active_mm = mm;
 	cpumask_set_cpu(cpu, mm_cpumask(mm));
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 2/7] ARM: Pass task to secondary_start_kernel
  2021-09-07 22:00   ` [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3) Keith Packard
  2021-09-07 22:00     ` [PATCH 1/7] ARM: Pass cpu number to secondary_start_kernel Keith Packard
@ 2021-09-07 22:00     ` Keith Packard
  2021-09-07 22:00     ` [PATCH 3/7] ARM: Use smp_processor_id() in vfp_pm_suspend instead of ti->cpu Keith Packard
                       ` (5 subsequent siblings)
  7 siblings, 0 replies; 36+ messages in thread
From: Keith Packard @ 2021-09-07 22:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Abbott Liu, Andrew Morton, Andrey Ryabinin, Anshuman Khandual,
	Ard Biesheuvel, Arnd Bergmann, Bjorn Andersson,
	Christoph Lameter, Dennis Zhou, Geert Uytterhoeven, Jens Axboe,
	Joe Perches, Kees Cook, Keith Packard, Krzysztof Kozlowski,
	Linus Walleij, linux-arm-kernel, linux-mm, Manivannan Sadhasivam,
	Marc Zyngier, Masahiro Yamada, Mike Rapoport, Nathan Chancellor,
	Nick Desaulniers, Nick Desaulniers, Nicolas Pitre, Russell King,
	Tejun Heo, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu

This avoids needing to compute the task pointer in this function, and
allows the passed-in task to be used to establish 'current' in later
patches.

Signed-off-by: Keith Packard <keithpac@amazon.com>
---
 arch/arm/include/asm/smp.h   | 3 ++-
 arch/arm/kernel/head-nommu.S | 1 +
 arch/arm/kernel/head.S       | 1 +
 arch/arm/kernel/smp.c        | 8 +++++---
 4 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index 86a7fd721556..d43b64635d77 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -48,7 +48,7 @@ extern void set_smp_ipi_range(int ipi_base, int nr_ipi);
  * Called from platform specific assembly code, this is the
  * secondary CPU entry point.
  */
-asmlinkage void secondary_start_kernel(unsigned int cpu);
+asmlinkage void secondary_start_kernel(unsigned int cpu, struct task_struct *task);
 
 
 /*
@@ -62,6 +62,7 @@ struct secondary_data {
 	unsigned long swapper_pg_dir;
 	void *stack;
 	unsigned int cpu;
+	struct task_struct *task;
 };
 extern struct secondary_data secondary_data;
 extern void secondary_startup(void);
diff --git a/arch/arm/kernel/head-nommu.S b/arch/arm/kernel/head-nommu.S
index 5aa8ef42717f..218715c135ed 100644
--- a/arch/arm/kernel/head-nommu.S
+++ b/arch/arm/kernel/head-nommu.S
@@ -115,6 +115,7 @@ ENTRY(secondary_startup)
 	ret	r12
 1:	bl	__after_proc_init
 	ldr	r0, [r7, #16]			@ set up cpu number
+	ldr	r1, [r7, #20]			@ set up task pointer
 	ldr	sp, [r7, #12]			@ set up the stack pointer
 	mov	fp, #0
 	b	secondary_start_kernel
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 0e541af738e2..4a6cb0b0808b 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -395,6 +395,7 @@ ENDPROC(secondary_startup_arm)
 ENTRY(__secondary_switched)
 	ldr_l	r7, secondary_data + 12		@ get secondary_data.stack
 	ldr_l	r0, secondary_data + 16		@ get secondary_data.cpu
+	ldr_l	r1, secondary_data + 20		@ get secondary_data.task
 	mov	sp, r7
 	mov	fp, #0
 	b	secondary_start_kernel
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 55cb1689a4b3..5e999f1f1aea 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -154,6 +154,7 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 	secondary_data.swapper_pg_dir = get_arch_pgd(swapper_pg_dir);
 #endif
 	secondary_data.cpu = cpu;
+	secondary_data.task = idle;
 	sync_cache_w(&secondary_data);
 
 	/*
@@ -375,13 +376,14 @@ void arch_cpu_idle_dead(void)
 	 * to be repeated to undo the effects of taking the CPU offline.
 	 */
 	__asm__("mov	r0, %1\n"
+	"	mov	r1, %2\n"
 	"	mov	sp, %0\n"
 	"	mov	fp, #0\n"
 	"	b	secondary_start_kernel"
 		:
 		: "r" (task_stack_page(current) + THREAD_SIZE - 8),
-		  "r" (cpu)
-		: "r0");
+		  "r" (cpu), "r" (current)
+		: "r0", "r1");
 }
 #endif /* CONFIG_HOTPLUG_CPU */
 
@@ -404,7 +406,7 @@ static void smp_store_cpu_info(unsigned int cpuid)
  * This is the secondary CPU boot entry.  We're using this CPUs
  * idle thread stack, but a set of temporary page tables.
  */
-asmlinkage void secondary_start_kernel(unsigned int cpu)
+asmlinkage void secondary_start_kernel(unsigned int cpu, struct task_struct *task)
 {
 	struct mm_struct *mm = &init_mm;
 
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 3/7] ARM: Use smp_processor_id() in vfp_pm_suspend instead of ti->cpu
  2021-09-07 22:00   ` [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3) Keith Packard
  2021-09-07 22:00     ` [PATCH 1/7] ARM: Pass cpu number to secondary_start_kernel Keith Packard
  2021-09-07 22:00     ` [PATCH 2/7] ARM: Pass task " Keith Packard
@ 2021-09-07 22:00     ` Keith Packard
  2021-09-07 22:00     ` [PATCH 4/7] ARM: Use hack from powerpc to get current cpu number Keith Packard
                       ` (4 subsequent siblings)
  7 siblings, 0 replies; 36+ messages in thread
From: Keith Packard @ 2021-09-07 22:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Abbott Liu, Andrew Morton, Andrey Ryabinin, Anshuman Khandual,
	Ard Biesheuvel, Arnd Bergmann, Bjorn Andersson,
	Christoph Lameter, Dennis Zhou, Geert Uytterhoeven, Jens Axboe,
	Joe Perches, Kees Cook, Keith Packard, Krzysztof Kozlowski,
	Linus Walleij, linux-arm-kernel, linux-mm, Manivannan Sadhasivam,
	Marc Zyngier, Masahiro Yamada, Mike Rapoport, Nathan Chancellor,
	Nick Desaulniers, Nick Desaulniers, Nicolas Pitre, Russell King,
	Tejun Heo, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu

These are equivalent while thread_info contains the CPU value, but when
THREAD_INFO_IN_TASK is enabled, cpu moves to task_struct. Using the
macro works in either case.

Signed-off-by: Keith Packard <keithpac@amazon.com>
---
 arch/arm/vfp/vfpmodule.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 2cb355c1b5b7..d7a3818da671 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -458,16 +458,16 @@ static int vfp_pm_suspend(void)
 
 		/* disable, just in case */
 		fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
-	} else if (vfp_current_hw_state[ti->cpu]) {
+	} else if (vfp_current_hw_state[smp_processor_id()]) {
 #ifndef CONFIG_SMP
 		fmxr(FPEXC, fpexc | FPEXC_EN);
-		vfp_save_state(vfp_current_hw_state[ti->cpu], fpexc);
+		vfp_save_state(vfp_current_hw_state[smp_processor_id()], fpexc);
 		fmxr(FPEXC, fpexc);
 #endif
 	}
 
 	/* clear any information we had about last context state */
-	vfp_current_hw_state[ti->cpu] = NULL;
+	vfp_current_hw_state[smp_processor_id()] = NULL;
 
 	return 0;
 }
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 4/7] ARM: Use hack from powerpc to get current cpu number
  2021-09-07 22:00   ` [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3) Keith Packard
                       ` (2 preceding siblings ...)
  2021-09-07 22:00     ` [PATCH 3/7] ARM: Use smp_processor_id() in vfp_pm_suspend instead of ti->cpu Keith Packard
@ 2021-09-07 22:00     ` Keith Packard
  2021-09-08  7:45       ` Ard Biesheuvel
  2021-09-07 22:00     ` [PATCH 5/7] ARM: Stop using TPIDRPRW to hold per_cpu_offset Keith Packard
                       ` (3 subsequent siblings)
  7 siblings, 1 reply; 36+ messages in thread
From: Keith Packard @ 2021-09-07 22:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Abbott Liu, Andrew Morton, Andrey Ryabinin, Anshuman Khandual,
	Ard Biesheuvel, Arnd Bergmann, Bjorn Andersson,
	Christoph Lameter, Dennis Zhou, Geert Uytterhoeven, Jens Axboe,
	Joe Perches, Kees Cook, Keith Packard, Krzysztof Kozlowski,
	Linus Walleij, linux-arm-kernel, linux-mm, Manivannan Sadhasivam,
	Marc Zyngier, Masahiro Yamada, Mike Rapoport, Nathan Chancellor,
	Nick Desaulniers, Nick Desaulniers, Nicolas Pitre, Russell King,
	Tejun Heo, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu

When we enable THREAD_INFO_IN_TASK, the cpu number will disappear from
thread_info and reappear in task_struct. As we cannot include
linux/sched.h in asm/smp.h, there's no way to use that struct type in
the raw_smp_processor_id macro. Instead, a hack from the powerpc code
is used. This pulls the TI_CPU offset out of asm-offsets.h and uses
that to find the cpu value.

Signed-off-by: Keith Packard <keithpac@amazon.com>
---
 arch/arm/Makefile          |  8 ++++++++
 arch/arm/include/asm/smp.h | 18 +++++++++++++++++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 415c3514573a..6752995d2914 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -284,6 +284,14 @@ stack_protector_prepare: prepare0
 	$(eval GCC_PLUGINS_CFLAGS += $(SSP_PLUGIN_CFLAGS))
 endif
 
+ifdef CONFIG_SMP
+prepare: task_cpu_prepare
+
+PHONY += task_cpu_prepare
+task_cpu_prepare: prepare0
+	$(eval KBUILD_CFLAGS += -D_TI_CPU=$(shell awk '{if ($$2 == "TI_CPU") print $$3;}' include/generated/asm-offsets.h))
+endif
+
 all:	$(notdir $(KBUILD_IMAGE))
 
 
diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
index d43b64635d77..f77ba3753bc4 100644
--- a/arch/arm/include/asm/smp.h
+++ b/arch/arm/include/asm/smp.h
@@ -15,7 +15,23 @@
 # error "<asm/smp.h> included in non-SMP build"
 #endif
 
-#define raw_smp_processor_id() (current_thread_info()->cpu)
+/*
+ * This is particularly ugly: it appears we can't actually get the
+ * definition of task_struct here, but we need access to the CPU this
+ * task is running on, which is stored in task_struct when
+ * THREAD_INFO_IN_TASK is set.  Instead of using task_struct we're
+ * using TI_CPU which is extracted from asm-offsets.h by kbuild to get
+ * the current processor ID.
+ *
+ * This also needs to be safeguarded when building asm-offsets.s
+ * because at that time TI_CPU is not defined yet.
+ */
+#ifndef _TI_CPU
+#define raw_smp_processor_id()		(0)
+#else
+#define raw_smp_processor_id()	\
+	(*(unsigned int *)((void *)current_thread_info() + _TI_CPU))
+#endif
 
 struct seq_file;
 
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 5/7] ARM: Stop using TPIDRPRW to hold per_cpu_offset
  2021-09-07 22:00   ` [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3) Keith Packard
                       ` (3 preceding siblings ...)
  2021-09-07 22:00     ` [PATCH 4/7] ARM: Use hack from powerpc to get current cpu number Keith Packard
@ 2021-09-07 22:00     ` Keith Packard
  2021-09-09 13:54       ` Ard Biesheuvel
  2021-09-07 22:00     ` [PATCH 6/7] ARM: Use TPIDRPRW for current Keith Packard
                       ` (2 subsequent siblings)
  7 siblings, 1 reply; 36+ messages in thread
From: Keith Packard @ 2021-09-07 22:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Abbott Liu, Andrew Morton, Andrey Ryabinin, Anshuman Khandual,
	Ard Biesheuvel, Arnd Bergmann, Bjorn Andersson,
	Christoph Lameter, Dennis Zhou, Geert Uytterhoeven, Jens Axboe,
	Joe Perches, Kees Cook, Keith Packard, Krzysztof Kozlowski,
	Linus Walleij, linux-arm-kernel, linux-mm, Manivannan Sadhasivam,
	Marc Zyngier, Masahiro Yamada, Mike Rapoport, Nathan Chancellor,
	Nick Desaulniers, Nick Desaulniers, Nicolas Pitre, Russell King,
	Tejun Heo, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu

We're going to store 'current' in TPIDRPRW instead

Signed-off-by: Keith Packard <keithpac@amazon.com>
---
 arch/arm/include/asm/percpu.h | 31 -------------------------------
 arch/arm/kernel/setup.c       |  7 -------
 arch/arm/kernel/smp.c         |  3 ---
 3 files changed, 41 deletions(-)

diff --git a/arch/arm/include/asm/percpu.h b/arch/arm/include/asm/percpu.h
index e2fcb3cfd3de..eeafcd6a3e01 100644
--- a/arch/arm/include/asm/percpu.h
+++ b/arch/arm/include/asm/percpu.h
@@ -7,37 +7,6 @@
 
 register unsigned long current_stack_pointer asm ("sp");
 
-/*
- * Same as asm-generic/percpu.h, except that we store the per cpu offset
- * in the TPIDRPRW. TPIDRPRW only exists on V6K and V7
- */
-#if defined(CONFIG_SMP) && !defined(CONFIG_CPU_V6)
-static inline void set_my_cpu_offset(unsigned long off)
-{
-	/* Set TPIDRPRW */
-	asm volatile("mcr p15, 0, %0, c13, c0, 4" : : "r" (off) : "memory");
-}
-
-static inline unsigned long __my_cpu_offset(void)
-{
-	unsigned long off;
-
-	/*
-	 * Read TPIDRPRW.
-	 * We want to allow caching the value, so avoid using volatile and
-	 * instead use a fake stack read to hazard against barrier().
-	 */
-	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off)
-		: "Q" (*(const unsigned long *)current_stack_pointer));
-
-	return off;
-}
-#define __my_cpu_offset __my_cpu_offset()
-#else
-#define set_my_cpu_offset(x)	do {} while(0)
-
-#endif /* CONFIG_SMP */
-
 #include <asm-generic/percpu.h>
 
 #endif /* _ASM_ARM_PERCPU_H_ */
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index ca0201635fac..d0dc60afe54f 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -590,13 +590,6 @@ void __init smp_setup_processor_id(void)
 	for (i = 1; i < nr_cpu_ids; ++i)
 		cpu_logical_map(i) = i == cpu ? 0 : i;
 
-	/*
-	 * clear __my_cpu_offset on boot CPU to avoid hang caused by
-	 * using percpu variable early, for example, lockdep will
-	 * access percpu variable inside lock_release
-	 */
-	set_my_cpu_offset(0);
-
 	pr_info("Booting Linux on physical CPU 0x%x\n", mpidr);
 }
 
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 5e999f1f1aea..8ccf10b34f08 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -410,8 +410,6 @@ asmlinkage void secondary_start_kernel(unsigned int cpu, struct task_struct *tas
 {
 	struct mm_struct *mm = &init_mm;
 
-	set_my_cpu_offset(per_cpu_offset(cpu));
-
 	secondary_biglittle_init();
 
 	/*
@@ -495,7 +493,6 @@ void __init smp_cpus_done(unsigned int max_cpus)
 
 void __init smp_prepare_boot_cpu(void)
 {
-	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
 }
 
 void __init smp_prepare_cpus(unsigned int max_cpus)
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 6/7] ARM: Use TPIDRPRW for current
  2021-09-07 22:00   ` [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3) Keith Packard
                       ` (4 preceding siblings ...)
  2021-09-07 22:00     ` [PATCH 5/7] ARM: Stop using TPIDRPRW to hold per_cpu_offset Keith Packard
@ 2021-09-07 22:00     ` Keith Packard
  2021-09-09 13:56       ` Ard Biesheuvel
  2021-09-07 22:00     ` [PATCH 7/7] ARM: Move thread_info into task_struct (v7 only) Keith Packard
  2021-09-08  7:01     ` [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3) Krzysztof Kozlowski
  7 siblings, 1 reply; 36+ messages in thread
From: Keith Packard @ 2021-09-07 22:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Abbott Liu, Andrew Morton, Andrey Ryabinin, Anshuman Khandual,
	Ard Biesheuvel, Arnd Bergmann, Bjorn Andersson,
	Christoph Lameter, Dennis Zhou, Geert Uytterhoeven, Jens Axboe,
	Joe Perches, Kees Cook, Keith Packard, Krzysztof Kozlowski,
	Linus Walleij, linux-arm-kernel, linux-mm, Manivannan Sadhasivam,
	Marc Zyngier, Masahiro Yamada, Mike Rapoport, Nathan Chancellor,
	Nick Desaulniers, Nick Desaulniers, Nicolas Pitre, Russell King,
	Tejun Heo, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu

Store current task pointer in CPU thread ID register TPIDRPRW so that
accessing it doesn't depend on being able to locate thread_info off of
the kernel stack pointer.

Signed-off-by: Keith Packard <keithpac@amazon.com>
---
 arch/arm/Kconfig                 |  4 +++
 arch/arm/include/asm/assembler.h |  8 +++++
 arch/arm/include/asm/current.h   | 52 ++++++++++++++++++++++++++++++++
 arch/arm/kernel/entry-armv.S     |  4 +++
 arch/arm/kernel/setup.c          |  1 +
 arch/arm/kernel/smp.c            |  1 +
 6 files changed, 70 insertions(+)
 create mode 100644 arch/arm/include/asm/current.h

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 24804f11302d..414fe23fd5ac 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1172,6 +1172,10 @@ config SMP_ON_UP
 
 	  If you don't know what to do here, say Y.
 
+config CURRENT_POINTER_IN_TPIDRPRW
+	def_bool y
+	depends on (CPU_V6K || CPU_V7) && !CPU_V6
+
 config ARM_CPU_TOPOLOGY
 	bool "Support cpu topology definition"
 	depends on SMP && CPU_V7
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index e2b1fd558bf3..ea12fe3bb589 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -209,6 +209,14 @@
 	mov	\rd, \rd, lsl #THREAD_SIZE_ORDER + PAGE_SHIFT
 	.endm
 
+/*
+ * Set current task_info
+ * @src: Source register containing task_struct pointer
+ */
+	.macro	set_current src : req
+	mcr	p15, 0, \src, c13, c0, 4
+	.endm
+
 /*
  * Increment/decrement the preempt count.
  */
diff --git a/arch/arm/include/asm/current.h b/arch/arm/include/asm/current.h
new file mode 100644
index 000000000000..153a2ea18747
--- /dev/null
+++ b/arch/arm/include/asm/current.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright © 2021 Keith Packard <keithp@keithp.com>
+ */
+
+#ifndef _ASM_ARM_CURRENT_H_
+#define _ASM_ARM_CURRENT_H_
+
+#ifndef __ASSEMBLY__
+
+register unsigned long current_stack_pointer asm ("sp");
+
+/*
+ * Same as asm-generic/current.h, except that we store current
+ * in TPIDRPRW. TPIDRPRW only exists on V6K and V7
+ */
+#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRPRW
+
+struct task_struct;
+
+static inline void set_current(struct task_struct *tsk)
+{
+	/* Set TPIDRPRW */
+	asm volatile("mcr p15, 0, %0, c13, c0, 4" : : "r" (tsk) : "memory");
+}
+
+static __always_inline struct task_struct *get_current(void)
+{
+	struct task_struct *tsk;
+
+	/*
+	 * Read TPIDRPRW.
+	 * We want to allow caching the value, so avoid using volatile and
+	 * instead use a fake stack read to hazard against barrier().
+	 */
+	asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (tsk)
+		: "Q" (*(const unsigned long *)current_stack_pointer));
+
+	return tsk;
+}
+#define current get_current()
+#else
+
+#define set_current(tsk) do {} while (0)
+
+#include <asm-generic/current.h>
+
+#endif /* CONFIG_SMP */
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* _ASM_ARM_CURRENT_H_ */
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index 0ea8529a4872..db3947ee9c3e 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -761,6 +761,10 @@ ENTRY(__switch_to)
 	ldr	r6, [r2, #TI_CPU_DOMAIN]
 #endif
 	switch_tls r1, r4, r5, r3, r7
+#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRPRW
+	ldr	r7, [r2, #TI_TASK]
+	set_current r7
+#endif
 #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_SMP)
 	ldr	r7, [r2, #TI_TASK]
 	ldr	r8, =__stack_chk_guard
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index d0dc60afe54f..2fdf8c31d6c9 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -586,6 +586,7 @@ void __init smp_setup_processor_id(void)
 	u32 mpidr = is_smp() ? read_cpuid_mpidr() & MPIDR_HWID_BITMASK : 0;
 	u32 cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
 
+	set_current(&init_task);
 	cpu_logical_map(0) = cpu;
 	for (i = 1; i < nr_cpu_ids; ++i)
 		cpu_logical_map(i) = i == cpu ? 0 : i;
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 8ccf10b34f08..09771916442a 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -410,6 +410,7 @@ asmlinkage void secondary_start_kernel(unsigned int cpu, struct task_struct *tas
 {
 	struct mm_struct *mm = &init_mm;
 
+	set_current(task);
 	secondary_biglittle_init();
 
 	/*
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* [PATCH 7/7] ARM: Move thread_info into task_struct (v7 only)
  2021-09-07 22:00   ` [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3) Keith Packard
                       ` (5 preceding siblings ...)
  2021-09-07 22:00     ` [PATCH 6/7] ARM: Use TPIDRPRW for current Keith Packard
@ 2021-09-07 22:00     ` Keith Packard
  2021-09-08  7:01     ` [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3) Krzysztof Kozlowski
  7 siblings, 0 replies; 36+ messages in thread
From: Keith Packard @ 2021-09-07 22:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: Abbott Liu, Andrew Morton, Andrey Ryabinin, Anshuman Khandual,
	Ard Biesheuvel, Arnd Bergmann, Bjorn Andersson,
	Christoph Lameter, Dennis Zhou, Geert Uytterhoeven, Jens Axboe,
	Joe Perches, Kees Cook, Keith Packard, Krzysztof Kozlowski,
	Linus Walleij, linux-arm-kernel, linux-mm, Manivannan Sadhasivam,
	Marc Zyngier, Masahiro Yamada, Mike Rapoport, Nathan Chancellor,
	Nick Desaulniers, Nick Desaulniers, Nicolas Pitre, Russell King,
	Tejun Heo, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu

This avoids many stack overflow attacks that modify the thread_info
structure, by moving it into the task_struct as is done in almost all
other architectures.

This depends on having 'current' stored in the TPIDRPRW register as
that allows us to find thread_info and task_struct once the
thread_info cannot be located using the kernel stack pointer.

Signed-off-by: Keith Packard <keithpac@amazon.com>
---
 arch/arm/Kconfig                   |  1 +
 arch/arm/include/asm/assembler.h   |  4 ++++
 arch/arm/include/asm/thread_info.h | 12 +++++++++++-
 arch/arm/kernel/asm-offsets.c      |  4 ++++
 arch/arm/kernel/entry-armv.S       |  4 ++++
 arch/arm/vfp/vfpmodule.c           |  9 +++++++++
 6 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 414fe23fd5ac..5846b4f5444b 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -128,6 +128,7 @@ config ARM
 	select RTC_LIB
 	select SET_FS
 	select SYS_SUPPORTS_APM_EMULATION
+	select THREAD_INFO_IN_TASK if CURRENT_POINTER_IN_TPIDRPRW
 	# Above selects are sorted alphabetically; please add new ones
 	# according to that.  Thanks.
 	help
diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
index ea12fe3bb589..b23d2b87184a 100644
--- a/arch/arm/include/asm/assembler.h
+++ b/arch/arm/include/asm/assembler.h
@@ -203,10 +203,14 @@
  * Get current thread_info.
  */
 	.macro	get_thread_info, rd
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+	mrc	p15, 0, \rd, c13, c0, 4
+#else
  ARM(	mov	\rd, sp, lsr #THREAD_SIZE_ORDER + PAGE_SHIFT	)
  THUMB(	mov	\rd, sp			)
  THUMB(	lsr	\rd, \rd, #THREAD_SIZE_ORDER + PAGE_SHIFT	)
 	mov	\rd, \rd, lsl #THREAD_SIZE_ORDER + PAGE_SHIFT
+#endif
 	.endm
 
 /*
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 70d4cbc49ae1..6b67703ca16a 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -55,8 +55,10 @@ struct thread_info {
 	unsigned long		flags;		/* low level flags */
 	int			preempt_count;	/* 0 => preemptable, <0 => bug */
 	mm_segment_t		addr_limit;	/* address limit */
+#ifndef CONFIG_THREAD_INFO_IN_TASK
 	struct task_struct	*task;		/* main task structure */
 	__u32			cpu;		/* cpu */
+#endif
 	__u32			cpu_domain;	/* cpu domain */
 #ifdef CONFIG_STACKPROTECTOR_PER_TASK
 	unsigned long		stack_canary;
@@ -75,14 +77,21 @@ struct thread_info {
 #endif
 };
 
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+#define INIT_THREAD_INFO_TASK(tsk)
+#else
+#define INIT_THREAD_INFO_TASK(tsk) .task = &tsk,
+#endif
+
 #define INIT_THREAD_INFO(tsk)						\
 {									\
-	.task		= &tsk,						\
+	INIT_THREAD_INFO_TASK(tsk)					\
 	.flags		= 0,						\
 	.preempt_count	= INIT_PREEMPT_COUNT,				\
 	.addr_limit	= KERNEL_DS,					\
 }
 
+#ifndef CONFIG_THREAD_INFO_IN_TASK
 /*
  * how to get the thread information struct from C
  */
@@ -93,6 +102,7 @@ static inline struct thread_info *current_thread_info(void)
 	return (struct thread_info *)
 		(current_stack_pointer & ~(THREAD_SIZE - 1));
 }
+#endif
 
 #define thread_saved_pc(tsk)	\
 	((unsigned long)(task_thread_info(tsk)->cpu_context.pc))
diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 70993af22d80..2a6745f7423e 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -44,8 +44,12 @@ int main(void)
   DEFINE(TI_FLAGS,		offsetof(struct thread_info, flags));
   DEFINE(TI_PREEMPT,		offsetof(struct thread_info, preempt_count));
   DEFINE(TI_ADDR_LIMIT,		offsetof(struct thread_info, addr_limit));
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+  DEFINE(TI_CPU,		offsetof(struct task_struct, cpu));
+#else
   DEFINE(TI_TASK,		offsetof(struct thread_info, task));
   DEFINE(TI_CPU,		offsetof(struct thread_info, cpu));
+#endif
   DEFINE(TI_CPU_DOMAIN,		offsetof(struct thread_info, cpu_domain));
   DEFINE(TI_CPU_SAVE,		offsetof(struct thread_info, cpu_context));
   DEFINE(TI_USED_CP,		offsetof(struct thread_info, used_cp));
diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
index db3947ee9c3e..5ae687c8c7b8 100644
--- a/arch/arm/kernel/entry-armv.S
+++ b/arch/arm/kernel/entry-armv.S
@@ -762,9 +762,13 @@ ENTRY(__switch_to)
 #endif
 	switch_tls r1, r4, r5, r3, r7
 #ifdef CONFIG_CURRENT_POINTER_IN_TPIDRPRW
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+	set_current r2
+#else
 	ldr	r7, [r2, #TI_TASK]
 	set_current r7
 #endif
+#endif
 #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_SMP)
 	ldr	r7, [r2, #TI_TASK]
 	ldr	r8, =__stack_chk_guard
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index d7a3818da671..84a691da59fa 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -158,7 +158,12 @@ static void vfp_thread_copy(struct thread_info *thread)
  */
 static int vfp_notifier(struct notifier_block *self, unsigned long cmd, void *v)
 {
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+	struct task_struct *tsk = v;
+	struct thread_info *thread = &tsk->thread_info;
+#else
 	struct thread_info *thread = v;
+#endif
 	u32 fpexc;
 #ifdef CONFIG_SMP
 	unsigned int cpu;
@@ -169,7 +174,11 @@ static int vfp_notifier(struct notifier_block *self, unsigned long cmd, void *v)
 		fpexc = fmrx(FPEXC);
 
 #ifdef CONFIG_SMP
+#ifdef CONFIG_THREAD_INFO_IN_TASK
+		cpu = tsk->cpu;
+#else
 		cpu = thread->cpu;
+#endif
 
 		/*
 		 * On SMP, if VFP is enabled, save the old state in
-- 
2.33.0


^ permalink raw reply related	[flat|nested] 36+ messages in thread

* Re: [PATCH 2/3] ARM: Move thread_info into task_struct (v7 only)
  2021-09-07 16:05             ` Ard Biesheuvel
@ 2021-09-07 22:17               ` Keith Packard
  0 siblings, 0 replies; 36+ messages in thread
From: Keith Packard @ 2021-09-07 22:17 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Linux Kernel Mailing List, Abbott Liu, Alexander Sverdlin,
	Andrew Morton, Anshuman Khandual, Arnd Bergmann, Bjorn Andersson,
	Florian Fainelli, Geert Uytterhoeven, Hartley Sweeten,
	Jens Axboe, Jian Cai, Joe Perches, Kees Cook,
	Krzysztof Kozlowski, Linus Walleij, Linux ARM,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Miguel Ojeda, Mike Rapoport, Nathan Chancellor, Nick Desaulniers,
	Nicolas Pitre, Rob Herring, Russell King, Thomas Gleixner,
	Uwe Kleine-König, Valentin Schneider, Viresh Kumar,
	Wolfram Sang (Renesas),
	YiFei Zhu

[-- Attachment #1: Type: text/plain, Size: 1446 bytes --]

Ard Biesheuvel <ardb@kernel.org> writes:

> That is still a bit tricky, given that you now have to swap out the
> per-CPU offset when you migrate a task, and the generic code is not
> really set up for that.

Yeah, I decided to just leverage as much generic code as possible and
re-purposed the TPIDRPRW register for 'current', leaving the
per_cpu_offset data in memory. That makes the patch series pretty
compact, with a bunch of restructuring changes followed by fairly short
functional changes.

> I had a stab at this myself today (long boring day with no Internet connection).
>
> https://android-kvm.googlesource.com/linux/+log/refs/heads/ardb/arm32-ti-in-task
>
> It resembles your code in some places - I suppose we went on the same
> journey in a sense :-) We'll fix up the credits before this gets
> resubmitted.

This does look great, and preserves the optimization of keeping the
per_cpu_offset value in a register.

With the series I posted, getting the per_cpu_offset value requires
fetching the cpu value from task_struct and then using that to index the
per_cpu_offset array. I don't have a good feel for how important this
optimization is. We could stick the per_cpu_offset value into
thread_info to avoid one fetch?
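
For comparison, with patch 5 applied the offset lookup falls back to
roughly the generic asm-generic/percpu.h path:

	#define __my_cpu_offset	per_cpu_offset(raw_smp_processor_id())
	/* i.e. one load for current->cpu, then a second to index
	 * __per_cpu_offset[] with it */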

> Fixing the per-task stack protector plugin on top of these changes
> should be trivial but I need a coffee first.

I haven't even looked at that step yet :-)

-- 
-keith

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3)
  2021-09-07 22:00   ` [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3) Keith Packard
                       ` (6 preceding siblings ...)
  2021-09-07 22:00     ` [PATCH 7/7] ARM: Move thread_info into task_struct (v7 only) Keith Packard
@ 2021-09-08  7:01     ` Krzysztof Kozlowski
  2021-09-08  7:47       ` Ard Biesheuvel
  7 siblings, 1 reply; 36+ messages in thread
From: Krzysztof Kozlowski @ 2021-09-08  7:01 UTC (permalink / raw)
  To: Keith Packard, linux-kernel
  Cc: Abbott Liu, Andrew Morton, Andrey Ryabinin, Anshuman Khandual,
	Ard Biesheuvel, Arnd Bergmann, Bjorn Andersson,
	Christoph Lameter, Dennis Zhou, Geert Uytterhoeven, Jens Axboe,
	Joe Perches, Kees Cook, Linus Walleij, linux-arm-kernel,
	linux-mm, Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Mike Rapoport, Nathan Chancellor, Nick Desaulniers,
	Nick Desaulniers, Nicolas Pitre, Russell King, Tejun Heo,
	Thomas Gleixner, Uwe Kleine-König, Valentin Schneider,
	Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu

On 08/09/2021 00:00, Keith Packard wrote:
> Placing thread_info in the kernel stack leaves it vulnerable to stack
> overflow attacks. This short series addresses that by using the
> existing THREAD_INFO_IN_TASK infrastructure.
> 
> This is the third version of this series, in this version the changes
> are restricted to hardware which provides the TPIDRPRW register. This
> register is repurposed from holding the per_cpu_offset value to
> holding the 'current' value as that allows fetching this value
> atomically so that it can be used in a preemptable context.
> 
> The series is broken into seven pieces:
> 
>  1) Change the secondary_start_kernel API to receive the cpu
>     number. This avoids needing to be able to find this value independently in
>     future patches.
> 
>  2) Change the secondary_start_kernel API to also receive the 'task'
>     value. Passing the value to this function also avoids needing to
>     be able to discover it independently.
> 
>  3) A cleanup which avoids assuming that THREAD_INFO_IN_TASK is not set.
> 
>  4) A hack, borrowed from the powerpc arch, which allows locating the 'cpu'
>     field in either thread_info or task_struct, without requiring linux/sched.h
>     to be included in asm/smp.h
> 
>  5) Disable the optimization storing per_cpu_offset in TPIDRPRW. This leaves
>     the register free to hold 'current' instead.
> 
>  6) Use TPIDRPRW for 'current'. This is enabled for either CPU_V6K or CPU_V7,
>     but not if CPU_V6 is also enabled.
> 
>  7) Enable THREAD_INFO_IN_TASK whenever TPIDRPRW is used to hold 'current'.

Hi,

Thanks for your patches. This seems to be a v3 but the patches are not
marked with it. Use "-v3" in format-patch to get it right.

The email here also lacks a diffstat, which is useful, for example, to
check whether any maintainer's relevant files are touched. You can get
it with "--cover-letter".

In total the command should look like:
    git format-patch --cover-letter -v3 -7 HEAD

Of course you can use any other tools to achieve the same result, but
as of now the result is not the same.

Best regards,
Krzysztof

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 4/7] ARM: Use hack from powerpc to get current cpu number
  2021-09-07 22:00     ` [PATCH 4/7] ARM: Use hack from powerpc to get current cpu number Keith Packard
@ 2021-09-08  7:45       ` Ard Biesheuvel
  0 siblings, 0 replies; 36+ messages in thread
From: Ard Biesheuvel @ 2021-09-08  7:45 UTC (permalink / raw)
  To: Keith Packard
  Cc: Linux Kernel Mailing List, Abbott Liu, Andrew Morton,
	Andrey Ryabinin, Anshuman Khandual, Arnd Bergmann,
	Bjorn Andersson, Christoph Lameter, Dennis Zhou,
	Geert Uytterhoeven, Jens Axboe, Joe Perches, Kees Cook,
	Krzysztof Kozlowski, Linus Walleij, Linux ARM,
	Linux Memory Management List, Manivannan Sadhasivam,
	Marc Zyngier, Masahiro Yamada, Mike Rapoport, Nathan Chancellor,
	Nick Desaulniers, Nick Desaulniers, Nicolas Pitre, Russell King,
	Tejun Heo, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu

On Wed, 8 Sept 2021 at 00:00, Keith Packard <keithpac@amazon.com> wrote:
>
> When we enable THREAD_INFO_IN_TASK, the cpu number will disappear from
> thread_info and reappear in task_struct. As we cannot include
> linux/sched.h in asm/smp.h, there's no way to use that struct type in
> the raw_smp_processor_id macro. Instead, a hack from the powerpc code
> is used. This pulls the TI_CPU offset out of asm-offsets.h and uses
> that to find the cpu value.
>
> Signed-off-by: Keith Packard <keithpac@amazon.com>
> ---
>  arch/arm/Makefile          |  8 ++++++++
>  arch/arm/include/asm/smp.h | 18 +++++++++++++++++-
>  2 files changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/Makefile b/arch/arm/Makefile
> index 415c3514573a..6752995d2914 100644
> --- a/arch/arm/Makefile
> +++ b/arch/arm/Makefile
> @@ -284,6 +284,14 @@ stack_protector_prepare: prepare0
>         $(eval GCC_PLUGINS_CFLAGS += $(SSP_PLUGIN_CFLAGS))
>  endif
>
> +ifdef CONFIG_SMP
> +prepare: task_cpu_prepare
> +
> +PHONY += task_cpu_prepare
> +task_cpu_prepare: prepare0
> +       $(eval KBUILD_CFLAGS += -D_TI_CPU=$(shell awk '{if ($$2 == "TI_CPU") print $$3;}' include/generated/asm-offsets.h))
> +endif
> +
>  all:   $(notdir $(KBUILD_IMAGE))
>
>
> diff --git a/arch/arm/include/asm/smp.h b/arch/arm/include/asm/smp.h
> index d43b64635d77..f77ba3753bc4 100644
> --- a/arch/arm/include/asm/smp.h
> +++ b/arch/arm/include/asm/smp.h
> @@ -15,7 +15,23 @@
>  # error "<asm/smp.h> included in non-SMP build"
>  #endif
>
> -#define raw_smp_processor_id() (current_thread_info()->cpu)
> +/*
> + * This is particularly ugly: it appears we can't actually get the
> + * definition of task_struct here, but we need access to the CPU this
> + * task is running on, which is stored in task_struct when
> + * THREAD_INFO_IN_TASK is set.  Instead of using task_struct we're
> + * using TI_CPU which is extracted from asm-offsets.h by kbuild to get
> + * the current processor ID.
> + *
> + * This also needs to be safeguarded when building asm-offsets.s
> + * because at that time TI_CPU is not defined yet.
> + */
> +#ifndef _TI_CPU
> +#define raw_smp_processor_id()         (0)
> +#else
> +#define raw_smp_processor_id() \
> +       (*(unsigned int *)((void *)current_thread_info() + _TI_CPU))
> +#endif
>
>  struct seq_file;
>

As I stated before, I would really like to avoid using this hack - I
don't think we need to.

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3)
  2021-09-08  7:01     ` [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3) Krzysztof Kozlowski
@ 2021-09-08  7:47       ` Ard Biesheuvel
  2021-09-08  7:50         ` Geert Uytterhoeven
  0 siblings, 1 reply; 36+ messages in thread
From: Ard Biesheuvel @ 2021-09-08  7:47 UTC (permalink / raw)
  To: Krzysztof Kozlowski
  Cc: Keith Packard, Linux Kernel Mailing List, Abbott Liu,
	Andrew Morton, Andrey Ryabinin, Anshuman Khandual, Arnd Bergmann,
	Bjorn Andersson, Christoph Lameter, Dennis Zhou,
	Geert Uytterhoeven, Jens Axboe, Joe Perches, Kees Cook,
	Linus Walleij, Linux ARM, Linux Memory Management List,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Mike Rapoport, Nathan Chancellor, Nick Desaulniers,
	Nick Desaulniers, Nicolas Pitre, Russell King, Tejun Heo,
	Thomas Gleixner, Uwe Kleine-König, Valentin Schneider,
	Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu

On Wed, 8 Sept 2021 at 09:01, Krzysztof Kozlowski
<krzysztof.kozlowski@canonical.com> wrote:
>
> On 08/09/2021 00:00, Keith Packard wrote:
> > Placing thread_info in the kernel stack leaves it vulnerable to stack
> > overflow attacks. This short series addresses that by using the
> > existing THREAD_INFO_IN_TASK infrastructure.
> >
> > This is the third version of this series, in this version the changes
> > are restricted to hardware which provides the TPIDRPRW register. This
> > register is repurposed from holding the per_cpu_offset value to
> > holding the 'current' value as that allows fetching this value
> > atomically so that it can be used in a preemptable context.
> >
> > The series is broken into seven pieces:
> >
> >  1) Change the secondary_start_kernel API to receive the cpu
> >     number. This avoids needing to be able to find this value independently in
> >     future patches.
> >
> >  2) Change the secondary_start_kernel API to also receive the 'task'
> >     value. Passing the value to this function also avoids needing to
> >     be able to discover it independently.
> >
> >  3) A cleanup which avoids assuming that THREAD_INFO_IN_TASK is not set.
> >
> >  4) A hack, borrowed from the powerpc arch, which allows locating the 'cpu'
> >     field in either thread_info or task_struct, without requiring linux/sched.h
> >     to be included in asm/smp.h
> >
> >  5) Disable the optimization storing per_cpu_offset in TPIDRPRW. This leaves
> >     the register free to hold 'current' instead.
> >
> >  6) Use TPIDRPRW for 'current'. This is enabled for either CPU_V6K or CPU_V7,
> >     but not if CPU_V6 is also enabled.
> >
> >  7) Enable THREAD_INFO_IN_TASK whenever TPIDRPRW is used to hold 'current'.
>
> Hi,
>
> Thanks for your patches. This seems to be a v3 but the patches are not
> marked with it. Use "-v3" in format-patch to get it right.
>
> The email here also lacks diffstat which is useful, for example to check
> whether any maintainer's relevant files are touched here. You can get it
> with "--cover-letter".
>
> In total the command should look like:
>     git format-patch --cover-letter -v3 -7 HEAD
>
> Of course you can use any other tools to achieve the same result but as
> of now - result is not the same.
>

Also, this ended up in my GMail spam folder, likely due to some
antispam ID header being set incorrectly?

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3)
  2021-09-08  7:47       ` Ard Biesheuvel
@ 2021-09-08  7:50         ` Geert Uytterhoeven
  0 siblings, 0 replies; 36+ messages in thread
From: Geert Uytterhoeven @ 2021-09-08  7:50 UTC (permalink / raw)
  To: Ard Biesheuvel
  Cc: Krzysztof Kozlowski, Keith Packard, Linux Kernel Mailing List,
	Abbott Liu, Andrew Morton, Andrey Ryabinin, Anshuman Khandual,
	Arnd Bergmann, Bjorn Andersson, Christoph Lameter, Dennis Zhou,
	Geert Uytterhoeven, Jens Axboe, Joe Perches, Kees Cook,
	Linus Walleij, Linux ARM, Linux Memory Management List,
	Manivannan Sadhasivam, Marc Zyngier, Masahiro Yamada,
	Mike Rapoport, Nathan Chancellor, Nick Desaulniers,
	Nick Desaulniers, Nicolas Pitre, Russell King, Tejun Heo,
	Thomas Gleixner, Uwe Kleine-König, Valentin Schneider,
	Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu

Hi Ard,

On Wed, Sep 8, 2021 at 9:47 AM Ard Biesheuvel <ardb@kernel.org> wrote:
> Also, this ended up in my GMail spam folder, likely due to some
> antispam ID header being set incorrectly?

I have a rule to never mark email with "patch" in the subject as spam.
So far, only one false positive in all these years ;-)

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 5/7] ARM: Stop using TPIDRPRW to hold per_cpu_offset
  2021-09-07 22:00     ` [PATCH 5/7] ARM: Stop using TPIDRPRW to hold per_cpu_offset Keith Packard
@ 2021-09-09 13:54       ` Ard Biesheuvel
  0 siblings, 0 replies; 36+ messages in thread
From: Ard Biesheuvel @ 2021-09-09 13:54 UTC (permalink / raw)
  To: Keith Packard
  Cc: Linux Kernel Mailing List, Abbott Liu, Andrew Morton,
	Andrey Ryabinin, Anshuman Khandual, Arnd Bergmann,
	Bjorn Andersson, Christoph Lameter, Dennis Zhou,
	Geert Uytterhoeven, Jens Axboe, Joe Perches, Kees Cook,
	Krzysztof Kozlowski, Linus Walleij, Linux ARM,
	Linux Memory Management List, Manivannan Sadhasivam,
	Marc Zyngier, Masahiro Yamada, Mike Rapoport, Nathan Chancellor,
	Nick Desaulniers, Nick Desaulniers, Nicolas Pitre, Russell King,
	Tejun Heo, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu

On Wed, 8 Sept 2021 at 00:00, Keith Packard <keithpac@amazon.com> wrote:
>
> We're going to store TPIDRPRW here instead
>

?

> Signed-off-by: Keith Packard <keithpac@amazon.com>

I'd much prefer to keep using TPIDRPRW for the per-CPU offsets, and
use the user space TLS register for current.

There are several reasons for this:
- arm64 does the same - as someone who still cares about ARM while
many have moved on to arm64 or RISC-V, I am still trying to maintain
parity between ARM and arm64 where possible.
- efficiency: loading the per-CPU offset using a CPU id stored in
memory, which is then used to index the per-CPU offsets array in
memory, adds two additional loads to every load/store of a per-CPU
variable
- 'current' usually does not change value under the code's feet,
whereas per-CPU offsets might change at any time. Given the fact that
the CPU offset load is visible to the compiler as a memory access, I
suppose this should be safe, but I would still prefer per-CPU access
to avoid going via current if possible.
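
For reference, a minimal sketch of that alternative (not code from
either series; the choice of the user-space read-only register TPIDRURO
for 'current' is an assumption based on the description above):

	/* Keep TPIDRPRW for the per-CPU offset; hold 'current' in TPIDRURO
	 * instead, which is read-only at PL0 but writable at PL1. The user
	 * value would need to be restored on return to user space (not
	 * shown here). */
	struct task_struct;

	static inline void set_current(struct task_struct *tsk)
	{
		asm volatile("mcr p15, 0, %0, c13, c0, 3" : : "r" (tsk) : "memory");
	}

	static __always_inline struct task_struct *get_current(void)
	{
		unsigned long cur;

		asm("mrc p15, 0, %0, c13, c0, 3" : "=r" (cur));
		return (struct task_struct *)cur;
	}

That keeps the per-CPU offset a single register read, matching the
arm64 arrangement mentioned above.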

> ---
>  arch/arm/include/asm/percpu.h | 31 -------------------------------
>  arch/arm/kernel/setup.c       |  7 -------
>  arch/arm/kernel/smp.c         |  3 ---
>  3 files changed, 41 deletions(-)
>
> diff --git a/arch/arm/include/asm/percpu.h b/arch/arm/include/asm/percpu.h
> index e2fcb3cfd3de..eeafcd6a3e01 100644
> --- a/arch/arm/include/asm/percpu.h
> +++ b/arch/arm/include/asm/percpu.h
> @@ -7,37 +7,6 @@
>
>  register unsigned long current_stack_pointer asm ("sp");
>
> -/*
> - * Same as asm-generic/percpu.h, except that we store the per cpu offset
> - * in the TPIDRPRW. TPIDRPRW only exists on V6K and V7
> - */
> -#if defined(CONFIG_SMP) && !defined(CONFIG_CPU_V6)
> -static inline void set_my_cpu_offset(unsigned long off)
> -{
> -       /* Set TPIDRPRW */
> -       asm volatile("mcr p15, 0, %0, c13, c0, 4" : : "r" (off) : "memory");
> -}
> -
> -static inline unsigned long __my_cpu_offset(void)
> -{
> -       unsigned long off;
> -
> -       /*
> -        * Read TPIDRPRW.
> -        * We want to allow caching the value, so avoid using volatile and
> -        * instead use a fake stack read to hazard against barrier().
> -        */
> -       asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (off)
> -               : "Q" (*(const unsigned long *)current_stack_pointer));
> -
> -       return off;
> -}
> -#define __my_cpu_offset __my_cpu_offset()
> -#else
> -#define set_my_cpu_offset(x)   do {} while(0)
> -
> -#endif /* CONFIG_SMP */
> -
>  #include <asm-generic/percpu.h>
>
>  #endif /* _ASM_ARM_PERCPU_H_ */
> diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
> index ca0201635fac..d0dc60afe54f 100644
> --- a/arch/arm/kernel/setup.c
> +++ b/arch/arm/kernel/setup.c
> @@ -590,13 +590,6 @@ void __init smp_setup_processor_id(void)
>         for (i = 1; i < nr_cpu_ids; ++i)
>                 cpu_logical_map(i) = i == cpu ? 0 : i;
>
> -       /*
> -        * clear __my_cpu_offset on boot CPU to avoid hang caused by
> -        * using percpu variable early, for example, lockdep will
> -        * access percpu variable inside lock_release
> -        */
> -       set_my_cpu_offset(0);
> -
>         pr_info("Booting Linux on physical CPU 0x%x\n", mpidr);
>  }
>
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 5e999f1f1aea..8ccf10b34f08 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -410,8 +410,6 @@ asmlinkage void secondary_start_kernel(unsigned int cpu, struct task_struct *tas
>  {
>         struct mm_struct *mm = &init_mm;
>
> -       set_my_cpu_offset(per_cpu_offset(cpu));
> -
>         secondary_biglittle_init();
>
>         /*
> @@ -495,7 +493,6 @@ void __init smp_cpus_done(unsigned int max_cpus)
>
>  void __init smp_prepare_boot_cpu(void)
>  {
> -       set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
>  }
>
>  void __init smp_prepare_cpus(unsigned int max_cpus)
> --
> 2.33.0
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

* Re: [PATCH 6/7] ARM: Use TPIDRPRW for current
  2021-09-07 22:00     ` [PATCH 6/7] ARM: Use TPIDRPRW for current Keith Packard
@ 2021-09-09 13:56       ` Ard Biesheuvel
  0 siblings, 0 replies; 36+ messages in thread
From: Ard Biesheuvel @ 2021-09-09 13:56 UTC (permalink / raw)
  To: Keith Packard
  Cc: Linux Kernel Mailing List, Abbott Liu, Andrew Morton,
	Andrey Ryabinin, Anshuman Khandual, Arnd Bergmann,
	Bjorn Andersson, Christoph Lameter, Dennis Zhou,
	Geert Uytterhoeven, Jens Axboe, Joe Perches, Kees Cook,
	Krzysztof Kozlowski, Linus Walleij, Linux ARM,
	Linux Memory Management List, Manivannan Sadhasivam,
	Marc Zyngier, Masahiro Yamada, Mike Rapoport, Nathan Chancellor,
	Nick Desaulniers, Nick Desaulniers, Nicolas Pitre, Russell King,
	Tejun Heo, Thomas Gleixner, Uwe Kleine-König,
	Valentin Schneider, Viresh Kumar, Wolfram Sang (Renesas),
	YiFei Zhu

On Wed, 8 Sept 2021 at 00:00, Keith Packard <keithpac@amazon.com> wrote:
>
> Store current task pointer in CPU thread ID register TPIDRPRW so that
> accessing it doesn't depend on being able to locate thread_info off of
> the kernel stack pointer.
>
> Signed-off-by: Keith Packard <keithpac@amazon.com>
> ---
>  arch/arm/Kconfig                 |  4 +++
>  arch/arm/include/asm/assembler.h |  8 +++++
>  arch/arm/include/asm/current.h   | 52 ++++++++++++++++++++++++++++++++
>  arch/arm/kernel/entry-armv.S     |  4 +++
>  arch/arm/kernel/setup.c          |  1 +
>  arch/arm/kernel/smp.c            |  1 +
>  6 files changed, 70 insertions(+)
>  create mode 100644 arch/arm/include/asm/current.h
>
> diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
> index 24804f11302d..414fe23fd5ac 100644
> --- a/arch/arm/Kconfig
> +++ b/arch/arm/Kconfig
> @@ -1172,6 +1172,10 @@ config SMP_ON_UP
>
>           If you don't know what to do here, say Y.
>
> +config CURRENT_POINTER_IN_TPIDRPRW
> +       def_bool y
> +       depends on (CPU_V6K || CPU_V7) && !CPU_V6
> +
>  config ARM_CPU_TOPOLOGY
>         bool "Support cpu topology definition"
>         depends on SMP && CPU_V7
> diff --git a/arch/arm/include/asm/assembler.h b/arch/arm/include/asm/assembler.h
> index e2b1fd558bf3..ea12fe3bb589 100644
> --- a/arch/arm/include/asm/assembler.h
> +++ b/arch/arm/include/asm/assembler.h
> @@ -209,6 +209,14 @@
>         mov     \rd, \rd, lsl #THREAD_SIZE_ORDER + PAGE_SHIFT
>         .endm
>
> +/*
> + * Set the current task pointer
> + * @src: Source register containing task_struct pointer
> + */
> +       .macro  set_current src : req
> +       mcr     p15, 0, \src, c13, c0, 4
> +       .endm
> +
>  /*
>   * Increment/decrement the preempt count.
>   */
> diff --git a/arch/arm/include/asm/current.h b/arch/arm/include/asm/current.h
> new file mode 100644
> index 000000000000..153a2ea18747
> --- /dev/null
> +++ b/arch/arm/include/asm/current.h
> @@ -0,0 +1,52 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * Copyright © 2021 Keith Packard <keithp@keithp.com>
> + */
> +
> +#ifndef _ASM_ARM_CURRENT_H_
> +#define _ASM_ARM_CURRENT_H_
> +
> +#ifndef __ASSEMBLY__
> +
> +register unsigned long current_stack_pointer asm ("sp");
> +
> +/*
> + * Same as asm-generic/current.h, except that we store current
> + * in TPIDRPRW. TPIDRPRW only exists on V6K and V7
> + */
> +#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRPRW
> +
> +struct task_struct;
> +
> +static inline void set_current(struct task_struct *tsk)
> +{
> +       /* Set TPIDRPRW */
> +       asm volatile("mcr p15, 0, %0, c13, c0, 4" : : "r" (tsk) : "memory");
> +}
> +
> +static __always_inline struct task_struct *get_current(void)
> +{
> +       struct task_struct *tsk;
> +
> +       /*
> +        * Read TPIDRPRW.
> +        * We want to allow caching the value, so avoid using volatile and
> +        * instead use a fake stack read to hazard against barrier().
> +        */
> +       asm("mrc p15, 0, %0, c13, c0, 4" : "=r" (tsk)
> +               : "Q" (*(const unsigned long *)current_stack_pointer));
> +
> +       return tsk;
> +}
> +#define current get_current()
> +#else
> +
> +#define set_current(tsk) do {} while (0)
> +
> +#include <asm-generic/current.h>
> +
> +#endif /* CONFIG_CURRENT_POINTER_IN_TPIDRPRW */
> +
> +#endif /* __ASSEMBLY__ */
> +
> +#endif /* _ASM_ARM_CURRENT_H_ */
> diff --git a/arch/arm/kernel/entry-armv.S b/arch/arm/kernel/entry-armv.S
> index 0ea8529a4872..db3947ee9c3e 100644
> --- a/arch/arm/kernel/entry-armv.S
> +++ b/arch/arm/kernel/entry-armv.S
> @@ -761,6 +761,10 @@ ENTRY(__switch_to)
>         ldr     r6, [r2, #TI_CPU_DOMAIN]
>  #endif
>         switch_tls r1, r4, r5, r3, r7
> +#ifdef CONFIG_CURRENT_POINTER_IN_TPIDRPRW
> +       ldr     r7, [r2, #TI_TASK]
> +       set_current r7
> +#endif

This is too early: this will cause the thread notification hooks to be
called with current pointing to the new task instead of the old one.

>  #if defined(CONFIG_STACKPROTECTOR) && !defined(CONFIG_SMP)
>         ldr     r7, [r2, #TI_TASK]
>         ldr     r8, =__stack_chk_guard
> diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
> index d0dc60afe54f..2fdf8c31d6c9 100644
> --- a/arch/arm/kernel/setup.c
> +++ b/arch/arm/kernel/setup.c
> @@ -586,6 +586,7 @@ void __init smp_setup_processor_id(void)
>         u32 mpidr = is_smp() ? read_cpuid_mpidr() & MPIDR_HWID_BITMASK : 0;
>         u32 cpu = MPIDR_AFFINITY_LEVEL(mpidr, 0);
>
> +       set_current(&init_task);
>         cpu_logical_map(0) = cpu;
>         for (i = 1; i < nr_cpu_ids; ++i)
>                 cpu_logical_map(i) = i == cpu ? 0 : i;
> diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
> index 8ccf10b34f08..09771916442a 100644
> --- a/arch/arm/kernel/smp.c
> +++ b/arch/arm/kernel/smp.c
> @@ -410,6 +410,7 @@ asmlinkage void secondary_start_kernel(unsigned int cpu, struct task_struct *tas
>  {
>         struct mm_struct *mm = &init_mm;
>
> +       set_current(task);
>         secondary_biglittle_init();
>
>         /*
> --
> 2.33.0
>

^ permalink raw reply	[flat|nested] 36+ messages in thread

end of thread, other threads:[~2021-09-09 13:58 UTC | newest]

Thread overview: 36+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-02 15:54 [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK Keith Packard
2021-09-02 15:54 ` [PATCH 1/2] ARM: Add per-cpu variable holding cpu number Keith Packard
2021-09-02 15:54 ` [PATCH 2/2] ARM: Move thread_info into task_struct Keith Packard
2021-09-02 16:07 ` [PATCH 0/2]: ARM: Enable THREAD_INFO_IN_TASK Kees Cook
2021-09-02 16:18   ` Ard Biesheuvel
2021-09-02 17:37     ` Kees Cook
2021-09-02 16:54   ` Russell King (Oracle)
2021-09-02 16:53 ` Russell King (Oracle)
2021-09-02 17:35   ` Kees Cook
2021-09-02 17:58   ` Keith Packard
2021-09-04  6:09 ` [PATCH 0/2] ARM: support THREAD_INFO_IN_TASK (v7 only) (v2) Keith Packard
2021-09-04  6:09   ` [PATCH 1/3] ARM: Pass cpu number to secondary_start_kernel Keith Packard
2021-09-05 20:25     ` Ard Biesheuvel
2021-09-04  6:09   ` [PATCH 2/3] ARM: Move thread_info into task_struct (v7 only) Keith Packard
2021-09-05 20:56     ` Ard Biesheuvel
2021-09-06  6:14       ` Keith Packard
2021-09-06  7:49         ` Ard Biesheuvel
2021-09-07 15:24           ` Keith Packard
2021-09-07 16:05             ` Ard Biesheuvel
2021-09-07 22:17               ` Keith Packard
2021-09-06  6:20       ` Keith Packard
2021-09-04  6:09   ` [PATCH 3/3] ARM: Add per-cpu variable cpu_number " Keith Packard
2021-09-07 22:00   ` [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3) Keith Packard
2021-09-07 22:00     ` [PATCH 1/7] ARM: Pass cpu number to secondary_start_kernel Keith Packard
2021-09-07 22:00     ` [PATCH 2/7] ARM: Pass task " Keith Packard
2021-09-07 22:00     ` [PATCH 3/7] ARM: Use smp_processor_id() in vfp_pm_suspend instead of ti->cpu Keith Packard
2021-09-07 22:00     ` [PATCH 4/7] ARM: Use hack from powerpc to get current cpu number Keith Packard
2021-09-08  7:45       ` Ard Biesheuvel
2021-09-07 22:00     ` [PATCH 5/7] ARM: Stop using TPIDRPRW to hold per_cpu_offset Keith Packard
2021-09-09 13:54       ` Ard Biesheuvel
2021-09-07 22:00     ` [PATCH 6/7] ARM: Use TPIDRPRW for current Keith Packard
2021-09-09 13:56       ` Ard Biesheuvel
2021-09-07 22:00     ` [PATCH 7/7] ARM: Move thread_info into task_struct (v7 only) Keith Packard
2021-09-08  7:01     ` [PATCH 0/7] ARM: support THREAD_INFO_IN_TASK (v3) Krzysztof Kozlowski
2021-09-08  7:47       ` Ard Biesheuvel
2021-09-08  7:50         ` Geert Uytterhoeven
