* [PATCH 0/8] ARM: ftrace fixes and cleanups
@ 2022-01-25 15:36 Ard Biesheuvel
  2022-01-25 15:36 ` [PATCH 1/8] ARM: ftrace: ensure that ADR takes the Thumb bit into account Ard Biesheuvel
                   ` (7 more replies)
  0 siblings, 8 replies; 16+ messages in thread
From: Ard Biesheuvel @ 2022-01-25 15:36 UTC (permalink / raw)
  To: linux, linux-arm-kernel
  Cc: Ard Biesheuvel, Arnd Bergmann, Linus Walleij, Nick Desaulniers

This series addresses a number of issues in the ARM support code for
ftrace, mostly related to Thumb2 but affecting other configurations as
well.

Cc: Russell King <linux@armlinux.org.uk>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Linus Walleij <linus.walleij@linaro.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>

Ard Biesheuvel (8):
  ARM: ftrace: ensure that ADR takes the Thumb bit into account
  ARM: ftrace: use ADD not POP to counter PUSH at entry
  ARM: ftrace: use trampolines to keep .init.text in branching range
  ARM: ftrace: avoid redundant loads or clobbering IP
  ARM: ftrace: avoid unnecessary literal loads
  ARM: ftrace: enable HAVE_FUNCTION_GRAPH_FP_TEST
  ARM: unwind: track location of LR value in stack frame
  ARM: ftrace: enable the graph tracer with the EABI unwinder

 arch/arm/Kconfig                  |   2 +-
 arch/arm/Kconfig.debug            |   2 +-
 arch/arm/include/asm/ftrace.h     |  20 +--
 arch/arm/include/asm/stacktrace.h |   2 +
 arch/arm/kernel/Makefile          |   6 +-
 arch/arm/kernel/entry-ftrace.S    | 128 +++++++++++---------
 arch/arm/kernel/ftrace.c          |  47 ++++++-
 arch/arm/kernel/unwind.c          |   7 +-
 8 files changed, 125 insertions(+), 89 deletions(-)

-- 
2.30.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/8] ARM: ftrace: ensure that ADR takes the Thumb bit into account
  2022-01-25 15:36 [PATCH 0/8] ARM: ftrace fixes and cleanups Ard Biesheuvel
@ 2022-01-25 15:36 ` Ard Biesheuvel
  2022-01-25 19:14   ` Nick Desaulniers
  2022-02-02 23:53   ` Linus Walleij
  2022-01-25 15:36 ` [PATCH 2/8] ARM: ftrace: use ADD not POP to counter PUSH at entry Ard Biesheuvel
                   ` (6 subsequent siblings)
  7 siblings, 2 replies; 16+ messages in thread
From: Ard Biesheuvel @ 2022-01-25 15:36 UTC (permalink / raw)
  To: linux, linux-arm-kernel
  Cc: Ard Biesheuvel, Arnd Bergmann, Linus Walleij, Nick Desaulniers

Using ADR to take the address of 'ftrace_stub' via a local label
produces an address that has the Thumb bit cleared, which means the
subsequent comparison is guaranteed to fail. Instead, use the badr
macro, which forces the Thumb bit to be set.
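
The failure mode is easy to model outside the kernel: on Thumb2, the function pointer held in ftrace_trace_function carries the Thumb bit (bit 0), while a plain ADR of a local label yields the raw address with bit 0 clear. A minimal sketch (the address below is hypothetical):

```python
# Hypothetical address of ftrace_stub's first instruction.
STUB_ADDR = 0xC0101234

# On Thumb2, the C-visible function pointer carries the Thumb bit (bit 0):
ftrace_trace_function = STUB_ADDR | 1

# ADR on a local label produces the plain PC-relative address, bit 0 clear:
adr_result = STUB_ADDR

# The badr macro sets the Thumb bit when building for Thumb2:
badr_result = STUB_ADDR | 1

assert adr_result != ftrace_trace_function   # comparison always fails
assert badr_result == ftrace_trace_function  # badr makes it succeed
```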

Fixes: a3ba87a61499 ("ARM: 6316/1: ftrace: add Thumb-2 support")
Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/entry-ftrace.S | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
index a74289ebc803..f4886fb6e9ba 100644
--- a/arch/arm/kernel/entry-ftrace.S
+++ b/arch/arm/kernel/entry-ftrace.S
@@ -40,7 +40,7 @@
 	mcount_enter
 	ldr	r0, =ftrace_trace_function
 	ldr	r2, [r0]
-	adr	r0, .Lftrace_stub
+	badr	r0, .Lftrace_stub
 	cmp	r0, r2
 	bne	1f
 
-- 
2.30.2



* [PATCH 2/8] ARM: ftrace: use ADD not POP to counter PUSH at entry
  2022-01-25 15:36 [PATCH 0/8] ARM: ftrace fixes and cleanups Ard Biesheuvel
  2022-01-25 15:36 ` [PATCH 1/8] ARM: ftrace: ensure that ADR takes the Thumb bit into account Ard Biesheuvel
@ 2022-01-25 15:36 ` Ard Biesheuvel
  2022-01-25 19:23   ` Nick Desaulniers
  2022-02-02 23:59   ` Linus Walleij
  2022-01-25 15:36 ` [PATCH 3/8] ARM: ftrace: use trampolines to keep .init.text in branching range Ard Biesheuvel
                   ` (5 subsequent siblings)
  7 siblings, 2 replies; 16+ messages in thread
From: Ard Biesheuvel @ 2022-01-25 15:36 UTC (permalink / raw)
  To: linux, linux-arm-kernel
  Cc: Ard Biesheuvel, Arnd Bergmann, Linus Walleij, Nick Desaulniers

The compiler-emitted hook used for ftrace consists of a PUSH {LR} to
preserve the link register, followed by a branch-and-link (BL) to
__gnu_mcount_nc. Dynamic ftrace patches out the latter, turning the
combined sequence into a NOP, using a POP {LR} instruction.

This is not necessary: the link register does not get clobbered in this
case, so simply adding #4 to the stack pointer is sufficient, and this
avoids a memory access that may take a few cycles to resolve depending
on the micro-architecture.
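
The two ARM opcode constants in the hunk below can be cross-checked by reconstructing them from their A32 instruction fields (a quick sketch covering the ARM encodings only; the Thumb2 words follow the T32 encoding rules analogously):

```python
def arm_add_sp_sp_imm(imm):
    # A32 data-processing (immediate): cond=AL (0xE), immediate form (bit 25),
    # opcode=ADD (0b0100), S=0, Rn=Rd=r13 (SP), rotation 0, 8-bit immediate.
    return (0xE << 28) | (1 << 25) | (0b0100 << 21) | (13 << 16) | (13 << 12) | imm

def arm_pop_lr():
    # A32 LDMIA sp!, {lr}: cond=AL, P=0 U=1 S=0 W=1 L=1, Rn=r13 (SP),
    # register list with only bit 14 (LR) set.
    return (0xE << 28) | (0b100 << 25) | (0b01011 << 20) | (13 << 16) | (1 << 14)

assert hex(arm_add_sp_sp_imm(4)) == "0xe28dd004"   # the new ARM "NOP"
assert hex(arm_pop_lr()) == "0xe8bd4000"           # the old ARM "NOP"
```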

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/entry-ftrace.S | 2 +-
 arch/arm/kernel/ftrace.c       | 4 ++--
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
index f4886fb6e9ba..dca12a09322a 100644
--- a/arch/arm/kernel/entry-ftrace.S
+++ b/arch/arm/kernel/entry-ftrace.S
@@ -27,7 +27,7 @@
  * allows it to be clobbered in subroutines and doesn't use it to hold
  * parameters.)
  *
- * When using dynamic ftrace, we patch out the mcount call by a "pop {lr}"
+ * When using dynamic ftrace, we patch out the mcount call by a "add sp, #4"
  * instead of the __gnu_mcount_nc call (see arch/arm/kernel/ftrace.c).
  */
 
diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
index a006585e1c09..db72d3a6522d 100644
--- a/arch/arm/kernel/ftrace.c
+++ b/arch/arm/kernel/ftrace.c
@@ -25,9 +25,9 @@
 #include <asm/patch.h>
 
 #ifdef CONFIG_THUMB2_KERNEL
-#define	NOP		0xf85deb04	/* pop.w {lr} */
+#define	NOP		0xf10d0d04	/* add.w sp, sp, #4 */
 #else
-#define	NOP		0xe8bd4000	/* pop {lr} */
+#define	NOP		0xe28dd004	/* add   sp, sp, #4 */
 #endif
 
 #ifdef CONFIG_DYNAMIC_FTRACE
-- 
2.30.2



* [PATCH 3/8] ARM: ftrace: use trampolines to keep .init.text in branching range
  2022-01-25 15:36 [PATCH 0/8] ARM: ftrace fixes and cleanups Ard Biesheuvel
  2022-01-25 15:36 ` [PATCH 1/8] ARM: ftrace: ensure that ADR takes the Thumb bit into account Ard Biesheuvel
  2022-01-25 15:36 ` [PATCH 2/8] ARM: ftrace: use ADD not POP to counter PUSH at entry Ard Biesheuvel
@ 2022-01-25 15:36 ` Ard Biesheuvel
  2022-01-25 20:20   ` Nick Desaulniers
  2022-02-03  0:12   ` Linus Walleij
  2022-01-25 15:36 ` [PATCH 4/8] ARM: ftrace: avoid redundant loads or clobbering IP Ard Biesheuvel
                   ` (4 subsequent siblings)
  7 siblings, 2 replies; 16+ messages in thread
From: Ard Biesheuvel @ 2022-01-25 15:36 UTC (permalink / raw)
  To: linux, linux-arm-kernel
  Cc: Ard Biesheuvel, Arnd Bergmann, Linus Walleij, Nick Desaulniers

Kernel images that are large compared with the range of a direct branch
may fail to work as expected with ftrace, as patching a direct branch to
one of the core ftrace routines may not be possible from the .init.text
section if it is emitted too far away from the normal .text section.

This is more likely to affect Thumb2 builds, given that the Thumb2 BL
range is only -/+ 16 MiB (as opposed to -/+ 32 MiB for ARM), but it may
occur with either ISA.
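
A rough illustration of the arithmetic involved (the addresses are made up, and the exact BL reach differs by a few bytes at the boundaries):

```python
# Approximate immediate-branch reach of BL per ISA: -/+ 32 MiB for ARM,
# -/+ 16 MiB for Thumb2.
BL_RANGE = {"arm": 32 << 20, "thumb2": 16 << 20}

def bl_reaches(caller, callee, isa):
    return -BL_RANGE[isa] <= (callee - caller) < BL_RANGE[isa]

# Hypothetical layout: ftrace_caller near the start of .text, with a caller
# in .init.text emitted 20 MiB further along.
ftrace_caller = 0xC0008000
init_caller = ftrace_caller + (20 << 20)

assert bl_reaches(init_caller, ftrace_caller, "arm")         # ARM: in range
assert not bl_reaches(init_caller, ftrace_caller, "thumb2")  # Thumb2: too far
```

This is why the trampolines live in .init.text itself: a branch from an .init.text call site to a nearby trampoline is always in range, and the trampoline reaches the real routine via an absolute `ldr pc, =...` load.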

To work around this, add a couple of trampolines to .init.text and
swap these in when the ftrace patching code is operating on callers in
.init.text.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/entry-ftrace.S | 16 ++++++++++++++++
 arch/arm/kernel/ftrace.c       | 19 +++++++++++++++++--
 2 files changed, 33 insertions(+), 2 deletions(-)

diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
index dca12a09322a..237d435e29aa 100644
--- a/arch/arm/kernel/entry-ftrace.S
+++ b/arch/arm/kernel/entry-ftrace.S
@@ -270,3 +270,19 @@ ENTRY(ftrace_stub)
 .Lftrace_stub:
 	ret	lr
 ENDPROC(ftrace_stub)
+
+#ifdef CONFIG_DYNAMIC_FTRACE
+
+	__INIT
+
+	.macro	init_tramp, dst:req
+ENTRY(\dst\()_from_init)
+	ldr	pc, =\dst
+ENDPROC(\dst\()_from_init)
+	.endm
+
+	init_tramp	ftrace_caller
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
+	init_tramp	ftrace_regs_caller
+#endif
+#endif
diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
index db72d3a6522d..d2326794fd09 100644
--- a/arch/arm/kernel/ftrace.c
+++ b/arch/arm/kernel/ftrace.c
@@ -51,9 +51,18 @@ static unsigned long ftrace_nop_replace(struct dyn_ftrace *rec)
 	return NOP;
 }
 
+void ftrace_caller_from_init(void);
+void ftrace_regs_caller_from_init(void);
+
 static unsigned long adjust_address(struct dyn_ftrace *rec, unsigned long addr)
 {
-	return addr;
+	if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE) ||
+	    likely(!is_kernel_inittext(rec->ip)))
+		return addr;
+	if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS) ||
+	    addr == (unsigned long)&ftrace_caller)
+		return (unsigned long)&ftrace_caller_from_init;
+	return (unsigned long)&ftrace_regs_caller_from_init;
 }
 
 int ftrace_arch_code_modify_prepare(void)
@@ -189,7 +198,13 @@ int ftrace_make_nop(struct module *mod,
 #endif
 
 	new = ftrace_nop_replace(rec);
-	ret = ftrace_modify_code(ip, old, new, true);
+	/*
+	 * Locations in .init.text may call __gnu_mcount_nc via a linker
+	 * emitted veneer if they are too far away from its implementation, and
+	 * so validation may fail spuriously in such cases. Let's work around
+	 * this by omitting those from validation.
+	 */
+	ret = ftrace_modify_code(ip, old, new, !is_kernel_inittext(ip));
 
 	return ret;
 }
-- 
2.30.2



* [PATCH 4/8] ARM: ftrace: avoid redundant loads or clobbering IP
  2022-01-25 15:36 [PATCH 0/8] ARM: ftrace fixes and cleanups Ard Biesheuvel
                   ` (2 preceding siblings ...)
  2022-01-25 15:36 ` [PATCH 3/8] ARM: ftrace: use trampolines to keep .init.text in branching range Ard Biesheuvel
@ 2022-01-25 15:36 ` Ard Biesheuvel
  2022-01-25 15:36 ` [PATCH 5/8] ARM: ftrace: avoid unnecessary literal loads Ard Biesheuvel
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 16+ messages in thread
From: Ard Biesheuvel @ 2022-01-25 15:36 UTC (permalink / raw)
  To: linux, linux-arm-kernel
  Cc: Ard Biesheuvel, Arnd Bergmann, Linus Walleij, Nick Desaulniers

Tweak the ftrace return paths to avoid redundant loads of SP, as well as
unnecessary clobbering of IP.

This also fixes the inconsistency of using MOV to perform a function
return, which is sub-optimal on recent micro-architectures but, more
importantly, does not perform an interworking return, unlike the
compiler-generated function returns in Thumb2 builds.

Let's fix this by popping PC from the stack like most ordinary code
does.
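
The reworked entry sequence can be modeled step by step to confirm that it reproduces the pt_regs layout documented in the "@ stack content at this point" comment in the hunk below (a word-granular sketch, not kernel code; register contents are stand-in strings):

```python
S = 0x1000                        # SP value after the compiler hook's push {lr}
mem = {S: "previous LR"}          # slot written by that push {lr}
regs = {f"r{i}": f"r{i}" for i in range(12)}
lr = "return address"             # LR also holds the instrumented fn's resume PC
sp = S

sp -= 8; mem[sp] = lr             # str lr, [sp, #-8]!  (LR saved in the PC slot)
lr = mem[sp + 8]                  # ldr lr, [sp, #8]    (reload previous LR)
mem[sp + 8] = regs["r0"]          # str r0, [sp, #8]    (r0 becomes OLD_R0)
sp -= 4; mem[sp] = lr             # str lr, [sp, #-4]!  (previous LR slot)
lr = sp + 16                      # add lr, sp, #16     (SP before the push {lr})
for v in reversed([regs[f"r{i}"] for i in range(12)] + ["ip", lr]):
    sp -= 4; mem[sp] = v          # push {r0-r11, ip, lr}

# Resulting layout relative to the final SP, matching the patch's comment:
assert mem[sp + 0] == "r0"
assert mem[sp + 48] == "ip"
assert mem[sp + 52] == S + 4              # SP as it was before the push {lr}
assert mem[sp + 56] == "previous LR"
assert mem[sp + 60] == "return address"   # PC slot (offset 64 is PSR, unwritten)
assert mem[sp + 68] == "r0"               # OLD_R0
```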

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/entry-ftrace.S | 51 +++++++++-----------
 1 file changed, 22 insertions(+), 29 deletions(-)

diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
index 237d435e29aa..67548c38a567 100644
--- a/arch/arm/kernel/entry-ftrace.S
+++ b/arch/arm/kernel/entry-ftrace.S
@@ -22,10 +22,7 @@
  * mcount can be thought of as a function called in the middle of a subroutine
  * call.  As such, it needs to be transparent for both the caller and the
  * callee: the original lr needs to be restored when leaving mcount, and no
- * registers should be clobbered.  (In the __gnu_mcount_nc implementation, we
- * clobber the ip register.  This is OK because the ARM calling convention
- * allows it to be clobbered in subroutines and doesn't use it to hold
- * parameters.)
+ * registers should be clobbered.
  *
  * When using dynamic ftrace, we patch out the mcount call by a "add sp, #4"
  * instead of the __gnu_mcount_nc call (see arch/arm/kernel/ftrace.c).
@@ -70,26 +67,25 @@
 
 .macro __ftrace_regs_caller
 
-	sub	sp, sp, #8	@ space for PC and CPSR OLD_R0,
+	str	lr, [sp, #-8]!	@ store LR as PC and make space for CPSR/OLD_R0,
 				@ OLD_R0 will overwrite previous LR
 
-	add 	ip, sp, #12	@ move in IP the value of SP as it was
-				@ before the push {lr} of the mcount mechanism
+	ldr	lr, [sp, #8]    @ get previous LR
 
-	str     lr, [sp, #0]    @ store LR instead of PC
+	str	r0, [sp, #8]	@ write r0 as OLD_R0 over previous LR
 
-	ldr     lr, [sp, #8]    @ get previous LR
+	str	lr, [sp, #-4]!	@ store previous LR as LR
 
-	str	r0, [sp, #8]	@ write r0 as OLD_R0 over previous LR
+	add 	lr, sp, #16	@ move in LR the value of SP as it was
+				@ before the push {lr} of the mcount mechanism
 
-	stmdb   sp!, {ip, lr}
-	stmdb   sp!, {r0-r11, lr}
+	push	{r0-r11, ip, lr}
 
 	@ stack content at this point:
 	@ 0  4          48   52       56            60   64    68       72
-	@ R0 | R1 | ... | LR | SP + 4 | previous LR | LR | PSR | OLD_R0 |
+	@ R0 | R1 | ... | IP | SP + 4 | previous LR | LR | PSR | OLD_R0 |
 
-	mov r3, sp				@ struct pt_regs*
+	mov	r3, sp				@ struct pt_regs*
 
 	ldr r2, =function_trace_op
 	ldr r2, [r2]				@ pointer to the current
@@ -112,11 +108,9 @@ ftrace_graph_regs_call:
 #endif
 
 	@ pop saved regs
-	ldmia   sp!, {r0-r12}			@ restore r0 through r12
-	ldr	ip, [sp, #8]			@ restore PC
-	ldr	lr, [sp, #4]			@ restore LR
-	ldr	sp, [sp, #0]			@ restore SP
-	mov	pc, ip				@ return
+	pop	{r0-r11, ip, lr}		@ restore r0 through r12
+	ldr	lr, [sp], #4			@ restore LR
+	ldr	pc, [sp], #12
 .endm
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
@@ -132,11 +126,9 @@ ftrace_graph_regs_call:
 	bl	prepare_ftrace_return
 
 	@ pop registers saved in ftrace_regs_caller
-	ldmia   sp!, {r0-r12}			@ restore r0 through r12
-	ldr	ip, [sp, #8]			@ restore PC
-	ldr	lr, [sp, #4]			@ restore LR
-	ldr	sp, [sp, #0]			@ restore SP
-	mov	pc, ip				@ return
+	pop	{r0-r11, ip, lr}		@ restore r0 through r12
+	ldr	lr, [sp], #4			@ restore LR
+	ldr	pc, [sp], #12
 
 .endm
 #endif
@@ -202,16 +194,17 @@ ftrace_graph_call\suffix:
 .endm
 
 .macro mcount_exit
-	ldmia	sp!, {r0-r3, ip, lr}
-	ret	ip
+	ldmia	sp!, {r0-r3}
+	ldr	lr, [sp, #4]
+	ldr	pc, [sp], #8
 .endm
 
 ENTRY(__gnu_mcount_nc)
 UNWIND(.fnstart)
 #ifdef CONFIG_DYNAMIC_FTRACE
-	mov	ip, lr
-	ldmia	sp!, {lr}
-	ret	ip
+	push	{lr}
+	ldr	lr, [sp, #4]
+	ldr	pc, [sp], #8
 #else
 	__mcount
 #endif
-- 
2.30.2



* [PATCH 5/8] ARM: ftrace: avoid unnecessary literal loads
  2022-01-25 15:36 [PATCH 0/8] ARM: ftrace fixes and cleanups Ard Biesheuvel
                   ` (3 preceding siblings ...)
  2022-01-25 15:36 ` [PATCH 4/8] ARM: ftrace: avoid redundant loads or clobbering IP Ard Biesheuvel
@ 2022-01-25 15:36 ` Ard Biesheuvel
  2022-01-25 20:27   ` Nick Desaulniers
  2022-01-25 15:36 ` [PATCH 6/8] ARM: ftrace: enable HAVE_FUNCTION_GRAPH_FP_TEST Ard Biesheuvel
                   ` (2 subsequent siblings)
  7 siblings, 1 reply; 16+ messages in thread
From: Ard Biesheuvel @ 2022-01-25 15:36 UTC (permalink / raw)
  To: linux, linux-arm-kernel
  Cc: Ard Biesheuvel, Arnd Bergmann, Linus Walleij, Nick Desaulniers

Avoid explicit literal loads, and instead use accessor macros that
generate the optimal sequence for the architecture revision being
targeted.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/kernel/entry-ftrace.S | 27 ++++++++------------
 1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
index 67548c38a567..99720064a4c5 100644
--- a/arch/arm/kernel/entry-ftrace.S
+++ b/arch/arm/kernel/entry-ftrace.S
@@ -35,23 +35,20 @@
 
 .macro __mcount suffix
 	mcount_enter
-	ldr	r0, =ftrace_trace_function
-	ldr	r2, [r0]
+	ldr_va	r2, ftrace_trace_function
 	badr	r0, .Lftrace_stub
 	cmp	r0, r2
 	bne	1f
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
-	ldr     r1, =ftrace_graph_return
-	ldr     r2, [r1]
-	cmp     r0, r2
-	bne     ftrace_graph_caller\suffix
-
-	ldr     r1, =ftrace_graph_entry
-	ldr     r2, [r1]
-	ldr     r0, =ftrace_graph_entry_stub
-	cmp     r0, r2
-	bne     ftrace_graph_caller\suffix
+	ldr_va	r2, ftrace_graph_return
+	cmp	r0, r2
+	bne	ftrace_graph_caller\suffix
+
+	ldr_va	r2, ftrace_graph_entry
+	mov_l	r0, ftrace_graph_entry_stub
+	cmp	r0, r2
+	bne	ftrace_graph_caller\suffix
 #endif
 
 	mcount_exit
@@ -87,8 +84,7 @@
 
 	mov	r3, sp				@ struct pt_regs*
 
-	ldr r2, =function_trace_op
-	ldr r2, [r2]				@ pointer to the current
+	ldr_va	r2, function_trace_op		@ pointer to the current
 						@ function tracing op
 
 	ldr	r1, [sp, #S_LR]			@ lr of instrumented func
@@ -141,8 +137,7 @@ ftrace_graph_regs_call:
 	mcount_adjust_addr	r0, lr		@ instrumented function
 
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
-	ldr r2, =function_trace_op
-	ldr r2, [r2]				@ pointer to the current
+	ldr_va	r2, function_trace_op		@ pointer to the current
 						@ function tracing op
 	mov r3, #0				@ regs is NULL
 #endif
-- 
2.30.2



* [PATCH 6/8] ARM: ftrace: enable HAVE_FUNCTION_GRAPH_FP_TEST
  2022-01-25 15:36 [PATCH 0/8] ARM: ftrace fixes and cleanups Ard Biesheuvel
                   ` (4 preceding siblings ...)
  2022-01-25 15:36 ` [PATCH 5/8] ARM: ftrace: avoid unnecessary literal loads Ard Biesheuvel
@ 2022-01-25 15:36 ` Ard Biesheuvel
  2022-01-25 15:36 ` [PATCH 7/8] ARM: unwind: track location of LR value in stack frame Ard Biesheuvel
  2022-01-25 15:36 ` [PATCH 8/8] ARM: ftrace: enable the graph tracer with the EABI unwinder Ard Biesheuvel
  7 siblings, 0 replies; 16+ messages in thread
From: Ard Biesheuvel @ 2022-01-25 15:36 UTC (permalink / raw)
  To: linux, linux-arm-kernel
  Cc: Ard Biesheuvel, Arnd Bergmann, Linus Walleij, Nick Desaulniers

Fix the frame pointer handling in the function graph tracer entry and
exit code so that we can enable HAVE_FUNCTION_GRAPH_FP_TEST. Instead of
using FP directly (which will have different values between the entry
and exit pieces of the function graph tracer), use the value of SP at
entry and exit, given that the entry value can be derived from the frame
pointer.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/include/asm/ftrace.h  | 2 ++
 arch/arm/kernel/entry-ftrace.S | 2 +-
 arch/arm/kernel/ftrace.c       | 5 +++++
 3 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/ftrace.h b/arch/arm/include/asm/ftrace.h
index a4dbac07e4ef..b4f5fab6b04e 100644
--- a/arch/arm/include/asm/ftrace.h
+++ b/arch/arm/include/asm/ftrace.h
@@ -2,6 +2,8 @@
 #ifndef _ASM_ARM_FTRACE
 #define _ASM_ARM_FTRACE
 
+#define HAVE_FUNCTION_GRAPH_FP_TEST
+
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
 #define ARCH_SUPPORTS_FTRACE_OPS 1
 #endif
diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
index 99720064a4c5..bbfa0954c385 100644
--- a/arch/arm/kernel/entry-ftrace.S
+++ b/arch/arm/kernel/entry-ftrace.S
@@ -247,7 +247,7 @@ ENDPROC(ftrace_graph_regs_caller)
 	.globl return_to_handler
 return_to_handler:
 	stmdb	sp!, {r0-r3}
-	mov	r0, fp			@ frame pointer
+	add	r0, sp, #16		@ sp at exit of instrumented routine
 	bl	ftrace_return_to_handler
 	mov	lr, r0			@ r0 has real ret addr
 	ldmia	sp!, {r0-r3}
diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
index d2326794fd09..988525b274e3 100644
--- a/arch/arm/kernel/ftrace.c
+++ b/arch/arm/kernel/ftrace.c
@@ -220,6 +220,11 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
 	if (unlikely(atomic_read(&current->tracing_graph_pause)))
 		return;
 
+	if (IS_ENABLED(CONFIG_UNWINDER_FRAME_POINTER)) {
+		/* FP points one word below parent's top of stack */
+		frame_pointer += 4;
+	}
+
 	old = *parent;
 	*parent = return_hooker;
 
-- 
2.30.2



* [PATCH 7/8] ARM: unwind: track location of LR value in stack frame
  2022-01-25 15:36 [PATCH 0/8] ARM: ftrace fixes and cleanups Ard Biesheuvel
                   ` (5 preceding siblings ...)
  2022-01-25 15:36 ` [PATCH 6/8] ARM: ftrace: enable HAVE_FUNCTION_GRAPH_FP_TEST Ard Biesheuvel
@ 2022-01-25 15:36 ` Ard Biesheuvel
  2022-01-25 15:36 ` [PATCH 8/8] ARM: ftrace: enable the graph tracer with the EABI unwinder Ard Biesheuvel
  7 siblings, 0 replies; 16+ messages in thread
From: Ard Biesheuvel @ 2022-01-25 15:36 UTC (permalink / raw)
  To: linux, linux-arm-kernel
  Cc: Ard Biesheuvel, Arnd Bergmann, Linus Walleij, Nick Desaulniers

The ftrace graph tracer needs to override the return address of an
instrumented function, in order to install a hook that gets invoked when
the function returns again.

Currently, we only support this when building for ARM using GCC with
frame pointers, as in that case, the function is guaranteed to reload LR
from [FP, #-4] in all cases, and we can simply pass that address to the
ftrace code.

In order to support this for configurations that rely on the EABI
unwinder, such as Thumb2 builds, make the unwinder keep track of the
address from which LR was unwound, permitting ftrace to make use of this
in a subsequent patch.

Drop the call to is_kernel_text_address(), which is problematic in terms
of ftrace recursion, given that it may be instrumented itself. The call
is redundant anyway, as no unwind directives will be found unless the PC
points to memory that is known to contain executable code.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/include/asm/stacktrace.h | 2 ++
 arch/arm/kernel/Makefile          | 1 +
 arch/arm/kernel/unwind.c          | 7 ++++---
 3 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/arch/arm/include/asm/stacktrace.h b/arch/arm/include/asm/stacktrace.h
index d87d60532b86..e5ce7440cba8 100644
--- a/arch/arm/include/asm/stacktrace.h
+++ b/arch/arm/include/asm/stacktrace.h
@@ -27,6 +27,8 @@ struct stackframe {
 	 * executing from another stack.
 	 */
 	unsigned long sp_low;
+	/* address of the LR value on the stack */
+	unsigned long *lr_addr;
 #endif
 };
 
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index ae295a3bcfef..56511856ff9d 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -10,6 +10,7 @@ ifdef CONFIG_FUNCTION_TRACER
 CFLAGS_REMOVE_ftrace.o = -pg
 CFLAGS_REMOVE_insn.o = -pg
 CFLAGS_REMOVE_patch.o = -pg
+CFLAGS_REMOVE_unwind.o = -pg
 endif
 
 CFLAGS_REMOVE_return_address.o = -pg
diff --git a/arch/arm/kernel/unwind.c b/arch/arm/kernel/unwind.c
index c5ea328c428d..b4e468a7674b 100644
--- a/arch/arm/kernel/unwind.c
+++ b/arch/arm/kernel/unwind.c
@@ -55,6 +55,7 @@ struct unwind_ctrl_block {
 	const unsigned long *insn;	/* pointer to the current instructions word */
 	unsigned long sp_low;		/* lowest value of sp allowed */
 	unsigned long sp_high;		/* highest value of sp allowed */
+	unsigned long *lr_addr;		/* address of LR value on the stack */
 	/*
 	 * 1 : check for stack overflow for each register pop.
 	 * 0 : save overhead if there is plenty of stack remaining.
@@ -239,6 +240,8 @@ static int unwind_pop_register(struct unwind_ctrl_block *ctrl,
 	 * from being tracked by KASAN.
 	 */
 	ctrl->vrs[reg] = READ_ONCE_NOCHECK(*(*vsp));
+	if (reg == 14)
+		ctrl->lr_addr = *vsp;
 	(*vsp)++;
 	return URC_OK;
 }
@@ -395,9 +398,6 @@ int unwind_frame(struct stackframe *frame)
 	pr_debug("%s(pc = %08lx lr = %08lx sp = %08lx)\n", __func__,
 		 frame->pc, frame->lr, frame->sp);
 
-	if (!kernel_text_address(frame->pc))
-		return -URC_FAILURE;
-
 	idx = unwind_find_idx(frame->pc);
 	if (!idx) {
 		pr_warn("unwind: Index not found %08lx\n", frame->pc);
@@ -476,6 +476,7 @@ int unwind_frame(struct stackframe *frame)
 	frame->lr = ctrl.vrs[LR];
 	frame->pc = ctrl.vrs[PC];
 	frame->sp_low = ctrl.sp_low;
+	frame->lr_addr = ctrl.lr_addr;
 
 	return URC_OK;
 }
-- 
2.30.2



* [PATCH 8/8] ARM: ftrace: enable the graph tracer with the EABI unwinder
  2022-01-25 15:36 [PATCH 0/8] ARM: ftrace fixes and cleanups Ard Biesheuvel
                   ` (6 preceding siblings ...)
  2022-01-25 15:36 ` [PATCH 7/8] ARM: unwind: track location of LR value in stack frame Ard Biesheuvel
@ 2022-01-25 15:36 ` Ard Biesheuvel
  7 siblings, 0 replies; 16+ messages in thread
From: Ard Biesheuvel @ 2022-01-25 15:36 UTC (permalink / raw)
  To: linux, linux-arm-kernel
  Cc: Ard Biesheuvel, Arnd Bergmann, Linus Walleij, Nick Desaulniers

Enable the function graph tracer in combination with the EABI unwinder,
so that Thumb2 builds or Clang ARM builds can make use of it.

This involves using the unwinder to locate the return address of an
instrumented function on the stack, so that it can be overridden and
made to refer to the ftrace handling routines that need to be called at
function return.

Given that these builds do not guarantee that the value of the link
register is stored on the stack, fall back to the stack slot that the
ftrace exit code will use to restore LR in the instrumented function's
execution context.

Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
---
 arch/arm/Kconfig               |  2 +-
 arch/arm/Kconfig.debug         |  2 +-
 arch/arm/include/asm/ftrace.h  | 18 -------------
 arch/arm/kernel/Makefile       |  5 +---
 arch/arm/kernel/entry-ftrace.S | 28 ++++++++++++++------
 arch/arm/kernel/ftrace.c       | 19 +++++++++++--
 6 files changed, 40 insertions(+), 34 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index cbbe38f55088..e3c1a76d0a42 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -92,7 +92,7 @@ config ARM
 	select HAVE_EXIT_THREAD
 	select HAVE_FAST_GUP if ARM_LPAE
 	select HAVE_FTRACE_MCOUNT_RECORD if !XIP_KERNEL
-	select HAVE_FUNCTION_GRAPH_TRACER if !THUMB2_KERNEL && !CC_IS_CLANG
+	select HAVE_FUNCTION_GRAPH_TRACER
 	select HAVE_FUNCTION_TRACER if !XIP_KERNEL && !(THUMB2_KERNEL && CC_IS_CLANG)
 	select HAVE_GCC_PLUGINS
 	select HAVE_HW_BREAKPOINT if PERF_EVENTS && (CPU_V6 || CPU_V6K || CPU_V7)
diff --git a/arch/arm/Kconfig.debug b/arch/arm/Kconfig.debug
index 976315dea958..0c9497d549e3 100644
--- a/arch/arm/Kconfig.debug
+++ b/arch/arm/Kconfig.debug
@@ -65,7 +65,7 @@ config UNWINDER_FRAME_POINTER
 
 config UNWINDER_ARM
 	bool "ARM EABI stack unwinder"
-	depends on AEABI && !FUNCTION_GRAPH_TRACER
+	depends on AEABI
 	select ARM_UNWIND
 	help
 	  This option enables stack unwinding support in the kernel
diff --git a/arch/arm/include/asm/ftrace.h b/arch/arm/include/asm/ftrace.h
index b4f5fab6b04e..5358aad67831 100644
--- a/arch/arm/include/asm/ftrace.h
+++ b/arch/arm/include/asm/ftrace.h
@@ -35,26 +35,8 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
 
 #ifndef __ASSEMBLY__
 
-#if defined(CONFIG_FRAME_POINTER) && !defined(CONFIG_ARM_UNWIND)
-/*
- * return_address uses walk_stackframe to do it's work.  If both
- * CONFIG_FRAME_POINTER=y and CONFIG_ARM_UNWIND=y walk_stackframe uses unwind
- * information.  For this to work in the function tracer many functions would
- * have to be marked with __notrace.  So for now just depend on
- * !CONFIG_ARM_UNWIND.
- */
-
 void *return_address(unsigned int);
 
-#else
-
-static inline void *return_address(unsigned int level)
-{
-	return NULL;
-}
-
-#endif
-
 #define ftrace_return_address(n) return_address(n)
 
 #define ARCH_HAS_SYSCALL_MATCH_SYM_NAME
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index 56511856ff9d..5cebb8d5a1d6 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -25,10 +25,7 @@ obj-y		:= elf.o entry-common.o irq.o opcodes.o \
 KASAN_SANITIZE_stacktrace.o := n
 KASAN_SANITIZE_traps.o := n
 
-ifneq ($(CONFIG_ARM_UNWIND),y)
-obj-$(CONFIG_FRAME_POINTER)	+= return_address.o
-endif
-
+obj-y				+= return_address.o
 obj-$(CONFIG_ATAGS)		+= atags_parse.o
 obj-$(CONFIG_ATAGS_PROC)	+= atags_proc.o
 obj-$(CONFIG_DEPRECATED_PARAM_STRUCT) += atags_compat.o
diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
index bbfa0954c385..3e7bcaca5e07 100644
--- a/arch/arm/kernel/entry-ftrace.S
+++ b/arch/arm/kernel/entry-ftrace.S
@@ -100,7 +100,8 @@ ftrace_regs_call:
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	.globl ftrace_graph_regs_call
 ftrace_graph_regs_call:
-	mov	r0, r0
+ARM(	mov	r0, r0	)
+THUMB(	nop.w		)
 #endif
 
 	@ pop saved regs
@@ -112,13 +113,18 @@ ftrace_graph_regs_call:
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 .macro __ftrace_graph_regs_caller
 
-	sub     r0, fp, #4              @ lr of instrumented routine (parent)
+#ifdef CONFIG_UNWINDER_FRAME_POINTER
+	sub	r0, fp, #4		@ lr of instrumented routine (parent)
+#else
+	add	r0, sp, #S_LR
+#endif
 
 	@ called from __ftrace_regs_caller
-	ldr     r1, [sp, #S_PC]		@ instrumented routine (func)
+	ldr	r1, [sp, #S_PC]		@ instrumented routine (func)
 	mcount_adjust_addr	r1, r1
 
-	mov	r2, fp			@ frame pointer
+	mov	r2, fpreg		@ frame pointer
+	add	r3, sp, #PT_REGS_SIZE
 	bl	prepare_ftrace_return
 
 	@ pop registers saved in ftrace_regs_caller
@@ -149,14 +155,19 @@ ftrace_call\suffix:
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 	.globl ftrace_graph_call\suffix
 ftrace_graph_call\suffix:
-	mov	r0, r0
+ARM(	mov	r0, r0	)
+THUMB(	nop.w		)
 #endif
 
 	mcount_exit
 .endm
 
 .macro __ftrace_graph_caller
+#ifdef CONFIG_UNWINDER_FRAME_POINTER
 	sub	r0, fp, #4		@ &lr of instrumented routine (&parent)
+#else
+	add	r0, sp, #20
+#endif
 #ifdef CONFIG_DYNAMIC_FTRACE
 	@ called from __ftrace_caller, saved in mcount_enter
 	ldr	r1, [sp, #16]		@ instrumented routine (func)
@@ -165,7 +176,8 @@ ftrace_graph_call\suffix:
 	@ called from __mcount, untouched in lr
 	mcount_adjust_addr	r1, lr	@ instrumented routine (func)
 #endif
-	mov	r2, fp			@ frame pointer
+	mov	r2, fpreg		@ frame pointer
+	add	r3, sp, #24
 	bl	prepare_ftrace_return
 	mcount_exit
 .endm
@@ -244,14 +256,14 @@ ENDPROC(ftrace_graph_regs_caller)
 .purgem mcount_exit
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
-	.globl return_to_handler
-return_to_handler:
+ENTRY(return_to_handler)
 	stmdb	sp!, {r0-r3}
 	add	r0, sp, #16		@ sp at exit of instrumented routine
 	bl	ftrace_return_to_handler
 	mov	lr, r0			@ r0 has real ret addr
 	ldmia	sp!, {r0-r3}
 	ret	lr
+ENDPROC(return_to_handler)
 #endif
 
 ENTRY(ftrace_stub)
diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
index 988525b274e3..65aa9e1eface 100644
--- a/arch/arm/kernel/ftrace.c
+++ b/arch/arm/kernel/ftrace.c
@@ -22,6 +22,7 @@
 #include <asm/ftrace.h>
 #include <asm/insn.h>
 #include <asm/set_memory.h>
+#include <asm/stacktrace.h>
 #include <asm/patch.h>
 
 #ifdef CONFIG_THUMB2_KERNEL
@@ -211,8 +212,10 @@ int ftrace_make_nop(struct module *mod,
 #endif /* CONFIG_DYNAMIC_FTRACE */
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
+asmlinkage
 void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
-			   unsigned long frame_pointer)
+			   unsigned long frame_pointer,
+			   unsigned long stack_pointer)
 {
 	unsigned long return_hooker = (unsigned long) &return_to_handler;
 	unsigned long old;
@@ -223,6 +226,18 @@ void prepare_ftrace_return(unsigned long *parent, unsigned long self_addr,
 	if (IS_ENABLED(CONFIG_UNWINDER_FRAME_POINTER)) {
 		/* FP points one word below parent's top of stack */
 		frame_pointer += 4;
+	} else {
+		struct stackframe frame = {
+			.fp = frame_pointer,
+			.sp = stack_pointer,
+			.lr = self_addr,
+			.pc = self_addr,
+		};
+		if (unwind_frame(&frame) < 0)
+			return;
+		if (frame.lr != self_addr)
+			parent = frame.lr_addr;
+		frame_pointer = frame.sp;
 	}
 
 	old = *parent;
@@ -245,7 +260,7 @@ static int __ftrace_modify_caller(unsigned long *callsite,
 	unsigned long caller_fn = (unsigned long) func;
 	unsigned long pc = (unsigned long) callsite;
 	unsigned long branch = arm_gen_branch(pc, caller_fn);
-	unsigned long nop = 0xe1a00000;	/* mov r0, r0 */
+	unsigned long nop = arm_gen_nop();
 	unsigned long old = enable ? nop : branch;
 	unsigned long new = enable ? branch : nop;
 
-- 
2.30.2


_______________________________________________
linux-arm-kernel mailing list
linux-arm-kernel@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply related	[flat|nested] 16+ messages in thread
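As an aside for readers unfamiliar with the graph tracer's return hooking: the prepare_ftrace_return()/return_to_handler pair in the hunks above can be modeled in a few lines. This is a conceptual sketch with made-up addresses, not kernel code; the real implementation keeps the saved addresses on a per-task return stack and performs the frame-pointer/unwinder checks shown in the diff.

```python
# Conceptual model of graph-tracer return hijacking:
# prepare_ftrace_return() overwrites the saved LR slot with
# return_to_handler, remembering the real return address;
# return_to_handler later pops it to resume the real caller.

ret_stack = []  # stands in for the per-task ftrace return stack

def prepare_ftrace_return(frame, return_hooker):
    # frame['saved_lr'] models *parent: the stacked LR of the
    # instrumented routine
    ret_stack.append(frame['saved_lr'])   # remember real return address
    frame['saved_lr'] = return_hooker     # divert the return

def return_to_handler():
    return ret_stack.pop()                # real return address restored

frame = {'saved_lr': 0x8000_1234}
prepare_ftrace_return(frame, return_hooker=0xC000_0000)
assert frame['saved_lr'] == 0xC000_0000   # function now returns to hooker
assert return_to_handler() == 0x8000_1234 # which resumes the real caller
```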

* Re: [PATCH 1/8] ARM: ftrace: ensure that ADR take Thumb bit into account
  2022-01-25 15:36 ` [PATCH 1/8] ARM: ftrace: ensure that ADR take Thumb bit into account Ard Biesheuvel
@ 2022-01-25 19:14   ` Nick Desaulniers
  2022-02-02 23:53   ` Linus Walleij
  1 sibling, 0 replies; 16+ messages in thread
From: Nick Desaulniers @ 2022-01-25 19:14 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: linux, linux-arm-kernel, Arnd Bergmann, Linus Walleij

On Tue, Jan 25, 2022 at 7:37 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> Using ADR to take the address of 'ftrace_stub' via a local label
> produces an address that has the Thumb bit cleared, which means the
> subsequent comparison is guaranteed to fail. Instead, use the badr
> macro, which forces the Thumb bit to be set.
>
> Fixes: a3ba87a61499 ("ARM: 6316/1: ftrace: add Thumb-2 support")
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

> ---
>  arch/arm/kernel/entry-ftrace.S | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
> index a74289ebc803..f4886fb6e9ba 100644
> --- a/arch/arm/kernel/entry-ftrace.S
> +++ b/arch/arm/kernel/entry-ftrace.S
> @@ -40,7 +40,7 @@
>         mcount_enter
>         ldr     r0, =ftrace_trace_function
>         ldr     r2, [r0]
> -       adr     r0, .Lftrace_stub
> +       badr    r0, .Lftrace_stub
>         cmp     r0, r2
>         bne     1f
>
> --
> 2.30.2
>


-- 
Thanks,
~Nick Desaulniers
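For readers not steeped in ARM assembly, the one-character fix above can be illustrated with a small model. The addresses are hypothetical, and badr's behavior is simplified to setting bit 0 (the Thumb interworking bit) on the target address:

```python
# Why 'adr' breaks the ftrace_stub comparison on Thumb-2: a Thumb
# function pointer has bit 0 set, but a plain PC-relative ADR yields
# the raw (even) address, so the equality test against
# ftrace_trace_function can never succeed. The badr macro ORs in
# the Thumb bit.

STUB = 0x8000                    # raw address of .Lftrace_stub (made up)

def adr(addr):
    return addr                  # ADR: raw address, Thumb bit clear

def badr(addr, thumb=True):
    return addr | 1 if thumb else addr   # badr: sets bit 0 for Thumb

ftrace_trace_function = STUB | 1         # function pointer, Thumb bit set

assert adr(STUB) != ftrace_trace_function    # guaranteed mismatch
assert badr(STUB) == ftrace_trace_function   # comparison now succeeds
```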


* Re: [PATCH 2/8] ARM: ftrace: use ADD not POP to counter PUSH at entry
  2022-01-25 15:36 ` [PATCH 2/8] ARM: ftrace: use ADD not POP to counter PUSH at entry Ard Biesheuvel
@ 2022-01-25 19:23   ` Nick Desaulniers
  2022-02-02 23:59   ` Linus Walleij
  1 sibling, 0 replies; 16+ messages in thread
From: Nick Desaulniers @ 2022-01-25 19:23 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: linux, linux-arm-kernel, Arnd Bergmann, Linus Walleij

On Tue, Jan 25, 2022 at 7:37 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> The compiler emitted hook used for ftrace consists of a PUSH {LR} to
> preserve the link register, followed by a branch-and-link (BL) to
> __gnu_mcount_nc. Dynamic ftrace patches away the latter to turn the
> combined sequence into a NOP, using a POP {LR} instruction.
>
> This is not necessary, since the link register does not get clobbered in
> this case, and simply adding #4 to the stack pointer is sufficient, and
> avoids a memory access that may take a few cycles to resolve depending
> on the micro-architecture.

Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm/kernel/entry-ftrace.S | 2 +-
>  arch/arm/kernel/ftrace.c       | 4 ++--
>  2 files changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
> index f4886fb6e9ba..dca12a09322a 100644
> --- a/arch/arm/kernel/entry-ftrace.S
> +++ b/arch/arm/kernel/entry-ftrace.S
> @@ -27,7 +27,7 @@
>   * allows it to be clobbered in subroutines and doesn't use it to hold
>   * parameters.)
>   *
> - * When using dynamic ftrace, we patch out the mcount call by a "pop {lr}"
> + * When using dynamic ftrace, we patch out the mcount call by a "add sp, #4"
>   * instead of the __gnu_mcount_nc call (see arch/arm/kernel/ftrace.c).
>   */
>
> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
> index a006585e1c09..db72d3a6522d 100644
> --- a/arch/arm/kernel/ftrace.c
> +++ b/arch/arm/kernel/ftrace.c
> @@ -25,9 +25,9 @@
>  #include <asm/patch.h>
>
>  #ifdef CONFIG_THUMB2_KERNEL
> -#define        NOP             0xf85deb04      /* pop.w {lr} */
> +#define        NOP             0xf10d0d04      /* add.w sp, sp, #4 */
>  #else
> -#define        NOP             0xe8bd4000      /* pop {lr} */
> +#define        NOP             0xe28dd004      /* add   sp, sp, #4 */
>  #endif
>
>  #ifdef CONFIG_DYNAMIC_FTRACE
> --
> 2.30.2
>


-- 
Thanks,
~Nick Desaulniers
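The two NOP encodings in the hunk above can be sanity-checked by decoding them against the A32 instruction formats. A quick sketch follows; the field layout is stated from memory of the Arm ARM and worth double-checking against the manual:

```python
# 0xe8bd4000 is LDMIA sp!, {lr} ("pop {lr}"), a memory access;
# 0xe28dd004 is ADD sp, sp, #4, which only adjusts SP.

def decode_dp_imm(insn):
    # A32 data-processing (immediate): cond | 001 | opc | S | Rn | Rd | imm12
    assert (insn >> 25) & 0b111 == 0b001
    return {'opc': (insn >> 21) & 0xF, 'rn': (insn >> 16) & 0xF,
            'rd': (insn >> 12) & 0xF, 'imm': insn & 0xFFF}

add = decode_dp_imm(0xE28DD004)
assert add == {'opc': 0b0100, 'rn': 13, 'rd': 13, 'imm': 4}  # ADD sp, sp, #4

pop = 0xE8BD4000
assert (pop >> 25) & 0b111 == 0b100    # LDM/STM class => touches memory
assert pop & (1 << 14)                 # register list contains LR
```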


* Re: [PATCH 3/8] ARM: ftrace: use trampolines to keep .init.text in branching range
  2022-01-25 15:36 ` [PATCH 3/8] ARM: ftrace: use trampolines to keep .init.text in branching range Ard Biesheuvel
@ 2022-01-25 20:20   ` Nick Desaulniers
  2022-02-03  0:12   ` Linus Walleij
  1 sibling, 0 replies; 16+ messages in thread
From: Nick Desaulniers @ 2022-01-25 20:20 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: linux, linux-arm-kernel, Arnd Bergmann, Linus Walleij

On Tue, Jan 25, 2022 at 7:37 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> Kernel images that are large in comparison to the range of a direct
> branch may fail to work as expected with ftrace, as patching a direct
> branch to one of the core ftrace routines may not be possible from the
> .init.text section, if it is emitted too far away from the normal .text
> section.
>
> This is more likely to affect Thumb2 builds, given that its range is
> only -/+ 16 MiB (as opposed to ARM which has -/+ 32 MiB), but may occur
> in either ISA.
>
> To work around this, add a couple of trampolines to .init.text and
> swap these in when the ftrace patching code is operating on callers in
> .init.text.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
> ---
>  arch/arm/kernel/entry-ftrace.S | 16 ++++++++++++++++
>  arch/arm/kernel/ftrace.c       | 19 +++++++++++++++++--
>  2 files changed, 33 insertions(+), 2 deletions(-)
>
> diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
> index dca12a09322a..237d435e29aa 100644
> --- a/arch/arm/kernel/entry-ftrace.S
> +++ b/arch/arm/kernel/entry-ftrace.S
> @@ -270,3 +270,19 @@ ENTRY(ftrace_stub)
>  .Lftrace_stub:
>         ret     lr
>  ENDPROC(ftrace_stub)
> +
> +#ifdef CONFIG_DYNAMIC_FTRACE
> +
> +       __INIT
> +
> +       .macro  init_tramp, dst:req
> +ENTRY(\dst\()_from_init)
> +       ldr     pc, =\dst
> +ENDPROC(\dst\()_from_init)
> +       .endm
> +
> +       init_tramp      ftrace_caller
> +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> +       init_tramp      ftrace_regs_caller
> +#endif

Consider adding __FINIT here; while it's not necessary now, folks who
append new code to this file might be surprised to find new code in
.init.text.  I would also tighten up the __INIT/__FINIT to only
surround the invocations of init_tramp and not the macro definition
itself.

Either way,
Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

> +#endif
> diff --git a/arch/arm/kernel/ftrace.c b/arch/arm/kernel/ftrace.c
> index db72d3a6522d..d2326794fd09 100644
> --- a/arch/arm/kernel/ftrace.c
> +++ b/arch/arm/kernel/ftrace.c
> @@ -51,9 +51,18 @@ static unsigned long ftrace_nop_replace(struct dyn_ftrace *rec)
>         return NOP;
>  }
>
> +void ftrace_caller_from_init(void);
> +void ftrace_regs_caller_from_init(void);
> +
>  static unsigned long adjust_address(struct dyn_ftrace *rec, unsigned long addr)
>  {
> -       return addr;
> +       if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE) ||
> +           likely(!is_kernel_inittext(rec->ip)))
> +               return addr;
> +       if (!IS_ENABLED(CONFIG_DYNAMIC_FTRACE_WITH_REGS) ||
> +           addr == (unsigned long)&ftrace_caller)
> +               return (unsigned long)&ftrace_caller_from_init;
> +       return (unsigned long)&ftrace_regs_caller_from_init;
>  }
>
>  int ftrace_arch_code_modify_prepare(void)
> @@ -189,7 +198,13 @@ int ftrace_make_nop(struct module *mod,
>  #endif
>
>         new = ftrace_nop_replace(rec);
> -       ret = ftrace_modify_code(ip, old, new, true);
> +       /*
> +        * Locations in .init.text may call __gnu_mcount_nc via a linker
> +        * emitted veneer if they are too far away from its implementation, and
> +        * so validation may fail spuriously in such cases. Let's work around
> +        * this by omitting those from validation.
> +        */
> +       ret = ftrace_modify_code(ip, old, new, !is_kernel_inittext(ip));
>
>         return ret;
>  }
> --
> 2.30.2
>


-- 
Thanks,
~Nick Desaulniers
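The branch-range constraint the patch works around can be sketched numerically. The addresses below are purely illustrative; the real range checks live in the linker and in arm_gen_branch():

```python
# A direct BL reaches roughly +/-32 MiB in ARM state and +/-16 MiB in
# Thumb-2, so a call site in .init.text placed farther from the ftrace
# entry points than that needs a veneer/trampoline.

def in_bl_range(caller, callee, thumb2):
    limit = (16 if thumb2 else 32) << 20        # MiB to bytes
    return -limit <= callee - caller < limit

TEXT = 0x8010_0000                  # hypothetical ftrace_caller address
FAR_INIT = TEXT + (20 << 20)        # hypothetical .init.text 20 MiB away

assert in_bl_range(FAR_INIT, TEXT, thumb2=False)      # ARM still reaches
assert not in_bl_range(FAR_INIT, TEXT, thumb2=True)   # Thumb-2 does not
```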


* Re: [PATCH 5/8] ARM: ftrace: avoid unnecessary literal loads
  2022-01-25 15:36 ` [PATCH 5/8] ARM: ftrace: avoid unnecessary literal loads Ard Biesheuvel
@ 2022-01-25 20:27   ` Nick Desaulniers
  0 siblings, 0 replies; 16+ messages in thread
From: Nick Desaulniers @ 2022-01-25 20:27 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: linux, linux-arm-kernel, Arnd Bergmann, Linus Walleij

On Tue, Jan 25, 2022 at 7:37 AM Ard Biesheuvel <ardb@kernel.org> wrote:
>
> Avoid explicit literal loads and instead, use accessor macros that
> generate the optimal sequence depending on the architecture revision
> being targeted.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

Note to fellow reviewers, Ard pointed out to me that the definition of
ldr_va depends on his vmapped stack series:
https://git.kernel.org/pub/scm/linux/kernel/git/ardb/linux.git/commit/?h=arm-up-ti-in-task-v4&id=4e918ab13eaf40f19938659cb5a22c93172778a8
("ARM: assembler: add optimized ldr/str macros to load variables from memory")

Reviewed-by: Nick Desaulniers <ndesaulniers@google.com>

> ---
>  arch/arm/kernel/entry-ftrace.S | 27 ++++++++------------
>  1 file changed, 11 insertions(+), 16 deletions(-)
>
> diff --git a/arch/arm/kernel/entry-ftrace.S b/arch/arm/kernel/entry-ftrace.S
> index 67548c38a567..99720064a4c5 100644
> --- a/arch/arm/kernel/entry-ftrace.S
> +++ b/arch/arm/kernel/entry-ftrace.S
> @@ -35,23 +35,20 @@
>
>  .macro __mcount suffix
>         mcount_enter
> -       ldr     r0, =ftrace_trace_function
> -       ldr     r2, [r0]
> +       ldr_va  r2, ftrace_trace_function
>         badr    r0, .Lftrace_stub
>         cmp     r0, r2
>         bne     1f
>
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -       ldr     r1, =ftrace_graph_return
> -       ldr     r2, [r1]
> -       cmp     r0, r2
> -       bne     ftrace_graph_caller\suffix
> -
> -       ldr     r1, =ftrace_graph_entry
> -       ldr     r2, [r1]
> -       ldr     r0, =ftrace_graph_entry_stub
> -       cmp     r0, r2
> -       bne     ftrace_graph_caller\suffix
> +       ldr_va  r2, ftrace_graph_return
> +       cmp     r0, r2
> +       bne     ftrace_graph_caller\suffix
> +
> +       ldr_va  r2, ftrace_graph_entry
> +       mov_l   r0, ftrace_graph_entry_stub
> +       cmp     r0, r2
> +       bne     ftrace_graph_caller\suffix
>  #endif
>
>         mcount_exit
> @@ -87,8 +84,7 @@
>
>         mov     r3, sp                          @ struct pt_regs*
>
> -       ldr r2, =function_trace_op
> -       ldr r2, [r2]                            @ pointer to the current
> +       ldr_va  r2, function_trace_op           @ pointer to the current
>                                                 @ function tracing op
>
>         ldr     r1, [sp, #S_LR]                 @ lr of instrumented func
> @@ -141,8 +137,7 @@ ftrace_graph_regs_call:
>         mcount_adjust_addr      r0, lr          @ instrumented function
>
>  #ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS
> -       ldr r2, =function_trace_op
> -       ldr r2, [r2]                            @ pointer to the current
> +       ldr_va  r2, function_trace_op           @ pointer to the current
>                                                 @ function tracing op
>         mov r3, #0                              @ regs is NULL
>  #endif
> --
> 2.30.2
>


-- 
Thanks,
~Nick Desaulniers
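For context on what the accessor macros buy, here is a simplified model of the two constant-loading strategies. It is only a sketch: the real ldr_va/mov_l macros also pick the optimal sequence per architecture revision, which is not modeled here.

```python
# 'ldr rN, =sym' stores the 32-bit address in a literal pool and loads
# it from memory; on ARMv7, MOVW/MOVT rebuild the same constant from
# two 16-bit immediates, with no data access at all.

def movw_movt(value):
    movw_imm = value & 0xFFFF           # MOVW: writes low half, zeroes top
    movt_imm = (value >> 16) & 0xFFFF   # MOVT: writes high half
    return (movt_imm << 16) | movw_imm

def ldr_literal(pool, offset):
    return pool[offset]                 # literal load: extra memory access

addr = 0xC0DE_1000                      # hypothetical symbol address
assert movw_movt(addr) == addr          # same value, immediates only
assert ldr_literal({0: addr}, 0) == addr
```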


* Re: [PATCH 1/8] ARM: ftrace: ensure that ADR take Thumb bit into account
  2022-01-25 15:36 ` [PATCH 1/8] ARM: ftrace: ensure that ADR take Thumb bit into account Ard Biesheuvel
  2022-01-25 19:14   ` Nick Desaulniers
@ 2022-02-02 23:53   ` Linus Walleij
  1 sibling, 0 replies; 16+ messages in thread
From: Linus Walleij @ 2022-02-02 23:53 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: linux, linux-arm-kernel, Arnd Bergmann, Nick Desaulniers

On Tue, Jan 25, 2022 at 4:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:

> Using ADR to take the address of 'ftrace_stub' via a local label
> produces an address that has the Thumb bit cleared, which means the
> subsequent comparison is guaranteed to fail. Instead, use the badr
> macro, which forces the Thumb bit to be set.
>
> Fixes: a3ba87a61499 ("ARM: 6316/1: ftrace: add Thumb-2 support")
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

Looks correct to me!
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Yours,
Linus Walleij


* Re: [PATCH 2/8] ARM: ftrace: use ADD not POP to counter PUSH at entry
  2022-01-25 15:36 ` [PATCH 2/8] ARM: ftrace: use ADD not POP to counter PUSH at entry Ard Biesheuvel
  2022-01-25 19:23   ` Nick Desaulniers
@ 2022-02-02 23:59   ` Linus Walleij
  1 sibling, 0 replies; 16+ messages in thread
From: Linus Walleij @ 2022-02-02 23:59 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: linux, linux-arm-kernel, Arnd Bergmann, Nick Desaulniers

On Tue, Jan 25, 2022 at 4:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:

> The compiler emitted hook used for ftrace consists of a PUSH {LR} to
> preserve the link register, followed by a branch-and-link (BL) to
> __gnu_mcount_nc. Dynamic ftrace patches away the latter to turn the
> combined sequence into a NOP, using a POP {LR} instruction.
>
> This is not necessary, since the link register does not get clobbered in
> this case, and simply adding #4 to the stack pointer is sufficient, and
> avoids a memory access that may take a few cycles to resolve depending
> on the micro-architecture.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

That's on the hotpath for ftrace so definitely worth optimizing!
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

Maybe some of the commit message should go into
the code comment as well? "The compiler hook pushed 4 bytes with
a PUSH{LR}, so discard them by simply adding 4 to the
stack pointer" or so? Just a suggestion.

Yours,
Linus Walleij


* Re: [PATCH 3/8] ARM: ftrace: use trampolines to keep .init.text in branching range
  2022-01-25 15:36 ` [PATCH 3/8] ARM: ftrace: use trampolines to keep .init.text in branching range Ard Biesheuvel
  2022-01-25 20:20   ` Nick Desaulniers
@ 2022-02-03  0:12   ` Linus Walleij
  1 sibling, 0 replies; 16+ messages in thread
From: Linus Walleij @ 2022-02-03  0:12 UTC (permalink / raw)
  To: Ard Biesheuvel; +Cc: linux, linux-arm-kernel, Arnd Bergmann, Nick Desaulniers

On Tue, Jan 25, 2022 at 4:37 PM Ard Biesheuvel <ardb@kernel.org> wrote:

> Kernel images that are large in comparison to the range of a direct
> branch may fail to work as expected with ftrace, as patching a direct
> branch to one of the core ftrace routines may not be possible from the
> .init.text section, if it is emitted too far away from the normal .text
> section.
>
> This is more likely to affect Thumb2 builds, given that its range is
> only -/+ 16 MiB (as opposed to ARM which has -/+ 32 MiB), but may occur
> in either ISA.
>
> To work around this, add a couple of trampolines to .init.text and
> swap these in when the ftrace patching code is operating on callers in
> .init.text.
>
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

LGTM
Reviewed-by: Linus Walleij <linus.walleij@linaro.org>

I suppose part of me wonders what happens if .text itself grows
larger than 16 MiB under Thumb2, but I guess then the linker
will just fail.

Yours,
Linus Walleij


end of thread, other threads:[~2022-02-03  0:14 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-01-25 15:36 [PATCH 0/8] ARM: ftrace fixes and cleanups Ard Biesheuvel
2022-01-25 15:36 ` [PATCH 1/8] ARM: ftrace: ensure that ADR take Thumb bit into account Ard Biesheuvel
2022-01-25 19:14   ` Nick Desaulniers
2022-02-02 23:53   ` Linus Walleij
2022-01-25 15:36 ` [PATCH 2/8] ARM: ftrace: use ADD not POP to counter PUSH at entry Ard Biesheuvel
2022-01-25 19:23   ` Nick Desaulniers
2022-02-02 23:59   ` Linus Walleij
2022-01-25 15:36 ` [PATCH 3/8] ARM: ftrace: use trampolines to keep .init.text in branching range Ard Biesheuvel
2022-01-25 20:20   ` Nick Desaulniers
2022-02-03  0:12   ` Linus Walleij
2022-01-25 15:36 ` [PATCH 4/8] ARM: ftrace: avoid redundant loads or clobbering IP Ard Biesheuvel
2022-01-25 15:36 ` [PATCH 5/8] ARM: ftrace: avoid unnecessary literal loads Ard Biesheuvel
2022-01-25 20:27   ` Nick Desaulniers
2022-01-25 15:36 ` [PATCH 6/8] ARM: ftrace: enable HAVE_FUNCTION_GRAPH_FP_TEST Ard Biesheuvel
2022-01-25 15:36 ` [PATCH 7/8] ARM: unwind: track location of LR value in stack frame Ard Biesheuvel
2022-01-25 15:36 ` [PATCH 8/8] ARM: ftrace: enable the graph tracer with the EABI unwinder Ard Biesheuvel
