bpf.vger.kernel.org archive mirror
* [PATCH v6 0/5] Add ftrace direct call for arm64
@ 2023-04-05 18:02 Florent Revest
  2023-04-05 18:02 ` [PATCH v6 1/5] arm64: ftrace: Add direct call support Florent Revest
                   ` (6 more replies)
  0 siblings, 7 replies; 17+ messages in thread
From: Florent Revest @ 2023-04-05 18:02 UTC
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

This series adds ftrace direct call support to arm64.
This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.

It is meant to be taken by the arm64 tree but it depends on the
trace-direct-v6.3-rc3 tag of the linux-trace tree:
  git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
That tag was created by Steven Rostedt so the arm64 tree can pull the prior work
this depends on. [1]

Thanks to the ftrace refactoring under that tag, an ftrace_ops backing an ftrace
direct call will only ever point to *one* direct call. This means we can look up
the direct called trampoline address stored in the ops from the ftrace_caller
trampoline in the case when the destination would be out of reach of a BL
instruction at the ftrace callsite. This fixes limitations of previous attempts
such as [2].
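
As a rough illustration of the "out of reach of a BL instruction" condition
above, here is a minimal userspace sketch of the range check. It mirrors the
reachable_by_bl() helper added in patch 1, but SZ_128M is defined locally
here and the fixed-width integer types are an assumption made so that the
snippet builds outside the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local stand-in for the kernel's SZ_128M (assumption for userspace). */
#define SZ_128M INT64_C(0x08000000)

/*
 * A BL instruction encodes a signed 26-bit word offset, so it can reach
 * targets in [pc - 128M, pc + 128M). Anything outside that window has to
 * go through the ftrace_caller trampoline instead.
 */
static bool reachable_by_bl(uint64_t addr, uint64_t pc)
{
	int64_t offset = (int64_t)(addr - pc);

	return offset >= -SZ_128M && offset < SZ_128M;
}
```

With this check, a trampoline located exactly 128M above the callsite is
already considered unreachable, while one located 128M below it is still
reachable.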

This series has been tested on arm64 with:
1- CONFIG_FTRACE_SELFTEST
2- samples/ftrace/*.ko (cf: patch 4)
3- tools/testing/selftests/bpf/test_progs (cf: patch 5)

Changes since v5 [3]:
- Fixed saving the fourth argument of handle_mm_fault in both the x86 (patch 3)
  and arm64 (as part of patch 4) "ftrace-direct-too" sample trampolines
- Fixed the address of the traced function logged by some direct call samples
  (ftrace-direct-multi and ftrace-direct-multi-modify) by moving lr into x0

[1]: https://lore.kernel.org/all/ZB2Nl7fzpHoq5V20@FVFF77S0Q05N/
[2]: https://lore.kernel.org/all/20220913162732.163631-1-xukuohai@huaweicloud.com/
[3]: https://lore.kernel.org/bpf/20230403113552.2857693-1-revest@chromium.org/

Florent Revest (5):
  arm64: ftrace: Add direct call support
  arm64: ftrace: Simplify get_ftrace_plt
  samples: ftrace: Save required argument registers in sample
    trampolines
  arm64: ftrace: Add direct call trampoline samples support
  selftests/bpf: Update the tests deny list on aarch64

 arch/arm64/Kconfig                           |  6 ++
 arch/arm64/include/asm/ftrace.h              | 22 +++++
 arch/arm64/kernel/asm-offsets.c              |  6 ++
 arch/arm64/kernel/entry-ftrace.S             | 90 ++++++++++++++++----
 arch/arm64/kernel/ftrace.c                   | 46 +++++++---
 samples/ftrace/ftrace-direct-modify.c        | 34 ++++++++
 samples/ftrace/ftrace-direct-multi-modify.c  | 40 +++++++++
 samples/ftrace/ftrace-direct-multi.c         | 24 ++++++
 samples/ftrace/ftrace-direct-too.c           | 40 +++++++--
 samples/ftrace/ftrace-direct.c               | 24 ++++++
 tools/testing/selftests/bpf/DENYLIST.aarch64 | 82 ++----------------
 11 files changed, 306 insertions(+), 108 deletions(-)

-- 
2.40.0.577.gac1e443424-goog



* [PATCH v6 1/5] arm64: ftrace: Add direct call support
  2023-04-05 18:02 [PATCH v6 0/5] Add ftrace direct call for arm64 Florent Revest
@ 2023-04-05 18:02 ` Florent Revest
  2023-04-05 18:02 ` [PATCH v6 2/5] arm64: ftrace: Simplify get_ftrace_plt Florent Revest
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Florent Revest @ 2023-04-05 18:02 UTC
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

This builds on the CALL_OPS work which extends the ftrace patchsite
on arm64 with an ops pointer usable by the ftrace trampoline.

This ops pointer is valid at all times. Indeed, it is either pointing to
ftrace_list_ops or to the single ops which should be called from that
patchsite.
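
For readers unfamiliar with how the trampoline finds that ops pointer, the
following is a small userspace sketch of the address arithmetic described in
the entry-ftrace.S comment of this patch (literal 8-byte aligned, LR at
`literal + 16` or `literal + 20`). The function name ops_literal_addr() is
hypothetical and only exists for illustration; AARCH64_INSN_SIZE is defined
locally.

```c
#include <stdint.h>

/* Local stand-in for the kernel constant (assumption for userspace). */
#define AARCH64_INSN_SIZE 4

/*
 * Hypothetical helper: given the LR seen on entry to ftrace_caller,
 * return the address of the ops literal stored in the patchsite.
 */
static uint64_t ops_literal_addr(uint64_t lr)
{
	/* Equivalent of: bic x11, x30, 0x7 */
	uint64_t aligned = lr & ~UINT64_C(0x7);

	/* Equivalent of the offset folded into: ldr x11, [x11, #-16] */
	return aligned - 4 * AARCH64_INSN_SIZE;
}
```

Aligning first makes both layouts collapse to `literal + 16`, which is why a
single fixed offset works in either case.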

There are a few cases to distinguish:
- If a direct call ops is the only one tracing a function:
  - If the direct called trampoline is within the reach of a BL
    instruction
     -> the ftrace patchsite jumps to the trampoline
  - Else
     -> the ftrace patchsite jumps to the ftrace_caller trampoline which
        reads the ops pointer in the patchsite and jumps to the direct
        call address stored in the ops
- Else
  -> the ftrace patchsite jumps to the ftrace_caller trampoline and its
     ops literal points to ftrace_list_ops so it iterates over all
     registered ftrace ops, including the direct call ops and calls its
     call_direct_funcs handler which stores the direct called
     trampoline's address in the ftrace_regs and the ftrace_caller
     trampoline will return to that address instead of returning to the
     traced function
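
The cases above can be summarised as a small decision function. This is only
a model of the description in this commit message, not kernel code: the enum,
the function name pick_path() and its two boolean inputs are all hypothetical.

```c
#include <stdbool.h>

/* Hypothetical labels for the three paths listed above. */
enum patchsite_path {
	/* Patchsite BL goes straight to the direct called trampoline. */
	PATH_BL_TO_DIRECT_TRAMP,
	/* Patchsite BL goes to ftrace_caller, which jumps to ops->direct_call. */
	PATH_FTRACE_CALLER_OPS_DIRECT,
	/* Patchsite BL goes to ftrace_caller with ops = ftrace_list_ops. */
	PATH_FTRACE_CALLER_LIST_OPS,
};

/*
 * sole_direct_ops:   a direct call ops is the only one tracing the function
 * tramp_in_bl_range: the direct called trampoline is within reach of a BL
 */
static enum patchsite_path pick_path(bool sole_direct_ops,
				     bool tramp_in_bl_range)
{
	if (!sole_direct_ops)
		return PATH_FTRACE_CALLER_LIST_OPS;

	return tramp_in_bl_range ? PATH_BL_TO_DIRECT_TRAMP
				 : PATH_FTRACE_CALLER_OPS_DIRECT;
}
```

Note that when several ops trace the same function, the BL range of the
direct trampoline no longer matters, since the patchsite always branches to
ftrace_caller.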

Signed-off-by: Florent Revest <revest@chromium.org>
Co-developed-by: Mark Rutland <mark.rutland@arm.com>
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
---
 arch/arm64/Kconfig               |  4 ++
 arch/arm64/include/asm/ftrace.h  | 22 ++++++++
 arch/arm64/kernel/asm-offsets.c  |  6 +++
 arch/arm64/kernel/entry-ftrace.S | 90 ++++++++++++++++++++++++++------
 arch/arm64/kernel/ftrace.c       | 36 +++++++++++--
 5 files changed, 138 insertions(+), 20 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1023e896d46b..f3503d0cc1b8 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -185,6 +185,10 @@ config ARM64
 	select HAVE_DEBUG_KMEMLEAK
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
+	select HAVE_DYNAMIC_FTRACE_WITH_ARGS \
+		if $(cc-option,-fpatchable-function-entry=2)
+	select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS \
+		if DYNAMIC_FTRACE_WITH_ARGS && DYNAMIC_FTRACE_WITH_CALL_OPS
 	select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
 		if (DYNAMIC_FTRACE_WITH_ARGS && !CFI_CLANG && \
 		    !CC_OPTIMIZE_FOR_SIZE)
diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index 1c2672bbbf37..b87d70b693c6 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -70,10 +70,19 @@ struct ftrace_ops;
 
 #define arch_ftrace_get_regs(regs) NULL
 
+/*
+ * Note: sizeof(struct ftrace_regs) must be a multiple of 16 to ensure correct
+ * stack alignment
+ */
 struct ftrace_regs {
 	/* x0 - x8 */
 	unsigned long regs[9];
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	unsigned long direct_tramp;
+#else
 	unsigned long __unused;
+#endif
 
 	unsigned long fp;
 	unsigned long lr;
@@ -136,6 +145,19 @@ int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
 void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
 		       struct ftrace_ops *op, struct ftrace_regs *fregs);
 #define ftrace_graph_func ftrace_graph_func
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+static inline void arch_ftrace_set_direct_caller(struct ftrace_regs *fregs,
+						 unsigned long addr)
+{
+	/*
+	 * The ftrace trampoline will return to this address instead of the
+	 * instrumented function.
+	 */
+	fregs->direct_tramp = addr;
+}
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
+
 #endif
 
 #define ftrace_return_address(n) return_address(n)
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index ae345b06e9f7..0996094b0d22 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -93,6 +93,9 @@ int main(void)
   DEFINE(FREGS_LR,		offsetof(struct ftrace_regs, lr));
   DEFINE(FREGS_SP,		offsetof(struct ftrace_regs, sp));
   DEFINE(FREGS_PC,		offsetof(struct ftrace_regs, pc));
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+  DEFINE(FREGS_DIRECT_TRAMP,	offsetof(struct ftrace_regs, direct_tramp));
+#endif
   DEFINE(FREGS_SIZE,		sizeof(struct ftrace_regs));
   BLANK();
 #endif
@@ -197,6 +200,9 @@ int main(void)
 #endif
 #ifdef CONFIG_FUNCTION_TRACER
   DEFINE(FTRACE_OPS_FUNC,		offsetof(struct ftrace_ops, func));
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+  DEFINE(FTRACE_OPS_DIRECT_CALL,	offsetof(struct ftrace_ops, direct_call));
+#endif
 #endif
   return 0;
 }
diff --git a/arch/arm64/kernel/entry-ftrace.S b/arch/arm64/kernel/entry-ftrace.S
index 350ed81324ac..1c38a60575aa 100644
--- a/arch/arm64/kernel/entry-ftrace.S
+++ b/arch/arm64/kernel/entry-ftrace.S
@@ -36,6 +36,31 @@
 SYM_CODE_START(ftrace_caller)
 	bti	c
 
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
+	/*
+	 * The literal pointer to the ops is at an 8-byte aligned boundary
+	 * which is either 12 or 16 bytes before the BL instruction in the call
+	 * site. See ftrace_call_adjust() for details.
+	 *
+	 * Therefore here the LR points at `literal + 16` or `literal + 20`,
+	 * and we can find the address of the literal in either case by
+	 * aligning to an 8-byte boundary and subtracting 16. We do the
+	 * alignment first as this allows us to fold the subtraction into the
+	 * LDR.
+	 */
+	bic	x11, x30, 0x7
+	ldr	x11, [x11, #-(4 * AARCH64_INSN_SIZE)]		// op
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	/*
+	 * If the op has a direct call, handle it immediately without
+	 * saving/restoring registers.
+	 */
+	ldr	x17, [x11, #FTRACE_OPS_DIRECT_CALL]		// op->direct_call
+	cbnz	x17, ftrace_caller_direct
+#endif
+#endif
+
 	/* Save original SP */
 	mov	x10, sp
 
@@ -49,6 +74,10 @@ SYM_CODE_START(ftrace_caller)
 	stp	x6, x7, [sp, #FREGS_X6]
 	str	x8,     [sp, #FREGS_X8]
 
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	str	xzr, [sp, #FREGS_DIRECT_TRAMP]
+#endif
+
 	/* Save the callsite's FP, LR, SP */
 	str	x29, [sp, #FREGS_FP]
 	str	x9,  [sp, #FREGS_LR]
@@ -71,20 +100,7 @@ SYM_CODE_START(ftrace_caller)
 	mov	x3, sp					// regs
 
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_CALL_OPS
-	/*
-	 * The literal pointer to the ops is at an 8-byte aligned boundary
-	 * which is either 12 or 16 bytes before the BL instruction in the call
-	 * site. See ftrace_call_adjust() for details.
-	 *
-	 * Therefore here the LR points at `literal + 16` or `literal + 20`,
-	 * and we can find the address of the literal in either case by
-	 * aligning to an 8-byte boundary and subtracting 16. We do the
-	 * alignment first as this allows us to fold the subtraction into the
-	 * LDR.
-	 */
-	bic	x2, x30, 0x7
-	ldr	x2, [x2, #-16]				// op
-
+	mov	x2, x11					// op
 	ldr	x4, [x2, #FTRACE_OPS_FUNC]		// op->func
 	blr	x4					// op->func(ip, parent_ip, op, regs)
 
@@ -107,8 +123,15 @@ SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
 	ldp	x6, x7, [sp, #FREGS_X6]
 	ldr	x8,     [sp, #FREGS_X8]
 
-	/* Restore the callsite's FP, LR, PC */
+	/* Restore the callsite's FP */
 	ldr	x29, [sp, #FREGS_FP]
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+	ldr	x17, [sp, #FREGS_DIRECT_TRAMP]
+	cbnz	x17, ftrace_caller_direct_late
+#endif
+
+	/* Restore the callsite's LR and PC */
 	ldr	x30, [sp, #FREGS_LR]
 	ldr	x9,  [sp, #FREGS_PC]
 
@@ -116,8 +139,45 @@ SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
 	add	sp, sp, #FREGS_SIZE + 32
 
 	ret	x9
+
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+SYM_INNER_LABEL(ftrace_caller_direct_late, SYM_L_LOCAL)
+	/*
+	 * Head to a direct trampoline in x17 after having run other tracers.
+	 * The ftrace_regs are live, and x0-x8 and FP have been restored. The
+	 * LR, PC, and SP have not been restored.
+	 */
+
+	/*
+	 * Restore the callsite's LR and PC matching the trampoline calling
+	 * convention.
+	 */
+	ldr	x9,  [sp, #FREGS_LR]
+	ldr	x30, [sp, #FREGS_PC]
+
+	/* Restore the callsite's SP */
+	add	sp, sp, #FREGS_SIZE + 32
+
+SYM_INNER_LABEL(ftrace_caller_direct, SYM_L_LOCAL)
+	/*
+	 * Head to a direct trampoline in x17.
+	 *
+	 * We use `BR X17` as this can safely land on a `BTI C` or `PACIASP` in
+	 * the trampoline, and will not unbalance any return stack.
+	 */
+	br	x17
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
 SYM_CODE_END(ftrace_caller)
 
+#ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
+SYM_CODE_START(ftrace_stub_direct_tramp)
+	bti	c
+	mov	x10, x30
+	mov	x30, x9
+	ret	x10
+SYM_CODE_END(ftrace_stub_direct_tramp)
+#endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
+
 #else /* CONFIG_DYNAMIC_FTRACE_WITH_ARGS */
 
 /*
diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index 5545fe1a9012..758436727fba 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -206,6 +206,13 @@ static struct plt_entry *get_ftrace_plt(struct module *mod, unsigned long addr)
 	return NULL;
 }
 
+static bool reachable_by_bl(unsigned long addr, unsigned long pc)
+{
+	long offset = (long)addr - (long)pc;
+
+	return offset >= -SZ_128M && offset < SZ_128M;
+}
+
 /*
  * Find the address the callsite must branch to in order to reach '*addr'.
  *
@@ -220,14 +227,21 @@ static bool ftrace_find_callable_addr(struct dyn_ftrace *rec,
 				      unsigned long *addr)
 {
 	unsigned long pc = rec->ip;
-	long offset = (long)*addr - (long)pc;
 	struct plt_entry *plt;
 
+	/*
+	 * If a custom trampoline is unreachable, rely on the ftrace_caller
+	 * trampoline which knows how to indirectly reach that trampoline
+	 * through ops->direct_call.
+	 */
+	if (*addr != FTRACE_ADDR && !reachable_by_bl(*addr, pc))
+		*addr = FTRACE_ADDR;
+
 	/*
 	 * When the target is within range of the 'BL' instruction, use 'addr'
 	 * as-is and branch to that directly.
 	 */
-	if (offset >= -SZ_128M && offset < SZ_128M)
+	if (reachable_by_bl(*addr, pc))
 		return true;
 
 	/*
@@ -330,12 +344,24 @@ int ftrace_make_call(struct dyn_ftrace *rec, unsigned long addr)
 int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr,
 		       unsigned long addr)
 {
-	if (WARN_ON_ONCE(old_addr != (unsigned long)ftrace_caller))
+	unsigned long pc = rec->ip;
+	u32 old, new;
+	int ret;
+
+	ret = ftrace_rec_set_ops(rec, arm64_rec_get_ops(rec));
+	if (ret)
+		return ret;
+
+	if (!ftrace_find_callable_addr(rec, NULL, &old_addr))
 		return -EINVAL;
-	if (WARN_ON_ONCE(addr != (unsigned long)ftrace_caller))
+	if (!ftrace_find_callable_addr(rec, NULL, &addr))
 		return -EINVAL;
 
-	return ftrace_rec_update_ops(rec);
+	old = aarch64_insn_gen_branch_imm(pc, old_addr,
+					  AARCH64_INSN_BRANCH_LINK);
+	new = aarch64_insn_gen_branch_imm(pc, addr, AARCH64_INSN_BRANCH_LINK);
+
+	return ftrace_modify_code(pc, old, new, true);
 }
 #endif
 
-- 
2.40.0.577.gac1e443424-goog



* [PATCH v6 2/5] arm64: ftrace: Simplify get_ftrace_plt
  2023-04-05 18:02 [PATCH v6 0/5] Add ftrace direct call for arm64 Florent Revest
  2023-04-05 18:02 ` [PATCH v6 1/5] arm64: ftrace: Add direct call support Florent Revest
@ 2023-04-05 18:02 ` Florent Revest
  2023-04-05 18:02 ` [PATCH v6 3/5] samples: ftrace: Save required argument registers in sample trampolines Florent Revest
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 17+ messages in thread
From: Florent Revest @ 2023-04-05 18:02 UTC
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

Following recent refactorings, the get_ftrace_plt function only ever
gets called with addr = FTRACE_ADDR so its code can be simplified to
always return the ftrace trampoline plt.

Signed-off-by: Florent Revest <revest@chromium.org>
Acked-by: Mark Rutland <mark.rutland@arm.com>
---
 arch/arm64/kernel/ftrace.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/kernel/ftrace.c b/arch/arm64/kernel/ftrace.c
index 758436727fba..432626c866a8 100644
--- a/arch/arm64/kernel/ftrace.c
+++ b/arch/arm64/kernel/ftrace.c
@@ -195,15 +195,15 @@ int ftrace_update_ftrace_func(ftrace_func_t func)
 	return ftrace_modify_code(pc, 0, new, false);
 }
 
-static struct plt_entry *get_ftrace_plt(struct module *mod, unsigned long addr)
+static struct plt_entry *get_ftrace_plt(struct module *mod)
 {
 #ifdef CONFIG_ARM64_MODULE_PLTS
 	struct plt_entry *plt = mod->arch.ftrace_trampolines;
 
-	if (addr == FTRACE_ADDR)
-		return &plt[FTRACE_PLT_IDX];
-#endif
+	return &plt[FTRACE_PLT_IDX];
+#else
 	return NULL;
+#endif
 }
 
 static bool reachable_by_bl(unsigned long addr, unsigned long pc)
@@ -270,7 +270,7 @@ static bool ftrace_find_callable_addr(struct dyn_ftrace *rec,
 	if (WARN_ON(!mod))
 		return false;
 
-	plt = get_ftrace_plt(mod, *addr);
+	plt = get_ftrace_plt(mod);
 	if (!plt) {
 		pr_err("ftrace: no module PLT for %ps\n", (void *)*addr);
 		return false;
-- 
2.40.0.577.gac1e443424-goog



* [PATCH v6 3/5] samples: ftrace: Save required argument registers in sample trampolines
  2023-04-05 18:02 [PATCH v6 0/5] Add ftrace direct call for arm64 Florent Revest
  2023-04-05 18:02 ` [PATCH v6 1/5] arm64: ftrace: Add direct call support Florent Revest
  2023-04-05 18:02 ` [PATCH v6 2/5] arm64: ftrace: Simplify get_ftrace_plt Florent Revest
@ 2023-04-05 18:02 ` Florent Revest
  2023-04-05 20:40   ` Steven Rostedt
  2023-04-06 10:22   ` Mark Rutland
  2023-04-05 18:02 ` [PATCH v6 4/5] arm64: ftrace: Add direct call trampoline samples support Florent Revest
                   ` (3 subsequent siblings)
  6 siblings, 2 replies; 17+ messages in thread
From: Florent Revest @ 2023-04-05 18:02 UTC
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

The ftrace-direct-too sample traces the handle_mm_fault function, whose
signature has changed since the sample was introduced. Since
commit bce617edecad ("mm: do page fault accounting in handle_mm_fault"),
handle_mm_fault has taken 4 arguments. Therefore, the sample trampoline
should save 4 argument registers.

s390 already saves all argument registers, so it does not need a change,
but x86_64 needs an extra push and pop.

This also updates the signature of the tracing function so that it
mirrors the signature of the traced function.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 samples/ftrace/ftrace-direct-too.c | 14 ++++++++------
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
index f28e7b99840f..71ed4ee8cb4a 100644
--- a/samples/ftrace/ftrace-direct-too.c
+++ b/samples/ftrace/ftrace-direct-too.c
@@ -5,14 +5,14 @@
 #include <linux/ftrace.h>
 #include <asm/asm-offsets.h>
 
-extern void my_direct_func(struct vm_area_struct *vma,
-			   unsigned long address, unsigned int flags);
+extern void my_direct_func(struct vm_area_struct *vma, unsigned long address,
+			   unsigned int flags, struct pt_regs *regs);
 
-void my_direct_func(struct vm_area_struct *vma,
-			unsigned long address, unsigned int flags)
+void my_direct_func(struct vm_area_struct *vma, unsigned long address,
+		    unsigned int flags, struct pt_regs *regs)
 {
-	trace_printk("handle mm fault vma=%p address=%lx flags=%x\n",
-		     vma, address, flags);
+	trace_printk("handle mm fault vma=%p address=%lx flags=%x regs=%p\n",
+		     vma, address, flags, regs);
 }
 
 extern void my_tramp(void *);
@@ -34,7 +34,9 @@ asm (
 "	pushq %rdi\n"
 "	pushq %rsi\n"
 "	pushq %rdx\n"
+"	pushq %rcx\n"
 "	call my_direct_func\n"
+"	popq %rcx\n"
 "	popq %rdx\n"
 "	popq %rsi\n"
 "	popq %rdi\n"
-- 
2.40.0.577.gac1e443424-goog



* [PATCH v6 4/5] arm64: ftrace: Add direct call trampoline samples support
  2023-04-05 18:02 [PATCH v6 0/5] Add ftrace direct call for arm64 Florent Revest
                   ` (2 preceding siblings ...)
  2023-04-05 18:02 ` [PATCH v6 3/5] samples: ftrace: Save required argument registers in sample trampolines Florent Revest
@ 2023-04-05 18:02 ` Florent Revest
  2023-04-06 10:50   ` Mark Rutland
  2023-04-05 18:02 ` [PATCH v6 5/5] selftests/bpf: Update the tests deny list on aarch64 Florent Revest
                   ` (2 subsequent siblings)
  6 siblings, 1 reply; 17+ messages in thread
From: Florent Revest @ 2023-04-05 18:02 UTC
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

The ftrace samples need per-architecture trampoline implementations
to save and restore argument registers around the calls to
my_direct_func* and to restore clobbered registers (e.g. x30).

These samples also include <asm/asm-offsets.h> which, on arm64, is not
necessary and redefines previously defined macros (resulting in
warnings) so these includes are guarded by !CONFIG_ARM64.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 arch/arm64/Kconfig                          |  2 ++
 samples/ftrace/ftrace-direct-modify.c       | 34 ++++++++++++++++++
 samples/ftrace/ftrace-direct-multi-modify.c | 40 +++++++++++++++++++++
 samples/ftrace/ftrace-direct-multi.c        | 24 +++++++++++++
 samples/ftrace/ftrace-direct-too.c          | 26 ++++++++++++++
 samples/ftrace/ftrace-direct.c              | 24 +++++++++++++
 6 files changed, 150 insertions(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index f3503d0cc1b8..c2bf28099abd 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -194,6 +194,8 @@ config ARM64
 		    !CC_OPTIMIZE_FOR_SIZE)
 	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
 		if DYNAMIC_FTRACE_WITH_ARGS
+	select HAVE_SAMPLE_FTRACE_DIRECT
+	select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	select HAVE_FAST_GUP
 	select HAVE_FTRACE_MCOUNT_RECORD
diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
index 25fba66f61c0..98d1b7385f08 100644
--- a/samples/ftrace/ftrace-direct-modify.c
+++ b/samples/ftrace/ftrace-direct-modify.c
@@ -2,7 +2,9 @@
 #include <linux/module.h>
 #include <linux/kthread.h>
 #include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
 #include <asm/asm-offsets.h>
+#endif
 
 extern void my_direct_func1(void);
 extern void my_direct_func2(void);
@@ -96,6 +98,38 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+#ifdef CONFIG_ARM64
+
+asm (
+"	.pushsection    .text, \"ax\", @progbits\n"
+"	.type		my_tramp1, @function\n"
+"	.globl		my_tramp1\n"
+"   my_tramp1:"
+"	bti	c\n"
+"	sub	sp, sp, #16\n"
+"	stp	x9, x30, [sp]\n"
+"	bl	my_direct_func1\n"
+"	ldp	x30, x9, [sp]\n"
+"	add	sp, sp, #16\n"
+"	ret	x9\n"
+"	.size		my_tramp1, .-my_tramp1\n"
+
+"	.type		my_tramp2, @function\n"
+"	.globl		my_tramp2\n"
+"   my_tramp2:"
+"	bti	c\n"
+"	sub	sp, sp, #16\n"
+"	stp	x9, x30, [sp]\n"
+"	bl	my_direct_func2\n"
+"	ldp	x30, x9, [sp]\n"
+"	add	sp, sp, #16\n"
+"	ret	x9\n"
+"	.size		my_tramp2, .-my_tramp2\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
 static struct ftrace_ops direct;
 
 static unsigned long my_tramp = (unsigned long)my_tramp1;
diff --git a/samples/ftrace/ftrace-direct-multi-modify.c b/samples/ftrace/ftrace-direct-multi-modify.c
index f72623899602..26956c8fc513 100644
--- a/samples/ftrace/ftrace-direct-multi-modify.c
+++ b/samples/ftrace/ftrace-direct-multi-modify.c
@@ -2,7 +2,9 @@
 #include <linux/module.h>
 #include <linux/kthread.h>
 #include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
 #include <asm/asm-offsets.h>
+#endif
 
 extern void my_direct_func1(unsigned long ip);
 extern void my_direct_func2(unsigned long ip);
@@ -103,6 +105,44 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+#ifdef CONFIG_ARM64
+
+asm (
+"	.pushsection    .text, \"ax\", @progbits\n"
+"	.type		my_tramp1, @function\n"
+"	.globl		my_tramp1\n"
+"   my_tramp1:"
+"	bti	c\n"
+"	sub	sp, sp, #32\n"
+"	stp	x9, x30, [sp]\n"
+"	str	x0, [sp, #16]\n"
+"	mov	x0, x30\n"
+"	bl	my_direct_func1\n"
+"	ldp	x30, x9, [sp]\n"
+"	ldr	x0, [sp, #16]\n"
+"	add	sp, sp, #32\n"
+"	ret	x9\n"
+"	.size		my_tramp1, .-my_tramp1\n"
+
+"	.type		my_tramp2, @function\n"
+"	.globl		my_tramp2\n"
+"   my_tramp2:"
+"	bti	c\n"
+"	sub	sp, sp, #32\n"
+"	stp	x9, x30, [sp]\n"
+"	str	x0, [sp, #16]\n"
+"	mov	x0, x30\n"
+"	bl	my_direct_func2\n"
+"	ldp	x30, x9, [sp]\n"
+"	ldr	x0, [sp, #16]\n"
+"	add	sp, sp, #32\n"
+"	ret	x9\n"
+"	.size		my_tramp2, .-my_tramp2\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
 static unsigned long my_tramp = (unsigned long)my_tramp1;
 static unsigned long tramps[2] = {
 	(unsigned long)my_tramp1,
diff --git a/samples/ftrace/ftrace-direct-multi.c b/samples/ftrace/ftrace-direct-multi.c
index 1547c2c6be02..b2ac90e0c02e 100644
--- a/samples/ftrace/ftrace-direct-multi.c
+++ b/samples/ftrace/ftrace-direct-multi.c
@@ -4,7 +4,9 @@
 #include <linux/mm.h> /* for handle_mm_fault() */
 #include <linux/ftrace.h>
 #include <linux/sched/stat.h>
+#ifndef CONFIG_ARM64
 #include <asm/asm-offsets.h>
+#endif
 
 extern void my_direct_func(unsigned long ip);
 
@@ -66,6 +68,28 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+#ifdef CONFIG_ARM64
+
+asm (
+"	.pushsection	.text, \"ax\", @progbits\n"
+"	.type		my_tramp, @function\n"
+"	.globl		my_tramp\n"
+"   my_tramp:"
+"	bti	c\n"
+"	sub	sp, sp, #32\n"
+"	stp	x9, x30, [sp]\n"
+"	str	x0, [sp, #16]\n"
+"	mov	x0, x30\n"
+"	bl	my_direct_func\n"
+"	ldp	x30, x9, [sp]\n"
+"	ldr	x0, [sp, #16]\n"
+"	add	sp, sp, #32\n"
+"	ret	x9\n"
+"	.size		my_tramp, .-my_tramp\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
 static struct ftrace_ops direct;
 
 static int __init ftrace_direct_multi_init(void)
diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
index 71ed4ee8cb4a..38f6f677f913 100644
--- a/samples/ftrace/ftrace-direct-too.c
+++ b/samples/ftrace/ftrace-direct-too.c
@@ -3,7 +3,9 @@
 
 #include <linux/mm.h> /* for handle_mm_fault() */
 #include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
 #include <asm/asm-offsets.h>
+#endif
 
 extern void my_direct_func(struct vm_area_struct *vma, unsigned long address,
 			   unsigned int flags, struct pt_regs *regs);
@@ -72,6 +74,30 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+#ifdef CONFIG_ARM64
+
+asm (
+"	.pushsection	.text, \"ax\", @progbits\n"
+"	.type		my_tramp, @function\n"
+"	.globl		my_tramp\n"
+"   my_tramp:"
+"	bti	c\n"
+"	sub	sp, sp, #48\n"
+"	stp	x9, x30, [sp]\n"
+"	stp	x0, x1, [sp, #16]\n"
+"	stp	x2, x3, [sp, #32]\n"
+"	bl	my_direct_func\n"
+"	ldp	x30, x9, [sp]\n"
+"	ldp	x0, x1, [sp, #16]\n"
+"	ldp	x2, x3, [sp, #32]\n"
+"	add	sp, sp, #48\n"
+"	ret	x9\n"
+"	.size		my_tramp, .-my_tramp\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
 static struct ftrace_ops direct;
 
 static int __init ftrace_direct_init(void)
diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
index d81a9473b585..e5312f9c15d3 100644
--- a/samples/ftrace/ftrace-direct.c
+++ b/samples/ftrace/ftrace-direct.c
@@ -3,7 +3,9 @@
 
 #include <linux/sched.h> /* for wake_up_process() */
 #include <linux/ftrace.h>
+#ifndef CONFIG_ARM64
 #include <asm/asm-offsets.h>
+#endif
 
 extern void my_direct_func(struct task_struct *p);
 
@@ -63,6 +65,28 @@ asm (
 
 #endif /* CONFIG_S390 */
 
+#ifdef CONFIG_ARM64
+
+asm (
+"	.pushsection	.text, \"ax\", @progbits\n"
+"	.type		my_tramp, @function\n"
+"	.globl		my_tramp\n"
+"   my_tramp:"
+"	bti	c\n"
+"	sub	sp, sp, #32\n"
+"	stp	x9, x30, [sp]\n"
+"	str	x0, [sp, #16]\n"
+"	bl	my_direct_func\n"
+"	ldp	x30, x9, [sp]\n"
+"	ldr	x0, [sp, #16]\n"
+"	add	sp, sp, #32\n"
+"	ret	x9\n"
+"	.size		my_tramp, .-my_tramp\n"
+"	.popsection\n"
+);
+
+#endif /* CONFIG_ARM64 */
+
 static struct ftrace_ops direct;
 
 static int __init ftrace_direct_init(void)
-- 
2.40.0.577.gac1e443424-goog



* [PATCH v6 5/5] selftests/bpf: Update the tests deny list on aarch64
  2023-04-05 18:02 [PATCH v6 0/5] Add ftrace direct call for arm64 Florent Revest
                   ` (3 preceding siblings ...)
  2023-04-05 18:02 ` [PATCH v6 4/5] arm64: ftrace: Add direct call trampoline samples support Florent Revest
@ 2023-04-05 18:02 ` Florent Revest
  2023-04-11 15:56 ` [PATCH v6 0/5] Add ftrace direct call for arm64 Mark Rutland
  2023-04-11 18:37 ` Will Deacon
  6 siblings, 0 replies; 17+ messages in thread
From: Florent Revest @ 2023-04-05 18:02 UTC
  To: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf
  Cc: catalin.marinas, will, rostedt, mhiramat, mark.rutland, ast,
	daniel, andrii, kpsingh, jolsa, xukuohai, lihuafei1,
	Florent Revest

Now that ftrace supports direct calls on arm64, BPF tracing programs work
on that architecture. This fixes the vast majority of the previously
failing BPF selftests, except for:

- multi_kprobe programs which require fprobe, not available on arm64 yet
- tracing_struct which requires trampoline support to access struct args

This patch updates the list of BPF selftests which are known to fail so
the BPF CI can validate the tests which pass now.

Signed-off-by: Florent Revest <revest@chromium.org>
---
 tools/testing/selftests/bpf/DENYLIST.aarch64 | 82 ++------------------
 1 file changed, 5 insertions(+), 77 deletions(-)

diff --git a/tools/testing/selftests/bpf/DENYLIST.aarch64 b/tools/testing/selftests/bpf/DENYLIST.aarch64
index 99cc33c51eaa..6b95cb544094 100644
--- a/tools/testing/selftests/bpf/DENYLIST.aarch64
+++ b/tools/testing/selftests/bpf/DENYLIST.aarch64
@@ -1,33 +1,5 @@
-bloom_filter_map                                 # libbpf: prog 'check_bloom': failed to attach: ERROR: strerror_r(-524)=22
-bpf_cookie/lsm
-bpf_cookie/multi_kprobe_attach_api
-bpf_cookie/multi_kprobe_link_api
-bpf_cookie/trampoline
-bpf_loop/check_callback_fn_stop                  # link unexpected error: -524
-bpf_loop/check_invalid_flags
-bpf_loop/check_nested_calls
-bpf_loop/check_non_constant_callback
-bpf_loop/check_nr_loops
-bpf_loop/check_null_callback_ctx
-bpf_loop/check_stack
-bpf_mod_race                                     # bpf_mod_kfunc_race__attach unexpected error: -524 (errno 524)
-bpf_tcp_ca/dctcp_fallback
-btf_dump/btf_dump: var_data                      # find type id unexpected find type id: actual -2 < expected 0
-cgroup_hierarchical_stats                        # attach unexpected error: -524 (errno 524)
-d_path/basic                                     # setup attach failed: -524
-deny_namespace                                   # attach unexpected error: -524 (errno 524)
-fentry_fexit                                     # fentry_attach unexpected error: -1 (errno 524)
-fentry_test                                      # fentry_attach unexpected error: -1 (errno 524)
-fexit_sleep                                      # fexit_attach fexit attach failed: -1
-fexit_stress                                     # fexit attach unexpected fexit attach: actual -524 < expected 0
-fexit_test                                       # fexit_attach unexpected error: -1 (errno 524)
-get_func_args_test                               # get_func_args_test__attach unexpected error: -524 (errno 524) (trampoline)
-get_func_ip_test                                 # get_func_ip_test__attach unexpected error: -524 (errno 524) (trampoline)
-htab_update/reenter_update
-kfree_skb                                        # attach fentry unexpected error: -524 (trampoline)
-kfunc_call/subprog                               # extern (var ksym) 'bpf_prog_active': not found in kernel BTF
-kfunc_call/subprog_lskel                         # skel unexpected error: -2
-kfunc_dynptr_param/dynptr_data_null              # libbpf: prog 'dynptr_data_null': failed to attach: ERROR: strerror_r(-524)=22
+bpf_cookie/multi_kprobe_attach_api               # kprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3
+bpf_cookie/multi_kprobe_link_api                 # kprobe_multi_link_api_subtest:FAIL:fentry_raw_skel_load unexpected error: -3
 kprobe_multi_bench_attach                        # bpf_program__attach_kprobe_multi_opts unexpected error: -95
 kprobe_multi_test/attach_api_addrs               # bpf_program__attach_kprobe_multi_opts unexpected error: -95
 kprobe_multi_test/attach_api_pattern             # bpf_program__attach_kprobe_multi_opts unexpected error: -95
@@ -35,50 +7,6 @@ kprobe_multi_test/attach_api_syms                # bpf_program__attach_kprobe_mu
 kprobe_multi_test/bench_attach                   # bpf_program__attach_kprobe_multi_opts unexpected error: -95
 kprobe_multi_test/link_api_addrs                 # link_fd unexpected link_fd: actual -95 < expected 0
 kprobe_multi_test/link_api_syms                  # link_fd unexpected link_fd: actual -95 < expected 0
-kprobe_multi_test/skel_api                       # kprobe_multi__attach unexpected error: -524 (errno 524)
-ksyms_module/libbpf                              # 'bpf_testmod_ksym_percpu': not found in kernel BTF
-ksyms_module/lskel                               # test_ksyms_module_lskel__open_and_load unexpected error: -2
-libbpf_get_fd_by_id_opts                         # test_libbpf_get_fd_by_id_opts__attach unexpected error: -524 (errno 524)
-linked_list
-lookup_key                                       # test_lookup_key__attach unexpected error: -524 (errno 524)
-lru_bug                                          # lru_bug__attach unexpected error: -524 (errno 524)
-modify_return                                    # modify_return__attach failed unexpected error: -524 (errno 524)
-module_attach                                    # skel_attach skeleton attach failed: -524
-mptcp/base                                       # run_test mptcp unexpected error: -524 (errno 524)
-netcnt                                           # packets unexpected packets: actual 10001 != expected 10000
-rcu_read_lock                                    # failed to attach: ERROR: strerror_r(-524)=22
-recursion                                        # skel_attach unexpected error: -524 (errno 524)
-ringbuf                                          # skel_attach skeleton attachment failed: -1
-setget_sockopt                                   # attach_cgroup unexpected error: -524
-sk_storage_tracing                               # test_sk_storage_tracing__attach unexpected error: -524 (errno 524)
-skc_to_unix_sock                                 # could not attach BPF object unexpected error: -524 (errno 524)
-socket_cookie                                    # prog_attach unexpected error: -524
-stacktrace_build_id                              # compare_stack_ips stackmap vs. stack_amap err -1 errno 2
-task_local_storage/exit_creds                    # skel_attach unexpected error: -524 (errno 524)
-task_local_storage/recursion                     # skel_attach unexpected error: -524 (errno 524)
-test_bprm_opts                                   # attach attach failed: -524
-test_ima                                         # attach attach failed: -524
-test_local_storage                               # attach lsm attach failed: -524
-test_lsm                                         # test_lsm_first_attach unexpected error: -524 (errno 524)
-test_overhead                                    # attach_fentry unexpected error: -524
-timer                                            # timer unexpected error: -524 (errno 524)
-timer_crash                                      # timer_crash__attach unexpected error: -524 (errno 524)
-timer_mim                                        # timer_mim unexpected error: -524 (errno 524)
-trace_printk                                     # trace_printk__attach unexpected error: -1 (errno 524)
-trace_vprintk                                    # trace_vprintk__attach unexpected error: -1 (errno 524)
-tracing_struct                                   # tracing_struct__attach unexpected error: -524 (errno 524)
-trampoline_count                                 # attach_prog unexpected error: -524
-unpriv_bpf_disabled                              # skel_attach unexpected error: -524 (errno 524)
-user_ringbuf/test_user_ringbuf_post_misaligned   # misaligned_skel unexpected error: -524 (errno 524)
-user_ringbuf/test_user_ringbuf_post_producer_wrong_offset
-user_ringbuf/test_user_ringbuf_post_larger_than_ringbuf_sz
-user_ringbuf/test_user_ringbuf_basic             # ringbuf_basic_skel unexpected error: -524 (errno 524)
-user_ringbuf/test_user_ringbuf_sample_full_ring_buffer
-user_ringbuf/test_user_ringbuf_post_alignment_autoadjust
-user_ringbuf/test_user_ringbuf_overfill
-user_ringbuf/test_user_ringbuf_discards_properly_ignored
-user_ringbuf/test_user_ringbuf_loop
-user_ringbuf/test_user_ringbuf_msg_protocol
-user_ringbuf/test_user_ringbuf_blocking_reserve
-verify_pkcs7_sig                                 # test_verify_pkcs7_sig__attach unexpected error: -524 (errno 524)
-vmlinux                                          # skel_attach skeleton attach failed: -524
+kprobe_multi_test/skel_api                       # libbpf: failed to load BPF skeleton 'kprobe_multi': -3
+module_attach                                    # prog 'kprobe_multi': failed to auto-attach: -95
+tracing_struct                                   # tracing_struct__attach unexpected error: -524 (errno 524)
\ No newline at end of file
-- 
2.40.0.577.gac1e443424-goog



* Re: [PATCH v6 3/5] samples: ftrace: Save required argument registers in sample trampolines
  2023-04-05 18:02 ` [PATCH v6 3/5] samples: ftrace: Save required argument registers in sample trampolines Florent Revest
@ 2023-04-05 20:40   ` Steven Rostedt
  2023-04-06 10:22   ` Mark Rutland
  1 sibling, 0 replies; 17+ messages in thread
From: Steven Rostedt @ 2023-04-05 20:40 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, mhiramat, mark.rutland, ast, daniel,
	andrii, kpsingh, jolsa, xukuohai, lihuafei1

On Wed,  5 Apr 2023 20:02:48 +0200
Florent Revest <revest@chromium.org> wrote:

> The ftrace-direct-too sample traces the handle_mm_fault function whose
> signature changed since the introduction of the sample. Since:
> commit bce617edecad ("mm: do page fault accounting in handle_mm_fault")
> handle_mm_fault now has 4 arguments. Therefore, the sample trampoline
> should save 4 argument registers.
> 
> s390 saves all argument registers already so it does not need a change
> but x86_64 needs an extra push and pop.
> 
> This also evolves the signature of the tracing function to make it
> mirror the signature of the traced function.
>

Should probably add:

Cc: stable@vger.kernel.org
Fixes: bce617edecad ("mm: do page fault accounting in handle_mm_fault")

Reviewed-by: Steven Rostedt (Google) <rostedt@goodmis.org>

-- Steve


* Re: [PATCH v6 3/5] samples: ftrace: Save required argument registers in sample trampolines
  2023-04-05 18:02 ` [PATCH v6 3/5] samples: ftrace: Save required argument registers in sample trampolines Florent Revest
  2023-04-05 20:40   ` Steven Rostedt
@ 2023-04-06 10:22   ` Mark Rutland
  1 sibling, 0 replies; 17+ messages in thread
From: Mark Rutland @ 2023-04-06 10:22 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai, lihuafei1

On Wed, Apr 05, 2023 at 08:02:48PM +0200, Florent Revest wrote:
> The ftrace-direct-too sample traces the handle_mm_fault function whose
> signature changed since the introduction of the sample. Since:
> commit bce617edecad ("mm: do page fault accounting in handle_mm_fault")
> handle_mm_fault now has 4 arguments. Therefore, the sample trampoline
> should save 4 argument registers.
> 
> s390 saves all argument registers already so it does not need a change
> but x86_64 needs an extra push and pop.
> 
> This also evolves the signature of the tracing function to make it
> mirror the signature of the traced function.
> 
> Signed-off-by: Florent Revest <revest@chromium.org>

Reviewed-by: Mark Rutland <mark.rutland@arm.com>

Thanks for this!

Mark.

> ---
>  samples/ftrace/ftrace-direct-too.c | 14 ++++++++------
>  1 file changed, 8 insertions(+), 6 deletions(-)
> 
> diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
> index f28e7b99840f..71ed4ee8cb4a 100644
> --- a/samples/ftrace/ftrace-direct-too.c
> +++ b/samples/ftrace/ftrace-direct-too.c
> @@ -5,14 +5,14 @@
>  #include <linux/ftrace.h>
>  #include <asm/asm-offsets.h>
>  
> -extern void my_direct_func(struct vm_area_struct *vma,
> -			   unsigned long address, unsigned int flags);
> +extern void my_direct_func(struct vm_area_struct *vma, unsigned long address,
> +			   unsigned int flags, struct pt_regs *regs);
>  
> -void my_direct_func(struct vm_area_struct *vma,
> -			unsigned long address, unsigned int flags)
> +void my_direct_func(struct vm_area_struct *vma, unsigned long address,
> +		    unsigned int flags, struct pt_regs *regs)
>  {
> -	trace_printk("handle mm fault vma=%p address=%lx flags=%x\n",
> -		     vma, address, flags);
> +	trace_printk("handle mm fault vma=%p address=%lx flags=%x regs=%p\n",
> +		     vma, address, flags, regs);
>  }
>  
>  extern void my_tramp(void *);
> @@ -34,7 +34,9 @@ asm (
>  "	pushq %rdi\n"
>  "	pushq %rsi\n"
>  "	pushq %rdx\n"
> +"	pushq %rcx\n"
>  "	call my_direct_func\n"
> +"	popq %rcx\n"
>  "	popq %rdx\n"
>  "	popq %rsi\n"
>  "	popq %rdi\n"
> -- 
> 2.40.0.577.gac1e443424-goog
> 
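The effect of the extra pushq/popq pair in the x86_64 hunk above can be illustrated with a small user-space model. This is only a sketch: struct arg_regs, clobber_args() and run_tramp() are hypothetical names invented for the illustration and are not part of the sample. It relies on the x86-64 SysV convention, in which the fourth integer argument is passed in rcx and a C callee is free to overwrite it.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the four x86-64 argument registers the sample saves. */
struct arg_regs {
	uint64_t rdi, rsi, rdx, rcx;
};

/* Stand-in for my_direct_func: a C callee may clobber caller-saved registers. */
static void clobber_args(struct arg_regs *r)
{
	r->rdi = r->rsi = r->rdx = r->rcx = 0xdeadbeef;
}

/*
 * Model of the trampoline body: push nsaved registers in the order
 * rdi, rsi, rdx, rcx, call the tracing function, then pop in reverse.
 * nsaved = 3 models the old trampoline, nsaved = 4 the fixed one.
 * nsaved is assumed to be between 0 and 4.
 */
static struct arg_regs run_tramp(struct arg_regs in, int nsaved)
{
	uint64_t *slots[4] = { &in.rdi, &in.rsi, &in.rdx, &in.rcx };
	uint64_t stack[4];
	int sp = 0;

	for (int i = 0; i < nsaved; i++)	/* pushq */
		stack[sp++] = *slots[i];

	clobber_args(&in);			/* call my_direct_func */

	for (int i = nsaved - 1; i >= 0; i--)	/* popq, reverse order */
		*slots[i] = stack[--sp];

	return in;
}

/* Usage: the traced function only sees its 4th argument intact with 4 saves. */
static void tramp_selfcheck(void)
{
	struct arg_regs in = { 1, 2, 3, 4 };

	assert(run_tramp(in, 3).rcx == 0xdeadbeef);	/* old: regs is lost */
	assert(run_tramp(in, 4).rcx == 4);		/* fixed: regs survives */
}
```

In the model, as in the patch, the pops mirror the pushes in reverse order, so each value returns to the register it came from.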


* Re: [PATCH v6 4/5] arm64: ftrace: Add direct call trampoline samples support
  2023-04-05 18:02 ` [PATCH v6 4/5] arm64: ftrace: Add direct call trampoline samples support Florent Revest
@ 2023-04-06 10:50   ` Mark Rutland
  0 siblings, 0 replies; 17+ messages in thread
From: Mark Rutland @ 2023-04-06 10:50 UTC (permalink / raw)
  To: Florent Revest
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	catalin.marinas, will, rostedt, mhiramat, ast, daniel, andrii,
	kpsingh, jolsa, xukuohai, lihuafei1

On Wed, Apr 05, 2023 at 08:02:49PM +0200, Florent Revest wrote:
> The ftrace samples need per-architecture trampoline implementations
> to save and restore argument registers around the calls to
> my_direct_func* and to restore polluted registers (eg: x30).
> 
> These samples also include <asm/asm-offsets.h> which, on arm64, is not
> necessary and redefines previously defined macros (resulting in
> warnings) so these includes are guarded by !CONFIG_ARM64.
> 
> Signed-off-by: Florent Revest <revest@chromium.org>

These all look good to me. I gave each module a spin in an 8-vCPU VM on an M1
Macbook Pro with a bunch of other work going on, and all of those worked as
expected with sensible output in /sys/kernel/tracing/trace, and no noticeable
failures elsewhere. So:

Reviewed-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Mark Rutland <mark.rutland@arm.com>

Mark.

> ---
>  arch/arm64/Kconfig                          |  2 ++
>  samples/ftrace/ftrace-direct-modify.c       | 34 ++++++++++++++++++
>  samples/ftrace/ftrace-direct-multi-modify.c | 40 +++++++++++++++++++++
>  samples/ftrace/ftrace-direct-multi.c        | 24 +++++++++++++
>  samples/ftrace/ftrace-direct-too.c          | 26 ++++++++++++++
>  samples/ftrace/ftrace-direct.c              | 24 +++++++++++++
>  6 files changed, 150 insertions(+)
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index f3503d0cc1b8..c2bf28099abd 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -194,6 +194,8 @@ config ARM64
>  		    !CC_OPTIMIZE_FOR_SIZE)
>  	select FTRACE_MCOUNT_USE_PATCHABLE_FUNCTION_ENTRY \
>  		if DYNAMIC_FTRACE_WITH_ARGS
> +	select HAVE_SAMPLE_FTRACE_DIRECT
> +	select HAVE_SAMPLE_FTRACE_DIRECT_MULTI
>  	select HAVE_EFFICIENT_UNALIGNED_ACCESS
>  	select HAVE_FAST_GUP
>  	select HAVE_FTRACE_MCOUNT_RECORD
> diff --git a/samples/ftrace/ftrace-direct-modify.c b/samples/ftrace/ftrace-direct-modify.c
> index 25fba66f61c0..98d1b7385f08 100644
> --- a/samples/ftrace/ftrace-direct-modify.c
> +++ b/samples/ftrace/ftrace-direct-modify.c
> @@ -2,7 +2,9 @@
>  #include <linux/module.h>
>  #include <linux/kthread.h>
>  #include <linux/ftrace.h>
> +#ifndef CONFIG_ARM64
>  #include <asm/asm-offsets.h>
> +#endif
>  
>  extern void my_direct_func1(void);
>  extern void my_direct_func2(void);
> @@ -96,6 +98,38 @@ asm (
>  
>  #endif /* CONFIG_S390 */
>  
> +#ifdef CONFIG_ARM64
> +
> +asm (
> +"	.pushsection    .text, \"ax\", @progbits\n"
> +"	.type		my_tramp1, @function\n"
> +"	.globl		my_tramp1\n"
> +"   my_tramp1:"
> +"	bti	c\n"
> +"	sub	sp, sp, #16\n"
> +"	stp	x9, x30, [sp]\n"
> +"	bl	my_direct_func1\n"
> +"	ldp	x30, x9, [sp]\n"
> +"	add	sp, sp, #16\n"
> +"	ret	x9\n"
> +"	.size		my_tramp1, .-my_tramp1\n"
> +
> +"	.type		my_tramp2, @function\n"
> +"	.globl		my_tramp2\n"
> +"   my_tramp2:"
> +"	bti	c\n"
> +"	sub	sp, sp, #16\n"
> +"	stp	x9, x30, [sp]\n"
> +"	bl	my_direct_func2\n"
> +"	ldp	x30, x9, [sp]\n"
> +"	add	sp, sp, #16\n"
> +"	ret	x9\n"
> +"	.size		my_tramp2, .-my_tramp2\n"
> +"	.popsection\n"
> +);
> +
> +#endif /* CONFIG_ARM64 */
> +
>  static struct ftrace_ops direct;
>  
>  static unsigned long my_tramp = (unsigned long)my_tramp1;
> diff --git a/samples/ftrace/ftrace-direct-multi-modify.c b/samples/ftrace/ftrace-direct-multi-modify.c
> index f72623899602..26956c8fc513 100644
> --- a/samples/ftrace/ftrace-direct-multi-modify.c
> +++ b/samples/ftrace/ftrace-direct-multi-modify.c
> @@ -2,7 +2,9 @@
>  #include <linux/module.h>
>  #include <linux/kthread.h>
>  #include <linux/ftrace.h>
> +#ifndef CONFIG_ARM64
>  #include <asm/asm-offsets.h>
> +#endif
>  
>  extern void my_direct_func1(unsigned long ip);
>  extern void my_direct_func2(unsigned long ip);
> @@ -103,6 +105,44 @@ asm (
>  
>  #endif /* CONFIG_S390 */
>  
> +#ifdef CONFIG_ARM64
> +
> +asm (
> +"	.pushsection    .text, \"ax\", @progbits\n"
> +"	.type		my_tramp1, @function\n"
> +"	.globl		my_tramp1\n"
> +"   my_tramp1:"
> +"	bti	c\n"
> +"	sub	sp, sp, #32\n"
> +"	stp	x9, x30, [sp]\n"
> +"	str	x0, [sp, #16]\n"
> +"	mov	x0, x30\n"
> +"	bl	my_direct_func1\n"
> +"	ldp	x30, x9, [sp]\n"
> +"	ldr	x0, [sp, #16]\n"
> +"	add	sp, sp, #32\n"
> +"	ret	x9\n"
> +"	.size		my_tramp1, .-my_tramp1\n"
> +
> +"	.type		my_tramp2, @function\n"
> +"	.globl		my_tramp2\n"
> +"   my_tramp2:"
> +"	bti	c\n"
> +"	sub	sp, sp, #32\n"
> +"	stp	x9, x30, [sp]\n"
> +"	str	x0, [sp, #16]\n"
> +"	mov	x0, x30\n"
> +"	bl	my_direct_func2\n"
> +"	ldp	x30, x9, [sp]\n"
> +"	ldr	x0, [sp, #16]\n"
> +"	add	sp, sp, #32\n"
> +"	ret	x9\n"
> +"	.size		my_tramp2, .-my_tramp2\n"
> +"	.popsection\n"
> +);
> +
> +#endif /* CONFIG_ARM64 */
> +
>  static unsigned long my_tramp = (unsigned long)my_tramp1;
>  static unsigned long tramps[2] = {
>  	(unsigned long)my_tramp1,
> diff --git a/samples/ftrace/ftrace-direct-multi.c b/samples/ftrace/ftrace-direct-multi.c
> index 1547c2c6be02..b2ac90e0c02e 100644
> --- a/samples/ftrace/ftrace-direct-multi.c
> +++ b/samples/ftrace/ftrace-direct-multi.c
> @@ -4,7 +4,9 @@
>  #include <linux/mm.h> /* for handle_mm_fault() */
>  #include <linux/ftrace.h>
>  #include <linux/sched/stat.h>
> +#ifndef CONFIG_ARM64
>  #include <asm/asm-offsets.h>
> +#endif
>  
>  extern void my_direct_func(unsigned long ip);
>  
> @@ -66,6 +68,28 @@ asm (
>  
>  #endif /* CONFIG_S390 */
>  
> +#ifdef CONFIG_ARM64
> +
> +asm (
> +"	.pushsection	.text, \"ax\", @progbits\n"
> +"	.type		my_tramp, @function\n"
> +"	.globl		my_tramp\n"
> +"   my_tramp:"
> +"	bti	c\n"
> +"	sub	sp, sp, #32\n"
> +"	stp	x9, x30, [sp]\n"
> +"	str	x0, [sp, #16]\n"
> +"	mov	x0, x30\n"
> +"	bl	my_direct_func\n"
> +"	ldp	x30, x9, [sp]\n"
> +"	ldr	x0, [sp, #16]\n"
> +"	add	sp, sp, #32\n"
> +"	ret	x9\n"
> +"	.size		my_tramp, .-my_tramp\n"
> +"	.popsection\n"
> +);
> +
> +#endif /* CONFIG_ARM64 */
>  static struct ftrace_ops direct;
>  
>  static int __init ftrace_direct_multi_init(void)
> diff --git a/samples/ftrace/ftrace-direct-too.c b/samples/ftrace/ftrace-direct-too.c
> index 71ed4ee8cb4a..38f6f677f913 100644
> --- a/samples/ftrace/ftrace-direct-too.c
> +++ b/samples/ftrace/ftrace-direct-too.c
> @@ -3,7 +3,9 @@
>  
>  #include <linux/mm.h> /* for handle_mm_fault() */
>  #include <linux/ftrace.h>
> +#ifndef CONFIG_ARM64
>  #include <asm/asm-offsets.h>
> +#endif
>  
>  extern void my_direct_func(struct vm_area_struct *vma, unsigned long address,
>  			   unsigned int flags, struct pt_regs *regs);
> @@ -72,6 +74,30 @@ asm (
>  
>  #endif /* CONFIG_S390 */
>  
> +#ifdef CONFIG_ARM64
> +
> +asm (
> +"	.pushsection	.text, \"ax\", @progbits\n"
> +"	.type		my_tramp, @function\n"
> +"	.globl		my_tramp\n"
> +"   my_tramp:"
> +"	bti	c\n"
> +"	sub	sp, sp, #48\n"
> +"	stp	x9, x30, [sp]\n"
> +"	stp	x0, x1, [sp, #16]\n"
> +"	stp	x2, x3, [sp, #32]\n"
> +"	bl	my_direct_func\n"
> +"	ldp	x30, x9, [sp]\n"
> +"	ldp	x0, x1, [sp, #16]\n"
> +"	ldp	x2, x3, [sp, #32]\n"
> +"	add	sp, sp, #48\n"
> +"	ret	x9\n"
> +"	.size		my_tramp, .-my_tramp\n"
> +"	.popsection\n"
> +);
> +
> +#endif /* CONFIG_ARM64 */
> +
>  static struct ftrace_ops direct;
>  
>  static int __init ftrace_direct_init(void)
> diff --git a/samples/ftrace/ftrace-direct.c b/samples/ftrace/ftrace-direct.c
> index d81a9473b585..e5312f9c15d3 100644
> --- a/samples/ftrace/ftrace-direct.c
> +++ b/samples/ftrace/ftrace-direct.c
> @@ -3,7 +3,9 @@
>  
>  #include <linux/sched.h> /* for wake_up_process() */
>  #include <linux/ftrace.h>
> +#ifndef CONFIG_ARM64
>  #include <asm/asm-offsets.h>
> +#endif
>  
>  extern void my_direct_func(struct task_struct *p);
>  
> @@ -63,6 +65,28 @@ asm (
>  
>  #endif /* CONFIG_S390 */
>  
> +#ifdef CONFIG_ARM64
> +
> +asm (
> +"	.pushsection	.text, \"ax\", @progbits\n"
> +"	.type		my_tramp, @function\n"
> +"	.globl		my_tramp\n"
> +"   my_tramp:"
> +"	bti	c\n"
> +"	sub	sp, sp, #32\n"
> +"	stp	x9, x30, [sp]\n"
> +"	str	x0, [sp, #16]\n"
> +"	bl	my_direct_func\n"
> +"	ldp	x30, x9, [sp]\n"
> +"	ldr	x0, [sp, #16]\n"
> +"	add	sp, sp, #32\n"
> +"	ret	x9\n"
> +"	.size		my_tramp, .-my_tramp\n"
> +"	.popsection\n"
> +);
> +
> +#endif /* CONFIG_ARM64 */
> +
>  static struct ftrace_ops direct;
>  
>  static int __init ftrace_direct_init(void)
> -- 
> 2.40.0.577.gac1e443424-goog
> 
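The stack adjustments used by the arm64 trampolines quoted above (#16, #32 and #48) are consistent with saving x9 and x30 plus the argument registers each sample needs, rounded up so that sp stays 16-byte aligned. The following is a minimal sketch of that arithmetic; tramp_frame_size() is a hypothetical helper written for this note, not something the patch defines.

```c
#include <assert.h>
#include <stddef.h>

#define REG_BYTES	8	/* each AArch64 general-purpose register is 64 bits */
#define SP_ALIGN	16	/* sp is expected to stay 16-byte aligned */

/* Hypothetical helper: bytes reserved for x9, x30 and nargs argument registers. */
static size_t tramp_frame_size(size_t nargs)
{
	size_t raw = (2 + nargs) * REG_BYTES;

	/* Round up to the next multiple of SP_ALIGN. */
	return (raw + SP_ALIGN - 1) & ~(size_t)(SP_ALIGN - 1);
}

/* Usage: compare against the immediates in the quoted trampolines. */
static void tramp_frame_selfcheck(void)
{
	assert(tramp_frame_size(0) == 16);	/* ftrace-direct-modify */
	assert(tramp_frame_size(1) == 32);	/* ftrace-direct, -multi, -multi-modify */
	assert(tramp_frame_size(4) == 48);	/* ftrace-direct-too */
}
```

With a single saved argument the raw size is 24 bytes, which is why those trampolines reserve 32 bytes and leave the top 8 unused.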


* Re: [PATCH v6 0/5] Add ftrace direct call for arm64
  2023-04-05 18:02 [PATCH v6 0/5] Add ftrace direct call for arm64 Florent Revest
                   ` (4 preceding siblings ...)
  2023-04-05 18:02 ` [PATCH v6 5/5] selftests/bpf: Update the tests deny list on aarch64 Florent Revest
@ 2023-04-11 15:56 ` Mark Rutland
  2023-04-11 16:47   ` Steven Rostedt
  2023-04-11 18:37 ` Will Deacon
  6 siblings, 1 reply; 17+ messages in thread
From: Mark Rutland @ 2023-04-11 15:56 UTC (permalink / raw)
  To: Florent Revest, catalin.marinas, will, rostedt
  Cc: linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	mhiramat, ast, daniel, andrii, kpsingh, jolsa, xukuohai,
	lihuafei1

On Wed, Apr 05, 2023 at 08:02:45PM +0200, Florent Revest wrote:
> This series adds ftrace direct call support to arm64.
> This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.
> 
> It is meant to be taken by the arm64 tree but it depends on the
> trace-direct-v6.3-rc3 tag of the linux-trace tree:
>   git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
> That tag was created by Steven Rostedt so the arm64 tree can pull the prior work
> this depends on. [1]

Catalin, Will, are you happy to pick this via the arm64 tree, or for it to go
via the trace tree?

We'd been assuming the former, but it looks like there'll be a (simple) merge
conflict with the series adding FUNCTION_GRAPH_RETVAL:

  https://lore.kernel.org/lkml/cover.1680954589.git.pengdonglin@sangfor.com.cn/

... as both series add some definitions to arm64's asm-offsets.c in the same
place, and all those additions need to be kept. Other than that, the two series
are independent.

IIUC Steve was hoping to take the FUNCTION_GRAPH_RETVAL series through the
trace tree, and if that's still the plan, maybe both should go that way?

Mark.

> Thanks to the ftrace refactoring under that tag, an ftrace_ops backing a ftrace
> direct call will only ever point to *one* direct call. This means we can look up
> the direct called trampoline address stored in the ops from the ftrace_caller
> trampoline in the case when the destination would be out of reach of a BL
> instruction at the ftrace callsite. This fixes limitations of previous attempts
> such as [2].
> 
> This series has been tested on arm64 with:
> 1- CONFIG_FTRACE_SELFTEST
> 2- samples/ftrace/*.ko (cf: patch 4)
> 3- tools/testing/selftests/bpf/test_progs (cf: patch 5)
> 
> Changes since v5 [3]:
> - Fixed saving the fourth argument of handle_mm_fault in both the x86 (patch 3)
>   and arm64 (as part of patch 4) "ftrace-direct-too" sample trampolines
> - Fixed the address of the traced function logged by some direct call samples
>   (ftrace-direct-multi and ftrace-direct-multi-modify) by moving lr into x0
> 
> 1: https://lore.kernel.org/all/ZB2Nl7fzpHoq5V20@FVFF77S0Q05N/
> 2: https://lore.kernel.org/all/20220913162732.163631-1-xukuohai@huaweicloud.com/
> 3: https://lore.kernel.org/bpf/20230403113552.2857693-1-revest@chromium.org/
> 
> Florent Revest (5):
>   arm64: ftrace: Add direct call support
>   arm64: ftrace: Simplify get_ftrace_plt
>   samples: ftrace: Save required argument registers in sample
>     trampolines
>   arm64: ftrace: Add direct call trampoline samples support
>   selftests/bpf: Update the tests deny list on aarch64
> 
>  arch/arm64/Kconfig                           |  6 ++
>  arch/arm64/include/asm/ftrace.h              | 22 +++++
>  arch/arm64/kernel/asm-offsets.c              |  6 ++
>  arch/arm64/kernel/entry-ftrace.S             | 90 ++++++++++++++++----
>  arch/arm64/kernel/ftrace.c                   | 46 +++++++---
>  samples/ftrace/ftrace-direct-modify.c        | 34 ++++++++
>  samples/ftrace/ftrace-direct-multi-modify.c  | 40 +++++++++
>  samples/ftrace/ftrace-direct-multi.c         | 24 ++++++
>  samples/ftrace/ftrace-direct-too.c           | 40 +++++++--
>  samples/ftrace/ftrace-direct.c               | 24 ++++++
>  tools/testing/selftests/bpf/DENYLIST.aarch64 | 82 ++----------------
>  11 files changed, 306 insertions(+), 108 deletions(-)
> 
> -- 
> 2.40.0.577.gac1e443424-goog
> 


* Re: [PATCH v6 0/5] Add ftrace direct call for arm64
  2023-04-11 15:56 ` [PATCH v6 0/5] Add ftrace direct call for arm64 Mark Rutland
@ 2023-04-11 16:47   ` Steven Rostedt
  2023-04-11 17:08     ` Will Deacon
  0 siblings, 1 reply; 17+ messages in thread
From: Steven Rostedt @ 2023-04-11 16:47 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Florent Revest, catalin.marinas, will, linux-arm-kernel,
	linux-kernel, linux-trace-kernel, bpf, mhiramat, ast, daniel,
	andrii, kpsingh, jolsa, xukuohai, lihuafei1, Linus Torvalds

On Tue, 11 Apr 2023 16:56:45 +0100
Mark Rutland <mark.rutland@arm.com> wrote:

> IIUC Steve was hoping to take the FUNCTION_GRAPH_RETVAL series through the
> trace tree, and if that's still the plan, maybe both should go that way?

The conflict is minor, and I think I prefer to still have the ARM64 bits go
through the arm64 tree, as it will get better testing, and I don't like to
merge branches ;-)

I've added Linus to the Cc so he knows that there will be conflicts, but as
long as we mention it in our pull request, with a branch that includes the
solution, it should be fine going through two different trees.

-- Steve


* Re: [PATCH v6 0/5] Add ftrace direct call for arm64
  2023-04-11 16:47   ` Steven Rostedt
@ 2023-04-11 17:08     ` Will Deacon
  2023-04-11 17:44       ` Steven Rostedt
  0 siblings, 1 reply; 17+ messages in thread
From: Will Deacon @ 2023-04-11 17:08 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mark Rutland, Florent Revest, catalin.marinas, linux-arm-kernel,
	linux-kernel, linux-trace-kernel, bpf, mhiramat, ast, daniel,
	andrii, kpsingh, jolsa, xukuohai, lihuafei1, Linus Torvalds

On Tue, Apr 11, 2023 at 12:47:49PM -0400, Steven Rostedt wrote:
> On Tue, 11 Apr 2023 16:56:45 +0100
> Mark Rutland <mark.rutland@arm.com> wrote:
> 
> > IIUC Steve was hoping to take the FUNCTION_GRAPH_RETVAL series through the
> > trace tree, and if that's still the plan, maybe both should go that way?
> 
> The conflict is minor, and I think I prefer to still have the ARM64 bits go
> through the arm64 tree, as it will get better testing, and I don't like to
> merge branches ;-)
> 
> I've added Linus to the Cc so he knows that there will be conflicts, but as
> long as we mention it in our pull request, with a branch that includes the
> solution, it should be fine going through two different trees.

If it's just the simple asm-offsets conflict that Mark mentioned, then that
sounds fine to me. However, patches 3-5 don't seem to have anything to do
with arm64 at all and I'd prefer those to go via other trees (esp. as patch
3 is an independent -stable candidate and the last one is a bpf selftest
change which conflicts in -next).

So I'll queue the first two in arm64 on a branch (for-next/ftrace) based
on trace-direct-v6.3-rc3.

Will


* Re: [PATCH v6 0/5] Add ftrace direct call for arm64
  2023-04-11 17:08     ` Will Deacon
@ 2023-04-11 17:44       ` Steven Rostedt
  2023-04-11 17:54         ` Will Deacon
  0 siblings, 1 reply; 17+ messages in thread
From: Steven Rostedt @ 2023-04-11 17:44 UTC (permalink / raw)
  To: Will Deacon
  Cc: Mark Rutland, Florent Revest, catalin.marinas, linux-arm-kernel,
	linux-kernel, linux-trace-kernel, bpf, mhiramat, ast, daniel,
	andrii, kpsingh, jolsa, xukuohai, lihuafei1, Linus Torvalds

On Tue, 11 Apr 2023 18:08:08 +0100
Will Deacon <will@kernel.org> wrote:

> On Tue, Apr 11, 2023 at 12:47:49PM -0400, Steven Rostedt wrote:
> > On Tue, 11 Apr 2023 16:56:45 +0100
> > Mark Rutland <mark.rutland@arm.com> wrote:
> >   
> > > IIUC Steve was hoping to take the FUNCTION_GRAPH_RETVAL series through the
> > > trace tree, and if that's still the plan, maybe both should go that way?  
> > 
> > The conflict is minor, and I think I prefer to still have the ARM64 bits go
> > through the arm64 tree, as it will get better testing, and I don't like to
> > merge branches ;-)
> > 
> > I've added Linus to the Cc so he knows that there will be conflicts, but as
> > long as we mention it in our pull request, with a branch that includes the
> > solution, it should be fine going through two different trees.  
> 
> If it's just the simple asm-offsets conflict that Mark mentioned, then that
> sounds fine to me. However, patches 3-5 don't seem to have anything to do

I guess 3 and 5 are not, but patch 4 adds arm64 code to the samples (as
it requires arch specific asm to handle the direct trampolines).

> with arm64 at all and I'd prefer those to go via other trees (esp. as patch
> 3 is an independent -stable candidate and the last one is a bpf selftest
> change which conflicts in -next).
> 
> So I'll queue the first two in arm64 on a branch (for-next/ftrace) based
> on trace-direct-v6.3-rc3.

Are 3-5 dependent on those changes? If not, I can pull them into my tree.

-- Steve



* Re: [PATCH v6 0/5] Add ftrace direct call for arm64
  2023-04-11 17:44       ` Steven Rostedt
@ 2023-04-11 17:54         ` Will Deacon
  2023-04-12  9:50           ` Mark Rutland
  0 siblings, 1 reply; 17+ messages in thread
From: Will Deacon @ 2023-04-11 17:54 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Mark Rutland, Florent Revest, catalin.marinas, linux-arm-kernel,
	linux-kernel, linux-trace-kernel, bpf, mhiramat, ast, daniel,
	andrii, kpsingh, jolsa, xukuohai, lihuafei1, Linus Torvalds

On Tue, Apr 11, 2023 at 01:44:56PM -0400, Steven Rostedt wrote:
> On Tue, 11 Apr 2023 18:08:08 +0100
> Will Deacon <will@kernel.org> wrote:
> 
> > On Tue, Apr 11, 2023 at 12:47:49PM -0400, Steven Rostedt wrote:
> > > On Tue, 11 Apr 2023 16:56:45 +0100
> > > Mark Rutland <mark.rutland@arm.com> wrote:
> > >   
> > > > IIUC Steve was hoping to take the FUNCTION_GRAPH_RETVAL series through the
> > > > trace tree, and if that's still the plan, maybe both should go that way?  
> > > 
> > > The conflict is minor, and I think I prefer to still have the ARM64 bits go
> > > through the arm64 tree, as it will get better testing, and I don't like to
> > > merge branches ;-)
> > > 
> > > I've added Linus to the Cc so he knows that there will be conflicts, but as
> > > long as we mention it in our pull request, with a branch that includes the
> > > solution, it should be fine going through two different trees.  
> > 
> > If it's just the simple asm-offsets conflict that Mark mentioned, then that
> > sounds fine to me. However, patches 3-5 don't seem to have anything to do
> 
> I guess 3 and 5 are not, but patch 4 adds arm64 code to the samples (as
> it requires arch specific asm to handle the direct trampolines).

Sorry, yes, I was thinking of arch/arm64/ and then failed spectacularly
at communicating :)

> > with arm64 at all and I'd prefer those to go via other trees (esp. as patch
> > 3 is an independent -stable candidate and the last one is a bpf selftest
> > change which conflicts in -next).
> > 
> > So I'll queue the first two in arm64 on a branch (for-next/ftrace) based
> > on trace-direct-v6.3-rc3.
> 
> Are 3-5 dependent on those changes? If not, I can pull them into my tree.

Good question. Florent?

Will


* Re: [PATCH v6 0/5] Add ftrace direct call for arm64
  2023-04-05 18:02 [PATCH v6 0/5] Add ftrace direct call for arm64 Florent Revest
                   ` (5 preceding siblings ...)
  2023-04-11 15:56 ` [PATCH v6 0/5] Add ftrace direct call for arm64 Mark Rutland
@ 2023-04-11 18:37 ` Will Deacon
  6 siblings, 0 replies; 17+ messages in thread
From: Will Deacon @ 2023-04-11 18:37 UTC (permalink / raw)
  To: linux-kernel, linux-arm-kernel, linux-trace-kernel, Florent Revest, bpf
  Cc: catalin.marinas, kernel-team, Will Deacon, mark.rutland, daniel,
	andrii, xukuohai, jolsa, rostedt, mhiramat, ast, lihuafei1,
	kpsingh

On Wed, 5 Apr 2023 20:02:45 +0200, Florent Revest wrote:
> This series adds ftrace direct call support to arm64.
> This makes BPF tracing programs (fentry/fexit/fmod_ret/lsm) work on arm64.
> 
> It is meant to be taken by the arm64 tree but it depends on the
> trace-direct-v6.3-rc3 tag of the linux-trace tree:
>   git://git.kernel.org/pub/scm/linux/kernel/git/trace/linux-trace.git
> That tag was created by Steven Rostedt so the arm64 tree can pull the prior work
> this depends on. [1]
> 
> [...]

Applied first two to arm64 (for-next/ftrace), thanks!

[1/5] arm64: ftrace: Add direct call support
      https://git.kernel.org/arm64/c/2aa6ac03516d
[2/5] arm64: ftrace: Simplify get_ftrace_plt
      https://git.kernel.org/arm64/c/0f59dca63bf2

Cheers,
-- 
Will

https://fixes.arm64.dev
https://next.arm64.dev
https://will.arm64.dev


* Re: [PATCH v6 0/5] Add ftrace direct call for arm64
  2023-04-11 17:54         ` Will Deacon
@ 2023-04-12  9:50           ` Mark Rutland
  2023-04-24 20:09             ` Steven Rostedt
  0 siblings, 1 reply; 17+ messages in thread
From: Mark Rutland @ 2023-04-12  9:50 UTC (permalink / raw)
  To: Will Deacon
  Cc: Steven Rostedt, Florent Revest, catalin.marinas,
	linux-arm-kernel, linux-kernel, linux-trace-kernel, bpf,
	mhiramat, ast, daniel, andrii, kpsingh, jolsa, xukuohai,
	lihuafei1, Linus Torvalds

On Tue, Apr 11, 2023 at 06:54:24PM +0100, Will Deacon wrote:
> On Tue, Apr 11, 2023 at 01:44:56PM -0400, Steven Rostedt wrote:
> > On Tue, 11 Apr 2023 18:08:08 +0100
> > Will Deacon <will@kernel.org> wrote:
> > 
> > > On Tue, Apr 11, 2023 at 12:47:49PM -0400, Steven Rostedt wrote:
> > > > On Tue, 11 Apr 2023 16:56:45 +0100
> > > > Mark Rutland <mark.rutland@arm.com> wrote:
> > > >   
> > > > > IIUC Steve was hoping to take the FUNCTION_GRAPH_RETVAL series through the
> > > > > trace tree, and if that's still the plan, maybe both should go that way?  
> > > > 
> > > > The conflict is minor, and I think I prefer to still have the ARM64 bits go
> > > > through the arm64 tree, as it will get better testing, and I don't like to
> > > > merge branches ;-)
> > > > 
> > > > I've added Linus to the Cc so he knows that there will be conflicts, but as
> > > > long as we mention it in our pull request, with a branch that includes the
> > > > solution, it should be fine going through two different trees.  
> > > 
> > > If it's just the simple asm-offsets conflict that Mark mentioned, then that
> > > sounds fine to me. However, patches 3-5 don't seem to have anything to do
> > 
> > I guess 3 and 5 are not, but patch 4 adds arm64 code to the samples (as
> > it requires arch specific asm to handle the direct trampolines).
> 
> Sorry, yes, I was thinking of arch/arm64/ and then failed spectacularly
> at communicating :)
> 
> > > with arm64 at all and I'd prefer those to go via other trees (esp. as patch
> > > 3 is an independent -stable candidate and the last one is a bpf selftest
> > > change which conflicts in -next).
> > > 
> > > So I'll queue the first two in arm64 on a branch (for-next/ftrace) based
> > > on trace-direct-v6.3-rc3.
> > 
> > Are 3-5 dependent on those changes? If not, I can pull them into my tree.
> 
> Good question. Florent?

Patch 3 (the fix to the ftrace test) does not depend upon patches 1 and 2. It
probably would've been better to queue that as a preparatory fix before the
other changes.

Patch 4 (adding arm64 support to the samples) depends on patch 3. The arm64
parts depend upon patch 1 to be selectable, and without patch 1 the samples
will behave the same as before. It could be queued independently of patch 1,
but won't have any effect until merged with patch 1.

Patch 5 (the bpf selftest list changes) depends on patch 1 alone.

Perhaps we could queue 1 and 2 via the arm64 tree, 3 and 4 via the ftrace tree,
and follow up with patch 5 via the bpf tree after -rc1?

Thanks,
Mark.

* Re: [PATCH v6 0/5] Add ftrace direct call for arm64
  2023-04-12  9:50           ` Mark Rutland
@ 2023-04-24 20:09             ` Steven Rostedt
  0 siblings, 0 replies; 17+ messages in thread
From: Steven Rostedt @ 2023-04-24 20:09 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Will Deacon, Florent Revest, catalin.marinas, linux-arm-kernel,
	linux-kernel, linux-trace-kernel, bpf, mhiramat, ast, daniel,
	andrii, kpsingh, jolsa, xukuohai, lihuafei1, Linus Torvalds

On Wed, 12 Apr 2023 10:50:21 +0100
Mark Rutland <mark.rutland@arm.com> wrote:

> Perhaps we could queue 1 and 2 via the arm64 tree, 3 and 4 via the ftrace tree,
> and follow up with patch 5 via the bpf tree after -rc1?

Any patches that you want through the ftrace tree, please send as a
separate queue to the linux-trace-kernel mailing list (and lkml) if you
haven't done that already. I'm still a thousand emails behind, and
walking through them while at the airport lounge.

-- Steve

end of thread, other threads:[~2023-04-24 20:09 UTC | newest]

Thread overview: 17+ messages
2023-04-05 18:02 [PATCH v6 0/5] Add ftrace direct call for arm64 Florent Revest
2023-04-05 18:02 ` [PATCH v6 1/5] arm64: ftrace: Add direct call support Florent Revest
2023-04-05 18:02 ` [PATCH v6 2/5] arm64: ftrace: Simplify get_ftrace_plt Florent Revest
2023-04-05 18:02 ` [PATCH v6 3/5] samples: ftrace: Save required argument registers in sample trampolines Florent Revest
2023-04-05 20:40   ` Steven Rostedt
2023-04-06 10:22   ` Mark Rutland
2023-04-05 18:02 ` [PATCH v6 4/5] arm64: ftrace: Add direct call trampoline samples support Florent Revest
2023-04-06 10:50   ` Mark Rutland
2023-04-05 18:02 ` [PATCH v6 5/5] selftests/bpf: Update the tests deny list on aarch64 Florent Revest
2023-04-11 15:56 ` [PATCH v6 0/5] Add ftrace direct call for arm64 Mark Rutland
2023-04-11 16:47   ` Steven Rostedt
2023-04-11 17:08     ` Will Deacon
2023-04-11 17:44       ` Steven Rostedt
2023-04-11 17:54         ` Will Deacon
2023-04-12  9:50           ` Mark Rutland
2023-04-24 20:09             ` Steven Rostedt
2023-04-11 18:37 ` Will Deacon
