linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf v2 0/4] kprobes: rethook,ARM,arm64: Replace kretprobe trampoline with rethook
@ 2022-04-08  0:50 Masami Hiramatsu
  2022-04-08  0:50 ` [PATCH bpf v2 1/4] ARM: unwind: Initialize the lr_addr field of unwind_ctrl_block Masami Hiramatsu
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Masami Hiramatsu @ 2022-04-08  0:50 UTC (permalink / raw)
  To: Alexei Starovoitov, Alexei Starovoitov
  Cc: Daniel Borkmann, Shubham Bansal, Andrii Nakryiko,
	Masami Hiramatsu, bpf, kernel-team, Jiri Olsa, Steven Rostedt,
	Naveen N . Rao, Anil S Keshavamurthy, David S . Miller,
	linux-kernel, Mark Rutland, Will Deacon, Ard Biesheuvel,
	Russell King, Catalin Marinas, linux-arm-kernel

Hi,

Here is the 2nd version of the series for replacing kretprobe trampoline
with rethook on ARM/arm64. I fixed some compiler warnings in this version.
The previous version is here[1];

[1] https://lore.kernel.org/all/164915121498.982637.12787715964983738566.stgit@devnote2/T/#u

This series includes a trivial bugfix for the arm unwinder to initialize
an internal data structure([1/4]). This is not critical for stack trace,
but required for rethook to find the LR register from the stack.
This also have an update for the rethook interface, which allows us to
check the rethook_hook() failure ([2/4]). This is also required for the
rethook on arm because unwinder is able to fail.
The rest of patches are replacing kretprobe trampoline with rethook on
ARM ([3/4]) and arm64 ([4/4]).

Background:

This rethook came from Jiri's request of multiple kprobe for bpf[2].
He tried to solve an issue that starting bpf with multiple kprobe will
take a long time because bpf-kprobe will wait for RCU grace period for
sync rcu events.

Jiri wanted to attach a single bpf handler to multiple kprobes and
he tried to introduce multiple-probe interface to kprobe. So I asked
him to use ftrace and kretprobe-like hook if it is only for the
function entry and exit, instead of adding ad-hoc interface
to kprobes.
For this purpose, I introduced the fprobe (kprobe like interface for
ftrace) with the rethook (this is a generic return hook feature for
fprobe exit handler)[3].

[2] https://lore.kernel.org/all/20220104080943.113249-1-jolsa@kernel.org/T/#u
[3] https://lore.kernel.org/all/164191321766.806991.7930388561276940676.stgit@devnote2/T/#u

The rethook is basically same as the kretprobe trampoline. I just made
it decoupled from kprobes. Eventually, the all arch dependent kretprobe
trampolines will be replaced with the rethook trampoline instead of
cloning the code.

When I port the rethook for all arch which supports kretprobe, the
legacy kretprobe specific code (which is for CONFIG_KRETPROBE_ON_RETHOOK=n)
will be removed eventually.

BTW, the arm Clang support for rethook is for kretprobes only. fprobe
and ftrace seems not working with Clang yet.

Thank you,

---

Masami Hiramatsu (4):
      ARM: unwind: Initialize the lr_addr field of unwind_ctrl_block
      rethook,fprobe,kprobes: Check a failure in the rethook_hook()
      ARM: rethook: Replace kretprobe trampoline with rethook
      arm64: rethook: Replace kretprobe trampoline with rethook


 arch/arm/Kconfig                              |    1 
 arch/arm/include/asm/stacktrace.h             |    5 +
 arch/arm/kernel/stacktrace.c                  |   13 +--
 arch/arm/kernel/unwind.c                      |    1 
 arch/arm/probes/Makefile                      |    1 
 arch/arm/probes/kprobes/core.c                |   62 ------------
 arch/arm/probes/rethook.c                     |  127 +++++++++++++++++++++++++
 arch/arm64/Kconfig                            |    1 
 arch/arm64/include/asm/kprobes.h              |    2 
 arch/arm64/include/asm/stacktrace.h           |    2 
 arch/arm64/kernel/Makefile                    |    1 
 arch/arm64/kernel/probes/Makefile             |    1 
 arch/arm64/kernel/probes/kprobes.c            |   15 ---
 arch/arm64/kernel/probes/kprobes_trampoline.S |   86 -----------------
 arch/arm64/kernel/rethook.c                   |   28 ++++++
 arch/arm64/kernel/rethook_trampoline.S        |   87 +++++++++++++++++
 arch/arm64/kernel/stacktrace.c                |    9 +-
 arch/x86/kernel/rethook.c                     |    4 +
 include/linux/rethook.h                       |    4 -
 kernel/kprobes.c                              |    8 +-
 kernel/trace/fprobe.c                         |    5 +
 kernel/trace/rethook.c                        |   12 ++
 22 files changed, 287 insertions(+), 188 deletions(-)
 create mode 100644 arch/arm/probes/rethook.c
 delete mode 100644 arch/arm64/kernel/probes/kprobes_trampoline.S
 create mode 100644 arch/arm64/kernel/rethook.c
 create mode 100644 arch/arm64/kernel/rethook_trampoline.S

--
Masami Hiramatsu (Linaro) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH bpf v2 1/4] ARM: unwind: Initialize the lr_addr field of unwind_ctrl_block
  2022-04-08  0:50 [PATCH bpf v2 0/4] kprobes: rethook,ARM,arm64: Replace kretprobe trampoline with rethook Masami Hiramatsu
@ 2022-04-08  0:50 ` Masami Hiramatsu
  2022-04-08  0:51 ` [PATCH bpf v2 2/4] rethook,fprobe,kprobes: Check a failure in the rethook_hook() Masami Hiramatsu
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Masami Hiramatsu @ 2022-04-08  0:50 UTC (permalink / raw)
  To: Alexei Starovoitov, Alexei Starovoitov
  Cc: Daniel Borkmann, Shubham Bansal, Andrii Nakryiko,
	Masami Hiramatsu, bpf, kernel-team, Jiri Olsa, Steven Rostedt,
	Naveen N . Rao, Anil S Keshavamurthy, David S . Miller,
	linux-kernel, Mark Rutland, Will Deacon, Ard Biesheuvel,
	Russell King, Catalin Marinas, linux-arm-kernel

Since the unwind_ctrl_block::lr_addr is finally passed to
stackframe::lr_addr, that value will be exposed unconditionally.
Thus it should be initialized.

Without this fix, when unwind_frame() doesn't update the
unwind_ctrl_block::lr_addr (e.g. 'lr' register is not saved in the
target function), stackframe::lr_addr will contain a wrong value.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
  Changes in v2:
   - Initialize pointer by NULL instead of 0.
---
 arch/arm/kernel/unwind.c |    1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/kernel/unwind.c b/arch/arm/kernel/unwind.c
index a37ea6c772cd..c9f719e1b350 100644
--- a/arch/arm/kernel/unwind.c
+++ b/arch/arm/kernel/unwind.c
@@ -404,6 +404,7 @@ int unwind_frame(struct stackframe *frame)
 	ctrl.vrs[SP] = frame->sp;
 	ctrl.vrs[LR] = frame->lr;
 	ctrl.vrs[PC] = 0;
+	ctrl.lr_addr = NULL;
 
 	if (idx->insn == 1)
 		/* can't unwind */


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH bpf v2 2/4] rethook,fprobe,kprobes: Check a failure in the rethook_hook()
  2022-04-08  0:50 [PATCH bpf v2 0/4] kprobes: rethook,ARM,arm64: Replace kretprobe trampoline with rethook Masami Hiramatsu
  2022-04-08  0:50 ` [PATCH bpf v2 1/4] ARM: unwind: Initialize the lr_addr field of unwind_ctrl_block Masami Hiramatsu
@ 2022-04-08  0:51 ` Masami Hiramatsu
  2022-04-08  0:51 ` [PATCH bpf v2 3/4] ARM: rethook: Replace kretprobe trampoline with rethook Masami Hiramatsu
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Masami Hiramatsu @ 2022-04-08  0:51 UTC (permalink / raw)
  To: Alexei Starovoitov, Alexei Starovoitov
  Cc: Daniel Borkmann, Shubham Bansal, Andrii Nakryiko,
	Masami Hiramatsu, bpf, kernel-team, Jiri Olsa, Steven Rostedt,
	Naveen N . Rao, Anil S Keshavamurthy, David S . Miller,
	linux-kernel, Mark Rutland, Will Deacon, Ard Biesheuvel,
	Russell King, Catalin Marinas, linux-arm-kernel

Since there are possible to fail to hook the function return (depends on
archtecutre implememtation), rethook_hook() should return the error
in that case and caller must check it.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 arch/x86/kernel/rethook.c |    4 +++-
 include/linux/rethook.h   |    4 ++--
 kernel/kprobes.c          |    8 +++++---
 kernel/trace/fprobe.c     |    5 ++++-
 kernel/trace/rethook.c    |   12 ++++++++++--
 5 files changed, 24 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/rethook.c b/arch/x86/kernel/rethook.c
index 8a1c0111ae79..c92b4875e3b9 100644
--- a/arch/x86/kernel/rethook.c
+++ b/arch/x86/kernel/rethook.c
@@ -114,7 +114,7 @@ void arch_rethook_fixup_return(struct pt_regs *regs,
 }
 NOKPROBE_SYMBOL(arch_rethook_fixup_return);
 
-void arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mcount)
+int arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mcount)
 {
 	unsigned long *stack = (unsigned long *)regs->sp;
 
@@ -123,5 +123,7 @@ void arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mc
 
 	/* Replace the return addr with trampoline addr */
 	stack[0] = (unsigned long) arch_rethook_trampoline;
+
+	return 0;
 }
 NOKPROBE_SYMBOL(arch_rethook_prepare);
diff --git a/include/linux/rethook.h b/include/linux/rethook.h
index c8ac1e5afcd1..07b9c6663b8e 100644
--- a/include/linux/rethook.h
+++ b/include/linux/rethook.h
@@ -63,12 +63,12 @@ void rethook_free(struct rethook *rh);
 void rethook_add_node(struct rethook *rh, struct rethook_node *node);
 struct rethook_node *rethook_try_get(struct rethook *rh);
 void rethook_recycle(struct rethook_node *node);
-void rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount);
+int rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount);
 unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame,
 				    struct llist_node **cur);
 
 /* Arch dependent code must implement arch_* and trampoline code */
-void arch_rethook_prepare(struct rethook_node *node, struct pt_regs *regs, bool mcount);
+int arch_rethook_prepare(struct rethook_node *node, struct pt_regs *regs, bool mcount);
 void arch_rethook_trampoline(void);
 
 /**
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index dbe57df2e199..7fd7f1195bde 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2109,10 +2109,12 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
 
 	ri = container_of(rhn, struct kretprobe_instance, node);
 
-	if (rp->entry_handler && rp->entry_handler(ri, regs))
+	if (rp->entry_handler && rp->entry_handler(ri, regs)) {
 		rethook_recycle(rhn);
-	else
-		rethook_hook(rhn, regs, kprobe_ftrace(p));
+	} else if (rethook_hook(rhn, regs, kprobe_ftrace(p)) < 0) {
+		rethook_recycle(rhn);
+		rp->nmissed++;
+	}
 
 	return 0;
 }
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 89d9f994ebb0..d3b13294d545 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -48,7 +48,10 @@ static void fprobe_handler(unsigned long ip, unsigned long parent_ip,
 		}
 		fpr = container_of(rh, struct fprobe_rethook_node, node);
 		fpr->entry_ip = ip;
-		rethook_hook(rh, ftrace_get_regs(fregs), true);
+		if (rethook_hook(rh, ftrace_get_regs(fregs), true) < 0) {
+			rethook_recycle(rh);
+			fp->nmissed++;
+		}
 	}
 
 out:
diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
index b56833700d23..e7db83438e45 100644
--- a/kernel/trace/rethook.c
+++ b/kernel/trace/rethook.c
@@ -174,11 +174,19 @@ NOKPROBE_SYMBOL(rethook_try_get);
  * from ftrace (mcount) callback, @mcount must be set true. If this is called
  * from the real function entry (e.g. kprobes) @mcount must be set false.
  * This is because the way to hook the function return depends on the context.
+ * This returns 0 if succeeded to hook the function return, or -errno if
+ * failed.
  */
-void rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount)
+int rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount)
 {
-	arch_rethook_prepare(node, regs, mcount);
+	int ret;
+
+	ret = arch_rethook_prepare(node, regs, mcount);
+	if (ret < 0)
+		return ret;
+
 	__llist_add(&node->llist, &current->rethooks);
+	return 0;
 }
 NOKPROBE_SYMBOL(rethook_hook);
 


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH bpf v2 3/4] ARM: rethook: Replace kretprobe trampoline with rethook
  2022-04-08  0:50 [PATCH bpf v2 0/4] kprobes: rethook,ARM,arm64: Replace kretprobe trampoline with rethook Masami Hiramatsu
  2022-04-08  0:50 ` [PATCH bpf v2 1/4] ARM: unwind: Initialize the lr_addr field of unwind_ctrl_block Masami Hiramatsu
  2022-04-08  0:51 ` [PATCH bpf v2 2/4] rethook,fprobe,kprobes: Check a failure in the rethook_hook() Masami Hiramatsu
@ 2022-04-08  0:51 ` Masami Hiramatsu
  2022-04-08  0:51 ` [PATCH bpf v2 4/4] arm64: " Masami Hiramatsu
  2022-04-12 18:03 ` [PATCH bpf v2 0/4] kprobes: rethook,ARM,arm64: " Mark Rutland
  4 siblings, 0 replies; 9+ messages in thread
From: Masami Hiramatsu @ 2022-04-08  0:51 UTC (permalink / raw)
  To: Alexei Starovoitov, Alexei Starovoitov
  Cc: Daniel Borkmann, Shubham Bansal, Andrii Nakryiko,
	Masami Hiramatsu, bpf, kernel-team, Jiri Olsa, Steven Rostedt,
	Naveen N . Rao, Anil S Keshavamurthy, David S . Miller,
	linux-kernel, Mark Rutland, Will Deacon, Ard Biesheuvel,
	Russell King, Catalin Marinas, linux-arm-kernel

Replace the kretprob's trampoline code with the rethook on arm.
This also enables rethook support on arm. Most of the code has been
copied from kretprobe on arm.

The significant difference is that the rethook on mcount (ftrace)
support is added. If the rethook is called from the kprobes for
kretprobe, there is no problem to replace the LR register with
trampoline address because the LR register will be saved after
kprobe probed. However, the mcount call will be placed right after
making a stack frame for the function. This means we have to decode
the stackframe to find where the LR register is saved. With the
CONFIG_FRAME_POINTER, the frame pointer (FP register) is used.
Without that, rethook has to unwind one stack frame to find it.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 arch/arm/Kconfig                  |    1 
 arch/arm/include/asm/stacktrace.h |    5 +
 arch/arm/kernel/stacktrace.c      |   13 ++--
 arch/arm/probes/Makefile          |    1 
 arch/arm/probes/kprobes/core.c    |   62 ------------------
 arch/arm/probes/rethook.c         |  127 +++++++++++++++++++++++++++++++++++++
 6 files changed, 139 insertions(+), 70 deletions(-)
 create mode 100644 arch/arm/probes/rethook.c

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 2e8091e2d8a8..f7054ba596ba 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -110,6 +110,7 @@ config ARM
 	select HAVE_MOD_ARCH_SPECIFIC
 	select HAVE_NMI
 	select HAVE_OPTPROBES if !THUMB2_KERNEL
+	select HAVE_RETHOOK
 	select HAVE_PERF_EVENTS
 	select HAVE_PERF_REGS
 	select HAVE_PERF_USER_STACK_DUMP
diff --git a/arch/arm/include/asm/stacktrace.h b/arch/arm/include/asm/stacktrace.h
index 3e78f921b8b2..76a70b25863e 100644
--- a/arch/arm/include/asm/stacktrace.h
+++ b/arch/arm/include/asm/stacktrace.h
@@ -17,7 +17,8 @@ struct stackframe {
 
 	/* address of the LR value on the stack */
 	unsigned long *lr_addr;
-#ifdef CONFIG_KRETPROBES
+
+#if defined(CONFIG_RETHOOK)
 	struct llist_node *kr_cur;
 	struct task_struct *tsk;
 #endif
@@ -30,7 +31,7 @@ void arm_get_current_stackframe(struct pt_regs *regs, struct stackframe *frame)
 		frame->sp = regs->ARM_sp;
 		frame->lr = regs->ARM_lr;
 		frame->pc = regs->ARM_pc;
-#ifdef CONFIG_KRETPROBES
+#if defined(CONFIG_RETHOOK)
 		frame->kr_cur = NULL;
 		frame->tsk = current;
 #endif
diff --git a/arch/arm/kernel/stacktrace.c b/arch/arm/kernel/stacktrace.c
index b5efecb3d730..6df085ecdf41 100644
--- a/arch/arm/kernel/stacktrace.c
+++ b/arch/arm/kernel/stacktrace.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0-only
 #include <linux/export.h>
 #include <linux/kprobes.h>
+#include <linux/rethook.h>
 #include <linux/sched.h>
 #include <linux/sched/debug.h>
 #include <linux/stacktrace.h>
@@ -66,10 +67,10 @@ int notrace unwind_frame(struct stackframe *frame)
 	frame->sp = *(unsigned long *)(fp - 8);
 	frame->pc = *(unsigned long *)(fp - 4);
 #endif
-#ifdef CONFIG_KRETPROBES
-	if (is_kretprobe_trampoline(frame->pc))
-		frame->pc = kretprobe_find_ret_addr(frame->tsk,
-					(void *)frame->fp, &frame->kr_cur);
+#ifdef CONFIG_RETHOOK
+	if (is_rethook_trampoline(frame->pc))
+		frame->pc = rethook_find_ret_addr(frame->tsk, frame->fp,
+						  &frame->kr_cur);
 #endif
 
 	return 0;
@@ -163,7 +164,7 @@ static noinline void __save_stack_trace(struct task_struct *tsk,
 here:
 		frame.pc = (unsigned long)&&here;
 	}
-#ifdef CONFIG_KRETPROBES
+#ifdef CONFIG_RETHOOK
 	frame.kr_cur = NULL;
 	frame.tsk = tsk;
 #endif
@@ -184,7 +185,7 @@ void save_stack_trace_regs(struct pt_regs *regs, struct stack_trace *trace)
 	frame.sp = regs->ARM_sp;
 	frame.lr = regs->ARM_lr;
 	frame.pc = regs->ARM_pc;
-#ifdef CONFIG_KRETPROBES
+#ifdef CONFIG_RETHOOK
 	frame.kr_cur = NULL;
 	frame.tsk = current;
 #endif
diff --git a/arch/arm/probes/Makefile b/arch/arm/probes/Makefile
index 8b0ea5ace100..10c083a22223 100644
--- a/arch/arm/probes/Makefile
+++ b/arch/arm/probes/Makefile
@@ -6,3 +6,4 @@ obj-$(CONFIG_KPROBES)		+= decode-thumb.o
 else
 obj-$(CONFIG_KPROBES)		+= decode-arm.o
 endif
+obj-$(CONFIG_RETHOOK)		+= rethook.o
diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c
index 9090c3a74dcc..2f01f8267cc3 100644
--- a/arch/arm/probes/kprobes/core.c
+++ b/arch/arm/probes/kprobes/core.c
@@ -365,68 +365,6 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
 	return NOTIFY_DONE;
 }
 
-/*
- * When a retprobed function returns, trampoline_handler() is called,
- * calling the kretprobe's handler. We construct a struct pt_regs to
- * give a view of registers r0-r11, sp, lr, and pc to the user
- * return-handler. This is not a complete pt_regs structure, but that
- * should be enough for stacktrace from the return handler with or
- * without pt_regs.
- */
-void __naked __kprobes __kretprobe_trampoline(void)
-{
-	__asm__ __volatile__ (
-#ifdef CONFIG_FRAME_POINTER
-		"ldr	lr, =__kretprobe_trampoline	\n\t"
-	/* __kretprobe_trampoline makes a framepointer on pt_regs. */
-#ifdef CONFIG_CC_IS_CLANG
-		"stmdb	sp, {sp, lr, pc}	\n\t"
-		"sub	sp, sp, #12		\n\t"
-		/* In clang case, pt_regs->ip = lr. */
-		"stmdb	sp!, {r0 - r11, lr}	\n\t"
-		/* fp points regs->r11 (fp) */
-		"add	fp, sp,	#44		\n\t"
-#else /* !CONFIG_CC_IS_CLANG */
-		/* In gcc case, pt_regs->ip = fp. */
-		"stmdb	sp, {fp, sp, lr, pc}	\n\t"
-		"sub	sp, sp, #16		\n\t"
-		"stmdb	sp!, {r0 - r11}		\n\t"
-		/* fp points regs->r15 (pc) */
-		"add	fp, sp, #60		\n\t"
-#endif /* CONFIG_CC_IS_CLANG */
-#else /* !CONFIG_FRAME_POINTER */
-		"sub	sp, sp, #16		\n\t"
-		"stmdb	sp!, {r0 - r11}		\n\t"
-#endif /* CONFIG_FRAME_POINTER */
-		"mov	r0, sp			\n\t"
-		"bl	trampoline_handler	\n\t"
-		"mov	lr, r0			\n\t"
-		"ldmia	sp!, {r0 - r11}		\n\t"
-		"add	sp, sp, #16		\n\t"
-#ifdef CONFIG_THUMB2_KERNEL
-		"bx	lr			\n\t"
-#else
-		"mov	pc, lr			\n\t"
-#endif
-		: : : "memory");
-}
-
-/* Called from __kretprobe_trampoline */
-static __used __kprobes void *trampoline_handler(struct pt_regs *regs)
-{
-	return (void *)kretprobe_trampoline_handler(regs, (void *)regs->ARM_fp);
-}
-
-void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
-				      struct pt_regs *regs)
-{
-	ri->ret_addr = (kprobe_opcode_t *)regs->ARM_lr;
-	ri->fp = (void *)regs->ARM_fp;
-
-	/* Replace the return addr with trampoline addr. */
-	regs->ARM_lr = (unsigned long)&__kretprobe_trampoline;
-}
-
 int __kprobes arch_trampoline_kprobe(struct kprobe *p)
 {
 	return 0;
diff --git a/arch/arm/probes/rethook.c b/arch/arm/probes/rethook.c
new file mode 100644
index 000000000000..598a2b579b91
--- /dev/null
+++ b/arch/arm/probes/rethook.c
@@ -0,0 +1,127 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * arm implementation of rethook. Mostly copied from arch/arm/probes/kprobes/core.c
+ */
+
+#include <linux/kprobes.h>
+#include <linux/rethook.h>
+
+#include <asm/stacktrace.h>
+
+/* Called from arch_rethook_trampoline */
+static __used notrace unsigned long arch_rethook_trampoline_callback(struct pt_regs *regs)
+{
+	return rethook_trampoline_handler(regs, regs->ARM_fp);
+}
+NOKPROBE_SYMBOL(arch_rethook_trampoline_callback);
+
+/*
+ * When a rethook'ed function returns, it returns to arch_rethook_trampoline
+ * which calls rethook callback. We construct a struct pt_regs to
+ * give a view of registers r0-r11, sp, lr, and pc to the user
+ * return-handler. This is not a complete pt_regs structure, but that
+ * should be enough for stacktrace from the return handler with or
+ * without pt_regs.
+ */
+asm(
+	".text\n"
+	".global arch_rethook_trampoline\n"
+	".type arch_rethook_trampoline, %function\n"
+	"arch_rethook_trampoline:\n"
+#ifdef CONFIG_FRAME_POINTER
+	"adr	lr, .			\n\t"
+	/* this makes a framepointer on pt_regs. */
+#ifdef CONFIG_CC_IS_CLANG
+	"stmdb	sp, {sp, lr, pc}	\n\t"
+	"sub	sp, sp, #12		\n\t"
+	/* In clang case, pt_regs->ip = lr. */
+	"stmdb	sp!, {r0 - r11, lr}	\n\t"
+	/* fp points regs->r11 (fp) */
+	"add	fp, sp,	#44		\n\t"
+#else /* !CONFIG_CC_IS_CLANG */
+	/* In gcc case, pt_regs->ip = fp. */
+	"stmdb	sp, {fp, sp, lr, pc}	\n\t"
+	"sub	sp, sp, #16		\n\t"
+	"stmdb	sp!, {r0 - r11}		\n\t"
+	/* fp points regs->r15 (pc) */
+	"add	fp, sp, #60		\n\t"
+#endif /* CONFIG_CC_IS_CLANG */
+#else /* !CONFIG_FRAME_POINTER */
+	"sub	sp, sp, #16		\n\t"
+	"stmdb	sp!, {r0 - r11}		\n\t"
+#endif /* CONFIG_FRAME_POINTER */
+	"mov	r0, sp			\n\t"
+	"bl	arch_rethook_trampoline_callback	\n\t"
+	"mov	lr, r0			\n\t"
+	"ldmia	sp!, {r0 - r11}		\n\t"
+	"add	sp, sp, #16		\n\t"
+#ifdef CONFIG_THUMB2_KERNEL
+	"bx	lr			\n\t"
+#else
+	"mov	pc, lr			\n\t"
+#endif
+	".size arch_rethook_trampoline, .-arch_rethook_trampoline\n"
+);
+NOKPROBE_SYMBOL(arch_rethook_trampoline);
+
+/*
+ * At the entry of function with mcount, if the FRAME_POINTER is enabled,
+ * the stack and registers are prepared for the mcount function as below.
+ *
+ * mov     ip, sp
+ * push    {fp, ip, lr, pc}
+ * sub     fp, ip, #4	; FP[0] = PC, FP[-4] = LR, and FP[-12] = call-site FP.
+ * push    {lr}
+ * bl      <__gnu_mcount_nc> ; call ftrace
+ *
+ * And when returning from the function, call-site FP, SP and PC are restored
+ * from stack as below;
+ *
+ * ldm     sp, {fp, sp, pc}
+ *
+ * Thus, if the arch_rethook_prepare() is called from real function entry,
+ * it must change the LR and save FP in pt_regs. But if it is called via
+ * mcount context (ftrace), it must change the LR on stack, which is next
+ * to the PC (= FP[-4]), and save the FP value at FP[-12].
+ *
+ * If the FRAME_POINTER is disabled, we have to use arm unwinder to find where
+ * the LR is stored.
+ */
+int notrace arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mcount)
+{
+	unsigned long *lr_addr;
+	int ret;
+
+	if (mcount) {
+		/* Clang + mcount case is not supported yet. */
+		if (IS_ENABLED(CONFIG_CC_IS_CLANG))
+			return -EOPNOTSUPP;
+		if (IS_ENABLED(CONFIG_FRAME_POINTER)) {
+			lr_addr = (unsigned long *)(regs->ARM_fp - 4);
+			rh->frame = *(unsigned long *)(regs->ARM_fp - 12);
+		} else {
+			struct stackframe frame;
+
+			arm_get_current_stackframe(regs, &frame);
+			ret = unwind_frame(&frame);
+			if (ret < 0)
+				return -EINVAL;
+
+			if (frame.lr_addr)
+				lr_addr = frame.lr_addr;
+			else
+				lr_addr = &regs->ARM_lr;
+			rh->frame = regs->ARM_fp;
+		}
+	} else {
+		lr_addr = &regs->ARM_lr;
+		rh->frame = regs->ARM_fp;
+	}
+
+	/* Replace the return addr with trampoline addr. */
+	rh->ret_addr = *lr_addr;
+	*lr_addr = (unsigned long)arch_rethook_trampoline;
+
+	return 0;
+}
+NOKPROBE_SYMBOL(arch_rethook_prepare);


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH bpf v2 4/4] arm64: rethook: Replace kretprobe trampoline with rethook
  2022-04-08  0:50 [PATCH bpf v2 0/4] kprobes: rethook,ARM,arm64: Replace kretprobe trampoline with rethook Masami Hiramatsu
                   ` (2 preceding siblings ...)
  2022-04-08  0:51 ` [PATCH bpf v2 3/4] ARM: rethook: Replace kretprobe trampoline with rethook Masami Hiramatsu
@ 2022-04-08  0:51 ` Masami Hiramatsu
  2022-04-12 19:04   ` Mark Rutland
  2022-04-12 18:03 ` [PATCH bpf v2 0/4] kprobes: rethook,ARM,arm64: " Mark Rutland
  4 siblings, 1 reply; 9+ messages in thread
From: Masami Hiramatsu @ 2022-04-08  0:51 UTC (permalink / raw)
  To: Alexei Starovoitov, Alexei Starovoitov
  Cc: Daniel Borkmann, Shubham Bansal, Andrii Nakryiko,
	Masami Hiramatsu, bpf, kernel-team, Jiri Olsa, Steven Rostedt,
	Naveen N . Rao, Anil S Keshavamurthy, David S . Miller,
	linux-kernel, Mark Rutland, Will Deacon, Ard Biesheuvel,
	Russell King, Catalin Marinas, linux-arm-kernel

Replace the kretprobe's trampoline code with the rethook on arm64.
The rethook on arm64 is almost renamed from kretprobe trampoline
code. The mechanism is completely same.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 Changes in v2:
  - Fix a compile warning about no prototype.
---
 arch/arm64/Kconfig                            |    1 
 arch/arm64/include/asm/kprobes.h              |    2 -
 arch/arm64/include/asm/stacktrace.h           |    2 -
 arch/arm64/kernel/Makefile                    |    1 
 arch/arm64/kernel/probes/Makefile             |    1 
 arch/arm64/kernel/probes/kprobes.c            |   15 ----
 arch/arm64/kernel/probes/kprobes_trampoline.S |   86 -------------------------
 arch/arm64/kernel/rethook.c                   |   28 ++++++++
 arch/arm64/kernel/rethook_trampoline.S        |   87 +++++++++++++++++++++++++
 arch/arm64/kernel/stacktrace.c                |    9 +--
 10 files changed, 123 insertions(+), 109 deletions(-)
 delete mode 100644 arch/arm64/kernel/probes/kprobes_trampoline.S
 create mode 100644 arch/arm64/kernel/rethook.c
 create mode 100644 arch/arm64/kernel/rethook_trampoline.S

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 57c4c995965f..7d2945930283 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -204,6 +204,7 @@ config ARM64
 	select HAVE_SYSCALL_TRACEPOINTS
 	select HAVE_KPROBES
 	select HAVE_KRETPROBES
+	select HAVE_RETHOOK
 	select HAVE_GENERIC_VDSO
 	select IOMMU_DMA if IOMMU_SUPPORT
 	select IRQ_DOMAIN
diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
index 05cd82eeca13..4ac558058377 100644
--- a/arch/arm64/include/asm/kprobes.h
+++ b/arch/arm64/include/asm/kprobes.h
@@ -39,8 +39,6 @@ void arch_remove_kprobe(struct kprobe *);
 int kprobe_fault_handler(struct pt_regs *regs, unsigned int fsr);
 int kprobe_exceptions_notify(struct notifier_block *self,
 			     unsigned long val, void *data);
-void __kretprobe_trampoline(void);
-void __kprobes *trampoline_probe_handler(struct pt_regs *regs);
 
 #endif /* CONFIG_KPROBES */
 #endif /* _ARM_KPROBES_H */
diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
index e77cdef9ca29..f781874f1609 100644
--- a/arch/arm64/include/asm/stacktrace.h
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -58,7 +58,7 @@ struct stackframe {
 	DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
 	unsigned long prev_fp;
 	enum stack_type prev_type;
-#ifdef CONFIG_KRETPROBES
+#if defined(CONFIG_RETHOOK)
 	struct llist_node *kr_cur;
 #endif
 };
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 986837d7ec82..62e033b1b095 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -60,6 +60,7 @@ obj-$(CONFIG_ACPI_NUMA)			+= acpi_numa.o
 obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL)	+= acpi_parking_protocol.o
 obj-$(CONFIG_PARAVIRT)			+= paravirt.o
 obj-$(CONFIG_RANDOMIZE_BASE)		+= kaslr.o
+obj-$(CONFIG_RETHOOK)			+= rethook.o rethook_trampoline.o
 obj-$(CONFIG_HIBERNATION)		+= hibernate.o hibernate-asm.o
 obj-$(CONFIG_ELF_CORE)			+= elfcore.o
 obj-$(CONFIG_KEXEC_CORE)		+= machine_kexec.o relocate_kernel.o	\
diff --git a/arch/arm64/kernel/probes/Makefile b/arch/arm64/kernel/probes/Makefile
index 8e4be92e25b1..1fa58cda64ff 100644
--- a/arch/arm64/kernel/probes/Makefile
+++ b/arch/arm64/kernel/probes/Makefile
@@ -1,6 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
 obj-$(CONFIG_KPROBES)		+= kprobes.o decode-insn.o	\
-				   kprobes_trampoline.o		\
 				   simulate-insn.o
 obj-$(CONFIG_UPROBES)		+= uprobes.o decode-insn.o	\
 				   simulate-insn.o
diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
index d9dfa82c1f18..4a3cc266e77e 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -399,21 +399,6 @@ int __init arch_populate_kprobe_blacklist(void)
 	return ret;
 }
 
-void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
-{
-	return (void *)kretprobe_trampoline_handler(regs, (void *)regs->regs[29]);
-}
-
-void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
-				      struct pt_regs *regs)
-{
-	ri->ret_addr = (kprobe_opcode_t *)regs->regs[30];
-	ri->fp = (void *)regs->regs[29];
-
-	/* replace return addr (x30) with trampoline */
-	regs->regs[30] = (long)&__kretprobe_trampoline;
-}
-
 int __kprobes arch_trampoline_kprobe(struct kprobe *p)
 {
 	return 0;
diff --git a/arch/arm64/kernel/probes/kprobes_trampoline.S b/arch/arm64/kernel/probes/kprobes_trampoline.S
deleted file mode 100644
index 9a6499bed58b..000000000000
--- a/arch/arm64/kernel/probes/kprobes_trampoline.S
+++ /dev/null
@@ -1,86 +0,0 @@
-/* SPDX-License-Identifier: GPL-2.0 */
-/*
- * trampoline entry and return code for kretprobes.
- */
-
-#include <linux/linkage.h>
-#include <asm/asm-offsets.h>
-#include <asm/assembler.h>
-
-	.text
-
-	.macro	save_all_base_regs
-	stp x0, x1, [sp, #S_X0]
-	stp x2, x3, [sp, #S_X2]
-	stp x4, x5, [sp, #S_X4]
-	stp x6, x7, [sp, #S_X6]
-	stp x8, x9, [sp, #S_X8]
-	stp x10, x11, [sp, #S_X10]
-	stp x12, x13, [sp, #S_X12]
-	stp x14, x15, [sp, #S_X14]
-	stp x16, x17, [sp, #S_X16]
-	stp x18, x19, [sp, #S_X18]
-	stp x20, x21, [sp, #S_X20]
-	stp x22, x23, [sp, #S_X22]
-	stp x24, x25, [sp, #S_X24]
-	stp x26, x27, [sp, #S_X26]
-	stp x28, x29, [sp, #S_X28]
-	add x0, sp, #PT_REGS_SIZE
-	stp lr, x0, [sp, #S_LR]
-	/*
-	 * Construct a useful saved PSTATE
-	 */
-	mrs x0, nzcv
-	mrs x1, daif
-	orr x0, x0, x1
-	mrs x1, CurrentEL
-	orr x0, x0, x1
-	mrs x1, SPSel
-	orr x0, x0, x1
-	stp xzr, x0, [sp, #S_PC]
-	.endm
-
-	.macro	restore_all_base_regs
-	ldr x0, [sp, #S_PSTATE]
-	and x0, x0, #(PSR_N_BIT | PSR_Z_BIT | PSR_C_BIT | PSR_V_BIT)
-	msr nzcv, x0
-	ldp x0, x1, [sp, #S_X0]
-	ldp x2, x3, [sp, #S_X2]
-	ldp x4, x5, [sp, #S_X4]
-	ldp x6, x7, [sp, #S_X6]
-	ldp x8, x9, [sp, #S_X8]
-	ldp x10, x11, [sp, #S_X10]
-	ldp x12, x13, [sp, #S_X12]
-	ldp x14, x15, [sp, #S_X14]
-	ldp x16, x17, [sp, #S_X16]
-	ldp x18, x19, [sp, #S_X18]
-	ldp x20, x21, [sp, #S_X20]
-	ldp x22, x23, [sp, #S_X22]
-	ldp x24, x25, [sp, #S_X24]
-	ldp x26, x27, [sp, #S_X26]
-	ldp x28, x29, [sp, #S_X28]
-	.endm
-
-SYM_CODE_START(__kretprobe_trampoline)
-	sub sp, sp, #PT_REGS_SIZE
-
-	save_all_base_regs
-
-	/* Setup a frame pointer. */
-	add x29, sp, #S_FP
-
-	mov x0, sp
-	bl trampoline_probe_handler
-	/*
-	 * Replace trampoline address in lr with actual orig_ret_addr return
-	 * address.
-	 */
-	mov lr, x0
-
-	/* The frame pointer (x29) is restored with other registers. */
-	restore_all_base_regs
-
-	add sp, sp, #PT_REGS_SIZE
-	ret
-
-SYM_CODE_END(__kretprobe_trampoline)
diff --git a/arch/arm64/kernel/rethook.c b/arch/arm64/kernel/rethook.c
new file mode 100644
index 000000000000..07d8f6ea34a0
--- /dev/null
+++ b/arch/arm64/kernel/rethook.c
@@ -0,0 +1,28 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Generic return hook for arm64.
+ * Most of the code is copied from arch/arm64/kernel/probes/kprobes.c
+ */
+
+#include <linux/kprobes.h>
+#include <linux/rethook.h>
+
+/* This is called from arch_rethook_trampoline() */
+unsigned long __used arch_rethook_trampoline_callback(struct pt_regs *regs);
+
+unsigned long __used arch_rethook_trampoline_callback(struct pt_regs *regs)
+{
+	return rethook_trampoline_handler(regs, regs->regs[29]);
+}
+NOKPROBE_SYMBOL(arch_rethook_trampoline_callback);
+
+int arch_rethook_prepare(struct rethook_node *rhn, struct pt_regs *regs, bool mcount)
+{
+	rhn->ret_addr = regs->regs[30];
+	rhn->frame = regs->regs[29];
+
+	/* replace return addr (x30) with trampoline */
+	regs->regs[30] = (u64)arch_rethook_trampoline;
+	return 0;
+}
+NOKPROBE_SYMBOL(arch_rethook_prepare);
diff --git a/arch/arm64/kernel/rethook_trampoline.S b/arch/arm64/kernel/rethook_trampoline.S
new file mode 100644
index 000000000000..146d4553674c
--- /dev/null
+++ b/arch/arm64/kernel/rethook_trampoline.S
@@ -0,0 +1,87 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * trampoline entry and return code for rethook.
+ * Renamed from arch/arm64/kernel/probes/kprobes_trampoline.S
+ */
+
+#include <linux/linkage.h>
+#include <asm/asm-offsets.h>
+#include <asm/assembler.h>
+
+	.text
+
+	.macro	save_all_base_regs
+	stp x0, x1, [sp, #S_X0]
+	stp x2, x3, [sp, #S_X2]
+	stp x4, x5, [sp, #S_X4]
+	stp x6, x7, [sp, #S_X6]
+	stp x8, x9, [sp, #S_X8]
+	stp x10, x11, [sp, #S_X10]
+	stp x12, x13, [sp, #S_X12]
+	stp x14, x15, [sp, #S_X14]
+	stp x16, x17, [sp, #S_X16]
+	stp x18, x19, [sp, #S_X18]
+	stp x20, x21, [sp, #S_X20]
+	stp x22, x23, [sp, #S_X22]
+	stp x24, x25, [sp, #S_X24]
+	stp x26, x27, [sp, #S_X26]
+	stp x28, x29, [sp, #S_X28]
+	add x0, sp, #PT_REGS_SIZE
+	stp lr, x0, [sp, #S_LR]
+	/*
+	 * Construct a useful saved PSTATE
+	 */
+	mrs x0, nzcv
+	mrs x1, daif
+	orr x0, x0, x1
+	mrs x1, CurrentEL
+	orr x0, x0, x1
+	mrs x1, SPSel
+	orr x0, x0, x1
+	stp xzr, x0, [sp, #S_PC]
+	.endm
+
+	.macro	restore_all_base_regs
+	ldr x0, [sp, #S_PSTATE]
+	and x0, x0, #(PSR_N_BIT | PSR_Z_BIT | PSR_C_BIT | PSR_V_BIT)
+	msr nzcv, x0
+	ldp x0, x1, [sp, #S_X0]
+	ldp x2, x3, [sp, #S_X2]
+	ldp x4, x5, [sp, #S_X4]
+	ldp x6, x7, [sp, #S_X6]
+	ldp x8, x9, [sp, #S_X8]
+	ldp x10, x11, [sp, #S_X10]
+	ldp x12, x13, [sp, #S_X12]
+	ldp x14, x15, [sp, #S_X14]
+	ldp x16, x17, [sp, #S_X16]
+	ldp x18, x19, [sp, #S_X18]
+	ldp x20, x21, [sp, #S_X20]
+	ldp x22, x23, [sp, #S_X22]
+	ldp x24, x25, [sp, #S_X24]
+	ldp x26, x27, [sp, #S_X26]
+	ldp x28, x29, [sp, #S_X28]
+	.endm
+
+SYM_CODE_START(arch_rethook_trampoline)
+	sub sp, sp, #PT_REGS_SIZE
+
+	save_all_base_regs
+
+	/* Setup a frame pointer. */
+	add x29, sp, #S_FP
+
+	mov x0, sp
+	bl arch_rethook_trampoline_callback
+	/*
+	 * Replace trampoline address in lr with actual orig_ret_addr return
+	 * address.
+	 */
+	mov lr, x0
+
+	/* The frame pointer (x29) is restored with other registers. */
+	restore_all_base_regs
+
+	add sp, sp, #PT_REGS_SIZE
+	ret
+
+SYM_CODE_END(arch_rethook_trampoline)
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
index e4103e085681..5b717af4b555 100644
--- a/arch/arm64/kernel/stacktrace.c
+++ b/arch/arm64/kernel/stacktrace.c
@@ -8,6 +8,7 @@
 #include <linux/export.h>
 #include <linux/ftrace.h>
 #include <linux/kprobes.h>
+#include <linux/rethook.h>
 #include <linux/sched.h>
 #include <linux/sched/debug.h>
 #include <linux/sched/task_stack.h>
@@ -38,7 +39,7 @@ static notrace void start_backtrace(struct stackframe *frame, unsigned long fp,
 {
 	frame->fp = fp;
 	frame->pc = pc;
-#ifdef CONFIG_KRETPROBES
+#if defined(CONFIG_RETHOOK)
 	frame->kr_cur = NULL;
 #endif
 
@@ -134,9 +135,9 @@ static int notrace unwind_frame(struct task_struct *tsk,
 		frame->pc = orig_pc;
 	}
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
-#ifdef CONFIG_KRETPROBES
-	if (is_kretprobe_trampoline(frame->pc))
-		frame->pc = kretprobe_find_ret_addr(tsk, (void *)frame->fp, &frame->kr_cur);
+#ifdef CONFIG_RETHOOK
+	if (is_rethook_trampoline(frame->pc))
+		frame->pc = rethook_find_ret_addr(tsk, frame->fp, &frame->kr_cur);
 #endif
 
 	return 0;


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf v2 0/4] kprobes: rethook,ARM,arm64: Replace kretprobe trampoline with rethook
  2022-04-08  0:50 [PATCH bpf v2 0/4] kprobes: rethook,ARM,arm64: Replace kretprobe trampoline with rethook Masami Hiramatsu
                   ` (3 preceding siblings ...)
  2022-04-08  0:51 ` [PATCH bpf v2 4/4] arm64: " Masami Hiramatsu
@ 2022-04-12 18:03 ` Mark Rutland
  2022-04-29 16:11   ` Masami Hiramatsu
  4 siblings, 1 reply; 9+ messages in thread
From: Mark Rutland @ 2022-04-12 18:03 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Alexei Starovoitov, Alexei Starovoitov, Daniel Borkmann,
	Shubham Bansal, Andrii Nakryiko, bpf, kernel-team, Jiri Olsa,
	Steven Rostedt, Naveen N . Rao, Anil S Keshavamurthy,
	David S . Miller, linux-kernel, Will Deacon, Ard Biesheuvel,
	Russell King, Catalin Marinas, linux-arm-kernel

On Fri, Apr 08, 2022 at 09:50:35AM +0900, Masami Hiramatsu wrote:
> Hi,

Hi,

> Here is the 2nd version of the series for replacing kretprobe trampoline
> with rethook on ARM/arm64. I fixed some compiler warnings in this version.

What tree is this based on? It doesn't cleanly apply atop v5.18-rc1:

| [mark@lakrids:~/src/linux]% git am v2_20220408_mhiramat_kprobes_rethook_arm_arm64_replace_kretprobe_trampoline_with_rethook.mbx
| Applying: ARM: unwind: Initialize the lr_addr field of unwind_ctrl_block
| Applying: rethook,fprobe,kprobes: Check a failure in the rethook_hook()
| Applying: ARM: rethook: Replace kretprobe trampoline with rethook
| error: patch failed: arch/arm/kernel/stacktrace.c:66
| error: arch/arm/kernel/stacktrace.c: patch does not apply
| Patch failed at 0003 ARM: rethook: Replace kretprobe trampoline with rethook
| hint: Use 'git am --show-current-patch=diff' to see the failed patch
| When you have resolved this problem, run "git am --continue".
| If you prefer to skip this patch, run "git am --skip" instead.
| To restore the original branch and stop patching, run "git am --abort".

I've done a `git am -3` locally to make that work for now.

> The previous version is here[1];
> 
> [1] https://lore.kernel.org/all/164915121498.982637.12787715964983738566.stgit@devnote2/T/#u
> 
> This series includes a trivial bugfix for the arm unwinder to initialize
> an internal data structure([1/4]). This is not critical for stack trace,
> but required for rethook to find the LR register from the stack.
> This also have an update for the rethook interface, which allows us to
> check the rethook_hook() failure ([2/4]). This is also required for the
> rethook on arm because unwinder is able to fail.
> The rest of patches are replacing kretprobe trampoline with rethook on
> ARM ([3/4]) and arm64 ([4/4]).

Generally, the arm and arm64 bits go via different trees, and for unwinding the
two are quite different.

IIUC the dependency between the two is just because patch 2 changes the
prototypes of some functions. Is that right?


> Background:
> 
> This rethook came from Jiri's request of multiple kprobe for bpf[2].
> He tried to solve an issue that starting bpf with multiple kprobe will
> take a long time because bpf-kprobe will wait for RCU grace period for
> sync rcu events.
> 
> Jiri wanted to attach a single bpf handler to multiple kprobes and
> he tried to introduce multiple-probe interface to kprobe. So I asked
> him to use ftrace and kretprobe-like hook if it is only for the
> function entry and exit, instead of adding ad-hoc interface
> to kprobes.
> For this purpose, I introduced the fprobe (kprobe like interface for
> ftrace) with the rethook (this is a generic return hook feature for
> fprobe exit handler)[3].
> 
> [2] https://lore.kernel.org/all/20220104080943.113249-1-jolsa@kernel.org/T/#u
> [3] https://lore.kernel.org/all/164191321766.806991.7930388561276940676.stgit@devnote2/T/#u
> 
> The rethook is basically same as the kretprobe trampoline. I just made
> it decoupled from kprobes. Eventually, the all arch dependent kretprobe
> trampolines will be replaced with the rethook trampoline instead of
> cloning the code.
> 
> When I port the rethook for all arch which supports kretprobe, the
> legacy kretprobe specific code (which is for CONFIG_KRETPROBE_ON_RETHOOK=n)
> will be removed eventually.
> 
> BTW, the arm Clang support for rethook is for kretprobes only. fprobe
> and ftrace seems not working with Clang yet.

Do you mean that's an existing issue?

Thanks,
Mark.

> 
> Thank you,
> 
> ---
> 
> Masami Hiramatsu (4):
>       ARM: unwind: Initialize the lr_addr field of unwind_ctrl_block
>       rethook,fprobe,kprobes: Check a failure in the rethook_hook()
>       ARM: rethook: Replace kretprobe trampoline with rethook
>       arm64: rethook: Replace kretprobe trampoline with rethook
> 
> 
>  arch/arm/Kconfig                              |    1 
>  arch/arm/include/asm/stacktrace.h             |    5 +
>  arch/arm/kernel/stacktrace.c                  |   13 +--
>  arch/arm/kernel/unwind.c                      |    1 
>  arch/arm/probes/Makefile                      |    1 
>  arch/arm/probes/kprobes/core.c                |   62 ------------
>  arch/arm/probes/rethook.c                     |  127 +++++++++++++++++++++++++
>  arch/arm64/Kconfig                            |    1 
>  arch/arm64/include/asm/kprobes.h              |    2 
>  arch/arm64/include/asm/stacktrace.h           |    2 
>  arch/arm64/kernel/Makefile                    |    1 
>  arch/arm64/kernel/probes/Makefile             |    1 
>  arch/arm64/kernel/probes/kprobes.c            |   15 ---
>  arch/arm64/kernel/probes/kprobes_trampoline.S |   86 -----------------
>  arch/arm64/kernel/rethook.c                   |   28 ++++++
>  arch/arm64/kernel/rethook_trampoline.S        |   87 +++++++++++++++++
>  arch/arm64/kernel/stacktrace.c                |    9 +-
>  arch/x86/kernel/rethook.c                     |    4 +
>  include/linux/rethook.h                       |    4 -
>  kernel/kprobes.c                              |    8 +-
>  kernel/trace/fprobe.c                         |    5 +
>  kernel/trace/rethook.c                        |   12 ++
>  22 files changed, 287 insertions(+), 188 deletions(-)
>  create mode 100644 arch/arm/probes/rethook.c
>  delete mode 100644 arch/arm64/kernel/probes/kprobes_trampoline.S
>  create mode 100644 arch/arm64/kernel/rethook.c
>  create mode 100644 arch/arm64/kernel/rethook_trampoline.S
> 
> --
> Masami Hiramatsu (Linaro) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf v2 4/4] arm64: rethook: Replace kretprobe trampoline with rethook
  2022-04-08  0:51 ` [PATCH bpf v2 4/4] arm64: " Masami Hiramatsu
@ 2022-04-12 19:04   ` Mark Rutland
  2022-04-29 15:55     ` Masami Hiramatsu
  0 siblings, 1 reply; 9+ messages in thread
From: Mark Rutland @ 2022-04-12 19:04 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Alexei Starovoitov, Alexei Starovoitov, Daniel Borkmann,
	Shubham Bansal, Andrii Nakryiko, bpf, kernel-team, Jiri Olsa,
	Steven Rostedt, Naveen N . Rao, Anil S Keshavamurthy,
	David S . Miller, linux-kernel, Will Deacon, Ard Biesheuvel,
	Russell King, Catalin Marinas, linux-arm-kernel

On Fri, Apr 08, 2022 at 09:51:24AM +0900, Masami Hiramatsu wrote:
> Replace the kretprobe's trampoline code with the rethook on arm64.
> The rethook on arm64 is almost renamed from kretprobe trampoline
> code. The mechanism is completely same.
> 
> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> ---
>  Changes in v2:
>   - Fix a compile warning about no prototype.
> ---
>  arch/arm64/Kconfig                            |    1 
>  arch/arm64/include/asm/kprobes.h              |    2 -
>  arch/arm64/include/asm/stacktrace.h           |    2 -
>  arch/arm64/kernel/Makefile                    |    1 
>  arch/arm64/kernel/probes/Makefile             |    1 
>  arch/arm64/kernel/probes/kprobes.c            |   15 ----
>  arch/arm64/kernel/probes/kprobes_trampoline.S |   86 -------------------------
>  arch/arm64/kernel/rethook.c                   |   28 ++++++++
>  arch/arm64/kernel/rethook_trampoline.S        |   87 +++++++++++++++++++++++++
>  arch/arm64/kernel/stacktrace.c                |    9 +--
>  10 files changed, 123 insertions(+), 109 deletions(-)
>  delete mode 100644 arch/arm64/kernel/probes/kprobes_trampoline.S
>  create mode 100644 arch/arm64/kernel/rethook.c
>  create mode 100644 arch/arm64/kernel/rethook_trampoline.S
> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> index 57c4c995965f..7d2945930283 100644
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -204,6 +204,7 @@ config ARM64
>  	select HAVE_SYSCALL_TRACEPOINTS
>  	select HAVE_KPROBES
>  	select HAVE_KRETPROBES
> +	select HAVE_RETHOOK
>  	select HAVE_GENERIC_VDSO
>  	select IOMMU_DMA if IOMMU_SUPPORT
>  	select IRQ_DOMAIN
> diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
> index 05cd82eeca13..4ac558058377 100644
> --- a/arch/arm64/include/asm/kprobes.h
> +++ b/arch/arm64/include/asm/kprobes.h
> @@ -39,8 +39,6 @@ void arch_remove_kprobe(struct kprobe *);
>  int kprobe_fault_handler(struct pt_regs *regs, unsigned int fsr);
>  int kprobe_exceptions_notify(struct notifier_block *self,
>  			     unsigned long val, void *data);
> -void __kretprobe_trampoline(void);
> -void __kprobes *trampoline_probe_handler(struct pt_regs *regs);
>  
>  #endif /* CONFIG_KPROBES */
>  #endif /* _ARM_KPROBES_H */
> diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
> index e77cdef9ca29..f781874f1609 100644
> --- a/arch/arm64/include/asm/stacktrace.h
> +++ b/arch/arm64/include/asm/stacktrace.h
> @@ -58,7 +58,7 @@ struct stackframe {
>  	DECLARE_BITMAP(stacks_done, __NR_STACK_TYPES);
>  	unsigned long prev_fp;
>  	enum stack_type prev_type;
> -#ifdef CONFIG_KRETPROBES
> +#if defined(CONFIG_RETHOOK)
>  	struct llist_node *kr_cur;
>  #endif
>  };
> diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
> index 986837d7ec82..62e033b1b095 100644
> --- a/arch/arm64/kernel/Makefile
> +++ b/arch/arm64/kernel/Makefile
> @@ -60,6 +60,7 @@ obj-$(CONFIG_ACPI_NUMA)			+= acpi_numa.o
>  obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL)	+= acpi_parking_protocol.o
>  obj-$(CONFIG_PARAVIRT)			+= paravirt.o
>  obj-$(CONFIG_RANDOMIZE_BASE)		+= kaslr.o
> +obj-$(CONFIG_RETHOOK)			+= rethook.o rethook_trampoline.o
>  obj-$(CONFIG_HIBERNATION)		+= hibernate.o hibernate-asm.o
>  obj-$(CONFIG_ELF_CORE)			+= elfcore.o
>  obj-$(CONFIG_KEXEC_CORE)		+= machine_kexec.o relocate_kernel.o	\
> diff --git a/arch/arm64/kernel/probes/Makefile b/arch/arm64/kernel/probes/Makefile
> index 8e4be92e25b1..1fa58cda64ff 100644
> --- a/arch/arm64/kernel/probes/Makefile
> +++ b/arch/arm64/kernel/probes/Makefile
> @@ -1,6 +1,5 @@
>  # SPDX-License-Identifier: GPL-2.0
>  obj-$(CONFIG_KPROBES)		+= kprobes.o decode-insn.o	\
> -				   kprobes_trampoline.o		\
>  				   simulate-insn.o
>  obj-$(CONFIG_UPROBES)		+= uprobes.o decode-insn.o	\
>  				   simulate-insn.o
> diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
> index d9dfa82c1f18..4a3cc266e77e 100644
> --- a/arch/arm64/kernel/probes/kprobes.c
> +++ b/arch/arm64/kernel/probes/kprobes.c
> @@ -399,21 +399,6 @@ int __init arch_populate_kprobe_blacklist(void)
>  	return ret;
>  }
>  
> -void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
> -{
> -	return (void *)kretprobe_trampoline_handler(regs, (void *)regs->regs[29]);
> -}
> -
> -void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
> -				      struct pt_regs *regs)
> -{
> -	ri->ret_addr = (kprobe_opcode_t *)regs->regs[30];
> -	ri->fp = (void *)regs->regs[29];
> -
> -	/* replace return addr (x30) with trampoline */
> -	regs->regs[30] = (long)&__kretprobe_trampoline;
> -}
> -
>  int __kprobes arch_trampoline_kprobe(struct kprobe *p)
>  {
>  	return 0;
> diff --git a/arch/arm64/kernel/probes/kprobes_trampoline.S b/arch/arm64/kernel/probes/kprobes_trampoline.S
> deleted file mode 100644
> index 9a6499bed58b..000000000000
> --- a/arch/arm64/kernel/probes/kprobes_trampoline.S
> +++ /dev/null
> @@ -1,86 +0,0 @@
> -/* SPDX-License-Identifier: GPL-2.0 */
> -/*
> - * trampoline entry and return code for kretprobes.
> - */
> -
> -#include <linux/linkage.h>
> -#include <asm/asm-offsets.h>
> -#include <asm/assembler.h>
> -
> -	.text
> -
> -	.macro	save_all_base_regs
> -	stp x0, x1, [sp, #S_X0]
> -	stp x2, x3, [sp, #S_X2]
> -	stp x4, x5, [sp, #S_X4]
> -	stp x6, x7, [sp, #S_X6]
> -	stp x8, x9, [sp, #S_X8]
> -	stp x10, x11, [sp, #S_X10]
> -	stp x12, x13, [sp, #S_X12]
> -	stp x14, x15, [sp, #S_X14]
> -	stp x16, x17, [sp, #S_X16]
> -	stp x18, x19, [sp, #S_X18]
> -	stp x20, x21, [sp, #S_X20]
> -	stp x22, x23, [sp, #S_X22]
> -	stp x24, x25, [sp, #S_X24]
> -	stp x26, x27, [sp, #S_X26]
> -	stp x28, x29, [sp, #S_X28]
> -	add x0, sp, #PT_REGS_SIZE
> -	stp lr, x0, [sp, #S_LR]
> -	/*
> -	 * Construct a useful saved PSTATE
> -	 */
> -	mrs x0, nzcv
> -	mrs x1, daif
> -	orr x0, x0, x1
> -	mrs x1, CurrentEL
> -	orr x0, x0, x1
> -	mrs x1, SPSel
> -	orr x0, x0, x1
> -	stp xzr, x0, [sp, #S_PC]
> -	.endm
> -
> -	.macro	restore_all_base_regs
> -	ldr x0, [sp, #S_PSTATE]
> -	and x0, x0, #(PSR_N_BIT | PSR_Z_BIT | PSR_C_BIT | PSR_V_BIT)
> -	msr nzcv, x0
> -	ldp x0, x1, [sp, #S_X0]
> -	ldp x2, x3, [sp, #S_X2]
> -	ldp x4, x5, [sp, #S_X4]
> -	ldp x6, x7, [sp, #S_X6]
> -	ldp x8, x9, [sp, #S_X8]
> -	ldp x10, x11, [sp, #S_X10]
> -	ldp x12, x13, [sp, #S_X12]
> -	ldp x14, x15, [sp, #S_X14]
> -	ldp x16, x17, [sp, #S_X16]
> -	ldp x18, x19, [sp, #S_X18]
> -	ldp x20, x21, [sp, #S_X20]
> -	ldp x22, x23, [sp, #S_X22]
> -	ldp x24, x25, [sp, #S_X24]
> -	ldp x26, x27, [sp, #S_X26]
> -	ldp x28, x29, [sp, #S_X28]
> -	.endm
> -
> -SYM_CODE_START(__kretprobe_trampoline)
> -	sub sp, sp, #PT_REGS_SIZE
> -
> -	save_all_base_regs
> -
> -	/* Setup a frame pointer. */
> -	add x29, sp, #S_FP
> -
> -	mov x0, sp
> -	bl trampoline_probe_handler
> -	/*
> -	 * Replace trampoline address in lr with actual orig_ret_addr return
> -	 * address.
> -	 */
> -	mov lr, x0
> -
> -	/* The frame pointer (x29) is restored with other registers. */
> -	restore_all_base_regs
> -
> -	add sp, sp, #PT_REGS_SIZE
> -	ret
> -
> -SYM_CODE_END(__kretprobe_trampoline)
> diff --git a/arch/arm64/kernel/rethook.c b/arch/arm64/kernel/rethook.c
> new file mode 100644
> index 000000000000..07d8f6ea34a0
> --- /dev/null
> +++ b/arch/arm64/kernel/rethook.c
> @@ -0,0 +1,28 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Generic return hook for arm64.
> + * Most of the code is copied from arch/arm64/kernel/probes/kprobes.c
> + */
> +
> +#include <linux/kprobes.h>
> +#include <linux/rethook.h>
> +
> +/* This is called from arch_rethook_trampoline() */
> +unsigned long __used arch_rethook_trampoline_callback(struct pt_regs *regs);
> +
> +unsigned long __used arch_rethook_trampoline_callback(struct pt_regs *regs)
> +{
> +	return rethook_trampoline_handler(regs, regs->regs[29]);
> +}
> +NOKPROBE_SYMBOL(arch_rethook_trampoline_callback);
> +
> +int arch_rethook_prepare(struct rethook_node *rhn, struct pt_regs *regs, bool mcount)

What's the `mcount` paramater for, and why don't we seem to use it below?

The presence of that parameter suggests there is a subtle interaction with
ftrace, but the commit message doesn't mention anything of the sort. I really
worried that has implications for the unwinder.

I also see that (with these patches applied) it's possible to select
CONFIG_FPROBE, but the commit message doesn't describe that either, and I'm not
sure how to test any of that.

> +{
> +	rhn->ret_addr = regs->regs[30];
> +	rhn->frame = regs->regs[29];
> +
> +	/* replace return addr (x30) with trampoline */
> +	regs->regs[30] = (u64)arch_rethook_trampoline;
> +	return 0;
> +}
> +NOKPROBE_SYMBOL(arch_rethook_prepare);
> diff --git a/arch/arm64/kernel/rethook_trampoline.S b/arch/arm64/kernel/rethook_trampoline.S
> new file mode 100644
> index 000000000000..146d4553674c
> --- /dev/null
> +++ b/arch/arm64/kernel/rethook_trampoline.S
> @@ -0,0 +1,87 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/*
> + * trampoline entry and return code for rethook.
> + * Renamed from arch/arm64/kernel/probes/kprobes_trampoline.S
> + */
> +
> +#include <linux/linkage.h>
> +#include <asm/asm-offsets.h>
> +#include <asm/assembler.h>
> +
> +	.text
> +
> +	.macro	save_all_base_regs
> +	stp x0, x1, [sp, #S_X0]
> +	stp x2, x3, [sp, #S_X2]
> +	stp x4, x5, [sp, #S_X4]
> +	stp x6, x7, [sp, #S_X6]
> +	stp x8, x9, [sp, #S_X8]
> +	stp x10, x11, [sp, #S_X10]
> +	stp x12, x13, [sp, #S_X12]
> +	stp x14, x15, [sp, #S_X14]
> +	stp x16, x17, [sp, #S_X16]
> +	stp x18, x19, [sp, #S_X18]
> +	stp x20, x21, [sp, #S_X20]
> +	stp x22, x23, [sp, #S_X22]
> +	stp x24, x25, [sp, #S_X24]
> +	stp x26, x27, [sp, #S_X26]
> +	stp x28, x29, [sp, #S_X28]
> +	add x0, sp, #PT_REGS_SIZE
> +	stp lr, x0, [sp, #S_LR]
> +	/*
> +	 * Construct a useful saved PSTATE
> +	 */
> +	mrs x0, nzcv
> +	mrs x1, daif
> +	orr x0, x0, x1
> +	mrs x1, CurrentEL
> +	orr x0, x0, x1
> +	mrs x1, SPSel
> +	orr x0, x0, x1

I realise this is just a copy of the existing kretprobe code, but it would be
*really* nice if we could avoid faking the regs when we don't take an
exception, like the FTRACE_WITH_ARGS thing.

The "useful saved PSTATE" is getting increasingly bogus these days, since it
doesn't contain a bunch of values that are in the real PSTATE, and we don't
restore anything other than NZCV anyway.

> +	stp xzr, x0, [sp, #S_PC]
> +	.endm
> +
> +	.macro	restore_all_base_regs
> +	ldr x0, [sp, #S_PSTATE]
> +	and x0, x0, #(PSR_N_BIT | PSR_Z_BIT | PSR_C_BIT | PSR_V_BIT)
> +	msr nzcv, x0
> +	ldp x0, x1, [sp, #S_X0]
> +	ldp x2, x3, [sp, #S_X2]
> +	ldp x4, x5, [sp, #S_X4]
> +	ldp x6, x7, [sp, #S_X6]
> +	ldp x8, x9, [sp, #S_X8]
> +	ldp x10, x11, [sp, #S_X10]
> +	ldp x12, x13, [sp, #S_X12]
> +	ldp x14, x15, [sp, #S_X14]
> +	ldp x16, x17, [sp, #S_X16]
> +	ldp x18, x19, [sp, #S_X18]
> +	ldp x20, x21, [sp, #S_X20]
> +	ldp x22, x23, [sp, #S_X22]
> +	ldp x24, x25, [sp, #S_X24]
> +	ldp x26, x27, [sp, #S_X26]
> +	ldp x28, x29, [sp, #S_X28]
> +	.endm
> +
> +SYM_CODE_START(arch_rethook_trampoline)
> +	sub sp, sp, #PT_REGS_SIZE
> +
> +	save_all_base_regs
> +
> +	/* Setup a frame pointer. */
> +	add x29, sp, #S_FP
> +
> +	mov x0, sp
> +	bl arch_rethook_trampoline_callback
> +	/*
> +	 * Replace trampoline address in lr with actual orig_ret_addr return
> +	 * address.
> +	 */
> +	mov lr, x0
> +
> +	/* The frame pointer (x29) is restored with other registers. */
> +	restore_all_base_regs
> +
> +	add sp, sp, #PT_REGS_SIZE
> +	ret
> +
> +SYM_CODE_END(arch_rethook_trampoline)
> diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> index e4103e085681..5b717af4b555 100644
> --- a/arch/arm64/kernel/stacktrace.c
> +++ b/arch/arm64/kernel/stacktrace.c
> @@ -8,6 +8,7 @@
>  #include <linux/export.h>
>  #include <linux/ftrace.h>
>  #include <linux/kprobes.h>
> +#include <linux/rethook.h>
>  #include <linux/sched.h>
>  #include <linux/sched/debug.h>
>  #include <linux/sched/task_stack.h>
> @@ -38,7 +39,7 @@ static notrace void start_backtrace(struct stackframe *frame, unsigned long fp,
>  {
>  	frame->fp = fp;
>  	frame->pc = pc;
> -#ifdef CONFIG_KRETPROBES
> +#if defined(CONFIG_RETHOOK)
>  	frame->kr_cur = NULL;
>  #endif
>  
> @@ -134,9 +135,9 @@ static int notrace unwind_frame(struct task_struct *tsk,
>  		frame->pc = orig_pc;
>  	}
>  #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
> -#ifdef CONFIG_KRETPROBES
> -	if (is_kretprobe_trampoline(frame->pc))
> -		frame->pc = kretprobe_find_ret_addr(tsk, (void *)frame->fp, &frame->kr_cur);
> +#ifdef CONFIG_RETHOOK
> +	if (is_rethook_trampoline(frame->pc))
> +		frame->pc = rethook_find_ret_addr(tsk, frame->fp, &frame->kr_cur);
>  #endif

Is it possible to have an fprobe and a regular ftrace graph caller call on the
same ftrace callsite? ... or are those mutually exclusive?

The existing code here depends on kretprobes and ftrace graph caller
instrumentation nesting in a specific order, and I'm worried that might have
changed.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf v2 4/4] arm64: rethook: Replace kretprobe trampoline with rethook
  2022-04-12 19:04   ` Mark Rutland
@ 2022-04-29 15:55     ` Masami Hiramatsu
  0 siblings, 0 replies; 9+ messages in thread
From: Masami Hiramatsu @ 2022-04-29 15:55 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Alexei Starovoitov, Alexei Starovoitov, Daniel Borkmann,
	Shubham Bansal, Andrii Nakryiko, bpf, kernel-team, Jiri Olsa,
	Steven Rostedt, Naveen N . Rao, Anil S Keshavamurthy,
	David S . Miller, linux-kernel, Will Deacon, Ard Biesheuvel,
	Russell King, Catalin Marinas, linux-arm-kernel

Hi Mark,

Thanks for the comment, and sorry about late reply.

On Tue, 12 Apr 2022 20:04:28 +0100
Mark Rutland <mark.rutland@arm.com> wrote:

> > diff --git a/arch/arm64/kernel/rethook.c b/arch/arm64/kernel/rethook.c
> > new file mode 100644
> > index 000000000000..07d8f6ea34a0
> > --- /dev/null
> > +++ b/arch/arm64/kernel/rethook.c
> > @@ -0,0 +1,28 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Generic return hook for arm64.
> > + * Most of the code is copied from arch/arm64/kernel/probes/kprobes.c
> > + */
> > +
> > +#include <linux/kprobes.h>
> > +#include <linux/rethook.h>
> > +
> > +/* This is called from arch_rethook_trampoline() */
> > +unsigned long __used arch_rethook_trampoline_callback(struct pt_regs *regs);
> > +
> > +unsigned long __used arch_rethook_trampoline_callback(struct pt_regs *regs)
> > +{
> > +	return rethook_trampoline_handler(regs, regs->regs[29]);
> > +}
> > +NOKPROBE_SYMBOL(arch_rethook_trampoline_callback);
> > +
> > +int arch_rethook_prepare(struct rethook_node *rhn, struct pt_regs *regs, bool mcount)
> 
> What's the `mcount` paramater for, and why don't we seem to use it below?
> 
> The presence of that parameter suggests there is a subtle interaction with
> ftrace, but the commit message doesn't mention anything of the sort. I really
> worried that has implications for the unwinder.

mcount parameter is introduced only for arm arch at this point. Other arch will
ignore this parameter. See [3/4] for details. When the rethook caller (ftrace
or kprobe) hooks the function entry, how to get the real return address depends
on whether it is called from ftrace (mcount) or kprobe.
By kprobe, that is called from the first instruction of the function. Thus the
real return address can be found from regs->lr. But from the ftrace (mcount)
context, we can not find it from regs->lr (this LR register is used for saving
the return address of mcount itself) on arm. On arm64, this seems to be fixed.
So I'm not sure this is a real issue on arm or known limitation.

> 
> I also see that (with these patches applied) it's possible to select
> CONFIG_FPROBE, but the commit message doesn't describe that either, and I'm not
> sure how to test any of that.

Ah, Sorry. Fprobe requires this feature, so HAVE_RETHOOK=y, CONFIG_FPROBE is
available on this architecture. If you enable CONFIG_FPROBE_SANITY_TEST=y,
this feature is tested by fprobe.

> 
> > +{
> > +	rhn->ret_addr = regs->regs[30];
> > +	rhn->frame = regs->regs[29];
> > +
> > +	/* replace return addr (x30) with trampoline */
> > +	regs->regs[30] = (u64)arch_rethook_trampoline;
> > +	return 0;
> > +}
> > +NOKPROBE_SYMBOL(arch_rethook_prepare);
> > diff --git a/arch/arm64/kernel/rethook_trampoline.S b/arch/arm64/kernel/rethook_trampoline.S
> > new file mode 100644
> > index 000000000000..146d4553674c
> > --- /dev/null
> > +++ b/arch/arm64/kernel/rethook_trampoline.S
> > @@ -0,0 +1,87 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/*
> > + * trampoline entry and return code for rethook.
> > + * Renamed from arch/arm64/kernel/probes/kprobes_trampoline.S
> > + */
> > +
> > +#include <linux/linkage.h>
> > +#include <asm/asm-offsets.h>
> > +#include <asm/assembler.h>
> > +
> > +	.text
> > +
> > +	.macro	save_all_base_regs
> > +	stp x0, x1, [sp, #S_X0]
> > +	stp x2, x3, [sp, #S_X2]
> > +	stp x4, x5, [sp, #S_X4]
> > +	stp x6, x7, [sp, #S_X6]
> > +	stp x8, x9, [sp, #S_X8]
> > +	stp x10, x11, [sp, #S_X10]
> > +	stp x12, x13, [sp, #S_X12]
> > +	stp x14, x15, [sp, #S_X14]
> > +	stp x16, x17, [sp, #S_X16]
> > +	stp x18, x19, [sp, #S_X18]
> > +	stp x20, x21, [sp, #S_X20]
> > +	stp x22, x23, [sp, #S_X22]
> > +	stp x24, x25, [sp, #S_X24]
> > +	stp x26, x27, [sp, #S_X26]
> > +	stp x28, x29, [sp, #S_X28]
> > +	add x0, sp, #PT_REGS_SIZE
> > +	stp lr, x0, [sp, #S_LR]
> > +	/*
> > +	 * Construct a useful saved PSTATE
> > +	 */
> > +	mrs x0, nzcv
> > +	mrs x1, daif
> > +	orr x0, x0, x1
> > +	mrs x1, CurrentEL
> > +	orr x0, x0, x1
> > +	mrs x1, SPSel
> > +	orr x0, x0, x1
> 
> I realise this is just a copy of the existing kretprobe code, but it would be
> *really* nice if we could avoid faking the regs when we don't take an
> exception, like the FTRACE_WITH_ARGS thing.

No, since this is not a "function call". This is function exit, thus the
all registers can be involved.

> 
> The "useful saved PSTATE" is getting increasingly bogus these days, since it
> doesn't contain a bunch of values that are in the real PSTATE, and we don't
> restore anything other than NZCV anyway.

Hmm, that is a real bug. It should be fixed even without this patch, on kretprobes.
Are there any good way to store those flags in the real PSTATE?

I need arm64 maintainer's help to fix that. If there is a limitation of getting
the PSTATE, I need to decide to
- use kprobe instead of hand-assembly code (like powerpc) to hook the trampoline.
or
- just notice users this is a limitation of kretprobe on arm64.


> 
> > +	stp xzr, x0, [sp, #S_PC]
> > +	.endm
> > +
> > +	.macro	restore_all_base_regs
> > +	ldr x0, [sp, #S_PSTATE]
> > +	and x0, x0, #(PSR_N_BIT | PSR_Z_BIT | PSR_C_BIT | PSR_V_BIT)
> > +	msr nzcv, x0
> > +	ldp x0, x1, [sp, #S_X0]
> > +	ldp x2, x3, [sp, #S_X2]
> > +	ldp x4, x5, [sp, #S_X4]
> > +	ldp x6, x7, [sp, #S_X6]
> > +	ldp x8, x9, [sp, #S_X8]
> > +	ldp x10, x11, [sp, #S_X10]
> > +	ldp x12, x13, [sp, #S_X12]
> > +	ldp x14, x15, [sp, #S_X14]
> > +	ldp x16, x17, [sp, #S_X16]
> > +	ldp x18, x19, [sp, #S_X18]
> > +	ldp x20, x21, [sp, #S_X20]
> > +	ldp x22, x23, [sp, #S_X22]
> > +	ldp x24, x25, [sp, #S_X24]
> > +	ldp x26, x27, [sp, #S_X26]
> > +	ldp x28, x29, [sp, #S_X28]
> > +	.endm
> > +
> > +SYM_CODE_START(arch_rethook_trampoline)
> > +	sub sp, sp, #PT_REGS_SIZE
> > +
> > +	save_all_base_regs
> > +
> > +	/* Setup a frame pointer. */
> > +	add x29, sp, #S_FP
> > +
> > +	mov x0, sp
> > +	bl arch_rethook_trampoline_callback
> > +	/*
> > +	 * Replace trampoline address in lr with actual orig_ret_addr return
> > +	 * address.
> > +	 */
> > +	mov lr, x0
> > +
> > +	/* The frame pointer (x29) is restored with other registers. */
> > +	restore_all_base_regs
> > +
> > +	add sp, sp, #PT_REGS_SIZE
> > +	ret
> > +
> > +SYM_CODE_END(arch_rethook_trampoline)
> > diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
> > index e4103e085681..5b717af4b555 100644
> > --- a/arch/arm64/kernel/stacktrace.c
> > +++ b/arch/arm64/kernel/stacktrace.c
> > @@ -8,6 +8,7 @@
> >  #include <linux/export.h>
> >  #include <linux/ftrace.h>
> >  #include <linux/kprobes.h>
> > +#include <linux/rethook.h>
> >  #include <linux/sched.h>
> >  #include <linux/sched/debug.h>
> >  #include <linux/sched/task_stack.h>
> > @@ -38,7 +39,7 @@ static notrace void start_backtrace(struct stackframe *frame, unsigned long fp,
> >  {
> >  	frame->fp = fp;
> >  	frame->pc = pc;
> > -#ifdef CONFIG_KRETPROBES
> > +#if defined(CONFIG_RETHOOK)
> >  	frame->kr_cur = NULL;
> >  #endif
> >  
> > @@ -134,9 +135,9 @@ static int notrace unwind_frame(struct task_struct *tsk,
> >  		frame->pc = orig_pc;
> >  	}
> >  #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
> > -#ifdef CONFIG_KRETPROBES
> > -	if (is_kretprobe_trampoline(frame->pc))
> > -		frame->pc = kretprobe_find_ret_addr(tsk, (void *)frame->fp, &frame->kr_cur);
> > +#ifdef CONFIG_RETHOOK
> > +	if (is_rethook_trampoline(frame->pc))
> > +		frame->pc = rethook_find_ret_addr(tsk, frame->fp, &frame->kr_cur);
> >  #endif
> 
> Is it possible to have an fprobe and a regular ftrace graph caller call on the
> same ftrace callsite? ... or are those mutually exclusive?

Hmm, those are too different at this moment, but I would like to do that.
What I would like to do is
- if the arch has no ftrace-graph implementation, rethook will use the
  kretprobe trampoline based implementation. This can drop some events
  because of the resource limitation.
- else, it will use ftrace-graph implementation. This can be very effective
  in some case (like tracing frequently called function) because it uses
  par-task shadow stack. But not many arch implemented it.
- Also, the interface needs to be unified. Rethook API is intended to be
  used for the integration. kretprobe and fprobe will use the rethook, thus
  the pt_regs is required.

> 
> The existing code here depends on kretprobes and ftrace graph caller
> instrumentation nesting in a specific order, and I'm worried that might have
> changed.

Would you mean that the "nesting" is using kretprobe and ftrace-graph on
same function, correct?
At this moment, this rethook (kretprobe and fprobe) does not use the same
shadow stack of the ftrace graph caller, so the nesting must work correctly.

Thank you,

> 
> Thanks,
> Mark.


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf v2 0/4] kprobes: rethook,ARM,arm64: Replace kretprobe trampoline with rethook
  2022-04-12 18:03 ` [PATCH bpf v2 0/4] kprobes: rethook,ARM,arm64: " Mark Rutland
@ 2022-04-29 16:11   ` Masami Hiramatsu
  0 siblings, 0 replies; 9+ messages in thread
From: Masami Hiramatsu @ 2022-04-29 16:11 UTC (permalink / raw)
  To: Mark Rutland
  Cc: Alexei Starovoitov, Alexei Starovoitov, Daniel Borkmann,
	Shubham Bansal, Andrii Nakryiko, bpf, kernel-team, Jiri Olsa,
	Steven Rostedt, Naveen N . Rao, Anil S Keshavamurthy,
	David S . Miller, linux-kernel, Will Deacon, Ard Biesheuvel,
	Russell King, Catalin Marinas, linux-arm-kernel

Hi Mark,

Sorry for replying this late.

On Tue, 12 Apr 2022 19:03:53 +0100
Mark Rutland <mark.rutland@arm.com> wrote:

> On Fri, Apr 08, 2022 at 09:50:35AM +0900, Masami Hiramatsu wrote:
> > Hi,
> 
> Hi,
> 
> > Here is the 2nd version of the series for replacing kretprobe trampoline
> > with rethook on ARM/arm64. I fixed some compiler warnings in this version.
> 
> What tree is this based on? It doesn't cleanly apply atop v5.18-rc1:

I worked on bpf tree, but I'll try rebasing on the latest linus tree
(or arm/arm64 tree) in the next version.

> 
> | [mark@lakrids:~/src/linux]% git am v2_20220408_mhiramat_kprobes_rethook_arm_arm64_replace_kretprobe_trampoline_with_rethook.mbx
> | Applying: ARM: unwind: Initialize the lr_addr field of unwind_ctrl_block
> | Applying: rethook,fprobe,kprobes: Check a failure in the rethook_hook()
> | Applying: ARM: rethook: Replace kretprobe trampoline with rethook
> | error: patch failed: arch/arm/kernel/stacktrace.c:66
> | error: arch/arm/kernel/stacktrace.c: patch does not apply
> | Patch failed at 0003 ARM: rethook: Replace kretprobe trampoline with rethook
> | hint: Use 'git am --show-current-patch=diff' to see the failed patch
> | When you have resolved this problem, run "git am --continue".
> | If you prefer to skip this patch, run "git am --skip" instead.
> | To restore the original branch and stop patching, run "git am --abort".
> 
> I've done a `git am -3` locally to make that work for now.
> 
> > The previous version is here[1];
> > 
> > [1] https://lore.kernel.org/all/164915121498.982637.12787715964983738566.stgit@devnote2/T/#u
> > 
> > This series includes a trivial bugfix for the arm unwinder to initialize
> > an internal data structure([1/4]). This is not critical for stack trace,
> > but required for rethook to find the LR register from the stack.
> > This also have an update for the rethook interface, which allows us to
> > check the rethook_hook() failure ([2/4]). This is also required for the
> > rethook on arm because unwinder is able to fail.
> > The rest of patches are replacing kretprobe trampoline with rethook on
> > ARM ([3/4]) and arm64 ([4/4]).
> 
> Generally, the arm and arm64 bits go via different trees, and for unwinding the
> two are quite different.

OK, Should I split this into 2 series?
Anyway, I'll send the first one for arm tree because this is a real bug.

> 
> IIUC the dependency between the two is just because patch 2 changes the
> prototypes of some functions. Is that right?

Yes. I can push it one by one. What is the best way, would you think?
If I split this into some series, I will;

- Send [1/4] to arm tree as a bugfix.
- Send [2/4] and [3/4] to arm tree as an enhancement.
- Send [4/4] to arm64 when above 2 series are merged.

Is that good for you?

> 
> 
> > Background:
> > 
> > This rethook came from Jiri's request of multiple kprobe for bpf[2].
> > He tried to solve an issue that starting bpf with multiple kprobe will
> > take a long time because bpf-kprobe will wait for RCU grace period for
> > sync rcu events.
> > 
> > Jiri wanted to attach a single bpf handler to multiple kprobes and
> > he tried to introduce multiple-probe interface to kprobe. So I asked
> > him to use ftrace and kretprobe-like hook if it is only for the
> > function entry and exit, instead of adding ad-hoc interface
> > to kprobes.
> > For this purpose, I introduced the fprobe (kprobe like interface for
> > ftrace) with the rethook (this is a generic return hook feature for
> > fprobe exit handler)[3].
> > 
> > [2] https://lore.kernel.org/all/20220104080943.113249-1-jolsa@kernel.org/T/#u
> > [3] https://lore.kernel.org/all/164191321766.806991.7930388561276940676.stgit@devnote2/T/#u
> > 
> > The rethook is basically same as the kretprobe trampoline. I just made
> > it decoupled from kprobes. Eventually, the all arch dependent kretprobe
> > trampolines will be replaced with the rethook trampoline instead of
> > cloning the code.
> > 
> > When I port the rethook for all arch which supports kretprobe, the
> > legacy kretprobe specific code (which is for CONFIG_KRETPROBE_ON_RETHOOK=n)
> > will be removed eventually.
> > 
> > BTW, the arm Clang support for rethook is for kretprobes only. fprobe
> > and ftrace seems not working with Clang yet.
> 
> Do you mean that's an existing issue?

Yes, but the above is not enough, Clang on arm and dynamic ftrace doesn't work.
Perhaps, Clang on arm doesn't support -mrecord-mcount? 

Thank you,

> 
> Thanks,
> Mark.
> 
> > 
> > Thank you,
> > 
> > ---
> > 
> > Masami Hiramatsu (4):
> >       ARM: unwind: Initialize the lr_addr field of unwind_ctrl_block
> >       rethook,fprobe,kprobes: Check a failure in the rethook_hook()
> >       ARM: rethook: Replace kretprobe trampoline with rethook
> >       arm64: rethook: Replace kretprobe trampoline with rethook
> > 
> > 
> >  arch/arm/Kconfig                              |    1 
> >  arch/arm/include/asm/stacktrace.h             |    5 +
> >  arch/arm/kernel/stacktrace.c                  |   13 +--
> >  arch/arm/kernel/unwind.c                      |    1 
> >  arch/arm/probes/Makefile                      |    1 
> >  arch/arm/probes/kprobes/core.c                |   62 ------------
> >  arch/arm/probes/rethook.c                     |  127 +++++++++++++++++++++++++
> >  arch/arm64/Kconfig                            |    1 
> >  arch/arm64/include/asm/kprobes.h              |    2 
> >  arch/arm64/include/asm/stacktrace.h           |    2 
> >  arch/arm64/kernel/Makefile                    |    1 
> >  arch/arm64/kernel/probes/Makefile             |    1 
> >  arch/arm64/kernel/probes/kprobes.c            |   15 ---
> >  arch/arm64/kernel/probes/kprobes_trampoline.S |   86 -----------------
> >  arch/arm64/kernel/rethook.c                   |   28 ++++++
> >  arch/arm64/kernel/rethook_trampoline.S        |   87 +++++++++++++++++
> >  arch/arm64/kernel/stacktrace.c                |    9 +-
> >  arch/x86/kernel/rethook.c                     |    4 +
> >  include/linux/rethook.h                       |    4 -
> >  kernel/kprobes.c                              |    8 +-
> >  kernel/trace/fprobe.c                         |    5 +
> >  kernel/trace/rethook.c                        |   12 ++
> >  22 files changed, 287 insertions(+), 188 deletions(-)
> >  create mode 100644 arch/arm/probes/rethook.c
> >  delete mode 100644 arch/arm64/kernel/probes/kprobes_trampoline.S
> >  create mode 100644 arch/arm64/kernel/rethook.c
> >  create mode 100644 arch/arm64/kernel/rethook_trampoline.S
> > 
> > --
> > Masami Hiramatsu (Linaro) <mhiramat@kernel.org>


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-04-29 16:11 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-08  0:50 [PATCH bpf v2 0/4] kprobes: rethook,ARM,arm64: Replace kretprobe trampoline with rethook Masami Hiramatsu
2022-04-08  0:50 ` [PATCH bpf v2 1/4] ARM: unwind: Initialize the lr_addr field of unwind_ctrl_block Masami Hiramatsu
2022-04-08  0:51 ` [PATCH bpf v2 2/4] rethook,fprobe,kprobes: Check a failure in the rethook_hook() Masami Hiramatsu
2022-04-08  0:51 ` [PATCH bpf v2 3/4] ARM: rethook: Replace kretprobe trampoline with rethook Masami Hiramatsu
2022-04-08  0:51 ` [PATCH bpf v2 4/4] arm64: " Masami Hiramatsu
2022-04-12 19:04   ` Mark Rutland
2022-04-29 15:55     ` Masami Hiramatsu
2022-04-12 18:03 ` [PATCH bpf v2 0/4] kprobes: rethook,ARM,arm64: " Mark Rutland
2022-04-29 16:11   ` Masami Hiramatsu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).