All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next 0/2] kprobes: rethook: x86: Replace kretprobe trampoline with rethook
@ 2022-03-25  4:28 Masami Hiramatsu
  2022-03-25  4:28 ` [PATCH bpf-next 1/2] kprobes: Use rethook for kretprobe if possible Masami Hiramatsu
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Masami Hiramatsu @ 2022-03-25  4:28 UTC (permalink / raw)
  To: Alexei Starovoitov, Andrii Nakryiko, x86, Peter Zijlstra
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Dan Carpenter, kernel-janitors, Steven Rostedt, Masami Hiramatsu,
	Jiri Olsa, bpf, linux-kernel

Hi,

Here are the patch set for generic kretprobe and kretprobe on x86 for
replacing the kretprobe trampoline with rethook. For the other archs,
I will port rethook to those after this has been merged.
This is previously called as "rethook: x86: Add rethook x86 implementation"
The previous thread is here[1].

[1] https://lore.kernel.org/all/164800288611.1716332.7053663723617614668.stgit@devnote2/T/#u

Background:

This rethook came from Jiri's request of multiple kprobe for bpf[1].
He tried to solve an issue that starting bpf with multiple kprobe will
take a long time because bpf-kprobe will wait for RCU grace period for
sync rcu events.

Jiri wanted to attach a single bpf handler to multiple kprobes and
he tried to introduce multiple-probe interface to kprobe. So I asked
him to use ftrace and kretprobe-like hook if it is only for the
function entry and exit, instead of adding ad-hoc interface
to kprobes.
For this purpose, I introduced the fprobe (kprobe like interface for
ftrace) with the rethook (this is a generic return hook feature for
fprobe exit handler)[2].

[1] https://lore.kernel.org/all/20220104080943.113249-1-jolsa@kernel.org/T/#u
[2] https://lore.kernel.org/all/164191321766.806991.7930388561276940676.stgit@devnote2/T/#u

The rethook is basically same as the kretprobe trampoline. I just made
it decoupled from kprobes. Eventually, the all arch dependent kretprobe
trampolines will be replaced with the rethook trampoline instead of
cloning and set HAVE_RETHOOK=y.
When I port the rethook for all arch which supports kretprobe, the
legacy kretprobe specific code (which is for CONFIG_KRETPROBE_ON_RETHOOK=n)
will be removed eventually.

BTW, this patch can be applied to next-20220324, not the bpf-next tree
directly, because this depends on ANNOTATE_NOENDBR macro. However, since
the fprobe is merged in the bpf-next, I marked this for bpf-next.
So until merging the both of fprobes and ENDBR series, to compile this
you need below 2 lines in arch/x86/kernel/rethook.c.

#ifndef ANNOTATE_NOENDBR
#define ANNOTATE_NOENDBR

But after those are merged, these lines will be unneeded. How should I
handle this issue? (Just remove ANNOTATE_NOENDBR line in bpf-next?)

Thank you,

---

Masami Hiramatsu (2):
      kprobes: Use rethook for kretprobe if possible
      rethook: kprobes: x86: Replace kretprobe with rethook on x86


 arch/Kconfig                     |    7 ++
 arch/x86/Kconfig                 |    1 
 arch/x86/include/asm/unwind.h    |   23 +++----
 arch/x86/kernel/Makefile         |    1 
 arch/x86/kernel/kprobes/common.h |    1 
 arch/x86/kernel/kprobes/core.c   |  107 ---------------------------------
 arch/x86/kernel/rethook.c        |  121 +++++++++++++++++++++++++++++++++++++
 include/linux/kprobes.h          |   51 +++++++++++++++-
 kernel/kprobes.c                 |  124 ++++++++++++++++++++++++++++++++------
 kernel/trace/trace_kprobe.c      |    4 +
 10 files changed, 296 insertions(+), 144 deletions(-)
 create mode 100644 arch/x86/kernel/rethook.c

--
Masami Hiramatsu (Linaro) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH bpf-next 1/2] kprobes: Use rethook for kretprobe if possible
  2022-03-25  4:28 [PATCH bpf-next 0/2] kprobes: rethook: x86: Replace kretprobe trampoline with rethook Masami Hiramatsu
@ 2022-03-25  4:28 ` Masami Hiramatsu
  2022-03-25  4:29 ` [PATCH bpf-next 2/2] rethook: kprobes: x86: Replace kretprobe with rethook on x86 Masami Hiramatsu
  2022-03-25 11:40 ` [PATCH bpf-next 3/2] x86,rethook: Fix arch_rethook_trampoline() to generate a complete pt_regs Peter Zijlstra
  2 siblings, 0 replies; 9+ messages in thread
From: Masami Hiramatsu @ 2022-03-25  4:28 UTC (permalink / raw)
  To: Alexei Starovoitov, Andrii Nakryiko, x86, Peter Zijlstra
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Dan Carpenter, kernel-janitors, Steven Rostedt, Masami Hiramatsu,
	Jiri Olsa, bpf, linux-kernel

Use rethook for kretprobe function return hooking if the arch sets
CONFIG_HAVE_RETHOOK=y. In this case, CONFIG_KRETPROBE_ON_RETHOOK is
set to 'y' automatically, and the kretprobe internal data fields
switches to use rethook. If not, it continues to use kretprobe
specific function return hooks.

Suggested-by: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 arch/Kconfig                |    7 ++
 include/linux/kprobes.h     |   51 +++++++++++++++++-
 kernel/kprobes.c            |  124 ++++++++++++++++++++++++++++++++++++-------
 kernel/trace/trace_kprobe.c |    4 +
 4 files changed, 161 insertions(+), 25 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index e12a4268c01d..9570854c4683 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -164,7 +164,12 @@ config ARCH_USE_BUILTIN_BSWAP
 
 config KRETPROBES
 	def_bool y
-	depends on KPROBES && HAVE_KRETPROBES
+	depends on KPROBES && (HAVE_KRETPROBES || HAVE_RETHOOK)
+
+config KRETPROBE_ON_RETHOOK
+	def_bool y
+	depends on HAVE_RETHOOK
+	select RETHOOK
 
 config USER_RETURN_NOTIFIER
 	bool
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 312ff997c743..157168769fc2 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -28,6 +28,7 @@
 #include <linux/ftrace.h>
 #include <linux/refcount.h>
 #include <linux/freelist.h>
+#include <linux/rethook.h>
 #include <asm/kprobes.h>
 
 #ifdef CONFIG_KPROBES
@@ -149,13 +150,20 @@ struct kretprobe {
 	int maxactive;
 	int nmissed;
 	size_t data_size;
+#ifdef CONFIG_KRETPROBE_ON_RETHOOK
+	struct rethook *rh;
+#else
 	struct freelist_head freelist;
 	struct kretprobe_holder *rph;
+#endif
 };
 
 #define KRETPROBE_MAX_DATA_SIZE	4096
 
 struct kretprobe_instance {
+#ifdef CONFIG_KRETPROBE_ON_RETHOOK
+	struct rethook_node node;
+#else
 	union {
 		struct freelist_node freelist;
 		struct rcu_head rcu;
@@ -164,6 +172,7 @@ struct kretprobe_instance {
 	struct kretprobe_holder *rph;
 	kprobe_opcode_t *ret_addr;
 	void *fp;
+#endif
 	char data[];
 };
 
@@ -186,10 +195,24 @@ extern void kprobe_busy_begin(void);
 extern void kprobe_busy_end(void);
 
 #ifdef CONFIG_KRETPROBES
-extern void arch_prepare_kretprobe(struct kretprobe_instance *ri,
-				   struct pt_regs *regs);
+/* Check whether @p is used for implementing a trampoline. */
 extern int arch_trampoline_kprobe(struct kprobe *p);
 
+#ifdef CONFIG_KRETPROBE_ON_RETHOOK
+static nokprobe_inline struct kretprobe *get_kretprobe(struct kretprobe_instance *ri)
+{
+	RCU_LOCKDEP_WARN(!rcu_read_lock_any_held(),
+		"Kretprobe is accessed from instance under preemptive context");
+
+	return (struct kretprobe *)READ_ONCE(ri->node.rethook->data);
+}
+static nokprobe_inline unsigned long get_kretprobe_retaddr(struct kretprobe_instance *ri)
+{
+	return ri->node.ret_addr;
+}
+#else
+extern void arch_prepare_kretprobe(struct kretprobe_instance *ri,
+				   struct pt_regs *regs);
 void arch_kretprobe_fixup_return(struct pt_regs *regs,
 				 kprobe_opcode_t *correct_ret_addr);
 
@@ -232,6 +255,12 @@ static nokprobe_inline struct kretprobe *get_kretprobe(struct kretprobe_instance
 	return READ_ONCE(ri->rph->rp);
 }
 
+static nokprobe_inline unsigned long get_kretprobe_retaddr(struct kretprobe_instance *ri)
+{
+	return (unsigned long)ri->ret_addr;
+}
+#endif /* CONFIG_KRETPROBE_ON_RETHOOK */
+
 #else /* !CONFIG_KRETPROBES */
 static inline void arch_prepare_kretprobe(struct kretprobe *rp,
 					struct pt_regs *regs)
@@ -395,7 +424,11 @@ void unregister_kretprobe(struct kretprobe *rp);
 int register_kretprobes(struct kretprobe **rps, int num);
 void unregister_kretprobes(struct kretprobe **rps, int num);
 
+#ifdef CONFIG_KRETPROBE_ON_RETHOOK
+#define kprobe_flush_task(tk)	do {} while (0)
+#else
 void kprobe_flush_task(struct task_struct *tk);
+#endif
 
 void kprobe_free_init_mem(void);
 
@@ -509,6 +542,19 @@ static inline bool is_kprobe_optinsn_slot(unsigned long addr)
 #endif /* !CONFIG_OPTPROBES */
 
 #ifdef CONFIG_KRETPROBES
+#ifdef CONFIG_KRETPROBE_ON_RETHOOK
+static nokprobe_inline bool is_kretprobe_trampoline(unsigned long addr)
+{
+	return is_rethook_trampoline(addr);
+}
+
+static nokprobe_inline
+unsigned long kretprobe_find_ret_addr(struct task_struct *tsk, void *fp,
+				      struct llist_node **cur)
+{
+	return rethook_find_ret_addr(tsk, (unsigned long)fp, cur);
+}
+#else
 static nokprobe_inline bool is_kretprobe_trampoline(unsigned long addr)
 {
 	return (void *)addr == kretprobe_trampoline_addr();
@@ -516,6 +562,7 @@ static nokprobe_inline bool is_kretprobe_trampoline(unsigned long addr)
 
 unsigned long kretprobe_find_ret_addr(struct task_struct *tsk, void *fp,
 				      struct llist_node **cur);
+#endif
 #else
 static nokprobe_inline bool is_kretprobe_trampoline(unsigned long addr)
 {
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 185badc780b7..dbe57df2e199 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1237,6 +1237,27 @@ void kprobes_inc_nmissed_count(struct kprobe *p)
 }
 NOKPROBE_SYMBOL(kprobes_inc_nmissed_count);
 
+static struct kprobe kprobe_busy = {
+	.addr = (void *) get_kprobe,
+};
+
+void kprobe_busy_begin(void)
+{
+	struct kprobe_ctlblk *kcb;
+
+	preempt_disable();
+	__this_cpu_write(current_kprobe, &kprobe_busy);
+	kcb = get_kprobe_ctlblk();
+	kcb->kprobe_status = KPROBE_HIT_ACTIVE;
+}
+
+void kprobe_busy_end(void)
+{
+	__this_cpu_write(current_kprobe, NULL);
+	preempt_enable();
+}
+
+#if !defined(CONFIG_KRETPROBE_ON_RETHOOK)
 static void free_rp_inst_rcu(struct rcu_head *head)
 {
 	struct kretprobe_instance *ri = container_of(head, struct kretprobe_instance, rcu);
@@ -1258,26 +1279,6 @@ static void recycle_rp_inst(struct kretprobe_instance *ri)
 }
 NOKPROBE_SYMBOL(recycle_rp_inst);
 
-static struct kprobe kprobe_busy = {
-	.addr = (void *) get_kprobe,
-};
-
-void kprobe_busy_begin(void)
-{
-	struct kprobe_ctlblk *kcb;
-
-	preempt_disable();
-	__this_cpu_write(current_kprobe, &kprobe_busy);
-	kcb = get_kprobe_ctlblk();
-	kcb->kprobe_status = KPROBE_HIT_ACTIVE;
-}
-
-void kprobe_busy_end(void)
-{
-	__this_cpu_write(current_kprobe, NULL);
-	preempt_enable();
-}
-
 /*
  * This function is called from delayed_put_task_struct() when a task is
  * dead and cleaned up to recycle any kretprobe instances associated with
@@ -1327,6 +1328,7 @@ static inline void free_rp_inst(struct kretprobe *rp)
 		rp->rph = NULL;
 	}
 }
+#endif	/* !CONFIG_KRETPROBE_ON_RETHOOK */
 
 /* Add the new probe to 'ap->list'. */
 static int add_new_kprobe(struct kprobe *ap, struct kprobe *p)
@@ -1925,6 +1927,7 @@ static struct notifier_block kprobe_exceptions_nb = {
 
 #ifdef CONFIG_KRETPROBES
 
+#if !defined(CONFIG_KRETPROBE_ON_RETHOOK)
 /* This assumes the 'tsk' is the current task or the is not running. */
 static kprobe_opcode_t *__kretprobe_find_ret_addr(struct task_struct *tsk,
 						  struct llist_node **cur)
@@ -2087,6 +2090,57 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
 	return 0;
 }
 NOKPROBE_SYMBOL(pre_handler_kretprobe);
+#else /* CONFIG_KRETPROBE_ON_RETHOOK */
+/*
+ * This kprobe pre_handler is registered with every kretprobe. When probe
+ * hits it will set up the return probe.
+ */
+static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
+{
+	struct kretprobe *rp = container_of(p, struct kretprobe, kp);
+	struct kretprobe_instance *ri;
+	struct rethook_node *rhn;
+
+	rhn = rethook_try_get(rp->rh);
+	if (!rhn) {
+		rp->nmissed++;
+		return 0;
+	}
+
+	ri = container_of(rhn, struct kretprobe_instance, node);
+
+	if (rp->entry_handler && rp->entry_handler(ri, regs))
+		rethook_recycle(rhn);
+	else
+		rethook_hook(rhn, regs, kprobe_ftrace(p));
+
+	return 0;
+}
+NOKPROBE_SYMBOL(pre_handler_kretprobe);
+
+static void kretprobe_rethook_handler(struct rethook_node *rh, void *data,
+				      struct pt_regs *regs)
+{
+	struct kretprobe *rp = (struct kretprobe *)data;
+	struct kretprobe_instance *ri;
+	struct kprobe_ctlblk *kcb;
+
+	/* The data must NOT be null. This means rethook data structure is broken. */
+	if (WARN_ON_ONCE(!data))
+		return;
+
+	__this_cpu_write(current_kprobe, &rp->kp);
+	kcb = get_kprobe_ctlblk();
+	kcb->kprobe_status = KPROBE_HIT_ACTIVE;
+
+	ri = container_of(rh, struct kretprobe_instance, node);
+	rp->handler(ri, regs);
+
+	__this_cpu_write(current_kprobe, NULL);
+}
+NOKPROBE_SYMBOL(kretprobe_rethook_handler);
+
+#endif /* !CONFIG_KRETPROBE_ON_RETHOOK */
 
 /**
  * kprobe_on_func_entry() -- check whether given address is function entry
@@ -2155,6 +2209,29 @@ int register_kretprobe(struct kretprobe *rp)
 		rp->maxactive = num_possible_cpus();
 #endif
 	}
+#ifdef CONFIG_KRETPROBE_ON_RETHOOK
+	rp->rh = rethook_alloc((void *)rp, kretprobe_rethook_handler);
+	if (!rp->rh)
+		return -ENOMEM;
+
+	for (i = 0; i < rp->maxactive; i++) {
+		inst = kzalloc(sizeof(struct kretprobe_instance) +
+			       rp->data_size, GFP_KERNEL);
+		if (inst == NULL) {
+			rethook_free(rp->rh);
+			rp->rh = NULL;
+			return -ENOMEM;
+		}
+		rethook_add_node(rp->rh, &inst->node);
+	}
+	rp->nmissed = 0;
+	/* Establish function entry probe point */
+	ret = register_kprobe(&rp->kp);
+	if (ret != 0) {
+		rethook_free(rp->rh);
+		rp->rh = NULL;
+	}
+#else	/* !CONFIG_KRETPROBE_ON_RETHOOK */
 	rp->freelist.head = NULL;
 	rp->rph = kzalloc(sizeof(struct kretprobe_holder), GFP_KERNEL);
 	if (!rp->rph)
@@ -2179,6 +2256,7 @@ int register_kretprobe(struct kretprobe *rp)
 	ret = register_kprobe(&rp->kp);
 	if (ret != 0)
 		free_rp_inst(rp);
+#endif
 	return ret;
 }
 EXPORT_SYMBOL_GPL(register_kretprobe);
@@ -2217,7 +2295,11 @@ void unregister_kretprobes(struct kretprobe **rps, int num)
 	for (i = 0; i < num; i++) {
 		if (__unregister_kprobe_top(&rps[i]->kp) < 0)
 			rps[i]->kp.addr = NULL;
+#ifdef CONFIG_KRETPROBE_ON_RETHOOK
+		rethook_free(rps[i]->rh);
+#else
 		rps[i]->rph->rp = NULL;
+#endif
 	}
 	mutex_unlock(&kprobe_mutex);
 
@@ -2225,7 +2307,9 @@ void unregister_kretprobes(struct kretprobe **rps, int num)
 	for (i = 0; i < num; i++) {
 		if (rps[i]->kp.addr) {
 			__unregister_kprobe_bottom(&rps[i]->kp);
+#ifndef CONFIG_KRETPROBE_ON_RETHOOK
 			free_rp_inst(rps[i]);
+#endif
 		}
 	}
 }
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index b62fd785b599..47cebef78532 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -1433,7 +1433,7 @@ __kretprobe_trace_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
 	fbuffer.regs = regs;
 	entry = fbuffer.entry = ring_buffer_event_data(fbuffer.event);
 	entry->func = (unsigned long)tk->rp.kp.addr;
-	entry->ret_ip = (unsigned long)ri->ret_addr;
+	entry->ret_ip = get_kretprobe_retaddr(ri);
 	store_trace_args(&entry[1], &tk->tp, regs, sizeof(*entry), dsize);
 
 	trace_event_buffer_commit(&fbuffer);
@@ -1628,7 +1628,7 @@ kretprobe_perf_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
 		return;
 
 	entry->func = (unsigned long)tk->rp.kp.addr;
-	entry->ret_ip = (unsigned long)ri->ret_addr;
+	entry->ret_ip = get_kretprobe_retaddr(ri);
 	store_trace_args(&entry[1], &tk->tp, regs, sizeof(*entry), dsize);
 	perf_trace_buf_submit(entry, size, rctx, call->event.type, 1, regs,
 			      head, NULL);


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH bpf-next 2/2] rethook: kprobes: x86: Replace kretprobe with rethook on x86
  2022-03-25  4:28 [PATCH bpf-next 0/2] kprobes: rethook: x86: Replace kretprobe trampoline with rethook Masami Hiramatsu
  2022-03-25  4:28 ` [PATCH bpf-next 1/2] kprobes: Use rethook for kretprobe if possible Masami Hiramatsu
@ 2022-03-25  4:29 ` Masami Hiramatsu
  2022-03-25 10:09   ` Peter Zijlstra
  2022-03-25 10:11   ` Peter Zijlstra
  2022-03-25 11:40 ` [PATCH bpf-next 3/2] x86,rethook: Fix arch_rethook_trampoline() to generate a complete pt_regs Peter Zijlstra
  2 siblings, 2 replies; 9+ messages in thread
From: Masami Hiramatsu @ 2022-03-25  4:29 UTC (permalink / raw)
  To: Alexei Starovoitov, Andrii Nakryiko, x86, Peter Zijlstra
  Cc: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Dan Carpenter, kernel-janitors, Steven Rostedt, Masami Hiramatsu,
	Jiri Olsa, bpf, linux-kernel

Replaces the kretprobe code with rethook on x86. With this patch,
kretprobe on x86 uses the rethook instead of kretprobe specific
trampoline code.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 arch/x86/Kconfig                 |    1 
 arch/x86/include/asm/unwind.h    |   23 +++----
 arch/x86/kernel/Makefile         |    1 
 arch/x86/kernel/kprobes/common.h |    1 
 arch/x86/kernel/kprobes/core.c   |  107 ----------------------------------
 arch/x86/kernel/rethook.c        |  121 ++++++++++++++++++++++++++++++++++++++
 6 files changed, 135 insertions(+), 119 deletions(-)
 create mode 100644 arch/x86/kernel/rethook.c

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index db3bedcbf6be..e85b52a6f612 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -224,6 +224,7 @@ config X86
 	select HAVE_KPROBES_ON_FTRACE
 	select HAVE_FUNCTION_ERROR_INJECTION
 	select HAVE_KRETPROBES
+	select HAVE_RETHOOK
 	select HAVE_KVM
 	select HAVE_LIVEPATCH			if X86_64
 	select HAVE_MIXED_BREAKPOINTS_REGS
diff --git a/arch/x86/include/asm/unwind.h b/arch/x86/include/asm/unwind.h
index 2a1f8734416d..7cede4dc21f0 100644
--- a/arch/x86/include/asm/unwind.h
+++ b/arch/x86/include/asm/unwind.h
@@ -4,7 +4,7 @@
 
 #include <linux/sched.h>
 #include <linux/ftrace.h>
-#include <linux/kprobes.h>
+#include <linux/rethook.h>
 #include <asm/ptrace.h>
 #include <asm/stacktrace.h>
 
@@ -16,7 +16,7 @@ struct unwind_state {
 	unsigned long stack_mask;
 	struct task_struct *task;
 	int graph_idx;
-#ifdef CONFIG_KRETPROBES
+#if defined(CONFIG_RETHOOK)
 	struct llist_node *kr_cur;
 #endif
 	bool error;
@@ -104,19 +104,18 @@ void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size,
 #endif
 
 static inline
-unsigned long unwind_recover_kretprobe(struct unwind_state *state,
-				       unsigned long addr, unsigned long *addr_p)
+unsigned long unwind_recover_rethook(struct unwind_state *state,
+				     unsigned long addr, unsigned long *addr_p)
 {
-#ifdef CONFIG_KRETPROBES
-	return is_kretprobe_trampoline(addr) ?
-		kretprobe_find_ret_addr(state->task, addr_p, &state->kr_cur) :
-		addr;
-#else
-	return addr;
+#ifdef CONFIG_RETHOOK
+	if (is_rethook_trampoline(addr))
+		return rethook_find_ret_addr(state->task, (unsigned long)addr_p,
+					     &state->kr_cur);
 #endif
+	return addr;
 }
 
-/* Recover the return address modified by kretprobe and ftrace_graph. */
+/* Recover the return address modified by rethook and ftrace_graph. */
 static inline
 unsigned long unwind_recover_ret_addr(struct unwind_state *state,
 				     unsigned long addr, unsigned long *addr_p)
@@ -125,7 +124,7 @@ unsigned long unwind_recover_ret_addr(struct unwind_state *state,
 
 	ret = ftrace_graph_ret_addr(state->task, &state->graph_idx,
 				    addr, addr_p);
-	return unwind_recover_kretprobe(state, ret, addr_p);
+	return unwind_recover_rethook(state, ret, addr_p);
 }
 
 /*
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 6462e3dd98f4..c41ef42adbe8 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -103,6 +103,7 @@ obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += ftrace.o
 obj-$(CONFIG_FTRACE_SYSCALLS)	+= ftrace.o
 obj-$(CONFIG_X86_TSC)		+= trace_clock.o
 obj-$(CONFIG_TRACING)		+= trace.o
+obj-$(CONFIG_RETHOOK)		+= rethook.o
 obj-$(CONFIG_CRASH_CORE)	+= crash_core_$(BITS).o
 obj-$(CONFIG_KEXEC_CORE)	+= machine_kexec_$(BITS).o
 obj-$(CONFIG_KEXEC_CORE)	+= relocate_kernel_$(BITS).o crash.o
diff --git a/arch/x86/kernel/kprobes/common.h b/arch/x86/kernel/kprobes/common.h
index 7d3a2e2daf01..c993521d4933 100644
--- a/arch/x86/kernel/kprobes/common.h
+++ b/arch/x86/kernel/kprobes/common.h
@@ -6,6 +6,7 @@
 
 #include <asm/asm.h>
 #include <asm/frame.h>
+#include <asm/insn.h>
 
 #ifdef CONFIG_X86_64
 
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 8ef933c03afa..7c4ab8870da4 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -811,18 +811,6 @@ set_current_kprobe(struct kprobe *p, struct pt_regs *regs,
 		= (regs->flags & X86_EFLAGS_IF);
 }
 
-void arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
-{
-	unsigned long *sara = stack_addr(regs);
-
-	ri->ret_addr = (kprobe_opcode_t *) *sara;
-	ri->fp = sara;
-
-	/* Replace the return addr with trampoline addr */
-	*sara = (unsigned long) &__kretprobe_trampoline;
-}
-NOKPROBE_SYMBOL(arch_prepare_kretprobe);
-
 static void kprobe_post_process(struct kprobe *cur, struct pt_regs *regs,
 			       struct kprobe_ctlblk *kcb)
 {
@@ -1023,101 +1011,6 @@ int kprobe_int3_handler(struct pt_regs *regs)
 }
 NOKPROBE_SYMBOL(kprobe_int3_handler);
 
-/*
- * When a retprobed function returns, this code saves registers and
- * calls trampoline_handler() runs, which calls the kretprobe's handler.
- */
-asm(
-	".text\n"
-	".global __kretprobe_trampoline\n"
-	".type __kretprobe_trampoline, @function\n"
-	"__kretprobe_trampoline:\n"
-#ifdef CONFIG_X86_64
-	ANNOTATE_NOENDBR
-	/* Push a fake return address to tell the unwinder it's a kretprobe. */
-	"	pushq $__kretprobe_trampoline\n"
-	UNWIND_HINT_FUNC
-	/* Save the 'sp - 8', this will be fixed later. */
-	"	pushq %rsp\n"
-	"	pushfq\n"
-	SAVE_REGS_STRING
-	"	movq %rsp, %rdi\n"
-	"	call trampoline_handler\n"
-	RESTORE_REGS_STRING
-	/* In trampoline_handler(), 'regs->flags' is copied to 'regs->sp'. */
-	"	addq $8, %rsp\n"
-	"	popfq\n"
-#else
-	/* Push a fake return address to tell the unwinder it's a kretprobe. */
-	"	pushl $__kretprobe_trampoline\n"
-	UNWIND_HINT_FUNC
-	/* Save the 'sp - 4', this will be fixed later. */
-	"	pushl %esp\n"
-	"	pushfl\n"
-	SAVE_REGS_STRING
-	"	movl %esp, %eax\n"
-	"	call trampoline_handler\n"
-	RESTORE_REGS_STRING
-	/* In trampoline_handler(), 'regs->flags' is copied to 'regs->sp'. */
-	"	addl $4, %esp\n"
-	"	popfl\n"
-#endif
-	ASM_RET
-	".size __kretprobe_trampoline, .-__kretprobe_trampoline\n"
-);
-NOKPROBE_SYMBOL(__kretprobe_trampoline);
-/*
- * __kretprobe_trampoline() skips updating frame pointer. The frame pointer
- * saved in trampoline_handler() points to the real caller function's
- * frame pointer. Thus the __kretprobe_trampoline() doesn't have a
- * standard stack frame with CONFIG_FRAME_POINTER=y.
- * Let's mark it non-standard function. Anyway, FP unwinder can correctly
- * unwind without the hint.
- */
-STACK_FRAME_NON_STANDARD_FP(__kretprobe_trampoline);
-
-/* This is called from kretprobe_trampoline_handler(). */
-void arch_kretprobe_fixup_return(struct pt_regs *regs,
-				 kprobe_opcode_t *correct_ret_addr)
-{
-	unsigned long *frame_pointer = &regs->sp + 1;
-
-	/* Replace fake return address with real one. */
-	*frame_pointer = (unsigned long)correct_ret_addr;
-}
-
-/*
- * Called from __kretprobe_trampoline
- */
-__used __visible void trampoline_handler(struct pt_regs *regs)
-{
-	unsigned long *frame_pointer;
-
-	/* fixup registers */
-	regs->cs = __KERNEL_CS;
-#ifdef CONFIG_X86_32
-	regs->gs = 0;
-#endif
-	regs->ip = (unsigned long)&__kretprobe_trampoline;
-	regs->orig_ax = ~0UL;
-	regs->sp += sizeof(long);
-	frame_pointer = &regs->sp + 1;
-
-	/*
-	 * The return address at 'frame_pointer' is recovered by the
-	 * arch_kretprobe_fixup_return() which called from the
-	 * kretprobe_trampoline_handler().
-	 */
-	kretprobe_trampoline_handler(regs, frame_pointer);
-
-	/*
-	 * Copy FLAGS to 'pt_regs::sp' so that __kretprobe_trapmoline()
-	 * can do RET right after POPF.
-	 */
-	regs->sp = regs->flags;
-}
-NOKPROBE_SYMBOL(trampoline_handler);
-
 int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
 {
 	struct kprobe *cur = kprobe_running();
diff --git a/arch/x86/kernel/rethook.c b/arch/x86/kernel/rethook.c
new file mode 100644
index 000000000000..3e916361c33b
--- /dev/null
+++ b/arch/x86/kernel/rethook.c
@@ -0,0 +1,121 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * x86 implementation of rethook. Mostly copied from arch/x86/kernel/kprobes/core.c.
+ */
+#include <linux/bug.h>
+#include <linux/rethook.h>
+#include <linux/kprobes.h>
+#include <linux/objtool.h>
+
+#include "kprobes/common.h"
+
+__visible void arch_rethook_trampoline_callback(struct pt_regs *regs);
+
+/*
+ * When a target function returns, this code saves registers and calls
+ * arch_rethook_trampoline_callback(), which calls the rethook handler.
+ */
+asm(
+	".text\n"
+	".global arch_rethook_trampoline\n"
+	".type arch_rethook_trampoline, @function\n"
+	"arch_rethook_trampoline:\n"
+#ifdef CONFIG_X86_64
+	ANNOTATE_NOENDBR	/* This is only jumped from ret instruction */
+	/* Push a fake return address to tell the unwinder it's a kretprobe. */
+	"	pushq $arch_rethook_trampoline\n"
+	UNWIND_HINT_FUNC
+	/* Save the 'sp - 8', this will be fixed later. */
+	"	pushq %rsp\n"
+	"	pushfq\n"
+	SAVE_REGS_STRING
+	"	movq %rsp, %rdi\n"
+	"	call arch_rethook_trampoline_callback\n"
+	RESTORE_REGS_STRING
+	/* In the callback function, 'regs->flags' is copied to 'regs->sp'. */
+	"	addq $8, %rsp\n"
+	"	popfq\n"
+#else
+	/* Push a fake return address to tell the unwinder it's a kretprobe. */
+	"	pushl $arch_rethook_trampoline\n"
+	UNWIND_HINT_FUNC
+	/* Save the 'sp - 4', this will be fixed later. */
+	"	pushl %esp\n"
+	"	pushfl\n"
+	SAVE_REGS_STRING
+	"	movl %esp, %eax\n"
+	"	call arch_rethook_trampoline_callback\n"
+	RESTORE_REGS_STRING
+	/* In the callback function, 'regs->flags' is copied to 'regs->sp'. */
+	"	addl $4, %esp\n"
+	"	popfl\n"
+#endif
+	ASM_RET
+	".size arch_rethook_trampoline, .-arch_rethook_trampoline\n"
+);
+NOKPROBE_SYMBOL(arch_rethook_trampoline);
+
+/*
+ * Called from arch_rethook_trampoline
+ */
+__used __visible void arch_rethook_trampoline_callback(struct pt_regs *regs)
+{
+	unsigned long *frame_pointer;
+
+	/* fixup registers */
+	regs->cs = __KERNEL_CS;
+#ifdef CONFIG_X86_32
+	regs->gs = 0;
+#endif
+	regs->ip = (unsigned long)&arch_rethook_trampoline;
+	regs->orig_ax = ~0UL;
+	regs->sp += sizeof(long);
+	frame_pointer = &regs->sp + 1;
+
+	/*
+	 * The return address at 'frame_pointer' is recovered by the
+	 * arch_rethook_fixup_return() which called from this
+	 * rethook_trampoline_handler().
+	 */
+	rethook_trampoline_handler(regs, (unsigned long)frame_pointer);
+
+	/*
+	 * Copy FLAGS to 'pt_regs::sp' so that arch_rethook_trapmoline()
+	 * can do RET right after POPF.
+	 */
+	regs->sp = regs->flags;
+}
+NOKPROBE_SYMBOL(arch_rethook_trampoline_callback);
+
+/*
+ * arch_rethook_trampoline() skips updating frame pointer. The frame pointer
+ * saved in arch_rethook_trampoline_callback() points to the real caller
+ * function's frame pointer. Thus the arch_rethook_trampoline() doesn't have
+ * a standard stack frame with CONFIG_FRAME_POINTER=y.
+ * Let's mark it non-standard function. Anyway, FP unwinder can correctly
+ * unwind without the hint.
+ */
+STACK_FRAME_NON_STANDARD_FP(arch_rethook_trampoline);
+
+/* This is called from rethook_trampoline_handler(). */
+void arch_rethook_fixup_return(struct pt_regs *regs,
+			       unsigned long correct_ret_addr)
+{
+	unsigned long *frame_pointer = &regs->sp + 1;
+
+	/* Replace fake return address with real one. */
+	*frame_pointer = correct_ret_addr;
+}
+NOKPROBE_SYMBOL(arch_rethook_fixup_return);
+
+void arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mcount)
+{
+	unsigned long *stack = (unsigned long *)regs->sp;
+
+	rh->ret_addr = stack[0];
+	rh->frame = regs->sp;
+
+	/* Replace the return addr with trampoline addr */
+	stack[0] = (unsigned long) arch_rethook_trampoline;
+}
+NOKPROBE_SYMBOL(arch_rethook_prepare);


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf-next 2/2] rethook: kprobes: x86: Replace kretprobe with rethook on x86
  2022-03-25  4:29 ` [PATCH bpf-next 2/2] rethook: kprobes: x86: Replace kretprobe with rethook on x86 Masami Hiramatsu
@ 2022-03-25 10:09   ` Peter Zijlstra
  2022-03-25 11:51     ` Jiri Olsa
  2022-03-25 12:56     ` Masami Hiramatsu
  2022-03-25 10:11   ` Peter Zijlstra
  1 sibling, 2 replies; 9+ messages in thread
From: Peter Zijlstra @ 2022-03-25 10:09 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Alexei Starovoitov, Andrii Nakryiko, x86, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Dan Carpenter,
	kernel-janitors, Steven Rostedt, Jiri Olsa, bpf, linux-kernel,
	Josh Poimboeuf

On Fri, Mar 25, 2022 at 01:29:01PM +0900, Masami Hiramatsu wrote:
> Replaces the kretprobe code with rethook on x86. With this patch,
> kretprobe on x86 uses the rethook instead of kretprobe specific
> trampoline code.
> 
> Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> ---
>  arch/x86/Kconfig                 |    1 
>  arch/x86/include/asm/unwind.h    |   23 +++----
>  arch/x86/kernel/Makefile         |    1 
>  arch/x86/kernel/kprobes/common.h |    1 
>  arch/x86/kernel/kprobes/core.c   |  107 ----------------------------------
>  arch/x86/kernel/rethook.c        |  121 ++++++++++++++++++++++++++++++++++++++
>  6 files changed, 135 insertions(+), 119 deletions(-)
>  create mode 100644 arch/x86/kernel/rethook.c

I'm thinking you'll find it builds much better with this on...

diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index 2de3c8c5eba9..794fdef2501a 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -550,15 +550,15 @@ bool unwind_next_frame(struct unwind_state *state)
 		}
 		/*
 		 * There is a small chance to interrupt at the entry of
-		 * __kretprobe_trampoline() where the ORC info doesn't exist.
-		 * That point is right after the RET to __kretprobe_trampoline()
+		 * arch_rethook_trampoline() where the ORC info doesn't exist.
+		 * That point is right after the RET to arch_rethook_trampoline()
 		 * which was modified return address.
-		 * At that point, the @addr_p of the unwind_recover_kretprobe()
+		 * At that point, the @addr_p of the unwind_recover_rethook()
 		 * (this has to point the address of the stack entry storing
 		 * the modified return address) must be "SP - (a stack entry)"
 		 * because SP is incremented by the RET.
 		 */
-		state->ip = unwind_recover_kretprobe(state, state->ip,
+		state->ip = unwind_recover_rethook(state, state->ip,
 				(unsigned long *)(state->sp - sizeof(long)));
 		state->regs = (struct pt_regs *)sp;
 		state->prev_regs = NULL;
@@ -573,7 +573,7 @@ bool unwind_next_frame(struct unwind_state *state)
 			goto err;
 		}
 		/* See UNWIND_HINT_TYPE_REGS case comment. */
-		state->ip = unwind_recover_kretprobe(state, state->ip,
+		state->ip = unwind_recover_rethook(state, state->ip,
 				(unsigned long *)(state->sp - sizeof(long)));
 
 		if (state->full_regs)

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf-next 2/2] rethook: kprobes: x86: Replace kretprobe with rethook on x86
  2022-03-25  4:29 ` [PATCH bpf-next 2/2] rethook: kprobes: x86: Replace kretprobe with rethook on x86 Masami Hiramatsu
  2022-03-25 10:09   ` Peter Zijlstra
@ 2022-03-25 10:11   ` Peter Zijlstra
  1 sibling, 0 replies; 9+ messages in thread
From: Peter Zijlstra @ 2022-03-25 10:11 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Alexei Starovoitov, Andrii Nakryiko, x86, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Dan Carpenter,
	kernel-janitors, Steven Rostedt, Jiri Olsa, bpf, linux-kernel

On Fri, Mar 25, 2022 at 01:29:01PM +0900, Masami Hiramatsu wrote:
> +	/* Push a fake return address to tell the unwinder it's a kretprobe. */
> +	/* Push a fake return address to tell the unwinder it's a kretprobe. */

s/kretprobe/rethook/ went missing

^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH bpf-next 3/2] x86,rethook: Fix arch_rethook_trampoline() to generate a complete pt_regs
  2022-03-25  4:28 [PATCH bpf-next 0/2] kprobes: rethook: x86: Replace kretprobe trampoline with rethook Masami Hiramatsu
  2022-03-25  4:28 ` [PATCH bpf-next 1/2] kprobes: Use rethook for kretprobe if possible Masami Hiramatsu
  2022-03-25  4:29 ` [PATCH bpf-next 2/2] rethook: kprobes: x86: Replace kretprobe with rethook on x86 Masami Hiramatsu
@ 2022-03-25 11:40 ` Peter Zijlstra
  2022-03-25 13:01   ` Masami Hiramatsu
  2 siblings, 1 reply; 9+ messages in thread
From: Peter Zijlstra @ 2022-03-25 11:40 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Alexei Starovoitov, Andrii Nakryiko, x86, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Dan Carpenter,
	kernel-janitors, Steven Rostedt, Jiri Olsa, bpf, linux-kernel


You lost the regs->ss bit again..

Boot tested on tigerlake with IBT enabled -- passed the boot time
kretprobe selftests.

---

Subject: x86,rethook: Fix arch_rethook_trampoline() to generate a complete pt_regs
From: Peter Zijlstra <peterz@infradead.org>
Date: Fri Mar 25 10:25:56 CET 2022

Currently arch_rethook_trampoline() generates an almost complete
pt_regs on-stack, everything except regs->ss that is, that currently
points to the fake return address, which is not a valid segment
descriptor.

Since interpretation of regs->[sb]p should be done in the context of
regs->ss, and we have code actually doing that (see
arch/x86/lib/insn-eval.c for instance), complete the job by also
pushing ss.

This ensures that anybody who does do look at regs->ss doesn't
mysteriously malfunction, avoiding much future pain.

Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
---
 arch/x86/kernel/rethook.c |   24 +++++++++++++-----------
 1 file changed, 13 insertions(+), 11 deletions(-)

--- a/arch/x86/kernel/rethook.c
+++ b/arch/x86/kernel/rethook.c
@@ -25,29 +25,31 @@ asm(
 	/* Push a fake return address to tell the unwinder it's a kretprobe. */
 	"	pushq $arch_rethook_trampoline\n"
 	UNWIND_HINT_FUNC
-	/* Save the 'sp - 8', this will be fixed later. */
+	"       pushq $" __stringify(__KERNEL_DS) "\n"
+	/* Save the 'sp - 16', this will be fixed later. */
 	"	pushq %rsp\n"
 	"	pushfq\n"
 	SAVE_REGS_STRING
 	"	movq %rsp, %rdi\n"
 	"	call arch_rethook_trampoline_callback\n"
 	RESTORE_REGS_STRING
-	/* In the callback function, 'regs->flags' is copied to 'regs->sp'. */
-	"	addq $8, %rsp\n"
+	/* In the callback function, 'regs->flags' is copied to 'regs->ss'. */
+	"	addq $16, %rsp\n"
 	"	popfq\n"
 #else
 	/* Push a fake return address to tell the unwinder it's a kretprobe. */
 	"	pushl $arch_rethook_trampoline\n"
 	UNWIND_HINT_FUNC
-	/* Save the 'sp - 4', this will be fixed later. */
+	"	pushl %ss\n"
+	/* Save the 'sp - 8', this will be fixed later. */
 	"	pushl %esp\n"
 	"	pushfl\n"
 	SAVE_REGS_STRING
 	"	movl %esp, %eax\n"
 	"	call arch_rethook_trampoline_callback\n"
 	RESTORE_REGS_STRING
-	/* In the callback function, 'regs->flags' is copied to 'regs->sp'. */
-	"	addl $4, %esp\n"
+	/* In the callback function, 'regs->flags' is copied to 'regs->ss'. */
+	"	addl $8, %esp\n"
 	"	popfl\n"
 #endif
 	ASM_RET
@@ -69,8 +71,8 @@ __used __visible void arch_rethook_tramp
 #endif
 	regs->ip = (unsigned long)&arch_rethook_trampoline;
 	regs->orig_ax = ~0UL;
-	regs->sp += sizeof(long);
-	frame_pointer = &regs->sp + 1;
+	regs->sp += 2*sizeof(long);
+	frame_pointer = (long *)(regs + 1);
 
 	/*
 	 * The return address at 'frame_pointer' is recovered by the
@@ -80,10 +82,10 @@ __used __visible void arch_rethook_tramp
 	rethook_trampoline_handler(regs, (unsigned long)frame_pointer);
 
 	/*
-	 * Copy FLAGS to 'pt_regs::sp' so that arch_rethook_trapmoline()
+	 * Copy FLAGS to 'pt_regs::ss' so that arch_rethook_trapmoline()
 	 * can do RET right after POPF.
 	 */
-	regs->sp = regs->flags;
+	*(unsigned long *)&regs->ss = regs->flags;
 }
 NOKPROBE_SYMBOL(arch_rethook_trampoline_callback);
 
@@ -101,7 +103,7 @@ STACK_FRAME_NON_STANDARD_FP(arch_rethook
 void arch_rethook_fixup_return(struct pt_regs *regs,
 			       unsigned long correct_ret_addr)
 {
-	unsigned long *frame_pointer = &regs->sp + 1;
+	unsigned long *frame_pointer = (void *)(regs + 1);
 
 	/* Replace fake return address with real one. */
 	*frame_pointer = correct_ret_addr;

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf-next 2/2] rethook: kprobes: x86: Replace kretprobe with rethook on x86
  2022-03-25 10:09   ` Peter Zijlstra
@ 2022-03-25 11:51     ` Jiri Olsa
  2022-03-25 12:56     ` Masami Hiramatsu
  1 sibling, 0 replies; 9+ messages in thread
From: Jiri Olsa @ 2022-03-25 11:51 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Masami Hiramatsu, Alexei Starovoitov, Andrii Nakryiko, x86,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	Dan Carpenter, kernel-janitors, Steven Rostedt, Jiri Olsa, bpf,
	linux-kernel, Josh Poimboeuf

On Fri, Mar 25, 2022 at 11:09:40AM +0100, Peter Zijlstra wrote:
> On Fri, Mar 25, 2022 at 01:29:01PM +0900, Masami Hiramatsu wrote:
> > Replaces the kretprobe code with rethook on x86. With this patch,
> > kretprobe on x86 uses the rethook instead of kretprobe specific
> > trampoline code.
> > 
> > Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> > ---
> >  arch/x86/Kconfig                 |    1 
> >  arch/x86/include/asm/unwind.h    |   23 +++----
> >  arch/x86/kernel/Makefile         |    1 
> >  arch/x86/kernel/kprobes/common.h |    1 
> >  arch/x86/kernel/kprobes/core.c   |  107 ----------------------------------
> >  arch/x86/kernel/rethook.c        |  121 ++++++++++++++++++++++++++++++++++++++
> >  6 files changed, 135 insertions(+), 119 deletions(-)
> >  create mode 100644 arch/x86/kernel/rethook.c
> 
> I'm thinking you'll find it builds much better with this on...

I built it with Peter's fix and ran bpf selftests, looks good

Tested-by: Jiri Olsa <jolsa@kernel.org>

thanks,
jirka

> 
> diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
> index 2de3c8c5eba9..794fdef2501a 100644
> --- a/arch/x86/kernel/unwind_orc.c
> +++ b/arch/x86/kernel/unwind_orc.c
> @@ -550,15 +550,15 @@ bool unwind_next_frame(struct unwind_state *state)
>  		}
>  		/*
>  		 * There is a small chance to interrupt at the entry of
> -		 * __kretprobe_trampoline() where the ORC info doesn't exist.
> -		 * That point is right after the RET to __kretprobe_trampoline()
> +		 * arch_rethook_trampoline() where the ORC info doesn't exist.
> +		 * That point is right after the RET to arch_rethook_trampoline()
>  		 * which was modified return address.
> -		 * At that point, the @addr_p of the unwind_recover_kretprobe()
> +		 * At that point, the @addr_p of the unwind_recover_rethook()
>  		 * (this has to point the address of the stack entry storing
>  		 * the modified return address) must be "SP - (a stack entry)"
>  		 * because SP is incremented by the RET.
>  		 */
> -		state->ip = unwind_recover_kretprobe(state, state->ip,
> +		state->ip = unwind_recover_rethook(state, state->ip,
>  				(unsigned long *)(state->sp - sizeof(long)));
>  		state->regs = (struct pt_regs *)sp;
>  		state->prev_regs = NULL;
> @@ -573,7 +573,7 @@ bool unwind_next_frame(struct unwind_state *state)
>  			goto err;
>  		}
>  		/* See UNWIND_HINT_TYPE_REGS case comment. */
> -		state->ip = unwind_recover_kretprobe(state, state->ip,
> +		state->ip = unwind_recover_rethook(state, state->ip,
>  				(unsigned long *)(state->sp - sizeof(long)));
>  
>  		if (state->full_regs)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf-next 2/2] rethook: kprobes: x86: Replace kretprobe with rethook on x86
  2022-03-25 10:09   ` Peter Zijlstra
  2022-03-25 11:51     ` Jiri Olsa
@ 2022-03-25 12:56     ` Masami Hiramatsu
  1 sibling, 0 replies; 9+ messages in thread
From: Masami Hiramatsu @ 2022-03-25 12:56 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Alexei Starovoitov, Andrii Nakryiko, x86, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Dan Carpenter,
	kernel-janitors, Steven Rostedt, Jiri Olsa, bpf, linux-kernel,
	Josh Poimboeuf

On Fri, 25 Mar 2022 11:09:40 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> On Fri, Mar 25, 2022 at 01:29:01PM +0900, Masami Hiramatsu wrote:
> > Replaces the kretprobe code with rethook on x86. With this patch,
> > kretprobe on x86 uses the rethook instead of kretprobe specific
> > trampoline code.
> > 
> > Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
> > ---
> >  arch/x86/Kconfig                 |    1 
> >  arch/x86/include/asm/unwind.h    |   23 +++----
> >  arch/x86/kernel/Makefile         |    1 
> >  arch/x86/kernel/kprobes/common.h |    1 
> >  arch/x86/kernel/kprobes/core.c   |  107 ----------------------------------
> >  arch/x86/kernel/rethook.c        |  121 ++++++++++++++++++++++++++++++++++++++
> >  6 files changed, 135 insertions(+), 119 deletions(-)
> >  create mode 100644 arch/x86/kernel/rethook.c
> 
> I'm thinking you'll find it builds much better with this on...

Oops, Thanks. I've tested it with framepointer based unwinder...

> 
> diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
> index 2de3c8c5eba9..794fdef2501a 100644
> --- a/arch/x86/kernel/unwind_orc.c
> +++ b/arch/x86/kernel/unwind_orc.c
> @@ -550,15 +550,15 @@ bool unwind_next_frame(struct unwind_state *state)
>  		}
>  		/*
>  		 * There is a small chance to interrupt at the entry of
> -		 * __kretprobe_trampoline() where the ORC info doesn't exist.
> -		 * That point is right after the RET to __kretprobe_trampoline()
> +		 * arch_rethook_trampoline() where the ORC info doesn't exist.
> +		 * That point is right after the RET to arch_rethook_trampoline()
>  		 * which was modified return address.
> -		 * At that point, the @addr_p of the unwind_recover_kretprobe()
> +		 * At that point, the @addr_p of the unwind_recover_rethook()
>  		 * (this has to point the address of the stack entry storing
>  		 * the modified return address) must be "SP - (a stack entry)"
>  		 * because SP is incremented by the RET.
>  		 */
> -		state->ip = unwind_recover_kretprobe(state, state->ip,
> +		state->ip = unwind_recover_rethook(state, state->ip,
>  				(unsigned long *)(state->sp - sizeof(long)));
>  		state->regs = (struct pt_regs *)sp;
>  		state->prev_regs = NULL;
> @@ -573,7 +573,7 @@ bool unwind_next_frame(struct unwind_state *state)
>  			goto err;
>  		}
>  		/* See UNWIND_HINT_TYPE_REGS case comment. */
> -		state->ip = unwind_recover_kretprobe(state, state->ip,
> +		state->ip = unwind_recover_rethook(state, state->ip,
>  				(unsigned long *)(state->sp - sizeof(long)));
>  
>  		if (state->full_regs)


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH bpf-next 3/2] x86,rethook: Fix arch_rethook_trampoline() to generate a complete pt_regs
  2022-03-25 11:40 ` [PATCH bpf-next 3/2] x86,rethook: Fix arch_rethook_trampoline() to generate a complete pt_regs Peter Zijlstra
@ 2022-03-25 13:01   ` Masami Hiramatsu
  0 siblings, 0 replies; 9+ messages in thread
From: Masami Hiramatsu @ 2022-03-25 13:01 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Alexei Starovoitov, Andrii Nakryiko, x86, Thomas Gleixner,
	Ingo Molnar, Borislav Petkov, Dave Hansen, Dan Carpenter,
	kernel-janitors, Steven Rostedt, Jiri Olsa, bpf, linux-kernel

On Fri, 25 Mar 2022 12:40:12 +0100
Peter Zijlstra <peterz@infradead.org> wrote:

> 
> You lost the regs->ss bit again..

Yeah, I planed to split it in another series with optprobe update
because the optprobe also skips regs->ss. Anyway...

> 
> Boot tested on tigerlake with IBT enabled -- passed the boot time
> kretprobe selftests.
> 
> ---
> 
> Subject: x86,rethook: Fix arch_rethook_trampoline() to generate a complete pt_regs
> From: Peter Zijlstra <peterz@infradead.org>
> Date: Fri Mar 25 10:25:56 CET 2022
> 
> Currently arch_rethook_trampoline() generates an almost complete
> pt_regs on-stack, everything except regs->ss that is, that currently
> points to the fake return address, which is not a valid segment
> descriptor.
> 
> Since interpretation of regs->[sb]p should be done in the context of
> regs->ss, and we have code actually doing that (see
> arch/x86/lib/insn-eval.c for instance), complete the job by also
> pushing ss.

This looks good to me.

Acked-by: Masami Hiramatsu <mhiramat@kernel.org>

Thank you!

> 
> This ensures that anybody who does do look at regs->ss doesn't
> mysteriously malfunction, avoiding much future pain.
> 
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
>  arch/x86/kernel/rethook.c |   24 +++++++++++++-----------
>  1 file changed, 13 insertions(+), 11 deletions(-)
> 
> --- a/arch/x86/kernel/rethook.c
> +++ b/arch/x86/kernel/rethook.c
> @@ -25,29 +25,31 @@ asm(
>  	/* Push a fake return address to tell the unwinder it's a kretprobe. */
>  	"	pushq $arch_rethook_trampoline\n"
>  	UNWIND_HINT_FUNC
> -	/* Save the 'sp - 8', this will be fixed later. */
> +	"       pushq $" __stringify(__KERNEL_DS) "\n"
> +	/* Save the 'sp - 16', this will be fixed later. */
>  	"	pushq %rsp\n"
>  	"	pushfq\n"
>  	SAVE_REGS_STRING
>  	"	movq %rsp, %rdi\n"
>  	"	call arch_rethook_trampoline_callback\n"
>  	RESTORE_REGS_STRING
> -	/* In the callback function, 'regs->flags' is copied to 'regs->sp'. */
> -	"	addq $8, %rsp\n"
> +	/* In the callback function, 'regs->flags' is copied to 'regs->ss'. */
> +	"	addq $16, %rsp\n"
>  	"	popfq\n"
>  #else
>  	/* Push a fake return address to tell the unwinder it's a kretprobe. */
>  	"	pushl $arch_rethook_trampoline\n"
>  	UNWIND_HINT_FUNC
> -	/* Save the 'sp - 4', this will be fixed later. */
> +	"	pushl %ss\n"
> +	/* Save the 'sp - 8', this will be fixed later. */
>  	"	pushl %esp\n"
>  	"	pushfl\n"
>  	SAVE_REGS_STRING
>  	"	movl %esp, %eax\n"
>  	"	call arch_rethook_trampoline_callback\n"
>  	RESTORE_REGS_STRING
> -	/* In the callback function, 'regs->flags' is copied to 'regs->sp'. */
> -	"	addl $4, %esp\n"
> +	/* In the callback function, 'regs->flags' is copied to 'regs->ss'. */
> +	"	addl $8, %esp\n"
>  	"	popfl\n"
>  #endif
>  	ASM_RET
> @@ -69,8 +71,8 @@ __used __visible void arch_rethook_tramp
>  #endif
>  	regs->ip = (unsigned long)&arch_rethook_trampoline;
>  	regs->orig_ax = ~0UL;
> -	regs->sp += sizeof(long);
> -	frame_pointer = &regs->sp + 1;
> +	regs->sp += 2*sizeof(long);
> +	frame_pointer = (long *)(regs + 1);
>  
>  	/*
>  	 * The return address at 'frame_pointer' is recovered by the
> @@ -80,10 +82,10 @@ __used __visible void arch_rethook_tramp
>  	rethook_trampoline_handler(regs, (unsigned long)frame_pointer);
>  
>  	/*
> -	 * Copy FLAGS to 'pt_regs::sp' so that arch_rethook_trapmoline()
> +	 * Copy FLAGS to 'pt_regs::ss' so that arch_rethook_trapmoline()
>  	 * can do RET right after POPF.
>  	 */
> -	regs->sp = regs->flags;
> +	*(unsigned long *)&regs->ss = regs->flags;
>  }
>  NOKPROBE_SYMBOL(arch_rethook_trampoline_callback);
>  
> @@ -101,7 +103,7 @@ STACK_FRAME_NON_STANDARD_FP(arch_rethook
>  void arch_rethook_fixup_return(struct pt_regs *regs,
>  			       unsigned long correct_ret_addr)
>  {
> -	unsigned long *frame_pointer = &regs->sp + 1;
> +	unsigned long *frame_pointer = (void *)(regs + 1);
>  
>  	/* Replace fake return address with real one. */
>  	*frame_pointer = correct_ret_addr;


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2022-03-25 13:01 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-25  4:28 [PATCH bpf-next 0/2] kprobes: rethook: x86: Replace kretprobe trampoline with rethook Masami Hiramatsu
2022-03-25  4:28 ` [PATCH bpf-next 1/2] kprobes: Use rethook for kretprobe if possible Masami Hiramatsu
2022-03-25  4:29 ` [PATCH bpf-next 2/2] rethook: kprobes: x86: Replace kretprobe with rethook on x86 Masami Hiramatsu
2022-03-25 10:09   ` Peter Zijlstra
2022-03-25 11:51     ` Jiri Olsa
2022-03-25 12:56     ` Masami Hiramatsu
2022-03-25 10:11   ` Peter Zijlstra
2022-03-25 11:40 ` [PATCH bpf-next 3/2] x86,rethook: Fix arch_rethook_trampoline() to generate a complete pt_regs Peter Zijlstra
2022-03-25 13:01   ` Masami Hiramatsu

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.