bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH -tip 0/3] x86/kprobes,orc: Fix ORC unwinder to unwind stack with optimized probe
@ 2021-03-31  5:44 Masami Hiramatsu
  2021-03-31  5:44 ` [RFC PATCH -tip 1/3] x86/kprobes: Add ORC information to optprobe template Masami Hiramatsu
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Masami Hiramatsu @ 2021-03-31  5:44 UTC (permalink / raw)
  To: Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, tglx, kernel-team, yhs, Steven Rostedt

Hello,

These patches fixes the ORC unwinder to unwind optprobe trampoline
code on the stack correctly.
This patchset is based on the kretporbe and stacktrace fix series v5
which I sent last week.

https://lore.kernel.org/bpf/161676170650.330141.6214727134265514123.stgit@devnote2/

Note that I just confirmed the it fixes the case where the 
stacktrace called from the optprobe handler. So this should be
carefully reviewed.


Here is the test code;

cd /sys/kernel/debug/tracing
echo > trace
echo p full_proxy_read+5 >> kprobe_events
echo 1 > events/kprobes/enable
sleep 1 # wait for optimization
echo stacktrace:1 > events/kprobes/p_full_proxy_read_5/trigger
echo 1 > options/sym-offset
cat /sys/kernel/debug/kprobes/list


Without this,

             cat-138     [001] ...1     6.567662: p_full_proxy_read_5: (full_proxy_read+0x5/0x80)
             cat-138     [001] ...1     6.567711: <stack trace>
 => kprobe_trace_func+0x1d0/0x2c0
 => kprobe_dispatcher+0x39/0x60
 => opt_pre_handler+0x4f/0x80
 => optimized_callback+0xc3/0xf0
 => 0xffffffffa0006032
 => 0
 => 0


With this patch,

             cat-137     [007] ...1    17.542848: p_full_proxy_read_5: (full_proxy_read+0x5/0x80)
             cat-137     [007] ...1    17.542963: <stack trace>
 => kprobe_trace_func+0x1d0/0x2c0
 => kprobe_dispatcher+0x39/0x60
 => opt_pre_handler+0x4f/0x80
 => optimized_callback+0xc3/0xf0
 => full_proxy_read+0x5/0x80
 => vfs_read+0xab/0x1a0
 => ksys_read+0x5f/0xe0
 => do_syscall_64+0x33/0x40
 => entry_SYSCALL_64_after_hwframe+0x44/0xae
 => 0
 => 0


Thank you,

---

Masami Hiramatsu (3):
      x86/kprobes: Add ORC information to optprobe template
      kprobes: Add functions to find instruction buffer entry address
      x86/kprobes,orc: Unwind optprobe trampoline correctly


 arch/x86/include/asm/kprobes.h |    6 +++
 arch/x86/kernel/kprobes/opt.c  |   72 +++++++++++++++++++++++++++++++++++++---
 arch/x86/kernel/unwind_orc.c   |   15 +++++++-
 include/linux/kprobes.h        |    8 ++++
 kernel/kprobes.c               |   25 ++++++++++----
 5 files changed, 111 insertions(+), 15 deletions(-)

--
Masami Hiramatsu (Linaro) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* [RFC PATCH -tip 1/3] x86/kprobes: Add ORC information to optprobe template
  2021-03-31  5:44 [RFC PATCH -tip 0/3] x86/kprobes,orc: Fix ORC unwinder to unwind stack with optimized probe Masami Hiramatsu
@ 2021-03-31  5:44 ` Masami Hiramatsu
  2021-03-31  5:44 ` [RFC PATCH -tip 2/3] kprobes: Add functions to find instruction buffer entry address Masami Hiramatsu
  2021-03-31  5:44 ` [RFC PATCH -tip 3/3] x86/kprobes,orc: Unwind optprobe trampoline correctly Masami Hiramatsu
  2 siblings, 0 replies; 8+ messages in thread
From: Masami Hiramatsu @ 2021-03-31  5:44 UTC (permalink / raw)
  To: Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, tglx, kernel-team, yhs, Steven Rostedt

As same as kretprobe_trampoline, move the optprobe template
code in the text for making ORC information on that.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 arch/x86/kernel/kprobes/opt.c |   18 +++++++++++++-----
 1 file changed, 13 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index 4299fc865732..6d26e5cf2ba2 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -100,9 +100,13 @@ static void synthesize_set_arg1(kprobe_opcode_t *addr, unsigned long val)
 	*(unsigned long *)addr = val;
 }
 
+static void
+optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs);
+
 asm (
-			".pushsection .rodata\n"
+			".text\n"
 			"optprobe_template_func:\n"
+			".type optprobe_template_func, @function\n"
 			".global optprobe_template_entry\n"
 			"optprobe_template_entry:\n"
 #ifdef CONFIG_X86_64
@@ -120,7 +124,8 @@ asm (
 			ASM_NOP5
 			".global optprobe_template_call\n"
 			"optprobe_template_call:\n"
-			ASM_NOP5
+			/* Dummy call for ORC */
+			"	callq optimized_callback\n"
 			/* Move flags to rsp */
 			"	movq 18*8(%rsp), %rdx\n"
 			"	movq %rdx, 19*8(%rsp)\n"
@@ -141,7 +146,8 @@ asm (
 			ASM_NOP5
 			".global optprobe_template_call\n"
 			"optprobe_template_call:\n"
-			ASM_NOP5
+			/* Dummy call for ORC */
+			"	call optimized_callback\n"
 			/* Move flags into esp */
 			"	movl 14*4(%esp), %edx\n"
 			"	movl %edx, 15*4(%esp)\n"
@@ -152,10 +158,12 @@ asm (
 #endif
 			".global optprobe_template_end\n"
 			"optprobe_template_end:\n"
-			".popsection\n");
+			/* Dummy return for objtool */
+			"	ret\n"
+			".size optprobe_template_func, .-optprobe_template_func\n"
+);
 
 void optprobe_template_func(void);
-STACK_FRAME_NON_STANDARD(optprobe_template_func);
 
 #define TMPL_CLAC_IDX \
 	((long)optprobe_template_clac - (long)optprobe_template_entry)


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC PATCH -tip 2/3] kprobes: Add functions to find instruction buffer entry address
  2021-03-31  5:44 [RFC PATCH -tip 0/3] x86/kprobes,orc: Fix ORC unwinder to unwind stack with optimized probe Masami Hiramatsu
  2021-03-31  5:44 ` [RFC PATCH -tip 1/3] x86/kprobes: Add ORC information to optprobe template Masami Hiramatsu
@ 2021-03-31  5:44 ` Masami Hiramatsu
  2021-03-31  5:44 ` [RFC PATCH -tip 3/3] x86/kprobes,orc: Unwind optprobe trampoline correctly Masami Hiramatsu
  2 siblings, 0 replies; 8+ messages in thread
From: Masami Hiramatsu @ 2021-03-31  5:44 UTC (permalink / raw)
  To: Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, tglx, kernel-team, yhs, Steven Rostedt

Add find_kprobe_{insn,optinsn}_slot_entry() functions to find
corresponding entry address of the kprobe instrurction buffer
which includes given address.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 include/linux/kprobes.h |    8 ++++++++
 kernel/kprobes.c        |   25 ++++++++++++++++++-------
 2 files changed, 26 insertions(+), 7 deletions(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index f530f82a046d..08adfd6cf562 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -305,6 +305,8 @@ extern void __free_insn_slot(struct kprobe_insn_cache *c,
 /* sleep-less address checking routine  */
 extern bool __is_insn_slot_addr(struct kprobe_insn_cache *c,
 				unsigned long addr);
+extern unsigned long
+__find_insn_slot_entry(struct kprobe_insn_cache *c, unsigned long addr);
 
 #define DEFINE_INSN_CACHE_OPS(__name)					\
 extern struct kprobe_insn_cache kprobe_##__name##_slots;		\
@@ -322,6 +324,12 @@ static inline void free_##__name##_slot(kprobe_opcode_t *slot, int dirty)\
 static inline bool is_kprobe_##__name##_slot(unsigned long addr)	\
 {									\
 	return __is_insn_slot_addr(&kprobe_##__name##_slots, addr);	\
+}									\
+									\
+static inline unsigned long						\
+find_kprobe_##__name##_slot_entry(unsigned long addr)			\
+{									\
+	return __find_insn_slot_entry(&kprobe_##__name##_slots, addr);	\
 }
 #define KPROBE_INSN_PAGE_SYM		"kprobe_insn_page"
 #define KPROBE_OPTINSN_PAGE_SYM		"kprobe_optinsn_page"
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 4ce3e6f5d28d..b62635969fa6 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -277,26 +277,37 @@ void __free_insn_slot(struct kprobe_insn_cache *c,
 }
 
 /*
- * Check given address is on the page of kprobe instruction slots.
- * This will be used for checking whether the address on a stack
- * is on a text area or not.
+ * Find the entry address of the kprobe instruction slots where the
+ * @addr points.
  */
-bool __is_insn_slot_addr(struct kprobe_insn_cache *c, unsigned long addr)
+unsigned long
+__find_insn_slot_entry(struct kprobe_insn_cache *c, unsigned long addr)
 {
 	struct kprobe_insn_page *kip;
-	bool ret = false;
+	unsigned long entry = 0, index;
 
 	rcu_read_lock();
 	list_for_each_entry_rcu(kip, &c->pages, list) {
 		if (addr >= (unsigned long)kip->insns &&
 		    addr < (unsigned long)kip->insns + PAGE_SIZE) {
-			ret = true;
+			index = (addr - (unsigned long)kip->insns) / c->insn_size;
+			entry = (unsigned long)kip->insns + (index * c->insn_size);
 			break;
 		}
 	}
 	rcu_read_unlock();
 
-	return ret;
+	return entry;
+}
+
+/*
+ * Check given address is on the page of kprobe instruction slots.
+ * This will be used for checking whether the address on a stack
+ * is on a text area or not.
+ */
+bool __is_insn_slot_addr(struct kprobe_insn_cache *c, unsigned long addr)
+{
+	return __find_insn_slot_entry(c, addr) != 0;
 }
 
 int kprobe_cache_get_kallsym(struct kprobe_insn_cache *c, unsigned int *symnum,


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [RFC PATCH -tip 3/3] x86/kprobes,orc: Unwind optprobe trampoline correctly
  2021-03-31  5:44 [RFC PATCH -tip 0/3] x86/kprobes,orc: Fix ORC unwinder to unwind stack with optimized probe Masami Hiramatsu
  2021-03-31  5:44 ` [RFC PATCH -tip 1/3] x86/kprobes: Add ORC information to optprobe template Masami Hiramatsu
  2021-03-31  5:44 ` [RFC PATCH -tip 2/3] kprobes: Add functions to find instruction buffer entry address Masami Hiramatsu
@ 2021-03-31  5:44 ` Masami Hiramatsu
  2021-03-31 15:57   ` Josh Poimboeuf
  2021-04-01  1:54   ` Masami Hiramatsu
  2 siblings, 2 replies; 8+ messages in thread
From: Masami Hiramatsu @ 2021-03-31  5:44 UTC (permalink / raw)
  To: Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, tglx, kernel-team, yhs, Steven Rostedt

ORC Unwinder can not unwind if an optprobe trampoline is on the
stack because optprobe trampoline code has no ORC information.

This uses the ORC information on the template code of the
trampoline to adjust the sp register by ORC information and
extract the correct probed address from the optprobe trampoline
address.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 arch/x86/include/asm/kprobes.h |    6 ++++
 arch/x86/kernel/kprobes/opt.c  |   54 ++++++++++++++++++++++++++++++++++++++++
 arch/x86/kernel/unwind_orc.c   |   15 +++++++++--
 3 files changed, 72 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h
index 71ea2eab43d5..9bbc45fcb3f1 100644
--- a/arch/x86/include/asm/kprobes.h
+++ b/arch/x86/include/asm/kprobes.h
@@ -119,9 +119,15 @@ extern int kprobe_exceptions_notify(struct notifier_block *self,
 				    unsigned long val, void *data);
 extern int kprobe_int3_handler(struct pt_regs *regs);
 
+unsigned long recover_optprobe_trampoline(unsigned long addr, unsigned long *sp);
 #else
 
 static inline int kprobe_debug_handler(struct pt_regs *regs) { return 0; }
+static inline unsigned long recover_optprobe_trampoline(unsigned long addr,
+							unsigned long *sp)
+{
+	return addr;
+}
 
 #endif /* CONFIG_KPROBES */
 #endif /* _ASM_X86_KPROBES_H */
diff --git a/arch/x86/kernel/kprobes/opt.c b/arch/x86/kernel/kprobes/opt.c
index 6d26e5cf2ba2..f91922ba4844 100644
--- a/arch/x86/kernel/kprobes/opt.c
+++ b/arch/x86/kernel/kprobes/opt.c
@@ -30,6 +30,9 @@
 #include <asm/set_memory.h>
 #include <asm/sections.h>
 #include <asm/nospec-branch.h>
+#include <asm/orc_types.h>
+
+struct orc_entry *orc_find(unsigned long ip);
 
 #include "common.h"
 
@@ -100,6 +103,21 @@ static void synthesize_set_arg1(kprobe_opcode_t *addr, unsigned long val)
 	*(unsigned long *)addr = val;
 }
 
+/* Extract mov operand */
+static unsigned long extract_set_arg1(kprobe_opcode_t *addr)
+{
+#ifdef CONFIG_X86_64
+	if (addr[0] != 0x48 || addr[1] != 0xbf)
+		return 0;
+	addr += 2;
+#else
+	if (*addr != 0xb8)
+		return 0;
+	addr++;
+#endif
+	return *(unsigned long *)addr;
+}
+
 static void
 optimized_callback(struct optimized_kprobe *op, struct pt_regs *regs);
 
@@ -483,6 +501,42 @@ int arch_prepare_optimized_kprobe(struct optimized_kprobe *op,
 	goto out;
 }
 
+#ifdef CONFIG_UNWINDER_ORC
+unsigned long recover_optprobe_trampoline(unsigned long addr, unsigned long *sp)
+{
+	unsigned long offset, entry, probe_addr;
+	struct optimized_kprobe *op;
+	struct orc_entry *orc;
+
+	entry = find_kprobe_optinsn_slot_entry(addr);
+	if (!entry)
+		return addr;
+
+	offset = addr - entry;
+
+	/* Decode arg1 and get the optprobe */
+	op = (void *)extract_set_arg1((void *)(entry + TMPL_MOVE_IDX));
+	if (!op)
+		return addr;
+
+	probe_addr = (unsigned long)op->kp.addr;
+
+	if (offset < TMPL_END_IDX) {
+		orc = orc_find((unsigned long)optprobe_template_func + offset);
+		if (!orc || orc->sp_reg != ORC_REG_SP)
+			return addr;
+		/*
+		 * Since optprobe trampoline doesn't push caller on the stack,
+		 * need to decrement 1 stack entry size
+		 */
+		*sp += orc->sp_offset - sizeof(long);
+		return probe_addr;
+	} else {
+		return probe_addr + offset - TMPL_END_IDX;
+	}
+}
+#endif
+
 /*
  * Replace breakpoints (INT3) with relative jumps (JMP.d32).
  * Caller must call with locking kprobe_mutex and text_mutex.
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index c70dfeea4552..9f685f9c2358 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -79,7 +79,7 @@ static struct orc_entry *orc_module_find(unsigned long ip)
 #endif
 
 #ifdef CONFIG_DYNAMIC_FTRACE
-static struct orc_entry *orc_find(unsigned long ip);
+struct orc_entry *orc_find(unsigned long ip);
 
 /*
  * Ftrace dynamic trampolines do not have orc entries of their own.
@@ -142,7 +142,7 @@ static struct orc_entry orc_fp_entry = {
 	.end		= 0,
 };
 
-static struct orc_entry *orc_find(unsigned long ip)
+struct orc_entry *orc_find(unsigned long ip)
 {
 	static struct orc_entry *orc;
 
@@ -537,6 +537,7 @@ bool unwind_next_frame(struct unwind_state *state)
 
 		state->ip = unwind_recover_ret_addr(state, state->ip,
 						    (unsigned long *)ip_p);
+		state->ip = recover_optprobe_trampoline(state->ip, &sp);
 		state->sp = sp;
 		state->regs = NULL;
 		state->prev_regs = NULL;
@@ -558,6 +559,14 @@ bool unwind_next_frame(struct unwind_state *state)
 		 */
 		state->ip = unwind_recover_kretprobe(state, state->ip,
 				(unsigned long *)(state->sp - sizeof(long)));
+
+		/*
+		 * The optprobe trampoline has a unique stackframe. It has
+		 * no caller (probed) address on the stack, Thus it has to
+		 * decode the trampoline code and change the stack pointer
+		 * for the next frame, but not change the pt_regs.
+		 */
+		state->ip = recover_optprobe_trampoline(state->ip, &state->sp);
 		state->regs = (struct pt_regs *)sp;
 		state->prev_regs = NULL;
 		state->full_regs = true;
@@ -573,7 +582,7 @@ bool unwind_next_frame(struct unwind_state *state)
 		/* See UNWIND_HINT_TYPE_REGS case comment. */
 		state->ip = unwind_recover_kretprobe(state, state->ip,
 				(unsigned long *)(state->sp - sizeof(long)));
-
+		state->ip = recover_optprobe_trampoline(state->ip, &state->sp);
 		if (state->full_regs)
 			state->prev_regs = state->regs;
 		state->regs = (void *)sp - IRET_FRAME_OFFSET;


^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH -tip 3/3] x86/kprobes,orc: Unwind optprobe trampoline correctly
  2021-03-31  5:44 ` [RFC PATCH -tip 3/3] x86/kprobes,orc: Unwind optprobe trampoline correctly Masami Hiramatsu
@ 2021-03-31 15:57   ` Josh Poimboeuf
  2021-04-01  1:44     ` Masami Hiramatsu
  2021-04-01  1:54   ` Masami Hiramatsu
  1 sibling, 1 reply; 8+ messages in thread
From: Josh Poimboeuf @ 2021-03-31 15:57 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Ingo Molnar, X86 ML, Daniel Xu, linux-kernel, bpf, kuba, mingo,
	ast, tglx, kernel-team, yhs, Steven Rostedt

On Wed, Mar 31, 2021 at 02:44:56PM +0900, Masami Hiramatsu wrote:
> +#ifdef CONFIG_UNWINDER_ORC
> +unsigned long recover_optprobe_trampoline(unsigned long addr, unsigned long *sp)
> +{
> +	unsigned long offset, entry, probe_addr;
> +	struct optimized_kprobe *op;
> +	struct orc_entry *orc;
> +
> +	entry = find_kprobe_optinsn_slot_entry(addr);
> +	if (!entry)
> +		return addr;
> +
> +	offset = addr - entry;
> +
> +	/* Decode arg1 and get the optprobe */
> +	op = (void *)extract_set_arg1((void *)(entry + TMPL_MOVE_IDX));
> +	if (!op)
> +		return addr;
> +
> +	probe_addr = (unsigned long)op->kp.addr;
> +
> +	if (offset < TMPL_END_IDX) {
> +		orc = orc_find((unsigned long)optprobe_template_func + offset);
> +		if (!orc || orc->sp_reg != ORC_REG_SP)
> +			return addr;
> +		/*
> +		 * Since optprobe trampoline doesn't push caller on the stack,
> +		 * need to decrement 1 stack entry size
> +		 */
> +		*sp += orc->sp_offset - sizeof(long);
> +		return probe_addr;
> +	} else {
> +		return probe_addr + offset - TMPL_END_IDX;
> +	}
> +}
> +#endif

Hm, I'd like to avoid intertwining kprobes and ORC like this.

ORC unwinds other generated code by assuming the generated code uses a
frame pointer.  Could we do that here?

With CONFIG_FRAME_POINTER, unwinding works because SAVE_REGS_STRING has
ENCODE_FRAME_POINTER, but that's not going to work for ORC.

Instead of these patches, can we 'push %rbp; mov %rsp, %rbp' at the
beginning of the template and 'pop %rbp' at the end?

I guess SAVE_REGS_STRING would need to be smart enough to push the
original saved version of %rbp.  Of course then that breaks the
kretprobe_trampoline() usage, so it may need to be a separate macro.

[ Or make the same change to kretprobe_trampoline().  Then the other
  patch set wouldn't be needed either ;-) ]

Of course the downside is, when you get an interrupt during the frame
pointer setup, unwinding is broken.  But I think that's acceptable for
generated code.  We've lived with that limitation for all code, with
CONFIG_FRAME_POINTER, for many years.

Eventually we may want to have a way to register generated code (and the
ORC for it).

-- 
Josh


^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH -tip 3/3] x86/kprobes,orc: Unwind optprobe trampoline correctly
  2021-03-31 15:57   ` Josh Poimboeuf
@ 2021-04-01  1:44     ` Masami Hiramatsu
  2021-04-01  2:19       ` Masami Hiramatsu
  0 siblings, 1 reply; 8+ messages in thread
From: Masami Hiramatsu @ 2021-04-01  1:44 UTC (permalink / raw)
  To: Josh Poimboeuf
  Cc: Ingo Molnar, X86 ML, Daniel Xu, linux-kernel, bpf, kuba, mingo,
	ast, tglx, kernel-team, yhs, Steven Rostedt

On Wed, 31 Mar 2021 10:57:36 -0500
Josh Poimboeuf <jpoimboe@redhat.com> wrote:

> On Wed, Mar 31, 2021 at 02:44:56PM +0900, Masami Hiramatsu wrote:
> > +#ifdef CONFIG_UNWINDER_ORC
> > +unsigned long recover_optprobe_trampoline(unsigned long addr, unsigned long *sp)
> > +{
> > +	unsigned long offset, entry, probe_addr;
> > +	struct optimized_kprobe *op;
> > +	struct orc_entry *orc;
> > +
> > +	entry = find_kprobe_optinsn_slot_entry(addr);
> > +	if (!entry)
> > +		return addr;
> > +
> > +	offset = addr - entry;
> > +
> > +	/* Decode arg1 and get the optprobe */
> > +	op = (void *)extract_set_arg1((void *)(entry + TMPL_MOVE_IDX));
> > +	if (!op)
> > +		return addr;
> > +
> > +	probe_addr = (unsigned long)op->kp.addr;
> > +
> > +	if (offset < TMPL_END_IDX) {
> > +		orc = orc_find((unsigned long)optprobe_template_func + offset);
> > +		if (!orc || orc->sp_reg != ORC_REG_SP)
> > +			return addr;
> > +		/*
> > +		 * Since optprobe trampoline doesn't push caller on the stack,
> > +		 * need to decrement 1 stack entry size
> > +		 */
> > +		*sp += orc->sp_offset - sizeof(long);
> > +		return probe_addr;
> > +	} else {
> > +		return probe_addr + offset - TMPL_END_IDX;
> > +	}
> > +}
> > +#endif
> 
> Hm, I'd like to avoid intertwining kprobes and ORC like this.
> 
> ORC unwinds other generated code by assuming the generated code uses a
> frame pointer.  Could we do that here?

No, because the optprobe is not a function call. I considered to make
it call, but since it has to execute copied instructions directly on
the trampoline code (without changing stack frame) it is not possible.

> With CONFIG_FRAME_POINTER, unwinding works because SAVE_REGS_STRING has
> ENCODE_FRAME_POINTER, but that's not going to work for ORC.

Even in that case, the problem is that any interrupt can happen
before doing ENCODE_FRAME_POINTER. I think this ENCODE_FRAME_POINTER
in the SAVE_REGS_STRING is for probing right before the target
function setup a frame pointer.

> Instead of these patches, can we 'push %rbp; mov %rsp, %rbp' at the
> beginning of the template and 'pop %rbp' at the end?

No, since the trampoline code is not called, it is jumped into.
This means there is no "return address" in the stack. If we setup
the frame, there is no return address, thus it might stop there.
(Moreover, optprobe can copy multiple instructins on trampoline
buffer, since relative jump consumes 5bytes. where is the "return address"?)

> 
> I guess SAVE_REGS_STRING would need to be smart enough to push the
> original saved version of %rbp.  Of course then that breaks the
> kretprobe_trampoline() usage, so it may need to be a separate macro.
> 
> [ Or make the same change to kretprobe_trampoline().  Then the other
>   patch set wouldn't be needed either ;-) ]

Hmm, I don't think it is a good idea which making such change on the
optimized (hot) path only for the stack tracing. Moreover, that maybe
not transparent with the stack made by int3.

> Of course the downside is, when you get an interrupt during the frame
> pointer setup, unwinding is broken.  But I think that's acceptable for
> generated code.  We've lived with that limitation for all code, with
> CONFIG_FRAME_POINTER, for many years.

But above code can fix such issue too. To fix a corner case, non-generic
code may be required, even it is not so simple.

> 
> Eventually we may want to have a way to register generated code (and the
> ORC for it).
> 
> -- 
> Josh
> 


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH -tip 3/3] x86/kprobes,orc: Unwind optprobe trampoline correctly
  2021-03-31  5:44 ` [RFC PATCH -tip 3/3] x86/kprobes,orc: Unwind optprobe trampoline correctly Masami Hiramatsu
  2021-03-31 15:57   ` Josh Poimboeuf
@ 2021-04-01  1:54   ` Masami Hiramatsu
  1 sibling, 0 replies; 8+ messages in thread
From: Masami Hiramatsu @ 2021-04-01  1:54 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Josh Poimboeuf, Ingo Molnar, X86 ML, Daniel Xu, linux-kernel,
	bpf, kuba, mingo, ast, tglx, kernel-team, yhs, Steven Rostedt

On Wed, 31 Mar 2021 14:44:56 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:


> +#ifdef CONFIG_UNWINDER_ORC
> +unsigned long recover_optprobe_trampoline(unsigned long addr, unsigned long *sp)
> +{
> +	unsigned long offset, entry, probe_addr;
> +	struct optimized_kprobe *op;
> +	struct orc_entry *orc;
> +
> +	entry = find_kprobe_optinsn_slot_entry(addr);
> +	if (!entry)
> +		return addr;
> +
> +	offset = addr - entry;
> +
> +	/* Decode arg1 and get the optprobe */
> +	op = (void *)extract_set_arg1((void *)(entry + TMPL_MOVE_IDX));
> +	if (!op)
> +		return addr;
> +
> +	probe_addr = (unsigned long)op->kp.addr;
> +
> +	if (offset < TMPL_END_IDX) {

Maybe I should add a comment here.

/*
 * Since this is on the trampoline code based on the template code,
 * ORC information on the template code can be used for adjusting
 * stack pointer. The template code may change the stack pointer to
 * store pt_regs.
 */

> +		orc = orc_find((unsigned long)optprobe_template_func + offset);
> +		if (!orc || orc->sp_reg != ORC_REG_SP)
> +			return addr;
> +		/*
> +		 * Since optprobe trampoline doesn't push caller on the stack,
> +		 * need to decrement 1 stack entry size
> +		 */
> +		*sp += orc->sp_offset - sizeof(long);
> +		return probe_addr;
> +	} else {

/*
 * In this case, the address is on the instruction copied from probed
 * address, and jump instruction. Here the stack pointer is exactly
 * the same as the value where it was copied by the kprobes.
 */

> +		return probe_addr + offset - TMPL_END_IDX;
> +	}
> +}
> +#endif
> +


Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [RFC PATCH -tip 3/3] x86/kprobes,orc: Unwind optprobe trampoline correctly
  2021-04-01  1:44     ` Masami Hiramatsu
@ 2021-04-01  2:19       ` Masami Hiramatsu
  0 siblings, 0 replies; 8+ messages in thread
From: Masami Hiramatsu @ 2021-04-01  2:19 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Josh Poimboeuf, Ingo Molnar, X86 ML, Daniel Xu, linux-kernel,
	bpf, kuba, mingo, ast, tglx, kernel-team, yhs, Steven Rostedt

On Thu, 1 Apr 2021 10:44:52 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> On Wed, 31 Mar 2021 10:57:36 -0500
> Josh Poimboeuf <jpoimboe@redhat.com> wrote:
> 
> > On Wed, Mar 31, 2021 at 02:44:56PM +0900, Masami Hiramatsu wrote:
> > > +#ifdef CONFIG_UNWINDER_ORC
> > > +unsigned long recover_optprobe_trampoline(unsigned long addr, unsigned long *sp)
> > > +{
> > > +	unsigned long offset, entry, probe_addr;
> > > +	struct optimized_kprobe *op;
> > > +	struct orc_entry *orc;
> > > +
> > > +	entry = find_kprobe_optinsn_slot_entry(addr);
> > > +	if (!entry)
> > > +		return addr;
> > > +
> > > +	offset = addr - entry;
> > > +
> > > +	/* Decode arg1 and get the optprobe */
> > > +	op = (void *)extract_set_arg1((void *)(entry + TMPL_MOVE_IDX));
> > > +	if (!op)
> > > +		return addr;
> > > +
> > > +	probe_addr = (unsigned long)op->kp.addr;
> > > +
> > > +	if (offset < TMPL_END_IDX) {
> > > +		orc = orc_find((unsigned long)optprobe_template_func + offset);
> > > +		if (!orc || orc->sp_reg != ORC_REG_SP)
> > > +			return addr;
> > > +		/*
> > > +		 * Since optprobe trampoline doesn't push caller on the stack,
> > > +		 * need to decrement 1 stack entry size
> > > +		 */
> > > +		*sp += orc->sp_offset - sizeof(long);
> > > +		return probe_addr;
> > > +	} else {
> > > +		return probe_addr + offset - TMPL_END_IDX;
> > > +	}
> > > +}
> > > +#endif
> > 
> > Hm, I'd like to avoid intertwining kprobes and ORC like this.
> > 
> > ORC unwinds other generated code by assuming the generated code uses a
> > frame pointer.  Could we do that here?
> 
> No, because the optprobe is not a function call. I considered to make
> it call, but since it has to execute copied instructions directly on
> the trampoline code (without changing stack frame) it is not possible.
> 
> > With CONFIG_FRAME_POINTER, unwinding works because SAVE_REGS_STRING has
> > ENCODE_FRAME_POINTER, but that's not going to work for ORC.
> 
> Even in that case, the problem is that any interrupt can happen
> before doing ENCODE_FRAME_POINTER. I think this ENCODE_FRAME_POINTER
> in the SAVE_REGS_STRING is for probing right before the target
> function setup a frame pointer.
> 
> > Instead of these patches, can we 'push %rbp; mov %rsp, %rbp' at the
> > beginning of the template and 'pop %rbp' at the end?
> 
> No, since the trampoline code is not called, it is jumped into.
> This means there is no "return address" in the stack. If we setup
> the frame, there is no return address, thus it might stop there.
> (Moreover, optprobe can copy multiple instructins on trampoline
> buffer, since relative jump consumes 5bytes. where is the "return address"?)
> 
> > 
> > I guess SAVE_REGS_STRING would need to be smart enough to push the
> > original saved version of %rbp.  Of course then that breaks the
> > kretprobe_trampoline() usage, so it may need to be a separate macro.
> > 
> > [ Or make the same change to kretprobe_trampoline().  Then the other
> >   patch set wouldn't be needed either ;-) ]
> 
> Hmm, I don't think it is a good idea which making such change on the
> optimized (hot) path only for the stack tracing. Moreover, that maybe
> not transparent with the stack made by int3.
> 
> > Of course the downside is, when you get an interrupt during the frame
> > pointer setup, unwinding is broken.  But I think that's acceptable for
> > generated code.  We've lived with that limitation for all code, with
> > CONFIG_FRAME_POINTER, for many years.
> 
> But above code can fix such issue too. To fix a corner case, non-generic
> code may be required, even it is not so simple.

Hmm, I would like to confirm your policy on ORC unwinder. If it doesn't
care the stacktrace from the interrupt handler, I think your suggestion
is OK. But in that case, from a developer viewpoint, I need to recommend
users to configure CONFIG_UNWIND_FRAME=y when CONFIG_KPROBES=y.

> > Eventually we may want to have a way to register generated code (and the
> > ORC for it).

I see, but the generated code usually does not have a generic way to
handle it. E.g. bpf has a solid entry point, but kretprobe trampoline's
entry point is any "RET", optprobe trampoline's entry point is a jump
which is also generated (patched) ...

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2021-04-01  2:21 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-31  5:44 [RFC PATCH -tip 0/3] x86/kprobes,orc: Fix ORC unwinder to unwind stack with optimized probe Masami Hiramatsu
2021-03-31  5:44 ` [RFC PATCH -tip 1/3] x86/kprobes: Add ORC information to optprobe template Masami Hiramatsu
2021-03-31  5:44 ` [RFC PATCH -tip 2/3] kprobes: Add functions to find instruction buffer entry address Masami Hiramatsu
2021-03-31  5:44 ` [RFC PATCH -tip 3/3] x86/kprobes,orc: Unwind optprobe trampoline correctly Masami Hiramatsu
2021-03-31 15:57   ` Josh Poimboeuf
2021-04-01  1:44     ` Masami Hiramatsu
2021-04-01  2:19       ` Masami Hiramatsu
2021-04-01  1:54   ` Masami Hiramatsu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).