linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86
@ 2021-07-29 14:05 Masami Hiramatsu
  2021-07-29 14:06 ` [PATCH -tip v10 01/16] ia64: kprobes: Fix to pass correct trampoline address to the handler Masami Hiramatsu
                   ` (17 more replies)
  0 siblings, 18 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:05 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

Hello,

This is the 10th version of the series to fix the stacktrace with kretprobe on x86.

The previous version is here;

 https://lore.kernel.org/bpf/162601048053.1318837.1550594515476777588.stgit@devnote2/

This version is rebased on top of new kprobes cleanup series(*1) and merging
Josh's objtool update series (*2)(*3) as [6/16] and [7/16].

(*1) https://lore.kernel.org/bpf/162748615977.59465.13262421617578791515.stgit@devnote2/
(*2) https://lore.kernel.org/bpf/20210710192433.x5cgjsq2ksvaqnss@treble/
(*3) https://lore.kernel.org/bpf/20210710192514.ghvksi3ozhez4lvb@treble/

Changes from v9:
 - Add Josh's objtool update patches with a build error fix as [6/16] and [7/16].
 - Add a API document for kretprobe_find_ret_addr() and check cur != NULL in [5/16].

With this series, unwinder can unwind stack correctly from ftrace as below;

  # cd /sys/kernel/debug/tracing
  # echo > trace
  # echo 1 > options/sym-offset
  # echo r vfs_read >> kprobe_events
  # echo r full_proxy_read >> kprobe_events
  # echo traceoff:1 > events/kprobes/r_vfs_read_0/trigger
  # echo stacktrace:1 > events/kprobes/r_full_proxy_read_0/trigger
  # echo 1 > events/kprobes/enable
  # cat /sys/kernel/debug/kprobes/list
ffffffff8133b740  r  full_proxy_read+0x0    [FTRACE]
ffffffff812560b0  r  vfs_read+0x0    [FTRACE]
  # echo 0 > events/kprobes/enable
  # cat trace
# tracer: nop
#
# entries-in-buffer/entries-written: 3/3   #P:8
#
#                                _-----=> irqs-off
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| /     delay
#           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
#              | |         |   ||||      |         |
           <...>-134     [007] ...1    16.185877: r_full_proxy_read_0: (vfs_read+0x98/0x180 <- full_proxy_read)
           <...>-134     [007] ...1    16.185901: <stack trace>
 => kretprobe_trace_func+0x209/0x300
 => kretprobe_dispatcher+0x4a/0x70
 => __kretprobe_trampoline_handler+0xd4/0x170
 => trampoline_handler+0x43/0x60
 => kretprobe_trampoline+0x2a/0x50
 => vfs_read+0x98/0x180
 => ksys_read+0x5f/0xe0
 => do_syscall_64+0x37/0x90
 => entry_SYSCALL_64_after_hwframe+0x44/0xae
           <...>-134     [007] ...1    16.185902: r_vfs_read_0: (ksys_read+0x5f/0xe0 <- vfs_read)

This shows the double return probes (vfs_read() and full_proxy_read()) on the stack
correctly unwinded. (vfs_read() returns to 'ksys_read+0x5f' and full_proxy_read()
returns to 'vfs_read+0x98')

This also changes the kretprobe behavisor a bit, now the instraction pointer in
the 'pt_regs' passed to kretprobe user handler is correctly set the real return
address. So user handlers can get it via instruction_pointer() API, and can use
stack_trace_save_regs().

You can also get this series from 
 git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git kprobes/kretprobe-stackfix-v9


Thank you,

---

Josh Poimboeuf (3):
      objtool: Add frame-pointer-specific function ignore
      objtool: Ignore unwind hints for ignored functions
      x86/kprobes: Add UNWIND_HINT_FUNC on kretprobe_trampoline()

Masami Hiramatsu (13):
      ia64: kprobes: Fix to pass correct trampoline address to the handler
      kprobes: treewide: Replace arch_deref_entry_point() with dereference_symbol_descriptor()
      kprobes: treewide: Remove trampoline_address from kretprobe_trampoline_handler()
      kprobes: treewide: Make it harder to refer kretprobe_trampoline directly
      kprobes: Add kretprobe_find_ret_addr() for searching return address
      ARC: Add instruction_pointer_set() API
      ia64: Add instruction_pointer_set() API
      arm: kprobes: Make space for instruction pointer on stack
      kprobes: Enable stacktrace from pt_regs in kretprobe handler
      x86/kprobes: Push a fake return address at kretprobe_trampoline
      x86/unwind: Recover kretprobe trampoline entry
      tracing: Show kretprobe unknown indicator only for kretprobe_trampoline
      x86/kprobes: Fixup return address in generic trampoline handler


 arch/arc/include/asm/kprobes.h                |    2 
 arch/arc/include/asm/ptrace.h                 |    5 +
 arch/arc/kernel/kprobes.c                     |   13 +-
 arch/arm/probes/kprobes/core.c                |   11 +-
 arch/arm64/include/asm/kprobes.h              |    2 
 arch/arm64/kernel/probes/kprobes.c            |    5 -
 arch/arm64/kernel/probes/kprobes_trampoline.S |    4 -
 arch/csky/include/asm/kprobes.h               |    2 
 arch/csky/kernel/probes/kprobes.c             |    4 -
 arch/csky/kernel/probes/kprobes_trampoline.S  |    4 -
 arch/ia64/include/asm/ptrace.h                |    5 +
 arch/ia64/kernel/kprobes.c                    |   15 +--
 arch/mips/kernel/kprobes.c                    |   15 +--
 arch/parisc/kernel/kprobes.c                  |    6 +
 arch/powerpc/include/asm/kprobes.h            |    2 
 arch/powerpc/kernel/kprobes.c                 |   29 ++---
 arch/powerpc/kernel/optprobes.c               |    2 
 arch/powerpc/kernel/stacktrace.c              |    2 
 arch/riscv/include/asm/kprobes.h              |    2 
 arch/riscv/kernel/probes/kprobes.c            |    4 -
 arch/riscv/kernel/probes/kprobes_trampoline.S |    4 -
 arch/s390/include/asm/kprobes.h               |    2 
 arch/s390/kernel/kprobes.c                    |   12 +-
 arch/s390/kernel/stacktrace.c                 |    2 
 arch/sh/include/asm/kprobes.h                 |    2 
 arch/sh/kernel/kprobes.c                      |   12 +-
 arch/sparc/include/asm/kprobes.h              |    2 
 arch/sparc/kernel/kprobes.c                   |   12 +-
 arch/x86/include/asm/kprobes.h                |    1 
 arch/x86/include/asm/unwind.h                 |   23 ++++
 arch/x86/include/asm/unwind_hints.h           |    5 +
 arch/x86/kernel/kprobes/core.c                |   71 ++++++++++---
 arch/x86/kernel/unwind_frame.c                |    3 -
 arch/x86/kernel/unwind_guess.c                |    3 -
 arch/x86/kernel/unwind_orc.c                  |   21 +++-
 include/linux/kprobes.h                       |   44 +++++++-
 include/linux/objtool.h                       |   12 ++
 kernel/kprobes.c                              |  133 +++++++++++++++++++------
 kernel/trace/trace_output.c                   |   17 +--
 lib/error-inject.c                            |    3 -
 tools/include/linux/objtool.h                 |   12 ++
 tools/objtool/check.c                         |    2 
 42 files changed, 360 insertions(+), 172 deletions(-)

--
Masami Hiramatsu (Linaro) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 01/16] ia64: kprobes: Fix to pass correct trampoline address to the handler
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
@ 2021-07-29 14:06 ` Masami Hiramatsu
  2021-07-29 14:06 ` [PATCH -tip v10 02/16] kprobes: treewide: Replace arch_deref_entry_point() with dereference_symbol_descriptor() Masami Hiramatsu
                   ` (16 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:06 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

The following commit:

   Commit e792ff804f49 ("ia64: kprobes: Use generic kretprobe trampoline handler")

Passed the wrong trampoline address to __kretprobe_trampoline_handler(): it
passes the descriptor address instead of function entry address.

Pass the right parameter.

Also use correct symbol dereference function to get the function address
from 'kretprobe_trampoline' - an IA64 special.

Fixes: e792ff804f49 ("ia64: kprobes: Use generic kretprobe trampoline handler")
Cc: stable@vger.kernel.org
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 Changes in v9:
  - Update changelog according to Ingo's suggestion.
  - Add Cc: stable tag.
 Changes in v5:
  - Fix a compile error typo.
---
 arch/ia64/kernel/kprobes.c |    9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/ia64/kernel/kprobes.c b/arch/ia64/kernel/kprobes.c
index 441ed04b1037..d4048518a1d7 100644
--- a/arch/ia64/kernel/kprobes.c
+++ b/arch/ia64/kernel/kprobes.c
@@ -398,7 +398,8 @@ static void kretprobe_trampoline(void)
 
 int __kprobes trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 {
-	regs->cr_iip = __kretprobe_trampoline_handler(regs, kretprobe_trampoline, NULL);
+	regs->cr_iip = __kretprobe_trampoline_handler(regs,
+		dereference_function_descriptor(kretprobe_trampoline), NULL);
 	/*
 	 * By returning a non-zero value, we are telling
 	 * kprobe_handler() that we don't want the post_handler
@@ -414,7 +415,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 	ri->fp = NULL;
 
 	/* Replace the return addr with trampoline addr */
-	regs->b0 = ((struct fnptr *)kretprobe_trampoline)->ip;
+	regs->b0 = (unsigned long)dereference_function_descriptor(kretprobe_trampoline);
 }
 
 /* Check the instruction in the slot is break */
@@ -902,14 +903,14 @@ static struct kprobe trampoline_p = {
 int __init arch_init_kprobes(void)
 {
 	trampoline_p.addr =
-		(kprobe_opcode_t *)((struct fnptr *)kretprobe_trampoline)->ip;
+		dereference_function_descriptor(kretprobe_trampoline);
 	return register_kprobe(&trampoline_p);
 }
 
 int __kprobes arch_trampoline_kprobe(struct kprobe *p)
 {
 	if (p->addr ==
-		(kprobe_opcode_t *)((struct fnptr *)kretprobe_trampoline)->ip)
+		dereference_function_descriptor(kretprobe_trampoline))
 		return 1;
 
 	return 0;


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 02/16] kprobes: treewide: Replace arch_deref_entry_point() with dereference_symbol_descriptor()
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
  2021-07-29 14:06 ` [PATCH -tip v10 01/16] ia64: kprobes: Fix to pass correct trampoline address to the handler Masami Hiramatsu
@ 2021-07-29 14:06 ` Masami Hiramatsu
  2021-07-29 14:06 ` [PATCH -tip v10 03/16] kprobes: treewide: Remove trampoline_address from kretprobe_trampoline_handler() Masami Hiramatsu
                   ` (15 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:06 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

~15 years ago kprobes grew the 'arch_deref_entry_point()' __weak function:

  3d7e33825d87: ("jprobes: make jprobes a little safer for users")

But this is just open-coded dereference_symbol_descriptor() in essence, and
its obscure nature was causing bugs.

Just use the real thing and remove arch_deref_entry_point().

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
---
 Changes in v9:
  - Update changelog according to Ingo's suggestion.
 Changes in v6:
  - Use dereference_symbol_descriptor() so that it can handle address in
    modules correctly.
---
 arch/ia64/kernel/kprobes.c    |    5 -----
 arch/powerpc/kernel/kprobes.c |   11 -----------
 include/linux/kprobes.h       |    1 -
 kernel/kprobes.c              |    7 +------
 lib/error-inject.c            |    3 ++-
 5 files changed, 3 insertions(+), 24 deletions(-)

diff --git a/arch/ia64/kernel/kprobes.c b/arch/ia64/kernel/kprobes.c
index d4048518a1d7..0f8573bbf520 100644
--- a/arch/ia64/kernel/kprobes.c
+++ b/arch/ia64/kernel/kprobes.c
@@ -891,11 +891,6 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
 	return ret;
 }
 
-unsigned long arch_deref_entry_point(void *entry)
-{
-	return ((struct fnptr *)entry)->ip;
-}
-
 static struct kprobe trampoline_p = {
 	.pre_handler = trampoline_probe_handler
 };
diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index cbc28d1a2e1b..077ab73e6e56 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -541,17 +541,6 @@ int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
 }
 NOKPROBE_SYMBOL(kprobe_fault_handler);
 
-unsigned long arch_deref_entry_point(void *entry)
-{
-#ifdef PPC64_ELF_ABI_v1
-	if (!kernel_text_address((unsigned long)entry))
-		return ppc_global_function_entry(entry);
-	else
-#endif
-		return (unsigned long)entry;
-}
-NOKPROBE_SYMBOL(arch_deref_entry_point);
-
 static struct kprobe trampoline_p = {
 	.addr = (kprobe_opcode_t *) &kretprobe_trampoline,
 	.pre_handler = trampoline_probe_handler
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 0ba3f9e316d4..2ed61fcbc89c 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -381,7 +381,6 @@ int register_kprobe(struct kprobe *p);
 void unregister_kprobe(struct kprobe *p);
 int register_kprobes(struct kprobe **kps, int num);
 void unregister_kprobes(struct kprobe **kps, int num);
-unsigned long arch_deref_entry_point(void *);
 
 int register_kretprobe(struct kretprobe *rp);
 void unregister_kretprobe(struct kretprobe *rp);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 8021bccb7770..550042d9a6ef 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1861,11 +1861,6 @@ static struct notifier_block kprobe_exceptions_nb = {
 	.priority = 0x7fffffff /* we need to be notified first */
 };
 
-unsigned long __weak arch_deref_entry_point(void *entry)
-{
-	return (unsigned long)entry;
-}
-
 #ifdef CONFIG_KRETPROBES
 
 unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
@@ -2327,7 +2322,7 @@ static int __init populate_kprobe_blacklist(unsigned long *start,
 	int ret;
 
 	for (iter = start; iter < end; iter++) {
-		entry = arch_deref_entry_point((void *)*iter);
+		entry = (unsigned long)dereference_symbol_descriptor((void *)*iter);
 		ret = kprobe_add_ksym_blacklist(entry);
 		if (ret == -EINVAL)
 			continue;
diff --git a/lib/error-inject.c b/lib/error-inject.c
index c73651b15b76..2ff5ef689d72 100644
--- a/lib/error-inject.c
+++ b/lib/error-inject.c
@@ -8,6 +8,7 @@
 #include <linux/mutex.h>
 #include <linux/list.h>
 #include <linux/slab.h>
+#include <asm/sections.h>
 
 /* Whitelist of symbols that can be overridden for error injection. */
 static LIST_HEAD(error_injection_list);
@@ -64,7 +65,7 @@ static void populate_error_injection_list(struct error_injection_entry *start,
 
 	mutex_lock(&ei_mutex);
 	for (iter = start; iter < end; iter++) {
-		entry = arch_deref_entry_point((void *)iter->addr);
+		entry = (unsigned long)dereference_symbol_descriptor((void *)iter->addr);
 
 		if (!kernel_text_address(entry) ||
 		    !kallsyms_lookup_size_offset(entry, &size, &offset)) {


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 03/16] kprobes: treewide: Remove trampoline_address from kretprobe_trampoline_handler()
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
  2021-07-29 14:06 ` [PATCH -tip v10 01/16] ia64: kprobes: Fix to pass correct trampoline address to the handler Masami Hiramatsu
  2021-07-29 14:06 ` [PATCH -tip v10 02/16] kprobes: treewide: Replace arch_deref_entry_point() with dereference_symbol_descriptor() Masami Hiramatsu
@ 2021-07-29 14:06 ` Masami Hiramatsu
  2021-07-29 14:06 ` [PATCH -tip v10 04/16] kprobes: treewide: Make it harder to refer kretprobe_trampoline directly Masami Hiramatsu
                   ` (14 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:06 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

The __kretprobe_trampoline_handler() callback, called from low level
arch kprobes methods, has the 'trampoline_address' parameter, which is
entirely superfluous as it basically just replicates:

  dereference_kernel_function_descriptor(kretprobe_trampoline)

In fact we had bugs in arch code where it wasn't replicated correctly.

So remove this superfluous parameter and use kretprobe_trampoline_addr()
instead.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
---
 Changes in v9:
   - Update changelog according to Ingo's suggestion.
 Changes in v8:
   - Use dereference_kernel_function_descriptor() to
     get the kretprobe_trampoline address.
 Changes in v3:
   - Remove wrong kretprobe_trampoline declaration from
     arch/x86/include/asm/kprobes.h.
 Changes in v2:
   - Remove arch_deref_entry_point() from comment.
---
 arch/arc/kernel/kprobes.c          |    2 +-
 arch/arm/probes/kprobes/core.c     |    3 +--
 arch/arm64/kernel/probes/kprobes.c |    3 +--
 arch/csky/kernel/probes/kprobes.c  |    2 +-
 arch/ia64/kernel/kprobes.c         |    5 ++---
 arch/mips/kernel/kprobes.c         |    3 +--
 arch/parisc/kernel/kprobes.c       |    4 ++--
 arch/powerpc/kernel/kprobes.c      |    2 +-
 arch/riscv/kernel/probes/kprobes.c |    2 +-
 arch/s390/kernel/kprobes.c         |    2 +-
 arch/sh/kernel/kprobes.c           |    2 +-
 arch/sparc/kernel/kprobes.c        |    2 +-
 arch/x86/include/asm/kprobes.h     |    1 -
 arch/x86/kernel/kprobes/core.c     |    2 +-
 include/linux/kprobes.h            |   18 +++++++++++++-----
 kernel/kprobes.c                   |    3 +--
 16 files changed, 29 insertions(+), 27 deletions(-)

diff --git a/arch/arc/kernel/kprobes.c b/arch/arc/kernel/kprobes.c
index 5f0415fc7328..3cee75c87f97 100644
--- a/arch/arc/kernel/kprobes.c
+++ b/arch/arc/kernel/kprobes.c
@@ -381,7 +381,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 static int __kprobes trampoline_probe_handler(struct kprobe *p,
 					      struct pt_regs *regs)
 {
-	regs->ret = __kretprobe_trampoline_handler(regs, &kretprobe_trampoline, NULL);
+	regs->ret = __kretprobe_trampoline_handler(regs, NULL);
 
 	/* By returning a non zero value, we are telling the kprobe handler
 	 * that we don't want the post_handler to run
diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c
index a59e38de4a03..08098ed6f035 100644
--- a/arch/arm/probes/kprobes/core.c
+++ b/arch/arm/probes/kprobes/core.c
@@ -392,8 +392,7 @@ void __naked __kprobes kretprobe_trampoline(void)
 /* Called from kretprobe_trampoline */
 static __used __kprobes void *trampoline_handler(struct pt_regs *regs)
 {
-	return (void *)kretprobe_trampoline_handler(regs, &kretprobe_trampoline,
-						    (void *)regs->ARM_fp);
+	return (void *)kretprobe_trampoline_handler(regs, (void *)regs->ARM_fp);
 }
 
 void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
index ce429cbacd35..f627a12984a8 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -401,8 +401,7 @@ int __init arch_populate_kprobe_blacklist(void)
 
 void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
 {
-	return (void *)kretprobe_trampoline_handler(regs, &kretprobe_trampoline,
-					(void *)kernel_stack_pointer(regs));
+	return (void *)kretprobe_trampoline_handler(regs, (void *)kernel_stack_pointer(regs));
 }
 
 void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
diff --git a/arch/csky/kernel/probes/kprobes.c b/arch/csky/kernel/probes/kprobes.c
index e823c3051b24..bf1e6c7094fc 100644
--- a/arch/csky/kernel/probes/kprobes.c
+++ b/arch/csky/kernel/probes/kprobes.c
@@ -387,7 +387,7 @@ int __init arch_populate_kprobe_blacklist(void)
 
 void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
 {
-	return (void *)kretprobe_trampoline_handler(regs, &kretprobe_trampoline, NULL);
+	return (void *)kretprobe_trampoline_handler(regs, NULL);
 }
 
 void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
diff --git a/arch/ia64/kernel/kprobes.c b/arch/ia64/kernel/kprobes.c
index 0f8573bbf520..44c84c20b626 100644
--- a/arch/ia64/kernel/kprobes.c
+++ b/arch/ia64/kernel/kprobes.c
@@ -392,14 +392,13 @@ static void __kprobes set_current_kprobe(struct kprobe *p,
 	__this_cpu_write(current_kprobe, p);
 }
 
-static void kretprobe_trampoline(void)
+void kretprobe_trampoline(void)
 {
 }
 
 int __kprobes trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 {
-	regs->cr_iip = __kretprobe_trampoline_handler(regs,
-		dereference_function_descriptor(kretprobe_trampoline), NULL);
+	regs->cr_iip = __kretprobe_trampoline_handler(regs, NULL);
 	/*
 	 * By returning a non-zero value, we are telling
 	 * kprobe_handler() that we don't want the post_handler
diff --git a/arch/mips/kernel/kprobes.c b/arch/mips/kernel/kprobes.c
index b0934a0d7aed..b33bd2498651 100644
--- a/arch/mips/kernel/kprobes.c
+++ b/arch/mips/kernel/kprobes.c
@@ -485,8 +485,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 static int __kprobes trampoline_probe_handler(struct kprobe *p,
 						struct pt_regs *regs)
 {
-	instruction_pointer(regs) = __kretprobe_trampoline_handler(regs,
-						kretprobe_trampoline, NULL);
+	instruction_pointer(regs) = __kretprobe_trampoline_handler(regs, NULL);
 	/*
 	 * By returning a non-zero value, we are telling
 	 * kprobe_handler() that we don't want the post_handler
diff --git a/arch/parisc/kernel/kprobes.c b/arch/parisc/kernel/kprobes.c
index 6d21a515eea5..4a35ac6e2ca2 100644
--- a/arch/parisc/kernel/kprobes.c
+++ b/arch/parisc/kernel/kprobes.c
@@ -175,7 +175,7 @@ int __kprobes parisc_kprobe_ss_handler(struct pt_regs *regs)
 	return 1;
 }
 
-static inline void kretprobe_trampoline(void)
+void kretprobe_trampoline(void)
 {
 	asm volatile("nop");
 	asm volatile("nop");
@@ -193,7 +193,7 @@ static int __kprobes trampoline_probe_handler(struct kprobe *p,
 {
 	unsigned long orig_ret_address;
 
-	orig_ret_address = __kretprobe_trampoline_handler(regs, trampoline_p.addr, NULL);
+	orig_ret_address = __kretprobe_trampoline_handler(regs, NULL);
 	instruction_pointer_set(regs, orig_ret_address);
 
 	return 1;
diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 077ab73e6e56..0764e8a0de69 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -416,7 +416,7 @@ static int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 {
 	unsigned long orig_ret_address;
 
-	orig_ret_address = __kretprobe_trampoline_handler(regs, &kretprobe_trampoline, NULL);
+	orig_ret_address = __kretprobe_trampoline_handler(regs, NULL);
 	/*
 	 * We get here through one of two paths:
 	 * 1. by taking a trap -> kprobe_handler() -> here
diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
index cab6f874358e..62d477cf11da 100644
--- a/arch/riscv/kernel/probes/kprobes.c
+++ b/arch/riscv/kernel/probes/kprobes.c
@@ -347,7 +347,7 @@ int __init arch_populate_kprobe_blacklist(void)
 
 void __kprobes __used *trampoline_probe_handler(struct pt_regs *regs)
 {
-	return (void *)kretprobe_trampoline_handler(regs, &kretprobe_trampoline, NULL);
+	return (void *)kretprobe_trampoline_handler(regs, NULL);
 }
 
 void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
diff --git a/arch/s390/kernel/kprobes.c b/arch/s390/kernel/kprobes.c
index 952d44b0610b..5fa86e54f129 100644
--- a/arch/s390/kernel/kprobes.c
+++ b/arch/s390/kernel/kprobes.c
@@ -343,7 +343,7 @@ static void __used kretprobe_trampoline_holder(void)
  */
 static int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 {
-	regs->psw.addr = __kretprobe_trampoline_handler(regs, &kretprobe_trampoline, NULL);
+	regs->psw.addr = __kretprobe_trampoline_handler(regs, NULL);
 	/*
 	 * By returning a non-zero value, we are telling
 	 * kprobe_handler() that we don't want the post_handler
diff --git a/arch/sh/kernel/kprobes.c b/arch/sh/kernel/kprobes.c
index 1c7f358ef0be..8e76a35e6e33 100644
--- a/arch/sh/kernel/kprobes.c
+++ b/arch/sh/kernel/kprobes.c
@@ -303,7 +303,7 @@ static void __used kretprobe_trampoline_holder(void)
  */
 int __kprobes trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 {
-	regs->pc = __kretprobe_trampoline_handler(regs, &kretprobe_trampoline, NULL);
+	regs->pc = __kretprobe_trampoline_handler(regs, NULL);
 
 	return 1;
 }
diff --git a/arch/sparc/kernel/kprobes.c b/arch/sparc/kernel/kprobes.c
index 4c05a4ee6a0e..401534236c2e 100644
--- a/arch/sparc/kernel/kprobes.c
+++ b/arch/sparc/kernel/kprobes.c
@@ -451,7 +451,7 @@ static int __kprobes trampoline_probe_handler(struct kprobe *p,
 {
 	unsigned long orig_ret_address = 0;
 
-	orig_ret_address = __kretprobe_trampoline_handler(regs, &kretprobe_trampoline, NULL);
+	orig_ret_address = __kretprobe_trampoline_handler(regs, NULL);
 	regs->tpc = orig_ret_address;
 	regs->tnpc = orig_ret_address + 4;
 
diff --git a/arch/x86/include/asm/kprobes.h b/arch/x86/include/asm/kprobes.h
index bd7f5886a789..71ea2eab43d5 100644
--- a/arch/x86/include/asm/kprobes.h
+++ b/arch/x86/include/asm/kprobes.h
@@ -49,7 +49,6 @@ extern __visible kprobe_opcode_t optprobe_template_end[];
 extern const int kretprobe_blacklist_size;
 
 void arch_remove_kprobe(struct kprobe *p);
-asmlinkage void kretprobe_trampoline(void);
 
 extern void arch_kprobe_override_function(struct pt_regs *regs);
 
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index b6e046e4b289..0c59ef5971de 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -1064,7 +1064,7 @@ __used __visible void *trampoline_handler(struct pt_regs *regs)
 	regs->ip = (unsigned long)&kretprobe_trampoline;
 	regs->orig_ax = ~0UL;
 
-	return (void *)kretprobe_trampoline_handler(regs, &kretprobe_trampoline, &regs->sp);
+	return (void *)kretprobe_trampoline_handler(regs, &regs->sp);
 }
 NOKPROBE_SYMBOL(trampoline_handler);
 
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 2ed61fcbc89c..96f5df93e36e 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -188,15 +188,23 @@ extern void arch_prepare_kretprobe(struct kretprobe_instance *ri,
 				   struct pt_regs *regs);
 extern int arch_trampoline_kprobe(struct kprobe *p);
 
+void kretprobe_trampoline(void);
+/*
+ * Since some architecture uses structured function pointer,
+ * use dereference_function_descriptor() to get real function address.
+ */
+static nokprobe_inline void *kretprobe_trampoline_addr(void)
+{
+	return dereference_kernel_function_descriptor(kretprobe_trampoline);
+}
+
 /* If the trampoline handler called from a kprobe, use this version */
 unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
-				void *trampoline_address,
-				void *frame_pointer);
+					     void *frame_pointer);
 
 static nokprobe_inline
 unsigned long kretprobe_trampoline_handler(struct pt_regs *regs,
-				void *trampoline_address,
-				void *frame_pointer)
+					   void *frame_pointer)
 {
 	unsigned long ret;
 	/*
@@ -205,7 +213,7 @@ unsigned long kretprobe_trampoline_handler(struct pt_regs *regs,
 	 * be running at this point.
 	 */
 	kprobe_busy_begin();
-	ret = __kretprobe_trampoline_handler(regs, trampoline_address, frame_pointer);
+	ret = __kretprobe_trampoline_handler(regs, frame_pointer);
 	kprobe_busy_end();
 
 	return ret;
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 550042d9a6ef..6ed755111eea 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1864,7 +1864,6 @@ static struct notifier_block kprobe_exceptions_nb = {
 #ifdef CONFIG_KRETPROBES
 
 unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
-					     void *trampoline_address,
 					     void *frame_pointer)
 {
 	kprobe_opcode_t *correct_ret_addr = NULL;
@@ -1879,7 +1878,7 @@ unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
 
 		BUG_ON(ri->fp != frame_pointer);
 
-		if (ri->ret_addr != trampoline_address) {
+		if (ri->ret_addr != kretprobe_trampoline_addr()) {
 			correct_ret_addr = ri->ret_addr;
 			/*
 			 * This is the real return address. Any other


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 04/16] kprobes: treewide: Make it harder to refer kretprobe_trampoline directly
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
                   ` (2 preceding siblings ...)
  2021-07-29 14:06 ` [PATCH -tip v10 03/16] kprobes: treewide: Remove trampoline_address from kretprobe_trampoline_handler() Masami Hiramatsu
@ 2021-07-29 14:06 ` Masami Hiramatsu
  2021-07-29 14:06 ` [PATCH -tip v10 05/16] kprobes: Add kretprobe_find_ret_addr() for searching return address Masami Hiramatsu
                   ` (13 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:06 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

Since now there is kretprobe_trampoline_addr() for referring the
address of kretprobe trampoline code, we don't need to access
kretprobe_trampoline directly.

Make it harder to refer by renaming it to __kretprobe_trampoline().

Suggested-by: Ingo Molnar <mingo@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 arch/arc/include/asm/kprobes.h                |    2 +-
 arch/arc/kernel/kprobes.c                     |   11 ++++++-----
 arch/arm/probes/kprobes/core.c                |    6 +++---
 arch/arm64/include/asm/kprobes.h              |    2 +-
 arch/arm64/kernel/probes/kprobes.c            |    2 +-
 arch/arm64/kernel/probes/kprobes_trampoline.S |    4 ++--
 arch/csky/include/asm/kprobes.h               |    2 +-
 arch/csky/kernel/probes/kprobes.c             |    2 +-
 arch/csky/kernel/probes/kprobes_trampoline.S  |    4 ++--
 arch/ia64/kernel/kprobes.c                    |    8 ++++----
 arch/mips/kernel/kprobes.c                    |   12 ++++++------
 arch/parisc/kernel/kprobes.c                  |    4 ++--
 arch/powerpc/include/asm/kprobes.h            |    2 +-
 arch/powerpc/kernel/kprobes.c                 |   16 ++++++++--------
 arch/powerpc/kernel/optprobes.c               |    2 +-
 arch/powerpc/kernel/stacktrace.c              |    2 +-
 arch/riscv/include/asm/kprobes.h              |    2 +-
 arch/riscv/kernel/probes/kprobes.c            |    2 +-
 arch/riscv/kernel/probes/kprobes_trampoline.S |    4 ++--
 arch/s390/include/asm/kprobes.h               |    2 +-
 arch/s390/kernel/kprobes.c                    |   10 +++++-----
 arch/s390/kernel/stacktrace.c                 |    2 +-
 arch/sh/include/asm/kprobes.h                 |    2 +-
 arch/sh/kernel/kprobes.c                      |   10 +++++-----
 arch/sparc/include/asm/kprobes.h              |    2 +-
 arch/sparc/kernel/kprobes.c                   |   10 +++++-----
 arch/x86/kernel/kprobes/core.c                |   18 +++++++++---------
 include/linux/kprobes.h                       |    4 ++--
 kernel/trace/trace_output.c                   |    2 +-
 29 files changed, 76 insertions(+), 75 deletions(-)

diff --git a/arch/arc/include/asm/kprobes.h b/arch/arc/include/asm/kprobes.h
index 2134721dce44..de1566e32cb8 100644
--- a/arch/arc/include/asm/kprobes.h
+++ b/arch/arc/include/asm/kprobes.h
@@ -46,7 +46,7 @@ struct kprobe_ctlblk {
 };
 
 int kprobe_fault_handler(struct pt_regs *regs, unsigned long cause);
-void kretprobe_trampoline(void);
+void __kretprobe_trampoline(void);
 void trap_is_kprobe(unsigned long address, struct pt_regs *regs);
 #else
 #define trap_is_kprobe(address, regs)
diff --git a/arch/arc/kernel/kprobes.c b/arch/arc/kernel/kprobes.c
index 3cee75c87f97..e71d64119d71 100644
--- a/arch/arc/kernel/kprobes.c
+++ b/arch/arc/kernel/kprobes.c
@@ -363,8 +363,9 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
 
 static void __used kretprobe_trampoline_holder(void)
 {
-	__asm__ __volatile__(".global kretprobe_trampoline\n"
-			     "kretprobe_trampoline:\n" "nop\n");
+	__asm__ __volatile__(".global __kretprobe_trampoline\n"
+			     "__kretprobe_trampoline:\n"
+			     "nop\n");
 }
 
 void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
@@ -375,7 +376,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 	ri->fp = NULL;
 
 	/* Replace the return addr with trampoline addr */
-	regs->blink = (unsigned long)&kretprobe_trampoline;
+	regs->blink = (unsigned long)&__kretprobe_trampoline;
 }
 
 static int __kprobes trampoline_probe_handler(struct kprobe *p,
@@ -390,7 +391,7 @@ static int __kprobes trampoline_probe_handler(struct kprobe *p,
 }
 
 static struct kprobe trampoline_p = {
-	.addr = (kprobe_opcode_t *) &kretprobe_trampoline,
+	.addr = (kprobe_opcode_t *) &__kretprobe_trampoline,
 	.pre_handler = trampoline_probe_handler
 };
 
@@ -402,7 +403,7 @@ int __init arch_init_kprobes(void)
 
 int __kprobes arch_trampoline_kprobe(struct kprobe *p)
 {
-	if (p->addr == (kprobe_opcode_t *) &kretprobe_trampoline)
+	if (p->addr == (kprobe_opcode_t *) &__kretprobe_trampoline)
 		return 1;
 
 	return 0;
diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c
index 08098ed6f035..67ce7eb8f285 100644
--- a/arch/arm/probes/kprobes/core.c
+++ b/arch/arm/probes/kprobes/core.c
@@ -373,7 +373,7 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
  * for kretprobe handlers which should normally be interested in r0 only
  * anyway.
  */
-void __naked __kprobes kretprobe_trampoline(void)
+void __naked __kprobes __kretprobe_trampoline(void)
 {
 	__asm__ __volatile__ (
 		"stmdb	sp!, {r0 - r11}		\n\t"
@@ -389,7 +389,7 @@ void __naked __kprobes kretprobe_trampoline(void)
 		: : : "memory");
 }
 
-/* Called from kretprobe_trampoline */
+/* Called from __kretprobe_trampoline */
 static __used __kprobes void *trampoline_handler(struct pt_regs *regs)
 {
 	return (void *)kretprobe_trampoline_handler(regs, (void *)regs->ARM_fp);
@@ -402,7 +402,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 	ri->fp = (void *)regs->ARM_fp;
 
 	/* Replace the return addr with trampoline addr. */
-	regs->ARM_lr = (unsigned long)&kretprobe_trampoline;
+	regs->ARM_lr = (unsigned long)&__kretprobe_trampoline;
 }
 
 int __kprobes arch_trampoline_kprobe(struct kprobe *p)
diff --git a/arch/arm64/include/asm/kprobes.h b/arch/arm64/include/asm/kprobes.h
index 5d38ff4a4806..05cd82eeca13 100644
--- a/arch/arm64/include/asm/kprobes.h
+++ b/arch/arm64/include/asm/kprobes.h
@@ -39,7 +39,7 @@ void arch_remove_kprobe(struct kprobe *);
 int kprobe_fault_handler(struct pt_regs *regs, unsigned int fsr);
 int kprobe_exceptions_notify(struct notifier_block *self,
 			     unsigned long val, void *data);
-void kretprobe_trampoline(void);
+void __kretprobe_trampoline(void);
 void __kprobes *trampoline_probe_handler(struct pt_regs *regs);
 
 #endif /* CONFIG_KPROBES */
diff --git a/arch/arm64/kernel/probes/kprobes.c b/arch/arm64/kernel/probes/kprobes.c
index f627a12984a8..e7ad6da980e8 100644
--- a/arch/arm64/kernel/probes/kprobes.c
+++ b/arch/arm64/kernel/probes/kprobes.c
@@ -411,7 +411,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 	ri->fp = (void *)kernel_stack_pointer(regs);
 
 	/* replace return addr (x30) with trampoline */
-	regs->regs[30] = (long)&kretprobe_trampoline;
+	regs->regs[30] = (long)&__kretprobe_trampoline;
 }
 
 int __kprobes arch_trampoline_kprobe(struct kprobe *p)
diff --git a/arch/arm64/kernel/probes/kprobes_trampoline.S b/arch/arm64/kernel/probes/kprobes_trampoline.S
index 288a84e253cc..520ee8711db1 100644
--- a/arch/arm64/kernel/probes/kprobes_trampoline.S
+++ b/arch/arm64/kernel/probes/kprobes_trampoline.S
@@ -61,7 +61,7 @@
 	ldp x28, x29, [sp, #S_X28]
 	.endm
 
-SYM_CODE_START(kretprobe_trampoline)
+SYM_CODE_START(__kretprobe_trampoline)
 	sub sp, sp, #PT_REGS_SIZE
 
 	save_all_base_regs
@@ -79,4 +79,4 @@ SYM_CODE_START(kretprobe_trampoline)
 	add sp, sp, #PT_REGS_SIZE
 	ret
 
-SYM_CODE_END(kretprobe_trampoline)
+SYM_CODE_END(__kretprobe_trampoline)
diff --git a/arch/csky/include/asm/kprobes.h b/arch/csky/include/asm/kprobes.h
index b647bbde4d6d..55267cbf5204 100644
--- a/arch/csky/include/asm/kprobes.h
+++ b/arch/csky/include/asm/kprobes.h
@@ -41,7 +41,7 @@ void arch_remove_kprobe(struct kprobe *p);
 int kprobe_fault_handler(struct pt_regs *regs, unsigned int trapnr);
 int kprobe_breakpoint_handler(struct pt_regs *regs);
 int kprobe_single_step_handler(struct pt_regs *regs);
-void kretprobe_trampoline(void);
+void __kretprobe_trampoline(void);
 void __kprobes *trampoline_probe_handler(struct pt_regs *regs);
 
 #endif /* CONFIG_KPROBES */
diff --git a/arch/csky/kernel/probes/kprobes.c b/arch/csky/kernel/probes/kprobes.c
index bf1e6c7094fc..8d1fb4e79788 100644
--- a/arch/csky/kernel/probes/kprobes.c
+++ b/arch/csky/kernel/probes/kprobes.c
@@ -395,7 +395,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 {
 	ri->ret_addr = (kprobe_opcode_t *)regs->lr;
 	ri->fp = NULL;
-	regs->lr = (unsigned long) &kretprobe_trampoline;
+	regs->lr = (unsigned long) &__kretprobe_trampoline;
 }
 
 int __kprobes arch_trampoline_kprobe(struct kprobe *p)
diff --git a/arch/csky/kernel/probes/kprobes_trampoline.S b/arch/csky/kernel/probes/kprobes_trampoline.S
index b1fe3af24f03..ba48ad04a847 100644
--- a/arch/csky/kernel/probes/kprobes_trampoline.S
+++ b/arch/csky/kernel/probes/kprobes_trampoline.S
@@ -4,7 +4,7 @@
 
 #include <abi/entry.h>
 
-ENTRY(kretprobe_trampoline)
+ENTRY(__kretprobe_trampoline)
 	SAVE_REGS_FTRACE
 
 	mov	a0, sp /* pt_regs */
@@ -16,4 +16,4 @@ ENTRY(kretprobe_trampoline)
 
 	RESTORE_REGS_FTRACE
 	rts
-ENDPROC(kretprobe_trampoline)
+ENDPROC(__kretprobe_trampoline)
diff --git a/arch/ia64/kernel/kprobes.c b/arch/ia64/kernel/kprobes.c
index 44c84c20b626..1a7bab1c5d7c 100644
--- a/arch/ia64/kernel/kprobes.c
+++ b/arch/ia64/kernel/kprobes.c
@@ -392,7 +392,7 @@ static void __kprobes set_current_kprobe(struct kprobe *p,
 	__this_cpu_write(current_kprobe, p);
 }
 
-void kretprobe_trampoline(void)
+void __kretprobe_trampoline(void)
 {
 }
 
@@ -414,7 +414,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 	ri->fp = NULL;
 
 	/* Replace the return addr with trampoline addr */
-	regs->b0 = (unsigned long)dereference_function_descriptor(kretprobe_trampoline);
+	regs->b0 = (unsigned long)dereference_function_descriptor(__kretprobe_trampoline);
 }
 
 /* Check the instruction in the slot is break */
@@ -897,14 +897,14 @@ static struct kprobe trampoline_p = {
 int __init arch_init_kprobes(void)
 {
 	trampoline_p.addr =
-		dereference_function_descriptor(kretprobe_trampoline);
+		dereference_function_descriptor(__kretprobe_trampoline);
 	return register_kprobe(&trampoline_p);
 }
 
 int __kprobes arch_trampoline_kprobe(struct kprobe *p)
 {
 	if (p->addr ==
-		dereference_function_descriptor(kretprobe_trampoline))
+		dereference_function_descriptor(__kretprobe_trampoline))
 		return 1;
 
 	return 0;
diff --git a/arch/mips/kernel/kprobes.c b/arch/mips/kernel/kprobes.c
index b33bd2498651..6c7f3b143fdc 100644
--- a/arch/mips/kernel/kprobes.c
+++ b/arch/mips/kernel/kprobes.c
@@ -460,14 +460,14 @@ static void __used kretprobe_trampoline_holder(void)
 		/* Keep the assembler from reordering and placing JR here. */
 		".set noreorder\n\t"
 		"nop\n\t"
-		".global kretprobe_trampoline\n"
-		"kretprobe_trampoline:\n\t"
+		".global __kretprobe_trampoline\n"
+		"__kretprobe_trampoline:\n\t"
 		"nop\n\t"
 		".set pop"
 		: : : "memory");
 }
 
-void kretprobe_trampoline(void);
+void __kretprobe_trampoline(void);
 
 void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 				      struct pt_regs *regs)
@@ -476,7 +476,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 	ri->fp = NULL;
 
 	/* Replace the return addr with trampoline addr */
-	regs->regs[31] = (unsigned long)kretprobe_trampoline;
+	regs->regs[31] = (unsigned long)__kretprobe_trampoline;
 }
 
 /*
@@ -496,14 +496,14 @@ static int __kprobes trampoline_probe_handler(struct kprobe *p,
 
 int __kprobes arch_trampoline_kprobe(struct kprobe *p)
 {
-	if (p->addr == (kprobe_opcode_t *)kretprobe_trampoline)
+	if (p->addr == (kprobe_opcode_t *)__kretprobe_trampoline)
 		return 1;
 
 	return 0;
 }
 
 static struct kprobe trampoline_p = {
-	.addr = (kprobe_opcode_t *)kretprobe_trampoline,
+	.addr = (kprobe_opcode_t *)__kretprobe_trampoline,
 	.pre_handler = trampoline_probe_handler
 };
 
diff --git a/arch/parisc/kernel/kprobes.c b/arch/parisc/kernel/kprobes.c
index 4a35ac6e2ca2..e2bdb5a5f93e 100644
--- a/arch/parisc/kernel/kprobes.c
+++ b/arch/parisc/kernel/kprobes.c
@@ -175,7 +175,7 @@ int __kprobes parisc_kprobe_ss_handler(struct pt_regs *regs)
 	return 1;
 }
 
-void kretprobe_trampoline(void)
+void __kretprobe_trampoline(void)
 {
 	asm volatile("nop");
 	asm volatile("nop");
@@ -217,6 +217,6 @@ int __kprobes arch_trampoline_kprobe(struct kprobe *p)
 int __init arch_init_kprobes(void)
 {
 	trampoline_p.addr = (kprobe_opcode_t *)
-		dereference_function_descriptor(kretprobe_trampoline);
+		dereference_function_descriptor(__kretprobe_trampoline);
 	return register_kprobe(&trampoline_p);
 }
diff --git a/arch/powerpc/include/asm/kprobes.h b/arch/powerpc/include/asm/kprobes.h
index 4fc0e15e23a5..bab364152b29 100644
--- a/arch/powerpc/include/asm/kprobes.h
+++ b/arch/powerpc/include/asm/kprobes.h
@@ -51,7 +51,7 @@ extern kprobe_opcode_t optprobe_template_end[];
 #define flush_insn_slot(p)	do { } while (0)
 #define kretprobe_blacklist_size 0
 
-void kretprobe_trampoline(void);
+void __kretprobe_trampoline(void);
 extern void arch_remove_kprobe(struct kprobe *p);
 
 /* Architecture specific copy of original instruction */
diff --git a/arch/powerpc/kernel/kprobes.c b/arch/powerpc/kernel/kprobes.c
index 0764e8a0de69..03c85962296a 100644
--- a/arch/powerpc/kernel/kprobes.c
+++ b/arch/powerpc/kernel/kprobes.c
@@ -237,7 +237,7 @@ void arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
 	ri->fp = NULL;
 
 	/* Replace the return addr with trampoline addr */
-	regs->link = (unsigned long)kretprobe_trampoline;
+	regs->link = (unsigned long)__kretprobe_trampoline;
 }
 NOKPROBE_SYMBOL(arch_prepare_kretprobe);
 
@@ -402,12 +402,12 @@ NOKPROBE_SYMBOL(kprobe_handler);
  * 	- When the probed function returns, this probe
  * 		causes the handlers to fire
  */
-asm(".global kretprobe_trampoline\n"
-	".type kretprobe_trampoline, @function\n"
-	"kretprobe_trampoline:\n"
+asm(".global __kretprobe_trampoline\n"
+	".type __kretprobe_trampoline, @function\n"
+	"__kretprobe_trampoline:\n"
 	"nop\n"
 	"blr\n"
-	".size kretprobe_trampoline, .-kretprobe_trampoline\n");
+	".size __kretprobe_trampoline, .-__kretprobe_trampoline\n");
 
 /*
  * Called when the probe at kretprobe trampoline is hit
@@ -426,7 +426,7 @@ static int trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 	 * as it is used to determine the return address from the trap.
 	 * For (2), since nip is not honoured with optprobes, we instead setup
 	 * the link register properly so that the subsequent 'blr' in
-	 * kretprobe_trampoline jumps back to the right instruction.
+	 * __kretprobe_trampoline jumps back to the right instruction.
 	 *
 	 * For nip, we should set the address to the previous instruction since
 	 * we end up emulating it in kprobe_handler(), which increments the nip
@@ -542,7 +542,7 @@ int kprobe_fault_handler(struct pt_regs *regs, int trapnr)
 NOKPROBE_SYMBOL(kprobe_fault_handler);
 
 static struct kprobe trampoline_p = {
-	.addr = (kprobe_opcode_t *) &kretprobe_trampoline,
+	.addr = (kprobe_opcode_t *) &__kretprobe_trampoline,
 	.pre_handler = trampoline_probe_handler
 };
 
@@ -553,7 +553,7 @@ int __init arch_init_kprobes(void)
 
 int arch_trampoline_kprobe(struct kprobe *p)
 {
-	if (p->addr == (kprobe_opcode_t *)&kretprobe_trampoline)
+	if (p->addr == (kprobe_opcode_t *)&__kretprobe_trampoline)
 		return 1;
 
 	return 0;
diff --git a/arch/powerpc/kernel/optprobes.c b/arch/powerpc/kernel/optprobes.c
index 325ba544883c..ce1903064031 100644
--- a/arch/powerpc/kernel/optprobes.c
+++ b/arch/powerpc/kernel/optprobes.c
@@ -56,7 +56,7 @@ static unsigned long can_optimize(struct kprobe *p)
 	 * has a 'nop' instruction, which can be emulated.
 	 * So further checks can be skipped.
 	 */
-	if (p->addr == (kprobe_opcode_t *)&kretprobe_trampoline)
+	if (p->addr == (kprobe_opcode_t *)&__kretprobe_trampoline)
 		return addr + sizeof(kprobe_opcode_t);
 
 	/*
diff --git a/arch/powerpc/kernel/stacktrace.c b/arch/powerpc/kernel/stacktrace.c
index 2b0d04a1b7d2..7d8867106734 100644
--- a/arch/powerpc/kernel/stacktrace.c
+++ b/arch/powerpc/kernel/stacktrace.c
@@ -154,7 +154,7 @@ int __no_sanitize_address arch_stack_walk_reliable(stack_trace_consume_fn consum
 		 * Mark stacktraces with kretprobed functions on them
 		 * as unreliable.
 		 */
-		if (ip == (unsigned long)kretprobe_trampoline)
+		if (ip == (unsigned long)__kretprobe_trampoline)
 			return -EINVAL;
 #endif
 
diff --git a/arch/riscv/include/asm/kprobes.h b/arch/riscv/include/asm/kprobes.h
index 9ea9b5ec3113..217ef89f22b9 100644
--- a/arch/riscv/include/asm/kprobes.h
+++ b/arch/riscv/include/asm/kprobes.h
@@ -40,7 +40,7 @@ void arch_remove_kprobe(struct kprobe *p);
 int kprobe_fault_handler(struct pt_regs *regs, unsigned int trapnr);
 bool kprobe_breakpoint_handler(struct pt_regs *regs);
 bool kprobe_single_step_handler(struct pt_regs *regs);
-void kretprobe_trampoline(void);
+void __kretprobe_trampoline(void);
 void __kprobes *trampoline_probe_handler(struct pt_regs *regs);
 
 #endif /* CONFIG_KPROBES */
diff --git a/arch/riscv/kernel/probes/kprobes.c b/arch/riscv/kernel/probes/kprobes.c
index 62d477cf11da..e6e950b7cf32 100644
--- a/arch/riscv/kernel/probes/kprobes.c
+++ b/arch/riscv/kernel/probes/kprobes.c
@@ -355,7 +355,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 {
 	ri->ret_addr = (kprobe_opcode_t *)regs->ra;
 	ri->fp = NULL;
-	regs->ra = (unsigned long) &kretprobe_trampoline;
+	regs->ra = (unsigned long) &__kretprobe_trampoline;
 }
 
 int __kprobes arch_trampoline_kprobe(struct kprobe *p)
diff --git a/arch/riscv/kernel/probes/kprobes_trampoline.S b/arch/riscv/kernel/probes/kprobes_trampoline.S
index 6e85d021e2a2..7bdb09ded39b 100644
--- a/arch/riscv/kernel/probes/kprobes_trampoline.S
+++ b/arch/riscv/kernel/probes/kprobes_trampoline.S
@@ -75,7 +75,7 @@
 	REG_L x31, PT_T6(sp)
 	.endm
 
-ENTRY(kretprobe_trampoline)
+ENTRY(__kretprobe_trampoline)
 	addi sp, sp, -(PT_SIZE_ON_STACK)
 	save_all_base_regs
 
@@ -90,4 +90,4 @@ ENTRY(kretprobe_trampoline)
 	addi sp, sp, PT_SIZE_ON_STACK
 
 	ret
-ENDPROC(kretprobe_trampoline)
+ENDPROC(__kretprobe_trampoline)
diff --git a/arch/s390/include/asm/kprobes.h b/arch/s390/include/asm/kprobes.h
index 09cdb632a490..5eb722c984e4 100644
--- a/arch/s390/include/asm/kprobes.h
+++ b/arch/s390/include/asm/kprobes.h
@@ -70,7 +70,7 @@ struct kprobe_ctlblk {
 };
 
 void arch_remove_kprobe(struct kprobe *p);
-void kretprobe_trampoline(void);
+void __kretprobe_trampoline(void);
 
 int kprobe_fault_handler(struct pt_regs *regs, int trapnr);
 int kprobe_exceptions_notify(struct notifier_block *self,
diff --git a/arch/s390/kernel/kprobes.c b/arch/s390/kernel/kprobes.c
index 5fa86e54f129..c505c0ee5f47 100644
--- a/arch/s390/kernel/kprobes.c
+++ b/arch/s390/kernel/kprobes.c
@@ -242,7 +242,7 @@ void arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
 	ri->fp = NULL;
 
 	/* Replace the return addr with trampoline addr */
-	regs->gprs[14] = (unsigned long) &kretprobe_trampoline;
+	regs->gprs[14] = (unsigned long) &__kretprobe_trampoline;
 }
 NOKPROBE_SYMBOL(arch_prepare_kretprobe);
 
@@ -334,8 +334,8 @@ NOKPROBE_SYMBOL(kprobe_handler);
  */
 static void __used kretprobe_trampoline_holder(void)
 {
-	asm volatile(".global kretprobe_trampoline\n"
-		     "kretprobe_trampoline: bcr 0,0\n");
+	asm volatile(".global __kretprobe_trampoline\n"
+		     "__kretprobe_trampoline: bcr 0,0\n");
 }
 
 /*
@@ -509,7 +509,7 @@ int kprobe_exceptions_notify(struct notifier_block *self,
 NOKPROBE_SYMBOL(kprobe_exceptions_notify);
 
 static struct kprobe trampoline = {
-	.addr = (kprobe_opcode_t *) &kretprobe_trampoline,
+	.addr = (kprobe_opcode_t *) &__kretprobe_trampoline,
 	.pre_handler = trampoline_probe_handler
 };
 
@@ -520,6 +520,6 @@ int __init arch_init_kprobes(void)
 
 int arch_trampoline_kprobe(struct kprobe *p)
 {
-	return p->addr == (kprobe_opcode_t *) &kretprobe_trampoline;
+	return p->addr == (kprobe_opcode_t *) &__kretprobe_trampoline;
 }
 NOKPROBE_SYMBOL(arch_trampoline_kprobe);
diff --git a/arch/s390/kernel/stacktrace.c b/arch/s390/kernel/stacktrace.c
index 101477b3e263..b7bb1981e9ee 100644
--- a/arch/s390/kernel/stacktrace.c
+++ b/arch/s390/kernel/stacktrace.c
@@ -46,7 +46,7 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
 		 * Mark stacktraces with kretprobed functions on them
 		 * as unreliable.
 		 */
-		if (state.ip == (unsigned long)kretprobe_trampoline)
+		if (state.ip == (unsigned long)__kretprobe_trampoline)
 			return -EINVAL;
 #endif
 
diff --git a/arch/sh/include/asm/kprobes.h b/arch/sh/include/asm/kprobes.h
index 6171682f7798..eeba83e0a7d2 100644
--- a/arch/sh/include/asm/kprobes.h
+++ b/arch/sh/include/asm/kprobes.h
@@ -26,7 +26,7 @@ typedef insn_size_t kprobe_opcode_t;
 struct kprobe;
 
 void arch_remove_kprobe(struct kprobe *);
-void kretprobe_trampoline(void);
+void __kretprobe_trampoline(void);
 
 /* Architecture specific copy of original instruction*/
 struct arch_specific_insn {
diff --git a/arch/sh/kernel/kprobes.c b/arch/sh/kernel/kprobes.c
index 8e76a35e6e33..aed1ea8e2c2f 100644
--- a/arch/sh/kernel/kprobes.c
+++ b/arch/sh/kernel/kprobes.c
@@ -207,7 +207,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 	ri->fp = NULL;
 
 	/* Replace the return addr with trampoline addr */
-	regs->pr = (unsigned long)kretprobe_trampoline;
+	regs->pr = (unsigned long)__kretprobe_trampoline;
 }
 
 static int __kprobes kprobe_handler(struct pt_regs *regs)
@@ -293,13 +293,13 @@ static int __kprobes kprobe_handler(struct pt_regs *regs)
  */
 static void __used kretprobe_trampoline_holder(void)
 {
-	asm volatile (".globl kretprobe_trampoline\n"
-		      "kretprobe_trampoline:\n\t"
+	asm volatile (".globl __kretprobe_trampoline\n"
+		      "__kretprobe_trampoline:\n\t"
 		      "nop\n");
 }
 
 /*
- * Called when we hit the probe point at kretprobe_trampoline
+ * Called when we hit the probe point at __kretprobe_trampoline
  */
 int __kprobes trampoline_probe_handler(struct kprobe *p, struct pt_regs *regs)
 {
@@ -442,7 +442,7 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
 }
 
 static struct kprobe trampoline_p = {
-	.addr = (kprobe_opcode_t *)&kretprobe_trampoline,
+	.addr = (kprobe_opcode_t *)&__kretprobe_trampoline,
 	.pre_handler = trampoline_probe_handler
 };
 
diff --git a/arch/sparc/include/asm/kprobes.h b/arch/sparc/include/asm/kprobes.h
index bfcaa6326c20..06c2bc767ef7 100644
--- a/arch/sparc/include/asm/kprobes.h
+++ b/arch/sparc/include/asm/kprobes.h
@@ -24,7 +24,7 @@ do { 	flushi(&(p)->ainsn.insn[0]);	\
 	flushi(&(p)->ainsn.insn[1]);	\
 } while (0)
 
-void kretprobe_trampoline(void);
+void __kretprobe_trampoline(void);
 
 /* Architecture specific copy of original instruction*/
 struct arch_specific_insn {
diff --git a/arch/sparc/kernel/kprobes.c b/arch/sparc/kernel/kprobes.c
index 401534236c2e..535c7b35cb59 100644
--- a/arch/sparc/kernel/kprobes.c
+++ b/arch/sparc/kernel/kprobes.c
@@ -440,7 +440,7 @@ void __kprobes arch_prepare_kretprobe(struct kretprobe_instance *ri,
 
 	/* Replace the return addr with trampoline addr */
 	regs->u_regs[UREG_RETPC] =
-		((unsigned long)kretprobe_trampoline) - 8;
+		((unsigned long)__kretprobe_trampoline) - 8;
 }
 
 /*
@@ -465,13 +465,13 @@ static int __kprobes trampoline_probe_handler(struct kprobe *p,
 
 static void __used kretprobe_trampoline_holder(void)
 {
-	asm volatile(".global kretprobe_trampoline\n"
-		     "kretprobe_trampoline:\n"
+	asm volatile(".global __kretprobe_trampoline\n"
+		     "__kretprobe_trampoline:\n"
 		     "\tnop\n"
 		     "\tnop\n");
 }
 static struct kprobe trampoline_p = {
-	.addr = (kprobe_opcode_t *) &kretprobe_trampoline,
+	.addr = (kprobe_opcode_t *) &__kretprobe_trampoline,
 	.pre_handler = trampoline_probe_handler
 };
 
@@ -482,7 +482,7 @@ int __init arch_init_kprobes(void)
 
 int __kprobes arch_trampoline_kprobe(struct kprobe *p)
 {
-	if (p->addr == (kprobe_opcode_t *)&kretprobe_trampoline)
+	if (p->addr == (kprobe_opcode_t *)&__kretprobe_trampoline)
 		return 1;
 
 	return 0;
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 0c59ef5971de..79cd23dba5b5 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -809,7 +809,7 @@ void arch_prepare_kretprobe(struct kretprobe_instance *ri, struct pt_regs *regs)
 	ri->fp = sara;
 
 	/* Replace the return addr with trampoline addr */
-	*sara = (unsigned long) &kretprobe_trampoline;
+	*sara = (unsigned long) &__kretprobe_trampoline;
 }
 NOKPROBE_SYMBOL(arch_prepare_kretprobe);
 
@@ -1019,9 +1019,9 @@ NOKPROBE_SYMBOL(kprobe_int3_handler);
  */
 asm(
 	".text\n"
-	".global kretprobe_trampoline\n"
-	".type kretprobe_trampoline, @function\n"
-	"kretprobe_trampoline:\n"
+	".global __kretprobe_trampoline\n"
+	".type __kretprobe_trampoline, @function\n"
+	"__kretprobe_trampoline:\n"
 	/* We don't bother saving the ss register */
 #ifdef CONFIG_X86_64
 	"	pushq %rsp\n"
@@ -1045,14 +1045,14 @@ asm(
 	"	popfl\n"
 #endif
 	"	ret\n"
-	".size kretprobe_trampoline, .-kretprobe_trampoline\n"
+	".size __kretprobe_trampoline, .-__kretprobe_trampoline\n"
 );
-NOKPROBE_SYMBOL(kretprobe_trampoline);
-STACK_FRAME_NON_STANDARD(kretprobe_trampoline);
+NOKPROBE_SYMBOL(__kretprobe_trampoline);
+STACK_FRAME_NON_STANDARD(__kretprobe_trampoline);
 
 
 /*
- * Called from kretprobe_trampoline
+ * Called from __kretprobe_trampoline
  */
 __used __visible void *trampoline_handler(struct pt_regs *regs)
 {
@@ -1061,7 +1061,7 @@ __used __visible void *trampoline_handler(struct pt_regs *regs)
 #ifdef CONFIG_X86_32
 	regs->gs = 0;
 #endif
-	regs->ip = (unsigned long)&kretprobe_trampoline;
+	regs->ip = (unsigned long)&__kretprobe_trampoline;
 	regs->orig_ax = ~0UL;
 
 	return (void *)kretprobe_trampoline_handler(regs, &regs->sp);
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 96f5df93e36e..b6b2370f4a4c 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -188,14 +188,14 @@ extern void arch_prepare_kretprobe(struct kretprobe_instance *ri,
 				   struct pt_regs *regs);
 extern int arch_trampoline_kprobe(struct kprobe *p);
 
-void kretprobe_trampoline(void);
+void __kretprobe_trampoline(void);
 /*
  * Since some architecture uses structured function pointer,
  * use dereference_function_descriptor() to get real function address.
  */
 static nokprobe_inline void *kretprobe_trampoline_addr(void)
 {
-	return dereference_kernel_function_descriptor(kretprobe_trampoline);
+	return dereference_kernel_function_descriptor(__kretprobe_trampoline);
 }
 
 /* If the trampoline handler called from a kprobe, use this version */
diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index a0bf446bb034..9b2e69619057 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -349,7 +349,7 @@ EXPORT_SYMBOL_GPL(trace_output_call);
 #ifdef CONFIG_KRETPROBES
 static inline const char *kretprobed(const char *name)
 {
-	static const char tramp_name[] = "kretprobe_trampoline";
+	static const char tramp_name[] = "__kretprobe_trampoline";
 	int size = sizeof(tramp_name);
 
 	if (strncmp(tramp_name, name, size) == 0)


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 05/16] kprobes: Add kretprobe_find_ret_addr() for searching return address
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
                   ` (3 preceding siblings ...)
  2021-07-29 14:06 ` [PATCH -tip v10 04/16] kprobes: treewide: Make it harder to refer kretprobe_trampoline directly Masami Hiramatsu
@ 2021-07-29 14:06 ` Masami Hiramatsu
  2021-07-29 14:06 ` [PATCH -tip v10 06/16] objtool: Add frame-pointer-specific function ignore Masami Hiramatsu
                   ` (12 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:06 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

Introduce kretprobe_find_ret_addr() and is_kretprobe_trampoline().
These APIs will be used by the ORC stack unwinder and ftrace, so that
they can check whether the given address points kretprobe trampoline
code and query the correct return address in that case.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
---
 Changes in v10:
  - Add a document about kretprobe_find_ret_addr() and check cur == NULL.
 Changes in v9:
  - Update changelog to explain why this is introduced.
  - Move the prototype of kretprobe_find_ret_addr() and is_kretprobe_trampoline()
    in the same place.
  - Make __kretprobe_find_ret_addr() return 'kprobe_opcode_t *'.
  - Update comments in the __kretprobe_trampoline_handler().
 Changes in v6:
  - Replace BUG_ON() with WARN_ON_ONCE() in __kretprobe_trampoline_handler().
 Changes in v3:
  - Remove generic stacktrace fixup. Instead, it should be solved in
    each unwinder. This just provide the generic interface.
 Changes in v2:
  - Add is_kretprobe_trampoline() for checking address outside of
    kretprobe_find_ret_addr()
  - Remove unneeded addr from kretprobe_find_ret_addr()
  - Rename fixup_kretprobe_tramp_addr() to fixup_kretprobe_trampoline()
---
 include/linux/kprobes.h |   22 +++++++++
 kernel/kprobes.c        |  109 ++++++++++++++++++++++++++++++++++++-----------
 2 files changed, 105 insertions(+), 26 deletions(-)

diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index b6b2370f4a4c..6d47a9da1e0a 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -505,6 +505,28 @@ static inline bool is_kprobe_optinsn_slot(unsigned long addr)
 }
 #endif /* !CONFIG_OPTPROBES */
 
+#ifdef CONFIG_KRETPROBES
+static nokprobe_inline bool is_kretprobe_trampoline(unsigned long addr)
+{
+	return (void *)addr == kretprobe_trampoline_addr();
+}
+
+unsigned long kretprobe_find_ret_addr(struct task_struct *tsk, void *fp,
+				      struct llist_node **cur);
+#else
+static nokprobe_inline bool is_kretprobe_trampoline(unsigned long addr)
+{
+	return false;
+}
+
+static nokprobe_inline
+unsigned long kretprobe_find_ret_addr(struct task_struct *tsk, void *fp,
+				      struct llist_node **cur)
+{
+	return 0;
+}
+#endif
+
 /* Returns true if kprobes handled the fault */
 static nokprobe_inline bool kprobe_page_fault(struct pt_regs *regs,
 					      unsigned int trap)
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 6ed755111eea..833f07f33115 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1863,45 +1863,87 @@ static struct notifier_block kprobe_exceptions_nb = {
 
 #ifdef CONFIG_KRETPROBES
 
-unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
-					     void *frame_pointer)
+/* This assumes the 'tsk' is the current task or the is not running. */
+static kprobe_opcode_t *__kretprobe_find_ret_addr(struct task_struct *tsk,
+						  struct llist_node **cur)
 {
-	kprobe_opcode_t *correct_ret_addr = NULL;
 	struct kretprobe_instance *ri = NULL;
-	struct llist_node *first, *node;
-	struct kretprobe *rp;
+	struct llist_node *node = *cur;
+
+	if (!node)
+		node = tsk->kretprobe_instances.first;
+	else
+		node = node->next;
 
-	/* Find all nodes for this frame. */
-	first = node = current->kretprobe_instances.first;
 	while (node) {
 		ri = container_of(node, struct kretprobe_instance, llist);
-
-		BUG_ON(ri->fp != frame_pointer);
-
 		if (ri->ret_addr != kretprobe_trampoline_addr()) {
-			correct_ret_addr = ri->ret_addr;
-			/*
-			 * This is the real return address. Any other
-			 * instances associated with this task are for
-			 * other calls deeper on the call stack
-			 */
-			goto found;
+			*cur = node;
+			return ri->ret_addr;
 		}
-
 		node = node->next;
 	}
-	pr_err("kretprobe: Return address not found, not execute handler. Maybe there is a bug in the kernel.\n");
-	BUG_ON(1);
+	return NULL;
+}
+NOKPROBE_SYMBOL(__kretprobe_find_ret_addr);
 
-found:
-	/* Unlink all nodes for this frame. */
-	current->kretprobe_instances.first = node->next;
-	node->next = NULL;
+/**
+ * kretprobe_find_ret_addr -- Find correct return address modified by kretprobe
+ * @tsk: Target task
+ * @fp: A frame pointer
+ * @cur: a storage of the loop cursor llist_node pointer for next call
+ *
+ * Find the correct return address modified by a kretprobe on @tsk in unsigned
+ * long type. If it finds the return address, this returns that address value,
+ * or this returns 0.
+ * The @tsk must be 'current' or a task which is not running. @fp is a hint
+ * to get the currect return address - which is compared with the
+ * kretprobe_instance::fp field. The @cur is a loop cursor for searching the
+ * kretprobe return addresses on the @tsk. The '*@cur' should be NULL at the
+ * first call, but '@cur' itself must NOT NULL.
+ */
+unsigned long kretprobe_find_ret_addr(struct task_struct *tsk, void *fp,
+				      struct llist_node **cur)
+{
+	struct kretprobe_instance *ri = NULL;
+	kprobe_opcode_t *ret;
+
+	if (WARN_ON_ONCE(!cur))
+		return 0;
+
+	do {
+		ret = __kretprobe_find_ret_addr(tsk, cur);
+		if (!ret)
+			break;
+		ri = container_of(*cur, struct kretprobe_instance, llist);
+	} while (ri->fp != fp);
 
-	/* Run them..  */
+	return (unsigned long)ret;
+}
+NOKPROBE_SYMBOL(kretprobe_find_ret_addr);
+
+unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
+					     void *frame_pointer)
+{
+	kprobe_opcode_t *correct_ret_addr = NULL;
+	struct kretprobe_instance *ri = NULL;
+	struct llist_node *first, *node = NULL;
+	struct kretprobe *rp;
+
+	/* Find correct address and all nodes for this frame. */
+	correct_ret_addr = __kretprobe_find_ret_addr(current, &node);
+	if (!correct_ret_addr) {
+		pr_err("kretprobe: Return address not found, not execute handler. Maybe there is a bug in the kernel.\n");
+		BUG_ON(1);
+	}
+
+	/* Run the user handler of the nodes. */
+	first = current->kretprobe_instances.first;
 	while (first) {
 		ri = container_of(first, struct kretprobe_instance, llist);
-		first = first->next;
+
+		if (WARN_ON_ONCE(ri->fp != frame_pointer))
+			break;
 
 		rp = get_kretprobe(ri);
 		if (rp && rp->handler) {
@@ -1912,6 +1954,21 @@ unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
 			rp->handler(ri, regs);
 			__this_cpu_write(current_kprobe, prev);
 		}
+		if (first == node)
+			break;
+
+		first = first->next;
+	}
+
+	/* Unlink all nodes for this frame. */
+	first = current->kretprobe_instances.first;
+	current->kretprobe_instances.first = node->next;
+	node->next = NULL;
+
+	/* Recycle free instances. */
+	while (first) {
+		ri = container_of(first, struct kretprobe_instance, llist);
+		first = first->next;
 
 		recycle_rp_inst(ri);
 	}


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 06/16] objtool: Add frame-pointer-specific function ignore
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
                   ` (4 preceding siblings ...)
  2021-07-29 14:06 ` [PATCH -tip v10 05/16] kprobes: Add kretprobe_find_ret_addr() for searching return address Masami Hiramatsu
@ 2021-07-29 14:06 ` Masami Hiramatsu
  2021-07-29 14:07 ` [PATCH -tip v10 07/16] objtool: Ignore unwind hints for ignored functions Masami Hiramatsu
                   ` (11 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:06 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

From: Josh Poimboeuf <jpoimboe@redhat.com>

Add a CONFIG_FRAME_POINTER-specific version of
STACK_FRAME_NON_STANDARD() for the case where a function is
intentionally missing frame pointer setup, but otherwise needs
objtool/ORC coverage when frame pointers are disabled.

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 Changes in v10:
  - Fixes error in CONFIG_STACK_VALIDATION=n case.
---
 include/linux/objtool.h       |   12 ++++++++++++
 tools/include/linux/objtool.h |   12 ++++++++++++
 2 files changed, 24 insertions(+)

diff --git a/include/linux/objtool.h b/include/linux/objtool.h
index 7e72d975cb76..aca52db2f3f3 100644
--- a/include/linux/objtool.h
+++ b/include/linux/objtool.h
@@ -66,6 +66,17 @@ struct unwind_hint {
 	static void __used __section(".discard.func_stack_frame_non_standard") \
 		*__func_stack_frame_non_standard_##func = func
 
+/*
+ * STACK_FRAME_NON_STANDARD_FP() is a frame-pointer-specific function ignore
+ * for the case where a function is intentionally missing frame pointer setup,
+ * but otherwise needs objtool/ORC coverage when frame pointers are disabled.
+ */
+#ifdef CONFIG_FRAME_POINTER
+#define STACK_FRAME_NON_STANDARD_FP(func) STACK_FRAME_NON_STANDARD(func)
+#else
+#define STACK_FRAME_NON_STANDARD_FP(func)
+#endif
+
 #else /* __ASSEMBLY__ */
 
 /*
@@ -127,6 +138,7 @@ struct unwind_hint {
 #define UNWIND_HINT(sp_reg, sp_offset, type, end)	\
 	"\n\t"
 #define STACK_FRAME_NON_STANDARD(func)
+#define STACK_FRAME_NON_STANDARD_FP(func)
 #else
 #define ANNOTATE_INTRA_FUNCTION_CALL
 .macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0
diff --git a/tools/include/linux/objtool.h b/tools/include/linux/objtool.h
index 7e72d975cb76..aca52db2f3f3 100644
--- a/tools/include/linux/objtool.h
+++ b/tools/include/linux/objtool.h
@@ -66,6 +66,17 @@ struct unwind_hint {
 	static void __used __section(".discard.func_stack_frame_non_standard") \
 		*__func_stack_frame_non_standard_##func = func
 
+/*
+ * STACK_FRAME_NON_STANDARD_FP() is a frame-pointer-specific function ignore
+ * for the case where a function is intentionally missing frame pointer setup,
+ * but otherwise needs objtool/ORC coverage when frame pointers are disabled.
+ */
+#ifdef CONFIG_FRAME_POINTER
+#define STACK_FRAME_NON_STANDARD_FP(func) STACK_FRAME_NON_STANDARD(func)
+#else
+#define STACK_FRAME_NON_STANDARD_FP(func)
+#endif
+
 #else /* __ASSEMBLY__ */
 
 /*
@@ -127,6 +138,7 @@ struct unwind_hint {
 #define UNWIND_HINT(sp_reg, sp_offset, type, end)	\
 	"\n\t"
 #define STACK_FRAME_NON_STANDARD(func)
+#define STACK_FRAME_NON_STANDARD_FP(func)
 #else
 #define ANNOTATE_INTRA_FUNCTION_CALL
 .macro UNWIND_HINT sp_reg:req sp_offset=0 type:req end=0


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 07/16] objtool: Ignore unwind hints for ignored functions
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
                   ` (5 preceding siblings ...)
  2021-07-29 14:06 ` [PATCH -tip v10 06/16] objtool: Add frame-pointer-specific function ignore Masami Hiramatsu
@ 2021-07-29 14:07 ` Masami Hiramatsu
  2021-07-29 14:07 ` [PATCH -tip v10 08/16] x86/kprobes: Add UNWIND_HINT_FUNC on kretprobe_trampoline() Masami Hiramatsu
                   ` (10 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:07 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

From: Josh Poimboeuf <jpoimboe@redhat.com>

If a function is ignored, also ignore its hints.  This is useful for the
case where the function ignore is conditional on frame pointers, e.g.
STACK_FRAME_NON_STANDARD_FP().

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Reviewed-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 tools/objtool/check.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/objtool/check.c b/tools/objtool/check.c
index e5947fbb9e7a..67cbdcfcabae 100644
--- a/tools/objtool/check.c
+++ b/tools/objtool/check.c
@@ -2909,7 +2909,7 @@ static int validate_unwind_hints(struct objtool_file *file, struct section *sec)
 	}
 
 	while (&insn->list != &file->insn_list && (!sec || insn->sec == sec)) {
-		if (insn->hint && !insn->visited) {
+		if (insn->hint && !insn->visited && !insn->ignore) {
 			ret = validate_branch(file, insn->func, insn, state);
 			if (ret && backtrace)
 				BT_FUNC("<=== (hint)", insn);


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 08/16] x86/kprobes: Add UNWIND_HINT_FUNC on kretprobe_trampoline()
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
                   ` (6 preceding siblings ...)
  2021-07-29 14:07 ` [PATCH -tip v10 07/16] objtool: Ignore unwind hints for ignored functions Masami Hiramatsu
@ 2021-07-29 14:07 ` Masami Hiramatsu
  2021-07-29 14:07 ` [PATCH -tip v10 09/16] ARC: Add instruction_pointer_set() API Masami Hiramatsu
                   ` (9 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:07 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

From: Josh Poimboeuf <jpoimboe@redhat.com>

Add UNWIND_HINT_FUNC on __kretprobe_trampoline() code so that ORC
information is generated on the __kretprobe_trampoline() correctly.
Also, this uses STACK_FRAME_NON_STANDARD_FP(), CONFIG_FRAME_POINTER-
-specific version of STACK_FRAME_NON_STANDARD().

Signed-off-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
---
 Changes in v9:
  - Update changelog and fix comment.
  - Use STACK_FRAME_NON_STANDARD_FP().
 Changes in v4:
  - Apply UNWIND_HINT_FUNC only if CONFIG_FRAME_POINTER=n.
---
 arch/x86/include/asm/unwind_hints.h |    5 +++++
 arch/x86/kernel/kprobes/core.c      |   13 +++++++++++--
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/unwind_hints.h b/arch/x86/include/asm/unwind_hints.h
index 8e574c0afef8..8b33674288ea 100644
--- a/arch/x86/include/asm/unwind_hints.h
+++ b/arch/x86/include/asm/unwind_hints.h
@@ -52,6 +52,11 @@
 	UNWIND_HINT sp_reg=ORC_REG_SP sp_offset=8 type=UNWIND_HINT_TYPE_FUNC
 .endm
 
+#else
+
+#define UNWIND_HINT_FUNC \
+	UNWIND_HINT(ORC_REG_SP, 8, UNWIND_HINT_TYPE_FUNC, 0)
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* _ASM_X86_UNWIND_HINTS_H */
diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 79cd23dba5b5..d1436d7463fd 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -1025,6 +1025,7 @@ asm(
 	/* We don't bother saving the ss register */
 #ifdef CONFIG_X86_64
 	"	pushq %rsp\n"
+	UNWIND_HINT_FUNC
 	"	pushfq\n"
 	SAVE_REGS_STRING
 	"	movq %rsp, %rdi\n"
@@ -1035,6 +1036,7 @@ asm(
 	"	popfq\n"
 #else
 	"	pushl %esp\n"
+	UNWIND_HINT_FUNC
 	"	pushfl\n"
 	SAVE_REGS_STRING
 	"	movl %esp, %eax\n"
@@ -1048,8 +1050,15 @@ asm(
 	".size __kretprobe_trampoline, .-__kretprobe_trampoline\n"
 );
 NOKPROBE_SYMBOL(__kretprobe_trampoline);
-STACK_FRAME_NON_STANDARD(__kretprobe_trampoline);
-
+/*
+ * __kretprobe_trampoline() skips updating frame pointer. The frame pointer
+ * saved in trampoline_handler() points to the real caller function's
+ * frame pointer. Thus the __kretprobe_trampoline() doesn't have a
+ * standard stack frame with CONFIG_FRAME_POINTER=y.
+ * Let's mark it non-standard function. Anyway, FP unwinder can correctly
+ * unwind without the hint.
+ */
+STACK_FRAME_NON_STANDARD_FP(__kretprobe_trampoline);
 
 /*
  * Called from __kretprobe_trampoline


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 09/16] ARC: Add instruction_pointer_set() API
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
                   ` (7 preceding siblings ...)
  2021-07-29 14:07 ` [PATCH -tip v10 08/16] x86/kprobes: Add UNWIND_HINT_FUNC on kretprobe_trampoline() Masami Hiramatsu
@ 2021-07-29 14:07 ` Masami Hiramatsu
  2021-07-29 14:07 ` [PATCH -tip v10 10/16] ia64: " Masami Hiramatsu
                   ` (8 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:07 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

Add instruction_pointer_set() API for arc.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 arch/arc/include/asm/ptrace.h |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arc/include/asm/ptrace.h b/arch/arc/include/asm/ptrace.h
index 4c3c9be5bd16..cca8d6583e31 100644
--- a/arch/arc/include/asm/ptrace.h
+++ b/arch/arc/include/asm/ptrace.h
@@ -149,6 +149,11 @@ static inline long regs_return_value(struct pt_regs *regs)
 	return (long)regs->r0;
 }
 
+static inline void instruction_pointer_set(struct pt_regs *regs,
+					   unsigned long val)
+{
+	instruction_pointer(regs) = val;
+}
 #endif /* !__ASSEMBLY__ */
 
 #endif /* __ASM_PTRACE_H */


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 10/16] ia64: Add instruction_pointer_set() API
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
                   ` (8 preceding siblings ...)
  2021-07-29 14:07 ` [PATCH -tip v10 09/16] ARC: Add instruction_pointer_set() API Masami Hiramatsu
@ 2021-07-29 14:07 ` Masami Hiramatsu
  2021-07-29 14:07 ` [PATCH -tip v10 11/16] arm: kprobes: Make space for instruction pointer on stack Masami Hiramatsu
                   ` (7 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:07 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

Add instruction_pointer_set() API for ia64.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
  Changes in v9:
   - Fix "space at the start of a line" checkpatch warnings.
  Changes in v4:
   - Make the API macro for avoiding a build error.
---
 arch/ia64/include/asm/ptrace.h |    5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/ia64/include/asm/ptrace.h b/arch/ia64/include/asm/ptrace.h
index 08179135905c..8a2d0f72b324 100644
--- a/arch/ia64/include/asm/ptrace.h
+++ b/arch/ia64/include/asm/ptrace.h
@@ -51,6 +51,11 @@
  * the canonical representation by adding to instruction pointer.
  */
 # define instruction_pointer(regs) ((regs)->cr_iip + ia64_psr(regs)->ri)
+# define instruction_pointer_set(regs, val)	\
+({						\
+	ia64_psr(regs)->ri = (val & 0xf);	\
+	regs->cr_iip = (val & ~0xfULL);		\
+})
 
 static inline unsigned long user_stack_pointer(struct pt_regs *regs)
 {


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 11/16] arm: kprobes: Make space for instruction pointer on stack
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
                   ` (9 preceding siblings ...)
  2021-07-29 14:07 ` [PATCH -tip v10 10/16] ia64: " Masami Hiramatsu
@ 2021-07-29 14:07 ` Masami Hiramatsu
  2021-07-29 14:07 ` [PATCH -tip v10 12/16] kprobes: Enable stacktrace from pt_regs in kretprobe handler Masami Hiramatsu
                   ` (6 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:07 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

Since arm's __kretprobe_trampoline() saves partial 'pt_regs' on the
stack, 'regs->ARM_pc' (instruction pointer) is not accessible from
the kretprobe handler. This means if instruction_pointer_set() is
used from kretprobe handler, it will break the data on the stack.

Make space for instruction pointer (ARM_pc) on the stack in the
__kretprobe_trampoline() for fixing this problem.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 Changes in v9:
  - Update changelog.
---
 arch/arm/probes/kprobes/core.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/probes/kprobes/core.c b/arch/arm/probes/kprobes/core.c
index 67ce7eb8f285..95f23b47ba27 100644
--- a/arch/arm/probes/kprobes/core.c
+++ b/arch/arm/probes/kprobes/core.c
@@ -376,11 +376,13 @@ int __kprobes kprobe_exceptions_notify(struct notifier_block *self,
 void __naked __kprobes __kretprobe_trampoline(void)
 {
 	__asm__ __volatile__ (
+		"sub	sp, sp, #16		\n\t"
 		"stmdb	sp!, {r0 - r11}		\n\t"
 		"mov	r0, sp			\n\t"
 		"bl	trampoline_handler	\n\t"
 		"mov	lr, r0			\n\t"
 		"ldmia	sp!, {r0 - r11}		\n\t"
+		"add	sp, sp, #16		\n\t"
 #ifdef CONFIG_THUMB2_KERNEL
 		"bx	lr			\n\t"
 #else


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 12/16] kprobes: Enable stacktrace from pt_regs in kretprobe handler
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
                   ` (10 preceding siblings ...)
  2021-07-29 14:07 ` [PATCH -tip v10 11/16] arm: kprobes: Make space for instruction pointer on stack Masami Hiramatsu
@ 2021-07-29 14:07 ` Masami Hiramatsu
  2021-07-29 14:07 ` [PATCH -tip v10 13/16] x86/kprobes: Push a fake return address at kretprobe_trampoline Masami Hiramatsu
                   ` (5 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:07 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

Since the ORC unwinder from pt_regs requires setting up regs->ip
correctly, set the correct return address to the regs->ip before
calling user kretprobe handler.

This allows the kretrprobe handler to trace stack from the
kretprobe's pt_regs by stack_trace_save_regs() (eBPF will do
this), instead of stack tracing from the handler context by
stack_trace_save() (ftrace will do this).

Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 Changes in v9:
  - Update comment to explain specifically why this is necessary.
 Changes in v8:
  - Update comment to clarify why this is needed.
 Changes in v3:
  - Cast the correct_ret_addr to unsigned long.
---
 kernel/kprobes.c |    7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 833f07f33115..ebc587b9a346 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1937,6 +1937,13 @@ unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
 		BUG_ON(1);
 	}
 
+	/*
+	 * Set the return address as the instruction pointer, because if the
+	 * user handler calls stack_trace_save_regs() with this 'regs',
+	 * the stack trace will start from the instruction pointer.
+	 */
+	instruction_pointer_set(regs, (unsigned long)correct_ret_addr);
+
 	/* Run the user handler of the nodes. */
 	first = current->kretprobe_instances.first;
 	while (first) {


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 13/16] x86/kprobes: Push a fake return address at kretprobe_trampoline
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
                   ` (11 preceding siblings ...)
  2021-07-29 14:07 ` [PATCH -tip v10 12/16] kprobes: Enable stacktrace from pt_regs in kretprobe handler Masami Hiramatsu
@ 2021-07-29 14:07 ` Masami Hiramatsu
  2021-07-29 14:08 ` [PATCH -tip v10 14/16] x86/unwind: Recover kretprobe trampoline entry Masami Hiramatsu
                   ` (4 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:07 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

Change __kretprobe_trampoline() to push the address of the
__kretprobe_trampoline() as a fake return address at the bottom
of the stack frame. This fake return address will be replaced
with the correct return address in the trampoline_handler().

With this change, the ORC unwinder can check whether the return
address is modified by kretprobes or not.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
 Changes in v9:
  - Update changelog and comment.
  - Remove unneeded type casting.
---
 arch/x86/kernel/kprobes/core.c |   34 +++++++++++++++++++++++++---------
 1 file changed, 25 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index d1436d7463fd..7e1111c19605 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -1022,28 +1022,33 @@ asm(
 	".global __kretprobe_trampoline\n"
 	".type __kretprobe_trampoline, @function\n"
 	"__kretprobe_trampoline:\n"
-	/* We don't bother saving the ss register */
 #ifdef CONFIG_X86_64
-	"	pushq %rsp\n"
+	/* Push a fake return address to tell the unwinder it's a kretprobe. */
+	"	pushq $__kretprobe_trampoline\n"
 	UNWIND_HINT_FUNC
+	/* Save the 'sp - 8', this will be fixed later. */
+	"	pushq %rsp\n"
 	"	pushfq\n"
 	SAVE_REGS_STRING
 	"	movq %rsp, %rdi\n"
 	"	call trampoline_handler\n"
-	/* Replace saved sp with true return address. */
-	"	movq %rax, 19*8(%rsp)\n"
 	RESTORE_REGS_STRING
+	/* In trampoline_handler(), 'regs->flags' is copied to 'regs->sp'. */
+	"	addq $8, %rsp\n"
 	"	popfq\n"
 #else
-	"	pushl %esp\n"
+	/* Push a fake return address to tell the unwinder it's a kretprobe. */
+	"	pushl $__kretprobe_trampoline\n"
 	UNWIND_HINT_FUNC
+	/* Save the 'sp - 4', this will be fixed later. */
+	"	pushl %esp\n"
 	"	pushfl\n"
 	SAVE_REGS_STRING
 	"	movl %esp, %eax\n"
 	"	call trampoline_handler\n"
-	/* Replace saved sp with true return address. */
-	"	movl %eax, 15*4(%esp)\n"
 	RESTORE_REGS_STRING
+	/* In trampoline_handler(), 'regs->flags' is copied to 'regs->sp'. */
+	"	addl $4, %esp\n"
 	"	popfl\n"
 #endif
 	"	ret\n"
@@ -1063,8 +1068,10 @@ STACK_FRAME_NON_STANDARD_FP(__kretprobe_trampoline);
 /*
  * Called from __kretprobe_trampoline
  */
-__used __visible void *trampoline_handler(struct pt_regs *regs)
+__used __visible void trampoline_handler(struct pt_regs *regs)
 {
+	unsigned long *frame_pointer;
+
 	/* fixup registers */
 	regs->cs = __KERNEL_CS;
 #ifdef CONFIG_X86_32
@@ -1072,8 +1079,17 @@ __used __visible void *trampoline_handler(struct pt_regs *regs)
 #endif
 	regs->ip = (unsigned long)&__kretprobe_trampoline;
 	regs->orig_ax = ~0UL;
+	regs->sp += sizeof(long);
+	frame_pointer = &regs->sp + 1;
+
+	/* Replace fake return address with real one. */
+	*frame_pointer = kretprobe_trampoline_handler(regs, frame_pointer);
 
-	return (void *)kretprobe_trampoline_handler(regs, &regs->sp);
+	/*
+	 * Copy FLAGS to 'pt_regs::sp' so that __kretprobe_trapmoline()
+	 * can do RET right after POPF.
+	 */
+	regs->sp = regs->flags;
 }
 NOKPROBE_SYMBOL(trampoline_handler);
 


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 14/16] x86/unwind: Recover kretprobe trampoline entry
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
                   ` (12 preceding siblings ...)
  2021-07-29 14:07 ` [PATCH -tip v10 13/16] x86/kprobes: Push a fake return address at kretprobe_trampoline Masami Hiramatsu
@ 2021-07-29 14:08 ` Masami Hiramatsu
  2021-07-29 14:08 ` [PATCH -tip v10 15/16] tracing: Show kretprobe unknown indicator only for kretprobe_trampoline Masami Hiramatsu
                   ` (3 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:08 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

Since the kretprobe replaces the function return address with
the kretprobe_trampoline on the stack, x86 unwinders can not
continue the stack unwinding at that point, or record
kretprobe_trampoline instead of correct return address.

To fix this issue, find the correct return address from task's
kretprobe_instances as like as function-graph tracer does.

With this fix, the unwinder can correctly unwind the stack
from kretprobe event on x86, as below.

           <...>-135     [003] ...1     6.722338: r_full_proxy_read_0: (vfs_read+0xab/0x1a0 <- full_proxy_read)
           <...>-135     [003] ...1     6.722377: <stack trace>
 => kretprobe_trace_func+0x209/0x2f0
 => kretprobe_dispatcher+0x4a/0x70
 => __kretprobe_trampoline_handler+0xca/0x150
 => trampoline_handler+0x44/0x70
 => kretprobe_trampoline+0x2a/0x50
 => vfs_read+0xab/0x1a0
 => ksys_read+0x5f/0xe0
 => do_syscall_64+0x33/0x40
 => entry_SYSCALL_64_after_hwframe+0x44/0xae


Reported-by: Daniel Xu <dxu@dxuuu.xyz>
Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Suggested-by: Josh Poimboeuf <jpoimboe@redhat.com>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Josh Poimboeuf <jpoimboe@redhat.com>
---
  Changes in v9:
   - Update comment so that it explains why the strange address passed
     to unwind_recover_kretprobe().
  Changes in v7:
   - Remove superfluous #include <linux/kprobes.h>.
  Changes in v5:
   - Fix the case of interrupt happens on kretprobe_trampoline+0.
  Changes in v3:
   - Split out the kretprobe side patch
   - Fix build error when CONFIG_KRETPROBES=n.
  Changes in v2:
   - Remove kretprobe wrapper functions from unwind_orc.c
   - Do not fixup state->ip when unwinding with regs because
     kretprobe fixup instruction pointer before calling handler.
---
 arch/x86/include/asm/unwind.h  |   23 +++++++++++++++++++++++
 arch/x86/kernel/unwind_frame.c |    3 +--
 arch/x86/kernel/unwind_guess.c |    3 +--
 arch/x86/kernel/unwind_orc.c   |   21 +++++++++++++++++----
 4 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/unwind.h b/arch/x86/include/asm/unwind.h
index 70fc159ebe69..fca2e783e3ce 100644
--- a/arch/x86/include/asm/unwind.h
+++ b/arch/x86/include/asm/unwind.h
@@ -4,6 +4,7 @@
 
 #include <linux/sched.h>
 #include <linux/ftrace.h>
+#include <linux/kprobes.h>
 #include <asm/ptrace.h>
 #include <asm/stacktrace.h>
 
@@ -15,6 +16,7 @@ struct unwind_state {
 	unsigned long stack_mask;
 	struct task_struct *task;
 	int graph_idx;
+	struct llist_node *kr_cur;
 	bool error;
 #if defined(CONFIG_UNWINDER_ORC)
 	bool signal, full_regs;
@@ -99,6 +101,27 @@ void unwind_module_init(struct module *mod, void *orc_ip, size_t orc_ip_size,
 			void *orc, size_t orc_size) {}
 #endif
 
+static inline
+unsigned long unwind_recover_kretprobe(struct unwind_state *state,
+				       unsigned long addr, unsigned long *addr_p)
+{
+	return is_kretprobe_trampoline(addr) ?
+		kretprobe_find_ret_addr(state->task, addr_p, &state->kr_cur) :
+		addr;
+}
+
+/* Recover the return address modified by kretprobe and ftrace_graph. */
+static inline
+unsigned long unwind_recover_ret_addr(struct unwind_state *state,
+				     unsigned long addr, unsigned long *addr_p)
+{
+	unsigned long ret;
+
+	ret = ftrace_graph_ret_addr(state->task, &state->graph_idx,
+				    addr, addr_p);
+	return unwind_recover_kretprobe(state, ret, addr_p);
+}
+
 /*
  * This disables KASAN checking when reading a value from another task's stack,
  * since the other task could be running on another CPU and could have poisoned
diff --git a/arch/x86/kernel/unwind_frame.c b/arch/x86/kernel/unwind_frame.c
index d7c44b257f7f..8e1c50c86e5d 100644
--- a/arch/x86/kernel/unwind_frame.c
+++ b/arch/x86/kernel/unwind_frame.c
@@ -240,8 +240,7 @@ static bool update_stack_state(struct unwind_state *state,
 	else {
 		addr_p = unwind_get_return_address_ptr(state);
 		addr = READ_ONCE_TASK_STACK(state->task, *addr_p);
-		state->ip = ftrace_graph_ret_addr(state->task, &state->graph_idx,
-						  addr, addr_p);
+		state->ip = unwind_recover_ret_addr(state, addr, addr_p);
 	}
 
 	/* Save the original stack pointer for unwind_dump(): */
diff --git a/arch/x86/kernel/unwind_guess.c b/arch/x86/kernel/unwind_guess.c
index c49f10ffd8cd..884d68a6e714 100644
--- a/arch/x86/kernel/unwind_guess.c
+++ b/arch/x86/kernel/unwind_guess.c
@@ -15,8 +15,7 @@ unsigned long unwind_get_return_address(struct unwind_state *state)
 
 	addr = READ_ONCE_NOCHECK(*state->sp);
 
-	return ftrace_graph_ret_addr(state->task, &state->graph_idx,
-				     addr, state->sp);
+	return unwind_recover_ret_addr(state, addr, state->sp);
 }
 EXPORT_SYMBOL_GPL(unwind_get_return_address);
 
diff --git a/arch/x86/kernel/unwind_orc.c b/arch/x86/kernel/unwind_orc.c
index a1202536fc57..e6f7592790af 100644
--- a/arch/x86/kernel/unwind_orc.c
+++ b/arch/x86/kernel/unwind_orc.c
@@ -534,9 +534,8 @@ bool unwind_next_frame(struct unwind_state *state)
 		if (!deref_stack_reg(state, ip_p, &state->ip))
 			goto err;
 
-		state->ip = ftrace_graph_ret_addr(state->task, &state->graph_idx,
-						  state->ip, (void *)ip_p);
-
+		state->ip = unwind_recover_ret_addr(state, state->ip,
+						    (unsigned long *)ip_p);
 		state->sp = sp;
 		state->regs = NULL;
 		state->prev_regs = NULL;
@@ -549,7 +548,18 @@ bool unwind_next_frame(struct unwind_state *state)
 					 (void *)orig_ip);
 			goto err;
 		}
-
+		/*
+		 * There is a small chance to interrupt at the entry of
+		 * __kretprobe_trampoline() where the ORC info doesn't exist.
+		 * That point is right after the RET to __kretprobe_trampoline()
+		 * which was modified return address.
+		 * At that point, the @addr_p of the unwind_recover_kretprobe()
+		 * (this has to point the address of the stack entry storing
+		 * the modified return address) must be "SP - (a stack entry)"
+		 * because SP is incremented by the RET.
+		 */
+		state->ip = unwind_recover_kretprobe(state, state->ip,
+				(unsigned long *)(state->sp - sizeof(long)));
 		state->regs = (struct pt_regs *)sp;
 		state->prev_regs = NULL;
 		state->full_regs = true;
@@ -562,6 +572,9 @@ bool unwind_next_frame(struct unwind_state *state)
 					 (void *)orig_ip);
 			goto err;
 		}
+		/* See UNWIND_HINT_TYPE_REGS case comment. */
+		state->ip = unwind_recover_kretprobe(state, state->ip,
+				(unsigned long *)(state->sp - sizeof(long)));
 
 		if (state->full_regs)
 			state->prev_regs = state->regs;


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 15/16] tracing: Show kretprobe unknown indicator only for kretprobe_trampoline
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
                   ` (13 preceding siblings ...)
  2021-07-29 14:08 ` [PATCH -tip v10 14/16] x86/unwind: Recover kretprobe trampoline entry Masami Hiramatsu
@ 2021-07-29 14:08 ` Masami Hiramatsu
  2021-07-29 14:08 ` [PATCH -tip v10 16/16] x86/kprobes: Fixup return address in generic trampoline handler Masami Hiramatsu
                   ` (2 subsequent siblings)
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:08 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

ftrace shows "[unknown/kretprobe'd]" indicator all addresses in the
kretprobe_trampoline, but the modified address by kretprobe should
be only kretprobe_trampoline+0.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Acked-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
---
 kernel/trace/trace_output.c |   17 ++++-------------
 1 file changed, 4 insertions(+), 13 deletions(-)

diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
index 9b2e69619057..bdadcc8cee75 100644
--- a/kernel/trace/trace_output.c
+++ b/kernel/trace/trace_output.c
@@ -8,6 +8,7 @@
 #include <linux/module.h>
 #include <linux/mutex.h>
 #include <linux/ftrace.h>
+#include <linux/kprobes.h>
 #include <linux/sched/clock.h>
 #include <linux/sched/mm.h>
 
@@ -346,22 +347,12 @@ int trace_output_call(struct trace_iterator *iter, char *name, char *fmt, ...)
 }
 EXPORT_SYMBOL_GPL(trace_output_call);
 
-#ifdef CONFIG_KRETPROBES
-static inline const char *kretprobed(const char *name)
+static inline const char *kretprobed(const char *name, unsigned long addr)
 {
-	static const char tramp_name[] = "__kretprobe_trampoline";
-	int size = sizeof(tramp_name);
-
-	if (strncmp(tramp_name, name, size) == 0)
+	if (is_kretprobe_trampoline(addr))
 		return "[unknown/kretprobe'd]";
 	return name;
 }
-#else
-static inline const char *kretprobed(const char *name)
-{
-	return name;
-}
-#endif /* CONFIG_KRETPROBES */
 
 void
 trace_seq_print_sym(struct trace_seq *s, unsigned long address, bool offset)
@@ -374,7 +365,7 @@ trace_seq_print_sym(struct trace_seq *s, unsigned long address, bool offset)
 		sprint_symbol(str, address);
 	else
 		kallsyms_lookup(address, NULL, NULL, NULL, str);
-	name = kretprobed(str);
+	name = kretprobed(str, address);
 
 	if (name && strlen(name)) {
 		trace_seq_puts(s, name);


^ permalink raw reply	[flat|nested] 28+ messages in thread

* [PATCH -tip v10 16/16] x86/kprobes: Fixup return address in generic trampoline handler
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
                   ` (14 preceding siblings ...)
  2021-07-29 14:08 ` [PATCH -tip v10 15/16] tracing: Show kretprobe unknown indicator only for kretprobe_trampoline Masami Hiramatsu
@ 2021-07-29 14:08 ` Masami Hiramatsu
  2021-07-29 23:35 ` [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
  2021-08-29 14:22 ` [RFC PATCH 0/1] Non stack-intrusive return probe event Masami Hiramatsu
  17 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 14:08 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

In x86, the fake return address on the stack saved by
__kretprobe_trampoline() will be replaced with the real return
address after returning from trampoline_handler(). Before fixing
the return address, the real return address can be found in the
'current->kretprobe_instances'.

However, since there is a window between updating the
'current->kretprobe_instances' and fixing the address on the stack,
if an interrupt happens at that timing and the interrupt handler
does stacktrace, it may fail to unwind because it can not get
the correct return address from 'current->kretprobe_instances'.

This will eliminate that window by fixing the return address
right before updating 'current->kretprobe_instances'.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
Tested-by: Andrii Nakryiko <andrii@kernel.org>
---
 Changes in v9:
  - Fixes the changelog. This can eliminate the window.
  - Add more comment how it works.
 Changes in v7:
  - Add a prototype for arch_kretprobe_fixup_return()
---
 arch/x86/kernel/kprobes/core.c |   18 ++++++++++++++++--
 include/linux/kprobes.h        |    3 +++
 kernel/kprobes.c               |   11 +++++++++++
 3 files changed, 30 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index 7e1111c19605..fce99e249d61 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -1065,6 +1065,16 @@ NOKPROBE_SYMBOL(__kretprobe_trampoline);
  */
 STACK_FRAME_NON_STANDARD_FP(__kretprobe_trampoline);
 
+/* This is called from kretprobe_trampoline_handler(). */
+void arch_kretprobe_fixup_return(struct pt_regs *regs,
+				 kprobe_opcode_t *correct_ret_addr)
+{
+	unsigned long *frame_pointer = &regs->sp + 1;
+
+	/* Replace fake return address with real one. */
+	*frame_pointer = (unsigned long)correct_ret_addr;
+}
+
 /*
  * Called from __kretprobe_trampoline
  */
@@ -1082,8 +1092,12 @@ __used __visible void trampoline_handler(struct pt_regs *regs)
 	regs->sp += sizeof(long);
 	frame_pointer = &regs->sp + 1;
 
-	/* Replace fake return address with real one. */
-	*frame_pointer = kretprobe_trampoline_handler(regs, frame_pointer);
+	/*
+	 * The return address at 'frame_pointer' is recovered by the
+	 * arch_kretprobe_fixup_return() which called from the
+	 * kretprobe_trampoline_handler().
+	 */
+	kretprobe_trampoline_handler(regs, frame_pointer);
 
 	/*
 	 * Copy FLAGS to 'pt_regs::sp' so that __kretprobe_trapmoline()
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index 6d47a9da1e0a..e974caf39d3e 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -188,6 +188,9 @@ extern void arch_prepare_kretprobe(struct kretprobe_instance *ri,
 				   struct pt_regs *regs);
 extern int arch_trampoline_kprobe(struct kprobe *p);
 
+void arch_kretprobe_fixup_return(struct pt_regs *regs,
+				 kprobe_opcode_t *correct_ret_addr);
+
 void __kretprobe_trampoline(void);
 /*
  * Since some architecture uses structured function pointer,
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index ebc587b9a346..b62af9fc3607 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1922,6 +1922,15 @@ unsigned long kretprobe_find_ret_addr(struct task_struct *tsk, void *fp,
 }
 NOKPROBE_SYMBOL(kretprobe_find_ret_addr);
 
+void __weak arch_kretprobe_fixup_return(struct pt_regs *regs,
+					kprobe_opcode_t *correct_ret_addr)
+{
+	/*
+	 * Do nothing by default. Please fill this to update the fake return
+	 * address on the stack with the correct one on each arch if possible.
+	 */
+}
+
 unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
 					     void *frame_pointer)
 {
@@ -1967,6 +1976,8 @@ unsigned long __kretprobe_trampoline_handler(struct pt_regs *regs,
 		first = first->next;
 	}
 
+	arch_kretprobe_fixup_return(regs, correct_ret_addr);
+
 	/* Unlink all nodes for this frame. */
 	first = current->kretprobe_instances.first;
 	current->kretprobe_instances.first = node->next;


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
                   ` (15 preceding siblings ...)
  2021-07-29 14:08 ` [PATCH -tip v10 16/16] x86/kprobes: Fixup return address in generic trampoline handler Masami Hiramatsu
@ 2021-07-29 23:35 ` Masami Hiramatsu
  2021-08-24  5:12   ` Andrii Nakryiko
  2021-08-29 14:22 ` [RFC PATCH 0/1] Non stack-intrusive return probe event Masami Hiramatsu
  17 siblings, 1 reply; 28+ messages in thread
From: Masami Hiramatsu @ 2021-07-29 23:35 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Steven Rostedt, Josh Poimboeuf, Ingo Molnar, X86 ML, Daniel Xu,
	linux-kernel, bpf, kuba, mingo, ast, Thomas Gleixner,
	Borislav Petkov, Peter Zijlstra, kernel-team, yhs, linux-ia64,
	Abhishek Sagar, Andrii Nakryiko

On Thu, 29 Jul 2021 23:05:56 +0900
Masami Hiramatsu <mhiramat@kernel.org> wrote:

> Hello,
> 
> This is the 10th version of the series to fix the stacktrace with kretprobe on x86.
> 
> The previous version is here;
> 
>  https://lore.kernel.org/bpf/162601048053.1318837.1550594515476777588.stgit@devnote2/
> 
> This version is rebased on top of new kprobes cleanup series(*1) and merging
> Josh's objtool update series (*2)(*3) as [6/16] and [7/16].
> 
> (*1) https://lore.kernel.org/bpf/162748615977.59465.13262421617578791515.stgit@devnote2/
> (*2) https://lore.kernel.org/bpf/20210710192433.x5cgjsq2ksvaqnss@treble/
> (*3) https://lore.kernel.org/bpf/20210710192514.ghvksi3ozhez4lvb@treble/
> 
> Changes from v9:
>  - Add Josh's objtool update patches with a build error fix as [6/16] and [7/16].
>  - Add a API document for kretprobe_find_ret_addr() and check cur != NULL in [5/16].
> 
> With this series, unwinder can unwind stack correctly from ftrace as below;
> 
>   # cd /sys/kernel/debug/tracing
>   # echo > trace
>   # echo 1 > options/sym-offset
>   # echo r vfs_read >> kprobe_events
>   # echo r full_proxy_read >> kprobe_events
>   # echo traceoff:1 > events/kprobes/r_vfs_read_0/trigger
>   # echo stacktrace:1 > events/kprobes/r_full_proxy_read_0/trigger
>   # echo 1 > events/kprobes/enable
>   # cat /sys/kernel/debug/kprobes/list
> ffffffff8133b740  r  full_proxy_read+0x0    [FTRACE]
> ffffffff812560b0  r  vfs_read+0x0    [FTRACE]
>   # echo 0 > events/kprobes/enable
>   # cat trace
> # tracer: nop
> #
> # entries-in-buffer/entries-written: 3/3   #P:8
> #
> #                                _-----=> irqs-off
> #                               / _----=> need-resched
> #                              | / _---=> hardirq/softirq
> #                              || / _--=> preempt-depth
> #                              ||| /     delay
> #           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
> #              | |         |   ||||      |         |
>            <...>-134     [007] ...1    16.185877: r_full_proxy_read_0: (vfs_read+0x98/0x180 <- full_proxy_read)
>            <...>-134     [007] ...1    16.185901: <stack trace>
>  => kretprobe_trace_func+0x209/0x300
>  => kretprobe_dispatcher+0x4a/0x70
>  => __kretprobe_trampoline_handler+0xd4/0x170
>  => trampoline_handler+0x43/0x60
>  => kretprobe_trampoline+0x2a/0x50
>  => vfs_read+0x98/0x180
>  => ksys_read+0x5f/0xe0
>  => do_syscall_64+0x37/0x90
>  => entry_SYSCALL_64_after_hwframe+0x44/0xae
>            <...>-134     [007] ...1    16.185902: r_vfs_read_0: (ksys_read+0x5f/0xe0 <- vfs_read)
> 
> This shows the double return probes (vfs_read() and full_proxy_read()) on the stack
> correctly unwinded. (vfs_read() returns to 'ksys_read+0x5f' and full_proxy_read()
> returns to 'vfs_read+0x98')
> 
> This also changes the kretprobe behavisor a bit, now the instraction pointer in
> the 'pt_regs' passed to kretprobe user handler is correctly set the real return
> address. So user handlers can get it via instruction_pointer() API, and can use
> stack_trace_save_regs().
> 
> You can also get this series from 
>  git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git kprobes/kretprobe-stackfix-v9

Oops, this is of course 'kprobes/kretprobe-stackfix-v10'. And this branch includes above (*1) series.

Thank you,

-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86
  2021-07-29 23:35 ` [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
@ 2021-08-24  5:12   ` Andrii Nakryiko
  2021-08-24  5:32     ` Masami Hiramatsu
  0 siblings, 1 reply; 28+ messages in thread
From: Andrii Nakryiko @ 2021-08-24  5:12 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Steven Rostedt, Josh Poimboeuf, Ingo Molnar, X86 ML, Daniel Xu,
	open list, bpf, Jakub Kicinski, Ingo Molnar, Alexei Starovoitov,
	Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Kernel Team,
	Yonghong Song, linux-ia64, Abhishek Sagar

On Thu, Jul 29, 2021 at 4:35 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Thu, 29 Jul 2021 23:05:56 +0900
> Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> > Hello,
> >
> > This is the 10th version of the series to fix the stacktrace with kretprobe on x86.
> >
> > The previous version is here;
> >
> >  https://lore.kernel.org/bpf/162601048053.1318837.1550594515476777588.stgit@devnote2/
> >
> > This version is rebased on top of new kprobes cleanup series(*1) and merging
> > Josh's objtool update series (*2)(*3) as [6/16] and [7/16].
> >
> > (*1) https://lore.kernel.org/bpf/162748615977.59465.13262421617578791515.stgit@devnote2/
> > (*2) https://lore.kernel.org/bpf/20210710192433.x5cgjsq2ksvaqnss@treble/
> > (*3) https://lore.kernel.org/bpf/20210710192514.ghvksi3ozhez4lvb@treble/
> >
> > Changes from v9:
> >  - Add Josh's objtool update patches with a build error fix as [6/16] and [7/16].
> >  - Add a API document for kretprobe_find_ret_addr() and check cur != NULL in [5/16].
> >
> > With this series, unwinder can unwind stack correctly from ftrace as below;
> >
> >   # cd /sys/kernel/debug/tracing
> >   # echo > trace
> >   # echo 1 > options/sym-offset
> >   # echo r vfs_read >> kprobe_events
> >   # echo r full_proxy_read >> kprobe_events
> >   # echo traceoff:1 > events/kprobes/r_vfs_read_0/trigger
> >   # echo stacktrace:1 > events/kprobes/r_full_proxy_read_0/trigger
> >   # echo 1 > events/kprobes/enable
> >   # cat /sys/kernel/debug/kprobes/list
> > ffffffff8133b740  r  full_proxy_read+0x0    [FTRACE]
> > ffffffff812560b0  r  vfs_read+0x0    [FTRACE]
> >   # echo 0 > events/kprobes/enable
> >   # cat trace
> > # tracer: nop
> > #
> > # entries-in-buffer/entries-written: 3/3   #P:8
> > #
> > #                                _-----=> irqs-off
> > #                               / _----=> need-resched
> > #                              | / _---=> hardirq/softirq
> > #                              || / _--=> preempt-depth
> > #                              ||| /     delay
> > #           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
> > #              | |         |   ||||      |         |
> >            <...>-134     [007] ...1    16.185877: r_full_proxy_read_0: (vfs_read+0x98/0x180 <- full_proxy_read)
> >            <...>-134     [007] ...1    16.185901: <stack trace>
> >  => kretprobe_trace_func+0x209/0x300
> >  => kretprobe_dispatcher+0x4a/0x70
> >  => __kretprobe_trampoline_handler+0xd4/0x170
> >  => trampoline_handler+0x43/0x60
> >  => kretprobe_trampoline+0x2a/0x50
> >  => vfs_read+0x98/0x180
> >  => ksys_read+0x5f/0xe0
> >  => do_syscall_64+0x37/0x90
> >  => entry_SYSCALL_64_after_hwframe+0x44/0xae
> >            <...>-134     [007] ...1    16.185902: r_vfs_read_0: (ksys_read+0x5f/0xe0 <- vfs_read)
> >
> > This shows the double return probes (vfs_read() and full_proxy_read()) on the stack
> > correctly unwinded. (vfs_read() returns to 'ksys_read+0x5f' and full_proxy_read()
> > returns to 'vfs_read+0x98')
> >
> > This also changes the kretprobe behavisor a bit, now the instraction pointer in
> > the 'pt_regs' passed to kretprobe user handler is correctly set the real return
> > address. So user handlers can get it via instruction_pointer() API, and can use
> > stack_trace_save_regs().
> >
> > You can also get this series from
> >  git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git kprobes/kretprobe-stackfix-v9
>
> Oops, this is of course 'kprobes/kretprobe-stackfix-v10'. And this branch includes above (*1) series.

Hi Masami,

Was this ever merged/applied? This is a very important functionality
for BPF kretprobes, so I hope this won't slip through the cracks.
Thanks!

>
> Thank you,
>
> --
> Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86
  2021-08-24  5:12   ` Andrii Nakryiko
@ 2021-08-24  5:32     ` Masami Hiramatsu
  2021-09-13 17:14       ` Andrii Nakryiko
  0 siblings, 1 reply; 28+ messages in thread
From: Masami Hiramatsu @ 2021-08-24  5:32 UTC (permalink / raw)
  To: Andrii Nakryiko, Josh Poimboeuf, Ingo Molnar
  Cc: Steven Rostedt, X86 ML, Daniel Xu, open list, bpf,
	Jakub Kicinski, Ingo Molnar, Alexei Starovoitov, Thomas Gleixner,
	Borislav Petkov, Peter Zijlstra, Kernel Team, Yonghong Song,
	linux-ia64, Abhishek Sagar

On Mon, 23 Aug 2021 22:12:06 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Thu, Jul 29, 2021 at 4:35 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Thu, 29 Jul 2021 23:05:56 +0900
> > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > > Hello,
> > >
> > > This is the 10th version of the series to fix the stacktrace with kretprobe on x86.
> > >
> > > The previous version is here;
> > >
> > >  https://lore.kernel.org/bpf/162601048053.1318837.1550594515476777588.stgit@devnote2/
> > >
> > > This version is rebased on top of new kprobes cleanup series(*1) and merging
> > > Josh's objtool update series (*2)(*3) as [6/16] and [7/16].
> > >
> > > (*1) https://lore.kernel.org/bpf/162748615977.59465.13262421617578791515.stgit@devnote2/
> > > (*2) https://lore.kernel.org/bpf/20210710192433.x5cgjsq2ksvaqnss@treble/
> > > (*3) https://lore.kernel.org/bpf/20210710192514.ghvksi3ozhez4lvb@treble/
> > >
> > > Changes from v9:
> > >  - Add Josh's objtool update patches with a build error fix as [6/16] and [7/16].
> > >  - Add a API document for kretprobe_find_ret_addr() and check cur != NULL in [5/16].
> > >
> > > With this series, unwinder can unwind stack correctly from ftrace as below;
> > >
> > >   # cd /sys/kernel/debug/tracing
> > >   # echo > trace
> > >   # echo 1 > options/sym-offset
> > >   # echo r vfs_read >> kprobe_events
> > >   # echo r full_proxy_read >> kprobe_events
> > >   # echo traceoff:1 > events/kprobes/r_vfs_read_0/trigger
> > >   # echo stacktrace:1 > events/kprobes/r_full_proxy_read_0/trigger
> > >   # echo 1 > events/kprobes/enable
> > >   # cat /sys/kernel/debug/kprobes/list
> > > ffffffff8133b740  r  full_proxy_read+0x0    [FTRACE]
> > > ffffffff812560b0  r  vfs_read+0x0    [FTRACE]
> > >   # echo 0 > events/kprobes/enable
> > >   # cat trace
> > > # tracer: nop
> > > #
> > > # entries-in-buffer/entries-written: 3/3   #P:8
> > > #
> > > #                                _-----=> irqs-off
> > > #                               / _----=> need-resched
> > > #                              | / _---=> hardirq/softirq
> > > #                              || / _--=> preempt-depth
> > > #                              ||| /     delay
> > > #           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
> > > #              | |         |   ||||      |         |
> > >            <...>-134     [007] ...1    16.185877: r_full_proxy_read_0: (vfs_read+0x98/0x180 <- full_proxy_read)
> > >            <...>-134     [007] ...1    16.185901: <stack trace>
> > >  => kretprobe_trace_func+0x209/0x300
> > >  => kretprobe_dispatcher+0x4a/0x70
> > >  => __kretprobe_trampoline_handler+0xd4/0x170
> > >  => trampoline_handler+0x43/0x60
> > >  => kretprobe_trampoline+0x2a/0x50
> > >  => vfs_read+0x98/0x180
> > >  => ksys_read+0x5f/0xe0
> > >  => do_syscall_64+0x37/0x90
> > >  => entry_SYSCALL_64_after_hwframe+0x44/0xae
> > >            <...>-134     [007] ...1    16.185902: r_vfs_read_0: (ksys_read+0x5f/0xe0 <- vfs_read)
> > >
> > > This shows the double return probes (vfs_read() and full_proxy_read()) on the stack
> > > correctly unwinded. (vfs_read() returns to 'ksys_read+0x5f' and full_proxy_read()
> > > returns to 'vfs_read+0x98')
> > >
> > > This also changes the kretprobe behavisor a bit, now the instraction pointer in
> > > the 'pt_regs' passed to kretprobe user handler is correctly set the real return
> > > address. So user handlers can get it via instruction_pointer() API, and can use
> > > stack_trace_save_regs().
> > >
> > > You can also get this series from
> > >  git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git kprobes/kretprobe-stackfix-v9
> >
> > Oops, this is of course 'kprobes/kretprobe-stackfix-v10'. And this branch includes above (*1) series.
> 
> Hi Masami,
> 
> Was this ever merged/applied? This is a very important functionality
> for BPF kretprobes, so I hope this won't slip through the cracks.

No, not yet as far as I know.
I'm waiting for any comment on this series. Since this is basically
x86 ORC unwinder improvement, this series should be merged to -tip tree.

Ingo and Josh,

Could you give me any comment, please?

Thank you,


> Thanks!
> 
> >
> > Thank you,
> >
> > --
> > Masami Hiramatsu <mhiramat@kernel.org>


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC PATCH 0/1] Non stack-intrusive return probe event
  2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
                   ` (16 preceding siblings ...)
  2021-07-29 23:35 ` [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
@ 2021-08-29 14:22 ` Masami Hiramatsu
  2021-08-29 14:22   ` [RFC PATCH 1/1] [PoC] tracing: kprobe: Add non-stack intrusion " Masami Hiramatsu
  2021-08-30 19:04   ` [RFC PATCH 0/1] Non stack-intrusive " Andrii Nakryiko
  17 siblings, 2 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-08-29 14:22 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

Hello,

For a long time, we tackled to fix some issues around kretprobe.
One of the latest action was the stacktrace fix on x86 in this
thread.

https://lore.kernel.org/bpf/162756755600.301564.4957591913842010341.stgit@devnote2/

However, there seems no progress/further discussion. So I would
like to make another approach for this (and the other issues.)

Here is my idea -- replace kretprobe with kprobe.
In other words, put a kprobe on the "return instruction" directly
instead of modifying the kernel stack. This can solve most
of the kretprobe disadvantges. E.g.

- Since it doesn't change the kernel stack, any special stack
  unwinder fixup is not needed anymore.
- No "max-instance" limitations anymore, because it will use
  kprobes directly.
- Scalability performance will be improved as same as kprobes.
  No list-operation in probe-runtime.

Here is a PoC code which introduces "retinsn_probe" event as a part
of ftrace kprobe event. I don't think we need to replace the
kretprobe. This should be a higher layer feature, because some
kernel functions can have multiple "return instructions". Thus,
the "retinsn_probe" must manage multiple kprobes. That means the
"retinsn_probe" will be a user of kprobes. I decided to make it
inside the ftrace "kprobe-event". This gives us another advantage
for eBPF support. Because eBPF uses "kprobe-event" instead of
"kprobe" directly, if the "retinsn_probe" is implemented in the
"kprobe-event", eBPF can use it without any change.
Anyway, this can be co-exist with kretprobe. So as far as any
user uses kretprobe, we can keep it.


Example
=======
For example, I ran a shell script, which was used in the
stacktrace fix series.

----
mount -t debugfs debugfs /sys/kernel/debug/
cd /sys/kernel/debug/tracing
echo > trace
echo 1 > options/sym-offset
echo r vfs_read >> kprobe_events
echo r full_proxy_read >> kprobe_events
echo traceoff:1 > events/kprobes/r_vfs_read_0/trigger
echo stacktrace:1 > events/kprobes/r_full_proxy_read_0/trigger
echo 1 > events/kprobes/enable
cat /sys/kernel/debug/kprobes/list
echo 0 > events/kprobes/enable
cat trace
----

This is the result.
----
ffffffff813b420e  k  full_proxy_read+0x6e    
ffffffff812b7c0a  k  vfs_read+0xda  
# tracer: nop
#
# entries-in-buffer/entries-written: 3/3   #P:8
#
#                                _-----=> irqs-off
#                               / _----=> need-resched
#                              | / _---=> hardirq/softirq
#                              || / _--=> preempt-depth
#                              ||| /     delay
#           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
#              | |         |   ||||      |         |
             cat-136     [007] d.Z.     8.038381: r_full_proxy_read_0: (vfs_read+0x9b/0x180 <- full_proxy_read)
             cat-136     [007] d.Z.     8.038386: <stack trace>
 => kretprobe_trace_func+0x209/0x300
 => retinsn_dispatcher+0x7a/0xa0
 => kprobe_post_process+0x28/0x80
 => kprobe_int3_handler+0x166/0x1a0
 => exc_int3+0x47/0x140
 => asm_exc_int3+0x31/0x40
 => vfs_read+0x9b/0x180
 => ksys_read+0x68/0xe0
 => do_syscall_64+0x3b/0x90
 => entry_SYSCALL_64_after_hwframe+0x44/0xae
             cat-136     [007] d.Z.     8.038387: r_vfs_read_0: (ksys_read+0x68/0xe0 <- vfs_read)
----

You can see the return probe events are translated to kprobes
instead of kretprobes. And also, on the stacktrace, we can see
an int3 calls the kprobe and decode stacktrace correctly.


TODO
====
Of course, this is just an PoC code, there are many TODOs.

- This PoC code only supports x86 at this moment. But I think this
  can be done on the other architectures. What it needs is
  to implement "find_return_instructions()".
- Code cleanup is not enough. I have to remove "kretprobe" from
 "trace_kprobe" data structure, rewrite related functions etc.
- It has to handle "tail-call" optimized code, which replaces
  a "call + return" into "jump". find_return_instruction() should
  detect it and decode the jump destination too.


Thank you,


---

Masami Hiramatsu (1):
      [PoC] tracing: kprobe: Add non-stack intrusion return probe event


 arch/x86/kernel/kprobes/core.c |   59 +++++++++++++++++++++
 kernel/trace/trace_kprobe.c    |  110 ++++++++++++++++++++++++++++++++++++++--
 2 files changed, 164 insertions(+), 5 deletions(-)

--
Masami Hiramatsu (Linaro) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* [RFC PATCH 1/1] [PoC] tracing: kprobe: Add non-stack intrusion return probe event
  2021-08-29 14:22 ` [RFC PATCH 0/1] Non stack-intrusive return probe event Masami Hiramatsu
@ 2021-08-29 14:22   ` Masami Hiramatsu
  2021-08-30 19:04   ` [RFC PATCH 0/1] Non stack-intrusive " Andrii Nakryiko
  1 sibling, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-08-29 14:22 UTC (permalink / raw)
  To: Steven Rostedt, Josh Poimboeuf, Ingo Molnar
  Cc: X86 ML, Masami Hiramatsu, Daniel Xu, linux-kernel, bpf, kuba,
	mingo, ast, Thomas Gleixner, Borislav Petkov, Peter Zijlstra,
	kernel-team, yhs, linux-ia64, Abhishek Sagar, Andrii Nakryiko

Add kernel return instruction probe event (kriprobe event)
to kprobe event. This will hook the returns of the target function
but does not intrude the real stack entry.
This depends on each architecture implement one function --
find_return_instructions(). If it is implemented correctly,
kprobe event uses the kriprobe event instead of kretprobe.

Note, this is just a PoC code for x86. This doesn't work with
other arch which only supports kretprobe.
Also, This doesn't support the function with the tail call
(jump into another function instead of call & return),
kriprobe doesn't work with it yet.

Signed-off-by: Masami Hiramatsu <mhiramat@kernel.org>
---
 arch/x86/kernel/kprobes/core.c |   59 +++++++++++++++++++++
 kernel/trace/trace_kprobe.c    |  110 ++++++++++++++++++++++++++++++++++++++--
 2 files changed, 164 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/kprobes/core.c b/arch/x86/kernel/kprobes/core.c
index b6e046e4b289..4c4094505712 100644
--- a/arch/x86/kernel/kprobes/core.c
+++ b/arch/x86/kernel/kprobes/core.c
@@ -1117,3 +1117,62 @@ int arch_trampoline_kprobe(struct kprobe *p)
 {
 	return 0;
 }
+
+static bool insn_is_return(struct insn *insn)
+{
+	switch (insn->opcode.bytes[0]) {
+	case 0xc2:
+	case 0xc3:
+	case 0xca:
+	case 0xcb:
+		return true;
+	default:
+		return false;
+	}
+}
+
+/**
+ * find_return_instructions -- Search return instruction in the function
+ * @func: The target function address
+ * @rets: The storage of the return instruction address
+ * @nr_rets: The length of @rets
+ *
+ * This searches the address of return instructions in the @func (@func must
+ * be an entry address of the target function). The results are stored in the
+ * @rets. If the number of return instructions are bigger than @nr_rets, this
+ * will return the required length of the @rets.
+ */
+int find_return_instructions(kprobe_opcode_t *func, kprobe_opcode_t *rets[], int nr_rets)
+{
+	unsigned long addr, end, size = 0, offset = 0;
+	kprobe_opcode_t buf[MAX_INSN_SIZE];
+	unsigned long recovered_insn;
+	struct insn insn;
+	int ret, nr = 0;
+
+	addr = (unsigned long)func;
+	if (!kallsyms_lookup_size_offset(addr, &size, &offset))
+		return -EINVAL;
+
+	if (offset != 0)
+		return -EINVAL;
+	end = addr + size;
+
+	/* Decode the function to find return instructions */
+	while (addr < end) {
+		recovered_insn = recover_probed_instruction(buf, addr);
+		if (!recovered_insn)
+			return -EILSEQ;
+		ret = insn_decode_kernel(&insn, (void *)recovered_insn);
+		if (ret < 0)
+			return -EILSEQ;
+		if (insn_is_return(&insn)) {
+			if (nr < nr_rets)
+				rets[nr++] = (kprobe_opcode_t *)addr;
+		}
+		/* TODO: find jmp for tail call (outside of this func) */
+		addr += insn.length;
+	}
+
+	return nr;
+}
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index 3a64ba4bbad6..99e508ff45ad 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -50,6 +50,13 @@ static struct dyn_event_operations trace_kprobe_ops = {
 	.match = trace_kprobe_match,
 };
 
+#define MAX_RET_INSNS 16
+
+struct kprobe_holder {
+	struct kprobe kp;
+	struct trace_kprobe *tk;
+};
+
 /*
  * Kprobe event core functions
  */
@@ -59,6 +66,8 @@ struct trace_kprobe {
 	unsigned long __percpu *nhit;
 	const char		*symbol;	/* symbol name */
 	struct trace_probe	tp;
+	struct kprobe_holder	*krets;
+	int			nr_krets;
 };
 
 static bool is_trace_kprobe(struct dyn_event *ev)
@@ -82,7 +91,7 @@ static struct trace_kprobe *to_trace_kprobe(struct dyn_event *ev)
 
 static nokprobe_inline bool trace_kprobe_is_return(struct trace_kprobe *tk)
 {
-	return tk->rp.handler != NULL;
+	return tk->rp.handler != NULL || tk->krets != NULL;
 }
 
 static nokprobe_inline const char *trace_kprobe_symbol(struct trace_kprobe *tk)
@@ -180,7 +189,7 @@ static nokprobe_inline unsigned long trace_kprobe_nhit(struct trace_kprobe *tk)
 
 static nokprobe_inline bool trace_kprobe_is_registered(struct trace_kprobe *tk)
 {
-	return !(list_empty(&tk->rp.kp.list) &&
+	return tk->krets || !(list_empty(&tk->rp.kp.list) &&
 		 hlist_unhashed(&tk->rp.kp.hlist));
 }
 
@@ -311,13 +320,23 @@ static struct trace_kprobe *find_trace_kprobe(const char *event,
 	return NULL;
 }
 
+static int enable_retinsn_probe(struct trace_kprobe *tk)
+{
+	int ret, i;
+
+	for (i = 0; i < tk->nr_krets; i++)
+		ret = enable_kprobe(&(tk->krets[i].kp));
+
+	return ret;
+}
+
 static inline int __enable_trace_kprobe(struct trace_kprobe *tk)
 {
 	int ret = 0;
 
 	if (trace_kprobe_is_registered(tk) && !trace_kprobe_has_gone(tk)) {
 		if (trace_kprobe_is_return(tk))
-			ret = enable_kretprobe(&tk->rp);
+			ret = enable_retinsn_probe(tk);
 		else
 			ret = enable_kprobe(&tk->rp.kp);
 	}
@@ -474,6 +493,68 @@ static bool within_notrace_func(struct trace_kprobe *tk)
 #define within_notrace_func(tk)	(false)
 #endif
 
+int find_return_instructions(kprobe_opcode_t *func, kprobe_opcode_t *rets[], int nr_rets);
+static void retinsn_dispatcher(struct kprobe *kp, struct pt_regs *regs, unsigned long flags);
+
+static void unregister_retinsn_probe(struct trace_kprobe *tk)
+{
+	struct kprobe *kpp[MAX_RET_INSNS];
+	int i;
+
+	for (i = 0; i < tk->nr_krets; i++)
+		kpp[i] = &tk->krets[i].kp;
+
+	unregister_kprobes(kpp, tk->nr_krets);
+}
+
+static int register_retinsn_probe(struct trace_kprobe *tk)
+{
+	kprobe_opcode_t *func = (kprobe_opcode_t *)trace_kprobe_address(tk);
+	kprobe_opcode_t *rets[MAX_RET_INSNS];
+	struct kprobe *kpp[MAX_RET_INSNS];
+	struct kprobe_holder *khs;
+	int i, ret, nrets;
+
+	/* Find return instruction in the target function. */
+	ret = find_return_instructions(func, rets, MAX_RET_INSNS);
+	if (ret < 0)
+		return ret;
+
+	/* There might be tail call (jump) in the function. */
+	if (ret == 0)
+		return -ENOENT;
+
+	/* Or, too many return instructions. */
+	if (ret > MAX_RET_INSNS)
+		return -E2BIG;
+
+	/* Allocate kprobes which probes the return instructions directly. */
+	nrets = ret;
+	khs = kcalloc(nrets, sizeof(struct kprobe_holder), GFP_KERNEL);
+	if (!khs)
+		return -ENOENT;
+
+	for (i = 0; i < nrets; i++) {
+		khs[i].kp.addr = rets[i];
+		khs[i].kp.flags = tk->rp.kp.flags;
+		khs[i].kp.post_handler = retinsn_dispatcher;
+		khs[i].tk = tk;
+		kpp[i] = &khs[i].kp;
+	}
+
+	ret = register_kprobes(kpp, nrets);
+	if (ret < 0) {
+		kfree(khs);
+		return ret;
+	}
+
+	tk->rp.kp.addr = trace_kprobe_address(tk);
+	tk->krets = khs;
+	tk->nr_krets = nrets;
+
+	return 0;
+}
+
 /* Internal register function - just handle k*probes and flags */
 static int __register_trace_kprobe(struct trace_kprobe *tk)
 {
@@ -505,7 +586,7 @@ static int __register_trace_kprobe(struct trace_kprobe *tk)
 		tk->rp.kp.flags |= KPROBE_FLAG_DISABLED;
 
 	if (trace_kprobe_is_return(tk))
-		ret = register_kretprobe(&tk->rp);
+		ret = register_retinsn_probe(tk);
 	else
 		ret = register_kprobe(&tk->rp.kp);
 
@@ -517,7 +598,7 @@ static void __unregister_trace_kprobe(struct trace_kprobe *tk)
 {
 	if (trace_kprobe_is_registered(tk)) {
 		if (trace_kprobe_is_return(tk))
-			unregister_kretprobe(&tk->rp);
+			unregister_retinsn_probe(tk);
 		else
 			unregister_kprobe(&tk->rp.kp);
 		/* Cleanup kprobe for reuse and mark it unregistered */
@@ -1744,6 +1825,25 @@ kretprobe_dispatcher(struct kretprobe_instance *ri, struct pt_regs *regs)
 }
 NOKPROBE_SYMBOL(kretprobe_dispatcher);
 
+static void retinsn_dispatcher(struct kprobe *kp, struct pt_regs *regs, unsigned long flags)
+{
+	struct kprobe_holder *kh = container_of(kp, struct kprobe_holder, kp);
+	struct trace_kprobe *tk = kh->tk;
+	struct kretprobe_instance ri;	/* dummy : to be fixed */
+
+	ri.ret_addr = (void *)instruction_pointer(regs);
+
+	raw_cpu_inc(*tk->nhit);
+
+	if (trace_probe_test_flag(&tk->tp, TP_FLAG_TRACE))
+		kretprobe_trace_func(tk, &ri, regs);
+#ifdef CONFIG_PERF_EVENTS
+	if (trace_probe_test_flag(&tk->tp, TP_FLAG_PROFILE))
+		kretprobe_perf_func(tk, &ri, regs);
+#endif
+}
+NOKPROBE_SYMBOL(retinsn_dispatcher);
+
 static struct trace_event_functions kretprobe_funcs = {
 	.trace		= print_kretprobe_event
 };


^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 0/1] Non stack-intrusive return probe event
  2021-08-29 14:22 ` [RFC PATCH 0/1] Non stack-intrusive return probe event Masami Hiramatsu
  2021-08-29 14:22   ` [RFC PATCH 1/1] [PoC] tracing: kprobe: Add non-stack intrusion " Masami Hiramatsu
@ 2021-08-30 19:04   ` Andrii Nakryiko
  2021-08-31  6:06     ` Masami Hiramatsu
  1 sibling, 1 reply; 28+ messages in thread
From: Andrii Nakryiko @ 2021-08-30 19:04 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Steven Rostedt, Josh Poimboeuf, Ingo Molnar, X86 ML, Daniel Xu,
	open list, bpf, Jakub Kicinski, Ingo Molnar, Alexei Starovoitov,
	Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Kernel Team,
	Yonghong Song, linux-ia64, Abhishek Sagar

On Sun, Aug 29, 2021 at 7:22 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> Hello,
>
> For a long time, we tackled to fix some issues around kretprobe.
> One of the latest action was the stacktrace fix on x86 in this
> thread.
>
> https://lore.kernel.org/bpf/162756755600.301564.4957591913842010341.stgit@devnote2/
>
> However, there seems no progress/further discussion. So I would
> like to make another approach for this (and the other issues.)

v10 of kretprobe+stacktrace fixes ([0]) from Masami has received no
comment or objections in the last month, since it was posted. It fixes
the very real and very limiting problem of not being able to capture a
stack trace from BPF kretprobe programs. Masami, while I don't mind
your new approach, I think we shouldn't consider them as "either/or"
solutions. We have a fix that works for existing implementations, can
we please land it, and then work on further improvements
independently?

Ingo, Peter, Steven,

I'm not sure who and which kernel tree this has to go through, but
assuming it's one of you/yours, can you please take a look at [0] and
apply it where appropriate? The work has been going on since March and
it blocks development of some extremely useful tooling (retsnoop [1]
being one of them). There were also bpftrace users that were
completely surprised about the inability to use stack trace capturing
from kretprobe handlers, so it's not just me. I (and a bunch of other
BPF users) would greatly appreciate help with getting this problem
fixed. Thank you!

  [0] https://lore.kernel.org/bpf/162756755600.301564.4957591913842010341.stgit@devnote2/
  [1] https://github.com/anakryiko/retsnoop

>
> Here is my idea -- replace kretprobe with kprobe.
> In other words, put a kprobe on the "return instruction" directly
> instead of modifying the kernel stack. This can solve most
> of the kretprobe disadvantges. E.g.
>
> - Since it doesn't change the kernel stack, any special stack
>   unwinder fixup is not needed anymore.
> - No "max-instance" limitations anymore, because it will use
>   kprobes directly.
> - Scalability performance will be improved as same as kprobes.
>   No list-operation in probe-runtime.
>
> Here is a PoC code which introduces "retinsn_probe" event as a part
> of ftrace kprobe event. I don't think we need to replace the
> kretprobe. This should be a higher layer feature, because some
> kernel functions can have multiple "return instructions". Thus,
> the "retinsn_probe" must manage multiple kprobes. That means the
> "retinsn_probe" will be a user of kprobes. I decided to make it
> inside the ftrace "kprobe-event". This gives us another advantage
> for eBPF support. Because eBPF uses "kprobe-event" instead of
> "kprobe" directly, if the "retinsn_probe" is implemented in the
> "kprobe-event", eBPF can use it without any change.
> Anyway, this can be co-exist with kretprobe. So as far as any
> user uses kretprobe, we can keep it.
>
>
> Example
> =======
> For example, I ran a shell script, which was used in the
> stacktrace fix series.
>
> ----
> mount -t debugfs debugfs /sys/kernel/debug/
> cd /sys/kernel/debug/tracing
> echo > trace
> echo 1 > options/sym-offset
> echo r vfs_read >> kprobe_events
> echo r full_proxy_read >> kprobe_events
> echo traceoff:1 > events/kprobes/r_vfs_read_0/trigger
> echo stacktrace:1 > events/kprobes/r_full_proxy_read_0/trigger
> echo 1 > events/kprobes/enable
> cat /sys/kernel/debug/kprobes/list
> echo 0 > events/kprobes/enable
> cat trace
> ----
>
> This is the result.
> ----
> ffffffff813b420e  k  full_proxy_read+0x6e
> ffffffff812b7c0a  k  vfs_read+0xda
> # tracer: nop
> #
> # entries-in-buffer/entries-written: 3/3   #P:8
> #
> #                                _-----=> irqs-off
> #                               / _----=> need-resched
> #                              | / _---=> hardirq/softirq
> #                              || / _--=> preempt-depth
> #                              ||| /     delay
> #           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
> #              | |         |   ||||      |         |
>              cat-136     [007] d.Z.     8.038381: r_full_proxy_read_0: (vfs_read+0x9b/0x180 <- full_proxy_read)
>              cat-136     [007] d.Z.     8.038386: <stack trace>
>  => kretprobe_trace_func+0x209/0x300
>  => retinsn_dispatcher+0x7a/0xa0
>  => kprobe_post_process+0x28/0x80
>  => kprobe_int3_handler+0x166/0x1a0
>  => exc_int3+0x47/0x140
>  => asm_exc_int3+0x31/0x40
>  => vfs_read+0x9b/0x180
>  => ksys_read+0x68/0xe0
>  => do_syscall_64+0x3b/0x90
>  => entry_SYSCALL_64_after_hwframe+0x44/0xae
>              cat-136     [007] d.Z.     8.038387: r_vfs_read_0: (ksys_read+0x68/0xe0 <- vfs_read)
> ----
>
> You can see the return probe events are translated to kprobes
> instead of kretprobes. And also, on the stacktrace, we can see
> an int3 calls the kprobe and decode stacktrace correctly.
>
>
> TODO
> ====
> Of course, this is just an PoC code, there are many TODOs.
>
> - This PoC code only supports x86 at this moment. But I think this
>   can be done on the other architectures. What it needs is
>   to implement "find_return_instructions()".
> - Code cleanup is not enough. I have to remove "kretprobe" from
>  "trace_kprobe" data structure, rewrite related functions etc.
> - It has to handle "tail-call" optimized code, which replaces
>   a "call + return" into "jump". find_return_instruction() should
>   detect it and decode the jump destination too.
>
>
> Thank you,
>
>
> ---
>
> Masami Hiramatsu (1):
>       [PoC] tracing: kprobe: Add non-stack intrusion return probe event
>
>
>  arch/x86/kernel/kprobes/core.c |   59 +++++++++++++++++++++
>  kernel/trace/trace_kprobe.c    |  110 ++++++++++++++++++++++++++++++++++++++--
>  2 files changed, 164 insertions(+), 5 deletions(-)
>
> --
> Masami Hiramatsu (Linaro) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [RFC PATCH 0/1] Non stack-intrusive return probe event
  2021-08-30 19:04   ` [RFC PATCH 0/1] Non stack-intrusive " Andrii Nakryiko
@ 2021-08-31  6:06     ` Masami Hiramatsu
  0 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-08-31  6:06 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Steven Rostedt, Josh Poimboeuf, Ingo Molnar, X86 ML, Daniel Xu,
	open list, bpf, Jakub Kicinski, Ingo Molnar, Alexei Starovoitov,
	Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Kernel Team,
	Yonghong Song, linux-ia64, Abhishek Sagar

On Mon, 30 Aug 2021 12:04:56 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Sun, Aug 29, 2021 at 7:22 AM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > Hello,
> >
> > For a long time, we tackled to fix some issues around kretprobe.
> > One of the latest action was the stacktrace fix on x86 in this
> > thread.
> >
> > https://lore.kernel.org/bpf/162756755600.301564.4957591913842010341.stgit@devnote2/
> >
> > However, there seems no progress/further discussion. So I would
> > like to make another approach for this (and the other issues.)
> 
> v10 of kretprobe+stacktrace fixes ([0]) from Masami has received no
> comment or objections in the last month, since it was posted. It fixes
> the very real and very limiting problem of not being able to capture a
> stack trace from BPF kretprobe programs. Masami, while I don't mind
> your new approach, I think we shouldn't consider them as "either/or"
> solutions. We have a fix that works for existing implementations, can
> we please land it, and then work on further improvements
> independently?

That's right. The original stacktrace fix series is for fixing
existing issue with kretprobes and stack unwinder.
This one just for avoiding the issue only from dynamic events.
So we can anyway proceed both.

> 
> Ingo, Peter, Steven,
> 
> I'm not sure who and which kernel tree this has to go through, but
> assuming it's one of you/yours, can you please take a look at [0] and
> apply it where appropriate? The work has been going on since March and
> it blocks development of some extremely useful tooling (retsnoop [1]
> being one of them). There were also bpftrace users that were
> completely surprised about the inability to use stack trace capturing
> from kretprobe handlers, so it's not just me. I (and a bunch of other
> BPF users) would greatly appreciate help with getting this problem
> fixed. Thank you!
> 
>   [0] https://lore.kernel.org/bpf/162756755600.301564.4957591913842010341.stgit@devnote2/
>   [1] https://github.com/anakryiko/retsnoop

Thank you,

> 
> >
> > Here is my idea -- replace kretprobe with kprobe.
> > In other words, put a kprobe on the "return instruction" directly
> > instead of modifying the kernel stack. This can solve most
> > of the kretprobe disadvantges. E.g.
> >
> > - Since it doesn't change the kernel stack, any special stack
> >   unwinder fixup is not needed anymore.
> > - No "max-instance" limitations anymore, because it will use
> >   kprobes directly.
> > - Scalability performance will be improved as same as kprobes.
> >   No list-operation in probe-runtime.
> >
> > Here is a PoC code which introduces "retinsn_probe" event as a part
> > of ftrace kprobe event. I don't think we need to replace the
> > kretprobe. This should be a higher layer feature, because some
> > kernel functions can have multiple "return instructions". Thus,
> > the "retinsn_probe" must manage multiple kprobes. That means the
> > "retinsn_probe" will be a user of kprobes. I decided to make it
> > inside the ftrace "kprobe-event". This gives us another advantage
> > for eBPF support. Because eBPF uses "kprobe-event" instead of
> > "kprobe" directly, if the "retinsn_probe" is implemented in the
> > "kprobe-event", eBPF can use it without any change.
> > Anyway, this can be co-exist with kretprobe. So as far as any
> > user uses kretprobe, we can keep it.
> >
> >
> > Example
> > =======
> > For example, I ran a shell script, which was used in the
> > stacktrace fix series.
> >
> > ----
> > mount -t debugfs debugfs /sys/kernel/debug/
> > cd /sys/kernel/debug/tracing
> > echo > trace
> > echo 1 > options/sym-offset
> > echo r vfs_read >> kprobe_events
> > echo r full_proxy_read >> kprobe_events
> > echo traceoff:1 > events/kprobes/r_vfs_read_0/trigger
> > echo stacktrace:1 > events/kprobes/r_full_proxy_read_0/trigger
> > echo 1 > events/kprobes/enable
> > cat /sys/kernel/debug/kprobes/list
> > echo 0 > events/kprobes/enable
> > cat trace
> > ----
> >
> > This is the result.
> > ----
> > ffffffff813b420e  k  full_proxy_read+0x6e
> > ffffffff812b7c0a  k  vfs_read+0xda
> > # tracer: nop
> > #
> > # entries-in-buffer/entries-written: 3/3   #P:8
> > #
> > #                                _-----=> irqs-off
> > #                               / _----=> need-resched
> > #                              | / _---=> hardirq/softirq
> > #                              || / _--=> preempt-depth
> > #                              ||| /     delay
> > #           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
> > #              | |         |   ||||      |         |
> >              cat-136     [007] d.Z.     8.038381: r_full_proxy_read_0: (vfs_read+0x9b/0x180 <- full_proxy_read)
> >              cat-136     [007] d.Z.     8.038386: <stack trace>
> >  => kretprobe_trace_func+0x209/0x300
> >  => retinsn_dispatcher+0x7a/0xa0
> >  => kprobe_post_process+0x28/0x80
> >  => kprobe_int3_handler+0x166/0x1a0
> >  => exc_int3+0x47/0x140
> >  => asm_exc_int3+0x31/0x40
> >  => vfs_read+0x9b/0x180
> >  => ksys_read+0x68/0xe0
> >  => do_syscall_64+0x3b/0x90
> >  => entry_SYSCALL_64_after_hwframe+0x44/0xae
> >              cat-136     [007] d.Z.     8.038387: r_vfs_read_0: (ksys_read+0x68/0xe0 <- vfs_read)
> > ----
> >
> > You can see the return probe events are translated to kprobes
> > instead of kretprobes. And also, on the stacktrace, we can see
> > an int3 calls the kprobe and decode stacktrace correctly.
> >
> >
> > TODO
> > ====
> > Of course, this is just an PoC code, there are many TODOs.
> >
> > - This PoC code only supports x86 at this moment. But I think this
> >   can be done on the other architectures. What it needs is
> >   to implement "find_return_instructions()".
> > - Code cleanup is not enough. I have to remove "kretprobe" from
> >  "trace_kprobe" data structure, rewrite related functions etc.
> > - It has to handle "tail-call" optimized code, which replaces
> >   a "call + return" into "jump". find_return_instruction() should
> >   detect it and decode the jump destination too.
> >
> >
> > Thank you,
> >
> >
> > ---
> >
> > Masami Hiramatsu (1):
> >       [PoC] tracing: kprobe: Add non-stack intrusion return probe event
> >
> >
> >  arch/x86/kernel/kprobes/core.c |   59 +++++++++++++++++++++
> >  kernel/trace/trace_kprobe.c    |  110 ++++++++++++++++++++++++++++++++++++++--
> >  2 files changed, 164 insertions(+), 5 deletions(-)
> >
> > --
> > Masami Hiramatsu (Linaro) <mhiramat@kernel.org>


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86
  2021-08-24  5:32     ` Masami Hiramatsu
@ 2021-09-13 17:14       ` Andrii Nakryiko
  2021-09-14  0:38         ` Masami Hiramatsu
  0 siblings, 1 reply; 28+ messages in thread
From: Andrii Nakryiko @ 2021-09-13 17:14 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Josh Poimboeuf, Ingo Molnar, Steven Rostedt, X86 ML, Daniel Xu,
	open list, bpf, Jakub Kicinski, Ingo Molnar, Alexei Starovoitov,
	Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Kernel Team,
	Yonghong Song, linux-ia64, Abhishek Sagar, Paul E . McKenney

On Mon, Aug 23, 2021 at 10:32 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Mon, 23 Aug 2021 22:12:06 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > On Thu, Jul 29, 2021 at 4:35 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > >
> > > On Thu, 29 Jul 2021 23:05:56 +0900
> > > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > >
> > > > Hello,
> > > >
> > > > This is the 10th version of the series to fix the stacktrace with kretprobe on x86.
> > > >
> > > > The previous version is here;
> > > >
> > > >  https://lore.kernel.org/bpf/162601048053.1318837.1550594515476777588.stgit@devnote2/
> > > >
> > > > This version is rebased on top of new kprobes cleanup series(*1) and merging
> > > > Josh's objtool update series (*2)(*3) as [6/16] and [7/16].
> > > >
> > > > (*1) https://lore.kernel.org/bpf/162748615977.59465.13262421617578791515.stgit@devnote2/
> > > > (*2) https://lore.kernel.org/bpf/20210710192433.x5cgjsq2ksvaqnss@treble/
> > > > (*3) https://lore.kernel.org/bpf/20210710192514.ghvksi3ozhez4lvb@treble/
> > > >
> > > > Changes from v9:
> > > >  - Add Josh's objtool update patches with a build error fix as [6/16] and [7/16].
> > > >  - Add a API document for kretprobe_find_ret_addr() and check cur != NULL in [5/16].
> > > >
> > > > With this series, unwinder can unwind stack correctly from ftrace as below;
> > > >
> > > >   # cd /sys/kernel/debug/tracing
> > > >   # echo > trace
> > > >   # echo 1 > options/sym-offset
> > > >   # echo r vfs_read >> kprobe_events
> > > >   # echo r full_proxy_read >> kprobe_events
> > > >   # echo traceoff:1 > events/kprobes/r_vfs_read_0/trigger
> > > >   # echo stacktrace:1 > events/kprobes/r_full_proxy_read_0/trigger
> > > >   # echo 1 > events/kprobes/enable
> > > >   # cat /sys/kernel/debug/kprobes/list
> > > > ffffffff8133b740  r  full_proxy_read+0x0    [FTRACE]
> > > > ffffffff812560b0  r  vfs_read+0x0    [FTRACE]
> > > >   # echo 0 > events/kprobes/enable
> > > >   # cat trace
> > > > # tracer: nop
> > > > #
> > > > # entries-in-buffer/entries-written: 3/3   #P:8
> > > > #
> > > > #                                _-----=> irqs-off
> > > > #                               / _----=> need-resched
> > > > #                              | / _---=> hardirq/softirq
> > > > #                              || / _--=> preempt-depth
> > > > #                              ||| /     delay
> > > > #           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
> > > > #              | |         |   ||||      |         |
> > > >            <...>-134     [007] ...1    16.185877: r_full_proxy_read_0: (vfs_read+0x98/0x180 <- full_proxy_read)
> > > >            <...>-134     [007] ...1    16.185901: <stack trace>
> > > >  => kretprobe_trace_func+0x209/0x300
> > > >  => kretprobe_dispatcher+0x4a/0x70
> > > >  => __kretprobe_trampoline_handler+0xd4/0x170
> > > >  => trampoline_handler+0x43/0x60
> > > >  => kretprobe_trampoline+0x2a/0x50
> > > >  => vfs_read+0x98/0x180
> > > >  => ksys_read+0x5f/0xe0
> > > >  => do_syscall_64+0x37/0x90
> > > >  => entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > >            <...>-134     [007] ...1    16.185902: r_vfs_read_0: (ksys_read+0x5f/0xe0 <- vfs_read)
> > > >
> > > > This shows the double return probes (vfs_read() and full_proxy_read()) on the stack
> > > > correctly unwinded. (vfs_read() returns to 'ksys_read+0x5f' and full_proxy_read()
> > > > returns to 'vfs_read+0x98')
> > > >
> > > > This also changes the kretprobe behavisor a bit, now the instraction pointer in
> > > > the 'pt_regs' passed to kretprobe user handler is correctly set the real return
> > > > address. So user handlers can get it via instruction_pointer() API, and can use
> > > > stack_trace_save_regs().
> > > >
> > > > You can also get this series from
> > > >  git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git kprobes/kretprobe-stackfix-v9
> > >
> > > Oops, this is of course 'kprobes/kretprobe-stackfix-v10'. And this branch includes above (*1) series.
> >
> > Hi Masami,
> >
> > Was this ever merged/applied? This is a very important functionality
> > for BPF kretprobes, so I hope this won't slip through the cracks.
>
> No, not yet as far as I know.
> I'm waiting for any comment on this series. Since this is basically
> x86 ORC unwinder improvement, this series should be merged to -tip tree.
>

Hey Masami,

It's been a while since you posted v10. It seems like this series
doesn't apply cleanly anymore. Do you mind rebasing and resubmitting
it again to refresh the series and make it easier for folks to review
and test it?

Also, do I understand correctly that [0] is a dependency of this
series? If yes, please rebase and resubmit that one as well. Not sure
on the status of Josh's patches you have dependency on as well. Can
you please coordinate with him and maybe incorporate them into your
series?

Please also cc Paul McKenney <paulmck@kernel.org> for the future
revisions so he can follow along as well? Thanks!


  [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=522757&state=*



> Ingo and Josh,
>
> Could you give me any comment, please?
>
> Thank you,
>
>
> > Thanks!
> >
> > >
> > > Thank you,
> > >
> > > --
> > > Masami Hiramatsu <mhiramat@kernel.org>
>
>
> --
> Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86
  2021-09-13 17:14       ` Andrii Nakryiko
@ 2021-09-14  0:38         ` Masami Hiramatsu
  2021-09-14  1:36           ` Andrii Nakryiko
  0 siblings, 1 reply; 28+ messages in thread
From: Masami Hiramatsu @ 2021-09-14  0:38 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Josh Poimboeuf, Ingo Molnar, Steven Rostedt, X86 ML, Daniel Xu,
	open list, bpf, Jakub Kicinski, Ingo Molnar, Alexei Starovoitov,
	Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Kernel Team,
	Yonghong Song, linux-ia64, Abhishek Sagar, Paul E . McKenney

On Mon, 13 Sep 2021 10:14:55 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Mon, Aug 23, 2021 at 10:32 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Mon, 23 Aug 2021 22:12:06 -0700
> > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > > On Thu, Jul 29, 2021 at 4:35 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > >
> > > > On Thu, 29 Jul 2021 23:05:56 +0900
> > > > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > >
> > > > > Hello,
> > > > >
> > > > > This is the 10th version of the series to fix the stacktrace with kretprobe on x86.
> > > > >
> > > > > The previous version is here;
> > > > >
> > > > >  https://lore.kernel.org/bpf/162601048053.1318837.1550594515476777588.stgit@devnote2/
> > > > >
> > > > > This version is rebased on top of new kprobes cleanup series(*1) and merging
> > > > > Josh's objtool update series (*2)(*3) as [6/16] and [7/16].
> > > > >
> > > > > (*1) https://lore.kernel.org/bpf/162748615977.59465.13262421617578791515.stgit@devnote2/
> > > > > (*2) https://lore.kernel.org/bpf/20210710192433.x5cgjsq2ksvaqnss@treble/
> > > > > (*3) https://lore.kernel.org/bpf/20210710192514.ghvksi3ozhez4lvb@treble/
> > > > >
> > > > > Changes from v9:
> > > > >  - Add Josh's objtool update patches with a build error fix as [6/16] and [7/16].
> > > > >  - Add a API document for kretprobe_find_ret_addr() and check cur != NULL in [5/16].
> > > > >
> > > > > With this series, unwinder can unwind stack correctly from ftrace as below;
> > > > >
> > > > >   # cd /sys/kernel/debug/tracing
> > > > >   # echo > trace
> > > > >   # echo 1 > options/sym-offset
> > > > >   # echo r vfs_read >> kprobe_events
> > > > >   # echo r full_proxy_read >> kprobe_events
> > > > >   # echo traceoff:1 > events/kprobes/r_vfs_read_0/trigger
> > > > >   # echo stacktrace:1 > events/kprobes/r_full_proxy_read_0/trigger
> > > > >   # echo 1 > events/kprobes/enable
> > > > >   # cat /sys/kernel/debug/kprobes/list
> > > > > ffffffff8133b740  r  full_proxy_read+0x0    [FTRACE]
> > > > > ffffffff812560b0  r  vfs_read+0x0    [FTRACE]
> > > > >   # echo 0 > events/kprobes/enable
> > > > >   # cat trace
> > > > > # tracer: nop
> > > > > #
> > > > > # entries-in-buffer/entries-written: 3/3   #P:8
> > > > > #
> > > > > #                                _-----=> irqs-off
> > > > > #                               / _----=> need-resched
> > > > > #                              | / _---=> hardirq/softirq
> > > > > #                              || / _--=> preempt-depth
> > > > > #                              ||| /     delay
> > > > > #           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
> > > > > #              | |         |   ||||      |         |
> > > > >            <...>-134     [007] ...1    16.185877: r_full_proxy_read_0: (vfs_read+0x98/0x180 <- full_proxy_read)
> > > > >            <...>-134     [007] ...1    16.185901: <stack trace>
> > > > >  => kretprobe_trace_func+0x209/0x300
> > > > >  => kretprobe_dispatcher+0x4a/0x70
> > > > >  => __kretprobe_trampoline_handler+0xd4/0x170
> > > > >  => trampoline_handler+0x43/0x60
> > > > >  => kretprobe_trampoline+0x2a/0x50
> > > > >  => vfs_read+0x98/0x180
> > > > >  => ksys_read+0x5f/0xe0
> > > > >  => do_syscall_64+0x37/0x90
> > > > >  => entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > > >            <...>-134     [007] ...1    16.185902: r_vfs_read_0: (ksys_read+0x5f/0xe0 <- vfs_read)
> > > > >
> > > > > This shows the double return probes (vfs_read() and full_proxy_read()) on the stack
> > > > > correctly unwinded. (vfs_read() returns to 'ksys_read+0x5f' and full_proxy_read()
> > > > > returns to 'vfs_read+0x98')
> > > > >
> > > > > This also changes the kretprobe behavisor a bit, now the instraction pointer in
> > > > > the 'pt_regs' passed to kretprobe user handler is correctly set the real return
> > > > > address. So user handlers can get it via instruction_pointer() API, and can use
> > > > > stack_trace_save_regs().
> > > > >
> > > > > You can also get this series from
> > > > >  git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git kprobes/kretprobe-stackfix-v9
> > > >
> > > > Oops, this is of course 'kprobes/kretprobe-stackfix-v10'. And this branch includes above (*1) series.
> > >
> > > Hi Masami,
> > >
> > > Was this ever merged/applied? This is a very important functionality
> > > for BPF kretprobes, so I hope this won't slip through the cracks.
> >
> > No, not yet as far as I know.
> > I'm waiting for any comment on this series. Since this is basically
> > x86 ORC unwinder improvement, this series should be merged to -tip tree.
> >
> 
> Hey Masami,
> 
> It's been a while since you posted v10. It seems like this series
> doesn't apply cleanly anymore. Do you mind rebasing and resubmitting
> it again to refresh the series and make it easier for folks to review
> and test it?

Yes, I'm planning to do that this week soon.
Thank you for ping me :)

> 
> Also, do I understand correctly that [0] is a dependency of this
> series? If yes, please rebase and resubmit that one as well. Not sure
> on the status of Josh's patches you have dependency on as well. Can
> you please coordinate with him and maybe incorporate them into your
> series?

Sorry I can not see [0], could you tell me another URL or title?
Or is that Kees's patch [1]?

[1] https://lore.kernel.org/all/20210903021326.206548-1-keescook@chromium.org/T/#u


> 
> Please also cc Paul McKenney <paulmck@kernel.org> for the future
> revisions so he can follow along as well? Thanks!

OK!

> 
> 
>   [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=522757&state=*
> 
> 
> 
> > Ingo and Josh,
> >
> > Could you give me any comment, please?
> >
> > Thank you,
> >
> >
> > > Thanks!
> > >
> > > >
> > > > Thank you,
> > > >
> > > > --
> > > > Masami Hiramatsu <mhiramat@kernel.org>
> >
> >
> > --
> > Masami Hiramatsu <mhiramat@kernel.org>


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86
  2021-09-14  0:38         ` Masami Hiramatsu
@ 2021-09-14  1:36           ` Andrii Nakryiko
  2021-09-14  5:10             ` Masami Hiramatsu
  0 siblings, 1 reply; 28+ messages in thread
From: Andrii Nakryiko @ 2021-09-14  1:36 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Josh Poimboeuf, Ingo Molnar, Steven Rostedt, X86 ML, Daniel Xu,
	open list, bpf, Jakub Kicinski, Ingo Molnar, Alexei Starovoitov,
	Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Kernel Team,
	Yonghong Song, linux-ia64, Abhishek Sagar, Paul E . McKenney

On Mon, Sep 13, 2021 at 5:38 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Mon, 13 Sep 2021 10:14:55 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > On Mon, Aug 23, 2021 at 10:32 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > >
> > > On Mon, 23 Aug 2021 22:12:06 -0700
> > > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > >
> > > > On Thu, Jul 29, 2021 at 4:35 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > > >
> > > > > On Thu, 29 Jul 2021 23:05:56 +0900
> > > > > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > > >
> > > > > > Hello,
> > > > > >
> > > > > > This is the 10th version of the series to fix the stacktrace with kretprobe on x86.
> > > > > >
> > > > > > The previous version is here;
> > > > > >
> > > > > >  https://lore.kernel.org/bpf/162601048053.1318837.1550594515476777588.stgit@devnote2/
> > > > > >
> > > > > > This version is rebased on top of new kprobes cleanup series(*1) and merging
> > > > > > Josh's objtool update series (*2)(*3) as [6/16] and [7/16].
> > > > > >
> > > > > > (*1) https://lore.kernel.org/bpf/162748615977.59465.13262421617578791515.stgit@devnote2/
> > > > > > (*2) https://lore.kernel.org/bpf/20210710192433.x5cgjsq2ksvaqnss@treble/
> > > > > > (*3) https://lore.kernel.org/bpf/20210710192514.ghvksi3ozhez4lvb@treble/
> > > > > >
> > > > > > Changes from v9:
> > > > > >  - Add Josh's objtool update patches with a build error fix as [6/16] and [7/16].
> > > > > >  - Add a API document for kretprobe_find_ret_addr() and check cur != NULL in [5/16].
> > > > > >
> > > > > > With this series, unwinder can unwind stack correctly from ftrace as below;
> > > > > >
> > > > > >   # cd /sys/kernel/debug/tracing
> > > > > >   # echo > trace
> > > > > >   # echo 1 > options/sym-offset
> > > > > >   # echo r vfs_read >> kprobe_events
> > > > > >   # echo r full_proxy_read >> kprobe_events
> > > > > >   # echo traceoff:1 > events/kprobes/r_vfs_read_0/trigger
> > > > > >   # echo stacktrace:1 > events/kprobes/r_full_proxy_read_0/trigger
> > > > > >   # echo 1 > events/kprobes/enable
> > > > > >   # cat /sys/kernel/debug/kprobes/list
> > > > > > ffffffff8133b740  r  full_proxy_read+0x0    [FTRACE]
> > > > > > ffffffff812560b0  r  vfs_read+0x0    [FTRACE]
> > > > > >   # echo 0 > events/kprobes/enable
> > > > > >   # cat trace
> > > > > > # tracer: nop
> > > > > > #
> > > > > > # entries-in-buffer/entries-written: 3/3   #P:8
> > > > > > #
> > > > > > #                                _-----=> irqs-off
> > > > > > #                               / _----=> need-resched
> > > > > > #                              | / _---=> hardirq/softirq
> > > > > > #                              || / _--=> preempt-depth
> > > > > > #                              ||| /     delay
> > > > > > #           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
> > > > > > #              | |         |   ||||      |         |
> > > > > >            <...>-134     [007] ...1    16.185877: r_full_proxy_read_0: (vfs_read+0x98/0x180 <- full_proxy_read)
> > > > > >            <...>-134     [007] ...1    16.185901: <stack trace>
> > > > > >  => kretprobe_trace_func+0x209/0x300
> > > > > >  => kretprobe_dispatcher+0x4a/0x70
> > > > > >  => __kretprobe_trampoline_handler+0xd4/0x170
> > > > > >  => trampoline_handler+0x43/0x60
> > > > > >  => kretprobe_trampoline+0x2a/0x50
> > > > > >  => vfs_read+0x98/0x180
> > > > > >  => ksys_read+0x5f/0xe0
> > > > > >  => do_syscall_64+0x37/0x90
> > > > > >  => entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > > > >            <...>-134     [007] ...1    16.185902: r_vfs_read_0: (ksys_read+0x5f/0xe0 <- vfs_read)
> > > > > >
> > > > > > This shows the double return probes (vfs_read() and full_proxy_read()) on the stack
> > > > > > correctly unwinded. (vfs_read() returns to 'ksys_read+0x5f' and full_proxy_read()
> > > > > > returns to 'vfs_read+0x98')
> > > > > >
> > > > > > This also changes the kretprobe behavisor a bit, now the instraction pointer in
> > > > > > the 'pt_regs' passed to kretprobe user handler is correctly set the real return
> > > > > > address. So user handlers can get it via instruction_pointer() API, and can use
> > > > > > stack_trace_save_regs().
> > > > > >
> > > > > > You can also get this series from
> > > > > >  git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git kprobes/kretprobe-stackfix-v9
> > > > >
> > > > > Oops, this is of course 'kprobes/kretprobe-stackfix-v10'. And this branch includes above (*1) series.
> > > >
> > > > Hi Masami,
> > > >
> > > > Was this ever merged/applied? This is a very important functionality
> > > > for BPF kretprobes, so I hope this won't slip through the cracks.
> > >
> > > No, not yet as far as I know.
> > > I'm waiting for any comment on this series. Since this is basically
> > > x86 ORC unwinder improvement, this series should be merged to -tip tree.
> > >
> >
> > Hey Masami,
> >
> > It's been a while since you posted v10. It seems like this series
> > doesn't apply cleanly anymore. Do you mind rebasing and resubmitting
> > it again to refresh the series and make it easier for folks to review
> > and test it?
>
> Yes, I'm planning to do that this week soon.
> Thank you for ping me :)

Sounds good, thank you.

>
> >
> > Also, do I understand correctly that [0] is a dependency of this
> > series? If yes, please rebase and resubmit that one as well. Not sure
> > on the status of Josh's patches you have dependency on as well. Can
> > you please coordinate with him and maybe incorporate them into your
> > series?
>
> Sorry I can not see [0], could you tell me another URL or title?

Make sure that you have `state=*` when you are copy/pasting that URL,
it is just a normal Patchworks URL, should be visible to anyone. But
it's just your "kprobes: treewide: Clean up kprobe code" series (v3).

> Or is that Kees's patch [1]?
>
> [1] https://lore.kernel.org/all/20210903021326.206548-1-keescook@chromium.org/T/#u
>
>
> >
> > Please also cc Paul McKenney <paulmck@kernel.org> for the future
> > revisions so he can follow along as well? Thanks!
>
> OK!
>
> >
> >
> >   [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=522757&state=*
> >
> >
> >
> > > Ingo and Josh,
> > >
> > > Could you give me any comment, please?
> > >
> > > Thank you,
> > >
> > >
> > > > Thanks!
> > > >
> > > > >
> > > > > Thank you,
> > > > >
> > > > > --
> > > > > Masami Hiramatsu <mhiramat@kernel.org>
> > >
> > >
> > > --
> > > Masami Hiramatsu <mhiramat@kernel.org>
>
>
> --
> Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86
  2021-09-14  1:36           ` Andrii Nakryiko
@ 2021-09-14  5:10             ` Masami Hiramatsu
  0 siblings, 0 replies; 28+ messages in thread
From: Masami Hiramatsu @ 2021-09-14  5:10 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Josh Poimboeuf, Ingo Molnar, Steven Rostedt, X86 ML, Daniel Xu,
	open list, bpf, Jakub Kicinski, Ingo Molnar, Alexei Starovoitov,
	Thomas Gleixner, Borislav Petkov, Peter Zijlstra, Kernel Team,
	Yonghong Song, linux-ia64, Abhishek Sagar, Paul E . McKenney

On Mon, 13 Sep 2021 18:36:10 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Mon, Sep 13, 2021 at 5:38 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Mon, 13 Sep 2021 10:14:55 -0700
> > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > > On Mon, Aug 23, 2021 at 10:32 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > >
> > > > On Mon, 23 Aug 2021 22:12:06 -0700
> > > > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > > On Thu, Jul 29, 2021 at 4:35 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > > > >
> > > > > > On Thu, 29 Jul 2021 23:05:56 +0900
> > > > > > Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > > > > >
> > > > > > > Hello,
> > > > > > >
> > > > > > > This is the 10th version of the series to fix the stacktrace with kretprobe on x86.
> > > > > > >
> > > > > > > The previous version is here;
> > > > > > >
> > > > > > >  https://lore.kernel.org/bpf/162601048053.1318837.1550594515476777588.stgit@devnote2/
> > > > > > >
> > > > > > > This version is rebased on top of new kprobes cleanup series(*1) and merging
> > > > > > > Josh's objtool update series (*2)(*3) as [6/16] and [7/16].
> > > > > > >
> > > > > > > (*1) https://lore.kernel.org/bpf/162748615977.59465.13262421617578791515.stgit@devnote2/
> > > > > > > (*2) https://lore.kernel.org/bpf/20210710192433.x5cgjsq2ksvaqnss@treble/
> > > > > > > (*3) https://lore.kernel.org/bpf/20210710192514.ghvksi3ozhez4lvb@treble/
> > > > > > >
> > > > > > > Changes from v9:
> > > > > > >  - Add Josh's objtool update patches with a build error fix as [6/16] and [7/16].
> > > > > > >  - Add a API document for kretprobe_find_ret_addr() and check cur != NULL in [5/16].
> > > > > > >
> > > > > > > With this series, unwinder can unwind stack correctly from ftrace as below;
> > > > > > >
> > > > > > >   # cd /sys/kernel/debug/tracing
> > > > > > >   # echo > trace
> > > > > > >   # echo 1 > options/sym-offset
> > > > > > >   # echo r vfs_read >> kprobe_events
> > > > > > >   # echo r full_proxy_read >> kprobe_events
> > > > > > >   # echo traceoff:1 > events/kprobes/r_vfs_read_0/trigger
> > > > > > >   # echo stacktrace:1 > events/kprobes/r_full_proxy_read_0/trigger
> > > > > > >   # echo 1 > events/kprobes/enable
> > > > > > >   # cat /sys/kernel/debug/kprobes/list
> > > > > > > ffffffff8133b740  r  full_proxy_read+0x0    [FTRACE]
> > > > > > > ffffffff812560b0  r  vfs_read+0x0    [FTRACE]
> > > > > > >   # echo 0 > events/kprobes/enable
> > > > > > >   # cat trace
> > > > > > > # tracer: nop
> > > > > > > #
> > > > > > > # entries-in-buffer/entries-written: 3/3   #P:8
> > > > > > > #
> > > > > > > #                                _-----=> irqs-off
> > > > > > > #                               / _----=> need-resched
> > > > > > > #                              | / _---=> hardirq/softirq
> > > > > > > #                              || / _--=> preempt-depth
> > > > > > > #                              ||| /     delay
> > > > > > > #           TASK-PID     CPU#  ||||   TIMESTAMP  FUNCTION
> > > > > > > #              | |         |   ||||      |         |
> > > > > > >            <...>-134     [007] ...1    16.185877: r_full_proxy_read_0: (vfs_read+0x98/0x180 <- full_proxy_read)
> > > > > > >            <...>-134     [007] ...1    16.185901: <stack trace>
> > > > > > >  => kretprobe_trace_func+0x209/0x300
> > > > > > >  => kretprobe_dispatcher+0x4a/0x70
> > > > > > >  => __kretprobe_trampoline_handler+0xd4/0x170
> > > > > > >  => trampoline_handler+0x43/0x60
> > > > > > >  => kretprobe_trampoline+0x2a/0x50
> > > > > > >  => vfs_read+0x98/0x180
> > > > > > >  => ksys_read+0x5f/0xe0
> > > > > > >  => do_syscall_64+0x37/0x90
> > > > > > >  => entry_SYSCALL_64_after_hwframe+0x44/0xae
> > > > > > >            <...>-134     [007] ...1    16.185902: r_vfs_read_0: (ksys_read+0x5f/0xe0 <- vfs_read)
> > > > > > >
> > > > > > > This shows the double return probes (vfs_read() and full_proxy_read()) on the stack
> > > > > > > correctly unwinded. (vfs_read() returns to 'ksys_read+0x5f' and full_proxy_read()
> > > > > > > returns to 'vfs_read+0x98')
> > > > > > >
> > > > > > > This also changes the kretprobe behavisor a bit, now the instraction pointer in
> > > > > > > the 'pt_regs' passed to kretprobe user handler is correctly set the real return
> > > > > > > address. So user handlers can get it via instruction_pointer() API, and can use
> > > > > > > stack_trace_save_regs().
> > > > > > >
> > > > > > > You can also get this series from
> > > > > > >  git://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git kprobes/kretprobe-stackfix-v9
> > > > > >
> > > > > > Oops, this is of course 'kprobes/kretprobe-stackfix-v10'. And this branch includes above (*1) series.
> > > > >
> > > > > Hi Masami,
> > > > >
> > > > > Was this ever merged/applied? This is a very important functionality
> > > > > for BPF kretprobes, so I hope this won't slip through the cracks.
> > > >
> > > > No, not yet as far as I know.
> > > > I'm waiting for any comment on this series. Since this is basically
> > > > x86 ORC unwinder improvement, this series should be merged to -tip tree.
> > > >
> > >
> > > Hey Masami,
> > >
> > > It's been a while since you posted v10. It seems like this series
> > > doesn't apply cleanly anymore. Do you mind rebasing and resubmitting
> > > it again to refresh the series and make it easier for folks to review
> > > and test it?
> >
> > Yes, I'm planning to do that this week soon.
> > Thank you for ping me :)
> 
> Sounds good, thank you.
> 
> >
> > >
> > > Also, do I understand correctly that [0] is a dependency of this
> > > series? If yes, please rebase and resubmit that one as well. Not sure
> > > on the status of Josh's patches you have dependency on as well. Can
> > > you please coordinate with him and maybe incorporate them into your
> > > series?
> >
> > Sorry I can not see [0], could you tell me another URL or title?
> 
> Make sure that you have `state=*` when you are copy/pasting that URL,
> it is just a normal Patchworks URL, should be visible to anyone. But
> it's just your "kprobes: treewide: Clean up kprobe code" series (v3).

Ah, OK. Of course, I will include it :)

Thank you!

> 
> > Or is that Kees's patch [1]?
> >
> > [1] https://lore.kernel.org/all/20210903021326.206548-1-keescook@chromium.org/T/#u
> >
> >
> > >
> > > Please also cc Paul McKenney <paulmck@kernel.org> for the future
> > > revisions so he can follow along as well? Thanks!
> >
> > OK!
> >
> > >
> > >
> > >   [0] https://patchwork.kernel.org/project/netdevbpf/list/?series=522757&state=*
> > >
> > >
> > >
> > > > Ingo and Josh,
> > > >
> > > > Could you give me any comment, please?
> > > >
> > > > Thank you,
> > > >
> > > >
> > > > > Thanks!
> > > > >
> > > > > >
> > > > > > Thank you,
> > > > > >
> > > > > > --
> > > > > > Masami Hiramatsu <mhiramat@kernel.org>
> > > >
> > > >
> > > > --
> > > > Masami Hiramatsu <mhiramat@kernel.org>
> >
> >
> > --
> > Masami Hiramatsu <mhiramat@kernel.org>


-- 
Masami Hiramatsu <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2021-09-14  5:10 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-07-29 14:05 [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
2021-07-29 14:06 ` [PATCH -tip v10 01/16] ia64: kprobes: Fix to pass correct trampoline address to the handler Masami Hiramatsu
2021-07-29 14:06 ` [PATCH -tip v10 02/16] kprobes: treewide: Replace arch_deref_entry_point() with dereference_symbol_descriptor() Masami Hiramatsu
2021-07-29 14:06 ` [PATCH -tip v10 03/16] kprobes: treewide: Remove trampoline_address from kretprobe_trampoline_handler() Masami Hiramatsu
2021-07-29 14:06 ` [PATCH -tip v10 04/16] kprobes: treewide: Make it harder to refer kretprobe_trampoline directly Masami Hiramatsu
2021-07-29 14:06 ` [PATCH -tip v10 05/16] kprobes: Add kretprobe_find_ret_addr() for searching return address Masami Hiramatsu
2021-07-29 14:06 ` [PATCH -tip v10 06/16] objtool: Add frame-pointer-specific function ignore Masami Hiramatsu
2021-07-29 14:07 ` [PATCH -tip v10 07/16] objtool: Ignore unwind hints for ignored functions Masami Hiramatsu
2021-07-29 14:07 ` [PATCH -tip v10 08/16] x86/kprobes: Add UNWIND_HINT_FUNC on kretprobe_trampoline() Masami Hiramatsu
2021-07-29 14:07 ` [PATCH -tip v10 09/16] ARC: Add instruction_pointer_set() API Masami Hiramatsu
2021-07-29 14:07 ` [PATCH -tip v10 10/16] ia64: " Masami Hiramatsu
2021-07-29 14:07 ` [PATCH -tip v10 11/16] arm: kprobes: Make space for instruction pointer on stack Masami Hiramatsu
2021-07-29 14:07 ` [PATCH -tip v10 12/16] kprobes: Enable stacktrace from pt_regs in kretprobe handler Masami Hiramatsu
2021-07-29 14:07 ` [PATCH -tip v10 13/16] x86/kprobes: Push a fake return address at kretprobe_trampoline Masami Hiramatsu
2021-07-29 14:08 ` [PATCH -tip v10 14/16] x86/unwind: Recover kretprobe trampoline entry Masami Hiramatsu
2021-07-29 14:08 ` [PATCH -tip v10 15/16] tracing: Show kretprobe unknown indicator only for kretprobe_trampoline Masami Hiramatsu
2021-07-29 14:08 ` [PATCH -tip v10 16/16] x86/kprobes: Fixup return address in generic trampoline handler Masami Hiramatsu
2021-07-29 23:35 ` [PATCH -tip v10 00/16] kprobes: Fix stacktrace with kretprobes on x86 Masami Hiramatsu
2021-08-24  5:12   ` Andrii Nakryiko
2021-08-24  5:32     ` Masami Hiramatsu
2021-09-13 17:14       ` Andrii Nakryiko
2021-09-14  0:38         ` Masami Hiramatsu
2021-09-14  1:36           ` Andrii Nakryiko
2021-09-14  5:10             ` Masami Hiramatsu
2021-08-29 14:22 ` [RFC PATCH 0/1] Non stack-intrusive return probe event Masami Hiramatsu
2021-08-29 14:22   ` [RFC PATCH 1/1] [PoC] tracing: kprobe: Add non-stack intrusion " Masami Hiramatsu
2021-08-30 19:04   ` [RFC PATCH 0/1] Non stack-intrusive " Andrii Nakryiko
2021-08-31  6:06     ` Masami Hiramatsu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).