All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs
@ 2023-08-23 15:15 Masami Hiramatsu (Google)
  2023-08-23 15:15 ` [PATCH v4 1/9] Documentation: probes: Add a new ret_ip callback parameter Masami Hiramatsu (Google)
                   ` (9 more replies)
  0 siblings, 10 replies; 32+ messages in thread
From: Masami Hiramatsu (Google) @ 2023-08-23 15:15 UTC (permalink / raw)
  To: Alexei Starovoitov, Steven Rostedt, Florent Revest
  Cc: linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

Hi,

Here is the 4th version of the series to use ftrace_regs instead of pt_regs
in fprobe.
The previous version is here;

https://lore.kernel.org/all/169181859570.505132.10136520092011157898.stgit@devnote2/

This version fixes the issues pointed in the previous series; fix document
description, keep CONFIG_FPROBE dependency for multi-kprobe, add
static_assert check for ftrace_regs size, reorder the ftrace_partial_regs()
patch for perf fprobe event support, introduce per-cpu pt_regs stack for
perf fprobe event and add Florent's Ack (Thanks!). Also this adds a new
documentation patch to clarify that the $argN and $retval is best effort.

 - Document fix for the current fprobe callback prototype
 - Simply replace pt_regs in fprobe_entry_handler with ftrace_regs.
 - Expose ftrace_regs even if CONFIG_FUNCTION_TRACER=n.
 - Introduce ftrace_partial_regs(). (This changes ARM64 which needs a custom
   implementation)
 - Replace pt_regs in rethook and fprobe_exit_handler with ftrace_regs. This
   introduce a new HAVE_PT_REGS_TO_FTRACE_REGS_CAST which means ftrace_regs is
   just a wrapper of pt_regs (except for arm64, other architectures do this)
 - Update fprobe-events to use ftrace_regs natively.
 - Update bpf multi-kprobe handler use ftrace_partial_regs().
 - Update document for new fprobe callbacks.
 - Add notes for the $argN and $retval.

This series can be applied against the probes/core branch on linux-trace tree.

This series can also be found below branch.

https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/log/?h=topic/fprobe-ftrace-regs

Thank you,

---

Masami Hiramatsu (Google) (9):
      Documentation: probes: Add a new ret_ip callback parameter
      fprobe: Use fprobe_regs in fprobe entry handler
      tracing: Expose ftrace_regs regardless of CONFIG_FUNCTION_TRACER
      fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook
      ftrace: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
      tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
      bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled
      Documentations: probes: Update fprobe document to use ftrace_regs
      Documentation: tracing: Add a note about argument and retval access


 Documentation/trace/fprobe.rst      |   18 +++--
 Documentation/trace/fprobetrace.rst |    8 ++
 Documentation/trace/kprobetrace.rst |    8 ++
 arch/Kconfig                        |    1 
 arch/arm64/include/asm/ftrace.h     |   11 +++
 arch/loongarch/Kconfig              |    1 
 arch/s390/Kconfig                   |    1 
 arch/s390/include/asm/ftrace.h      |    4 +
 arch/x86/Kconfig                    |    1 
 arch/x86/kernel/rethook.c           |   13 ++-
 include/linux/fprobe.h              |    4 +
 include/linux/ftrace.h              |   89 ++++++++++++++++++------
 include/linux/rethook.h             |   11 ++-
 kernel/kprobes.c                    |   10 ++-
 kernel/trace/Kconfig                |    9 ++
 kernel/trace/bpf_trace.c            |   14 ++--
 kernel/trace/fprobe.c               |   10 +--
 kernel/trace/rethook.c              |   16 ++--
 kernel/trace/trace_fprobe.c         |  131 ++++++++++++++++++++++++++---------
 kernel/trace/trace_probe_tmpl.h     |    2 -
 lib/test_fprobe.c                   |   10 +--
 samples/fprobe/fprobe_example.c     |    4 +
 22 files changed, 267 insertions(+), 109 deletions(-)

--
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* [PATCH v4 1/9] Documentation: probes: Add a new ret_ip callback parameter
  2023-08-23 15:15 [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs Masami Hiramatsu (Google)
@ 2023-08-23 15:15 ` Masami Hiramatsu (Google)
  2023-08-25 16:11   ` Florent Revest
  2023-08-23 15:15 ` [PATCH v4 2/9] fprobe: Use fprobe_regs in fprobe entry handler Masami Hiramatsu (Google)
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 32+ messages in thread
From: Masami Hiramatsu (Google) @ 2023-08-23 15:15 UTC (permalink / raw)
  To: Alexei Starovoitov, Steven Rostedt, Florent Revest
  Cc: linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Add a new ret_ip callback parameter description.

Fixes: cb16330d1274 ("fprobe: Pass return address to the handlers")
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 Changes in v4:
  - Update ret_ip description (Thanks Florent!)
---
 Documentation/trace/fprobe.rst |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/Documentation/trace/fprobe.rst b/Documentation/trace/fprobe.rst
index 40dd2fbce861..5851a14eb893 100644
--- a/Documentation/trace/fprobe.rst
+++ b/Documentation/trace/fprobe.rst
@@ -91,9 +91,9 @@ The prototype of the entry/exit callback function are as follows:
 
 .. code-block:: c
 
- int entry_callback(struct fprobe *fp, unsigned long entry_ip, struct pt_regs *regs, void *entry_data);
+ int entry_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct pt_regs *regs, void *entry_data);
 
- void exit_callback(struct fprobe *fp, unsigned long entry_ip, struct pt_regs *regs, void *entry_data);
+ void exit_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct pt_regs *regs, void *entry_data);
 
 Note that the @entry_ip is saved at function entry and passed to exit handler.
 If the entry callback function returns !0, the corresponding exit callback will be cancelled.
@@ -108,6 +108,10 @@ If the entry callback function returns !0, the corresponding exit callback will
         Note that this may not be the actual entry address of the function but
         the address where the ftrace is instrumented.
 
+@ret_ip
+        This is the return address that the traced function will return to,
+        somewhere in the caller. This can be used at both entry and exit.
+
 @regs
         This is the `pt_regs` data structure at the entry and exit. Note that
         the instruction pointer of @regs may be different from the @entry_ip


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v4 2/9] fprobe: Use fprobe_regs in fprobe entry handler
  2023-08-23 15:15 [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs Masami Hiramatsu (Google)
  2023-08-23 15:15 ` [PATCH v4 1/9] Documentation: probes: Add a new ret_ip callback parameter Masami Hiramatsu (Google)
@ 2023-08-23 15:15 ` Masami Hiramatsu (Google)
  2023-08-25 16:11   ` Florent Revest
  2023-08-23 15:16 ` [PATCH v4 3/9] tracing: Expose ftrace_regs regardless of CONFIG_FUNCTION_TRACER Masami Hiramatsu (Google)
                   ` (7 subsequent siblings)
  9 siblings, 1 reply; 32+ messages in thread
From: Masami Hiramatsu (Google) @ 2023-08-23 15:15 UTC (permalink / raw)
  To: Alexei Starovoitov, Steven Rostedt, Florent Revest
  Cc: linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

This allows fprobes to be available with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
instead of CONFIG_DYNAMIC_FTRACE_WITH_REGS, then we can enable fprobe
on arm64.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 Changes in v3:
  - Use FTRACE_OPS_FL_SAVE_ARGS instead of FTRACE_OPS_FL_SAVE_REGS.
 Changes in v4:
  - Keep CONFIG_FPROBE check for multi-kprobe because this depends
    on FPROBE API.
---
 include/linux/fprobe.h          |    2 +-
 kernel/trace/Kconfig            |    3 ++-
 kernel/trace/bpf_trace.c        |   10 +++++++---
 kernel/trace/fprobe.c           |    4 ++--
 kernel/trace/trace_fprobe.c     |    6 +++++-
 lib/test_fprobe.c               |    4 ++--
 samples/fprobe/fprobe_example.c |    2 +-
 7 files changed, 20 insertions(+), 11 deletions(-)

diff --git a/include/linux/fprobe.h b/include/linux/fprobe.h
index 3e03758151f4..36c0595f7b93 100644
--- a/include/linux/fprobe.h
+++ b/include/linux/fprobe.h
@@ -35,7 +35,7 @@ struct fprobe {
 	int			nr_maxactive;
 
 	int (*entry_handler)(struct fprobe *fp, unsigned long entry_ip,
-			     unsigned long ret_ip, struct pt_regs *regs,
+			     unsigned long ret_ip, struct ftrace_regs *regs,
 			     void *entry_data);
 	void (*exit_handler)(struct fprobe *fp, unsigned long entry_ip,
 			     unsigned long ret_ip, struct pt_regs *regs,
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 61c541c36596..976fd594b446 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -287,7 +287,7 @@ config DYNAMIC_FTRACE_WITH_ARGS
 config FPROBE
 	bool "Kernel Function Probe (fprobe)"
 	depends on FUNCTION_TRACER
-	depends on DYNAMIC_FTRACE_WITH_REGS
+	depends on DYNAMIC_FTRACE_WITH_REGS || DYNAMIC_FTRACE_WITH_ARGS
 	depends on HAVE_RETHOOK
 	select RETHOOK
 	default n
@@ -672,6 +672,7 @@ config FPROBE_EVENTS
 	select TRACING
 	select PROBE_EVENTS
 	select DYNAMIC_EVENTS
+	depends on DYNAMIC_FTRACE_WITH_REGS
 	default y
 	help
 	  This allows user to add tracing events on the function entry and
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index bd1a42b23f3f..22d00c817f1a 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -2467,7 +2467,7 @@ static int __init bpf_event_init(void)
 fs_initcall(bpf_event_init);
 #endif /* CONFIG_MODULES */
 
-#ifdef CONFIG_FPROBE
+#if defined(CONFIG_FPROBE) && defined(CONFIG_DYNAMIC_FTRACE_WITH_REGS)
 struct bpf_kprobe_multi_link {
 	struct bpf_link link;
 	struct fprobe fp;
@@ -2659,10 +2659,14 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
 
 static int
 kprobe_multi_link_handler(struct fprobe *fp, unsigned long fentry_ip,
-			  unsigned long ret_ip, struct pt_regs *regs,
+			  unsigned long ret_ip, struct ftrace_regs *fregs,
 			  void *data)
 {
 	struct bpf_kprobe_multi_link *link;
+	struct pt_regs *regs = ftrace_get_regs(fregs);
+
+	if (!regs)
+		return 0;
 
 	link = container_of(fp, struct bpf_kprobe_multi_link, fp);
 	kprobe_multi_link_prog_run(link, get_entry_ip(fentry_ip), regs);
@@ -2917,7 +2921,7 @@ int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
 	kvfree(cookies);
 	return err;
 }
-#else /* !CONFIG_FPROBE */
+#else /* !CONFIG_FPROBE || !CONFIG_DYNAMIC_FTRACE_WITH_REGS */
 int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
 {
 	return -EOPNOTSUPP;
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 3b21f4063258..07deb52df44a 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -46,7 +46,7 @@ static inline void __fprobe_handler(unsigned long ip, unsigned long parent_ip,
 	}
 
 	if (fp->entry_handler)
-		ret = fp->entry_handler(fp, ip, parent_ip, ftrace_get_regs(fregs), entry_data);
+		ret = fp->entry_handler(fp, ip, parent_ip, fregs, entry_data);
 
 	/* If entry_handler returns !0, nmissed is not counted. */
 	if (rh) {
@@ -182,7 +182,7 @@ static void fprobe_init(struct fprobe *fp)
 		fp->ops.func = fprobe_kprobe_handler;
 	else
 		fp->ops.func = fprobe_handler;
-	fp->ops.flags |= FTRACE_OPS_FL_SAVE_REGS;
+	fp->ops.flags |= FTRACE_OPS_FL_SAVE_ARGS;
 }
 
 static int fprobe_init_rethook(struct fprobe *fp, int num)
diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
index 8bfe23af9c73..71bf38d698f1 100644
--- a/kernel/trace/trace_fprobe.c
+++ b/kernel/trace/trace_fprobe.c
@@ -320,12 +320,16 @@ NOKPROBE_SYMBOL(fexit_perf_func);
 #endif	/* CONFIG_PERF_EVENTS */
 
 static int fentry_dispatcher(struct fprobe *fp, unsigned long entry_ip,
-			     unsigned long ret_ip, struct pt_regs *regs,
+			     unsigned long ret_ip, struct ftrace_regs *fregs,
 			     void *entry_data)
 {
 	struct trace_fprobe *tf = container_of(fp, struct trace_fprobe, fp);
+	struct pt_regs *regs = ftrace_get_regs(fregs);
 	int ret = 0;
 
+	if (!regs)
+		return 0;
+
 	if (trace_probe_test_flag(&tf->tp, TP_FLAG_TRACE))
 		fentry_trace_func(tf, entry_ip, regs);
 #ifdef CONFIG_PERF_EVENTS
diff --git a/lib/test_fprobe.c b/lib/test_fprobe.c
index 24de0e5ff859..ff607babba18 100644
--- a/lib/test_fprobe.c
+++ b/lib/test_fprobe.c
@@ -40,7 +40,7 @@ static noinline u32 fprobe_selftest_nest_target(u32 value, u32 (*nest)(u32))
 
 static notrace int fp_entry_handler(struct fprobe *fp, unsigned long ip,
 				    unsigned long ret_ip,
-				    struct pt_regs *regs, void *data)
+				    struct ftrace_regs *fregs, void *data)
 {
 	KUNIT_EXPECT_FALSE(current_test, preemptible());
 	/* This can be called on the fprobe_selftest_target and the fprobe_selftest_target2 */
@@ -81,7 +81,7 @@ static notrace void fp_exit_handler(struct fprobe *fp, unsigned long ip,
 
 static notrace int nest_entry_handler(struct fprobe *fp, unsigned long ip,
 				      unsigned long ret_ip,
-				      struct pt_regs *regs, void *data)
+				      struct ftrace_regs *fregs, void *data)
 {
 	KUNIT_EXPECT_FALSE(current_test, preemptible());
 	return 0;
diff --git a/samples/fprobe/fprobe_example.c b/samples/fprobe/fprobe_example.c
index 64e715e7ed11..1545a1aac616 100644
--- a/samples/fprobe/fprobe_example.c
+++ b/samples/fprobe/fprobe_example.c
@@ -50,7 +50,7 @@ static void show_backtrace(void)
 
 static int sample_entry_handler(struct fprobe *fp, unsigned long ip,
 				unsigned long ret_ip,
-				struct pt_regs *regs, void *data)
+				struct ftrace_regs *fregs, void *data)
 {
 	if (use_trace)
 		/*


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v4 3/9] tracing: Expose ftrace_regs regardless of CONFIG_FUNCTION_TRACER
  2023-08-23 15:15 [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs Masami Hiramatsu (Google)
  2023-08-23 15:15 ` [PATCH v4 1/9] Documentation: probes: Add a new ret_ip callback parameter Masami Hiramatsu (Google)
  2023-08-23 15:15 ` [PATCH v4 2/9] fprobe: Use fprobe_regs in fprobe entry handler Masami Hiramatsu (Google)
@ 2023-08-23 15:16 ` Masami Hiramatsu (Google)
  2023-08-23 15:16 ` [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook Masami Hiramatsu (Google)
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 32+ messages in thread
From: Masami Hiramatsu (Google) @ 2023-08-23 15:16 UTC (permalink / raw)
  To: Alexei Starovoitov, Steven Rostedt, Florent Revest
  Cc: linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

In order to be able to use ftrace_regs even from features unrelated to
function tracer (e.g. kretprobe), expose ftrace_regs structures and
APIs even if the CONFIG_FUNCTION_TRACER=n.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Florent Revest <revest@chromium.org>
---
 Changes in v3:
  - arch/s390/include/asm/ftrace.h: hide ftrace_regs parts if
    CONFIG_FUNCTION_TRACER=n to avoid conflict.
  - Fixed typo.
  - Define accessor macros only if HAVE_REGS_AND_STACK_ACCESS_API=y
---
 arch/s390/include/asm/ftrace.h |    4 +++
 include/linux/ftrace.h         |   51 +++++++++++++++++++++++-----------------
 2 files changed, 33 insertions(+), 22 deletions(-)

diff --git a/arch/s390/include/asm/ftrace.h b/arch/s390/include/asm/ftrace.h
index e5c5cb1207e2..74cd2e8bf660 100644
--- a/arch/s390/include/asm/ftrace.h
+++ b/arch/s390/include/asm/ftrace.h
@@ -41,6 +41,8 @@ static inline unsigned long ftrace_call_adjust(unsigned long addr)
 	return addr;
 }
 
+#ifdef CONFIG_FUNCTION_TRACER
+
 struct ftrace_regs {
 	struct pt_regs regs;
 };
@@ -80,6 +82,8 @@ ftrace_regs_set_instruction_pointer(struct ftrace_regs *fregs,
 #define ftrace_regs_query_register_offset(name) \
 	regs_query_register_offset(name)
 
+#endif /* CONFIG_FUNCTION_TRACER */
+
 #ifdef CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 /*
  * When an ftrace registered caller is tracing a function that is
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index aad9cf8876b5..fe335d861f08 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -112,24 +112,44 @@ static inline int ftrace_mod_get_kallsym(unsigned int symnum, unsigned long *val
 }
 #endif
 
-#ifdef CONFIG_FUNCTION_TRACER
-
-extern int ftrace_enabled;
-
-#ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
+/*
+ * If the architecture doesn't support FTRACE_WITH_ARGS or disables function
+ * tracer, define the default(pt_regs compatible) ftrace_regs.
+ */
+#if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) || !defined(CONFIG_FUNCTION_TRACER)
 
 struct ftrace_regs {
 	struct pt_regs		regs;
 };
 #define arch_ftrace_get_regs(fregs) (&(fregs)->regs)
 
+#ifdef CONFIG_HAVE_REGS_AND_STACK_ACCESS_API
+
 /*
  * ftrace_regs_set_instruction_pointer() is to be defined by the architecture
  * if to allow setting of the instruction pointer from the ftrace_regs when
  * HAVE_DYNAMIC_FTRACE_WITH_ARGS is set and it supports live kernel patching.
  */
 #define ftrace_regs_set_instruction_pointer(fregs, ip) do { } while (0)
-#endif /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS */
+
+#define ftrace_regs_get_instruction_pointer(fregs) \
+	instruction_pointer(ftrace_get_regs(fregs))
+#define ftrace_regs_get_argument(fregs, n) \
+	regs_get_kernel_argument(ftrace_get_regs(fregs), n)
+#define ftrace_regs_get_stack_pointer(fregs) \
+	kernel_stack_pointer(ftrace_get_regs(fregs))
+#define ftrace_regs_return_value(fregs) \
+	regs_return_value(ftrace_get_regs(fregs))
+#define ftrace_regs_set_return_value(fregs, ret) \
+	regs_set_return_value(ftrace_get_regs(fregs), ret)
+#define ftrace_override_function_with_return(fregs) \
+	override_function_with_return(ftrace_get_regs(fregs))
+#define ftrace_regs_query_register_offset(name) \
+	regs_query_register_offset(name)
+
+#endif /* CONFIG_HAVE_REGS_AND_STACK_ACCESS_API */
+
+#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || !CONFIG_FUNCTION_TRACER */
 
 static __always_inline struct pt_regs *ftrace_get_regs(struct ftrace_regs *fregs)
 {
@@ -151,22 +171,9 @@ static __always_inline bool ftrace_regs_has_args(struct ftrace_regs *fregs)
 	return ftrace_get_regs(fregs) != NULL;
 }
 
-#ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
-#define ftrace_regs_get_instruction_pointer(fregs) \
-	instruction_pointer(ftrace_get_regs(fregs))
-#define ftrace_regs_get_argument(fregs, n) \
-	regs_get_kernel_argument(ftrace_get_regs(fregs), n)
-#define ftrace_regs_get_stack_pointer(fregs) \
-	kernel_stack_pointer(ftrace_get_regs(fregs))
-#define ftrace_regs_return_value(fregs) \
-	regs_return_value(ftrace_get_regs(fregs))
-#define ftrace_regs_set_return_value(fregs, ret) \
-	regs_set_return_value(ftrace_get_regs(fregs), ret)
-#define ftrace_override_function_with_return(fregs) \
-	override_function_with_return(ftrace_get_regs(fregs))
-#define ftrace_regs_query_register_offset(name) \
-	regs_query_register_offset(name)
-#endif
+#ifdef CONFIG_FUNCTION_TRACER
+
+extern int ftrace_enabled;
 
 typedef void (*ftrace_func_t)(unsigned long ip, unsigned long parent_ip,
 			      struct ftrace_ops *op, struct ftrace_regs *fregs);


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook
  2023-08-23 15:15 [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs Masami Hiramatsu (Google)
                   ` (2 preceding siblings ...)
  2023-08-23 15:16 ` [PATCH v4 3/9] tracing: Expose ftrace_regs regardless of CONFIG_FUNCTION_TRACER Masami Hiramatsu (Google)
@ 2023-08-23 15:16 ` Masami Hiramatsu (Google)
  2023-08-25 16:12   ` Florent Revest
  2023-09-04 13:40   ` Masami Hiramatsu
  2023-08-23 15:16 ` [PATCH v4 5/9] ftrace: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs Masami Hiramatsu (Google)
                   ` (5 subsequent siblings)
  9 siblings, 2 replies; 32+ messages in thread
From: Masami Hiramatsu (Google) @ 2023-08-23 15:16 UTC (permalink / raw)
  To: Alexei Starovoitov, Steven Rostedt, Florent Revest
  Cc: linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Change the fprobe exit handler and rethook to use ftrace_regs data structure
instead of pt_regs. This also introduce HAVE_PT_REGS_TO_FTRACE_REGS_CAST
which means the ftrace_regs's memory layout is equal to the pt_regs so
that those are able to cast. Only if it is enabled, kretprobe will use
rethook since kretprobe requires pt_regs for backward compatibility.

This means the archs which currently implement rethook for kretprobes needs to
set that flag and it must ensure struct ftrace_regs is same as pt_regs.
If not, it must be either disabling kretprobe or implementing kretprobe
trampoline separately from rethook trampoline.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 Changes in v3:
  - Config rename to HAVE_PT_REGS_TO_FTRACE_REGS_CAST
  - Use ftrace_regs_* APIs instead of ftrace_get_regs().
 Changes in v4:
  - Add static_assert() to ensure at least the size of pt_regs
    and ftrace_regs are same if HAVE_PT_REGS_TO_FTRACE_REGS_CAST=y.
---
 arch/Kconfig                    |    1 +
 arch/loongarch/Kconfig          |    1 +
 arch/s390/Kconfig               |    1 +
 arch/x86/Kconfig                |    1 +
 arch/x86/kernel/rethook.c       |   13 +++++++------
 include/linux/fprobe.h          |    2 +-
 include/linux/ftrace.h          |    6 ++++++
 include/linux/rethook.h         |   11 ++++++-----
 kernel/kprobes.c                |   10 ++++++++--
 kernel/trace/Kconfig            |    7 +++++++
 kernel/trace/bpf_trace.c        |    6 +++++-
 kernel/trace/fprobe.c           |    6 +++---
 kernel/trace/rethook.c          |   16 ++++++++--------
 kernel/trace/trace_fprobe.c     |    6 +++++-
 lib/test_fprobe.c               |    6 +++---
 samples/fprobe/fprobe_example.c |    2 +-
 16 files changed, 64 insertions(+), 31 deletions(-)

diff --git a/arch/Kconfig b/arch/Kconfig
index aff2746c8af2..e41a270c30bb 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -201,6 +201,7 @@ config KRETPROBE_ON_RETHOOK
 	def_bool y
 	depends on HAVE_RETHOOK
 	depends on KRETPROBES
+	depends on HAVE_PT_REGS_TO_FTRACE_REGS_CAST || !HAVE_DYNAMIC_FTRACE_WITH_ARGS
 	select RETHOOK
 
 config USER_RETURN_NOTIFIER
diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
index e71d5bf2cee0..33c3a4598ae0 100644
--- a/arch/loongarch/Kconfig
+++ b/arch/loongarch/Kconfig
@@ -103,6 +103,7 @@ config LOONGARCH
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_ARGS
+	select HAVE_PT_REGS_TO_FTRACE_REGS_CAST
 	select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
 	select HAVE_EBPF_JIT
diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
index 5b39918b7042..ef06c3c2b06d 100644
--- a/arch/s390/Kconfig
+++ b/arch/s390/Kconfig
@@ -165,6 +165,7 @@ config S390
 	select HAVE_DMA_CONTIGUOUS
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_ARGS
+	select HAVE_PT_REGS_TO_FTRACE_REGS_CAST
 	select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
 	select HAVE_EBPF_JIT if HAVE_MARCH_Z196_FEATURES
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e36261b4ea14..7c1f3194e209 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -207,6 +207,7 @@ config X86
 	select HAVE_DYNAMIC_FTRACE
 	select HAVE_DYNAMIC_FTRACE_WITH_REGS
 	select HAVE_DYNAMIC_FTRACE_WITH_ARGS	if X86_64
+	select HAVE_PT_REGS_TO_FTRACE_REGS_CAST	if X86_64
 	select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
 	select HAVE_SAMPLE_FTRACE_DIRECT	if X86_64
 	select HAVE_SAMPLE_FTRACE_DIRECT_MULTI	if X86_64
diff --git a/arch/x86/kernel/rethook.c b/arch/x86/kernel/rethook.c
index 8a1c0111ae79..d714d0276c93 100644
--- a/arch/x86/kernel/rethook.c
+++ b/arch/x86/kernel/rethook.c
@@ -83,7 +83,8 @@ __used __visible void arch_rethook_trampoline_callback(struct pt_regs *regs)
 	 * arch_rethook_fixup_return() which called from this
 	 * rethook_trampoline_handler().
 	 */
-	rethook_trampoline_handler(regs, (unsigned long)frame_pointer);
+	rethook_trampoline_handler((struct ftrace_regs *)regs,
+				   (unsigned long)frame_pointer);
 
 	/*
 	 * Copy FLAGS to 'pt_regs::ss' so that arch_rethook_trapmoline()
@@ -104,22 +105,22 @@ NOKPROBE_SYMBOL(arch_rethook_trampoline_callback);
 STACK_FRAME_NON_STANDARD_FP(arch_rethook_trampoline);
 
 /* This is called from rethook_trampoline_handler(). */
-void arch_rethook_fixup_return(struct pt_regs *regs,
+void arch_rethook_fixup_return(struct ftrace_regs *fregs,
 			       unsigned long correct_ret_addr)
 {
-	unsigned long *frame_pointer = (void *)(regs + 1);
+	unsigned long *frame_pointer = (void *)(fregs + 1);
 
 	/* Replace fake return address with real one. */
 	*frame_pointer = correct_ret_addr;
 }
 NOKPROBE_SYMBOL(arch_rethook_fixup_return);
 
-void arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mcount)
+void arch_rethook_prepare(struct rethook_node *rh, struct ftrace_regs *fregs, bool mcount)
 {
-	unsigned long *stack = (unsigned long *)regs->sp;
+	unsigned long *stack = (unsigned long *)ftrace_regs_get_stack_pointer(fregs);
 
 	rh->ret_addr = stack[0];
-	rh->frame = regs->sp;
+	rh->frame = (unsigned long)stack;
 
 	/* Replace the return addr with trampoline addr */
 	stack[0] = (unsigned long) arch_rethook_trampoline;
diff --git a/include/linux/fprobe.h b/include/linux/fprobe.h
index 36c0595f7b93..b9c0c216dedb 100644
--- a/include/linux/fprobe.h
+++ b/include/linux/fprobe.h
@@ -38,7 +38,7 @@ struct fprobe {
 			     unsigned long ret_ip, struct ftrace_regs *regs,
 			     void *entry_data);
 	void (*exit_handler)(struct fprobe *fp, unsigned long entry_ip,
-			     unsigned long ret_ip, struct pt_regs *regs,
+			     unsigned long ret_ip, struct ftrace_regs *regs,
 			     void *entry_data);
 };
 
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index fe335d861f08..c0a42d0860b8 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -151,6 +151,12 @@ struct ftrace_regs {
 
 #endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || !CONFIG_FUNCTION_TRACER */
 
+#ifdef CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST
+
+static_assert(sizeof(struct pt_regs) == sizeof(struct ftrace_regs));
+
+#endif /* CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
+
 static __always_inline struct pt_regs *ftrace_get_regs(struct ftrace_regs *fregs)
 {
 	if (!fregs)
diff --git a/include/linux/rethook.h b/include/linux/rethook.h
index 26b6f3c81a76..138d64c8b67b 100644
--- a/include/linux/rethook.h
+++ b/include/linux/rethook.h
@@ -7,6 +7,7 @@
 
 #include <linux/compiler.h>
 #include <linux/freelist.h>
+#include <linux/ftrace.h>
 #include <linux/kallsyms.h>
 #include <linux/llist.h>
 #include <linux/rcupdate.h>
@@ -14,7 +15,7 @@
 
 struct rethook_node;
 
-typedef void (*rethook_handler_t) (struct rethook_node *, void *, unsigned long, struct pt_regs *);
+typedef void (*rethook_handler_t) (struct rethook_node *, void *, unsigned long, struct ftrace_regs *);
 
 /**
  * struct rethook - The rethook management data structure.
@@ -64,12 +65,12 @@ void rethook_free(struct rethook *rh);
 void rethook_add_node(struct rethook *rh, struct rethook_node *node);
 struct rethook_node *rethook_try_get(struct rethook *rh);
 void rethook_recycle(struct rethook_node *node);
-void rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount);
+void rethook_hook(struct rethook_node *node, struct ftrace_regs *regs, bool mcount);
 unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame,
 				    struct llist_node **cur);
 
 /* Arch dependent code must implement arch_* and trampoline code */
-void arch_rethook_prepare(struct rethook_node *node, struct pt_regs *regs, bool mcount);
+void arch_rethook_prepare(struct rethook_node *node, struct ftrace_regs *regs, bool mcount);
 void arch_rethook_trampoline(void);
 
 /**
@@ -84,11 +85,11 @@ static inline bool is_rethook_trampoline(unsigned long addr)
 }
 
 /* If the architecture needs to fixup the return address, implement it. */
-void arch_rethook_fixup_return(struct pt_regs *regs,
+void arch_rethook_fixup_return(struct ftrace_regs *regs,
 			       unsigned long correct_ret_addr);
 
 /* Generic trampoline handler, arch code must prepare asm stub */
-unsigned long rethook_trampoline_handler(struct pt_regs *regs,
+unsigned long rethook_trampoline_handler(struct ftrace_regs *regs,
 					 unsigned long frame);
 
 #ifdef CONFIG_RETHOOK
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 0c6185aefaef..821dff656149 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -2132,7 +2132,11 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
 	if (rp->entry_handler && rp->entry_handler(ri, regs))
 		rethook_recycle(rhn);
 	else
-		rethook_hook(rhn, regs, kprobe_ftrace(p));
+		/*
+		 * We can cast pt_regs to ftrace_regs because this depends on
+		 * HAVE_PT_REGS_TO_FTRACE_REGS_CAST.
+		 */
+		rethook_hook(rhn, (struct ftrace_regs *)regs, kprobe_ftrace(p));
 
 	return 0;
 }
@@ -2140,9 +2144,11 @@ NOKPROBE_SYMBOL(pre_handler_kretprobe);
 
 static void kretprobe_rethook_handler(struct rethook_node *rh, void *data,
 				      unsigned long ret_addr,
-				      struct pt_regs *regs)
+				      struct ftrace_regs *fregs)
 {
 	struct kretprobe *rp = (struct kretprobe *)data;
+	/* Ditto, this depends on HAVE_PT_REGS_TO_FTRACE_REGS_CAST. */
+	struct pt_regs *regs = (struct pt_regs *)fregs;
 	struct kretprobe_instance *ri;
 	struct kprobe_ctlblk *kcb;
 
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index 976fd594b446..d56304276318 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -57,6 +57,13 @@ config HAVE_DYNAMIC_FTRACE_WITH_ARGS
 	 This allows for use of ftrace_regs_get_argument() and
 	 ftrace_regs_get_stack_pointer().
 
+config HAVE_PT_REGS_TO_FTRACE_REGS_CAST
+	bool
+	help
+	 If this is set, the memory layout of the ftrace_regs data structure
+	 is the same as the pt_regs. So the pt_regs is possible to be casted
+	 to ftrace_regs.
+
 config HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
 	bool
 	help
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index 22d00c817f1a..c4d57c7cdc7c 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -2675,10 +2675,14 @@ kprobe_multi_link_handler(struct fprobe *fp, unsigned long fentry_ip,
 
 static void
 kprobe_multi_link_exit_handler(struct fprobe *fp, unsigned long fentry_ip,
-			       unsigned long ret_ip, struct pt_regs *regs,
+			       unsigned long ret_ip, struct ftrace_regs *fregs,
 			       void *data)
 {
 	struct bpf_kprobe_multi_link *link;
+	struct pt_regs *regs = ftrace_get_regs(fregs);
+
+	if (!regs)
+		return;
 
 	link = container_of(fp, struct bpf_kprobe_multi_link, fp);
 	kprobe_multi_link_prog_run(link, get_entry_ip(fentry_ip), regs);
diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
index 07deb52df44a..dfddc7e8424e 100644
--- a/kernel/trace/fprobe.c
+++ b/kernel/trace/fprobe.c
@@ -53,7 +53,7 @@ static inline void __fprobe_handler(unsigned long ip, unsigned long parent_ip,
 		if (ret)
 			rethook_recycle(rh);
 		else
-			rethook_hook(rh, ftrace_get_regs(fregs), true);
+			rethook_hook(rh, fregs, true);
 	}
 }
 
@@ -120,7 +120,7 @@ static void fprobe_kprobe_handler(unsigned long ip, unsigned long parent_ip,
 }
 
 static void fprobe_exit_handler(struct rethook_node *rh, void *data,
-				unsigned long ret_ip, struct pt_regs *regs)
+				unsigned long ret_ip, struct ftrace_regs *fregs)
 {
 	struct fprobe *fp = (struct fprobe *)data;
 	struct fprobe_rethook_node *fpr;
@@ -141,7 +141,7 @@ static void fprobe_exit_handler(struct rethook_node *rh, void *data,
 		return;
 	}
 
-	fp->exit_handler(fp, fpr->entry_ip, ret_ip, regs,
+	fp->exit_handler(fp, fpr->entry_ip, ret_ip, fregs,
 			 fp->entry_data_size ? (void *)fpr->data : NULL);
 	ftrace_test_recursion_unlock(bit);
 }
diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
index 5eb9b598f4e9..7c5cf9d5910c 100644
--- a/kernel/trace/rethook.c
+++ b/kernel/trace/rethook.c
@@ -189,7 +189,7 @@ NOKPROBE_SYMBOL(rethook_try_get);
 /**
  * rethook_hook() - Hook the current function return.
  * @node: The struct rethook node to hook the function return.
- * @regs: The struct pt_regs for the function entry.
+ * @fregs: The struct ftrace_regs for the function entry.
  * @mcount: True if this is called from mcount(ftrace) context.
  *
  * Hook the current running function return. This must be called when the
@@ -199,9 +199,9 @@ NOKPROBE_SYMBOL(rethook_try_get);
  * from the real function entry (e.g. kprobes) @mcount must be set false.
  * This is because the way to hook the function return depends on the context.
  */
-void rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount)
+void rethook_hook(struct rethook_node *node, struct ftrace_regs *fregs, bool mcount)
 {
-	arch_rethook_prepare(node, regs, mcount);
+	arch_rethook_prepare(node, fregs, mcount);
 	__llist_add(&node->llist, &current->rethooks);
 }
 NOKPROBE_SYMBOL(rethook_hook);
@@ -269,7 +269,7 @@ unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame
 }
 NOKPROBE_SYMBOL(rethook_find_ret_addr);
 
-void __weak arch_rethook_fixup_return(struct pt_regs *regs,
+void __weak arch_rethook_fixup_return(struct ftrace_regs *fregs,
 				      unsigned long correct_ret_addr)
 {
 	/*
@@ -281,7 +281,7 @@ void __weak arch_rethook_fixup_return(struct pt_regs *regs,
 }
 
 /* This function will be called from each arch-defined trampoline. */
-unsigned long rethook_trampoline_handler(struct pt_regs *regs,
+unsigned long rethook_trampoline_handler(struct ftrace_regs *fregs,
 					 unsigned long frame)
 {
 	struct llist_node *first, *node = NULL;
@@ -295,7 +295,7 @@ unsigned long rethook_trampoline_handler(struct pt_regs *regs,
 		BUG_ON(1);
 	}
 
-	instruction_pointer_set(regs, correct_ret_addr);
+	ftrace_regs_set_instruction_pointer(fregs, correct_ret_addr);
 
 	/*
 	 * These loops must be protected from rethook_free_rcu() because those
@@ -315,7 +315,7 @@ unsigned long rethook_trampoline_handler(struct pt_regs *regs,
 		handler = READ_ONCE(rhn->rethook->handler);
 		if (handler)
 			handler(rhn, rhn->rethook->data,
-				correct_ret_addr, regs);
+				correct_ret_addr, fregs);
 
 		if (first == node)
 			break;
@@ -323,7 +323,7 @@ unsigned long rethook_trampoline_handler(struct pt_regs *regs,
 	}
 
 	/* Fixup registers for returning to correct address. */
-	arch_rethook_fixup_return(regs, correct_ret_addr);
+	arch_rethook_fixup_return(fregs, correct_ret_addr);
 
 	/* Unlink used shadow stack */
 	first = current->rethooks.first;
diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
index 71bf38d698f1..c60d0d9f1a95 100644
--- a/kernel/trace/trace_fprobe.c
+++ b/kernel/trace/trace_fprobe.c
@@ -341,10 +341,14 @@ static int fentry_dispatcher(struct fprobe *fp, unsigned long entry_ip,
 NOKPROBE_SYMBOL(fentry_dispatcher);
 
 static void fexit_dispatcher(struct fprobe *fp, unsigned long entry_ip,
-			     unsigned long ret_ip, struct pt_regs *regs,
+			     unsigned long ret_ip, struct ftrace_regs *fregs,
 			     void *entry_data)
 {
 	struct trace_fprobe *tf = container_of(fp, struct trace_fprobe, fp);
+	struct pt_regs *regs = ftrace_get_regs(fregs);
+
+	if (!regs)
+		return;
 
 	if (trace_probe_test_flag(&tf->tp, TP_FLAG_TRACE))
 		fexit_trace_func(tf, entry_ip, ret_ip, regs);
diff --git a/lib/test_fprobe.c b/lib/test_fprobe.c
index ff607babba18..d1e80653bf0c 100644
--- a/lib/test_fprobe.c
+++ b/lib/test_fprobe.c
@@ -59,9 +59,9 @@ static notrace int fp_entry_handler(struct fprobe *fp, unsigned long ip,
 
 static notrace void fp_exit_handler(struct fprobe *fp, unsigned long ip,
 				    unsigned long ret_ip,
-				    struct pt_regs *regs, void *data)
+				    struct ftrace_regs *fregs, void *data)
 {
-	unsigned long ret = regs_return_value(regs);
+	unsigned long ret = ftrace_regs_return_value(fregs);
 
 	KUNIT_EXPECT_FALSE(current_test, preemptible());
 	if (ip != target_ip) {
@@ -89,7 +89,7 @@ static notrace int nest_entry_handler(struct fprobe *fp, unsigned long ip,
 
 static notrace void nest_exit_handler(struct fprobe *fp, unsigned long ip,
 				      unsigned long ret_ip,
-				      struct pt_regs *regs, void *data)
+				      struct ftrace_regs *fregs, void *data)
 {
 	KUNIT_EXPECT_FALSE(current_test, preemptible());
 	KUNIT_EXPECT_EQ(current_test, ip, target_nest_ip);
diff --git a/samples/fprobe/fprobe_example.c b/samples/fprobe/fprobe_example.c
index 1545a1aac616..d476d1f07538 100644
--- a/samples/fprobe/fprobe_example.c
+++ b/samples/fprobe/fprobe_example.c
@@ -67,7 +67,7 @@ static int sample_entry_handler(struct fprobe *fp, unsigned long ip,
 }
 
 static void sample_exit_handler(struct fprobe *fp, unsigned long ip,
-				unsigned long ret_ip, struct pt_regs *regs,
+				unsigned long ret_ip, struct ftrace_regs *regs,
 				void *data)
 {
 	unsigned long rip = ret_ip;


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v4 5/9] ftrace: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
  2023-08-23 15:15 [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs Masami Hiramatsu (Google)
                   ` (3 preceding siblings ...)
  2023-08-23 15:16 ` [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook Masami Hiramatsu (Google)
@ 2023-08-23 15:16 ` Masami Hiramatsu (Google)
  2023-08-25 21:49   ` Andrii Nakryiko
  2023-08-23 15:16 ` [PATCH v4 6/9] tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS Masami Hiramatsu (Google)
                   ` (4 subsequent siblings)
  9 siblings, 1 reply; 32+ messages in thread
From: Masami Hiramatsu (Google) @ 2023-08-23 15:16 UTC (permalink / raw)
  To: Alexei Starovoitov, Steven Rostedt, Florent Revest
  Cc: linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Add ftrace_partial_regs() which converts the ftrace_regs to pt_regs.
If the architecture defines its own ftrace_regs, this copies partial
registers to pt_regs and returns it. If not, ftrace_regs is the same as
pt_regs and ftrace_partial_regs() will return ftrace_regs::regs.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Florent Revest <revest@chromium.org>
---
 Changes in v3:
  - Fix to use pt_regs::regs instead of x.
  - Return ftrace_regs::regs forcibly if HAVE_PT_REGS_COMPAT_FTRACE_REGS=y.
  - Fix typo.
  - Fix to copy correct registers to the pt_regs on arm64.
 Changes in v4:
  - Change the patch order in the series so that fprobe event can use this.
---
 arch/arm64/include/asm/ftrace.h |   11 +++++++++++
 include/linux/ftrace.h          |   17 +++++++++++++++++
 2 files changed, 28 insertions(+)

diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
index ab158196480c..5ad24f315d52 100644
--- a/arch/arm64/include/asm/ftrace.h
+++ b/arch/arm64/include/asm/ftrace.h
@@ -137,6 +137,17 @@ ftrace_override_function_with_return(struct ftrace_regs *fregs)
 	fregs->pc = fregs->lr;
 }
 
+static __always_inline struct pt_regs *
+ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
+{
+	memcpy(regs->regs, fregs->regs, sizeof(u64) * 9);
+	regs->sp = fregs->sp;
+	regs->pc = fregs->pc;
+	regs->regs[29] = fregs->fp;
+	regs->regs[30] = fregs->lr;
+	return regs;
+}
+
 int ftrace_regs_query_register_offset(const char *name);
 
 int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index c0a42d0860b8..a6ed2aa71efc 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -165,6 +165,23 @@ static __always_inline struct pt_regs *ftrace_get_regs(struct ftrace_regs *fregs
 	return arch_ftrace_get_regs(fregs);
 }
 
+#if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) || \
+	defined(CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST)
+
+static __always_inline struct pt_regs *
+ftrace_partial_regs(struct ftrace_regs *fregs, struct pt_regs *regs)
+{
+	/*
+	 * If CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST=y, ftrace_regs memory
+	 * layout is the same as pt_regs. So always returns that address.
+	 * Since arch_ftrace_get_regs() will check some members and may return
+	 * NULL, we can not use it.
+	 */
+	return &fregs->regs;
+}
+
+#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
+
 /*
  * When true, the ftrace_regs_{get,set}_*() functions may be used on fregs.
  * Note: this can be true even when ftrace_get_regs() cannot provide a pt_regs.


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v4 6/9] tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
  2023-08-23 15:15 [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs Masami Hiramatsu (Google)
                   ` (4 preceding siblings ...)
  2023-08-23 15:16 ` [PATCH v4 5/9] ftrace: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs Masami Hiramatsu (Google)
@ 2023-08-23 15:16 ` Masami Hiramatsu (Google)
  2023-08-25 16:12   ` Florent Revest
  2023-08-30  7:20   ` Masami Hiramatsu
  2023-08-23 15:16 ` [PATCH v4 7/9] bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled Masami Hiramatsu (Google)
                   ` (3 subsequent siblings)
  9 siblings, 2 replies; 32+ messages in thread
From: Masami Hiramatsu (Google) @ 2023-08-23 15:16 UTC (permalink / raw)
  To: Alexei Starovoitov, Steven Rostedt, Florent Revest
  Cc: linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Allow fprobe events to be enabled with CONFIG_DYNAMIC_FTRACE_WITH_ARGS.
With this change, fprobe events mostly use ftrace_regs instead of pt_regs.
Note that if the arch doesn't enable HAVE_PT_REGS_COMPAT_FTRACE_REGS,
fprobe events will not be able to be used from perf.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 Changes in v3:
   - introduce ftrace_regs_get_kernel_stack_nth().
   - fix typo.
 Changes in v4:
   - Use per-cpu pt_regs stack and ftrace_partial_regs() for perf event.
---
 include/linux/ftrace.h          |   15 ++++
 kernel/trace/Kconfig            |    1 
 kernel/trace/trace_fprobe.c     |  135 ++++++++++++++++++++++++++++-----------
 kernel/trace/trace_probe_tmpl.h |    2 -
 4 files changed, 112 insertions(+), 41 deletions(-)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index a6ed2aa71efc..fb0f87d19d35 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -194,6 +194,21 @@ static __always_inline bool ftrace_regs_has_args(struct ftrace_regs *fregs)
 	return ftrace_get_regs(fregs) != NULL;
 }
 
+#ifdef CONFIG_HAVE_REGS_AND_STACK_ACCESS_API
+static __always_inline unsigned long
+ftrace_regs_get_kernel_stack_nth(struct ftrace_regs *fregs, unsigned int nth)
+{
+	unsigned long *stackp;
+
+	stackp = (unsigned long *)ftrace_regs_get_stack_pointer(fregs);
+	if (((unsigned long)(stackp + nth) & ~(THREAD_SIZE - 1)) ==
+	    ((unsigned long)stackp & ~(THREAD_SIZE - 1)))
+		return *(stackp + nth);
+
+	return 0;
+}
+#endif /* CONFIG_HAVE_REGS_AND_STACK_ACCESS_API */
+
 #ifdef CONFIG_FUNCTION_TRACER
 
 extern int ftrace_enabled;
diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
index d56304276318..6fb4ecf8767d 100644
--- a/kernel/trace/Kconfig
+++ b/kernel/trace/Kconfig
@@ -679,7 +679,6 @@ config FPROBE_EVENTS
 	select TRACING
 	select PROBE_EVENTS
 	select DYNAMIC_EVENTS
-	depends on DYNAMIC_FTRACE_WITH_REGS
 	default y
 	help
 	  This allows user to add tracing events on the function entry and
diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
index c60d0d9f1a95..90ad28260a9f 100644
--- a/kernel/trace/trace_fprobe.c
+++ b/kernel/trace/trace_fprobe.c
@@ -132,7 +132,7 @@ static int
 process_fetch_insn(struct fetch_insn *code, void *rec, void *dest,
 		   void *base)
 {
-	struct pt_regs *regs = rec;
+	struct ftrace_regs *fregs = rec;
 	unsigned long val;
 	int ret;
 
@@ -140,17 +140,17 @@ process_fetch_insn(struct fetch_insn *code, void *rec, void *dest,
 	/* 1st stage: get value from context */
 	switch (code->op) {
 	case FETCH_OP_STACK:
-		val = regs_get_kernel_stack_nth(regs, code->param);
+		val = ftrace_regs_get_kernel_stack_nth(fregs, code->param);
 		break;
 	case FETCH_OP_STACKP:
-		val = kernel_stack_pointer(regs);
+		val = ftrace_regs_get_stack_pointer(fregs);
 		break;
 	case FETCH_OP_RETVAL:
-		val = regs_return_value(regs);
+		val = ftrace_regs_return_value(fregs);
 		break;
 #ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API
 	case FETCH_OP_ARG:
-		val = regs_get_kernel_argument(regs, code->param);
+		val = ftrace_regs_get_argument(fregs, code->param);
 		break;
 #endif
 	case FETCH_NOP_SYMBOL:	/* Ignore a place holder */
@@ -170,7 +170,7 @@ NOKPROBE_SYMBOL(process_fetch_insn)
 /* function entry handler */
 static nokprobe_inline void
 __fentry_trace_func(struct trace_fprobe *tf, unsigned long entry_ip,
-		    struct pt_regs *regs,
+		    struct ftrace_regs *fregs,
 		    struct trace_event_file *trace_file)
 {
 	struct fentry_trace_entry_head *entry;
@@ -184,36 +184,36 @@ __fentry_trace_func(struct trace_fprobe *tf, unsigned long entry_ip,
 	if (trace_trigger_soft_disabled(trace_file))
 		return;
 
-	dsize = __get_data_size(&tf->tp, regs);
+	dsize = __get_data_size(&tf->tp, fregs);
 
 	entry = trace_event_buffer_reserve(&fbuffer, trace_file,
 					   sizeof(*entry) + tf->tp.size + dsize);
 	if (!entry)
 		return;
 
-	fbuffer.regs = regs;
+	fbuffer.regs = ftrace_get_regs(fregs);
 	entry = fbuffer.entry = ring_buffer_event_data(fbuffer.event);
 	entry->ip = entry_ip;
-	store_trace_args(&entry[1], &tf->tp, regs, sizeof(*entry), dsize);
+	store_trace_args(&entry[1], &tf->tp, fregs, sizeof(*entry), dsize);
 
 	trace_event_buffer_commit(&fbuffer);
 }
 
 static void
 fentry_trace_func(struct trace_fprobe *tf, unsigned long entry_ip,
-		  struct pt_regs *regs)
+		  struct ftrace_regs *fregs)
 {
 	struct event_file_link *link;
 
 	trace_probe_for_each_link_rcu(link, &tf->tp)
-		__fentry_trace_func(tf, entry_ip, regs, link->file);
+		__fentry_trace_func(tf, entry_ip, fregs, link->file);
 }
 NOKPROBE_SYMBOL(fentry_trace_func);
 
 /* Kretprobe handler */
 static nokprobe_inline void
 __fexit_trace_func(struct trace_fprobe *tf, unsigned long entry_ip,
-		   unsigned long ret_ip, struct pt_regs *regs,
+		   unsigned long ret_ip, struct ftrace_regs *fregs,
 		   struct trace_event_file *trace_file)
 {
 	struct fexit_trace_entry_head *entry;
@@ -227,94 +227,157 @@ __fexit_trace_func(struct trace_fprobe *tf, unsigned long entry_ip,
 	if (trace_trigger_soft_disabled(trace_file))
 		return;
 
-	dsize = __get_data_size(&tf->tp, regs);
+	dsize = __get_data_size(&tf->tp, fregs);
 
 	entry = trace_event_buffer_reserve(&fbuffer, trace_file,
 					   sizeof(*entry) + tf->tp.size + dsize);
 	if (!entry)
 		return;
 
-	fbuffer.regs = regs;
+	fbuffer.regs = ftrace_get_regs(fregs);
 	entry = fbuffer.entry = ring_buffer_event_data(fbuffer.event);
 	entry->func = entry_ip;
 	entry->ret_ip = ret_ip;
-	store_trace_args(&entry[1], &tf->tp, regs, sizeof(*entry), dsize);
+	store_trace_args(&entry[1], &tf->tp, fregs, sizeof(*entry), dsize);
 
 	trace_event_buffer_commit(&fbuffer);
 }
 
 static void
 fexit_trace_func(struct trace_fprobe *tf, unsigned long entry_ip,
-		 unsigned long ret_ip, struct pt_regs *regs)
+		 unsigned long ret_ip, struct ftrace_regs *fregs)
 {
 	struct event_file_link *link;
 
 	trace_probe_for_each_link_rcu(link, &tf->tp)
-		__fexit_trace_func(tf, entry_ip, ret_ip, regs, link->file);
+		__fexit_trace_func(tf, entry_ip, ret_ip, fregs, link->file);
 }
 NOKPROBE_SYMBOL(fexit_trace_func);
 
 #ifdef CONFIG_PERF_EVENTS
 
+#if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) || \
+	defined(CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST)
+
+static __always_inline
+struct pt_regs *perf_fprobe_partial_regs(struct ftrace_regs *fregs)
+{
+	/* See include/linux/ftrace.h, this returns &fregs->regs */
+	return ftrace_partial_regs(fregs, NULL);
+}
+
+#define perf_fprobe_return_regs(regs) do {} while (0)
+
+#else /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS && !CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
+
+/* Since fprobe handlers can be nested, pt_regs buffer need to be a stack */
+#define PERF_FPROBE_REGS_MAX	4
+
+struct pt_regs_stack {
+	struct pt_regs regs[PERF_FPROBE_REGS_MAX];
+	int idx;
+};
+
+static DEFINE_PER_CPU(struct pt_regs_stack, perf_fprobe_regs);
+
+static __always_inline
+struct pt_regs *perf_fprobe_partial_regs(struct ftrace_regs *fregs)
+{
+	struct pt_regs_stack *stack = this_cpu_ptr(&perf_fprobe_regs);
+	struct pt_regs *regs;
+
+	if (stack->idx < PERF_FPROBE_REGS_MAX) {
+		regs = stack->regs[stack->idx++];
+		return ftrace_partial_regs(fregs, regs);
+	}
+	return NULL;
+}
+
+static __always_inline void perf_fprobe_return_regs(struct pt_regs *regs)
+{
+	struct pt_regs_stack *stack = this_cpu_ptr(&perf_fprobe_regs);
+
+	if (WARN_ON_ONCE(regs != stack->regs[stack->idx]))
+		return;
+
+	--stack->idx;
+}
+
+#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
+
 static int fentry_perf_func(struct trace_fprobe *tf, unsigned long entry_ip,
-			    struct pt_regs *regs)
+			    struct ftrace_regs *fregs)
 {
 	struct trace_event_call *call = trace_probe_event_call(&tf->tp);
 	struct fentry_trace_entry_head *entry;
 	struct hlist_head *head;
 	int size, __size, dsize;
+	struct pt_regs *regs;
 	int rctx;
 
+	regs = perf_fprobe_partial_regs(fregs);
+	if (!regs)
+		return -EINVAL;
+
 	head = this_cpu_ptr(call->perf_events);
 	if (hlist_empty(head))
-		return 0;
+		goto out;
 
-	dsize = __get_data_size(&tf->tp, regs);
+	dsize = __get_data_size(&tf->tp, fregs);
 	__size = sizeof(*entry) + tf->tp.size + dsize;
 	size = ALIGN(__size + sizeof(u32), sizeof(u64));
 	size -= sizeof(u32);
 
 	entry = perf_trace_buf_alloc(size, NULL, &rctx);
 	if (!entry)
-		return 0;
+		goto out;
 
 	entry->ip = entry_ip;
 	memset(&entry[1], 0, dsize);
-	store_trace_args(&entry[1], &tf->tp, regs, sizeof(*entry), dsize);
+	store_trace_args(&entry[1], &tf->tp, fregs, sizeof(*entry), dsize);
 	perf_trace_buf_submit(entry, size, rctx, call->event.type, 1, regs,
 			      head, NULL);
+out:
+	perf_fprobe_return_regs(regs);
 	return 0;
 }
 NOKPROBE_SYMBOL(fentry_perf_func);
 
 static void
 fexit_perf_func(struct trace_fprobe *tf, unsigned long entry_ip,
-		unsigned long ret_ip, struct pt_regs *regs)
+		unsigned long ret_ip, struct ftrace_regs *fregs)
 {
 	struct trace_event_call *call = trace_probe_event_call(&tf->tp);
 	struct fexit_trace_entry_head *entry;
 	struct hlist_head *head;
 	int size, __size, dsize;
+	struct pt_regs *regs;
 	int rctx;
 
+	regs = perf_fprobe_partial_regs(fregs);
+	if (!regs)
+		return;
+
 	head = this_cpu_ptr(call->perf_events);
 	if (hlist_empty(head))
-		return;
+		goto out;
 
-	dsize = __get_data_size(&tf->tp, regs);
+	dsize = __get_data_size(&tf->tp, fregs);
 	__size = sizeof(*entry) + tf->tp.size + dsize;
 	size = ALIGN(__size + sizeof(u32), sizeof(u64));
 	size -= sizeof(u32);
 
 	entry = perf_trace_buf_alloc(size, NULL, &rctx);
 	if (!entry)
-		return;
+		goto out;
 
 	entry->func = entry_ip;
 	entry->ret_ip = ret_ip;
-	store_trace_args(&entry[1], &tf->tp, regs, sizeof(*entry), dsize);
+	store_trace_args(&entry[1], &tf->tp, fregs, sizeof(*entry), dsize);
 	perf_trace_buf_submit(entry, size, rctx, call->event.type, 1, regs,
 			      head, NULL);
+out:
+	perf_fprobe_return_regs(regs);
 }
 NOKPROBE_SYMBOL(fexit_perf_func);
 #endif	/* CONFIG_PERF_EVENTS */
@@ -324,17 +387,14 @@ static int fentry_dispatcher(struct fprobe *fp, unsigned long entry_ip,
 			     void *entry_data)
 {
 	struct trace_fprobe *tf = container_of(fp, struct trace_fprobe, fp);
-	struct pt_regs *regs = ftrace_get_regs(fregs);
 	int ret = 0;
 
-	if (!regs)
-		return 0;
-
 	if (trace_probe_test_flag(&tf->tp, TP_FLAG_TRACE))
-		fentry_trace_func(tf, entry_ip, regs);
+		fentry_trace_func(tf, entry_ip, fregs);
+
 #ifdef CONFIG_PERF_EVENTS
 	if (trace_probe_test_flag(&tf->tp, TP_FLAG_PROFILE))
-		ret = fentry_perf_func(tf, entry_ip, regs);
+		ret = fentry_perf_func(tf, entry_ip, fregs);
 #endif
 	return ret;
 }
@@ -345,16 +405,13 @@ static void fexit_dispatcher(struct fprobe *fp, unsigned long entry_ip,
 			     void *entry_data)
 {
 	struct trace_fprobe *tf = container_of(fp, struct trace_fprobe, fp);
-	struct pt_regs *regs = ftrace_get_regs(fregs);
-
-	if (!regs)
-		return;
 
 	if (trace_probe_test_flag(&tf->tp, TP_FLAG_TRACE))
-		fexit_trace_func(tf, entry_ip, ret_ip, regs);
+		fexit_trace_func(tf, entry_ip, ret_ip, fregs);
+
 #ifdef CONFIG_PERF_EVENTS
 	if (trace_probe_test_flag(&tf->tp, TP_FLAG_PROFILE))
-		fexit_perf_func(tf, entry_ip, ret_ip, regs);
+		fexit_perf_func(tf, entry_ip, ret_ip, fregs);
 #endif
 }
 NOKPROBE_SYMBOL(fexit_dispatcher);
diff --git a/kernel/trace/trace_probe_tmpl.h b/kernel/trace/trace_probe_tmpl.h
index 3935b347f874..05445a745a07 100644
--- a/kernel/trace/trace_probe_tmpl.h
+++ b/kernel/trace/trace_probe_tmpl.h
@@ -232,7 +232,7 @@ process_fetch_insn_bottom(struct fetch_insn *code, unsigned long val,
 
 /* Sum up total data length for dynamic arrays (strings) */
 static nokprobe_inline int
-__get_data_size(struct trace_probe *tp, struct pt_regs *regs)
+__get_data_size(struct trace_probe *tp, void *regs)
 {
 	struct probe_arg *arg;
 	int i, len, ret = 0;


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v4 7/9] bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled
  2023-08-23 15:15 [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs Masami Hiramatsu (Google)
                   ` (5 preceding siblings ...)
  2023-08-23 15:16 ` [PATCH v4 6/9] tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS Masami Hiramatsu (Google)
@ 2023-08-23 15:16 ` Masami Hiramatsu (Google)
  2023-08-23 15:16 ` [PATCH v4 8/9] Documentations: probes: Update fprobe document to use ftrace_regs Masami Hiramatsu (Google)
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 32+ messages in thread
From: Masami Hiramatsu (Google) @ 2023-08-23 15:16 UTC (permalink / raw)
  To: Alexei Starovoitov, Steven Rostedt, Florent Revest
  Cc: linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Enable kprobe_multi feature if CONFIG_FPROBE is enabled. The pt_regs is
converted from ftrace_regs by ftrace_partial_regs(), thus some registers
may always returns 0. But it should be enough for function entry (access
arguments) and exit (access return value).

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Florent Revest <revest@chromium.org>
---
 kernel/trace/bpf_trace.c |   22 +++++++++-------------
 1 file changed, 9 insertions(+), 13 deletions(-)

diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index c4d57c7cdc7c..ac9bd5e7ec27 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -2467,7 +2467,7 @@ static int __init bpf_event_init(void)
 fs_initcall(bpf_event_init);
 #endif /* CONFIG_MODULES */
 
-#if defined(CONFIG_FPROBE) && defined(CONFIG_DYNAMIC_FTRACE_WITH_REGS)
+#ifdef CONFIG_FPROBE
 struct bpf_kprobe_multi_link {
 	struct bpf_link link;
 	struct fprobe fp;
@@ -2489,6 +2489,8 @@ struct user_syms {
 	char *buf;
 };
 
+static DEFINE_PER_CPU(struct pt_regs, bpf_kprobe_multi_pt_regs);
+
 static int copy_user_syms(struct user_syms *us, unsigned long __user *usyms, u32 cnt)
 {
 	unsigned long __user usymbol;
@@ -2630,13 +2632,14 @@ static u64 bpf_kprobe_multi_entry_ip(struct bpf_run_ctx *ctx)
 
 static int
 kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
-			   unsigned long entry_ip, struct pt_regs *regs)
+			   unsigned long entry_ip, struct ftrace_regs *fregs)
 {
 	struct bpf_kprobe_multi_run_ctx run_ctx = {
 		.link = link,
 		.entry_ip = entry_ip,
 	};
 	struct bpf_run_ctx *old_run_ctx;
+	struct pt_regs *regs;
 	int err;
 
 	if (unlikely(__this_cpu_inc_return(bpf_prog_active) != 1)) {
@@ -2646,6 +2649,7 @@ kprobe_multi_link_prog_run(struct bpf_kprobe_multi_link *link,
 
 	migrate_disable();
 	rcu_read_lock();
+	regs = ftrace_partial_regs(fregs, this_cpu_ptr(&bpf_kprobe_multi_pt_regs));
 	old_run_ctx = bpf_set_run_ctx(&run_ctx.run_ctx);
 	err = bpf_prog_run(link->link.prog, regs);
 	bpf_reset_run_ctx(old_run_ctx);
@@ -2663,13 +2667,9 @@ kprobe_multi_link_handler(struct fprobe *fp, unsigned long fentry_ip,
 			  void *data)
 {
 	struct bpf_kprobe_multi_link *link;
-	struct pt_regs *regs = ftrace_get_regs(fregs);
-
-	if (!regs)
-		return 0;
 
 	link = container_of(fp, struct bpf_kprobe_multi_link, fp);
-	kprobe_multi_link_prog_run(link, get_entry_ip(fentry_ip), regs);
+	kprobe_multi_link_prog_run(link, get_entry_ip(fentry_ip), fregs);
 	return 0;
 }
 
@@ -2679,13 +2679,9 @@ kprobe_multi_link_exit_handler(struct fprobe *fp, unsigned long fentry_ip,
 			       void *data)
 {
 	struct bpf_kprobe_multi_link *link;
-	struct pt_regs *regs = ftrace_get_regs(fregs);
-
-	if (!regs)
-		return;
 
 	link = container_of(fp, struct bpf_kprobe_multi_link, fp);
-	kprobe_multi_link_prog_run(link, get_entry_ip(fentry_ip), regs);
+	kprobe_multi_link_prog_run(link, get_entry_ip(fentry_ip), fregs);
 }
 
 static int symbols_cmp_r(const void *a, const void *b, const void *priv)
@@ -2925,7 +2921,7 @@ int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *pr
 	kvfree(cookies);
 	return err;
 }
-#else /* !CONFIG_FPROBE || !CONFIG_DYNAMIC_FTRACE_WITH_REGS */
+#else /* !CONFIG_FPROBE */
 int bpf_kprobe_multi_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
 {
 	return -EOPNOTSUPP;


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v4 8/9] Documentations: probes: Update fprobe document to use ftrace_regs
  2023-08-23 15:15 [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs Masami Hiramatsu (Google)
                   ` (6 preceding siblings ...)
  2023-08-23 15:16 ` [PATCH v4 7/9] bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled Masami Hiramatsu (Google)
@ 2023-08-23 15:16 ` Masami Hiramatsu (Google)
  2023-08-23 15:17 ` [PATCH v4 9/9] Documentation: tracing: Add a note about argument and retval access Masami Hiramatsu (Google)
  2023-08-25 16:11 ` [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs Florent Revest
  9 siblings, 0 replies; 32+ messages in thread
From: Masami Hiramatsu (Google) @ 2023-08-23 15:16 UTC (permalink / raw)
  To: Alexei Starovoitov, Steven Rostedt, Florent Revest
  Cc: linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Update fprobe document so that the entry/exit handler uses ftrace_regs
instead of pt_regs.

Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
Acked-by: Florent Revest <revest@chromium.org>
---
 Documentation/trace/fprobe.rst |   14 ++++++--------
 1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/Documentation/trace/fprobe.rst b/Documentation/trace/fprobe.rst
index 5851a14eb893..64ef522f7a64 100644
--- a/Documentation/trace/fprobe.rst
+++ b/Documentation/trace/fprobe.rst
@@ -91,9 +91,9 @@ The prototype of the entry/exit callback function are as follows:
 
 .. code-block:: c
 
- int entry_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct pt_regs *regs, void *entry_data);
+ int entry_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct ftrace_regs *fregs, void *entry_data);
 
- void exit_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct pt_regs *regs, void *entry_data);
+ void exit_callback(struct fprobe *fp, unsigned long entry_ip, unsigned long ret_ip, struct ftrace_regs *fregs, void *entry_data);
 
 Note that the @entry_ip is saved at function entry and passed to exit handler.
 If the entry callback function returns !0, the corresponding exit callback will be cancelled.
@@ -112,12 +112,10 @@ If the entry callback function returns !0, the corresponding exit callback will
         This is the return address that the traced function will return to,
         somewhere in the caller. This can be used at both entry and exit.
 
-@regs
-        This is the `pt_regs` data structure at the entry and exit. Note that
-        the instruction pointer of @regs may be different from the @entry_ip
-        in the entry_handler. If you need traced instruction pointer, you need
-        to use @entry_ip. On the other hand, in the exit_handler, the instruction
-        pointer of @regs is set to the currect return address.
+@fregs
+        This is the `ftrace_regs` data structure at the entry and exit. Note that
+        the instruction pointer of @fregs may be incorrect in entry handler and
+        exit handler, so you have to use @entry_ip and @ret_ip instead.
 
 @entry_data
         This is a local storage to share the data between entry and exit handlers.


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* [PATCH v4 9/9] Documentation: tracing: Add a note about argument and retval access
  2023-08-23 15:15 [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs Masami Hiramatsu (Google)
                   ` (7 preceding siblings ...)
  2023-08-23 15:16 ` [PATCH v4 8/9] Documentations: probes: Update fprobe document to use ftrace_regs Masami Hiramatsu (Google)
@ 2023-08-23 15:17 ` Masami Hiramatsu (Google)
  2023-08-25 16:12   ` Florent Revest
  2023-08-25 16:11 ` [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs Florent Revest
  9 siblings, 1 reply; 32+ messages in thread
From: Masami Hiramatsu (Google) @ 2023-08-23 15:17 UTC (permalink / raw)
  To: Alexei Starovoitov, Steven Rostedt, Florent Revest
  Cc: linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

From: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Add a note about the argument and return value accecss will be best
effort. Depending on the type, it will be passed via stack or a
pair of the registers, but $argN and $retval only support the
single register access.

Suggested-by: Alexei Starovoitov <alexei.starovoitov@gmail.com>
Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
---
 Documentation/trace/fprobetrace.rst |    8 ++++++--
 Documentation/trace/kprobetrace.rst |    8 ++++++--
 2 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/Documentation/trace/fprobetrace.rst b/Documentation/trace/fprobetrace.rst
index 8e9bebcf0a2e..e35e6b18df40 100644
--- a/Documentation/trace/fprobetrace.rst
+++ b/Documentation/trace/fprobetrace.rst
@@ -59,8 +59,12 @@ Synopsis of fprobe-events
                   and bitfield are supported.
 
   (\*1) This is available only when BTF is enabled.
-  (\*2) only for the probe on function entry (offs == 0).
-  (\*3) only for return probe.
+  (\*2) only for the probe on function entry (offs == 0). Note, this argument access
+        is best effort, because depending on the argument type, it may be passed on
+        the stack. But this only support the arguments via registers.
+  (\*3) only for return probe. Note that this is also best effort. Depending on the
+        return value type, it might be passed via a pair of registers. But this only
+        accesses one register.
   (\*4) this is useful for fetching a field of data structures.
   (\*5) "u" means user-space dereference.
 
diff --git a/Documentation/trace/kprobetrace.rst b/Documentation/trace/kprobetrace.rst
index 8a2dfee38145..bf9cecb69fc9 100644
--- a/Documentation/trace/kprobetrace.rst
+++ b/Documentation/trace/kprobetrace.rst
@@ -61,8 +61,12 @@ Synopsis of kprobe_events
 		  (x8/x16/x32/x64), "char", "string", "ustring", "symbol", "symstr"
                   and bitfield are supported.
 
-  (\*1) only for the probe on function entry (offs == 0).
-  (\*2) only for return probe.
+  (\*1) only for the probe on function entry (offs == 0). Note, this argument access
+        is best effort, because depending on the argument type, it may be passed on
+        the stack. But this only support the arguments via registers.
+  (\*2) only for return probe. Note that this is also best effort. Depending on the
+        return value type, it might be passed via a pair of registers. But this only
+        accesses one register.
   (\*3) this is useful for fetching a field of data structures.
   (\*4) "u" means user-space dereference. See :ref:`user_mem_access`.
 


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs
  2023-08-23 15:15 [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs Masami Hiramatsu (Google)
                   ` (8 preceding siblings ...)
  2023-08-23 15:17 ` [PATCH v4 9/9] Documentation: tracing: Add a note about argument and retval access Masami Hiramatsu (Google)
@ 2023-08-25 16:11 ` Florent Revest
  9 siblings, 0 replies; 32+ messages in thread
From: Florent Revest @ 2023-08-25 16:11 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, linux-trace-kernel, LKML,
	Martin KaFai Lau, bpf, Sven Schnelle, Alexei Starovoitov,
	Jiri Olsa, Arnaldo Carvalho de Melo, Daniel Borkmann,
	Alan Maguire, Mark Rutland, Peter Zijlstra, Thomas Gleixner

On Wed, Aug 23, 2023 at 5:15 PM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
>
> Hi,
>
> Here is the 4th version of the series to use ftrace_regs instead of pt_regs
> in fprobe.
> The previous version is here;
>
> https://lore.kernel.org/all/169181859570.505132.10136520092011157898.stgit@devnote2/
>
> This version fixes the issues pointed in the previous series; fix document
> description, keep CONFIG_FPROBE dependency for multi-kprobe, add
> static_assert check for ftrace_regs size, reorder the ftrace_partial_regs()
> patch for perf fprobe event support, introduce per-cpu pt_regs stack for
> perf fprobe event and add Florent's Ack (Thanks!). Also this adds a new
> documentation patch to clarify that the $argN and $retval is best effort.
>
>  - Document fix for the current fprobe callback prototype
>  - Simply replace pt_regs in fprobe_entry_handler with ftrace_regs.
>  - Expose ftrace_regs even if CONFIG_FUNCTION_TRACER=n.
>  - Introduce ftrace_partial_regs(). (This changes ARM64 which needs a custom
>    implementation)
>  - Replace pt_regs in rethook and fprobe_exit_handler with ftrace_regs. This
>    introduce a new HAVE_PT_REGS_TO_FTRACE_REGS_CAST which means ftrace_regs is
>    just a wrapper of pt_regs (except for arm64, other architectures do this)
>  - Update fprobe-events to use ftrace_regs natively.
>  - Update bpf multi-kprobe handler use ftrace_partial_regs().
>  - Update document for new fprobe callbacks.
>  - Add notes for the $argN and $retval.
>
> This series can be applied against the probes/core branch on linux-trace tree.
>
> This series can also be found below branch.
>
> https://git.kernel.org/pub/scm/linux/kernel/git/mhiramat/linux.git/log/?h=topic/fprobe-ftrace-regs
>
> Thank you,

FWIW, I verified that I could implement BPF multi_kprobe on arm64 on
top of this series so the BPF multi_kprobe usecase is tested but I
haven't tested the "trace_fprobe" or perf use cases.

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 1/9] Documentation: probes: Add a new ret_ip callback parameter
  2023-08-23 15:15 ` [PATCH v4 1/9] Documentation: probes: Add a new ret_ip callback parameter Masami Hiramatsu (Google)
@ 2023-08-25 16:11   ` Florent Revest
  0 siblings, 0 replies; 32+ messages in thread
From: Florent Revest @ 2023-08-25 16:11 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, linux-trace-kernel, LKML,
	Martin KaFai Lau, bpf, Sven Schnelle, Alexei Starovoitov,
	Jiri Olsa, Arnaldo Carvalho de Melo, Daniel Borkmann,
	Alan Maguire, Mark Rutland, Peter Zijlstra, Thomas Gleixner

On Wed, Aug 23, 2023 at 5:15 PM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
>
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Add a new ret_ip callback parameter description.
>
> Fixes: cb16330d1274 ("fprobe: Pass return address to the handlers")
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Acked-by: Florent Revest <revest@chromium.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 2/9] fprobe: Use fprobe_regs in fprobe entry handler
  2023-08-23 15:15 ` [PATCH v4 2/9] fprobe: Use fprobe_regs in fprobe entry handler Masami Hiramatsu (Google)
@ 2023-08-25 16:11   ` Florent Revest
  0 siblings, 0 replies; 32+ messages in thread
From: Florent Revest @ 2023-08-25 16:11 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, linux-trace-kernel, LKML,
	Martin KaFai Lau, bpf, Sven Schnelle, Alexei Starovoitov,
	Jiri Olsa, Arnaldo Carvalho de Melo, Daniel Borkmann,
	Alan Maguire, Mark Rutland, Peter Zijlstra, Thomas Gleixner

On Wed, Aug 23, 2023 at 5:15 PM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
>
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> This allows fprobes to be available with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
> instead of CONFIG_DYNAMIC_FTRACE_WITH_REGS, then we can enable fprobe
> on arm64.
>
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Acked-by: Florent Revest <revest@chromium.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook
  2023-08-23 15:16 ` [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook Masami Hiramatsu (Google)
@ 2023-08-25 16:12   ` Florent Revest
  2023-09-04 13:40   ` Masami Hiramatsu
  1 sibling, 0 replies; 32+ messages in thread
From: Florent Revest @ 2023-08-25 16:12 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, linux-trace-kernel, LKML,
	Martin KaFai Lau, bpf, Sven Schnelle, Alexei Starovoitov,
	Jiri Olsa, Arnaldo Carvalho de Melo, Daniel Borkmann,
	Alan Maguire, Mark Rutland, Peter Zijlstra, Thomas Gleixner

On Wed, Aug 23, 2023 at 5:16 PM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
>
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Change the fprobe exit handler and rethook to use ftrace_regs data structure
> instead of pt_regs. This also introduce HAVE_PT_REGS_TO_FTRACE_REGS_CAST
> which means the ftrace_regs's memory layout is equal to the pt_regs so
> that those are able to cast. Only if it is enabled, kretprobe will use
> rethook since kretprobe requires pt_regs for backward compatibility.
>
> This means the archs which currently implement rethook for kretprobes needs to
> set that flag and it must ensure struct ftrace_regs is same as pt_regs.
> If not, it must be either disabling kretprobe or implementing kretprobe
> trampoline separately from rethook trampoline.
>
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>

Acked-by: Florent Revest <revest@chromium.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 6/9] tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
  2023-08-23 15:16 ` [PATCH v4 6/9] tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS Masami Hiramatsu (Google)
@ 2023-08-25 16:12   ` Florent Revest
  2023-08-26  3:38     ` Masami Hiramatsu
  2023-08-30  7:20   ` Masami Hiramatsu
  1 sibling, 1 reply; 32+ messages in thread
From: Florent Revest @ 2023-08-25 16:12 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, linux-trace-kernel, LKML,
	Martin KaFai Lau, bpf, Sven Schnelle, Alexei Starovoitov,
	Jiri Olsa, Arnaldo Carvalho de Melo, Daniel Borkmann,
	Alan Maguire, Mark Rutland, Peter Zijlstra, Thomas Gleixner

On Wed, Aug 23, 2023 at 5:16 PM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
>
> diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
> index c60d0d9f1a95..90ad28260a9f 100644
> --- a/kernel/trace/trace_fprobe.c
> +++ b/kernel/trace/trace_fprobe.c
> +#else /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS && !CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
> +
> +/* Since fprobe handlers can be nested, pt_regs buffer need to be a stack */
> +#define PERF_FPROBE_REGS_MAX   4
> +
> +struct pt_regs_stack {
> +       struct pt_regs regs[PERF_FPROBE_REGS_MAX];
> +       int idx;
> +};
> +
> +static DEFINE_PER_CPU(struct pt_regs_stack, perf_fprobe_regs);
> +
> +static __always_inline
> +struct pt_regs *perf_fprobe_partial_regs(struct ftrace_regs *fregs)
> +{
> +       struct pt_regs_stack *stack = this_cpu_ptr(&perf_fprobe_regs);
> +       struct pt_regs *regs;
> +
> +       if (stack->idx < PERF_FPROBE_REGS_MAX) {
> +               regs = stack->regs[stack->idx++];

This is missing an &:
regs = &stack->regs[stack->idx++];

> +               return ftrace_partial_regs(fregs, regs);

I think this is incorrect on arm64 and will likely cause very subtle
failure modes down the line on other architectures too. The problem on
arm64 is that Perf calls "user_mode(regs)" somewhere down the line,
that macro tries to read the "pstate" register, which is not populated
in ftrace_regs, so it's not copied into a "partial" pt_regs either and
Perf can take wrong decisions based on that.

I already mentioned this problem in the past:
- in the third answer block of:
https://lore.kernel.org/all/CABRcYmJjtVq-330ktqTAUiNO1=yG_aHd0xz=c550O5C7QP++UA@mail.gmail.com/
- in the fourth answer block of:
https://lore.kernel.org/all/CABRcYm+esb8J2O1v6=C+h+HSa5NxraPUgo63w7-iZj0CXbpusg@mail.gmail.com/

It is quite possible that other architectures at some point introduce
a light ftrace "args" trampoline that misses one of the registers
expected by Perf because they don't realize that this trampoline calls
fprobe which calls Perf which has specific registers expectations.

We got the green light from Alexei to use ftrace_partial_regs for "BPF
mutli_kprobe" because these BPF programs can gracefully deal with
sparse pt_regs but I think a similar conversation needs to happen with
the Perf folks.

----

On a side-note, a subtle difference between ftrace_partial_regs with
and without HAVE_PT_REGS_TO_FTRACE_REGS_CAST is that one does a copy
and the other does not. If a subsystem receives a partial regs under
HAVE_PT_REGS_TO_FTRACE_REGS_CAST, it can modify register fields and
the modified values will be restored by the ftrace trampoline. Without
HAVE_PT_REGS_TO_FTRACE_REGS_CAST, only the copy will be modified and
ftrace won't restore them. I think the least we can do is to document
thoroughly the guarantees of the ftrace_partial_regs API: users
shouldn't rely on modifying the resulting regs because depending on
the architecture this could do different things. People shouldn't rely
on any register that isn't covered by one of the ftrace_regs_get_*
helpers because it can be unpopulated on some architectures. I believe
this is the case for BPF multi_kprobe but not for Perf.

> +       }
> +       return NULL;
> +}
> +
> +static __always_inline void perf_fprobe_return_regs(struct pt_regs *regs)
> +{
> +       struct pt_regs_stack *stack = this_cpu_ptr(&perf_fprobe_regs);
> +
> +       if (WARN_ON_ONCE(regs != stack->regs[stack->idx]))

This is missing an & too:
if (WARN_ON_ONCE(regs != &stack->regs[stack->idx]))




> +               return;
> +
> +       --stack->idx;
> +}
> +
> +#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 9/9] Documentation: tracing: Add a note about argument and retval access
  2023-08-23 15:17 ` [PATCH v4 9/9] Documentation: tracing: Add a note about argument and retval access Masami Hiramatsu (Google)
@ 2023-08-25 16:12   ` Florent Revest
  0 siblings, 0 replies; 32+ messages in thread
From: Florent Revest @ 2023-08-25 16:12 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, linux-trace-kernel, LKML,
	Martin KaFai Lau, bpf, Sven Schnelle, Alexei Starovoitov,
	Jiri Olsa, Arnaldo Carvalho de Melo, Daniel Borkmann,
	Alan Maguire, Mark Rutland, Peter Zijlstra, Thomas Gleixner

On Wed, Aug 23, 2023 at 5:17 PM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
> diff --git a/Documentation/trace/fprobetrace.rst b/Documentation/trace/fprobetrace.rst
> index 8e9bebcf0a2e..e35e6b18df40 100644
> --- a/Documentation/trace/fprobetrace.rst
> +++ b/Documentation/trace/fprobetrace.rst
> @@ -59,8 +59,12 @@ Synopsis of fprobe-events
>                    and bitfield are supported.
>
>    (\*1) This is available only when BTF is enabled.
> -  (\*2) only for the probe on function entry (offs == 0).
> -  (\*3) only for return probe.
> +  (\*2) only for the probe on function entry (offs == 0). Note, this argument access
> +        is best effort, because depending on the argument type, it may be passed on
> +        the stack. But this only support the arguments via registers.

supports*

Otherwise:
Acked-by: Florent Revest <revest@chromium.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 5/9] ftrace: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
  2023-08-23 15:16 ` [PATCH v4 5/9] ftrace: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs Masami Hiramatsu (Google)
@ 2023-08-25 21:49   ` Andrii Nakryiko
  2023-08-26  1:56     ` Masami Hiramatsu
  0 siblings, 1 reply; 32+ messages in thread
From: Andrii Nakryiko @ 2023-08-25 21:49 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

On Wed, Aug 23, 2023 at 8:16 AM Masami Hiramatsu (Google)
<mhiramat@kernel.org> wrote:
>
> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
> Add ftrace_partial_regs() which converts the ftrace_regs to pt_regs.
> If the architecture defines its own ftrace_regs, this copies partial
> registers to pt_regs and returns it. If not, ftrace_regs is the same as
> pt_regs and ftrace_partial_regs() will return ftrace_regs::regs.
>
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> Acked-by: Florent Revest <revest@chromium.org>
> ---
>  Changes in v3:
>   - Fix to use pt_regs::regs instead of x.
>   - Return ftrace_regs::regs forcibly if HAVE_PT_REGS_COMPAT_FTRACE_REGS=y.
>   - Fix typo.
>   - Fix to copy correct registers to the pt_regs on arm64.
>  Changes in v4:
>   - Change the patch order in the series so that fprobe event can use this.
> ---
>  arch/arm64/include/asm/ftrace.h |   11 +++++++++++
>  include/linux/ftrace.h          |   17 +++++++++++++++++
>  2 files changed, 28 insertions(+)
>
> diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> index ab158196480c..5ad24f315d52 100644
> --- a/arch/arm64/include/asm/ftrace.h
> +++ b/arch/arm64/include/asm/ftrace.h
> @@ -137,6 +137,17 @@ ftrace_override_function_with_return(struct ftrace_regs *fregs)
>         fregs->pc = fregs->lr;
>  }
>
> +static __always_inline struct pt_regs *
> +ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
> +{
> +       memcpy(regs->regs, fregs->regs, sizeof(u64) * 9);
> +       regs->sp = fregs->sp;
> +       regs->pc = fregs->pc;
> +       regs->regs[29] = fregs->fp;
> +       regs->regs[30] = fregs->lr;

I see that orig_x0 from pt_regs is used on arm64 to get syscall's
first parameter. And it seems like this ftrace_regs to pt_regs
conversion doesn't touch orig_x0 at all. Is it maintained/preserved
somewhere else, or will we lose the ability to get the first syscall's
argument?

Looking at libbpf's bpf_tracing.h, other than orig_x0, I think all the
other registers are still preserved, so this seems to be the only
potential problem.


> +       return regs;
> +}
> +
>  int ftrace_regs_query_register_offset(const char *name);
>
>  int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index c0a42d0860b8..a6ed2aa71efc 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -165,6 +165,23 @@ static __always_inline struct pt_regs *ftrace_get_regs(struct ftrace_regs *fregs
>         return arch_ftrace_get_regs(fregs);
>  }
>
> +#if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) || \
> +       defined(CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST)
> +
> +static __always_inline struct pt_regs *
> +ftrace_partial_regs(struct ftrace_regs *fregs, struct pt_regs *regs)
> +{
> +       /*
> +        * If CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST=y, ftrace_regs memory
> +        * layout is the same as pt_regs. So always returns that address.
> +        * Since arch_ftrace_get_regs() will check some members and may return
> +        * NULL, we can not use it.
> +        */
> +       return &fregs->regs;
> +}
> +
> +#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
> +
>  /*
>   * When true, the ftrace_regs_{get,set}_*() functions may be used on fregs.
>   * Note: this can be true even when ftrace_get_regs() cannot provide a pt_regs.
>
>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 5/9] ftrace: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
  2023-08-25 21:49   ` Andrii Nakryiko
@ 2023-08-26  1:56     ` Masami Hiramatsu
  2023-09-05 19:50       ` Andrii Nakryiko
  0 siblings, 1 reply; 32+ messages in thread
From: Masami Hiramatsu @ 2023-08-26  1:56 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Steven Rostedt, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

On Fri, 25 Aug 2023 14:49:48 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Wed, Aug 23, 2023 at 8:16 AM Masami Hiramatsu (Google)
> <mhiramat@kernel.org> wrote:
> >
> > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> >
> > Add ftrace_partial_regs() which converts the ftrace_regs to pt_regs.
> > If the architecture defines its own ftrace_regs, this copies partial
> > registers to pt_regs and returns it. If not, ftrace_regs is the same as
> > pt_regs and ftrace_partial_regs() will return ftrace_regs::regs.
> >
> > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > Acked-by: Florent Revest <revest@chromium.org>
> > ---
> >  Changes in v3:
> >   - Fix to use pt_regs::regs instead of x.
> >   - Return ftrace_regs::regs forcibly if HAVE_PT_REGS_COMPAT_FTRACE_REGS=y.
> >   - Fix typo.
> >   - Fix to copy correct registers to the pt_regs on arm64.
> >  Changes in v4:
> >   - Change the patch order in the series so that fprobe event can use this.
> > ---
> >  arch/arm64/include/asm/ftrace.h |   11 +++++++++++
> >  include/linux/ftrace.h          |   17 +++++++++++++++++
> >  2 files changed, 28 insertions(+)
> >
> > diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> > index ab158196480c..5ad24f315d52 100644
> > --- a/arch/arm64/include/asm/ftrace.h
> > +++ b/arch/arm64/include/asm/ftrace.h
> > @@ -137,6 +137,17 @@ ftrace_override_function_with_return(struct ftrace_regs *fregs)
> >         fregs->pc = fregs->lr;
> >  }
> >
> > +static __always_inline struct pt_regs *
> > +ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
> > +{
> > +       memcpy(regs->regs, fregs->regs, sizeof(u64) * 9);
> > +       regs->sp = fregs->sp;
> > +       regs->pc = fregs->pc;
> > +       regs->regs[29] = fregs->fp;
> > +       regs->regs[30] = fregs->lr;
> 
> I see that orig_x0 from pt_regs is used on arm64 to get syscall's
> first parameter. And it seems like this ftrace_regs to pt_regs
> conversion doesn't touch orig_x0 at all. Is it maintained/preserved
> somewhere else, or will we lose the ability to get the first syscall's
> argument?

Thanks for checking it!

Does BPF uses kprobe probe to trace syscalls? Since we have raw_syscall
trace events, no need to use kprobe to do that. (and I don't recommend to
use kprobe to do such fixed event)

> 
> Looking at libbpf's bpf_tracing.h, other than orig_x0, I think all the
> other registers are still preserved, so this seems to be the only
> potential problem.

Great!

Thank you,

> 
> 
> > +       return regs;
> > +}
> > +
> >  int ftrace_regs_query_register_offset(const char *name);
> >
> >  int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
> > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> > index c0a42d0860b8..a6ed2aa71efc 100644
> > --- a/include/linux/ftrace.h
> > +++ b/include/linux/ftrace.h
> > @@ -165,6 +165,23 @@ static __always_inline struct pt_regs *ftrace_get_regs(struct ftrace_regs *fregs
> >         return arch_ftrace_get_regs(fregs);
> >  }
> >
> > +#if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) || \
> > +       defined(CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST)
> > +
> > +static __always_inline struct pt_regs *
> > +ftrace_partial_regs(struct ftrace_regs *fregs, struct pt_regs *regs)
> > +{
> > +       /*
> > +        * If CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST=y, ftrace_regs memory
> > +        * layout is the same as pt_regs. So always returns that address.
> > +        * Since arch_ftrace_get_regs() will check some members and may return
> > +        * NULL, we can not use it.
> > +        */
> > +       return &fregs->regs;
> > +}
> > +
> > +#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
> > +
> >  /*
> >   * When true, the ftrace_regs_{get,set}_*() functions may be used on fregs.
> >   * Note: this can be true even when ftrace_get_regs() cannot provide a pt_regs.
> >
> >


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 6/9] tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
  2023-08-25 16:12   ` Florent Revest
@ 2023-08-26  3:38     ` Masami Hiramatsu
  0 siblings, 0 replies; 32+ messages in thread
From: Masami Hiramatsu @ 2023-08-26  3:38 UTC (permalink / raw)
  To: Florent Revest
  Cc: Alexei Starovoitov, Steven Rostedt, linux-trace-kernel, LKML,
	Martin KaFai Lau, bpf, Sven Schnelle, Alexei Starovoitov,
	Jiri Olsa, Arnaldo Carvalho de Melo, Daniel Borkmann,
	Alan Maguire, Mark Rutland, Peter Zijlstra, Thomas Gleixner,
	Peter Zijlstra

(Cc: Peter)

On Fri, 25 Aug 2023 18:12:07 +0200
Florent Revest <revest@chromium.org> wrote:

> On Wed, Aug 23, 2023 at 5:16 PM Masami Hiramatsu (Google)
> <mhiramat@kernel.org> wrote:
> >
> > diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
> > index c60d0d9f1a95..90ad28260a9f 100644
> > --- a/kernel/trace/trace_fprobe.c
> > +++ b/kernel/trace/trace_fprobe.c
> > +#else /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS && !CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
> > +
> > +/* Since fprobe handlers can be nested, pt_regs buffer need to be a stack */
> > +#define PERF_FPROBE_REGS_MAX   4
> > +
> > +struct pt_regs_stack {
> > +       struct pt_regs regs[PERF_FPROBE_REGS_MAX];
> > +       int idx;
> > +};
> > +
> > +static DEFINE_PER_CPU(struct pt_regs_stack, perf_fprobe_regs);
> > +
> > +static __always_inline
> > +struct pt_regs *perf_fprobe_partial_regs(struct ftrace_regs *fregs)
> > +{
> > +       struct pt_regs_stack *stack = this_cpu_ptr(&perf_fprobe_regs);
> > +       struct pt_regs *regs;
> > +
> > +       if (stack->idx < PERF_FPROBE_REGS_MAX) {
> > +               regs = stack->regs[stack->idx++];
> 
> This is missing an &:
> regs = &stack->regs[stack->idx++];

Oops, good point. I'm curious it didin't cause compile error...
(I thought I built it on arm64)

> 
> > +               return ftrace_partial_regs(fregs, regs);
> 
> I think this is incorrect on arm64 and will likely cause very subtle
> failure modes down the line on other architectures too. The problem on
> arm64 is that Perf calls "user_mode(regs)" somewhere down the line,
> that macro tries to read the "pstate" register, which is not populated
> in ftrace_regs, so it's not copied into a "partial" pt_regs either and
> Perf can take wrong decisions based on that.

I think we can assure the ftrace_regs is always !user_mode() so in that case
ftrace_partial_regs() should fill the 'pstate' register as kernel mode.

> 
> I already mentioned this problem in the past:
> - in the third answer block of:
> https://lore.kernel.org/all/CABRcYmJjtVq-330ktqTAUiNO1=yG_aHd0xz=c550O5C7QP++UA@mail.gmail.com/
> - in the fourth answer block of:
> https://lore.kernel.org/all/CABRcYm+esb8J2O1v6=C+h+HSa5NxraPUgo63w7-iZj0CXbpusg@mail.gmail.com/
> 

Oops, sorry I missed that. And I basically agreed that we need a special
care for perf. Let me reply it.

> It is quite possible that other architectures at some point introduce
> a light ftrace "args" trampoline that misses one of the registers
> expected by Perf because they don't realize that this trampoline calls
> fprobe which calls Perf which has specific registers expectations.

Agreed.

> 
> We got the green light from Alexei to use ftrace_partial_regs for "BPF
> mutli_kprobe" because these BPF programs can gracefully deal with
> sparse pt_regs but I think a similar conversation needs to happen with
> the Perf folks.

Indeed. Who is the best person to involve, Peterz? (but I think
we need arm64 PMU part maintainer to talk)

> 
> ----
> 
> On a side-note, a subtle difference between ftrace_partial_regs with
> and without HAVE_PT_REGS_TO_FTRACE_REGS_CAST is that one does a copy
> and the other does not. If a subsystem receives a partial regs under
> HAVE_PT_REGS_TO_FTRACE_REGS_CAST, it can modify register fields and
> the modified values will be restored by the ftrace trampoline. Without
> HAVE_PT_REGS_TO_FTRACE_REGS_CAST, only the copy will be modified and
> ftrace won't restore them. I think the least we can do is to document
> thoroughly the guarantees of the ftrace_partial_regs API: users
> shouldn't rely on modifying the resulting regs because depending on
> the architecture this could do different things. People shouldn't rely
> on any register that isn't covered by one of the ftrace_regs_get_*
> helpers because it can be unpopulated on some architectures. I believe
> this is the case for BPF multi_kprobe but not for Perf.

I agree with the documentation requirement, but since the fprobe official
interface becomes ftrace_regs, user naturally expects it is not pt_regs.
The problem is that the perf's case. Since the perf is natively only
support pt_regs (and there is no reason to support ftrace_regs, yes).
Hmm, I will recheck how the perf events on trace-event is implementd.

Thank you,

> 
> > +       }
> > +       return NULL;
> > +}
> > +
> > +static __always_inline void perf_fprobe_return_regs(struct pt_regs *regs)
> > +{
> > +       struct pt_regs_stack *stack = this_cpu_ptr(&perf_fprobe_regs);
> > +
> > +       if (WARN_ON_ONCE(regs != stack->regs[stack->idx]))
> 
> This is missing an & too:
> if (WARN_ON_ONCE(regs != &stack->regs[stack->idx]))
> 
> 
> 
> 
> > +               return;
> > +
> > +       --stack->idx;
> > +}
> > +
> > +#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 6/9] tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS
  2023-08-23 15:16 ` [PATCH v4 6/9] tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS Masami Hiramatsu (Google)
  2023-08-25 16:12   ` Florent Revest
@ 2023-08-30  7:20   ` Masami Hiramatsu
  1 sibling, 0 replies; 32+ messages in thread
From: Masami Hiramatsu @ 2023-08-30  7:20 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

On Thu, 24 Aug 2023 00:16:37 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> +#if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) || \
> +	defined(CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST)
> +
> +static __always_inline
> +struct pt_regs *perf_fprobe_partial_regs(struct ftrace_regs *fregs)
> +{
> +	/* See include/linux/ftrace.h, this returns &fregs->regs */
> +	return ftrace_partial_regs(fregs, NULL);
> +}
> +
> +#define perf_fprobe_return_regs(regs) do {} while (0)
> +
> +#else /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS && !CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
> +
> +/* Since fprobe handlers can be nested, pt_regs buffer need to be a stack */
> +#define PERF_FPROBE_REGS_MAX	4
> +
> +struct pt_regs_stack {
> +	struct pt_regs regs[PERF_FPROBE_REGS_MAX];
> +	int idx;
> +};
> +
> +static DEFINE_PER_CPU(struct pt_regs_stack, perf_fprobe_regs);
> +
> +static __always_inline
> +struct pt_regs *perf_fprobe_partial_regs(struct ftrace_regs *fregs)
> +{
> +	struct pt_regs_stack *stack = this_cpu_ptr(&perf_fprobe_regs);
> +	struct pt_regs *regs;
> +
> +	if (stack->idx < PERF_FPROBE_REGS_MAX) {
> +		regs = stack->regs[stack->idx++];
> +		return ftrace_partial_regs(fregs, regs);
> +	}
> +	return NULL;
> +}
> +
> +static __always_inline void perf_fprobe_return_regs(struct pt_regs *regs)
> +{
> +	struct pt_regs_stack *stack = this_cpu_ptr(&perf_fprobe_regs);
> +
> +	if (WARN_ON_ONCE(regs != stack->regs[stack->idx]))
> +		return;
> +
> +	--stack->idx;
> +}

Ah, I found that the perf_trace_buf_alloc() does the same thing. So

perf_trace_buf_alloc(size, &pt_regs, &rctx);

will give us the pt_regs at that point. Trace event does that so I think
it is OK to do that here.

Thank you,

> +
> +#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
> +
>  static int fentry_perf_func(struct trace_fprobe *tf, unsigned long entry_ip,
> -			    struct pt_regs *regs)
> +			    struct ftrace_regs *fregs)
>  {
>  	struct trace_event_call *call = trace_probe_event_call(&tf->tp);
>  	struct fentry_trace_entry_head *entry;
>  	struct hlist_head *head;
>  	int size, __size, dsize;
> +	struct pt_regs *regs;
>  	int rctx;
>  
> +	regs = perf_fprobe_partial_regs(fregs);
> +	if (!regs)
> +		return -EINVAL;
> +
>  	head = this_cpu_ptr(call->perf_events);
>  	if (hlist_empty(head))
> -		return 0;
> +		goto out;
>  
> -	dsize = __get_data_size(&tf->tp, regs);
> +	dsize = __get_data_size(&tf->tp, fregs);
>  	__size = sizeof(*entry) + tf->tp.size + dsize;
>  	size = ALIGN(__size + sizeof(u32), sizeof(u64));
>  	size -= sizeof(u32);
>  
>  	entry = perf_trace_buf_alloc(size, NULL, &rctx);

Here, we can borrow the pt_regs.

>  	if (!entry)
> -		return 0;
> +		goto out;
>  
>  	entry->ip = entry_ip;
>  	memset(&entry[1], 0, dsize);
> -	store_trace_args(&entry[1], &tf->tp, regs, sizeof(*entry), dsize);
> +	store_trace_args(&entry[1], &tf->tp, fregs, sizeof(*entry), dsize);
>  	perf_trace_buf_submit(entry, size, rctx, call->event.type, 1, regs,
>  			      head, NULL);
> +out:
> +	perf_fprobe_return_regs(regs);
>  	return 0;
>  }
>  NOKPROBE_SYMBOL(fentry_perf_func);
>  
>  static void
>  fexit_perf_func(struct trace_fprobe *tf, unsigned long entry_ip,
> -		unsigned long ret_ip, struct pt_regs *regs)
> +		unsigned long ret_ip, struct ftrace_regs *fregs)
>  {
>  	struct trace_event_call *call = trace_probe_event_call(&tf->tp);
>  	struct fexit_trace_entry_head *entry;
>  	struct hlist_head *head;
>  	int size, __size, dsize;
> +	struct pt_regs *regs;
>  	int rctx;
>  
> +	regs = perf_fprobe_partial_regs(fregs);
> +	if (!regs)
> +		return;
> +
>  	head = this_cpu_ptr(call->perf_events);
>  	if (hlist_empty(head))
> -		return;
> +		goto out;
>  
> -	dsize = __get_data_size(&tf->tp, regs);
> +	dsize = __get_data_size(&tf->tp, fregs);
>  	__size = sizeof(*entry) + tf->tp.size + dsize;
>  	size = ALIGN(__size + sizeof(u32), sizeof(u64));
>  	size -= sizeof(u32);
>  
>  	entry = perf_trace_buf_alloc(size, NULL, &rctx);

Ditto.

Thanks,
-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook
  2023-08-23 15:16 ` [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook Masami Hiramatsu (Google)
  2023-08-25 16:12   ` Florent Revest
@ 2023-09-04 13:40   ` Masami Hiramatsu
  2023-09-05  7:17     ` Sven Schnelle
  1 sibling, 1 reply; 32+ messages in thread
From: Masami Hiramatsu @ 2023-09-04 13:40 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Alexei Starovoitov, Steven Rostedt, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

Hi,

On Thu, 24 Aug 2023 00:16:14 +0900
"Masami Hiramatsu (Google)" <mhiramat@kernel.org> wrote:

> From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> 
> Change the fprobe exit handler and rethook to use ftrace_regs data structure
> instead of pt_regs. This also introduce HAVE_PT_REGS_TO_FTRACE_REGS_CAST
> which means the ftrace_regs's memory layout is equal to the pt_regs so
> that those are able to cast. Only if it is enabled, kretprobe will use
> rethook since kretprobe requires pt_regs for backward compatibility.
> 
> This means the archs which currently implement rethook for kretprobes needs to
> set that flag and it must ensure struct ftrace_regs is same as pt_regs.
> If not, it must be either disabling kretprobe or implementing kretprobe
> trampoline separately from rethook trampoline.

I found that this is not enough becuase s390/loongarch already implemented
their rethook, and as far as I can see, the s390 ftrace_regs does not save
the required registers for rethook. Thus, for such architecture, we need
another kconfig flag and keep using the pt_regs for rethook.

In this series, I would like to focus on x86_64 and arm64, and others
will support later. What I will do is to introduce another Kconfig item
(HAVE_RETHOOK_FTRACE_REGS_HOOK) which means that the rethook is able to
hook a function with ftrace_regs on that architecture. Thus instead of
replacing pt_regs with ftrace_regs here, I will just introduce another
rethook API which uses ftrace_regs.

If HAVE_RETHOOK_FTRACE_REGS_HOOK=y, that API needs to be implemented too.
(currently, only x86_64 enables this)


      | HAVE_PT_REGS_TO_FTRACE_REGS_CAST | HAVE_RETHOOK_FTRACE_REGS_HOOK |
arm64 | no (keep kretprobe trampoline)   |  yes                          |
x86   | yes                              |  yes                          |
others| yes                              |  no (will be yes later)       |


If HAVE_RETHOOK_FTRACE_REGS_HOOK=n, fprobe requires DYNAMIC_FTRACE_WITH_REGS
and set the FTRACE_OPS_FL_SAVE_REGS flag so that it can get the pt_regs from
ftrace_regs for rethook. It also calls rethook with legacy pt_regs hook API.

Even though, rethook still provides the ftrace_regs callback interface for
fprobe, so that the rethook can be implemented on the architecture which
HAVE_PT_REGS_TO_FTRACE_REGS_CAST=n. 

Let me try to implement it.

Thank you,


> 
> Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> ---
>  Changes in v3:
>   - Config rename to HAVE_PT_REGS_TO_FTRACE_REGS_CAST
>   - Use ftrace_regs_* APIs instead of ftrace_get_regs().
>  Changes in v4:
>   - Add static_assert() to ensure at least the size of pt_regs
>     and ftrace_regs are same if HAVE_PT_REGS_TO_FTRACE_REGS_CAST=y.
> ---
>  arch/Kconfig                    |    1 +
>  arch/loongarch/Kconfig          |    1 +
>  arch/s390/Kconfig               |    1 +
>  arch/x86/Kconfig                |    1 +
>  arch/x86/kernel/rethook.c       |   13 +++++++------
>  include/linux/fprobe.h          |    2 +-
>  include/linux/ftrace.h          |    6 ++++++
>  include/linux/rethook.h         |   11 ++++++-----
>  kernel/kprobes.c                |   10 ++++++++--
>  kernel/trace/Kconfig            |    7 +++++++
>  kernel/trace/bpf_trace.c        |    6 +++++-
>  kernel/trace/fprobe.c           |    6 +++---
>  kernel/trace/rethook.c          |   16 ++++++++--------
>  kernel/trace/trace_fprobe.c     |    6 +++++-
>  lib/test_fprobe.c               |    6 +++---
>  samples/fprobe/fprobe_example.c |    2 +-
>  16 files changed, 64 insertions(+), 31 deletions(-)
> 
> diff --git a/arch/Kconfig b/arch/Kconfig
> index aff2746c8af2..e41a270c30bb 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -201,6 +201,7 @@ config KRETPROBE_ON_RETHOOK
>  	def_bool y
>  	depends on HAVE_RETHOOK
>  	depends on KRETPROBES
> +	depends on HAVE_PT_REGS_TO_FTRACE_REGS_CAST || !HAVE_DYNAMIC_FTRACE_WITH_ARGS
>  	select RETHOOK
>  
>  config USER_RETURN_NOTIFIER
> diff --git a/arch/loongarch/Kconfig b/arch/loongarch/Kconfig
> index e71d5bf2cee0..33c3a4598ae0 100644
> --- a/arch/loongarch/Kconfig
> +++ b/arch/loongarch/Kconfig
> @@ -103,6 +103,7 @@ config LOONGARCH
>  	select HAVE_DMA_CONTIGUOUS
>  	select HAVE_DYNAMIC_FTRACE
>  	select HAVE_DYNAMIC_FTRACE_WITH_ARGS
> +	select HAVE_PT_REGS_TO_FTRACE_REGS_CAST
>  	select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>  	select HAVE_DYNAMIC_FTRACE_WITH_REGS
>  	select HAVE_EBPF_JIT
> diff --git a/arch/s390/Kconfig b/arch/s390/Kconfig
> index 5b39918b7042..ef06c3c2b06d 100644
> --- a/arch/s390/Kconfig
> +++ b/arch/s390/Kconfig
> @@ -165,6 +165,7 @@ config S390
>  	select HAVE_DMA_CONTIGUOUS
>  	select HAVE_DYNAMIC_FTRACE
>  	select HAVE_DYNAMIC_FTRACE_WITH_ARGS
> +	select HAVE_PT_REGS_TO_FTRACE_REGS_CAST
>  	select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>  	select HAVE_DYNAMIC_FTRACE_WITH_REGS
>  	select HAVE_EBPF_JIT if HAVE_MARCH_Z196_FEATURES
> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> index e36261b4ea14..7c1f3194e209 100644
> --- a/arch/x86/Kconfig
> +++ b/arch/x86/Kconfig
> @@ -207,6 +207,7 @@ config X86
>  	select HAVE_DYNAMIC_FTRACE
>  	select HAVE_DYNAMIC_FTRACE_WITH_REGS
>  	select HAVE_DYNAMIC_FTRACE_WITH_ARGS	if X86_64
> +	select HAVE_PT_REGS_TO_FTRACE_REGS_CAST	if X86_64
>  	select HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
>  	select HAVE_SAMPLE_FTRACE_DIRECT	if X86_64
>  	select HAVE_SAMPLE_FTRACE_DIRECT_MULTI	if X86_64
> diff --git a/arch/x86/kernel/rethook.c b/arch/x86/kernel/rethook.c
> index 8a1c0111ae79..d714d0276c93 100644
> --- a/arch/x86/kernel/rethook.c
> +++ b/arch/x86/kernel/rethook.c
> @@ -83,7 +83,8 @@ __used __visible void arch_rethook_trampoline_callback(struct pt_regs *regs)
>  	 * arch_rethook_fixup_return() which called from this
>  	 * rethook_trampoline_handler().
>  	 */
> -	rethook_trampoline_handler(regs, (unsigned long)frame_pointer);
> +	rethook_trampoline_handler((struct ftrace_regs *)regs,
> +				   (unsigned long)frame_pointer);
>  
>  	/*
>  	 * Copy FLAGS to 'pt_regs::ss' so that arch_rethook_trapmoline()
> @@ -104,22 +105,22 @@ NOKPROBE_SYMBOL(arch_rethook_trampoline_callback);
>  STACK_FRAME_NON_STANDARD_FP(arch_rethook_trampoline);
>  
>  /* This is called from rethook_trampoline_handler(). */
> -void arch_rethook_fixup_return(struct pt_regs *regs,
> +void arch_rethook_fixup_return(struct ftrace_regs *fregs,
>  			       unsigned long correct_ret_addr)
>  {
> -	unsigned long *frame_pointer = (void *)(regs + 1);
> +	unsigned long *frame_pointer = (void *)(fregs + 1);
>  
>  	/* Replace fake return address with real one. */
>  	*frame_pointer = correct_ret_addr;
>  }
>  NOKPROBE_SYMBOL(arch_rethook_fixup_return);
>  
> -void arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mcount)
> +void arch_rethook_prepare(struct rethook_node *rh, struct ftrace_regs *fregs, bool mcount)
>  {
> -	unsigned long *stack = (unsigned long *)regs->sp;
> +	unsigned long *stack = (unsigned long *)ftrace_regs_get_stack_pointer(fregs);
>  
>  	rh->ret_addr = stack[0];
> -	rh->frame = regs->sp;
> +	rh->frame = (unsigned long)stack;
>  
>  	/* Replace the return addr with trampoline addr */
>  	stack[0] = (unsigned long) arch_rethook_trampoline;
> diff --git a/include/linux/fprobe.h b/include/linux/fprobe.h
> index 36c0595f7b93..b9c0c216dedb 100644
> --- a/include/linux/fprobe.h
> +++ b/include/linux/fprobe.h
> @@ -38,7 +38,7 @@ struct fprobe {
>  			     unsigned long ret_ip, struct ftrace_regs *regs,
>  			     void *entry_data);
>  	void (*exit_handler)(struct fprobe *fp, unsigned long entry_ip,
> -			     unsigned long ret_ip, struct pt_regs *regs,
> +			     unsigned long ret_ip, struct ftrace_regs *regs,
>  			     void *entry_data);
>  };
>  
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index fe335d861f08..c0a42d0860b8 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -151,6 +151,12 @@ struct ftrace_regs {
>  
>  #endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || !CONFIG_FUNCTION_TRACER */
>  
> +#ifdef CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST
> +
> +static_assert(sizeof(struct pt_regs) == sizeof(struct ftrace_regs));
> +
> +#endif /* CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
> +
>  static __always_inline struct pt_regs *ftrace_get_regs(struct ftrace_regs *fregs)
>  {
>  	if (!fregs)
> diff --git a/include/linux/rethook.h b/include/linux/rethook.h
> index 26b6f3c81a76..138d64c8b67b 100644
> --- a/include/linux/rethook.h
> +++ b/include/linux/rethook.h
> @@ -7,6 +7,7 @@
>  
>  #include <linux/compiler.h>
>  #include <linux/freelist.h>
> +#include <linux/ftrace.h>
>  #include <linux/kallsyms.h>
>  #include <linux/llist.h>
>  #include <linux/rcupdate.h>
> @@ -14,7 +15,7 @@
>  
>  struct rethook_node;
>  
> -typedef void (*rethook_handler_t) (struct rethook_node *, void *, unsigned long, struct pt_regs *);
> +typedef void (*rethook_handler_t) (struct rethook_node *, void *, unsigned long, struct ftrace_regs *);
>  
>  /**
>   * struct rethook - The rethook management data structure.
> @@ -64,12 +65,12 @@ void rethook_free(struct rethook *rh);
>  void rethook_add_node(struct rethook *rh, struct rethook_node *node);
>  struct rethook_node *rethook_try_get(struct rethook *rh);
>  void rethook_recycle(struct rethook_node *node);
> -void rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount);
> +void rethook_hook(struct rethook_node *node, struct ftrace_regs *regs, bool mcount);
>  unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame,
>  				    struct llist_node **cur);
>  
>  /* Arch dependent code must implement arch_* and trampoline code */
> -void arch_rethook_prepare(struct rethook_node *node, struct pt_regs *regs, bool mcount);
> +void arch_rethook_prepare(struct rethook_node *node, struct ftrace_regs *regs, bool mcount);
>  void arch_rethook_trampoline(void);
>  
>  /**
> @@ -84,11 +85,11 @@ static inline bool is_rethook_trampoline(unsigned long addr)
>  }
>  
>  /* If the architecture needs to fixup the return address, implement it. */
> -void arch_rethook_fixup_return(struct pt_regs *regs,
> +void arch_rethook_fixup_return(struct ftrace_regs *regs,
>  			       unsigned long correct_ret_addr);
>  
>  /* Generic trampoline handler, arch code must prepare asm stub */
> -unsigned long rethook_trampoline_handler(struct pt_regs *regs,
> +unsigned long rethook_trampoline_handler(struct ftrace_regs *regs,
>  					 unsigned long frame);
>  
>  #ifdef CONFIG_RETHOOK
> diff --git a/kernel/kprobes.c b/kernel/kprobes.c
> index 0c6185aefaef..821dff656149 100644
> --- a/kernel/kprobes.c
> +++ b/kernel/kprobes.c
> @@ -2132,7 +2132,11 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
>  	if (rp->entry_handler && rp->entry_handler(ri, regs))
>  		rethook_recycle(rhn);
>  	else
> -		rethook_hook(rhn, regs, kprobe_ftrace(p));
> +		/*
> +		 * We can cast pt_regs to ftrace_regs because this depends on
> +		 * HAVE_PT_REGS_TO_FTRACE_REGS_CAST.
> +		 */
> +		rethook_hook(rhn, (struct ftrace_regs *)regs, kprobe_ftrace(p));
>  
>  	return 0;
>  }
> @@ -2140,9 +2144,11 @@ NOKPROBE_SYMBOL(pre_handler_kretprobe);
>  
>  static void kretprobe_rethook_handler(struct rethook_node *rh, void *data,
>  				      unsigned long ret_addr,
> -				      struct pt_regs *regs)
> +				      struct ftrace_regs *fregs)
>  {
>  	struct kretprobe *rp = (struct kretprobe *)data;
> +	/* Ditto, this depends on HAVE_PT_REGS_TO_FTRACE_REGS_CAST. */
> +	struct pt_regs *regs = (struct pt_regs *)fregs;
>  	struct kretprobe_instance *ri;
>  	struct kprobe_ctlblk *kcb;
>  
> diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig
> index 976fd594b446..d56304276318 100644
> --- a/kernel/trace/Kconfig
> +++ b/kernel/trace/Kconfig
> @@ -57,6 +57,13 @@ config HAVE_DYNAMIC_FTRACE_WITH_ARGS
>  	 This allows for use of ftrace_regs_get_argument() and
>  	 ftrace_regs_get_stack_pointer().
>  
> +config HAVE_PT_REGS_TO_FTRACE_REGS_CAST
> +	bool
> +	help
> +	 If this is set, the memory layout of the ftrace_regs data structure
> +	 is the same as the pt_regs. So the pt_regs is possible to be casted
> +	 to ftrace_regs.
> +
>  config HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
>  	bool
>  	help
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 22d00c817f1a..c4d57c7cdc7c 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -2675,10 +2675,14 @@ kprobe_multi_link_handler(struct fprobe *fp, unsigned long fentry_ip,
>  
>  static void
>  kprobe_multi_link_exit_handler(struct fprobe *fp, unsigned long fentry_ip,
> -			       unsigned long ret_ip, struct pt_regs *regs,
> +			       unsigned long ret_ip, struct ftrace_regs *fregs,
>  			       void *data)
>  {
>  	struct bpf_kprobe_multi_link *link;
> +	struct pt_regs *regs = ftrace_get_regs(fregs);
> +
> +	if (!regs)
> +		return;
>  
>  	link = container_of(fp, struct bpf_kprobe_multi_link, fp);
>  	kprobe_multi_link_prog_run(link, get_entry_ip(fentry_ip), regs);
> diff --git a/kernel/trace/fprobe.c b/kernel/trace/fprobe.c
> index 07deb52df44a..dfddc7e8424e 100644
> --- a/kernel/trace/fprobe.c
> +++ b/kernel/trace/fprobe.c
> @@ -53,7 +53,7 @@ static inline void __fprobe_handler(unsigned long ip, unsigned long parent_ip,
>  		if (ret)
>  			rethook_recycle(rh);
>  		else
> -			rethook_hook(rh, ftrace_get_regs(fregs), true);
> +			rethook_hook(rh, fregs, true);
>  	}
>  }
>  
> @@ -120,7 +120,7 @@ static void fprobe_kprobe_handler(unsigned long ip, unsigned long parent_ip,
>  }
>  
>  static void fprobe_exit_handler(struct rethook_node *rh, void *data,
> -				unsigned long ret_ip, struct pt_regs *regs)
> +				unsigned long ret_ip, struct ftrace_regs *fregs)
>  {
>  	struct fprobe *fp = (struct fprobe *)data;
>  	struct fprobe_rethook_node *fpr;
> @@ -141,7 +141,7 @@ static void fprobe_exit_handler(struct rethook_node *rh, void *data,
>  		return;
>  	}
>  
> -	fp->exit_handler(fp, fpr->entry_ip, ret_ip, regs,
> +	fp->exit_handler(fp, fpr->entry_ip, ret_ip, fregs,
>  			 fp->entry_data_size ? (void *)fpr->data : NULL);
>  	ftrace_test_recursion_unlock(bit);
>  }
> diff --git a/kernel/trace/rethook.c b/kernel/trace/rethook.c
> index 5eb9b598f4e9..7c5cf9d5910c 100644
> --- a/kernel/trace/rethook.c
> +++ b/kernel/trace/rethook.c
> @@ -189,7 +189,7 @@ NOKPROBE_SYMBOL(rethook_try_get);
>  /**
>   * rethook_hook() - Hook the current function return.
>   * @node: The struct rethook node to hook the function return.
> - * @regs: The struct pt_regs for the function entry.
> + * @fregs: The struct ftrace_regs for the function entry.
>   * @mcount: True if this is called from mcount(ftrace) context.
>   *
>   * Hook the current running function return. This must be called when the
> @@ -199,9 +199,9 @@ NOKPROBE_SYMBOL(rethook_try_get);
>   * from the real function entry (e.g. kprobes) @mcount must be set false.
>   * This is because the way to hook the function return depends on the context.
>   */
> -void rethook_hook(struct rethook_node *node, struct pt_regs *regs, bool mcount)
> +void rethook_hook(struct rethook_node *node, struct ftrace_regs *fregs, bool mcount)
>  {
> -	arch_rethook_prepare(node, regs, mcount);
> +	arch_rethook_prepare(node, fregs, mcount);
>  	__llist_add(&node->llist, &current->rethooks);
>  }
>  NOKPROBE_SYMBOL(rethook_hook);
> @@ -269,7 +269,7 @@ unsigned long rethook_find_ret_addr(struct task_struct *tsk, unsigned long frame
>  }
>  NOKPROBE_SYMBOL(rethook_find_ret_addr);
>  
> -void __weak arch_rethook_fixup_return(struct pt_regs *regs,
> +void __weak arch_rethook_fixup_return(struct ftrace_regs *fregs,
>  				      unsigned long correct_ret_addr)
>  {
>  	/*
> @@ -281,7 +281,7 @@ void __weak arch_rethook_fixup_return(struct pt_regs *regs,
>  }
>  
>  /* This function will be called from each arch-defined trampoline. */
> -unsigned long rethook_trampoline_handler(struct pt_regs *regs,
> +unsigned long rethook_trampoline_handler(struct ftrace_regs *fregs,
>  					 unsigned long frame)
>  {
>  	struct llist_node *first, *node = NULL;
> @@ -295,7 +295,7 @@ unsigned long rethook_trampoline_handler(struct pt_regs *regs,
>  		BUG_ON(1);
>  	}
>  
> -	instruction_pointer_set(regs, correct_ret_addr);
> +	ftrace_regs_set_instruction_pointer(fregs, correct_ret_addr);
>  
>  	/*
>  	 * These loops must be protected from rethook_free_rcu() because those
> @@ -315,7 +315,7 @@ unsigned long rethook_trampoline_handler(struct pt_regs *regs,
>  		handler = READ_ONCE(rhn->rethook->handler);
>  		if (handler)
>  			handler(rhn, rhn->rethook->data,
> -				correct_ret_addr, regs);
> +				correct_ret_addr, fregs);
>  
>  		if (first == node)
>  			break;
> @@ -323,7 +323,7 @@ unsigned long rethook_trampoline_handler(struct pt_regs *regs,
>  	}
>  
>  	/* Fixup registers for returning to correct address. */
> -	arch_rethook_fixup_return(regs, correct_ret_addr);
> +	arch_rethook_fixup_return(fregs, correct_ret_addr);
>  
>  	/* Unlink used shadow stack */
>  	first = current->rethooks.first;
> diff --git a/kernel/trace/trace_fprobe.c b/kernel/trace/trace_fprobe.c
> index 71bf38d698f1..c60d0d9f1a95 100644
> --- a/kernel/trace/trace_fprobe.c
> +++ b/kernel/trace/trace_fprobe.c
> @@ -341,10 +341,14 @@ static int fentry_dispatcher(struct fprobe *fp, unsigned long entry_ip,
>  NOKPROBE_SYMBOL(fentry_dispatcher);
>  
>  static void fexit_dispatcher(struct fprobe *fp, unsigned long entry_ip,
> -			     unsigned long ret_ip, struct pt_regs *regs,
> +			     unsigned long ret_ip, struct ftrace_regs *fregs,
>  			     void *entry_data)
>  {
>  	struct trace_fprobe *tf = container_of(fp, struct trace_fprobe, fp);
> +	struct pt_regs *regs = ftrace_get_regs(fregs);
> +
> +	if (!regs)
> +		return;
>  
>  	if (trace_probe_test_flag(&tf->tp, TP_FLAG_TRACE))
>  		fexit_trace_func(tf, entry_ip, ret_ip, regs);
> diff --git a/lib/test_fprobe.c b/lib/test_fprobe.c
> index ff607babba18..d1e80653bf0c 100644
> --- a/lib/test_fprobe.c
> +++ b/lib/test_fprobe.c
> @@ -59,9 +59,9 @@ static notrace int fp_entry_handler(struct fprobe *fp, unsigned long ip,
>  
>  static notrace void fp_exit_handler(struct fprobe *fp, unsigned long ip,
>  				    unsigned long ret_ip,
> -				    struct pt_regs *regs, void *data)
> +				    struct ftrace_regs *fregs, void *data)
>  {
> -	unsigned long ret = regs_return_value(regs);
> +	unsigned long ret = ftrace_regs_return_value(fregs);
>  
>  	KUNIT_EXPECT_FALSE(current_test, preemptible());
>  	if (ip != target_ip) {
> @@ -89,7 +89,7 @@ static notrace int nest_entry_handler(struct fprobe *fp, unsigned long ip,
>  
>  static notrace void nest_exit_handler(struct fprobe *fp, unsigned long ip,
>  				      unsigned long ret_ip,
> -				      struct pt_regs *regs, void *data)
> +				      struct ftrace_regs *fregs, void *data)
>  {
>  	KUNIT_EXPECT_FALSE(current_test, preemptible());
>  	KUNIT_EXPECT_EQ(current_test, ip, target_nest_ip);
> diff --git a/samples/fprobe/fprobe_example.c b/samples/fprobe/fprobe_example.c
> index 1545a1aac616..d476d1f07538 100644
> --- a/samples/fprobe/fprobe_example.c
> +++ b/samples/fprobe/fprobe_example.c
> @@ -67,7 +67,7 @@ static int sample_entry_handler(struct fprobe *fp, unsigned long ip,
>  }
>  
>  static void sample_exit_handler(struct fprobe *fp, unsigned long ip,
> -				unsigned long ret_ip, struct pt_regs *regs,
> +				unsigned long ret_ip, struct ftrace_regs *regs,
>  				void *data)
>  {
>  	unsigned long rip = ret_ip;
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook
  2023-09-04 13:40   ` Masami Hiramatsu
@ 2023-09-05  7:17     ` Sven Schnelle
  2023-09-05 13:36       ` Masami Hiramatsu
  0 siblings, 1 reply; 32+ messages in thread
From: Sven Schnelle @ 2023-09-05  7:17 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Alexei Starovoitov, Steven Rostedt, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

Masami Hiramatsu (Google) <mhiramat@kernel.org> writes:

> I found that this is not enough becuase s390/loongarch already implemented
> their rethook, and as far as I can see, the s390 ftrace_regs does not save
> the required registers for rethook. Thus, for such architecture, we need
> another kconfig flag and keep using the pt_regs for rethook.

Looking into arch_rethook_trampoline() i think we save all required
registers - which register do you think are missing? Or is there another
function i should look at?

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook
  2023-09-05  7:17     ` Sven Schnelle
@ 2023-09-05 13:36       ` Masami Hiramatsu
  2023-09-05 16:30         ` Steven Rostedt
  2023-09-06  6:49         ` Sven Schnelle
  0 siblings, 2 replies; 32+ messages in thread
From: Masami Hiramatsu @ 2023-09-05 13:36 UTC (permalink / raw)
  To: Sven Schnelle
  Cc: Alexei Starovoitov, Steven Rostedt, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

On Tue, 05 Sep 2023 09:17:02 +0200
Sven Schnelle <svens@linux.ibm.com> wrote:

> Masami Hiramatsu (Google) <mhiramat@kernel.org> writes:
> 
> > I found that this is not enough becuase s390/loongarch already implemented
> > their rethook, and as far as I can see, the s390 ftrace_regs does not save
> > the required registers for rethook. Thus, for such architecture, we need
> > another kconfig flag and keep using the pt_regs for rethook.
> 
> Looking into arch_rethook_trampoline() i think we save all required
> registers - which register do you think are missing? Or is there another
> function i should look at?

Yes, arch_rethook_trampoline() is good. It needs to save all registers.

In this series, I'm trying to change the pt_regs with ftrace_regs which will
reduce trampoline overhead if DYNAMIC_FTRACE_WITH_ARGS=y.

kprobe -> (pt_regs) -> rethook_try_hook()
fprobe -> (ftrace_regs) -> rethook_try_hook_ftrace() # new function

Thus, we need to ensure that the ftrace_regs which is saved in the ftrace
*without* FTRACE_WITH_REGS flags, can be used for hooking the function
return. I saw;

void arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mcount)
{
        rh->ret_addr = regs->gprs[14];
        rh->frame = regs->gprs[15];

        /* Replace the return addr with trampoline addr */
        regs->gprs[14] = (unsigned long)&arch_rethook_trampoline;
}

gprs[15] is a stack pointer, so it is saved in ftrace_regs too, but what about
gprs[14]? (I guess it is a link register)
We need to read the gprs[14] and ensure that is restored to gpr14 when the
ftrace is exit even without FTRACE_WITH_REGS flag.

IOW, it is ftrace save regs/restore regs code issue. I need to check how the
function_graph implements it.

Thank you,

-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook
  2023-09-05 13:36       ` Masami Hiramatsu
@ 2023-09-05 16:30         ` Steven Rostedt
  2023-09-06  0:06           ` Masami Hiramatsu
  2023-09-06  6:49         ` Sven Schnelle
  1 sibling, 1 reply; 32+ messages in thread
From: Steven Rostedt @ 2023-09-05 16:30 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Sven Schnelle, Alexei Starovoitov, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

On Tue, 5 Sep 2023 22:36:33 +0900
Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:

> Yes, arch_rethook_trampoline() is good. It needs to save all registers.
> 
> In this series, I'm trying to change the pt_regs with ftrace_regs which will
> reduce trampoline overhead if DYNAMIC_FTRACE_WITH_ARGS=y.
> 
> kprobe -> (pt_regs) -> rethook_try_hook()
> fprobe -> (ftrace_regs) -> rethook_try_hook_ftrace() # new function
> 
> Thus, we need to ensure that the ftrace_regs which is saved in the ftrace
> *without* FTRACE_WITH_REGS flags, can be used for hooking the function
> return. I saw;
> 
> void arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mcount)
> {
>         rh->ret_addr = regs->gprs[14];
>         rh->frame = regs->gprs[15];
> 
>         /* Replace the return addr with trampoline addr */
>         regs->gprs[14] = (unsigned long)&arch_rethook_trampoline;
> }
> 
> gprs[15] is a stack pointer, so it is saved in ftrace_regs too, but what about
> gprs[14]? (I guess it is a link register)
> We need to read the gprs[14] and ensure that is restored to gpr14 when the
> ftrace is exit even without FTRACE_WITH_REGS flag.
> 
> IOW, it is ftrace save regs/restore regs code issue. I need to check how the
> function_graph implements it.

I would argue that the link register should also be saved in ftrace_regs.

The thing that ftrace_regs is not suppose to save is the general purpose
registers.

-- Steve

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 5/9] ftrace: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
  2023-08-26  1:56     ` Masami Hiramatsu
@ 2023-09-05 19:50       ` Andrii Nakryiko
  2023-09-06  0:28         ` Masami Hiramatsu
  0 siblings, 1 reply; 32+ messages in thread
From: Andrii Nakryiko @ 2023-09-05 19:50 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Alexei Starovoitov, Steven Rostedt, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

On Fri, Aug 25, 2023 at 6:56 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Fri, 25 Aug 2023 14:49:48 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > On Wed, Aug 23, 2023 at 8:16 AM Masami Hiramatsu (Google)
> > <mhiramat@kernel.org> wrote:
> > >
> > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > >
> > > Add ftrace_partial_regs() which converts the ftrace_regs to pt_regs.
> > > If the architecture defines its own ftrace_regs, this copies partial
> > > registers to pt_regs and returns it. If not, ftrace_regs is the same as
> > > pt_regs and ftrace_partial_regs() will return ftrace_regs::regs.
> > >
> > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > > Acked-by: Florent Revest <revest@chromium.org>
> > > ---
> > >  Changes in v3:
> > >   - Fix to use pt_regs::regs instead of x.
> > >   - Return ftrace_regs::regs forcibly if HAVE_PT_REGS_COMPAT_FTRACE_REGS=y.
> > >   - Fix typo.
> > >   - Fix to copy correct registers to the pt_regs on arm64.
> > >  Changes in v4:
> > >   - Change the patch order in the series so that fprobe event can use this.
> > > ---
> > >  arch/arm64/include/asm/ftrace.h |   11 +++++++++++
> > >  include/linux/ftrace.h          |   17 +++++++++++++++++
> > >  2 files changed, 28 insertions(+)
> > >
> > > diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> > > index ab158196480c..5ad24f315d52 100644
> > > --- a/arch/arm64/include/asm/ftrace.h
> > > +++ b/arch/arm64/include/asm/ftrace.h
> > > @@ -137,6 +137,17 @@ ftrace_override_function_with_return(struct ftrace_regs *fregs)
> > >         fregs->pc = fregs->lr;
> > >  }
> > >
> > > +static __always_inline struct pt_regs *
> > > +ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
> > > +{
> > > +       memcpy(regs->regs, fregs->regs, sizeof(u64) * 9);
> > > +       regs->sp = fregs->sp;
> > > +       regs->pc = fregs->pc;
> > > +       regs->regs[29] = fregs->fp;
> > > +       regs->regs[30] = fregs->lr;
> >
> > I see that orig_x0 from pt_regs is used on arm64 to get syscall's
> > first parameter. And it seems like this ftrace_regs to pt_regs
> > conversion doesn't touch orig_x0 at all. Is it maintained/preserved
> > somewhere else, or will we lose the ability to get the first syscall's
> > argument?
>
> Thanks for checking it!
>
> Does BPF uses kprobe probe to trace syscalls? Since we have raw_syscall
> trace events, no need to use kprobe to do that. (and I don't recommend to
> use kprobe to do such fixed event)

Yeah, lots of tools and projects actually trace syscalls with kprobes.
I don't think there is anything we can do to quickly change that, so
we should avoid breaking all of them.

So getting back to my original question, is it possible to preserve orig_x0?

>
> >
> > Looking at libbpf's bpf_tracing.h, other than orig_x0, I think all the
> > other registers are still preserved, so this seems to be the only
> > potential problem.
>
> Great!
>
> Thank you,
>
> >
> >
> > > +       return regs;
> > > +}
> > > +
> > >  int ftrace_regs_query_register_offset(const char *name);
> > >
> > >  int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
> > > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> > > index c0a42d0860b8..a6ed2aa71efc 100644
> > > --- a/include/linux/ftrace.h
> > > +++ b/include/linux/ftrace.h
> > > @@ -165,6 +165,23 @@ static __always_inline struct pt_regs *ftrace_get_regs(struct ftrace_regs *fregs
> > >         return arch_ftrace_get_regs(fregs);
> > >  }
> > >
> > > +#if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) || \
> > > +       defined(CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST)
> > > +
> > > +static __always_inline struct pt_regs *
> > > +ftrace_partial_regs(struct ftrace_regs *fregs, struct pt_regs *regs)
> > > +{
> > > +       /*
> > > +        * If CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST=y, ftrace_regs memory
> > > +        * layout is the same as pt_regs. So always returns that address.
> > > +        * Since arch_ftrace_get_regs() will check some members and may return
> > > +        * NULL, we can not use it.
> > > +        */
> > > +       return &fregs->regs;
> > > +}
> > > +
> > > +#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
> > > +
> > >  /*
> > >   * When true, the ftrace_regs_{get,set}_*() functions may be used on fregs.
> > >   * Note: this can be true even when ftrace_get_regs() cannot provide a pt_regs.
> > >
> > >
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook
  2023-09-05 16:30         ` Steven Rostedt
@ 2023-09-06  0:06           ` Masami Hiramatsu
  0 siblings, 0 replies; 32+ messages in thread
From: Masami Hiramatsu @ 2023-09-06  0:06 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Sven Schnelle, Alexei Starovoitov, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

On Tue, 5 Sep 2023 12:30:58 -0400
Steven Rostedt <rostedt@goodmis.org> wrote:

> On Tue, 5 Sep 2023 22:36:33 +0900
> Masami Hiramatsu (Google) <mhiramat@kernel.org> wrote:
> 
> > Yes, arch_rethook_trampoline() is good. It needs to save all registers.
> > 
> > In this series, I'm trying to change the pt_regs with ftrace_regs which will
> > reduce trampoline overhead if DYNAMIC_FTRACE_WITH_ARGS=y.
> > 
> > kprobe -> (pt_regs) -> rethook_try_hook()
> > fprobe -> (ftrace_regs) -> rethook_try_hook_ftrace() # new function
> > 
> > Thus, we need to ensure that the ftrace_regs which is saved in the ftrace
> > *without* FTRACE_WITH_REGS flags, can be used for hooking the function
> > return. I saw;
> > 
> > void arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mcount)
> > {
> >         rh->ret_addr = regs->gprs[14];
> >         rh->frame = regs->gprs[15];
> > 
> >         /* Replace the return addr with trampoline addr */
> >         regs->gprs[14] = (unsigned long)&arch_rethook_trampoline;
> > }
> > 
> > gprs[15] is a stack pointer, so it is saved in ftrace_regs too, but what about
> > gprs[14]? (I guess it is a link register)
> > We need to read the gprs[14] and ensure that is restored to gpr14 when the
> > ftrace is exit even without FTRACE_WITH_REGS flag.
> > 
> > IOW, it is ftrace save regs/restore regs code issue. I need to check how the
> > function_graph implements it.
> 
> I would argue that the link register should also be saved in ftrace_regs.
> 
> The thing that ftrace_regs is not suppose to save is the general purpose
> registers.

Let me confirm that if ftrace_regs user changes a member of the ftrace_regs,
is that restored to the actual register when exits the ftrace too?

On x86, we just tweak the stack on memory, so I'm sure that that change is
effective, but not sure on other arch. Ah, but function_graph may also need it.

Thank you,

> 
> -- Steve


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 5/9] ftrace: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
  2023-09-05 19:50       ` Andrii Nakryiko
@ 2023-09-06  0:28         ` Masami Hiramatsu
  2023-09-08 22:56           ` Andrii Nakryiko
  0 siblings, 1 reply; 32+ messages in thread
From: Masami Hiramatsu @ 2023-09-06  0:28 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Steven Rostedt, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

On Tue, 5 Sep 2023 12:50:28 -0700
Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:

> On Fri, Aug 25, 2023 at 6:56 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> >
> > On Fri, 25 Aug 2023 14:49:48 -0700
> > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > > On Wed, Aug 23, 2023 at 8:16 AM Masami Hiramatsu (Google)
> > > <mhiramat@kernel.org> wrote:
> > > >
> > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > > >
> > > > Add ftrace_partial_regs() which converts the ftrace_regs to pt_regs.
> > > > If the architecture defines its own ftrace_regs, this copies partial
> > > > registers to pt_regs and returns it. If not, ftrace_regs is the same as
> > > > pt_regs and ftrace_partial_regs() will return ftrace_regs::regs.
> > > >
> > > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > > > Acked-by: Florent Revest <revest@chromium.org>
> > > > ---
> > > >  Changes in v3:
> > > >   - Fix to use pt_regs::regs instead of x.
> > > >   - Return ftrace_regs::regs forcibly if HAVE_PT_REGS_COMPAT_FTRACE_REGS=y.
> > > >   - Fix typo.
> > > >   - Fix to copy correct registers to the pt_regs on arm64.
> > > >  Changes in v4:
> > > >   - Change the patch order in the series so that fprobe event can use this.
> > > > ---
> > > >  arch/arm64/include/asm/ftrace.h |   11 +++++++++++
> > > >  include/linux/ftrace.h          |   17 +++++++++++++++++
> > > >  2 files changed, 28 insertions(+)
> > > >
> > > > diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> > > > index ab158196480c..5ad24f315d52 100644
> > > > --- a/arch/arm64/include/asm/ftrace.h
> > > > +++ b/arch/arm64/include/asm/ftrace.h
> > > > @@ -137,6 +137,17 @@ ftrace_override_function_with_return(struct ftrace_regs *fregs)
> > > >         fregs->pc = fregs->lr;
> > > >  }
> > > >
> > > > +static __always_inline struct pt_regs *
> > > > +ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
> > > > +{
> > > > +       memcpy(regs->regs, fregs->regs, sizeof(u64) * 9);
> > > > +       regs->sp = fregs->sp;
> > > > +       regs->pc = fregs->pc;
> > > > +       regs->regs[29] = fregs->fp;
> > > > +       regs->regs[30] = fregs->lr;
> > >
> > > I see that orig_x0 from pt_regs is used on arm64 to get syscall's
> > > first parameter. And it seems like this ftrace_regs to pt_regs
> > > conversion doesn't touch orig_x0 at all. Is it maintained/preserved
> > > somewhere else, or will we lose the ability to get the first syscall's
> > > argument?
> >
> > Thanks for checking it!
> >
> > Does BPF uses kprobe probe to trace syscalls? Since we have raw_syscall
> > trace events, no need to use kprobe to do that. (and I don't recommend to
> > use kprobe to do such fixed event)
> 
> Yeah, lots of tools and projects actually trace syscalls with kprobes.
> I don't think there is anything we can do to quickly change that, so
> we should avoid breaking all of them.

Yes, ah, but anyway, this is the fprobe case, not kprobe. Do you use
multi_kprobes for tracing syscalls? 

Jiri, do you know when the multi-kprobe feature is used? Is that used
implicitly or explicitly?

> 
> So getting back to my original question, is it possible to preserve orig_x0?

I'm curious that the orig_x0 is stored to pt_regs of the kprobes itself
because it is not a real register. There is no way to access it. You can
use regs->x0 instead of that.

I think the orig_x0 is stored when the syscall happened because it has
another user pt_regs on the stack, right?

If so, we don't need to save orig_x0 on the pt_regs for kprobes, but user can
dig the stack to find the orig_x0. Here the arm64, syscall entry handler,

static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
                           const syscall_fn_t syscall_table[])
{
        unsigned long flags = read_thread_flags();

        regs->orig_x0 = regs->regs[0];
        regs->syscallno = scno;

(BTW, it seems syscall number is saved on regs->syscallno.)

It seems that if you probe the el0_svc_common() for tracing syscall,
all you need is tracing $arg1 == pt_regs and $arg2 == syscall number.

Thank you,

> 
> >
> > >
> > > Looking at libbpf's bpf_tracing.h, other than orig_x0, I think all the
> > > other registers are still preserved, so this seems to be the only
> > > potential problem.
> >
> > Great!
> >
> > Thank you,
> >
> > >
> > >
> > > > +       return regs;
> > > > +}
> > > > +
> > > >  int ftrace_regs_query_register_offset(const char *name);
> > > >
> > > >  int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
> > > > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> > > > index c0a42d0860b8..a6ed2aa71efc 100644
> > > > --- a/include/linux/ftrace.h
> > > > +++ b/include/linux/ftrace.h
> > > > @@ -165,6 +165,23 @@ static __always_inline struct pt_regs *ftrace_get_regs(struct ftrace_regs *fregs
> > > >         return arch_ftrace_get_regs(fregs);
> > > >  }
> > > >
> > > > +#if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) || \
> > > > +       defined(CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST)
> > > > +
> > > > +static __always_inline struct pt_regs *
> > > > +ftrace_partial_regs(struct ftrace_regs *fregs, struct pt_regs *regs)
> > > > +{
> > > > +       /*
> > > > +        * If CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST=y, ftrace_regs memory
> > > > +        * layout is the same as pt_regs. So always returns that address.
> > > > +        * Since arch_ftrace_get_regs() will check some members and may return
> > > > +        * NULL, we can not use it.
> > > > +        */
> > > > +       return &fregs->regs;
> > > > +}
> > > > +
> > > > +#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
> > > > +
> > > >  /*
> > > >   * When true, the ftrace_regs_{get,set}_*() functions may be used on fregs.
> > > >   * Note: this can be true even when ftrace_get_regs() cannot provide a pt_regs.
> > > >
> > > >
> >
> >
> > --
> > Masami Hiramatsu (Google) <mhiramat@kernel.org>


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook
  2023-09-05 13:36       ` Masami Hiramatsu
  2023-09-05 16:30         ` Steven Rostedt
@ 2023-09-06  6:49         ` Sven Schnelle
  2023-09-09 14:24           ` Masami Hiramatsu
  1 sibling, 1 reply; 32+ messages in thread
From: Sven Schnelle @ 2023-09-06  6:49 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Alexei Starovoitov, Steven Rostedt, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

Hi Masami,

Masami Hiramatsu (Google) <mhiramat@kernel.org> writes:

> Thus, we need to ensure that the ftrace_regs which is saved in the ftrace
> *without* FTRACE_WITH_REGS flags, can be used for hooking the function
> return. I saw;
>
> void arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mcount)
> {
>         rh->ret_addr = regs->gprs[14];
>         rh->frame = regs->gprs[15];
>
>         /* Replace the return addr with trampoline addr */
>         regs->gprs[14] = (unsigned long)&arch_rethook_trampoline;
> }
>
> gprs[15] is a stack pointer, so it is saved in ftrace_regs too, but what about
> gprs[14]? (I guess it is a link register)
> We need to read the gprs[14] and ensure that is restored to gpr14 when the
> ftrace is exit even without FTRACE_WITH_REGS flag.
>
> IOW, it is ftrace save regs/restore regs code issue. I need to check how the
> function_graph implements it.

gpr2-gpr14 are always saved in ftrace_caller/ftrace_regs_caller(),
regardless of the FTRACE_WITH_REGS flags. The only difference is that
without the FTRACE_WITH_REGS flag the program status word (psw) is not
saved because collecting that is a rather expensive operation.

I used the following commands to test rethook (is that the correct
testcase?)

#!/bin/bash
cd /sys/kernel/tracing

echo 'r:icmp_rcv icmp_rcv' >kprobe_events
echo 1 >events/kprobes/icmp_rcv/enable
ping -c 1 127.0.0.1
cat trace

which gave me:

ping-686     [001] ..s1.    96.890817: icmp_rcv: (ip_protocol_deliver_rcu+0x42/0x218 <- icmp_rcv)

I applied the following patch on top of your patches to make it compile,
and rethook still seems to work:

commit dab51b0a5b885660630433ac89f8e64a2de0eb86
Author: Sven Schnelle <svens@linux.ibm.com>
Date:   Wed Sep 6 08:06:23 2023 +0200

    rethook wip
    
    Signed-off-by: Sven Schnelle <svens@linux.ibm.com>

diff --git a/arch/s390/kernel/rethook.c b/arch/s390/kernel/rethook.c
index af10e6bdd34e..4e86c0a1a064 100644
--- a/arch/s390/kernel/rethook.c
+++ b/arch/s390/kernel/rethook.c
@@ -3,8 +3,9 @@
 #include <linux/kprobes.h>
 #include "rethook.h"
 
-void arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mcount)
+void arch_rethook_prepare(struct rethook_node *rh, struct ftrace_regs *fregs, bool mcount)
 {
+	struct pt_regs *regs = (struct pt_regs *)fregs;
 	rh->ret_addr = regs->gprs[14];
 	rh->frame = regs->gprs[15];
 
@@ -13,10 +14,11 @@ void arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mc
 }
 NOKPROBE_SYMBOL(arch_rethook_prepare);
 
-void arch_rethook_fixup_return(struct pt_regs *regs,
+void arch_rethook_fixup_return(struct ftrace_regs *fregs,
 			       unsigned long correct_ret_addr)
 {
 	/* Replace fake return address with real one. */
+	struct pt_regs *regs = (struct pt_regs *)fregs;
 	regs->gprs[14] = correct_ret_addr;
 }
 NOKPROBE_SYMBOL(arch_rethook_fixup_return);
@@ -24,9 +26,9 @@ NOKPROBE_SYMBOL(arch_rethook_fixup_return);
 /*
  * Called from arch_rethook_trampoline
  */
-unsigned long arch_rethook_trampoline_callback(struct pt_regs *regs)
+unsigned long arch_rethook_trampoline_callback(struct ftrace_regs *fregs)
 {
-	return rethook_trampoline_handler(regs, regs->gprs[15]);
+	return rethook_trampoline_handler(fregs, fregs->regs.gprs[15]);
 }
 NOKPROBE_SYMBOL(arch_rethook_trampoline_callback);
 
diff --git a/arch/s390/kernel/rethook.h b/arch/s390/kernel/rethook.h
index 32f069eed3f3..0fe62424fc78 100644
--- a/arch/s390/kernel/rethook.h
+++ b/arch/s390/kernel/rethook.h
@@ -2,6 +2,6 @@
 #ifndef __S390_RETHOOK_H
 #define __S390_RETHOOK_H
 
-unsigned long arch_rethook_trampoline_callback(struct pt_regs *regs);
+unsigned long arch_rethook_trampoline_callback(struct ftrace_regs *fregs);
 
 #endif


^ permalink raw reply related	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 5/9] ftrace: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs
  2023-09-06  0:28         ` Masami Hiramatsu
@ 2023-09-08 22:56           ` Andrii Nakryiko
  0 siblings, 0 replies; 32+ messages in thread
From: Andrii Nakryiko @ 2023-09-08 22:56 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Alexei Starovoitov, Steven Rostedt, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf, Sven Schnelle,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

On Tue, Sep 5, 2023 at 5:28 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
>
> On Tue, 5 Sep 2023 12:50:28 -0700
> Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>
> > On Fri, Aug 25, 2023 at 6:56 PM Masami Hiramatsu <mhiramat@kernel.org> wrote:
> > >
> > > On Fri, 25 Aug 2023 14:49:48 -0700
> > > Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> > >
> > > > On Wed, Aug 23, 2023 at 8:16 AM Masami Hiramatsu (Google)
> > > > <mhiramat@kernel.org> wrote:
> > > > >
> > > > > From: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > > > >
> > > > > Add ftrace_partial_regs() which converts the ftrace_regs to pt_regs.
> > > > > If the architecture defines its own ftrace_regs, this copies partial
> > > > > registers to pt_regs and returns it. If not, ftrace_regs is the same as
> > > > > pt_regs and ftrace_partial_regs() will return ftrace_regs::regs.
> > > > >
> > > > > Signed-off-by: Masami Hiramatsu (Google) <mhiramat@kernel.org>
> > > > > Acked-by: Florent Revest <revest@chromium.org>
> > > > > ---
> > > > >  Changes in v3:
> > > > >   - Fix to use pt_regs::regs instead of x.
> > > > >   - Return ftrace_regs::regs forcibly if HAVE_PT_REGS_COMPAT_FTRACE_REGS=y.
> > > > >   - Fix typo.
> > > > >   - Fix to copy correct registers to the pt_regs on arm64.
> > > > >  Changes in v4:
> > > > >   - Change the patch order in the series so that fprobe event can use this.
> > > > > ---
> > > > >  arch/arm64/include/asm/ftrace.h |   11 +++++++++++
> > > > >  include/linux/ftrace.h          |   17 +++++++++++++++++
> > > > >  2 files changed, 28 insertions(+)
> > > > >
> > > > > diff --git a/arch/arm64/include/asm/ftrace.h b/arch/arm64/include/asm/ftrace.h
> > > > > index ab158196480c..5ad24f315d52 100644
> > > > > --- a/arch/arm64/include/asm/ftrace.h
> > > > > +++ b/arch/arm64/include/asm/ftrace.h
> > > > > @@ -137,6 +137,17 @@ ftrace_override_function_with_return(struct ftrace_regs *fregs)
> > > > >         fregs->pc = fregs->lr;
> > > > >  }
> > > > >
> > > > > +static __always_inline struct pt_regs *
> > > > > +ftrace_partial_regs(const struct ftrace_regs *fregs, struct pt_regs *regs)
> > > > > +{
> > > > > +       memcpy(regs->regs, fregs->regs, sizeof(u64) * 9);
> > > > > +       regs->sp = fregs->sp;
> > > > > +       regs->pc = fregs->pc;
> > > > > +       regs->regs[29] = fregs->fp;
> > > > > +       regs->regs[30] = fregs->lr;
> > > >
> > > > I see that orig_x0 from pt_regs is used on arm64 to get syscall's
> > > > first parameter. And it seems like this ftrace_regs to pt_regs
> > > > conversion doesn't touch orig_x0 at all. Is it maintained/preserved
> > > > somewhere else, or will we lose the ability to get the first syscall's
> > > > argument?
> > >
> > > Thanks for checking it!
> > >
> > > Does BPF uses kprobe probe to trace syscalls? Since we have raw_syscall
> > > trace events, no need to use kprobe to do that. (and I don't recommend to
> > > use kprobe to do such fixed event)
> >
> > Yeah, lots of tools and projects actually trace syscalls with kprobes.
> > I don't think there is anything we can do to quickly change that, so
> > we should avoid breaking all of them.
>
> Yes, ah, but anyway, this is the fprobe case, not kprobe. Do you use
> multi_kprobes for tracing syscalls?
>
> Jiri, do you know when the multi-kprobe feature is used? Is that used
> implicitly or explicitly?

ah, ok, so all this fprobe machinery is not used for (single-)kprobes?
This makes it a bit less painful for end users, I believe most syscall
tracing kprobes right now are not multi-kprobes.

>
> >
> > So getting back to my original question, is it possible to preserve orig_x0?
>
> I'm curious that the orig_x0 is stored to pt_regs of the kprobes itself
> because it is not a real register. There is no way to access it. You can
> use regs->x0 instead of that.
>
> I think the orig_x0 is stored when the syscall happened because it has
> another user pt_regs on the stack, right?
>
> If so, we don't need to save orig_x0 on the pt_regs for kprobes, but user can
> dig the stack to find the orig_x0. Here the arm64, syscall entry handler,
>
> static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
>                            const syscall_fn_t syscall_table[])
> {
>         unsigned long flags = read_thread_flags();
>
>         regs->orig_x0 = regs->regs[0];
>         regs->syscallno = scno;
>
> (BTW, it seems syscall number is saved on regs->syscallno.)
>
> It seems that if you probe the el0_svc_common() for tracing syscall,
> all you need is tracing $arg1 == pt_regs and $arg2 == syscall number.
>
> Thank you,
>
> >
> > >
> > > >
> > > > Looking at libbpf's bpf_tracing.h, other than orig_x0, I think all the
> > > > other registers are still preserved, so this seems to be the only
> > > > potential problem.
> > >
> > > Great!
> > >
> > > Thank you,
> > >
> > > >
> > > >
> > > > > +       return regs;
> > > > > +}
> > > > > +
> > > > >  int ftrace_regs_query_register_offset(const char *name);
> > > > >
> > > > >  int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec);
> > > > > diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> > > > > index c0a42d0860b8..a6ed2aa71efc 100644
> > > > > --- a/include/linux/ftrace.h
> > > > > +++ b/include/linux/ftrace.h
> > > > > @@ -165,6 +165,23 @@ static __always_inline struct pt_regs *ftrace_get_regs(struct ftrace_regs *fregs
> > > > >         return arch_ftrace_get_regs(fregs);
> > > > >  }
> > > > >
> > > > > +#if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) || \
> > > > > +       defined(CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST)
> > > > > +
> > > > > +static __always_inline struct pt_regs *
> > > > > +ftrace_partial_regs(struct ftrace_regs *fregs, struct pt_regs *regs)
> > > > > +{
> > > > > +       /*
> > > > > +        * If CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST=y, ftrace_regs memory
> > > > > +        * layout is the same as pt_regs. So always returns that address.
> > > > > +        * Since arch_ftrace_get_regs() will check some members and may return
> > > > > +        * NULL, we can not use it.
> > > > > +        */
> > > > > +       return &fregs->regs;
> > > > > +}
> > > > > +
> > > > > +#endif /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS || CONFIG_HAVE_PT_REGS_TO_FTRACE_REGS_CAST */
> > > > > +
> > > > >  /*
> > > > >   * When true, the ftrace_regs_{get,set}_*() functions may be used on fregs.
> > > > >   * Note: this can be true even when ftrace_get_regs() cannot provide a pt_regs.
> > > > >
> > > > >
> > >
> > >
> > > --
> > > Masami Hiramatsu (Google) <mhiramat@kernel.org>
>
>
> --
> Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook
  2023-09-06  6:49         ` Sven Schnelle
@ 2023-09-09 14:24           ` Masami Hiramatsu
  2023-09-11  7:55             ` Sven Schnelle
  0 siblings, 1 reply; 32+ messages in thread
From: Masami Hiramatsu @ 2023-09-09 14:24 UTC (permalink / raw)
  To: Sven Schnelle
  Cc: Alexei Starovoitov, Steven Rostedt, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

Hi Sven,

On Wed, 06 Sep 2023 08:49:11 +0200
Sven Schnelle <svens@linux.ibm.com> wrote:

> Hi Masami,
> 
> Masami Hiramatsu (Google) <mhiramat@kernel.org> writes:
> 
> > Thus, we need to ensure that the ftrace_regs which is saved in the ftrace
> > *without* FTRACE_WITH_REGS flags, can be used for hooking the function
> > return. I saw;
> >
> > void arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mcount)
> > {
> >         rh->ret_addr = regs->gprs[14];
> >         rh->frame = regs->gprs[15];
> >
> >         /* Replace the return addr with trampoline addr */
> >         regs->gprs[14] = (unsigned long)&arch_rethook_trampoline;
> > }
> >
> > gprs[15] is a stack pointer, so it is saved in ftrace_regs too, but what about
> > gprs[14]? (I guess it is a link register)
> > We need to read the gprs[14] and ensure that is restored to gpr14 when the
> > ftrace is exit even without FTRACE_WITH_REGS flag.
> >
> > IOW, it is ftrace save regs/restore regs code issue. I need to check how the
> > function_graph implements it.
> 
> gpr2-gpr14 are always saved in ftrace_caller/ftrace_regs_caller(),
> regardless of the FTRACE_WITH_REGS flags. The only difference is that
> without the FTRACE_WITH_REGS flag the program status word (psw) is not
> saved because collecting that is a rather expensive operation.

Thanks for checking that! So s390 will recover those saved registers
even if FTRACE_WITH_REGS flag is not set? (I wonder what is the requirement
of the ftrace_regs when returning from ftrace_call() without
FTRACE_WITH_REGS?)

> 
> I used the following commands to test rethook (is that the correct
> testcase?)
> 
> #!/bin/bash
> cd /sys/kernel/tracing
> 
> echo 'r:icmp_rcv icmp_rcv' >kprobe_events
> echo 1 >events/kprobes/icmp_rcv/enable
> ping -c 1 127.0.0.1
> cat trace

No, the kprobe will path pt_regs to rethook.
Cna you run

echo "f:icmp_rcv%return icmp_rcv" >> dynamic_events

instead of kprobe_events?

Thank you,

> 
> which gave me:
> 
> ping-686     [001] ..s1.    96.890817: icmp_rcv: (ip_protocol_deliver_rcu+0x42/0x218 <- icmp_rcv)
> 
> I applied the following patch on top of your patches to make it compile,
> and rethook still seems to work:
> 
> commit dab51b0a5b885660630433ac89f8e64a2de0eb86
> Author: Sven Schnelle <svens@linux.ibm.com>
> Date:   Wed Sep 6 08:06:23 2023 +0200
> 
>     rethook wip
>     
>     Signed-off-by: Sven Schnelle <svens@linux.ibm.com>
> 
> diff --git a/arch/s390/kernel/rethook.c b/arch/s390/kernel/rethook.c
> index af10e6bdd34e..4e86c0a1a064 100644
> --- a/arch/s390/kernel/rethook.c
> +++ b/arch/s390/kernel/rethook.c
> @@ -3,8 +3,9 @@
>  #include <linux/kprobes.h>
>  #include "rethook.h"
>  
> -void arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mcount)
> +void arch_rethook_prepare(struct rethook_node *rh, struct ftrace_regs *fregs, bool mcount)
>  {
> +	struct pt_regs *regs = (struct pt_regs *)fregs;
>  	rh->ret_addr = regs->gprs[14];
>  	rh->frame = regs->gprs[15];
>  
> @@ -13,10 +14,11 @@ void arch_rethook_prepare(struct rethook_node *rh, struct pt_regs *regs, bool mc
>  }
>  NOKPROBE_SYMBOL(arch_rethook_prepare);
>  
> -void arch_rethook_fixup_return(struct pt_regs *regs,
> +void arch_rethook_fixup_return(struct ftrace_regs *fregs,
>  			       unsigned long correct_ret_addr)
>  {
>  	/* Replace fake return address with real one. */
> +	struct pt_regs *regs = (struct pt_regs *)fregs;
>  	regs->gprs[14] = correct_ret_addr;
>  }
>  NOKPROBE_SYMBOL(arch_rethook_fixup_return);
> @@ -24,9 +26,9 @@ NOKPROBE_SYMBOL(arch_rethook_fixup_return);
>  /*
>   * Called from arch_rethook_trampoline
>   */
> -unsigned long arch_rethook_trampoline_callback(struct pt_regs *regs)
> +unsigned long arch_rethook_trampoline_callback(struct ftrace_regs *fregs)
>  {
> -	return rethook_trampoline_handler(regs, regs->gprs[15]);
> +	return rethook_trampoline_handler(fregs, fregs->regs.gprs[15]);
>  }
>  NOKPROBE_SYMBOL(arch_rethook_trampoline_callback);
>  
> diff --git a/arch/s390/kernel/rethook.h b/arch/s390/kernel/rethook.h
> index 32f069eed3f3..0fe62424fc78 100644
> --- a/arch/s390/kernel/rethook.h
> +++ b/arch/s390/kernel/rethook.h
> @@ -2,6 +2,6 @@
>  #ifndef __S390_RETHOOK_H
>  #define __S390_RETHOOK_H
>  
> -unsigned long arch_rethook_trampoline_callback(struct pt_regs *regs);
> +unsigned long arch_rethook_trampoline_callback(struct ftrace_regs *fregs);
>  
>  #endif
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook
  2023-09-09 14:24           ` Masami Hiramatsu
@ 2023-09-11  7:55             ` Sven Schnelle
  2023-09-11 14:15               ` Masami Hiramatsu
  0 siblings, 1 reply; 32+ messages in thread
From: Sven Schnelle @ 2023-09-11  7:55 UTC (permalink / raw)
  To: Masami Hiramatsu
  Cc: Alexei Starovoitov, Steven Rostedt, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

Masami Hiramatsu (Google) <mhiramat@kernel.org> writes:

>> > IOW, it is ftrace save regs/restore regs code issue. I need to check how the
>> > function_graph implements it.
>> 
>> gpr2-gpr14 are always saved in ftrace_caller/ftrace_regs_caller(),
>> regardless of the FTRACE_WITH_REGS flags. The only difference is that
>> without the FTRACE_WITH_REGS flag the program status word (psw) is not
>> saved because collecting that is a rather expensive operation.
>
> Thanks for checking that! So s390 will recover those saved registers
> even if FTRACE_WITH_REGS flag is not set? (I wonder what is the requirement
> of the ftrace_regs when returning from ftrace_call() without
> FTRACE_WITH_REGS?)

Yes, it will recover these in all cases.

>> 
>> I used the following commands to test rethook (is that the correct
>> testcase?)
>> 
>> #!/bin/bash
>> cd /sys/kernel/tracing
>> 
>> echo 'r:icmp_rcv icmp_rcv' >kprobe_events
>> echo 1 >events/kprobes/icmp_rcv/enable
>> ping -c 1 127.0.0.1
>> cat trace
>
> No, the kprobe will path pt_regs to rethook.
> Cna you run
>
> echo "f:icmp_rcv%return icmp_rcv" >> dynamic_events

Ah, ok. Seems to work as well:

  ping-481     [001] ..s2.    53.918480: icmp_rcv: (ip_protocol_deliver_rcu+0x42/0x218 <- icmp_rcv)
  ping-481     [001] ..s2.    53.918491: icmp_rcv: (ip_protocol_deliver_rcu+0x42/0x218 <- icmp_rcv)

^ permalink raw reply	[flat|nested] 32+ messages in thread

* Re: [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook
  2023-09-11  7:55             ` Sven Schnelle
@ 2023-09-11 14:15               ` Masami Hiramatsu
  0 siblings, 0 replies; 32+ messages in thread
From: Masami Hiramatsu @ 2023-09-11 14:15 UTC (permalink / raw)
  To: Sven Schnelle
  Cc: Alexei Starovoitov, Steven Rostedt, Florent Revest,
	linux-trace-kernel, LKML, Martin KaFai Lau, bpf,
	Alexei Starovoitov, Jiri Olsa, Arnaldo Carvalho de Melo,
	Daniel Borkmann, Alan Maguire, Mark Rutland, Peter Zijlstra,
	Thomas Gleixner

On Mon, 11 Sep 2023 09:55:09 +0200
Sven Schnelle <svens@linux.ibm.com> wrote:

> Masami Hiramatsu (Google) <mhiramat@kernel.org> writes:
> 
> >> > IOW, it is ftrace save regs/restore regs code issue. I need to check how the
> >> > function_graph implements it.
> >> 
> >> gpr2-gpr14 are always saved in ftrace_caller/ftrace_regs_caller(),
> >> regardless of the FTRACE_WITH_REGS flags. The only difference is that
> >> without the FTRACE_WITH_REGS flag the program status word (psw) is not
> >> saved because collecting that is a rather expensive operation.
> >
> > Thanks for checking that! So s390 will recover those saved registers
> > even if FTRACE_WITH_REGS flag is not set? (I wonder what is the requirement
> > of the ftrace_regs when returning from ftrace_call() without
> > FTRACE_WITH_REGS?)
> 
> Yes, it will recover these in all cases.

Thanks for the confirmation!

> 
> >> 
> >> I used the following commands to test rethook (is that the correct
> >> testcase?)
> >> 
> >> #!/bin/bash
> >> cd /sys/kernel/tracing
> >> 
> >> echo 'r:icmp_rcv icmp_rcv' >kprobe_events
> >> echo 1 >events/kprobes/icmp_rcv/enable
> >> ping -c 1 127.0.0.1
> >> cat trace
> >
> > No, the kprobe will path pt_regs to rethook.
> > Cna you run
> >
> > echo "f:icmp_rcv%return icmp_rcv" >> dynamic_events
> 
> Ah, ok. Seems to work as well:
> 
>   ping-481     [001] ..s2.    53.918480: icmp_rcv: (ip_protocol_deliver_rcu+0x42/0x218 <- icmp_rcv)
>   ping-481     [001] ..s2.    53.918491: icmp_rcv: (ip_protocol_deliver_rcu+0x42/0x218 <- icmp_rcv)

Nice!
OK, then s390 is safe to use ftrace_regs :) 

Thanks!


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 32+ messages in thread

end of thread, other threads:[~2023-09-11 21:17 UTC | newest]

Thread overview: 32+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-23 15:15 [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs Masami Hiramatsu (Google)
2023-08-23 15:15 ` [PATCH v4 1/9] Documentation: probes: Add a new ret_ip callback parameter Masami Hiramatsu (Google)
2023-08-25 16:11   ` Florent Revest
2023-08-23 15:15 ` [PATCH v4 2/9] fprobe: Use fprobe_regs in fprobe entry handler Masami Hiramatsu (Google)
2023-08-25 16:11   ` Florent Revest
2023-08-23 15:16 ` [PATCH v4 3/9] tracing: Expose ftrace_regs regardless of CONFIG_FUNCTION_TRACER Masami Hiramatsu (Google)
2023-08-23 15:16 ` [PATCH v4 4/9] fprobe: rethook: Use ftrace_regs in fprobe exit handler and rethook Masami Hiramatsu (Google)
2023-08-25 16:12   ` Florent Revest
2023-09-04 13:40   ` Masami Hiramatsu
2023-09-05  7:17     ` Sven Schnelle
2023-09-05 13:36       ` Masami Hiramatsu
2023-09-05 16:30         ` Steven Rostedt
2023-09-06  0:06           ` Masami Hiramatsu
2023-09-06  6:49         ` Sven Schnelle
2023-09-09 14:24           ` Masami Hiramatsu
2023-09-11  7:55             ` Sven Schnelle
2023-09-11 14:15               ` Masami Hiramatsu
2023-08-23 15:16 ` [PATCH v4 5/9] ftrace: Add ftrace_partial_regs() for converting ftrace_regs to pt_regs Masami Hiramatsu (Google)
2023-08-25 21:49   ` Andrii Nakryiko
2023-08-26  1:56     ` Masami Hiramatsu
2023-09-05 19:50       ` Andrii Nakryiko
2023-09-06  0:28         ` Masami Hiramatsu
2023-09-08 22:56           ` Andrii Nakryiko
2023-08-23 15:16 ` [PATCH v4 6/9] tracing/fprobe: Enable fprobe events with CONFIG_DYNAMIC_FTRACE_WITH_ARGS Masami Hiramatsu (Google)
2023-08-25 16:12   ` Florent Revest
2023-08-26  3:38     ` Masami Hiramatsu
2023-08-30  7:20   ` Masami Hiramatsu
2023-08-23 15:16 ` [PATCH v4 7/9] bpf: Enable kprobe_multi feature if CONFIG_FPROBE is enabled Masami Hiramatsu (Google)
2023-08-23 15:16 ` [PATCH v4 8/9] Documentations: probes: Update fprobe document to use ftrace_regs Masami Hiramatsu (Google)
2023-08-23 15:17 ` [PATCH v4 9/9] Documentation: tracing: Add a note about argument and retval access Masami Hiramatsu (Google)
2023-08-25 16:12   ` Florent Revest
2023-08-25 16:11 ` [PATCH v4 0/9] bpf: fprobe: rethook: Use ftrace_regs instead of pt_regs Florent Revest

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.