* [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach
From: Jiri Olsa @ 2021-06-05 11:10 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

hi,
the saga continues.. ;-) the previous post is here [1]

After another discussion with Steven, he mentioned that if we fix
the ftrace graph problem with direct functions, he'd be open to
adding a batch interface for direct ftrace functions.

He already had a proof-of-concept fix for that, which I took and
broke up into several changes. I added the ftrace direct batch
interface and the new bpf interface on top of that.

It's not that many patches after all, so I thought having them all
together would help the review, because they are all connected.
However, I can break this up into separate patchsets if necessary.

This patchset contains:

  1) patches (1-4) that fix ftrace graph tracing over functions
     with direct trampolines attached
  2) patches (5-8) that add a batch interface for ftrace direct function
     register/unregister/modify
  3) patches (9-19) that add support to attach a BPF program to multiple
     functions

In a nutshell:

Ad 1) moves the graph tracing setup before the direct trampoline
prepares the stack, so they don't clash.

Ad 2) uses the ftrace_ops interface to register a direct function
with all functions in the ftrace_ops filter.

Ad 3) creates a special program and trampoline type to allow
attachment of multiple functions to a single program.

There are more detailed descriptions in the related changelogs.

I have working bpftrace multi-attachment code on top of this. I briefly
checked retsnoop and I think it could use the new API as well.


Also available at:
  https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
  bpf/batch

thanks,
jirka


[1] https://lore.kernel.org/bpf/20210413121516.1467989-1-jolsa@kernel.org/

---
Jiri Olsa (17):
      x86/ftrace: Remove extra orig rax move
      tracing: Add trampoline/graph selftest
      ftrace: Add ftrace_add_rec_direct function
      ftrace: Add multi direct register/unregister interface
      ftrace: Add multi direct modify interface
      ftrace/samples: Add multi direct interface test module
      bpf, x64: Allow to use caller address from stack
      bpf: Allow to store caller's ip as argument
      bpf: Add support to load multi func tracing program
      bpf: Add bpf_trampoline_alloc function
      bpf: Add support to link multi func tracing program
      libbpf: Add btf__find_by_pattern_kind function
      libbpf: Add support to link multi func tracing program
      selftests/bpf: Add fentry multi func test
      selftests/bpf: Add fexit multi func test
      selftests/bpf: Add fentry/fexit multi func test
      selftests/bpf: Temporary fix for fentry_fexit_multi_test

Steven Rostedt (VMware) (2):
      x86/ftrace: Remove fault protection code in prepare_ftrace_return
      x86/ftrace: Make function graph use ftrace directly

 arch/x86/include/asm/ftrace.h                                    |   9 ++++--
 arch/x86/kernel/ftrace.c                                         |  71 ++++++++++++++++++++++-----------------------
 arch/x86/kernel/ftrace_64.S                                      |  30 +------------------
 arch/x86/net/bpf_jit_comp.c                                      |  31 ++++++++++++++------
 include/linux/bpf.h                                              |  14 +++++++++
 include/linux/ftrace.h                                           |  22 ++++++++++++++
 include/uapi/linux/bpf.h                                         |  12 ++++++++
 kernel/bpf/btf.c                                                 |   5 ++++
 kernel/bpf/syscall.c                                             | 220 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----
 kernel/bpf/trampoline.c                                          |  83 ++++++++++++++++++++++++++++++++++++++---------------
 kernel/bpf/verifier.c                                            |   3 +-
 kernel/trace/fgraph.c                                            |   8 ++++--
 kernel/trace/ftrace.c                                            | 211 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------------
 kernel/trace/trace_selftest.c                                    |  49 ++++++++++++++++++++++++++++++-
 samples/ftrace/Makefile                                          |   1 +
 samples/ftrace/ftrace-direct-multi.c                             |  52 +++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h                                   |  12 ++++++++
 tools/lib/bpf/bpf.c                                              |  11 ++++++-
 tools/lib/bpf/bpf.h                                              |   4 ++-
 tools/lib/bpf/btf.c                                              |  68 +++++++++++++++++++++++++++++++++++++++++++
 tools/lib/bpf/btf.h                                              |   3 ++
 tools/lib/bpf/libbpf.c                                           |  72 ++++++++++++++++++++++++++++++++++++++++++++++
 tools/testing/selftests/bpf/multi_check.h                        |  53 ++++++++++++++++++++++++++++++++++
 tools/testing/selftests/bpf/prog_tests/fentry_fexit_multi_test.c |  52 +++++++++++++++++++++++++++++++++
 tools/testing/selftests/bpf/prog_tests/fentry_multi_test.c       |  43 +++++++++++++++++++++++++++
 tools/testing/selftests/bpf/prog_tests/fexit_multi_test.c        |  44 ++++++++++++++++++++++++++++
 tools/testing/selftests/bpf/progs/fentry_fexit_multi_test.c      |  31 ++++++++++++++++++++
 tools/testing/selftests/bpf/progs/fentry_multi_test.c            |  20 +++++++++++++
 tools/testing/selftests/bpf/progs/fexit_multi_test.c             |  22 ++++++++++++++
 29 files changed, 1121 insertions(+), 135 deletions(-)
 create mode 100644 samples/ftrace/ftrace-direct-multi.c
 create mode 100644 tools/testing/selftests/bpf/multi_check.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/fentry_fexit_multi_test.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/fentry_multi_test.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/fexit_multi_test.c
 create mode 100644 tools/testing/selftests/bpf/progs/fentry_fexit_multi_test.c
 create mode 100644 tools/testing/selftests/bpf/progs/fentry_multi_test.c
 create mode 100644 tools/testing/selftests/bpf/progs/fexit_multi_test.c



* [PATCH 01/19] x86/ftrace: Remove extra orig rax move
From: Jiri Olsa @ 2021-06-05 11:10 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

There's an identical move 2 lines earlier.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 arch/x86/kernel/ftrace_64.S | 1 -
 1 file changed, 1 deletion(-)

diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index 7c273846c687..a8eb084a7a9a 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -251,7 +251,6 @@ SYM_INNER_LABEL(ftrace_regs_call, SYM_L_GLOBAL)
 	 * If ORIG_RAX is anything but zero, make this a call to that.
 	 * See arch_ftrace_set_direct_caller().
 	 */
-	movq ORIG_RAX(%rsp), %rax
 	testq	%rax, %rax
 SYM_INNER_LABEL(ftrace_regs_caller_jmp, SYM_L_GLOBAL)
 	jnz	1f
-- 
2.31.1



* [PATCH 02/19] x86/ftrace: Remove fault protection code in prepare_ftrace_return
From: Jiri Olsa @ 2021-06-05 11:10 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

Removing the fault protection code used when writing return_hooker
to the stack. As Steven noted:

> That protection was there from the beginning due to being "paranoid",
> considering ftrace was bricking network cards. But that protection
> would not have even protected against that.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 arch/x86/kernel/ftrace.c | 38 +++-----------------------------------
 1 file changed, 3 insertions(+), 35 deletions(-)

diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index 1b3ce3b4a2a2..c555624da989 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -625,12 +625,10 @@ int ftrace_disable_ftrace_graph_caller(void)
  * Hook the return address and push it in the stack of return addrs
  * in current thread info.
  */
-void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
+void prepare_ftrace_return(unsigned long ip, unsigned long *parent,
 			   unsigned long frame_pointer)
 {
 	unsigned long return_hooker = (unsigned long)&return_to_handler;
-	unsigned long old;
-	int faulted;
 
 	/*
 	 * When resuming from suspend-to-ram, this function can be indirectly
@@ -650,37 +648,7 @@ void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
 	if (unlikely(atomic_read(&current->tracing_graph_pause)))
 		return;
 
-	/*
-	 * Protect against fault, even if it shouldn't
-	 * happen. This tool is too much intrusive to
-	 * ignore such a protection.
-	 */
-	asm volatile(
-		"1: " _ASM_MOV " (%[parent]), %[old]\n"
-		"2: " _ASM_MOV " %[return_hooker], (%[parent])\n"
-		"   movl $0, %[faulted]\n"
-		"3:\n"
-
-		".section .fixup, \"ax\"\n"
-		"4: movl $1, %[faulted]\n"
-		"   jmp 3b\n"
-		".previous\n"
-
-		_ASM_EXTABLE(1b, 4b)
-		_ASM_EXTABLE(2b, 4b)
-
-		: [old] "=&r" (old), [faulted] "=r" (faulted)
-		: [parent] "r" (parent), [return_hooker] "r" (return_hooker)
-		: "memory"
-	);
-
-	if (unlikely(faulted)) {
-		ftrace_graph_stop();
-		WARN_ON(1);
-		return;
-	}
-
-	if (function_graph_enter(old, self_addr, frame_pointer, parent))
-		*parent = old;
+	if (!function_graph_enter(*parent, ip, frame_pointer, parent))
+		*parent = return_hooker;
 }
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
-- 
2.31.1



* [PATCH 03/19] x86/ftrace: Make function graph use ftrace directly
From: Jiri Olsa @ 2021-06-05 11:10 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

We don't need a special hook for the graph tracer entry point;
instead we can use the graph_ops::func function to install
the return_hooker.

This moves the graph tracing setup _before_ the direct
trampoline prepares the stack, so the return_hooker will
be called when the direct trampoline is finished.

This simplifies the code, because we don't need to take the
direct trampoline setup into account when preparing the graph
tracer hook, and we can allow the function graph tracer on
entries registered with a direct trampoline.

Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 arch/x86/include/asm/ftrace.h |  9 +++++++--
 arch/x86/kernel/ftrace.c      | 37 ++++++++++++++++++++++++++++++++---
 arch/x86/kernel/ftrace_64.S   | 29 +--------------------------
 include/linux/ftrace.h        |  6 ++++++
 kernel/trace/fgraph.c         |  8 +++++---
 5 files changed, 53 insertions(+), 36 deletions(-)

diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
index 9f3130f40807..024d9797646e 100644
--- a/arch/x86/include/asm/ftrace.h
+++ b/arch/x86/include/asm/ftrace.h
@@ -57,6 +57,13 @@ arch_ftrace_get_regs(struct ftrace_regs *fregs)
 
 #define ftrace_instruction_pointer_set(fregs, _ip)	\
 	do { (fregs)->regs.ip = (_ip); } while (0)
+
+struct ftrace_ops;
+#define ftrace_graph_func ftrace_graph_func
+void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
+		       struct ftrace_ops *op, struct ftrace_regs *fregs);
+#else
+#define FTRACE_GRAPH_TRAMP_ADDR FTRACE_GRAPH_ADDR
 #endif
 
 #ifdef CONFIG_DYNAMIC_FTRACE
@@ -65,8 +72,6 @@ struct dyn_arch_ftrace {
 	/* No extra data needed for x86 */
 };
 
-#define FTRACE_GRAPH_TRAMP_ADDR FTRACE_GRAPH_ADDR
-
 #endif /*  CONFIG_DYNAMIC_FTRACE */
 #endif /* __ASSEMBLY__ */
 #endif /* CONFIG_FUNCTION_TRACER */
diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
index c555624da989..804fcc6ef2c7 100644
--- a/arch/x86/kernel/ftrace.c
+++ b/arch/x86/kernel/ftrace.c
@@ -527,7 +527,7 @@ static void *addr_from_call(void *ptr)
 	return ptr + CALL_INSN_SIZE + call.disp;
 }
 
-void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
+void prepare_ftrace_return(unsigned long ip, unsigned long *parent,
 			   unsigned long frame_pointer);
 
 /*
@@ -541,7 +541,8 @@ static void *static_tramp_func(struct ftrace_ops *ops, struct dyn_ftrace *rec)
 	void *ptr;
 
 	if (ops && ops->trampoline) {
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
+#if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) && \
+	defined(CONFIG_FUNCTION_GRAPH_TRACER)
 		/*
 		 * We only know about function graph tracer setting as static
 		 * trampoline.
@@ -589,8 +590,9 @@ void arch_ftrace_trampoline_free(struct ftrace_ops *ops)
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
 
 #ifdef CONFIG_DYNAMIC_FTRACE
-extern void ftrace_graph_call(void);
 
+#ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
+extern void ftrace_graph_call(void);
 static const char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
 {
 	return text_gen_insn(JMP32_INSN_OPCODE, (void *)ip, (void *)addr);
@@ -618,7 +620,17 @@ int ftrace_disable_ftrace_graph_caller(void)
 
 	return ftrace_mod_jmp(ip, &ftrace_stub);
 }
+#else /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS */
+int ftrace_enable_ftrace_graph_caller(void)
+{
+	return 0;
+}
 
+int ftrace_disable_ftrace_graph_caller(void)
+{
+	return 0;
+}
+#endif /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS */
 #endif /* !CONFIG_DYNAMIC_FTRACE */
 
 /*
@@ -629,6 +641,7 @@ void prepare_ftrace_return(unsigned long ip, unsigned long *parent,
 			   unsigned long frame_pointer)
 {
 	unsigned long return_hooker = (unsigned long)&return_to_handler;
+	int bit;
 
 	/*
 	 * When resuming from suspend-to-ram, this function can be indirectly
@@ -648,7 +661,25 @@ void prepare_ftrace_return(unsigned long ip, unsigned long *parent,
 	if (unlikely(atomic_read(&current->tracing_graph_pause)))
 		return;
 
+	bit = ftrace_test_recursion_trylock(ip, *parent);
+	if (bit < 0)
+		return;
+
 	if (!function_graph_enter(*parent, ip, frame_pointer, parent))
 		*parent = return_hooker;
+
+	ftrace_test_recursion_unlock(bit);
+}
+
+#ifdef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
+void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
+		       struct ftrace_ops *op, struct ftrace_regs *fregs)
+{
+	struct pt_regs *regs = &fregs->regs;
+	unsigned long *stack = (unsigned long *)kernel_stack_pointer(regs);
+
+	prepare_ftrace_return(ip, (unsigned long *)stack, 0);
 }
+#endif
+
 #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
index a8eb084a7a9a..7a879901f103 100644
--- a/arch/x86/kernel/ftrace_64.S
+++ b/arch/x86/kernel/ftrace_64.S
@@ -174,11 +174,6 @@ SYM_INNER_LABEL(ftrace_caller_end, SYM_L_GLOBAL)
 SYM_FUNC_END(ftrace_caller);
 
 SYM_FUNC_START(ftrace_epilogue)
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL)
-	jmp ftrace_stub
-#endif
-
 /*
  * This is weak to keep gas from relaxing the jumps.
  * It is also used to copy the retq for trampolines.
@@ -288,15 +283,6 @@ SYM_FUNC_START(__fentry__)
 	cmpq $ftrace_stub, ftrace_trace_function
 	jnz trace
 
-fgraph_trace:
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER
-	cmpq $ftrace_stub, ftrace_graph_return
-	jnz ftrace_graph_caller
-
-	cmpq $ftrace_graph_entry_stub, ftrace_graph_entry
-	jnz ftrace_graph_caller
-#endif
-
 SYM_INNER_LABEL(ftrace_stub, SYM_L_GLOBAL)
 	retq
 
@@ -314,25 +300,12 @@ trace:
 	CALL_NOSPEC r8
 	restore_mcount_regs
 
-	jmp fgraph_trace
+	jmp ftrace_stub
 SYM_FUNC_END(__fentry__)
 EXPORT_SYMBOL(__fentry__)
 #endif /* CONFIG_DYNAMIC_FTRACE */
 
 #ifdef CONFIG_FUNCTION_GRAPH_TRACER
-SYM_FUNC_START(ftrace_graph_caller)
-	/* Saves rbp into %rdx and fills first parameter  */
-	save_mcount_regs
-
-	leaq MCOUNT_REG_SIZE+8(%rsp), %rsi
-	movq $0, %rdx	/* No framepointers needed */
-	call	prepare_ftrace_return
-
-	restore_mcount_regs
-
-	retq
-SYM_FUNC_END(ftrace_graph_caller)
-
 SYM_FUNC_START(return_to_handler)
 	subq  $24, %rsp
 
diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index a69f363b61bf..40b493908f09 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -614,6 +614,12 @@ void ftrace_modify_all_code(int command);
 extern void ftrace_graph_caller(void);
 extern int ftrace_enable_ftrace_graph_caller(void);
 extern int ftrace_disable_ftrace_graph_caller(void);
+#ifndef ftrace_graph_func
+#define ftrace_graph_func ftrace_stub
+#define FTRACE_OPS_GRAPH_STUB | FTRACE_OPS_FL_STUB
+#else
+#define FTRACE_OPS_GRAPH_STUB
+#endif
 #else
 static inline int ftrace_enable_ftrace_graph_caller(void) { return 0; }
 static inline int ftrace_disable_ftrace_graph_caller(void) { return 0; }
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index b8a0d1d564fb..58e96b45e9da 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -115,6 +115,7 @@ int function_graph_enter(unsigned long ret, unsigned long func,
 {
 	struct ftrace_graph_ent trace;
 
+#ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
 	/*
 	 * Skip graph tracing if the return location is served by direct trampoline,
 	 * since call sequence and return addresses are unpredictable anyway.
@@ -124,6 +125,7 @@ int function_graph_enter(unsigned long ret, unsigned long func,
 	if (ftrace_direct_func_count &&
 	    ftrace_find_rec_direct(ret - MCOUNT_INSN_SIZE))
 		return -EBUSY;
+#endif
 	trace.func = func;
 	trace.depth = ++current->curr_ret_depth;
 
@@ -333,10 +335,10 @@ unsigned long ftrace_graph_ret_addr(struct task_struct *task, int *idx,
 #endif /* HAVE_FUNCTION_GRAPH_RET_ADDR_PTR */
 
 static struct ftrace_ops graph_ops = {
-	.func			= ftrace_stub,
+	.func			= ftrace_graph_func,
 	.flags			= FTRACE_OPS_FL_INITIALIZED |
-				   FTRACE_OPS_FL_PID |
-				   FTRACE_OPS_FL_STUB,
+				   FTRACE_OPS_FL_PID
+				   FTRACE_OPS_GRAPH_STUB,
 #ifdef FTRACE_GRAPH_TRAMP_ADDR
 	.trampoline		= FTRACE_GRAPH_TRAMP_ADDR,
 	/* trampoline_size is only needed for dynamically allocated tramps */
-- 
2.31.1



* [PATCH 04/19] tracing: Add trampoline/graph selftest
From: Jiri Olsa @ 2021-06-05 11:10 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

Adding a selftest to check that a direct trampoline can
co-exist with the graph tracer on the same function.

This is supported only with the CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
config option, which is currently defined only for x86_64.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/trace/trace_selftest.c | 49 ++++++++++++++++++++++++++++++++++-
 1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/kernel/trace/trace_selftest.c b/kernel/trace/trace_selftest.c
index adf7ef194005..f8e55b949cdd 100644
--- a/kernel/trace/trace_selftest.c
+++ b/kernel/trace/trace_selftest.c
@@ -750,6 +750,8 @@ static struct fgraph_ops fgraph_ops __initdata  = {
 	.retfunc		= &trace_graph_return,
 };
 
+noinline __noclone static void trace_direct_tramp(void) { }
+
 /*
  * Pretty much the same than for the function tracer from which the selftest
  * has been borrowed.
@@ -760,6 +762,7 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 {
 	int ret;
 	unsigned long count;
+	char *func_name __maybe_unused;
 
 #ifdef CONFIG_DYNAMIC_FTRACE
 	if (ftrace_filter_param) {
@@ -808,8 +811,52 @@ trace_selftest_startup_function_graph(struct tracer *trace,
 		goto out;
 	}
 
-	/* Don't test dynamic tracing, the function tracer already did */
+#ifdef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
+	tracing_reset_online_cpus(&tr->array_buffer);
+	set_graph_array(tr);
 
+	/*
+	 * Some archs *cough*PowerPC*cough* add characters to the
+	 * start of the function names. We simply put a '*' to
+	 * accommodate them.
+	 */
+	func_name = "*" __stringify(DYN_FTRACE_TEST_NAME);
+	ftrace_set_global_filter(func_name, strlen(func_name), 1);
+
+	/*
+	 * Register direct function together with graph tracer
+	 * and make sure we get graph trace.
+	 */
+	ret = register_ftrace_direct((unsigned long) DYN_FTRACE_TEST_NAME,
+				     (unsigned long) trace_direct_tramp);
+	if (ret)
+		goto out;
+
+	ret = register_ftrace_graph(&fgraph_ops);
+	if (ret) {
+		warn_failed_init_tracer(trace, ret);
+		goto out;
+	}
+
+	DYN_FTRACE_TEST_NAME();
+
+	count = 0;
+
+	tracing_stop();
+	/* check the trace buffer */
+	ret = trace_test_buffer(&tr->array_buffer, &count);
+
+	unregister_ftrace_graph(&fgraph_ops);
+
+	tracing_start();
+
+	if (!ret && !count) {
+		ret = -1;
+		goto out;
+	}
+#endif
+
+	/* Don't test dynamic tracing, the function tracer already did */
 out:
 	/* Stop it if we failed */
 	if (ret)
-- 
2.31.1



* [PATCH 05/19] ftrace: Add ftrace_add_rec_direct function
From: Jiri Olsa @ 2021-06-05 11:10 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

Factor out the code that adds an (ip, addr) tuple to the
direct_functions hash into a new ftrace_add_rec_direct function.
It will be used in the following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/trace/ftrace.c | 60 ++++++++++++++++++++++++++-----------------
 1 file changed, 36 insertions(+), 24 deletions(-)

diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 2e8a3fde7104..9e584710f542 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -2388,6 +2388,39 @@ unsigned long ftrace_find_rec_direct(unsigned long ip)
 	return entry->direct;
 }
 
+static struct ftrace_func_entry*
+ftrace_add_rec_direct(unsigned long ip, unsigned long addr,
+		      struct ftrace_hash **free_hash)
+{
+	struct ftrace_func_entry *entry;
+
+	if (ftrace_hash_empty(direct_functions) ||
+	    direct_functions->count > 2 * (1 << direct_functions->size_bits)) {
+		struct ftrace_hash *new_hash;
+		int size = ftrace_hash_empty(direct_functions) ? 0 :
+			direct_functions->count + 1;
+
+		if (size < 32)
+			size = 32;
+
+		new_hash = dup_hash(direct_functions, size);
+		if (!new_hash)
+			return NULL;
+
+		*free_hash = direct_functions;
+		direct_functions = new_hash;
+	}
+
+	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
+	if (!entry)
+		return NULL;
+
+	entry->ip = ip;
+	entry->direct = addr;
+	__add_hash_entry(direct_functions, entry);
+	return entry;
+}
+
 static void call_direct_funcs(unsigned long ip, unsigned long pip,
 			      struct ftrace_ops *ops, struct ftrace_regs *fregs)
 {
@@ -5105,27 +5138,6 @@ int register_ftrace_direct(unsigned long ip, unsigned long addr)
 	}
 
 	ret = -ENOMEM;
-	if (ftrace_hash_empty(direct_functions) ||
-	    direct_functions->count > 2 * (1 << direct_functions->size_bits)) {
-		struct ftrace_hash *new_hash;
-		int size = ftrace_hash_empty(direct_functions) ? 0 :
-			direct_functions->count + 1;
-
-		if (size < 32)
-			size = 32;
-
-		new_hash = dup_hash(direct_functions, size);
-		if (!new_hash)
-			goto out_unlock;
-
-		free_hash = direct_functions;
-		direct_functions = new_hash;
-	}
-
-	entry = kmalloc(sizeof(*entry), GFP_KERNEL);
-	if (!entry)
-		goto out_unlock;
-
 	direct = ftrace_find_direct_func(addr);
 	if (!direct) {
 		direct = ftrace_alloc_direct_func(addr);
@@ -5135,9 +5147,9 @@ int register_ftrace_direct(unsigned long ip, unsigned long addr)
 		}
 	}
 
-	entry->ip = ip;
-	entry->direct = addr;
-	__add_hash_entry(direct_functions, entry);
+	entry = ftrace_add_rec_direct(ip, addr, &free_hash);
+	if (!entry)
+		goto out_unlock;
 
 	ret = ftrace_set_filter_ip(&direct_ops, ip, 0, 0);
 	if (ret)
-- 
2.31.1



* [PATCH 06/19] ftrace: Add multi direct register/unregister interface
From: Jiri Olsa @ 2021-06-05 11:10 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

Adding an interface to register multiple direct functions
within a single call. Adding the following functions:

  register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
  unregister_ftrace_direct_multi(struct ftrace_ops *ops)

register_ftrace_direct_multi registers the direct function (addr)
with all functions in the ops filter. The ops filter can be populated
beforehand with ftrace_set_filter_ip calls.

None of the requested functions may have a direct function currently
registered, otherwise register_ftrace_direct_multi will fail.

unregister_ftrace_direct_multi unregisters all of the ops' related
direct functions.
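
For illustration, a minimal usage sketch (my_tramp being a hand-written
trampoline, as in the sample module added later in this series):

  static struct ftrace_ops direct;

  ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
  ftrace_set_filter_ip(&direct, (unsigned long) schedule, 0, 0);

  err = register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
  ...
  err = unregister_ftrace_direct_multi(&direct);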

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 include/linux/ftrace.h |  10 ++++
 kernel/trace/ftrace.c  | 108 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 118 insertions(+)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 40b493908f09..91e8d9534ad7 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -316,6 +316,8 @@ int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
 				unsigned long old_addr,
 				unsigned long new_addr);
 unsigned long ftrace_find_rec_direct(unsigned long ip);
+int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
+int unregister_ftrace_direct_multi(struct ftrace_ops *ops);
 #else
 # define ftrace_direct_func_count 0
 static inline int register_ftrace_direct(unsigned long ip, unsigned long addr)
@@ -346,6 +348,14 @@ static inline unsigned long ftrace_find_rec_direct(unsigned long ip)
 {
 	return 0;
 }
+static inline int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+{
+	return -ENODEV;
+}
+static inline int unregister_ftrace_direct_multi(struct ftrace_ops *ops)
+{
+	return -ENODEV;
+}
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
 
 #ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index 9e584710f542..f1ebee1bfbfe 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5402,6 +5402,114 @@ int modify_ftrace_direct(unsigned long ip,
 	return ret;
 }
 EXPORT_SYMBOL_GPL(modify_ftrace_direct);
+
+#define MULTI_FLAGS (FTRACE_OPS_FL_IPMODIFY | FTRACE_OPS_FL_DIRECT | \
+		     FTRACE_OPS_FL_SAVE_REGS)
+
+static int check_direct_multi(struct ftrace_ops *ops)
+{
+	if (!(ops->flags & FTRACE_OPS_FL_INITIALIZED))
+		return -EINVAL;
+	if ((ops->flags & MULTI_FLAGS) != MULTI_FLAGS)
+		return -EINVAL;
+	return 0;
+}
+
+int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+{
+	struct ftrace_hash *hash = ops->func_hash->filter_hash;
+	struct ftrace_func_entry *entry, *new;
+	struct ftrace_hash *free_hash = NULL;
+	int err = -EBUSY, size, i;
+
+	if (ops->func || ops->trampoline)
+		return -EINVAL;
+	if (ops->flags & FTRACE_OPS_FL_ENABLED)
+		return -EINVAL;
+	if (ftrace_hash_empty(hash))
+		return -EINVAL;
+
+	mutex_lock(&direct_mutex);
+
+	/* Make sure requested entries are not already registered.. */
+	size = 1 << hash->size_bits;
+	for (i = 0; i < size; i++) {
+		hlist_for_each_entry(entry, &hash->buckets[i], hlist) {
+			if (ftrace_find_rec_direct(entry->ip))
+				goto out_unlock;
+		}
+	}
+
+	/* ... and insert them to direct_functions hash. */
+	err = -ENOMEM;
+	for (i = 0; i < size; i++) {
+		hlist_for_each_entry(entry, &hash->buckets[i], hlist) {
+			new = ftrace_add_rec_direct(entry->ip, addr, &free_hash);
+			if (!new)
+				goto out_remove;
+			entry->direct = addr;
+		}
+	}
+
+	ops->func = call_direct_funcs;
+	ops->flags = MULTI_FLAGS;
+	ops->trampoline = FTRACE_REGS_ADDR;
+
+	err = register_ftrace_function(ops);
+
+ out_remove:
+	if (err) {
+		for (i = 0; i < size; i++) {
+			hlist_for_each_entry(entry, &hash->buckets[i], hlist) {
+				new = __ftrace_lookup_ip(direct_functions, entry->ip);
+				if (new) {
+					remove_hash_entry(direct_functions, new);
+					kfree(new);
+				}
+			}
+		}
+	}
+
+ out_unlock:
+	mutex_unlock(&direct_mutex);
+
+	if (free_hash) {
+		synchronize_rcu_tasks();
+		free_ftrace_hash(free_hash);
+	}
+	return err;
+}
+EXPORT_SYMBOL_GPL(register_ftrace_direct_multi);
+
+int unregister_ftrace_direct_multi(struct ftrace_ops *ops)
+{
+	struct ftrace_hash *hash = ops->func_hash->filter_hash;
+	struct ftrace_func_entry *entry, *new;
+	int err, size, i;
+
+	if (check_direct_multi(ops))
+		return -EINVAL;
+	if (!(ops->flags & FTRACE_OPS_FL_ENABLED))
+		return -EINVAL;
+
+	mutex_lock(&direct_mutex);
+	err = unregister_ftrace_function(ops);
+
+	size = 1 << hash->size_bits;
+	for (i = 0; i < size; i++) {
+		hlist_for_each_entry(entry, &hash->buckets[i], hlist) {
+			new = __ftrace_lookup_ip(direct_functions, entry->ip);
+			if (new) {
+				remove_hash_entry(direct_functions, new);
+				kfree(new);
+			}
+		}
+	}
+
+	mutex_unlock(&direct_mutex);
+	return err;
+}
+EXPORT_SYMBOL_GPL(unregister_ftrace_direct_multi);
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
 
 /**
-- 
2.31.1



* [PATCH 07/19] ftrace: Add multi direct modify interface
From: Jiri Olsa @ 2021-06-05 11:10 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

Adding an interface to modify the registered direct function
for an ftrace_ops. Adding the following function:

   modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)

The function changes the currently registered direct
function for all attached functions.
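
For illustration, continuing the sketch from the previous patch
(my_tramp1/my_tramp2 being hand-written trampolines as in the
ftrace-direct samples):

  err = register_ftrace_direct_multi(&direct, (unsigned long) my_tramp1);
  ...
  /* switch all functions attached to 'direct' to the new trampoline */
  err = modify_ftrace_direct_multi(&direct, (unsigned long) my_tramp2);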

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 include/linux/ftrace.h |  6 ++++++
 kernel/trace/ftrace.c  | 43 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 49 insertions(+)

diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
index 91e8d9534ad7..7f63615ea2b1 100644
--- a/include/linux/ftrace.h
+++ b/include/linux/ftrace.h
@@ -318,6 +318,8 @@ int ftrace_modify_direct_caller(struct ftrace_func_entry *entry,
 unsigned long ftrace_find_rec_direct(unsigned long ip);
 int register_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
 int unregister_ftrace_direct_multi(struct ftrace_ops *ops);
+int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr);
+
 #else
 # define ftrace_direct_func_count 0
 static inline int register_ftrace_direct(unsigned long ip, unsigned long addr)
@@ -356,6 +358,10 @@ int unregister_ftrace_direct_multi(struct ftrace_ops *ops)
 {
 	return -ENODEV;
 }
+static inline int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+{
+	return -ENODEV;
+}
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
 
 #ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_DIRECT_CALLS
diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
index f1ebee1bfbfe..82fbfa05e311 100644
--- a/kernel/trace/ftrace.c
+++ b/kernel/trace/ftrace.c
@@ -5510,6 +5510,49 @@ int unregister_ftrace_direct_multi(struct ftrace_ops *ops)
 	return err;
 }
 EXPORT_SYMBOL_GPL(unregister_ftrace_direct_multi);
+
+int modify_ftrace_direct_multi(struct ftrace_ops *ops, unsigned long addr)
+{
+	struct ftrace_hash *hash = ops->func_hash->filter_hash;
+	struct ftrace_func_entry *entry, *iter;
+	int i, size;
+	int err;
+
+	if (check_direct_multi(ops))
+		return -EINVAL;
+	if (!(ops->flags & FTRACE_OPS_FL_ENABLED))
+		return -EINVAL;
+
+	mutex_lock(&direct_mutex);
+	mutex_lock(&ftrace_lock);
+
+	/*
+	 * Shutdown the ops, change 'direct' pointer for each
+	 * ops entry in direct_functions hash and startup the
+	 * ops back again.
+	 */
+	err = ftrace_shutdown(ops, 0);
+	if (err)
+		goto out_unlock;
+
+	size = 1 << hash->size_bits;
+	for (i = 0; i < size; i++) {
+		hlist_for_each_entry(iter, &hash->buckets[i], hlist) {
+			entry = __ftrace_lookup_ip(direct_functions, iter->ip);
+			if (!entry)
+				continue;
+			entry->direct = addr;
+		}
+	}
+
+	err = ftrace_startup(ops, 0);
+
+ out_unlock:
+	mutex_unlock(&ftrace_lock);
+	mutex_unlock(&direct_mutex);
+	return err;
+}
+EXPORT_SYMBOL_GPL(modify_ftrace_direct_multi);
 #endif /* CONFIG_DYNAMIC_FTRACE_WITH_DIRECT_CALLS */
 
 /**
-- 
2.31.1



* [PATCH 08/19] ftrace/samples: Add multi direct interface test module
From: Jiri Olsa @ 2021-06-05 11:10 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

Adding a simple module that uses the multi direct interface:

  register_ftrace_direct_multi
  unregister_ftrace_direct_multi

The init function registers a trampoline for 2 functions
(wake_up_process and schedule), and the exit function unregisters it.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 samples/ftrace/Makefile              |  1 +
 samples/ftrace/ftrace-direct-multi.c | 52 ++++++++++++++++++++++++++++
 2 files changed, 53 insertions(+)
 create mode 100644 samples/ftrace/ftrace-direct-multi.c

diff --git a/samples/ftrace/Makefile b/samples/ftrace/Makefile
index 4ce896e10b2e..ab1d1c05c288 100644
--- a/samples/ftrace/Makefile
+++ b/samples/ftrace/Makefile
@@ -3,6 +3,7 @@
 obj-$(CONFIG_SAMPLE_FTRACE_DIRECT) += ftrace-direct.o
 obj-$(CONFIG_SAMPLE_FTRACE_DIRECT) += ftrace-direct-too.o
 obj-$(CONFIG_SAMPLE_FTRACE_DIRECT) += ftrace-direct-modify.o
+obj-$(CONFIG_SAMPLE_FTRACE_DIRECT) += ftrace-direct-multi.o
 
 CFLAGS_sample-trace-array.o := -I$(src)
 obj-$(CONFIG_SAMPLE_TRACE_ARRAY) += sample-trace-array.o
diff --git a/samples/ftrace/ftrace-direct-multi.c b/samples/ftrace/ftrace-direct-multi.c
new file mode 100644
index 000000000000..76b34d46d11c
--- /dev/null
+++ b/samples/ftrace/ftrace-direct-multi.c
@@ -0,0 +1,52 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/module.h>
+
+#include <linux/mm.h> /* for handle_mm_fault() */
+#include <linux/ftrace.h>
+#include <linux/sched/stat.h>
+
+void my_direct_func(unsigned long ip)
+{
+	trace_printk("ip %lx\n", ip);
+}
+
+extern void my_tramp(void *);
+
+asm (
+"	.pushsection    .text, \"ax\", @progbits\n"
+"	.type		my_tramp, @function\n"
+"	.globl		my_tramp\n"
+"   my_tramp:"
+"	pushq %rbp\n"
+"	movq %rsp, %rbp\n"
+"	pushq %rdi\n"
+"	movq 8(%rbp), %rdi\n"
+"	call my_direct_func\n"
+"	popq %rdi\n"
+"	leave\n"
+"	ret\n"
+"	.size		my_tramp, .-my_tramp\n"
+"	.popsection\n"
+);
+
+static struct ftrace_ops direct;
+
+static int __init ftrace_direct_multi_init(void)
+{
+	ftrace_set_filter_ip(&direct, (unsigned long) wake_up_process, 0, 0);
+	ftrace_set_filter_ip(&direct, (unsigned long) schedule, 0, 0);
+
+	return register_ftrace_direct_multi(&direct, (unsigned long) my_tramp);
+}
+
+static void __exit ftrace_direct_multi_exit(void)
+{
+	unregister_ftrace_direct_multi(&direct);
+}
+
+module_init(ftrace_direct_multi_init);
+module_exit(ftrace_direct_multi_exit);
+
+MODULE_AUTHOR("Jiri Olsa");
+MODULE_DESCRIPTION("Example use case of using register_ftrace_direct_multi()");
+MODULE_LICENSE("GPL");
-- 
2.31.1



* [PATCH 09/19] bpf, x64: Allow to use caller address from stack
From: Jiri Olsa @ 2021-06-05 11:10 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

Currently we call the original function by using the absolute address
given at JIT generation time. That's not usable when the trampoline is
attached to multiple functions. In this case we need to take the
return address from the stack.

Adding support to retrieve the original function address from the
stack, via a new BPF_TRAMP_F_ORIG_STACK flag for the
arch_prepare_bpf_trampoline function.
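
With the flag set, instead of embedding the call target, the generated
code loads it from the trampoline's stack frame, effectively (a sketch
of the emitted x86; [rbp+8] is the return address pushed by the call
into the trampoline):

  mov rax, QWORD PTR [rbp+8]   /* address of the traced function's body */
  call rax                     /* call back into the traced function */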

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 arch/x86/net/bpf_jit_comp.c | 13 +++++++++----
 include/linux/bpf.h         |  5 +++++
 2 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 2a2e290fa5d8..b77e6bd78354 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -2013,10 +2013,15 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 	if (flags & BPF_TRAMP_F_CALL_ORIG) {
 		restore_regs(m, &prog, nr_args, stack_size);
 
-		/* call original function */
-		if (emit_call(&prog, orig_call, prog)) {
-			ret = -EINVAL;
-			goto cleanup;
+		if (flags & BPF_TRAMP_F_ORIG_STACK) {
+			emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
+			EMIT2(0xff, 0xd0); /* call *rax */
+		} else {
+			/* call original function */
+			if (emit_call(&prog, orig_call, prog)) {
+				ret = -EINVAL;
+				goto cleanup;
+			}
 		}
 		/* remember return value in a stack for bpf prog to access */
 		emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 86dec5001ae2..16fc600503fb 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -554,6 +554,11 @@ struct btf_func_model {
  */
 #define BPF_TRAMP_F_SKIP_FRAME		BIT(2)
 
+/* Get original function from stack instead of from provided direct address.
+ * Makes sense for fexit programs only.
+ */
+#define BPF_TRAMP_F_ORIG_STACK		BIT(3)
+
 /* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
  * bytes on x86.  Pick a number to fit into BPF_IMAGE_SIZE / 2
  */
-- 
2.31.1



* [PATCH 10/19] bpf: Allow to store caller's ip as argument
From: Jiri Olsa @ 2021-06-05 11:10 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

When we have multiple functions attached to a trampoline, we need
to propagate the traced function's address to the bpf program.

Adding a new BPF_TRAMP_F_IP_ARG flag for the arch_prepare_bpf_trampoline
function that stores the original caller's address on the stack before
the function's arguments.
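
With the flag set, the trampoline derives the traced function's entry
address from its own return address and stores it in front of the saved
arguments (a sketch of the emitted x86; X86_PATCH_SIZE is the size of
the patched call instruction):

  mov rax, QWORD PTR [rbp+8]           /* return address = function entry + X86_PATCH_SIZE */
  sub rax, X86_PATCH_SIZE              /* back up to the traced function's entry */
  mov QWORD PTR [rbp-stack_size], rax  /* store ip before the saved arguments */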

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 arch/x86/net/bpf_jit_comp.c | 18 ++++++++++++++----
 include/linux/bpf.h         |  5 +++++
 2 files changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index b77e6bd78354..d2425c18272a 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1951,7 +1951,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 				void *orig_call)
 {
 	int ret, i, cnt = 0, nr_args = m->nr_args;
-	int stack_size = nr_args * 8;
+	int stack_size = nr_args * 8, ip_arg = 0;
 	struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
 	struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
 	struct bpf_tramp_progs *fmod_ret = &tprogs[BPF_TRAMP_MODIFY_RETURN];
@@ -1975,6 +1975,9 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 		 */
 		orig_call += X86_PATCH_SIZE;
 
+	if (flags & BPF_TRAMP_F_IP_ARG)
+		stack_size += 8;
+
 	prog = image;
 
 	EMIT1(0x55);		 /* push rbp */
@@ -1982,7 +1985,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 	EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */
 	EMIT1(0x53);		 /* push rbx */
 
-	save_regs(m, &prog, nr_args, stack_size);
+	if (flags & BPF_TRAMP_F_IP_ARG) {
+		emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
+		EMIT4(0x48, 0x83, 0xe8, X86_PATCH_SIZE); /* sub $X86_PATCH_SIZE,%rax*/
+		emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -stack_size);
+		ip_arg = 8;
+	}
+
+	save_regs(m, &prog, nr_args, stack_size - ip_arg);
 
 	if (flags & BPF_TRAMP_F_CALL_ORIG) {
 		/* arg1: mov rdi, im */
@@ -2011,7 +2021,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 	}
 
 	if (flags & BPF_TRAMP_F_CALL_ORIG) {
-		restore_regs(m, &prog, nr_args, stack_size);
+		restore_regs(m, &prog, nr_args, stack_size - ip_arg);
 
 		if (flags & BPF_TRAMP_F_ORIG_STACK) {
 			emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
@@ -2052,7 +2062,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 		}
 
 	if (flags & BPF_TRAMP_F_RESTORE_REGS)
-		restore_regs(m, &prog, nr_args, stack_size);
+		restore_regs(m, &prog, nr_args, stack_size - ip_arg);
 
 	/* This needs to be done regardless. If there were fmod_ret programs,
 	 * the return value is only updated on the stack and still needs to be
diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 16fc600503fb..6cbf3c81c650 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -559,6 +559,11 @@ struct btf_func_model {
  */
 #define BPF_TRAMP_F_ORIG_STACK		BIT(3)
 
+/* First argument is IP address of the caller. Makes sense for fentry/fexit
+ * programs only.
+ */
+#define BPF_TRAMP_F_IP_ARG		BIT(4)
+
 /* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
  * bytes on x86.  Pick a number to fit into BPF_IMAGE_SIZE / 2
  */
-- 
2.31.1



* [PATCH 11/19] bpf: Add support to load multi func tracing program
From: Jiri Olsa @ 2021-06-05 11:10 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

Adding support to load a tracing program with the new BPF_F_MULTI_FUNC
flag, which allows the program to be loaded without a specific function
to attach to.

The verifier assumes the program is using all (6) available arguments
as unsigned long values. We can't add an extra ip argument at this
time, because the JIT on x86 would fail to process such a function.
Instead we allow access to an extra first 'ip' argument in
btf_ctx_access.

Such a program will be allowed to attach to multiple functions
in the following patches.
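
A minimal userspace sketch of loading such a program via the raw
syscall; insns/insn_cnt are assumed to be prepared elsewhere and
ptr_to_u64 is the usual cast helper:

  union bpf_attr attr = {};

  attr.prog_type = BPF_PROG_TYPE_TRACING;
  attr.expected_attach_type = BPF_TRACE_FENTRY;
  attr.prog_flags = BPF_F_MULTI_FUNC;  /* no attach_btf_id needed */
  attr.insns = ptr_to_u64(insns);
  attr.insn_cnt = insn_cnt;
  attr.license = ptr_to_u64("GPL");

  prog_fd = syscall(__NR_bpf, BPF_PROG_LOAD, &attr, sizeof(attr));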

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 include/linux/bpf.h            |  1 +
 include/uapi/linux/bpf.h       |  7 +++++++
 kernel/bpf/btf.c               |  5 +++++
 kernel/bpf/syscall.c           | 35 +++++++++++++++++++++++++++++-----
 kernel/bpf/verifier.c          |  3 ++-
 tools/include/uapi/linux/bpf.h |  7 +++++++
 6 files changed, 52 insertions(+), 6 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 6cbf3c81c650..23221e0e8d3c 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -845,6 +845,7 @@ struct bpf_prog_aux {
 	bool sleepable;
 	bool tail_call_reachable;
 	struct hlist_node tramp_hlist;
+	bool multi_func;
 	/* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
 	const struct btf_type *attach_func_proto;
 	/* function name for valid attach_btf_id */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 2c1ba70abbf1..ad9340fb14d4 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1109,6 +1109,13 @@ enum bpf_link_type {
  */
 #define BPF_F_SLEEPABLE		(1U << 4)
 
+/* If BPF_F_MULTI_FUNC is used in BPF_PROG_LOAD command, the verifier does
+ * not expect BTF ID for the program, instead it assumes it's function
+ * with 6 u64 arguments. No trampoline is created for the program. Such
+ * program can be attached to multiple functions.
+ */
+#define BPF_F_MULTI_FUNC	(1U << 5)
+
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
  * the following extensions:
  *
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index a6e39c5ea0bf..c233aaa6a709 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -4679,6 +4679,11 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 		args++;
 		nr_args--;
 	}
+	if (prog->aux->multi_func) {
+		if (arg == 0)
+			return true;
+		arg--;
+	}
 
 	if (arg > nr_args) {
 		bpf_log(log, "func '%s' doesn't have %d-th argument\n",
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 50457019da27..8f59090280b5 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -31,6 +31,7 @@
 #include <linux/bpf-netns.h>
 #include <linux/rcupdate_trace.h>
 #include <linux/memcontrol.h>
+#include <linux/btf_ids.h>
 
 #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
 			  (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
@@ -1979,7 +1980,8 @@ static int
 bpf_prog_load_check_attach(enum bpf_prog_type prog_type,
 			   enum bpf_attach_type expected_attach_type,
 			   struct btf *attach_btf, u32 btf_id,
-			   struct bpf_prog *dst_prog)
+			   struct bpf_prog *dst_prog,
+			   bool multi_func)
 {
 	if (btf_id) {
 		if (btf_id > BTF_MAX_TYPE)
@@ -1999,6 +2001,14 @@ bpf_prog_load_check_attach(enum bpf_prog_type prog_type,
 		}
 	}
 
+	if (multi_func) {
+		if (prog_type != BPF_PROG_TYPE_TRACING)
+			return -EINVAL;
+		if (!attach_btf || btf_id)
+			return -EINVAL;
+		return 0;
+	}
+
 	if (attach_btf && (!btf_id || dst_prog))
 		return -EINVAL;
 
@@ -2114,6 +2124,16 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type)
 	}
 }
 
+#define DEFINE_BPF_MULTI_FUNC(args...)			\
+	extern int bpf_multi_func(args);		\
+	int __init bpf_multi_func(args) { return 0; }
+
+DEFINE_BPF_MULTI_FUNC(unsigned long a1, unsigned long a2,
+		      unsigned long a3, unsigned long a4,
+		      unsigned long a5, unsigned long a6)
+
+BTF_ID_LIST_SINGLE(bpf_multi_func_btf_id, func, bpf_multi_func)
+
 /* last field in 'union bpf_attr' used by this command */
 #define	BPF_PROG_LOAD_LAST_FIELD fd_array
 
@@ -2124,6 +2144,7 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr)
 	struct btf *attach_btf = NULL;
 	int err;
 	char license[128];
+	bool multi_func;
 	bool is_gpl;
 
 	if (CHECK_ATTR(BPF_PROG_LOAD))
@@ -2133,7 +2154,8 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr)
 				 BPF_F_ANY_ALIGNMENT |
 				 BPF_F_TEST_STATE_FREQ |
 				 BPF_F_SLEEPABLE |
-				 BPF_F_TEST_RND_HI32))
+				 BPF_F_TEST_RND_HI32 |
+				 BPF_F_MULTI_FUNC))
 		return -EINVAL;
 
 	if (!IS_ENABLED(CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS) &&
@@ -2164,6 +2186,8 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr)
 	if (is_perfmon_prog_type(type) && !perfmon_capable())
 		return -EPERM;
 
+	multi_func = attr->prog_flags & BPF_F_MULTI_FUNC;
+
 	/* attach_prog_fd/attach_btf_obj_fd can specify fd of either bpf_prog
 	 * or btf, we need to check which one it is
 	 */
@@ -2182,7 +2206,7 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr)
 				return -ENOTSUPP;
 			}
 		}
-	} else if (attr->attach_btf_id) {
+	} else if (attr->attach_btf_id || multi_func) {
 		/* fall back to vmlinux BTF, if BTF type ID is specified */
 		attach_btf = bpf_get_btf_vmlinux();
 		if (IS_ERR(attach_btf))
@@ -2195,7 +2219,7 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr)
 	bpf_prog_load_fixup_attach_type(attr);
 	if (bpf_prog_load_check_attach(type, attr->expected_attach_type,
 				       attach_btf, attr->attach_btf_id,
-				       dst_prog)) {
+				       dst_prog, multi_func)) {
 		if (dst_prog)
 			bpf_prog_put(dst_prog);
 		if (attach_btf)
@@ -2215,10 +2239,11 @@ static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr)
 
 	prog->expected_attach_type = attr->expected_attach_type;
 	prog->aux->attach_btf = attach_btf;
-	prog->aux->attach_btf_id = attr->attach_btf_id;
+	prog->aux->attach_btf_id = multi_func ? bpf_multi_func_btf_id[0] : attr->attach_btf_id;
 	prog->aux->dst_prog = dst_prog;
 	prog->aux->offload_requested = !!attr->prog_ifindex;
 	prog->aux->sleepable = attr->prog_flags & BPF_F_SLEEPABLE;
+	prog->aux->multi_func = multi_func;
 
 	err = security_bpf_prog_alloc(prog->aux);
 	if (err)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 1de4b8c6ee42..194adddee2ec 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -13276,7 +13276,8 @@ static int check_attach_btf_id(struct bpf_verifier_env *env)
 		if (!bpf_iter_prog_supported(prog))
 			return -EINVAL;
 		return 0;
-	}
+	} else if (prog->aux->multi_func)
+		return prog->type == BPF_PROG_TYPE_TRACING ? 0 : -EINVAL;
 
 	if (prog->type == BPF_PROG_TYPE_LSM) {
 		ret = bpf_lsm_verify_prog(&env->log, prog);
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 2c1ba70abbf1..ad9340fb14d4 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1109,6 +1109,13 @@ enum bpf_link_type {
  */
 #define BPF_F_SLEEPABLE		(1U << 4)
 
+/* If BPF_F_MULTI_FUNC is used in BPF_PROG_LOAD command, the verifier does
+ * not expect BTF ID for the program, instead it assumes it's function
+ * with 6 u64 arguments. No trampoline is created for the program. Such
+ * program can be attached to multiple functions.
+ */
+#define BPF_F_MULTI_FUNC	(1U << 5)
+
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
  * the following extensions:
  *
-- 
2.31.1



* [PATCH 12/19] bpf: Add bpf_trampoline_alloc function
From: Jiri Olsa @ 2021-06-05 11:10 UTC
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

Factor out a bpf_trampoline_alloc function. It will
be used to allocate trampolines for multi-func programs
in the following patches.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 kernel/bpf/trampoline.c | 34 ++++++++++++++++++++++------------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 28a3630c48ee..2755fdcf9fbf 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -58,11 +58,27 @@ void bpf_image_ksym_del(struct bpf_ksym *ksym)
 			   PAGE_SIZE, true, ksym->name);
 }
 
+static struct bpf_trampoline *bpf_trampoline_alloc(void)
+{
+	struct bpf_trampoline *tr;
+	int i;
+
+	tr = kzalloc(sizeof(*tr), GFP_KERNEL);
+	if (!tr)
+		return NULL;
+
+	INIT_HLIST_NODE(&tr->hlist);
+	refcount_set(&tr->refcnt, 1);
+	mutex_init(&tr->mutex);
+	for (i = 0; i < BPF_TRAMP_MAX; i++)
+		INIT_HLIST_HEAD(&tr->progs_hlist[i]);
+	return tr;
+}
+
 static struct bpf_trampoline *bpf_trampoline_lookup(u64 key)
 {
 	struct bpf_trampoline *tr;
 	struct hlist_head *head;
-	int i;
 
 	mutex_lock(&trampoline_mutex);
 	head = &trampoline_table[hash_64(key, TRAMPOLINE_HASH_BITS)];
@@ -72,17 +88,11 @@ static struct bpf_trampoline *bpf_trampoline_lookup(u64 key)
 			goto out;
 		}
 	}
-	tr = kzalloc(sizeof(*tr), GFP_KERNEL);
-	if (!tr)
-		goto out;
-
-	tr->key = key;
-	INIT_HLIST_NODE(&tr->hlist);
-	hlist_add_head(&tr->hlist, head);
-	refcount_set(&tr->refcnt, 1);
-	mutex_init(&tr->mutex);
-	for (i = 0; i < BPF_TRAMP_MAX; i++)
-		INIT_HLIST_HEAD(&tr->progs_hlist[i]);
+	tr = bpf_trampoline_alloc();
+	if (tr) {
+		tr->key = key;
+		hlist_add_head(&tr->hlist, head);
+	}
 out:
 	mutex_unlock(&trampoline_mutex);
 	return tr;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH 13/19] bpf: Add support to link multi func tracing program
  2021-06-05 11:10 [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach Jiri Olsa
                   ` (11 preceding siblings ...)
  2021-06-05 11:10 ` [PATCH 12/19] bpf: Add bpf_trampoline_alloc function Jiri Olsa
@ 2021-06-05 11:10 ` Jiri Olsa
  2021-06-07  5:36   ` Yonghong Song
                     ` (2 more replies)
  2021-06-05 11:10 ` [PATCH 14/19] libbpf: Add btf__find_by_pattern_kind function Jiri Olsa
                   ` (6 subsequent siblings)
  19 siblings, 3 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-05 11:10 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

Adding support to attach multiple functions to a tracing program
by using the link_create/link_update interface.

Adding a multi_btf_ids/multi_btf_ids_cnt pair to the link_create
struct API, which defines the array of function BTF ids that will
be attached to prog_fd.

The prog_fd needs to be a multi func tracing program (BPF_F_MULTI_FUNC).

The new link_create interface creates a new BPF_LINK_TYPE_TRACING_MULTI
link type, which creates a separate bpf_trampoline and registers it
as a direct function for all the specified BTF ids.

The new bpf_trampoline is out of scope (bpf_trampoline_lookup) of
the standard trampolines, so all the registered functions need to
be free of direct functions, otherwise the link creation fails.

The new bpf_trampoline will store, and pass to the bpf program,
the highest number of arguments among all the given functions.

New programs (fentry or fexit) can be added to the existing trampoline
through the link_update interface via the new_prog_fd descriptor.
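
For illustration, a minimal user-space sketch of the new interface
(a sketch only: the sys_bpf() syscall wrapper and the btf_ids[]
array resolved by the caller are assumptions of this sketch, and
error handling is omitted):

  __s32 btf_ids[] = { id1, id2 };   /* assumed: BTF ids resolved by caller */
  union bpf_attr attr = {};
  int link_fd;

  /* prog_fd must be a BPF_F_MULTI_FUNC tracing program */
  attr.link_create.prog_fd = prog_fd;
  attr.link_create.attach_type = BPF_TRACE_FENTRY;
  attr.link_create.multi_btf_ids = (__u64) (unsigned long) btf_ids;
  attr.link_create.multi_btf_ids_cnt = 2;

  link_fd = sys_bpf(BPF_LINK_CREATE, &attr, sizeof(attr));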

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 include/linux/bpf.h            |   3 +
 include/uapi/linux/bpf.h       |   5 +
 kernel/bpf/syscall.c           | 185 ++++++++++++++++++++++++++++++++-
 kernel/bpf/trampoline.c        |  53 +++++++---
 tools/include/uapi/linux/bpf.h |   5 +
 5 files changed, 237 insertions(+), 14 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 23221e0e8d3c..99a81c6c22e6 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -661,6 +661,7 @@ struct bpf_trampoline {
 	struct bpf_tramp_image *cur_image;
 	u64 selector;
 	struct module *mod;
+	bool multi;
 };
 
 struct bpf_attach_target_info {
@@ -746,6 +747,8 @@ void bpf_ksym_add(struct bpf_ksym *ksym);
 void bpf_ksym_del(struct bpf_ksym *ksym);
 int bpf_jit_charge_modmem(u32 pages);
 void bpf_jit_uncharge_modmem(u32 pages);
+struct bpf_trampoline *bpf_trampoline_multi_alloc(void);
+void bpf_trampoline_multi_free(struct bpf_trampoline *tr);
 #else
 static inline int bpf_trampoline_link_prog(struct bpf_prog *prog,
 					   struct bpf_trampoline *tr)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index ad9340fb14d4..5fd6ff64e8dc 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1007,6 +1007,7 @@ enum bpf_link_type {
 	BPF_LINK_TYPE_ITER = 4,
 	BPF_LINK_TYPE_NETNS = 5,
 	BPF_LINK_TYPE_XDP = 6,
+	BPF_LINK_TYPE_TRACING_MULTI = 7,
 
 	MAX_BPF_LINK_TYPE,
 };
@@ -1454,6 +1455,10 @@ union bpf_attr {
 				__aligned_u64	iter_info;	/* extra bpf_iter_link_info */
 				__u32		iter_info_len;	/* iter_info length */
 			};
+			struct {
+				__aligned_u64	multi_btf_ids;		/* addresses to attach */
+				__u32		multi_btf_ids_cnt;	/* addresses count */
+			};
 		};
 	} link_create;
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 8f59090280b5..44446cc67af7 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -32,6 +32,7 @@
 #include <linux/rcupdate_trace.h>
 #include <linux/memcontrol.h>
 #include <linux/btf_ids.h>
+#include <linux/ftrace.h>
 
 #define IS_FD_ARRAY(map) ((map)->map_type == BPF_MAP_TYPE_PERF_EVENT_ARRAY || \
 			  (map)->map_type == BPF_MAP_TYPE_CGROUP_ARRAY || \
@@ -2810,6 +2811,184 @@ static int bpf_tracing_prog_attach(struct bpf_prog *prog,
 	return err;
 }
 
+struct bpf_tracing_multi_link {
+	struct bpf_link link;
+	enum bpf_attach_type attach_type;
+	struct ftrace_ops ops;
+	struct bpf_trampoline *tr;
+};
+
+static void bpf_tracing_multi_link_release(struct bpf_link *link)
+{
+	struct bpf_tracing_multi_link *tr_link =
+		container_of(link, struct bpf_tracing_multi_link, link);
+	const struct bpf_prog_aux *aux;
+	int kind;
+
+	unregister_ftrace_direct_multi(&tr_link->ops);
+
+	for (kind = 0; kind < BPF_TRAMP_MAX; kind++) {
+		hlist_for_each_entry(aux, &tr_link->tr->progs_hlist[kind], tramp_hlist)
+			bpf_prog_put(aux->prog);
+	}
+}
+
+static void bpf_tracing_multi_link_dealloc(struct bpf_link *link)
+{
+	struct bpf_tracing_multi_link *tr_link =
+		container_of(link, struct bpf_tracing_multi_link, link);
+
+	bpf_trampoline_multi_free(tr_link->tr);
+	kfree(tr_link);
+}
+
+static void bpf_tracing_multi_link_show_fdinfo(const struct bpf_link *link,
+					       struct seq_file *seq)
+{
+	struct bpf_tracing_multi_link *tr_link =
+		container_of(link, struct bpf_tracing_multi_link, link);
+
+	seq_printf(seq, "attach_type:\t%d\n", tr_link->attach_type);
+}
+
+static int bpf_tracing_multi_link_fill_link_info(const struct bpf_link *link,
+						 struct bpf_link_info *info)
+{
+	struct bpf_tracing_multi_link *tr_link =
+		container_of(link, struct bpf_tracing_multi_link, link);
+
+	info->tracing.attach_type = tr_link->attach_type;
+	return 0;
+}
+
+static int check_multi_prog_type(struct bpf_prog *prog)
+{
+	if (!prog->aux->multi_func &&
+	    prog->type != BPF_PROG_TYPE_TRACING)
+		return -EINVAL;
+	if (prog->expected_attach_type != BPF_TRACE_FENTRY &&
+	    prog->expected_attach_type != BPF_TRACE_FEXIT)
+		return -EINVAL;
+	return 0;
+}
+
+static int bpf_tracing_multi_link_update(struct bpf_link *link,
+					 struct bpf_prog *new_prog,
+					 struct bpf_prog *old_prog __maybe_unused)
+{
+	struct bpf_tracing_multi_link *tr_link =
+		container_of(link, struct bpf_tracing_multi_link, link);
+	int err;
+
+	if (check_multi_prog_type(new_prog))
+		return -EINVAL;
+
+	err = bpf_trampoline_link_prog(new_prog, tr_link->tr);
+	if (err)
+		return err;
+
+	err = modify_ftrace_direct_multi(&tr_link->ops,
+					 (unsigned long) tr_link->tr->cur_image->image);
+	return WARN_ON(err);
+}
+
+static const struct bpf_link_ops bpf_tracing_multi_link_lops = {
+	.release = bpf_tracing_multi_link_release,
+	.dealloc = bpf_tracing_multi_link_dealloc,
+	.show_fdinfo = bpf_tracing_multi_link_show_fdinfo,
+	.fill_link_info = bpf_tracing_multi_link_fill_link_info,
+	.update_prog = bpf_tracing_multi_link_update,
+};
+
+static void bpf_func_model_nargs(struct btf_func_model *m, int nr_args)
+{
+	int i;
+
+	for (i = 0; i < nr_args; i++)
+		m->arg_size[i] = 8;
+	m->ret_size = 8;
+	m->nr_args = nr_args;
+}
+
+static int bpf_tracing_multi_attach(struct bpf_prog *prog,
+				    const union bpf_attr *attr)
+{
+	void __user *ubtf_ids = u64_to_user_ptr(attr->link_create.multi_btf_ids);
+	u32 size, i, cnt = attr->link_create.multi_btf_ids_cnt;
+	struct bpf_tracing_multi_link *link = NULL;
+	struct bpf_link_primer link_primer;
+	struct bpf_trampoline *tr = NULL;
+	int err = -EINVAL;
+	u8 nr_args = 0;
+	u32 *btf_ids;
+
+	if (check_multi_prog_type(prog))
+		return -EINVAL;
+
+	size = cnt * sizeof(*btf_ids);
+	btf_ids = kmalloc(size, GFP_USER | __GFP_NOWARN);
+	if (!btf_ids)
+		return -ENOMEM;
+
+	err = -EFAULT;
+	if (ubtf_ids && copy_from_user(btf_ids, ubtf_ids, size))
+		goto out_free;
+
+	link = kzalloc(sizeof(*link), GFP_USER);
+	if (!link)
+		goto out_free;
+
+	for (i = 0; i < cnt; i++) {
+		struct bpf_attach_target_info tgt_info = {};
+
+		err = bpf_check_attach_target(NULL, prog, NULL, btf_ids[i],
+					      &tgt_info);
+		if (err)
+			goto out_free;
+
+		if (ftrace_set_filter_ip(&link->ops, tgt_info.tgt_addr, 0, 0))
+			goto out_free;
+
+		if (nr_args < tgt_info.fmodel.nr_args)
+			nr_args = tgt_info.fmodel.nr_args;
+	}
+
+	tr = bpf_trampoline_multi_alloc();
+	if (!tr)
+		goto out_free;
+
+	bpf_func_model_nargs(&tr->func.model, nr_args);
+
+	err = bpf_trampoline_link_prog(prog, tr);
+	if (err)
+		goto out_free;
+
+	err = register_ftrace_direct_multi(&link->ops, (unsigned long) tr->cur_image->image);
+	if (err)
+		goto out_free;
+
+	bpf_link_init(&link->link, BPF_LINK_TYPE_TRACING_MULTI,
+		      &bpf_tracing_multi_link_lops, prog);
+	link->attach_type = prog->expected_attach_type;
+
+	err = bpf_link_prime(&link->link, &link_primer);
+	if (err)
+		goto out_unlink;
+
+	link->tr = tr;
+	/* Take extra ref so we are even with progs added by link_update. */
+	bpf_prog_inc(prog);
+	return bpf_link_settle(&link_primer);
+
+out_unlink:
+	unregister_ftrace_direct_multi(&link->ops);
+out_free:
+	kfree(tr);
+	kfree(btf_ids);
+	kfree(link);
+	return err;
+}
+
 struct bpf_raw_tp_link {
 	struct bpf_link link;
 	struct bpf_raw_event_map *btp;
@@ -3043,6 +3222,8 @@ attach_type_to_prog_type(enum bpf_attach_type attach_type)
 	case BPF_CGROUP_SETSOCKOPT:
 		return BPF_PROG_TYPE_CGROUP_SOCKOPT;
 	case BPF_TRACE_ITER:
+	case BPF_TRACE_FENTRY:
+	case BPF_TRACE_FEXIT:
 		return BPF_PROG_TYPE_TRACING;
 	case BPF_SK_LOOKUP:
 		return BPF_PROG_TYPE_SK_LOOKUP;
@@ -4099,6 +4280,8 @@ static int tracing_bpf_link_attach(const union bpf_attr *attr, bpfptr_t uattr,
 
 	if (prog->expected_attach_type == BPF_TRACE_ITER)
 		return bpf_iter_link_attach(attr, uattr, prog);
+	else if (prog->aux->multi_func)
+		return bpf_tracing_multi_attach(prog, attr);
 	else if (prog->type == BPF_PROG_TYPE_EXT)
 		return bpf_tracing_prog_attach(prog,
 					       attr->link_create.target_fd,
@@ -4106,7 +4289,7 @@ static int tracing_bpf_link_attach(const union bpf_attr *attr, bpfptr_t uattr,
 	return -EINVAL;
 }
 
-#define BPF_LINK_CREATE_LAST_FIELD link_create.iter_info_len
+#define BPF_LINK_CREATE_LAST_FIELD link_create.multi_btf_ids_cnt
 static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 {
 	enum bpf_prog_type ptype;
diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
index 2755fdcf9fbf..660b8197c27f 100644
--- a/kernel/bpf/trampoline.c
+++ b/kernel/bpf/trampoline.c
@@ -58,7 +58,7 @@ void bpf_image_ksym_del(struct bpf_ksym *ksym)
 			   PAGE_SIZE, true, ksym->name);
 }
 
-static struct bpf_trampoline *bpf_trampoline_alloc(void)
+static struct bpf_trampoline *bpf_trampoline_alloc(bool multi)
 {
 	struct bpf_trampoline *tr;
 	int i;
@@ -72,6 +72,7 @@ static struct bpf_trampoline *bpf_trampoline_alloc(void)
 	mutex_init(&tr->mutex);
 	for (i = 0; i < BPF_TRAMP_MAX; i++)
 		INIT_HLIST_HEAD(&tr->progs_hlist[i]);
+	tr->multi = multi;
 	return tr;
 }
 
@@ -88,7 +89,7 @@ static struct bpf_trampoline *bpf_trampoline_lookup(u64 key)
 			goto out;
 		}
 	}
-	tr = bpf_trampoline_alloc();
+	tr = bpf_trampoline_alloc(false);
 	if (tr) {
 		tr->key = key;
 		hlist_add_head(&tr->hlist, head);
@@ -343,14 +344,16 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
 	struct bpf_tramp_image *im;
 	struct bpf_tramp_progs *tprogs;
 	u32 flags = BPF_TRAMP_F_RESTORE_REGS;
-	int err, total;
+	bool update = !tr->multi;
+	int err = 0, total;
 
 	tprogs = bpf_trampoline_get_progs(tr, &total);
 	if (IS_ERR(tprogs))
 		return PTR_ERR(tprogs);
 
 	if (total == 0) {
-		err = unregister_fentry(tr, tr->cur_image->image);
+		if (update)
+			err = unregister_fentry(tr, tr->cur_image->image);
 		bpf_tramp_image_put(tr->cur_image);
 		tr->cur_image = NULL;
 		tr->selector = 0;
@@ -363,9 +366,15 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
 		goto out;
 	}
 
+	if (tr->multi)
+		flags |= BPF_TRAMP_F_IP_ARG;
+
 	if (tprogs[BPF_TRAMP_FEXIT].nr_progs ||
-	    tprogs[BPF_TRAMP_MODIFY_RETURN].nr_progs)
+	    tprogs[BPF_TRAMP_MODIFY_RETURN].nr_progs) {
 		flags = BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_SKIP_FRAME;
+		if (tr->multi)
+			flags |= BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_IP_ARG;
+	}
 
 	err = arch_prepare_bpf_trampoline(im, im->image, im->image + PAGE_SIZE,
 					  &tr->func.model, flags, tprogs,
@@ -373,16 +382,19 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
 	if (err < 0)
 		goto out;
 
+	err = 0;
 	WARN_ON(tr->cur_image && tr->selector == 0);
 	WARN_ON(!tr->cur_image && tr->selector);
-	if (tr->cur_image)
-		/* progs already running at this address */
-		err = modify_fentry(tr, tr->cur_image->image, im->image);
-	else
-		/* first time registering */
-		err = register_fentry(tr, im->image);
-	if (err)
-		goto out;
+	if (update) {
+		if (tr->cur_image)
+			/* progs already running at this address */
+			err = modify_fentry(tr, tr->cur_image->image, im->image);
+		else
+			/* first time registering */
+			err = register_fentry(tr, im->image);
+		if (err)
+			goto out;
+	}
 	if (tr->cur_image)
 		bpf_tramp_image_put(tr->cur_image);
 	tr->cur_image = im;
@@ -436,6 +448,10 @@ int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
 			err = -EBUSY;
 			goto out;
 		}
+		if (tr->multi) {
+			err = -EINVAL;
+			goto out;
+		}
 		tr->extension_prog = prog;
 		err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, NULL,
 					 prog->bpf_func);
@@ -529,6 +545,17 @@ void bpf_trampoline_put(struct bpf_trampoline *tr)
 	mutex_unlock(&trampoline_mutex);
 }
 
+struct bpf_trampoline *bpf_trampoline_multi_alloc(void)
+{
+	return bpf_trampoline_alloc(true);
+}
+
+void bpf_trampoline_multi_free(struct bpf_trampoline *tr)
+{
+	bpf_tramp_image_put(tr->cur_image);
+	kfree(tr);
+}
+
 #define NO_START_TIME 1
 static u64 notrace bpf_prog_start_time(void)
 {
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index ad9340fb14d4..5fd6ff64e8dc 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1007,6 +1007,7 @@ enum bpf_link_type {
 	BPF_LINK_TYPE_ITER = 4,
 	BPF_LINK_TYPE_NETNS = 5,
 	BPF_LINK_TYPE_XDP = 6,
+	BPF_LINK_TYPE_TRACING_MULTI = 7,
 
 	MAX_BPF_LINK_TYPE,
 };
@@ -1454,6 +1455,10 @@ union bpf_attr {
 				__aligned_u64	iter_info;	/* extra bpf_iter_link_info */
 				__u32		iter_info_len;	/* iter_info length */
 			};
+			struct {
+				__aligned_u64	multi_btf_ids;		/* addresses to attach */
+				__u32		multi_btf_ids_cnt;	/* addresses count */
+			};
 		};
 	} link_create;
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH 14/19] libbpf: Add btf__find_by_pattern_kind function
  2021-06-05 11:10 [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach Jiri Olsa
                   ` (12 preceding siblings ...)
  2021-06-05 11:10 ` [PATCH 13/19] bpf: Add support to link multi func tracing program Jiri Olsa
@ 2021-06-05 11:10 ` Jiri Olsa
  2021-06-09  5:29   ` Andrii Nakryiko
  2021-06-05 11:10 ` [PATCH 15/19] libbpf: Add support to link multi func tracing program Jiri Olsa
                   ` (5 subsequent siblings)
  19 siblings, 1 reply; 76+ messages in thread
From: Jiri Olsa @ 2021-06-05 11:10 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

Adding the btf__find_by_pattern_kind function that returns an
array of BTF ids for a given function name pattern.

Using libc's regex.h support for the pattern matching.
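
For example, a minimal usage sketch (assuming a struct btf *btf
already loaded for vmlinux, e.g. via libbpf_find_kernel_btf();
error handling beyond the calls shown is omitted):

  __s32 *ids = NULL;
  int i, cnt;

  /* collect BTF ids of all FUNCs matching the pattern */
  cnt = btf__find_by_pattern_kind(btf, "bpf_fentry_test*",
                                  BTF_KIND_FUNC, &ids);
  if (cnt < 0)
          return cnt;
  for (i = 0; i < cnt; i++)
          printf("btf id %d\n", ids[i]);
  free(ids);  /* the caller owns the returned array */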

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/lib/bpf/btf.c | 68 +++++++++++++++++++++++++++++++++++++++++++++
 tools/lib/bpf/btf.h |  3 ++
 2 files changed, 71 insertions(+)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index b46760b93bb4..421dd6c1e44a 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
 /* Copyright (c) 2018 Facebook */
 
+#define _GNU_SOURCE
 #include <byteswap.h>
 #include <endian.h>
 #include <stdio.h>
@@ -16,6 +17,7 @@
 #include <linux/err.h>
 #include <linux/btf.h>
 #include <gelf.h>
+#include <regex.h>
 #include "btf.h"
 #include "bpf.h"
 #include "libbpf.h"
@@ -711,6 +713,72 @@ __s32 btf__find_by_name_kind(const struct btf *btf, const char *type_name,
 	return libbpf_err(-ENOENT);
 }
 
+static bool is_wildcard(char c)
+{
+	static const char *wildchars = "*?[|";
+
+	return strchr(wildchars, c);
+}
+
+int btf__find_by_pattern_kind(const struct btf *btf,
+			      const char *type_pattern, __u32 kind,
+			      __s32 **__ids)
+{
+	__u32 i, nr_types = btf__get_nr_types(btf);
+	__s32 *ids = NULL;
+	int cnt = 0, alloc = 0, ret;
+	regex_t regex;
+	char *pattern;
+
+	if (kind == BTF_KIND_UNKN || !strcmp(type_pattern, "void"))
+		return 0;
+
+	/* When the pattern does not start with wildcard, treat it as
+	 * if we'd want to match it from the beginning of the string.
+	 */
+	asprintf(&pattern, "%s%s",
+		 is_wildcard(type_pattern[0]) ? "^" : "",
+		 type_pattern);
+
+	ret = regcomp(&regex, pattern, REG_EXTENDED);
+	if (ret) {
+		pr_warn("failed to compile regex\n");
+		free(pattern);
+		return -EINVAL;
+	}
+
+	free(pattern);
+
+	for (i = 1; i <= nr_types; i++) {
+		const struct btf_type *t = btf__type_by_id(btf, i);
+		const char *name;
+		__s32 *p;
+
+		if (btf_kind(t) != kind)
+			continue;
+		name = btf__name_by_offset(btf, t->name_off);
+		if (name && regexec(&regex, name, 0, NULL, 0))
+			continue;
+		if (cnt == alloc) {
+			alloc = max(100, alloc * 3 / 2);
+			p = realloc(ids, alloc * sizeof(__u32));
+			if (!p) {
+				free(ids);
+				regfree(&regex);
+				return -ENOMEM;
+			}
+			ids = p;
+		}
+
+		ids[cnt] = i;
+		cnt++;
+	}
+
+	regfree(&regex);
+	*__ids = ids;
+	return cnt ?: -ENOENT;
+}
+
 static bool btf_is_modifiable(const struct btf *btf)
 {
 	return (void *)btf->hdr != btf->raw_data;
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index b54f1c3ebd57..036857aded94 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -371,6 +371,9 @@ btf_var_secinfos(const struct btf_type *t)
 	return (struct btf_var_secinfo *)(t + 1);
 }
 
+int btf__find_by_pattern_kind(const struct btf *btf,
+			      const char *type_pattern, __u32 kind,
+			      __s32 **__ids);
 #ifdef __cplusplus
 } /* extern "C" */
 #endif
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH 15/19] libbpf: Add support to link multi func tracing program
  2021-06-05 11:10 [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach Jiri Olsa
                   ` (13 preceding siblings ...)
  2021-06-05 11:10 ` [PATCH 14/19] libbpf: Add btf__find_by_pattern_kind function Jiri Olsa
@ 2021-06-05 11:10 ` Jiri Olsa
  2021-06-07  5:49   ` Yonghong Song
  2021-06-09  5:34   ` Andrii Nakryiko
  2021-06-05 11:10 ` [PATCH 16/19] selftests/bpf: Add fentry multi func test Jiri Olsa
                   ` (4 subsequent siblings)
  19 siblings, 2 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-05 11:10 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

Adding support to link a multi func tracing program
through the link_create interface.

Adding special section types for multi func programs:

  fentry.multi
  fexit.multi

so you can define multi func programs like:

  SEC("fentry.multi/bpf_fentry_test*")
  int BPF_PROG(test1, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f)

which defines test1 to be attached to the bpf_fentry_test* functions
and makes the ip and 6 arguments available to the program.

If no functions are specified, the program needs to be attached
manually (see the sketch below).

Adding new BTF id related fields to bpf_link_create_opts and
bpf_link_create to use them.
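
For the manual case, a minimal sketch using the new opts fields
(a sketch only: ids/cnt would come e.g. from the
btf__find_by_pattern_kind helper added earlier in this series;
error handling is omitted):

  DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts,
          .multi_btf_ids = ids,
          .multi_btf_ids_cnt = cnt,
  );
  int link_fd;

  link_fd = bpf_link_create(bpf_program__fd(prog), 0,
                            BPF_TRACE_FENTRY, &opts);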

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/lib/bpf/bpf.c    | 11 ++++++-
 tools/lib/bpf/bpf.h    |  4 ++-
 tools/lib/bpf/libbpf.c | 72 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 85 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index 86dcac44f32f..da892737b522 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -674,7 +674,8 @@ int bpf_link_create(int prog_fd, int target_fd,
 		    enum bpf_attach_type attach_type,
 		    const struct bpf_link_create_opts *opts)
 {
-	__u32 target_btf_id, iter_info_len;
+	__u32 target_btf_id, iter_info_len, multi_btf_ids_cnt;
+	__s32 *multi_btf_ids;
 	union bpf_attr attr;
 	int fd;
 
@@ -687,6 +688,9 @@ int bpf_link_create(int prog_fd, int target_fd,
 	if (iter_info_len && target_btf_id)
 		return libbpf_err(-EINVAL);
 
+	multi_btf_ids = OPTS_GET(opts, multi_btf_ids, 0);
+	multi_btf_ids_cnt = OPTS_GET(opts, multi_btf_ids_cnt, 0);
+
 	memset(&attr, 0, sizeof(attr));
 	attr.link_create.prog_fd = prog_fd;
 	attr.link_create.target_fd = target_fd;
@@ -701,6 +705,11 @@ int bpf_link_create(int prog_fd, int target_fd,
 		attr.link_create.target_btf_id = target_btf_id;
 	}
 
+	if (multi_btf_ids && multi_btf_ids_cnt) {
+		attr.link_create.multi_btf_ids = (__u64) multi_btf_ids;
+		attr.link_create.multi_btf_ids_cnt = multi_btf_ids_cnt;
+	}
+
 	fd = sys_bpf(BPF_LINK_CREATE, &attr, sizeof(attr));
 	return libbpf_err_errno(fd);
 }
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 4f758f8f50cd..2f78b6c34765 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -177,8 +177,10 @@ struct bpf_link_create_opts {
 	union bpf_iter_link_info *iter_info;
 	__u32 iter_info_len;
 	__u32 target_btf_id;
+	__s32 *multi_btf_ids;
+	__u32 multi_btf_ids_cnt;
 };
-#define bpf_link_create_opts__last_field target_btf_id
+#define bpf_link_create_opts__last_field multi_btf_ids_cnt
 
 LIBBPF_API int bpf_link_create(int prog_fd, int target_fd,
 			       enum bpf_attach_type attach_type,
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 65f87cc1220c..bd31de3b6a85 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -228,6 +228,7 @@ struct bpf_sec_def {
 	bool is_attachable;
 	bool is_attach_btf;
 	bool is_sleepable;
+	bool is_multi_func;
 	attach_fn_t attach_fn;
 };
 
@@ -7609,6 +7610,8 @@ __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz,
 
 		if (prog->sec_def->is_sleepable)
 			prog->prog_flags |= BPF_F_SLEEPABLE;
+		if (prog->sec_def->is_multi_func)
+			prog->prog_flags |= BPF_F_MULTI_FUNC;
 		bpf_program__set_type(prog, prog->sec_def->prog_type);
 		bpf_program__set_expected_attach_type(prog,
 				prog->sec_def->expected_attach_type);
@@ -9070,6 +9073,8 @@ static struct bpf_link *attach_raw_tp(const struct bpf_sec_def *sec,
 				      struct bpf_program *prog);
 static struct bpf_link *attach_trace(const struct bpf_sec_def *sec,
 				     struct bpf_program *prog);
+static struct bpf_link *attach_trace_multi(const struct bpf_sec_def *sec,
+					   struct bpf_program *prog);
 static struct bpf_link *attach_lsm(const struct bpf_sec_def *sec,
 				   struct bpf_program *prog);
 static struct bpf_link *attach_iter(const struct bpf_sec_def *sec,
@@ -9143,6 +9148,14 @@ static const struct bpf_sec_def section_defs[] = {
 		.attach_fn = attach_iter),
 	SEC_DEF("syscall", SYSCALL,
 		.is_sleepable = true),
+	SEC_DEF("fentry.multi/", TRACING,
+		.expected_attach_type = BPF_TRACE_FENTRY,
+		.is_multi_func = true,
+		.attach_fn = attach_trace_multi),
+	SEC_DEF("fexit.multi/", TRACING,
+		.expected_attach_type = BPF_TRACE_FEXIT,
+		.is_multi_func = true,
+		.attach_fn = attach_trace_multi),
 	BPF_EAPROG_SEC("xdp_devmap/",		BPF_PROG_TYPE_XDP,
 						BPF_XDP_DEVMAP),
 	BPF_EAPROG_SEC("xdp_cpumap/",		BPF_PROG_TYPE_XDP,
@@ -9584,6 +9597,9 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
 	if (!name)
 		return -EINVAL;
 
+	if (prog->prog_flags & BPF_F_MULTI_FUNC)
+		return 0;
+
 	for (i = 0; i < ARRAY_SIZE(section_defs); i++) {
 		if (!section_defs[i].is_attach_btf)
 			continue;
@@ -10537,6 +10553,62 @@ static struct bpf_link *bpf_program__attach_btf_id(struct bpf_program *prog)
 	return (struct bpf_link *)link;
 }
 
+static struct bpf_link *bpf_program__attach_multi(struct bpf_program *prog)
+{
+	char *pattern = prog->sec_name + prog->sec_def->len;
+	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
+	enum bpf_attach_type attach_type;
+	int prog_fd, link_fd, cnt, err;
+	struct bpf_link *link = NULL;
+	__s32 *ids = NULL;
+
+	prog_fd = bpf_program__fd(prog);
+	if (prog_fd < 0) {
+		pr_warn("prog '%s': can't attach before loaded\n", prog->name);
+		return ERR_PTR(-EINVAL);
+	}
+
+	err = bpf_object__load_vmlinux_btf(prog->obj, true);
+	if (err)
+		return ERR_PTR(err);
+
+	cnt = btf__find_by_pattern_kind(prog->obj->btf_vmlinux, pattern,
+					BTF_KIND_FUNC, &ids);
+	if (cnt <= 0)
+		return ERR_PTR(-EINVAL);
+
+	link = calloc(1, sizeof(*link));
+	if (!link) {
+		err = -ENOMEM;
+		goto out_err;
+	}
+	link->detach = &bpf_link__detach_fd;
+
+	opts.multi_btf_ids = ids;
+	opts.multi_btf_ids_cnt = cnt;
+
+	attach_type = bpf_program__get_expected_attach_type(prog);
+	link_fd = bpf_link_create(prog_fd, 0, attach_type, &opts);
+	if (link_fd < 0) {
+		err = -errno;
+		goto out_err;
+	}
+	link->fd = link_fd;
+	free(ids);
+	return link;
+
+out_err:
+	free(link);
+	free(ids);
+	return ERR_PTR(err);
+}
+
+static struct bpf_link *attach_trace_multi(const struct bpf_sec_def *sec,
+					   struct bpf_program *prog)
+{
+	return bpf_program__attach_multi(prog);
+}
+
 struct bpf_link *bpf_program__attach_trace(struct bpf_program *prog)
 {
 	return bpf_program__attach_btf_id(prog);
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH 16/19] selftests/bpf: Add fentry multi func test
  2021-06-05 11:10 [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach Jiri Olsa
                   ` (14 preceding siblings ...)
  2021-06-05 11:10 ` [PATCH 15/19] libbpf: Add support to link multi func tracing program Jiri Olsa
@ 2021-06-05 11:10 ` Jiri Olsa
  2021-06-07  6:06   ` Yonghong Song
  2021-06-09  5:40   ` Andrii Nakryiko
  2021-06-05 11:10 ` [PATCH 17/19] selftests/bpf: Add fexit " Jiri Olsa
                   ` (3 subsequent siblings)
  19 siblings, 2 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-05 11:10 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

Adding a selftest for the fentry multi func program that attaches
to the bpf_fentry_test* functions and checks argument values
based on the processed function.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/testing/selftests/bpf/multi_check.h     | 52 +++++++++++++++++++
 .../bpf/prog_tests/fentry_multi_test.c        | 43 +++++++++++++++
 .../selftests/bpf/progs/fentry_multi_test.c   | 18 +++++++
 3 files changed, 113 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/multi_check.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/fentry_multi_test.c
 create mode 100644 tools/testing/selftests/bpf/progs/fentry_multi_test.c

diff --git a/tools/testing/selftests/bpf/multi_check.h b/tools/testing/selftests/bpf/multi_check.h
new file mode 100644
index 000000000000..36c2a93f9be3
--- /dev/null
+++ b/tools/testing/selftests/bpf/multi_check.h
@@ -0,0 +1,52 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __MULTI_CHECK_H
+#define __MULTI_CHECK_H
+
+extern unsigned long long bpf_fentry_test[8];
+
+static __attribute__((unused)) inline
+void multi_arg_check(unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, __u64 *test_result)
+{
+	if (ip == bpf_fentry_test[0]) {
+		*test_result += (int) a == 1;
+	} else if (ip == bpf_fentry_test[1]) {
+		*test_result += (int) a == 2 && (__u64) b == 3;
+	} else if (ip == bpf_fentry_test[2]) {
+		*test_result += (char) a == 4 && (int) b == 5 && (__u64) c == 6;
+	} else if (ip == bpf_fentry_test[3]) {
+		*test_result += (void *) a == (void *) 7 && (char) b == 8 && (int) c == 9 && (__u64) d == 10;
+	} else if (ip == bpf_fentry_test[4]) {
+		*test_result += (__u64) a == 11 && (void *) b == (void *) 12 && (short) c == 13 && (int) d == 14 && (__u64) e == 15;
+	} else if (ip == bpf_fentry_test[5]) {
+		*test_result += (__u64) a == 16 && (void *) b == (void *) 17 && (short) c == 18 && (int) d == 19 && (void *) e == (void *) 20 && (__u64) f == 21;
+	} else if (ip == bpf_fentry_test[6]) {
+		*test_result += 1;
+	} else if (ip == bpf_fentry_test[7]) {
+		*test_result += 1;
+	}
+}
+
+static __attribute__((unused)) inline
+void multi_ret_check(unsigned long ip, int ret, __u64 *test_result)
+{
+	if (ip == bpf_fentry_test[0]) {
+		*test_result += ret == 2;
+	} else if (ip == bpf_fentry_test[1]) {
+		*test_result += ret == 5;
+	} else if (ip == bpf_fentry_test[2]) {
+		*test_result += ret == 15;
+	} else if (ip == bpf_fentry_test[3]) {
+		*test_result += ret == 34;
+	} else if (ip == bpf_fentry_test[4]) {
+		*test_result += ret == 65;
+	} else if (ip == bpf_fentry_test[5]) {
+		*test_result += ret == 111;
+	} else if (ip == bpf_fentry_test[6]) {
+		*test_result += ret == 0;
+	} else if (ip == bpf_fentry_test[7]) {
+		*test_result += ret == 0;
+	}
+}
+
+#endif /* __MULTI_CHECK_H */
diff --git a/tools/testing/selftests/bpf/prog_tests/fentry_multi_test.c b/tools/testing/selftests/bpf/prog_tests/fentry_multi_test.c
new file mode 100644
index 000000000000..e4a8089533d6
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/fentry_multi_test.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+#include "fentry_multi_test.skel.h"
+#include "trace_helpers.h"
+
+void test_fentry_multi_test(void)
+{
+	struct fentry_multi_test *skel = NULL;
+	unsigned long long *bpf_fentry_test;
+	__u32 duration = 0, retval;
+	int err, prog_fd;
+
+	skel = fentry_multi_test__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "fentry_multi_skel_load"))
+		goto cleanup;
+
+	bpf_fentry_test = &skel->bss->bpf_fentry_test[0];
+	ASSERT_OK(kallsyms_find("bpf_fentry_test1", &bpf_fentry_test[0]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test2", &bpf_fentry_test[1]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test3", &bpf_fentry_test[2]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test4", &bpf_fentry_test[3]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test5", &bpf_fentry_test[4]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test6", &bpf_fentry_test[5]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test7", &bpf_fentry_test[6]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test8", &bpf_fentry_test[7]), "kallsyms_find");
+
+	err = fentry_multi_test__attach(skel);
+	if (!ASSERT_OK(err, "fentry_attach"))
+		goto cleanup;
+
+	prog_fd = bpf_program__fd(skel->progs.test);
+	err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
+				NULL, NULL, &retval, &duration);
+	ASSERT_OK(err, "test_run");
+	ASSERT_EQ(retval, 0, "test_run");
+
+	ASSERT_EQ(skel->bss->test_result, 8, "test_result");
+
+	fentry_multi_test__detach(skel);
+
+cleanup:
+	fentry_multi_test__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/fentry_multi_test.c b/tools/testing/selftests/bpf/progs/fentry_multi_test.c
new file mode 100644
index 000000000000..a443fc958e5a
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/fentry_multi_test.c
@@ -0,0 +1,18 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include "multi_check.h"
+
+char _license[] SEC("license") = "GPL";
+
+unsigned long long bpf_fentry_test[8];
+
+__u64 test_result = 0;
+
+SEC("fentry.multi/bpf_fentry_test*")
+int BPF_PROG(test, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f)
+{
+	multi_arg_check(ip, a, b, c, d, e, f, &test_result);
+	return 0;
+}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH 17/19] selftests/bpf: Add fexit multi func test
  2021-06-05 11:10 [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach Jiri Olsa
                   ` (15 preceding siblings ...)
  2021-06-05 11:10 ` [PATCH 16/19] selftests/bpf: Add fentry multi func test Jiri Olsa
@ 2021-06-05 11:10 ` Jiri Olsa
  2021-06-05 11:10 ` [PATCH 18/19] selftests/bpf: Add fentry/fexit " Jiri Olsa
                   ` (2 subsequent siblings)
  19 siblings, 0 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-05 11:10 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

Adding a selftest for the fexit multi func program that attaches
to the bpf_fentry_test* functions and checks argument values
based on the processed function.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 .../bpf/prog_tests/fexit_multi_test.c         | 44 +++++++++++++++++++
 .../selftests/bpf/progs/fexit_multi_test.c    | 20 +++++++++
 2 files changed, 64 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/fexit_multi_test.c
 create mode 100644 tools/testing/selftests/bpf/progs/fexit_multi_test.c

diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_multi_test.c b/tools/testing/selftests/bpf/prog_tests/fexit_multi_test.c
new file mode 100644
index 000000000000..76408e1adba6
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/fexit_multi_test.c
@@ -0,0 +1,44 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+#include "fexit_multi_test.skel.h"
+#include "trace_helpers.h"
+
+void test_fexit_multi_test(void)
+{
+	struct fexit_multi_test *skel = NULL;
+	unsigned long long *bpf_fentry_test;
+	__u32 duration = 0, retval;
+	int err, prog_fd;
+
+	skel = fexit_multi_test__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "fexit_multi_skel_load"))
+		goto cleanup;
+
+	bpf_fentry_test = &skel->bss->bpf_fentry_test[0];
+	ASSERT_OK(kallsyms_find("bpf_fentry_test1", &bpf_fentry_test[0]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test2", &bpf_fentry_test[1]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test3", &bpf_fentry_test[2]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test4", &bpf_fentry_test[3]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test5", &bpf_fentry_test[4]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test6", &bpf_fentry_test[5]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test7", &bpf_fentry_test[6]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test8", &bpf_fentry_test[7]), "kallsyms_find");
+
+	err = fexit_multi_test__attach(skel);
+	if (!ASSERT_OK(err, "fexit_attach"))
+		goto cleanup;
+
+	prog_fd = bpf_program__fd(skel->progs.test);
+	err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
+				NULL, NULL, &retval, &duration);
+	ASSERT_OK(err, "test_run");
+	ASSERT_EQ(retval, 0, "test_run");
+
+	ASSERT_EQ(skel->bss->test_arg_result, 8, "fexit_multi_arg_result");
+	ASSERT_EQ(skel->bss->test_ret_result, 8, "fexit_multi_ret_result");
+
+	fexit_multi_test__detach(skel);
+
+cleanup:
+	fexit_multi_test__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/fexit_multi_test.c b/tools/testing/selftests/bpf/progs/fexit_multi_test.c
new file mode 100644
index 000000000000..365575cf05a0
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/fexit_multi_test.c
@@ -0,0 +1,20 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include "multi_check.h"
+
+char _license[] SEC("license") = "GPL";
+
+unsigned long long bpf_fentry_test[8];
+
+__u64 test_arg_result = 0;
+__u64 test_ret_result = 0;
+
+SEC("fexit.multi/bpf_fentry_test*")
+int BPF_PROG(test, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, int ret)
+{
+	multi_arg_check(ip, a, b, c, d, e, f, &test_arg_result);
+	multi_ret_check(ip, ret, &test_ret_result);
+	return 0;
+}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH 18/19] selftests/bpf: Add fentry/fexit multi func test
  2021-06-05 11:10 [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach Jiri Olsa
                   ` (16 preceding siblings ...)
  2021-06-05 11:10 ` [PATCH 17/19] selftests/bpf: Add fexit " Jiri Olsa
@ 2021-06-05 11:10 ` Jiri Olsa
  2021-06-09  5:41   ` Andrii Nakryiko
  2021-06-05 11:10 ` [PATCH 19/19] selftests/bpf: Temporary fix for fentry_fexit_multi_test Jiri Olsa
  2021-06-17 20:29 ` [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach Andrii Nakryiko
  19 siblings, 1 reply; 76+ messages in thread
From: Jiri Olsa @ 2021-06-05 11:10 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

Adding a selftest for the fentry/fexit multi func programs that
attach to the bpf_fentry_test* functions and check argument values
based on the processed function.

When multi_arg_check is used from 2 different places I'm getting
a compilation failure, which I have not deciphered yet:

  $ CLANG=/opt/clang/bin/clang LLC=/opt/clang/bin/llc make
    CLNG-BPF [test_maps] fentry_fexit_multi_test.o
  progs/fentry_fexit_multi_test.c:18:2: error: too many args to t24: i64 = \
  GlobalAddress<void (i64, i64, i64, i64, i64, i64, i64, i64*)* @multi_arg_check> 0, \
  progs/fentry_fexit_multi_test.c:18:2 @[ progs/fentry_fexit_multi_test.c:16:5 ]
          multi_arg_check(ip, a, b, c, d, e, f, &test1_arg_result);
          ^
  progs/fentry_fexit_multi_test.c:25:2: error: too many args to t32: i64 = \
  GlobalAddress<void (i64, i64, i64, i64, i64, i64, i64, i64*)* @multi_arg_check> 0, \
  progs/fentry_fexit_multi_test.c:25:2 @[ progs/fentry_fexit_multi_test.c:23:5 ]
          multi_arg_check(ip, a, b, c, d, e, f, &test2_arg_result);
          ^
  In file included from progs/fentry_fexit_multi_test.c:5:
  /home/jolsa/linux-qemu/tools/testing/selftests/bpf/multi_check.h:9:6: error: defined with too many args
  void multi_arg_check(unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, __u64 *test_result)
       ^
  /home/jolsa/linux-qemu/tools/testing/selftests/bpf/multi_check.h:9:6: error: defined with too many args
  /home/jolsa/linux-qemu/tools/testing/selftests/bpf/multi_check.h:9:6: error: defined with too many args
  5 errors generated.
  make: *** [Makefile:470: /home/jolsa/linux-qemu/tools/testing/selftests/bpf/fentry_fexit_multi_test.o] Error 1

I can fix that by defining 2 separate multi_arg_check functions
with different names, which I did in the follow-up temporary patch.
Not sure if I'm hitting some clang/BPF limitation here?

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 .../bpf/prog_tests/fentry_fexit_multi_test.c  | 52 +++++++++++++++++++
 .../bpf/progs/fentry_fexit_multi_test.c       | 28 ++++++++++
 2 files changed, 80 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/fentry_fexit_multi_test.c
 create mode 100644 tools/testing/selftests/bpf/progs/fentry_fexit_multi_test.c

diff --git a/tools/testing/selftests/bpf/prog_tests/fentry_fexit_multi_test.c b/tools/testing/selftests/bpf/prog_tests/fentry_fexit_multi_test.c
new file mode 100644
index 000000000000..76f917ad843d
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/fentry_fexit_multi_test.c
@@ -0,0 +1,52 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+#include "fentry_fexit_multi_test.skel.h"
+
+void test_fentry_fexit_multi_test(void)
+{
+	DECLARE_LIBBPF_OPTS(bpf_link_update_opts, link_upd_opts);
+	struct fentry_fexit_multi_test *skel = NULL;
+	unsigned long long *bpf_fentry_test;
+	__u32 duration = 0, retval;
+	struct bpf_link *link;
+	int err, prog_fd;
+
+	skel = fentry_fexit_multi_test__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "fentry_multi_skel_load"))
+		goto cleanup;
+
+	bpf_fentry_test = &skel->bss->bpf_fentry_test[0];
+	ASSERT_OK(kallsyms_find("bpf_fentry_test1", &bpf_fentry_test[0]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test2", &bpf_fentry_test[1]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test3", &bpf_fentry_test[2]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test4", &bpf_fentry_test[3]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test5", &bpf_fentry_test[4]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test6", &bpf_fentry_test[5]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test7", &bpf_fentry_test[6]), "kallsyms_find");
+	ASSERT_OK(kallsyms_find("bpf_fentry_test8", &bpf_fentry_test[7]), "kallsyms_find");
+
+	link = bpf_program__attach(skel->progs.test1);
+	if (!ASSERT_OK_PTR(link, "attach_fentry_fexit"))
+		goto cleanup;
+
+	err = bpf_link_update(bpf_link__fd(link),
+			      bpf_program__fd(skel->progs.test2),
+			      NULL);
+	if (!ASSERT_OK(err, "bpf_link_update"))
+		goto cleanup_link;
+
+	prog_fd = bpf_program__fd(skel->progs.test1);
+	err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
+				NULL, NULL, &retval, &duration);
+	ASSERT_OK(err, "test_run");
+	ASSERT_EQ(retval, 0, "test_run");
+
+	ASSERT_EQ(skel->bss->test1_arg_result, 8, "test1_arg_result");
+	ASSERT_EQ(skel->bss->test2_arg_result, 8, "test2_arg_result");
+	ASSERT_EQ(skel->bss->test2_ret_result, 8, "test2_ret_result");
+
+cleanup_link:
+	bpf_link__destroy(link);
+cleanup:
+	fentry_fexit_multi_test__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/fentry_fexit_multi_test.c b/tools/testing/selftests/bpf/progs/fentry_fexit_multi_test.c
new file mode 100644
index 000000000000..e25ab0085399
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/fentry_fexit_multi_test.c
@@ -0,0 +1,28 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include "multi_check.h"
+
+char _license[] SEC("license") = "GPL";
+
+unsigned long long bpf_fentry_test[8];
+
+__u64 test1_arg_result = 0;
+__u64 test2_arg_result = 0;
+__u64 test2_ret_result = 0;
+
+SEC("fentry.multi/bpf_fentry_test*")
+int BPF_PROG(test1, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f)
+{
+	multi_arg_check(ip, a, b, c, d, e, f, &test1_arg_result);
+	return 0;
+}
+
+SEC("fexit.multi/")
+int BPF_PROG(test2, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, int ret)
+{
+	multi_arg_check(ip, a, b, c, d, e, f, &test2_arg_result);
+	multi_ret_check(ip, ret, &test2_ret_result);
+	return 0;
+}
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* [PATCH 19/19] selftests/bpf: Temporary fix for fentry_fexit_multi_test
  2021-06-05 11:10 [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach Jiri Olsa
                   ` (17 preceding siblings ...)
  2021-06-05 11:10 ` [PATCH 18/19] selftests/bpf: Add fentry/fexit " Jiri Olsa
@ 2021-06-05 11:10 ` Jiri Olsa
  2021-06-17 20:29 ` [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach Andrii Nakryiko
  19 siblings, 0 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-05 11:10 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

When multi_arg_check is used from 2 different places I'm getting
a compilation failure, which I have not deciphered yet:

  $ CLANG=/opt/clang/bin/clang LLC=/opt/clang/bin/llc make
    CLNG-BPF [test_maps] fentry_fexit_multi_test.o
  progs/fentry_fexit_multi_test.c:18:2: error: too many args to t24: i64 = \
  GlobalAddress<void (i64, i64, i64, i64, i64, i64, i64, i64*)* @multi_arg_check> 0, \
  progs/fentry_fexit_multi_test.c:18:2 @[ progs/fentry_fexit_multi_test.c:16:5 ]
          multi_arg_check(ip, a, b, c, d, e, f, &test1_arg_result);
          ^
  progs/fentry_fexit_multi_test.c:25:2: error: too many args to t32: i64 = \
  GlobalAddress<void (i64, i64, i64, i64, i64, i64, i64, i64*)* @multi_arg_check> 0, \
  progs/fentry_fexit_multi_test.c:25:2 @[ progs/fentry_fexit_multi_test.c:23:5 ]
          multi_arg_check(ip, a, b, c, d, e, f, &test2_arg_result);
          ^
  In file included from progs/fentry_fexit_multi_test.c:5:
  /home/jolsa/linux-qemu/tools/testing/selftests/bpf/multi_check.h:9:6: error: defined with too many args
  void multi_arg_check(unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, __u64 *test_result)
       ^
  /home/jolsa/linux-qemu/tools/testing/selftests/bpf/multi_check.h:9:6: error: defined with too many args
  /home/jolsa/linux-qemu/tools/testing/selftests/bpf/multi_check.h:9:6: error: defined with too many args
  5 errors generated.
  make: *** [Makefile:470: /home/jolsa/linux-qemu/tools/testing/selftests/bpf/fentry_fexit_multi_test.o] Error 1

As a temporary fix, adding 2 instances of the multi_arg_check
function, one for each caller.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
---
 tools/testing/selftests/bpf/multi_check.h     | 41 ++++++++++---------
 .../bpf/progs/fentry_fexit_multi_test.c       |  7 +++-
 .../selftests/bpf/progs/fentry_multi_test.c   |  4 +-
 .../selftests/bpf/progs/fexit_multi_test.c    |  4 +-
 4 files changed, 32 insertions(+), 24 deletions(-)

diff --git a/tools/testing/selftests/bpf/multi_check.h b/tools/testing/selftests/bpf/multi_check.h
index 36c2a93f9be3..f720a6f9c6e4 100644
--- a/tools/testing/selftests/bpf/multi_check.h
+++ b/tools/testing/selftests/bpf/multi_check.h
@@ -5,26 +5,27 @@
 
 extern unsigned long long bpf_fentry_test[8];
 
-static __attribute__((unused)) inline
-void multi_arg_check(unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, __u64 *test_result)
-{
-	if (ip == bpf_fentry_test[0]) {
-		*test_result += (int) a == 1;
-	} else if (ip == bpf_fentry_test[1]) {
-		*test_result += (int) a == 2 && (__u64) b == 3;
-	} else if (ip == bpf_fentry_test[2]) {
-		*test_result += (char) a == 4 && (int) b == 5 && (__u64) c == 6;
-	} else if (ip == bpf_fentry_test[3]) {
-		*test_result += (void *) a == (void *) 7 && (char) b == 8 && (int) c == 9 && (__u64) d == 10;
-	} else if (ip == bpf_fentry_test[4]) {
-		*test_result += (__u64) a == 11 && (void *) b == (void *) 12 && (short) c == 13 && (int) d == 14 && (__u64) e == 15;
-	} else if (ip == bpf_fentry_test[5]) {
-		*test_result += (__u64) a == 16 && (void *) b == (void *) 17 && (short) c == 18 && (int) d == 19 && (void *) e == (void *) 20 && (__u64) f == 21;
-	} else if (ip == bpf_fentry_test[6]) {
-		*test_result += 1;
-	} else if (ip == bpf_fentry_test[7]) {
-		*test_result += 1;
-	}
+#define MULTI_ARG_CHECK(_name) \
+static __attribute__((unused)) inline \
+void _name ## _multi_arg_check(unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, __u64 *test_result)						\
+{																					\
+	if (ip == bpf_fentry_test[0]) {																	\
+		*test_result +=	(int) a == 1;																\
+	} else if (ip == bpf_fentry_test[1]) {																\
+		*test_result +=	(int) a == 2 && (__u64) b == 3;														\
+	} else if (ip == bpf_fentry_test[2]) {																\
+		*test_result +=	(char) a == 4 && (int) b == 5 && (__u64) c == 6;											\
+	} else if (ip == bpf_fentry_test[3]) {																\
+		*test_result +=	(void *) a == (void *) 7 && (char) b == 8 && (int) c == 9 && (__u64) d == 10;								\
+	} else if (ip == bpf_fentry_test[4]) {																\
+		*test_result +=	(__u64) a == 11 && (void *) b == (void *) 12 && (short) c == 13 && (int) d == 14 && (__u64) e == 15;					\
+	} else if (ip == bpf_fentry_test[5]) {																\
+		*test_result +=	(__u64) a == 16 && (void *) b == (void *) 17 && (short) c == 18 && (int) d == 19 && (void *) e == (void *) 20 && (__u64) f == 21;	\
+	} else if (ip == bpf_fentry_test[6]) {																\
+		*test_result += 1;																	\
+	} else if (ip == bpf_fentry_test[7]) {																\
+		*test_result += 1;																	\
+	}																				\
 }
 
 static __attribute__((unused)) inline
diff --git a/tools/testing/selftests/bpf/progs/fentry_fexit_multi_test.c b/tools/testing/selftests/bpf/progs/fentry_fexit_multi_test.c
index e25ab0085399..dc5b51f20b84 100644
--- a/tools/testing/selftests/bpf/progs/fentry_fexit_multi_test.c
+++ b/tools/testing/selftests/bpf/progs/fentry_fexit_multi_test.c
@@ -6,6 +6,9 @@
 
 char _license[] SEC("license") = "GPL";
 
+MULTI_ARG_CHECK(fentry)
+MULTI_ARG_CHECK(fexit)
+
 unsigned long long bpf_fentry_test[8];
 
 __u64 test1_arg_result = 0;
@@ -15,14 +18,14 @@ __u64 test2_ret_result = 0;
 SEC("fentry.multi/bpf_fentry_test*")
 int BPF_PROG(test1, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f)
 {
-	multi_arg_check(ip, a, b, c, d, e, f, &test1_arg_result);
+	fentry_multi_arg_check(ip, a, b, c, d, e, f, &test1_arg_result);
 	return 0;
 }
 
 SEC("fexit.multi/")
 int BPF_PROG(test2, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, int ret)
 {
-	multi_arg_check(ip, a, b, c, d, e, f, &test2_arg_result);
+	fexit_multi_arg_check(ip, a, b, c, d, e, f, &test2_arg_result);
 	multi_ret_check(ip, ret, &test2_ret_result);
 	return 0;
 }
diff --git a/tools/testing/selftests/bpf/progs/fentry_multi_test.c b/tools/testing/selftests/bpf/progs/fentry_multi_test.c
index a443fc958e5a..b3a025632e77 100644
--- a/tools/testing/selftests/bpf/progs/fentry_multi_test.c
+++ b/tools/testing/selftests/bpf/progs/fentry_multi_test.c
@@ -6,6 +6,8 @@
 
 char _license[] SEC("license") = "GPL";
 
+MULTI_ARG_CHECK(fentry)
+
 unsigned long long bpf_fentry_test[8];
 
 __u64 test_result = 0;
@@ -13,6 +15,6 @@ __u64 test_result = 0;
 SEC("fentry.multi/bpf_fentry_test*")
 int BPF_PROG(test, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f)
 {
-	multi_arg_check(ip, a, b, c, d, e, f, &test_result);
+	fentry_multi_arg_check(ip, a, b, c, d, e, f, &test_result);
 	return 0;
 }
diff --git a/tools/testing/selftests/bpf/progs/fexit_multi_test.c b/tools/testing/selftests/bpf/progs/fexit_multi_test.c
index 365575cf05a0..8af0d65128d6 100644
--- a/tools/testing/selftests/bpf/progs/fexit_multi_test.c
+++ b/tools/testing/selftests/bpf/progs/fexit_multi_test.c
@@ -6,6 +6,8 @@
 
 char _license[] SEC("license") = "GPL";
 
+MULTI_ARG_CHECK(fexit)
+
 unsigned long long bpf_fentry_test[8];
 
 __u64 test_arg_result = 0;
@@ -14,7 +16,7 @@ __u64 test_ret_result = 0;
 SEC("fexit.multi/bpf_fentry_test*")
 int BPF_PROG(test, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, int ret)
 {
-	multi_arg_check(ip, a, b, c, d, e, f, &test_arg_result);
+	fexit_multi_arg_check(ip, a, b, c, d, e, f, &test_arg_result);
 	multi_ret_check(ip, ret, &test_ret_result);
 	return 0;
 }
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 76+ messages in thread

* Re: [PATCH 09/19] bpf, x64: Allow to use caller address from stack
  2021-06-05 11:10 ` [PATCH 09/19] bpf, x64: Allow to use caller address from stack Jiri Olsa
@ 2021-06-07  3:07   ` Yonghong Song
  2021-06-07 18:13     ` Jiri Olsa
  0 siblings, 1 reply; 76+ messages in thread
From: Yonghong Song @ 2021-06-07  3:07 UTC (permalink / raw)
  To: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik



On 6/5/21 4:10 AM, Jiri Olsa wrote:
> Currently we call the original function by using the absolute address
> given at the JIT generation. That's not usable when having trampoline
> attached to multiple functions. In this case we need to take the
> return address from the stack.

Here, it is mentioned that the return address is taken from the stack.

> 
> Adding support to retrieve the original function address from the stack

Here, it is said to take the original function address from the stack.

> by adding new BPF_TRAMP_F_ORIG_STACK flag for arch_prepare_bpf_trampoline
> function.
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>   arch/x86/net/bpf_jit_comp.c | 13 +++++++++----
>   include/linux/bpf.h         |  5 +++++
>   2 files changed, 14 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index 2a2e290fa5d8..b77e6bd78354 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -2013,10 +2013,15 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>   	if (flags & BPF_TRAMP_F_CALL_ORIG) {
>   		restore_regs(m, &prog, nr_args, stack_size);
>   
> -		/* call original function */
> -		if (emit_call(&prog, orig_call, prog)) {
> -			ret = -EINVAL;
> -			goto cleanup;
> +		if (flags & BPF_TRAMP_F_ORIG_STACK) {
> +			emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);

This loads a double word from base_pointer + 8, which should be the
function return address on x86, yet we try to call it. I guess I must
have missed something here. Could you give some explanation?

> +			EMIT2(0xff, 0xd0); /* call *rax */
> +		} else {
> +			/* call original function */
> +			if (emit_call(&prog, orig_call, prog)) {
> +				ret = -EINVAL;
> +				goto cleanup;
> +			}
>   		}
>   		/* remember return value in a stack for bpf prog to access */
>   		emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 86dec5001ae2..16fc600503fb 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -554,6 +554,11 @@ struct btf_func_model {
>    */
>   #define BPF_TRAMP_F_SKIP_FRAME		BIT(2)
>   
> +/* Get original function from stack instead of from provided direct address.
> + * Makes sense for fexit programs only.
> + */
> +#define BPF_TRAMP_F_ORIG_STACK		BIT(3)
> +
>   /* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
>    * bytes on x86.  Pick a number to fit into BPF_IMAGE_SIZE / 2
>    */
> 

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 10/19] bpf: Allow to store caller's ip as argument
  2021-06-05 11:10 ` [PATCH 10/19] bpf: Allow to store caller's ip as argument Jiri Olsa
@ 2021-06-07  3:21   ` Yonghong Song
  2021-06-07 18:15     ` Jiri Olsa
  2021-06-08 18:49   ` Andrii Nakryiko
  1 sibling, 1 reply; 76+ messages in thread
From: Yonghong Song @ 2021-06-07  3:21 UTC (permalink / raw)
  To: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik



On 6/5/21 4:10 AM, Jiri Olsa wrote:
> When we will have multiple functions attached to trampoline
> we need to propagate the function's address to the bpf program.
> 
> Adding new BPF_TRAMP_F_IP_ARG flag to arch_prepare_bpf_trampoline
> function that will store origin caller's address before function's
> arguments.
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>   arch/x86/net/bpf_jit_comp.c | 18 ++++++++++++++----
>   include/linux/bpf.h         |  5 +++++
>   2 files changed, 19 insertions(+), 4 deletions(-)
> 
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index b77e6bd78354..d2425c18272a 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -1951,7 +1951,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>   				void *orig_call)
>   {
>   	int ret, i, cnt = 0, nr_args = m->nr_args;
> -	int stack_size = nr_args * 8;
> +	int stack_size = nr_args * 8, ip_arg = 0;
>   	struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
>   	struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
>   	struct bpf_tramp_progs *fmod_ret = &tprogs[BPF_TRAMP_MODIFY_RETURN];
> @@ -1975,6 +1975,9 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>   		 */
>   		orig_call += X86_PATCH_SIZE;
>   
> +	if (flags & BPF_TRAMP_F_IP_ARG)
> +		stack_size += 8;
> +
>   	prog = image;
>   
>   	EMIT1(0x55);		 /* push rbp */
> @@ -1982,7 +1985,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>   	EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */
>   	EMIT1(0x53);		 /* push rbx */
>   
> -	save_regs(m, &prog, nr_args, stack_size);
> +	if (flags & BPF_TRAMP_F_IP_ARG) {
> +		emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> +		EMIT4(0x48, 0x83, 0xe8, X86_PATCH_SIZE); /* sub $X86_PATCH_SIZE,%rax*/

Could you explain what the above EMIT4 is for? I am not quite familiar
with this piece of code, hence the question. Some comments here
would help too.

> +		emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -stack_size);
> +		ip_arg = 8;
> +	}
> +
> +	save_regs(m, &prog, nr_args, stack_size - ip_arg);
>   
>   	if (flags & BPF_TRAMP_F_CALL_ORIG) {
>   		/* arg1: mov rdi, im */
> @@ -2011,7 +2021,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>   	}
>   
>   	if (flags & BPF_TRAMP_F_CALL_ORIG) {
> -		restore_regs(m, &prog, nr_args, stack_size);
> +		restore_regs(m, &prog, nr_args, stack_size - ip_arg);
>   
>   		if (flags & BPF_TRAMP_F_ORIG_STACK) {
>   			emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> @@ -2052,7 +2062,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>   		}
>   
>   	if (flags & BPF_TRAMP_F_RESTORE_REGS)
> -		restore_regs(m, &prog, nr_args, stack_size);
> +		restore_regs(m, &prog, nr_args, stack_size - ip_arg);
>   
>   	/* This needs to be done regardless. If there were fmod_ret programs,
>   	 * the return value is only updated on the stack and still needs to be
[...]

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 11/19] bpf: Add support to load multi func tracing program
  2021-06-05 11:10 ` [PATCH 11/19] bpf: Add support to load multi func tracing program Jiri Olsa
@ 2021-06-07  3:56   ` Yonghong Song
  2021-06-07 18:18     ` Jiri Olsa
  0 siblings, 1 reply; 76+ messages in thread
From: Yonghong Song @ 2021-06-07  3:56 UTC (permalink / raw)
  To: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik



On 6/5/21 4:10 AM, Jiri Olsa wrote:
> Adding support to load a tracing program with the new BPF_F_MULTI_FUNC
> flag, which allows the program to be loaded without a specific function
> to attach to.
> 
> The verifier assumes the program is using all (6) available arguments

Is this a verifier failure or is it due to the check at the
beginning of arch_prepare_bpf_trampoline()?

         /* x86-64 supports up to 6 arguments. 7+ can be added in the future */
         if (nr_args > 6)
                 return -ENOTSUPP;

If it is indeed due to arch_prepare_bpf_trampoline(), maybe we
can improve it instead of specially processing the first argument
"ip" in quite a few places?

> as unsigned long values. We can't add an extra ip argument at this
> time, because the JIT on x86 would fail to process this function.
> Instead we allow access to an extra first 'ip' argument in
> btf_ctx_access.
> 
> Such a program will be allowed to be attached to multiple functions
> in the following patches.
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>   include/linux/bpf.h            |  1 +
>   include/uapi/linux/bpf.h       |  7 +++++++
>   kernel/bpf/btf.c               |  5 +++++
>   kernel/bpf/syscall.c           | 35 +++++++++++++++++++++++++++++-----
>   kernel/bpf/verifier.c          |  3 ++-
>   tools/include/uapi/linux/bpf.h |  7 +++++++
>   6 files changed, 52 insertions(+), 6 deletions(-)
> 
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 6cbf3c81c650..23221e0e8d3c 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -845,6 +845,7 @@ struct bpf_prog_aux {
>   	bool sleepable;
>   	bool tail_call_reachable;
>   	struct hlist_node tramp_hlist;
> +	bool multi_func;

Move this field right after "tail_call_reachable"?

>   	/* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
>   	const struct btf_type *attach_func_proto;
>   	/* function name for valid attach_btf_id */
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 2c1ba70abbf1..ad9340fb14d4 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1109,6 +1109,13 @@ enum bpf_link_type {
>    */
>   #define BPF_F_SLEEPABLE		(1U << 4)
>   
> +/* If BPF_F_MULTI_FUNC is used in BPF_PROG_LOAD command, the verifier does
> + * not expect BTF ID for the program, instead it assumes it's function
> + * with 6 u64 arguments. No trampoline is created for the program. Such
> + * program can be attached to multiple functions.
> + */
> +#define BPF_F_MULTI_FUNC	(1U << 5)
> +
>   /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
>    * the following extensions:
>    *
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index a6e39c5ea0bf..c233aaa6a709 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -4679,6 +4679,11 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
>   		args++;
>   		nr_args--;
>   	}
> +	if (prog->aux->multi_func) {
> +		if (arg == 0)
> +			return true;
> +		arg--;

Some comments in the above mentioning that "the first 'ip' argument
is omitted" would be good.

> +	}
>   
>   	if (arg > nr_args) {
>   		bpf_log(log, "func '%s' doesn't have %d-th argument\n",
[...]

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 13/19] bpf: Add support to link multi func tracing program
  2021-06-05 11:10 ` [PATCH 13/19] bpf: Add support to link multi func tracing program Jiri Olsa
@ 2021-06-07  5:36   ` Yonghong Song
  2021-06-07 18:25     ` Jiri Olsa
  2021-06-08 15:42   ` Alexei Starovoitov
  2021-06-09  5:18   ` Andrii Nakryiko
  2 siblings, 1 reply; 76+ messages in thread
From: Yonghong Song @ 2021-06-07  5:36 UTC (permalink / raw)
  To: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik



On 6/5/21 4:10 AM, Jiri Olsa wrote:
> Adding support to attach multiple functions to a tracing program
> by using the link_create/link_update interface.
> 
> Adding a multi_btf_ids/multi_btf_ids_cnt pair to the link_create
> struct API, which defines an array of function btf ids that will
> be attached to prog_fd.
> 
> The prog_fd needs to be a multi func tracing program (BPF_F_MULTI_FUNC).
> 
> The new link_create interface creates a new BPF_LINK_TYPE_TRACING_MULTI
> link type, which creates a separate bpf_trampoline and registers it
> as a direct function for all specified btf ids.
> 
> The new bpf_trampoline is out of the scope (bpf_trampoline_lookup) of
> standard trampolines, so all registered functions need to be free
> of direct functions, otherwise the link fails.

I am not sure how severe such a limitation could be in practice.
It is possible that in production some non-multi fentry/fexit program
may run continuously. Does a kprobe program impact this as well?

> 
> The new bpf_trampoline will store and pass to the bpf program the
> highest number of arguments from all given functions.
> 
> New programs (fentry or fexit) can be added to the existing trampoline
> through the link_update interface via the new_prog_fd descriptor.

Looks like we do not support replacing old programs. Do we support
removing old programs?

> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>   include/linux/bpf.h            |   3 +
>   include/uapi/linux/bpf.h       |   5 +
>   kernel/bpf/syscall.c           | 185 ++++++++++++++++++++++++++++++++-
>   kernel/bpf/trampoline.c        |  53 +++++++---
>   tools/include/uapi/linux/bpf.h |   5 +
>   5 files changed, 237 insertions(+), 14 deletions(-)
> 
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 23221e0e8d3c..99a81c6c22e6 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -661,6 +661,7 @@ struct bpf_trampoline {
>   	struct bpf_tramp_image *cur_image;
>   	u64 selector;
>   	struct module *mod;
> +	bool multi;
>   };
>   
>   struct bpf_attach_target_info {
> @@ -746,6 +747,8 @@ void bpf_ksym_add(struct bpf_ksym *ksym);
>   void bpf_ksym_del(struct bpf_ksym *ksym);
>   int bpf_jit_charge_modmem(u32 pages);
>   void bpf_jit_uncharge_modmem(u32 pages);
> +struct bpf_trampoline *bpf_trampoline_multi_alloc(void);
> +void bpf_trampoline_multi_free(struct bpf_trampoline *tr);
>   #else
>   static inline int bpf_trampoline_link_prog(struct bpf_prog *prog,
>   					   struct bpf_trampoline *tr)
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index ad9340fb14d4..5fd6ff64e8dc 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1007,6 +1007,7 @@ enum bpf_link_type {
>   	BPF_LINK_TYPE_ITER = 4,
>   	BPF_LINK_TYPE_NETNS = 5,
>   	BPF_LINK_TYPE_XDP = 6,
> +	BPF_LINK_TYPE_TRACING_MULTI = 7,
>   
>   	MAX_BPF_LINK_TYPE,
>   };
> @@ -1454,6 +1455,10 @@ union bpf_attr {
>   				__aligned_u64	iter_info;	/* extra bpf_iter_link_info */
>   				__u32		iter_info_len;	/* iter_info length */
>   			};
> +			struct {
> +				__aligned_u64	multi_btf_ids;		/* addresses to attach */
> +				__u32		multi_btf_ids_cnt;	/* addresses count */
> +			};
>   		};
>   	} link_create;
>   
[...]
> +static int bpf_tracing_multi_link_fill_link_info(const struct bpf_link *link,
> +						 struct bpf_link_info *info)
> +{
> +	struct bpf_tracing_multi_link *tr_link =
> +		container_of(link, struct bpf_tracing_multi_link, link);
> +
> +	info->tracing.attach_type = tr_link->attach_type;
> +	return 0;
> +}
> +
> +static int check_multi_prog_type(struct bpf_prog *prog)
> +{
> +	if (!prog->aux->multi_func &&
> +	    prog->type != BPF_PROG_TYPE_TRACING)

I think prog->type != BPF_PROG_TYPE_TRACING is not needed, it should 
have been checked during program load time?

> +		return -EINVAL;
> +	if (prog->expected_attach_type != BPF_TRACE_FENTRY &&
> +	    prog->expected_attach_type != BPF_TRACE_FEXIT)
> +		return -EINVAL;
> +	return 0;
> +}
> +
> +static int bpf_tracing_multi_link_update(struct bpf_link *link,
> +					 struct bpf_prog *new_prog,
> +					 struct bpf_prog *old_prog __maybe_unused)
> +{
> +	struct bpf_tracing_multi_link *tr_link =
> +		container_of(link, struct bpf_tracing_multi_link, link);
> +	int err;
> +
> +	if (check_multi_prog_type(new_prog))
> +		return -EINVAL;
> +
> +	err = bpf_trampoline_link_prog(new_prog, tr_link->tr);
> +	if (err)
> +		return err;
> +
> +	err = modify_ftrace_direct_multi(&tr_link->ops,
> +					 (unsigned long) tr_link->tr->cur_image->image);
> +	return WARN_ON(err);

Why WARN_ON here? Some comments would be good.

> +}
> +
> +static const struct bpf_link_ops bpf_tracing_multi_link_lops = {
> +	.release = bpf_tracing_multi_link_release,
> +	.dealloc = bpf_tracing_multi_link_dealloc,
> +	.show_fdinfo = bpf_tracing_multi_link_show_fdinfo,
> +	.fill_link_info = bpf_tracing_multi_link_fill_link_info,
> +	.update_prog = bpf_tracing_multi_link_update,
> +};
> +
[...]
> +
>   struct bpf_raw_tp_link {
>   	struct bpf_link link;
>   	struct bpf_raw_event_map *btp;
> @@ -3043,6 +3222,8 @@ attach_type_to_prog_type(enum bpf_attach_type attach_type)
>   	case BPF_CGROUP_SETSOCKOPT:
>   		return BPF_PROG_TYPE_CGROUP_SOCKOPT;
>   	case BPF_TRACE_ITER:
> +	case BPF_TRACE_FENTRY:
> +	case BPF_TRACE_FEXIT:
>   		return BPF_PROG_TYPE_TRACING;
>   	case BPF_SK_LOOKUP:
>   		return BPF_PROG_TYPE_SK_LOOKUP;
> @@ -4099,6 +4280,8 @@ static int tracing_bpf_link_attach(const union bpf_attr *attr, bpfptr_t uattr,
>   
>   	if (prog->expected_attach_type == BPF_TRACE_ITER)
>   		return bpf_iter_link_attach(attr, uattr, prog);
> +	else if (prog->aux->multi_func)
> +		return bpf_tracing_multi_attach(prog, attr);
>   	else if (prog->type == BPF_PROG_TYPE_EXT)
>   		return bpf_tracing_prog_attach(prog,
>   					       attr->link_create.target_fd,
> @@ -4106,7 +4289,7 @@ static int tracing_bpf_link_attach(const union bpf_attr *attr, bpfptr_t uattr,
>   	return -EINVAL;
>   }
>   
> -#define BPF_LINK_CREATE_LAST_FIELD link_create.iter_info_len
> +#define BPF_LINK_CREATE_LAST_FIELD link_create.multi_btf_ids_cnt

It is okay that we don't change this. link_create.iter_info_len
has the same effect since it is a union.
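
Both anonymous structs in the union lay out their members at the same
offsets, e.g.:

	struct {
		__aligned_u64	iter_info;		/* offset 0 */
		__u32		iter_info_len;		/* offset 8 */
	};
	struct {
		__aligned_u64	multi_btf_ids;		/* offset 0 */
		__u32		multi_btf_ids_cnt;	/* offset 8 */
	};

so the computed last-field offset is identical either way.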

>   static int link_create(union bpf_attr *attr, bpfptr_t uattr)
>   {
>   	enum bpf_prog_type ptype;
> diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> index 2755fdcf9fbf..660b8197c27f 100644
> --- a/kernel/bpf/trampoline.c
> +++ b/kernel/bpf/trampoline.c
> @@ -58,7 +58,7 @@ void bpf_image_ksym_del(struct bpf_ksym *ksym)
>   			   PAGE_SIZE, true, ksym->name);
>   }
>   
> -static struct bpf_trampoline *bpf_trampoline_alloc(void)
> +static struct bpf_trampoline *bpf_trampoline_alloc(bool multi)
>   {
>   	struct bpf_trampoline *tr;
>   	int i;
> @@ -72,6 +72,7 @@ static struct bpf_trampoline *bpf_trampoline_alloc(void)
>   	mutex_init(&tr->mutex);
>   	for (i = 0; i < BPF_TRAMP_MAX; i++)
>   		INIT_HLIST_HEAD(&tr->progs_hlist[i]);
> +	tr->multi = multi;
>   	return tr;
>   }
>   
> @@ -88,7 +89,7 @@ static struct bpf_trampoline *bpf_trampoline_lookup(u64 key)
>   			goto out;
>   		}
>   	}
> -	tr = bpf_trampoline_alloc();
> +	tr = bpf_trampoline_alloc(false);
>   	if (tr) {
>   		tr->key = key;
>   		hlist_add_head(&tr->hlist, head);
> @@ -343,14 +344,16 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
>   	struct bpf_tramp_image *im;
>   	struct bpf_tramp_progs *tprogs;
>   	u32 flags = BPF_TRAMP_F_RESTORE_REGS;
> -	int err, total;
> +	bool update = !tr->multi;
> +	int err = 0, total;
>   
>   	tprogs = bpf_trampoline_get_progs(tr, &total);
>   	if (IS_ERR(tprogs))
>   		return PTR_ERR(tprogs);
>   
>   	if (total == 0) {
> -		err = unregister_fentry(tr, tr->cur_image->image);
> +		if (update)
> +			err = unregister_fentry(tr, tr->cur_image->image);
>   		bpf_tramp_image_put(tr->cur_image);
>   		tr->cur_image = NULL;
>   		tr->selector = 0;
> @@ -363,9 +366,15 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
>   		goto out;
>   	}
>   
> +	if (tr->multi)
> +		flags |= BPF_TRAMP_F_IP_ARG;
> +
>   	if (tprogs[BPF_TRAMP_FEXIT].nr_progs ||
> -	    tprogs[BPF_TRAMP_MODIFY_RETURN].nr_progs)
> +	    tprogs[BPF_TRAMP_MODIFY_RETURN].nr_progs) {
>   		flags = BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_SKIP_FRAME;
> +		if (tr->multi)
> +			flags |= BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_IP_ARG;

BPF_TRAMP_F_IP_ARG is not needed. It has been added before.

> +	}
>   
>   	err = arch_prepare_bpf_trampoline(im, im->image, im->image + PAGE_SIZE,
>   					  &tr->func.model, flags, tprogs,
> @@ -373,16 +382,19 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
>   	if (err < 0)
>   		goto out;
>   
> +	err = 0;
>   	WARN_ON(tr->cur_image && tr->selector == 0);
>   	WARN_ON(!tr->cur_image && tr->selector);
> -	if (tr->cur_image)
> -		/* progs already running at this address */
> -		err = modify_fentry(tr, tr->cur_image->image, im->image);
> -	else
> -		/* first time registering */
> -		err = register_fentry(tr, im->image);
> -	if (err)
> -		goto out;
> +	if (update) {
> +		if (tr->cur_image)
> +			/* progs already running at this address */
> +			err = modify_fentry(tr, tr->cur_image->image, im->image);
> +		else
> +			/* first time registering */
> +			err = register_fentry(tr, im->image);
> +		if (err)
> +			goto out;
> +	}
>   	if (tr->cur_image)
>   		bpf_tramp_image_put(tr->cur_image);
>   	tr->cur_image = im;
> @@ -436,6 +448,10 @@ int bpf_trampoline_link_prog(struct bpf_prog *prog, struct bpf_trampoline *tr)
>   			err = -EBUSY;
>   			goto out;
>   		}
> +		if (tr->multi) {
> +			err = -EINVAL;
> +			goto out;
> +		}
>   		tr->extension_prog = prog;
>   		err = bpf_arch_text_poke(tr->func.addr, BPF_MOD_JUMP, NULL,
>   					 prog->bpf_func);
[...]

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 15/19] libbpf: Add support to link multi func tracing program
  2021-06-05 11:10 ` [PATCH 15/19] libbpf: Add support to link multi func tracing program Jiri Olsa
@ 2021-06-07  5:49   ` Yonghong Song
  2021-06-07 18:28     ` Jiri Olsa
  2021-06-09  5:34   ` Andrii Nakryiko
  1 sibling, 1 reply; 76+ messages in thread
From: Yonghong Song @ 2021-06-07  5:49 UTC (permalink / raw)
  To: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik



On 6/5/21 4:10 AM, Jiri Olsa wrote:
> Adding support to link a multi func tracing program
> through the link_create interface.
> 
> Adding special types for multi func programs:
> 
>    fentry.multi
>    fexit.multi
> 
> so you can define multi func programs like:
> 
>    SEC("fentry.multi/bpf_fentry_test*")
>    int BPF_PROG(test1, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f)
> 
> that defines test1 to be attached to the bpf_fentry_test* functions,
> and able to access ip and 6 arguments.
> 
> If functions are not specified, the program needs to be attached
> manually.
> 
> Adding new btf id related fields to bpf_link_create_opts and
> bpf_link_create to use them.
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>   tools/lib/bpf/bpf.c    | 11 ++++++-
>   tools/lib/bpf/bpf.h    |  4 ++-
>   tools/lib/bpf/libbpf.c | 72 ++++++++++++++++++++++++++++++++++++++++++
>   3 files changed, 85 insertions(+), 2 deletions(-)
> 
> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index 86dcac44f32f..da892737b522 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c
> @@ -674,7 +674,8 @@ int bpf_link_create(int prog_fd, int target_fd,
>   		    enum bpf_attach_type attach_type,
>   		    const struct bpf_link_create_opts *opts)
>   {
> -	__u32 target_btf_id, iter_info_len;
> +	__u32 target_btf_id, iter_info_len, multi_btf_ids_cnt;
> +	__s32 *multi_btf_ids;
>   	union bpf_attr attr;
>   	int fd;
>   
[...]
> @@ -9584,6 +9597,9 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
>   	if (!name)
>   		return -EINVAL;
>   
> +	if (prog->prog_flags & BPF_F_MULTI_FUNC)
> +		return 0;
> +
>   	for (i = 0; i < ARRAY_SIZE(section_defs); i++) {
>   		if (!section_defs[i].is_attach_btf)
>   			continue;
> @@ -10537,6 +10553,62 @@ static struct bpf_link *bpf_program__attach_btf_id(struct bpf_program *prog)
>   	return (struct bpf_link *)link;
>   }
>   
> +static struct bpf_link *bpf_program__attach_multi(struct bpf_program *prog)
> +{
> +	char *pattern = prog->sec_name + prog->sec_def->len;
> +	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
> +	enum bpf_attach_type attach_type;
> +	int prog_fd, link_fd, cnt, err;
> +	struct bpf_link *link = NULL;
> +	__s32 *ids = NULL;
> +
> +	prog_fd = bpf_program__fd(prog);
> +	if (prog_fd < 0) {
> +		pr_warn("prog '%s': can't attach before loaded\n", prog->name);
> +		return ERR_PTR(-EINVAL);
> +	}
> +
> +	err = bpf_object__load_vmlinux_btf(prog->obj, true);
> +	if (err)
> +		return ERR_PTR(err);
> +
> +	cnt = btf__find_by_pattern_kind(prog->obj->btf_vmlinux, pattern,
> +					BTF_KIND_FUNC, &ids);
> +	if (cnt <= 0)
> +		return ERR_PTR(-EINVAL);

In the kernel, it looks like we support cnt == 0, while here we error
out. Should we also error out in the kernel if cnt == 0?

> +
> +	link = calloc(1, sizeof(*link));
> +	if (!link) {
> +		err = -ENOMEM;
> +		goto out_err;
> +	}
> +	link->detach = &bpf_link__detach_fd;
> +
> +	opts.multi_btf_ids = ids;
> +	opts.multi_btf_ids_cnt = cnt;
> +
> +	attach_type = bpf_program__get_expected_attach_type(prog);
> +	link_fd = bpf_link_create(prog_fd, 0, attach_type, &opts);
> +	if (link_fd < 0) {
> +		err = -errno;
> +		goto out_err;
> +	}
> +	link->fd = link_fd;
> +	free(ids);
> +	return link;
> +
> +out_err:
> +	free(link);
> +	free(ids);
> +	return ERR_PTR(err);
> +}
> +
[...]

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 16/19] selftests/bpf: Add fentry multi func test
  2021-06-05 11:10 ` [PATCH 16/19] selftests/bpf: Add fentry multi func test Jiri Olsa
@ 2021-06-07  6:06   ` Yonghong Song
  2021-06-07 18:42     ` Jiri Olsa
  2021-06-09  5:40   ` Andrii Nakryiko
  1 sibling, 1 reply; 76+ messages in thread
From: Yonghong Song @ 2021-06-07  6:06 UTC (permalink / raw)
  To: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware)
  Cc: netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik



On 6/5/21 4:10 AM, Jiri Olsa wrote:
> Adding a selftest for the fentry multi func test that attaches
> to bpf_fentry_test* functions and checks argument values
> based on the processed function.
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>   tools/testing/selftests/bpf/multi_check.h     | 52 +++++++++++++++++++

Should we put this file under the selftests/bpf/progs directory?
It is included only by bpf programs.

>   .../bpf/prog_tests/fentry_multi_test.c        | 43 +++++++++++++++
>   .../selftests/bpf/progs/fentry_multi_test.c   | 18 +++++++
>   3 files changed, 113 insertions(+)
>   create mode 100644 tools/testing/selftests/bpf/multi_check.h
>   create mode 100644 tools/testing/selftests/bpf/prog_tests/fentry_multi_test.c
>   create mode 100644 tools/testing/selftests/bpf/progs/fentry_multi_test.c
> 
> diff --git a/tools/testing/selftests/bpf/multi_check.h b/tools/testing/selftests/bpf/multi_check.h
> new file mode 100644
> index 000000000000..36c2a93f9be3
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/multi_check.h
> @@ -0,0 +1,52 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef __MULTI_CHECK_H
> +#define __MULTI_CHECK_H
> +
> +extern unsigned long long bpf_fentry_test[8];
> +
> +static __attribute__((unused)) inline
> +void multi_arg_check(unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, __u64 *test_result)
> +{
> +	if (ip == bpf_fentry_test[0]) {
> +		*test_result += (int) a == 1;
> +	} else if (ip == bpf_fentry_test[1]) {
> +		*test_result += (int) a == 2 && (__u64) b == 3;
> +	} else if (ip == bpf_fentry_test[2]) {
> +		*test_result += (char) a == 4 && (int) b == 5 && (__u64) c == 6;
> +	} else if (ip == bpf_fentry_test[3]) {
> +		*test_result += (void *) a == (void *) 7 && (char) b == 8 && (int) c == 9 && (__u64) d == 10;
> +	} else if (ip == bpf_fentry_test[4]) {
> +		*test_result += (__u64) a == 11 && (void *) b == (void *) 12 && (short) c == 13 && (int) d == 14 && (__u64) e == 15;
> +	} else if (ip == bpf_fentry_test[5]) {
> +		*test_result += (__u64) a == 16 && (void *) b == (void *) 17 && (short) c == 18 && (int) d == 19 && (void *) e == (void *) 20 && (__u64) f == 21;
> +	} else if (ip == bpf_fentry_test[6]) {
> +		*test_result += 1;
> +	} else if (ip == bpf_fentry_test[7]) {
> +		*test_result += 1;
> +	}
> +}
> +
> +static __attribute__((unused)) inline
> +void multi_ret_check(unsigned long ip, int ret, __u64 *test_result)
> +{
> +	if (ip == bpf_fentry_test[0]) {
> +		*test_result += ret == 2;
> +	} else if (ip == bpf_fentry_test[1]) {
> +		*test_result += ret == 5;
> +	} else if (ip == bpf_fentry_test[2]) {
> +		*test_result += ret == 15;
> +	} else if (ip == bpf_fentry_test[3]) {
> +		*test_result += ret == 34;
> +	} else if (ip == bpf_fentry_test[4]) {
> +		*test_result += ret == 65;
> +	} else if (ip == bpf_fentry_test[5]) {
> +		*test_result += ret == 111;
> +	} else if (ip == bpf_fentry_test[6]) {
> +		*test_result += ret == 0;
> +	} else if (ip == bpf_fentry_test[7]) {
> +		*test_result += ret == 0;
> +	}
> +}
> +
> +#endif /* __MULTI_CHECK_H */
> diff --git a/tools/testing/selftests/bpf/prog_tests/fentry_multi_test.c b/tools/testing/selftests/bpf/prog_tests/fentry_multi_test.c
> new file mode 100644
> index 000000000000..e4a8089533d6
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/fentry_multi_test.c
> @@ -0,0 +1,43 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <test_progs.h>
> +#include "fentry_multi_test.skel.h"
> +#include "trace_helpers.h"
> +
> +void test_fentry_multi_test(void)
> +{
> +	struct fentry_multi_test *skel = NULL;
> +	unsigned long long *bpf_fentry_test;
> +	__u32 duration = 0, retval;
> +	int err, prog_fd;
> +
> +	skel = fentry_multi_test__open_and_load();
> +	if (!ASSERT_OK_PTR(skel, "fentry_multi_skel_load"))
> +		goto cleanup;
> +
> +	bpf_fentry_test = &skel->bss->bpf_fentry_test[0];
> +	ASSERT_OK(kallsyms_find("bpf_fentry_test1", &bpf_fentry_test[0]), "kallsyms_find");
> +	ASSERT_OK(kallsyms_find("bpf_fentry_test2", &bpf_fentry_test[1]), "kallsyms_find");
> +	ASSERT_OK(kallsyms_find("bpf_fentry_test3", &bpf_fentry_test[2]), "kallsyms_find");
> +	ASSERT_OK(kallsyms_find("bpf_fentry_test4", &bpf_fentry_test[3]), "kallsyms_find");
> +	ASSERT_OK(kallsyms_find("bpf_fentry_test5", &bpf_fentry_test[4]), "kallsyms_find");
> +	ASSERT_OK(kallsyms_find("bpf_fentry_test6", &bpf_fentry_test[5]), "kallsyms_find");
> +	ASSERT_OK(kallsyms_find("bpf_fentry_test7", &bpf_fentry_test[6]), "kallsyms_find");
> +	ASSERT_OK(kallsyms_find("bpf_fentry_test8", &bpf_fentry_test[7]), "kallsyms_find");
> +
> +	err = fentry_multi_test__attach(skel);
> +	if (!ASSERT_OK(err, "fentry_attach"))
> +		goto cleanup;
> +
> +	prog_fd = bpf_program__fd(skel->progs.test);
> +	err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
> +				NULL, NULL, &retval, &duration);
> +	ASSERT_OK(err, "test_run");
> +	ASSERT_EQ(retval, 0, "test_run");
> +
> +	ASSERT_EQ(skel->bss->test_result, 8, "test_result");
> +
> +	fentry_multi_test__detach(skel);
> +
> +cleanup:
> +	fentry_multi_test__destroy(skel);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/fentry_multi_test.c b/tools/testing/selftests/bpf/progs/fentry_multi_test.c
> new file mode 100644
> index 000000000000..a443fc958e5a
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/fentry_multi_test.c
> @@ -0,0 +1,18 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +#include "multi_check.h"
> +
> +char _license[] SEC("license") = "GPL";
> +
> +unsigned long long bpf_fentry_test[8];
> +
> +__u64 test_result = 0;
> +
> +SEC("fentry.multi/bpf_fentry_test*")
> +int BPF_PROG(test, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f)
> +{
> +	multi_arg_check(ip, a, b, c, d, e, f, &test_result);
> +	return 0;
> +}
> 

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 09/19] bpf, x64: Allow to use caller address from stack
  2021-06-07  3:07   ` Yonghong Song
@ 2021-06-07 18:13     ` Jiri Olsa
  0 siblings, 0 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-07 18:13 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik

On Sun, Jun 06, 2021 at 08:07:44PM -0700, Yonghong Song wrote:
> 
> 
> On 6/5/21 4:10 AM, Jiri Olsa wrote:
> > Currently we call the original function by using the absolute address
> > given at the JIT generation. That's not usable when having a trampoline
> > attached to multiple functions. In this case we need to take the
> > return address from the stack.
> 
> Here, it is mentioned to take the return address from the stack.
> 
> > 
> > Adding support to retrieve the original function address from the stack
> 
> Here, it is said to take the original function address from the stack.

sorry if the description is confusing as always, the idea
is to take the function's return address from the fentry call:

   function
     call fentry
     xxxx             <---- this address 

and use it to call the original function body before the fexit handler
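
something like this (a rough sketch of the idea, not the exact
emitted code):

   <function>:
     call <trampoline>    # pushes 'xxxx' as the return address
     xxxx:                # original function body continues here

   <trampoline>:
     push rbp
     mov rbp, rsp         # [rbp + 8] now holds 'xxxx'
     ...
     mov rax, [rbp + 8]   # the emit_ldx() from the patch
     call rax             # run the original function body
     ...                  # rax holds its return value for fexit progs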

jirka

> 
> > by adding a new BPF_TRAMP_F_ORIG_STACK flag for the
> > arch_prepare_bpf_trampoline function.
> > 
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >   arch/x86/net/bpf_jit_comp.c | 13 +++++++++----
> >   include/linux/bpf.h         |  5 +++++
> >   2 files changed, 14 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index 2a2e290fa5d8..b77e6bd78354 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -2013,10 +2013,15 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >   	if (flags & BPF_TRAMP_F_CALL_ORIG) {
> >   		restore_regs(m, &prog, nr_args, stack_size);
> > -		/* call original function */
> > -		if (emit_call(&prog, orig_call, prog)) {
> > -			ret = -EINVAL;
> > -			goto cleanup;
> > +		if (flags & BPF_TRAMP_F_ORIG_STACK) {
> > +			emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> 
> This is a load of a double word from base_pointer + 8, which should be
> the function's return address on x86, yet we try to call it.
> I guess I must have missed something here. Could you give some
> explanation?
> 
> > +			EMIT2(0xff, 0xd0); /* call *rax */
> > +		} else {
> > +			/* call original function */
> > +			if (emit_call(&prog, orig_call, prog)) {
> > +				ret = -EINVAL;
> > +				goto cleanup;
> > +			}
> >   		}
> >   		/* remember return value in a stack for bpf prog to access */
> >   		emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -8);
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 86dec5001ae2..16fc600503fb 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -554,6 +554,11 @@ struct btf_func_model {
> >    */
> >   #define BPF_TRAMP_F_SKIP_FRAME		BIT(2)
> > +/* Get original function from stack instead of from provided direct address.
> > + * Makes sense for fexit programs only.
> > + */
> > +#define BPF_TRAMP_F_ORIG_STACK		BIT(3)
> > +
> >   /* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
> >    * bytes on x86.  Pick a number to fit into BPF_IMAGE_SIZE / 2
> >    */
> > 
> 


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 10/19] bpf: Allow to store caller's ip as argument
  2021-06-07  3:21   ` Yonghong Song
@ 2021-06-07 18:15     ` Jiri Olsa
  0 siblings, 0 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-07 18:15 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik

On Sun, Jun 06, 2021 at 08:21:51PM -0700, Yonghong Song wrote:
> 
> 
> On 6/5/21 4:10 AM, Jiri Olsa wrote:
> > When we have multiple functions attached to a trampoline,
> > we need to propagate the function's address to the bpf program.
> > 
> > Adding a new BPF_TRAMP_F_IP_ARG flag to the arch_prepare_bpf_trampoline
> > function that will store the original caller's address before the
> > function's arguments.
> > 
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >   arch/x86/net/bpf_jit_comp.c | 18 ++++++++++++++----
> >   include/linux/bpf.h         |  5 +++++
> >   2 files changed, 19 insertions(+), 4 deletions(-)
> > 
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index b77e6bd78354..d2425c18272a 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -1951,7 +1951,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >   				void *orig_call)
> >   {
> >   	int ret, i, cnt = 0, nr_args = m->nr_args;
> > -	int stack_size = nr_args * 8;
> > +	int stack_size = nr_args * 8, ip_arg = 0;
> >   	struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
> >   	struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
> >   	struct bpf_tramp_progs *fmod_ret = &tprogs[BPF_TRAMP_MODIFY_RETURN];
> > @@ -1975,6 +1975,9 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >   		 */
> >   		orig_call += X86_PATCH_SIZE;
> > +	if (flags & BPF_TRAMP_F_IP_ARG)
> > +		stack_size += 8;
> > +
> >   	prog = image;
> >   	EMIT1(0x55);		 /* push rbp */
> > @@ -1982,7 +1985,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >   	EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */
> >   	EMIT1(0x53);		 /* push rbx */
> > -	save_regs(m, &prog, nr_args, stack_size);
> > +	if (flags & BPF_TRAMP_F_IP_ARG) {
> > +		emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> > +		EMIT4(0x48, 0x83, 0xe8, X86_PATCH_SIZE); /* sub $X86_PATCH_SIZE,%rax*/
> 
> Could you explain what the above EMIT4 is for? I am not quite familiar
> with this piece of code, hence the question. Some comments here
> would help too.

it's there to generate the 'sub $X86_PATCH_SIZE,%rax' instruction
to get the real IP address of the traced function, which is then
stored to the stack on the next line
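
so the sequence is roughly (a sketch, assuming the fentry call is the
first instruction of the traced function):

   mov rax, [rbp + 8]            # return address of the fentry call
   sub rax, X86_PATCH_SIZE       # back over the call insn -> function's ip
   mov [rbp - stack_size], rax   # store ip before the arguments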

I'll put more comments in there

jirka

> 
> > +		emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -stack_size);
> > +		ip_arg = 8;
> > +	}
> > +
> > +	save_regs(m, &prog, nr_args, stack_size - ip_arg);
> >   	if (flags & BPF_TRAMP_F_CALL_ORIG) {
> >   		/* arg1: mov rdi, im */
> > @@ -2011,7 +2021,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >   	}
> >   	if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > -		restore_regs(m, &prog, nr_args, stack_size);
> > +		restore_regs(m, &prog, nr_args, stack_size - ip_arg);
> >   		if (flags & BPF_TRAMP_F_ORIG_STACK) {
> >   			emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> > @@ -2052,7 +2062,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >   		}
> >   	if (flags & BPF_TRAMP_F_RESTORE_REGS)
> > -		restore_regs(m, &prog, nr_args, stack_size);
> > +		restore_regs(m, &prog, nr_args, stack_size - ip_arg);
> >   	/* This needs to be done regardless. If there were fmod_ret programs,
> >   	 * the return value is only updated on the stack and still needs to be
> [...]
> 


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 11/19] bpf: Add support to load multi func tracing program
  2021-06-07  3:56   ` Yonghong Song
@ 2021-06-07 18:18     ` Jiri Olsa
  2021-06-07 19:35       ` Yonghong Song
  0 siblings, 1 reply; 76+ messages in thread
From: Jiri Olsa @ 2021-06-07 18:18 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik

On Sun, Jun 06, 2021 at 08:56:47PM -0700, Yonghong Song wrote:
> 
> 
> On 6/5/21 4:10 AM, Jiri Olsa wrote:
> > Adding support to load a tracing program with the new BPF_F_MULTI_FUNC
> > flag, which allows the program to be loaded without a specific function
> > to attach to.
> > 
> > The verifier assumes the program is using all (6) available arguments
> 
> Is this a verifier failure or is it due to the check at the
> beginning of arch_prepare_bpf_trampoline()?
> 
>         /* x86-64 supports up to 6 arguments. 7+ can be added in the future */
>         if (nr_args > 6)
>                 return -ENOTSUPP;

yes, that's the limit.. it allows the traced program to
touch 6 arguments, because it's the maximum for JIT

> 
> If it is indeed due to arch_prepare_bpf_trampoline(), maybe we
> can improve it instead of specially processing the first argument
> "ip" in quite a few places?

do you mean to teach JIT to process more than 6 arguments?

> 
> > as unsigned long values. We can't add an extra ip argument at this
> > time, because the JIT on x86 would fail to process this function.
> > Instead we allow access to an extra first 'ip' argument in
> > btf_ctx_access.
> > 
> > Such a program will be allowed to be attached to multiple functions
> > in the following patches.
> > 
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >   include/linux/bpf.h            |  1 +
> >   include/uapi/linux/bpf.h       |  7 +++++++
> >   kernel/bpf/btf.c               |  5 +++++
> >   kernel/bpf/syscall.c           | 35 +++++++++++++++++++++++++++++-----
> >   kernel/bpf/verifier.c          |  3 ++-
> >   tools/include/uapi/linux/bpf.h |  7 +++++++
> >   6 files changed, 52 insertions(+), 6 deletions(-)
> > 
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 6cbf3c81c650..23221e0e8d3c 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -845,6 +845,7 @@ struct bpf_prog_aux {
> >   	bool sleepable;
> >   	bool tail_call_reachable;
> >   	struct hlist_node tramp_hlist;
> > +	bool multi_func;
> 
> Move this field right after "tail_call_reachable"?
> 
> >   	/* BTF_KIND_FUNC_PROTO for valid attach_btf_id */
> >   	const struct btf_type *attach_func_proto;
> >   	/* function name for valid attach_btf_id */
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index 2c1ba70abbf1..ad9340fb14d4 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1109,6 +1109,13 @@ enum bpf_link_type {
> >    */
> >   #define BPF_F_SLEEPABLE		(1U << 4)
> > +/* If BPF_F_MULTI_FUNC is used in BPF_PROG_LOAD command, the verifier does
> > + * not expect BTF ID for the program, instead it assumes it's function
> > + * with 6 u64 arguments. No trampoline is created for the program. Such
> > + * program can be attached to multiple functions.
> > + */
> > +#define BPF_F_MULTI_FUNC	(1U << 5)
> > +
> >   /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
> >    * the following extensions:
> >    *
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index a6e39c5ea0bf..c233aaa6a709 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -4679,6 +4679,11 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
> >   		args++;
> >   		nr_args--;
> >   	}
> > +	if (prog->aux->multi_func) {
> > +		if (arg == 0)
> > +			return true;
> > +		arg--;
> 
> Some comments in the above mentioning that "the first 'ip' argument
> is omitted" would be good.

will do, thanks
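
e.g. something like (just a sketch):

	if (prog->aux->multi_func) {
		/* skip the first 'ip' argument, it's not described
		 * by the function's BTF proto */
		if (arg == 0)
			return true;
		arg--;
	}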

jirka

> 
> > +	}
> >   	if (arg > nr_args) {
> >   		bpf_log(log, "func '%s' doesn't have %d-th argument\n",
> [...]
> 


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 13/19] bpf: Add support to link multi func tracing program
  2021-06-07  5:36   ` Yonghong Song
@ 2021-06-07 18:25     ` Jiri Olsa
  2021-06-07 19:39       ` Yonghong Song
  0 siblings, 1 reply; 76+ messages in thread
From: Jiri Olsa @ 2021-06-07 18:25 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik

On Sun, Jun 06, 2021 at 10:36:57PM -0700, Yonghong Song wrote:
> 
> 
> On 6/5/21 4:10 AM, Jiri Olsa wrote:
> > Adding support to attach multiple functions to a tracing program
> > by using the link_create/link_update interface.
> > 
> > Adding a multi_btf_ids/multi_btf_ids_cnt pair to the link_create
> > struct API, which defines an array of function btf ids that will
> > be attached to prog_fd.
> > 
> > The prog_fd needs to be a multi func tracing program (BPF_F_MULTI_FUNC).
> > 
> > The new link_create interface creates a new BPF_LINK_TYPE_TRACING_MULTI
> > link type, which creates a separate bpf_trampoline and registers it
> > as a direct function for all specified btf ids.
> > 
> > The new bpf_trampoline is out of the scope (bpf_trampoline_lookup) of
> > standard trampolines, so all registered functions need to be free
> > of direct functions, otherwise the link fails.
> 
> I am not sure how severe such a limitation could be in practice.
> It is possible that in production some non-multi fentry/fexit program
> may run continuously. Does a kprobe program impact this as well?

I did not find a way to combine current trampolines with the
new ones for multiple programs.. what you described is a limitation
of the current approach

I'm not sure about kprobes and trampolines, but the limitation
should be the same as we have for current trampolines.. I'll check

> 
> > 
> > The new bpf_trampoline will store and pass to the bpf program the
> > highest number of arguments from all given functions.
> > 
> > New programs (fentry or fexit) can be added to the existing trampoline
> > through the link_update interface via the new_prog_fd descriptor.
> 
> Looks like we do not support replacing old programs. Do we support
> removing old programs?

we don't.. it's not what bpftrace would do, it just adds programs
to trace and closes them all when it's done.. I think an interface
for removal could be added if you think it's needed

> 
> > 
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >   include/linux/bpf.h            |   3 +
> >   include/uapi/linux/bpf.h       |   5 +
> >   kernel/bpf/syscall.c           | 185 ++++++++++++++++++++++++++++++++-
> >   kernel/bpf/trampoline.c        |  53 +++++++---
> >   tools/include/uapi/linux/bpf.h |   5 +
> >   5 files changed, 237 insertions(+), 14 deletions(-)
> > 
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 23221e0e8d3c..99a81c6c22e6 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -661,6 +661,7 @@ struct bpf_trampoline {
> >   	struct bpf_tramp_image *cur_image;
> >   	u64 selector;
> >   	struct module *mod;
> > +	bool multi;
> >   };
> >   struct bpf_attach_target_info {
> > @@ -746,6 +747,8 @@ void bpf_ksym_add(struct bpf_ksym *ksym);
> >   void bpf_ksym_del(struct bpf_ksym *ksym);
> >   int bpf_jit_charge_modmem(u32 pages);
> >   void bpf_jit_uncharge_modmem(u32 pages);
> > +struct bpf_trampoline *bpf_trampoline_multi_alloc(void);
> > +void bpf_trampoline_multi_free(struct bpf_trampoline *tr);
> >   #else
> >   static inline int bpf_trampoline_link_prog(struct bpf_prog *prog,
> >   					   struct bpf_trampoline *tr)
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index ad9340fb14d4..5fd6ff64e8dc 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1007,6 +1007,7 @@ enum bpf_link_type {
> >   	BPF_LINK_TYPE_ITER = 4,
> >   	BPF_LINK_TYPE_NETNS = 5,
> >   	BPF_LINK_TYPE_XDP = 6,
> > +	BPF_LINK_TYPE_TRACING_MULTI = 7,
> >   	MAX_BPF_LINK_TYPE,
> >   };
> > @@ -1454,6 +1455,10 @@ union bpf_attr {
> >   				__aligned_u64	iter_info;	/* extra bpf_iter_link_info */
> >   				__u32		iter_info_len;	/* iter_info length */
> >   			};
> > +			struct {
> > +				__aligned_u64	multi_btf_ids;		/* addresses to attach */
> > +				__u32		multi_btf_ids_cnt;	/* addresses count */
> > +			};
> >   		};
> >   	} link_create;
> [...]
> > +static int bpf_tracing_multi_link_fill_link_info(const struct bpf_link *link,
> > +						 struct bpf_link_info *info)
> > +{
> > +	struct bpf_tracing_multi_link *tr_link =
> > +		container_of(link, struct bpf_tracing_multi_link, link);
> > +
> > +	info->tracing.attach_type = tr_link->attach_type;
> > +	return 0;
> > +}
> > +
> > +static int check_multi_prog_type(struct bpf_prog *prog)
> > +{
> > +	if (!prog->aux->multi_func &&
> > +	    prog->type != BPF_PROG_TYPE_TRACING)
> 
> I think prog->type != BPF_PROG_TYPE_TRACING is not needed, it should have
> been checked during program load time?
> 
> > +		return -EINVAL;
> > +	if (prog->expected_attach_type != BPF_TRACE_FENTRY &&
> > +	    prog->expected_attach_type != BPF_TRACE_FEXIT)
> > +		return -EINVAL;
> > +	return 0;
> > +}
> > +
> > +static int bpf_tracing_multi_link_update(struct bpf_link *link,
> > +					 struct bpf_prog *new_prog,
> > +					 struct bpf_prog *old_prog __maybe_unused)
> > +{
> > +	struct bpf_tracing_multi_link *tr_link =
> > +		container_of(link, struct bpf_tracing_multi_link, link);
> > +	int err;
> > +
> > +	if (check_multi_prog_type(new_prog))
> > +		return -EINVAL;
> > +
> > +	err = bpf_trampoline_link_prog(new_prog, tr_link->tr);
> > +	if (err)
> > +		return err;
> > +
> > +	err = modify_ftrace_direct_multi(&tr_link->ops,
> > +					 (unsigned long) tr_link->tr->cur_image->image);
> > +	return WARN_ON(err);
> 
> Why WARN_ON here? Some comments would be good.
> 
> > +}
> > +
> > +static const struct bpf_link_ops bpf_tracing_multi_link_lops = {
> > +	.release = bpf_tracing_multi_link_release,
> > +	.dealloc = bpf_tracing_multi_link_dealloc,
> > +	.show_fdinfo = bpf_tracing_multi_link_show_fdinfo,
> > +	.fill_link_info = bpf_tracing_multi_link_fill_link_info,
> > +	.update_prog = bpf_tracing_multi_link_update,
> > +};
> > +
> [...]
> > +
> >   struct bpf_raw_tp_link {
> >   	struct bpf_link link;
> >   	struct bpf_raw_event_map *btp;
> > @@ -3043,6 +3222,8 @@ attach_type_to_prog_type(enum bpf_attach_type attach_type)
> >   	case BPF_CGROUP_SETSOCKOPT:
> >   		return BPF_PROG_TYPE_CGROUP_SOCKOPT;
> >   	case BPF_TRACE_ITER:
> > +	case BPF_TRACE_FENTRY:
> > +	case BPF_TRACE_FEXIT:
> >   		return BPF_PROG_TYPE_TRACING;
> >   	case BPF_SK_LOOKUP:
> >   		return BPF_PROG_TYPE_SK_LOOKUP;
> > @@ -4099,6 +4280,8 @@ static int tracing_bpf_link_attach(const union bpf_attr *attr, bpfptr_t uattr,
> >   	if (prog->expected_attach_type == BPF_TRACE_ITER)
> >   		return bpf_iter_link_attach(attr, uattr, prog);
> > +	else if (prog->aux->multi_func)
> > +		return bpf_tracing_multi_attach(prog, attr);
> >   	else if (prog->type == BPF_PROG_TYPE_EXT)
> >   		return bpf_tracing_prog_attach(prog,
> >   					       attr->link_create.target_fd,
> > @@ -4106,7 +4289,7 @@ static int tracing_bpf_link_attach(const union bpf_attr *attr, bpfptr_t uattr,
> >   	return -EINVAL;
> >   }
> > -#define BPF_LINK_CREATE_LAST_FIELD link_create.iter_info_len
> > +#define BPF_LINK_CREATE_LAST_FIELD link_create.multi_btf_ids_cnt
> 
> It is okay that we don't change this. link_create.iter_info_len
> has the same effect since it is a union.
> 
> >   static int link_create(union bpf_attr *attr, bpfptr_t uattr)
> >   {
> >   	enum bpf_prog_type ptype;
> > diff --git a/kernel/bpf/trampoline.c b/kernel/bpf/trampoline.c
> > index 2755fdcf9fbf..660b8197c27f 100644
> > --- a/kernel/bpf/trampoline.c
> > +++ b/kernel/bpf/trampoline.c
> > @@ -58,7 +58,7 @@ void bpf_image_ksym_del(struct bpf_ksym *ksym)
> >   			   PAGE_SIZE, true, ksym->name);
> >   }
> > -static struct bpf_trampoline *bpf_trampoline_alloc(void)
> > +static struct bpf_trampoline *bpf_trampoline_alloc(bool multi)
> >   {
> >   	struct bpf_trampoline *tr;
> >   	int i;
> > @@ -72,6 +72,7 @@ static struct bpf_trampoline *bpf_trampoline_alloc(void)
> >   	mutex_init(&tr->mutex);
> >   	for (i = 0; i < BPF_TRAMP_MAX; i++)
> >   		INIT_HLIST_HEAD(&tr->progs_hlist[i]);
> > +	tr->multi = multi;
> >   	return tr;
> >   }
> > @@ -88,7 +89,7 @@ static struct bpf_trampoline *bpf_trampoline_lookup(u64 key)
> >   			goto out;
> >   		}
> >   	}
> > -	tr = bpf_trampoline_alloc();
> > +	tr = bpf_trampoline_alloc(false);
> >   	if (tr) {
> >   		tr->key = key;
> >   		hlist_add_head(&tr->hlist, head);
> > @@ -343,14 +344,16 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
> >   	struct bpf_tramp_image *im;
> >   	struct bpf_tramp_progs *tprogs;
> >   	u32 flags = BPF_TRAMP_F_RESTORE_REGS;
> > -	int err, total;
> > +	bool update = !tr->multi;
> > +	int err = 0, total;
> >   	tprogs = bpf_trampoline_get_progs(tr, &total);
> >   	if (IS_ERR(tprogs))
> >   		return PTR_ERR(tprogs);
> >   	if (total == 0) {
> > -		err = unregister_fentry(tr, tr->cur_image->image);
> > +		if (update)
> > +			err = unregister_fentry(tr, tr->cur_image->image);
> >   		bpf_tramp_image_put(tr->cur_image);
> >   		tr->cur_image = NULL;
> >   		tr->selector = 0;
> > @@ -363,9 +366,15 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
> >   		goto out;
> >   	}
> > +	if (tr->multi)
> > +		flags |= BPF_TRAMP_F_IP_ARG;
> > +
> >   	if (tprogs[BPF_TRAMP_FEXIT].nr_progs ||
> > -	    tprogs[BPF_TRAMP_MODIFY_RETURN].nr_progs)
> > +	    tprogs[BPF_TRAMP_MODIFY_RETURN].nr_progs) {
> >   		flags = BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_SKIP_FRAME;
> > +		if (tr->multi)
> > +			flags |= BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_IP_ARG;
> 
> BPF_TRAMP_F_IP_ARG is not needed. It has been added before.

it's erased 2 lines above.. which reminds me that I forgot to check
whether that's a bug or intended ;-)

jirka


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 15/19] libbpf: Add support to link multi func tracing program
  2021-06-07  5:49   ` Yonghong Song
@ 2021-06-07 18:28     ` Jiri Olsa
  2021-06-07 19:42       ` Yonghong Song
  0 siblings, 1 reply; 76+ messages in thread
From: Jiri Olsa @ 2021-06-07 18:28 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik

On Sun, Jun 06, 2021 at 10:49:16PM -0700, Yonghong Song wrote:
> 
> 
> On 6/5/21 4:10 AM, Jiri Olsa wrote:
> > Adding support to link a multi func tracing program
> > through the link_create interface.
> > 
> > Adding special types for multi func programs:
> > 
> >    fentry.multi
> >    fexit.multi
> > 
> > so you can define multi func programs like:
> > 
> >    SEC("fentry.multi/bpf_fentry_test*")
> >    int BPF_PROG(test1, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f)
> > 
> > that defines test1 to be attached to the bpf_fentry_test* functions,
> > and able to access ip and 6 arguments.
> > 
> > If functions are not specified, the program needs to be attached
> > manually.
> > 
> > Adding new btf id related fields to bpf_link_create_opts and
> > bpf_link_create to use them.
> > 
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >   tools/lib/bpf/bpf.c    | 11 ++++++-
> >   tools/lib/bpf/bpf.h    |  4 ++-
> >   tools/lib/bpf/libbpf.c | 72 ++++++++++++++++++++++++++++++++++++++++++
> >   3 files changed, 85 insertions(+), 2 deletions(-)
> > 
> > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > index 86dcac44f32f..da892737b522 100644
> > --- a/tools/lib/bpf/bpf.c
> > +++ b/tools/lib/bpf/bpf.c
> > @@ -674,7 +674,8 @@ int bpf_link_create(int prog_fd, int target_fd,
> >   		    enum bpf_attach_type attach_type,
> >   		    const struct bpf_link_create_opts *opts)
> >   {
> > -	__u32 target_btf_id, iter_info_len;
> > +	__u32 target_btf_id, iter_info_len, multi_btf_ids_cnt;
> > +	__s32 *multi_btf_ids;
> >   	union bpf_attr attr;
> >   	int fd;
> [...]
> > @@ -9584,6 +9597,9 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
> >   	if (!name)
> >   		return -EINVAL;
> > +	if (prog->prog_flags & BPF_F_MULTI_FUNC)
> > +		return 0;
> > +
> >   	for (i = 0; i < ARRAY_SIZE(section_defs); i++) {
> >   		if (!section_defs[i].is_attach_btf)
> >   			continue;
> > @@ -10537,6 +10553,62 @@ static struct bpf_link *bpf_program__attach_btf_id(struct bpf_program *prog)
> >   	return (struct bpf_link *)link;
> >   }
> > +static struct bpf_link *bpf_program__attach_multi(struct bpf_program *prog)
> > +{
> > +	char *pattern = prog->sec_name + prog->sec_def->len;
> > +	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
> > +	enum bpf_attach_type attach_type;
> > +	int prog_fd, link_fd, cnt, err;
> > +	struct bpf_link *link = NULL;
> > +	__s32 *ids = NULL;
> > +
> > +	prog_fd = bpf_program__fd(prog);
> > +	if (prog_fd < 0) {
> > +		pr_warn("prog '%s': can't attach before loaded\n", prog->name);
> > +		return ERR_PTR(-EINVAL);
> > +	}
> > +
> > +	err = bpf_object__load_vmlinux_btf(prog->obj, true);
> > +	if (err)
> > +		return ERR_PTR(err);
> > +
> > +	cnt = btf__find_by_pattern_kind(prog->obj->btf_vmlinux, pattern,
> > +					BTF_KIND_FUNC, &ids);
> > +	if (cnt <= 0)
> > +		return ERR_PTR(-EINVAL);
> 
> In the kernel, it looks like we support cnt == 0, while here we error
> out. Should we also error out in the kernel if cnt == 0?

hum, I'm not sure what you mean.. what kernel code are you referring to?

thanks,
jirka


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 16/19] selftests/bpf: Add fentry multi func test
  2021-06-07  6:06   ` Yonghong Song
@ 2021-06-07 18:42     ` Jiri Olsa
  0 siblings, 0 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-07 18:42 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik

On Sun, Jun 06, 2021 at 11:06:14PM -0700, Yonghong Song wrote:
> 
> 
> On 6/5/21 4:10 AM, Jiri Olsa wrote:
> > Adding a selftest for the fentry multi func test that attaches
> > to bpf_fentry_test* functions and checks argument values
> > based on the processed function.
> > 
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >   tools/testing/selftests/bpf/multi_check.h     | 52 +++++++++++++++++++
> 
> Should we put this file under the selftests/bpf/progs directory?
> It is included only by bpf programs.

ok

jirka


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 11/19] bpf: Add support to load multi func tracing program
  2021-06-07 18:18     ` Jiri Olsa
@ 2021-06-07 19:35       ` Yonghong Song
  0 siblings, 0 replies; 76+ messages in thread
From: Yonghong Song @ 2021-06-07 19:35 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik



On 6/7/21 11:18 AM, Jiri Olsa wrote:
> On Sun, Jun 06, 2021 at 08:56:47PM -0700, Yonghong Song wrote:
>>
>>
>> On 6/5/21 4:10 AM, Jiri Olsa wrote:
>>> Adding support to load a tracing program with the new BPF_F_MULTI_FUNC
>>> flag, which allows the program to be loaded without a specific function
>>> to attach to.
>>>
>>> The verifier assumes the program is using all (6) available arguments
>>
>> Is this a verifier failure or is it due to the check at the
>> beginning of arch_prepare_bpf_trampoline()?
>>
>>          /* x86-64 supports up to 6 arguments. 7+ can be added in the future */
>>          if (nr_args > 6)
>>                  return -ENOTSUPP;
> 
> yes, that's the limit.. it allows the traced program to
> touch 6 arguments, because it's the maximum for JIT
> 
>>
>> If it is indeed due to arch_prepare_bpf_trampoline(), maybe we
>> can improve it instead of specially processing the first argument
>> "ip" in quite a few places?
> 
> do you mean to teach JIT to process more than 6 arguments?

Yes. Not sure how hard it is. If it is doable with reasonable
complexity, I think it will be worth it: it will benefit this case
by avoiding special tweaks of the first argument, and also benefit
other cases, e.g. attaching to a kernel function with 7 or more
arguments.
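
For reference, on x86-64 args 1-6 are passed in rdi, rsi, rdx, rcx,
r8 and r9, while the 7th and later args live on the caller's stack,
so the trampoline would also need something like (just a sketch; the
offset assumes two return addresses sit between the saved rbp and the
stack args):

   mov rax, [rbp + 24]   # 7th argument of the traced function
                         # then store it next to the first six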

> 
>>
>>> as unsigned long values. We can't add an extra ip argument at this
>>> time, because the JIT on x86 would fail to process this function.
>>> Instead we allow access to an extra first 'ip' argument in
>>> btf_ctx_access.
>>>
>>> Such a program will be allowed to be attached to multiple functions
>>> in the following patches.
>>>
>>> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
>>> ---
>>>    include/linux/bpf.h            |  1 +
>>>    include/uapi/linux/bpf.h       |  7 +++++++
>>>    kernel/bpf/btf.c               |  5 +++++
>>>    kernel/bpf/syscall.c           | 35 +++++++++++++++++++++++++++++-----
>>>    kernel/bpf/verifier.c          |  3 ++-
>>>    tools/include/uapi/linux/bpf.h |  7 +++++++
>>>    6 files changed, 52 insertions(+), 6 deletions(-)
>>>
[...]

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 13/19] bpf: Add support to link multi func tracing program
  2021-06-07 18:25     ` Jiri Olsa
@ 2021-06-07 19:39       ` Yonghong Song
  0 siblings, 0 replies; 76+ messages in thread
From: Yonghong Song @ 2021-06-07 19:39 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik



On 6/7/21 11:25 AM, Jiri Olsa wrote:
> On Sun, Jun 06, 2021 at 10:36:57PM -0700, Yonghong Song wrote:
>>
>>
>> On 6/5/21 4:10 AM, Jiri Olsa wrote:
>>> Adding support to attach multiple functions to a tracing program
>>> by using the link_create/link_update interface.
>>>
>>> Adding a multi_btf_ids/multi_btf_ids_cnt pair to the link_create
>>> struct API, which defines an array of function btf ids that will
>>> be attached to prog_fd.
>>>
>>> The prog_fd needs to be a multi func tracing program (BPF_F_MULTI_FUNC).
>>>
>>> The new link_create interface creates a new BPF_LINK_TYPE_TRACING_MULTI
>>> link type, which creates a separate bpf_trampoline and registers it
>>> as a direct function for all specified btf ids.
>>>
>>> The new bpf_trampoline is out of the scope (bpf_trampoline_lookup) of
>>> standard trampolines, so all registered functions need to be free
>>> of direct functions, otherwise the link fails.
>>
>> I am not sure how severe such a limitation could be in practice.
>> It is possible that in production some non-multi fentry/fexit program
>> may run continuously. Does a kprobe program impact this as well?
> 
> I did not find a way to combine current trampolines with the
> new ones for multiple programs.. what you described is a limitation
> of the current approach
> 
> I'm not sure about kprobes and trampolines, but the limitation
> should be the same as we have for current trampolines.. I'll check
> 
>>
>>>
>>> The new bpf_trampoline will store and pass to the bpf program the
>>> highest number of arguments from all given functions.
>>>
>>> New programs (fentry or fexit) can be added to the existing trampoline
>>> through the link_update interface via the new_prog_fd descriptor.
>>
>> Looks we do not support replacing old programs. Do we support
>> removing old programs?
> 
> we don't.. it's not what bpftrace would do, it just adds programs
> to trace and closes all of them when it's done.. I think an interface
> for removal could be added if you think it's needed

This can be a followup patch. Indeed, removing selected old programs
is probably not a common use case.
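
For completeness, adding one more program to a live multi link from
userspace would go through the existing BPF_LINK_UPDATE plumbing. A
minimal sketch, assuming a libbpf that exposes the stock
bpf_link_update() wrapper (the helper name here is made up):

	#include <bpf/bpf.h>
	#include <bpf/libbpf.h>

	/* attach one more multi-func program to an existing link;
	 * link_fd came from the earlier link_create call
	 */
	static int add_prog_to_multi_link(int link_fd, struct bpf_program *prog)
	{
		int new_prog_fd = bpf_program__fd(prog);

		if (new_prog_fd < 0)
			return -EINVAL;

		/* the kernel routes this to the multi link's update
		 * handler, which links the new prog into the shared
		 * trampoline
		 */
		return bpf_link_update(link_fd, new_prog_fd, NULL);
	}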

> 
>>
>>>
>>> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
>>> ---
>>>    include/linux/bpf.h            |   3 +
>>>    include/uapi/linux/bpf.h       |   5 +
>>>    kernel/bpf/syscall.c           | 185 ++++++++++++++++++++++++++++++++-
>>>    kernel/bpf/trampoline.c        |  53 +++++++---
>>>    tools/include/uapi/linux/bpf.h |   5 +
>>>    5 files changed, 237 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
>>> index 23221e0e8d3c..99a81c6c22e6 100644
>>> --- a/include/linux/bpf.h
>>> +++ b/include/linux/bpf.h
>>> @@ -661,6 +661,7 @@ struct bpf_trampoline {
>>>    	struct bpf_tramp_image *cur_image;
>>>    	u64 selector;
>>>    	struct module *mod;
>>> +	bool multi;
>>>    };
>>>    struct bpf_attach_target_info {
>>> @@ -746,6 +747,8 @@ void bpf_ksym_add(struct bpf_ksym *ksym);
>>>    void bpf_ksym_del(struct bpf_ksym *ksym);
>>>    int bpf_jit_charge_modmem(u32 pages);
>>>    void bpf_jit_uncharge_modmem(u32 pages);
>>> +struct bpf_trampoline *bpf_trampoline_multi_alloc(void);
>>> +void bpf_trampoline_multi_free(struct bpf_trampoline *tr);
>>>    #else
>>>    static inline int bpf_trampoline_link_prog(struct bpf_prog *prog,
>>>    					   struct bpf_trampoline *tr)
[...]
>>> @@ -363,9 +366,15 @@ static int bpf_trampoline_update(struct bpf_trampoline *tr)
>>>    		goto out;
>>>    	}
>>> +	if (tr->multi)
>>> +		flags |= BPF_TRAMP_F_IP_ARG;
>>> +
>>>    	if (tprogs[BPF_TRAMP_FEXIT].nr_progs ||
>>> -	    tprogs[BPF_TRAMP_MODIFY_RETURN].nr_progs)
>>> +	    tprogs[BPF_TRAMP_MODIFY_RETURN].nr_progs) {
>>>    		flags = BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_SKIP_FRAME;
>>> +		if (tr->multi)
>>> +			flags |= BPF_TRAMP_F_ORIG_STACK | BPF_TRAMP_F_IP_ARG;
>>
>> BPF_TRAMP_F_IP_ARG is not needed. It has been added before.
> 
> it's erased in 2 lines above.. which reminds me that I forgot to check
> if that's a bug or intended ;-)

Oh, yes, I missed that too :-) I guess it would be good if you could
re-organize the code to avoid resetting the flags.
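
For reference, the plain assignment was harmless before this series
(flags was still zero at that point), but with BPF_TRAMP_F_IP_ARG set
two lines earlier it now silently drops that bit. A sketch of the
re-organization, OR-ing into flags instead of assigning:

	if (tr->multi)
		flags |= BPF_TRAMP_F_IP_ARG;

	if (tprogs[BPF_TRAMP_FEXIT].nr_progs ||
	    tprogs[BPF_TRAMP_MODIFY_RETURN].nr_progs) {
		/* |= rather than =, so the IP_ARG bit set above survives */
		flags |= BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_SKIP_FRAME;
		if (tr->multi)
			flags |= BPF_TRAMP_F_ORIG_STACK;
	}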

> 
> jirka
> 

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 15/19] libbpf: Add support to link multi func tracing program
  2021-06-07 18:28     ` Jiri Olsa
@ 2021-06-07 19:42       ` Yonghong Song
  2021-06-07 20:11         ` Jiri Olsa
  0 siblings, 1 reply; 76+ messages in thread
From: Yonghong Song @ 2021-06-07 19:42 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik



On 6/7/21 11:28 AM, Jiri Olsa wrote:
> On Sun, Jun 06, 2021 at 10:49:16PM -0700, Yonghong Song wrote:
>>
>>
>> On 6/5/21 4:10 AM, Jiri Olsa wrote:
>>> Adding support to link multi func tracing program
>>> through link_create interface.
>>>
>>> Adding special types for multi func programs:
>>>
>>>     fentry.multi
>>>     fexit.multi
>>>
>>> so you can define multi func programs like:
>>>
>>>     SEC("fentry.multi/bpf_fentry_test*")
>>>     int BPF_PROG(test1, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f)
>>>
>>> that defines test1 to be attached to bpf_fentry_test* functions,
>>> and is able to access ip and 6 arguments.
>>>
>>> If functions are not specified the program needs to be attached
>>> manually.
>>>
>>> Adding new btf id related fields to bpf_link_create_opts and
>>> bpf_link_create to use them.
>>>
>>> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
>>> ---
>>>    tools/lib/bpf/bpf.c    | 11 ++++++-
>>>    tools/lib/bpf/bpf.h    |  4 ++-
>>>    tools/lib/bpf/libbpf.c | 72 ++++++++++++++++++++++++++++++++++++++++++
>>>    3 files changed, 85 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
>>> index 86dcac44f32f..da892737b522 100644
>>> --- a/tools/lib/bpf/bpf.c
>>> +++ b/tools/lib/bpf/bpf.c
>>> @@ -674,7 +674,8 @@ int bpf_link_create(int prog_fd, int target_fd,
>>>    		    enum bpf_attach_type attach_type,
>>>    		    const struct bpf_link_create_opts *opts)
>>>    {
>>> -	__u32 target_btf_id, iter_info_len;
>>> +	__u32 target_btf_id, iter_info_len, multi_btf_ids_cnt;
>>> +	__s32 *multi_btf_ids;
>>>    	union bpf_attr attr;
>>>    	int fd;
>> [...]
>>> @@ -9584,6 +9597,9 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
>>>    	if (!name)
>>>    		return -EINVAL;
>>> +	if (prog->prog_flags & BPF_F_MULTI_FUNC)
>>> +		return 0;
>>> +
>>>    	for (i = 0; i < ARRAY_SIZE(section_defs); i++) {
>>>    		if (!section_defs[i].is_attach_btf)
>>>    			continue;
>>> @@ -10537,6 +10553,62 @@ static struct bpf_link *bpf_program__attach_btf_id(struct bpf_program *prog)
>>>    	return (struct bpf_link *)link;
>>>    }
>>> +static struct bpf_link *bpf_program__attach_multi(struct bpf_program *prog)
>>> +{
>>> +	char *pattern = prog->sec_name + prog->sec_def->len;
>>> +	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
>>> +	enum bpf_attach_type attach_type;
>>> +	int prog_fd, link_fd, cnt, err;
>>> +	struct bpf_link *link = NULL;
>>> +	__s32 *ids = NULL;
>>> +
>>> +	prog_fd = bpf_program__fd(prog);
>>> +	if (prog_fd < 0) {
>>> +		pr_warn("prog '%s': can't attach before loaded\n", prog->name);
>>> +		return ERR_PTR(-EINVAL);
>>> +	}
>>> +
>>> +	err = bpf_object__load_vmlinux_btf(prog->obj, true);
>>> +	if (err)
>>> +		return ERR_PTR(err);
>>> +
>>> +	cnt = btf__find_by_pattern_kind(prog->obj->btf_vmlinux, pattern,
>>> +					BTF_KIND_FUNC, &ids);
>>> +	if (cnt <= 0)
>>> +		return ERR_PTR(-EINVAL);
>>
>> In the kernel, it looks like we support cnt = 0, but here we error out.
>> Should we also error out in the kernel if cnt == 0?
> 
> hum, I'm not sure what you mean.. what kernel code are you referring to?

I am referring to the following kernel code:

+static int bpf_tracing_multi_attach(struct bpf_prog *prog,
+				    const union bpf_attr *attr)
+{
+	void __user *ubtf_ids = u64_to_user_ptr(attr->link_create.multi_btf_ids);
+	u32 size, i, cnt = attr->link_create.multi_btf_ids_cnt;
+	struct bpf_tracing_multi_link *link = NULL;
+	struct bpf_link_primer link_primer;
+	struct bpf_trampoline *tr = NULL;
+	int err = -EINVAL;
+	u8 nr_args = 0;
+	u32 *btf_ids;
+
+	if (check_multi_prog_type(prog))
+		return -EINVAL;
+
+	size = cnt * sizeof(*btf_ids);
+	btf_ids = kmalloc(size, GFP_USER | __GFP_NOWARN);
+	if (!btf_ids)
+		return -ENOMEM;
+
+	err = -EFAULT;
+	if (ubtf_ids && copy_from_user(btf_ids, ubtf_ids, size))
+		goto out_free;
+
+	link = kzalloc(sizeof(*link), GFP_USER);
+	if (!link)
+		goto out_free;
+
+	for (i = 0; i < cnt; i++) {
+		struct bpf_attach_target_info tgt_info = {};
+
+		err = bpf_check_attach_target(NULL, prog, NULL, btf_ids[i],
+					      &tgt_info);
+		if (err)
+			goto out_free;
+
+		if (ftrace_set_filter_ip(&link->ops, tgt_info.tgt_addr, 0, 0))
+			goto out_free;
+
+		if (nr_args < tgt_info.fmodel.nr_args)
+			nr_args = tgt_info.fmodel.nr_args;
+	}
+
+	tr = bpf_trampoline_multi_alloc();
+	if (!tr)
+		goto out_free;
+
+	bpf_func_model_nargs(&tr->func.model, nr_args);
+
+	err = bpf_trampoline_link_prog(prog, tr);
+	if (err)
+		goto out_free;
+
+	err = register_ftrace_direct_multi(&link->ops, (unsigned long) tr->cur_image->image);
+	if (err)
+		goto out_free;
+
+	bpf_link_init(&link->link, BPF_LINK_TYPE_TRACING_MULTI,
+		      &bpf_tracing_multi_link_lops, prog);
+	link->attach_type = prog->expected_attach_type;
+
+	err = bpf_link_prime(&link->link, &link_primer);
+	if (err)
+		goto out_unlink;
+
+	link->tr = tr;
+	/* Take extra ref so we are even with progs added by link_update. */
+	bpf_prog_inc(prog);
+	return bpf_link_settle(&link_primer);
+
+out_unlink:
+	unregister_ftrace_direct_multi(&link->ops);
+out_free:
+	kfree(tr);
+	kfree(btf_ids);
+	kfree(link);
+	return err;
+}
+

Looks like cnt = 0 is okay in bpf_tracing_multi_attach().
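
It is worth spelling out why cnt == 0 makes it all the way through:
kmalloc(0) returns the special non-NULL ZERO_SIZE_PTR, so neither the
allocation check nor the zero-byte copy_from_user() fails, and the
per-id loop body simply never runs. Annotating the lines quoted above:

	size = cnt * sizeof(*btf_ids);	/* 0 when cnt == 0 */
	btf_ids = kmalloc(size, GFP_USER | __GFP_NOWARN);
	if (!btf_ids)	/* kmalloc(0) yields the non-NULL ZERO_SIZE_PTR,
			 * so this check does not trigger */
		return -ENOMEM;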

> 
> thanks,
> jirka
> 

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 15/19] libbpf: Add support to link multi func tracing program
  2021-06-07 19:42       ` Yonghong Song
@ 2021-06-07 20:11         ` Jiri Olsa
  0 siblings, 0 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-07 20:11 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	netdev, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik

On Mon, Jun 07, 2021 at 12:42:51PM -0700, Yonghong Song wrote:

SNIP

> 
> +static int bpf_tracing_multi_attach(struct bpf_prog *prog,
> +				    const union bpf_attr *attr)
> +{
> +	void __user *ubtf_ids = u64_to_user_ptr(attr->link_create.multi_btf_ids);
> +	u32 size, i, cnt = attr->link_create.multi_btf_ids_cnt;
> +	struct bpf_tracing_multi_link *link = NULL;
> +	struct bpf_link_primer link_primer;
> +	struct bpf_trampoline *tr = NULL;
> +	int err = -EINVAL;
> +	u8 nr_args = 0;
> +	u32 *btf_ids;
> +
> +	if (check_multi_prog_type(prog))
> +		return -EINVAL;
> +
> +	size = cnt * sizeof(*btf_ids);
> +	btf_ids = kmalloc(size, GFP_USER | __GFP_NOWARN);
> +	if (!btf_ids)
> +		return -ENOMEM;
> +
> +	err = -EFAULT;
> +	if (ubtf_ids && copy_from_user(btf_ids, ubtf_ids, size))
> +		goto out_free;
> +
> +	link = kzalloc(sizeof(*link), GFP_USER);
> +	if (!link)
> +		goto out_free;
> +
> +	for (i = 0; i < cnt; i++) {
> +		struct bpf_attach_target_info tgt_info = {};
> +
> +		err = bpf_check_attach_target(NULL, prog, NULL, btf_ids[i],
> +					      &tgt_info);
> +		if (err)
> +			goto out_free;
> +
> +		if (ftrace_set_filter_ip(&link->ops, tgt_info.tgt_addr, 0, 0))
> +			goto out_free;
> +
> +		if (nr_args < tgt_info.fmodel.nr_args)
> +			nr_args = tgt_info.fmodel.nr_args;
> +	}
> +
> +	tr = bpf_trampoline_multi_alloc();
> +	if (!tr)
> +		goto out_free;
> +
> +	bpf_func_model_nargs(&tr->func.model, nr_args);
> +
> +	err = bpf_trampoline_link_prog(prog, tr);
> +	if (err)
> +		goto out_free;
> +
> +	err = register_ftrace_direct_multi(&link->ops, (unsigned long) tr->cur_image->image);
> +	if (err)
> +		goto out_free;
> +
> +	bpf_link_init(&link->link, BPF_LINK_TYPE_TRACING_MULTI,
> +		      &bpf_tracing_multi_link_lops, prog);
> +	link->attach_type = prog->expected_attach_type;
> +
> +	err = bpf_link_prime(&link->link, &link_primer);
> +	if (err)
> +		goto out_unlink;
> +
> +	link->tr = tr;
> +	/* Take extra ref so we are even with progs added by link_update. */
> +	bpf_prog_inc(prog);
> +	return bpf_link_settle(&link_primer);
> +
> +out_unlink:
> +	unregister_ftrace_direct_multi(&link->ops);
> +out_free:
> +	kfree(tr);
> +	kfree(btf_ids);
> +	kfree(link);
> +	return err;
> +}
> +
> 
> Looks like cnt = 0 is okay in bpf_tracing_multi_attach().

right, we should fail for that with EINVAL, and we should also cap
the count, at least at prog->aux->attach_btf->nr_types
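
A sketch of that validation early in bpf_tracing_multi_attach(); the
nr_types bound is the one suggested above (struct btf is private to
btf.c, so reaching nr_types may need a small helper), and the
size = cnt * sizeof(*btf_ids) multiplication would deserve an
overflow check as well:

	/* before the btf_ids allocation */
	if (!cnt || cnt > prog->aux->attach_btf->nr_types)
		return -EINVAL;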

thanks,
jirka


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 13/19] bpf: Add support to link multi func tracing program
  2021-06-05 11:10 ` [PATCH 13/19] bpf: Add support to link multi func tracing program Jiri Olsa
  2021-06-07  5:36   ` Yonghong Song
@ 2021-06-08 15:42   ` Alexei Starovoitov
  2021-06-08 18:17     ` Jiri Olsa
  2021-06-09  5:18   ` Andrii Nakryiko
  2 siblings, 1 reply; 76+ messages in thread
From: Alexei Starovoitov @ 2021-06-08 15:42 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Network Development, bpf, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Sat, Jun 5, 2021 at 4:11 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Adding support to attach multiple functions to tracing program
> by using the link_create/link_update interface.
>
> Adding multi_btf_ids/multi_btf_ids_cnt pair to link_create struct
> API, that define array of functions btf ids that will be attached
> to prog_fd.
>
> The prog_fd needs to be multi prog tracing program (BPF_F_MULTI_FUNC).
>
> The new link_create interface creates new BPF_LINK_TYPE_TRACING_MULTI
> link type, which creates separate bpf_trampoline and registers it
> as direct function for all specified btf ids.
>
> The new bpf_trampoline is out of scope (bpf_trampoline_lookup) of
> standard trampolines, so all registered functions need to be free
> of direct functions, otherwise the link fails.

Overall the api makes sense to me.
The restriction of multi vs non-multi is too severe though.
The multi trampoline can serve normal fentry/fexit too.
If ip is moved to the end (instead of start) the trampoline
will be able to call into multi and normal fentry/fexit progs. Right?

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 13/19] bpf: Add support to link multi func tracing program
  2021-06-08 15:42   ` Alexei Starovoitov
@ 2021-06-08 18:17     ` Jiri Olsa
  2021-06-08 18:49       ` Alexei Starovoitov
  0 siblings, 1 reply; 76+ messages in thread
From: Jiri Olsa @ 2021-06-08 18:17 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Network Development, bpf, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 08, 2021 at 08:42:32AM -0700, Alexei Starovoitov wrote:
> On Sat, Jun 5, 2021 at 4:11 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Adding support to attach multiple functions to tracing program
> > by using the link_create/link_update interface.
> >
> > Adding multi_btf_ids/multi_btf_ids_cnt pair to link_create struct
> > API, that define array of functions btf ids that will be attached
> > to prog_fd.
> >
> > The prog_fd needs to be multi prog tracing program (BPF_F_MULTI_FUNC).
> >
> > The new link_create interface creates new BPF_LINK_TYPE_TRACING_MULTI
> > link type, which creates separate bpf_trampoline and registers it
> > as direct function for all specified btf ids.
> >
> > The new bpf_trampoline is out of scope (bpf_trampoline_lookup) of
> > standard trampolines, so all registered functions need to be free
> > of direct functions, otherwise the link fails.
> 
> Overall the api makes sense to me.
> The restriction of multi vs non-multi is too severe though.
> The multi trampoline can serve normal fentry/fexit too.

so the multi trampoline gets called from all the registered functions,
so there would need to be a filter for the specific ip before calling the
standard program.. a single cmp/jnz might not be that bad, I'll check

> If ip is moved to the end (instead of start) the trampoline
> will be able to call into multi and normal fentry/fexit progs. Right?
> 

we could just skip the ip arg when generating the entry for normal
programs and start %rdi from the first argument's address

and it'd need to be transparent for the current trampolines' user API,
so I wonder if there will be some hiccup ;-) let's see
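
What "skip the ip arg" could look like at the point where the
trampoline materializes ctx for one program; the multi_func per-prog
flag is hypothetical, and the lea offsets assume ip is stored at the
bottom of the args area as in this series:

	/* inside the per-program invocation path, pick where ctx (rdi)
	 * points for this program
	 */
	if (p->aux->multi_func) {
		/* lea rdi, [rbp - stack_size]: ctx[0] is ip */
		EMIT4(0x48, 0x8D, 0x7D, -stack_size);
	} else {
		/* lea rdi, [rbp - (stack_size - 8)]: ctx[0] is arg1 */
		EMIT4(0x48, 0x8D, 0x7D, -(stack_size - 8));
	}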

thanks,
jirka


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 03/19] x86/ftrace: Make function graph use ftrace directly
  2021-06-05 11:10 ` [PATCH 03/19] x86/ftrace: Make function graph use ftrace directly Jiri Olsa
@ 2021-06-08 18:35   ` Andrii Nakryiko
  2021-06-08 18:51     ` Jiri Olsa
  0 siblings, 1 reply; 76+ messages in thread
From: Andrii Nakryiko @ 2021-06-08 18:35 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>
>
> We don't need special hook for graph tracer entry point,
> but instead we can use graph_ops::func function to install
> the return_hooker.
>
> This moves the graph tracing setup _before_ the direct
> trampoline prepares the stack, so the return_hooker will
> be called when the direct trampoline is finished.
>
> This simplifies the code, because we don't need to take into
> account the direct trampoline setup when preparing the graph
> tracer hooker and we can allow function graph tracer on entries
> registered with direct trampoline.
>
> Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  arch/x86/include/asm/ftrace.h |  9 +++++++--
>  arch/x86/kernel/ftrace.c      | 37 ++++++++++++++++++++++++++++++++---
>  arch/x86/kernel/ftrace_64.S   | 29 +--------------------------
>  include/linux/ftrace.h        |  6 ++++++
>  kernel/trace/fgraph.c         |  8 +++++---
>  5 files changed, 53 insertions(+), 36 deletions(-)
>
> diff --git a/arch/x86/include/asm/ftrace.h b/arch/x86/include/asm/ftrace.h
> index 9f3130f40807..024d9797646e 100644
> --- a/arch/x86/include/asm/ftrace.h
> +++ b/arch/x86/include/asm/ftrace.h
> @@ -57,6 +57,13 @@ arch_ftrace_get_regs(struct ftrace_regs *fregs)
>
>  #define ftrace_instruction_pointer_set(fregs, _ip)     \
>         do { (fregs)->regs.ip = (_ip); } while (0)
> +
> +struct ftrace_ops;
> +#define ftrace_graph_func ftrace_graph_func
> +void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
> +                      struct ftrace_ops *op, struct ftrace_regs *fregs);
> +#else
> +#define FTRACE_GRAPH_TRAMP_ADDR FTRACE_GRAPH_ADDR
>  #endif
>
>  #ifdef CONFIG_DYNAMIC_FTRACE
> @@ -65,8 +72,6 @@ struct dyn_arch_ftrace {
>         /* No extra data needed for x86 */
>  };
>
> -#define FTRACE_GRAPH_TRAMP_ADDR FTRACE_GRAPH_ADDR
> -
>  #endif /*  CONFIG_DYNAMIC_FTRACE */
>  #endif /* __ASSEMBLY__ */
>  #endif /* CONFIG_FUNCTION_TRACER */
> diff --git a/arch/x86/kernel/ftrace.c b/arch/x86/kernel/ftrace.c
> index c555624da989..804fcc6ef2c7 100644
> --- a/arch/x86/kernel/ftrace.c
> +++ b/arch/x86/kernel/ftrace.c
> @@ -527,7 +527,7 @@ static void *addr_from_call(void *ptr)
>         return ptr + CALL_INSN_SIZE + call.disp;
>  }
>
> -void prepare_ftrace_return(unsigned long self_addr, unsigned long *parent,
> +void prepare_ftrace_return(unsigned long ip, unsigned long *parent,
>                            unsigned long frame_pointer);
>
>  /*
> @@ -541,7 +541,8 @@ static void *static_tramp_func(struct ftrace_ops *ops, struct dyn_ftrace *rec)
>         void *ptr;
>
>         if (ops && ops->trampoline) {
> -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> +#if !defined(CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS) && \
> +       defined(CONFIG_FUNCTION_GRAPH_TRACER)
>                 /*
>                  * We only know about function graph tracer setting as static
>                  * trampoline.
> @@ -589,8 +590,9 @@ void arch_ftrace_trampoline_free(struct ftrace_ops *ops)
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
>
>  #ifdef CONFIG_DYNAMIC_FTRACE
> -extern void ftrace_graph_call(void);
>
> +#ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
> +extern void ftrace_graph_call(void);
>  static const char *ftrace_jmp_replace(unsigned long ip, unsigned long addr)
>  {
>         return text_gen_insn(JMP32_INSN_OPCODE, (void *)ip, (void *)addr);
> @@ -618,7 +620,17 @@ int ftrace_disable_ftrace_graph_caller(void)
>
>         return ftrace_mod_jmp(ip, &ftrace_stub);
>  }
> +#else /* !CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS */
> +int ftrace_enable_ftrace_graph_caller(void)
> +{
> +       return 0;
> +}
>
> +int ftrace_disable_ftrace_graph_caller(void)
> +{
> +       return 0;
> +}
> +#endif /* CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS */
>  #endif /* !CONFIG_DYNAMIC_FTRACE */
>
>  /*
> @@ -629,6 +641,7 @@ void prepare_ftrace_return(unsigned long ip, unsigned long *parent,
>                            unsigned long frame_pointer)
>  {
>         unsigned long return_hooker = (unsigned long)&return_to_handler;
> +       int bit;
>
>         /*
>          * When resuming from suspend-to-ram, this function can be indirectly
> @@ -648,7 +661,25 @@ void prepare_ftrace_return(unsigned long ip, unsigned long *parent,
>         if (unlikely(atomic_read(&current->tracing_graph_pause)))
>                 return;
>
> +       bit = ftrace_test_recursion_trylock(ip, *parent);
> +       if (bit < 0)
> +               return;
> +
>         if (!function_graph_enter(*parent, ip, frame_pointer, parent))
>                 *parent = return_hooker;
> +
> +       ftrace_test_recursion_unlock(bit);
> +}
> +
> +#ifdef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
> +void ftrace_graph_func(unsigned long ip, unsigned long parent_ip,
> +                      struct ftrace_ops *op, struct ftrace_regs *fregs)
> +{
> +       struct pt_regs *regs = &fregs->regs;
> +       unsigned long *stack = (unsigned long *)kernel_stack_pointer(regs);
> +
> +       prepare_ftrace_return(ip, (unsigned long *)stack, 0);
>  }
> +#endif
> +
>  #endif /* CONFIG_FUNCTION_GRAPH_TRACER */
> diff --git a/arch/x86/kernel/ftrace_64.S b/arch/x86/kernel/ftrace_64.S
> index a8eb084a7a9a..7a879901f103 100644
> --- a/arch/x86/kernel/ftrace_64.S
> +++ b/arch/x86/kernel/ftrace_64.S
> @@ -174,11 +174,6 @@ SYM_INNER_LABEL(ftrace_caller_end, SYM_L_GLOBAL)
>  SYM_FUNC_END(ftrace_caller);
>
>  SYM_FUNC_START(ftrace_epilogue)
> -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL)
> -       jmp ftrace_stub
> -#endif
> -
>  /*
>   * This is weak to keep gas from relaxing the jumps.
>   * It is also used to copy the retq for trampolines.
> @@ -288,15 +283,6 @@ SYM_FUNC_START(__fentry__)
>         cmpq $ftrace_stub, ftrace_trace_function
>         jnz trace
>
> -fgraph_trace:
> -#ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -       cmpq $ftrace_stub, ftrace_graph_return
> -       jnz ftrace_graph_caller
> -
> -       cmpq $ftrace_graph_entry_stub, ftrace_graph_entry
> -       jnz ftrace_graph_caller
> -#endif
> -
>  SYM_INNER_LABEL(ftrace_stub, SYM_L_GLOBAL)
>         retq
>
> @@ -314,25 +300,12 @@ trace:
>         CALL_NOSPEC r8
>         restore_mcount_regs
>
> -       jmp fgraph_trace
> +       jmp ftrace_stub
>  SYM_FUNC_END(__fentry__)
>  EXPORT_SYMBOL(__fentry__)
>  #endif /* CONFIG_DYNAMIC_FTRACE */
>
>  #ifdef CONFIG_FUNCTION_GRAPH_TRACER
> -SYM_FUNC_START(ftrace_graph_caller)
> -       /* Saves rbp into %rdx and fills first parameter  */
> -       save_mcount_regs
> -
> -       leaq MCOUNT_REG_SIZE+8(%rsp), %rsi
> -       movq $0, %rdx   /* No framepointers needed */
> -       call    prepare_ftrace_return
> -
> -       restore_mcount_regs
> -
> -       retq
> -SYM_FUNC_END(ftrace_graph_caller)
> -
>  SYM_FUNC_START(return_to_handler)
>         subq  $24, %rsp
>
> diff --git a/include/linux/ftrace.h b/include/linux/ftrace.h
> index a69f363b61bf..40b493908f09 100644
> --- a/include/linux/ftrace.h
> +++ b/include/linux/ftrace.h
> @@ -614,6 +614,12 @@ void ftrace_modify_all_code(int command);
>  extern void ftrace_graph_caller(void);
>  extern int ftrace_enable_ftrace_graph_caller(void);
>  extern int ftrace_disable_ftrace_graph_caller(void);
> +#ifndef ftrace_graph_func
> +#define ftrace_graph_func ftrace_stub
> +#define FTRACE_OPS_GRAPH_STUB | FTRACE_OPS_FL_STUB
> +#else
> +#define FTRACE_OPS_GRAPH_STUB
> +#endif
>  #else
>  static inline int ftrace_enable_ftrace_graph_caller(void) { return 0; }
>  static inline int ftrace_disable_ftrace_graph_caller(void) { return 0; }
> diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> index b8a0d1d564fb..58e96b45e9da 100644
> --- a/kernel/trace/fgraph.c
> +++ b/kernel/trace/fgraph.c
> @@ -115,6 +115,7 @@ int function_graph_enter(unsigned long ret, unsigned long func,
>  {
>         struct ftrace_graph_ent trace;
>
> +#ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
>         /*
>          * Skip graph tracing if the return location is served by direct trampoline,
>          * since call sequence and return addresses are unpredictable anyway.
> @@ -124,6 +125,7 @@ int function_graph_enter(unsigned long ret, unsigned long func,
>         if (ftrace_direct_func_count &&
>             ftrace_find_rec_direct(ret - MCOUNT_INSN_SIZE))
>                 return -EBUSY;
> +#endif
>         trace.func = func;
>         trace.depth = ++current->curr_ret_depth;
>
> @@ -333,10 +335,10 @@ unsigned long ftrace_graph_ret_addr(struct task_struct *task, int *idx,
>  #endif /* HAVE_FUNCTION_GRAPH_RET_ADDR_PTR */
>
>  static struct ftrace_ops graph_ops = {
> -       .func                   = ftrace_stub,
> +       .func                   = ftrace_graph_func,
>         .flags                  = FTRACE_OPS_FL_INITIALIZED |
> -                                  FTRACE_OPS_FL_PID |
> -                                  FTRACE_OPS_FL_STUB,
> +                                  FTRACE_OPS_FL_PID
> +                                  FTRACE_OPS_GRAPH_STUB,

nit: this looks so weird... Why not define FTRACE_OPS_GRAPH_STUB as
zero in case of #ifdef ftrace_graph_func? Then it will be natural and
correct-looking: | FTRACE_OPS_GRAPH_STUB?
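
Concretely, the suggestion would look something like this (a sketch;
the zero define keeps the | chain uniform, other graph_ops members
omitted):

	/* in the header */
	#ifndef ftrace_graph_func
	#define ftrace_graph_func ftrace_stub
	#define FTRACE_OPS_GRAPH_STUB FTRACE_OPS_FL_STUB
	#else
	#define FTRACE_OPS_GRAPH_STUB 0
	#endif

	/* in fgraph.c the flags then read naturally */
	static struct ftrace_ops graph_ops = {
		.func	= ftrace_graph_func,
		.flags	= FTRACE_OPS_FL_INITIALIZED |
			  FTRACE_OPS_FL_PID |
			  FTRACE_OPS_GRAPH_STUB,
	};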

>  #ifdef FTRACE_GRAPH_TRAMP_ADDR
>         .trampoline             = FTRACE_GRAPH_TRAMP_ADDR,
>         /* trampoline_size is only needed for dynamically allocated tramps */
> --
> 2.31.1
>

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 13/19] bpf: Add support to link multi func tracing program
  2021-06-08 18:17     ` Jiri Olsa
@ 2021-06-08 18:49       ` Alexei Starovoitov
  2021-06-08 21:07         ` Jiri Olsa
  0 siblings, 1 reply; 76+ messages in thread
From: Alexei Starovoitov @ 2021-06-08 18:49 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Network Development, bpf, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 08, 2021 at 08:17:00PM +0200, Jiri Olsa wrote:
> On Tue, Jun 08, 2021 at 08:42:32AM -0700, Alexei Starovoitov wrote:
> > On Sat, Jun 5, 2021 at 4:11 AM Jiri Olsa <jolsa@kernel.org> wrote:
> > >
> > > Adding support to attach multiple functions to tracing program
> > > by using the link_create/link_update interface.
> > >
> > > Adding multi_btf_ids/multi_btf_ids_cnt pair to link_create struct
> > > API, that define array of functions btf ids that will be attached
> > > to prog_fd.
> > >
> > > The prog_fd needs to be multi prog tracing program (BPF_F_MULTI_FUNC).
> > >
> > > The new link_create interface creates new BPF_LINK_TYPE_TRACING_MULTI
> > > link type, which creates separate bpf_trampoline and registers it
> > > as direct function for all specified btf ids.
> > >
> > > The new bpf_trampoline is out of scope (bpf_trampoline_lookup) of
> > > standard trampolines, so all registered functions need to be free
> > > of direct functions, otherwise the link fails.
> > 
> > Overall the api makes sense to me.
> > The restriction of multi vs non-multi is too severe though.
> > The multi trampoline can serve normal fentry/fexit too.
> 
> so the multi trampoline gets called from all the registered functions,
> so there would need to be a filter for the specific ip before calling the
> standard program.. a single cmp/jnz might not be that bad, I'll check

You mean reusing the same multi trampoline for all IPs and regenerating
it with a bunch of cmp/jnz checks? There should be a better way to scale.
Maybe clone multi trampoline instead?
IPs[1-10] will point to multi.
IP[11] will point to a clone of multi that serves multi prog and
fentry/fexit progs specific for that IP.

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 10/19] bpf: Allow to store caller's ip as argument
  2021-06-05 11:10 ` [PATCH 10/19] bpf: Allow to store caller's ip as argument Jiri Olsa
  2021-06-07  3:21   ` Yonghong Song
@ 2021-06-08 18:49   ` Andrii Nakryiko
  2021-06-08 20:58     ` Jiri Olsa
  1 sibling, 1 reply; 76+ messages in thread
From: Andrii Nakryiko @ 2021-06-08 18:49 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> When we will have multiple functions attached to trampoline
> we need to propagate the function's address to the bpf program.
>
> Adding new BPF_TRAMP_F_IP_ARG flag to arch_prepare_bpf_trampoline
> function that will store origin caller's address before function's
> arguments.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  arch/x86/net/bpf_jit_comp.c | 18 ++++++++++++++----
>  include/linux/bpf.h         |  5 +++++
>  2 files changed, 19 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index b77e6bd78354..d2425c18272a 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -1951,7 +1951,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>                                 void *orig_call)
>  {
>         int ret, i, cnt = 0, nr_args = m->nr_args;
> -       int stack_size = nr_args * 8;
> +       int stack_size = nr_args * 8, ip_arg = 0;
>         struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
>         struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
>         struct bpf_tramp_progs *fmod_ret = &tprogs[BPF_TRAMP_MODIFY_RETURN];
> @@ -1975,6 +1975,9 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>                  */
>                 orig_call += X86_PATCH_SIZE;
>
> +       if (flags & BPF_TRAMP_F_IP_ARG)
> +               stack_size += 8;
> +

nit: move it a bit up where we adjust stack_size for BPF_TRAMP_F_CALL_ORIG flag?

>         prog = image;
>
>         EMIT1(0x55);             /* push rbp */
> @@ -1982,7 +1985,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>         EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */
>         EMIT1(0x53);             /* push rbx */
>
> -       save_regs(m, &prog, nr_args, stack_size);
> +       if (flags & BPF_TRAMP_F_IP_ARG) {
> +               emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> +               EMIT4(0x48, 0x83, 0xe8, X86_PATCH_SIZE); /* sub $X86_PATCH_SIZE,%rax*/
> +               emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -stack_size);
> +               ip_arg = 8;
> +       }

why not pass flags into save_regs and let it handle this case without
this extra ip_arg adjustment?

> +
> +       save_regs(m, &prog, nr_args, stack_size - ip_arg);
>
>         if (flags & BPF_TRAMP_F_CALL_ORIG) {
>                 /* arg1: mov rdi, im */
> @@ -2011,7 +2021,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>         }
>
>         if (flags & BPF_TRAMP_F_CALL_ORIG) {
> -               restore_regs(m, &prog, nr_args, stack_size);
> +               restore_regs(m, &prog, nr_args, stack_size - ip_arg);
>

similarly (and symmetrically), pass flags into restore_regs() to
handle that ip_arg transparently?

>                 if (flags & BPF_TRAMP_F_ORIG_STACK) {
>                         emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> @@ -2052,7 +2062,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>                 }
>
>         if (flags & BPF_TRAMP_F_RESTORE_REGS)
> -               restore_regs(m, &prog, nr_args, stack_size);
> +               restore_regs(m, &prog, nr_args, stack_size - ip_arg);
>
>         /* This needs to be done regardless. If there were fmod_ret programs,
>          * the return value is only updated on the stack and still needs to be
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 16fc600503fb..6cbf3c81c650 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -559,6 +559,11 @@ struct btf_func_model {
>   */
>  #define BPF_TRAMP_F_ORIG_STACK         BIT(3)
>
> +/* First argument is IP address of the caller. Makes sense for fentry/fexit
> + * programs only.
> + */
> +#define BPF_TRAMP_F_IP_ARG             BIT(4)
> +
>  /* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
>   * bytes on x86.  Pick a number to fit into BPF_IMAGE_SIZE / 2
>   */
> --
> 2.31.1
>

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 03/19] x86/ftrace: Make function graph use ftrace directly
  2021-06-08 18:35   ` Andrii Nakryiko
@ 2021-06-08 18:51     ` Jiri Olsa
  2021-06-08 19:11       ` Steven Rostedt
  0 siblings, 1 reply; 76+ messages in thread
From: Jiri Olsa @ 2021-06-08 18:51 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 08, 2021 at 11:35:58AM -0700, Andrii Nakryiko wrote:

SNIP

> > diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
> > index b8a0d1d564fb..58e96b45e9da 100644
> > --- a/kernel/trace/fgraph.c
> > +++ b/kernel/trace/fgraph.c
> > @@ -115,6 +115,7 @@ int function_graph_enter(unsigned long ret, unsigned long func,
> >  {
> >         struct ftrace_graph_ent trace;
> >
> > +#ifndef CONFIG_HAVE_DYNAMIC_FTRACE_WITH_ARGS
> >         /*
> >          * Skip graph tracing if the return location is served by direct trampoline,
> >          * since call sequence and return addresses are unpredictable anyway.
> > @@ -124,6 +125,7 @@ int function_graph_enter(unsigned long ret, unsigned long func,
> >         if (ftrace_direct_func_count &&
> >             ftrace_find_rec_direct(ret - MCOUNT_INSN_SIZE))
> >                 return -EBUSY;
> > +#endif
> >         trace.func = func;
> >         trace.depth = ++current->curr_ret_depth;
> >
> > @@ -333,10 +335,10 @@ unsigned long ftrace_graph_ret_addr(struct task_struct *task, int *idx,
> >  #endif /* HAVE_FUNCTION_GRAPH_RET_ADDR_PTR */
> >
> >  static struct ftrace_ops graph_ops = {
> > -       .func                   = ftrace_stub,
> > +       .func                   = ftrace_graph_func,
> >         .flags                  = FTRACE_OPS_FL_INITIALIZED |
> > -                                  FTRACE_OPS_FL_PID |
> > -                                  FTRACE_OPS_FL_STUB,
> > +                                  FTRACE_OPS_FL_PID
> > +                                  FTRACE_OPS_GRAPH_STUB,
> 
> nit: this looks so weird... Why not define FTRACE_OPS_GRAPH_STUB as
> zero in case of #ifdef ftrace_graph_func? Then it will be natural and
> correct-looking: | FTRACE_OPS_GRAPH_STUB?

ok, I can change that

thanks,
jirka

> 
> >  #ifdef FTRACE_GRAPH_TRAMP_ADDR
> >         .trampoline             = FTRACE_GRAPH_TRAMP_ADDR,
> >         /* trampoline_size is only needed for dynamically allocated tramps */
> > --
> > 2.31.1
> >
> 


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 03/19] x86/ftrace: Make function graph use ftrace directly
  2021-06-08 18:51     ` Jiri Olsa
@ 2021-06-08 19:11       ` Steven Rostedt
  0 siblings, 0 replies; 76+ messages in thread
From: Steven Rostedt @ 2021-06-08 19:11 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andrii Nakryiko, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Networking, bpf, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, 8 Jun 2021 20:51:25 +0200
Jiri Olsa <jolsa@redhat.com> wrote:

> > > +                                  FTRACE_OPS_FL_PID
> > > +                                  FTRACE_OPS_GRAPH_STUB,  
> > 
> > nit: this looks so weird... Why not define FTRACE_OPS_GRAPH_STUB as
> > zero in case of #ifdef ftrace_graph_func? Then it will be natural and
> > correct-looking: | FTRACE_OPS_GRAPH_STUB?

I have no idea why I did that :-/  But it was a while ago when I wrote
this code. I think there was a reason for it, but with various updates,
that reason disappeared.


> 
> ok, I can change that

Yes, please do.

Thanks,

-- Steve

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 10/19] bpf: Allow to store caller's ip as argument
  2021-06-08 18:49   ` Andrii Nakryiko
@ 2021-06-08 20:58     ` Jiri Olsa
  2021-06-08 21:02       ` Andrii Nakryiko
  0 siblings, 1 reply; 76+ messages in thread
From: Jiri Olsa @ 2021-06-08 20:58 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 08, 2021 at 11:49:31AM -0700, Andrii Nakryiko wrote:
> On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > When we will have multiple functions attached to trampoline
> > we need to propagate the function's address to the bpf program.
> >
> > Adding new BPF_TRAMP_F_IP_ARG flag to arch_prepare_bpf_trampoline
> > function that will store origin caller's address before function's
> > arguments.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  arch/x86/net/bpf_jit_comp.c | 18 ++++++++++++++----
> >  include/linux/bpf.h         |  5 +++++
> >  2 files changed, 19 insertions(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > index b77e6bd78354..d2425c18272a 100644
> > --- a/arch/x86/net/bpf_jit_comp.c
> > +++ b/arch/x86/net/bpf_jit_comp.c
> > @@ -1951,7 +1951,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >                                 void *orig_call)
> >  {
> >         int ret, i, cnt = 0, nr_args = m->nr_args;
> > -       int stack_size = nr_args * 8;
> > +       int stack_size = nr_args * 8, ip_arg = 0;
> >         struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
> >         struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
> >         struct bpf_tramp_progs *fmod_ret = &tprogs[BPF_TRAMP_MODIFY_RETURN];
> > @@ -1975,6 +1975,9 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >                  */
> >                 orig_call += X86_PATCH_SIZE;
> >
> > +       if (flags & BPF_TRAMP_F_IP_ARG)
> > +               stack_size += 8;
> > +
> 
> nit: move it a bit up where we adjust stack_size for BPF_TRAMP_F_CALL_ORIG flag?

ok

> 
> >         prog = image;
> >
> >         EMIT1(0x55);             /* push rbp */
> > @@ -1982,7 +1985,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >         EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */
> >         EMIT1(0x53);             /* push rbx */
> >
> > -       save_regs(m, &prog, nr_args, stack_size);
> > +       if (flags & BPF_TRAMP_F_IP_ARG) {
> > +               emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> > +               EMIT4(0x48, 0x83, 0xe8, X86_PATCH_SIZE); /* sub $X86_PATCH_SIZE,%rax*/
> > +               emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -stack_size);
> > +               ip_arg = 8;
> > +       }
> 
> why not pass flags into save_regs and let it handle this case without
> this extra ip_arg adjustment?
> 
> > +
> > +       save_regs(m, &prog, nr_args, stack_size - ip_arg);
> >
> >         if (flags & BPF_TRAMP_F_CALL_ORIG) {
> >                 /* arg1: mov rdi, im */
> > @@ -2011,7 +2021,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >         }
> >
> >         if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > -               restore_regs(m, &prog, nr_args, stack_size);
> > +               restore_regs(m, &prog, nr_args, stack_size - ip_arg);
> >
> 
> similarly (and symmetrically), pass flags into restore_regs() to
> handle that ip_arg transparently?

so you mean something like:

	if (flags & BPF_TRAMP_F_IP_ARG)
		stack_size -= 8;

in both the save_regs and restore_regs functions, right?

jirka

> 
> >                 if (flags & BPF_TRAMP_F_ORIG_STACK) {
> >                         emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> > @@ -2052,7 +2062,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> >                 }
> >
> >         if (flags & BPF_TRAMP_F_RESTORE_REGS)
> > -               restore_regs(m, &prog, nr_args, stack_size);
> > +               restore_regs(m, &prog, nr_args, stack_size - ip_arg);
> >
> >         /* This needs to be done regardless. If there were fmod_ret programs,
> >          * the return value is only updated on the stack and still needs to be
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 16fc600503fb..6cbf3c81c650 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -559,6 +559,11 @@ struct btf_func_model {
> >   */
> >  #define BPF_TRAMP_F_ORIG_STACK         BIT(3)
> >
> > +/* First argument is IP address of the caller. Makes sense for fentry/fexit
> > + * programs only.
> > + */
> > +#define BPF_TRAMP_F_IP_ARG             BIT(4)
> > +
> >  /* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
> >   * bytes on x86.  Pick a number to fit into BPF_IMAGE_SIZE / 2
> >   */
> > --
> > 2.31.1
> >
> 


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 10/19] bpf: Allow to store caller's ip as argument
  2021-06-08 20:58     ` Jiri Olsa
@ 2021-06-08 21:02       ` Andrii Nakryiko
  2021-06-08 21:11         ` Jiri Olsa
  0 siblings, 1 reply; 76+ messages in thread
From: Andrii Nakryiko @ 2021-06-08 21:02 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 8, 2021 at 1:58 PM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Tue, Jun 08, 2021 at 11:49:31AM -0700, Andrii Nakryiko wrote:
> > On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
> > >
> > > When we will have multiple functions attached to trampoline
> > > we need to propagate the function's address to the bpf program.
> > >
> > > Adding new BPF_TRAMP_F_IP_ARG flag to arch_prepare_bpf_trampoline
> > > function that will store origin caller's address before function's
> > > arguments.
> > >
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > ---
> > >  arch/x86/net/bpf_jit_comp.c | 18 ++++++++++++++----
> > >  include/linux/bpf.h         |  5 +++++
> > >  2 files changed, 19 insertions(+), 4 deletions(-)
> > >
> > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > > index b77e6bd78354..d2425c18272a 100644
> > > --- a/arch/x86/net/bpf_jit_comp.c
> > > +++ b/arch/x86/net/bpf_jit_comp.c
> > > @@ -1951,7 +1951,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > >                                 void *orig_call)
> > >  {
> > >         int ret, i, cnt = 0, nr_args = m->nr_args;
> > > -       int stack_size = nr_args * 8;
> > > +       int stack_size = nr_args * 8, ip_arg = 0;
> > >         struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
> > >         struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
> > >         struct bpf_tramp_progs *fmod_ret = &tprogs[BPF_TRAMP_MODIFY_RETURN];
> > > @@ -1975,6 +1975,9 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > >                  */
> > >                 orig_call += X86_PATCH_SIZE;
> > >
> > > +       if (flags & BPF_TRAMP_F_IP_ARG)
> > > +               stack_size += 8;
> > > +
> >
> > nit: move it a bit up where we adjust stack_size for BPF_TRAMP_F_CALL_ORIG flag?
>
> ok
>
> >
> > >         prog = image;
> > >
> > >         EMIT1(0x55);             /* push rbp */
> > > @@ -1982,7 +1985,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > >         EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */
> > >         EMIT1(0x53);             /* push rbx */
> > >
> > > -       save_regs(m, &prog, nr_args, stack_size);
> > > +       if (flags & BPF_TRAMP_F_IP_ARG) {
> > > +               emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> > > +               EMIT4(0x48, 0x83, 0xe8, X86_PATCH_SIZE); /* sub $X86_PATCH_SIZE,%rax*/
> > > +               emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -stack_size);
> > > +               ip_arg = 8;
> > > +       }
> >
> > why not pass flags into save_regs and let it handle this case without
> > this extra ip_arg adjustment?
> >
> > > +
> > > +       save_regs(m, &prog, nr_args, stack_size - ip_arg);
> > >
> > >         if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > >                 /* arg1: mov rdi, im */
> > > @@ -2011,7 +2021,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > >         }
> > >
> > >         if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > > -               restore_regs(m, &prog, nr_args, stack_size);
> > > +               restore_regs(m, &prog, nr_args, stack_size - ip_arg);
> > >
> >
> > similarly (and symmetrically), pass flags into restore_regs() to
> > handle that ip_arg transparently?
>
> so you mean something like:
>
>         if (flags & BPF_TRAMP_F_IP_ARG)
>                 stack_size -= 8;
>
> > in both the save_regs and restore_regs functions, right?

yes, but for save_regs it will do more (emit_ldx and stuff)
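
Putting the two suggestions together, a sketch of a flag-aware
save_regs() built from the emit sequence in the patch (restore_regs()
would only need the stack_size adjustment); the BPF_DW-only stores
are a simplification:

	static void save_regs(const struct btf_func_model *m, u8 **pprog,
			      int nr_args, int stack_size, u32 flags)
	{
		u8 *prog = *pprog;
		int i, cnt = 0;	/* cnt is used by the EMIT* macros here */

		if (flags & BPF_TRAMP_F_IP_ARG) {
			/* rax = return address minus the patched call
			 * size, i.e. the traced function's address;
			 * store it below the arguments
			 */
			emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
			EMIT4(0x48, 0x83, 0xe8, X86_PATCH_SIZE);
			emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0,
				 -stack_size);
			stack_size -= 8;	/* args go above the ip slot */
		}

		for (i = 0; i < min(nr_args, 6); i++)
			emit_stx(&prog, BPF_DW, BPF_REG_FP, i + 1,
				 -(stack_size - i * 8));

		*pprog = prog;
	}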

>
> jirka
>
> >
> > >                 if (flags & BPF_TRAMP_F_ORIG_STACK) {
> > >                         emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> > > @@ -2052,7 +2062,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > >                 }
> > >
> > >         if (flags & BPF_TRAMP_F_RESTORE_REGS)
> > > -               restore_regs(m, &prog, nr_args, stack_size);
> > > +               restore_regs(m, &prog, nr_args, stack_size - ip_arg);
> > >
> > >         /* This needs to be done regardless. If there were fmod_ret programs,
> > >          * the return value is only updated on the stack and still needs to be
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index 16fc600503fb..6cbf3c81c650 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -559,6 +559,11 @@ struct btf_func_model {
> > >   */
> > >  #define BPF_TRAMP_F_ORIG_STACK         BIT(3)
> > >
> > > +/* First argument is IP address of the caller. Makes sense for fentry/fexit
> > > + * programs only.
> > > + */
> > > +#define BPF_TRAMP_F_IP_ARG             BIT(4)
> > > +
> > >  /* Each call __bpf_prog_enter + call bpf_func + call __bpf_prog_exit is ~50
> > >   * bytes on x86.  Pick a number to fit into BPF_IMAGE_SIZE / 2
> > >   */
> > > --
> > > 2.31.1
> > >
> >
>

^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 13/19] bpf: Add support to link multi func tracing program
  2021-06-08 18:49       ` Alexei Starovoitov
@ 2021-06-08 21:07         ` Jiri Olsa
  2021-06-08 23:05           ` Alexei Starovoitov
  0 siblings, 1 reply; 76+ messages in thread
From: Jiri Olsa @ 2021-06-08 21:07 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Network Development, bpf, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 08, 2021 at 11:49:03AM -0700, Alexei Starovoitov wrote:
> On Tue, Jun 08, 2021 at 08:17:00PM +0200, Jiri Olsa wrote:
> > On Tue, Jun 08, 2021 at 08:42:32AM -0700, Alexei Starovoitov wrote:
> > > On Sat, Jun 5, 2021 at 4:11 AM Jiri Olsa <jolsa@kernel.org> wrote:
> > > >
> > > > Adding support to attach multiple functions to tracing program
> > > > by using the link_create/link_update interface.
> > > >
> > > > Adding multi_btf_ids/multi_btf_ids_cnt pair to link_create struct
> > > > API, that define array of functions btf ids that will be attached
> > > > to prog_fd.
> > > >
> > > > The prog_fd needs to be multi prog tracing program (BPF_F_MULTI_FUNC).
> > > >
> > > > The new link_create interface creates new BPF_LINK_TYPE_TRACING_MULTI
> > > > link type, which creates separate bpf_trampoline and registers it
> > > > as direct function for all specified btf ids.
> > > >
> > > > The new bpf_trampoline is out of scope (bpf_trampoline_lookup) of
> > > > standard trampolines, so all registered functions need to be free
> > > > of direct functions, otherwise the link fails.
> > > 
> > > Overall the api makes sense to me.
> > > The restriction of multi vs non-multi is too severe though.
> > > The multi trampoline can serve normal fentry/fexit too.
> > 
> > so the multi trampoline gets called from all the registered functions,
> > so there would need to be a filter for the specific ip before calling the
> > standard program.. a single cmp/jnz might not be that bad, I'll check
> 
> You mean reusing the same multi trampoline for all IPs and regenerating
> it with a bunch of cmp/jnz checks? There should be a better way to scale.
> Maybe clone multi trampoline instead?
> IPs[1-10] will point to multi.
> IP[11] will point to a clone of multi that serves multi prog and
> fentry/fexit progs specific for that IP.

ok, so we'd clone the multi trampoline if there's a request to attach
a standard trampoline to some IP from the multi trampoline

.. and transform the currently attached standard trampoline for an IP
into a clone of the multi trampoline, if there's a request to create
a multi trampoline that covers that IP

jirka


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 10/19] bpf: Allow to store caller's ip as argument
  2021-06-08 21:02       ` Andrii Nakryiko
@ 2021-06-08 21:11         ` Jiri Olsa
  0 siblings, 0 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-08 21:11 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 08, 2021 at 02:02:56PM -0700, Andrii Nakryiko wrote:
> On Tue, Jun 8, 2021 at 1:58 PM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Tue, Jun 08, 2021 at 11:49:31AM -0700, Andrii Nakryiko wrote:
> > > On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
> > > >
> > > > When we will have multiple functions attached to trampoline
> > > > we need to propagate the function's address to the bpf program.
> > > >
> > > > Adding new BPF_TRAMP_F_IP_ARG flag to arch_prepare_bpf_trampoline
> > > > function that will store origin caller's address before function's
> > > > arguments.
> > > >
> > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > ---
> > > >  arch/x86/net/bpf_jit_comp.c | 18 ++++++++++++++----
> > > >  include/linux/bpf.h         |  5 +++++
> > > >  2 files changed, 19 insertions(+), 4 deletions(-)
> > > >
> > > > diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> > > > index b77e6bd78354..d2425c18272a 100644
> > > > --- a/arch/x86/net/bpf_jit_comp.c
> > > > +++ b/arch/x86/net/bpf_jit_comp.c
> > > > @@ -1951,7 +1951,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > > >                                 void *orig_call)
> > > >  {
> > > >         int ret, i, cnt = 0, nr_args = m->nr_args;
> > > > -       int stack_size = nr_args * 8;
> > > > +       int stack_size = nr_args * 8, ip_arg = 0;
> > > >         struct bpf_tramp_progs *fentry = &tprogs[BPF_TRAMP_FENTRY];
> > > >         struct bpf_tramp_progs *fexit = &tprogs[BPF_TRAMP_FEXIT];
> > > >         struct bpf_tramp_progs *fmod_ret = &tprogs[BPF_TRAMP_MODIFY_RETURN];
> > > > @@ -1975,6 +1975,9 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > > >                  */
> > > >                 orig_call += X86_PATCH_SIZE;
> > > >
> > > > +       if (flags & BPF_TRAMP_F_IP_ARG)
> > > > +               stack_size += 8;
> > > > +
> > >
> > > nit: move it a bit up where we adjust stack_size for BPF_TRAMP_F_CALL_ORIG flag?
> >
> > ok
> >
> > >
> > > >         prog = image;
> > > >
> > > >         EMIT1(0x55);             /* push rbp */
> > > > @@ -1982,7 +1985,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > > >         EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */
> > > >         EMIT1(0x53);             /* push rbx */
> > > >
> > > > -       save_regs(m, &prog, nr_args, stack_size);
> > > > +       if (flags & BPF_TRAMP_F_IP_ARG) {
> > > > +               emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> > > > +               EMIT4(0x48, 0x83, 0xe8, X86_PATCH_SIZE); /* sub $X86_PATCH_SIZE,%rax*/
> > > > +               emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -stack_size);
> > > > +               ip_arg = 8;
> > > > +       }
> > >
> > > why not pass flags into save_regs and let it handle this case without
> > > this extra ip_arg adjustment?
> > >
> > > > +
> > > > +       save_regs(m, &prog, nr_args, stack_size - ip_arg);
> > > >
> > > >         if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > > >                 /* arg1: mov rdi, im */
> > > > @@ -2011,7 +2021,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > > >         }
> > > >
> > > >         if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > > > -               restore_regs(m, &prog, nr_args, stack_size);
> > > > +               restore_regs(m, &prog, nr_args, stack_size - ip_arg);
> > > >
> > >
> > > similarly (and symmetrically), pass flags into restore_regs() to
> > > handle that ip_arg transparently?
> >
> > so you mean something like:
> >
> >         if (flags & BPF_TRAMP_F_IP_ARG)
> >                 stack_size -= 8;
> >
> > in both save_regs and restore_regs function, right?
> 
> yes, but for save_regs it will do more (emit_ldx and stuff)

so the whole stuff then, ok

jirka


^ permalink raw reply	[flat|nested] 76+ messages in thread

* Re: [PATCH 13/19] bpf: Add support to link multi func tracing program
  2021-06-08 21:07         ` Jiri Olsa
@ 2021-06-08 23:05           ` Alexei Starovoitov
  2021-06-09  5:08             ` Andrii Nakryiko
  2021-06-09 13:33             ` Jiri Olsa
  0 siblings, 2 replies; 76+ messages in thread
From: Alexei Starovoitov @ 2021-06-08 23:05 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Network Development, bpf, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 8, 2021 at 2:07 PM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Tue, Jun 08, 2021 at 11:49:03AM -0700, Alexei Starovoitov wrote:
> > On Tue, Jun 08, 2021 at 08:17:00PM +0200, Jiri Olsa wrote:
> > > On Tue, Jun 08, 2021 at 08:42:32AM -0700, Alexei Starovoitov wrote:
> > > > On Sat, Jun 5, 2021 at 4:11 AM Jiri Olsa <jolsa@kernel.org> wrote:
> > > > >
> > > > > Adding support to attach multiple functions to tracing program
> > > > > by using the link_create/link_update interface.
> > > > >
> > > > > Adding multi_btf_ids/multi_btf_ids_cnt pair to link_create struct
> > > > > API, that define array of functions btf ids that will be attached
> > > > > to prog_fd.
> > > > >
> > > > > The prog_fd needs to be multi prog tracing program (BPF_F_MULTI_FUNC).
> > > > >
> > > > > The new link_create interface creates new BPF_LINK_TYPE_TRACING_MULTI
> > > > > link type, which creates separate bpf_trampoline and registers it
> > > > > as direct function for all specified btf ids.
> > > > >
> > > > > The new bpf_trampoline is out of scope (bpf_trampoline_lookup) of
> > > > > standard trampolines, so all registered functions need to be free
> > > > > of direct functions, otherwise the link fails.
> > > >
> > > > Overall the api makes sense to me.
> > > > The restriction of multi vs non-multi is too severe though.
> > > > The multi trampoline can serve normal fentry/fexit too.
> > >
> > > so the multi trampoline gets called from all the registered functions,
> > > so there would need to be a filter for the specific ip before calling the
> > > standard program.. a single cmp/jnz might not be that bad, I'll check
> >
> > You mean reusing the same multi trampoline for all IPs and regenerating
> > it with a bunch of cmp/jnz checks? There should be a better way to scale.
> > Maybe clone multi trampoline instead?
> > IPs[1-10] will point to multi.
> > IP[11] will point to a clone of multi that serves multi prog and
> > fentry/fexit progs specific for that IP.
>
> ok, so we'd clone the multi trampoline if there's a request to attach
> a standard trampoline to some IP from the multi trampoline
>
> .. and transform the currently attached standard trampoline for an IP
> into a clone of the multi trampoline, if there's a request to create
> a multi trampoline that covers that IP

yep. For every IP==btf_id there will be only two possible trampolines.
Should be easy enough to track and transition between them.
The standard fentry/fexit will only get negligible slowdown from
going through multi.
multi+fexit and fmod_ret need to be thought through as well.
That's why I thought that 'ip' at the end should simplify things.
Only multi will have access to it.
But we can store it first too. fentry/fexit will see ctx=r1 with +8 offset
and will have normal args in ctx. Like ip isn't even there.
While multi trampoline is always doing ip, arg1,arg2, .., arg6
and passes ctx = &ip into multi prog and ctx = &arg1 into fentry/fexit.
'ret' for fexit is problematic though. hmm.
Maybe such clone multi trampoline for specific ip with 2 args will do:
ip, arg1, arg2, ret, 0, 0, 0, ret.
Then multi will have 6 args, though 3rd is actually ret.
Then fexit will have ret in the right place and multi prog will have
it as 7th arg.
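
To spell that layout out for a 2-arg function (a sketch of the scheme
above, not actual code):

  /*
   * slot 0: ip    <- multi prog's ctx points here
   * slot 1: arg1  <- plain fentry/fexit progs get ctx = &arg1
   * slot 2: arg2
   * slot 3: ret   <- where a 2-arg fexit prog expects 'ret' (its ctx[2])
   * slot 4: 0
   * slot 5: 0
   * slot 6: 0
   * slot 7: ret   <- seen by the multi prog as its 7th argument
   */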


* Re: [PATCH 13/19] bpf: Add support to link multi func tracing program
  2021-06-08 23:05           ` Alexei Starovoitov
@ 2021-06-09  5:08             ` Andrii Nakryiko
  2021-06-09 13:42               ` Jiri Olsa
  2021-06-09 13:33             ` Jiri Olsa
  1 sibling, 1 reply; 76+ messages in thread
From: Andrii Nakryiko @ 2021-06-09  5:08 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Steven Rostedt (VMware),
	Network Development, bpf, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 8, 2021 at 4:07 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Jun 8, 2021 at 2:07 PM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Tue, Jun 08, 2021 at 11:49:03AM -0700, Alexei Starovoitov wrote:
> > > On Tue, Jun 08, 2021 at 08:17:00PM +0200, Jiri Olsa wrote:
> > > > On Tue, Jun 08, 2021 at 08:42:32AM -0700, Alexei Starovoitov wrote:
> > > > > On Sat, Jun 5, 2021 at 4:11 AM Jiri Olsa <jolsa@kernel.org> wrote:
> > > > > >
> > > > > > Adding support to attach multiple functions to tracing program
> > > > > > by using the link_create/link_update interface.
> > > > > >
> > > > > > Adding multi_btf_ids/multi_btf_ids_cnt pair to link_create struct
> > > > > > API, that define array of functions btf ids that will be attached
> > > > > > to prog_fd.
> > > > > >
> > > > > > The prog_fd needs to be multi prog tracing program (BPF_F_MULTI_FUNC).
> > > > > >
> > > > > > The new link_create interface creates new BPF_LINK_TYPE_TRACING_MULTI
> > > > > > link type, which creates separate bpf_trampoline and registers it
> > > > > > as direct function for all specified btf ids.
> > > > > >
> > > > > > The new bpf_trampoline is out of scope (bpf_trampoline_lookup) of
> > > > > > standard trampolines, so all registered functions need to be free
> > > > > > of direct functions, otherwise the link fails.
> > > > >
> > > > > Overall the api makes sense to me.
> > > > > The restriction of multi vs non-multi is too severe though.
> > > > > The multi trampoline can serve normal fentry/fexit too.
> > > >
> > > > so multi trampoline gets called from all the registered functions,
> > > > so there would need to be filter for specific ip before calling the
> > > > standard program.. single cmp/jnz might not be that bad, I'll check
> > >
> > > You mean reusing the same multi trampoline for all IPs and regenerating
> > > it with a bunch of cmp/jnz checks? There should be a better way to scale.
> > > Maybe clone multi trampoline instead?
> > > IPs[1-10] will point to multi.
> > > IP[11] will point to a clone of multi that serves multi prog and
> > > fentry/fexit progs specific for that IP.
> >
> > ok, so we'd clone multi trampoline if there's request to attach
> > standard trampoline to some IP from multi trampoline
> >
> > .. and transform currently attached standard trampoline for IP
> > into clone of multi trampoline, if there's request to create
> > multi trampoline that covers that IP
>
> yep. For every IP==btf_id there will be only two possible trampolines.
> Should be easy enough to track and transition between them.
> The standard fentry/fexit will only get negligible slowdown from
> going through multi.
> multi+fexit and fmod_ret need to be thought through as well.
> That's why I thought that 'ip' at the end should simplify things.

Putting ip at the end has downsides. We might support >6 arguments
eventually, at which point it will be super weird to have 6 args, ip,
then the rest of the arguments?..

Would it be too bad to put IP at -8 offset relative to ctx? That will
also work for normal fentry/fexit, for which it's useful to have ip
passed in as well, IMO. So no special casing for multi/non-multi, and
it's backwards compatible.

Ideally, I'd love it to be actually retrievable through a new BPF
helper, something like bpf_caller_ip(ctx), but I'm not sure if we can
implement this sanely, so I don't hold high hopes.
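
If the -8 layout were in place, a program could in principle read it
even without a helper, assuming the verifier were taught to accept the
access (sketch only):

  SEC("fentry/bpf_fentry_test1")
  int BPF_PROG(test, int a)
  {
          /* hypothetical: trampoline stored caller ip one slot below ctx */
          __u64 ip = *((__u64 *)ctx - 1);

          bpf_printk("entered via %llx", ip);
          return 0;
  }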

> Only multi will have access to it.
> But we can store it first too. fentry/fexit will see ctx=r1 with +8 offset
> and will have normal args in ctx. Like ip isn't even there.
> While multi trampoline is always doing ip, arg1,arg2, .., arg6
> and passes ctx = &ip into multi prog and ctx = &arg1 into fentry/fexit.
> 'ret' for fexit is problematic though. hmm.
> Maybe such clone multi trampoline for specific ip with 2 args will do:
> ip, arg1, arg2, ret, 0, 0, 0, ret.
> Then multi will have 6 args, though 3rd is actually ret.
> Then fexit will have ret in the right place and multi prog will have
> it as 7th arg.


* Re: [PATCH 13/19] bpf: Add support to link multi func tracing program
  2021-06-05 11:10 ` [PATCH 13/19] bpf: Add support to link multi func tracing program Jiri Olsa
  2021-06-07  5:36   ` Yonghong Song
  2021-06-08 15:42   ` Alexei Starovoitov
@ 2021-06-09  5:18   ` Andrii Nakryiko
  2021-06-09 13:53     ` Jiri Olsa
  2 siblings, 1 reply; 76+ messages in thread
From: Andrii Nakryiko @ 2021-06-09  5:18 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Adding support to attach multiple functions to tracing program
> by using the link_create/link_update interface.
>
> Adding multi_btf_ids/multi_btf_ids_cnt pair to link_create struct
> API, that define array of functions btf ids that will be attached
> to prog_fd.
>
> The prog_fd needs to be multi prog tracing program (BPF_F_MULTI_FUNC).

So I'm not sure why we added a new load flag instead of just using a
new BPF program type or expected attach type? We have different
trampolines and different kinds of links for them, so why not be
consistent and use a new type of BPF program?.. It does change the
BPF verifier's treatment of input arguments, so it's not just a slight
variation, it's quite a different type of program.

>
> The new link_create interface creates new BPF_LINK_TYPE_TRACING_MULTI
> link type, which creates separate bpf_trampoline and registers it
> as direct function for all specified btf ids.
>
> The new bpf_trampoline is out of scope (bpf_trampoline_lookup) of
> standard trampolines, so all registered functions need to be free
> of direct functions, otherwise the link fails.
>
> The new bpf_trampoline will store and pass to bpf program the highest
> number of arguments from all given functions.
>
> New programs (fentry or fexit) can be added to the existing trampoline
> through the link_update interface via new_prog_fd descriptor.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  include/linux/bpf.h            |   3 +
>  include/uapi/linux/bpf.h       |   5 +
>  kernel/bpf/syscall.c           | 185 ++++++++++++++++++++++++++++++++-
>  kernel/bpf/trampoline.c        |  53 +++++++---
>  tools/include/uapi/linux/bpf.h |   5 +
>  5 files changed, 237 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 23221e0e8d3c..99a81c6c22e6 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -661,6 +661,7 @@ struct bpf_trampoline {
>         struct bpf_tramp_image *cur_image;
>         u64 selector;
>         struct module *mod;
> +       bool multi;
>  };
>
>  struct bpf_attach_target_info {
> @@ -746,6 +747,8 @@ void bpf_ksym_add(struct bpf_ksym *ksym);
>  void bpf_ksym_del(struct bpf_ksym *ksym);
>  int bpf_jit_charge_modmem(u32 pages);
>  void bpf_jit_uncharge_modmem(u32 pages);
> +struct bpf_trampoline *bpf_trampoline_multi_alloc(void);
> +void bpf_trampoline_multi_free(struct bpf_trampoline *tr);
>  #else
>  static inline int bpf_trampoline_link_prog(struct bpf_prog *prog,
>                                            struct bpf_trampoline *tr)
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index ad9340fb14d4..5fd6ff64e8dc 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -1007,6 +1007,7 @@ enum bpf_link_type {
>         BPF_LINK_TYPE_ITER = 4,
>         BPF_LINK_TYPE_NETNS = 5,
>         BPF_LINK_TYPE_XDP = 6,
> +       BPF_LINK_TYPE_TRACING_MULTI = 7,
>
>         MAX_BPF_LINK_TYPE,
>  };
> @@ -1454,6 +1455,10 @@ union bpf_attr {
>                                 __aligned_u64   iter_info;      /* extra bpf_iter_link_info */
>                                 __u32           iter_info_len;  /* iter_info length */
>                         };
> +                       struct {
> +                               __aligned_u64   multi_btf_ids;          /* addresses to attach */
> +                               __u32           multi_btf_ids_cnt;      /* addresses count */
> +                       };

let's do what the bpf_link-based TC-BPF API is doing and put it into a
named field (I'd do the same for iter_info/iter_info_len above as well;
I'm not sure why we did this flat naming scheme, we now know it's
inconvenient when extending stuff).

struct {
    __aligned_u64 btf_ids;
    __u32 btf_ids_cnt;
} multi;

>                 };
>         } link_create;
>

[...]

> +static int bpf_tracing_multi_link_update(struct bpf_link *link,
> +                                        struct bpf_prog *new_prog,
> +                                        struct bpf_prog *old_prog __maybe_unused)
> +{

The BPF_LINK_UPDATE command supports passing old_fd and extra flags. We
can use that to implement both updating an existing BPF program in place
(by passing BPF_F_REPLACE and old_fd) and adding the program to the
list of programs, if old_fd == 0. WDYT?
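
Usage could then look like this (a sketch, assuming the semantics above;
bpf_link_update_opts already has the flags/old_prog_fd fields):

  /* replace old_prog_fd with new_prog_fd in place */
  DECLARE_LIBBPF_OPTS(bpf_link_update_opts, opts,
                      .flags = BPF_F_REPLACE,
                      .old_prog_fd = old_prog_fd);
  err = bpf_link_update(link_fd, new_prog_fd, &opts);

  /* old_fd == 0: add new_prog_fd to the link's list of programs */
  err = bpf_link_update(link_fd, new_prog_fd, NULL);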

> +       struct bpf_tracing_multi_link *tr_link =
> +               container_of(link, struct bpf_tracing_multi_link, link);
> +       int err;
> +
> +       if (check_multi_prog_type(new_prog))
> +               return -EINVAL;
> +
> +       err = bpf_trampoline_link_prog(new_prog, tr_link->tr);
> +       if (err)
> +               return err;
> +
> +       err = modify_ftrace_direct_multi(&tr_link->ops,
> +                                        (unsigned long) tr_link->tr->cur_image->image);
> +       return WARN_ON(err);
> +}
> +

[...]


* Re: [PATCH 14/19] libbpf: Add btf__find_by_pattern_kind function
  2021-06-05 11:10 ` [PATCH 14/19] libbpf: Add btf__find_by_pattern_kind function Jiri Olsa
@ 2021-06-09  5:29   ` Andrii Nakryiko
  2021-06-09 13:59     ` Jiri Olsa
  0 siblings, 1 reply; 76+ messages in thread
From: Andrii Nakryiko @ 2021-06-09  5:29 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Sat, Jun 5, 2021 at 4:14 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Adding btf__find_by_pattern_kind function that returns
> array of BTF ids for given function name pattern.
>
> Using libc's regex.h support for that.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/lib/bpf/btf.c | 68 +++++++++++++++++++++++++++++++++++++++++++++
>  tools/lib/bpf/btf.h |  3 ++
>  2 files changed, 71 insertions(+)
>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index b46760b93bb4..421dd6c1e44a 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -1,6 +1,7 @@
>  // SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
>  /* Copyright (c) 2018 Facebook */
>
> +#define _GNU_SOURCE
>  #include <byteswap.h>
>  #include <endian.h>
>  #include <stdio.h>
> @@ -16,6 +17,7 @@
>  #include <linux/err.h>
>  #include <linux/btf.h>
>  #include <gelf.h>
> +#include <regex.h>
>  #include "btf.h"
>  #include "bpf.h"
>  #include "libbpf.h"
> @@ -711,6 +713,72 @@ __s32 btf__find_by_name_kind(const struct btf *btf, const char *type_name,
>         return libbpf_err(-ENOENT);
>  }
>
> +static bool is_wildcard(char c)
> +{
> +       static const char *wildchars = "*?[|";
> +
> +       return strchr(wildchars, c);
> +}
> +
> +int btf__find_by_pattern_kind(const struct btf *btf,
> +                             const char *type_pattern, __u32 kind,
> +                             __s32 **__ids)
> +{
> +       __u32 i, nr_types = btf__get_nr_types(btf);
> +       __s32 *ids = NULL;
> +       int cnt = 0, alloc = 0, ret;
> +       regex_t regex;
> +       char *pattern;
> +
> +       if (kind == BTF_KIND_UNKN || !strcmp(type_pattern, "void"))
> +               return 0;
> +
> +       /* When the pattern does not start with wildcard, treat it as
> +        * if we'd want to match it from the beginning of the string.
> +        */

This assumption is absolutely atrocious. If we say it's a regexp, then
it has to always be a regexp, not something based on some random
heuristic keyed off the first character.

Taking a step back, though. Do we really need to provide this API? Why
can't applications implement it on their own, given that regexp
functionality is provided by libc? Which I didn't know, actually, so
that's pretty nice, assuming that it's also available in more minimal
implementations like musl.
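
fwiw, the app-side version is short (sketch using only existing BTF
accessors; error handling and array growth elided):

  regex_t re;
  __u32 ids[256], cnt = 0;
  __u32 i, n = btf__get_nr_types(btf);

  regcomp(&re, "^bpf_fentry_test", REG_EXTENDED | REG_NOSUB);
  for (i = 1; i <= n; i++) {
          const struct btf_type *t = btf__type_by_id(btf, i);
          const char *name;

          if (btf_kind(t) != BTF_KIND_FUNC)
                  continue;
          name = btf__name_by_offset(btf, t->name_off);
          if (name && !regexec(&re, name, 0, NULL, 0))
                  ids[cnt++] = i;   /* caller-managed array */
  }
  regfree(&re);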

> +       asprintf(&pattern, "%s%s",
> +                is_wildcard(type_pattern[0]) ? "^" : "",
> +                type_pattern);
> +
> +       ret = regcomp(&regex, pattern, REG_EXTENDED);
> +       if (ret) {
> +               pr_warn("failed to compile regex\n");
> +               free(pattern);
> +               return -EINVAL;
> +       }
> +
> +       free(pattern);
> +
> +       for (i = 1; i <= nr_types; i++) {
> +               const struct btf_type *t = btf__type_by_id(btf, i);
> +               const char *name;
> +               __s32 *p;
> +
> +               if (btf_kind(t) != kind)
> +                       continue;
> +               name = btf__name_by_offset(btf, t->name_off);
> +               if (name && regexec(&regex, name, 0, NULL, 0))
> +                       continue;
> +               if (cnt == alloc) {
> +                       alloc = max(100, alloc * 3 / 2);
> +                       p = realloc(ids, alloc * sizeof(__u32));

this memory allocation and re-allocation on behalf of users is another
argument against this API

> +                       if (!p) {
> +                               free(ids);
> +                               regfree(&regex);
> +                               return -ENOMEM;
> +                       }
> +                       ids = p;
> +               }
> +
> +               ids[cnt] = i;
> +               cnt++;
> +       }
> +
> +       regfree(&regex);
> +       *__ids = ids;
> +       return cnt ?: -ENOENT;
> +}
> +
>  static bool btf_is_modifiable(const struct btf *btf)
>  {
>         return (void *)btf->hdr != btf->raw_data;
> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> index b54f1c3ebd57..036857aded94 100644
> --- a/tools/lib/bpf/btf.h
> +++ b/tools/lib/bpf/btf.h
> @@ -371,6 +371,9 @@ btf_var_secinfos(const struct btf_type *t)
>         return (struct btf_var_secinfo *)(t + 1);
>  }
>
> +int btf__find_by_pattern_kind(const struct btf *btf,
> +                             const char *type_pattern, __u32 kind,
> +                             __s32 **__ids);
>  #ifdef __cplusplus
>  } /* extern "C" */
>  #endif
> --
> 2.31.1
>


* Re: [PATCH 15/19] libbpf: Add support to link multi func tracing program
  2021-06-05 11:10 ` [PATCH 15/19] libbpf: Add support to link multi func tracing program Jiri Olsa
  2021-06-07  5:49   ` Yonghong Song
@ 2021-06-09  5:34   ` Andrii Nakryiko
  2021-06-09 14:17     ` Jiri Olsa
  1 sibling, 1 reply; 76+ messages in thread
From: Andrii Nakryiko @ 2021-06-09  5:34 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Adding support to link multi func tracing program
> through link_create interface.
>
> Adding special types for multi func programs:
>
>   fentry.multi
>   fexit.multi
>
> so you can define multi func programs like:
>
>   SEC("fentry.multi/bpf_fentry_test*")
>   int BPF_PROG(test1, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f)
>
> that defines test1 to be attached to bpf_fentry_test* functions,
> and able to attach ip and 6 arguments.
>
> If functions are not specified the program needs to be attached
> manually.
>
> Adding new btf id related fields to bpf_link_create_opts and
> bpf_link_create to use them.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/lib/bpf/bpf.c    | 11 ++++++-
>  tools/lib/bpf/bpf.h    |  4 ++-
>  tools/lib/bpf/libbpf.c | 72 ++++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 85 insertions(+), 2 deletions(-)
>
> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index 86dcac44f32f..da892737b522 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c
> @@ -674,7 +674,8 @@ int bpf_link_create(int prog_fd, int target_fd,
>                     enum bpf_attach_type attach_type,
>                     const struct bpf_link_create_opts *opts)
>  {
> -       __u32 target_btf_id, iter_info_len;
> +       __u32 target_btf_id, iter_info_len, multi_btf_ids_cnt;
> +       __s32 *multi_btf_ids;
>         union bpf_attr attr;
>         int fd;
>
> @@ -687,6 +688,9 @@ int bpf_link_create(int prog_fd, int target_fd,
>         if (iter_info_len && target_btf_id)

here we check that mutually exclusive options are not specified; we
should do the same for the multi stuff

>                 return libbpf_err(-EINVAL);
>
> +       multi_btf_ids = OPTS_GET(opts, multi_btf_ids, 0);
> +       multi_btf_ids_cnt = OPTS_GET(opts, multi_btf_ids_cnt, 0);
> +
>         memset(&attr, 0, sizeof(attr));
>         attr.link_create.prog_fd = prog_fd;
>         attr.link_create.target_fd = target_fd;
> @@ -701,6 +705,11 @@ int bpf_link_create(int prog_fd, int target_fd,
>                 attr.link_create.target_btf_id = target_btf_id;
>         }
>
> +       if (multi_btf_ids && multi_btf_ids_cnt) {
> +               attr.link_create.multi_btf_ids = (__u64) multi_btf_ids;
> +               attr.link_create.multi_btf_ids_cnt = multi_btf_ids_cnt;
> +       }
> +
>         fd = sys_bpf(BPF_LINK_CREATE, &attr, sizeof(attr));
>         return libbpf_err_errno(fd);
>  }
> diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
> index 4f758f8f50cd..2f78b6c34765 100644
> --- a/tools/lib/bpf/bpf.h
> +++ b/tools/lib/bpf/bpf.h
> @@ -177,8 +177,10 @@ struct bpf_link_create_opts {
>         union bpf_iter_link_info *iter_info;
>         __u32 iter_info_len;
>         __u32 target_btf_id;
> +       __s32 *multi_btf_ids;

why are the ids __s32?..

> +       __u32 multi_btf_ids_cnt;
>  };
> -#define bpf_link_create_opts__last_field target_btf_id
> +#define bpf_link_create_opts__last_field multi_btf_ids_cnt
>
>  LIBBPF_API int bpf_link_create(int prog_fd, int target_fd,
>                                enum bpf_attach_type attach_type,
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 65f87cc1220c..bd31de3b6a85 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -228,6 +228,7 @@ struct bpf_sec_def {
>         bool is_attachable;
>         bool is_attach_btf;
>         bool is_sleepable;
> +       bool is_multi_func;
>         attach_fn_t attach_fn;
>  };
>
> @@ -7609,6 +7610,8 @@ __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz,
>
>                 if (prog->sec_def->is_sleepable)
>                         prog->prog_flags |= BPF_F_SLEEPABLE;
> +               if (prog->sec_def->is_multi_func)
> +                       prog->prog_flags |= BPF_F_MULTI_FUNC;
>                 bpf_program__set_type(prog, prog->sec_def->prog_type);
>                 bpf_program__set_expected_attach_type(prog,
>                                 prog->sec_def->expected_attach_type);
> @@ -9070,6 +9073,8 @@ static struct bpf_link *attach_raw_tp(const struct bpf_sec_def *sec,
>                                       struct bpf_program *prog);
>  static struct bpf_link *attach_trace(const struct bpf_sec_def *sec,
>                                      struct bpf_program *prog);
> +static struct bpf_link *attach_trace_multi(const struct bpf_sec_def *sec,
> +                                          struct bpf_program *prog);
>  static struct bpf_link *attach_lsm(const struct bpf_sec_def *sec,
>                                    struct bpf_program *prog);
>  static struct bpf_link *attach_iter(const struct bpf_sec_def *sec,
> @@ -9143,6 +9148,14 @@ static const struct bpf_sec_def section_defs[] = {
>                 .attach_fn = attach_iter),
>         SEC_DEF("syscall", SYSCALL,
>                 .is_sleepable = true),
> +       SEC_DEF("fentry.multi/", TRACING,
> +               .expected_attach_type = BPF_TRACE_FENTRY,

BPF_TRACE_MULTI_FENTRY instead of is_multi stuff everywhere?.. Or a
new type of BPF program altogether?

> +               .is_multi_func = true,
> +               .attach_fn = attach_trace_multi),
> +       SEC_DEF("fexit.multi/", TRACING,
> +               .expected_attach_type = BPF_TRACE_FEXIT,
> +               .is_multi_func = true,
> +               .attach_fn = attach_trace_multi),
>         BPF_EAPROG_SEC("xdp_devmap/",           BPF_PROG_TYPE_XDP,
>                                                 BPF_XDP_DEVMAP),
>         BPF_EAPROG_SEC("xdp_cpumap/",           BPF_PROG_TYPE_XDP,
> @@ -9584,6 +9597,9 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
>         if (!name)
>                 return -EINVAL;
>
> +       if (prog->prog_flags & BPF_F_MULTI_FUNC)
> +               return 0;
> +
>         for (i = 0; i < ARRAY_SIZE(section_defs); i++) {
>                 if (!section_defs[i].is_attach_btf)
>                         continue;
> @@ -10537,6 +10553,62 @@ static struct bpf_link *bpf_program__attach_btf_id(struct bpf_program *prog)
>         return (struct bpf_link *)link;
>  }
>
> +static struct bpf_link *bpf_program__attach_multi(struct bpf_program *prog)
> +{
> +       char *pattern = prog->sec_name + prog->sec_def->len;
> +       DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
> +       enum bpf_attach_type attach_type;
> +       int prog_fd, link_fd, cnt, err;
> +       struct bpf_link *link = NULL;
> +       __s32 *ids = NULL;
> +
> +       prog_fd = bpf_program__fd(prog);
> +       if (prog_fd < 0) {
> +               pr_warn("prog '%s': can't attach before loaded\n", prog->name);
> +               return ERR_PTR(-EINVAL);
> +       }
> +
> +       err = bpf_object__load_vmlinux_btf(prog->obj, true);
> +       if (err)
> +               return ERR_PTR(err);
> +
> +       cnt = btf__find_by_pattern_kind(prog->obj->btf_vmlinux, pattern,
> +                                       BTF_KIND_FUNC, &ids);

I wonder if it would be better to just support simplified glob
patterns like "prefix*", "*suffix", "exactmatch", and "*substring*"?
That should be sufficient for the majority of cases. For the cases where
a user needs something more nuanced, they can just construct the BTF ID
list with custom code and do a manual attach.
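
Something like this would cover those four forms (self-contained
sketch, not a proposed libbpf API):

  #include <stdbool.h>
  #include <string.h>

  /* supports "exactmatch", "prefix*", "*suffix" and "*substring*" only */
  static bool glob_match(const char *name, const char *pat)
  {
          size_t plen = strlen(pat), nlen = strlen(name);
          bool head = plen > 0 && pat[0] == '*';
          bool tail = plen > 1 && pat[plen - 1] == '*';

          if (head && tail) {                     /* *substring* */
                  char mid[128];

                  if (plen - 2 >= sizeof(mid))
                          return false;
                  memcpy(mid, pat + 1, plen - 2);
                  mid[plen - 2] = '\0';
                  return strstr(name, mid) != NULL;
          }
          if (tail)                               /* prefix* */
                  return strncmp(name, pat, plen - 1) == 0;
          if (head)                               /* *suffix */
                  return nlen >= plen - 1 &&
                         strcmp(name + nlen - (plen - 1), pat + 1) == 0;
          return strcmp(name, pat) == 0;          /* exact match */
  }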

> +       if (cnt <= 0)
> +               return ERR_PTR(-EINVAL);
> +
> +       link = calloc(1, sizeof(*link));
> +       if (!link) {
> +               err = -ENOMEM;
> +               goto out_err;
> +       }
> +       link->detach = &bpf_link__detach_fd;
> +
> +       opts.multi_btf_ids = ids;
> +       opts.multi_btf_ids_cnt = cnt;
> +
> +       attach_type = bpf_program__get_expected_attach_type(prog);
> +       link_fd = bpf_link_create(prog_fd, 0, attach_type, &opts);
> +       if (link_fd < 0) {
> +               err = -errno;
> +               goto out_err;
> +       }
> +       link->fd = link_fd;
> +       free(ids);
> +       return link;
> +
> +out_err:
> +       free(link);
> +       free(ids);
> +       return ERR_PTR(err);
> +}
> +
> +static struct bpf_link *attach_trace_multi(const struct bpf_sec_def *sec,
> +                                          struct bpf_program *prog)
> +{
> +       return bpf_program__attach_multi(prog);
> +}
> +
>  struct bpf_link *bpf_program__attach_trace(struct bpf_program *prog)
>  {
>         return bpf_program__attach_btf_id(prog);
> --
> 2.31.1
>


* Re: [PATCH 16/19] selftests/bpf: Add fentry multi func test
  2021-06-05 11:10 ` [PATCH 16/19] selftests/bpf: Add fentry multi func test Jiri Olsa
  2021-06-07  6:06   ` Yonghong Song
@ 2021-06-09  5:40   ` Andrii Nakryiko
  2021-06-09 14:29     ` Jiri Olsa
  1 sibling, 1 reply; 76+ messages in thread
From: Andrii Nakryiko @ 2021-06-09  5:40 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Adding selftest for fentry multi func test that attaches
> to bpf_fentry_test* functions and checks argument values
> based on the processed function.
>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  tools/testing/selftests/bpf/multi_check.h     | 52 +++++++++++++++++++
>  .../bpf/prog_tests/fentry_multi_test.c        | 43 +++++++++++++++
>  .../selftests/bpf/progs/fentry_multi_test.c   | 18 +++++++
>  3 files changed, 113 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/multi_check.h
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/fentry_multi_test.c
>  create mode 100644 tools/testing/selftests/bpf/progs/fentry_multi_test.c
>
> diff --git a/tools/testing/selftests/bpf/multi_check.h b/tools/testing/selftests/bpf/multi_check.h
> new file mode 100644
> index 000000000000..36c2a93f9be3
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/multi_check.h

we have proper static linking now so we don't have to use header
inclusion hacks; let's do this properly?

> @@ -0,0 +1,52 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +
> +#ifndef __MULTI_CHECK_H
> +#define __MULTI_CHECK_H
> +
> +extern unsigned long long bpf_fentry_test[8];
> +
> +static __attribute__((unused)) inline
> +void multi_arg_check(unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, __u64 *test_result)
> +{
> +       if (ip == bpf_fentry_test[0]) {
> +               *test_result += (int) a == 1;
> +       } else if (ip == bpf_fentry_test[1]) {
> +               *test_result += (int) a == 2 && (__u64) b == 3;
> +       } else if (ip == bpf_fentry_test[2]) {
> +               *test_result += (char) a == 4 && (int) b == 5 && (__u64) c == 6;
> +       } else if (ip == bpf_fentry_test[3]) {
> +               *test_result += (void *) a == (void *) 7 && (char) b == 8 && (int) c == 9 && (__u64) d == 10;
> +       } else if (ip == bpf_fentry_test[4]) {
> +               *test_result += (__u64) a == 11 && (void *) b == (void *) 12 && (short) c == 13 && (int) d == 14 && (__u64) e == 15;
> +       } else if (ip == bpf_fentry_test[5]) {
> +               *test_result += (__u64) a == 16 && (void *) b == (void *) 17 && (short) c == 18 && (int) d == 19 && (void *) e == (void *) 20 && (__u64) f == 21;
> +       } else if (ip == bpf_fentry_test[6]) {
> +               *test_result += 1;
> +       } else if (ip == bpf_fentry_test[7]) {
> +               *test_result += 1;
> +       }

why not use switch? and why the casting?

> +}
> +

[...]

> diff --git a/tools/testing/selftests/bpf/progs/fentry_multi_test.c b/tools/testing/selftests/bpf/progs/fentry_multi_test.c
> new file mode 100644
> index 000000000000..a443fc958e5a
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/fentry_multi_test.c
> @@ -0,0 +1,18 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +#include "multi_check.h"
> +
> +char _license[] SEC("license") = "GPL";
> +
> +unsigned long long bpf_fentry_test[8];
> +
> +__u64 test_result = 0;
> +
> +SEC("fentry.multi/bpf_fentry_test*")

wait, that's a regexp syntax that libc supports?.. Not .*? We should
definitely not provide the btf__find_by_pattern_kind() API; I'd like to
avoid explaining which flavors of regexps libbpf supports.

> +int BPF_PROG(test, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f)
> +{
> +       multi_arg_check(ip, a, b, c, d, e, f, &test_result);
> +       return 0;
> +}
> --
> 2.31.1
>


* Re: [PATCH 18/19] selftests/bpf: Add fentry/fexit multi func test
  2021-06-05 11:10 ` [PATCH 18/19] selftests/bpf: Add fentry/fexit " Jiri Olsa
@ 2021-06-09  5:41   ` Andrii Nakryiko
  2021-06-09 14:29     ` Jiri Olsa
  0 siblings, 1 reply; 76+ messages in thread
From: Andrii Nakryiko @ 2021-06-09  5:41 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Sat, Jun 5, 2021 at 4:13 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> Adding selftest for fentry/fexit multi func test that attaches
> to bpf_fentry_test* functions and checks argument values based
> on the processed function.
>
> When multi_arg_check is used from 2 different places I'm getting a
> compilation failure, which I have not deciphered yet:
>
>   $ CLANG=/opt/clang/bin/clang LLC=/opt/clang/bin/llc make
>     CLNG-BPF [test_maps] fentry_fexit_multi_test.o
>   progs/fentry_fexit_multi_test.c:18:2: error: too many args to t24: i64 = \
>   GlobalAddress<void (i64, i64, i64, i64, i64, i64, i64, i64*)* @multi_arg_check> 0, \
>   progs/fentry_fexit_multi_test.c:18:2 @[ progs/fentry_fexit_multi_test.c:16:5 ]
>           multi_arg_check(ip, a, b, c, d, e, f, &test1_arg_result);
>           ^
>   progs/fentry_fexit_multi_test.c:25:2: error: too many args to t32: i64 = \
>   GlobalAddress<void (i64, i64, i64, i64, i64, i64, i64, i64*)* @multi_arg_check> 0, \
>   progs/fentry_fexit_multi_test.c:25:2 @[ progs/fentry_fexit_multi_test.c:23:5 ]
>           multi_arg_check(ip, a, b, c, d, e, f, &test2_arg_result);
>           ^
>   In file included from progs/fentry_fexit_multi_test.c:5:
>   /home/jolsa/linux-qemu/tools/testing/selftests/bpf/multi_check.h:9:6: error: defined with too many args
>   void multi_arg_check(unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, __u64 *test_result)
>        ^
>   /home/jolsa/linux-qemu/tools/testing/selftests/bpf/multi_check.h:9:6: error: defined with too many args
>   /home/jolsa/linux-qemu/tools/testing/selftests/bpf/multi_check.h:9:6: error: defined with too many args
>   5 errors generated.
>   make: *** [Makefile:470: /home/jolsa/linux-qemu/tools/testing/selftests/bpf/fentry_fexit_multi_test.o] Error 1
>
> I can fix that by defining 2 separate multi_arg_check functions
> with different names, which I did in a follow-up temporary patch.
> Not sure if I'm hitting some clang/bpf limitation here?

don't know about clang limitations, but we should use proper static
linking anyway

>
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>  .../bpf/prog_tests/fentry_fexit_multi_test.c  | 52 +++++++++++++++++++
>  .../bpf/progs/fentry_fexit_multi_test.c       | 28 ++++++++++
>  2 files changed, 80 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/fentry_fexit_multi_test.c
>  create mode 100644 tools/testing/selftests/bpf/progs/fentry_fexit_multi_test.c
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/fentry_fexit_multi_test.c b/tools/testing/selftests/bpf/prog_tests/fentry_fexit_multi_test.c
> new file mode 100644
> index 000000000000..76f917ad843d
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/fentry_fexit_multi_test.c
> @@ -0,0 +1,52 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <test_progs.h>
> +#include "fentry_fexit_multi_test.skel.h"
> +
> +void test_fentry_fexit_multi_test(void)
> +{
> +       DECLARE_LIBBPF_OPTS(bpf_link_update_opts, link_upd_opts);
> +       struct fentry_fexit_multi_test *skel = NULL;
> +       unsigned long long *bpf_fentry_test;
> +       __u32 duration = 0, retval;
> +       struct bpf_link *link;
> +       int err, prog_fd;
> +
> +       skel = fentry_fexit_multi_test__open_and_load();
> +       if (!ASSERT_OK_PTR(skel, "fentry_multi_skel_load"))
> +               goto cleanup;
> +
> +       bpf_fentry_test = &skel->bss->bpf_fentry_test[0];
> +       ASSERT_OK(kallsyms_find("bpf_fentry_test1", &bpf_fentry_test[0]), "kallsyms_find");
> +       ASSERT_OK(kallsyms_find("bpf_fentry_test2", &bpf_fentry_test[1]), "kallsyms_find");
> +       ASSERT_OK(kallsyms_find("bpf_fentry_test3", &bpf_fentry_test[2]), "kallsyms_find");
> +       ASSERT_OK(kallsyms_find("bpf_fentry_test4", &bpf_fentry_test[3]), "kallsyms_find");
> +       ASSERT_OK(kallsyms_find("bpf_fentry_test5", &bpf_fentry_test[4]), "kallsyms_find");
> +       ASSERT_OK(kallsyms_find("bpf_fentry_test6", &bpf_fentry_test[5]), "kallsyms_find");
> +       ASSERT_OK(kallsyms_find("bpf_fentry_test7", &bpf_fentry_test[6]), "kallsyms_find");
> +       ASSERT_OK(kallsyms_find("bpf_fentry_test8", &bpf_fentry_test[7]), "kallsyms_find");
> +
> +       link = bpf_program__attach(skel->progs.test1);
> +       if (!ASSERT_OK_PTR(link, "attach_fentry_fexit"))
> +               goto cleanup;
> +
> +       err = bpf_link_update(bpf_link__fd(link),
> +                             bpf_program__fd(skel->progs.test2),
> +                             NULL);
> +       if (!ASSERT_OK(err, "bpf_link_update"))
> +               goto cleanup_link;
> +
> +       prog_fd = bpf_program__fd(skel->progs.test1);
> +       err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
> +                               NULL, NULL, &retval, &duration);
> +       ASSERT_OK(err, "test_run");
> +       ASSERT_EQ(retval, 0, "test_run");
> +
> +       ASSERT_EQ(skel->bss->test1_arg_result, 8, "test1_arg_result");
> +       ASSERT_EQ(skel->bss->test2_arg_result, 8, "test2_arg_result");
> +       ASSERT_EQ(skel->bss->test2_ret_result, 8, "test2_ret_result");
> +
> +cleanup_link:
> +       bpf_link__destroy(link);
> +cleanup:
> +       fentry_fexit_multi_test__destroy(skel);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/fentry_fexit_multi_test.c b/tools/testing/selftests/bpf/progs/fentry_fexit_multi_test.c
> new file mode 100644
> index 000000000000..e25ab0085399
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/fentry_fexit_multi_test.c
> @@ -0,0 +1,28 @@
> +// SPDX-License-Identifier: GPL-2.0
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +#include "multi_check.h"
> +
> +char _license[] SEC("license") = "GPL";
> +
> +unsigned long long bpf_fentry_test[8];
> +
> +__u64 test1_arg_result = 0;
> +__u64 test2_arg_result = 0;
> +__u64 test2_ret_result = 0;
> +
> +SEC("fentry.multi/bpf_fentry_test*")
> +int BPF_PROG(test1, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f)
> +{
> +       multi_arg_check(ip, a, b, c, d, e, f, &test1_arg_result);
> +       return 0;
> +}
> +
> +SEC("fexit.multi/")
> +int BPF_PROG(test2, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, int ret)
> +{
> +       multi_arg_check(ip, a, b, c, d, e, f, &test2_arg_result);
> +       multi_ret_check(ip, ret, &test2_ret_result);
> +       return 0;
> +}
> --
> 2.31.1
>


* Re: [PATCH 13/19] bpf: Add support to link multi func tracing program
  2021-06-08 23:05           ` Alexei Starovoitov
  2021-06-09  5:08             ` Andrii Nakryiko
@ 2021-06-09 13:33             ` Jiri Olsa
  1 sibling, 0 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-09 13:33 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Network Development, bpf, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 08, 2021 at 04:05:29PM -0700, Alexei Starovoitov wrote:
> On Tue, Jun 8, 2021 at 2:07 PM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Tue, Jun 08, 2021 at 11:49:03AM -0700, Alexei Starovoitov wrote:
> > > On Tue, Jun 08, 2021 at 08:17:00PM +0200, Jiri Olsa wrote:
> > > > On Tue, Jun 08, 2021 at 08:42:32AM -0700, Alexei Starovoitov wrote:
> > > > > On Sat, Jun 5, 2021 at 4:11 AM Jiri Olsa <jolsa@kernel.org> wrote:
> > > > > >
> > > > > > Adding support to attach multiple functions to tracing program
> > > > > > by using the link_create/link_update interface.
> > > > > >
> > > > > > Adding multi_btf_ids/multi_btf_ids_cnt pair to link_create struct
> > > > > > API, that define array of functions btf ids that will be attached
> > > > > > to prog_fd.
> > > > > >
> > > > > > The prog_fd needs to be multi prog tracing program (BPF_F_MULTI_FUNC).
> > > > > >
> > > > > > The new link_create interface creates new BPF_LINK_TYPE_TRACING_MULTI
> > > > > > link type, which creates separate bpf_trampoline and registers it
> > > > > > as direct function for all specified btf ids.
> > > > > >
> > > > > > The new bpf_trampoline is out of scope (bpf_trampoline_lookup) of
> > > > > > standard trampolines, so all registered functions need to be free
> > > > > > of direct functions, otherwise the link fails.
> > > > >
> > > > > Overall the api makes sense to me.
> > > > > The restriction of multi vs non-multi is too severe though.
> > > > > The multi trampoline can serve normal fentry/fexit too.
> > > >
> > > > so multi trampoline gets called from all the registered functions,
> > > > so there would need to be filter for specific ip before calling the
> > > > standard program.. single cmp/jnz might not be that bad, I'll check
> > >
> > > You mean reusing the same multi trampoline for all IPs and regenerating
> > > it with a bunch of cmp/jnz checks? There should be a better way to scale.
> > > Maybe clone multi trampoline instead?
> > > IPs[1-10] will point to multi.
> > > IP[11] will point to a clone of multi that serves multi prog and
> > > fentry/fexit progs specific for that IP.
> >
> > ok, so we'd clone multi trampoline if there's request to attach
> > standard trampoline to some IP from multi trampoline
> >
> > .. and transform currently attached standard trampoline for IP
> > into clone of multi trampoline, if there's request to create
> > multi trampoline that covers that IP
> 
> yep. For every IP==btf_id there will be only two possible trampolines.
> Should be easy enough to track and transition between them.
> The standard fentry/fexit will only get negligible slowdown from
> going through multi.
> multi+fexit and fmod_ret need to be thought through as well.
> That's why I thought that 'ip' at the end should simplify things.
> Only multi will have access to it.
> But we can store it first too. fentry/fexit will see ctx=r1 with +8 offset
> and will have normal args in ctx. Like ip isn't even there.
> While multi trampoline is always doing ip, arg1,arg2, .., arg6
> and passes ctx = &ip into multi prog and ctx = &arg1 into fentry/fexit.
> 'ret' for fexit is problematic though. hmm.
> Maybe such clone multi trampoline for specific ip with 2 args will do:
> ip, arg1, arg2, ret, 0, 0, 0, ret.

we could call the multi progs first, set up the new args
and call the non-multi progs with that

jirka

> Then multi will have 6 args, though 3rd is actually ret.
> Then fexit will have ret in the right place and multi prog will have
> it as 7th arg.
> 



* Re: [PATCH 13/19] bpf: Add support to link multi func tracing program
  2021-06-09  5:08             ` Andrii Nakryiko
@ 2021-06-09 13:42               ` Jiri Olsa
  0 siblings, 0 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-09 13:42 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Jiri Olsa, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Steven Rostedt (VMware),
	Network Development, bpf, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 08, 2021 at 10:08:32PM -0700, Andrii Nakryiko wrote:
> On Tue, Jun 8, 2021 at 4:07 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Tue, Jun 8, 2021 at 2:07 PM Jiri Olsa <jolsa@redhat.com> wrote:
> > >
> > > On Tue, Jun 08, 2021 at 11:49:03AM -0700, Alexei Starovoitov wrote:
> > > > On Tue, Jun 08, 2021 at 08:17:00PM +0200, Jiri Olsa wrote:
> > > > > On Tue, Jun 08, 2021 at 08:42:32AM -0700, Alexei Starovoitov wrote:
> > > > > > On Sat, Jun 5, 2021 at 4:11 AM Jiri Olsa <jolsa@kernel.org> wrote:
> > > > > > >
> > > > > > > Adding support to attach multiple functions to tracing program
> > > > > > > by using the link_create/link_update interface.
> > > > > > >
> > > > > > > Adding multi_btf_ids/multi_btf_ids_cnt pair to link_create struct
> > > > > > > API, that define array of functions btf ids that will be attached
> > > > > > > to prog_fd.
> > > > > > >
> > > > > > > The prog_fd needs to be multi prog tracing program (BPF_F_MULTI_FUNC).
> > > > > > >
> > > > > > > The new link_create interface creates new BPF_LINK_TYPE_TRACING_MULTI
> > > > > > > link type, which creates separate bpf_trampoline and registers it
> > > > > > > as direct function for all specified btf ids.
> > > > > > >
> > > > > > > The new bpf_trampoline is out of scope (bpf_trampoline_lookup) of
> > > > > > > standard trampolines, so all registered functions need to be free
> > > > > > > of direct functions, otherwise the link fails.
> > > > > >
> > > > > > Overall the api makes sense to me.
> > > > > > The restriction of multi vs non-multi is too severe though.
> > > > > > The multi trampoline can serve normal fentry/fexit too.
> > > > >
> > > > > so multi trampoline gets called from all the registered functions,
> > > > > so there would need to be filter for specific ip before calling the
> > > > > standard program.. single cmp/jnz might not be that bad, I'll check
> > > >
> > > > You mean reusing the same multi trampoline for all IPs and regenerating
> > > > it with a bunch of cmp/jnz checks? There should be a better way to scale.
> > > > Maybe clone multi trampoline instead?
> > > > IPs[1-10] will point to multi.
> > > > IP[11] will point to a clone of multi that serves multi prog and
> > > > fentry/fexit progs specific for that IP.
> > >
> > > ok, so we'd clone multi trampoline if there's request to attach
> > > standard trampoline to some IP from multi trampoline
> > >
> > > .. and transform currently attached standard trampoline for IP
> > > into clone of multi trampoline, if there's request to create
> > > multi trampoline that covers that IP
> >
> > yep. For every IP==btf_id there will be only two possible trampolines.
> > Should be easy enough to track and transition between them.
> > The standard fentry/fexit will only get negligible slowdown from
> > going through multi.
> > multi+fexit and fmod_ret need to be thought through as well.
> > That's why I thought that 'ip' at the end should simplify things.
> 
> Putting ip at the end has downsides. We might support >6 arguments
> eventually, at which point it will be super weird to have 6 args, ip,
> then the rest of the arguments?..
> 
> Would it be too bad to put IP at -8 offset relative to ctx? That will
> also work for normal fentry/fexit, for which it's useful to have ip
> passed in as well, IMO. So no special casing for multi/non-multi, and
> it's backwards compatible.

I think Alexei is ok with that, as he said below

> 
> Ideally, I'd love it to be actually retrievable through a new BPF
> helper, something like bpf_caller_ip(ctx), but I'm not sure if we can
> implement this sanely, so I don't hold high hopes.

we could always store it at ctx-8 and have the helper get it
from there.. that might also ease up handling of that extra first
ip argument for multi-func programs in the verifier
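
e.g. a kernel-side sketch (helper name and semantics hypothetical,
assuming ip really sits one slot below ctx):

  BPF_CALL_1(bpf_caller_ip, void *, ctx)
  {
          return *((u64 *)ctx - 1);
  }

  static const struct bpf_func_proto bpf_caller_ip_proto = {
          .func           = bpf_caller_ip,
          .gpl_only       = false,
          .ret_type       = RET_INTEGER,
          .arg1_type      = ARG_PTR_TO_CTX,
  };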

jirka

> 
> > Only multi will have access to it.
> > But we can store it first too. fentry/fexit will see ctx=r1 with +8 offset
> > and will have normal args in ctx. Like ip isn't even there.
> > While multi trampoline is always doing ip, arg1,arg2, .., arg6
> > and passes ctx = &ip into multi prog and ctx = &arg1 into fentry/fexit.
> > 'ret' for fexit is problematic though. hmm.
> > Maybe such clone multi trampoline for specific ip with 2 args will do:
> > ip, arg1, arg2, ret, 0, 0, 0, ret.
> > Then multi will have 6 args, though 3rd is actually ret.
> > Then fexit will have ret in the right place and multi prog will have
> > it as 7th arg.
> 



* Re: [PATCH 13/19] bpf: Add support to link multi func tracing program
  2021-06-09  5:18   ` Andrii Nakryiko
@ 2021-06-09 13:53     ` Jiri Olsa
  0 siblings, 0 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-09 13:53 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 08, 2021 at 10:18:21PM -0700, Andrii Nakryiko wrote:
> On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Adding support to attach multiple functions to tracing program
> > by using the link_create/link_update interface.
> >
> > Adding multi_btf_ids/multi_btf_ids_cnt pair to link_create struct
> > API, that define array of functions btf ids that will be attached
> > to prog_fd.
> >
> > The prog_fd needs to be multi prog tracing program (BPF_F_MULTI_FUNC).
> 
> So I'm not sure why we added a new load flag instead of just using a
> new BPF program type or expected attach type? We have different
> trampolines and different kinds of links for them, so why not be
> consistent and use a new type of BPF program?.. It does change the
> BPF verifier's treatment of input arguments, so it's not just a slight
> variation, it's quite a different type of program.

ok, makes sense ... BPF_PROG_TYPE_TRACING_MULTI ?

SNIP

> >  struct bpf_attach_target_info {
> > @@ -746,6 +747,8 @@ void bpf_ksym_add(struct bpf_ksym *ksym);
> >  void bpf_ksym_del(struct bpf_ksym *ksym);
> >  int bpf_jit_charge_modmem(u32 pages);
> >  void bpf_jit_uncharge_modmem(u32 pages);
> > +struct bpf_trampoline *bpf_trampoline_multi_alloc(void);
> > +void bpf_trampoline_multi_free(struct bpf_trampoline *tr);
> >  #else
> >  static inline int bpf_trampoline_link_prog(struct bpf_prog *prog,
> >                                            struct bpf_trampoline *tr)
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index ad9340fb14d4..5fd6ff64e8dc 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -1007,6 +1007,7 @@ enum bpf_link_type {
> >         BPF_LINK_TYPE_ITER = 4,
> >         BPF_LINK_TYPE_NETNS = 5,
> >         BPF_LINK_TYPE_XDP = 6,
> > +       BPF_LINK_TYPE_TRACING_MULTI = 7,
> >
> >         MAX_BPF_LINK_TYPE,
> >  };
> > @@ -1454,6 +1455,10 @@ union bpf_attr {
> >                                 __aligned_u64   iter_info;      /* extra bpf_iter_link_info */
> >                                 __u32           iter_info_len;  /* iter_info length */
> >                         };
> > +                       struct {
> > +                               __aligned_u64   multi_btf_ids;          /* addresses to attach */
> > +                               __u32           multi_btf_ids_cnt;      /* addresses count */
> > +                       };
> 
> let's do what the bpf_link-based TC-BPF API is doing and put it into a
> named field (I'd do the same for iter_info/iter_info_len above as well;
> I'm not sure why we did this flat naming scheme, we now know it's
> inconvenient when extending stuff).
> 
> struct {
>     __aligned_u64 btf_ids;
>     __u32 btf_ids_cnt;
> } multi;

ok

> 
> >                 };
> >         } link_create;
> >
> 
> [...]
> 
> > +static int bpf_tracing_multi_link_update(struct bpf_link *link,
> > +                                        struct bpf_prog *new_prog,
> > +                                        struct bpf_prog *old_prog __maybe_unused)
> > +{
> 
> The BPF_LINK_UPDATE command supports passing old_fd and extra flags. We
> can use that to implement both updating an existing BPF program in place
> (by passing BPF_F_REPLACE and old_fd) and adding the program to the
> list of programs, if old_fd == 0. WDYT?

yes, sounds good

thanks,
jirka



* Re: [PATCH 14/19] libbpf: Add btf__find_by_pattern_kind function
  2021-06-09  5:29   ` Andrii Nakryiko
@ 2021-06-09 13:59     ` Jiri Olsa
  2021-06-09 14:19       ` Jiri Olsa
  0 siblings, 1 reply; 76+ messages in thread
From: Jiri Olsa @ 2021-06-09 13:59 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 08, 2021 at 10:29:19PM -0700, Andrii Nakryiko wrote:
> On Sat, Jun 5, 2021 at 4:14 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Adding btf__find_by_pattern_kind function that returns
> > array of BTF ids for given function name pattern.
> >
> > Using libc's regex.h support for that.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  tools/lib/bpf/btf.c | 68 +++++++++++++++++++++++++++++++++++++++++++++
> >  tools/lib/bpf/btf.h |  3 ++
> >  2 files changed, 71 insertions(+)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index b46760b93bb4..421dd6c1e44a 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -1,6 +1,7 @@
> >  // SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> >  /* Copyright (c) 2018 Facebook */
> >
> > +#define _GNU_SOURCE
> >  #include <byteswap.h>
> >  #include <endian.h>
> >  #include <stdio.h>
> > @@ -16,6 +17,7 @@
> >  #include <linux/err.h>
> >  #include <linux/btf.h>
> >  #include <gelf.h>
> > +#include <regex.h>
> >  #include "btf.h"
> >  #include "bpf.h"
> >  #include "libbpf.h"
> > @@ -711,6 +713,72 @@ __s32 btf__find_by_name_kind(const struct btf *btf, const char *type_name,
> >         return libbpf_err(-ENOENT);
> >  }
> >
> > +static bool is_wildcard(char c)
> > +{
> > +       static const char *wildchars = "*?[|";
> > +
> > +       return strchr(wildchars, c);
> > +}
> > +
> > +int btf__find_by_pattern_kind(const struct btf *btf,
> > +                             const char *type_pattern, __u32 kind,
> > +                             __s32 **__ids)
> > +{
> > +       __u32 i, nr_types = btf__get_nr_types(btf);
> > +       __s32 *ids = NULL;
> > +       int cnt = 0, alloc = 0, ret;
> > +       regex_t regex;
> > +       char *pattern;
> > +
> > +       if (kind == BTF_KIND_UNKN || !strcmp(type_pattern, "void"))
> > +               return 0;
> > +
> > +       /* When the pattern does not start with wildcard, treat it as
> > +        * if we'd want to match it from the beginning of the string.
> > +        */
> 
> This assumption is absolutely atrocious. If we say it's a regexp, then
> it has to always be a regexp, not something based on some random
> heuristic keyed off the first character.
> 
> Taking a step back, though. Do we really need to provide this API? Why
> can't applications implement it on their own, given that regexp
> functionality is provided by libc? Which I didn't know, actually, so
> that's pretty nice, assuming that it's also available in more minimal
> implementations like musl.
> 

so the only purpose of this function is to support wildcards in
tests like:

  SEC("fentry.multi/bpf_fentry_test*")

so the generic skeleton attach function can work.. but that can be
removed and the test programs can be attached manually through some
other attach function that takes a list of functions as an argument
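
i.e. something like (hypothetical API, name just illustrative):

  LIBBPF_API struct bpf_link *
  bpf_program__attach_multi_ids(struct bpf_program *prog,
                                const __u32 *btf_ids, __u32 cnt);

  /* caller builds the BTF id list however it likes */
  link = bpf_program__attach_multi_ids(prog, ids, cnt);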

jirka

> > +       asprintf(&pattern, "%s%s",
> > +                is_wildcard(type_pattern[0]) ? "^" : "",
> > +                type_pattern);
> > +
> > +       ret = regcomp(&regex, pattern, REG_EXTENDED);
> > +       if (ret) {
> > +               pr_warn("failed to compile regex\n");
> > +               free(pattern);
> > +               return -EINVAL;
> > +       }
> > +
> > +       free(pattern);
> > +
> > +       for (i = 1; i <= nr_types; i++) {
> > +               const struct btf_type *t = btf__type_by_id(btf, i);
> > +               const char *name;
> > +               __s32 *p;
> > +
> > +               if (btf_kind(t) != kind)
> > +                       continue;
> > +               name = btf__name_by_offset(btf, t->name_off);
> > +               if (name && regexec(&regex, name, 0, NULL, 0))
> > +                       continue;
> > +               if (cnt == alloc) {
> > +                       alloc = max(100, alloc * 3 / 2);
> > +                       p = realloc(ids, alloc * sizeof(__u32));
> 
> this memory allocation and re-allocation on behalf of users is another
> argument against this API
> 
> > +                       if (!p) {
> > +                               free(ids);
> > +                               regfree(&regex);
> > +                               return -ENOMEM;
> > +                       }
> > +                       ids = p;
> > +               }
> > +
> > +               ids[cnt] = i;
> > +               cnt++;
> > +       }
> > +
> > +       regfree(&regex);
> > +       *__ids = ids;
> > +       return cnt ?: -ENOENT;
> > +}
> > +
> >  static bool btf_is_modifiable(const struct btf *btf)
> >  {
> >         return (void *)btf->hdr != btf->raw_data;
> > diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> > index b54f1c3ebd57..036857aded94 100644
> > --- a/tools/lib/bpf/btf.h
> > +++ b/tools/lib/bpf/btf.h
> > @@ -371,6 +371,9 @@ btf_var_secinfos(const struct btf_type *t)
> >         return (struct btf_var_secinfo *)(t + 1);
> >  }
> >
> > +int btf__find_by_pattern_kind(const struct btf *btf,
> > +                             const char *type_pattern, __u32 kind,
> > +                             __s32 **__ids);
> >  #ifdef __cplusplus
> >  } /* extern "C" */
> >  #endif
> > --
> > 2.31.1
> >
> 



* Re: [PATCH 15/19] libbpf: Add support to link multi func tracing program
  2021-06-09  5:34   ` Andrii Nakryiko
@ 2021-06-09 14:17     ` Jiri Olsa
  2021-06-10 17:05       ` Andrii Nakryiko
  0 siblings, 1 reply; 76+ messages in thread
From: Jiri Olsa @ 2021-06-09 14:17 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 08, 2021 at 10:34:11PM -0700, Andrii Nakryiko wrote:
> On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Adding support to link multi func tracing program
> > through link_create interface.
> >
> > Adding special types for multi func programs:
> >
> >   fentry.multi
> >   fexit.multi
> >
> > so you can define multi func programs like:
> >
> >   SEC("fentry.multi/bpf_fentry_test*")
> >   int BPF_PROG(test1, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f)
> >
> > that defines test1 to be attached to bpf_fentry_test* functions,
> > and able to attach ip and 6 arguments.
> >
> > If functions are not specified the program needs to be attached
> > manually.
> >
> > Adding new btf id related fields to bpf_link_create_opts and
> > bpf_link_create to use them.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  tools/lib/bpf/bpf.c    | 11 ++++++-
> >  tools/lib/bpf/bpf.h    |  4 ++-
> >  tools/lib/bpf/libbpf.c | 72 ++++++++++++++++++++++++++++++++++++++++++
> >  3 files changed, 85 insertions(+), 2 deletions(-)
> >
> > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > index 86dcac44f32f..da892737b522 100644
> > --- a/tools/lib/bpf/bpf.c
> > +++ b/tools/lib/bpf/bpf.c
> > @@ -674,7 +674,8 @@ int bpf_link_create(int prog_fd, int target_fd,
> >                     enum bpf_attach_type attach_type,
> >                     const struct bpf_link_create_opts *opts)
> >  {
> > -       __u32 target_btf_id, iter_info_len;
> > +       __u32 target_btf_id, iter_info_len, multi_btf_ids_cnt;
> > +       __s32 *multi_btf_ids;
> >         union bpf_attr attr;
> >         int fd;
> >
> > @@ -687,6 +688,9 @@ int bpf_link_create(int prog_fd, int target_fd,
> >         if (iter_info_len && target_btf_id)
> 
> here we check that mutually exclusive options are not specified, we
> should do the same for multi stuff

right, ok
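
something like this then, I guess (untested sketch, assuming the
OPTS_GET calls are moved above the check):

	if ((iter_info_len || target_btf_id) &&
	    (multi_btf_ids || multi_btf_ids_cnt))
		return libbpf_err(-EINVAL);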

> 
> >                 return libbpf_err(-EINVAL);
> >
> > +       multi_btf_ids = OPTS_GET(opts, multi_btf_ids, 0);
> > +       multi_btf_ids_cnt = OPTS_GET(opts, multi_btf_ids_cnt, 0);
> > +
> >         memset(&attr, 0, sizeof(attr));
> >         attr.link_create.prog_fd = prog_fd;
> >         attr.link_create.target_fd = target_fd;
> > @@ -701,6 +705,11 @@ int bpf_link_create(int prog_fd, int target_fd,
> >                 attr.link_create.target_btf_id = target_btf_id;
> >         }
> >
> > +       if (multi_btf_ids && multi_btf_ids_cnt) {
> > +               attr.link_create.multi_btf_ids = (__u64) multi_btf_ids;
> > +               attr.link_create.multi_btf_ids_cnt = multi_btf_ids_cnt;
> > +       }
> > +
> >         fd = sys_bpf(BPF_LINK_CREATE, &attr, sizeof(attr));
> >         return libbpf_err_errno(fd);
> >  }
> > diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
> > index 4f758f8f50cd..2f78b6c34765 100644
> > --- a/tools/lib/bpf/bpf.h
> > +++ b/tools/lib/bpf/bpf.h
> > @@ -177,8 +177,10 @@ struct bpf_link_create_opts {
> >         union bpf_iter_link_info *iter_info;
> >         __u32 iter_info_len;
> >         __u32 target_btf_id;
> > +       __s32 *multi_btf_ids;
> 
> why ids are __s32?..

hum not sure why I did that.. __u32 then

> 
> > +       __u32 multi_btf_ids_cnt;
> >  };
> > -#define bpf_link_create_opts__last_field target_btf_id
> > +#define bpf_link_create_opts__last_field multi_btf_ids_cnt
> >
> >  LIBBPF_API int bpf_link_create(int prog_fd, int target_fd,
> >                                enum bpf_attach_type attach_type,
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index 65f87cc1220c..bd31de3b6a85 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -228,6 +228,7 @@ struct bpf_sec_def {
> >         bool is_attachable;
> >         bool is_attach_btf;
> >         bool is_sleepable;
> > +       bool is_multi_func;
> >         attach_fn_t attach_fn;
> >  };
> >
> > @@ -7609,6 +7610,8 @@ __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz,
> >
> >                 if (prog->sec_def->is_sleepable)
> >                         prog->prog_flags |= BPF_F_SLEEPABLE;
> > +               if (prog->sec_def->is_multi_func)
> > +                       prog->prog_flags |= BPF_F_MULTI_FUNC;
> >                 bpf_program__set_type(prog, prog->sec_def->prog_type);
> >                 bpf_program__set_expected_attach_type(prog,
> >                                 prog->sec_def->expected_attach_type);
> > @@ -9070,6 +9073,8 @@ static struct bpf_link *attach_raw_tp(const struct bpf_sec_def *sec,
> >                                       struct bpf_program *prog);
> >  static struct bpf_link *attach_trace(const struct bpf_sec_def *sec,
> >                                      struct bpf_program *prog);
> > +static struct bpf_link *attach_trace_multi(const struct bpf_sec_def *sec,
> > +                                          struct bpf_program *prog);
> >  static struct bpf_link *attach_lsm(const struct bpf_sec_def *sec,
> >                                    struct bpf_program *prog);
> >  static struct bpf_link *attach_iter(const struct bpf_sec_def *sec,
> > @@ -9143,6 +9148,14 @@ static const struct bpf_sec_def section_defs[] = {
> >                 .attach_fn = attach_iter),
> >         SEC_DEF("syscall", SYSCALL,
> >                 .is_sleepable = true),
> > +       SEC_DEF("fentry.multi/", TRACING,
> > +               .expected_attach_type = BPF_TRACE_FENTRY,
> 
> BPF_TRACE_MULTI_FENTRY instead of is_multi stuff everywhere?.. Or a
> new type of BPF program altogether?
> 
> > +               .is_multi_func = true,
> > +               .attach_fn = attach_trace_multi),
> > +       SEC_DEF("fexit.multi/", TRACING,
> > +               .expected_attach_type = BPF_TRACE_FEXIT,
> > +               .is_multi_func = true,
> > +               .attach_fn = attach_trace_multi),
> >         BPF_EAPROG_SEC("xdp_devmap/",           BPF_PROG_TYPE_XDP,
> >                                                 BPF_XDP_DEVMAP),
> >         BPF_EAPROG_SEC("xdp_cpumap/",           BPF_PROG_TYPE_XDP,
> > @@ -9584,6 +9597,9 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
> >         if (!name)
> >                 return -EINVAL;
> >
> > +       if (prog->prog_flags & BPF_F_MULTI_FUNC)
> > +               return 0;
> > +
> >         for (i = 0; i < ARRAY_SIZE(section_defs); i++) {
> >                 if (!section_defs[i].is_attach_btf)
> >                         continue;
> > @@ -10537,6 +10553,62 @@ static struct bpf_link *bpf_program__attach_btf_id(struct bpf_program *prog)
> >         return (struct bpf_link *)link;
> >  }
> >
> > +static struct bpf_link *bpf_program__attach_multi(struct bpf_program *prog)
> > +{
> > +       char *pattern = prog->sec_name + prog->sec_def->len;
> > +       DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
> > +       enum bpf_attach_type attach_type;
> > +       int prog_fd, link_fd, cnt, err;
> > +       struct bpf_link *link = NULL;
> > +       __s32 *ids = NULL;
> > +
> > +       prog_fd = bpf_program__fd(prog);
> > +       if (prog_fd < 0) {
> > +               pr_warn("prog '%s': can't attach before loaded\n", prog->name);
> > +               return ERR_PTR(-EINVAL);
> > +       }
> > +
> > +       err = bpf_object__load_vmlinux_btf(prog->obj, true);
> > +       if (err)
> > +               return ERR_PTR(err);
> > +
> > +       cnt = btf__find_by_pattern_kind(prog->obj->btf_vmlinux, pattern,
> > +                                       BTF_KIND_FUNC, &ids);
> 
> I wonder if it would be better to just support simplified glob
> patterns like "prefix*", "*suffix", "exactmatch", and "*substring*"?
> That should be sufficient for the majority of cases. For the cases where
> user needs something more nuanced, they can just construct BTF ID list
> with custom code and do manual attach.

as I wrote earlier the function is just for the purpose of the test,
and we can always do the manual attach

I don't mind adding that simplified matching you described

jirka

> 
> > +       if (cnt <= 0)
> > +               return ERR_PTR(-EINVAL);
> > +
> > +       link = calloc(1, sizeof(*link));
> > +       if (!link) {
> > +               err = -ENOMEM;
> > +               goto out_err;
> > +       }
> > +       link->detach = &bpf_link__detach_fd;
> > +
> > +       opts.multi_btf_ids = ids;
> > +       opts.multi_btf_ids_cnt = cnt;
> > +
> > +       attach_type = bpf_program__get_expected_attach_type(prog);
> > +       link_fd = bpf_link_create(prog_fd, 0, attach_type, &opts);
> > +       if (link_fd < 0) {
> > +               err = -errno;
> > +               goto out_err;
> > +       }
> > +       link->fd = link_fd;
> > +       free(ids);
> > +       return link;
> > +
> > +out_err:
> > +       free(link);
> > +       free(ids);
> > +       return ERR_PTR(err);
> > +}
> > +
> > +static struct bpf_link *attach_trace_multi(const struct bpf_sec_def *sec,
> > +                                          struct bpf_program *prog)
> > +{
> > +       return bpf_program__attach_multi(prog);
> > +}
> > +
> >  struct bpf_link *bpf_program__attach_trace(struct bpf_program *prog)
> >  {
> >         return bpf_program__attach_btf_id(prog);
> > --
> > 2.31.1
> >
> 



* Re: [PATCH 14/19] libbpf: Add btf__find_by_pattern_kind function
  2021-06-09 13:59     ` Jiri Olsa
@ 2021-06-09 14:19       ` Jiri Olsa
  0 siblings, 0 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-09 14:19 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Wed, Jun 09, 2021 at 03:59:47PM +0200, Jiri Olsa wrote:

SNIP

> > > +
> > > +       /* When the pattern does not start with wildcard, treat it as
> > > +        * if we'd want to match it from the beginning of the string.
> > > +        */
> > 
> > This assumption is absolutely atrocious. If we say it's regexp, then
> > it has to always be regexp, not something based on some random
> > heuristic based on the first character.
> > 
> > Taking a step back, though. Do we really need to provide this API? Why
> > applications can't implement it on their own, given regexp
> > functionality is provided by libc. Which I didn't know, actually, so
> > that's pretty nice, assuming that it's also available in more minimal
> > implementations like musl.
> > 
> 
> so the only purpose for this function is to support wildcards in
> tests like:
> 
>   SEC("fentry.multi/bpf_fentry_test*")
> 
> so the generic skeleton attach function can work.. but that can be
> removed and the test programs can be attached manually through some
> other attach function that takes a list of functions as an argument

nah, no other attach function is needed, we have that support now in
link_create ready to use ;-) sry
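
i.e. manual attach would be just something like this (untested sketch,
using the opts added in this series):

	__u32 ids[] = { /* BTF ids collected by the user */ };
	DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts,
		.multi_btf_ids = ids,
		.multi_btf_ids_cnt = ARRAY_SIZE(ids),
	);

	link_fd = bpf_link_create(bpf_program__fd(prog), 0,
				  BPF_TRACE_FENTRY, &opts);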

jirka



* Re: [PATCH 16/19] selftests/bpf: Add fentry multi func test
  2021-06-09  5:40   ` Andrii Nakryiko
@ 2021-06-09 14:29     ` Jiri Olsa
  2021-06-10 17:00       ` Andrii Nakryiko
  0 siblings, 1 reply; 76+ messages in thread
From: Jiri Olsa @ 2021-06-09 14:29 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 08, 2021 at 10:40:24PM -0700, Andrii Nakryiko wrote:
> On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Adding selftest for fentry multi func test that attaches
> > to bpf_fentry_test* functions and checks argument values
> > based on the processed function.
> >
> > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > ---
> >  tools/testing/selftests/bpf/multi_check.h     | 52 +++++++++++++++++++
> >  .../bpf/prog_tests/fentry_multi_test.c        | 43 +++++++++++++++
> >  .../selftests/bpf/progs/fentry_multi_test.c   | 18 +++++++
> >  3 files changed, 113 insertions(+)
> >  create mode 100644 tools/testing/selftests/bpf/multi_check.h
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/fentry_multi_test.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/fentry_multi_test.c
> >
> > diff --git a/tools/testing/selftests/bpf/multi_check.h b/tools/testing/selftests/bpf/multi_check.h
> > new file mode 100644
> > index 000000000000..36c2a93f9be3
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/multi_check.h
> 
> we have a proper static linking now, we don't have to use header
> inclusion hacks, let's do this properly?

ok, will change

> 
> > @@ -0,0 +1,52 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +
> > +#ifndef __MULTI_CHECK_H
> > +#define __MULTI_CHECK_H
> > +
> > +extern unsigned long long bpf_fentry_test[8];
> > +
> > +static __attribute__((unused)) inline
> > +void multi_arg_check(unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, __u64 *test_result)
> > +{
> > +       if (ip == bpf_fentry_test[0]) {
> > +               *test_result += (int) a == 1;
> > +       } else if (ip == bpf_fentry_test[1]) {
> > +               *test_result += (int) a == 2 && (__u64) b == 3;
> > +       } else if (ip == bpf_fentry_test[2]) {
> > +               *test_result += (char) a == 4 && (int) b == 5 && (__u64) c == 6;
> > +       } else if (ip == bpf_fentry_test[3]) {
> > +               *test_result += (void *) a == (void *) 7 && (char) b == 8 && (int) c == 9 && (__u64) d == 10;
> > +       } else if (ip == bpf_fentry_test[4]) {
> > +               *test_result += (__u64) a == 11 && (void *) b == (void *) 12 && (short) c == 13 && (int) d == 14 && (__u64) e == 15;
> > +       } else if (ip == bpf_fentry_test[5]) {
> > +               *test_result += (__u64) a == 16 && (void *) b == (void *) 17 && (short) c == 18 && (int) d == 19 && (void *) e == (void *) 20 && (__u64) f == 21;
> > +       } else if (ip == bpf_fentry_test[6]) {
> > +               *test_result += 1;
> > +       } else if (ip == bpf_fentry_test[7]) {
> > +               *test_result += 1;
> > +       }
> 
> why not use switch? and why the casting?

hum, for switch I'd need constants right? 

casting is extra ;-) wanted to check the actual argument types,
but probably makes no sense

will check

> 
> > +}
> > +
> 
> [...]
> 
> > diff --git a/tools/testing/selftests/bpf/progs/fentry_multi_test.c b/tools/testing/selftests/bpf/progs/fentry_multi_test.c
> > new file mode 100644
> > index 000000000000..a443fc958e5a
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/fentry_multi_test.c
> > @@ -0,0 +1,18 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +#include <linux/bpf.h>
> > +#include <bpf/bpf_helpers.h>
> > +#include <bpf/bpf_tracing.h>
> > +#include "multi_check.h"
> > +
> > +char _license[] SEC("license") = "GPL";
> > +
> > +unsigned long long bpf_fentry_test[8];
> > +
> > +__u64 test_result = 0;
> > +
> > +SEC("fentry.multi/bpf_fentry_test*")
> 
> wait, that's a regexp syntax that libc supports?.. Not .*? We should
> definitely not provide btf__find_by_pattern_kind() API, I'd like to
> avoid explaining what flavors of regexps libbpf supports.

ok

thanks,
jirka



* Re: [PATCH 18/19] selftests/bpf: Add fentry/fexit multi func test
  2021-06-09  5:41   ` Andrii Nakryiko
@ 2021-06-09 14:29     ` Jiri Olsa
  0 siblings, 0 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-09 14:29 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jun 08, 2021 at 10:41:37PM -0700, Andrii Nakryiko wrote:
> On Sat, Jun 5, 2021 at 4:13 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > Adding selftest for fentry/fexit multi func test that attaches
> > to bpf_fentry_test* functions and checks argument values based
> > on the processed function.
> >
> > When multi_arg_check is used from 2 different places I'm getting
> > a compilation failure, which I have not deciphered yet:
> >
> >   $ CLANG=/opt/clang/bin/clang LLC=/opt/clang/bin/llc make
> >     CLNG-BPF [test_maps] fentry_fexit_multi_test.o
> >   progs/fentry_fexit_multi_test.c:18:2: error: too many args to t24: i64 = \
> >   GlobalAddress<void (i64, i64, i64, i64, i64, i64, i64, i64*)* @multi_arg_check> 0, \
> >   progs/fentry_fexit_multi_test.c:18:2 @[ progs/fentry_fexit_multi_test.c:16:5 ]
> >           multi_arg_check(ip, a, b, c, d, e, f, &test1_arg_result);
> >           ^
> >   progs/fentry_fexit_multi_test.c:25:2: error: too many args to t32: i64 = \
> >   GlobalAddress<void (i64, i64, i64, i64, i64, i64, i64, i64*)* @multi_arg_check> 0, \
> >   progs/fentry_fexit_multi_test.c:25:2 @[ progs/fentry_fexit_multi_test.c:23:5 ]
> >           multi_arg_check(ip, a, b, c, d, e, f, &test2_arg_result);
> >           ^
> >   In file included from progs/fentry_fexit_multi_test.c:5:
> >   /home/jolsa/linux-qemu/tools/testing/selftests/bpf/multi_check.h:9:6: error: defined with too many args
> >   void multi_arg_check(unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, __u64 *test_result)
> >        ^
> >   /home/jolsa/linux-qemu/tools/testing/selftests/bpf/multi_check.h:9:6: error: defined with too many args
> >   /home/jolsa/linux-qemu/tools/testing/selftests/bpf/multi_check.h:9:6: error: defined with too many args
> >   5 errors generated.
> >   make: *** [Makefile:470: /home/jolsa/linux-qemu/tools/testing/selftests/bpf/fentry_fexit_multi_test.o] Error 1
> >
> > I can fix that by defining 2 separate multi_arg_check functions
> > with different names, which I did in a follow-up temporary patch.
> > Not sure if I'm hitting some clang/bpf limitation here?
> 
> don't know about  clang limitations, but we should use static linking
> proper anyways

ok, will change

thanks,
jirka



* Re: [PATCH 16/19] selftests/bpf: Add fentry multi func test
  2021-06-09 14:29     ` Jiri Olsa
@ 2021-06-10 17:00       ` Andrii Nakryiko
  2021-06-10 20:28         ` Jiri Olsa
  0 siblings, 1 reply; 76+ messages in thread
From: Andrii Nakryiko @ 2021-06-10 17:00 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Wed, Jun 9, 2021 at 7:29 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Tue, Jun 08, 2021 at 10:40:24PM -0700, Andrii Nakryiko wrote:
> > On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
> > >
> > > Adding selftest for fentry multi func test that attaches
> > > to bpf_fentry_test* functions and checks argument values
> > > based on the processed function.
> > >
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > ---
> > >  tools/testing/selftests/bpf/multi_check.h     | 52 +++++++++++++++++++
> > >  .../bpf/prog_tests/fentry_multi_test.c        | 43 +++++++++++++++
> > >  .../selftests/bpf/progs/fentry_multi_test.c   | 18 +++++++
> > >  3 files changed, 113 insertions(+)
> > >  create mode 100644 tools/testing/selftests/bpf/multi_check.h
> > >  create mode 100644 tools/testing/selftests/bpf/prog_tests/fentry_multi_test.c
> > >  create mode 100644 tools/testing/selftests/bpf/progs/fentry_multi_test.c
> > >
> > > diff --git a/tools/testing/selftests/bpf/multi_check.h b/tools/testing/selftests/bpf/multi_check.h
> > > new file mode 100644
> > > index 000000000000..36c2a93f9be3
> > > --- /dev/null
> > > +++ b/tools/testing/selftests/bpf/multi_check.h
> >
> > we have a proper static linking now, we don't have to use header
> > inclusion hacks, let's do this properly?
>
> ok, will change
>
> >
> > > @@ -0,0 +1,52 @@
> > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > +
> > > +#ifndef __MULTI_CHECK_H
> > > +#define __MULTI_CHECK_H
> > > +
> > > +extern unsigned long long bpf_fentry_test[8];
> > > +
> > > +static __attribute__((unused)) inline
> > > +void multi_arg_check(unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, __u64 *test_result)
> > > +{
> > > +       if (ip == bpf_fentry_test[0]) {
> > > +               *test_result += (int) a == 1;
> > > +       } else if (ip == bpf_fentry_test[1]) {
> > > +               *test_result += (int) a == 2 && (__u64) b == 3;
> > > +       } else if (ip == bpf_fentry_test[2]) {
> > > +               *test_result += (char) a == 4 && (int) b == 5 && (__u64) c == 6;
> > > +       } else if (ip == bpf_fentry_test[3]) {
> > > +               *test_result += (void *) a == (void *) 7 && (char) b == 8 && (int) c == 9 && (__u64) d == 10;
> > > +       } else if (ip == bpf_fentry_test[4]) {
> > > +               *test_result += (__u64) a == 11 && (void *) b == (void *) 12 && (short) c == 13 && (int) d == 14 && (__u64) e == 15;
> > > +       } else if (ip == bpf_fentry_test[5]) {
> > > +               *test_result += (__u64) a == 16 && (void *) b == (void *) 17 && (short) c == 18 && (int) d == 19 && (void *) e == (void *) 20 && (__u64) f == 21;
> > > +       } else if (ip == bpf_fentry_test[6]) {
> > > +               *test_result += 1;
> > > +       } else if (ip == bpf_fentry_test[7]) {
> > > +               *test_result += 1;
> > > +       }
> >
> > why not use switch? and why the casting?
>
> hum, for switch I'd need constants right?

doh, of course :)

but! you don't need to fill out bpf_fentry_test[] array from
user-space, just use extern const void variables to get addresses of
those functions:

extern const void bpf_fentry_test1 __ksym;
extern const void bpf_fentry_test2 __ksym;
...
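
and then compare the ip against those addresses directly, roughly
(untested):

	if (ip == (unsigned long) &bpf_fentry_test1)
		*test_result += a == 1;
	else if (ip == (unsigned long) &bpf_fentry_test2)
		*test_result += a == 2 && b == 3;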

>
> casting is extra ;-) wanted to check the actual argument types,
> but probably makes no sense

probably doesn't given you already declared it u64 and use integer
values for comparison

>
> will check
>
> >
> > > +}
> > > +
> >
> > [...]
> >
> > > diff --git a/tools/testing/selftests/bpf/progs/fentry_multi_test.c b/tools/testing/selftests/bpf/progs/fentry_multi_test.c
> > > new file mode 100644
> > > index 000000000000..a443fc958e5a
> > > --- /dev/null
> > > +++ b/tools/testing/selftests/bpf/progs/fentry_multi_test.c
> > > @@ -0,0 +1,18 @@
> > > +// SPDX-License-Identifier: GPL-2.0
> > > +#include <linux/bpf.h>
> > > +#include <bpf/bpf_helpers.h>
> > > +#include <bpf/bpf_tracing.h>
> > > +#include "multi_check.h"
> > > +
> > > +char _license[] SEC("license") = "GPL";
> > > +
> > > +unsigned long long bpf_fentry_test[8];
> > > +
> > > +__u64 test_result = 0;
> > > +
> > > +SEC("fentry.multi/bpf_fentry_test*")
> >
> > wait, that's a regexp syntax that libc supports?.. Not .*? We should
> > definitely not provide btf__find_by_pattern_kind() API, I'd like to
> > avoid explaining what flavors of regexps libbpf supports.
>
> ok
>
> thanks,
> jirka
>


* Re: [PATCH 15/19] libbpf: Add support to link multi func tracing program
  2021-06-09 14:17     ` Jiri Olsa
@ 2021-06-10 17:05       ` Andrii Nakryiko
  2021-06-10 20:35         ` Jiri Olsa
  0 siblings, 1 reply; 76+ messages in thread
From: Andrii Nakryiko @ 2021-06-10 17:05 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Wed, Jun 9, 2021 at 7:17 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Tue, Jun 08, 2021 at 10:34:11PM -0700, Andrii Nakryiko wrote:
> > On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
> > >
> > > Adding support to link multi func tracing program
> > > through link_create interface.
> > >
> > > Adding special types for multi func programs:
> > >
> > >   fentry.multi
> > >   fexit.multi
> > >
> > > so you can define multi func programs like:
> > >
> > >   SEC("fentry.multi/bpf_fentry_test*")
> > >   int BPF_PROG(test1, unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f)
> > >
> > > that defines test1 to be attached to bpf_fentry_test* functions,
> > > and able to attach ip and 6 arguments.
> > >
> > > If functions are not specified the program needs to be attached
> > > manually.
> > >
> > > Adding new btf id related fields to bpf_link_create_opts and
> > > bpf_link_create to use them.
> > >
> > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > ---
> > >  tools/lib/bpf/bpf.c    | 11 ++++++-
> > >  tools/lib/bpf/bpf.h    |  4 ++-
> > >  tools/lib/bpf/libbpf.c | 72 ++++++++++++++++++++++++++++++++++++++++++
> > >  3 files changed, 85 insertions(+), 2 deletions(-)
> > >
> > > diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> > > index 86dcac44f32f..da892737b522 100644
> > > --- a/tools/lib/bpf/bpf.c
> > > +++ b/tools/lib/bpf/bpf.c
> > > @@ -674,7 +674,8 @@ int bpf_link_create(int prog_fd, int target_fd,
> > >                     enum bpf_attach_type attach_type,
> > >                     const struct bpf_link_create_opts *opts)
> > >  {
> > > -       __u32 target_btf_id, iter_info_len;
> > > +       __u32 target_btf_id, iter_info_len, multi_btf_ids_cnt;
> > > +       __s32 *multi_btf_ids;
> > >         union bpf_attr attr;
> > >         int fd;
> > >
> > > @@ -687,6 +688,9 @@ int bpf_link_create(int prog_fd, int target_fd,
> > >         if (iter_info_len && target_btf_id)
> >
> > here we check that mutually exclusive options are not specified, we
> > should do the same for multi stuff
>
> right, ok
>
> >
> > >                 return libbpf_err(-EINVAL);
> > >
> > > +       multi_btf_ids = OPTS_GET(opts, multi_btf_ids, 0);
> > > +       multi_btf_ids_cnt = OPTS_GET(opts, multi_btf_ids_cnt, 0);
> > > +
> > >         memset(&attr, 0, sizeof(attr));
> > >         attr.link_create.prog_fd = prog_fd;
> > >         attr.link_create.target_fd = target_fd;
> > > @@ -701,6 +705,11 @@ int bpf_link_create(int prog_fd, int target_fd,
> > >                 attr.link_create.target_btf_id = target_btf_id;
> > >         }
> > >
> > > +       if (multi_btf_ids && multi_btf_ids_cnt) {
> > > +               attr.link_create.multi_btf_ids = (__u64) multi_btf_ids;
> > > +               attr.link_create.multi_btf_ids_cnt = multi_btf_ids_cnt;
> > > +       }
> > > +
> > >         fd = sys_bpf(BPF_LINK_CREATE, &attr, sizeof(attr));
> > >         return libbpf_err_errno(fd);
> > >  }
> > > diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
> > > index 4f758f8f50cd..2f78b6c34765 100644
> > > --- a/tools/lib/bpf/bpf.h
> > > +++ b/tools/lib/bpf/bpf.h
> > > @@ -177,8 +177,10 @@ struct bpf_link_create_opts {
> > >         union bpf_iter_link_info *iter_info;
> > >         __u32 iter_info_len;
> > >         __u32 target_btf_id;
> > > +       __s32 *multi_btf_ids;
> >
> > why ids are __s32?..
>
> hum not sure why I did that.. __u32 then
>
> >
> > > +       __u32 multi_btf_ids_cnt;
> > >  };
> > > -#define bpf_link_create_opts__last_field target_btf_id
> > > +#define bpf_link_create_opts__last_field multi_btf_ids_cnt
> > >
> > >  LIBBPF_API int bpf_link_create(int prog_fd, int target_fd,
> > >                                enum bpf_attach_type attach_type,
> > > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > > index 65f87cc1220c..bd31de3b6a85 100644
> > > --- a/tools/lib/bpf/libbpf.c
> > > +++ b/tools/lib/bpf/libbpf.c
> > > @@ -228,6 +228,7 @@ struct bpf_sec_def {
> > >         bool is_attachable;
> > >         bool is_attach_btf;
> > >         bool is_sleepable;
> > > +       bool is_multi_func;
> > >         attach_fn_t attach_fn;
> > >  };
> > >
> > > @@ -7609,6 +7610,8 @@ __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz,
> > >
> > >                 if (prog->sec_def->is_sleepable)
> > >                         prog->prog_flags |= BPF_F_SLEEPABLE;
> > > +               if (prog->sec_def->is_multi_func)
> > > +                       prog->prog_flags |= BPF_F_MULTI_FUNC;
> > >                 bpf_program__set_type(prog, prog->sec_def->prog_type);
> > >                 bpf_program__set_expected_attach_type(prog,
> > >                                 prog->sec_def->expected_attach_type);
> > > @@ -9070,6 +9073,8 @@ static struct bpf_link *attach_raw_tp(const struct bpf_sec_def *sec,
> > >                                       struct bpf_program *prog);
> > >  static struct bpf_link *attach_trace(const struct bpf_sec_def *sec,
> > >                                      struct bpf_program *prog);
> > > +static struct bpf_link *attach_trace_multi(const struct bpf_sec_def *sec,
> > > +                                          struct bpf_program *prog);
> > >  static struct bpf_link *attach_lsm(const struct bpf_sec_def *sec,
> > >                                    struct bpf_program *prog);
> > >  static struct bpf_link *attach_iter(const struct bpf_sec_def *sec,
> > > @@ -9143,6 +9148,14 @@ static const struct bpf_sec_def section_defs[] = {
> > >                 .attach_fn = attach_iter),
> > >         SEC_DEF("syscall", SYSCALL,
> > >                 .is_sleepable = true),
> > > +       SEC_DEF("fentry.multi/", TRACING,
> > > +               .expected_attach_type = BPF_TRACE_FENTRY,
> >
> > BPF_TRACE_MULTI_FENTRY instead of is_multi stuff everywhere?.. Or a
> > new type of BPF program altogether?
> >
> > > +               .is_multi_func = true,
> > > +               .attach_fn = attach_trace_multi),
> > > +       SEC_DEF("fexit.multi/", TRACING,
> > > +               .expected_attach_type = BPF_TRACE_FEXIT,
> > > +               .is_multi_func = true,
> > > +               .attach_fn = attach_trace_multi),
> > >         BPF_EAPROG_SEC("xdp_devmap/",           BPF_PROG_TYPE_XDP,
> > >                                                 BPF_XDP_DEVMAP),
> > >         BPF_EAPROG_SEC("xdp_cpumap/",           BPF_PROG_TYPE_XDP,
> > > @@ -9584,6 +9597,9 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
> > >         if (!name)
> > >                 return -EINVAL;
> > >
> > > +       if (prog->prog_flags & BPF_F_MULTI_FUNC)
> > > +               return 0;
> > > +
> > >         for (i = 0; i < ARRAY_SIZE(section_defs); i++) {
> > >                 if (!section_defs[i].is_attach_btf)
> > >                         continue;
> > > @@ -10537,6 +10553,62 @@ static struct bpf_link *bpf_program__attach_btf_id(struct bpf_program *prog)
> > >         return (struct bpf_link *)link;
> > >  }
> > >
> > > +static struct bpf_link *bpf_program__attach_multi(struct bpf_program *prog)
> > > +{
> > > +       char *pattern = prog->sec_name + prog->sec_def->len;
> > > +       DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
> > > +       enum bpf_attach_type attach_type;
> > > +       int prog_fd, link_fd, cnt, err;
> > > +       struct bpf_link *link = NULL;
> > > +       __s32 *ids = NULL;
> > > +
> > > +       prog_fd = bpf_program__fd(prog);
> > > +       if (prog_fd < 0) {
> > > +               pr_warn("prog '%s': can't attach before loaded\n", prog->name);
> > > +               return ERR_PTR(-EINVAL);
> > > +       }
> > > +
> > > +       err = bpf_object__load_vmlinux_btf(prog->obj, true);
> > > +       if (err)
> > > +               return ERR_PTR(err);
> > > +
> > > +       cnt = btf__find_by_pattern_kind(prog->obj->btf_vmlinux, pattern,
> > > +                                       BTF_KIND_FUNC, &ids);
> >
> > > I wonder if it would be better to just support simplified glob
> > > patterns like "prefix*", "*suffix", "exactmatch", and "*substring*"?
> > > That should be sufficient for the majority of cases. For the cases where
> > user needs something more nuanced, they can just construct BTF ID list
> > with custom code and do manual attach.
>
> as I wrote earlier the function is just for the purpose of the test,
> and we can always do the manual attach
>
> I don't mind adding that simplified matching you described

I use that in retsnoop and that seems to be simple but flexible enough
for all purposes so far. It matches typical file globbing rules
(with extra limitations, of course), so it's also intuitive.

But I still am not sure about making it a public API, because in a lot
of cases you'll want a list of patterns (both allowing and denying
different patterns), so it should be generalized to something like

btf__find_by_glob_kind(btf, allow_patterns, deny_patterns, ids)

which gets pretty unwieldy. I'd start with telling users to just
iterate BTF on their own and apply whatever custom filtering they
need. For simple cases libbpf will just initially support a simple and
single glob filter declaratively (e.g., SEC("fentry.multi/bpf_*")).
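
FWIW, the matching in retsnoop boils down to something like this
(sketch from memory, not the exact code):

static bool glob_matches(const char *glob, const char *name)
{
	size_t n = strlen(glob), m = strlen(name);

	if (n >= 2 && glob[0] == '*' && glob[n - 1] == '*') {
		/* *substr*: match substring anywhere in the name */
		char *sub = strndup(glob + 1, n - 2);
		bool found = sub && strstr(name, sub);

		free(sub);
		return found;
	}
	if (n >= 1 && glob[0] == '*')
		/* *suffix: match the tail of the name */
		return m >= n - 1 &&
		       strcmp(name + m - (n - 1), glob + 1) == 0;
	if (n >= 1 && glob[n - 1] == '*')
		/* prefix*: match the head of the name */
		return strncmp(name, glob, n - 1) == 0;
	/* no wildcard: exact match */
	return strcmp(name, glob) == 0;
}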


>
> jirka
>
> >
> > > +       if (cnt <= 0)
> > > +               return ERR_PTR(-EINVAL);
> > > +
> > > +       link = calloc(1, sizeof(*link));
> > > +       if (!link) {
> > > +               err = -ENOMEM;
> > > +               goto out_err;
> > > +       }
> > > +       link->detach = &bpf_link__detach_fd;
> > > +
> > > +       opts.multi_btf_ids = ids;
> > > +       opts.multi_btf_ids_cnt = cnt;
> > > +
> > > +       attach_type = bpf_program__get_expected_attach_type(prog);
> > > +       link_fd = bpf_link_create(prog_fd, 0, attach_type, &opts);
> > > +       if (link_fd < 0) {
> > > +               err = -errno;
> > > +               goto out_err;
> > > +       }
> > > +       link->fd = link_fd;
> > > +       free(ids);
> > > +       return link;
> > > +
> > > +out_err:
> > > +       free(link);
> > > +       free(ids);
> > > +       return ERR_PTR(err);
> > > +}
> > > +
> > > +static struct bpf_link *attach_trace_multi(const struct bpf_sec_def *sec,
> > > +                                          struct bpf_program *prog)
> > > +{
> > > +       return bpf_program__attach_multi(prog);
> > > +}
> > > +
> > >  struct bpf_link *bpf_program__attach_trace(struct bpf_program *prog)
> > >  {
> > >         return bpf_program__attach_btf_id(prog);
> > > --
> > > 2.31.1
> > >
> >
>


* Re: [PATCH 16/19] selftests/bpf: Add fentry multi func test
  2021-06-10 17:00       ` Andrii Nakryiko
@ 2021-06-10 20:28         ` Jiri Olsa
  0 siblings, 0 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-10 20:28 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Thu, Jun 10, 2021 at 10:00:34AM -0700, Andrii Nakryiko wrote:
> On Wed, Jun 9, 2021 at 7:29 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Tue, Jun 08, 2021 at 10:40:24PM -0700, Andrii Nakryiko wrote:
> > > On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
> > > >
> > > > Adding selftest for fentry multi func test that attaches
> > > > to bpf_fentry_test* functions and checks argument values
> > > > based on the processed function.
> > > >
> > > > Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> > > > ---
> > > >  tools/testing/selftests/bpf/multi_check.h     | 52 +++++++++++++++++++
> > > >  .../bpf/prog_tests/fentry_multi_test.c        | 43 +++++++++++++++
> > > >  .../selftests/bpf/progs/fentry_multi_test.c   | 18 +++++++
> > > >  3 files changed, 113 insertions(+)
> > > >  create mode 100644 tools/testing/selftests/bpf/multi_check.h
> > > >  create mode 100644 tools/testing/selftests/bpf/prog_tests/fentry_multi_test.c
> > > >  create mode 100644 tools/testing/selftests/bpf/progs/fentry_multi_test.c
> > > >
> > > > diff --git a/tools/testing/selftests/bpf/multi_check.h b/tools/testing/selftests/bpf/multi_check.h
> > > > new file mode 100644
> > > > index 000000000000..36c2a93f9be3
> > > > --- /dev/null
> > > > +++ b/tools/testing/selftests/bpf/multi_check.h
> > >
> > > we have a proper static linking now, we don't have to use header
> > > inclusion hacks, let's do this properly?
> >
> > ok, will change
> >
> > >
> > > > @@ -0,0 +1,52 @@
> > > > +/* SPDX-License-Identifier: GPL-2.0 */
> > > > +
> > > > +#ifndef __MULTI_CHECK_H
> > > > +#define __MULTI_CHECK_H
> > > > +
> > > > +extern unsigned long long bpf_fentry_test[8];
> > > > +
> > > > +static __attribute__((unused)) inline
> > > > +void multi_arg_check(unsigned long ip, __u64 a, __u64 b, __u64 c, __u64 d, __u64 e, __u64 f, __u64 *test_result)
> > > > +{
> > > > +       if (ip == bpf_fentry_test[0]) {
> > > > +               *test_result += (int) a == 1;
> > > > +       } else if (ip == bpf_fentry_test[1]) {
> > > > +               *test_result += (int) a == 2 && (__u64) b == 3;
> > > > +       } else if (ip == bpf_fentry_test[2]) {
> > > > +               *test_result += (char) a == 4 && (int) b == 5 && (__u64) c == 6;
> > > > +       } else if (ip == bpf_fentry_test[3]) {
> > > > +               *test_result += (void *) a == (void *) 7 && (char) b == 8 && (int) c == 9 && (__u64) d == 10;
> > > > +       } else if (ip == bpf_fentry_test[4]) {
> > > > +               *test_result += (__u64) a == 11 && (void *) b == (void *) 12 && (short) c == 13 && (int) d == 14 && (__u64) e == 15;
> > > > +       } else if (ip == bpf_fentry_test[5]) {
> > > > +               *test_result += (__u64) a == 16 && (void *) b == (void *) 17 && (short) c == 18 && (int) d == 19 && (void *) e == (void *) 20 && (__u64) f == 21;
> > > > +       } else if (ip == bpf_fentry_test[6]) {
> > > > +               *test_result += 1;
> > > > +       } else if (ip == bpf_fentry_test[7]) {
> > > > +               *test_result += 1;
> > > > +       }
> > >
> > > why not use switch? and why the casting?
> >
> > hum, for switch I'd need constants right?
> 
> doh, of course :)
> 
> but! you don't need to fill out bpf_fentry_test[] array from
> user-space, just use extern const void variables to get addresses of
> those functions:
> 
> extern const void bpf_fentry_test1 __ksym;
> extern const void bpf_fentry_test2 __ksym;
> ...

nice, will use that

jirka



* Re: [PATCH 15/19] libbpf: Add support to link multi func tracing program
  2021-06-10 17:05       ` Andrii Nakryiko
@ 2021-06-10 20:35         ` Jiri Olsa
  0 siblings, 0 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-10 20:35 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Thu, Jun 10, 2021 at 10:05:39AM -0700, Andrii Nakryiko wrote:

SNIP

> > > > +static struct bpf_link *bpf_program__attach_multi(struct bpf_program *prog)
> > > > +{
> > > > +       char *pattern = prog->sec_name + prog->sec_def->len;
> > > > +       DECLARE_LIBBPF_OPTS(bpf_link_create_opts, opts);
> > > > +       enum bpf_attach_type attach_type;
> > > > +       int prog_fd, link_fd, cnt, err;
> > > > +       struct bpf_link *link = NULL;
> > > > +       __s32 *ids = NULL;
> > > > +
> > > > +       prog_fd = bpf_program__fd(prog);
> > > > +       if (prog_fd < 0) {
> > > > +               pr_warn("prog '%s': can't attach before loaded\n", prog->name);
> > > > +               return ERR_PTR(-EINVAL);
> > > > +       }
> > > > +
> > > > +       err = bpf_object__load_vmlinux_btf(prog->obj, true);
> > > > +       if (err)
> > > > +               return ERR_PTR(err);
> > > > +
> > > > +       cnt = btf__find_by_pattern_kind(prog->obj->btf_vmlinux, pattern,
> > > > +                                       BTF_KIND_FUNC, &ids);
> > >
> > > I wonder if it would be better to just support simplified glob
> > > patterns like "prefix*", "*suffix", "exactmatch", and "*substring*"?
> > > That should be sufficient for the majority of cases. For the cases where
> > > user needs something more nuanced, they can just construct BTF ID list
> > > with custom code and do manual attach.
> >
> > as I wrote earlier the function is just for the purpose of the test,
> > and we can always do the manual attach
> >
> > I don't mind adding that simplified matching you described
> 
> I use that in retsnoop and that seems to be simple but flexible enough
> for all purposes so far. It matches typical file globbing rules
> (with extra limitations, of course), so it's also intuitive.
> 
> But I still am not sure about making it a public API, because in a lot
> of cases you'll want a list of patterns (both allowing and denying
> different patterns), so it should be generalized to something like
> 
> btf__find_by_glob_kind(btf, allow_patterns, deny_patterns, ids)
> 
> which gets pretty unwieldy. I'd start with telling users to just
> iterate BTF on their own and apply whatever custom filtering they
> need. For simple cases libbpf will just initially support a simple and
> single glob filter declaratively (e.g., SEC("fentry.multi/bpf_*")).

ok, I'll scan retsnoop and see what I can steal ;-)

jirka



* Re: [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach
  2021-06-05 11:10 [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach Jiri Olsa
                   ` (18 preceding siblings ...)
  2021-06-05 11:10 ` [PATCH 19/19] selftests/bpf: Temporary fix for fentry_fexit_multi_test Jiri Olsa
@ 2021-06-17 20:29 ` Andrii Nakryiko
  2021-06-19  8:33   ` Jiri Olsa
  19 siblings, 1 reply; 76+ messages in thread
From: Andrii Nakryiko @ 2021-06-17 20:29 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
>
> hi,
> saga continues.. ;-) previous post is in here [1]
>
> After another discussion with Steven, he mentioned that if we fix
> the ftrace graph problem with direct functions, he'd be open to
> add batch interface for direct ftrace functions.
>
> > He already had a proof of concept fix for that, which I took and broke
> > up into several changes. I added the ftrace direct batch interface
> > and the new bpf interface on top of that.
>
> It's not so many patches after all, so I thought having them all
> together will help the review, because they are all connected.
> However I can break this up into separate patchsets if necessary.
>
> This patchset contains:
>
>   1) patches (1-4) that fix the ftrace graph tracing over the function
>      with direct trampolines attached
>   2) patches (5-8) that add batch interface for ftrace direct function
>      register/unregister/modify
>   3) patches (9-19) that add support to attach BPF program to multiple
>      functions
>
> > In a nutshell:
>
> Ad 1) moves the graph tracing setup before the direct trampoline
> prepares the stack, so they don't clash
>
> Ad 2) uses ftrace_ops interface to register direct function with
> all functions in ftrace_ops filter.
>
> Ad 3) creates special program and trampoline type to allow attachment
> of multiple functions to single program.
>
> > There're more detailed descriptions in related changelogs.
>
> > I have working bpftrace multi attachment code on top of this. I briefly
> checked retsnoop and I think it could use the new API as well.

Ok, so I had a bit of time and enthusiasm to try that with retsnoop.
The ugly code is at [0] if you'd like to see what kind of changes I
needed to make to use this (it won't work if you check it out because
it needs your libbpf changes synced into submodule, which I only did
locally). But here are some learnings from that experiment both to
emphasize how important it is to make this work and how restrictive
are some of the current limitations.

First, good news. Using this mass-attach API to attach to almost 1000
kernel functions goes from

Plain fentry/fexit:
===================
real    0m27.321s
user    0m0.352s
sys     0m20.919s

to

Mass-attach fentry/fexit:
=========================
real    0m2.728s
user    0m0.329s
sys     0m2.380s

It's a 10x speed up. And a good chunk of those 2.7 seconds is in some
preparatory steps not related to fentry/fexit stuff.

It's not exactly apples-to-apples, though, because the limitations you
have right now prevent attaching both fentry and fexit programs to
the same set of kernel functions. This makes it pretty useless for a
lot of cases, in particular for retsnoop. So I haven't really tested
retsnoop end-to-end, I only verified that I do see fentries triggered,
but can't have matching fexits. So the speed-up might be smaller due
to additional fexit mass-attach (once that is allowed), but it's still
a massive difference. So we absolutely need to get this optimization
in.

Few more thoughts, if you'd like to plan some more work ahead ;)

1. We need similar mass-attach functionality for kprobe/kretprobe, as
there are use cases where kprobes are more useful than fentry (e.g., >6
args funcs, or funcs with input arguments that are not supported by
BPF verifier, like struct-by-value). It's not clear how to best
represent this, given currently we attach kprobe through perf_event,
but we'll need to think about this for sure.

2. To make mass-attach fentry/fexit useful for practical purposes, it
would be really great to have an ability to fetch traced function's
IP. I.e., if we fentry/fexit func kern_func_abc, bpf_get_func_ip()
would return the IP of that function, matching the one in
/proc/kallsyms. Right now I do very brittle hacks to do that.
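
Ideally the program side would be as simple as this (hypothetical
sketch, such a helper doesn't exist yet):

SEC("fentry.multi/bpf_fentry_test*")
int BPF_PROG(test, __u64 a)
{
	/* hypothetical: would match the address in /proc/kallsyms */
	unsigned long ip = bpf_get_func_ip(ctx);

	bpf_printk("entered %lx", ip);
	return 0;
}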

So all-in-all, super excited about this, but I hope all those issues
are addressed to make retsnoop possible and fast.

  [0] https://github.com/anakryiko/retsnoop/commit/8a07bc4d8c47d025f755c108f92f0583e3fda6d8

>
>
> Also available at:
>   https://git.kernel.org/pub/scm/linux/kernel/git/jolsa/perf.git
>   bpf/batch
>
> thanks,
> jirka
>
>
> [1] https://lore.kernel.org/bpf/20210413121516.1467989-1-jolsa@kernel.org/
>
> ---
> Jiri Olsa (17):
>       x86/ftrace: Remove extra orig rax move
>       tracing: Add trampoline/graph selftest
>       ftrace: Add ftrace_add_rec_direct function
>       ftrace: Add multi direct register/unregister interface
>       ftrace: Add multi direct modify interface
>       ftrace/samples: Add multi direct interface test module
>       bpf, x64: Allow to use caller address from stack
>       bpf: Allow to store caller's ip as argument
>       bpf: Add support to load multi func tracing program
>       bpf: Add bpf_trampoline_alloc function
>       bpf: Add support to link multi func tracing program
>       libbpf: Add btf__find_by_pattern_kind function
>       libbpf: Add support to link multi func tracing program
>       selftests/bpf: Add fentry multi func test
>       selftests/bpf: Add fexit multi func test
>       selftests/bpf: Add fentry/fexit multi func test
>       selftests/bpf: Temporary fix for fentry_fexit_multi_test
>
> Steven Rostedt (VMware) (2):
>       x86/ftrace: Remove fault protection code in prepare_ftrace_return
>       x86/ftrace: Make function graph use ftrace directly
>

[...]


* Re: [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach
  2021-06-17 20:29 ` [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach Andrii Nakryiko
@ 2021-06-19  8:33   ` Jiri Olsa
  2021-06-19 16:19     ` Yonghong Song
  2021-06-21  6:50     ` Andrii Nakryiko
  0 siblings, 2 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-06-19  8:33 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Thu, Jun 17, 2021 at 01:29:45PM -0700, Andrii Nakryiko wrote:
> On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
> >
> > hi,
> > saga continues.. ;-) previous post is in here [1]
> >
> > After another discussion with Steven, he mentioned that if we fix
> > the ftrace graph problem with direct functions, he'd be open to
> > add batch interface for direct ftrace functions.
> >
> > > He already had a proof of concept fix for that, which I took and broke
> > > up into several changes. I added the ftrace direct batch interface
> > > and the new bpf interface on top of that.
> >
> > It's not so many patches after all, so I thought having them all
> > together will help the review, because they are all connected.
> > However I can break this up into separate patchsets if necessary.
> >
> > This patchset contains:
> >
> >   1) patches (1-4) that fix the ftrace graph tracing over the function
> >      with direct trampolines attached
> >   2) patches (5-8) that add batch interface for ftrace direct function
> >      register/unregister/modify
> >   3) patches (9-19) that add support to attach BPF program to multiple
> >      functions
> >
> > > In a nutshell:
> >
> > Ad 1) moves the graph tracing setup before the direct trampoline
> > prepares the stack, so they don't clash
> >
> > Ad 2) uses ftrace_ops interface to register direct function with
> > all functions in ftrace_ops filter.
> >
> > Ad 3) creates special program and trampoline type to allow attachment
> > of multiple functions to single program.
> >
> > > There're more detailed descriptions in related changelogs.
> >
> > > I have working bpftrace multi attachment code on top of this. I briefly
> > checked retsnoop and I think it could use the new API as well.
> 
> Ok, so I had a bit of time and enthusiasm to try that with retsnoop.
> The ugly code is at [0] if you'd like to see what kind of changes I
> needed to make to use this (it won't work if you check it out because
> it needs your libbpf changes synced into submodule, which I only did
> locally). But here are some learnings from that experiment both to
> emphasize how important it is to make this work and how restrictive
> are some of the current limitations.
> 
> First, good news. Using this mass-attach API to attach to almost 1000
> kernel functions goes from
> 
> Plain fentry/fexit:
> ===================
> real    0m27.321s
> user    0m0.352s
> sys     0m20.919s
> 
> to
> 
> Mass-attach fentry/fexit:
> =========================
> real    0m2.728s
> user    0m0.329s
> sys     0m2.380s

I did not measure the bpftrace speedup, because the new code
attached instantly ;-)

> 
> It's a 10x speed up. And a good chunk of those 2.7 seconds is in some
> preparatory steps not related to fentry/fexit stuff.
> 
> It's not exactly apples-to-apples, though, because the limitations you
> have right now prevent attaching both fentry and fexit programs to
> the same set of kernel functions. This makes it pretty useless for a

hum, you could do link_update with fexit program on the link fd,
like in the selftest, right?
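
something like (untested sketch):

	DECLARE_LIBBPF_OPTS(bpf_link_update_opts, opts);

	/* add the fexit program to the link created with the fentry one */
	err = bpf_link_update(bpf_link__fd(link),
			      bpf_program__fd(fexit_prog), &opts);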

> lot of cases, in particular for retsnoop. So I haven't really tested
> retsnoop end-to-end, I only verified that I do see fentries triggered,
> but can't have matching fexits. So the speed-up might be smaller due
> to additional fexit mass-attach (once that is allowed), but it's still
> a massive difference. So we absolutely need to get this optimization
> in.
> 
> Few more thoughts, if you'd like to plan some more work ahead ;)
> 
> 1. We need similar mass-attach functionality for kprobe/kretprobe, as
> there are use cases where kprobes are more useful than fentry (e.g., >6
> args funcs, or funcs with input arguments that are not supported by
> BPF verifier, like struct-by-value). It's not clear how to best
> represent this, given currently we attach kprobe through perf_event,
> but we'll need to think about this for sure.

I'm fighting with the '2 trampolines concept' at the moment, but the
mass attach for kprobes seems interesting ;-) will check

> 
> 2. To make mass-attach fentry/fexit useful for practical purposes, it
> would be really great to have an ability to fetch traced function's
> IP. I.e., if we fentry/fexit func kern_func_abc, bpf_get_func_ip()
> would return the IP of that function, matching the one in
> /proc/kallsyms. Right now I do very brittle hacks to do that.

so I hoped that we could always store the ip at ctx-8 and have
the bpf_get_func_ip helper access that, but the BPF_PROG
macro does not pass the ctx value to the program, just the args

we could perhaps somehow store the ctx in BPF_PROG before calling
the bpf program, but I did not get to try that yet

> 
> So all-in-all, super excited about this, but I hope all those issues
> are addressed to make retsnoop possible and fast.
> 
>   [0] https://github.com/anakryiko/retsnoop/commit/8a07bc4d8c47d025f755c108f92f0583e3fda6d8

thanks for checking on this,
jirka



* Re: [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach
  2021-06-19  8:33   ` Jiri Olsa
@ 2021-06-19 16:19     ` Yonghong Song
  2021-06-19 17:09       ` Jiri Olsa
  2021-06-21  6:50     ` Andrii Nakryiko
  1 sibling, 1 reply; 76+ messages in thread
From: Yonghong Song @ 2021-06-19 16:19 UTC (permalink / raw)
  To: Jiri Olsa, Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik



On 6/19/21 1:33 AM, Jiri Olsa wrote:
> On Thu, Jun 17, 2021 at 01:29:45PM -0700, Andrii Nakryiko wrote:
>> On Sat, Jun 5, 2021 at 4:12 AM Jiri Olsa <jolsa@kernel.org> wrote:
>>>
>>> hi,
>>> saga continues.. ;-) previous post is in here [1]
>>>
>>> After another discussion with Steven, he mentioned that if we fix
>>> the ftrace graph problem with direct functions, he'd be open to
>>> add batch interface for direct ftrace functions.
>>>
>>> He already had a proof of concept fix for that, which I took and broke
>>> up into several changes. I added the ftrace direct batch interface
>>> and the new bpf interface on top of that.
>>>
>>> It's not so many patches after all, so I thought having them all
>>> together will help the review, because they are all connected.
>>> However I can break this up into separate patchsets if necessary.
>>>
>>> This patchset contains:
>>>
>>>    1) patches (1-4) that fix the ftrace graph tracing over the function
>>>       with direct trampolines attached
>>>    2) patches (5-8) that add batch interface for ftrace direct function
>>>       register/unregister/modify
>>>    3) patches (9-19) that add support to attach BPF program to multiple
>>>       functions
>>>
>>> In a nutshell:
>>>
>>> Ad 1) moves the graph tracing setup before the direct trampoline
>>> prepares the stack, so they don't clash
>>>
>>> Ad 2) uses ftrace_ops interface to register direct function with
>>> all functions in ftrace_ops filter.
>>>
>>> Ad 3) creates special program and trampoline type to allow attachment
>>> of multiple functions to single program.
>>>
>>> There're more detailed descriptions in related changelogs.
>>>
>>> I have working bpftrace multi attachment code on top of this. I briefly
>>> checked retsnoop and I think it could use the new API as well.
>>
>> Ok, so I had a bit of time and enthusiasm to try that with retsnoop.
>> The ugly code is at [0] if you'd like to see what kind of changes I
>> needed to make to use this (it won't work if you check it out because
>> it needs your libbpf changes synced into submodule, which I only did
>> locally). But here are some learnings from that experiment both to
>> emphasize how important it is to make this work and how restrictive
>> are some of the current limitations.
>>
>> First, good news. Using this mass-attach API to attach to almost 1000
>> kernel functions goes from
>>
>> Plain fentry/fexit:
>> ===================
>> real    0m27.321s
>> user    0m0.352s
>> sys     0m20.919s
>>
>> to
>>
>> Mass-attach fentry/fexit:
>> =========================
>> real    0m2.728s
>> user    0m0.329s
>> sys     0m2.380s
> 
> I did not measure the bpftrace speedup, because the new code
> attached instantly ;-)
> 
>>
>> It's a 10x speed up. And a good chunk of those 2.7 seconds is in some
>> preparatory steps not related to fentry/fexit stuff.
>>
>> It's not exactly apples-to-apples, though, because the limitations you
>> have right now prevent attaching both fentry and fexit programs to
>> the same set of kernel functions. This makes it pretty useless for a
> 
> hum, you could do link_update with fexit program on the link fd,
> like in the selftest, right?
> 
>> lot of cases, in particular for retsnoop. So I haven't really tested
>> retsnoop end-to-end, I only verified that I do see fentries triggered,
>> but can't have matching fexits. So the speed-up might be smaller due
>> to additional fexit mass-attach (once that is allowed), but it's still
>> a massive difference. So we absolutely need to get this optimization
>> in.
>>
>> Few more thoughts, if you'd like to plan some more work ahead ;)
>>
>> 1. We need similar mass-attach functionality for kprobe/kretprobe, as
>> there are use cases where kprobes are more useful than fentry (e.g., >6
>> args funcs, or funcs with input arguments that are not supported by
>> BPF verifier, like struct-by-value). It's not clear how to best
>> represent this, given currently we attach kprobe through perf_event,
>> but we'll need to think about this for sure.
> 
> I'm fighting with the '2 trampolines concept' at the moment, but the
> mass attach for kprobes seems interesting ;-) will check
> 
>>
>> 2. To make mass-attach fentry/fexit useful for practical purposes, it
>> would be really great to have an ability to fetch the traced function's
>> IP. I.e., if we fentry/fexit func kern_func_abc, bpf_get_func_ip()
>> would return the IP of that function, matching the one in
>> /proc/kallsyms. Right now I do very brittle hacks to do that.
> 
> so I hoped that we could always store ip at ctx-8 and have
> the bpf_get_func_ip helper access that, but the BPF_PROG
> macro does not pass the ctx value to the program, just the args

ctx is passed to the bpf program. You can check the BPF_PROG
macro definition.
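
For reference, this is roughly the shape of BPF_PROG (a trimmed sketch
of tools/lib/bpf/bpf_tracing.h; the casting of the raw ctx words into
the typed args is elided here):

#define BPF_PROG(name, args...)					\
name(unsigned long long *ctx);						\
static __always_inline typeof(name(0))					\
____##name(unsigned long long *ctx, ##args);				\
typeof(name(0)) name(unsigned long long *ctx)				\
{									\
	/* ctx is handed to the body as its hidden first arg */	\
	return ____##name(ctx /* , args cast from ctx[0..] */);	\
}									\
static __always_inline typeof(name(0))					\
____##name(unsigned long long *ctx, ##args)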

> 
> we could perhaps somehow store the ctx in BPF_PROG before calling
> the bpf program, but I did not get to try that yet
> 
>>
>> So all-in-all, super excited about this, but I hope all those issues
>> are addressed to make retsnoop possible and fast.
>>
>>    [0] https://github.com/anakryiko/retsnoop/commit/8a07bc4d8c47d025f755c108f92f0583e3fda6d8
> 
> thanks for checking on this,
> jirka
> 


* Re: [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach
  2021-06-19 16:19     ` Yonghong Song
@ 2021-06-19 17:09       ` Jiri Olsa
  2021-06-20 16:56         ` Yonghong Song
  0 siblings, 1 reply; 76+ messages in thread
From: Jiri Olsa @ 2021-06-19 17:09 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Andrii Nakryiko, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik

On Sat, Jun 19, 2021 at 09:19:57AM -0700, Yonghong Song wrote:
> 
> 
> On 6/19/21 1:33 AM, Jiri Olsa wrote:
> > On Thu, Jun 17, 2021 at 01:29:45PM -0700, Andrii Nakryiko wrote:

SNIP
> > > 
> > > 2. To make mass-attach fentry/fexit useful for practical purposes, it
> > > would be really great to have an ability to fetch the traced function's
> > > IP. I.e., if we fentry/fexit func kern_func_abc, bpf_get_func_ip()
> > > would return the IP of that function, matching the one in
> > > /proc/kallsyms. Right now I do very brittle hacks to do that.
> > 
> > so I hoped that we could always store ip at ctx-8 and have
> > the bpf_get_func_ip helper access that, but the BPF_PROG
> > macro does not pass the ctx value to the program, just the args
> 
> ctx is passed to the bpf program. You can check the BPF_PROG
> macro definition.

ah right, I should have checked it.. so how about we change the
trampoline code to store ip at ctx-8 and make bpf_get_func_ip(ctx)
return [ctx-8]

I'll need to check if it's ok for the tracing helper to take
ctx as an argument
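
something like this perhaps (untested sketch; it assumes the trampoline
stores the traced function's ip right below the args array, at ctx - 8):

BPF_CALL_1(bpf_get_func_ip_tracing, void *, ctx)
{
	/* ip stored by the trampoline at ctx - 8 */
	return ((u64 *)ctx)[-1];
}

static const struct bpf_func_proto bpf_get_func_ip_proto_tracing = {
	.func		= bpf_get_func_ip_tracing,
	.gpl_only	= true,
	.ret_type	= RET_INTEGER,
	.arg1_type	= ARG_PTR_TO_CTX,
};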

thanks,
jirka



* Re: [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach
  2021-06-19 17:09       ` Jiri Olsa
@ 2021-06-20 16:56         ` Yonghong Song
  2021-06-20 17:47           ` Alexei Starovoitov
  0 siblings, 1 reply; 76+ messages in thread
From: Yonghong Song @ 2021-06-20 16:56 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Andrii Nakryiko, Jiri Olsa, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik



On 6/19/21 10:09 AM, Jiri Olsa wrote:
> On Sat, Jun 19, 2021 at 09:19:57AM -0700, Yonghong Song wrote:
>>
>>
>> On 6/19/21 1:33 AM, Jiri Olsa wrote:

SNIP
>>>>
>>>> 2. To make mass-attach fentry/fexit useful for practical purposes, it
>>>> would be really great to have an ability to fetch the traced function's
>>>> IP. I.e., if we fentry/fexit func kern_func_abc, bpf_get_func_ip()
>>>> would return the IP of that function, matching the one in
>>>> /proc/kallsyms. Right now I do very brittle hacks to do that.
>>>
>>> so I hoped that we could always store ip at ctx-8 and have
>>> the bpf_get_func_ip helper access that, but the BPF_PROG
>>> macro does not pass the ctx value to the program, just the args
>>
>> ctx is passed to the bpf program. You can check the BPF_PROG
>> macro definition.
> 
> ah right, I should have checked it.. so how about we change the
> trampoline code to store ip at ctx-8 and make bpf_get_func_ip(ctx)
> return [ctx-8]

This should work. Thanks!

> 
> I'll need to check if it's ok for the tracing helper to take
> ctx as an argument
> 
> thanks,
> jirka
> 


* Re: [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach
  2021-06-20 16:56         ` Yonghong Song
@ 2021-06-20 17:47           ` Alexei Starovoitov
  2021-06-21  6:46             ` Andrii Nakryiko
  0 siblings, 1 reply; 76+ messages in thread
From: Alexei Starovoitov @ 2021-06-20 17:47 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Jiri Olsa, Andrii Nakryiko, Jiri Olsa, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik

On Sun, Jun 20, 2021 at 9:57 AM Yonghong Song <yhs@fb.com> wrote:
> >
> > ah right, I should have checked it.. so how about we change the
> > trampoline code to store ip at ctx-8 and make bpf_get_func_ip(ctx)
> > return [ctx-8]
>
> This should work. Thanks!

+1
and pls make it always inline into a single LDX insn in the verifier.
For both mass attach and normal fentry/fexit.
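
i.e. roughly this kind of fixup in do_misc_fixups() (just a sketch,
again assuming ip sits at ctx - 8):

	if (insn->imm == BPF_FUNC_get_func_ip) {
		/* R1 is ctx; replace the call with R0 = *(u64 *)(ctx - 8) */
		insn_buf[0] = BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_1, -8);
		new_prog = bpf_patch_insn_data(env, i + delta, insn_buf, 1);
		if (!new_prog)
			return -ENOMEM;
		env->prog = prog = new_prog;
		insn = new_prog->insnsi + i + delta;
		continue;
	}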


* Re: [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach
  2021-06-20 17:47           ` Alexei Starovoitov
@ 2021-06-21  6:46             ` Andrii Nakryiko
  0 siblings, 0 replies; 76+ messages in thread
From: Andrii Nakryiko @ 2021-06-21  6:46 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Yonghong Song, Jiri Olsa, Jiri Olsa, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, John Fastabend,
	KP Singh, Daniel Xu, Viktor Malik

On Sun, Jun 20, 2021 at 8:47 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Sun, Jun 20, 2021 at 9:57 AM Yonghong Song <yhs@fb.com> wrote:
> > >
> > > ah right, I should have checked it.. so how about we change the
> > > trampoline code to store ip at ctx-8 and make bpf_get_func_ip(ctx)
> > > return [ctx-8]
> >
> > This should work. Thanks!
>
> +1
> and pls make it always inline into single LDX insn in the verifier.
> For both mass attach and normal fentry/fexit.

Yep.

And we should do it for kprobes (trivial, PT_REGS_IP(ctx)) and
kretprobes (less trivial, but simple from inside the kernel; Masami
showed how to do it in one of the previous emails). I hope the BPF infra
allows inlining of helpers for some program types but not others.
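
For the kprobe flavor the helper side could be as simple as this
(a sketch; it assumes a separate func proto is wired up per program
type):

BPF_CALL_1(bpf_get_func_ip_kprobe, struct pt_regs *, regs)
{
	struct kprobe *kp = kprobe_running();

	/* address the kprobe is installed on, as listed in kallsyms */
	return kp ? (uintptr_t)kp->addr : 0;
}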


* Re: [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach
  2021-06-19  8:33   ` Jiri Olsa
  2021-06-19 16:19     ` Yonghong Song
@ 2021-06-21  6:50     ` Andrii Nakryiko
  2021-07-06 20:26       ` Andrii Nakryiko
  1 sibling, 1 reply; 76+ messages in thread
From: Andrii Nakryiko @ 2021-06-21  6:50 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Sat, Jun 19, 2021 at 11:33 AM Jiri Olsa <jolsa@redhat.com> wrote:
>
> On Thu, Jun 17, 2021 at 01:29:45PM -0700, Andrii Nakryiko wrote:

SNIP

> > It's not exactly apples-to-apples, though, because the limitations you
> > have right now prevent attaching both fentry and fexit programs to
> > the same set of kernel functions. This makes it pretty useless for a
>
> hum, you could do link_update with the fexit program on the link fd,
> like in the selftest, right?

Hm... I didn't realize we could attach two different prog FDs to the
same link, honestly (and was too lazy to look through the selftests
again). I can try that later. But it's actually quite a
counter-intuitive API (I honestly assumed that link_update can be used
to add more BTF IDs, but not to change prog_fd). Previously a bpf_link was
always associated with a single BPF prog FD. It would be good to keep
that property in the final version, but we can get back to that later.
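
For reference, the libbpf API in question (tools/lib/bpf/bpf.h) takes a
new prog fd rather than additional attach targets:

LIBBPF_API int bpf_link_update(int link_fd, int new_prog_fd,
			       const struct bpf_link_update_opts *opts);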

>
> > lot of cases, in particular for retsnoop. So I haven't really tested
> > retsnoop end-to-end, I only verified that I do see fentries triggered,
> > but can't have matching fexits. So the speed-up might be smaller due
> > to additional fexit mass-attach (once that is allowed), but it's still
> > a massive difference. So we absolutely need to get this optimization
> > in.
> >
SNIP


* Re: [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach
  2021-06-21  6:50     ` Andrii Nakryiko
@ 2021-07-06 20:26       ` Andrii Nakryiko
  2021-07-07 15:19         ` Jiri Olsa
  0 siblings, 1 reply; 76+ messages in thread
From: Andrii Nakryiko @ 2021-07-06 20:26 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Sun, Jun 20, 2021 at 11:50 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Sat, Jun 19, 2021 at 11:33 AM Jiri Olsa <jolsa@redhat.com> wrote:
> >
> > On Thu, Jun 17, 2021 at 01:29:45PM -0700, Andrii Nakryiko wrote:

SNIP
> > >
> > > It's not exactly apples-to-apples, though, because the limitations you
> > > have right now prevent attaching both fentry and fexit programs to
> > > the same set of kernel functions. This makes it pretty useless for a
> >
> > hum, you could do link_update with the fexit program on the link fd,
> > like in the selftest, right?
>
> Hm... I didn't realize we could attach two different prog FDs to the
> same link, honestly (and was too lazy to look through the selftests
> again). I can try that later. But it's actually quite a
> counter-intuitive API (I honestly assumed that link_update can be used
> to add more BTF IDs, but not to change prog_fd). Previously a bpf_link was
> always associated with a single BPF prog FD. It would be good to keep
> that property in the final version, but we can get back to that later.

Ok, I'm back from PTO and as a warm-up did a two-line change to make
retsnoop work end-to-end using this bpf_link_update() approach. See
[0]. I still think it's a completely confusing API to use
bpf_link_update() to get both fentry and fexit, but it worked for
this experiment.
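
The change amounts to roughly this (sketch; the skeleton and link
variable names here are illustrative, not retsnoop's actual ones):

	/* reuse the fentry multi-attach link's fd for the fexit prog */
	err = bpf_link_update(bpf_link__fd(fentry_link),
			      bpf_program__fd(skel->progs.fexit_main), NULL);
	if (err)
		return err;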

BTW, adding ~900 fexit attachments is barely noticeable, which is
great; it means that attachment is instantaneous.

real    0m2.739s
user    0m0.351s
sys     0m2.370s

  [0] https://github.com/anakryiko/retsnoop/commit/c915d729d6e98f83601e432e61cb1bdf476ceefb

SNIP


* Re: [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach
  2021-07-06 20:26       ` Andrii Nakryiko
@ 2021-07-07 15:19         ` Jiri Olsa
  0 siblings, 0 replies; 76+ messages in thread
From: Jiri Olsa @ 2021-07-07 15:19 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Jiri Olsa, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Steven Rostedt (VMware),
	Networking, bpf, Martin KaFai Lau, Song Liu, Yonghong Song,
	John Fastabend, KP Singh, Daniel Xu, Viktor Malik

On Tue, Jul 06, 2021 at 01:26:46PM -0700, Andrii Nakryiko wrote:

SNIP

> > > >
> > > > It's a 10x speed up. And a good chunk of those 2.7 seconds is in some
> > > > preparatory steps not related to fentry/fexit stuff.
> > > >
> > > > It's not exactly apples-to-apples, though, because the limitations you
> > > > have right now prevent attaching both fentry and fexit programs to
> > > > the same set of kernel functions. This makes it pretty useless for a
> > >
> > > hum, you could do link_update with the fexit program on the link fd,
> > > like in the selftest, right?
> >
> > Hm... I didn't realize we could attach two different prog FDs to the
> > same link, honestly (and was too lazy to look through the selftests
> > again). I can try that later. But it's actually quite a
> > counter-intuitive API (I honestly assumed that link_update can be used
> > to add more BTF IDs, but not to change prog_fd). Previously a bpf_link was
> > always associated with a single BPF prog FD. It would be good to keep
> > that property in the final version, but we can get back to that later.
> 
> Ok, I'm back from PTO and as a warm-up did a two-line change to make
> retsnoop work end-to-end using this bpf_link_update() approach. See
> [0]. I still think it's a completely confusing API to use
> bpf_link_update() to get both fentry and fexit, but it worked for
> this experiment.

we need the same set of functions, and we have the 'fd' representing
that ;-) but that could hopefully go away with the new approach

> 
> BTW, adding ~900 fexit attachments is barely noticeable, which is
> great; it means that attachment is instantaneous.

right, I see a similarly unnoticeable time in bpftrace as well
thanks for testing that,

jirka

> 
> real    0m2.739s
> user    0m0.351s
> sys     0m2.370s
> 
>   [0] https://github.com/anakryiko/retsnoop/commit/c915d729d6e98f83601e432e61cb1bdf476ceefb
> 

SNIP



end of thread, other threads:[~2021-07-07 15:19 UTC | newest]

Thread overview: 76+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-06-05 11:10 [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach Jiri Olsa
2021-06-05 11:10 ` [PATCH 01/19] x86/ftrace: Remove extra orig rax move Jiri Olsa
2021-06-05 11:10 ` [PATCH 02/19] x86/ftrace: Remove fault protection code in prepare_ftrace_return Jiri Olsa
2021-06-05 11:10 ` [PATCH 03/19] x86/ftrace: Make function graph use ftrace directly Jiri Olsa
2021-06-08 18:35   ` Andrii Nakryiko
2021-06-08 18:51     ` Jiri Olsa
2021-06-08 19:11       ` Steven Rostedt
2021-06-05 11:10 ` [PATCH 04/19] tracing: Add trampoline/graph selftest Jiri Olsa
2021-06-05 11:10 ` [PATCH 05/19] ftrace: Add ftrace_add_rec_direct function Jiri Olsa
2021-06-05 11:10 ` [PATCH 06/19] ftrace: Add multi direct register/unregister interface Jiri Olsa
2021-06-05 11:10 ` [PATCH 07/19] ftrace: Add multi direct modify interface Jiri Olsa
2021-06-05 11:10 ` [PATCH 08/19] ftrace/samples: Add multi direct interface test module Jiri Olsa
2021-06-05 11:10 ` [PATCH 09/19] bpf, x64: Allow to use caller address from stack Jiri Olsa
2021-06-07  3:07   ` Yonghong Song
2021-06-07 18:13     ` Jiri Olsa
2021-06-05 11:10 ` [PATCH 10/19] bpf: Allow to store caller's ip as argument Jiri Olsa
2021-06-07  3:21   ` Yonghong Song
2021-06-07 18:15     ` Jiri Olsa
2021-06-08 18:49   ` Andrii Nakryiko
2021-06-08 20:58     ` Jiri Olsa
2021-06-08 21:02       ` Andrii Nakryiko
2021-06-08 21:11         ` Jiri Olsa
2021-06-05 11:10 ` [PATCH 11/19] bpf: Add support to load multi func tracing program Jiri Olsa
2021-06-07  3:56   ` Yonghong Song
2021-06-07 18:18     ` Jiri Olsa
2021-06-07 19:35       ` Yonghong Song
2021-06-05 11:10 ` [PATCH 12/19] bpf: Add bpf_trampoline_alloc function Jiri Olsa
2021-06-05 11:10 ` [PATCH 13/19] bpf: Add support to link multi func tracing program Jiri Olsa
2021-06-07  5:36   ` Yonghong Song
2021-06-07 18:25     ` Jiri Olsa
2021-06-07 19:39       ` Yonghong Song
2021-06-08 15:42   ` Alexei Starovoitov
2021-06-08 18:17     ` Jiri Olsa
2021-06-08 18:49       ` Alexei Starovoitov
2021-06-08 21:07         ` Jiri Olsa
2021-06-08 23:05           ` Alexei Starovoitov
2021-06-09  5:08             ` Andrii Nakryiko
2021-06-09 13:42               ` Jiri Olsa
2021-06-09 13:33             ` Jiri Olsa
2021-06-09  5:18   ` Andrii Nakryiko
2021-06-09 13:53     ` Jiri Olsa
2021-06-05 11:10 ` [PATCH 14/19] libbpf: Add btf__find_by_pattern_kind function Jiri Olsa
2021-06-09  5:29   ` Andrii Nakryiko
2021-06-09 13:59     ` Jiri Olsa
2021-06-09 14:19       ` Jiri Olsa
2021-06-05 11:10 ` [PATCH 15/19] libbpf: Add support to link multi func tracing program Jiri Olsa
2021-06-07  5:49   ` Yonghong Song
2021-06-07 18:28     ` Jiri Olsa
2021-06-07 19:42       ` Yonghong Song
2021-06-07 20:11         ` Jiri Olsa
2021-06-09  5:34   ` Andrii Nakryiko
2021-06-09 14:17     ` Jiri Olsa
2021-06-10 17:05       ` Andrii Nakryiko
2021-06-10 20:35         ` Jiri Olsa
2021-06-05 11:10 ` [PATCH 16/19] selftests/bpf: Add fentry multi func test Jiri Olsa
2021-06-07  6:06   ` Yonghong Song
2021-06-07 18:42     ` Jiri Olsa
2021-06-09  5:40   ` Andrii Nakryiko
2021-06-09 14:29     ` Jiri Olsa
2021-06-10 17:00       ` Andrii Nakryiko
2021-06-10 20:28         ` Jiri Olsa
2021-06-05 11:10 ` [PATCH 17/19] selftests/bpf: Add fexit " Jiri Olsa
2021-06-05 11:10 ` [PATCH 18/19] selftests/bpf: Add fentry/fexit " Jiri Olsa
2021-06-09  5:41   ` Andrii Nakryiko
2021-06-09 14:29     ` Jiri Olsa
2021-06-05 11:10 ` [PATCH 19/19] selftests/bpf: Temporary fix for fentry_fexit_multi_test Jiri Olsa
2021-06-17 20:29 ` [RFCv3 00/19] x86/ftrace/bpf: Add batch support for direct/tracing attach Andrii Nakryiko
2021-06-19  8:33   ` Jiri Olsa
2021-06-19 16:19     ` Yonghong Song
2021-06-19 17:09       ` Jiri Olsa
2021-06-20 16:56         ` Yonghong Song
2021-06-20 17:47           ` Alexei Starovoitov
2021-06-21  6:46             ` Andrii Nakryiko
2021-06-21  6:50     ` Andrii Nakryiko
2021-07-06 20:26       ` Andrii Nakryiko
2021-07-07 15:19         ` Jiri Olsa
