All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next v4 0/8] bpf: Support struct argument for trampoline base progs
@ 2022-08-31 15:26 Yonghong Song
  2022-08-31 15:26 ` [PATCH bpf-next v4 1/8] bpf: Allow struct argument in trampoline based programs Yonghong Song
                   ` (8 more replies)
  0 siblings, 9 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-31 15:26 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

Currently struct arguments are not supported for trampoline based progs.
One of major reason is that struct argument may pass by value which may
use more than one registers. This breaks trampoline progs where
each argument is assumed to take one register. bcc community reported the
issue ([1]) where struct argument is not supported for fentry program.
  typedef struct {
        uid_t val;
  } kuid_t;
  typedef struct {
        gid_t val;
  } kgid_t;
  int security_path_chown(struct path *path, kuid_t uid, kgid_t gid);
Inside Meta, we also have a use case to attach to tcp_setsockopt()
  typedef struct {
        union {
                void            *kernel;
                void __user     *user;
        };
        bool            is_kernel : 1;
  } sockptr_t;
  int tcp_setsockopt(struct sock *sk, int level, int optname,
                     sockptr_t optval, unsigned int optlen);

This patch added struct value support for bpf tracing programs which
uses trampoline. Only <= 16 byte struct size is supported for now
which covers use cases in the above. For x86/arm64/bpf, <= 16
struct value will be passed in registers instead of by reference.
Only x86_64 is supported in this patch. arm64 support can be
added later.

 [1] https://github.com/iovisor/bcc/issues/3657

Changelog:
  v3 -> v4:
   - fix a test failure where no casting for
     bpf_get_func_arg() value as the value type is 'int'.
   - add tracing_struct test in DENYLIST.s390x
   - simplify macro BPF_REG_CNT for BPF_PROG2.
  v2 -> v3:
   - previously struct arguments (<= 16 bytes) are passed
     by reference for bpf programs. Suggested by Alexei,
     it is passed by value now.
   - in order to support passing <= 16 struct value, a
     new macro BPF_PROG2 is invented.
  rfc v1 -> v2:
   - changed bpf_func_model struct info fields to
     arg_flags[] to make it easy to iterate arguments
     in arch specific {save|restore}_regs() functions.
   - added fexit tests to test return values with
     struct arguments.

Yonghong Song (8):
  bpf: Allow struct argument in trampoline based programs
  bpf: x86: Support in-register struct arguments in trampoline programs
  bpf: Update descriptions for helpers bpf_get_func_arg[_cnt]()
  bpf: arm64: No support of struct argument in trampoline programs
  libbpf: Add new BPF_PROG2 macro
  selftests/bpf: Add struct argument tests with fentry/fexit programs.
  selftests/bpf: Use BPF_PROG2 for some fentry programs without struct
    arguments
  selftests/bpf: Add tracing_struct test in DENYLIST.s390x

 arch/arm64/net/bpf_jit_comp.c                 |   8 +-
 arch/x86/net/bpf_jit_comp.c                   |  68 +++++++---
 include/linux/bpf.h                           |   4 +
 include/uapi/linux/bpf.h                      |   9 +-
 kernel/bpf/btf.c                              |  45 ++++++-
 tools/include/uapi/linux/bpf.h                |   9 +-
 tools/lib/bpf/bpf_tracing.h                   |  79 ++++++++++++
 tools/testing/selftests/bpf/DENYLIST.s390x    |   1 +
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   |  48 +++++++
 .../selftests/bpf/prog_tests/tracing_struct.c |  63 +++++++++
 tools/testing/selftests/bpf/progs/timer.c     |   4 +-
 .../selftests/bpf/progs/tracing_struct.c      | 120 ++++++++++++++++++
 12 files changed, 424 insertions(+), 34 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/tracing_struct.c
 create mode 100644 tools/testing/selftests/bpf/progs/tracing_struct.c

-- 
2.30.2


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH bpf-next v4 1/8] bpf: Allow struct argument in trampoline based programs
  2022-08-31 15:26 [PATCH bpf-next v4 0/8] bpf: Support struct argument for trampoline base progs Yonghong Song
@ 2022-08-31 15:26 ` Yonghong Song
  2022-08-31 15:26 ` [PATCH bpf-next v4 2/8] bpf: x86: Support in-register struct arguments in trampoline programs Yonghong Song
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-31 15:26 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

Allow struct argument in trampoline based programs where
the struct size should be <= 16 bytes. In such cases, the argument
will be put into up to 2 registers for bpf, x86_64 and arm64
architectures.

To support arch-specific trampoline manipulation,
add arg_flags for additional struct information about arguments
in btf_func_model. Such information will be used in arch specific
function arch_prepare_bpf_trampoline() to prepare argument access
properly in trampoline.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 include/linux/bpf.h |  4 ++++
 kernel/bpf/btf.c    | 45 +++++++++++++++++++++++++++++++++++++++------
 2 files changed, 43 insertions(+), 6 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 9c1674973e03..4d32f125f4af 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -727,10 +727,14 @@ enum bpf_cgroup_storage_type {
  */
 #define MAX_BPF_FUNC_REG_ARGS 5
 
+/* The argument is a structure. */
+#define BTF_FMODEL_STRUCT_ARG		BIT(0)
+
 struct btf_func_model {
 	u8 ret_size;
 	u8 nr_args;
 	u8 arg_size[MAX_BPF_FUNC_ARGS];
+	u8 arg_flags[MAX_BPF_FUNC_ARGS];
 };
 
 /* Restore arguments before returning from trampoline to let original function
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 903719b89238..ea94527e5d70 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5328,6 +5328,34 @@ static bool is_int_ptr(struct btf *btf, const struct btf_type *t)
 	return btf_type_is_int(t);
 }
 
+static u32 get_ctx_arg_idx(struct btf *btf, const struct btf_type *func_proto,
+			   int off)
+{
+	const struct btf_param *args;
+	const struct btf_type *t;
+	u32 offset = 0, nr_args;
+	int i;
+
+	if (!func_proto)
+		return off / 8;
+
+	nr_args = btf_type_vlen(func_proto);
+	args = (const struct btf_param *)(func_proto + 1);
+	for (i = 0; i < nr_args; i++) {
+		t = btf_type_skip_modifiers(btf, args[i].type, NULL);
+		offset += btf_type_is_ptr(t) ? 8 : roundup(t->size, 8);
+		if (off < offset)
+			return i;
+	}
+
+	t = btf_type_skip_modifiers(btf, func_proto->type, NULL);
+	offset += btf_type_is_ptr(t) ? 8 : roundup(t->size, 8);
+	if (off < offset)
+		return nr_args;
+
+	return nr_args + 1;
+}
+
 bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 		    const struct bpf_prog *prog,
 		    struct bpf_insn_access_aux *info)
@@ -5347,7 +5375,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 			tname, off);
 		return false;
 	}
-	arg = off / 8;
+	arg = get_ctx_arg_idx(btf, t, off);
 	args = (const struct btf_param *)(t + 1);
 	/* if (t == NULL) Fall back to default BPF prog with
 	 * MAX_BPF_FUNC_REG_ARGS u64 arguments.
@@ -5417,7 +5445,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 	/* skip modifiers */
 	while (btf_type_is_modifier(t))
 		t = btf_type_by_id(btf, t->type);
-	if (btf_type_is_small_int(t) || btf_is_any_enum(t))
+	if (btf_type_is_small_int(t) || btf_is_any_enum(t) || __btf_type_is_struct(t))
 		/* accessing a scalar */
 		return true;
 	if (!btf_type_is_ptr(t)) {
@@ -5881,7 +5909,7 @@ static int __get_type_size(struct btf *btf, u32 btf_id,
 	if (btf_type_is_ptr(t))
 		/* kernel size of pointer. Not BPF's size of pointer*/
 		return sizeof(void *);
-	if (btf_type_is_int(t) || btf_is_any_enum(t))
+	if (btf_type_is_int(t) || btf_is_any_enum(t) || __btf_type_is_struct(t))
 		return t->size;
 	return -EINVAL;
 }
@@ -5901,8 +5929,10 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
 		/* BTF function prototype doesn't match the verifier types.
 		 * Fall back to MAX_BPF_FUNC_REG_ARGS u64 args.
 		 */
-		for (i = 0; i < MAX_BPF_FUNC_REG_ARGS; i++)
+		for (i = 0; i < MAX_BPF_FUNC_REG_ARGS; i++) {
 			m->arg_size[i] = 8;
+			m->arg_flags[i] = 0;
+		}
 		m->ret_size = 8;
 		m->nr_args = MAX_BPF_FUNC_REG_ARGS;
 		return 0;
@@ -5916,7 +5946,7 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
 		return -EINVAL;
 	}
 	ret = __get_type_size(btf, func->type, &t);
-	if (ret < 0) {
+	if (ret < 0 || __btf_type_is_struct(t)) {
 		bpf_log(log,
 			"The function %s return type %s is unsupported.\n",
 			tname, btf_kind_str[BTF_INFO_KIND(t->info)]);
@@ -5932,7 +5962,9 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
 			return -EINVAL;
 		}
 		ret = __get_type_size(btf, args[i].type, &t);
-		if (ret < 0) {
+
+		/* No support of struct argument size greater than 16 bytes */
+		if (ret < 0 || ret > 16) {
 			bpf_log(log,
 				"The function %s arg%d type %s is unsupported.\n",
 				tname, i, btf_kind_str[BTF_INFO_KIND(t->info)]);
@@ -5945,6 +5977,7 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
 			return -EINVAL;
 		}
 		m->arg_size[i] = ret;
+		m->arg_flags[i] = __btf_type_is_struct(t) ? BTF_FMODEL_STRUCT_ARG : 0;
 	}
 	m->nr_args = nargs;
 	return 0;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH bpf-next v4 2/8] bpf: x86: Support in-register struct arguments in trampoline programs
  2022-08-31 15:26 [PATCH bpf-next v4 0/8] bpf: Support struct argument for trampoline base progs Yonghong Song
  2022-08-31 15:26 ` [PATCH bpf-next v4 1/8] bpf: Allow struct argument in trampoline based programs Yonghong Song
@ 2022-08-31 15:26 ` Yonghong Song
  2022-09-06 16:40   ` Kui-Feng Lee
  2022-09-07  3:00   ` Alexei Starovoitov
  2022-08-31 15:26 ` [PATCH bpf-next v4 3/8] bpf: Update descriptions for helpers bpf_get_func_arg[_cnt]() Yonghong Song
                   ` (6 subsequent siblings)
  8 siblings, 2 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-31 15:26 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

In C, struct value can be passed as a function argument.
For small structs, struct value may be passed in
one or more registers. For trampoline based bpf programs,
this would cause complication since one-to-one mapping between
function argument and arch argument register is not valid
any more.

The latest llvm16 added bpf support to pass by values
for struct up to 16 bytes ([1]). This is also true for
x86_64 architecture where two registers will hold
the struct value if the struct size is >8 and <= 16.
This may not be true if one of struct member is 'double'
type but in current linux source code we don't have
such instance yet, so we assume all >8 && <= 16 struct
holds two general purpose argument registers.

Also change on-stack nr_args value to the number
of registers holding the arguments. This will
permit bpf_get_func_arg() helper to get all
argument values.

 [1] https://reviews.llvm.org/D132144

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 arch/x86/net/bpf_jit_comp.c | 68 +++++++++++++++++++++++++++----------
 1 file changed, 51 insertions(+), 17 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index c1f6c1c51d99..ae89f4143eb4 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1751,34 +1751,60 @@ st:			if (is_imm8(insn->off))
 static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
 		      int stack_size)
 {
-	int i;
+	int i, j, arg_size, nr_regs;
 	/* Store function arguments to stack.
 	 * For a function that accepts two pointers the sequence will be:
 	 * mov QWORD PTR [rbp-0x10],rdi
 	 * mov QWORD PTR [rbp-0x8],rsi
 	 */
-	for (i = 0; i < min(nr_args, 6); i++)
-		emit_stx(prog, bytes_to_bpf_size(m->arg_size[i]),
-			 BPF_REG_FP,
-			 i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
-			 -(stack_size - i * 8));
+	for (i = 0, j = 0; i < min(nr_args, 6); i++) {
+		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) {
+			nr_regs = (m->arg_size[i] + 7) / 8;
+			arg_size = 8;
+		} else {
+			nr_regs = 1;
+			arg_size = m->arg_size[i];
+		}
+
+		while (nr_regs) {
+			emit_stx(prog, bytes_to_bpf_size(arg_size),
+				 BPF_REG_FP,
+				 j == 5 ? X86_REG_R9 : BPF_REG_1 + j,
+				 -(stack_size - j * 8));
+			nr_regs--;
+			j++;
+		}
+	}
 }
 
 static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
 			 int stack_size)
 {
-	int i;
+	int i, j, arg_size, nr_regs;
 
 	/* Restore function arguments from stack.
 	 * For a function that accepts two pointers the sequence will be:
 	 * EMIT4(0x48, 0x8B, 0x7D, 0xF0); mov rdi,QWORD PTR [rbp-0x10]
 	 * EMIT4(0x48, 0x8B, 0x75, 0xF8); mov rsi,QWORD PTR [rbp-0x8]
 	 */
-	for (i = 0; i < min(nr_args, 6); i++)
-		emit_ldx(prog, bytes_to_bpf_size(m->arg_size[i]),
-			 i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
-			 BPF_REG_FP,
-			 -(stack_size - i * 8));
+	for (i = 0, j = 0; i < min(nr_args, 6); i++) {
+		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) {
+			nr_regs = (m->arg_size[i] + 7) / 8;
+			arg_size = 8;
+		} else {
+			nr_regs = 1;
+			arg_size = m->arg_size[i];
+		}
+
+		while (nr_regs) {
+			emit_ldx(prog, bytes_to_bpf_size(arg_size),
+				 j == 5 ? X86_REG_R9 : BPF_REG_1 + j,
+				 BPF_REG_FP,
+				 -(stack_size - j * 8));
+			nr_regs--;
+			j++;
+		}
+	}
 }
 
 static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
@@ -2015,7 +2041,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 				struct bpf_tramp_links *tlinks,
 				void *orig_call)
 {
-	int ret, i, nr_args = m->nr_args;
+	int ret, i, nr_args = m->nr_args, extra_nregs = 0;
 	int regs_off, ip_off, args_off, stack_size = nr_args * 8, run_ctx_off;
 	struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
 	struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
@@ -2028,6 +2054,14 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 	if (nr_args > 6)
 		return -ENOTSUPP;
 
+	for (i = 0; i < MAX_BPF_FUNC_ARGS; i++) {
+		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
+			extra_nregs += (m->arg_size[i] + 7) / 8 - 1;
+	}
+	if (nr_args + extra_nregs > 6)
+		return -ENOTSUPP;
+	stack_size += extra_nregs * 8;
+
 	/* Generated trampoline stack layout:
 	 *
 	 * RBP + 8         [ return address  ]
@@ -2040,7 +2074,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 	 *                 [ ...             ]
 	 * RBP - regs_off  [ reg_arg1        ]  program's ctx pointer
 	 *
-	 * RBP - args_off  [ args count      ]  always
+	 * RBP - args_off  [ arg regs count  ]  always
 	 *
 	 * RBP - ip_off    [ traced function ]  BPF_TRAMP_F_IP_ARG flag
 	 *
@@ -2083,11 +2117,11 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 	EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size */
 	EMIT1(0x53);		 /* push rbx */
 
-	/* Store number of arguments of the traced function:
-	 *   mov rax, nr_args
+	/* Store number of argument registers of the traced function:
+	 *   mov rax, nr_args + extra_nregs
 	 *   mov QWORD PTR [rbp - args_off], rax
 	 */
-	emit_mov_imm64(&prog, BPF_REG_0, 0, (u32) nr_args);
+	emit_mov_imm64(&prog, BPF_REG_0, 0, (u32) nr_args + extra_nregs);
 	emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -args_off);
 
 	if (flags & BPF_TRAMP_F_IP_ARG) {
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH bpf-next v4 3/8] bpf: Update descriptions for helpers bpf_get_func_arg[_cnt]()
  2022-08-31 15:26 [PATCH bpf-next v4 0/8] bpf: Support struct argument for trampoline base progs Yonghong Song
  2022-08-31 15:26 ` [PATCH bpf-next v4 1/8] bpf: Allow struct argument in trampoline based programs Yonghong Song
  2022-08-31 15:26 ` [PATCH bpf-next v4 2/8] bpf: x86: Support in-register struct arguments in trampoline programs Yonghong Song
@ 2022-08-31 15:26 ` Yonghong Song
  2022-09-02  7:56   ` Jiri Olsa
  2022-08-31 15:27 ` [PATCH bpf-next v4 4/8] bpf: arm64: No support of struct argument in trampoline programs Yonghong Song
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Yonghong Song @ 2022-08-31 15:26 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

Now instead of the number of arguments, the number of registers
holding argument values are stored in trampoline. Update
the description of bpf_get_func_arg[_cnt]() helpers. Previous
programs without struct arguments should continue to work
as usual.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 include/uapi/linux/bpf.h       | 9 +++++----
 tools/include/uapi/linux/bpf.h | 9 +++++----
 2 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 962960a98835..f9f43343ef93 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -5079,12 +5079,12 @@ union bpf_attr {
  *
  * long bpf_get_func_arg(void *ctx, u32 n, u64 *value)
  *	Description
- *		Get **n**-th argument (zero based) of the traced function (for tracing programs)
+ *		Get **n**-th argument register (zero based) of the traced function (for tracing programs)
  *		returned in **value**.
  *
  *	Return
  *		0 on success.
- *		**-EINVAL** if n >= arguments count of traced function.
+ *		**-EINVAL** if n >= argument register count of traced function.
  *
  * long bpf_get_func_ret(void *ctx, u64 *value)
  *	Description
@@ -5097,10 +5097,11 @@ union bpf_attr {
  *
  * long bpf_get_func_arg_cnt(void *ctx)
  *	Description
- *		Get number of arguments of the traced function (for tracing programs).
+ *		Get number of registers of the traced function (for tracing programs) where
+ *		function arguments are stored in these registers.
  *
  *	Return
- *		The number of arguments of the traced function.
+ *		The number of argument registers of the traced function.
  *
  * int bpf_get_retval(void)
  *	Description
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index f4ba82a1eace..f13fa71822f4 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -5079,12 +5079,12 @@ union bpf_attr {
  *
  * long bpf_get_func_arg(void *ctx, u32 n, u64 *value)
  *	Description
- *		Get **n**-th argument (zero based) of the traced function (for tracing programs)
+ *		Get **n**-th argument register (zero based) of the traced function (for tracing programs)
  *		returned in **value**.
  *
  *	Return
  *		0 on success.
- *		**-EINVAL** if n >= arguments count of traced function.
+ *		**-EINVAL** if n >= argument register count of traced function.
  *
  * long bpf_get_func_ret(void *ctx, u64 *value)
  *	Description
@@ -5097,10 +5097,11 @@ union bpf_attr {
  *
  * long bpf_get_func_arg_cnt(void *ctx)
  *	Description
- *		Get number of arguments of the traced function (for tracing programs).
+ *		Get number of registers of the traced function (for tracing programs) where
+ *		function arguments are stored in these registers.
  *
  *	Return
- *		The number of arguments of the traced function.
+ *		The number of argument registers of the traced function.
  *
  * int bpf_get_retval(void)
  *	Description
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH bpf-next v4 4/8] bpf: arm64: No support of struct argument in trampoline programs
  2022-08-31 15:26 [PATCH bpf-next v4 0/8] bpf: Support struct argument for trampoline base progs Yonghong Song
                   ` (2 preceding siblings ...)
  2022-08-31 15:26 ` [PATCH bpf-next v4 3/8] bpf: Update descriptions for helpers bpf_get_func_arg[_cnt]() Yonghong Song
@ 2022-08-31 15:27 ` Yonghong Song
  2022-08-31 15:27 ` [PATCH bpf-next v4 5/8] libbpf: Add new BPF_PROG2 macro Yonghong Song
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-31 15:27 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

ARM64 does not support struct argument for trampoline based
bpf programs yet.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 arch/arm64/net/bpf_jit_comp.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 389623ae5a91..30f76178608b 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1970,7 +1970,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image,
 				u32 flags, struct bpf_tramp_links *tlinks,
 				void *orig_call)
 {
-	int ret;
+	int i, ret;
 	int nargs = m->nr_args;
 	int max_insns = ((long)image_end - (long)image) / AARCH64_INSN_SIZE;
 	struct jit_ctx ctx = {
@@ -1982,6 +1982,12 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image,
 	if (nargs > 8)
 		return -ENOTSUPP;
 
+	/* don't support struct argument */
+	for (i = 0; i < MAX_BPF_FUNC_ARGS; i++) {
+		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
+			return -ENOTSUPP;
+	}
+
 	ret = prepare_trampoline(&ctx, im, tlinks, orig_call, nargs, flags);
 	if (ret < 0)
 		return ret;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH bpf-next v4 5/8] libbpf: Add new BPF_PROG2 macro
  2022-08-31 15:26 [PATCH bpf-next v4 0/8] bpf: Support struct argument for trampoline base progs Yonghong Song
                   ` (3 preceding siblings ...)
  2022-08-31 15:27 ` [PATCH bpf-next v4 4/8] bpf: arm64: No support of struct argument in trampoline programs Yonghong Song
@ 2022-08-31 15:27 ` Yonghong Song
  2022-09-09  0:11   ` Andrii Nakryiko
  2022-08-31 15:27 ` [PATCH bpf-next v4 6/8] selftests/bpf: Add struct argument tests with fentry/fexit programs Yonghong Song
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 19+ messages in thread
From: Yonghong Song @ 2022-08-31 15:27 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

To support struct arguments in trampoline based programs,
existing BPF_PROG doesn't work any more since
the type size is needed to find whether a parameter
takes one or two registers. So this patch added a new
BPF_PROG2 macro to support such trampoline programs.

The idea is suggested by Andrii. For example, if the
to-be-traced function has signature like
  typedef struct {
       void *x;
       int t;
  } sockptr;
  int blah(sockptr x, char y);

In the new BPF_PROG2 macro, the argument can be
represented as
  __bpf_prog_call(
     ({ union {
          struct { __u64 x, y; } ___z;
          sockptr x;
        } ___tmp = { .___z = { ctx[0], ctx[1] }};
        ___tmp.x;
     }),
     ({ union {
          struct { __u8 x; } ___z;
          char y;
        } ___tmp = { .___z = { ctx[2] }};
        ___tmp.y;
     }));
In the above, the values stored on the stack are properly
assigned to the actual argument type value by using 'union'
magic. Note that the macro also works even if no arguments
are with struct types.

Note that new BPF_PROG2 works for both llvm16 and pre-llvm16
compilers where llvm16 supports bpf target passing value
with struct up to 16 byte size and pre-llvm16 will pass
by reference by storing values on the stack. With static functions
with struct argument as always inline, the compiler is able
to optimize and remove additional stack saving of struct values.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/lib/bpf/bpf_tracing.h | 79 +++++++++++++++++++++++++++++++++++++
 1 file changed, 79 insertions(+)

diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
index 5fdb93da423b..8d4bdd18cb3d 100644
--- a/tools/lib/bpf/bpf_tracing.h
+++ b/tools/lib/bpf/bpf_tracing.h
@@ -438,6 +438,85 @@ typeof(name(0)) name(unsigned long long *ctx)				    \
 static __always_inline typeof(name(0))					    \
 ____##name(unsigned long long *ctx, ##args)
 
+#ifndef ____bpf_nth
+#define ____bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, _13, _14, _15, _16, _17, _18, _19, _20, _21, _22, _23, _24, N, ...) N
+#endif
+#ifndef ____bpf_narg
+#define ____bpf_narg(...) ____bpf_nth(_, ##__VA_ARGS__, 12, 12, 11, 11, 10, 10, 9, 9, 8, 8, 7, 7, 6, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 0)
+#endif
+
+#define BPF_REG_CNT(t) \
+	(__builtin_choose_expr(sizeof(t) == 1 || sizeof(t) == 2 || sizeof(t) == 4 || sizeof(t) == 8, 1,	\
+	 __builtin_choose_expr(sizeof(t) == 16, 2,							\
+			       (void)0)))
+
+#define ____bpf_reg_cnt0()			(0)
+#define ____bpf_reg_cnt1(t, x)			(____bpf_reg_cnt0() + BPF_REG_CNT(t))
+#define ____bpf_reg_cnt2(t, x, args...)		(____bpf_reg_cnt1(args) + BPF_REG_CNT(t))
+#define ____bpf_reg_cnt3(t, x, args...)		(____bpf_reg_cnt2(args) + BPF_REG_CNT(t))
+#define ____bpf_reg_cnt4(t, x, args...)		(____bpf_reg_cnt3(args) + BPF_REG_CNT(t))
+#define ____bpf_reg_cnt5(t, x, args...)		(____bpf_reg_cnt4(args) + BPF_REG_CNT(t))
+#define ____bpf_reg_cnt6(t, x, args...)		(____bpf_reg_cnt5(args) + BPF_REG_CNT(t))
+#define ____bpf_reg_cnt7(t, x, args...)		(____bpf_reg_cnt6(args) + BPF_REG_CNT(t))
+#define ____bpf_reg_cnt8(t, x, args...)		(____bpf_reg_cnt7(args) + BPF_REG_CNT(t))
+#define ____bpf_reg_cnt9(t, x, args...)		(____bpf_reg_cnt8(args) + BPF_REG_CNT(t))
+#define ____bpf_reg_cnt10(t, x, args...)	(____bpf_reg_cnt9(args) + BPF_REG_CNT(t))
+#define ____bpf_reg_cnt11(t, x, args...)	(____bpf_reg_cnt10(args) + BPF_REG_CNT(t))
+#define ____bpf_reg_cnt12(t, x, args...)	(____bpf_reg_cnt11(args) + BPF_REG_CNT(t))
+#define ____bpf_reg_cnt(args...)	 ___bpf_apply(____bpf_reg_cnt, ____bpf_narg(args))(args)
+
+#define ____bpf_union_arg(t, x, n) \
+	__builtin_choose_expr(sizeof(t) == 1, ({ union { struct { __u8 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]}}; ___tmp.x; }), \
+	__builtin_choose_expr(sizeof(t) == 2, ({ union { struct { __u16 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]} }; ___tmp.x; }), \
+	__builtin_choose_expr(sizeof(t) == 4, ({ union { struct { __u32 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]} }; ___tmp.x; }), \
+	__builtin_choose_expr(sizeof(t) == 8, ({ union { struct { __u64 x; } ___z; t x; } ___tmp = {.___z = {ctx[n]} }; ___tmp.x; }), \
+	__builtin_choose_expr(sizeof(t) == 16, ({ union { struct { __u64 x, y; } ___z; t x; } ___tmp = {.___z = {ctx[n], ctx[n + 1]} }; ___tmp.x; }), \
+			      (void)0)))))
+
+#define ____bpf_ctx_arg0(n, args...)
+#define ____bpf_ctx_arg1(n, t, x)		, ____bpf_union_arg(t, x, n - ____bpf_reg_cnt1(t, x))
+#define ____bpf_ctx_arg2(n, t, x, args...)	, ____bpf_union_arg(t, x, n - ____bpf_reg_cnt2(t, x, args)) ____bpf_ctx_arg1(n, args)
+#define ____bpf_ctx_arg3(n, t, x, args...)	, ____bpf_union_arg(t, x, n - ____bpf_reg_cnt3(t, x, args)) ____bpf_ctx_arg2(n, args)
+#define ____bpf_ctx_arg4(n, t, x, args...)	, ____bpf_union_arg(t, x, n - ____bpf_reg_cnt4(t, x, args)) ____bpf_ctx_arg3(n, args)
+#define ____bpf_ctx_arg5(n, t, x, args...)	, ____bpf_union_arg(t, x, n - ____bpf_reg_cnt5(t, x, args)) ____bpf_ctx_arg4(n, args)
+#define ____bpf_ctx_arg6(n, t, x, args...)	, ____bpf_union_arg(t, x, n - ____bpf_reg_cnt6(t, x, args)) ____bpf_ctx_arg5(n, args)
+#define ____bpf_ctx_arg7(n, t, x, args...)	, ____bpf_union_arg(t, x, n - ____bpf_reg_cnt7(t, x, args)) ____bpf_ctx_arg6(n, args)
+#define ____bpf_ctx_arg8(n, t, x, args...)	, ____bpf_union_arg(t, x, n - ____bpf_reg_cnt8(t, x, args)) ____bpf_ctx_arg7(n, args)
+#define ____bpf_ctx_arg9(n, t, x, args...)	, ____bpf_union_arg(t, x, n - ____bpf_reg_cnt9(t, x, args)) ____bpf_ctx_arg8(n, args)
+#define ____bpf_ctx_arg10(n, t, x, args...)	, ____bpf_union_arg(t, x, n - ____bpf_reg_cnt10(t, x, args)) ____bpf_ctx_arg9(n, args)
+#define ____bpf_ctx_arg11(n, t, x, args...)	, ____bpf_union_arg(t, x, n - ____bpf_reg_cnt11(t, x, args)) ____bpf_ctx_arg10(n, args)
+#define ____bpf_ctx_arg12(n, t, x, args...)	, ____bpf_union_arg(t, x, n - ____bpf_reg_cnt12(t, x, args)) ____bpf_ctx_arg11(n, args)
+#define ____bpf_ctx_arg(n, args...)	___bpf_apply(____bpf_ctx_arg, ____bpf_narg(args))(n, args)
+
+#define ____bpf_ctx_decl0()
+#define ____bpf_ctx_decl1(t, x)			, t x
+#define ____bpf_ctx_decl2(t, x, args...)	, t x ____bpf_ctx_decl1(args)
+#define ____bpf_ctx_decl3(t, x, args...)	, t x ____bpf_ctx_decl2(args)
+#define ____bpf_ctx_decl4(t, x, args...)	, t x ____bpf_ctx_decl3(args)
+#define ____bpf_ctx_decl5(t, x, args...)	, t x ____bpf_ctx_decl4(args)
+#define ____bpf_ctx_decl6(t, x, args...)	, t x ____bpf_ctx_decl5(args)
+#define ____bpf_ctx_decl7(t, x, args...)	, t x ____bpf_ctx_decl6(args)
+#define ____bpf_ctx_decl8(t, x, args...)	, t x ____bpf_ctx_decl7(args)
+#define ____bpf_ctx_decl9(t, x, args...)	, t x ____bpf_ctx_decl8(args)
+#define ____bpf_ctx_decl10(t, x, args...)	, t x ____bpf_ctx_decl9(args)
+#define ____bpf_ctx_decl11(t, x, args...)	, t x ____bpf_ctx_decl10(args)
+#define ____bpf_ctx_decl12(t, x, args...)	, t x ____bpf_ctx_decl11(args)
+#define ____bpf_ctx_decl(args...)	___bpf_apply(____bpf_ctx_decl, ____bpf_narg(args))(args)
+
+/*
+ * BPF_PROG2 can handle struct arguments.
+ */
+#define BPF_PROG2(name, args...)						\
+name(unsigned long long *ctx);							\
+static __always_inline typeof(name(0))						\
+____##name(unsigned long long *ctx ____bpf_ctx_decl(args));			\
+typeof(name(0)) name(unsigned long long *ctx)					\
+{										\
+	return ____##name(ctx ____bpf_ctx_arg(____bpf_reg_cnt(args), args));	\
+}										\
+static __always_inline typeof(name(0))						\
+____##name(unsigned long long *ctx ____bpf_ctx_decl(args))
+
 struct pt_regs;
 
 #define ___bpf_kprobe_args0()           ctx
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH bpf-next v4 6/8] selftests/bpf: Add struct argument tests with fentry/fexit programs.
  2022-08-31 15:26 [PATCH bpf-next v4 0/8] bpf: Support struct argument for trampoline base progs Yonghong Song
                   ` (4 preceding siblings ...)
  2022-08-31 15:27 ` [PATCH bpf-next v4 5/8] libbpf: Add new BPF_PROG2 macro Yonghong Song
@ 2022-08-31 15:27 ` Yonghong Song
  2022-08-31 15:27 ` [PATCH bpf-next v4 7/8] selftests/bpf: Use BPF_PROG2 for some fentry programs without struct arguments Yonghong Song
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-31 15:27 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

Add various struct argument tests with fentry/fexit programs.
Also add one test with a kernel func which does not have any
argument to test BPF_PROG2 macro in such situation.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   |  48 +++++++
 .../selftests/bpf/prog_tests/tracing_struct.c |  63 +++++++++
 .../selftests/bpf/progs/tracing_struct.c      | 120 ++++++++++++++++++
 3 files changed, 231 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/tracing_struct.c
 create mode 100644 tools/testing/selftests/bpf/progs/tracing_struct.c

diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index 792cb15bac40..a6021d6117b5 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -18,6 +18,46 @@ typedef int (*func_proto_typedef_nested1)(func_proto_typedef);
 typedef int (*func_proto_typedef_nested2)(func_proto_typedef_nested1);
 
 DEFINE_PER_CPU(int, bpf_testmod_ksym_percpu) = 123;
+long bpf_testmod_test_struct_arg_result;
+
+struct bpf_testmod_struct_arg_1 {
+	int a;
+};
+struct bpf_testmod_struct_arg_2 {
+	long a;
+	long b;
+};
+
+noinline int
+bpf_testmod_test_struct_arg_1(struct bpf_testmod_struct_arg_2 a, int b, int c) {
+	bpf_testmod_test_struct_arg_result = a.a + a.b  + b + c;
+	return bpf_testmod_test_struct_arg_result;
+}
+
+noinline int
+bpf_testmod_test_struct_arg_2(int a, struct bpf_testmod_struct_arg_2 b, int c) {
+	bpf_testmod_test_struct_arg_result = a + b.a + b.b + c;
+	return bpf_testmod_test_struct_arg_result;
+}
+
+noinline int
+bpf_testmod_test_struct_arg_3(int a, int b, struct bpf_testmod_struct_arg_2 c) {
+	bpf_testmod_test_struct_arg_result = a + b + c.a + c.b;
+	return bpf_testmod_test_struct_arg_result;
+}
+
+noinline int
+bpf_testmod_test_struct_arg_4(struct bpf_testmod_struct_arg_1 a, int b,
+			      int c, int d, struct bpf_testmod_struct_arg_2 e) {
+	bpf_testmod_test_struct_arg_result = a.a + b + c + d + e.a + e.b;
+	return bpf_testmod_test_struct_arg_result;
+}
+
+noinline int
+bpf_testmod_test_struct_arg_5(void) {
+	bpf_testmod_test_struct_arg_result = 1;
+	return bpf_testmod_test_struct_arg_result;
+}
 
 noinline void
 bpf_testmod_test_mod_kfunc(int i)
@@ -98,11 +138,19 @@ bpf_testmod_test_read(struct file *file, struct kobject *kobj,
 		.off = off,
 		.len = len,
 	};
+	struct bpf_testmod_struct_arg_1 struct_arg1 = {10};
+	struct bpf_testmod_struct_arg_2 struct_arg2 = {2, 3};
 	int i = 1;
 
 	while (bpf_testmod_return_ptr(i))
 		i++;
 
+	(void)bpf_testmod_test_struct_arg_1(struct_arg2, 1, 4);
+	(void)bpf_testmod_test_struct_arg_2(1, struct_arg2, 4);
+	(void)bpf_testmod_test_struct_arg_3(1, 4, struct_arg2);
+	(void)bpf_testmod_test_struct_arg_4(struct_arg1, 1, 2, 3, struct_arg2);
+	(void)bpf_testmod_test_struct_arg_5();
+
 	/* This is always true. Use the check to make sure the compiler
 	 * doesn't remove bpf_testmod_loop_test.
 	 */
diff --git a/tools/testing/selftests/bpf/prog_tests/tracing_struct.c b/tools/testing/selftests/bpf/prog_tests/tracing_struct.c
new file mode 100644
index 000000000000..d5022b91d1e4
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/tracing_struct.c
@@ -0,0 +1,63 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include <test_progs.h>
+#include "tracing_struct.skel.h"
+
+static void test_fentry(void)
+{
+	struct tracing_struct *skel;
+	int err;
+
+	skel = tracing_struct__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "tracing_struct__open_and_load"))
+		return;
+
+	err = tracing_struct__attach(skel);
+	if (!ASSERT_OK(err, "tracing_struct__attach"))
+		return;
+
+	ASSERT_OK(trigger_module_test_read(256), "trigger_read");
+
+	ASSERT_EQ(skel->bss->t1_a_a, 2, "t1:a.a");
+	ASSERT_EQ(skel->bss->t1_a_b, 3, "t1:a.b");
+	ASSERT_EQ(skel->bss->t1_b, 1, "t1:b");
+	ASSERT_EQ(skel->bss->t1_c, 4, "t1:c");
+
+	ASSERT_EQ(skel->bss->t1_nregs, 4, "t1 nregs");
+	ASSERT_EQ(skel->bss->t1_reg0, 2, "t1 reg0");
+	ASSERT_EQ(skel->bss->t1_reg1, 3, "t1 reg1");
+	ASSERT_EQ(skel->bss->t1_reg2, 1, "t1 reg2");
+	ASSERT_EQ(skel->bss->t1_reg3, 4, "t1 reg3");
+	ASSERT_EQ(skel->bss->t1_ret, 10, "t1 ret");
+
+	ASSERT_EQ(skel->bss->t2_a, 1, "t2:a");
+	ASSERT_EQ(skel->bss->t2_b_a, 2, "t2:b.a");
+	ASSERT_EQ(skel->bss->t2_b_b, 3, "t2:b.b");
+	ASSERT_EQ(skel->bss->t2_c, 4, "t2:c");
+	ASSERT_EQ(skel->bss->t2_ret, 10, "t2 ret");
+
+	ASSERT_EQ(skel->bss->t3_a, 1, "t3:a");
+	ASSERT_EQ(skel->bss->t3_b, 4, "t3:b");
+	ASSERT_EQ(skel->bss->t3_c_a, 2, "t3:c.a");
+	ASSERT_EQ(skel->bss->t3_c_b, 3, "t3:c.b");
+	ASSERT_EQ(skel->bss->t3_ret, 10, "t3 ret");
+
+	ASSERT_EQ(skel->bss->t4_a_a, 10, "t4:a.a");
+	ASSERT_EQ(skel->bss->t4_b, 1, "t4:b");
+	ASSERT_EQ(skel->bss->t4_c, 2, "t4:c");
+	ASSERT_EQ(skel->bss->t4_d, 3, "t4:d");
+	ASSERT_EQ(skel->bss->t4_e_a, 2, "t4:e.a");
+	ASSERT_EQ(skel->bss->t4_e_b, 3, "t4:e.b");
+	ASSERT_EQ(skel->bss->t4_ret, 21, "t4 ret");
+
+	ASSERT_EQ(skel->bss->t5_ret, 1, "t5 ret");
+
+	tracing_struct__detach(skel);
+	tracing_struct__destroy(skel);
+}
+
+void test_tracing_struct(void)
+{
+	test_fentry();
+}
diff --git a/tools/testing/selftests/bpf/progs/tracing_struct.c b/tools/testing/selftests/bpf/progs/tracing_struct.c
new file mode 100644
index 000000000000..e718f0ebee7d
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/tracing_struct.c
@@ -0,0 +1,120 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+
+struct bpf_testmod_struct_arg_1 {
+	int a;
+};
+struct bpf_testmod_struct_arg_2 {
+	long a;
+	long b;
+};
+
+long t1_a_a, t1_a_b, t1_b, t1_c, t1_ret, t1_nregs;
+__u64 t1_reg0, t1_reg1, t1_reg2, t1_reg3;
+long t2_a, t2_b_a, t2_b_b, t2_c, t2_ret;
+long t3_a, t3_b, t3_c_a, t3_c_b, t3_ret;
+long t4_a_a, t4_b, t4_c, t4_d, t4_e_a, t4_e_b, t4_ret;
+long t5_ret;
+
+SEC("fentry/bpf_testmod_test_struct_arg_1")
+int BPF_PROG2(test_struct_arg_1, struct bpf_testmod_struct_arg_2, a, int, b, int, c)
+{
+	t1_a_a = a.a;
+	t1_a_b = a.b;
+	t1_b = b;
+	t1_c = c;
+	return 0;
+}
+
+SEC("fexit/bpf_testmod_test_struct_arg_1")
+int BPF_PROG2(test_struct_arg_2, struct bpf_testmod_struct_arg_2, a, int, b, int, c, int, ret)
+{
+	t1_nregs =  bpf_get_func_arg_cnt(ctx);
+	/* a.a */
+	bpf_get_func_arg(ctx, 0, &t1_reg0);
+	/* a.b */
+	bpf_get_func_arg(ctx, 1, &t1_reg1);
+	/* b */
+	bpf_get_func_arg(ctx, 2, &t1_reg2);
+	t1_reg2 = (int)t1_reg2;
+	/* c */
+	bpf_get_func_arg(ctx, 3, &t1_reg3);
+	t1_reg3 = (int)t1_reg3;
+
+	t1_ret = ret;
+	return 0;
+}
+
+SEC("fentry/bpf_testmod_test_struct_arg_2")
+int BPF_PROG2(test_struct_arg_3, int, a, struct bpf_testmod_struct_arg_2, b, int, c)
+{
+	t2_a = a;
+	t2_b_a = b.a;
+	t2_b_b = b.b;
+	t2_c = c;
+	return 0;
+}
+
+SEC("fexit/bpf_testmod_test_struct_arg_2")
+int BPF_PROG2(test_struct_arg_4, int, a, struct bpf_testmod_struct_arg_2, b, int, c, int, ret)
+{
+	t2_ret = ret;
+	return 0;
+}
+
+SEC("fentry/bpf_testmod_test_struct_arg_3")
+int BPF_PROG2(test_struct_arg_5, int, a, int, b, struct bpf_testmod_struct_arg_2, c)
+{
+	t3_a = a;
+	t3_b = b;
+	t3_c_a = c.a;
+	t3_c_b = c.b;
+	return 0;
+}
+
+SEC("fexit/bpf_testmod_test_struct_arg_3")
+int BPF_PROG2(test_struct_arg_6, int, a, int, b, struct bpf_testmod_struct_arg_2, c, int, ret)
+{
+	t3_ret = ret;
+	return 0;
+}
+
+SEC("fentry/bpf_testmod_test_struct_arg_4")
+int BPF_PROG2(test_struct_arg_7, struct bpf_testmod_struct_arg_1, a, int, b,
+	     int, c, int, d, struct bpf_testmod_struct_arg_2, e)
+{
+	t4_a_a = a.a;
+	t4_b = b;
+	t4_c = c;
+	t4_d = d;
+	t4_e_a = e.a;
+	t4_e_b = e.b;
+	return 0;
+}
+
+SEC("fexit/bpf_testmod_test_struct_arg_4")
+int BPF_PROG2(test_struct_arg_8, struct bpf_testmod_struct_arg_1, a, int, b,
+	     int, c, int, d, struct bpf_testmod_struct_arg_2, e, int, ret)
+{
+	t4_ret = ret;
+	return 0;
+}
+
+SEC("fentry/bpf_testmod_test_struct_arg_5")
+int BPF_PROG2(test_struct_arg_9)
+{
+	return 0;
+}
+
+SEC("fexit/bpf_testmod_test_struct_arg_5")
+int BPF_PROG2(test_struct_arg_10, int, ret)
+{
+	t5_ret = ret;
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH bpf-next v4 7/8] selftests/bpf: Use BPF_PROG2 for some fentry programs without struct arguments
  2022-08-31 15:26 [PATCH bpf-next v4 0/8] bpf: Support struct argument for trampoline base progs Yonghong Song
                   ` (5 preceding siblings ...)
  2022-08-31 15:27 ` [PATCH bpf-next v4 6/8] selftests/bpf: Add struct argument tests with fentry/fexit programs Yonghong Song
@ 2022-08-31 15:27 ` Yonghong Song
  2022-08-31 15:27 ` [PATCH bpf-next v4 8/8] selftests/bpf: Add tracing_struct test in DENYLIST.s390x Yonghong Song
  2022-09-07  3:10 ` [PATCH bpf-next v4 0/8] bpf: Support struct argument for trampoline base progs patchwork-bot+netdevbpf
  8 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-31 15:27 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

Use BPF_PROG2 instead of BPF_PROG for programs in progs/timer.c
to test BPF_PROG2 for cases without struct arguments.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/testing/selftests/bpf/progs/timer.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/timer.c b/tools/testing/selftests/bpf/progs/timer.c
index 5f5309791649..560a63e95dc3 100644
--- a/tools/testing/selftests/bpf/progs/timer.c
+++ b/tools/testing/selftests/bpf/progs/timer.c
@@ -120,7 +120,7 @@ static int timer_cb1(void *map, int *key, struct bpf_timer *timer)
 }
 
 SEC("fentry/bpf_fentry_test1")
-int BPF_PROG(test1, int a)
+int BPF_PROG2(test1, int, a)
 {
 	struct bpf_timer *arr_timer, *lru_timer;
 	struct elem init = {};
@@ -247,7 +247,7 @@ int bpf_timer_test(void)
 }
 
 SEC("fentry/bpf_fentry_test2")
-int BPF_PROG(test2, int a, int b)
+int BPF_PROG2(test2, int, a, int, b)
 {
 	struct hmap_elem init = {}, *val;
 	int key = HTAB, key_malloc = HTAB_MALLOC;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH bpf-next v4 8/8] selftests/bpf: Add tracing_struct test in DENYLIST.s390x
  2022-08-31 15:26 [PATCH bpf-next v4 0/8] bpf: Support struct argument for trampoline base progs Yonghong Song
                   ` (6 preceding siblings ...)
  2022-08-31 15:27 ` [PATCH bpf-next v4 7/8] selftests/bpf: Use BPF_PROG2 for some fentry programs without struct arguments Yonghong Song
@ 2022-08-31 15:27 ` Yonghong Song
  2022-09-07  3:10 ` [PATCH bpf-next v4 0/8] bpf: Support struct argument for trampoline base progs patchwork-bot+netdevbpf
  8 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-31 15:27 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

Add tracing_struct test in DENYLIST.s390x since s390x does not
support trampoline now.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 tools/testing/selftests/bpf/DENYLIST.s390x | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/bpf/DENYLIST.s390x b/tools/testing/selftests/bpf/DENYLIST.s390x
index 736b65f61022..aa18b6d24510 100644
--- a/tools/testing/selftests/bpf/DENYLIST.s390x
+++ b/tools/testing/selftests/bpf/DENYLIST.s390x
@@ -68,3 +68,4 @@ unpriv_bpf_disabled                      # fentry
 setget_sockopt                           # attach unexpected error: -524                                               (trampoline)
 cb_refs                                  # expected error message unexpected error: -524                               (trampoline)
 cgroup_hierarchical_stats                # JIT does not support calling kernel function                                (kfunc)
+tracing_struct                           # failed to auto-attach: -524                                                 (trampoline)
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v4 3/8] bpf: Update descriptions for helpers bpf_get_func_arg[_cnt]()
  2022-08-31 15:26 ` [PATCH bpf-next v4 3/8] bpf: Update descriptions for helpers bpf_get_func_arg[_cnt]() Yonghong Song
@ 2022-09-02  7:56   ` Jiri Olsa
  2022-09-06 16:12     ` Yonghong Song
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2022-09-02  7:56 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

On Wed, Aug 31, 2022 at 08:26:57AM -0700, Yonghong Song wrote:
> Now instead of the number of arguments, the number of registers
> holding argument values are stored in trampoline. Update
> the description of bpf_get_func_arg[_cnt]() helpers. Previous
> programs without struct arguments should continue to work
> as usual.
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
>  include/uapi/linux/bpf.h       | 9 +++++----
>  tools/include/uapi/linux/bpf.h | 9 +++++----
>  2 files changed, 10 insertions(+), 8 deletions(-)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 962960a98835..f9f43343ef93 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -5079,12 +5079,12 @@ union bpf_attr {
>   *
>   * long bpf_get_func_arg(void *ctx, u32 n, u64 *value)
>   *	Description
> - *		Get **n**-th argument (zero based) of the traced function (for tracing programs)
> + *		Get **n**-th argument register (zero based) of the traced function (for tracing programs)

I'm bit worried about the confusion between args/regs we create with
this, but I don't any have better idea how to solve this

keeping extra stack values for nr_args and nr_regs and have new helpers
to get reg values.. but then bpf_get_func_arg still does not return the
full argument value.. also given that the struct args should be rare,
I guess it's fine ;-)

jirka

>   *		returned in **value**.
>   *
>   *	Return
>   *		0 on success.
> - *		**-EINVAL** if n >= arguments count of traced function.
> + *		**-EINVAL** if n >= argument register count of traced function.
>   *
>   * long bpf_get_func_ret(void *ctx, u64 *value)
>   *	Description
> @@ -5097,10 +5097,11 @@ union bpf_attr {
>   *
>   * long bpf_get_func_arg_cnt(void *ctx)
>   *	Description
> - *		Get number of arguments of the traced function (for tracing programs).
> + *		Get number of registers of the traced function (for tracing programs) where
> + *		function arguments are stored in these registers.
>   *
>   *	Return
> - *		The number of arguments of the traced function.
> + *		The number of argument registers of the traced function.
>   *
>   * int bpf_get_retval(void)
>   *	Description
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index f4ba82a1eace..f13fa71822f4 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -5079,12 +5079,12 @@ union bpf_attr {
>   *
>   * long bpf_get_func_arg(void *ctx, u32 n, u64 *value)
>   *	Description
> - *		Get **n**-th argument (zero based) of the traced function (for tracing programs)
> + *		Get **n**-th argument register (zero based) of the traced function (for tracing programs)
>   *		returned in **value**.
>   *
>   *	Return
>   *		0 on success.
> - *		**-EINVAL** if n >= arguments count of traced function.
> + *		**-EINVAL** if n >= argument register count of traced function.
>   *
>   * long bpf_get_func_ret(void *ctx, u64 *value)
>   *	Description
> @@ -5097,10 +5097,11 @@ union bpf_attr {
>   *
>   * long bpf_get_func_arg_cnt(void *ctx)
>   *	Description
> - *		Get number of arguments of the traced function (for tracing programs).
> + *		Get number of registers of the traced function (for tracing programs) where
> + *		function arguments are stored in these registers.
>   *
>   *	Return
> - *		The number of arguments of the traced function.
> + *		The number of argument registers of the traced function.
>   *
>   * int bpf_get_retval(void)
>   *	Description
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v4 3/8] bpf: Update descriptions for helpers bpf_get_func_arg[_cnt]()
  2022-09-02  7:56   ` Jiri Olsa
@ 2022-09-06 16:12     ` Yonghong Song
  0 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-09-06 16:12 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team



On 9/2/22 12:56 AM, Jiri Olsa wrote:
> On Wed, Aug 31, 2022 at 08:26:57AM -0700, Yonghong Song wrote:
>> Now instead of the number of arguments, the number of registers
>> holding argument values are stored in trampoline. Update
>> the description of bpf_get_func_arg[_cnt]() helpers. Previous
>> programs without struct arguments should continue to work
>> as usual.
>>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>>   include/uapi/linux/bpf.h       | 9 +++++----
>>   tools/include/uapi/linux/bpf.h | 9 +++++----
>>   2 files changed, 10 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index 962960a98835..f9f43343ef93 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -5079,12 +5079,12 @@ union bpf_attr {
>>    *
>>    * long bpf_get_func_arg(void *ctx, u32 n, u64 *value)
>>    *	Description
>> - *		Get **n**-th argument (zero based) of the traced function (for tracing programs)
>> + *		Get **n**-th argument register (zero based) of the traced function (for tracing programs)
> 
> I'm bit worried about the confusion between args/regs we create with
> this, but I don't any have better idea how to solve this
> 
> keeping extra stack values for nr_args and nr_regs and have new helpers
> to get reg values.. but then bpf_get_func_arg still does not return the
> full argument value.. also given that the struct args should be rare,
> I guess it's fine ;-)

You are right. Since an argument may takes two u64 space, the helper
bpf_get_func_arg() might not return the *whole* argument any more.
But it still returns a func arg, maybe just a partial func arg.
So the 'n' here has a semantic change, instead of 'n'th argument,
it becomes 'n'th argument register. To create new helpers probably
not needed, so that is why I made the description change to the
existing helper which maintains backward compatability too.

> 
> jirka
> 
>>    *		returned in **value**.
>>    *
>>    *	Return
>>    *		0 on success.
>> - *		**-EINVAL** if n >= arguments count of traced function.
>> + *		**-EINVAL** if n >= argument register count of traced function.
>>    *
>>    * long bpf_get_func_ret(void *ctx, u64 *value)
>>    *	Description
>> @@ -5097,10 +5097,11 @@ union bpf_attr {
>>    *
>>    * long bpf_get_func_arg_cnt(void *ctx)
>>    *	Description
>> - *		Get number of arguments of the traced function (for tracing programs).
>> + *		Get number of registers of the traced function (for tracing programs) where
>> + *		function arguments are stored in these registers.
>>    *
>>    *	Return
>> - *		The number of arguments of the traced function.
>> + *		The number of argument registers of the traced function.
>>    *
>>    * int bpf_get_retval(void)
>>    *	Description
[...]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v4 2/8] bpf: x86: Support in-register struct arguments in trampoline programs
  2022-08-31 15:26 ` [PATCH bpf-next v4 2/8] bpf: x86: Support in-register struct arguments in trampoline programs Yonghong Song
@ 2022-09-06 16:40   ` Kui-Feng Lee
  2022-09-06 19:30     ` Yonghong Song
  2022-09-07  3:00   ` Alexei Starovoitov
  1 sibling, 1 reply; 19+ messages in thread
From: Kui-Feng Lee @ 2022-09-06 16:40 UTC (permalink / raw)
  To: Yonghong Song, bpf; +Cc: daniel, Kernel Team, ast, andrii

On Wed, 2022-08-31 at 08:26 -0700, Yonghong Song wrote:
> In C, struct value can be passed as a function argument.
> For small structs, struct value may be passed in
> one or more registers. For trampoline based bpf programs,
> this would cause complication since one-to-one mapping between
> function argument and arch argument register is not valid
> any more.
> 
> The latest llvm16 added bpf support to pass by values
> for struct up to 16 bytes ([1]). This is also true for
> x86_64 architecture where two registers will hold
> the struct value if the struct size is >8 and <= 16.
> This may not be true if one of struct member is 'double'
> type but in current linux source code we don't have
> such instance yet, so we assume all >8 && <= 16 struct
> holds two general purpose argument registers.
> 
> Also change on-stack nr_args value to the number
> of registers holding the arguments. This will
> permit bpf_get_func_arg() helper to get all
> argument values.
> 
>  [1] https://reviews.llvm.org/D132144
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
>  arch/x86/net/bpf_jit_comp.c | 68 +++++++++++++++++++++++++++--------
> --
>  1 file changed, 51 insertions(+), 17 deletions(-)
> 
> diff --git a/arch/x86/net/bpf_jit_comp.c
> b/arch/x86/net/bpf_jit_comp.c
> index c1f6c1c51d99..ae89f4143eb4 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -1751,34 +1751,60 @@ st:                     if (is_imm8(insn-
> >off))
>  static void save_regs(const struct btf_func_model *m, u8 **prog, int
> nr_args,
>                       int stack_size)
>  {
> -       int i;
> +       int i, j, arg_size, nr_regs;
>         /* Store function arguments to stack.
>          * For a function that accepts two pointers the sequence will
> be:
>          * mov QWORD PTR [rbp-0x10],rdi
>          * mov QWORD PTR [rbp-0x8],rsi
>          */
> -       for (i = 0; i < min(nr_args, 6); i++)
> -               emit_stx(prog, bytes_to_bpf_size(m->arg_size[i]),
> -                        BPF_REG_FP,
> -                        i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
> -                        -(stack_size - i * 8));
> +       for (i = 0, j = 0; i < min(nr_args, 6); i++) {

Is 6 still correct since an argument can take more than one register
now?  Perphaps j < min(...)?

I am not sure how to deal with a corner case that a 16 bytes struct
arguement happens to be at 6th place.  Does that mean first 8 bytes are
in a register and the reset bytes are in the stack?


> +               if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) {
> +                       nr_regs = (m->arg_size[i] + 7) / 8;
> +                       arg_size = 8;
> +               } else {
> +                       nr_regs = 1;
> +                       arg_size = m->arg_size[i];
> +               }
> +
> +               while (nr_regs) {
> +                       emit_stx(prog, bytes_to_bpf_size(arg_size),
> +                                BPF_REG_FP,
> +                                j == 5 ? X86_REG_R9 : BPF_REG_1 + j,
> +                                -(stack_size - j * 8));
> +                       nr_regs--;
> +                       j++;
> +               }
> +       }
>  }
>  
>  static void restore_regs(const struct btf_func_model *m, u8 **prog,
> int nr_args,
>                          int stack_size)
>  {
> -       int i;
> +       int i, j, arg_size, nr_regs;
>  
>         /* Restore function arguments from stack.
>          * For a function that accepts two pointers the sequence will
> be:
>          * EMIT4(0x48, 0x8B, 0x7D, 0xF0); mov rdi,QWORD PTR [rbp-
> 0x10]
>          * EMIT4(0x48, 0x8B, 0x75, 0xF8); mov rsi,QWORD PTR [rbp-0x8]
>          */
> -       for (i = 0; i < min(nr_args, 6); i++)
> -               emit_ldx(prog, bytes_to_bpf_size(m->arg_size[i]),
> -                        i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
> -                        BPF_REG_FP,
> -                        -(stack_size - i * 8));
> +       for (i = 0, j = 0; i < min(nr_args, 6); i++) {
> +               if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) {
> +                       nr_regs = (m->arg_size[i] + 7) / 8;
> +                       arg_size = 8;
> +               } else {
> +                       nr_regs = 1;
> +                       arg_size = m->arg_size[i];
> +               }
> +
> +               while (nr_regs) {
> +                       emit_ldx(prog, bytes_to_bpf_size(arg_size),
> +                                j == 5 ? X86_REG_R9 : BPF_REG_1 + j,
> +                                BPF_REG_FP,
> +                                -(stack_size - j * 8));
> +                       nr_regs--;
> +                       j++;
> +               }
> +       }
>  }
>  
>  static int invoke_bpf_prog(const struct btf_func_model *m, u8
> **pprog,
> @@ -2015,7 +2041,7 @@ int arch_prepare_bpf_trampoline(struct
> bpf_tramp_image *im, void *image, void *i
>                                 struct bpf_tramp_links *tlinks,
>                                 void *orig_call)
>  {
> -       int ret, i, nr_args = m->nr_args;
> +       int ret, i, nr_args = m->nr_args, extra_nregs = 0;
>         int regs_off, ip_off, args_off, stack_size = nr_args * 8,
> run_ctx_off;
>         struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
>         struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> @@ -2028,6 +2054,14 @@ int arch_prepare_bpf_trampoline(struct
> bpf_tramp_image *im, void *image, void *i
>         if (nr_args > 6)
>                 return -ENOTSUPP;
>  
> +       for (i = 0; i < MAX_BPF_FUNC_ARGS; i++) {
> +               if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
> +                       extra_nregs += (m->arg_size[i] + 7) / 8 - 1;
> +       }
> +       if (nr_args + extra_nregs > 6)
> +               return -ENOTSUPP;
> +       stack_size += extra_nregs * 8;
> +
>         /* Generated trampoline stack layout:
>          *
>          * RBP + 8         [ return address  ]
> @@ -2040,7 +2074,7 @@ int arch_prepare_bpf_trampoline(struct
> bpf_tramp_image *im, void *image, void *i
>          *                 [ ...             ]
>          * RBP - regs_off  [ reg_arg1        ]  program's ctx pointer
>          *
> -        * RBP - args_off  [ args count      ]  always
> +        * RBP - args_off  [ arg regs count  ]  always
>          *
>          * RBP - ip_off    [ traced function ]  BPF_TRAMP_F_IP_ARG
> flag
>          *
> @@ -2083,11 +2117,11 @@ int arch_prepare_bpf_trampoline(struct
> bpf_tramp_image *im, void *image, void *i
>         EMIT4(0x48, 0x83, 0xEC, stack_size); /* sub rsp, stack_size
> */
>         EMIT1(0x53);             /* push rbx */
>  
> -       /* Store number of arguments of the traced function:
> -        *   mov rax, nr_args
> +       /* Store number of argument registers of the traced function:
> +        *   mov rax, nr_args + extra_nregs
>          *   mov QWORD PTR [rbp - args_off], rax
>          */
> -       emit_mov_imm64(&prog, BPF_REG_0, 0, (u32) nr_args);
> +       emit_mov_imm64(&prog, BPF_REG_0, 0, (u32) nr_args +
> extra_nregs);
>         emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -args_off);
>  
>         if (flags & BPF_TRAMP_F_IP_ARG) {


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v4 2/8] bpf: x86: Support in-register struct arguments in trampoline programs
  2022-09-06 16:40   ` Kui-Feng Lee
@ 2022-09-06 19:30     ` Yonghong Song
  0 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-09-06 19:30 UTC (permalink / raw)
  To: Kui-Feng Lee, bpf; +Cc: daniel, Kernel Team, ast, andrii



On 9/6/22 9:40 AM, Kui-Feng Lee wrote:
> On Wed, 2022-08-31 at 08:26 -0700, Yonghong Song wrote:
>> In C, struct value can be passed as a function argument.
>> For small structs, struct value may be passed in
>> one or more registers. For trampoline based bpf programs,
>> this would cause complication since one-to-one mapping between
>> function argument and arch argument register is not valid
>> any more.
>>
>> The latest llvm16 added bpf support to pass by values
>> for struct up to 16 bytes ([1]). This is also true for
>> x86_64 architecture where two registers will hold
>> the struct value if the struct size is >8 and <= 16.
>> This may not be true if one of struct member is 'double'
>> type but in current linux source code we don't have
>> such instance yet, so we assume all >8 && <= 16 struct
>> holds two general purpose argument registers.
>>
>> Also change on-stack nr_args value to the number
>> of registers holding the arguments. This will
>> permit bpf_get_func_arg() helper to get all
>> argument values.
>>
>>   [1] https://reviews.llvm.org/D132144
>>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>>   arch/x86/net/bpf_jit_comp.c | 68 +++++++++++++++++++++++++++--------
>> --
>>   1 file changed, 51 insertions(+), 17 deletions(-)
>>
>> diff --git a/arch/x86/net/bpf_jit_comp.c
>> b/arch/x86/net/bpf_jit_comp.c
>> index c1f6c1c51d99..ae89f4143eb4 100644
>> --- a/arch/x86/net/bpf_jit_comp.c
>> +++ b/arch/x86/net/bpf_jit_comp.c
>> @@ -1751,34 +1751,60 @@ st:                     if (is_imm8(insn-
>>> off))
>>   static void save_regs(const struct btf_func_model *m, u8 **prog, int
>> nr_args,
>>                        int stack_size)
>>   {
>> -       int i;
>> +       int i, j, arg_size, nr_regs;
>>          /* Store function arguments to stack.
>>           * For a function that accepts two pointers the sequence will
>> be:
>>           * mov QWORD PTR [rbp-0x10],rdi
>>           * mov QWORD PTR [rbp-0x8],rsi
>>           */
>> -       for (i = 0; i < min(nr_args, 6); i++)
>> -               emit_stx(prog, bytes_to_bpf_size(m->arg_size[i]),
>> -                        BPF_REG_FP,
>> -                        i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
>> -                        -(stack_size - i * 8));
>> +       for (i = 0, j = 0; i < min(nr_args, 6); i++) {
> 
> Is 6 still correct since an argument can take more than one register
> now?  Perphaps j < min(...)?

'min(nr_args, 6)' probably a historical artifact. It can be
replaced with 'nr_args' since in the beginning of function
arch_prepare_bpf_trampoline(), we have

         /* x86-64 supports up to 6 arguments. 7+ can be added in the 
future */
         if (nr_args > 6)
                 return -ENOTSUPP;

so nr_args <= 6 is true.

> 
> I am not sure how to deal with a corner case that a 16 bytes struct
> arguement happens to be at 6th place.  Does that mean first 8 bytes are
> in a register and the reset bytes are in the stack?

This won't happen, see below change in arch_prepare_bpf_trampoline()
where it is checked to ensure the number of registers holding
all arguments mush be <= 6.

> 
> 
>> +               if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) {
>> +                       nr_regs = (m->arg_size[i] + 7) / 8;
>> +                       arg_size = 8;
>> +               } else {
>> +                       nr_regs = 1;
>> +                       arg_size = m->arg_size[i];
>> +               }
>> +
>> +               while (nr_regs) {
>> +                       emit_stx(prog, bytes_to_bpf_size(arg_size),
>> +                                BPF_REG_FP,
>> +                                j == 5 ? X86_REG_R9 : BPF_REG_1 + j,
>> +                                -(stack_size - j * 8));
>> +                       nr_regs--;
>> +                       j++;
>> +               }
>> +       }
>>   }
>>   
>>   static void restore_regs(const struct btf_func_model *m, u8 **prog,
>> int nr_args,
>>                           int stack_size)
>>   {
>> -       int i;
>> +       int i, j, arg_size, nr_regs;
>>   
>>          /* Restore function arguments from stack.
>>           * For a function that accepts two pointers the sequence will
>> be:
>>           * EMIT4(0x48, 0x8B, 0x7D, 0xF0); mov rdi,QWORD PTR [rbp-
>> 0x10]
>>           * EMIT4(0x48, 0x8B, 0x75, 0xF8); mov rsi,QWORD PTR [rbp-0x8]
>>           */
>> -       for (i = 0; i < min(nr_args, 6); i++)
>> -               emit_ldx(prog, bytes_to_bpf_size(m->arg_size[i]),
>> -                        i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
>> -                        BPF_REG_FP,
>> -                        -(stack_size - i * 8));
>> +       for (i = 0, j = 0; i < min(nr_args, 6); i++) {
>> +               if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) {
>> +                       nr_regs = (m->arg_size[i] + 7) / 8;
>> +                       arg_size = 8;
>> +               } else {
>> +                       nr_regs = 1;
>> +                       arg_size = m->arg_size[i];
>> +               }
>> +
>> +               while (nr_regs) {
>> +                       emit_ldx(prog, bytes_to_bpf_size(arg_size),
>> +                                j == 5 ? X86_REG_R9 : BPF_REG_1 + j,
>> +                                BPF_REG_FP,
>> +                                -(stack_size - j * 8));
>> +                       nr_regs--;
>> +                       j++;
>> +               }
>> +       }
>>   }
>>   
>>   static int invoke_bpf_prog(const struct btf_func_model *m, u8
>> **pprog,
>> @@ -2015,7 +2041,7 @@ int arch_prepare_bpf_trampoline(struct
>> bpf_tramp_image *im, void *image, void *i
>>                                  struct bpf_tramp_links *tlinks,
>>                                  void *orig_call)
>>   {
>> -       int ret, i, nr_args = m->nr_args;
>> +       int ret, i, nr_args = m->nr_args, extra_nregs = 0;
>>          int regs_off, ip_off, args_off, stack_size = nr_args * 8,
>> run_ctx_off;
>>          struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
>>          struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
>> @@ -2028,6 +2054,14 @@ int arch_prepare_bpf_trampoline(struct
>> bpf_tramp_image *im, void *image, void *i
>>          if (nr_args > 6)
>>                  return -ENOTSUPP;
>>   
>> +       for (i = 0; i < MAX_BPF_FUNC_ARGS; i++) {
>> +               if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
>> +                       extra_nregs += (m->arg_size[i] + 7) / 8 - 1;
>> +       }
>> +       if (nr_args + extra_nregs > 6)
>> +               return -ENOTSUPP;

Here to ensure the number of registers holding all arguments
must be <= 6.

>> +       stack_size += extra_nregs * 8;
>> +
>>          /* Generated trampoline stack layout:
>>           *
>>           * RBP + 8         [ return address  ]
[...]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v4 2/8] bpf: x86: Support in-register struct arguments in trampoline programs
  2022-08-31 15:26 ` [PATCH bpf-next v4 2/8] bpf: x86: Support in-register struct arguments in trampoline programs Yonghong Song
  2022-09-06 16:40   ` Kui-Feng Lee
@ 2022-09-07  3:00   ` Alexei Starovoitov
  2022-09-07 18:04     ` Yonghong Song
  1 sibling, 1 reply; 19+ messages in thread
From: Alexei Starovoitov @ 2022-09-07  3:00 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

On Wed, Aug 31, 2022 at 8:26 AM Yonghong Song <yhs@fb.com> wrote:
>
> In C, struct value can be passed as a function argument.
> For small structs, struct value may be passed in
> one or more registers. For trampoline based bpf programs,
> this would cause complication since one-to-one mapping between
> function argument and arch argument register is not valid
> any more.
>
> The latest llvm16 added bpf support to pass by values
> for struct up to 16 bytes ([1]). This is also true for
> x86_64 architecture where two registers will hold
> the struct value if the struct size is >8 and <= 16.
> This may not be true if one of struct member is 'double'
> type but in current linux source code we don't have
> such instance yet, so we assume all >8 && <= 16 struct
> holds two general purpose argument registers.
>
> Also change on-stack nr_args value to the number
> of registers holding the arguments. This will
> permit bpf_get_func_arg() helper to get all
> argument values.
>
>  [1] https://reviews.llvm.org/D132144
>
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
>  arch/x86/net/bpf_jit_comp.c | 68 +++++++++++++++++++++++++++----------
>  1 file changed, 51 insertions(+), 17 deletions(-)
>
> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
> index c1f6c1c51d99..ae89f4143eb4 100644
> --- a/arch/x86/net/bpf_jit_comp.c
> +++ b/arch/x86/net/bpf_jit_comp.c
> @@ -1751,34 +1751,60 @@ st:                     if (is_imm8(insn->off))
>  static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
>                       int stack_size)
>  {
> -       int i;
> +       int i, j, arg_size, nr_regs;
>         /* Store function arguments to stack.
>          * For a function that accepts two pointers the sequence will be:
>          * mov QWORD PTR [rbp-0x10],rdi
>          * mov QWORD PTR [rbp-0x8],rsi
>          */
> -       for (i = 0; i < min(nr_args, 6); i++)
> -               emit_stx(prog, bytes_to_bpf_size(m->arg_size[i]),
> -                        BPF_REG_FP,
> -                        i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
> -                        -(stack_size - i * 8));
> +       for (i = 0, j = 0; i < min(nr_args, 6); i++) {
> +               if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) {
> +                       nr_regs = (m->arg_size[i] + 7) / 8;
> +                       arg_size = 8;
> +               } else {
> +                       nr_regs = 1;
> +                       arg_size = m->arg_size[i];
> +               }

This bit begs for a common helper, but I'm not sure
whether it will look better, so applied as-is.

BPF_PROG2 also feels unusual as an API macro name.
We probably should bikeshed a bit and follow up
if a better name is found.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v4 0/8] bpf: Support struct argument for trampoline base progs
  2022-08-31 15:26 [PATCH bpf-next v4 0/8] bpf: Support struct argument for trampoline base progs Yonghong Song
                   ` (7 preceding siblings ...)
  2022-08-31 15:27 ` [PATCH bpf-next v4 8/8] selftests/bpf: Add tracing_struct test in DENYLIST.s390x Yonghong Song
@ 2022-09-07  3:10 ` patchwork-bot+netdevbpf
  8 siblings, 0 replies; 19+ messages in thread
From: patchwork-bot+netdevbpf @ 2022-09-07  3:10 UTC (permalink / raw)
  To: Yonghong Song; +Cc: bpf, ast, andrii, daniel, kernel-team

Hello:

This series was applied to bpf/bpf-next.git (master)
by Alexei Starovoitov <ast@kernel.org>:

On Wed, 31 Aug 2022 08:26:41 -0700 you wrote:
> Currently struct arguments are not supported for trampoline based progs.
> One of major reason is that struct argument may pass by value which may
> use more than one registers. This breaks trampoline progs where
> each argument is assumed to take one register. bcc community reported the
> issue ([1]) where struct argument is not supported for fentry program.
>   typedef struct {
>         uid_t val;
>   } kuid_t;
>   typedef struct {
>         gid_t val;
>   } kgid_t;
>   int security_path_chown(struct path *path, kuid_t uid, kgid_t gid);
> Inside Meta, we also have a use case to attach to tcp_setsockopt()
>   typedef struct {
>         union {
>                 void            *kernel;
>                 void __user     *user;
>         };
>         bool            is_kernel : 1;
>   } sockptr_t;
>   int tcp_setsockopt(struct sock *sk, int level, int optname,
>                      sockptr_t optval, unsigned int optlen);
> 
> [...]

Here is the summary with links:
  - [bpf-next,v4,1/8] bpf: Allow struct argument in trampoline based programs
    https://git.kernel.org/bpf/bpf-next/c/720e6a435194
  - [bpf-next,v4,2/8] bpf: x86: Support in-register struct arguments in trampoline programs
    https://git.kernel.org/bpf/bpf-next/c/a9c5ad31fbdc
  - [bpf-next,v4,3/8] bpf: Update descriptions for helpers bpf_get_func_arg[_cnt]()
    https://git.kernel.org/bpf/bpf-next/c/27ed9353aec9
  - [bpf-next,v4,4/8] bpf: arm64: No support of struct argument in trampoline programs
    https://git.kernel.org/bpf/bpf-next/c/eb707dde264a
  - [bpf-next,v4,5/8] libbpf: Add new BPF_PROG2 macro
    https://git.kernel.org/bpf/bpf-next/c/34586d29f8df
  - [bpf-next,v4,6/8] selftests/bpf: Add struct argument tests with fentry/fexit programs.
    https://git.kernel.org/bpf/bpf-next/c/1642a3945e22
  - [bpf-next,v4,7/8] selftests/bpf: Use BPF_PROG2 for some fentry programs without struct arguments
    https://git.kernel.org/bpf/bpf-next/c/a7c2ca3a2f69
  - [bpf-next,v4,8/8] selftests/bpf: Add tracing_struct test in DENYLIST.s390x
    https://git.kernel.org/bpf/bpf-next/c/ae63c10fc241

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v4 2/8] bpf: x86: Support in-register struct arguments in trampoline programs
  2022-09-07  3:00   ` Alexei Starovoitov
@ 2022-09-07 18:04     ` Yonghong Song
  0 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-09-07 18:04 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team



On 9/6/22 8:00 PM, Alexei Starovoitov wrote:
> On Wed, Aug 31, 2022 at 8:26 AM Yonghong Song <yhs@fb.com> wrote:
>>
>> In C, struct value can be passed as a function argument.
>> For small structs, struct value may be passed in
>> one or more registers. For trampoline based bpf programs,
>> this would cause complication since one-to-one mapping between
>> function argument and arch argument register is not valid
>> any more.
>>
>> The latest llvm16 added bpf support to pass by values
>> for struct up to 16 bytes ([1]). This is also true for
>> x86_64 architecture where two registers will hold
>> the struct value if the struct size is >8 and <= 16.
>> This may not be true if one of struct member is 'double'
>> type but in current linux source code we don't have
>> such instance yet, so we assume all >8 && <= 16 struct
>> holds two general purpose argument registers.
>>
>> Also change on-stack nr_args value to the number
>> of registers holding the arguments. This will
>> permit bpf_get_func_arg() helper to get all
>> argument values.
>>
>>   [1] https://reviews.llvm.org/D132144
>>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>>   arch/x86/net/bpf_jit_comp.c | 68 +++++++++++++++++++++++++++----------
>>   1 file changed, 51 insertions(+), 17 deletions(-)
>>
>> diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
>> index c1f6c1c51d99..ae89f4143eb4 100644
>> --- a/arch/x86/net/bpf_jit_comp.c
>> +++ b/arch/x86/net/bpf_jit_comp.c
>> @@ -1751,34 +1751,60 @@ st:                     if (is_imm8(insn->off))
>>   static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
>>                        int stack_size)
>>   {
>> -       int i;
>> +       int i, j, arg_size, nr_regs;
>>          /* Store function arguments to stack.
>>           * For a function that accepts two pointers the sequence will be:
>>           * mov QWORD PTR [rbp-0x10],rdi
>>           * mov QWORD PTR [rbp-0x8],rsi
>>           */
>> -       for (i = 0; i < min(nr_args, 6); i++)
>> -               emit_stx(prog, bytes_to_bpf_size(m->arg_size[i]),
>> -                        BPF_REG_FP,
>> -                        i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
>> -                        -(stack_size - i * 8));
>> +       for (i = 0, j = 0; i < min(nr_args, 6); i++) {
>> +               if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) {
>> +                       nr_regs = (m->arg_size[i] + 7) / 8;
>> +                       arg_size = 8;
>> +               } else {
>> +                       nr_regs = 1;
>> +                       arg_size = m->arg_size[i];
>> +               }
> 
> This bit begs for a common helper, but I'm not sure
> whether it will look better, so applied as-is.
> 
> BPF_PROG2 also feels unusual as an API macro name.
> We probably should bikeshed a bit and follow up
> if a better name is found.

I didn't come up with a better name either. But happy
to change to a different name if we agreed on one.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v4 5/8] libbpf: Add new BPF_PROG2 macro
  2022-08-31 15:27 ` [PATCH bpf-next v4 5/8] libbpf: Add new BPF_PROG2 macro Yonghong Song
@ 2022-09-09  0:11   ` Andrii Nakryiko
  2022-09-09 16:31     ` Yonghong Song
  0 siblings, 1 reply; 19+ messages in thread
From: Andrii Nakryiko @ 2022-09-09  0:11 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

On Wed, Aug 31, 2022 at 8:27 AM Yonghong Song <yhs@fb.com> wrote:
>
> To support struct arguments in trampoline based programs,
> existing BPF_PROG doesn't work any more since
> the type size is needed to find whether a parameter
> takes one or two registers. So this patch added a new
> BPF_PROG2 macro to support such trampoline programs.
>
> The idea is suggested by Andrii. For example, if the
> to-be-traced function has signature like
>   typedef struct {
>        void *x;
>        int t;
>   } sockptr;
>   int blah(sockptr x, char y);
>
> In the new BPF_PROG2 macro, the argument can be
> represented as
>   __bpf_prog_call(
>      ({ union {
>           struct { __u64 x, y; } ___z;
>           sockptr x;
>         } ___tmp = { .___z = { ctx[0], ctx[1] }};
>         ___tmp.x;
>      }),
>      ({ union {
>           struct { __u8 x; } ___z;
>           char y;
>         } ___tmp = { .___z = { ctx[2] }};
>         ___tmp.y;
>      }));
> In the above, the values stored on the stack are properly
> assigned to the actual argument type value by using 'union'
> magic. Note that the macro also works even if no arguments
> are with struct types.
>
> Note that new BPF_PROG2 works for both llvm16 and pre-llvm16
> compilers where llvm16 supports bpf target passing value
> with struct up to 16 byte size and pre-llvm16 will pass
> by reference by storing values on the stack. With static functions
> with struct argument as always inline, the compiler is able
> to optimize and remove additional stack saving of struct values.
>
> Signed-off-by: Yonghong Song <yhs@fb.com>
> ---
>  tools/lib/bpf/bpf_tracing.h | 79 +++++++++++++++++++++++++++++++++++++
>  1 file changed, 79 insertions(+)
>
> diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
> index 5fdb93da423b..8d4bdd18cb3d 100644
> --- a/tools/lib/bpf/bpf_tracing.h
> +++ b/tools/lib/bpf/bpf_tracing.h
> @@ -438,6 +438,85 @@ typeof(name(0)) name(unsigned long long *ctx)                                  \
>  static __always_inline typeof(name(0))                                     \
>  ____##name(unsigned long long *ctx, ##args)
>
> +#ifndef ____bpf_nth
> +#define ____bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, _13, _14, _15, _16, _17, _18, _19, _20, _21, _22, _23, _24, N, ...) N
> +#endif

we already have ___bpf_nth (triple underscore) variant, wouldn't
extending that one to support up to 24 argument work? It's quite
confusing to have ___bpf_nth and ____bpf_nth. Maybe let's consolidate?

And I'd totally wrap this long line :)


> +#ifndef ____bpf_narg
> +#define ____bpf_narg(...) ____bpf_nth(_, ##__VA_ARGS__, 12, 12, 11, 11, 10, 10, 9, 9, 8, 8, 7, 7, 6, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 0)
> +#endif

similar confusiong with triple underscore bpf_narg. Given this is used
in BPF_PROG2, how about renaming it to bpf_narg2 to make this
connection? And also note that all similar macros use triple
underscore, while you added quad underscores everywhere. Can you
please follow up with a rename to use triple underscore for
consistency?

> +
> +#define BPF_REG_CNT(t) \

this looks like a "public API", but I don't think this was the
intention, right? Let's rename it to ___bpf_reg_cnt()?

> +       (__builtin_choose_expr(sizeof(t) == 1 || sizeof(t) == 2 || sizeof(t) == 4 || sizeof(t) == 8, 1, \

nit: seeing ____bpf_union_arg implementation below I prefer one case
per line there as well. How about doing one __builtin_choose_expr per
each supported size?

> +        __builtin_choose_expr(sizeof(t) == 16, 2,                                                      \
> +                              (void)0)))
> +
> +#define ____bpf_reg_cnt0()                     (0)
> +#define ____bpf_reg_cnt1(t, x)                 (____bpf_reg_cnt0() + BPF_REG_CNT(t))
> +#define ____bpf_reg_cnt2(t, x, args...)                (____bpf_reg_cnt1(args) + BPF_REG_CNT(t))
> +#define ____bpf_reg_cnt3(t, x, args...)                (____bpf_reg_cnt2(args) + BPF_REG_CNT(t))
> +#define ____bpf_reg_cnt4(t, x, args...)                (____bpf_reg_cnt3(args) + BPF_REG_CNT(t))
> +#define ____bpf_reg_cnt5(t, x, args...)                (____bpf_reg_cnt4(args) + BPF_REG_CNT(t))
> +#define ____bpf_reg_cnt6(t, x, args...)                (____bpf_reg_cnt5(args) + BPF_REG_CNT(t))
> +#define ____bpf_reg_cnt7(t, x, args...)                (____bpf_reg_cnt6(args) + BPF_REG_CNT(t))
> +#define ____bpf_reg_cnt8(t, x, args...)                (____bpf_reg_cnt7(args) + BPF_REG_CNT(t))
> +#define ____bpf_reg_cnt9(t, x, args...)                (____bpf_reg_cnt8(args) + BPF_REG_CNT(t))
> +#define ____bpf_reg_cnt10(t, x, args...)       (____bpf_reg_cnt9(args) + BPF_REG_CNT(t))
> +#define ____bpf_reg_cnt11(t, x, args...)       (____bpf_reg_cnt10(args) + BPF_REG_CNT(t))
> +#define ____bpf_reg_cnt12(t, x, args...)       (____bpf_reg_cnt11(args) + BPF_REG_CNT(t))
> +#define ____bpf_reg_cnt(args...)        ___bpf_apply(____bpf_reg_cnt, ____bpf_narg(args))(args)
> +
> +#define ____bpf_union_arg(t, x, n) \
> +       __builtin_choose_expr(sizeof(t) == 1, ({ union { struct { __u8 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]}}; ___tmp.x; }), \
> +       __builtin_choose_expr(sizeof(t) == 2, ({ union { struct { __u16 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]} }; ___tmp.x; }), \
> +       __builtin_choose_expr(sizeof(t) == 4, ({ union { struct { __u32 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]} }; ___tmp.x; }), \
> +       __builtin_choose_expr(sizeof(t) == 8, ({ union { struct { __u64 x; } ___z; t x; } ___tmp = {.___z = {ctx[n]} }; ___tmp.x; }), \
> +       __builtin_choose_expr(sizeof(t) == 16, ({ union { struct { __u64 x, y; } ___z; t x; } ___tmp = {.___z = {ctx[n], ctx[n + 1]} }; ___tmp.x; }), \
> +                             (void)0)))))

looking at this again, we can do a bit better by using arrays, please
consider using that. At the very least results in shorter lines:

 #define ____bpf_union_arg(t, x, n) \
-       __builtin_choose_expr(sizeof(t) == 1, ({ union { struct { __u8
x; } ___z; t x; } ___tmp = { .___z = {ctx[n]}}; ___tmp.x; }), \
-       __builtin_choose_expr(sizeof(t) == 2, ({ union { struct {
__u16 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]} }; ___tmp.x; }), \
-       __builtin_choose_expr(sizeof(t) == 4, ({ union { struct {
__u32 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]} }; ___tmp.x; }), \
-       __builtin_choose_expr(sizeof(t) == 8, ({ union { struct {
__u64 x; } ___z; t x; } ___tmp = {.___z = {ctx[n]} }; ___tmp.x; }), \
-       __builtin_choose_expr(sizeof(t) == 16, ({ union { struct {
__u64 x, y; } ___z; t x; } ___tmp = {.___z = {ctx[n], ctx[n + 1]} };
___tmp.x; }), \
+       __builtin_choose_expr(sizeof(t) == 1, ({ union { __u8 z[1]; t
x; } ___tmp = { .z = {ctx[n]} }; ___tmp.x; }), \
+       __builtin_choose_expr(sizeof(t) == 2, ({ union { __u16 z[1]; t
x; } ___tmp = { .z = {ctx[n]} }; ___tmp.x; }), \
+       __builtin_choose_expr(sizeof(t) == 4, ({ union { __u32 z[1]; t
x; } ___tmp = { .z = {ctx[n]} }; ___tmp.x; }), \
+       __builtin_choose_expr(sizeof(t) == 8, ({ union { __u64 z[1]; t
x; } ___tmp = {.z = {ctx[n]} }; ___tmp.x; }), \
+       __builtin_choose_expr(sizeof(t) == 16, ({ union { __u64 z[2];
t x; } ___tmp = {.z = {ctx[n], ctx[n + 1]} }; ___tmp.x; }), \
                              (void)0)))))

It is using one- or two-element arrays, and it also has uniform
{ctx[n]} or {ctx[n], ctx[n + 1]} initialization syntax. Seems a bit
nicer than union { struct { ... combo.

> +
> +#define ____bpf_ctx_arg0(n, args...)
> +#define ____bpf_ctx_arg1(n, t, x)              , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt1(t, x))
> +#define ____bpf_ctx_arg2(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt2(t, x, args)) ____bpf_ctx_arg1(n, args)
> +#define ____bpf_ctx_arg3(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt3(t, x, args)) ____bpf_ctx_arg2(n, args)
> +#define ____bpf_ctx_arg4(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt4(t, x, args)) ____bpf_ctx_arg3(n, args)
> +#define ____bpf_ctx_arg5(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt5(t, x, args)) ____bpf_ctx_arg4(n, args)
> +#define ____bpf_ctx_arg6(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt6(t, x, args)) ____bpf_ctx_arg5(n, args)
> +#define ____bpf_ctx_arg7(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt7(t, x, args)) ____bpf_ctx_arg6(n, args)
> +#define ____bpf_ctx_arg8(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt8(t, x, args)) ____bpf_ctx_arg7(n, args)
> +#define ____bpf_ctx_arg9(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt9(t, x, args)) ____bpf_ctx_arg8(n, args)
> +#define ____bpf_ctx_arg10(n, t, x, args...)    , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt10(t, x, args)) ____bpf_ctx_arg9(n, args)
> +#define ____bpf_ctx_arg11(n, t, x, args...)    , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt11(t, x, args)) ____bpf_ctx_arg10(n, args)
> +#define ____bpf_ctx_arg12(n, t, x, args...)    , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt12(t, x, args)) ____bpf_ctx_arg11(n, args)
> +#define ____bpf_ctx_arg(n, args...)    ___bpf_apply(____bpf_ctx_arg, ____bpf_narg(args))(n, args)
> +
> +#define ____bpf_ctx_decl0()
> +#define ____bpf_ctx_decl1(t, x)                        , t x
> +#define ____bpf_ctx_decl2(t, x, args...)       , t x ____bpf_ctx_decl1(args)
> +#define ____bpf_ctx_decl3(t, x, args...)       , t x ____bpf_ctx_decl2(args)
> +#define ____bpf_ctx_decl4(t, x, args...)       , t x ____bpf_ctx_decl3(args)
> +#define ____bpf_ctx_decl5(t, x, args...)       , t x ____bpf_ctx_decl4(args)
> +#define ____bpf_ctx_decl6(t, x, args...)       , t x ____bpf_ctx_decl5(args)
> +#define ____bpf_ctx_decl7(t, x, args...)       , t x ____bpf_ctx_decl6(args)
> +#define ____bpf_ctx_decl8(t, x, args...)       , t x ____bpf_ctx_decl7(args)
> +#define ____bpf_ctx_decl9(t, x, args...)       , t x ____bpf_ctx_decl8(args)
> +#define ____bpf_ctx_decl10(t, x, args...)      , t x ____bpf_ctx_decl9(args)
> +#define ____bpf_ctx_decl11(t, x, args...)      , t x ____bpf_ctx_decl10(args)
> +#define ____bpf_ctx_decl12(t, x, args...)      , t x ____bpf_ctx_decl11(args)
> +#define ____bpf_ctx_decl(args...)      ___bpf_apply(____bpf_ctx_decl, ____bpf_narg(args))(args)
> +
> +/*
> + * BPF_PROG2 can handle struct arguments.

We have to expand comment here. Let's not slack on this. Point out
that it's the similar use and idea as with BPF_PROG, but emphasize the
difference in syntax between BPF_PROG and BPF_PROG2. I'd show two
simple examples of the same function with BPF_PROG and BPF_PROG2 here.
Please follow up.

> + */
> +#define BPF_PROG2(name, args...)                                               \
> +name(unsigned long long *ctx);                                                 \
> +static __always_inline typeof(name(0))                                         \
> +____##name(unsigned long long *ctx ____bpf_ctx_decl(args));                    \
> +typeof(name(0)) name(unsigned long long *ctx)                                  \
> +{                                                                              \
> +       return ____##name(ctx ____bpf_ctx_arg(____bpf_reg_cnt(args), args));    \

nit: you could have simplified this by doing ____bpf_reg_cnt() call
inside ____bpf_ctx_decl(args...) definition. I think it's a bit more
self-contained that way.

> +}                                                                              \
> +static __always_inline typeof(name(0))                                         \
> +____##name(unsigned long long *ctx ____bpf_ctx_decl(args))
> +
>  struct pt_regs;
>
>  #define ___bpf_kprobe_args0()           ctx
> --
> 2.30.2
>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v4 5/8] libbpf: Add new BPF_PROG2 macro
  2022-09-09  0:11   ` Andrii Nakryiko
@ 2022-09-09 16:31     ` Yonghong Song
  2022-09-09 18:07       ` Andrii Nakryiko
  0 siblings, 1 reply; 19+ messages in thread
From: Yonghong Song @ 2022-09-09 16:31 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team



On 9/8/22 5:11 PM, Andrii Nakryiko wrote:
> On Wed, Aug 31, 2022 at 8:27 AM Yonghong Song <yhs@fb.com> wrote:
>>
>> To support struct arguments in trampoline based programs,
>> existing BPF_PROG doesn't work any more since
>> the type size is needed to find whether a parameter
>> takes one or two registers. So this patch added a new
>> BPF_PROG2 macro to support such trampoline programs.
>>
>> The idea is suggested by Andrii. For example, if the
>> to-be-traced function has signature like
>>    typedef struct {
>>         void *x;
>>         int t;
>>    } sockptr;
>>    int blah(sockptr x, char y);
>>
>> In the new BPF_PROG2 macro, the argument can be
>> represented as
>>    __bpf_prog_call(
>>       ({ union {
>>            struct { __u64 x, y; } ___z;
>>            sockptr x;
>>          } ___tmp = { .___z = { ctx[0], ctx[1] }};
>>          ___tmp.x;
>>       }),
>>       ({ union {
>>            struct { __u8 x; } ___z;
>>            char y;
>>          } ___tmp = { .___z = { ctx[2] }};
>>          ___tmp.y;
>>       }));
>> In the above, the values stored on the stack are properly
>> assigned to the actual argument type value by using 'union'
>> magic. Note that the macro also works even if no arguments
>> are with struct types.
>>
>> Note that new BPF_PROG2 works for both llvm16 and pre-llvm16
>> compilers where llvm16 supports bpf target passing value
>> with struct up to 16 byte size and pre-llvm16 will pass
>> by reference by storing values on the stack. With static functions
>> with struct argument as always inline, the compiler is able
>> to optimize and remove additional stack saving of struct values.
>>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>>   tools/lib/bpf/bpf_tracing.h | 79 +++++++++++++++++++++++++++++++++++++
>>   1 file changed, 79 insertions(+)
>>
>> diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
>> index 5fdb93da423b..8d4bdd18cb3d 100644
>> --- a/tools/lib/bpf/bpf_tracing.h
>> +++ b/tools/lib/bpf/bpf_tracing.h
>> @@ -438,6 +438,85 @@ typeof(name(0)) name(unsigned long long *ctx)                                  \
>>   static __always_inline typeof(name(0))                                     \
>>   ____##name(unsigned long long *ctx, ##args)
>>
>> +#ifndef ____bpf_nth
>> +#define ____bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, _13, _14, _15, _16, _17, _18, _19, _20, _21, _22, _23, _24, N, ...) N
>> +#endif
> 
> we already have ___bpf_nth (triple underscore) variant, wouldn't
> extending that one to support up to 24 argument work? It's quite
> confusing to have ___bpf_nth and ____bpf_nth. Maybe let's consolidate?
> 
> And I'd totally wrap this long line :)

I tried earlier and it doesn't work. I will try again. If it still not 
works, I will change the name to ___bpf_nth2 similar to ___bpf_narg2 below.

> 
> 
>> +#ifndef ____bpf_narg
>> +#define ____bpf_narg(...) ____bpf_nth(_, ##__VA_ARGS__, 12, 12, 11, 11, 10, 10, 9, 9, 8, 8, 7, 7, 6, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 0)
>> +#endif
> 
> similar confusiong with triple underscore bpf_narg. Given this is used
> in BPF_PROG2, how about renaming it to bpf_narg2 to make this
> connection? And also note that all similar macros use triple
> underscore, while you added quad underscores everywhere. Can you
> please follow up with a rename to use triple underscore for
> consistency?

Sounds good. Will change to ___bpf_narg2.

> 
>> +
>> +#define BPF_REG_CNT(t) \
> 
> this looks like a "public API", but I don't think this was the
> intention, right? Let's rename it to ___bpf_reg_cnt()?

Yes, I can do it.

> 
>> +       (__builtin_choose_expr(sizeof(t) == 1 || sizeof(t) == 2 || sizeof(t) == 4 || sizeof(t) == 8, 1, \
> 
> nit: seeing ____bpf_union_arg implementation below I prefer one case
> per line there as well. How about doing one __builtin_choose_expr per
> each supported size?

Actually, I did have each individual case per line in the beginning. But 
later on I changed to the above based on Kui-Feng's comment. I can 
change back to each case per line in the next revision to be aligned
with other usages.

> 
>> +        __builtin_choose_expr(sizeof(t) == 16, 2,                                                      \
>> +                              (void)0)))
>> +
>> +#define ____bpf_reg_cnt0()                     (0)
>> +#define ____bpf_reg_cnt1(t, x)                 (____bpf_reg_cnt0() + BPF_REG_CNT(t))
>> +#define ____bpf_reg_cnt2(t, x, args...)                (____bpf_reg_cnt1(args) + BPF_REG_CNT(t))
>> +#define ____bpf_reg_cnt3(t, x, args...)                (____bpf_reg_cnt2(args) + BPF_REG_CNT(t))
>> +#define ____bpf_reg_cnt4(t, x, args...)                (____bpf_reg_cnt3(args) + BPF_REG_CNT(t))
>> +#define ____bpf_reg_cnt5(t, x, args...)                (____bpf_reg_cnt4(args) + BPF_REG_CNT(t))
>> +#define ____bpf_reg_cnt6(t, x, args...)                (____bpf_reg_cnt5(args) + BPF_REG_CNT(t))
>> +#define ____bpf_reg_cnt7(t, x, args...)                (____bpf_reg_cnt6(args) + BPF_REG_CNT(t))
>> +#define ____bpf_reg_cnt8(t, x, args...)                (____bpf_reg_cnt7(args) + BPF_REG_CNT(t))
>> +#define ____bpf_reg_cnt9(t, x, args...)                (____bpf_reg_cnt8(args) + BPF_REG_CNT(t))
>> +#define ____bpf_reg_cnt10(t, x, args...)       (____bpf_reg_cnt9(args) + BPF_REG_CNT(t))
>> +#define ____bpf_reg_cnt11(t, x, args...)       (____bpf_reg_cnt10(args) + BPF_REG_CNT(t))
>> +#define ____bpf_reg_cnt12(t, x, args...)       (____bpf_reg_cnt11(args) + BPF_REG_CNT(t))
>> +#define ____bpf_reg_cnt(args...)        ___bpf_apply(____bpf_reg_cnt, ____bpf_narg(args))(args)
>> +
>> +#define ____bpf_union_arg(t, x, n) \
>> +       __builtin_choose_expr(sizeof(t) == 1, ({ union { struct { __u8 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]}}; ___tmp.x; }), \
>> +       __builtin_choose_expr(sizeof(t) == 2, ({ union { struct { __u16 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]} }; ___tmp.x; }), \
>> +       __builtin_choose_expr(sizeof(t) == 4, ({ union { struct { __u32 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]} }; ___tmp.x; }), \
>> +       __builtin_choose_expr(sizeof(t) == 8, ({ union { struct { __u64 x; } ___z; t x; } ___tmp = {.___z = {ctx[n]} }; ___tmp.x; }), \
>> +       __builtin_choose_expr(sizeof(t) == 16, ({ union { struct { __u64 x, y; } ___z; t x; } ___tmp = {.___z = {ctx[n], ctx[n + 1]} }; ___tmp.x; }), \
>> +                             (void)0)))))
> 
> looking at this again, we can do a bit better by using arrays, please
> consider using that. At the very least results in shorter lines:
> 
>   #define ____bpf_union_arg(t, x, n) \
> -       __builtin_choose_expr(sizeof(t) == 1, ({ union { struct { __u8
> x; } ___z; t x; } ___tmp = { .___z = {ctx[n]}}; ___tmp.x; }), \
> -       __builtin_choose_expr(sizeof(t) == 2, ({ union { struct {
> __u16 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]} }; ___tmp.x; }), \
> -       __builtin_choose_expr(sizeof(t) == 4, ({ union { struct {
> __u32 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]} }; ___tmp.x; }), \
> -       __builtin_choose_expr(sizeof(t) == 8, ({ union { struct {
> __u64 x; } ___z; t x; } ___tmp = {.___z = {ctx[n]} }; ___tmp.x; }), \
> -       __builtin_choose_expr(sizeof(t) == 16, ({ union { struct {
> __u64 x, y; } ___z; t x; } ___tmp = {.___z = {ctx[n], ctx[n + 1]} };
> ___tmp.x; }), \
> +       __builtin_choose_expr(sizeof(t) == 1, ({ union { __u8 z[1]; t
> x; } ___tmp = { .z = {ctx[n]} }; ___tmp.x; }), \
> +       __builtin_choose_expr(sizeof(t) == 2, ({ union { __u16 z[1]; t
> x; } ___tmp = { .z = {ctx[n]} }; ___tmp.x; }), \
> +       __builtin_choose_expr(sizeof(t) == 4, ({ union { __u32 z[1]; t
> x; } ___tmp = { .z = {ctx[n]} }; ___tmp.x; }), \
> +       __builtin_choose_expr(sizeof(t) == 8, ({ union { __u64 z[1]; t
> x; } ___tmp = {.z = {ctx[n]} }; ___tmp.x; }), \
> +       __builtin_choose_expr(sizeof(t) == 16, ({ union { __u64 z[2];
> t x; } ___tmp = {.z = {ctx[n], ctx[n + 1]} }; ___tmp.x; }), \
>                                (void)0)))))
> 
> It is using one- or two-element arrays, and it also has uniform
> {ctx[n]} or {ctx[n], ctx[n + 1]} initialization syntax. Seems a bit
> nicer than union { struct { ... combo.

Sounds good. Will try to do this.

> 
>> +
>> +#define ____bpf_ctx_arg0(n, args...)
>> +#define ____bpf_ctx_arg1(n, t, x)              , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt1(t, x))
>> +#define ____bpf_ctx_arg2(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt2(t, x, args)) ____bpf_ctx_arg1(n, args)
>> +#define ____bpf_ctx_arg3(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt3(t, x, args)) ____bpf_ctx_arg2(n, args)
>> +#define ____bpf_ctx_arg4(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt4(t, x, args)) ____bpf_ctx_arg3(n, args)
>> +#define ____bpf_ctx_arg5(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt5(t, x, args)) ____bpf_ctx_arg4(n, args)
>> +#define ____bpf_ctx_arg6(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt6(t, x, args)) ____bpf_ctx_arg5(n, args)
>> +#define ____bpf_ctx_arg7(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt7(t, x, args)) ____bpf_ctx_arg6(n, args)
>> +#define ____bpf_ctx_arg8(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt8(t, x, args)) ____bpf_ctx_arg7(n, args)
>> +#define ____bpf_ctx_arg9(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt9(t, x, args)) ____bpf_ctx_arg8(n, args)
>> +#define ____bpf_ctx_arg10(n, t, x, args...)    , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt10(t, x, args)) ____bpf_ctx_arg9(n, args)
>> +#define ____bpf_ctx_arg11(n, t, x, args...)    , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt11(t, x, args)) ____bpf_ctx_arg10(n, args)
>> +#define ____bpf_ctx_arg12(n, t, x, args...)    , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt12(t, x, args)) ____bpf_ctx_arg11(n, args)
>> +#define ____bpf_ctx_arg(n, args...)    ___bpf_apply(____bpf_ctx_arg, ____bpf_narg(args))(n, args)
>> +
>> +#define ____bpf_ctx_decl0()
>> +#define ____bpf_ctx_decl1(t, x)                        , t x
>> +#define ____bpf_ctx_decl2(t, x, args...)       , t x ____bpf_ctx_decl1(args)
>> +#define ____bpf_ctx_decl3(t, x, args...)       , t x ____bpf_ctx_decl2(args)
>> +#define ____bpf_ctx_decl4(t, x, args...)       , t x ____bpf_ctx_decl3(args)
>> +#define ____bpf_ctx_decl5(t, x, args...)       , t x ____bpf_ctx_decl4(args)
>> +#define ____bpf_ctx_decl6(t, x, args...)       , t x ____bpf_ctx_decl5(args)
>> +#define ____bpf_ctx_decl7(t, x, args...)       , t x ____bpf_ctx_decl6(args)
>> +#define ____bpf_ctx_decl8(t, x, args...)       , t x ____bpf_ctx_decl7(args)
>> +#define ____bpf_ctx_decl9(t, x, args...)       , t x ____bpf_ctx_decl8(args)
>> +#define ____bpf_ctx_decl10(t, x, args...)      , t x ____bpf_ctx_decl9(args)
>> +#define ____bpf_ctx_decl11(t, x, args...)      , t x ____bpf_ctx_decl10(args)
>> +#define ____bpf_ctx_decl12(t, x, args...)      , t x ____bpf_ctx_decl11(args)
>> +#define ____bpf_ctx_decl(args...)      ___bpf_apply(____bpf_ctx_decl, ____bpf_narg(args))(args)
>> +
>> +/*
>> + * BPF_PROG2 can handle struct arguments.
> 
> We have to expand comment here. Let's not slack on this. Point out
> that it's the similar use and idea as with BPF_PROG, but emphasize the
> difference in syntax between BPF_PROG and BPF_PROG2. I'd show two
> simple examples of the same function with BPF_PROG and BPF_PROG2 here.
> Please follow up.

Will add detailed comments to explain why BPF_PROG2 and its core usage
and difference from BPF_PROG in the followup patch.

> 
>> + */
>> +#define BPF_PROG2(name, args...)                                               \
>> +name(unsigned long long *ctx);                                                 \
>> +static __always_inline typeof(name(0))                                         \
>> +____##name(unsigned long long *ctx ____bpf_ctx_decl(args));                    \
>> +typeof(name(0)) name(unsigned long long *ctx)                                  \
>> +{                                                                              \
>> +       return ____##name(ctx ____bpf_ctx_arg(____bpf_reg_cnt(args), args));    \
> 
> nit: you could have simplified this by doing ____bpf_reg_cnt() call
> inside ____bpf_ctx_decl(args...) definition. I think it's a bit more
> self-contained that way.

Will do this in the follow-up patch.

> 
>> +}                                                                              \
>> +static __always_inline typeof(name(0))                                         \
>> +____##name(unsigned long long *ctx ____bpf_ctx_decl(args))
>> +
>>   struct pt_regs;
>>
>>   #define ___bpf_kprobe_args0()           ctx
>> --
>> 2.30.2
>>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v4 5/8] libbpf: Add new BPF_PROG2 macro
  2022-09-09 16:31     ` Yonghong Song
@ 2022-09-09 18:07       ` Andrii Nakryiko
  0 siblings, 0 replies; 19+ messages in thread
From: Andrii Nakryiko @ 2022-09-09 18:07 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

On Fri, Sep 9, 2022 at 9:31 AM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 9/8/22 5:11 PM, Andrii Nakryiko wrote:
> > On Wed, Aug 31, 2022 at 8:27 AM Yonghong Song <yhs@fb.com> wrote:
> >>
> >> To support struct arguments in trampoline based programs,
> >> existing BPF_PROG doesn't work any more since
> >> the type size is needed to find whether a parameter
> >> takes one or two registers. So this patch added a new
> >> BPF_PROG2 macro to support such trampoline programs.
> >>
> >> The idea is suggested by Andrii. For example, if the
> >> to-be-traced function has signature like
> >>    typedef struct {
> >>         void *x;
> >>         int t;
> >>    } sockptr;
> >>    int blah(sockptr x, char y);
> >>
> >> In the new BPF_PROG2 macro, the argument can be
> >> represented as
> >>    __bpf_prog_call(
> >>       ({ union {
> >>            struct { __u64 x, y; } ___z;
> >>            sockptr x;
> >>          } ___tmp = { .___z = { ctx[0], ctx[1] }};
> >>          ___tmp.x;
> >>       }),
> >>       ({ union {
> >>            struct { __u8 x; } ___z;
> >>            char y;
> >>          } ___tmp = { .___z = { ctx[2] }};
> >>          ___tmp.y;
> >>       }));
> >> In the above, the values stored on the stack are properly
> >> assigned to the actual argument type value by using 'union'
> >> magic. Note that the macro also works even if no arguments
> >> are with struct types.
> >>
> >> Note that new BPF_PROG2 works for both llvm16 and pre-llvm16
> >> compilers where llvm16 supports bpf target passing value
> >> with struct up to 16 byte size and pre-llvm16 will pass
> >> by reference by storing values on the stack. With static functions
> >> with struct argument as always inline, the compiler is able
> >> to optimize and remove additional stack saving of struct values.
> >>
> >> Signed-off-by: Yonghong Song <yhs@fb.com>
> >> ---
> >>   tools/lib/bpf/bpf_tracing.h | 79 +++++++++++++++++++++++++++++++++++++
> >>   1 file changed, 79 insertions(+)
> >>
> >> diff --git a/tools/lib/bpf/bpf_tracing.h b/tools/lib/bpf/bpf_tracing.h
> >> index 5fdb93da423b..8d4bdd18cb3d 100644
> >> --- a/tools/lib/bpf/bpf_tracing.h
> >> +++ b/tools/lib/bpf/bpf_tracing.h
> >> @@ -438,6 +438,85 @@ typeof(name(0)) name(unsigned long long *ctx)                                  \
> >>   static __always_inline typeof(name(0))                                     \
> >>   ____##name(unsigned long long *ctx, ##args)
> >>
> >> +#ifndef ____bpf_nth
> >> +#define ____bpf_nth(_, _1, _2, _3, _4, _5, _6, _7, _8, _9, _10, _11, _12, _13, _14, _15, _16, _17, _18, _19, _20, _21, _22, _23, _24, N, ...) N
> >> +#endif
> >
> > we already have ___bpf_nth (triple underscore) variant, wouldn't
> > extending that one to support up to 24 argument work? It's quite
> > confusing to have ___bpf_nth and ____bpf_nth. Maybe let's consolidate?
> >
> > And I'd totally wrap this long line :)
>
> I tried earlier and it doesn't work. I will try again. If it still not
> works, I will change the name to ___bpf_nth2 similar to ___bpf_narg2 below.

Yeah, that was the fallback I had in mind. ___bpf_nth2 is fine as well.

>
> >
> >
> >> +#ifndef ____bpf_narg
> >> +#define ____bpf_narg(...) ____bpf_nth(_, ##__VA_ARGS__, 12, 12, 11, 11, 10, 10, 9, 9, 8, 8, 7, 7, 6, 6, 5, 5, 4, 4, 3, 3, 2, 2, 1, 1, 0)
> >> +#endif
> >
> > similar confusiong with triple underscore bpf_narg. Given this is used
> > in BPF_PROG2, how about renaming it to bpf_narg2 to make this
> > connection? And also note that all similar macros use triple
> > underscore, while you added quad underscores everywhere. Can you
> > please follow up with a rename to use triple underscore for
> > consistency?
>
> Sounds good. Will change to ___bpf_narg2.
>
> >
> >> +
> >> +#define BPF_REG_CNT(t) \
> >
> > this looks like a "public API", but I don't think this was the
> > intention, right? Let's rename it to ___bpf_reg_cnt()?
>
> Yes, I can do it.
>
> >
> >> +       (__builtin_choose_expr(sizeof(t) == 1 || sizeof(t) == 2 || sizeof(t) == 4 || sizeof(t) == 8, 1, \
> >
> > nit: seeing ____bpf_union_arg implementation below I prefer one case
> > per line there as well. How about doing one __builtin_choose_expr per
> > each supported size?
>
> Actually, I did have each individual case per line in the beginning. But
> later on I changed to the above based on Kui-Feng's comment. I can
> change back to each case per line in the next revision to be aligned
> with other usages.
>

Didn't see that conversation, sorry. But my point stands, for
consistency with ___bpf_union_arg and to cut this very long line
shorter it makes more sense to have one case per line.

> >
> >> +        __builtin_choose_expr(sizeof(t) == 16, 2,                                                      \
> >> +                              (void)0)))
> >> +
> >> +#define ____bpf_reg_cnt0()                     (0)
> >> +#define ____bpf_reg_cnt1(t, x)                 (____bpf_reg_cnt0() + BPF_REG_CNT(t))
> >> +#define ____bpf_reg_cnt2(t, x, args...)                (____bpf_reg_cnt1(args) + BPF_REG_CNT(t))
> >> +#define ____bpf_reg_cnt3(t, x, args...)                (____bpf_reg_cnt2(args) + BPF_REG_CNT(t))
> >> +#define ____bpf_reg_cnt4(t, x, args...)                (____bpf_reg_cnt3(args) + BPF_REG_CNT(t))
> >> +#define ____bpf_reg_cnt5(t, x, args...)                (____bpf_reg_cnt4(args) + BPF_REG_CNT(t))
> >> +#define ____bpf_reg_cnt6(t, x, args...)                (____bpf_reg_cnt5(args) + BPF_REG_CNT(t))
> >> +#define ____bpf_reg_cnt7(t, x, args...)                (____bpf_reg_cnt6(args) + BPF_REG_CNT(t))
> >> +#define ____bpf_reg_cnt8(t, x, args...)                (____bpf_reg_cnt7(args) + BPF_REG_CNT(t))
> >> +#define ____bpf_reg_cnt9(t, x, args...)                (____bpf_reg_cnt8(args) + BPF_REG_CNT(t))
> >> +#define ____bpf_reg_cnt10(t, x, args...)       (____bpf_reg_cnt9(args) + BPF_REG_CNT(t))
> >> +#define ____bpf_reg_cnt11(t, x, args...)       (____bpf_reg_cnt10(args) + BPF_REG_CNT(t))
> >> +#define ____bpf_reg_cnt12(t, x, args...)       (____bpf_reg_cnt11(args) + BPF_REG_CNT(t))
> >> +#define ____bpf_reg_cnt(args...)        ___bpf_apply(____bpf_reg_cnt, ____bpf_narg(args))(args)
> >> +
> >> +#define ____bpf_union_arg(t, x, n) \
> >> +       __builtin_choose_expr(sizeof(t) == 1, ({ union { struct { __u8 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]}}; ___tmp.x; }), \
> >> +       __builtin_choose_expr(sizeof(t) == 2, ({ union { struct { __u16 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]} }; ___tmp.x; }), \
> >> +       __builtin_choose_expr(sizeof(t) == 4, ({ union { struct { __u32 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]} }; ___tmp.x; }), \
> >> +       __builtin_choose_expr(sizeof(t) == 8, ({ union { struct { __u64 x; } ___z; t x; } ___tmp = {.___z = {ctx[n]} }; ___tmp.x; }), \
> >> +       __builtin_choose_expr(sizeof(t) == 16, ({ union { struct { __u64 x, y; } ___z; t x; } ___tmp = {.___z = {ctx[n], ctx[n + 1]} }; ___tmp.x; }), \
> >> +                             (void)0)))))
> >
> > looking at this again, we can do a bit better by using arrays, please
> > consider using that. At the very least results in shorter lines:
> >
> >   #define ____bpf_union_arg(t, x, n) \
> > -       __builtin_choose_expr(sizeof(t) == 1, ({ union { struct { __u8
> > x; } ___z; t x; } ___tmp = { .___z = {ctx[n]}}; ___tmp.x; }), \
> > -       __builtin_choose_expr(sizeof(t) == 2, ({ union { struct {
> > __u16 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]} }; ___tmp.x; }), \
> > -       __builtin_choose_expr(sizeof(t) == 4, ({ union { struct {
> > __u32 x; } ___z; t x; } ___tmp = { .___z = {ctx[n]} }; ___tmp.x; }), \
> > -       __builtin_choose_expr(sizeof(t) == 8, ({ union { struct {
> > __u64 x; } ___z; t x; } ___tmp = {.___z = {ctx[n]} }; ___tmp.x; }), \
> > -       __builtin_choose_expr(sizeof(t) == 16, ({ union { struct {
> > __u64 x, y; } ___z; t x; } ___tmp = {.___z = {ctx[n], ctx[n + 1]} };
> > ___tmp.x; }), \
> > +       __builtin_choose_expr(sizeof(t) == 1, ({ union { __u8 z[1]; t
> > x; } ___tmp = { .z = {ctx[n]} }; ___tmp.x; }), \
> > +       __builtin_choose_expr(sizeof(t) == 2, ({ union { __u16 z[1]; t
> > x; } ___tmp = { .z = {ctx[n]} }; ___tmp.x; }), \
> > +       __builtin_choose_expr(sizeof(t) == 4, ({ union { __u32 z[1]; t
> > x; } ___tmp = { .z = {ctx[n]} }; ___tmp.x; }), \
> > +       __builtin_choose_expr(sizeof(t) == 8, ({ union { __u64 z[1]; t
> > x; } ___tmp = {.z = {ctx[n]} }; ___tmp.x; }), \
> > +       __builtin_choose_expr(sizeof(t) == 16, ({ union { __u64 z[2];
> > t x; } ___tmp = {.z = {ctx[n], ctx[n + 1]} }; ___tmp.x; }), \
> >                                (void)0)))))
> >
> > It is using one- or two-element arrays, and it also has uniform
> > {ctx[n]} or {ctx[n], ctx[n + 1]} initialization syntax. Seems a bit
> > nicer than union { struct { ... combo.
>
> Sounds good. Will try to do this.
>
> >
> >> +
> >> +#define ____bpf_ctx_arg0(n, args...)
> >> +#define ____bpf_ctx_arg1(n, t, x)              , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt1(t, x))
> >> +#define ____bpf_ctx_arg2(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt2(t, x, args)) ____bpf_ctx_arg1(n, args)
> >> +#define ____bpf_ctx_arg3(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt3(t, x, args)) ____bpf_ctx_arg2(n, args)
> >> +#define ____bpf_ctx_arg4(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt4(t, x, args)) ____bpf_ctx_arg3(n, args)
> >> +#define ____bpf_ctx_arg5(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt5(t, x, args)) ____bpf_ctx_arg4(n, args)
> >> +#define ____bpf_ctx_arg6(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt6(t, x, args)) ____bpf_ctx_arg5(n, args)
> >> +#define ____bpf_ctx_arg7(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt7(t, x, args)) ____bpf_ctx_arg6(n, args)
> >> +#define ____bpf_ctx_arg8(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt8(t, x, args)) ____bpf_ctx_arg7(n, args)
> >> +#define ____bpf_ctx_arg9(n, t, x, args...)     , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt9(t, x, args)) ____bpf_ctx_arg8(n, args)
> >> +#define ____bpf_ctx_arg10(n, t, x, args...)    , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt10(t, x, args)) ____bpf_ctx_arg9(n, args)
> >> +#define ____bpf_ctx_arg11(n, t, x, args...)    , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt11(t, x, args)) ____bpf_ctx_arg10(n, args)
> >> +#define ____bpf_ctx_arg12(n, t, x, args...)    , ____bpf_union_arg(t, x, n - ____bpf_reg_cnt12(t, x, args)) ____bpf_ctx_arg11(n, args)
> >> +#define ____bpf_ctx_arg(n, args...)    ___bpf_apply(____bpf_ctx_arg, ____bpf_narg(args))(n, args)
> >> +
> >> +#define ____bpf_ctx_decl0()
> >> +#define ____bpf_ctx_decl1(t, x)                        , t x
> >> +#define ____bpf_ctx_decl2(t, x, args...)       , t x ____bpf_ctx_decl1(args)
> >> +#define ____bpf_ctx_decl3(t, x, args...)       , t x ____bpf_ctx_decl2(args)
> >> +#define ____bpf_ctx_decl4(t, x, args...)       , t x ____bpf_ctx_decl3(args)
> >> +#define ____bpf_ctx_decl5(t, x, args...)       , t x ____bpf_ctx_decl4(args)
> >> +#define ____bpf_ctx_decl6(t, x, args...)       , t x ____bpf_ctx_decl5(args)
> >> +#define ____bpf_ctx_decl7(t, x, args...)       , t x ____bpf_ctx_decl6(args)
> >> +#define ____bpf_ctx_decl8(t, x, args...)       , t x ____bpf_ctx_decl7(args)
> >> +#define ____bpf_ctx_decl9(t, x, args...)       , t x ____bpf_ctx_decl8(args)
> >> +#define ____bpf_ctx_decl10(t, x, args...)      , t x ____bpf_ctx_decl9(args)
> >> +#define ____bpf_ctx_decl11(t, x, args...)      , t x ____bpf_ctx_decl10(args)
> >> +#define ____bpf_ctx_decl12(t, x, args...)      , t x ____bpf_ctx_decl11(args)
> >> +#define ____bpf_ctx_decl(args...)      ___bpf_apply(____bpf_ctx_decl, ____bpf_narg(args))(args)
> >> +
> >> +/*
> >> + * BPF_PROG2 can handle struct arguments.
> >
> > We have to expand comment here. Let's not slack on this. Point out
> > that it's the similar use and idea as with BPF_PROG, but emphasize the
> > difference in syntax between BPF_PROG and BPF_PROG2. I'd show two
> > simple examples of the same function with BPF_PROG and BPF_PROG2 here.
> > Please follow up.
>
> Will add detailed comments to explain why BPF_PROG2 and its core usage
> and difference from BPF_PROG in the followup patch.
>

cool, thanks!

> >
> >> + */
> >> +#define BPF_PROG2(name, args...)                                               \
> >> +name(unsigned long long *ctx);                                                 \
> >> +static __always_inline typeof(name(0))                                         \
> >> +____##name(unsigned long long *ctx ____bpf_ctx_decl(args));                    \
> >> +typeof(name(0)) name(unsigned long long *ctx)                                  \
> >> +{                                                                              \
> >> +       return ____##name(ctx ____bpf_ctx_arg(____bpf_reg_cnt(args), args));    \
> >
> > nit: you could have simplified this by doing ____bpf_reg_cnt() call
> > inside ____bpf_ctx_decl(args...) definition. I think it's a bit more
> > self-contained that way.
>
> Will do this in the follow-up patch.
>
> >
> >> +}                                                                              \
> >> +static __always_inline typeof(name(0))                                         \
> >> +____##name(unsigned long long *ctx ____bpf_ctx_decl(args))
> >> +
> >>   struct pt_regs;
> >>
> >>   #define ___bpf_kprobe_args0()           ctx
> >> --
> >> 2.30.2
> >>

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-09-09 18:08 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-31 15:26 [PATCH bpf-next v4 0/8] bpf: Support struct argument for trampoline base progs Yonghong Song
2022-08-31 15:26 ` [PATCH bpf-next v4 1/8] bpf: Allow struct argument in trampoline based programs Yonghong Song
2022-08-31 15:26 ` [PATCH bpf-next v4 2/8] bpf: x86: Support in-register struct arguments in trampoline programs Yonghong Song
2022-09-06 16:40   ` Kui-Feng Lee
2022-09-06 19:30     ` Yonghong Song
2022-09-07  3:00   ` Alexei Starovoitov
2022-09-07 18:04     ` Yonghong Song
2022-08-31 15:26 ` [PATCH bpf-next v4 3/8] bpf: Update descriptions for helpers bpf_get_func_arg[_cnt]() Yonghong Song
2022-09-02  7:56   ` Jiri Olsa
2022-09-06 16:12     ` Yonghong Song
2022-08-31 15:27 ` [PATCH bpf-next v4 4/8] bpf: arm64: No support of struct argument in trampoline programs Yonghong Song
2022-08-31 15:27 ` [PATCH bpf-next v4 5/8] libbpf: Add new BPF_PROG2 macro Yonghong Song
2022-09-09  0:11   ` Andrii Nakryiko
2022-09-09 16:31     ` Yonghong Song
2022-09-09 18:07       ` Andrii Nakryiko
2022-08-31 15:27 ` [PATCH bpf-next v4 6/8] selftests/bpf: Add struct argument tests with fentry/fexit programs Yonghong Song
2022-08-31 15:27 ` [PATCH bpf-next v4 7/8] selftests/bpf: Use BPF_PROG2 for some fentry programs without struct arguments Yonghong Song
2022-08-31 15:27 ` [PATCH bpf-next v4 8/8] selftests/bpf: Add tracing_struct test in DENYLIST.s390x Yonghong Song
2022-09-07  3:10 ` [PATCH bpf-next v4 0/8] bpf: Support struct argument for trampoline base progs patchwork-bot+netdevbpf

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.