All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next v2 0/6] bpf: Support struct argument for trampoline base progs
@ 2022-08-12  5:24 Yonghong Song
  2022-08-12  5:24 ` [PATCH bpf-next v2 1/6] bpf: Add struct argument info in btf_func_model Yonghong Song
                   ` (5 more replies)
  0 siblings, 6 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-12  5:24 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

Currently struct arguments are not supported for trampoline based progs.
One of major reason is that struct argument may pass by value which may
use more than one registers. This breaks trampoline progs where
each argument is assumed to take one register. bcc community reported the
issue ([1]) where struct argument is not supported for fentry program.
  typedef struct {
        uid_t val;
  } kuid_t;
  typedef struct {
        gid_t val;
  } kgid_t;
  int security_path_chown(struct path *path, kuid_t uid, kgid_t gid);
Inside Meta, we also have a use case to attach to tcp_setsockopt()
  typedef struct {
        union {
                void            *kernel;
                void __user     *user;
        };
        bool            is_kernel : 1;
  } sockptr_t;
  int tcp_setsockopt(struct sock *sk, int level, int optname,
                     sockptr_t optval, unsigned int optlen);

This patch added struct value support for bpf tracing programs which
uses trampoline. For x86_64, struct argument size needs to be 16 or less so
it can fit in one or two registers. struct argument is not supported
for arm64 in this patch set.

 [1] https://github.com/iovisor/bcc/issues/3657

Changelog:
  rfc v1 -> v2:
   - changed bpf_func_model struct info fields to
     arg_flags[] to make it easy to iterate arguments
     in arch specific {save|restore}_regs() functions.
   - added fexit tests to test return values with
     struct arguments.

Yonghong Song (6):
  bpf: Add struct argument info in btf_func_model
  bpf: x86: Rename stack_size to regs_off in {save,restore}_regs()
  bpf: x86: Support in-register struct arguments
  bpf: arm64: No support of struct argument
  bpf: Populate struct argument info in btf_func_model
  selftests/bpf: Add struct argument tests with fentry/fexit programs.

 arch/arm64/net/bpf_jit_comp.c                 |   8 +-
 arch/x86/net/bpf_jit_comp.c                   | 137 ++++++++++++++----
 include/linux/bpf.h                           |   4 +
 kernel/bpf/btf.c                              |  30 +++-
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   |  41 ++++++
 .../selftests/bpf/prog_tests/tracing_struct.c |  55 +++++++
 .../selftests/bpf/progs/tracing_struct.c      |  93 ++++++++++++
 7 files changed, 334 insertions(+), 34 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/tracing_struct.c
 create mode 100644 tools/testing/selftests/bpf/progs/tracing_struct.c

-- 
2.30.2


^ permalink raw reply	[flat|nested] 19+ messages in thread

* [PATCH bpf-next v2 1/6] bpf: Add struct argument info in btf_func_model
  2022-08-12  5:24 [PATCH bpf-next v2 0/6] bpf: Support struct argument for trampoline base progs Yonghong Song
@ 2022-08-12  5:24 ` Yonghong Song
  2022-08-12  5:24 ` [PATCH bpf-next v2 2/6] bpf: x86: Rename stack_size to regs_off in {save,restore}_regs() Yonghong Song
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-12  5:24 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

Add arg_flags for additional information about arguments in btf_func_model.
Currently, only whether an argument is a structure or not is recorded.
Such information will be used in arch specific function
arch_prepare_bpf_trampoline() to prepare argument access properly
in trampoline.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 include/linux/bpf.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index a627a02cf8ab..2d3099fb5a0a 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -726,10 +726,14 @@ enum bpf_cgroup_storage_type {
  */
 #define MAX_BPF_FUNC_REG_ARGS 5
 
+/* The argument is a structure. */
+#define BTF_FMODEL_STRUCT_ARG		BIT(0)
+
 struct btf_func_model {
 	u8 ret_size;
 	u8 nr_args;
 	u8 arg_size[MAX_BPF_FUNC_ARGS];
+	u8 arg_flags[MAX_BPF_FUNC_ARGS];
 };
 
 /* Restore arguments before returning from trampoline to let original function
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH bpf-next v2 2/6] bpf: x86: Rename stack_size to regs_off in {save,restore}_regs()
  2022-08-12  5:24 [PATCH bpf-next v2 0/6] bpf: Support struct argument for trampoline base progs Yonghong Song
  2022-08-12  5:24 ` [PATCH bpf-next v2 1/6] bpf: Add struct argument info in btf_func_model Yonghong Song
@ 2022-08-12  5:24 ` Yonghong Song
  2022-08-12  5:24 ` [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments Yonghong Song
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-12  5:24 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

Rename argument 'stack_size' to 'regs_off' in save_regs() and
restore_regs() since in reality 'stack_size' represents a particular
stack offset and 'regs_off' is the one in the caller argument.
Leter, another stack offset will be added to these two functions
for saving and restoring struct argument values.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 arch/x86/net/bpf_jit_comp.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index c1f6c1c51d99..2657b58001cf 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -1749,7 +1749,7 @@ st:			if (is_imm8(insn->off))
 }
 
 static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
-		      int stack_size)
+		      int regs_off)
 {
 	int i;
 	/* Store function arguments to stack.
@@ -1761,11 +1761,11 @@ static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
 		emit_stx(prog, bytes_to_bpf_size(m->arg_size[i]),
 			 BPF_REG_FP,
 			 i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
-			 -(stack_size - i * 8));
+			 -(regs_off - i * 8));
 }
 
 static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
-			 int stack_size)
+			 int regs_off)
 {
 	int i;
 
@@ -1778,7 +1778,7 @@ static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
 		emit_ldx(prog, bytes_to_bpf_size(m->arg_size[i]),
 			 i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
 			 BPF_REG_FP,
-			 -(stack_size - i * 8));
+			 -(regs_off - i * 8));
 }
 
 static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments
  2022-08-12  5:24 [PATCH bpf-next v2 0/6] bpf: Support struct argument for trampoline base progs Yonghong Song
  2022-08-12  5:24 ` [PATCH bpf-next v2 1/6] bpf: Add struct argument info in btf_func_model Yonghong Song
  2022-08-12  5:24 ` [PATCH bpf-next v2 2/6] bpf: x86: Rename stack_size to regs_off in {save,restore}_regs() Yonghong Song
@ 2022-08-12  5:24 ` Yonghong Song
  2022-08-14 20:24   ` Jiri Olsa
  2022-08-15 22:44   ` Alexei Starovoitov
  2022-08-12  5:24 ` [PATCH bpf-next v2 4/6] bpf: arm64: No support of struct argument Yonghong Song
                   ` (2 subsequent siblings)
  5 siblings, 2 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-12  5:24 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

In C, struct value can be passed as a function argument.
For small structs, struct value may be passed in
one or more registers. For trampoline based bpf programs,
This would cause complication since one-to-one mapping between
function argument and arch argument register is not valid
any more.

To support struct value argument and make bpf programs
easy to write, the bpf program function parameter is
changed from struct type to a pointer to struct type.
The following is a simplified example.

In one of later selftests, we have a bpf_testmod function:
    struct bpf_testmod_struct_arg_2 {
        long a;
        long b;
    };
    noinline int
    bpf_testmod_test_struct_arg_2(int a, struct bpf_testmod_struct_arg_2 b, int c) {
        bpf_testmod_test_struct_arg_result = a + b.a + b.b + c;
        return bpf_testmod_test_struct_arg_result;
    }

When a bpf program is attached to the bpf_testmod function
bpf_testmod_test_struct_arg_2(), the bpf program may look like
    SEC("fentry/bpf_testmod_test_struct_arg_2")
    int BPF_PROG(test_struct_arg_3, int a, struct bpf_testmod_struct_arg_2 *b, int c)
    {
        t2_a = a;
        t2_b_a = b->a;
        t2_b_b = b->b;
        t2_c = c;
        return 0;
    }

Basically struct value becomes a pointer to the struct.
The trampoline stack will be increased to store the stack values and
the pointer to these values will be saved in the stack slot corresponding
to that argument. For x86_64, the struct size is limited up to 16 bytes
so the struct can fit in one or two registers. The struct size of more
than 16 bytes is not supported now as our current use case is
for sockptr_t in the argument. We could handle such large struct's
in the future if we have concrete use cases.

The main changes are in save_regs() and restore_regs(). The following
illustrated the trampoline asm codes for save_regs() and restore_regs().
save_regs():
    /* first argument */
    mov    DWORD PTR [rbp-0x18],edi
    /* second argument: struct, save actual values and put the pointer to the slot */
    lea    rax,[rbp-0x40]
    mov    QWORD PTR [rbp-0x10],rax
    mov    QWORD PTR [rbp-0x40],rsi
    mov    QWORD PTR [rbp-0x38],rdx
    /* third argument */
    mov    DWORD PTR [rbp-0x8],esi
restore_regs():
    mov    edi,DWORD PTR [rbp-0x18]
    mov    rsi,QWORD PTR [rbp-0x40]
    mov    rdx,QWORD PTR [rbp-0x38]
    mov    esi,DWORD PTR [rbp-0x8]

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 arch/x86/net/bpf_jit_comp.c | 137 +++++++++++++++++++++++++++++-------
 1 file changed, 110 insertions(+), 27 deletions(-)

diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c
index 2657b58001cf..2fa7d694c559 100644
--- a/arch/x86/net/bpf_jit_comp.c
+++ b/arch/x86/net/bpf_jit_comp.c
@@ -37,6 +37,11 @@ static u8 *emit_code(u8 *ptr, u32 bytes, unsigned int len)
 #define EMIT3(b1, b2, b3)	EMIT((b1) + ((b2) << 8) + ((b3) << 16), 3)
 #define EMIT4(b1, b2, b3, b4)   EMIT((b1) + ((b2) << 8) + ((b3) << 16) + ((b4) << 24), 4)
 
+#define EMIT2_UPDATE_PROG(bytes, len) \
+	do { *prog = emit_code(*prog, bytes, len); } while (0)
+#define EMIT4_UPDATE_PROG(b1, b2, b3, b4) \
+	EMIT2_UPDATE_PROG((b1) + ((b2) << 8) + ((b3) << 16) + ((b4) << 24), 4)
+
 #define EMIT1_off32(b1, off) \
 	do { EMIT1(b1); EMIT(off, 4); } while (0)
 #define EMIT2_off32(b1, b2, off) \
@@ -1749,36 +1754,92 @@ st:			if (is_imm8(insn->off))
 }
 
 static void save_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
-		      int regs_off)
+		      int regs_off, int struct_val_off)
 {
-	int i;
-	/* Store function arguments to stack.
-	 * For a function that accepts two pointers the sequence will be:
-	 * mov QWORD PTR [rbp-0x10],rdi
-	 * mov QWORD PTR [rbp-0x8],rsi
-	 */
-	for (i = 0; i < min(nr_args, 6); i++)
-		emit_stx(prog, bytes_to_bpf_size(m->arg_size[i]),
-			 BPF_REG_FP,
-			 i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
-			 -(regs_off - i * 8));
+	int i, j, curr_reg_idx, curr_s_stack_off, s_arg_nregs;
+
+	curr_reg_idx = curr_s_stack_off = 0;
+	for (i = 0; i < nr_args; i++) {
+		if (!(m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)) {
+			/* Store function arguments to stack.
+			 * For a function that accepts two pointers the sequence will be:
+			 * mov QWORD PTR [rbp-0x10],rdi
+			 * mov QWORD PTR [rbp-0x8],rsi
+			 */
+			emit_stx(prog, bytes_to_bpf_size(m->arg_size[i]),
+				 BPF_REG_FP,
+				 curr_reg_idx == 5 ? X86_REG_R9 : BPF_REG_1 + curr_reg_idx,
+				 -(regs_off - i * 8));
+			curr_reg_idx++;
+			continue;
+		}
+
+		/* lea rax, [rbp - struct_val_off] */
+		EMIT4_UPDATE_PROG(0x48, 0x8D, 0x45, -(struct_val_off - curr_s_stack_off * 8));
+		/* The struct value is converted to a pointer argument.
+		 * mov QWORD PTR [rbp - s_arg_idx * 8],rax
+		 */
+		emit_stx(prog, bytes_to_bpf_size(8),
+			 BPF_REG_FP, BPF_REG_0, -(regs_off - i * 8));
+
+		/* Save struct registers to stack.
+		 * For example, argument 1 (second argument) size is 16 which occupies two
+		 * registers, these two register values will be saved in stack.
+		 * mov QWORD PTR [rbp-0x40],rsi
+		 * mov QWORD PTR [rbp-0x38],rdx
+		 */
+		s_arg_nregs = (m->arg_size[i] + 7) / 8;
+		for (j = 0; j < s_arg_nregs; j++) {
+			emit_stx(prog, bytes_to_bpf_size(8),
+				 BPF_REG_FP,
+				 curr_reg_idx == 5 ? X86_REG_R9 : BPF_REG_1 + curr_reg_idx,
+				 -(struct_val_off - curr_s_stack_off * 8));
+			curr_reg_idx++;
+			curr_s_stack_off++;
+		}
+	}
 }
 
 static void restore_regs(const struct btf_func_model *m, u8 **prog, int nr_args,
-			 int regs_off)
+			 int regs_off, int struct_val_off)
 {
-	int i;
+	int i, j, curr_reg_idx, curr_s_stack_off, s_arg_nregs;
+
+	curr_reg_idx = curr_s_stack_off = 0;
+	for (i = 0; i < nr_args; i++) {
+		if (!(m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)) {
+			/* Restore struct values from stack.
+			 * For example, argument 1 (second argument) size is 16 which occupies two
+			 * registers, these two register values will be restored to the original
+			 * registers.
+			 * mov rsi,QWORD PTR [rbp-0x40]
+			 * mov rdx,QWORD PTR [rbp-0x38]
+			 */
+			emit_ldx(prog, bytes_to_bpf_size(m->arg_size[i]),
+				 curr_reg_idx  == 5 ? X86_REG_R9 : BPF_REG_1 + curr_reg_idx,
+				 BPF_REG_FP,
+				 -(regs_off - i * 8));
+			curr_reg_idx++;
+			continue;
+		}
 
-	/* Restore function arguments from stack.
-	 * For a function that accepts two pointers the sequence will be:
-	 * EMIT4(0x48, 0x8B, 0x7D, 0xF0); mov rdi,QWORD PTR [rbp-0x10]
-	 * EMIT4(0x48, 0x8B, 0x75, 0xF8); mov rsi,QWORD PTR [rbp-0x8]
-	 */
-	for (i = 0; i < min(nr_args, 6); i++)
-		emit_ldx(prog, bytes_to_bpf_size(m->arg_size[i]),
-			 i == 5 ? X86_REG_R9 : BPF_REG_1 + i,
-			 BPF_REG_FP,
-			 -(regs_off - i * 8));
+		/* Restore struct values from stack.
+		 * For example, argument 1 (second argument) size is 16 which occupies two
+		 * registers, these two register values will be restored to the original
+		 * registers.
+		 * mov rsi,QWORD PTR [rbp-0x40]
+		 * mov rdx,QWORD PTR [rbp-0x38]
+		 */
+		s_arg_nregs = (m->arg_size[i] + 7) / 8;
+		for (j = 0; j < s_arg_nregs; j++) {
+			emit_ldx(prog, bytes_to_bpf_size(8),
+				 curr_reg_idx == 5 ? X86_REG_R9 : BPF_REG_1 + curr_reg_idx,
+				 BPF_REG_FP,
+				 -(struct_val_off - curr_s_stack_off * 8));
+			curr_reg_idx++;
+			curr_s_stack_off++;
+		}
+	}
 }
 
 static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
@@ -2020,6 +2081,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 	struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
 	struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
 	struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
+	int struct_val_off, extra_nregs = 0;
 	u8 **branches = NULL;
 	u8 *prog;
 	bool save_ret;
@@ -2028,6 +2090,20 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 	if (nr_args > 6)
 		return -ENOTSUPP;
 
+	for (i = 0; i < MAX_BPF_FUNC_ARGS; i++) {
+		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) {
+			/* Only support up to 16 bytes struct which should keep
+			 * values in registers.
+			 */
+			if (m->arg_size[i] > 16)
+				return -ENOTSUPP;
+
+			extra_nregs += (m->arg_size[i] + 7) / 8 - 1;
+		}
+	}
+	if (nr_args + extra_nregs > 6)
+		return -ENOTSUPP;
+
 	/* Generated trampoline stack layout:
 	 *
 	 * RBP + 8         [ return address  ]
@@ -2066,6 +2142,13 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 	stack_size += (sizeof(struct bpf_tramp_run_ctx) + 7) & ~0x7;
 	run_ctx_off = stack_size;
 
+	/* For structure argument */
+	for (i = 0; i < MAX_BPF_FUNC_ARGS; i++) {
+		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
+			stack_size += (m->arg_size[i] + 7) & ~0x7;
+	}
+	struct_val_off = stack_size;
+
 	if (flags & BPF_TRAMP_F_SKIP_FRAME) {
 		/* skip patched call instruction and point orig_call to actual
 		 * body of the kernel function.
@@ -2101,7 +2184,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 		emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -ip_off);
 	}
 
-	save_regs(m, &prog, nr_args, regs_off);
+	save_regs(m, &prog, nr_args, regs_off, struct_val_off);
 
 	if (flags & BPF_TRAMP_F_CALL_ORIG) {
 		/* arg1: mov rdi, im */
@@ -2131,7 +2214,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 	}
 
 	if (flags & BPF_TRAMP_F_CALL_ORIG) {
-		restore_regs(m, &prog, nr_args, regs_off);
+		restore_regs(m, &prog, nr_args, regs_off, struct_val_off);
 
 		if (flags & BPF_TRAMP_F_ORIG_STACK) {
 			emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
@@ -2172,7 +2255,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
 		}
 
 	if (flags & BPF_TRAMP_F_RESTORE_REGS)
-		restore_regs(m, &prog, nr_args, regs_off);
+		restore_regs(m, &prog, nr_args, regs_off, struct_val_off);
 
 	/* This needs to be done regardless. If there were fmod_ret programs,
 	 * the return value is only updated on the stack and still needs to be
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH bpf-next v2 4/6] bpf: arm64: No support of struct argument
  2022-08-12  5:24 [PATCH bpf-next v2 0/6] bpf: Support struct argument for trampoline base progs Yonghong Song
                   ` (2 preceding siblings ...)
  2022-08-12  5:24 ` [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments Yonghong Song
@ 2022-08-12  5:24 ` Yonghong Song
  2022-08-12  5:24 ` [PATCH bpf-next v2 5/6] bpf: Populate struct argument info in btf_func_model Yonghong Song
  2022-08-12  5:24 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add struct argument tests with fentry/fexit programs Yonghong Song
  5 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-12  5:24 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

ARM64 does not support struct argument for trampoline based
bpf programs yet.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 arch/arm64/net/bpf_jit_comp.c | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c
index 7ca8779ae34f..df45d8f9d851 100644
--- a/arch/arm64/net/bpf_jit_comp.c
+++ b/arch/arm64/net/bpf_jit_comp.c
@@ -1970,7 +1970,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image,
 				u32 flags, struct bpf_tramp_links *tlinks,
 				void *orig_call)
 {
-	int ret;
+	int i, ret;
 	int nargs = m->nr_args;
 	int max_insns = ((long)image_end - (long)image) / AARCH64_INSN_SIZE;
 	struct jit_ctx ctx = {
@@ -1982,6 +1982,12 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image,
 	if (nargs > 8)
 		return -ENOTSUPP;
 
+	/* don't support struct argument */
+	for (i = 0; i < MAX_BPF_FUNC_ARGS; i++) {
+		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
+			return -ENOTSUPP;
+	}
+
 	ret = prepare_trampoline(&ctx, im, tlinks, orig_call, nargs, flags);
 	if (ret < 0)
 		return ret;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH bpf-next v2 5/6] bpf: Populate struct argument info in btf_func_model
  2022-08-12  5:24 [PATCH bpf-next v2 0/6] bpf: Support struct argument for trampoline base progs Yonghong Song
                   ` (3 preceding siblings ...)
  2022-08-12  5:24 ` [PATCH bpf-next v2 4/6] bpf: arm64: No support of struct argument Yonghong Song
@ 2022-08-12  5:24 ` Yonghong Song
  2022-08-12  5:24 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add struct argument tests with fentry/fexit programs Yonghong Song
  5 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-12  5:24 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

Add struct argument support in btf_ctx_access() and btf_distill_func_proto().
The arch-specific code will handle whether and how such struct arguments
are supported.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 kernel/bpf/btf.c | 30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 903719b89238..f38ae0e908fd 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5339,7 +5339,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 	struct bpf_verifier_log *log = info->log;
 	const struct btf_param *args;
 	const char *tag_value;
-	u32 nr_args, arg;
+	u32 nr_args, arg, curr_tid = 0;
 	int i, ret;
 
 	if (off % 8) {
@@ -5385,6 +5385,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 			 */
 			if (!t)
 				return true;
+			curr_tid = t->type;
 			t = btf_type_by_id(btf, t->type);
 			break;
 		case BPF_MODIFY_RETURN:
@@ -5394,7 +5395,7 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 			if (!t)
 				return false;
 
-			t = btf_type_skip_modifiers(btf, t->type, NULL);
+			t = btf_type_skip_modifiers(btf, t->type, &curr_tid);
 			if (!btf_type_is_small_int(t)) {
 				bpf_log(log,
 					"ret type %s not allowed for fmod_ret\n",
@@ -5411,15 +5412,25 @@ bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 		if (!t)
 			/* Default prog with MAX_BPF_FUNC_REG_ARGS args */
 			return true;
+		curr_tid = args[arg].type;
 		t = btf_type_by_id(btf, args[arg].type);
 	}
 
 	/* skip modifiers */
-	while (btf_type_is_modifier(t))
+	while (btf_type_is_modifier(t)) {
+		curr_tid = t->type;
 		t = btf_type_by_id(btf, t->type);
+	}
 	if (btf_type_is_small_int(t) || btf_is_any_enum(t))
 		/* accessing a scalar */
 		return true;
+	if (__btf_type_is_struct(t) && curr_tid) {
+		info->reg_type = PTR_TO_BTF_ID;
+		info->btf = btf;
+		info->btf_id = curr_tid;
+		return true;
+	}
+
 	if (!btf_type_is_ptr(t)) {
 		bpf_log(log,
 			"func '%s' arg%d '%s' has type %s. Only pointer access is allowed\n",
@@ -5878,7 +5889,7 @@ static int __get_type_size(struct btf *btf, u32 btf_id,
 	if (!t)
 		return -EINVAL;
 	*ret_type = t;
-	if (btf_type_is_ptr(t))
+	if (btf_type_is_ptr(t) || __btf_type_is_struct(t))
 		/* kernel size of pointer. Not BPF's size of pointer*/
 		return sizeof(void *);
 	if (btf_type_is_int(t) || btf_is_any_enum(t))
@@ -5901,8 +5912,10 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
 		/* BTF function prototype doesn't match the verifier types.
 		 * Fall back to MAX_BPF_FUNC_REG_ARGS u64 args.
 		 */
-		for (i = 0; i < MAX_BPF_FUNC_REG_ARGS; i++)
+		for (i = 0; i < MAX_BPF_FUNC_REG_ARGS; i++) {
 			m->arg_size[i] = 8;
+			m->arg_flags[i] = 0;
+		}
 		m->ret_size = 8;
 		m->nr_args = MAX_BPF_FUNC_REG_ARGS;
 		return 0;
@@ -5944,7 +5957,12 @@ int btf_distill_func_proto(struct bpf_verifier_log *log,
 				tname);
 			return -EINVAL;
 		}
-		m->arg_size[i] = ret;
+		if (__btf_type_is_struct(t)) {
+			m->arg_flags[i] = BTF_FMODEL_STRUCT_ARG;
+			m->arg_size[i] = t->size;
+		} else {
+			m->arg_size[i] = ret;
+		}
 	}
 	m->nr_args = nargs;
 	return 0;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH bpf-next v2 6/6] selftests/bpf: Add struct argument tests with fentry/fexit programs.
  2022-08-12  5:24 [PATCH bpf-next v2 0/6] bpf: Support struct argument for trampoline base progs Yonghong Song
                   ` (4 preceding siblings ...)
  2022-08-12  5:24 ` [PATCH bpf-next v2 5/6] bpf: Populate struct argument info in btf_func_model Yonghong Song
@ 2022-08-12  5:24 ` Yonghong Song
  5 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-12  5:24 UTC (permalink / raw)
  To: bpf; +Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

Add various struct argument tests with fentry/fexit programs.

Signed-off-by: Yonghong Song <yhs@fb.com>
---
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   | 41 ++++++++
 .../selftests/bpf/prog_tests/tracing_struct.c | 55 +++++++++++
 .../selftests/bpf/progs/tracing_struct.c      | 93 +++++++++++++++++++
 3 files changed, 189 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/tracing_struct.c
 create mode 100644 tools/testing/selftests/bpf/progs/tracing_struct.c

diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index 792cb15bac40..11fe9f219a17 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -18,6 +18,40 @@ typedef int (*func_proto_typedef_nested1)(func_proto_typedef);
 typedef int (*func_proto_typedef_nested2)(func_proto_typedef_nested1);
 
 DEFINE_PER_CPU(int, bpf_testmod_ksym_percpu) = 123;
+long bpf_testmod_test_struct_arg_result;
+
+struct bpf_testmod_struct_arg_1 {
+	int a;
+};
+struct bpf_testmod_struct_arg_2 {
+	long a;
+	long b;
+};
+
+noinline int
+bpf_testmod_test_struct_arg_1(struct bpf_testmod_struct_arg_2 a, int b, int c) {
+	bpf_testmod_test_struct_arg_result = a.a + a.b  + b + c;
+	return bpf_testmod_test_struct_arg_result;
+}
+
+noinline int
+bpf_testmod_test_struct_arg_2(int a, struct bpf_testmod_struct_arg_2 b, int c) {
+	bpf_testmod_test_struct_arg_result = a + b.a + b.b + c;
+	return bpf_testmod_test_struct_arg_result;
+}
+
+noinline int
+bpf_testmod_test_struct_arg_3(int a, int b, struct bpf_testmod_struct_arg_2 c) {
+	bpf_testmod_test_struct_arg_result = a + b + c.a + c.b;
+	return bpf_testmod_test_struct_arg_result;
+}
+
+noinline int
+bpf_testmod_test_struct_arg_4(struct bpf_testmod_struct_arg_1 a, int b,
+			      int c, int d, struct bpf_testmod_struct_arg_2 e) {
+	bpf_testmod_test_struct_arg_result = a.a + b + c + d + e.a + e.b;
+	return bpf_testmod_test_struct_arg_result;
+}
 
 noinline void
 bpf_testmod_test_mod_kfunc(int i)
@@ -98,11 +132,18 @@ bpf_testmod_test_read(struct file *file, struct kobject *kobj,
 		.off = off,
 		.len = len,
 	};
+	struct bpf_testmod_struct_arg_1 struct_arg1 = {10};
+	struct bpf_testmod_struct_arg_2 struct_arg2 = {2, 3};
 	int i = 1;
 
 	while (bpf_testmod_return_ptr(i))
 		i++;
 
+	(void)bpf_testmod_test_struct_arg_1(struct_arg2, 1, 4);
+	(void)bpf_testmod_test_struct_arg_2(1, struct_arg2, 4);
+	(void)bpf_testmod_test_struct_arg_3(1, 4, struct_arg2);
+	(void)bpf_testmod_test_struct_arg_4(struct_arg1, 1, 2, 3, struct_arg2);
+
 	/* This is always true. Use the check to make sure the compiler
 	 * doesn't remove bpf_testmod_loop_test.
 	 */
diff --git a/tools/testing/selftests/bpf/prog_tests/tracing_struct.c b/tools/testing/selftests/bpf/prog_tests/tracing_struct.c
new file mode 100644
index 000000000000..8b1fbc3fc463
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/tracing_struct.c
@@ -0,0 +1,55 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include <test_progs.h>
+#include "tracing_struct.skel.h"
+
+static void test_fentry(void)
+{
+	struct tracing_struct *skel;
+	int err;
+
+	skel = tracing_struct__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "tracing_struct__open_and_load"))
+		return;
+
+	err = tracing_struct__attach(skel);
+	if (!ASSERT_OK(err, "tracing_struct__attach"))
+		return;
+
+	ASSERT_OK(trigger_module_test_read(256), "trigger_read");
+
+	ASSERT_EQ(skel->bss->t1_a_a, 2, "t1:a.a");
+	ASSERT_EQ(skel->bss->t1_a_b, 3, "t1:a.b");
+	ASSERT_EQ(skel->bss->t1_b, 1, "t1:b");
+	ASSERT_EQ(skel->bss->t1_c, 4, "t1:c");
+	ASSERT_EQ(skel->bss->t1_ret, 10, "t1 ret");
+
+	ASSERT_EQ(skel->bss->t2_a, 1, "t2:a");
+	ASSERT_EQ(skel->bss->t2_b_a, 2, "t2:b.a");
+	ASSERT_EQ(skel->bss->t2_b_b, 3, "t2:b.b");
+	ASSERT_EQ(skel->bss->t2_c, 4, "t2:c");
+	ASSERT_EQ(skel->bss->t2_ret, 10, "t2 ret");
+
+	ASSERT_EQ(skel->bss->t3_a, 1, "t3:a");
+	ASSERT_EQ(skel->bss->t3_b, 4, "t3:b");
+	ASSERT_EQ(skel->bss->t3_c_a, 2, "t3:c.a");
+	ASSERT_EQ(skel->bss->t3_c_b, 3, "t3:c.b");
+	ASSERT_EQ(skel->bss->t3_ret, 10, "t3 ret");
+
+	ASSERT_EQ(skel->bss->t4_a_a, 10, "t4:a.a");
+	ASSERT_EQ(skel->bss->t4_b, 1, "t4:b");
+	ASSERT_EQ(skel->bss->t4_c, 2, "t4:c");
+	ASSERT_EQ(skel->bss->t4_d, 3, "t4:d");
+	ASSERT_EQ(skel->bss->t4_e_a, 2, "t4:e.a");
+	ASSERT_EQ(skel->bss->t4_e_b, 3, "t4:e.b");
+	ASSERT_EQ(skel->bss->t4_ret, 21, "t4 ret");
+
+	tracing_struct__detach(skel);
+	tracing_struct__destroy(skel);
+}
+
+void test_tracing_struct(void)
+{
+	test_fentry();
+}
diff --git a/tools/testing/selftests/bpf/progs/tracing_struct.c b/tools/testing/selftests/bpf/progs/tracing_struct.c
new file mode 100644
index 000000000000..e724c7e1a933
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/tracing_struct.c
@@ -0,0 +1,93 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2022 Meta Platforms, Inc. and affiliates. */
+
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+
+struct bpf_testmod_struct_arg_1 {
+	int a;
+};
+struct bpf_testmod_struct_arg_2 {
+	long a;
+	long b;
+};
+
+long t1_a_a, t1_a_b, t1_b, t1_c, t1_ret;
+long t2_a, t2_b_a, t2_b_b, t2_c, t2_ret;
+long t3_a, t3_b, t3_c_a, t3_c_b, t3_ret;
+long t4_a_a, t4_b, t4_c, t4_d, t4_e_a, t4_e_b, t4_ret;
+
+SEC("fentry/bpf_testmod_test_struct_arg_1")
+int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 *a, int b, int c)
+{
+	t1_a_a = a->a;
+	t1_a_b = a->b;
+	t1_b = b;
+	t1_c = c;
+	return 0;
+}
+
+SEC("fexit/bpf_testmod_test_struct_arg_1")
+int BPF_PROG(test_struct_arg_2, struct bpf_testmod_struct_arg_2 *a, int b, int c, int ret)
+{
+	t1_ret = ret;
+	return 0;
+}
+
+SEC("fentry/bpf_testmod_test_struct_arg_2")
+int BPF_PROG(test_struct_arg_3, int a, struct bpf_testmod_struct_arg_2 *b, int c)
+{
+	t2_a = a;
+	t2_b_a = b->a;
+	t2_b_b = b->b;
+	t2_c = c;
+	return 0;
+}
+
+SEC("fexit/bpf_testmod_test_struct_arg_2")
+int BPF_PROG(test_struct_arg_4, int a, struct bpf_testmod_struct_arg_2 *b, int c, int ret)
+{
+	t2_ret = ret;
+	return 0;
+}
+
+SEC("fentry/bpf_testmod_test_struct_arg_3")
+int BPF_PROG(test_struct_arg_5, int a, int b, struct bpf_testmod_struct_arg_2 *c)
+{
+	t3_a = a;
+	t3_b = b;
+	t3_c_a = c->a;
+	t3_c_b = c->b;
+	return 0;
+}
+
+SEC("fexit/bpf_testmod_test_struct_arg_3")
+int BPF_PROG(test_struct_arg_6, int a, int b, struct bpf_testmod_struct_arg_2 *c, int ret)
+{
+	t3_ret = ret;
+	return 0;
+}
+
+SEC("fentry/bpf_testmod_test_struct_arg_4")
+int BPF_PROG(test_struct_arg_7, struct bpf_testmod_struct_arg_1 *a, int b,
+	     int c, int d, struct bpf_testmod_struct_arg_2 *e)
+{
+	t4_a_a = a->a;
+	t4_b = b;
+	t4_c = c;
+	t4_d = d;
+	t4_e_a = e->a;
+	t4_e_b = e->b;
+	return 0;
+}
+
+SEC("fexit/bpf_testmod_test_struct_arg_4")
+int BPF_PROG(test_struct_arg_8, struct bpf_testmod_struct_arg_1 *a, int b,
+	     int c, int d, struct bpf_testmod_struct_arg_2 *e, int ret)
+{
+	t4_ret = ret;
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments
  2022-08-12  5:24 ` [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments Yonghong Song
@ 2022-08-14 20:24   ` Jiri Olsa
  2022-08-15  5:29     ` Yonghong Song
  2022-08-15 22:44   ` Alexei Starovoitov
  1 sibling, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2022-08-14 20:24 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team

On Thu, Aug 11, 2022 at 10:24:35PM -0700, Yonghong Song wrote:

SNIP

>  }
>  
>  static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
> @@ -2020,6 +2081,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>  	struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
>  	struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
>  	struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> +	int struct_val_off, extra_nregs = 0;
>  	u8 **branches = NULL;
>  	u8 *prog;
>  	bool save_ret;
> @@ -2028,6 +2090,20 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>  	if (nr_args > 6)
>  		return -ENOTSUPP;
>  
> +	for (i = 0; i < MAX_BPF_FUNC_ARGS; i++) {
> +		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) {
> +			/* Only support up to 16 bytes struct which should keep
> +			 * values in registers.
> +			 */

it seems that if the struct contains 'double' field, it's passed in
SSE register, which we don't support is save/restore

we should probably check struct's BTF in btf_distill_func_proto and
fail if we found anything else than regular regs types?

> +			if (m->arg_size[i] > 16)
> +				return -ENOTSUPP;
> +
> +			extra_nregs += (m->arg_size[i] + 7) / 8 - 1;
> +		}
> +	}
> +	if (nr_args + extra_nregs > 6)

should this value be minus the number of actually found struct arguments?

> +		return -ENOTSUPP;
> +
>  	/* Generated trampoline stack layout:
>  	 *
>  	 * RBP + 8         [ return address  ]
> @@ -2066,6 +2142,13 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>  	stack_size += (sizeof(struct bpf_tramp_run_ctx) + 7) & ~0x7;
>  	run_ctx_off = stack_size;
>  
> +	/* For structure argument */
> +	for (i = 0; i < MAX_BPF_FUNC_ARGS; i++) {
> +		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
> +			stack_size += (m->arg_size[i] + 7) & ~0x7;
> +	}
> +	struct_val_off = stack_size;

could you please update the 'Generated trampoline stack layout' table
above with this offset

thanks,
jirka

> +
>  	if (flags & BPF_TRAMP_F_SKIP_FRAME) {
>  		/* skip patched call instruction and point orig_call to actual
>  		 * body of the kernel function.
> @@ -2101,7 +2184,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>  		emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -ip_off);
>  	}
>  
> -	save_regs(m, &prog, nr_args, regs_off);
> +	save_regs(m, &prog, nr_args, regs_off, struct_val_off);
>  
>  	if (flags & BPF_TRAMP_F_CALL_ORIG) {
>  		/* arg1: mov rdi, im */
> @@ -2131,7 +2214,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>  	}
>  
>  	if (flags & BPF_TRAMP_F_CALL_ORIG) {
> -		restore_regs(m, &prog, nr_args, regs_off);
> +		restore_regs(m, &prog, nr_args, regs_off, struct_val_off);
>  
>  		if (flags & BPF_TRAMP_F_ORIG_STACK) {
>  			emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> @@ -2172,7 +2255,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>  		}
>  
>  	if (flags & BPF_TRAMP_F_RESTORE_REGS)
> -		restore_regs(m, &prog, nr_args, regs_off);
> +		restore_regs(m, &prog, nr_args, regs_off, struct_val_off);
>  
>  	/* This needs to be done regardless. If there were fmod_ret programs,
>  	 * the return value is only updated on the stack and still needs to be
> -- 
> 2.30.2
> 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments
  2022-08-14 20:24   ` Jiri Olsa
@ 2022-08-15  5:29     ` Yonghong Song
  2022-08-15  7:29       ` Jiri Olsa
  0 siblings, 1 reply; 19+ messages in thread
From: Yonghong Song @ 2022-08-15  5:29 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team



On 8/14/22 1:24 PM, Jiri Olsa wrote:
> On Thu, Aug 11, 2022 at 10:24:35PM -0700, Yonghong Song wrote:
> 
> SNIP
> 
>>   }
>>   
>>   static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
>> @@ -2020,6 +2081,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>>   	struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
>>   	struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
>>   	struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
>> +	int struct_val_off, extra_nregs = 0;
>>   	u8 **branches = NULL;
>>   	u8 *prog;
>>   	bool save_ret;
>> @@ -2028,6 +2090,20 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>>   	if (nr_args > 6)
>>   		return -ENOTSUPP;
>>   
>> +	for (i = 0; i < MAX_BPF_FUNC_ARGS; i++) {
>> +		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) {
>> +			/* Only support up to 16 bytes struct which should keep
>> +			 * values in registers.
>> +			 */
> 
> it seems that if the struct contains 'double' field, it's passed in
> SSE register, which we don't support is save/restore

That is right.

> 
> we should probably check struct's BTF in btf_distill_func_proto and
> fail if we found anything else than regular regs types?

The reason I didn't add float/double checking is that I didn't actually
find any float/double struct members in either vmlinux.h or in
arch/x86 directory. Could you help double check as well?

> 
>> +			if (m->arg_size[i] > 16)
>> +				return -ENOTSUPP;
>> +
>> +			extra_nregs += (m->arg_size[i] + 7) / 8 - 1;
>> +		}
>> +	}
>> +	if (nr_args + extra_nregs > 6)
> 
> should this value be minus the number of actually found struct arguments?

In the above we have
	extra_nregs += (m->arg_size[i] + 7) / 8 - 1;
already did the 'minus' part.

> 
>> +		return -ENOTSUPP;
>> +
>>   	/* Generated trampoline stack layout:
>>   	 *
>>   	 * RBP + 8         [ return address  ]
>> @@ -2066,6 +2142,13 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>>   	stack_size += (sizeof(struct bpf_tramp_run_ctx) + 7) & ~0x7;
>>   	run_ctx_off = stack_size;
>>   
>> +	/* For structure argument */
>> +	for (i = 0; i < MAX_BPF_FUNC_ARGS; i++) {
>> +		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
>> +			stack_size += (m->arg_size[i] + 7) & ~0x7;
>> +	}
>> +	struct_val_off = stack_size;
> 
> could you please update the 'Generated trampoline stack layout' table
> above with this offset

Okay, will do in the next revision.

> 
> thanks,
> jirka
> 
>> +
>>   	if (flags & BPF_TRAMP_F_SKIP_FRAME) {
>>   		/* skip patched call instruction and point orig_call to actual
>>   		 * body of the kernel function.
>> @@ -2101,7 +2184,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>>   		emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -ip_off);
>>   	}
>>   
>> -	save_regs(m, &prog, nr_args, regs_off);
>> +	save_regs(m, &prog, nr_args, regs_off, struct_val_off);
>>   
>>   	if (flags & BPF_TRAMP_F_CALL_ORIG) {
>>   		/* arg1: mov rdi, im */
>> @@ -2131,7 +2214,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>>   	}
>>   
>>   	if (flags & BPF_TRAMP_F_CALL_ORIG) {
>> -		restore_regs(m, &prog, nr_args, regs_off);
>> +		restore_regs(m, &prog, nr_args, regs_off, struct_val_off);
>>   
>>   		if (flags & BPF_TRAMP_F_ORIG_STACK) {
>>   			emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
>> @@ -2172,7 +2255,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>>   		}
>>   
>>   	if (flags & BPF_TRAMP_F_RESTORE_REGS)
>> -		restore_regs(m, &prog, nr_args, regs_off);
>> +		restore_regs(m, &prog, nr_args, regs_off, struct_val_off);
>>   
>>   	/* This needs to be done regardless. If there were fmod_ret programs,
>>   	 * the return value is only updated on the stack and still needs to be
>> -- 
>> 2.30.2
>>

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments
  2022-08-15  5:29     ` Yonghong Song
@ 2022-08-15  7:29       ` Jiri Olsa
  2022-08-15 15:25         ` Yonghong Song
  0 siblings, 1 reply; 19+ messages in thread
From: Jiri Olsa @ 2022-08-15  7:29 UTC (permalink / raw)
  To: Yonghong Song
  Cc: Jiri Olsa, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, kernel-team

On Sun, Aug 14, 2022 at 10:29:11PM -0700, Yonghong Song wrote:
> 
> 
> On 8/14/22 1:24 PM, Jiri Olsa wrote:
> > On Thu, Aug 11, 2022 at 10:24:35PM -0700, Yonghong Song wrote:
> > 
> > SNIP
> > 
> > >   }
> > >   static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
> > > @@ -2020,6 +2081,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > >   	struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
> > >   	struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
> > >   	struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
> > > +	int struct_val_off, extra_nregs = 0;
> > >   	u8 **branches = NULL;
> > >   	u8 *prog;
> > >   	bool save_ret;
> > > @@ -2028,6 +2090,20 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > >   	if (nr_args > 6)
> > >   		return -ENOTSUPP;
> > > +	for (i = 0; i < MAX_BPF_FUNC_ARGS; i++) {
> > > +		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) {
> > > +			/* Only support up to 16 bytes struct which should keep
> > > +			 * values in registers.
> > > +			 */
> > 
> > it seems that if the struct contains 'double' field, it's passed in
> > SSE register, which we don't support is save/restore
> 
> That is right.
> 
> > 
> > we should probably check struct's BTF in btf_distill_func_proto and
> > fail if we found anything else than regular regs types?
> 
> The reason I didn't add float/double checking is that I didn't actually
> find any float/double struct members in either vmlinux.h or in
> arch/x86 directory. Could you help double check as well?

ok I checked on fedora's BTF and could not find any

still the check might be good or at least mention
that in comment

> 
> > 
> > > +			if (m->arg_size[i] > 16)
> > > +				return -ENOTSUPP;
> > > +
> > > +			extra_nregs += (m->arg_size[i] + 7) / 8 - 1;
> > > +		}
> > > +	}
> > > +	if (nr_args + extra_nregs > 6)
> > 
> > should this value be minus the number of actually found struct arguments?
> 
> In the above we have
> 	extra_nregs += (m->arg_size[i] + 7) / 8 - 1;
> already did the 'minus' part.

there it is ;-) ok

jirka

> 
> > 
> > > +		return -ENOTSUPP;
> > > +
> > >   	/* Generated trampoline stack layout:
> > >   	 *
> > >   	 * RBP + 8         [ return address  ]
> > > @@ -2066,6 +2142,13 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > >   	stack_size += (sizeof(struct bpf_tramp_run_ctx) + 7) & ~0x7;
> > >   	run_ctx_off = stack_size;
> > > +	/* For structure argument */
> > > +	for (i = 0; i < MAX_BPF_FUNC_ARGS; i++) {
> > > +		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG)
> > > +			stack_size += (m->arg_size[i] + 7) & ~0x7;
> > > +	}
> > > +	struct_val_off = stack_size;
> > 
> > could you please update the 'Generated trampoline stack layout' table
> > above with this offset
> 
> Okay, will do in the next revision.
> 
> > 
> > thanks,
> > jirka
> > 
> > > +
> > >   	if (flags & BPF_TRAMP_F_SKIP_FRAME) {
> > >   		/* skip patched call instruction and point orig_call to actual
> > >   		 * body of the kernel function.
> > > @@ -2101,7 +2184,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > >   		emit_stx(&prog, BPF_DW, BPF_REG_FP, BPF_REG_0, -ip_off);
> > >   	}
> > > -	save_regs(m, &prog, nr_args, regs_off);
> > > +	save_regs(m, &prog, nr_args, regs_off, struct_val_off);
> > >   	if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > >   		/* arg1: mov rdi, im */
> > > @@ -2131,7 +2214,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > >   	}
> > >   	if (flags & BPF_TRAMP_F_CALL_ORIG) {
> > > -		restore_regs(m, &prog, nr_args, regs_off);
> > > +		restore_regs(m, &prog, nr_args, regs_off, struct_val_off);
> > >   		if (flags & BPF_TRAMP_F_ORIG_STACK) {
> > >   			emit_ldx(&prog, BPF_DW, BPF_REG_0, BPF_REG_FP, 8);
> > > @@ -2172,7 +2255,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
> > >   		}
> > >   	if (flags & BPF_TRAMP_F_RESTORE_REGS)
> > > -		restore_regs(m, &prog, nr_args, regs_off);
> > > +		restore_regs(m, &prog, nr_args, regs_off, struct_val_off);
> > >   	/* This needs to be done regardless. If there were fmod_ret programs,
> > >   	 * the return value is only updated on the stack and still needs to be
> > > -- 
> > > 2.30.2
> > > 

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments
  2022-08-15  7:29       ` Jiri Olsa
@ 2022-08-15 15:25         ` Yonghong Song
  0 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-15 15:25 UTC (permalink / raw)
  To: Jiri Olsa
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, kernel-team



On 8/15/22 12:29 AM, Jiri Olsa wrote:
> On Sun, Aug 14, 2022 at 10:29:11PM -0700, Yonghong Song wrote:
>>
>>
>> On 8/14/22 1:24 PM, Jiri Olsa wrote:
>>> On Thu, Aug 11, 2022 at 10:24:35PM -0700, Yonghong Song wrote:
>>>
>>> SNIP
>>>
>>>>    }
>>>>    static int invoke_bpf_prog(const struct btf_func_model *m, u8 **pprog,
>>>> @@ -2020,6 +2081,7 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>>>>    	struct bpf_tramp_links *fentry = &tlinks[BPF_TRAMP_FENTRY];
>>>>    	struct bpf_tramp_links *fexit = &tlinks[BPF_TRAMP_FEXIT];
>>>>    	struct bpf_tramp_links *fmod_ret = &tlinks[BPF_TRAMP_MODIFY_RETURN];
>>>> +	int struct_val_off, extra_nregs = 0;
>>>>    	u8 **branches = NULL;
>>>>    	u8 *prog;
>>>>    	bool save_ret;
>>>> @@ -2028,6 +2090,20 @@ int arch_prepare_bpf_trampoline(struct bpf_tramp_image *im, void *image, void *i
>>>>    	if (nr_args > 6)
>>>>    		return -ENOTSUPP;
>>>> +	for (i = 0; i < MAX_BPF_FUNC_ARGS; i++) {
>>>> +		if (m->arg_flags[i] & BTF_FMODEL_STRUCT_ARG) {
>>>> +			/* Only support up to 16 bytes struct which should keep
>>>> +			 * values in registers.
>>>> +			 */
>>>
>>> it seems that if the struct contains 'double' field, it's passed in
>>> SSE register, which we don't support is save/restore
>>
>> That is right.
>>
>>>
>>> we should probably check struct's BTF in btf_distill_func_proto and
>>> fail if we found anything else than regular regs types?
>>
>> The reason I didn't add float/double checking is that I didn't actually
>> find any float/double struct members in either vmlinux.h or in
>> arch/x86 directory. Could you help double check as well?
> 
> ok I checked on fedora's BTF and could not find any
> 
> still the check might be good or at least mention
> that in comment

I will mention in the comment. thanks!

> 
>>
>>>
>>>> +			if (m->arg_size[i] > 16)
>>>> +				return -ENOTSUPP;
>>>> +
>>>> +			extra_nregs += (m->arg_size[i] + 7) / 8 - 1;
>>>> +		}
>>>> +	}
>>>> +	if (nr_args + extra_nregs > 6)
>>>
>>> should this value be minus the number of actually found struct arguments?
>>
>> In the above we have
>> 	extra_nregs += (m->arg_size[i] + 7) / 8 - 1;
>> already did the 'minus' part.
> 
> there it is ;-) ok
> 
> jirka
> 
[...]

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments
  2022-08-12  5:24 ` [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments Yonghong Song
  2022-08-14 20:24   ` Jiri Olsa
@ 2022-08-15 22:44   ` Alexei Starovoitov
  2022-08-18  4:56     ` Yonghong Song
  1 sibling, 1 reply; 19+ messages in thread
From: Alexei Starovoitov @ 2022-08-15 22:44 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

On Thu, Aug 11, 2022 at 10:24 PM Yonghong Song <yhs@fb.com> wrote:
>
> In C, struct value can be passed as a function argument.
> For small structs, struct value may be passed in
> one or more registers. For trampoline based bpf programs,
> This would cause complication since one-to-one mapping between
> function argument and arch argument register is not valid
> any more.
>
> To support struct value argument and make bpf programs
> easy to write, the bpf program function parameter is
> changed from struct type to a pointer to struct type.
> The following is a simplified example.
>
> In one of later selftests, we have a bpf_testmod function:
>     struct bpf_testmod_struct_arg_2 {
>         long a;
>         long b;
>     };
>     noinline int
>     bpf_testmod_test_struct_arg_2(int a, struct bpf_testmod_struct_arg_2 b, int c) {
>         bpf_testmod_test_struct_arg_result = a + b.a + b.b + c;
>         return bpf_testmod_test_struct_arg_result;
>     }
>
> When a bpf program is attached to the bpf_testmod function
> bpf_testmod_test_struct_arg_2(), the bpf program may look like
>     SEC("fentry/bpf_testmod_test_struct_arg_2")
>     int BPF_PROG(test_struct_arg_3, int a, struct bpf_testmod_struct_arg_2 *b, int c)
>     {
>         t2_a = a;
>         t2_b_a = b->a;
>         t2_b_b = b->b;
>         t2_c = c;
>         return 0;
>     }
>
> Basically struct value becomes a pointer to the struct.
> The trampoline stack will be increased to store the stack values and
> the pointer to these values will be saved in the stack slot corresponding
> to that argument. For x86_64, the struct size is limited up to 16 bytes
> so the struct can fit in one or two registers. The struct size of more
> than 16 bytes is not supported now as our current use case is
> for sockptr_t in the argument. We could handle such large struct's
> in the future if we have concrete use cases.
>
> The main changes are in save_regs() and restore_regs(). The following
> illustrated the trampoline asm codes for save_regs() and restore_regs().
> save_regs():
>     /* first argument */
>     mov    DWORD PTR [rbp-0x18],edi
>     /* second argument: struct, save actual values and put the pointer to the slot */
>     lea    rax,[rbp-0x40]
>     mov    QWORD PTR [rbp-0x10],rax
>     mov    QWORD PTR [rbp-0x40],rsi
>     mov    QWORD PTR [rbp-0x38],rdx
>     /* third argument */
>     mov    DWORD PTR [rbp-0x8],esi
> restore_regs():
>     mov    edi,DWORD PTR [rbp-0x18]
>     mov    rsi,QWORD PTR [rbp-0x40]
>     mov    rdx,QWORD PTR [rbp-0x38]
>     mov    esi,DWORD PTR [rbp-0x8]

Not sure whether it was discussed before, but
why cannot we adjust the bpf side instead?
Technically struct passing between bpf progs was never
officially supported. llvm generates something.
Probably always passes by reference, but we can adjust
that behavior without breaking any programs because
we don't have bpf libraries. Programs are fully contained
in one or few files. libbpf can do static linking, but
without any actual libraries the chance of breaking
backward compat is close to zero.
Can we teach llvm to pass sizeof(struct) <= 16 in
two bpf registers?
Then we wouldn't need to have a discrepancy between
kernel function prototype and bpf fentry prog proto.
Both will have struct by value in the same spot.
The trampoline generation will be simpler for x86 and
its runtime faster too.
The other architectures that pass small structs by reference
can do a bit more work in the trampoline: copy up to 16 byte
and bpf prog side will see it as they were passed in 'registers'.
wdyt?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments
  2022-08-15 22:44   ` Alexei Starovoitov
@ 2022-08-18  4:56     ` Yonghong Song
  2022-08-18 20:44       ` Alexei Starovoitov
  0 siblings, 1 reply; 19+ messages in thread
From: Yonghong Song @ 2022-08-18  4:56 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team



On 8/15/22 3:44 PM, Alexei Starovoitov wrote:
> On Thu, Aug 11, 2022 at 10:24 PM Yonghong Song <yhs@fb.com> wrote:
>>
>> In C, struct value can be passed as a function argument.
>> For small structs, struct value may be passed in
>> one or more registers. For trampoline based bpf programs,
>> This would cause complication since one-to-one mapping between
>> function argument and arch argument register is not valid
>> any more.
>>
>> To support struct value argument and make bpf programs
>> easy to write, the bpf program function parameter is
>> changed from struct type to a pointer to struct type.
>> The following is a simplified example.
>>
>> In one of later selftests, we have a bpf_testmod function:
>>      struct bpf_testmod_struct_arg_2 {
>>          long a;
>>          long b;
>>      };
>>      noinline int
>>      bpf_testmod_test_struct_arg_2(int a, struct bpf_testmod_struct_arg_2 b, int c) {
>>          bpf_testmod_test_struct_arg_result = a + b.a + b.b + c;
>>          return bpf_testmod_test_struct_arg_result;
>>      }
>>
>> When a bpf program is attached to the bpf_testmod function
>> bpf_testmod_test_struct_arg_2(), the bpf program may look like
>>      SEC("fentry/bpf_testmod_test_struct_arg_2")
>>      int BPF_PROG(test_struct_arg_3, int a, struct bpf_testmod_struct_arg_2 *b, int c)
>>      {
>>          t2_a = a;
>>          t2_b_a = b->a;
>>          t2_b_b = b->b;
>>          t2_c = c;
>>          return 0;
>>      }
>>
>> Basically struct value becomes a pointer to the struct.
>> The trampoline stack will be increased to store the stack values and
>> the pointer to these values will be saved in the stack slot corresponding
>> to that argument. For x86_64, the struct size is limited up to 16 bytes
>> so the struct can fit in one or two registers. The struct size of more
>> than 16 bytes is not supported now as our current use case is
>> for sockptr_t in the argument. We could handle such large struct's
>> in the future if we have concrete use cases.
>>
>> The main changes are in save_regs() and restore_regs(). The following
>> illustrated the trampoline asm codes for save_regs() and restore_regs().
>> save_regs():
>>      /* first argument */
>>      mov    DWORD PTR [rbp-0x18],edi
>>      /* second argument: struct, save actual values and put the pointer to the slot */
>>      lea    rax,[rbp-0x40]
>>      mov    QWORD PTR [rbp-0x10],rax
>>      mov    QWORD PTR [rbp-0x40],rsi
>>      mov    QWORD PTR [rbp-0x38],rdx
>>      /* third argument */
>>      mov    DWORD PTR [rbp-0x8],esi
>> restore_regs():
>>      mov    edi,DWORD PTR [rbp-0x18]
>>      mov    rsi,QWORD PTR [rbp-0x40]
>>      mov    rdx,QWORD PTR [rbp-0x38]
>>      mov    esi,DWORD PTR [rbp-0x8]
> 
> Not sure whether it was discussed before, but
> why cannot we adjust the bpf side instead?
> Technically struct passing between bpf progs was never
> officially supported. llvm generates something.
> Probably always passes by reference, but we can adjust
> that behavior without breaking any programs because
> we don't have bpf libraries. Programs are fully contained
> in one or few files. libbpf can do static linking, but
> without any actual libraries the chance of breaking
> backward compat is close to zero.

Agree. At this point, we don't need to worry about
compatibility between bpf program and bpf program libraries.

> Can we teach llvm to pass sizeof(struct) <= 16 in
> two bpf registers?

Yes, we can. I just hacked llvm and was able to
do that.

> Then we wouldn't need to have a discrepancy between
> kernel function prototype and bpf fentry prog proto.
> Both will have struct by value in the same spot.
> The trampoline generation will be simpler for x86 and
> its runtime faster too.

I tested x86 and arm64 both supports two registers
for a 16 byte struct.

> The other architectures that pass small structs by reference
> can do a bit more work in the trampoline: copy up to 16 byte
> and bpf prog side will see it as they were passed in 'registers'.
> wdyt?

I know systemz and Hexagon will pass by reference for any
struct size >= 8 bytes. Didn't complete check others.

But since x86 and arm64 supports direct value passing
with two registers, we should be okay. As you mentioned,
we could support systemz/hexagon style of struct passing
by copying the values to the stack.


But I have a problem how to define a user friendly
macro like BPF_PROG for user to use.

Let us say, we have a program like below:
SEC("fentry/bpf_testmod_test_struct_arg_1")
int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 *a, int 
b, int c) {
...
}

We have BPF_PROG macro definition here:

#define BPF_PROG(name, args...) 
     \
name(unsigned long long *ctx); 
     \
static __always_inline typeof(name(0)) 
     \
____##name(unsigned long long *ctx, ##args); 
     \
typeof(name(0)) name(unsigned long long *ctx) 
     \
{ 
     \
         _Pragma("GCC diagnostic push") 
      \
         _Pragma("GCC diagnostic ignored \"-Wint-conversion\"") 
      \
         return ____##name(___bpf_ctx_cast(args)); 
      \
         _Pragma("GCC diagnostic pop") 
      \
} 
     \
static __always_inline typeof(name(0)) 
     \
____##name(unsigned long long *ctx, ##args)

Some we have static function definition

int ____test_struct_arg_1(unsigned long long *ctx, struct 
bpf_testmod_struct_arg_2 *a, int b, int c);

But the function call inside the function test_struct_arg_1()
is
   ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2]);

We have two problems here:
   ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
does not match the static function declaration.
This is not problem if everything is int/ptr type.
If one of argument is structure type, we will have
type conversion problem. Let us this can be resolved
somehow through some hack.

More importantly, because some structure may take two
registers,
    ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
may not be correct. In my above example, if the
structure size is 16 bytes,
then the actual call should be
    ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2], ctx[3])
So we need to provide how many extra registers are needed
beyond ##args in the macro. I have not tried how to
resolve this but this extra information in the macro
definite is not user friendly.

Not sure what is the best way to handle this issue (##args is not 
precise and needs addition registers for >8 struct arguments).

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments
  2022-08-18  4:56     ` Yonghong Song
@ 2022-08-18 20:44       ` Alexei Starovoitov
  2022-08-24 19:04         ` Yonghong Song
  2022-08-24 19:05         ` Andrii Nakryiko
  0 siblings, 2 replies; 19+ messages in thread
From: Alexei Starovoitov @ 2022-08-18 20:44 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

On Wed, Aug 17, 2022 at 09:56:23PM -0700, Yonghong Song wrote:
> 
> 
> On 8/15/22 3:44 PM, Alexei Starovoitov wrote:
> > On Thu, Aug 11, 2022 at 10:24 PM Yonghong Song <yhs@fb.com> wrote:
> > > 
> > > In C, struct value can be passed as a function argument.
> > > For small structs, struct value may be passed in
> > > one or more registers. For trampoline based bpf programs,
> > > This would cause complication since one-to-one mapping between
> > > function argument and arch argument register is not valid
> > > any more.
> > > 
> > > To support struct value argument and make bpf programs
> > > easy to write, the bpf program function parameter is
> > > changed from struct type to a pointer to struct type.
> > > The following is a simplified example.
> > > 
> > > In one of later selftests, we have a bpf_testmod function:
> > >      struct bpf_testmod_struct_arg_2 {
> > >          long a;
> > >          long b;
> > >      };
> > >      noinline int
> > >      bpf_testmod_test_struct_arg_2(int a, struct bpf_testmod_struct_arg_2 b, int c) {
> > >          bpf_testmod_test_struct_arg_result = a + b.a + b.b + c;
> > >          return bpf_testmod_test_struct_arg_result;
> > >      }
> > > 
> > > When a bpf program is attached to the bpf_testmod function
> > > bpf_testmod_test_struct_arg_2(), the bpf program may look like
> > >      SEC("fentry/bpf_testmod_test_struct_arg_2")
> > >      int BPF_PROG(test_struct_arg_3, int a, struct bpf_testmod_struct_arg_2 *b, int c)
> > >      {
> > >          t2_a = a;
> > >          t2_b_a = b->a;
> > >          t2_b_b = b->b;
> > >          t2_c = c;
> > >          return 0;
> > >      }
> > > 
> > > Basically struct value becomes a pointer to the struct.
> > > The trampoline stack will be increased to store the stack values and
> > > the pointer to these values will be saved in the stack slot corresponding
> > > to that argument. For x86_64, the struct size is limited up to 16 bytes
> > > so the struct can fit in one or two registers. The struct size of more
> > > than 16 bytes is not supported now as our current use case is
> > > for sockptr_t in the argument. We could handle such large struct's
> > > in the future if we have concrete use cases.
> > > 
> > > The main changes are in save_regs() and restore_regs(). The following
> > > illustrated the trampoline asm codes for save_regs() and restore_regs().
> > > save_regs():
> > >      /* first argument */
> > >      mov    DWORD PTR [rbp-0x18],edi
> > >      /* second argument: struct, save actual values and put the pointer to the slot */
> > >      lea    rax,[rbp-0x40]
> > >      mov    QWORD PTR [rbp-0x10],rax
> > >      mov    QWORD PTR [rbp-0x40],rsi
> > >      mov    QWORD PTR [rbp-0x38],rdx
> > >      /* third argument */
> > >      mov    DWORD PTR [rbp-0x8],esi
> > > restore_regs():
> > >      mov    edi,DWORD PTR [rbp-0x18]
> > >      mov    rsi,QWORD PTR [rbp-0x40]
> > >      mov    rdx,QWORD PTR [rbp-0x38]
> > >      mov    esi,DWORD PTR [rbp-0x8]
> > 
> > Not sure whether it was discussed before, but
> > why cannot we adjust the bpf side instead?
> > Technically struct passing between bpf progs was never
> > officially supported. llvm generates something.
> > Probably always passes by reference, but we can adjust
> > that behavior without breaking any programs because
> > we don't have bpf libraries. Programs are fully contained
> > in one or few files. libbpf can do static linking, but
> > without any actual libraries the chance of breaking
> > backward compat is close to zero.
> 
> Agree. At this point, we don't need to worry about
> compatibility between bpf program and bpf program libraries.
> 
> > Can we teach llvm to pass sizeof(struct) <= 16 in
> > two bpf registers?
> 
> Yes, we can. I just hacked llvm and was able to
> do that.
> 
> > Then we wouldn't need to have a discrepancy between
> > kernel function prototype and bpf fentry prog proto.
> > Both will have struct by value in the same spot.
> > The trampoline generation will be simpler for x86 and
> > its runtime faster too.
> 
> I tested x86 and arm64 both supports two registers
> for a 16 byte struct.
> 
> > The other architectures that pass small structs by reference
> > can do a bit more work in the trampoline: copy up to 16 byte
> > and bpf prog side will see it as they were passed in 'registers'.
> > wdyt?
> 
> I know systemz and Hexagon will pass by reference for any
> struct size >= 8 bytes. Didn't complete check others.
> 
> But since x86 and arm64 supports direct value passing
> with two registers, we should be okay. As you mentioned,
> we could support systemz/hexagon style of struct passing
> by copying the values to the stack.
> 
> 
> But I have a problem how to define a user friendly
> macro like BPF_PROG for user to use.
> 
> Let us say, we have a program like below:
> SEC("fentry/bpf_testmod_test_struct_arg_1")
> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 *a, int b,
> int c) {
> ...
> }
> 
> We have BPF_PROG macro definition here:
> 
> #define BPF_PROG(name, args...)     \
> name(unsigned long long *ctx);     \
> static __always_inline typeof(name(0))     \
> ____##name(unsigned long long *ctx, ##args);     \
> typeof(name(0)) name(unsigned long long *ctx)     \
> {     \
>         _Pragma("GCC diagnostic push")      \
>         _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")      \
>         return ____##name(___bpf_ctx_cast(args));      \
>         _Pragma("GCC diagnostic pop")      \
> }     \
> static __always_inline typeof(name(0))     \
> ____##name(unsigned long long *ctx, ##args)
> 
> Some we have static function definition
> 
> int ____test_struct_arg_1(unsigned long long *ctx, struct
> bpf_testmod_struct_arg_2 *a, int b, int c);
> 
> But the function call inside the function test_struct_arg_1()
> is
>   ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2]);
> 
> We have two problems here:
>   ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
> does not match the static function declaration.
> This is not problem if everything is int/ptr type.
> If one of argument is structure type, we will have
> type conversion problem. Let us this can be resolved
> somehow through some hack.
> 
> More importantly, because some structure may take two
> registers,
>    ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
> may not be correct. In my above example, if the
> structure size is 16 bytes,
> then the actual call should be
>    ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2], ctx[3])
> So we need to provide how many extra registers are needed
> beyond ##args in the macro. I have not tried how to
> resolve this but this extra information in the macro
> definite is not user friendly.
> 
> Not sure what is the best way to handle this issue (##args is not precise
> and needs addition registers for >8 struct arguments).

The kernel is using this trick to cast 8 byte structs to u64:
/* cast any integer, pointer, or small struct to u64 */
#define UINTTYPE(size) \
        __typeof__(__builtin_choose_expr(size == 1,  (u8)1, \
                   __builtin_choose_expr(size == 2, (u16)2, \
                   __builtin_choose_expr(size == 4, (u32)3, \
                   __builtin_choose_expr(size == 8, (u64)4, \
                                         (void)5)))))
#define __CAST_TO_U64(x) ({ \
        typeof(x) __src = (x); \
        UINTTYPE(sizeof(x)) __dst; \
        memcpy(&__dst, &__src, sizeof(__dst)); \
        (u64)__dst; })

casting 16 byte struct to two u64 can be similar.
Ideally we would declare bpf prog as:
SEC("fentry/bpf_testmod_test_struct_arg_1")
int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 a, int b, int c) {
note there is no '*'. It's struct by value.
The main challenge is how to do the math in the BPF_PROG macros.
Currently it's doing:
#define ___bpf_ctx_cast1(x)           ___bpf_ctx_cast0(), (void *)ctx[0]
#define ___bpf_ctx_cast2(x, args...)  ___bpf_ctx_cast1(args), (void *)ctx[1]
#define ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), (void *)ctx[2]
#define ___bpf_ctx_cast4(x, args...)  ___bpf_ctx_cast3(args), (void *)ctx[3]

The ctx[index] is one-to-one with argument position.
That 'index' needs to be calculated.
Maybe something like UINTTYPE() that applies to previous arguments?
#define REG_CNT(arg) \
        __builtin_choose_expr(sizeof(arg) == 1,  1, \
        __builtin_choose_expr(sizeof(arg) == 2,  1, \
        __builtin_choose_expr(sizeof(arg) == 4,  1, \
        __builtin_choose_expr(sizeof(arg) == 8,  1, \
        __builtin_choose_expr(sizeof(arg) == 16,  2, \
                                         (void)0)))))

#define ___bpf_reg_cnt0()            0
#define ___bpf_reg_cnt1(x)          ___bpf_reg_cnt0() + REG_CNT(x)
#define ___bpf_reg_cnt2(x, args...) ___bpf_reg_cnt1(args) + REG_CNT(x)
#define ___bpf_reg_cnt(args...)    ___bpf_apply(___bpf_reg_cnt, ___bpf_narg(args))(args)

This way the macro will calculate the index inside ctx[] array.

and then inside ___bpf_ctx_castN macro use ___bpf_reg_cnt.
Instead of:
___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), (void *)ctx[2]
it will be
___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), \
  __builtin_choose_expr(sizeof(x) <= 8, (void *)ctx[___bpf_reg_cnt(args)],
                        *(typeof(x) *) &ctx[___bpf_reg_cnt(args)])

x - is one of the arguments.
args - all args before 'x'. Doing __bpf_reg_cnt on them should calculate the index.
*(typeof(x) *)& should type cast to struct of 16 bytes.

Rough idea, of course.

Another alternative is instead of:
#define BPF_PROG(name, args...)
name(unsigned long long *ctx);
do:
#define BPF_PROG(name, args...)
struct XX {
  macro inserts all 'args' here separated by ; so it becomes a proper struct
};
name(struct XX *ctx);

and then instead of doing ___bpf_ctx_castN for each argument
do single cast of all of 'u64 ctx[N]' passed from fentry into 'struct XX *'.
The problem with this approach that small args like char, short, int needs to
be declared in struct XX with __align__(8).

Both approaches may be workable?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments
  2022-08-18 20:44       ` Alexei Starovoitov
@ 2022-08-24 19:04         ` Yonghong Song
  2022-08-24 22:35           ` Alexei Starovoitov
  2022-08-24 19:05         ` Andrii Nakryiko
  1 sibling, 1 reply; 19+ messages in thread
From: Yonghong Song @ 2022-08-24 19:04 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team



On 8/18/22 1:44 PM, Alexei Starovoitov wrote:
> On Wed, Aug 17, 2022 at 09:56:23PM -0700, Yonghong Song wrote:
>>
>>
>> On 8/15/22 3:44 PM, Alexei Starovoitov wrote:
>>> On Thu, Aug 11, 2022 at 10:24 PM Yonghong Song <yhs@fb.com> wrote:
>>>>
>>>> In C, struct value can be passed as a function argument.
>>>> For small structs, struct value may be passed in
>>>> one or more registers. For trampoline based bpf programs,
>>>> This would cause complication since one-to-one mapping between
>>>> function argument and arch argument register is not valid
>>>> any more.
>>>>
>>>> To support struct value argument and make bpf programs
>>>> easy to write, the bpf program function parameter is
>>>> changed from struct type to a pointer to struct type.
>>>> The following is a simplified example.
>>>>
>>>> In one of later selftests, we have a bpf_testmod function:
>>>>       struct bpf_testmod_struct_arg_2 {
>>>>           long a;
>>>>           long b;
>>>>       };
>>>>       noinline int
>>>>       bpf_testmod_test_struct_arg_2(int a, struct bpf_testmod_struct_arg_2 b, int c) {
>>>>           bpf_testmod_test_struct_arg_result = a + b.a + b.b + c;
>>>>           return bpf_testmod_test_struct_arg_result;
>>>>       }
>>>>
>>>> When a bpf program is attached to the bpf_testmod function
>>>> bpf_testmod_test_struct_arg_2(), the bpf program may look like
>>>>       SEC("fentry/bpf_testmod_test_struct_arg_2")
>>>>       int BPF_PROG(test_struct_arg_3, int a, struct bpf_testmod_struct_arg_2 *b, int c)
>>>>       {
>>>>           t2_a = a;
>>>>           t2_b_a = b->a;
>>>>           t2_b_b = b->b;
>>>>           t2_c = c;
>>>>           return 0;
>>>>       }
>>>>
>>>> Basically struct value becomes a pointer to the struct.
>>>> The trampoline stack will be increased to store the stack values and
>>>> the pointer to these values will be saved in the stack slot corresponding
>>>> to that argument. For x86_64, the struct size is limited up to 16 bytes
>>>> so the struct can fit in one or two registers. The struct size of more
>>>> than 16 bytes is not supported now as our current use case is
>>>> for sockptr_t in the argument. We could handle such large struct's
>>>> in the future if we have concrete use cases.
>>>>
>>>> The main changes are in save_regs() and restore_regs(). The following
>>>> illustrated the trampoline asm codes for save_regs() and restore_regs().
>>>> save_regs():
>>>>       /* first argument */
>>>>       mov    DWORD PTR [rbp-0x18],edi
>>>>       /* second argument: struct, save actual values and put the pointer to the slot */
>>>>       lea    rax,[rbp-0x40]
>>>>       mov    QWORD PTR [rbp-0x10],rax
>>>>       mov    QWORD PTR [rbp-0x40],rsi
>>>>       mov    QWORD PTR [rbp-0x38],rdx
>>>>       /* third argument */
>>>>       mov    DWORD PTR [rbp-0x8],esi
>>>> restore_regs():
>>>>       mov    edi,DWORD PTR [rbp-0x18]
>>>>       mov    rsi,QWORD PTR [rbp-0x40]
>>>>       mov    rdx,QWORD PTR [rbp-0x38]
>>>>       mov    esi,DWORD PTR [rbp-0x8]
>>>
>>> Not sure whether it was discussed before, but
>>> why cannot we adjust the bpf side instead?
>>> Technically struct passing between bpf progs was never
>>> officially supported. llvm generates something.
>>> Probably always passes by reference, but we can adjust
>>> that behavior without breaking any programs because
>>> we don't have bpf libraries. Programs are fully contained
>>> in one or few files. libbpf can do static linking, but
>>> without any actual libraries the chance of breaking
>>> backward compat is close to zero.
>>
>> Agree. At this point, we don't need to worry about
>> compatibility between bpf program and bpf program libraries.
>>
>>> Can we teach llvm to pass sizeof(struct) <= 16 in
>>> two bpf registers?
>>
>> Yes, we can. I just hacked llvm and was able to
>> do that.
>>
>>> Then we wouldn't need to have a discrepancy between
>>> kernel function prototype and bpf fentry prog proto.
>>> Both will have struct by value in the same spot.
>>> The trampoline generation will be simpler for x86 and
>>> its runtime faster too.
>>
>> I tested x86 and arm64 both supports two registers
>> for a 16 byte struct.
>>
>>> The other architectures that pass small structs by reference
>>> can do a bit more work in the trampoline: copy up to 16 byte
>>> and bpf prog side will see it as they were passed in 'registers'.
>>> wdyt?
>>
>> I know systemz and Hexagon will pass by reference for any
>> struct size >= 8 bytes. Didn't complete check others.
>>
>> But since x86 and arm64 supports direct value passing
>> with two registers, we should be okay. As you mentioned,
>> we could support systemz/hexagon style of struct passing
>> by copying the values to the stack.
>>
>>
>> But I have a problem how to define a user friendly
>> macro like BPF_PROG for user to use.
>>
>> Let us say, we have a program like below:
>> SEC("fentry/bpf_testmod_test_struct_arg_1")
>> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 *a, int b,
>> int c) {
>> ...
>> }
>>
>> We have BPF_PROG macro definition here:
>>
>> #define BPF_PROG(name, args...)     \
>> name(unsigned long long *ctx);     \
>> static __always_inline typeof(name(0))     \
>> ____##name(unsigned long long *ctx, ##args);     \
>> typeof(name(0)) name(unsigned long long *ctx)     \
>> {     \
>>          _Pragma("GCC diagnostic push")      \
>>          _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")      \
>>          return ____##name(___bpf_ctx_cast(args));      \
>>          _Pragma("GCC diagnostic pop")      \
>> }     \
>> static __always_inline typeof(name(0))     \
>> ____##name(unsigned long long *ctx, ##args)
>>
>> Some we have static function definition
>>
>> int ____test_struct_arg_1(unsigned long long *ctx, struct
>> bpf_testmod_struct_arg_2 *a, int b, int c);
>>
>> But the function call inside the function test_struct_arg_1()
>> is
>>    ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2]);
>>
>> We have two problems here:
>>    ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
>> does not match the static function declaration.
>> This is not problem if everything is int/ptr type.
>> If one of argument is structure type, we will have
>> type conversion problem. Let us this can be resolved
>> somehow through some hack.
>>
>> More importantly, because some structure may take two
>> registers,
>>     ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
>> may not be correct. In my above example, if the
>> structure size is 16 bytes,
>> then the actual call should be
>>     ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2], ctx[3])
>> So we need to provide how many extra registers are needed
>> beyond ##args in the macro. I have not tried how to
>> resolve this but this extra information in the macro
>> definite is not user friendly.
>>
>> Not sure what is the best way to handle this issue (##args is not precise
>> and needs addition registers for >8 struct arguments).
> 
> The kernel is using this trick to cast 8 byte structs to u64:
> /* cast any integer, pointer, or small struct to u64 */
> #define UINTTYPE(size) \
>          __typeof__(__builtin_choose_expr(size == 1,  (u8)1, \
>                     __builtin_choose_expr(size == 2, (u16)2, \
>                     __builtin_choose_expr(size == 4, (u32)3, \
>                     __builtin_choose_expr(size == 8, (u64)4, \
>                                           (void)5)))))
> #define __CAST_TO_U64(x) ({ \
>          typeof(x) __src = (x); \
>          UINTTYPE(sizeof(x)) __dst; \
>          memcpy(&__dst, &__src, sizeof(__dst)); \
>          (u64)__dst; })
> 
> casting 16 byte struct to two u64 can be similar.
> Ideally we would declare bpf prog as:
> SEC("fentry/bpf_testmod_test_struct_arg_1")
> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 a, int b, int c) {
> note there is no '*'. It's struct by value.
> The main challenge is how to do the math in the BPF_PROG macros.
> Currently it's doing:
> #define ___bpf_ctx_cast1(x)           ___bpf_ctx_cast0(), (void *)ctx[0]
> #define ___bpf_ctx_cast2(x, args...)  ___bpf_ctx_cast1(args), (void *)ctx[1]
> #define ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), (void *)ctx[2]
> #define ___bpf_ctx_cast4(x, args...)  ___bpf_ctx_cast3(args), (void *)ctx[3]
> 
> The ctx[index] is one-to-one with argument position.
> That 'index' needs to be calculated.
> Maybe something like UINTTYPE() that applies to previous arguments?
> #define REG_CNT(arg) \
>          __builtin_choose_expr(sizeof(arg) == 1,  1, \
>          __builtin_choose_expr(sizeof(arg) == 2,  1, \
>          __builtin_choose_expr(sizeof(arg) == 4,  1, \
>          __builtin_choose_expr(sizeof(arg) == 8,  1, \
>          __builtin_choose_expr(sizeof(arg) == 16,  2, \
>                                           (void)0)))))
> 
> #define ___bpf_reg_cnt0()            0
> #define ___bpf_reg_cnt1(x)          ___bpf_reg_cnt0() + REG_CNT(x)
> #define ___bpf_reg_cnt2(x, args...) ___bpf_reg_cnt1(args) + REG_CNT(x)
> #define ___bpf_reg_cnt(args...)    ___bpf_apply(___bpf_reg_cnt, ___bpf_narg(args))(args)
> 
> This way the macro will calculate the index inside ctx[] array.
> 
> and then inside ___bpf_ctx_castN macro use ___bpf_reg_cnt.
> Instead of:
> ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), (void *)ctx[2]
> it will be
> ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), \
>    __builtin_choose_expr(sizeof(x) <= 8, (void *)ctx[___bpf_reg_cnt(args)],
>                          *(typeof(x) *) &ctx[___bpf_reg_cnt(args)])

I tried this approach. The only problem is sizeof(x) <= 8 may also be
a structure. Since essentially we will have a type conversion like
    (struct <name))(void *)ctx[...]
and this won't work.

So ideally we want something like
__builtin_choose_expr(is_struct_type(x), *(typeof(x) *) 
&ctx[___bpf_reg_cnt(args)]
     (void *)ctx[___bpf_reg_cnt(args)])
here is_struct_type(x) tells whether the type is a struct type
or typedef of a struct. Currently we don't have a such a macro/builtin yet.

Note that in order to make sizeof(x) or is_struct_type(x) work, we
need to separate type and argument name like

int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2, a, int, 
b, int, c)

Which will make the macro incompatible with existing BPF_PROG macro.

> 
> x - is one of the arguments.
> args - all args before 'x'. Doing __bpf_reg_cnt on them should calculate the index.
> *(typeof(x) *)& should type cast to struct of 16 bytes.
> 
> Rough idea, of course.
> 
> Another alternative is instead of:
> #define BPF_PROG(name, args...)
> name(unsigned long long *ctx);
> do:
> #define BPF_PROG(name, args...)
> struct XX {
>    macro inserts all 'args' here separated by ; so it becomes a proper struct
> };
> name(struct XX *ctx);
> 
> and then instead of doing ___bpf_ctx_castN for each argument
> do single cast of all of 'u64 ctx[N]' passed from fentry into 'struct XX *'.
> The problem with this approach that small args like char, short, int needs to
> be declared in struct XX with __align__(8).

This should work. But since we will change context type from
"unsigned long long *" to "struct XX *", the code pattern will look like

BPF_PROG2_DECL(test_struct_arg_1);
SEC("fentry/bpf_testmod_test_struct_arg_1")
int BPF_PROG2(test_struct_arg_1, struct bpf_testmod_struct_arg_2, a, 
int, b, int, c)

Where BPF_PROG2_DECL will provide a forward declaration like
#define BPF_PROG2_DECL(name) struct _____##name;

and BPF_PROG2 will look like (not handling zero argument yere)

#define BPF_PROG2(name, args...)                                      \
name(struct _____##name *ctx);                                        \
struct _____##name {                                                  \
        ___bpf_ctx_field(args)                                         \
};                                                                    \
static __always_inline typeof(name(0))                                \
____##name(struct _____##name *ctx, ___bpf_ctx_decl(args));           \
typeof(name(0)) name(struct _____##name *ctx)                         \
{                                                                     \
        return ____##name(ctx, ___bpf_ctx_arg(args));                  \
}                                                                     \
static __always_inline typeof(name(0))                                \
____##name(struct _____##name *ctx, ___bpf_ctx_decl(args))

where __bpf_ctx_field(args) will generate
    struct bpf_testmod_struct_arg_2 a;
    int b;
    int c;

___bpf_ctx_arg(args) will generate
    ctx->a, ctx->b, ctx->c

and ___bpf_ctx_decl(args) will generate proper argument prototypes
the same way as in BPF_PROG macro.

> 
> Both approaches may be workable?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments
  2022-08-18 20:44       ` Alexei Starovoitov
  2022-08-24 19:04         ` Yonghong Song
@ 2022-08-24 19:05         ` Andrii Nakryiko
  2022-08-25  4:04           ` Yonghong Song
  1 sibling, 1 reply; 19+ messages in thread
From: Andrii Nakryiko @ 2022-08-24 19:05 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Yonghong Song, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Kernel Team

On Thu, Aug 18, 2022 at 1:44 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Aug 17, 2022 at 09:56:23PM -0700, Yonghong Song wrote:
> >
> >
> > On 8/15/22 3:44 PM, Alexei Starovoitov wrote:
> > > On Thu, Aug 11, 2022 at 10:24 PM Yonghong Song <yhs@fb.com> wrote:
> > > >
> > > > In C, struct value can be passed as a function argument.
> > > > For small structs, struct value may be passed in
> > > > one or more registers. For trampoline based bpf programs,
> > > > This would cause complication since one-to-one mapping between
> > > > function argument and arch argument register is not valid
> > > > any more.
> > > >
> > > > To support struct value argument and make bpf programs
> > > > easy to write, the bpf program function parameter is
> > > > changed from struct type to a pointer to struct type.
> > > > The following is a simplified example.
> > > >
> > > > In one of later selftests, we have a bpf_testmod function:
> > > >      struct bpf_testmod_struct_arg_2 {
> > > >          long a;
> > > >          long b;
> > > >      };
> > > >      noinline int
> > > >      bpf_testmod_test_struct_arg_2(int a, struct bpf_testmod_struct_arg_2 b, int c) {
> > > >          bpf_testmod_test_struct_arg_result = a + b.a + b.b + c;
> > > >          return bpf_testmod_test_struct_arg_result;
> > > >      }
> > > >
> > > > When a bpf program is attached to the bpf_testmod function
> > > > bpf_testmod_test_struct_arg_2(), the bpf program may look like
> > > >      SEC("fentry/bpf_testmod_test_struct_arg_2")
> > > >      int BPF_PROG(test_struct_arg_3, int a, struct bpf_testmod_struct_arg_2 *b, int c)
> > > >      {
> > > >          t2_a = a;
> > > >          t2_b_a = b->a;
> > > >          t2_b_b = b->b;
> > > >          t2_c = c;
> > > >          return 0;
> > > >      }
> > > >
> > > > Basically struct value becomes a pointer to the struct.
> > > > The trampoline stack will be increased to store the stack values and
> > > > the pointer to these values will be saved in the stack slot corresponding
> > > > to that argument. For x86_64, the struct size is limited up to 16 bytes
> > > > so the struct can fit in one or two registers. The struct size of more
> > > > than 16 bytes is not supported now as our current use case is
> > > > for sockptr_t in the argument. We could handle such large struct's
> > > > in the future if we have concrete use cases.
> > > >
> > > > The main changes are in save_regs() and restore_regs(). The following
> > > > illustrated the trampoline asm codes for save_regs() and restore_regs().
> > > > save_regs():
> > > >      /* first argument */
> > > >      mov    DWORD PTR [rbp-0x18],edi
> > > >      /* second argument: struct, save actual values and put the pointer to the slot */
> > > >      lea    rax,[rbp-0x40]
> > > >      mov    QWORD PTR [rbp-0x10],rax
> > > >      mov    QWORD PTR [rbp-0x40],rsi
> > > >      mov    QWORD PTR [rbp-0x38],rdx
> > > >      /* third argument */
> > > >      mov    DWORD PTR [rbp-0x8],esi
> > > > restore_regs():
> > > >      mov    edi,DWORD PTR [rbp-0x18]
> > > >      mov    rsi,QWORD PTR [rbp-0x40]
> > > >      mov    rdx,QWORD PTR [rbp-0x38]
> > > >      mov    esi,DWORD PTR [rbp-0x8]
> > >
> > > Not sure whether it was discussed before, but
> > > why cannot we adjust the bpf side instead?
> > > Technically struct passing between bpf progs was never
> > > officially supported. llvm generates something.
> > > Probably always passes by reference, but we can adjust
> > > that behavior without breaking any programs because
> > > we don't have bpf libraries. Programs are fully contained
> > > in one or few files. libbpf can do static linking, but
> > > without any actual libraries the chance of breaking
> > > backward compat is close to zero.
> >
> > Agree. At this point, we don't need to worry about
> > compatibility between bpf program and bpf program libraries.
> >
> > > Can we teach llvm to pass sizeof(struct) <= 16 in
> > > two bpf registers?
> >
> > Yes, we can. I just hacked llvm and was able to
> > do that.
> >
> > > Then we wouldn't need to have a discrepancy between
> > > kernel function prototype and bpf fentry prog proto.
> > > Both will have struct by value in the same spot.
> > > The trampoline generation will be simpler for x86 and
> > > its runtime faster too.
> >
> > I tested x86 and arm64 both supports two registers
> > for a 16 byte struct.
> >
> > > The other architectures that pass small structs by reference
> > > can do a bit more work in the trampoline: copy up to 16 byte
> > > and bpf prog side will see it as they were passed in 'registers'.
> > > wdyt?
> >
> > I know systemz and Hexagon will pass by reference for any
> > struct size >= 8 bytes. Didn't complete check others.
> >
> > But since x86 and arm64 supports direct value passing
> > with two registers, we should be okay. As you mentioned,
> > we could support systemz/hexagon style of struct passing
> > by copying the values to the stack.
> >
> >
> > But I have a problem how to define a user friendly
> > macro like BPF_PROG for user to use.
> >
> > Let us say, we have a program like below:
> > SEC("fentry/bpf_testmod_test_struct_arg_1")
> > int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 *a, int b,
> > int c) {
> > ...
> > }
> >
> > We have BPF_PROG macro definition here:
> >
> > #define BPF_PROG(name, args...)     \
> > name(unsigned long long *ctx);     \
> > static __always_inline typeof(name(0))     \
> > ____##name(unsigned long long *ctx, ##args);     \
> > typeof(name(0)) name(unsigned long long *ctx)     \
> > {     \
> >         _Pragma("GCC diagnostic push")      \
> >         _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")      \
> >         return ____##name(___bpf_ctx_cast(args));      \
> >         _Pragma("GCC diagnostic pop")      \
> > }     \
> > static __always_inline typeof(name(0))     \
> > ____##name(unsigned long long *ctx, ##args)
> >
> > Some we have static function definition
> >
> > int ____test_struct_arg_1(unsigned long long *ctx, struct
> > bpf_testmod_struct_arg_2 *a, int b, int c);
> >
> > But the function call inside the function test_struct_arg_1()
> > is
> >   ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2]);
> >
> > We have two problems here:
> >   ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
> > does not match the static function declaration.
> > This is not problem if everything is int/ptr type.
> > If one of argument is structure type, we will have
> > type conversion problem. Let us this can be resolved
> > somehow through some hack.
> >
> > More importantly, because some structure may take two
> > registers,
> >    ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
> > may not be correct. In my above example, if the
> > structure size is 16 bytes,
> > then the actual call should be
> >    ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2], ctx[3])
> > So we need to provide how many extra registers are needed
> > beyond ##args in the macro. I have not tried how to
> > resolve this but this extra information in the macro
> > definite is not user friendly.
> >
> > Not sure what is the best way to handle this issue (##args is not precise
> > and needs addition registers for >8 struct arguments).
>
> The kernel is using this trick to cast 8 byte structs to u64:
> /* cast any integer, pointer, or small struct to u64 */
> #define UINTTYPE(size) \
>         __typeof__(__builtin_choose_expr(size == 1,  (u8)1, \
>                    __builtin_choose_expr(size == 2, (u16)2, \
>                    __builtin_choose_expr(size == 4, (u32)3, \
>                    __builtin_choose_expr(size == 8, (u64)4, \
>                                          (void)5)))))
> #define __CAST_TO_U64(x) ({ \
>         typeof(x) __src = (x); \
>         UINTTYPE(sizeof(x)) __dst; \
>         memcpy(&__dst, &__src, sizeof(__dst)); \
>         (u64)__dst; })
>
> casting 16 byte struct to two u64 can be similar.
> Ideally we would declare bpf prog as:
> SEC("fentry/bpf_testmod_test_struct_arg_1")
> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 a, int b, int c) {
> note there is no '*'. It's struct by value.

Agree. So I tried to compile this:

$ git diff
diff --git a/tools/testing/selftests/bpf/progs/test_vmlinux.c
b/tools/testing/selftests/bpf/progs/test_vmlinux.c
index e9dfa0313d1b..dccb9ae9801f 100644
--- a/tools/testing/selftests/bpf/progs/test_vmlinux.c
+++ b/tools/testing/selftests/bpf/progs/test_vmlinux.c
@@ -15,6 +15,16 @@ bool tp_btf_called = false;
 bool kprobe_called = false;
 bool fentry_called = false;

+typedef struct {
+       void *x;
+       int t;
+} sockptr;
+
+static int blah(sockptr x)
+{
+       return x.t;
+}
+
 SEC("tp/syscalls/sys_enter_nanosleep")
 int handle__tp(struct trace_event_raw_sys_enter *args)
 {
@@ -30,7 +40,14 @@ int handle__tp(struct trace_event_raw_sys_enter *args)
                return 0;

        tp_called = true;
-       return 0;
+
+       return blah(({ union {
+               struct { u64 x, y; } z;
+               sockptr s;
+               } tmp = {.z = {0, 1}};
+
+               tmp.s;
+       }));
 }

 SEC("raw_tp/sys_enter")


And it compiled. So I think it's possible to do u64 to struct
conversion using this approach.
We'd have to define two variations of macro -- one for structs <= 8
bytes, another for structs > 8 and <= 16 bytes. One will "consume"
single ctx[] slot, another -- will consume both. And then each variant
knows which other macro to refer to after itself.

A bit of macro hackery, but it should work.


> The main challenge is how to do the math in the BPF_PROG macros.
> Currently it's doing:
> #define ___bpf_ctx_cast1(x)           ___bpf_ctx_cast0(), (void *)ctx[0]
> #define ___bpf_ctx_cast2(x, args...)  ___bpf_ctx_cast1(args), (void *)ctx[1]
> #define ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), (void *)ctx[2]
> #define ___bpf_ctx_cast4(x, args...)  ___bpf_ctx_cast3(args), (void *)ctx[3]
>
> The ctx[index] is one-to-one with argument position.
> That 'index' needs to be calculated.
> Maybe something like UINTTYPE() that applies to previous arguments?
> #define REG_CNT(arg) \
>         __builtin_choose_expr(sizeof(arg) == 1,  1, \
>         __builtin_choose_expr(sizeof(arg) == 2,  1, \
>         __builtin_choose_expr(sizeof(arg) == 4,  1, \
>         __builtin_choose_expr(sizeof(arg) == 8,  1, \
>         __builtin_choose_expr(sizeof(arg) == 16,  2, \
>                                          (void)0)))))
>
> #define ___bpf_reg_cnt0()            0
> #define ___bpf_reg_cnt1(x)          ___bpf_reg_cnt0() + REG_CNT(x)
> #define ___bpf_reg_cnt2(x, args...) ___bpf_reg_cnt1(args) + REG_CNT(x)
> #define ___bpf_reg_cnt(args...)    ___bpf_apply(___bpf_reg_cnt, ___bpf_narg(args))(args)
>
> This way the macro will calculate the index inside ctx[] array.
>
> and then inside ___bpf_ctx_castN macro use ___bpf_reg_cnt.
> Instead of:
> ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), (void *)ctx[2]
> it will be
> ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), \
>   __builtin_choose_expr(sizeof(x) <= 8, (void *)ctx[___bpf_reg_cnt(args)],
>                         *(typeof(x) *) &ctx[___bpf_reg_cnt(args)])
>
> x - is one of the arguments.
> args - all args before 'x'. Doing __bpf_reg_cnt on them should calculate the index.
> *(typeof(x) *)& should type cast to struct of 16 bytes.
>
> Rough idea, of course.
>
> Another alternative is instead of:
> #define BPF_PROG(name, args...)
> name(unsigned long long *ctx);
> do:
> #define BPF_PROG(name, args...)
> struct XX {
>   macro inserts all 'args' here separated by ; so it becomes a proper struct
> };
> name(struct XX *ctx);
>
> and then instead of doing ___bpf_ctx_castN for each argument
> do single cast of all of 'u64 ctx[N]' passed from fentry into 'struct XX *'.
> The problem with this approach that small args like char, short, int needs to
> be declared in struct XX with __align__(8).
>
> Both approaches may be workable?

^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments
  2022-08-24 19:04         ` Yonghong Song
@ 2022-08-24 22:35           ` Alexei Starovoitov
  2022-08-25  4:10             ` Yonghong Song
  0 siblings, 1 reply; 19+ messages in thread
From: Alexei Starovoitov @ 2022-08-24 22:35 UTC (permalink / raw)
  To: Yonghong Song
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team

On Wed, Aug 24, 2022 at 12:05 PM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 8/18/22 1:44 PM, Alexei Starovoitov wrote:
> > On Wed, Aug 17, 2022 at 09:56:23PM -0700, Yonghong Song wrote:
> >>
> >>
> >> On 8/15/22 3:44 PM, Alexei Starovoitov wrote:
> >>> On Thu, Aug 11, 2022 at 10:24 PM Yonghong Song <yhs@fb.com> wrote:
> >>>>
> >>>> In C, struct value can be passed as a function argument.
> >>>> For small structs, struct value may be passed in
> >>>> one or more registers. For trampoline based bpf programs,
> >>>> This would cause complication since one-to-one mapping between
> >>>> function argument and arch argument register is not valid
> >>>> any more.
> >>>>
> >>>> To support struct value argument and make bpf programs
> >>>> easy to write, the bpf program function parameter is
> >>>> changed from struct type to a pointer to struct type.
> >>>> The following is a simplified example.
> >>>>
> >>>> In one of later selftests, we have a bpf_testmod function:
> >>>>       struct bpf_testmod_struct_arg_2 {
> >>>>           long a;
> >>>>           long b;
> >>>>       };
> >>>>       noinline int
> >>>>       bpf_testmod_test_struct_arg_2(int a, struct bpf_testmod_struct_arg_2 b, int c) {
> >>>>           bpf_testmod_test_struct_arg_result = a + b.a + b.b + c;
> >>>>           return bpf_testmod_test_struct_arg_result;
> >>>>       }
> >>>>
> >>>> When a bpf program is attached to the bpf_testmod function
> >>>> bpf_testmod_test_struct_arg_2(), the bpf program may look like
> >>>>       SEC("fentry/bpf_testmod_test_struct_arg_2")
> >>>>       int BPF_PROG(test_struct_arg_3, int a, struct bpf_testmod_struct_arg_2 *b, int c)
> >>>>       {
> >>>>           t2_a = a;
> >>>>           t2_b_a = b->a;
> >>>>           t2_b_b = b->b;
> >>>>           t2_c = c;
> >>>>           return 0;
> >>>>       }
> >>>>
> >>>> Basically struct value becomes a pointer to the struct.
> >>>> The trampoline stack will be increased to store the stack values and
> >>>> the pointer to these values will be saved in the stack slot corresponding
> >>>> to that argument. For x86_64, the struct size is limited up to 16 bytes
> >>>> so the struct can fit in one or two registers. The struct size of more
> >>>> than 16 bytes is not supported now as our current use case is
> >>>> for sockptr_t in the argument. We could handle such large struct's
> >>>> in the future if we have concrete use cases.
> >>>>
> >>>> The main changes are in save_regs() and restore_regs(). The following
> >>>> illustrated the trampoline asm codes for save_regs() and restore_regs().
> >>>> save_regs():
> >>>>       /* first argument */
> >>>>       mov    DWORD PTR [rbp-0x18],edi
> >>>>       /* second argument: struct, save actual values and put the pointer to the slot */
> >>>>       lea    rax,[rbp-0x40]
> >>>>       mov    QWORD PTR [rbp-0x10],rax
> >>>>       mov    QWORD PTR [rbp-0x40],rsi
> >>>>       mov    QWORD PTR [rbp-0x38],rdx
> >>>>       /* third argument */
> >>>>       mov    DWORD PTR [rbp-0x8],esi
> >>>> restore_regs():
> >>>>       mov    edi,DWORD PTR [rbp-0x18]
> >>>>       mov    rsi,QWORD PTR [rbp-0x40]
> >>>>       mov    rdx,QWORD PTR [rbp-0x38]
> >>>>       mov    esi,DWORD PTR [rbp-0x8]
> >>>
> >>> Not sure whether it was discussed before, but
> >>> why cannot we adjust the bpf side instead?
> >>> Technically struct passing between bpf progs was never
> >>> officially supported. llvm generates something.
> >>> Probably always passes by reference, but we can adjust
> >>> that behavior without breaking any programs because
> >>> we don't have bpf libraries. Programs are fully contained
> >>> in one or few files. libbpf can do static linking, but
> >>> without any actual libraries the chance of breaking
> >>> backward compat is close to zero.
> >>
> >> Agree. At this point, we don't need to worry about
> >> compatibility between bpf program and bpf program libraries.
> >>
> >>> Can we teach llvm to pass sizeof(struct) <= 16 in
> >>> two bpf registers?
> >>
> >> Yes, we can. I just hacked llvm and was able to
> >> do that.
> >>
> >>> Then we wouldn't need to have a discrepancy between
> >>> kernel function prototype and bpf fentry prog proto.
> >>> Both will have struct by value in the same spot.
> >>> The trampoline generation will be simpler for x86 and
> >>> its runtime faster too.
> >>
> >> I tested x86 and arm64 both supports two registers
> >> for a 16 byte struct.
> >>
> >>> The other architectures that pass small structs by reference
> >>> can do a bit more work in the trampoline: copy up to 16 byte
> >>> and bpf prog side will see it as they were passed in 'registers'.
> >>> wdyt?
> >>
> >> I know systemz and Hexagon will pass by reference for any
> >> struct size >= 8 bytes. Didn't complete check others.
> >>
> >> But since x86 and arm64 supports direct value passing
> >> with two registers, we should be okay. As you mentioned,
> >> we could support systemz/hexagon style of struct passing
> >> by copying the values to the stack.
> >>
> >>
> >> But I have a problem how to define a user friendly
> >> macro like BPF_PROG for user to use.
> >>
> >> Let us say, we have a program like below:
> >> SEC("fentry/bpf_testmod_test_struct_arg_1")
> >> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 *a, int b,
> >> int c) {
> >> ...
> >> }
> >>
> >> We have BPF_PROG macro definition here:
> >>
> >> #define BPF_PROG(name, args...)     \
> >> name(unsigned long long *ctx);     \
> >> static __always_inline typeof(name(0))     \
> >> ____##name(unsigned long long *ctx, ##args);     \
> >> typeof(name(0)) name(unsigned long long *ctx)     \
> >> {     \
> >>          _Pragma("GCC diagnostic push")      \
> >>          _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")      \
> >>          return ____##name(___bpf_ctx_cast(args));      \
> >>          _Pragma("GCC diagnostic pop")      \
> >> }     \
> >> static __always_inline typeof(name(0))     \
> >> ____##name(unsigned long long *ctx, ##args)
> >>
> >> Some we have static function definition
> >>
> >> int ____test_struct_arg_1(unsigned long long *ctx, struct
> >> bpf_testmod_struct_arg_2 *a, int b, int c);
> >>
> >> But the function call inside the function test_struct_arg_1()
> >> is
> >>    ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2]);
> >>
> >> We have two problems here:
> >>    ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
> >> does not match the static function declaration.
> >> This is not problem if everything is int/ptr type.
> >> If one of argument is structure type, we will have
> >> type conversion problem. Let us this can be resolved
> >> somehow through some hack.
> >>
> >> More importantly, because some structure may take two
> >> registers,
> >>     ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
> >> may not be correct. In my above example, if the
> >> structure size is 16 bytes,
> >> then the actual call should be
> >>     ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2], ctx[3])
> >> So we need to provide how many extra registers are needed
> >> beyond ##args in the macro. I have not tried how to
> >> resolve this but this extra information in the macro
> >> definite is not user friendly.
> >>
> >> Not sure what is the best way to handle this issue (##args is not precise
> >> and needs addition registers for >8 struct arguments).
> >
> > The kernel is using this trick to cast 8 byte structs to u64:
> > /* cast any integer, pointer, or small struct to u64 */
> > #define UINTTYPE(size) \
> >          __typeof__(__builtin_choose_expr(size == 1,  (u8)1, \
> >                     __builtin_choose_expr(size == 2, (u16)2, \
> >                     __builtin_choose_expr(size == 4, (u32)3, \
> >                     __builtin_choose_expr(size == 8, (u64)4, \
> >                                           (void)5)))))
> > #define __CAST_TO_U64(x) ({ \
> >          typeof(x) __src = (x); \
> >          UINTTYPE(sizeof(x)) __dst; \
> >          memcpy(&__dst, &__src, sizeof(__dst)); \
> >          (u64)__dst; })
> >
> > casting 16 byte struct to two u64 can be similar.
> > Ideally we would declare bpf prog as:
> > SEC("fentry/bpf_testmod_test_struct_arg_1")
> > int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 a, int b, int c) {
> > note there is no '*'. It's struct by value.
> > The main challenge is how to do the math in the BPF_PROG macros.
> > Currently it's doing:
> > #define ___bpf_ctx_cast1(x)           ___bpf_ctx_cast0(), (void *)ctx[0]
> > #define ___bpf_ctx_cast2(x, args...)  ___bpf_ctx_cast1(args), (void *)ctx[1]
> > #define ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), (void *)ctx[2]
> > #define ___bpf_ctx_cast4(x, args...)  ___bpf_ctx_cast3(args), (void *)ctx[3]
> >
> > The ctx[index] is one-to-one with argument position.
> > That 'index' needs to be calculated.
> > Maybe something like UINTTYPE() that applies to previous arguments?
> > #define REG_CNT(arg) \
> >          __builtin_choose_expr(sizeof(arg) == 1,  1, \
> >          __builtin_choose_expr(sizeof(arg) == 2,  1, \
> >          __builtin_choose_expr(sizeof(arg) == 4,  1, \
> >          __builtin_choose_expr(sizeof(arg) == 8,  1, \
> >          __builtin_choose_expr(sizeof(arg) == 16,  2, \
> >                                           (void)0)))))
> >
> > #define ___bpf_reg_cnt0()            0
> > #define ___bpf_reg_cnt1(x)          ___bpf_reg_cnt0() + REG_CNT(x)
> > #define ___bpf_reg_cnt2(x, args...) ___bpf_reg_cnt1(args) + REG_CNT(x)
> > #define ___bpf_reg_cnt(args...)    ___bpf_apply(___bpf_reg_cnt, ___bpf_narg(args))(args)
> >
> > This way the macro will calculate the index inside ctx[] array.
> >
> > and then inside ___bpf_ctx_castN macro use ___bpf_reg_cnt.
> > Instead of:
> > ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), (void *)ctx[2]
> > it will be
> > ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), \
> >    __builtin_choose_expr(sizeof(x) <= 8, (void *)ctx[___bpf_reg_cnt(args)],
> >                          *(typeof(x) *) &ctx[___bpf_reg_cnt(args)])
>
> I tried this approach. The only problem is sizeof(x) <= 8 may also be
> a structure. Since essentially we will have a type conversion like
>     (struct <name))(void *)ctx[...]
> and this won't work.

Right. Just sizeof(x) <= 8 won't work.

> So ideally we want something like
> __builtin_choose_expr(is_struct_type(x), *(typeof(x) *)
> &ctx[___bpf_reg_cnt(args)]
>      (void *)ctx[___bpf_reg_cnt(args)])
> here is_struct_type(x) tells whether the type is a struct type
> or typedef of a struct. Currently we don't have a such a macro/builtin yet.

Got it.
Maybe we can do *(typeof(x) *) &ctx[___bpf_reg_cnt(args)]
unconditionally for all args?
Only endianness will be an issue.

> Note that in order to make sizeof(x) or is_struct_type(x) work, we
> need to separate type and argument name like
>
> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2, a, int,
> b, int, c)

agree.

> Which will make the macro incompatible with existing BPF_PROG macro.

right. we need a new macro regardless.

> >
> > x - is one of the arguments.
> > args - all args before 'x'. Doing __bpf_reg_cnt on them should calculate the index.
> > *(typeof(x) *)& should type cast to struct of 16 bytes.
> >
> > Rough idea, of course.
> >
> > Another alternative is instead of:
> > #define BPF_PROG(name, args...)
> > name(unsigned long long *ctx);
> > do:
> > #define BPF_PROG(name, args...)
> > struct XX {
> >    macro inserts all 'args' here separated by ; so it becomes a proper struct
> > };
> > name(struct XX *ctx);
> >
> > and then instead of doing ___bpf_ctx_castN for each argument
> > do single cast of all of 'u64 ctx[N]' passed from fentry into 'struct XX *'.
> > The problem with this approach that small args like char, short, int needs to
> > be declared in struct XX with __align__(8).
>
> This should work. But since we will change context type from
> "unsigned long long *" to "struct XX *", the code pattern will look like
>
> BPF_PROG2_DECL(test_struct_arg_1);
> SEC("fentry/bpf_testmod_test_struct_arg_1")
> int BPF_PROG2(test_struct_arg_1, struct bpf_testmod_struct_arg_2, a,
> int, b, int, c)
>
> Where BPF_PROG2_DECL will provide a forward declaration like
> #define BPF_PROG2_DECL(name) struct _____##name;
>
> and BPF_PROG2 will look like (not handling zero argument yere)
>
> #define BPF_PROG2(name, args...)                                      \
> name(struct _____##name *ctx);                                        \
> struct _____##name {                                                  \
>         ___bpf_ctx_field(args)                                         \
> };                                                                    \
> static __always_inline typeof(name(0))                                \
> ____##name(struct _____##name *ctx, ___bpf_ctx_decl(args));           \
> typeof(name(0)) name(struct _____##name *ctx)                         \
> {                                                                     \
>         return ____##name(ctx, ___bpf_ctx_arg(args));                  \
> }                                                                     \
> static __always_inline typeof(name(0))                                \
> ____##name(struct _____##name *ctx, ___bpf_ctx_decl(args))
>
> where __bpf_ctx_field(args) will generate
>     struct bpf_testmod_struct_arg_2 a;
>     int b;
>     int c;
>
> ___bpf_ctx_arg(args) will generate
>     ctx->a, ctx->b, ctx->c
>
> and ___bpf_ctx_decl(args) will generate proper argument prototypes
> the same way as in BPF_PROG macro.

Great that 2nd approach works :)
If 1st approach can be made to work we won't need
additional line BPF_PROG2_DECL(test_struct_arg_1);
right?

Either way we can start with 2nd approach and improve on it later.

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments
  2022-08-24 19:05         ` Andrii Nakryiko
@ 2022-08-25  4:04           ` Yonghong Song
  0 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-25  4:04 UTC (permalink / raw)
  To: Andrii Nakryiko, Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team



On 8/24/22 12:05 PM, Andrii Nakryiko wrote:
> On Thu, Aug 18, 2022 at 1:44 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
>>
>> On Wed, Aug 17, 2022 at 09:56:23PM -0700, Yonghong Song wrote:
>>>
>>>
>>> On 8/15/22 3:44 PM, Alexei Starovoitov wrote:
>>>> On Thu, Aug 11, 2022 at 10:24 PM Yonghong Song <yhs@fb.com> wrote:
>>>>>
>>>>> In C, struct value can be passed as a function argument.
>>>>> For small structs, struct value may be passed in
>>>>> one or more registers. For trampoline based bpf programs,
>>>>> This would cause complication since one-to-one mapping between
>>>>> function argument and arch argument register is not valid
>>>>> any more.
>>>>>
>>>>> To support struct value argument and make bpf programs
>>>>> easy to write, the bpf program function parameter is
>>>>> changed from struct type to a pointer to struct type.
>>>>> The following is a simplified example.
>>>>>
>>>>> In one of later selftests, we have a bpf_testmod function:
>>>>>       struct bpf_testmod_struct_arg_2 {
>>>>>           long a;
>>>>>           long b;
>>>>>       };
>>>>>       noinline int
>>>>>       bpf_testmod_test_struct_arg_2(int a, struct bpf_testmod_struct_arg_2 b, int c) {
>>>>>           bpf_testmod_test_struct_arg_result = a + b.a + b.b + c;
>>>>>           return bpf_testmod_test_struct_arg_result;
>>>>>       }
>>>>>
>>>>> When a bpf program is attached to the bpf_testmod function
>>>>> bpf_testmod_test_struct_arg_2(), the bpf program may look like
>>>>>       SEC("fentry/bpf_testmod_test_struct_arg_2")
>>>>>       int BPF_PROG(test_struct_arg_3, int a, struct bpf_testmod_struct_arg_2 *b, int c)
>>>>>       {
>>>>>           t2_a = a;
>>>>>           t2_b_a = b->a;
>>>>>           t2_b_b = b->b;
>>>>>           t2_c = c;
>>>>>           return 0;
>>>>>       }
>>>>>
>>>>> Basically struct value becomes a pointer to the struct.
>>>>> The trampoline stack will be increased to store the stack values and
>>>>> the pointer to these values will be saved in the stack slot corresponding
>>>>> to that argument. For x86_64, the struct size is limited up to 16 bytes
>>>>> so the struct can fit in one or two registers. The struct size of more
>>>>> than 16 bytes is not supported now as our current use case is
>>>>> for sockptr_t in the argument. We could handle such large struct's
>>>>> in the future if we have concrete use cases.
>>>>>
>>>>> The main changes are in save_regs() and restore_regs(). The following
>>>>> illustrated the trampoline asm codes for save_regs() and restore_regs().
>>>>> save_regs():
>>>>>       /* first argument */
>>>>>       mov    DWORD PTR [rbp-0x18],edi
>>>>>       /* second argument: struct, save actual values and put the pointer to the slot */
>>>>>       lea    rax,[rbp-0x40]
>>>>>       mov    QWORD PTR [rbp-0x10],rax
>>>>>       mov    QWORD PTR [rbp-0x40],rsi
>>>>>       mov    QWORD PTR [rbp-0x38],rdx
>>>>>       /* third argument */
>>>>>       mov    DWORD PTR [rbp-0x8],esi
>>>>> restore_regs():
>>>>>       mov    edi,DWORD PTR [rbp-0x18]
>>>>>       mov    rsi,QWORD PTR [rbp-0x40]
>>>>>       mov    rdx,QWORD PTR [rbp-0x38]
>>>>>       mov    esi,DWORD PTR [rbp-0x8]
>>>>
>>>> Not sure whether it was discussed before, but
>>>> why cannot we adjust the bpf side instead?
>>>> Technically struct passing between bpf progs was never
>>>> officially supported. llvm generates something.
>>>> Probably always passes by reference, but we can adjust
>>>> that behavior without breaking any programs because
>>>> we don't have bpf libraries. Programs are fully contained
>>>> in one or few files. libbpf can do static linking, but
>>>> without any actual libraries the chance of breaking
>>>> backward compat is close to zero.
>>>
>>> Agree. At this point, we don't need to worry about
>>> compatibility between bpf program and bpf program libraries.
>>>
>>>> Can we teach llvm to pass sizeof(struct) <= 16 in
>>>> two bpf registers?
>>>
>>> Yes, we can. I just hacked llvm and was able to
>>> do that.
>>>
>>>> Then we wouldn't need to have a discrepancy between
>>>> kernel function prototype and bpf fentry prog proto.
>>>> Both will have struct by value in the same spot.
>>>> The trampoline generation will be simpler for x86 and
>>>> its runtime faster too.
>>>
>>> I tested x86 and arm64 both supports two registers
>>> for a 16 byte struct.
>>>
>>>> The other architectures that pass small structs by reference
>>>> can do a bit more work in the trampoline: copy up to 16 byte
>>>> and bpf prog side will see it as they were passed in 'registers'.
>>>> wdyt?
>>>
>>> I know systemz and Hexagon will pass by reference for any
>>> struct size >= 8 bytes. Didn't complete check others.
>>>
>>> But since x86 and arm64 supports direct value passing
>>> with two registers, we should be okay. As you mentioned,
>>> we could support systemz/hexagon style of struct passing
>>> by copying the values to the stack.
>>>
>>>
>>> But I have a problem how to define a user friendly
>>> macro like BPF_PROG for user to use.
>>>
>>> Let us say, we have a program like below:
>>> SEC("fentry/bpf_testmod_test_struct_arg_1")
>>> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 *a, int b,
>>> int c) {
>>> ...
>>> }
>>>
>>> We have BPF_PROG macro definition here:
>>>
>>> #define BPF_PROG(name, args...)     \
>>> name(unsigned long long *ctx);     \
>>> static __always_inline typeof(name(0))     \
>>> ____##name(unsigned long long *ctx, ##args);     \
>>> typeof(name(0)) name(unsigned long long *ctx)     \
>>> {     \
>>>          _Pragma("GCC diagnostic push")      \
>>>          _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")      \
>>>          return ____##name(___bpf_ctx_cast(args));      \
>>>          _Pragma("GCC diagnostic pop")      \
>>> }     \
>>> static __always_inline typeof(name(0))     \
>>> ____##name(unsigned long long *ctx, ##args)
>>>
>>> Some we have static function definition
>>>
>>> int ____test_struct_arg_1(unsigned long long *ctx, struct
>>> bpf_testmod_struct_arg_2 *a, int b, int c);
>>>
>>> But the function call inside the function test_struct_arg_1()
>>> is
>>>    ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2]);
>>>
>>> We have two problems here:
>>>    ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
>>> does not match the static function declaration.
>>> This is not problem if everything is int/ptr type.
>>> If one of argument is structure type, we will have
>>> type conversion problem. Let us this can be resolved
>>> somehow through some hack.
>>>
>>> More importantly, because some structure may take two
>>> registers,
>>>     ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
>>> may not be correct. In my above example, if the
>>> structure size is 16 bytes,
>>> then the actual call should be
>>>     ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2], ctx[3])
>>> So we need to provide how many extra registers are needed
>>> beyond ##args in the macro. I have not tried how to
>>> resolve this but this extra information in the macro
>>> definite is not user friendly.
>>>
>>> Not sure what is the best way to handle this issue (##args is not precise
>>> and needs addition registers for >8 struct arguments).
>>
>> The kernel is using this trick to cast 8 byte structs to u64:
>> /* cast any integer, pointer, or small struct to u64 */
>> #define UINTTYPE(size) \
>>          __typeof__(__builtin_choose_expr(size == 1,  (u8)1, \
>>                     __builtin_choose_expr(size == 2, (u16)2, \
>>                     __builtin_choose_expr(size == 4, (u32)3, \
>>                     __builtin_choose_expr(size == 8, (u64)4, \
>>                                           (void)5)))))
>> #define __CAST_TO_U64(x) ({ \
>>          typeof(x) __src = (x); \
>>          UINTTYPE(sizeof(x)) __dst; \
>>          memcpy(&__dst, &__src, sizeof(__dst)); \
>>          (u64)__dst; })
>>
>> casting 16 byte struct to two u64 can be similar.
>> Ideally we would declare bpf prog as:
>> SEC("fentry/bpf_testmod_test_struct_arg_1")
>> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 a, int b, int c) {
>> note there is no '*'. It's struct by value.
> 
> Agree. So I tried to compile this:
> 
> $ git diff
> diff --git a/tools/testing/selftests/bpf/progs/test_vmlinux.c
> b/tools/testing/selftests/bpf/progs/test_vmlinux.c
> index e9dfa0313d1b..dccb9ae9801f 100644
> --- a/tools/testing/selftests/bpf/progs/test_vmlinux.c
> +++ b/tools/testing/selftests/bpf/progs/test_vmlinux.c
> @@ -15,6 +15,16 @@ bool tp_btf_called = false;
>   bool kprobe_called = false;
>   bool fentry_called = false;
> 
> +typedef struct {
> +       void *x;
> +       int t;
> +} sockptr;
> +
> +static int blah(sockptr x)
> +{
> +       return x.t;
> +}
> +
>   SEC("tp/syscalls/sys_enter_nanosleep")
>   int handle__tp(struct trace_event_raw_sys_enter *args)
>   {
> @@ -30,7 +40,14 @@ int handle__tp(struct trace_event_raw_sys_enter *args)
>                  return 0;
> 
>          tp_called = true;
> -       return 0;
> +
> +       return blah(({ union {
> +               struct { u64 x, y; } z;
> +               sockptr s;
> +               } tmp = {.z = {0, 1}};
> +
> +               tmp.s;
> +       }));
>   }
> 
>   SEC("raw_tp/sys_enter")
> 
> 
> And it compiled. So I think it's possible to do u64 to struct
> conversion using this approach.
> We'd have to define two variations of macro -- one for structs <= 8
> bytes, another for structs > 8 and <= 16 bytes. One will "consume"
> single ctx[] slot, another -- will consume both. And then each variant
> knows which other macro to refer to after itself.
> 
> A bit of macro hackery, but it should work.

Thanks for suggestion. This approach might work. Let me
give a try.

> 
> 
>> The main challenge is how to do the math in the BPF_PROG macros.
>> Currently it's doing:
>> #define ___bpf_ctx_cast1(x)           ___bpf_ctx_cast0(), (void *)ctx[0]
>> #define ___bpf_ctx_cast2(x, args...)  ___bpf_ctx_cast1(args), (void *)ctx[1]
>> #define ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), (void *)ctx[2]
>> #define ___bpf_ctx_cast4(x, args...)  ___bpf_ctx_cast3(args), (void *)ctx[3]
>>
>> The ctx[index] is one-to-one with argument position.
>> That 'index' needs to be calculated.
>> Maybe something like UINTTYPE() that applies to previous arguments?
>> #define REG_CNT(arg) \
>>          __builtin_choose_expr(sizeof(arg) == 1,  1, \
>>          __builtin_choose_expr(sizeof(arg) == 2,  1, \
>>          __builtin_choose_expr(sizeof(arg) == 4,  1, \
>>          __builtin_choose_expr(sizeof(arg) == 8,  1, \
>>          __builtin_choose_expr(sizeof(arg) == 16,  2, \
>>                                           (void)0)))))
>>
>> #define ___bpf_reg_cnt0()            0
>> #define ___bpf_reg_cnt1(x)          ___bpf_reg_cnt0() + REG_CNT(x)
>> #define ___bpf_reg_cnt2(x, args...) ___bpf_reg_cnt1(args) + REG_CNT(x)
>> #define ___bpf_reg_cnt(args...)    ___bpf_apply(___bpf_reg_cnt, ___bpf_narg(args))(args)
>>
>> This way the macro will calculate the index inside ctx[] array.
>>
>> and then inside ___bpf_ctx_castN macro use ___bpf_reg_cnt.
>> Instead of:
>> ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), (void *)ctx[2]
>> it will be
>> ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), \
>>    __builtin_choose_expr(sizeof(x) <= 8, (void *)ctx[___bpf_reg_cnt(args)],
>>                          *(typeof(x) *) &ctx[___bpf_reg_cnt(args)])
>>
>> x - is one of the arguments.
>> args - all args before 'x'. Doing __bpf_reg_cnt on them should calculate the index.
>> *(typeof(x) *)& should type cast to struct of 16 bytes.
>>
>> Rough idea, of course.
>>
>> Another alternative is instead of:
>> #define BPF_PROG(name, args...)
>> name(unsigned long long *ctx);
>> do:
>> #define BPF_PROG(name, args...)
>> struct XX {
>>    macro inserts all 'args' here separated by ; so it becomes a proper struct
>> };
>> name(struct XX *ctx);
>>
>> and then instead of doing ___bpf_ctx_castN for each argument
>> do single cast of all of 'u64 ctx[N]' passed from fentry into 'struct XX *'.
>> The problem with this approach that small args like char, short, int needs to
>> be declared in struct XX with __align__(8).
>>
>> Both approaches may be workable?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments
  2022-08-24 22:35           ` Alexei Starovoitov
@ 2022-08-25  4:10             ` Yonghong Song
  0 siblings, 0 replies; 19+ messages in thread
From: Yonghong Song @ 2022-08-25  4:10 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann, Kernel Team



On 8/24/22 3:35 PM, Alexei Starovoitov wrote:
> On Wed, Aug 24, 2022 at 12:05 PM Yonghong Song <yhs@fb.com> wrote:
>>
>>
>>
>> On 8/18/22 1:44 PM, Alexei Starovoitov wrote:
>>> On Wed, Aug 17, 2022 at 09:56:23PM -0700, Yonghong Song wrote:
>>>>
>>>>
>>>> On 8/15/22 3:44 PM, Alexei Starovoitov wrote:
>>>>> On Thu, Aug 11, 2022 at 10:24 PM Yonghong Song <yhs@fb.com> wrote:
>>>>>>
>>>>>> In C, struct value can be passed as a function argument.
>>>>>> For small structs, struct value may be passed in
>>>>>> one or more registers. For trampoline based bpf programs,
>>>>>> This would cause complication since one-to-one mapping between
>>>>>> function argument and arch argument register is not valid
>>>>>> any more.
>>>>>>
>>>>>> To support struct value argument and make bpf programs
>>>>>> easy to write, the bpf program function parameter is
>>>>>> changed from struct type to a pointer to struct type.
>>>>>> The following is a simplified example.
>>>>>>
>>>>>> In one of later selftests, we have a bpf_testmod function:
>>>>>>        struct bpf_testmod_struct_arg_2 {
>>>>>>            long a;
>>>>>>            long b;
>>>>>>        };
>>>>>>        noinline int
>>>>>>        bpf_testmod_test_struct_arg_2(int a, struct bpf_testmod_struct_arg_2 b, int c) {
>>>>>>            bpf_testmod_test_struct_arg_result = a + b.a + b.b + c;
>>>>>>            return bpf_testmod_test_struct_arg_result;
>>>>>>        }
>>>>>>
>>>>>> When a bpf program is attached to the bpf_testmod function
>>>>>> bpf_testmod_test_struct_arg_2(), the bpf program may look like
>>>>>>        SEC("fentry/bpf_testmod_test_struct_arg_2")
>>>>>>        int BPF_PROG(test_struct_arg_3, int a, struct bpf_testmod_struct_arg_2 *b, int c)
>>>>>>        {
>>>>>>            t2_a = a;
>>>>>>            t2_b_a = b->a;
>>>>>>            t2_b_b = b->b;
>>>>>>            t2_c = c;
>>>>>>            return 0;
>>>>>>        }
>>>>>>
>>>>>> Basically struct value becomes a pointer to the struct.
>>>>>> The trampoline stack will be increased to store the stack values and
>>>>>> the pointer to these values will be saved in the stack slot corresponding
>>>>>> to that argument. For x86_64, the struct size is limited up to 16 bytes
>>>>>> so the struct can fit in one or two registers. The struct size of more
>>>>>> than 16 bytes is not supported now as our current use case is
>>>>>> for sockptr_t in the argument. We could handle such large struct's
>>>>>> in the future if we have concrete use cases.
>>>>>>
>>>>>> The main changes are in save_regs() and restore_regs(). The following
>>>>>> illustrated the trampoline asm codes for save_regs() and restore_regs().
>>>>>> save_regs():
>>>>>>        /* first argument */
>>>>>>        mov    DWORD PTR [rbp-0x18],edi
>>>>>>        /* second argument: struct, save actual values and put the pointer to the slot */
>>>>>>        lea    rax,[rbp-0x40]
>>>>>>        mov    QWORD PTR [rbp-0x10],rax
>>>>>>        mov    QWORD PTR [rbp-0x40],rsi
>>>>>>        mov    QWORD PTR [rbp-0x38],rdx
>>>>>>        /* third argument */
>>>>>>        mov    DWORD PTR [rbp-0x8],esi
>>>>>> restore_regs():
>>>>>>        mov    edi,DWORD PTR [rbp-0x18]
>>>>>>        mov    rsi,QWORD PTR [rbp-0x40]
>>>>>>        mov    rdx,QWORD PTR [rbp-0x38]
>>>>>>        mov    esi,DWORD PTR [rbp-0x8]
>>>>>
>>>>> Not sure whether it was discussed before, but
>>>>> why cannot we adjust the bpf side instead?
>>>>> Technically struct passing between bpf progs was never
>>>>> officially supported. llvm generates something.
>>>>> Probably always passes by reference, but we can adjust
>>>>> that behavior without breaking any programs because
>>>>> we don't have bpf libraries. Programs are fully contained
>>>>> in one or few files. libbpf can do static linking, but
>>>>> without any actual libraries the chance of breaking
>>>>> backward compat is close to zero.
>>>>
>>>> Agree. At this point, we don't need to worry about
>>>> compatibility between bpf program and bpf program libraries.
>>>>
>>>>> Can we teach llvm to pass sizeof(struct) <= 16 in
>>>>> two bpf registers?
>>>>
>>>> Yes, we can. I just hacked llvm and was able to
>>>> do that.
>>>>
>>>>> Then we wouldn't need to have a discrepancy between
>>>>> kernel function prototype and bpf fentry prog proto.
>>>>> Both will have struct by value in the same spot.
>>>>> The trampoline generation will be simpler for x86 and
>>>>> its runtime faster too.
>>>>
>>>> I tested x86 and arm64 both supports two registers
>>>> for a 16 byte struct.
>>>>
>>>>> The other architectures that pass small structs by reference
>>>>> can do a bit more work in the trampoline: copy up to 16 byte
>>>>> and bpf prog side will see it as they were passed in 'registers'.
>>>>> wdyt?
>>>>
>>>> I know systemz and Hexagon will pass by reference for any
>>>> struct size >= 8 bytes. Didn't complete check others.
>>>>
>>>> But since x86 and arm64 supports direct value passing
>>>> with two registers, we should be okay. As you mentioned,
>>>> we could support systemz/hexagon style of struct passing
>>>> by copying the values to the stack.
>>>>
>>>>
>>>> But I have a problem how to define a user friendly
>>>> macro like BPF_PROG for user to use.
>>>>
>>>> Let us say, we have a program like below:
>>>> SEC("fentry/bpf_testmod_test_struct_arg_1")
>>>> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 *a, int b,
>>>> int c) {
>>>> ...
>>>> }
>>>>
>>>> We have BPF_PROG macro definition here:
>>>>
>>>> #define BPF_PROG(name, args...)     \
>>>> name(unsigned long long *ctx);     \
>>>> static __always_inline typeof(name(0))     \
>>>> ____##name(unsigned long long *ctx, ##args);     \
>>>> typeof(name(0)) name(unsigned long long *ctx)     \
>>>> {     \
>>>>           _Pragma("GCC diagnostic push")      \
>>>>           _Pragma("GCC diagnostic ignored \"-Wint-conversion\"")      \
>>>>           return ____##name(___bpf_ctx_cast(args));      \
>>>>           _Pragma("GCC diagnostic pop")      \
>>>> }     \
>>>> static __always_inline typeof(name(0))     \
>>>> ____##name(unsigned long long *ctx, ##args)
>>>>
>>>> Some we have static function definition
>>>>
>>>> int ____test_struct_arg_1(unsigned long long *ctx, struct
>>>> bpf_testmod_struct_arg_2 *a, int b, int c);
>>>>
>>>> But the function call inside the function test_struct_arg_1()
>>>> is
>>>>     ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2]);
>>>>
>>>> We have two problems here:
>>>>     ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
>>>> does not match the static function declaration.
>>>> This is not problem if everything is int/ptr type.
>>>> If one of argument is structure type, we will have
>>>> type conversion problem. Let us this can be resolved
>>>> somehow through some hack.
>>>>
>>>> More importantly, because some structure may take two
>>>> registers,
>>>>      ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2])
>>>> may not be correct. In my above example, if the
>>>> structure size is 16 bytes,
>>>> then the actual call should be
>>>>      ____test_struct_arg_1(ctx, ctx[0], ctx[1], ctx[2], ctx[3])
>>>> So we need to provide how many extra registers are needed
>>>> beyond ##args in the macro. I have not tried how to
>>>> resolve this but this extra information in the macro
>>>> definite is not user friendly.
>>>>
>>>> Not sure what is the best way to handle this issue (##args is not precise
>>>> and needs addition registers for >8 struct arguments).
>>>
>>> The kernel is using this trick to cast 8 byte structs to u64:
>>> /* cast any integer, pointer, or small struct to u64 */
>>> #define UINTTYPE(size) \
>>>           __typeof__(__builtin_choose_expr(size == 1,  (u8)1, \
>>>                      __builtin_choose_expr(size == 2, (u16)2, \
>>>                      __builtin_choose_expr(size == 4, (u32)3, \
>>>                      __builtin_choose_expr(size == 8, (u64)4, \
>>>                                            (void)5)))))
>>> #define __CAST_TO_U64(x) ({ \
>>>           typeof(x) __src = (x); \
>>>           UINTTYPE(sizeof(x)) __dst; \
>>>           memcpy(&__dst, &__src, sizeof(__dst)); \
>>>           (u64)__dst; })
>>>
>>> casting 16 byte struct to two u64 can be similar.
>>> Ideally we would declare bpf prog as:
>>> SEC("fentry/bpf_testmod_test_struct_arg_1")
>>> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2 a, int b, int c) {
>>> note there is no '*'. It's struct by value.
>>> The main challenge is how to do the math in the BPF_PROG macros.
>>> Currently it's doing:
>>> #define ___bpf_ctx_cast1(x)           ___bpf_ctx_cast0(), (void *)ctx[0]
>>> #define ___bpf_ctx_cast2(x, args...)  ___bpf_ctx_cast1(args), (void *)ctx[1]
>>> #define ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), (void *)ctx[2]
>>> #define ___bpf_ctx_cast4(x, args...)  ___bpf_ctx_cast3(args), (void *)ctx[3]
>>>
>>> The ctx[index] is one-to-one with argument position.
>>> That 'index' needs to be calculated.
>>> Maybe something like UINTTYPE() that applies to previous arguments?
>>> #define REG_CNT(arg) \
>>>           __builtin_choose_expr(sizeof(arg) == 1,  1, \
>>>           __builtin_choose_expr(sizeof(arg) == 2,  1, \
>>>           __builtin_choose_expr(sizeof(arg) == 4,  1, \
>>>           __builtin_choose_expr(sizeof(arg) == 8,  1, \
>>>           __builtin_choose_expr(sizeof(arg) == 16,  2, \
>>>                                            (void)0)))))
>>>
>>> #define ___bpf_reg_cnt0()            0
>>> #define ___bpf_reg_cnt1(x)          ___bpf_reg_cnt0() + REG_CNT(x)
>>> #define ___bpf_reg_cnt2(x, args...) ___bpf_reg_cnt1(args) + REG_CNT(x)
>>> #define ___bpf_reg_cnt(args...)    ___bpf_apply(___bpf_reg_cnt, ___bpf_narg(args))(args)
>>>
>>> This way the macro will calculate the index inside ctx[] array.
>>>
>>> and then inside ___bpf_ctx_castN macro use ___bpf_reg_cnt.
>>> Instead of:
>>> ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), (void *)ctx[2]
>>> it will be
>>> ___bpf_ctx_cast3(x, args...)  ___bpf_ctx_cast2(args), \
>>>     __builtin_choose_expr(sizeof(x) <= 8, (void *)ctx[___bpf_reg_cnt(args)],
>>>                           *(typeof(x) *) &ctx[___bpf_reg_cnt(args)])
>>
>> I tried this approach. The only problem is sizeof(x) <= 8 may also be
>> a structure. Since essentially we will have a type conversion like
>>      (struct <name))(void *)ctx[...]
>> and this won't work.
> 
> Right. Just sizeof(x) <= 8 won't work.
> 
>> So ideally we want something like
>> __builtin_choose_expr(is_struct_type(x), *(typeof(x) *)
>> &ctx[___bpf_reg_cnt(args)]
>>       (void *)ctx[___bpf_reg_cnt(args)])
>> here is_struct_type(x) tells whether the type is a struct type
>> or typedef of a struct. Currently we don't have a such a macro/builtin yet.
> 
> Got it.
> Maybe we can do *(typeof(x) *) &ctx[___bpf_reg_cnt(args)]
> unconditionally for all args?
> Only endianness will be an issue.

Yes, endianness will be an issue.

> 
>> Note that in order to make sizeof(x) or is_struct_type(x) work, we
>> need to separate type and argument name like
>>
>> int BPF_PROG(test_struct_arg_1, struct bpf_testmod_struct_arg_2, a, int,
>> b, int, c)
> 
> agree.
> 
>> Which will make the macro incompatible with existing BPF_PROG macro.
> 
> right. we need a new macro regardless.
> 
>>>
>>> x - is one of the arguments.
>>> args - all args before 'x'. Doing __bpf_reg_cnt on them should calculate the index.
>>> *(typeof(x) *)& should type cast to struct of 16 bytes.
>>>
>>> Rough idea, of course.
>>>
>>> Another alternative is instead of:
>>> #define BPF_PROG(name, args...)
>>> name(unsigned long long *ctx);
>>> do:
>>> #define BPF_PROG(name, args...)
>>> struct XX {
>>>     macro inserts all 'args' here separated by ; so it becomes a proper struct
>>> };
>>> name(struct XX *ctx);
>>>
>>> and then instead of doing ___bpf_ctx_castN for each argument
>>> do single cast of all of 'u64 ctx[N]' passed from fentry into 'struct XX *'.
>>> The problem with this approach that small args like char, short, int needs to
>>> be declared in struct XX with __align__(8).
>>
>> This should work. But since we will change context type from
>> "unsigned long long *" to "struct XX *", the code pattern will look like
>>
>> BPF_PROG2_DECL(test_struct_arg_1);
>> SEC("fentry/bpf_testmod_test_struct_arg_1")
>> int BPF_PROG2(test_struct_arg_1, struct bpf_testmod_struct_arg_2, a,
>> int, b, int, c)
>>
>> Where BPF_PROG2_DECL will provide a forward declaration like
>> #define BPF_PROG2_DECL(name) struct _____##name;
>>
>> and BPF_PROG2 will look like (not handling zero argument yere)
>>
>> #define BPF_PROG2(name, args...)                                      \
>> name(struct _____##name *ctx);                                        \
>> struct _____##name {                                                  \
>>          ___bpf_ctx_field(args)                                         \
>> };                                                                    \
>> static __always_inline typeof(name(0))                                \
>> ____##name(struct _____##name *ctx, ___bpf_ctx_decl(args));           \
>> typeof(name(0)) name(struct _____##name *ctx)                         \
>> {                                                                     \
>>          return ____##name(ctx, ___bpf_ctx_arg(args));                  \
>> }                                                                     \
>> static __always_inline typeof(name(0))                                \
>> ____##name(struct _____##name *ctx, ___bpf_ctx_decl(args))
>>
>> where __bpf_ctx_field(args) will generate
>>      struct bpf_testmod_struct_arg_2 a;
>>      int b;
>>      int c;
>>
>> ___bpf_ctx_arg(args) will generate
>>      ctx->a, ctx->b, ctx->c
>>
>> and ___bpf_ctx_decl(args) will generate proper argument prototypes
>> the same way as in BPF_PROG macro.
> 
> Great that 2nd approach works :)
> If 1st approach can be made to work we won't need
> additional line BPF_PROG2_DECL(test_struct_arg_1);
> right?

Right. I don't like addition macro like BPF_PROG2_DECL as well.
I just checked Andrii's approach. It may work and we may not need
the addition macro any more.

> 
> Either way we can start with 2nd approach and improve on it later.

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2022-08-25  4:10 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-12  5:24 [PATCH bpf-next v2 0/6] bpf: Support struct argument for trampoline base progs Yonghong Song
2022-08-12  5:24 ` [PATCH bpf-next v2 1/6] bpf: Add struct argument info in btf_func_model Yonghong Song
2022-08-12  5:24 ` [PATCH bpf-next v2 2/6] bpf: x86: Rename stack_size to regs_off in {save,restore}_regs() Yonghong Song
2022-08-12  5:24 ` [PATCH bpf-next v2 3/6] bpf: x86: Support in-register struct arguments Yonghong Song
2022-08-14 20:24   ` Jiri Olsa
2022-08-15  5:29     ` Yonghong Song
2022-08-15  7:29       ` Jiri Olsa
2022-08-15 15:25         ` Yonghong Song
2022-08-15 22:44   ` Alexei Starovoitov
2022-08-18  4:56     ` Yonghong Song
2022-08-18 20:44       ` Alexei Starovoitov
2022-08-24 19:04         ` Yonghong Song
2022-08-24 22:35           ` Alexei Starovoitov
2022-08-25  4:10             ` Yonghong Song
2022-08-24 19:05         ` Andrii Nakryiko
2022-08-25  4:04           ` Yonghong Song
2022-08-12  5:24 ` [PATCH bpf-next v2 4/6] bpf: arm64: No support of struct argument Yonghong Song
2022-08-12  5:24 ` [PATCH bpf-next v2 5/6] bpf: Populate struct argument info in btf_func_model Yonghong Song
2022-08-12  5:24 ` [PATCH bpf-next v2 6/6] selftests/bpf: Add struct argument tests with fentry/fexit programs Yonghong Song

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.