bpf.vger.kernel.org archive mirror
* [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton.
@ 2021-04-17  3:32 Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 01/15] bpf: Introduce bpf_sys_bpf() helper and program type Alexei Starovoitov
                   ` (14 more replies)
  0 siblings, 15 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

This is a first step towards signed bpf programs and the third approach of that kind.
The first approach was to bring libbpf into the kernel as a user-mode-driver.
The second approach was to invent a new file format and let the kernel execute
that format as a sequence of syscalls that create maps and load programs.
This third approach uses a new type of bpf program instead of inventing a file format.
The 1st and 2nd approaches had too many downsides compared to this 3rd one and were
discarded after months of work.

To make it work the following new concepts are introduced:
1. syscall bpf program type
A kind of bpf program that can do sys_bpf and sys_close syscalls.
It can only execute in user context.

2. FD array or FD index.
Traditionally BPF instructions are patched with FDs.
This means that maps have to be created first and the instructions modified
afterwards, which breaks signature verification if the program is signed.
Instead of patching each instruction with an FD, patch it with an index into an
array of FDs.
That keeps the program signature stable even if the program uses maps.
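
The effect on the signature can be sketched in plain user-space C. This is
only an illustration: struct insn is a simplified stand-in for struct bpf_insn,
digest() is an FNV-1a hash standing in for a real signature, and
patch_with_fd() and resolve_fd() are hypothetical names, not kernel or libbpf
functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct bpf_insn; the layout is illustrative only. */
struct insn {
	uint8_t code;
	uint8_t regs;
	int16_t off;
	int32_t imm;
};

/* FNV-1a over the instruction bytes; stands in for a real signature hash. */
static uint64_t digest(const struct insn *prog, size_t cnt)
{
	const unsigned char *p = (const unsigned char *)prog;
	uint64_t h = 0xcbf29ce484222325ULL;
	size_t i;

	for (i = 0; i < cnt * sizeof(*prog); i++) {
		h ^= p[i];
		h *= 0x100000001b3ULL;
	}
	return h;
}

/* Traditional scheme: the loader writes the map FD into the instruction. */
static void patch_with_fd(struct insn *ld_map, int map_fd)
{
	ld_map->imm = map_fd;
}

/*
 * FD-array scheme: imm holds a fixed index chosen at build time and the
 * FDs live in a separate array, so the instruction bytes never change.
 */
static int resolve_fd(const struct insn *ld_map, const int *fd_array)
{
	return fd_array[ld_map->imm];
}
```

With patch_with_fd() the digest depends on whatever FD the kernel happened to
hand out at load time. With resolve_fd() only the side array differs between
runs, so a digest computed over the instructions at build time stays valid.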

3. loader program that is generated as "strace of libbpf".
When libbpf is loading bpf_file.o it does a bunch of sys_bpf() syscalls to
load BTF, create maps, populate maps and finally load programs.
Instead of actually doing the syscalls, generate a trace of what libbpf
would have done and represent it as the "loader program".
The "loader program" consists of a single map and a single bpf program that
does those syscalls.
Executing such a "loader program" via the bpf_prog_test_run() command will
replay the sequence of syscalls that libbpf would have done, which will result
in the same maps being created and programs loaded as specified in the elf file.
The "loader program" removes libelf and the majority of the libbpf dependency
from the program loading process.
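
The record-then-replay idea can be modelled in a few lines of user-space C.
This is a toy model under loud assumptions: the real generator emits BPF
instructions plus a data map, whereas the sketch records plain structs, and
every name in it (struct step, trace_emit(), trace_replay(), log_exec()) is
hypothetical.

```c
#include <stddef.h>

/* Hypothetical command set for the sketch, not the real enum bpf_cmd. */
enum cmd { CMD_BTF_LOAD, CMD_MAP_CREATE, CMD_MAP_UPDATE, CMD_PROG_LOAD };

struct step {
	enum cmd cmd;
	int arg;	/* stands in for the union bpf_attr payload */
};

#define MAX_STEPS 16

struct trace {
	struct step steps[MAX_STEPS];
	size_t cnt;
};

/* Record a command instead of executing it; -1 when the trace is full. */
static int trace_emit(struct trace *t, enum cmd cmd, int arg)
{
	if (t->cnt >= MAX_STEPS)
		return -1;
	t->steps[t->cnt].cmd = cmd;
	t->steps[t->cnt].arg = arg;
	t->cnt++;
	return 0;
}

/* The order a loader would follow for an object with one map and one prog. */
static int trace_object(struct trace *t)
{
	if (trace_emit(t, CMD_BTF_LOAD, 0) ||
	    trace_emit(t, CMD_MAP_CREATE, 1) ||
	    trace_emit(t, CMD_MAP_UPDATE, 1) ||
	    trace_emit(t, CMD_PROG_LOAD, 2))
		return -1;
	return 0;
}

typedef int (*exec_fn)(const struct step *s, void *cookie);

/* Replay the recorded steps in order, stopping at the first failure. */
static int trace_replay(const struct trace *t, exec_fn exec, void *cookie)
{
	size_t i;
	int err;

	for (i = 0; i < t->cnt; i++) {
		err = exec(&t->steps[i], cookie);
		if (err)
			return err;
	}
	return 0;
}

/* Example executor: only logs the commands it is asked to run. */
struct log {
	enum cmd seen[MAX_STEPS];
	size_t cnt;
};

static int log_exec(const struct step *s, void *cookie)
{
	struct log *l = cookie;

	if (l->cnt >= MAX_STEPS)
		return -1;
	l->seen[l->cnt++] = s->cmd;
	return 0;
}
```

Generation (trace_object()) and execution (trace_replay()) are fully
decoupled here, which is the property that lets the trace be produced once at
build time and replayed later without the ELF parser.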

4. light skeleton
Instead of embedding the whole elf file into the skeleton and using libbpf
to parse it later, generate a loader program and embed it into a "light skeleton".
Such a skeleton can load the same set of elf files, but it doesn't need
libbpf and libelf to do that. It only needs a few sys_bpf wrappers.

Future steps:
- support CO-RE in the kernel. This patch set is already too big,
so that critical feature is left for the next step.
- generate light skeleton in golang to allow such users to use BTF and
all other features provided by libbpf
- generate light skeleton for kernel, so that bpf programs can be embedded
in the kernel module. The UMD usage in bpf_preload will be replaced with
such skeleton, so bpf_preload would become a standard kernel module
without user space dependency.
- finally do the signing of the loader program.

The patches are work in progress with a few rough edges.

Alexei Starovoitov (15):
  bpf: Introduce bpf_sys_bpf() helper and program type.
  bpf: Introduce bpfptr_t user/kernel pointer.
  bpf: Prepare bpf syscall to be used from kernel and user space.
  libbpf: Support for syscall program type
  selftests/bpf: Test for syscall program type
  bpf: Make btf_load command to be bpfptr_t compatible.
  selftests/bpf: Test for btf_load command.
  bpf: Introduce fd_idx
  libbpf: Support for fd_idx
  bpf: Add bpf_btf_find_by_name_kind() helper.
  bpf: Add bpf_sys_close() helper.
  libbpf: Change the order of data and text relocations.
  libbpf: Generate loader program out of BPF ELF file.
  bpftool: Use syscall/loader program in "prog load" and "gen skeleton"
    command.
  selftests/bpf: Convert few tests to light skeleton.

 include/linux/bpf.h                           |  19 +-
 include/linux/bpf_types.h                     |   2 +
 include/linux/bpf_verifier.h                  |   1 +
 include/linux/bpfptr.h                        |  81 +++
 include/linux/btf.h                           |   2 +-
 include/uapi/linux/bpf.h                      |  39 +-
 kernel/bpf/bpf_iter.c                         |  13 +-
 kernel/bpf/btf.c                              |  59 +-
 kernel/bpf/syscall.c                          | 179 ++++--
 kernel/bpf/verifier.c                         |  81 ++-
 net/bpf/test_run.c                            |  45 +-
 tools/bpf/bpftool/Makefile                    |   2 +-
 tools/bpf/bpftool/gen.c                       | 263 ++++++++-
 tools/bpf/bpftool/main.c                      |   7 +-
 tools/bpf/bpftool/main.h                      |   1 +
 tools/bpf/bpftool/prog.c                      |  78 +++
 tools/bpf/bpftool/xlated_dumper.c             |   3 +
 tools/include/uapi/linux/bpf.h                |  39 +-
 tools/lib/bpf/Build                           |   2 +-
 tools/lib/bpf/bpf.c                           |  62 ++
 tools/lib/bpf/bpf.h                           |  35 ++
 tools/lib/bpf/bpf_gen_internal.h              |  38 ++
 tools/lib/bpf/gen_trace.c                     | 529 ++++++++++++++++++
 tools/lib/bpf/libbpf.c                        | 346 ++++++++++--
 tools/lib/bpf/libbpf.map                      |   1 +
 tools/lib/bpf/libbpf_internal.h               |   3 +
 tools/testing/selftests/bpf/.gitignore        |   1 +
 tools/testing/selftests/bpf/Makefile          |  16 +-
 .../selftests/bpf/prog_tests/fentry_fexit.c   |   6 +-
 .../selftests/bpf/prog_tests/fentry_test.c    |   4 +-
 .../selftests/bpf/prog_tests/fexit_sleep.c    |   6 +-
 .../selftests/bpf/prog_tests/fexit_test.c     |   4 +-
 .../selftests/bpf/prog_tests/kfunc_call.c     |   6 +-
 .../selftests/bpf/prog_tests/syscall.c        |  53 ++
 tools/testing/selftests/bpf/progs/syscall.c   | 121 ++++
 .../selftests/bpf/progs/test_subprogs.c       |  13 +
 36 files changed, 1972 insertions(+), 188 deletions(-)
 create mode 100644 include/linux/bpfptr.h
 create mode 100644 tools/lib/bpf/bpf_gen_internal.h
 create mode 100644 tools/lib/bpf/gen_trace.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/syscall.c
 create mode 100644 tools/testing/selftests/bpf/progs/syscall.c

-- 
2.30.2



* [PATCH bpf-next 01/15] bpf: Introduce bpf_sys_bpf() helper and program type.
  2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
@ 2021-04-17  3:32 ` Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 02/15] bpf: Introduce bpfptr_t user/kernel pointer Alexei Starovoitov
                   ` (13 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add placeholders for bpf_sys_bpf() helper and new program type.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf.h            | 10 ++++++++
 include/linux/bpf_types.h      |  2 ++
 include/uapi/linux/bpf.h       |  8 ++++++
 kernel/bpf/syscall.c           | 46 ++++++++++++++++++++++++++++++++++
 net/bpf/test_run.c             | 43 +++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h |  8 ++++++
 6 files changed, 117 insertions(+)
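
The two context checks added in the diff below (the verifier-time access
check and the size check at test-run time) can be exercised in isolation with
a small user-space sketch. ctx_access_ok(), ctx_size_ok() and CTX_MAX_OFF are
hypothetical names for this sketch; CTX_MAX_OFF plays the role of U16_MAX, and
the size <= 0 guard is an addition of the sketch, not part of the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define CTX_MAX_OFF 0xffff	/* plays the role of U16_MAX in the patch */

/* Verifier-time rule: offset within range and naturally aligned. */
static bool ctx_access_ok(int off, int size)
{
	if (size <= 0)		/* sketch-only guard, avoids off % 0 */
		return false;
	if (off < 0 || off >= CTX_MAX_OFF)
		return false;
	if (off % size != 0)
		return false;
	return true;
}

/*
 * Run-time rule: the supplied context must cover every offset the
 * program was verified to touch and must not exceed the same cap.
 */
static bool ctx_size_ok(uint32_t ctx_size_in, uint32_t max_ctx_offset)
{
	return ctx_size_in >= max_ctx_offset && ctx_size_in <= CTX_MAX_OFF;
}
```

The first check bounds what the program may read or write, and the second
makes sure the buffer supplied at run time is large enough for those accesses.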

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index ff8cd68c01b3..f2e77ac3d5eb 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1823,6 +1823,9 @@ static inline bool bpf_map_is_dev_bound(struct bpf_map *map)
 
 struct bpf_map *bpf_map_offload_map_alloc(union bpf_attr *attr);
 void bpf_map_offload_map_free(struct bpf_map *map);
+int bpf_prog_test_run_syscall(struct bpf_prog *prog,
+			      const union bpf_attr *kattr,
+			      union bpf_attr __user *uattr);
 #else
 static inline int bpf_prog_offload_init(struct bpf_prog *prog,
 					union bpf_attr *attr)
@@ -1848,6 +1851,13 @@ static inline struct bpf_map *bpf_map_offload_map_alloc(union bpf_attr *attr)
 static inline void bpf_map_offload_map_free(struct bpf_map *map)
 {
 }
+
+static inline int bpf_prog_test_run_syscall(struct bpf_prog *prog,
+					    const union bpf_attr *kattr,
+					    union bpf_attr __user *uattr)
+{
+	return -ENOTSUPP;
+}
 #endif /* CONFIG_NET && CONFIG_BPF_SYSCALL */
 
 #if defined(CONFIG_INET) && defined(CONFIG_BPF_SYSCALL)
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index f883f01a5061..a9db1eae6796 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -77,6 +77,8 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_LSM, lsm,
 	       void *, void *)
 #endif /* CONFIG_BPF_LSM */
 #endif
+BPF_PROG_TYPE(BPF_PROG_TYPE_SYSCALL, bpf_syscall,
+	      void *, void *)
 
 BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY, array_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_PERCPU_ARRAY, percpu_array_map_ops)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index df164a44bb41..ce3e76ff08cd 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -937,6 +937,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_EXT,
 	BPF_PROG_TYPE_LSM,
 	BPF_PROG_TYPE_SK_LOOKUP,
+	BPF_PROG_TYPE_SYSCALL,
 };
 
 enum bpf_attach_type {
@@ -4708,6 +4709,12 @@ union bpf_attr {
  *	Return
  *		The number of traversed map elements for success, **-EINVAL** for
  *		invalid **flags**.
+ *
+ * long bpf_sys_bpf(u32 cmd, void *attr, u32 attr_size)
+ * 	Description
+ * 		Execute bpf syscall with given arguments.
+ * 	Return
+ * 		A syscall result.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4875,6 +4882,7 @@ union bpf_attr {
 	FN(sock_from_file),		\
 	FN(check_mtu),			\
 	FN(for_each_map_elem),		\
+	FN(sys_bpf),			\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index fd495190115e..0e4ece4d57e0 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -4497,3 +4497,49 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 
 	return err;
 }
+
+static bool syscall_prog_is_valid_access(int off, int size,
+					 enum bpf_access_type type,
+					 const struct bpf_prog *prog,
+					 struct bpf_insn_access_aux *info)
+{
+	if (off < 0 || off >= U16_MAX)
+		return false;
+	if (off % size != 0)
+		return false;
+	return true;
+}
+
+BPF_CALL_3(bpf_sys_bpf, int, cmd, void *, attr, u32, attr_size)
+{
+	return -EINVAL;
+}
+
+const struct bpf_func_proto bpf_sys_bpf_proto = {
+	.func		= bpf_sys_bpf,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_ANYTHING,
+	.arg2_type	= ARG_PTR_TO_MEM,
+	.arg3_type	= ARG_CONST_SIZE,
+};
+
+static const struct bpf_func_proto *
+syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
+{
+	switch (func_id) {
+	case BPF_FUNC_sys_bpf:
+		return &bpf_sys_bpf_proto;
+	default:
+		return bpf_base_func_proto(func_id);
+	}
+}
+
+const struct bpf_verifier_ops bpf_syscall_verifier_ops = {
+	.get_func_proto  = syscall_prog_func_proto,
+	.is_valid_access = syscall_prog_is_valid_access,
+};
+
+const struct bpf_prog_ops bpf_syscall_prog_ops = {
+	.test_run = bpf_prog_test_run_syscall,
+};
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index a5d72c48fb66..1783ea77b95c 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -918,3 +918,46 @@ int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog, const union bpf_attr *kat
 	kfree(user_ctx);
 	return ret;
 }
+
+int bpf_prog_test_run_syscall(struct bpf_prog *prog,
+			      const union bpf_attr *kattr,
+			      union bpf_attr __user *uattr)
+{
+	void __user *ctx_in = u64_to_user_ptr(kattr->test.ctx_in);
+	__u32 ctx_size_in = kattr->test.ctx_size_in;
+	void *ctx = NULL;
+	u32 retval;
+	int err = 0;
+
+	/* doesn't support data_in/out, ctx_out, duration, or repeat or flags */
+	if (kattr->test.data_in || kattr->test.data_out ||
+	    kattr->test.ctx_out || kattr->test.duration ||
+	    kattr->test.repeat || kattr->test.flags)
+		return -EINVAL;
+
+	if (ctx_size_in < prog->aux->max_ctx_offset ||
+	    ctx_size_in > U16_MAX)
+		return -EINVAL;
+
+	if (ctx_size_in) {
+		ctx = kzalloc(ctx_size_in, GFP_USER);
+		if (!ctx)
+			return -ENOMEM;
+		if (copy_from_user(ctx, ctx_in, ctx_size_in)) {
+			err = -EFAULT;
+			goto out;
+		}
+	}
+	retval = bpf_prog_run_pin_on_cpu(prog, ctx);
+
+	if (copy_to_user(&uattr->test.retval, &retval, sizeof(u32)))
+		err = -EFAULT;
+	if (ctx_size_in)
+		if (copy_to_user(ctx_in, ctx, ctx_size_in)) {
+			err = -EFAULT;
+			goto out;
+		}
+out:
+	kfree(ctx);
+	return err;
+}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index df164a44bb41..ce3e76ff08cd 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -937,6 +937,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_EXT,
 	BPF_PROG_TYPE_LSM,
 	BPF_PROG_TYPE_SK_LOOKUP,
+	BPF_PROG_TYPE_SYSCALL,
 };
 
 enum bpf_attach_type {
@@ -4708,6 +4709,12 @@ union bpf_attr {
  *	Return
  *		The number of traversed map elements for success, **-EINVAL** for
  *		invalid **flags**.
+ *
+ * long bpf_sys_bpf(u32 cmd, void *attr, u32 attr_size)
+ * 	Description
+ * 		Execute bpf syscall with given arguments.
+ * 	Return
+ * 		A syscall result.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4875,6 +4882,7 @@ union bpf_attr {
 	FN(sock_from_file),		\
 	FN(check_mtu),			\
 	FN(for_each_map_elem),		\
+	FN(sys_bpf),			\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.30.2



* [PATCH bpf-next 02/15] bpf: Introduce bpfptr_t user/kernel pointer.
  2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 01/15] bpf: Introduce bpf_sys_bpf() helper and program type Alexei Starovoitov
@ 2021-04-17  3:32 ` Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 03/15] bpf: Prepare bpf syscall to be used from kernel and user space Alexei Starovoitov
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Similar to sockptr_t, introduce bpfptr_t with a few additions:
make_bpfptr() creates a new user/kernel pointer in the same address space as
an existing user/kernel pointer.
bpfptr_add() advances the user/kernel pointer.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpfptr.h | 81 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 81 insertions(+)
 create mode 100644 include/linux/bpfptr.h
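
For readers unfamiliar with sockptr_t, the tagged-pointer idea can be modelled
in user space roughly as follows. This is a sketch only: ptr_t and its helpers
are hypothetical names, both address spaces are ordinary process memory here,
and the "user" path is a memcpy() with a NULL check rather than a real
copy_from_user().

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* An address plus a tag that says which address space it belongs to. */
typedef struct {
	void *addr;
	bool is_kernel;
} ptr_t;

/* Build a pointer from a raw u64, inheriting the tag from the caller. */
static ptr_t make_ptr(uint64_t addr, bool is_kernel)
{
	ptr_t p = { .addr = (void *)(uintptr_t)addr, .is_kernel = is_kernel };

	return p;
}

static bool ptr_is_null(ptr_t p)
{
	return p.addr == NULL;
}

/* Advance the pointer by val bytes without touching the tag. */
static void ptr_add(ptr_t *p, size_t val)
{
	p->addr = (char *)p->addr + val;
}

/* Stand-in for copy_from_user(): fails on NULL instead of faulting. */
static int fake_copy_from_user(void *dst, const void *src, size_t size)
{
	if (!src)
		return -EFAULT;
	memcpy(dst, src, size);
	return 0;
}

/* Dispatch on the tag: plain memcpy for kernel memory, checked copy otherwise. */
static int copy_from_ptr(void *dst, ptr_t src, size_t size)
{
	if (src.is_kernel) {
		memcpy(dst, src.addr, size);
		return 0;
	}
	return fake_copy_from_user(dst, src.addr, size);
}
```

The point of make_ptr() taking the tag as an argument is that addresses
embedded in an attribute structure can inherit the address space of the
structure that contains them.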

diff --git a/include/linux/bpfptr.h b/include/linux/bpfptr.h
new file mode 100644
index 000000000000..e370acb04977
--- /dev/null
+++ b/include/linux/bpfptr.h
@@ -0,0 +1,81 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* A pointer that can point to either kernel or userspace memory. */
+#ifndef _LINUX_BPFPTR_H
+#define _LINUX_BPFPTR_H
+
+#include <linux/sockptr.h>
+
+typedef sockptr_t bpfptr_t;
+
+static inline bool bpfptr_is_kernel(bpfptr_t bpfptr)
+{
+	return bpfptr.is_kernel;
+}
+
+static inline bpfptr_t KERNEL_BPFPTR(void *p)
+{
+	return (bpfptr_t) { .kernel = p, .is_kernel = true };
+}
+
+static inline bpfptr_t USER_BPFPTR(void __user *p)
+{
+	return (bpfptr_t) { .user = p };
+}
+
+static inline bpfptr_t make_bpfptr(u64 addr, bool is_kernel)
+{
+	if (is_kernel)
+		return (bpfptr_t) {
+			.kernel = (void*) (uintptr_t) addr,
+			.is_kernel = true,
+		};
+	else
+		return (bpfptr_t) {
+			.user = u64_to_user_ptr(addr),
+			.is_kernel = false,
+		};
+}
+
+static inline bool bpfptr_is_null(bpfptr_t bpfptr)
+{
+	if (bpfptr_is_kernel(bpfptr))
+		return !bpfptr.kernel;
+	return !bpfptr.user;
+}
+
+static inline void bpfptr_add(bpfptr_t *bpfptr, size_t val)
+{
+	if (bpfptr_is_kernel(*bpfptr))
+		bpfptr->kernel += val;
+	else
+		bpfptr->user += val;
+}
+
+static inline int copy_from_bpfptr_offset(void *dst, bpfptr_t src,
+					  size_t offset, size_t size)
+{
+	return copy_from_sockptr_offset(dst, (sockptr_t) src, offset, size);
+}
+
+static inline int copy_from_bpfptr(void *dst, bpfptr_t src, size_t size)
+{
+	return copy_from_bpfptr_offset(dst, src, 0, size);
+}
+
+static inline int copy_to_bpfptr_offset(bpfptr_t dst, size_t offset,
+					const void *src, size_t size)
+{
+	return copy_to_sockptr_offset((sockptr_t) dst, offset, src, size);
+}
+
+static inline void *memdup_bpfptr(bpfptr_t src, size_t len)
+{
+	return memdup_sockptr((sockptr_t) src, len);
+}
+
+static inline long strncpy_from_bpfptr(char *dst, bpfptr_t src, size_t count)
+{
+	return strncpy_from_sockptr(dst, (sockptr_t) src, count);
+}
+
+#endif /* _LINUX_BPFPTR_H */
-- 
2.30.2



* [PATCH bpf-next 03/15] bpf: Prepare bpf syscall to be used from kernel and user space.
  2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 01/15] bpf: Introduce bpf_sys_bpf() helper and program type Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 02/15] bpf: Introduce bpfptr_t user/kernel pointer Alexei Starovoitov
@ 2021-04-17  3:32 ` Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 04/15] libbpf: Support for syscall program type Alexei Starovoitov
                   ` (11 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

With the help of bpfptr_t, prepare the relevant bpf syscall commands
to be used from kernel and user space.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf.h   |   8 +--
 kernel/bpf/bpf_iter.c |  13 ++---
 kernel/bpf/syscall.c  | 110 +++++++++++++++++++++++++++---------------
 kernel/bpf/verifier.c |  34 +++++++------
 net/bpf/test_run.c    |   2 +-
 5 files changed, 101 insertions(+), 66 deletions(-)
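
The resulting call structure (one shared implementation, a thin syscall entry
point that tags the attribute pointer as user memory, and a helper entry point
that tags it as kernel memory after checking an allow-list) can be sketched in
user space as below. All names are hypothetical and the command numbers are
not the real enum bpf_cmd values; the sketch only records which path was taken.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical command numbers for the sketch, not the real enum bpf_cmd. */
enum cmd {
	CMD_MAP_CREATE,
	CMD_MAP_UPDATE_ELEM,
	CMD_MAP_FREEZE,
	CMD_PROG_LOAD,
	CMD_PROG_QUERY,
};

/* Records the last call that reached the shared implementation. */
struct call_info {
	enum cmd cmd;
	bool from_kernel;
};

/* Shared implementation: knows where attr lives only through the flag. */
static int do_cmd(enum cmd cmd, const void *attr, bool attr_is_kernel,
		  struct call_info *last)
{
	if (!attr)
		return -EFAULT;
	last->cmd = cmd;
	last->from_kernel = attr_is_kernel;
	return 0;
}

/* Syscall path: every command is forwarded, attr is tagged as user memory. */
static int syscall_entry(enum cmd cmd, const void *uattr,
			 struct call_info *last)
{
	return do_cmd(cmd, uattr, false, last);
}

/* Helper path: only allow-listed commands, attr is tagged as kernel memory. */
static int helper_entry(enum cmd cmd, const void *attr,
			struct call_info *last)
{
	switch (cmd) {
	case CMD_MAP_CREATE:
	case CMD_MAP_UPDATE_ELEM:
	case CMD_MAP_FREEZE:
	case CMD_PROG_LOAD:
		break;
	default:
		return -EINVAL;
	}
	return do_cmd(cmd, attr, true, last);
}
```

Keeping the allow-list in the helper entry point means commands whose
implementation still expects a plain user pointer cannot be reached with a
kernel pointer.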

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f2e77ac3d5eb..1aede7ceea5e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -22,6 +22,7 @@
 #include <linux/sched/mm.h>
 #include <linux/slab.h>
 #include <linux/percpu-refcount.h>
+#include <linux/bpfptr.h>
 
 struct bpf_verifier_env;
 struct bpf_verifier_log;
@@ -1425,7 +1426,7 @@ struct bpf_iter__bpf_map_elem {
 int bpf_iter_reg_target(const struct bpf_iter_reg *reg_info);
 void bpf_iter_unreg_target(const struct bpf_iter_reg *reg_info);
 bool bpf_iter_prog_supported(struct bpf_prog *prog);
-int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog);
+int bpf_iter_link_attach(const union bpf_attr *attr, bpfptr_t uattr, struct bpf_prog *prog);
 int bpf_iter_new_fd(struct bpf_link *link);
 bool bpf_link_is_iter(struct bpf_link *link);
 struct bpf_prog *bpf_iter_get_info(struct bpf_iter_meta *meta, bool in_stop);
@@ -1456,7 +1457,7 @@ int bpf_fd_htab_map_update_elem(struct bpf_map *map, struct file *map_file,
 int bpf_fd_htab_map_lookup_elem(struct bpf_map *map, void *key, u32 *value);
 
 int bpf_get_file_flag(int flags);
-int bpf_check_uarg_tail_zero(void __user *uaddr, size_t expected_size,
+int bpf_check_uarg_tail_zero(bpfptr_t uaddr, size_t expected_size,
 			     size_t actual_size);
 
 /* memcpy that is used with 8-byte aligned pointers, power-of-8 size and
@@ -1476,8 +1477,7 @@ static inline void bpf_long_memcpy(void *dst, const void *src, u32 size)
 }
 
 /* verify correctness of eBPF program */
-int bpf_check(struct bpf_prog **fp, union bpf_attr *attr,
-	      union bpf_attr __user *uattr);
+int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, bpfptr_t uattr);
 
 #ifndef CONFIG_BPF_JIT_ALWAYS_ON
 void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth);
diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c
index 931870f9cf56..2d4fbdbb194e 100644
--- a/kernel/bpf/bpf_iter.c
+++ b/kernel/bpf/bpf_iter.c
@@ -473,15 +473,16 @@ bool bpf_link_is_iter(struct bpf_link *link)
 	return link->ops == &bpf_iter_link_lops;
 }
 
-int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
+int bpf_iter_link_attach(const union bpf_attr *attr, bpfptr_t uattr,
+			 struct bpf_prog *prog)
 {
-	union bpf_iter_link_info __user *ulinfo;
 	struct bpf_link_primer link_primer;
 	struct bpf_iter_target_info *tinfo;
 	union bpf_iter_link_info linfo;
 	struct bpf_iter_link *link;
 	u32 prog_btf_id, linfo_len;
 	bool existed = false;
+	bpfptr_t ulinfo;
 	int err;
 
 	if (attr->link_create.target_fd || attr->link_create.flags)
@@ -489,18 +490,18 @@ int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
 
 	memset(&linfo, 0, sizeof(union bpf_iter_link_info));
 
-	ulinfo = u64_to_user_ptr(attr->link_create.iter_info);
+	ulinfo = make_bpfptr(attr->link_create.iter_info, uattr.is_kernel);
 	linfo_len = attr->link_create.iter_info_len;
-	if (!ulinfo ^ !linfo_len)
+	if (bpfptr_is_null(ulinfo) ^ !linfo_len)
 		return -EINVAL;
 
-	if (ulinfo) {
+	if (!bpfptr_is_null(ulinfo)) {
 		err = bpf_check_uarg_tail_zero(ulinfo, sizeof(linfo),
 					       linfo_len);
 		if (err)
 			return err;
 		linfo_len = min_t(u32, linfo_len, sizeof(linfo));
-		if (copy_from_user(&linfo, ulinfo, linfo_len))
+		if (copy_from_bpfptr(&linfo, ulinfo, linfo_len))
 			return -EFAULT;
 	}
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 0e4ece4d57e0..e918839b76fd 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -72,11 +72,10 @@ static const struct bpf_map_ops * const bpf_map_types[] = {
  * copy_from_user() call. However, this is not a concern since this function is
  * meant to be a future-proofing of bits.
  */
-int bpf_check_uarg_tail_zero(void __user *uaddr,
+int bpf_check_uarg_tail_zero(bpfptr_t uaddr,
 			     size_t expected_size,
 			     size_t actual_size)
 {
-	unsigned char __user *addr = uaddr + expected_size;
 	int res;
 
 	if (unlikely(actual_size > PAGE_SIZE))	/* silly large */
@@ -85,7 +84,12 @@ int bpf_check_uarg_tail_zero(void __user *uaddr,
 	if (actual_size <= expected_size)
 		return 0;
 
-	res = check_zeroed_user(addr, actual_size - expected_size);
+	if (uaddr.is_kernel)
+		res = memchr_inv(uaddr.kernel + expected_size, 0,
+				 actual_size - expected_size) == NULL;
+	else
+		res = check_zeroed_user(uaddr.user + expected_size,
+					actual_size - expected_size);
 	if (res < 0)
 		return res;
 	return res ? 0 : -E2BIG;
@@ -1004,6 +1008,17 @@ static void *__bpf_copy_key(void __user *ukey, u64 key_size)
 	return NULL;
 }
 
+static void *___bpf_copy_key(bpfptr_t ukey, u64 key_size)
+{
+	if (key_size)
+		return memdup_bpfptr(ukey, key_size);
+
+	if (!bpfptr_is_null(ukey))
+		return ERR_PTR(-EINVAL);
+
+	return NULL;
+}
+
 /* last field in 'union bpf_attr' used by this command */
 #define BPF_MAP_LOOKUP_ELEM_LAST_FIELD flags
 
@@ -1074,10 +1089,10 @@ static int map_lookup_elem(union bpf_attr *attr)
 
 #define BPF_MAP_UPDATE_ELEM_LAST_FIELD flags
 
-static int map_update_elem(union bpf_attr *attr)
+static int map_update_elem(union bpf_attr *attr, bpfptr_t uattr)
 {
-	void __user *ukey = u64_to_user_ptr(attr->key);
-	void __user *uvalue = u64_to_user_ptr(attr->value);
+	bpfptr_t ukey = make_bpfptr(attr->key, uattr.is_kernel);
+	bpfptr_t uvalue = make_bpfptr(attr->value, uattr.is_kernel);
 	int ufd = attr->map_fd;
 	struct bpf_map *map;
 	void *key, *value;
@@ -1103,7 +1118,7 @@ static int map_update_elem(union bpf_attr *attr)
 		goto err_put;
 	}
 
-	key = __bpf_copy_key(ukey, map->key_size);
+	key = ___bpf_copy_key(ukey, map->key_size);
 	if (IS_ERR(key)) {
 		err = PTR_ERR(key);
 		goto err_put;
@@ -1123,7 +1138,7 @@ static int map_update_elem(union bpf_attr *attr)
 		goto free_key;
 
 	err = -EFAULT;
-	if (copy_from_user(value, uvalue, value_size) != 0)
+	if (copy_from_bpfptr(value, uvalue, value_size) != 0)
 		goto free_value;
 
 	err = bpf_map_update_value(map, f, key, value, attr->flags);
@@ -2075,7 +2090,7 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type)
 /* last field in 'union bpf_attr' used by this command */
 #define	BPF_PROG_LOAD_LAST_FIELD attach_prog_fd
 
-static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
+static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr)
 {
 	enum bpf_prog_type type = attr->prog_type;
 	struct bpf_prog *prog, *dst_prog = NULL;
@@ -2100,8 +2115,9 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 		return -EPERM;
 
 	/* copy eBPF program license from user space */
-	if (strncpy_from_user(license, u64_to_user_ptr(attr->license),
-			      sizeof(license) - 1) < 0)
+	if (strncpy_from_bpfptr(license,
+				make_bpfptr(attr->license, uattr.is_kernel),
+				sizeof(license) - 1) < 0)
 		return -EFAULT;
 	license[sizeof(license) - 1] = 0;
 
@@ -2185,8 +2201,9 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 	prog->len = attr->insn_cnt;
 
 	err = -EFAULT;
-	if (copy_from_user(prog->insns, u64_to_user_ptr(attr->insns),
-			   bpf_prog_insn_size(prog)) != 0)
+	if (copy_from_bpfptr(prog->insns,
+			     make_bpfptr(attr->insns, uattr.is_kernel),
+			     bpf_prog_insn_size(prog)) != 0)
 		goto free_prog_sec;
 
 	prog->orig_prog = NULL;
@@ -3411,7 +3428,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
 	u32 ulen;
 	int err;
 
-	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
+	err = bpf_check_uarg_tail_zero(USER_BPFPTR(uinfo), sizeof(info), info_len);
 	if (err)
 		return err;
 	info_len = min_t(u32, sizeof(info), info_len);
@@ -3690,7 +3707,7 @@ static int bpf_map_get_info_by_fd(struct file *file,
 	u32 info_len = attr->info.info_len;
 	int err;
 
-	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
+	err = bpf_check_uarg_tail_zero(USER_BPFPTR(uinfo), sizeof(info), info_len);
 	if (err)
 		return err;
 	info_len = min_t(u32, sizeof(info), info_len);
@@ -3733,7 +3750,7 @@ static int bpf_btf_get_info_by_fd(struct file *file,
 	u32 info_len = attr->info.info_len;
 	int err;
 
-	err = bpf_check_uarg_tail_zero(uinfo, sizeof(*uinfo), info_len);
+	err = bpf_check_uarg_tail_zero(USER_BPFPTR(uinfo), sizeof(*uinfo), info_len);
 	if (err)
 		return err;
 
@@ -3750,7 +3767,7 @@ static int bpf_link_get_info_by_fd(struct file *file,
 	u32 info_len = attr->info.info_len;
 	int err;
 
-	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
+	err = bpf_check_uarg_tail_zero(USER_BPFPTR(uinfo), sizeof(info), info_len);
 	if (err)
 		return err;
 	info_len = min_t(u32, sizeof(info), info_len);
@@ -4011,13 +4028,14 @@ static int bpf_map_do_batch(const union bpf_attr *attr,
 	return err;
 }
 
-static int tracing_bpf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
+static int tracing_bpf_link_attach(const union bpf_attr *attr, bpfptr_t uattr,
+				   struct bpf_prog *prog)
 {
 	if (attr->link_create.attach_type != prog->expected_attach_type)
 		return -EINVAL;
 
 	if (prog->expected_attach_type == BPF_TRACE_ITER)
-		return bpf_iter_link_attach(attr, prog);
+		return bpf_iter_link_attach(attr, uattr, prog);
 	else if (prog->type == BPF_PROG_TYPE_EXT)
 		return bpf_tracing_prog_attach(prog,
 					       attr->link_create.target_fd,
@@ -4026,7 +4044,7 @@ static int tracing_bpf_link_attach(const union bpf_attr *attr, struct bpf_prog *
 }
 
 #define BPF_LINK_CREATE_LAST_FIELD link_create.iter_info_len
-static int link_create(union bpf_attr *attr)
+static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 {
 	enum bpf_prog_type ptype;
 	struct bpf_prog *prog;
@@ -4045,7 +4063,7 @@ static int link_create(union bpf_attr *attr)
 		goto out;
 
 	if (prog->type == BPF_PROG_TYPE_EXT) {
-		ret = tracing_bpf_link_attach(attr, prog);
+		ret = tracing_bpf_link_attach(attr, uattr, prog);
 		goto out;
 	}
 
@@ -4066,7 +4084,7 @@ static int link_create(union bpf_attr *attr)
 		ret = cgroup_bpf_link_attach(attr, prog);
 		break;
 	case BPF_PROG_TYPE_TRACING:
-		ret = tracing_bpf_link_attach(attr, prog);
+		ret = tracing_bpf_link_attach(attr, uattr, prog);
 		break;
 	case BPF_PROG_TYPE_FLOW_DISSECTOR:
 	case BPF_PROG_TYPE_SK_LOOKUP:
@@ -4354,7 +4372,7 @@ static int bpf_prog_bind_map(union bpf_attr *attr)
 	return ret;
 }
 
-SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
+static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size)
 {
 	union bpf_attr attr;
 	int err;
@@ -4369,7 +4387,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 
 	/* copy attributes from user space, may be less than sizeof(bpf_attr) */
 	memset(&attr, 0, sizeof(attr));
-	if (copy_from_user(&attr, uattr, size) != 0)
+	if (copy_from_bpfptr(&attr, uattr, size) != 0)
 		return -EFAULT;
 
 	err = security_bpf(cmd, &attr, size);
@@ -4384,7 +4402,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 		err = map_lookup_elem(&attr);
 		break;
 	case BPF_MAP_UPDATE_ELEM:
-		err = map_update_elem(&attr);
+		err = map_update_elem(&attr, uattr);
 		break;
 	case BPF_MAP_DELETE_ELEM:
 		err = map_delete_elem(&attr);
@@ -4411,21 +4429,21 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 		err = bpf_prog_detach(&attr);
 		break;
 	case BPF_PROG_QUERY:
-		err = bpf_prog_query(&attr, uattr);
+		err = bpf_prog_query(&attr, uattr.user);
 		break;
 	case BPF_PROG_TEST_RUN:
-		err = bpf_prog_test_run(&attr, uattr);
+		err = bpf_prog_test_run(&attr, uattr.user);
 		break;
 	case BPF_PROG_GET_NEXT_ID:
-		err = bpf_obj_get_next_id(&attr, uattr,
+		err = bpf_obj_get_next_id(&attr, uattr.user,
 					  &prog_idr, &prog_idr_lock);
 		break;
 	case BPF_MAP_GET_NEXT_ID:
-		err = bpf_obj_get_next_id(&attr, uattr,
+		err = bpf_obj_get_next_id(&attr, uattr.user,
 					  &map_idr, &map_idr_lock);
 		break;
 	case BPF_BTF_GET_NEXT_ID:
-		err = bpf_obj_get_next_id(&attr, uattr,
+		err = bpf_obj_get_next_id(&attr, uattr.user,
 					  &btf_idr, &btf_idr_lock);
 		break;
 	case BPF_PROG_GET_FD_BY_ID:
@@ -4435,7 +4453,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 		err = bpf_map_get_fd_by_id(&attr);
 		break;
 	case BPF_OBJ_GET_INFO_BY_FD:
-		err = bpf_obj_get_info_by_fd(&attr, uattr);
+		err = bpf_obj_get_info_by_fd(&attr, uattr.user);
 		break;
 	case BPF_RAW_TRACEPOINT_OPEN:
 		err = bpf_raw_tracepoint_open(&attr);
@@ -4447,26 +4465,26 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 		err = bpf_btf_get_fd_by_id(&attr);
 		break;
 	case BPF_TASK_FD_QUERY:
-		err = bpf_task_fd_query(&attr, uattr);
+		err = bpf_task_fd_query(&attr, uattr.user);
 		break;
 	case BPF_MAP_LOOKUP_AND_DELETE_ELEM:
 		err = map_lookup_and_delete_elem(&attr);
 		break;
 	case BPF_MAP_LOOKUP_BATCH:
-		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_LOOKUP_BATCH);
+		err = bpf_map_do_batch(&attr, uattr.user, BPF_MAP_LOOKUP_BATCH);
 		break;
 	case BPF_MAP_LOOKUP_AND_DELETE_BATCH:
-		err = bpf_map_do_batch(&attr, uattr,
+		err = bpf_map_do_batch(&attr, uattr.user,
 				       BPF_MAP_LOOKUP_AND_DELETE_BATCH);
 		break;
 	case BPF_MAP_UPDATE_BATCH:
-		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_UPDATE_BATCH);
+		err = bpf_map_do_batch(&attr, uattr.user, BPF_MAP_UPDATE_BATCH);
 		break;
 	case BPF_MAP_DELETE_BATCH:
-		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_DELETE_BATCH);
+		err = bpf_map_do_batch(&attr, uattr.user, BPF_MAP_DELETE_BATCH);
 		break;
 	case BPF_LINK_CREATE:
-		err = link_create(&attr);
+		err = link_create(&attr, uattr);
 		break;
 	case BPF_LINK_UPDATE:
 		err = link_update(&attr);
@@ -4475,7 +4493,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 		err = bpf_link_get_fd_by_id(&attr);
 		break;
 	case BPF_LINK_GET_NEXT_ID:
-		err = bpf_obj_get_next_id(&attr, uattr,
+		err = bpf_obj_get_next_id(&attr, uattr.user,
 					  &link_idr, &link_idr_lock);
 		break;
 	case BPF_ENABLE_STATS:
@@ -4498,6 +4516,11 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	return err;
 }
 
+SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
+{
+	return __sys_bpf(cmd, USER_BPFPTR(uattr), size);
+}
+
 static bool syscall_prog_is_valid_access(int off, int size,
 					 enum bpf_access_type type,
 					 const struct bpf_prog *prog,
@@ -4512,7 +4535,16 @@ static bool syscall_prog_is_valid_access(int off, int size,
 
 BPF_CALL_3(bpf_sys_bpf, int, cmd, void *, attr, u32, attr_size)
 {
-	return -EINVAL;
+	switch (cmd) {
+	case BPF_MAP_CREATE:
+	case BPF_MAP_UPDATE_ELEM:
+	case BPF_MAP_FREEZE:
+	case BPF_PROG_LOAD:
+		break;
+	default:
+		return -EINVAL;
+	}
+	return __sys_bpf(cmd, KERNEL_BPFPTR(attr), attr_size);
 }
 
 const struct bpf_func_proto bpf_sys_bpf_proto = {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 852541a435ef..7028fd0f4481 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9268,7 +9268,7 @@ static int check_abnormal_return(struct bpf_verifier_env *env)
 
 static int check_btf_func(struct bpf_verifier_env *env,
 			  const union bpf_attr *attr,
-			  union bpf_attr __user *uattr)
+			  bpfptr_t uattr)
 {
 	const struct btf_type *type, *func_proto, *ret_type;
 	u32 i, nfuncs, urec_size, min_size;
@@ -9277,7 +9277,7 @@ static int check_btf_func(struct bpf_verifier_env *env,
 	struct bpf_func_info_aux *info_aux = NULL;
 	struct bpf_prog *prog;
 	const struct btf *btf;
-	void __user *urecord;
+	bpfptr_t urecord;
 	u32 prev_offset = 0;
 	bool scalar_return;
 	int ret = -ENOMEM;
@@ -9305,7 +9305,7 @@ static int check_btf_func(struct bpf_verifier_env *env,
 	prog = env->prog;
 	btf = prog->aux->btf;
 
-	urecord = u64_to_user_ptr(attr->func_info);
+	urecord = make_bpfptr(attr->func_info, uattr.is_kernel);
 	min_size = min_t(u32, krec_size, urec_size);
 
 	krecord = kvcalloc(nfuncs, krec_size, GFP_KERNEL | __GFP_NOWARN);
@@ -9323,13 +9323,15 @@ static int check_btf_func(struct bpf_verifier_env *env,
 				/* set the size kernel expects so loader can zero
 				 * out the rest of the record.
 				 */
-				if (put_user(min_size, &uattr->func_info_rec_size))
+				if (copy_to_bpfptr_offset(uattr,
+							  offsetof(union bpf_attr, func_info_rec_size),
+							  &min_size, sizeof(min_size)))
 					ret = -EFAULT;
 			}
 			goto err_free;
 		}
 
-		if (copy_from_user(&krecord[i], urecord, min_size)) {
+		if (copy_from_bpfptr(&krecord[i], urecord, min_size)) {
 			ret = -EFAULT;
 			goto err_free;
 		}
@@ -9381,7 +9383,7 @@ static int check_btf_func(struct bpf_verifier_env *env,
 		}
 
 		prev_offset = krecord[i].insn_off;
-		urecord += urec_size;
+		bpfptr_add(&urecord, urec_size);
 	}
 
 	prog->aux->func_info = krecord;
@@ -9413,14 +9415,14 @@ static void adjust_btf_func(struct bpf_verifier_env *env)
 
 static int check_btf_line(struct bpf_verifier_env *env,
 			  const union bpf_attr *attr,
-			  union bpf_attr __user *uattr)
+			  bpfptr_t uattr)
 {
 	u32 i, s, nr_linfo, ncopy, expected_size, rec_size, prev_offset = 0;
 	struct bpf_subprog_info *sub;
 	struct bpf_line_info *linfo;
 	struct bpf_prog *prog;
 	const struct btf *btf;
-	void __user *ulinfo;
+	bpfptr_t ulinfo;
 	int err;
 
 	nr_linfo = attr->line_info_cnt;
@@ -9446,7 +9448,7 @@ static int check_btf_line(struct bpf_verifier_env *env,
 
 	s = 0;
 	sub = env->subprog_info;
-	ulinfo = u64_to_user_ptr(attr->line_info);
+	ulinfo = make_bpfptr(attr->line_info, uattr.is_kernel);
 	expected_size = sizeof(struct bpf_line_info);
 	ncopy = min_t(u32, expected_size, rec_size);
 	for (i = 0; i < nr_linfo; i++) {
@@ -9454,14 +9456,15 @@ static int check_btf_line(struct bpf_verifier_env *env,
 		if (err) {
 			if (err == -E2BIG) {
 				verbose(env, "nonzero tailing record in line_info");
-				if (put_user(expected_size,
-					     &uattr->line_info_rec_size))
+				if (copy_to_bpfptr_offset(uattr,
+							  offsetof(union bpf_attr, line_info_rec_size),
+							  &expected_size, sizeof(expected_size)))
 					err = -EFAULT;
 			}
 			goto err_free;
 		}
 
-		if (copy_from_user(&linfo[i], ulinfo, ncopy)) {
+		if (copy_from_bpfptr(&linfo[i], ulinfo, ncopy)) {
 			err = -EFAULT;
 			goto err_free;
 		}
@@ -9513,7 +9516,7 @@ static int check_btf_line(struct bpf_verifier_env *env,
 		}
 
 		prev_offset = linfo[i].insn_off;
-		ulinfo += rec_size;
+		bpfptr_add(&ulinfo, rec_size);
 	}
 
 	if (s != env->subprog_cnt) {
@@ -9535,7 +9538,7 @@ static int check_btf_line(struct bpf_verifier_env *env,
 
 static int check_btf_info(struct bpf_verifier_env *env,
 			  const union bpf_attr *attr,
-			  union bpf_attr __user *uattr)
+			  bpfptr_t uattr)
 {
 	struct btf *btf;
 	int err;
@@ -13109,8 +13112,7 @@ struct btf *bpf_get_btf_vmlinux(void)
 	return btf_vmlinux;
 }
 
-int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
-	      union bpf_attr __user *uattr)
+int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
 {
 	u64 start_time = ktime_get_ns();
 	struct bpf_verifier_env *env;
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 1783ea77b95c..85f41fb8d5bf 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -409,7 +409,7 @@ static void *bpf_ctx_init(const union bpf_attr *kattr, u32 max_size)
 		return ERR_PTR(-ENOMEM);
 
 	if (data_in) {
-		err = bpf_check_uarg_tail_zero(data_in, max_size, size);
+		err = bpf_check_uarg_tail_zero(USER_BPFPTR(data_in), max_size, size);
 		if (err) {
 			kfree(data);
 			return ERR_PTR(err);
-- 
2.30.2



* [PATCH bpf-next 04/15] libbpf: Support for syscall program type
  2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (2 preceding siblings ...)
  2021-04-17  3:32 ` [PATCH bpf-next 03/15] bpf: Prepare bpf syscall to be used from kernel and user space Alexei Starovoitov
@ 2021-04-17  3:32 ` Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 05/15] selftests/bpf: Test " Alexei Starovoitov
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Trivial support for syscall program type.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/lib/bpf/libbpf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 9cc2d45b0080..254a0c9aa6cf 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -8899,6 +8899,7 @@ static const struct bpf_sec_def section_defs[] = {
 	BPF_PROG_SEC("struct_ops",		BPF_PROG_TYPE_STRUCT_OPS),
 	BPF_EAPROG_SEC("sk_lookup/",		BPF_PROG_TYPE_SK_LOOKUP,
 						BPF_SK_LOOKUP),
+	BPF_PROG_SEC("syscall",			BPF_PROG_TYPE_SYSCALL),
 };
 
 #undef BPF_PROG_SEC_IMPL
-- 
2.30.2



* [PATCH bpf-next 05/15] selftests/bpf: Test for syscall program type
  2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (3 preceding siblings ...)
  2021-04-17  3:32 ` [PATCH bpf-next 04/15] libbpf: Support for syscall program type Alexei Starovoitov
@ 2021-04-17  3:32 ` Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 06/15] bpf: Make btf_load command to be bpfptr_t compatible Alexei Starovoitov
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

The test program of type bpf_prog_type_syscall creates a bpf map,
updates it, and loads another bpf program using the bpf_sys_bpf() helper.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/Makefile          |  1 +
 .../selftests/bpf/prog_tests/syscall.c        | 53 ++++++++++++++
 tools/testing/selftests/bpf/progs/syscall.c   | 73 +++++++++++++++++++
 3 files changed, 127 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/syscall.c
 create mode 100644 tools/testing/selftests/bpf/progs/syscall.c

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index c45ae13b88a0..5e618ff1e8fd 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -277,6 +277,7 @@ MENDIAN=$(if $(IS_LITTLE_ENDIAN),-mlittle-endian,-mbig-endian)
 CLANG_SYS_INCLUDES = $(call get_sys_includes,$(CLANG))
 BPF_CFLAGS = -g -D__TARGET_ARCH_$(SRCARCH) $(MENDIAN) 			\
 	     -I$(INCLUDE_DIR) -I$(CURDIR) -I$(APIDIR)			\
+	     -I$(TOOLSINCDIR) \
 	     -I$(abspath $(OUTPUT)/../usr/include)
 
 CLANG_CFLAGS = $(CLANG_SYS_INCLUDES) \
diff --git a/tools/testing/selftests/bpf/prog_tests/syscall.c b/tools/testing/selftests/bpf/prog_tests/syscall.c
new file mode 100644
index 000000000000..e550e36bb5da
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/syscall.c
@@ -0,0 +1,53 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Facebook */
+#include <test_progs.h>
+#include "syscall.skel.h"
+
+struct args {
+	__u64 log_buf;
+	__u32 log_size;
+	int max_entries;
+	int map_fd;
+	int prog_fd;
+};
+
+void test_syscall(void)
+{
+	static char verifier_log[8192];
+	struct args ctx = {
+		.max_entries = 1024,
+		.log_buf = (uintptr_t) verifier_log,
+		.log_size = sizeof(verifier_log),
+	};
+	struct bpf_prog_test_run_attr tattr = {
+		.ctx_in = &ctx,
+		.ctx_size_in = sizeof(ctx),
+	};
+	struct syscall *skel = NULL;
+	__u64 key = 12, value = 0;
+	__u32 duration = 0;
+	int err;
+
+	skel = syscall__open_and_load();
+	if (CHECK(!skel, "skel_load", "syscall skeleton failed\n"))
+		goto cleanup;
+
+	tattr.prog_fd = bpf_program__fd(skel->progs.bpf_prog);
+	err = bpf_prog_test_run_xattr(&tattr);
+	if (CHECK(err || tattr.retval != 1, "test_run sys_bpf",
+		  "err %d errno %d retval %d duration %d\n",
+		  err, errno, tattr.retval, tattr.duration))
+		goto cleanup;
+
+	CHECK(ctx.map_fd <= 0, "map_fd", "fd = %d\n", ctx.map_fd);
+	CHECK(ctx.prog_fd <= 0, "prog_fd", "fd = %d\n", ctx.prog_fd);
+	CHECK(memcmp(verifier_log, "processed", sizeof("processed") - 1) != 0,
+	      "verifier_log", "%s\n", verifier_log);
+
+	err = bpf_map_lookup_elem(ctx.map_fd, &key, &value);
+	CHECK(err, "map_lookup", "map_lookup failed\n");
+	CHECK(value != 34, "invalid_value",
+	      "got value %llu expected %u\n", value, 34);
+cleanup:
+	syscall__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/syscall.c b/tools/testing/selftests/bpf/progs/syscall.c
new file mode 100644
index 000000000000..01476f88e45f
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/syscall.c
@@ -0,0 +1,73 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Facebook */
+#include <linux/stddef.h>
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <../../tools/include/linux/filter.h>
+
+volatile const int workaround = 1;
+
+char _license[] SEC("license") = "GPL";
+
+struct args {
+	__u64 log_buf;
+	__u32 log_size;
+	int max_entries;
+	int map_fd;
+	int prog_fd;
+};
+
+SEC("syscall")
+int bpf_prog(struct args *ctx)
+{
+	static char license[] = "GPL";
+	static struct bpf_insn insns[] = {
+		BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+		BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+		BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+		BPF_LD_MAP_FD(BPF_REG_1, 0),
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	static union bpf_attr map_create_attr = {
+		.map_type = BPF_MAP_TYPE_HASH,
+		.key_size = 8,
+		.value_size = 8,
+	};
+	static union bpf_attr map_update_attr = { .map_fd = 1, };
+	static __u64 key = 12;
+	static __u64 value = 34;
+	static union bpf_attr prog_load_attr = {
+		.prog_type = BPF_PROG_TYPE_XDP,
+		.insn_cnt = sizeof(insns) / sizeof(insns[0]),
+	};
+	int ret;
+
+	map_create_attr.max_entries = ctx->max_entries;
+	prog_load_attr.license = (long) license;
+	prog_load_attr.insns = (long) insns;
+	prog_load_attr.log_buf = ctx->log_buf;
+	prog_load_attr.log_size = ctx->log_size;
+	prog_load_attr.log_level = 1;
+
+	ret = bpf_sys_bpf(BPF_MAP_CREATE, &map_create_attr, sizeof(map_create_attr));
+	if (ret <= 0)
+		return ret;
+	ctx->map_fd = ret;
+	insns[3].imm = ret;
+
+	map_update_attr.map_fd = ret;
+	map_update_attr.key = (long) &key;
+	map_update_attr.value = (long) &value;
+	ret = bpf_sys_bpf(BPF_MAP_UPDATE_ELEM, &map_update_attr, sizeof(map_update_attr));
+	if (ret < 0)
+		return ret;
+
+	ret = bpf_sys_bpf(BPF_PROG_LOAD, &prog_load_attr, sizeof(prog_load_attr));
+	if (ret <= 0)
+		return ret;
+	ctx->prog_fd = ret;
+	return 1;
+}
-- 
2.30.2



* [PATCH bpf-next 06/15] bpf: Make btf_load command to be bpfptr_t compatible.
  2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (4 preceding siblings ...)
  2021-04-17  3:32 ` [PATCH bpf-next 05/15] selftests/bpf: Test " Alexei Starovoitov
@ 2021-04-17  3:32 ` Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 07/15] selftests/bpf: Test for btf_load command Alexei Starovoitov
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Similar to prog_load, make the btf_load command available to
bpf_prog_type_syscall programs.
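
The idea behind passing a bpfptr_t through btf_new_fd() and btf_parse() can be
illustrated with a small user-space model. This is only a sketch: the type and
the model_* helpers below are hypothetical stand-ins written for illustration
and are not the kernel definitions. The point is that one tagged pointer lets
the same copy routine serve both callers, a user-space syscall and a bpf
program calling bpf_sys_bpf().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Simplified model of a tagged pointer; NOT the kernel's bpfptr_t. */
typedef struct {
	const void *ptr;
	bool is_kernel;	/* true when the request came from a bpf program */
} model_bpfptr_t;

static model_bpfptr_t model_make_bpfptr(const void *addr, bool is_kernel)
{
	model_bpfptr_t p = { .ptr = addr, .is_kernel = is_kernel };

	return p;
}

/* Stand-in for copy_from_user(): here a "fault" is just a NULL source. */
static int model_copy_from_user(void *dst, const void *src, size_t size)
{
	if (!src)
		return -1;
	memcpy(dst, src, size);
	return 0;
}

/* Returns 0 on success and nonzero on a fault. */
static int model_copy_from_bpfptr(void *dst, model_bpfptr_t src, size_t size)
{
	assert(dst != NULL);
	if (src.is_kernel) {
		/* Kernel memory is trusted in this model: plain copy. */
		assert(src.ptr != NULL);
		memcpy(dst, src.ptr, size);
		return 0;
	}
	/* User memory goes through the checked path. */
	return model_copy_from_user(dst, src.ptr, size);
}
```

In this model a caller that previously called the user-copy routine directly
only has to switch to model_copy_from_bpfptr() and forward the is_kernel flag,
which mirrors the mechanical conversion done in the diff below.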

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/btf.h  | 2 +-
 kernel/bpf/btf.c     | 8 ++++----
 kernel/bpf/syscall.c | 7 ++++---
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/linux/btf.h b/include/linux/btf.h
index 3bac66e0183a..94a0c976c90f 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -21,7 +21,7 @@ extern const struct file_operations btf_fops;
 
 void btf_get(struct btf *btf);
 void btf_put(struct btf *btf);
-int btf_new_fd(const union bpf_attr *attr);
+int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr);
 struct btf *btf_get_by_fd(int fd);
 int btf_get_info_by_fd(const struct btf *btf,
 		       const union bpf_attr *attr,
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 0600ed325fa0..fbf6c06a9d62 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -4257,7 +4257,7 @@ static int btf_parse_hdr(struct btf_verifier_env *env)
 	return 0;
 }
 
-static struct btf *btf_parse(void __user *btf_data, u32 btf_data_size,
+static struct btf *btf_parse(bpfptr_t btf_data, u32 btf_data_size,
 			     u32 log_level, char __user *log_ubuf, u32 log_size)
 {
 	struct btf_verifier_env *env = NULL;
@@ -4306,7 +4306,7 @@ static struct btf *btf_parse(void __user *btf_data, u32 btf_data_size,
 	btf->data = data;
 	btf->data_size = btf_data_size;
 
-	if (copy_from_user(data, btf_data, btf_data_size)) {
+	if (copy_from_bpfptr(data, btf_data, btf_data_size)) {
 		err = -EFAULT;
 		goto errout;
 	}
@@ -5780,12 +5780,12 @@ static int __btf_new_fd(struct btf *btf)
 	return anon_inode_getfd("btf", &btf_fops, btf, O_RDONLY | O_CLOEXEC);
 }
 
-int btf_new_fd(const union bpf_attr *attr)
+int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr)
 {
 	struct btf *btf;
 	int ret;
 
-	btf = btf_parse(u64_to_user_ptr(attr->btf),
+	btf = btf_parse(make_bpfptr(attr->btf, uattr.is_kernel),
 			attr->btf_size, attr->btf_log_level,
 			u64_to_user_ptr(attr->btf_log_buf),
 			attr->btf_log_size);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index e918839b76fd..f5e2511c504e 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3830,7 +3830,7 @@ static int bpf_obj_get_info_by_fd(const union bpf_attr *attr,
 
 #define BPF_BTF_LOAD_LAST_FIELD btf_log_level
 
-static int bpf_btf_load(const union bpf_attr *attr)
+static int bpf_btf_load(const union bpf_attr *attr, bpfptr_t uattr)
 {
 	if (CHECK_ATTR(BPF_BTF_LOAD))
 		return -EINVAL;
@@ -3838,7 +3838,7 @@ static int bpf_btf_load(const union bpf_attr *attr)
 	if (!bpf_capable())
 		return -EPERM;
 
-	return btf_new_fd(attr);
+	return btf_new_fd(attr, uattr);
 }
 
 #define BPF_BTF_GET_FD_BY_ID_LAST_FIELD btf_id
@@ -4459,7 +4459,7 @@ static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size)
 		err = bpf_raw_tracepoint_open(&attr);
 		break;
 	case BPF_BTF_LOAD:
-		err = bpf_btf_load(&attr);
+		err = bpf_btf_load(&attr, uattr);
 		break;
 	case BPF_BTF_GET_FD_BY_ID:
 		err = bpf_btf_get_fd_by_id(&attr);
@@ -4540,6 +4540,7 @@ BPF_CALL_3(bpf_sys_bpf, int, cmd, void *, attr, u32, attr_size)
 	case BPF_MAP_UPDATE_ELEM:
 	case BPF_MAP_FREEZE:
 	case BPF_PROG_LOAD:
+	case BPF_BTF_LOAD:
 		break;
 	default:
 		return -EINVAL;
-- 
2.30.2



* [PATCH bpf-next 07/15] selftests/bpf: Test for btf_load command.
  2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (5 preceding siblings ...)
  2021-04-17  3:32 ` [PATCH bpf-next 06/15] bpf: Make btf_load command to be bpfptr_t compatible Alexei Starovoitov
@ 2021-04-17  3:32 ` Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 08/15] bpf: Introduce fd_idx Alexei Starovoitov
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Extend the selftest to check that btf_load works from a bpf program.
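
As a worked example of the raw BTF words the test builds, the sketch below
re-creates the encoding macros in plain user-space C and fills in one INT
type. The constants (BTF_KIND_INT, BTF_INT_SIGNED, BTF_MAX_VLEN) are copied
here so the block stands alone; encode_int() is a hypothetical helper that
does not appear in the patch. Each INT type occupies four __u32 words, which
is why two types fit in types[8].

```c
#include <assert.h>
#include <stdint.h>

/* Constants copied from the UAPI BTF header so the sketch is standalone. */
#define BTF_KIND_INT	1
#define BTF_INT_SIGNED	(1 << 0)
#define BTF_MAX_VLEN	0xffff

#define BTF_INFO_ENC(kind, kind_flag, vlen) \
	(((uint32_t)!!(kind_flag) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN))
#define BTF_INT_ENC(encoding, bits_offset, nr_bits) \
	((encoding) << 24 | (bits_offset) << 16 | (nr_bits))

/* Hypothetical helper: write the four words describing one INT type. */
static void encode_int(uint32_t out[4], uint32_t name_off, uint32_t encoding,
		       uint32_t nr_bits, uint32_t size)
{
	assert(out != 0);
	out[0] = name_off;				/* offset into string section */
	out[1] = BTF_INFO_ENC(BTF_KIND_INT, 0, 0);	/* kind, kind_flag, vlen */
	out[2] = size;					/* size in bytes */
	out[3] = BTF_INT_ENC(encoding, 0, nr_bits);	/* INT-specific trailer */
}
```

For the signed 64-bit "long" in the test this gives an info word of
0x01000000 and a trailer of 0x01000040; the unsigned variant differs only in
the trailer, which becomes 0x00000040.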

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/progs/syscall.c | 48 +++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/syscall.c b/tools/testing/selftests/bpf/progs/syscall.c
index 01476f88e45f..b6ac10f75c37 100644
--- a/tools/testing/selftests/bpf/progs/syscall.c
+++ b/tools/testing/selftests/bpf/progs/syscall.c
@@ -4,6 +4,7 @@
 #include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
+#include <linux/btf.h>
 #include <../../tools/include/linux/filter.h>
 
 volatile const int workaround = 1;
@@ -18,6 +19,45 @@ struct args {
 	int prog_fd;
 };
 
+#define BTF_INFO_ENC(kind, kind_flag, vlen) \
+	((!!(kind_flag) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN))
+#define BTF_TYPE_ENC(name, info, size_or_type) (name), (info), (size_or_type)
+#define BTF_INT_ENC(encoding, bits_offset, nr_bits) \
+	((encoding) << 24 | (bits_offset) << 16 | (nr_bits))
+#define BTF_TYPE_INT_ENC(name, encoding, bits_offset, bits, sz) \
+	BTF_TYPE_ENC(name, BTF_INFO_ENC(BTF_KIND_INT, 0, 0), sz), \
+	BTF_INT_ENC(encoding, bits_offset, bits)
+
+static int btf_load(void)
+{
+	struct btf_blob {
+		struct btf_header btf_hdr;
+		__u32 types[8];
+		__u32 str;
+	} raw_btf = {
+		.btf_hdr = {
+			.magic = BTF_MAGIC,
+			.version = BTF_VERSION,
+			.hdr_len = sizeof(struct btf_header),
+			.type_len = sizeof(__u32) * 8,
+			.str_off = sizeof(__u32) * 8,
+			.str_len = sizeof(__u32),
+		},
+		.types = {
+			/* long */
+			BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 64, 8),  /* [1] */
+			/* unsigned long */
+			BTF_TYPE_INT_ENC(0, 0, 0, 64, 8),  /* [2] */
+		},
+	};
+	static union bpf_attr btf_load_attr = {
+		.btf_size = sizeof(raw_btf),
+	};
+
+	btf_load_attr.btf = (long)&raw_btf;
+	return bpf_sys_bpf(BPF_BTF_LOAD, &btf_load_attr, sizeof(btf_load_attr));
+}
+
 SEC("syscall")
 int bpf_prog(struct args *ctx)
 {
@@ -35,6 +75,8 @@ int bpf_prog(struct args *ctx)
 		.map_type = BPF_MAP_TYPE_HASH,
 		.key_size = 8,
 		.value_size = 8,
+		.btf_key_type_id = 1,
+		.btf_value_type_id = 2,
 	};
 	static union bpf_attr map_update_attr = { .map_fd = 1, };
 	static __u64 key = 12;
@@ -45,7 +87,13 @@ int bpf_prog(struct args *ctx)
 	};
 	int ret;
 
+	ret = btf_load();
+	if (ret < 0)
+		return ret;
+
 	map_create_attr.max_entries = ctx->max_entries;
+	map_create_attr.btf_fd = ret;
+
 	prog_load_attr.license = (long) license;
 	prog_load_attr.insns = (long) insns;
 	prog_load_attr.log_buf = ctx->log_buf;
-- 
2.30.2



* [PATCH bpf-next 08/15] bpf: Introduce fd_idx
  2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (6 preceding siblings ...)
  2021-04-17  3:32 ` [PATCH bpf-next 07/15] selftests/bpf: Test for btf_load command Alexei Starovoitov
@ 2021-04-17  3:32 ` Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 09/15] libbpf: Support for fd_idx Alexei Starovoitov
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

A typical program loading sequence involves creating bpf maps and patching
map FDs into bpf instructions in various places in the bpf program.
This job is done by libbpf, which uses compiler-generated ELF relocations
to patch certain instructions after maps are created and BTFs are loaded.
The goal of fd_idx is to allow bpf instructions to stay immutable
after compilation. At load time libbpf would still create maps as usual,
but it wouldn't need to patch instructions. It would store the map FDs in
__u32 fd_array[] and pass a pointer to that array to sys_bpf(BPF_PROG_LOAD).
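
The lookup described above can be modelled in user space roughly as follows.
This is a sketch under stated assumptions: struct model_insn and
model_resolve_map_fd() are hypothetical, the BPF_PSEUDO_* values are the ones
defined by this patch, and the explicit fd_cnt bound is an addition of the
model (the patch reads the array entry with copy_from_bpfptr_offset() and
reports -EFAULT if that copy fails).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define BPF_PSEUDO_MAP_FD		1
#define BPF_PSEUDO_MAP_VALUE		2
#define BPF_PSEUDO_MAP_IDX		5
#define BPF_PSEUDO_MAP_IDX_VALUE	6

/* Toy instruction: only the fields relevant to map resolution. */
struct model_insn {
	uint8_t src_reg;
	int32_t imm;
};

/* Returns the map FD the instruction refers to, or a negative errno. */
static int model_resolve_map_fd(const struct model_insn *insn,
				const int *fd_array, size_t fd_cnt)
{
	assert(insn != NULL);
	switch (insn->src_reg) {
	case BPF_PSEUDO_MAP_IDX:
	case BPF_PSEUDO_MAP_IDX_VALUE:
		/* imm is an index; it is meaningless without an array. */
		if (!fd_array)
			return -EPROTO;
		if (insn->imm < 0 || (size_t)insn->imm >= fd_cnt)
			return -EFAULT;	/* model-only bounds check */
		return fd_array[insn->imm];
	case BPF_PSEUDO_MAP_FD:
	case BPF_PSEUDO_MAP_VALUE:
		/* Legacy encoding: imm is the FD itself. */
		return insn->imm;
	default:
		return -EINVAL;
	}
}
```

The -EPROTO return for an index without an array matches the error the
verifier change below reports, and it is also what the libbpf feature probe
in the next patch keys on.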

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf_verifier.h   |  1 +
 include/uapi/linux/bpf.h       | 16 ++++++++----
 kernel/bpf/syscall.c           |  2 +-
 kernel/bpf/verifier.c          | 47 ++++++++++++++++++++++++++--------
 tools/include/uapi/linux/bpf.h | 16 ++++++++----
 5 files changed, 61 insertions(+), 21 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 6023a1367853..a5a3b4b3e804 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -441,6 +441,7 @@ struct bpf_verifier_env {
 	u32 peak_states;
 	/* longest register parentage chain walked for liveness marking */
 	u32 longest_mark_read_walk;
+	bpfptr_t fd_array;
 };
 
 __printf(2, 0) void bpf_verifier_vlog(struct bpf_verifier_log *log,
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index ce3e76ff08cd..068c23f278bc 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1098,8 +1098,8 @@ enum bpf_link_type {
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
  * the following extensions:
  *
- * insn[0].src_reg:  BPF_PSEUDO_MAP_FD
- * insn[0].imm:      map fd
+ * insn[0].src_reg:  BPF_PSEUDO_MAP_[FD|IDX]
+ * insn[0].imm:      map fd or fd_idx
  * insn[1].imm:      0
  * insn[0].off:      0
  * insn[1].off:      0
@@ -1107,15 +1107,19 @@ enum bpf_link_type {
  * verifier type:    CONST_PTR_TO_MAP
  */
 #define BPF_PSEUDO_MAP_FD	1
-/* insn[0].src_reg:  BPF_PSEUDO_MAP_VALUE
- * insn[0].imm:      map fd
+#define BPF_PSEUDO_MAP_IDX	5
+
+/* insn[0].src_reg:  BPF_PSEUDO_MAP_[IDX_]VALUE
+ * insn[0].imm:      map fd or fd_idx
  * insn[1].imm:      offset into value
  * insn[0].off:      0
  * insn[1].off:      0
  * ldimm64 rewrite:  address of map[0]+offset
  * verifier type:    PTR_TO_MAP_VALUE
  */
-#define BPF_PSEUDO_MAP_VALUE	2
+#define BPF_PSEUDO_MAP_VALUE		2
+#define BPF_PSEUDO_MAP_IDX_VALUE	6
+
 /* insn[0].src_reg:  BPF_PSEUDO_BTF_ID
  * insn[0].imm:      kernel btd id of VAR
  * insn[1].imm:      0
@@ -1315,6 +1319,8 @@ union bpf_attr {
 			/* or valid module BTF object fd or 0 to attach to vmlinux */
 			__u32		attach_btf_obj_fd;
 		};
+		__u32		:32;		/* pad */
+		__aligned_u64	fd_array;	/* array of FDs */
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index f5e2511c504e..7b51d56e420b 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2088,7 +2088,7 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type)
 }
 
 /* last field in 'union bpf_attr' used by this command */
-#define	BPF_PROG_LOAD_LAST_FIELD attach_prog_fd
+#define	BPF_PROG_LOAD_LAST_FIELD fd_array
 
 static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr)
 {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 7028fd0f4481..fa853b761f0d 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8747,12 +8747,14 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
 	mark_reg_known_zero(env, regs, insn->dst_reg);
 	dst_reg->map_ptr = map;
 
-	if (insn->src_reg == BPF_PSEUDO_MAP_VALUE) {
+	if (insn->src_reg == BPF_PSEUDO_MAP_VALUE ||
+	    insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE) {
 		dst_reg->type = PTR_TO_MAP_VALUE;
 		dst_reg->off = aux->map_off;
 		if (map_value_has_spin_lock(map))
 			dst_reg->id = ++env->id_gen;
-	} else if (insn->src_reg == BPF_PSEUDO_MAP_FD) {
+	} else if (insn->src_reg == BPF_PSEUDO_MAP_FD ||
+		   insn->src_reg == BPF_PSEUDO_MAP_IDX) {
 		dst_reg->type = CONST_PTR_TO_MAP;
 	} else {
 		verbose(env, "bpf verifier is misconfigured\n");
@@ -11021,6 +11023,7 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
 			struct bpf_map *map;
 			struct fd f;
 			u64 addr;
+			u32 fd;
 
 			if (i == insn_cnt - 1 || insn[1].code != 0 ||
 			    insn[1].dst_reg != 0 || insn[1].src_reg != 0 ||
@@ -11050,16 +11053,38 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
 			/* In final convert_pseudo_ld_imm64() step, this is
 			 * converted into regular 64-bit imm load insn.
 			 */
-			if ((insn[0].src_reg != BPF_PSEUDO_MAP_FD &&
-			     insn[0].src_reg != BPF_PSEUDO_MAP_VALUE) ||
-			    (insn[0].src_reg == BPF_PSEUDO_MAP_FD &&
-			     insn[1].imm != 0)) {
-				verbose(env,
-					"unrecognized bpf_ld_imm64 insn\n");
+			switch (insn[0].src_reg) {
+			case BPF_PSEUDO_MAP_VALUE:
+			case BPF_PSEUDO_MAP_IDX_VALUE:
+				break;
+			case BPF_PSEUDO_MAP_FD:
+			case BPF_PSEUDO_MAP_IDX:
+				if (insn[1].imm == 0)
+					break;
+				fallthrough;
+			default:
+				verbose(env, "unrecognized bpf_ld_imm64 insn\n");
 				return -EINVAL;
 			}
 
-			f = fdget(insn[0].imm);
+			switch (insn[0].src_reg) {
+			case BPF_PSEUDO_MAP_IDX_VALUE:
+			case BPF_PSEUDO_MAP_IDX:
+				if (bpfptr_is_null(env->fd_array)) {
+					verbose(env, "fd_idx without fd_array is invalid\n");
+					return -EPROTO;
+				}
+				if (copy_from_bpfptr_offset(&fd, env->fd_array,
+							    insn[0].imm * sizeof(fd),
+							    sizeof(fd)))
+					return -EFAULT;
+				break;
+			default:
+				fd = insn[0].imm;
+				break;
+			}
+
+			f = fdget(fd);
 			map = __bpf_map_get(f);
 			if (IS_ERR(map)) {
 				verbose(env, "fd %d is not pointing to valid bpf_map\n",
@@ -11074,7 +11099,8 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
 			}
 
 			aux = &env->insn_aux_data[i];
-			if (insn->src_reg == BPF_PSEUDO_MAP_FD) {
+			if (insn[0].src_reg == BPF_PSEUDO_MAP_FD ||
+			    insn[0].src_reg == BPF_PSEUDO_MAP_IDX) {
 				addr = (unsigned long)map;
 			} else {
 				u32 off = insn[1].imm;
@@ -13142,6 +13168,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
 		env->insn_aux_data[i].orig_idx = i;
 	env->prog = *prog;
 	env->ops = bpf_verifier_ops[env->prog->type];
+	env->fd_array = make_bpfptr(attr->fd_array, uattr.is_kernel);
 	is_priv = bpf_capable();
 
 	bpf_get_btf_vmlinux();
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index ce3e76ff08cd..068c23f278bc 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1098,8 +1098,8 @@ enum bpf_link_type {
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
  * the following extensions:
  *
- * insn[0].src_reg:  BPF_PSEUDO_MAP_FD
- * insn[0].imm:      map fd
+ * insn[0].src_reg:  BPF_PSEUDO_MAP_[FD|IDX]
+ * insn[0].imm:      map fd or fd_idx
  * insn[1].imm:      0
  * insn[0].off:      0
  * insn[1].off:      0
@@ -1107,15 +1107,19 @@ enum bpf_link_type {
  * verifier type:    CONST_PTR_TO_MAP
  */
 #define BPF_PSEUDO_MAP_FD	1
-/* insn[0].src_reg:  BPF_PSEUDO_MAP_VALUE
- * insn[0].imm:      map fd
+#define BPF_PSEUDO_MAP_IDX	5
+
+/* insn[0].src_reg:  BPF_PSEUDO_MAP_[IDX_]VALUE
+ * insn[0].imm:      map fd or fd_idx
  * insn[1].imm:      offset into value
  * insn[0].off:      0
  * insn[1].off:      0
  * ldimm64 rewrite:  address of map[0]+offset
  * verifier type:    PTR_TO_MAP_VALUE
  */
-#define BPF_PSEUDO_MAP_VALUE	2
+#define BPF_PSEUDO_MAP_VALUE		2
+#define BPF_PSEUDO_MAP_IDX_VALUE	6
+
 /* insn[0].src_reg:  BPF_PSEUDO_BTF_ID
  * insn[0].imm:      kernel btd id of VAR
  * insn[1].imm:      0
@@ -1315,6 +1319,8 @@ union bpf_attr {
 			/* or valid module BTF object fd or 0 to attach to vmlinux */
 			__u32		attach_btf_obj_fd;
 		};
+		__u32		:32;		/* pad */
+		__aligned_u64	fd_array;	/* array of FDs */
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */
-- 
2.30.2



* [PATCH bpf-next 09/15] libbpf: Support for fd_idx
  2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (7 preceding siblings ...)
  2021-04-17  3:32 ` [PATCH bpf-next 08/15] bpf: Introduce fd_idx Alexei Starovoitov
@ 2021-04-17  3:32 ` Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 10/15] bpf: Add bpf_btf_find_by_name_kind() helper Alexei Starovoitov
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add support for FD_IDX and make libbpf prefer that approach when loading programs.
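
A rough user-space model of the relocation choice is sketched below. The
names model_insn, model_relocate_ld64() and model_insn_equal() are
hypothetical and exist only for this illustration; the have_fd_idx flag
stands in for the result of the kernel feature probe. The sketch shows why
the index form keeps instructions identical across loads even when the
kernel hands out different map FDs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BPF_PSEUDO_MAP_FD	1
#define BPF_PSEUDO_MAP_IDX	5

/* Toy instruction: only the fields touched by the relocation. */
struct model_insn {
	uint8_t src_reg;
	int32_t imm;
};

/* Patch a map-load instruction with either the map index or the map FD. */
static void model_relocate_ld64(struct model_insn *insn, int map_idx,
				const int *map_fds, bool have_fd_idx)
{
	assert(insn && map_fds && map_idx >= 0);
	if (have_fd_idx) {
		/* Index form: independent of the FD values of this run. */
		insn->src_reg = BPF_PSEUDO_MAP_IDX;
		insn->imm = map_idx;
	} else {
		/* Legacy form: bakes the run-specific FD into the insn. */
		insn->src_reg = BPF_PSEUDO_MAP_FD;
		insn->imm = map_fds[map_idx];
	}
}

/* Field-wise comparison; avoids comparing struct padding bytes. */
static bool model_insn_equal(const struct model_insn *a,
			     const struct model_insn *b)
{
	return a->src_reg == b->src_reg && a->imm == b->imm;
}
```

Relocating the same instruction in two runs where map 0 received FD 3 and
FD 11 yields equal instructions in the index form and different ones in the
legacy form.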

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/lib/bpf/bpf.c             |  1 +
 tools/lib/bpf/libbpf.c          | 70 +++++++++++++++++++++++++++++----
 tools/lib/bpf/libbpf_internal.h |  1 +
 3 files changed, 65 insertions(+), 7 deletions(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index bba48ff4c5c0..b96a3aba6fcc 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -252,6 +252,7 @@ int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr)
 
 	attr.prog_btf_fd = load_attr->prog_btf_fd;
 	attr.prog_flags = load_attr->prog_flags;
+	attr.fd_array = ptr_to_u64(load_attr->fd_array);
 
 	attr.func_info_rec_size = load_attr->func_info_rec_size;
 	attr.func_info_cnt = load_attr->func_info_cnt;
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 254a0c9aa6cf..17cfc5b66111 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -176,6 +176,8 @@ enum kern_feature_id {
 	FEAT_MODULE_BTF,
 	/* BTF_KIND_FLOAT support */
 	FEAT_BTF_FLOAT,
+	/* Kernel support for FD_IDX */
+	FEAT_FD_IDX,
 	__FEAT_CNT,
 };
 
@@ -288,6 +290,7 @@ struct bpf_program {
 	__u32 line_info_rec_size;
 	__u32 line_info_cnt;
 	__u32 prog_flags;
+	int *fd_array;
 };
 
 struct bpf_struct_ops {
@@ -4181,6 +4184,24 @@ static int probe_module_btf(void)
 	return !err;
 }
 
+static int probe_kern_fd_idx(void)
+{
+	struct bpf_load_program_attr attr;
+	struct bpf_insn insns[] = {
+		BPF_LD_IMM64_RAW(BPF_REG_0, BPF_PSEUDO_MAP_IDX, 0),
+		BPF_EXIT_INSN(),
+	};
+
+	memset(&attr, 0, sizeof(attr));
+	attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
+	attr.insns = insns;
+	attr.insns_cnt = ARRAY_SIZE(insns);
+	attr.license = "GPL";
+
+	probe_fd(bpf_load_program_xattr(&attr, NULL, 0));
+	return errno == EPROTO;
+}
+
 enum kern_feature_result {
 	FEAT_UNKNOWN = 0,
 	FEAT_SUPPORTED = 1,
@@ -4231,6 +4252,9 @@ static struct kern_feature_desc {
 	[FEAT_BTF_FLOAT] = {
 		"BTF_KIND_FLOAT support", probe_kern_btf_float,
 	},
+	[FEAT_FD_IDX] = {
+		"prog_load with fd_idx", probe_kern_fd_idx,
+	},
 };
 
 static bool kernel_supports(enum kern_feature_id feat_id)
@@ -6309,19 +6333,34 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
 
 		switch (relo->type) {
 		case RELO_LD64:
-			insn[0].src_reg = BPF_PSEUDO_MAP_FD;
-			insn[0].imm = obj->maps[relo->map_idx].fd;
+			if (kernel_supports(FEAT_FD_IDX)) {
+				insn[0].src_reg = BPF_PSEUDO_MAP_IDX;
+				insn[0].imm = relo->map_idx;
+			} else {
+				insn[0].src_reg = BPF_PSEUDO_MAP_FD;
+				insn[0].imm = obj->maps[relo->map_idx].fd;
+			}
 			break;
 		case RELO_DATA:
-			insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
 			insn[1].imm = insn[0].imm + relo->sym_off;
-			insn[0].imm = obj->maps[relo->map_idx].fd;
+			if (kernel_supports(FEAT_FD_IDX)) {
+				insn[0].src_reg = BPF_PSEUDO_MAP_IDX_VALUE;
+				insn[0].imm = relo->map_idx;
+			} else {
+				insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
+				insn[0].imm = obj->maps[relo->map_idx].fd;
+			}
 			break;
 		case RELO_EXTERN_VAR:
 			ext = &obj->externs[relo->sym_off];
 			if (ext->type == EXT_KCFG) {
-				insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
-				insn[0].imm = obj->maps[obj->kconfig_map_idx].fd;
+				if (kernel_supports(FEAT_FD_IDX)) {
+					insn[0].src_reg = BPF_PSEUDO_MAP_IDX_VALUE;
+					insn[0].imm = obj->kconfig_map_idx;
+				} else {
+					insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
+					insn[0].imm = obj->maps[obj->kconfig_map_idx].fd;
+				}
 				insn[1].imm = ext->kcfg.data_off;
 			} else /* EXT_KSYM */ {
 				if (ext->ksym.type_id) { /* typed ksyms */
@@ -7047,6 +7086,7 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
 	load_attr.attach_btf_id = prog->attach_btf_id;
 	load_attr.kern_version = kern_version;
 	load_attr.prog_ifindex = prog->prog_ifindex;
+	load_attr.fd_array = prog->fd_array;
 
 	/* specify func_info/line_info only if kernel supports them */
 	btf_fd = bpf_object__btf_fd(prog->obj);
@@ -7239,6 +7279,8 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
 	struct bpf_program *prog;
 	size_t i;
 	int err;
+	struct bpf_map *map;
+	int *fd_array = NULL;
 
 	for (i = 0; i < obj->nr_programs; i++) {
 		prog = &obj->programs[i];
@@ -7247,6 +7289,16 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
 			return err;
 	}
 
+	if (kernel_supports(FEAT_FD_IDX) && obj->nr_maps) {
+		fd_array = malloc(sizeof(int) * obj->nr_maps);
+		if (!fd_array)
+			return -ENOMEM;
+		for (i = 0; i < obj->nr_maps; i++) {
+			map = &obj->maps[i];
+			fd_array[i] = map->fd;
+		}
+	}
+
 	for (i = 0; i < obj->nr_programs; i++) {
 		prog = &obj->programs[i];
 		if (prog_is_subprog(obj, prog))
@@ -7256,10 +7308,14 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
 			continue;
 		}
 		prog->log_level |= log_level;
+		prog->fd_array = fd_array;
 		err = bpf_program__load(prog, obj->license, obj->kern_version);
-		if (err)
+		if (err) {
+			free(fd_array);
 			return err;
+		}
 	}
+	free(fd_array);
 	return 0;
 }
 
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 6017902c687e..9114c7085f2a 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -204,6 +204,7 @@ struct bpf_prog_load_params {
 	__u32 log_level;
 	char *log_buf;
 	size_t log_buf_sz;
+	int *fd_array;
 };
 
 int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr);
-- 
2.30.2



* [PATCH bpf-next 10/15] bpf: Add bpf_btf_find_by_name_kind() helper.
  2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (8 preceding siblings ...)
  2021-04-17  3:32 ` [PATCH bpf-next 09/15] libbpf: Support for fd_idx Alexei Starovoitov
@ 2021-04-17  3:32 ` Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 11/15] bpf: Add bpf_sys_close() helper Alexei Starovoitov
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add bpf_btf_find_by_name_kind() helper.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf.h            |  1 +
 include/uapi/linux/bpf.h       |  8 ++++++
 kernel/bpf/btf.c               | 51 ++++++++++++++++++++++++++++++++++
 kernel/bpf/syscall.c           |  2 ++
 tools/include/uapi/linux/bpf.h |  8 ++++++
 5 files changed, 70 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 1aede7ceea5e..fe28ccb57f83 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1970,6 +1970,7 @@ extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto;
 extern const struct bpf_func_proto bpf_task_storage_get_proto;
 extern const struct bpf_func_proto bpf_task_storage_delete_proto;
 extern const struct bpf_func_proto bpf_for_each_map_elem_proto;
+extern const struct bpf_func_proto bpf_btf_find_by_name_kind_proto;
 
 const struct bpf_func_proto *bpf_tracing_func_proto(
 	enum bpf_func_id func_id, const struct bpf_prog *prog);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 068c23f278bc..47c4b21a51b6 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4721,6 +4721,13 @@ union bpf_attr {
  * 		Execute bpf syscall with given arguments.
  * 	Return
  * 		A syscall result.
+ *
+ * long bpf_btf_find_by_name_kind(u32 btf_fd, char *name, u32 kind, u32 *attach_btf_obj_fd, int flags)
+ * 	Description
+ * 		Find given name with given type in BTF pointed to by btf_fd.
+ * 		If btf_fd is zero look for the name in vmlinux BTF and kernel module BTFs.
+ * 	Return
+ * 		Return btf_id and store module's BTF FD into attach_btf_obj_fd.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4889,6 +4896,7 @@ union bpf_attr {
 	FN(check_mtu),			\
 	FN(for_each_map_elem),		\
 	FN(sys_bpf),			\
+	FN(btf_find_by_name_kind),	\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index fbf6c06a9d62..2f474739d9d3 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6085,3 +6085,54 @@ struct module *btf_try_get_module(const struct btf *btf)
 
 	return res;
 }
+
+BPF_CALL_5(bpf_btf_find_by_name_kind, int, btf_fd, char *, name, u32, kind, int *, attach_btf_obj_fd, int, flags)
+{
+	struct btf *btf;
+	int ret;
+
+	if (flags)
+		return -EINVAL;
+
+	if (btf_fd)
+		btf = btf_get_by_fd(btf_fd);
+	else
+		btf = bpf_get_btf_vmlinux();
+	if (IS_ERR(btf))
+		return PTR_ERR(btf);
+
+	*attach_btf_obj_fd = 0;
+	ret = btf_find_by_name_kind(btf, name, kind);
+	if (!btf_fd && ret < 0) {
+		struct btf *mod_btf;
+		int id = btf->id;
+
+		spin_lock_bh(&btf_idr_lock);
+		idr_for_each_entry_continue(&btf_idr, mod_btf, id) {
+			if (!btf_is_kernel(mod_btf))
+				continue;
+			ret = btf_find_by_name_kind(mod_btf, name, kind);
+			if (ret > 0)
+				break;
+		}
+		if (mod_btf)
+			btf_get(mod_btf);
+		spin_unlock_bh(&btf_idr_lock);
+		if (mod_btf)
+			*attach_btf_obj_fd = __btf_new_fd(mod_btf);
+	}
+	if (btf_fd)
+		btf_put(btf);
+	return ret;
+}
+
+const struct bpf_func_proto bpf_btf_find_by_name_kind_proto = {
+	.func		= bpf_btf_find_by_name_kind,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_ANYTHING,
+	.arg2_type	= ARG_ANYTHING,
+	.arg3_type	= ARG_ANYTHING,
+	.arg4_type	= ARG_PTR_TO_INT,
+	.arg5_type	= ARG_ANYTHING,
+};
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 7b51d56e420b..6d4d9925c0ec 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -4563,6 +4563,8 @@ syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	switch (func_id) {
 	case BPF_FUNC_sys_bpf:
 		return &bpf_sys_bpf_proto;
+	case BPF_FUNC_btf_find_by_name_kind:
+		return &bpf_btf_find_by_name_kind_proto;
 	default:
 		return bpf_base_func_proto(func_id);
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 068c23f278bc..47c4b21a51b6 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4721,6 +4721,13 @@ union bpf_attr {
  * 		Execute bpf syscall with given arguments.
  * 	Return
  * 		A syscall result.
+ *
+ * long bpf_btf_find_by_name_kind(u32 btf_fd, char *name, u32 kind, u32 *attach_btf_obj_fd, int flags)
+ * 	Description
+ * 		Find given name with given type in BTF pointed to by btf_fd.
+ * 		If btf_fd is zero look for the name in vmlinux BTF and kernel module BTFs.
+ * 	Return
+ * 		Return btf_id and store module's BTF FD into attach_btf_obj_fd.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4889,6 +4896,7 @@ union bpf_attr {
 	FN(check_mtu),			\
 	FN(for_each_map_elem),		\
 	FN(sys_bpf),			\
+	FN(btf_find_by_name_kind),	\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.30.2



* [PATCH bpf-next 11/15] bpf: Add bpf_sys_close() helper.
  2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (9 preceding siblings ...)
  2021-04-17  3:32 ` [PATCH bpf-next 10/15] bpf: Add bpf_btf_find_by_name_kind() helper Alexei Starovoitov
@ 2021-04-17  3:32 ` Alexei Starovoitov
  2021-04-17  3:42   ` Al Viro
  2021-04-17  3:32 ` [PATCH bpf-next 12/15] libbpf: Change the order of data and text relocations Alexei Starovoitov
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add bpf_sys_close() helper to be used by the syscall/loader program to close
intermediate FDs and perform other cleanup.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/uapi/linux/bpf.h       |  7 +++++++
 kernel/bpf/syscall.c           | 14 ++++++++++++++
 tools/include/uapi/linux/bpf.h |  7 +++++++
 3 files changed, 28 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 47c4b21a51b6..2251e7894799 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4728,6 +4728,12 @@ union bpf_attr {
  * 		If btf_fd is zero look for the name in vmlinux BTF and kernel module BTFs.
  * 	Return
  * 		Return btf_id and store module's BTF FD into attach_btf_obj_fd.
+ *
+ * long bpf_sys_close(u32 fd)
+ * 	Description
+ * 		Execute close syscall for given FD.
+ * 	Return
+ * 		A syscall result.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4897,6 +4903,7 @@ union bpf_attr {
 	FN(for_each_map_elem),		\
 	FN(sys_bpf),			\
 	FN(btf_find_by_name_kind),	\
+	FN(sys_close),			\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 6d4d9925c0ec..822b00908c58 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -4557,6 +4557,18 @@ const struct bpf_func_proto bpf_sys_bpf_proto = {
 	.arg3_type	= ARG_CONST_SIZE,
 };
 
+BPF_CALL_1(bpf_sys_close, u32, fd)
+{
+	return close_fd(fd);
+}
+
+const struct bpf_func_proto bpf_sys_close_proto = {
+	.func		= bpf_sys_close,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_ANYTHING,
+};
+
 static const struct bpf_func_proto *
 syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -4565,6 +4577,8 @@ syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_sys_bpf_proto;
 	case BPF_FUNC_btf_find_by_name_kind:
 		return &bpf_btf_find_by_name_kind_proto;
+	case BPF_FUNC_sys_close:
+		return &bpf_sys_close_proto;
 	default:
 		return bpf_base_func_proto(func_id);
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 47c4b21a51b6..2251e7894799 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4728,6 +4728,12 @@ union bpf_attr {
  * 		If btf_fd is zero look for the name in vmlinux BTF and kernel module BTFs.
  * 	Return
  * 		Return btf_id and store module's BTF FD into attach_btf_obj_fd.
+ *
+ * long bpf_sys_close(u32 fd)
+ * 	Description
+ * 		Execute close syscall for given FD.
+ * 	Return
+ * 		A syscall result.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4897,6 +4903,7 @@ union bpf_attr {
 	FN(for_each_map_elem),		\
 	FN(sys_bpf),			\
 	FN(btf_find_by_name_kind),	\
+	FN(sys_close),			\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.30.2



* [PATCH bpf-next 12/15] libbpf: Change the order of data and text relocations.
  2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (10 preceding siblings ...)
  2021-04-17  3:32 ` [PATCH bpf-next 11/15] bpf: Add bpf_sys_close() helper Alexei Starovoitov
@ 2021-04-17  3:32 ` Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 13/15] libbpf: Generate loader program out of BPF ELF file Alexei Starovoitov
                   ` (2 subsequent siblings)
  14 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

To make it possible to generate the loader program in later patches,
change the order of data and text relocations.
Also extend the test to include data relocations.
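
A toy illustration of why subprog relocations have to be rebased when they
are appended to the main program. This is a simplified sketch, not the
libbpf implementation: sketch_relo, sketch_prog and sketch_append_relos are
hypothetical names, and it uses plain realloc() instead of
libbpf_reallocarray().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sketch_relo {
	int insn_idx;		/* instruction this relocation applies to */
};

struct sketch_prog {
	struct sketch_relo *relos;
	int nr_relos;
	int sub_insn_off;	/* where this prog's insns landed in the main prog */
};

/* Append subprog relos to main_prog, shifting insn_idx by the subprog's
 * offset inside the main program. Returns 0 on success, -1 on failure.
 */
static int sketch_append_relos(struct sketch_prog *main_prog,
			       const struct sketch_prog *subprog)
{
	int new_cnt = main_prog->nr_relos + subprog->nr_relos;
	struct sketch_relo *relos;
	int i;

	if (subprog->nr_relos == 0)
		return 0;
	relos = realloc(main_prog->relos, new_cnt * sizeof(*relos));
	if (!relos)
		return -1;
	memcpy(relos + main_prog->nr_relos, subprog->relos,
	       subprog->nr_relos * sizeof(*relos));
	/* Subprog insns were appended after the main prog's insns, so the
	 * rebased indices stay sorted after the existing ones.
	 */
	for (i = main_prog->nr_relos; i < new_cnt; i++)
		relos[i].insn_idx += subprog->sub_insn_off;
	main_prog->relos = relos;
	main_prog->nr_relos = new_cnt;
	return 0;
}
```

Once every relocation carries an index relative to the final main program,
a single data-relocation pass over main programs can see all of them.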

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/lib/bpf/libbpf.c                        | 82 ++++++++++++++-----
 .../selftests/bpf/progs/test_subprogs.c       | 13 +++
 2 files changed, 76 insertions(+), 19 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 17cfc5b66111..083e441d9c5e 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -6379,11 +6379,15 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
 			insn[0].imm = ext->ksym.kernel_btf_id;
 			break;
 		case RELO_SUBPROG_ADDR:
-			insn[0].src_reg = BPF_PSEUDO_FUNC;
-			/* will be handled as a follow up pass */
+			if (insn[0].src_reg != BPF_PSEUDO_FUNC) {
+				pr_warn("prog '%s': relo #%d: bad insn\n",
+					prog->name, i);
+				return -EINVAL;
+			}
+			/* handled already */
 			break;
 		case RELO_CALL:
-			/* will be handled as a follow up pass */
+			/* handled already */
 			break;
 		default:
 			pr_warn("prog '%s': relo #%d: bad relo type %d\n",
@@ -6552,6 +6556,31 @@ static struct reloc_desc *find_prog_insn_relo(const struct bpf_program *prog, si
 		       sizeof(*prog->reloc_desc), cmp_relo_by_insn_idx);
 }
 
+static int append_subprog_relos(struct bpf_program *main_prog, struct bpf_program *subprog)
+{
+	int new_cnt = main_prog->nr_reloc + subprog->nr_reloc;
+	struct reloc_desc *relos;
+	size_t off = subprog->sub_insn_off;
+	int i;
+
+	if (main_prog == subprog)
+		return 0;
+	relos = libbpf_reallocarray(main_prog->reloc_desc, new_cnt, sizeof(*relos));
+	if (!relos)
+		return -ENOMEM;
+	memcpy(relos + main_prog->nr_reloc, subprog->reloc_desc,
+	       sizeof(*relos) * subprog->nr_reloc);
+
+	for (i = main_prog->nr_reloc; i < new_cnt; i++)
+		relos[i].insn_idx += off;
+	/* After insn_idx adjustment the 'relos' array is still sorted
+	 * by insn_idx and doesn't break bsearch.
+	 */
+	main_prog->reloc_desc = relos;
+	main_prog->nr_reloc = new_cnt;
+	return 0;
+}
+
 static int
 bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog,
 		       struct bpf_program *prog)
@@ -6560,12 +6589,21 @@ bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog,
 	struct bpf_program *subprog;
 	struct bpf_insn *insns, *insn;
 	struct reloc_desc *relo;
-	int err;
+	int err, i;
 
 	err = reloc_prog_func_and_line_info(obj, main_prog, prog);
 	if (err)
 		return err;
 
+	for (i = 0; i < prog->nr_reloc; i++) {
+		relo = &prog->reloc_desc[i];
+		insn = &main_prog->insns[prog->sub_insn_off + relo->insn_idx];
+
+		if (relo->type == RELO_SUBPROG_ADDR)
+			/* mark the insn, so it becomes insn_is_pseudo_func() */
+			insn[0].src_reg = BPF_PSEUDO_FUNC;
+	}
+
 	for (insn_idx = 0; insn_idx < prog->sec_insn_cnt; insn_idx++) {
 		insn = &main_prog->insns[prog->sub_insn_off + insn_idx];
 		if (!insn_is_subprog_call(insn) && !insn_is_pseudo_func(insn))
@@ -6573,6 +6611,8 @@ bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog,
 
 		relo = find_prog_insn_relo(prog, insn_idx);
 		if (relo && relo->type != RELO_CALL && relo->type != RELO_SUBPROG_ADDR) {
+			if (relo->type == RELO_EXTERN_FUNC)
+				continue;
 			pr_warn("prog '%s': unexpected relo for insn #%zu, type %d\n",
 				prog->name, insn_idx, relo->type);
 			return -LIBBPF_ERRNO__RELOC;
@@ -6646,6 +6686,10 @@ bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog,
 			pr_debug("prog '%s': added %zu insns from sub-prog '%s'\n",
 				 main_prog->name, subprog->insns_cnt, subprog->name);
 
+			/* The subprog insns are now appended. Append its relos too. */
+			err = append_subprog_relos(main_prog, subprog);
+			if (err)
+				return err;
 			err = bpf_object__reloc_code(obj, main_prog, subprog);
 			if (err)
 				return err;
@@ -6790,23 +6834,11 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
 			return err;
 		}
 	}
-	/* relocate data references first for all programs and sub-programs,
-	 * as they don't change relative to code locations, so subsequent
-	 * subprogram processing won't need to re-calculate any of them
-	 */
-	for (i = 0; i < obj->nr_programs; i++) {
-		prog = &obj->programs[i];
-		err = bpf_object__relocate_data(obj, prog);
-		if (err) {
-			pr_warn("prog '%s': failed to relocate data references: %d\n",
-				prog->name, err);
-			return err;
-		}
-	}
-	/* now relocate subprogram calls and append used subprograms to main
+	/* relocate subprogram calls and append used subprograms to main
 	 * programs; each copy of subprogram code needs to be relocated
 	 * differently for each main program, because its code location might
-	 * have changed
+	 * have changed.
+	 * Append subprog relos to main programs.
 	 */
 	for (i = 0; i < obj->nr_programs; i++) {
 		prog = &obj->programs[i];
@@ -6823,6 +6855,18 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
 			return err;
 		}
 	}
+	/* Process data relos for main programs */
+	for (i = 0; i < obj->nr_programs; i++) {
+		prog = &obj->programs[i];
+		if (prog_is_subprog(obj, prog))
+			continue;
+		err = bpf_object__relocate_data(obj, prog);
+		if (err) {
+			pr_warn("prog '%s': failed to relocate data references: %d\n",
+				prog->name, err);
+			return err;
+		}
+	}
 	/* free up relocation descriptors */
 	for (i = 0; i < obj->nr_programs; i++) {
 		prog = &obj->programs[i];
diff --git a/tools/testing/selftests/bpf/progs/test_subprogs.c b/tools/testing/selftests/bpf/progs/test_subprogs.c
index d3c5673c0218..b7c37ca09544 100644
--- a/tools/testing/selftests/bpf/progs/test_subprogs.c
+++ b/tools/testing/selftests/bpf/progs/test_subprogs.c
@@ -4,8 +4,18 @@
 
 const char LICENSE[] SEC("license") = "GPL";
 
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(max_entries, 1);
+	__type(key, __u32);
+	__type(value, __u64);
+} array SEC(".maps");
+
 __noinline int sub1(int x)
 {
+	int key = 0;
+
+	bpf_map_lookup_elem(&array, &key);
 	return x + 1;
 }
 
@@ -23,6 +33,9 @@ static __noinline int sub3(int z)
 
 static __noinline int sub4(int w)
 {
+	int key = 0;
+
+	bpf_map_lookup_elem(&array, &key);
 	return w + sub3(5) + sub1(6);
 }
 
-- 
2.30.2



* [PATCH bpf-next 13/15] libbpf: Generate loader program out of BPF ELF file.
  2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (11 preceding siblings ...)
  2021-04-17  3:32 ` [PATCH bpf-next 12/15] libbpf: Change the order of data and text relocations Alexei Starovoitov
@ 2021-04-17  3:32 ` Alexei Starovoitov
  2021-04-21  1:34   ` Yonghong Song
  2021-04-17  3:32 ` [PATCH bpf-next 14/15] bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 15/15] selftests/bpf: Convert few tests to light skeleton Alexei Starovoitov
  14 siblings, 1 reply; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

The BPF program loading process performed by libbpf is quite complex
and consists of the following steps:
"open" phase:
- parse elf file and remember relocations, sections
- collect externs and ksyms including their btf_ids in prog's BTF
- patch BTF datasec (since llvm couldn't do it)
- init maps (old style map_def, BTF based, global data map, kconfig map)
- collect relocations against progs and maps
"load" phase:
- probe kernel features
- load vmlinux BTF
- resolve externs (kconfig and ksym)
- load program BTF
- init struct_ops
- create maps
- apply CO-RE relocations
- patch ld_imm64 insns with src_reg=PSEUDO_MAP, PSEUDO_MAP_VALUE, PSEUDO_BTF_ID
- reposition subprograms and adjust call insns
- sanitize and load progs

During this process libbpf does sys_bpf() calls to load BTF, create maps,
populate maps and finally load programs.
Instead of actually doing the syscalls, generate a trace of what libbpf
would have done and represent it as the "loader program".
The "loader program" consists of a single map with:
- union bpf_attr(s)
- BTF bytes
- map value bytes
- insns bytes
and a single bpf program that passes bpf_attr(s) and data into the bpf_sys_bpf() helper.
Executing such a "loader program" via the bpf_prog_test_run() command will
replay the sequence of syscalls that libbpf would have done, which will result
in the same maps being created and programs loaded as specified in the elf file.
The "loader program" removes libelf and the majority of the libbpf dependency
from the program loading process.
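
A rough sketch of how the single data map could be assembled: each piece
(bpf_attr, BTF bytes, map values, insns) is appended to one growing buffer
and later referenced by its byte offset. The names sketch_blob and
sketch_blob_add are hypothetical. Unlike bpf_gen__add_data() in this patch,
the sketch reports failure as -1 so that it cannot be confused with a valid
offset of 0; that is a choice made for the example only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct sketch_blob {
	unsigned char *start;	/* beginning of the accumulated data */
	size_t len;		/* bytes used so far */
	int error;		/* sticky failure flag */
};

/* Append 'size' bytes and return the offset at which they were stored,
 * or -1 if the blob is (or becomes) unusable.
 */
static long sketch_blob_add(struct sketch_blob *b, const void *data, size_t size)
{
	size_t off = b->len;
	unsigned char *p;

	if (b->error)
		return -1;
	if (size == 0)
		return (long)off;
	p = realloc(b->start, off + size);
	if (!p) {
		b->error = 1;
		return -1;
	}
	memcpy(p + off, data, size);
	b->start = p;
	b->len = off + size;
	return (long)off;
}
```

The generated program then only needs the offsets: for example, a recorded
BTF load would store the offset of the BTF bytes and emit instructions that
turn it into a pointer into the map value at run time.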

kconfig, typeless ksym, struct_ops and CO-RE are not supported yet.

The order of relocate_data and relocate_calls had to change in order
for trace generation to see all relocations for given program with
correct insn_idx-es.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/lib/bpf/Build              |   2 +-
 tools/lib/bpf/bpf.c              |  61 ++++
 tools/lib/bpf/bpf.h              |  35 ++
 tools/lib/bpf/bpf_gen_internal.h |  38 +++
 tools/lib/bpf/gen_trace.c        | 529 +++++++++++++++++++++++++++++++
 tools/lib/bpf/libbpf.c           | 199 ++++++++++--
 tools/lib/bpf/libbpf.map         |   1 +
 tools/lib/bpf/libbpf_internal.h  |   2 +
 8 files changed, 834 insertions(+), 33 deletions(-)
 create mode 100644 tools/lib/bpf/bpf_gen_internal.h
 create mode 100644 tools/lib/bpf/gen_trace.c

diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build
index 9b057cc7650a..d0a1903bcc3c 100644
--- a/tools/lib/bpf/Build
+++ b/tools/lib/bpf/Build
@@ -1,3 +1,3 @@
 libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \
 	    netlink.o bpf_prog_linfo.o libbpf_probes.o xsk.o hashmap.o \
-	    btf_dump.o ringbuf.o strset.o linker.o
+	    btf_dump.o ringbuf.o strset.o linker.o gen_trace.o
diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index b96a3aba6fcc..517e4f949a73 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -972,3 +972,64 @@ int bpf_prog_bind_map(int prog_fd, int map_fd,
 
 	return sys_bpf(BPF_PROG_BIND_MAP, &attr, sizeof(attr));
 }
+
+int bpf_load(const struct bpf_load_opts *opts)
+{
+	struct bpf_prog_test_run_attr tattr = {};
+	struct bpf_prog_load_params attr = {};
+	int map_fd = -1, prog_fd = -1, key = 0, err;
+
+	if (!OPTS_VALID(opts, bpf_load_opts))
+		return -EINVAL;
+
+	map_fd = bpf_create_map_name(BPF_MAP_TYPE_ARRAY, "__loader.map", 4,
+				     opts->data_sz, 1, 0);
+	if (map_fd < 0) {
+		pr_warn("failed to create loader map");
+		err = errno;
+		goto out;
+	}
+
+	err = bpf_map_update_elem(map_fd, &key, opts->data, 0);
+	if (err < 0) {
+		pr_warn("failed to update loader map");
+		err = errno;
+		goto out;
+	}
+
+	attr.prog_type = BPF_PROG_TYPE_SYSCALL;
+	attr.insns = opts->insns;
+	attr.insn_cnt = opts->insns_sz / sizeof(struct bpf_insn);
+	attr.license = "GPL";
+	attr.name = "__loader.prog";
+	attr.fd_array = &map_fd;
+	attr.log_level = opts->ctx->log_level;
+	attr.log_buf_sz = opts->ctx->log_size;
+	attr.log_buf = (void *) opts->ctx->log_buf;
+	prog_fd = libbpf__bpf_prog_load(&attr);
+	if (prog_fd < 0) {
+		pr_warn("failed to load loader prog %d", errno);
+		err = errno;
+		goto out;
+	}
+
+	tattr.prog_fd = prog_fd;
+	tattr.ctx_in = opts->ctx;
+	tattr.ctx_size_in = opts->ctx->sz;
+	err = bpf_prog_test_run_xattr(&tattr);
+	if (err < 0 || (int)tattr.retval < 0) {
+		pr_warn("failed to execute loader prog %d retval %d",
+			errno, tattr.retval);
+		if (err < 0)
+			err = errno;
+		else
+			err = -(int)tattr.retval;
+		goto out;
+	}
+	err = 0;
+out:
+	close(map_fd);
+	close(prog_fd);
+	return err;
+
+}
diff --git a/tools/lib/bpf/bpf.h b/tools/lib/bpf/bpf.h
index 875dde20d56e..0d36fd412450 100644
--- a/tools/lib/bpf/bpf.h
+++ b/tools/lib/bpf/bpf.h
@@ -278,6 +278,41 @@ struct bpf_test_run_opts {
 LIBBPF_API int bpf_prog_test_run_opts(int prog_fd,
 				      struct bpf_test_run_opts *opts);
 
+/* The layout of bpf_map_prog_desc and bpf_loader_ctx is feature dependent
+ * and will change from one version of libbpf to another and features
+ * requested during loader program generation.
+ */
+union bpf_map_prog_desc {
+	struct {
+		__u32 map_fd;
+		__u32 max_entries;
+	};
+	struct {
+		__u32 prog_fd;
+		__u32 attach_prog_fd;
+	};
+};
+
+struct bpf_loader_ctx {
+	size_t sz;
+	__u32 log_level;
+	__u32 log_size;
+	__u64 log_buf;
+	union bpf_map_prog_desc u[];
+};
+
+struct bpf_load_opts {
+	size_t sz; /* size of this struct for forward/backward compatibility */
+	struct bpf_loader_ctx *ctx;
+	const void *data;
+	const void *insns;
+	__u32 data_sz;
+	__u32 insns_sz;
+};
+#define bpf_load_opts__last_field insns_sz
+
+LIBBPF_API int bpf_load(const struct bpf_load_opts *opts);
+
 #ifdef __cplusplus
 } /* extern "C" */
 #endif
diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h
new file mode 100644
index 000000000000..a79f2e4ad980
--- /dev/null
+++ b/tools/lib/bpf/bpf_gen_internal.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+/* Copyright (c) 2021 Facebook */
+#ifndef __BPF_GEN_INTERNAL_H
+#define __BPF_GEN_INTERNAL_H
+
+struct relo_desc {
+	const char *name;
+	int kind;
+	int insn_idx;
+};
+
+struct bpf_gen {
+	void *data_start;
+	void *data_cur;
+	void *insn_start;
+	void *insn_cur;
+	__u32 nr_progs;
+	__u32 nr_maps;
+	int log_level;
+	int error;
+	struct relo_desc *relos;
+	int relo_cnt;
+};
+
+void bpf_object__set_gen_trace(struct bpf_object *obj, struct bpf_gen *gen);
+
+void bpf_gen__init(struct bpf_gen *gen, int log_level);
+int bpf_gen__finish(struct bpf_gen *gen);
+void bpf_gen__load_btf(struct bpf_gen *gen, const void *raw_data, __u32 raw_size);
+void bpf_gen__map_create(struct bpf_gen *gen, struct bpf_create_map_attr *map_attr, int map_idx);
+struct bpf_prog_load_params;
+void bpf_gen__prog_load(struct bpf_gen *gen, struct bpf_prog_load_params *load_attr, int prog_idx);
+void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *value, __u32 value_size);
+void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx);
+void bpf_gen__record_find_name(struct bpf_gen *gen, const char *name, enum bpf_attach_type type);
+void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, int kind, int insn_idx);
+
+#endif
diff --git a/tools/lib/bpf/gen_trace.c b/tools/lib/bpf/gen_trace.c
new file mode 100644
index 000000000000..1a80a8dd1c9f
--- /dev/null
+++ b/tools/lib/bpf/gen_trace.c
@@ -0,0 +1,529 @@
+// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+/* Copyright (c) 2021 Facebook */
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+#include <linux/filter.h>
+#include "btf.h"
+#include "bpf.h"
+#include "libbpf.h"
+#include "libbpf_internal.h"
+#include "hashmap.h"
+#include "bpf_gen_internal.h"
+
+#define MAX_USED_MAPS 64
+#define MAX_USED_PROGS 32
+
+/* The following structure describes the stack layout of the loader program.
+ * In addition R6 contains the pointer to context.
+ * R7 contains the result of the last sys_bpf command (typically error or FD).
+ */
+struct loader_stack {
+	__u32 btf_fd;
+	__u32 map_fd[MAX_USED_MAPS];
+	__u32 prog_fd[MAX_USED_PROGS];
+	__u32 inner_map_fd;
+	__u32 last_btf_id;
+	__u32 last_attach_btf_obj_fd;
+};
+#define stack_off(field) (__s16)(-sizeof(struct loader_stack) + offsetof(struct loader_stack, field))
+
+static int bpf_gen__realloc_insn_buf(struct bpf_gen *gen, __u32 size)
+{
+	size_t off = gen->insn_cur - gen->insn_start;
+
+	if (gen->error)
+		return -ENOMEM;
+	if (off + size > UINT32_MAX) {
+		gen->error = -ERANGE;
+		return -ERANGE;
+	}
+	gen->insn_start = realloc(gen->insn_start, off + size);
+	if (!gen->insn_start) {
+		gen->error = -ENOMEM;
+		return -ENOMEM;
+	}
+	gen->insn_cur = gen->insn_start + off;
+	return 0;
+}
+
+static int bpf_gen__realloc_data_buf(struct bpf_gen *gen, __u32 size)
+{
+	size_t off = gen->data_cur - gen->data_start;
+
+	if (gen->error)
+		return -ENOMEM;
+	if (off + size > UINT32_MAX) {
+		gen->error = -ERANGE;
+		return -ERANGE;
+	}
+	gen->data_start = realloc(gen->data_start, off + size);
+	if (!gen->data_start) {
+		gen->error = -ENOMEM;
+		return -ENOMEM;
+	}
+	gen->data_cur = gen->data_start + off;
+	return 0;
+}
+
+static void bpf_gen__emit(struct bpf_gen *gen, struct bpf_insn insn)
+{
+	if (bpf_gen__realloc_insn_buf(gen, sizeof(insn)))
+		return;
+	memcpy(gen->insn_cur, &insn, sizeof(insn));
+	gen->insn_cur += sizeof(insn);
+}
+
+static void bpf_gen__emit2(struct bpf_gen *gen, struct bpf_insn insn1, struct bpf_insn insn2)
+{
+	bpf_gen__emit(gen, insn1);
+	bpf_gen__emit(gen, insn2);
+}
+
+void bpf_gen__init(struct bpf_gen *gen, int log_level)
+{
+	gen->log_level = log_level;
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_6, BPF_REG_1));
+	bpf_gen__emit(gen, BPF_ST_MEM(BPF_W, BPF_REG_10, stack_off(last_attach_btf_obj_fd), 0));
+}
+
+static int bpf_gen__add_data(struct bpf_gen *gen, const void *data, __u32 size)
+{
+	void *prev;
+
+	if (bpf_gen__realloc_data_buf(gen, size))
+		return 0;
+	prev = gen->data_cur;
+	memcpy(gen->data_cur, data, size);
+	gen->data_cur += size;
+	return prev - gen->data_start;
+}
+
+static int insn_bytes_to_bpf_size(__u32 sz)
+{
+	switch (sz) {
+	case 8: return BPF_DW;
+	case 4: return BPF_W;
+	case 2: return BPF_H;
+	case 1: return BPF_B;
+	default: return -1;
+	}
+}
+
+/* *(u64 *)(blob + off) = (u64)(void *)(blob + data) */
+static void bpf_gen__emit_rel_store(struct bpf_gen *gen, int off, int data)
+{
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, data));
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, off));
+	bpf_gen__emit(gen, BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, 0));
+}
+
+/* *(u64 *)(blob + off) = (u64)(void *)(%sp + stack_off) */
+static void bpf_gen__emit_rel_store_sp(struct bpf_gen *gen, int off, int stack_off)
+{
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_0, BPF_REG_10));
+	bpf_gen__emit(gen, BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, stack_off));
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, off));
+	bpf_gen__emit(gen, BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, 0));
+}
+
+static void bpf_gen__move_ctx2blob(struct bpf_gen *gen, int off, int size, int ctx_off)
+{
+	bpf_gen__emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_6, ctx_off));
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, off));
+	bpf_gen__emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_1, BPF_REG_0, 0));
+}
+
+static void bpf_gen__move_stack2blob(struct bpf_gen *gen, int off, int size, int stack_off)
+{
+	bpf_gen__emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_10, stack_off));
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, off));
+	bpf_gen__emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_1, BPF_REG_0, 0));
+}
+
+static void bpf_gen__move_stack2ctx(struct bpf_gen *gen, int ctx_off, int size, int stack_off)
+{
+	bpf_gen__emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_10, stack_off));
+	bpf_gen__emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_6, BPF_REG_0, ctx_off));
+}
+
+static void bpf_gen__emit_sys_bpf(struct bpf_gen *gen, int cmd, int attr, int attr_size)
+{
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_1, cmd));
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_2, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, attr));
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_3, attr_size));
+	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_bpf));
+	/* remember the result in R7 */
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0));
+}
+
+static void bpf_gen__emit_check_err(struct bpf_gen *gen)
+{
+	bpf_gen__emit(gen, BPF_JMP_IMM(BPF_JSGE, BPF_REG_7, 0, 2));
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_0, BPF_REG_7));
+	bpf_gen__emit(gen, BPF_EXIT_INSN());
+}
+
+static void __bpf_gen__debug(struct bpf_gen *gen, int reg1, int reg2, const char *fmt, va_list args)
+{
+	char buf[1024];
+	int addr, len, ret;
+
+	if (!gen->log_level)
+		return;
+	ret = vsnprintf(buf, sizeof(buf), fmt, args);
+	if (ret < 1024 - 7 && reg1 >= 0 && reg2 < 0)
+		strcat(buf, " r=%d");
+	len = strlen(buf) + 1;
+	addr = bpf_gen__add_data(gen, buf, len);
+
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, addr));
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_2, len));
+	if (reg1 >= 0)
+		bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_3, reg1));
+	if (reg2 >= 0)
+		bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_4, reg2));
+	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_trace_printk));
+}
+
+static void bpf_gen__debug_regs(struct bpf_gen *gen, int reg1, int reg2, const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	__bpf_gen__debug(gen, reg1, reg2, fmt, args);
+	va_end(args);
+}
+
+static void bpf_gen__debug_ret(struct bpf_gen *gen, const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	__bpf_gen__debug(gen, BPF_REG_7, -1, fmt, args);
+	va_end(args);
+}
+
+static void bpf_gen__emit_sys_close(struct bpf_gen *gen, int stack_off)
+{
+	bpf_gen__emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, stack_off));
+	bpf_gen__emit(gen, BPF_JMP_IMM(BPF_JSLE, BPF_REG_1, 0, 2 + (gen->log_level ? 6 : 0)));
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_9, BPF_REG_1));
+	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_close));
+	bpf_gen__debug_regs(gen, BPF_REG_9, BPF_REG_0, "close(%%d) = %%d");
+}
+
+int bpf_gen__finish(struct bpf_gen *gen)
+{
+	int i;
+
+	bpf_gen__emit_sys_close(gen, stack_off(btf_fd));
+	for (i = 0; i < gen->nr_progs; i++)
+		bpf_gen__move_stack2ctx(gen, offsetof(struct bpf_loader_ctx,
+						      u[gen->nr_maps + i].prog_fd), 4,
+					stack_off(prog_fd[i]));
+	for (i = 0; i < gen->nr_maps; i++)
+		bpf_gen__move_stack2ctx(gen, offsetof(struct bpf_loader_ctx,
+						      u[i].map_fd), 4,
+					stack_off(map_fd[i]));
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_0, 0));
+	bpf_gen__emit(gen, BPF_EXIT_INSN());
+	pr_debug("bpf_gen__finish %d\n", gen->error);
+	return gen->error;
+}
+
+void bpf_gen__load_btf(struct bpf_gen *gen, const void *btf_raw_data, __u32 btf_raw_size)
+{
+	union bpf_attr attr = {};
+	int attr_size = offsetofend(union bpf_attr, btf_log_level);
+	int btf_data, btf_load_attr;
+
+	pr_debug("btf_load: size %d\n", btf_raw_size);
+	btf_data = bpf_gen__add_data(gen, btf_raw_data, btf_raw_size);
+
+	attr.btf_size = btf_raw_size;
+	btf_load_attr = bpf_gen__add_data(gen, &attr, attr_size);
+
+	/* populate union bpf_attr with user provided log details */
+	bpf_gen__move_ctx2blob(gen, btf_load_attr + offsetof(union bpf_attr, btf_log_level), 4,
+			       offsetof(struct bpf_loader_ctx, log_level));
+	bpf_gen__move_ctx2blob(gen, btf_load_attr + offsetof(union bpf_attr, btf_log_size), 4,
+			       offsetof(struct bpf_loader_ctx, log_size));
+	bpf_gen__move_ctx2blob(gen, btf_load_attr + offsetof(union bpf_attr, btf_log_buf), 8,
+			       offsetof(struct bpf_loader_ctx, log_buf));
+	/* populate union bpf_attr with a pointer to the BTF data */
+	bpf_gen__emit_rel_store(gen, btf_load_attr + offsetof(union bpf_attr, btf), btf_data);
+	/* emit BTF_LOAD command */
+	bpf_gen__emit_sys_bpf(gen, BPF_BTF_LOAD, btf_load_attr, attr_size);
+	bpf_gen__debug_ret(gen, "btf_load size %d", btf_raw_size);
+	bpf_gen__emit_check_err(gen);
+	/* remember btf_fd in the stack, if successful */
+	bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(btf_fd)));
+}
+
+void bpf_gen__map_create(struct bpf_gen *gen, struct bpf_create_map_attr *map_attr, int map_idx)
+{
+	union bpf_attr attr = {};
+	int attr_size = offsetofend(union bpf_attr, btf_vmlinux_value_type_id);
+	bool close_inner_map_fd = false;
+	int map_create_attr;
+
+	attr.map_type = map_attr->map_type;
+	attr.key_size = map_attr->key_size;
+	attr.value_size = map_attr->value_size;
+	attr.map_flags = map_attr->map_flags;
+	memcpy(attr.map_name, map_attr->name,
+	       min((unsigned)strlen(map_attr->name), BPF_OBJ_NAME_LEN - 1));
+	attr.numa_node = map_attr->numa_node;
+	attr.map_ifindex = map_attr->map_ifindex;
+	attr.max_entries = map_attr->max_entries;
+	switch (attr.map_type) {
+	case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
+	case BPF_MAP_TYPE_CGROUP_ARRAY:
+	case BPF_MAP_TYPE_STACK_TRACE:
+	case BPF_MAP_TYPE_ARRAY_OF_MAPS:
+	case BPF_MAP_TYPE_HASH_OF_MAPS:
+	case BPF_MAP_TYPE_DEVMAP:
+	case BPF_MAP_TYPE_DEVMAP_HASH:
+	case BPF_MAP_TYPE_CPUMAP:
+	case BPF_MAP_TYPE_XSKMAP:
+	case BPF_MAP_TYPE_SOCKMAP:
+	case BPF_MAP_TYPE_SOCKHASH:
+	case BPF_MAP_TYPE_QUEUE:
+	case BPF_MAP_TYPE_STACK:
+	case BPF_MAP_TYPE_RINGBUF:
+		break;
+	default:
+		attr.btf_key_type_id = map_attr->btf_key_type_id;
+		attr.btf_value_type_id = map_attr->btf_value_type_id;
+	}
+
+	pr_debug("map_create: %s idx %d type %d value_type_id %d\n",
+		 attr.map_name, map_idx, map_attr->map_type, attr.btf_value_type_id);
+
+	map_create_attr = bpf_gen__add_data(gen, &attr, attr_size);
+	if (attr.btf_value_type_id)
+		/* populate union bpf_attr with btf_fd saved in the stack earlier */
+		bpf_gen__move_stack2blob(gen, map_create_attr + offsetof(union bpf_attr, btf_fd), 4,
+					 stack_off(btf_fd));
+	switch (attr.map_type) {
+	case BPF_MAP_TYPE_ARRAY_OF_MAPS:
+	case BPF_MAP_TYPE_HASH_OF_MAPS:
+		bpf_gen__move_stack2blob(gen, map_create_attr + offsetof(union bpf_attr, inner_map_fd),
+					 4, stack_off(inner_map_fd));
+		close_inner_map_fd = true;
+		break;
+	default:;
+	}
+	/* emit MAP_CREATE command */
+	bpf_gen__emit_sys_bpf(gen, BPF_MAP_CREATE, map_create_attr, attr_size);
+	bpf_gen__debug_ret(gen, "map_create %s idx %d type %d value_size %d",
+			   attr.map_name, map_idx, map_attr->map_type, attr.value_size);
+	bpf_gen__emit_check_err(gen);
+	/* remember map_fd in the stack, if successful */
+	if (map_idx < 0) {
+		bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(inner_map_fd)));
+	} else {
+		if (map_idx != gen->nr_maps) {
+			gen->error = -EDOM; /* internal bug */
+			return;
+		}
+		bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(map_fd[map_idx])));
+		gen->nr_maps++;
+	}
+	if (close_inner_map_fd)
+		bpf_gen__emit_sys_close(gen, stack_off(inner_map_fd));
+}
+
+void bpf_gen__record_find_name(struct bpf_gen *gen, const char *attach_name,
+			       enum bpf_attach_type type)
+{
+	const char *prefix;
+	int kind, len, name;
+
+	btf_get_kernel_prefix_kind(type, &prefix, &kind);
+	pr_debug("find_btf_id '%s%s'\n", prefix, attach_name);
+	len = strlen(prefix);
+	if (len)
+		name = bpf_gen__add_data(gen, prefix, len);
+	name = bpf_gen__add_data(gen, attach_name, strlen(attach_name) + 1);
+	name -= len;
+
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_1, 0));
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_2, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, name));
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_3, kind));
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_4, BPF_REG_10));
+	bpf_gen__emit(gen, BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, stack_off(last_attach_btf_obj_fd)));
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_5, 0));
+	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_btf_find_by_name_kind));
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0));
+	bpf_gen__debug_ret(gen, "find_by_name_kind(%s%s,%d)", prefix, attach_name, kind);
+	bpf_gen__emit_check_err(gen);
+	/* remember btf_id */
+	bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(last_btf_id)));
+}
+
+void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, int kind, int insn_idx)
+{
+	struct relo_desc *relo;
+
+	relo = libbpf_reallocarray(gen->relos, gen->relo_cnt + 1, sizeof(*relo));
+	if (!relo) {
+		gen->error = -ENOMEM;
+		return;
+	}
+	gen->relos = relo;
+	relo += gen->relo_cnt;
+	relo->name = name;
+	relo->kind = kind;
+	relo->insn_idx = insn_idx;
+	gen->relo_cnt++;
+}
+
+static void bpf_gen__emit_relo(struct bpf_gen *gen, struct relo_desc *relo, int insns)
+{
+	int name, insn;
+
+	pr_debug("relo: %s at %d\n", relo->name, relo->insn_idx);
+	name = bpf_gen__add_data(gen, relo->name, strlen(relo->name) + 1);
+
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_1, 0));
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_2, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, name));
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_3, relo->kind));
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_4, BPF_REG_10));
+	bpf_gen__emit(gen, BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, stack_off(last_attach_btf_obj_fd)));
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_5, 0));
+	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_btf_find_by_name_kind));
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0));
+	bpf_gen__debug_ret(gen, "find_by_name_kind(%s,%d)", relo->name, relo->kind);
+	bpf_gen__emit_check_err(gen);
+	/* store btf_id into insn[insn_idx].imm */
+	insn = (int)(long)&((struct bpf_insn *)(long)insns)[relo->insn_idx].imm;
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, insn));
+	bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, 0));
+}
+
+void bpf_gen__prog_load(struct bpf_gen *gen, struct bpf_prog_load_params *load_attr, int prog_idx)
+{
+	union bpf_attr attr = {};
+	int attr_size = offsetofend(union bpf_attr, fd_array);
+	int prog_load_attr, license, insns, func_info, line_info, i;
+
+	pr_debug("prog_load: type %d insns_cnt %zd\n",
+		 load_attr->prog_type, load_attr->insn_cnt);
+	/* add license string to blob of bytes */
+	license = bpf_gen__add_data(gen, load_attr->license, strlen(load_attr->license) + 1);
+	/* add insns to blob of bytes */
+	insns = bpf_gen__add_data(gen, load_attr->insns,
+				  load_attr->insn_cnt * sizeof(struct bpf_insn));
+
+	attr.prog_type = load_attr->prog_type;
+	attr.expected_attach_type = load_attr->expected_attach_type;
+	attr.attach_btf_id = load_attr->attach_btf_id;
+	attr.prog_ifindex = load_attr->prog_ifindex;
+	attr.kern_version = 0;
+	attr.insn_cnt = (__u32)load_attr->insn_cnt;
+	attr.prog_flags = load_attr->prog_flags;
+
+	attr.func_info_rec_size = load_attr->func_info_rec_size;
+	attr.func_info_cnt = load_attr->func_info_cnt;
+	func_info = bpf_gen__add_data(gen, load_attr->func_info,
+				      attr.func_info_cnt * attr.func_info_rec_size);
+
+	attr.line_info_rec_size = load_attr->line_info_rec_size;
+	attr.line_info_cnt = load_attr->line_info_cnt;
+	line_info = bpf_gen__add_data(gen, load_attr->line_info,
+				      attr.line_info_cnt * attr.line_info_rec_size);
+
+	memcpy(attr.prog_name, load_attr->name,
+	       min((unsigned)strlen(load_attr->name), BPF_OBJ_NAME_LEN - 1));
+	prog_load_attr = bpf_gen__add_data(gen, &attr, attr_size);
+
+	/* populate union bpf_attr with a pointer to license */
+	bpf_gen__emit_rel_store(gen, prog_load_attr + offsetof(union bpf_attr, license), license);
+
+	/* populate union bpf_attr with a pointer to instructions */
+	bpf_gen__emit_rel_store(gen, prog_load_attr + offsetof(union bpf_attr, insns), insns);
+
+	/* populate union bpf_attr with a pointer to func_info */
+	bpf_gen__emit_rel_store(gen, prog_load_attr + offsetof(union bpf_attr, func_info), func_info);
+
+	/* populate union bpf_attr with a pointer to line_info */
+	bpf_gen__emit_rel_store(gen, prog_load_attr + offsetof(union bpf_attr, line_info), line_info);
+
+	/* populate union bpf_attr fd_array with a pointer to stack where map_fds are saved */
+	bpf_gen__emit_rel_store_sp(gen, prog_load_attr + offsetof(union bpf_attr, fd_array),
+				   stack_off(map_fd[0]));
+
+	/* populate union bpf_attr with user provided log details */
+	bpf_gen__move_ctx2blob(gen, prog_load_attr + offsetof(union bpf_attr, log_level), 4,
+			       offsetof(struct bpf_loader_ctx, log_level));
+	bpf_gen__move_ctx2blob(gen, prog_load_attr + offsetof(union bpf_attr, log_size), 4,
+			       offsetof(struct bpf_loader_ctx, log_size));
+	bpf_gen__move_ctx2blob(gen, prog_load_attr + offsetof(union bpf_attr, log_buf), 8,
+			       offsetof(struct bpf_loader_ctx, log_buf));
+	/* populate union bpf_attr with btf_fd saved in the stack earlier */
+	bpf_gen__move_stack2blob(gen, prog_load_attr + offsetof(union bpf_attr, prog_btf_fd), 4,
+				 stack_off(btf_fd));
+	if (attr.attach_btf_id) {
+		/* populate union bpf_attr with btf_id and obj_fd found by helper */
+		bpf_gen__move_stack2blob(gen, prog_load_attr + offsetof(union bpf_attr, attach_btf_id), 4,
+					 stack_off(last_btf_id));
+		bpf_gen__move_stack2blob(gen, prog_load_attr + offsetof(union bpf_attr, attach_btf_obj_fd), 4,
+					 stack_off(last_attach_btf_obj_fd));
+	}
+	for (i = 0; i < gen->relo_cnt; i++)
+		bpf_gen__emit_relo(gen, gen->relos + i, insns);
+	if (gen->relo_cnt) {
+		free(gen->relos);
+		gen->relo_cnt = 0;
+		gen->relos = NULL;
+	}
+	/* emit PROG_LOAD command */
+	bpf_gen__emit_sys_bpf(gen, BPF_PROG_LOAD, prog_load_attr, attr_size);
+	bpf_gen__debug_ret(gen, "prog_load %s insn_cnt %d", attr.prog_name, attr.insn_cnt);
+	bpf_gen__emit_check_err(gen);
+	/* remember prog_fd in the stack, if successful */
+	bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(prog_fd[gen->nr_progs])));
+	if (attr.attach_btf_id)
+		bpf_gen__emit_sys_close(gen, stack_off(last_attach_btf_obj_fd));
+	gen->nr_progs++;
+}
+
+void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue, __u32 value_size)
+{
+	union bpf_attr attr = {};
+	int attr_size = offsetofend(union bpf_attr, flags);
+	int map_update_attr, value, key;
+	int zero = 0;
+
+	pr_debug("map_update_elem: idx %d\n", map_idx);
+	value = bpf_gen__add_data(gen, pvalue, value_size);
+	key = bpf_gen__add_data(gen, &zero, sizeof(zero));
+	map_update_attr = bpf_gen__add_data(gen, &attr, attr_size);
+	bpf_gen__move_stack2blob(gen, map_update_attr + offsetof(union bpf_attr, map_fd), 4,
+				 stack_off(map_fd[map_idx]));
+	bpf_gen__emit_rel_store(gen, map_update_attr + offsetof(union bpf_attr, key), key);
+	bpf_gen__emit_rel_store(gen, map_update_attr + offsetof(union bpf_attr, value), value);
+	/* emit MAP_UPDATE_ELEM command */
+	bpf_gen__emit_sys_bpf(gen, BPF_MAP_UPDATE_ELEM, map_update_attr, attr_size);
+	bpf_gen__debug_ret(gen, "update_elem idx %d value_size %d", map_idx, value_size);
+	bpf_gen__emit_check_err(gen);
+}
+
+void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx)
+{
+	union bpf_attr attr = {};
+	int attr_size = offsetofend(union bpf_attr, map_fd);
+	int map_freeze_attr;
+
+	pr_debug("map_freeze: idx %d\n", map_idx);
+	map_freeze_attr = bpf_gen__add_data(gen, &attr, attr_size);
+	bpf_gen__move_stack2blob(gen, map_freeze_attr + offsetof(union bpf_attr, map_fd), 4,
+				 stack_off(map_fd[map_idx]));
+	/* emit MAP_FREEZE command */
+	bpf_gen__emit_sys_bpf(gen, BPF_MAP_FREEZE, map_freeze_attr, attr_size);
+	bpf_gen__debug_ret(gen, "map_freeze");
+	bpf_gen__emit_check_err(gen);
+}
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 083e441d9c5e..a61b4d401527 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -54,6 +54,7 @@
 #include "str_error.h"
 #include "libbpf_internal.h"
 #include "hashmap.h"
+#include "bpf_gen_internal.h"
 
 #ifndef BPF_FS_MAGIC
 #define BPF_FS_MAGIC		0xcafe4a11
@@ -435,6 +436,8 @@ struct bpf_object {
 	bool loaded;
 	bool has_subcalls;
 
+	struct bpf_gen *gen_trace;
+
 	/*
 	 * Information when doing elf related work. Only valid if fd
 	 * is valid.
@@ -2651,7 +2654,15 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj)
 		bpf_object__sanitize_btf(obj, kern_btf);
 	}
 
-	err = btf__load(kern_btf);
+	if (obj->gen_trace) {
+		__u32 raw_size = 0;
+		const void *raw_data = btf__get_raw_data(kern_btf, &raw_size);
+
+		bpf_gen__load_btf(obj->gen_trace, raw_data, raw_size);
+		btf__set_fd(kern_btf, 0);
+	} else {
+		err = btf__load(kern_btf);
+	}
 	if (sanitize) {
 		if (!err) {
 			/* move fd to libbpf's BTF */
@@ -4277,6 +4288,17 @@ static bool kernel_supports(enum kern_feature_id feat_id)
 	return READ_ONCE(feat->res) == FEAT_SUPPORTED;
 }
 
+static void mark_feat_supported(enum kern_feature_id last_feat)
+{
+	struct kern_feature_desc *feat;
+	int i;
+
+	for (i = 0; i <= last_feat; i++) {
+		feat = &feature_probes[i];
+		WRITE_ONCE(feat->res, FEAT_SUPPORTED);
+	}
+}
+
 static bool map_is_reuse_compat(const struct bpf_map *map, int map_fd)
 {
 	struct bpf_map_info map_info = {};
@@ -4344,6 +4366,13 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
 	char *cp, errmsg[STRERR_BUFSIZE];
 	int err, zero = 0;
 
+	if (obj->gen_trace) {
+		bpf_gen__map_update_elem(obj->gen_trace, map - obj->maps,
+					 map->mmaped, map->def.value_size);
+		if (map_type == LIBBPF_MAP_RODATA || map_type == LIBBPF_MAP_KCONFIG)
+			bpf_gen__map_freeze(obj->gen_trace, map - obj->maps);
+		return 0;
+	}
 	err = bpf_map_update_elem(map->fd, &zero, map->mmaped, 0);
 	if (err) {
 		err = -errno;
@@ -4369,7 +4398,7 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
 
 static void bpf_map__destroy(struct bpf_map *map);
 
-static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map)
+static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, bool is_inner)
 {
 	struct bpf_create_map_attr create_attr;
 	struct bpf_map_def *def = &map->def;
@@ -4415,9 +4444,9 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map)
 
 	if (bpf_map_type__is_map_in_map(def->type)) {
 		if (map->inner_map) {
-			int err;
+			int err = 0;
 
-			err = bpf_object__create_map(obj, map->inner_map);
+			err = bpf_object__create_map(obj, map->inner_map, true);
 			if (err) {
 				pr_warn("map '%s': failed to create inner map: %d\n",
 					map->name, err);
@@ -4429,7 +4458,12 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map)
 			create_attr.inner_map_fd = map->inner_map_fd;
 	}
 
-	map->fd = bpf_create_map_xattr(&create_attr);
+	if (obj->gen_trace) {
+		bpf_gen__map_create(obj->gen_trace, &create_attr, is_inner ? -1 : map - obj->maps);
+		map->fd = 0;
+	} else {
+		map->fd = bpf_create_map_xattr(&create_attr);
+	}
 	if (map->fd < 0 && (create_attr.btf_key_type_id ||
 			    create_attr.btf_value_type_id)) {
 		char *cp, errmsg[STRERR_BUFSIZE];
@@ -4457,11 +4491,11 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map)
 	return 0;
 }
 
-static int init_map_slots(struct bpf_map *map)
+static int init_map_slots(struct bpf_object *obj, struct bpf_map *map)
 {
 	const struct bpf_map *targ_map;
 	unsigned int i;
-	int fd, err;
+	int fd, err = 0;
 
 	for (i = 0; i < map->init_slots_sz; i++) {
 		if (!map->init_slots[i])
@@ -4469,7 +4503,12 @@ static int init_map_slots(struct bpf_map *map)
 
 		targ_map = map->init_slots[i];
 		fd = bpf_map__fd(targ_map);
-		err = bpf_map_update_elem(map->fd, &i, &fd, 0);
+		if (obj->gen_trace) {
+			printf("// TODO map_update_elem: idx %ld key %d value==map_idx %ld\n",
+			       map - obj->maps, i, targ_map - obj->maps);
+		} else {
+			err = bpf_map_update_elem(map->fd, &i, &fd, 0);
+		}
 		if (err) {
 			err = -errno;
 			pr_warn("map '%s': failed to initialize slot [%d] to map '%s' fd=%d: %d\n",
@@ -4511,7 +4550,7 @@ bpf_object__create_maps(struct bpf_object *obj)
 			pr_debug("map '%s': skipping creation (preset fd=%d)\n",
 				 map->name, map->fd);
 		} else {
-			err = bpf_object__create_map(obj, map);
+			err = bpf_object__create_map(obj, map, false);
 			if (err)
 				goto err_out;
 
@@ -4527,7 +4566,7 @@ bpf_object__create_maps(struct bpf_object *obj)
 			}
 
 			if (map->init_slots_sz) {
-				err = init_map_slots(map);
+				err = init_map_slots(obj, map);
 				if (err < 0) {
 					zclose(map->fd);
 					goto err_out;
@@ -4937,6 +4976,9 @@ static int load_module_btfs(struct bpf_object *obj)
 	if (obj->btf_modules_loaded)
 		return 0;
 
+	if (obj->gen_trace)
+		return 0;
+
 	/* don't do this again, even if we find no module BTFs */
 	obj->btf_modules_loaded = true;
 
@@ -6082,6 +6124,11 @@ static int bpf_core_apply_relo(struct bpf_program *prog,
 	if (str_is_empty(spec_str))
 		return -EINVAL;
 
+	if (prog->obj->gen_trace) {
+		printf("// TODO core_relo: prog %ld insn[%d] %s %s kind %d\n",
+		       prog - prog->obj->programs, relo->insn_off / 8,
+		       local_name, spec_str, relo->kind);
+	}
 	err = bpf_core_parse_spec(local_btf, local_id, spec_str, relo->kind, &local_spec);
 	if (err) {
 		pr_warn("prog '%s': relo #%d: parsing [%d] %s %s + %s failed: %d\n",
@@ -6818,6 +6865,19 @@ bpf_object__relocate_calls(struct bpf_object *obj, struct bpf_program *prog)
 
 	return 0;
 }
+static void
+bpf_object__free_relocs(struct bpf_object *obj)
+{
+	struct bpf_program *prog;
+	int i;
+
+	/* free up relocation descriptors */
+	for (i = 0; i < obj->nr_programs; i++) {
+		prog = &obj->programs[i];
+		zfree(&prog->reloc_desc);
+		prog->nr_reloc = 0;
+	}
+}
 
 static int
 bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
@@ -6867,12 +6927,8 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
 			return err;
 		}
 	}
-	/* free up relocation descriptors */
-	for (i = 0; i < obj->nr_programs; i++) {
-		prog = &obj->programs[i];
-		zfree(&prog->reloc_desc);
-		prog->nr_reloc = 0;
-	}
+	if (!obj->gen_trace)
+		bpf_object__free_relocs(obj);
 	return 0;
 }
 
@@ -7061,6 +7117,9 @@ static int bpf_object__sanitize_prog(struct bpf_object* obj, struct bpf_program
 	enum bpf_func_id func_id;
 	int i;
 
+	if (obj->gen_trace)
+		return 0;
+
 	for (i = 0; i < prog->insns_cnt; i++, insn++) {
 		if (!insn_is_helper_call(insn, &func_id))
 			continue;
@@ -7146,6 +7205,12 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
 	load_attr.log_level = prog->log_level;
 	load_attr.prog_flags = prog->prog_flags;
 
+	if (prog->obj->gen_trace) {
+		bpf_gen__prog_load(prog->obj->gen_trace, &load_attr,
+				   prog - prog->obj->programs);
+		*pfd = 0;
+		return 0;
+	}
 retry_load:
 	if (log_buf_size) {
 		log_buf = malloc(log_buf_size);
@@ -7223,6 +7288,35 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
 	return ret;
 }
 
+static int bpf_program__record_externs(struct bpf_program *prog)
+{
+	struct bpf_object *obj = prog->obj;
+	int i;
+
+	for (i = 0; i < prog->nr_reloc; i++) {
+		struct reloc_desc *relo = &prog->reloc_desc[i];
+		struct extern_desc *ext = &obj->externs[relo->sym_off];
+
+		switch (relo->type) {
+		case RELO_EXTERN_VAR:
+			if (ext->type != EXT_KSYM)
+				continue;
+			if (!ext->ksym.type_id) /* typeless ksym */
+				continue;
+			bpf_gen__record_extern(obj->gen_trace, ext->name, BTF_KIND_VAR,
+					       relo->insn_idx);
+			break;
+		case RELO_EXTERN_FUNC:
+			bpf_gen__record_extern(obj->gen_trace, ext->name, BTF_KIND_FUNC,
+					       relo->insn_idx);
+			break;
+		default:
+			continue;
+		}
+	}
+	return 0;
+}
+
 static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd, int *btf_type_id);
 
 int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver)
@@ -7268,6 +7362,8 @@ int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver)
 			pr_warn("prog '%s': inconsistent nr(%d) != 1\n",
 				prog->name, prog->instances.nr);
 		}
+		if (prog->obj->gen_trace)
+			bpf_program__record_externs(prog);
 		err = load_program(prog, prog->insns, prog->insns_cnt,
 				   license, kern_ver, &fd);
 		if (!err)
@@ -7359,6 +7455,8 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
 			return err;
 		}
 	}
+	if (obj->gen_trace)
+		bpf_object__free_relocs(obj);
 	free(fd_array);
 	return 0;
 }
@@ -7740,6 +7838,12 @@ static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj)
 		if (ext->type != EXT_KSYM || !ext->ksym.type_id)
 			continue;
 
+		if (obj->gen_trace) {
+			ext->is_set = true;
+			ext->ksym.kernel_btf_obj_fd = 0;
+			ext->ksym.kernel_btf_id = 0;
+			continue;
+		}
 		t = btf__type_by_id(obj->btf, ext->btf_id);
 		if (btf_is_var(t))
 			err = bpf_object__resolve_ksym_var_btf_id(obj, ext);
@@ -7854,6 +7958,11 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr)
 		return -EINVAL;
 	}
 
+	if (obj->gen_trace) {
+		mark_feat_supported(FEAT_FD_IDX);
+		bpf_gen__init(obj->gen_trace, attr->log_level);
+	}
+
 	err = bpf_object__probe_loading(obj);
 	err = err ? : bpf_object__load_vmlinux_btf(obj, false);
 	err = err ? : bpf_object__resolve_externs(obj, obj->kconfig);
@@ -7864,6 +7973,9 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr)
 	err = err ? : bpf_object__relocate(obj, attr->target_btf_path);
 	err = err ? : bpf_object__load_progs(obj, attr->log_level);
 
+	if (obj->gen_trace && !err)
+		err = bpf_gen__finish(obj->gen_trace);
+
 	/* clean up module BTFs */
 	for (i = 0; i < obj->btf_module_cnt; i++) {
 		close(obj->btf_modules[i].fd);
@@ -8579,6 +8691,11 @@ void *bpf_object__priv(const struct bpf_object *obj)
 	return obj ? obj->priv : ERR_PTR(-EINVAL);
 }
 
+void bpf_object__set_gen_trace(struct bpf_object *obj, struct bpf_gen *gen)
+{
+	obj->gen_trace = gen;
+}
+
 static struct bpf_program *
 __bpf_program__iter(const struct bpf_program *p, const struct bpf_object *obj,
 		    bool forward)
@@ -9215,6 +9332,28 @@ static int bpf_object__collect_st_ops_relos(struct bpf_object *obj,
 #define BTF_ITER_PREFIX "bpf_iter_"
 #define BTF_MAX_NAME_SIZE 128
 
+void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type,
+				const char **prefix, int *kind)
+{
+	switch (attach_type) {
+	case BPF_TRACE_RAW_TP:
+		*prefix = BTF_TRACE_PREFIX;
+		*kind = BTF_KIND_TYPEDEF;
+		break;
+	case BPF_LSM_MAC:
+		*prefix = BTF_LSM_PREFIX;
+		*kind = BTF_KIND_FUNC;
+		break;
+	case BPF_TRACE_ITER:
+		*prefix = BTF_ITER_PREFIX;
+		*kind = BTF_KIND_FUNC;
+		break;
+	default:
+		*prefix = "";
+		*kind = BTF_KIND_FUNC;
+	}
+}
+
 static int find_btf_by_prefix_kind(const struct btf *btf, const char *prefix,
 				   const char *name, __u32 kind)
 {
@@ -9235,21 +9374,11 @@ static int find_btf_by_prefix_kind(const struct btf *btf, const char *prefix,
 static inline int find_attach_btf_id(struct btf *btf, const char *name,
 				     enum bpf_attach_type attach_type)
 {
-	int err;
-
-	if (attach_type == BPF_TRACE_RAW_TP)
-		err = find_btf_by_prefix_kind(btf, BTF_TRACE_PREFIX, name,
-					      BTF_KIND_TYPEDEF);
-	else if (attach_type == BPF_LSM_MAC)
-		err = find_btf_by_prefix_kind(btf, BTF_LSM_PREFIX, name,
-					      BTF_KIND_FUNC);
-	else if (attach_type == BPF_TRACE_ITER)
-		err = find_btf_by_prefix_kind(btf, BTF_ITER_PREFIX, name,
-					      BTF_KIND_FUNC);
-	else
-		err = btf__find_by_name_kind(btf, name, BTF_KIND_FUNC);
+	const char *prefix;
+	int kind;
 
-	return err;
+	btf_get_kernel_prefix_kind(attach_type, &prefix, &kind);
+	return find_btf_by_prefix_kind(btf, prefix, name, kind);
 }
 
 int libbpf_find_vmlinux_btf_id(const char *name,
@@ -9348,7 +9477,7 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
 	__u32 attach_prog_fd = prog->attach_prog_fd;
 	const char *name = prog->sec_name, *attach_name;
 	const struct bpf_sec_def *sec = NULL;
-	int i, err;
+	int i, err = 0;
 
 	if (!name)
 		return -EINVAL;
@@ -9383,7 +9512,13 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
 	}
 
 	/* kernel/module BTF ID */
-	err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
+	if (prog->obj->gen_trace) {
+		bpf_gen__record_find_name(prog->obj->gen_trace, attach_name, attach_type);
+		*btf_obj_fd = 0;
+		*btf_type_id = 1;
+	} else {
+		err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
+	}
 	if (err) {
 		pr_warn("failed to find kernel BTF type ID of '%s': %d\n", attach_name, err);
 		return err;
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index b9b29baf1df8..a5dffc0a3369 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -361,4 +361,5 @@ LIBBPF_0.4.0 {
 		bpf_linker__new;
 		bpf_map__inner_map;
 		bpf_object__set_kversion;
+		bpf_load;
 } LIBBPF_0.3.0;
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 9114c7085f2a..fd5c57ac93f1 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -214,6 +214,8 @@ int bpf_object__section_size(const struct bpf_object *obj, const char *name,
 int bpf_object__variable_offset(const struct bpf_object *obj, const char *name,
 				__u32 *off);
 struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf);
+void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type,
+				const char **prefix, int *kind);
 
 struct btf_ext_info {
 	/*
-- 
2.30.2



* [PATCH bpf-next 14/15] bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command.
  2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (12 preceding siblings ...)
  2021-04-17  3:32 ` [PATCH bpf-next 13/15] libbpf: Generate loader program out of BPF ELF file Alexei Starovoitov
@ 2021-04-17  3:32 ` Alexei Starovoitov
  2021-04-17  3:32 ` [PATCH bpf-next 15/15] selftests/bpf: Convert few tests to light skeleton Alexei Starovoitov
  14 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add a -L flag to bpftool that uses the libbpf gen_trace facility and the
syscall/loader program for skeleton generation and program loading.

The "bpftool gen skeleton -L" command generates a "light skeleton" or "loader skeleton"
that is similar to the existing skeleton, but has one major difference:
$ bpftool gen skeleton lsm.o > lsm.skel.h
$ bpftool gen skeleton -L lsm.o > lsm.lskel.h
$ diff lsm.skel.h lsm.lskel.h
@@ -5,34 +4,34 @@
 #define __LSM_SKEL_H__

 #include <stdlib.h>
-#include <bpf/libbpf.h>
+#include <bpf/bpf.h>

The light skeleton does not use the majority of the libbpf infrastructure.
It doesn't need libelf and it doesn't parse the .o file.
It only needs a few sys_bpf wrappers, all of which are in the bpf/bpf.h file.
In the future libbpf/bpf.c can be inlined into bpf.h, so that not even libbpf.a
would be needed to work with the light skeleton.
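
To give a rough idea of how little a "sys_bpf wrapper" involves, here is a
minimal sketch of a raw bpf(2) call. The names sys_bpf_raw() and
create_array_map() are hypothetical and are not part of this patch or of
bpf/bpf.h; the sketch only assumes a Linux system with <linux/bpf.h> installed.

```c
#define _GNU_SOURCE
#include <string.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/bpf.h>

/* Hypothetical helper: a thin wrapper around the bpf(2) syscall.
 * Returns the syscall result, or -1 with errno set on failure.
 */
static long sys_bpf_raw(int cmd, union bpf_attr *attr, unsigned int size)
{
	return syscall(__NR_bpf, cmd, attr, size);
}

/* Hypothetical helper: create a one-element array map.
 * Returns a map fd, or -1 with errno set (e.g. when unprivileged).
 */
static int create_array_map(unsigned int value_size)
{
	union bpf_attr attr;

	/* unused fields of union bpf_attr must be zero */
	memset(&attr, 0, sizeof(attr));
	attr.map_type = BPF_MAP_TYPE_ARRAY;
	attr.key_size = 4;
	attr.value_size = value_size;
	attr.max_entries = 1;
	return (int)sys_bpf_raw(BPF_MAP_CREATE, &attr, sizeof(attr));
}
```

Everything else a light skeleton does at run time is expected to go through
calls of this shape, which is why libelf and the ELF parsing code are not needed.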

The "bpftool prog load -L file.o" command is introduced for debugging syscall/loader
program generation. Just like the same command without -L, it will try to load
the programs from file.o into the kernel. It won't even try to pin them.

The "bpftool prog load -L -d file.o" command will provide additional debug messages
on how the syscall/loader program was generated.
The execution of the syscall/loader program will also use bpf_trace_printk() for
each step of loading BTF, creating maps, and loading programs.
The user can do "cat /.../trace_pipe" for further debugging.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/bpf/bpftool/Makefile        |   2 +-
 tools/bpf/bpftool/gen.c           | 263 +++++++++++++++++++++++++++---
 tools/bpf/bpftool/main.c          |   7 +-
 tools/bpf/bpftool/main.h          |   1 +
 tools/bpf/bpftool/prog.c          |  78 +++++++++
 tools/bpf/bpftool/xlated_dumper.c |   3 +
 6 files changed, 330 insertions(+), 24 deletions(-)

diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index b3073ae84018..d16d289ade7a 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -136,7 +136,7 @@ endif
 
 BPFTOOL_BOOTSTRAP := $(BOOTSTRAP_OUTPUT)bpftool
 
-BOOTSTRAP_OBJS = $(addprefix $(BOOTSTRAP_OUTPUT),main.o common.o json_writer.o gen.o btf.o)
+BOOTSTRAP_OBJS = $(addprefix $(BOOTSTRAP_OUTPUT),main.o common.o json_writer.o gen.o btf.o xlated_dumper.o btf_dumper.o) $(OUTPUT)disasm.o
 OBJS = $(patsubst %.c,$(OUTPUT)%.o,$(SRCS)) $(OUTPUT)disasm.o
 
 VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux)				\
diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
index 31ade77f5ef8..470110c1e95f 100644
--- a/tools/bpf/bpftool/gen.c
+++ b/tools/bpf/bpftool/gen.c
@@ -18,6 +18,7 @@
 #include <sys/stat.h>
 #include <sys/mman.h>
 #include <bpf/btf.h>
+#include <bpf/bpf_gen_internal.h>
 
 #include "json_writer.h"
 #include "main.h"
@@ -268,6 +269,204 @@ static void codegen(const char *template, ...)
 	free(s);
 }
 
+static void print_hex(const char *obj_data, int file_sz)
+{
+	int i, len;
+
+	/* embed contents of BPF object file */
+	for (i = 0, len = 0; i < file_sz; i++) {
+		int w = obj_data[i] ? 4 : 2;
+
+		len += w;
+		if (len > 78) {
+			printf("\\\n");
+			len = w;
+		}
+		if (!obj_data[i])
+			printf("\\0");
+		else
+			printf("\\x%02x", (unsigned char)obj_data[i]);
+	}
+}
+
+static size_t bpf_map_mmap_sz(const struct bpf_map *map)
+{
+	long page_sz = sysconf(_SC_PAGE_SIZE);
+	size_t map_sz;
+
+	map_sz = (size_t)roundup(bpf_map__value_size(map), 8) * bpf_map__max_entries(map);
+	map_sz = roundup(map_sz, page_sz);
+	return map_sz;
+}
+
+static int gen_trace(struct bpf_object *obj, const char *obj_name, const char *header_guard)
+{
+	struct bpf_object_load_attr load_attr = {};
+	struct bpf_program *prog;
+	struct bpf_gen gen = {};
+	int data_sz, insns_sz;
+	struct bpf_map *map;
+	char *data, *insns;
+	int err = 0;
+
+	bpf_object__set_gen_trace(obj, &gen);
+
+	load_attr.obj = obj;
+	if (verifier_logs)
+		/* log_level1 + log_level2 + stats, but not stable UAPI */
+		load_attr.log_level = 1 + 2 + 4;
+
+	err = bpf_object__load_xattr(&load_attr);
+	if (err) {
+		p_err("failed to load object file");
+		goto out;
+	}
+	data = gen.data_start;
+	data_sz = gen.data_cur - gen.data_start;
+	insns = gen.insn_start;
+	insns_sz = gen.insn_cur - gen.insn_start;
+
+	/* finish 'struct skel' */
+	codegen("\
+		\n\
+		};							    \n\
+		", obj_name);
+
+	codegen("\
+		\n\
+									    \n\
+		static inline int					    \n\
+		%1$s__attach(struct %1$s *skel)				    \n\
+		{							    \n\
+		", obj_name);
+
+	bpf_object__for_each_program(prog, obj) {
+		printf("\tskel->links.%1$s_fd =\n"
+		       "\t\tbpf_raw_tracepoint_open(NULL, skel->progs.%1$s.prog_fd);\n",
+		       bpf_program__name(prog));
+	}
+
+	codegen("\
+		\n\
+			return 0;					    \n\
+		}							    \n\
+									    \n\
+		static inline void					    \n\
+		%1$s__detach(struct %1$s *skel)				    \n\
+		{							    \n\
+		", obj_name);
+	bpf_object__for_each_program(prog, obj) {
+		printf("\tclose(skel->links.%1$s_fd);\n",
+		       bpf_program__name(prog));
+	}
+	codegen("\
+		\n\
+		}							    \n\
+		");
+
+	codegen("\
+		\n\
+		static void						    \n\
+		%1$s__destroy(struct %1$s *skel)			    \n\
+		{							    \n\
+			if (!skel)					    \n\
+				return;					    \n\
+			%1$s__detach(skel);				    \n\
+			free(skel);					    \n\
+		}							    \n\
+									    \n\
+		static inline struct %1$s *				    \n\
+		%1$s__open(void)					    \n\
+		{							    \n\
+			struct %1$s *skel;				    \n\
+									    \n\
+			skel = calloc(sizeof(*skel), 1);		    \n\
+			if (!skel)					    \n\
+				return NULL;				    \n\
+			skel->ctx.sz = (void *)&skel->links - (void *)skel; \n\
+			return skel;					    \n\
+		}							    \n\
+									    \n\
+		static inline int					    \n\
+		%1$s__load(struct %1$s *skel)				    \n\
+		{							    \n\
+			struct bpf_load_opts opts = {};			    \n\
+			int err;					    \n\
+									    \n\
+			opts.sz = sizeof(opts);				    \n\
+			opts.ctx = (struct bpf_loader_ctx *)skel;	    \n\
+			opts.data_sz = %2$d;				    \n\
+			opts.data = (void *)\"\\			    \n\
+		",
+		obj_name, data_sz);
+	print_hex(data, data_sz);
+	codegen("\
+		\n\
+		\";							    \n\
+		");
+
+	codegen("\
+		\n\
+			opts.insns_sz = %d;				    \n\
+			opts.insns = (void *)\"\\			    \n\
+		",
+		insns_sz);
+	print_hex(insns, insns_sz);
+	codegen("\
+		\n\
+		\";							    \n\
+			err = bpf_load(&opts);				    \n\
+			if (err < 0)					    \n\
+				return err;				    \n\
+		", obj_name);
+	bpf_object__for_each_map(map, obj) {
+		const char *ident;
+
+		ident = get_map_ident(map);
+		if (!ident)
+			continue;
+
+		if (!bpf_map__is_internal(map) ||
+		    !(bpf_map__def(map)->map_flags & BPF_F_MMAPABLE))
+			continue;
+
+		printf("\tskel->%1$s =\n"
+		       "\t\tmmap(NULL, %2$zd, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED,\n"
+		       "\t\t\tskel->maps.%1$s.map_fd, 0);\n",
+		       ident, bpf_map_mmap_sz(map));
+	}
+	codegen("\
+		\n\
+			return 0;					    \n\
+		}							    \n\
+									    \n\
+		static inline struct %1$s *				    \n\
+		%1$s__open_and_load(void)				    \n\
+		{							    \n\
+			struct %1$s *skel;				    \n\
+									    \n\
+			skel = %1$s__open();				    \n\
+			if (!skel)					    \n\
+				return NULL;				    \n\
+			if (%1$s__load(skel)) {				    \n\
+				%1$s__destroy(skel);			    \n\
+				return NULL;				    \n\
+			}						    \n\
+			return skel;					    \n\
+		}							    \n\
+		", obj_name);
+
+	codegen("\
+		\n\
+									    \n\
+		#endif /* %s */						    \n\
+		",
+		header_guard);
+	err = 0;
+out:
+	return err;
+}
+
 static int do_skeleton(int argc, char **argv)
 {
 	char header_guard[MAX_OBJ_NAME_LEN + sizeof("__SKEL_H__")];
@@ -277,7 +476,7 @@ static int do_skeleton(int argc, char **argv)
 	struct bpf_object *obj = NULL;
 	const char *file, *ident;
 	struct bpf_program *prog;
-	int fd, len, err = -1;
+	int fd, err = -1;
 	struct bpf_map *map;
 	struct btf *btf;
 	struct stat st;
@@ -359,7 +558,25 @@ static int do_skeleton(int argc, char **argv)
 	}
 
 	get_header_guard(header_guard, obj_name);
-	codegen("\
+	if (use_loader)
+		codegen("\
+		\n\
+		/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */   \n\
+		/* THIS FILE IS AUTOGENERATED! */			    \n\
+		#ifndef %2$s						    \n\
+		#define %2$s						    \n\
+									    \n\
+		#include <stdlib.h>					    \n\
+		#include <sys/mman.h>					    \n\
+		#include <bpf/bpf.h>					    \n\
+									    \n\
+		struct %1$s {						    \n\
+			struct bpf_loader_ctx ctx;			    \n\
+		",
+		obj_name, header_guard
+		);
+	else
+		codegen("\
 		\n\
 		/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */   \n\
 									    \n\
@@ -375,7 +592,7 @@ static int do_skeleton(int argc, char **argv)
 			struct bpf_object *obj;				    \n\
 		",
 		obj_name, header_guard
-	);
+		);
 
 	if (map_cnt) {
 		printf("\tstruct {\n");
@@ -383,7 +600,10 @@ static int do_skeleton(int argc, char **argv)
 			ident = get_map_ident(map);
 			if (!ident)
 				continue;
-			printf("\t\tstruct bpf_map *%s;\n", ident);
+			if (use_loader)
+				printf("\t\tunion bpf_map_prog_desc %s;\n", ident);
+			else
+				printf("\t\tstruct bpf_map *%s;\n", ident);
 		}
 		printf("\t} maps;\n");
 	}
@@ -391,14 +611,22 @@ static int do_skeleton(int argc, char **argv)
 	if (prog_cnt) {
 		printf("\tstruct {\n");
 		bpf_object__for_each_program(prog, obj) {
-			printf("\t\tstruct bpf_program *%s;\n",
-			       bpf_program__name(prog));
+			if (use_loader)
+				printf("\t\tunion bpf_map_prog_desc %s;\n",
+				       bpf_program__name(prog));
+			else
+				printf("\t\tstruct bpf_program *%s;\n",
+				       bpf_program__name(prog));
 		}
 		printf("\t} progs;\n");
 		printf("\tstruct {\n");
 		bpf_object__for_each_program(prog, obj) {
-			printf("\t\tstruct bpf_link *%s;\n",
-			       bpf_program__name(prog));
+			if (use_loader)
+				printf("\t\tint %s_fd;\n",
+				       bpf_program__name(prog));
+			else
+				printf("\t\tstruct bpf_link *%s;\n",
+				       bpf_program__name(prog));
 		}
 		printf("\t} links;\n");
 	}
@@ -409,6 +637,10 @@ static int do_skeleton(int argc, char **argv)
 		if (err)
 			goto out;
 	}
+	if (use_loader) {
+		err = gen_trace(obj, obj_name, header_guard);
+		goto out;
+	}
 
 	codegen("\
 		\n\
@@ -577,20 +809,7 @@ static int do_skeleton(int argc, char **argv)
 		",
 		file_sz);
 
-	/* embed contents of BPF object file */
-	for (i = 0, len = 0; i < file_sz; i++) {
-		int w = obj_data[i] ? 4 : 2;
-
-		len += w;
-		if (len > 78) {
-			printf("\\\n");
-			len = w;
-		}
-		if (!obj_data[i])
-			printf("\\0");
-		else
-			printf("\\x%02x", (unsigned char)obj_data[i]);
-	}
+	print_hex(obj_data, file_sz);
 
 	codegen("\
 		\n\
diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
index d9afb730136a..7f2817d97079 100644
--- a/tools/bpf/bpftool/main.c
+++ b/tools/bpf/bpftool/main.c
@@ -29,6 +29,7 @@ bool show_pinned;
 bool block_mount;
 bool verifier_logs;
 bool relaxed_maps;
+bool use_loader;
 struct btf *base_btf;
 struct pinned_obj_table prog_table;
 struct pinned_obj_table map_table;
@@ -392,6 +393,7 @@ int main(int argc, char **argv)
 		{ "mapcompat",	no_argument,	NULL,	'm' },
 		{ "nomount",	no_argument,	NULL,	'n' },
 		{ "debug",	no_argument,	NULL,	'd' },
+		{ "use-loader",	no_argument,	NULL,	'L' },
 		{ "base-btf",	required_argument, NULL, 'B' },
 		{ 0 }
 	};
@@ -409,7 +411,7 @@ int main(int argc, char **argv)
 	hash_init(link_table.table);
 
 	opterr = 0;
-	while ((opt = getopt_long(argc, argv, "VhpjfmndB:",
+	while ((opt = getopt_long(argc, argv, "VhpjfLmndB:",
 				  options, NULL)) >= 0) {
 		switch (opt) {
 		case 'V':
@@ -452,6 +454,9 @@ int main(int argc, char **argv)
 				return -1;
 			}
 			break;
+		case 'L':
+			use_loader = true;
+			break;
 		default:
 			p_err("unrecognized option '%s'", argv[optind - 1]);
 			if (json_output)
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index 76e91641262b..c1cf29798b99 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -90,6 +90,7 @@ extern bool show_pids;
 extern bool block_mount;
 extern bool verifier_logs;
 extern bool relaxed_maps;
+extern bool use_loader;
 extern struct btf *base_btf;
 extern struct pinned_obj_table prog_table;
 extern struct pinned_obj_table map_table;
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index 3f067d2d7584..44d602ca1634 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -24,6 +24,7 @@
 #include <bpf/bpf.h>
 #include <bpf/btf.h>
 #include <bpf/libbpf.h>
+#include <bpf/bpf_gen_internal.h>
 
 #include "cfg.h"
 #include "main.h"
@@ -1645,8 +1646,85 @@ static int load_with_options(int argc, char **argv, bool first_prog_only)
 	return -1;
 }
 
+static int try_loader(struct bpf_gen *gen)
+{
+	DECLARE_LIBBPF_OPTS(bpf_load_opts, opts);
+	struct bpf_loader_ctx *ctx;
+	int ctx_sz = sizeof(*ctx) + 64 * sizeof(union bpf_map_prog_desc);
+	int log_buf_sz = (1u << 24) - 1;
+	char *log_buf;
+	int err;
+
+	ctx = alloca(sizeof(*ctx) + 64 * sizeof(union bpf_map_prog_desc));
+	ctx->sz = ctx_sz;
+	ctx->log_level = 4;
+	ctx->log_size = log_buf_sz;
+	log_buf = malloc(log_buf_sz);
+	if (!log_buf)
+		return -ENOMEM;
+	ctx->log_buf = (long)log_buf;
+	opts.ctx = ctx;
+	opts.data = gen->data_start;
+	opts.data_sz = gen->data_cur - gen->data_start;
+	opts.insns = gen->insn_start;
+	opts.insns_sz = gen->insn_cur - gen->insn_start;
+	err = bpf_load(&opts);
+	if (err < 0)
+		fprintf(stderr, "%s", log_buf);
+	free(log_buf);
+	return err;
+}
+
+static int do_loader(int argc, char **argv)
+{
+	DECLARE_LIBBPF_OPTS(bpf_object_open_opts, open_opts);
+	struct bpf_object_load_attr load_attr = {};
+	struct bpf_object *obj;
+	const char *file;
+	struct bpf_gen gen = {};
+	int err = -1;
+
+	if (!REQ_ARGS(1))
+		return -1;
+	file = GET_ARG();
+
+	obj = bpf_object__open_file(file, &open_opts);
+	if (IS_ERR_OR_NULL(obj)) {
+		p_err("failed to open object file");
+		goto err_close_obj;
+	}
+
+	bpf_object__set_gen_trace(obj, &gen);
+
+	load_attr.obj = obj;
+	if (verifier_logs)
+		/* log_level1 + log_level2 + stats, but not stable UAPI */
+		load_attr.log_level = 1 + 2 + 4;
+
+	err = bpf_object__load_xattr(&load_attr);
+	if (err) {
+		p_err("failed to load object file");
+		goto err_close_obj;
+	}
+
+	if (verifier_logs) {
+		struct dump_data dd = {};
+
+		kernel_syms_load(&dd);
+		dump_xlated_plain(&dd, gen.insn_start,
+				  gen.insn_cur - gen.insn_start, false, false);
+		kernel_syms_destroy(&dd);
+	}
+	err = try_loader(&gen);
+err_close_obj:
+	bpf_object__close(obj);
+	return err;
+}
+
 static int do_load(int argc, char **argv)
 {
+	if (use_loader)
+		return do_loader(argc, argv);
 	return load_with_options(argc, argv, true);
 }
 
diff --git a/tools/bpf/bpftool/xlated_dumper.c b/tools/bpf/bpftool/xlated_dumper.c
index 6fc3e6f7f40c..f1f32e21d5cd 100644
--- a/tools/bpf/bpftool/xlated_dumper.c
+++ b/tools/bpf/bpftool/xlated_dumper.c
@@ -196,6 +196,9 @@ static const char *print_imm(void *private_data,
 	else if (insn->src_reg == BPF_PSEUDO_MAP_VALUE)
 		snprintf(dd->scratch_buff, sizeof(dd->scratch_buff),
 			 "map[id:%u][0]+%u", insn->imm, (insn + 1)->imm);
+	else if (insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE)
+		snprintf(dd->scratch_buff, sizeof(dd->scratch_buff),
+			 "map[idx:%u]+%u", insn->imm, (insn + 1)->imm);
 	else if (insn->src_reg == BPF_PSEUDO_FUNC)
 		snprintf(dd->scratch_buff, sizeof(dd->scratch_buff),
 			 "subprog[%+d]", insn->imm);
-- 
2.30.2



* [PATCH bpf-next 15/15] selftests/bpf: Convert few tests to light skeleton.
  2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (13 preceding siblings ...)
  2021-04-17  3:32 ` [PATCH bpf-next 14/15] bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command Alexei Starovoitov
@ 2021-04-17  3:32 ` Alexei Starovoitov
  14 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:32 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Convert a few tests that don't use CO-RE to light skeleton.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/.gitignore            |  1 +
 tools/testing/selftests/bpf/Makefile              | 15 ++++++++++++++-
 .../selftests/bpf/prog_tests/fentry_fexit.c       |  6 +++---
 .../selftests/bpf/prog_tests/fentry_test.c        |  4 ++--
 .../selftests/bpf/prog_tests/fexit_sleep.c        |  6 +++---
 .../testing/selftests/bpf/prog_tests/fexit_test.c |  4 ++--
 .../testing/selftests/bpf/prog_tests/kfunc_call.c |  6 +++---
 7 files changed, 28 insertions(+), 14 deletions(-)

diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore
index 4866f6a21901..a030aa4a8a9e 100644
--- a/tools/testing/selftests/bpf/.gitignore
+++ b/tools/testing/selftests/bpf/.gitignore
@@ -30,6 +30,7 @@ test_sysctl
 xdping
 test_cpp
 *.skel.h
+*.lskel.h
 /no_alu32
 /bpf_gcc
 /tools
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 5e618ff1e8fd..e3f42df0e250 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -311,6 +311,9 @@ SKEL_BLACKLIST := btf__% test_pinning_invalid.c test_sk_assign.c
 
 LINKED_SKELS := test_static_linked.skel.h
 
+LSKELS := kfunc_call_test.c fentry_test.c fexit_test.c fexit_sleep.c
+SKEL_BLACKLIST += $$(LSKELS)
+
 test_static_linked.skel.h-deps := test_static_linked1.o test_static_linked2.o
 
 # Set up extra TRUNNER_XXX "temporary" variables in the environment (relies on
@@ -333,6 +336,7 @@ TRUNNER_BPF_OBJS := $$(patsubst %.c,$$(TRUNNER_OUTPUT)/%.o, $$(TRUNNER_BPF_SRCS)
 TRUNNER_BPF_SKELS := $$(patsubst %.c,$$(TRUNNER_OUTPUT)/%.skel.h,	\
 				 $$(filter-out $(SKEL_BLACKLIST),	\
 					       $$(TRUNNER_BPF_SRCS)))
+TRUNNER_BPF_LSKELS := $$(patsubst %.c,$$(TRUNNER_OUTPUT)/%.lskel.h, $$(LSKELS))
 TRUNNER_BPF_SKELS_LINKED := $$(addprefix $$(TRUNNER_OUTPUT)/,$(LINKED_SKELS))
 TEST_GEN_FILES += $$(TRUNNER_BPF_OBJS)
 
@@ -374,6 +378,14 @@ $(TRUNNER_BPF_SKELS): %.skel.h: %.o $(BPFTOOL) | $(TRUNNER_OUTPUT)
 	$(Q)diff $$(<:.o=.linked2.o) $$(<:.o=.linked3.o)
 	$(Q)$$(BPFTOOL) gen skeleton $$(<:.o=.linked3.o) name $$(notdir $$(<:.o=)) > $$@
 
+$(TRUNNER_BPF_LSKELS): %.lskel.h: %.o $(BPFTOOL) | $(TRUNNER_OUTPUT)
+	$$(call msg,GEN-SKEL,$(TRUNNER_BINARY),$$@)
+	$(Q)$$(BPFTOOL) gen object $$(<:.o=.linked1.o) $$<
+	$(Q)$$(BPFTOOL) gen object $$(<:.o=.linked2.o) $$(<:.o=.linked1.o)
+	$(Q)$$(BPFTOOL) gen object $$(<:.o=.linked3.o) $$(<:.o=.linked2.o)
+	$(Q)diff $$(<:.o=.linked2.o) $$(<:.o=.linked3.o)
+	$(Q)$$(BPFTOOL) gen skeleton -L $$(<:.o=.linked3.o) name $$(notdir $$(<:.o=)) > $$@
+
 $(TRUNNER_BPF_SKELS_LINKED): $(TRUNNER_BPF_OBJS) $(BPFTOOL) | $(TRUNNER_OUTPUT)
 	$$(call msg,LINK-BPF,$(TRUNNER_BINARY),$$(@:.skel.h=.o))
 	$(Q)$$(BPFTOOL) gen object $$(@:.skel.h=.linked1.o) $$(addprefix $(TRUNNER_OUTPUT)/,$$($$(@F)-deps))
@@ -403,6 +415,7 @@ $(TRUNNER_TEST_OBJS): $(TRUNNER_OUTPUT)/%.test.o:			\
 		      $(TRUNNER_EXTRA_HDRS)				\
 		      $(TRUNNER_BPF_OBJS)				\
 		      $(TRUNNER_BPF_SKELS)				\
+		      $(TRUNNER_BPF_LSKELS)				\
 		      $(TRUNNER_BPF_SKELS_LINKED)			\
 		      $$(BPFOBJ) | $(TRUNNER_OUTPUT)
 	$$(call msg,TEST-OBJ,$(TRUNNER_BINARY),$$@)
@@ -510,6 +523,6 @@ $(OUTPUT)/bench: $(OUTPUT)/bench.o $(OUTPUT)/testing_helpers.o \
 EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(SCRATCH_DIR) $(HOST_SCRATCH_DIR)	\
 	prog_tests/tests.h map_tests/tests.h verifier/tests.h		\
 	feature								\
-	$(addprefix $(OUTPUT)/,*.o *.skel.h no_alu32 bpf_gcc bpf_testmod.ko)
+	$(addprefix $(OUTPUT)/,*.o *.skel.h *.lskel.h no_alu32 bpf_gcc bpf_testmod.ko)
 
 .PHONY: docs docs-clean
diff --git a/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c b/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
index 109d0345a2be..91154c2ba256 100644
--- a/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
+++ b/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
@@ -1,8 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2019 Facebook */
 #include <test_progs.h>
-#include "fentry_test.skel.h"
-#include "fexit_test.skel.h"
+#include "fentry_test.lskel.h"
+#include "fexit_test.lskel.h"
 
 void test_fentry_fexit(void)
 {
@@ -26,7 +26,7 @@ void test_fentry_fexit(void)
 	if (CHECK(err, "fexit_attach", "fexit attach failed: %d\n", err))
 		goto close_prog;
 
-	prog_fd = bpf_program__fd(fexit_skel->progs.test1);
+	prog_fd = fexit_skel->progs.test1.prog_fd;
 	err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
 				NULL, NULL, &retval, &duration);
 	CHECK(err || retval, "ipv6",
diff --git a/tools/testing/selftests/bpf/prog_tests/fentry_test.c b/tools/testing/selftests/bpf/prog_tests/fentry_test.c
index 04ebbf1cb390..78062855b142 100644
--- a/tools/testing/selftests/bpf/prog_tests/fentry_test.c
+++ b/tools/testing/selftests/bpf/prog_tests/fentry_test.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2019 Facebook */
 #include <test_progs.h>
-#include "fentry_test.skel.h"
+#include "fentry_test.lskel.h"
 
 void test_fentry_test(void)
 {
@@ -18,7 +18,7 @@ void test_fentry_test(void)
 	if (CHECK(err, "fentry_attach", "fentry attach failed: %d\n", err))
 		goto cleanup;
 
-	prog_fd = bpf_program__fd(fentry_skel->progs.test1);
+	prog_fd = fentry_skel->progs.test1.prog_fd;
 	err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
 				NULL, NULL, &retval, &duration);
 	CHECK(err || retval, "test_run",
diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_sleep.c b/tools/testing/selftests/bpf/prog_tests/fexit_sleep.c
index ccc7e8a34ab6..4e7f4b42ea29 100644
--- a/tools/testing/selftests/bpf/prog_tests/fexit_sleep.c
+++ b/tools/testing/selftests/bpf/prog_tests/fexit_sleep.c
@@ -6,7 +6,7 @@
 #include <time.h>
 #include <sys/mman.h>
 #include <sys/syscall.h>
-#include "fexit_sleep.skel.h"
+#include "fexit_sleep.lskel.h"
 
 static int do_sleep(void *skel)
 {
@@ -58,8 +58,8 @@ void test_fexit_sleep(void)
 	 * waiting for percpu_ref_kill to confirm). The other one
 	 * will be freed quickly.
 	 */
-	close(bpf_program__fd(fexit_skel->progs.nanosleep_fentry));
-	close(bpf_program__fd(fexit_skel->progs.nanosleep_fexit));
+	close(fexit_skel->progs.nanosleep_fentry.prog_fd);
+	close(fexit_skel->progs.nanosleep_fexit.prog_fd);
 	fexit_sleep__detach(fexit_skel);
 
 	/* kill the thread to unwind sys_nanosleep stack through the trampoline */
diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_test.c b/tools/testing/selftests/bpf/prog_tests/fexit_test.c
index 78d7a2765c27..be75d0c1018a 100644
--- a/tools/testing/selftests/bpf/prog_tests/fexit_test.c
+++ b/tools/testing/selftests/bpf/prog_tests/fexit_test.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2019 Facebook */
 #include <test_progs.h>
-#include "fexit_test.skel.h"
+#include "fexit_test.lskel.h"
 
 void test_fexit_test(void)
 {
@@ -18,7 +18,7 @@ void test_fexit_test(void)
 	if (CHECK(err, "fexit_attach", "fexit attach failed: %d\n", err))
 		goto cleanup;
 
-	prog_fd = bpf_program__fd(fexit_skel->progs.test1);
+	prog_fd = fexit_skel->progs.test1.prog_fd;
 	err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
 				NULL, NULL, &retval, &duration);
 	CHECK(err || retval, "test_run",
diff --git a/tools/testing/selftests/bpf/prog_tests/kfunc_call.c b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
index 7fc0951ee75f..30a7b9b837bf 100644
--- a/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
+++ b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
@@ -2,7 +2,7 @@
 /* Copyright (c) 2021 Facebook */
 #include <test_progs.h>
 #include <network_helpers.h>
-#include "kfunc_call_test.skel.h"
+#include "kfunc_call_test.lskel.h"
 #include "kfunc_call_test_subprog.skel.h"
 
 static void test_main(void)
@@ -14,13 +14,13 @@ static void test_main(void)
 	if (!ASSERT_OK_PTR(skel, "skel"))
 		return;
 
-	prog_fd = bpf_program__fd(skel->progs.kfunc_call_test1);
+	prog_fd = skel->progs.kfunc_call_test1.prog_fd;
 	err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),
 				NULL, NULL, (__u32 *)&retval, NULL);
 	ASSERT_OK(err, "bpf_prog_test_run(test1)");
 	ASSERT_EQ(retval, 12, "test1-retval");
 
-	prog_fd = bpf_program__fd(skel->progs.kfunc_call_test2);
+	prog_fd = skel->progs.kfunc_call_test2.prog_fd;
 	err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),
 				NULL, NULL, (__u32 *)&retval, NULL);
 	ASSERT_OK(err, "bpf_prog_test_run(test2)");
-- 
2.30.2



* Re: [PATCH bpf-next 11/15] bpf: Add bpf_sys_close() helper.
  2021-04-17  3:32 ` [PATCH bpf-next 11/15] bpf: Add bpf_sys_close() helper Alexei Starovoitov
@ 2021-04-17  3:42   ` Al Viro
  2021-04-17  3:46     ` Alexei Starovoitov
  0 siblings, 1 reply; 30+ messages in thread
From: Al Viro @ 2021-04-17  3:42 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: davem, daniel, andrii, netdev, bpf, kernel-team

On Fri, Apr 16, 2021 at 08:32:20PM -0700, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
> 
> Add bpf_sys_close() helper to be used by the syscall/loader program to close
> intermediate FDs and other cleanup.

Conditional NAK.  In a lot of contexts close_fd() is very much unsafe.
In particular, anything that might call it between fdget() and fdput()
is Right Fucking Out(tm).

In which contexts can that thing be executed?


* Re: [PATCH bpf-next 11/15] bpf: Add bpf_sys_close() helper.
  2021-04-17  3:42   ` Al Viro
@ 2021-04-17  3:46     ` Alexei Starovoitov
  2021-04-17  4:04       ` Al Viro
  0 siblings, 1 reply; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  3:46 UTC (permalink / raw)
  To: Al Viro
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, Kernel Team

On Fri, Apr 16, 2021 at 8:42 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Fri, Apr 16, 2021 at 08:32:20PM -0700, Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > Add bpf_sys_close() helper to be used by the syscall/loader program to close
> > intermediate FDs and other cleanup.
>
> Conditional NAK.  In a lot of contexts close_fd() is very much unsafe.
> In particular, anything that might call it between fdget() and fdput()
> is Right Fucking Out(tm).
> In which contexts can that thing be executed?

user context only.
It's not for all of bpf _obviously_.


* Re: [PATCH bpf-next 11/15] bpf: Add bpf_sys_close() helper.
  2021-04-17  3:46     ` Alexei Starovoitov
@ 2021-04-17  4:04       ` Al Viro
  2021-04-17  5:01         ` Alexei Starovoitov
  0 siblings, 1 reply; 30+ messages in thread
From: Al Viro @ 2021-04-17  4:04 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, Kernel Team

On Fri, Apr 16, 2021 at 08:46:05PM -0700, Alexei Starovoitov wrote:
> On Fri, Apr 16, 2021 at 8:42 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Fri, Apr 16, 2021 at 08:32:20PM -0700, Alexei Starovoitov wrote:
> > > From: Alexei Starovoitov <ast@kernel.org>
> > >
> > > Add bpf_sys_close() helper to be used by the syscall/loader program to close
> > > intermediate FDs and other cleanup.
> >
> > Conditional NAK.  In a lot of contexts close_fd() is very much unsafe.
> > In particular, anything that might call it between fdget() and fdput()
> > is Right Fucking Out(tm).
> > In which contexts can that thing be executed?
> 
> user context only.
> It's not for all of bpf _obviously_.

Let me restate the question: what call chains could lead to bpf_sys_close()?


* Re: [PATCH bpf-next 11/15] bpf: Add bpf_sys_close() helper.
  2021-04-17  4:04       ` Al Viro
@ 2021-04-17  5:01         ` Alexei Starovoitov
  2021-04-17 14:36           ` Alexei Starovoitov
  0 siblings, 1 reply; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17  5:01 UTC (permalink / raw)
  To: Al Viro
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, Kernel Team

On Fri, Apr 16, 2021 at 9:04 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> On Fri, Apr 16, 2021 at 08:46:05PM -0700, Alexei Starovoitov wrote:
> > On Fri, Apr 16, 2021 at 8:42 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > >
> > > On Fri, Apr 16, 2021 at 08:32:20PM -0700, Alexei Starovoitov wrote:
> > > > From: Alexei Starovoitov <ast@kernel.org>
> > > >
> > > > Add bpf_sys_close() helper to be used by the syscall/loader program to close
> > > > intermediate FDs and other cleanup.
> > >
> > > Conditional NAK.  In a lot of contexts close_fd() is very much unsafe.
> > > In particular, anything that might call it between fdget() and fdput()
> > > is Right Fucking Out(tm).
> > > In which contexts can that thing be executed?
> >
> > user context only.
> > It's not for all of bpf _obviously_.
>
> Let me restate the question: what call chains could lead to bpf_sys_close()?

Already answered. User context only. It's all safe.


* Re: [PATCH bpf-next 11/15] bpf: Add bpf_sys_close() helper.
  2021-04-17  5:01         ` Alexei Starovoitov
@ 2021-04-17 14:36           ` Alexei Starovoitov
  2021-04-17 16:48             ` Al Viro
  0 siblings, 1 reply; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17 14:36 UTC (permalink / raw)
  To: Al Viro
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, Kernel Team

On Fri, Apr 16, 2021 at 10:01:43PM -0700, Alexei Starovoitov wrote:
> On Fri, Apr 16, 2021 at 9:04 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > On Fri, Apr 16, 2021 at 08:46:05PM -0700, Alexei Starovoitov wrote:
> > > On Fri, Apr 16, 2021 at 8:42 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
> > > >
> > > > On Fri, Apr 16, 2021 at 08:32:20PM -0700, Alexei Starovoitov wrote:
> > > > > From: Alexei Starovoitov <ast@kernel.org>
> > > > >
> > > > > Add bpf_sys_close() helper to be used by the syscall/loader program to close
> > > > > intermediate FDs and other cleanup.
> > > >
> > > > Conditional NAK.  In a lot of contexts close_fd() is very much unsafe.
> > > > In particular, anything that might call it between fdget() and fdput()
> > > > is Right Fucking Out(tm).
> > > > In which contexts can that thing be executed?
> > >
> > > user context only.
> > > It's not for all of bpf _obviously_.
> >
> > Let me restate the question: what call chains could lead to bpf_sys_close()?
> 
> Already answered. User context only. It's all safe.

Not only is sys_close safe to call. Literally all syscalls are safe to call.
The current allowlist contains two syscalls. It may get extended as use cases come up.

The following two pieces of code are equivalent:
1.
bpf_prog.c:
  SEC("syscall")
  int bpf_prog(struct args *ctx)
  {
    bpf_sys_close(1);
    bpf_sys_close(2);
    bpf_sys_close(3);
    return 0;
  }
main.c:
  int main(int ac, char **av)
  {
    bpf_prog_load_and_run("bpf_prog.o");
  }

2.
main.c:
  int main(int ac, char **av)
  {
    close(1);
    close(2);
    close(3);
  }

The kernel will perform the same work with FDs. The same locks are held
and the same execution conditions apply in both cases. The LSM hooks,
fsnotify, etc. will be called the same way.
It would be no different if a new syscall "sys_foo(int num)" were introduced that
would do { return close_fd(num); }.
It would operate in the same user context.


* Re: [PATCH bpf-next 11/15] bpf: Add bpf_sys_close() helper.
  2021-04-17 14:36           ` Alexei Starovoitov
@ 2021-04-17 16:48             ` Al Viro
  2021-04-17 17:09               ` Alexei Starovoitov
  0 siblings, 1 reply; 30+ messages in thread
From: Al Viro @ 2021-04-17 16:48 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, Kernel Team

On Sat, Apr 17, 2021 at 07:36:39AM -0700, Alexei Starovoitov wrote:

> The kernel will perform the same work with FDs. The same locks are held
> and the same execution conditions apply in both cases. The LSM hooks,
> fsnotify, etc. will be called the same way.
> It would be no different if a new syscall "sys_foo(int num)" were introduced that
> would do { return close_fd(num); }.
> It would operate in the same user context.

Hmm...  unless I'm misreading the code, one of the call chains would seem to
be sys_bpf() -> bpf_prog_test_run() -> ->test_run() -> ... -> bpf_sys_close().
OK, as long as you make sure bpf_prog_get() does fdput() (i.e. that we
don't have it restructured so that fdget()/fdput() pair would be lifted into
bpf_prog_test_run(), with fdput() moved in place of bpf_prog_put()).

Note that we *really* can not allow close_fd() on anything to be bracketed
by an fdget()/fdput() pair; we had bugs of that sort and, as a matter of fact,
still have one in autofs_dev_ioctl().

The trouble happens if you have file F with 2 references, held by descriptor
tables of different processes.  Say, process A has descriptor 6 refering to
it, while B has descriptor 42 doing the same.  Descriptor tables of A and B
are not shared with anyone.

A: fdget(6) 	-> returns a reference to F, refcount _not_ touched
A: close_fd(6)	-> rips the reference to F from descriptor table, does fput(F)
		   refcount drops to 1.
B: close(42)	-> rips the reference to F from B's descriptor table, does fput(F)
		   This time refcount does reach 0 and we use task_work_add() to
		   make sure the destructor (__fput()) runs before B returns to
		   userland.  sys_close() returns and B goes off to userland.
		   On the way out __fput() is run, and among other things,
		   ->release() of F is executed, doing whatever it wants to do.
		   F is freed.
And at that point A, which presumably is using the guts of F, gets screwed.

In case of autofs_dev_ioctl(), it's possible for a thread to end up blocked
inside copy_to_user(), with autofs functions in call chains *AND* module
refcount of autofs not pinned by anything.  The effects of returning into a
function that used to reside in now unmapped page are obviously not pretty...

Basically, the rule is
	* never remove from descriptor table if you currently have an outstanding
fdget() (i.e. without having done the matching fdput() yet).
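
To make the A/B sequence above concrete, here is a user-space toy model of it. This is only a sketch: the toy_* names are hypothetical, there is no real locking or task_work, and "freed" is just a flag. It assumes, as in the scenario, that fdget() on an unshared descriptor table borrows the pointer without touching the refcount.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model, not kernel code: one file object shared by two descriptor tables. */
struct toy_file {
	int refcount;   /* one count per descriptor-table slot */
	bool freed;     /* set when the last reference is dropped */
};

struct toy_table {
	struct toy_file *slot;   /* a single descriptor, for brevity */
};

/* Like fdget() on an unshared table: borrow the pointer, refcount untouched. */
static struct toy_file *toy_fdget(struct toy_table *t)
{
	return t->slot;
}

/* Like fput(): drop one reference and "free" the file when it hits zero. */
static void toy_fput(struct toy_file *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0)
		f->freed = true;
}

/* Like close_fd(): remove from the table, then drop the table's reference. */
static void toy_close_fd(struct toy_table *t)
{
	struct toy_file *f = t->slot;

	t->slot = NULL;
	if (f)
		toy_fput(f);
}

/* Replays the A/B sequence; returns true if A's borrowed pointer now dangles. */
static bool toy_replay_race(void)
{
	struct toy_file f = { .refcount = 2, .freed = false };
	struct toy_table a = { .slot = &f }, b = { .slot = &f };
	struct toy_file *borrowed;

	borrowed = toy_fdget(&a);   /* A: fdget(6), refcount stays 2 */
	toy_close_fd(&a);           /* A: close_fd(6), refcount 2 -> 1 */
	toy_close_fd(&b);           /* B: close(42), refcount 1 -> 0, file freed */
	return borrowed->freed;     /* A still holds the pointer */
}
```

In the model the borrowed pointer outlives the last reference, which is the state the rule is meant to exclude.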

	That, obviously, covers all ioctls - there we have fdget() done
by sys_ioctl() on the issuing descriptor.  In your case you seem to be
safe, but it's a bloody dangerous minefield - you really need a big warning
in all call sites.  The worst part is that it won't happen with intended
use, so it doesn't show up in routine regression testing.  In particular,
for autofs the normal case is AUTOFS_DEV_IOCTL_CLOSEMOUNT getting passed
a file descriptor of something mounted and *not* the descriptor of
/dev/autofs we are holding fdget() on.  However, there's no way to prevent
a malicious call when we pass exactly that.

	So please, mark all call sites with "make very sure you never get
here with unpaired fdget()".

	BTW, if my reading (re ->test_run()) is correct, what limits the recursion
via bpf_sys_bpf()?


* Re: [PATCH bpf-next 11/15] bpf: Add bpf_sys_close() helper.
  2021-04-17 16:48             ` Al Viro
@ 2021-04-17 17:09               ` Alexei Starovoitov
  0 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-17 17:09 UTC (permalink / raw)
  To: Al Viro
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, Kernel Team

On Sat, Apr 17, 2021 at 04:48:53PM +0000, Al Viro wrote:
> On Sat, Apr 17, 2021 at 07:36:39AM -0700, Alexei Starovoitov wrote:
> 
> > The kernel will perform the same work with FDs. The same locks are held
> > and the same execution conditions apply in both cases. The LSM hooks,
> > fsnotify, etc. will be called the same way.
> > It's no different than if a new syscall "sys_foo(int num)" were introduced
> > that would do { return close_fd(num); }.
> > It would operate in the same user context.
> 
> Hmm...  unless I'm misreading the code, one of the call chains would seem to
> be sys_bpf() -> bpf_prog_test_run() -> ->test_run() -> ... -> bpf_sys_close().
> OK, as long as you make sure bpf_prog_get() does fdput() (i.e. that we
> don't have it restructured so that fdget()/fdput() pair would be lifted into
> bpf_prog_test_run(), with fdput() moved in place of bpf_prog_put()).

Got it. There is no fdget/put bracketing in the code.
On the way to test_run we do __bpf_prog_get() which does fdget and immediately
fdput after incrementing refcnt of the prog.
I believe this pattern is consistent everywhere in kernel/bpf/*

> Note that we *really* cannot allow close_fd() on anything to be bracketed
> by an fdget()/fdput() pair; we had bugs of that sort and, as a matter of fact,
> still have one in autofs_dev_ioctl().
> 
> The trouble happens if you have file F with 2 references, held by descriptor
> tables of different processes.  Say, process A has descriptor 6 referring to
> it, while B has descriptor 42 doing the same.  Descriptor tables of A and B
> are not shared with anyone.
> 
> A: fdget(6) 	-> returns a reference to F, refcount _not_ touched
> A: close_fd(6)	-> rips the reference to F from descriptor table, does fput(F)
> 		   refcount drops to 1.
> B: close(42)	-> rips the reference to F from B's descriptor table, does fput(F)
> 		   This time refcount does reach 0 and we use task_work_add() to
> 		   make sure the destructor (__fput()) runs before B returns to
> 		   userland.  sys_close() returns and B goes off to userland.
> 		   On the way out __fput() is run, and among other things,
> 		   ->release() of F is executed, doing whatever it wants to do.
> 		   F is freed.
> And at that point A, which presumably is using the guts of F, gets screwed.

Thanks for these details. That's really helpful.

> 	So please, mark all call sites with "make very sure you never get
> here with unpaired fdget()".

Good point. Will add this comment.

> 	BTW, if my reading (re ->test_run()) is correct, what limits the recursion
> via bpf_sys_bpf()?

Glad you asked! Code review questions of this kind are much appreciated.

It's an allowlist of possible commands in bpf_sys_bpf().
'case BPF_PROG_TEST_RUN:' is not there for this exact reason.
I'll add a comment to make it more obvious.
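
As an illustration of the allowlist idea only, here is a user-space sketch. The toy_* names and the set of listed commands are hypothetical and are not the list from the patch; the one property taken from this thread is that the test-run command is absent, so a syscall program cannot re-enter ->test_run() through the helper.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical command numbers for illustration; not the uapi values. */
enum toy_cmd {
	TOY_MAP_CREATE,
	TOY_MAP_UPDATE_ELEM,
	TOY_PROG_LOAD,
	TOY_BTF_LOAD,
	TOY_PROG_TEST_RUN,
};

/* Stand-in for the real syscall body. */
static int toy_do_sys_bpf(enum toy_cmd cmd)
{
	(void)cmd;
	return 0;
}

/*
 * Helper-side entry point: only listed commands are forwarded.
 * TOY_PROG_TEST_RUN is deliberately absent, so a program cannot run
 * another program and recurse back into this helper.
 */
static int toy_bpf_sys_bpf(enum toy_cmd cmd)
{
	switch (cmd) {
	case TOY_MAP_CREATE:
	case TOY_MAP_UPDATE_ELEM:
	case TOY_PROG_LOAD:
	case TOY_BTF_LOAD:
		return toy_do_sys_bpf(cmd);
	default:
		return -EINVAL;
	}
}

/* Usage: a listed command passes, the recursive one is rejected. */
static void toy_allowlist_demo(void)
{
	assert(toy_bpf_sys_bpf(TOY_PROG_LOAD) == 0);
	assert(toy_bpf_sys_bpf(TOY_PROG_TEST_RUN) == -EINVAL);
}
```

Rejecting by default means a command added later stays unreachable from the helper until someone lists it explicitly.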


* Re: [PATCH bpf-next 13/15] libbpf: Generate loader program out of BPF ELF file.
  2021-04-17  3:32 ` [PATCH bpf-next 13/15] libbpf: Generate loader program out of BPF ELF file Alexei Starovoitov
@ 2021-04-21  1:34   ` Yonghong Song
  2021-04-21  4:46     ` Alexei Starovoitov
  0 siblings, 1 reply; 30+ messages in thread
From: Yonghong Song @ 2021-04-21  1:34 UTC (permalink / raw)
  To: Alexei Starovoitov, davem; +Cc: daniel, andrii, netdev, bpf, kernel-team



On 4/16/21 8:32 PM, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
> 
> The BPF program loading process performed by libbpf is quite complex
> and consists of the following steps:
> "open" phase:
> - parse elf file and remember relocations, sections
> - collect externs and ksyms including their btf_ids in prog's BTF
> - patch BTF datasec (since llvm couldn't do it)
> - init maps (old style map_def, BTF based, global data map, kconfig map)
> - collect relocations against progs and maps
> "load" phase:
> - probe kernel features
> - load vmlinux BTF
> - resolve externs (kconfig and ksym)
> - load program BTF
> - init struct_ops
> - create maps
> - apply CO-RE relocations
> - patch ld_imm64 insns with src_reg=PSEUDO_MAP, PSEUDO_MAP_VALUE, PSEUDO_BTF_ID
> - reposition subprograms and adjust call insns
> - sanitize and load progs
> 
> During this process libbpf does sys_bpf() calls to load BTF, create maps,
> populate maps and finally load programs.
> Instead of actually doing the syscalls generate a trace of what libbpf
> would have done and represent it as the "loader program".
> The "loader program" consists of single map with:
> - union bpf_attr(s)
> - BTF bytes
> - map value bytes
> - insns bytes
> and single bpf program that passes bpf_attr(s) and data into bpf_sys_bpf() helper.
> Executing such "loader program" via bpf_prog_test_run() command will
> replay the sequence of syscalls that libbpf would have done, which will result
> in the same maps created and programs loaded as specified in the elf file.
> The "loader program" removes libelf and the majority of libbpf dependency from
> the program loading process.
> 
> kconfig, typeless ksym, struct_ops and CO-RE are not supported yet.

Beyond this, libbpf currently allows a lot of flexibility between prog open
and load: changing program type, key/value size, max_entries, pinning maps,
reusing maps, etc. It is worthwhile to mention this in the cover letter.
Such changes may defeat the purpose of signing the program, though.

> 
> The order of relocate_data and relocate_calls had to change in order
> for trace generation to see all relocations for given program with
> correct insn_idx-es.
> 
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>   tools/lib/bpf/Build              |   2 +-
>   tools/lib/bpf/bpf.c              |  61 ++++
>   tools/lib/bpf/bpf.h              |  35 ++
>   tools/lib/bpf/bpf_gen_internal.h |  38 +++
>   tools/lib/bpf/gen_trace.c        | 529 +++++++++++++++++++++++++++++++
>   tools/lib/bpf/libbpf.c           | 199 ++++++++++--
>   tools/lib/bpf/libbpf.map         |   1 +
>   tools/lib/bpf/libbpf_internal.h  |   2 +
>   8 files changed, 834 insertions(+), 33 deletions(-)
>   create mode 100644 tools/lib/bpf/bpf_gen_internal.h
>   create mode 100644 tools/lib/bpf/gen_trace.c
> 
> diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build
> index 9b057cc7650a..d0a1903bcc3c 100644
> --- a/tools/lib/bpf/Build
> +++ b/tools/lib/bpf/Build
> @@ -1,3 +1,3 @@
>   libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \
>   	    netlink.o bpf_prog_linfo.o libbpf_probes.o xsk.o hashmap.o \
> -	    btf_dump.o ringbuf.o strset.o linker.o
> +	    btf_dump.o ringbuf.o strset.o linker.o gen_trace.o
> diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
> index b96a3aba6fcc..517e4f949a73 100644
> --- a/tools/lib/bpf/bpf.c
> +++ b/tools/lib/bpf/bpf.c
> @@ -972,3 +972,64 @@ int bpf_prog_bind_map(int prog_fd, int map_fd,
[...]
> +/* The layout of bpf_map_prog_desc and bpf_loader_ctx is feature dependent
> + * and will change from one version of libbpf to another and features
> + * requested during loader program generation.
> + */
> +union bpf_map_prog_desc {
> +	struct {
> +		__u32 map_fd;
> +		__u32 max_entries;
> +	};
> +	struct {
> +		__u32 prog_fd;
> +		__u32 attach_prog_fd;
> +	};
> +};
> +
> +struct bpf_loader_ctx {
> +	size_t sz;
> +	__u32 log_level;
> +	__u32 log_size;
> +	__u64 log_buf;
> +	union bpf_map_prog_desc u[];
> +};
> +
> +struct bpf_load_opts {
> +	size_t sz; /* size of this struct for forward/backward compatibility */
> +	struct bpf_loader_ctx *ctx;
> +	const void *data;
> +	const void *insns;
> +	__u32 data_sz;
> +	__u32 insns_sz;
> +};
> +#define bpf_load_opts__last_field insns_sz
> +
> +LIBBPF_API int bpf_load(const struct bpf_load_opts *opts);
> +
>   #ifdef __cplusplus
>   } /* extern "C" */
>   #endif
> diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h
> new file mode 100644
> index 000000000000..a79f2e4ad980
> --- /dev/null
> +++ b/tools/lib/bpf/bpf_gen_internal.h
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
> +/* Copyright (c) 2021 Facebook */
> +#ifndef __BPF_GEN_INTERNAL_H
> +#define __BPF_GEN_INTERNAL_H
> +
> +struct relo_desc {
> +	const char *name;
> +	int kind;
> +	int insn_idx;
> +};
> +
> +struct bpf_gen {
> +	void *data_start;
> +	void *data_cur;
> +	void *insn_start;
> +	void *insn_cur;
> +	__u32 nr_progs;
> +	__u32 nr_maps;
> +	int log_level;
> +	int error;
> +	struct relo_desc *relos;
> +	int relo_cnt;
> +};
> +
> +void bpf_object__set_gen_trace(struct bpf_object *obj, struct bpf_gen *gen);
> +
> +void bpf_gen__init(struct bpf_gen *gen, int log_level);
> +int bpf_gen__finish(struct bpf_gen *gen);
> +void bpf_gen__load_btf(struct bpf_gen *gen, const void *raw_data, __u32 raw_size);
> +void bpf_gen__map_create(struct bpf_gen *gen, struct bpf_create_map_attr *map_attr, int map_idx);
> +struct bpf_prog_load_params;
> +void bpf_gen__prog_load(struct bpf_gen *gen, struct bpf_prog_load_params *load_attr, int prog_idx);
> +void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *value, __u32 value_size);
> +void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx);
> +void bpf_gen__record_find_name(struct bpf_gen *gen, const char *name, enum bpf_attach_type type);
> +void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, int kind, int insn_idx);
> +
> +#endif
> diff --git a/tools/lib/bpf/gen_trace.c b/tools/lib/bpf/gen_trace.c
> new file mode 100644
> index 000000000000..1a80a8dd1c9f
> --- /dev/null
> +++ b/tools/lib/bpf/gen_trace.c
> @@ -0,0 +1,529 @@
> +// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
> +/* Copyright (c) 2021 Facebook */
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <errno.h>
> +#include <linux/filter.h>
> +#include "btf.h"
> +#include "bpf.h"
> +#include "libbpf.h"
> +#include "libbpf_internal.h"
> +#include "hashmap.h"
> +#include "bpf_gen_internal.h"
> +
> +#define MAX_USED_MAPS 64
> +#define MAX_USED_PROGS 32
> +
> +/* The following structure describes the stack layout of the loader program.
> + * In addition R6 contains the pointer to context.
> + * R7 contains the result of the last sys_bpf command (typically error or FD).
> + */
> +struct loader_stack {
> +	__u32 btf_fd;
> +	__u32 map_fd[MAX_USED_MAPS];
> +	__u32 prog_fd[MAX_USED_PROGS];
> +	__u32 inner_map_fd;
> +	__u32 last_btf_id;
> +	__u32 last_attach_btf_obj_fd;
> +};
> +#define stack_off(field) (__s16)(-sizeof(struct loader_stack) + offsetof(struct loader_stack, field))
> +
> +static int bpf_gen__realloc_insn_buf(struct bpf_gen *gen, __u32 size)
> +{
> +	size_t off = gen->insn_cur - gen->insn_start;
> +
> +	if (gen->error)
> +		return -ENOMEM;

return gen->error?

> +	if (off + size > UINT32_MAX) {
> +		gen->error = -ERANGE;
> +		return -ERANGE;
> +	}
> +	gen->insn_start = realloc(gen->insn_start, off + size);
> +	if (!gen->insn_start) {
> +		gen->error = -ENOMEM;
> +		return -ENOMEM;
> +	}
> +	gen->insn_cur = gen->insn_start + off;
> +	return 0;
> +}
> +
> +static int bpf_gen__realloc_data_buf(struct bpf_gen *gen, __u32 size)

Maybe change the return type to size_t? Esp. in the below
we have off + size > UINT32_MAX.

> +{
> +	size_t off = gen->data_cur - gen->data_start;
> +
> +	if (gen->error)
> +		return -ENOMEM;

return gen->error?

> +	if (off + size > UINT32_MAX) {
> +		gen->error = -ERANGE;
> +		return -ERANGE;
> +	}
> +	gen->data_start = realloc(gen->data_start, off + size);
> +	if (!gen->data_start) {
> +		gen->error = -ENOMEM;
> +		return -ENOMEM;
> +	}
> +	gen->data_cur = gen->data_start + off;
> +	return 0;
> +}
> +
> +static void bpf_gen__emit(struct bpf_gen *gen, struct bpf_insn insn)
> +{
> +	if (bpf_gen__realloc_insn_buf(gen, sizeof(insn)))
> +		return;
> +	memcpy(gen->insn_cur, &insn, sizeof(insn));
> +	gen->insn_cur += sizeof(insn);
> +}
> +
> +static void bpf_gen__emit2(struct bpf_gen *gen, struct bpf_insn insn1, struct bpf_insn insn2)
> +{
> +	bpf_gen__emit(gen, insn1);
> +	bpf_gen__emit(gen, insn2);
> +}
> +
> +void bpf_gen__init(struct bpf_gen *gen, int log_level)
> +{
> +	gen->log_level = log_level;
> +	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_6, BPF_REG_1));
> +	bpf_gen__emit(gen, BPF_ST_MEM(BPF_W, BPF_REG_10, stack_off(last_attach_btf_obj_fd), 0));

Here we initialize last_attach_btf_obj_fd, do we need to initialize 
last_btf_id?

> +}
> +
> +static int bpf_gen__add_data(struct bpf_gen *gen, const void *data, __u32 size)
> +{
> +	void *prev;
> +
> +	if (bpf_gen__realloc_data_buf(gen, size))
> +		return 0;
> +	prev = gen->data_cur;
> +	memcpy(gen->data_cur, data, size);
> +	gen->data_cur += size;
> +	return prev - gen->data_start;
> +}
> +
> +static int insn_bytes_to_bpf_size(__u32 sz)
> +{
> +	switch (sz) {
> +	case 8: return BPF_DW;
> +	case 4: return BPF_W;
> +	case 2: return BPF_H;
> +	case 1: return BPF_B;
> +	default: return -1;
> +	}
> +}
> +
[...]
> +
> +static void __bpf_gen__debug(struct bpf_gen *gen, int reg1, int reg2, const char *fmt, va_list args)
> +{
> +	char buf[1024];
> +	int addr, len, ret;
> +
> +	if (!gen->log_level)
> +		return;
> +	ret = vsnprintf(buf, sizeof(buf), fmt, args);
> +	if (ret < 1024 - 7 && reg1 >= 0 && reg2 < 0)
> +		strcat(buf, " r=%d");

Why only for reg1 >= 0 && reg2 < 0?

> +	len = strlen(buf) + 1;
> +	addr = bpf_gen__add_data(gen, buf, len);
> +
> +	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, addr));
> +	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_2, len));
> +	if (reg1 >= 0)
> +		bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_3, reg1));
> +	if (reg2 >= 0)
> +		bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_4, reg2));
> +	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_trace_printk));
> +}
> +
> +static void bpf_gen__debug_regs(struct bpf_gen *gen, int reg1, int reg2, const char *fmt, ...)
> +{
> +	va_list args;
> +
> +	va_start(args, fmt);
> +	__bpf_gen__debug(gen, reg1, reg2, fmt, args);
> +	va_end(args);
> +}
> +
> +static void bpf_gen__debug_ret(struct bpf_gen *gen, const char *fmt, ...)
> +{
> +	va_list args;
> +
> +	va_start(args, fmt);
> +	__bpf_gen__debug(gen, BPF_REG_7, -1, fmt, args);
> +	va_end(args);
> +}
> +
> +static void bpf_gen__emit_sys_close(struct bpf_gen *gen, int stack_off)
> +{
> +	bpf_gen__emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, stack_off));
> +	bpf_gen__emit(gen, BPF_JMP_IMM(BPF_JSLE, BPF_REG_1, 0, 2 + (gen->log_level ? 6 : 0)));

The number "6" is magic. It refers to the number of insns generated below
by
    bpf_gen__debug_regs(gen, BPF_REG_9, BPF_REG_0, "close(%%d) = %%d");
At least a comment would help.
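
For reference, the 6 can be derived from __bpf_gen__debug() as quoted above: one ld_imm64 (two insn slots), one mov for the length, one mov per register passed, and the call. A small sketch of that arithmetic follows; the helper and constant names are hypothetical and only count instructions, they do not emit anything.

```c
#include <assert.h>

/* Instruction counts read off __bpf_gen__debug() as quoted in this thread. */
enum {
	LD_IMM64_INSNS = 2,  /* ld_imm64 occupies two struct bpf_insn slots */
	MOV_LEN_INSNS  = 1,  /* r2 = len */
	CALL_INSNS     = 1,  /* call bpf_trace_printk */
};

/* With both registers passed, the total matches the literal 6. */
static_assert(LD_IMM64_INSNS + MOV_LEN_INSNS + 2 + CALL_INSNS == 6,
	      "debug sequence length used by bpf_gen__emit_sys_close()");

/* Number of insns the debug helper emits; a negative reg means "not passed". */
static int debug_insn_cnt(int log_level, int reg1, int reg2)
{
	if (!log_level)
		return 0;
	return LD_IMM64_INSNS + MOV_LEN_INSNS +
	       (reg1 >= 0) + (reg2 >= 0) + CALL_INSNS;
}

/* Jump distance over "r9 = r1; call sys_close" plus the optional debug insns. */
static int sys_close_skip_cnt(int log_level)
{
	/* bpf_gen__debug_regs(gen, BPF_REG_9, BPF_REG_0, ...) passes r9 and r0 */
	return 2 + debug_insn_cnt(log_level, 9, 0);
}
```

Computing the offset this way, or at least naming the constant, would keep the jump correct if the debug sequence ever changes length.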

> +	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_9, BPF_REG_1));
> +	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_close));
> +	bpf_gen__debug_regs(gen, BPF_REG_9, BPF_REG_0, "close(%%d) = %%d");
> +}
> +
> +int bpf_gen__finish(struct bpf_gen *gen)
> +{
> +	int i;
> +
> +	bpf_gen__emit_sys_close(gen, stack_off(btf_fd));
> +	for (i = 0; i < gen->nr_progs; i++)
> +		bpf_gen__move_stack2ctx(gen, offsetof(struct bpf_loader_ctx,
> +						      u[gen->nr_maps + i].map_fd), 4,

Maybe u[gen->nr_maps + i].prog_fd?
u[..] is a union, but prog_fd better reflects what it is.

> +					stack_off(prog_fd[i]));
> +	for (i = 0; i < gen->nr_maps; i++)
> +		bpf_gen__move_stack2ctx(gen, offsetof(struct bpf_loader_ctx,
> +						      u[i].prog_fd), 4,

u[i].map_fd?

> +					stack_off(map_fd[i]));
> +	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_0, 0));
> +	bpf_gen__emit(gen, BPF_EXIT_INSN());
> +	pr_debug("bpf_gen__finish %d\n", gen->error);
> +	return gen->error;
> +}
> +
> +void bpf_gen__load_btf(struct bpf_gen *gen, const void *btf_raw_data, __u32 btf_raw_size)
> +{
> +	union bpf_attr attr = {};
> +	int attr_size = offsetofend(union bpf_attr, btf_log_level);
> +	int btf_data, btf_load_attr;
> +
> +	pr_debug("btf_load: size %d\n", btf_raw_size);
> +	btf_data = bpf_gen__add_data(gen, btf_raw_data, btf_raw_size);
> +
> +	attr.btf_size = btf_raw_size;
> +	btf_load_attr = bpf_gen__add_data(gen, &attr, attr_size);
> +
> +	/* populate union bpf_attr with user provided log details */
> +	bpf_gen__move_ctx2blob(gen, btf_load_attr + offsetof(union bpf_attr, btf_log_level), 4,
> +			       offsetof(struct bpf_loader_ctx, log_level));
> +	bpf_gen__move_ctx2blob(gen, btf_load_attr + offsetof(union bpf_attr, btf_log_size), 4,
> +			       offsetof(struct bpf_loader_ctx, log_size));
> +	bpf_gen__move_ctx2blob(gen, btf_load_attr + offsetof(union bpf_attr, btf_log_buf), 8,
> +			       offsetof(struct bpf_loader_ctx, log_buf));
> +	/* populate union bpf_attr with a pointer to the BTF data */
> +	bpf_gen__emit_rel_store(gen, btf_load_attr + offsetof(union bpf_attr, btf), btf_data);
> +	/* emit BTF_LOAD command */
> +	bpf_gen__emit_sys_bpf(gen, BPF_BTF_LOAD, btf_load_attr, attr_size);
> +	bpf_gen__debug_ret(gen, "btf_load size %d", btf_raw_size);
> +	bpf_gen__emit_check_err(gen);
> +	/* remember btf_fd in the stack, if successful */
> +	bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(btf_fd)));
> +}
> +
> +void bpf_gen__map_create(struct bpf_gen *gen, struct bpf_create_map_attr *map_attr, int map_idx)
> +{
> +	union bpf_attr attr = {};
> +	int attr_size = offsetofend(union bpf_attr, btf_vmlinux_value_type_id);
> +	bool close_inner_map_fd = false;
> +	int map_create_attr;
> +
> +	attr.map_type = map_attr->map_type;
> +	attr.key_size = map_attr->key_size;
> +	attr.value_size = map_attr->value_size;
> +	attr.map_flags = map_attr->map_flags;
> +	memcpy(attr.map_name, map_attr->name,
> +	       min((unsigned)strlen(map_attr->name), BPF_OBJ_NAME_LEN - 1));
> +	attr.numa_node = map_attr->numa_node;
> +	attr.map_ifindex = map_attr->map_ifindex;
> +	attr.max_entries = map_attr->max_entries;
> +	switch (attr.map_type) {
> +	case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
> +	case BPF_MAP_TYPE_CGROUP_ARRAY:
> +	case BPF_MAP_TYPE_STACK_TRACE:
> +	case BPF_MAP_TYPE_ARRAY_OF_MAPS:
> +	case BPF_MAP_TYPE_HASH_OF_MAPS:
> +	case BPF_MAP_TYPE_DEVMAP:
> +	case BPF_MAP_TYPE_DEVMAP_HASH:
> +	case BPF_MAP_TYPE_CPUMAP:
> +	case BPF_MAP_TYPE_XSKMAP:
> +	case BPF_MAP_TYPE_SOCKMAP:
> +	case BPF_MAP_TYPE_SOCKHASH:
> +	case BPF_MAP_TYPE_QUEUE:
> +	case BPF_MAP_TYPE_STACK:
> +	case BPF_MAP_TYPE_RINGBUF:
> +		break;
> +	default:
> +		attr.btf_key_type_id = map_attr->btf_key_type_id;
> +		attr.btf_value_type_id = map_attr->btf_value_type_id;
> +	}
> +
> +	pr_debug("map_create: %s idx %d type %d value_type_id %d\n",
> +		 attr.map_name, map_idx, map_attr->map_type, attr.btf_value_type_id);
> +
> +	map_create_attr = bpf_gen__add_data(gen, &attr, attr_size);
> +	if (attr.btf_value_type_id)
> +		/* populate union bpf_attr with btf_fd saved in the stack earlier */
> +		bpf_gen__move_stack2blob(gen, map_create_attr + offsetof(union bpf_attr, btf_fd), 4,
> +					 stack_off(btf_fd));
> +	switch (attr.map_type) {
> +	case BPF_MAP_TYPE_ARRAY_OF_MAPS:
> +	case BPF_MAP_TYPE_HASH_OF_MAPS:
> +		bpf_gen__move_stack2blob(gen, map_create_attr + offsetof(union bpf_attr, inner_map_fd),
> +					 4, stack_off(inner_map_fd));
> +		close_inner_map_fd = true;
> +		break;
> +	default:;
> +	}
> +	/* emit MAP_CREATE command */
> +	bpf_gen__emit_sys_bpf(gen, BPF_MAP_CREATE, map_create_attr, attr_size);
> +	bpf_gen__debug_ret(gen, "map_create %s idx %d type %d value_size %d",
> +			   attr.map_name, map_idx, map_attr->map_type, attr.value_size);
> +	bpf_gen__emit_check_err(gen);
> +	/* remember map_fd in the stack, if successful */
> +	if (map_idx < 0) {
> +		bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(inner_map_fd)));

A comment here indicating that map_idx < 0 means inner map creation
would help in understanding the code.

> +	} else {
> +		if (map_idx != gen->nr_maps) {
> +			gen->error = -EDOM; /* internal bug */
> +			return;
> +		}
> +		bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(map_fd[map_idx])));
> +		gen->nr_maps++;
> +	}
> +	if (close_inner_map_fd)
> +		bpf_gen__emit_sys_close(gen, stack_off(inner_map_fd));
> +}
> +
> +void bpf_gen__record_find_name(struct bpf_gen *gen, const char *attach_name,
> +			       enum bpf_attach_type type)
> +{
> +	const char *prefix;
> +	int kind, len, name;
> +
> +	btf_get_kernel_prefix_kind(type, &prefix, &kind);
> +	pr_debug("find_btf_id '%s%s'\n", prefix, attach_name);
> +	len = strlen(prefix);
> +	if (len)
> +		name = bpf_gen__add_data(gen, prefix, len);
> +	name = bpf_gen__add_data(gen, attach_name, strlen(attach_name) + 1);
> +	name -= len;
> +
> +	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_1, 0));
> +	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_2, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, name));
> +	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_3, kind));
> +	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_4, BPF_REG_10));
> +	bpf_gen__emit(gen, BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, stack_off(last_attach_btf_obj_fd)));
> +	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_5, 0));
> +	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_btf_find_by_name_kind));
> +	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0));
> +	bpf_gen__debug_ret(gen, "find_by_name_kind(%s%s,%d)", prefix, attach_name, kind);
> +	bpf_gen__emit_check_err(gen);
> +	/* remember btf_id */
> +	bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(last_btf_id)));
> +}
> +
> +void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, int kind, int insn_idx)
> +{
> +	struct relo_desc *relo;
> +
> +	relo = libbpf_reallocarray(gen->relos, gen->relo_cnt + 1, sizeof(*relo));
> +	if (!relo) {
> +		gen->error = -ENOMEM;
> +		return;
> +	}
> +	gen->relos = relo;
> +	relo += gen->relo_cnt;
> +	relo->name = name;
> +	relo->kind = kind;
> +	relo->insn_idx = insn_idx;
> +	gen->relo_cnt++;
> +}
> +
> +static void bpf_gen__emit_relo(struct bpf_gen *gen, struct relo_desc *relo, int insns)
> +{
> +	int name, insn;
> +
> +	pr_debug("relo: %s at %d\n", relo->name, relo->insn_idx);
> +	name = bpf_gen__add_data(gen, relo->name, strlen(relo->name) + 1);
> +
> +	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_1, 0));
> +	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_2, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, name));
> +	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_3, relo->kind));
> +	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_4, BPF_REG_10));
> +	bpf_gen__emit(gen, BPF_ALU64_IMM(BPF_ADD, BPF_REG_4, stack_off(last_attach_btf_obj_fd)));
> +	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_5, 0));
> +	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_btf_find_by_name_kind));
> +	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0));
> +	bpf_gen__debug_ret(gen, "find_by_name_kind(%s,%d)", relo->name, relo->kind);
> +	bpf_gen__emit_check_err(gen);
> +	/* store btf_id into insn[insn_idx].imm */
> +	insn = (int)(long)&((struct bpf_insn *)(long)insns)[relo->insn_idx].imm;

This is really fancy. Maybe something like
	insn = insns + sizeof(struct bpf_insn) * relo->insn_idx +
	       offsetof(struct bpf_insn, imm);
Does this sound better?
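
A standalone sketch of the explicit form, to check the arithmetic. It assumes the usual 8-byte instruction layout and uses a local toy_bpf_insn copy rather than the uapi header; imm_blob_off() is a hypothetical name. The pointer-cast expression in the patch computes the same value, since indexing a struct array scales by the struct size and taking a member address adds the member offset.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the 8-byte BPF instruction layout, assumed to match uapi. */
struct toy_bpf_insn {
	uint8_t code;
	uint8_t dst_reg:4;
	uint8_t src_reg:4;
	int16_t off;
	int32_t imm;
};

static_assert(sizeof(struct toy_bpf_insn) == 8, "one BPF insn is 8 bytes");

/* Offset of insn[insn_idx].imm inside the blob, given the offset of insn[0]. */
static int imm_blob_off(int insns, int insn_idx)
{
	return insns + (int)sizeof(struct toy_bpf_insn) * insn_idx +
	       (int)offsetof(struct toy_bpf_insn, imm);
}
```

For example, with the instructions starting at blob offset 128, the imm field of instruction 3 sits at 128 + 3 * 8 + 4 = 156.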

> +	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, insn));
> +	bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, 0));
> +}
> +
[...]
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 083e441d9c5e..a61b4d401527 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -54,6 +54,7 @@
>   #include "str_error.h"
>   #include "libbpf_internal.h"
>   #include "hashmap.h"
> +#include "bpf_gen_internal.h"
>   
>   #ifndef BPF_FS_MAGIC
>   #define BPF_FS_MAGIC		0xcafe4a11
> @@ -435,6 +436,8 @@ struct bpf_object {
>   	bool loaded;
>   	bool has_subcalls;
>   
> +	struct bpf_gen *gen_trace;
> +
>   	/*
>   	 * Information when doing elf related work. Only valid if fd
>   	 * is valid.
> @@ -2651,7 +2654,15 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj)
>   		bpf_object__sanitize_btf(obj, kern_btf);
>   	}
>   
> -	err = btf__load(kern_btf);
> +	if (obj->gen_trace) {
> +		__u32 raw_size = 0;
> +		const void *raw_data = btf__get_raw_data(kern_btf, &raw_size);
> +
> +		bpf_gen__load_btf(obj->gen_trace, raw_data, raw_size);
> +		btf__set_fd(kern_btf, 0);
> +	} else {
> +		err = btf__load(kern_btf);
> +	}
>   	if (sanitize) {
>   		if (!err) {
>   			/* move fd to libbpf's BTF */
> @@ -4277,6 +4288,17 @@ static bool kernel_supports(enum kern_feature_id feat_id)
>   	return READ_ONCE(feat->res) == FEAT_SUPPORTED;
>   }
>   
> +static void mark_feat_supported(enum kern_feature_id last_feat)
> +{
> +	struct kern_feature_desc *feat;
> +	int i;
> +
> +	for (i = 0; i <= last_feat; i++) {
> +		feat = &feature_probes[i];
> +		WRITE_ONCE(feat->res, FEAT_SUPPORTED);
> +	}

This assumes all features earlier than FD_IDX are supported. I think
this is probably fine, although it may not work for some weird backport.
Did you see any issues if we don't explicitly mark the previous features
as supported?

> +}
> +
>   static bool map_is_reuse_compat(const struct bpf_map *map, int map_fd)
>   {
>   	struct bpf_map_info map_info = {};
> @@ -4344,6 +4366,13 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
>   	char *cp, errmsg[STRERR_BUFSIZE];
>   	int err, zero = 0;
>   
> +	if (obj->gen_trace) {
> +		bpf_gen__map_update_elem(obj->gen_trace, map - obj->maps,
> +					 map->mmaped, map->def.value_size);
> +		if (map_type == LIBBPF_MAP_RODATA || map_type == LIBBPF_MAP_KCONFIG)
> +			bpf_gen__map_freeze(obj->gen_trace, map - obj->maps);
> +		return 0;
> +	}
>   	err = bpf_map_update_elem(map->fd, &zero, map->mmaped, 0);
>   	if (err) {
>   		err = -errno;
> @@ -4369,7 +4398,7 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
>   
>   static void bpf_map__destroy(struct bpf_map *map);
[...]
> @@ -9383,7 +9512,13 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
>   	}
>   
>   	/* kernel/module BTF ID */
> -	err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
> +	if (prog->obj->gen_trace) {
> +		bpf_gen__record_find_name(prog->obj->gen_trace, attach_name, attach_type);
> +		*btf_obj_fd = 0;
> +		*btf_type_id = 1;

We have quite a bit of code like this and may add more to support more
features. I am wondering whether we could have some kind of callbacks
to make the code more streamlined. But I am not sure how easy it is.

> +	} else {
> +		err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
> +	}
>   	if (err) {
>   		pr_warn("failed to find kernel BTF type ID of '%s': %d\n", attach_name, err);
>   		return err;
> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index b9b29baf1df8..a5dffc0a3369 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -361,4 +361,5 @@ LIBBPF_0.4.0 {
>   		bpf_linker__new;
>   		bpf_map__inner_map;
>   		bpf_object__set_kversion;
> +		bpf_load;

Based on alphabetical ordering, this should move a few places earlier.

I will need to go through the patch again for better understanding ...


* Re: [PATCH bpf-next 13/15] libbpf: Generate loader program out of BPF ELF file.
  2021-04-21  1:34   ` Yonghong Song
@ 2021-04-21  4:46     ` Alexei Starovoitov
  2021-04-21  5:30       ` Yonghong Song
  2021-04-21 17:46       ` Andrii Nakryiko
  0 siblings, 2 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-21  4:46 UTC (permalink / raw)
  To: Yonghong Song; +Cc: davem, daniel, andrii, netdev, bpf, kernel-team

On Tue, Apr 20, 2021 at 06:34:11PM -0700, Yonghong Song wrote:
> > 
> > kconfig, typeless ksym, struct_ops and CO-RE are not supported yet.
> 
> Beyond this, libbpf currently allows a lot of flexibility between prog open
> and load: changing program type, key/value size, max_entries, pinning maps,
> reusing maps, etc. It is worthwhile to mention this in the cover letter.
> Such changes may defeat the purpose of signing the program, though.

Right. We'd need to decide which ones are ok to change after signature
verification. I think max_entries has to be allowed, since tools
actively change it. The selftests change the other fields too, but I'm not sure
it's a good thing to allow for signed progs. TBD.

> > +	if (gen->error)
> > +		return -ENOMEM;
> 
> return gen->error?

right
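
A sketch of what the agreed change could look like, in user space with a reduced toy_gen struct (a hypothetical stand-in for struct bpf_gen). The early return reports the error that was recorded first instead of always -ENOMEM. Keeping the realloc() result in a temporary is an extra assumption of this sketch, not something from the patch; it avoids losing the old buffer when realloc() fails.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Reduced stand-in for struct bpf_gen: only the fields this helper touches. */
struct toy_gen {
	char *insn_start;
	char *insn_cur;
	int error;   /* first error seen; sticky once set */
};

/* Grow the insn buffer by size bytes; returns 0 or the recorded error. */
static int toy_realloc_insn_buf(struct toy_gen *gen, uint32_t size)
{
	size_t off;
	char *buf;

	assert(gen->insn_cur >= gen->insn_start);
	off = gen->insn_cur - gen->insn_start;

	if (gen->error)
		return gen->error;   /* report the original failure */
	if (off + size > UINT32_MAX) {
		gen->error = -ERANGE;
		return -ERANGE;
	}
	buf = realloc(gen->insn_start, off + size);
	if (!buf) {
		gen->error = -ENOMEM;
		return -ENOMEM;
	}
	gen->insn_start = buf;
	gen->insn_cur = buf + off;
	return 0;
}
```

With this shape a caller that ignores intermediate return values still sees the first failure when it finally checks gen->error.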

> > +	if (off + size > UINT32_MAX) {
> > +		gen->error = -ERANGE;
> > +		return -ERANGE;
> > +	}
> > +	gen->insn_start = realloc(gen->insn_start, off + size);
> > +	if (!gen->insn_start) {
> > +		gen->error = -ENOMEM;
> > +		return -ENOMEM;
> > +	}
> > +	gen->insn_cur = gen->insn_start + off;
> > +	return 0;
> > +}
> > +
> > +static int bpf_gen__realloc_data_buf(struct bpf_gen *gen, __u32 size)
> 
> Maybe change the return type to size_t? Esp. in the below
> we have off + size > UINT32_MAX.

return type? it's 0 or error. you mean argument type?
I think u32 is better. The prog size and everything else passed
to bpf_gen__add_data are 32-bit values.

> > +{
> > +	size_t off = gen->data_cur - gen->data_start;
> > +
> > +	if (gen->error)
> > +		return -ENOMEM;
> 
> return gen->error?

right

> > +	if (off + size > UINT32_MAX) {
> > +		gen->error = -ERANGE;
> > +		return -ERANGE;
> > +	}
> > +	gen->data_start = realloc(gen->data_start, off + size);
> > +	if (!gen->data_start) {
> > +		gen->error = -ENOMEM;
> > +		return -ENOMEM;
> > +	}
> > +	gen->data_cur = gen->data_start + off;
> > +	return 0;
> > +}
> > +
> > +static void bpf_gen__emit(struct bpf_gen *gen, struct bpf_insn insn)
> > +{
> > +	if (bpf_gen__realloc_insn_buf(gen, sizeof(insn)))
> > +		return;
> > +	memcpy(gen->insn_cur, &insn, sizeof(insn));
> > +	gen->insn_cur += sizeof(insn);
> > +}
> > +
> > +static void bpf_gen__emit2(struct bpf_gen *gen, struct bpf_insn insn1, struct bpf_insn insn2)
> > +{
> > +	bpf_gen__emit(gen, insn1);
> > +	bpf_gen__emit(gen, insn2);
> > +}
> > +
> > +void bpf_gen__init(struct bpf_gen *gen, int log_level)
> > +{
> > +	gen->log_level = log_level;
> > +	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_6, BPF_REG_1));
> > +	bpf_gen__emit(gen, BPF_ST_MEM(BPF_W, BPF_REG_10, stack_off(last_attach_btf_obj_fd), 0));
> 
> Here we initialize last_attach_btf_obj_fd, do we need to initialize
> last_btf_id?

Not sure why I inited it. Probably left over. I'll remove it.

> > +}
> > +
> > +static int bpf_gen__add_data(struct bpf_gen *gen, const void *data, __u32 size)
> > +{
> > +	void *prev;
> > +
> > +	if (bpf_gen__realloc_data_buf(gen, size))
> > +		return 0;
> > +	prev = gen->data_cur;
> > +	memcpy(gen->data_cur, data, size);
> > +	gen->data_cur += size;
> > +	return prev - gen->data_start;
> > +}
> > +
> > +static int insn_bytes_to_bpf_size(__u32 sz)
> > +{
> > +	switch (sz) {
> > +	case 8: return BPF_DW;
> > +	case 4: return BPF_W;
> > +	case 2: return BPF_H;
> > +	case 1: return BPF_B;
> > +	default: return -1;
> > +	}
> > +}
> > +
> [...]
> > +
> > +static void __bpf_gen__debug(struct bpf_gen *gen, int reg1, int reg2, const char *fmt, va_list args)
> > +{
> > +	char buf[1024];
> > +	int addr, len, ret;
> > +
> > +	if (!gen->log_level)
> > +		return;
> > +	ret = vsnprintf(buf, sizeof(buf), fmt, args);
> > +	if (ret < 1024 - 7 && reg1 >= 0 && reg2 < 0)
> > +		strcat(buf, " r=%d");
> 
> Why only for reg1 >= 0 && reg2 < 0?

To avoid specifying BPF_REG_7 and adding " r=%%d" to printks explicitly.
Just to make bpf_gen__debug_ret() short and less verbose.
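
A rough user-space sketch of that format handling, assuming the same rule as
the quoted code (append the suffix only when reg1 is given and reg2 is not).
build_debug_fmt() is a hypothetical helper, not part of the patch. It also
shows why callers write "%%d": the string goes through vsnprintf() first, and
the surviving "%d" is what trace_printk would later expand.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define RET_SUFFIX " r=%d"

/* Expand 'fmt' into 'buf'. When only reg1 is in use (reg1 >= 0, reg2 < 0)
 * and there is room, append a literal " r=%d" for the later printk.
 * Returns the string length including the trailing NUL, or a negative
 * value if vsnprintf() failed.
 */
static int build_debug_fmt(char *buf, size_t sz, int reg1, int reg2,
			   const char *fmt, ...)
{
	va_list args;
	int ret;

	assert(sz > 0);
	va_start(args, fmt);
	ret = vsnprintf(buf, sz, fmt, args);
	va_end(args);
	if (ret < 0)
		return ret;
	if ((size_t)ret >= sz)
		ret = (int)sz - 1;	/* output was truncated */
	/* sizeof(RET_SUFFIX) counts the NUL, so this never overflows buf */
	if (reg1 >= 0 && reg2 < 0 && (size_t)ret + sizeof(RET_SUFFIX) <= sz)
		strcat(buf, RET_SUFFIX);
	return (int)strlen(buf) + 1;
}
```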

> > +	len = strlen(buf) + 1;
> > +	addr = bpf_gen__add_data(gen, buf, len);
> > +
> > +	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, addr));
> > +	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_2, len));
> > +	if (reg1 >= 0)
> > +		bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_3, reg1));
> > +	if (reg2 >= 0)
> > +		bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_4, reg2));
> > +	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_trace_printk));
> > +}
> > +
> > +static void bpf_gen__debug_regs(struct bpf_gen *gen, int reg1, int reg2, const char *fmt, ...)
> > +{
> > +	va_list args;
> > +
> > +	va_start(args, fmt);
> > +	__bpf_gen__debug(gen, reg1, reg2, fmt, args);
> > +	va_end(args);
> > +}
> > +
> > +static void bpf_gen__debug_ret(struct bpf_gen *gen, const char *fmt, ...)
> > +{
> > +	va_list args;
> > +
> > +	va_start(args, fmt);
> > +	__bpf_gen__debug(gen, BPF_REG_7, -1, fmt, args);
> > +	va_end(args);
> > +}
> > +
> > +static void bpf_gen__emit_sys_close(struct bpf_gen *gen, int stack_off)
> > +{
> > +	bpf_gen__emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, stack_off));
> > +	bpf_gen__emit(gen, BPF_JMP_IMM(BPF_JSLE, BPF_REG_1, 0, 2 + (gen->log_level ? 6 : 0)));
> 
> The number "6" is magic. This refers the number of insns generated below
> with
>    bpf_gen__debug_regs(gen, BPF_REG_9, BPF_REG_0, "close(%%d) = %%d");
> At least some comment will be better.

good point. will add a comment.
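
One possible way to spell the count out, as a sketch: the numbers are read off
the quoted __bpf_gen__debug() and bpf_gen__emit_sys_close(), and the helper
names debug_insn_cnt() and sys_close_skip_cnt() are made up for illustration.

```c
#include <assert.h>

/* Instruction counts taken from the quoted __bpf_gen__debug(). */
enum {
	DEBUG_BASE_INSNS = 2	/* ld_imm64 occupies two insn slots */
			 + 1	/* mov r2, len */
			 + 1,	/* call trace_printk */
	CLOSE_INSNS = 2,	/* mov r9, r1 + call sys_close */
};

/* Number of insns one debug call emits for the given registers. */
static int debug_insn_cnt(int log_level, int reg1, int reg2)
{
	if (!log_level)
		return 0;	/* nothing is emitted without logging */
	return DEBUG_BASE_INSNS + (reg1 >= 0) + (reg2 >= 0);
}

/* Jump distance needed to skip the close sequence when fd <= 0. */
static int sys_close_skip_cnt(int log_level)
{
	/* registers 9 and 0 mirror BPF_REG_9 and BPF_REG_0 in the patch */
	return CLOSE_INSNS + debug_insn_cnt(log_level, 9, 0);
}

static_assert(DEBUG_BASE_INSNS + 2 == 6, "two-register debug call is 6 insns");
```

With logging off the skip distance is 2; with logging on it is 2 + 6 = 8,
which matches the expression in the quoted BPF_JMP_IMM.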

> 
> > +	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_9, BPF_REG_1));
> > +	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_close));
> > +	bpf_gen__debug_regs(gen, BPF_REG_9, BPF_REG_0, "close(%%d) = %%d");
> > +}
> > +
> > +int bpf_gen__finish(struct bpf_gen *gen)
> > +{
> > +	int i;
> > +
> > +	bpf_gen__emit_sys_close(gen, stack_off(btf_fd));
> > +	for (i = 0; i < gen->nr_progs; i++)
> > +		bpf_gen__move_stack2ctx(gen, offsetof(struct bpf_loader_ctx,
> > +						      u[gen->nr_maps + i].map_fd), 4,
> 
> Maybe u[gen->nr_maps + i].prog_fd?
> u[..] is a union, but prog_fd better reflects what it is.

ohh. right.

> > +					stack_off(prog_fd[i]));
> > +	for (i = 0; i < gen->nr_maps; i++)
> > +		bpf_gen__move_stack2ctx(gen, offsetof(struct bpf_loader_ctx,
> > +						      u[i].prog_fd), 4,
> 
> u[i].map_fd?

right.
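
The swapped names did not change the generated offsets, since both members
live in the same union. A small sketch with a hypothetical, simplified layout
(the real struct bpf_loader_ctx is different) shows that:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified layout for illustration only. */
struct loader_ctx_sketch {
	unsigned int sz;
	union fd_slot {
		int map_fd;	/* slot used for a map */
		int prog_fd;	/* slot used for a program */
	} u[8];
};

/* Byte offset of the fd stored in slot 'idx', whichever name is used. */
static size_t fd_slot_off(size_t idx)
{
	return offsetof(struct loader_ctx_sketch, u) + idx * sizeof(union fd_slot);
}

static_assert(offsetof(union fd_slot, map_fd) == offsetof(union fd_slot, prog_fd),
	      "union members start at the same offset");
```

So the fix is about readability: the name should say which kind of fd the
slot holds.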

> > +	/* remember map_fd in the stack, if successful */
> > +	if (map_idx < 0) {
> > +		bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(inner_map_fd)));
> 
> Some comments here to indicate map_idx < 0 is for inner map creation will
> help understand the code.

will do.

> > +	/* store btf_id into insn[insn_idx].imm */
> > +	insn = (int)(long)&((struct bpf_insn *)(long)insns)[relo->insn_idx].imm;
> 
> This is really fancy. Maybe something like
> 	insn = insns + sizeof(struct bpf_insn) * relo->insn_idx + offsetof(struct
> bpf_insn, imm).
> Does this sound better?

yeah. much better.
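
A sketch of the suggested arithmetic, assuming 'insns' is the byte offset of
the first instruction inside the data blob. struct insn_sketch is a local
mirror of the uapi struct bpf_insn layout and insn_imm_off() is a made-up
name; the result should be the same byte offset the cast expression produces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the uapi struct bpf_insn layout, for illustration only. */
struct insn_sketch {
	uint8_t code;		/* opcode */
	uint8_t dst_reg:4;	/* destination register */
	uint8_t src_reg:4;	/* source register */
	int16_t off;		/* signed offset */
	int32_t imm;		/* signed immediate */
};

static_assert(sizeof(struct insn_sketch) == 8, "one BPF insn is 8 bytes");

/* Offset of insn[insn_idx].imm, counted from the start of the blob. */
static int insn_imm_off(int insns, int insn_idx)
{
	return insns + (int)sizeof(struct insn_sketch) * insn_idx +
	       (int)offsetof(struct insn_sketch, imm);
}
```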

> > +static void mark_feat_supported(enum kern_feature_id last_feat)
> > +{
> > +	struct kern_feature_desc *feat;
> > +	int i;
> > +
> > +	for (i = 0; i <= last_feat; i++) {
> > +		feat = &feature_probes[i];
> > +		WRITE_ONCE(feat->res, FEAT_SUPPORTED);
> > +	}
> 
> This assumes all earlier features than FD_IDX are supported. I think this is
> probably fine although it may not work for some weird backport.
> Did you see any issues if we don't explicitly set previous features
> supported?

This helper is only used as mark_feat_supported(FEAT_FD_IDX)
to tell libbpf that it shouldn't probe anything.
Otherwise probing via prog_load screws up gen_trace completely.
Maybe it could be mark_all_feat_supported(void), but that seems less flexible.

> > @@ -9383,7 +9512,13 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
> >   	}
> >   	/* kernel/module BTF ID */
> > -	err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
> > +	if (prog->obj->gen_trace) {
> > +		bpf_gen__record_find_name(prog->obj->gen_trace, attach_name, attach_type);
> > +		*btf_obj_fd = 0;
> > +		*btf_type_id = 1;
> 
> We have quite some codes like this and may add more to support more
> features. I am wondering whether we could have some kind of callbacks
> to make the code more streamlined. But I am not sure how easy it is.

you mean find_kernel_btf_id() in general?
This 'find' operation is translated differently: for the
prog name, as seen in this hunk, via bpf_gen__record_find_name(),
and via bpf_gen__record_extern() in another place.
For libbpf it's all find_kernel_btf_id(), but semantically they are different,
so they cannot map as-is to a gen trace bpf_gen__find_kernel_btf_id (if there was
such a thing), because such a 'generic' callback wouldn't convey the meaning
of what to do with the result of the find.

> > +	} else {
> > +		err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
> > +	}
> >   	if (err) {
> >   		pr_warn("failed to find kernel BTF type ID of '%s': %d\n", attach_name, err);
> >   		return err;
> > diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> > index b9b29baf1df8..a5dffc0a3369 100644
> > --- a/tools/lib/bpf/libbpf.map
> > +++ b/tools/lib/bpf/libbpf.map
> > @@ -361,4 +361,5 @@ LIBBPF_0.4.0 {
> >   		bpf_linker__new;
> >   		bpf_map__inner_map;
> >   		bpf_object__set_kversion;
> > +		bpf_load;
> 
> Based on alphabet ordering, this should move a few places earlier.
> 
> I will need to go through the patch again for better understanding ...

Thanks a lot for the review.

I'll address these comments and those that I got offline and will post v2.
This gen stuff will look quite different.
I hope bpf_load will not be a part of uapi anymore.
And 'struct bpf_gen' will not be exposed to bpftool directly.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH bpf-next 13/15] libbpf: Generate loader program out of BPF ELF file.
  2021-04-21  4:46     ` Alexei Starovoitov
@ 2021-04-21  5:30       ` Yonghong Song
  2021-04-21  6:06         ` Alexei Starovoitov
  2021-04-21 17:46       ` Andrii Nakryiko
  1 sibling, 1 reply; 30+ messages in thread
From: Yonghong Song @ 2021-04-21  5:30 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: davem, daniel, andrii, netdev, bpf, kernel-team



On 4/20/21 9:46 PM, Alexei Starovoitov wrote:
> On Tue, Apr 20, 2021 at 06:34:11PM -0700, Yonghong Song wrote:
>>>
>>> kconfig, typeless ksym, struct_ops and CO-RE are not supported yet.
>>
>> Beyond this, currently libbpf has a lot of flexibility between prog open
>> and load, change program type, key/value size, pin maps, max_entries, reuse
>> map, etc. it is worthwhile to mention this in the cover letter.
>> It is possible that these changes may defeat the purpose of signing the
>> program though.
> 
> Right. We'd need to decide which ones are ok to change after signature
> verification. I think max_entries gotta be allowed, since tools
> actively change it. The other fields selftest change too, but I'm not sure
> it's a good thing to allow for signed progs. TBD.
> 
>>> +	if (gen->error)
>>> +		return -ENOMEM;
>>
>> return gen->error?
> 
> right
> 
>>> +	if (off + size > UINT32_MAX) {
>>> +		gen->error = -ERANGE;
>>> +		return -ERANGE;
>>> +	}
>>> +	gen->insn_start = realloc(gen->insn_start, off + size);
>>> +	if (!gen->insn_start) {
>>> +		gen->error = -ENOMEM;
>>> +		return -ENOMEM;
>>> +	}
>>> +	gen->insn_cur = gen->insn_start + off;
>>> +	return 0;
>>> +}
>>> +
>>> +static int bpf_gen__realloc_data_buf(struct bpf_gen *gen, __u32 size)
>>
>> Maybe change the return type to size_t? Esp. in the below
>> we have off + size > UINT32_MAX.
> 
> return type? it's 0 or error. you mean argument type?
> I think u32 is better. The prog size and all other ways
> the bpf_gen__add_data is called with 32-bit values.

Sorry, I mean

+static int bpf_gen__add_data(struct bpf_gen *gen, const void *data, __u32 size)

Since we allow off + size to be close to UINT32_MAX,
maybe bpf_gen__add_data should return __u32 instead of int.
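
To make the suggestion concrete, a minimal sketch with an unsigned offset as
the return value and the error kept out of band. struct blob_sketch and
blob_add() are hypothetical, and the fixed 64-byte buffer is only there to
keep the example short; the real code grows its buffer with realloc().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical fixed-size blob standing in for the generator's data buffer. */
struct blob_sketch {
	uint8_t data[64];
	uint32_t used;	/* bytes already filled */
	int error;	/* sticky negative errno, 0 if none */
};

/* Append 'size' bytes and return the offset they were stored at.
 * The offset is unsigned, so values above INT_MAX stay representable.
 * On failure the error is recorded in b->error and 0 is returned.
 */
static uint32_t blob_add(struct blob_sketch *b, const void *data, uint32_t size)
{
	uint32_t prev;

	assert(b != NULL && data != NULL);
	if (b->error)
		return 0;
	if (size > sizeof(b->data) - b->used) {
		b->error = -ERANGE;
		return 0;
	}
	prev = b->used;
	memcpy(b->data + prev, data, size);
	b->used += size;
	return prev;
}
```

Returning 0 on failure mirrors the quoted function; callers are expected to
check the recorded error once at the end instead of after every call.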

> 
>>> +{
>>> +	size_t off = gen->data_cur - gen->data_start;
>>> +
>>> +	if (gen->error)
>>> +		return -ENOMEM;
>>
>> return gen->error?
> 
> right
> 
>>> +	if (off + size > UINT32_MAX) {
>>> +		gen->error = -ERANGE;
>>> +		return -ERANGE;
>>> +	}
>>> +	gen->data_start = realloc(gen->data_start, off + size);
>>> +	if (!gen->data_start) {
>>> +		gen->error = -ENOMEM;
>>> +		return -ENOMEM;
>>> +	}
>>> +	gen->data_cur = gen->data_start + off;
>>> +	return 0;
>>> +}
>>> +
>>> +static void bpf_gen__emit(struct bpf_gen *gen, struct bpf_insn insn)
>>> +{
>>> +	if (bpf_gen__realloc_insn_buf(gen, sizeof(insn)))
>>> +		return;
>>> +	memcpy(gen->insn_cur, &insn, sizeof(insn));
>>> +	gen->insn_cur += sizeof(insn);
>>> +}
>>> +
>>> +static void bpf_gen__emit2(struct bpf_gen *gen, struct bpf_insn insn1, struct bpf_insn insn2)
>>> +{
>>> +	bpf_gen__emit(gen, insn1);
>>> +	bpf_gen__emit(gen, insn2);
>>> +}
>>> +
>>> +void bpf_gen__init(struct bpf_gen *gen, int log_level)
>>> +{
>>> +	gen->log_level = log_level;
>>> +	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_6, BPF_REG_1));
>>> +	bpf_gen__emit(gen, BPF_ST_MEM(BPF_W, BPF_REG_10, stack_off(last_attach_btf_obj_fd), 0));
>>
>> Here we initialize last_attach_btf_obj_fd, do we need to initialize
>> last_btf_id?
> 
> Not sure why I inited it. Probably left over. I'll remove it.
> 
>>> +}
>>> +
>>> +static int bpf_gen__add_data(struct bpf_gen *gen, const void *data, __u32 size)
>>> +{
>>> +	void *prev;
>>> +
>>> +	if (bpf_gen__realloc_data_buf(gen, size))
>>> +		return 0;
>>> +	prev = gen->data_cur;
>>> +	memcpy(gen->data_cur, data, size);
>>> +	gen->data_cur += size;
>>> +	return prev - gen->data_start;
>>> +}
>>> +
>>> +static int insn_bytes_to_bpf_size(__u32 sz)
>>> +{
>>> +	switch (sz) {
>>> +	case 8: return BPF_DW;
>>> +	case 4: return BPF_W;
>>> +	case 2: return BPF_H;
>>> +	case 1: return BPF_B;
>>> +	default: return -1;
>>> +	}
>>> +}
>>> +
>> [...]
>>> +
>>> +static void __bpf_gen__debug(struct bpf_gen *gen, int reg1, int reg2, const char *fmt, va_list args)
>>> +{
>>> +	char buf[1024];
>>> +	int addr, len, ret;
>>> +
>>> +	if (!gen->log_level)
>>> +		return;
>>> +	ret = vsnprintf(buf, sizeof(buf), fmt, args);
>>> +	if (ret < 1024 - 7 && reg1 >= 0 && reg2 < 0)
>>> +		strcat(buf, " r=%d");
>>
>> Why only for reg1 >= 0 && reg2 < 0?
> 
> To avoid specifying BPF_REG_7 and adding " r=%%d" to printks explicitly.
> Just to make bpf_gen__debug_ret() short and less verbose.
> 
>>> +	len = strlen(buf) + 1;
>>> +	addr = bpf_gen__add_data(gen, buf, len);
>>> +
>>> +	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, addr));
>>> +	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_2, len));
>>> +	if (reg1 >= 0)
>>> +		bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_3, reg1));
>>> +	if (reg2 >= 0)
>>> +		bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_4, reg2));
>>> +	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_trace_printk));
>>> +}
>>> +
>>> +static void bpf_gen__debug_regs(struct bpf_gen *gen, int reg1, int reg2, const char *fmt, ...)
>>> +{
>>> +	va_list args;
>>> +
>>> +	va_start(args, fmt);
>>> +	__bpf_gen__debug(gen, reg1, reg2, fmt, args);
>>> +	va_end(args);
>>> +}
>>> +
>>> +static void bpf_gen__debug_ret(struct bpf_gen *gen, const char *fmt, ...)
>>> +{
>>> +	va_list args;
>>> +
>>> +	va_start(args, fmt);
>>> +	__bpf_gen__debug(gen, BPF_REG_7, -1, fmt, args);
>>> +	va_end(args);
>>> +}
>>> +
>>> +static void bpf_gen__emit_sys_close(struct bpf_gen *gen, int stack_off)
>>> +{
>>> +	bpf_gen__emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, stack_off));
>>> +	bpf_gen__emit(gen, BPF_JMP_IMM(BPF_JSLE, BPF_REG_1, 0, 2 + (gen->log_level ? 6 : 0)));
>>
>> The number "6" is magic. This refers the number of insns generated below
>> with
>>     bpf_gen__debug_regs(gen, BPF_REG_9, BPF_REG_0, "close(%%d) = %%d");
>> At least some comment will be better.
> 
> good point. will add a comment.
> 
>>
>>> +	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_9, BPF_REG_1));
>>> +	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_close));
>>> +	bpf_gen__debug_regs(gen, BPF_REG_9, BPF_REG_0, "close(%%d) = %%d");
>>> +}
>>> +
>>> +int bpf_gen__finish(struct bpf_gen *gen)
>>> +{
>>> +	int i;
>>> +
>>> +	bpf_gen__emit_sys_close(gen, stack_off(btf_fd));
>>> +	for (i = 0; i < gen->nr_progs; i++)
>>> +		bpf_gen__move_stack2ctx(gen, offsetof(struct bpf_loader_ctx,
>>> +						      u[gen->nr_maps + i].map_fd), 4,
>>
>> Maybe u[gen->nr_maps + i].prog_fd?
>> u[..] is a union, but prog_fd better reflects what it is.
> 
> ohh. right.
> 
>>> +					stack_off(prog_fd[i]));
>>> +	for (i = 0; i < gen->nr_maps; i++)
>>> +		bpf_gen__move_stack2ctx(gen, offsetof(struct bpf_loader_ctx,
>>> +						      u[i].prog_fd), 4,
>>
>> u[i].map_fd?
> 
> right.
> 
>>> +	/* remember map_fd in the stack, if successful */
>>> +	if (map_idx < 0) {
>>> +		bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(inner_map_fd)));
>>
>> Some comments here to indicate map_idx < 0 is for inner map creation will
>> help understand the code.
> 
> will do.
> 
>>> +	/* store btf_id into insn[insn_idx].imm */
>>> +	insn = (int)(long)&((struct bpf_insn *)(long)insns)[relo->insn_idx].imm;
>>
>> This is really fancy. Maybe something like
>> 	insn = insns + sizeof(struct bpf_insn) * relo->insn_idx + offsetof(struct
>> bpf_insn, imm).
>> Does this sound better?
> 
> yeah. much better.
> 
>>> +static void mark_feat_supported(enum kern_feature_id last_feat)
>>> +{
>>> +	struct kern_feature_desc *feat;
>>> +	int i;
>>> +
>>> +	for (i = 0; i <= last_feat; i++) {
>>> +		feat = &feature_probes[i];
>>> +		WRITE_ONCE(feat->res, FEAT_SUPPORTED);
>>> +	}
>>
>> This assumes all earlier features than FD_IDX are supported. I think this is
>> probably fine although it may not work for some weird backport.
>> Did you see any issues if we don't explicitly set previous features
>> supported?
> 
> This helper is only used as mark_feat_supported(FEAT_FD_IDX)
> to tell libbpf that it shouldn't probe anything.
> Otherwise probing via prog_load screw up gen_trace completely.
> May be it will be mark_all_feat_supported(void), but that seems less flexible.

Maybe add some comments here to explain why the features are explicitly
marked as supported instead of being probed?

> 
>>> @@ -9383,7 +9512,13 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
>>>    	}
>>>    	/* kernel/module BTF ID */
>>> -	err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
>>> +	if (prog->obj->gen_trace) {
>>> +		bpf_gen__record_find_name(prog->obj->gen_trace, attach_name, attach_type);
>>> +		*btf_obj_fd = 0;
>>> +		*btf_type_id = 1;
>>
>> We have quite some codes like this and may add more to support more
>> features. I am wondering whether we could have some kind of callbacks
>> to make the code more streamlined. But I am not sure how easy it is.
> 
> you mean find_kernel_btf_id() in general?
> This 'find' operation is translated differently for
> prog name as seen in this hunk via bpf_gen__record_find_name()
> and via bpf_gen__record_extern() in another place.
> For libbpf it's all find_kernel_btf_id(), but semantically they are different,
> so they cannot map as-is to gen trace bpf_gen__find_kernel_btf_id (if there was
> such thing).
> Because such 'generic' callback wouldn't convey the meaning of what to do
> with the result of the find.

I mean like calling
     err = obj->ops->find_kernel_btf_id(...)
where gen_trace and normal libbpf each register their own callback
function for find_kernel_btf_id(). Similar ideas may or may not apply to
other places. Not 100% sure this is the best approach,
just want to bring it up for discussion.

> 
>>> +	} else {
>>> +		err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
>>> +	}
>>>    	if (err) {
>>>    		pr_warn("failed to find kernel BTF type ID of '%s': %d\n", attach_name, err);
>>>    		return err;
>>> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
>>> index b9b29baf1df8..a5dffc0a3369 100644
>>> --- a/tools/lib/bpf/libbpf.map
>>> +++ b/tools/lib/bpf/libbpf.map
>>> @@ -361,4 +361,5 @@ LIBBPF_0.4.0 {
>>>    		bpf_linker__new;
>>>    		bpf_map__inner_map;
>>>    		bpf_object__set_kversion;
>>> +		bpf_load;
>>
>> Based on alphabet ordering, this should move a few places earlier.
>>
>> I will need to go through the patch again for better understanding ...
> 
> Thanks a lot for the review.
> 
> I'll address these comments and those that I got offline and will post v2.
> This gen stuff will look quite different.
> I hope bpf_load will not be a part of uapi anymore.
> And 'struct bpf_gen' will not be exposed to bpftool directly.

Sounds good.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH bpf-next 13/15] libbpf: Generate loader program out of BPF ELF file.
  2021-04-21  5:30       ` Yonghong Song
@ 2021-04-21  6:06         ` Alexei Starovoitov
  2021-04-21 14:05           ` Yonghong Song
  0 siblings, 1 reply; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-21  6:06 UTC (permalink / raw)
  To: Yonghong Song; +Cc: davem, daniel, andrii, netdev, bpf, kernel-team

On Tue, Apr 20, 2021 at 10:30:21PM -0700, Yonghong Song wrote:
> > > > +
> > > > +static int bpf_gen__realloc_data_buf(struct bpf_gen *gen, __u32 size)
> > > 
> > > Maybe change the return type to size_t? Esp. in the below
> > > we have off + size > UINT32_MAX.
> > 
> > return type? it's 0 or error. you mean argument type?
> > I think u32 is better. The prog size and all other ways
> > the bpf_gen__add_data is called with 32-bit values.
> 
> Sorry, I mean
> 
> +static int bpf_gen__add_data(struct bpf_gen *gen, const void *data, __u32
> size)
> 
> Since we allow off + size could be close to UINT32_MAX,
> maybe bpf_gen__add_data should return __u32 instead of int.

ahh. that makes sense.

> > This helper is only used as mark_feat_supported(FEAT_FD_IDX)
> > to tell libbpf that it shouldn't probe anything.
> > Otherwise probing via prog_load screw up gen_trace completely.
> > May be it will be mark_all_feat_supported(void), but that seems less flexible.
> 
> Maybe add some comments here to explain why marking explicit supported
> instead if probing?

will do.

> > 
> > > > @@ -9383,7 +9512,13 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
> > > >    	}
> > > >    	/* kernel/module BTF ID */
> > > > -	err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
> > > > +	if (prog->obj->gen_trace) {
> > > > +		bpf_gen__record_find_name(prog->obj->gen_trace, attach_name, attach_type);
> > > > +		*btf_obj_fd = 0;
> > > > +		*btf_type_id = 1;
> > > 
> > > We have quite some codes like this and may add more to support more
> > > features. I am wondering whether we could have some kind of callbacks
> > > to make the code more streamlined. But I am not sure how easy it is.
> > 
> > you mean find_kernel_btf_id() in general?
> > This 'find' operation is translated differently for
> > prog name as seen in this hunk via bpf_gen__record_find_name()
> > and via bpf_gen__record_extern() in another place.
> > For libbpf it's all find_kernel_btf_id(), but semantically they are different,
> > so they cannot map as-is to gen trace bpf_gen__find_kernel_btf_id (if there was
> > such thing).
> > Because such 'generic' callback wouldn't convey the meaning of what to do
> > with the result of the find.
> 
> I mean like calling
>     err = obj->ops->find_kernel_btf_id(...)
> where gen_trace and normal libbpf all registers their own callback functions
> for find_kernel_btf_id(). Similar ideas can be applied to
> other places or not. Not 100% sure this is the best approach or not,
> just want to bring it up for discussion.

What args would 'ops->find_kernel_btf_id' have?
If it's done as-is, with btf_obj_fd and btf_type_id pointers to store the results,
how will libbpf invoke it?
Where do these destination pointers point to?
In one case the destination is the btf_id inside bpf_attr to load a prog.
In another case the destination is a btf_id inside a bpf_insn ld_imm64.
In yet another case it could be a different bpf_insn.
That's what I meant: the semantic context matters
and cannot be expressed in a single callback.
bpf_gen__record_find_name vs bpf_gen__record_extern have this semantic
difference built into their names. They will be called by libbpf differently.

If you mean allowing all such callbacks to be specified via ops and indirect
pointers instead of specific bpf_gen__foo/bar callbacks, then it's certainly
doable, I just don't see a use case for it. No one has bothered to do this
kind of 'strace of libbpf'. It's also not exactly an strace. It's
recording the sequence of events that libbpf is doing.
Consider patch 12. It changes the order of
bpf_object__relocate_data and text. It doesn't call any new bpf_gen__ methods.
But the data these methods will see later is different. In this case they will
see relo->insn_idx that is correct for the whole 'main' program after
subprogs were appended to the end instead of relo->insn_idx that points
within a given subprog.
So this gen_trace logic is very tightly built around libbpf loading
internals and will change in the future as more features are supported
by this loader prog (like CO-RE).
Hence I don't think the 'callback' idea fits here, since a callback assumes
generic infra that will likely stay. Whereas here bpf_gen__ methods
are more like tracepoints inside libbpf that will be added and removed.
Nothing stable about them.
If this loader prog logic was built from scratch it probably would be different.
It would just parse elf and relocate text and data.
It would certainly not have hacks like "*btf_obj_fd = 0; *btf_type_id = 1;"
They're only there to avoid changing every check inside libbpf that
assumes that if a helper succeeded these values are valid.
Like if map_create is a success the resulting fd != -1.
The alternative is to do 'if (obj->gen_trace)' in a lot more places
which looks less appealing. I hope to reduce the number of such hacks, of course.
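
A toy sketch of the distinction, with made-up types and names (nothing here is
the actual bpf_gen__ API): two recording hooks that both stem from a 'find',
but remember different destinations for the result, plus the placeholder
values that keep the caller's existing checks happy.

```c
#include <assert.h>
#include <stddef.h>

/* Where the loader prog must later store the btf_id it finds. */
enum find_dst {
	DST_PROG_LOAD_ATTR,	/* into the bpf_attr used to load a prog */
	DST_INSN_IMM,		/* into insn[insn_idx].imm */
};

struct find_rec {
	enum find_dst dst;
	const char *name;
	int insn_idx;		/* only meaningful for DST_INSN_IMM */
};

struct trace_sketch {
	struct find_rec recs[16];
	size_t cnt;
};

/* Hypothetical counterpart of a "find by attach name" hook. */
static void sketch_record_find_name(struct trace_sketch *t, const char *attach_name)
{
	assert(t->cnt < 16);
	t->recs[t->cnt++] = (struct find_rec){ DST_PROG_LOAD_ATTR, attach_name, -1 };
}

/* Hypothetical counterpart of an "extern symbol" hook. */
static void sketch_record_extern(struct trace_sketch *t, const char *name, int insn_idx)
{
	assert(t->cnt < 16);
	t->recs[t->cnt++] = (struct find_rec){ DST_INSN_IMM, name, insn_idx };
}

/* Caller in the style of the quoted hunk: record, then hand back dummies. */
static int sketch_find_attach_btf_id(struct trace_sketch *t, const char *attach_name,
				     int *btf_obj_fd, int *btf_type_id)
{
	sketch_record_find_name(t, attach_name);
	*btf_obj_fd = 0;	/* placeholders so later validity checks pass */
	*btf_type_id = 1;
	return 0;
}
```

A single generic find callback would have to carry the destination as an extra
argument anyway, which is the semantic difference the two hook names encode.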

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH bpf-next 13/15] libbpf: Generate loader program out of BPF ELF file.
  2021-04-21  6:06         ` Alexei Starovoitov
@ 2021-04-21 14:05           ` Yonghong Song
  0 siblings, 0 replies; 30+ messages in thread
From: Yonghong Song @ 2021-04-21 14:05 UTC (permalink / raw)
  To: Alexei Starovoitov; +Cc: davem, daniel, andrii, netdev, bpf, kernel-team



On 4/20/21 11:06 PM, Alexei Starovoitov wrote:
> On Tue, Apr 20, 2021 at 10:30:21PM -0700, Yonghong Song wrote:
>>>>> +
>>>>> +static int bpf_gen__realloc_data_buf(struct bpf_gen *gen, __u32 size)
>>>>
>>>> Maybe change the return type to size_t? Esp. in the below
>>>> we have off + size > UINT32_MAX.
>>>
>>> return type? it's 0 or error. you mean argument type?
>>> I think u32 is better. The prog size and all other ways
>>> the bpf_gen__add_data is called with 32-bit values.
>>
>> Sorry, I mean
>>
>> +static int bpf_gen__add_data(struct bpf_gen *gen, const void *data, __u32
>> size)
>>
>> Since we allow off + size could be close to UINT32_MAX,
>> maybe bpf_gen__add_data should return __u32 instead of int.
> 
> ahh. that makes sense.
> 
>>> This helper is only used as mark_feat_supported(FEAT_FD_IDX)
>>> to tell libbpf that it shouldn't probe anything.
>>> Otherwise probing via prog_load screw up gen_trace completely.
>>> May be it will be mark_all_feat_supported(void), but that seems less flexible.
>>
>> Maybe add some comments here to explain why marking explicit supported
>> instead if probing?
> 
> will do.
> 
>>>
>>>>> @@ -9383,7 +9512,13 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
>>>>>     	}
>>>>>     	/* kernel/module BTF ID */
>>>>> -	err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
>>>>> +	if (prog->obj->gen_trace) {
>>>>> +		bpf_gen__record_find_name(prog->obj->gen_trace, attach_name, attach_type);
>>>>> +		*btf_obj_fd = 0;
>>>>> +		*btf_type_id = 1;
>>>>
>>>> We have quite some codes like this and may add more to support more
>>>> features. I am wondering whether we could have some kind of callbacks
>>>> to make the code more streamlined. But I am not sure how easy it is.
>>>
>>> you mean find_kernel_btf_id() in general?
>>> This 'find' operation is translated differently for
>>> prog name as seen in this hunk via bpf_gen__record_find_name()
>>> and via bpf_gen__record_extern() in another place.
>>> For libbpf it's all find_kernel_btf_id(), but semantically they are different,
>>> so they cannot map as-is to gen trace bpf_gen__find_kernel_btf_id (if there was
>>> such thing).
>>> Because such 'generic' callback wouldn't convey the meaning of what to do
>>> with the result of the find.
>>
>> I mean like calling
>>      err = obj->ops->find_kernel_btf_id(...)
>> where gen_trace and normal libbpf all registers their own callback functions
>> for find_kernel_btf_id(). Similar ideas can be applied to
>> other places or not. Not 100% sure this is the best approach or not,
>> just want to bring it up for discussion.
> 
> What args that 'ops->find_kernel_btf_id' will have?
> If it's done as-is with btf_obj_fd, btf_type_id pointers to store the results
> how libbpf will invoke it?
> Where this destination pointers point to?
> In one case the desitination is btf_id inside bpf_attr to load a prog.
> In other case the destination is a btf_id inside bpf_insn ld_imm64.
> In other case it could be different bpf_insn.
> That's what I meant that semantical context matters
> and cannot be expressed a single callback.
> bpf_gen__record_find_name vs bpf_gen__record_extern have this semantical
> difference builtin into their names. They will be called by libbpf differently.
> 
> If you mean to allow to specify all such callbacks via ops and indirect
> pointers instead of specific bpf_gen__foo/bar callbacks then it's certainly
> doable I just don't see a use case for it. No one bothered to do this
> kind of 'strace of libbpf'. It's also not exactly an strace. It's
> recording the sequence of events that libbpf is doing.
> Consider patch 12. It changes the order of
> bpf_object__relocate_data and text. It doesn't call any new bpf_gen__ methods.
> But the data these methods will see later is different. In this case they will
> see relo->insn_idx that is correct for the whole 'main' program after
> subprogs were appended to the end instead of relo->insn_idx that points
> within a given subprog.
> So this gen_trace logic is very tightly built around libbpf loading
> internals and will change in the future as more features will be supported
> by this loader prog (like CO-RE).
> Hence I don't think 'callback' idea fits here, since callback assumes
> generic infra that will likely stay. Whereas here bpf_gen__ methods
> are more like tracepoints inside libbpf that will be added and removed.
> Nothing stable about them.
> If this loader prog logic was built from scratch it probably would be different.
> It would just parse elf and relocate text and data.
> It would certainly not have hacks like "*btf_obj_fd = 0; *btf_type_id = 1;"
> They're only there to avoid changing every check inside libbpf that
> assumes that if a helper succeeded these values are valid.
> Like if map_create is a success the resulting fd != -1.
> The alternative is to do 'if (obj->gen_trace)' in a lot more places
> which looks less appealing. I hope to reduce the number of such hacks, of course.

Thanks for the explanation. Agreed that gen_trace and non-gen_trace are two
totally different actions, as you said. Reducing the number of hacks
will make the code better too.

^ permalink raw reply	[flat|nested] 30+ messages in thread

* Re: [PATCH bpf-next 13/15] libbpf: Generate loader program out of BPF ELF file.
  2021-04-21  4:46     ` Alexei Starovoitov
  2021-04-21  5:30       ` Yonghong Song
@ 2021-04-21 17:46       ` Andrii Nakryiko
  2021-04-21 17:50         ` Alexei Starovoitov
  1 sibling, 1 reply; 30+ messages in thread
From: Andrii Nakryiko @ 2021-04-21 17:46 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Yonghong Song, David S. Miller, Daniel Borkmann, Andrii Nakryiko,
	Networking, bpf, Kernel Team

On Tue, Apr 20, 2021 at 9:46 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Apr 20, 2021 at 06:34:11PM -0700, Yonghong Song wrote:
> > >
> > > kconfig, typeless ksym, struct_ops and CO-RE are not supported yet.
> >
> > Beyond this, currently libbpf has a lot of flexibility between prog open
> > and load, change program type, key/value size, pin maps, max_entries, reuse
> > map, etc. it is worthwhile to mention this in the cover letter.
> > It is possible that these changes may defeat the purpose of signing the
> > program though.
>
> Right. We'd need to decide which ones are ok to change after signature
> verification. I think max_entries gotta be allowed, since tools
> actively change it. The other fields selftest change too, but I'm not sure
> it's a good thing to allow for signed progs. TBD.
>

[...]

>
> > > +static void mark_feat_supported(enum kern_feature_id last_feat)
> > > +{
> > > +   struct kern_feature_desc *feat;
> > > +   int i;
> > > +
> > > +   for (i = 0; i <= last_feat; i++) {
> > > +           feat = &feature_probes[i];
> > > +           WRITE_ONCE(feat->res, FEAT_SUPPORTED);
> > > +   }
> >
> > This assumes all earlier features than FD_IDX are supported. I think this is
> > probably fine although it may not work for some weird backport.
> > Did you see any issues if we don't explicitly set previous features
> > supported?
>
> This helper is only used as mark_feat_supported(FEAT_FD_IDX)
> to tell libbpf that it shouldn't probe anything.
> Otherwise probing via prog_load screw up gen_trace completely.
> May be it will be mark_all_feat_supported(void), but that seems less flexible.


mark_feat_supported() is changing global state irreversibly, which is
not great. I think it will be cleaner to just pass bpf_object * into
kernel_supports() helper, and there return true if obj->gen_trace is
set. That way it won't affect any other use cases that can happen in
the same process (not that there are any right now, but still). I
checked and in all current places there is obj available or it can be
accessed through prog->obj, so this shouldn't be a problem.
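
A minimal sketch of that suggestion, using simplified stand-in types. The
struct layout, probe_feature() and feature_cache below are hypothetical and
only illustrate the idea; they are not libbpf's actual internals. The point
is that the gen_trace check lives on the object, so the process-wide probe
cache is never touched when a loader program is being generated:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, trimmed-down feature list (stand-in for libbpf's enum). */
enum kern_feature_id { FEAT_PROG_NAME, FEAT_GLOBAL_DATA, FEAT_FD_IDX, __FEAT_CNT };

enum kern_feature_result { FEAT_UNKNOWN = 0, FEAT_SUPPORTED, FEAT_MISSING };

/* Stand-in for struct bpf_object: only the field this sketch needs. */
struct bpf_object {
	void *gen_trace;	/* non-NULL while generating a loader program */
};

/* Process-wide cache of probe results, shared by all objects. */
static enum kern_feature_result feature_cache[__FEAT_CNT];
static int probe_calls;	/* counts how often the (stub) probe really runs */

/* Stub probe. A real probe would issue sys_bpf() calls; this one just
 * pretends the running kernel lacks FD_IDX.
 */
static bool probe_feature(enum kern_feature_id id)
{
	probe_calls++;
	return id != FEAT_FD_IDX;
}

static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id id)
{
	assert((int)id >= 0 && id < __FEAT_CNT);

	/* Loader generation: report everything as supported, skip probing,
	 * and leave the shared cache untouched.
	 */
	if (obj && obj->gen_trace)
		return true;

	/* Normal path: probe once, then serve the cached result. */
	if (feature_cache[id] == FEAT_UNKNOWN)
		feature_cache[id] = probe_feature(id) ? FEAT_SUPPORTED : FEAT_MISSING;

	return feature_cache[id] == FEAT_SUPPORTED;
}
```

With this shape, an object without gen_trace set in the same process still
gets real probe results, which is the property mark_feat_supported() loses.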


[...]


* Re: [PATCH bpf-next 13/15] libbpf: Generate loader program out of BPF ELF file.
  2021-04-21 17:46       ` Andrii Nakryiko
@ 2021-04-21 17:50         ` Alexei Starovoitov
  0 siblings, 0 replies; 30+ messages in thread
From: Alexei Starovoitov @ 2021-04-21 17:50 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Yonghong Song, David S. Miller, Daniel Borkmann, Andrii Nakryiko,
	Networking, bpf, Kernel Team

On Wed, Apr 21, 2021 at 10:46 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Apr 20, 2021 at 9:46 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Tue, Apr 20, 2021 at 06:34:11PM -0700, Yonghong Song wrote:
> > > >
> > > > kconfig, typeless ksym, struct_ops and CO-RE are not supported yet.
> > >
> > > Beyond this, currently libbpf has a lot of flexibility between prog open
> > > and load, change program type, key/value size, pin maps, max_entries, reuse
> > > map, etc. it is worthwhile to mention this in the cover letter.
> > > It is possible that these changes may defeat the purpose of signing the
> > > program though.
> >
> > Right. We'd need to decide which ones are ok to change after signature
> > verification. I think max_entries gotta be allowed, since tools
> > actively change it. Selftests change the other fields too, but I'm not sure
> > it's a good thing to allow for signed progs. TBD.
> >
>
> [...]
>
> >
> > > > +static void mark_feat_supported(enum kern_feature_id last_feat)
> > > > +{
> > > > +   struct kern_feature_desc *feat;
> > > > +   int i;
> > > > +
> > > > +   for (i = 0; i <= last_feat; i++) {
> > > > +           feat = &feature_probes[i];
> > > > +           WRITE_ONCE(feat->res, FEAT_SUPPORTED);
> > > > +   }
> > >
> > > This assumes all earlier features than FD_IDX are supported. I think this is
> > > probably fine although it may not work for some weird backport.
> > > Did you see any issues if we don't explicitly set previous features
> > > supported?
> >
> > This helper is only used as mark_feat_supported(FEAT_FD_IDX)
> > to tell libbpf that it shouldn't probe anything.
> > Otherwise probing via prog_load screws up gen_trace completely.
> > Maybe it could be mark_all_feat_supported(void), but that seems less flexible.
>
>
> mark_feat_supported() is changing global state irreversibly, which is
> not great. I think it will be cleaner to just pass bpf_object * into
> kernel_supports() helper, and there return true if obj->gen_trace is
> set. That way it won't affect any other use cases that can happen in
> the same process (not that there are any right now, but still). I
> checked and in all current places there is obj available or it can be
> accessed through prog->obj, so this shouldn't be a problem.

sure. Will use that.


Thread overview: 30+ messages
2021-04-17  3:32 [PATCH bpf-next 00/15] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 01/15] bpf: Introduce bpf_sys_bpf() helper and program type Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 02/15] bpf: Introduce bpfptr_t user/kernel pointer Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 03/15] bpf: Prepare bpf syscall to be used from kernel and user space Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 04/15] libbpf: Support for syscall program type Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 05/15] selftests/bpf: Test " Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 06/15] bpf: Make btf_load command to be bpfptr_t compatible Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 07/15] selftests/bpf: Test for btf_load command Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 08/15] bpf: Introduce fd_idx Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 09/15] libbpf: Support for fd_idx Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 10/15] bpf: Add bpf_btf_find_by_name_kind() helper Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 11/15] bpf: Add bpf_sys_close() helper Alexei Starovoitov
2021-04-17  3:42   ` Al Viro
2021-04-17  3:46     ` Alexei Starovoitov
2021-04-17  4:04       ` Al Viro
2021-04-17  5:01         ` Alexei Starovoitov
2021-04-17 14:36           ` Alexei Starovoitov
2021-04-17 16:48             ` Al Viro
2021-04-17 17:09               ` Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 12/15] libbpf: Change the order of data and text relocations Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 13/15] libbpf: Generate loader program out of BPF ELF file Alexei Starovoitov
2021-04-21  1:34   ` Yonghong Song
2021-04-21  4:46     ` Alexei Starovoitov
2021-04-21  5:30       ` Yonghong Song
2021-04-21  6:06         ` Alexei Starovoitov
2021-04-21 14:05           ` Yonghong Song
2021-04-21 17:46       ` Andrii Nakryiko
2021-04-21 17:50         ` Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 14/15] bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command Alexei Starovoitov
2021-04-17  3:32 ` [PATCH bpf-next 15/15] selftests/bpf: Convert few tests to light skeleton Alexei Starovoitov
