BPF Archive on lore.kernel.org
* [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton.
@ 2021-04-23  0:26 Alexei Starovoitov
  2021-04-23  0:26 ` [PATCH v2 bpf-next 01/16] bpf: Introduce bpf_sys_bpf() helper and program type Alexei Starovoitov
                   ` (16 more replies)
  0 siblings, 17 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

v1->v2: Addressed comments from Al, Yonghong and Andrii.
- documented sys_close fdget/fdput requirement and non-recursion check.
- reduced internal api leaks between libbpf and bpftool.
  Now bpf_object__gen_loader() is the only new libbpf api with minimal fields.
- fixed light skeleton __destroy() method to munmap and close maps and progs.
- refactored bpf_btf_find_by_name_kind to return btf_id | (btf_obj_fd << 32).
- refactored use of bpf_btf_find_by_name_kind from loader prog.
- moved auto-gen like code into skel_internal.h that is used by *.lskel.h
  It has minimal static inline bpf_load_and_run() method used by lskel.
- added lskel.h example in patch 15.
- replaced union bpf_map_prog_desc with struct bpf_map_desc and struct bpf_prog_desc.
- removed mark_feat_supported and added a patch to pass 'obj' into kernel_supports.
- added proper tracking of temporary FDs in loader prog and their cleanup via bpf_sys_close.
- rename gen_trace.c into gen_loader.c to better align the naming throughout.
- expanded number of available helpers in new prog type.
- added support for raw_tp attaching in lskel.
  lskel supports tracing and raw_tp progs now.
  It correctly loads all networking prog types too, but __attach() method is tbd.
- converted progs/test_ksyms_module.c to lskel.
- minor feedback fixes all over.
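
The "btf_id | (btf_obj_fd << 32)" return value mentioned in the list above
can be pictured with a small userspace sketch. The toy_* names are
hypothetical and only illustrate the bit layout; how the real helper reports
errors is not described here and is not modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Pack two 32-bit values into one 64-bit return value:
 * low 32 bits = btf_id, high 32 bits = btf_obj_fd.
 * The cast to uint64_t must happen before the shift; shifting a
 * 32-bit value by 32 is undefined behaviour in C.
 */
uint64_t toy_pack_btf_ret(uint32_t btf_id, uint32_t btf_obj_fd)
{
	return (uint64_t)btf_id | ((uint64_t)btf_obj_fd << 32);
}

/* Recover the low half (the type id). */
uint32_t toy_ret_btf_id(uint64_t ret)
{
	return (uint32_t)ret;
}

/* Recover the high half (the FD of the BTF object). */
uint32_t toy_ret_btf_obj_fd(uint64_t ret)
{
	return (uint32_t)(ret >> 32);
}
```

A caller would split the single return value with the two accessors instead
of needing a second output argument.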

One thing that was not addressed from feedback is the name of new program type.
Currently it's still:
BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */

The concern raised was that it sounds like a program that should be attached
to a syscall. Like BPF_PROG_TYPE_KPROBE is used to process kprobes.
I've considered and rejected:
BPF_PROG_TYPE_USER - too generic
BPF_PROG_TYPE_USERCTX - ambiguous with uprobes
BPF_PROG_TYPE_LOADER - ok-ish, but imo TYPE_SYSCALL is cleaner.
Other suggestions?

The description of V1 set is still valid:
----
This is a first step towards signed bpf programs and the third approach of that kind.
The first approach was to bring libbpf into the kernel as a user-mode-driver.
The second approach was to invent a new file format and let kernel execute
that format as a sequence of syscalls that create maps and load programs.
This third approach uses a new type of bpf program instead of inventing a file format.
The 1st and 2nd approaches had too many downsides compared to this 3rd and were discarded
after months of work.

To make it work the following new concepts are introduced:
1. syscall bpf program type
A kind of bpf program that can do sys_bpf and sys_close syscalls.
It can only execute in user context.

2. FD array or FD index.
Traditionally BPF instructions are patched with FDs.
This means that maps have to be created first and the instructions modified afterwards,
which breaks signature verification if the program is signed.
Instead of patching each instruction with an FD, patch it with an index into an array of FDs.
That keeps the program signature stable if it uses maps.

3. loader program that is generated as "strace of libbpf".
When libbpf is loading bpf_file.o it does a bunch of sys_bpf() syscalls to
load BTF, create maps, populate maps and finally load programs.
Instead of actually doing the syscalls, generate a trace of what libbpf
would have done and represent it as the "loader program".
The "loader program" consists of a single map and a single bpf program that
does those syscalls.
Executing such a "loader program" via the bpf_prog_test_run() command will
replay the sequence of syscalls that libbpf would have done, which will result
in the same maps created and programs loaded as specified in the elf file.
The "loader program" removes libelf and the majority of the libbpf dependency
from the program loading process.

4. light skeleton
Instead of embedding the whole elf file into the skeleton and using libbpf
to parse it later, generate a loader program and embed it into a "light skeleton".
Such a skeleton can load the same set of elf files, but it doesn't need
libbpf and libelf to do that. It only needs a few sys_bpf wrappers.
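
To illustrate why concept 2 (FD index) keeps a signature stable, here is a
minimal userspace sketch. It is not the kernel implementation: struct
toy_insn, toy_digest() and the other toy_* names are hypothetical, and an
FNV-1a hash merely stands in for a real signature digest.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified instruction; the real struct bpf_insn has
 * more fields. 'imm' holds either a map FD (traditional patching) or
 * an index into an FD array (the scheme described above).
 */
struct toy_insn {
	uint8_t code;
	int32_t imm;
};

static uint64_t toy_mix(uint64_t h, uint8_t byte)
{
	return (h ^ byte) * 0x100000001b3ULL;	/* FNV-1a step */
}

/* Digest over the instruction stream, field by field so that struct
 * padding never enters the hash.
 */
uint64_t toy_digest(const struct toy_insn *insns, size_t cnt)
{
	uint64_t h = 0xcbf29ce484222325ULL;

	for (size_t i = 0; i < cnt; i++) {
		h = toy_mix(h, insns[i].code);
		for (int b = 0; b < 4; b++)
			h = toy_mix(h, (uint8_t)((uint32_t)insns[i].imm >> (8 * b)));
	}
	return h;
}

/* Traditional scheme: the FD known only at load time is written into
 * the instruction, so the digested bytes change.
 */
void toy_patch_fd(struct toy_insn *insn, int fd)
{
	insn->imm = fd;
}

/* FD-index scheme: the instruction is left untouched and the FD is
 * looked up in a separate array. Returns -1 if the index is invalid.
 */
int toy_resolve_fd(const struct toy_insn *insn, const int *fd_array, size_t n)
{
	if (insn->imm < 0 || (size_t)insn->imm >= n)
		return -1;
	return fd_array[insn->imm];
}
```

In this model, resolving through the FD array leaves toy_digest() unchanged,
while toy_patch_fd() with a different FD value alters it, which is the
property a signed program needs.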

Future steps:
- support CO-RE in the kernel. This patch set is already too big,
so that critical feature is left for the next step.
- generate light skeleton in golang to allow such users to use BTF and
all other features provided by libbpf
- generate light skeleton for kernel, so that bpf programs can be embedded
in the kernel module. The UMD usage in bpf_preload will be replaced with
such skeleton, so bpf_preload would become a standard kernel module
without user space dependency.
- finally do the signing of the loader program.

The patches are work in progress with a few rough edges.

Alexei Starovoitov (16):
  bpf: Introduce bpf_sys_bpf() helper and program type.
  bpf: Introduce bpfptr_t user/kernel pointer.
  bpf: Prepare bpf syscall to be used from kernel and user space.
  libbpf: Support for syscall program type
  selftests/bpf: Test for syscall program type
  bpf: Make btf_load command to be bpfptr_t compatible.
  selftests/bpf: Test for btf_load command.
  bpf: Introduce fd_idx
  libbpf: Support for fd_idx
  bpf: Add bpf_btf_find_by_name_kind() helper.
  bpf: Add bpf_sys_close() helper.
  libbpf: Change the order of data and text relocations.
  libbpf: Add bpf_object pointer to kernel_supports().
  libbpf: Generate loader program out of BPF ELF file.
  bpftool: Use syscall/loader program in "prog load" and "gen skeleton"
    command.
  selftests/bpf: Convert few tests to light skeleton.

 include/linux/bpf.h                           |  19 +-
 include/linux/bpf_types.h                     |   2 +
 include/linux/bpf_verifier.h                  |   1 +
 include/linux/bpfptr.h                        |  81 +++
 include/linux/btf.h                           |   2 +-
 include/uapi/linux/bpf.h                      |  39 +-
 kernel/bpf/bpf_iter.c                         |  13 +-
 kernel/bpf/btf.c                              |  76 ++-
 kernel/bpf/syscall.c                          | 195 ++++--
 kernel/bpf/verifier.c                         |  81 ++-
 net/bpf/test_run.c                            |  45 +-
 tools/bpf/bpftool/Makefile                    |   2 +-
 tools/bpf/bpftool/gen.c                       | 313 ++++++++-
 tools/bpf/bpftool/main.c                      |   7 +-
 tools/bpf/bpftool/main.h                      |   1 +
 tools/bpf/bpftool/prog.c                      |  80 +++
 tools/bpf/bpftool/xlated_dumper.c             |   3 +
 tools/include/uapi/linux/bpf.h                |  39 +-
 tools/lib/bpf/Build                           |   2 +-
 tools/lib/bpf/bpf.c                           |   1 +
 tools/lib/bpf/bpf_gen_internal.h              |  40 ++
 tools/lib/bpf/gen_loader.c                    | 615 ++++++++++++++++++
 tools/lib/bpf/libbpf.c                        | 399 +++++++++---
 tools/lib/bpf/libbpf.h                        |  12 +
 tools/lib/bpf/libbpf.map                      |   1 +
 tools/lib/bpf/libbpf_internal.h               |   3 +
 tools/lib/bpf/skel_internal.h                 | 105 +++
 tools/testing/selftests/bpf/.gitignore        |   1 +
 tools/testing/selftests/bpf/Makefile          |  16 +-
 .../selftests/bpf/prog_tests/fentry_fexit.c   |   6 +-
 .../selftests/bpf/prog_tests/fentry_test.c    |   4 +-
 .../selftests/bpf/prog_tests/fexit_sleep.c    |   6 +-
 .../selftests/bpf/prog_tests/fexit_test.c     |   4 +-
 .../selftests/bpf/prog_tests/kfunc_call.c     |   6 +-
 .../selftests/bpf/prog_tests/ksyms_module.c   |   2 +-
 .../selftests/bpf/prog_tests/syscall.c        |  53 ++
 tools/testing/selftests/bpf/progs/syscall.c   | 121 ++++
 .../selftests/bpf/progs/test_subprogs.c       |  13 +
 38 files changed, 2198 insertions(+), 211 deletions(-)
 create mode 100644 include/linux/bpfptr.h
 create mode 100644 tools/lib/bpf/bpf_gen_internal.h
 create mode 100644 tools/lib/bpf/gen_loader.c
 create mode 100644 tools/lib/bpf/skel_internal.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/syscall.c
 create mode 100644 tools/testing/selftests/bpf/progs/syscall.c

-- 
2.30.2



* [PATCH v2 bpf-next 01/16] bpf: Introduce bpf_sys_bpf() helper and program type.
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-23 18:15   ` Yonghong Song
                     ` (2 more replies)
  2021-04-23  0:26 ` [PATCH v2 bpf-next 02/16] bpf: Introduce bpfptr_t user/kernel pointer Alexei Starovoitov
                   ` (15 subsequent siblings)
  16 siblings, 3 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add placeholders for bpf_sys_bpf() helper and new program type.

v1->v2:
- check that expected_attach_type is zero
- allow more helper functions to be used in this program type, since they will
  only execute from user context via bpf_prog_test_run.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf.h            | 10 +++++++
 include/linux/bpf_types.h      |  2 ++
 include/uapi/linux/bpf.h       |  8 +++++
 kernel/bpf/syscall.c           | 54 ++++++++++++++++++++++++++++++++++
 net/bpf/test_run.c             | 43 +++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h |  8 +++++
 6 files changed, 125 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f8a45f109e96..aed30bbffb54 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1824,6 +1824,9 @@ static inline bool bpf_map_is_dev_bound(struct bpf_map *map)
 
 struct bpf_map *bpf_map_offload_map_alloc(union bpf_attr *attr);
 void bpf_map_offload_map_free(struct bpf_map *map);
+int bpf_prog_test_run_syscall(struct bpf_prog *prog,
+			      const union bpf_attr *kattr,
+			      union bpf_attr __user *uattr);
 #else
 static inline int bpf_prog_offload_init(struct bpf_prog *prog,
 					union bpf_attr *attr)
@@ -1849,6 +1852,13 @@ static inline struct bpf_map *bpf_map_offload_map_alloc(union bpf_attr *attr)
 static inline void bpf_map_offload_map_free(struct bpf_map *map)
 {
 }
+
+static inline int bpf_prog_test_run_syscall(struct bpf_prog *prog,
+					    const union bpf_attr *kattr,
+					    union bpf_attr __user *uattr)
+{
+	return -ENOTSUPP;
+}
 #endif /* CONFIG_NET && CONFIG_BPF_SYSCALL */
 
 #if defined(CONFIG_INET) && defined(CONFIG_BPF_SYSCALL)
diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
index f883f01a5061..a9db1eae6796 100644
--- a/include/linux/bpf_types.h
+++ b/include/linux/bpf_types.h
@@ -77,6 +77,8 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_LSM, lsm,
 	       void *, void *)
 #endif /* CONFIG_BPF_LSM */
 #endif
+BPF_PROG_TYPE(BPF_PROG_TYPE_SYSCALL, bpf_syscall,
+	      void *, void *)
 
 BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY, array_map_ops)
 BPF_MAP_TYPE(BPF_MAP_TYPE_PERCPU_ARRAY, percpu_array_map_ops)
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index ec6d85a81744..c92648f38144 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -937,6 +937,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_EXT,
 	BPF_PROG_TYPE_LSM,
 	BPF_PROG_TYPE_SK_LOOKUP,
+	BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */
 };
 
 enum bpf_attach_type {
@@ -4735,6 +4736,12 @@ union bpf_attr {
  *		be zero-terminated except when **str_size** is 0.
  *
  *		Or **-EBUSY** if the per-CPU memory copy buffer is busy.
+ *
+ * long bpf_sys_bpf(u32 cmd, void *attr, u32 attr_size)
+ * 	Description
+ * 		Execute bpf syscall with given arguments.
+ * 	Return
+ * 		A syscall result.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4903,6 +4910,7 @@ union bpf_attr {
 	FN(check_mtu),			\
 	FN(for_each_map_elem),		\
 	FN(snprintf),			\
+	FN(sys_bpf),			\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index fd495190115e..8636876f3e6b 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2014,6 +2014,7 @@ bpf_prog_load_check_attach(enum bpf_prog_type prog_type,
 		if (expected_attach_type == BPF_SK_LOOKUP)
 			return 0;
 		return -EINVAL;
+	case BPF_PROG_TYPE_SYSCALL:
 	case BPF_PROG_TYPE_EXT:
 		if (expected_attach_type)
 			return -EINVAL;
@@ -4497,3 +4498,56 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 
 	return err;
 }
+
+static bool syscall_prog_is_valid_access(int off, int size,
+					 enum bpf_access_type type,
+					 const struct bpf_prog *prog,
+					 struct bpf_insn_access_aux *info)
+{
+	if (off < 0 || off >= U16_MAX)
+		return false;
+	if (off % size != 0)
+		return false;
+	return true;
+}
+
+BPF_CALL_3(bpf_sys_bpf, int, cmd, void *, attr, u32, attr_size)
+{
+	return -EINVAL;
+}
+
+const struct bpf_func_proto bpf_sys_bpf_proto = {
+	.func		= bpf_sys_bpf,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_ANYTHING,
+	.arg2_type	= ARG_PTR_TO_MEM,
+	.arg3_type	= ARG_CONST_SIZE,
+};
+
+const struct bpf_func_proto * __weak
+tracing_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
+{
+
+	return bpf_base_func_proto(func_id);
+}
+
+static const struct bpf_func_proto *
+syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
+{
+	switch (func_id) {
+	case BPF_FUNC_sys_bpf:
+		return &bpf_sys_bpf_proto;
+	default:
+		return tracing_prog_func_proto(func_id, prog);
+	}
+}
+
+const struct bpf_verifier_ops bpf_syscall_verifier_ops = {
+	.get_func_proto  = syscall_prog_func_proto,
+	.is_valid_access = syscall_prog_is_valid_access,
+};
+
+const struct bpf_prog_ops bpf_syscall_prog_ops = {
+	.test_run = bpf_prog_test_run_syscall,
+};
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index a5d72c48fb66..1783ea77b95c 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -918,3 +918,46 @@ int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog, const union bpf_attr *kat
 	kfree(user_ctx);
 	return ret;
 }
+
+int bpf_prog_test_run_syscall(struct bpf_prog *prog,
+			      const union bpf_attr *kattr,
+			      union bpf_attr __user *uattr)
+{
+	void __user *ctx_in = u64_to_user_ptr(kattr->test.ctx_in);
+	__u32 ctx_size_in = kattr->test.ctx_size_in;
+	void *ctx = NULL;
+	u32 retval;
+	int err = 0;
+
+	/* doesn't support data_in/out, ctx_out, duration, or repeat or flags */
+	if (kattr->test.data_in || kattr->test.data_out ||
+	    kattr->test.ctx_out || kattr->test.duration ||
+	    kattr->test.repeat || kattr->test.flags)
+		return -EINVAL;
+
+	if (ctx_size_in < prog->aux->max_ctx_offset ||
+	    ctx_size_in > U16_MAX)
+		return -EINVAL;
+
+	if (ctx_size_in) {
+		ctx = kzalloc(ctx_size_in, GFP_USER);
+		if (!ctx)
+			return -ENOMEM;
+		if (copy_from_user(ctx, ctx_in, ctx_size_in)) {
+			err = -EFAULT;
+			goto out;
+		}
+	}
+	retval = bpf_prog_run_pin_on_cpu(prog, ctx);
+
+	if (copy_to_user(&uattr->test.retval, &retval, sizeof(u32)))
+		err = -EFAULT;
+	if (ctx_size_in)
+		if (copy_to_user(ctx_in, ctx, ctx_size_in)) {
+			err = -EFAULT;
+			goto out;
+		}
+out:
+	kfree(ctx);
+	return err;
+}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index ec6d85a81744..0c13016d3d2c 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -937,6 +937,7 @@ enum bpf_prog_type {
 	BPF_PROG_TYPE_EXT,
 	BPF_PROG_TYPE_LSM,
 	BPF_PROG_TYPE_SK_LOOKUP,
+	BPF_PROG_TYPE_SYSCALL,
 };
 
 enum bpf_attach_type {
@@ -4735,6 +4736,12 @@ union bpf_attr {
  *		be zero-terminated except when **str_size** is 0.
  *
  *		Or **-EBUSY** if the per-CPU memory copy buffer is busy.
+ *
+ * long bpf_sys_bpf(u32 cmd, void *attr, u32 attr_size)
+ * 	Description
+ * 		Execute bpf syscall with given arguments.
+ * 	Return
+ * 		A syscall result.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4903,6 +4910,7 @@ union bpf_attr {
 	FN(check_mtu),			\
 	FN(for_each_map_elem),		\
 	FN(snprintf),			\
+	FN(sys_bpf),			\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.30.2



* [PATCH v2 bpf-next 02/16] bpf: Introduce bpfptr_t user/kernel pointer.
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
  2021-04-23  0:26 ` [PATCH v2 bpf-next 01/16] bpf: Introduce bpf_sys_bpf() helper and program type Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-23  0:26 ` [PATCH v2 bpf-next 03/16] bpf: Prepare bpf syscall to be used from kernel and user space Alexei Starovoitov
                   ` (14 subsequent siblings)
  16 siblings, 0 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Similar to sockptr_t, introduce bpfptr_t with a few additions:
make_bpfptr() creates a new user/kernel pointer in the same address space as
an existing user/kernel pointer.
bpfptr_add() advances the user/kernel pointer.
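
The idea can be modelled in plain userspace C. This is only a sketch under
stated assumptions: struct toy_ptr and the toy_* functions are hypothetical
stand-ins, and the "user" path is simulated with memcpy() plus a counter
where the kernel would use its user-copy routines.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A pointer tagged with the address space it belongs to. */
struct toy_ptr {
	union {
		void *kernel;
		void *user;
	};
	bool is_kernel;
};

/* Counts copies that took the simulated "user" path. */
int toy_user_copies;

/* Build a pointer from a raw 64-bit address, inheriting the address
 * space from an existing pointer via the is_kernel flag.
 */
struct toy_ptr toy_make_ptr(uint64_t addr, bool is_kernel)
{
	struct toy_ptr p = { .is_kernel = is_kernel };

	if (is_kernel)
		p.kernel = (void *)(uintptr_t)addr;
	else
		p.user = (void *)(uintptr_t)addr;
	return p;
}

/* Advance the pointer by val bytes in whichever space it lives. */
void toy_ptr_add(struct toy_ptr *p, size_t val)
{
	if (p->is_kernel)
		p->kernel = (char *)p->kernel + val;
	else
		p->user = (char *)p->user + val;
}

/* Copy from the tagged pointer, dispatching on the address space. */
int toy_copy_from(void *dst, struct toy_ptr src, size_t size)
{
	if (src.is_kernel) {
		memcpy(dst, src.kernel, size);
		return 0;
	}
	toy_user_copies++;	/* the kernel would do a checked user copy here */
	memcpy(dst, src.user, size);
	return 0;
}
```

The point of the is_kernel argument is that addresses embedded inside an
attribute structure are interpreted in the same address space as the
structure that carried them.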

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpfptr.h | 81 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 81 insertions(+)
 create mode 100644 include/linux/bpfptr.h

diff --git a/include/linux/bpfptr.h b/include/linux/bpfptr.h
new file mode 100644
index 000000000000..e370acb04977
--- /dev/null
+++ b/include/linux/bpfptr.h
@@ -0,0 +1,81 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/* A pointer that can point to either kernel or userspace memory. */
+#ifndef _LINUX_BPFPTR_H
+#define _LINUX_BPFPTR_H
+
+#include <linux/sockptr.h>
+
+typedef sockptr_t bpfptr_t;
+
+static inline bool bpfptr_is_kernel(bpfptr_t bpfptr)
+{
+	return bpfptr.is_kernel;
+}
+
+static inline bpfptr_t KERNEL_BPFPTR(void *p)
+{
+	return (bpfptr_t) { .kernel = p, .is_kernel = true };
+}
+
+static inline bpfptr_t USER_BPFPTR(void __user *p)
+{
+	return (bpfptr_t) { .user = p };
+}
+
+static inline bpfptr_t make_bpfptr(u64 addr, bool is_kernel)
+{
+	if (is_kernel)
+		return (bpfptr_t) {
+			.kernel = (void*) (uintptr_t) addr,
+			.is_kernel = true,
+		};
+	else
+		return (bpfptr_t) {
+			.user = u64_to_user_ptr(addr),
+			.is_kernel = false,
+		};
+}
+
+static inline bool bpfptr_is_null(bpfptr_t bpfptr)
+{
+	if (bpfptr_is_kernel(bpfptr))
+		return !bpfptr.kernel;
+	return !bpfptr.user;
+}
+
+static inline void bpfptr_add(bpfptr_t *bpfptr, size_t val)
+{
+	if (bpfptr_is_kernel(*bpfptr))
+		bpfptr->kernel += val;
+	else
+		bpfptr->user += val;
+}
+
+static inline int copy_from_bpfptr_offset(void *dst, bpfptr_t src,
+					  size_t offset, size_t size)
+{
+	return copy_from_sockptr_offset(dst, (sockptr_t) src, offset, size);
+}
+
+static inline int copy_from_bpfptr(void *dst, bpfptr_t src, size_t size)
+{
+	return copy_from_bpfptr_offset(dst, src, 0, size);
+}
+
+static inline int copy_to_bpfptr_offset(bpfptr_t dst, size_t offset,
+					const void *src, size_t size)
+{
+	return copy_to_sockptr_offset((sockptr_t) dst, offset, src, size);
+}
+
+static inline void *memdup_bpfptr(bpfptr_t src, size_t len)
+{
+	return memdup_sockptr((sockptr_t) src, len);
+}
+
+static inline long strncpy_from_bpfptr(char *dst, bpfptr_t src, size_t count)
+{
+	return strncpy_from_sockptr(dst, (sockptr_t) src, count);
+}
+
+#endif /* _LINUX_BPFPTR_H */
-- 
2.30.2



* [PATCH v2 bpf-next 03/16] bpf: Prepare bpf syscall to be used from kernel and user space.
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
  2021-04-23  0:26 ` [PATCH v2 bpf-next 01/16] bpf: Introduce bpf_sys_bpf() helper and program type Alexei Starovoitov
  2021-04-23  0:26 ` [PATCH v2 bpf-next 02/16] bpf: Introduce bpfptr_t user/kernel pointer Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-23  0:26 ` [PATCH v2 bpf-next 04/16] libbpf: Support for syscall program type Alexei Starovoitov
                   ` (13 subsequent siblings)
  16 siblings, 0 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

With the help of bpfptr_t, prepare the relevant bpf syscall commands
to be used from kernel and user space.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf.h   |   8 +--
 kernel/bpf/bpf_iter.c |  13 ++---
 kernel/bpf/syscall.c  | 113 +++++++++++++++++++++++++++---------------
 kernel/bpf/verifier.c |  34 +++++++------
 net/bpf/test_run.c    |   2 +-
 5 files changed, 104 insertions(+), 66 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index aed30bbffb54..0f841bd0cb85 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -22,6 +22,7 @@
 #include <linux/sched/mm.h>
 #include <linux/slab.h>
 #include <linux/percpu-refcount.h>
+#include <linux/bpfptr.h>
 
 struct bpf_verifier_env;
 struct bpf_verifier_log;
@@ -1426,7 +1427,7 @@ struct bpf_iter__bpf_map_elem {
 int bpf_iter_reg_target(const struct bpf_iter_reg *reg_info);
 void bpf_iter_unreg_target(const struct bpf_iter_reg *reg_info);
 bool bpf_iter_prog_supported(struct bpf_prog *prog);
-int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog);
+int bpf_iter_link_attach(const union bpf_attr *attr, bpfptr_t uattr, struct bpf_prog *prog);
 int bpf_iter_new_fd(struct bpf_link *link);
 bool bpf_link_is_iter(struct bpf_link *link);
 struct bpf_prog *bpf_iter_get_info(struct bpf_iter_meta *meta, bool in_stop);
@@ -1457,7 +1458,7 @@ int bpf_fd_htab_map_update_elem(struct bpf_map *map, struct file *map_file,
 int bpf_fd_htab_map_lookup_elem(struct bpf_map *map, void *key, u32 *value);
 
 int bpf_get_file_flag(int flags);
-int bpf_check_uarg_tail_zero(void __user *uaddr, size_t expected_size,
+int bpf_check_uarg_tail_zero(bpfptr_t uaddr, size_t expected_size,
 			     size_t actual_size);
 
 /* memcpy that is used with 8-byte aligned pointers, power-of-8 size and
@@ -1477,8 +1478,7 @@ static inline void bpf_long_memcpy(void *dst, const void *src, u32 size)
 }
 
 /* verify correctness of eBPF program */
-int bpf_check(struct bpf_prog **fp, union bpf_attr *attr,
-	      union bpf_attr __user *uattr);
+int bpf_check(struct bpf_prog **fp, union bpf_attr *attr, bpfptr_t uattr);
 
 #ifndef CONFIG_BPF_JIT_ALWAYS_ON
 void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth);
diff --git a/kernel/bpf/bpf_iter.c b/kernel/bpf/bpf_iter.c
index 931870f9cf56..2d4fbdbb194e 100644
--- a/kernel/bpf/bpf_iter.c
+++ b/kernel/bpf/bpf_iter.c
@@ -473,15 +473,16 @@ bool bpf_link_is_iter(struct bpf_link *link)
 	return link->ops == &bpf_iter_link_lops;
 }
 
-int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
+int bpf_iter_link_attach(const union bpf_attr *attr, bpfptr_t uattr,
+			 struct bpf_prog *prog)
 {
-	union bpf_iter_link_info __user *ulinfo;
 	struct bpf_link_primer link_primer;
 	struct bpf_iter_target_info *tinfo;
 	union bpf_iter_link_info linfo;
 	struct bpf_iter_link *link;
 	u32 prog_btf_id, linfo_len;
 	bool existed = false;
+	bpfptr_t ulinfo;
 	int err;
 
 	if (attr->link_create.target_fd || attr->link_create.flags)
@@ -489,18 +490,18 @@ int bpf_iter_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
 
 	memset(&linfo, 0, sizeof(union bpf_iter_link_info));
 
-	ulinfo = u64_to_user_ptr(attr->link_create.iter_info);
+	ulinfo = make_bpfptr(attr->link_create.iter_info, uattr.is_kernel);
 	linfo_len = attr->link_create.iter_info_len;
-	if (!ulinfo ^ !linfo_len)
+	if (bpfptr_is_null(ulinfo) ^ !linfo_len)
 		return -EINVAL;
 
-	if (ulinfo) {
+	if (!bpfptr_is_null(ulinfo)) {
 		err = bpf_check_uarg_tail_zero(ulinfo, sizeof(linfo),
 					       linfo_len);
 		if (err)
 			return err;
 		linfo_len = min_t(u32, linfo_len, sizeof(linfo));
-		if (copy_from_user(&linfo, ulinfo, linfo_len))
+		if (copy_from_bpfptr(&linfo, ulinfo, linfo_len))
 			return -EFAULT;
 	}
 
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 8636876f3e6b..2e9bc04fd821 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -72,11 +72,10 @@ static const struct bpf_map_ops * const bpf_map_types[] = {
  * copy_from_user() call. However, this is not a concern since this function is
  * meant to be a future-proofing of bits.
  */
-int bpf_check_uarg_tail_zero(void __user *uaddr,
+int bpf_check_uarg_tail_zero(bpfptr_t uaddr,
 			     size_t expected_size,
 			     size_t actual_size)
 {
-	unsigned char __user *addr = uaddr + expected_size;
 	int res;
 
 	if (unlikely(actual_size > PAGE_SIZE))	/* silly large */
@@ -85,7 +84,12 @@ int bpf_check_uarg_tail_zero(void __user *uaddr,
 	if (actual_size <= expected_size)
 		return 0;
 
-	res = check_zeroed_user(addr, actual_size - expected_size);
+	if (uaddr.is_kernel)
+		res = memchr_inv(uaddr.kernel + expected_size, 0,
+				 actual_size - expected_size) == NULL;
+	else
+		res = check_zeroed_user(uaddr.user + expected_size,
+					actual_size - expected_size);
 	if (res < 0)
 		return res;
 	return res ? 0 : -E2BIG;
@@ -1004,6 +1008,17 @@ static void *__bpf_copy_key(void __user *ukey, u64 key_size)
 	return NULL;
 }
 
+static void *___bpf_copy_key(bpfptr_t ukey, u64 key_size)
+{
+	if (key_size)
+		return memdup_bpfptr(ukey, key_size);
+
+	if (!bpfptr_is_null(ukey))
+		return ERR_PTR(-EINVAL);
+
+	return NULL;
+}
+
 /* last field in 'union bpf_attr' used by this command */
 #define BPF_MAP_LOOKUP_ELEM_LAST_FIELD flags
 
@@ -1074,10 +1089,10 @@ static int map_lookup_elem(union bpf_attr *attr)
 
 #define BPF_MAP_UPDATE_ELEM_LAST_FIELD flags
 
-static int map_update_elem(union bpf_attr *attr)
+static int map_update_elem(union bpf_attr *attr, bpfptr_t uattr)
 {
-	void __user *ukey = u64_to_user_ptr(attr->key);
-	void __user *uvalue = u64_to_user_ptr(attr->value);
+	bpfptr_t ukey = make_bpfptr(attr->key, uattr.is_kernel);
+	bpfptr_t uvalue = make_bpfptr(attr->value, uattr.is_kernel);
 	int ufd = attr->map_fd;
 	struct bpf_map *map;
 	void *key, *value;
@@ -1103,7 +1118,7 @@ static int map_update_elem(union bpf_attr *attr)
 		goto err_put;
 	}
 
-	key = __bpf_copy_key(ukey, map->key_size);
+	key = ___bpf_copy_key(ukey, map->key_size);
 	if (IS_ERR(key)) {
 		err = PTR_ERR(key);
 		goto err_put;
@@ -1123,7 +1138,7 @@ static int map_update_elem(union bpf_attr *attr)
 		goto free_key;
 
 	err = -EFAULT;
-	if (copy_from_user(value, uvalue, value_size) != 0)
+	if (copy_from_bpfptr(value, uvalue, value_size) != 0)
 		goto free_value;
 
 	err = bpf_map_update_value(map, f, key, value, attr->flags);
@@ -2076,7 +2091,7 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type)
 /* last field in 'union bpf_attr' used by this command */
 #define	BPF_PROG_LOAD_LAST_FIELD attach_prog_fd
 
-static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
+static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr)
 {
 	enum bpf_prog_type type = attr->prog_type;
 	struct bpf_prog *prog, *dst_prog = NULL;
@@ -2101,8 +2116,9 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 		return -EPERM;
 
 	/* copy eBPF program license from user space */
-	if (strncpy_from_user(license, u64_to_user_ptr(attr->license),
-			      sizeof(license) - 1) < 0)
+	if (strncpy_from_bpfptr(license,
+				make_bpfptr(attr->license, uattr.is_kernel),
+				sizeof(license) - 1) < 0)
 		return -EFAULT;
 	license[sizeof(license) - 1] = 0;
 
@@ -2186,8 +2202,9 @@ static int bpf_prog_load(union bpf_attr *attr, union bpf_attr __user *uattr)
 	prog->len = attr->insn_cnt;
 
 	err = -EFAULT;
-	if (copy_from_user(prog->insns, u64_to_user_ptr(attr->insns),
-			   bpf_prog_insn_size(prog)) != 0)
+	if (copy_from_bpfptr(prog->insns,
+			     make_bpfptr(attr->insns, uattr.is_kernel),
+			     bpf_prog_insn_size(prog)) != 0)
 		goto free_prog_sec;
 
 	prog->orig_prog = NULL;
@@ -3412,7 +3429,7 @@ static int bpf_prog_get_info_by_fd(struct file *file,
 	u32 ulen;
 	int err;
 
-	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
+	err = bpf_check_uarg_tail_zero(USER_BPFPTR(uinfo), sizeof(info), info_len);
 	if (err)
 		return err;
 	info_len = min_t(u32, sizeof(info), info_len);
@@ -3691,7 +3708,7 @@ static int bpf_map_get_info_by_fd(struct file *file,
 	u32 info_len = attr->info.info_len;
 	int err;
 
-	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
+	err = bpf_check_uarg_tail_zero(USER_BPFPTR(uinfo), sizeof(info), info_len);
 	if (err)
 		return err;
 	info_len = min_t(u32, sizeof(info), info_len);
@@ -3734,7 +3751,7 @@ static int bpf_btf_get_info_by_fd(struct file *file,
 	u32 info_len = attr->info.info_len;
 	int err;
 
-	err = bpf_check_uarg_tail_zero(uinfo, sizeof(*uinfo), info_len);
+	err = bpf_check_uarg_tail_zero(USER_BPFPTR(uinfo), sizeof(*uinfo), info_len);
 	if (err)
 		return err;
 
@@ -3751,7 +3768,7 @@ static int bpf_link_get_info_by_fd(struct file *file,
 	u32 info_len = attr->info.info_len;
 	int err;
 
-	err = bpf_check_uarg_tail_zero(uinfo, sizeof(info), info_len);
+	err = bpf_check_uarg_tail_zero(USER_BPFPTR(uinfo), sizeof(info), info_len);
 	if (err)
 		return err;
 	info_len = min_t(u32, sizeof(info), info_len);
@@ -4012,13 +4029,14 @@ static int bpf_map_do_batch(const union bpf_attr *attr,
 	return err;
 }
 
-static int tracing_bpf_link_attach(const union bpf_attr *attr, struct bpf_prog *prog)
+static int tracing_bpf_link_attach(const union bpf_attr *attr, bpfptr_t uattr,
+				   struct bpf_prog *prog)
 {
 	if (attr->link_create.attach_type != prog->expected_attach_type)
 		return -EINVAL;
 
 	if (prog->expected_attach_type == BPF_TRACE_ITER)
-		return bpf_iter_link_attach(attr, prog);
+		return bpf_iter_link_attach(attr, uattr, prog);
 	else if (prog->type == BPF_PROG_TYPE_EXT)
 		return bpf_tracing_prog_attach(prog,
 					       attr->link_create.target_fd,
@@ -4027,7 +4045,7 @@ static int tracing_bpf_link_attach(const union bpf_attr *attr, struct bpf_prog *
 }
 
 #define BPF_LINK_CREATE_LAST_FIELD link_create.iter_info_len
-static int link_create(union bpf_attr *attr)
+static int link_create(union bpf_attr *attr, bpfptr_t uattr)
 {
 	enum bpf_prog_type ptype;
 	struct bpf_prog *prog;
@@ -4046,7 +4064,7 @@ static int link_create(union bpf_attr *attr)
 		goto out;
 
 	if (prog->type == BPF_PROG_TYPE_EXT) {
-		ret = tracing_bpf_link_attach(attr, prog);
+		ret = tracing_bpf_link_attach(attr, uattr, prog);
 		goto out;
 	}
 
@@ -4067,7 +4085,7 @@ static int link_create(union bpf_attr *attr)
 		ret = cgroup_bpf_link_attach(attr, prog);
 		break;
 	case BPF_PROG_TYPE_TRACING:
-		ret = tracing_bpf_link_attach(attr, prog);
+		ret = tracing_bpf_link_attach(attr, uattr, prog);
 		break;
 	case BPF_PROG_TYPE_FLOW_DISSECTOR:
 	case BPF_PROG_TYPE_SK_LOOKUP:
@@ -4355,7 +4373,7 @@ static int bpf_prog_bind_map(union bpf_attr *attr)
 	return ret;
 }
 
-SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
+static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size)
 {
 	union bpf_attr attr;
 	int err;
@@ -4370,7 +4388,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 
 	/* copy attributes from user space, may be less than sizeof(bpf_attr) */
 	memset(&attr, 0, sizeof(attr));
-	if (copy_from_user(&attr, uattr, size) != 0)
+	if (copy_from_bpfptr(&attr, uattr, size) != 0)
 		return -EFAULT;
 
 	err = security_bpf(cmd, &attr, size);
@@ -4385,7 +4403,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 		err = map_lookup_elem(&attr);
 		break;
 	case BPF_MAP_UPDATE_ELEM:
-		err = map_update_elem(&attr);
+		err = map_update_elem(&attr, uattr);
 		break;
 	case BPF_MAP_DELETE_ELEM:
 		err = map_delete_elem(&attr);
@@ -4412,21 +4430,21 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 		err = bpf_prog_detach(&attr);
 		break;
 	case BPF_PROG_QUERY:
-		err = bpf_prog_query(&attr, uattr);
+		err = bpf_prog_query(&attr, uattr.user);
 		break;
 	case BPF_PROG_TEST_RUN:
-		err = bpf_prog_test_run(&attr, uattr);
+		err = bpf_prog_test_run(&attr, uattr.user);
 		break;
 	case BPF_PROG_GET_NEXT_ID:
-		err = bpf_obj_get_next_id(&attr, uattr,
+		err = bpf_obj_get_next_id(&attr, uattr.user,
 					  &prog_idr, &prog_idr_lock);
 		break;
 	case BPF_MAP_GET_NEXT_ID:
-		err = bpf_obj_get_next_id(&attr, uattr,
+		err = bpf_obj_get_next_id(&attr, uattr.user,
 					  &map_idr, &map_idr_lock);
 		break;
 	case BPF_BTF_GET_NEXT_ID:
-		err = bpf_obj_get_next_id(&attr, uattr,
+		err = bpf_obj_get_next_id(&attr, uattr.user,
 					  &btf_idr, &btf_idr_lock);
 		break;
 	case BPF_PROG_GET_FD_BY_ID:
@@ -4436,7 +4454,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 		err = bpf_map_get_fd_by_id(&attr);
 		break;
 	case BPF_OBJ_GET_INFO_BY_FD:
-		err = bpf_obj_get_info_by_fd(&attr, uattr);
+		err = bpf_obj_get_info_by_fd(&attr, uattr.user);
 		break;
 	case BPF_RAW_TRACEPOINT_OPEN:
 		err = bpf_raw_tracepoint_open(&attr);
@@ -4448,26 +4466,26 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 		err = bpf_btf_get_fd_by_id(&attr);
 		break;
 	case BPF_TASK_FD_QUERY:
-		err = bpf_task_fd_query(&attr, uattr);
+		err = bpf_task_fd_query(&attr, uattr.user);
 		break;
 	case BPF_MAP_LOOKUP_AND_DELETE_ELEM:
 		err = map_lookup_and_delete_elem(&attr);
 		break;
 	case BPF_MAP_LOOKUP_BATCH:
-		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_LOOKUP_BATCH);
+		err = bpf_map_do_batch(&attr, uattr.user, BPF_MAP_LOOKUP_BATCH);
 		break;
 	case BPF_MAP_LOOKUP_AND_DELETE_BATCH:
-		err = bpf_map_do_batch(&attr, uattr,
+		err = bpf_map_do_batch(&attr, uattr.user,
 				       BPF_MAP_LOOKUP_AND_DELETE_BATCH);
 		break;
 	case BPF_MAP_UPDATE_BATCH:
-		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_UPDATE_BATCH);
+		err = bpf_map_do_batch(&attr, uattr.user, BPF_MAP_UPDATE_BATCH);
 		break;
 	case BPF_MAP_DELETE_BATCH:
-		err = bpf_map_do_batch(&attr, uattr, BPF_MAP_DELETE_BATCH);
+		err = bpf_map_do_batch(&attr, uattr.user, BPF_MAP_DELETE_BATCH);
 		break;
 	case BPF_LINK_CREATE:
-		err = link_create(&attr);
+		err = link_create(&attr, uattr);
 		break;
 	case BPF_LINK_UPDATE:
 		err = link_update(&attr);
@@ -4476,7 +4494,7 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 		err = bpf_link_get_fd_by_id(&attr);
 		break;
 	case BPF_LINK_GET_NEXT_ID:
-		err = bpf_obj_get_next_id(&attr, uattr,
+		err = bpf_obj_get_next_id(&attr, uattr.user,
 					  &link_idr, &link_idr_lock);
 		break;
 	case BPF_ENABLE_STATS:
@@ -4499,6 +4517,11 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
 	return err;
 }
 
+SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, size)
+{
+	return __sys_bpf(cmd, USER_BPFPTR(uattr), size);
+}
+
 static bool syscall_prog_is_valid_access(int off, int size,
 					 enum bpf_access_type type,
 					 const struct bpf_prog *prog,
@@ -4513,7 +4536,19 @@ static bool syscall_prog_is_valid_access(int off, int size,
 
 BPF_CALL_3(bpf_sys_bpf, int, cmd, void *, attr, u32, attr_size)
 {
-	return -EINVAL;
+	switch (cmd) {
+	case BPF_MAP_CREATE:
+	case BPF_MAP_UPDATE_ELEM:
+	case BPF_MAP_FREEZE:
+	case BPF_PROG_LOAD:
+		break;
+	/* case BPF_PROG_TEST_RUN:
+	 * is not part of this list to prevent recursive test_run
+	 */
+	default:
+		return -EINVAL;
+	}
+	return __sys_bpf(cmd, KERNEL_BPFPTR(attr), attr_size);
 }
 
 const struct bpf_func_proto bpf_sys_bpf_proto = {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 58730872f7e5..76a18fb1e792 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9351,7 +9351,7 @@ static int check_abnormal_return(struct bpf_verifier_env *env)
 
 static int check_btf_func(struct bpf_verifier_env *env,
 			  const union bpf_attr *attr,
-			  union bpf_attr __user *uattr)
+			  bpfptr_t uattr)
 {
 	const struct btf_type *type, *func_proto, *ret_type;
 	u32 i, nfuncs, urec_size, min_size;
@@ -9360,7 +9360,7 @@ static int check_btf_func(struct bpf_verifier_env *env,
 	struct bpf_func_info_aux *info_aux = NULL;
 	struct bpf_prog *prog;
 	const struct btf *btf;
-	void __user *urecord;
+	bpfptr_t urecord;
 	u32 prev_offset = 0;
 	bool scalar_return;
 	int ret = -ENOMEM;
@@ -9388,7 +9388,7 @@ static int check_btf_func(struct bpf_verifier_env *env,
 	prog = env->prog;
 	btf = prog->aux->btf;
 
-	urecord = u64_to_user_ptr(attr->func_info);
+	urecord = make_bpfptr(attr->func_info, uattr.is_kernel);
 	min_size = min_t(u32, krec_size, urec_size);
 
 	krecord = kvcalloc(nfuncs, krec_size, GFP_KERNEL | __GFP_NOWARN);
@@ -9406,13 +9406,15 @@ static int check_btf_func(struct bpf_verifier_env *env,
 				/* set the size kernel expects so loader can zero
 				 * out the rest of the record.
 				 */
-				if (put_user(min_size, &uattr->func_info_rec_size))
+				if (copy_to_bpfptr_offset(uattr,
+							  offsetof(union bpf_attr, func_info_rec_size),
+							  &min_size, sizeof(min_size)))
 					ret = -EFAULT;
 			}
 			goto err_free;
 		}
 
-		if (copy_from_user(&krecord[i], urecord, min_size)) {
+		if (copy_from_bpfptr(&krecord[i], urecord, min_size)) {
 			ret = -EFAULT;
 			goto err_free;
 		}
@@ -9464,7 +9466,7 @@ static int check_btf_func(struct bpf_verifier_env *env,
 		}
 
 		prev_offset = krecord[i].insn_off;
-		urecord += urec_size;
+		bpfptr_add(&urecord, urec_size);
 	}
 
 	prog->aux->func_info = krecord;
@@ -9496,14 +9498,14 @@ static void adjust_btf_func(struct bpf_verifier_env *env)
 
 static int check_btf_line(struct bpf_verifier_env *env,
 			  const union bpf_attr *attr,
-			  union bpf_attr __user *uattr)
+			  bpfptr_t uattr)
 {
 	u32 i, s, nr_linfo, ncopy, expected_size, rec_size, prev_offset = 0;
 	struct bpf_subprog_info *sub;
 	struct bpf_line_info *linfo;
 	struct bpf_prog *prog;
 	const struct btf *btf;
-	void __user *ulinfo;
+	bpfptr_t ulinfo;
 	int err;
 
 	nr_linfo = attr->line_info_cnt;
@@ -9529,7 +9531,7 @@ static int check_btf_line(struct bpf_verifier_env *env,
 
 	s = 0;
 	sub = env->subprog_info;
-	ulinfo = u64_to_user_ptr(attr->line_info);
+	ulinfo = make_bpfptr(attr->line_info, uattr.is_kernel);
 	expected_size = sizeof(struct bpf_line_info);
 	ncopy = min_t(u32, expected_size, rec_size);
 	for (i = 0; i < nr_linfo; i++) {
@@ -9537,14 +9539,15 @@ static int check_btf_line(struct bpf_verifier_env *env,
 		if (err) {
 			if (err == -E2BIG) {
 				verbose(env, "nonzero tailing record in line_info");
-				if (put_user(expected_size,
-					     &uattr->line_info_rec_size))
+				if (copy_to_bpfptr_offset(uattr,
+							  offsetof(union bpf_attr, line_info_rec_size),
+							  &expected_size, sizeof(expected_size)))
 					err = -EFAULT;
 			}
 			goto err_free;
 		}
 
-		if (copy_from_user(&linfo[i], ulinfo, ncopy)) {
+		if (copy_from_bpfptr(&linfo[i], ulinfo, ncopy)) {
 			err = -EFAULT;
 			goto err_free;
 		}
@@ -9596,7 +9599,7 @@ static int check_btf_line(struct bpf_verifier_env *env,
 		}
 
 		prev_offset = linfo[i].insn_off;
-		ulinfo += rec_size;
+		bpfptr_add(&ulinfo, rec_size);
 	}
 
 	if (s != env->subprog_cnt) {
@@ -9618,7 +9621,7 @@ static int check_btf_line(struct bpf_verifier_env *env,
 
 static int check_btf_info(struct bpf_verifier_env *env,
 			  const union bpf_attr *attr,
-			  union bpf_attr __user *uattr)
+			  bpfptr_t uattr)
 {
 	struct btf *btf;
 	int err;
@@ -13192,8 +13195,7 @@ struct btf *bpf_get_btf_vmlinux(void)
 	return btf_vmlinux;
 }
 
-int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
-	      union bpf_attr __user *uattr)
+int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
 {
 	u64 start_time = ktime_get_ns();
 	struct bpf_verifier_env *env;
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 1783ea77b95c..85f41fb8d5bf 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -409,7 +409,7 @@ static void *bpf_ctx_init(const union bpf_attr *kattr, u32 max_size)
 		return ERR_PTR(-ENOMEM);
 
 	if (data_in) {
-		err = bpf_check_uarg_tail_zero(data_in, max_size, size);
+		err = bpf_check_uarg_tail_zero(USER_BPFPTR(data_in), max_size, size);
 		if (err) {
 			kfree(data);
 			return ERR_PTR(err);
-- 
2.30.2



* [PATCH v2 bpf-next 04/16] libbpf: Support for syscall program type
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (2 preceding siblings ...)
  2021-04-23  0:26 ` [PATCH v2 bpf-next 03/16] bpf: Prepare bpf syscall to be used from kernel and user space Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-26 22:24   ` Andrii Nakryiko
  2021-04-23  0:26 ` [PATCH v2 bpf-next 05/16] selftests/bpf: Test " Alexei Starovoitov
                   ` (12 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add trivial support for the syscall program type: map the "syscall" section
name to BPF_PROG_TYPE_SYSCALL.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/lib/bpf/libbpf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 9cc2d45b0080..254a0c9aa6cf 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -8899,6 +8899,7 @@ static const struct bpf_sec_def section_defs[] = {
 	BPF_PROG_SEC("struct_ops",		BPF_PROG_TYPE_STRUCT_OPS),
 	BPF_EAPROG_SEC("sk_lookup/",		BPF_PROG_TYPE_SK_LOOKUP,
 						BPF_SK_LOOKUP),
+	BPF_PROG_SEC("syscall",			BPF_PROG_TYPE_SYSCALL),
 };
 
 #undef BPF_PROG_SEC_IMPL
-- 
2.30.2



* [PATCH v2 bpf-next 05/16] selftests/bpf: Test for syscall program type
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (3 preceding siblings ...)
  2021-04-23  0:26 ` [PATCH v2 bpf-next 04/16] libbpf: Support for syscall program type Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-26 17:02   ` Andrii Nakryiko
  2021-04-23  0:26 ` [PATCH v2 bpf-next 06/16] bpf: Make btf_load command to be bpfptr_t compatible Alexei Starovoitov
                   ` (11 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add a selftest in which a bpf_prog_type_syscall program creates a bpf map,
updates it, and loads another bpf program using the bpf_sys_bpf() helper.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/Makefile          |  1 +
 .../selftests/bpf/prog_tests/syscall.c        | 53 ++++++++++++++
 tools/testing/selftests/bpf/progs/syscall.c   | 73 +++++++++++++++++++
 3 files changed, 127 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/syscall.c
 create mode 100644 tools/testing/selftests/bpf/progs/syscall.c

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index c5bcdb3d4b12..9fdfdbc61857 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -278,6 +278,7 @@ MENDIAN=$(if $(IS_LITTLE_ENDIAN),-mlittle-endian,-mbig-endian)
 CLANG_SYS_INCLUDES = $(call get_sys_includes,$(CLANG))
 BPF_CFLAGS = -g -D__TARGET_ARCH_$(SRCARCH) $(MENDIAN) 			\
 	     -I$(INCLUDE_DIR) -I$(CURDIR) -I$(APIDIR)			\
+	     -I$(TOOLSINCDIR) \
 	     -I$(abspath $(OUTPUT)/../usr/include)
 
 CLANG_CFLAGS = $(CLANG_SYS_INCLUDES) \
diff --git a/tools/testing/selftests/bpf/prog_tests/syscall.c b/tools/testing/selftests/bpf/prog_tests/syscall.c
new file mode 100644
index 000000000000..e550e36bb5da
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/syscall.c
@@ -0,0 +1,53 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Facebook */
+#include <test_progs.h>
+#include "syscall.skel.h"
+
+struct args {
+	__u64 log_buf;
+	__u32 log_size;
+	int max_entries;
+	int map_fd;
+	int prog_fd;
+};
+
+void test_syscall(void)
+{
+	static char verifier_log[8192];
+	struct args ctx = {
+		.max_entries = 1024,
+		.log_buf = (uintptr_t) verifier_log,
+		.log_size = sizeof(verifier_log),
+	};
+	struct bpf_prog_test_run_attr tattr = {
+		.ctx_in = &ctx,
+		.ctx_size_in = sizeof(ctx),
+	};
+	struct syscall *skel = NULL;
+	__u64 key = 12, value = 0;
+	__u32 duration = 0;
+	int err;
+
+	skel = syscall__open_and_load();
+	if (CHECK(!skel, "skel_load", "syscall skeleton failed\n"))
+		goto cleanup;
+
+	tattr.prog_fd = bpf_program__fd(skel->progs.bpf_prog);
+	err = bpf_prog_test_run_xattr(&tattr);
+	if (CHECK(err || tattr.retval != 1, "test_run sys_bpf",
+		  "err %d errno %d retval %d duration %d\n",
+		  err, errno, tattr.retval, tattr.duration))
+		goto cleanup;
+
+	CHECK(ctx.map_fd <= 0, "map_fd", "fd = %d\n", ctx.map_fd);
+	CHECK(ctx.prog_fd <= 0, "prog_fd", "fd = %d\n", ctx.prog_fd);
+	CHECK(memcmp(verifier_log, "processed", sizeof("processed") - 1) != 0,
+	      "verifier_log", "%s\n", verifier_log);
+
+	err = bpf_map_lookup_elem(ctx.map_fd, &key, &value);
+	CHECK(err, "map_lookup", "map_lookup failed\n");
+	CHECK(value != 34, "invalid_value",
+	      "got value %llu expected %u\n", value, 34);
+cleanup:
+	syscall__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/syscall.c b/tools/testing/selftests/bpf/progs/syscall.c
new file mode 100644
index 000000000000..01476f88e45f
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/syscall.c
@@ -0,0 +1,73 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Facebook */
+#include <linux/stddef.h>
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <../../tools/include/linux/filter.h>
+
+volatile const int workaround = 1;
+
+char _license[] SEC("license") = "GPL";
+
+struct args {
+	__u64 log_buf;
+	__u32 log_size;
+	int max_entries;
+	int map_fd;
+	int prog_fd;
+};
+
+SEC("syscall")
+int bpf_prog(struct args *ctx)
+{
+	static char license[] = "GPL";
+	static struct bpf_insn insns[] = {
+		BPF_ST_MEM(BPF_DW, BPF_REG_10, -8, 0),
+		BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
+		BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
+		BPF_LD_MAP_FD(BPF_REG_1, 0),
+		BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+		BPF_EXIT_INSN(),
+	};
+	static union bpf_attr map_create_attr = {
+		.map_type = BPF_MAP_TYPE_HASH,
+		.key_size = 8,
+		.value_size = 8,
+	};
+	static union bpf_attr map_update_attr = { .map_fd = 1, };
+	static __u64 key = 12;
+	static __u64 value = 34;
+	static union bpf_attr prog_load_attr = {
+		.prog_type = BPF_PROG_TYPE_XDP,
+		.insn_cnt = sizeof(insns) / sizeof(insns[0]),
+	};
+	int ret;
+
+	map_create_attr.max_entries = ctx->max_entries;
+	prog_load_attr.license = (long) license;
+	prog_load_attr.insns = (long) insns;
+	prog_load_attr.log_buf = ctx->log_buf;
+	prog_load_attr.log_size = ctx->log_size;
+	prog_load_attr.log_level = 1;
+
+	ret = bpf_sys_bpf(BPF_MAP_CREATE, &map_create_attr, sizeof(map_create_attr));
+	if (ret <= 0)
+		return ret;
+	ctx->map_fd = ret;
+	insns[3].imm = ret;
+
+	map_update_attr.map_fd = ret;
+	map_update_attr.key = (long) &key;
+	map_update_attr.value = (long) &value;
+	ret = bpf_sys_bpf(BPF_MAP_UPDATE_ELEM, &map_update_attr, sizeof(map_update_attr));
+	if (ret < 0)
+		return ret;
+
+	ret = bpf_sys_bpf(BPF_PROG_LOAD, &prog_load_attr, sizeof(prog_load_attr));
+	if (ret <= 0)
+		return ret;
+	ctx->prog_fd = ret;
+	return 1;
+}
-- 
2.30.2



* [PATCH v2 bpf-next 06/16] bpf: Make btf_load command to be bpfptr_t compatible.
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (4 preceding siblings ...)
  2021-04-23  0:26 ` [PATCH v2 bpf-next 05/16] selftests/bpf: Test " Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-23  0:26 ` [PATCH v2 bpf-next 07/16] selftests/bpf: Test for btf_load command Alexei Starovoitov
                   ` (10 subsequent siblings)
  16 siblings, 0 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Similar to prog_load, make the btf_load command available to
bpf_prog_type_syscall programs.
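
The conversion relies on a pointer that remembers whether it refers to user or
kernel memory, so that nested pointers inside the attr (such as attr->btf)
inherit the same tag as the attr itself. Below is a minimal userspace sketch of
that idea. All model_* names are hypothetical and exist only for illustration.
This is not the kernel's bpfptr_t implementation, and both branches end in
memcpy() here because a userspace model has only one address space.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace model of a tagged pointer like bpfptr_t. */
typedef struct {
	void *ptr;
	bool is_kernel;
} model_bpfptr_t;

/* Build a tagged pointer from a u64 attr field plus the tag of the attr. */
static model_bpfptr_t model_make_bpfptr(uint64_t addr, bool is_kernel)
{
	model_bpfptr_t p = { (void *)(uintptr_t)addr, is_kernel };

	return p;
}

/* Counters standing in for "which copy routine was chosen". */
static int model_user_copies, model_kernel_copies;

static int model_copy_from_bpfptr(void *dst, model_bpfptr_t src, size_t size)
{
	assert(dst && src.ptr);
	if (src.is_kernel)
		model_kernel_copies++;	/* kernel memory: in-kernel copy */
	else
		model_user_copies++;	/* user memory: copy_from_user() before this series */
	memcpy(dst, src.ptr, size);
	return 0;
}
```

With this shape, a call such as model_make_bpfptr(attr_btf, uattr.is_kernel)
mirrors the make_bpfptr(attr->btf, uattr.is_kernel) call in btf_new_fd(): the
data pointer is treated as kernel memory only when the attr itself came from
the kernel.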

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/btf.h  | 2 +-
 kernel/bpf/btf.c     | 8 ++++----
 kernel/bpf/syscall.c | 7 ++++---
 3 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/include/linux/btf.h b/include/linux/btf.h
index 3bac66e0183a..94a0c976c90f 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -21,7 +21,7 @@ extern const struct file_operations btf_fops;
 
 void btf_get(struct btf *btf);
 void btf_put(struct btf *btf);
-int btf_new_fd(const union bpf_attr *attr);
+int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr);
 struct btf *btf_get_by_fd(int fd);
 int btf_get_info_by_fd(const struct btf *btf,
 		       const union bpf_attr *attr,
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 0600ed325fa0..fbf6c06a9d62 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -4257,7 +4257,7 @@ static int btf_parse_hdr(struct btf_verifier_env *env)
 	return 0;
 }
 
-static struct btf *btf_parse(void __user *btf_data, u32 btf_data_size,
+static struct btf *btf_parse(bpfptr_t btf_data, u32 btf_data_size,
 			     u32 log_level, char __user *log_ubuf, u32 log_size)
 {
 	struct btf_verifier_env *env = NULL;
@@ -4306,7 +4306,7 @@ static struct btf *btf_parse(void __user *btf_data, u32 btf_data_size,
 	btf->data = data;
 	btf->data_size = btf_data_size;
 
-	if (copy_from_user(data, btf_data, btf_data_size)) {
+	if (copy_from_bpfptr(data, btf_data, btf_data_size)) {
 		err = -EFAULT;
 		goto errout;
 	}
@@ -5780,12 +5780,12 @@ static int __btf_new_fd(struct btf *btf)
 	return anon_inode_getfd("btf", &btf_fops, btf, O_RDONLY | O_CLOEXEC);
 }
 
-int btf_new_fd(const union bpf_attr *attr)
+int btf_new_fd(const union bpf_attr *attr, bpfptr_t uattr)
 {
 	struct btf *btf;
 	int ret;
 
-	btf = btf_parse(u64_to_user_ptr(attr->btf),
+	btf = btf_parse(make_bpfptr(attr->btf, uattr.is_kernel),
 			attr->btf_size, attr->btf_log_level,
 			u64_to_user_ptr(attr->btf_log_buf),
 			attr->btf_log_size);
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 2e9bc04fd821..9b3bc48b1cc6 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -3831,7 +3831,7 @@ static int bpf_obj_get_info_by_fd(const union bpf_attr *attr,
 
 #define BPF_BTF_LOAD_LAST_FIELD btf_log_level
 
-static int bpf_btf_load(const union bpf_attr *attr)
+static int bpf_btf_load(const union bpf_attr *attr, bpfptr_t uattr)
 {
 	if (CHECK_ATTR(BPF_BTF_LOAD))
 		return -EINVAL;
@@ -3839,7 +3839,7 @@ static int bpf_btf_load(const union bpf_attr *attr)
 	if (!bpf_capable())
 		return -EPERM;
 
-	return btf_new_fd(attr);
+	return btf_new_fd(attr, uattr);
 }
 
 #define BPF_BTF_GET_FD_BY_ID_LAST_FIELD btf_id
@@ -4460,7 +4460,7 @@ static int __sys_bpf(int cmd, bpfptr_t uattr, unsigned int size)
 		err = bpf_raw_tracepoint_open(&attr);
 		break;
 	case BPF_BTF_LOAD:
-		err = bpf_btf_load(&attr);
+		err = bpf_btf_load(&attr, uattr);
 		break;
 	case BPF_BTF_GET_FD_BY_ID:
 		err = bpf_btf_get_fd_by_id(&attr);
@@ -4541,6 +4541,7 @@ BPF_CALL_3(bpf_sys_bpf, int, cmd, void *, attr, u32, attr_size)
 	case BPF_MAP_UPDATE_ELEM:
 	case BPF_MAP_FREEZE:
 	case BPF_PROG_LOAD:
+	case BPF_BTF_LOAD:
 		break;
 	/* case BPF_PROG_TEST_RUN:
 	 * is not part of this list to prevent recursive test_run
-- 
2.30.2



* [PATCH v2 bpf-next 07/16] selftests/bpf: Test for btf_load command.
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (5 preceding siblings ...)
  2021-04-23  0:26 ` [PATCH v2 bpf-next 06/16] bpf: Make btf_load command to be bpfptr_t compatible Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-23  0:26 ` [PATCH v2 bpf-next 08/16] bpf: Introduce fd_idx Alexei Starovoitov
                   ` (9 subsequent siblings)
  16 siblings, 0 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Extend the selftest to check that btf_load works from a bpf program.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/progs/syscall.c | 48 +++++++++++++++++++++
 1 file changed, 48 insertions(+)

diff --git a/tools/testing/selftests/bpf/progs/syscall.c b/tools/testing/selftests/bpf/progs/syscall.c
index 01476f88e45f..b6ac10f75c37 100644
--- a/tools/testing/selftests/bpf/progs/syscall.c
+++ b/tools/testing/selftests/bpf/progs/syscall.c
@@ -4,6 +4,7 @@
 #include <linux/bpf.h>
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
+#include <linux/btf.h>
 #include <../../tools/include/linux/filter.h>
 
 volatile const int workaround = 1;
@@ -18,6 +19,45 @@ struct args {
 	int prog_fd;
 };
 
+#define BTF_INFO_ENC(kind, kind_flag, vlen) \
+	((!!(kind_flag) << 31) | ((kind) << 24) | ((vlen) & BTF_MAX_VLEN))
+#define BTF_TYPE_ENC(name, info, size_or_type) (name), (info), (size_or_type)
+#define BTF_INT_ENC(encoding, bits_offset, nr_bits) \
+	((encoding) << 24 | (bits_offset) << 16 | (nr_bits))
+#define BTF_TYPE_INT_ENC(name, encoding, bits_offset, bits, sz) \
+	BTF_TYPE_ENC(name, BTF_INFO_ENC(BTF_KIND_INT, 0, 0), sz), \
+	BTF_INT_ENC(encoding, bits_offset, bits)
+
+static int btf_load(void)
+{
+	struct btf_blob {
+		struct btf_header btf_hdr;
+		__u32 types[8];
+		__u32 str;
+	} raw_btf = {
+		.btf_hdr = {
+			.magic = BTF_MAGIC,
+			.version = BTF_VERSION,
+			.hdr_len = sizeof(struct btf_header),
+			.type_len = sizeof(__u32) * 8,
+			.str_off = sizeof(__u32) * 8,
+			.str_len = sizeof(__u32),
+		},
+		.types = {
+			/* long */
+			BTF_TYPE_INT_ENC(0, BTF_INT_SIGNED, 0, 64, 8),  /* [1] */
+			/* unsigned long */
+			BTF_TYPE_INT_ENC(0, 0, 0, 64, 8),  /* [2] */
+		},
+	};
+	static union bpf_attr btf_load_attr = {
+		.btf_size = sizeof(raw_btf),
+	};
+
+	btf_load_attr.btf = (long)&raw_btf;
+	return bpf_sys_bpf(BPF_BTF_LOAD, &btf_load_attr, sizeof(btf_load_attr));
+}
+
 SEC("syscall")
 int bpf_prog(struct args *ctx)
 {
@@ -35,6 +75,8 @@ int bpf_prog(struct args *ctx)
 		.map_type = BPF_MAP_TYPE_HASH,
 		.key_size = 8,
 		.value_size = 8,
+		.btf_key_type_id = 1,
+		.btf_value_type_id = 2,
 	};
 	static union bpf_attr map_update_attr = { .map_fd = 1, };
 	static __u64 key = 12;
@@ -45,7 +87,13 @@ int bpf_prog(struct args *ctx)
 	};
 	int ret;
 
+	ret = btf_load();
+	if (ret < 0)
+		return ret;
+
 	map_create_attr.max_entries = ctx->max_entries;
+	map_create_attr.btf_fd = ret;
+
 	prog_load_attr.license = (long) license;
 	prog_load_attr.insns = (long) insns;
 	prog_load_attr.log_buf = ctx->log_buf;
-- 
2.30.2



* [PATCH v2 bpf-next 08/16] bpf: Introduce fd_idx
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (6 preceding siblings ...)
  2021-04-23  0:26 ` [PATCH v2 bpf-next 07/16] selftests/bpf: Test for btf_load command Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-23  0:26 ` [PATCH v2 bpf-next 09/16] libbpf: Support for fd_idx Alexei Starovoitov
                   ` (8 subsequent siblings)
  16 siblings, 0 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

A typical program loading sequence involves creating bpf maps and patching
map FDs into bpf instructions in various places in the bpf program.
This job is done by libbpf, which uses compiler-generated ELF relocations
to patch certain instructions after maps are created and BTFs are loaded.
The goal of fd_idx is to allow bpf instructions to stay immutable
after compilation. At load time libbpf would still create maps as usual,
but it wouldn't need to patch instructions. It would store map FDs into
__u32 fd_array[] and would pass that pointer to sys_bpf(BPF_PROG_LOAD).
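
The FD selection described above can be modelled in userspace roughly as
follows. This is a sketch under stated assumptions, not the verifier code:
struct model_insn and model_resolve_fd() are hypothetical names, the
BPF_PSEUDO_* values are copied from the uapi hunk in this patch, and the
model (like the kernel copy it imitates) does no bounds check on the index,
so the caller must keep it within fd_array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values taken from the uapi header hunk in this patch. */
#define BPF_PSEUDO_MAP_FD		1
#define BPF_PSEUDO_MAP_VALUE		2
#define BPF_PSEUDO_MAP_IDX		5
#define BPF_PSEUDO_MAP_IDX_VALUE	6

/* Hypothetical, trimmed-down stand-in for struct bpf_insn. */
struct model_insn {
	uint8_t src_reg;
	int32_t imm;
};

/*
 * For the *_IDX forms the immediate indexes fd_array; otherwise the
 * immediate is the FD itself. Returns 0 on success, or -1 when an index
 * form is used without fd_array (the patch returns -EPROTO there).
 */
static int model_resolve_fd(const struct model_insn *insn,
			    const uint32_t *fd_array, uint32_t *fd)
{
	assert(insn && fd);
	switch (insn->src_reg) {
	case BPF_PSEUDO_MAP_IDX:
	case BPF_PSEUDO_MAP_IDX_VALUE:
		if (!fd_array)
			return -1;
		*fd = fd_array[insn->imm];
		return 0;
	default:
		*fd = (uint32_t)insn->imm;
		return 0;
	}
}
```

Because the instruction only carries an index, the same instruction stream can
be loaded against different sets of maps by changing fd_array alone.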

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf_verifier.h   |  1 +
 include/uapi/linux/bpf.h       | 16 ++++++++----
 kernel/bpf/syscall.c           |  2 +-
 kernel/bpf/verifier.c          | 47 ++++++++++++++++++++++++++--------
 tools/include/uapi/linux/bpf.h | 16 ++++++++----
 5 files changed, 61 insertions(+), 21 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 6023a1367853..a5a3b4b3e804 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -441,6 +441,7 @@ struct bpf_verifier_env {
 	u32 peak_states;
 	/* longest register parentage chain walked for liveness marking */
 	u32 longest_mark_read_walk;
+	bpfptr_t fd_array;
 };
 
 __printf(2, 0) void bpf_verifier_vlog(struct bpf_verifier_log *log,
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index c92648f38144..de58a714ed36 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1098,8 +1098,8 @@ enum bpf_link_type {
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
  * the following extensions:
  *
- * insn[0].src_reg:  BPF_PSEUDO_MAP_FD
- * insn[0].imm:      map fd
+ * insn[0].src_reg:  BPF_PSEUDO_MAP_[FD|IDX]
+ * insn[0].imm:      map fd or fd_idx
  * insn[1].imm:      0
  * insn[0].off:      0
  * insn[1].off:      0
@@ -1107,15 +1107,19 @@ enum bpf_link_type {
  * verifier type:    CONST_PTR_TO_MAP
  */
 #define BPF_PSEUDO_MAP_FD	1
-/* insn[0].src_reg:  BPF_PSEUDO_MAP_VALUE
- * insn[0].imm:      map fd
+#define BPF_PSEUDO_MAP_IDX	5
+
+/* insn[0].src_reg:  BPF_PSEUDO_MAP_[IDX_]VALUE
+ * insn[0].imm:      map fd or fd_idx
  * insn[1].imm:      offset into value
  * insn[0].off:      0
  * insn[1].off:      0
  * ldimm64 rewrite:  address of map[0]+offset
  * verifier type:    PTR_TO_MAP_VALUE
  */
-#define BPF_PSEUDO_MAP_VALUE	2
+#define BPF_PSEUDO_MAP_VALUE		2
+#define BPF_PSEUDO_MAP_IDX_VALUE	6
+
 /* insn[0].src_reg:  BPF_PSEUDO_BTF_ID
  * insn[0].imm:      kernel btd id of VAR
  * insn[1].imm:      0
@@ -1315,6 +1319,8 @@ union bpf_attr {
 			/* or valid module BTF object fd or 0 to attach to vmlinux */
 			__u32		attach_btf_obj_fd;
 		};
+		__u32		:32;		/* pad */
+		__aligned_u64	fd_array;	/* array of FDs */
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 9b3bc48b1cc6..a81496c5d09f 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2089,7 +2089,7 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type)
 }
 
 /* last field in 'union bpf_attr' used by this command */
-#define	BPF_PROG_LOAD_LAST_FIELD attach_prog_fd
+#define	BPF_PROG_LOAD_LAST_FIELD fd_array
 
 static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr)
 {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 76a18fb1e792..99ca02d7f0e3 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -8830,12 +8830,14 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
 	mark_reg_known_zero(env, regs, insn->dst_reg);
 	dst_reg->map_ptr = map;
 
-	if (insn->src_reg == BPF_PSEUDO_MAP_VALUE) {
+	if (insn->src_reg == BPF_PSEUDO_MAP_VALUE ||
+	    insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE) {
 		dst_reg->type = PTR_TO_MAP_VALUE;
 		dst_reg->off = aux->map_off;
 		if (map_value_has_spin_lock(map))
 			dst_reg->id = ++env->id_gen;
-	} else if (insn->src_reg == BPF_PSEUDO_MAP_FD) {
+	} else if (insn->src_reg == BPF_PSEUDO_MAP_FD ||
+		   insn->src_reg == BPF_PSEUDO_MAP_IDX) {
 		dst_reg->type = CONST_PTR_TO_MAP;
 	} else {
 		verbose(env, "bpf verifier is misconfigured\n");
@@ -11104,6 +11106,7 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
 			struct bpf_map *map;
 			struct fd f;
 			u64 addr;
+			u32 fd;
 
 			if (i == insn_cnt - 1 || insn[1].code != 0 ||
 			    insn[1].dst_reg != 0 || insn[1].src_reg != 0 ||
@@ -11133,16 +11136,38 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
 			/* In final convert_pseudo_ld_imm64() step, this is
 			 * converted into regular 64-bit imm load insn.
 			 */
-			if ((insn[0].src_reg != BPF_PSEUDO_MAP_FD &&
-			     insn[0].src_reg != BPF_PSEUDO_MAP_VALUE) ||
-			    (insn[0].src_reg == BPF_PSEUDO_MAP_FD &&
-			     insn[1].imm != 0)) {
-				verbose(env,
-					"unrecognized bpf_ld_imm64 insn\n");
+			switch (insn[0].src_reg) {
+			case BPF_PSEUDO_MAP_VALUE:
+			case BPF_PSEUDO_MAP_IDX_VALUE:
+				break;
+			case BPF_PSEUDO_MAP_FD:
+			case BPF_PSEUDO_MAP_IDX:
+				if (insn[1].imm == 0)
+					break;
+				fallthrough;
+			default:
+				verbose(env, "unrecognized bpf_ld_imm64 insn\n");
 				return -EINVAL;
 			}
 
-			f = fdget(insn[0].imm);
+			switch (insn[0].src_reg) {
+			case BPF_PSEUDO_MAP_IDX_VALUE:
+			case BPF_PSEUDO_MAP_IDX:
+				if (bpfptr_is_null(env->fd_array)) {
+					verbose(env, "fd_idx without fd_array is invalid\n");
+					return -EPROTO;
+				}
+				if (copy_from_bpfptr_offset(&fd, env->fd_array,
+							    insn[0].imm * sizeof(fd),
+							    sizeof(fd)))
+					return -EFAULT;
+				break;
+			default:
+				fd = insn[0].imm;
+				break;
+			}
+
+			f = fdget(fd);
 			map = __bpf_map_get(f);
 			if (IS_ERR(map)) {
 				verbose(env, "fd %d is not pointing to valid bpf_map\n",
@@ -11157,7 +11182,8 @@ static int resolve_pseudo_ldimm64(struct bpf_verifier_env *env)
 			}
 
 			aux = &env->insn_aux_data[i];
-			if (insn->src_reg == BPF_PSEUDO_MAP_FD) {
+			if (insn[0].src_reg == BPF_PSEUDO_MAP_FD ||
+			    insn[0].src_reg == BPF_PSEUDO_MAP_IDX) {
 				addr = (unsigned long)map;
 			} else {
 				u32 off = insn[1].imm;
@@ -13225,6 +13251,7 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr, bpfptr_t uattr)
 		env->insn_aux_data[i].orig_idx = i;
 	env->prog = *prog;
 	env->ops = bpf_verifier_ops[env->prog->type];
+	env->fd_array = make_bpfptr(attr->fd_array, uattr.is_kernel);
 	is_priv = bpf_capable();
 
 	bpf_get_btf_vmlinux();
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 0c13016d3d2c..6c8e178d8ffa 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1098,8 +1098,8 @@ enum bpf_link_type {
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
  * the following extensions:
  *
- * insn[0].src_reg:  BPF_PSEUDO_MAP_FD
- * insn[0].imm:      map fd
+ * insn[0].src_reg:  BPF_PSEUDO_MAP_[FD|IDX]
+ * insn[0].imm:      map fd or fd_idx
  * insn[1].imm:      0
  * insn[0].off:      0
  * insn[1].off:      0
@@ -1107,15 +1107,19 @@ enum bpf_link_type {
  * verifier type:    CONST_PTR_TO_MAP
  */
 #define BPF_PSEUDO_MAP_FD	1
-/* insn[0].src_reg:  BPF_PSEUDO_MAP_VALUE
- * insn[0].imm:      map fd
+#define BPF_PSEUDO_MAP_IDX	5
+
+/* insn[0].src_reg:  BPF_PSEUDO_MAP_[IDX_]VALUE
+ * insn[0].imm:      map fd or fd_idx
  * insn[1].imm:      offset into value
  * insn[0].off:      0
  * insn[1].off:      0
  * ldimm64 rewrite:  address of map[0]+offset
  * verifier type:    PTR_TO_MAP_VALUE
  */
-#define BPF_PSEUDO_MAP_VALUE	2
+#define BPF_PSEUDO_MAP_VALUE		2
+#define BPF_PSEUDO_MAP_IDX_VALUE	6
+
 /* insn[0].src_reg:  BPF_PSEUDO_BTF_ID
  * insn[0].imm:      kernel btd id of VAR
  * insn[1].imm:      0
@@ -1315,6 +1319,8 @@ union bpf_attr {
 			/* or valid module BTF object fd or 0 to attach to vmlinux */
 			__u32		attach_btf_obj_fd;
 		};
+		__u32		:32;		/* pad */
+		__aligned_u64	fd_array;	/* array of FDs */
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */
-- 
2.30.2



* [PATCH v2 bpf-next 09/16] libbpf: Support for fd_idx
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (7 preceding siblings ...)
  2021-04-23  0:26 ` [PATCH v2 bpf-next 08/16] bpf: Introduce fd_idx Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-26 17:14   ` Andrii Nakryiko
  2021-04-23  0:26 ` [PATCH v2 bpf-next 10/16] bpf: Add bpf_btf_find_by_name_kind() helper Alexei Starovoitov
                   ` (7 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add support for FD_IDX and make libbpf prefer that approach when loading
programs.
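
The indirection this patch relies on can be sketched in plain userspace C.
This is only an illustration, not libbpf or kernel code: struct sketch_map,
build_fd_array() and resolve_fd_idx() are hypothetical names, and the bounds
check in resolve_fd_idx() is an assumption of the sketch (in this series the
kernel receives only a pointer to the array, not its length). Slot i of the
array holds the FD of map i, so an instruction can carry a stable map index
in insn[0].imm instead of a process-specific FD.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a libbpf map object; only the FD matters here. */
struct sketch_map {
	int fd;
};

/* Build an array whose slot i holds the FD of map i, in the same spirit as
 * the loop added to bpf_object__load_progs(). Returns NULL when there are
 * no maps or when allocation fails; the caller frees the result.
 */
static int *build_fd_array(const struct sketch_map *maps, size_t nr_maps)
{
	int *fd_array;
	size_t i;

	if (!nr_maps)
		return NULL;
	fd_array = malloc(sizeof(*fd_array) * nr_maps);
	if (!fd_array)
		return NULL;
	for (i = 0; i < nr_maps; i++)
		fd_array[i] = maps[i].fd;
	return fd_array;
}

/* Conceptual view of BPF_PSEUDO_MAP_IDX: imm is an index into fd_array
 * rather than an FD. The bounds check is added for the sketch only.
 * Returns the FD, or -1 if the index cannot be resolved.
 */
static int resolve_fd_idx(const int *fd_array, size_t nr_fds, int imm)
{
	if (!fd_array || imm < 0 || (size_t)imm >= nr_fds)
		return -1;
	return fd_array[imm];
}
```

With three maps whose FDs are 5, 9 and 12, index 1 resolves to FD 9. The
same instruction stream stays valid if the maps are later created with
different FD numbers, as long as the array order is preserved.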

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/lib/bpf/bpf.c             |  1 +
 tools/lib/bpf/libbpf.c          | 70 +++++++++++++++++++++++++++++----
 tools/lib/bpf/libbpf_internal.h |  1 +
 3 files changed, 65 insertions(+), 7 deletions(-)

diff --git a/tools/lib/bpf/bpf.c b/tools/lib/bpf/bpf.c
index bba48ff4c5c0..b96a3aba6fcc 100644
--- a/tools/lib/bpf/bpf.c
+++ b/tools/lib/bpf/bpf.c
@@ -252,6 +252,7 @@ int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr)
 
 	attr.prog_btf_fd = load_attr->prog_btf_fd;
 	attr.prog_flags = load_attr->prog_flags;
+	attr.fd_array = ptr_to_u64(load_attr->fd_array);
 
 	attr.func_info_rec_size = load_attr->func_info_rec_size;
 	attr.func_info_cnt = load_attr->func_info_cnt;
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 254a0c9aa6cf..17cfc5b66111 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -176,6 +176,8 @@ enum kern_feature_id {
 	FEAT_MODULE_BTF,
 	/* BTF_KIND_FLOAT support */
 	FEAT_BTF_FLOAT,
+	/* Kernel support for FD_IDX */
+	FEAT_FD_IDX,
 	__FEAT_CNT,
 };
 
@@ -288,6 +290,7 @@ struct bpf_program {
 	__u32 line_info_rec_size;
 	__u32 line_info_cnt;
 	__u32 prog_flags;
+	int *fd_array;
 };
 
 struct bpf_struct_ops {
@@ -4181,6 +4184,24 @@ static int probe_module_btf(void)
 	return !err;
 }
 
+static int probe_kern_fd_idx(void)
+{
+	struct bpf_load_program_attr attr;
+	struct bpf_insn insns[] = {
+		BPF_LD_IMM64_RAW(BPF_REG_0, BPF_PSEUDO_MAP_IDX, 0),
+		BPF_EXIT_INSN(),
+	};
+
+	memset(&attr, 0, sizeof(attr));
+	attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
+	attr.insns = insns;
+	attr.insns_cnt = ARRAY_SIZE(insns);
+	attr.license = "GPL";
+
+	probe_fd(bpf_load_program_xattr(&attr, NULL, 0));
+	return errno == EPROTO;
+}
+
 enum kern_feature_result {
 	FEAT_UNKNOWN = 0,
 	FEAT_SUPPORTED = 1,
@@ -4231,6 +4252,9 @@ static struct kern_feature_desc {
 	[FEAT_BTF_FLOAT] = {
 		"BTF_KIND_FLOAT support", probe_kern_btf_float,
 	},
+	[FEAT_FD_IDX] = {
+		"prog_load with fd_idx", probe_kern_fd_idx,
+	},
 };
 
 static bool kernel_supports(enum kern_feature_id feat_id)
@@ -6309,19 +6333,34 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
 
 		switch (relo->type) {
 		case RELO_LD64:
-			insn[0].src_reg = BPF_PSEUDO_MAP_FD;
-			insn[0].imm = obj->maps[relo->map_idx].fd;
+			if (kernel_supports(FEAT_FD_IDX)) {
+				insn[0].src_reg = BPF_PSEUDO_MAP_IDX;
+				insn[0].imm = relo->map_idx;
+			} else {
+				insn[0].src_reg = BPF_PSEUDO_MAP_FD;
+				insn[0].imm = obj->maps[relo->map_idx].fd;
+			}
 			break;
 		case RELO_DATA:
-			insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
 			insn[1].imm = insn[0].imm + relo->sym_off;
-			insn[0].imm = obj->maps[relo->map_idx].fd;
+			if (kernel_supports(FEAT_FD_IDX)) {
+				insn[0].src_reg = BPF_PSEUDO_MAP_IDX_VALUE;
+				insn[0].imm = relo->map_idx;
+			} else {
+				insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
+				insn[0].imm = obj->maps[relo->map_idx].fd;
+			}
 			break;
 		case RELO_EXTERN_VAR:
 			ext = &obj->externs[relo->sym_off];
 			if (ext->type == EXT_KCFG) {
-				insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
-				insn[0].imm = obj->maps[obj->kconfig_map_idx].fd;
+				if (kernel_supports(FEAT_FD_IDX)) {
+					insn[0].src_reg = BPF_PSEUDO_MAP_IDX_VALUE;
+					insn[0].imm = obj->kconfig_map_idx;
+				} else {
+					insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
+					insn[0].imm = obj->maps[obj->kconfig_map_idx].fd;
+				}
 				insn[1].imm = ext->kcfg.data_off;
 			} else /* EXT_KSYM */ {
 				if (ext->ksym.type_id) { /* typed ksyms */
@@ -7047,6 +7086,7 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
 	load_attr.attach_btf_id = prog->attach_btf_id;
 	load_attr.kern_version = kern_version;
 	load_attr.prog_ifindex = prog->prog_ifindex;
+	load_attr.fd_array = prog->fd_array;
 
 	/* specify func_info/line_info only if kernel supports them */
 	btf_fd = bpf_object__btf_fd(prog->obj);
@@ -7239,6 +7279,8 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
 	struct bpf_program *prog;
 	size_t i;
 	int err;
+	struct bpf_map *map;
+	int *fd_array = NULL;
 
 	for (i = 0; i < obj->nr_programs; i++) {
 		prog = &obj->programs[i];
@@ -7247,6 +7289,16 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
 			return err;
 	}
 
+	if (kernel_supports(FEAT_FD_IDX) && obj->nr_maps) {
+		fd_array = malloc(sizeof(int) * obj->nr_maps);
+		if (!fd_array)
+			return -ENOMEM;
+		for (i = 0; i < obj->nr_maps; i++) {
+			map = &obj->maps[i];
+			fd_array[i] = map->fd;
+		}
+	}
+
 	for (i = 0; i < obj->nr_programs; i++) {
 		prog = &obj->programs[i];
 		if (prog_is_subprog(obj, prog))
@@ -7256,10 +7308,14 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
 			continue;
 		}
 		prog->log_level |= log_level;
+		prog->fd_array = fd_array;
 		err = bpf_program__load(prog, obj->license, obj->kern_version);
-		if (err)
+		if (err) {
+			free(fd_array);
 			return err;
+		}
 	}
+	free(fd_array);
 	return 0;
 }
 
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 6017902c687e..9114c7085f2a 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -204,6 +204,7 @@ struct bpf_prog_load_params {
 	__u32 log_level;
 	char *log_buf;
 	size_t log_buf_sz;
+	int *fd_array;
 };
 
 int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr);
-- 
2.30.2



* [PATCH v2 bpf-next 10/16] bpf: Add bpf_btf_find_by_name_kind() helper.
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (8 preceding siblings ...)
  2021-04-23  0:26 ` [PATCH v2 bpf-next 09/16] libbpf: Support for fd_idx Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-26 22:46   ` Andrii Nakryiko
  2021-04-27 21:00   ` John Fastabend
  2021-04-23  0:26 ` [PATCH v2 bpf-next 11/16] bpf: Add bpf_sys_close() helper Alexei Starovoitov
                   ` (6 subsequent siblings)
  16 siblings, 2 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add new helper:

long bpf_btf_find_by_name_kind(u32 btf_fd, char *name, u32 kind, int flags)
	Description
		Find a type with the given name and kind in the BTF pointed to by btf_fd.
		If btf_fd is zero, look for the name in vmlinux BTF and then in module BTFs.
	Return
		Returns btf_id and btf_obj_fd in lower and upper 32 bits.
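
A caller has to split the 64-bit return value back into its two halves. A
minimal sketch of that, in plain C, is shown here; struct btf_lookup_result,
pack_btf_result() and unpack_btf_result() are hypothetical names used only
for illustration and are not part of this patch. Negative return values are
errors and must be checked before unpacking.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical holder for the two halves of a successful return value. */
struct btf_lookup_result {
	uint32_t btf_id;	/* lower 32 bits */
	uint32_t btf_obj_fd;	/* upper 32 bits */
};

/* Same packing as the helper uses: btf_id | (btf_obj_fd << 32). */
static int64_t pack_btf_result(uint32_t btf_id, uint32_t btf_obj_fd)
{
	return (int64_t)((uint64_t)btf_id | ((uint64_t)btf_obj_fd << 32));
}

/* Split a non-negative return value into btf_id and btf_obj_fd.
 * The caller is expected to have handled negative (error) values already.
 */
static struct btf_lookup_result unpack_btf_result(int64_t ret)
{
	struct btf_lookup_result res;

	assert(ret >= 0);
	res.btf_id = (uint32_t)((uint64_t)ret & 0xffffffffu);
	res.btf_obj_fd = (uint32_t)((uint64_t)ret >> 32);
	return res;
}
```

In the code of this patch the upper half is non-zero only when the name was
found in a module's BTF. A plain btf_id is returned when the match comes from
vmlinux BTF or from the BTF given by btf_fd.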

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf.h            |  1 +
 include/uapi/linux/bpf.h       |  8 ++++
 kernel/bpf/btf.c               | 68 ++++++++++++++++++++++++++++++++++
 kernel/bpf/syscall.c           |  2 +
 tools/include/uapi/linux/bpf.h |  8 ++++
 5 files changed, 87 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 0f841bd0cb85..4cf361eb6a80 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1972,6 +1972,7 @@ extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto;
 extern const struct bpf_func_proto bpf_task_storage_get_proto;
 extern const struct bpf_func_proto bpf_task_storage_delete_proto;
 extern const struct bpf_func_proto bpf_for_each_map_elem_proto;
+extern const struct bpf_func_proto bpf_btf_find_by_name_kind_proto;
 
 const struct bpf_func_proto *bpf_tracing_func_proto(
 	enum bpf_func_id func_id, const struct bpf_prog *prog);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index de58a714ed36..253f5f031f08 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4748,6 +4748,13 @@ union bpf_attr {
  * 		Execute bpf syscall with given arguments.
  * 	Return
  * 		A syscall result.
+ *
+ * long bpf_btf_find_by_name_kind(u32 btf_fd, char *name, u32 kind, int flags)
+ * 	Description
+ * 		Find given name with given type in BTF pointed to by btf_fd.
+ * 		If btf_fd is zero look for the name in vmlinux BTF and in module's BTFs.
+ * 	Return
+ * 		Returns btf_id and btf_obj_fd in lower and upper 32 bits.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4917,6 +4924,7 @@ union bpf_attr {
 	FN(for_each_map_elem),		\
 	FN(snprintf),			\
 	FN(sys_bpf),			\
+	FN(btf_find_by_name_kind),	\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index fbf6c06a9d62..446c64171464 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6085,3 +6085,71 @@ struct module *btf_try_get_module(const struct btf *btf)
 
 	return res;
 }
+
+BPF_CALL_4(bpf_btf_find_by_name_kind, int, btf_fd, char *, name, u32, kind, int, flags)
+{
+	char kname[KSYM_NAME_LEN];
+	struct btf *btf;
+	long ret;
+
+	if (flags)
+		return -EINVAL;
+
+	ret = strncpy_from_kernel_nofault(kname, name, sizeof(kname));
+	if (ret < 0)
+		return ret;
+	if (btf_fd)
+		btf = btf_get_by_fd(btf_fd);
+	else
+		btf = bpf_get_btf_vmlinux();
+	if (IS_ERR(btf))
+		return PTR_ERR(btf);
+
+	ret = btf_find_by_name_kind(btf, kname, kind);
+	/* ret is never zero, since btf_find_by_name_kind returns
+	 * positive btf_id or negative error.
+	 */
+	if (btf_fd)
+		btf_put(btf);
+	else if (ret < 0) {
+		struct btf *mod_btf;
+		int id;
+
+		/* If name is not found in vmlinux's BTF then search in module's BTFs */
+		spin_lock_bh(&btf_idr_lock);
+		idr_for_each_entry(&btf_idr, mod_btf, id) {
+			if (!btf_is_module(mod_btf))
+				continue;
+			/* linear search could be slow hence unlock/lock
+			 * the IDR to avoid holding it for too long
+			 */
+			btf_get(mod_btf);
+			spin_unlock_bh(&btf_idr_lock);
+			ret = btf_find_by_name_kind(mod_btf, kname, kind);
+			if (ret > 0) {
+				int btf_obj_fd;
+
+				btf_obj_fd = __btf_new_fd(mod_btf);
+				if (btf_obj_fd < 0) {
+					btf_put(mod_btf);
+					return btf_obj_fd;
+				}
+				return ret | (((u64)btf_obj_fd) << 32);
+			}
+			spin_lock_bh(&btf_idr_lock);
+			btf_put(mod_btf);
+		}
+		spin_unlock_bh(&btf_idr_lock);
+	}
+	return ret;
+}
+
+const struct bpf_func_proto bpf_btf_find_by_name_kind_proto = {
+	.func		= bpf_btf_find_by_name_kind,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_ANYTHING,
+	.arg2_type	= ARG_ANYTHING,
+	.arg3_type	= ARG_ANYTHING,
+	.arg4_type	= ARG_ANYTHING,
+};
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index a81496c5d09f..638c7acad925 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -4574,6 +4574,8 @@ syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	switch (func_id) {
 	case BPF_FUNC_sys_bpf:
 		return &bpf_sys_bpf_proto;
+	case BPF_FUNC_btf_find_by_name_kind:
+		return &bpf_btf_find_by_name_kind_proto;
 	default:
 		return tracing_prog_func_proto(func_id, prog);
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 6c8e178d8ffa..5841adb44de6 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4748,6 +4748,13 @@ union bpf_attr {
  * 		Execute bpf syscall with given arguments.
  * 	Return
  * 		A syscall result.
+ *
+ * long bpf_btf_find_by_name_kind(u32 btf_fd, char *name, u32 kind, int flags)
+ * 	Description
+ * 		Find given name with given type in BTF pointed to by btf_fd.
+ * 		If btf_fd is zero look for the name in vmlinux BTF and in module's BTFs.
+ * 	Return
+ * 		Returns btf_id and btf_obj_fd in lower and upper 32 bits.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4917,6 +4924,7 @@ union bpf_attr {
 	FN(for_each_map_elem),		\
 	FN(snprintf),			\
 	FN(sys_bpf),			\
+	FN(btf_find_by_name_kind),	\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.30.2



* [PATCH v2 bpf-next 11/16] bpf: Add bpf_sys_close() helper.
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (9 preceding siblings ...)
  2021-04-23  0:26 ` [PATCH v2 bpf-next 10/16] bpf: Add bpf_btf_find_by_name_kind() helper Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-23  0:26 ` [PATCH v2 bpf-next 12/16] libbpf: Change the order of data and text relocations Alexei Starovoitov
                   ` (5 subsequent siblings)
  16 siblings, 0 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add bpf_sys_close() helper to be used by the syscall/loader program to close
intermediate FDs and perform other cleanup.
Note that this helper must never be allowed to run between an fdget() and
its matching fdput().

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/uapi/linux/bpf.h       |  7 +++++++
 kernel/bpf/syscall.c           | 19 +++++++++++++++++++
 tools/include/uapi/linux/bpf.h |  7 +++++++
 3 files changed, 33 insertions(+)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 253f5f031f08..45e55ad3c617 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -4755,6 +4755,12 @@ union bpf_attr {
  * 		If btf_fd is zero look for the name in vmlinux BTF and in module's BTFs.
  * 	Return
  * 		Returns btf_id and btf_obj_fd in lower and upper 32 bits.
+ *
+ * long bpf_sys_close(u32 fd)
+ * 	Description
+ * 		Execute close syscall for given FD.
+ * 	Return
+ * 		A syscall result.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4925,6 +4931,7 @@ union bpf_attr {
 	FN(snprintf),			\
 	FN(sys_bpf),			\
 	FN(btf_find_by_name_kind),	\
+	FN(sys_close),			\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 638c7acad925..f5519e84b097 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -4568,6 +4568,23 @@ tracing_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 	return bpf_base_func_proto(func_id);
 }
 
+BPF_CALL_1(bpf_sys_close, u32, fd)
+{
+	/* When bpf program calls this helper there should not be
+	 * an fdget() without matching completed fdput().
+	 * This helper is allowed in the following callchain only:
+	 * sys_bpf->prog_test_run->bpf_prog->bpf_sys_close
+	 */
+	return close_fd(fd);
+}
+
+const struct bpf_func_proto bpf_sys_close_proto = {
+	.func		= bpf_sys_close,
+	.gpl_only	= false,
+	.ret_type	= RET_INTEGER,
+	.arg1_type	= ARG_ANYTHING,
+};
+
 static const struct bpf_func_proto *
 syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -4576,6 +4593,8 @@ syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_sys_bpf_proto;
 	case BPF_FUNC_btf_find_by_name_kind:
 		return &bpf_btf_find_by_name_kind_proto;
+	case BPF_FUNC_sys_close:
+		return &bpf_sys_close_proto;
 	default:
 		return tracing_prog_func_proto(func_id, prog);
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 5841adb44de6..8dd27faa30ee 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -4755,6 +4755,12 @@ union bpf_attr {
  * 		If btf_fd is zero look for the name in vmlinux BTF and in module's BTFs.
  * 	Return
  * 		Returns btf_id and btf_obj_fd in lower and upper 32 bits.
+ *
+ * long bpf_sys_close(u32 fd)
+ * 	Description
+ * 		Execute close syscall for given FD.
+ * 	Return
+ * 		A syscall result.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -4925,6 +4931,7 @@ union bpf_attr {
 	FN(snprintf),			\
 	FN(sys_bpf),			\
 	FN(btf_find_by_name_kind),	\
+	FN(sys_close),			\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.30.2



* [PATCH v2 bpf-next 12/16] libbpf: Change the order of data and text relocations.
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (10 preceding siblings ...)
  2021-04-23  0:26 ` [PATCH v2 bpf-next 11/16] bpf: Add bpf_sys_close() helper Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-26 17:29   ` Andrii Nakryiko
  2021-04-23  0:26 ` [PATCH v2 bpf-next 13/16] libbpf: Add bpf_object pointer to kernel_supports() Alexei Starovoitov
                   ` (4 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

To make it possible to generate a loader program in later patches,
change the order of data and text relocations.
Also extend the test to cover data relos.

If the kernel supports "FD array", the map_fd relocations can be processed
before text relos, since the generated loader program won't need to manually
patch ld_imm64 insns with map_fd.
But ksym and kfunc relocations can only be processed after all calls
are relocated, since the loader program will consist of a sequence
of calls to bpf_btf_find_by_name_kind() followed by patching of btf_id
and btf_obj_fd into the corresponding ld_imm64 insns. The locations of those
ld_imm64 insns are specified in relocations.
Hence process all data relocations (maps, ksym, kfunc) together after call relos.
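
The bookkeeping this requires can be sketched outside libbpf. This is a
simplified model, not the code in the diff: struct sketch_relo, struct
sketch_prog and append_relos() are hypothetical names, plain realloc()
stands in for libbpf's allocation helper, and the early return for a
subprog without relos is an addition of the sketch. When a subprog's insns
are appended to a main program at offset sub_insn_off, its relocation
descriptors are appended too, with each insn_idx shifted by that offset.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, reduced relocation descriptor. */
struct sketch_relo {
	int insn_idx;	/* instruction the relocation applies to */
	int type;	/* relocation kind; opaque in this sketch */
};

/* Hypothetical, reduced program: just its relos and placement. */
struct sketch_prog {
	struct sketch_relo *relos;
	int nr_relos;
	int sub_insn_off;	/* where this subprog starts in the main prog */
};

/* Append subprog's relos to main_prog, rebasing insn_idx onto the main
 * program's instruction numbering. Returns 0 on success, -1 on allocation
 * failure.
 */
static int append_relos(struct sketch_prog *main_prog,
			const struct sketch_prog *subprog)
{
	int new_cnt = main_prog->nr_relos + subprog->nr_relos;
	struct sketch_relo *relos;
	int i;

	if (main_prog == subprog || subprog->nr_relos == 0)
		return 0;
	relos = realloc(main_prog->relos, sizeof(*relos) * new_cnt);
	if (!relos)
		return -1;
	memcpy(relos + main_prog->nr_relos, subprog->relos,
	       sizeof(*relos) * subprog->nr_relos);
	/* Shift only the newly copied entries. */
	for (i = main_prog->nr_relos; i < new_cnt; i++)
		relos[i].insn_idx += subprog->sub_insn_off;
	main_prog->relos = relos;
	main_prog->nr_relos = new_cnt;
	return 0;
}
```

Because the subprog's insns are placed after everything already in the main
program, the shifted indices are larger than the existing ones, so an array
that was sorted by insn_idx stays sorted and can still be searched with
bsearch().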

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/lib/bpf/libbpf.c                        | 86 +++++++++++++++----
 .../selftests/bpf/progs/test_subprogs.c       | 13 +++
 2 files changed, 80 insertions(+), 19 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 17cfc5b66111..c73a85b97ca5 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -6379,11 +6379,15 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
 			insn[0].imm = ext->ksym.kernel_btf_id;
 			break;
 		case RELO_SUBPROG_ADDR:
-			insn[0].src_reg = BPF_PSEUDO_FUNC;
-			/* will be handled as a follow up pass */
+			if (insn[0].src_reg != BPF_PSEUDO_FUNC) {
+				pr_warn("prog '%s': relo #%d: bad insn\n",
+					prog->name, i);
+				return -EINVAL;
+			}
+			/* handled already */
 			break;
 		case RELO_CALL:
-			/* will be handled as a follow up pass */
+			/* handled already */
 			break;
 		default:
 			pr_warn("prog '%s': relo #%d: bad relo type %d\n",
@@ -6552,6 +6556,31 @@ static struct reloc_desc *find_prog_insn_relo(const struct bpf_program *prog, si
 		       sizeof(*prog->reloc_desc), cmp_relo_by_insn_idx);
 }
 
+static int append_subprog_relos(struct bpf_program *main_prog, struct bpf_program *subprog)
+{
+	int new_cnt = main_prog->nr_reloc + subprog->nr_reloc;
+	struct reloc_desc *relos;
+	size_t off = subprog->sub_insn_off;
+	int i;
+
+	if (main_prog == subprog)
+		return 0;
+	relos = libbpf_reallocarray(main_prog->reloc_desc, new_cnt, sizeof(*relos));
+	if (!relos)
+		return -ENOMEM;
+	memcpy(relos + main_prog->nr_reloc, subprog->reloc_desc,
+	       sizeof(*relos) * subprog->nr_reloc);
+
+	for (i = main_prog->nr_reloc; i < new_cnt; i++)
+		relos[i].insn_idx += off;
+	/* After insn_idx adjustment the 'relos' array is still sorted
+	 * by insn_idx and doesn't break bsearch.
+	 */
+	main_prog->reloc_desc = relos;
+	main_prog->nr_reloc = new_cnt;
+	return 0;
+}
+
 static int
 bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog,
 		       struct bpf_program *prog)
@@ -6560,18 +6589,32 @@ bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog,
 	struct bpf_program *subprog;
 	struct bpf_insn *insns, *insn;
 	struct reloc_desc *relo;
-	int err;
+	int err, i;
 
 	err = reloc_prog_func_and_line_info(obj, main_prog, prog);
 	if (err)
 		return err;
 
+	for (i = 0; i < prog->nr_reloc; i++) {
+		relo = &prog->reloc_desc[i];
+		insn = &main_prog->insns[prog->sub_insn_off + relo->insn_idx];
+
+		if (relo->type == RELO_SUBPROG_ADDR)
+			/* mark the insn, so it becomes insn_is_pseudo_func() */
+			insn[0].src_reg = BPF_PSEUDO_FUNC;
+	}
+
 	for (insn_idx = 0; insn_idx < prog->sec_insn_cnt; insn_idx++) {
 		insn = &main_prog->insns[prog->sub_insn_off + insn_idx];
 		if (!insn_is_subprog_call(insn) && !insn_is_pseudo_func(insn))
 			continue;
 
 		relo = find_prog_insn_relo(prog, insn_idx);
+		if (relo && relo->type == RELO_EXTERN_FUNC)
+			/* kfunc relocations will be handled later
+			 * in bpf_object__relocate_data()
+			 */
+			continue;
 		if (relo && relo->type != RELO_CALL && relo->type != RELO_SUBPROG_ADDR) {
 			pr_warn("prog '%s': unexpected relo for insn #%zu, type %d\n",
 				prog->name, insn_idx, relo->type);
@@ -6646,6 +6689,10 @@ bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog,
 			pr_debug("prog '%s': added %zu insns from sub-prog '%s'\n",
 				 main_prog->name, subprog->insns_cnt, subprog->name);
 
+			/* The subprog insns are now appended. Append its relos too. */
+			err = append_subprog_relos(main_prog, subprog);
+			if (err)
+				return err;
 			err = bpf_object__reloc_code(obj, main_prog, subprog);
 			if (err)
 				return err;
@@ -6790,23 +6837,12 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
 			return err;
 		}
 	}
-	/* relocate data references first for all programs and sub-programs,
-	 * as they don't change relative to code locations, so subsequent
-	 * subprogram processing won't need to re-calculate any of them
-	 */
-	for (i = 0; i < obj->nr_programs; i++) {
-		prog = &obj->programs[i];
-		err = bpf_object__relocate_data(obj, prog);
-		if (err) {
-			pr_warn("prog '%s': failed to relocate data references: %d\n",
-				prog->name, err);
-			return err;
-		}
-	}
-	/* now relocate subprogram calls and append used subprograms to main
+	/* relocate subprogram calls and append used subprograms to main
 	 * programs; each copy of subprogram code needs to be relocated
 	 * differently for each main program, because its code location might
-	 * have changed
+	 * have changed.
+	 * Append subprog relos to main programs to allow data relos to be
+	 * processed after text is completely relocated.
 	 */
 	for (i = 0; i < obj->nr_programs; i++) {
 		prog = &obj->programs[i];
@@ -6823,6 +6859,18 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
 			return err;
 		}
 	}
+	/* Process data relos for main programs */
+	for (i = 0; i < obj->nr_programs; i++) {
+		prog = &obj->programs[i];
+		if (prog_is_subprog(obj, prog))
+			continue;
+		err = bpf_object__relocate_data(obj, prog);
+		if (err) {
+			pr_warn("prog '%s': failed to relocate data references: %d\n",
+				prog->name, err);
+			return err;
+		}
+	}
 	/* free up relocation descriptors */
 	for (i = 0; i < obj->nr_programs; i++) {
 		prog = &obj->programs[i];
diff --git a/tools/testing/selftests/bpf/progs/test_subprogs.c b/tools/testing/selftests/bpf/progs/test_subprogs.c
index d3c5673c0218..b7c37ca09544 100644
--- a/tools/testing/selftests/bpf/progs/test_subprogs.c
+++ b/tools/testing/selftests/bpf/progs/test_subprogs.c
@@ -4,8 +4,18 @@
 
 const char LICENSE[] SEC("license") = "GPL";
 
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(max_entries, 1);
+	__type(key, __u32);
+	__type(value, __u64);
+} array SEC(".maps");
+
 __noinline int sub1(int x)
 {
+	int key = 0;
+
+	bpf_map_lookup_elem(&array, &key);
 	return x + 1;
 }
 
@@ -23,6 +33,9 @@ static __noinline int sub3(int z)
 
 static __noinline int sub4(int w)
 {
+	int key = 0;
+
+	bpf_map_lookup_elem(&array, &key);
 	return w + sub3(5) + sub1(6);
 }
 
-- 
2.30.2



* [PATCH v2 bpf-next 13/16] libbpf: Add bpf_object pointer to kernel_supports().
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (11 preceding siblings ...)
  2021-04-23  0:26 ` [PATCH v2 bpf-next 12/16] libbpf: Change the order of data and text relocations Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-26 17:30   ` Andrii Nakryiko
  2021-04-23  0:26 ` [PATCH v2 bpf-next 14/16] libbpf: Generate loader program out of BPF ELF file Alexei Starovoitov
                   ` (3 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add a pointer to 'struct bpf_object' to kernel_supports() helper.
It will be used in the next patch.
No functional changes.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/lib/bpf/libbpf.c | 52 +++++++++++++++++++++---------------------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index c73a85b97ca5..7f58c389ff67 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -181,7 +181,7 @@ enum kern_feature_id {
 	__FEAT_CNT,
 };
 
-static bool kernel_supports(enum kern_feature_id feat_id);
+static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id);
 
 enum reloc_type {
 	RELO_LD64,
@@ -2407,20 +2407,20 @@ static bool section_have_execinstr(struct bpf_object *obj, int idx)
 
 static bool btf_needs_sanitization(struct bpf_object *obj)
 {
-	bool has_func_global = kernel_supports(FEAT_BTF_GLOBAL_FUNC);
-	bool has_datasec = kernel_supports(FEAT_BTF_DATASEC);
-	bool has_float = kernel_supports(FEAT_BTF_FLOAT);
-	bool has_func = kernel_supports(FEAT_BTF_FUNC);
+	bool has_func_global = kernel_supports(obj, FEAT_BTF_GLOBAL_FUNC);
+	bool has_datasec = kernel_supports(obj, FEAT_BTF_DATASEC);
+	bool has_float = kernel_supports(obj, FEAT_BTF_FLOAT);
+	bool has_func = kernel_supports(obj, FEAT_BTF_FUNC);
 
 	return !has_func || !has_datasec || !has_func_global || !has_float;
 }
 
 static void bpf_object__sanitize_btf(struct bpf_object *obj, struct btf *btf)
 {
-	bool has_func_global = kernel_supports(FEAT_BTF_GLOBAL_FUNC);
-	bool has_datasec = kernel_supports(FEAT_BTF_DATASEC);
-	bool has_float = kernel_supports(FEAT_BTF_FLOAT);
-	bool has_func = kernel_supports(FEAT_BTF_FUNC);
+	bool has_func_global = kernel_supports(obj, FEAT_BTF_GLOBAL_FUNC);
+	bool has_datasec = kernel_supports(obj, FEAT_BTF_DATASEC);
+	bool has_float = kernel_supports(obj, FEAT_BTF_FLOAT);
+	bool has_func = kernel_supports(obj, FEAT_BTF_FUNC);
 	struct btf_type *t;
 	int i, j, vlen;
 
@@ -2626,7 +2626,7 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj)
 	if (!obj->btf)
 		return 0;
 
-	if (!kernel_supports(FEAT_BTF)) {
+	if (!kernel_supports(obj, FEAT_BTF)) {
 		if (kernel_needs_btf(obj)) {
 			err = -EOPNOTSUPP;
 			goto report;
@@ -4257,7 +4257,7 @@ static struct kern_feature_desc {
 	},
 };
 
-static bool kernel_supports(enum kern_feature_id feat_id)
+static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id feat_id)
 {
 	struct kern_feature_desc *feat = &feature_probes[feat_id];
 	int ret;
@@ -4376,7 +4376,7 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map)
 
 	memset(&create_attr, 0, sizeof(create_attr));
 
-	if (kernel_supports(FEAT_PROG_NAME))
+	if (kernel_supports(obj, FEAT_PROG_NAME))
 		create_attr.name = map->name;
 	create_attr.map_ifindex = map->map_ifindex;
 	create_attr.map_type = def->type;
@@ -4941,7 +4941,7 @@ static int load_module_btfs(struct bpf_object *obj)
 	obj->btf_modules_loaded = true;
 
 	/* kernel too old to support module BTFs */
-	if (!kernel_supports(FEAT_MODULE_BTF))
+	if (!kernel_supports(obj, FEAT_MODULE_BTF))
 		return 0;
 
 	while (true) {
@@ -6333,7 +6333,7 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
 
 		switch (relo->type) {
 		case RELO_LD64:
-			if (kernel_supports(FEAT_FD_IDX)) {
+			if (kernel_supports(obj, FEAT_FD_IDX)) {
 				insn[0].src_reg = BPF_PSEUDO_MAP_IDX;
 				insn[0].imm = relo->map_idx;
 			} else {
@@ -6343,7 +6343,7 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
 			break;
 		case RELO_DATA:
 			insn[1].imm = insn[0].imm + relo->sym_off;
-			if (kernel_supports(FEAT_FD_IDX)) {
+			if (kernel_supports(obj, FEAT_FD_IDX)) {
 				insn[0].src_reg = BPF_PSEUDO_MAP_IDX_VALUE;
 				insn[0].imm = relo->map_idx;
 			} else {
@@ -6354,7 +6354,7 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
 		case RELO_EXTERN_VAR:
 			ext = &obj->externs[relo->sym_off];
 			if (ext->type == EXT_KCFG) {
-				if (kernel_supports(FEAT_FD_IDX)) {
+				if (kernel_supports(obj, FEAT_FD_IDX)) {
 					insn[0].src_reg = BPF_PSEUDO_MAP_IDX_VALUE;
 					insn[0].imm = obj->kconfig_map_idx;
 				} else {
@@ -6478,7 +6478,7 @@ reloc_prog_func_and_line_info(const struct bpf_object *obj,
 	/* no .BTF.ext relocation if .BTF.ext is missing or kernel doesn't
 	 * supprot func/line info
 	 */
-	if (!obj->btf_ext || !kernel_supports(FEAT_BTF_FUNC))
+	if (!obj->btf_ext || !kernel_supports(obj, FEAT_BTF_FUNC))
 		return 0;
 
 	/* only attempt func info relocation if main program's func_info
@@ -7076,12 +7076,12 @@ static int bpf_object__sanitize_prog(struct bpf_object* obj, struct bpf_program
 		switch (func_id) {
 		case BPF_FUNC_probe_read_kernel:
 		case BPF_FUNC_probe_read_user:
-			if (!kernel_supports(FEAT_PROBE_READ_KERN))
+			if (!kernel_supports(obj, FEAT_PROBE_READ_KERN))
 				insn->imm = BPF_FUNC_probe_read;
 			break;
 		case BPF_FUNC_probe_read_kernel_str:
 		case BPF_FUNC_probe_read_user_str:
-			if (!kernel_supports(FEAT_PROBE_READ_KERN))
+			if (!kernel_supports(obj, FEAT_PROBE_READ_KERN))
 				insn->imm = BPF_FUNC_probe_read_str;
 			break;
 		default:
@@ -7116,12 +7116,12 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
 
 	load_attr.prog_type = prog->type;
 	/* old kernels might not support specifying expected_attach_type */
-	if (!kernel_supports(FEAT_EXP_ATTACH_TYPE) && prog->sec_def &&
+	if (!kernel_supports(prog->obj, FEAT_EXP_ATTACH_TYPE) && prog->sec_def &&
 	    prog->sec_def->is_exp_attach_type_optional)
 		load_attr.expected_attach_type = 0;
 	else
 		load_attr.expected_attach_type = prog->expected_attach_type;
-	if (kernel_supports(FEAT_PROG_NAME))
+	if (kernel_supports(prog->obj, FEAT_PROG_NAME))
 		load_attr.name = prog->name;
 	load_attr.insns = insns;
 	load_attr.insn_cnt = insns_cnt;
@@ -7138,7 +7138,7 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
 
 	/* specify func_info/line_info only if kernel supports them */
 	btf_fd = bpf_object__btf_fd(prog->obj);
-	if (btf_fd >= 0 && kernel_supports(FEAT_BTF_FUNC)) {
+	if (btf_fd >= 0 && kernel_supports(prog->obj, FEAT_BTF_FUNC)) {
 		load_attr.prog_btf_fd = btf_fd;
 		load_attr.func_info = prog->func_info;
 		load_attr.func_info_rec_size = prog->func_info_rec_size;
@@ -7168,7 +7168,7 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
 			pr_debug("verifier log:\n%s", log_buf);
 
 		if (prog->obj->rodata_map_idx >= 0 &&
-		    kernel_supports(FEAT_PROG_BIND_MAP)) {
+		    kernel_supports(prog->obj, FEAT_PROG_BIND_MAP)) {
 			struct bpf_map *rodata_map =
 				&prog->obj->maps[prog->obj->rodata_map_idx];
 
@@ -7337,7 +7337,7 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
 			return err;
 	}
 
-	if (kernel_supports(FEAT_FD_IDX) && obj->nr_maps) {
+	if (kernel_supports(obj, FEAT_FD_IDX) && obj->nr_maps) {
 		fd_array = malloc(sizeof(int) * obj->nr_maps);
 		if (!fd_array)
 			return -ENOMEM;
@@ -7542,11 +7542,11 @@ static int bpf_object__sanitize_maps(struct bpf_object *obj)
 	bpf_object__for_each_map(m, obj) {
 		if (!bpf_map__is_internal(m))
 			continue;
-		if (!kernel_supports(FEAT_GLOBAL_DATA)) {
+		if (!kernel_supports(obj, FEAT_GLOBAL_DATA)) {
 			pr_warn("kernel doesn't support global data\n");
 			return -ENOTSUP;
 		}
-		if (!kernel_supports(FEAT_ARRAY_MMAP))
+		if (!kernel_supports(obj, FEAT_ARRAY_MMAP))
 			m->def.map_flags ^= BPF_F_MMAPABLE;
 	}
 
-- 
2.30.2



* [PATCH v2 bpf-next 14/16] libbpf: Generate loader program out of BPF ELF file.
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (12 preceding siblings ...)
  2021-04-23  0:26 ` [PATCH v2 bpf-next 13/16] libbpf: Add bpf_object pointer to kernel_supports() Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-26 22:22   ` Andrii Nakryiko
  2021-04-23  0:26 ` [PATCH v2 bpf-next 15/16] bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command Alexei Starovoitov
                   ` (2 subsequent siblings)
  16 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

The BPF program loading process performed by libbpf is quite complex
and consists of the following steps:
"open" phase:
- parse elf file and remember relocations, sections
- collect externs and ksyms including their btf_ids in prog's BTF
- patch BTF datasec (since llvm couldn't do it)
- init maps (old style map_def, BTF based, global data map, kconfig map)
- collect relocations against progs and maps
"load" phase:
- probe kernel features
- load vmlinux BTF
- resolve externs (kconfig and ksym)
- load program BTF
- init struct_ops
- create maps
- apply CO-RE relocations
- patch ld_imm64 insns with src_reg=PSEUDO_MAP, PSEUDO_MAP_VALUE, PSEUDO_BTF_ID
- reposition subprograms and adjust call insns
- sanitize and load progs

During this process libbpf makes sys_bpf() calls to load BTF, create maps,
populate maps and finally load programs.
Instead of actually making the syscalls, generate a trace of what libbpf
would have done and represent it as the "loader program".
The "loader program" consists of a single map with:
- union bpf_attr(s)
- BTF bytes
- map value bytes
- insns bytes
and a single bpf program that passes bpf_attr(s) and data into the bpf_sys_bpf() helper.
Executing such a "loader program" via the bpf_prog_test_run() command will
replay the sequence of syscalls that libbpf would have made, which will result
in the same maps being created and programs loaded as specified in the elf file.
The "loader program" removes the libelf dependency and the majority of the
libbpf dependency from the program loading process.
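
The "single map" idea above can be pictured with a small userspace sketch.
It is not the code from this patch: struct blob, blob_add(), blob_rel_store()
and struct toy_attr are hypothetical names used only for illustration. Bytes
are appended to one growing buffer and referred to by offset; a pointer field
is filled in only once the buffer has its final address, which is roughly
what bpf_gen__add_data() and bpf_gen__emit_rel_store() arrange for the real
loader program.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for the data half of struct bpf_gen. */
struct blob {
	unsigned char *start;
	size_t len;
	int error;
};

/* Append bytes and return their offset in the blob. On failure the error is
 * latched in b->error and 0 is returned, so callers can keep recording and
 * check the error once at the end.
 */
static int blob_add(struct blob *b, const void *data, size_t size)
{
	size_t off = b->len;
	unsigned char *p;

	if (b->error)
		return 0;
	if (size == 0)
		return (int)off;
	if (size > INT32_MAX || off + size > INT32_MAX) {
		b->error = -ERANGE;
		return 0;
	}
	p = realloc(b->start, off + size);
	if (!p) {
		b->error = -ENOMEM;
		return 0;
	}
	memcpy(p + off, data, size);
	b->start = p;
	b->len = off + size;
	return (int)off;
}

/* *(u64 *)(blob + off) = (u64)(void *)(blob + data), applied once the blob
 * has stopped growing and its base address is final.
 */
static void blob_rel_store(struct blob *b, int off, int data)
{
	uint64_t addr = (uint64_t)(uintptr_t)(b->start + data);

	memcpy(b->start + off, &addr, sizeof(addr));
}

/* Hypothetical attribute: one pointer-sized field plus a length. */
struct toy_attr {
	uint64_t str_ptr;
	uint32_t str_len;
};

/* Returns 0 if the pointer patched into the attr resolves to the string. */
static int blob_demo(void)
{
	struct blob b = {0};
	struct toy_attr attr = { .str_ptr = 0, .str_len = 4 };
	struct toy_attr out;
	int str_off, attr_off, ret;

	str_off = blob_add(&b, "GPL", 4);
	attr_off = blob_add(&b, &attr, sizeof(attr));
	if (b.error) {
		free(b.start);
		return b.error;
	}

	blob_rel_store(&b, attr_off + (int)offsetof(struct toy_attr, str_ptr),
		       str_off);

	memcpy(&out, b.start + attr_off, sizeof(out));
	ret = strcmp((const char *)(uintptr_t)out.str_ptr, "GPL");
	free(b.start);
	return ret;
}
```

Offsets stay valid across realloc() while raw pointers do not, which is why
the sketch defers the pointer store until after the last blob_add().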

kconfig, typeless ksym, struct_ops and CO-RE are not supported yet.

The order of relocate_data and relocate_calls had to change, so that
bpf_gen__prog_load() can see all relocations for a given program with
correct insn_idx-es.
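
One way to picture the index problem is sketched below. It is an illustration
with hypothetical names (struct toy_relo, toy_append_relos()), not code from
this patch, and it assumes that a subprogram's instructions end up appended
to the main program at some instruction offset. A relocation recorded against
the subprogram alone then has to be shifted by that offset before its
insn_idx is meaningful for the final program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical relocation record, modelled on struct relo_desc. */
struct toy_relo {
	const char *name;
	int insn_idx;
};

/* Append a subprogram's relocations to the main program's list, shifting
 * each insn_idx by the position where the subprogram's instructions were
 * placed inside the main program. dst must have room for
 * dst_cnt + src_cnt entries. Returns the new number of entries in dst.
 */
static size_t toy_append_relos(struct toy_relo *dst, size_t dst_cnt,
			       const struct toy_relo *src, size_t src_cnt,
			       int sub_insn_off)
{
	size_t i;

	for (i = 0; i < src_cnt; i++) {
		dst[dst_cnt + i] = src[i];
		dst[dst_cnt + i].insn_idx += sub_insn_off;
	}
	return dst_cnt + src_cnt;
}

/* Returns 0 when the subprogram relocation lands at the shifted index. */
static int toy_relo_demo(void)
{
	struct toy_relo relos[2] = { { "main_ksym", 2 } };
	const struct toy_relo sub[1] = { { "sub_ksym", 1 } };
	size_t cnt;

	/* the subprogram is assumed to start at insn 10 of the final program */
	cnt = toy_append_relos(relos, 1, sub, 1, 10);
	if (cnt != 2 || relos[0].insn_idx != 2 || relos[1].insn_idx != 11)
		return -1;
	return 0;
}
```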

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/lib/bpf/Build              |   2 +-
 tools/lib/bpf/bpf_gen_internal.h |  40 ++
 tools/lib/bpf/gen_loader.c       | 615 +++++++++++++++++++++++++++++++
 tools/lib/bpf/libbpf.c           | 204 ++++++++--
 tools/lib/bpf/libbpf.h           |  12 +
 tools/lib/bpf/libbpf.map         |   1 +
 tools/lib/bpf/libbpf_internal.h  |   2 +
 tools/lib/bpf/skel_internal.h    | 105 ++++++
 8 files changed, 948 insertions(+), 33 deletions(-)
 create mode 100644 tools/lib/bpf/bpf_gen_internal.h
 create mode 100644 tools/lib/bpf/gen_loader.c
 create mode 100644 tools/lib/bpf/skel_internal.h

diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build
index 9b057cc7650a..430f6874fa41 100644
--- a/tools/lib/bpf/Build
+++ b/tools/lib/bpf/Build
@@ -1,3 +1,3 @@
 libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \
 	    netlink.o bpf_prog_linfo.o libbpf_probes.o xsk.o hashmap.o \
-	    btf_dump.o ringbuf.o strset.o linker.o
+	    btf_dump.o ringbuf.o strset.o linker.o gen_loader.o
diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h
new file mode 100644
index 000000000000..dc3e2cbf9ce3
--- /dev/null
+++ b/tools/lib/bpf/bpf_gen_internal.h
@@ -0,0 +1,40 @@
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+/* Copyright (c) 2021 Facebook */
+#ifndef __BPF_GEN_INTERNAL_H
+#define __BPF_GEN_INTERNAL_H
+
+struct relo_desc {
+	const char *name;
+	int kind;
+	int insn_idx;
+};
+
+struct bpf_gen {
+	struct gen_loader_opts *opts;
+	void *data_start;
+	void *data_cur;
+	void *insn_start;
+	void *insn_cur;
+	__u32 nr_progs;
+	__u32 nr_maps;
+	int log_level;
+	int error;
+	struct relo_desc *relos;
+	int relo_cnt;
+	char attach_target[128];
+	int attach_kind;
+};
+
+void bpf_gen__init(struct bpf_gen *gen, int log_level);
+int bpf_gen__finish(struct bpf_gen *gen);
+void bpf_gen__free(struct bpf_gen *gen);
+void bpf_gen__load_btf(struct bpf_gen *gen, const void *raw_data, __u32 raw_size);
+void bpf_gen__map_create(struct bpf_gen *gen, struct bpf_create_map_attr *map_attr, int map_idx);
+struct bpf_prog_load_params;
+void bpf_gen__prog_load(struct bpf_gen *gen, struct bpf_prog_load_params *load_attr, int prog_idx);
+void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *value, __u32 value_size);
+void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx);
+void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *name, enum bpf_attach_type type);
+void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, int kind, int insn_idx);
+
+#endif
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
new file mode 100644
index 000000000000..3765ff302eb6
--- /dev/null
+++ b/tools/lib/bpf/gen_loader.c
@@ -0,0 +1,615 @@
+// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+/* Copyright (c) 2021 Facebook */
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <errno.h>
+#include <linux/filter.h>
+#include "btf.h"
+#include "bpf.h"
+#include "libbpf.h"
+#include "libbpf_internal.h"
+#include "hashmap.h"
+#include "bpf_gen_internal.h"
+#include "skel_internal.h"
+
+#define MAX_USED_MAPS 64
+#define MAX_USED_PROGS 32
+
+/* The following structure describes the stack layout of the loader program.
+ * In addition R6 contains the pointer to context.
+ * R7 contains the result of the last sys_bpf command (typically error or FD).
+ * R9 contains the result of the last sys_close command.
+ *
+ * Naming convention:
+ * ctx - bpf program context
+ * stack - bpf program stack
+ * blob - bpf_attr-s, strings, insns, map data.
+ *        All the bytes that loader prog will use for read/write.
+ */
+struct loader_stack {
+	__u32 btf_fd;
+	__u32 map_fd[MAX_USED_MAPS];
+	__u32 prog_fd[MAX_USED_PROGS];
+	__u32 inner_map_fd;
+};
+#define stack_off(field) (__s16)(-sizeof(struct loader_stack) + offsetof(struct loader_stack, field))
+
+static int bpf_gen__realloc_insn_buf(struct bpf_gen *gen, __u32 size)
+{
+	size_t off = gen->insn_cur - gen->insn_start;
+
+	if (gen->error)
+		return gen->error;
+	if (size > INT32_MAX || off + size > INT32_MAX) {
+		gen->error = -ERANGE;
+		return -ERANGE;
+	}
+	gen->insn_start = realloc(gen->insn_start, off + size);
+	if (!gen->insn_start) {
+		gen->error = -ENOMEM;
+		return -ENOMEM;
+	}
+	gen->insn_cur = gen->insn_start + off;
+	return 0;
+}
+
+static int bpf_gen__realloc_data_buf(struct bpf_gen *gen, __u32 size)
+{
+	size_t off = gen->data_cur - gen->data_start;
+
+	if (gen->error)
+		return gen->error;
+	if (size > INT32_MAX || off + size > INT32_MAX) {
+		gen->error = -ERANGE;
+		return -ERANGE;
+	}
+	gen->data_start = realloc(gen->data_start, off + size);
+	if (!gen->data_start) {
+		gen->error = -ENOMEM;
+		return -ENOMEM;
+	}
+	gen->data_cur = gen->data_start + off;
+	return 0;
+}
+
+static void bpf_gen__emit(struct bpf_gen *gen, struct bpf_insn insn)
+{
+	if (bpf_gen__realloc_insn_buf(gen, sizeof(insn)))
+		return;
+	memcpy(gen->insn_cur, &insn, sizeof(insn));
+	gen->insn_cur += sizeof(insn);
+}
+
+static void bpf_gen__emit2(struct bpf_gen *gen, struct bpf_insn insn1, struct bpf_insn insn2)
+{
+	bpf_gen__emit(gen, insn1);
+	bpf_gen__emit(gen, insn2);
+}
+
+void bpf_gen__init(struct bpf_gen *gen, int log_level)
+{
+	gen->log_level = log_level;
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_6, BPF_REG_1));
+}
+
+static int bpf_gen__add_data(struct bpf_gen *gen, const void *data, __u32 size)
+{
+	void *prev;
+
+	if (bpf_gen__realloc_data_buf(gen, size))
+		return 0;
+	prev = gen->data_cur;
+	memcpy(gen->data_cur, data, size);
+	gen->data_cur += size;
+	return prev - gen->data_start;
+}
+
+static int insn_bytes_to_bpf_size(__u32 sz)
+{
+	switch (sz) {
+	case 8: return BPF_DW;
+	case 4: return BPF_W;
+	case 2: return BPF_H;
+	case 1: return BPF_B;
+	default: return -1;
+	}
+}
+
+/* *(u64 *)(blob + off) = (u64)(void *)(blob + data) */
+static void bpf_gen__emit_rel_store(struct bpf_gen *gen, int off, int data)
+{
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, data));
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, off));
+	bpf_gen__emit(gen, BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, 0));
+}
+
+/* *(u64 *)(blob + off) = (u64)(void *)(%sp + stack_off) */
+static void bpf_gen__emit_rel_store_sp(struct bpf_gen *gen, int off, int stack_off)
+{
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_0, BPF_REG_10));
+	bpf_gen__emit(gen, BPF_ALU64_IMM(BPF_ADD, BPF_REG_0, stack_off));
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, off));
+	bpf_gen__emit(gen, BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_0, 0));
+}
+
+static void bpf_gen__move_ctx2blob(struct bpf_gen *gen, int off, int size, int ctx_off)
+{
+	bpf_gen__emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_6, ctx_off));
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, off));
+	bpf_gen__emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_1, BPF_REG_0, 0));
+}
+
+static void bpf_gen__move_stack2blob(struct bpf_gen *gen, int off, int size, int stack_off)
+{
+	bpf_gen__emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_10, stack_off));
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, off));
+	bpf_gen__emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_1, BPF_REG_0, 0));
+}
+
+static void bpf_gen__move_stack2ctx(struct bpf_gen *gen, int ctx_off, int size, int stack_off)
+{
+	bpf_gen__emit(gen, BPF_LDX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_0, BPF_REG_10, stack_off));
+	bpf_gen__emit(gen, BPF_STX_MEM(insn_bytes_to_bpf_size(size), BPF_REG_6, BPF_REG_0, ctx_off));
+}
+
+static void bpf_gen__emit_sys_bpf(struct bpf_gen *gen, int cmd, int attr, int attr_size)
+{
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_1, cmd));
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_2, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, attr));
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_3, attr_size));
+	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_bpf));
+	/* remember the result in R7 */
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0));
+}
+
+static void bpf_gen__emit_check_err(struct bpf_gen *gen)
+{
+	bpf_gen__emit(gen, BPF_JMP_IMM(BPF_JSGE, BPF_REG_7, 0, 2));
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_0, BPF_REG_7));
+	bpf_gen__emit(gen, BPF_EXIT_INSN());
+}
+
+static void __bpf_gen__debug(struct bpf_gen *gen, int reg1, int reg2, const char *fmt, va_list args)
+{
+	char buf[1024];
+	int addr, len, ret;
+
+	if (!gen->log_level)
+		return;
+	ret = vsnprintf(buf, sizeof(buf), fmt, args);
+	if (ret < 1024 - 7 && reg1 >= 0 && reg2 < 0)
+		/* The special case to accommodate common bpf_gen__debug_ret():
+		 * to avoid specifying BPF_REG_7 and adding " r=%%d" to prints explicitly.
+		 */
+		strcat(buf, " r=%d");
+	len = strlen(buf) + 1;
+	addr = bpf_gen__add_data(gen, buf, len);
+
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, addr));
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_2, len));
+	if (reg1 >= 0)
+		bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_3, reg1));
+	if (reg2 >= 0)
+		bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_4, reg2));
+	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_trace_printk));
+}
+
+static void bpf_gen__debug_regs(struct bpf_gen *gen, int reg1, int reg2, const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	__bpf_gen__debug(gen, reg1, reg2, fmt, args);
+	va_end(args);
+}
+
+static void bpf_gen__debug_ret(struct bpf_gen *gen, const char *fmt, ...)
+{
+	va_list args;
+
+	va_start(args, fmt);
+	__bpf_gen__debug(gen, BPF_REG_7, -1, fmt, args);
+	va_end(args);
+}
+
+static void __bpf_gen__emit_sys_close(struct bpf_gen *gen)
+{
+	bpf_gen__emit(gen, BPF_JMP_IMM(BPF_JSLE, BPF_REG_1, 0,
+				       /* 2 is the number of the following insns
+					* 6 is additional insns in debug_regs
+					*/
+				       2 + (gen->log_level ? 6 : 0)));
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_9, BPF_REG_1));
+	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_close));
+	bpf_gen__debug_regs(gen, BPF_REG_9, BPF_REG_0, "close(%%d) = %%d");
+}
+
+static void bpf_gen__emit_sys_close_stack(struct bpf_gen *gen, int stack_off)
+{
+	bpf_gen__emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_10, stack_off));
+	__bpf_gen__emit_sys_close(gen);
+}
+
+static void bpf_gen__emit_sys_close_blob(struct bpf_gen *gen, int blob_off)
+{
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE,
+						  0, 0, 0, blob_off));
+	bpf_gen__emit(gen, BPF_LDX_MEM(BPF_W, BPF_REG_1, BPF_REG_0, 0));
+	__bpf_gen__emit_sys_close(gen);
+}
+
+int bpf_gen__finish(struct bpf_gen *gen)
+{
+	int i;
+
+	bpf_gen__emit_sys_close_stack(gen, stack_off(btf_fd));
+	for (i = 0; i < gen->nr_progs; i++)
+		bpf_gen__move_stack2ctx(gen,
+					sizeof(struct bpf_loader_ctx) +
+					sizeof(struct bpf_map_desc) * gen->nr_maps +
+					sizeof(struct bpf_prog_desc) * i +
+					offsetof(struct bpf_prog_desc, prog_fd), 4,
+					stack_off(prog_fd[i]));
+	for (i = 0; i < gen->nr_maps; i++)
+		bpf_gen__move_stack2ctx(gen,
+					sizeof(struct bpf_loader_ctx) +
+					sizeof(struct bpf_map_desc) * i +
+					offsetof(struct bpf_map_desc, map_fd), 4,
+					stack_off(map_fd[i]));
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_0, 0));
+	bpf_gen__emit(gen, BPF_EXIT_INSN());
+	pr_debug("bpf_gen__finish %d\n", gen->error);
+	if (!gen->error) {
+		struct gen_loader_opts *opts = gen->opts;
+
+		opts->insns = gen->insn_start;
+		opts->insns_sz = gen->insn_cur - gen->insn_start;
+		opts->data = gen->data_start;
+		opts->data_sz = gen->data_cur - gen->data_start;
+	}
+	return gen->error;
+}
+
+void bpf_gen__free(struct bpf_gen *gen)
+{
+	if (!gen)
+		return;
+	free(gen->data_start);
+	free(gen->insn_start);
+	gen->data_start = NULL;
+	gen->insn_start = NULL;
+}
+
+void bpf_gen__load_btf(struct bpf_gen *gen, const void *btf_raw_data, __u32 btf_raw_size)
+{
+	union bpf_attr attr = {};
+	int attr_size = offsetofend(union bpf_attr, btf_log_level);
+	int btf_data, btf_load_attr;
+
+	pr_debug("btf_load: size %d\n", btf_raw_size);
+	btf_data = bpf_gen__add_data(gen, btf_raw_data, btf_raw_size);
+
+	attr.btf_size = btf_raw_size;
+	btf_load_attr = bpf_gen__add_data(gen, &attr, attr_size);
+
+	/* populate union bpf_attr with user provided log details */
+	bpf_gen__move_ctx2blob(gen, btf_load_attr + offsetof(union bpf_attr, btf_log_level), 4,
+			       offsetof(struct bpf_loader_ctx, log_level));
+	bpf_gen__move_ctx2blob(gen, btf_load_attr + offsetof(union bpf_attr, btf_log_size), 4,
+			       offsetof(struct bpf_loader_ctx, log_size));
+	bpf_gen__move_ctx2blob(gen, btf_load_attr + offsetof(union bpf_attr, btf_log_buf), 8,
+			       offsetof(struct bpf_loader_ctx, log_buf));
+	/* populate union bpf_attr with a pointer to the BTF data */
+	bpf_gen__emit_rel_store(gen, btf_load_attr + offsetof(union bpf_attr, btf), btf_data);
+	/* emit BTF_LOAD command */
+	bpf_gen__emit_sys_bpf(gen, BPF_BTF_LOAD, btf_load_attr, attr_size);
+	bpf_gen__debug_ret(gen, "btf_load size %d", btf_raw_size);
+	bpf_gen__emit_check_err(gen);
+	/* remember btf_fd in the stack, if successful */
+	bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(btf_fd)));
+}
+
+void bpf_gen__map_create(struct bpf_gen *gen, struct bpf_create_map_attr *map_attr, int map_idx)
+{
+	union bpf_attr attr = {};
+	int attr_size = offsetofend(union bpf_attr, btf_vmlinux_value_type_id);
+	bool close_inner_map_fd = false;
+	int map_create_attr;
+
+	attr.map_type = map_attr->map_type;
+	attr.key_size = map_attr->key_size;
+	attr.value_size = map_attr->value_size;
+	attr.map_flags = map_attr->map_flags;
+	memcpy(attr.map_name, map_attr->name,
+	       min((unsigned)strlen(map_attr->name), BPF_OBJ_NAME_LEN - 1));
+	attr.numa_node = map_attr->numa_node;
+	attr.map_ifindex = map_attr->map_ifindex;
+	attr.max_entries = map_attr->max_entries;
+	switch (attr.map_type) {
+	case BPF_MAP_TYPE_PERF_EVENT_ARRAY:
+	case BPF_MAP_TYPE_CGROUP_ARRAY:
+	case BPF_MAP_TYPE_STACK_TRACE:
+	case BPF_MAP_TYPE_ARRAY_OF_MAPS:
+	case BPF_MAP_TYPE_HASH_OF_MAPS:
+	case BPF_MAP_TYPE_DEVMAP:
+	case BPF_MAP_TYPE_DEVMAP_HASH:
+	case BPF_MAP_TYPE_CPUMAP:
+	case BPF_MAP_TYPE_XSKMAP:
+	case BPF_MAP_TYPE_SOCKMAP:
+	case BPF_MAP_TYPE_SOCKHASH:
+	case BPF_MAP_TYPE_QUEUE:
+	case BPF_MAP_TYPE_STACK:
+	case BPF_MAP_TYPE_RINGBUF:
+		break;
+	default:
+		attr.btf_key_type_id = map_attr->btf_key_type_id;
+		attr.btf_value_type_id = map_attr->btf_value_type_id;
+	}
+
+	pr_debug("map_create: %s idx %d type %d value_type_id %d\n",
+		 attr.map_name, map_idx, map_attr->map_type, attr.btf_value_type_id);
+
+	map_create_attr = bpf_gen__add_data(gen, &attr, attr_size);
+	if (attr.btf_value_type_id)
+		/* populate union bpf_attr with btf_fd saved in the stack earlier */
+		bpf_gen__move_stack2blob(gen, map_create_attr + offsetof(union bpf_attr, btf_fd), 4,
+					 stack_off(btf_fd));
+	switch (attr.map_type) {
+	case BPF_MAP_TYPE_ARRAY_OF_MAPS:
+	case BPF_MAP_TYPE_HASH_OF_MAPS:
+		bpf_gen__move_stack2blob(gen, map_create_attr + offsetof(union bpf_attr, inner_map_fd),
+					 4, stack_off(inner_map_fd));
+		close_inner_map_fd = true;
+		break;
+	default:;
+	}
+	/* emit MAP_CREATE command */
+	bpf_gen__emit_sys_bpf(gen, BPF_MAP_CREATE, map_create_attr, attr_size);
+	bpf_gen__debug_ret(gen, "map_create %s idx %d type %d value_size %d",
+			   attr.map_name, map_idx, map_attr->map_type, attr.value_size);
+	bpf_gen__emit_check_err(gen);
+	/* remember map_fd in the stack, if successful */
+	if (map_idx < 0) {
+		/* This bpf_gen__map_create() function is called with map_idx >= 0 for all maps
+		 * that libbpf loading logic tracks.
+		 * It's called with -1 to create an inner map.
+		 */
+		bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(inner_map_fd)));
+	} else {
+		if (map_idx != gen->nr_maps) {
+			gen->error = -EDOM; /* internal bug */
+			return;
+		}
+		bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(map_fd[map_idx])));
+		gen->nr_maps++;
+	}
+	if (close_inner_map_fd)
+		bpf_gen__emit_sys_close_stack(gen, stack_off(inner_map_fd));
+}
+
+void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *attach_name,
+				   enum bpf_attach_type type)
+{
+	const char *prefix;
+	int kind, ret;
+
+	btf_get_kernel_prefix_kind(type, &prefix, &kind);
+	gen->attach_kind = kind;
+	ret = snprintf(gen->attach_target, sizeof(gen->attach_target), "%s%s", prefix, attach_name);
+	if (ret < 0 || (size_t)ret >= sizeof(gen->attach_target))
+		gen->error = -ENOSPC;
+}
+
+static void bpf_gen__emit_find_attach_target(struct bpf_gen *gen)
+{
+	int name;
+
+	pr_debug("find_btf_id %s %d\n", gen->attach_target, gen->attach_kind);
+	name = bpf_gen__add_data(gen, gen->attach_target, strlen(gen->attach_target) + 1);
+
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_1, 0));
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_2, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, name));
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_3, gen->attach_kind));
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_4, 0));
+	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_btf_find_by_name_kind));
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0));
+	bpf_gen__debug_ret(gen, "find_by_name_kind(%s,%d)", gen->attach_target, gen->attach_kind);
+	bpf_gen__emit_check_err(gen);
+	/* if successful, btf_id is in lower 32-bit of R7 and btf_obj_fd is in upper 32-bit */
+}
+
+void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, int kind, int insn_idx)
+{
+	struct relo_desc *relo;
+
+	relo = libbpf_reallocarray(gen->relos, gen->relo_cnt + 1, sizeof(*relo));
+	if (!relo) {
+		gen->error = -ENOMEM;
+		return;
+	}
+	gen->relos = relo;
+	relo += gen->relo_cnt;
+	relo->name = name;
+	relo->kind = kind;
+	relo->insn_idx = insn_idx;
+	gen->relo_cnt++;
+}
+
+static void bpf_gen__emit_relo(struct bpf_gen *gen, struct relo_desc *relo, int insns)
+{
+	int name, insn;
+
+	pr_debug("relo: %s at %d\n", relo->name, relo->insn_idx);
+	name = bpf_gen__add_data(gen, relo->name, strlen(relo->name) + 1);
+
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_1, 0));
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_2, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, name));
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_3, relo->kind));
+	bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_4, 0));
+	bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_btf_find_by_name_kind));
+	bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0));
+	bpf_gen__debug_ret(gen, "find_by_name_kind(%s,%d)", relo->name, relo->kind);
+	bpf_gen__emit_check_err(gen);
+	/* store btf_id into insn[insn_idx].imm */
+	insn = insns + sizeof(struct bpf_insn) * relo->insn_idx + offsetof(struct bpf_insn, imm);
+	bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, insn));
+	bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, 0));
+	if (relo->kind == BTF_KIND_VAR) {
+		/* store btf_obj_fd into insn[insn_idx + 1].imm */
+		bpf_gen__emit(gen, BPF_ALU64_IMM(BPF_RSH, BPF_REG_7, 32));
+		bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_7, sizeof(struct bpf_insn)));
+	}
+}
+
+static void bpf_gen__emit_relos(struct bpf_gen *gen, int insns)
+{
+	int i;
+
+	for (i = 0; i < gen->relo_cnt; i++)
+		bpf_gen__emit_relo(gen, gen->relos + i, insns);
+}
+
+static void bpf_gen__cleanup_relos(struct bpf_gen *gen, int insns)
+{
+	int i, insn;
+
+	for (i = 0; i < gen->relo_cnt; i++) {
+		if (gen->relos[i].kind != BTF_KIND_VAR)
+			continue;
+		/* close fd recorded in insn[insn_idx + 1].imm */
+		insn = insns + sizeof(struct bpf_insn) * (gen->relos[i].insn_idx + 1)
+			+ offsetof(struct bpf_insn, imm);
+		bpf_gen__emit_sys_close_blob(gen, insn);
+	}
+	if (gen->relo_cnt) {
+		free(gen->relos);
+		gen->relo_cnt = 0;
+		gen->relos = NULL;
+	}
+}
+
+void bpf_gen__prog_load(struct bpf_gen *gen, struct bpf_prog_load_params *load_attr, int prog_idx)
+{
+	union bpf_attr attr = {};
+	int attr_size = offsetofend(union bpf_attr, fd_array);
+	int prog_load_attr, license, insns, func_info, line_info;
+
+	pr_debug("prog_load: type %d insns_cnt %zd\n",
+		 load_attr->prog_type, load_attr->insn_cnt);
+	/* add license string to blob of bytes */
+	license = bpf_gen__add_data(gen, load_attr->license, strlen(load_attr->license) + 1);
+	/* add insns to blob of bytes */
+	insns = bpf_gen__add_data(gen, load_attr->insns,
+				  load_attr->insn_cnt * sizeof(struct bpf_insn));
+
+	attr.prog_type = load_attr->prog_type;
+	attr.expected_attach_type = load_attr->expected_attach_type;
+	attr.attach_btf_id = load_attr->attach_btf_id;
+	attr.prog_ifindex = load_attr->prog_ifindex;
+	attr.kern_version = 0;
+	attr.insn_cnt = (__u32)load_attr->insn_cnt;
+	attr.prog_flags = load_attr->prog_flags;
+
+	attr.func_info_rec_size = load_attr->func_info_rec_size;
+	attr.func_info_cnt = load_attr->func_info_cnt;
+	func_info = bpf_gen__add_data(gen, load_attr->func_info,
+				      attr.func_info_cnt * attr.func_info_rec_size);
+
+	attr.line_info_rec_size = load_attr->line_info_rec_size;
+	attr.line_info_cnt = load_attr->line_info_cnt;
+	line_info = bpf_gen__add_data(gen, load_attr->line_info,
+				      attr.line_info_cnt * attr.line_info_rec_size);
+
+	memcpy(attr.prog_name, load_attr->name,
+	       min((unsigned)strlen(load_attr->name), BPF_OBJ_NAME_LEN - 1));
+	prog_load_attr = bpf_gen__add_data(gen, &attr, attr_size);
+
+	/* populate union bpf_attr with a pointer to license */
+	bpf_gen__emit_rel_store(gen, prog_load_attr + offsetof(union bpf_attr, license), license);
+
+	/* populate union bpf_attr with a pointer to instructions */
+	bpf_gen__emit_rel_store(gen, prog_load_attr + offsetof(union bpf_attr, insns), insns);
+
+	/* populate union bpf_attr with a pointer to func_info */
+	bpf_gen__emit_rel_store(gen, prog_load_attr + offsetof(union bpf_attr, func_info), func_info);
+
+	/* populate union bpf_attr with a pointer to line_info */
+	bpf_gen__emit_rel_store(gen, prog_load_attr + offsetof(union bpf_attr, line_info), line_info);
+
+	/* populate union bpf_attr fd_array with a pointer to stack where map_fds are saved */
+	bpf_gen__emit_rel_store_sp(gen, prog_load_attr + offsetof(union bpf_attr, fd_array),
+				   stack_off(map_fd[0]));
+
+	/* populate union bpf_attr with user provided log details */
+	bpf_gen__move_ctx2blob(gen, prog_load_attr + offsetof(union bpf_attr, log_level), 4,
+			       offsetof(struct bpf_loader_ctx, log_level));
+	bpf_gen__move_ctx2blob(gen, prog_load_attr + offsetof(union bpf_attr, log_size), 4,
+			       offsetof(struct bpf_loader_ctx, log_size));
+	bpf_gen__move_ctx2blob(gen, prog_load_attr + offsetof(union bpf_attr, log_buf), 8,
+			       offsetof(struct bpf_loader_ctx, log_buf));
+	/* populate union bpf_attr with btf_fd saved in the stack earlier */
+	bpf_gen__move_stack2blob(gen, prog_load_attr + offsetof(union bpf_attr, prog_btf_fd), 4,
+				 stack_off(btf_fd));
+	if (gen->attach_kind) {
+		bpf_gen__emit_find_attach_target(gen);
+		/* populate union bpf_attr with btf_id and btf_obj_fd found by helper */
+		bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_0, BPF_PSEUDO_MAP_IDX_VALUE,
+							  0, 0, 0, prog_load_attr));
+		bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_7,
+					       offsetof(union bpf_attr, attach_btf_id)));
+		bpf_gen__emit(gen, BPF_ALU64_IMM(BPF_RSH, BPF_REG_7, 32));
+		bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_0, BPF_REG_7,
+					       offsetof(union bpf_attr, attach_btf_obj_fd)));
+	}
+	bpf_gen__emit_relos(gen, insns);
+	/* emit PROG_LOAD command */
+	bpf_gen__emit_sys_bpf(gen, BPF_PROG_LOAD, prog_load_attr, attr_size);
+	bpf_gen__debug_ret(gen, "prog_load %s insn_cnt %d", attr.prog_name, attr.insn_cnt);
+	/* successful or not, close btf module FDs used in extern ksyms and attach_btf_obj_fd */
+	bpf_gen__cleanup_relos(gen, insns);
+	if (gen->attach_kind)
+		bpf_gen__emit_sys_close_blob(gen,
+					     prog_load_attr + offsetof(union bpf_attr, attach_btf_obj_fd));
+	bpf_gen__emit_check_err(gen);
+	/* remember prog_fd in the stack, if successful */
+	bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(prog_fd[gen->nr_progs])));
+	gen->nr_progs++;
+}
+
+void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue, __u32 value_size)
+{
+	union bpf_attr attr = {};
+	int attr_size = offsetofend(union bpf_attr, flags);
+	int map_update_attr, value, key;
+	int zero = 0;
+
+	pr_debug("map_update_elem: idx %d\n", map_idx);
+	value = bpf_gen__add_data(gen, pvalue, value_size);
+	key = bpf_gen__add_data(gen, &zero, sizeof(zero));
+	map_update_attr = bpf_gen__add_data(gen, &attr, attr_size);
+	bpf_gen__move_stack2blob(gen, map_update_attr + offsetof(union bpf_attr, map_fd), 4,
+				 stack_off(map_fd[map_idx]));
+	bpf_gen__emit_rel_store(gen, map_update_attr + offsetof(union bpf_attr, key), key);
+	bpf_gen__emit_rel_store(gen, map_update_attr + offsetof(union bpf_attr, value), value);
+	/* emit MAP_UPDATE_ELEM command */
+	bpf_gen__emit_sys_bpf(gen, BPF_MAP_UPDATE_ELEM, map_update_attr, attr_size);
+	bpf_gen__debug_ret(gen, "update_elem idx %d value_size %d", map_idx, value_size);
+	bpf_gen__emit_check_err(gen);
+}
+
+void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx)
+{
+	union bpf_attr attr = {};
+	int attr_size = offsetofend(union bpf_attr, map_fd);
+	int map_freeze_attr;
+
+	pr_debug("map_freeze: idx %d\n", map_idx);
+	map_freeze_attr = bpf_gen__add_data(gen, &attr, attr_size);
+	bpf_gen__move_stack2blob(gen, map_freeze_attr + offsetof(union bpf_attr, map_fd), 4,
+				 stack_off(map_fd[map_idx]));
+	/* emit MAP_FREEZE command */
+	bpf_gen__emit_sys_bpf(gen, BPF_MAP_FREEZE, map_freeze_attr, attr_size);
+	bpf_gen__debug_ret(gen, "map_freeze");
+	bpf_gen__emit_check_err(gen);
+}
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 7f58c389ff67..c8864d2d6f12 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -54,6 +54,7 @@
 #include "str_error.h"
 #include "libbpf_internal.h"
 #include "hashmap.h"
+#include "bpf_gen_internal.h"
 
 #ifndef BPF_FS_MAGIC
 #define BPF_FS_MAGIC		0xcafe4a11
@@ -435,6 +436,8 @@ struct bpf_object {
 	bool loaded;
 	bool has_subcalls;
 
+	struct bpf_gen *gen_loader;
+
 	/*
 	 * Information when doing elf related work. Only valid if fd
 	 * is valid.
@@ -2651,7 +2654,15 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj)
 		bpf_object__sanitize_btf(obj, kern_btf);
 	}
 
-	err = btf__load(kern_btf);
+	if (obj->gen_loader) {
+		__u32 raw_size = 0;
+		const void *raw_data = btf__get_raw_data(kern_btf, &raw_size);
+
+		bpf_gen__load_btf(obj->gen_loader, raw_data, raw_size);
+		btf__set_fd(kern_btf, 0);
+	} else {
+		err = btf__load(kern_btf);
+	}
 	if (sanitize) {
 		if (!err) {
 			/* move fd to libbpf's BTF */
@@ -4262,6 +4273,12 @@ static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id f
 	struct kern_feature_desc *feat = &feature_probes[feat_id];
 	int ret;
 
+	if (obj->gen_loader)
+		/* To generate loader program assume the latest kernel
+		 * to avoid doing extra prog_load, map_create syscalls.
+		 */
+		return true;
+
 	if (READ_ONCE(feat->res) == FEAT_UNKNOWN) {
 		ret = feat->probe();
 		if (ret > 0) {
@@ -4344,6 +4361,13 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
 	char *cp, errmsg[STRERR_BUFSIZE];
 	int err, zero = 0;
 
+	if (obj->gen_loader) {
+		bpf_gen__map_update_elem(obj->gen_loader, map - obj->maps,
+					 map->mmaped, map->def.value_size);
+		if (map_type == LIBBPF_MAP_RODATA || map_type == LIBBPF_MAP_KCONFIG)
+			bpf_gen__map_freeze(obj->gen_loader, map - obj->maps);
+		return 0;
+	}
 	err = bpf_map_update_elem(map->fd, &zero, map->mmaped, 0);
 	if (err) {
 		err = -errno;
@@ -4369,7 +4393,7 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
 
 static void bpf_map__destroy(struct bpf_map *map);
 
-static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map)
+static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, bool is_inner)
 {
 	struct bpf_create_map_attr create_attr;
 	struct bpf_map_def *def = &map->def;
@@ -4415,9 +4439,9 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map)
 
 	if (bpf_map_type__is_map_in_map(def->type)) {
 		if (map->inner_map) {
-			int err;
+			int err = 0;
 
-			err = bpf_object__create_map(obj, map->inner_map);
+			err = bpf_object__create_map(obj, map->inner_map, true);
 			if (err) {
 				pr_warn("map '%s': failed to create inner map: %d\n",
 					map->name, err);
@@ -4429,7 +4453,12 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map)
 			create_attr.inner_map_fd = map->inner_map_fd;
 	}
 
-	map->fd = bpf_create_map_xattr(&create_attr);
+	if (obj->gen_loader) {
+		bpf_gen__map_create(obj->gen_loader, &create_attr, is_inner ? -1 : map - obj->maps);
+		map->fd = 0;
+	} else {
+		map->fd = bpf_create_map_xattr(&create_attr);
+	}
 	if (map->fd < 0 && (create_attr.btf_key_type_id ||
 			    create_attr.btf_value_type_id)) {
 		char *cp, errmsg[STRERR_BUFSIZE];
@@ -4457,11 +4486,11 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map)
 	return 0;
 }
 
-static int init_map_slots(struct bpf_map *map)
+static int init_map_slots(struct bpf_object *obj, struct bpf_map *map)
 {
 	const struct bpf_map *targ_map;
 	unsigned int i;
-	int fd, err;
+	int fd, err = 0;
 
 	for (i = 0; i < map->init_slots_sz; i++) {
 		if (!map->init_slots[i])
@@ -4469,7 +4498,12 @@ static int init_map_slots(struct bpf_map *map)
 
 		targ_map = map->init_slots[i];
 		fd = bpf_map__fd(targ_map);
-		err = bpf_map_update_elem(map->fd, &i, &fd, 0);
+		if (obj->gen_loader) {
+			printf("// TODO map_update_elem: idx %ld key %d value==map_idx %ld\n",
+			       map - obj->maps, i, targ_map - obj->maps);
+		} else {
+			err = bpf_map_update_elem(map->fd, &i, &fd, 0);
+		}
 		if (err) {
 			err = -errno;
 			pr_warn("map '%s': failed to initialize slot [%d] to map '%s' fd=%d: %d\n",
@@ -4511,7 +4545,7 @@ bpf_object__create_maps(struct bpf_object *obj)
 			pr_debug("map '%s': skipping creation (preset fd=%d)\n",
 				 map->name, map->fd);
 		} else {
-			err = bpf_object__create_map(obj, map);
+			err = bpf_object__create_map(obj, map, false);
 			if (err)
 				goto err_out;
 
@@ -4527,7 +4561,7 @@ bpf_object__create_maps(struct bpf_object *obj)
 			}
 
 			if (map->init_slots_sz) {
-				err = init_map_slots(map);
+				err = init_map_slots(obj, map);
 				if (err < 0) {
 					zclose(map->fd);
 					goto err_out;
@@ -4937,6 +4971,9 @@ static int load_module_btfs(struct bpf_object *obj)
 	if (obj->btf_modules_loaded)
 		return 0;
 
+	if (obj->gen_loader)
+		return 0;
+
 	/* don't do this again, even if we find no module BTFs */
 	obj->btf_modules_loaded = true;
 
@@ -6082,6 +6119,11 @@ static int bpf_core_apply_relo(struct bpf_program *prog,
 	if (str_is_empty(spec_str))
 		return -EINVAL;
 
+	if (prog->obj->gen_loader) {
+		printf("// TODO core_relo: prog %ld insn[%d] %s %s kind %d\n",
+		       prog - prog->obj->programs, relo->insn_off / 8,
+		       local_name, spec_str, relo->kind);
+	}
 	err = bpf_core_parse_spec(local_btf, local_id, spec_str, relo->kind, &local_spec);
 	if (err) {
 		pr_warn("prog '%s': relo #%d: parsing [%d] %s %s + %s failed: %d\n",
@@ -6821,6 +6863,19 @@ bpf_object__relocate_calls(struct bpf_object *obj, struct bpf_program *prog)
 
 	return 0;
 }
+static void
+bpf_object__free_relocs(struct bpf_object *obj)
+{
+	struct bpf_program *prog;
+	int i;
+
+	/* free up relocation descriptors */
+	for (i = 0; i < obj->nr_programs; i++) {
+		prog = &obj->programs[i];
+		zfree(&prog->reloc_desc);
+		prog->nr_reloc = 0;
+	}
+}
 
 static int
 bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
@@ -6871,12 +6926,8 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
 			return err;
 		}
 	}
-	/* free up relocation descriptors */
-	for (i = 0; i < obj->nr_programs; i++) {
-		prog = &obj->programs[i];
-		zfree(&prog->reloc_desc);
-		prog->nr_reloc = 0;
-	}
+	if (!obj->gen_loader)
+		bpf_object__free_relocs(obj);
 	return 0;
 }
 
@@ -7065,6 +7116,9 @@ static int bpf_object__sanitize_prog(struct bpf_object* obj, struct bpf_program
 	enum bpf_func_id func_id;
 	int i;
 
+	if (obj->gen_loader)
+		return 0;
+
 	for (i = 0; i < prog->insns_cnt; i++, insn++) {
 		if (!insn_is_helper_call(insn, &func_id))
 			continue;
@@ -7150,6 +7204,12 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
 	load_attr.log_level = prog->log_level;
 	load_attr.prog_flags = prog->prog_flags;
 
+	if (prog->obj->gen_loader) {
+		bpf_gen__prog_load(prog->obj->gen_loader, &load_attr,
+				   prog - prog->obj->programs);
+		*pfd = 0;
+		return 0;
+	}
 retry_load:
 	if (log_buf_size) {
 		log_buf = malloc(log_buf_size);
@@ -7227,6 +7287,35 @@ load_program(struct bpf_program *prog, struct bpf_insn *insns, int insns_cnt,
 	return ret;
 }
 
+static int bpf_program__record_externs(struct bpf_program *prog)
+{
+	struct bpf_object *obj = prog->obj;
+	int i;
+
+	for (i = 0; i < prog->nr_reloc; i++) {
+		struct reloc_desc *relo = &prog->reloc_desc[i];
+		struct extern_desc *ext = &obj->externs[relo->sym_off];
+
+		switch (relo->type) {
+		case RELO_EXTERN_VAR:
+			if (ext->type != EXT_KSYM)
+				continue;
+			if (!ext->ksym.type_id) /* typeless ksym */
+				continue;
+			bpf_gen__record_extern(obj->gen_loader, ext->name, BTF_KIND_VAR,
+					       relo->insn_idx);
+			break;
+		case RELO_EXTERN_FUNC:
+			bpf_gen__record_extern(obj->gen_loader, ext->name, BTF_KIND_FUNC,
+					       relo->insn_idx);
+			break;
+		default:
+			continue;
+		}
+	}
+	return 0;
+}
+
 static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd, int *btf_type_id);
 
 int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver)
@@ -7272,6 +7361,8 @@ int bpf_program__load(struct bpf_program *prog, char *license, __u32 kern_ver)
 			pr_warn("prog '%s': inconsistent nr(%d) != 1\n",
 				prog->name, prog->instances.nr);
 		}
+		if (prog->obj->gen_loader)
+			bpf_program__record_externs(prog);
 		err = load_program(prog, prog->insns, prog->insns_cnt,
 				   license, kern_ver, &fd);
 		if (!err)
@@ -7363,6 +7454,8 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
 			return err;
 		}
 	}
+	if (obj->gen_loader)
+		bpf_object__free_relocs(obj);
 	free(fd_array);
 	return 0;
 }
@@ -7744,6 +7837,12 @@ static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj)
 		if (ext->type != EXT_KSYM || !ext->ksym.type_id)
 			continue;
 
+		if (obj->gen_loader) {
+			ext->is_set = true;
+			ext->ksym.kernel_btf_obj_fd = 0;
+			ext->ksym.kernel_btf_id = 0;
+			continue;
+		}
 		t = btf__type_by_id(obj->btf, ext->btf_id);
 		if (btf_is_var(t))
 			err = bpf_object__resolve_ksym_var_btf_id(obj, ext);
@@ -7858,6 +7957,9 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr)
 		return -EINVAL;
 	}
 
+	if (obj->gen_loader)
+		bpf_gen__init(obj->gen_loader, attr->log_level);
+
 	err = bpf_object__probe_loading(obj);
 	err = err ? : bpf_object__load_vmlinux_btf(obj, false);
 	err = err ? : bpf_object__resolve_externs(obj, obj->kconfig);
@@ -7868,6 +7970,9 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr)
 	err = err ? : bpf_object__relocate(obj, attr->target_btf_path);
 	err = err ? : bpf_object__load_progs(obj, attr->log_level);
 
+	if (obj->gen_loader && !err)
+		err = bpf_gen__finish(obj->gen_loader);
+
 	/* clean up module BTFs */
 	for (i = 0; i < obj->btf_module_cnt; i++) {
 		close(obj->btf_modules[i].fd);
@@ -8493,6 +8598,7 @@ void bpf_object__close(struct bpf_object *obj)
 	if (obj->clear_priv)
 		obj->clear_priv(obj, obj->priv);
 
+	bpf_gen__free(obj->gen_loader);
 	bpf_object__elf_finish(obj);
 	bpf_object__unload(obj);
 	btf__free(obj->btf);
@@ -8583,6 +8689,22 @@ void *bpf_object__priv(const struct bpf_object *obj)
 	return obj ? obj->priv : ERR_PTR(-EINVAL);
 }
 
+int bpf_object__gen_loader(struct bpf_object *obj, struct gen_loader_opts *opts)
+{
+	struct bpf_gen *gen;
+
+	if (!opts)
+		return -EFAULT;
+	if (!OPTS_VALID(opts, gen_loader_opts))
+		return -EINVAL;
+	gen = calloc(sizeof(*gen), 1);
+	if (!gen)
+		return -ENOMEM;
+	gen->opts = opts;
+	obj->gen_loader = gen;
+	return 0;
+}
+
 static struct bpf_program *
 __bpf_program__iter(const struct bpf_program *p, const struct bpf_object *obj,
 		    bool forward)
@@ -9219,6 +9341,28 @@ static int bpf_object__collect_st_ops_relos(struct bpf_object *obj,
 #define BTF_ITER_PREFIX "bpf_iter_"
 #define BTF_MAX_NAME_SIZE 128
 
+void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type,
+				const char **prefix, int *kind)
+{
+	switch (attach_type) {
+	case BPF_TRACE_RAW_TP:
+		*prefix = BTF_TRACE_PREFIX;
+		*kind = BTF_KIND_TYPEDEF;
+		break;
+	case BPF_LSM_MAC:
+		*prefix = BTF_LSM_PREFIX;
+		*kind = BTF_KIND_FUNC;
+		break;
+	case BPF_TRACE_ITER:
+		*prefix = BTF_ITER_PREFIX;
+		*kind = BTF_KIND_FUNC;
+		break;
+	default:
+		*prefix = "";
+		*kind = BTF_KIND_FUNC;
+	}
+}
+
 static int find_btf_by_prefix_kind(const struct btf *btf, const char *prefix,
 				   const char *name, __u32 kind)
 {
@@ -9239,21 +9383,11 @@ static int find_btf_by_prefix_kind(const struct btf *btf, const char *prefix,
 static inline int find_attach_btf_id(struct btf *btf, const char *name,
 				     enum bpf_attach_type attach_type)
 {
-	int err;
-
-	if (attach_type == BPF_TRACE_RAW_TP)
-		err = find_btf_by_prefix_kind(btf, BTF_TRACE_PREFIX, name,
-					      BTF_KIND_TYPEDEF);
-	else if (attach_type == BPF_LSM_MAC)
-		err = find_btf_by_prefix_kind(btf, BTF_LSM_PREFIX, name,
-					      BTF_KIND_FUNC);
-	else if (attach_type == BPF_TRACE_ITER)
-		err = find_btf_by_prefix_kind(btf, BTF_ITER_PREFIX, name,
-					      BTF_KIND_FUNC);
-	else
-		err = btf__find_by_name_kind(btf, name, BTF_KIND_FUNC);
+	const char *prefix;
+	int kind;
 
-	return err;
+	btf_get_kernel_prefix_kind(attach_type, &prefix, &kind);
+	return find_btf_by_prefix_kind(btf, prefix, name, kind);
 }
 
 int libbpf_find_vmlinux_btf_id(const char *name,
@@ -9352,7 +9486,7 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
 	__u32 attach_prog_fd = prog->attach_prog_fd;
 	const char *name = prog->sec_name, *attach_name;
 	const struct bpf_sec_def *sec = NULL;
-	int i, err;
+	int i, err = 0;
 
 	if (!name)
 		return -EINVAL;
@@ -9387,7 +9521,13 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
 	}
 
 	/* kernel/module BTF ID */
-	err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
+	if (prog->obj->gen_loader) {
+		bpf_gen__record_attach_target(prog->obj->gen_loader, attach_name, attach_type);
+		*btf_obj_fd = 0;
+		*btf_type_id = 1;
+	} else {
+		err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
+	}
 	if (err) {
 		pr_warn("failed to find kernel BTF type ID of '%s': %d\n", attach_name, err);
 		return err;
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index bec4e6a6e31d..fb291b4529e8 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -756,6 +756,18 @@ LIBBPF_API int bpf_object__attach_skeleton(struct bpf_object_skeleton *s);
 LIBBPF_API void bpf_object__detach_skeleton(struct bpf_object_skeleton *s);
 LIBBPF_API void bpf_object__destroy_skeleton(struct bpf_object_skeleton *s);
 
+struct gen_loader_opts {
+	size_t sz; /* size of this struct, for forward/backward compatibility */
+	const char *data;
+	const char *insns;
+	__u32 data_sz;
+	__u32 insns_sz;
+};
+
+#define gen_loader_opts__last_field insns_sz
+LIBBPF_API int bpf_object__gen_loader(struct bpf_object *obj,
+				      struct gen_loader_opts *opts);
+
 enum libbpf_tristate {
 	TRI_NO = 0,
 	TRI_YES = 1,
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index b9b29baf1df8..889ee2f3611c 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -360,5 +360,6 @@ LIBBPF_0.4.0 {
 		bpf_linker__free;
 		bpf_linker__new;
 		bpf_map__inner_map;
+		bpf_object__gen_loader;
 		bpf_object__set_kversion;
 } LIBBPF_0.3.0;
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 9114c7085f2a..fd5c57ac93f1 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -214,6 +214,8 @@ int bpf_object__section_size(const struct bpf_object *obj, const char *name,
 int bpf_object__variable_offset(const struct bpf_object *obj, const char *name,
 				__u32 *off);
 struct btf *btf_get_from_fd(int btf_fd, struct btf *base_btf);
+void btf_get_kernel_prefix_kind(enum bpf_attach_type attach_type,
+				const char **prefix, int *kind);
 
 struct btf_ext_info {
 	/*
diff --git a/tools/lib/bpf/skel_internal.h b/tools/lib/bpf/skel_internal.h
new file mode 100644
index 000000000000..1d05031f7856
--- /dev/null
+++ b/tools/lib/bpf/skel_internal.h
@@ -0,0 +1,105 @@
+/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
+/* Copyright (c) 2021 Facebook */
+#ifndef __SKEL_INTERNAL_H
+#define __SKEL_INTERNAL_H
+
+#include <unistd.h>
+#include <sys/syscall.h>
+#include <sys/mman.h>
+
+/* This file is a base header for auto-generated *.lskel.h files.
+ * Its contents will change and may become part of auto-generation in the future.
+ *
+ * The layout of bpf_[map|prog]_desc and bpf_loader_ctx is feature dependent
+ * and will change from one version of libbpf to another and features
+ * requested during loader program generation.
+ */
+struct bpf_map_desc {
+	__u32 map_fd;
+	__u32 max_entries;
+};
+struct bpf_prog_desc {
+	__u32 prog_fd;
+	__u32 attach_prog_fd;
+};
+
+struct bpf_loader_ctx {
+	size_t sz;
+	__u32 log_level;
+	__u32 log_size;
+	__u64 log_buf;
+};
+
+struct bpf_load_and_run_opts {
+	struct bpf_loader_ctx *ctx;
+	const void *data;
+	const void *insns;
+	__u32 data_sz;
+	__u32 insns_sz;
+	const char *errstr;
+};
+
+static inline int sys_bpf(enum bpf_cmd cmd, union bpf_attr *attr,
+			  unsigned int size)
+{
+	return syscall(__NR_bpf, cmd, attr, size);
+}
+
+static inline int bpf_load_and_run(struct bpf_load_and_run_opts *opts)
+{
+	int map_fd = -1, prog_fd = -1, key = 0, err;
+	union bpf_attr attr;
+
+	map_fd = bpf_create_map_name(BPF_MAP_TYPE_ARRAY, "__loader.map", 4,
+				     opts->data_sz, 1, 0);
+	if (map_fd < 0) {
+		opts->errstr = "failed to create loader map";
+		err = -errno;
+		goto out;
+	}
+
+	err = bpf_map_update_elem(map_fd, &key, opts->data, 0);
+	if (err < 0) {
+		opts->errstr = "failed to update loader map";
+		err = -errno;
+		goto out;
+	}
+
+	memset(&attr, 0, sizeof(attr));
+	attr.prog_type = BPF_PROG_TYPE_SYSCALL;
+	attr.insns = (long) opts->insns;
+	attr.insn_cnt = opts->insns_sz / sizeof(struct bpf_insn);
+	attr.license = (long) "Dual BSD/GPL";
+	memcpy(attr.prog_name, "__loader.prog", sizeof("__loader.prog"));
+	attr.fd_array = (long) &map_fd;
+	attr.log_level = opts->ctx->log_level;
+	attr.log_size = opts->ctx->log_size;
+	attr.log_buf = opts->ctx->log_buf;
+	prog_fd = sys_bpf(BPF_PROG_LOAD, &attr, sizeof(attr));
+	if (prog_fd < 0) {
+		opts->errstr = "failed to load loader prog";
+		err = -errno;
+		goto out;
+	}
+
+	memset(&attr, 0, sizeof(attr));
+	attr.test.prog_fd = prog_fd;
+	attr.test.ctx_in = (long) opts->ctx;
+	attr.test.ctx_size_in = opts->ctx->sz;
+	err = sys_bpf(BPF_PROG_TEST_RUN, &attr, sizeof(attr));
+	if (err < 0 || (int)attr.test.retval < 0) {
+		opts->errstr = "failed to execute loader prog";
+		if (err < 0)
+			err = -errno;
+		else
+			err = (int)attr.test.retval;
+		goto out;
+	}
+	err = 0;
+out:
+	close(map_fd);
+	close(prog_fd);
+	return err;
+}
+
+#endif
-- 
2.30.2



* [PATCH v2 bpf-next 15/16] bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command.
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (13 preceding siblings ...)
  2021-04-23  0:26 ` [PATCH v2 bpf-next 14/16] libbpf: Generate loader program out of BPF ELF file Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-26 22:35   ` Andrii Nakryiko
  2021-04-23  0:26 ` [PATCH v2 bpf-next 16/16] selftests/bpf: Convert few tests to light skeleton Alexei Starovoitov
  2021-04-23 21:36 ` [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, " Yonghong Song
  16 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add a -L flag to bpftool to use the libbpf gen_loader facility and the
syscall/loader program for skeleton generation and program loading.

The "bpftool gen skeleton -L" command generates a "light skeleton" or "loader skeleton"
that is similar to the existing skeleton, but has one major difference:
$ bpftool gen skeleton lsm.o > lsm.skel.h
$ bpftool gen skeleton -L lsm.o > lsm.lskel.h
$ diff lsm.skel.h lsm.lskel.h
@@ -5,34 +4,34 @@
 #define __LSM_SKEL_H__

 #include <stdlib.h>
-#include <bpf/libbpf.h>
+#include <bpf/bpf.h>

The light skeleton does not use the majority of the libbpf infrastructure.
It doesn't need libelf. It doesn't parse the .o file.
It only needs a few sys_bpf wrappers. All of them are in the bpf/bpf.h file.
In the future libbpf/bpf.c can be inlined into bpf.h, so not even libbpf.a would be
needed to work with the light skeleton.
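
No ELF parsing is needed at run time because the generated header carries the
loader program's instructions and its data blob as C string literals (see
print_hex() in gen.c below). A minimal sketch of that escaping follows. It
assumes a hypothetical helper, escape_blob(), that is not part of this patch
and writes into a caller-supplied buffer instead of stdout so the result can
be inspected:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper, not part of the patch. It applies the same rules as
 * print_hex(): a zero byte becomes "\0", any other byte becomes "\xNN", and
 * a backslash-newline continuation is emitted once a line would exceed 78
 * characters.
 * Returns the number of characters written (excluding the trailing NUL),
 * or -1 if the output buffer is too small.
 */
static int escape_blob(const unsigned char *data, size_t data_sz,
		       char *out, size_t out_sz)
{
	size_t i, pos = 0;
	int len = 0;

	for (i = 0; i < data_sz; i++) {
		int w = data[i] ? 4 : 2;	/* width of "\xNN" vs "\0" */

		/* worst case per byte: 2 (continuation) + 4 (escape) + NUL */
		if (out_sz - pos < 7)
			return -1;
		len += w;
		if (len > 78) {
			/* continue the string literal on the next line */
			out[pos++] = '\\';
			out[pos++] = '\n';
			len = w;
		}
		if (!data[i])
			pos += snprintf(out + pos, out_sz - pos, "\\0");
		else
			pos += snprintf(out + pos, out_sz - pos, "\\x%02x",
					(unsigned int)data[i]);
	}
	if (pos >= out_sz)
		return -1;
	out[pos] = '\0';
	return (int)pos;
}
```

The two-character form for zero bytes keeps the header smaller, since loader
data tends to contain many zeroes.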

The "bpftool prog load -L file.o" command is introduced for debugging of syscall/loader
program generation. Like the same command without -L, it tries to load
the programs from file.o into the kernel, but it does not try to pin them.

The "bpftool prog load -L -d file.o" command provides additional debug messages
on how the syscall/loader program was generated.
In addition, the execution of the syscall/loader program will use bpf_trace_printk() for
each step of loading BTF, creating maps, and loading programs.
The user can do "cat /.../trace_pipe" for further debugging.

An example of fexit_sleep.lskel.h generated from progs/fexit_sleep.c:
struct fexit_sleep {
	struct bpf_loader_ctx ctx;
	struct {
		struct bpf_map_desc bss;
	} maps;
	struct {
		struct bpf_prog_desc nanosleep_fentry;
		struct bpf_prog_desc nanosleep_fexit;
	} progs;
	struct {
		int nanosleep_fentry_fd;
		int nanosleep_fexit_fd;
	} links;
	struct fexit_sleep__bss {
		int pid;
		int fentry_cnt;
		int fexit_cnt;
	} *bss;
};
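
The order of the members in this struct matters: the generated __open() sets
ctx.sz to the offset of 'links', and bpf_load_and_run() passes the skeleton
pointer as the loader program's context with that size, so only ctx, maps and
progs are visible to the loader program. A minimal sketch of that computation,
assuming uint32_t/uint64_t as stand-ins for __u32/__u64 and a hypothetical
helper loader_ctx_size() that is not part of the generated header:

```c
#include <stddef.h>
#include <stdint.h>

/* Descriptor layouts copied from skel_internal.h in this series, with
 * stdint types substituted for the kernel typedefs (an assumption made
 * so the sketch builds without kernel headers).
 */
struct bpf_map_desc {
	uint32_t map_fd;
	uint32_t max_entries;
};

struct bpf_prog_desc {
	uint32_t prog_fd;
	uint32_t attach_prog_fd;
};

struct bpf_loader_ctx {
	size_t sz;
	uint32_t log_level;
	uint32_t log_size;
	uint64_t log_buf;
};

struct fexit_sleep {
	struct bpf_loader_ctx ctx;
	struct {
		struct bpf_map_desc bss;
	} maps;
	struct {
		struct bpf_prog_desc nanosleep_fentry;
		struct bpf_prog_desc nanosleep_fexit;
	} progs;
	struct {
		int nanosleep_fentry_fd;
		int nanosleep_fexit_fd;
	} links;
	struct fexit_sleep__bss {
		int pid;
		int fentry_cnt;
		int fexit_cnt;
	} *bss;
};

/* Hypothetical helper: the size handed to the loader program as ctx.sz,
 * i.e. everything in the skeleton that precedes 'links'.
 */
static size_t loader_ctx_size(void)
{
	return offsetof(struct fexit_sleep, links);
}
```

On a typical LP64 build this comes to 24 + 8 + 16 = 48 bytes; the link FDs
and the mmap-ed bss pointer stay private to user space.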

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/bpf/bpftool/Makefile        |   2 +-
 tools/bpf/bpftool/gen.c           | 313 +++++++++++++++++++++++++++---
 tools/bpf/bpftool/main.c          |   7 +-
 tools/bpf/bpftool/main.h          |   1 +
 tools/bpf/bpftool/prog.c          |  80 ++++++++
 tools/bpf/bpftool/xlated_dumper.c |   3 +
 6 files changed, 382 insertions(+), 24 deletions(-)

diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index b3073ae84018..d16d289ade7a 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -136,7 +136,7 @@ endif
 
 BPFTOOL_BOOTSTRAP := $(BOOTSTRAP_OUTPUT)bpftool
 
-BOOTSTRAP_OBJS = $(addprefix $(BOOTSTRAP_OUTPUT),main.o common.o json_writer.o gen.o btf.o)
+BOOTSTRAP_OBJS = $(addprefix $(BOOTSTRAP_OUTPUT),main.o common.o json_writer.o gen.o btf.o xlated_dumper.o btf_dumper.o) $(OUTPUT)disasm.o
 OBJS = $(patsubst %.c,$(OUTPUT)%.o,$(SRCS)) $(OUTPUT)disasm.o
 
 VMLINUX_BTF_PATHS ?= $(if $(O),$(O)/vmlinux)				\
diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
index 31ade77f5ef8..0e56b8f3e337 100644
--- a/tools/bpf/bpftool/gen.c
+++ b/tools/bpf/bpftool/gen.c
@@ -18,6 +18,7 @@
 #include <sys/stat.h>
 #include <sys/mman.h>
 #include <bpf/btf.h>
+#include <bpf/bpf_gen_internal.h>
 
 #include "json_writer.h"
 #include "main.h"
@@ -268,6 +269,254 @@ static void codegen(const char *template, ...)
 	free(s);
 }
 
+static void print_hex(const char *obj_data, int file_sz)
+{
+	int i, len;
+
+	/* embed contents of BPF object file */
+	for (i = 0, len = 0; i < file_sz; i++) {
+		int w = obj_data[i] ? 4 : 2;
+
+		len += w;
+		if (len > 78) {
+			printf("\\\n");
+			len = w;
+		}
+		if (!obj_data[i])
+			printf("\\0");
+		else
+			printf("\\x%02x", (unsigned char)obj_data[i]);
+	}
+}
+
+static size_t bpf_map_mmap_sz(const struct bpf_map *map)
+{
+	long page_sz = sysconf(_SC_PAGE_SIZE);
+	size_t map_sz;
+
+	map_sz = (size_t)roundup(bpf_map__value_size(map), 8) * bpf_map__max_entries(map);
+	map_sz = roundup(map_sz, page_sz);
+	return map_sz;
+}
+
+static void codegen_attach_detach(struct bpf_object *obj, const char *obj_name)
+{
+	struct bpf_program *prog;
+
+	codegen("\
+		\n\
+									    \n\
+		static inline int					    \n\
+		%1$s__attach(struct %1$s *skel)				    \n\
+		{							    \n\
+		", obj_name);
+
+	bpf_object__for_each_program(prog, obj) {
+		printf("\tskel->links.%1$s_fd =\n"
+		       "\t\tbpf_raw_tracepoint_open(",
+		       bpf_program__name(prog));
+
+		switch (bpf_program__get_type(prog)) {
+		case BPF_PROG_TYPE_RAW_TRACEPOINT:
+			putchar('"');
+			fputs(strchr(bpf_program__section_name(prog), '/') + 1, stdout);
+			putchar('"');
+			break;
+		default:
+			fputs("NULL", stdout);
+			break;
+		}
+		printf(", skel->progs.%1$s.prog_fd);\n",
+		       bpf_program__name(prog));
+	}
+
+	codegen("\
+		\n\
+			return 0;					    \n\
+		}							    \n\
+									    \n\
+		static inline void					    \n\
+		%1$s__detach(struct %1$s *skel)				    \n\
+		{							    \n\
+		", obj_name);
+	bpf_object__for_each_program(prog, obj) {
+		printf("\tclose(skel->links.%1$s_fd);\n",
+		       bpf_program__name(prog));
+	}
+	codegen("\
+		\n\
+		}							    \n\
+		");
+}
+
+static void codegen_destroy(struct bpf_object *obj, const char *obj_name)
+{
+	struct bpf_program *prog;
+	struct bpf_map *map;
+
+	codegen("\
+		\n\
+		static void						    \n\
+		%1$s__destroy(struct %1$s *skel)			    \n\
+		{							    \n\
+			if (!skel)					    \n\
+				return;					    \n\
+			%1$s__detach(skel);				    \n\
+		",
+		obj_name);
+	bpf_object__for_each_program(prog, obj) {
+		printf("\tclose(skel->progs.%1$s.prog_fd);\n",
+		       bpf_program__name(prog));
+	}
+	bpf_object__for_each_map(map, obj) {
+		const char * ident;
+
+		ident = get_map_ident(map);
+		if (!ident)
+			continue;
+		if (bpf_map__is_internal(map) &&
+		    (bpf_map__def(map)->map_flags & BPF_F_MMAPABLE))
+			printf("\tmunmap(skel->%1$s, %2$zd);\n",
+			       ident, bpf_map_mmap_sz(map));
+		printf("\tclose(skel->maps.%1$s.map_fd);\n", ident);
+	}
+	codegen("\
+		\n\
+			free(skel);					    \n\
+		}							    \n\
+		",
+		obj_name);
+}
+
+static int gen_trace(struct bpf_object *obj, const char *obj_name, const char *header_guard)
+{
+	struct bpf_object_load_attr load_attr = {};
+	DECLARE_LIBBPF_OPTS(gen_loader_opts, opts);
+	struct bpf_map *map;
+	int err = 0;
+
+	err = bpf_object__gen_loader(obj, &opts);
+	if (err)
+		return err;
+
+	load_attr.obj = obj;
+	if (verifier_logs)
+		/* log_level1 + log_level2 + stats, but not stable UAPI */
+		load_attr.log_level = 1 + 2 + 4;
+
+	err = bpf_object__load_xattr(&load_attr);
+	if (err) {
+		p_err("failed to load object file");
+		goto out;
+	}
+	/* If there was no error during load then gen_loader_opts
+	 * are populated with the loader program.
+	 */
+
+	/* finish generating 'struct skel' */
+	codegen("\
+		\n\
+		};							    \n\
+		", obj_name);
+
+
+	codegen_attach_detach(obj, obj_name);
+
+	codegen_destroy(obj, obj_name);
+
+	codegen("\
+		\n\
+		static inline struct %1$s *				    \n\
+		%1$s__open(void)					    \n\
+		{							    \n\
+			struct %1$s *skel;				    \n\
+									    \n\
+			skel = calloc(sizeof(*skel), 1);		    \n\
+			if (!skel)					    \n\
+				return NULL;				    \n\
+			skel->ctx.sz = (void *)&skel->links - (void *)skel; \n\
+			return skel;					    \n\
+		}							    \n\
+									    \n\
+		static inline int					    \n\
+		%1$s__load(struct %1$s *skel)				    \n\
+		{							    \n\
+			struct bpf_load_and_run_opts opts = {};		    \n\
+			int err;					    \n\
+									    \n\
+			opts.ctx = (struct bpf_loader_ctx *)skel;	    \n\
+			opts.data_sz = %2$d;				    \n\
+			opts.data = (void *)\"\\			    \n\
+		",
+		obj_name, opts.data_sz);
+	print_hex(opts.data, opts.data_sz);
+	codegen("\
+		\n\
+		\";							    \n\
+		");
+
+	codegen("\
+		\n\
+			opts.insns_sz = %d;				    \n\
+			opts.insns = (void *)\"\\			    \n\
+		",
+		opts.insns_sz);
+	print_hex(opts.insns, opts.insns_sz);
+	codegen("\
+		\n\
+		\";							    \n\
+			err = bpf_load_and_run(&opts);			    \n\
+			if (err < 0)					    \n\
+				return err;				    \n\
+		", obj_name);
+	bpf_object__for_each_map(map, obj) {
+		const char * ident;
+
+		ident = get_map_ident(map);
+		if (!ident)
+			continue;
+
+		if (!bpf_map__is_internal(map) ||
+		    !(bpf_map__def(map)->map_flags & BPF_F_MMAPABLE))
+			continue;
+
+		printf("\tskel->%1$s =\n"
+		       "\t\tmmap(NULL, %2$zd, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED,\n"
+		       "\t\t\tskel->maps.%1$s.map_fd, 0);\n",
+		       ident, bpf_map_mmap_sz(map));
+	}
+	codegen("\
+		\n\
+			return 0;					    \n\
+		}							    \n\
+									    \n\
+		static inline struct %1$s *				    \n\
+		%1$s__open_and_load(void)				    \n\
+		{							    \n\
+			struct %1$s *skel;				    \n\
+									    \n\
+			skel = %1$s__open();				    \n\
+			if (!skel)					    \n\
+				return NULL;				    \n\
+			if (%1$s__load(skel)) {				    \n\
+				%1$s__destroy(skel);			    \n\
+				return NULL;				    \n\
+			}						    \n\
+			return skel;					    \n\
+		}							    \n\
+		", obj_name);
+
+	codegen("\
+		\n\
+									    \n\
+		#endif /* %s */						    \n\
+		",
+		header_guard);
+	err = 0;
+out:
+	return err;
+}
+
 static int do_skeleton(int argc, char **argv)
 {
 	char header_guard[MAX_OBJ_NAME_LEN + sizeof("__SKEL_H__")];
@@ -277,7 +526,7 @@ static int do_skeleton(int argc, char **argv)
 	struct bpf_object *obj = NULL;
 	const char *file, *ident;
 	struct bpf_program *prog;
-	int fd, len, err = -1;
+	int fd, err = -1;
 	struct bpf_map *map;
 	struct btf *btf;
 	struct stat st;
@@ -359,7 +608,25 @@ static int do_skeleton(int argc, char **argv)
 	}
 
 	get_header_guard(header_guard, obj_name);
-	codegen("\
+	if (use_loader)
+		codegen("\
+		\n\
+		/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */   \n\
+		/* THIS FILE IS AUTOGENERATED! */			    \n\
+		#ifndef %2$s						    \n\
+		#define %2$s						    \n\
+									    \n\
+		#include <stdlib.h>					    \n\
+		#include <bpf/bpf.h>					    \n\
+		#include <bpf/skel_internal.h>				    \n\
+									    \n\
+		struct %1$s {						    \n\
+			struct bpf_loader_ctx ctx;			    \n\
+		",
+		obj_name, header_guard
+		);
+	else
+		codegen("\
 		\n\
 		/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */   \n\
 									    \n\
@@ -375,7 +642,7 @@ static int do_skeleton(int argc, char **argv)
 			struct bpf_object *obj;				    \n\
 		",
 		obj_name, header_guard
-	);
+		);
 
 	if (map_cnt) {
 		printf("\tstruct {\n");
@@ -383,7 +650,10 @@ static int do_skeleton(int argc, char **argv)
 			ident = get_map_ident(map);
 			if (!ident)
 				continue;
-			printf("\t\tstruct bpf_map *%s;\n", ident);
+			if (use_loader)
+				printf("\t\tstruct bpf_map_desc %s;\n", ident);
+			else
+				printf("\t\tstruct bpf_map *%s;\n", ident);
 		}
 		printf("\t} maps;\n");
 	}
@@ -391,14 +661,22 @@ static int do_skeleton(int argc, char **argv)
 	if (prog_cnt) {
 		printf("\tstruct {\n");
 		bpf_object__for_each_program(prog, obj) {
-			printf("\t\tstruct bpf_program *%s;\n",
-			       bpf_program__name(prog));
+			if (use_loader)
+				printf("\t\tstruct bpf_prog_desc %s;\n",
+				       bpf_program__name(prog));
+			else
+				printf("\t\tstruct bpf_program *%s;\n",
+				       bpf_program__name(prog));
 		}
 		printf("\t} progs;\n");
 		printf("\tstruct {\n");
 		bpf_object__for_each_program(prog, obj) {
-			printf("\t\tstruct bpf_link *%s;\n",
-			       bpf_program__name(prog));
+			if (use_loader)
+				printf("\t\tint %s_fd;\n",
+				       bpf_program__name(prog));
+			else
+				printf("\t\tstruct bpf_link *%s;\n",
+				       bpf_program__name(prog));
 		}
 		printf("\t} links;\n");
 	}
@@ -409,6 +687,10 @@ static int do_skeleton(int argc, char **argv)
 		if (err)
 			goto out;
 	}
+	if (use_loader) {
+		err = gen_trace(obj, obj_name, header_guard);
+		goto out;
+	}
 
 	codegen("\
 		\n\
@@ -577,20 +859,7 @@ static int do_skeleton(int argc, char **argv)
 		",
 		file_sz);
 
-	/* embed contents of BPF object file */
-	for (i = 0, len = 0; i < file_sz; i++) {
-		int w = obj_data[i] ? 4 : 2;
-
-		len += w;
-		if (len > 78) {
-			printf("\\\n");
-			len = w;
-		}
-		if (!obj_data[i])
-			printf("\\0");
-		else
-			printf("\\x%02x", (unsigned char)obj_data[i]);
-	}
+	print_hex(obj_data, file_sz);
 
 	codegen("\
 		\n\
diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
index d9afb730136a..7f2817d97079 100644
--- a/tools/bpf/bpftool/main.c
+++ b/tools/bpf/bpftool/main.c
@@ -29,6 +29,7 @@ bool show_pinned;
 bool block_mount;
 bool verifier_logs;
 bool relaxed_maps;
+bool use_loader;
 struct btf *base_btf;
 struct pinned_obj_table prog_table;
 struct pinned_obj_table map_table;
@@ -392,6 +393,7 @@ int main(int argc, char **argv)
 		{ "mapcompat",	no_argument,	NULL,	'm' },
 		{ "nomount",	no_argument,	NULL,	'n' },
 		{ "debug",	no_argument,	NULL,	'd' },
+		{ "use-loader",	no_argument,	NULL,	'L' },
 		{ "base-btf",	required_argument, NULL, 'B' },
 		{ 0 }
 	};
@@ -409,7 +411,7 @@ int main(int argc, char **argv)
 	hash_init(link_table.table);
 
 	opterr = 0;
-	while ((opt = getopt_long(argc, argv, "VhpjfmndB:",
+	while ((opt = getopt_long(argc, argv, "VhpjfLmndB:",
 				  options, NULL)) >= 0) {
 		switch (opt) {
 		case 'V':
@@ -452,6 +454,9 @@ int main(int argc, char **argv)
 				return -1;
 			}
 			break;
+		case 'L':
+			use_loader = true;
+			break;
 		default:
 			p_err("unrecognized option '%s'", argv[optind - 1]);
 			if (json_output)
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index 76e91641262b..c1cf29798b99 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -90,6 +90,7 @@ extern bool show_pids;
 extern bool block_mount;
 extern bool verifier_logs;
 extern bool relaxed_maps;
+extern bool use_loader;
 extern struct btf *base_btf;
 extern struct pinned_obj_table prog_table;
 extern struct pinned_obj_table map_table;
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index 3f067d2d7584..052b16101ab7 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -24,6 +24,8 @@
 #include <bpf/bpf.h>
 #include <bpf/btf.h>
 #include <bpf/libbpf.h>
+#include <bpf/bpf_gen_internal.h>
+#include <bpf/skel_internal.h>
 
 #include "cfg.h"
 #include "main.h"
@@ -1645,8 +1647,86 @@ static int load_with_options(int argc, char **argv, bool first_prog_only)
 	return -1;
 }
 
+static int try_loader(struct gen_loader_opts *gen)
+{
+	struct bpf_load_and_run_opts opts = {};
+	struct bpf_loader_ctx *ctx;
+	int ctx_sz = sizeof(*ctx) + 64 * max(sizeof(struct bpf_map_desc), sizeof(struct bpf_prog_desc));
+	int log_buf_sz = (1u << 24) - 1;
+	char *log_buf;
+	int err;
+
+	ctx = alloca(ctx_sz);
+	ctx->sz = ctx_sz;
+	ctx->log_level = 1;
+	ctx->log_size = log_buf_sz;
+	log_buf = malloc(log_buf_sz);
+	if (!log_buf)
+		return -ENOMEM;
+	ctx->log_buf = (long) log_buf;
+	opts.ctx = ctx;
+	opts.data = gen->data;
+	opts.data_sz = gen->data_sz;
+	opts.insns = gen->insns;
+	opts.insns_sz = gen->insns_sz;
+	err = bpf_load_and_run(&opts);
+	if (err < 0)
+		fprintf(stderr, "err %d\n%s\n%s", err, opts.errstr, log_buf);
+	free(log_buf);
+	return err;
+}
+
+static int do_loader(int argc, char **argv)
+{
+	DECLARE_LIBBPF_OPTS(bpf_object_open_opts, open_opts);
+	DECLARE_LIBBPF_OPTS(gen_loader_opts, gen);
+	struct bpf_object_load_attr load_attr = {};
+	struct bpf_object *obj;
+	const char *file;
+	int err = 0;
+
+	if (!REQ_ARGS(1))
+		return -1;
+	file = GET_ARG();
+
+	obj = bpf_object__open_file(file, &open_opts);
+	if (IS_ERR_OR_NULL(obj)) {
+		p_err("failed to open object file");
+		goto err_close_obj;
+	}
+
+	err = bpf_object__gen_loader(obj, &gen);
+	if (err)
+		goto err_close_obj;
+
+	load_attr.obj = obj;
+	if (verifier_logs)
+		/* log_level1 + log_level2 + stats, but not stable UAPI */
+		load_attr.log_level = 1 + 2 + 4;
+
+	err = bpf_object__load_xattr(&load_attr);
+	if (err) {
+		p_err("failed to load object file");
+		goto err_close_obj;
+	}
+
+	if (verifier_logs) {
+		struct dump_data dd = {};
+
+		kernel_syms_load(&dd);
+		dump_xlated_plain(&dd, (void *)gen.insns, gen.insns_sz, false, false);
+		kernel_syms_destroy(&dd);
+	}
+	err = try_loader(&gen);
+err_close_obj:
+	bpf_object__close(obj);
+	return err;
+}
+
 static int do_load(int argc, char **argv)
 {
+	if (use_loader)
+		return do_loader(argc, argv);
 	return load_with_options(argc, argv, true);
 }
 
diff --git a/tools/bpf/bpftool/xlated_dumper.c b/tools/bpf/bpftool/xlated_dumper.c
index 6fc3e6f7f40c..f1f32e21d5cd 100644
--- a/tools/bpf/bpftool/xlated_dumper.c
+++ b/tools/bpf/bpftool/xlated_dumper.c
@@ -196,6 +196,9 @@ static const char *print_imm(void *private_data,
 	else if (insn->src_reg == BPF_PSEUDO_MAP_VALUE)
 		snprintf(dd->scratch_buff, sizeof(dd->scratch_buff),
 			 "map[id:%u][0]+%u", insn->imm, (insn + 1)->imm);
+	else if (insn->src_reg == BPF_PSEUDO_MAP_IDX_VALUE)
+		snprintf(dd->scratch_buff, sizeof(dd->scratch_buff),
+			 "map[idx:%u]+%u", insn->imm, (insn + 1)->imm);
 	else if (insn->src_reg == BPF_PSEUDO_FUNC)
 		snprintf(dd->scratch_buff, sizeof(dd->scratch_buff),
 			 "subprog[%+d]", insn->imm);
-- 
2.30.2



* [PATCH v2 bpf-next 16/16] selftests/bpf: Convert few tests to light skeleton.
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (14 preceding siblings ...)
  2021-04-23  0:26 ` [PATCH v2 bpf-next 15/16] bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command Alexei Starovoitov
@ 2021-04-23  0:26 ` Alexei Starovoitov
  2021-04-23 21:36 ` [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, " Yonghong Song
  16 siblings, 0 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23  0:26 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Convert a few tests that don't use CO-RE to light skeleton.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/.gitignore            |  1 +
 tools/testing/selftests/bpf/Makefile              | 15 ++++++++++++++-
 .../selftests/bpf/prog_tests/fentry_fexit.c       |  6 +++---
 .../selftests/bpf/prog_tests/fentry_test.c        |  4 ++--
 .../selftests/bpf/prog_tests/fexit_sleep.c        |  6 +++---
 .../testing/selftests/bpf/prog_tests/fexit_test.c |  4 ++--
 .../testing/selftests/bpf/prog_tests/kfunc_call.c |  6 +++---
 .../selftests/bpf/prog_tests/ksyms_module.c       |  2 +-
 8 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/tools/testing/selftests/bpf/.gitignore b/tools/testing/selftests/bpf/.gitignore
index 4866f6a21901..a030aa4a8a9e 100644
--- a/tools/testing/selftests/bpf/.gitignore
+++ b/tools/testing/selftests/bpf/.gitignore
@@ -30,6 +30,7 @@ test_sysctl
 xdping
 test_cpp
 *.skel.h
+*.lskel.h
 /no_alu32
 /bpf_gcc
 /tools
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 9fdfdbc61857..2db4262b2f40 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -312,6 +312,9 @@ SKEL_BLACKLIST := btf__% test_pinning_invalid.c test_sk_assign.c
 
 LINKED_SKELS := test_static_linked.skel.h
 
+LSKELS := kfunc_call_test.c fentry_test.c fexit_test.c fexit_sleep.c test_ksyms_module.c
+SKEL_BLACKLIST += $$(LSKELS)
+
 test_static_linked.skel.h-deps := test_static_linked1.o test_static_linked2.o
 
 # Set up extra TRUNNER_XXX "temporary" variables in the environment (relies on
@@ -334,6 +337,7 @@ TRUNNER_BPF_OBJS := $$(patsubst %.c,$$(TRUNNER_OUTPUT)/%.o, $$(TRUNNER_BPF_SRCS)
 TRUNNER_BPF_SKELS := $$(patsubst %.c,$$(TRUNNER_OUTPUT)/%.skel.h,	\
 				 $$(filter-out $(SKEL_BLACKLIST),	\
 					       $$(TRUNNER_BPF_SRCS)))
+TRUNNER_BPF_LSKELS := $$(patsubst %.c,$$(TRUNNER_OUTPUT)/%.lskel.h, $$(LSKELS))
 TRUNNER_BPF_SKELS_LINKED := $$(addprefix $$(TRUNNER_OUTPUT)/,$(LINKED_SKELS))
 TEST_GEN_FILES += $$(TRUNNER_BPF_OBJS)
 
@@ -375,6 +379,14 @@ $(TRUNNER_BPF_SKELS): %.skel.h: %.o $(BPFTOOL) | $(TRUNNER_OUTPUT)
 	$(Q)diff $$(<:.o=.linked2.o) $$(<:.o=.linked3.o)
 	$(Q)$$(BPFTOOL) gen skeleton $$(<:.o=.linked3.o) name $$(notdir $$(<:.o=)) > $$@
 
+$(TRUNNER_BPF_LSKELS): %.lskel.h: %.o $(BPFTOOL) | $(TRUNNER_OUTPUT)
+	$$(call msg,GEN-SKEL,$(TRUNNER_BINARY),$$@)
+	$(Q)$$(BPFTOOL) gen object $$(<:.o=.linked1.o) $$<
+	$(Q)$$(BPFTOOL) gen object $$(<:.o=.linked2.o) $$(<:.o=.linked1.o)
+	$(Q)$$(BPFTOOL) gen object $$(<:.o=.linked3.o) $$(<:.o=.linked2.o)
+	$(Q)diff $$(<:.o=.linked2.o) $$(<:.o=.linked3.o)
+	$(Q)$$(BPFTOOL) gen skeleton -L $$(<:.o=.linked3.o) name $$(notdir $$(<:.o=)) > $$@
+
 $(TRUNNER_BPF_SKELS_LINKED): $(TRUNNER_BPF_OBJS) $(BPFTOOL) | $(TRUNNER_OUTPUT)
 	$$(call msg,LINK-BPF,$(TRUNNER_BINARY),$$(@:.skel.h=.o))
 	$(Q)$$(BPFTOOL) gen object $$(@:.skel.h=.linked1.o) $$(addprefix $(TRUNNER_OUTPUT)/,$$($$(@F)-deps))
@@ -404,6 +416,7 @@ $(TRUNNER_TEST_OBJS): $(TRUNNER_OUTPUT)/%.test.o:			\
 		      $(TRUNNER_EXTRA_HDRS)				\
 		      $(TRUNNER_BPF_OBJS)				\
 		      $(TRUNNER_BPF_SKELS)				\
+		      $(TRUNNER_BPF_LSKELS)				\
 		      $(TRUNNER_BPF_SKELS_LINKED)			\
 		      $$(BPFOBJ) | $(TRUNNER_OUTPUT)
 	$$(call msg,TEST-OBJ,$(TRUNNER_BINARY),$$@)
@@ -511,6 +524,6 @@ $(OUTPUT)/bench: $(OUTPUT)/bench.o $(OUTPUT)/testing_helpers.o \
 EXTRA_CLEAN := $(TEST_CUSTOM_PROGS) $(SCRATCH_DIR) $(HOST_SCRATCH_DIR)	\
 	prog_tests/tests.h map_tests/tests.h verifier/tests.h		\
 	feature								\
-	$(addprefix $(OUTPUT)/,*.o *.skel.h no_alu32 bpf_gcc bpf_testmod.ko)
+	$(addprefix $(OUTPUT)/,*.o *.skel.h *.lskel.h no_alu32 bpf_gcc bpf_testmod.ko)
 
 .PHONY: docs docs-clean
diff --git a/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c b/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
index 109d0345a2be..91154c2ba256 100644
--- a/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
+++ b/tools/testing/selftests/bpf/prog_tests/fentry_fexit.c
@@ -1,8 +1,8 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2019 Facebook */
 #include <test_progs.h>
-#include "fentry_test.skel.h"
-#include "fexit_test.skel.h"
+#include "fentry_test.lskel.h"
+#include "fexit_test.lskel.h"
 
 void test_fentry_fexit(void)
 {
@@ -26,7 +26,7 @@ void test_fentry_fexit(void)
 	if (CHECK(err, "fexit_attach", "fexit attach failed: %d\n", err))
 		goto close_prog;
 
-	prog_fd = bpf_program__fd(fexit_skel->progs.test1);
+	prog_fd = fexit_skel->progs.test1.prog_fd;
 	err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
 				NULL, NULL, &retval, &duration);
 	CHECK(err || retval, "ipv6",
diff --git a/tools/testing/selftests/bpf/prog_tests/fentry_test.c b/tools/testing/selftests/bpf/prog_tests/fentry_test.c
index 04ebbf1cb390..78062855b142 100644
--- a/tools/testing/selftests/bpf/prog_tests/fentry_test.c
+++ b/tools/testing/selftests/bpf/prog_tests/fentry_test.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2019 Facebook */
 #include <test_progs.h>
-#include "fentry_test.skel.h"
+#include "fentry_test.lskel.h"
 
 void test_fentry_test(void)
 {
@@ -18,7 +18,7 @@ void test_fentry_test(void)
 	if (CHECK(err, "fentry_attach", "fentry attach failed: %d\n", err))
 		goto cleanup;
 
-	prog_fd = bpf_program__fd(fentry_skel->progs.test1);
+	prog_fd = fentry_skel->progs.test1.prog_fd;
 	err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
 				NULL, NULL, &retval, &duration);
 	CHECK(err || retval, "test_run",
diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_sleep.c b/tools/testing/selftests/bpf/prog_tests/fexit_sleep.c
index ccc7e8a34ab6..4e7f4b42ea29 100644
--- a/tools/testing/selftests/bpf/prog_tests/fexit_sleep.c
+++ b/tools/testing/selftests/bpf/prog_tests/fexit_sleep.c
@@ -6,7 +6,7 @@
 #include <time.h>
 #include <sys/mman.h>
 #include <sys/syscall.h>
-#include "fexit_sleep.skel.h"
+#include "fexit_sleep.lskel.h"
 
 static int do_sleep(void *skel)
 {
@@ -58,8 +58,8 @@ void test_fexit_sleep(void)
 	 * waiting for percpu_ref_kill to confirm). The other one
 	 * will be freed quickly.
 	 */
-	close(bpf_program__fd(fexit_skel->progs.nanosleep_fentry));
-	close(bpf_program__fd(fexit_skel->progs.nanosleep_fexit));
+	close(fexit_skel->progs.nanosleep_fentry.prog_fd);
+	close(fexit_skel->progs.nanosleep_fexit.prog_fd);
 	fexit_sleep__detach(fexit_skel);
 
 	/* kill the thread to unwind sys_nanosleep stack through the trampoline */
diff --git a/tools/testing/selftests/bpf/prog_tests/fexit_test.c b/tools/testing/selftests/bpf/prog_tests/fexit_test.c
index 78d7a2765c27..be75d0c1018a 100644
--- a/tools/testing/selftests/bpf/prog_tests/fexit_test.c
+++ b/tools/testing/selftests/bpf/prog_tests/fexit_test.c
@@ -1,7 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2019 Facebook */
 #include <test_progs.h>
-#include "fexit_test.skel.h"
+#include "fexit_test.lskel.h"
 
 void test_fexit_test(void)
 {
@@ -18,7 +18,7 @@ void test_fexit_test(void)
 	if (CHECK(err, "fexit_attach", "fexit attach failed: %d\n", err))
 		goto cleanup;
 
-	prog_fd = bpf_program__fd(fexit_skel->progs.test1);
+	prog_fd = fexit_skel->progs.test1.prog_fd;
 	err = bpf_prog_test_run(prog_fd, 1, NULL, 0,
 				NULL, NULL, &retval, &duration);
 	CHECK(err || retval, "test_run",
diff --git a/tools/testing/selftests/bpf/prog_tests/kfunc_call.c b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
index 7fc0951ee75f..30a7b9b837bf 100644
--- a/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
+++ b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
@@ -2,7 +2,7 @@
 /* Copyright (c) 2021 Facebook */
 #include <test_progs.h>
 #include <network_helpers.h>
-#include "kfunc_call_test.skel.h"
+#include "kfunc_call_test.lskel.h"
 #include "kfunc_call_test_subprog.skel.h"
 
 static void test_main(void)
@@ -14,13 +14,13 @@ static void test_main(void)
 	if (!ASSERT_OK_PTR(skel, "skel"))
 		return;
 
-	prog_fd = bpf_program__fd(skel->progs.kfunc_call_test1);
+	prog_fd = skel->progs.kfunc_call_test1.prog_fd;
 	err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),
 				NULL, NULL, (__u32 *)&retval, NULL);
 	ASSERT_OK(err, "bpf_prog_test_run(test1)");
 	ASSERT_EQ(retval, 12, "test1-retval");
 
-	prog_fd = bpf_program__fd(skel->progs.kfunc_call_test2);
+	prog_fd = skel->progs.kfunc_call_test2.prog_fd;
 	err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),
 				NULL, NULL, (__u32 *)&retval, NULL);
 	ASSERT_OK(err, "bpf_prog_test_run(test2)");
diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms_module.c b/tools/testing/selftests/bpf/prog_tests/ksyms_module.c
index 4c232b456479..2cd5cded543f 100644
--- a/tools/testing/selftests/bpf/prog_tests/ksyms_module.c
+++ b/tools/testing/selftests/bpf/prog_tests/ksyms_module.c
@@ -4,7 +4,7 @@
 #include <test_progs.h>
 #include <bpf/libbpf.h>
 #include <bpf/btf.h>
-#include "test_ksyms_module.skel.h"
+#include "test_ksyms_module.lskel.h"
 
 static int duration;
 
-- 
2.30.2
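
The recurring change in the test conversions above is how the program
FD is obtained. A minimal sketch of the difference follows. It is an
illustration only: every type and function name ending in _model, and
both accessor functions, are hypothetical stand-ins, not the libbpf or
skel_internal.h definitions. The only detail taken from the patch is
that a light skeleton exposes progs.<name>.prog_fd directly.

```c
/* Hypothetical stand-in for libbpf's opaque struct bpf_program. */
struct bpf_program_model {
	int fd;
};

/* Hypothetical stand-in for the light skeleton's per-program descriptor. */
struct bpf_prog_desc_model {
	int prog_fd;
};

/* Regular skeleton: progs.<name> is a pointer to an opaque object. */
struct fentry_test_skel_model {
	struct {
		struct bpf_program_model *test1;
	} progs;
};

/* Light skeleton: progs.<name> is a plain struct holding the FD. */
struct fentry_test_lskel_model {
	struct {
		struct bpf_prog_desc_model test1;
	} progs;
};

/* Models bpf_program__fd(): the FD is only reachable through an accessor. */
static int bpf_program_fd_model(const struct bpf_program_model *prog)
{
	return prog->fd;
}

/* Old style: prog_fd = bpf_program__fd(skel->progs.test1); */
static int skel_prog_fd(const struct fentry_test_skel_model *skel)
{
	return bpf_program_fd_model(skel->progs.test1);
}

/* New style: prog_fd = skel->progs.test1.prog_fd; */
static int lskel_prog_fd(const struct fentry_test_lskel_model *skel)
{
	return skel->progs.test1.prog_fd;
}
```

Both accessors yield the same FD in this model. The light skeleton
simply has no libbpf object behind the field, which is why the tests
read the field instead of calling into libbpf.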



* Re: [PATCH v2 bpf-next 01/16] bpf: Introduce bpf_sys_bpf() helper and program type.
  2021-04-23  0:26 ` [PATCH v2 bpf-next 01/16] bpf: Introduce bpf_sys_bpf() helper and program type Alexei Starovoitov
@ 2021-04-23 18:15   ` Yonghong Song
  2021-04-23 18:28     ` Alexei Starovoitov
  2021-04-26 16:51   ` Andrii Nakryiko
  2021-04-27 18:45   ` John Fastabend
  2 siblings, 1 reply; 52+ messages in thread
From: Yonghong Song @ 2021-04-23 18:15 UTC (permalink / raw)
  To: Alexei Starovoitov, davem; +Cc: daniel, andrii, netdev, bpf, kernel-team



On 4/22/21 5:26 PM, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
> 
> Add placeholders for bpf_sys_bpf() helper and new program type.
> 
> v1->v2:
> - check that expected_attach_type is zero
> - allow more helper functions to be used in this program type, since they will
>    only execute from user context via bpf_prog_test_run.
> 
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>   include/linux/bpf.h            | 10 +++++++
>   include/linux/bpf_types.h      |  2 ++
>   include/uapi/linux/bpf.h       |  8 +++++
>   kernel/bpf/syscall.c           | 54 ++++++++++++++++++++++++++++++++++
>   net/bpf/test_run.c             | 43 +++++++++++++++++++++++++++
>   tools/include/uapi/linux/bpf.h |  8 +++++
>   6 files changed, 125 insertions(+)
> 
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index f8a45f109e96..aed30bbffb54 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1824,6 +1824,9 @@ static inline bool bpf_map_is_dev_bound(struct bpf_map *map)
>   
>   struct bpf_map *bpf_map_offload_map_alloc(union bpf_attr *attr);
>   void bpf_map_offload_map_free(struct bpf_map *map);
> +int bpf_prog_test_run_syscall(struct bpf_prog *prog,
> +			      const union bpf_attr *kattr,
> +			      union bpf_attr __user *uattr);
>   #else
>   static inline int bpf_prog_offload_init(struct bpf_prog *prog,
>   					union bpf_attr *attr)
> @@ -1849,6 +1852,13 @@ static inline struct bpf_map *bpf_map_offload_map_alloc(union bpf_attr *attr)
>   static inline void bpf_map_offload_map_free(struct bpf_map *map)
>   {
>   }
> +
> +static inline int bpf_prog_test_run_syscall(struct bpf_prog *prog,
> +					    const union bpf_attr *kattr,
> +					    union bpf_attr __user *uattr)
> +{
> +	return -ENOTSUPP;
> +}
>   #endif /* CONFIG_NET && CONFIG_BPF_SYSCALL */
>   
>   #if defined(CONFIG_INET) && defined(CONFIG_BPF_SYSCALL)
> diff --git a/include/linux/bpf_types.h b/include/linux/bpf_types.h
> index f883f01a5061..a9db1eae6796 100644
> --- a/include/linux/bpf_types.h
> +++ b/include/linux/bpf_types.h
> @@ -77,6 +77,8 @@ BPF_PROG_TYPE(BPF_PROG_TYPE_LSM, lsm,
>   	       void *, void *)
>   #endif /* CONFIG_BPF_LSM */
>   #endif
> +BPF_PROG_TYPE(BPF_PROG_TYPE_SYSCALL, bpf_syscall,
> +	      void *, void *)
>   
>   BPF_MAP_TYPE(BPF_MAP_TYPE_ARRAY, array_map_ops)
>   BPF_MAP_TYPE(BPF_MAP_TYPE_PERCPU_ARRAY, percpu_array_map_ops)
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index ec6d85a81744..c92648f38144 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -937,6 +937,7 @@ enum bpf_prog_type {
>   	BPF_PROG_TYPE_EXT,
>   	BPF_PROG_TYPE_LSM,
>   	BPF_PROG_TYPE_SK_LOOKUP,
> +	BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */
>   };
>   
>   enum bpf_attach_type {
> @@ -4735,6 +4736,12 @@ union bpf_attr {
>    *		be zero-terminated except when **str_size** is 0.
>    *
>    *		Or **-EBUSY** if the per-CPU memory copy buffer is busy.
> + *
> + * long bpf_sys_bpf(u32 cmd, void *attr, u32 attr_size)
> + * 	Description
> + * 		Execute bpf syscall with given arguments.
> + * 	Return
> + * 		A syscall result.
>    */
>   #define __BPF_FUNC_MAPPER(FN)		\
>   	FN(unspec),			\
> @@ -4903,6 +4910,7 @@ union bpf_attr {
>   	FN(check_mtu),			\
>   	FN(for_each_map_elem),		\
>   	FN(snprintf),			\
> +	FN(sys_bpf),			\
>   	/* */
>   
>   /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
> index fd495190115e..8636876f3e6b 100644
> --- a/kernel/bpf/syscall.c
> +++ b/kernel/bpf/syscall.c
> @@ -2014,6 +2014,7 @@ bpf_prog_load_check_attach(enum bpf_prog_type prog_type,
>   		if (expected_attach_type == BPF_SK_LOOKUP)
>   			return 0;
>   		return -EINVAL;
> +	case BPF_PROG_TYPE_SYSCALL:
>   	case BPF_PROG_TYPE_EXT:
>   		if (expected_attach_type)
>   			return -EINVAL;
> @@ -4497,3 +4498,56 @@ SYSCALL_DEFINE3(bpf, int, cmd, union bpf_attr __user *, uattr, unsigned int, siz
>   
>   	return err;
>   }
> +
> +static bool syscall_prog_is_valid_access(int off, int size,
> +					 enum bpf_access_type type,
> +					 const struct bpf_prog *prog,
> +					 struct bpf_insn_access_aux *info)
> +{
> +	if (off < 0 || off >= U16_MAX)
> +		return false;

Is this enough? If I understand correctly, the new program type
allows arbitrary context data from the user as long as its size
meets the following constraints:
	if (ctx_size_in < prog->aux->max_ctx_offset ||
	    ctx_size_in > U16_MAX)
		return -EINVAL;

So if the user provides a ctx of size, say, 40, it looks like the
program is still able to read/write at, say, offset 400.
Should we be a little more restrictive here?

> +	if (off % size != 0)
> +		return false;
> +	return true;
> +}
> +
> +BPF_CALL_3(bpf_sys_bpf, int, cmd, void *, attr, u32, attr_size)
> +{
> +	return -EINVAL;
> +}
> +
> +const struct bpf_func_proto bpf_sys_bpf_proto = {
> +	.func		= bpf_sys_bpf,
> +	.gpl_only	= false,
> +	.ret_type	= RET_INTEGER,
> +	.arg1_type	= ARG_ANYTHING,
> +	.arg2_type	= ARG_PTR_TO_MEM,
> +	.arg3_type	= ARG_CONST_SIZE,
> +};
> +
> +const struct bpf_func_proto * __weak
> +tracing_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> +{
> +
> +	return bpf_base_func_proto(func_id);
> +}
> +
> +static const struct bpf_func_proto *
> +syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> +{
> +	switch (func_id) {
> +	case BPF_FUNC_sys_bpf:
> +		return &bpf_sys_bpf_proto;
> +	default:
> +		return tracing_prog_func_proto(func_id, prog);
> +	}
> +}
> +
> +const struct bpf_verifier_ops bpf_syscall_verifier_ops = {
> +	.get_func_proto  = syscall_prog_func_proto,
> +	.is_valid_access = syscall_prog_is_valid_access,
> +};
> +
> +const struct bpf_prog_ops bpf_syscall_prog_ops = {
> +	.test_run = bpf_prog_test_run_syscall,
> +};
> diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
> index a5d72c48fb66..1783ea77b95c 100644
> --- a/net/bpf/test_run.c
> +++ b/net/bpf/test_run.c
> @@ -918,3 +918,46 @@ int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog, const union bpf_attr *kat
>   	kfree(user_ctx);
>   	return ret;
>   }
> +
> +int bpf_prog_test_run_syscall(struct bpf_prog *prog,
> +			      const union bpf_attr *kattr,
> +			      union bpf_attr __user *uattr)
> +{
> +	void __user *ctx_in = u64_to_user_ptr(kattr->test.ctx_in);
> +	__u32 ctx_size_in = kattr->test.ctx_size_in;
> +	void *ctx = NULL;
> +	u32 retval;
> +	int err = 0;
> +
> +	/* doesn't support data_in/out, ctx_out, duration, or repeat or flags */
> +	if (kattr->test.data_in || kattr->test.data_out ||
> +	    kattr->test.ctx_out || kattr->test.duration ||
> +	    kattr->test.repeat || kattr->test.flags)
> +		return -EINVAL;
> +
> +	if (ctx_size_in < prog->aux->max_ctx_offset ||
> +	    ctx_size_in > U16_MAX)
> +		return -EINVAL;
> +
> +	if (ctx_size_in) {
> +		ctx = kzalloc(ctx_size_in, GFP_USER);
> +		if (!ctx)
> +			return -ENOMEM;
> +		if (copy_from_user(ctx, ctx_in, ctx_size_in)) {
> +			err = -EFAULT;
> +			goto out;
> +		}
> +	}
> +	retval = bpf_prog_run_pin_on_cpu(prog, ctx);
> +
> +	if (copy_to_user(&uattr->test.retval, &retval, sizeof(u32)))
> +		err = -EFAULT;
> +	if (ctx_size_in)
> +		if (copy_to_user(ctx_in, ctx, ctx_size_in)) {
> +			err = -EFAULT;
> +			goto out;
> +		}
> +out:
> +	kfree(ctx);
> +	return err;
> +}
[...]


* Re: [PATCH v2 bpf-next 01/16] bpf: Introduce bpf_sys_bpf() helper and program type.
  2021-04-23 18:15   ` Yonghong Song
@ 2021-04-23 18:28     ` Alexei Starovoitov
  2021-04-23 19:32       ` Yonghong Song
  0 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23 18:28 UTC (permalink / raw)
  To: Yonghong Song
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, Kernel Team

On Fri, Apr 23, 2021 at 11:16 AM Yonghong Song <yhs@fb.com> wrote:
> > +
> > +static bool syscall_prog_is_valid_access(int off, int size,
> > +                                      enum bpf_access_type type,
> > +                                      const struct bpf_prog *prog,
> > +                                      struct bpf_insn_access_aux *info)
> > +{
> > +     if (off < 0 || off >= U16_MAX)
> > +             return false;
>
> Is this enough? If I understand correctly, the new program type
> allows any arbitrary context data from user as long as its size
> meets the following constraints:
>     if (ctx_size_in < prog->aux->max_ctx_offset ||
>             ctx_size_in > U16_MAX)
>                 return -EINVAL;
>
> So if user provides a ctx with size say 40 and inside the program looks
> it is still able to read/write to say offset 400.
> Should we be a little more restrictive on this?

At load time the program can have a read/write at offset 400,
but it will be rejected at prog_test_run time.
That's similar to tp and raw_tp test_runs and attaches.
That's why test_run has the check you've quoted.
It's a two-step verification.
The verifier rejects off < 0 || off >= U16_MAX right away and
keeps track of max_ctx_offset.
Then at attach/test_run the final check is done against the actual ctx_size_in.
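
The two-step check described above can be sketched as a small userspace
model. This is a sketch under stated assumptions, not kernel code:
struct prog_model and both function names are hypothetical, the
max_ctx_offset bookkeeping (done elsewhere in the verifier) is folded
into the load-time function for brevity, and the size <= 0 guard exists
only so the model stands alone.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define U16_MAX 0xffff

/* Hypothetical stand-in for the one bpf_prog_aux field used here. */
struct prog_model {
	uint32_t max_ctx_offset; /* end of the furthest ctx access seen */
};

/*
 * Step 1, load time: only range and alignment are checked, because the
 * real ctx size is not known yet. The furthest access is recorded.
 */
static bool load_time_check(struct prog_model *p, int off, int size)
{
	if (size <= 0) /* guard added for this standalone model only */
		return false;
	if (off < 0 || off >= U16_MAX)
		return false;
	if (off % size != 0)
		return false;
	if ((uint32_t)(off + size) > p->max_ctx_offset)
		p->max_ctx_offset = (uint32_t)(off + size);
	return true;
}

/*
 * Step 2, test_run time: the ctx supplied by the user must cover every
 * access that was accepted at load time.
 */
static int run_time_check(const struct prog_model *p, uint32_t ctx_size_in)
{
	if (ctx_size_in < p->max_ctx_offset || ctx_size_in > U16_MAX)
		return -EINVAL;
	return 0;
}
```

In this model an 8-byte access at offset 400 passes load_time_check()
and records max_ctx_offset = 408. A later run_time_check() with a
40-byte ctx returns -EINVAL, while a 408-byte ctx is accepted.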


* Re: [PATCH v2 bpf-next 01/16] bpf: Introduce bpf_sys_bpf() helper and program type.
  2021-04-23 18:28     ` Alexei Starovoitov
@ 2021-04-23 19:32       ` Yonghong Song
  0 siblings, 0 replies; 52+ messages in thread
From: Yonghong Song @ 2021-04-23 19:32 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko,
	Network Development, bpf, Kernel Team



On 4/23/21 11:28 AM, Alexei Starovoitov wrote:
> On Fri, Apr 23, 2021 at 11:16 AM Yonghong Song <yhs@fb.com> wrote:
>>> +
>>> +static bool syscall_prog_is_valid_access(int off, int size,
>>> +                                      enum bpf_access_type type,
>>> +                                      const struct bpf_prog *prog,
>>> +                                      struct bpf_insn_access_aux *info)
>>> +{
>>> +     if (off < 0 || off >= U16_MAX)
>>> +             return false;
>>
>> Is this enough? If I understand correctly, the new program type
>> allows any arbitrary context data from user as long as its size
>> meets the following constraints:
>>      if (ctx_size_in < prog->aux->max_ctx_offset ||
>>              ctx_size_in > U16_MAX)
>>                  return -EINVAL;
>>
>> So if user provides a ctx with size say 40 and inside the program looks
>> it is still able to read/write to say offset 400.
>> Should we be a little more restrictive on this?
> 
> At the load time the program can have a read/write at offset 400,
> but it will be rejected at prog_test_run time.
> That's similar to tp and raw_tp test_run-s and attach-es.
> That's why test_run has that check you've quoted.
> It's a two step verification.
> The verifier rejects <0 || > u16_max right away and
> keeps the track of max_ctx_offset.
> Then at attach/test_run the final check is done with an actual ctx_size_in.

Thanks! That is indeed the case. Somehow, although I copy-pasted it,
I missed the check "ctx_size_in < prog->aux->max_ctx_offset"...


* Re: [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton.
  2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
                   ` (15 preceding siblings ...)
  2021-04-23  0:26 ` [PATCH v2 bpf-next 16/16] selftests/bpf: Convert few tests to light skeleton Alexei Starovoitov
@ 2021-04-23 21:36 ` Yonghong Song
  2021-04-23 23:16   ` Alexei Starovoitov
  16 siblings, 1 reply; 52+ messages in thread
From: Yonghong Song @ 2021-04-23 21:36 UTC (permalink / raw)
  To: Alexei Starovoitov, davem; +Cc: daniel, andrii, netdev, bpf, kernel-team



On 4/22/21 5:26 PM, Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
> 
> v1->v2: Addressed comments from Al, Yonghong and Andrii.
> - documented sys_close fdget/fdput requirement and non-recursion check.
> - reduced internal api leaks between libbpf and bpftool.
>    Now bpf_object__gen_loader() is the only new libbf api with minimal fields.
> - fixed light skeleton __destroy() method to munmap and close maps and progs.
> - refactored bpf_btf_find_by_name_kind to return btf_id | (btf_obj_fd << 32).
> - refactored use of bpf_btf_find_by_name_kind from loader prog.
> - moved auto-gen like code into skel_internal.h that is used by *.lskel.h
>    It has minimal static inline bpf_load_and_run() method used by lskel.
> - added lksel.h example in patch 15.
> - replaced union bpf_map_prog_desc with struct bpf_map_desc and struct bpf_prog_desc.
> - removed mark_feat_supported and added a patch to pass 'obj' into kernel_supports.
> - added proper tracking of temporary FDs in loader prog and their cleanup via bpf_sys_close.
> - rename gen_trace.c into gen_loader.c to better align the naming throughout.
> - expanded number of available helpers in new prog type.
> - added support for raw_tp attaching in lskel.
>    lskel supports tracing and raw_tp progs now.
>    It correctly loads all networking prog types too, but __attach() method is tbd.
> - converted progs/test_ksyms_module.c to lskel.
> - minor feedback fixes all over.
> 
> One thing that was not addressed from feedback is the name of new program type.
> Currently it's still:
> BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */

Do you have plans for other non-bpf syscalls? Maybe use the name
BPF_PROG_TYPE_BPF_SYSCALL? That would make it really clear that this is
the program type that can execute bpf syscalls.

> 
> The concern raised was that it sounds like a program that should be attached
> to a syscall. Like BPF_PROG_TYPE_KPROBE is used to process kprobes.
> I've considered and rejected:
> BPF_PROG_TYPE_USER - too generic
> BPF_PROG_TYPE_USERCTX - ambiguous with uprobes

USERCTX is probably not a good choice. People can write a program without
a context, put the ctx data into a map, and use it from there.

> BPF_PROG_TYPE_LOADER - ok-ish, but imo TYPE_SYSCALL is cleaner.

A user can write a program that does more than loading, although I am not
sure how useful that is compared to an implementation in user space.

> Other suggestions?
> 
> The description of V1 set is still valid:
> ----
[...]


* Re: [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton.
  2021-04-23 21:36 ` [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, " Yonghong Song
@ 2021-04-23 23:16   ` Alexei Starovoitov
  0 siblings, 0 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-23 23:16 UTC (permalink / raw)
  To: Yonghong Song; +Cc: davem, daniel, andrii, netdev, bpf, kernel-team

On Fri, Apr 23, 2021 at 02:36:43PM -0700, Yonghong Song wrote:
> > 
> > One thing that was not addressed from feedback is the name of new program type.
> > Currently it's still:
> > BPF_PROG_TYPE_SYSCALL, /* a program that can execute syscalls */
> 
> Do you have plan for other non-bpf syscalls? Maybe use the name
> BPF_PROG_TYPE_BPF_SYSCALL? It will be really clear this is
> the program type you can execute bpf syscalls.

In this patch set it's already doing sys_bpf and sys_close syscalls :)

> > 
> > The concern raised was that it sounds like a program that should be attached
> > to a syscall. Like BPF_PROG_TYPE_KPROBE is used to process kprobes.
> > I've considered and rejected:
> > BPF_PROG_TYPE_USER - too generic
> > BPF_PROG_TYPE_USERCTX - ambiguous with uprobes
> 
> USERCTX probably not a good choice. People can write a program without
> context and put the ctx into a map and use it.
> 
> > BPF_PROG_TYPE_LOADER - ok-ish, but imo TYPE_SYSCALL is cleaner.
> 
> User can write a program to do more than loading although I am not sure
> how useful it is compared to implementation in user space.

Exactly.
BPF_PROG_TYPE_SYSCALL alone can be used as a more generic equivalent
of the sys_close_range syscall.
If somebody needs to close a sparse set of FDs or get fd_to_be_closed
from a map they can craft a bpf prog that would do that.
Or if somebody wants to do batched map processing...
instead of doing sys_bpf() with BPF_MAP_UPDATE_BATCH they can craft
a bpf prog.
Plenty of use cases beyond LOADER.

This patch set only allows BPF_PROG_TYPE_SYSCALL to be executed
via prog_test_run, but I think it's safe to execute it upon entry
to pretty much any syscall.
So _SYSCALL suffix fits as both "a program that can execute syscalls"
and as "a program that attaches to syscalls".
The latter is not implemented yet, but would fit right in.


* Re: [PATCH v2 bpf-next 01/16] bpf: Introduce bpf_sys_bpf() helper and program type.
  2021-04-23  0:26 ` [PATCH v2 bpf-next 01/16] bpf: Introduce bpf_sys_bpf() helper and program type Alexei Starovoitov
  2021-04-23 18:15   ` Yonghong Song
@ 2021-04-26 16:51   ` Andrii Nakryiko
  2021-04-27 18:45   ` John Fastabend
  2 siblings, 0 replies; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-26 16:51 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Thu, Apr 22, 2021 at 5:26 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> Add placeholders for bpf_sys_bpf() helper and new program type.
>
> v1->v2:
> - check that expected_attach_type is zero
> - allow more helper functions to be used in this program type, since they will
>   only execute from user context via bpf_prog_test_run.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

LGTM, see minor comments below.

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  include/linux/bpf.h            | 10 +++++++
>  include/linux/bpf_types.h      |  2 ++
>  include/uapi/linux/bpf.h       |  8 +++++
>  kernel/bpf/syscall.c           | 54 ++++++++++++++++++++++++++++++++++
>  net/bpf/test_run.c             | 43 +++++++++++++++++++++++++++
>  tools/include/uapi/linux/bpf.h |  8 +++++
>  6 files changed, 125 insertions(+)
>

[...]

> +
> +const struct bpf_func_proto * __weak
> +tracing_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> +{
> +

extra empty line

> +       return bpf_base_func_proto(func_id);
> +}
> +
> +static const struct bpf_func_proto *
> +syscall_prog_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
> +{
> +       switch (func_id) {
> +       case BPF_FUNC_sys_bpf:
> +               return &bpf_sys_bpf_proto;
> +       default:
> +               return tracing_prog_func_proto(func_id, prog);
> +       }
> +}
> +

[...]

> +       if (ctx_size_in) {
> +               ctx = kzalloc(ctx_size_in, GFP_USER);
> +               if (!ctx)
> +                       return -ENOMEM;
> +               if (copy_from_user(ctx, ctx_in, ctx_size_in)) {
> +                       err = -EFAULT;
> +                       goto out;
> +               }
> +       }
> +       retval = bpf_prog_run_pin_on_cpu(prog, ctx);
> +
> +       if (copy_to_user(&uattr->test.retval, &retval, sizeof(u32)))
> +               err = -EFAULT;

Is there a point in attempting the second copy_to_user if this one fails?
I.e., why not goto out here?

> +       if (ctx_size_in)
> +               if (copy_to_user(ctx_in, ctx, ctx_size_in)) {
> +                       err = -EFAULT;
> +                       goto out;
> +               }
> +out:
> +       kfree(ctx);
> +       return err;
> +}

[...]
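
To make the copy_to_user question above concrete, here is a minimal
userspace sketch of the two control flows. It is an illustration only:
copy_out() is a hypothetical stub for copy_to_user(), the copies counter
exists just to make the difference observable, and the ctx_size_in guard
from the patch is dropped by assuming a non-empty ctx.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stub for copy_to_user(): nonzero means the copy failed. */
static int copy_out(bool fail)
{
	return fail ? 1 : 0;
}

/* As posted: a failed retval copy still falls through to the ctx copy. */
static int finish_as_posted(bool retval_fails, bool ctx_fails, int *copies)
{
	int err = 0;

	(*copies)++;
	if (copy_out(retval_fails))
		err = -EFAULT;
	(*copies)++;
	if (copy_out(ctx_fails)) {
		err = -EFAULT;
		goto out;
	}
out:
	return err;
}

/* With the suggested early exit: stop at the first failed copy. */
static int finish_early_exit(bool retval_fails, bool ctx_fails, int *copies)
{
	int err = 0;

	(*copies)++;
	if (copy_out(retval_fails)) {
		err = -EFAULT;
		goto out;
	}
	(*copies)++;
	if (copy_out(ctx_fails))
		err = -EFAULT;
out:
	return err;
}
```

Both variants return -EFAULT when the first copy fails. The only
difference in this model is that the early-exit form skips the second
copy, so the choice affects wasted work, not the reported error.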


* Re: [PATCH v2 bpf-next 05/16] selftests/bpf: Test for syscall program type
  2021-04-23  0:26 ` [PATCH v2 bpf-next 05/16] selftests/bpf: Test " Alexei Starovoitov
@ 2021-04-26 17:02   ` Andrii Nakryiko
  2021-04-27  2:43     ` Alexei Starovoitov
  0 siblings, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-26 17:02 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Thu, Apr 22, 2021 at 5:26 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> bpf_prog_type_syscall is a program that creates a bpf map,
> updates it, and loads another bpf program using bpf_sys_bpf() helper.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  tools/testing/selftests/bpf/Makefile          |  1 +
>  .../selftests/bpf/prog_tests/syscall.c        | 53 ++++++++++++++
>  tools/testing/selftests/bpf/progs/syscall.c   | 73 +++++++++++++++++++
>  3 files changed, 127 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/syscall.c
>  create mode 100644 tools/testing/selftests/bpf/progs/syscall.c
>
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index c5bcdb3d4b12..9fdfdbc61857 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -278,6 +278,7 @@ MENDIAN=$(if $(IS_LITTLE_ENDIAN),-mlittle-endian,-mbig-endian)
>  CLANG_SYS_INCLUDES = $(call get_sys_includes,$(CLANG))
>  BPF_CFLAGS = -g -D__TARGET_ARCH_$(SRCARCH) $(MENDIAN)                  \
>              -I$(INCLUDE_DIR) -I$(CURDIR) -I$(APIDIR)                   \
> +            -I$(TOOLSINCDIR) \

is this for filter.h? also, please align \ with the previous line


>              -I$(abspath $(OUTPUT)/../usr/include)
>
>  CLANG_CFLAGS = $(CLANG_SYS_INCLUDES) \
> diff --git a/tools/testing/selftests/bpf/prog_tests/syscall.c b/tools/testing/selftests/bpf/prog_tests/syscall.c
> new file mode 100644
> index 000000000000..e550e36bb5da
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/syscall.c
> @@ -0,0 +1,53 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2021 Facebook */
> +#include <test_progs.h>
> +#include "syscall.skel.h"
> +
> +struct args {
> +       __u64 log_buf;
> +       __u32 log_size;
> +       int max_entries;
> +       int map_fd;
> +       int prog_fd;
> +};
> +
> +void test_syscall(void)
> +{
> +       static char verifier_log[8192];
> +       struct args ctx = {
> +               .max_entries = 1024,
> +               .log_buf = (uintptr_t) verifier_log,
> +               .log_size = sizeof(verifier_log),
> +       };
> +       struct bpf_prog_test_run_attr tattr = {
> +               .ctx_in = &ctx,
> +               .ctx_size_in = sizeof(ctx),
> +       };
> +       struct syscall *skel = NULL;
> +       __u64 key = 12, value = 0;
> +       __u32 duration = 0;
> +       int err;
> +
> +       skel = syscall__open_and_load();
> +       if (CHECK(!skel, "skel_load", "syscall skeleton failed\n"))
> +               goto cleanup;
> +
> +       tattr.prog_fd = bpf_program__fd(skel->progs.bpf_prog);
> +       err = bpf_prog_test_run_xattr(&tattr);
> +       if (CHECK(err || tattr.retval != 1, "test_run sys_bpf",
> +                 "err %d errno %d retval %d duration %d\n",
> +                 err, errno, tattr.retval, tattr.duration))
> +               goto cleanup;
> +
> +       CHECK(ctx.map_fd <= 0, "map_fd", "fd = %d\n", ctx.map_fd);
> +       CHECK(ctx.prog_fd <= 0, "prog_fd", "fd = %d\n", ctx.prog_fd);

please use ASSERT_xxx() macros everywhere. I've just added
ASSERT_GT(), so once that patch set lands you should have all the
variants you need.

> +       CHECK(memcmp(verifier_log, "processed", sizeof("processed") - 1) != 0,
> +             "verifier_log", "%s\n", verifier_log);
> +
> +       err = bpf_map_lookup_elem(ctx.map_fd, &key, &value);
> +       CHECK(err, "map_lookup", "map_lookup failed\n");
> +       CHECK(value != 34, "invalid_value",
> +             "got value %llu expected %u\n", value, 34);
> +cleanup:
> +       syscall__destroy(skel);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/syscall.c b/tools/testing/selftests/bpf/progs/syscall.c
> new file mode 100644
> index 000000000000..01476f88e45f
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/syscall.c
> @@ -0,0 +1,73 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2021 Facebook */
> +#include <linux/stddef.h>
> +#include <linux/bpf.h>
> +#include <bpf/bpf_helpers.h>
> +#include <bpf/bpf_tracing.h>
> +#include <../../tools/include/linux/filter.h>

with TOOLSINCDIR shouldn't this be just <linux/filter.h>?

> +
> +volatile const int workaround = 1;

not needed anymore?

> +
> +char _license[] SEC("license") = "GPL";
> +
> +struct args {
> +       __u64 log_buf;
> +       __u32 log_size;
> +       int max_entries;
> +       int map_fd;
> +       int prog_fd;
> +};
> +

[...]

* Re: [PATCH v2 bpf-next 09/16] libbpf: Support for fd_idx
  2021-04-23  0:26 ` [PATCH v2 bpf-next 09/16] libbpf: Support for fd_idx Alexei Starovoitov
@ 2021-04-26 17:14   ` Andrii Nakryiko
  2021-04-27  2:53     ` Alexei Starovoitov
  0 siblings, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-26 17:14 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> Add support for FD_IDX and make libbpf prefer that approach to loading programs.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  tools/lib/bpf/bpf.c             |  1 +
>  tools/lib/bpf/libbpf.c          | 70 +++++++++++++++++++++++++++++----
>  tools/lib/bpf/libbpf_internal.h |  1 +
>  3 files changed, 65 insertions(+), 7 deletions(-)
>

[...]

> +static int probe_kern_fd_idx(void)
> +{
> +       struct bpf_load_program_attr attr;
> +       struct bpf_insn insns[] = {
> +               BPF_LD_IMM64_RAW(BPF_REG_0, BPF_PSEUDO_MAP_IDX, 0),
> +               BPF_EXIT_INSN(),
> +       };
> +
> +       memset(&attr, 0, sizeof(attr));
> +       attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
> +       attr.insns = insns;
> +       attr.insns_cnt = ARRAY_SIZE(insns);
> +       attr.license = "GPL";
> +
> +       probe_fd(bpf_load_program_xattr(&attr, NULL, 0));

probe_fd() calls close(fd) internally, which technically can interfere
with errno, though close() shouldn't be called because syscall has to
fail on correct kernels... So this should work, but I feel like
open-coding that logic is better than ignoring probe_fd() result.
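
A minimal userspace sketch of the open-coded variant I have in mind.
try_probe() and probe_errno() are hypothetical names, and open() only
stands in for bpf_load_program_xattr(); the point is just that errno is
saved before close() can touch it:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for the probing syscall: returns an fd on
 * success, or -1 with errno set on failure.
 */
static int try_probe(const char *path)
{
	return open(path, O_RDONLY);
}

/* Open-coded probe: capture errno right after the call, so a later
 * close() cannot overwrite it. Returns 0 if the call succeeded.
 */
static int probe_errno(const char *path)
{
	int fd, err;

	assert(path != NULL);
	fd = try_probe(path);
	err = fd < 0 ? errno : 0;
	if (fd >= 0)
		close(fd);
	return err;
}
```

With something like this, the probe would reduce to comparing the
saved value against EPROTO.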

> +       return errno == EPROTO;
> +}
> +

[...]

> @@ -7239,6 +7279,8 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
>         struct bpf_program *prog;
>         size_t i;
>         int err;
> +       struct bpf_map *map;
> +       int *fd_array = NULL;
>
>         for (i = 0; i < obj->nr_programs; i++) {
>                 prog = &obj->programs[i];
> @@ -7247,6 +7289,16 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
>                         return err;
>         }
>
> +       if (kernel_supports(FEAT_FD_IDX) && obj->nr_maps) {
> +               fd_array = malloc(sizeof(int) * obj->nr_maps);
> +               if (!fd_array)
> +                       return -ENOMEM;
> +               for (i = 0; i < obj->nr_maps; i++) {
> +                       map = &obj->maps[i];
> +                       fd_array[i] = map->fd;

nit: obj->maps[i].fd will keep it a single line

> +               }
> +       }
> +
>         for (i = 0; i < obj->nr_programs; i++) {
>                 prog = &obj->programs[i];
>                 if (prog_is_subprog(obj, prog))
> @@ -7256,10 +7308,14 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
>                         continue;
>                 }
>                 prog->log_level |= log_level;
> +               prog->fd_array = fd_array;

you are not freeing this memory on success, as far as I can see. And
given multiple programs are sharing fd_array, it's a bit problematic
for prog to have fd_array. This is a per-object property, so let's add
it at bpf_object level and clean it up on bpf_object__close()? And by
assigning to obj->fd_array at malloc() site, you won't need to do all
the error-handling free()s below.
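
Roughly like this, as a sketch only. The toy_* structs and functions
are hypothetical stand-ins for bpf_object and bpf_map, not the real
libbpf definitions:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct toy_map {
	int fd;
};

struct toy_object {
	struct toy_map *maps;
	size_t nr_maps;
	int *fd_array;	/* owned by the object, shared by all programs */
};

/* Allocate and fill fd_array once; the object owns it from here on,
 * so no caller needs its own error-path free().
 */
static int toy_object__init_fd_array(struct toy_object *obj)
{
	size_t i;

	assert(obj->maps || !obj->nr_maps);
	if (!obj->nr_maps)
		return 0;
	obj->fd_array = malloc(sizeof(int) * obj->nr_maps);
	if (!obj->fd_array)
		return -ENOMEM;
	for (i = 0; i < obj->nr_maps; i++)
		obj->fd_array[i] = obj->maps[i].fd;
	return 0;
}

/* Single cleanup point, mirroring what a close() on the object would do. */
static void toy_object__close(struct toy_object *obj)
{
	free(obj->fd_array);
	obj->fd_array = NULL;
}
```

Each program would then read obj->fd_array at load time instead of
carrying its own pointer.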

>                 err = bpf_program__load(prog, obj->license, obj->kern_version);
> -               if (err)
> +               if (err) {
> +                       free(fd_array);
>                         return err;
> +               }
>         }
> +       free(fd_array);
>         return 0;
>  }
>
> diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
> index 6017902c687e..9114c7085f2a 100644
> --- a/tools/lib/bpf/libbpf_internal.h
> +++ b/tools/lib/bpf/libbpf_internal.h
> @@ -204,6 +204,7 @@ struct bpf_prog_load_params {
>         __u32 log_level;
>         char *log_buf;
>         size_t log_buf_sz;
> +       int *fd_array;
>  };
>
>  int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr);
> --
> 2.30.2
>

* Re: [PATCH v2 bpf-next 12/16] libbpf: Change the order of data and text relocations.
  2021-04-23  0:26 ` [PATCH v2 bpf-next 12/16] libbpf: Change the order of data and text relocations Alexei Starovoitov
@ 2021-04-26 17:29   ` Andrii Nakryiko
  2021-04-27  3:00     ` Alexei Starovoitov
  0 siblings, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-26 17:29 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> In order to be able to generate loader program in the later
> patches change the order of data and text relocations.
> Also improve the test to include data relos.
>
> If the kernel supports "FD array" the map_fd relocations can be processed
> before text relos since generated loader program won't need to manually
> patch ld_imm64 insns with map_fd.
> But ksym and kfunc relocations can only be processed after all calls
> are relocated, since loader program will consist of a sequence
> of calls to bpf_btf_find_by_name_kind() followed by patching of btf_id
> and btf_obj_fd into corresponding ld_imm64 insns. The locations of those
> ld_imm64 insns are specified in relocations.
> Hence process all data relocations (maps, ksym, kfunc) together after call relos.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  tools/lib/bpf/libbpf.c                        | 86 +++++++++++++++----
>  .../selftests/bpf/progs/test_subprogs.c       | 13 +++
>  2 files changed, 80 insertions(+), 19 deletions(-)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 17cfc5b66111..c73a85b97ca5 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -6379,11 +6379,15 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
>                         insn[0].imm = ext->ksym.kernel_btf_id;
>                         break;
>                 case RELO_SUBPROG_ADDR:
> -                       insn[0].src_reg = BPF_PSEUDO_FUNC;
> -                       /* will be handled as a follow up pass */
> +                       if (insn[0].src_reg != BPF_PSEUDO_FUNC) {
> +                               pr_warn("prog '%s': relo #%d: bad insn\n",
> +                                       prog->name, i);
> +                               return -EINVAL;
> +                       }

given SUBPROG_ADDR is now handled similarly to RELO_CALL in a
different place, I'd probably drop this error check and just combine
RELO_SUBPROG_ADDR and RELO_CALL cases with just a /* handled already
*/ comment.

> +                       /* handled already */
>                         break;
>                 case RELO_CALL:
> -                       /* will be handled as a follow up pass */
> +                       /* handled already */
>                         break;
>                 default:
>                         pr_warn("prog '%s': relo #%d: bad relo type %d\n",
> @@ -6552,6 +6556,31 @@ static struct reloc_desc *find_prog_insn_relo(const struct bpf_program *prog, si
>                        sizeof(*prog->reloc_desc), cmp_relo_by_insn_idx);
>  }
>
> +static int append_subprog_relos(struct bpf_program *main_prog, struct bpf_program *subprog)
> +{
> +       int new_cnt = main_prog->nr_reloc + subprog->nr_reloc;
> +       struct reloc_desc *relos;
> +       size_t off = subprog->sub_insn_off;
> +       int i;
> +
> +       if (main_prog == subprog)
> +               return 0;
> +       relos = libbpf_reallocarray(main_prog->reloc_desc, new_cnt, sizeof(*relos));
> +       if (!relos)
> +               return -ENOMEM;
> +       memcpy(relos + main_prog->nr_reloc, subprog->reloc_desc,
> +              sizeof(*relos) * subprog->nr_reloc);
> +
> +       for (i = main_prog->nr_reloc; i < new_cnt; i++)
> +               relos[i].insn_idx += off;

nit: off is used only here, so there is little point in having it as a
separate var, inline?

> +       /* After insn_idx adjustment the 'relos' array is still sorted
> +        * by insn_idx and doesn't break bsearch.
> +        */
> +       main_prog->reloc_desc = relos;
> +       main_prog->nr_reloc = new_cnt;
> +       return 0;
> +}
> +
>  static int
>  bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog,
>                        struct bpf_program *prog)
> @@ -6560,18 +6589,32 @@ bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog,
>         struct bpf_program *subprog;
>         struct bpf_insn *insns, *insn;
>         struct reloc_desc *relo;
> -       int err;
> +       int err, i;
>
>         err = reloc_prog_func_and_line_info(obj, main_prog, prog);
>         if (err)
>                 return err;
>
> +       for (i = 0; i < prog->nr_reloc; i++) {
> +               relo = &prog->reloc_desc[i];
> +               insn = &main_prog->insns[prog->sub_insn_off + relo->insn_idx];
> +
> +               if (relo->type == RELO_SUBPROG_ADDR)
> +                       /* mark the insn, so it becomes insn_is_pseudo_func() */
> +                       insn[0].src_reg = BPF_PSEUDO_FUNC;
> +       }
> +

This will do the same work over and over each time we append a subprog
to main_prog. This should logically follow append_subprog_relos(), but
you wanted to do it for main_prog with the same code, right?

How about instead doing this before we start appending subprogs to
main_progs? I.e., do it explicitly in bpf_object__relocate() before
you start code relocation loop.

>         for (insn_idx = 0; insn_idx < prog->sec_insn_cnt; insn_idx++) {
>                 insn = &main_prog->insns[prog->sub_insn_off + insn_idx];
>                 if (!insn_is_subprog_call(insn) && !insn_is_pseudo_func(insn))
>                         continue;
>
>                 relo = find_prog_insn_relo(prog, insn_idx);
> +               if (relo && relo->type == RELO_EXTERN_FUNC)
> +                       /* kfunc relocations will be handled later
> +                        * in bpf_object__relocate_data()
> +                        */
> +                       continue;
>                 if (relo && relo->type != RELO_CALL && relo->type != RELO_SUBPROG_ADDR) {
>                         pr_warn("prog '%s': unexpected relo for insn #%zu, type %d\n",
>                                 prog->name, insn_idx, relo->type);

[...]

* Re: [PATCH v2 bpf-next 13/16] libbpf: Add bpf_object pointer to kernel_supports().
  2021-04-23  0:26 ` [PATCH v2 bpf-next 13/16] libbpf: Add bpf_object pointer to kernel_supports() Alexei Starovoitov
@ 2021-04-26 17:30   ` Andrii Nakryiko
  0 siblings, 0 replies; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-26 17:30 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> Add a pointer to 'struct bpf_object' to kernel_supports() helper.
> It will be used in the next patch.
> No functional changes.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

LGTM.

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  tools/lib/bpf/libbpf.c | 52 +++++++++++++++++++++---------------------
>  1 file changed, 26 insertions(+), 26 deletions(-)
>

[...]

* Re: [PATCH v2 bpf-next 14/16] libbpf: Generate loader program out of BPF ELF file.
  2021-04-23  0:26 ` [PATCH v2 bpf-next 14/16] libbpf: Generate loader program out of BPF ELF file Alexei Starovoitov
@ 2021-04-26 22:22   ` Andrii Nakryiko
  2021-04-27  3:25     ` Alexei Starovoitov
  0 siblings, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-26 22:22 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> The BPF program loading process performed by libbpf is quite complex
> and consists of the following steps:
> "open" phase:
> - parse elf file and remember relocations, sections
> - collect externs and ksyms including their btf_ids in prog's BTF
> - patch BTF datasec (since llvm couldn't do it)
> - init maps (old style map_def, BTF based, global data map, kconfig map)
> - collect relocations against progs and maps
> "load" phase:
> - probe kernel features
> - load vmlinux BTF
> - resolve externs (kconfig and ksym)
> - load program BTF
> - init struct_ops
> - create maps
> - apply CO-RE relocations
> - patch ld_imm64 insns with src_reg=PSEUDO_MAP, PSEUDO_MAP_VALUE, PSEUDO_BTF_ID
> - reposition subprograms and adjust call insns
> - sanitize and load progs
>
> During this process libbpf does sys_bpf() calls to load BTF, create maps,
> populate maps and finally load programs.
> Instead of actually doing the syscalls generate a trace of what libbpf
> would have done and represent it as the "loader program".
> The "loader program" consists of single map with:
> - union bpf_attr(s)
> - BTF bytes
> - map value bytes
> - insns bytes
> and single bpf program that passes bpf_attr(s) and data into bpf_sys_bpf() helper.
> Executing such "loader program" via bpf_prog_test_run() command will
> replay the sequence of syscalls that libbpf would have done which will result
> in the same maps created and programs loaded as specified in the elf file.
> The "loader program" removes libelf and majority of libbpf dependency from
> program loading process.
>
> kconfig, typeless ksym, struct_ops and CO-RE are not supported yet.
>
> The order of relocate_data and relocate_calls had to change, so that
> bpf_gen__prog_load() can see all relocations for a given program with
> correct insn_idx-es.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  tools/lib/bpf/Build              |   2 +-
>  tools/lib/bpf/bpf_gen_internal.h |  40 ++
>  tools/lib/bpf/gen_loader.c       | 615 +++++++++++++++++++++++++++++++
>  tools/lib/bpf/libbpf.c           | 204 ++++++++--
>  tools/lib/bpf/libbpf.h           |  12 +
>  tools/lib/bpf/libbpf.map         |   1 +
>  tools/lib/bpf/libbpf_internal.h  |   2 +
>  tools/lib/bpf/skel_internal.h    | 105 ++++++
>  8 files changed, 948 insertions(+), 33 deletions(-)
>  create mode 100644 tools/lib/bpf/bpf_gen_internal.h
>  create mode 100644 tools/lib/bpf/gen_loader.c
>  create mode 100644 tools/lib/bpf/skel_internal.h
>
> diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build
> index 9b057cc7650a..430f6874fa41 100644
> --- a/tools/lib/bpf/Build
> +++ b/tools/lib/bpf/Build
> @@ -1,3 +1,3 @@
>  libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \
>             netlink.o bpf_prog_linfo.o libbpf_probes.o xsk.o hashmap.o \
> -           btf_dump.o ringbuf.o strset.o linker.o
> +           btf_dump.o ringbuf.o strset.o linker.o gen_loader.o
> diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h
> new file mode 100644
> index 000000000000..dc3e2cbf9ce3
> --- /dev/null
> +++ b/tools/lib/bpf/bpf_gen_internal.h
> @@ -0,0 +1,40 @@
> +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
> +/* Copyright (c) 2021 Facebook */
> +#ifndef __BPF_GEN_INTERNAL_H
> +#define __BPF_GEN_INTERNAL_H
> +
> +struct relo_desc {

there is very similarly named reloc_desc struct in libbpf.c, can you
rename it to something like gen_btf_relo_desc?

> +       const char *name;
> +       int kind;
> +       int insn_idx;
> +};
> +

[...]

> +
> +static int bpf_gen__realloc_insn_buf(struct bpf_gen *gen, __u32 size)
> +{
> +       size_t off = gen->insn_cur - gen->insn_start;
> +
> +       if (gen->error)
> +               return gen->error;
> +       if (size > INT32_MAX || off + size > INT32_MAX) {
> +               gen->error = -ERANGE;
> +               return -ERANGE;
> +       }
> +       gen->insn_start = realloc(gen->insn_start, off + size);

leaking memory here: gen->insn_start will be NULL on failure
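
The usual fix is to realloc() into a temporary and only overwrite the
stored pointer on success. A sketch, using a hypothetical toy_gen
struct that keeps just the fields relevant here and omits the range
check:

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct toy_gen {
	char *insn_start;
	char *insn_cur;
	int error;
};

/* Grow the insn buffer by 'size' bytes without losing the old buffer
 * if realloc() fails.
 */
static int toy_gen__realloc_insn_buf(struct toy_gen *gen, size_t size)
{
	size_t off;
	char *tmp;

	if (gen->error)
		return gen->error;
	assert(gen->insn_cur >= gen->insn_start);
	off = gen->insn_cur - gen->insn_start;
	tmp = realloc(gen->insn_start, off + size);
	if (!tmp) {
		/* gen->insn_start is untouched, so it can still be freed later */
		gen->error = -ENOMEM;
		return -ENOMEM;
	}
	gen->insn_start = tmp;
	gen->insn_cur = tmp + off;
	return 0;
}
```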

> +       if (!gen->insn_start) {
> +               gen->error = -ENOMEM;
> +               return -ENOMEM;
> +       }
> +       gen->insn_cur = gen->insn_start + off;
> +       return 0;
> +}
> +
> +static int bpf_gen__realloc_data_buf(struct bpf_gen *gen, __u32 size)
> +{
> +       size_t off = gen->data_cur - gen->data_start;
> +
> +       if (gen->error)
> +               return gen->error;
> +       if (size > INT32_MAX || off + size > INT32_MAX) {
> +               gen->error = -ERANGE;
> +               return -ERANGE;
> +       }
> +       gen->data_start = realloc(gen->data_start, off + size);

same as above

> +       if (!gen->data_start) {
> +               gen->error = -ENOMEM;
> +               return -ENOMEM;
> +       }
> +       gen->data_cur = gen->data_start + off;
> +       return 0;
> +}
> +

[...]

> +
> +static void bpf_gen__emit_sys_bpf(struct bpf_gen *gen, int cmd, int attr, int attr_size)
> +{
> +       bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_1, cmd));
> +       bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_2, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, attr));

is attr an offset into a blob? if yes, attr_off? or attr_base_off,
anything with _off

> +       bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_3, attr_size));
> +       bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_bpf));
> +       /* remember the result in R7 */
> +       bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0));
> +}
> +
> +static void bpf_gen__emit_check_err(struct bpf_gen *gen)
> +{
> +       bpf_gen__emit(gen, BPF_JMP_IMM(BPF_JSGE, BPF_REG_7, 0, 2));
> +       bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_0, BPF_REG_7));
> +       bpf_gen__emit(gen, BPF_EXIT_INSN());
> +}
> +
> +static void __bpf_gen__debug(struct bpf_gen *gen, int reg1, int reg2, const char *fmt, va_list args)

Can you please leave a comment on what reg1 and reg2 are? It's not very
clear and the code clearly assumes that it can't be reg[1-4]. It's
probably those special R7 and R9 (or -1, of course), but having a
short comment makes sense to not jump around trying to figure out
possible inputs.

Oh, reading further, it can also be R0.

> +{
> +       char buf[1024];
> +       int addr, len, ret;
> +
> +       if (!gen->log_level)
> +               return;
> +       ret = vsnprintf(buf, sizeof(buf), fmt, args);
> +       if (ret < 1024 - 7 && reg1 >= 0 && reg2 < 0)
> +               /* The special case to accommodate common bpf_gen__debug_ret():
> +                * to avoid specifying BPF_REG_7 and adding " r=%%d" to prints explicitly.
> +                */
> +               strcat(buf, " r=%d");
> +       len = strlen(buf) + 1;
> +       addr = bpf_gen__add_data(gen, buf, len);

nit: offset, not address, right?

> +
> +       bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, addr));
> +       bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_2, len));
> +       if (reg1 >= 0)
> +               bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_3, reg1));
> +       if (reg2 >= 0)
> +               bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_4, reg2));
> +       bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_trace_printk));
> +}
> +

[...]

> +int bpf_gen__finish(struct bpf_gen *gen)
> +{
> +       int i;
> +
> +       bpf_gen__emit_sys_close_stack(gen, stack_off(btf_fd));
> +       for (i = 0; i < gen->nr_progs; i++)
> +               bpf_gen__move_stack2ctx(gen,
> +                                       sizeof(struct bpf_loader_ctx) +
> +                                       sizeof(struct bpf_map_desc) * gen->nr_maps +
> +                                       sizeof(struct bpf_prog_desc) * i +
> +                                       offsetof(struct bpf_prog_desc, prog_fd), 4,
> +                                       stack_off(prog_fd[i]));
> +       for (i = 0; i < gen->nr_maps; i++)
> +               bpf_gen__move_stack2ctx(gen,
> +                                       sizeof(struct bpf_loader_ctx) +
> +                                       sizeof(struct bpf_map_desc) * i +
> +                                       offsetof(struct bpf_map_desc, map_fd), 4,
> +                                       stack_off(map_fd[i]));
> +       bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_0, 0));
> +       bpf_gen__emit(gen, BPF_EXIT_INSN());
> +       pr_debug("bpf_gen__finish %d\n", gen->error);

maybe prefix all those pr_debug()s with "gen: " to distinguish them
from the rest of libbpf logging?

> +       if (!gen->error) {
> +               struct gen_loader_opts *opts = gen->opts;
> +
> +               opts->insns = gen->insn_start;
> +               opts->insns_sz = gen->insn_cur - gen->insn_start;
> +               opts->data = gen->data_start;
> +               opts->data_sz = gen->data_cur - gen->data_start;
> +       }
> +       return gen->error;
> +}
> +
> +void bpf_gen__free(struct bpf_gen *gen)
> +{
> +       if (!gen)
> +               return;
> +       free(gen->data_start);
> +       free(gen->insn_start);
> +       gen->data_start = NULL;
> +       gen->insn_start = NULL;

what's the point of NULL'ing them out if you don't clear gen->data_cur
and gen->insn_cur?

also should it free(gen) itself?

> +}
> +
> +void bpf_gen__load_btf(struct bpf_gen *gen, const void *btf_raw_data, __u32 btf_raw_size)
> +{
> +       union bpf_attr attr = {};

here and below: memset(0)?

> +       int attr_size = offsetofend(union bpf_attr, btf_log_level);
> +       int btf_data, btf_load_attr;
> +
> +       pr_debug("btf_load: size %d\n", btf_raw_size);
> +       btf_data = bpf_gen__add_data(gen, btf_raw_data, btf_raw_size);
> +

[...]

> +       map_create_attr = bpf_gen__add_data(gen, &attr, attr_size);
> +       if (attr.btf_value_type_id)
> +               /* populate union bpf_attr with btf_fd saved in the stack earlier */
> +               bpf_gen__move_stack2blob(gen, map_create_attr + offsetof(union bpf_attr, btf_fd), 4,
> +                                        stack_off(btf_fd));
> +       switch (attr.map_type) {
> +       case BPF_MAP_TYPE_ARRAY_OF_MAPS:
> +       case BPF_MAP_TYPE_HASH_OF_MAPS:
> +               bpf_gen__move_stack2blob(gen, map_create_attr + offsetof(union bpf_attr, inner_map_fd),
> +                                        4, stack_off(inner_map_fd));
> +               close_inner_map_fd = true;
> +               break;
> +       default:;

default:
    break;

> +       }
> +       /* emit MAP_CREATE command */
> +       bpf_gen__emit_sys_bpf(gen, BPF_MAP_CREATE, map_create_attr, attr_size);
> +       bpf_gen__debug_ret(gen, "map_create %s idx %d type %d value_size %d",
> +                          attr.map_name, map_idx, map_attr->map_type, attr.value_size);
> +       bpf_gen__emit_check_err(gen);

what will happen on error with inner_map_fd and all the other fds
created by now?

> +       /* remember map_fd in the stack, if successful */
> +       if (map_idx < 0) {
> +               /* This bpf_gen__map_create() function is called with map_idx >= 0 for all maps
> +                * that libbpf loading logic tracks.
> +                * It's called with -1 to create an inner map.
> +                */
> +               bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(inner_map_fd)));
> +       } else {
> +               if (map_idx != gen->nr_maps) {

why would that happen? defensive programming? and even then `if () {}
else if () {} else {}` structure is more appropriate

> +                       gen->error = -EDOM; /* internal bug */
> +                       return;
> +               }
> +               bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(map_fd[map_idx])));
> +               gen->nr_maps++;
> +       }
> +       if (close_inner_map_fd)
> +               bpf_gen__emit_sys_close_stack(gen, stack_off(inner_map_fd));
> +}
> +

[...]

> +static void bpf_gen__cleanup_relos(struct bpf_gen *gen, int insns)
> +{
> +       int i, insn;
> +
> +       for (i = 0; i < gen->relo_cnt; i++) {
> +               if (gen->relos[i].kind != BTF_KIND_VAR)
> +                       continue;
> +               /* close fd recorded in insn[insn_idx + 1].imm */
> +               insn = insns + sizeof(struct bpf_insn) * (gen->relos[i].insn_idx + 1)
> +                       + offsetof(struct bpf_insn, imm);
> +               bpf_gen__emit_sys_close_blob(gen, insn);

wouldn't this close the same FD used across multiple "relos" multiple times?

> +       }
> +       if (gen->relo_cnt) {
> +               free(gen->relos);
> +               gen->relo_cnt = 0;
> +               gen->relos = NULL;
> +       }
> +}
> +

[...]

> +       struct bpf_gen *gen_loader;
> +
>         /*
>          * Information when doing elf related work. Only valid if fd
>          * is valid.
> @@ -2651,7 +2654,15 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj)
>                 bpf_object__sanitize_btf(obj, kern_btf);
>         }
>
> -       err = btf__load(kern_btf);
> +       if (obj->gen_loader) {
> +               __u32 raw_size = 0;
> +               const void *raw_data = btf__get_raw_data(kern_btf, &raw_size);

this can return NULL on ENOMEM

> +
> +               bpf_gen__load_btf(obj->gen_loader, raw_data, raw_size);
> +               btf__set_fd(kern_btf, 0);

why set fd to 0 (stdin)? does gen depend on this somewhere? The
problem is that it will eventually be closed on btf__free(), which
will close stdin, causing a big surprise. What will happen if you
leave it at -1?
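
To illustrate the concern with a sketch: toy_btf and toy_btf__free()
are hypothetical, and I'm assuming the usual "close if fd >= 0"
cleanup. Under that assumption fd 0 counts as valid and gets closed,
while -1 is skipped:

```c
#include <assert.h>
#include <unistd.h>

struct toy_btf {
	int fd;	/* -1 means "not loaded"; 0 is a real descriptor (stdin) */
};

/* Close the descriptor if it looks valid. Returns 1 if close() was
 * called, 0 otherwise.
 */
static int toy_btf__free(struct toy_btf *btf)
{
	int closed = 0;

	assert(btf != NULL);
	if (btf->fd >= 0) {
		close(btf->fd);
		closed = 1;
	}
	btf->fd = -1;
	return closed;
}
```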


> +       } else {
> +               err = btf__load(kern_btf);
> +       }
>         if (sanitize) {
>                 if (!err) {
>                         /* move fd to libbpf's BTF */
> @@ -4262,6 +4273,12 @@ static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id f
>         struct kern_feature_desc *feat = &feature_probes[feat_id];
>         int ret;
>
> +       if (obj->gen_loader)
> +               /* To generate loader program assume the latest kernel
> +                * to avoid doing extra prog_load, map_create syscalls.
> +                */
> +               return true;
> +
>         if (READ_ONCE(feat->res) == FEAT_UNKNOWN) {
>                 ret = feat->probe();
>                 if (ret > 0) {
> @@ -4344,6 +4361,13 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
>         char *cp, errmsg[STRERR_BUFSIZE];
>         int err, zero = 0;
>
> +       if (obj->gen_loader) {
> +               bpf_gen__map_update_elem(obj->gen_loader, map - obj->maps,

it would be great for bpf_gen__map_update_elem to reflect that it's
not a generic map_update_elem() call, but rather a special internal map
update (just use bpf_gen__populate_internal_map?). Whether to freeze or
not could be just a flag to the same call, since they always go together.

> +                                        map->mmaped, map->def.value_size);
> +               if (map_type == LIBBPF_MAP_RODATA || map_type == LIBBPF_MAP_KCONFIG)
> +                       bpf_gen__map_freeze(obj->gen_loader, map - obj->maps);
> +               return 0;
> +       }
>         err = bpf_map_update_elem(map->fd, &zero, map->mmaped, 0);
>         if (err) {
>                 err = -errno;
> @@ -4369,7 +4393,7 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
>
>  static void bpf_map__destroy(struct bpf_map *map);
>
> -static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map)
> +static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, bool is_inner)
>  {
>         struct bpf_create_map_attr create_attr;
>         struct bpf_map_def *def = &map->def;
> @@ -4415,9 +4439,9 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map)
>
>         if (bpf_map_type__is_map_in_map(def->type)) {
>                 if (map->inner_map) {
> -                       int err;
> +                       int err = 0;

no need to initialize to zero, you are assigning it right below

>
> -                       err = bpf_object__create_map(obj, map->inner_map);
> +                       err = bpf_object__create_map(obj, map->inner_map, true);
>                         if (err) {
>                                 pr_warn("map '%s': failed to create inner map: %d\n",
>                                         map->name, err);

[...]

> @@ -4469,7 +4498,12 @@ static int init_map_slots(struct bpf_map *map)
>
>                 targ_map = map->init_slots[i];
>                 fd = bpf_map__fd(targ_map);
> -               err = bpf_map_update_elem(map->fd, &i, &fd, 0);
> +               if (obj->gen_loader) {
> +                       printf("// TODO map_update_elem: idx %ld key %d value==map_idx %ld\n",
> +                              map - obj->maps, i, targ_map - obj->maps);

return error for now?

> +               } else {
> +                       err = bpf_map_update_elem(map->fd, &i, &fd, 0);
> +               }
>                 if (err) {
>                         err = -errno;
>                         pr_warn("map '%s': failed to initialize slot [%d] to map '%s' fd=%d: %d\n",

[...]

> @@ -6082,6 +6119,11 @@ static int bpf_core_apply_relo(struct bpf_program *prog,
>         if (str_is_empty(spec_str))
>                 return -EINVAL;
>
> +       if (prog->obj->gen_loader) {
> +               printf("// TODO core_relo: prog %ld insn[%d] %s %s kind %d\n",
> +                      prog - prog->obj->programs, relo->insn_off / 8,
> +                      local_name, spec_str, relo->kind);

same, return error? Drop printf, maybe leave pr_debug()?

> +       }
>         err = bpf_core_parse_spec(local_btf, local_id, spec_str, relo->kind, &local_spec);
>         if (err) {
>                 pr_warn("prog '%s': relo #%d: parsing [%d] %s %s + %s failed: %d\n",
> @@ -6821,6 +6863,19 @@ bpf_object__relocate_calls(struct bpf_object *obj, struct bpf_program *prog)
>
>         return 0;
>  }

empty line here

> +static void
> +bpf_object__free_relocs(struct bpf_object *obj)
> +{
> +       struct bpf_program *prog;
> +       int i;
> +
> +       /* free up relocation descriptors */
> +       for (i = 0; i < obj->nr_programs; i++) {
> +               prog = &obj->programs[i];
> +               zfree(&prog->reloc_desc);
> +               prog->nr_reloc = 0;
> +       }
> +}
>

[...]

> +static int bpf_program__record_externs(struct bpf_program *prog)
> +{
> +       struct bpf_object *obj = prog->obj;
> +       int i;
> +
> +       for (i = 0; i < prog->nr_reloc; i++) {
> +               struct reloc_desc *relo = &prog->reloc_desc[i];
> +               struct extern_desc *ext = &obj->externs[relo->sym_off];
> +
> +               switch (relo->type) {
> +               case RELO_EXTERN_VAR:
> +                       if (ext->type != EXT_KSYM)
> +                               continue;
> +                       if (!ext->ksym.type_id) /* typeless ksym */
> +                               continue;

this shouldn't be silently ignored, if it's not supported, it should
return error

> +                       bpf_gen__record_extern(obj->gen_loader, ext->name, BTF_KIND_VAR,
> +                                              relo->insn_idx);
> +                       break;
> +               case RELO_EXTERN_FUNC:
> +                       bpf_gen__record_extern(obj->gen_loader, ext->name, BTF_KIND_FUNC,
> +                                              relo->insn_idx);
> +                       break;
> +               default:
> +                       continue;
> +               }
> +       }
> +       return 0;
> +}
> +

[...]

> @@ -7868,6 +7970,9 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr)
>         err = err ? : bpf_object__relocate(obj, attr->target_btf_path);
>         err = err ? : bpf_object__load_progs(obj, attr->log_level);
>
> +       if (obj->gen_loader && !err)
> +               err = bpf_gen__finish(obj->gen_loader);
> +
>         /* clean up module BTFs */
>         for (i = 0; i < obj->btf_module_cnt; i++) {
>                 close(obj->btf_modules[i].fd);
> @@ -8493,6 +8598,7 @@ void bpf_object__close(struct bpf_object *obj)
>         if (obj->clear_priv)
>                 obj->clear_priv(obj, obj->priv);

bpf_object__close() will close all those FD=0 in maps/progs, that's not good

>
> +       bpf_gen__free(obj->gen_loader);
>         bpf_object__elf_finish(obj);
>         bpf_object__unload(obj);
>         btf__free(obj->btf);

[...]

> @@ -9387,7 +9521,13 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
>         }
>
>         /* kernel/module BTF ID */
> -       err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
> +       if (prog->obj->gen_loader) {
> +               bpf_gen__record_attach_target(prog->obj->gen_loader, attach_name, attach_type);
> +               *btf_obj_fd = 0;

this will leak kernel module BTF FDs

> +               *btf_type_id = 1;
> +       } else {
> +               err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
> +       }
>         if (err) {
>                 pr_warn("failed to find kernel BTF type ID of '%s': %d\n", attach_name, err);
>                 return err;

[...]

> +out:
> +       close(map_fd);
> +       close(prog_fd);

this does close(-1), check >= 0


> +       return err;
> +}
> +
> +#endif
> --
> 2.30.2
>


* Re: [PATCH v2 bpf-next 04/16] libbpf: Support for syscall program type
  2021-04-23  0:26 ` [PATCH v2 bpf-next 04/16] libbpf: Support for syscall program type Alexei Starovoitov
@ 2021-04-26 22:24   ` Andrii Nakryiko
  0 siblings, 0 replies; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-26 22:24 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Thu, Apr 22, 2021 at 5:26 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> Trivial support for syscall program type.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  tools/lib/bpf/libbpf.c | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 9cc2d45b0080..254a0c9aa6cf 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -8899,6 +8899,7 @@ static const struct bpf_sec_def section_defs[] = {
>         BPF_PROG_SEC("struct_ops",              BPF_PROG_TYPE_STRUCT_OPS),
>         BPF_EAPROG_SEC("sk_lookup/",            BPF_PROG_TYPE_SK_LOOKUP,
>                                                 BPF_SK_LOOKUP),
> +       BPF_PROG_SEC("syscall",                 BPF_PROG_TYPE_SYSCALL),
>  };
>
>  #undef BPF_PROG_SEC_IMPL
> --
> 2.30.2
>


* Re: [PATCH v2 bpf-next 15/16] bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command.
  2021-04-23  0:26 ` [PATCH v2 bpf-next 15/16] bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command Alexei Starovoitov
@ 2021-04-26 22:35   ` Andrii Nakryiko
  2021-04-27  3:28     ` Alexei Starovoitov
  0 siblings, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-26 22:35 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> Add -L flag to bpftool to use libbpf gen_trace facility and syscall/loader program
> for skeleton generation and program loading.
>
> "bpftool gen skeleton -L" command will generate a "light skeleton" or "loader skeleton"
> that is similar to existing skeleton, but has one major difference:
> $ bpftool gen skeleton lsm.o > lsm.skel.h
> $ bpftool gen skeleton -L lsm.o > lsm.lskel.h
> $ diff lsm.skel.h lsm.lskel.h
> @@ -5,34 +4,34 @@
>  #define __LSM_SKEL_H__
>
>  #include <stdlib.h>
> -#include <bpf/libbpf.h>
> +#include <bpf/bpf.h>
>
> The light skeleton does not use the majority of libbpf infrastructure.
> It doesn't need libelf. It doesn't parse the .o file.
> It only needs a few sys_bpf wrappers. All of them are in the bpf/bpf.h file.
> In the future libbpf/bpf.c can be inlined into bpf.h, so not even libbpf.a would be
> needed to work with the light skeleton.
>
> "bpftool prog load -L file.o" command is introduced for debugging of syscall/loader
> program generation. Just like the same command without -L it will try to load
> the programs from file.o into the kernel. It won't even try to pin them.
>
> "bpftool prog load -L -d file.o" command will provide additional debug messages
> on how syscall/loader program was generated.
> Also the execution of syscall/loader program will use bpf_trace_printk() for
> each step of loading BTF, creating maps, and loading programs.
> The user can do "cat /.../trace_pipe" for further debug.
>
> An example of fexit_sleep.lskel.h generated from progs/fexit_sleep.c:
> struct fexit_sleep {
>         struct bpf_loader_ctx ctx;
>         struct {
>                 struct bpf_map_desc bss;
>         } maps;
>         struct {
>                 struct bpf_prog_desc nanosleep_fentry;
>                 struct bpf_prog_desc nanosleep_fexit;
>         } progs;
>         struct {
>                 int nanosleep_fentry_fd;
>                 int nanosleep_fexit_fd;
>         } links;
>         struct fexit_sleep__bss {
>                 int pid;
>                 int fentry_cnt;
>                 int fexit_cnt;
>         } *bss;
> };
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  tools/bpf/bpftool/Makefile        |   2 +-
>  tools/bpf/bpftool/gen.c           | 313 +++++++++++++++++++++++++++---
>  tools/bpf/bpftool/main.c          |   7 +-
>  tools/bpf/bpftool/main.h          |   1 +
>  tools/bpf/bpftool/prog.c          |  80 ++++++++
>  tools/bpf/bpftool/xlated_dumper.c |   3 +
>  6 files changed, 382 insertions(+), 24 deletions(-)
>

[...]

> @@ -268,6 +269,254 @@ static void codegen(const char *template, ...)
>         free(s);
>  }
>
> +static void print_hex(const char *obj_data, int file_sz)
> +{
> +       int i, len;
> +
> +       /* embed contents of BPF object file */

nit: this comment should have stayed at the original place

> +       for (i = 0, len = 0; i < file_sz; i++) {
> +               int w = obj_data[i] ? 4 : 2;
> +

[...]

> +       bpf_object__for_each_map(map, obj) {
> +               const char * ident;
> +
> +               ident = get_map_ident(map);
> +               if (!ident)
> +                       continue;
> +
> +               if (!bpf_map__is_internal(map) ||
> +                   !(bpf_map__def(map)->map_flags & BPF_F_MMAPABLE))
> +                       continue;
> +
> +               printf("\tskel->%1$s =\n"
> +                      "\t\tmmap(NULL, %2$zd, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED,\n"
> +                      "\t\t\tskel->maps.%1$s.map_fd, 0);\n",
> +                      ident, bpf_map_mmap_sz(map));

use codegen()?

> +       }
> +       codegen("\
> +               \n\
> +                       return 0;                                           \n\
> +               }                                                           \n\
> +                                                                           \n\
> +               static inline struct %1$s *                                 \n\

[...]

>  static int do_skeleton(int argc, char **argv)
>  {
>         char header_guard[MAX_OBJ_NAME_LEN + sizeof("__SKEL_H__")];
> @@ -277,7 +526,7 @@ static int do_skeleton(int argc, char **argv)
>         struct bpf_object *obj = NULL;
>         const char *file, *ident;
>         struct bpf_program *prog;
> -       int fd, len, err = -1;
> +       int fd, err = -1;
>         struct bpf_map *map;
>         struct btf *btf;
>         struct stat st;
> @@ -359,7 +608,25 @@ static int do_skeleton(int argc, char **argv)
>         }
>
>         get_header_guard(header_guard, obj_name);
> -       codegen("\
> +       if (use_loader)

please use {} for such a long if/else, even if it's, technically, a
single-statement if

> +               codegen("\
> +               \n\
> +               /* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */   \n\
> +               /* THIS FILE IS AUTOGENERATED! */                           \n\
> +               #ifndef %2$s                                                \n\
> +               #define %2$s                                                \n\
> +                                                                           \n\
> +               #include <stdlib.h>                                         \n\
> +               #include <bpf/bpf.h>                                        \n\
> +               #include <bpf/skel_internal.h>                              \n\
> +                                                                           \n\
> +               struct %1$s {                                               \n\
> +                       struct bpf_loader_ctx ctx;                          \n\
> +               ",
> +               obj_name, header_guard
> +               );
> +       else
> +               codegen("\
>                 \n\
>                 /* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */   \n\
>                                                                             \n\

[...]


* Re: [PATCH v2 bpf-next 10/16] bpf: Add bpf_btf_find_by_name_kind() helper.
  2021-04-23  0:26 ` [PATCH v2 bpf-next 10/16] bpf: Add bpf_btf_find_by_name_kind() helper Alexei Starovoitov
@ 2021-04-26 22:46   ` Andrii Nakryiko
  2021-04-27  3:37     ` Alexei Starovoitov
  2021-04-27 21:00   ` John Fastabend
  1 sibling, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-26 22:46 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> Add new helper:
>
> long bpf_btf_find_by_name_kind(u32 btf_fd, char *name, u32 kind, int flags)
>         Description
>                 Find given name with given type in BTF pointed to by btf_fd.
>                 If btf_fd is zero look for the name in vmlinux BTF and in module's BTFs.
>         Return
>                 Returns btf_id and btf_obj_fd in lower and upper 32 bits.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  include/linux/bpf.h            |  1 +
>  include/uapi/linux/bpf.h       |  8 ++++
>  kernel/bpf/btf.c               | 68 ++++++++++++++++++++++++++++++++++
>  kernel/bpf/syscall.c           |  2 +
>  tools/include/uapi/linux/bpf.h |  8 ++++
>  5 files changed, 87 insertions(+)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 0f841bd0cb85..4cf361eb6a80 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -1972,6 +1972,7 @@ extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto;
>  extern const struct bpf_func_proto bpf_task_storage_get_proto;
>  extern const struct bpf_func_proto bpf_task_storage_delete_proto;
>  extern const struct bpf_func_proto bpf_for_each_map_elem_proto;
> +extern const struct bpf_func_proto bpf_btf_find_by_name_kind_proto;
>
>  const struct bpf_func_proto *bpf_tracing_func_proto(
>         enum bpf_func_id func_id, const struct bpf_prog *prog);
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index de58a714ed36..253f5f031f08 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -4748,6 +4748,13 @@ union bpf_attr {
>   *             Execute bpf syscall with given arguments.
>   *     Return
>   *             A syscall result.
> + *
> + * long bpf_btf_find_by_name_kind(u32 btf_fd, char *name, u32 kind, int flags)
> + *     Description
> + *             Find given name with given type in BTF pointed to by btf_fd.

"Find BTF type with given name"? Should the limits on name length be
specified? KSYM_NAME_LEN is a pretty arbitrary restriction. Also,
would it still work fine if the caller provides a pointer to a much
shorter piece of memory?

Why not add name_sz right after name, as we do with a lot of other
arguments like this?

> + *             If btf_fd is zero look for the name in vmlinux BTF and in module's BTFs.
> + *     Return
> + *             Returns btf_id and btf_obj_fd in lower and upper 32 bits.

Mention that for vmlinux BTF btf_obj_fd will be zero? Also who "owns"
the FD? If the BPF program doesn't close it, when are they going to be
cleaned up?
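
For reference, the documented return convention (btf_id in the lower 32 bits, btf_obj_fd in the upper 32 bits) could be packed and unpacked roughly as below. This is only a sketch: the function names are made up for illustration and are not part of the patch, and it assumes a 64-bit return value.

```c
#include <assert.h>
#include <stdint.h>

/* Pack btf_id into the lower and btf_obj_fd into the upper 32 bits.
 * Assumption: a valid FD is non-negative, so bit 63 of the result stays
 * clear and the value cannot be confused with a negative error code.
 */
static uint64_t pack_btf_result(uint32_t btf_id, uint32_t btf_obj_fd)
{
	assert((int32_t)btf_obj_fd >= 0);
	return (uint64_t)btf_id | ((uint64_t)btf_obj_fd << 32);
}

/* Extract the lower 32 bits (the BTF type ID). */
static uint32_t result_btf_id(uint64_t res)
{
	return (uint32_t)res;
}

/* Extract the upper 32 bits (the BTF object FD, zero for vmlinux BTF). */
static uint32_t result_btf_obj_fd(uint64_t res)
{
	return (uint32_t)(res >> 32);
}
```

A caller would presumably check for a negative result first and only then split the two halves.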

>   */
>  #define __BPF_FUNC_MAPPER(FN)          \
>         FN(unspec),                     \
> @@ -4917,6 +4924,7 @@ union bpf_attr {
>         FN(for_each_map_elem),          \
>         FN(snprintf),                   \
>         FN(sys_bpf),                    \
> +       FN(btf_find_by_name_kind),      \
>         /* */
>

[...]


* Re: [PATCH v2 bpf-next 05/16] selftests/bpf: Test for syscall program type
  2021-04-26 17:02   ` Andrii Nakryiko
@ 2021-04-27  2:43     ` Alexei Starovoitov
  2021-04-27 16:28       ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-27  2:43 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Mon, Apr 26, 2021 at 10:02:59AM -0700, Andrii Nakryiko wrote:
> > +/* Copyright (c) 2021 Facebook */
> > +#include <linux/stddef.h>
> > +#include <linux/bpf.h>
> > +#include <bpf/bpf_helpers.h>
> > +#include <bpf/bpf_tracing.h>
> > +#include <../../tools/include/linux/filter.h>
> 
> with TOOLSINCDIR shouldn't this be just <linux/filter.h>?

sadly no. There is uapi/linux/filter.h that gets included first.
And changing the order of -Is brings the whole set of other issues.
I couldn't come up with anything better unfortunately.


* Re: [PATCH v2 bpf-next 09/16] libbpf: Support for fd_idx
  2021-04-26 17:14   ` Andrii Nakryiko
@ 2021-04-27  2:53     ` Alexei Starovoitov
  2021-04-27 16:36       ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-27  2:53 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Mon, Apr 26, 2021 at 10:14:45AM -0700, Andrii Nakryiko wrote:
> On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > Add support for FD_IDX and make libbpf prefer that approach when loading programs.
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
> >  tools/lib/bpf/bpf.c             |  1 +
> >  tools/lib/bpf/libbpf.c          | 70 +++++++++++++++++++++++++++++----
> >  tools/lib/bpf/libbpf_internal.h |  1 +
> >  3 files changed, 65 insertions(+), 7 deletions(-)
> >
> 
> [...]
> 
> > +static int probe_kern_fd_idx(void)
> > +{
> > +       struct bpf_load_program_attr attr;
> > +       struct bpf_insn insns[] = {
> > +               BPF_LD_IMM64_RAW(BPF_REG_0, BPF_PSEUDO_MAP_IDX, 0),
> > +               BPF_EXIT_INSN(),
> > +       };
> > +
> > +       memset(&attr, 0, sizeof(attr));
> > +       attr.prog_type = BPF_PROG_TYPE_SOCKET_FILTER;
> > +       attr.insns = insns;
> > +       attr.insns_cnt = ARRAY_SIZE(insns);
> > +       attr.license = "GPL";
> > +
> > +       probe_fd(bpf_load_program_xattr(&attr, NULL, 0));
> 
> probe_fd() calls close(fd) internally, which technically can interfere
> with errno, though close() shouldn't be called because syscall has to
> fail on correct kernels... So this should work, but I feel like
> open-coding that logic is better than ignoring probe_fd() result.

It will fail on all kernels.
That probe_fd was a leftover of an earlier detection approach where it would
proceed to load all the way, but then I switched to:

> > +       return errno == EPROTO;

since such style of probing is much cheaper for the kernel and user space.
But point taken. Will open code it.
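
An open-coded version of this errno-based probe might look roughly like the sketch below. The callback type and the two stubs are hypothetical and only stand in for the real load attempt; the point is to capture errno before close() gets a chance to overwrite it.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical loader callback: returns an FD on success, or -1 with errno set. */
typedef int (*try_load_fn)(void);

/* Stub standing in for a load attempt that the kernel rejects with EPROTO. */
static int stub_load_eproto(void)
{
	errno = EPROTO;
	return -1;
}

/* Stub standing in for a load attempt rejected for an unrelated reason. */
static int stub_load_einval(void)
{
	errno = EINVAL;
	return -1;
}

/* Returns 1 if the load attempt failed with expected_errno, 0 otherwise. */
static int probe_by_errno(try_load_fn try_load, int expected_errno)
{
	int fd, err;

	assert(try_load);
	fd = try_load();
	err = errno; /* capture before close() can overwrite it */
	if (fd >= 0) {
		close(fd);
		return 0; /* assumption: an unexpected success counts as "not detected" */
	}
	return err == expected_errno;
}
```

Here `probe_by_errno(stub_load_eproto, EPROTO)` reports the feature as present, while the EINVAL stub does not.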

> > +}
> > +
> 
> [...]
> 
> > @@ -7239,6 +7279,8 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
> >         struct bpf_program *prog;
> >         size_t i;
> >         int err;
> > +       struct bpf_map *map;
> > +       int *fd_array = NULL;
> >
> >         for (i = 0; i < obj->nr_programs; i++) {
> >                 prog = &obj->programs[i];
> > @@ -7247,6 +7289,16 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
> >                         return err;
> >         }
> >
> > +       if (kernel_supports(FEAT_FD_IDX) && obj->nr_maps) {
> > +               fd_array = malloc(sizeof(int) * obj->nr_maps);
> > +               if (!fd_array)
> > +                       return -ENOMEM;
> > +               for (i = 0; i < obj->nr_maps; i++) {
> > +                       map = &obj->maps[i];
> > +                       fd_array[i] = map->fd;
> 
> nit: obj->maps[i].fd will keep it a single line
> 
> > +               }
> > +       }
> > +
> >         for (i = 0; i < obj->nr_programs; i++) {
> >                 prog = &obj->programs[i];
> >                 if (prog_is_subprog(obj, prog))
> > @@ -7256,10 +7308,14 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
> >                         continue;
> >                 }
> >                 prog->log_level |= log_level;
> > +               prog->fd_array = fd_array;
> 
> you are not freeing this memory on success, as far as I can see. 

hmm. there is free on success below.

> And
> given multiple programs are sharing fd_array, it's a bit problematic
> for prog to have fd_array. This is per-object property, so let's add
> it at bpf_object level and clean it up on bpf_object__close()? And by
> assigning to obj->fd_array at malloc() site, you won't need to do all
> the error-handling free()s below.

hmm. that sounds worse.
why add another 8 bytes to bpf_object that won't be used
until this last step of bpf_object__load_progs,
and only for the duration of this loading?
It's cheaper to have this alloc here with two free()s below.

> 
> >                 err = bpf_program__load(prog, obj->license, obj->kern_version);
> > -               if (err)
> > +               if (err) {
> > +                       free(fd_array);
> >                         return err;
> > +               }
> >         }
> > +       free(fd_array);
> >         return 0;
> >  }
> >
> > diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
> > index 6017902c687e..9114c7085f2a 100644
> > --- a/tools/lib/bpf/libbpf_internal.h
> > +++ b/tools/lib/bpf/libbpf_internal.h
> > @@ -204,6 +204,7 @@ struct bpf_prog_load_params {
> >         __u32 log_level;
> >         char *log_buf;
> >         size_t log_buf_sz;
> > +       int *fd_array;
> >  };
> >
> >  int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr);
> > --
> > 2.30.2
> >

-- 


* Re: [PATCH v2 bpf-next 12/16] libbpf: Change the order of data and text relocations.
  2021-04-26 17:29   ` Andrii Nakryiko
@ 2021-04-27  3:00     ` Alexei Starovoitov
  2021-04-27 16:47       ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-27  3:00 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Mon, Apr 26, 2021 at 10:29:09AM -0700, Andrii Nakryiko wrote:
> On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > In order to be able to generate loader program in the later
> > patches change the order of data and text relocations.
> > Also improve the test to include data relos.
> >
> > If the kernel supports "FD array" the map_fd relocations can be processed
> > before text relos since generated loader program won't need to manually
> > patch ld_imm64 insns with map_fd.
> > But ksym and kfunc relocations can only be processed after all calls
> > are relocated, since loader program will consist of a sequence
> > of calls to bpf_btf_find_by_name_kind() followed by patching of btf_id
> > and btf_obj_fd into corresponding ld_imm64 insns. The locations of those
> > ld_imm64 insns are specified in relocations.
> > Hence process all data relocations (maps, ksym, kfunc) together after call relos.
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
> >  tools/lib/bpf/libbpf.c                        | 86 +++++++++++++++----
> >  .../selftests/bpf/progs/test_subprogs.c       | 13 +++
> >  2 files changed, 80 insertions(+), 19 deletions(-)
> >
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index 17cfc5b66111..c73a85b97ca5 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -6379,11 +6379,15 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
> >                         insn[0].imm = ext->ksym.kernel_btf_id;
> >                         break;
> >                 case RELO_SUBPROG_ADDR:
> > -                       insn[0].src_reg = BPF_PSEUDO_FUNC;
> > -                       /* will be handled as a follow up pass */
> > +                       if (insn[0].src_reg != BPF_PSEUDO_FUNC) {
> > +                               pr_warn("prog '%s': relo #%d: bad insn\n",
> > +                                       prog->name, i);
> > +                               return -EINVAL;
> > +                       }
> 
> given SUBPROG_ADDR is now handled similarly to RELO_CALL in a
> different place, I'd probably drop this error check and just combine
> RELO_SUBPROG_ADDR and RELO_CALL cases with just a /* handled already
> */ comment.

I prefer to keep them separate. I've hit this pr_warn a couple of times
while messing with relos and it saved me time.
I bet it will save time for the next developer too.

> > +                       /* handled already */
> >                         break;
> >                 case RELO_CALL:
> > -                       /* will be handled as a follow up pass */
> > +                       /* handled already */
> >                         break;
> >                 default:
> >                         pr_warn("prog '%s': relo #%d: bad relo type %d\n",
> > @@ -6552,6 +6556,31 @@ static struct reloc_desc *find_prog_insn_relo(const struct bpf_program *prog, si
> >                        sizeof(*prog->reloc_desc), cmp_relo_by_insn_idx);
> >  }
> >
> > +static int append_subprog_relos(struct bpf_program *main_prog, struct bpf_program *subprog)
> > +{
> > +       int new_cnt = main_prog->nr_reloc + subprog->nr_reloc;
> > +       struct reloc_desc *relos;
> > +       size_t off = subprog->sub_insn_off;
> > +       int i;
> > +
> > +       if (main_prog == subprog)
> > +               return 0;
> > +       relos = libbpf_reallocarray(main_prog->reloc_desc, new_cnt, sizeof(*relos));
> > +       if (!relos)
> > +               return -ENOMEM;
> > +       memcpy(relos + main_prog->nr_reloc, subprog->reloc_desc,
> > +              sizeof(*relos) * subprog->nr_reloc);
> > +
> > +       for (i = main_prog->nr_reloc; i < new_cnt; i++)
> > +               relos[i].insn_idx += off;
> 
> nit: off is used only here, so there is little point in having it as a
> separate var, inline?

sure.

> > +       /* After insn_idx adjustment the 'relos' array is still sorted
> > +        * by insn_idx and doesn't break bsearch.
> > +        */
> > +       main_prog->reloc_desc = relos;
> > +       main_prog->nr_reloc = new_cnt;
> > +       return 0;
> > +}
> > +
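
The "still sorted" claim in the comment above can be checked with a simplified, self-contained sketch. `struct relo`, `struct prog` and the function names here are hypothetical reductions of the libbpf structures, and the sketch assumes the subprog's instructions are placed after all of the main program's instructions, so every shifted index is larger than any existing one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified relocation descriptor. */
struct relo {
	int insn_idx;
};

/* Hypothetical, simplified program: relos are sorted by insn_idx. */
struct prog {
	struct relo *relos;
	int nr_relos;
	int sub_insn_off; /* where this prog's insns start inside the main prog */
};

/* bsearch() comparator: key is an int insn_idx, elem is a struct relo. */
static int cmp_relo(const void *key, const void *elem)
{
	int idx = *(const int *)key;
	const struct relo *r = elem;

	return (idx > r->insn_idx) - (idx < r->insn_idx);
}

/* Append sub's relos to main_prog, shifting insn_idx by sub's offset. */
static int append_relos(struct prog *main_prog, const struct prog *sub)
{
	int i, new_cnt = main_prog->nr_relos + sub->nr_relos;
	struct relo *relos;

	if (sub->nr_relos == 0)
		return 0;
	/* Sortedness precondition: sub's insns come after main's last relo. */
	if (main_prog->nr_relos > 0)
		assert(sub->sub_insn_off >
		       main_prog->relos[main_prog->nr_relos - 1].insn_idx);

	relos = realloc(main_prog->relos, new_cnt * sizeof(*relos));
	if (!relos)
		return -1;
	memcpy(relos + main_prog->nr_relos, sub->relos,
	       sub->nr_relos * sizeof(*relos));
	for (i = main_prog->nr_relos; i < new_cnt; i++)
		relos[i].insn_idx += sub->sub_insn_off;

	main_prog->relos = relos;
	main_prog->nr_relos = new_cnt;
	return 0;
}

/* Binary search over the combined, still sorted array. */
static struct relo *find_relo(const struct prog *p, int insn_idx)
{
	return bsearch(&insn_idx, p->relos, p->nr_relos,
		       sizeof(*p->relos), cmp_relo);
}
```

Because a constant offset is added to an already sorted run that lands past the existing entries, the combined array stays ordered and bsearch() keeps working without a re-sort.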
> >  static int
> >  bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog,
> >                        struct bpf_program *prog)
> > @@ -6560,18 +6589,32 @@ bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog,
> >         struct bpf_program *subprog;
> >         struct bpf_insn *insns, *insn;
> >         struct reloc_desc *relo;
> > -       int err;
> > +       int err, i;
> >
> >         err = reloc_prog_func_and_line_info(obj, main_prog, prog);
> >         if (err)
> >                 return err;
> >
> > +       for (i = 0; i < prog->nr_reloc; i++) {
> > +               relo = &prog->reloc_desc[i];
> > +               insn = &main_prog->insns[prog->sub_insn_off + relo->insn_idx];
> > +
> > +               if (relo->type == RELO_SUBPROG_ADDR)
> > +                       /* mark the insn, so it becomes insn_is_pseudo_func() */
> > +                       insn[0].src_reg = BPF_PSEUDO_FUNC;
> > +       }
> > +
> 
> This will do the same work over and over each time we append a subprog
> to main_prog. This should logically follow append_subprog_relos(), but
> you wanted to do it for main_prog with the same code, right?

It cannot follow append_subprog_relos.
It has to be done before the loop below.
Otherwise !insn_is_pseudo_func() won't catch it and all ld_imm64 insns
will be considered which will make the loop below more complex and slower.
The find_prog_insn_relo() will be called a lot more times.
The !relo condition would be treated differently for ld_imm64 vs call insns, etc.

> How about instead doing this before we start appending subprogs to
> main_progs? I.e., do it explicitly in bpf_object__relocate() before
> you start code relocation loop.

Not sure I follow.
Do another loop:
 for (i = 0; i < obj->nr_programs; i++)
    for (j = 0; j < prog->nr_reloc; j++)
      if (relo->type == RELO_SUBPROG_ADDR)
      ?
That's an option too.
I can do that if you prefer.
It felt cleaner to do this mark here right before the loop below that needs it.

> >         for (insn_idx = 0; insn_idx < prog->sec_insn_cnt; insn_idx++) {
> >                 insn = &main_prog->insns[prog->sub_insn_off + insn_idx];
> >                 if (!insn_is_subprog_call(insn) && !insn_is_pseudo_func(insn))
> >                         continue;
> >
> >                 relo = find_prog_insn_relo(prog, insn_idx);
> > +               if (relo && relo->type == RELO_EXTERN_FUNC)
> > +                       /* kfunc relocations will be handled later
> > +                        * in bpf_object__relocate_data()
> > +                        */
> > +                       continue;
> >                 if (relo && relo->type != RELO_CALL && relo->type != RELO_SUBPROG_ADDR) {
> >                         pr_warn("prog '%s': unexpected relo for insn #%zu, type %d\n",
> >                                 prog->name, insn_idx, relo->type);
> 
> [...]

-- 


* Re: [PATCH v2 bpf-next 14/16] libbpf: Generate loader program out of BPF ELF file.
  2021-04-26 22:22   ` Andrii Nakryiko
@ 2021-04-27  3:25     ` Alexei Starovoitov
  2021-04-27 17:34       ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-27  3:25 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Mon, Apr 26, 2021 at 03:22:36PM -0700, Andrii Nakryiko wrote:
> On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > The BPF program loading process performed by libbpf is quite complex
> > and consists of the following steps:
> > "open" phase:
> > - parse elf file and remember relocations, sections
> > - collect externs and ksyms including their btf_ids in prog's BTF
> > - patch BTF datasec (since llvm couldn't do it)
> > - init maps (old style map_def, BTF based, global data map, kconfig map)
> > - collect relocations against progs and maps
> > "load" phase:
> > - probe kernel features
> > - load vmlinux BTF
> > - resolve externs (kconfig and ksym)
> > - load program BTF
> > - init struct_ops
> > - create maps
> > - apply CO-RE relocations
> > - patch ld_imm64 insns with src_reg=PSEUDO_MAP, PSEUDO_MAP_VALUE, PSEUDO_BTF_ID
> > - reposition subprograms and adjust call insns
> > - sanitize and load progs
> >
> > During this process libbpf does sys_bpf() calls to load BTF, create maps,
> > populate maps and finally load programs.
> > Instead of actually doing the syscalls generate a trace of what libbpf
> > would have done and represent it as the "loader program".
> > The "loader program" consists of single map with:
> > - union bpf_attr(s)
> > - BTF bytes
> > - map value bytes
> > - insns bytes
> > and single bpf program that passes bpf_attr(s) and data into bpf_sys_bpf() helper.
> > Executing such "loader program" via bpf_prog_test_run() command will
> > replay the sequence of syscalls that libbpf would have done which will result
> > the same maps created and programs loaded as specified in the elf file.
> > The "loader program" removes libelf and majority of libbpf dependency from
> > program loading process.
> >
> > kconfig, typeless ksym, struct_ops and CO-RE are not supported yet.
> >
> > The order of relocate_data and relocate_calls had to change, so that
> > bpf_gen__prog_load() can see all relocations for a given program with
> > correct insn_idx-es.
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
> >  tools/lib/bpf/Build              |   2 +-
> >  tools/lib/bpf/bpf_gen_internal.h |  40 ++
> >  tools/lib/bpf/gen_loader.c       | 615 +++++++++++++++++++++++++++++++
> >  tools/lib/bpf/libbpf.c           | 204 ++++++++--
> >  tools/lib/bpf/libbpf.h           |  12 +
> >  tools/lib/bpf/libbpf.map         |   1 +
> >  tools/lib/bpf/libbpf_internal.h  |   2 +
> >  tools/lib/bpf/skel_internal.h    | 105 ++++++
> >  8 files changed, 948 insertions(+), 33 deletions(-)
> >  create mode 100644 tools/lib/bpf/bpf_gen_internal.h
> >  create mode 100644 tools/lib/bpf/gen_loader.c
> >  create mode 100644 tools/lib/bpf/skel_internal.h
> >
> > diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build
> > index 9b057cc7650a..430f6874fa41 100644
> > --- a/tools/lib/bpf/Build
> > +++ b/tools/lib/bpf/Build
> > @@ -1,3 +1,3 @@
> >  libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \
> >             netlink.o bpf_prog_linfo.o libbpf_probes.o xsk.o hashmap.o \
> > -           btf_dump.o ringbuf.o strset.o linker.o
> > +           btf_dump.o ringbuf.o strset.o linker.o gen_loader.o
> > diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h
> > new file mode 100644
> > index 000000000000..dc3e2cbf9ce3
> > --- /dev/null
> > +++ b/tools/lib/bpf/bpf_gen_internal.h
> > @@ -0,0 +1,40 @@
> > +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
> > +/* Copyright (c) 2021 Facebook */
> > +#ifndef __BPF_GEN_INTERNAL_H
> > +#define __BPF_GEN_INTERNAL_H
> > +
> > +struct relo_desc {
> 
> there is very similarly named reloc_desc struct in libbpf.c, can you
> rename it to something like gen_btf_relo_desc?

sure.

> > +       const char *name;
> > +       int kind;
> > +       int insn_idx;
> > +};
> > +
> 
> [...]
> 
> > +
> > +static int bpf_gen__realloc_insn_buf(struct bpf_gen *gen, __u32 size)
> > +{
> > +       size_t off = gen->insn_cur - gen->insn_start;
> > +
> > +       if (gen->error)
> > +               return gen->error;
> > +       if (size > INT32_MAX || off + size > INT32_MAX) {
> > +               gen->error = -ERANGE;
> > +               return -ERANGE;
> > +       }
> > +       gen->insn_start = realloc(gen->insn_start, off + size);
> 
> leaking memory here: gen->insn_start will be NULL on failure

ohh. good catch.
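
The usual way to avoid this is to realloc into a temporary and only then
overwrite the stored pointer. A minimal sketch, assuming a cut-down buffer
(struct buf and buf_grow() are made-up names, not the patch's code):

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical cut-down buffer; not the real struct bpf_gen. */
struct buf {
	char *start;	/* beginning of the allocation */
	char *cur;	/* current write position */
	int error;	/* sticky error, like gen->error */
};

/* Make room for 'size' more bytes after the current position. */
static int buf_grow(struct buf *b, size_t size)
{
	size_t off;
	char *tmp;

	assert(b != NULL);
	if (b->error)
		return b->error;
	off = b->start ? (size_t)(b->cur - b->start) : 0;
	tmp = realloc(b->start, off + size);
	if (!tmp) {
		/* realloc() left the old block untouched: free it here
		 * instead of losing the only pointer to it.
		 */
		free(b->start);
		b->start = b->cur = NULL;
		b->error = -ENOMEM;
		return -ENOMEM;
	}
	b->start = tmp;
	b->cur = tmp + off;
	return 0;
}
```

Freeing on failure is one possible choice; keeping the old buffer and letting
the final free path release it would work as well, as long as the pointer is
not overwritten with NULL.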

> > +       if (!gen->insn_start) {
> > +               gen->error = -ENOMEM;
> > +               return -ENOMEM;
> > +       }
> > +       gen->insn_cur = gen->insn_start + off;
> > +       return 0;
> > +}
> > +
> > +static int bpf_gen__realloc_data_buf(struct bpf_gen *gen, __u32 size)
> > +{
> > +       size_t off = gen->data_cur - gen->data_start;
> > +
> > +       if (gen->error)
> > +               return gen->error;
> > +       if (size > INT32_MAX || off + size > INT32_MAX) {
> > +               gen->error = -ERANGE;
> > +               return -ERANGE;
> > +       }
> > +       gen->data_start = realloc(gen->data_start, off + size);
> 
> same as above
> 
> > +       if (!gen->data_start) {
> > +               gen->error = -ENOMEM;
> > +               return -ENOMEM;
> > +       }
> > +       gen->data_cur = gen->data_start + off;
> > +       return 0;
> > +}
> > +
> 
> [...]
> 
> > +
> > +static void bpf_gen__emit_sys_bpf(struct bpf_gen *gen, int cmd, int attr, int attr_size)
> > +{
> > +       bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_1, cmd));
> > +       bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_2, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, attr));
> 
> is attr an offset into a blob? if yes, attr_off? or attr_base_off,
> anything with _off

yes. it's an offset into a blob, but I don't use _off anywhere,
otherwise all variables throughout would need an _off suffix, which is too verbose.

> > +       bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_3, attr_size));
> > +       bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_bpf));
> > +       /* remember the result in R7 */
> > +       bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0));
> > +}
> > +
> > +static void bpf_gen__emit_check_err(struct bpf_gen *gen)
> > +{
> > +       bpf_gen__emit(gen, BPF_JMP_IMM(BPF_JSGE, BPF_REG_7, 0, 2));
> > +       bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_0, BPF_REG_7));
> > +       bpf_gen__emit(gen, BPF_EXIT_INSN());
> > +}
> > +
> > +static void __bpf_gen__debug(struct bpf_gen *gen, int reg1, int reg2, const char *fmt, va_list args)
> 
> Can you please leave a comment on what reg1 and reg2 is, it's not very
> clear and the code clearly assumes that it can't be reg[1-4]. It's
> probably those special R7 and R9 (or -1, of course), but having a
> short comment makes sense to not jump around trying to figure out
> possible inputs.
> 
> Oh, reading further, it can also be R0.

good point. will add a comment.

> > +{
> > +       char buf[1024];
> > +       int addr, len, ret;
> > +
> > +       if (!gen->log_level)
> > +               return;
> > +       ret = vsnprintf(buf, sizeof(buf), fmt, args);
> > +       if (ret < 1024 - 7 && reg1 >= 0 && reg2 < 0)
> > +               /* The special case to accommodate common bpf_gen__debug_ret():
> > +                * to avoid specifying BPF_REG_7 and adding " r=%%d" to prints explicitly.
> > +                */
> > +               strcat(buf, " r=%d");
> > +       len = strlen(buf) + 1;
> > +       addr = bpf_gen__add_data(gen, buf, len);
> 
> nit: offset, not address, right?

it's actually an address.
From pov of the program and insn below:

> > +       bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, addr));

I guess it's kinda both. It's an offset within global data, but given
int var;
no one says that &var takes an offset. It takes the address of the global var.
Our implementation of global data is a single map value element,
so global vars have offsets too.
imo addr is more accurate here and throughout this file.
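
To make the "both" point concrete, here is a small sketch. struct blob,
blob_add() and blob_addr() are hypothetical names and the fixed-size array
is an assumption made for brevity; the real data buffer grows dynamically.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical fixed-size blob standing in for the single map value. */
struct blob {
	char data[256];
	int used;
};

/* Append bytes to the blob and return their position in it,
 * or -1 if they do not fit.
 */
static int blob_add(struct blob *b, const void *src, int len)
{
	int pos;

	assert(len >= 0);
	if (len > (int)sizeof(b->data) - b->used)
		return -1;
	pos = b->used;
	memcpy(b->data + pos, src, len);
	b->used += len;
	return pos;
}

/* What the running program ends up with: map value base plus position,
 * i.e. an ordinary pointer to the "global variable".
 */
static const char *blob_addr(const struct blob *b, int pos)
{
	return b->data + pos;
}
```

From the generator's side the returned int is a position in the blob; from
the loaded program's side, after the ld_imm64 is resolved, it is simply the
address of that piece of global data.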

> > +       bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_2, len));
> > +       if (reg1 >= 0)
> > +               bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_3, reg1));
> > +       if (reg2 >= 0)
> > +               bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_4, reg2));
> > +       bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_trace_printk));
> > +}
> > +
> 
> [...]
> 
> > +int bpf_gen__finish(struct bpf_gen *gen)
> > +{
> > +       int i;
> > +
> > +       bpf_gen__emit_sys_close_stack(gen, stack_off(btf_fd));
> > +       for (i = 0; i < gen->nr_progs; i++)
> > +               bpf_gen__move_stack2ctx(gen,
> > +                                       sizeof(struct bpf_loader_ctx) +
> > +                                       sizeof(struct bpf_map_desc) * gen->nr_maps +
> > +                                       sizeof(struct bpf_prog_desc) * i +
> > +                                       offsetof(struct bpf_prog_desc, prog_fd), 4,
> > +                                       stack_off(prog_fd[i]));
> > +       for (i = 0; i < gen->nr_maps; i++)
> > +               bpf_gen__move_stack2ctx(gen,
> > +                                       sizeof(struct bpf_loader_ctx) +
> > +                                       sizeof(struct bpf_map_desc) * i +
> > +                                       offsetof(struct bpf_map_desc, map_fd), 4,
> > +                                       stack_off(map_fd[i]));
> > +       bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_0, 0));
> > +       bpf_gen__emit(gen, BPF_EXIT_INSN());
> > +       pr_debug("bpf_gen__finish %d\n", gen->error);
> 
> maybe prefix all those pr_debug()s with "gen: " to distinguish them
> from the rest of libbpf logging?

sure.

> > +       if (!gen->error) {
> > +               struct gen_loader_opts *opts = gen->opts;
> > +
> > +               opts->insns = gen->insn_start;
> > +               opts->insns_sz = gen->insn_cur - gen->insn_start;
> > +               opts->data = gen->data_start;
> > +               opts->data_sz = gen->data_cur - gen->data_start;
> > +       }
> > +       return gen->error;
> > +}
> > +
> > +void bpf_gen__free(struct bpf_gen *gen)
> > +{
> > +       if (!gen)
> > +               return;
> > +       free(gen->data_start);
> > +       free(gen->insn_start);
> > +       gen->data_start = NULL;
> > +       gen->insn_start = NULL;
> 
> what's the point of NULL'ing them out if you don't clear gen->data_cur
> and gen->insn_cur?

To spot the bugs quicker if there are issues.

> also should it free(gen) itself?

ohh. bpf_object__close() should probably do it to stay symmetrical with calloc.

> > +}
> > +
> > +void bpf_gen__load_btf(struct bpf_gen *gen, const void *btf_raw_data, __u32 btf_raw_size)
> > +{
> > +       union bpf_attr attr = {};
> 
> here and below: memset(0)?

that's unnecessary. there is no backward/forward compat issue here.
and bpf_attr doesn't have gaps inside.

> > +       int attr_size = offsetofend(union bpf_attr, btf_log_level);
> > +       int btf_data, btf_load_attr;
> > +
> > +       pr_debug("btf_load: size %d\n", btf_raw_size);
> > +       btf_data = bpf_gen__add_data(gen, btf_raw_data, btf_raw_size);
> > +
> 
> [...]
> 
> > +       map_create_attr = bpf_gen__add_data(gen, &attr, attr_size);
> > +       if (attr.btf_value_type_id)
> > +               /* populate union bpf_attr with btf_fd saved in the stack earlier */
> > +               bpf_gen__move_stack2blob(gen, map_create_attr + offsetof(union bpf_attr, btf_fd), 4,
> > +                                        stack_off(btf_fd));
> > +       switch (attr.map_type) {
> > +       case BPF_MAP_TYPE_ARRAY_OF_MAPS:
> > +       case BPF_MAP_TYPE_HASH_OF_MAPS:
> > +               bpf_gen__move_stack2blob(gen, map_create_attr + offsetof(union bpf_attr, inner_map_fd),
> > +                                        4, stack_off(inner_map_fd));
> > +               close_inner_map_fd = true;
> > +               break;
> > +       default:;
> 
> default:
>     break;

why?

> > +       }
> > +       /* emit MAP_CREATE command */
> > +       bpf_gen__emit_sys_bpf(gen, BPF_MAP_CREATE, map_create_attr, attr_size);
> > +       bpf_gen__debug_ret(gen, "map_create %s idx %d type %d value_size %d",
> > +                          attr.map_name, map_idx, map_attr->map_type, attr.value_size);
> > +       bpf_gen__emit_check_err(gen);
> 
> what will happen on error with inner_map_fd and all the other fds
> created by now?

that's a todo item. I'll add a comment that error path is far from perfect.

> > +       /* remember map_fd in the stack, if successful */
> > +       if (map_idx < 0) {
> > +               /* This bpf_gen__map_create() function is called with map_idx >= 0 for all maps
> > +                * that libbpf loading logic tracks.
> > +                * It's called with -1 to create an inner map.
> > +                */
> > +               bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(inner_map_fd)));
> > +       } else {
> > +               if (map_idx != gen->nr_maps) {
> 
> why would that happen? defensive programming? and even then `if () {}
> else if () {} else {}` structure is more appropriate

sure. will use that style.

> > +                       gen->error = -EDOM; /* internal bug */
> > +                       return;
> > +               }
> > +               bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(map_fd[map_idx])));
> > +               gen->nr_maps++;
> > +       }
> > +       if (close_inner_map_fd)
> > +               bpf_gen__emit_sys_close_stack(gen, stack_off(inner_map_fd));
> > +}
> > +
> 
> [...]
> 
> > +static void bpf_gen__cleanup_relos(struct bpf_gen *gen, int insns)
> > +{
> > +       int i, insn;
> > +
> > +       for (i = 0; i < gen->relo_cnt; i++) {
> > +               if (gen->relos[i].kind != BTF_KIND_VAR)
> > +                       continue;
> > +               /* close fd recorded in insn[insn_idx + 1].imm */
> > +               insn = insns + sizeof(struct bpf_insn) * (gen->relos[i].insn_idx + 1)
> > +                       + offsetof(struct bpf_insn, imm);
> > +               bpf_gen__emit_sys_close_blob(gen, insn);
> 
> wouldn't this close the same FD used across multiple "relos" multiple times?

no. since every relo has its own fd.
Right now the loader gen is simple and doesn't do preprocessing of
all relos for all progs to avoid complicating the code.
It's good enough for now.

> > +       }
> > +       if (gen->relo_cnt) {
> > +               free(gen->relos);
> > +               gen->relo_cnt = 0;
> > +               gen->relos = NULL;
> > +       }
> > +}
> > +
> 
> [...]
> 
> > +       struct bpf_gen *gen_loader;
> > +
> >         /*
> >          * Information when doing elf related work. Only valid if fd
> >          * is valid.
> > @@ -2651,7 +2654,15 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj)
> >                 bpf_object__sanitize_btf(obj, kern_btf);
> >         }
> >
> > -       err = btf__load(kern_btf);
> > +       if (obj->gen_loader) {
> > +               __u32 raw_size = 0;
> > +               const void *raw_data = btf__get_raw_data(kern_btf, &raw_size);
> 
> this can return NULL on ENOMEM

good point.

> > +
> > +               bpf_gen__load_btf(obj->gen_loader, raw_data, raw_size);
> > +               btf__set_fd(kern_btf, 0);
> 
> why setting fd to 0 (stdin)? does gen depend on this somewhere? The
> problem is that it will eventually be closed on btf__free(), which
> will close stdin, causing a big surprise. What will happen if you
> leave it at -1?

unfortunately there are various pieces of code that treat <0
as not initialized.
I can try to fix them all, but that felt unnecessarily complex vs screwing up stdin.
It's bpftool's stdin, but you're right that it's not pretty.
I can probably use a special very large FD number and check for it.
Still cleaner than fixing all checks.
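
Roughly what I have in mind, as a sketch only. MOCK_GEN_LOADER_FD and
close_real_fd() are placeholder names, and INT_MAX is just one candidate
for the "very large" value; nothing here is settled.

```c
#include <assert.h>
#include <limits.h>
#include <unistd.h>

/* Placeholder sentinel: passes every "fd >= 0" check, but can never be
 * a descriptor the process really owns.
 */
#define MOCK_GEN_LOADER_FD INT_MAX

/* Close 'fd' unless it is unset (<0) or the fake loader-gen value. */
static int close_real_fd(int fd)
{
	/* the sentinel must be above the highest fd the process may open */
	assert(MOCK_GEN_LOADER_FD > sysconf(_SC_OPEN_MAX));

	if (fd < 0 || fd == MOCK_GEN_LOADER_FD)
		return 0;	/* nothing to close */
	return close(fd);
}
```

Existing "fd < 0 means not initialized" checks keep working unchanged, and
only the close paths need to learn about the sentinel, so stdin is never
touched.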

> 
> > +       } else {
> > +               err = btf__load(kern_btf);
> > +       }
> >         if (sanitize) {
> >                 if (!err) {
> >                         /* move fd to libbpf's BTF */
> > @@ -4262,6 +4273,12 @@ static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id f
> >         struct kern_feature_desc *feat = &feature_probes[feat_id];
> >         int ret;
> >
> > +       if (obj->gen_loader)
> > +               /* To generate loader program assume the latest kernel
> > +                * to avoid doing extra prog_load, map_create syscalls.
> > +                */
> > +               return true;
> > +
> >         if (READ_ONCE(feat->res) == FEAT_UNKNOWN) {
> >                 ret = feat->probe();
> >                 if (ret > 0) {
> > @@ -4344,6 +4361,13 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
> >         char *cp, errmsg[STRERR_BUFSIZE];
> >         int err, zero = 0;
> >
> > +       if (obj->gen_loader) {
> > +               bpf_gen__map_update_elem(obj->gen_loader, map - obj->maps,
> 
> it would be great for bpf_gen__map_update_elem to reflect that it's
> not a generic map_update_elem() call, rather special internal map
> update (just use bpf_gen__populate_internal_map?) Whether to freeze or
> not could be just a flag to the same call, they always go together.

It's actually generic map_update_elem. I haven't used it in inner map init yet.
Still on my todo list.

> > +                                        map->mmaped, map->def.value_size);
> > +               if (map_type == LIBBPF_MAP_RODATA || map_type == LIBBPF_MAP_KCONFIG)
> > +                       bpf_gen__map_freeze(obj->gen_loader, map - obj->maps);
> > +               return 0;
> > +       }
> >         err = bpf_map_update_elem(map->fd, &zero, map->mmaped, 0);
> >         if (err) {
> >                 err = -errno;
> > @@ -4369,7 +4393,7 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
> >
> >  static void bpf_map__destroy(struct bpf_map *map);
> >
> > -static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map)
> > +static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map, bool is_inner)
> >  {
> >         struct bpf_create_map_attr create_attr;
> >         struct bpf_map_def *def = &map->def;
> > @@ -4415,9 +4439,9 @@ static int bpf_object__create_map(struct bpf_object *obj, struct bpf_map *map)
> >
> >         if (bpf_map_type__is_map_in_map(def->type)) {
> >                 if (map->inner_map) {
> > -                       int err;
> > +                       int err = 0;
> 
> no need to initialize to zero, you are assigning it right below
> 
> >
> > -                       err = bpf_object__create_map(obj, map->inner_map);
> > +                       err = bpf_object__create_map(obj, map->inner_map, true);
> >                         if (err) {
> >                                 pr_warn("map '%s': failed to create inner map: %d\n",
> >                                         map->name, err);
> 
> [...]
> 
> > @@ -4469,7 +4498,12 @@ static int init_map_slots(struct bpf_map *map)
> >
> >                 targ_map = map->init_slots[i];
> >                 fd = bpf_map__fd(targ_map);
> > -               err = bpf_map_update_elem(map->fd, &i, &fd, 0);
> > +               if (obj->gen_loader) {
> > +                       printf("// TODO map_update_elem: idx %ld key %d value==map_idx %ld\n",
> > +                              map - obj->maps, i, targ_map - obj->maps);
> 
> return error for now?
> 
> > +               } else {
> > +                       err = bpf_map_update_elem(map->fd, &i, &fd, 0);
> > +               }
> >                 if (err) {
> >                         err = -errno;
> >                         pr_warn("map '%s': failed to initialize slot [%d] to map '%s' fd=%d: %d\n",
> 
> [...]
> 
> > @@ -6082,6 +6119,11 @@ static int bpf_core_apply_relo(struct bpf_program *prog,
> >         if (str_is_empty(spec_str))
> >                 return -EINVAL;
> >
> > +       if (prog->obj->gen_loader) {
> > +               printf("// TODO core_relo: prog %ld insn[%d] %s %s kind %d\n",
> > +                      prog - prog->obj->programs, relo->insn_off / 8,
> > +                      local_name, spec_str, relo->kind);
> 
> same, return error? Drop printf, maybe leave pr_debug()?

sure. pr_debug with error sounds fine.

> > +       }
> >         err = bpf_core_parse_spec(local_btf, local_id, spec_str, relo->kind, &local_spec);
> >         if (err) {
> >                 pr_warn("prog '%s': relo #%d: parsing [%d] %s %s + %s failed: %d\n",
> > @@ -6821,6 +6863,19 @@ bpf_object__relocate_calls(struct bpf_object *obj, struct bpf_program *prog)
> >
> >         return 0;
> >  }
> 
> empty line here
> 
> > +static void
> > +bpf_object__free_relocs(struct bpf_object *obj)
> > +{
> > +       struct bpf_program *prog;
> > +       int i;
> > +
> > +       /* free up relocation descriptors */
> > +       for (i = 0; i < obj->nr_programs; i++) {
> > +               prog = &obj->programs[i];
> > +               zfree(&prog->reloc_desc);
> > +               prog->nr_reloc = 0;
> > +       }
> > +}
> >
> 
> [...]
> 
> > +static int bpf_program__record_externs(struct bpf_program *prog)
> > +{
> > +       struct bpf_object *obj = prog->obj;
> > +       int i;
> > +
> > +       for (i = 0; i < prog->nr_reloc; i++) {
> > +               struct reloc_desc *relo = &prog->reloc_desc[i];
> > +               struct extern_desc *ext = &obj->externs[relo->sym_off];
> > +
> > +               switch (relo->type) {
> > +               case RELO_EXTERN_VAR:
> > +                       if (ext->type != EXT_KSYM)
> > +                               continue;
> > +                       if (!ext->ksym.type_id) /* typeless ksym */
> > +                               continue;
> 
> this shouldn't be silently ignored, if it's not supported, it should
> return error

agree

> > +                       bpf_gen__record_extern(obj->gen_loader, ext->name, BTF_KIND_VAR,
> > +                                              relo->insn_idx);
> > +                       break;
> > +               case RELO_EXTERN_FUNC:
> > +                       bpf_gen__record_extern(obj->gen_loader, ext->name, BTF_KIND_FUNC,
> > +                                              relo->insn_idx);
> > +                       break;
> > +               default:
> > +                       continue;
> > +               }
> > +       }
> > +       return 0;
> > +}
> > +
> 
> [...]
> 
> > @@ -7868,6 +7970,9 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr)
> >         err = err ? : bpf_object__relocate(obj, attr->target_btf_path);
> >         err = err ? : bpf_object__load_progs(obj, attr->log_level);
> >
> > +       if (obj->gen_loader && !err)
> > +               err = bpf_gen__finish(obj->gen_loader);
> > +
> >         /* clean up module BTFs */
> >         for (i = 0; i < obj->btf_module_cnt; i++) {
> >                 close(obj->btf_modules[i].fd);
> > @@ -8493,6 +8598,7 @@ void bpf_object__close(struct bpf_object *obj)
> >         if (obj->clear_priv)
> >                 obj->clear_priv(obj, obj->priv);
> 
> bpf_object__close() will close all those FD=0 in maps/progs, that's not good
> 
> >
> > +       bpf_gen__free(obj->gen_loader);
> >         bpf_object__elf_finish(obj);
> >         bpf_object__unload(obj);
> >         btf__free(obj->btf);
> 
> [...]
> 
> > @@ -9387,7 +9521,13 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
> >         }
> >
> >         /* kernel/module BTF ID */
> > -       err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
> > +       if (prog->obj->gen_loader) {
> > +               bpf_gen__record_attach_target(prog->obj->gen_loader, attach_name, attach_type);
> > +               *btf_obj_fd = 0;
> 
> this will leak kernel module BTF FDs

I don't follow.
When gen_loader is in use, find_kernel_btf_id() is not called and module BTFs are not loaded.

> > +               *btf_type_id = 1;
> > +       } else {
> > +               err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
> > +       }
> >         if (err) {
> >                 pr_warn("failed to find kernel BTF type ID of '%s': %d\n", attach_name, err);
> >                 return err;
> 
> [...]
> 
> > +out:
> > +       close(map_fd);
> > +       close(prog_fd);
> 
> this does close(-1), check >= 0

Right. I felt there is no risk in doing that. I guess extra check is fine too.
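
If the check goes in, a tiny helper keeps the cleanup path tidy. A sketch,
with close_and_reset() being a made-up name:

```c
#include <assert.h>
#include <unistd.h>

/* Close *fd only if it holds a valid descriptor, then mark it unset so
 * that running the cleanup path twice stays harmless.
 */
static void close_and_reset(int *fd)
{
	assert(fd != NULL);
	if (*fd >= 0)
		close(*fd);
	*fd = -1;
}
```

With map_fd and prog_fd initialized to -1, the out: label can call this on
both unconditionally, whichever step failed.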


* Re: [PATCH v2 bpf-next 15/16] bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command.
  2021-04-26 22:35   ` Andrii Nakryiko
@ 2021-04-27  3:28     ` Alexei Starovoitov
  2021-04-27 17:38       ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-27  3:28 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Mon, Apr 26, 2021 at 03:35:16PM -0700, Andrii Nakryiko wrote:
> On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > Add -L flag to bpftool to use libbpf gen_trace facility and syscall/loader program
> > for skeleton generation and program loading.
> >
> > "bpftool gen skeleton -L" command will generate a "light skeleton" or "loader skeleton"
> > that is similar to existing skeleton, but has one major difference:
> > $ bpftool gen skeleton lsm.o > lsm.skel.h
> > $ bpftool gen skeleton -L lsm.o > lsm.lskel.h
> > $ diff lsm.skel.h lsm.lskel.h
> > @@ -5,34 +4,34 @@
> >  #define __LSM_SKEL_H__
> >
> >  #include <stdlib.h>
> > -#include <bpf/libbpf.h>
> > +#include <bpf/bpf.h>
> >
> > The light skeleton does not use majority of libbpf infrastructure.
> > It doesn't need libelf. It doesn't parse .o file.
> > It only needs few sys_bpf wrappers. All of them are in bpf/bpf.h file.
> > In future libbpf/bpf.c can be inlined into bpf.h, so not even libbpf.a would be
> > needed to work with light skeleton.
> >
> > "bpftool prog load -L file.o" command is introduced for debugging of syscall/loader
> > program generation. Just like the same command without -L it will try to load
> > the programs from file.o into the kernel. It won't even try to pin them.
> >
> > "bpftool prog load -L -d file.o" command will provide additional debug messages
> > on how syscall/loader program was generated.
> > Also the execution of syscall/loader program will use bpf_trace_printk() for
> > each step of loading BTF, creating maps, and loading programs.
> > The user can do "cat /.../trace_pipe" for further debug.
> >
> > An example of fexit_sleep.lskel.h generated from progs/fexit_sleep.c:
> > struct fexit_sleep {
> >         struct bpf_loader_ctx ctx;
> >         struct {
> >                 struct bpf_map_desc bss;
> >         } maps;
> >         struct {
> >                 struct bpf_prog_desc nanosleep_fentry;
> >                 struct bpf_prog_desc nanosleep_fexit;
> >         } progs;
> >         struct {
> >                 int nanosleep_fentry_fd;
> >                 int nanosleep_fexit_fd;
> >         } links;
> >         struct fexit_sleep__bss {
> >                 int pid;
> >                 int fentry_cnt;
> >                 int fexit_cnt;
> >         } *bss;
> > };
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
> >  tools/bpf/bpftool/Makefile        |   2 +-
> >  tools/bpf/bpftool/gen.c           | 313 +++++++++++++++++++++++++++---
> >  tools/bpf/bpftool/main.c          |   7 +-
> >  tools/bpf/bpftool/main.h          |   1 +
> >  tools/bpf/bpftool/prog.c          |  80 ++++++++
> >  tools/bpf/bpftool/xlated_dumper.c |   3 +
> >  6 files changed, 382 insertions(+), 24 deletions(-)
> >
> 
> [...]
> 
> > @@ -268,6 +269,254 @@ static void codegen(const char *template, ...)
> >         free(s);
> >  }
> >
> > +static void print_hex(const char *obj_data, int file_sz)
> > +{
> > +       int i, len;
> > +
> > +       /* embed contents of BPF object file */
> 
> nit: this comment should have stayed at the original place
> 
> > +       for (i = 0, len = 0; i < file_sz; i++) {
> > +               int w = obj_data[i] ? 4 : 2;
> > +
> 
> [...]
> 
> > +       bpf_object__for_each_map(map, obj) {
> > +               const char * ident;
> > +
> > +               ident = get_map_ident(map);
> > +               if (!ident)
> > +                       continue;
> > +
> > +               if (!bpf_map__is_internal(map) ||
> > +                   !(bpf_map__def(map)->map_flags & BPF_F_MMAPABLE))
> > +                       continue;
> > +
> > +               printf("\tskel->%1$s =\n"
> > +                      "\t\tmmap(NULL, %2$zd, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED,\n"
> > +                      "\t\t\tskel->maps.%1$s.map_fd, 0);\n",
> > +                      ident, bpf_map_mmap_sz(map));
> 
> use codegen()?

why?
codegen() would add extra early \n for no good reason.

> > +       }
> > +       codegen("\
> > +               \n\
> > +                       return 0;                                           \n\
> > +               }                                                           \n\
> > +                                                                           \n\
> > +               static inline struct %1$s *                                 \n\
> 
> [...]
> 
> >  static int do_skeleton(int argc, char **argv)
> >  {
> >         char header_guard[MAX_OBJ_NAME_LEN + sizeof("__SKEL_H__")];
> > @@ -277,7 +526,7 @@ static int do_skeleton(int argc, char **argv)
> >         struct bpf_object *obj = NULL;
> >         const char *file, *ident;
> >         struct bpf_program *prog;
> > -       int fd, len, err = -1;
> > +       int fd, err = -1;
> >         struct bpf_map *map;
> >         struct btf *btf;
> >         struct stat st;
> > @@ -359,7 +608,25 @@ static int do_skeleton(int argc, char **argv)
> >         }
> >
> >         get_header_guard(header_guard, obj_name);
> > -       codegen("\
> > +       if (use_loader)
> 
> please use {} for such a long if/else, even if it's, technically, a
> single-statement if

I think it reads fine as-is, but, sure, I can add {}

> > +               codegen("\
> > +               \n\
> > +               /* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */   \n\
> > +               /* THIS FILE IS AUTOGENERATED! */                           \n\
> > +               #ifndef %2$s                                                \n\
> > +               #define %2$s                                                \n\
> > +                                                                           \n\
> > +               #include <stdlib.h>                                         \n\
> > +               #include <bpf/bpf.h>                                        \n\
> > +               #include <bpf/skel_internal.h>                              \n\
> > +                                                                           \n\
> > +               struct %1$s {                                               \n\
> > +                       struct bpf_loader_ctx ctx;                          \n\
> > +               ",
> > +               obj_name, header_guard
> > +               );
> > +       else
> > +               codegen("\
> >                 \n\
> >                 /* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */   \n\
> >                                                                             \n\
> 
> [...]


* Re: [PATCH v2 bpf-next 10/16] bpf: Add bpf_btf_find_by_name_kind() helper.
  2021-04-26 22:46   ` Andrii Nakryiko
@ 2021-04-27  3:37     ` Alexei Starovoitov
  2021-04-27 17:45       ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-27  3:37 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Mon, Apr 26, 2021 at 03:46:29PM -0700, Andrii Nakryiko wrote:
> On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > Add new helper:
> >
> > long bpf_btf_find_by_name_kind(u32 btf_fd, char *name, u32 kind, int flags)
> >         Description
> >                 Find given name with given type in BTF pointed to by btf_fd.
> >                 If btf_fd is zero look for the name in vmlinux BTF and in module's BTFs.
> >         Return
> >                 Returns btf_id and btf_obj_fd in lower and upper 32 bits.
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
> >  include/linux/bpf.h            |  1 +
> >  include/uapi/linux/bpf.h       |  8 ++++
> >  kernel/bpf/btf.c               | 68 ++++++++++++++++++++++++++++++++++
> >  kernel/bpf/syscall.c           |  2 +
> >  tools/include/uapi/linux/bpf.h |  8 ++++
> >  5 files changed, 87 insertions(+)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index 0f841bd0cb85..4cf361eb6a80 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -1972,6 +1972,7 @@ extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto;
> >  extern const struct bpf_func_proto bpf_task_storage_get_proto;
> >  extern const struct bpf_func_proto bpf_task_storage_delete_proto;
> >  extern const struct bpf_func_proto bpf_for_each_map_elem_proto;
> > +extern const struct bpf_func_proto bpf_btf_find_by_name_kind_proto;
> >
> >  const struct bpf_func_proto *bpf_tracing_func_proto(
> >         enum bpf_func_id func_id, const struct bpf_prog *prog);
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index de58a714ed36..253f5f031f08 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -4748,6 +4748,13 @@ union bpf_attr {
> >   *             Execute bpf syscall with given arguments.
> >   *     Return
> >   *             A syscall result.
> > + *
> > + * long bpf_btf_find_by_name_kind(u32 btf_fd, char *name, u32 kind, int flags)
> > + *     Description
> > + *             Find given name with given type in BTF pointed to by btf_fd.
> 
> "Find BTF type with given name"? Should the limits on name length be

+1

> specified? KSYM_NAME_LEN is a pretty arbitrary restriction.

that's implementation detail that shouldn't leak into uapi.

> Also,
> would it still work fine if the caller provides a pointer to a much
> shorter piece of memory?
> 
> Why not add name_sz right after name, as we do with a lot of other
> arguments like this?

That's an option too, but then the helper will have 5 args and 'flags'
would likely be useless. I mean it's unlikely to help with extending it.
I was thinking about ARG_PTR_TO_CONST_STR, but it doesn't work,
since blob is writeable by the prog. It's read only from user space.
I'm fine with name, name_sz though.

> 
> > + *             If btf_fd is zero look for the name in vmlinux BTF and in module's BTFs.
> > + *     Return
> > + *             Returns btf_id and btf_obj_fd in lower and upper 32 bits.
> 
> Mention that for vmlinux BTF btf_obj_fd will be zero? Also who "owns"
> the FD? If the BPF program doesn't close it, when are they going to be
> cleaned up?

just like bpf_sys_bpf. Who owns returned FD? The program that called
the helper, of course.
In the current shape of loader prog these btf fds are cleaned up correctly
in success and in error case.
Not all FDs though. map fds will stay around if bpf_sys_bpf(prog_load) fails to load.
Tweaking loader prog to close all FDs in error case is on todo list.
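
For illustration only, a minimal sketch of how a caller could split the
64-bit return value described above (btf_id in the lower 32 bits,
btf_obj_fd in the upper 32 bits). The helper names are hypothetical and
not part of this patch; the sketch assumes errors are reported as
negative values and are checked before decoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helpers, not from the patch: decode the packed result
 * of bpf_btf_find_by_name_kind(). Callers are assumed to have checked
 * ret >= 0 first.
 */
static inline uint32_t sketch_ret_btf_id(int64_t ret)
{
	return (uint32_t)ret;                    /* lower 32 bits */
}

static inline uint32_t sketch_ret_btf_obj_fd(int64_t ret)
{
	return (uint32_t)((uint64_t)ret >> 32);  /* upper 32 bits */
}

/* Mirror of the encoding: btf_id | (btf_obj_fd << 32). */
static inline int64_t sketch_pack_ret(uint32_t btf_id, uint32_t btf_obj_fd)
{
	return (int64_t)(((uint64_t)btf_obj_fd << 32) | btf_id);
}
```

With this layout a type found in vmlinux BTF would decode to
btf_obj_fd == 0, and any non-zero btf_obj_fd is an FD that the calling
program owns and has to close.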


* Re: [PATCH v2 bpf-next 05/16] selftests/bpf: Test for syscall program type
  2021-04-27  2:43     ` Alexei Starovoitov
@ 2021-04-27 16:28       ` Andrii Nakryiko
  0 siblings, 0 replies; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-27 16:28 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Mon, Apr 26, 2021 at 7:43 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Apr 26, 2021 at 10:02:59AM -0700, Andrii Nakryiko wrote:
> > > +/* Copyright (c) 2021 Facebook */
> > > +#include <linux/stddef.h>
> > > +#include <linux/bpf.h>
> > > +#include <bpf/bpf_helpers.h>
> > > +#include <bpf/bpf_tracing.h>
> > > +#include <../../tools/include/linux/filter.h>
> >
> > with TOOLSINCDIR shouldn't this be just <linux/filter.h>?
>
> sadly no. There is uapi/linux/filter.h that gets included first.
> And changing the order of -Is brings the whole set of other issues.
> I couldn't come up with anything better unfortunately.

Then let's at least drop TOOLSINCDIR for now, if it's not really used?


* Re: [PATCH v2 bpf-next 09/16] libbpf: Support for fd_idx
  2021-04-27  2:53     ` Alexei Starovoitov
@ 2021-04-27 16:36       ` Andrii Nakryiko
  2021-04-28  1:32         ` Alexei Starovoitov
  0 siblings, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-27 16:36 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Mon, Apr 26, 2021 at 7:53 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Apr 26, 2021 at 10:14:45AM -0700, Andrii Nakryiko wrote:
> > On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > From: Alexei Starovoitov <ast@kernel.org>
> > >
> > > Add support for FD_IDX and make libbpf prefer that approach to loading programs.
> > >
> > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > ---
> > >  tools/lib/bpf/bpf.c             |  1 +
> > >  tools/lib/bpf/libbpf.c          | 70 +++++++++++++++++++++++++++++----
> > >  tools/lib/bpf/libbpf_internal.h |  1 +
> > >  3 files changed, 65 insertions(+), 7 deletions(-)
> > >
> >

[...]

> > >         for (i = 0; i < obj->nr_programs; i++) {
> > >                 prog = &obj->programs[i];
> > >                 if (prog_is_subprog(obj, prog))
> > > @@ -7256,10 +7308,14 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
> > >                         continue;
> > >                 }
> > >                 prog->log_level |= log_level;
> > > +               prog->fd_array = fd_array;
> >
> > you are not freeing this memory on success, as far as I can see.
>
> hmm. there is free on success below.

right, my bad, I somehow read it as if it was only for the error case

>
> > And
> > given multiple programs are sharing fd_array, it's a bit problematic
> > for prog to have fd_array. This is a per-object property, so let's add
> > it at bpf_object level and clean it up on bpf_object__close()? And by
> > assigning to obj->fd_array at malloc() site, you won't need to do all
> > the error-handling free()s below.
>
> hmm. that sounds worse.
> why add another 8 bytes to bpf_object that won't be used
> until this last step of bpf_object__load_progs.
> And only for the duration of this loading.
> It's cheaper to have this alloc here with two free()s below.

So if you care about extra 8 bytes, then it's even more efficient to
have just one obj->fd_array rather than N prog->fd_array, no? And it's
also not very clean that prog->fd_array will have a dangling pointer
to deallocated memory after bpf_object__load_progs().

But that brings the entire question of why use fd_array at all here?
Commit description doesn't explain why libbpf has to use fd_array and
why it should be preferred. What are the advantages justifying added
complexity and extra memory allocation/clean up? It also reduces test
coverage of the "old ways" that offer the same capabilities. I think
this should be part of the commit description, if we agree that
fd_array has to be used outside of the auto-generated loader program.
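
To make the ownership argument concrete, here is a rough sketch of the
object-level variant. All struct and function names are simplified,
hypothetical stand-ins for libbpf internals, not the actual code. The
array is assigned to the object at the allocation site, so error paths
in the load loop need no free(), and no program is left holding a
pointer to freed memory.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct bpf_object. */
struct sketch_obj {
	int nr_progs;
	int nr_maps;
	int *fd_array;	/* one array shared by all programs of the object */
};

/* fail_at simulates a program that fails to load (-1: none fails).
 * Assumes load is called at most once per object.
 */
static int sketch_obj_load_progs(struct sketch_obj *obj, int fail_at)
{
	int i;

	obj->fd_array = calloc(obj->nr_maps ? obj->nr_maps : 1, sizeof(int));
	if (!obj->fd_array)
		return -ENOMEM;

	for (i = 0; i < obj->nr_progs; i++) {
		if (i == fail_at)
			return -EINVAL;	/* no free() here, close() owns it */
	}
	return 0;
}

static void sketch_obj_close(struct sketch_obj *obj)
{
	free(obj->fd_array);
	obj->fd_array = NULL;
}
```

The trade-off is the one discussed above: the pointer lives in the
object for its whole lifetime rather than only for the duration of the
load step.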


>
> >
> > >                 err = bpf_program__load(prog, obj->license, obj->kern_version);
> > > -               if (err)
> > > +               if (err) {
> > > +                       free(fd_array);
> > >                         return err;
> > > +               }
> > >         }
> > > +       free(fd_array);
> > >         return 0;
> > >  }
> > >
> > > diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
> > > index 6017902c687e..9114c7085f2a 100644
> > > --- a/tools/lib/bpf/libbpf_internal.h
> > > +++ b/tools/lib/bpf/libbpf_internal.h
> > > @@ -204,6 +204,7 @@ struct bpf_prog_load_params {
> > >         __u32 log_level;
> > >         char *log_buf;
> > >         size_t log_buf_sz;
> > > +       int *fd_array;
> > >  };
> > >
> > >  int libbpf__bpf_prog_load(const struct bpf_prog_load_params *load_attr);
> > > --
> > > 2.30.2
> > >
>
> --


* Re: [PATCH v2 bpf-next 12/16] libbpf: Change the order of data and text relocations.
  2021-04-27  3:00     ` Alexei Starovoitov
@ 2021-04-27 16:47       ` Andrii Nakryiko
  0 siblings, 0 replies; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-27 16:47 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Mon, Apr 26, 2021 at 8:00 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Apr 26, 2021 at 10:29:09AM -0700, Andrii Nakryiko wrote:
> > On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > From: Alexei Starovoitov <ast@kernel.org>
> > >
> > > In order to be able to generate loader program in the later
> > > patches change the order of data and text relocations.
> > > Also improve the test to include data relos.
> > >
> > > If the kernel supports "FD array" the map_fd relocations can be processed
> > > before text relos since generated loader program won't need to manually
> > > patch ld_imm64 insns with map_fd.
> > > But ksym and kfunc relocations can only be processed after all calls
> > > are relocated, since loader program will consist of a sequence
> > > of calls to bpf_btf_find_by_name_kind() followed by patching of btf_id
> > > and btf_obj_fd into corresponding ld_imm64 insns. The locations of those
> > > ld_imm64 insns are specified in relocations.
> > > Hence process all data relocations (maps, ksym, kfunc) together after call relos.
> > >
> > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > ---
> > >  tools/lib/bpf/libbpf.c                        | 86 +++++++++++++++----
> > >  .../selftests/bpf/progs/test_subprogs.c       | 13 +++
> > >  2 files changed, 80 insertions(+), 19 deletions(-)
> > >
> > > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > > index 17cfc5b66111..c73a85b97ca5 100644
> > > --- a/tools/lib/bpf/libbpf.c
> > > +++ b/tools/lib/bpf/libbpf.c
> > > @@ -6379,11 +6379,15 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
> > >                         insn[0].imm = ext->ksym.kernel_btf_id;
> > >                         break;
> > >                 case RELO_SUBPROG_ADDR:
> > > -                       insn[0].src_reg = BPF_PSEUDO_FUNC;
> > > -                       /* will be handled as a follow up pass */
> > > +                       if (insn[0].src_reg != BPF_PSEUDO_FUNC) {
> > > +                               pr_warn("prog '%s': relo #%d: bad insn\n",
> > > +                                       prog->name, i);
> > > +                               return -EINVAL;
> > > +                       }
> >
> > given SUBPROG_ADDR is now handled similarly to RELO_CALL in a
> > different place, I'd probably drop this error check and just combine
> > RELO_SUBPROG_ADDR and RELO_CALL cases with just a /* handled already
> > */ comment.
>
> I prefer to keep them separate. I've hit this pr_warn a couple of times
> while messing with relos and it saved me time.
> I bet it will save time for the next developer too.

hmm.. ok, not critical to me

>
> > > +                       /* handled already */
> > >                         break;
> > >                 case RELO_CALL:
> > > -                       /* will be handled as a follow up pass */
> > > +                       /* handled already */
> > >                         break;
> > >                 default:
> > >                         pr_warn("prog '%s': relo #%d: bad relo type %d\n",
> > > @@ -6552,6 +6556,31 @@ static struct reloc_desc *find_prog_insn_relo(const struct bpf_program *prog, si
> > >                        sizeof(*prog->reloc_desc), cmp_relo_by_insn_idx);
> > >  }
> > >
> > > +static int append_subprog_relos(struct bpf_program *main_prog, struct bpf_program *subprog)
> > > +{
> > > +       int new_cnt = main_prog->nr_reloc + subprog->nr_reloc;
> > > +       struct reloc_desc *relos;
> > > +       size_t off = subprog->sub_insn_off;
> > > +       int i;
> > > +
> > > +       if (main_prog == subprog)
> > > +               return 0;
> > > +       relos = libbpf_reallocarray(main_prog->reloc_desc, new_cnt, sizeof(*relos));
> > > +       if (!relos)
> > > +               return -ENOMEM;
> > > +       memcpy(relos + main_prog->nr_reloc, subprog->reloc_desc,
> > > +              sizeof(*relos) * subprog->nr_reloc);
> > > +
> > > +       for (i = main_prog->nr_reloc; i < new_cnt; i++)
> > > +               relos[i].insn_idx += off;
> >
> > nit: off is used only here, so there is little point in having it as a
> > separate var, inline?
>
> sure.
>
> > > +       /* After insn_idx adjustment the 'relos' array is still sorted
> > > +        * by insn_idx and doesn't break bsearch.
> > > +        */
> > > +       main_prog->reloc_desc = relos;
> > > +       main_prog->nr_reloc = new_cnt;
> > > +       return 0;
> > > +}
> > > +
> > >  static int
> > >  bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog,
> > >                        struct bpf_program *prog)
> > > @@ -6560,18 +6589,32 @@ bpf_object__reloc_code(struct bpf_object *obj, struct bpf_program *main_prog,
> > >         struct bpf_program *subprog;
> > >         struct bpf_insn *insns, *insn;
> > >         struct reloc_desc *relo;
> > > -       int err;
> > > +       int err, i;
> > >
> > >         err = reloc_prog_func_and_line_info(obj, main_prog, prog);
> > >         if (err)
> > >                 return err;
> > >
> > > +       for (i = 0; i < prog->nr_reloc; i++) {
> > > +               relo = &prog->reloc_desc[i];
> > > +               insn = &main_prog->insns[prog->sub_insn_off + relo->insn_idx];
> > > +
> > > +               if (relo->type == RELO_SUBPROG_ADDR)
> > > +                       /* mark the insn, so it becomes insn_is_pseudo_func() */
> > > +                       insn[0].src_reg = BPF_PSEUDO_FUNC;
> > > +       }
> > > +
> >
> > This will do the same work over and over each time we append a subprog
> > to main_prog. This should logically follow append_subprog_relos(), but
> > you wanted to do it for main_prog with the same code, right?
>
> It cannot follow append_subprog_relos.
> It has to be done before the loop below.
> Otherwise !insn_is_pseudo_func() won't catch it and all ld_imm64 insns
> will be considered which will make the loop below more complex and slower.
> The find_prog_insn_relo() will be called a lot more times.
> The !relo condition would have to be treated differently for ld_imm64 vs call insns, etc.

if you process main_prog->insns first all the calls to subprogs would
be updated, then it recursively would do this right before a new
subprog is added (initial call is bpf_object__reloc_code(obj, prog,
prog) where prog is entry-point program). But either way I'm not
suggesting doing this and splitting this logic into two places.

>
> > How about instead doing this before we start appending subprogs to
> > main_progs? I.e., do it explicitly in bpf_object__relocate() before
> > you start code relocation loop.
>
> Not sure I follow.
> Do another loop:
>  for (i = 0; i < obj->nr_programs; i++)
>     for (j = 0; j < prog->nr_reloc; j++)
>       if (relo->type == RELO_SUBPROG_ADDR)
>       ?
> That's an option too.
> I can do that if you prefer.
> It felt cleaner to do this mark here right before the loop below that needs it.

Yes, I'm proposing to do another loop in bpf_object__relocate() before
we start adding subprogs to main_progs. The reason is that
bpf_object__reloc_code() is called recursively many times for the same
main_prog, so doing that here is O(N^2) in the number of total
instructions in main_prog. It processes the same (already processed)
instructions many times unnecessarily. It's wasteful and unclean.
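
A rough sketch of the separate pass I have in mind, using simplified,
hypothetical types rather than libbpf's real ones (SK_PSEUDO_FUNC is a
stand-in for BPF_PSEUDO_FUNC from linux/bpf.h). It visits every
relocation of every program exactly once, before any subprog code is
appended to a main program.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for libbpf internals. */
enum sketch_relo_type { SK_RELO_CALL, SK_RELO_SUBPROG_ADDR, SK_RELO_DATA };
#define SK_PSEUDO_FUNC 4	/* stand-in for BPF_PSEUDO_FUNC */

struct sketch_insn { int src_reg; };
struct sketch_relo { enum sketch_relo_type type; int insn_idx; };
struct sketch_prog {
	struct sketch_insn *insns;
	struct sketch_relo *relos;
	int nr_reloc;
};

/* Mark every insn that has a RELO_SUBPROG_ADDR relocation so that a
 * later insn_is_pseudo_func()-style check recognizes it.
 * Returns the number of instructions marked.
 */
static int sketch_mark_subprog_addr(struct sketch_prog *progs, int nr_progs)
{
	int i, j, marked = 0;

	for (i = 0; i < nr_progs; i++) {
		struct sketch_prog *prog = &progs[i];

		for (j = 0; j < prog->nr_reloc; j++) {
			struct sketch_relo *relo = &prog->relos[j];

			if (relo->type != SK_RELO_SUBPROG_ADDR)
				continue;
			prog->insns[relo->insn_idx].src_reg = SK_PSEUDO_FUNC;
			marked++;
		}
	}
	return marked;
}
```

Since this runs once per object, its cost is linear in the total number
of relocations, independent of how many times the code relocation step
recurses afterwards.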

>
> > >         for (insn_idx = 0; insn_idx < prog->sec_insn_cnt; insn_idx++) {
> > >                 insn = &main_prog->insns[prog->sub_insn_off + insn_idx];
> > >                 if (!insn_is_subprog_call(insn) && !insn_is_pseudo_func(insn))
> > >                         continue;
> > >
> > >                 relo = find_prog_insn_relo(prog, insn_idx);
> > > +               if (relo && relo->type == RELO_EXTERN_FUNC)
> > > +                       /* kfunc relocations will be handled later
> > > +                        * in bpf_object__relocate_data()
> > > +                        */
> > > +                       continue;
> > >                 if (relo && relo->type != RELO_CALL && relo->type != RELO_SUBPROG_ADDR) {
> > >                         pr_warn("prog '%s': unexpected relo for insn #%zu, type %d\n",
> > >                                 prog->name, insn_idx, relo->type);
> >
> > [...]
>
> --


* Re: [PATCH v2 bpf-next 14/16] libbpf: Generate loader program out of BPF ELF file.
  2021-04-27  3:25     ` Alexei Starovoitov
@ 2021-04-27 17:34       ` Andrii Nakryiko
  2021-04-28  1:42         ` Alexei Starovoitov
  0 siblings, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-27 17:34 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Mon, Apr 26, 2021 at 8:25 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Apr 26, 2021 at 03:22:36PM -0700, Andrii Nakryiko wrote:
> > On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > From: Alexei Starovoitov <ast@kernel.org>
> > >
> > > The BPF program loading process performed by libbpf is quite complex
> > > and consists of the following steps:
> > > "open" phase:
> > > - parse elf file and remember relocations, sections
> > > - collect externs and ksyms including their btf_ids in prog's BTF
> > > - patch BTF datasec (since llvm couldn't do it)
> > > - init maps (old style map_def, BTF based, global data map, kconfig map)
> > > - collect relocations against progs and maps
> > > "load" phase:
> > > - probe kernel features
> > > - load vmlinux BTF
> > > - resolve externs (kconfig and ksym)
> > > - load program BTF
> > > - init struct_ops
> > > - create maps
> > > - apply CO-RE relocations
> > > - patch ld_imm64 insns with src_reg=PSEUDO_MAP, PSEUDO_MAP_VALUE, PSEUDO_BTF_ID
> > > - reposition subprograms and adjust call insns
> > > - sanitize and load progs
> > >
> > > During this process libbpf does sys_bpf() calls to load BTF, create maps,
> > > populate maps and finally load programs.
> > > Instead of actually doing the syscalls generate a trace of what libbpf
> > > would have done and represent it as the "loader program".
> > > The "loader program" consists of single map with:
> > > - union bpf_attr(s)
> > > - BTF bytes
> > > - map value bytes
> > > - insns bytes
> > > and single bpf program that passes bpf_attr(s) and data into bpf_sys_bpf() helper.
> > > Executing such "loader program" via bpf_prog_test_run() command will
> > > replay the sequence of syscalls that libbpf would have done, which will result in
> > > the same maps created and programs loaded as specified in the elf file.
> > > The "loader program" removes libelf and the majority of the libbpf dependency from
> > > the program loading process.
> > >
> > > kconfig, typeless ksym, struct_ops and CO-RE are not supported yet.
> > >
> > > The order of relocate_data and relocate_calls had to change, so that
> > > bpf_gen__prog_load() can see all relocations for a given program with
> > > correct insn_idx-es.
> > >
> > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > ---
> > >  tools/lib/bpf/Build              |   2 +-
> > >  tools/lib/bpf/bpf_gen_internal.h |  40 ++
> > >  tools/lib/bpf/gen_loader.c       | 615 +++++++++++++++++++++++++++++++
> > >  tools/lib/bpf/libbpf.c           | 204 ++++++++--
> > >  tools/lib/bpf/libbpf.h           |  12 +
> > >  tools/lib/bpf/libbpf.map         |   1 +
> > >  tools/lib/bpf/libbpf_internal.h  |   2 +
> > >  tools/lib/bpf/skel_internal.h    | 105 ++++++
> > >  8 files changed, 948 insertions(+), 33 deletions(-)
> > >  create mode 100644 tools/lib/bpf/bpf_gen_internal.h
> > >  create mode 100644 tools/lib/bpf/gen_loader.c
> > >  create mode 100644 tools/lib/bpf/skel_internal.h
> > >
> > > diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build
> > > index 9b057cc7650a..430f6874fa41 100644
> > > --- a/tools/lib/bpf/Build
> > > +++ b/tools/lib/bpf/Build
> > > @@ -1,3 +1,3 @@
> > >  libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \
> > >             netlink.o bpf_prog_linfo.o libbpf_probes.o xsk.o hashmap.o \
> > > -           btf_dump.o ringbuf.o strset.o linker.o
> > > +           btf_dump.o ringbuf.o strset.o linker.o gen_loader.o
> > > diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h
> > > new file mode 100644
> > > index 000000000000..dc3e2cbf9ce3
> > > --- /dev/null
> > > +++ b/tools/lib/bpf/bpf_gen_internal.h
> > > @@ -0,0 +1,40 @@
> > > +/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
> > > +/* Copyright (c) 2021 Facebook */
> > > +#ifndef __BPF_GEN_INTERNAL_H
> > > +#define __BPF_GEN_INTERNAL_H
> > > +
> > > +struct relo_desc {
> >
> > there is very similarly named reloc_desc struct in libbpf.c, can you
> > rename it to something like gen_btf_relo_desc?
>
> sure.
>
> > > +       const char *name;
> > > +       int kind;
> > > +       int insn_idx;
> > > +};
> > > +
> >
> > [...]
> >
> > > +
> > > +static int bpf_gen__realloc_insn_buf(struct bpf_gen *gen, __u32 size)
> > > +{
> > > +       size_t off = gen->insn_cur - gen->insn_start;
> > > +
> > > +       if (gen->error)
> > > +               return gen->error;
> > > +       if (size > INT32_MAX || off + size > INT32_MAX) {
> > > +               gen->error = -ERANGE;
> > > +               return -ERANGE;
> > > +       }
> > > +       gen->insn_start = realloc(gen->insn_start, off + size);
> >
> > leaking memory here: gen->insn_start will be NULL on failure
>
> ohh. good catch.
>
> > > +       if (!gen->insn_start) {
> > > +               gen->error = -ENOMEM;
> > > +               return -ENOMEM;
> > > +       }
> > > +       gen->insn_cur = gen->insn_start + off;
> > > +       return 0;
> > > +}
> > > +
> > > +static int bpf_gen__realloc_data_buf(struct bpf_gen *gen, __u32 size)
> > > +{
> > > +       size_t off = gen->data_cur - gen->data_start;
> > > +
> > > +       if (gen->error)
> > > +               return gen->error;
> > > +       if (size > INT32_MAX || off + size > INT32_MAX) {
> > > +               gen->error = -ERANGE;
> > > +               return -ERANGE;
> > > +       }
> > > +       gen->data_start = realloc(gen->data_start, off + size);
> >
> > same as above
> >
> > > +       if (!gen->data_start) {
> > > +               gen->error = -ENOMEM;
> > > +               return -ENOMEM;
> > > +       }
> > > +       gen->data_cur = gen->data_start + off;
> > > +       return 0;
> > > +}
> > > +
> >
> > [...]
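
For the two realloc() sites flagged above, the usual fix is to realloc
into a temporary and only overwrite the stored pointer on success. A
minimal sketch with a hypothetical, simplified buffer type (it tracks
the used size instead of a cursor pointer, which is an assumption made
to keep the example short):

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for the insn/data buffers in bpf_gen. */
struct sketch_buf {
	char *start;
	size_t used;
	int error;	/* sticky error, like gen->error */
};

/* Make room for 'size' more bytes after the 'used' bytes already stored. */
static int sketch_buf_reserve(struct sketch_buf *b, size_t size)
{
	char *tmp;

	if (b->error)
		return b->error;
	if (!size)
		return 0;
	if (size > INT32_MAX || b->used + size > INT32_MAX) {
		b->error = -ERANGE;
		return -ERANGE;
	}
	tmp = realloc(b->start, b->used + size);
	if (!tmp) {
		/* b->start is still valid and is released by the normal free path */
		b->error = -ENOMEM;
		return -ENOMEM;
	}
	b->start = tmp;
	return 0;
}
```

On failure the old allocation stays reachable through b->start, so the
regular cleanup function can still free it.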
> >
> > > +
> > > +static void bpf_gen__emit_sys_bpf(struct bpf_gen *gen, int cmd, int attr, int attr_size)
> > > +{
> > > +       bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_1, cmd));
> > > +       bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_2, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, attr));
> >
> > is attr an offset into a blob? if yes, attr_off? or attr_base_off,
> > anything with _off
>
> yes. it's an offset into a blob, but I don't use _off anywhere
> otherwise all variables throughout would have to have _off which is too verbose.

After reading bpf_gen__emit_rel_store() which uses "off" I assumed
offset terminology would be used everywhere, but I got used to that by
the time I finished reading, so I don't care anymore.

>
> > > +       bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_3, attr_size));
> > > +       bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_sys_bpf));
> > > +       /* remember the result in R7 */
> > > +       bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_7, BPF_REG_0));
> > > +}
> > > +
> > > +static void bpf_gen__emit_check_err(struct bpf_gen *gen)
> > > +{
> > > +       bpf_gen__emit(gen, BPF_JMP_IMM(BPF_JSGE, BPF_REG_7, 0, 2));
> > > +       bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_0, BPF_REG_7));
> > > +       bpf_gen__emit(gen, BPF_EXIT_INSN());
> > > +}
> > > +
> > > +static void __bpf_gen__debug(struct bpf_gen *gen, int reg1, int reg2, const char *fmt, va_list args)
> >
> > Can you please leave a comment on what reg1 and reg2 are, it's not very
> > clear and the code clearly assumes that it can't be reg[1-4]. It's
> > probably those special R7 and R9 (or -1, of course), but having a
> > short comment makes sense to not jump around trying to figure out
> > possible inputs.
> >
> > Oh, reading further, it can also be R0.
>
> good point. will add a comment.
>
> > > +{
> > > +       char buf[1024];
> > > +       int addr, len, ret;
> > > +
> > > +       if (!gen->log_level)
> > > +               return;
> > > +       ret = vsnprintf(buf, sizeof(buf), fmt, args);
> > > +       if (ret < 1024 - 7 && reg1 >= 0 && reg2 < 0)
> > > +               /* The special case to accommodate common bpf_gen__debug_ret():
> > > +                * to avoid specifying BPF_REG_7 and adding " r=%%d" to prints explicitly.
> > > +                */
> > > +               strcat(buf, " r=%d");
> > > +       len = strlen(buf) + 1;
> > > +       addr = bpf_gen__add_data(gen, buf, len);
> >
> > nit: offset, not address, right?
>
> it's actually an address.
> From pov of the program and insn below:
>
> > > +       bpf_gen__emit2(gen, BPF_LD_IMM64_RAW_FULL(BPF_REG_1, BPF_PSEUDO_MAP_IDX_VALUE, 0, 0, 0, addr));
>
> I guess it's kinda both. It's an offset within global data, but no one calls
> int var;
> &var -> taking an offset here? no. It's taking an address of the glob var.
> Our implementation of global data is via single map value element.
> So global vars have offsets too.
> imo addr is more accurate here and throughout this file.

alright

>
> > > +       bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_2, len));
> > > +       if (reg1 >= 0)
> > > +               bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_3, reg1));
> > > +       if (reg2 >= 0)
> > > +               bpf_gen__emit(gen, BPF_MOV64_REG(BPF_REG_4, reg2));
> > > +       bpf_gen__emit(gen, BPF_EMIT_CALL(BPF_FUNC_trace_printk));
> > > +}
> > > +
> >
> > [...]
> >
> > > +int bpf_gen__finish(struct bpf_gen *gen)
> > > +{
> > > +       int i;
> > > +
> > > +       bpf_gen__emit_sys_close_stack(gen, stack_off(btf_fd));
> > > +       for (i = 0; i < gen->nr_progs; i++)
> > > +               bpf_gen__move_stack2ctx(gen,
> > > +                                       sizeof(struct bpf_loader_ctx) +
> > > +                                       sizeof(struct bpf_map_desc) * gen->nr_maps +
> > > +                                       sizeof(struct bpf_prog_desc) * i +
> > > +                                       offsetof(struct bpf_prog_desc, prog_fd), 4,
> > > +                                       stack_off(prog_fd[i]));
> > > +       for (i = 0; i < gen->nr_maps; i++)
> > > +               bpf_gen__move_stack2ctx(gen,
> > > +                                       sizeof(struct bpf_loader_ctx) +
> > > +                                       sizeof(struct bpf_map_desc) * i +
> > > +                                       offsetof(struct bpf_map_desc, map_fd), 4,
> > > +                                       stack_off(map_fd[i]));
> > > +       bpf_gen__emit(gen, BPF_MOV64_IMM(BPF_REG_0, 0));
> > > +       bpf_gen__emit(gen, BPF_EXIT_INSN());
> > > +       pr_debug("bpf_gen__finish %d\n", gen->error);
> >
> > maybe prefix all those pr_debug()s with "gen: " to distinguish them
> > from the rest of libbpf logging?
>
> sure.
>
> > > +       if (!gen->error) {
> > > +               struct gen_loader_opts *opts = gen->opts;
> > > +
> > > +               opts->insns = gen->insn_start;
> > > +               opts->insns_sz = gen->insn_cur - gen->insn_start;
> > > +               opts->data = gen->data_start;
> > > +               opts->data_sz = gen->data_cur - gen->data_start;
> > > +       }
> > > +       return gen->error;
> > > +}
> > > +
> > > +void bpf_gen__free(struct bpf_gen *gen)
> > > +{
> > > +       if (!gen)
> > > +               return;
> > > +       free(gen->data_start);
> > > +       free(gen->insn_start);
> > > +       gen->data_start = NULL;
> > > +       gen->insn_start = NULL;
> >
> > what's the point of NULL'ing them out if you don't clear gen->data_cur
> > and gen->insn_cur?
>
> To spot the bugs quicker if there are issues.
>
> > also should it free(gen) itself?
>
> ohh. bpf_object__close() should probably do it to stay symmetrical with calloc.

in libbpf usually xxx__free() frees the object itself, so doing
free(gen) here would be consistent with that, IMO
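
Roughly what I mean, as a sketch with a hypothetical, cut-down struct
(the real struct bpf_gen has more fields):

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, cut-down stand-in for struct bpf_gen. */
struct sketch_gen {
	void *data_start;
	void *insn_start;
};

static struct sketch_gen *sketch_gen__new(void)
{
	return calloc(1, sizeof(struct sketch_gen));
}

/* Frees the buffers and the object itself; NULL is accepted. */
static void sketch_gen__free(struct sketch_gen *gen)
{
	if (!gen)
		return;
	free(gen->data_start);
	free(gen->insn_start);
	free(gen);
}
```

With the object freed here, there is nothing left to NULL out, and the
caller only has to drop its own pointer.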

>
> > > +}
> > > +
> > > +void bpf_gen__load_btf(struct bpf_gen *gen, const void *btf_raw_data, __u32 btf_raw_size)
> > > +{
> > > +       union bpf_attr attr = {};
> >
> > here and below: memset(0)?
>
> that's unnecessary. there is no backward/forward compat issue here.
> and bpf_attr doesn't have gaps inside.

systemd definitely had a problem with non-zero padding with such usage
of bpf_attr recently, but I don't remember which command specifically.
Is there any downside to making sure that this will keep working for
later bpf_attr changes regardless of whether there are gaps or not?
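
As a sketch of the defensive variant, using a hypothetical union with
padding rather than the real union bpf_attr: memset() zeroes every byte
of the object, including padding and the bytes outside whichever member
gets filled in afterwards, which an "= {}" initializer may not
guarantee depending on compiler and language standard.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical union with internal padding, standing in for bpf_attr. */
union sketch_attr {
	struct {
		uint8_t kind;	/* typically followed by padding bytes */
		uint64_t ptr;
	} load_cmd;
	struct {
		uint32_t fd;
	} close_cmd;
};

/* Zero every byte, padding included, before any field is set. */
static void sketch_attr_init(union sketch_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
}

/* Returns 1 if all bytes of the union are zero, 0 otherwise. */
static int sketch_attr_is_zeroed(const union sketch_attr *attr)
{
	const unsigned char *p = (const unsigned char *)attr;
	size_t i;

	for (i = 0; i < sizeof(*attr); i++)
		if (p[i])
			return 0;
	return 1;
}
```

The cost is one memset() per attr, which is negligible next to the
syscall that follows.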

>
> > > +       int attr_size = offsetofend(union bpf_attr, btf_log_level);
> > > +       int btf_data, btf_load_attr;
> > > +
> > > +       pr_debug("btf_load: size %d\n", btf_raw_size);
> > > +       btf_data = bpf_gen__add_data(gen, btf_raw_data, btf_raw_size);
> > > +
> >
> > [...]
> >
> > > +       map_create_attr = bpf_gen__add_data(gen, &attr, attr_size);
> > > +       if (attr.btf_value_type_id)
> > > +               /* populate union bpf_attr with btf_fd saved in the stack earlier */
> > > +               bpf_gen__move_stack2blob(gen, map_create_attr + offsetof(union bpf_attr, btf_fd), 4,
> > > +                                        stack_off(btf_fd));
> > > +       switch (attr.map_type) {
> > > +       case BPF_MAP_TYPE_ARRAY_OF_MAPS:
> > > +       case BPF_MAP_TYPE_HASH_OF_MAPS:
> > > +               bpf_gen__move_stack2blob(gen, map_create_attr + offsetof(union bpf_attr, inner_map_fd),
> > > +                                        4, stack_off(inner_map_fd));
> > > +               close_inner_map_fd = true;
> > > +               break;
> > > +       default:;
> >
> > default:
> >     break;
>
> why?

because it's consistent with all the code in libbpf:

$ rg --multiline 'default:\s*\n\s*break;' tools/lib/bpf

returns quite a few cases, while

$ rg --multiline 'default:\s*;' tools/lib/bpf

returns none

>
> > > +       }
> > > +       /* emit MAP_CREATE command */
> > > +       bpf_gen__emit_sys_bpf(gen, BPF_MAP_CREATE, map_create_attr, attr_size);
> > > +       bpf_gen__debug_ret(gen, "map_create %s idx %d type %d value_size %d",
> > > +                          attr.map_name, map_idx, map_attr->map_type, attr.value_size);
> > > +       bpf_gen__emit_check_err(gen);
> >
> > what will happen on error with inner_map_fd and all the other fds
> > created by now?
>
> that's a todo item. I'll add a comment that error path is far from perfect.

ok

>
> > > +       /* remember map_fd in the stack, if successful */
> > > +       if (map_idx < 0) {
> > > +               /* This bpf_gen__map_create() function is called with map_idx >= 0 for all maps
> > > +                * that libbpf loading logic tracks.
> > > +                * It's called with -1 to create an inner map.
> > > +                */
> > > +               bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(inner_map_fd)));
> > > +       } else {
> > > +               if (map_idx != gen->nr_maps) {
> >
> > why would that happen? defensive programming? and even then `if () {}
> > else if () {} else {}` structure is more appropriate
>
> sure. will use that style.
>
> > > +                       gen->error = -EDOM; /* internal bug */
> > > +                       return;
> > > +               }
> > > +               bpf_gen__emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7, stack_off(map_fd[map_idx])));
> > > +               gen->nr_maps++;
> > > +       }
> > > +       if (close_inner_map_fd)
> > > +               bpf_gen__emit_sys_close_stack(gen, stack_off(inner_map_fd));
> > > +}
> > > +
> >
> > [...]
> >
> > > +static void bpf_gen__cleanup_relos(struct bpf_gen *gen, int insns)
> > > +{
> > > +       int i, insn;
> > > +
> > > +       for (i = 0; i < gen->relo_cnt; i++) {
> > > +               if (gen->relos[i].kind != BTF_KIND_VAR)
> > > +                       continue;
> > > +               /* close fd recorded in insn[insn_idx + 1].imm */
> > > +               insn = insns + sizeof(struct bpf_insn) * (gen->relos[i].insn_idx + 1)
> > > +                       + offsetof(struct bpf_insn, imm);
> > > +               bpf_gen__emit_sys_close_blob(gen, insn);
> >
> > wouldn't this close the same FD used across multiple "relos" multiple times?
>
> no. since every relo has its own fd.
> Right now the loader gen is simple and doesn't do preprocessing of
> all relos for all progs to avoid complicating the code.
> It's good enough for now.

I keep forgetting about this split of "this will happen much later in
kernel" vs "this is what libbpf is normally doing". This is going to
be fun to support.
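
For reference, the offset arithmetic in bpf_gen__cleanup_relos() can be checked in isolation. The sketch below assumes the usual 8-byte instruction layout and uses a local struct demo_insn in place of struct bpf_insn; the function name is made up for illustration. It computes where, inside the blob, the imm field of insn[insn_idx + 1] lives, which is where the quoted comment says the fd is recorded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local copy of the 8-byte BPF instruction layout, for illustration only. */
struct demo_insn {
	uint8_t code;
	uint8_t regs;   /* dst_reg:4 and src_reg:4 packed together */
	int16_t off;
	int32_t imm;
};

/* Blob offset of the imm field of the instruction that follows insn_idx,
 * given that the instruction array starts at blob offset insns_off.
 */
static int demo_second_imm_off(int insns_off, int insn_idx)
{
	assert(insn_idx >= 0);
	return insns_off + (int)sizeof(struct demo_insn) * (insn_idx + 1)
		+ (int)offsetof(struct demo_insn, imm);
}
```

With instructions starting at blob offset 100 and a relocation on instruction 2, the fd slot sits at 100 + 8 * 3 + 4 = 128.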

>
> > > +       }
> > > +       if (gen->relo_cnt) {
> > > +               free(gen->relos);
> > > +               gen->relo_cnt = 0;
> > > +               gen->relos = NULL;
> > > +       }
> > > +}
> > > +
> >
> > [...]
> >
> > > +       struct bpf_gen *gen_loader;
> > > +
> > >         /*
> > >          * Information when doing elf related work. Only valid if fd
> > >          * is valid.
> > > @@ -2651,7 +2654,15 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj)
> > >                 bpf_object__sanitize_btf(obj, kern_btf);
> > >         }
> > >
> > > -       err = btf__load(kern_btf);
> > > +       if (obj->gen_loader) {
> > > +               __u32 raw_size = 0;
> > > +               const void *raw_data = btf__get_raw_data(kern_btf, &raw_size);
> >
> > this can return NULL on ENOMEM
>
> good point.
>
> > > +
> > > +               bpf_gen__load_btf(obj->gen_loader, raw_data, raw_size);
> > > +               btf__set_fd(kern_btf, 0);
> >
> > why setting fd to 0 (stdin)? does gen depend on this somewhere? The
> > problem is that it will eventually be closed on btf__free(), which
> > will close stdin, causing a big surprise. What will happen if you
> > leave it at -1?
>
> unfortunately there are various pieces of the code that check <0
> means not init.
> I can try to fix them all, but it felt unnecessarily complex vs screwing up stdin.
> It's bpftool's stdin, but you're right that it's not pretty.
> I can probably use special very large FD number and check for it.
> Still cleaner than fixing all checks.

Maybe after generation go over each prog/map/BTF and reset their FDs
to -1 if gen_loader is used? Or I guess we can just sprinkle

if (!obj->gen_loader)
    close(fd);

in the right places. Not great from code readability, but at least
won't have spurious close()s.

>
> >
> > > +       } else {
> > > +               err = btf__load(kern_btf);
> > > +       }
> > >         if (sanitize) {
> > >                 if (!err) {
> > >                         /* move fd to libbpf's BTF */
> > > @@ -4262,6 +4273,12 @@ static bool kernel_supports(const struct bpf_object *obj, enum kern_feature_id f
> > >         struct kern_feature_desc *feat = &feature_probes[feat_id];
> > >         int ret;
> > >
> > > +       if (obj->gen_loader)
> > > +               /* To generate loader program assume the latest kernel
> > > +                * to avoid doing extra prog_load, map_create syscalls.
> > > +                */
> > > +               return true;
> > > +
> > >         if (READ_ONCE(feat->res) == FEAT_UNKNOWN) {
> > >                 ret = feat->probe();
> > >                 if (ret > 0) {
> > > @@ -4344,6 +4361,13 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
> > >         char *cp, errmsg[STRERR_BUFSIZE];
> > >         int err, zero = 0;
> > >
> > > +       if (obj->gen_loader) {
> > > +               bpf_gen__map_update_elem(obj->gen_loader, map - obj->maps,
> >
> > it would be great for bpf_gen__map_update_elem to reflect that it's
> > not a generic map_update_elem() call, rather special internal map
> > update (just use bpf_gen__populate_internal_map?) Whether to freeze or
> > not could be just a flag to the same call, they always go together.
>
> It's actually generic map_update_elem. I haven't used it in inner map init yet.
> Still on my todo list.

Right now it assumes the key is zero and sizeof(int), I was wondering
if you are going to capture generic key in the future. Ok.

>
> > > +                                        map->mmaped, map->def.value_size);
> > > +               if (map_type == LIBBPF_MAP_RODATA || map_type == LIBBPF_MAP_KCONFIG)
> > > +                       bpf_gen__map_freeze(obj->gen_loader, map - obj->maps);
> > > +               return 0;
> > > +       }
> > >         err = bpf_map_update_elem(map->fd, &zero, map->mmaped, 0);
> > >         if (err) {
> > >                 err = -errno;

[...]

> > > @@ -9387,7 +9521,13 @@ static int libbpf_find_attach_btf_id(struct bpf_program *prog, int *btf_obj_fd,
> > >         }
> > >
> > >         /* kernel/module BTF ID */
> > > -       err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
> > > +       if (prog->obj->gen_loader) {
> > > +               bpf_gen__record_attach_target(prog->obj->gen_loader, attach_name, attach_type);
> > > +               *btf_obj_fd = 0;
> >
> > this will leak kernel module BTF FDs
>
> I don't follow.
> When gen_loader is happening the find_kernel_btf_id() is not called and modules BTFs are not loaded.

Yeah, my bad again, I misread diff as if find_kernel_btf_id() was
still called before. Keep forgetting about this "in the kernel, in the
future" semantics. Never mind.

>
> > > +               *btf_type_id = 1;
> > > +       } else {
> > > +               err = find_kernel_btf_id(prog->obj, attach_name, attach_type, btf_obj_fd, btf_type_id);
> > > +       }
> > >         if (err) {
> > >                 pr_warn("failed to find kernel BTF type ID of '%s': %d\n", attach_name, err);
> > >                 return err;
> >
> > [...]
> >
> > > +out:
> > > +       close(map_fd);
> > > +       close(prog_fd);
> >
> > this does close(-1), check >= 0
>
> Right. I felt there is no risk in doing that. I guess extra check is fine too.

We try to not do this even in selftests, so good to not have spurious close()s.
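
A small helper is one way to keep such cleanup paths tidy. This is only a sketch with a made-up name (demo_close_fd), not something that exists in libbpf or the selftests: it skips close() for negative values and resets the slot to -1 so that a second pass over the same cleanup code is harmless.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Close *fd only if it holds a valid descriptor, then mark it unused.
 * Returns 1 if close() was called, 0 if it was skipped.
 */
static int demo_close_fd(int *fd)
{
	assert(fd != NULL);
	if (*fd < 0)
		return 0;
	close(*fd);
	*fd = -1;
	return 1;
}
```

The out: label would then call demo_close_fd(&map_fd) and demo_close_fd(&prog_fd), assuming both variables start out as -1.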


* Re: [PATCH v2 bpf-next 15/16] bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command.
  2021-04-27  3:28     ` Alexei Starovoitov
@ 2021-04-27 17:38       ` Andrii Nakryiko
  0 siblings, 0 replies; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-27 17:38 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Mon, Apr 26, 2021 at 8:28 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Apr 26, 2021 at 03:35:16PM -0700, Andrii Nakryiko wrote:
> > On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > From: Alexei Starovoitov <ast@kernel.org>
> > >
> > > Add -L flag to bpftool to use libbpf gen_trace facility and syscall/loader program
> > > for skeleton generation and program loading.
> > >
> > > "bpftool gen skeleton -L" command will generate a "light skeleton" or "loader skeleton"
> > > that is similar to existing skeleton, but has one major difference:
> > > $ bpftool gen skeleton lsm.o > lsm.skel.h
> > > $ bpftool gen skeleton -L lsm.o > lsm.lskel.h
> > > $ diff lsm.skel.h lsm.lskel.h
> > > @@ -5,34 +4,34 @@
> > >  #define __LSM_SKEL_H__
> > >
> > >  #include <stdlib.h>
> > > -#include <bpf/libbpf.h>
> > > +#include <bpf/bpf.h>
> > >
> > > The light skeleton does not use majority of libbpf infrastructure.
> > > It doesn't need libelf. It doesn't parse .o file.
> > > It only needs few sys_bpf wrappers. All of them are in bpf/bpf.h file.
> > > In future libbpf/bpf.c can be inlined into bpf.h, so not even libbpf.a would be
> > > needed to work with light skeleton.
> > >
> > > "bpftool prog load -L file.o" command is introduced for debugging of syscall/loader
> > > program generation. Just like the same command without -L it will try to load
> > > the programs from file.o into the kernel. It won't even try to pin them.
> > >
> > > "bpftool prog load -L -d file.o" command will provide additional debug messages
> > > on how syscall/loader program was generated.
> > > Also the execution of syscall/loader program will use bpf_trace_printk() for
> > > each step of loading BTF, creating maps, and loading programs.
> > > The user can do "cat /.../trace_pipe" for further debug.
> > >
> > > An example of fexit_sleep.lskel.h generated from progs/fexit_sleep.c:
> > > struct fexit_sleep {
> > >         struct bpf_loader_ctx ctx;
> > >         struct {
> > >                 struct bpf_map_desc bss;
> > >         } maps;
> > >         struct {
> > >                 struct bpf_prog_desc nanosleep_fentry;
> > >                 struct bpf_prog_desc nanosleep_fexit;
> > >         } progs;
> > >         struct {
> > >                 int nanosleep_fentry_fd;
> > >                 int nanosleep_fexit_fd;
> > >         } links;
> > >         struct fexit_sleep__bss {
> > >                 int pid;
> > >                 int fentry_cnt;
> > >                 int fexit_cnt;
> > >         } *bss;
> > > };
> > >
> > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > ---
> > >  tools/bpf/bpftool/Makefile        |   2 +-
> > >  tools/bpf/bpftool/gen.c           | 313 +++++++++++++++++++++++++++---
> > >  tools/bpf/bpftool/main.c          |   7 +-
> > >  tools/bpf/bpftool/main.h          |   1 +
> > >  tools/bpf/bpftool/prog.c          |  80 ++++++++
> > >  tools/bpf/bpftool/xlated_dumper.c |   3 +
> > >  6 files changed, 382 insertions(+), 24 deletions(-)
> > >
> >
> > [...]
> >
> > > @@ -268,6 +269,254 @@ static void codegen(const char *template, ...)
> > >         free(s);
> > >  }
> > >
> > > +static void print_hex(const char *obj_data, int file_sz)
> > > +{
> > > +       int i, len;
> > > +
> > > +       /* embed contents of BPF object file */
> >
> > nit: this comment should have stayed at the original place
> >
> > > +       for (i = 0, len = 0; i < file_sz; i++) {
> > > +               int w = obj_data[i] ? 4 : 2;
> > > +
> >
> > [...]
> >
> > > +       bpf_object__for_each_map(map, obj) {
> > > +               const char * ident;
> > > +
> > > +               ident = get_map_ident(map);
> > > +               if (!ident)
> > > +                       continue;
> > > +
> > > +               if (!bpf_map__is_internal(map) ||
> > > +                   !(bpf_map__def(map)->map_flags & BPF_F_MMAPABLE))
> > > +                       continue;
> > > +
> > > +               printf("\tskel->%1$s =\n"
> > > +                      "\t\tmmap(NULL, %2$zd, PROT_READ | PROT_WRITE, MAP_SHARED | MAP_FIXED,\n"
> > > +                      "\t\t\tskel->maps.%1$s.map_fd, 0);\n",
> > > +                      ident, bpf_map_mmap_sz(map));
> >
> > use codegen()?
>
> why?
> codegen() would add extra early \n for no good reason.

for consistency, seems like the rest of the code in that function uses
codegen(). But not critical.

>
> > > +       }
> > > +       codegen("\
> > > +               \n\
> > > +                       return 0;                                           \n\
> > > +               }                                                           \n\
> > > +                                                                           \n\
> > > +               static inline struct %1$s *                                 \n\
> >
> > [...]
> >
> > >  static int do_skeleton(int argc, char **argv)
> > >  {
> > >         char header_guard[MAX_OBJ_NAME_LEN + sizeof("__SKEL_H__")];
> > > @@ -277,7 +526,7 @@ static int do_skeleton(int argc, char **argv)
> > >         struct bpf_object *obj = NULL;
> > >         const char *file, *ident;
> > >         struct bpf_program *prog;
> > > -       int fd, len, err = -1;
> > > +       int fd, err = -1;
> > >         struct bpf_map *map;
> > >         struct btf *btf;
> > >         struct stat st;
> > > @@ -359,7 +608,25 @@ static int do_skeleton(int argc, char **argv)
> > >         }
> > >
> > >         get_header_guard(header_guard, obj_name);
> > > -       codegen("\
> > > +       if (use_loader)
> >
> > please use {} for such a long if/else, even if it's, technically, a
> > single-statement if
>
> I think it reads fine as-is, but, sure, I can add {}
>
> > > +               codegen("\
> > > +               \n\
> > > +               /* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */   \n\
> > > +               /* THIS FILE IS AUTOGENERATED! */                           \n\
> > > +               #ifndef %2$s                                                \n\
> > > +               #define %2$s                                                \n\
> > > +                                                                           \n\
> > > +               #include <stdlib.h>                                         \n\
> > > +               #include <bpf/bpf.h>                                        \n\
> > > +               #include <bpf/skel_internal.h>                              \n\
> > > +                                                                           \n\
> > > +               struct %1$s {                                               \n\
> > > +                       struct bpf_loader_ctx ctx;                          \n\
> > > +               ",
> > > +               obj_name, header_guard
> > > +               );
> > > +       else
> > > +               codegen("\
> > >                 \n\
> > >                 /* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */   \n\
> > >                                                                             \n\
> >
> > [...]
>
> --


* Re: [PATCH v2 bpf-next 10/16] bpf: Add bpf_btf_find_by_name_kind() helper.
  2021-04-27  3:37     ` Alexei Starovoitov
@ 2021-04-27 17:45       ` Andrii Nakryiko
  2021-04-28  1:55         ` Alexei Starovoitov
  0 siblings, 1 reply; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-27 17:45 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Mon, Apr 26, 2021 at 8:37 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Apr 26, 2021 at 03:46:29PM -0700, Andrii Nakryiko wrote:
> > On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > From: Alexei Starovoitov <ast@kernel.org>
> > >
> > > Add new helper:
> > >
> > > long bpf_btf_find_by_name_kind(u32 btf_fd, char *name, u32 kind, int flags)
> > >         Description
> > >                 Find given name with given type in BTF pointed to by btf_fd.
> > >                 If btf_fd is zero look for the name in vmlinux BTF and in module's BTFs.
> > >         Return
> > >                 Returns btf_id and btf_obj_fd in lower and upper 32 bits.
> > >
> > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > ---
> > >  include/linux/bpf.h            |  1 +
> > >  include/uapi/linux/bpf.h       |  8 ++++
> > >  kernel/bpf/btf.c               | 68 ++++++++++++++++++++++++++++++++++
> > >  kernel/bpf/syscall.c           |  2 +
> > >  tools/include/uapi/linux/bpf.h |  8 ++++
> > >  5 files changed, 87 insertions(+)
> > >
> > > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > > index 0f841bd0cb85..4cf361eb6a80 100644
> > > --- a/include/linux/bpf.h
> > > +++ b/include/linux/bpf.h
> > > @@ -1972,6 +1972,7 @@ extern const struct bpf_func_proto bpf_get_socket_ptr_cookie_proto;
> > >  extern const struct bpf_func_proto bpf_task_storage_get_proto;
> > >  extern const struct bpf_func_proto bpf_task_storage_delete_proto;
> > >  extern const struct bpf_func_proto bpf_for_each_map_elem_proto;
> > > +extern const struct bpf_func_proto bpf_btf_find_by_name_kind_proto;
> > >
> > >  const struct bpf_func_proto *bpf_tracing_func_proto(
> > >         enum bpf_func_id func_id, const struct bpf_prog *prog);
> > > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > > index de58a714ed36..253f5f031f08 100644
> > > --- a/include/uapi/linux/bpf.h
> > > +++ b/include/uapi/linux/bpf.h
> > > @@ -4748,6 +4748,13 @@ union bpf_attr {
> > >   *             Execute bpf syscall with given arguments.
> > >   *     Return
> > >   *             A syscall result.
> > > + *
> > > + * long bpf_btf_find_by_name_kind(u32 btf_fd, char *name, u32 kind, int flags)
> > > + *     Description
> > > + *             Find given name with given type in BTF pointed to by btf_fd.
> >
> > "Find BTF type with given name"? Should the limits on name length be
>
> +1
>
> > specified? KSYM_NAME_LEN is a pretty arbitrary restriction.
>
> that's implementation detail that shouldn't leak into uapi.
>
> > Also,
> > would it still work fine if the caller provides a pointer to a much
> > shorter piece of memory?
> >
> > Why not add name_sz right after name, as we do with a lot of other
> > arguments like this?
>
> That's an option too, but then the helper will have 5 args and 'flags'
> would be likely useless. I mean unlikely it will help extending it.
> I was thinking about ARG_PTR_TO_CONST_STR, but it doesn't work,
> since blob is writeable by the prog. It's read only from user space.
> I'm fine with name, name_sz though.

Yeah, I figured ARG_PTR_TO_CONST_STR isn't an option here. By "flags
would be useless" you mean that you'd use another parameter if some
flags were set? Did we ever do that to existing helpers? We can always
add a new helper, if necessary. name + name_sz seems less error-prone,
tbh.

>
> >
> > > + *             If btf_fd is zero look for the name in vmlinux BTF and in module's BTFs.
> > > + *     Return
> > > + *             Returns btf_id and btf_obj_fd in lower and upper 32 bits.
> >
> > Mention that for vmlinux BTF btf_obj_fd will be zero? Also who "owns"
> > the FD? If the BPF program doesn't close it, when are they going to be
> > cleaned up?
>
> just like bpf_sys_bpf. Who owns returned FD? The program that called
> the helper, of course.

"program" as in the user-space process that did bpf_prog_test_run(),
right? In the cover letter you mentioned that BPF_PROG_TYPE_SYSCALL
might be called on syscall entry in the future, for that case there is
no clear "owning" process, so would be curious to see how that problem
gets solved.

> In the current shape of loader prog these btf fds are cleaned up correctly
> in success and in error case.
> Not all FDs though. map fds will stay around if bpf_sys_bpf(prog_load) fails to load.
> Tweaking loader prog to close all FDs in error case is on todo list.

Ok, good, that seems important.
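
As a side note on the return value documented above (btf_id in the lower 32 bits, btf_obj_fd in the upper 32 bits), the packing can be sketched as follows. The function names are hypothetical and error returns are left out; if the helper behaves as the question above suggests, an upper half of zero would mean the type came from vmlinux BTF.

```c
#include <assert.h>
#include <stdint.h>

/* Combine the two 32-bit results into one 64-bit value. */
static uint64_t demo_pack_btf_ret(uint32_t btf_id, uint32_t btf_obj_fd)
{
	return (uint64_t)btf_id | ((uint64_t)btf_obj_fd << 32);
}

/* Lower 32 bits: BTF type id. */
static uint32_t demo_ret_btf_id(uint64_t ret)
{
	return (uint32_t)ret;
}

/* Upper 32 bits: FD of the BTF object the type was found in. */
static uint32_t demo_ret_btf_obj_fd(uint64_t ret)
{
	return (uint32_t)(ret >> 32);
}

/* Round-trip check showing how a caller would split the value. */
static void demo_roundtrip_check(void)
{
	uint64_t ret = demo_pack_btf_ret(1234, 7);

	assert(demo_ret_btf_id(ret) == 1234);
	assert(demo_ret_btf_obj_fd(ret) == 7);
}
```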


* RE: [PATCH v2 bpf-next 01/16] bpf: Introduce bpf_sys_bpf() helper and program type.
  2021-04-23  0:26 ` [PATCH v2 bpf-next 01/16] bpf: Introduce bpf_sys_bpf() helper and program type Alexei Starovoitov
  2021-04-23 18:15   ` Yonghong Song
  2021-04-26 16:51   ` Andrii Nakryiko
@ 2021-04-27 18:45   ` John Fastabend
  2 siblings, 0 replies; 52+ messages in thread
From: John Fastabend @ 2021-04-27 18:45 UTC (permalink / raw)
  To: Alexei Starovoitov, davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
> 
> Add placeholders for bpf_sys_bpf() helper and new program type.
> 
> v1->v2:
> - check that expected_attach_type is zero
> - allow more helper functions to be used in this program type, since they will
>   only execute from user context via bpf_prog_test_run.
> 
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

Acked-by: John Fastabend <john.fastabend@gmail.com>

> +int bpf_prog_test_run_syscall(struct bpf_prog *prog,
> +			      const union bpf_attr *kattr,
> +			      union bpf_attr __user *uattr)
> +{
> +	void __user *ctx_in = u64_to_user_ptr(kattr->test.ctx_in);
> +	__u32 ctx_size_in = kattr->test.ctx_size_in;
> +	void *ctx = NULL;
> +	u32 retval;
> +	int err = 0;
> +
> +	/* doesn't support data_in/out, ctx_out, duration, or repeat or flags */
> +	if (kattr->test.data_in || kattr->test.data_out ||
> +	    kattr->test.ctx_out || kattr->test.duration ||
> +	    kattr->test.repeat || kattr->test.flags)
> +		return -EINVAL;
> +
> +	if (ctx_size_in < prog->aux->max_ctx_offset ||
> +	    ctx_size_in > U16_MAX)
> +		return -EINVAL;
> +
> +	if (ctx_size_in) {
> +		ctx = kzalloc(ctx_size_in, GFP_USER);
> +		if (!ctx)
> +			return -ENOMEM;
> +		if (copy_from_user(ctx, ctx_in, ctx_size_in)) {
> +			err = -EFAULT;
> +			goto out;
> +		}
> +	}
> +	retval = bpf_prog_run_pin_on_cpu(prog, ctx);
> +
> +	if (copy_to_user(&uattr->test.retval, &retval, sizeof(u32)))
> +		err = -EFAULT;
> +	if (ctx_size_in)
> +		if (copy_to_user(ctx_in, ctx, ctx_size_in)) {
> +			err = -EFAULT;
> +			goto out;
> +		}

stupid nit, the last goto there is not needed.

> +out:
> +	kfree(ctx);
> +	return err;
> +}


* RE: [PATCH v2 bpf-next 10/16] bpf: Add bpf_btf_find_by_name_kind() helper.
  2021-04-23  0:26 ` [PATCH v2 bpf-next 10/16] bpf: Add bpf_btf_find_by_name_kind() helper Alexei Starovoitov
  2021-04-26 22:46   ` Andrii Nakryiko
@ 2021-04-27 21:00   ` John Fastabend
  2021-04-27 21:05     ` John Fastabend
  2021-04-28  2:10     ` Alexei Starovoitov
  1 sibling, 2 replies; 52+ messages in thread
From: John Fastabend @ 2021-04-27 21:00 UTC (permalink / raw)
  To: Alexei Starovoitov, davem; +Cc: daniel, andrii, netdev, bpf, kernel-team

Alexei Starovoitov wrote:
> From: Alexei Starovoitov <ast@kernel.org>
> 
> Add new helper:
> 
> long bpf_btf_find_by_name_kind(u32 btf_fd, char *name, u32 kind, int flags)
> 	Description
> 		Find given name with given type in BTF pointed to by btf_fd.
> 		If btf_fd is zero look for the name in vmlinux BTF and in module's BTFs.
> 	Return
> 		Returns btf_id and btf_obj_fd in lower and upper 32 bits.
> 
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

I'm missing some high-level concept on how this would be used? Where does btf_fd come
from and how is it used so that it doesn't break sig-check?

A use-case I'm trying to fit into this series is how to pass down a BTF fd/object
with the program. I know it's not doing CO-RE yet but we would want it to use the
BTF object being passed down for CO-RE eventually. Will there be some way to do
that here? That looks like the btf_fd here.

Thanks,
John


* RE: [PATCH v2 bpf-next 10/16] bpf: Add bpf_btf_find_by_name_kind() helper.
  2021-04-27 21:00   ` John Fastabend
@ 2021-04-27 21:05     ` John Fastabend
  2021-04-28  2:10     ` Alexei Starovoitov
  1 sibling, 0 replies; 52+ messages in thread
From: John Fastabend @ 2021-04-27 21:05 UTC (permalink / raw)
  To: John Fastabend, Alexei Starovoitov, davem
  Cc: daniel, andrii, netdev, bpf, kernel-team

John Fastabend wrote:
> Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@kernel.org>
> > 
> > Add new helper:
> > 
> > long bpf_btf_find_by_name_kind(u32 btf_fd, char *name, u32 kind, int flags)
> > 	Description
> > 		Find given name with given type in BTF pointed to by btf_fd.
> > 		If btf_fd is zero look for the name in vmlinux BTF and in module's BTFs.
> > 	Return
> > 		Returns btf_id and btf_obj_fd in lower and upper 32 bits.
> > 
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
> 
> I'm missing some high-level concept on how this would be used? Where does btf_fd come
> from and how is it used so that it doesn't break sig-check?

aha as I look through this again it seems btf_fd can be from fd_array[] and
sig-check will pan out as well as BTF can come from any valid file.

Correct? If so lgtm at high-level.


* Re: [PATCH v2 bpf-next 09/16] libbpf: Support for fd_idx
  2021-04-27 16:36       ` Andrii Nakryiko
@ 2021-04-28  1:32         ` Alexei Starovoitov
  2021-04-28 18:40           ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-28  1:32 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Tue, Apr 27, 2021 at 09:36:54AM -0700, Andrii Nakryiko wrote:
> On Mon, Apr 26, 2021 at 7:53 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Mon, Apr 26, 2021 at 10:14:45AM -0700, Andrii Nakryiko wrote:
> > > On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> > > <alexei.starovoitov@gmail.com> wrote:
> > > >
> > > > From: Alexei Starovoitov <ast@kernel.org>
> > > >
> > > > Add support for FD_IDX and make libbpf prefer that approach to loading programs.
> > > >
> > > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > > ---
> > > >  tools/lib/bpf/bpf.c             |  1 +
> > > >  tools/lib/bpf/libbpf.c          | 70 +++++++++++++++++++++++++++++----
> > > >  tools/lib/bpf/libbpf_internal.h |  1 +
> > > >  3 files changed, 65 insertions(+), 7 deletions(-)
> > > >
> > >
> 
> [...]
> 
> > > >         for (i = 0; i < obj->nr_programs; i++) {
> > > >                 prog = &obj->programs[i];
> > > >                 if (prog_is_subprog(obj, prog))
> > > > @@ -7256,10 +7308,14 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
> > > >                         continue;
> > > >                 }
> > > >                 prog->log_level |= log_level;
> > > > +               prog->fd_array = fd_array;
> > >
> > > you are not freeing this memory on success, as far as I can see.
> >
> > hmm. there is free on success below.
> 
> right, my bad, I somehow understood as if it was only for error case
> 
> >
> > > And
> > > given multiple programs are sharing fd_array, it's a bit problematic
> > > for prog to have fd_array. This is per-object property, so let's add
> > > it at bpf_object level and clean it up on bpf_object__close()? And by
> > > assigning to obj->fd_array at malloc() site, you won't need to do all
> > > the error-handling free()s below.
> >
> > hmm. that sounds worse.
> > why add another 8 byte to bpf_object that won't be used
> > until this last step of bpf_object__load_progs.
> > And only for the duration of this loading.
> > It's cheaper to have this alloc here with two free()s below.
> 
> So if you care about extra 8 bytes, then it's even more efficient to
> have just one obj->fd_array rather than N prog->fd_array, no?

I think it's layer breaking when bpf_program__load()->load_program()
has to reach out to prog->obj to do its work.
The layers are already a mess due to:
&prog->obj->maps[prog->obj->rodata_map_idx]
I wanted to avoid making it uglier.

> And it's
> also not very clean that prog->fd_array will have a dangling pointer
> to deallocated memory after bpf_object__load_progs().

prog->reloc_desc is freed and zeroed after __relocate.
prog->insns are freed and _not_ zeroed after __load_progs.
so prog->fd_array won't be the first such pointer.
I can add zeroing, of course.

> 
> But that brings the entire question of why use fd_array at all here?
> Commit description doesn't explain why libbpf has to use fd_array and
> why it should be preferred. What are the advantages justifying added
> complexity and extra memory allocation/clean up? It also reduces test
> coverage of the "old ways" that offer the same capabilities. I think
> this should be part of the commit description, if we agree that
> fd_array has to be used outside of the auto-generated loader program.

I can add a knob to it to use it during loader gen for the loader gen
and for the runner of the loader prog.
I think it will add more complexity.
The bpf CI runs on older kernels, so the test coverage of "old ways"
is not reduced regardless.
From the kernel pov BPF_PSEUDO_MAP_FD vs BPF_PSEUDO_MAP_IDX there is
no advantage.
From the libbpf side patch 9 looked trivial enough to _not_ do it conditionally,
but whatever. I don't mind more 'if'-s.
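
To illustrate the difference between the two encodings being discussed, here is a rough sketch of how an instruction's imm could be turned into a map FD. The enum values and the function are hypothetical stand-ins, not the uapi constants or the kernel's resolution code, and the bounds check is this sketch's own addition.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative tags only; the real BPF_PSEUDO_* values live in
 * include/uapi/linux/bpf.h.
 */
enum demo_src {
	DEMO_MAP_FD,    /* imm is the map FD itself */
	DEMO_MAP_IDX,   /* imm is an index into fd_array */
};

/* Resolve the map FD an instruction refers to.
 * Returns -1 when an index falls outside fd_array.
 */
static int demo_resolve_map_fd(enum demo_src src, int imm,
			       const int *fd_array, size_t fd_cnt)
{
	assert(fd_cnt == 0 || fd_array != NULL);
	if (src == DEMO_MAP_FD)
		return imm;
	if (imm < 0 || (size_t)imm >= fd_cnt)
		return -1;
	return fd_array[imm];
}
```

With the index form the instruction stream does not need to be patched once the real FDs are known; only the array contents differ between runs.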


* Re: [PATCH v2 bpf-next 14/16] libbpf: Generate loader program out of BPF ELF file.
  2021-04-27 17:34       ` Andrii Nakryiko
@ 2021-04-28  1:42         ` Alexei Starovoitov
  0 siblings, 0 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-28  1:42 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Tue, Apr 27, 2021 at 10:34:50AM -0700, Andrii Nakryiko wrote:
> > > > +void bpf_gen__load_btf(struct bpf_gen *gen, const void *btf_raw_data, __u32 btf_raw_size)
> > > > +{
> > > > +       union bpf_attr attr = {};
> > >
> > > here and below: memset(0)?
> >
> > that's unnecessary. there is no backward/forward compat issue here.
> > and bpf_attr doesn't have gaps inside.
> 
> systemd definitely had a problem with non-zero padding with such usage
> of bpf_attr recently, but I don't remember which command specifically.
> Is there any downside to making sure that this will keep working for
> later bpf_attr changes regardless of whether there are gaps or not?

I'd like to avoid cargo culting memset where it's not needed,
but thinking more about it...
I'll switch to memset(, cmd_specific_attr_size) instead.
I wanted to do this optimization forever in the rest of libbpf.
That would be a starting place.
Eventually when bpf.c will migrate into bpf.h I will convert it to
memset(, attr_size) too.
The bpf_attr is too big to do full zeroing.
Loader gen is already taking advantage of that with partial bpf_attr for everything.
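
Roughly, memset(, cmd_specific_attr_size) would look like the sketch below. It is written against a hypothetical union demo_cmd_attr rather than the real union bpf_attr, with a local demo_offsetofend() macro modeled on the offsetofend() used in the patch. Only the bytes up to the end of the last field the command uses are zeroed; the rest of the union is left alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Offset of the first byte past MEMBER inside TYPE. */
#define demo_offsetofend(TYPE, MEMBER) \
	(offsetof(TYPE, MEMBER) + sizeof(((TYPE *)0)->MEMBER))

/* Hypothetical attr union: one small command plus raw bytes standing in
 * for the much larger full union.
 */
union demo_cmd_attr {
	struct {
		unsigned long long btf;
		unsigned long long log_buf;
		unsigned int btf_size;
		unsigned int log_size;
		unsigned int log_level;
	} btf_load;
	unsigned char raw[128];
};

/* Zero only the prefix that the btf_load command reads and return the
 * size that would be passed along with the attr.
 */
static size_t demo_zero_btf_load(union demo_cmd_attr *attr)
{
	size_t sz = demo_offsetofend(union demo_cmd_attr, btf_load.log_level);

	assert(attr != NULL);
	memset(attr, 0, sz);
	return sz;
}
```

The returned size doubles as the attr size for the command, which matches how attr_size is computed in bpf_gen__load_btf() in the quoted patch.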

> > unfortunately there are various pieces of the code that check <0
> > means not init.
> > I can try to fix them all, but it felt unnecessarily complex vs screwing up stdin.
> > It's bpftool's stdin, but you're right that it's not pretty.
> > I can probably use special very large FD number and check for it.
> > Still cleaner than fixing all checks.
> 
> Maybe after generation go over each prog/map/BTF and reset their FDs
> to -1 if gen_loader is used? Or I guess we can just sprinkle
> 
> if (!obj->gen_loader)
>     close(fd);
> 
> in the right places. Not great from code readability, but at least
> won't have spurious close()s.

Interesting ideas. I hadn't thought about resetting back to -1.
Will do, and address the rest of the comments too.
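
The "reset to -1 after generation" suggestion could look roughly like the sketch below. The `demo_obj` structure and `demo_reset_fds` function are hypothetical simplifications, not libbpf's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified object; the real libbpf structures differ. */
struct demo_obj {
	bool gen_loader;     /* true when only a loader program is being generated */
	int map_fds[4];
	size_t nr_maps;
	int btf_fd;
};

/* After generation the stored "FDs" are only placeholders, so mark them
 * as not initialized; existing `fd < 0` checks then skip close(). */
static void demo_reset_fds(struct demo_obj *obj)
{
	size_t i;

	if (!obj->gen_loader)
		return;          /* real FDs: leave them to the normal close path */
	for (i = 0; i < obj->nr_maps; i++)
		obj->map_fds[i] = -1;
	obj->btf_fd = -1;
}
```

Compared with sprinkling `if (!obj->gen_loader)` around every close(), this keeps the special case in one place.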

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 bpf-next 10/16] bpf: Add bpf_btf_find_by_name_kind() helper.
  2021-04-27 17:45       ` Andrii Nakryiko
@ 2021-04-28  1:55         ` Alexei Starovoitov
  2021-04-28 18:44           ` Andrii Nakryiko
  0 siblings, 1 reply; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-28  1:55 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Tue, Apr 27, 2021 at 10:45:38AM -0700, Andrii Nakryiko wrote:
> > >
> > > > + *             If btf_fd is zero look for the name in vmlinux BTF and in module's BTFs.
> > > > + *     Return
> > > > + *             Returns btf_id and btf_obj_fd in lower and upper 32 bits.
> > >
> > > Mention that for vmlinux BTF btf_obj_fd will be zero? Also who "owns"
> > > the FD? If the BPF program doesn't close it, when are they going to be
> > > cleaned up?
> >
> > just like bpf_sys_bpf. Who owns returned FD? The program that called
> > the helper, of course.
> 
> "program" as in the user-space process that did bpf_prog_test_run(),
> right? In the cover letter you mentioned that BPF_PROG_TYPE_SYSCALL
> might be called on syscall entry in the future, for that case there is
> no clear "owning" process, so would be curious to see how that problem
> gets solved.

Well, there is always an owner process. When a syscall prog is attached
to a syscall, such FDs will be in the process that is doing the syscall.
It's kinda 'random', but it's the job of the prog to make it 'non random'.
If it's doing syscalls that will install FDs it should have a reason
to do so. Likely there will be limitations on what bpf helpers such a syscall
prog can use if it's attached to this or that syscall.
Currently it's test_run only.

I'm not sure whether you're hinting that it all should be FD-less or I'm
putting a question in your mouth, but I've considered doing that and
figured that it's overkill. It's possible to convert .*bpf.* to deal
with FDs and with some other temporary handle. Instead of map_fd the
loader prog would create a map and get a handle back that it will use
later in prog_load, etc.
But the amount of refactoring looks excessive.
The generated loader prog should be correct by construction and
clean up after itself instead of burdening the kernel with cleaning up
those extra handles.
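
As a user-space analogy for "clean up after itself", the sketch below tracks temporary FDs and closes them all at the end. The `demo_fd_tracker` type and its functions are hypothetical; the generated loader program itself would use bpf_sys_close() rather than close().

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

#define DEMO_MAX_FDS 16

/* Hypothetical tracker for temporary FDs created while loading. */
struct demo_fd_tracker {
	int fds[DEMO_MAX_FDS];
	int cnt;
};

/* Remember a temporary FD; returns 0 on success, -1 if the FD is
 * invalid or the tracker is full. */
static int demo_track_fd(struct demo_fd_tracker *t, int fd)
{
	if (fd < 0 || t->cnt >= DEMO_MAX_FDS)
		return -1;
	t->fds[t->cnt++] = fd;
	return 0;
}

/* Close everything that was tracked, newest first, and empty the tracker. */
static void demo_cleanup_fds(struct demo_fd_tracker *t)
{
	while (t->cnt > 0)
		close(t->fds[--t->cnt]);
}
```

Calling the cleanup on both the success and the error path means no FD is left behind in whichever process ran the loader.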

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 bpf-next 10/16] bpf: Add bpf_btf_find_by_name_kind() helper.
  2021-04-27 21:00   ` John Fastabend
  2021-04-27 21:05     ` John Fastabend
@ 2021-04-28  2:10     ` Alexei Starovoitov
  1 sibling, 0 replies; 52+ messages in thread
From: Alexei Starovoitov @ 2021-04-28  2:10 UTC (permalink / raw)
  To: John Fastabend; +Cc: davem, daniel, andrii, netdev, bpf, kernel-team

On Tue, Apr 27, 2021 at 02:00:43PM -0700, John Fastabend wrote:
> Alexei Starovoitov wrote:
> > From: Alexei Starovoitov <ast@kernel.org>
> > 
> > Add new helper:
> > 
> > long bpf_btf_find_by_name_kind(u32 btf_fd, char *name, u32 kind, int flags)
> > 	Description
> > 		Find given name with given type in BTF pointed to by btf_fd.
> > 		If btf_fd is zero look for the name in vmlinux BTF and in module's BTFs.
> > 	Return
> > 		Returns btf_id and btf_obj_fd in lower and upper 32 bits.
> > 
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
> 
> I'm missing some high-level concept on how this would be used? Where does btf_fd come
> from and how is it used so that it doesn't break sig-check?

You mean the one that is being returned or the one passed in?
The one that is passed in I only tested locally. No patches use it. Sorry.
It's to support PROG_EXT. That btf_fd points to the BTF of the prog that is being extended.
The signed extension prog will have the name of the subprog covered by the signature,
but the target btf_fd of the prog won't be known at signing time.
It will be supplied via struct bpf_prog_desc. That's what attach_prog_fd is there for.
I can remove all that stuff for now.
The name of the target prog doesn't have to be part of the signature.
All of these details are to be discussed.
We can make the signature as tight or as flexible as we want.
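
For the returned value, the helper description quoted above puts btf_id in the lower 32 bits and btf_obj_fd in the upper 32 bits. A minimal sketch of packing and splitting such a value follows; the `demo_*` function names are hypothetical, and handling of a negative error return is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Pack as described in the thread: btf_id in the lower 32 bits,
 * btf_obj_fd in the upper 32 bits. */
static uint64_t demo_pack(uint32_t btf_id, uint32_t btf_obj_fd)
{
	return (uint64_t)btf_id | ((uint64_t)btf_obj_fd << 32);
}

/* Lower 32 bits: the BTF type id. */
static uint32_t demo_btf_id(uint64_t ret)
{
	return (uint32_t)ret;
}

/* Upper 32 bits: the FD of the BTF object the id belongs to. */
static uint32_t demo_btf_obj_fd(uint64_t ret)
{
	return (uint32_t)(ret >> 32);
}
```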

> A use-case I'm trying to fit into this series is how to pass down a BTF fd/object
> with the program. 

I'm not sure I follow.
struct bpf_prog_desc will have more fields that can be populated to tweak
particular prog before running the loader.

> I know its not doing CO-RE yet but we would want it to use the
> BTF object being passed down for CO-RE eventually. Will there be someway to do
> that here? That looks like the btf_fd here.

I've started hacking on CO-RE. So far I'm thinking of passing the spec string to
the kernel in another section of btf.ext, similar to line_info and func_info.
As an orthogonal discussion, I think CO-RE should be able to relocate
against already loaded bpf progs too (and not only kernel and modules).

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 bpf-next 09/16] libbpf: Support for fd_idx
  2021-04-28  1:32         ` Alexei Starovoitov
@ 2021-04-28 18:40           ` Andrii Nakryiko
  0 siblings, 0 replies; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-28 18:40 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Tue, Apr 27, 2021 at 6:32 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 09:36:54AM -0700, Andrii Nakryiko wrote:
> > On Mon, Apr 26, 2021 at 7:53 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > On Mon, Apr 26, 2021 at 10:14:45AM -0700, Andrii Nakryiko wrote:
> > > > On Thu, Apr 22, 2021 at 5:27 PM Alexei Starovoitov
> > > > <alexei.starovoitov@gmail.com> wrote:
> > > > >
> > > > > From: Alexei Starovoitov <ast@kernel.org>
> > > > >
> > > > > Add support for FD_IDX make libbpf prefer that approach to loading programs.
> > > > >
> > > > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > > > ---
> > > > >  tools/lib/bpf/bpf.c             |  1 +
> > > > >  tools/lib/bpf/libbpf.c          | 70 +++++++++++++++++++++++++++++----
> > > > >  tools/lib/bpf/libbpf_internal.h |  1 +
> > > > >  3 files changed, 65 insertions(+), 7 deletions(-)
> > > > >
> > > >
> >
> > [...]
> >
> > > > >         for (i = 0; i < obj->nr_programs; i++) {
> > > > >                 prog = &obj->programs[i];
> > > > >                 if (prog_is_subprog(obj, prog))
> > > > > @@ -7256,10 +7308,14 @@ bpf_object__load_progs(struct bpf_object *obj, int log_level)
> > > > >                         continue;
> > > > >                 }
> > > > >                 prog->log_level |= log_level;
> > > > > +               prog->fd_array = fd_array;
> > > >
> > > > you are not freeing this memory on success, as far as I can see.
> > >
> > > hmm. there is free on success below.
> >
> > right, my bad, I somehow understood as if it was only for error case
> >
> > >
> > > > And
> > > > given multiple programs are sharing fd_array, it's a bit problematic
> > > > for prog to have fd_array. This is per-object properly, so let's add
> > > > it at bpf_object level and clean it up on bpf_object__close()? And by
> > > > assigning to obj->fd_array at malloc() site, you won't need to do all
> > > > the error-handling free()s below.
> > >
> > > hmm. that sounds worse.
> > > why add another 8 byte to bpf_object that won't be used
> > > until this last step of bpf_object__load_progs.
> > > And only for the duration of this loading.
> > > It's cheaper to have this alloc here with two free()s below.
> >
> > So if you care about extra 8 bytes, then it's even more efficient to
> > have just one obj->fd_array rather than N prog->fd_array, no?
>
> I think it's layer breaking when bpf_program__load()->load_program()
> has to reach out to prog->obj to do its work.
> The layers are already a mess due to:
> &prog->obj->maps[prog->obj->rodata_map_idx]
> I wanted to avoid making it uglier.

I don't think it's breaking any layer. bpf_program is not an
independent entity from libbpf's point of view, it always belongs to
bpf_object. And there are bpf_object-scoped properties, shared across
all progs, like BTF, global variables, maps, license, etc.

It's another thing that bpf_program__load() just shouldn't be a public
API, and we are going to address that in libbpf 1.0.

>
> > And it's
> > also not very clean that prog->fd_array will have a dangling pointer
> > to deallocated memory after bpf_object__load_progs().
>
> prog->reloc_desc is free and zeroed after __relocate.
> prog->insns are freed and _not_ zereod after __load_progs.
> so prog->fd_array won't be the first such pointer.
> I can add zeroing, of course.

cool, it would be great to fix prog->insns to be zeroed out as well
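
A sketch of the per-object ownership being argued for here, with the pointer cleared after free so nothing dangles. `demo_object` and both functions are hypothetical and much simpler than libbpf's bpf_object.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object that owns one fd_array shared by all its programs. */
struct demo_object {
	int *fd_array;
	size_t nr_fds;
};

/* Allocate the shared array once, at the object level.
 * Returns 0 on success, -1 on allocation failure. */
static int demo_alloc_fd_array(struct demo_object *obj, size_t n)
{
	obj->fd_array = calloc(n, sizeof(*obj->fd_array));
	if (!obj->fd_array)
		return -1;
	obj->nr_fds = n;
	return 0;
}

/* Free the array and clear the pointer so no stale reference remains.
 * Safe to call more than once, since free(NULL) is a no-op. */
static void demo_free_fd_array(struct demo_object *obj)
{
	free(obj->fd_array);
	obj->fd_array = NULL;
	obj->nr_fds = 0;
}
```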

>
> >
> > But that brings the entire question of why use fd_array at all here?
> > Commit description doesn't explain why libbpf has to use fd_array and
> > why it should be preferred. What are the advantages justifying added
> > complexity and extra memory allocation/clean up? It also reduces test
> > coverage of the "old ways" that offer the same capabilities. I think
> > this should be part of the commit description, if we agree that
> > fd_array has to be used outside of the auto-generated loader program.
>
> I can add a knob to it to use it during loader gen for the loader gen
> and for the runner of the loader prog.

So that's why I'm saying a better commit description is necessary. I
lost track, again, that those patched instructions with embedded
map_idx are assumed by the prog loader program and that only fd_array is
modified at runtime by the BPF loader program. Please don't skimp on the
commit description; there are many moving pieces that are obvious only
in hindsight.

Getting back to the code: given it's necessary for gen_loader only, I'd
replace all those `kernel_supports(FEAT_FD_IDX)` checks with
`obj->gen_loader` and leave the current behavior as is. Then we also
won't need FEAT_FD_IDX feature probing or the extra memory
allocation at all. And bpf_load_and_run() uses fd_array
unconditionally without feature probing anyway.
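
To make the moving pieces concrete: with the index-based scheme an instruction carries an index into fd_array instead of an FD, so the instructions can stay fixed while only the array contents change at runtime. The sketch below uses a hypothetical `demo_insn` with an `is_map_idx` flag as a stand-in for the BPF_PSEUDO_MAP_IDX marking; it is not the kernel's struct bpf_insn or its resolution logic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified instruction; the real struct bpf_insn differs. */
struct demo_insn {
	int is_map_idx;   /* stand-in for an index-style (MAP_IDX) reference */
	int imm;          /* index into fd_array if is_map_idx, else the FD itself */
};

/* Resolve the FD an instruction refers to; -1 if the index is out of range. */
static int demo_resolve_fd(const struct demo_insn *insn,
			   const int *fd_array, size_t nr_fds)
{
	if (!insn->is_map_idx)
		return insn->imm;                 /* FD embedded directly */
	if (insn->imm < 0 || (size_t)insn->imm >= nr_fds)
		return -1;
	return fd_array[insn->imm];           /* FD looked up at load time */
}
```

Updating a slot of fd_array changes what the same, unmodified instruction resolves to, which is the property the loader program relies on.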

> I think it will add more complexity.
> The bpf CI runs on older kernels, so the test coverage of "old ways"
> is not reduced regardless.

I'm the only one who checks that, and we keep shrinking the set of
tests that run on older kernels because we update existing ones with
dependencies on newer kernel features. So coverage is shrinking, but
basic stuff is still tested, of course.

> From the kernel pov BPF_PSEUDO_MAP_FD vs BPF_PSEUDO_MAP_IDX there is
> no advantage.
> From the libbpf side patch 9 looked trivial enough _not_ do it conditionally,
> but whatever. I don't mind more 'if'-s.

I do mind unnecessary ifs, that's not what I was proposing.

^ permalink raw reply	[flat|nested] 52+ messages in thread

* Re: [PATCH v2 bpf-next 10/16] bpf: Add bpf_btf_find_by_name_kind() helper.
  2021-04-28  1:55         ` Alexei Starovoitov
@ 2021-04-28 18:44           ` Andrii Nakryiko
  0 siblings, 0 replies; 52+ messages in thread
From: Andrii Nakryiko @ 2021-04-28 18:44 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, Networking,
	bpf, Kernel Team

On Tue, Apr 27, 2021 at 6:55 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Tue, Apr 27, 2021 at 10:45:38AM -0700, Andrii Nakryiko wrote:
> > > >
> > > > > + *             If btf_fd is zero look for the name in vmlinux BTF and in module's BTFs.
> > > > > + *     Return
> > > > > + *             Returns btf_id and btf_obj_fd in lower and upper 32 bits.
> > > >
> > > > Mention that for vmlinux BTF btf_obj_fd will be zero? Also who "owns"
> > > > the FD? If the BPF program doesn't close it, when are they going to be
> > > > cleaned up?
> > >
> > > just like bpf_sys_bpf. Who owns returned FD? The program that called
> > > the helper, of course.
> >
> > "program" as in the user-space process that did bpf_prog_test_run(),
> > right? In the cover letter you mentioned that BPF_PROG_TYPE_SYSCALL
> > might be called on syscall entry in the future, for that case there is
> > no clear "owning" process, so would be curious to see how that problem
> > gets solved.
>
> well, there is always an owner process. When syscall progs is attached
> to syscall such FDs will be in the process that doing syscall.
> It's kinda 'random', but that's the job of the prog to make 'non random'.
> If it's doing syscalls that will install FDs it should have a reason
> to do so. Likely there will be limitations on what bpf helpers such syscall
> prog can do if it's attached to this or that syscall.
> Currently it's test_run only.
>
> I'm not sure whether you're hinting that it all should be FD-less or I'm
> putting a question in your mouth, but I've considered doing that and
> figured that it's an overkill. It's possible to convert .*bpf.* do deal
> with FDs and with some other temporary handle. Instead of map_fd the
> loader prog would create a map and get a handle back that it will use
> later in prog_load, etc.
> But amount of refactoring looks excessive.
> The generated loader prog should be correct by construction and
> clean up after itself instead of burdening the kernel cleaning
> those extra handles.

That's not really the suggestion or question I had in mind. I was
contemplating how the FD handling will happen if such a BPF program is
running from some other process's context, and it seemed (and still
seems) very surprising if a new FD will just be added to a "random"
process. Ignoring all the technical difficulties, I'd say ideally
those FDs should be owned by the BPF program itself, and when it gets
unloaded, just like at process exit, all still-open FDs should be
closed. How technically feasible that is is an entirely different
question.

But basically, I wanted to confirm I understand where those new FDs
are attached to.

^ permalink raw reply	[flat|nested] 52+ messages in thread

Thread overview: 52+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-23  0:26 [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, light skeleton Alexei Starovoitov
2021-04-23  0:26 ` [PATCH v2 bpf-next 01/16] bpf: Introduce bpf_sys_bpf() helper and program type Alexei Starovoitov
2021-04-23 18:15   ` Yonghong Song
2021-04-23 18:28     ` Alexei Starovoitov
2021-04-23 19:32       ` Yonghong Song
2021-04-26 16:51   ` Andrii Nakryiko
2021-04-27 18:45   ` John Fastabend
2021-04-23  0:26 ` [PATCH v2 bpf-next 02/16] bpf: Introduce bpfptr_t user/kernel pointer Alexei Starovoitov
2021-04-23  0:26 ` [PATCH v2 bpf-next 03/16] bpf: Prepare bpf syscall to be used from kernel and user space Alexei Starovoitov
2021-04-23  0:26 ` [PATCH v2 bpf-next 04/16] libbpf: Support for syscall program type Alexei Starovoitov
2021-04-26 22:24   ` Andrii Nakryiko
2021-04-23  0:26 ` [PATCH v2 bpf-next 05/16] selftests/bpf: Test " Alexei Starovoitov
2021-04-26 17:02   ` Andrii Nakryiko
2021-04-27  2:43     ` Alexei Starovoitov
2021-04-27 16:28       ` Andrii Nakryiko
2021-04-23  0:26 ` [PATCH v2 bpf-next 06/16] bpf: Make btf_load command to be bpfptr_t compatible Alexei Starovoitov
2021-04-23  0:26 ` [PATCH v2 bpf-next 07/16] selftests/bpf: Test for btf_load command Alexei Starovoitov
2021-04-23  0:26 ` [PATCH v2 bpf-next 08/16] bpf: Introduce fd_idx Alexei Starovoitov
2021-04-23  0:26 ` [PATCH v2 bpf-next 09/16] libbpf: Support for fd_idx Alexei Starovoitov
2021-04-26 17:14   ` Andrii Nakryiko
2021-04-27  2:53     ` Alexei Starovoitov
2021-04-27 16:36       ` Andrii Nakryiko
2021-04-28  1:32         ` Alexei Starovoitov
2021-04-28 18:40           ` Andrii Nakryiko
2021-04-23  0:26 ` [PATCH v2 bpf-next 10/16] bpf: Add bpf_btf_find_by_name_kind() helper Alexei Starovoitov
2021-04-26 22:46   ` Andrii Nakryiko
2021-04-27  3:37     ` Alexei Starovoitov
2021-04-27 17:45       ` Andrii Nakryiko
2021-04-28  1:55         ` Alexei Starovoitov
2021-04-28 18:44           ` Andrii Nakryiko
2021-04-27 21:00   ` John Fastabend
2021-04-27 21:05     ` John Fastabend
2021-04-28  2:10     ` Alexei Starovoitov
2021-04-23  0:26 ` [PATCH v2 bpf-next 11/16] bpf: Add bpf_sys_close() helper Alexei Starovoitov
2021-04-23  0:26 ` [PATCH v2 bpf-next 12/16] libbpf: Change the order of data and text relocations Alexei Starovoitov
2021-04-26 17:29   ` Andrii Nakryiko
2021-04-27  3:00     ` Alexei Starovoitov
2021-04-27 16:47       ` Andrii Nakryiko
2021-04-23  0:26 ` [PATCH v2 bpf-next 13/16] libbpf: Add bpf_object pointer to kernel_supports() Alexei Starovoitov
2021-04-26 17:30   ` Andrii Nakryiko
2021-04-23  0:26 ` [PATCH v2 bpf-next 14/16] libbpf: Generate loader program out of BPF ELF file Alexei Starovoitov
2021-04-26 22:22   ` Andrii Nakryiko
2021-04-27  3:25     ` Alexei Starovoitov
2021-04-27 17:34       ` Andrii Nakryiko
2021-04-28  1:42         ` Alexei Starovoitov
2021-04-23  0:26 ` [PATCH v2 bpf-next 15/16] bpftool: Use syscall/loader program in "prog load" and "gen skeleton" command Alexei Starovoitov
2021-04-26 22:35   ` Andrii Nakryiko
2021-04-27  3:28     ` Alexei Starovoitov
2021-04-27 17:38       ` Andrii Nakryiko
2021-04-23  0:26 ` [PATCH v2 bpf-next 16/16] selftests/bpf: Convert few tests to light skeleton Alexei Starovoitov
2021-04-23 21:36 ` [PATCH v2 bpf-next 00/16] bpf: syscall program, FD array, loader program, " Yonghong Song
2021-04-23 23:16   ` Alexei Starovoitov
