* [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel
@ 2021-11-24  6:01 Alexei Starovoitov
  2021-11-24  6:01 ` [PATCH v4 bpf-next 01/16] libbpf: Replace btf__type_by_id() with btf_type_by_id() Alexei Starovoitov
                   ` (15 more replies)
  0 siblings, 16 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:01 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

v3->v4:
. Complete refactor of the find-candidates logic.
  It now has a small permanent cache.
  I haven't finished testing all corner cases,
  but the patches are clean enough for review.
. Fix a bug in gen_loader related to attach_kind.
. Fix the BTF log size limit.
. More tests.

v2->v3:
. Addressed Andrii's feedback in every patch.
  The new field in union bpf_attr was renamed from "core_relo" to "core_relos".
. Added one more test and ran checkpatch.pl on the set.

v1->v2:
. Refactor uapi to pass 'struct bpf_core_relo' from LLVM into libbpf and further
into the kernel instead of via a bpf_core_apply_relo() bpf helper. Because of this
change the CO-RE algorithm can log error and debug events through the standard
bpf verifier log mechanism, which was not possible with the helper approach.
. The #define RELO_CORE macro was removed and replaced by the btf_member_bit_offset() patch.

This set introduces CO-RE support in the kernel.
There are several reasons to add such support:
1. It's a step toward signed BPF programs.
2. It allows Go-like languages that struggle to adopt libbpf
   to take advantage of CO-RE powers.
3. Currently the field accessed by an 'ldx [R1 + 10]' insn is recognized
   by the verifier purely based on the +10 offset. If R1 points to a union,
   the verifier picks one of the fields at this offset.
   With CO-RE the kernel can disambiguate the field access
   (a small sketch follows this list).
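
To make point 3 concrete, here is a hypothetical layout (not taken from any
patch in this set) where two union members share the same offset; a plain
'ldx [R1 + 10]' cannot tell which one the program meant, while a CO-RE field
relocation records the accessed field explicitly:

 struct ctx {                 /* illustrative only */
     char pad[10];
     union {
         __u16 rx_queue;      /* both members live at offset +10 */
         __u16 tx_queue;
     };
 };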

Alexei Starovoitov (16):
  libbpf: Replace btf__type_by_id() with btf_type_by_id().
  bpf: Rename btf_member accessors.
  bpf: Prepare relo_core.c for kernel duty.
  bpf: Define enum bpf_core_relo_kind as uapi.
  bpf: Pass a set of bpf_core_relo-s to prog_load command.
  bpf: Adjust BTF log size limit.
  bpf: Add bpf_core_add_cands() and wire it into
    bpf_core_apply_relo_insn().
  libbpf: Use CO-RE in the kernel in light skeleton.
  libbpf: Support init of inner maps in light skeleton.
  libbpf: Clean gen_loader's attach kind.
  selftests/bpf: Add lskel version of kfunc test.
  selftests/bpf: Improve inner_map test coverage.
  selftests/bpf: Convert map_ptr_kern test to use light skeleton.
  selftests/bpf: Additional test for CO-RE in the kernel.
  selftests/bpf: Revert CO-RE removal in test_ksyms_weak.
  selftests/bpf: Add CO-RE relocations to verifier scale test.

 include/linux/bpf.h                           |   8 +
 include/linux/btf.h                           |  89 +++++-
 include/uapi/linux/bpf.h                      |  78 ++++-
 kernel/bpf/Makefile                           |   4 +
 kernel/bpf/bpf_struct_ops.c                   |   6 +-
 kernel/bpf/btf.c                              | 300 +++++++++++++++++-
 kernel/bpf/syscall.c                          |   2 +-
 kernel/bpf/verifier.c                         |  76 +++++
 net/ipv4/bpf_tcp_ca.c                         |   6 +-
 tools/include/uapi/linux/bpf.h                |  78 ++++-
 tools/lib/bpf/bpf_gen_internal.h              |   4 +
 tools/lib/bpf/btf.c                           |   2 +-
 tools/lib/bpf/gen_loader.c                    |  72 ++++-
 tools/lib/bpf/libbpf.c                        | 116 +++++--
 tools/lib/bpf/libbpf_internal.h               |   2 +-
 tools/lib/bpf/relo_core.c                     | 179 +++++++----
 tools/lib/bpf/relo_core.h                     |  73 +----
 tools/testing/selftests/bpf/Makefile          |   5 +-
 .../selftests/bpf/prog_tests/core_kern.c      |  14 +
 .../selftests/bpf/prog_tests/kfunc_call.c     |  24 ++
 .../selftests/bpf/prog_tests/map_ptr.c        |  16 +-
 tools/testing/selftests/bpf/progs/core_kern.c |  87 +++++
 .../selftests/bpf/progs/map_ptr_kern.c        |  16 +-
 .../selftests/bpf/progs/test_ksyms_weak.c     |   2 +-
 .../selftests/bpf/progs/test_verif_scale2.c   |   4 +-
 25 files changed, 1050 insertions(+), 213 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/core_kern.c
 create mode 100644 tools/testing/selftests/bpf/progs/core_kern.c

-- 
2.30.2



* [PATCH v4 bpf-next 01/16] libbpf: Replace btf__type_by_id() with btf_type_by_id().
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
@ 2021-11-24  6:01 ` Alexei Starovoitov
  2021-11-24  6:01 ` [PATCH v4 bpf-next 02/16] bpf: Rename btf_member accessors Alexei Starovoitov
                   ` (14 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:01 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

To prepare relo_core.c to be compiled in both the kernel and user space,
replace btf__type_by_id() with btf_type_by_id().

In libbpf, btf__type_by_id() and btf_type_by_id() have different behavior.

bpf_core_apply_relo_insn() needs the behavior of the uapi btf__type_by_id()
rather than the internal btf_type_by_id(), but the type_id range check is
already done in bpf_core_apply_relo(), so it's safe to replace it everywhere.
The kernel's btf_type_by_id() does the check anyway, so it doesn't hurt.
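
For reference, the behavioral difference boils down to input validation.
Roughly (a sketch of the libbpf side, not a verbatim quote of its source):

 /* uapi accessor: validates type_id and reports failure to the caller */
 const struct btf_type *btf__type_by_id(const struct btf *btf, __u32 type_id)
 {
     if (type_id >= btf->start_id + btf->nr_types)
         return errno = EINVAL, NULL;
     return btf_type_by_id((struct btf *)btf, type_id);
 }

 /* internal accessor btf_type_by_id(): assumes type_id was already checked */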

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/lib/bpf/btf.c             |  2 +-
 tools/lib/bpf/libbpf_internal.h |  2 +-
 tools/lib/bpf/relo_core.c       | 19 ++++++++-----------
 3 files changed, 10 insertions(+), 13 deletions(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index e97217a77196..4a1115eb39b4 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -454,7 +454,7 @@ const struct btf *btf__base_btf(const struct btf *btf)
 }
 
 /* internal helper returning non-const pointer to a type */
-struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id)
+struct btf_type *btf_type_by_id(const struct btf *btf, __u32 type_id)
 {
 	if (type_id == 0)
 		return &btf_void;
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index f7ac349650a1..1e1250e1dfa3 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -172,7 +172,7 @@ static inline void *libbpf_reallocarray(void *ptr, size_t nmemb, size_t size)
 struct btf;
 struct btf_type;
 
-struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id);
+struct btf_type *btf_type_by_id(const struct btf *btf, __u32 type_id);
 const char *btf_kind_str(const struct btf_type *t);
 const struct btf_type *skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id);
 
diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c
index b5b8956a1be8..c0904f4cb514 100644
--- a/tools/lib/bpf/relo_core.c
+++ b/tools/lib/bpf/relo_core.c
@@ -51,7 +51,7 @@ static bool is_flex_arr(const struct btf *btf,
 		return false;
 
 	/* has to be the last member of enclosing struct */
-	t = btf__type_by_id(btf, acc->type_id);
+	t = btf_type_by_id(btf, acc->type_id);
 	return acc->idx == btf_vlen(t) - 1;
 }
 
@@ -388,7 +388,7 @@ static int bpf_core_match_member(const struct btf *local_btf,
 		return 0;
 
 	local_id = local_acc->type_id;
-	local_type = btf__type_by_id(local_btf, local_id);
+	local_type = btf_type_by_id(local_btf, local_id);
 	local_member = btf_members(local_type) + local_acc->idx;
 	local_name = btf__name_by_offset(local_btf, local_member->name_off);
 
@@ -580,7 +580,7 @@ static int bpf_core_calc_field_relo(const char *prog_name,
 		return -EUCLEAN; /* request instruction poisoning */
 
 	acc = &spec->spec[spec->len - 1];
-	t = btf__type_by_id(spec->btf, acc->type_id);
+	t = btf_type_by_id(spec->btf, acc->type_id);
 
 	/* a[n] accessor needs special handling */
 	if (!acc->name) {
@@ -729,7 +729,7 @@ static int bpf_core_calc_enumval_relo(const struct bpf_core_relo *relo,
 	case BPF_ENUMVAL_VALUE:
 		if (!spec)
 			return -EUCLEAN; /* request instruction poisoning */
-		t = btf__type_by_id(spec->btf, spec->spec[0].type_id);
+		t = btf_type_by_id(spec->btf, spec->spec[0].type_id);
 		e = btf_enum(t) + spec->spec[0].idx;
 		*val = e->val;
 		break;
@@ -805,8 +805,8 @@ static int bpf_core_calc_relo(const char *prog_name,
 		if (res->orig_sz != res->new_sz) {
 			const struct btf_type *orig_t, *new_t;
 
-			orig_t = btf__type_by_id(local_spec->btf, res->orig_type_id);
-			new_t = btf__type_by_id(targ_spec->btf, res->new_type_id);
+			orig_t = btf_type_by_id(local_spec->btf, res->orig_type_id);
+			new_t = btf_type_by_id(targ_spec->btf, res->new_type_id);
 
 			/* There are two use cases in which it's safe to
 			 * adjust load/store's mem size:
@@ -1054,7 +1054,7 @@ static void bpf_core_dump_spec(int level, const struct bpf_core_spec *spec)
 	int i;
 
 	type_id = spec->root_type_id;
-	t = btf__type_by_id(spec->btf, type_id);
+	t = btf_type_by_id(spec->btf, type_id);
 	s = btf__name_by_offset(spec->btf, t->name_off);
 
 	libbpf_print(level, "[%u] %s %s", type_id, btf_kind_str(t), str_is_empty(s) ? "<anon>" : s);
@@ -1158,10 +1158,7 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
 	int i, j, err;
 
 	local_id = relo->type_id;
-	local_type = btf__type_by_id(local_btf, local_id);
-	if (!local_type)
-		return -EINVAL;
-
+	local_type = btf_type_by_id(local_btf, local_id);
 	local_name = btf__name_by_offset(local_btf, local_type->name_off);
 	if (!local_name)
 		return -EINVAL;
-- 
2.30.2



* [PATCH v4 bpf-next 02/16] bpf: Rename btf_member accessors.
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
  2021-11-24  6:01 ` [PATCH v4 bpf-next 01/16] libbpf: Replace btf__type_by_id() with btf_type_by_id() Alexei Starovoitov
@ 2021-11-24  6:01 ` Alexei Starovoitov
  2021-11-24  6:01 ` [PATCH v4 bpf-next 03/16] bpf: Prepare relo_core.c for kernel duty Alexei Starovoitov
                   ` (13 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:01 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Rename btf_member_bit_offset() and btf_member_bitfield_size() to
avoid conflicts with the similarly named helpers in libbpf's btf.h.
The kernel helpers are the ones being renamed, since the libbpf
helpers are part of its uapi.
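
For context, the clash is between two helpers with the same name but
different signatures; the prototypes below are paraphrased from
tools/lib/bpf/btf.h and include/linux/btf.h respectively:

 /* libbpf btf.h: takes a member index */
 __u32 btf_member_bit_offset(const struct btf_type *t, __u32 member_idx);

 /* kernel btf.h before this patch: takes a member pointer */
 u32 btf_member_bit_offset(const struct btf_type *struct_type,
                           const struct btf_member *member);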

Suggested-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/linux/btf.h         |  8 ++++----
 kernel/bpf/bpf_struct_ops.c |  6 +++---
 kernel/bpf/btf.c            | 18 +++++++++---------
 net/ipv4/bpf_tcp_ca.c       |  6 +++---
 4 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/include/linux/btf.h b/include/linux/btf.h
index 203eef993d76..956f70388f69 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -194,15 +194,15 @@ static inline bool btf_type_kflag(const struct btf_type *t)
 	return BTF_INFO_KFLAG(t->info);
 }
 
-static inline u32 btf_member_bit_offset(const struct btf_type *struct_type,
-					const struct btf_member *member)
+static inline u32 __btf_member_bit_offset(const struct btf_type *struct_type,
+					  const struct btf_member *member)
 {
 	return btf_type_kflag(struct_type) ? BTF_MEMBER_BIT_OFFSET(member->offset)
 					   : member->offset;
 }
 
-static inline u32 btf_member_bitfield_size(const struct btf_type *struct_type,
-					   const struct btf_member *member)
+static inline u32 __btf_member_bitfield_size(const struct btf_type *struct_type,
+					     const struct btf_member *member)
 {
 	return btf_type_kflag(struct_type) ? BTF_MEMBER_BITFIELD_SIZE(member->offset)
 					   : 0;
diff --git a/kernel/bpf/bpf_struct_ops.c b/kernel/bpf/bpf_struct_ops.c
index 8ecfe4752769..21069dbe9138 100644
--- a/kernel/bpf/bpf_struct_ops.c
+++ b/kernel/bpf/bpf_struct_ops.c
@@ -165,7 +165,7 @@ void bpf_struct_ops_init(struct btf *btf, struct bpf_verifier_log *log)
 				break;
 			}
 
-			if (btf_member_bitfield_size(t, member)) {
+			if (__btf_member_bitfield_size(t, member)) {
 				pr_warn("bit field member %s in struct %s is not supported\n",
 					mname, st_ops->name);
 				break;
@@ -296,7 +296,7 @@ static int check_zero_holes(const struct btf_type *t, void *data)
 	const struct btf_type *mtype;
 
 	for_each_member(i, t, member) {
-		moff = btf_member_bit_offset(t, member) / 8;
+		moff = __btf_member_bit_offset(t, member) / 8;
 		if (moff > prev_mend &&
 		    memchr_inv(data + prev_mend, 0, moff - prev_mend))
 			return -EINVAL;
@@ -387,7 +387,7 @@ static int bpf_struct_ops_map_update_elem(struct bpf_map *map, void *key,
 		struct bpf_prog *prog;
 		u32 moff;
 
-		moff = btf_member_bit_offset(t, member) / 8;
+		moff = __btf_member_bit_offset(t, member) / 8;
 		ptype = btf_type_resolve_ptr(btf_vmlinux, member->type, NULL);
 		if (ptype == module_type) {
 			if (*(void **)(udata + moff))
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 6b9d23be1e99..f4119a99da7b 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -2969,7 +2969,7 @@ static s32 btf_struct_check_meta(struct btf_verifier_env *env,
 			return -EINVAL;
 		}
 
-		offset = btf_member_bit_offset(t, member);
+		offset = __btf_member_bit_offset(t, member);
 		if (is_union && offset) {
 			btf_verifier_log_member(env, t, member,
 						"Invalid member bits_offset");
@@ -3094,7 +3094,7 @@ static int btf_find_struct_field(const struct btf *btf, const struct btf_type *t
 		if (off != -ENOENT)
 			/* only one such field is allowed */
 			return -E2BIG;
-		off = btf_member_bit_offset(t, member);
+		off = __btf_member_bit_offset(t, member);
 		if (off % 8)
 			/* valid C code cannot generate such BTF */
 			return -EINVAL;
@@ -3184,8 +3184,8 @@ static void __btf_struct_show(const struct btf *btf, const struct btf_type *t,
 
 		btf_show_start_member(show, member);
 
-		member_offset = btf_member_bit_offset(t, member);
-		bitfield_size = btf_member_bitfield_size(t, member);
+		member_offset = __btf_member_bit_offset(t, member);
+		bitfield_size = __btf_member_bitfield_size(t, member);
 		bytes_offset = BITS_ROUNDDOWN_BYTES(member_offset);
 		bits8_offset = BITS_PER_BYTE_MASKED(member_offset);
 		if (bitfield_size) {
@@ -5060,7 +5060,7 @@ static int btf_struct_walk(struct bpf_verifier_log *log, const struct btf *btf,
 		if (array_elem->nelems != 0)
 			goto error;
 
-		moff = btf_member_bit_offset(t, member) / 8;
+		moff = __btf_member_bit_offset(t, member) / 8;
 		if (off < moff)
 			goto error;
 
@@ -5083,14 +5083,14 @@ static int btf_struct_walk(struct bpf_verifier_log *log, const struct btf *btf,
 
 	for_each_member(i, t, member) {
 		/* offset of the field in bytes */
-		moff = btf_member_bit_offset(t, member) / 8;
+		moff = __btf_member_bit_offset(t, member) / 8;
 		if (off + size <= moff)
 			/* won't find anything, field is already too far */
 			break;
 
-		if (btf_member_bitfield_size(t, member)) {
-			u32 end_bit = btf_member_bit_offset(t, member) +
-				btf_member_bitfield_size(t, member);
+		if (__btf_member_bitfield_size(t, member)) {
+			u32 end_bit = __btf_member_bit_offset(t, member) +
+				__btf_member_bitfield_size(t, member);
 
 			/* off <= moff instead of off == moff because clang
 			 * does not generate a BTF member for anonymous
diff --git a/net/ipv4/bpf_tcp_ca.c b/net/ipv4/bpf_tcp_ca.c
index 2cf02b4d77fb..67466dbff152 100644
--- a/net/ipv4/bpf_tcp_ca.c
+++ b/net/ipv4/bpf_tcp_ca.c
@@ -169,7 +169,7 @@ static u32 prog_ops_moff(const struct bpf_prog *prog)
 	t = bpf_tcp_congestion_ops.type;
 	m = &btf_type_member(t)[midx];
 
-	return btf_member_bit_offset(t, m) / 8;
+	return __btf_member_bit_offset(t, m) / 8;
 }
 
 static const struct bpf_func_proto *
@@ -244,7 +244,7 @@ static int bpf_tcp_ca_init_member(const struct btf_type *t,
 	utcp_ca = (const struct tcp_congestion_ops *)udata;
 	tcp_ca = (struct tcp_congestion_ops *)kdata;
 
-	moff = btf_member_bit_offset(t, member) / 8;
+	moff = __btf_member_bit_offset(t, member) / 8;
 	switch (moff) {
 	case offsetof(struct tcp_congestion_ops, flags):
 		if (utcp_ca->flags & ~TCP_CONG_MASK)
@@ -274,7 +274,7 @@ static int bpf_tcp_ca_init_member(const struct btf_type *t,
 static int bpf_tcp_ca_check_member(const struct btf_type *t,
 				   const struct btf_member *member)
 {
-	if (is_unsupported(btf_member_bit_offset(t, member) / 8))
+	if (is_unsupported(__btf_member_bit_offset(t, member) / 8))
 		return -ENOTSUPP;
 	return 0;
 }
-- 
2.30.2



* [PATCH v4 bpf-next 03/16] bpf: Prepare relo_core.c for kernel duty.
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
  2021-11-24  6:01 ` [PATCH v4 bpf-next 01/16] libbpf: Replace btf__type_by_id() with btf_type_by_id() Alexei Starovoitov
  2021-11-24  6:01 ` [PATCH v4 bpf-next 02/16] bpf: Rename btf_member accessors Alexei Starovoitov
@ 2021-11-24  6:01 ` Alexei Starovoitov
  2021-11-24  6:01 ` [PATCH v4 bpf-next 04/16] bpf: Define enum bpf_core_relo_kind as uapi Alexei Starovoitov
                   ` (12 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:01 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Make relo_core.c compile for both the kernel and user space libbpf.

Note the patch reduces BPF_CORE_SPEC_MAX_LEN from 64 to 32.
This is the maximum number of nested structs and arrays.
For example:
 struct sample {
     int a;
     struct {
         int b[10];
     };
 };

 struct sample *s = ...;
 int *y = &s->b[5];
This field access is encoded as "0:1:0:5" and the spec len is 4:
accessor 0 dereferences s itself, 1 selects the anonymous struct
(member #1 of struct sample), 0 selects b (member #0 of the anonymous
struct), and 5 indexes element #5 of the array.

A follow-up patch might bump it back to 64.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/btf.h       | 81 +++++++++++++++++++++++++++++++++++++++
 kernel/bpf/Makefile       |  4 ++
 kernel/bpf/btf.c          | 26 +++++++++++++
 tools/lib/bpf/relo_core.c | 76 ++++++++++++++++++++++++++++++------
 4 files changed, 176 insertions(+), 11 deletions(-)

diff --git a/include/linux/btf.h b/include/linux/btf.h
index 956f70388f69..acef6ef28768 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -144,6 +144,53 @@ static inline bool btf_type_is_enum(const struct btf_type *t)
 	return BTF_INFO_KIND(t->info) == BTF_KIND_ENUM;
 }
 
+static inline bool str_is_empty(const char *s)
+{
+	return !s || !s[0];
+}
+
+static inline u16 btf_kind(const struct btf_type *t)
+{
+	return BTF_INFO_KIND(t->info);
+}
+
+static inline bool btf_is_enum(const struct btf_type *t)
+{
+	return btf_kind(t) == BTF_KIND_ENUM;
+}
+
+static inline bool btf_is_composite(const struct btf_type *t)
+{
+	u16 kind = btf_kind(t);
+
+	return kind == BTF_KIND_STRUCT || kind == BTF_KIND_UNION;
+}
+
+static inline bool btf_is_array(const struct btf_type *t)
+{
+	return btf_kind(t) == BTF_KIND_ARRAY;
+}
+
+static inline bool btf_is_int(const struct btf_type *t)
+{
+	return btf_kind(t) == BTF_KIND_INT;
+}
+
+static inline bool btf_is_ptr(const struct btf_type *t)
+{
+	return btf_kind(t) == BTF_KIND_PTR;
+}
+
+static inline u8 btf_int_offset(const struct btf_type *t)
+{
+	return BTF_INT_OFFSET(*(u32 *)(t + 1));
+}
+
+static inline u8 btf_int_encoding(const struct btf_type *t)
+{
+	return BTF_INT_ENCODING(*(u32 *)(t + 1));
+}
+
 static inline bool btf_type_is_scalar(const struct btf_type *t)
 {
 	return btf_type_is_int(t) || btf_type_is_enum(t);
@@ -184,6 +231,11 @@ static inline u16 btf_type_vlen(const struct btf_type *t)
 	return BTF_INFO_VLEN(t->info);
 }
 
+static inline u16 btf_vlen(const struct btf_type *t)
+{
+	return btf_type_vlen(t);
+}
+
 static inline u16 btf_func_linkage(const struct btf_type *t)
 {
 	return BTF_INFO_VLEN(t->info);
@@ -208,11 +260,40 @@ static inline u32 __btf_member_bitfield_size(const struct btf_type *struct_type,
 					   : 0;
 }
 
+static inline struct btf_member *btf_members(const struct btf_type *t)
+{
+	return (struct btf_member *)(t + 1);
+}
+
+static inline u32 btf_member_bit_offset(const struct btf_type *t, u32 member_idx)
+{
+	const struct btf_member *m = btf_members(t) + member_idx;
+
+	return __btf_member_bit_offset(t, m);
+}
+
+static inline u32 btf_member_bitfield_size(const struct btf_type *t, u32 member_idx)
+{
+	const struct btf_member *m = btf_members(t) + member_idx;
+
+	return __btf_member_bitfield_size(t, m);
+}
+
 static inline const struct btf_member *btf_type_member(const struct btf_type *t)
 {
 	return (const struct btf_member *)(t + 1);
 }
 
+static inline struct btf_array *btf_array(const struct btf_type *t)
+{
+	return (struct btf_array *)(t + 1);
+}
+
+static inline struct btf_enum *btf_enum(const struct btf_type *t)
+{
+	return (struct btf_enum *)(t + 1);
+}
+
 static inline const struct btf_var_secinfo *btf_type_var_secinfo(
 		const struct btf_type *t)
 {
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index cf6ca339f3cd..c1a9be6a4b9f 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -36,3 +36,7 @@ obj-$(CONFIG_BPF_SYSCALL) += bpf_struct_ops.o
 obj-${CONFIG_BPF_LSM} += bpf_lsm.o
 endif
 obj-$(CONFIG_BPF_PRELOAD) += preload/
+
+obj-$(CONFIG_BPF_SYSCALL) += relo_core.o
+$(obj)/relo_core.o: $(srctree)/tools/lib/bpf/relo_core.c FORCE
+	$(call if_changed_rule,cc_o_c)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index f4119a99da7b..c79595aad55b 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6413,3 +6413,29 @@ bool bpf_check_mod_kfunc_call(struct kfunc_btf_id_list *klist, u32 kfunc_id,
 
 DEFINE_KFUNC_BTF_ID_LIST(bpf_tcp_ca_kfunc_list);
 DEFINE_KFUNC_BTF_ID_LIST(prog_test_kfunc_list);
+
+int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id,
+			      const struct btf *targ_btf, __u32 targ_id)
+{
+	return -EOPNOTSUPP;
+}
+
+static bool bpf_core_is_flavor_sep(const char *s)
+{
+	/* check X___Y name pattern, where X and Y are not underscores */
+	return s[0] != '_' &&				      /* X */
+	       s[1] == '_' && s[2] == '_' && s[3] == '_' &&   /* ___ */
+	       s[4] != '_';				      /* Y */
+}
+
+size_t bpf_core_essential_name_len(const char *name)
+{
+	size_t n = strlen(name);
+	int i;
+
+	for (i = n - 5; i >= 0; i--) {
+		if (bpf_core_is_flavor_sep(name + i))
+			return i + 1;
+	}
+	return n;
+}
diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c
index c0904f4cb514..56dbe6d16664 100644
--- a/tools/lib/bpf/relo_core.c
+++ b/tools/lib/bpf/relo_core.c
@@ -1,6 +1,60 @@
 // SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
 /* Copyright (c) 2019 Facebook */
 
+#ifdef __KERNEL__
+#include <linux/bpf.h>
+#include <linux/btf.h>
+#include <linux/string.h>
+#include <linux/bpf_verifier.h>
+#include "relo_core.h"
+
+static const char *btf_kind_str(const struct btf_type *t)
+{
+	return btf_type_str(t);
+}
+
+static bool is_ldimm64_insn(struct bpf_insn *insn)
+{
+	return insn->code == (BPF_LD | BPF_IMM | BPF_DW);
+}
+
+static const struct btf_type *
+skip_mods_and_typedefs(const struct btf *btf, u32 id, u32 *res_id)
+{
+	return btf_type_skip_modifiers(btf, id, res_id);
+}
+
+static const char *btf__name_by_offset(const struct btf *btf, u32 offset)
+{
+	return btf_name_by_offset(btf, offset);
+}
+
+static s64 btf__resolve_size(const struct btf *btf, u32 type_id)
+{
+	const struct btf_type *t;
+	int size;
+
+	t = btf_type_by_id(btf, type_id);
+	t = btf_resolve_size(btf, t, &size);
+	if (IS_ERR(t))
+		return PTR_ERR(t);
+	return size;
+}
+
+enum libbpf_print_level {
+	LIBBPF_WARN,
+	LIBBPF_INFO,
+	LIBBPF_DEBUG,
+};
+
+#undef pr_warn
+#undef pr_info
+#undef pr_debug
+#define pr_warn(fmt, log, ...)	bpf_log((void *)log, fmt, "", ##__VA_ARGS__)
+#define pr_info(fmt, log, ...)	bpf_log((void *)log, fmt, "", ##__VA_ARGS__)
+#define pr_debug(fmt, log, ...)	bpf_log((void *)log, fmt, "", ##__VA_ARGS__)
+#define libbpf_print(level, fmt, ...)	bpf_log((void *)prog_name, fmt, ##__VA_ARGS__)
+#else
 #include <stdio.h>
 #include <string.h>
 #include <errno.h>
@@ -12,8 +66,9 @@
 #include "btf.h"
 #include "str_error.h"
 #include "libbpf_internal.h"
+#endif
 
-#define BPF_CORE_SPEC_MAX_LEN 64
+#define BPF_CORE_SPEC_MAX_LEN 32
 
 /* represents BPF CO-RE field or array element accessor */
 struct bpf_core_accessor {
@@ -150,7 +205,7 @@ static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind)
  * Enum value-based relocations (ENUMVAL_EXISTS/ENUMVAL_VALUE) use access
  * string to specify enumerator's value index that need to be relocated.
  */
-static int bpf_core_parse_spec(const struct btf *btf,
+static int bpf_core_parse_spec(const char *prog_name, const struct btf *btf,
 			       __u32 type_id,
 			       const char *spec_str,
 			       enum bpf_core_relo_kind relo_kind,
@@ -272,8 +327,8 @@ static int bpf_core_parse_spec(const struct btf *btf,
 				return sz;
 			spec->bit_offset += access_idx * sz * 8;
 		} else {
-			pr_warn("relo for [%u] %s (at idx %d) captures type [%d] of unexpected kind %s\n",
-				type_id, spec_str, i, id, btf_kind_str(t));
+			pr_warn("prog '%s': relo for [%u] %s (at idx %d) captures type [%d] of unexpected kind %s\n",
+				prog_name, type_id, spec_str, i, id, btf_kind_str(t));
 			return -EINVAL;
 		}
 	}
@@ -346,8 +401,6 @@ static int bpf_core_fields_are_compat(const struct btf *local_btf,
 		targ_id = btf_array(targ_type)->type;
 		goto recur;
 	default:
-		pr_warn("unexpected kind %d relocated, local [%d], target [%d]\n",
-			btf_kind(local_type), local_id, targ_id);
 		return 0;
 	}
 }
@@ -1045,7 +1098,7 @@ static int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn,
  * [<type-id>] (<type-name>) + <raw-spec> => <offset>@<spec>,
  * where <spec> is a C-syntax view of recorded field access, e.g.: x.a[3].b
  */
-static void bpf_core_dump_spec(int level, const struct bpf_core_spec *spec)
+static void bpf_core_dump_spec(const char *prog_name, int level, const struct bpf_core_spec *spec)
 {
 	const struct btf_type *t;
 	const struct btf_enum *e;
@@ -1167,7 +1220,8 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
 	if (str_is_empty(spec_str))
 		return -EINVAL;
 
-	err = bpf_core_parse_spec(local_btf, local_id, spec_str, relo->kind, &local_spec);
+	err = bpf_core_parse_spec(prog_name, local_btf, local_id, spec_str,
+				  relo->kind, &local_spec);
 	if (err) {
 		pr_warn("prog '%s': relo #%d: parsing [%d] %s %s + %s failed: %d\n",
 			prog_name, relo_idx, local_id, btf_kind_str(local_type),
@@ -1178,7 +1232,7 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
 
 	pr_debug("prog '%s': relo #%d: kind <%s> (%d), spec is ", prog_name,
 		 relo_idx, core_relo_kind_str(relo->kind), relo->kind);
-	bpf_core_dump_spec(LIBBPF_DEBUG, &local_spec);
+	bpf_core_dump_spec(prog_name, LIBBPF_DEBUG, &local_spec);
 	libbpf_print(LIBBPF_DEBUG, "\n");
 
 	/* TYPE_ID_LOCAL relo is special and doesn't need candidate search */
@@ -1204,14 +1258,14 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
 		if (err < 0) {
 			pr_warn("prog '%s': relo #%d: error matching candidate #%d ",
 				prog_name, relo_idx, i);
-			bpf_core_dump_spec(LIBBPF_WARN, &cand_spec);
+			bpf_core_dump_spec(prog_name, LIBBPF_WARN, &cand_spec);
 			libbpf_print(LIBBPF_WARN, ": %d\n", err);
 			return err;
 		}
 
 		pr_debug("prog '%s': relo #%d: %s candidate #%d ", prog_name,
 			 relo_idx, err == 0 ? "non-matching" : "matching", i);
-		bpf_core_dump_spec(LIBBPF_DEBUG, &cand_spec);
+		bpf_core_dump_spec(prog_name, LIBBPF_DEBUG, &cand_spec);
 		libbpf_print(LIBBPF_DEBUG, "\n");
 
 		if (err == 0)
-- 
2.30.2



* [PATCH v4 bpf-next 04/16] bpf: Define enum bpf_core_relo_kind as uapi.
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
                   ` (2 preceding siblings ...)
  2021-11-24  6:01 ` [PATCH v4 bpf-next 03/16] bpf: Prepare relo_core.c for kernel duty Alexei Starovoitov
@ 2021-11-24  6:01 ` Alexei Starovoitov
  2021-11-24  6:01 ` [PATCH v4 bpf-next 05/16] bpf: Pass a set of bpf_core_relo-s to prog_load command Alexei Starovoitov
                   ` (11 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:01 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

enum bpf_core_relo_kind is generated by llvm and processed by libbpf.
It's a de-facto uapi.
With CO-RE in the kernel the bpf_core_relo_kind values become uapi de-jure.
Also rename them with a BPF_CORE_ prefix to distinguish them from the
conflicting names in bpf_core_read.h. The enums bpf_field_info_kind,
bpf_type_id_kind, bpf_type_info_kind and bpf_enum_value_kind pass
different values from the bpf program into llvm.
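
As an illustration of the clash, bpf_core_read.h already defines
program-facing enums that reuse these names with different numeric values;
paraphrased here for context (not part of this patch):

 enum bpf_type_id_kind {
     BPF_TYPE_ID_LOCAL = 0,   /* the relo kind BPF_CORE_TYPE_ID_LOCAL is 6 */
     BPF_TYPE_ID_TARGET = 1,  /* the relo kind BPF_CORE_TYPE_ID_TARGET is 7 */
 };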

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/uapi/linux/bpf.h       | 19 ++++++++
 tools/include/uapi/linux/bpf.h | 19 ++++++++
 tools/lib/bpf/libbpf.c         |  2 +-
 tools/lib/bpf/relo_core.c      | 84 +++++++++++++++++-----------------
 tools/lib/bpf/relo_core.h      | 18 +-------
 5 files changed, 82 insertions(+), 60 deletions(-)

diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index a69e4b04ffeb..190e9e3c0693 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -6349,4 +6349,23 @@ enum {
 	BTF_F_ZERO	=	(1ULL << 3),
 };
 
+/* bpf_core_relo_kind encodes which aspect of captured field/type/enum value
+ * has to be adjusted by relocations. It is emitted by llvm and passed to
+ * libbpf and later to the kernel.
+ */
+enum bpf_core_relo_kind {
+	BPF_CORE_FIELD_BYTE_OFFSET = 0,      /* field byte offset */
+	BPF_CORE_FIELD_BYTE_SIZE = 1,        /* field size in bytes */
+	BPF_CORE_FIELD_EXISTS = 2,           /* field existence in target kernel */
+	BPF_CORE_FIELD_SIGNED = 3,           /* field signedness (0 - unsigned, 1 - signed) */
+	BPF_CORE_FIELD_LSHIFT_U64 = 4,       /* bitfield-specific left bitshift */
+	BPF_CORE_FIELD_RSHIFT_U64 = 5,       /* bitfield-specific right bitshift */
+	BPF_CORE_TYPE_ID_LOCAL = 6,          /* type ID in local BPF object */
+	BPF_CORE_TYPE_ID_TARGET = 7,         /* type ID in target kernel */
+	BPF_CORE_TYPE_EXISTS = 8,            /* type existence in target kernel */
+	BPF_CORE_TYPE_SIZE = 9,              /* type size in bytes */
+	BPF_CORE_ENUMVAL_EXISTS = 10,        /* enum value existence in target kernel */
+	BPF_CORE_ENUMVAL_VALUE = 11,         /* enum value integer value */
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index a69e4b04ffeb..190e9e3c0693 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -6349,4 +6349,23 @@ enum {
 	BTF_F_ZERO	=	(1ULL << 3),
 };
 
+/* bpf_core_relo_kind encodes which aspect of captured field/type/enum value
+ * has to be adjusted by relocations. It is emitted by llvm and passed to
+ * libbpf and later to the kernel.
+ */
+enum bpf_core_relo_kind {
+	BPF_CORE_FIELD_BYTE_OFFSET = 0,      /* field byte offset */
+	BPF_CORE_FIELD_BYTE_SIZE = 1,        /* field size in bytes */
+	BPF_CORE_FIELD_EXISTS = 2,           /* field existence in target kernel */
+	BPF_CORE_FIELD_SIGNED = 3,           /* field signedness (0 - unsigned, 1 - signed) */
+	BPF_CORE_FIELD_LSHIFT_U64 = 4,       /* bitfield-specific left bitshift */
+	BPF_CORE_FIELD_RSHIFT_U64 = 5,       /* bitfield-specific right bitshift */
+	BPF_CORE_TYPE_ID_LOCAL = 6,          /* type ID in local BPF object */
+	BPF_CORE_TYPE_ID_TARGET = 7,         /* type ID in target kernel */
+	BPF_CORE_TYPE_EXISTS = 8,            /* type existence in target kernel */
+	BPF_CORE_TYPE_SIZE = 9,              /* type size in bytes */
+	BPF_CORE_ENUMVAL_EXISTS = 10,        /* enum value existence in target kernel */
+	BPF_CORE_ENUMVAL_VALUE = 11,         /* enum value integer value */
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index af405c38aadc..fc9a179ea412 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -5459,7 +5459,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog,
 		return -ENOTSUP;
 	}
 
-	if (relo->kind != BPF_TYPE_ID_LOCAL &&
+	if (relo->kind != BPF_CORE_TYPE_ID_LOCAL &&
 	    !hashmap__find(cand_cache, type_key, (void **)&cands)) {
 		cands = bpf_core_find_cands(prog->obj, local_btf, local_id);
 		if (IS_ERR(cands)) {
diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c
index 56dbe6d16664..d194fb9306ed 100644
--- a/tools/lib/bpf/relo_core.c
+++ b/tools/lib/bpf/relo_core.c
@@ -113,18 +113,18 @@ static bool is_flex_arr(const struct btf *btf,
 static const char *core_relo_kind_str(enum bpf_core_relo_kind kind)
 {
 	switch (kind) {
-	case BPF_FIELD_BYTE_OFFSET: return "byte_off";
-	case BPF_FIELD_BYTE_SIZE: return "byte_sz";
-	case BPF_FIELD_EXISTS: return "field_exists";
-	case BPF_FIELD_SIGNED: return "signed";
-	case BPF_FIELD_LSHIFT_U64: return "lshift_u64";
-	case BPF_FIELD_RSHIFT_U64: return "rshift_u64";
-	case BPF_TYPE_ID_LOCAL: return "local_type_id";
-	case BPF_TYPE_ID_TARGET: return "target_type_id";
-	case BPF_TYPE_EXISTS: return "type_exists";
-	case BPF_TYPE_SIZE: return "type_size";
-	case BPF_ENUMVAL_EXISTS: return "enumval_exists";
-	case BPF_ENUMVAL_VALUE: return "enumval_value";
+	case BPF_CORE_FIELD_BYTE_OFFSET: return "byte_off";
+	case BPF_CORE_FIELD_BYTE_SIZE: return "byte_sz";
+	case BPF_CORE_FIELD_EXISTS: return "field_exists";
+	case BPF_CORE_FIELD_SIGNED: return "signed";
+	case BPF_CORE_FIELD_LSHIFT_U64: return "lshift_u64";
+	case BPF_CORE_FIELD_RSHIFT_U64: return "rshift_u64";
+	case BPF_CORE_TYPE_ID_LOCAL: return "local_type_id";
+	case BPF_CORE_TYPE_ID_TARGET: return "target_type_id";
+	case BPF_CORE_TYPE_EXISTS: return "type_exists";
+	case BPF_CORE_TYPE_SIZE: return "type_size";
+	case BPF_CORE_ENUMVAL_EXISTS: return "enumval_exists";
+	case BPF_CORE_ENUMVAL_VALUE: return "enumval_value";
 	default: return "unknown";
 	}
 }
@@ -132,12 +132,12 @@ static const char *core_relo_kind_str(enum bpf_core_relo_kind kind)
 static bool core_relo_is_field_based(enum bpf_core_relo_kind kind)
 {
 	switch (kind) {
-	case BPF_FIELD_BYTE_OFFSET:
-	case BPF_FIELD_BYTE_SIZE:
-	case BPF_FIELD_EXISTS:
-	case BPF_FIELD_SIGNED:
-	case BPF_FIELD_LSHIFT_U64:
-	case BPF_FIELD_RSHIFT_U64:
+	case BPF_CORE_FIELD_BYTE_OFFSET:
+	case BPF_CORE_FIELD_BYTE_SIZE:
+	case BPF_CORE_FIELD_EXISTS:
+	case BPF_CORE_FIELD_SIGNED:
+	case BPF_CORE_FIELD_LSHIFT_U64:
+	case BPF_CORE_FIELD_RSHIFT_U64:
 		return true;
 	default:
 		return false;
@@ -147,10 +147,10 @@ static bool core_relo_is_field_based(enum bpf_core_relo_kind kind)
 static bool core_relo_is_type_based(enum bpf_core_relo_kind kind)
 {
 	switch (kind) {
-	case BPF_TYPE_ID_LOCAL:
-	case BPF_TYPE_ID_TARGET:
-	case BPF_TYPE_EXISTS:
-	case BPF_TYPE_SIZE:
+	case BPF_CORE_TYPE_ID_LOCAL:
+	case BPF_CORE_TYPE_ID_TARGET:
+	case BPF_CORE_TYPE_EXISTS:
+	case BPF_CORE_TYPE_SIZE:
 		return true;
 	default:
 		return false;
@@ -160,8 +160,8 @@ static bool core_relo_is_type_based(enum bpf_core_relo_kind kind)
 static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind)
 {
 	switch (kind) {
-	case BPF_ENUMVAL_EXISTS:
-	case BPF_ENUMVAL_VALUE:
+	case BPF_CORE_ENUMVAL_EXISTS:
+	case BPF_CORE_ENUMVAL_VALUE:
 		return true;
 	default:
 		return false;
@@ -624,7 +624,7 @@ static int bpf_core_calc_field_relo(const char *prog_name,
 
 	*field_sz = 0;
 
-	if (relo->kind == BPF_FIELD_EXISTS) {
+	if (relo->kind == BPF_CORE_FIELD_EXISTS) {
 		*val = spec ? 1 : 0;
 		return 0;
 	}
@@ -637,7 +637,7 @@ static int bpf_core_calc_field_relo(const char *prog_name,
 
 	/* a[n] accessor needs special handling */
 	if (!acc->name) {
-		if (relo->kind == BPF_FIELD_BYTE_OFFSET) {
+		if (relo->kind == BPF_CORE_FIELD_BYTE_OFFSET) {
 			*val = spec->bit_offset / 8;
 			/* remember field size for load/store mem size */
 			sz = btf__resolve_size(spec->btf, acc->type_id);
@@ -645,7 +645,7 @@ static int bpf_core_calc_field_relo(const char *prog_name,
 				return -EINVAL;
 			*field_sz = sz;
 			*type_id = acc->type_id;
-		} else if (relo->kind == BPF_FIELD_BYTE_SIZE) {
+		} else if (relo->kind == BPF_CORE_FIELD_BYTE_SIZE) {
 			sz = btf__resolve_size(spec->btf, acc->type_id);
 			if (sz < 0)
 				return -EINVAL;
@@ -697,36 +697,36 @@ static int bpf_core_calc_field_relo(const char *prog_name,
 		*validate = !bitfield;
 
 	switch (relo->kind) {
-	case BPF_FIELD_BYTE_OFFSET:
+	case BPF_CORE_FIELD_BYTE_OFFSET:
 		*val = byte_off;
 		if (!bitfield) {
 			*field_sz = byte_sz;
 			*type_id = field_type_id;
 		}
 		break;
-	case BPF_FIELD_BYTE_SIZE:
+	case BPF_CORE_FIELD_BYTE_SIZE:
 		*val = byte_sz;
 		break;
-	case BPF_FIELD_SIGNED:
+	case BPF_CORE_FIELD_SIGNED:
 		/* enums will be assumed unsigned */
 		*val = btf_is_enum(mt) ||
 		       (btf_int_encoding(mt) & BTF_INT_SIGNED);
 		if (validate)
 			*validate = true; /* signedness is never ambiguous */
 		break;
-	case BPF_FIELD_LSHIFT_U64:
+	case BPF_CORE_FIELD_LSHIFT_U64:
 #if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
 		*val = 64 - (bit_off + bit_sz - byte_off  * 8);
 #else
 		*val = (8 - byte_sz) * 8 + (bit_off - byte_off * 8);
 #endif
 		break;
-	case BPF_FIELD_RSHIFT_U64:
+	case BPF_CORE_FIELD_RSHIFT_U64:
 		*val = 64 - bit_sz;
 		if (validate)
 			*validate = true; /* right shift is never ambiguous */
 		break;
-	case BPF_FIELD_EXISTS:
+	case BPF_CORE_FIELD_EXISTS:
 	default:
 		return -EOPNOTSUPP;
 	}
@@ -747,20 +747,20 @@ static int bpf_core_calc_type_relo(const struct bpf_core_relo *relo,
 	}
 
 	switch (relo->kind) {
-	case BPF_TYPE_ID_TARGET:
+	case BPF_CORE_TYPE_ID_TARGET:
 		*val = spec->root_type_id;
 		break;
-	case BPF_TYPE_EXISTS:
+	case BPF_CORE_TYPE_EXISTS:
 		*val = 1;
 		break;
-	case BPF_TYPE_SIZE:
+	case BPF_CORE_TYPE_SIZE:
 		sz = btf__resolve_size(spec->btf, spec->root_type_id);
 		if (sz < 0)
 			return -EINVAL;
 		*val = sz;
 		break;
-	case BPF_TYPE_ID_LOCAL:
-	/* BPF_TYPE_ID_LOCAL is handled specially and shouldn't get here */
+	case BPF_CORE_TYPE_ID_LOCAL:
+	/* BPF_CORE_TYPE_ID_LOCAL is handled specially and shouldn't get here */
 	default:
 		return -EOPNOTSUPP;
 	}
@@ -776,10 +776,10 @@ static int bpf_core_calc_enumval_relo(const struct bpf_core_relo *relo,
 	const struct btf_enum *e;
 
 	switch (relo->kind) {
-	case BPF_ENUMVAL_EXISTS:
+	case BPF_CORE_ENUMVAL_EXISTS:
 		*val = spec ? 1 : 0;
 		break;
-	case BPF_ENUMVAL_VALUE:
+	case BPF_CORE_ENUMVAL_VALUE:
 		if (!spec)
 			return -EUCLEAN; /* request instruction poisoning */
 		t = btf_type_by_id(spec->btf, spec->spec[0].type_id);
@@ -1236,7 +1236,7 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
 	libbpf_print(LIBBPF_DEBUG, "\n");
 
 	/* TYPE_ID_LOCAL relo is special and doesn't need candidate search */
-	if (relo->kind == BPF_TYPE_ID_LOCAL) {
+	if (relo->kind == BPF_CORE_TYPE_ID_LOCAL) {
 		targ_res.validate = true;
 		targ_res.poison = false;
 		targ_res.orig_val = local_spec.root_type_id;
@@ -1302,7 +1302,7 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
 	}
 
 	/*
-	 * For BPF_FIELD_EXISTS relo or when used BPF program has field
+	 * For BPF_CORE_FIELD_EXISTS relo or when used BPF program has field
 	 * existence checks or kernel version/config checks, it's expected
 	 * that we might not find any candidates. In this case, if field
 	 * wasn't found in any candidate, the list of candidates shouldn't
diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h
index 3b9f8f18346c..3d0b86e7f439 100644
--- a/tools/lib/bpf/relo_core.h
+++ b/tools/lib/bpf/relo_core.h
@@ -4,23 +4,7 @@
 #ifndef __RELO_CORE_H
 #define __RELO_CORE_H
 
-/* bpf_core_relo_kind encodes which aspect of captured field/type/enum value
- * has to be adjusted by relocations.
- */
-enum bpf_core_relo_kind {
-	BPF_FIELD_BYTE_OFFSET = 0,	/* field byte offset */
-	BPF_FIELD_BYTE_SIZE = 1,	/* field size in bytes */
-	BPF_FIELD_EXISTS = 2,		/* field existence in target kernel */
-	BPF_FIELD_SIGNED = 3,		/* field signedness (0 - unsigned, 1 - signed) */
-	BPF_FIELD_LSHIFT_U64 = 4,	/* bitfield-specific left bitshift */
-	BPF_FIELD_RSHIFT_U64 = 5,	/* bitfield-specific right bitshift */
-	BPF_TYPE_ID_LOCAL = 6,		/* type ID in local BPF object */
-	BPF_TYPE_ID_TARGET = 7,		/* type ID in target kernel */
-	BPF_TYPE_EXISTS = 8,		/* type existence in target kernel */
-	BPF_TYPE_SIZE = 9,		/* type size in bytes */
-	BPF_ENUMVAL_EXISTS = 10,	/* enum value existence in target kernel */
-	BPF_ENUMVAL_VALUE = 11,		/* enum value integer value */
-};
+#include <linux/bpf.h>
 
 /* The minimum bpf_core_relo checked by the loader
  *
-- 
2.30.2



* [PATCH v4 bpf-next 05/16] bpf: Pass a set of bpf_core_relo-s to prog_load command.
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
                   ` (3 preceding siblings ...)
  2021-11-24  6:01 ` [PATCH v4 bpf-next 04/16] bpf: Define enum bpf_core_relo_kind as uapi Alexei Starovoitov
@ 2021-11-24  6:01 ` Alexei Starovoitov
  2021-11-24  6:01 ` [PATCH v4 bpf-next 06/16] bpf: Adjust BTF log size limit Alexei Starovoitov
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:01 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

struct bpf_core_relo is generated by llvm and processed by libbpf.
It's a de-facto uapi.
With CO-RE in the kernel the struct bpf_core_relo becomes uapi de-jure.
Add the ability to pass a set of 'struct bpf_core_relo' to the prog_load
command and let the kernel perform CO-RE relocations.

Note that struct bpf_line_info and struct bpf_func_info have the same
layout when passed from LLVM to libbpf and from libbpf to the kernel,
except that their "insn_off" field means "byte offset" when LLVM generates
it; libbpf then converts it to an "insn index" before passing it to the
kernel. The struct bpf_core_relo's "insn_off" field is always a "byte offset".
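
A minimal sketch of the conversion the kernel ends up doing for core_relos
(it mirrors the check_core_relo() hunk below; each BPF insn is 8 bytes):

 /* core_relo.insn_off is a byte offset into the program */
 insn_idx = core_relo.insn_off / 8;   /* e.g. byte offset 24 -> insn #3 */
 insn = &prog->insnsi[insn_idx];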

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 include/linux/bpf.h            |  8 ++++
 include/uapi/linux/bpf.h       | 59 +++++++++++++++++++++++++-
 kernel/bpf/btf.c               |  6 +++
 kernel/bpf/syscall.c           |  2 +-
 kernel/bpf/verifier.c          | 76 ++++++++++++++++++++++++++++++++++
 tools/include/uapi/linux/bpf.h | 59 +++++++++++++++++++++++++-
 tools/lib/bpf/relo_core.h      | 53 ------------------------
 7 files changed, 207 insertions(+), 56 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index cc7a0c36e7df..a49e1a5573b0 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1732,6 +1732,14 @@ bool bpf_prog_has_kfunc_call(const struct bpf_prog *prog);
 const struct btf_func_model *
 bpf_jit_find_kfunc_model(const struct bpf_prog *prog,
 			 const struct bpf_insn *insn);
+struct bpf_core_ctx {
+	struct bpf_verifier_log *log;
+	const struct btf *btf;
+};
+
+int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo,
+		   int relo_idx, void *insn);
+
 #else /* !CONFIG_BPF_SYSCALL */
 static inline struct bpf_prog *bpf_prog_get(u32 ufd)
 {
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 190e9e3c0693..d66e5b0ea7ba 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -1342,8 +1342,10 @@ union bpf_attr {
 			/* or valid module BTF object fd or 0 to attach to vmlinux */
 			__u32		attach_btf_obj_fd;
 		};
-		__u32		:32;		/* pad */
+		__u32		core_relo_cnt;	/* number of bpf_core_relo */
 		__aligned_u64	fd_array;	/* array of FDs */
+		__aligned_u64	core_relos;
+		__u32		core_relo_rec_size; /* sizeof(struct bpf_core_relo) */
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */
@@ -6368,4 +6370,59 @@ enum bpf_core_relo_kind {
 	BPF_CORE_ENUMVAL_VALUE = 11,         /* enum value integer value */
 };
 
+/*
+ * "struct bpf_core_relo" is used to pass relocation data form LLVM to libbpf
+ * and from libbpf to the kernel.
+ *
+ * CO-RE relocation captures the following data:
+ * - insn_off - instruction offset (in bytes) within a BPF program that needs
+ *   its insn->imm field to be relocated with actual field info;
+ * - type_id - BTF type ID of the "root" (containing) entity of a relocatable
+ *   type or field;
+ * - access_str_off - offset into corresponding .BTF string section. String
+ *   interpretation depends on specific relocation kind:
+ *     - for field-based relocations, string encodes an accessed field using
+ *       a sequence of field and array indices, separated by colon (:). It's
+ *       conceptually very close to LLVM's getelementptr ([0]) instruction's
+ *       arguments for identifying offset to a field.
+ *     - for type-based relocations, strings is expected to be just "0";
+ *     - for enum value-based relocations, string contains an index of enum
+ *       value within its enum type;
+ * - kind - one of enum bpf_core_relo_kind;
+ *
+ * Example:
+ *   struct sample {
+ *       int a;
+ *       struct {
+ *           int b[10];
+ *       };
+ *   };
+ *
+ *   struct sample *s = ...;
+ *   int *x = &s->a;     // encoded as "0:0" (a is field #0)
+ *   int *y = &s->b[5];  // encoded as "0:1:0:5" (anon struct is field #1,
+ *                       // b is field #0 inside anon struct, accessing elem #5)
+ *   int *z = &s[10]->b; // encoded as "10:1" (ptr is used as an array)
+ *
+ * type_id for all relocs in this example will capture BTF type id of
+ * `struct sample`.
+ *
+ * Such relocation is emitted when using __builtin_preserve_access_index()
+ * Clang built-in, passing expression that captures field address, e.g.:
+ *
+ * bpf_probe_read(&dst, sizeof(dst),
+ *		  __builtin_preserve_access_index(&src->a.b.c));
+ *
+ * In this case Clang will emit field relocation recording necessary data to
+ * be able to find offset of embedded `a.b.c` field within `src` struct.
+ *
+ * [0] https://llvm.org/docs/LangRef.html#getelementptr-instruction
+ */
+struct bpf_core_relo {
+	__u32 insn_off;
+	__u32 type_id;
+	__u32 access_str_off;
+	enum bpf_core_relo_kind kind;
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index c79595aad55b..0d070461e2b8 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6439,3 +6439,9 @@ size_t bpf_core_essential_name_len(const char *name)
 	}
 	return n;
 }
+
+int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo,
+		   int relo_idx, void *insn)
+{
+	return -EOPNOTSUPP;
+}
diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
index 50f96ea4452a..45c9bb932132 100644
--- a/kernel/bpf/syscall.c
+++ b/kernel/bpf/syscall.c
@@ -2184,7 +2184,7 @@ static bool is_perfmon_prog_type(enum bpf_prog_type prog_type)
 }
 
 /* last field in 'union bpf_attr' used by this command */
-#define	BPF_PROG_LOAD_LAST_FIELD fd_array
+#define	BPF_PROG_LOAD_LAST_FIELD core_relo_rec_size
 
 static int bpf_prog_load(union bpf_attr *attr, bpfptr_t uattr)
 {
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 0763cca139a7..02ea3dd9d495 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -10251,6 +10251,78 @@ static int check_btf_line(struct bpf_verifier_env *env,
 	return err;
 }
 
+#define MIN_CORE_RELO_SIZE	sizeof(struct bpf_core_relo)
+#define MAX_CORE_RELO_SIZE	MAX_FUNCINFO_REC_SIZE
+
+static int check_core_relo(struct bpf_verifier_env *env,
+			   const union bpf_attr *attr,
+			   bpfptr_t uattr)
+{
+	u32 i, nr_core_relo, ncopy, expected_size, rec_size;
+	struct bpf_core_relo core_relo = {};
+	struct bpf_prog *prog = env->prog;
+	const struct btf *btf = prog->aux->btf;
+	struct bpf_core_ctx ctx = {
+		.log = &env->log,
+		.btf = btf,
+	};
+	bpfptr_t u_core_relo;
+	int err;
+
+	nr_core_relo = attr->core_relo_cnt;
+	if (!nr_core_relo)
+		return 0;
+	if (nr_core_relo > INT_MAX / sizeof(struct bpf_core_relo))
+		return -EINVAL;
+
+	rec_size = attr->core_relo_rec_size;
+	if (rec_size < MIN_CORE_RELO_SIZE ||
+	    rec_size > MAX_CORE_RELO_SIZE ||
+	    rec_size % sizeof(u32))
+		return -EINVAL;
+
+	u_core_relo = make_bpfptr(attr->core_relos, uattr.is_kernel);
+	expected_size = sizeof(struct bpf_core_relo);
+	ncopy = min_t(u32, expected_size, rec_size);
+
+	/* Unlike func_info and line_info, copy and apply each CO-RE
+	 * relocation record one at a time.
+	 */
+	for (i = 0; i < nr_core_relo; i++) {
+		/* future proofing when sizeof(bpf_core_relo) changes */
+		err = bpf_check_uarg_tail_zero(u_core_relo, expected_size, rec_size);
+		if (err) {
+			if (err == -E2BIG) {
+				verbose(env, "nonzero tailing record in core_relo");
+				if (copy_to_bpfptr_offset(uattr,
+							  offsetof(union bpf_attr, core_relo_rec_size),
+							  &expected_size, sizeof(expected_size)))
+					err = -EFAULT;
+			}
+			break;
+		}
+
+		if (copy_from_bpfptr(&core_relo, u_core_relo, ncopy)) {
+			err = -EFAULT;
+			break;
+		}
+
+		if (core_relo.insn_off % 8 || core_relo.insn_off / 8 >= prog->len) {
+			verbose(env, "Invalid core_relo[%u].insn_off:%u prog->len:%u\n",
+				i, core_relo.insn_off, prog->len);
+			err = -EINVAL;
+			break;
+		}
+
+		err = bpf_core_apply(&ctx, &core_relo, i,
+				     &prog->insnsi[core_relo.insn_off / 8]);
+		if (err)
+			break;
+		bpfptr_add(&u_core_relo, rec_size);
+	}
+	return err;
+}
+
 static int check_btf_info(struct bpf_verifier_env *env,
 			  const union bpf_attr *attr,
 			  bpfptr_t uattr)
@@ -10281,6 +10353,10 @@ static int check_btf_info(struct bpf_verifier_env *env,
 	if (err)
 		return err;
 
+	err = check_core_relo(env, attr, uattr);
+	if (err)
+		return err;
+
 	return 0;
 }
 
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 190e9e3c0693..d66e5b0ea7ba 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -1342,8 +1342,10 @@ union bpf_attr {
 			/* or valid module BTF object fd or 0 to attach to vmlinux */
 			__u32		attach_btf_obj_fd;
 		};
-		__u32		:32;		/* pad */
+		__u32		core_relo_cnt;	/* number of bpf_core_relo */
 		__aligned_u64	fd_array;	/* array of FDs */
+		__aligned_u64	core_relos;
+		__u32		core_relo_rec_size; /* sizeof(struct bpf_core_relo) */
 	};
 
 	struct { /* anonymous struct used by BPF_OBJ_* commands */
@@ -6368,4 +6370,59 @@ enum bpf_core_relo_kind {
 	BPF_CORE_ENUMVAL_VALUE = 11,         /* enum value integer value */
 };
 
+/*
+ * "struct bpf_core_relo" is used to pass relocation data form LLVM to libbpf
+ * and from libbpf to the kernel.
+ *
+ * CO-RE relocation captures the following data:
+ * - insn_off - instruction offset (in bytes) within a BPF program that needs
+ *   its insn->imm field to be relocated with actual field info;
+ * - type_id - BTF type ID of the "root" (containing) entity of a relocatable
+ *   type or field;
+ * - access_str_off - offset into corresponding .BTF string section. String
+ *   interpretation depends on specific relocation kind:
+ *     - for field-based relocations, string encodes an accessed field using
+ *       a sequence of field and array indices, separated by colon (:). It's
+ *       conceptually very close to LLVM's getelementptr ([0]) instruction's
+ *       arguments for identifying offset to a field.
+ *     - for type-based relocations, strings is expected to be just "0";
+ *     - for enum value-based relocations, string contains an index of enum
+ *       value within its enum type;
+ * - kind - one of enum bpf_core_relo_kind;
+ *
+ * Example:
+ *   struct sample {
+ *       int a;
+ *       struct {
+ *           int b[10];
+ *       };
+ *   };
+ *
+ *   struct sample *s = ...;
+ *   int *x = &s->a;     // encoded as "0:0" (a is field #0)
+ *   int *y = &s->b[5];  // encoded as "0:1:0:5" (anon struct is field #1,
+ *                       // b is field #0 inside anon struct, accessing elem #5)
+ *   int *z = &s[10]->b; // encoded as "10:1" (ptr is used as an array)
+ *
+ * type_id for all relocs in this example will capture BTF type id of
+ * `struct sample`.
+ *
+ * Such relocation is emitted when using __builtin_preserve_access_index()
+ * Clang built-in, passing expression that captures field address, e.g.:
+ *
+ * bpf_probe_read(&dst, sizeof(dst),
+ *		  __builtin_preserve_access_index(&src->a.b.c));
+ *
+ * In this case Clang will emit field relocation recording necessary data to
+ * be able to find offset of embedded `a.b.c` field within `src` struct.
+ *
+ * [0] https://llvm.org/docs/LangRef.html#getelementptr-instruction
+ */
+struct bpf_core_relo {
+	__u32 insn_off;
+	__u32 type_id;
+	__u32 access_str_off;
+	enum bpf_core_relo_kind kind;
+};
+
 #endif /* _UAPI__LINUX_BPF_H__ */
diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h
index 3d0b86e7f439..f410691cc4e5 100644
--- a/tools/lib/bpf/relo_core.h
+++ b/tools/lib/bpf/relo_core.h
@@ -6,59 +6,6 @@
 
 #include <linux/bpf.h>
 
-/* The minimum bpf_core_relo checked by the loader
- *
- * CO-RE relocation captures the following data:
- * - insn_off - instruction offset (in bytes) within a BPF program that needs
- *   its insn->imm field to be relocated with actual field info;
- * - type_id - BTF type ID of the "root" (containing) entity of a relocatable
- *   type or field;
- * - access_str_off - offset into corresponding .BTF string section. String
- *   interpretation depends on specific relocation kind:
- *     - for field-based relocations, string encodes an accessed field using
- *     a sequence of field and array indices, separated by colon (:). It's
- *     conceptually very close to LLVM's getelementptr ([0]) instruction's
- *     arguments for identifying offset to a field.
- *     - for type-based relocations, strings is expected to be just "0";
- *     - for enum value-based relocations, string contains an index of enum
- *     value within its enum type;
- *
- * Example to provide a better feel.
- *
- *   struct sample {
- *       int a;
- *       struct {
- *           int b[10];
- *       };
- *   };
- *
- *   struct sample *s = ...;
- *   int x = &s->a;     // encoded as "0:0" (a is field #0)
- *   int y = &s->b[5];  // encoded as "0:1:0:5" (anon struct is field #1,
- *                      // b is field #0 inside anon struct, accessing elem #5)
- *   int z = &s[10]->b; // encoded as "10:1" (ptr is used as an array)
- *
- * type_id for all relocs in this example  will capture BTF type id of
- * `struct sample`.
- *
- * Such relocation is emitted when using __builtin_preserve_access_index()
- * Clang built-in, passing expression that captures field address, e.g.:
- *
- * bpf_probe_read(&dst, sizeof(dst),
- *		  __builtin_preserve_access_index(&src->a.b.c));
- *
- * In this case Clang will emit field relocation recording necessary data to
- * be able to find offset of embedded `a.b.c` field within `src` struct.
- *
- *   [0] https://llvm.org/docs/LangRef.html#getelementptr-instruction
- */
-struct bpf_core_relo {
-	__u32   insn_off;
-	__u32   type_id;
-	__u32   access_str_off;
-	enum bpf_core_relo_kind kind;
-};
-
 struct bpf_core_cand {
 	const struct btf *btf;
 	const struct btf_type *t;
-- 
2.30.2



* [PATCH v4 bpf-next 06/16] bpf: Adjust BTF log size limit.
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
                   ` (4 preceding siblings ...)
  2021-11-24  6:01 ` [PATCH v4 bpf-next 05/16] bpf: Pass a set of bpf_core_relo-s to prog_load command Alexei Starovoitov
@ 2021-11-24  6:01 ` Alexei Starovoitov
  2021-11-24  6:02 ` [PATCH v4 bpf-next 07/16] bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn() Alexei Starovoitov
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:01 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Make the BTF log size limit the same as the verifier log size limit.
Otherwise tools that progressively increase the log size and use the same
log buffer for BTF loading and program loading will hit a hard-to-debug
EINVAL.
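
For reference, the bound this change aligns with is the one the verifier
applies to its own log in bpf_check(); paraphrased, not an exact quote of
kernel/bpf/verifier.c:

 /* log attributes have to be sane */
 if (log->len_total < 128 || log->len_total > UINT_MAX >> 2 ||
     !log->level || !log->ubuf)
     return -EINVAL;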

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 kernel/bpf/btf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 0d070461e2b8..dbf1f389b1d3 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -4472,7 +4472,7 @@ static struct btf *btf_parse(bpfptr_t btf_data, u32 btf_data_size,
 		log->len_total = log_size;
 
 		/* log attributes have to be sane */
-		if (log->len_total < 128 || log->len_total > UINT_MAX >> 8 ||
+		if (log->len_total < 128 || log->len_total > UINT_MAX >> 2 ||
 		    !log->level || !log->ubuf) {
 			err = -EINVAL;
 			goto errout;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 bpf-next 07/16] bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
                   ` (5 preceding siblings ...)
  2021-11-24  6:01 ` [PATCH v4 bpf-next 06/16] bpf: Adjust BTF log size limit Alexei Starovoitov
@ 2021-11-24  6:02 ` Alexei Starovoitov
  2021-11-30  1:03   ` Andrii Nakryiko
  2021-11-30  5:50   ` Alexei Starovoitov
  2021-11-24  6:02 ` [PATCH v4 bpf-next 08/16] libbpf: Use CO-RE in the kernel in light skeleton Alexei Starovoitov
                   ` (8 subsequent siblings)
  15 siblings, 2 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:02 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Given a BPF program's BTF root type name, perform the following steps:
. search in vmlinux candidate cache.
. if (present in cache and candidate list >= 1) return candidate list.
. do a linear search through kernel BTFs for possible candidates.
. regardless of number of candidates found populate vmlinux cache.
. if (candidate list >= 1) return candidate list.
. search in module candidate cache.
. if (present in cache) return candidate list (even if list is empty).
. do a linear search through BTFs of all kernel modules
  collecting candidates from all of them.
. regardless of number of candidates found populate module cache.
. return candidate list.
Then wire the result into bpf_core_apply_relo_insn().

When a BPF program tries to CO-RE relocate a type that doesn't exist in
either vmlinux BTF or module BTFs, these steps perform only two cache
lookups once the caches are populated.

Note that the cache doesn't prevent abuse by a program that has lots of
relocations that cannot be resolved. Hence the cond_resched().

CO-RE in the kernel requires CAP_BPF, since BTF loading requires it.
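
For illustration only (not part of the patch): candidates are keyed by BTF
kind plus the "essential" type name, i.e. the name with any "___flavor"
suffix stripped, so a program-local flavor such as:

  struct task_struct___local {
          int pid;
  } __attribute__((preserve_access_index));

is matched against kernel types named task_struct:
bpf_core_essential_name_len("task_struct___local") == strlen("task_struct"),
so bpf_core_find_cands() caches and returns "task_struct" candidates from
vmlinux BTF first and only scans module BTFs when vmlinux has none.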

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 kernel/bpf/btf.c          | 250 +++++++++++++++++++++++++++++++++++++-
 tools/lib/bpf/relo_core.h |   2 +
 2 files changed, 251 insertions(+), 1 deletion(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index dbf1f389b1d3..cf971b8a0769 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -25,6 +25,7 @@
 #include <linux/kobject.h>
 #include <linux/sysfs.h>
 #include <net/sock.h>
+#include "../tools/lib/bpf/relo_core.h"
 
 /* BTF (BPF Type Format) is the meta data format which describes
  * the data types of BPF program/map.  Hence, it basically focus
@@ -6169,6 +6170,8 @@ btf_module_read(struct file *file, struct kobject *kobj,
 	return len;
 }
 
+static void purge_cand_cache(struct btf *btf);
+
 static int btf_module_notify(struct notifier_block *nb, unsigned long op,
 			     void *module)
 {
@@ -6245,6 +6248,7 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op,
 			list_del(&btf_mod->list);
 			if (btf_mod->sysfs_attr)
 				sysfs_remove_bin_file(btf_kobj, btf_mod->sysfs_attr);
+			purge_cand_cache(btf_mod->btf);
 			btf_put(btf_mod->btf);
 			kfree(btf_mod->sysfs_attr);
 			kfree(btf_mod);
@@ -6440,8 +6444,252 @@ size_t bpf_core_essential_name_len(const char *name)
 	return n;
 }
 
+struct bpf_cand_cache {
+	const char *name;
+	u32 name_len;
+	u16 kind;
+	u16 cnt;
+	struct {
+		const struct btf *btf;
+		u32 id;
+	} cands[];
+};
+
+static void bpf_free_cands(struct bpf_cand_cache *cands)
+{
+	kfree(cands->name);
+	kfree(cands);
+}
+
+#define VMLINUX_CAND_CACHE_SIZE 31
+static struct bpf_cand_cache *vmlinux_cand_cache[VMLINUX_CAND_CACHE_SIZE];
+
+#define MODULE_CAND_CACHE_SIZE 31
+static struct bpf_cand_cache *module_cand_cache[MODULE_CAND_CACHE_SIZE];
+
+static DEFINE_MUTEX(cand_cache_mutex);
+
+static struct bpf_cand_cache *check_cand_cache(struct bpf_cand_cache *cands,
+					       struct bpf_cand_cache **cache,
+					       int cache_size)
+{
+	u32 hash = jhash_2words(cands->name_len,
+				(((u32) cands->name[0]) << 8) | cands->name[1], 0);
+	struct bpf_cand_cache *cc = cache[hash % cache_size];
+
+	if (cc && cc->name_len == cands->name_len &&
+	    !strncmp(cc->name, cands->name, cands->name_len))
+		return cc;
+	return NULL;
+}
+
+static void populate_cand_cache(struct bpf_cand_cache *cands,
+				struct bpf_cand_cache **cache,
+				int cache_size)
+{
+	u32 hash = jhash_2words(cands->name_len,
+				(((u32) cands->name[0]) << 8) | cands->name[1], 0);
+	struct bpf_cand_cache *cc = cache[hash % cache_size];
+
+	if (cc)
+		bpf_free_cands(cc);
+	cache[hash % cache_size] = cands;
+}
+
+static void __purge_cand_cache(struct btf *btf, struct bpf_cand_cache **cache,
+			       int cache_size)
+{
+	struct bpf_cand_cache *cc;
+	int i, j;
+
+	for (i = 0; i < cache_size; i++) {
+		cc = cache[i];
+		if (!cc)
+			continue;
+		for (j = 0; j < cc->cnt; j++)
+			if (cc->cands[j].btf == btf) {
+				bpf_free_cands(cc);
+				cache[i] = NULL;
+				break;
+			}
+	}
+
+}
+
+static void purge_cand_cache(struct btf *btf)
+{
+	mutex_lock(&cand_cache_mutex);
+	__purge_cand_cache(btf, vmlinux_cand_cache, VMLINUX_CAND_CACHE_SIZE);
+	__purge_cand_cache(btf, module_cand_cache, MODULE_CAND_CACHE_SIZE);
+	mutex_unlock(&cand_cache_mutex);
+}
+
+static struct bpf_cand_cache *
+bpf_core_add_cands(struct bpf_cand_cache *cands, const struct btf *targ_btf,
+		   int targ_start_id)
+{
+	struct bpf_cand_cache *new_cands;
+	const struct btf_type *t;
+	const char *targ_name;
+	size_t targ_essent_len;
+	int n, i;
+
+	n = btf_nr_types(targ_btf);
+	for (i = targ_start_id; i < n; i++) {
+		t = btf_type_by_id(targ_btf, i);
+		if (btf_kind(t) != cands->kind)
+			continue;
+
+		targ_name = btf_name_by_offset(targ_btf, t->name_off);
+		if (!targ_name)
+			continue;
+
+		cond_resched();
+
+		if (strncmp(cands->name, targ_name, cands->name_len) != 0)
+			continue;
+
+		targ_essent_len = bpf_core_essential_name_len(targ_name);
+		if (targ_essent_len != cands->name_len)
+			continue;
+
+		new_cands = krealloc(cands,
+				     offsetof(struct bpf_cand_cache, cands[cands->cnt + 1]),
+				     GFP_KERNEL);
+		if (!new_cands) {
+			bpf_free_cands(cands);
+			return ERR_PTR(-ENOMEM);
+		}
+
+		cands = new_cands;
+		cands->cands[cands->cnt].btf = targ_btf;
+		cands->cands[cands->cnt].id = i;
+		cands->cnt++;
+	}
+	return cands;
+}
+
+static struct bpf_cand_cache *
+bpf_core_find_cands(struct bpf_core_ctx *ctx, u32 local_type_id)
+{
+	const struct btf *local_btf = ctx->btf;
+	const struct btf_type *local_type;
+	const struct btf *main_btf;
+	size_t local_essent_len;
+	struct bpf_cand_cache *cands, *cc;
+	struct btf *mod_btf;
+	const char *name;
+	int id;
+
+	local_type = btf_type_by_id(local_btf, local_type_id);
+	if (!local_type)
+		return ERR_PTR(-EINVAL);
+
+	name = btf_name_by_offset(local_btf, local_type->name_off);
+	if (str_is_empty(name))
+		return ERR_PTR(-EINVAL);
+	local_essent_len = bpf_core_essential_name_len(name);
+
+	cands = kcalloc(1, sizeof(*cands), GFP_KERNEL);
+	if (!cands)
+		return ERR_PTR(-ENOMEM);
+	cands->name = kmemdup_nul(name, local_essent_len, GFP_KERNEL);
+	if (!cands->name) {
+		kfree(cands);
+		return ERR_PTR(-ENOMEM);
+	}
+	cands->kind = btf_kind(local_type);
+	cands->name_len = local_essent_len;
+
+	cc = check_cand_cache(cands, vmlinux_cand_cache, VMLINUX_CAND_CACHE_SIZE);
+	if (cc) {
+		if (cc->cnt) {
+			bpf_free_cands(cands);
+			return cc;
+		}
+		goto check_modules;
+	}
+
+	/* Attempt to find target candidates in vmlinux BTF first */
+	main_btf = bpf_get_btf_vmlinux();
+	cands = bpf_core_add_cands(cands, main_btf, 1);
+	if (IS_ERR(cands))
+		return cands;
+
+	/* populate cache even when cands->cnt == 0 */
+	populate_cand_cache(cands, vmlinux_cand_cache, VMLINUX_CAND_CACHE_SIZE);
+
+	/* if vmlinux BTF has any candidate, don't go for module BTFs */
+	if (cands->cnt)
+		return cands;
+
+check_modules:
+	cc = check_cand_cache(cands, module_cand_cache, MODULE_CAND_CACHE_SIZE);
+	if (cc) {
+		bpf_free_cands(cands);
+		/* if cache has it return it even if cc->cnt == 0 */
+		return cc;
+	}
+
+	/* If candidate is not found in vmlinux's BTF then search in module's BTFs */
+	spin_lock_bh(&btf_idr_lock);
+	idr_for_each_entry(&btf_idr, mod_btf, id) {
+		if (!btf_is_module(mod_btf))
+			continue;
+		/* linear search could be slow hence unlock/lock
+		 * the IDR to avoiding holding it for too long
+		 */
+		btf_get(mod_btf);
+		spin_unlock_bh(&btf_idr_lock);
+		cands = bpf_core_add_cands(cands, mod_btf, btf_nr_types(main_btf));
+		if (IS_ERR(cands)) {
+			btf_put(mod_btf);
+			return cands;
+		}
+		spin_lock_bh(&btf_idr_lock);
+		btf_put(mod_btf);
+	}
+	spin_unlock_bh(&btf_idr_lock);
+	/* populate cache even when cands->cnt == 0 */
+	populate_cand_cache(cands, module_cand_cache, MODULE_CAND_CACHE_SIZE);
+	return cands;
+}
+
 int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo,
 		   int relo_idx, void *insn)
 {
-	return -EOPNOTSUPP;
+	struct bpf_core_cand_list cands = {};
+	int err;
+
+	if (relo->kind != BPF_CORE_TYPE_ID_LOCAL) {
+		struct bpf_cand_cache *cc;
+		int i;
+
+		mutex_lock(&cand_cache_mutex);
+		cc = bpf_core_find_cands(ctx, relo->type_id);
+		if (IS_ERR(cc)) {
+			bpf_log(ctx->log, "target candidate search failed for %d\n",
+				relo->type_id);
+			return PTR_ERR(cc);
+		}
+		if (cc->cnt) {
+			cands.cands = kcalloc(cc->cnt, sizeof(*cands.cands), GFP_KERNEL);
+			if (!cands.cands)
+				return -ENOMEM;
+		}
+		for (i = 0; i < cc->cnt; i++) {
+			bpf_log(ctx->log,
+				"CO-RE relocating %s %s: found target candidate [%d]\n",
+				btf_kind_str[cc->kind], cc->name, cc->cands[i].id);
+			cands.cands[i].btf = cc->cands[i].btf;
+			cands.cands[i].id = cc->cands[i].id;
+		}
+		cands.len = cc->cnt;
+		mutex_unlock(&cand_cache_mutex);
+	}
+
+	err = bpf_core_apply_relo_insn((void *)ctx->log, insn, relo->insn_off / 8,
+				       relo, relo_idx, ctx->btf, &cands);
+	kfree(cands.cands);
+	return err;
 }
diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h
index f410691cc4e5..f7b0d698978c 100644
--- a/tools/lib/bpf/relo_core.h
+++ b/tools/lib/bpf/relo_core.h
@@ -8,8 +8,10 @@
 
 struct bpf_core_cand {
 	const struct btf *btf;
+#ifndef __KERNEL__
 	const struct btf_type *t;
 	const char *name;
+#endif
 	__u32 id;
 };
 
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 bpf-next 08/16] libbpf: Use CO-RE in the kernel in light skeleton.
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
                   ` (6 preceding siblings ...)
  2021-11-24  6:02 ` [PATCH v4 bpf-next 07/16] bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn() Alexei Starovoitov
@ 2021-11-24  6:02 ` Alexei Starovoitov
  2021-11-30  1:11   ` Andrii Nakryiko
  2021-11-24  6:02 ` [PATCH v4 bpf-next 09/16] libbpf: Support init of inner maps " Alexei Starovoitov
                   ` (7 subsequent siblings)
  15 siblings, 1 reply; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:02 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Without lskel the CO-RE relocations are processed by libbpf before any other
work is done. Instead, when lskel is needed, remember each relocation as a
RELO_CORE kind. Then, when the loader prog is generated for a given bpf
program, pass the CO-RE relos of that program to the gen loader via
bpf_gen__record_relo_core(). The gen loader remembers them as-is and later
passes them as-is into the kernel.

The normal libbpf flow is to process CO-RE early, before call relos are
handled. In the gen_loader case the CO-RE relos have to be added to the other
relos so that they are copied together when a static bpf function is appended
in different places to the main bpf progs. During the copy
append_subprog_relos() adjusts insn_idx for normal relos and for the
RELO_CORE kind too. When that is done each struct reloc_desc has correct
relos for its specific main prog.
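
Concretely, the generated loader ends up filling the new prog_load attributes
roughly like this (a sketch; see the gen_loader.c hunk below for the actual
code):

  attr.core_relo_rec_size = sizeof(struct bpf_core_relo);
  attr.core_relo_cnt = gen->core_relo_cnt;
  /* copy the recorded relos into the loader blob; insn_off in each record
   * is already the final insn index * 8 after append_subprog_relos()
   */
  core_relos = add_data(gen, gen->core_relos,
                        attr.core_relo_cnt * attr.core_relo_rec_size);
  /* make attr.core_relos point at that copy at load time */
  emit_rel_store(gen, attr_field(prog_load_attr, core_relos), core_relos);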

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/lib/bpf/bpf_gen_internal.h |   3 +
 tools/lib/bpf/gen_loader.c       |  41 +++++++++++-
 tools/lib/bpf/libbpf.c           | 108 ++++++++++++++++++++++---------
 3 files changed, 119 insertions(+), 33 deletions(-)

diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h
index 75ca9fb857b2..5cbc3e5d3e69 100644
--- a/tools/lib/bpf/bpf_gen_internal.h
+++ b/tools/lib/bpf/bpf_gen_internal.h
@@ -39,6 +39,8 @@ struct bpf_gen {
 	int error;
 	struct ksym_relo_desc *relos;
 	int relo_cnt;
+	struct bpf_core_relo *core_relos;
+	int core_relo_cnt;
 	char attach_target[128];
 	int attach_kind;
 	struct ksym_desc *ksyms;
@@ -61,5 +63,6 @@ void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx);
 void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *name, enum bpf_attach_type type);
 void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak,
 			    bool is_typeless, int kind, int insn_idx);
+void bpf_gen__record_relo_core(struct bpf_gen *gen, const struct bpf_core_relo *core_relo);
 
 #endif
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
index 7b73f97b1fa1..4e32fe1f68db 100644
--- a/tools/lib/bpf/gen_loader.c
+++ b/tools/lib/bpf/gen_loader.c
@@ -839,6 +839,22 @@ static void emit_relo_ksym_btf(struct bpf_gen *gen, struct ksym_relo_desc *relo,
 	emit_ksym_relo_log(gen, relo, kdesc->ref);
 }
 
+void bpf_gen__record_relo_core(struct bpf_gen *gen,
+			       const struct bpf_core_relo *core_relo)
+{
+	struct bpf_core_relo *relos;
+
+	relos = libbpf_reallocarray(gen->core_relos, gen->core_relo_cnt + 1, sizeof(*relos));
+	if (!relos) {
+		gen->error = -ENOMEM;
+		return;
+	}
+	gen->core_relos = relos;
+	relos += gen->core_relo_cnt;
+	memcpy(relos, core_relo, sizeof(*relos));
+	gen->core_relo_cnt++;
+}
+
 static void emit_relo(struct bpf_gen *gen, struct ksym_relo_desc *relo, int insns)
 {
 	int insn;
@@ -871,6 +887,15 @@ static void emit_relos(struct bpf_gen *gen, int insns)
 		emit_relo(gen, gen->relos + i, insns);
 }
 
+static void cleanup_core_relo(struct bpf_gen *gen)
+{
+	if (!gen->core_relo_cnt)
+		return;
+	free(gen->core_relos);
+	gen->core_relo_cnt = 0;
+	gen->core_relos = NULL;
+}
+
 static void cleanup_relos(struct bpf_gen *gen, int insns)
 {
 	int i, insn;
@@ -898,6 +923,7 @@ static void cleanup_relos(struct bpf_gen *gen, int insns)
 		gen->relo_cnt = 0;
 		gen->relos = NULL;
 	}
+	cleanup_core_relo(gen);
 }
 
 void bpf_gen__prog_load(struct bpf_gen *gen,
@@ -905,12 +931,13 @@ void bpf_gen__prog_load(struct bpf_gen *gen,
 			const char *license, struct bpf_insn *insns, size_t insn_cnt,
 			struct bpf_prog_load_opts *load_attr, int prog_idx)
 {
-	int attr_size = offsetofend(union bpf_attr, fd_array);
-	int prog_load_attr, license_off, insns_off, func_info, line_info;
+	int prog_load_attr, license_off, insns_off, func_info, line_info, core_relos;
+	int attr_size = offsetofend(union bpf_attr, core_relo_rec_size);
 	union bpf_attr attr;
 
 	memset(&attr, 0, attr_size);
-	pr_debug("gen: prog_load: type %d insns_cnt %zd\n", prog_type, insn_cnt);
+	pr_debug("gen: prog_load: type %d insns_cnt %zd progi_idx %d\n",
+		 prog_type, insn_cnt, prog_idx);
 	/* add license string to blob of bytes */
 	license_off = add_data(gen, license, strlen(license) + 1);
 	/* add insns to blob of bytes */
@@ -934,6 +961,11 @@ void bpf_gen__prog_load(struct bpf_gen *gen,
 	line_info = add_data(gen, load_attr->line_info,
 			     attr.line_info_cnt * attr.line_info_rec_size);
 
+	attr.core_relo_rec_size = sizeof(struct bpf_core_relo);
+	attr.core_relo_cnt = gen->core_relo_cnt;
+	core_relos = add_data(gen, gen->core_relos,
+			     attr.core_relo_cnt * attr.core_relo_rec_size);
+
 	memcpy(attr.prog_name, prog_name,
 	       min((unsigned)strlen(prog_name), BPF_OBJ_NAME_LEN - 1));
 	prog_load_attr = add_data(gen, &attr, attr_size);
@@ -950,6 +982,9 @@ void bpf_gen__prog_load(struct bpf_gen *gen,
 	/* populate union bpf_attr with a pointer to line_info */
 	emit_rel_store(gen, attr_field(prog_load_attr, line_info), line_info);
 
+	/* populate union bpf_attr with a pointer to core_relos */
+	emit_rel_store(gen, attr_field(prog_load_attr, core_relos), core_relos);
+
 	/* populate union bpf_attr fd_array with a pointer to data where map_fds are saved */
 	emit_rel_store(gen, attr_field(prog_load_attr, fd_array), gen->fd_array);
 
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index fc9a179ea412..dbb393768339 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -230,13 +230,19 @@ enum reloc_type {
 	RELO_EXTERN_VAR,
 	RELO_EXTERN_FUNC,
 	RELO_SUBPROG_ADDR,
+	RELO_CORE,
 };
 
 struct reloc_desc {
 	enum reloc_type type;
 	int insn_idx;
-	int map_idx;
-	int sym_off;
+	union {
+		const struct bpf_core_relo *core_relo; /* used when type == RELO_CORE */
+		struct {
+			int map_idx;
+			int sym_off;
+		};
+	};
 };
 
 struct bpf_sec_def;
@@ -5417,6 +5423,24 @@ static void *u32_as_hash_key(__u32 x)
 	return (void *)(uintptr_t)x;
 }
 
+static int record_relo_core(struct bpf_program *prog,
+			    const struct bpf_core_relo *core_relo, int insn_idx)
+{
+	struct reloc_desc *relos, *relo;
+
+	relos = libbpf_reallocarray(prog->reloc_desc,
+				    prog->nr_reloc + 1, sizeof(*relos));
+	if (!relos)
+		return -ENOMEM;
+	relo = &relos[prog->nr_reloc];
+	relo->type = RELO_CORE;
+	relo->insn_idx = insn_idx;
+	relo->core_relo = core_relo;
+	prog->reloc_desc = relos;
+	prog->nr_reloc++;
+	return 0;
+}
+
 static int bpf_core_apply_relo(struct bpf_program *prog,
 			       const struct bpf_core_relo *relo,
 			       int relo_idx,
@@ -5453,10 +5477,12 @@ static int bpf_core_apply_relo(struct bpf_program *prog,
 		return -EINVAL;
 
 	if (prog->obj->gen_loader) {
-		pr_warn("// TODO core_relo: prog %td insn[%d] %s kind %d\n",
+		const char *spec_str = btf__name_by_offset(local_btf, relo->access_str_off);
+
+		pr_debug("record_relo_core: prog %td insn[%d] %s %s %s final insn_idx %d\n",
 			prog - prog->obj->programs, relo->insn_off / 8,
-			local_name, relo->kind);
-		return -ENOTSUP;
+			btf_kind_str(local_type), local_name, spec_str, insn_idx);
+		return record_relo_core(prog, relo, insn_idx);
 	}
 
 	if (relo->kind != BPF_CORE_TYPE_ID_LOCAL &&
@@ -5653,6 +5679,9 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
 		case RELO_CALL:
 			/* handled already */
 			break;
+		case RELO_CORE:
+			/* will be handled by bpf_program_record_relos() */
+			break;
 		default:
 			pr_warn("prog '%s': relo #%d: bad relo type %d\n",
 				prog->name, i, relo->type);
@@ -6090,6 +6119,35 @@ bpf_object__free_relocs(struct bpf_object *obj)
 	}
 }
 
+static int cmp_relocs(const void *_a, const void *_b)
+{
+	const struct reloc_desc *a = _a;
+	const struct reloc_desc *b = _b;
+
+	if (a->insn_idx != b->insn_idx)
+		return a->insn_idx < b->insn_idx ? -1 : 1;
+
+	/* no two relocations should have the same insn_idx, but ... */
+	if (a->type != b->type)
+		return a->type < b->type ? -1 : 1;
+
+	return 0;
+}
+
+static void bpf_object__sort_relos(struct bpf_object *obj)
+{
+	int i;
+
+	for (i = 0; i < obj->nr_programs; i++) {
+		struct bpf_program *p = &obj->programs[i];
+
+		if (!p->nr_reloc)
+			continue;
+
+		qsort(p->reloc_desc, p->nr_reloc, sizeof(*p->reloc_desc), cmp_relocs);
+	}
+}
+
 static int
 bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
 {
@@ -6104,6 +6162,8 @@ bpf_object__relocate(struct bpf_object *obj, const char *targ_btf_path)
 				err);
 			return err;
 		}
+		if (obj->gen_loader)
+			bpf_object__sort_relos(obj);
 	}
 
 	/* Before relocating calls pre-process relocations and mark
@@ -6281,21 +6341,6 @@ static int bpf_object__collect_map_relos(struct bpf_object *obj,
 	return 0;
 }
 
-static int cmp_relocs(const void *_a, const void *_b)
-{
-	const struct reloc_desc *a = _a;
-	const struct reloc_desc *b = _b;
-
-	if (a->insn_idx != b->insn_idx)
-		return a->insn_idx < b->insn_idx ? -1 : 1;
-
-	/* no two relocations should have the same insn_idx, but ... */
-	if (a->type != b->type)
-		return a->type < b->type ? -1 : 1;
-
-	return 0;
-}
-
 static int bpf_object__collect_relos(struct bpf_object *obj)
 {
 	int i, err;
@@ -6328,14 +6373,7 @@ static int bpf_object__collect_relos(struct bpf_object *obj)
 			return err;
 	}
 
-	for (i = 0; i < obj->nr_programs; i++) {
-		struct bpf_program *p = &obj->programs[i];
-
-		if (!p->nr_reloc)
-			continue;
-
-		qsort(p->reloc_desc, p->nr_reloc, sizeof(*p->reloc_desc), cmp_relocs);
-	}
+	bpf_object__sort_relos(obj);
 	return 0;
 }
 
@@ -6578,7 +6616,7 @@ static int bpf_object_load_prog_instance(struct bpf_object *obj, struct bpf_prog
 	return ret;
 }
 
-static int bpf_program__record_externs(struct bpf_program *prog)
+static int bpf_program_record_relos(struct bpf_program *prog)
 {
 	struct bpf_object *obj = prog->obj;
 	int i;
@@ -6600,6 +6638,16 @@ static int bpf_program__record_externs(struct bpf_program *prog)
 					       ext->is_weak, false, BTF_KIND_FUNC,
 					       relo->insn_idx);
 			break;
+		case RELO_CORE: {
+			struct bpf_core_relo cr = {
+				.insn_off = relo->insn_idx * 8,
+				.type_id = relo->core_relo->type_id,
+				.access_str_off = relo->core_relo->access_str_off,
+				.kind = relo->core_relo->kind,
+			};
+			bpf_gen__record_relo_core(obj->gen_loader, &cr);
+			break;
+		}
 		default:
 			continue;
 		}
@@ -6639,7 +6687,7 @@ static int bpf_object_load_prog(struct bpf_object *obj, struct bpf_program *prog
 				prog->name, prog->instances.nr);
 		}
 		if (obj->gen_loader)
-			bpf_program__record_externs(prog);
+			bpf_program_record_relos(prog);
 		err = bpf_object_load_prog_instance(obj, prog,
 						    prog->insns, prog->insns_cnt,
 						    license, kern_ver, &fd);
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 bpf-next 09/16] libbpf: Support init of inner maps in light skeleton.
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
                   ` (7 preceding siblings ...)
  2021-11-24  6:02 ` [PATCH v4 bpf-next 08/16] libbpf: Use CO-RE in the kernel in light skeleton Alexei Starovoitov
@ 2021-11-24  6:02 ` Alexei Starovoitov
  2021-11-24  6:02 ` [PATCH v4 bpf-next 10/16] libbpf: Clean gen_loader's attach kind Alexei Starovoitov
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:02 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add ability to initialize inner maps in light skeleton.
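
For illustration, the kind of declarative map-in-map initialization this
enables in a light skeleton looks roughly like this (a sketch; compare the
map_ptr_kern.c changes later in the series):

  struct inner_map {
          __uint(type, BPF_MAP_TYPE_ARRAY);
          __uint(max_entries, 1);
          __type(key, __u32);
          __type(value, __u32);
  } inner_map1 SEC(".maps");

  struct {
          __uint(type, BPF_MAP_TYPE_ARRAY_OF_MAPS);
          __uint(max_entries, 2);
          __type(key, __u32);
          __type(value, __u32);
          __array(values, struct inner_map);
  } outer_map SEC(".maps") = {
          /* slot 0 is pre-populated; the generated loader now issues the
           * equivalent of bpf_map_update_elem(outer_map, &slot, &inner_fd)
           * via bpf_gen__populate_outer_map() instead of failing with ENOTSUP
           */
          .values = { &inner_map1 },
  };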

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/lib/bpf/bpf_gen_internal.h |  1 +
 tools/lib/bpf/gen_loader.c       | 27 +++++++++++++++++++++++++++
 tools/lib/bpf/libbpf.c           |  6 +++---
 3 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/tools/lib/bpf/bpf_gen_internal.h b/tools/lib/bpf/bpf_gen_internal.h
index 5cbc3e5d3e69..6d8447756509 100644
--- a/tools/lib/bpf/bpf_gen_internal.h
+++ b/tools/lib/bpf/bpf_gen_internal.h
@@ -64,5 +64,6 @@ void bpf_gen__record_attach_target(struct bpf_gen *gen, const char *name, enum b
 void bpf_gen__record_extern(struct bpf_gen *gen, const char *name, bool is_weak,
 			    bool is_typeless, int kind, int insn_idx);
 void bpf_gen__record_relo_core(struct bpf_gen *gen, const struct bpf_core_relo *core_relo);
+void bpf_gen__populate_outer_map(struct bpf_gen *gen, int outer_map_idx, int key, int inner_map_idx);
 
 #endif
diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
index 4e32fe1f68db..9066d1ae3461 100644
--- a/tools/lib/bpf/gen_loader.c
+++ b/tools/lib/bpf/gen_loader.c
@@ -1063,6 +1063,33 @@ void bpf_gen__map_update_elem(struct bpf_gen *gen, int map_idx, void *pvalue,
 	emit_check_err(gen);
 }
 
+void bpf_gen__populate_outer_map(struct bpf_gen *gen, int outer_map_idx, int slot,
+				 int inner_map_idx)
+{
+	int attr_size = offsetofend(union bpf_attr, flags);
+	int map_update_attr, key;
+	union bpf_attr attr;
+
+	memset(&attr, 0, attr_size);
+	pr_debug("gen: populate_outer_map: outer %d key %d inner %d\n",
+		 outer_map_idx, slot, inner_map_idx);
+
+	key = add_data(gen, &slot, sizeof(slot));
+
+	map_update_attr = add_data(gen, &attr, attr_size);
+	move_blob2blob(gen, attr_field(map_update_attr, map_fd), 4,
+		       blob_fd_array_off(gen, outer_map_idx));
+	emit_rel_store(gen, attr_field(map_update_attr, key), key);
+	emit_rel_store(gen, attr_field(map_update_attr, value),
+		       blob_fd_array_off(gen, inner_map_idx));
+
+	/* emit MAP_UPDATE_ELEM command */
+	emit_sys_bpf(gen, BPF_MAP_UPDATE_ELEM, map_update_attr, attr_size);
+	debug_ret(gen, "populate_outer_map outer %d key %d inner %d",
+		  outer_map_idx, slot, inner_map_idx);
+	emit_check_err(gen);
+}
+
 void bpf_gen__map_freeze(struct bpf_gen *gen, int map_idx)
 {
 	int attr_size = offsetofend(union bpf_attr, map_fd);
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index dbb393768339..af413273c0fb 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -4977,9 +4977,9 @@ static int init_map_slots(struct bpf_object *obj, struct bpf_map *map)
 		targ_map = map->init_slots[i];
 		fd = bpf_map__fd(targ_map);
 		if (obj->gen_loader) {
-			pr_warn("// TODO map_update_elem: idx %td key %d value==map_idx %td\n",
-				map - obj->maps, i, targ_map - obj->maps);
-			return -ENOTSUP;
+			bpf_gen__populate_outer_map(obj->gen_loader,
+						    map - obj->maps, i,
+						    targ_map - obj->maps);
 		} else {
 			err = bpf_map_update_elem(map->fd, &i, &fd, 0);
 		}
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 bpf-next 10/16] libbpf: Clean gen_loader's attach kind.
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
                   ` (8 preceding siblings ...)
  2021-11-24  6:02 ` [PATCH v4 bpf-next 09/16] libbpf: Support init of inner maps " Alexei Starovoitov
@ 2021-11-24  6:02 ` Alexei Starovoitov
  2021-11-30  1:11   ` Andrii Nakryiko
  2021-11-24  6:02 ` [PATCH v4 bpf-next 11/16] selftests/bpf: Add lskel version of kfunc test Alexei Starovoitov
                   ` (5 subsequent siblings)
  15 siblings, 1 reply; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:02 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

The gen_loader has to clear attach_kind, otherwise programs without
attach_btf_id will fail to load if they follow programs with attach_btf_id.
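
A minimal illustration of the failing sequence (a sketch; the core_kern.c
test added later in the series has this shape):

  SEC("fentry/eth_type_trans")  /* sets gen->attach_kind + attach target */
  int BPF_PROG(first, struct sk_buff *skb, struct net_device *dev)
  {
          return 0;
  }

  SEC("tc")  /* no attach_btf_id; a stale attach_kind from the previous
              * program made the generated loader fail this load
              */
  int second(struct __sk_buff *ctx)
  {
          return 0;
  }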

Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.")
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/lib/bpf/gen_loader.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
index 9066d1ae3461..3e9cc2312f0a 100644
--- a/tools/lib/bpf/gen_loader.c
+++ b/tools/lib/bpf/gen_loader.c
@@ -1015,9 +1015,11 @@ void bpf_gen__prog_load(struct bpf_gen *gen,
 	debug_ret(gen, "prog_load %s insn_cnt %d", attr.prog_name, attr.insn_cnt);
 	/* successful or not, close btf module FDs used in extern ksyms and attach_btf_obj_fd */
 	cleanup_relos(gen, insns_off);
-	if (gen->attach_kind)
+	if (gen->attach_kind) {
 		emit_sys_close_blob(gen,
 				    attr_field(prog_load_attr, attach_btf_obj_fd));
+		gen->attach_kind = 0;
+	}
 	emit_check_err(gen);
 	/* remember prog_fd in the stack, if successful */
 	emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7,
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 bpf-next 11/16] selftests/bpf: Add lskel version of kfunc test.
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
                   ` (9 preceding siblings ...)
  2021-11-24  6:02 ` [PATCH v4 bpf-next 10/16] libbpf: Clean gen_loader's attach kind Alexei Starovoitov
@ 2021-11-24  6:02 ` Alexei Starovoitov
  2021-11-24  6:02 ` [PATCH v4 bpf-next 12/16] selftests/bpf: Improve inner_map test coverage Alexei Starovoitov
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:02 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add light skeleton version of kfunc_call_test_subprog test.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/Makefile          |  2 +-
 .../selftests/bpf/prog_tests/kfunc_call.c     | 24 +++++++++++++++++++
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 35684d61aaeb..0c9f1837ece2 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -327,7 +327,7 @@ LINKED_SKELS := test_static_linked.skel.h linked_funcs.skel.h		\
 LSKELS := kfunc_call_test.c fentry_test.c fexit_test.c fexit_sleep.c \
 	test_ringbuf.c atomics.c trace_printk.c trace_vprintk.c
 # Generate both light skeleton and libbpf skeleton for these
-LSKELS_EXTRA := test_ksyms_module.c test_ksyms_weak.c
+LSKELS_EXTRA := test_ksyms_module.c test_ksyms_weak.c kfunc_call_test_subprog.c
 SKEL_BLACKLIST += $$(LSKELS)
 
 test_static_linked.skel.h-deps := test_static_linked1.o test_static_linked2.o
diff --git a/tools/testing/selftests/bpf/prog_tests/kfunc_call.c b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
index 5c9c0176991b..7d7445ccc141 100644
--- a/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
+++ b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
@@ -4,6 +4,7 @@
 #include <network_helpers.h>
 #include "kfunc_call_test.lskel.h"
 #include "kfunc_call_test_subprog.skel.h"
+#include "kfunc_call_test_subprog.lskel.h"
 
 static void test_main(void)
 {
@@ -49,6 +50,26 @@ static void test_subprog(void)
 	kfunc_call_test_subprog__destroy(skel);
 }
 
+static void test_subprog_lskel(void)
+{
+	struct kfunc_call_test_subprog_lskel *skel;
+	int prog_fd, retval, err;
+
+	skel = kfunc_call_test_subprog_lskel__open_and_load();
+	if (!ASSERT_OK_PTR(skel, "skel"))
+		return;
+
+	prog_fd = skel->progs.kfunc_call_test1.prog_fd;
+	err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),
+				NULL, NULL, (__u32 *)&retval, NULL);
+	ASSERT_OK(err, "bpf_prog_test_run(test1)");
+	ASSERT_EQ(retval, 10, "test1-retval");
+	ASSERT_NEQ(skel->data->active_res, -1, "active_res");
+	ASSERT_EQ(skel->data->sk_state_res, BPF_TCP_CLOSE, "sk_state_res");
+
+	kfunc_call_test_subprog_lskel__destroy(skel);
+}
+
 void test_kfunc_call(void)
 {
 	if (test__start_subtest("main"))
@@ -56,4 +77,7 @@ void test_kfunc_call(void)
 
 	if (test__start_subtest("subprog"))
 		test_subprog();
+
+	if (test__start_subtest("subprog_lskel"))
+		test_subprog_lskel();
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 bpf-next 12/16] selftests/bpf: Improve inner_map test coverage.
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
                   ` (10 preceding siblings ...)
  2021-11-24  6:02 ` [PATCH v4 bpf-next 11/16] selftests/bpf: Add lskel version of kfunc test Alexei Starovoitov
@ 2021-11-24  6:02 ` Alexei Starovoitov
  2021-11-24  6:02 ` [PATCH v4 bpf-next 13/16] selftests/bpf: Convert map_ptr_kern test to use light skeleton Alexei Starovoitov
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:02 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Check that hash and array inner maps are properly initialized.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/progs/map_ptr_kern.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/map_ptr_kern.c b/tools/testing/selftests/bpf/progs/map_ptr_kern.c
index b1b711d9b214..b64df94ec476 100644
--- a/tools/testing/selftests/bpf/progs/map_ptr_kern.c
+++ b/tools/testing/selftests/bpf/progs/map_ptr_kern.c
@@ -334,9 +334,11 @@ static inline int check_lpm_trie(void)
 	return 1;
 }
 
+#define INNER_MAX_ENTRIES 1234
+
 struct inner_map {
 	__uint(type, BPF_MAP_TYPE_ARRAY);
-	__uint(max_entries, 1);
+	__uint(max_entries, INNER_MAX_ENTRIES);
 	__type(key, __u32);
 	__type(value, __u32);
 } inner_map SEC(".maps");
@@ -348,7 +350,7 @@ struct {
 	__type(value, __u32);
 	__array(values, struct {
 		__uint(type, BPF_MAP_TYPE_ARRAY);
-		__uint(max_entries, 1);
+		__uint(max_entries, INNER_MAX_ENTRIES);
 		__type(key, __u32);
 		__type(value, __u32);
 	});
@@ -360,8 +362,13 @@ static inline int check_array_of_maps(void)
 {
 	struct bpf_array *array_of_maps = (struct bpf_array *)&m_array_of_maps;
 	struct bpf_map *map = (struct bpf_map *)&m_array_of_maps;
+	struct bpf_array *inner_map;
+	int key = 0;
 
 	VERIFY(check_default(&array_of_maps->map, map));
+	inner_map = bpf_map_lookup_elem(array_of_maps, &key);
+	VERIFY(inner_map != 0);
+	VERIFY(inner_map->map.max_entries == INNER_MAX_ENTRIES);
 
 	return 1;
 }
@@ -382,8 +389,13 @@ static inline int check_hash_of_maps(void)
 {
 	struct bpf_htab *hash_of_maps = (struct bpf_htab *)&m_hash_of_maps;
 	struct bpf_map *map = (struct bpf_map *)&m_hash_of_maps;
+	struct bpf_htab *inner_map;
+	int key = 2;
 
 	VERIFY(check_default(&hash_of_maps->map, map));
+	inner_map = bpf_map_lookup_elem(hash_of_maps, &key);
+	VERIFY(inner_map != 0);
+	VERIFY(inner_map->map.max_entries == INNER_MAX_ENTRIES);
 
 	return 1;
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 bpf-next 13/16] selftests/bpf: Convert map_ptr_kern test to use light skeleton.
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
                   ` (11 preceding siblings ...)
  2021-11-24  6:02 ` [PATCH v4 bpf-next 12/16] selftests/bpf: Improve inner_map test coverage Alexei Starovoitov
@ 2021-11-24  6:02 ` Alexei Starovoitov
  2021-11-24  6:02 ` [PATCH v4 bpf-next 14/16] selftests/bpf: Additional test for CO-RE in the kernel Alexei Starovoitov
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:02 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

To further exercise CO-RE in the kernel, convert the map_ptr_kern
test to use a light skeleton.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/Makefile             |  3 ++-
 tools/testing/selftests/bpf/prog_tests/map_ptr.c | 16 +++++++---------
 2 files changed, 9 insertions(+), 10 deletions(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 0c9f1837ece2..4fd040f5944b 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -325,7 +325,8 @@ LINKED_SKELS := test_static_linked.skel.h linked_funcs.skel.h		\
 		linked_vars.skel.h linked_maps.skel.h
 
 LSKELS := kfunc_call_test.c fentry_test.c fexit_test.c fexit_sleep.c \
-	test_ringbuf.c atomics.c trace_printk.c trace_vprintk.c
+	test_ringbuf.c atomics.c trace_printk.c trace_vprintk.c \
+	map_ptr_kern.c
 # Generate both light skeleton and libbpf skeleton for these
 LSKELS_EXTRA := test_ksyms_module.c test_ksyms_weak.c kfunc_call_test_subprog.c
 SKEL_BLACKLIST += $$(LSKELS)
diff --git a/tools/testing/selftests/bpf/prog_tests/map_ptr.c b/tools/testing/selftests/bpf/prog_tests/map_ptr.c
index 4972f92205c7..273725504f11 100644
--- a/tools/testing/selftests/bpf/prog_tests/map_ptr.c
+++ b/tools/testing/selftests/bpf/prog_tests/map_ptr.c
@@ -4,31 +4,29 @@
 #include <test_progs.h>
 #include <network_helpers.h>
 
-#include "map_ptr_kern.skel.h"
+#include "map_ptr_kern.lskel.h"
 
 void test_map_ptr(void)
 {
-	struct map_ptr_kern *skel;
+	struct map_ptr_kern_lskel *skel;
 	__u32 duration = 0, retval;
 	char buf[128];
 	int err;
 	int page_size = getpagesize();
 
-	skel = map_ptr_kern__open();
+	skel = map_ptr_kern_lskel__open();
 	if (!ASSERT_OK_PTR(skel, "skel_open"))
 		return;
 
-	err = bpf_map__set_max_entries(skel->maps.m_ringbuf, page_size);
-	if (!ASSERT_OK(err, "bpf_map__set_max_entries"))
-		goto cleanup;
+	skel->maps.m_ringbuf.max_entries = page_size;
 
-	err = map_ptr_kern__load(skel);
+	err = map_ptr_kern_lskel__load(skel);
 	if (!ASSERT_OK(err, "skel_load"))
 		goto cleanup;
 
 	skel->bss->page_size = page_size;
 
-	err = bpf_prog_test_run(bpf_program__fd(skel->progs.cg_skb), 1, &pkt_v4,
+	err = bpf_prog_test_run(skel->progs.cg_skb.prog_fd, 1, &pkt_v4,
 				sizeof(pkt_v4), buf, NULL, &retval, NULL);
 
 	if (CHECK(err, "test_run", "err=%d errno=%d\n", err, errno))
@@ -39,5 +37,5 @@ void test_map_ptr(void)
 		goto cleanup;
 
 cleanup:
-	map_ptr_kern__destroy(skel);
+	map_ptr_kern_lskel__destroy(skel);
 }
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 bpf-next 14/16] selftests/bpf: Additional test for CO-RE in the kernel.
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
                   ` (12 preceding siblings ...)
  2021-11-24  6:02 ` [PATCH v4 bpf-next 13/16] selftests/bpf: Convert map_ptr_kern test to use light skeleton Alexei Starovoitov
@ 2021-11-24  6:02 ` Alexei Starovoitov
  2021-11-30  1:13   ` Andrii Nakryiko
  2021-11-24  6:02 ` [PATCH v4 bpf-next 15/16] selftests/bpf: Revert CO-RE removal in test_ksyms_weak Alexei Starovoitov
  2021-11-24  6:02 ` [PATCH v4 bpf-next 16/16] selftests/bpf: Add CO-RE relocations to verifier scale test Alexei Starovoitov
  15 siblings, 1 reply; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:02 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add a test where the randmap() function is appended to three different bpf
programs. That exercises the struct bpf_core_relo replication logic and
offset adjustment.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/Makefile          |  2 +-
 .../selftests/bpf/prog_tests/core_kern.c      | 14 +++
 tools/testing/selftests/bpf/progs/core_kern.c | 87 +++++++++++++++++++
 3 files changed, 102 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/core_kern.c
 create mode 100644 tools/testing/selftests/bpf/progs/core_kern.c

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 4fd040f5944b..139d7e5e0a5f 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -326,7 +326,7 @@ LINKED_SKELS := test_static_linked.skel.h linked_funcs.skel.h		\
 
 LSKELS := kfunc_call_test.c fentry_test.c fexit_test.c fexit_sleep.c \
 	test_ringbuf.c atomics.c trace_printk.c trace_vprintk.c \
-	map_ptr_kern.c
+	map_ptr_kern.c core_kern.c
 # Generate both light skeleton and libbpf skeleton for these
 LSKELS_EXTRA := test_ksyms_module.c test_ksyms_weak.c kfunc_call_test_subprog.c
 SKEL_BLACKLIST += $$(LSKELS)
diff --git a/tools/testing/selftests/bpf/prog_tests/core_kern.c b/tools/testing/selftests/bpf/prog_tests/core_kern.c
new file mode 100644
index 000000000000..561c5185d886
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/core_kern.c
@@ -0,0 +1,14 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Facebook */
+
+#include "test_progs.h"
+#include "core_kern.lskel.h"
+
+void test_core_kern_lskel(void)
+{
+	struct core_kern_lskel *skel;
+
+	skel = core_kern_lskel__open_and_load();
+	ASSERT_OK_PTR(skel, "open_and_load");
+	core_kern_lskel__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/core_kern.c b/tools/testing/selftests/bpf/progs/core_kern.c
new file mode 100644
index 000000000000..2a40eec581b1
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/core_kern.c
@@ -0,0 +1,87 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Facebook */
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_core_read.h>
+
+#define ATTR __always_inline
+#include "test_jhash.h"
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__type(key, u32);
+	__type(value, u32);
+	__uint(max_entries, 256);
+} array1 SEC(".maps");
+
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__type(key, u32);
+	__type(value, u32);
+	__uint(max_entries, 256);
+} array2 SEC(".maps");
+
+static __noinline int randmap(int v, const struct net_device *dev)
+{
+	struct bpf_map *map = (struct bpf_map *)&array1;
+	int key = bpf_get_prandom_u32() & 0xff;
+	int *val;
+
+	if (bpf_get_prandom_u32() & 1)
+		map = (struct bpf_map *)&array2;
+
+	val = bpf_map_lookup_elem(map, &key);
+	if (val)
+		*val = bpf_get_prandom_u32() + v + dev->mtu;
+
+	return 0;
+}
+
+SEC("tp_btf/xdp_devmap_xmit")
+int BPF_PROG(tp_xdp_devmap_xmit_multi, const struct net_device
+	     *from_dev, const struct net_device *to_dev, int sent, int drops,
+	     int err)
+{
+	return randmap(from_dev->ifindex, from_dev);
+}
+
+SEC("fentry/eth_type_trans")
+int BPF_PROG(fentry_eth_type_trans, struct sk_buff *skb,
+	     struct net_device *dev, unsigned short protocol)
+{
+	return randmap(dev->ifindex + skb->len, dev);
+}
+
+SEC("fexit/eth_type_trans")
+int BPF_PROG(fexit_eth_type_trans, struct sk_buff *skb,
+	     struct net_device *dev, unsigned short protocol)
+{
+	return randmap(dev->ifindex + skb->len, dev);
+}
+
+SEC("tc")
+int balancer_ingress(struct __sk_buff *ctx)
+{
+	void *data_end = (void *)(long)ctx->data_end;
+	void *data = (void *)(long)ctx->data;
+	void *ptr;
+	int ret = 0, nh_off, i = 0;
+
+	nh_off = 14;
+
+	/* pragma unroll doesn't work on large loops */
+
+#define C do { \
+	ptr = data + i; \
+	if (ptr + nh_off > data_end) \
+		break; \
+	ctx->tc_index = jhash(ptr, nh_off, ctx->cb[0] + i++); \
+	} while (0);
+#define C30 C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;C;
+	C30;C30;C30; /* 90 calls */
+	return 0;
+}
+
+char LICENSE[] SEC("license") = "GPL";
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 bpf-next 15/16] selftests/bpf: Revert CO-RE removal in test_ksyms_weak.
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
                   ` (13 preceding siblings ...)
  2021-11-24  6:02 ` [PATCH v4 bpf-next 14/16] selftests/bpf: Additional test for CO-RE in the kernel Alexei Starovoitov
@ 2021-11-24  6:02 ` Alexei Starovoitov
  2021-11-24  6:02 ` [PATCH v4 bpf-next 16/16] selftests/bpf: Add CO-RE relocations to verifier scale test Alexei Starovoitov
  15 siblings, 0 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:02 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Commit 087cba799ced ("selftests/bpf: Add weak/typeless ksym test for light skeleton")
added test_ksyms_weak to light skeleton testing, but removed CO-RE access.
Revert that part of the commit, since the light skeleton can now use CO-RE in
the kernel.

Acked-by: Andrii Nakryiko <andrii@kernel.org>
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/progs/test_ksyms_weak.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/progs/test_ksyms_weak.c b/tools/testing/selftests/bpf/progs/test_ksyms_weak.c
index 8eadbd4caf7a..5f8379aadb29 100644
--- a/tools/testing/selftests/bpf/progs/test_ksyms_weak.c
+++ b/tools/testing/selftests/bpf/progs/test_ksyms_weak.c
@@ -38,7 +38,7 @@ int pass_handler(const void *ctx)
 	/* tests existing symbols. */
 	rq = (struct rq *)bpf_per_cpu_ptr(&runqueues, 0);
 	if (rq)
-		out__existing_typed = 0;
+		out__existing_typed = rq->cpu;
 	out__existing_typeless = (__u64)&bpf_prog_active;
 
 	/* tests non-existent symbols. */
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* [PATCH v4 bpf-next 16/16] selftests/bpf: Add CO-RE relocations to verifier scale test.
  2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
                   ` (14 preceding siblings ...)
  2021-11-24  6:02 ` [PATCH v4 bpf-next 15/16] selftests/bpf: Revert CO-RE removal in test_ksyms_weak Alexei Starovoitov
@ 2021-11-24  6:02 ` Alexei Starovoitov
  15 siblings, 0 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-24  6:02 UTC (permalink / raw)
  To: davem; +Cc: daniel, andrii, bpf, kernel-team

From: Alexei Starovoitov <ast@kernel.org>

Add 182 CO-RE relocations to verifier scale test.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/progs/test_verif_scale2.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/progs/test_verif_scale2.c b/tools/testing/selftests/bpf/progs/test_verif_scale2.c
index f024154c7be7..f90ffcafd1e8 100644
--- a/tools/testing/selftests/bpf/progs/test_verif_scale2.c
+++ b/tools/testing/selftests/bpf/progs/test_verif_scale2.c
@@ -1,11 +1,11 @@
 // SPDX-License-Identifier: GPL-2.0
 // Copyright (c) 2019 Facebook
-#include <linux/bpf.h>
+#include "vmlinux.h"
 #include <bpf/bpf_helpers.h>
 #define ATTR __always_inline
 #include "test_jhash.h"
 
-SEC("scale90_inline")
+SEC("tc")
 int balancer_ingress(struct __sk_buff *ctx)
 {
 	void *data_end = (void *)(long)ctx->data_end;
-- 
2.30.2


^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 bpf-next 07/16] bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().
  2021-11-24  6:02 ` [PATCH v4 bpf-next 07/16] bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn() Alexei Starovoitov
@ 2021-11-30  1:03   ` Andrii Nakryiko
  2021-11-30  3:18     ` Alexei Starovoitov
  2021-11-30  5:50   ` Alexei Starovoitov
  1 sibling, 1 reply; 28+ messages in thread
From: Andrii Nakryiko @ 2021-11-30  1:03 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, bpf, Kernel Team

On Tue, Nov 23, 2021 at 10:02 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> Given BPF program's BTF root type name perform the following steps:
> . search in vmlinux candidate cache.
> . if (present in cache and candidate list >= 1) return candidate list.
> . do a linear search through kernel BTFs for possible candidates.
> . regardless of number of candidates found populate vmlinux cache.
> . if (candidate list >= 1) return candidate list.
> . search in module candidate cache.
> . if (present in cache) return candidate list (even if list is empty).
> . do a linear search through BTFs of all kernel modules
>   collecting candidates from all of them.
> . regardless of number of candidates found populate module cache.
> . return candidate list.
> Then wire the result into bpf_core_apply_relo_insn().
>
> When BPF program is trying to CO-RE relocate a type
> that doesn't exist in either vmlinux BTF or in modules BTFs
> these steps will perform 2 cache lookups when cache is hit.
>
> Note the cache doesn't prevent the abuse by the program that might
> have lots of relocations that cannot be resolved. Hence cond_resched().
>
> CO-RE in the kernel requires CAP_BPF, since BTF loading requires it.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  kernel/bpf/btf.c          | 250 +++++++++++++++++++++++++++++++++++++-
>  tools/lib/bpf/relo_core.h |   2 +
>  2 files changed, 251 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index dbf1f389b1d3..cf971b8a0769 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -25,6 +25,7 @@
>  #include <linux/kobject.h>
>  #include <linux/sysfs.h>
>  #include <net/sock.h>
> +#include "../tools/lib/bpf/relo_core.h"
>

[...]

> +static void populate_cand_cache(struct bpf_cand_cache *cands,
> +                               struct bpf_cand_cache **cache,
> +                               int cache_size)
> +{
> +       u32 hash = jhash_2words(cands->name_len,
> +                               (((u32) cands->name[0]) << 8) | cands->name[1], 0);

maybe add a helper func to calculate the hash given struct
bpf_cand_cache to keep the logic always in sync?
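
e.g., something like this (just a sketch, name is made up):

  static u32 cand_cache_hash(const struct bpf_cand_cache *cands)
  {
          return jhash_2words(cands->name_len,
                              (((u32)cands->name[0]) << 8) | cands->name[1], 0);
  }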

> +       struct bpf_cand_cache *cc = cache[hash % cache_size];
> +
> +       if (cc)
> +               bpf_free_cands(cc);
> +       cache[hash % cache_size] = cands;
> +}
> +

[...]

> +static struct bpf_cand_cache *
> +bpf_core_find_cands(struct bpf_core_ctx *ctx, u32 local_type_id)
> +{
> +       const struct btf *local_btf = ctx->btf;
> +       const struct btf_type *local_type;
> +       const struct btf *main_btf;
> +       size_t local_essent_len;
> +       struct bpf_cand_cache *cands, *cc;
> +       struct btf *mod_btf;
> +       const char *name;
> +       int id;
> +
> +       local_type = btf_type_by_id(local_btf, local_type_id);
> +       if (!local_type)
> +               return ERR_PTR(-EINVAL);
> +
> +       name = btf_name_by_offset(local_btf, local_type->name_off);
> +       if (str_is_empty(name))
> +               return ERR_PTR(-EINVAL);
> +       local_essent_len = bpf_core_essential_name_len(name);
> +
> +       cands = kcalloc(1, sizeof(*cands), GFP_KERNEL);
> +       if (!cands)
> +               return ERR_PTR(-ENOMEM);
> +       cands->name = kmemdup_nul(name, local_essent_len, GFP_KERNEL);

it's pretty minor, but you don't really need kmemdup_nul() until
populate_cand_cache(), you can use name as is until you really need to
cache cands

> +       if (!cands->name) {
> +               kfree(cands);
> +               return ERR_PTR(-ENOMEM);
> +       }
> +       cands->kind = btf_kind(local_type);
> +       cands->name_len = local_essent_len;
> +
> +       cc = check_cand_cache(cands, vmlinux_cand_cache, VMLINUX_CAND_CACHE_SIZE);
> +       if (cc) {
> +               if (cc->cnt) {
> +                       bpf_free_cands(cands);
> +                       return cc;
> +               }
> +               goto check_modules;
> +       }
> +
> +       /* Attempt to find target candidates in vmlinux BTF first */
> +       main_btf = bpf_get_btf_vmlinux();
> +       cands = bpf_core_add_cands(cands, main_btf, 1);
> +       if (IS_ERR(cands))
> +               return cands;
> +
> +       /* populate cache even when cands->cnt == 0 */
> +       populate_cand_cache(cands, vmlinux_cand_cache, VMLINUX_CAND_CACHE_SIZE);
> +
> +       /* if vmlinux BTF has any candidate, don't go for module BTFs */
> +       if (cands->cnt)
> +               return cands;
> +
> +check_modules:
> +       cc = check_cand_cache(cands, module_cand_cache, MODULE_CAND_CACHE_SIZE);
> +       if (cc) {
> +               bpf_free_cands(cands);
> +               /* if cache has it return it even if cc->cnt == 0 */
> +               return cc;
> +       }
> +
> +       /* If candidate is not found in vmlinux's BTF then search in module's BTFs */
> +       spin_lock_bh(&btf_idr_lock);
> +       idr_for_each_entry(&btf_idr, mod_btf, id) {
> +               if (!btf_is_module(mod_btf))
> +                       continue;
> +               /* linear search could be slow hence unlock/lock
> +                * the IDR to avoiding holding it for too long
> +                */
> +               btf_get(mod_btf);
> +               spin_unlock_bh(&btf_idr_lock);
> +               cands = bpf_core_add_cands(cands, mod_btf, btf_nr_types(main_btf));
> +               if (IS_ERR(cands)) {
> +                       btf_put(mod_btf);
> +                       return cands;
> +               }
> +               spin_lock_bh(&btf_idr_lock);
> +               btf_put(mod_btf);

either need to additionally btf_get(mod_btf) inside
bpf_core_add_cands() or not btf_put() it here if you added at least one
candidate, as you are storing targ_btf inside bpf_core_add_cands() and
dropping the refcount might leave a dangling pointer
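
i.e., roughly (sketch):

  /* in bpf_core_add_cands(), when a candidate is stored: */
  btf_get((struct btf *)targ_btf);  /* hold a ref for the cached pointer */
  cands->cands[cands->cnt].btf = targ_btf;
  cands->cands[cands->cnt].id = i;
  cands->cnt++;
  /* ...with matching btf_put()s in bpf_free_cands()/purge paths */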


> +       }
> +       spin_unlock_bh(&btf_idr_lock);
> +       /* populate cache even when cands->cnt == 0 */
> +       populate_cand_cache(cands, module_cand_cache, MODULE_CAND_CACHE_SIZE);
> +       return cands;
> +}
> +
>  int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo,
>                    int relo_idx, void *insn)
>  {
> -       return -EOPNOTSUPP;
> +       struct bpf_core_cand_list cands = {};
> +       int err;
> +
> +       if (relo->kind != BPF_CORE_TYPE_ID_LOCAL) {
> +               struct bpf_cand_cache *cc;
> +               int i;
> +
> +               mutex_lock(&cand_cache_mutex);
> +               cc = bpf_core_find_cands(ctx, relo->type_id);
> +               if (IS_ERR(cc)) {
> +                       bpf_log(ctx->log, "target candidate search failed for %d\n",
> +                               relo->type_id);
> +                       return PTR_ERR(cc);
> +               }
> +               if (cc->cnt) {
> +                       cands.cands = kcalloc(cc->cnt, sizeof(*cands.cands), GFP_KERNEL);
> +                       if (!cands.cands)
> +                               return -ENOMEM;
> +               }
> +               for (i = 0; i < cc->cnt; i++) {
> +                       bpf_log(ctx->log,
> +                               "CO-RE relocating %s %s: found target candidate [%d]\n",
> +                               btf_kind_str[cc->kind], cc->name, cc->cands[i].id);
> +                       cands.cands[i].btf = cc->cands[i].btf;
> +                       cands.cands[i].id = cc->cands[i].id;
> +               }
> +               cands.len = cc->cnt;
> +               mutex_unlock(&cand_cache_mutex);
> +       }
> +

The cache is not locked at this point, so those cands.cands[i].btf objects
might be freed by now (if a module got unloaded meanwhile), right?

This global sharing of that small cache seems to cause unnecessary
headaches, tbh. It adds a global mutex (which might also end up being held
across a blocking kcalloc). If you used that cache locally for processing a
single bpf_prog, you wouldn't need the locking. It could probably also
simplify the refcounting, especially if you just btf_get(targ_btf) for each
candidate and then btf_put() it after all relos are processed. You are also
half a step away from removing the size restriction (just chain
bpf_cand_caches together) and having a fixed bucket-size hash with
non-fixed chain length (which would probably be totally fine for all
practical purposes).


> +       err = bpf_core_apply_relo_insn((void *)ctx->log, insn, relo->insn_off / 8,
> +                                      relo, relo_idx, ctx->btf, &cands);
> +       kfree(cands.cands);
> +       return err;
>  }
> diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h
> index f410691cc4e5..f7b0d698978c 100644
> --- a/tools/lib/bpf/relo_core.h
> +++ b/tools/lib/bpf/relo_core.h
> @@ -8,8 +8,10 @@
>
>  struct bpf_core_cand {
>         const struct btf *btf;
> +#ifndef __KERNEL__
>         const struct btf_type *t;
>         const char *name;
> +#endif

why? doesn't seem to be used and both t and name could be derived from
btf and id

>         __u32 id;
>  };
>
> --
> 2.30.2
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 bpf-next 08/16] libbpf: Use CO-RE in the kernel in light skeleton.
  2021-11-24  6:02 ` [PATCH v4 bpf-next 08/16] libbpf: Use CO-RE in the kernel in light skeleton Alexei Starovoitov
@ 2021-11-30  1:11   ` Andrii Nakryiko
  0 siblings, 0 replies; 28+ messages in thread
From: Andrii Nakryiko @ 2021-11-30  1:11 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, bpf, Kernel Team

On Tue, Nov 23, 2021 at 10:02 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> Without lskel the CO-RE relocations are processed by libbpf before any other
> work is done. Instead, when lskel is needed, remember relocation as RELO_CORE
> kind. Then when loader prog is generated for a given bpf program pass CO-RE
> relos of that program to gen loader via bpf_gen__record_relo_core(). The gen
> loader will remember them as-is and pass it later as-is into the kernel.
>
> The normal libbpf flow is to process CO-RE early before call relos happen. In
> case of gen_loader the core relos have to be added to other relos to be copied
> together when bpf static function is appended in different places to other main
> bpf progs. During the copy the append_subprog_relos() will adjust insn_idx for
> normal relos and for RELO_CORE kind too. When that is done each struct
> reloc_desc has good relos for specific main prog.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  tools/lib/bpf/bpf_gen_internal.h |   3 +
>  tools/lib/bpf/gen_loader.c       |  41 +++++++++++-
>  tools/lib/bpf/libbpf.c           | 108 ++++++++++++++++++++++---------
>  3 files changed, 119 insertions(+), 33 deletions(-)
>

LGTM, minor styling nit, please address if/when resubmitting.

Acked-by: Andrii Nakryiko <andrii@kernel.org>

[...]
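
Since the gen_loader.c hunk is trimmed here, a sketch of roughly what such a
recording helper could look like (the core_relos/core_relo_cnt fields of
struct bpf_gen are assumptions for illustration, not necessarily what the
patch adds):

void bpf_gen__record_relo_core(struct bpf_gen *gen,
                               const struct bpf_core_relo *core_relo)
{
        struct bpf_core_relo *relos;

        /* grow the array and append the relocation as-is; it is passed
         * to the kernel verbatim at prog_load time
         */
        relos = libbpf_reallocarray(gen->core_relos, gen->core_relo_cnt + 1,
                                    sizeof(*relos));
        if (!relos) {
                gen->error = -ENOMEM;
                return;
        }
        gen->core_relos = relos;
        memcpy(&relos[gen->core_relo_cnt], core_relo, sizeof(*relos));
        gen->core_relo_cnt++;
}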

> @@ -6600,6 +6638,16 @@ static int bpf_program__record_externs(struct bpf_program *prog)
>                                                ext->is_weak, false, BTF_KIND_FUNC,
>                                                relo->insn_idx);
>                         break;
> +               case RELO_CORE: {
> +                       struct bpf_core_relo cr = {
> +                               .insn_off = relo->insn_idx * 8,
> +                               .type_id = relo->core_relo->type_id,
> +                               .access_str_off = relo->core_relo->access_str_off,
> +                               .kind = relo->core_relo->kind,
> +                       };

nit: empty line between variable and statements

> +                       bpf_gen__record_relo_core(obj->gen_loader, &cr);
> +                       break;
> +               }
>                 default:
>                         continue;
>                 }
> @@ -6639,7 +6687,7 @@ static int bpf_object_load_prog(struct bpf_object *obj, struct bpf_program *prog
>                                 prog->name, prog->instances.nr);
>                 }
>                 if (obj->gen_loader)
> -                       bpf_program__record_externs(prog);
> +                       bpf_program_record_relos(prog);
>                 err = bpf_object_load_prog_instance(obj, prog,
>                                                     prog->insns, prog->insns_cnt,
>                                                     license, kern_ver, &fd);
> --
> 2.30.2
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 bpf-next 10/16] libbpf: Clean gen_loader's attach kind.
  2021-11-24  6:02 ` [PATCH v4 bpf-next 10/16] libbpf: Clean gen_loader's attach kind Alexei Starovoitov
@ 2021-11-30  1:11   ` Andrii Nakryiko
  0 siblings, 0 replies; 28+ messages in thread
From: Andrii Nakryiko @ 2021-11-30  1:11 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, bpf, Kernel Team

On Tue, Nov 23, 2021 at 10:02 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> The gen_loader has to clear attach_kind otherwise the programs
> without attach_btf_id will fail load if they follow programs
> with attach_btf_id.
>
> Fixes: 67234743736a ("libbpf: Generate loader program out of BPF ELF file.")
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  tools/lib/bpf/gen_loader.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/gen_loader.c b/tools/lib/bpf/gen_loader.c
> index 9066d1ae3461..3e9cc2312f0a 100644
> --- a/tools/lib/bpf/gen_loader.c
> +++ b/tools/lib/bpf/gen_loader.c
> @@ -1015,9 +1015,11 @@ void bpf_gen__prog_load(struct bpf_gen *gen,
>         debug_ret(gen, "prog_load %s insn_cnt %d", attr.prog_name, attr.insn_cnt);
>         /* successful or not, close btf module FDs used in extern ksyms and attach_btf_obj_fd */
>         cleanup_relos(gen, insns_off);
> -       if (gen->attach_kind)
> +       if (gen->attach_kind) {
>                 emit_sys_close_blob(gen,
>                                     attr_field(prog_load_attr, attach_btf_obj_fd));
> +               gen->attach_kind = 0;
> +       }
>         emit_check_err(gen);
>         /* remember prog_fd in the stack, if successful */
>         emit(gen, BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_7,
> --
> 2.30.2
>

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 bpf-next 14/16] selftests/bpf: Additional test for CO-RE in the kernel.
  2021-11-24  6:02 ` [PATCH v4 bpf-next 14/16] selftests/bpf: Additional test for CO-RE in the kernel Alexei Starovoitov
@ 2021-11-30  1:13   ` Andrii Nakryiko
  0 siblings, 0 replies; 28+ messages in thread
From: Andrii Nakryiko @ 2021-11-30  1:13 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, bpf, Kernel Team

On Tue, Nov 23, 2021 at 10:02 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> From: Alexei Starovoitov <ast@kernel.org>
>
> Additional test where randmap() function is appended to three different bpf
> programs. That action checks struct bpf_core_relo replication logic and offset
> adjustment.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---

Acked-by: Andrii Nakryiko <andrii@kernel.org>

>  tools/testing/selftests/bpf/Makefile          |  2 +-
>  .../selftests/bpf/prog_tests/core_kern.c      | 14 +++
>  tools/testing/selftests/bpf/progs/core_kern.c | 87 +++++++++++++++++++
>  3 files changed, 102 insertions(+), 1 deletion(-)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/core_kern.c
>  create mode 100644 tools/testing/selftests/bpf/progs/core_kern.c
>

[...]
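
For readers without the series applied, a rough sketch of the shape being
described (purely illustrative: the map, section names and the CO-RE access
below are made up; the real test is
tools/testing/selftests/bpf/progs/core_kern.c):

#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_core_read.h>

struct {
        __uint(type, BPF_MAP_TYPE_ARRAY);
        __uint(max_entries, 256);
        __type(key, u32);
        __type(value, u64);
} m SEC(".maps");

/* one static function with a CO-RE relocation inside, appended by the
 * loader to each main program that calls it; the struct bpf_core_relo
 * entries have to be replicated and their insn offsets adjusted per program
 */
static __noinline int randmap(int v)
{
        u32 key = bpf_get_prandom_u32() & 0xff;
        u64 *val = bpf_map_lookup_elem(&m, &key);

        if (val)
                *val = bpf_core_type_size(struct task_struct) + v;
        return 0;
}

SEC("tp_btf/sys_enter")
int prog1(void *ctx)
{
        return randmap(1);
}

SEC("tp_btf/sys_exit")
int prog2(void *ctx)
{
        return randmap(2);
}

SEC("raw_tp/sys_enter")
int prog3(void *ctx)
{
        return randmap(3);
}

char LICENSE[] SEC("license") = "GPL";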

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 bpf-next 07/16] bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().
  2021-11-30  1:03   ` Andrii Nakryiko
@ 2021-11-30  3:18     ` Alexei Starovoitov
  2021-11-30  4:09       ` Andrii Nakryiko
  0 siblings, 1 reply; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-30  3:18 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, bpf, Kernel Team

On Mon, Nov 29, 2021 at 05:03:37PM -0800, Andrii Nakryiko wrote:
> On Tue, Nov 23, 2021 at 10:02 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > From: Alexei Starovoitov <ast@kernel.org>
> >
> > Given BPF program's BTF root type name perform the following steps:
> > . search in vmlinux candidate cache.
> > . if (present in cache and candidate list >= 1) return candidate list.
> > . do a linear search through kernel BTFs for possible candidates.
> > . regardless of number of candidates found populate vmlinux cache.
> > . if (candidate list >= 1) return candidate list.
> > . search in module candidate cache.
> > . if (present in cache) return candidate list (even if list is empty).
> > . do a linear search through BTFs of all kernel modules
> >   collecting candidates from all of them.
> > . regardless of number of candidates found populate module cache.
> > . return candidate list.
> > Then wire the result into bpf_core_apply_relo_insn().
> >
> > When BPF program is trying to CO-RE relocate a type
> > that doesn't exist in either vmlinux BTF or in modules BTFs
> > these steps will perform 2 cache lookups when cache is hit.
> >
> > Note the cache doesn't prevent the abuse by the program that might
> > have lots of relocations that cannot be resolved. Hence cond_resched().
> >
> > CO-RE in the kernel requires CAP_BPF, since BTF loading requires it.
> >
> > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > ---
> >  kernel/bpf/btf.c          | 250 +++++++++++++++++++++++++++++++++++++-
> >  tools/lib/bpf/relo_core.h |   2 +
> >  2 files changed, 251 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > index dbf1f389b1d3..cf971b8a0769 100644
> > --- a/kernel/bpf/btf.c
> > +++ b/kernel/bpf/btf.c
> > @@ -25,6 +25,7 @@
> >  #include <linux/kobject.h>
> >  #include <linux/sysfs.h>
> >  #include <net/sock.h>
> > +#include "../tools/lib/bpf/relo_core.h"
> >
> 
> [...]
> 
> > +static void populate_cand_cache(struct bpf_cand_cache *cands,
> > +                               struct bpf_cand_cache **cache,
> > +                               int cache_size)
> > +{
> > +       u32 hash = jhash_2words(cands->name_len,
> > +                               (((u32) cands->name[0]) << 8) | cands->name[1], 0);
> 
> maybe add a helper func to calculate the hash given struct
> bpf_cand_cache to keep the logic always in sync?

I felt it's trivial enough, but sure I can do that.
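
Something along these lines (the helper name is illustrative):

static u32 hash_cands(struct bpf_cand_cache *cands)
{
        return jhash_2words(cands->name_len,
                            (((u32)cands->name[0]) << 8) | cands->name[1], 0);
}

with check_cand_cache() and populate_cand_cache() both calling
hash_cands(cands), so the two lookups cannot drift out of sync.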

> > +       struct bpf_cand_cache *cc = cache[hash % cache_size];
> > +
> > +       if (cc)
> > +               bpf_free_cands(cc);
> > +       cache[hash % cache_size] = cands;
> > +}
> > +
> 
> [...]
> 
> > +static struct bpf_cand_cache *
> > +bpf_core_find_cands(struct bpf_core_ctx *ctx, u32 local_type_id)
> > +{
> > +       const struct btf *local_btf = ctx->btf;
> > +       const struct btf_type *local_type;
> > +       const struct btf *main_btf;
> > +       size_t local_essent_len;
> > +       struct bpf_cand_cache *cands, *cc;
> > +       struct btf *mod_btf;
> > +       const char *name;
> > +       int id;
> > +
> > +       local_type = btf_type_by_id(local_btf, local_type_id);
> > +       if (!local_type)
> > +               return ERR_PTR(-EINVAL);
> > +
> > +       name = btf_name_by_offset(local_btf, local_type->name_off);
> > +       if (str_is_empty(name))
> > +               return ERR_PTR(-EINVAL);
> > +       local_essent_len = bpf_core_essential_name_len(name);
> > +
> > +       cands = kcalloc(1, sizeof(*cands), GFP_KERNEL);
> > +       if (!cands)
> > +               return ERR_PTR(-ENOMEM);
> > +       cands->name = kmemdup_nul(name, local_essent_len, GFP_KERNEL);
> 
> it's pretty minor, but you don't really need kmemdup_nul() until
> populate_cand_cache(), you can use name as is until you really need to
> cache cands

I thought about it, but didn't do it, because it complicates the code:
bpf_core_add_cands() would somehow need to differentiate
whether cands->name came from the 1st call into bpf_core_add_cands()
or from a subsequent call.
It could be a flag in struct bpf_cand_cache that tells
whether bpf_free_cands() needs to free the name or not, but that feels
unnecessarily complex.

> 
> > +       if (!cands->name) {
> > +               kfree(cands);
> > +               return ERR_PTR(-ENOMEM);
> > +       }
> > +       cands->kind = btf_kind(local_type);
> > +       cands->name_len = local_essent_len;
> > +
> > +       cc = check_cand_cache(cands, vmlinux_cand_cache, VMLINUX_CAND_CACHE_SIZE);
> > +       if (cc) {
> > +               if (cc->cnt) {
> > +                       bpf_free_cands(cands);
> > +                       return cc;
> > +               }
> > +               goto check_modules;
> > +       }
> > +
> > +       /* Attempt to find target candidates in vmlinux BTF first */
> > +       main_btf = bpf_get_btf_vmlinux();
> > +       cands = bpf_core_add_cands(cands, main_btf, 1);
> > +       if (IS_ERR(cands))
> > +               return cands;
> > +
> > +       /* populate cache even when cands->cnt == 0 */
> > +       populate_cand_cache(cands, vmlinux_cand_cache, VMLINUX_CAND_CACHE_SIZE);
> > +
> > +       /* if vmlinux BTF has any candidate, don't go for module BTFs */
> > +       if (cands->cnt)
> > +               return cands;
> > +
> > +check_modules:
> > +       cc = check_cand_cache(cands, module_cand_cache, MODULE_CAND_CACHE_SIZE);
> > +       if (cc) {
> > +               bpf_free_cands(cands);
> > +               /* if cache has it return it even if cc->cnt == 0 */
> > +               return cc;
> > +       }
> > +
> > +       /* If candidate is not found in vmlinux's BTF then search in module's BTFs */
> > +       spin_lock_bh(&btf_idr_lock);
> > +       idr_for_each_entry(&btf_idr, mod_btf, id) {
> > +               if (!btf_is_module(mod_btf))
> > +                       continue;
> > +               /* linear search could be slow hence unlock/lock
> > +                * the IDR to avoiding holding it for too long
> > +                */
> > +               btf_get(mod_btf);
> > +               spin_unlock_bh(&btf_idr_lock);
> > +               cands = bpf_core_add_cands(cands, mod_btf, btf_nr_types(main_btf));
> > +               if (IS_ERR(cands)) {
> > +                       btf_put(mod_btf);
> > +                       return cands;
> > +               }
> > +               spin_lock_bh(&btf_idr_lock);
> > +               btf_put(mod_btf);
> 
> either need to additionally btf_get(mod_btf) inside
> bpf_core_add_cands() not btf_put() it here if you added at least one
> candidate, as you are storing targ_btf inside bpf_core_add_cands() and
> dropping refcount might leave dangling pointer

Module will not go away while cands are being searched and cache ops are done.
purge_cand_cache() is called from MODULE_STATE_GOING.
I've considered doing the purge from btf_put(),
but we don't guarantee the context in there, so mutex_lock
would complicate btf_put too much.
It's simpler to do purge from MODULE_STATE_GOING.
But more below...
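
i.e. roughly the following arrangement (a simplified sketch of what is
described above; the notifier plumbing is abbreviated and btf_of_module()
is a hypothetical lookup helper):

static int btf_module_notify(struct notifier_block *nb, unsigned long op,
                             void *module)
{
        struct module *mod = module;

        if (op == MODULE_STATE_GOING) {
                /* process context here, so taking the mutex is fine,
                 * unlike btf_put() which can run from any context
                 */
                mutex_lock(&cand_cache_mutex);
                purge_cand_cache(btf_of_module(mod));   /* hypothetical helper */
                mutex_unlock(&cand_cache_mutex);
        }
        return NOTIFY_OK;
}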

> > +               for (i = 0; i < cc->cnt; i++) {
> > +                       bpf_log(ctx->log,
> > +                               "CO-RE relocating %s %s: found target candidate [%d]\n",
> > +                               btf_kind_str[cc->kind], cc->name, cc->cands[i].id);
> > +                       cands.cands[i].btf = cc->cands[i].btf;
> > +                       cands.cands[i].id = cc->cands[i].id;
> > +               }
> > +               cands.len = cc->cnt;
> > +               mutex_unlock(&cand_cache_mutex);
> > +       }
> > +
> 
> cache is not locked at this point, so those cands.cands[i].btf objects
> might be freed now (if module got unloaded meanwhile), right?

right, looks easier to do btf_get here while copying the pointer.
and do a loop of btf_put after bpf_core_apply_relo_insn.
Doing btf_get inside bpf_core_add_cands adds complexity to cache replacement
and purging.
Right now __purge_cand_cache is just kfree of the whole slot
with multiple btf pointers in there potentially from different modules.
With btf_get in add_cands it would need to be a loop and
bpf_free_cands would need to have a loop.
btw the earlier versions of this patch set had the same issue,
so thanks for the good catch!
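
i.e. something like this on top of the copy loop quoted above (a sketch of
the idea, not necessarily the exact respin code):

                for (i = 0; i < cc->cnt; i++) {
                        bpf_log(ctx->log,
                                "CO-RE relocating %s %s: found target candidate [%d]\n",
                                btf_kind_str[cc->kind], cc->name, cc->cands[i].id);
                        /* pin each target BTF so it stays alive after the
                         * mutex is dropped, even if its module is unloaded
                         */
                        btf_get((struct btf *)cc->cands[i].btf);
                        cands.cands[i].btf = cc->cands[i].btf;
                        cands.cands[i].id = cc->cands[i].id;
                }
                cands.len = cc->cnt;
                mutex_unlock(&cand_cache_mutex);
        }

        err = bpf_core_apply_relo_insn((void *)ctx->log, insn, relo->insn_off / 8,
                                       relo, relo_idx, ctx->btf, &cands);

        /* drop the references taken above (no-op when cands.len == 0) */
        while (cands.len)
                btf_put((struct btf *)cands.cands[--cands.len].btf);
        kfree(cands.cands);
        return err;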

> This global sharing of that small cache seems to cause unnecessary
> headaches, tbh. It adds global mutex (which might also block for
> kcalloc). If you used that cache locally for processing single
> bpf_prog, you wouldn't need the locking. It can probably also simplify
> the refcounting, especially if you just btf_get(targ_btf) for each
> candidate and then btf_put() it after all relos are processed. You are
> also half-step away from removing the size restriction (just chain
> bpf_cand_caches together) and having a fixed bucket-size hash with
> non-fixed chain length (which probably would be totally fine for all
> practical purposes).

and that would be an almost done hashtable? why add that complexity?
The size restriction is necessary anyway for a global cache.
Even for per-program hashtable some size restriction might be
necessary. CAP_BPF is a target for "researchers" to do
weird things with the kernel.

> 
> 
> > +       err = bpf_core_apply_relo_insn((void *)ctx->log, insn, relo->insn_off / 8,
> > +                                      relo, relo_idx, ctx->btf, &cands);
> > +       kfree(cands.cands);
> > +       return err;
> >  }
> > diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h
> > index f410691cc4e5..f7b0d698978c 100644
> > --- a/tools/lib/bpf/relo_core.h
> > +++ b/tools/lib/bpf/relo_core.h
> > @@ -8,8 +8,10 @@
> >
> >  struct bpf_core_cand {
> >         const struct btf *btf;
> > +#ifndef __KERNEL__
> >         const struct btf_type *t;
> >         const char *name;
> > +#endif
> 
> why? doesn't seem to be used and both t and name could be derived from
> btf and id

exactly, that's why CO-RE in the kernel doesn't use them,
but libbpf uses both for debugging and for passing information
back and forth between layers.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 bpf-next 07/16] bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().
  2021-11-30  3:18     ` Alexei Starovoitov
@ 2021-11-30  4:09       ` Andrii Nakryiko
  2021-11-30  5:04         ` Alexei Starovoitov
  2021-11-30 23:06         ` Alexei Starovoitov
  0 siblings, 2 replies; 28+ messages in thread
From: Andrii Nakryiko @ 2021-11-30  4:09 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, bpf, Kernel Team

On Mon, Nov 29, 2021 at 7:18 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Nov 29, 2021 at 05:03:37PM -0800, Andrii Nakryiko wrote:
> > On Tue, Nov 23, 2021 at 10:02 PM Alexei Starovoitov
> > <alexei.starovoitov@gmail.com> wrote:
> > >
> > > From: Alexei Starovoitov <ast@kernel.org>
> > >
> > > Given BPF program's BTF root type name perform the following steps:
> > > . search in vmlinux candidate cache.
> > > . if (present in cache and candidate list >= 1) return candidate list.
> > > . do a linear search through kernel BTFs for possible candidates.
> > > . regardless of number of candidates found populate vmlinux cache.
> > > . if (candidate list >= 1) return candidate list.
> > > . search in module candidate cache.
> > > . if (present in cache) return candidate list (even if list is empty).
> > > . do a linear search through BTFs of all kernel modules
> > >   collecting candidates from all of them.
> > > . regardless of number of candidates found populate module cache.
> > > . return candidate list.
> > > Then wire the result into bpf_core_apply_relo_insn().
> > >
> > > When BPF program is trying to CO-RE relocate a type
> > > that doesn't exist in either vmlinux BTF or in modules BTFs
> > > these steps will perform 2 cache lookups when cache is hit.
> > >
> > > Note the cache doesn't prevent the abuse by the program that might
> > > have lots of relocations that cannot be resolved. Hence cond_resched().
> > >
> > > CO-RE in the kernel requires CAP_BPF, since BTF loading requires it.
> > >
> > > Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> > > ---
> > >  kernel/bpf/btf.c          | 250 +++++++++++++++++++++++++++++++++++++-
> > >  tools/lib/bpf/relo_core.h |   2 +
> > >  2 files changed, 251 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> > > index dbf1f389b1d3..cf971b8a0769 100644
> > > --- a/kernel/bpf/btf.c
> > > +++ b/kernel/bpf/btf.c
> > > @@ -25,6 +25,7 @@
> > >  #include <linux/kobject.h>
> > >  #include <linux/sysfs.h>
> > >  #include <net/sock.h>
> > > +#include "../tools/lib/bpf/relo_core.h"
> > >
> >
> > [...]
> >
> > > +static void populate_cand_cache(struct bpf_cand_cache *cands,
> > > +                               struct bpf_cand_cache **cache,
> > > +                               int cache_size)
> > > +{
> > > +       u32 hash = jhash_2words(cands->name_len,
> > > +                               (((u32) cands->name[0]) << 8) | cands->name[1], 0);
> >
> > maybe add a helper func to calculate the hash given struct
> > bpf_cand_cache to keep the logic always in sync?
>
> I felt it's trivial enough, but sure I can do that.
>
> > > +       struct bpf_cand_cache *cc = cache[hash % cache_size];
> > > +
> > > +       if (cc)
> > > +               bpf_free_cands(cc);
> > > +       cache[hash % cache_size] = cands;
> > > +}
> > > +
> >
> > [...]
> >
> > > +static struct bpf_cand_cache *
> > > +bpf_core_find_cands(struct bpf_core_ctx *ctx, u32 local_type_id)
> > > +{
> > > +       const struct btf *local_btf = ctx->btf;
> > > +       const struct btf_type *local_type;
> > > +       const struct btf *main_btf;
> > > +       size_t local_essent_len;
> > > +       struct bpf_cand_cache *cands, *cc;
> > > +       struct btf *mod_btf;
> > > +       const char *name;
> > > +       int id;
> > > +
> > > +       local_type = btf_type_by_id(local_btf, local_type_id);
> > > +       if (!local_type)
> > > +               return ERR_PTR(-EINVAL);
> > > +
> > > +       name = btf_name_by_offset(local_btf, local_type->name_off);
> > > +       if (str_is_empty(name))
> > > +               return ERR_PTR(-EINVAL);
> > > +       local_essent_len = bpf_core_essential_name_len(name);
> > > +
> > > +       cands = kcalloc(1, sizeof(*cands), GFP_KERNEL);
> > > +       if (!cands)
> > > +               return ERR_PTR(-ENOMEM);
> > > +       cands->name = kmemdup_nul(name, local_essent_len, GFP_KERNEL);
> >
> > it's pretty minor, but you don't really need kmemdup_nul() until
> > populate_cand_cache(), you can use name as is until you really need to
> > cache cands
>
> I thought about it, but didn't do it, because it complicates the code:
> bpf_core_add_cands() would somehow need to differentiate
> whether cands->name came from the 1st call into bpf_core_add_cands()
> or from a subsequent call.

ah, I initially missed that bpf_core_add_cands() can also free the
candidate list; yeah, it's fine, as I said it's minor

> It could be a flag in struct bpf_cand_cache that tells
> whether bpf_free_cands() needs to free the name or not, but that feels
> unnecessarily complex.

right, flags suck

>
> >
> > > +       if (!cands->name) {
> > > +               kfree(cands);
> > > +               return ERR_PTR(-ENOMEM);
> > > +       }
> > > +       cands->kind = btf_kind(local_type);
> > > +       cands->name_len = local_essent_len;
> > > +
> > > +       cc = check_cand_cache(cands, vmlinux_cand_cache, VMLINUX_CAND_CACHE_SIZE);
> > > +       if (cc) {
> > > +               if (cc->cnt) {
> > > +                       bpf_free_cands(cands);
> > > +                       return cc;
> > > +               }
> > > +               goto check_modules;
> > > +       }
> > > +
> > > +       /* Attempt to find target candidates in vmlinux BTF first */
> > > +       main_btf = bpf_get_btf_vmlinux();
> > > +       cands = bpf_core_add_cands(cands, main_btf, 1);
> > > +       if (IS_ERR(cands))
> > > +               return cands;
> > > +
> > > +       /* populate cache even when cands->cnt == 0 */
> > > +       populate_cand_cache(cands, vmlinux_cand_cache, VMLINUX_CAND_CACHE_SIZE);
> > > +
> > > +       /* if vmlinux BTF has any candidate, don't go for module BTFs */
> > > +       if (cands->cnt)
> > > +               return cands;
> > > +
> > > +check_modules:
> > > +       cc = check_cand_cache(cands, module_cand_cache, MODULE_CAND_CACHE_SIZE);
> > > +       if (cc) {
> > > +               bpf_free_cands(cands);
> > > +               /* if cache has it return it even if cc->cnt == 0 */
> > > +               return cc;
> > > +       }
> > > +
> > > +       /* If candidate is not found in vmlinux's BTF then search in module's BTFs */
> > > +       spin_lock_bh(&btf_idr_lock);
> > > +       idr_for_each_entry(&btf_idr, mod_btf, id) {
> > > +               if (!btf_is_module(mod_btf))
> > > +                       continue;
> > > +               /* linear search could be slow hence unlock/lock
> > > +                * the IDR to avoiding holding it for too long
> > > +                */
> > > +               btf_get(mod_btf);
> > > +               spin_unlock_bh(&btf_idr_lock);
> > > +               cands = bpf_core_add_cands(cands, mod_btf, btf_nr_types(main_btf));
> > > +               if (IS_ERR(cands)) {
> > > +                       btf_put(mod_btf);
> > > +                       return cands;
> > > +               }
> > > +               spin_lock_bh(&btf_idr_lock);
> > > +               btf_put(mod_btf);
> >
> > either need to additionally btf_get(mod_btf) inside
> > bpf_core_add_cands() not btf_put() it here if you added at least one
> > candidate, as you are storing targ_btf inside bpf_core_add_cands() and
> > dropping refcount might leave dangling pointer
>
> Module will not go away while cands are being searched and cache ops are done.
> purge_cand_cache() is called from MODULE_STATE_GOING.
> I've considered doing the purge from btf_put(),
> but we don't guarantee the context in there, so mutex_lock
> would complicate btf_put too much.
> It's simpler to do purge from MODULE_STATE_GOING.
> But more below...
>
> > > +               for (i = 0; i < cc->cnt; i++) {
> > > +                       bpf_log(ctx->log,
> > > +                               "CO-RE relocating %s %s: found target candidate [%d]\n",
> > > +                               btf_kind_str[cc->kind], cc->name, cc->cands[i].id);
> > > +                       cands.cands[i].btf = cc->cands[i].btf;
> > > +                       cands.cands[i].id = cc->cands[i].id;
> > > +               }
> > > +               cands.len = cc->cnt;
> > > +               mutex_unlock(&cand_cache_mutex);
> > > +       }
> > > +
> >
> > cache is not locked at this point, so those cands.cands[i].btf objects
> > might be freed now (if module got unloaded meanwhile), right?
>
> right, looks easier to do btf_get here while copying the pointer.
> and do a loop of btf_put after bpf_core_apply_relo_insn.
> Doing btf_get inside bpf_core_add_cands adds complexity to cache replacement
> and purging.
> Right now __purge_cand_cache is just kfree of the whole slot
> with multiple btf pointers in there potentially from different modules.
> With btf_get in add_cands it would need to be a loop and
> bpf_free_cands would need to have a loop.
> btw the earlier versions of this patch set had the same issue,
> so thanks for the good catch!

no problem. Yeah, btf_get() here should work as well, I think.
>
> > This global sharing of that small cache seems to cause unnecessary
> > headaches, tbh. It adds global mutex (which might also block for
> > kcalloc). If you used that cache locally for processing single
> > bpf_prog, you wouldn't need the locking. It can probably also simplify
> > the refcounting, especially if you just btf_get(targ_btf) for each
> > candidate and then btf_put() it after all relos are processed. You are
> > also half-step away from removing the size restriction (just chain
> > bpf_cand_caches together) and having a fixed bucket-size hash with
> > non-fixed chain length (which probably would be totally fine for all
> > practical purposes).
>
> and that would be an almost done hashtable? why add that complexity?

well, my point was that you already have all this complexity :) I see
avoiding the global mutex as less complexity; the hashmap parts are just
annoying and mundane code, but not really a complexity.

But if you insist on a shared global mini-cache, that's fine.

> The size restriction is necessary anyway for a global cache.
> Even for per-program hashtable some size restriction might be
> necessary. CAP_BPF is a target for "researchers" to do
> weird things with the kernel.
>
> >
> >
> > > +       err = bpf_core_apply_relo_insn((void *)ctx->log, insn, relo->insn_off / 8,
> > > +                                      relo, relo_idx, ctx->btf, &cands);
> > > +       kfree(cands.cands);
> > > +       return err;
> > >  }
> > > diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h
> > > index f410691cc4e5..f7b0d698978c 100644
> > > --- a/tools/lib/bpf/relo_core.h
> > > +++ b/tools/lib/bpf/relo_core.h
> > > @@ -8,8 +8,10 @@
> > >
> > >  struct bpf_core_cand {
> > >         const struct btf *btf;
> > > +#ifndef __KERNEL__
> > >         const struct btf_type *t;
> > >         const char *name;
> > > +#endif
> >
> > why? doesn't seem to be used and both t and name could be derived from
> > btf and id
>
> exactly, that's why CO-RE in the kernel doesn't use them,
> but libbpf uses both for debugging and for passing information
> back and forth between layers.

oh, I thought you added those fields initially and forgot to delete or
something, didn't notice that you are just "opting them out" for
__KERNEL__. I think libbpf code doesn't strictly need this, here's the
diff that completely removes their use, it's pretty straightforward
and minimal, so maybe instead of #ifdef'ing let's just do that?

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index b59fede08ba7..95fa57eea289 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -5179,15 +5179,18 @@ static int bpf_core_add_cands(struct
bpf_core_cand *local_cand,
                   struct bpf_core_cand_list *cands)
 {
     struct bpf_core_cand *new_cands, *cand;
-    const struct btf_type *t;
-    const char *targ_name;
+    const struct btf_type *t, *local_t;
+    const char *targ_name, *local_name;
     size_t targ_essent_len;
     int n, i;

+    local_t = btf__type_by_id(local_cand->btf, local_cand->id);
+    local_name = btf__str_by_offset(local_cand->btf, local_t->name_off);
+
     n = btf__type_cnt(targ_btf);
     for (i = targ_start_id; i < n; i++) {
         t = btf__type_by_id(targ_btf, i);
-        if (btf_kind(t) != btf_kind(local_cand->t))
+        if (btf_kind(t) != btf_kind(local_t))
             continue;

         targ_name = btf__name_by_offset(targ_btf, t->name_off);
@@ -5198,12 +5201,12 @@ static int bpf_core_add_cands(struct
bpf_core_cand *local_cand,
         if (targ_essent_len != local_essent_len)
             continue;

-        if (strncmp(local_cand->name, targ_name, local_essent_len) != 0)
+        if (strncmp(local_name, targ_name, local_essent_len) != 0)
             continue;

         pr_debug("CO-RE relocating [%d] %s %s: found target candidate
[%d] %s %s in [%s]\n",
-             local_cand->id, btf_kind_str(local_cand->t),
-             local_cand->name, i, btf_kind_str(t), targ_name,
+             local_cand->id, btf_kind_str(local_t),
+             local_name, i, btf_kind_str(t), targ_name,
              targ_btf_name);
         new_cands = libbpf_reallocarray(cands->cands, cands->len + 1,
                           sizeof(*cands->cands));
@@ -5212,8 +5215,6 @@ static int bpf_core_add_cands(struct
bpf_core_cand *local_cand,

         cand = &new_cands[cands->len];
         cand->btf = targ_btf;
-        cand->t = t;
-        cand->name = targ_name;
         cand->id = i;

         cands->cands = new_cands;
@@ -5320,18 +5321,20 @@ bpf_core_find_cands(struct bpf_object *obj,
const struct btf *local_btf, __u32 l
     struct bpf_core_cand local_cand = {};
     struct bpf_core_cand_list *cands;
     const struct btf *main_btf;
+    const struct btf_type *local_t;
+    const char *local_name;
     size_t local_essent_len;
     int err, i;

     local_cand.btf = local_btf;
-    local_cand.t = btf__type_by_id(local_btf, local_type_id);
-    if (!local_cand.t)
+    local_t = btf__type_by_id(local_btf, local_type_id);
+    if (!local_t)
         return ERR_PTR(-EINVAL);

-    local_cand.name = btf__name_by_offset(local_btf, local_cand.t->name_off);
-    if (str_is_empty(local_cand.name))
+    local_name = btf__name_by_offset(local_btf, local_t->name_off);
+    if (str_is_empty(local_name))
         return ERR_PTR(-EINVAL);
-    local_essent_len = bpf_core_essential_name_len(local_cand.name);
+    local_essent_len = bpf_core_essential_name_len(local_name);

     cands = calloc(1, sizeof(*cands));
     if (!cands)
diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h
index 3b9f8f18346c..0dc0b9256bea 100644
--- a/tools/lib/bpf/relo_core.h
+++ b/tools/lib/bpf/relo_core.h
@@ -77,8 +77,6 @@ struct bpf_core_relo {

 struct bpf_core_cand {
     const struct btf *btf;
-    const struct btf_type *t;
-    const char *name;
     __u32 id;
 };

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 bpf-next 07/16] bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().
  2021-11-30  4:09       ` Andrii Nakryiko
@ 2021-11-30  5:04         ` Alexei Starovoitov
  2021-11-30  5:14           ` Andrii Nakryiko
  2021-11-30 23:06         ` Alexei Starovoitov
  1 sibling, 1 reply; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-30  5:04 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, bpf, Kernel Team

On Mon, Nov 29, 2021 at 08:09:55PM -0800, Andrii Nakryiko wrote:
> 
> oh, I thought you added those fields initially and forgot to delete or
> something, didn't notice that you are just "opting them out" for
> __KERNEL__. I think libbpf code doesn't strictly need this, here's the
> diff that completely removes their use, it's pretty straightforward
> and minimal, so maybe instead of #ifdef'ing let's just do that?

Cool.
Folded it into my series as another patch with your SOB.

> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index b59fede08ba7..95fa57eea289 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -5179,15 +5179,18 @@ static int bpf_core_add_cands(struct
> bpf_core_cand *local_cand,
>                    struct bpf_core_cand_list *cands)
>  {
>      struct bpf_core_cand *new_cands, *cand;
> -    const struct btf_type *t;
> -    const char *targ_name;
> +    const struct btf_type *t, *local_t;
> +    const char *targ_name, *local_name;

I wish you'd inserted the patch without mangling.
Thankfully it was short enough to retype manually.
No big deal.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 bpf-next 07/16] bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().
  2021-11-30  5:04         ` Alexei Starovoitov
@ 2021-11-30  5:14           ` Andrii Nakryiko
  0 siblings, 0 replies; 28+ messages in thread
From: Andrii Nakryiko @ 2021-11-30  5:14 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, bpf, Kernel Team

On Mon, Nov 29, 2021 at 9:05 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Nov 29, 2021 at 08:09:55PM -0800, Andrii Nakryiko wrote:
> >
> > oh, I thought you added those fields initially and forgot to delete or
> > something, didn't notice that you are just "opting them out" for
> > __KERNEL__. I think libbpf code doesn't strictly need this, here's the
> > diff that completely removes their use, it's pretty straightforward
> > and minimal, so maybe instead of #ifdef'ing let's just do that?
>
> Cool.
> Folded it into my series as another patch with your SOB.
>

Sounds good, thanks.

> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index b59fede08ba7..95fa57eea289 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -5179,15 +5179,18 @@ static int bpf_core_add_cands(struct
> > bpf_core_cand *local_cand,
> >                    struct bpf_core_cand_list *cands)
> >  {
> >      struct bpf_core_cand *new_cands, *cand;
> > -    const struct btf_type *t;
> > -    const char *targ_name;
> > +    const struct btf_type *t, *local_t;
> > +    const char *targ_name, *local_name;
>
> I wish you'd inserted the patch without mangling.
> Thankfully it was short enough to retype manually.
> No big deal.

Sorry, I actually don't know how to do it with Gmail :( It always
replaces tabs with spaces and I haven't found a way around that.

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 bpf-next 07/16] bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().
  2021-11-24  6:02 ` [PATCH v4 bpf-next 07/16] bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn() Alexei Starovoitov
  2021-11-30  1:03   ` Andrii Nakryiko
@ 2021-11-30  5:50   ` Alexei Starovoitov
  1 sibling, 0 replies; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-30  5:50 UTC (permalink / raw)
  To: David S. Miller; +Cc: Daniel Borkmann, Andrii Nakryiko, bpf, Kernel Team

On Tue, Nov 23, 2021 at 10:02 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> +
> +       /* Attempt to find target candidates in vmlinux BTF first */
> +       main_btf = bpf_get_btf_vmlinux();
> +       cands = bpf_core_add_cands(cands, main_btf, 1);
> +       if (IS_ERR(cands))
> +               return cands;
> +
> +       /* populate cache even when cands->cnt == 0 */
> +       populate_cand_cache(cands, vmlinux_cand_cache, VMLINUX_CAND_CACHE_SIZE);
> +
> +       /* if vmlinux BTF has any candidate, don't go for module BTFs */
> +       if (cands->cnt)
> +               return cands;
> +
> +check_modules:
> +       cc = check_cand_cache(cands, module_cand_cache, MODULE_CAND_CACHE_SIZE);
> +       if (cc) {
> +               bpf_free_cands(cands);
> +               /* if cache has it return it even if cc->cnt == 0 */
> +               return cc;
> +       }

Found another issue in the above: when the cache is populated
with empty cands, the bpf_free_cands() above will make it a use-after-free.
Fixing in respin.
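
Spelling out the path for a type that has no vmlinux candidates:

        cands = bpf_core_add_cands(cands, main_btf, 1);     /* cands->cnt == 0 */
        populate_cand_cache(cands, vmlinux_cand_cache, VMLINUX_CAND_CACHE_SIZE);
                /* the vmlinux cache now stores the 'cands' pointer */
check_modules:
        cc = check_cand_cache(cands, module_cand_cache, MODULE_CAND_CACHE_SIZE);
        if (cc) {
                bpf_free_cands(cands);  /* frees the object the vmlinux cache
                                         * still points to, so a later lookup in
                                         * that cache dereferences freed memory
                                         */
                return cc;
        }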

^ permalink raw reply	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 bpf-next 07/16] bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().
  2021-11-30  4:09       ` Andrii Nakryiko
  2021-11-30  5:04         ` Alexei Starovoitov
@ 2021-11-30 23:06         ` Alexei Starovoitov
  2021-11-30 23:12           ` Andrii Nakryiko
  1 sibling, 1 reply; 28+ messages in thread
From: Alexei Starovoitov @ 2021-11-30 23:06 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, bpf, Kernel Team

On Mon, Nov 29, 2021 at 8:10 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> oh, I thought you added those fields initially and forgot to delete or
> something, didn't notice that you are just "opting them out" for
> __KERNEL__. I think libbpf code doesn't strictly need this, here's the
> diff that completely removes their use, it's pretty straightforward
> and minimal, so maybe instead of #ifdef'ing let's just do that?
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index b59fede08ba7..95fa57eea289 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -5179,15 +5179,18 @@ static int bpf_core_add_cands(struct
> bpf_core_cand *local_cand,
>                    struct bpf_core_cand_list *cands)
>  {
>      struct bpf_core_cand *new_cands, *cand;
> -    const struct btf_type *t;
> -    const char *targ_name;
> +    const struct btf_type *t, *local_t;
> +    const char *targ_name, *local_name;
>      size_t targ_essent_len;
>      int n, i;
>
> +    local_t = btf__type_by_id(local_cand->btf, local_cand->id);
> +    local_name = btf__str_by_offset(local_cand->btf, local_t->name_off);
> +
>      n = btf__type_cnt(targ_btf);
>      for (i = targ_start_id; i < n; i++) {
>          t = btf__type_by_id(targ_btf, i);
> -        if (btf_kind(t) != btf_kind(local_cand->t))
> +        if (btf_kind(t) != btf_kind(local_t))
>              continue;
>
>          targ_name = btf__name_by_offset(targ_btf, t->name_off);
> @@ -5198,12 +5201,12 @@ static int bpf_core_add_cands(struct
> bpf_core_cand *local_cand,
>          if (targ_essent_len != local_essent_len)
>              continue;
>
> -        if (strncmp(local_cand->name, targ_name, local_essent_len) != 0)
> +        if (strncmp(local_name, targ_name, local_essent_len) != 0)
>              continue;
>
>          pr_debug("CO-RE relocating [%d] %s %s: found target candidate
> [%d] %s %s in [%s]\n",
> -             local_cand->id, btf_kind_str(local_cand->t),
> -             local_cand->name, i, btf_kind_str(t), targ_name,
> +             local_cand->id, btf_kind_str(local_t),
> +             local_name, i, btf_kind_str(t), targ_name,
>               targ_btf_name);
>          new_cands = libbpf_reallocarray(cands->cands, cands->len + 1,
>                            sizeof(*cands->cands));
> @@ -5212,8 +5215,6 @@ static int bpf_core_add_cands(struct
> bpf_core_cand *local_cand,
>
>          cand = &new_cands[cands->len];
>          cand->btf = targ_btf;
> -        cand->t = t;
> -        cand->name = targ_name;
>          cand->id = i;
>
>          cands->cands = new_cands;
> @@ -5320,18 +5321,20 @@ bpf_core_find_cands(struct bpf_object *obj,
> const struct btf *local_btf, __u32 l
>      struct bpf_core_cand local_cand = {};
>      struct bpf_core_cand_list *cands;
>      const struct btf *main_btf;
> +    const struct btf_type *local_t;
> +    const char *local_name;
>      size_t local_essent_len;
>      int err, i;
>
>      local_cand.btf = local_btf;
> -    local_cand.t = btf__type_by_id(local_btf, local_type_id);
> -    if (!local_cand.t)
> +    local_t = btf__type_by_id(local_btf, local_type_id);
> +    if (!local_t)
>          return ERR_PTR(-EINVAL);

Heh. Looks like you only compile-tested it :)
I was surprised that CO-RE in the kernel was working,
but libbpf CO-RE didn't :)
Thankfully the fix was simple:

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 24d1cbc30084..1341ce539662 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -5333,6 +5333,7 @@ bpf_core_find_cands(struct bpf_object *obj,
const struct btf *local_btf, __u32 l
        int err, i;

        local_cand.btf = local_btf;
+       local_cand.id = local_type_id;
        local_t = btf__type_by_id(local_btf, local_type_id);

Just fyi.

^ permalink raw reply related	[flat|nested] 28+ messages in thread

* Re: [PATCH v4 bpf-next 07/16] bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn().
  2021-11-30 23:06         ` Alexei Starovoitov
@ 2021-11-30 23:12           ` Andrii Nakryiko
  0 siblings, 0 replies; 28+ messages in thread
From: Andrii Nakryiko @ 2021-11-30 23:12 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Andrii Nakryiko, bpf, Kernel Team

On Tue, Nov 30, 2021 at 3:06 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Mon, Nov 29, 2021 at 8:10 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > oh, I thought you added those fields initially and forgot to delete or
> > something, didn't notice that you are just "opting them out" for
> > __KERNEL__. I think libbpf code doesn't strictly need this, here's the
> > diff that completely removes their use, it's pretty straightforward
> > and minimal, so maybe instead of #ifdef'ing let's just do that?
> >
> > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> > index b59fede08ba7..95fa57eea289 100644
> > --- a/tools/lib/bpf/libbpf.c
> > +++ b/tools/lib/bpf/libbpf.c
> > @@ -5179,15 +5179,18 @@ static int bpf_core_add_cands(struct
> > bpf_core_cand *local_cand,
> >                    struct bpf_core_cand_list *cands)
> >  {
> >      struct bpf_core_cand *new_cands, *cand;
> > -    const struct btf_type *t;
> > -    const char *targ_name;
> > +    const struct btf_type *t, *local_t;
> > +    const char *targ_name, *local_name;
> >      size_t targ_essent_len;
> >      int n, i;
> >
> > +    local_t = btf__type_by_id(local_cand->btf, local_cand->id);
> > +    local_name = btf__str_by_offset(local_cand->btf, local_t->name_off);
> > +
> >      n = btf__type_cnt(targ_btf);
> >      for (i = targ_start_id; i < n; i++) {
> >          t = btf__type_by_id(targ_btf, i);
> > -        if (btf_kind(t) != btf_kind(local_cand->t))
> > +        if (btf_kind(t) != btf_kind(local_t))
> >              continue;
> >
> >          targ_name = btf__name_by_offset(targ_btf, t->name_off);
> > @@ -5198,12 +5201,12 @@ static int bpf_core_add_cands(struct
> > bpf_core_cand *local_cand,
> >          if (targ_essent_len != local_essent_len)
> >              continue;
> >
> > -        if (strncmp(local_cand->name, targ_name, local_essent_len) != 0)
> > +        if (strncmp(local_name, targ_name, local_essent_len) != 0)
> >              continue;
> >
> >          pr_debug("CO-RE relocating [%d] %s %s: found target candidate
> > [%d] %s %s in [%s]\n",
> > -             local_cand->id, btf_kind_str(local_cand->t),
> > -             local_cand->name, i, btf_kind_str(t), targ_name,
> > +             local_cand->id, btf_kind_str(local_t),
> > +             local_name, i, btf_kind_str(t), targ_name,
> >               targ_btf_name);
> >          new_cands = libbpf_reallocarray(cands->cands, cands->len + 1,
> >                            sizeof(*cands->cands));
> > @@ -5212,8 +5215,6 @@ static int bpf_core_add_cands(struct
> > bpf_core_cand *local_cand,
> >
> >          cand = &new_cands[cands->len];
> >          cand->btf = targ_btf;
> > -        cand->t = t;
> > -        cand->name = targ_name;
> >          cand->id = i;
> >
> >          cands->cands = new_cands;
> > @@ -5320,18 +5321,20 @@ bpf_core_find_cands(struct bpf_object *obj,
> > const struct btf *local_btf, __u32 l
> >      struct bpf_core_cand local_cand = {};
> >      struct bpf_core_cand_list *cands;
> >      const struct btf *main_btf;
> > +    const struct btf_type *local_t;
> > +    const char *local_name;
> >      size_t local_essent_len;
> >      int err, i;
> >
> >      local_cand.btf = local_btf;
> > -    local_cand.t = btf__type_by_id(local_btf, local_type_id);
> > -    if (!local_cand.t)
> > +    local_t = btf__type_by_id(local_btf, local_type_id);
> > +    if (!local_t)
> >          return ERR_PTR(-EINVAL);
>
> Heh. Looks like you only compile-tested it :)

of course :)

> I was surprised that CO-RE in the kernel was working,
> but libbpf CO-RE didn't :)
> Thankfully the fix was simple:
>
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 24d1cbc30084..1341ce539662 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -5333,6 +5333,7 @@ bpf_core_find_cands(struct bpf_object *obj,
> const struct btf *local_btf, __u32 l
>         int err, i;
>
>         local_cand.btf = local_btf;
> +       local_cand.id = local_type_id;

fascinating, we didn't really use id because we just had type
directly, nice find, sorry about that

>         local_t = btf__type_by_id(local_btf, local_type_id);
>
> Just fyi.

^ permalink raw reply	[flat|nested] 28+ messages in thread

end of thread, other threads:[~2021-11-30 23:12 UTC | newest]

Thread overview: 28+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-11-24  6:01 [PATCH v4 bpf-next 00/16] bpf: CO-RE support in the kernel Alexei Starovoitov
2021-11-24  6:01 ` [PATCH v4 bpf-next 01/16] libbpf: Replace btf__type_by_id() with btf_type_by_id() Alexei Starovoitov
2021-11-24  6:01 ` [PATCH v4 bpf-next 02/16] bpf: Rename btf_member accessors Alexei Starovoitov
2021-11-24  6:01 ` [PATCH v4 bpf-next 03/16] bpf: Prepare relo_core.c for kernel duty Alexei Starovoitov
2021-11-24  6:01 ` [PATCH v4 bpf-next 04/16] bpf: Define enum bpf_core_relo_kind as uapi Alexei Starovoitov
2021-11-24  6:01 ` [PATCH v4 bpf-next 05/16] bpf: Pass a set of bpf_core_relo-s to prog_load command Alexei Starovoitov
2021-11-24  6:01 ` [PATCH v4 bpf-next 06/16] bpf: Adjust BTF log size limit Alexei Starovoitov
2021-11-24  6:02 ` [PATCH v4 bpf-next 07/16] bpf: Add bpf_core_add_cands() and wire it into bpf_core_apply_relo_insn() Alexei Starovoitov
2021-11-30  1:03   ` Andrii Nakryiko
2021-11-30  3:18     ` Alexei Starovoitov
2021-11-30  4:09       ` Andrii Nakryiko
2021-11-30  5:04         ` Alexei Starovoitov
2021-11-30  5:14           ` Andrii Nakryiko
2021-11-30 23:06         ` Alexei Starovoitov
2021-11-30 23:12           ` Andrii Nakryiko
2021-11-30  5:50   ` Alexei Starovoitov
2021-11-24  6:02 ` [PATCH v4 bpf-next 08/16] libbpf: Use CO-RE in the kernel in light skeleton Alexei Starovoitov
2021-11-30  1:11   ` Andrii Nakryiko
2021-11-24  6:02 ` [PATCH v4 bpf-next 09/16] libbpf: Support init of inner maps " Alexei Starovoitov
2021-11-24  6:02 ` [PATCH v4 bpf-next 10/16] libbpf: Clean gen_loader's attach kind Alexei Starovoitov
2021-11-30  1:11   ` Andrii Nakryiko
2021-11-24  6:02 ` [PATCH v4 bpf-next 11/16] selftests/bpf: Add lskel version of kfunc test Alexei Starovoitov
2021-11-24  6:02 ` [PATCH v4 bpf-next 12/16] selftests/bpf: Improve inner_map test coverage Alexei Starovoitov
2021-11-24  6:02 ` [PATCH v4 bpf-next 13/16] selftests/bpf: Convert map_ptr_kern test to use light skeleton Alexei Starovoitov
2021-11-24  6:02 ` [PATCH v4 bpf-next 14/16] selftests/bpf: Additional test for CO-RE in the kernel Alexei Starovoitov
2021-11-30  1:13   ` Andrii Nakryiko
2021-11-24  6:02 ` [PATCH v4 bpf-next 15/16] selftests/bpf: Revert CO-RE removal in test_ksyms_weak Alexei Starovoitov
2021-11-24  6:02 ` [PATCH v4 bpf-next 16/16] selftests/bpf: Add CO-RE relocations to verifier scale test Alexei Starovoitov
