bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next v3 0/3] libbpf: Implement BTFGen
@ 2021-12-17 18:56 Mauricio Vásquez
  2021-12-17 18:56 ` [PATCH bpf-next v3 1/3] libbpf: split bpf_core_apply_relo() Mauricio Vásquez
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Mauricio Vásquez @ 2021-12-17 18:56 UTC (permalink / raw)
  To: netdev, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Quentin Monnet, Rafael David Tinoco, Lorenzo Fontana,
	Leonardo Di Donato

CO-RE requires to have BTF information describing the types of the
kernel in order to perform the relocations. This is usually provided by
the kernel itself when it's configured with CONFIG_DEBUG_INFO_BTF.
However, this configuration is not enabled in all the distributions and
it's not available on kernels before 5.12.

It's possible to use CO-RE in kernels without CONFIG_DEBUG_INFO_BTF
support by providing the BTF information from an external source.
BTFHub[0] contains BTF files to each released kernel not supporting BTF,
for the most popular distributions.

Providing this BTF file for a given kernel has some challenges: 1. Each
BTF file is a few MBs big, then it's not possible to ship the eBPF
program with all the BTF files needed to run in different kernels. (The
BTF files will be in the order of GBs if you want to support a high
number of kernels) 2. Downloading the BTF file for the current kernel at
runtime delays the start of the program and it's not always possible to
reach an external host to download such a file.

Providing the BTF file with the information about all the data types of
the kernel for running an eBPF program is an overkill in many of the
cases. Usually the eBPF programs access only some kernel fields.

This series implements BTFGen support in bpftool by exposing some of the
internal libbpf's APIs to it.

This idea was discussed during the "Towards truly portable eBPF"[1]
presentation at Linux Plumbers 2021.

There is a good example[2] on how to use BTFGen and BTFHub together
to generate multiple BTF files, to each existing/supported kernel,
tailored to one application. For example: a complex bpf object might
support nearly 400 kernels by having BTF files summing only 1.5 MB.

[0]: https://github.com/aquasecurity/btfhub/
[1]: https://www.youtube.com/watch?v=igJLKyP1lFk&t=2418s
[2]: https://github.com/aquasecurity/btfhub/tree/main/tools

Changelog:

v2 > v3:
- expose internal libbpf APIs to bpftool instead
- implement btfgen in bpftool
- drop btf__raw_data() from libbpf

v1 > v2:
- introduce bpf_object__prepare() and ‘record_core_relos’ to expose
  CO-RE relocations instead of bpf_object__reloc_info_gen()
- rename btf__save_to_file() to btf__raw_data()

v1: https://lore.kernel.org/bpf/20211027203727.208847-1-mauricio@kinvolk.io/
v2: https://lore.kernel.org/bpf/20211116164208.164245-1-mauricio@kinvolk.io/

Mauricio Vásquez (3):
  libbpf: split bpf_core_apply_relo()
  libbpf: Implement changes needed for BTFGen in bpftool
  bpftool: Implement btfgen

 kernel/bpf/btf.c                |  11 +-
 tools/bpf/bpftool/gen.c         | 892 ++++++++++++++++++++++++++++++++
 tools/lib/bpf/Makefile          |   2 +-
 tools/lib/bpf/libbpf.c          | 124 +++--
 tools/lib/bpf/libbpf.h          |   3 +
 tools/lib/bpf/libbpf.map        |   2 +
 tools/lib/bpf/libbpf_internal.h |  22 +
 tools/lib/bpf/relo_core.c       |  83 +--
 tools/lib/bpf/relo_core.h       |  46 +-
 9 files changed, 1086 insertions(+), 99 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH bpf-next v3 1/3] libbpf: split bpf_core_apply_relo()
  2021-12-17 18:56 [PATCH bpf-next v3 0/3] libbpf: Implement BTFGen Mauricio Vásquez
@ 2021-12-17 18:56 ` Mauricio Vásquez
  2021-12-23  0:33   ` Andrii Nakryiko
  2021-12-17 18:56 ` [PATCH bpf-next v3 2/3] libbpf: Implement changes needed for BTFGen in bpftool Mauricio Vásquez
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Mauricio Vásquez @ 2021-12-17 18:56 UTC (permalink / raw)
  To: netdev, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Quentin Monnet, Rafael David Tinoco, Lorenzo Fontana,
	Leonardo Di Donato

BTFGen needs to run the core relocation logic in order to understand
what are the types in the target BTF that involved in a given
relocation.

Currently bpf_core_apply_relo() calculates and **applies** a relocation
to an instruction. Having both operations in the same function makes it
difficult to only calculate the relocation without patching the
instruction. This commit splits that logic in two different phases: (1)
calculate the relocation and (2) patch the instruction.

For the first phase bpf_core_apply_relo() is renamed to
bpf_core_calc_relo_res() who is now only on charge of calculating the
relocation, the second phase uses the already existing
bpf_core_patch_insn(). bpf_object__relocate_core() uses both of them and
the BTFGen will use only bpf_core_calc_relo_res().

Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
---
 kernel/bpf/btf.c          | 11 +++++-
 tools/lib/bpf/libbpf.c    | 54 ++++++++++++++++-----------
 tools/lib/bpf/relo_core.c | 77 +++++++++++----------------------------
 tools/lib/bpf/relo_core.h | 42 ++++++++++++++++++---
 4 files changed, 99 insertions(+), 85 deletions(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index a17de71abd2e..5a8f6ef6a341 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6734,6 +6734,7 @@ int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo,
 {
 	bool need_cands = relo->kind != BPF_CORE_TYPE_ID_LOCAL;
 	struct bpf_core_cand_list cands = {};
+	struct bpf_core_relo_res targ_res;
 	struct bpf_core_spec *specs;
 	int err;
 
@@ -6778,8 +6779,14 @@ int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo,
 		 */
 	}
 
-	err = bpf_core_apply_relo_insn((void *)ctx->log, insn, relo->insn_off / 8,
-				       relo, relo_idx, ctx->btf, &cands, specs);
+	err = bpf_core_calc_relo_insn((void *)ctx->log, relo, relo_idx, ctx->btf, &cands, specs,
+				      &targ_res);
+	if (err)
+		goto out;
+
+	err = bpf_core_patch_insn((void *)ctx->log, insn, relo->insn_off / 8, relo, relo_idx,
+				  &targ_res);
+
 out:
 	kfree(specs);
 	if (need_cands) {
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index cf862a19222b..77e2df13715a 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -5498,11 +5498,12 @@ static int record_relo_core(struct bpf_program *prog,
 	return 0;
 }
 
-static int bpf_core_apply_relo(struct bpf_program *prog,
-			       const struct bpf_core_relo *relo,
-			       int relo_idx,
-			       const struct btf *local_btf,
-			       struct hashmap *cand_cache)
+static int bpf_core_calc_relo_res(struct bpf_program *prog,
+				  const struct bpf_core_relo *relo,
+				  int relo_idx,
+				  const struct btf *local_btf,
+				  struct hashmap *cand_cache,
+				  struct bpf_core_relo_res *targ_res)
 {
 	struct bpf_core_spec specs_scratch[3] = {};
 	const void *type_key = u32_as_hash_key(relo->type_id);
@@ -5511,20 +5512,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog,
 	const struct btf_type *local_type;
 	const char *local_name;
 	__u32 local_id = relo->type_id;
-	struct bpf_insn *insn;
-	int insn_idx, err;
-
-	if (relo->insn_off % BPF_INSN_SZ)
-		return -EINVAL;
-	insn_idx = relo->insn_off / BPF_INSN_SZ;
-	/* adjust insn_idx from section frame of reference to the local
-	 * program's frame of reference; (sub-)program code is not yet
-	 * relocated, so it's enough to just subtract in-section offset
-	 */
-	insn_idx = insn_idx - prog->sec_insn_off;
-	if (insn_idx >= prog->insns_cnt)
-		return -EINVAL;
-	insn = &prog->insns[insn_idx];
+	int err;
 
 	local_type = btf__type_by_id(local_btf, local_id);
 	if (!local_type)
@@ -5536,6 +5524,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog,
 
 	if (prog->obj->gen_loader) {
 		const char *spec_str = btf__name_by_offset(local_btf, relo->access_str_off);
+		int insn_idx = relo->insn_off / BPF_INSN_SZ;
 
 		pr_debug("record_relo_core: prog %td insn[%d] %s %s %s final insn_idx %d\n",
 			prog - prog->obj->programs, relo->insn_off / 8,
@@ -5559,19 +5548,21 @@ static int bpf_core_apply_relo(struct bpf_program *prog,
 		}
 	}
 
-	return bpf_core_apply_relo_insn(prog_name, insn, insn_idx, relo,
-					relo_idx, local_btf, cands, specs_scratch);
+	return bpf_core_calc_relo_insn(prog_name, relo, relo_idx, local_btf, cands,
+				       specs_scratch, targ_res);
 }
 
 static int
 bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
 {
 	const struct btf_ext_info_sec *sec;
+	struct bpf_core_relo_res targ_res;
 	const struct bpf_core_relo *rec;
 	const struct btf_ext_info *seg;
 	struct hashmap_entry *entry;
 	struct hashmap *cand_cache = NULL;
 	struct bpf_program *prog;
+	struct bpf_insn *insn;
 	const char *sec_name;
 	int i, err = 0, insn_idx, sec_idx;
 
@@ -5636,12 +5627,31 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
 			if (!prog->load)
 				continue;
 
-			err = bpf_core_apply_relo(prog, rec, i, obj->btf, cand_cache);
+			err = bpf_core_calc_relo_res(prog, rec, i, obj->btf, cand_cache, &targ_res);
 			if (err) {
 				pr_warn("prog '%s': relo #%d: failed to relocate: %d\n",
 					prog->name, i, err);
 				goto out;
 			}
+
+			if (rec->insn_off % BPF_INSN_SZ)
+				return -EINVAL;
+			insn_idx = rec->insn_off / BPF_INSN_SZ;
+			/* adjust insn_idx from section frame of reference to the local
+			 * program's frame of reference; (sub-)program code is not yet
+			 * relocated, so it's enough to just subtract in-section offset
+			 */
+			insn_idx = insn_idx - prog->sec_insn_off;
+			if (insn_idx >= prog->insns_cnt)
+				return -EINVAL;
+			insn = &prog->insns[insn_idx];
+
+			err = bpf_core_patch_insn(prog->name, insn, insn_idx, rec, i, &targ_res);
+			if (err) {
+				pr_warn("prog '%s': relo #%d: failed to patch insn #%u: %d\n",
+					prog->name, i, insn_idx, err);
+				goto out;
+			}
 		}
 	}
 
diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c
index 910865e29edc..4f3552468624 100644
--- a/tools/lib/bpf/relo_core.c
+++ b/tools/lib/bpf/relo_core.c
@@ -775,31 +775,6 @@ static int bpf_core_calc_enumval_relo(const struct bpf_core_relo *relo,
 	return 0;
 }
 
-struct bpf_core_relo_res
-{
-	/* expected value in the instruction, unless validate == false */
-	__u32 orig_val;
-	/* new value that needs to be patched up to */
-	__u32 new_val;
-	/* relocation unsuccessful, poison instruction, but don't fail load */
-	bool poison;
-	/* some relocations can't be validated against orig_val */
-	bool validate;
-	/* for field byte offset relocations or the forms:
-	 *     *(T *)(rX + <off>) = rY
-	 *     rX = *(T *)(rY + <off>),
-	 * we remember original and resolved field size to adjust direct
-	 * memory loads of pointers and integers; this is necessary for 32-bit
-	 * host kernel architectures, but also allows to automatically
-	 * relocate fields that were resized from, e.g., u32 to u64, etc.
-	 */
-	bool fail_memsz_adjust;
-	__u32 orig_sz;
-	__u32 orig_type_id;
-	__u32 new_sz;
-	__u32 new_type_id;
-};
-
 /* Calculate original and target relocation values, given local and target
  * specs and relocation kind. These values are calculated for each candidate.
  * If there are multiple candidates, resulting values should all be consistent
@@ -951,9 +926,9 @@ static int insn_bytes_to_bpf_size(__u32 sz)
  * 5. *(T *)(rX + <off>) = rY, where T is one of {u8, u16, u32, u64};
  * 6. *(T *)(rX + <off>) = <imm>, where T is one of {u8, u16, u32, u64}.
  */
-static int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn,
-			       int insn_idx, const struct bpf_core_relo *relo,
-			       int relo_idx, const struct bpf_core_relo_res *res)
+int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn,
+			int insn_idx, const struct bpf_core_relo *relo,
+			int relo_idx, const struct bpf_core_relo_res *res)
 {
 	__u32 orig_val, new_val;
 	__u8 class;
@@ -1177,18 +1152,18 @@ static void bpf_core_dump_spec(const char *prog_name, int level, const struct bp
  *    between multiple relocations for the same type ID and is updated as some
  *    of the candidates are pruned due to structural incompatibility.
  */
-int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
-			     int insn_idx,
-			     const struct bpf_core_relo *relo,
-			     int relo_idx,
-			     const struct btf *local_btf,
-			     struct bpf_core_cand_list *cands,
-			     struct bpf_core_spec *specs_scratch)
+int bpf_core_calc_relo_insn(const char *prog_name,
+			    const struct bpf_core_relo *relo,
+			    int relo_idx,
+			    const struct btf *local_btf,
+			    struct bpf_core_cand_list *cands,
+			    struct bpf_core_spec *specs_scratch,
+			    struct bpf_core_relo_res *targ_res)
 {
 	struct bpf_core_spec *local_spec = &specs_scratch[0];
 	struct bpf_core_spec *cand_spec = &specs_scratch[1];
 	struct bpf_core_spec *targ_spec = &specs_scratch[2];
-	struct bpf_core_relo_res cand_res, targ_res;
+	struct bpf_core_relo_res cand_res;
 	const struct btf_type *local_type;
 	const char *local_name;
 	__u32 local_id;
@@ -1223,12 +1198,12 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
 	/* TYPE_ID_LOCAL relo is special and doesn't need candidate search */
 	if (relo->kind == BPF_CORE_TYPE_ID_LOCAL) {
 		/* bpf_insn's imm value could get out of sync during linking */
-		memset(&targ_res, 0, sizeof(targ_res));
-		targ_res.validate = false;
-		targ_res.poison = false;
-		targ_res.orig_val = local_spec->root_type_id;
-		targ_res.new_val = local_spec->root_type_id;
-		goto patch_insn;
+		memset(targ_res, 0, sizeof(*targ_res));
+		targ_res->validate = true;
+		targ_res->poison = false;
+		targ_res->orig_val = local_spec->root_type_id;
+		targ_res->new_val = local_spec->root_type_id;
+		return 0;
 	}
 
 	/* libbpf doesn't support candidate search for anonymous types */
@@ -1262,7 +1237,7 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
 			return err;
 
 		if (j == 0) {
-			targ_res = cand_res;
+			*targ_res = cand_res;
 			*targ_spec = *cand_spec;
 		} else if (cand_spec->bit_offset != targ_spec->bit_offset) {
 			/* if there are many field relo candidates, they
@@ -1272,7 +1247,8 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
 				prog_name, relo_idx, cand_spec->bit_offset,
 				targ_spec->bit_offset);
 			return -EINVAL;
-		} else if (cand_res.poison != targ_res.poison || cand_res.new_val != targ_res.new_val) {
+		} else if (cand_res.poison != targ_res->poison ||
+			   cand_res.new_val != targ_res->new_val) {
 			/* all candidates should result in the same relocation
 			 * decision and value, otherwise it's dangerous to
 			 * proceed due to ambiguity
@@ -1280,7 +1256,7 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
 			pr_warn("prog '%s': relo #%d: relocation decision ambiguity: %s %u != %s %u\n",
 				prog_name, relo_idx,
 				cand_res.poison ? "failure" : "success", cand_res.new_val,
-				targ_res.poison ? "failure" : "success", targ_res.new_val);
+				targ_res->poison ? "failure" : "success", targ_res->new_val);
 			return -EINVAL;
 		}
 
@@ -1314,19 +1290,10 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
 			 prog_name, relo_idx);
 
 		/* calculate single target relo result explicitly */
-		err = bpf_core_calc_relo(prog_name, relo, relo_idx, local_spec, NULL, &targ_res);
+		err = bpf_core_calc_relo(prog_name, relo, relo_idx, local_spec, NULL, targ_res);
 		if (err)
 			return err;
 	}
 
-patch_insn:
-	/* bpf_core_patch_insn() should know how to handle missing targ_spec */
-	err = bpf_core_patch_insn(prog_name, insn, insn_idx, relo, relo_idx, &targ_res);
-	if (err) {
-		pr_warn("prog '%s': relo #%d: failed to patch insn #%u: %d\n",
-			prog_name, relo_idx, relo->insn_off / 8, err);
-		return -EINVAL;
-	}
-
 	return 0;
 }
diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h
index 17799819ad7c..a28bf3711ce2 100644
--- a/tools/lib/bpf/relo_core.h
+++ b/tools/lib/bpf/relo_core.h
@@ -44,14 +44,44 @@ struct bpf_core_spec {
 	__u32 bit_offset;
 };
 
-int bpf_core_apply_relo_insn(const char *prog_name,
-			     struct bpf_insn *insn, int insn_idx,
-			     const struct bpf_core_relo *relo, int relo_idx,
-			     const struct btf *local_btf,
-			     struct bpf_core_cand_list *cands,
-			     struct bpf_core_spec *specs_scratch);
+struct bpf_core_relo_res {
+	/* expected value in the instruction, unless validate == false */
+	__u32 orig_val;
+	/* new value that needs to be patched up to */
+	__u32 new_val;
+	/* relocation unsuccessful, poison instruction, but don't fail load */
+	bool poison;
+	/* some relocations can't be validated against orig_val */
+	bool validate;
+	/* for field byte offset relocations or the forms:
+	 *     *(T *)(rX + <off>) = rY
+	 *     rX = *(T *)(rY + <off>),
+	 * we remember original and resolved field size to adjust direct
+	 * memory loads of pointers and integers; this is necessary for 32-bit
+	 * host kernel architectures, but also allows to automatically
+	 * relocate fields that were resized from, e.g., u32 to u64, etc.
+	 */
+	bool fail_memsz_adjust;
+	__u32 orig_sz;
+	__u32 orig_type_id;
+	__u32 new_sz;
+	__u32 new_type_id;
+};
+
 int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id,
 			      const struct btf *targ_btf, __u32 targ_id);
 
 size_t bpf_core_essential_name_len(const char *name);
+
+int bpf_core_calc_relo_insn(const char *prog_name,
+			    const struct bpf_core_relo *relo, int relo_idx,
+			    const struct btf *local_btf,
+			    struct bpf_core_cand_list *cands,
+			    struct bpf_core_spec *specs_scratch,
+			    struct bpf_core_relo_res *targ_res);
+
+int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn,
+			int insn_idx, const struct bpf_core_relo *relo,
+			int relo_idx, const struct bpf_core_relo_res *res);
+
 #endif
-- 
2.25.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH bpf-next v3 2/3] libbpf: Implement changes needed for BTFGen in bpftool
  2021-12-17 18:56 [PATCH bpf-next v3 0/3] libbpf: Implement BTFGen Mauricio Vásquez
  2021-12-17 18:56 ` [PATCH bpf-next v3 1/3] libbpf: split bpf_core_apply_relo() Mauricio Vásquez
@ 2021-12-17 18:56 ` Mauricio Vásquez
  2021-12-23  0:33   ` Andrii Nakryiko
  2021-12-17 18:56 ` [PATCH bpf-next v3 3/3] bpftool: Implement btfgen Mauricio Vásquez
  2021-12-17 23:11 ` [PATCH bpf-next v3 0/3] libbpf: Implement BTFGen Daniel Borkmann
  3 siblings, 1 reply; 12+ messages in thread
From: Mauricio Vásquez @ 2021-12-17 18:56 UTC (permalink / raw)
  To: netdev, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Quentin Monnet, Rafael David Tinoco, Lorenzo Fontana,
	Leonardo Di Donato

Given that BTFGen is a so unique use case, let's expose some of the
libbpf internal APIs to bpftool. This commit extends libbpf with the
features that are needed to implement it.

Specifically, this commit introduces the following functions:
- bpf_object__btf_ext(): Public method to access the BTF.ext section of
an bpf object.
- bpf_program__sec_idx(): Public method to get the sec index of a
program within the elf file.
- Implement bpf_core_create_cand_cache() and bpf_core_free_cand_cache()
to handle candidates cache.
- Expose bpf_core_calc_relo_res() in libbpfinternal.h and add "struct
bpf_core_spec *targ_spec" parameter.
- bpf_object_set_vmlinux_override(): Internal function to set
obj->btf_vmlinux_override.
- bpf_object__get_nr_programs() and bpf_object__get_program(): To give
access to the program inside an object. bpf_object__for_each_program
is not good enough because BTFGen needs to access subprograms too.

Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
---
 tools/lib/bpf/Makefile          |  2 +-
 tools/lib/bpf/libbpf.c          | 88 ++++++++++++++++++++++++++-------
 tools/lib/bpf/libbpf.h          |  3 ++
 tools/lib/bpf/libbpf.map        |  2 +
 tools/lib/bpf/libbpf_internal.h | 22 +++++++++
 tools/lib/bpf/relo_core.c       |  6 +--
 tools/lib/bpf/relo_core.h       |  4 ++
 7 files changed, 104 insertions(+), 23 deletions(-)

diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile
index f947b61b2107..dba019ee2832 100644
--- a/tools/lib/bpf/Makefile
+++ b/tools/lib/bpf/Makefile
@@ -239,7 +239,7 @@ install_lib: all_cmd
 
 SRC_HDRS := bpf.h libbpf.h btf.h libbpf_common.h libbpf_legacy.h xsk.h	     \
 	    bpf_helpers.h bpf_tracing.h bpf_endian.h bpf_core_read.h	     \
-	    skel_internal.h libbpf_version.h
+	    skel_internal.h libbpf_version.h relo_core.h libbpf_internal.h
 GEN_HDRS := $(BPF_GENERATED)
 
 INSTALL_PFX := $(DESTDIR)$(prefix)/include/bpf
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 77e2df13715a..7c8f8797475f 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -4027,8 +4027,8 @@ static bool prog_contains_insn(const struct bpf_program *prog, size_t insn_idx)
 	       insn_idx < prog->sec_insn_off + prog->sec_insn_cnt;
 }
 
-static struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj,
-						 size_t sec_idx, size_t insn_idx)
+struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj,
+					  size_t sec_idx, size_t insn_idx)
 {
 	int l = 0, r = obj->nr_programs - 1, m;
 	struct bpf_program *prog;
@@ -5498,12 +5498,13 @@ static int record_relo_core(struct bpf_program *prog,
 	return 0;
 }
 
-static int bpf_core_calc_relo_res(struct bpf_program *prog,
-				  const struct bpf_core_relo *relo,
-				  int relo_idx,
-				  const struct btf *local_btf,
-				  struct hashmap *cand_cache,
-				  struct bpf_core_relo_res *targ_res)
+int bpf_core_calc_relo_res(struct bpf_program *prog,
+			   const struct bpf_core_relo *relo,
+			   int relo_idx,
+			   const struct btf *local_btf,
+			   struct hashmap *cand_cache,
+			   struct bpf_core_relo_res *targ_res,
+			   struct bpf_core_spec *targ_spec)
 {
 	struct bpf_core_spec specs_scratch[3] = {};
 	const void *type_key = u32_as_hash_key(relo->type_id);
@@ -5548,8 +5549,32 @@ static int bpf_core_calc_relo_res(struct bpf_program *prog,
 		}
 	}
 
-	return bpf_core_calc_relo_insn(prog_name, relo, relo_idx, local_btf, cands,
-				       specs_scratch, targ_res);
+	err = bpf_core_calc_relo_insn(prog_name, relo, relo_idx, local_btf, cands,
+				      specs_scratch, targ_res);
+	if (err)
+		return err;
+
+	if (targ_spec)
+		*targ_spec = specs_scratch[2];
+	return 0;
+}
+
+struct hashmap *bpf_core_create_cand_cache(void)
+{
+	return hashmap__new(bpf_core_hash_fn, bpf_core_equal_fn, NULL);
+}
+
+void bpf_core_free_cand_cache(struct hashmap *cand_cache)
+{
+	struct hashmap_entry *entry;
+	int i;
+
+	if (!IS_ERR_OR_NULL(cand_cache)) {
+		hashmap__for_each_entry(cand_cache, entry, i) {
+			bpf_core_free_cands(entry->value);
+		}
+		hashmap__free(cand_cache);
+	}
 }
 
 static int
@@ -5559,7 +5584,6 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
 	struct bpf_core_relo_res targ_res;
 	const struct bpf_core_relo *rec;
 	const struct btf_ext_info *seg;
-	struct hashmap_entry *entry;
 	struct hashmap *cand_cache = NULL;
 	struct bpf_program *prog;
 	struct bpf_insn *insn;
@@ -5578,7 +5602,7 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
 		}
 	}
 
-	cand_cache = hashmap__new(bpf_core_hash_fn, bpf_core_equal_fn, NULL);
+	cand_cache = bpf_core_create_cand_cache();
 	if (IS_ERR(cand_cache)) {
 		err = PTR_ERR(cand_cache);
 		goto out;
@@ -5627,7 +5651,8 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
 			if (!prog->load)
 				continue;
 
-			err = bpf_core_calc_relo_res(prog, rec, i, obj->btf, cand_cache, &targ_res);
+			err = bpf_core_calc_relo_res(prog, rec, i, obj->btf, cand_cache, &targ_res,
+						     NULL);
 			if (err) {
 				pr_warn("prog '%s': relo #%d: failed to relocate: %d\n",
 					prog->name, i, err);
@@ -5660,12 +5685,8 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
 	btf__free(obj->btf_vmlinux_override);
 	obj->btf_vmlinux_override = NULL;
 
-	if (!IS_ERR_OR_NULL(cand_cache)) {
-		hashmap__for_each_entry(cand_cache, entry, i) {
-			bpf_core_free_cands(entry->value);
-		}
-		hashmap__free(cand_cache);
-	}
+	bpf_core_free_cand_cache(cand_cache);
+
 	return err;
 }
 
@@ -8190,6 +8211,11 @@ struct btf *bpf_object__btf(const struct bpf_object *obj)
 	return obj ? obj->btf : NULL;
 }
 
+struct btf_ext *bpf_object__btf_ext(const struct bpf_object *obj)
+{
+	return obj ? obj->btf_ext : NULL;
+}
+
 int bpf_object__btf_fd(const struct bpf_object *obj)
 {
 	return obj->btf ? btf__fd(obj->btf) : -1;
@@ -8281,6 +8307,20 @@ bpf_object__next_program(const struct bpf_object *obj, struct bpf_program *prev)
 	return prog;
 }
 
+size_t bpf_object__get_nr_programs(const struct bpf_object *obj)
+{
+	return obj->nr_programs;
+}
+
+struct bpf_program *
+bpf_object__get_program(const struct bpf_object *obj, unsigned int i)
+{
+	if (i >= obj->nr_programs)
+		return NULL;
+
+	return &obj->programs[i];
+}
+
 struct bpf_program *
 bpf_program__prev(struct bpf_program *next, const struct bpf_object *obj)
 {
@@ -8360,6 +8400,11 @@ int bpf_program__set_autoload(struct bpf_program *prog, bool autoload)
 	return 0;
 }
 
+int bpf_program__sec_idx(const struct bpf_program *prog)
+{
+	return prog->sec_idx;
+}
+
 static int bpf_program_nth_fd(const struct bpf_program *prog, int n);
 
 int bpf_program__fd(const struct bpf_program *prog)
@@ -11779,3 +11824,8 @@ void bpf_object__destroy_skeleton(struct bpf_object_skeleton *s)
 	free(s->progs);
 	free(s);
 }
+
+void bpf_object_set_vmlinux_override(struct bpf_object *obj, struct btf *btf)
+{
+	obj->btf_vmlinux_override = btf;
+}
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 42b2f36fd9f0..2b048ee5a9b2 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -225,6 +225,8 @@ LIBBPF_API int bpf_object__set_kversion(struct bpf_object *obj, __u32 kern_versi
 
 struct btf;
 LIBBPF_API struct btf *bpf_object__btf(const struct bpf_object *obj);
+struct btf_ext;
+LIBBPF_API struct btf_ext *bpf_object__btf_ext(const struct bpf_object *obj);
 LIBBPF_API int bpf_object__btf_fd(const struct bpf_object *obj);
 
 LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__find_program_by_name() instead")
@@ -290,6 +292,7 @@ LIBBPF_API LIBBPF_DEPRECATED("BPF program title is confusing term; please use bp
 const char *bpf_program__title(const struct bpf_program *prog, bool needs_copy);
 LIBBPF_API bool bpf_program__autoload(const struct bpf_program *prog);
 LIBBPF_API int bpf_program__set_autoload(struct bpf_program *prog, bool autoload);
+LIBBPF_API int bpf_program__sec_idx(const struct bpf_program *prog);
 
 /* returns program size in bytes */
 LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_program__insn_cnt() instead")
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index b3938b3f8fc9..15da4075e0b5 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -392,6 +392,7 @@ LIBBPF_0.6.0 {
 		bpf_map__map_extra;
 		bpf_map__set_map_extra;
 		bpf_map_create;
+		bpf_object__btf_ext;
 		bpf_object__next_map;
 		bpf_object__next_program;
 		bpf_object__prev_map;
@@ -401,6 +402,7 @@ LIBBPF_0.6.0 {
 		bpf_program__flags;
 		bpf_program__insn_cnt;
 		bpf_program__insns;
+		bpf_program__sec_idx;
 		bpf_program__set_flags;
 		btf__add_btf;
 		btf__add_decl_tag;
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index 5dbe4f463880..b1962adb110c 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -524,4 +524,26 @@ static inline int ensure_good_fd(int fd)
 	return fd;
 }
 
+struct hashmap;
+
+int bpf_core_calc_relo_res(struct bpf_program *prog,
+			   const struct bpf_core_relo *relo,
+			   int relo_idx,
+			   const struct btf *local_btf,
+			   struct hashmap *cand_cache,
+			   struct bpf_core_relo_res *targ_res,
+			   struct bpf_core_spec *targ_spec);
+void bpf_object_set_vmlinux_override(struct bpf_object *obj, struct btf *btf);
+struct hashmap *bpf_core_create_cand_cache(void);
+void bpf_core_free_cand_cache(struct hashmap *cand_cache);
+
+struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj,
+					  size_t sec_idx, size_t insn_idx);
+
+size_t bpf_object__get_nr_programs(const struct bpf_object *obj);
+
+struct bpf_program *
+bpf_object__get_program(const struct bpf_object *obj, unsigned int n);
+
+
 #endif /* __LIBBPF_LIBBPF_INTERNAL_H */
diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c
index 4f3552468624..66dfb7fa89a2 100644
--- a/tools/lib/bpf/relo_core.c
+++ b/tools/lib/bpf/relo_core.c
@@ -102,7 +102,7 @@ static const char *core_relo_kind_str(enum bpf_core_relo_kind kind)
 	}
 }
 
-static bool core_relo_is_field_based(enum bpf_core_relo_kind kind)
+bool core_relo_is_field_based(enum bpf_core_relo_kind kind)
 {
 	switch (kind) {
 	case BPF_CORE_FIELD_BYTE_OFFSET:
@@ -117,7 +117,7 @@ static bool core_relo_is_field_based(enum bpf_core_relo_kind kind)
 	}
 }
 
-static bool core_relo_is_type_based(enum bpf_core_relo_kind kind)
+bool core_relo_is_type_based(enum bpf_core_relo_kind kind)
 {
 	switch (kind) {
 	case BPF_CORE_TYPE_ID_LOCAL:
@@ -130,7 +130,7 @@ static bool core_relo_is_type_based(enum bpf_core_relo_kind kind)
 	}
 }
 
-static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind)
+bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind)
 {
 	switch (kind) {
 	case BPF_CORE_ENUMVAL_EXISTS:
diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h
index a28bf3711ce2..e969dfb032f4 100644
--- a/tools/lib/bpf/relo_core.h
+++ b/tools/lib/bpf/relo_core.h
@@ -84,4 +84,8 @@ int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn,
 			int insn_idx, const struct bpf_core_relo *relo,
 			int relo_idx, const struct bpf_core_relo_res *res);
 
+bool core_relo_is_field_based(enum bpf_core_relo_kind kind);
+bool core_relo_is_type_based(enum bpf_core_relo_kind kind);
+bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind);
+
 #endif
-- 
2.25.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH bpf-next v3 3/3] bpftool: Implement btfgen
  2021-12-17 18:56 [PATCH bpf-next v3 0/3] libbpf: Implement BTFGen Mauricio Vásquez
  2021-12-17 18:56 ` [PATCH bpf-next v3 1/3] libbpf: split bpf_core_apply_relo() Mauricio Vásquez
  2021-12-17 18:56 ` [PATCH bpf-next v3 2/3] libbpf: Implement changes needed for BTFGen in bpftool Mauricio Vásquez
@ 2021-12-17 18:56 ` Mauricio Vásquez
  2021-12-23  0:33   ` Andrii Nakryiko
  2021-12-17 23:11 ` [PATCH bpf-next v3 0/3] libbpf: Implement BTFGen Daniel Borkmann
  3 siblings, 1 reply; 12+ messages in thread
From: Mauricio Vásquez @ 2021-12-17 18:56 UTC (permalink / raw)
  To: netdev, bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Quentin Monnet, Rafael David Tinoco, Lorenzo Fontana,
	Leonardo Di Donato

The BTFGen's goal is to produce a BTF file that contains **only** the
information that is needed by an eBPF program. This algorithm does a
first step collecting the types involved for each relocation present on
the object and "marking" them as needed. Types are collected in
different ways according to the type relocation, for field based
relocations only the union and structures members involved are
considered, for type based relocations the whole types are added, enum
field relocations are not supported in this iteration yet.

A second step generates a BTF file from the "marked" types. This step
accesses the original BTF file extracting the types and their members
that were "marked" as needed in the first step.

This command is implemented under the "gen" command in bpftool and the
syntax is the following:

$ bpftool gen btf INPUT OUTPUT OBJECT(S)

INPUT can be either a single BTF file or a folder containing BTF files,
when it's a folder, a BTF file is generated for each BTF file contained
in this folder. OUTPUT is the file (or folder) where generated files are
stored and OBJECT(S) is the list of bpf objects we want to generate the
BTF file(s) for (each generated BTF file contains all the types needed
by all the objects).

Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
---
 tools/bpf/bpftool/gen.c | 892 ++++++++++++++++++++++++++++++++++++++++
 1 file changed, 892 insertions(+)

diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
index b4695df2ea3d..2794fe88f609 100644
--- a/tools/bpf/bpftool/gen.c
+++ b/tools/bpf/bpftool/gen.c
@@ -5,6 +5,7 @@
 #define _GNU_SOURCE
 #endif
 #include <ctype.h>
+#include <dirent.h>
 #include <errno.h>
 #include <fcntl.h>
 #include <linux/err.h>
@@ -14,6 +15,7 @@
 #include <unistd.h>
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
+#include <bpf/libbpf_internal.h>
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <sys/mman.h>
@@ -1084,6 +1086,7 @@ static int do_help(int argc, char **argv)
 	fprintf(stderr,
 		"Usage: %1$s %2$s object OUTPUT_FILE INPUT_FILE [INPUT_FILE...]\n"
 		"       %1$s %2$s skeleton FILE [name OBJECT_NAME]\n"
+		"       %1$s %2$s btf INPUT OUTPUT OBJECT(S)\n"
 		"       %1$s %2$s help\n"
 		"\n"
 		"       " HELP_SPEC_OPTIONS " |\n"
@@ -1094,9 +1097,898 @@ static int do_help(int argc, char **argv)
 	return 0;
 }
 
+static int btf_save_raw(const struct btf *btf, const char *path)
+{
+	const void *data;
+	FILE *f = NULL;
+	__u32 data_sz;
+	int err = 0;
+
+	data = btf__raw_data(btf, &data_sz);
+	if (!data) {
+		err = -ENOMEM;
+		goto out;
+	}
+
+	f = fopen(path, "wb");
+	if (!f) {
+		err = -errno;
+		goto out;
+	}
+
+	if (fwrite(data, 1, data_sz, f) != data_sz) {
+		err = -errno;
+		goto out;
+	}
+
+out:
+	if (f)
+		fclose(f);
+	return libbpf_err(err);
+}
+
+struct btf_reloc_member {
+	struct btf_member *member;
+	int idx;
+};
+
+struct btf_reloc_type {
+	struct btf_type *type;
+	unsigned int id;
+	bool added_by_all;
+
+	struct hashmap *members;
+};
+
+struct btf_reloc_info {
+	struct hashmap *types;
+	struct hashmap *ids_map;
+
+	struct btf *src_btf;
+};
+
+static size_t bpf_reloc_info_hash_fn(const void *key, void *ctx)
+{
+	return (size_t)key;
+}
+
+static bool bpf_reloc_info_equal_fn(const void *k1, const void *k2, void *ctx)
+{
+	return k1 == k2;
+}
+
+static void *uint_as_hash_key(int x)
+{
+	return (void *)(uintptr_t)x;
+}
+
+static void bpf_reloc_type_free(struct btf_reloc_type *type)
+{
+	struct hashmap_entry *entry;
+	size_t bkt;
+
+	if (IS_ERR_OR_NULL(type))
+		return;
+
+	if (!IS_ERR_OR_NULL(type->members)) {
+		hashmap__for_each_entry(type->members, entry, bkt) {
+			free(entry->value);
+		}
+		hashmap__free(type->members);
+	}
+
+	free(type);
+}
+
+static void btfgen_reloc_info_free(struct btf_reloc_info *info)
+{
+	struct hashmap_entry *entry;
+	size_t bkt;
+
+	if (!info)
+		return;
+
+	hashmap__free(info->ids_map);
+
+	if (!IS_ERR_OR_NULL(info->types)) {
+		hashmap__for_each_entry(info->types, entry, bkt) {
+			bpf_reloc_type_free(entry->value);
+		}
+		hashmap__free(info->types);
+	}
+
+	btf__free(info->src_btf);
+
+	free(info);
+}
+
+static struct btf_reloc_info *
+btfgen_reloc_info_new(const char *targ_btf_path)
+{
+	struct btf_reloc_info *info;
+	struct btf *src_btf;
+	struct hashmap *ids_map;
+	struct hashmap *types;
+
+	info = calloc(1, sizeof(*info));
+	if (!info)
+		return ERR_PTR(-ENOMEM);
+
+	src_btf = btf__parse(targ_btf_path, NULL);
+	if (libbpf_get_error(src_btf)) {
+		btfgen_reloc_info_free(info);
+		return (void *) src_btf;
+	}
+
+	info->src_btf = src_btf;
+
+	ids_map = hashmap__new(bpf_reloc_info_hash_fn, bpf_reloc_info_equal_fn, NULL);
+	if (IS_ERR(ids_map)) {
+		btfgen_reloc_info_free(info);
+		return (void *) ids_map;
+	}
+
+	info->ids_map = ids_map;
+
+	types = hashmap__new(bpf_reloc_info_hash_fn, bpf_reloc_info_equal_fn, NULL);
+	if (IS_ERR(types)) {
+		btfgen_reloc_info_free(info);
+		return (void *) types;
+	}
+
+	info->types = types;
+
+	return info;
+}
+
+/* Return id for type in new btf instance */
+static unsigned int btf_reloc_id_get(struct btf_reloc_info *info, unsigned int old)
+{
+	uintptr_t new = 0;
+
+	/* deal with BTF_KIND_VOID */
+	if (old == 0)
+		return 0;
+
+	if (!hashmap__find(info->ids_map, uint_as_hash_key(old), (void **)&new)) {
+		/* return id for void as it's possible that the ID we're looking for is
+		 * the type of a pointer that we're not adding.
+		 */
+		return 0;
+	}
+
+	return (unsigned int)(uintptr_t)new;
+}
+
+/* Add new id map to the list of mappings */
+static int btf_reloc_id_add(struct btf_reloc_info *info, unsigned int old, unsigned int new)
+{
+	return hashmap__add(info->ids_map, uint_as_hash_key(old), uint_as_hash_key(new));
+}
+
+/*
+ * Put type in the list. If the type already exists it's returned, otherwise a
+ * new one is created and added to the list. This is called recursively adding
+ * all the types that are needed for the current one.
+ */
+static struct btf_reloc_type *btf_reloc_put_type(struct btf *btf,
+						 struct btf_reloc_info *info,
+						 struct btf_type *btf_type,
+						 unsigned int id)
+{
+	struct btf_reloc_type *reloc_type, *tmp;
+	struct btf_array *array;
+	unsigned int child_id;
+	int err;
+
+	/* check if we already have this type */
+	if (hashmap__find(info->types, uint_as_hash_key(id), (void **) &reloc_type))
+		return reloc_type;
+
+
+	/* do nothing. void is implicit in BTF */
+	if (id == 0)
+		return NULL;
+
+	reloc_type = calloc(1, sizeof(*reloc_type));
+	if (!reloc_type)
+		return ERR_PTR(-ENOMEM);
+
+	reloc_type->type = btf_type;
+	reloc_type->id = id;
+
+	/* append this type to the relocation type's list before anything else */
+	err = hashmap__add(info->types, uint_as_hash_key(reloc_type->id), reloc_type);
+	if (err)
+		return ERR_PTR(err);
+
+	/* complex types might need further processing */
+	switch (btf_kind(reloc_type->type)) {
+	/* already processed */
+	case BTF_KIND_UNKN:
+	case BTF_KIND_INT:
+	case BTF_KIND_FLOAT:
+	/* processed by callee */
+	case BTF_KIND_STRUCT:
+	case BTF_KIND_UNION:
+	/* doesn't need resolution. If the data of the pointer is used
+	 * then it'll added by the caller in another relocation.
+	 */
+	case BTF_KIND_PTR:
+		break;
+	/* needs resolution */
+	case BTF_KIND_CONST:
+	case BTF_KIND_VOLATILE:
+	case BTF_KIND_TYPEDEF:
+		child_id = btf_type->type;
+		btf_type = (struct btf_type *) btf__type_by_id(btf, child_id);
+		if (!btf_type)
+			return ERR_PTR(-EINVAL);
+
+		tmp = btf_reloc_put_type(btf, info, btf_type, child_id);
+		if (IS_ERR(tmp))
+			return tmp;
+		break;
+	/* needs resolution */
+	case BTF_KIND_ARRAY:
+		array = btf_array(reloc_type->type);
+
+		/* add type for array type */
+		btf_type = (struct btf_type *) btf__type_by_id(btf, array->type);
+		tmp = btf_reloc_put_type(btf, info, btf_type, array->type);
+		if (IS_ERR(tmp))
+			return tmp;
+
+		/* add type for array's index type */
+		btf_type = (struct btf_type *) btf__type_by_id(btf, array->index_type);
+		tmp = btf_reloc_put_type(btf, info, btf_type, array->index_type);
+		if (IS_ERR(tmp))
+			return tmp;
+
+		break;
+	/* tells if some other type needs to be handled */
+	default:
+		p_err("unsupported relocation: %d", reloc_type->id);
+		return ERR_PTR(-EINVAL);
+	}
+
+	return reloc_type;
+}
+
+/* Return pointer to btf_reloc_type by id */
+static struct btf_reloc_type *btf_reloc_get_type(struct btf_reloc_info *info, int id)
+{
+	struct btf_reloc_type *type = NULL;
+
+	if (!hashmap__find(info->types, uint_as_hash_key(id), (void **)&type))
+		return ERR_PTR(-ENOENT);
+
+	return type;
+}
+
+static int bpf_reloc_type_add_member(struct btf_reloc_info *info,
+				     struct btf_reloc_type *reloc_type,
+				     struct btf_member *btf_member, int idx)
+{
+	struct btf_reloc_member *reloc_member;
+	int err;
+
+	/* create new members hashmap for this relocation type if needed */
+	if (reloc_type->members == NULL) {
+		struct hashmap *tmp = hashmap__new(bpf_reloc_info_hash_fn,
+						   bpf_reloc_info_equal_fn,
+						   NULL);
+		if (IS_ERR(tmp))
+			return PTR_ERR(tmp);
+
+		reloc_type->members = tmp;
+	}
+	/* add given btf_member as a member of the parent relocation_type's type */
+	reloc_member = calloc(1, sizeof(*reloc_member));
+	if (!reloc_member)
+		return -ENOMEM;
+	reloc_member->member = btf_member;
+	reloc_member->idx = idx;
+	/* add given btf_member as member to given relocation type */
+	err = hashmap__add(reloc_type->members, uint_as_hash_key(reloc_member->idx), reloc_member);
+	if (err) {
+		free(reloc_member);
+		if (err != -EEXIST)
+			return err;
+	}
+
+	return 0;
+}
+
+/*
+ * Same as btf_reloc_put_type, but adding all fields, from given complex type, recursively
+ */
+static int btf_reloc_put_type_all(struct btf *btf,
+				  struct btf_reloc_info *info,
+				  struct btf_type *btf_type,
+				  unsigned int id)
+{
+	struct btf_reloc_type *reloc_type;
+	struct btf_array *array;
+	unsigned int child_id;
+	struct btf_member *m;
+	int err, i, n;
+
+	if (id == 0)
+		return 0;
+
+	if (!hashmap__find(info->types, uint_as_hash_key(id), (void **) &reloc_type)) {
+		reloc_type = calloc(1, sizeof(*reloc_type));
+		if (!reloc_type)
+			return -ENOMEM;
+
+		reloc_type->type = btf_type;
+		reloc_type->id = id;
+		/* avoid infinite recursion and yet be able to add all
+		 * fields/members for types also managed by this function twin
+		 * brother btf_reloc_put_type()
+		 */
+		reloc_type->added_by_all = true;
+
+		err = hashmap__add(info->types, uint_as_hash_key(reloc_type->id), reloc_type);
+		if (err)
+			return err;
+	} else {
+		if (reloc_type->added_by_all)
+			return 0;
+
+		reloc_type->added_by_all = true;
+	}
+
+	switch (btf_kind(reloc_type->type)) {
+	case BTF_KIND_UNKN:
+	case BTF_KIND_INT:
+	case BTF_KIND_FLOAT:
+	case BTF_KIND_ENUM:
+		/* not a complex type, already solved */
+		break;
+	case BTF_KIND_STRUCT:
+	case BTF_KIND_UNION:
+		n = btf_vlen(btf_type);
+		m = btf_members(btf_type);
+		for (i = 0; i < n; i++, m++) {
+			btf_type = (struct btf_type *) btf__type_by_id(btf, m->type);
+			if (!btf_type)
+				return -EINVAL;
+
+			/* add all member types */
+			err = btf_reloc_put_type_all(btf, info, btf_type, m->type);
+			if (err)
+				return err;
+
+			/* add all members */
+			err = bpf_reloc_type_add_member(info, reloc_type, m, i);
+			if (err)
+				return err;
+		}
+		break;
+	case BTF_KIND_PTR:
+	case BTF_KIND_CONST:
+	case BTF_KIND_VOLATILE:
+	case BTF_KIND_TYPEDEF:
+		/* modifier types */
+		child_id = btf_type->type;
+		btf_type = (struct btf_type *) btf__type_by_id(btf, child_id);
+		if (!btf_type)
+			return -EINVAL;
+
+		err = btf_reloc_put_type_all(btf, info, btf_type, child_id);
+		if (err)
+			return err;
+		break;
+	case BTF_KIND_ARRAY:
+		array = btf_array(btf_type);
+
+		/* add array member type */
+		btf_type = (struct btf_type *) btf__type_by_id(btf, array->type);
+		if (!btf_type)
+			return -EINVAL;
+		err = btf_reloc_put_type_all(btf, info, btf_type, array->type);
+		if (err)
+			return err;
+
+		/* add array index type */
+		btf_type = (struct btf_type *) btf__type_by_id(btf, array->index_type);
+		if (!btf_type)
+			return -EINVAL;
+		err = btf_reloc_put_type_all(btf, info, btf_type, array->type);
+		if (err)
+			return err;
+		break;
+	default:
+		p_err("unsupported kind (all): %s (%d)",
+		      btf_kind_str(reloc_type->type), reloc_type->id);
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+static int btf_reloc_info_gen_field(struct btf_reloc_info *info, struct bpf_core_spec *targ_spec)
+{
+	struct btf *btf = (struct btf *) info->src_btf;
+	struct btf_reloc_type *reloc_type;
+	struct btf_member *btf_member;
+	struct btf_type *btf_type;
+	struct btf_array *array;
+	unsigned int id;
+	int idx, err;
+
+	btf_type = (struct btf_type *) btf__type_by_id(btf, targ_spec->root_type_id);
+
+	/* create reloc type for root type */
+	reloc_type = btf_reloc_put_type(btf, info, btf_type, targ_spec->root_type_id);
+	if (IS_ERR(reloc_type))
+		return PTR_ERR(reloc_type);
+
+	/* add types for complex types (arrays, unions, structures) */
+	for (int i = 1; i < targ_spec->raw_len; i++) {
+		/* skip typedefs and mods */
+		while (btf_is_mod(btf_type) || btf_is_typedef(btf_type)) {
+			id = btf_type->type;
+			reloc_type = btf_reloc_get_type(info, id);
+			if (IS_ERR(reloc_type))
+				return PTR_ERR(reloc_type);
+			btf_type = (struct btf_type *) btf__type_by_id(btf, id);
+		}
+
+		switch (btf_kind(btf_type)) {
+		case BTF_KIND_STRUCT:
+		case BTF_KIND_UNION:
+			idx = targ_spec->raw_spec[i];
+			btf_member = btf_members(btf_type) + idx;
+			btf_type =  (struct btf_type *) btf__type_by_id(btf, btf_member->type);
+
+			/* add member to relocation type */
+			err = bpf_reloc_type_add_member(info, reloc_type, btf_member, idx);
+			if (err)
+				return err;
+			/* add relocation type */
+			reloc_type = btf_reloc_put_type(btf, info, btf_type, btf_member->type);
+			if (IS_ERR(reloc_type))
+				return PTR_ERR(reloc_type);
+			break;
+		case BTF_KIND_ARRAY:
+			array = btf_array(btf_type);
+			reloc_type = btf_reloc_get_type(info, array->type);
+			if (IS_ERR(reloc_type))
+				return PTR_ERR(reloc_type);
+			btf_type = (struct btf_type *) btf__type_by_id(btf, array->type);
+			break;
+		default:
+			p_err("spec type wasn't handled");
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static int btf_reloc_info_gen_type(struct btf_reloc_info *info, struct bpf_core_spec *targ_spec)
+{
+	struct btf *btf = (struct btf *) info->src_btf;
+	struct btf_type *btf_type;
+	int err = 0;
+
+	btf_type = (struct btf_type *) btf__type_by_id(btf, targ_spec->root_type_id);
+
+	return btf_reloc_put_type_all(btf, info, btf_type, targ_spec->root_type_id);
+}
+
+static int btf_reloc_info_gen_enumval(struct btf_reloc_info *info, struct bpf_core_spec *targ_spec)
+{
+	p_err("untreated enumval based relocation");
+	return -EOPNOTSUPP;
+}
+
+static int btf_reloc_info_gen(struct btf_reloc_info *info, struct bpf_core_spec *res)
+{
+	if (core_relo_is_type_based(res->relo_kind))
+		return btf_reloc_info_gen_type(info, res);
+
+	if (core_relo_is_enumval_based(res->relo_kind))
+		return btf_reloc_info_gen_enumval(info, res);
+
+	if (core_relo_is_field_based(res->relo_kind))
+		return btf_reloc_info_gen_field(info, res);
+
+	return -EINVAL;
+}
+
+#define BPF_INSN_SZ (sizeof(struct bpf_insn))
+
+static int btfgen_obj_reloc_info_gen(struct btf_reloc_info *reloc_info, struct bpf_object *obj)
+{
+	const struct btf_ext_info_sec *sec;
+	const struct bpf_core_relo *rec;
+	const struct btf_ext_info *seg;
+	struct hashmap *cand_cache;
+	int err, insn_idx, sec_idx;
+	struct bpf_program *prog;
+	struct btf_ext *btf_ext;
+	const char *sec_name;
+	size_t nr_programs;
+	struct btf *btf;
+	unsigned int i;
+
+	btf = bpf_object__btf(obj);
+	btf_ext = bpf_object__btf_ext(obj);
+
+	if (btf_ext->core_relo_info.len == 0)
+		return 0;
+
+	cand_cache = bpf_core_create_cand_cache();
+	if (IS_ERR(cand_cache))
+		return PTR_ERR(cand_cache);
+
+	bpf_object_set_vmlinux_override(obj, reloc_info->src_btf);
+
+	seg = &btf_ext->core_relo_info;
+	for_each_btf_ext_sec(seg, sec) {
+		bool prog_found;
+
+		sec_name = btf__name_by_offset(btf, sec->sec_name_off);
+		if (str_is_empty(sec_name)) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		prog_found = false;
+		nr_programs = bpf_object__get_nr_programs(obj);
+		for (i = 0; i < nr_programs; i++)	{
+			prog = bpf_object__get_program(obj, i);
+			if (strcmp(bpf_program__section_name(prog), sec_name) == 0) {
+				prog_found = true;
+				break;
+			}
+		}
+
+		if (!prog_found) {
+			pr_warn("sec '%s': failed to find a BPF program\n", sec_name);
+			err = -EINVAL;
+			goto out;
+		}
+
+		sec_idx = bpf_program__sec_idx(prog);
+
+		for_each_btf_ext_rec(seg, sec, i, rec) {
+			struct bpf_core_relo_res targ_res;
+			struct bpf_core_spec targ_spec;
+
+			insn_idx = rec->insn_off / BPF_INSN_SZ;
+
+			prog = find_prog_by_sec_insn(obj, sec_idx, insn_idx);
+			if (!prog) {
+				pr_warn("sec '%s': failed to find program at insn #%d for CO-RE offset relocation #%d\n",
+					sec_name, insn_idx, i);
+				err = -EINVAL;
+				goto out;
+			}
+
+			err = bpf_core_calc_relo_res(prog, rec, i, btf, cand_cache, &targ_res,
+						     &targ_spec);
+			if (err)
+				goto out;
+
+			err = btf_reloc_info_gen(reloc_info, &targ_spec);
+			if (err)
+				goto out;
+		}
+	}
+
+out:
+	bpf_core_free_cand_cache(cand_cache);
+
+	return err;
+}
+
+static struct btf *btfgen_reloc_info_get_btf(struct btf_reloc_info *info)
+{
+	struct hashmap_entry *entry;
+	struct btf *btf_new;
+	size_t bkt;
+	int err;
+
+	btf_new = btf__new_empty();
+	if (IS_ERR(btf_new)) {
+		p_err("failed to allocate btf structure");
+		return btf_new;
+	}
+
+	/* first pass: add all types and add their new ids to the ids map */
+	hashmap__for_each_entry(info->types, entry, bkt) {
+		struct btf_reloc_type *reloc_type = entry->value;
+		struct btf_type *btf_type = reloc_type->type;
+		int new_id;
+
+		/* add members for struct and union */
+		if (btf_is_struct(btf_type) || btf_is_union(btf_type)) {
+			struct hashmap_entry *member_entry;
+			struct btf_type *btf_type_cpy;
+			int nmembers, index;
+			size_t new_size;
+
+			nmembers = reloc_type->members ? hashmap__size(reloc_type->members) : 0;
+			new_size = sizeof(struct btf_type) + nmembers * sizeof(struct btf_member);
+
+			btf_type_cpy = malloc(new_size);
+			if (!btf_type_cpy) {
+				err = -ENOMEM;
+				goto out;
+			}
+
+			/* copy header */
+			memcpy(btf_type_cpy, btf_type, sizeof(*btf_type_cpy));
+
+			/* copy only members that are needed */
+			index = 0;
+			if (nmembers > 0) {
+				size_t bkt2;
+
+				hashmap__for_each_entry(reloc_type->members, member_entry, bkt2) {
+					struct btf_reloc_member *reloc_member;
+					struct btf_member *btf_member;
+
+					reloc_member = member_entry->value;
+					btf_member = btf_members(btf_type) + reloc_member->idx;
+
+					memcpy(btf_members(btf_type_cpy) + index, btf_member,
+					       sizeof(struct btf_member));
+
+					index++;
+				}
+			}
+
+			/* set new vlen */
+			btf_type_cpy->info = btf_type_info(btf_kind(btf_type_cpy), nmembers,
+							   btf_kflag(btf_type_cpy));
+
+			err = btf__add_type(btf_new, info->src_btf, btf_type_cpy);
+			free(btf_type_cpy);
+		} else {
+			err = btf__add_type(btf_new, info->src_btf, btf_type);
+		}
+
+		if (err < 0)
+			goto out;
+
+		new_id = err;
+
+		/* add ID mapping */
+		err = btf_reloc_id_add(info, reloc_type->id, new_id);
+		if (err)
+			goto out;
+	}
+
+	/* second pass: fix up type ids */
+	for (unsigned int i = 0; i < btf__type_cnt(btf_new); i++) {
+		struct btf_member *btf_member;
+		struct btf_type *btf_type;
+		struct btf_param *params;
+		struct btf_array *array;
+
+		btf_type = (struct btf_type *) btf__type_by_id(btf_new, i);
+
+		switch (btf_kind(btf_type)) {
+		case BTF_KIND_STRUCT:
+		case BTF_KIND_UNION:
+			for (unsigned short j = 0; j < btf_vlen(btf_type); j++) {
+				btf_member = btf_members(btf_type) + j;
+				btf_member->type = btf_reloc_id_get(info, btf_member->type);
+			}
+			break;
+		case BTF_KIND_PTR:
+		case BTF_KIND_TYPEDEF:
+		case BTF_KIND_VOLATILE:
+		case BTF_KIND_CONST:
+		case BTF_KIND_RESTRICT:
+		case BTF_KIND_FUNC:
+		case BTF_KIND_VAR:
+			btf_type->type = btf_reloc_id_get(info, btf_type->type);
+			break;
+		case BTF_KIND_ARRAY:
+			array = btf_array(btf_type);
+			array->index_type = btf_reloc_id_get(info, array->index_type);
+			array->type = btf_reloc_id_get(info, array->type);
+			break;
+		case BTF_KIND_FUNC_PROTO:
+			btf_type->type = btf_reloc_id_get(info, btf_type->type);
+			params = btf_params(btf_type);
+			for (unsigned short j = 0; j < btf_vlen(btf_type); j++)
+				params[j].type = btf_reloc_id_get(info, params[j].type);
+			break;
+		default:
+			break;
+		}
+	}
+
+	return btf_new;
+
+out:
+	btf__free(btf_new);
+	return ERR_PTR(err);
+}
+
+static int is_file(const char *path)
+{
+	struct stat st;
+
+	if (stat(path, &st) < 0)
+		return -1;
+
+	switch (st.st_mode & S_IFMT) {
+	case S_IFDIR:
+		return 0;
+	case S_IFREG:
+		return 1;
+	default:
+		return -1;
+	}
+
+	return -1;
+}
+
+static int generate_btf(const char *src_btf, const char *dst_btf, const char *objspaths[])
+{
+	struct btf_reloc_info *reloc_info;
+	struct btf *btf_new = NULL;
+	struct bpf_object *obj;
+	int err;
+
+	struct bpf_object_open_opts ops = {
+		.sz = sizeof(ops),
+		.btf_custom_path = src_btf,
+	};
+
+	reloc_info = btfgen_reloc_info_new(src_btf);
+	err = libbpf_get_error(reloc_info);
+	if (err) {
+		p_err("failed to allocate info structure");
+		goto out;
+	}
+
+	for (int i = 0; objspaths[i] != NULL; i++) {
+		printf("OBJ : %s\n", objspaths[i]);
+		obj = bpf_object__open_file(objspaths[i], &ops);
+		err = libbpf_get_error(obj);
+		if (err) {
+			p_err("error opening object: %s", strerror(errno));
+			goto out;
+		}
+
+		err = btfgen_obj_reloc_info_gen(reloc_info, obj);
+		if (err)
+			goto out;
+
+		bpf_object__close(obj);
+	}
+
+	btf_new = btfgen_reloc_info_get_btf(reloc_info);
+	err = libbpf_get_error(btf_new);
+	if (err) {
+		p_err("error generating btf: %s", strerror(errno));
+		goto out;
+	}
+
+	printf("DBTF: %s\n", dst_btf);
+	err = btf_save_raw(btf_new, dst_btf);
+	if (err) {
+		p_err("error saving btf file: %s", strerror(errno));
+		goto out;
+	}
+
+out:
+	if (!libbpf_get_error(btf_new))
+		btf__free(btf_new);
+	btfgen_reloc_info_free(reloc_info);
+
+	return err;
+}
+
+static int do_gen_btf(int argc, char **argv)
+{
+	char src_btf_path[PATH_MAX], dst_btf_path[PATH_MAX];
+	bool input_is_file, output_is_file = false;
+	const char *input, *output;
+	const char **objs = NULL;
+	struct dirent *dir;
+	DIR *d = NULL;
+	int i, err;
+
+	if (!REQ_ARGS(3)) {
+		usage();
+		return -1;
+	}
+
+	input = GET_ARG();
+	err = is_file(input);
+	if (err < 0) {
+		p_err("failed to stat %s: %s", input, strerror(errno));
+		return err;
+	}
+	input_is_file = err;
+
+	output = GET_ARG();
+	err = is_file(output);
+	if (err != 0)
+		output_is_file = true;
+
+	objs = (const char **) malloc((argc + 1) * sizeof(*objs));
+	if (!objs)
+		return -ENOMEM;
+
+	i = 0;
+	while (argc > 0)
+		objs[i++] = GET_ARG();
+
+	objs[i] = NULL;
+
+	// single BTF file
+	if (input_is_file) {
+		char *d_input;
+
+		printf("SBTF: %s\n", input);
+
+		if (output_is_file) {
+			err = generate_btf(input, output, objs);
+			goto out;
+		}
+		d_input = strdup(input);
+		snprintf(dst_btf_path, sizeof(dst_btf_path), "%s/%s", output,
+			 basename(d_input));
+		free(d_input);
+		err = generate_btf(input, dst_btf_path, objs);
+		goto out;
+	}
+
+	if (output_is_file) {
+		p_err("can't have just one file as output");
+		err = -EINVAL;
+		goto out;
+	}
+
+	// directory with BTF files
+	d = opendir(input);
+	if (!d) {
+		p_err("error opening input dir: %s", strerror(errno));
+		err = -errno;
+		goto out;
+	}
+
+	while ((dir = readdir(d)) != NULL) {
+		if (dir->d_type != DT_REG)
+			continue;
+
+		if (strncmp(dir->d_name + strlen(dir->d_name) - 4, ".btf", 4))
+			continue;
+
+		snprintf(src_btf_path, sizeof(src_btf_path), "%s/%s", input, dir->d_name);
+		snprintf(dst_btf_path, sizeof(dst_btf_path), "%s/%s", output, dir->d_name);
+
+		printf("SBTF: %s\n", src_btf_path);
+
+		err = generate_btf(src_btf_path, dst_btf_path, objs);
+		if (err)
+			goto out;
+	}
+
+out:
+	if (!err)
+		printf("STAT: done!\n");
+	free(objs);
+	closedir(d);
+	return err;
+}
+
 static const struct cmd cmds[] = {
 	{ "object",	do_object },
 	{ "skeleton",	do_skeleton },
+	{ "btf",	do_gen_btf},
 	{ "help",	do_help },
 	{ 0 }
 };
-- 
2.25.1


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next v3 0/3] libbpf: Implement BTFGen
  2021-12-17 18:56 [PATCH bpf-next v3 0/3] libbpf: Implement BTFGen Mauricio Vásquez
                   ` (2 preceding siblings ...)
  2021-12-17 18:56 ` [PATCH bpf-next v3 3/3] bpftool: Implement btfgen Mauricio Vásquez
@ 2021-12-17 23:11 ` Daniel Borkmann
  2021-12-20 22:43   ` Mauricio Vásquez Bernal
  3 siblings, 1 reply; 12+ messages in thread
From: Daniel Borkmann @ 2021-12-17 23:11 UTC (permalink / raw)
  To: Mauricio Vásquez, netdev, bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Quentin Monnet,
	Rafael David Tinoco, Lorenzo Fontana, Leonardo Di Donato

On 12/17/21 7:56 PM, Mauricio Vásquez wrote:
> CO-RE requires to have BTF information describing the types of the
> kernel in order to perform the relocations. This is usually provided by
> the kernel itself when it's configured with CONFIG_DEBUG_INFO_BTF.
> However, this configuration is not enabled in all the distributions and
> it's not available on kernels before 5.12.
> 
[...]
> Changelog:
> 
> v2 > v3:
> - expose internal libbpf APIs to bpftool instead
> - implement btfgen in bpftool
> - drop btf__raw_data() from libbpf

Looks like this breaks bpf-next CI, please take a look:

https://github.com/kernel-patches/bpf/runs/4565944884?check_suite_focus=true

   [...]

   All error logs:

   #36 core_kern_lskel:FAIL
   test_core_kern_lskel:FAIL:open_and_load unexpected error: -22

   #69 kfunc_call:FAIL
   test_main:PASS:skel 0 nsec
   test_main:PASS:bpf_prog_test_run(test1) 0 nsec
   test_main:PASS:test1-retval 0 nsec
   test_main:PASS:bpf_prog_test_run(test2) 0 nsec
   test_main:PASS:test2-retval 0 nsec
   #69/1 kfunc_call/main:OK
   test_subprog:PASS:skel 0 nsec
   test_subprog:PASS:bpf_prog_test_run(test1) 0 nsec
   test_subprog:PASS:test1-retval 0 nsec
   test_subprog:PASS:active_res 0 nsec
   test_subprog:PASS:sk_state_res 0 nsec
   #69/2 kfunc_call/subprog:OK
   test_subprog_lskel:FAIL:skel unexpected error: -22
   #69/3 kfunc_call/subprog_lskel:FAIL

   #71 ksyms_btf:FAIL
   test_ksyms_btf:PASS:btf_exists 0 nsec
   test_basic:PASS:kallsyms_fopen 0 nsec
   test_basic:PASS:ksym_find 0 nsec
   test_basic:PASS:kallsyms_fopen 0 nsec
   test_basic:PASS:ksym_find 0 nsec
   test_basic:PASS:skel_open 0 nsec
   test_basic:PASS:skel_attach 0 nsec
   test_basic:PASS:runqueues_addr 0 nsec
   test_basic:PASS:bpf_prog_active_addr 0 nsec
   test_basic:PASS:rq_cpu 0 nsec
   test_basic:PASS:bpf_prog_active 0 nsec
   test_basic:PASS:cpu_rq(0)->cpu 0 nsec
   test_basic:PASS:this_rq_cpu 0 nsec
   test_basic:PASS:this_bpf_prog_active 0 nsec
   #71/1 ksyms_btf/basic:OK
   libbpf: prog 'handler': BPF program load failed: Permission denied
   libbpf: prog 'handler': -- BEGIN PROG LOAD LOG --
   arg#0 reference type('UNKNOWN ') size cannot be determined: -22
   0: R1=ctx(id=0,off=0,imm=0) R10=fp0
   ; cpu = bpf_get_smp_processor_id();
   0: (85) call bpf_get_smp_processor_id#8       ; R0_w=inv(id=0)
   1: (bc) w7 = w0                       ; R0_w=inv(id=0) R7_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))
   ; rq = (struct rq *)bpf_per_cpu_ptr(&runqueues, cpu);
   2: (18) r1 = 0x2bd80                  ; R1_w=percpu_ptr_rq(id=0,off=0,imm=0)
   4: (bc) w2 = w7                       ; R2_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R7_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))
   5: (85) call bpf_per_cpu_ptr#153      ; R0_w=ptr_or_null_rq(id=1,off=0,imm=0)
   6: (bf) r6 = r0                       ; R0_w=ptr_or_null_rq(id=1,off=0,imm=0) R6_w=ptr_or_null_rq(id=1,off=0,imm=0)
   ; active = (int *)bpf_per_cpu_ptr(&bpf_prog_active, cpu);
   7: (18) r1 = 0x27de0                  ; R1_w=percpu_ptr_int(id=0,off=0,imm=0)
   9: (bc) w2 = w7                       ; R2_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R7_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))
   10: (85) call bpf_per_cpu_ptr#153     ; R0=mem_or_null(id=2,ref_obj_id=0,off=0,imm=0)
   ; if (active) {
   11: (15) if r0 == 0x0 goto pc+2       ; R0=mem(id=0,ref_obj_id=0,off=0,imm=0)
   ; *(volatile int *)active;
   12: (61) r1 = *(u32 *)(r0 +0)         ; R0=mem(id=0,ref_obj_id=0,off=0,imm=0) R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff))
   ; *(volatile int *)(&rq->cpu);
   13: (61) r1 = *(u32 *)(r6 +2848)
   R6 invalid mem access 'ptr_or_null_'
   processed 12 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1
   -- END PROG LOAD LOG --
   libbpf: failed to load program 'handler'
   libbpf: failed to load object 'test_ksyms_btf_null_check'
   libbpf: failed to load BPF skeleton 'test_ksyms_btf_null_check': -13
   test_null_check:PASS:skel_open 0 nsec
   #71/2 ksyms_btf/null_check:OK
   test_weak_syms:PASS:test_ksyms_weak__open_and_load 0 nsec
   test_weak_syms:PASS:test_ksyms_weak__attach 0 nsec
   test_weak_syms:PASS:existing typed ksym 0 nsec
   test_weak_syms:PASS:existing typeless ksym 0 nsec
   test_weak_syms:PASS:nonexistent typeless ksym 0 nsec
   test_weak_syms:PASS:nonexistent typed ksym 0 nsec
   #71/3 ksyms_btf/weak_ksyms:OK
   test_weak_syms_lskel:FAIL:test_ksyms_weak_lskel__open_and_load unexpected error: -22
   #71/4 ksyms_btf/weak_ksyms_lskel:FAIL

   #89 map_ptr:FAIL
   test_map_ptr:PASS:skel_open 0 nsec
   test_map_ptr:FAIL:skel_load unexpected error: -22 (errno 22)
   Summary: 221/1052 PASSED, 8 SKIPPED, 4 FAILED
test_progs-no_alu32 - Testing test_progs-no_alu32
test_maps - Testing test_maps
test_verifier - Testing test_verifier
collect_status - Collect status
shutdown - Shutdown
Test Results:
              bpftool: PASS
           test_progs: FAIL (returned 1)
  test_progs-no_alu32: FAIL (returned 1)
            test_maps: PASS
        test_verifier: PASS
             shutdown: CLEAN
Error: Process completed with exit code 1.

> v1 > v2:
> - introduce bpf_object__prepare() and ‘record_core_relos’ to expose
>    CO-RE relocations instead of bpf_object__reloc_info_gen()
> - rename btf__save_to_file() to btf__raw_data()
> 
> v1: https://lore.kernel.org/bpf/20211027203727.208847-1-mauricio@kinvolk.io/
> v2: https://lore.kernel.org/bpf/20211116164208.164245-1-mauricio@kinvolk.io/
> 
> Mauricio Vásquez (3):
>    libbpf: split bpf_core_apply_relo()
>    libbpf: Implement changes needed for BTFGen in bpftool
>    bpftool: Implement btfgen
> 
>   kernel/bpf/btf.c                |  11 +-
>   tools/bpf/bpftool/gen.c         | 892 ++++++++++++++++++++++++++++++++
>   tools/lib/bpf/Makefile          |   2 +-
>   tools/lib/bpf/libbpf.c          | 124 +++--
>   tools/lib/bpf/libbpf.h          |   3 +
>   tools/lib/bpf/libbpf.map        |   2 +
>   tools/lib/bpf/libbpf_internal.h |  22 +
>   tools/lib/bpf/relo_core.c       |  83 +--
>   tools/lib/bpf/relo_core.h       |  46 +-
>   9 files changed, 1086 insertions(+), 99 deletions(-)
> 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next v3 0/3] libbpf: Implement BTFGen
  2021-12-17 23:11 ` [PATCH bpf-next v3 0/3] libbpf: Implement BTFGen Daniel Borkmann
@ 2021-12-20 22:43   ` Mauricio Vásquez Bernal
  0 siblings, 0 replies; 12+ messages in thread
From: Mauricio Vásquez Bernal @ 2021-12-20 22:43 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Networking, bpf, Alexei Starovoitov, Andrii Nakryiko,
	Quentin Monnet, Rafael David Tinoco, Lorenzo Fontana,
	Leonardo Di Donato

> [...]
> > Changelog:
> >
> > v2 > v3:
> > - expose internal libbpf APIs to bpftool instead
> > - implement btfgen in bpftool
> > - drop btf__raw_data() from libbpf
>
> Looks like this breaks bpf-next CI, please take a look:
>
> https://github.com/kernel-patches/bpf/runs/4565944884?check_suite_focus=true
>

Thanks Daniel for checking this up. I have spotted a potential issue:
the instruction is also patched when prog->obj->gen_loader is set.
I'll fix it in the next iteration but I'm not sure it's causing those
test failures. I tried to reproduce them in my fork but they pass
fine: https://github.com/mauriciovasquezbernal/linux/runs/4586966124?check_suite_focus=true.
Could it be a false positive?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next v3 3/3] bpftool: Implement btfgen
  2021-12-17 18:56 ` [PATCH bpf-next v3 3/3] bpftool: Implement btfgen Mauricio Vásquez
@ 2021-12-23  0:33   ` Andrii Nakryiko
  2022-01-12 14:26     ` Mauricio Vásquez Bernal
  0 siblings, 1 reply; 12+ messages in thread
From: Andrii Nakryiko @ 2021-12-23  0:33 UTC (permalink / raw)
  To: Mauricio Vásquez
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Quentin Monnet, Rafael David Tinoco,
	Lorenzo Fontana, Leonardo Di Donato

On Fri, Dec 17, 2021 at 10:57 AM Mauricio Vásquez <mauricio@kinvolk.io> wrote:
>
> The BTFGen's goal is to produce a BTF file that contains **only** the
> information that is needed by an eBPF program. This algorithm does a
> first step collecting the types involved for each relocation present on
> the object and "marking" them as needed. Types are collected in
> different ways according to the type relocation, for field based
> relocations only the union and structures members involved are
> considered, for type based relocations the whole types are added, enum
> field relocations are not supported in this iteration yet.
>
> A second step generates a BTF file from the "marked" types. This step
> accesses the original BTF file extracting the types and their members
> that were "marked" as needed in the first step.
>
> This command is implemented under the "gen" command in bpftool and the
> syntax is the following:
>
> $ bpftool gen btf INPUT OUTPUT OBJECT(S)
>
> INPUT can be either a single BTF file or a folder containing BTF files,
> when it's a folder, a BTF file is generated for each BTF file contained
> in this folder. OUTPUT is the file (or folder) where generated files are
> stored and OBJECT(S) is the list of bpf objects we want to generate the
> BTF file(s) for (each generated BTF file contains all the types needed
> by all the objects).
>
> Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
> Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
> Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
> Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
> ---
>  tools/bpf/bpftool/gen.c | 892 ++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 892 insertions(+)
>

I haven't looked through details of stripping BTF itself, let's
finalize CO-RE relocation parts first. Maybe for the next revision you
could split bpftool changes in two in some reasonable way so that each
patch concentrates on different steps of the process a bit more? E.g.,
first patch might set up new command and BTF stripping parts but leave
the CO-RE relocation logic unimplemented, and the second path fills in
that part. Should make it easier to review this big patch.

Please also cc Quentin Monnet to review bpftool parts as well.


[...]

> +static int btf_reloc_info_gen_type(struct btf_reloc_info *info, struct bpf_core_spec *targ_spec)
> +{
> +       struct btf *btf = (struct btf *) info->src_btf;
> +       struct btf_type *btf_type;
> +       int err = 0;
> +
> +       btf_type = (struct btf_type *) btf__type_by_id(btf, targ_spec->root_type_id);
> +
> +       return btf_reloc_put_type_all(btf, info, btf_type, targ_spec->root_type_id);
> +}
> +
> +static int btf_reloc_info_gen_enumval(struct btf_reloc_info *info, struct bpf_core_spec *targ_spec)
> +{
> +       p_err("untreated enumval based relocation");

why untreated? what's the problem supporting it?

> +       return -EOPNOTSUPP;
> +}
> +
> +static int btf_reloc_info_gen(struct btf_reloc_info *info, struct bpf_core_spec *res)
> +{
> +       if (core_relo_is_type_based(res->relo_kind))
> +               return btf_reloc_info_gen_type(info, res);
> +
> +       if (core_relo_is_enumval_based(res->relo_kind))
> +               return btf_reloc_info_gen_enumval(info, res);
> +
> +       if (core_relo_is_field_based(res->relo_kind))
> +               return btf_reloc_info_gen_field(info, res);

you can have a simple switch here instead of exposing libbpf internal helpers

> +
> +       return -EINVAL;
> +}
> +
> +#define BPF_INSN_SZ (sizeof(struct bpf_insn))
> +
> +static int btfgen_obj_reloc_info_gen(struct btf_reloc_info *reloc_info, struct bpf_object *obj)
> +{
> +       const struct btf_ext_info_sec *sec;
> +       const struct bpf_core_relo *rec;
> +       const struct btf_ext_info *seg;
> +       struct hashmap *cand_cache;
> +       int err, insn_idx, sec_idx;
> +       struct bpf_program *prog;
> +       struct btf_ext *btf_ext;
> +       const char *sec_name;
> +       size_t nr_programs;
> +       struct btf *btf;
> +       unsigned int i;
> +
> +       btf = bpf_object__btf(obj);
> +       btf_ext = bpf_object__btf_ext(obj);
> +
> +       if (btf_ext->core_relo_info.len == 0)
> +               return 0;
> +
> +       cand_cache = bpf_core_create_cand_cache();
> +       if (IS_ERR(cand_cache))
> +               return PTR_ERR(cand_cache);
> +
> +       bpf_object_set_vmlinux_override(obj, reloc_info->src_btf);
> +
> +       seg = &btf_ext->core_relo_info;
> +       for_each_btf_ext_sec(seg, sec) {
> +               bool prog_found;
> +
> +               sec_name = btf__name_by_offset(btf, sec->sec_name_off);
> +               if (str_is_empty(sec_name)) {
> +                       err = -EINVAL;
> +                       goto out;
> +               }
> +
> +               prog_found = false;
> +               nr_programs = bpf_object__get_nr_programs(obj);
> +               for (i = 0; i < nr_programs; i++)       {
> +                       prog = bpf_object__get_program(obj, i);
> +                       if (strcmp(bpf_program__section_name(prog), sec_name) == 0) {
> +                               prog_found = true;
> +                               break;
> +                       }
> +               }
> +
> +               if (!prog_found) {
> +                       pr_warn("sec '%s': failed to find a BPF program\n", sec_name);
> +                       err = -EINVAL;
> +                       goto out;
> +               }
> +
> +               sec_idx = bpf_program__sec_idx(prog);
> +
> +               for_each_btf_ext_rec(seg, sec, i, rec) {
> +                       struct bpf_core_relo_res targ_res;
> +                       struct bpf_core_spec targ_spec;
> +
> +                       insn_idx = rec->insn_off / BPF_INSN_SZ;
> +
> +                       prog = find_prog_by_sec_insn(obj, sec_idx, insn_idx);
> +                       if (!prog) {
> +                               pr_warn("sec '%s': failed to find program at insn #%d for CO-RE offset relocation #%d\n",
> +                                       sec_name, insn_idx, i);
> +                               err = -EINVAL;
> +                               goto out;
> +                       }
> +
> +                       err = bpf_core_calc_relo_res(prog, rec, i, btf, cand_cache, &targ_res,
> +                                                    &targ_spec);


I don't think you need to do *exactly* what libbpf is doing.
bpf_core_calc_relo_res() doesn't add much on top of
bpf_core_calc_relo_insn(), if you use bpf_core_calc_relo_insn()
directly and expose bpf_core_add_cands/bpf_core_free_cands and use
them directly as well, bypassing bpf_object completely, you won't need
btf_vmlinux override and make everything less coupled, I think. In the
future, if you'd like to support BTF module BTFs, you'll have all the
necessary flexibility to do that, while if you try to reuse every
single line of bpf_object's code we'll be adding more hacks like your
bpf_object_set_vmlinux_override().

Fundamentally, you don't care about bpf_object and bpf_programs. All
you need to do is parse .BTF.ext (you can do that just btf__parse, no
need to even construct bpf_object!). Construct candidate cache, then
iterate each CO-RE record, add find/add candidates, calculate
relocation result, use it for your algorithm.

Yes, there will be a bit of duplication (candidate search), but that's
better than trying to turn bpf_object inside out with all the custom
getters/setters that you are exposing (even if it's libbpf internal
only, still makes it really hard to reasons what's going on and what
are consequences of manual control over a lot of bpf_object internal
implementation details).


> +                       if (err)
> +                               goto out;
> +
> +                       err = btf_reloc_info_gen(reloc_info, &targ_spec);
> +                       if (err)
> +                               goto out;
> +               }
> +       }
> +
> +out:
> +       bpf_core_free_cand_cache(cand_cache);
> +
> +       return err;
> +}
> +

[...]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next v3 2/3] libbpf: Implement changes needed for BTFGen in bpftool
  2021-12-17 18:56 ` [PATCH bpf-next v3 2/3] libbpf: Implement changes needed for BTFGen in bpftool Mauricio Vásquez
@ 2021-12-23  0:33   ` Andrii Nakryiko
  2022-01-12 14:26     ` Mauricio Vásquez Bernal
  0 siblings, 1 reply; 12+ messages in thread
From: Andrii Nakryiko @ 2021-12-23  0:33 UTC (permalink / raw)
  To: Mauricio Vásquez
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Quentin Monnet, Rafael David Tinoco,
	Lorenzo Fontana, Leonardo Di Donato

On Fri, Dec 17, 2021 at 10:57 AM Mauricio Vásquez <mauricio@kinvolk.io> wrote:
>
> Given that BTFGen is a so unique use case, let's expose some of the
> libbpf internal APIs to bpftool. This commit extends libbpf with the
> features that are needed to implement it.
>
> Specifically, this commit introduces the following functions:
> - bpf_object__btf_ext(): Public method to access the BTF.ext section of
> an bpf object.
> - bpf_program__sec_idx(): Public method to get the sec index of a
> program within the elf file.
> - Implement bpf_core_create_cand_cache() and bpf_core_free_cand_cache()
> to handle candidates cache.
> - Expose bpf_core_calc_relo_res() in libbpfinternal.h and add "struct
> bpf_core_spec *targ_spec" parameter.
> - bpf_object_set_vmlinux_override(): Internal function to set
> obj->btf_vmlinux_override.
> - bpf_object__get_nr_programs() and bpf_object__get_program(): To give
> access to the program inside an object. bpf_object__for_each_program
> is not good enough because BTFGen needs to access subprograms too.
>
> Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
> Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
> Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
> Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
> ---
>  tools/lib/bpf/Makefile          |  2 +-
>  tools/lib/bpf/libbpf.c          | 88 ++++++++++++++++++++++++++-------
>  tools/lib/bpf/libbpf.h          |  3 ++
>  tools/lib/bpf/libbpf.map        |  2 +
>  tools/lib/bpf/libbpf_internal.h | 22 +++++++++
>  tools/lib/bpf/relo_core.c       |  6 +--
>  tools/lib/bpf/relo_core.h       |  4 ++
>  7 files changed, 104 insertions(+), 23 deletions(-)
>
> diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile
> index f947b61b2107..dba019ee2832 100644
> --- a/tools/lib/bpf/Makefile
> +++ b/tools/lib/bpf/Makefile
> @@ -239,7 +239,7 @@ install_lib: all_cmd
>
>  SRC_HDRS := bpf.h libbpf.h btf.h libbpf_common.h libbpf_legacy.h xsk.h      \
>             bpf_helpers.h bpf_tracing.h bpf_endian.h bpf_core_read.h         \
> -           skel_internal.h libbpf_version.h
> +           skel_internal.h libbpf_version.h relo_core.h libbpf_internal.h
>  GEN_HDRS := $(BPF_GENERATED)
>
>  INSTALL_PFX := $(DESTDIR)$(prefix)/include/bpf
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 77e2df13715a..7c8f8797475f 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -4027,8 +4027,8 @@ static bool prog_contains_insn(const struct bpf_program *prog, size_t insn_idx)
>                insn_idx < prog->sec_insn_off + prog->sec_insn_cnt;
>  }
>
> -static struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj,
> -                                                size_t sec_idx, size_t insn_idx)
> +struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj,
> +                                         size_t sec_idx, size_t insn_idx)
>  {
>         int l = 0, r = obj->nr_programs - 1, m;
>         struct bpf_program *prog;
> @@ -5498,12 +5498,13 @@ static int record_relo_core(struct bpf_program *prog,
>         return 0;
>  }
>
> -static int bpf_core_calc_relo_res(struct bpf_program *prog,
> -                                 const struct bpf_core_relo *relo,
> -                                 int relo_idx,
> -                                 const struct btf *local_btf,
> -                                 struct hashmap *cand_cache,
> -                                 struct bpf_core_relo_res *targ_res)
> +int bpf_core_calc_relo_res(struct bpf_program *prog,
> +                          const struct bpf_core_relo *relo,
> +                          int relo_idx,
> +                          const struct btf *local_btf,
> +                          struct hashmap *cand_cache,
> +                          struct bpf_core_relo_res *targ_res,
> +                          struct bpf_core_spec *targ_spec)

maybe let's add targ_spec and local_spec into bpf_core_relo_res? that
way bpf_core_relo_res contains all the relevant information around
CO-RE relo resolution?

>  {
>         struct bpf_core_spec specs_scratch[3] = {};
>         const void *type_key = u32_as_hash_key(relo->type_id);
> @@ -5548,8 +5549,32 @@ static int bpf_core_calc_relo_res(struct bpf_program *prog,
>                 }
>         }
>
> -       return bpf_core_calc_relo_insn(prog_name, relo, relo_idx, local_btf, cands,
> -                                      specs_scratch, targ_res);
> +       err = bpf_core_calc_relo_insn(prog_name, relo, relo_idx, local_btf, cands,
> +                                     specs_scratch, targ_res);
> +       if (err)
> +               return err;
> +
> +       if (targ_spec)
> +               *targ_spec = specs_scratch[2];

and then we'll avoid this ugliness

> +       return 0;
> +}
> +
> +struct hashmap *bpf_core_create_cand_cache(void)
> +{
> +       return hashmap__new(bpf_core_hash_fn, bpf_core_equal_fn, NULL);
> +}
> +
> +void bpf_core_free_cand_cache(struct hashmap *cand_cache)
> +{
> +       struct hashmap_entry *entry;
> +       int i;
> +
> +       if (!IS_ERR_OR_NULL(cand_cache)) {
> +               hashmap__for_each_entry(cand_cache, entry, i) {
> +                       bpf_core_free_cands(entry->value);
> +               }
> +               hashmap__free(cand_cache);
> +       }
>  }
>
>  static int
> @@ -5559,7 +5584,6 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
>         struct bpf_core_relo_res targ_res;
>         const struct bpf_core_relo *rec;
>         const struct btf_ext_info *seg;
> -       struct hashmap_entry *entry;
>         struct hashmap *cand_cache = NULL;
>         struct bpf_program *prog;
>         struct bpf_insn *insn;
> @@ -5578,7 +5602,7 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
>                 }
>         }
>
> -       cand_cache = hashmap__new(bpf_core_hash_fn, bpf_core_equal_fn, NULL);
> +       cand_cache = bpf_core_create_cand_cache();
>         if (IS_ERR(cand_cache)) {
>                 err = PTR_ERR(cand_cache);
>                 goto out;
> @@ -5627,7 +5651,8 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
>                         if (!prog->load)
>                                 continue;
>
> -                       err = bpf_core_calc_relo_res(prog, rec, i, obj->btf, cand_cache, &targ_res);
> +                       err = bpf_core_calc_relo_res(prog, rec, i, obj->btf, cand_cache, &targ_res,
> +                                                    NULL);
>                         if (err) {
>                                 pr_warn("prog '%s': relo #%d: failed to relocate: %d\n",
>                                         prog->name, i, err);
> @@ -5660,12 +5685,8 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
>         btf__free(obj->btf_vmlinux_override);
>         obj->btf_vmlinux_override = NULL;
>
> -       if (!IS_ERR_OR_NULL(cand_cache)) {
> -               hashmap__for_each_entry(cand_cache, entry, i) {
> -                       bpf_core_free_cands(entry->value);
> -               }
> -               hashmap__free(cand_cache);
> -       }
> +       bpf_core_free_cand_cache(cand_cache);
> +
>         return err;
>  }
>
> @@ -8190,6 +8211,11 @@ struct btf *bpf_object__btf(const struct bpf_object *obj)
>         return obj ? obj->btf : NULL;
>  }
>
> +struct btf_ext *bpf_object__btf_ext(const struct bpf_object *obj)
> +{
> +       return obj ? obj->btf_ext : NULL;

just return obj->btf_ext, no one should be passing NULL for those getters

> +}
> +
>  int bpf_object__btf_fd(const struct bpf_object *obj)
>  {
>         return obj->btf ? btf__fd(obj->btf) : -1;
> @@ -8281,6 +8307,20 @@ bpf_object__next_program(const struct bpf_object *obj, struct bpf_program *prev)
>         return prog;
>  }
>
> +size_t bpf_object__get_nr_programs(const struct bpf_object *obj)
> +{
> +       return obj->nr_programs;
> +}
> +
> +struct bpf_program *
> +bpf_object__get_program(const struct bpf_object *obj, unsigned int i)
> +{
> +       if (i >= obj->nr_programs)
> +               return NULL;
> +
> +       return &obj->programs[i];
> +}
> +
>  struct bpf_program *
>  bpf_program__prev(struct bpf_program *next, const struct bpf_object *obj)
>  {
> @@ -8360,6 +8400,11 @@ int bpf_program__set_autoload(struct bpf_program *prog, bool autoload)
>         return 0;
>  }
>
> +int bpf_program__sec_idx(const struct bpf_program *prog)
> +{
> +       return prog->sec_idx;
> +}
> +
>  static int bpf_program_nth_fd(const struct bpf_program *prog, int n);
>
>  int bpf_program__fd(const struct bpf_program *prog)
> @@ -11779,3 +11824,8 @@ void bpf_object__destroy_skeleton(struct bpf_object_skeleton *s)
>         free(s->progs);
>         free(s);
>  }
> +
> +void bpf_object_set_vmlinux_override(struct bpf_object *obj, struct btf *btf)
> +{
> +       obj->btf_vmlinux_override = btf;
> +}

I don't think we need this, see comments in next patch


> diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> index 42b2f36fd9f0..2b048ee5a9b2 100644
> --- a/tools/lib/bpf/libbpf.h
> +++ b/tools/lib/bpf/libbpf.h
> @@ -225,6 +225,8 @@ LIBBPF_API int bpf_object__set_kversion(struct bpf_object *obj, __u32 kern_versi
>
>  struct btf;
>  LIBBPF_API struct btf *bpf_object__btf(const struct bpf_object *obj);
> +struct btf_ext;
> +LIBBPF_API struct btf_ext *bpf_object__btf_ext(const struct bpf_object *obj);
>  LIBBPF_API int bpf_object__btf_fd(const struct bpf_object *obj);
>
>  LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__find_program_by_name() instead")
> @@ -290,6 +292,7 @@ LIBBPF_API LIBBPF_DEPRECATED("BPF program title is confusing term; please use bp
>  const char *bpf_program__title(const struct bpf_program *prog, bool needs_copy);
>  LIBBPF_API bool bpf_program__autoload(const struct bpf_program *prog);
>  LIBBPF_API int bpf_program__set_autoload(struct bpf_program *prog, bool autoload);
> +LIBBPF_API int bpf_program__sec_idx(const struct bpf_program *prog);
>
>  /* returns program size in bytes */
>  LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_program__insn_cnt() instead")
> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index b3938b3f8fc9..15da4075e0b5 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -392,6 +392,7 @@ LIBBPF_0.6.0 {
>                 bpf_map__map_extra;
>                 bpf_map__set_map_extra;
>                 bpf_map_create;
> +               bpf_object__btf_ext;
>                 bpf_object__next_map;
>                 bpf_object__next_program;
>                 bpf_object__prev_map;
> @@ -401,6 +402,7 @@ LIBBPF_0.6.0 {
>                 bpf_program__flags;
>                 bpf_program__insn_cnt;
>                 bpf_program__insns;
> +               bpf_program__sec_idx;
>                 bpf_program__set_flags;
>                 btf__add_btf;
>                 btf__add_decl_tag;
> diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
> index 5dbe4f463880..b1962adb110c 100644
> --- a/tools/lib/bpf/libbpf_internal.h
> +++ b/tools/lib/bpf/libbpf_internal.h
> @@ -524,4 +524,26 @@ static inline int ensure_good_fd(int fd)
>         return fd;
>  }
>
> +struct hashmap;
> +
> +int bpf_core_calc_relo_res(struct bpf_program *prog,
> +                          const struct bpf_core_relo *relo,
> +                          int relo_idx,
> +                          const struct btf *local_btf,
> +                          struct hashmap *cand_cache,
> +                          struct bpf_core_relo_res *targ_res,
> +                          struct bpf_core_spec *targ_spec);
> +void bpf_object_set_vmlinux_override(struct bpf_object *obj, struct btf *btf);
> +struct hashmap *bpf_core_create_cand_cache(void);
> +void bpf_core_free_cand_cache(struct hashmap *cand_cache);
> +
> +struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj,
> +                                         size_t sec_idx, size_t insn_idx);
> +
> +size_t bpf_object__get_nr_programs(const struct bpf_object *obj);
> +
> +struct bpf_program *
> +bpf_object__get_program(const struct bpf_object *obj, unsigned int n);
> +

that's too much, I don't think you need bpf_program and all the things
around it (sec_idx, look up, etc). As for core_relo_is_field_based and
co, bpftool can do those simple checks on their own, no need to make
all the "internal API", don't overdo it with "let's expose internals
of libbpf to bpftool".


> +
>  #endif /* __LIBBPF_LIBBPF_INTERNAL_H */
> diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c
> index 4f3552468624..66dfb7fa89a2 100644
> --- a/tools/lib/bpf/relo_core.c
> +++ b/tools/lib/bpf/relo_core.c
> @@ -102,7 +102,7 @@ static const char *core_relo_kind_str(enum bpf_core_relo_kind kind)
>         }
>  }
>
> -static bool core_relo_is_field_based(enum bpf_core_relo_kind kind)
> +bool core_relo_is_field_based(enum bpf_core_relo_kind kind)
>  {
>         switch (kind) {
>         case BPF_CORE_FIELD_BYTE_OFFSET:
> @@ -117,7 +117,7 @@ static bool core_relo_is_field_based(enum bpf_core_relo_kind kind)
>         }
>  }
>
> -static bool core_relo_is_type_based(enum bpf_core_relo_kind kind)
> +bool core_relo_is_type_based(enum bpf_core_relo_kind kind)
>  {
>         switch (kind) {
>         case BPF_CORE_TYPE_ID_LOCAL:
> @@ -130,7 +130,7 @@ static bool core_relo_is_type_based(enum bpf_core_relo_kind kind)
>         }
>  }
>
> -static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind)
> +bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind)
>  {
>         switch (kind) {
>         case BPF_CORE_ENUMVAL_EXISTS:
> diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h
> index a28bf3711ce2..e969dfb032f4 100644
> --- a/tools/lib/bpf/relo_core.h
> +++ b/tools/lib/bpf/relo_core.h
> @@ -84,4 +84,8 @@ int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn,
>                         int insn_idx, const struct bpf_core_relo *relo,
>                         int relo_idx, const struct bpf_core_relo_res *res);
>
> +bool core_relo_is_field_based(enum bpf_core_relo_kind kind);
> +bool core_relo_is_type_based(enum bpf_core_relo_kind kind);
> +bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind);
> +
>  #endif
> --
> 2.25.1
>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next v3 1/3] libbpf: split bpf_core_apply_relo()
  2021-12-17 18:56 ` [PATCH bpf-next v3 1/3] libbpf: split bpf_core_apply_relo() Mauricio Vásquez
@ 2021-12-23  0:33   ` Andrii Nakryiko
  2022-01-12 14:26     ` Mauricio Vásquez Bernal
  0 siblings, 1 reply; 12+ messages in thread
From: Andrii Nakryiko @ 2021-12-23  0:33 UTC (permalink / raw)
  To: Mauricio Vásquez
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Quentin Monnet, Rafael David Tinoco,
	Lorenzo Fontana, Leonardo Di Donato

On Fri, Dec 17, 2021 at 10:57 AM Mauricio Vásquez <mauricio@kinvolk.io> wrote:
>
> BTFGen needs to run the core relocation logic in order to understand
> what are the types in the target BTF that involved in a given
> relocation.
>
> Currently bpf_core_apply_relo() calculates and **applies** a relocation
> to an instruction. Having both operations in the same function makes it
> difficult to only calculate the relocation without patching the
> instruction. This commit splits that logic in two different phases: (1)
> calculate the relocation and (2) patch the instruction.
>
> For the first phase bpf_core_apply_relo() is renamed to
> bpf_core_calc_relo_res() who is now only on charge of calculating the
> relocation, the second phase uses the already existing
> bpf_core_patch_insn(). bpf_object__relocate_core() uses both of them and
> the BTFGen will use only bpf_core_calc_relo_res().
>
> Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
> Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
> Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
> Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
> ---
>  kernel/bpf/btf.c          | 11 +++++-
>  tools/lib/bpf/libbpf.c    | 54 ++++++++++++++++-----------
>  tools/lib/bpf/relo_core.c | 77 +++++++++++----------------------------
>  tools/lib/bpf/relo_core.h | 42 ++++++++++++++++++---
>  4 files changed, 99 insertions(+), 85 deletions(-)
>
> diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
> index a17de71abd2e..5a8f6ef6a341 100644
> --- a/kernel/bpf/btf.c
> +++ b/kernel/bpf/btf.c
> @@ -6734,6 +6734,7 @@ int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo,
>  {
>         bool need_cands = relo->kind != BPF_CORE_TYPE_ID_LOCAL;
>         struct bpf_core_cand_list cands = {};
> +       struct bpf_core_relo_res targ_res;
>         struct bpf_core_spec *specs;
>         int err;
>
> @@ -6778,8 +6779,14 @@ int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo,
>                  */
>         }
>
> -       err = bpf_core_apply_relo_insn((void *)ctx->log, insn, relo->insn_off / 8,
> -                                      relo, relo_idx, ctx->btf, &cands, specs);
> +       err = bpf_core_calc_relo_insn((void *)ctx->log, relo, relo_idx, ctx->btf, &cands, specs,
> +                                     &targ_res);
> +       if (err)
> +               goto out;
> +
> +       err = bpf_core_patch_insn((void *)ctx->log, insn, relo->insn_off / 8, relo, relo_idx,
> +                                 &targ_res);
> +
>  out:
>         kfree(specs);
>         if (need_cands) {
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index cf862a19222b..77e2df13715a 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -5498,11 +5498,12 @@ static int record_relo_core(struct bpf_program *prog,
>         return 0;
>  }
>
> -static int bpf_core_apply_relo(struct bpf_program *prog,
> -                              const struct bpf_core_relo *relo,
> -                              int relo_idx,
> -                              const struct btf *local_btf,
> -                              struct hashmap *cand_cache)
> +static int bpf_core_calc_relo_res(struct bpf_program *prog,

bpf_core_calc_relo_res is almost indistinguishable from
bpf_core_calc_relo... Let's call this one bpf_core_resolve_relo()?

> +                                 const struct bpf_core_relo *relo,
> +                                 int relo_idx,
> +                                 const struct btf *local_btf,
> +                                 struct hashmap *cand_cache,
> +                                 struct bpf_core_relo_res *targ_res)

[...]

>  static int
>  bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
>  {
>         const struct btf_ext_info_sec *sec;
> +       struct bpf_core_relo_res targ_res;
>         const struct bpf_core_relo *rec;
>         const struct btf_ext_info *seg;
>         struct hashmap_entry *entry;
>         struct hashmap *cand_cache = NULL;
>         struct bpf_program *prog;
> +       struct bpf_insn *insn;
>         const char *sec_name;
>         int i, err = 0, insn_idx, sec_idx;
>
> @@ -5636,12 +5627,31 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
>                         if (!prog->load)
>                                 continue;
>
> -                       err = bpf_core_apply_relo(prog, rec, i, obj->btf, cand_cache);
> +                       err = bpf_core_calc_relo_res(prog, rec, i, obj->btf, cand_cache, &targ_res);
>                         if (err) {
>                                 pr_warn("prog '%s': relo #%d: failed to relocate: %d\n",
>                                         prog->name, i, err);
>                                 goto out;
>                         }
> +
> +                       if (rec->insn_off % BPF_INSN_SZ)
> +                               return -EINVAL;
> +                       insn_idx = rec->insn_off / BPF_INSN_SZ;
> +                       /* adjust insn_idx from section frame of reference to the local
> +                        * program's frame of reference; (sub-)program code is not yet
> +                        * relocated, so it's enough to just subtract in-section offset
> +                        */
> +                       insn_idx = insn_idx - prog->sec_insn_off;
> +                       if (insn_idx >= prog->insns_cnt)
> +                               return -EINVAL;
> +                       insn = &prog->insns[insn_idx];

this is sort of like sanity checks, let's do them before the core_calc
step, so after that it's a clean sequence of calc_relo + pathc_insn?

> +
> +                       err = bpf_core_patch_insn(prog->name, insn, insn_idx, rec, i, &targ_res);
> +                       if (err) {
> +                               pr_warn("prog '%s': relo #%d: failed to patch insn #%u: %d\n",
> +                                       prog->name, i, insn_idx, err);
> +                               goto out;
> +                       }
>                 }
>         }
>

[...]

>  {
>         __u32 orig_val, new_val;
>         __u8 class;
> @@ -1177,18 +1152,18 @@ static void bpf_core_dump_spec(const char *prog_name, int level, const struct bp
>   *    between multiple relocations for the same type ID and is updated as some
>   *    of the candidates are pruned due to structural incompatibility.
>   */
> -int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
> -                            int insn_idx,
> -                            const struct bpf_core_relo *relo,
> -                            int relo_idx,
> -                            const struct btf *local_btf,
> -                            struct bpf_core_cand_list *cands,
> -                            struct bpf_core_spec *specs_scratch)
> +int bpf_core_calc_relo_insn(const char *prog_name,

please update the comment for this function, it's not "CO-RE relocate
single instruction" anymore, it's more like "Calculate CO-RE
relocation target result" or something along those lines.

> +                           const struct bpf_core_relo *relo,
> +                           int relo_idx,
> +                           const struct btf *local_btf,
> +                           struct bpf_core_cand_list *cands,
> +                           struct bpf_core_spec *specs_scratch,
> +                           struct bpf_core_relo_res *targ_res)
>  {
>         struct bpf_core_spec *local_spec = &specs_scratch[0];
>         struct bpf_core_spec *cand_spec = &specs_scratch[1];
>         struct bpf_core_spec *targ_spec = &specs_scratch[2];
> -       struct bpf_core_relo_res cand_res, targ_res;
> +       struct bpf_core_relo_res cand_res;
>         const struct btf_type *local_type;
>         const char *local_name;
>         __u32 local_id;
> @@ -1223,12 +1198,12 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
>         /* TYPE_ID_LOCAL relo is special and doesn't need candidate search */
>         if (relo->kind == BPF_CORE_TYPE_ID_LOCAL) {
>                 /* bpf_insn's imm value could get out of sync during linking */
> -               memset(&targ_res, 0, sizeof(targ_res));
> -               targ_res.validate = false;
> -               targ_res.poison = false;
> -               targ_res.orig_val = local_spec->root_type_id;
> -               targ_res.new_val = local_spec->root_type_id;
> -               goto patch_insn;
> +               memset(targ_res, 0, sizeof(*targ_res));
> +               targ_res->validate = true;

hm.. original code sets it to false here, please don't regress the logic

> +               targ_res->poison = false;
> +               targ_res->orig_val = local_spec->root_type_id;
> +               targ_res->new_val = local_spec->root_type_id;
> +               return 0;
>         }
>
>         /* libbpf doesn't support candidate search for anonymous types */

[...]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next v3 1/3] libbpf: split bpf_core_apply_relo()
  2021-12-23  0:33   ` Andrii Nakryiko
@ 2022-01-12 14:26     ` Mauricio Vásquez Bernal
  0 siblings, 0 replies; 12+ messages in thread
From: Mauricio Vásquez Bernal @ 2022-01-12 14:26 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Quentin Monnet, Rafael David Tinoco,
	Lorenzo Fontana, Leonardo Di Donato

> > -static int bpf_core_apply_relo(struct bpf_program *prog,
> > -                              const struct bpf_core_relo *relo,
> > -                              int relo_idx,
> > -                              const struct btf *local_btf,
> > -                              struct hashmap *cand_cache)
> > +static int bpf_core_calc_relo_res(struct bpf_program *prog,
>
> bpf_core_calc_relo_res is almost indistinguishable from
> bpf_core_calc_relo... Let's call this one bpf_core_resolve_relo()?
>

That's a much better name! Deciding the name of that function was
probably the most complicated part of this patch.

> > @@ -5636,12 +5627,31 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path)
> >                         if (!prog->load)
> >                                 continue;
> >
> > -                       err = bpf_core_apply_relo(prog, rec, i, obj->btf, cand_cache);
> > +                       err = bpf_core_calc_relo_res(prog, rec, i, obj->btf, cand_cache, &targ_res);
> >                         if (err) {
> >                                 pr_warn("prog '%s': relo #%d: failed to relocate: %d\n",
> >                                         prog->name, i, err);
> >                                 goto out;
> >                         }
> > +
> > +                       if (rec->insn_off % BPF_INSN_SZ)
> > +                               return -EINVAL;
> > +                       insn_idx = rec->insn_off / BPF_INSN_SZ;
> > +                       /* adjust insn_idx from section frame of reference to the local
> > +                        * program's frame of reference; (sub-)program code is not yet
> > +                        * relocated, so it's enough to just subtract in-section offset
> > +                        */
> > +                       insn_idx = insn_idx - prog->sec_insn_off;
> > +                       if (insn_idx >= prog->insns_cnt)
> > +                               return -EINVAL;
> > +                       insn = &prog->insns[insn_idx];
>
> this is sort of like sanity checks, let's do them before the core_calc
> step, so after that it's a clean sequence of calc_relo + pathc_insn?
>

Makes sense.

> > @@ -1177,18 +1152,18 @@ static void bpf_core_dump_spec(const char *prog_name, int level, const struct bp
> >   *    between multiple relocations for the same type ID and is updated as some
> >   *    of the candidates are pruned due to structural incompatibility.
> >   */
> > -int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
> > -                            int insn_idx,
> > -                            const struct bpf_core_relo *relo,
> > -                            int relo_idx,
> > -                            const struct btf *local_btf,
> > -                            struct bpf_core_cand_list *cands,
> > -                            struct bpf_core_spec *specs_scratch)
> > +int bpf_core_calc_relo_insn(const char *prog_name,
>
> please update the comment for this function, it's not "CO-RE relocate
> single instruction" anymore, it's more like "Calculate CO-RE
> relocation target result" or something along those lines.
>

Updated with your suggestion.

> > @@ -1223,12 +1198,12 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn,
> >         /* TYPE_ID_LOCAL relo is special and doesn't need candidate search */
> >         if (relo->kind == BPF_CORE_TYPE_ID_LOCAL) {
> >                 /* bpf_insn's imm value could get out of sync during linking */
> > -               memset(&targ_res, 0, sizeof(targ_res));
> > -               targ_res.validate = false;
> > -               targ_res.poison = false;
> > -               targ_res.orig_val = local_spec->root_type_id;
> > -               targ_res.new_val = local_spec->root_type_id;
> > -               goto patch_insn;
> > +               memset(targ_res, 0, sizeof(*targ_res));
> > +               targ_res->validate = true;
>
> hm.. original code sets it to false here, please don't regress the logic
>

ops, I introduced this by mistake while rebasing.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next v3 2/3] libbpf: Implement changes needed for BTFGen in bpftool
  2021-12-23  0:33   ` Andrii Nakryiko
@ 2022-01-12 14:26     ` Mauricio Vásquez Bernal
  0 siblings, 0 replies; 12+ messages in thread
From: Mauricio Vásquez Bernal @ 2022-01-12 14:26 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Quentin Monnet, Rafael David Tinoco,
	Lorenzo Fontana, Leonardo Di Donato

> > @@ -5498,12 +5498,13 @@ static int record_relo_core(struct bpf_program *prog,
> >         return 0;
> >  }
> >
> > -static int bpf_core_calc_relo_res(struct bpf_program *prog,
> > -                                 const struct bpf_core_relo *relo,
> > -                                 int relo_idx,
> > -                                 const struct btf *local_btf,
> > -                                 struct hashmap *cand_cache,
> > -                                 struct bpf_core_relo_res *targ_res)
> > +int bpf_core_calc_relo_res(struct bpf_program *prog,
> > +                          const struct bpf_core_relo *relo,
> > +                          int relo_idx,
> > +                          const struct btf *local_btf,
> > +                          struct hashmap *cand_cache,
> > +                          struct bpf_core_relo_res *targ_res,
> > +                          struct bpf_core_spec *targ_spec)
>
> maybe let's add targ_spec and local_spec into bpf_core_relo_res? that
> way bpf_core_relo_res contains all the relevant information around
> CO-RE relo resolution?
>

It's not needed anymore now that we're using bpf_core_calc_relo_insn()
directly in bpftool.

> > @@ -8190,6 +8211,11 @@ struct btf *bpf_object__btf(const struct bpf_object *obj)
> >         return obj ? obj->btf : NULL;
> >  }
> >
> > +struct btf_ext *bpf_object__btf_ext(const struct bpf_object *obj)
> > +{
> > +       return obj ? obj->btf_ext : NULL;
>
> just return obj->btf_ext, no one should be passing NULL for those getters

I dropped this function as we don't need it now.

>
> > +}
> > +
> >  int bpf_object__btf_fd(const struct bpf_object *obj)
> >  {
> >         return obj->btf ? btf__fd(obj->btf) : -1;
> > @@ -8281,6 +8307,20 @@ bpf_object__next_program(const struct bpf_object *obj, struct bpf_program *prev)
> >         return prog;
> >  }
> >
> > +size_t bpf_object__get_nr_programs(const struct bpf_object *obj)
> > +{
> > +       return obj->nr_programs;
> > +}
> > +
> > +struct bpf_program *
> > +bpf_object__get_program(const struct bpf_object *obj, unsigned int i)
> > +{
> > +       if (i >= obj->nr_programs)
> > +               return NULL;
> > +
> > +       return &obj->programs[i];
> > +}
> > +
> >  struct bpf_program *
> >  bpf_program__prev(struct bpf_program *next, const struct bpf_object *obj)
> >  {
> > @@ -8360,6 +8400,11 @@ int bpf_program__set_autoload(struct bpf_program *prog, bool autoload)
> >         return 0;
> >  }
> >
> > +int bpf_program__sec_idx(const struct bpf_program *prog)
> > +{
> > +       return prog->sec_idx;
> > +}
> > +
> >  static int bpf_program_nth_fd(const struct bpf_program *prog, int n);
> >
> >  int bpf_program__fd(const struct bpf_program *prog)
> > @@ -11779,3 +11824,8 @@ void bpf_object__destroy_skeleton(struct bpf_object_skeleton *s)
> >         free(s->progs);
> >         free(s);
> >  }
> > +
> > +void bpf_object_set_vmlinux_override(struct bpf_object *obj, struct btf *btf)
> > +{
> > +       obj->btf_vmlinux_override = btf;
> > +}
>
> I don't think we need this, see comments in next patch
>
>
> > diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> > index 42b2f36fd9f0..2b048ee5a9b2 100644
> > --- a/tools/lib/bpf/libbpf.h
> > +++ b/tools/lib/bpf/libbpf.h
> > @@ -225,6 +225,8 @@ LIBBPF_API int bpf_object__set_kversion(struct bpf_object *obj, __u32 kern_versi
> >
> >  struct btf;
> >  LIBBPF_API struct btf *bpf_object__btf(const struct bpf_object *obj);
> > +struct btf_ext;
> > +LIBBPF_API struct btf_ext *bpf_object__btf_ext(const struct bpf_object *obj);
> >  LIBBPF_API int bpf_object__btf_fd(const struct bpf_object *obj);
> >
> >  LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__find_program_by_name() instead")
> > @@ -290,6 +292,7 @@ LIBBPF_API LIBBPF_DEPRECATED("BPF program title is confusing term; please use bp
> >  const char *bpf_program__title(const struct bpf_program *prog, bool needs_copy);
> >  LIBBPF_API bool bpf_program__autoload(const struct bpf_program *prog);
> >  LIBBPF_API int bpf_program__set_autoload(struct bpf_program *prog, bool autoload);
> > +LIBBPF_API int bpf_program__sec_idx(const struct bpf_program *prog);
> >
> >  /* returns program size in bytes */
> >  LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_program__insn_cnt() instead")
> > diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> > index b3938b3f8fc9..15da4075e0b5 100644
> > --- a/tools/lib/bpf/libbpf.map
> > +++ b/tools/lib/bpf/libbpf.map
> > @@ -392,6 +392,7 @@ LIBBPF_0.6.0 {
> >                 bpf_map__map_extra;
> >                 bpf_map__set_map_extra;
> >                 bpf_map_create;
> > +               bpf_object__btf_ext;
> >                 bpf_object__next_map;
> >                 bpf_object__next_program;
> >                 bpf_object__prev_map;
> > @@ -401,6 +402,7 @@ LIBBPF_0.6.0 {
> >                 bpf_program__flags;
> >                 bpf_program__insn_cnt;
> >                 bpf_program__insns;
> > +               bpf_program__sec_idx;
> >                 bpf_program__set_flags;
> >                 btf__add_btf;
> >                 btf__add_decl_tag;
> > diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
> > index 5dbe4f463880..b1962adb110c 100644
> > --- a/tools/lib/bpf/libbpf_internal.h
> > +++ b/tools/lib/bpf/libbpf_internal.h
> > @@ -524,4 +524,26 @@ static inline int ensure_good_fd(int fd)
> >         return fd;
> >  }
> >
> > +struct hashmap;
> > +
> > +int bpf_core_calc_relo_res(struct bpf_program *prog,
> > +                          const struct bpf_core_relo *relo,
> > +                          int relo_idx,
> > +                          const struct btf *local_btf,
> > +                          struct hashmap *cand_cache,
> > +                          struct bpf_core_relo_res *targ_res,
> > +                          struct bpf_core_spec *targ_spec);
> > +void bpf_object_set_vmlinux_override(struct bpf_object *obj, struct btf *btf);
> > +struct hashmap *bpf_core_create_cand_cache(void);
> > +void bpf_core_free_cand_cache(struct hashmap *cand_cache);
> > +
> > +struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj,
> > +                                         size_t sec_idx, size_t insn_idx);
> > +
> > +size_t bpf_object__get_nr_programs(const struct bpf_object *obj);
> > +
> > +struct bpf_program *
> > +bpf_object__get_program(const struct bpf_object *obj, unsigned int n);
> > +
>
> that's too much, I don't think you need bpf_program and all the things
> around it (sec_idx, look up, etc). As for core_relo_is_field_based and
> co, bpftool can do those simple checks on their own, no need to make
> all the "internal API", don't overdo it with "let's expose internals
> of libbpf to bpftool".
>
>
> > +
> >  #endif /* __LIBBPF_LIBBPF_INTERNAL_H */
> > diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c
> > index 4f3552468624..66dfb7fa89a2 100644
> > --- a/tools/lib/bpf/relo_core.c
> > +++ b/tools/lib/bpf/relo_core.c
> > @@ -102,7 +102,7 @@ static const char *core_relo_kind_str(enum bpf_core_relo_kind kind)
> >         }
> >  }
> >
> > -static bool core_relo_is_field_based(enum bpf_core_relo_kind kind)
> > +bool core_relo_is_field_based(enum bpf_core_relo_kind kind)
> >  {
> >         switch (kind) {
> >         case BPF_CORE_FIELD_BYTE_OFFSET:
> > @@ -117,7 +117,7 @@ static bool core_relo_is_field_based(enum bpf_core_relo_kind kind)
> >         }
> >  }
> >
> > -static bool core_relo_is_type_based(enum bpf_core_relo_kind kind)
> > +bool core_relo_is_type_based(enum bpf_core_relo_kind kind)
> >  {
> >         switch (kind) {
> >         case BPF_CORE_TYPE_ID_LOCAL:
> > @@ -130,7 +130,7 @@ static bool core_relo_is_type_based(enum bpf_core_relo_kind kind)
> >         }
> >  }
> >
> > -static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind)
> > +bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind)
> >  {
> >         switch (kind) {
> >         case BPF_CORE_ENUMVAL_EXISTS:
> > diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h
> > index a28bf3711ce2..e969dfb032f4 100644
> > --- a/tools/lib/bpf/relo_core.h
> > +++ b/tools/lib/bpf/relo_core.h
> > @@ -84,4 +84,8 @@ int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn,
> >                         int insn_idx, const struct bpf_core_relo *relo,
> >                         int relo_idx, const struct bpf_core_relo_res *res);
> >
> > +bool core_relo_is_field_based(enum bpf_core_relo_kind kind);
> > +bool core_relo_is_type_based(enum bpf_core_relo_kind kind);
> > +bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind);
> > +
> >  #endif
> > --
> > 2.25.1
> >

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH bpf-next v3 3/3] bpftool: Implement btfgen
  2021-12-23  0:33   ` Andrii Nakryiko
@ 2022-01-12 14:26     ` Mauricio Vásquez Bernal
  0 siblings, 0 replies; 12+ messages in thread
From: Mauricio Vásquez Bernal @ 2022-01-12 14:26 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Quentin Monnet, Rafael David Tinoco,
	Lorenzo Fontana, Leonardo Di Donato

On Wed, Dec 22, 2021 at 7:33 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Fri, Dec 17, 2021 at 10:57 AM Mauricio Vásquez <mauricio@kinvolk.io> wrote:
> >
> > The BTFGen's goal is to produce a BTF file that contains **only** the
> > information that is needed by an eBPF program. This algorithm does a
> > first step collecting the types involved for each relocation present on
> > the object and "marking" them as needed. Types are collected in
> > different ways according to the type relocation, for field based
> > relocations only the union and structures members involved are
> > considered, for type based relocations the whole types are added, enum
> > field relocations are not supported in this iteration yet.
> >
> > A second step generates a BTF file from the "marked" types. This step
> > accesses the original BTF file extracting the types and their members
> > that were "marked" as needed in the first step.
> >
> > This command is implemented under the "gen" command in bpftool and the
> > syntax is the following:
> >
> > $ bpftool gen btf INPUT OUTPUT OBJECT(S)
> >
> > INPUT can be either a single BTF file or a folder containing BTF files,
> > when it's a folder, a BTF file is generated for each BTF file contained
> > in this folder. OUTPUT is the file (or folder) where generated files are
> > stored and OBJECT(S) is the list of bpf objects we want to generate the
> > BTF file(s) for (each generated BTF file contains all the types needed
> > by all the objects).
> >
> > Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
> > Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
> > Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
> > Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
> > ---
> >  tools/bpf/bpftool/gen.c | 892 ++++++++++++++++++++++++++++++++++++++++
> >  1 file changed, 892 insertions(+)
> >
>
> I haven't looked through details of stripping BTF itself, let's
> finalize CO-RE relocation parts first. Maybe for the next revision you
> could split bpftool changes in two in some reasonable way so that each
> patch concentrates on different steps of the process a bit more? E.g.,
> first patch might set up new command and BTF stripping parts but leave
> the CO-RE relocation logic unimplemented, and the second path fills in
> that part. Should make it easier to review this big patch.

I totally agree. Will send v4 with more granular commits.

> Please also cc Quentin Monnet to review bpftool parts as well.

He's already there.

> > +static int btf_reloc_info_gen_type(struct btf_reloc_info *info, struct bpf_core_spec *targ_spec)
> > +{
> > +       struct btf *btf = (struct btf *) info->src_btf;
> > +       struct btf_type *btf_type;
> > +       int err = 0;
> > +
> > +       btf_type = (struct btf_type *) btf__type_by_id(btf, targ_spec->root_type_id);
> > +
> > +       return btf_reloc_put_type_all(btf, info, btf_type, targ_spec->root_type_id);
> > +}
> > +
> > +static int btf_reloc_info_gen_enumval(struct btf_reloc_info *info, struct bpf_core_spec *targ_spec)
> > +{
> > +       p_err("untreated enumval based relocation");
>
> why untreated? what's the problem supporting it?
>

Nothing, we haven't given it any priority. It'll be part of the next iteration.

> > +static int btf_reloc_info_gen(struct btf_reloc_info *info, struct bpf_core_spec *res)
> > +{
> > +       if (core_relo_is_type_based(res->relo_kind))
> > +               return btf_reloc_info_gen_type(info, res);
> > +
> > +       if (core_relo_is_enumval_based(res->relo_kind))
> > +               return btf_reloc_info_gen_enumval(info, res);
> > +
> > +       if (core_relo_is_field_based(res->relo_kind))
> > +               return btf_reloc_info_gen_field(info, res);
>
> you can have a simple switch here instead of exposing libbpf internal helpers
>

Will do.

> > +static int btfgen_obj_reloc_info_gen(struct btf_reloc_info *reloc_info, struct bpf_object *obj)
> > +{
> > +       const struct btf_ext_info_sec *sec;
> > +       const struct bpf_core_relo *rec;
> > +       const struct btf_ext_info *seg;
> > +       struct hashmap *cand_cache;
> > +       int err, insn_idx, sec_idx;
> > +       struct bpf_program *prog;
> > +       struct btf_ext *btf_ext;
> > +       const char *sec_name;
> > +       size_t nr_programs;
> > +       struct btf *btf;
> > +       unsigned int i;
> > +
> > +       btf = bpf_object__btf(obj);
> > +       btf_ext = bpf_object__btf_ext(obj);
> > +
> > +       if (btf_ext->core_relo_info.len == 0)
> > +               return 0;
> > +
> > +       cand_cache = bpf_core_create_cand_cache();
> > +       if (IS_ERR(cand_cache))
> > +               return PTR_ERR(cand_cache);
> > +
> > +       bpf_object_set_vmlinux_override(obj, reloc_info->src_btf);
> > +
> > +       seg = &btf_ext->core_relo_info;
> > +       for_each_btf_ext_sec(seg, sec) {
> > +               bool prog_found;
> > +
> > +               sec_name = btf__name_by_offset(btf, sec->sec_name_off);
> > +               if (str_is_empty(sec_name)) {
> > +                       err = -EINVAL;
> > +                       goto out;
> > +               }
> > +
> > +               prog_found = false;
> > +               nr_programs = bpf_object__get_nr_programs(obj);
> > +               for (i = 0; i < nr_programs; i++)       {
> > +                       prog = bpf_object__get_program(obj, i);
> > +                       if (strcmp(bpf_program__section_name(prog), sec_name) == 0) {
> > +                               prog_found = true;
> > +                               break;
> > +                       }
> > +               }
> > +
> > +               if (!prog_found) {
> > +                       pr_warn("sec '%s': failed to find a BPF program\n", sec_name);
> > +                       err = -EINVAL;
> > +                       goto out;
> > +               }
> > +
> > +               sec_idx = bpf_program__sec_idx(prog);
> > +
> > +               for_each_btf_ext_rec(seg, sec, i, rec) {
> > +                       struct bpf_core_relo_res targ_res;
> > +                       struct bpf_core_spec targ_spec;
> > +
> > +                       insn_idx = rec->insn_off / BPF_INSN_SZ;
> > +
> > +                       prog = find_prog_by_sec_insn(obj, sec_idx, insn_idx);
> > +                       if (!prog) {
> > +                               pr_warn("sec '%s': failed to find program at insn #%d for CO-RE offset relocation #%d\n",
> > +                                       sec_name, insn_idx, i);
> > +                               err = -EINVAL;
> > +                               goto out;
> > +                       }
> > +
> > +                       err = bpf_core_calc_relo_res(prog, rec, i, btf, cand_cache, &targ_res,
> > +                                                    &targ_spec);
>
>
> I don't think you need to do *exactly* what libbpf is doing.
> bpf_core_calc_relo_res() doesn't add much on top of
> bpf_core_calc_relo_insn(), if you use bpf_core_calc_relo_insn()
> directly and expose bpf_core_add_cands/bpf_core_free_cands and use
> them directly as well, bypassing bpf_object completely, you won't need
> btf_vmlinux override and make everything less coupled, I think. In the
> future, if you'd like to support BTF module BTFs, you'll have all the
> necessary flexibility to do that, while if you try to reuse every
> single line of bpf_object's code we'll be adding more hacks like your
> bpf_object_set_vmlinux_override().
>
> Fundamentally, you don't care about bpf_object and bpf_programs. All
> you need to do is parse .BTF.ext (you can do that just btf__parse, no
> need to even construct bpf_object!). Construct candidate cache, then
> iterate each CO-RE record, add find/add candidates, calculate
> relocation result, use it for your algorithm.
>
> Yes, there will be a bit of duplication (candidate search), but that's
> better than trying to turn bpf_object inside out with all the custom
> getters/setters that you are exposing (even if it's libbpf internal
> only, still makes it really hard to reasons what's going on and what
> are consequences of manual control over a lot of bpf_object internal
> implementation details).
>

Thanks a lot for this suggestion. We implemented it this way and the
result is much better.

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-01-12 14:27 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-17 18:56 [PATCH bpf-next v3 0/3] libbpf: Implement BTFGen Mauricio Vásquez
2021-12-17 18:56 ` [PATCH bpf-next v3 1/3] libbpf: split bpf_core_apply_relo() Mauricio Vásquez
2021-12-23  0:33   ` Andrii Nakryiko
2022-01-12 14:26     ` Mauricio Vásquez Bernal
2021-12-17 18:56 ` [PATCH bpf-next v3 2/3] libbpf: Implement changes needed for BTFGen in bpftool Mauricio Vásquez
2021-12-23  0:33   ` Andrii Nakryiko
2022-01-12 14:26     ` Mauricio Vásquez Bernal
2021-12-17 18:56 ` [PATCH bpf-next v3 3/3] bpftool: Implement btfgen Mauricio Vásquez
2021-12-23  0:33   ` Andrii Nakryiko
2022-01-12 14:26     ` Mauricio Vásquez Bernal
2021-12-17 23:11 ` [PATCH bpf-next v3 0/3] libbpf: Implement BTFGen Daniel Borkmann
2021-12-20 22:43   ` Mauricio Vásquez Bernal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).