CO-RE requires BTF information describing the kernel types in order to perform the relocations. This is usually provided by the kernel itself when it's configured with CONFIG_DEBUG_INFO_BTF. However, this configuration is not enabled in all distributions and it's not available on kernels before 5.12. It's possible to use CO-RE on kernels without CONFIG_DEBUG_INFO_BTF support by providing the BTF information from an external source. BTFHub[0] contains BTF files for each released kernel not supporting BTF, for the most popular distributions.

Providing the BTF file for a given kernel has some challenges:

1. Each BTF file is a few MBs big, so it's not possible to ship the eBPF program with all the BTF files needed to run on different kernels. (The BTF files would be in the order of GBs if you want to support a high number of kernels.)

2. Downloading the BTF file for the current kernel at runtime delays the start of the program, and it's not always possible to reach an external host to download such a file.

Providing a BTF file with information about all the data types of the kernel just to run an eBPF program is overkill in many cases: usually eBPF programs access only a few kernel fields.

This series implements BTFGen support in bpftool by exposing some of libbpf's internal APIs to it. This idea was discussed during the "Towards truly portable eBPF"[1] presentation at Linux Plumbers 2021.

There is a good example[2] of how to use BTFGen and BTFHub together to generate multiple BTF files, one for each existing/supported kernel, tailored to one application. For example: a complex bpf object might support nearly 400 kernels with BTF files summing up to only 1.5 MB.
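To make the fallback concrete, the sketch below shows how a loader script might check for kernel-provided BTF before resorting to an external file. The /sys/kernel/btf/vmlinux location is the standard one on CONFIG_DEBUG_INFO_BTF kernels; the per-kernel ./btfs layout is purely an assumption for illustration.

```shell
# Sketch only: the per-kernel external directory layout is an assumption.
# Kernels built with CONFIG_DEBUG_INFO_BTF expose their BTF here:
KERNEL_BTF=/sys/kernel/btf/vmlinux

# Hypothetical location of pre-generated (or BTFGen-tailored) BTF files,
# one per kernel release:
EXTERNAL_BTF_DIR=./btfs

if [ -f "$KERNEL_BTF" ]; then
    # Kernel provides BTF; CO-RE needs no external file.
    BTF_PATH=""
else
    # Fall back to an external BTF file matching the running kernel.
    BTF_PATH="$EXTERNAL_BTF_DIR/$(uname -r).btf"
fi
echo "custom BTF path: ${BTF_PATH:-<none, kernel BTF available>}"
```

When BTF_PATH ends up non-empty, a libbpf-based loader would typically hand it over through the btf_custom_path field of struct bpf_object_open_opts.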
[0]: https://github.com/aquasecurity/btfhub/
[1]: https://www.youtube.com/watch?v=igJLKyP1lFk&t=2418s
[2]: https://github.com/aquasecurity/btfhub/tree/main/tools

Changelog:
v2 > v3:
- expose internal libbpf APIs to bpftool instead
- implement btfgen in bpftool
- drop btf__raw_data() from libbpf
v1 > v2:
- introduce bpf_object__prepare() and 'record_core_relos' to expose CO-RE relocations instead of bpf_object__reloc_info_gen()
- rename btf__save_to_file() to btf__raw_data()

v1: https://lore.kernel.org/bpf/20211027203727.208847-1-mauricio@kinvolk.io/
v2: https://lore.kernel.org/bpf/20211116164208.164245-1-mauricio@kinvolk.io/

Mauricio Vásquez (3):
  libbpf: split bpf_core_apply_relo()
  libbpf: Implement changes needed for BTFGen in bpftool
  bpftool: Implement btfgen

 kernel/bpf/btf.c                |  11 +-
 tools/bpf/bpftool/gen.c         | 892 ++++++++++++++++++++++++++++
 tools/lib/bpf/Makefile          |   2 +-
 tools/lib/bpf/libbpf.c          | 124 +++--
 tools/lib/bpf/libbpf.h          |   3 +
 tools/lib/bpf/libbpf.map        |   2 +
 tools/lib/bpf/libbpf_internal.h |  22 +
 tools/lib/bpf/relo_core.c       |  83 +--
 tools/lib/bpf/relo_core.h       |  46 +-
 9 files changed, 1086 insertions(+), 99 deletions(-)

-- 
2.25.1
BTFGen needs to run the CO-RE relocation logic in order to understand which types in the target BTF are involved in a given relocation.

Currently bpf_core_apply_relo() calculates and **applies** a relocation to an instruction. Having both operations in the same function makes it difficult to only calculate the relocation without patching the instruction. This commit splits that logic into two different phases: (1) calculate the relocation and (2) patch the instruction. For the first phase, bpf_core_apply_relo() is renamed to bpf_core_calc_relo_res(), which is now only in charge of calculating the relocation; the second phase uses the already existing bpf_core_patch_insn(). bpf_object__relocate_core() uses both of them, while BTFGen will use only bpf_core_calc_relo_res().

Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
---
 kernel/bpf/btf.c          | 11 +++++-
 tools/lib/bpf/libbpf.c    | 54 ++++++++++++++++-----------
 tools/lib/bpf/relo_core.c | 77 +++++++++++----------------------------
 tools/lib/bpf/relo_core.h | 42 ++++++++++++++++++---
 4 files changed, 99 insertions(+), 85 deletions(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c index a17de71abd2e..5a8f6ef6a341 100644 --- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -6734,6 +6734,7 @@ int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo, { bool need_cands = relo->kind != BPF_CORE_TYPE_ID_LOCAL; struct bpf_core_cand_list cands = {}; + struct bpf_core_relo_res targ_res; struct bpf_core_spec *specs; int err; @@ -6778,8 +6779,14 @@ int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo, */ } - err = bpf_core_apply_relo_insn((void *)ctx->log, insn, relo->insn_off / 8, - relo, relo_idx, ctx->btf, &cands, specs); + err = bpf_core_calc_relo_insn((void *)ctx->log, relo, relo_idx,
ctx->btf, &cands, specs, + &targ_res); + if (err) + goto out; + + err = bpf_core_patch_insn((void *)ctx->log, insn, relo->insn_off / 8, relo, relo_idx, + &targ_res); + out: kfree(specs); if (need_cands) { diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index cf862a19222b..77e2df13715a 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -5498,11 +5498,12 @@ static int record_relo_core(struct bpf_program *prog, return 0; } -static int bpf_core_apply_relo(struct bpf_program *prog, - const struct bpf_core_relo *relo, - int relo_idx, - const struct btf *local_btf, - struct hashmap *cand_cache) +static int bpf_core_calc_relo_res(struct bpf_program *prog, + const struct bpf_core_relo *relo, + int relo_idx, + const struct btf *local_btf, + struct hashmap *cand_cache, + struct bpf_core_relo_res *targ_res) { struct bpf_core_spec specs_scratch[3] = {}; const void *type_key = u32_as_hash_key(relo->type_id); @@ -5511,20 +5512,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, const struct btf_type *local_type; const char *local_name; __u32 local_id = relo->type_id; - struct bpf_insn *insn; - int insn_idx, err; - - if (relo->insn_off % BPF_INSN_SZ) - return -EINVAL; - insn_idx = relo->insn_off / BPF_INSN_SZ; - /* adjust insn_idx from section frame of reference to the local - * program's frame of reference; (sub-)program code is not yet - * relocated, so it's enough to just subtract in-section offset - */ - insn_idx = insn_idx - prog->sec_insn_off; - if (insn_idx >= prog->insns_cnt) - return -EINVAL; - insn = &prog->insns[insn_idx]; + int err; local_type = btf__type_by_id(local_btf, local_id); if (!local_type) @@ -5536,6 +5524,7 @@ static int bpf_core_apply_relo(struct bpf_program *prog, if (prog->obj->gen_loader) { const char *spec_str = btf__name_by_offset(local_btf, relo->access_str_off); + int insn_idx = relo->insn_off / BPF_INSN_SZ; pr_debug("record_relo_core: prog %td insn[%d] %s %s %s final insn_idx %d\n", prog - prog->obj->programs, 
relo->insn_off / 8, @@ -5559,19 +5548,21 @@ static int bpf_core_apply_relo(struct bpf_program *prog, } } - return bpf_core_apply_relo_insn(prog_name, insn, insn_idx, relo, - relo_idx, local_btf, cands, specs_scratch); + return bpf_core_calc_relo_insn(prog_name, relo, relo_idx, local_btf, cands, + specs_scratch, targ_res); } static int bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) { const struct btf_ext_info_sec *sec; + struct bpf_core_relo_res targ_res; const struct bpf_core_relo *rec; const struct btf_ext_info *seg; struct hashmap_entry *entry; struct hashmap *cand_cache = NULL; struct bpf_program *prog; + struct bpf_insn *insn; const char *sec_name; int i, err = 0, insn_idx, sec_idx; @@ -5636,12 +5627,31 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) if (!prog->load) continue; - err = bpf_core_apply_relo(prog, rec, i, obj->btf, cand_cache); + err = bpf_core_calc_relo_res(prog, rec, i, obj->btf, cand_cache, &targ_res); if (err) { pr_warn("prog '%s': relo #%d: failed to relocate: %d\n", prog->name, i, err); goto out; } + + if (rec->insn_off % BPF_INSN_SZ) + return -EINVAL; + insn_idx = rec->insn_off / BPF_INSN_SZ; + /* adjust insn_idx from section frame of reference to the local + * program's frame of reference; (sub-)program code is not yet + * relocated, so it's enough to just subtract in-section offset + */ + insn_idx = insn_idx - prog->sec_insn_off; + if (insn_idx >= prog->insns_cnt) + return -EINVAL; + insn = &prog->insns[insn_idx]; + + err = bpf_core_patch_insn(prog->name, insn, insn_idx, rec, i, &targ_res); + if (err) { + pr_warn("prog '%s': relo #%d: failed to patch insn #%u: %d\n", + prog->name, i, insn_idx, err); + goto out; + } } } diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c index 910865e29edc..4f3552468624 100644 --- a/tools/lib/bpf/relo_core.c +++ b/tools/lib/bpf/relo_core.c @@ -775,31 +775,6 @@ static int bpf_core_calc_enumval_relo(const struct bpf_core_relo *relo, 
return 0; } -struct bpf_core_relo_res -{ - /* expected value in the instruction, unless validate == false */ - __u32 orig_val; - /* new value that needs to be patched up to */ - __u32 new_val; - /* relocation unsuccessful, poison instruction, but don't fail load */ - bool poison; - /* some relocations can't be validated against orig_val */ - bool validate; - /* for field byte offset relocations or the forms: - * *(T *)(rX + <off>) = rY - * rX = *(T *)(rY + <off>), - * we remember original and resolved field size to adjust direct - * memory loads of pointers and integers; this is necessary for 32-bit - * host kernel architectures, but also allows to automatically - * relocate fields that were resized from, e.g., u32 to u64, etc. - */ - bool fail_memsz_adjust; - __u32 orig_sz; - __u32 orig_type_id; - __u32 new_sz; - __u32 new_type_id; -}; - /* Calculate original and target relocation values, given local and target * specs and relocation kind. These values are calculated for each candidate. * If there are multiple candidates, resulting values should all be consistent @@ -951,9 +926,9 @@ static int insn_bytes_to_bpf_size(__u32 sz) * 5. *(T *)(rX + <off>) = rY, where T is one of {u8, u16, u32, u64}; * 6. *(T *)(rX + <off>) = <imm>, where T is one of {u8, u16, u32, u64}. */ -static int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn, - int insn_idx, const struct bpf_core_relo *relo, - int relo_idx, const struct bpf_core_relo_res *res) +int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn, + int insn_idx, const struct bpf_core_relo *relo, + int relo_idx, const struct bpf_core_relo_res *res) { __u32 orig_val, new_val; __u8 class; @@ -1177,18 +1152,18 @@ static void bpf_core_dump_spec(const char *prog_name, int level, const struct bp * between multiple relocations for the same type ID and is updated as some * of the candidates are pruned due to structural incompatibility. 
*/ -int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, - int insn_idx, - const struct bpf_core_relo *relo, - int relo_idx, - const struct btf *local_btf, - struct bpf_core_cand_list *cands, - struct bpf_core_spec *specs_scratch) +int bpf_core_calc_relo_insn(const char *prog_name, + const struct bpf_core_relo *relo, + int relo_idx, + const struct btf *local_btf, + struct bpf_core_cand_list *cands, + struct bpf_core_spec *specs_scratch, + struct bpf_core_relo_res *targ_res) { struct bpf_core_spec *local_spec = &specs_scratch[0]; struct bpf_core_spec *cand_spec = &specs_scratch[1]; struct bpf_core_spec *targ_spec = &specs_scratch[2]; - struct bpf_core_relo_res cand_res, targ_res; + struct bpf_core_relo_res cand_res; const struct btf_type *local_type; const char *local_name; __u32 local_id; @@ -1223,12 +1198,12 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, /* TYPE_ID_LOCAL relo is special and doesn't need candidate search */ if (relo->kind == BPF_CORE_TYPE_ID_LOCAL) { /* bpf_insn's imm value could get out of sync during linking */ - memset(&targ_res, 0, sizeof(targ_res)); - targ_res.validate = false; - targ_res.poison = false; - targ_res.orig_val = local_spec->root_type_id; - targ_res.new_val = local_spec->root_type_id; - goto patch_insn; + memset(targ_res, 0, sizeof(*targ_res)); + targ_res->validate = true; + targ_res->poison = false; + targ_res->orig_val = local_spec->root_type_id; + targ_res->new_val = local_spec->root_type_id; + return 0; } /* libbpf doesn't support candidate search for anonymous types */ @@ -1262,7 +1237,7 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, return err; if (j == 0) { - targ_res = cand_res; + *targ_res = cand_res; *targ_spec = *cand_spec; } else if (cand_spec->bit_offset != targ_spec->bit_offset) { /* if there are many field relo candidates, they @@ -1272,7 +1247,8 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, prog_name, 
relo_idx, cand_spec->bit_offset, targ_spec->bit_offset); return -EINVAL; - } else if (cand_res.poison != targ_res.poison || cand_res.new_val != targ_res.new_val) { + } else if (cand_res.poison != targ_res->poison || + cand_res.new_val != targ_res->new_val) { /* all candidates should result in the same relocation * decision and value, otherwise it's dangerous to * proceed due to ambiguity @@ -1280,7 +1256,7 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, pr_warn("prog '%s': relo #%d: relocation decision ambiguity: %s %u != %s %u\n", prog_name, relo_idx, cand_res.poison ? "failure" : "success", cand_res.new_val, - targ_res.poison ? "failure" : "success", targ_res.new_val); + targ_res->poison ? "failure" : "success", targ_res->new_val); return -EINVAL; } @@ -1314,19 +1290,10 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, prog_name, relo_idx); /* calculate single target relo result explicitly */ - err = bpf_core_calc_relo(prog_name, relo, relo_idx, local_spec, NULL, &targ_res); + err = bpf_core_calc_relo(prog_name, relo, relo_idx, local_spec, NULL, targ_res); if (err) return err; } -patch_insn: - /* bpf_core_patch_insn() should know how to handle missing targ_spec */ - err = bpf_core_patch_insn(prog_name, insn, insn_idx, relo, relo_idx, &targ_res); - if (err) { - pr_warn("prog '%s': relo #%d: failed to patch insn #%u: %d\n", - prog_name, relo_idx, relo->insn_off / 8, err); - return -EINVAL; - } - return 0; } diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h index 17799819ad7c..a28bf3711ce2 100644 --- a/tools/lib/bpf/relo_core.h +++ b/tools/lib/bpf/relo_core.h @@ -44,14 +44,44 @@ struct bpf_core_spec { __u32 bit_offset; }; -int bpf_core_apply_relo_insn(const char *prog_name, - struct bpf_insn *insn, int insn_idx, - const struct bpf_core_relo *relo, int relo_idx, - const struct btf *local_btf, - struct bpf_core_cand_list *cands, - struct bpf_core_spec *specs_scratch); +struct bpf_core_relo_res 
{ + /* expected value in the instruction, unless validate == false */ + __u32 orig_val; + /* new value that needs to be patched up to */ + __u32 new_val; + /* relocation unsuccessful, poison instruction, but don't fail load */ + bool poison; + /* some relocations can't be validated against orig_val */ + bool validate; + /* for field byte offset relocations or the forms: + * *(T *)(rX + <off>) = rY + * rX = *(T *)(rY + <off>), + * we remember original and resolved field size to adjust direct + * memory loads of pointers and integers; this is necessary for 32-bit + * host kernel architectures, but also allows to automatically + * relocate fields that were resized from, e.g., u32 to u64, etc. + */ + bool fail_memsz_adjust; + __u32 orig_sz; + __u32 orig_type_id; + __u32 new_sz; + __u32 new_type_id; +}; + int bpf_core_types_are_compat(const struct btf *local_btf, __u32 local_id, const struct btf *targ_btf, __u32 targ_id); size_t bpf_core_essential_name_len(const char *name); + +int bpf_core_calc_relo_insn(const char *prog_name, + const struct bpf_core_relo *relo, int relo_idx, + const struct btf *local_btf, + struct bpf_core_cand_list *cands, + struct bpf_core_spec *specs_scratch, + struct bpf_core_relo_res *targ_res); + +int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn, + int insn_idx, const struct bpf_core_relo *relo, + int relo_idx, const struct bpf_core_relo_res *res); + #endif -- 2.25.1
Given that BTFGen is such a unique use case, let's expose some of the libbpf internal APIs to bpftool. This commit extends libbpf with the features that are needed to implement it. Specifically, it introduces the following functions:

- bpf_object__btf_ext(): public method to access the BTF.ext section of a bpf object.
- bpf_program__sec_idx(): public method to get the section index of a program within the ELF file.
- bpf_core_create_cand_cache() and bpf_core_free_cand_cache(): handle the candidates cache.
- bpf_core_calc_relo_res(): exposed in libbpf_internal.h, with a new "struct bpf_core_spec *targ_spec" parameter.
- bpf_object_set_vmlinux_override(): internal function to set obj->btf_vmlinux_override.
- bpf_object__get_nr_programs() and bpf_object__get_program(): give access to the programs inside an object. bpf_object__for_each_program() is not good enough because BTFGen needs to access subprograms too.

Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io>
Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com>
Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co>
Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co>
---
 tools/lib/bpf/Makefile          |  2 +-
 tools/lib/bpf/libbpf.c          | 88 ++++++++++++++++++++++++++-------
 tools/lib/bpf/libbpf.h          |  3 ++
 tools/lib/bpf/libbpf.map        |  2 +
 tools/lib/bpf/libbpf_internal.h | 22 +++++++++
 tools/lib/bpf/relo_core.c       |  6 +--
 tools/lib/bpf/relo_core.h       |  4 ++
 7 files changed, 104 insertions(+), 23 deletions(-)

diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile index f947b61b2107..dba019ee2832 100644 --- a/tools/lib/bpf/Makefile +++ b/tools/lib/bpf/Makefile @@ -239,7 +239,7 @@ install_lib: all_cmd SRC_HDRS := bpf.h libbpf.h btf.h libbpf_common.h libbpf_legacy.h xsk.h \ bpf_helpers.h bpf_tracing.h bpf_endian.h bpf_core_read.h \ - skel_internal.h libbpf_version.h + skel_internal.h libbpf_version.h relo_core.h libbpf_internal.h GEN_HDRS := $(BPF_GENERATED) INSTALL_PFX := $(DESTDIR)$(prefix)/include/bpf
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index 77e2df13715a..7c8f8797475f 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -4027,8 +4027,8 @@ static bool prog_contains_insn(const struct bpf_program *prog, size_t insn_idx) insn_idx < prog->sec_insn_off + prog->sec_insn_cnt; } -static struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj, - size_t sec_idx, size_t insn_idx) +struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj, + size_t sec_idx, size_t insn_idx) { int l = 0, r = obj->nr_programs - 1, m; struct bpf_program *prog; @@ -5498,12 +5498,13 @@ static int record_relo_core(struct bpf_program *prog, return 0; } -static int bpf_core_calc_relo_res(struct bpf_program *prog, - const struct bpf_core_relo *relo, - int relo_idx, - const struct btf *local_btf, - struct hashmap *cand_cache, - struct bpf_core_relo_res *targ_res) +int bpf_core_calc_relo_res(struct bpf_program *prog, + const struct bpf_core_relo *relo, + int relo_idx, + const struct btf *local_btf, + struct hashmap *cand_cache, + struct bpf_core_relo_res *targ_res, + struct bpf_core_spec *targ_spec) { struct bpf_core_spec specs_scratch[3] = {}; const void *type_key = u32_as_hash_key(relo->type_id); @@ -5548,8 +5549,32 @@ static int bpf_core_calc_relo_res(struct bpf_program *prog, } } - return bpf_core_calc_relo_insn(prog_name, relo, relo_idx, local_btf, cands, - specs_scratch, targ_res); + err = bpf_core_calc_relo_insn(prog_name, relo, relo_idx, local_btf, cands, + specs_scratch, targ_res); + if (err) + return err; + + if (targ_spec) + *targ_spec = specs_scratch[2]; + return 0; +} + +struct hashmap *bpf_core_create_cand_cache(void) +{ + return hashmap__new(bpf_core_hash_fn, bpf_core_equal_fn, NULL); +} + +void bpf_core_free_cand_cache(struct hashmap *cand_cache) +{ + struct hashmap_entry *entry; + int i; + + if (!IS_ERR_OR_NULL(cand_cache)) { + hashmap__for_each_entry(cand_cache, entry, i) { + bpf_core_free_cands(entry->value); + } + 
hashmap__free(cand_cache); + } } static int @@ -5559,7 +5584,6 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) struct bpf_core_relo_res targ_res; const struct bpf_core_relo *rec; const struct btf_ext_info *seg; - struct hashmap_entry *entry; struct hashmap *cand_cache = NULL; struct bpf_program *prog; struct bpf_insn *insn; @@ -5578,7 +5602,7 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) } } - cand_cache = hashmap__new(bpf_core_hash_fn, bpf_core_equal_fn, NULL); + cand_cache = bpf_core_create_cand_cache(); if (IS_ERR(cand_cache)) { err = PTR_ERR(cand_cache); goto out; @@ -5627,7 +5651,8 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) if (!prog->load) continue; - err = bpf_core_calc_relo_res(prog, rec, i, obj->btf, cand_cache, &targ_res); + err = bpf_core_calc_relo_res(prog, rec, i, obj->btf, cand_cache, &targ_res, + NULL); if (err) { pr_warn("prog '%s': relo #%d: failed to relocate: %d\n", prog->name, i, err); @@ -5660,12 +5685,8 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) btf__free(obj->btf_vmlinux_override); obj->btf_vmlinux_override = NULL; - if (!IS_ERR_OR_NULL(cand_cache)) { - hashmap__for_each_entry(cand_cache, entry, i) { - bpf_core_free_cands(entry->value); - } - hashmap__free(cand_cache); - } + bpf_core_free_cand_cache(cand_cache); + return err; } @@ -8190,6 +8211,11 @@ struct btf *bpf_object__btf(const struct bpf_object *obj) return obj ? obj->btf : NULL; } +struct btf_ext *bpf_object__btf_ext(const struct bpf_object *obj) +{ + return obj ? obj->btf_ext : NULL; +} + int bpf_object__btf_fd(const struct bpf_object *obj) { return obj->btf ? 
btf__fd(obj->btf) : -1; @@ -8281,6 +8307,20 @@ bpf_object__next_program(const struct bpf_object *obj, struct bpf_program *prev) return prog; } +size_t bpf_object__get_nr_programs(const struct bpf_object *obj) +{ + return obj->nr_programs; +} + +struct bpf_program * +bpf_object__get_program(const struct bpf_object *obj, unsigned int i) +{ + if (i >= obj->nr_programs) + return NULL; + + return &obj->programs[i]; +} + struct bpf_program * bpf_program__prev(struct bpf_program *next, const struct bpf_object *obj) { @@ -8360,6 +8400,11 @@ int bpf_program__set_autoload(struct bpf_program *prog, bool autoload) return 0; } +int bpf_program__sec_idx(const struct bpf_program *prog) +{ + return prog->sec_idx; +} + static int bpf_program_nth_fd(const struct bpf_program *prog, int n); int bpf_program__fd(const struct bpf_program *prog) @@ -11779,3 +11824,8 @@ void bpf_object__destroy_skeleton(struct bpf_object_skeleton *s) free(s->progs); free(s); } + +void bpf_object_set_vmlinux_override(struct bpf_object *obj, struct btf *btf) +{ + obj->btf_vmlinux_override = btf; +} diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h index 42b2f36fd9f0..2b048ee5a9b2 100644 --- a/tools/lib/bpf/libbpf.h +++ b/tools/lib/bpf/libbpf.h @@ -225,6 +225,8 @@ LIBBPF_API int bpf_object__set_kversion(struct bpf_object *obj, __u32 kern_versi struct btf; LIBBPF_API struct btf *bpf_object__btf(const struct bpf_object *obj); +struct btf_ext; +LIBBPF_API struct btf_ext *bpf_object__btf_ext(const struct bpf_object *obj); LIBBPF_API int bpf_object__btf_fd(const struct bpf_object *obj); LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__find_program_by_name() instead") @@ -290,6 +292,7 @@ LIBBPF_API LIBBPF_DEPRECATED("BPF program title is confusing term; please use bp const char *bpf_program__title(const struct bpf_program *prog, bool needs_copy); LIBBPF_API bool bpf_program__autoload(const struct bpf_program *prog); LIBBPF_API int bpf_program__set_autoload(struct bpf_program *prog, bool autoload); 
+LIBBPF_API int bpf_program__sec_idx(const struct bpf_program *prog); /* returns program size in bytes */ LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_program__insn_cnt() instead") diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map index b3938b3f8fc9..15da4075e0b5 100644 --- a/tools/lib/bpf/libbpf.map +++ b/tools/lib/bpf/libbpf.map @@ -392,6 +392,7 @@ LIBBPF_0.6.0 { bpf_map__map_extra; bpf_map__set_map_extra; bpf_map_create; + bpf_object__btf_ext; bpf_object__next_map; bpf_object__next_program; bpf_object__prev_map; @@ -401,6 +402,7 @@ LIBBPF_0.6.0 { bpf_program__flags; bpf_program__insn_cnt; bpf_program__insns; + bpf_program__sec_idx; bpf_program__set_flags; btf__add_btf; btf__add_decl_tag; diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h index 5dbe4f463880..b1962adb110c 100644 --- a/tools/lib/bpf/libbpf_internal.h +++ b/tools/lib/bpf/libbpf_internal.h @@ -524,4 +524,26 @@ static inline int ensure_good_fd(int fd) return fd; } +struct hashmap; + +int bpf_core_calc_relo_res(struct bpf_program *prog, + const struct bpf_core_relo *relo, + int relo_idx, + const struct btf *local_btf, + struct hashmap *cand_cache, + struct bpf_core_relo_res *targ_res, + struct bpf_core_spec *targ_spec); +void bpf_object_set_vmlinux_override(struct bpf_object *obj, struct btf *btf); +struct hashmap *bpf_core_create_cand_cache(void); +void bpf_core_free_cand_cache(struct hashmap *cand_cache); + +struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj, + size_t sec_idx, size_t insn_idx); + +size_t bpf_object__get_nr_programs(const struct bpf_object *obj); + +struct bpf_program * +bpf_object__get_program(const struct bpf_object *obj, unsigned int n); + + #endif /* __LIBBPF_LIBBPF_INTERNAL_H */ diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c index 4f3552468624..66dfb7fa89a2 100644 --- a/tools/lib/bpf/relo_core.c +++ b/tools/lib/bpf/relo_core.c @@ -102,7 +102,7 @@ static const char *core_relo_kind_str(enum 
bpf_core_relo_kind kind) } } -static bool core_relo_is_field_based(enum bpf_core_relo_kind kind) +bool core_relo_is_field_based(enum bpf_core_relo_kind kind) { switch (kind) { case BPF_CORE_FIELD_BYTE_OFFSET: @@ -117,7 +117,7 @@ static bool core_relo_is_field_based(enum bpf_core_relo_kind kind) } } -static bool core_relo_is_type_based(enum bpf_core_relo_kind kind) +bool core_relo_is_type_based(enum bpf_core_relo_kind kind) { switch (kind) { case BPF_CORE_TYPE_ID_LOCAL: @@ -130,7 +130,7 @@ static bool core_relo_is_type_based(enum bpf_core_relo_kind kind) } } -static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind) +bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind) { switch (kind) { case BPF_CORE_ENUMVAL_EXISTS: diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h index a28bf3711ce2..e969dfb032f4 100644 --- a/tools/lib/bpf/relo_core.h +++ b/tools/lib/bpf/relo_core.h @@ -84,4 +84,8 @@ int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn, int insn_idx, const struct bpf_core_relo *relo, int relo_idx, const struct bpf_core_relo_res *res); +bool core_relo_is_field_based(enum bpf_core_relo_kind kind); +bool core_relo_is_type_based(enum bpf_core_relo_kind kind); +bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind); + #endif -- 2.25.1
BTFGen's goal is to produce a BTF file that contains **only** the information that is needed by an eBPF program.

The algorithm's first step collects the types involved in each relocation present in the object and "marks" them as needed. Types are collected in different ways according to the relocation kind: for field-based relocations only the union and structure members involved are considered; for type-based relocations the whole types are added; enum field relocations are not supported in this iteration yet.

A second step generates a BTF file from the "marked" types. This step accesses the original BTF file, extracting the types and the members that were "marked" as needed in the first step.

This command is implemented under the "gen" command in bpftool and the syntax is the following:

$ bpftool gen btf INPUT OUTPUT OBJECT(S)

INPUT can be either a single BTF file or a folder containing BTF files; when it's a folder, a BTF file is generated for each BTF file contained in it. OUTPUT is the file (or folder) where generated files are stored, and OBJECT(S) is the list of bpf objects we want to generate the BTF file(s) for (each generated BTF file contains all the types needed by all the objects).
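To make the folder mode concrete, here is a sketch of one possible invocation. The BTFHub-style input directory, output directory, object name, and the per-kernel output naming are assumptions for illustration, not guaranteed by this series:

```shell
# Hypothetical inputs: a folder of source BTF files (e.g. from BTFHub)
# and one compiled eBPF object.
SRC_BTF_DIR=./btfhub/ubuntu/20.04/x86_64
OUT_DIR=./tailored-btf
OBJ=./myprog.bpf.o

mkdir -p "$OUT_DIR"

# Folder mode: one tailored BTF file is generated per file in SRC_BTF_DIR,
# containing only the types $OBJ needs. (Guarded so the sketch is harmless
# where bpftool or the inputs are absent.)
command -v bpftool >/dev/null 2>&1 && \
    bpftool gen btf "$SRC_BTF_DIR" "$OUT_DIR" "$OBJ"

# Assuming outputs keep a per-kernel name, a loader can later pick the one
# matching the running kernel:
KVER=5.4.0-90-generic              # example release string; normally $(uname -r)
TAILORED="$OUT_DIR/${KVER}.btf"
echo "would load: $TAILORED"
```

At runtime the program then ships only the small tailored files instead of full multi-MB BTF files, which is what makes the ~1.5 MB / ~400 kernels figure from the cover letter possible.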
Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com> Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co> --- tools/bpf/bpftool/gen.c | 892 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 892 insertions(+) diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c index b4695df2ea3d..2794fe88f609 100644 --- a/tools/bpf/bpftool/gen.c +++ b/tools/bpf/bpftool/gen.c @@ -5,6 +5,7 @@ #define _GNU_SOURCE #endif #include <ctype.h> +#include <dirent.h> #include <errno.h> #include <fcntl.h> #include <linux/err.h> @@ -14,6 +15,7 @@ #include <unistd.h> #include <bpf/bpf.h> #include <bpf/libbpf.h> +#include <bpf/libbpf_internal.h> #include <sys/types.h> #include <sys/stat.h> #include <sys/mman.h> @@ -1084,6 +1086,7 @@ static int do_help(int argc, char **argv) fprintf(stderr, "Usage: %1$s %2$s object OUTPUT_FILE INPUT_FILE [INPUT_FILE...]\n" " %1$s %2$s skeleton FILE [name OBJECT_NAME]\n" + " %1$s %2$s btf INPUT OUTPUT OBJECT(S)\n" " %1$s %2$s help\n" "\n" " " HELP_SPEC_OPTIONS " |\n" @@ -1094,9 +1097,898 @@ static int do_help(int argc, char **argv) return 0; } +static int btf_save_raw(const struct btf *btf, const char *path) +{ + const void *data; + FILE *f = NULL; + __u32 data_sz; + int err = 0; + + data = btf__raw_data(btf, &data_sz); + if (!data) { + err = -ENOMEM; + goto out; + } + + f = fopen(path, "wb"); + if (!f) { + err = -errno; + goto out; + } + + if (fwrite(data, 1, data_sz, f) != data_sz) { + err = -errno; + goto out; + } + +out: + if (f) + fclose(f); + return libbpf_err(err); +} + +struct btf_reloc_member { + struct btf_member *member; + int idx; +}; + +struct btf_reloc_type { + struct btf_type *type; + unsigned int id; + bool added_by_all; + + struct hashmap *members; +}; + +struct btf_reloc_info { + struct hashmap *types; + struct hashmap *ids_map; + + struct btf *src_btf; +}; + +static size_t 
bpf_reloc_info_hash_fn(const void *key, void *ctx) +{ + return (size_t)key; +} + +static bool bpf_reloc_info_equal_fn(const void *k1, const void *k2, void *ctx) +{ + return k1 == k2; +} + +static void *uint_as_hash_key(int x) +{ + return (void *)(uintptr_t)x; +} + +static void bpf_reloc_type_free(struct btf_reloc_type *type) +{ + struct hashmap_entry *entry; + size_t bkt; + + if (IS_ERR_OR_NULL(type)) + return; + + if (!IS_ERR_OR_NULL(type->members)) { + hashmap__for_each_entry(type->members, entry, bkt) { + free(entry->value); + } + hashmap__free(type->members); + } + + free(type); +} + +static void btfgen_reloc_info_free(struct btf_reloc_info *info) +{ + struct hashmap_entry *entry; + size_t bkt; + + if (!info) + return; + + hashmap__free(info->ids_map); + + if (!IS_ERR_OR_NULL(info->types)) { + hashmap__for_each_entry(info->types, entry, bkt) { + bpf_reloc_type_free(entry->value); + } + hashmap__free(info->types); + } + + btf__free(info->src_btf); + + free(info); +} + +static struct btf_reloc_info * +btfgen_reloc_info_new(const char *targ_btf_path) +{ + struct btf_reloc_info *info; + struct btf *src_btf; + struct hashmap *ids_map; + struct hashmap *types; + + info = calloc(1, sizeof(*info)); + if (!info) + return ERR_PTR(-ENOMEM); + + src_btf = btf__parse(targ_btf_path, NULL); + if (libbpf_get_error(src_btf)) { + btfgen_reloc_info_free(info); + return (void *) src_btf; + } + + info->src_btf = src_btf; + + ids_map = hashmap__new(bpf_reloc_info_hash_fn, bpf_reloc_info_equal_fn, NULL); + if (IS_ERR(ids_map)) { + btfgen_reloc_info_free(info); + return (void *) ids_map; + } + + info->ids_map = ids_map; + + types = hashmap__new(bpf_reloc_info_hash_fn, bpf_reloc_info_equal_fn, NULL); + if (IS_ERR(types)) { + btfgen_reloc_info_free(info); + return (void *) types; + } + + info->types = types; + + return info; +} + +/* Return id for type in new btf instance */ +static unsigned int btf_reloc_id_get(struct btf_reloc_info *info, unsigned int old) +{ + uintptr_t new = 0; + + 
/* deal with BTF_KIND_VOID */ + if (old == 0) + return 0; + + if (!hashmap__find(info->ids_map, uint_as_hash_key(old), (void **)&new)) { + /* return id for void as it's possible that the ID we're looking for is + * the type of a pointer that we're not adding. + */ + return 0; + } + + return (unsigned int)(uintptr_t)new; +} + +/* Add a new old-to-new ID mapping */ +static int btf_reloc_id_add(struct btf_reloc_info *info, unsigned int old, unsigned int new) +{ + return hashmap__add(info->ids_map, uint_as_hash_key(old), uint_as_hash_key(new)); +} + +/* + * Put type in the list. If the type already exists it's returned, otherwise a + * new one is created and added to the list. This is called recursively adding + * all the types that are needed for the current one. + */ +static struct btf_reloc_type *btf_reloc_put_type(struct btf *btf, + struct btf_reloc_info *info, + struct btf_type *btf_type, + unsigned int id) +{ + struct btf_reloc_type *reloc_type, *tmp; + struct btf_array *array; + unsigned int child_id; + int err; + + /* check if we already have this type */ + if (hashmap__find(info->types, uint_as_hash_key(id), (void **) &reloc_type)) + return reloc_type; + + + /* do nothing. void is implicit in BTF */ + if (id == 0) + return NULL; + + reloc_type = calloc(1, sizeof(*reloc_type)); + if (!reloc_type) + return ERR_PTR(-ENOMEM); + + reloc_type->type = btf_type; + reloc_type->id = id; + + /* append this type to the relocation type's list before anything else */ + err = hashmap__add(info->types, uint_as_hash_key(reloc_type->id), reloc_type); + if (err) + return ERR_PTR(err); + + /* complex types might need further processing */ + switch (btf_kind(reloc_type->type)) { + /* already processed */ + case BTF_KIND_UNKN: + case BTF_KIND_INT: + case BTF_KIND_FLOAT: + /* processed by callee */ + case BTF_KIND_STRUCT: + case BTF_KIND_UNION: + /* doesn't need resolution. If the data of the pointer is used + * then it'll be added by the caller in another relocation.
+ */ + case BTF_KIND_PTR: + break; + /* needs resolution */ + case BTF_KIND_CONST: + case BTF_KIND_VOLATILE: + case BTF_KIND_TYPEDEF: + child_id = btf_type->type; + btf_type = (struct btf_type *) btf__type_by_id(btf, child_id); + if (!btf_type) + return ERR_PTR(-EINVAL); + + tmp = btf_reloc_put_type(btf, info, btf_type, child_id); + if (IS_ERR(tmp)) + return tmp; + break; + /* needs resolution */ + case BTF_KIND_ARRAY: + array = btf_array(reloc_type->type); + + /* add type for array type */ + btf_type = (struct btf_type *) btf__type_by_id(btf, array->type); + tmp = btf_reloc_put_type(btf, info, btf_type, array->type); + if (IS_ERR(tmp)) + return tmp; + + /* add type for array's index type */ + btf_type = (struct btf_type *) btf__type_by_id(btf, array->index_type); + tmp = btf_reloc_put_type(btf, info, btf_type, array->index_type); + if (IS_ERR(tmp)) + return tmp; + + break; + /* tells if some other type needs to be handled */ + default: + p_err("unsupported relocation: %d", reloc_type->id); + return ERR_PTR(-EINVAL); + } + + return reloc_type; +} + +/* Return pointer to btf_reloc_type by id */ +static struct btf_reloc_type *btf_reloc_get_type(struct btf_reloc_info *info, int id) +{ + struct btf_reloc_type *type = NULL; + + if (!hashmap__find(info->types, uint_as_hash_key(id), (void **)&type)) + return ERR_PTR(-ENOENT); + + return type; +} + +static int bpf_reloc_type_add_member(struct btf_reloc_info *info, + struct btf_reloc_type *reloc_type, + struct btf_member *btf_member, int idx) +{ + struct btf_reloc_member *reloc_member; + int err; + + /* create new members hashmap for this relocation type if needed */ + if (reloc_type->members == NULL) { + struct hashmap *tmp = hashmap__new(bpf_reloc_info_hash_fn, + bpf_reloc_info_equal_fn, + NULL); + if (IS_ERR(tmp)) + return PTR_ERR(tmp); + + reloc_type->members = tmp; + } + /* add given btf_member as a member of the parent relocation_type's type */ + reloc_member = calloc(1, sizeof(*reloc_member)); + if (!reloc_member) + 
return -ENOMEM; + reloc_member->member = btf_member; + reloc_member->idx = idx; + /* add given btf_member as member to given relocation type */ + err = hashmap__add(reloc_type->members, uint_as_hash_key(reloc_member->idx), reloc_member); + if (err) { + free(reloc_member); + if (err != -EEXIST) + return err; + } + + return 0; +} + +/* + * Same as btf_reloc_put_type, but adding all fields, from given complex type, recursively + */ +static int btf_reloc_put_type_all(struct btf *btf, + struct btf_reloc_info *info, + struct btf_type *btf_type, + unsigned int id) +{ + struct btf_reloc_type *reloc_type; + struct btf_array *array; + unsigned int child_id; + struct btf_member *m; + int err, i, n; + + if (id == 0) + return 0; + + if (!hashmap__find(info->types, uint_as_hash_key(id), (void **) &reloc_type)) { + reloc_type = calloc(1, sizeof(*reloc_type)); + if (!reloc_type) + return -ENOMEM; + + reloc_type->type = btf_type; + reloc_type->id = id; + /* avoid infinite recursion and yet be able to add all + * fields/members for types also managed by this function twin + * brother btf_reloc_put_type() + */ + reloc_type->added_by_all = true; + + err = hashmap__add(info->types, uint_as_hash_key(reloc_type->id), reloc_type); + if (err) + return err; + } else { + if (reloc_type->added_by_all) + return 0; + + reloc_type->added_by_all = true; + } + + switch (btf_kind(reloc_type->type)) { + case BTF_KIND_UNKN: + case BTF_KIND_INT: + case BTF_KIND_FLOAT: + case BTF_KIND_ENUM: + /* not a complex type, already solved */ + break; + case BTF_KIND_STRUCT: + case BTF_KIND_UNION: + n = btf_vlen(btf_type); + m = btf_members(btf_type); + for (i = 0; i < n; i++, m++) { + btf_type = (struct btf_type *) btf__type_by_id(btf, m->type); + if (!btf_type) + return -EINVAL; + + /* add all member types */ + err = btf_reloc_put_type_all(btf, info, btf_type, m->type); + if (err) + return err; + + /* add all members */ + err = bpf_reloc_type_add_member(info, reloc_type, m, i); + if (err) + return err; + } + 
break; + case BTF_KIND_PTR: + case BTF_KIND_CONST: + case BTF_KIND_VOLATILE: + case BTF_KIND_TYPEDEF: + /* modifier types */ + child_id = btf_type->type; + btf_type = (struct btf_type *) btf__type_by_id(btf, child_id); + if (!btf_type) + return -EINVAL; + + err = btf_reloc_put_type_all(btf, info, btf_type, child_id); + if (err) + return err; + break; + case BTF_KIND_ARRAY: + array = btf_array(btf_type); + + /* add array member type */ + btf_type = (struct btf_type *) btf__type_by_id(btf, array->type); + if (!btf_type) + return -EINVAL; + err = btf_reloc_put_type_all(btf, info, btf_type, array->type); + if (err) + return err; + + /* add array index type */ + btf_type = (struct btf_type *) btf__type_by_id(btf, array->index_type); + if (!btf_type) + return -EINVAL; + err = btf_reloc_put_type_all(btf, info, btf_type, array->index_type); + if (err) + return err; + break; + default: + p_err("unsupported kind (all): %s (%d)", + btf_kind_str(reloc_type->type), reloc_type->id); + return -EINVAL; + } + + return 0; +} + +static int btf_reloc_info_gen_field(struct btf_reloc_info *info, struct bpf_core_spec *targ_spec) +{ + struct btf *btf = (struct btf *) info->src_btf; + struct btf_reloc_type *reloc_type; + struct btf_member *btf_member; + struct btf_type *btf_type; + struct btf_array *array; + unsigned int id; + int idx, err; + + btf_type = (struct btf_type *) btf__type_by_id(btf, targ_spec->root_type_id); + + /* create reloc type for root type */ + reloc_type = btf_reloc_put_type(btf, info, btf_type, targ_spec->root_type_id); + if (IS_ERR(reloc_type)) + return PTR_ERR(reloc_type); + + /* add types for complex types (arrays, unions, structures) */ + for (int i = 1; i < targ_spec->raw_len; i++) { + /* skip typedefs and mods */ + while (btf_is_mod(btf_type) || btf_is_typedef(btf_type)) { + id = btf_type->type; + reloc_type = btf_reloc_get_type(info, id); + if (IS_ERR(reloc_type)) + return PTR_ERR(reloc_type); + btf_type = (struct btf_type *) btf__type_by_id(btf, id); + } + + switch
(btf_kind(btf_type)) { + case BTF_KIND_STRUCT: + case BTF_KIND_UNION: + idx = targ_spec->raw_spec[i]; + btf_member = btf_members(btf_type) + idx; + btf_type = (struct btf_type *) btf__type_by_id(btf, btf_member->type); + + /* add member to relocation type */ + err = bpf_reloc_type_add_member(info, reloc_type, btf_member, idx); + if (err) + return err; + /* add relocation type */ + reloc_type = btf_reloc_put_type(btf, info, btf_type, btf_member->type); + if (IS_ERR(reloc_type)) + return PTR_ERR(reloc_type); + break; + case BTF_KIND_ARRAY: + array = btf_array(btf_type); + reloc_type = btf_reloc_get_type(info, array->type); + if (IS_ERR(reloc_type)) + return PTR_ERR(reloc_type); + btf_type = (struct btf_type *) btf__type_by_id(btf, array->type); + break; + default: + p_err("spec type wasn't handled"); + return -1; + } + } + + return 0; +} + +static int btf_reloc_info_gen_type(struct btf_reloc_info *info, struct bpf_core_spec *targ_spec) +{ + struct btf *btf = (struct btf *) info->src_btf; + struct btf_type *btf_type; + int err = 0; + + btf_type = (struct btf_type *) btf__type_by_id(btf, targ_spec->root_type_id); + + return btf_reloc_put_type_all(btf, info, btf_type, targ_spec->root_type_id); +} + +static int btf_reloc_info_gen_enumval(struct btf_reloc_info *info, struct bpf_core_spec *targ_spec) +{ + p_err("untreated enumval based relocation"); + return -EOPNOTSUPP; +} + +static int btf_reloc_info_gen(struct btf_reloc_info *info, struct bpf_core_spec *res) +{ + if (core_relo_is_type_based(res->relo_kind)) + return btf_reloc_info_gen_type(info, res); + + if (core_relo_is_enumval_based(res->relo_kind)) + return btf_reloc_info_gen_enumval(info, res); + + if (core_relo_is_field_based(res->relo_kind)) + return btf_reloc_info_gen_field(info, res); + + return -EINVAL; +} + +#define BPF_INSN_SZ (sizeof(struct bpf_insn)) + +static int btfgen_obj_reloc_info_gen(struct btf_reloc_info *reloc_info, struct bpf_object *obj) +{ + const struct btf_ext_info_sec *sec; + const struct 
bpf_core_relo *rec; + const struct btf_ext_info *seg; + struct hashmap *cand_cache; + int err, insn_idx, sec_idx; + struct bpf_program *prog; + struct btf_ext *btf_ext; + const char *sec_name; + size_t nr_programs; + struct btf *btf; + unsigned int i; + + btf = bpf_object__btf(obj); + btf_ext = bpf_object__btf_ext(obj); + + if (btf_ext->core_relo_info.len == 0) + return 0; + + cand_cache = bpf_core_create_cand_cache(); + if (IS_ERR(cand_cache)) + return PTR_ERR(cand_cache); + + bpf_object_set_vmlinux_override(obj, reloc_info->src_btf); + + seg = &btf_ext->core_relo_info; + for_each_btf_ext_sec(seg, sec) { + bool prog_found; + + sec_name = btf__name_by_offset(btf, sec->sec_name_off); + if (str_is_empty(sec_name)) { + err = -EINVAL; + goto out; + } + + prog_found = false; + nr_programs = bpf_object__get_nr_programs(obj); + for (i = 0; i < nr_programs; i++) { + prog = bpf_object__get_program(obj, i); + if (strcmp(bpf_program__section_name(prog), sec_name) == 0) { + prog_found = true; + break; + } + } + + if (!prog_found) { + pr_warn("sec '%s': failed to find a BPF program\n", sec_name); + err = -EINVAL; + goto out; + } + + sec_idx = bpf_program__sec_idx(prog); + + for_each_btf_ext_rec(seg, sec, i, rec) { + struct bpf_core_relo_res targ_res; + struct bpf_core_spec targ_spec; + + insn_idx = rec->insn_off / BPF_INSN_SZ; + + prog = find_prog_by_sec_insn(obj, sec_idx, insn_idx); + if (!prog) { + pr_warn("sec '%s': failed to find program at insn #%d for CO-RE offset relocation #%d\n", + sec_name, insn_idx, i); + err = -EINVAL; + goto out; + } + + err = bpf_core_calc_relo_res(prog, rec, i, btf, cand_cache, &targ_res, + &targ_spec); + if (err) + goto out; + + err = btf_reloc_info_gen(reloc_info, &targ_spec); + if (err) + goto out; + } + } + +out: + bpf_core_free_cand_cache(cand_cache); + + return err; +} + +static struct btf *btfgen_reloc_info_get_btf(struct btf_reloc_info *info) +{ + struct hashmap_entry *entry; + struct btf *btf_new; + size_t bkt; + int err; + + btf_new = 
btf__new_empty(); + if (IS_ERR(btf_new)) { + p_err("failed to allocate btf structure"); + return btf_new; + } + + /* first pass: add all types and add their new ids to the ids map */ + hashmap__for_each_entry(info->types, entry, bkt) { + struct btf_reloc_type *reloc_type = entry->value; + struct btf_type *btf_type = reloc_type->type; + int new_id; + + /* add members for struct and union */ + if (btf_is_struct(btf_type) || btf_is_union(btf_type)) { + struct hashmap_entry *member_entry; + struct btf_type *btf_type_cpy; + int nmembers, index; + size_t new_size; + + nmembers = reloc_type->members ? hashmap__size(reloc_type->members) : 0; + new_size = sizeof(struct btf_type) + nmembers * sizeof(struct btf_member); + + btf_type_cpy = malloc(new_size); + if (!btf_type_cpy) { + err = -ENOMEM; + goto out; + } + + /* copy header */ + memcpy(btf_type_cpy, btf_type, sizeof(*btf_type_cpy)); + + /* copy only members that are needed */ + index = 0; + if (nmembers > 0) { + size_t bkt2; + + hashmap__for_each_entry(reloc_type->members, member_entry, bkt2) { + struct btf_reloc_member *reloc_member; + struct btf_member *btf_member; + + reloc_member = member_entry->value; + btf_member = btf_members(btf_type) + reloc_member->idx; + + memcpy(btf_members(btf_type_cpy) + index, btf_member, + sizeof(struct btf_member)); + + index++; + } + } + + /* set new vlen */ + btf_type_cpy->info = btf_type_info(btf_kind(btf_type_cpy), nmembers, + btf_kflag(btf_type_cpy)); + + err = btf__add_type(btf_new, info->src_btf, btf_type_cpy); + free(btf_type_cpy); + } else { + err = btf__add_type(btf_new, info->src_btf, btf_type); + } + + if (err < 0) + goto out; + + new_id = err; + + /* add ID mapping */ + err = btf_reloc_id_add(info, reloc_type->id, new_id); + if (err) + goto out; + } + + /* second pass: fix up type ids */ + for (unsigned int i = 0; i < btf__type_cnt(btf_new); i++) { + struct btf_member *btf_member; + struct btf_type *btf_type; + struct btf_param *params; + struct btf_array *array; + + 
btf_type = (struct btf_type *) btf__type_by_id(btf_new, i); + + switch (btf_kind(btf_type)) { + case BTF_KIND_STRUCT: + case BTF_KIND_UNION: + for (unsigned short j = 0; j < btf_vlen(btf_type); j++) { + btf_member = btf_members(btf_type) + j; + btf_member->type = btf_reloc_id_get(info, btf_member->type); + } + break; + case BTF_KIND_PTR: + case BTF_KIND_TYPEDEF: + case BTF_KIND_VOLATILE: + case BTF_KIND_CONST: + case BTF_KIND_RESTRICT: + case BTF_KIND_FUNC: + case BTF_KIND_VAR: + btf_type->type = btf_reloc_id_get(info, btf_type->type); + break; + case BTF_KIND_ARRAY: + array = btf_array(btf_type); + array->index_type = btf_reloc_id_get(info, array->index_type); + array->type = btf_reloc_id_get(info, array->type); + break; + case BTF_KIND_FUNC_PROTO: + btf_type->type = btf_reloc_id_get(info, btf_type->type); + params = btf_params(btf_type); + for (unsigned short j = 0; j < btf_vlen(btf_type); j++) + params[j].type = btf_reloc_id_get(info, params[j].type); + break; + default: + break; + } + } + + return btf_new; + +out: + btf__free(btf_new); + return ERR_PTR(err); +} + +static int is_file(const char *path) +{ + struct stat st; + + if (stat(path, &st) < 0) + return -1; + + switch (st.st_mode & S_IFMT) { + case S_IFDIR: + return 0; + case S_IFREG: + return 1; + default: + return -1; + } + + return -1; +} + +static int generate_btf(const char *src_btf, const char *dst_btf, const char *objspaths[]) +{ + struct btf_reloc_info *reloc_info; + struct btf *btf_new = NULL; + struct bpf_object *obj; + int err; + + struct bpf_object_open_opts ops = { + .sz = sizeof(ops), + .btf_custom_path = src_btf, + }; + + reloc_info = btfgen_reloc_info_new(src_btf); + err = libbpf_get_error(reloc_info); + if (err) { + p_err("failed to allocate info structure"); + goto out; + } + + for (int i = 0; objspaths[i] != NULL; i++) { + printf("OBJ : %s\n", objspaths[i]); + obj = bpf_object__open_file(objspaths[i], &ops); + err = libbpf_get_error(obj); + if (err) { + p_err("error opening object: %s", 
strerror(errno)); + goto out; + } + + err = btfgen_obj_reloc_info_gen(reloc_info, obj); + if (err) + goto out; + + bpf_object__close(obj); + } + + btf_new = btfgen_reloc_info_get_btf(reloc_info); + err = libbpf_get_error(btf_new); + if (err) { + p_err("error generating btf: %s", strerror(errno)); + goto out; + } + + printf("DBTF: %s\n", dst_btf); + err = btf_save_raw(btf_new, dst_btf); + if (err) { + p_err("error saving btf file: %s", strerror(errno)); + goto out; + } + +out: + if (!libbpf_get_error(btf_new)) + btf__free(btf_new); + btfgen_reloc_info_free(reloc_info); + + return err; +} + +static int do_gen_btf(int argc, char **argv) +{ + char src_btf_path[PATH_MAX], dst_btf_path[PATH_MAX]; + bool input_is_file, output_is_file = false; + const char *input, *output; + const char **objs = NULL; + struct dirent *dir; + DIR *d = NULL; + int i, err; + + if (!REQ_ARGS(3)) { + usage(); + return -1; + } + + input = GET_ARG(); + err = is_file(input); + if (err < 0) { + p_err("failed to stat %s: %s", input, strerror(errno)); + return err; + } + input_is_file = err; + + output = GET_ARG(); + err = is_file(output); + if (err != 0) + output_is_file = true; + + objs = (const char **) malloc((argc + 1) * sizeof(*objs)); + if (!objs) + return -ENOMEM; + + i = 0; + while (argc > 0) + objs[i++] = GET_ARG(); + + objs[i] = NULL; + + // single BTF file + if (input_is_file) { + char *d_input; + + printf("SBTF: %s\n", input); + + if (output_is_file) { + err = generate_btf(input, output, objs); + goto out; + } + d_input = strdup(input); + snprintf(dst_btf_path, sizeof(dst_btf_path), "%s/%s", output, + basename(d_input)); + free(d_input); + err = generate_btf(input, dst_btf_path, objs); + goto out; + } + + if (output_is_file) { + p_err("can't have just one file as output"); + err = -EINVAL; + goto out; + } + + // directory with BTF files + d = opendir(input); + if (!d) { + p_err("error opening input dir: %s", strerror(errno)); + err = -errno; + goto out; + } + + while ((dir = readdir(d)) 
!= NULL) { + if (dir->d_type != DT_REG) + continue; + + if (strlen(dir->d_name) < 4) + continue; + + if (strncmp(dir->d_name + strlen(dir->d_name) - 4, ".btf", 4)) + continue; + + snprintf(src_btf_path, sizeof(src_btf_path), "%s/%s", input, dir->d_name); + snprintf(dst_btf_path, sizeof(dst_btf_path), "%s/%s", output, dir->d_name); + + printf("SBTF: %s\n", src_btf_path); + + err = generate_btf(src_btf_path, dst_btf_path, objs); + if (err) + goto out; + } + +out: + if (!err) + printf("STAT: done!\n"); + free(objs); + if (d) + closedir(d); + return err; +} + static const struct cmd cmds[] = { { "object", do_object }, { "skeleton", do_skeleton }, + { "btf", do_gen_btf }, { "help", do_help }, { 0 } }; -- 2.25.1
On 12/17/21 7:56 PM, Mauricio Vásquez wrote: > CO-RE requires to have BTF information describing the types of the > kernel in order to perform the relocations. This is usually provided by > the kernel itself when it's configured with CONFIG_DEBUG_INFO_BTF. > However, this configuration is not enabled in all the distributions and > it's not available on kernels before 5.12. > [...] > Changelog: > > v2 > v3: > - expose internal libbpf APIs to bpftool instead > - implement btfgen in bpftool > - drop btf__raw_data() from libbpf Looks like this breaks bpf-next CI, please take a look: https://github.com/kernel-patches/bpf/runs/4565944884?check_suite_focus=true [...] All error logs: #36 core_kern_lskel:FAIL test_core_kern_lskel:FAIL:open_and_load unexpected error: -22 #69 kfunc_call:FAIL test_main:PASS:skel 0 nsec test_main:PASS:bpf_prog_test_run(test1) 0 nsec test_main:PASS:test1-retval 0 nsec test_main:PASS:bpf_prog_test_run(test2) 0 nsec test_main:PASS:test2-retval 0 nsec #69/1 kfunc_call/main:OK test_subprog:PASS:skel 0 nsec test_subprog:PASS:bpf_prog_test_run(test1) 0 nsec test_subprog:PASS:test1-retval 0 nsec test_subprog:PASS:active_res 0 nsec test_subprog:PASS:sk_state_res 0 nsec #69/2 kfunc_call/subprog:OK test_subprog_lskel:FAIL:skel unexpected error: -22 #69/3 kfunc_call/subprog_lskel:FAIL #71 ksyms_btf:FAIL test_ksyms_btf:PASS:btf_exists 0 nsec test_basic:PASS:kallsyms_fopen 0 nsec test_basic:PASS:ksym_find 0 nsec test_basic:PASS:kallsyms_fopen 0 nsec test_basic:PASS:ksym_find 0 nsec test_basic:PASS:skel_open 0 nsec test_basic:PASS:skel_attach 0 nsec test_basic:PASS:runqueues_addr 0 nsec test_basic:PASS:bpf_prog_active_addr 0 nsec test_basic:PASS:rq_cpu 0 nsec test_basic:PASS:bpf_prog_active 0 nsec test_basic:PASS:cpu_rq(0)->cpu 0 nsec test_basic:PASS:this_rq_cpu 0 nsec test_basic:PASS:this_bpf_prog_active 0 nsec #71/1 ksyms_btf/basic:OK libbpf: prog 'handler': BPF program load failed: Permission denied libbpf: prog 'handler': -- BEGIN PROG LOAD LOG -- arg#0 
reference type('UNKNOWN ') size cannot be determined: -22 0: R1=ctx(id=0,off=0,imm=0) R10=fp0 ; cpu = bpf_get_smp_processor_id(); 0: (85) call bpf_get_smp_processor_id#8 ; R0_w=inv(id=0) 1: (bc) w7 = w0 ; R0_w=inv(id=0) R7_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) ; rq = (struct rq *)bpf_per_cpu_ptr(&runqueues, cpu); 2: (18) r1 = 0x2bd80 ; R1_w=percpu_ptr_rq(id=0,off=0,imm=0) 4: (bc) w2 = w7 ; R2_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R7_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) 5: (85) call bpf_per_cpu_ptr#153 ; R0_w=ptr_or_null_rq(id=1,off=0,imm=0) 6: (bf) r6 = r0 ; R0_w=ptr_or_null_rq(id=1,off=0,imm=0) R6_w=ptr_or_null_rq(id=1,off=0,imm=0) ; active = (int *)bpf_per_cpu_ptr(&bpf_prog_active, cpu); 7: (18) r1 = 0x27de0 ; R1_w=percpu_ptr_int(id=0,off=0,imm=0) 9: (bc) w2 = w7 ; R2_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) R7_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) 10: (85) call bpf_per_cpu_ptr#153 ; R0=mem_or_null(id=2,ref_obj_id=0,off=0,imm=0) ; if (active) { 11: (15) if r0 == 0x0 goto pc+2 ; R0=mem(id=0,ref_obj_id=0,off=0,imm=0) ; *(volatile int *)active; 12: (61) r1 = *(u32 *)(r0 +0) ; R0=mem(id=0,ref_obj_id=0,off=0,imm=0) R1_w=inv(id=0,umax_value=4294967295,var_off=(0x0; 0xffffffff)) ; *(volatile int *)(&rq->cpu); 13: (61) r1 = *(u32 *)(r6 +2848) R6 invalid mem access 'ptr_or_null_' processed 12 insns (limit 1000000) max_states_per_insn 0 total_states 1 peak_states 1 mark_read 1 -- END PROG LOAD LOG -- libbpf: failed to load program 'handler' libbpf: failed to load object 'test_ksyms_btf_null_check' libbpf: failed to load BPF skeleton 'test_ksyms_btf_null_check': -13 test_null_check:PASS:skel_open 0 nsec #71/2 ksyms_btf/null_check:OK test_weak_syms:PASS:test_ksyms_weak__open_and_load 0 nsec test_weak_syms:PASS:test_ksyms_weak__attach 0 nsec test_weak_syms:PASS:existing typed ksym 0 nsec test_weak_syms:PASS:existing typeless ksym 0 nsec test_weak_syms:PASS:nonexistent 
typeless ksym 0 nsec test_weak_syms:PASS:nonexistent typed ksym 0 nsec #71/3 ksyms_btf/weak_ksyms:OK test_weak_syms_lskel:FAIL:test_ksyms_weak_lskel__open_and_load unexpected error: -22 #71/4 ksyms_btf/weak_ksyms_lskel:FAIL #89 map_ptr:FAIL test_map_ptr:PASS:skel_open 0 nsec test_map_ptr:FAIL:skel_load unexpected error: -22 (errno 22) Summary: 221/1052 PASSED, 8 SKIPPED, 4 FAILED test_progs-no_alu32 - Testing test_progs-no_alu32 test_maps - Testing test_maps test_verifier - Testing test_verifier collect_status - Collect status shutdown - Shutdown Test Results: bpftool: PASS test_progs: FAIL (returned 1) test_progs-no_alu32: FAIL (returned 1) test_maps: PASS test_verifier: PASS shutdown: CLEAN Error: Process completed with exit code 1. > v1 > v2: > - introduce bpf_object__prepare() and ‘record_core_relos’ to expose > CO-RE relocations instead of bpf_object__reloc_info_gen() > - rename btf__save_to_file() to btf__raw_data() > > v1: https://lore.kernel.org/bpf/20211027203727.208847-1-mauricio@kinvolk.io/ > v2: https://lore.kernel.org/bpf/20211116164208.164245-1-mauricio@kinvolk.io/ > > Mauricio Vásquez (3): > libbpf: split bpf_core_apply_relo() > libbpf: Implement changes needed for BTFGen in bpftool > bpftool: Implement btfgen > > kernel/bpf/btf.c | 11 +- > tools/bpf/bpftool/gen.c | 892 ++++++++++++++++++++++++++++++++ > tools/lib/bpf/Makefile | 2 +- > tools/lib/bpf/libbpf.c | 124 +++-- > tools/lib/bpf/libbpf.h | 3 + > tools/lib/bpf/libbpf.map | 2 + > tools/lib/bpf/libbpf_internal.h | 22 + > tools/lib/bpf/relo_core.c | 83 +-- > tools/lib/bpf/relo_core.h | 46 +- > 9 files changed, 1086 insertions(+), 99 deletions(-) >
> [...] > > Changelog: > > > > v2 > v3: > > - expose internal libbpf APIs to bpftool instead > > - implement btfgen in bpftool > > - drop btf__raw_data() from libbpf > > Looks like this breaks bpf-next CI, please take a look: > > https://github.com/kernel-patches/bpf/runs/4565944884?check_suite_focus=true > Thanks Daniel for looking into this. I have spotted a potential issue: the instruction is also patched when prog->obj->gen_loader is set. I'll fix it in the next iteration, but I'm not sure it's what is causing those test failures. I tried to reproduce them in my fork and they pass fine: https://github.com/mauriciovasquezbernal/linux/runs/4586966124?check_suite_focus=true. Could it be a false positive?
On Fri, Dec 17, 2021 at 10:57 AM Mauricio Vásquez <mauricio@kinvolk.io> wrote: > > BTFGen's goal is to produce a BTF file that contains **only** the > information that is needed by an eBPF program. As a first step, the > algorithm collects the types involved in each relocation present in the > object and "marks" them as needed. Types are collected in different ways > depending on the relocation kind: for field-based relocations only the > structure and union members involved are considered; for type-based > relocations the whole type is added; enum-value-based relocations are > not supported in this iteration yet. > > A second step generates a BTF file from the "marked" types. This step > accesses the original BTF file, extracting the types and their members > that were "marked" as needed in the first step. > > This command is implemented under the "gen" command in bpftool and the > syntax is the following: > > $ bpftool gen btf INPUT OUTPUT OBJECT(S) > > INPUT can be either a single BTF file or a folder containing BTF files; > when it's a folder, a BTF file is generated for each BTF file contained > in this folder. OUTPUT is the file (or folder) where generated files are > stored and OBJECT(S) is the list of bpf objects we want to generate the > BTF file(s) for (each generated BTF file contains all the types needed > by all the objects). > > Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> > Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com> > Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> > Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co> > --- > tools/bpf/bpftool/gen.c | 892 ++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 892 insertions(+) > I haven't looked through details of stripping BTF itself, let's finalize CO-RE relocation parts first. 
Maybe for the next revision you could split the bpftool changes in two in some reasonable way so that each patch concentrates on a different step of the process a bit more? E.g., the first patch might set up the new command and the BTF stripping parts but leave the CO-RE relocation logic unimplemented, and the second patch fills in that part. Should make it easier to review this big patch. Please also cc Quentin Monnet to review the bpftool parts as well. [...] > +static int btf_reloc_info_gen_type(struct btf_reloc_info *info, struct bpf_core_spec *targ_spec) > +{ > + struct btf *btf = (struct btf *) info->src_btf; > + struct btf_type *btf_type; > + int err = 0; > + > + btf_type = (struct btf_type *) btf__type_by_id(btf, targ_spec->root_type_id); > + > + return btf_reloc_put_type_all(btf, info, btf_type, targ_spec->root_type_id); > +} > + > +static int btf_reloc_info_gen_enumval(struct btf_reloc_info *info, struct bpf_core_spec *targ_spec) > +{ > + p_err("untreated enumval based relocation"); why untreated? what's the problem supporting it? 
> + return -EOPNOTSUPP; > +} > + > +static int btf_reloc_info_gen(struct btf_reloc_info *info, struct bpf_core_spec *res) > +{ > + if (core_relo_is_type_based(res->relo_kind)) > + return btf_reloc_info_gen_type(info, res); > + > + if (core_relo_is_enumval_based(res->relo_kind)) > + return btf_reloc_info_gen_enumval(info, res); > + > + if (core_relo_is_field_based(res->relo_kind)) > + return btf_reloc_info_gen_field(info, res); you can have a simple switch here instead of exposing libbpf internal helpers > + > + return -EINVAL; > +} > + > +#define BPF_INSN_SZ (sizeof(struct bpf_insn)) > + > +static int btfgen_obj_reloc_info_gen(struct btf_reloc_info *reloc_info, struct bpf_object *obj) > +{ > + const struct btf_ext_info_sec *sec; > + const struct bpf_core_relo *rec; > + const struct btf_ext_info *seg; > + struct hashmap *cand_cache; > + int err, insn_idx, sec_idx; > + struct bpf_program *prog; > + struct btf_ext *btf_ext; > + const char *sec_name; > + size_t nr_programs; > + struct btf *btf; > + unsigned int i; > + > + btf = bpf_object__btf(obj); > + btf_ext = bpf_object__btf_ext(obj); > + > + if (btf_ext->core_relo_info.len == 0) > + return 0; > + > + cand_cache = bpf_core_create_cand_cache(); > + if (IS_ERR(cand_cache)) > + return PTR_ERR(cand_cache); > + > + bpf_object_set_vmlinux_override(obj, reloc_info->src_btf); > + > + seg = &btf_ext->core_relo_info; > + for_each_btf_ext_sec(seg, sec) { > + bool prog_found; > + > + sec_name = btf__name_by_offset(btf, sec->sec_name_off); > + if (str_is_empty(sec_name)) { > + err = -EINVAL; > + goto out; > + } > + > + prog_found = false; > + nr_programs = bpf_object__get_nr_programs(obj); > + for (i = 0; i < nr_programs; i++) { > + prog = bpf_object__get_program(obj, i); > + if (strcmp(bpf_program__section_name(prog), sec_name) == 0) { > + prog_found = true; > + break; > + } > + } > + > + if (!prog_found) { > + pr_warn("sec '%s': failed to find a BPF program\n", sec_name); > + err = -EINVAL; > + goto out; > + } > + > + sec_idx 
= bpf_program__sec_idx(prog); > + > + for_each_btf_ext_rec(seg, sec, i, rec) { > + struct bpf_core_relo_res targ_res; > + struct bpf_core_spec targ_spec; > + > + insn_idx = rec->insn_off / BPF_INSN_SZ; > + > + prog = find_prog_by_sec_insn(obj, sec_idx, insn_idx); > + if (!prog) { > + pr_warn("sec '%s': failed to find program at insn #%d for CO-RE offset relocation #%d\n", > + sec_name, insn_idx, i); > + err = -EINVAL; > + goto out; > + } > + > + err = bpf_core_calc_relo_res(prog, rec, i, btf, cand_cache, &targ_res, > + &targ_spec); I don't think you need to do *exactly* what libbpf is doing. bpf_core_calc_relo_res() doesn't add much on top of bpf_core_calc_relo_insn(); if you use bpf_core_calc_relo_insn() directly and expose bpf_core_add_cands/bpf_core_free_cands and use them directly as well, bypassing bpf_object completely, you won't need the btf_vmlinux override and everything will be less coupled, I think. In the future, if you'd like to support kernel module BTFs, you'll have all the necessary flexibility to do that, while if you try to reuse every single line of bpf_object's code we'll be adding more hacks like your bpf_object_set_vmlinux_override(). Fundamentally, you don't care about bpf_object and bpf_programs. All you need to do is parse .BTF.ext (you can do that with just btf__parse, no need to even construct a bpf_object!). Construct the candidate cache, then iterate each CO-RE record, find/add candidates, calculate the relocation result, and use it for your algorithm. Yes, there will be a bit of duplication (candidate search), but that's better than trying to turn bpf_object inside out with all the custom getters/setters that you are exposing (even if it's libbpf internal only, it still makes it really hard to reason about what's going on and what the consequences of manual control over a lot of bpf_object internal implementation details are). 
> + if (err) > + goto out; > + > + err = btf_reloc_info_gen(reloc_info, &targ_spec); > + if (err) > + goto out; > + } > + } > + > +out: > + bpf_core_free_cand_cache(cand_cache); > + > + return err; > +} > + [...]
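[Editor's note: the bpf_object-free flow suggested in the review above could look roughly like the sketch below. This is not compiled code: the exact shapes of the internal helpers used here — btf__parse() handing back the .BTF.ext section, the for_each_btf_ext_* iterators, and a bpf_core_calc_relo_insn() that takes a candidate cache — are assumptions based on this thread, not a finalized API.]

```c
/* Sketch only: record all CO-RE relocations of one object against
 * info->src_btf without ever constructing a bpf_object. */
static int btfgen_record_obj(struct btf_reloc_info *info, const char *obj_path)
{
	const struct btf_ext_info_sec *sec;
	const struct bpf_core_relo *relo;
	const struct btf_ext_info *seg;
	struct hashmap *cand_cache;
	struct btf_ext *btf_ext;
	struct btf *btf;
	unsigned int i;
	int err = 0;

	/* btf__parse() can hand back the .BTF.ext section too, so no
	 * bpf_object__open() is needed at all */
	btf = btf__parse(obj_path, &btf_ext);
	if (!btf)
		return -errno;

	cand_cache = hashmap__new(/* ... */);

	seg = &btf_ext->core_relo_info;
	for_each_btf_ext_sec(seg, sec) {
		for_each_btf_ext_rec(seg, sec, i, relo) {
			struct bpf_core_relo_res targ_res;
			struct bpf_core_spec targ_spec;

			/* find/add candidates against info->src_btf, then
			 * compute the target spec for this record directly */
			err = bpf_core_calc_relo_insn(/* ... */, relo, i, btf,
						      cand_cache, &targ_spec,
						      &targ_res);
			if (err)
				goto out;

			err = btf_reloc_info_gen(info, &targ_spec);
			if (err)
				goto out;
		}
	}
out:
	/* free candidate cache, btf_ext and btf */
	return err;
}
```

The duplication the review mentions is the candidate search loop, but in exchange the tool no longer needs bpf_object__get_program(), bpf_program__sec_idx() or the btf_vmlinux override at all.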
On Fri, Dec 17, 2021 at 10:57 AM Mauricio Vásquez <mauricio@kinvolk.io> wrote: > > Given that BTFGen is such a unique use case, let's expose some of the > libbpf internal APIs to bpftool. This commit extends libbpf with the > features that are needed to implement it. > > Specifically, this commit introduces the following functions: > - bpf_object__btf_ext(): Public method to access the BTF.ext section of > a bpf object. > - bpf_program__sec_idx(): Public method to get the section index of a > program within the elf file. > - Implement bpf_core_create_cand_cache() and bpf_core_free_cand_cache() > to handle the candidates cache. > - Expose bpf_core_calc_relo_res() in libbpf_internal.h and add a "struct > bpf_core_spec *targ_spec" parameter. > - bpf_object_set_vmlinux_override(): Internal function to set > obj->btf_vmlinux_override. > - bpf_object__get_nr_programs() and bpf_object__get_program(): To give > access to the programs inside an object. bpf_object__for_each_program > is not good enough because BTFGen needs to access subprograms too. 
> > Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> > Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com> > Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> > Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co> > --- > tools/lib/bpf/Makefile | 2 +- > tools/lib/bpf/libbpf.c | 88 ++++++++++++++++++++++++++------- > tools/lib/bpf/libbpf.h | 3 ++ > tools/lib/bpf/libbpf.map | 2 + > tools/lib/bpf/libbpf_internal.h | 22 +++++++++ > tools/lib/bpf/relo_core.c | 6 +-- > tools/lib/bpf/relo_core.h | 4 ++ > 7 files changed, 104 insertions(+), 23 deletions(-) > > diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile > index f947b61b2107..dba019ee2832 100644 > --- a/tools/lib/bpf/Makefile > +++ b/tools/lib/bpf/Makefile > @@ -239,7 +239,7 @@ install_lib: all_cmd > > SRC_HDRS := bpf.h libbpf.h btf.h libbpf_common.h libbpf_legacy.h xsk.h \ > bpf_helpers.h bpf_tracing.h bpf_endian.h bpf_core_read.h \ > - skel_internal.h libbpf_version.h > + skel_internal.h libbpf_version.h relo_core.h libbpf_internal.h > GEN_HDRS := $(BPF_GENERATED) > > INSTALL_PFX := $(DESTDIR)$(prefix)/include/bpf > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c > index 77e2df13715a..7c8f8797475f 100644 > --- a/tools/lib/bpf/libbpf.c > +++ b/tools/lib/bpf/libbpf.c > @@ -4027,8 +4027,8 @@ static bool prog_contains_insn(const struct bpf_program *prog, size_t insn_idx) > insn_idx < prog->sec_insn_off + prog->sec_insn_cnt; > } > > -static struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj, > - size_t sec_idx, size_t insn_idx) > +struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj, > + size_t sec_idx, size_t insn_idx) > { > int l = 0, r = obj->nr_programs - 1, m; > struct bpf_program *prog; > @@ -5498,12 +5498,13 @@ static int record_relo_core(struct bpf_program *prog, > return 0; > } > > -static int bpf_core_calc_relo_res(struct bpf_program *prog, > - const struct bpf_core_relo *relo, > - int relo_idx, > - const struct 
btf *local_btf, > - struct hashmap *cand_cache, > - struct bpf_core_relo_res *targ_res) > +int bpf_core_calc_relo_res(struct bpf_program *prog, > + const struct bpf_core_relo *relo, > + int relo_idx, > + const struct btf *local_btf, > + struct hashmap *cand_cache, > + struct bpf_core_relo_res *targ_res, > + struct bpf_core_spec *targ_spec) maybe let's add targ_spec and local_spec into bpf_core_relo_res? that way bpf_core_relo_res contains all the relevant information around CO-RE relo resolution? > { > struct bpf_core_spec specs_scratch[3] = {}; > const void *type_key = u32_as_hash_key(relo->type_id); > @@ -5548,8 +5549,32 @@ static int bpf_core_calc_relo_res(struct bpf_program *prog, > } > } > > - return bpf_core_calc_relo_insn(prog_name, relo, relo_idx, local_btf, cands, > - specs_scratch, targ_res); > + err = bpf_core_calc_relo_insn(prog_name, relo, relo_idx, local_btf, cands, > + specs_scratch, targ_res); > + if (err) > + return err; > + > + if (targ_spec) > + *targ_spec = specs_scratch[2]; and then we'll avoid this ugliness > + return 0; > +} > + > +struct hashmap *bpf_core_create_cand_cache(void) > +{ > + return hashmap__new(bpf_core_hash_fn, bpf_core_equal_fn, NULL); > +} > + > +void bpf_core_free_cand_cache(struct hashmap *cand_cache) > +{ > + struct hashmap_entry *entry; > + int i; > + > + if (!IS_ERR_OR_NULL(cand_cache)) { > + hashmap__for_each_entry(cand_cache, entry, i) { > + bpf_core_free_cands(entry->value); > + } > + hashmap__free(cand_cache); > + } > } > > static int > @@ -5559,7 +5584,6 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) > struct bpf_core_relo_res targ_res; > const struct bpf_core_relo *rec; > const struct btf_ext_info *seg; > - struct hashmap_entry *entry; > struct hashmap *cand_cache = NULL; > struct bpf_program *prog; > struct bpf_insn *insn; > @@ -5578,7 +5602,7 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) > } > } > > - cand_cache = hashmap__new(bpf_core_hash_fn, 
bpf_core_equal_fn, NULL); > + cand_cache = bpf_core_create_cand_cache(); > if (IS_ERR(cand_cache)) { > err = PTR_ERR(cand_cache); > goto out; > @@ -5627,7 +5651,8 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) > if (!prog->load) > continue; > > - err = bpf_core_calc_relo_res(prog, rec, i, obj->btf, cand_cache, &targ_res); > + err = bpf_core_calc_relo_res(prog, rec, i, obj->btf, cand_cache, &targ_res, > + NULL); > if (err) { > pr_warn("prog '%s': relo #%d: failed to relocate: %d\n", > prog->name, i, err); > @@ -5660,12 +5685,8 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) > btf__free(obj->btf_vmlinux_override); > obj->btf_vmlinux_override = NULL; > > - if (!IS_ERR_OR_NULL(cand_cache)) { > - hashmap__for_each_entry(cand_cache, entry, i) { > - bpf_core_free_cands(entry->value); > - } > - hashmap__free(cand_cache); > - } > + bpf_core_free_cand_cache(cand_cache); > + > return err; > } > > @@ -8190,6 +8211,11 @@ struct btf *bpf_object__btf(const struct bpf_object *obj) > return obj ? obj->btf : NULL; > } > > +struct btf_ext *bpf_object__btf_ext(const struct bpf_object *obj) > +{ > + return obj ? obj->btf_ext : NULL; just return obj->btf_ext, no one should be passing NULL for those getters > +} > + > int bpf_object__btf_fd(const struct bpf_object *obj) > { > return obj->btf ? 
btf__fd(obj->btf) : -1; > @@ -8281,6 +8307,20 @@ bpf_object__next_program(const struct bpf_object *obj, struct bpf_program *prev) > return prog; > } > > +size_t bpf_object__get_nr_programs(const struct bpf_object *obj) > +{ > + return obj->nr_programs; > +} > + > +struct bpf_program * > +bpf_object__get_program(const struct bpf_object *obj, unsigned int i) > +{ > + if (i >= obj->nr_programs) > + return NULL; > + > + return &obj->programs[i]; > +} > + > struct bpf_program * > bpf_program__prev(struct bpf_program *next, const struct bpf_object *obj) > { > @@ -8360,6 +8400,11 @@ int bpf_program__set_autoload(struct bpf_program *prog, bool autoload) > return 0; > } > > +int bpf_program__sec_idx(const struct bpf_program *prog) > +{ > + return prog->sec_idx; > +} > + > static int bpf_program_nth_fd(const struct bpf_program *prog, int n); > > int bpf_program__fd(const struct bpf_program *prog) > @@ -11779,3 +11824,8 @@ void bpf_object__destroy_skeleton(struct bpf_object_skeleton *s) > free(s->progs); > free(s); > } > + > +void bpf_object_set_vmlinux_override(struct bpf_object *obj, struct btf *btf) > +{ > + obj->btf_vmlinux_override = btf; > +} I don't think we need this, see comments in next patch > diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h > index 42b2f36fd9f0..2b048ee5a9b2 100644 > --- a/tools/lib/bpf/libbpf.h > +++ b/tools/lib/bpf/libbpf.h > @@ -225,6 +225,8 @@ LIBBPF_API int bpf_object__set_kversion(struct bpf_object *obj, __u32 kern_versi > > struct btf; > LIBBPF_API struct btf *bpf_object__btf(const struct bpf_object *obj); > +struct btf_ext; > +LIBBPF_API struct btf_ext *bpf_object__btf_ext(const struct bpf_object *obj); > LIBBPF_API int bpf_object__btf_fd(const struct bpf_object *obj); > > LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__find_program_by_name() instead") > @@ -290,6 +292,7 @@ LIBBPF_API LIBBPF_DEPRECATED("BPF program title is confusing term; please use bp > const char *bpf_program__title(const struct bpf_program *prog, bool 
needs_copy); > LIBBPF_API bool bpf_program__autoload(const struct bpf_program *prog); > LIBBPF_API int bpf_program__set_autoload(struct bpf_program *prog, bool autoload); > +LIBBPF_API int bpf_program__sec_idx(const struct bpf_program *prog); > > /* returns program size in bytes */ > LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_program__insn_cnt() instead") > diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map > index b3938b3f8fc9..15da4075e0b5 100644 > --- a/tools/lib/bpf/libbpf.map > +++ b/tools/lib/bpf/libbpf.map > @@ -392,6 +392,7 @@ LIBBPF_0.6.0 { > bpf_map__map_extra; > bpf_map__set_map_extra; > bpf_map_create; > + bpf_object__btf_ext; > bpf_object__next_map; > bpf_object__next_program; > bpf_object__prev_map; > @@ -401,6 +402,7 @@ LIBBPF_0.6.0 { > bpf_program__flags; > bpf_program__insn_cnt; > bpf_program__insns; > + bpf_program__sec_idx; > bpf_program__set_flags; > btf__add_btf; > btf__add_decl_tag; > diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h > index 5dbe4f463880..b1962adb110c 100644 > --- a/tools/lib/bpf/libbpf_internal.h > +++ b/tools/lib/bpf/libbpf_internal.h > @@ -524,4 +524,26 @@ static inline int ensure_good_fd(int fd) > return fd; > } > > +struct hashmap; > + > +int bpf_core_calc_relo_res(struct bpf_program *prog, > + const struct bpf_core_relo *relo, > + int relo_idx, > + const struct btf *local_btf, > + struct hashmap *cand_cache, > + struct bpf_core_relo_res *targ_res, > + struct bpf_core_spec *targ_spec); > +void bpf_object_set_vmlinux_override(struct bpf_object *obj, struct btf *btf); > +struct hashmap *bpf_core_create_cand_cache(void); > +void bpf_core_free_cand_cache(struct hashmap *cand_cache); > + > +struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj, > + size_t sec_idx, size_t insn_idx); > + > +size_t bpf_object__get_nr_programs(const struct bpf_object *obj); > + > +struct bpf_program * > +bpf_object__get_program(const struct bpf_object *obj, unsigned int n); > + that's too 
much, I don't think you need bpf_program and all the things around it (sec_idx, look up, etc). As for core_relo_is_field_based and co, bpftool can do those simple checks on their own, no need to make all the "internal API", don't overdo it with "let's expose internals of libbpf to bpftool". > + > #endif /* __LIBBPF_LIBBPF_INTERNAL_H */ > diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c > index 4f3552468624..66dfb7fa89a2 100644 > --- a/tools/lib/bpf/relo_core.c > +++ b/tools/lib/bpf/relo_core.c > @@ -102,7 +102,7 @@ static const char *core_relo_kind_str(enum bpf_core_relo_kind kind) > } > } > > -static bool core_relo_is_field_based(enum bpf_core_relo_kind kind) > +bool core_relo_is_field_based(enum bpf_core_relo_kind kind) > { > switch (kind) { > case BPF_CORE_FIELD_BYTE_OFFSET: > @@ -117,7 +117,7 @@ static bool core_relo_is_field_based(enum bpf_core_relo_kind kind) > } > } > > -static bool core_relo_is_type_based(enum bpf_core_relo_kind kind) > +bool core_relo_is_type_based(enum bpf_core_relo_kind kind) > { > switch (kind) { > case BPF_CORE_TYPE_ID_LOCAL: > @@ -130,7 +130,7 @@ static bool core_relo_is_type_based(enum bpf_core_relo_kind kind) > } > } > > -static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind) > +bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind) > { > switch (kind) { > case BPF_CORE_ENUMVAL_EXISTS: > diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h > index a28bf3711ce2..e969dfb032f4 100644 > --- a/tools/lib/bpf/relo_core.h > +++ b/tools/lib/bpf/relo_core.h > @@ -84,4 +84,8 @@ int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn, > int insn_idx, const struct bpf_core_relo *relo, > int relo_idx, const struct bpf_core_relo_res *res); > > +bool core_relo_is_field_based(enum bpf_core_relo_kind kind); > +bool core_relo_is_type_based(enum bpf_core_relo_kind kind); > +bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind); > + > #endif > -- > 2.25.1 >
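The checks suggested here really are small enough for bpftool to open-code. A self-contained sketch of how that could look (the enum values mirror enum bpf_core_relo_kind from uapi/linux/bpf.h; the btfgen_relo_is_* names are hypothetical, with the real libbpf helpers staying private):

```c
#include <assert.h>
#include <stdbool.h>

/* Local mirror of enum bpf_core_relo_kind (uapi/linux/bpf.h) so bpftool
 * can classify relocation kinds without importing libbpf internals. */
enum bpf_core_relo_kind {
	BPF_CORE_FIELD_BYTE_OFFSET = 0,
	BPF_CORE_FIELD_BYTE_SIZE = 1,
	BPF_CORE_FIELD_EXISTS = 2,
	BPF_CORE_FIELD_SIGNED = 3,
	BPF_CORE_FIELD_LSHIFT_U64 = 4,
	BPF_CORE_FIELD_RSHIFT_U64 = 5,
	BPF_CORE_TYPE_ID_LOCAL = 6,
	BPF_CORE_TYPE_ID_TARGET = 7,
	BPF_CORE_TYPE_EXISTS = 8,
	BPF_CORE_TYPE_SIZE = 9,
	BPF_CORE_ENUMVAL_EXISTS = 10,
	BPF_CORE_ENUMVAL_VALUE = 11,
};

static bool btfgen_relo_is_field_based(enum bpf_core_relo_kind kind)
{
	switch (kind) {
	case BPF_CORE_FIELD_BYTE_OFFSET:
	case BPF_CORE_FIELD_BYTE_SIZE:
	case BPF_CORE_FIELD_EXISTS:
	case BPF_CORE_FIELD_SIGNED:
	case BPF_CORE_FIELD_LSHIFT_U64:
	case BPF_CORE_FIELD_RSHIFT_U64:
		return true;
	default:
		return false;
	}
}

static bool btfgen_relo_is_type_based(enum bpf_core_relo_kind kind)
{
	switch (kind) {
	case BPF_CORE_TYPE_ID_LOCAL:
	case BPF_CORE_TYPE_ID_TARGET:
	case BPF_CORE_TYPE_EXISTS:
	case BPF_CORE_TYPE_SIZE:
		return true;
	default:
		return false;
	}
}

static bool btfgen_relo_is_enumval_based(enum bpf_core_relo_kind kind)
{
	switch (kind) {
	case BPF_CORE_ENUMVAL_EXISTS:
	case BPF_CORE_ENUMVAL_VALUE:
		return true;
	default:
		return false;
	}
}
```

This duplicates the switch statements from relo_core.c, but keeps the libbpf symbols static and the internal header surface unchanged.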
On Fri, Dec 17, 2021 at 10:57 AM Mauricio Vásquez <mauricio@kinvolk.io> wrote: > > BTFGen needs to run the core relocation logic in order to understand > which types in the target BTF are involved in a given > relocation. > > Currently bpf_core_apply_relo() calculates and **applies** a relocation > to an instruction. Having both operations in the same function makes it > difficult to only calculate the relocation without patching the > instruction. This commit splits that logic into two different phases: (1) > calculate the relocation and (2) patch the instruction. > > For the first phase bpf_core_apply_relo() is renamed to > bpf_core_calc_relo_res(), which is now only in charge of calculating the > relocation; the second phase uses the already existing > bpf_core_patch_insn(). bpf_object__relocate_core() uses both of them, and > BTFGen will use only bpf_core_calc_relo_res(). > > Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> > Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com> > Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> > Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co> > --- > kernel/bpf/btf.c | 11 +++++- > tools/lib/bpf/libbpf.c | 54 ++++++++++++++++----------- > tools/lib/bpf/relo_core.c | 77 +++++++++++---------------------------- > tools/lib/bpf/relo_core.h | 42 ++++++++++++++++++--- > 4 files changed, 99 insertions(+), 85 deletions(-) > > diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c > index a17de71abd2e..5a8f6ef6a341 100644 > --- a/kernel/bpf/btf.c > +++ b/kernel/bpf/btf.c > @@ -6734,6 +6734,7 @@ int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo, > { > bool need_cands = relo->kind != BPF_CORE_TYPE_ID_LOCAL; > struct bpf_core_cand_list cands = {}; > + struct bpf_core_relo_res targ_res; > struct bpf_core_spec *specs; > int err; > > @@ -6778,8 +6779,14 @@ int bpf_core_apply(struct bpf_core_ctx *ctx, const struct bpf_core_relo *relo, > */ > } > > - err = 
bpf_core_apply_relo_insn((void *)ctx->log, insn, relo->insn_off / 8, > - relo, relo_idx, ctx->btf, &cands, specs); > + err = bpf_core_calc_relo_insn((void *)ctx->log, relo, relo_idx, ctx->btf, &cands, specs, > + &targ_res); > + if (err) > + goto out; > + > + err = bpf_core_patch_insn((void *)ctx->log, insn, relo->insn_off / 8, relo, relo_idx, > + &targ_res); > + > out: > kfree(specs); > if (need_cands) { > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c > index cf862a19222b..77e2df13715a 100644 > --- a/tools/lib/bpf/libbpf.c > +++ b/tools/lib/bpf/libbpf.c > @@ -5498,11 +5498,12 @@ static int record_relo_core(struct bpf_program *prog, > return 0; > } > > -static int bpf_core_apply_relo(struct bpf_program *prog, > - const struct bpf_core_relo *relo, > - int relo_idx, > - const struct btf *local_btf, > - struct hashmap *cand_cache) > +static int bpf_core_calc_relo_res(struct bpf_program *prog, bpf_core_calc_relo_res is almost indistinguishable from bpf_core_calc_relo... Let's call this one bpf_core_resolve_relo()? > + const struct bpf_core_relo *relo, > + int relo_idx, > + const struct btf *local_btf, > + struct hashmap *cand_cache, > + struct bpf_core_relo_res *targ_res) [...] 
> static int > bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) > { > const struct btf_ext_info_sec *sec; > + struct bpf_core_relo_res targ_res; > const struct bpf_core_relo *rec; > const struct btf_ext_info *seg; > struct hashmap_entry *entry; > struct hashmap *cand_cache = NULL; > struct bpf_program *prog; > + struct bpf_insn *insn; > const char *sec_name; > int i, err = 0, insn_idx, sec_idx; > > @@ -5636,12 +5627,31 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) > if (!prog->load) > continue; > > - err = bpf_core_apply_relo(prog, rec, i, obj->btf, cand_cache); > + err = bpf_core_calc_relo_res(prog, rec, i, obj->btf, cand_cache, &targ_res); > if (err) { > pr_warn("prog '%s': relo #%d: failed to relocate: %d\n", > prog->name, i, err); > goto out; > } > + > + if (rec->insn_off % BPF_INSN_SZ) > + return -EINVAL; > + insn_idx = rec->insn_off / BPF_INSN_SZ; > + /* adjust insn_idx from section frame of reference to the local > + * program's frame of reference; (sub-)program code is not yet > + * relocated, so it's enough to just subtract in-section offset > + */ > + insn_idx = insn_idx - prog->sec_insn_off; > + if (insn_idx >= prog->insns_cnt) > + return -EINVAL; > + insn = &prog->insns[insn_idx]; this is sort of like sanity checks, let's do them before the core_calc step, so after that it's a clean sequence of calc_relo + patch_insn? > + > + err = bpf_core_patch_insn(prog->name, insn, insn_idx, rec, i, &targ_res); > + if (err) { > + pr_warn("prog '%s': relo #%d: failed to patch insn #%u: %d\n", > + prog->name, i, insn_idx, err); > + goto out; > + } > + } > } > [...] > { > __u32 orig_val, new_val; > __u8 class; > @@ -1177,18 +1152,18 @@ static void bpf_core_dump_spec(const char *prog_name, int level, const struct bp > * between multiple relocations for the same type ID and is updated as some > * of the candidates are pruned due to structural incompatibility. 
> */ > -int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, > - int insn_idx, > - const struct bpf_core_relo *relo, > - int relo_idx, > - const struct btf *local_btf, > - struct bpf_core_cand_list *cands, > - struct bpf_core_spec *specs_scratch) > +int bpf_core_calc_relo_insn(const char *prog_name, please update the comment for this function, it's not "CO-RE relocate single instruction" anymore, it's more like "Calculate CO-RE relocation target result" or something along those lines. > + const struct bpf_core_relo *relo, > + int relo_idx, > + const struct btf *local_btf, > + struct bpf_core_cand_list *cands, > + struct bpf_core_spec *specs_scratch, > + struct bpf_core_relo_res *targ_res) > { > struct bpf_core_spec *local_spec = &specs_scratch[0]; > struct bpf_core_spec *cand_spec = &specs_scratch[1]; > struct bpf_core_spec *targ_spec = &specs_scratch[2]; > - struct bpf_core_relo_res cand_res, targ_res; > + struct bpf_core_relo_res cand_res; > const struct btf_type *local_type; > const char *local_name; > __u32 local_id; > @@ -1223,12 +1198,12 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, > /* TYPE_ID_LOCAL relo is special and doesn't need candidate search */ > if (relo->kind == BPF_CORE_TYPE_ID_LOCAL) { > /* bpf_insn's imm value could get out of sync during linking */ > - memset(&targ_res, 0, sizeof(targ_res)); > - targ_res.validate = false; > - targ_res.poison = false; > - targ_res.orig_val = local_spec->root_type_id; > - targ_res.new_val = local_spec->root_type_id; > - goto patch_insn; > + memset(targ_res, 0, sizeof(*targ_res)); > + targ_res->validate = true; hm.. original code sets it to false here, please don't regress the logic > + targ_res->poison = false; > + targ_res->orig_val = local_spec->root_type_id; > + targ_res->new_val = local_spec->root_type_id; > + return 0; > } > > /* libbpf doesn't support candidate search for anonymous types */ [...]
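The calc/patch split being reviewed here can be illustrated with a self-contained toy model (toy_insn and toy_relo_res are reduced stand-ins for struct bpf_insn and struct bpf_core_relo_res; only the imm-patching case is modeled, and the sanity checks on insn_off run before the calc step, as suggested):

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define BPF_INSN_SZ 8	/* a BPF instruction is 8 bytes */

struct toy_insn { int imm; };

struct toy_relo_res {
	unsigned int orig_val;	/* value expected in the instruction */
	unsigned int new_val;	/* relocated value to write */
	int poison;		/* relocation failed, poison the insn */
};

/* sanity check done up front, before the calc step */
static int insn_off_to_idx(unsigned int insn_off)
{
	if (insn_off % BPF_INSN_SZ)
		return -EINVAL;
	return insn_off / BPF_INSN_SZ;
}

/* phase 1: pure computation, no instruction is touched */
static int calc_relo(unsigned int local_val, unsigned int targ_val,
		     struct toy_relo_res *res)
{
	memset(res, 0, sizeof(*res));
	res->orig_val = local_val;
	res->new_val = targ_val;
	return 0;
}

/* phase 2: apply the precomputed result to the instruction */
static int patch_insn(struct toy_insn *insn, const struct toy_relo_res *res)
{
	if (res->poison)
		return -EINVAL;
	if ((unsigned int)insn->imm != res->orig_val)
		return -EINVAL;	/* insn drifted out of sync, e.g. during linking */
	insn->imm = res->new_val;
	return 0;
}
```

Because phase 1 never touches the instruction, a consumer like BTFGen can run it alone and simply skip phase 2.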
> > -static int bpf_core_apply_relo(struct bpf_program *prog, > > - const struct bpf_core_relo *relo, > > - int relo_idx, > > - const struct btf *local_btf, > > - struct hashmap *cand_cache) > > +static int bpf_core_calc_relo_res(struct bpf_program *prog, > > bpf_core_calc_relo_res is almost indistinguishable from > bpf_core_calc_relo... Let's call this one bpf_core_resolve_relo()? > That's a much better name! Deciding the name of that function was probably the most complicated part of this patch. > > @@ -5636,12 +5627,31 @@ bpf_object__relocate_core(struct bpf_object *obj, const char *targ_btf_path) > > if (!prog->load) > > continue; > > > > - err = bpf_core_apply_relo(prog, rec, i, obj->btf, cand_cache); > > + err = bpf_core_calc_relo_res(prog, rec, i, obj->btf, cand_cache, &targ_res); > > if (err) { > > pr_warn("prog '%s': relo #%d: failed to relocate: %d\n", > > prog->name, i, err); > > goto out; > > } > > + > > + if (rec->insn_off % BPF_INSN_SZ) > > + return -EINVAL; > > + insn_idx = rec->insn_off / BPF_INSN_SZ; > > + /* adjust insn_idx from section frame of reference to the local > > + * program's frame of reference; (sub-)program code is not yet > > + * relocated, so it's enough to just subtract in-section offset > > + */ > > + insn_idx = insn_idx - prog->sec_insn_off; > > + if (insn_idx >= prog->insns_cnt) > > + return -EINVAL; > > + insn = &prog->insns[insn_idx]; > > this is sort of like sanity checks, let's do them before the core_calc > step, so after that it's a clean sequence of calc_relo + pathc_insn? > Makes sense. > > @@ -1177,18 +1152,18 @@ static void bpf_core_dump_spec(const char *prog_name, int level, const struct bp > > * between multiple relocations for the same type ID and is updated as some > > * of the candidates are pruned due to structural incompatibility. 
> > */ > > -int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, > > - int insn_idx, > > - const struct bpf_core_relo *relo, > > - int relo_idx, > > - const struct btf *local_btf, > > - struct bpf_core_cand_list *cands, > > - struct bpf_core_spec *specs_scratch) > > +int bpf_core_calc_relo_insn(const char *prog_name, > > please update the comment for this function, it's not "CO-RE relocate > single instruction" anymore, it's more like "Calculate CO-RE > relocation target result" or something along those lines. > Updated with your suggestion. > > @@ -1223,12 +1198,12 @@ int bpf_core_apply_relo_insn(const char *prog_name, struct bpf_insn *insn, > > /* TYPE_ID_LOCAL relo is special and doesn't need candidate search */ > > if (relo->kind == BPF_CORE_TYPE_ID_LOCAL) { > > /* bpf_insn's imm value could get out of sync during linking */ > > - memset(&targ_res, 0, sizeof(targ_res)); > > - targ_res.validate = false; > > - targ_res.poison = false; > > - targ_res.orig_val = local_spec->root_type_id; > > - targ_res.new_val = local_spec->root_type_id; > > - goto patch_insn; > > + memset(targ_res, 0, sizeof(*targ_res)); > > + targ_res->validate = true; > > hm.. original code sets it to false here, please don't regress the logic > Oops, I introduced this by mistake while rebasing.
> > @@ -5498,12 +5498,13 @@ static int record_relo_core(struct bpf_program *prog, > > return 0; > > } > > > > -static int bpf_core_calc_relo_res(struct bpf_program *prog, > > - const struct bpf_core_relo *relo, > > - int relo_idx, > > - const struct btf *local_btf, > > - struct hashmap *cand_cache, > > - struct bpf_core_relo_res *targ_res) > > +int bpf_core_calc_relo_res(struct bpf_program *prog, > > + const struct bpf_core_relo *relo, > > + int relo_idx, > > + const struct btf *local_btf, > > + struct hashmap *cand_cache, > > + struct bpf_core_relo_res *targ_res, > > + struct bpf_core_spec *targ_spec) > > maybe let's add targ_spec and local_spec into bpf_core_relo_res? that > way bpf_core_relo_res contains all the relevant information around > CO-RE relo resolution? > It's not needed anymore now that we're using bpf_core_calc_relo_insn() directly in bpftool. > > @@ -8190,6 +8211,11 @@ struct btf *bpf_object__btf(const struct bpf_object *obj) > > return obj ? obj->btf : NULL; > > } > > > > +struct btf_ext *bpf_object__btf_ext(const struct bpf_object *obj) > > +{ > > + return obj ? obj->btf_ext : NULL; > > just return obj->btf_ext, no one should be passing NULL for those getters I dropped this function as we don't need it now. > > > +} > > + > > int bpf_object__btf_fd(const struct bpf_object *obj) > > { > > return obj->btf ? 
btf__fd(obj->btf) : -1; > > @@ -8281,6 +8307,20 @@ bpf_object__next_program(const struct bpf_object *obj, struct bpf_program *prev) > > return prog; > > } > > > > +size_t bpf_object__get_nr_programs(const struct bpf_object *obj) > > +{ > > + return obj->nr_programs; > > +} > > + > > +struct bpf_program * > > +bpf_object__get_program(const struct bpf_object *obj, unsigned int i) > > +{ > > + if (i >= obj->nr_programs) > > + return NULL; > > + > > + return &obj->programs[i]; > > +} > > + > > struct bpf_program * > > bpf_program__prev(struct bpf_program *next, const struct bpf_object *obj) > > { > > @@ -8360,6 +8400,11 @@ int bpf_program__set_autoload(struct bpf_program *prog, bool autoload) > > return 0; > > } > > > > +int bpf_program__sec_idx(const struct bpf_program *prog) > > +{ > > + return prog->sec_idx; > > +} > > + > > static int bpf_program_nth_fd(const struct bpf_program *prog, int n); > > > > int bpf_program__fd(const struct bpf_program *prog) > > @@ -11779,3 +11824,8 @@ void bpf_object__destroy_skeleton(struct bpf_object_skeleton *s) > > free(s->progs); > > free(s); > > } > > + > > +void bpf_object_set_vmlinux_override(struct bpf_object *obj, struct btf *btf) > > +{ > > + obj->btf_vmlinux_override = btf; > > +} > > I don't think we need this, see comments in next patch > > > > diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h > > index 42b2f36fd9f0..2b048ee5a9b2 100644 > > --- a/tools/lib/bpf/libbpf.h > > +++ b/tools/lib/bpf/libbpf.h > > @@ -225,6 +225,8 @@ LIBBPF_API int bpf_object__set_kversion(struct bpf_object *obj, __u32 kern_versi > > > > struct btf; > > LIBBPF_API struct btf *bpf_object__btf(const struct bpf_object *obj); > > +struct btf_ext; > > +LIBBPF_API struct btf_ext *bpf_object__btf_ext(const struct bpf_object *obj); > > LIBBPF_API int bpf_object__btf_fd(const struct bpf_object *obj); > > > > LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_object__find_program_by_name() instead") > > @@ -290,6 +292,7 @@ LIBBPF_API LIBBPF_DEPRECATED("BPF 
program title is confusing term; please use bp > > const char *bpf_program__title(const struct bpf_program *prog, bool needs_copy); > > LIBBPF_API bool bpf_program__autoload(const struct bpf_program *prog); > > LIBBPF_API int bpf_program__set_autoload(struct bpf_program *prog, bool autoload); > > +LIBBPF_API int bpf_program__sec_idx(const struct bpf_program *prog); > > > > /* returns program size in bytes */ > > LIBBPF_DEPRECATED_SINCE(0, 7, "use bpf_program__insn_cnt() instead") > > diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map > > index b3938b3f8fc9..15da4075e0b5 100644 > > --- a/tools/lib/bpf/libbpf.map > > +++ b/tools/lib/bpf/libbpf.map > > @@ -392,6 +392,7 @@ LIBBPF_0.6.0 { > > bpf_map__map_extra; > > bpf_map__set_map_extra; > > bpf_map_create; > > + bpf_object__btf_ext; > > bpf_object__next_map; > > bpf_object__next_program; > > bpf_object__prev_map; > > @@ -401,6 +402,7 @@ LIBBPF_0.6.0 { > > bpf_program__flags; > > bpf_program__insn_cnt; > > bpf_program__insns; > > + bpf_program__sec_idx; > > bpf_program__set_flags; > > btf__add_btf; > > btf__add_decl_tag; > > diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h > > index 5dbe4f463880..b1962adb110c 100644 > > --- a/tools/lib/bpf/libbpf_internal.h > > +++ b/tools/lib/bpf/libbpf_internal.h > > @@ -524,4 +524,26 @@ static inline int ensure_good_fd(int fd) > > return fd; > > } > > > > +struct hashmap; > > + > > +int bpf_core_calc_relo_res(struct bpf_program *prog, > > + const struct bpf_core_relo *relo, > > + int relo_idx, > > + const struct btf *local_btf, > > + struct hashmap *cand_cache, > > + struct bpf_core_relo_res *targ_res, > > + struct bpf_core_spec *targ_spec); > > +void bpf_object_set_vmlinux_override(struct bpf_object *obj, struct btf *btf); > > +struct hashmap *bpf_core_create_cand_cache(void); > > +void bpf_core_free_cand_cache(struct hashmap *cand_cache); > > + > > +struct bpf_program *find_prog_by_sec_insn(const struct bpf_object *obj, > > + size_t 
sec_idx, size_t insn_idx); > > + > > +size_t bpf_object__get_nr_programs(const struct bpf_object *obj); > > + > > +struct bpf_program * > > +bpf_object__get_program(const struct bpf_object *obj, unsigned int n); > > + > > that's too much, I don't think you need bpf_program and all the things > around it (sec_idx, look up, etc). As for core_relo_is_field_based and > co, bpftool can do those simple checks on their own, no need to make > all the "internal API", don't overdo it with "let's expose internals > of libbpf to bpftool". > > > > + > > #endif /* __LIBBPF_LIBBPF_INTERNAL_H */ > > diff --git a/tools/lib/bpf/relo_core.c b/tools/lib/bpf/relo_core.c > > index 4f3552468624..66dfb7fa89a2 100644 > > --- a/tools/lib/bpf/relo_core.c > > +++ b/tools/lib/bpf/relo_core.c > > @@ -102,7 +102,7 @@ static const char *core_relo_kind_str(enum bpf_core_relo_kind kind) > > } > > } > > > > -static bool core_relo_is_field_based(enum bpf_core_relo_kind kind) > > +bool core_relo_is_field_based(enum bpf_core_relo_kind kind) > > { > > switch (kind) { > > case BPF_CORE_FIELD_BYTE_OFFSET: > > @@ -117,7 +117,7 @@ static bool core_relo_is_field_based(enum bpf_core_relo_kind kind) > > } > > } > > > > -static bool core_relo_is_type_based(enum bpf_core_relo_kind kind) > > +bool core_relo_is_type_based(enum bpf_core_relo_kind kind) > > { > > switch (kind) { > > case BPF_CORE_TYPE_ID_LOCAL: > > @@ -130,7 +130,7 @@ static bool core_relo_is_type_based(enum bpf_core_relo_kind kind) > > } > > } > > > > -static bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind) > > +bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind) > > { > > switch (kind) { > > case BPF_CORE_ENUMVAL_EXISTS: > > diff --git a/tools/lib/bpf/relo_core.h b/tools/lib/bpf/relo_core.h > > index a28bf3711ce2..e969dfb032f4 100644 > > --- a/tools/lib/bpf/relo_core.h > > +++ b/tools/lib/bpf/relo_core.h > > @@ -84,4 +84,8 @@ int bpf_core_patch_insn(const char *prog_name, struct bpf_insn *insn, > > int insn_idx, const struct 
bpf_core_relo *relo, > > int relo_idx, const struct bpf_core_relo_res *res); > > > > +bool core_relo_is_field_based(enum bpf_core_relo_kind kind); > > +bool core_relo_is_type_based(enum bpf_core_relo_kind kind); > > +bool core_relo_is_enumval_based(enum bpf_core_relo_kind kind); > > + > > #endif > > -- > > 2.25.1 > >
On Wed, Dec 22, 2021 at 7:33 PM Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote: > > On Fri, Dec 17, 2021 at 10:57 AM Mauricio Vásquez <mauricio@kinvolk.io> wrote: > > > > BTFGen's goal is to produce a BTF file that contains **only** the > > information that is needed by an eBPF program. The algorithm's first > > step collects the types involved in each relocation present in the > > object, "marking" them as needed. Types are collected in different > > ways according to the relocation kind: for field-based relocations > > only the struct and union members involved are considered; for > > type-based relocations the whole type is added; enum value relocations > > are not supported in this iteration yet. > > > > A second step generates a BTF file from the "marked" types. This step > > accesses the original BTF file, extracting the types and members > > that were "marked" as needed in the first step. > > > > This command is implemented under the "gen" command in bpftool and the > > syntax is the following: > > > > $ bpftool gen btf INPUT OUTPUT OBJECT(S) > > > > INPUT can be either a single BTF file or a folder containing BTF files; > > when it's a folder, a BTF file is generated for each BTF file contained > > in it. OUTPUT is the file (or folder) where the generated files are > > stored, and OBJECT(S) is the list of bpf objects we want to generate the > > BTF file(s) for (each generated BTF file contains all the types needed > > by all the objects). 
> > > > Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> > > Signed-off-by: Rafael David Tinoco <rafael.tinoco@aquasec.com> > > Signed-off-by: Lorenzo Fontana <lorenzo.fontana@elastic.co> > > Signed-off-by: Leonardo Di Donato <leonardo.didonato@elastic.co> > > --- > > tools/bpf/bpftool/gen.c | 892 ++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 892 insertions(+) > > > > I haven't looked through details of stripping BTF itself, let's > finalize CO-RE relocation parts first. Maybe for the next revision you > could split the bpftool changes in two in some reasonable way so that each > patch concentrates on different steps of the process a bit more? E.g., > the first patch might set up the new command and BTF stripping parts but leave > the CO-RE relocation logic unimplemented, and the second patch fills in > that part. Should make it easier to review this big patch. I totally agree. Will send v4 with more granular commits. > Please also cc Quentin Monnet to review bpftool parts as well. He's already there. > > +static int btf_reloc_info_gen_type(struct btf_reloc_info *info, struct bpf_core_spec *targ_spec) > > +{ > > + struct btf *btf = (struct btf *) info->src_btf; > > + struct btf_type *btf_type; > > + int err = 0; > > + > > + btf_type = (struct btf_type *) btf__type_by_id(btf, targ_spec->root_type_id); > > + > > + return btf_reloc_put_type_all(btf, info, btf_type, targ_spec->root_type_id); > > +} > > + > > +static int btf_reloc_info_gen_enumval(struct btf_reloc_info *info, struct bpf_core_spec *targ_spec) > > +{ > > + p_err("untreated enumval based relocation"); > > why untreated? what's the problem supporting it? > Nothing, we haven't given it any priority. It'll be part of the next iteration. 
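The two phases described in the commit message can be sketched with a self-contained toy model (toy_type and the mark_*/emitted_members helpers are hypothetical; the real implementation walks struct btf_type and struct btf_member through libbpf):

```c
#include <assert.h>

/* Toy model of the BTFGen phases: (1) mark the types/members each
 * relocation needs, (2) emit only what was marked. A "type" here is
 * reduced to a member bitmap. */

#define MAX_MEMBERS 8

struct toy_type {
	int nr_members;
	int member_marked[MAX_MEMBERS];
	int all_marked;			/* whole type needed */
};

/* field-based relocation: keep only the member the program accesses */
static void mark_member(struct toy_type *t, int member_idx)
{
	if (member_idx >= 0 && member_idx < t->nr_members)
		t->member_marked[member_idx] = 1;
}

/* type-based relocation: keep the complete type */
static void mark_whole_type(struct toy_type *t)
{
	t->all_marked = 1;
}

/* phase 2: how many members survive into the generated BTF */
static int emitted_members(const struct toy_type *t)
{
	int i, n = 0;

	if (t->all_marked)
		return t->nr_members;
	for (i = 0; i < t->nr_members; i++)
		n += t->member_marked[i];
	return n;
}
```

This is why the generated files stay tiny: a struct with dozens of members contributes only the handful of members the object's relocations actually touch.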
> > +static int btf_reloc_info_gen(struct btf_reloc_info *info, struct bpf_core_spec *res)
> > +{
> > +	if (core_relo_is_type_based(res->relo_kind))
> > +		return btf_reloc_info_gen_type(info, res);
> > +
> > +	if (core_relo_is_enumval_based(res->relo_kind))
> > +		return btf_reloc_info_gen_enumval(info, res);
> > +
> > +	if (core_relo_is_field_based(res->relo_kind))
> > +		return btf_reloc_info_gen_field(info, res);
>
> you can have a simple switch here instead of exposing libbpf internal
> helpers
>

Will do.

> > +static int btfgen_obj_reloc_info_gen(struct btf_reloc_info *reloc_info, struct bpf_object *obj)
> > +{
> > +	const struct btf_ext_info_sec *sec;
> > +	const struct bpf_core_relo *rec;
> > +	const struct btf_ext_info *seg;
> > +	struct hashmap *cand_cache;
> > +	int err, insn_idx, sec_idx;
> > +	struct bpf_program *prog;
> > +	struct btf_ext *btf_ext;
> > +	const char *sec_name;
> > +	size_t nr_programs;
> > +	struct btf *btf;
> > +	unsigned int i;
> > +
> > +	btf = bpf_object__btf(obj);
> > +	btf_ext = bpf_object__btf_ext(obj);
> > +
> > +	if (btf_ext->core_relo_info.len == 0)
> > +		return 0;
> > +
> > +	cand_cache = bpf_core_create_cand_cache();
> > +	if (IS_ERR(cand_cache))
> > +		return PTR_ERR(cand_cache);
> > +
> > +	bpf_object_set_vmlinux_override(obj, reloc_info->src_btf);
> > +
> > +	seg = &btf_ext->core_relo_info;
> > +	for_each_btf_ext_sec(seg, sec) {
> > +		bool prog_found;
> > +
> > +		sec_name = btf__name_by_offset(btf, sec->sec_name_off);
> > +		if (str_is_empty(sec_name)) {
> > +			err = -EINVAL;
> > +			goto out;
> > +		}
> > +
> > +		prog_found = false;
> > +		nr_programs = bpf_object__get_nr_programs(obj);
> > +		for (i = 0; i < nr_programs; i++) {
> > +			prog = bpf_object__get_program(obj, i);
> > +			if (strcmp(bpf_program__section_name(prog), sec_name) == 0) {
> > +				prog_found = true;
> > +				break;
> > +			}
> > +		}
> > +
> > +		if (!prog_found) {
> > +			pr_warn("sec '%s': failed to find a BPF program\n", sec_name);
> > +			err = -EINVAL;
> > +			goto out;
> > +		}
> > +
> > +		sec_idx = bpf_program__sec_idx(prog);
> > +
> > +		for_each_btf_ext_rec(seg, sec, i, rec) {
> > +			struct bpf_core_relo_res targ_res;
> > +			struct bpf_core_spec targ_spec;
> > +
> > +			insn_idx = rec->insn_off / BPF_INSN_SZ;
> > +
> > +			prog = find_prog_by_sec_insn(obj, sec_idx, insn_idx);
> > +			if (!prog) {
> > +				pr_warn("sec '%s': failed to find program at insn #%d for CO-RE offset relocation #%d\n",
> > +					sec_name, insn_idx, i);
> > +				err = -EINVAL;
> > +				goto out;
> > +			}
> > +
> > +			err = bpf_core_calc_relo_res(prog, rec, i, btf, cand_cache, &targ_res,
> > +						     &targ_spec);
>
> I don't think you need to do *exactly* what libbpf is doing.
> bpf_core_calc_relo_res() doesn't add much on top of
> bpf_core_calc_relo_insn(). If you use bpf_core_calc_relo_insn()
> directly and expose bpf_core_add_cands()/bpf_core_free_cands() and use
> them directly as well, bypassing bpf_object completely, you won't need
> the btf_vmlinux override, and everything becomes less coupled, I
> think. In the future, if you'd like to support module BTFs, you'll
> have all the necessary flexibility to do that, whereas if you try to
> reuse every single line of bpf_object's code we'll keep adding hacks
> like your bpf_object_set_vmlinux_override().
>
> Fundamentally, you don't care about bpf_object and bpf_programs. All
> you need to do is parse .BTF.ext (you can do that with just
> btf__parse(), no need to even construct a bpf_object!). Construct the
> candidate cache, then for each CO-RE record find/add candidates,
> calculate the relocation result, and use it for your algorithm.
>
> Yes, there will be a bit of duplication (candidate search), but that's
> better than trying to turn bpf_object inside out with all the custom
> getters/setters you are exposing (even if it's libbpf-internal only,
> it still makes it really hard to reason about what's going on and what
> the consequences of manual control over so many bpf_object
> implementation details are).
>

Thanks a lot for this suggestion. We implemented it this way and the
result is much better.