bpf.vger.kernel.org archive mirror
* [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF
@ 2024-04-24 15:47 Alan Maguire
  2024-04-24 15:47 ` [PATCH v2 bpf-next 01/13] libbpf: add support to btf__add_fwd() for ENUM64 Alan Maguire
                   ` (13 more replies)
  0 siblings, 14 replies; 43+ messages in thread
From: Alan Maguire @ 2024-04-24 15:47 UTC (permalink / raw)
  To: andrii, ast
  Cc: jolsa, acme, quentin, eddyz87, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan, Alan Maguire

Split BPF Type Format (BTF) provides huge advantages in that kernel
modules only have to provide type information for types that they do not
share with the core kernel; for core kernel types, split BTF refers to
core kernel BTF type ids.  So for a STRUCT sk_buff, a module that
uses that structure (or a pointer to it) simply needs to refer to the
core kernel type id, saving the need to define the structure and its many
dependents.  This cuts down on duplication and makes BTF as compact
as possible.

However, there is a downside.  This scheme requires the references from
split BTF to base BTF to be valid not just at encoding time, but at use
time (when the module is loaded).  Even a small change in kernel types
can perturb the type ids in core kernel BTF, and due to pahole's
parallel processing of compilation units, even an unchanged kernel can
have different type ids if BTF is re-generated.  So we have a robustness
problem for split BTF for cases where a module is not always compiled at
the same time as the kernel.  This problem is particularly acute for
distros which generally want module builders to be able to compile a
module for the lifetime of a Linux stable-based release, and have it
continue to be valid over the lifetime of that release, even as changes
in data structures (and hence BTF types) accrue.  Today it's not
possible to generate BTF for modules that works beyond the initial
kernel it is compiled against - kernel bugfixes etc invalidate the split
BTF references to vmlinux BTF, and BTF is no longer usable for the
module.

The goal of this series is to provide additional context for cases
like this.  That context comes in the form of
distilled base BTF; it stands in for the base BTF, and contains
information about the types referenced from split BTF, but not their
full descriptions.  The modified split BTF will refer to type ids in
this .BTF.base section, and when the kernel loads such modules it
will use that base BTF to map references from split BTF to the
current vmlinux BTF - a process of relocating split BTF with the
currently-running kernel's vmlinux base BTF.

A module builder - using this series along with the pahole changes -
can then build a module with distilled base BTF via an out-of-tree
module build, i.e.

make -C . M=path/2/module

The module will have a .BTF section (the split BTF) and a
.BTF.base section.  The latter is small in size - distilled base
BTF does not need full struct/union/enum information for named
types for example.  For 2667 modules built with distilled base BTF,
the average size observed was 1556 bytes (stddev 1563).

Note that for the in-tree modules, this approach is not needed as
split and base BTF in the case of in-tree modules are always built
and re-built together.

The series first focuses on generating split BTF with distilled base
BTF, and provides btf__parse_opts() which allows specification
of the section name from which to read BTF data, since we now have
both .BTF and .BTF.base sections that can contain such data.

Then we add support to resolve_btfids for generating the .BTF.ids
section with reference to the .BTF.base section - this ensures the
.BTF.ids match those used in the split/base BTF.

Finally the series provides the mechanism for relocating split BTF with
a new base; the distilled base BTF is used to map the references to base
BTF in the split BTF to the new base.  For the kernel, this relocation
process happens at module load time, and we relocate split BTF
references to point at types in the current vmlinux BTF.  As part of
this, .BTF.ids references need to be mapped also.

So concretely, what happens is

- we generate split BTF in the .BTF section of a module that refers to
  types in the .BTF.base section as base types; these are not full
  type descriptions but provide information about the base type.  So
  a STRUCT sk_buff would be represented as a FWD struct sk_buff in
  distilled base BTF for example.
- when the module is loaded, the split BTF is relocated with vmlinux
  BTF; in the case of the FWD struct sk_buff, we find the STRUCT sk_buff
  in vmlinux BTF and map all split BTF references to the distilled base
  FWD sk_buff, replacing them with references to the vmlinux BTF
  STRUCT sk_buff.
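
The two steps above can be sketched with a toy model.  This is a minimal
illustration only, not the code in this series: struct demo_type,
demo_find_in_base() and demo_build_id_map() are hypothetical names, and the
matching is reduced to "same kind, same name", leaving out the size checks
and the other kinds that real relocation handles.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified type record; real BTF types are struct btf_type. */
struct demo_type {
	const char *name;
	int is_union;	/* 0 = struct, 1 = union */
	int is_fwd;	/* 1 = forward declaration (no members) */
};

/* Types are stored at index id - 1, since id 0 is reserved for void.
 * Returns the id of the full (non-FWD) type in base matching the
 * distilled forward declaration by kind and name, or -1 if none exists.
 */
static int demo_find_in_base(const struct demo_type *base, int nr_base,
			     const struct demo_type *fwd)
{
	int id;

	for (id = 1; id <= nr_base; id++) {
		const struct demo_type *t = &base[id - 1];

		if (t->is_fwd || t->is_union != fwd->is_union)
			continue;
		if (strcmp(t->name, fwd->name) == 0)
			return id;
	}
	return -1;
}

/* Fill id_map[distilled id] = current base id; id_map needs
 * nr_distilled + 1 entries.  Returns 0 on success, -1 if any
 * distilled type has no match in the current base.
 */
static int demo_build_id_map(const struct demo_type *distilled, int nr_distilled,
			     const struct demo_type *base, int nr_base, int *id_map)
{
	int id;

	assert(distilled && base && id_map);
	id_map[0] = 0;	/* void maps to void */
	for (id = 1; id <= nr_distilled; id++) {
		id_map[id] = demo_find_in_base(base, nr_base, &distilled[id - 1]);
		if (id_map[id] < 0)
			return -1;
	}
	return 0;
}
```

With a distilled base holding FWD struct sk_buff as id 1, and a current base
where STRUCT sk_buff happens to be id 3, the map records 1 -> 3; a distilled
type with no counterpart in the current base makes the whole relocation fail
in this sketch.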

Support is also added to bpftool to be able to display split BTF
relative to its .BTF.base section, and also to display the relocated
form via the "-R path_to_base_btf" option.

A previous approach to this problem [1] utilized standalone BTF for such
cases - where the BTF is not defined relative to base BTF so there is no
relocation required.  The problem with that approach is that from
the verifier perspective, some types are special, and having a custom
representation of a core kernel type that did not necessarily match the
current representation is not tenable.  So the approach taken here was
to preserve the split BTF model while minimizing the representation of
the context needed to relocate split and current vmlinux BTF.

To generate distilled .BTF.base sections the associated dwarves
patch (to be applied on the "next" branch there) is needed.
Without it, things will still work but bpf_testmod will not be built
with a .BTF.base section.

Changes since RFC [2]:

- updated terminology; we replace clunky "base reference" BTF with
  distilling base BTF into a .BTF.base section. Similarly BTF
  reconciliation becomes BTF relocation (Andrii, most patches)
- add distilled base BTF by default for out-of-tree modules
  (Alexei, patch 8)
- distill algorithm updated to record size of embedded struct/union
  by recording it as a 0-vlen STRUCT/UNION with size preserved
  (Andrii, patch 2)
- verify size match on relocation for such STRUCT/UNIONs (Andrii,
  patch 9)
- with embedded STRUCT/UNION recording size, we can have bpftool
  dump a header representation using .BTF.base + .BTF sections
  rather than special-casing and refusing to use "format c" for
  that case (patch 5)
- match enum with enum64 and vice versa (Andrii, patch 9)
- ensure that resolve_btfids works with BTF without .BTF.base
  section (patch 7)
- update tests to cover embedded types, arrays and function
  prototypes (patches 3, 12)

One change not made yet is adding anonymous struct/unions that the split
BTF references in base BTF to the module instead of adding them to the
.BTF.base section.  That would involve having to maintain two pipes for
writing BTF, one for the .BTF.base and one for the split BTF.  It would
be possible, but there are I think some edge cases that might make it
tricky.  For example consider a split BTF reference to a base BTF
ARRAY which in turn referenced an anonymous STRUCT as type.  In such a
case, it wouldn't make sense to have the array in the .BTF.base section
while having the STRUCT in the module.  The general concern is that once
we move a type to the module we would need to also ensure any base types
that refer to it move there too.  For now it is I think simpler to
retain the existing split/base type classifications.

[1] https://lore.kernel.org/bpf/20231112124834.388735-14-alan.maguire@oracle.com/
[2] https://lore.kernel.org/bpf/20240322102455.98558-1-alan.maguire@oracle.com/



Alan Maguire (13):
  libbpf: add support to btf__add_fwd() for ENUM64
  libbpf: add btf__distill_base() creating split BTF with distilled base
    BTF
  selftests/bpf: test distilled base, split BTF generation
  libbpf: add btf__parse_opts() API for flexible BTF parsing
  bpftool: support displaying raw split BTF using base BTF section as
    base
  kbuild,bpf: switch to using --btf_features for pahole v1.26 and later
  resolve_btfids: use .BTF.base ELF section as base BTF if -B option is
    used
  kbuild, bpf: add module-specific pahole/resolve_btfids flags for
    distilled base BTF
  libbpf: split BTF relocation
  module, bpf: store BTF base pointer in struct module
  libbpf,bpf: share BTF relocate-related code with kernel
  selftests/bpf: extend distilled BTF tests to cover BTF relocation
  bpftool: support displaying relocated-with-base split BTF

 include/linux/btf.h                           |  32 +
 include/linux/module.h                        |   2 +
 kernel/bpf/Makefile                           |   8 +
 kernel/bpf/btf.c                              | 227 +++++--
 kernel/module/main.c                          |   5 +-
 scripts/Makefile.btf                          |  12 +-
 scripts/Makefile.modfinal                     |   4 +-
 .../bpf/bpftool/Documentation/bpftool-btf.rst |  15 +-
 tools/bpf/bpftool/bash-completion/bpftool     |   7 +-
 tools/bpf/bpftool/btf.c                       |  20 +-
 tools/bpf/bpftool/main.c                      |  14 +-
 tools/bpf/bpftool/main.h                      |   2 +
 tools/bpf/resolve_btfids/main.c               |  22 +-
 tools/lib/bpf/Build                           |   2 +-
 tools/lib/bpf/btf.c                           | 561 +++++++++++-----
 tools/lib/bpf/btf.h                           |  61 ++
 tools/lib/bpf/btf_common.c                    | 146 ++++
 tools/lib/bpf/btf_relocate.c                  | 630 ++++++++++++++++++
 tools/lib/bpf/libbpf.map                      |   3 +
 tools/lib/bpf/libbpf_internal.h               |   2 +
 .../selftests/bpf/prog_tests/btf_distill.c    | 298 +++++++++
 21 files changed, 1864 insertions(+), 209 deletions(-)
 create mode 100644 tools/lib/bpf/btf_common.c
 create mode 100644 tools/lib/bpf/btf_relocate.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_distill.c

-- 
2.31.1



* [PATCH v2 bpf-next 01/13] libbpf: add support to btf__add_fwd() for ENUM64
  2024-04-24 15:47 [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
@ 2024-04-24 15:47 ` Alan Maguire
  2024-04-26 22:56   ` Andrii Nakryiko
  2024-04-24 15:47 ` [PATCH v2 bpf-next 02/13] libbpf: add btf__distill_base() creating split BTF with distilled base BTF Alan Maguire
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: Alan Maguire @ 2024-04-24 15:47 UTC (permalink / raw)
  To: andrii, ast
  Cc: jolsa, acme, quentin, eddyz87, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan, Alan Maguire

Forward declaration of BTF_KIND_ENUM64 is added by supporting BTF_FWD_ENUM64
as an enumerated value for btf_fwd_kind; an ENUM64 forward is an 8-byte
signed enum64 with no values.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 tools/lib/bpf/btf.c | 7 ++++++-
 tools/lib/bpf/btf.h | 1 +
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 2d0840ef599a..44afae098369 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -2418,7 +2418,7 @@ int btf__add_enum64_value(struct btf *btf, const char *name, __u64 value)
  * Append new BTF_KIND_FWD type with:
  *   - *name*, non-empty/non-NULL name;
  *   - *fwd_kind*, kind of forward declaration, one of BTF_FWD_STRUCT,
- *     BTF_FWD_UNION, or BTF_FWD_ENUM;
+ *     BTF_FWD_UNION, BTF_FWD_ENUM or BTF_FWD_ENUM64;
  * Returns:
  *   - >0, type ID of newly added BTF type;
  *   - <0, on error.
@@ -2446,6 +2446,11 @@ int btf__add_fwd(struct btf *btf, const char *name, enum btf_fwd_kind fwd_kind)
 		 * values; we also assume a standard 4-byte size for it
 		 */
 		return btf__add_enum(btf, name, sizeof(int));
+	case BTF_FWD_ENUM64:
+		/* enum64 forward is similarly just an enum64 with no enum
+		 * values; assume 8 byte size, signed.
+		 */
+		return btf__add_enum64(btf, name, sizeof(__u64), true);
 	default:
 		return libbpf_err(-EINVAL);
 	}
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index 8e6880d91c84..47d3e00b25c7 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -194,6 +194,7 @@ enum btf_fwd_kind {
 	BTF_FWD_STRUCT = 0,
 	BTF_FWD_UNION = 1,
 	BTF_FWD_ENUM = 2,
+	BTF_FWD_ENUM64 = 3,
 };
 
 LIBBPF_API int btf__add_fwd(struct btf *btf, const char *name, enum btf_fwd_kind fwd_kind);
-- 
2.31.1



* [PATCH v2 bpf-next 02/13] libbpf: add btf__distill_base() creating split BTF with distilled base BTF
  2024-04-24 15:47 [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
  2024-04-24 15:47 ` [PATCH v2 bpf-next 01/13] libbpf: add support to btf__add_fwd() for ENUM64 Alan Maguire
@ 2024-04-24 15:47 ` Alan Maguire
  2024-04-26 22:57   ` Andrii Nakryiko
  2024-04-30 23:06   ` Eduard Zingerman
  2024-04-24 15:47 ` [PATCH v2 bpf-next 03/13] selftests/bpf: test distilled base, split BTF generation Alan Maguire
                   ` (11 subsequent siblings)
  13 siblings, 2 replies; 43+ messages in thread
From: Alan Maguire @ 2024-04-24 15:47 UTC (permalink / raw)
  To: andrii, ast
  Cc: jolsa, acme, quentin, eddyz87, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan, Alan Maguire

To support more robust split BTF, adding supplemental context for the
base BTF type ids that split BTF refers to is required.  Without such
references, a simple shuffling of base BTF type ids (without any other
significant change) invalidates the split BTF.  Here the attempt is made
to store additional context to make split BTF more robust.

This context comes in the form of distilled base BTF - this base BTF
constitutes the minimal BTF representation needed to disambiguate split BTF
references to base BTF.  The rules are as follows:

- INT, FLOAT are recorded in full.
- if a named base BTF STRUCT or UNION is referred to from split BTF, it
  will be encoded either as a zero-member sized STRUCT/UNION (preserving
  size for later relocation checks) or as a named FWD.  Only base BTF
  STRUCT/UNIONs that are embedded in split BTF STRUCT/UNIONs need to
  preserve size information, so a FWD representation will be used in
  most cases.
- if an ENUM[64] is named, an ENUM[64] forward representation (an ENUM[64]
  with no values) is used.
- if a STRUCT, UNION, ENUM or ENUM64 is not named, it is recorded in full.
- base BTF reference types like CONST, RESTRICT, TYPEDEF, PTR are recorded
  as-is.
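
The rules above can be read as a small decision function.  The sketch below
is illustrative only and uses hypothetical names (enum toy_kind, enum
toy_repr, struct toy_type, toy_distilled_repr()) rather than libbpf types;
it follows the rules as listed here and is not the implementation in this
patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified kinds; real BTF kinds are the BTF_KIND_* values. */
enum toy_kind {
	TOY_INT, TOY_FLOAT, TOY_STRUCT, TOY_UNION, TOY_ENUM, TOY_ENUM64,
	TOY_PTR, TOY_CONST, TOY_RESTRICT, TOY_TYPEDEF,
};

/* How a referenced base type is represented in distilled base BTF. */
enum toy_repr {
	TOY_REPR_FULL,		/* recorded in full / as-is */
	TOY_REPR_FWD,		/* named forward, no members or values */
	TOY_REPR_EMPTY_SIZED,	/* zero-member STRUCT/UNION, size preserved */
};

struct toy_type {
	enum toy_kind kind;
	const char *name;	/* NULL or "" when anonymous */
	bool embedded;		/* embedded by value in a split BTF struct/union */
};

static bool toy_is_named(const struct toy_type *t)
{
	return t->name && t->name[0] != '\0';
}

static enum toy_repr toy_distilled_repr(const struct toy_type *t)
{
	assert(t != NULL);

	switch (t->kind) {
	case TOY_STRUCT:
	case TOY_UNION:
		/* anonymous: the name cannot disambiguate, keep full details */
		if (!toy_is_named(t))
			return TOY_REPR_FULL;
		/* embedded by value: size must be checked at relocation */
		return t->embedded ? TOY_REPR_EMPTY_SIZED : TOY_REPR_FWD;
	case TOY_ENUM:
	case TOY_ENUM64:
		return toy_is_named(t) ? TOY_REPR_FWD : TOY_REPR_FULL;
	default:
		/* INT, FLOAT and reference kinds are recorded as-is */
		return TOY_REPR_FULL;
	}
}
```

For instance, a named struct referenced only through a pointer maps to
TOY_REPR_FWD, while the same struct embedded by value in a split BTF struct
maps to TOY_REPR_EMPTY_SIZED.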

Avoiding struct/union/enum/enum64 expansion is important to keep the
distilled base BTF representation to a minimum size; however anonymous
struct, union and enum[64] types are represented in full since type details
are needed to disambiguate the reference - the name is not enough in those
cases since there is no name.  In practice these are rare; in sample
cases where distilled base BTF was generated for in-tree kernel modules,
only a few were needed in distilled base BTF.  These represent the
anonymous struct/unions that are used by the module but were de-duplicated
to use base vmlinux BTF ids instead.

When successful, new representations of the distilled base BTF and new
split BTF that refers to it are returned.  Both need to be freed by the
caller.

So to take a simple example, with split BTF with a type referring
to "struct sk_buff", we will generate distilled base BTF with a
FWD struct sk_buff, and the split BTF will refer to it instead.

Tools like pahole can utilize such split BTF to populate the .BTF section
(split BTF) and an additional .BTF.base section.
Then when the split BTF is loaded, the distilled base BTF can be used
to relocate split BTF to reference the current - and possibly changed -
base BTF.

So for example if "struct sk_buff" was id 502 when the split BTF was
originally generated,  we can use the distilled base BTF to see that
id 502 refers to a "struct sk_buff" and replace instances of id 502
with the current (relocated) base BTF sk_buff type id.
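
As a rough illustration of that last step, the sketch below rewrites a flat
list of type id references using an old-id to new-id table.  The names
(struct sketch_remap, sketch_rewrite_refs()) and the new id 517 used in the
usage note are made up for the example; real relocation walks the type ids
inside each split BTF type rather than a flat array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical remap entry: a base type id as recorded when the split BTF
 * was generated, and the id of the matching type in the current base BTF.
 */
struct sketch_remap {
	unsigned int old_id;
	unsigned int new_id;
};

/* Rewrite every reference that appears in the remap table, in place.
 * Each reference is rewritten at most once, so a new id that happens to
 * equal some other old id is not remapped a second time.
 * Returns the number of references rewritten.
 */
static size_t sketch_rewrite_refs(unsigned int *refs, size_t nr_refs,
				  const struct sketch_remap *map, size_t nr_map)
{
	size_t i, j, rewritten = 0;

	assert(refs && map);
	for (i = 0; i < nr_refs; i++) {
		for (j = 0; j < nr_map; j++) {
			if (refs[i] != map[j].old_id)
				continue;
			refs[i] = map[j].new_id;
			rewritten++;
			break;
		}
	}
	return rewritten;
}
```

If "struct sk_buff" was id 502 and is, say, id 517 in the current base, a
table holding { 502, 517 } turns the references { 502, 7, 502, 1 } into
{ 517, 7, 517, 1 } and leaves unrelated ids untouched.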

Distilled base BTF is small; when building a kernel with all modules
using distilled base BTF as a test, the average size for module
distilled base BTF is 1555 bytes (standard deviation 1563).  The
maximum distilled base BTF size across ~2700 modules was 37895 bytes.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 tools/lib/bpf/btf.c      | 316 ++++++++++++++++++++++++++++++++++++++-
 tools/lib/bpf/btf.h      |  20 +++
 tools/lib/bpf/libbpf.map |   1 +
 3 files changed, 331 insertions(+), 6 deletions(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 44afae098369..419cc4fa2e86 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -1771,9 +1771,8 @@ static int btf_rewrite_str(__u32 *str_off, void *ctx)
 	return 0;
 }
 
-int btf__add_type(struct btf *btf, const struct btf *src_btf, const struct btf_type *src_type)
+static int btf_add_type(struct btf_pipe *p, const struct btf_type *src_type)
 {
-	struct btf_pipe p = { .src = src_btf, .dst = btf };
 	struct btf_type *t;
 	int sz, err;
 
@@ -1782,20 +1781,27 @@ int btf__add_type(struct btf *btf, const struct btf *src_btf, const struct btf_t
 		return libbpf_err(sz);
 
 	/* deconstruct BTF, if necessary, and invalidate raw_data */
-	if (btf_ensure_modifiable(btf))
+	if (btf_ensure_modifiable(p->dst))
 		return libbpf_err(-ENOMEM);
 
-	t = btf_add_type_mem(btf, sz);
+	t = btf_add_type_mem(p->dst, sz);
 	if (!t)
 		return libbpf_err(-ENOMEM);
 
 	memcpy(t, src_type, sz);
 
-	err = btf_type_visit_str_offs(t, btf_rewrite_str, &p);
+	err = btf_type_visit_str_offs(t, btf_rewrite_str, p);
 	if (err)
 		return libbpf_err(err);
 
-	return btf_commit_type(btf, sz);
+	return btf_commit_type(p->dst, sz);
+}
+
+int btf__add_type(struct btf *btf, const struct btf *src_btf, const struct btf_type *src_type)
+{
+	struct btf_pipe p = { .src = src_btf, .dst = btf };
+
+	return btf_add_type(&p, src_type);
 }
 
 static int btf_rewrite_type_ids(__u32 *type_id, void *ctx)
@@ -5217,3 +5223,301 @@ int btf_ext_visit_str_offs(struct btf_ext *btf_ext, str_off_visit_fn visit, void
 
 	return 0;
 }
+
+struct btf_distill_id {
+	int id;
+	bool embedded;		/* true if id refers to a struct/union in base BTF
+				 * that is embedded in a split BTF struct/union.
+				 */
+};
+
+struct btf_distill {
+	struct btf_pipe pipe;
+	struct btf_distill_id *ids;
+	__u32 query_id;
+	unsigned int nr_base_types;
+	unsigned int diff_id;
+};
+
+/* Check if a member of a split BTF struct/union refers to a base BTF
+ * struct/union.  Members can be const/restrict/volatile/typedef
+ * reference types, but if a pointer is encountered, type is no longer
+ * considered embedded.
+ */
+static int btf_find_embedded_composite_type_ids(__u32 *id, void *ctx)
+{
+	struct btf_distill *dist = ctx;
+	const struct btf_type *t;
+	__u32 next_id = *id;
+
+	do {
+		if (next_id == 0)
+			return 0;
+		t = btf_type_by_id(dist->pipe.src, next_id);
+		switch (btf_kind(t)) {
+		case BTF_KIND_CONST:
+		case BTF_KIND_RESTRICT:
+		case BTF_KIND_VOLATILE:
+		case BTF_KIND_TYPEDEF:
+			next_id = t->type;
+			break;
+		case BTF_KIND_ARRAY: {
+			struct btf_array *a = btf_array(t);
+
+			next_id = a->type;
+			break;
+		}
+		case BTF_KIND_STRUCT:
+		case BTF_KIND_UNION:
+			dist->ids[next_id].embedded = next_id > 0 &&
+						      next_id <= dist->nr_base_types;
+			return 0;
+		default:
+			return 0;
+		}
+
+	} while (next_id != 0);
+
+	return 0;
+}
+
+static bool btf_is_eligible_named_fwd(const struct btf_type *t)
+{
+	return (btf_is_composite(t) || btf_is_any_enum(t)) && t->name_off != 0;
+}
+
+static int btf_add_distilled_type_ids(__u32 *id, void *ctx)
+{
+	struct btf_distill *dist = ctx;
+	struct btf_type *t = btf_type_by_id(dist->pipe.src, *id);
+	int ret;
+
+	/* split BTF id, not needed */
+	if (*id > dist->nr_base_types)
+		return 0;
+	/* already added ? */
+	if (dist->ids[*id].id >= 0)
+		return 0;
+	dist->ids[*id].id = *id;
+
+	/* only a subset of base BTF types should be referenced from split
+	 * BTF; ensure nothing unexpected is referenced.
+	 */
+	switch (btf_kind(t)) {
+	case BTF_KIND_INT:
+	case BTF_KIND_FLOAT:
+	case BTF_KIND_ARRAY:
+	case BTF_KIND_STRUCT:
+	case BTF_KIND_UNION:
+	case BTF_KIND_TYPEDEF:
+	case BTF_KIND_ENUM:
+	case BTF_KIND_ENUM64:
+	case BTF_KIND_PTR:
+	case BTF_KIND_CONST:
+	case BTF_KIND_RESTRICT:
+	case BTF_KIND_VOLATILE:
+	case BTF_KIND_FUNC_PROTO:
+		break;
+	default:
+		pr_warn("unexpected reference to base type[%u] of kind [%u] when creating distilled base BTF.\n",
+			*id, btf_kind(t));
+		return -EINVAL;
+	}
+
+	/* struct/union members not needed, except for anonymous structs
+	 * and unions, which we need since name won't help us determine
+	 * matches; so if a named struct/union, no need to recurse
+	 * into members.
+	 */
+	if (btf_is_eligible_named_fwd(t))
+		return 0;
+
+	/* ensure references in type are added also. */
+	ret = btf_type_visit_type_ids(t, btf_add_distilled_type_ids, ctx);
+	if (ret < 0)
+		return ret;
+	return 0;
+}
+
+/* All split BTF ids will be shifted downwards since there are fewer base BTF
+ * types in distilled base BTF, and for those that refer to base BTF, we use the
+ * reference map to map from original base BTF to distilled base BTF id.
+ */
+static int btf_update_distilled_type_ids(__u32 *id, void *ctx)
+{
+	struct btf_distill *dist = ctx;
+
+	if (*id >= dist->nr_base_types)
+		*id -= dist->diff_id;
+	else
+		*id = dist->ids[*id].id;
+	return 0;
+}
+
+/* Create updated split BTF with distilled base BTF; distilled base BTF
+ * consists of BTF information required to clarify the types that split
+ * BTF refers to, omitting unneeded details.  Specifically it will contain
+ * base types and forward declarations of structs, unions and enumerated
+ * types, along with associated reference types like pointers, arrays etc.
+ *
+ * The only case where structs, unions or enumerated types are fully represented
+ * is when they are anonymous; in such cases, info about type content is needed
+ * to clarify type references.
+ *
+ * We return newly-created split BTF where the split BTF refers to a newly-created
+ * distilled base BTF. Both must be freed separately by the caller.
+ *
+ * When creating the BTF representation for a module and provided with the
+ * distilled_base option, pahole will create split BTF using this API, and store
+ * the distilled base BTF in the .BTF.base section.
+ */
+int btf__distill_base(const struct btf *src_btf, struct btf **new_base_btf,
+		      struct btf **new_split_btf)
+{
+	struct btf *new_base = NULL, *new_split = NULL;
+	unsigned int n = btf__type_cnt(src_btf);
+	struct btf_distill dist = {};
+	struct btf_type *t;
+	__u32 i, id = 0;
+	int ret = 0;
+
+	/* src BTF must be split BTF. */
+	if (!new_base_btf || !new_split_btf || !btf__base_btf(src_btf)) {
+		errno = EINVAL;
+		return -EINVAL;
+	}
+	new_base = btf__new_empty();
+	if (!new_base)
+		return -ENOMEM;
+	dist.ids = calloc(n, sizeof(*dist.ids));
+	if (!dist.ids) {
+		ret = -ENOMEM;
+		goto err_out;
+	}
+	for (i = 1; i < n; i++)
+		dist.ids[i].id = -1;
+	dist.pipe.src = src_btf;
+	dist.pipe.dst = new_base;
+	dist.pipe.str_off_map = hashmap__new(btf_dedup_identity_hash_fn, btf_dedup_equal_fn, NULL);
+	if (IS_ERR(dist.pipe.str_off_map)) {
+		ret = -ENOMEM;
+		goto err_out;
+	}
+	dist.nr_base_types = btf__type_cnt(btf__base_btf(src_btf));
+
+	/* Pass over src split BTF; generate the list of base BTF
+	 * type ids it references; these will constitute our distilled
+	 * base BTF set.
+	 */
+	for (i = src_btf->start_id; i < n; i++) {
+		t = (struct btf_type *)btf__type_by_id(src_btf, i);
+
+		/* check if members of struct/union in split BTF refer to base BTF
+		 * struct/union; if so, we will use an empty sized struct to represent
+		 * it rather than a FWD because its size must match on later BTF
+		 * relocation.
+		 */
+		if (btf_is_composite(t)) {
+			ret = btf_type_visit_type_ids(t, btf_find_embedded_composite_type_ids,
+						      &dist);
+			if (ret < 0)
+				goto err_out;
+		}
+		ret = btf_type_visit_type_ids(t,  btf_add_distilled_type_ids, &dist);
+		if (ret < 0)
+			goto err_out;
+	}
+	/* Next add types for each of the required references. */
+	for (i = 1; i < src_btf->start_id; i++) {
+		if (dist.ids[i].id < 0)
+			continue;
+		t = btf_type_by_id(src_btf, i);
+
+		if (dist.ids[i].embedded) {
+			/* If a named struct/union in base BTF is referenced as a type
+			 * in split BTF without use of a pointer - i.e. as an embedded
+			 * struct/union - add an empty struct/union preserving size
+			 * since size must be consistent when relocating split and
+			 * possibly changed base BTF.
+			 */
+			ret = btf_add_composite(new_base, btf_kind(t),
+						btf__name_by_offset(src_btf, t->name_off),
+						t->size);
+		} else if (btf_is_eligible_named_fwd(t)) {
+			enum btf_fwd_kind fwd_kind;
+
+			/* If not embedded, use a fwd for named struct/unions since we
+			 * can match via name without any other details.
+			 */
+			switch (btf_kind(t)) {
+			case BTF_KIND_STRUCT:
+				fwd_kind = BTF_FWD_STRUCT;
+				break;
+			case BTF_KIND_UNION:
+				fwd_kind = BTF_FWD_UNION;
+				break;
+			case BTF_KIND_ENUM:
+				fwd_kind = BTF_FWD_ENUM;
+				break;
+			case BTF_KIND_ENUM64:
+				fwd_kind = BTF_FWD_ENUM64;
+				break;
+			default:
+				pr_warn("unexpected kind [%u] when creating distilled base BTF.\n",
+					btf_kind(t));
+				goto err_out;
+			}
+			ret = btf__add_fwd(new_base, btf__name_by_offset(src_btf, t->name_off),
+					   fwd_kind);
+		} else {
+			ret = btf_add_type(&dist.pipe, t);
+		}
+		if (ret < 0)
+			goto err_out;
+		dist.ids[i].id = ++id;
+	}
+	/* now create new split BTF with distilled base BTF as its base; we end up with
+	 * split BTF that has base BTF that represents enough about its base references
+	 * to allow it to be relocated with the base BTF available.
+	 */
+	new_split = btf__new_empty_split(new_base);
+	if (!new_split) {
+		ret = libbpf_get_error(new_split);
+		goto err_out;
+	}
+
+	dist.pipe.dst = new_split;
+	/* all split BTF ids will be shifted downwards since there are fewer base BTF ids
+	 * in distilled base BTF.
+	 */
+	dist.diff_id = dist.nr_base_types - btf__type_cnt(new_base);
+
+	/* First add all split types */
+	for (i = src_btf->start_id; i < n; i++) {
+		t = btf_type_by_id(src_btf, i);
+		ret = btf_add_type(&dist.pipe, t);
+		if (ret < 0)
+			goto err_out;
+	}
+	n = btf__type_cnt(new_split);
+	/* Now update base/split BTF ids. */
+	for (i = 1; i < n; i++) {
+		t = btf_type_by_id(new_split, i);
+
+		ret = btf_type_visit_type_ids(t,  btf_update_distilled_type_ids, &dist);
+		if (ret < 0)
+			goto err_out;
+	}
+	free(dist.ids);
+	hashmap__free(dist.pipe.str_off_map);
+	*new_base_btf = new_base;
+	*new_split_btf = new_split;
+	return 0;
+err_out:
+	free(dist.ids);
+	hashmap__free(dist.pipe.str_off_map);
+	btf__free(new_split);
+	btf__free(new_base);
+	errno = -ret;
+	return ret;
+}
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index 47d3e00b25c7..025ed28b7fe8 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -107,6 +107,26 @@ LIBBPF_API struct btf *btf__new_empty(void);
  */
 LIBBPF_API struct btf *btf__new_empty_split(struct btf *base_btf);
 
+/**
+ * @brief **btf__distill_base()** creates new versions of the split BTF
+ * *src_btf* and its base BTF.  The new base BTF will only contain the types
+ * needed to improve robustness of the split BTF to small changes in base BTF.
+ * When that split BTF is loaded against a (possibly changed) base, this
+ * distilled base BTF will help update references to that (possibly changed)
+ * base BTF.
+ *
+ * Both the new split and its associated new base BTF must be freed by
+ * the caller.
+ *
+ * If successful, 0 is returned and **new_base_btf** and **new_split_btf**
+ * will point at new base/split BTF.  Both the new split and its associated
+ * new base BTF must be freed by the caller.
+ *
+ * A negative value is returned on error.
+ */
+LIBBPF_API int btf__distill_base(const struct btf *src_btf, struct btf **new_base_btf,
+				 struct btf **new_split_btf);
+
 LIBBPF_API struct btf *btf__parse(const char *path, struct btf_ext **btf_ext);
 LIBBPF_API struct btf *btf__parse_split(const char *path, struct btf *base_btf);
 LIBBPF_API struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext);
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index c1ce8aa3520b..c4d9bd7d3220 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -420,6 +420,7 @@ LIBBPF_1.4.0 {
 LIBBPF_1.5.0 {
 	global:
 		bpf_program__attach_sockmap;
+		btf__distill_base;
 		ring__consume_n;
 		ring_buffer__consume_n;
 } LIBBPF_1.4.0;
-- 
2.31.1



* [PATCH v2 bpf-next 03/13] selftests/bpf: test distilled base, split BTF generation
  2024-04-24 15:47 [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
  2024-04-24 15:47 ` [PATCH v2 bpf-next 01/13] libbpf: add support to btf__add_fwd() for ENUM64 Alan Maguire
  2024-04-24 15:47 ` [PATCH v2 bpf-next 02/13] libbpf: add btf__distill_base() creating split BTF with distilled base BTF Alan Maguire
@ 2024-04-24 15:47 ` Alan Maguire
  2024-04-30 23:50   ` Eduard Zingerman
  2024-04-30 23:55   ` Eduard Zingerman
  2024-04-24 15:47 ` [PATCH v2 bpf-next 04/13] libbpf: add btf__parse_opts() API for flexible BTF parsing Alan Maguire
                   ` (10 subsequent siblings)
  13 siblings, 2 replies; 43+ messages in thread
From: Alan Maguire @ 2024-04-24 15:47 UTC (permalink / raw)
  To: andrii, ast
  Cc: jolsa, acme, quentin, eddyz87, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan, Alan Maguire

Test generation of split+distilled base BTF, ensuring that

- base BTF STRUCTs which are embedded in split BTF structs are
  represented as 0-member sized structs, allowing size checking
- FWDs are used in place of full named struct/union declarations
- FWDs are used in place of full named enum declarations
- anonymous struct/unions are represented in full
- anonymous enums are represented in full
- types unreferenced from split BTF are not present in distilled
  base BTF

Also test that with vmlinux BTF and split BTF based upon it,
we only represent needed base types referenced from split BTF
in distilled base.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 .../selftests/bpf/prog_tests/btf_distill.c    | 253 ++++++++++++++++++
 1 file changed, 253 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_distill.c

diff --git a/tools/testing/selftests/bpf/prog_tests/btf_distill.c b/tools/testing/selftests/bpf/prog_tests/btf_distill.c
new file mode 100644
index 000000000000..aae9aef68bd6
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/btf_distill.c
@@ -0,0 +1,253 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024, Oracle and/or its affiliates. */
+
+#include <test_progs.h>
+#include <bpf/btf.h>
+#include "btf_helpers.h"
+
+/* Fabricate base, split BTF with references to base types needed; then create
+ * split BTF with distilled base BTF and ensure expectations are met:
+ *  - only referenced base types from split BTF are present
+ *  - struct/union/enum are represented as FWDs unless anonymous, when they
+ *    are represented in full, or if embedded in a split BTF struct, in which
+ *    case they are represented by a STRUCT with specified size and vlen=0.
+ */
+static void test_distilled_base(void)
+{
+	struct btf *btf1 = NULL, *btf2 = NULL, *btf3 = NULL, *btf4 = NULL;
+
+	btf1 = btf__new_empty();
+	if (!ASSERT_OK_PTR(btf1, "empty_main_btf"))
+		return;
+
+	btf__add_int(btf1, "int", 4, BTF_INT_SIGNED);	/* [1] int */
+	btf__add_ptr(btf1, 1);				/* [2] ptr to int */
+	btf__add_struct(btf1, "s1", 8);			/* [3] struct s1 { */
+	btf__add_field(btf1, "f1", 2, 0, 0);		/*      int *f1; */
+							/* } */
+	btf__add_struct(btf1, "", 12);			/* [4] struct { */
+	btf__add_field(btf1, "f1", 1, 0, 0);		/*	int f1; */
+	btf__add_field(btf1, "f2", 3, 32, 0);		/*	struct s1 f2; */
+							/* } */
+	btf__add_int(btf1, "unsigned int", 4, 0);	/* [5] unsigned int */
+	btf__add_union(btf1, "u1", 12);			/* [6] union u1 { */
+	btf__add_field(btf1, "f1", 1, 0, 0);		/*	int f1; */
+	btf__add_field(btf1, "f2", 2, 0, 0);		/*	int *f2; */
+							/* } */
+	btf__add_union(btf1, "", 4);			/* [7] union { */
+	btf__add_field(btf1, "f1", 1, 0, 0);		/*	int f1; */
+							/* } */
+	btf__add_enum(btf1, "e1", 4);			/* [8] enum e1 { */
+	btf__add_enum_value(btf1, "v1", 1);		/*	v1 = 1; */
+							/* } */
+	btf__add_enum(btf1, "", 4);			/* [9] enum { */
+	btf__add_enum_value(btf1, "av1", 2);		/*	av1 = 2; */
+							/* } */
+	btf__add_enum64(btf1, "e641", 8, true);		/* [10] enum64 e641 { */
+	btf__add_enum64_value(btf1, "v1", 1024);	/*	v1 = 1024; */
+							/* } */
+	btf__add_enum64(btf1, "", 8, true);		/* [11] enum64 { */
+	btf__add_enum64_value(btf1, "v1", 1025);	/*	v1 = 1025; */
+							/* } */
+	btf__add_struct(btf1, "unneeded", 4);		/* [12] struct unneeded { */
+	btf__add_field(btf1, "f1", 1, 0, 0);		/*	int f1; */
+							/* } */
+	btf__add_struct(btf1, "embedded", 4);		/* [13] struct embedded { */
+	btf__add_field(btf1, "f1", 1, 0, 0);		/*	int f1; */
+							/* } */
+	btf__add_func_proto(btf1, 1);			/* [14] int (*)(int p1); */
+	btf__add_func_param(btf1, "p1", 1);
+
+	btf__add_array(btf1, 1, 1, 3);			/* [15] int [3]; */
+
+	VALIDATE_RAW_BTF(
+		btf1,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] PTR '(anon)' type_id=1",
+		"[3] STRUCT 's1' size=8 vlen=1\n"
+		"\t'f1' type_id=2 bits_offset=0",
+		"[4] STRUCT '(anon)' size=12 vlen=2\n"
+		"\t'f1' type_id=1 bits_offset=0\n"
+		"\t'f2' type_id=3 bits_offset=32",
+		"[5] INT 'unsigned int' size=4 bits_offset=0 nr_bits=32 encoding=(none)",
+		"[6] UNION 'u1' size=12 vlen=2\n"
+		"\t'f1' type_id=1 bits_offset=0\n"
+		"\t'f2' type_id=2 bits_offset=0",
+		"[7] UNION '(anon)' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0",
+		"[8] ENUM 'e1' encoding=UNSIGNED size=4 vlen=1\n"
+		"\t'v1' val=1",
+		"[9] ENUM '(anon)' encoding=UNSIGNED size=4 vlen=1\n"
+		"\t'av1' val=2",
+		"[10] ENUM64 'e641' encoding=SIGNED size=8 vlen=1\n"
+		"\t'v1' val=1024",
+		"[11] ENUM64 '(anon)' encoding=SIGNED size=8 vlen=1\n"
+		"\t'v1' val=1025",
+		"[12] STRUCT 'unneeded' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0",
+		"[13] STRUCT 'embedded' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0",
+		"[14] FUNC_PROTO '(anon)' ret_type_id=1 vlen=1\n"
+		"\t'p1' type_id=1",
+		"[15] ARRAY '(anon)' type_id=1 index_type_id=1 nr_elems=3");
+
+	btf2 = btf__new_empty_split(btf1);
+	if (!ASSERT_OK_PTR(btf2, "empty_split_btf"))
+		goto cleanup;
+
+	btf__add_ptr(btf2, 3);				/* [16] ptr to struct s1 */
+	/* add ptr to struct anon */
+	btf__add_ptr(btf2, 4);				/* [17] ptr to struct (anon) */
+	btf__add_const(btf2, 6);			/* [18] const union u1 */
+	btf__add_restrict(btf2, 7);			/* [19] restrict union (anon) */
+	btf__add_volatile(btf2, 8);			/* [20] volatile enum e1 */
+	btf__add_typedef(btf2, "et", 9);		/* [21] typedef enum (anon) */
+	btf__add_const(btf2, 10);			/* [22] const enum64 e641 */
+	btf__add_ptr(btf2, 11);				/* [23] ptr to enum64 (anon) */
+	btf__add_struct(btf2, "with_embedded", 4);	/* [24] struct with_embedded { */
+	btf__add_field(btf2, "f1", 13, 0, 0);		/*	struct embedded f1; */
+							/* } */
+	btf__add_func(btf2, "fn", BTF_FUNC_STATIC, 14);	/* [25] int fn(int p1); */
+	btf__add_typedef(btf2, "arraytype", 15);	/* [26] typedef int[3] arraytype; */
+
+	VALIDATE_RAW_BTF(
+		btf2,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] PTR '(anon)' type_id=1",
+		"[3] STRUCT 's1' size=8 vlen=1\n"
+		"\t'f1' type_id=2 bits_offset=0",
+		"[4] STRUCT '(anon)' size=12 vlen=2\n"
+		"\t'f1' type_id=1 bits_offset=0\n"
+		"\t'f2' type_id=3 bits_offset=32",
+		"[5] INT 'unsigned int' size=4 bits_offset=0 nr_bits=32 encoding=(none)",
+		"[6] UNION 'u1' size=12 vlen=2\n"
+		"\t'f1' type_id=1 bits_offset=0\n"
+		"\t'f2' type_id=2 bits_offset=0",
+		"[7] UNION '(anon)' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0",
+		"[8] ENUM 'e1' encoding=UNSIGNED size=4 vlen=1\n"
+		"\t'v1' val=1",
+		"[9] ENUM '(anon)' encoding=UNSIGNED size=4 vlen=1\n"
+		"\t'av1' val=2",
+		"[10] ENUM64 'e641' encoding=SIGNED size=8 vlen=1\n"
+		"\t'v1' val=1024",
+		"[11] ENUM64 '(anon)' encoding=SIGNED size=8 vlen=1\n"
+		"\t'v1' val=1025",
+		"[12] STRUCT 'unneeded' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0",
+		"[13] STRUCT 'embedded' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0",
+		"[14] FUNC_PROTO '(anon)' ret_type_id=1 vlen=1\n"
+		"\t'p1' type_id=1",
+		"[15] ARRAY '(anon)' type_id=1 index_type_id=1 nr_elems=3",
+		"[16] PTR '(anon)' type_id=3",
+		"[17] PTR '(anon)' type_id=4",
+		"[18] CONST '(anon)' type_id=6",
+		"[19] RESTRICT '(anon)' type_id=7",
+		"[20] VOLATILE '(anon)' type_id=8",
+		"[21] TYPEDEF 'et' type_id=9",
+		"[22] CONST '(anon)' type_id=10",
+		"[23] PTR '(anon)' type_id=11",
+		"[24] STRUCT 'with_embedded' size=4 vlen=1\n"
+		"\t'f1' type_id=13 bits_offset=0",
+		"[25] FUNC 'fn' type_id=14 linkage=static",
+		"[26] TYPEDEF 'arraytype' type_id=15");
+
+	if (!ASSERT_EQ(0, btf__distill_base(btf2, &btf3, &btf4),
+		       "distilled_base") ||
+	    !ASSERT_OK_PTR(btf3, "distilled_base") ||
+	    !ASSERT_OK_PTR(btf4, "distilled_split"))
+		goto cleanup;
+
+	VALIDATE_RAW_BTF(
+		btf4,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] FWD 's1' fwd_kind=struct",
+		"[3] STRUCT '(anon)' size=12 vlen=2\n"
+		"\t'f1' type_id=1 bits_offset=0\n"
+		"\t'f2' type_id=2 bits_offset=32",
+		"[4] FWD 'u1' fwd_kind=union",
+		"[5] UNION '(anon)' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0",
+		"[6] ENUM 'e1' encoding=UNSIGNED size=4 vlen=0",
+		"[7] ENUM '(anon)' encoding=UNSIGNED size=4 vlen=1\n"
+		"\t'av1' val=2",
+		"[8] ENUM64 'e641' encoding=SIGNED size=8 vlen=0",
+		"[9] ENUM64 '(anon)' encoding=SIGNED size=8 vlen=1\n"
+		"\t'v1' val=1025",
+		"[10] STRUCT 'embedded' size=4 vlen=0",
+		"[11] FUNC_PROTO '(anon)' ret_type_id=1 vlen=1\n"
+		"\t'p1' type_id=1",
+		"[12] ARRAY '(anon)' type_id=1 index_type_id=1 nr_elems=3",
+		"[13] PTR '(anon)' type_id=2",
+		"[14] PTR '(anon)' type_id=3",
+		"[15] CONST '(anon)' type_id=4",
+		"[16] RESTRICT '(anon)' type_id=5",
+		"[17] VOLATILE '(anon)' type_id=6",
+		"[18] TYPEDEF 'et' type_id=7",
+		"[19] CONST '(anon)' type_id=8",
+		"[20] PTR '(anon)' type_id=9",
+		"[21] STRUCT 'with_embedded' size=4 vlen=1\n"
+		"\t'f1' type_id=10 bits_offset=0",
+		"[22] FUNC 'fn' type_id=11 linkage=static",
+		"[23] TYPEDEF 'arraytype' type_id=12");
+
+cleanup:
+	btf__free(btf4);
+	btf__free(btf3);
+	btf__free(btf2);
+	btf__free(btf1);
+}
+
+/* create distilled base + split BTF from vmlinux + split BTF with a few type
+ * references; ensure the resultant distilled BTF is as expected, containing
+ * only types needed to disambiguate references from split BTF.
+ */
+static void test_distilled_base_vmlinux(void)
+{
+	struct btf *split_btf = NULL, *vmlinux_btf = btf__load_vmlinux_btf();
+	struct btf *split_dist = NULL, *base_dist = NULL;
+	__s32 int_id, sk_buff_id;
+
+	if (!ASSERT_OK_PTR(vmlinux_btf, "load_vmlinux"))
+		return;
+	int_id = btf__find_by_name_kind(vmlinux_btf, "int", BTF_KIND_INT);
+	if (!ASSERT_GT(int_id, 0, "find_int"))
+		goto cleanup;
+	sk_buff_id = btf__find_by_name_kind(vmlinux_btf, "sk_buff", BTF_KIND_STRUCT);
+	if (!ASSERT_GT(sk_buff_id, 0, "find_sk_buff_id"))
+		goto cleanup;
+	split_btf = btf__new_empty_split(vmlinux_btf);
+	if (!ASSERT_OK_PTR(split_btf, "new_split"))
+		goto cleanup;
+	btf__add_typedef(split_btf, "myint", int_id);
+	btf__add_ptr(split_btf, sk_buff_id);
+
+	if (!ASSERT_EQ(btf__distill_base(split_btf, &base_dist, &split_dist), 0,
+		       "distill_vmlinux_base"))
+		goto cleanup;
+
+	if (!ASSERT_OK_PTR(split_dist, "split_distilled") ||
+	    !ASSERT_OK_PTR(base_dist, "base_dist"))
+		goto cleanup;
+	VALIDATE_RAW_BTF(
+		split_dist,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] FWD 'sk_buff' fwd_kind=struct",
+		"[3] TYPEDEF 'myint' type_id=1",
+		"[4] PTR '(anon)' type_id=2");
+
+cleanup:
+	btf__free(split_dist);
+	btf__free(base_dist);
+	btf__free(split_btf);
+	btf__free(vmlinux_btf);
+}
+
+void test_btf_distill(void)
+{
+	if (test__start_subtest("distilled_base"))
+		test_distilled_base();
+	if (test__start_subtest("distilled_base_vmlinux"))
+		test_distilled_base_vmlinux();
+}
-- 
2.31.1



* [PATCH v2 bpf-next 04/13] libbpf: add btf__parse_opts() API for flexible BTF parsing
  2024-04-24 15:47 [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
                   ` (2 preceding siblings ...)
  2024-04-24 15:47 ` [PATCH v2 bpf-next 03/13] selftests/bpf: test distilled base, split BTF generation Alan Maguire
@ 2024-04-24 15:47 ` Alan Maguire
  2024-04-29 23:40   ` Andrii Nakryiko
  2024-05-01  0:07   ` Eduard Zingerman
  2024-04-24 15:47 ` [PATCH v2 bpf-next 05/13] bpftool: support displaying raw split BTF using base BTF section as base Alan Maguire
                   ` (9 subsequent siblings)
  13 siblings, 2 replies; 43+ messages in thread
From: Alan Maguire @ 2024-04-24 15:47 UTC (permalink / raw)
  To: andrii, ast
  Cc: jolsa, acme, quentin, eddyz87, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan, Alan Maguire

Options cover existing parsing scenarios (ELF, raw, retrieving
.BTF.ext) and also allow specification of the ELF section name
containing BTF.  This will allow consumers to retrieve BTF from
.BTF.base sections (BTF_BASE_ELF_SEC) also.
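
The effect of the section name on parsing can be summarised with a small
self-contained model of the control flow in btf_parse() from the diff
below.  This is a sketch: struct parse_plan and plan_parse() are
hypothetical names used for illustration only, and the section macros are
redefined locally so the snippet stands alone.

```c
#include <stdbool.h>
#include <stddef.h>

#define BTF_ELF_SEC ".BTF"
#define BTF_BASE_ELF_SEC ".BTF.base"

/* Hypothetical summary of what the parser will attempt for a given path. */
struct parse_plan {
	bool try_raw_first;	/* attempt raw BTF before looking at ELF */
	const char *elf_sec;	/* ELF section consulted for BTF data */
};

static struct parse_plan plan_parse(const char *btf_sec)
{
	struct parse_plan p;

	/* With no section name, raw parsing is tried first; in the patch
	 * the ELF path is then used only if raw parsing fails with -EPROTO.
	 * An explicit section name skips the raw attempt entirely.
	 */
	p.try_raw_first = (btf_sec == NULL);
	p.elf_sec = btf_sec ? btf_sec : BTF_ELF_SEC;
	return p;
}
```

So plan_parse(NULL) describes the existing btf__parse() behaviour, while
plan_parse(BTF_BASE_ELF_SEC) describes reading distilled base BTF from a
module's .BTF.base section.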

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 tools/lib/bpf/btf.c      | 50 ++++++++++++++++++++++++++++------------
 tools/lib/bpf/btf.h      | 32 +++++++++++++++++++++++++
 tools/lib/bpf/libbpf.map |  1 +
 3 files changed, 68 insertions(+), 15 deletions(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 419cc4fa2e86..9036c1dc45d0 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -1084,7 +1084,7 @@ struct btf *btf__new_split(const void *data, __u32 size, struct btf *base_btf)
 	return libbpf_ptr(btf_new(data, size, base_btf));
 }
 
-static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
+static struct btf *btf_parse_elf(const char *path, const char *btf_sec, struct btf *base_btf,
 				 struct btf_ext **btf_ext)
 {
 	Elf_Data *btf_data = NULL, *btf_ext_data = NULL;
@@ -1146,7 +1146,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
 				idx, path);
 			goto done;
 		}
-		if (strcmp(name, BTF_ELF_SEC) == 0) {
+		if (strcmp(name, btf_sec) == 0) {
 			btf_data = elf_getdata(scn, 0);
 			if (!btf_data) {
 				pr_warn("failed to get section(%d, %s) data from %s\n",
@@ -1166,7 +1166,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
 	}
 
 	if (!btf_data) {
-		pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
+		pr_warn("failed to find '%s' ELF section in %s\n", btf_sec, path);
 		err = -ENODATA;
 		goto done;
 	}
@@ -1212,12 +1212,12 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
 
 struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext)
 {
-	return libbpf_ptr(btf_parse_elf(path, NULL, btf_ext));
+	return libbpf_ptr(btf_parse_elf(path, BTF_ELF_SEC, NULL, btf_ext));
 }
 
 struct btf *btf__parse_elf_split(const char *path, struct btf *base_btf)
 {
-	return libbpf_ptr(btf_parse_elf(path, base_btf, NULL));
+	return libbpf_ptr(btf_parse_elf(path, BTF_ELF_SEC, base_btf, NULL));
 }
 
 static struct btf *btf_parse_raw(const char *path, struct btf *base_btf)
@@ -1293,7 +1293,8 @@ struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf)
 	return libbpf_ptr(btf_parse_raw(path, base_btf));
 }
 
-static struct btf *btf_parse(const char *path, struct btf *base_btf, struct btf_ext **btf_ext)
+static struct btf *btf_parse(const char *path, const char *btf_elf_sec, struct btf *base_btf,
+			     struct btf_ext **btf_ext)
 {
 	struct btf *btf;
 	int err;
@@ -1301,23 +1302,42 @@ static struct btf *btf_parse(const char *path, struct btf *base_btf, struct btf_
 	if (btf_ext)
 		*btf_ext = NULL;
 
-	btf = btf_parse_raw(path, base_btf);
-	err = libbpf_get_error(btf);
-	if (!err)
-		return btf;
-	if (err != -EPROTO)
-		return ERR_PTR(err);
-	return btf_parse_elf(path, base_btf, btf_ext);
+	if (!btf_elf_sec) {
+		btf = btf_parse_raw(path, base_btf);
+		err = libbpf_get_error(btf);
+		if (!err)
+			return btf;
+		if (err != -EPROTO)
+			return ERR_PTR(err);
+	}
+	if (!btf_elf_sec)
+		btf_elf_sec = BTF_ELF_SEC;
+
+	return btf_parse_elf(path, btf_elf_sec, base_btf, btf_ext);
+}
+
+struct btf *btf__parse_opts(const char *path, struct btf_parse_opts *opts)
+{
+	struct btf *base_btf;
+	const char *btf_sec;
+	struct btf_ext **btf_ext;
+
+	if (!OPTS_VALID(opts, btf_parse_opts))
+		return libbpf_err_ptr(-EINVAL);
+	base_btf = OPTS_GET(opts, base_btf, NULL);
+	btf_sec = OPTS_GET(opts, btf_sec, NULL);
+	btf_ext = OPTS_GET(opts, btf_ext, NULL);
+	return libbpf_ptr(btf_parse(path, btf_sec, base_btf, btf_ext));
 }
 
 struct btf *btf__parse(const char *path, struct btf_ext **btf_ext)
 {
-	return libbpf_ptr(btf_parse(path, NULL, btf_ext));
+	return libbpf_ptr(btf_parse(path, NULL, NULL, btf_ext));
 }
 
 struct btf *btf__parse_split(const char *path, struct btf *base_btf)
 {
-	return libbpf_ptr(btf_parse(path, base_btf, NULL));
+	return libbpf_ptr(btf_parse(path, NULL, base_btf, NULL));
 }
 
 static void *btf_get_raw_data(const struct btf *btf, __u32 *size, bool swap_endian);
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index 025ed28b7fe8..94dfdfdef617 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -18,6 +18,7 @@ extern "C" {
 
 #define BTF_ELF_SEC ".BTF"
 #define BTF_EXT_ELF_SEC ".BTF.ext"
+#define BTF_BASE_ELF_SEC ".BTF.base"
 #define MAPS_ELF_SEC ".maps"
 
 struct btf;
@@ -134,6 +135,37 @@ LIBBPF_API struct btf *btf__parse_elf_split(const char *path, struct btf *base_b
 LIBBPF_API struct btf *btf__parse_raw(const char *path);
 LIBBPF_API struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf);
 
+struct btf_parse_opts {
+	size_t sz;
+	/* use base BTF to parse split BTF */
+	struct btf *base_btf;
+	/* retrieve optional .BTF.ext info */
+	struct btf_ext **btf_ext;
+	/* BTF section name */
+	const char *btf_sec;
+	size_t:0;
+};
+
+#define btf_parse_opts__last_field btf_sec
+
+/**
+ * @brief **btf__parse_opts()** parses BTF information from *path*.  If
+ * *btf_sec* is NULL, *path* is first tried as a raw BTF file, falling back
+ * to the default .BTF ELF section; otherwise BTF is read from the named ELF
+ * section.  .BTF.ext info is also retrieved if *btf_ext* is non-NULL.  If
+ * *base_btf* is specified, it is used to parse split BTF.
+ *
+ * @return new BTF object instance which has to be eventually freed with
+ * **btf__free()**
+ *
+ * On error, error-code-encoded-as-pointer is returned, not a NULL. To extract
+ * error code from such a pointer `libbpf_get_error()` should be used. If
+ * `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is
+ * returned on error instead. In both cases thread-local `errno` variable is
+ * always set to error code as well.
+ */
+LIBBPF_API struct btf *btf__parse_opts(const char *path, struct btf_parse_opts *opts);
+
 LIBBPF_API struct btf *btf__load_vmlinux_btf(void);
 LIBBPF_API struct btf *btf__load_module_btf(const char *module_name, struct btf *vmlinux_btf);
 
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index c4d9bd7d3220..a9151e31dfa9 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -421,6 +421,7 @@ LIBBPF_1.5.0 {
 	global:
 		bpf_program__attach_sockmap;
 		btf__distill_base;
+		btf__parse_opts;
 		ring__consume_n;
 		ring_buffer__consume_n;
 } LIBBPF_1.4.0;
-- 
2.31.1



* [PATCH v2 bpf-next 05/13] bpftool: support displaying raw split BTF using base BTF section as base
  2024-04-24 15:47 [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
                   ` (3 preceding siblings ...)
  2024-04-24 15:47 ` [PATCH v2 bpf-next 04/13] libbpf: add btf__parse_opts() API for flexible BTF parsing Alan Maguire
@ 2024-04-24 15:47 ` Alan Maguire
  2024-04-29 23:42   ` Andrii Nakryiko
  2024-04-24 15:47 ` [PATCH v2 bpf-next 06/13] kbuild,bpf: switch to using --btf_features for pahole v1.26 and later Alan Maguire
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: Alan Maguire @ 2024-04-24 15:47 UTC (permalink / raw)
  To: andrii, ast
  Cc: jolsa, acme, quentin, eddyz87, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan, Alan Maguire

If no base BTF can be found, fall back to checking for the .BTF.base
section and use it to display split BTF.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 tools/bpf/bpftool/btf.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
index 91fcb75babe3..2e8bd2c9f0a3 100644
--- a/tools/bpf/bpftool/btf.c
+++ b/tools/bpf/bpftool/btf.c
@@ -631,6 +631,15 @@ static int do_dump(int argc, char **argv)
 			base = get_vmlinux_btf_from_sysfs();
 
 		btf = btf__parse_split(*argv, base ?: base_btf);
+		/* Finally check for presence of base BTF section */
+		if (!btf && !base && !base_btf) {
+			LIBBPF_OPTS(btf_parse_opts, optp);
+
+			optp.btf_sec = BTF_BASE_ELF_SEC;
+			base_btf = btf__parse_opts(*argv, &optp);
+			if (base_btf)
+				btf = btf__parse_split(*argv, base_btf);
+		}
 		if (!btf) {
 			err = -errno;
 			p_err("failed to load BTF from %s: %s",
-- 
2.31.1



* [PATCH v2 bpf-next 06/13] kbuild,bpf: switch to using --btf_features for pahole v1.26 and later
  2024-04-24 15:47 [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
                   ` (4 preceding siblings ...)
  2024-04-24 15:47 ` [PATCH v2 bpf-next 05/13] bpftool: support displaying raw split BTF using base BTF section as base Alan Maguire
@ 2024-04-24 15:47 ` Alan Maguire
  2024-04-29 23:43   ` Andrii Nakryiko
  2024-04-24 15:48 ` [PATCH v2 bpf-next 07/13] resolve_btfids: use .BTF.base ELF section as base BTF if -B option is used Alan Maguire
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: Alan Maguire @ 2024-04-24 15:47 UTC (permalink / raw)
  To: andrii, ast
  Cc: jolsa, acme, quentin, eddyz87, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan, Alan Maguire

The --btf_features list can be used with pahole v1.26 and later.
It is useful because pahole does not exit with a failure if a
listed feature is not yet implemented; the feature is simply not
applied.  This will allow us to add feature requests to the pahole
options in future without having to check pahole versions; if the
version of pahole in use supports the feature, it will be applied.
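
The tolerant behaviour described above can be sketched as a parser that
sets a bit for each feature name it recognises and silently skips the
rest.  This is an illustrative model only, not pahole's implementation;
parse_btf_features() and the known_features table are hypothetical, with
the feature names taken from the Makefile.btf change below.

```c
#include <stddef.h>
#include <string.h>

/* Feature names as they appear in the --btf_features list below. */
static const char *const known_features[] = {
	"encode_force", "var", "float", "enum64",
	"decl_tag", "type_tag", "optimized_func", "consistent_func",
};

#define NR_KNOWN (sizeof(known_features) / sizeof(known_features[0]))

/* Return a bitmask with bit i set if known_features[i] appears in the
 * comma-separated list.  Unrecognised names are skipped, not errors.
 */
static unsigned int parse_btf_features(const char *list)
{
	unsigned int mask = 0;

	while (*list) {
		size_t len = strcspn(list, ",");
		size_t i;

		for (i = 0; i < NR_KNOWN; i++) {
			if (strlen(known_features[i]) == len &&
			    strncmp(known_features[i], list, len) == 0)
				mask |= 1u << i;
		}
		list += len;
		if (*list == ',')
			list++;
	}
	return mask;
}
```

With this model, a list such as "var,distilled_base" yields only the bit
for "var" when distilled_base is unknown to the tool, which is why the
kernel build can request newer features unconditionally.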

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 scripts/Makefile.btf | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
index 82377e470aed..8e6a9d4b492e 100644
--- a/scripts/Makefile.btf
+++ b/scripts/Makefile.btf
@@ -12,8 +12,11 @@ pahole-flags-$(call test-ge, $(pahole-ver), 121)	+= --btf_gen_floats
 
 pahole-flags-$(call test-ge, $(pahole-ver), 122)	+= -j
 
-pahole-flags-$(CONFIG_PAHOLE_HAS_LANG_EXCLUDE)		+= --lang_exclude=rust
-
 pahole-flags-$(call test-ge, $(pahole-ver), 125)	+= --skip_encoding_btf_inconsistent_proto --btf_gen_optimized
 
+# Switch to using --btf_features for v1.26 and later.
+pahole-flags-$(call test-ge, $(pahole-ver), 126)	= -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func
+
+pahole-flags-$(CONFIG_PAHOLE_HAS_LANG_EXCLUDE)		+= --lang_exclude=rust
+
 export PAHOLE_FLAGS := $(pahole-flags-y)
-- 
2.31.1



* [PATCH v2 bpf-next 07/13] resolve_btfids: use .BTF.base ELF section as base BTF if -B option is used
  2024-04-24 15:47 [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
                   ` (5 preceding siblings ...)
  2024-04-24 15:47 ` [PATCH v2 bpf-next 06/13] kbuild,bpf: switch to using --btf_features for pahole v1.26 and later Alan Maguire
@ 2024-04-24 15:48 ` Alan Maguire
  2024-04-29 23:45   ` Andrii Nakryiko
  2024-05-01 20:39   ` Eduard Zingerman
  2024-04-24 15:48 ` [PATCH v2 bpf-next 08/13] kbuild, bpf: add module-specific pahole/resolve_btfids flags for distilled base BTF Alan Maguire
                   ` (6 subsequent siblings)
  13 siblings, 2 replies; 43+ messages in thread
From: Alan Maguire @ 2024-04-24 15:48 UTC (permalink / raw)
  To: andrii, ast
  Cc: jolsa, acme, quentin, eddyz87, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan, Alan Maguire

When resolving BTF ids, use the BTF in the module .BTF.base section
when passed the -B option.  Both references to base BTF from split
BTF and BTF ids will be relocated for base vmlinux on module load.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 tools/bpf/resolve_btfids/main.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/tools/bpf/resolve_btfids/main.c b/tools/bpf/resolve_btfids/main.c
index d9520cb826b3..c5b622a31f18 100644
--- a/tools/bpf/resolve_btfids/main.c
+++ b/tools/bpf/resolve_btfids/main.c
@@ -115,6 +115,7 @@ struct object {
 	const char *path;
 	const char *btf;
 	const char *base_btf_path;
+	int base;
 
 	struct {
 		int		 fd;
@@ -532,11 +533,26 @@ static int symbols_resolve(struct object *obj)
 	__u32 nr_types;
 
 	if (obj->base_btf_path) {
-		base_btf = btf__parse(obj->base_btf_path, NULL);
+		LIBBPF_OPTS(btf_parse_opts, optp);
+		const char *path;
+
+		if (obj->base) {
+			optp.btf_sec = BTF_BASE_ELF_SEC;
+			path = obj->path;
+			base_btf = btf__parse_opts(path, &optp);
+			/* fall back to normal base parsing if no BTF_BASE_ELF_SEC */
+			if (libbpf_get_error(base_btf))
+				base_btf = NULL;
+		}
+		if (!base_btf) {
+			optp.btf_sec = BTF_ELF_SEC;
+			path = obj->base_btf_path;
+			base_btf = btf__parse_opts(path, &optp);
+		}
 		err = libbpf_get_error(base_btf);
 		if (err) {
 			pr_err("FAILED: load base BTF from %s: %s\n",
-			       obj->base_btf_path, strerror(-err));
+			       path, strerror(-err));
 			return -1;
 		}
 	}
@@ -781,6 +797,8 @@ int main(int argc, const char **argv)
 			   "BTF data"),
 		OPT_STRING('b', "btf_base", &obj.base_btf_path, "file",
 			   "path of file providing base BTF"),
+		OPT_INCR('B', "base", &obj.base,
+			 "use " BTF_BASE_ELF_SEC " ELF section BTF as base"),
 		OPT_END()
 	};
 	int err = -1;
-- 
2.31.1



* [PATCH v2 bpf-next 08/13] kbuild, bpf: add module-specific pahole/resolve_btfids flags for distilled base BTF
  2024-04-24 15:47 [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
                   ` (6 preceding siblings ...)
  2024-04-24 15:48 ` [PATCH v2 bpf-next 07/13] resolve_btfids: use .BTF.base ELF section as base BTF if -B option is used Alan Maguire
@ 2024-04-24 15:48 ` Alan Maguire
  2024-04-24 15:48 ` [PATCH v2 bpf-next 09/13] libbpf: split BTF relocation Alan Maguire
                   ` (5 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Alan Maguire @ 2024-04-24 15:48 UTC (permalink / raw)
  To: andrii, ast
  Cc: jolsa, acme, quentin, eddyz87, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan, Alan Maguire

Support creation of module BTF along with distilled base BTF;
the latter is stored in a .BTF.base ELF section and supplements
split BTF references to base BTF with information about base types,
allowing for later relocation of split BTF with a (possibly
changed) base.  resolve_btfids uses the "-B" option to specify
that the .BTF_ids section should be populated with BTF ids
relative to the added .BTF.base section rather than relative
to the vmlinux base.

Modules will be built with a distilled .BTF.base section for external
module builds, i.e.

make -C . M=path2/module

...while in-tree module build as part of a normal kernel build will
not generate distilled base BTF; this is because in-tree modules
change with the kernel and do not require BTF relocation for the
running vmlinux.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 scripts/Makefile.btf      | 7 +++++++
 scripts/Makefile.modfinal | 4 ++--
 2 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
index 8e6a9d4b492e..8a3f45813c1e 100644
--- a/scripts/Makefile.btf
+++ b/scripts/Makefile.btf
@@ -19,4 +19,11 @@ pahole-flags-$(call test-ge, $(pahole-ver), 126)	= -j --btf_features=encode_forc
 
 pahole-flags-$(CONFIG_PAHOLE_HAS_LANG_EXCLUDE)		+= --lang_exclude=rust
 
+ifneq ($(KBUILD_EXTMOD),)
+module-pahole-flags-$(call test-ge, $(pahole-ver), 126)	+= --btf_features=distilled_base
+module-resolve-btfids-flags-$(call test-ge, $(pahole-ver), 126) = -B
+endif
+
 export PAHOLE_FLAGS := $(pahole-flags-y)
+export MODULE_PAHOLE_FLAGS := $(module-pahole-flags-y)
+export MODULE_RESOLVE_BTFIDS_FLAGS := $(module-resolve-btfids-flags-y)
diff --git a/scripts/Makefile.modfinal b/scripts/Makefile.modfinal
index 8568d256d6fb..22f5bb0a60a6 100644
--- a/scripts/Makefile.modfinal
+++ b/scripts/Makefile.modfinal
@@ -39,8 +39,8 @@ quiet_cmd_btf_ko = BTF [M] $@
 	if [ ! -f vmlinux ]; then					\
 		printf "Skipping BTF generation for %s due to unavailability of vmlinux\n" $@ 1>&2; \
 	else								\
-		LLVM_OBJCOPY="$(OBJCOPY)" $(PAHOLE) -J $(PAHOLE_FLAGS) --btf_base vmlinux $@; \
-		$(RESOLVE_BTFIDS) -b vmlinux $@; 			\
+		LLVM_OBJCOPY="$(OBJCOPY)" $(PAHOLE) -J $(PAHOLE_FLAGS) $(MODULE_PAHOLE_FLAGS) --btf_base vmlinux $@; \
+		$(RESOLVE_BTFIDS) $(MODULE_RESOLVE_BTFIDS_FLAGS) -b vmlinux $@; 			\
 	fi;
 
 # Same as newer-prereqs, but allows to exclude specified extra dependencies
-- 
2.31.1



* [PATCH v2 bpf-next 09/13] libbpf: split BTF relocation
  2024-04-24 15:47 [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
                   ` (7 preceding siblings ...)
  2024-04-24 15:48 ` [PATCH v2 bpf-next 08/13] kbuild, bpf: add module-specific pahole/resolve_btfids flags for distilled base BTF Alan Maguire
@ 2024-04-24 15:48 ` Alan Maguire
  2024-04-30  0:14   ` Andrii Nakryiko
  2024-04-24 15:48 ` [PATCH v2 bpf-next 10/13] module, bpf: store BTF base pointer in struct module Alan Maguire
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 43+ messages in thread
From: Alan Maguire @ 2024-04-24 15:48 UTC (permalink / raw)
  To: andrii, ast
  Cc: jolsa, acme, quentin, eddyz87, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan, Alan Maguire

Map distilled base BTF type ids referenced in split BTF and their
references to the base BTF passed in, and if the mapping succeeds,
reparent the split BTF to the base BTF.

Relocation rules are

- base types must match exactly
- enum[64] types should match all value name/value pairs, but the
  to-be-relocated enum[64] can also define additional name/value pairs
- an enum64 can match an enum and vice versa provided the values match
  as described above
- named fwds match to the correspondingly-named struct/union/enum/enum64
- structs with no members match to the correspondingly-named struct/union
  provided their sizes match
- anon struct/unions must have the field names/offsets specified in
  distilled base BTF matched by those in the base BTF we are matching with
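
A minimal sketch of the kind-compatibility part of these rules follows.
It is a simplified model, not the btf_relocate.c code: struct type_desc
and kinds_compatible() are hypothetical, names are assumed to have been
matched already, enum value comparison and anonymous field comparison are
left out, and a fwd is only matched against struct/union here.

```c
#include <stdbool.h>

/* Hypothetical, simplified kinds; not the BTF_KIND_* constants. */
enum type_kind {
	KIND_INT, KIND_FWD, KIND_STRUCT, KIND_UNION, KIND_ENUM, KIND_ENUM64,
};

struct type_desc {
	enum type_kind kind;
	bool fwd_is_union;	/* meaningful for KIND_FWD only */
	unsigned int size;
	unsigned int vlen;	/* number of members/values */
};

static bool is_enum_kind(enum type_kind k)
{
	return k == KIND_ENUM || k == KIND_ENUM64;
}

static bool is_composite(enum type_kind k)
{
	return k == KIND_STRUCT || k == KIND_UNION;
}

/* Can distilled base type *dist* be matched to base type *base*? */
static bool kinds_compatible(const struct type_desc *dist,
			     const struct type_desc *base)
{
	switch (dist->kind) {
	case KIND_ENUM:
	case KIND_ENUM64:
		/* an enum64 can match an enum and vice versa */
		return is_enum_kind(base->kind);
	case KIND_FWD:
		/* a struct fwd matches a struct, a union fwd a union */
		return base->kind ==
		       (dist->fwd_is_union ? KIND_UNION : KIND_STRUCT);
	case KIND_STRUCT:
	case KIND_UNION:
		/* zero-member placeholder: struct/union of the same size */
		if (dist->vlen == 0)
			return is_composite(base->kind) &&
			       dist->size == base->size;
		return base->kind == dist->kind;
	default:
		/* base types must match exactly */
		return base->kind == dist->kind;
	}
}
```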

Relocation cannot recurse, since it will be used in-kernel also and
we do not want to blow up the kernel stack when carrying out type
compatibility checks.  Hence we use a stack for reference type
relocation rather than recursive function calls.  The approach however
is the same; we use a depth-first search to match the referents
associated with reference types, and work back from there to match
the reference type itself.
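
The explicit-stack approach can be illustrated with a toy model.  This is
a sketch under stated assumptions, not the code in btf_relocate.c: each
type is reduced to a single referent id, and the "result" computed for a
type is just the length of its reference chain, standing in for the real
work of matching a reference type once its referent has been matched.
resolve_all(), UNPROCESSED and NO_REF are hypothetical names.

```c
#define UNPROCESSED ((unsigned int)-1)
#define NO_REF ((unsigned int)-1)

/* refs[id] is the id that type id refers to, or NO_REF for a leaf type.
 * On success result[id] holds the length of the reference chain below id.
 * stack must have room for n entries.  Returns 0, or -1 if a reference is
 * out of range or the references form a cycle.
 */
static int resolve_all(const unsigned int *refs, unsigned int n,
		       unsigned int *result, unsigned int *stack)
{
	unsigned int i, top, id;

	for (i = 0; i < n; i++)
		result[i] = UNPROCESSED;

	for (i = 0; i < n; i++) {
		top = 0;
		id = i;
		/* descend: push reference types until we reach a leaf or
		 * a type that has already been resolved
		 */
		while (result[id] == UNPROCESSED && refs[id] != NO_REF) {
			if (top == n || refs[id] >= n)
				return -1;
			stack[top++] = id;
			id = refs[id];
		}
		if (result[id] == UNPROCESSED)
			result[id] = 0;	/* leaf type */
		/* unwind: resolve each pushed type from its referent */
		while (top > 0) {
			unsigned int parent = stack[--top];

			result[parent] = result[id] + 1;
			id = parent;
		}
	}
	return 0;
}
```

The stack depth is bounded by the number of types rather than by the call
stack, which is the property that matters for in-kernel use.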

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 tools/lib/bpf/Build             |   2 +-
 tools/lib/bpf/btf.c             |  58 +++
 tools/lib/bpf/btf.h             |   8 +
 tools/lib/bpf/btf_relocate.c    | 601 ++++++++++++++++++++++++++++++++
 tools/lib/bpf/libbpf.map        |   1 +
 tools/lib/bpf/libbpf_internal.h |   2 +
 6 files changed, 671 insertions(+), 1 deletion(-)
 create mode 100644 tools/lib/bpf/btf_relocate.c

diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build
index b6619199a706..336da6844d42 100644
--- a/tools/lib/bpf/Build
+++ b/tools/lib/bpf/Build
@@ -1,4 +1,4 @@
 libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \
 	    netlink.o bpf_prog_linfo.o libbpf_probes.o hashmap.o \
 	    btf_dump.o ringbuf.o strset.o linker.o gen_loader.o relo_core.o \
-	    usdt.o zip.o elf.o features.o
+	    usdt.o zip.o elf.o features.o btf_relocate.o
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 9036c1dc45d0..f00a84fea9b5 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -5541,3 +5541,61 @@ int btf__distill_base(const struct btf *src_btf, struct btf **new_base_btf,
 	errno = -ret;
 	return ret;
 }
+
+struct btf_rewrite_strs {
+	struct btf *btf;
+	const struct btf *old_base_btf;
+	int str_start;
+	int str_diff;
+};
+
+static int btf_rewrite_strs(__u32 *str_off, void *ctx)
+{
+	struct btf_rewrite_strs *r = ctx;
+	const char *s;
+	int off;
+
+	if (!*str_off)
+		return 0;
+	if (*str_off >= r->str_start) {
+		*str_off += r->str_diff;
+	} else {
+		s = btf__str_by_offset(r->old_base_btf, *str_off);
+		if (!s)
+			return -ENOENT;
+		off = btf__add_str(r->btf, s);
+		if (off < 0)
+			return off;
+		*str_off = off;
+	}
+	return 0;
+}
+
+int btf_set_base_btf(struct btf *btf, struct btf *base_btf)
+{
+	struct btf_rewrite_strs r = {};
+	struct btf_type *t;
+	int i, err = 0;
+
+	r.old_base_btf = btf__base_btf(btf);
+	if (!r.old_base_btf)
+		return -EINVAL;
+	r.btf = btf;
+	r.str_start = r.old_base_btf->hdr->str_len;
+	r.str_diff = base_btf->hdr->str_len - r.old_base_btf->hdr->str_len;
+	btf->base_btf = base_btf;
+	btf->start_id = btf__type_cnt(base_btf);
+	btf->start_str_off = base_btf->hdr->str_len;
+	for (i = 0; i < btf->nr_types; i++) {
+		t = (struct btf_type *)btf__type_by_id(btf, i + btf->start_id);
+		err = btf_type_visit_str_offs(t, btf_rewrite_strs, &r);
+		if (err)
+			break;
+	}
+	return err;
+}
+
+int btf__relocate(struct btf *btf, const struct btf *base_btf)
+{
+	return btf_relocate(btf, base_btf, NULL);
+}
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index 94dfdfdef617..00e885998ba1 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -284,6 +284,14 @@ struct btf_dedup_opts {
 
 LIBBPF_API int btf__dedup(struct btf *btf, const struct btf_dedup_opts *opts);
 
+/**
+ * @brief **btf__relocate()** will check the split BTF *btf* for references
+ * to base BTF kinds, and verify those references are compatible with
+ * *base_btf*; if they are, *btf* is adjusted such that it is re-parented to
+ * *base_btf* and type ids and strings are adjusted to accommodate this.
+ */
+LIBBPF_API int btf__relocate(struct btf *btf, const struct btf *base_btf);
+
 struct btf_dump;
 
 struct btf_dump_opts {
diff --git a/tools/lib/bpf/btf_relocate.c b/tools/lib/bpf/btf_relocate.c
new file mode 100644
index 000000000000..d9340375f4a3
--- /dev/null
+++ b/tools/lib/bpf/btf_relocate.c
@@ -0,0 +1,601 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2024, Oracle and/or its affiliates. */
+
+#include "btf.h"
+#include "bpf.h"
+#include "libbpf.h"
+#include "libbpf_internal.h"
+
+struct btf;
+
+#define BTF_MAX_NR_TYPES 0x7fffffffU
+#define BTF_UNPROCESSED_ID ((__u32)-1)
+
+struct btf_relocate {
+	struct btf *btf;
+	const struct btf *base_btf;
+	const struct btf *dist_base_btf;
+	unsigned int nr_base_types;
+	__u32 *map;
+	__u32 *stack;
+	unsigned int stack_size;
+	unsigned int stack_limit;
+};
+
+/* Find next type after *id in base BTF that matches kind of type t passed in
+ * and name (if it is specified).  Match fwd kinds to appropriate kind also.
+ */
+static int btf_relocate_find_next(struct btf_relocate *r, const struct btf_type *t,
+				   __u32 *id, const struct btf_type **tp)
+{
+	const struct btf_type *nt;
+	int kind, tkind = btf_kind(t);
+	int tkflag = btf_kflag(t);
+	__u32 i;
+
+	for (i = *id + 1; i < r->nr_base_types; i++) {
+		nt = btf__type_by_id(r->base_btf, i);
+		kind = btf_kind(nt);
+		/* enum[64] can match either enum or enum64;
+		 * a fwd can match a struct/union of the appropriate
+		 * type; otherwise kinds must match.
+		 */
+		switch (tkind) {
+		case BTF_KIND_ENUM:
+		case BTF_KIND_ENUM64:
+			switch (kind) {
+			case BTF_KIND_ENUM64:
+			case BTF_KIND_ENUM:
+				break;
+			default:
+				continue;
+			}
+			break;
+		case BTF_KIND_FWD:
+			switch (kind) {
+			case BTF_KIND_FWD:
+				continue;
+			case BTF_KIND_STRUCT:
+				if (tkflag)
+					continue;
+				break;
+			case BTF_KIND_UNION:
+				if (!tkflag)
+					continue;
+				break;
+			default:
+				break;
+			}
+			break;
+		default:
+			if (kind != tkind)
+				continue;
+			break;
+		}
+		/* either names must match or both be anon. */
+		if (t->name_off && nt->name_off) {
+			if (strcmp(btf__name_by_offset(r->btf, t->name_off),
+				   btf__name_by_offset(r->base_btf, nt->name_off)))
+				continue;
+		} else if (t->name_off != nt->name_off) {
+			continue;
+		}
+		*tp = nt;
+		*id = i;
+		return 0;
+	}
+	return -ENOENT;
+}
+
+static int btf_relocate_int(struct btf_relocate *r, const char *name,
+			     const struct btf_type *t, const struct btf_type *bt)
+{
+	__u8 encoding, bencoding, bits, bbits;
+
+	if (t->size != bt->size) {
+		pr_warn("INT types '%s' disagree on size; distilled base BTF says %d; base BTF says %d\n",
+			name, t->size, bt->size);
+		return -EINVAL;
+	}
+	encoding = btf_int_encoding(t);
+	bencoding = btf_int_encoding(bt);
+	if (encoding != bencoding) {
+		pr_warn("INT types '%s' disagree on encoding; distilled base BTF says '(%s/%s/%s)'; base BTF says '(%s/%s/%s)'\n",
+			name,
+			encoding & BTF_INT_SIGNED ? "signed" : "unsigned",
+			encoding & BTF_INT_CHAR ? "char" : "nonchar",
+			encoding & BTF_INT_BOOL ? "bool" : "nonbool",
+			bencoding & BTF_INT_SIGNED ? "signed" : "unsigned",
+			bencoding & BTF_INT_CHAR ? "char" : "nonchar",
+			bencoding & BTF_INT_BOOL ? "bool" : "nonbool");
+		return -EINVAL;
+	}
+	bits = btf_int_bits(t);
+	bbits = btf_int_bits(bt);
+	if (bits != bbits) {
+		pr_warn("INT types '%s' disagree on bit size; distilled base BTF says %d; base BTF says %d\n",
+			name, bits, bbits);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+static int btf_relocate_float(struct btf_relocate *r, const char *name,
+			       const struct btf_type *t, const struct btf_type *bt)
+{
+
+	if (t->size != bt->size) {
+		pr_warn("float types '%s' disagree on size; distilled base BTF says %d; base BTF says %d\n",
+			name, t->size, bt->size);
+		return -EINVAL;
+	}
+	return 0;
+}
+
+/* ensure each enum[64] value in type t has equivalent in base BTF and that
+ * values match; we must support matching enum64 to enum and vice versa
+ * as well as enum to enum and enum64 to enum64.
+ */
+static int btf_relocate_enum(struct btf_relocate *r, const char *name,
+			      const struct btf_type *t, const struct btf_type *bt)
+{
+	struct btf_enum *v = btf_enum(t);
+	struct btf_enum *bv = btf_enum(bt);
+	struct btf_enum64 *v64 = btf_enum64(t);
+	struct btf_enum64 *bv64 = btf_enum64(bt);
+	bool found, match, bisenum, isenum;
+	const char *vname, *bvname;
+	__u32 name_off, bname_off;
+	__u64 val = 0, bval = 0;
+	int i, j;
+
+	isenum = btf_kind(t) == BTF_KIND_ENUM;
+	for (i = 0; i < btf_vlen(t); i++, v++, v64++) {
+		found = match = false;
+
+		if (isenum) {
+			name_off = v->name_off;
+			val = v->val;
+		} else {
+			name_off = v64->name_off;
+			val = btf_enum64_value(v64);
+		}
+		if (!name_off)
+			continue;
+		vname = btf__name_by_offset(r->dist_base_btf, name_off);
+
+		bisenum = btf_kind(bt) == BTF_KIND_ENUM;
+		for (j = 0; j < btf_vlen(bt); j++, bv++, bv64++) {
+			if (bisenum) {
+				bname_off = bv->name_off;
+				bval = bv->val;
+			} else {
+				bname_off = bv64->name_off;
+				bval = btf_enum64_value(bv64);
+			}
+			if (!bname_off)
+				continue;
+			bvname = btf__name_by_offset(r->base_btf, bname_off);
+			if (strcmp(vname, bvname) != 0)
+				continue;
+			found = true;
+			match = val == bval;
+			break;
+		}
+		if (!found) {
+			if (t->name_off)
+				pr_warn("ENUM[64] types '%s' disagree; distilled base BTF has enum[64] value '%s' (%lld), base BTF does not have that value.\n",
+					name, vname, val);
+			return -EINVAL;
+		}
+		if (!match) {
+			if (t->name_off)
+				pr_warn("ENUM[64] types '%s' disagree on enum value '%s'; distilled base BTF specifies value %lld; base BTF specifies value %lld\n",
+					name, vname, val, bval);
+			return -EINVAL;
+		}
+	}
+	return 0;
+}
+
+/* relocate base types (int, float, enum, enum64 and fwd) */
+static int btf_relocate_base_type(struct btf_relocate *r, __u32 id)
+{
+	const struct btf_type *t = btf_type_by_id(r->dist_base_btf, id);
+	const char *name = btf__name_by_offset(r->dist_base_btf, t->name_off);
+	const struct btf_type *bt = NULL;
+	__u32 base_id = 0;
+	int err = 0;
+
+	switch (btf_kind(t)) {
+	case BTF_KIND_INT:
+	case BTF_KIND_ENUM:
+	case BTF_KIND_FLOAT:
+	case BTF_KIND_ENUM64:
+	case BTF_KIND_FWD:
+		break;
+	default:
+		return 0;
+	}
+
+	if (r->map[id] <= BTF_MAX_NR_TYPES)
+		return 0;
+
+	while ((err = btf_relocate_find_next(r, t, &base_id, &bt)) != -ENOENT) {
+		bt = btf_type_by_id(r->base_btf, base_id);
+		switch (btf_kind(t)) {
+		case BTF_KIND_INT:
+			err = btf_relocate_int(r, name, t, bt);
+			break;
+		case BTF_KIND_ENUM:
+		case BTF_KIND_ENUM64:
+			err = btf_relocate_enum(r, name, t, bt);
+			break;
+		case BTF_KIND_FLOAT:
+			err = btf_relocate_float(r, name, t, bt);
+			break;
+		case BTF_KIND_FWD:
+			err = 0;
+			break;
+		default:
+			return 0;
+		}
+		if (!err) {
+			r->map[id] = base_id;
+			return 0;
+		}
+	}
+	return err;
+}
+
+/* all distilled base BTF members must be in base BTF equivalent. */
+static int btf_relocate_check_member(struct btf_relocate *r, const char *name,
+				      struct btf_member *m, const struct btf_type *bt,
+				      bool verbose)
+{
+	struct btf_member *bm = (struct btf_member *)(bt + 1);
+	const char *kindstr = btf_kind(bt) == BTF_KIND_STRUCT ? "STRUCT" : "UNION";
+	const char *mname, *bmname;
+	int i, bvlen = btf_vlen(bt);
+
+	mname = btf__name_by_offset(r->dist_base_btf, m->name_off);
+	for (i = 0; i < bvlen; i++, bm++) {
+		bmname = btf__name_by_offset(r->base_btf, bm->name_off);
+
+		if (!m->name_off || !bm->name_off) {
+			if (m->name_off != bm->name_off)
+				continue;
+			if (bm->offset != m->offset)
+				continue;
+		} else {
+			if (strcmp(mname, bmname) != 0)
+				continue;
+			if (bm->offset != m->offset) {
+				if (verbose) {
+					pr_warn("%s '%s' member '%s' disagrees about offset; %d in distilled base BTF versus %d in base BTF\n",
+						kindstr, name, mname, m->offset, bm->offset);
+					return -EINVAL;
+				}
+			}
+		}
+		return 0;
+	}
+	if (verbose)
+		pr_warn("%s '%s' missing member '%s' found in distilled base BTF\n",
+			kindstr, name, mname);
+	return -EINVAL;
+}
+
+static int btf_relocate_struct_type(struct btf_relocate *r, __u32 id)
+{
+	const struct btf_type *t = btf_type_by_id(r->dist_base_btf, id);
+	const char *name = btf__name_by_offset(r->dist_base_btf, t->name_off);
+	const struct btf_type *bt = NULL;
+	struct btf_member *m;
+	const char *kindstr;
+	int i, vlen, err = 0;
+	__u32 base_id = 0;
+
+	switch (btf_kind(t)) {
+	case BTF_KIND_STRUCT:
+		kindstr = "STRUCT";
+		break;
+	case BTF_KIND_UNION:
+		kindstr = "UNION";
+		break;
+	default:
+		return 0;
+	}
+
+	if (r->map[id] <= BTF_MAX_NR_TYPES)
+		return 0;
+
+	vlen = btf_vlen(t);
+
+	while ((err = btf_relocate_find_next(r, t, &base_id, &bt)) != -ENOENT) {
+		/* vlen 0 named types (signalling type is embedded in
+		 * a split BTF struct/union) must match size exactly
+		 */
+		if (t->name_off && vlen == 0) {
+			if (bt->size != t->size) {
+				pr_warn("%s '%s' disagrees about size; is size (%d) in distilled base BTF; in base BTF it is size (%d)\n",
+					kindstr, name, t->size, bt->size);
+				return -EINVAL;
+			}
+		}
+		/* otherwise must be at least as big */
+		if (bt->size < t->size) {
+			if (t->name_off) {
+				pr_warn("%s '%s' disagrees about size with distilled base BTF (%d); base BTF is smaller (%d)\n",
+					kindstr, name, t->size, bt->size);
+				return -EINVAL;
+			}
+			continue;
+		}
+		/* must have at least as many elements */
+		if (btf_vlen(bt) < vlen) {
+			if (t->name_off) {
+				pr_warn("%s '%s' disagrees about number of members with distilled base BTF (%d); base BTF has fewer (%d)\n",
+					kindstr, name, vlen, btf_vlen(bt));
+				return -EINVAL;
+			}
+			continue;
+		}
+		m = (struct btf_member *)(t + 1);
+		for (i = 0; i < vlen; i++, m++) {
+			if (btf_relocate_check_member(r, name, m, bt, t->name_off != 0)) {
+				if (t->name_off)
+					return -EINVAL;
+				err = -EINVAL;
+				break;
+			}
+		}
+		if (!err) {
+			r->map[id] = base_id;
+			return 0;
+		}
+	}
+	return err;
+}
+
+/* Use a stack rather than recursion to manage dependent reference types.
+ * When a reference type with dependents is encountered, the approach we
+ * take depends on whether the dependents have been resolved to base
+ * BTF references via the map[].  If they all have, we can simply search
+ * for the base BTF type that has those references.  If the references
+ * are not resolved, we need to push the type and its dependents onto
+ * the stack for later resolution.  We first pop the dependents, and
+ * once these have been resolved we pop the reference type with dependents
+ * now resolved.
+ */
+static int btf_relocate_push(struct btf_relocate *r, __u32 id)
+{
+	if (r->stack_size >= r->stack_limit)
+		return -ENOSPC;
+	r->stack[r->stack_size++] = id;
+	return 0;
+}
+
+static __u32 btf_relocate_pop(struct btf_relocate *r)
+{
+	if (r->stack_size > 0)
+		return r->stack[--r->stack_size];
+	return BTF_UNPROCESSED_ID;
+}
+
+static int btf_relocate_ref_type(struct btf_relocate *r, __u32 id)
+{
+	const struct btf_type *t;
+	const struct btf_type *bt;
+	__u32 base_id;
+	int err = 0;
+
+	do {
+		if (r->map[id] <= BTF_MAX_NR_TYPES)
+			continue;
+		t = btf_type_by_id(r->dist_base_btf, id);
+		switch (btf_kind(t)) {
+		case BTF_KIND_CONST:
+		case BTF_KIND_VOLATILE:
+		case BTF_KIND_RESTRICT:
+		case BTF_KIND_PTR:
+		case BTF_KIND_TYPEDEF:
+		case BTF_KIND_FUNC:
+		case BTF_KIND_TYPE_TAG:
+		case BTF_KIND_DECL_TAG:
+			if (r->map[t->type] <= BTF_MAX_NR_TYPES) {
+				bt = NULL;
+				base_id = 0;
+				while ((err = btf_relocate_find_next(r, t, &base_id, &bt))
+				       != -ENOENT) {
+					if (btf_kind(t) == BTF_KIND_DECL_TAG) {
+						if (btf_decl_tag(t) != btf_decl_tag(bt))
+							continue;
+					}
+					if (bt->type != r->map[t->type])
+						continue;
+					r->map[id] = base_id;
+					break;
+				}
+				if (err) {
+					pr_warn("could not find base BTF type for distilled base BTF type[%u]\n",
+						id);
+					return err;
+				}
+			} else {
+				if (btf_relocate_push(r, id) < 0 ||
+				    btf_relocate_push(r, t->type) < 0)
+					return -ENOSPC;
+			}
+			break;
+		case BTF_KIND_ARRAY: {
+			struct btf_array *ba, *a = btf_array(t);
+
+			if (r->map[a->type] <= BTF_MAX_NR_TYPES &&
+			    r->map[a->index_type] <= BTF_MAX_NR_TYPES) {
+				bt = NULL;
+				base_id = 0;
+				while ((err = btf_relocate_find_next(r, t, &base_id, &bt))
+				       != -ENOENT) {
+					ba = btf_array(bt);
+					if (a->nelems != ba->nelems ||
+					    r->map[a->type] != ba->type ||
+					    r->map[a->index_type] != ba->index_type)
+						continue;
+					r->map[id] = base_id;
+					break;
+				}
+				if (err) {
+					pr_warn("could not find matching base BTF ARRAY for distilled base BTF ARRAY[%u]\n",
+						id);
+					return err;
+				}
+			} else {
+				if (btf_relocate_push(r, id) < 0 ||
+				    btf_relocate_push(r, a->type) < 0 ||
+				    btf_relocate_push(r, a->index_type) < 0)
+					return -ENOSPC;
+			}
+			break;
+		}
+		case BTF_KIND_FUNC_PROTO: {
+			struct btf_param *p = btf_params(t);
+			int i, vlen = btf_vlen(t);
+
+			for (i = 0; i < vlen; i++, p++) {
+				if (r->map[p->type] > BTF_MAX_NR_TYPES)
+					break;
+			}
+			if (i == vlen && r->map[t->type] <= BTF_MAX_NR_TYPES) {
+				bt = NULL;
+				base_id = 0;
+				while ((err = btf_relocate_find_next(r, t, &base_id, &bt))
+				       != -ENOENT) {
+					struct btf_param *bp = btf_params(bt);
+					int bvlen = btf_vlen(bt);
+					int j;
+
+					if (bvlen != vlen)
+						continue;
+					if (r->map[t->type] != bt->type)
+						continue;
+					for (j = 0, p = btf_params(t); j < bvlen; j++, bp++, p++) {
+						if (r->map[p->type] != bp->type)
+							break;
+					}
+					if (j < bvlen)
+						continue;
+					r->map[id] = base_id;
+					break;
+				}
+				if (err) {
+					pr_warn("could not find matching base BTF FUNC_PROTO for distilled base BTF FUNC_PROTO[%u]\n",
+						id);
+					return err;
+				}
+			} else {
+				if (btf_relocate_push(r, id) < 0 ||
+				    btf_relocate_push(r, t->type) < 0)
+					return -ENOSPC;
+				for (i = 0, p = btf_params(t); i < btf_vlen(t); i++, p++) {
+					if (btf_relocate_push(r, p->type) < 0)
+						return -ENOSPC;
+				}
+			}
+			break;
+		}
+		default:
+			return -EINVAL;
+		}
+	} while ((id = btf_relocate_pop(r)) <= BTF_MAX_NR_TYPES);
+
+	return 0;
+}
+
+static int btf_relocate_rewrite_type_id(__u32 *id, void *ctx)
+{
+	struct btf_relocate *r = ctx;
+
+	*id = r->map[*id];
+	return 0;
+}
+
+/* If successful, output of relocation is updated BTF with base BTF pointing
+ * at base_btf, and type ids, strings adjusted accordingly
+ */
+int btf_relocate(struct btf *btf, const struct btf *base_btf, __u32 **map_ids)
+{
+	const struct btf *dist_base_btf = btf__base_btf(btf);
+	unsigned int nr_split_types, nr_dist_base_types;
+	unsigned int nr_types = btf__type_cnt(btf);
+	struct btf_relocate r = {};
+	const struct btf_type *t;
+	int diff_id, err = 0;
+	__u32 id, i;
+
+	if (!base_btf || dist_base_btf == base_btf)
+		return 0;
+
+	nr_dist_base_types = btf__type_cnt(dist_base_btf);
+	r.nr_base_types = btf__type_cnt(base_btf);
+	nr_split_types = nr_types - nr_dist_base_types;
+	r.map = calloc(nr_types, sizeof(*r.map));
+	r.stack_limit = nr_dist_base_types;
+	r.stack = calloc(r.stack_limit, sizeof(*r.stack));
+	if (!r.map || !r.stack) {
+		err = -ENOMEM;
+		goto err_out;
+	}
+	diff_id = r.nr_base_types - nr_dist_base_types;
+	for (id = 1; id < nr_dist_base_types; id++)
+		r.map[id] = BTF_UNPROCESSED_ID;
+	for (id = nr_dist_base_types; id < nr_types; id++)
+		r.map[id] = id + diff_id;
+
+	r.btf = btf;
+	r.dist_base_btf = dist_base_btf;
+	r.base_btf = base_btf;
+
+	/* Build a map from base references to actual base BTF ids; it is used
+	 * to track the state of comparisons.  First map base types and fwds,
+	 * next structs/unions, and finally reference types (const, restrict,
+	 * ptr, array, func, func_proto etc).
+	 */
+	for (id = 1; id < nr_dist_base_types; id++) {
+		err = btf_relocate_base_type(&r, id);
+		if (err)
+			goto err_out;
+	}
+	for (id = 1; id < nr_dist_base_types; id++) {
+		err = btf_relocate_struct_type(&r, id);
+		if (err)
+			goto err_out;
+	}
+	for (id = 1; id < nr_dist_base_types; id++) {
+		err = btf_relocate_ref_type(&r, id);
+		if (err)
+			goto err_out;
+	}
+	/* Next, rewrite type ids in split BTF, replacing split ids with updated
+	 * ids based on number of types in base BTF, and base ids with
+	 * relocated ids from base_btf.
+	 */
+	for (i = 0, id = nr_dist_base_types; i < nr_split_types; i++, id++) {
+		t = btf__type_by_id(btf, id);
+		err = btf_type_visit_type_ids((struct btf_type *)t,
+					      btf_relocate_rewrite_type_id, &r);
+		if (err)
+			goto err_out;
+	}
+	/* Finally reset base BTF to base_btf; as part of this operation, string
+	 * offsets are also updated, and we are done.
+	 */
+	err = btf_set_base_btf(r.btf, (struct btf *)r.base_btf);
+err_out:
+	if (!err && map_ids)
+		*map_ids = r.map;
+	else
+		free(r.map);
+	free(r.stack);
+	return err;
+}
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index a9151e31dfa9..b245350f456c 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -422,6 +422,7 @@ LIBBPF_1.5.0 {
 		bpf_program__attach_sockmap;
 		btf__distill_base;
 		btf__parse_opts;
+		btf__relocate;
 		ring__consume_n;
 		ring_buffer__consume_n;
 } LIBBPF_1.4.0;
diff --git a/tools/lib/bpf/libbpf_internal.h b/tools/lib/bpf/libbpf_internal.h
index a0dcfb82e455..e38e1b01e86e 100644
--- a/tools/lib/bpf/libbpf_internal.h
+++ b/tools/lib/bpf/libbpf_internal.h
@@ -234,6 +234,8 @@ struct btf_type;
 struct btf_type *btf_type_by_id(const struct btf *btf, __u32 type_id);
 const char *btf_kind_str(const struct btf_type *t);
 const struct btf_type *skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id);
+int btf_set_base_btf(struct btf *btf, struct btf *base_btf);
+int btf_relocate(struct btf *btf, const struct btf *base_btf, __u32 **map_ids);
 
 static inline enum btf_func_linkage btf_func_linkage(const struct btf_type *t)
 {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v2 bpf-next 10/13] module, bpf: store BTF base pointer in struct module
  2024-04-24 15:47 [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
                   ` (8 preceding siblings ...)
  2024-04-24 15:48 ` [PATCH v2 bpf-next 09/13] libbpf: split BTF relocation Alan Maguire
@ 2024-04-24 15:48 ` Alan Maguire
  2024-04-24 15:48 ` [PATCH v2 bpf-next 11/13] libbpf,bpf: share BTF relocate-related code with kernel Alan Maguire
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Alan Maguire @ 2024-04-24 15:48 UTC (permalink / raw)
  To: andrii, ast
  Cc: jolsa, acme, quentin, eddyz87, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan, Alan Maguire

...as this will allow split BTF modules with a base BTF
representation (rather than the full vmlinux BTF at time of
BTF encoding) to resolve their references to kernel types in a
way that is more resilient to small changes in kernel types.

This will allow modules that are not built every time the kernel
is to provide more resilient BTF, rather than have it invalidated
every time BTF ids for core kernel types change.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 include/linux/module.h | 2 ++
 kernel/module/main.c   | 5 ++++-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/include/linux/module.h b/include/linux/module.h
index 1153b0d99a80..f127a79a95d9 100644
--- a/include/linux/module.h
+++ b/include/linux/module.h
@@ -510,6 +510,8 @@ struct module {
 #ifdef CONFIG_DEBUG_INFO_BTF_MODULES
 	unsigned int btf_data_size;
 	void *btf_data;
+	unsigned int btf_base_data_size;
+	void *btf_base_data;
 #endif
 #ifdef CONFIG_JUMP_LABEL
 	struct jump_entry *jump_entries;
diff --git a/kernel/module/main.c b/kernel/module/main.c
index e1e8a7a9d6c1..e18683abec07 100644
--- a/kernel/module/main.c
+++ b/kernel/module/main.c
@@ -2148,6 +2148,8 @@ static int find_module_sections(struct module *mod, struct load_info *info)
 #endif
 #ifdef CONFIG_DEBUG_INFO_BTF_MODULES
 	mod->btf_data = any_section_objs(info, ".BTF", 1, &mod->btf_data_size);
+	mod->btf_base_data = any_section_objs(info, ".BTF.base", 1,
+					      &mod->btf_base_data_size);
 #endif
 #ifdef CONFIG_JUMP_LABEL
 	mod->jump_entries = section_objs(info, "__jump_table",
@@ -2587,8 +2589,9 @@ static noinline int do_init_module(struct module *mod)
 	}
 
 #ifdef CONFIG_DEBUG_INFO_BTF_MODULES
-	/* .BTF is not SHF_ALLOC and will get removed, so sanitize pointer */
+	/* .BTF is not SHF_ALLOC and will get removed, so sanitize pointers */
 	mod->btf_data = NULL;
+	mod->btf_base_data = NULL;
 #endif
 	/*
 	 * We want to free module_init, but be aware that kallsyms may be
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 43+ messages in thread

* [PATCH v2 bpf-next 11/13] libbpf,bpf: share BTF relocate-related code with kernel
  2024-04-24 15:47 [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
                   ` (9 preceding siblings ...)
  2024-04-24 15:48 ` [PATCH v2 bpf-next 10/13] module, bpf: store BTF base pointer in struct module Alan Maguire
@ 2024-04-24 15:48 ` Alan Maguire
  2024-04-24 15:48 ` [PATCH v2 bpf-next 12/13] selftests/bpf: extend distilled BTF tests to cover BTF relocation Alan Maguire
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 43+ messages in thread
From: Alan Maguire @ 2024-04-24 15:48 UTC (permalink / raw)
  To: andrii, ast
  Cc: jolsa, acme, quentin, eddyz87, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan, Alan Maguire

Share relocation implementation with the kernel.  As part of this,
we also need the type/string visitation functions so add them to a
btf_common.c file that also gets shared with the kernel. Relocation
code in kernel and userspace is identical save for the implementation
of the reparenting of split BTF to the relocated base BTF; this
depends on struct btf internals so is different in kernel and
userspace.
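
To illustrate the string side of reparenting, here is a minimal
standalone sketch of the offset shift that both variants apply to
strings owned by split BTF. The struct and function names are
hypothetical; only str_start and str_diff mirror fields of
struct btf_rewrite_strs. Offsets that point into the old base need a
lookup by string content, which is the part that differs between kernel
and userspace, so the sketch just reports them.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the rewrite context; not the libbpf or
 * kernel type, although str_start/str_diff carry the same meaning.
 */
struct str_rewrite {
	uint32_t str_start;	/* string section length of the old base */
	int32_t str_diff;	/* new base str_len minus old base str_len */
};

/* Return the adjusted offset for a string owned by split BTF.
 * Offset 0 (the empty string) is left unchanged.  Offsets below
 * str_start refer to the old base BTF and cannot be shifted; they
 * must be looked up by content, so -1 is returned for them here.
 */
static int64_t shift_split_str_off(const struct str_rewrite *r, uint32_t off)
{
	if (off == 0)
		return 0;
	if (off < r->str_start)
		return -1;
	return (int64_t)off + r->str_diff;
}
```

With an old base string section of 100 bytes and a new one of 500
bytes (str_diff of 400), split offset 120 becomes 520.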

One other wrinkle on the kernel side is we have to map .BTF_ids in
modules as they were generated with the type ids used at BTF encoding
time. btf_relocate() optionally returns an array mapping from old BTF
ids to relocated ids, so we use that to fix up these references where
needed for kfuncs.
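
As a rough standalone sketch of that fixup (not the kernel code; the
helper names and the plain id array are assumptions made for
illustration), the map can be thought of as indexed by old id: split
ids shift by the difference in base type counts, and distilled base ids
stay unresolved until relocation matches them.

```c
#include <stdint.h>
#include <stdlib.h>

#define UNPROCESSED_ID ((uint32_t)-1)

/* Build an old-id -> new-id map.  Id 0 (void) maps to itself via
 * calloc(); distilled base ids start out unresolved; split ids shift
 * by (nr_base - nr_dist_base).  Caller frees the result.
 */
static uint32_t *build_id_map(uint32_t nr_dist_base, uint32_t nr_base,
			      uint32_t nr_total)
{
	uint32_t *map = calloc(nr_total, sizeof(*map));
	uint32_t id;

	if (!map)
		return NULL;
	for (id = 1; id < nr_dist_base; id++)
		map[id] = UNPROCESSED_ID;
	for (id = nr_dist_base; id < nr_total; id++)
		map[id] = id + (nr_base - nr_dist_base);
	return map;
}

/* Rewrite a list of ids recorded at encoding time.  Returns -1 and
 * leaves ids[] untouched if any id is out of range or unresolved.
 */
static int remap_ids(uint32_t *ids, size_t cnt, const uint32_t *map,
		     uint32_t nr_total)
{
	size_t i;

	for (i = 0; i < cnt; i++) {
		if (ids[i] >= nr_total || map[ids[i]] == UNPROCESSED_ID)
			return -1;
	}
	for (i = 0; i < cnt; i++)
		ids[i] = map[ids[i]];
	return 0;
}
```

Validating before rewriting keeps the id list consistent if one entry
turns out to be unresolved.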

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 include/linux/btf.h          |  32 +++++
 kernel/bpf/Makefile          |   8 ++
 kernel/bpf/btf.c             | 227 ++++++++++++++++++++++++++++-------
 tools/lib/bpf/Build          |   2 +-
 tools/lib/bpf/btf.c          | 130 --------------------
 tools/lib/bpf/btf_common.c   | 146 ++++++++++++++++++++++
 tools/lib/bpf/btf_relocate.c |  29 +++++
 7 files changed, 399 insertions(+), 175 deletions(-)
 create mode 100644 tools/lib/bpf/btf_common.c

diff --git a/include/linux/btf.h b/include/linux/btf.h
index f9e56fd12a9f..1cc20844f163 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -214,6 +214,7 @@ bool btf_is_kernel(const struct btf *btf);
 bool btf_is_module(const struct btf *btf);
 struct module *btf_try_get_module(const struct btf *btf);
 u32 btf_nr_types(const struct btf *btf);
+struct btf *btf_base_btf(const struct btf *btf);
 bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s,
 			   const struct btf_member *m,
 			   u32 expected_offset, u32 expected_size);
@@ -515,8 +516,15 @@ static inline const struct bpf_struct_ops_desc *bpf_struct_ops_find(struct btf *
 }
 #endif
 
+typedef int (*type_id_visit_fn)(__u32 *type_id, void *ctx);
+typedef int (*str_off_visit_fn)(__u32 *str_off, void *ctx);
+
 #ifdef CONFIG_BPF_SYSCALL
 const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id);
+int btf_set_base_btf(struct btf *btf, struct btf *base_btf);
+int btf_relocate(struct btf *btf, const struct btf *base_btf, __u32 **map_ids);
+int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit, void *ctx);
+int btf_type_visit_str_offs(struct btf_type *t, str_off_visit_fn visit, void *ctx);
 const char *btf_name_by_offset(const struct btf *btf, u32 offset);
 struct btf *btf_parse_vmlinux(void);
 struct btf *bpf_prog_get_target_btf(const struct bpf_prog *prog);
@@ -543,6 +551,30 @@ static inline const struct btf_type *btf_type_by_id(const struct btf *btf,
 {
 	return NULL;
 }
+
+static inline int btf_set_base_btf(struct btf *btf, struct btf *base_btf)
+{
+	return 0;
+}
+
+static inline int btf_relocate(struct btf *btf, const struct btf *base_btf,
+			       __u32 **map_ids)
+{
+	return 0;
+}
+
+static inline int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit,
+					  void *ctx)
+{
+	return 0;
+}
+
+static inline int btf_type_visit_str_offs(struct btf_type *t, str_off_visit_fn visit,
+					  void *ctx)
+{
+	return 0;
+}
+
 static inline const char *btf_name_by_offset(const struct btf *btf,
 					     u32 offset)
 {
diff --git a/kernel/bpf/Makefile b/kernel/bpf/Makefile
index 368c5d86b5b7..d705dbe2b226 100644
--- a/kernel/bpf/Makefile
+++ b/kernel/bpf/Makefile
@@ -49,3 +49,11 @@ obj-$(CONFIG_BPF_PRELOAD) += preload/
 obj-$(CONFIG_BPF_SYSCALL) += relo_core.o
 $(obj)/relo_core.o: $(srctree)/tools/lib/bpf/relo_core.c FORCE
 	$(call if_changed_rule,cc_o_c)
+
+obj-$(CONFIG_BPF_SYSCALL) += btf_common.o
+$(obj)/btf_common.o: $(srctree)/tools/lib/bpf/btf_common.c FORCE
+	$(call if_changed_rule,cc_o_c)
+
+obj-$(CONFIG_BPF_SYSCALL) += btf_relocate.o
+$(obj)/btf_relocate.o: $(srctree)/tools/lib/bpf/btf_relocate.c FORCE
+	$(call if_changed_rule,cc_o_c)
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 8291fbfd27b1..2f304b77bab4 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -273,6 +273,7 @@ struct btf {
 	u32 start_str_off; /* first string offset (0 for base BTF) */
 	char name[MODULE_NAME_LEN];
 	bool kernel_btf;
+	__u32 *base_map; /* map from distilled base BTF -> vmlinux BTF ids */
 };
 
 enum verifier_phase {
@@ -1734,7 +1735,13 @@ static void btf_free(struct btf *btf)
 	kvfree(btf->types);
 	kvfree(btf->resolved_sizes);
 	kvfree(btf->resolved_ids);
-	kvfree(btf->data);
+	/* only split BTF allocates data, but btf->data is non-NULL for
+	 * vmlinux BTF too.
+	 */
+	if (btf->base_btf)
+		kvfree(btf->data);
+	if (btf->kernel_btf)
+		kvfree(btf->base_map);
 	kfree(btf);
 }
 
@@ -1763,6 +1770,90 @@ void btf_put(struct btf *btf)
 	}
 }
 
+struct btf *btf_base_btf(const struct btf *btf)
+{
+	return btf->base_btf;
+}
+
+struct btf_rewrite_strs {
+	struct btf *btf;
+	const struct btf *old_base_btf;
+	int str_start;
+	int str_diff;
+	__u32 *str_map;
+};
+
+static __u32 btf_find_str(struct btf *btf, const char *s)
+{
+	__u32 offset = 0;
+
+	while (offset < btf->hdr.str_len) {
+		while (!btf->strings[offset])
+			offset++;
+		if (strcmp(s, &btf->strings[offset]) == 0)
+			return offset;
+		while (btf->strings[offset])
+			offset++;
+	}
+	return -ENOENT;
+}
+
+static int btf_rewrite_strs(__u32 *str_off, void *ctx)
+{
+	struct btf_rewrite_strs *r = ctx;
+	const char *s;
+	int off;
+
+	if (!*str_off)
+		return 0;
+	if (*str_off >= r->str_start) {
+		*str_off += r->str_diff;
+	} else {
+		s = btf_str_by_offset(r->old_base_btf, *str_off);
+		if (!s)
+			return -ENOENT;
+		if (r->str_map[*str_off]) {
+			off = r->str_map[*str_off];
+		} else {
+			off = btf_find_str(r->btf->base_btf, s);
+			if (off < 0)
+				return off;
+			r->str_map[*str_off] = off;
+		}
+		*str_off = off;
+	}
+	return 0;
+}
+
+int btf_set_base_btf(struct btf *btf, struct btf *base_btf)
+{
+	struct btf_rewrite_strs r = {};
+	struct btf_type *t;
+	int i, err = 0;
+
+	r.old_base_btf = btf_base_btf(btf);
+	if (!r.old_base_btf)
+		return -EINVAL;
+	r.btf = btf;
+	r.str_start = r.old_base_btf->hdr.str_len;
+	r.str_diff = base_btf->hdr.str_len - r.old_base_btf->hdr.str_len;
+	r.str_map = kvcalloc(r.old_base_btf->hdr.str_len, sizeof(*r.str_map),
+			     GFP_KERNEL | __GFP_NOWARN);
+	if (!r.str_map)
+		return -ENOMEM;
+	btf->base_btf = base_btf;
+	btf->start_id = btf_nr_types(base_btf);
+	btf->start_str_off = base_btf->hdr.str_len;
+	for (i = 0; i < btf->nr_types; i++) {
+		t = (struct btf_type *)btf_type_by_id(btf, i + btf->start_id);
+		err = btf_type_visit_str_offs((struct btf_type *)t, btf_rewrite_strs, &r);
+		if (err)
+			break;
+	}
+	kvfree(r.str_map);
+	return err;
+}
+
 static int env_resolve_init(struct btf_verifier_env *env)
 {
 	struct btf *btf = env->btf;
@@ -5981,23 +6072,15 @@ int get_kern_ctx_btf_id(struct bpf_verifier_log *log, enum bpf_prog_type prog_ty
 BTF_ID_LIST(bpf_ctx_convert_btf_id)
 BTF_ID(struct, bpf_ctx_convert)
 
-struct btf *btf_parse_vmlinux(void)
+static struct btf *btf_parse_base(struct btf_verifier_env *env, const char *name,
+				  void *data, unsigned int data_size)
 {
-	struct btf_verifier_env *env = NULL;
-	struct bpf_verifier_log *log;
 	struct btf *btf = NULL;
 	int err;
 
 	if (!IS_ENABLED(CONFIG_DEBUG_INFO_BTF))
 		return ERR_PTR(-ENOENT);
 
-	env = kzalloc(sizeof(*env), GFP_KERNEL | __GFP_NOWARN);
-	if (!env)
-		return ERR_PTR(-ENOMEM);
-
-	log = &env->log;
-	log->level = BPF_LOG_KERNEL;
-
 	btf = kzalloc(sizeof(*btf), GFP_KERNEL | __GFP_NOWARN);
 	if (!btf) {
 		err = -ENOMEM;
@@ -6005,10 +6088,10 @@ struct btf *btf_parse_vmlinux(void)
 	}
 	env->btf = btf;
 
-	btf->data = __start_BTF;
-	btf->data_size = __stop_BTF - __start_BTF;
+	btf->data = data;
+	btf->data_size = data_size;
 	btf->kernel_btf = true;
-	snprintf(btf->name, sizeof(btf->name), "vmlinux");
+	snprintf(btf->name, sizeof(btf->name), "%s", name);
 
 	err = btf_parse_hdr(env);
 	if (err)
@@ -6028,20 +6111,11 @@ struct btf *btf_parse_vmlinux(void)
 	if (err)
 		goto errout;
 
-	/* btf_parse_vmlinux() runs under bpf_verifier_lock */
-	bpf_ctx_convert.t = btf_type_by_id(btf, bpf_ctx_convert_btf_id[0]);
-
 	refcount_set(&btf->refcnt, 1);
 
-	err = btf_alloc_id(btf);
-	if (err)
-		goto errout;
-
-	btf_verifier_env_free(env);
 	return btf;
 
 errout:
-	btf_verifier_env_free(env);
 	if (btf) {
 		kvfree(btf->types);
 		kfree(btf);
@@ -6049,19 +6123,59 @@ struct btf *btf_parse_vmlinux(void)
 	return ERR_PTR(err);
 }
 
+struct btf *btf_parse_vmlinux(void)
+{
+	struct btf_verifier_env *env = NULL;
+	struct bpf_verifier_log *log;
+	struct btf *btf;
+	int err;
+
+	env = kzalloc(sizeof(*env), GFP_KERNEL | __GFP_NOWARN);
+	if (!env)
+		return ERR_PTR(-ENOMEM);
+
+	log = &env->log;
+	log->level = BPF_LOG_KERNEL;
+	btf = btf_parse_base(env, "vmlinux", __start_BTF, __stop_BTF - __start_BTF);
+	if (!IS_ERR(btf)) {
+		/* btf_parse_vmlinux() runs under bpf_verifier_lock */
+		bpf_ctx_convert.t = btf_type_by_id(btf, bpf_ctx_convert_btf_id[0]);
+		err = btf_alloc_id(btf);
+		if (err) {
+			btf_free(btf);
+			btf = ERR_PTR(err);
+		}
+	}
+	btf_verifier_env_free(env);
+	return btf;
+}
+
 #ifdef CONFIG_DEBUG_INFO_BTF_MODULES
 
-static struct btf *btf_parse_module(const char *module_name, const void *data, unsigned int data_size)
+/* If .BTF_ids section was created with distilled base BTF, both base and
+ * split BTF ids will need to be mapped to actual base/split ids for
+ * BTF now that it has been relocated.
+ */
+static __u32 btf_id_map(const struct btf *btf, __u32 id)
+{
+	if (!btf->base_btf || !btf->base_map)
+		return id;
+	return btf->base_map[id];
+}
+
+static struct btf *btf_parse_module(const char *module_name, const void *data,
+				    unsigned int data_size, void *base_data,
+				    unsigned int base_data_size)
 {
+	struct btf *btf = NULL, *vmlinux_btf, *base_btf = NULL;
 	struct btf_verifier_env *env = NULL;
 	struct bpf_verifier_log *log;
-	struct btf *btf = NULL, *base_btf;
-	int err;
+	int err = 0;
 
-	base_btf = bpf_get_btf_vmlinux();
-	if (IS_ERR(base_btf))
-		return base_btf;
-	if (!base_btf)
+	vmlinux_btf = bpf_get_btf_vmlinux();
+	if (IS_ERR(vmlinux_btf))
+		return vmlinux_btf;
+	if (!vmlinux_btf)
 		return ERR_PTR(-EINVAL);
 
 	env = kzalloc(sizeof(*env), GFP_KERNEL | __GFP_NOWARN);
@@ -6071,6 +6185,16 @@ static struct btf *btf_parse_module(const char *module_name, const void *data, u
 	log = &env->log;
 	log->level = BPF_LOG_KERNEL;
 
+	if (base_data) {
+		base_btf = btf_parse_base(env, ".BTF.base", base_data, base_data_size);
+		if (IS_ERR(base_btf)) {
+			err = PTR_ERR(base_btf);
+			goto errout;
+		}
+	} else {
+		base_btf = vmlinux_btf;
+	}
+
 	btf = kzalloc(sizeof(*btf), GFP_KERNEL | __GFP_NOWARN);
 	if (!btf) {
 		err = -ENOMEM;
@@ -6110,12 +6234,22 @@ static struct btf *btf_parse_module(const char *module_name, const void *data, u
 	if (err)
 		goto errout;
 
+	if (base_btf != vmlinux_btf) {
+		err = btf_relocate(btf, vmlinux_btf, &btf->base_map);
+		if (err)
+			goto errout;
+		btf_free(base_btf);
+		base_btf = vmlinux_btf;
+	}
+
 	btf_verifier_env_free(env);
 	refcount_set(&btf->refcnt, 1);
 	return btf;
 
 errout:
 	btf_verifier_env_free(env);
+	if (base_btf != vmlinux_btf)
+		btf_free(base_btf);
 	if (btf) {
 		kvfree(btf->data);
 		kvfree(btf->types);
@@ -7668,7 +7802,8 @@ static int btf_module_notify(struct notifier_block *nb, unsigned long op,
 			err = -ENOMEM;
 			goto out;
 		}
-		btf = btf_parse_module(mod->name, mod->btf_data, mod->btf_data_size);
+		btf = btf_parse_module(mod->name, mod->btf_data, mod->btf_data_size,
+				       mod->btf_base_data, mod->btf_base_data_size);
 		if (IS_ERR(btf)) {
 			kfree(btf_mod);
 			if (!IS_ENABLED(CONFIG_MODULE_ALLOW_BTF_MISMATCH)) {
@@ -7992,7 +8127,7 @@ static int btf_populate_kfunc_set(struct btf *btf, enum btf_kfunc_hook hook,
 	bool add_filter = !!kset->filter;
 	struct btf_kfunc_set_tab *tab;
 	struct btf_id_set8 *set;
-	u32 set_cnt;
+	u32 set_cnt, i;
 	int ret;
 
 	if (hook >= BTF_KFUNC_HOOK_MAX) {
@@ -8038,21 +8173,15 @@ static int btf_populate_kfunc_set(struct btf *btf, enum btf_kfunc_hook hook,
 		goto end;
 	}
 
-	/* We don't need to allocate, concatenate, and sort module sets, because
-	 * only one is allowed per hook. Hence, we can directly assign the
-	 * pointer and return.
-	 */
-	if (!vmlinux_set) {
-		tab->sets[hook] = add_set;
-		goto do_add_filter;
-	}
-
 	/* In case of vmlinux sets, there may be more than one set being
 	 * registered per hook. To create a unified set, we allocate a new set
 	 * and concatenate all individual sets being registered. While each set
 	 * is individually sorted, they may become unsorted when concatenated,
 	 * hence re-sorting the final set again is required to make binary
 	 * searching the set using btf_id_set8_contains function work.
+	 *
+	 * For module sets, we need to allocate as we may need to relocate
+	 * BTF ids.
 	 */
 	set_cnt = set ? set->cnt : 0;
 
@@ -8082,11 +8211,14 @@ static int btf_populate_kfunc_set(struct btf *btf, enum btf_kfunc_hook hook,
 
 	/* Concatenate the two sets */
 	memcpy(set->pairs + set->cnt, add_set->pairs, add_set->cnt * sizeof(set->pairs[0]));
+	/* Now that the set is copied, update with relocated BTF ids */
+	for (i = set->cnt; i < set->cnt + add_set->cnt; i++)
+		set->pairs[i].id = btf_id_map(btf, set->pairs[i].id);
+
 	set->cnt += add_set->cnt;
 
 	sort(set->pairs, set->cnt, sizeof(set->pairs[0]), btf_id_cmp_func, NULL);
 
-do_add_filter:
 	if (add_filter) {
 		hook_filter = &tab->hook_filters[hook];
 		hook_filter->filters[hook_filter->nr_filters++] = kset->filter;
@@ -8204,7 +8336,7 @@ static int __register_btf_kfunc_id_set(enum btf_kfunc_hook hook,
 		return PTR_ERR(btf);
 
 	for (i = 0; i < kset->set->cnt; i++) {
-		ret = btf_check_kfunc_protos(btf, kset->set->pairs[i].id,
+		ret = btf_check_kfunc_protos(btf, btf_id_map(btf, kset->set->pairs[i].id),
 					     kset->set->pairs[i].flags);
 		if (ret)
 			goto err_out;
@@ -8303,7 +8435,7 @@ int register_btf_id_dtor_kfuncs(const struct btf_id_dtor_kfunc *dtors, u32 add_c
 {
 	struct btf_id_dtor_kfunc_tab *tab;
 	struct btf *btf;
-	u32 tab_cnt;
+	u32 tab_cnt, i;
 	int ret;
 
 	btf = btf_get_module_btf(owner);
@@ -8354,6 +8486,13 @@ int register_btf_id_dtor_kfuncs(const struct btf_id_dtor_kfunc *dtors, u32 add_c
 	btf->dtor_kfunc_tab = tab;
 
 	memcpy(tab->dtors + tab->cnt, dtors, add_cnt * sizeof(tab->dtors[0]));
+
+	/* remap BTF ids based on BTF relocation (if any) */
+	for (i = tab_cnt; i < tab_cnt + add_cnt; i++) {
+		tab->dtors[i].btf_id = btf_id_map(btf, tab->dtors[i].btf_id);
+		tab->dtors[i].kfunc_btf_id = btf_id_map(btf, tab->dtors[i].kfunc_btf_id);
+	}
+
 	tab->cnt += add_cnt;
 
 	sort(tab->dtors, tab->cnt, sizeof(tab->dtors[0]), btf_id_cmp_func, NULL);
diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build
index 336da6844d42..567abaa52131 100644
--- a/tools/lib/bpf/Build
+++ b/tools/lib/bpf/Build
@@ -1,4 +1,4 @@
 libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \
 	    netlink.o bpf_prog_linfo.o libbpf_probes.o hashmap.o \
 	    btf_dump.o ringbuf.o strset.o linker.o gen_loader.o relo_core.o \
-	    usdt.o zip.o elf.o features.o btf_relocate.o
+	    usdt.o zip.o elf.o features.o btf_common.o btf_relocate.o
diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index f00a84fea9b5..7be33560bd94 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -5034,136 +5034,6 @@ struct btf *btf__load_module_btf(const char *module_name, struct btf *vmlinux_bt
 	return btf__parse_split(path, vmlinux_btf);
 }
 
-int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit, void *ctx)
-{
-	int i, n, err;
-
-	switch (btf_kind(t)) {
-	case BTF_KIND_INT:
-	case BTF_KIND_FLOAT:
-	case BTF_KIND_ENUM:
-	case BTF_KIND_ENUM64:
-		return 0;
-
-	case BTF_KIND_FWD:
-	case BTF_KIND_CONST:
-	case BTF_KIND_VOLATILE:
-	case BTF_KIND_RESTRICT:
-	case BTF_KIND_PTR:
-	case BTF_KIND_TYPEDEF:
-	case BTF_KIND_FUNC:
-	case BTF_KIND_VAR:
-	case BTF_KIND_DECL_TAG:
-	case BTF_KIND_TYPE_TAG:
-		return visit(&t->type, ctx);
-
-	case BTF_KIND_ARRAY: {
-		struct btf_array *a = btf_array(t);
-
-		err = visit(&a->type, ctx);
-		err = err ?: visit(&a->index_type, ctx);
-		return err;
-	}
-
-	case BTF_KIND_STRUCT:
-	case BTF_KIND_UNION: {
-		struct btf_member *m = btf_members(t);
-
-		for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
-			err = visit(&m->type, ctx);
-			if (err)
-				return err;
-		}
-		return 0;
-	}
-
-	case BTF_KIND_FUNC_PROTO: {
-		struct btf_param *m = btf_params(t);
-
-		err = visit(&t->type, ctx);
-		if (err)
-			return err;
-		for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
-			err = visit(&m->type, ctx);
-			if (err)
-				return err;
-		}
-		return 0;
-	}
-
-	case BTF_KIND_DATASEC: {
-		struct btf_var_secinfo *m = btf_var_secinfos(t);
-
-		for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
-			err = visit(&m->type, ctx);
-			if (err)
-				return err;
-		}
-		return 0;
-	}
-
-	default:
-		return -EINVAL;
-	}
-}
-
-int btf_type_visit_str_offs(struct btf_type *t, str_off_visit_fn visit, void *ctx)
-{
-	int i, n, err;
-
-	err = visit(&t->name_off, ctx);
-	if (err)
-		return err;
-
-	switch (btf_kind(t)) {
-	case BTF_KIND_STRUCT:
-	case BTF_KIND_UNION: {
-		struct btf_member *m = btf_members(t);
-
-		for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
-			err = visit(&m->name_off, ctx);
-			if (err)
-				return err;
-		}
-		break;
-	}
-	case BTF_KIND_ENUM: {
-		struct btf_enum *m = btf_enum(t);
-
-		for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
-			err = visit(&m->name_off, ctx);
-			if (err)
-				return err;
-		}
-		break;
-	}
-	case BTF_KIND_ENUM64: {
-		struct btf_enum64 *m = btf_enum64(t);
-
-		for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
-			err = visit(&m->name_off, ctx);
-			if (err)
-				return err;
-		}
-		break;
-	}
-	case BTF_KIND_FUNC_PROTO: {
-		struct btf_param *m = btf_params(t);
-
-		for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
-			err = visit(&m->name_off, ctx);
-			if (err)
-				return err;
-		}
-		break;
-	}
-	default:
-		break;
-	}
-
-	return 0;
-}
-
 int btf_ext_visit_type_ids(struct btf_ext *btf_ext, type_id_visit_fn visit, void *ctx)
 {
 	const struct btf_ext_info *seg;
diff --git a/tools/lib/bpf/btf_common.c b/tools/lib/bpf/btf_common.c
new file mode 100644
index 000000000000..ddec3f3ac423
--- /dev/null
+++ b/tools/lib/bpf/btf_common.c
@@ -0,0 +1,146 @@
+// SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause)
+/* Copyright (c) 2021 Facebook */
+/* Copyright (c) 2024, Oracle and/or its affiliates. */
+
+#ifdef __KERNEL__
+#include <linux/bpf.h>
+#include <linux/btf.h>
+
+static inline struct btf_var_secinfo *btf_var_secinfos(const struct btf_type *t)
+{
+	return (struct btf_var_secinfo *)(t + 1);
+}
+
+#else
+#include "btf.h"
+#include "libbpf_internal.h"
+#endif
+
+int btf_type_visit_type_ids(struct btf_type *t, type_id_visit_fn visit, void *ctx)
+{
+	int i, n, err;
+
+	switch (btf_kind(t)) {
+	case BTF_KIND_INT:
+	case BTF_KIND_FLOAT:
+	case BTF_KIND_ENUM:
+	case BTF_KIND_ENUM64:
+		return 0;
+
+	case BTF_KIND_FWD:
+	case BTF_KIND_CONST:
+	case BTF_KIND_VOLATILE:
+	case BTF_KIND_RESTRICT:
+	case BTF_KIND_PTR:
+	case BTF_KIND_TYPEDEF:
+	case BTF_KIND_FUNC:
+	case BTF_KIND_VAR:
+	case BTF_KIND_DECL_TAG:
+	case BTF_KIND_TYPE_TAG:
+		return visit(&t->type, ctx);
+
+	case BTF_KIND_ARRAY: {
+		struct btf_array *a = btf_array(t);
+
+		err = visit(&a->type, ctx);
+		err = err ?: visit(&a->index_type, ctx);
+		return err;
+	}
+
+	case BTF_KIND_STRUCT:
+	case BTF_KIND_UNION: {
+		struct btf_member *m = btf_members(t);
+
+		for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
+			err = visit(&m->type, ctx);
+			if (err)
+				return err;
+		}
+		return 0;
+	}
+	case BTF_KIND_FUNC_PROTO: {
+		struct btf_param *m = btf_params(t);
+
+		err = visit(&t->type, ctx);
+		if (err)
+			return err;
+		for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
+			err = visit(&m->type, ctx);
+			if (err)
+				return err;
+		}
+		return 0;
+	}
+
+	case BTF_KIND_DATASEC: {
+		struct btf_var_secinfo *m = btf_var_secinfos(t);
+
+		for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
+			err = visit(&m->type, ctx);
+			if (err)
+				return err;
+		}
+		return 0;
+	}
+
+	default:
+		return -EINVAL;
+	}
+}
+
+int btf_type_visit_str_offs(struct btf_type *t, str_off_visit_fn visit, void *ctx)
+{
+	int i, n, err;
+
+	err = visit(&t->name_off, ctx);
+	if (err)
+		return err;
+
+	switch (btf_kind(t)) {
+	case BTF_KIND_STRUCT:
+	case BTF_KIND_UNION: {
+		struct btf_member *m = btf_members(t);
+
+		for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
+			err = visit(&m->name_off, ctx);
+			if (err)
+				return err;
+		}
+		break;
+	}
+	case BTF_KIND_ENUM: {
+		struct btf_enum *m = btf_enum(t);
+
+		for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
+			err = visit(&m->name_off, ctx);
+			if (err)
+				return err;
+		}
+		break;
+	}
+	case BTF_KIND_ENUM64: {
+		struct btf_enum64 *m = btf_enum64(t);
+
+		for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
+			err = visit(&m->name_off, ctx);
+			if (err)
+				return err;
+		}
+		break;
+	}
+	case BTF_KIND_FUNC_PROTO: {
+		struct btf_param *m = btf_params(t);
+
+		for (i = 0, n = btf_vlen(t); i < n; i++, m++) {
+			err = visit(&m->name_off, ctx);
+			if (err)
+				return err;
+		}
+		break;
+	}
+	default:
+		break;
+	}
+
+	return 0;
+}
diff --git a/tools/lib/bpf/btf_relocate.c b/tools/lib/bpf/btf_relocate.c
index d9340375f4a3..8d3865cf193b 100644
--- a/tools/lib/bpf/btf_relocate.c
+++ b/tools/lib/bpf/btf_relocate.c
@@ -1,11 +1,40 @@
 // SPDX-License-Identifier: GPL-2.0
 /* Copyright (c) 2024, Oracle and/or its affiliates. */
 
+#ifdef __KERNEL__
+#include <linux/bpf.h>
+#include <linux/btf.h>
+#include <linux/string.h>
+#include <linux/bpf_verifier.h>
+
+#define btf__type_by_id		btf_type_by_id
+#define btf__type_cnt		btf_nr_types
+#define btf__base_btf		btf_base_btf
+#define btf__name_by_offset	btf_name_by_offset
+#define btf_kflag		btf_type_kflag
+
+#define calloc(nmemb, size)	kvcalloc(nmemb, size, GFP_KERNEL | __GFP_NOWARN)
+#define free(ptr)		kvfree(ptr)
+
+static inline __u8 btf_int_bits(const struct btf_type *t)
+{
+	return BTF_INT_BITS(*(__u32 *)(t + 1));
+}
+
+static inline struct btf_decl_tag *btf_decl_tag(const struct btf_type *t)
+{
+	return (struct btf_decl_tag *)(t + 1);
+}
+
+#else
+
 #include "btf.h"
 #include "bpf.h"
 #include "libbpf.h"
 #include "libbpf_internal.h"
 
+#endif /* __KERNEL__ */
+
 struct btf;
 
 #define BTF_MAX_NR_TYPES 0x7fffffffU
-- 
2.31.1



* [PATCH v2 bpf-next 12/13] selftests/bpf: extend distilled BTF tests to cover BTF relocation
  2024-04-24 15:47 [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
                   ` (10 preceding siblings ...)
  2024-04-24 15:48 ` [PATCH v2 bpf-next 11/13] libbpf,bpf: share BTF relocate-related code with kernel Alan Maguire
@ 2024-04-24 15:48 ` Alan Maguire
  2024-04-24 15:48 ` [PATCH v2 bpf-next 13/13] bpftool: support displaying relocated-with-base split BTF Alan Maguire
  2024-04-26 22:56 ` [PATCH v2 bpf-next 00/13] bpf: support resilient " Andrii Nakryiko
  13 siblings, 0 replies; 43+ messages in thread
From: Alan Maguire @ 2024-04-24 15:48 UTC (permalink / raw)
  To: andrii, ast
  Cc: jolsa, acme, quentin, eddyz87, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan, Alan Maguire

Ensure relocated BTF looks as expected; in this case identical to
original split BTF.

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 .../selftests/bpf/prog_tests/btf_distill.c    | 45 +++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/btf_distill.c b/tools/testing/selftests/bpf/prog_tests/btf_distill.c
index aae9aef68bd6..67cc98227c12 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf_distill.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf_distill.c
@@ -192,6 +192,51 @@ static void test_distilled_base(void)
 		"[22] FUNC 'fn' type_id=11 linkage=static",
 		"[23] TYPEDEF 'arraytype' type_id=12");
 
+	if (!ASSERT_EQ(btf__relocate(btf4, btf1), 0, "relocate_split"))
+		goto cleanup;
+	VALIDATE_RAW_BTF(
+		btf4,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] PTR '(anon)' type_id=1",
+		"[3] STRUCT 's1' size=8 vlen=1\n"
+		"\t'f1' type_id=2 bits_offset=0",
+		"[4] STRUCT '(anon)' size=12 vlen=2\n"
+		"\t'f1' type_id=1 bits_offset=0\n"
+		"\t'f2' type_id=3 bits_offset=32",
+		"[5] INT 'unsigned int' size=4 bits_offset=0 nr_bits=32 encoding=(none)",
+		"[6] UNION 'u1' size=12 vlen=2\n"
+		"\t'f1' type_id=1 bits_offset=0\n"
+		"\t'f2' type_id=2 bits_offset=0",
+		"[7] UNION '(anon)' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0",
+		"[8] ENUM 'e1' encoding=UNSIGNED size=4 vlen=1\n"
+		"\t'v1' val=1",
+		"[9] ENUM '(anon)' encoding=UNSIGNED size=4 vlen=1\n"
+		"\t'av1' val=2",
+		"[10] ENUM64 'e641' encoding=SIGNED size=8 vlen=1\n"
+		"\t'v1' val=1024",
+		"[11] ENUM64 '(anon)' encoding=SIGNED size=8 vlen=1\n"
+		"\t'v1' val=1025",
+		"[12] STRUCT 'unneeded' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0",
+		"[13] STRUCT 'embedded' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0",
+		"[14] FUNC_PROTO '(anon)' ret_type_id=1 vlen=1\n"
+		"\t'p1' type_id=1",
+		"[15] ARRAY '(anon)' type_id=1 index_type_id=1 nr_elems=3",
+		"[16] PTR '(anon)' type_id=3",
+		"[17] PTR '(anon)' type_id=4",
+		"[18] CONST '(anon)' type_id=6",
+		"[19] RESTRICT '(anon)' type_id=7",
+		"[20] VOLATILE '(anon)' type_id=8",
+		"[21] TYPEDEF 'et' type_id=9",
+		"[22] CONST '(anon)' type_id=10",
+		"[23] PTR '(anon)' type_id=11",
+		"[24] STRUCT 'with_embedded' size=4 vlen=1\n"
+		"\t'f1' type_id=13 bits_offset=0",
+		"[25] FUNC 'fn' type_id=14 linkage=static",
+		"[26] TYPEDEF 'arraytype' type_id=15");
+
 cleanup:
 	btf__free(btf4);
 	btf__free(btf3);
-- 
2.31.1



* [PATCH v2 bpf-next 13/13] bpftool: support displaying relocated-with-base split BTF
  2024-04-24 15:47 [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
                   ` (11 preceding siblings ...)
  2024-04-24 15:48 ` [PATCH v2 bpf-next 12/13] selftests/bpf: extend distilled BTF tests to cover BTF relocation Alan Maguire
@ 2024-04-24 15:48 ` Alan Maguire
  2024-04-26 22:56 ` [PATCH v2 bpf-next 00/13] bpf: support resilient " Andrii Nakryiko
  13 siblings, 0 replies; 43+ messages in thread
From: Alan Maguire @ 2024-04-24 15:48 UTC (permalink / raw)
  To: andrii, ast
  Cc: jolsa, acme, quentin, eddyz87, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan, Alan Maguire

If the -R <base_btf> option is used, we can display BTF that has been
generated with distilled base BTF in its relocated form.  For example,
bpf_testmod.ko is built as an out-of-tree module and so has a distilled
.BTF.base section; by default its split BTF is shown relative to that section:

bpftool btf dump file bpf_testmod.ko

Alternatively, we can display content relocated with
(a possibly changed) base BTF via

bpftool btf dump -R /sys/kernel/btf/vmlinux file bpf_testmod.ko

The latter mirrors how the kernel will handle such split
BTF; it relocates its representation with the running
kernel, and if successful, renumbers BTF ids to reference
the current vmlinux BTF.
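
As a rough illustration of the renumbering step, the toy sketch below
looks an id up in a relocation map and falls back to the id itself when
no relocation took place.  It is modelled loosely on the btf_id_map()
helper added in patch 11; struct toy_btf, toy_id_map() and the nr_mapped
field are hypothetical names, and the bounds check is an assumption of
this sketch rather than something the kernel helper does.

```c
#include <stdint.h>
#include <stddef.h>

/* Toy stand-in for struct btf; only the fields this sketch needs. */
struct toy_btf {
	const uint32_t *base_map;	/* NULL when no relocation happened */
	uint32_t nr_mapped;		/* number of entries in base_map */
};

/* Return the post-relocation id, or the id unchanged if it is not mapped. */
static uint32_t toy_id_map(const struct toy_btf *btf, uint32_t id)
{
	if (!btf->base_map || id >= btf->nr_mapped)
		return id;
	return btf->base_map[id];
}
```

Ids that were recorded against the distilled base are translated through
the map; BTF that was never relocated passes its ids through untouched.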

Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
---
 tools/bpf/bpftool/Documentation/bpftool-btf.rst | 15 ++++++++++++++-
 tools/bpf/bpftool/bash-completion/bpftool       |  7 ++++---
 tools/bpf/bpftool/btf.c                         | 11 ++++++++++-
 tools/bpf/bpftool/main.c                        | 14 +++++++++++++-
 tools/bpf/bpftool/main.h                        |  2 ++
 5 files changed, 43 insertions(+), 6 deletions(-)

diff --git a/tools/bpf/bpftool/Documentation/bpftool-btf.rst b/tools/bpf/bpftool/Documentation/bpftool-btf.rst
index eaba24320fb2..fd6bb1280e7b 100644
--- a/tools/bpf/bpftool/Documentation/bpftool-btf.rst
+++ b/tools/bpf/bpftool/Documentation/bpftool-btf.rst
@@ -16,7 +16,7 @@ SYNOPSIS
 
 **bpftool** [*OPTIONS*] **btf** *COMMAND*
 
-*OPTIONS* := { |COMMON_OPTIONS| | { **-B** | **--base-btf** } }
+*OPTIONS* := { |COMMON_OPTIONS| | { **-B** | **--base-btf** } | { **-R** | **--relocate-base-btf** } }
 
 *COMMANDS* := { **dump** | **help** }
 
@@ -85,6 +85,19 @@ OPTIONS
     BTF object is passed through other handles, this option becomes
     necessary.
 
+-R, --relocate-base-btf *FILE*
+    When split BTF is generated with distilled base BTF for relocation,
+    the latter is stored in a .BTF.base section and allows us to later
+    relocate split BTF against a potentially-changed base BTF by using
+    information in the .BTF.base section about the base types referenced
+    from split BTF.  Relocation is carried out against the base BTF
+    supplied via this parameter and the split BTF will then refer to
+    the base types supplied in *FILE*.
+
+    If this option is not used, split BTF is shown relative to the
+    .BTF.base, which contains just enough information to support later
+    relocation.
+
 EXAMPLES
 ========
 **# bpftool btf dump id 1226**
diff --git a/tools/bpf/bpftool/bash-completion/bpftool b/tools/bpf/bpftool/bash-completion/bpftool
index 04afe2ac2228..878cf3d49a76 100644
--- a/tools/bpf/bpftool/bash-completion/bpftool
+++ b/tools/bpf/bpftool/bash-completion/bpftool
@@ -262,7 +262,7 @@ _bpftool()
     # Deal with options
     if [[ ${words[cword]} == -* ]]; then
         local c='--version --json --pretty --bpffs --mapcompat --debug \
-            --use-loader --base-btf'
+            --use-loader --base-btf --relocate-base-btf'
         COMPREPLY=( $( compgen -W "$c" -- "$cur" ) )
         return 0
     fi
@@ -283,7 +283,7 @@ _bpftool()
             _sysfs_get_netdevs
             return 0
             ;;
-        file|pinned|-B|--base-btf)
+        file|pinned|-B|-R|--base-btf|--relocate-base-btf)
             _filedir
             return 0
             ;;
@@ -297,7 +297,8 @@ _bpftool()
     local i pprev
     for (( i=1; i < ${#words[@]}; )); do
         if [[ ${words[i]::1} == - ]] &&
-            [[ ${words[i]} != "-B" ]] && [[ ${words[i]} != "--base-btf" ]]; then
+            [[ ${words[i]} != "-B" ]] && [[ ${words[i]} != "--base-btf" ]] &&
+            [[ ${words[i]} != "-R" ]] && [[ ${words[i]} != "--relocate-base-btf" ]]; then
             words=( "${words[@]:0:i}" "${words[@]:i+1}" )
             [[ $i -le $cword ]] && cword=$(( cword - 1 ))
         else
diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
index 2e8bd2c9f0a3..7df8a686fef7 100644
--- a/tools/bpf/bpftool/btf.c
+++ b/tools/bpf/bpftool/btf.c
@@ -639,6 +639,14 @@ static int do_dump(int argc, char **argv)
 			base_btf = btf__parse_opts(*argv, &optp);
 			if (base_btf)
 				btf = btf__parse_split(*argv, base_btf);
+			if (btf && relocate_base_btf) {
+				err = btf__relocate(btf, relocate_base_btf);
+				if (err) {
+					p_err("could not relocate BTF from '%s' with base BTF '%s': %s",
+					      *argv, relocate_base_btf_path, strerror(-err));
+					goto done;
+				}
+			}
 		}
 		if (!btf) {
 			err = -errno;
@@ -1076,7 +1084,8 @@ static int do_help(int argc, char **argv)
 		"       " HELP_SPEC_MAP "\n"
 		"       " HELP_SPEC_PROGRAM "\n"
 		"       " HELP_SPEC_OPTIONS " |\n"
-		"                    {-B|--base-btf} }\n"
+		"                    {-B|--base-btf} |\n"
+		"                    {-R|--relocate-base-btf} }\n"
 		"",
 		bin_name, "btf");
 
diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
index 08d0ac543c67..69d4906bec5c 100644
--- a/tools/bpf/bpftool/main.c
+++ b/tools/bpf/bpftool/main.c
@@ -32,6 +32,8 @@ bool verifier_logs;
 bool relaxed_maps;
 bool use_loader;
 struct btf *base_btf;
+struct btf *relocate_base_btf;
+const char *relocate_base_btf_path;
 struct hashmap *refs_table;
 
 static void __noreturn clean_and_exit(int i)
@@ -448,6 +450,7 @@ int main(int argc, char **argv)
 		{ "debug",	no_argument,	NULL,	'd' },
 		{ "use-loader",	no_argument,	NULL,	'L' },
 		{ "base-btf",	required_argument, NULL, 'B' },
+		{ "relocate-base-btf", required_argument, NULL, 'R' },
 		{ 0 }
 	};
 	bool version_requested = false;
@@ -473,7 +476,7 @@ int main(int argc, char **argv)
 	bin_name = "bpftool";
 
 	opterr = 0;
-	while ((opt = getopt_long(argc, argv, "VhpjfLmndB:l",
+	while ((opt = getopt_long(argc, argv, "VhpjfLmndB:lR:",
 				  options, NULL)) >= 0) {
 		switch (opt) {
 		case 'V':
@@ -519,6 +522,15 @@ int main(int argc, char **argv)
 		case 'L':
 			use_loader = true;
 			break;
+		case 'R':
+			relocate_base_btf_path = optarg;
+			relocate_base_btf = btf__parse(optarg, NULL);
+			if (!relocate_base_btf) {
+				p_err("failed to parse base BTF for relocation at '%s': %d",
+				      optarg, -errno);
+				return -1;
+			}
+			break;
 		default:
 			p_err("unrecognized option '%s'", argv[optind - 1]);
 			if (json_output)
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index 9eb764fe4cc8..bbf8194a2d76 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -83,6 +83,8 @@ extern bool verifier_logs;
 extern bool relaxed_maps;
 extern bool use_loader;
 extern struct btf *base_btf;
+extern struct btf *relocate_base_btf;
+extern const char *relocate_base_btf_path;
 extern struct hashmap *refs_table;
 
 void __printf(1, 2) p_err(const char *fmt, ...);
-- 
2.31.1



* Re: [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF
  2024-04-24 15:47 [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
                   ` (12 preceding siblings ...)
  2024-04-24 15:48 ` [PATCH v2 bpf-next 13/13] bpftool: support displaying relocated-with-base split BTF Alan Maguire
@ 2024-04-26 22:56 ` Andrii Nakryiko
  2024-04-27  0:24   ` Andrii Nakryiko
  13 siblings, 1 reply; 43+ messages in thread
From: Andrii Nakryiko @ 2024-04-26 22:56 UTC (permalink / raw)
  To: Alan Maguire
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On Wed, Apr 24, 2024 at 8:48 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> Split BPF Type Format (BTF) provides huge advantages in that kernel
> modules only have to provide type information for types that they do not
> share with the core kernel; for core kernel types, split BTF refers to
> core kernel BTF type ids.  So for a STRUCT sk_buff, a module that
> uses that structure (or a pointer to it) simply needs to refer to the
> core kernel type id, saving the need to define the structure and its many
> dependents.  This cuts down on duplication and makes BTF as compact
> as possible.
>
> However, there is a downside.  This scheme requires the references from
> split BTF to base BTF to be valid not just at encoding time, but at use
> time (when the module is loaded).  Even a small change in kernel types
> can perturb the type ids in core kernel BTF, and due to pahole's
> parallel processing of compilation units, even an unchanged kernel can
> have different type ids if BTF is re-generated.  So we have a robustness
> problem for split BTF for cases where a module is not always compiled at
> the same time as the kernel.  This problem is particularly acute for
> distros which generally want module builders to be able to compile a
> module for the lifetime of a Linux stable-based release, and have it
> continue to be valid over the lifetime of that release, even as changes
> in data structures (and hence BTF types) accrue.  Today it's not
> possible to generate BTF for modules that works beyond the initial
> kernel it is compiled against - kernel bugfixes etc invalidate the split
> BTF references to vmlinux BTF, and BTF is no longer usable for the
> module.
>
> The goal of this series is to provide options to provide additional
> context for cases like this.  That context comes in the form of
> distilled base BTF; it stands in for the base BTF, and contains
> information about the types referenced from split BTF, but not their
> full descriptions.  The modified split BTF will refer to type ids in
> this .BTF.base section, and when the kernel loads such modules it
> will use that base BTF to map references from split BTF to the
> current vmlinux BTF - a process of relocating split BTF with the
> currently-running kernel's vmlinux base BTF.
>
> A module builder - using this series along with the pahole changes -
> can then build a module with distilled base BTF via an out-of-tree
> module build, i.e.
>
> make -C . M=path/2/module
>
> The module will have a .BTF section (the split BTF) and a
> .BTF.base section.  The latter is small in size - distilled base
> BTF does not need full struct/union/enum information for named
> types for example.  For 2667 modules built with distilled base BTF,
> the average size observed was 1556 bytes (stddev 1563).
>
> Note that for the in-tree modules, this approach is not needed as
> split and base BTF in the case of in-tree modules are always built
> and re-built together.
>
> The series first focuses on generating split BTF with distilled base
> BTF, and provides btf__parse_opts() which allows specification
> of the section name from which to read BTF data, since we now have
> both .BTF and .BTF.base sections that can contain such data.
>
> Then we add support to resolve_btfids for generating the .BTF.ids
> section with reference to the .BTF.base section - this ensures the
> .BTF.ids match those used in the split/base BTF.
>
> Finally the series provides the mechanism for relocating split BTF with
> a new base; the distilled base BTF is used to map the references to base
> BTF in the split BTF to the new base.  For the kernel, this relocation
> process happens at module load time, and we relocate split BTF
> references to point at types in the current vmlinux BTF.  As part of
> this, .BTF.ids references need to be mapped also.
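
For readers following along, here is a toy sketch of the id-set side of
this: append a module's id set to a unified set, remap only the newly
appended ids through a relocation map, then re-sort so that binary search
keeps working.  It mirrors the shape of the btf_populate_kfunc_set()
change in patch 11, but struct toy_pair, toy_pair_cmp() and
toy_append_remapped() are hypothetical names, and the caller is assumed
to guarantee capacity and that every new id is a valid index into the
map.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

struct toy_pair {
	uint32_t id;
	uint32_t flags;
};

/* Order pairs by id so the set can be binary searched afterwards. */
static int toy_pair_cmp(const void *a, const void *b)
{
	uint32_t x = ((const struct toy_pair *)a)->id;
	uint32_t y = ((const struct toy_pair *)b)->id;

	return (x > y) - (x < y);
}

/* Append add[] to set[], remap only the appended ids, then re-sort. */
static void toy_append_remapped(struct toy_pair *set, uint32_t *cnt,
				const struct toy_pair *add, uint32_t add_cnt,
				const uint32_t *map)
{
	uint32_t i;

	memcpy(set + *cnt, add, add_cnt * sizeof(*set));
	for (i = *cnt; i < *cnt + add_cnt; i++)
		set[i].id = map[set[i].id];
	*cnt += add_cnt;
	qsort(set, *cnt, sizeof(*set), toy_pair_cmp);
}
```

Entries already in the set are left alone, since they were remapped when
they were first added; only the freshly copied tail goes through the map.
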
>
> So concretely, what happens is
>
> - we generate split BTF in the .BTF section of a module that refers to
>   types in the .BTF.base section as base types; these are not full
>   type descriptions but provide information about the base type.  So
>   a STRUCT sk_buff would be represented as a FWD struct sk_buff in
>   distilled base BTF for example.
> - when the module is loaded, the split BTF is relocated with vmlinux
>   BTF; in the case of the FWD struct sk_buff, we find the STRUCT sk_buff
>   in vmlinux BTF and map all split BTF references to the distilled base
>   FWD sk_buff, replacing them with references to the vmlinux BTF
>   STRUCT sk_buff.
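
The two steps above can be sketched with a toy model in which each
distilled base type is matched by name against the new base and the
resulting index is recorded in a map.  This is only an illustration
under simplifying assumptions: array indices stand in for BTF ids, the
lookup is a linear scan, and enum toy_kind, struct toy_type,
toy_kinds_compatible() and toy_build_map() are hypothetical names that
do not appear in the series.

```c
#include <stdint.h>
#include <string.h>

enum toy_kind { TOY_INT, TOY_FWD_STRUCT, TOY_STRUCT };

struct toy_type {
	enum toy_kind kind;
	const char *name;
};

/* A distilled FWD may stand in for a full STRUCT; otherwise kinds must match. */
static int toy_kinds_compatible(enum toy_kind dist, enum toy_kind base)
{
	if (dist == TOY_FWD_STRUCT)
		return base == TOY_STRUCT || base == TOY_FWD_STRUCT;
	return dist == base;
}

/*
 * For each distilled type, find a same-named compatible type in the new
 * base and store its index in map[].  Returns 0 on success, or -1 if a
 * distilled type has no match.
 */
static int toy_build_map(const struct toy_type *dist, uint32_t nr_dist,
			 const struct toy_type *base, uint32_t nr_base,
			 uint32_t *map)
{
	uint32_t i, j;

	for (i = 0; i < nr_dist; i++) {
		for (j = 0; j < nr_base; j++) {
			if (toy_kinds_compatible(dist[i].kind, base[j].kind) &&
			    strcmp(dist[i].name, base[j].name) == 0)
				break;
		}
		if (j == nr_base)
			return -1;
		map[i] = j;
	}
	return 0;
}
```

With a distilled base of { INT int, FWD struct sk_buff } and a new base
that contains STRUCT sk_buff, every split BTF reference to the FWD would
then be rewritten to the index recorded for the STRUCT.
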
>
> Support is also added to bpftool to be able to display split BTF
> relative to its .BTF.base section, and also to display the relocated
> form via the "-R path_to_base_btf" option.
>
> A previous approach to this problem [1] utilized standalone BTF for such
> cases - where the BTF is not defined relative to base BTF so there is no
> relocation required.  The problem with that approach is that from
> the verifier perspective, some types are special, and having a custom
> representation of a core kernel type that did not necessarily match the
> current representation is not tenable.  So the approach taken here was
> to preserve the split BTF model while minimizing the representation of
> the context needed to relocate split and current vmlinux BTF.
>
> To generate distilled .BTF.base sections the associated dwarves
> patch (to be applied on the "next" branch there) is needed.
> Without it, things will still work but bpf_testmod will not be built
> with a .BTF.base section.
>
> Changes since RFC [2]:
>
> - updated terminology; we replace clunky "base reference" BTF with
>   distilling base BTF into a .BTF.base section. Similarly BTF
>   reconciliation becomes BTF relocation (Andrii, most patches)
> - add distilled base BTF by default for out-of-tree modules
>   (Alexei, patch 8)
> - distill algorithm updated to record size of embedded struct/union
>   by recording it as a 0-vlen STRUCT/UNION with size preserved
>   (Andrii, patch 2)
> - verify size match on relocation for such STRUCT/UNIONs (Andrii,
>   patch 9)
> - with embedded STRUCT/UNION recording size, we can have bpftool
>   dump a header representation using .BTF.base + .BTF sections
>   rather than special-casing and refusing to use "format c" for
>   that case (patch 5)
> - match enum with enum64 and vice versa (Andrii, patch 9)
> - ensure that resolve_btfids works with BTF without .BTF.base
>   section (patch 7)
> - update tests to cover embedded types, arrays and function
>   prototypes (patches 3, 12)
>
> One change not made yet is adding anonymous struct/unions that the split
> BTF references in base BTF to the module instead of adding them to the
> .BTF.base section.  That would involve having to maintain two pipes for
> writing BTF, one for the .BTF.base and one for the split BTF.  It would
> be possible, but there are I think some edge cases that might make it
> tricky.  For example consider a split BTF reference to a base BTF
> ARRAY which in turn referenced an anonymous STRUCT as type.  In such a
> case, it wouldn't make sense to have the array in the .BTF.base section
> while having the STRUCT in the module.  The general concern is that once

Hm.. not really? ARRAY is a reference type (and anonymous at that), so
it would have to stay in module's BTF, no? I'll go read the patch
series again, but let me know if I'm missing something.

> we move a type to the module we would need to also ensure any base types
> that refer to it move there too.  For now it is I think simpler to
> retain the existing split/base type classifications.

We would have to finalize this part before landing, as it has big
implications on the relocation process.


>
> [1] https://lore.kernel.org/bpf/20231112124834.388735-14-alan.maguire@oracle.com/
> [2] https://lore.kernel.org/bpf/20240322102455.98558-1-alan.maguire@oracle.com/
>
>
>
> Alan Maguire (13):
>   libbpf: add support to btf__add_fwd() for ENUM64
>   libbpf: add btf__distill_base() creating split BTF with distilled base
>     BTF
>   selftests/bpf: test distilled base, split BTF generation
>   libbpf: add btf__parse_opts() API for flexible BTF parsing
>   bpftool: support displaying raw split BTF using base BTF section as
>     base
>   kbuild,bpf: switch to using --btf_features for pahole v1.26 and later
>   resolve_btfids: use .BTF.base ELF section as base BTF if -B option is
>     used
>   kbuild, bpf: add module-specific pahole/resolve_btfids flags for
>     distilled base BTF
>   libbpf: split BTF relocation
>   module, bpf: store BTF base pointer in struct module
>   libbpf,bpf: share BTF relocate-related code with kernel
>   selftests/bpf: extend distilled BTF tests to cover BTF relocation
>   bpftool: support displaying relocated-with-base split BTF
>
>  include/linux/btf.h                           |  32 +
>  include/linux/module.h                        |   2 +
>  kernel/bpf/Makefile                           |   8 +
>  kernel/bpf/btf.c                              | 227 +++++--
>  kernel/module/main.c                          |   5 +-
>  scripts/Makefile.btf                          |  12 +-
>  scripts/Makefile.modfinal                     |   4 +-
>  .../bpf/bpftool/Documentation/bpftool-btf.rst |  15 +-
>  tools/bpf/bpftool/bash-completion/bpftool     |   7 +-
>  tools/bpf/bpftool/btf.c                       |  20 +-
>  tools/bpf/bpftool/main.c                      |  14 +-
>  tools/bpf/bpftool/main.h                      |   2 +
>  tools/bpf/resolve_btfids/main.c               |  22 +-
>  tools/lib/bpf/Build                           |   2 +-
>  tools/lib/bpf/btf.c                           | 561 +++++++++++-----
>  tools/lib/bpf/btf.h                           |  61 ++
>  tools/lib/bpf/btf_common.c                    | 146 ++++
>  tools/lib/bpf/btf_relocate.c                  | 630 ++++++++++++++++++
>  tools/lib/bpf/libbpf.map                      |   3 +
>  tools/lib/bpf/libbpf_internal.h               |   2 +
>  .../selftests/bpf/prog_tests/btf_distill.c    | 298 +++++++++
>  21 files changed, 1864 insertions(+), 209 deletions(-)
>  create mode 100644 tools/lib/bpf/btf_common.c
>  create mode 100644 tools/lib/bpf/btf_relocate.c
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_distill.c
>
> --
> 2.31.1
>


* Re: [PATCH v2 bpf-next 01/13] libbpf: add support to btf__add_fwd() for ENUM64
  2024-04-24 15:47 ` [PATCH v2 bpf-next 01/13] libbpf: add support to btf__add_fwd() for ENUM64 Alan Maguire
@ 2024-04-26 22:56   ` Andrii Nakryiko
  0 siblings, 0 replies; 43+ messages in thread
From: Andrii Nakryiko @ 2024-04-26 22:56 UTC (permalink / raw)
  To: Alan Maguire
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On Wed, Apr 24, 2024 at 8:48 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> Forward declaration of BTF_KIND_ENUM64 is added by supporting BTF_FWD_ENUM64
> as an enumerated value for btf_fwd_kind; an ENUM64 forward is an 8-byte
> signed enum64 with no values.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  tools/lib/bpf/btf.c | 7 ++++++-
>  tools/lib/bpf/btf.h | 1 +
>  2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 2d0840ef599a..44afae098369 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -2418,7 +2418,7 @@ int btf__add_enum64_value(struct btf *btf, const char *name, __u64 value)
>   * Append new BTF_KIND_FWD type with:
>   *   - *name*, non-empty/non-NULL name;
>   *   - *fwd_kind*, kind of forward declaration, one of BTF_FWD_STRUCT,
> - *     BTF_FWD_UNION, or BTF_FWD_ENUM;
> + *     BTF_FWD_UNION, BTF_FWD_ENUM or BTF_FWD_ENUM64;
>   * Returns:
>   *   - >0, type ID of newly added BTF type;
>   *   - <0, on error.
> @@ -2446,6 +2446,11 @@ int btf__add_fwd(struct btf *btf, const char *name, enum btf_fwd_kind fwd_kind)
>                  * values; we also assume a standard 4-byte size for it
>                  */
>                 return btf__add_enum(btf, name, sizeof(int));
> +       case BTF_FWD_ENUM64:
> +               /* enum64 forward is similarly just an enum64 with no enum
> +                * values; assume 8 byte size, signed.
> +                */
> +               return btf__add_enum64(btf, name, sizeof(__u64), true);
>         default:
>                 return libbpf_err(-EINVAL);
>         }
> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> index 8e6880d91c84..47d3e00b25c7 100644
> --- a/tools/lib/bpf/btf.h
> +++ b/tools/lib/bpf/btf.h
> @@ -194,6 +194,7 @@ enum btf_fwd_kind {
>         BTF_FWD_STRUCT = 0,
>         BTF_FWD_UNION = 1,
>         BTF_FWD_ENUM = 2,
> +       BTF_FWD_ENUM64 = 3,

One can argue that BTF_FWD_ENUM64 isn't necessary if we allow
BTF_KIND_ENUM and BTF_KIND_ENUM64 to be used interchangeably (when
resolving such fwd declarations). The "original" enum can record size 8,
so you can have enum64 forwarding with just BTF_KIND_ENUM vlen=0
size=8? Would that be sufficient?
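
To make the question above concrete, here is a minimal, self-contained sketch (not libbpf code) of how a zero-value ENUM with size 8 could stand in for an ENUM64 forward if the two kinds are treated as interchangeable during resolution. The `toy_*` names and kind constants are hypothetical, and comparing sizes as part of the match is an assumption of this sketch rather than something the patch specifies.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a BTF type record; real BTF
 * packs kind and vlen into the info field of struct btf_type.
 */
enum toy_kind { TOY_KIND_ENUM, TOY_KIND_ENUM64, TOY_KIND_STRUCT };

struct toy_type {
	enum toy_kind kind;
	const char *name;
	unsigned int vlen;	/* number of enumerators/members */
	unsigned int size;	/* size in bytes */
};

static bool toy_is_any_enum(const struct toy_type *t)
{
	return t->kind == TOY_KIND_ENUM || t->kind == TOY_KIND_ENUM64;
}

/* An enum "forward" here is an enum of either kind with no values. */
static bool toy_is_enum_fwd(const struct toy_type *t)
{
	return toy_is_any_enum(t) && t->vlen == 0;
}

/* Match a forward against a candidate: both must be enums of either
 * kind with the same name and (assumption) the same size, so an
 * ENUM vlen=0 size=8 can resolve to a full ENUM64.
 */
static bool toy_enum_fwd_matches(const struct toy_type *fwd,
				 const struct toy_type *cand)
{
	if (!toy_is_enum_fwd(fwd) || !toy_is_any_enum(cand))
		return false;
	return fwd->size == cand->size && strcmp(fwd->name, cand->name) == 0;
}
```

With matching done this way, the forward's own kind carries no information beyond "some enum", which is why a separate BTF_FWD_ENUM64 value might be avoidable.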


>  };
>
>  LIBBPF_API int btf__add_fwd(struct btf *btf, const char *name, enum btf_fwd_kind fwd_kind);
> --
> 2.31.1
>


* Re: [PATCH v2 bpf-next 02/13] libbpf: add btf__distill_base() creating split BTF with distilled base BTF
  2024-04-24 15:47 ` [PATCH v2 bpf-next 02/13] libbpf: add btf__distill_base() creating split BTF with distilled base BTF Alan Maguire
@ 2024-04-26 22:57   ` Andrii Nakryiko
  2024-04-30 23:06   ` Eduard Zingerman
  1 sibling, 0 replies; 43+ messages in thread
From: Andrii Nakryiko @ 2024-04-26 22:57 UTC (permalink / raw)
  To: Alan Maguire
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On Wed, Apr 24, 2024 at 8:48 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> To support more robust split BTF, adding supplemental context for the
> base BTF type ids that split BTF refers to is required.  Without such
> references, a simple shuffling of base BTF type ids (without any other
> significant change) invalidates the split BTF.  Here the attempt is made
> to store additional context to make split BTF more robust.
>
> This context comes in the form of distilled base BTF - this base BTF
> constitutes the minimal BTF representation needed to disambiguate split BTF
> references to base BTF.  The rules are as follows:
>
> - INT, FLOAT are recorded in full.
> - if a named base BTF STRUCT or UNION is referred to from split BTF, it
>   will be encoded either as a zero-member sized STRUCT/UNION (preserving
>   size for later relocation checks) or as a named FWD.  Only base BTF
>   STRUCT/UNIONs that are embedded in split BTF STRUCT/UNIONs need to
>   preserve size information, so a FWD representation will be used in
>   most cases.
> - if an ENUM[64] is named, a ENUM[64] forward representation (an ENUM[64]
>   with no values) is used.
> - if a STRUCT, UNION, ENUM or ENUM64 is not named, it is recorded in full.
> - base BTF reference types like CONST, RESTRICT, TYPEDEF, PTR are recorded
>   as-is.
>
> Avoiding struct/union/enum/enum64 expansion is important to keep the
> distilled base BTF representation to a minimum size; however anonymous
> struct, union and enum[64] types are represented in full since type details
> are needed to disambiguate the reference - the name is not enough in those
> cases since there is no name.  In practice these are rare; in sample
> cases where reference base BTF was generated for in-tree kernel modules,
> only a few were needed in distilled base BTF.  These represent the
> anonymous struct/unions that are used by the module but were de-duplicated
> to use base vmlinux BTF ids instead.
>
> When successful, new representations of the distilled base BTF and new
> split BTF that refers to it are returned.  Both need to be freed by the
> caller.
>
> So to take a simple example, with split BTF with a type referring
> to "struct sk_buff", we will generate base reference BTF with a
> FWD struct sk_buff, and the split BTF will refer to it instead.
>
> Tools like pahole can utilize such split BTF to popuate the .BTF section

typo: populate

> (split BTF) and an additional .BTF.base section.
> Then when the split BTF is loaded, the distilled base BTF can be used
> to relocate split BTF to reference the current - and possibly changed -
> base BTF.
>
> So for example if "struct sk_buff" was id 502 when the split BTF was
> originally generated,  we can use the distilled base BTF to see that
> id 502 refers to a "struct sk_buff" and replace instances of id 502
> with the current (relocated) base BTF sk_buff type id.
>
> Distilled base BTF is small; when building a kernel with all modules
> using distilled base BTF as a test, the average size for module
> distilled base BTF is 1555 bytes (standard deviation 1563).  The
> maximum distilled base BTF size across ~2700 modules was 37895 bytes.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  tools/lib/bpf/btf.c      | 316 ++++++++++++++++++++++++++++++++++++++-
>  tools/lib/bpf/btf.h      |  20 +++
>  tools/lib/bpf/libbpf.map |   1 +
>  3 files changed, 331 insertions(+), 6 deletions(-)
>

So, a few high-level notes.

1. I still think we should not add *anything* besides named
structs/unions/enums into distilled base BTF. Unless proven otherwise,
I don't see why we'd need them and complicate the kernel side. It's also
not a big complication for libbpf and your code below is like 95%
there anyway. See below about the id map.

2. I don't think we need to init id map to -1. 0 is always an
"invalid" ID in the sense that no valid type has such ID. It's
reserved for VOID and in this context could mean "not yet mapped"
right after calloc().

3. Please double-check the handling of all possible kinds (TYPE_TAG
and DECL_TAG are notably missing, if I'm not missing anything
myself).

4. we can use the same id map to remap those anonymous/copied types
from original base BTF into new split BTF. We just map them to higher
IDs (and append them to split BTF at the end). So we'll have a few
interesting cases (for id map):

  a) id == 0, not yet mapped/visited/irrelevant
  b) id < btf__type_cnt(base_btf) -- remapped base BTF type in distilled BTF
  c) id >= btf__type_cnt(base_btf) -- remapped base BTF type appended
to new split BTF (because it is anonymous or can't exist in distilled
base BTF)

remapping is trivial in this case.

5. it's minor, but it feels wasteful to spend 4 bytes per type
just to record the "embedded" flag; we can just set the highest bit to 1 for
such IDs and account for that in the logic I described above and
in remapping overall. Again, it's minor, but it feels wrong to allocate half
a megabyte (my kernel has 130K types) just for those few bits.

So, I think you are really close, let's try to iterate on this (both
discussion and implementation) quickly and get it over the finish
line.
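
Notes 2, 4 and 5 above can be sketched together as follows. This is a standalone illustration, not the patch's code: the `toy_*` names are hypothetical, each map entry is assumed to be a single 32-bit word allocated with calloc() so that 0 means "not yet mapped", and the top bit is assumed to be free to carry the "embedded" flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the highest bit of a map entry is never needed for a
 * real type id, so it can carry the "embedded" flag.
 */
#define TOY_EMBEDDED_FLAG (UINT32_C(1) << 31)

enum toy_id_class {
	TOY_ID_UNMAPPED,	/* a) id == 0: not yet mapped/visited/irrelevant */
	TOY_ID_IN_DISTILLED,	/* b) id < base type count: in distilled base BTF */
	TOY_ID_IN_SPLIT,	/* c) id >= base type count: appended to split BTF */
};

static uint32_t toy_mark_embedded(uint32_t entry)
{
	return entry | TOY_EMBEDDED_FLAG;
}

static bool toy_is_embedded(uint32_t entry)
{
	return (entry & TOY_EMBEDDED_FLAG) != 0;
}

/* Strip the flag to recover the mapped id stored in the entry. */
static uint32_t toy_entry_id(uint32_t entry)
{
	return entry & ~TOY_EMBEDDED_FLAG;
}

/* Classify a map entry; nr_base_types stands in for the value of
 * btf__type_cnt(base_btf) mentioned in the notes.
 */
static enum toy_id_class toy_classify(uint32_t entry, uint32_t nr_base_types)
{
	uint32_t id = toy_entry_id(entry);

	if (id == 0)
		return TOY_ID_UNMAPPED;
	if (id < nr_base_types)
		return TOY_ID_IN_DISTILLED;
	return TOY_ID_IN_SPLIT;
}
```

Because a freshly zeroed map already reads as "unmapped", no initialization pass to -1 is needed, and the per-type cost stays at 4 bytes.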

> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 44afae098369..419cc4fa2e86 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -1771,9 +1771,8 @@ static int btf_rewrite_str(__u32 *str_off, void *ctx)
>         return 0;
>  }
>

[...]

>  static int btf_rewrite_type_ids(__u32 *type_id, void *ctx)
> @@ -5217,3 +5223,301 @@ int btf_ext_visit_str_offs(struct btf_ext *btf_ext, str_off_visit_fn visit, void
>
>         return 0;
>  }
> +
> +struct btf_distill_id {
> +       int id;
> +       bool embedded;          /* true if id refers to a struct/union in base BTF
> +                                * that is embedded in a split BTF struct/union.
> +                                */

nit: add this multi-line comment before `bool embedded;` line

> +};
> +

[...]

> +               case BTF_KIND_STRUCT:
> +               case BTF_KIND_UNION:
> +                       dist->ids[next_id].embedded = next_id > 0 &&
> +                                                     next_id <= dist->nr_base_types;

hm... if next_id >= dist->nr_base_types, you are still overwriting
some memory in dist->ids[next_id], no? And again, you are doing the wrong
< vs <= comparison against nr_base_types (I think, please prove me wrong).

> +                       return 0;
> +               default:
> +                       return 0;
> +               }
> +
> +       } while (next_id != 0);
> +
> +       return 0;
> +}
> +
> +static bool btf_is_eligible_named_fwd(const struct btf_type *t)
> +{
> +       return (btf_is_composite(t) || btf_is_any_enum(t)) && t->name_off != 0;
> +}
> +
> +static int btf_add_distilled_type_ids(__u32 *id, void *ctx)
> +{
> +       struct btf_distill *dist = ctx;
> +       struct btf_type *t = btf_type_by_id(dist->pipe.src, *id);
> +       int ret;
> +
> +       /* split BTF id, not needed */
> +       if (*id > dist->nr_base_types)

>=, no? otherwise we have access out of bounds of dist->ids array, I think
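
A tiny sketch of the boundary in question, under the assumption that nr_base_types is a type count that includes VOID (id 0), as it would be if taken from btf__type_cnt() on the base BTF. The helper name is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: nr_base_types counts base types including VOID (id 0),
 * so valid base ids are 0 .. nr_base_types - 1 and the first split id
 * is nr_base_types itself.
 */
static bool toy_is_split_id(uint32_t id, uint32_t nr_base_types)
{
	return id >= nr_base_types;
}
```

Under that assumption a `>` test would let id == nr_base_types through as if it were a base type, which is the off-by-one being flagged.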

> +               return 0;
> +       /* already added ? */
> +       if (dist->ids[*id].id >= 0)

let's use > 0 to make very clear that zero is never a valid (mapped) ID

> +               return 0;
> +       dist->ids[*id].id = *id;
> +

[...]

> +/* All split BTF ids will be shifted downwards since there are less base BTF
> + * in distilled base BTF, and for those that refer to base BTF, we use the
> + * reference map to map from original base BTF to distilled base BTF id.
> + */
> +static int btf_update_distilled_type_ids(__u32 *id, void *ctx)
> +{
> +       struct btf_distill *dist = ctx;
> +
> +       if (*id >= dist->nr_base_types)
> +               *id -= dist->diff_id;
> +       else
> +               *id = dist->ids[*id].id;
> +       return 0;
> +}
> +
> +/* Create updated /split BTF with distilled base BTF; distilled base BTF

/split -- was it supposed to be an emphasis, like "/split/" ?

> + * consists of BTF information required to clarify the types that split
> + * BTF refers to, omitting unneeded details.  Specifically it will contain
> + * base types and forward declarations of structs, unions and enumerated
> + * types, along with associated reference types like pointers, arrays etc.
> + *
> + * The only case where structs, unions or enumerated types are fully represented
> + * is when they are anonymous; in such cases, info about type content is needed
> + * to clarify type references.
> + *
> + * We return newly-created split BTF where the split BTf refers to a newly-created

BTf -> BTF

> + * distilled base BTF. Both must be freed separately by the caller.
> + *
> + * When creating the BTF representation for a module and provided with the
> + * distilled_base option, pahole will create split BTF using this API, and store
> + * the distilled base BTF in the .BTF.base.distilled section.

.BTF.base.distilled is outdated, update?

It's also kind of unusual to explain specific .BTF.base and pahole
convention. I guess it's fine to refer to pahole and .BTF.base, but
more like an example (this is minor)?

> + */
> +int btf__distill_base(const struct btf *src_btf, struct btf **new_base_btf,
> +                     struct btf **new_split_btf)
> +{
> +       struct btf *new_base = NULL, *new_split = NULL;
> +       unsigned int n = btf__type_cnt(src_btf);
> +       struct btf_distill dist = {};
> +       struct btf_type *t;
> +       __u32 i, id = 0;
> +       int ret = 0;
> +
> +       /* src BTF must be split BTF. */
> +       if (!new_base_btf || !new_split_btf || !btf__base_btf(src_btf)) {
> +               errno = EINVAL;
> +               return -EINVAL;

use `return libbpf_err(-EINVAL);` here?

> +       }
> +       new_base = btf__new_empty();
> +       if (!new_base)
> +               return -ENOMEM;

libbpf_err()

> +       dist.ids = calloc(n, sizeof(*dist.ids));
> +       if (!dist.ids) {
> +               ret = -ENOMEM;
> +               goto err_out;
> +       }
> +       for (i = 1; i < n; i++)
> +               dist.ids[i].id = -1;
> +       dist.pipe.src = src_btf;
> +       dist.pipe.dst = new_base;
> +       dist.pipe.str_off_map = hashmap__new(btf_dedup_identity_hash_fn, btf_dedup_equal_fn, NULL);
> +       if (IS_ERR(dist.pipe.str_off_map)) {
> +               ret = -ENOMEM;
> +               goto err_out;
> +       }
> +       dist.nr_base_types = btf__type_cnt(btf__base_btf(src_btf));
> +
> +       /* Pass over src split BTF; generate the list of base BTF
> +        * type ids it references; these will constitute our distilled
> +        * base BTF set.
> +        */
> +       for (i = src_btf->start_id; i < n; i++) {
> +               t = (struct btf_type *)btf__type_by_id(src_btf, i);

btf_type_by_id() exists (as internal helper) exactly to not do these casts

> +
> +               /* check if members of struct/union in split BTF refer to base BTF
> +                * struct/union; if so, we will use an empty sized struct to represent
> +                * it rather than a FWD because its size must match on later BTF
> +                * relocation.
> +                */
> +               if (btf_is_composite(t)) {
> +                       ret = btf_type_visit_type_ids(t, btf_find_embedded_composite_type_ids,
> +                                                     &dist);
> +                       if (ret < 0)
> +                               goto err_out;
> +               }
> +               ret = btf_type_visit_type_ids(t,  btf_add_distilled_type_ids, &dist);
> +               if (ret < 0)
> +                       goto err_out;
> +       }
> +       /* Next add types for each of the required references. */
> +       for (i = 1; i < src_btf->start_id; i++) {

I think you have dist.nr_base_types, let's use that as it's more explicit?

> +               if (dist.ids[i].id < 0)
> +                       continue;
> +               t = btf_type_by_id(src_btf, i);
> +
> +               if (dist.ids[i].embedded) {
> +                       /* If a named struct/union in base BTF is referenced as a type
> +                        * in split BTF without use of a pointer - i.e. as an embedded
> +                        * struct/union - add an empty struct/union preserving size
> +                        * since size must be consistent when relocating split and
> +                        * possibly changed base BTF.
> +                        */
> +                       ret = btf_add_composite(new_base, btf_kind(t),
> +                                               btf__name_by_offset(src_btf, t->name_off),

nit: look up name ahead of time (it's fine to pass zero to
btf__name_by_offset()), and use it below for btf__add_fwd() as well

> +                                               t->size);
> +               } else if (btf_is_eligible_named_fwd(t)) {
> +                       enum btf_fwd_kind fwd_kind;
> +
> +                       /* If not embedded, use a fwd for named struct/unions since we
> +                        * can match via name without any other details.
> +                        */
> +                       switch (btf_kind(t)) {
> +                       case BTF_KIND_STRUCT:
> +                               fwd_kind = BTF_FWD_STRUCT;
> +                               break;
> +                       case BTF_KIND_UNION:
> +                               fwd_kind = BTF_FWD_UNION;
> +                               break;
> +                       case BTF_KIND_ENUM:
> +                               fwd_kind = BTF_FWD_ENUM;
> +                               break;
> +                       case BTF_KIND_ENUM64:
> +                               fwd_kind = BTF_FWD_ENUM64;
> +                               break;

it feels like if you just have

case BTF_KIND_ENUM:
case BTF_KIND_ENUM64:
    fwd_kind = BTF_FWD_ENUM;
    break;

we wouldn't lose anything and wouldn't need patch #1

> +                       default:
> +                               pr_warn("unexpected kind [%u] when creating distilled base BTF.\n",
> +                                       btf_kind(t));
> +                               goto err_out;
> +                       }
> +                       ret = btf__add_fwd(new_base, btf__name_by_offset(src_btf, t->name_off),
> +                                          fwd_kind);
> +               } else {
> +                       ret = btf_add_type(&dist.pipe, t);
> +               }
> +               if (ret < 0)
> +                       goto err_out;
> +               dist.ids[i].id = ++id;
> +       }
> +       /* now create new split BTF with distilled base BTF as its base; we end up with
> +        * split BTF that has base BTF that represents enough about its base references
> +        * to allow it to be relocated with the base BTF available.
> +        */
> +       new_split = btf__new_empty_split(new_base);
> +       if (!new_split_btf) {
> +               ret = libbpf_get_error(new_split);

please don't add new uses of libbpf_get_error(), `ret = -errno`

> +               goto err_out;
> +       }
> +
> +       dist.pipe.dst = new_split;
> +       /* all split BTF ids will be shifted downwards since there are less base BTF ids
> +        * in distilled base BTF.
> +        */
> +       dist.diff_id = dist.nr_base_types - btf__type_cnt(new_base);
> +
> +       /* First add all split types */
> +       for (i = src_btf->start_id; i < n; i++) {
> +               t = btf_type_by_id(src_btf, i);
> +               ret = btf_add_type(&dist.pipe, t);
> +               if (ret < 0)
> +                       goto err_out;
> +       }
> +       n = btf__type_cnt(new_split);
> +       /* Now update base/split BTF ids. */
> +       for (i = 1; i < n; i++) {
> +               t = btf_type_by_id(new_split, i);
> +
> +               ret = btf_type_visit_type_ids(t,  btf_update_distilled_type_ids, &dist);
> +               if (ret < 0)
> +                       goto err_out;
> +       }
> +       free(dist.ids);
> +       hashmap__free(dist.pipe.str_off_map);
> +       *new_base_btf = new_base;
> +       *new_split_btf = new_split;
> +       return 0;
> +err_out:
> +       free(dist.ids);
> +       hashmap__free(dist.pipe.str_off_map);
> +       btf__free(new_split);
> +       btf__free(new_base);
> +       errno = -ret;
> +       return ret;

libbpf_err(ret), but also s/ret/err/, it is literally error value or
zero (for success)
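
For readers unfamiliar with the convention being asked for, here is a self-contained sketch of the pattern: a helper mirrors a negative return value into errno and hands it back unchanged, so every error path becomes a single `return` statement. `toy_err()` and `toy_distill()` are hypothetical stand-ins written for this example, not the libbpf functions themselves.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper modeled on the convention requested above: on a
 * negative result, set errno to its absolute value, then return the
 * result unchanged.
 */
static int toy_err(int ret)
{
	if (ret < 0)
		errno = -ret;
	return ret;
}

/* Example caller: argument validation reports -EINVAL through the
 * helper instead of setting errno by hand.
 */
static int toy_distill(const void *src, void **new_base, void **new_split)
{
	int err = 0;

	if (!src || !new_base || !new_split)
		return toy_err(-EINVAL);

	/* ... real work would go here, setting err on failure ... */

	return toy_err(err);
}
```

Naming the variable `err` rather than `ret` follows the same review comment: it only ever holds zero or a negative error code.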

> +}
> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> index 47d3e00b25c7..025ed28b7fe8 100644
> --- a/tools/lib/bpf/btf.h
> +++ b/tools/lib/bpf/btf.h
> @@ -107,6 +107,26 @@ LIBBPF_API struct btf *btf__new_empty(void);
>   */
>  LIBBPF_API struct btf *btf__new_empty_split(struct btf *base_btf);
>
> +/**
> + * @brief **btf__distill_base()** creates new versions of the split BTF
> + * *src_btf* and its base BTF.  The new base BTF will only contain the types

nit: extra spaces after '.'

> + * needed to improve robustness of the split BTF to small changes in base BTF.
> + * When that split BTF is loaded against a (possibly changed) base, this
> + * distilled base BTF will help update references to that (possibly changed)
> + * base BTF.
> + *
> + * Both the new split and its associated new base BTF must be freed by
> + * the caller.
> + *
> + * If successful, 0 is returned and **new_base_btf** and **new_split_btf**
> + * will point at new base/split BTF.  Both the new split and its associated

nit: extra spaces after '.'

> + * new base BTF must be freed by the caller.
> + *
> + * A negative value is returned on error.
> + */
> +LIBBPF_API int btf__distill_base(const struct btf *src_btf, struct btf **new_base_btf,
> +                                struct btf **new_split_btf);
> +
>  LIBBPF_API struct btf *btf__parse(const char *path, struct btf_ext **btf_ext);
>  LIBBPF_API struct btf *btf__parse_split(const char *path, struct btf *base_btf);
>  LIBBPF_API struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext);
> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index c1ce8aa3520b..c4d9bd7d3220 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -420,6 +420,7 @@ LIBBPF_1.4.0 {
>  LIBBPF_1.5.0 {
>         global:
>                 bpf_program__attach_sockmap;
> +               btf__distill_base;

nit: '_' orders before 'p'


>                 ring__consume_n;
>                 ring_buffer__consume_n;
>  } LIBBPF_1.4.0;
> --
> 2.31.1
>


* Re: [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF
  2024-04-26 22:56 ` [PATCH v2 bpf-next 00/13] bpf: support resilient " Andrii Nakryiko
@ 2024-04-27  0:24   ` Andrii Nakryiko
  2024-04-29 15:25     ` Alan Maguire
  0 siblings, 1 reply; 43+ messages in thread
From: Andrii Nakryiko @ 2024-04-27  0:24 UTC (permalink / raw)
  To: Alan Maguire
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On Fri, Apr 26, 2024 at 3:56 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Wed, Apr 24, 2024 at 8:48 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >
> > Split BPF Type Format (BTF) provides huge advantages in that kernel
> > modules only have to provide type information for types that they do not
> > share with the core kernel; for core kernel types, split BTF refers to
> > core kernel BTF type ids.  So for a STRUCT sk_buff, a module that
> > uses that structure (or a pointer to it) simply needs to refer to the
> > core kernel type id, saving the need to define the structure and its many
> > dependents.  This cuts down on duplication and makes BTF as compact
> > as possible.
> >
> > However, there is a downside.  This scheme requires the references from
> > split BTF to base BTF to be valid not just at encoding time, but at use
> > time (when the module is loaded).  Even a small change in kernel types
> > can perturb the type ids in core kernel BTF, and due to pahole's
> > parallel processing of compilation units, even an unchanged kernel can
> > have different type ids if BTF is re-generated.  So we have a robustness
> > problem for split BTF for cases where a module is not always compiled at
> > the same time as the kernel.  This problem is particularly acute for
> > distros which generally want module builders to be able to compile a
> > module for the lifetime of a Linux stable-based release, and have it
> > continue to be valid over the lifetime of that release, even as changes
> > in data structures (and hence BTF types) accrue.  Today it's not
> > possible to generate BTF for modules that works beyond the initial
> > kernel it is compiled against - kernel bugfixes etc invalidate the split
> > BTF references to vmlinux BTF, and BTF is no longer usable for the
> > module.
> >
> > The goal of this series is to provide options to provide additional
> > context for cases like this.  That context comes in the form of
> > distilled base BTF; it stands in for the base BTF, and contains
> > information about the types referenced from split BTF, but not their
> > full descriptions.  The modified split BTF will refer to type ids in
> > this .BTF.base section, and when the kernel loads such modules it
> > will use that base BTF to map references from split BTF to the
> > current vmlinux BTF - a process of relocating split BTF with the
> > currently-running kernel's vmlinux base BTF.
> >
> > A module builder - using this series along with the pahole changes -
> > can then build a module with distilled base BTF via an out-of-tree
> > module build, i.e.
> >
> > make -C . M=path/2/module
> >
> > The module will have a .BTF section (the split BTF) and a
> > .BTF.base section.  The latter is small in size - distilled base
> > BTF does not need full struct/union/enum information for named
> > types for example.  For 2667 modules built with distilled base BTF,
> > the average size observed was 1556 bytes (stddev 1563).
> >
> > Note that for the in-tree modules, this approach is not needed as
> > split and base BTF in the case of in-tree modules are always built
> > and re-built together.
> >
> > The series first focuses on generating split BTF with distilled base
> > BTF, and provides btf__parse_opts() which allows specification
> > of the section name from which to read BTF data, since we now have
> > both .BTF and .BTF.base sections that can contain such data.
> >
> > Then we add support to resolve_btfids for generating the .BTF.ids
> > section with reference to the .BTF.base section - this ensures the
> > .BTF.ids match those used in the split/base BTF.
> >
> > Finally the series provides the mechanism for relocating split BTF with
> > a new base; the distilled base BTF is used to map the references to base
> > BTF in the split BTF to the new base.  For the kernel, this relocation
> > process happens at module load time, and we relocate split BTF
> > references to point at types in the current vmlinux BTF.  As part of
> > this, .BTF.ids references need to be mapped also.
> >
> > So concretely, what happens is
> >
> > - we generate split BTF in the .BTF section of a module that refers to
> >   types in the .BTF.base section as base types; these are not full
> >   type descriptions but provide information about the base type.  So
> >   a STRUCT sk_buff would be represented as a FWD struct sk_buff in
> >   distilled base BTF for example.
> > - when the module is loaded, the split BTF is relocated with vmlinux
> >   BTF; in the case of the FWD struct sk_buff, we find the STRUCT sk_buff
> >   in vmlinux BTF and map all split BTF references to the distilled base
> >   FWD sk_buff, replacing them with references to the vmlinux BTF
> >   STRUCT sk_buff.
> >
> > Support is also added to bpftool to be able to display split BTF
> > relative to its .BTF.base section, and also to display the relocated
> > form via the "-R path_to_base_btf".
> >
> > A previous approach to this problem [1] utilized standalone BTF for such
> > cases - where the BTF is not defined relative to base BTF so there is no
> > relocation required.  The problem with that approach is that from
> > the verifier perspective, some types are special, and having a custom
> > representation of a core kernel type that did not necessarily match the
> > current representation is not tenable.  So the approach taken here was
> > to preserve the split BTF model while minimizing the representation of
> > the context needed to relocate split and current vmlinux BTF.
> >
> > To generate distilled .BTF.base sections the associated dwarves
> > patch (to be applied on the "next" branch there) is needed.
> > Without it, things will still work but bpf_testmod will not be built
> > with a .BTF.base section.
> >
> > Changes since RFC [2]:
> >
> > - updated terminology; we replace clunky "base reference" BTF with
> >   distilling base BTF into a .BTF.base section. Similarly BTF
> > >   reconciliation becomes BTF relocation (Andrii, most patches)
> > - add distilled base BTF by default for out-of-tree modules
> >   (Alexei, patch 8)
> > - distill algorithm updated to record size of embedded struct/union
> >   by recording it as a 0-vlen STRUCT/UNION with size preserved
> >   (Andrii, patch 2)
> > - verify size match on relocation for such STRUCT/UNIONs (Andrii,
> >   patch 9)
> > - with embedded STRUCT/UNION recording size, we can have bpftool
> >   dump a header representation using .BTF.base + .BTF sections
> >   rather than special-casing and refusing to use "format c" for
> >   that case (patch 5)
> > - match enum with enum64 and vice versa (Andrii, patch 9)
> > - ensure that resolve_btfids works with BTF without .BTF.base
> >   section (patch 7)
> > - update tests to cover embedded types, arrays and function
> >   prototypes (patches 3, 12)
> >
> > One change not made yet is adding anonymous struct/unions that the split
> > BTF references in base BTF to the module instead of adding them to the
> > .BTF.base section.  That would involve having to maintain two pipes for
> > writing BTF, one for the .BTF.base and one for the split BTF.  It would
> > be possible, but there are I think some edge cases that might make it
> > tricky.  For example consider a split BTF reference to a base BTF
> > ARRAY which in turn referenced an anonymous STRUCT as type.  In such a
> > case, it wouldn't make sense to have the array in the .BTF.base section
> > while having the STRUCT in the module.  The general concern is that once
>
> Hm.. not really? ARRAY is a reference type (and anonymous at that), so
> it would have to stay in module's BTF, no? I'll go read the patch
> series again, but let me know if I'm missing something.
>
> > we move a type to the module we would need to also ensure any base types
> > that refer to it move there too.  For now it is I think simpler to
> > retain the existing split/base type classifications.
>
> We would have to finalize this part before landing, as it has big
> implications on the relocation process.

Ran out of time, sorry, will continue on Monday. But please consider,
meanwhile, what I mentioned about only having named
structs/unions/enums in distilled base BTF.

>
>
> >
> > [1] https://lore.kernel.org/bpf/20231112124834.388735-14-alan.maguire@oracle.com/
> > [2] https://lore.kernel.org/bpf/20240322102455.98558-1-alan.maguire@oracle.com/
> >
> >
> >
> > Alan Maguire (13):
> >   libbpf: add support to btf__add_fwd() for ENUM64
> >   libbpf: add btf__distill_base() creating split BTF with distilled base
> >     BTF
> >   selftests/bpf: test distilled base, split BTF generation
> >   libbpf: add btf__parse_opts() API for flexible BTF parsing
> >   bpftool: support displaying raw split BTF using base BTF section as
> >     base
> >   kbuild,bpf: switch to using --btf_features for pahole v1.26 and later
> >   resolve_btfids: use .BTF.base ELF section as base BTF if -B option is
> >     used
> >   kbuild, bpf: add module-specific pahole/resolve_btfids flags for
> >     distilled base BTF
> >   libbpf: split BTF relocation
> >   module, bpf: store BTF base pointer in struct module
> >   libbpf,bpf: share BTF relocate-related code with kernel
> >   selftests/bpf: extend distilled BTF tests to cover BTF relocation
> >   bpftool: support displaying relocated-with-base split BTF
> >
> >  include/linux/btf.h                           |  32 +
> >  include/linux/module.h                        |   2 +
> >  kernel/bpf/Makefile                           |   8 +
> >  kernel/bpf/btf.c                              | 227 +++++--
> >  kernel/module/main.c                          |   5 +-
> >  scripts/Makefile.btf                          |  12 +-
> >  scripts/Makefile.modfinal                     |   4 +-
> >  .../bpf/bpftool/Documentation/bpftool-btf.rst |  15 +-
> >  tools/bpf/bpftool/bash-completion/bpftool     |   7 +-
> >  tools/bpf/bpftool/btf.c                       |  20 +-
> >  tools/bpf/bpftool/main.c                      |  14 +-
> >  tools/bpf/bpftool/main.h                      |   2 +
> >  tools/bpf/resolve_btfids/main.c               |  22 +-
> >  tools/lib/bpf/Build                           |   2 +-
> >  tools/lib/bpf/btf.c                           | 561 +++++++++++-----
> >  tools/lib/bpf/btf.h                           |  61 ++
> >  tools/lib/bpf/btf_common.c                    | 146 ++++
> >  tools/lib/bpf/btf_relocate.c                  | 630 ++++++++++++++++++
> >  tools/lib/bpf/libbpf.map                      |   3 +
> >  tools/lib/bpf/libbpf_internal.h               |   2 +
> >  .../selftests/bpf/prog_tests/btf_distill.c    | 298 +++++++++
> >  21 files changed, 1864 insertions(+), 209 deletions(-)
> >  create mode 100644 tools/lib/bpf/btf_common.c
> >  create mode 100644 tools/lib/bpf/btf_relocate.c
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_distill.c
> >
> > --
> > 2.31.1
> >

* Re: [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF
  2024-04-27  0:24   ` Andrii Nakryiko
@ 2024-04-29 15:25     ` Alan Maguire
  2024-04-29 17:05       ` Andrii Nakryiko
  0 siblings, 1 reply; 43+ messages in thread
From: Alan Maguire @ 2024-04-29 15:25 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On 27/04/2024 01:24, Andrii Nakryiko wrote:
> On Fri, Apr 26, 2024 at 3:56 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
>>
>> On Wed, Apr 24, 2024 at 8:48 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>>
>>> Split BPF Type Format (BTF) provides huge advantages in that kernel
>>> modules only have to provide type information for types that they do not
>>> share with the core kernel; for core kernel types, split BTF refers to
>>> core kernel BTF type ids.  So for a STRUCT sk_buff, a module that
>>> uses that structure (or a pointer to it) simply needs to refer to the
>>> core kernel type id, saving the need to define the structure and its many
>>> dependents.  This cuts down on duplication and makes BTF as compact
>>> as possible.
>>>
>>> However, there is a downside.  This scheme requires the references from
>>> split BTF to base BTF to be valid not just at encoding time, but at use
>>> time (when the module is loaded).  Even a small change in kernel types
>>> can perturb the type ids in core kernel BTF, and due to pahole's
>>> parallel processing of compilation units, even an unchanged kernel can
>>> have different type ids if BTF is re-generated.  So we have a robustness
>>> problem for split BTF for cases where a module is not always compiled at
>>> the same time as the kernel.  This problem is particularly acute for
>>> distros which generally want module builders to be able to compile a
>>> module for the lifetime of a Linux stable-based release, and have it
>>> continue to be valid over the lifetime of that release, even as changes
>>> in data structures (and hence BTF types) accrue.  Today it's not
>>> possible to generate BTF for modules that works beyond the initial
>>> kernel it is compiled against - kernel bugfixes etc invalidate the split
>>> BTF references to vmlinux BTF, and BTF is no longer usable for the
>>> module.
>>>
>>> The goal of this series is to provide options to provide additional
>>> context for cases like this.  That context comes in the form of
>>> distilled base BTF; it stands in for the base BTF, and contains
>>> information about the types referenced from split BTF, but not their
>>> full descriptions.  The modified split BTF will refer to type ids in
>>> this .BTF.base section, and when the kernel loads such modules it
>>> will use that base BTF to map references from split BTF to the
>>> current vmlinux BTF - a process of relocating split BTF with the
>>> currently-running kernel's vmlinux base BTF.
>>>
>>> A module builder - using this series along with the pahole changes -
>>> can then build a module with distilled base BTF via an out-of-tree
>>> module build, i.e.
>>>
>>> make -C . M=path/2/module
>>>
>>> The module will have a .BTF section (the split BTF) and a
>>> .BTF.base section.  The latter is small in size - distilled base
>>> BTF does not need full struct/union/enum information for named
>>> types for example.  For 2667 modules built with distilled base BTF,
>>> the average size observed was 1556 bytes (stddev 1563).
>>>
>>> Note that for the in-tree modules, this approach is not needed as
>>> split and base BTF in the case of in-tree modules are always built
>>> and re-built together.
>>>
>>> The series first focuses on generating split BTF with distilled base
>>> BTF, and provides btf__parse_opts() which allows specification
>>> of the section name from which to read BTF data, since we now have
>>> both .BTF and .BTF.base sections that can contain such data.
>>>
>>> Then we add support to resolve_btfids for generating the .BTF.ids
>>> section with reference to the .BTF.base section - this ensures the
>>> .BTF.ids match those used in the split/base BTF.
>>>
>>> Finally the series provides the mechanism for relocating split BTF with
>>> a new base; the distilled base BTF is used to map the references to base
>>> BTF in the split BTF to the new base.  For the kernel, this relocation
>>> process happens at module load time, and we relocate split BTF
>>> references to point at types in the current vmlinux BTF.  As part of
>>> this, .BTF.ids references need to be mapped also.
>>>
>>> So concretely, what happens is
>>>
>>> - we generate split BTF in the .BTF section of a module that refers to
>>>   types in the .BTF.base section as base types; these are not full
>>>   type descriptions but provide information about the base type.  So
>>>   a STRUCT sk_buff would be represented as a FWD struct sk_buff in
>>>   distilled base BTF for example.
>>> - when the module is loaded, the split BTF is relocated with vmlinux
>>>   BTF; in the case of the FWD struct sk_buff, we find the STRUCT sk_buff
>>>   in vmlinux BTF and map all split BTF references to the distilled base
>>>   FWD sk_buff, replacing them with references to the vmlinux BTF
>>>   STRUCT sk_buff.
>>>
>>> Support is also added to bpftool to be able to display split BTF
>>> relative to its .BTF.base section, and also to display the relocated
>>> form via the "-R path_to_base_btf".
>>>
>>> A previous approach to this problem [1] utilized standalone BTF for such
>>> cases - where the BTF is not defined relative to base BTF so there is no
>>> relocation required.  The problem with that approach is that from
>>> the verifier perspective, some types are special, and having a custom
>>> representation of a core kernel type that did not necessarily match the
>>> current representation is not tenable.  So the approach taken here was
>>> to preserve the split BTF model while minimizing the representation of
>>> the context needed to relocate split and current vmlinux BTF.
>>>
>>> To generate distilled .BTF.base sections the associated dwarves
>>> patch (to be applied on the "next" branch there) is needed.
>>> Without it, things will still work but bpf_testmod will not be built
>>> with a .BTF.base section.
>>>
>>> Changes since RFC [2]:
>>>
>>> - updated terminology; we replace clunky "base reference" BTF with
>>>   distilling base BTF into a .BTF.base section. Similarly BTF
>>>   reconciliation becomes BTF relocation (Andrii, most patches)
>>> - add distilled base BTF by default for out-of-tree modules
>>>   (Alexei, patch 8)
>>> - distill algorithm updated to record size of embedded struct/union
>>>   by recording it as a 0-vlen STRUCT/UNION with size preserved
>>>   (Andrii, patch 2)
>>> - verify size match on relocation for such STRUCT/UNIONs (Andrii,
>>>   patch 9)
>>> - with embedded STRUCT/UNION recording size, we can have bpftool
>>>   dump a header representation using .BTF.base + .BTF sections
>>>   rather than special-casing and refusing to use "format c" for
>>>   that case (patch 5)
>>> - match enum with enum64 and vice versa (Andrii, patch 9)
>>> - ensure that resolve_btfids works with BTF without .BTF.base
>>>   section (patch 7)
>>> - update tests to cover embedded types, arrays and function
>>>   prototypes (patches 3, 12)
>>>
>>> One change not made yet is adding anonymous struct/unions that the split
>>> BTF references in base BTF to the module instead of adding them to the
>>> .BTF.base section.  That would involve having to maintain two pipes for
>>> writing BTF, one for the .BTF.base and one for the split BTF.  It would
>>> be possible, but there are I think some edge cases that might make it
>>> tricky.  For example consider a split BTF reference to a base BTF
>>> ARRAY which in turn referenced an anonymous STRUCT as type.  In such a
>>> case, it wouldn't make sense to have the array in the .BTF.base section
>>> while having the STRUCT in the module.  The general concern is that once
>>
>> Hm.. not really? ARRAY is a reference type (and anonymous at that), so
>> it would have to stay in module's BTF, no? I'll go read the patch
>> series again, but let me know if I'm missing something.
>>

The way things currently work, we preserve all relationships prior to
distilling base BTF. That is, if a type was in split BTF prior to
calling btf__distill_base(), it will stay in split BTF afterwards. Ditto
for base types. This is true for reference types as well as named types.
So in the case of the above array for example, prior to distilling types
it is in base BTF. If it in turn then referred to a base anonymous
struct, both would be in the base and thus the distilled base BTF. In
the above case, I was suggesting the array itself was referred to from
split BTF, but not in split BTF; sorry if that wasn't clear.
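
To make the ARRAY case concrete, here is a rough toy model in Python. It
is only a sketch under stated assumptions, not the btf__distill_base()
implementation: the type table, the ids and the base_types_needed()
helper are all hypothetical, and it does not model replacing named
types with FWDs. It only shows that a base ARRAY referenced from split
BTF, and the anonymous base STRUCT it refers to, both stay classified
as base types.

```python
# Toy model: each type is (kind, name, refs), keyed by type id.
# Ids below SPLIT_START are base BTF; the rest are split BTF.
# All names and ids here are hypothetical, for illustration only.
SPLIT_START = 4

types = {
    1: ("INT", "int", []),
    2: ("STRUCT", None, [1]),          # anonymous base struct
    3: ("ARRAY", None, [2, 1]),        # base array: element type, index type
    4: ("STRUCT", "mod_state", [3]),   # split struct that uses the array
}


def base_types_needed(types, split_start):
    """Return the base ids reachable from split types.

    The split/base classification is left unchanged: a base type stays
    a base type even when only split BTF refers to it.
    """
    needed = set()
    stack = [r for tid, (_, _, refs) in types.items()
             if tid >= split_start for r in refs]
    while stack:
        tid = stack.pop()
        if tid >= split_start or tid in needed:
            continue
        needed.add(tid)
        stack.extend(types[tid][2])
    return needed


needed = base_types_needed(types, SPLIT_START)
print(sorted(needed))
```

In this model the ARRAY (id 3) and the anonymous STRUCT (id 2) both end
up in the set of base types that the distilled base has to describe.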

So the problem comes if we moved the anon struct to the module; then we
also need to move types that depend on it there. This means we'd need to
make the move recursive. That seems doable; the only question is around
the logistics and the effects of doing so. At one extreme we might end
up with something that resembles standalone BTF (many/most types in the
split BTF). That seems unlikely in most cases. I examined one module's
BTF base for example, and the only anon structs arose from typedef
references possible_net_t, sockptr_t, rwlock_t and atomic_t. These in
turn were only referenced once elsewhere in distilled base BTF; a
sockptr was in a FUNC_PROTO, but aside from that the typedefs were not
otherwise referenced in distilled base BTF; they were referenced in
split BTF as embedded struct field types.

So moving all of this to the split BTF seems possible; what I think we
probably need to think on a bit is how to handle relocation.  Is there a
need to relocate these module types too, or can we live with having
duplicate atomic_t/sockptr_t typedefs in the module? Currently
relocation is simplified by the fact that we only need to relocate the
types prior to the module's start id. All we need to do is rewrite type
references in split BTF to base ids. If we were relocating split types
too we'd need to remove them from split BTF.
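
A minimal sketch of that id rewriting, as a toy Python model rather than
the libbpf/kernel relocation code. The relocate() helper, the type
tables and the ids are hypothetical. Matching purely by name and
shifting split ids to follow the new base are assumptions of this
model; it also ignores name clashes between kinds and the size checks
mentioned in the cover letter.

```python
def relocate(split_types, distilled_base, new_base, old_start):
    """Rewrite split BTF references from distilled base ids to new base ids.

    split_types:    {id: (kind, name, refs)}, ids >= old_start
    distilled_base: {id: (kind, name)}, ids < old_start
    new_base:       {id: (kind, name)}, e.g. the running kernel's base BTF
    """
    # Match by name only: a FWD in the distilled base may map to a full
    # STRUCT in the new base.
    by_name = {name: tid for tid, (_, name) in new_base.items()}
    id_map = {}
    for tid, (kind, name) in distilled_base.items():
        if name not in by_name:
            raise LookupError(f"no match for {kind} {name}")
        id_map[tid] = by_name[name]

    # Split ids start right after the base ids, so they shift when the
    # base changes size (an assumption of this toy model).
    shift = (max(new_base) + 1) - old_start
    relocated = {}
    for tid, (kind, name, refs) in split_types.items():
        new_refs = [id_map[r] if r < old_start else r + shift for r in refs]
        relocated[tid + shift] = (kind, name, new_refs)
    return relocated


distilled_base = {1: ("INT", "int"), 2: ("FWD", "sk_buff")}
new_base = {
    1: ("INT", "long"),
    2: ("INT", "int"),
    3: ("STRUCT", "task_struct"),
    4: ("STRUCT", "sk_buff"),
}
split_types = {
    3: ("PTR", None, [2]),               # pointer to FWD sk_buff
    4: ("STRUCT", "mod_state", [3, 1]),  # uses the pointer and int
}
relocated = relocate(split_types, distilled_base, new_base, old_start=3)
print(relocated)
```

Only references to ids below the module's start id go through the
name-based map; the split types themselves are never looked up in the
new base, which is what keeps this step simple.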

>>> we move a type to the module we would need to also ensure any base types
>>> that refer to it move there too.  For now it is I think simpler to
>>> retain the existing split/base type classifications.
>>
>> We would have to finalize this part before landing, as it has big
>> implications on the relocation process.
> 
> Ran out of time, sorry, will continue on Monday. But please consider,
> meanwhile, what I mentioned about only having named
> structs/unions/enums in distilled base BTF.
>

Sure, I'll dig into it further. FWIW I agree with the goal of moving
anonymous structs/unions if it's doable. I can't see any blocking issues
thus far.

>>
>>
>>>
>>> [1] https://lore.kernel.org/bpf/20231112124834.388735-14-alan.maguire@oracle.com/
>>> [2] https://lore.kernel.org/bpf/20240322102455.98558-1-alan.maguire@oracle.com/
>>>
>>>
>>>
>>> Alan Maguire (13):
>>>   libbpf: add support to btf__add_fwd() for ENUM64
>>>   libbpf: add btf__distill_base() creating split BTF with distilled base
>>>     BTF
>>>   selftests/bpf: test distilled base, split BTF generation
>>>   libbpf: add btf__parse_opts() API for flexible BTF parsing
>>>   bpftool: support displaying raw split BTF using base BTF section as
>>>     base
>>>   kbuild,bpf: switch to using --btf_features for pahole v1.26 and later
>>>   resolve_btfids: use .BTF.base ELF section as base BTF if -B option is
>>>     used
>>>   kbuild, bpf: add module-specific pahole/resolve_btfids flags for
>>>     distilled base BTF
>>>   libbpf: split BTF relocation
>>>   module, bpf: store BTF base pointer in struct module
>>>   libbpf,bpf: share BTF relocate-related code with kernel
>>>   selftests/bpf: extend distilled BTF tests to cover BTF relocation
>>>   bpftool: support displaying relocated-with-base split BTF
>>>
>>>  include/linux/btf.h                           |  32 +
>>>  include/linux/module.h                        |   2 +
>>>  kernel/bpf/Makefile                           |   8 +
>>>  kernel/bpf/btf.c                              | 227 +++++--
>>>  kernel/module/main.c                          |   5 +-
>>>  scripts/Makefile.btf                          |  12 +-
>>>  scripts/Makefile.modfinal                     |   4 +-
>>>  .../bpf/bpftool/Documentation/bpftool-btf.rst |  15 +-
>>>  tools/bpf/bpftool/bash-completion/bpftool     |   7 +-
>>>  tools/bpf/bpftool/btf.c                       |  20 +-
>>>  tools/bpf/bpftool/main.c                      |  14 +-
>>>  tools/bpf/bpftool/main.h                      |   2 +
>>>  tools/bpf/resolve_btfids/main.c               |  22 +-
>>>  tools/lib/bpf/Build                           |   2 +-
>>>  tools/lib/bpf/btf.c                           | 561 +++++++++++-----
>>>  tools/lib/bpf/btf.h                           |  61 ++
>>>  tools/lib/bpf/btf_common.c                    | 146 ++++
>>>  tools/lib/bpf/btf_relocate.c                  | 630 ++++++++++++++++++
>>>  tools/lib/bpf/libbpf.map                      |   3 +
>>>  tools/lib/bpf/libbpf_internal.h               |   2 +
>>>  .../selftests/bpf/prog_tests/btf_distill.c    | 298 +++++++++
>>>  21 files changed, 1864 insertions(+), 209 deletions(-)
>>>  create mode 100644 tools/lib/bpf/btf_common.c
>>>  create mode 100644 tools/lib/bpf/btf_relocate.c
>>>  create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_distill.c
>>>
>>> --
>>> 2.31.1
>>>
> 

* Re: [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF
  2024-04-29 15:25     ` Alan Maguire
@ 2024-04-29 17:05       ` Andrii Nakryiko
  2024-04-29 17:31         ` Alan Maguire
  0 siblings, 1 reply; 43+ messages in thread
From: Andrii Nakryiko @ 2024-04-29 17:05 UTC (permalink / raw)
  To: Alan Maguire
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On Mon, Apr 29, 2024 at 8:25 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On 27/04/2024 01:24, Andrii Nakryiko wrote:
> > On Fri, Apr 26, 2024 at 3:56 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> >>
> >> On Wed, Apr 24, 2024 at 8:48 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >>>
> >>> Split BPF Type Format (BTF) provides huge advantages in that kernel
> >>> modules only have to provide type information for types that they do not
> >>> share with the core kernel; for core kernel types, split BTF refers to
> >>> core kernel BTF type ids.  So for a STRUCT sk_buff, a module that
> >>> uses that structure (or a pointer to it) simply needs to refer to the
> >>> core kernel type id, saving the need to define the structure and its many
> >>> dependents.  This cuts down on duplication and makes BTF as compact
> >>> as possible.
> >>>
> >>> However, there is a downside.  This scheme requires the references from
> >>> split BTF to base BTF to be valid not just at encoding time, but at use
> >>> time (when the module is loaded).  Even a small change in kernel types
> >>> can perturb the type ids in core kernel BTF, and due to pahole's
> >>> parallel processing of compilation units, even an unchanged kernel can
> >>> have different type ids if BTF is re-generated.  So we have a robustness
> >>> problem for split BTF for cases where a module is not always compiled at
> >>> the same time as the kernel.  This problem is particularly acute for
> >>> distros which generally want module builders to be able to compile a
> >>> module for the lifetime of a Linux stable-based release, and have it
> >>> continue to be valid over the lifetime of that release, even as changes
> >>> in data structures (and hence BTF types) accrue.  Today it's not
> >>> possible to generate BTF for modules that works beyond the initial
> >>> kernel it is compiled against - kernel bugfixes etc invalidate the split
> >>> BTF references to vmlinux BTF, and BTF is no longer usable for the
> >>> module.
> >>>
> >>> The goal of this series is to provide options to provide additional
> >>> context for cases like this.  That context comes in the form of
> >>> distilled base BTF; it stands in for the base BTF, and contains
> >>> information about the types referenced from split BTF, but not their
> >>> full descriptions.  The modified split BTF will refer to type ids in
> >>> this .BTF.base section, and when the kernel loads such modules it
> >>> will use that base BTF to map references from split BTF to the
> >>> current vmlinux BTF - a process of relocating split BTF with the
> >>> currently-running kernel's vmlinux base BTF.
> >>>
> >>> A module builder - using this series along with the pahole changes -
> >>> can then build a module with distilled base BTF via an out-of-tree
> >>> module build, i.e.
> >>>
> >>> make -C . M=path/2/module
> >>>
> >>> The module will have a .BTF section (the split BTF) and a
> >>> .BTF.base section.  The latter is small in size - distilled base
> >>> BTF does not need full struct/union/enum information for named
> >>> types for example.  For 2667 modules built with distilled base BTF,
> >>> the average size observed was 1556 bytes (stddev 1563).
> >>>
> >>> Note that for the in-tree modules, this approach is not needed as
> >>> split and base BTF in the case of in-tree modules are always built
> >>> and re-built together.
> >>>
> >>> The series first focuses on generating split BTF with distilled base
> >>> BTF, and provides btf__parse_opts() which allows specification
> >>> of the section name from which to read BTF data, since we now have
> >>> both .BTF and .BTF.base sections that can contain such data.
> >>>
> >>> Then we add support to resolve_btfids for generating the .BTF.ids
> >>> section with reference to the .BTF.base section - this ensures the
> >>> .BTF.ids match those used in the split/base BTF.
> >>>
> >>> Finally the series provides the mechanism for relocating split BTF with
> >>> a new base; the distilled base BTF is used to map the references to base
> >>> BTF in the split BTF to the new base.  For the kernel, this relocation
> >>> process happens at module load time, and we relocate split BTF
> >>> references to point at types in the current vmlinux BTF.  As part of
> >>> this, .BTF.ids references need to be mapped also.
> >>>
> >>> So concretely, what happens is
> >>>
> >>> - we generate split BTF in the .BTF section of a module that refers to
> >>>   types in the .BTF.base section as base types; these are not full
> >>>   type descriptions but provide information about the base type.  So
> >>>   a STRUCT sk_buff would be represented as a FWD struct sk_buff in
> >>>   distilled base BTF for example.
> >>> - when the module is loaded, the split BTF is relocated with vmlinux
> >>>   BTF; in the case of the FWD struct sk_buff, we find the STRUCT sk_buff
> >>>   in vmlinux BTF and map all split BTF references to the distilled base
> >>>   FWD sk_buff, replacing them with references to the vmlinux BTF
> >>>   STRUCT sk_buff.
> >>>
> >>> Support is also added to bpftool to be able to display split BTF
> >>> relative to its .BTF.base section, and also to display the relocated
> >>> form via the "-R path_to_base_btf".
> >>>
> >>> A previous approach to this problem [1] utilized standalone BTF for such
> >>> cases - where the BTF is not defined relative to base BTF so there is no
> >>> relocation required.  The problem with that approach is that from
> >>> the verifier perspective, some types are special, and having a custom
> >>> representation of a core kernel type that did not necessarily match the
> >>> current representation is not tenable.  So the approach taken here was
> >>> to preserve the split BTF model while minimizing the representation of
> >>> the context needed to relocate split and current vmlinux BTF.
> >>>
> >>> To generate distilled .BTF.base sections the associated dwarves
> >>> patch (to be applied on the "next" branch there) is needed.
> >>> Without it, things will still work but bpf_testmod will not be built
> >>> with a .BTF.base section.
> >>>
> >>> Changes since RFC [2]:
> >>>
> >>> - updated terminology; we replace clunky "base reference" BTF with
> >>>   distilling base BTF into a .BTF.base section. Similarly BTF
> >>>   reconciliation becomes BTF relocation (Andrii, most patches)
> >>> - add distilled base BTF by default for out-of-tree modules
> >>>   (Alexei, patch 8)
> >>> - distill algorithm updated to record size of embedded struct/union
> >>>   by recording it as a 0-vlen STRUCT/UNION with size preserved
> >>>   (Andrii, patch 2)
> >>> - verify size match on relocation for such STRUCT/UNIONs (Andrii,
> >>>   patch 9)
> >>> - with embedded STRUCT/UNION recording size, we can have bpftool
> >>>   dump a header representation using .BTF.base + .BTF sections
> >>>   rather than special-casing and refusing to use "format c" for
> >>>   that case (patch 5)
> >>> - match enum with enum64 and vice versa (Andrii, patch 9)
> >>> - ensure that resolve_btfids works with BTF without .BTF.base
> >>>   section (patch 7)
> >>> - update tests to cover embedded types, arrays and function
> >>>   prototypes (patches 3, 12)
> >>>
> >>> One change not made yet is adding anonymous struct/unions that the split
> >>> BTF references in base BTF to the module instead of adding them to the
> >>> .BTF.base section.  That would involve having to maintain two pipes for
> >>> writing BTF, one for the .BTF.base and one for the split BTF.  It would
> >>> be possible, but there are I think some edge cases that might make it
> >>> tricky.  For example consider a split BTF reference to a base BTF
> >>> ARRAY which in turn referenced an anonymous STRUCT as type.  In such a
> >>> case, it wouldn't make sense to have the array in the .BTF.base section
> >>> while having the STRUCT in the module.  The general concern is that once
> >>
> >> Hm.. not really? ARRAY is a reference type (and anonymous at that), so
> >> it would have to stay in module's BTF, no? I'll go read the patch
> >> series again, but let me know if I'm missing something.
> >>
>
> The way things currently work, we preserve all relationships prior to
> distilling base BTF. That is, if a type was in split BTF prior to
> calling btf__distill_base(), it will stay in split BTF afterwards. Ditto
> for base types. This is true for reference types as well as named types.
> So in the case of the above array for example, prior to distilling types
> it is in base BTF. If it in turn then referred to a base anonymous
> struct, both would be in the base and thus the distilled base BTF. In
> the above case, I was suggesting the array itself was referred to from
> split BTF, but not in split BTF; sorry if that wasn't clear.
>
> So the problem comes if we moved the anon struct to the module; then we
> also need to move types that depend on it there. This means we'd need to
> make the move recursive. That seems doable; the only question is around

Yep, it should be very doable. We just mark everything used from
"to-be-moved-to-new-split-BTF" types recursively, unless it's a
"qualified named type", where we stop. You have a pass to mark
embedded types; here it might be another pass to mark
"used-by-split-BTF-types-but-not-distillable" types.

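The marking pass could be sketched roughly as follows, as a toy Python
model and not code from the series. The mark_for_move() helper, the
type table and the ids are hypothetical, and treating named
STRUCT/UNION/ENUM/ENUM64 as the "qualified named types" where recursion
stops is an assumption.

```python
# Kinds that stay in distilled base BTF when they have a name
# (assumption for this sketch).
NAMED_KINDS = {"STRUCT", "UNION", "ENUM", "ENUM64"}


def mark_for_move(types, split_start):
    """Return ids of base types that would be copied into split BTF.

    types: {id: (kind, name, refs)}; ids >= split_start are split BTF.
    """
    to_move = set()

    def visit(tid):
        if tid >= split_start or tid in to_move:
            return
        kind, name, refs = types[tid]
        if kind in NAMED_KINDS and name:
            return  # qualified named type: stop, it stays in the base
        to_move.add(tid)
        for ref in refs:
            visit(ref)

    # Start from everything that split types refer to.
    for tid, (_, _, refs) in types.items():
        if tid >= split_start:
            for ref in refs:
                visit(ref)
    return to_move


types = {
    1: ("STRUCT", "sk_buff", []),
    2: ("STRUCT", None, [1]),            # anonymous struct
    3: ("TYPEDEF", "sockptr_t", [2]),    # typedef of the anonymous struct
    4: ("STRUCT", "task_struct", []),    # not used by the module
    5: ("STRUCT", "mod_state", [3]),     # split type
}
to_move = mark_for_move(types, split_start=5)
print(sorted(to_move))
```

Here the typedef and its anonymous struct are marked for copying, while
the named struct sk_buff stays in the base and the unused task_struct
is never visited.
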
> the logistics and the effects of doing so. At one extreme we might end
> up with something that resembles standalone BTF (many/most types in the

My hypothesis is that it is very unlikely that there will be a lot of
types that have to be copied into split BTF.

> split BTF). That seems unlikely in most cases. I examined one module's
> BTF base for example, and the only anon structs arose from typedef
> references possible_net_t, sockptr_t, rwlock_t and atomic_t. These in
> turn were only referenced once elsewhere in distilled base BTF; a
> sockptr was in a FUNC_PROTO, but aside from that the typedefs were not
> otherwise referenced in distilled base BTF; they were referenced in
> split BTF as embedded struct field types.
>
> So moving all of this to the split BTF seems possible; what I think we
> probably need to think on a bit is how to handle relocation.  Is there a
> need to relocate these module types too, or can we live with having
> duplicate atomic_t/sockptr_t typedefs in the module? Currently
> relocation is simplified by the fact that we only need to relocate the
> types prior to the module's start id. All we need to do is rewrite type
> references in split BTF to base ids. If we were relocating split types
> too we'd need to remove them from split BTF.

I think anything that is not in distilled base should not be
relocated, so the current simplicity of only remapping distilled base
BTF IDs will remain. It's probably OK to have clones/copies of some
simple typedefs.

We have a few somewhat competing goals here and we need to make a
tradeoff between them:

  a) minimizing split BTF size (or rather not making it too large)
  b) making sure PTR_TO_BTF_ID types work (so module kfuncs can accept
     task_struct and others)
  c) keeping relocation simple, fast, and reliable/unambiguous

By copying anonymous types we potentially hurt a) (though presumably
not by enough to worry about), and we significantly improve c) by making
relocation simple/fast/reliable (to the extent possible with "by name"
lookups). And we (presumably) don't change b); it still works for all
existing and future cases.

If we ever need to pass anonymous typedef'ed types to a kfunc, we'll
need to think about how to represent them in distilled base BTF. But it
most probably won't be a TYPEDEF -> STRUCT chain, but rather an empty
STRUCT with the name of the original TYPEDEF + some bit to specify that
we are looking for a TYPEDEF in the real base BTF. I think we have a
path forward here, and that's the main thing, but I don't think it's a
problem worth solving now (or ever).

WDYT?

>
> >>> we move a type to the module we would need to also ensure any base types
> >>> that refer to it move there too.  For now it is I think simpler to
> >>> retain the existing split/base type classifications.
> >>
> >> We would have to finalize this part before landing, as it has big
> >> implications on the relocation process.
> >
> > Ran out of time, sorry, will continue on Monday. But please consider,
> > meanwhile, what I mentioned about only having named
> > structs/unions/enums in distilled base BTF.
> >
>
> Sure, I'll dig into it further. FWIW I agree with the goal of moving
> anonymous structs/unions if it's doable. I can't see any blocking issues
> thus far.

Yep, please give it a go, and I'll try to finish the review today, thanks.

>
> >>
> >>
> >>>
> >>> [1] https://lore.kernel.org/bpf/20231112124834.388735-14-alan.maguire@oracle.com/
> >>> [2] https://lore.kernel.org/bpf/20240322102455.98558-1-alan.maguire@oracle.com/
> >>>
> >>>
> >>>
> >>> Alan Maguire (13):
> >>>   libbpf: add support to btf__add_fwd() for ENUM64
> >>>   libbpf: add btf__distill_base() creating split BTF with distilled base
> >>>     BTF
> >>>   selftests/bpf: test distilled base, split BTF generation
> >>>   libbpf: add btf__parse_opts() API for flexible BTF parsing
> >>>   bpftool: support displaying raw split BTF using base BTF section as
> >>>     base
> >>>   kbuild,bpf: switch to using --btf_features for pahole v1.26 and later
> >>>   resolve_btfids: use .BTF.base ELF section as base BTF if -B option is
> >>>     used
> >>>   kbuild, bpf: add module-specific pahole/resolve_btfids flags for
> >>>     distilled base BTF
> >>>   libbpf: split BTF relocation
> >>>   module, bpf: store BTF base pointer in struct module
> >>>   libbpf,bpf: share BTF relocate-related code with kernel
> >>>   selftests/bpf: extend distilled BTF tests to cover BTF relocation
> >>>   bpftool: support displaying relocated-with-base split BTF
> >>>
> >>>  include/linux/btf.h                           |  32 +
> >>>  include/linux/module.h                        |   2 +
> >>>  kernel/bpf/Makefile                           |   8 +
> >>>  kernel/bpf/btf.c                              | 227 +++++--
> >>>  kernel/module/main.c                          |   5 +-
> >>>  scripts/Makefile.btf                          |  12 +-
> >>>  scripts/Makefile.modfinal                     |   4 +-
> >>>  .../bpf/bpftool/Documentation/bpftool-btf.rst |  15 +-
> >>>  tools/bpf/bpftool/bash-completion/bpftool     |   7 +-
> >>>  tools/bpf/bpftool/btf.c                       |  20 +-
> >>>  tools/bpf/bpftool/main.c                      |  14 +-
> >>>  tools/bpf/bpftool/main.h                      |   2 +
> >>>  tools/bpf/resolve_btfids/main.c               |  22 +-
> >>>  tools/lib/bpf/Build                           |   2 +-
> >>>  tools/lib/bpf/btf.c                           | 561 +++++++++++-----
> >>>  tools/lib/bpf/btf.h                           |  61 ++
> >>>  tools/lib/bpf/btf_common.c                    | 146 ++++
> >>>  tools/lib/bpf/btf_relocate.c                  | 630 ++++++++++++++++++
> >>>  tools/lib/bpf/libbpf.map                      |   3 +
> >>>  tools/lib/bpf/libbpf_internal.h               |   2 +
> >>>  .../selftests/bpf/prog_tests/btf_distill.c    | 298 +++++++++
> >>>  21 files changed, 1864 insertions(+), 209 deletions(-)
> >>>  create mode 100644 tools/lib/bpf/btf_common.c
> >>>  create mode 100644 tools/lib/bpf/btf_relocate.c
> >>>  create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_distill.c
> >>>
> >>> --
> >>> 2.31.1
> >>>
> >


* Re: [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF
  2024-04-29 17:05       ` Andrii Nakryiko
@ 2024-04-29 17:31         ` Alan Maguire
  2024-04-29 18:02           ` Andrii Nakryiko
  0 siblings, 1 reply; 43+ messages in thread
From: Alan Maguire @ 2024-04-29 17:31 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On 29/04/2024 18:05, Andrii Nakryiko wrote:
> On Mon, Apr 29, 2024 at 8:25 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>
>> On 27/04/2024 01:24, Andrii Nakryiko wrote:
>>> On Fri, Apr 26, 2024 at 3:56 PM Andrii Nakryiko
>>> <andrii.nakryiko@gmail.com> wrote:
>>>>
>>>> On Wed, Apr 24, 2024 at 8:48 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>>>>
>>>>> Split BPF Type Format (BTF) provides huge advantages in that kernel
>>>>> modules only have to provide type information for types that they do not
>>>>> share with the core kernel; for core kernel types, split BTF refers to
>>>>> core kernel BTF type ids.  So for a STRUCT sk_buff, a module that
>>>>> uses that structure (or a pointer to it) simply needs to refer to the
>>>>> core kernel type id, saving the need to define the structure and its many
>>>>> dependents.  This cuts down on duplication and makes BTF as compact
>>>>> as possible.
>>>>>
>>>>> However, there is a downside.  This scheme requires the references from
>>>>> split BTF to base BTF to be valid not just at encoding time, but at use
>>>>> time (when the module is loaded).  Even a small change in kernel types
>>>>> can perturb the type ids in core kernel BTF, and due to pahole's
>>>>> parallel processing of compilation units, even an unchanged kernel can
>>>>> have different type ids if BTF is re-generated.  So we have a robustness
>>>>> problem for split BTF for cases where a module is not always compiled at
>>>>> the same time as the kernel.  This problem is particularly acute for
>>>>> distros which generally want module builders to be able to compile a
>>>>> module for the lifetime of a Linux stable-based release, and have it
>>>>> continue to be valid over the lifetime of that release, even as changes
>>>>> in data structures (and hence BTF types) accrue.  Today it's not
>>>>> possible to generate BTF for modules that works beyond the initial
>>>>> kernel it is compiled against - kernel bugfixes etc invalidate the split
>>>>> BTF references to vmlinux BTF, and BTF is no longer usable for the
>>>>> module.
>>>>>
>>>>> The goal of this series is to provide options to provide additional
>>>>> context for cases like this.  That context comes in the form of
>>>>> distilled base BTF; it stands in for the base BTF, and contains
>>>>> information about the types referenced from split BTF, but not their
>>>>> full descriptions.  The modified split BTF will refer to type ids in
>>>>> this .BTF.base section, and when the kernel loads such modules it
>>>>> will use that base BTF to map references from split BTF to the
>>>>> current vmlinux BTF - a process of relocating split BTF with the
>>>>> currently-running kernel's vmlinux base BTF.
>>>>>
>>>>> A module builder - using this series along with the pahole changes -
>>>>> can then build a module with distilled base BTF via an out-of-tree
>>>>> module build, i.e.
>>>>>
>>>>> make -C . M=path/2/module
>>>>>
>>>>> The module will have a .BTF section (the split BTF) and a
>>>>> .BTF.base section.  The latter is small in size - distilled base
>>>>> BTF does not need full struct/union/enum information for named
>>>>> types for example.  For 2667 modules built with distilled base BTF,
>>>>> the average size observed was 1556 bytes (stddev 1563).
>>>>>
>>>>> Note that for the in-tree modules, this approach is not needed as
>>>>> split and base BTF in the case of in-tree modules are always built
>>>>> and re-built together.
>>>>>
>>>>> The series first focuses on generating split BTF with distilled base
>>>>> BTF, and provides btf__parse_opts() which allows specification
>>>>> of the section name from which to read BTF data, since we now have
>>>>> both .BTF and .BTF.base sections that can contain such data.
>>>>>
>>>>> Then we add support to resolve_btfids for generating the .BTF.ids
>>>>> section with reference to the .BTF.base section - this ensures the
>>>>> .BTF.ids match those used in the split/base BTF.
>>>>>
>>>>> Finally the series provides the mechanism for relocating split BTF with
>>>>> a new base; the distilled base BTF is used to map the references to base
>>>>> BTF in the split BTF to the new base.  For the kernel, this relocation
>>>>> process happens at module load time, and we relocate split BTF
>>>>> references to point at types in the current vmlinux BTF.  As part of
>>>>> this, .BTF.ids references need to be mapped also.
>>>>>
>>>>> So concretely, what happens is
>>>>>
>>>>> - we generate split BTF in the .BTF section of a module that refers to
>>>>>   types in the .BTF.base section as base types; these are not full
>>>>>   type descriptions but provide information about the base type.  So
>>>>>   a STRUCT sk_buff would be represented as a FWD struct sk_buff in
>>>>>   distilled base BTF for example.
>>>>> - when the module is loaded, the split BTF is relocated with vmlinux
>>>>>   BTF; in the case of the FWD struct sk_buff, we find the STRUCT sk_buff
>>>>>   in vmlinux BTF and map all split BTF references to the distilled base
>>>>>   FWD sk_buff, replacing them with references to the vmlinux BTF
>>>>>   STRUCT sk_buff.
>>>>>
>>>>> Support is also added to bpftool to be able to display split BTF
>>>>> relative to its .BTF.base section, and also to display the relocated
>>>>> form via the "-R path_to_base_btf".
>>>>>
>>>>> A previous approach to this problem [1] utilized standalone BTF for such
>>>>> cases - where the BTF is not defined relative to base BTF so there is no
>>>>> relocation required.  The problem with that approach is that from
>>>>> the verifier perspective, some types are special, and having a custom
>>>>> representation of a core kernel type that did not necessarily match the
>>>>> current representation is not tenable.  So the approach taken here was
>>>>> to preserve the split BTF model while minimizing the representation of
>>>>> the context needed to relocate split and current vmlinux BTF.
>>>>>
>>>>> To generate distilled .BTF.base sections the associated dwarves
>>>>> patch (to be applied on the "next" branch there) is needed.
>>>>> Without it, things will still work but bpf_testmod will not be built
>>>>> with a .BTF.base section.
>>>>>
>>>>> Changes since RFC [2]:
>>>>>
>>>>> - updated terminology; we replace clunky "base reference" BTF with
>>>>>   distilling base BTF into a .BTF.base section. Similarly BTF
>>>>>   reconciliation becomes BTF relocation (Andrii, most patches)
>>>>> - add distilled base BTF by default for out-of-tree modules
>>>>>   (Alexei, patch 8)
>>>>> - distill algorithm updated to record size of embedded struct/union
>>>>>   by recording it as a 0-vlen STRUCT/UNION with size preserved
>>>>>   (Andrii, patch 2)
>>>>> - verify size match on relocation for such STRUCT/UNIONs (Andrii,
>>>>>   patch 9)
>>>>> - with embedded STRUCT/UNION recording size, we can have bpftool
>>>>>   dump a header representation using .BTF.base + .BTF sections
>>>>>   rather than special-casing and refusing to use "format c" for
>>>>>   that case (patch 5)
>>>>> - match enum with enum64 and vice versa (Andrii, patch 9)
>>>>> - ensure that resolve_btfids works with BTF without .BTF.base
>>>>>   section (patch 7)
>>>>> - update tests to cover embedded types, arrays and function
>>>>>   prototypes (patches 3, 12)
>>>>>
>>>>> One change not made yet is adding anonymous struct/unions that the split
>>>>> BTF references in base BTF to the module instead of adding them to the
>>>>> .BTF.base section.  That would involve having to maintain two pipes for
>>>>> writing BTF, one for the .BTF.base and one for the split BTF.  It would
>>>>> be possible, but there are I think some edge cases that might make it
>>>>> tricky.  For example consider a split BTF reference to a base BTF
>>>>> ARRAY which in turn referenced an anonymous STRUCT as type.  In such a
>>>>> case, it wouldn't make sense to have the array in the .BTF.base section
>>>>> while having the STRUCT in the module.  The general concern is that once
>>>>
>>>> Hm.. not really? ARRAY is a reference type (and anonymous at that), so
>>>> it would have to stay in module's BTF, no? I'll go read the patch
>>>> series again, but let me know if I'm missing something.
>>>>
>>
>> The way things currently work, we preserve all relationships prior to
>> distilling base BTF. That is, if a type was in split BTF prior to
>> calling btf__distill_base(), it will stay in split BTF afterwards. Ditto
>> for base types. This is true for reference types as well as named types.
>> So in the case of the above array for example, prior to distilling types
>> it is in base BTF. If it in turn then referred to a base anonymous
>> struct, both would be in the base and thus the distilled base BTF. In
>> the above case, I was suggesting the array itself was referred to from
>> split BTF, but not in split BTF, sorry if that wasn't clear.
>>
>> So the problem comes if we moved the anon struct to the module; then we
>> also need to move types that depend on it there. This means we'd need to
>> make the move recursive. That seems doable; the only question is around
> 
> Yep, it should be very doable. We just mark everything used from
> "to-be-moved-to-new-split-BTF" types recursively, unless it's
> "qualified named type", where we stop. You have a pass to mark
> embedded types, here it might be another pass to mark
> "used-by-split-BTF-types-but-not-distillable" types.
> 
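
A minimal sketch of the marking pass described above, under stated
assumptions: BTF is modelled as a plain dict of id -> type, and the names
BtfType, NAMED_KINDS, is_distillable() and classify() are hypothetical and
not part of libbpf. It treats named INT/FLOAT/STRUCT/UNION/ENUM types as
"distillable" (the walk stops there) and copies everything else, including
typedefs and anonymous structs, into split BTF together with what they use.

```python
from dataclasses import dataclass, field

# Kinds that can be found by name at relocation time. This set is an
# assumption of the sketch, not something the thread has settled.
NAMED_KINDS = {"STRUCT", "UNION", "ENUM", "ENUM64", "INT", "FLOAT", "FWD"}


@dataclass
class BtfType:
    kind: str
    name: str = ""                             # "" means anonymous
    refs: list = field(default_factory=list)   # ids of referenced types


def is_distillable(t):
    """A qualified named type: the recursive walk stops here."""
    return t.kind in NAMED_KINDS and t.name != ""


def classify(base, roots):
    """Split the base types reachable from roots into two id sets:
    those kept in distilled base BTF and those moved into split BTF."""
    distilled, moved = set(), set()
    stack = list(roots)
    while stack:
        tid = stack.pop()
        if tid in distilled or tid in moved:
            continue
        t = base[tid]
        if is_distillable(t):
            distilled.add(tid)       # members are deliberately not followed
        else:
            moved.add(tid)           # copy into split BTF ...
            stack.extend(t.refs)     # ... along with whatever it uses
    return distilled, moved


# Hypothetical base BTF: typedef atomic_t -> anonymous struct -> int.
base = {
    1: BtfType("INT", "int"),
    2: BtfType("STRUCT", "", [1]),
    3: BtfType("TYPEDEF", "atomic_t", [2]),
    4: BtfType("STRUCT", "sk_buff", [3, 1]),
}
# Split BTF refers to atomic_t (3) and sk_buff (4) in the base.
distilled, moved = classify(base, [3, 4])
print(sorted(distilled), sorted(moved))
```

In this toy input only the typedef and its anonymous struct are copied;
sk_buff and int stay behind as named entries to be looked up later.
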
>> the logistics and the effects of doing so. At one extreme we might end
>> up with something that resembles standalone BTF (many/most types in the
> 
> My hypothesis is that it is very unlikely that there will be a lot of
> types that have to be copied into split BTF.
> 
>> split BTF). That seems unlikely in most cases. I examined one module's
>> BTF base for example, and the only anon structs arose from typedef
>> references possible_net_t, sockptr_t, rwlock_t and atomic_t. These in
>> turn were only referenced once elsewhere in distilled base BTF; a
>> sockptr was in a FUNC_PROTO, but aside from that the typedefs were not
>> otherwise referenced in distilled base BTF, they were referenced in
>> split BTF as embedded struct field types.
>>
>> So moving all of this to the split BTF seems possible; what I think we
>> probably need to think on a bit is how to handle relocation.  Is there a
>> need to relocate these module types too, or can we live with having
>> duplicate atomic_t/sockptr_t typedefs in the module? Currently
>> relocation is simplified by the fact that we only need to relocate the
>> types prior to the module's start id. All we need to do is rewrite type
>> references in split BTF to base ids. If we were relocating split types
>> too we'd need to remove them from split BTF.
> 
> I think anything that is not in distilled base should not be
> relocated, so current simplicity in remapping distilled BTF IDs will
> remain. It's ok to have clones/copies of some simple typedefs,
> probably.
> 
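
To make the "remapping distilled BTF IDs" step concrete, here is a rough
sketch under stated assumptions. Types are plain dicts, ids start at 1 with
0 reserved for void, and name_key() and relocate() are hypothetical helpers
rather than the btf_relocate.c code from this series. It looks up each
distilled type by name in the new base, checks the size where the distilled
type recorded one, and rewrites only references below the split start id.

```python
def name_key(t):
    """Lookup key: a FWD stands in for its struct/union, and ENUM and
    ENUM64 are allowed to match each other."""
    kind = t.get("fwd_kind", t["kind"])
    family = {"ENUM64": "ENUM"}.get(kind, kind)
    return (family, t["name"])


def relocate(split, distilled, new_base):
    """Return split types with references rewritten against new_base.

    distilled and new_base are lists holding ids 1..len(list); split ids
    start at len(distilled) + 1.
    """
    by_name = {}
    for new_id, t in enumerate(new_base, start=1):
        if t["name"]:
            # First match wins in this sketch; duplicate names are ignored.
            by_name.setdefault(name_key(t), new_id)

    id_map = {0: 0}                       # void maps to void
    for old_id, t in enumerate(distilled, start=1):
        new_id = by_name.get(name_key(t))
        if new_id is None:
            raise LookupError("no match in new base for " + t["name"])
        target = new_base[new_id - 1]
        # Types that recorded a size (e.g. embedded structs) must agree.
        if "size" in t and t["size"] != target.get("size"):
            raise ValueError("size mismatch for " + t["name"])
        id_map[old_id] = new_id

    start = len(distilled) + 1
    shift = len(new_base) - len(distilled)

    def remap(i):
        # Base references go through the map; split ids just shift.
        return id_map[i] if i < start else i + shift

    return [dict(t, refs=[remap(r) for r in t["refs"]]) for t in split]
```

Types that were copied into split BTF never go through id_map here, which
is the simplicity being discussed: they are carried along and only their
own ids shift.
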
> We have a few somewhat competing goals here and we need to make a
> tradeoff between them:
> 
>   a) minimizing split BTF size (or rather not making it too large)
>   b) making sure PTR_TO_BTF_ID types work (so module kfuncs can accept
> task_struct and others)
>   c) keeping relocation simple, fast, and reliable/unambiguous
> 
> By copying anonymous types we potentially hurt a) (but presumably not
> a lot to worry about), and we significantly improve c) by making
> relocation simple/fast/reliable (to the extent possible with "by name"
> lookups). And we (presumably) don't change b), it still works for all
> existing and future cases.
>

Yeah, case b) is the only lingering concern I have, but in practice it
seems unlikely to arise. One point of clarification - we've discussed so
far mostly anonymous STRUCTs and UNIONs; do you think there are other
anonymous types we should consider, ARRAYs for example?

> If we ever need to pass anonymous typedef'ed types to kfunc, we'll
> need to think how to represent them in distilled base BTF. But it most
> probably won't be TYPEDEF -> STRUCT chain, but rather empty STRUCT
> with the name of original TYPEDEF + some bit to specify that we are
> looking for a TYPEDEF in real base BTF; I think we have a path forward
> here, and that's the main thing, but I don't think it's a problem
> worth solving now (or ever).
> 
> WDYT?

Agreed. I think (hope) it's unlikely to arise.

> 
>>
>>>>> we move a type to the module we would need to also ensure any base types
>>>>> that refer to it move there too.  For now it is I think simpler to
>>>>> retain the existing split/base type classifications.
>>>>
>>>> We would have to finalize this part before landing, as it has big
>>>> implications on the relocation process.
>>>
>>> Ran out of time, sorry, will continue on Monday. But please consider,
>>> meanwhile, what I mentioned about only having named
>>> structs/unions/enums in distilled base BTF.
>>>
>>
>> Sure, I'll dig into it further. FWIW I agree with the goal of moving
>> anonymous structs/unions if it's doable. I can't see any blocking issues
>> thus far.
> 
> Yep, please give it a go, and I'll try to finish the review today, thanks.
> 
>>
>>>>
>>>>
>>>>>
>>>>> [1] https://lore.kernel.org/bpf/20231112124834.388735-14-alan.maguire@oracle.com/
>>>>> [2] https://lore.kernel.org/bpf/20240322102455.98558-1-alan.maguire@oracle.com/
>>>>>
>>>>>
>>>>>
>>>>> Alan Maguire (13):
>>>>>   libbpf: add support to btf__add_fwd() for ENUM64
>>>>>   libbpf: add btf__distill_base() creating split BTF with distilled base
>>>>>     BTF
>>>>>   selftests/bpf: test distilled base, split BTF generation
>>>>>   libbpf: add btf__parse_opts() API for flexible BTF parsing
>>>>>   bpftool: support displaying raw split BTF using base BTF section as
>>>>>     base
>>>>>   kbuild,bpf: switch to using --btf_features for pahole v1.26 and later
>>>>>   resolve_btfids: use .BTF.base ELF section as base BTF if -B option is
>>>>>     used
>>>>>   kbuild, bpf: add module-specific pahole/resolve_btfids flags for
>>>>>     distilled base BTF
>>>>>   libbpf: split BTF relocation
>>>>>   module, bpf: store BTF base pointer in struct module
>>>>>   libbpf,bpf: share BTF relocate-related code with kernel
>>>>>   selftests/bpf: extend distilled BTF tests to cover BTF relocation
>>>>>   bpftool: support displaying relocated-with-base split BTF
>>>>>
>>>>>  include/linux/btf.h                           |  32 +
>>>>>  include/linux/module.h                        |   2 +
>>>>>  kernel/bpf/Makefile                           |   8 +
>>>>>  kernel/bpf/btf.c                              | 227 +++++--
>>>>>  kernel/module/main.c                          |   5 +-
>>>>>  scripts/Makefile.btf                          |  12 +-
>>>>>  scripts/Makefile.modfinal                     |   4 +-
>>>>>  .../bpf/bpftool/Documentation/bpftool-btf.rst |  15 +-
>>>>>  tools/bpf/bpftool/bash-completion/bpftool     |   7 +-
>>>>>  tools/bpf/bpftool/btf.c                       |  20 +-
>>>>>  tools/bpf/bpftool/main.c                      |  14 +-
>>>>>  tools/bpf/bpftool/main.h                      |   2 +
>>>>>  tools/bpf/resolve_btfids/main.c               |  22 +-
>>>>>  tools/lib/bpf/Build                           |   2 +-
>>>>>  tools/lib/bpf/btf.c                           | 561 +++++++++++-----
>>>>>  tools/lib/bpf/btf.h                           |  61 ++
>>>>>  tools/lib/bpf/btf_common.c                    | 146 ++++
>>>>>  tools/lib/bpf/btf_relocate.c                  | 630 ++++++++++++++++++
>>>>>  tools/lib/bpf/libbpf.map                      |   3 +
>>>>>  tools/lib/bpf/libbpf_internal.h               |   2 +
>>>>>  .../selftests/bpf/prog_tests/btf_distill.c    | 298 +++++++++
>>>>>  21 files changed, 1864 insertions(+), 209 deletions(-)
>>>>>  create mode 100644 tools/lib/bpf/btf_common.c
>>>>>  create mode 100644 tools/lib/bpf/btf_relocate.c
>>>>>  create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_distill.c
>>>>>
>>>>> --
>>>>> 2.31.1
>>>>>
>>>


* Re: [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF
  2024-04-29 17:31         ` Alan Maguire
@ 2024-04-29 18:02           ` Andrii Nakryiko
  0 siblings, 0 replies; 43+ messages in thread
From: Andrii Nakryiko @ 2024-04-29 18:02 UTC (permalink / raw)
  To: Alan Maguire
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On Mon, Apr 29, 2024 at 10:31 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On 29/04/2024 18:05, Andrii Nakryiko wrote:
> > On Mon, Apr 29, 2024 at 8:25 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >>
> >> On 27/04/2024 01:24, Andrii Nakryiko wrote:
> >>> On Fri, Apr 26, 2024 at 3:56 PM Andrii Nakryiko
> >>> <andrii.nakryiko@gmail.com> wrote:
> >>>>
> >>>> On Wed, Apr 24, 2024 at 8:48 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >>>>>
> >>>>> Split BPF Type Format (BTF) provides huge advantages in that kernel
> >>>>> modules only have to provide type information for types that they do not
> >>>>> share with the core kernel; for core kernel types, split BTF refers to
> >>>>> core kernel BTF type ids.  So for a STRUCT sk_buff, a module that
> >>>>> uses that structure (or a pointer to it) simply needs to refer to the
> >>>>> core kernel type id, saving the need to define the structure and its many
> >>>>> dependents.  This cuts down on duplication and makes BTF as compact
> >>>>> as possible.
> >>>>>
> >>>>> However, there is a downside.  This scheme requires the references from
> >>>>> split BTF to base BTF to be valid not just at encoding time, but at use
> >>>>> time (when the module is loaded).  Even a small change in kernel types
> >>>>> can perturb the type ids in core kernel BTF, and due to pahole's
> >>>>> parallel processing of compilation units, even an unchanged kernel can
> >>>>> have different type ids if BTF is re-generated.  So we have a robustness
> >>>>> problem for split BTF for cases where a module is not always compiled at
> >>>>> the same time as the kernel.  This problem is particularly acute for
> >>>>> distros which generally want module builders to be able to compile a
> >>>>> module for the lifetime of a Linux stable-based release, and have it
> >>>>> continue to be valid over the lifetime of that release, even as changes
> >>>>> in data structures (and hence BTF types) accrue.  Today it's not
> >>>>> possible to generate BTF for modules that works beyond the initial
> >>>>> kernel it is compiled against - kernel bugfixes etc invalidate the split
> >>>>> BTF references to vmlinux BTF, and BTF is no longer usable for the
> >>>>> module.
> >>>>>
> >>>>> The goal of this series is to provide options to provide additional
> >>>>> context for cases like this.  That context comes in the form of
> >>>>> distilled base BTF; it stands in for the base BTF, and contains
> >>>>> information about the types referenced from split BTF, but not their
> >>>>> full descriptions.  The modified split BTF will refer to type ids in
> >>>>> this .BTF.base section, and when the kernel loads such modules it
> >>>>> will use that base BTF to map references from split BTF to the
> >>>>> current vmlinux BTF - a process of relocating split BTF with the
> >>>>> currently-running kernel's vmlinux base BTF.
> >>>>>
> >>>>> A module builder - using this series along with the pahole changes -
> >>>>> can then build a module with distilled base BTF via an out-of-tree
> >>>>> module build, i.e.
> >>>>>
> >>>>> make -C . M=path/2/module
> >>>>>
> >>>>> The module will have a .BTF section (the split BTF) and a
> >>>>> .BTF.base section.  The latter is small in size - distilled base
> >>>>> BTF does not need full struct/union/enum information for named
> >>>>> types for example.  For 2667 modules built with distilled base BTF,
> >>>>> the average size observed was 1556 bytes (stddev 1563).
> >>>>>
> >>>>> Note that for the in-tree modules, this approach is not needed as
> >>>>> split and base BTF in the case of in-tree modules are always built
> >>>>> and re-built together.
> >>>>>
> >>>>> The series first focuses on generating split BTF with distilled base
> >>>>> BTF, and provides btf__parse_opts() which allows specification
> >>>>> of the section name from which to read BTF data, since we now have
> >>>>> both .BTF and .BTF.base sections that can contain such data.
> >>>>>
> >>>>> Then we add support to resolve_btfids for generating the .BTF.ids
> >>>>> section with reference to the .BTF.base section - this ensures the
> >>>>> .BTF.ids match those used in the split/base BTF.
> >>>>>
> >>>>> Finally the series provides the mechanism for relocating split BTF with
> >>>>> a new base; the distilled base BTF is used to map the references to base
> >>>>> BTF in the split BTF to the new base.  For the kernel, this relocation
> >>>>> process happens at module load time, and we relocate split BTF
> >>>>> references to point at types in the current vmlinux BTF.  As part of
> >>>>> this, .BTF.ids references need to be mapped also.
> >>>>>
> >>>>> So concretely, what happens is
> >>>>>
> >>>>> - we generate split BTF in the .BTF section of a module that refers to
> >>>>>   types in the .BTF.base section as base types; these are not full
> >>>>>   type descriptions but provide information about the base type.  So
> >>>>>   a STRUCT sk_buff would be represented as a FWD struct sk_buff in
> >>>>>   distilled base BTF for example.
> >>>>> - when the module is loaded, the split BTF is relocated with vmlinux
> >>>>>   BTF; in the case of the FWD struct sk_buff, we find the STRUCT sk_buff
> >>>>>   in vmlinux BTF and map all split BTF references to the distilled base
> >>>>>   FWD sk_buff, replacing them with references to the vmlinux BTF
> >>>>>   STRUCT sk_buff.
> >>>>>
> >>>>> Support is also added to bpftool to be able to display split BTF
> >>>>> relative to its .BTF.base section, and also to display the relocated
> >>>>> form via the "-R path_to_base_btf".
> >>>>>
> >>>>> A previous approach to this problem [1] utilized standalone BTF for such
> >>>>> cases - where the BTF is not defined relative to base BTF so there is no
> >>>>> relocation required.  The problem with that approach is that from
> >>>>> the verifier perspective, some types are special, and having a custom
> >>>>> representation of a core kernel type that did not necessarily match the
> >>>>> current representation is not tenable.  So the approach taken here was
> >>>>> to preserve the split BTF model while minimizing the representation of
> >>>>> the context needed to relocate split and current vmlinux BTF.
> >>>>>
> >>>>> To generate distilled .BTF.base sections the associated dwarves
> >>>>> patch (to be applied on the "next" branch there) is needed.
> >>>>> Without it, things will still work but bpf_testmod will not be built
> >>>>> with a .BTF.base section.
> >>>>>
> >>>>> Changes since RFC [2]:
> >>>>>
> >>>>> - updated terminology; we replace clunky "base reference" BTF with
> >>>>>   distilling base BTF into a .BTF.base section. Similarly BTF
> >>>>>   reconciliation becomes BTF relocation (Andrii, most patches)
> >>>>> - add distilled base BTF by default for out-of-tree modules
> >>>>>   (Alexei, patch 8)
> >>>>> - distill algorithm updated to record size of embedded struct/union
> >>>>>   by recording it as a 0-vlen STRUCT/UNION with size preserved
> >>>>>   (Andrii, patch 2)
> >>>>> - verify size match on relocation for such STRUCT/UNIONs (Andrii,
> >>>>>   patch 9)
> >>>>> - with embedded STRUCT/UNION recording size, we can have bpftool
> >>>>>   dump a header representation using .BTF.base + .BTF sections
> >>>>>   rather than special-casing and refusing to use "format c" for
> >>>>>   that case (patch 5)
> >>>>> - match enum with enum64 and vice versa (Andrii, patch 9)
> >>>>> - ensure that resolve_btfids works with BTF without .BTF.base
> >>>>>   section (patch 7)
> >>>>> - update tests to cover embedded types, arrays and function
> >>>>>   prototypes (patches 3, 12)
> >>>>>
> >>>>> One change not made yet is adding anonymous struct/unions that the split
> >>>>> BTF references in base BTF to the module instead of adding them to the
> >>>>> .BTF.base section.  That would involve having to maintain two pipes for
> >>>>> writing BTF, one for the .BTF.base and one for the split BTF.  It would
> >>>>> be possible, but there are I think some edge cases that might make it
> >>>>> tricky.  For example consider a split BTF reference to a base BTF
> >>>>> ARRAY which in turn referenced an anonymous STRUCT as type.  In such a
> >>>>> case, it wouldn't make sense to have the array in the .BTF.base section
> >>>>> while having the STRUCT in the module.  The general concern is that once
> >>>>
> >>>> Hm.. not really? ARRAY is a reference type (and anonymous at that), so
> >>>> it would have to stay in module's BTF, no? I'll go read the patch
> >>>> series again, but let me know if I'm missing something.
> >>>>
> >>
> >> The way things currently work, we preserve all relationships prior to
> >> distilling base BTF. That is, if a type was in split BTF prior to
> >> calling btf__distill_base(), it will stay in split BTF afterwards. Ditto
> >> for base types. This is true for reference types as well as named types.
> >> So in the case of the above array for example, prior to distilling types
> >> it is in base BTF. If it in turn then referred to a base anonymous
> >> struct, both would be in the base and thus the distilled base BTF. In
> >> the above case, I was suggesting the array itself was referred to from
> >> split BTF, but not in split BTF, sorry if that wasn't clear.
> >>
> >> So the problem comes if we moved the anon struct to the module; then we
> >> also need to move types that depend on it there. This means we'd need to
> >> make the move recursive. That seems doable; the only question is around
> >
> > Yep, it should be very doable. We just mark everything used from
> > "to-be-moved-to-new-split-BTF" types recursively, unless it's
> > "qualified named type", where we stop. You have a pass to mark
> > embedded types, here it might be another pass to mark
> > "used-by-split-BTF-types-but-not-distillable" types.
> >
> >> the logistics and the effects of doing so. At one extreme we might end
> >> up with something that resembles standalone BTF (many/most types in the
> >
> > My hypothesis is that it is very unlikely that there will be a lot of
> > types that have to be copied into split BTF.
> >
> >> split BTF). That seems unlikely in most cases. I examined one module's
> >> BTF base for example, and the only anon structs arose from typedef
> >> references possible_net_t, sockptr_t, rwlock_t and atomic_t. These in
> >> turn were only referenced once elsewhere in distilled base BTF; a
> >> sockptr was in a FUNC_PROTO, but aside from that the typedefs were not
> >> otherwise referenced in distilled base BTF, they were referenced in
> >> split BTF as embedded struct field types.
> >>
> >> So moving all of this to the split BTF seems possible; what I think we
> >> probably need to think on a bit is how to handle relocation.  Is there a
> >> need to relocate these module types too, or can we live with having
> >> duplicate atomic_t/sockptr_t typedefs in the module? Currently
> >> relocation is simplified by the fact that we only need to relocate the
> >> types prior to the module's start id. All we need to do is rewrite type
> >> references in split BTF to base ids. If we were relocating split types
> >> too we'd need to remove them from split BTF.
> >
> > I think anything that is not in distilled base should not be
> > relocated, so current simplicity in remapping distilled BTF IDs will
> > remain. It's ok to have clones/copies of some simple typedefs,
> > probably.
> >
> > We have a few somewhat competing goals here and we need to make a
> > tradeoff between them:
> >
> >   a) minimizing split BTF size (or rather not making it too large)
> >   b) making sure PTR_TO_BTF_ID types work (so module kfuncs can accept
> > task_struct and others)
> >   c) keeping relocation simple, fast, and reliable/unambiguous
> >
> > By copying anonymous types we potentially hurt a) (but presumably not
> > a lot to worry about), and we significantly improve c) by making
> > relocation simple/fast/reliable (to the extent possible with "by name"
> > lookups). And we (presumably) don't change b), it still works for all
> > existing and future cases.
> >
>
> Yeah, case b) is the only lingering concern I have, but in practice it
> seems unlikely to arise. One point of clarification - we've discussed so
> far mostly anonymous STRUCTs and UNIONs; do you think there are other
> anonymous types we should consider, ARRAYs for example?

Everything is technically possible, but I'd be surprised if anything
but STRUCT/UNION is referred to by PTR_TO_BTF_ID for kfunc. But let's
get there first.

> > If we ever need to pass anonymous typedef'ed types to kfunc, we'll
> > need to think how to represent them in distilled base BTF. But it most
> > probably won't be TYPEDEF -> STRUCT chain, but rather empty STRUCT
> > with the name of original TYPEDEF + some bit to specify that we are
> > looking for a TYPEDEF in real base BTF; I think we have a path forward
> > here, and that's the main thing, but I don't think it's a problem
> > worth solving now (or ever).
> >
> > WDYT?
>
> Agreed. I think (hope) it's unlikely to arise.
>
> >
> >>
> >>>>> we move a type to the module we would need to also ensure any base types
> >>>>> that refer to it move there too.  For now it is I think simpler to
> >>>>> retain the existing split/base type classifications.
> >>>>
> >>>> We would have to finalize this part before landing, as it has big
> >>>> implications on the relocation process.
> >>>
> >>> Ran out of time, sorry, will continue on Monday. But please consider,
> >>> meanwhile, what I mentioned about only having named
> >>> structs/unions/enums in distilled base BTF.
> >>>
> >>
> >> Sure, I'll dig into it further. FWIW I agree with the goal of moving
> >> anonymous structs/unions if it's doable. I can't see any blocking issues
> >> thus far.
> >
> > Yep, please give it a go, and I'll try to finish the review today, thanks.
> >
> >>
> >>>>
> >>>>
> >>>>>
> >>>>> [1] https://lore.kernel.org/bpf/20231112124834.388735-14-alan.maguire@oracle.com/
> >>>>> [2] https://lore.kernel.org/bpf/20240322102455.98558-1-alan.maguire@oracle.com/
> >>>>>
> >>>>>
> >>>>>
> >>>>> Alan Maguire (13):
> >>>>>   libbpf: add support to btf__add_fwd() for ENUM64
> >>>>>   libbpf: add btf__distill_base() creating split BTF with distilled base
> >>>>>     BTF
> >>>>>   selftests/bpf: test distilled base, split BTF generation
> >>>>>   libbpf: add btf__parse_opts() API for flexible BTF parsing
> >>>>>   bpftool: support displaying raw split BTF using base BTF section as
> >>>>>     base
> >>>>>   kbuild,bpf: switch to using --btf_features for pahole v1.26 and later
> >>>>>   resolve_btfids: use .BTF.base ELF section as base BTF if -B option is
> >>>>>     used
> >>>>>   kbuild, bpf: add module-specific pahole/resolve_btfids flags for
> >>>>>     distilled base BTF
> >>>>>   libbpf: split BTF relocation
> >>>>>   module, bpf: store BTF base pointer in struct module
> >>>>>   libbpf,bpf: share BTF relocate-related code with kernel
> >>>>>   selftests/bpf: extend distilled BTF tests to cover BTF relocation
> >>>>>   bpftool: support displaying relocated-with-base split BTF
> >>>>>
> >>>>>  include/linux/btf.h                           |  32 +
> >>>>>  include/linux/module.h                        |   2 +
> >>>>>  kernel/bpf/Makefile                           |   8 +
> >>>>>  kernel/bpf/btf.c                              | 227 +++++--
> >>>>>  kernel/module/main.c                          |   5 +-
> >>>>>  scripts/Makefile.btf                          |  12 +-
> >>>>>  scripts/Makefile.modfinal                     |   4 +-
> >>>>>  .../bpf/bpftool/Documentation/bpftool-btf.rst |  15 +-
> >>>>>  tools/bpf/bpftool/bash-completion/bpftool     |   7 +-
> >>>>>  tools/bpf/bpftool/btf.c                       |  20 +-
> >>>>>  tools/bpf/bpftool/main.c                      |  14 +-
> >>>>>  tools/bpf/bpftool/main.h                      |   2 +
> >>>>>  tools/bpf/resolve_btfids/main.c               |  22 +-
> >>>>>  tools/lib/bpf/Build                           |   2 +-
> >>>>>  tools/lib/bpf/btf.c                           | 561 +++++++++++-----
> >>>>>  tools/lib/bpf/btf.h                           |  61 ++
> >>>>>  tools/lib/bpf/btf_common.c                    | 146 ++++
> >>>>>  tools/lib/bpf/btf_relocate.c                  | 630 ++++++++++++++++++
> >>>>>  tools/lib/bpf/libbpf.map                      |   3 +
> >>>>>  tools/lib/bpf/libbpf_internal.h               |   2 +
> >>>>>  .../selftests/bpf/prog_tests/btf_distill.c    | 298 +++++++++
> >>>>>  21 files changed, 1864 insertions(+), 209 deletions(-)
> >>>>>  create mode 100644 tools/lib/bpf/btf_common.c
> >>>>>  create mode 100644 tools/lib/bpf/btf_relocate.c
> >>>>>  create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_distill.c
> >>>>>
> >>>>> --
> >>>>> 2.31.1
> >>>>>
> >>>

* Re: [PATCH v2 bpf-next 04/13] libbpf: add btf__parse_opts() API for flexible BTF parsing
  2024-04-24 15:47 ` [PATCH v2 bpf-next 04/13] libbpf: add btf__parse_opts() API for flexible BTF parsing Alan Maguire
@ 2024-04-29 23:40   ` Andrii Nakryiko
  2024-05-01 17:42     ` Alan Maguire
  2024-05-01  0:07   ` Eduard Zingerman
  1 sibling, 1 reply; 43+ messages in thread
From: Andrii Nakryiko @ 2024-04-29 23:40 UTC (permalink / raw)
  To: Alan Maguire
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On Wed, Apr 24, 2024 at 8:48 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> Options cover existing parsing scenarios (ELF, raw, retrieving
> .BTF.ext) and also allow specification of the ELF section name
> containing BTF.  This will allow consumers to retrieve BTF from
> .BTF.base sections (BTF_BASE_ELF_SEC) also.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  tools/lib/bpf/btf.c      | 50 ++++++++++++++++++++++++++++------------
>  tools/lib/bpf/btf.h      | 32 +++++++++++++++++++++++++
>  tools/lib/bpf/libbpf.map |  1 +
>  3 files changed, 68 insertions(+), 15 deletions(-)
>
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 419cc4fa2e86..9036c1dc45d0 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -1084,7 +1084,7 @@ struct btf *btf__new_split(const void *data, __u32 size, struct btf *base_btf)
>         return libbpf_ptr(btf_new(data, size, base_btf));
>  }
>
> -static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
> +static struct btf *btf_parse_elf(const char *path, const char *btf_sec, struct btf *base_btf,
>                                  struct btf_ext **btf_ext)
>  {
>         Elf_Data *btf_data = NULL, *btf_ext_data = NULL;
> @@ -1146,7 +1146,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
>                                 idx, path);
>                         goto done;
>                 }
> -               if (strcmp(name, BTF_ELF_SEC) == 0) {
> +               if (strcmp(name, btf_sec) == 0) {
>                         btf_data = elf_getdata(scn, 0);
>                         if (!btf_data) {
>                                 pr_warn("failed to get section(%d, %s) data from %s\n",
> @@ -1166,7 +1166,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
>         }
>
>         if (!btf_data) {
> -               pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
> +               pr_warn("failed to find '%s' ELF section in %s\n", btf_sec, path);
>                 err = -ENODATA;
>                 goto done;
>         }
> @@ -1212,12 +1212,12 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
>
>  struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext)
>  {
> -       return libbpf_ptr(btf_parse_elf(path, NULL, btf_ext));
> +       return libbpf_ptr(btf_parse_elf(path, BTF_ELF_SEC, NULL, btf_ext));
>  }
>
>  struct btf *btf__parse_elf_split(const char *path, struct btf *base_btf)
>  {
> -       return libbpf_ptr(btf_parse_elf(path, base_btf, NULL));
> +       return libbpf_ptr(btf_parse_elf(path, BTF_ELF_SEC, base_btf, NULL));
>  }
>
>  static struct btf *btf_parse_raw(const char *path, struct btf *base_btf)
> @@ -1293,7 +1293,8 @@ struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf)
>         return libbpf_ptr(btf_parse_raw(path, base_btf));
>  }
>
> -static struct btf *btf_parse(const char *path, struct btf *base_btf, struct btf_ext **btf_ext)
> +static struct btf *btf_parse(const char *path, const char *btf_elf_sec, struct btf *base_btf,
> +                            struct btf_ext **btf_ext)
>  {
>         struct btf *btf;
>         int err;
> @@ -1301,23 +1302,42 @@ static struct btf *btf_parse(const char *path, struct btf *base_btf, struct btf_
>         if (btf_ext)
>                 *btf_ext = NULL;
>
> -       btf = btf_parse_raw(path, base_btf);
> -       err = libbpf_get_error(btf);
> -       if (!err)
> -               return btf;
> -       if (err != -EPROTO)
> -               return ERR_PTR(err);
> -       return btf_parse_elf(path, base_btf, btf_ext);
> +       if (!btf_elf_sec) {
> +               btf = btf_parse_raw(path, base_btf);
> +               err = libbpf_get_error(btf);
> +               if (!err)
> +                       return btf;
> +               if (err != -EPROTO)
> +                       return ERR_PTR(err);
> +       }
> +       if (!btf_elf_sec)
> +               btf_elf_sec = BTF_ELF_SEC;
> +
> +       return btf_parse_elf(path, btf_elf_sec, base_btf, btf_ext);

nit: btf_elf_sec ?: BTF_ELF_SEC


> +}
> +
> +struct btf *btf__parse_opts(const char *path, struct btf_parse_opts *opts)
> +{
> +       struct btf *base_btf;
> +       const char *btf_sec;
> +       struct btf_ext **btf_ext;
> +
> +       if (!OPTS_VALID(opts, btf_parse_opts))
> +               return libbpf_err_ptr(-EINVAL);
> +       base_btf = OPTS_GET(opts, base_btf, NULL);
> +       btf_sec = OPTS_GET(opts, btf_sec, NULL);
> +       btf_ext = OPTS_GET(opts, btf_ext, NULL);
> +       return libbpf_ptr(btf_parse(path, btf_sec, base_btf, btf_ext));
>  }
>
>  struct btf *btf__parse(const char *path, struct btf_ext **btf_ext)
>  {
> -       return libbpf_ptr(btf_parse(path, NULL, btf_ext));
> +       return libbpf_ptr(btf_parse(path, NULL, NULL, btf_ext));
>  }
>
>  struct btf *btf__parse_split(const char *path, struct btf *base_btf)
>  {
> -       return libbpf_ptr(btf_parse(path, base_btf, NULL));
> +       return libbpf_ptr(btf_parse(path, NULL, base_btf, NULL));
>  }
>
>  static void *btf_get_raw_data(const struct btf *btf, __u32 *size, bool swap_endian);
> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> index 025ed28b7fe8..94dfdfdef617 100644
> --- a/tools/lib/bpf/btf.h
> +++ b/tools/lib/bpf/btf.h
> @@ -18,6 +18,7 @@ extern "C" {
>
>  #define BTF_ELF_SEC ".BTF"
>  #define BTF_EXT_ELF_SEC ".BTF.ext"
> +#define BTF_BASE_ELF_SEC ".BTF.base"

Does libbpf code itself use this? If not, let's get rid of it.

>  #define MAPS_ELF_SEC ".maps"
>
>  struct btf;
> @@ -134,6 +135,37 @@ LIBBPF_API struct btf *btf__parse_elf_split(const char *path, struct btf *base_b
>  LIBBPF_API struct btf *btf__parse_raw(const char *path);
>  LIBBPF_API struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf);
>
> +struct btf_parse_opts {
> +       size_t sz;
> +       /* use base BTF to parse split BTF */
> +       struct btf *base_btf;
> +       /* retrieve optional .BTF.ext info */
> +       struct btf_ext **btf_ext;
> +       /* BTF section name */

let's mention that if not set, libbpf will default to trying to parse
data as raw BTF, and then will fall back to .BTF in ELF. If it is set
to non-NULL, we'll assume ELF and use that section to fetch BTF data.
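
A minimal sketch of the semantics described above, for illustration only. `struct parse_plan`, `plan_parse()` and `DEMO_BTF_ELF_SEC` are hypothetical names, not libbpf API; the sketch only records which parsing strategy would be attempted and uses the GNU C `?:` shorthand suggested in the earlier nit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define DEMO_BTF_ELF_SEC ".BTF" /* mirrors BTF_ELF_SEC from the patch */

/* Hypothetical description of what the parser would attempt. */
struct parse_plan {
	bool try_raw_first;   /* attempt raw BTF before ELF? */
	const char *elf_sec;  /* ELF section to read BTF from */
};

static struct parse_plan plan_parse(const char *btf_sec)
{
	struct parse_plan plan;

	/* Raw parsing is only attempted when no section was requested. */
	plan.try_raw_first = (btf_sec == NULL);
	/* GNU "?:" shorthand: use btf_sec unless it is NULL. */
	plan.elf_sec = btf_sec ?: DEMO_BTF_ELF_SEC;
	assert(plan.elf_sec != NULL);
	return plan;
}
```

With a NULL section the plan is "raw first, then .BTF"; with an explicit section such as ".BTF.base" the raw attempt is skipped.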

> +       const char *btf_sec;
> +       size_t:0;

nit: size_t :0; (consistency)

> +};
> +
> +#define btf_parse_opts__last_field btf_sec
> +
> +/* @brief **btf__parse_opts()** parses BTF information from either a
> + * raw BTF file (*btf_sec* is NULL) or from the specified BTF section,
> + * also retrieving  .BTF.ext info if *btf_ext* is non-NULL.  If
> + * *base_btf* is specified, use it to parse split BTF from the
> + * specified location.
> + *
> + * @return new BTF object instance which has to be eventually freed with
> + * **btf__free()**
> + *
> + * On error, error-code-encoded-as-pointer is returned, not a NULL. To extract

this is false, we don't encode errors as pointers anymore. Starting from
v1.0 it's always NULL + errno.

> + * error code from such a pointer `libbpf_get_error()` should be used. If
> + * `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is
> + * returned on error instead. In both cases thread-local `errno` variable is
> + * always set to error code as well.
> + */
> +
> +LIBBPF_API struct btf *btf__parse_opts(const char *path, struct btf_parse_opts *opts);
> +
>  LIBBPF_API struct btf *btf__load_vmlinux_btf(void);
>  LIBBPF_API struct btf *btf__load_module_btf(const char *module_name, struct btf *vmlinux_btf);
>
> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index c4d9bd7d3220..a9151e31dfa9 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -421,6 +421,7 @@ LIBBPF_1.5.0 {
>         global:
>                 bpf_program__attach_sockmap;
>                 btf__distill_base;
> +               btf__parse_opts;
>                 ring__consume_n;
>                 ring_buffer__consume_n;
>  } LIBBPF_1.4.0;
> --
> 2.31.1
>

* Re: [PATCH v2 bpf-next 05/13] bpftool: support displaying raw split BTF using base BTF section as base
  2024-04-24 15:47 ` [PATCH v2 bpf-next 05/13] bpftool: support displaying raw split BTF using base BTF section as base Alan Maguire
@ 2024-04-29 23:42   ` Andrii Nakryiko
  0 siblings, 0 replies; 43+ messages in thread
From: Andrii Nakryiko @ 2024-04-29 23:42 UTC (permalink / raw)
  To: Alan Maguire
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On Wed, Apr 24, 2024 at 8:49 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> If no base BTF can be found, fall back to checking for the .BTF.base
> section and use it to display split BTF.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  tools/bpf/bpftool/btf.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
>
> diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
> index 91fcb75babe3..2e8bd2c9f0a3 100644
> --- a/tools/bpf/bpftool/btf.c
> +++ b/tools/bpf/bpftool/btf.c
> @@ -631,6 +631,15 @@ static int do_dump(int argc, char **argv)
>                         base = get_vmlinux_btf_from_sysfs();
>
>                 btf = btf__parse_split(*argv, base ?: base_btf);
> +               /* Finally check for presence of base BTF section */
> +               if (!btf && !base && !base_btf) {
> +                       LIBBPF_OPTS(btf_parse_opts, optp);
> +
> +                       optp.btf_sec = BTF_BASE_ELF_SEC;

you can do this declaratively:

LIBBPF_OPTS(btf_parse_opts, optp, .btf_sec = BTF_BASE_ELF_SEC);
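
To illustrate the difference between the two forms, here is a self-contained sketch. `DEMO_OPTS` is a simplified, hypothetical stand-in for `LIBBPF_OPTS()` and `struct demo_parse_opts` stands in for `struct btf_parse_opts`; the real macro may differ in detail.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct btf_parse_opts from the patch. */
struct demo_parse_opts {
	size_t sz;            /* size of this struct, for forward compatibility */
	const char *btf_sec;  /* ELF section name, or NULL */
};

/* Simplified stand-in for LIBBPF_OPTS(): set .sz, apply the caller's
 * designated initializers, and leave all other fields zeroed.
 */
#define DEMO_OPTS(TYPE, NAME, ...) \
	struct TYPE NAME = { .sz = sizeof(struct TYPE), __VA_ARGS__ }

/* Imperative form: declare, then assign the field separately. */
static const char *sec_imperative(void)
{
	DEMO_OPTS(demo_parse_opts, optp, .btf_sec = NULL);

	optp.btf_sec = ".BTF.base";
	assert(optp.sz == sizeof(optp));
	return optp.btf_sec;
}

/* Declarative form: the field is set as part of the declaration. */
static const char *sec_declarative(void)
{
	DEMO_OPTS(demo_parse_opts, optp, .btf_sec = ".BTF.base");

	assert(optp.sz == sizeof(optp));
	return optp.btf_sec;
}
```

Both functions end up with the same options; the declarative form just keeps the setup in one statement.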


> +                       base_btf = btf__parse_opts(*argv, &optp);
> +                       if (base_btf)
> +                               btf = btf__parse_split(*argv, base_btf);
> +               }
>                 if (!btf) {
>                         err = -errno;
>                         p_err("failed to load BTF from %s: %s",
> --
> 2.31.1
>

* Re: [PATCH v2 bpf-next 06/13] kbuild,bpf: switch to using --btf_features for pahole v1.26 and later
  2024-04-24 15:47 ` [PATCH v2 bpf-next 06/13] kbuild,bpf: switch to using --btf_features for pahole v1.26 and later Alan Maguire
@ 2024-04-29 23:43   ` Andrii Nakryiko
  2024-05-01 17:22     ` Alan Maguire
  0 siblings, 1 reply; 43+ messages in thread
From: Andrii Nakryiko @ 2024-04-29 23:43 UTC (permalink / raw)
  To: Alan Maguire
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On Wed, Apr 24, 2024 at 8:49 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> The btf_features list can be used for pahole v1.26 and later -
> it is useful because if a feature is not yet implemented it will
> not exit with a failure message.  This will allow us to add feature
> requests to the pahole options without having to check pahole versions
> in future; if the version of pahole supports the feature it will be
> added.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  scripts/Makefile.btf | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
>

post this patch separately? we can land it sooner, right?


> diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
> index 82377e470aed..8e6a9d4b492e 100644
> --- a/scripts/Makefile.btf
> +++ b/scripts/Makefile.btf
> @@ -12,8 +12,11 @@ pahole-flags-$(call test-ge, $(pahole-ver), 121)     += --btf_gen_floats
>
>  pahole-flags-$(call test-ge, $(pahole-ver), 122)       += -j
>
> -pahole-flags-$(CONFIG_PAHOLE_HAS_LANG_EXCLUDE)         += --lang_exclude=rust
> -
>  pahole-flags-$(call test-ge, $(pahole-ver), 125)       += --skip_encoding_btf_inconsistent_proto --btf_gen_optimized
>
> +# Switch to using --btf_features for v1.26 and later.
> +pahole-flags-$(call test-ge, $(pahole-ver), 126)       = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func
> +
> +pahole-flags-$(CONFIG_PAHOLE_HAS_LANG_EXCLUDE)         += --lang_exclude=rust
> +
>  export PAHOLE_FLAGS := $(pahole-flags-y)
> --
> 2.31.1
>

* Re: [PATCH v2 bpf-next 07/13] resolve_btfids: use .BTF.base ELF section as base BTF if -B option is used
  2024-04-24 15:48 ` [PATCH v2 bpf-next 07/13] resolve_btfids: use .BTF.base ELF section as base BTF if -B option is used Alan Maguire
@ 2024-04-29 23:45   ` Andrii Nakryiko
  2024-05-01 20:39   ` Eduard Zingerman
  1 sibling, 0 replies; 43+ messages in thread
From: Andrii Nakryiko @ 2024-04-29 23:45 UTC (permalink / raw)
  To: Alan Maguire
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On Wed, Apr 24, 2024 at 8:49 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> When resolving BTF ids, use the BTF in the module .BTF.base section
> when passed the -B option.  Both references to base BTF from split
> BTF and BTF ids will be relocated for base vmlinux on module load.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  tools/bpf/resolve_btfids/main.c | 22 ++++++++++++++++++++--
>  1 file changed, 20 insertions(+), 2 deletions(-)
>
> diff --git a/tools/bpf/resolve_btfids/main.c b/tools/bpf/resolve_btfids/main.c
> index d9520cb826b3..c5b622a31f18 100644
> --- a/tools/bpf/resolve_btfids/main.c
> +++ b/tools/bpf/resolve_btfids/main.c
> @@ -115,6 +115,7 @@ struct object {
>         const char *path;
>         const char *btf;
>         const char *base_btf_path;
> +       int base;
>
>         struct {
>                 int              fd;
> @@ -532,11 +533,26 @@ static int symbols_resolve(struct object *obj)
>         __u32 nr_types;
>
>         if (obj->base_btf_path) {
> -               base_btf = btf__parse(obj->base_btf_path, NULL);
> +               LIBBPF_OPTS(btf_parse_opts, optp);
> +               const char *path;
> +
> +               if (obj->base) {
> +                       optp.btf_sec = BTF_BASE_ELF_SEC;
> +                       path = obj->path;
> +                       base_btf = btf__parse_opts(path, &optp);
> +                       /* fall back to normal base parsing if no BTF_BASE_ELF_SEC */
> +                       if (libbpf_get_error(base_btf))

don't add new uses of libbpf_get_error(), it will be eventually
removed, as it's now quite error-prone. Just check the pointer and then
access errno, if necessary
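
A rough sketch of the "check the pointer, then read errno" pattern applied to the fallback flow in this hunk. Everything here is hypothetical: `demo_parse()` merely imitates a parser that returns NULL and sets errno on failure, and `demo_load_base()` is not resolve_btfids code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

struct demo_btf { const char *source; };

/* Hypothetical parser following the NULL + errno convention: only the
 * section named "good" can be parsed.
 */
static struct demo_btf *demo_parse(const char *sec)
{
	static struct demo_btf parsed;

	if (!sec || strcmp(sec, "good") != 0) {
		errno = ENOENT;
		return NULL;
	}
	parsed.source = sec;
	return &parsed;
}

/* Try the preferred section, fall back to the default one, and report a
 * negative errno value through *err only if both attempts fail.
 */
static struct demo_btf *demo_load_base(const char *preferred,
				       const char *fallback, int *err)
{
	struct demo_btf *btf;

	assert(err != NULL);
	*err = 0;
	btf = demo_parse(preferred);
	if (!btf)
		btf = demo_parse(fallback);
	if (!btf)
		*err = -errno; /* read errno right after the failing call */
	return btf;
}
```

No error-pointer decoding is needed: a plain NULL check selects the fallback, and errno is consulted only when reporting the final failure.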

> +                               base_btf = NULL;
> +               }
> +               if (!base_btf) {
> +                       optp.btf_sec = BTF_ELF_SEC;
> +                       path = obj->base_btf_path;
> +                       base_btf = btf__parse_opts(path, &optp);
> +               }
>                 err = libbpf_get_error(base_btf);
>                 if (err) {
>                         pr_err("FAILED: load base BTF from %s: %s\n",
> -                              obj->base_btf_path, strerror(-err));
> +                              path, strerror(-err));
>                         return -1;
>                 }
>         }
> @@ -781,6 +797,8 @@ int main(int argc, const char **argv)
>                            "BTF data"),
>                 OPT_STRING('b', "btf_base", &obj.base_btf_path, "file",
>                            "path of file providing base BTF"),
> +               OPT_INCR('B', "base", &obj.base,
> +                        "use " BTF_BASE_ELF_SEC " ELF section BTF as base"),
>                 OPT_END()
>         };
>         int err = -1;
> --
> 2.31.1
>

* Re: [PATCH v2 bpf-next 09/13] libbpf: split BTF relocation
  2024-04-24 15:48 ` [PATCH v2 bpf-next 09/13] libbpf: split BTF relocation Alan Maguire
@ 2024-04-30  0:14   ` Andrii Nakryiko
  2024-04-30 16:56     ` Alan Maguire
  0 siblings, 1 reply; 43+ messages in thread
From: Andrii Nakryiko @ 2024-04-30  0:14 UTC (permalink / raw)
  To: Alan Maguire
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On Wed, Apr 24, 2024 at 8:49 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> Map distilled base BTF type ids referenced in split BTF and their
> references to the base BTF passed in, and if the mapping succeeds,
> reparent the split BTF to the base BTF.
>
> Relocation rules are
>
> - base types must match exactly
> - enum[64] types should match all value name/value pairs, but the
>   to-be-relocated enum[64] can also define additional name/value pairs
> - an enum64 can match an enum and vice versa provided the values match
>   as described above
> - named fwds match to the correspondingly-named struct/union/enum/enum64
> - structs with no members match to the correspondingly-named struct/union
>   provided their sizes match
> - anon struct/unions must have field names/offsets specified in base
>   reference BTF matched by those in base BTF we are matching with
>
> Relocation can not recurse, since it will be used in-kernel also and
> we do not want to blow up the kernel stack when carrying out type
> compatibility checks.  Hence we use a stack for reference type
> relocation rather than recursive function calls.  The approach however
> is the same; we use a depth-first search to match the referents
> associated with reference types, and work back from there to match
> the reference type itself.
>
> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> ---
>  tools/lib/bpf/Build             |   2 +-
>  tools/lib/bpf/btf.c             |  58 +++
>  tools/lib/bpf/btf.h             |   8 +
>  tools/lib/bpf/btf_relocate.c    | 601 ++++++++++++++++++++++++++++++++
>  tools/lib/bpf/libbpf.map        |   1 +
>  tools/lib/bpf/libbpf_internal.h |   2 +
>  6 files changed, 671 insertions(+), 1 deletion(-)
>  create mode 100644 tools/lib/bpf/btf_relocate.c
>
> diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build
> index b6619199a706..336da6844d42 100644
> --- a/tools/lib/bpf/Build
> +++ b/tools/lib/bpf/Build
> @@ -1,4 +1,4 @@
>  libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \
>             netlink.o bpf_prog_linfo.o libbpf_probes.o hashmap.o \
>             btf_dump.o ringbuf.o strset.o linker.o gen_loader.o relo_core.o \
> -           usdt.o zip.o elf.o features.o
> +           usdt.o zip.o elf.o features.o btf_relocate.o
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 9036c1dc45d0..f00a84fea9b5 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -5541,3 +5541,61 @@ int btf__distill_base(const struct btf *src_btf, struct btf **new_base_btf,
>         errno = -ret;
>         return ret;
>  }
> +
> +struct btf_rewrite_strs {
> +       struct btf *btf;
> +       const struct btf *old_base_btf;
> +       int str_start;
> +       int str_diff;
> +};
> +
> +static int btf_rewrite_strs(__u32 *str_off, void *ctx)
> +{
> +       struct btf_rewrite_strs *r = ctx;
> +       const char *s;
> +       int off;
> +
> +       if (!*str_off)
> +               return 0;
> +       if (*str_off >= r->str_start) {
> +               *str_off += r->str_diff;
> +       } else {
> +               s = btf__str_by_offset(r->old_base_btf, *str_off);
> +               if (!s)
> +                       return -ENOENT;
> +               off = btf__add_str(r->btf, s);
> +               if (off < 0)
> +                       return off;
> +               *str_off = off;
> +       }
> +       return 0;
> +}
> +
> +int btf_set_base_btf(struct btf *btf, struct btf *base_btf)
> +{
> +       struct btf_rewrite_strs r = {};
> +       struct btf_type *t;
> +       int i, err;
> +
> +       r.old_base_btf = btf__base_btf(btf);
> +       if (!r.old_base_btf)
> +               return -EINVAL;
> +       r.btf = btf;
> +       r.str_start = r.old_base_btf->hdr->str_len;
> +       r.str_diff = base_btf->hdr->str_len - r.old_base_btf->hdr->str_len;
> +       btf->base_btf = base_btf;
> +       btf->start_id = btf__type_cnt(base_btf);
> +       btf->start_str_off = base_btf->hdr->str_len;
> +       for (i = 0; i < btf->nr_types; i++) {
> +               t = (struct btf_type *)btf__type_by_id(btf, i + btf->start_id);

btf_type_by_id()

> +               err = btf_type_visit_str_offs(t, btf_rewrite_strs, &r);
> +               if (err)
> +                       break;
> +       }
> +       return err;
> +}
> +
> +int btf__relocate(struct btf *btf, const struct btf *base_btf)
> +{
> +       return btf_relocate(btf, base_btf, NULL);
> +}

[...]

> +               /* either names must match or both be anon. */
> +               if (t->name_off && nt->name_off) {
> +                       if (strcmp(btf__name_by_offset(r->btf, t->name_off),
> +                                  btf__name_by_offset(r->base_btf, nt->name_off)))
> +                               continue;
> +               } else if (t->name_off != nt->name_off) {
> +                       continue;
> +               }

btf__name_by_offset() returns "" for offset 0, so you don't need this
if/else guard, just do strcmp()
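
A small sketch of why the guard is redundant, assuming a BTF-style string table in which offset 0 is the empty string. `demo_strs` and the helper names are made up for illustration, and a single table is used for both sides to keep it short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Toy string table in BTF style: offset 0 is "", 1 is "foo", 5 is "bar". */
static const char demo_strs[] = "\0foo\0bar";

static const char *demo_name_by_offset(unsigned int off)
{
	assert(off < sizeof(demo_strs));
	return &demo_strs[off];
}

/* Guarded comparison, shaped like the quoted code. */
static bool names_match_guarded(unsigned int a, unsigned int b)
{
	if (a && b)
		return strcmp(demo_name_by_offset(a), demo_name_by_offset(b)) == 0;
	return a == b;
}

/* Plain comparison: anonymous names compare as "" and need no special case. */
static bool names_match_plain(unsigned int a, unsigned int b)
{
	return strcmp(demo_name_by_offset(a), demo_name_by_offset(b)) == 0;
}
```

Two anonymous types both yield "", so they compare equal, while an anonymous and a named type never do.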

> +               *tp = nt;
> +               *id = i;
> +               return 0;
> +       }
> +       return -ENOENT;
> +}
> +
> +static int btf_relocate_int(struct btf_relocate *r, const char *name,
> +                            const struct btf_type *t, const struct btf_type *bt)
> +{
> +       __u8 encoding, bencoding, bits, bbits;
> +
> +       if (t->size != bt->size) {
> +               pr_warn("INT types '%s' disagree on size; distilled base BTF says %d; base BTF says %d\n",
> +                       name, t->size, bt->size);
> +               return -EINVAL;
> +       }
> +       encoding = btf_int_encoding(t);
> +       bencoding = btf_int_encoding(bt);
> +       if (encoding != bencoding) {
> +               pr_warn("INT types '%s' disagree on encoding; distilled base BTF says '(%s/%s/%s); base BTF says '(%s/%s/%s)'\n",
> +                       name,
> +                       encoding & BTF_INT_SIGNED ? "signed" : "unsigned",
> +                       encoding & BTF_INT_CHAR ? "char" : "nonchar",
> +                       encoding & BTF_INT_BOOL ? "bool" : "nonbool",
> +                       bencoding & BTF_INT_SIGNED ? "signed" : "unsigned",
> +                       bencoding & BTF_INT_CHAR ? "char" : "nonchar",
> +                       bencoding & BTF_INT_BOOL ? "bool" : "nonbool");
> +               return -EINVAL;
> +       }
> +       bits = btf_int_bits(t);
> +       bbits = btf_int_bits(bt);
> +       if (bits != bbits) {

nit: this b* prefix is a bit ugly, maybe use enc/base_enc and bits/base_bits?

> +               pr_warn("INT types '%s' disagree on bit size; distilled base BTF says %d; base BTF says %d\n",
> +                       name, bits, bbits);
> +               return -EINVAL;
> +       }
> +       return 0;
> +}
> +
> +static int btf_relocate_float(struct btf_relocate *r, const char *name,
> +                              const struct btf_type *t, const struct btf_type *bt)
> +{
> +
> +       if (t->size != bt->size) {
> +               pr_warn("float types '%s' disagree on size; distilled base BTF says %d; base BTF says %d\n",
> +                       name, t->size, bt->size);
> +               return -EINVAL;
> +       }
> +       return 0;
> +}
> +
> +/* ensure each enum[64] value in type t has equivalent in base BTF and that
> + * values match; we must support matching enum64 to enum and vice versa
> + * as well as enum to enum and enum64 to enum64.
> + */
> +static int btf_relocate_enum(struct btf_relocate *r, const char *name,
> +                             const struct btf_type *t, const struct btf_type *bt)
> +{
> +       struct btf_enum *v = btf_enum(t);
> +       struct btf_enum *bv = btf_enum(bt);
> +       struct btf_enum64 *v64 = btf_enum64(t);
> +       struct btf_enum64 *bv64 = btf_enum64(bt);
> +       bool found, match, bisenum, isenum;

is_enum? bisenum is a bit too much to read without underscores (and
I'd still use base_ prefix)

> +       const char *vname, *bvname;
> +       __u32 name_off, bname_off;
> +       __u64 val = 0, bval = 0;
> +       int i, j;
> +

[...]

> +               if (!match) {
> +                       if (t->name_off)
> +                               pr_warn("ENUM[64] types '%s' disagree on enum value '%s'; distilled base BTF specifies value %lld; base BTF specifies value %lld\n",
> +                                       name, vname, val, bval);
> +                       return -EINVAL;
> +               }

What's the motivation to check enum values if we don't really do any
check like this for struct/union? It feels like just checking that
enum names and byte sizes match would be adequate, no?

I have similar feelings about INT checks, we assume the kernel module
was built against valid base BTF in the first place, so as long as
general memory layout matches, it should be OK to relocate. So I'd
stick to NAME + size checks.

If the kernel module was built with an enum definition that's not
compatible with the base kernel, it's a bigger problem than BTF. Just
like what we discussed with STRUCT/UNION.

> +       }
> +       return 0;
> +}
> +
> +/* relocate base types (int, float, enum, enum64 and fwd) */
> +static int btf_relocate_base_type(struct btf_relocate *r, __u32 id)
> +{
> +       const struct btf_type *t = btf_type_by_id(r->dist_base_btf, id);
> +       const char *name = btf__name_by_offset(r->dist_base_btf, t->name_off);
> +       const struct btf_type *bt = NULL;
> +       __u32 base_id = 0;
> +       int err = 0;
> +
> +       switch (btf_kind(t)) {
> +       case BTF_KIND_INT:
> +       case BTF_KIND_ENUM:
> +       case BTF_KIND_FLOAT:
> +       case BTF_KIND_ENUM64:
> +       case BTF_KIND_FWD:
> +               break;
> +       default:
> +               return 0;

why is this not an error?

> +       }
> +
> +       if (r->map[id] <= BTF_MAX_NR_TYPES)
> +               return 0;
> +
> +       while ((err = btf_relocate_find_next(r, t, &base_id, &bt)) != -ENOENT) {
> +               bt = btf_type_by_id(r->base_btf, base_id);
> +               switch (btf_kind(t)) {
> +               case BTF_KIND_INT:
> +                       err = btf_relocate_int(r, name, t, bt);
> +                       break;
> +               case BTF_KIND_ENUM:
> +               case BTF_KIND_ENUM64:
> +                       err = btf_relocate_enum(r, name, t, bt);
> +                       break;
> +               case BTF_KIND_FLOAT:
> +                       err = btf_relocate_float(r, name, t, bt);
> +                       break;
> +               case BTF_KIND_FWD:
> +                       err = 0;
> +                       break;
> +               default:
> +                       return 0;
> +               }
> +               if (!err) {
> +                       r->map[id] = base_id;
> +                       return 0;
> +               }
> +       }

I'm apprehensive of this linear search (many times) over vmlinux BTF,
it feels slow and sloppy, tbh

What if we mandate that distilled base BTF should be sorted by (kind,
name) by pahole/libbpf (which is simple enough to do), and then we can
do a single linear pass over vmlinux BTF + quick binary search over
distilled base BTF, marking (on the side) which base distilled BTF
type was processed. Then keep a count of processed distilled base
BTF types, and if at the end it doesn't match the number of types in
distilled base BTF, we couldn't relocate some of the base types.

Simple and fast, WDYT? Or if we don't want to make pahole/libbpf sort,
we can build *distilled base* index cheaply in memory, and do
effectively the same (that's perhaps a bit more robust, but I think we
can just say that distilled base has to be sorted).

For STRUCT/UNION we'd need to do two searches, once for FWD+name and
if not found (embedded struct/union case) STRUCT/UNION+name, but
that's still fast with two binary searches.

A lot of the code below would go away (once we keep only named types
in distilled base), so I didn't spend much time reviewing it, sorry.

> +       return err;
> +}
> +

[...]


* Re: [PATCH v2 bpf-next 09/13] libbpf: split BTF relocation
  2024-04-30  0:14   ` Andrii Nakryiko
@ 2024-04-30 16:56     ` Alan Maguire
  2024-04-30 17:41       ` Andrii Nakryiko
  0 siblings, 1 reply; 43+ messages in thread
From: Alan Maguire @ 2024-04-30 16:56 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On 30/04/2024 01:14, Andrii Nakryiko wrote:
> On Wed, Apr 24, 2024 at 8:49 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>
>> Map distilled base BTF type ids referenced in split BTF and their
>> references to the base BTF passed in, and if the mapping succeeds,
>> reparent the split BTF to the base BTF.
>>
>> Relocation rules are
>>
>> - base types must match exactly
>> - enum[64] types should match all value name/value pairs, but the
>>   to-be-relocated enum[64] can also define additional name/value pairs
>> - an enum64 can match an enum and vice versa provided the values match
>>   as described above
>> - named fwds match to the correspondingly-named struct/union/enum/enum64
>> - structs with no members match to the correspondingly-named struct/union
>>   provided their sizes match
>> - anon struct/unions must have field names/offsets specified in base
>>   reference BTF matched by those in base BTF we are matching with
>>
>> Relocation can not recurse, since it will be used in-kernel also and
>> we do not want to blow up the kernel stack when carrying out type
>> compatibility checks.  Hence we use a stack for reference type
>> relocation rather than recursive function calls.  The approach however
>> is the same; we use a depth-first search to match the referents
>> associated with reference types, and work back from there to match
>> the reference type itself.
>>
>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>> ---
>>  tools/lib/bpf/Build             |   2 +-
>>  tools/lib/bpf/btf.c             |  58 +++
>>  tools/lib/bpf/btf.h             |   8 +
>>  tools/lib/bpf/btf_relocate.c    | 601 ++++++++++++++++++++++++++++++++
>>  tools/lib/bpf/libbpf.map        |   1 +
>>  tools/lib/bpf/libbpf_internal.h |   2 +
>>  6 files changed, 671 insertions(+), 1 deletion(-)
>>  create mode 100644 tools/lib/bpf/btf_relocate.c
>>
>> diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build
>> index b6619199a706..336da6844d42 100644
>> --- a/tools/lib/bpf/Build
>> +++ b/tools/lib/bpf/Build
>> @@ -1,4 +1,4 @@
>>  libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \
>>             netlink.o bpf_prog_linfo.o libbpf_probes.o hashmap.o \
>>             btf_dump.o ringbuf.o strset.o linker.o gen_loader.o relo_core.o \
>> -           usdt.o zip.o elf.o features.o
>> +           usdt.o zip.o elf.o features.o btf_relocate.o
>> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
>> index 9036c1dc45d0..f00a84fea9b5 100644
>> --- a/tools/lib/bpf/btf.c
>> +++ b/tools/lib/bpf/btf.c
>> @@ -5541,3 +5541,61 @@ int btf__distill_base(const struct btf *src_btf, struct btf **new_base_btf,
>>         errno = -ret;
>>         return ret;
>>  }
>> +
>> +struct btf_rewrite_strs {
>> +       struct btf *btf;
>> +       const struct btf *old_base_btf;
>> +       int str_start;
>> +       int str_diff;
>> +};
>> +
>> +static int btf_rewrite_strs(__u32 *str_off, void *ctx)
>> +{
>> +       struct btf_rewrite_strs *r = ctx;
>> +       const char *s;
>> +       int off;
>> +
>> +       if (!*str_off)
>> +               return 0;
>> +       if (*str_off >= r->str_start) {
>> +               *str_off += r->str_diff;
>> +       } else {
>> +               s = btf__str_by_offset(r->old_base_btf, *str_off);
>> +               if (!s)
>> +                       return -ENOENT;
>> +               off = btf__add_str(r->btf, s);
>> +               if (off < 0)
>> +                       return off;
>> +               *str_off = off;
>> +       }
>> +       return 0;
>> +}
>> +
>> +int btf_set_base_btf(struct btf *btf, struct btf *base_btf)
>> +{
>> +       struct btf_rewrite_strs r = {};
>> +       struct btf_type *t;
>> +       int i, err;
>> +
>> +       r.old_base_btf = btf__base_btf(btf);
>> +       if (!r.old_base_btf)
>> +               return -EINVAL;
>> +       r.btf = btf;
>> +       r.str_start = r.old_base_btf->hdr->str_len;
>> +       r.str_diff = base_btf->hdr->str_len - r.old_base_btf->hdr->str_len;
>> +       btf->base_btf = base_btf;
>> +       btf->start_id = btf__type_cnt(base_btf);
>> +       btf->start_str_off = base_btf->hdr->str_len;
>> +       for (i = 0; i < btf->nr_types; i++) {
>> +               t = (struct btf_type *)btf__type_by_id(btf, i + btf->start_id);
> 
> btf_type_by_id()
> 
>> +               err = btf_type_visit_str_offs(t, btf_rewrite_strs, &r);
>> +               if (err)
>> +                       break;
>> +       }
>> +       return err;
>> +}
>> +
>> +int btf__relocate(struct btf *btf, const struct btf *base_btf)
>> +{
>> +       return btf_relocate(btf, base_btf, NULL);
>> +}
> 
> [...]
> 
>> +               /* either names must match or both be anon. */
>> +               if (t->name_off && nt->name_off) {
>> +                       if (strcmp(btf__name_by_offset(r->btf, t->name_off),
>> +                                  btf__name_by_offset(r->base_btf, nt->name_off)))
>> +                               continue;
>> +               } else if (t->name_off != nt->name_off) {
>> +                       continue;
>> +               }
> 
> btf__name_by_offset(0) return "", so you don't need this if/else
> guard, just do strcmp()
> 
>> +               *tp = nt;
>> +               *id = i;
>> +               return 0;
>> +       }
>> +       return -ENOENT;
>> +}
>> +
>> +static int btf_relocate_int(struct btf_relocate *r, const char *name,
>> +                            const struct btf_type *t, const struct btf_type *bt)
>> +{
>> +       __u8 encoding, bencoding, bits, bbits;
>> +
>> +       if (t->size != bt->size) {
>> +               pr_warn("INT types '%s' disagree on size; distilled base BTF says %d; base BTF says %d\n",
>> +                       name, t->size, bt->size);
>> +               return -EINVAL;
>> +       }
>> +       encoding = btf_int_encoding(t);
>> +       bencoding = btf_int_encoding(bt);
>> +       if (encoding != bencoding) {
>> +               pr_warn("INT types '%s' disagree on encoding; distilled base BTF says '(%s/%s/%s); base BTF says '(%s/%s/%s)'\n",
>> +                       name,
>> +                       encoding & BTF_INT_SIGNED ? "signed" : "unsigned",
>> +                       encoding & BTF_INT_CHAR ? "char" : "nonchar",
>> +                       encoding & BTF_INT_BOOL ? "bool" : "nonbool",
>> +                       bencoding & BTF_INT_SIGNED ? "signed" : "unsigned",
>> +                       bencoding & BTF_INT_CHAR ? "char" : "nonchar",
>> +                       bencoding & BTF_INT_BOOL ? "bool" : "nonbool");
>> +               return -EINVAL;
>> +       }
>> +       bits = btf_int_bits(t);
>> +       bbits = btf_int_bits(bt);
>> +       if (bits != bbits) {
> 
> nit: this b* prefix is a bit ugly, maybe use enc/base_enc and bits/base_bits?
> 
>> +               pr_warn("INT types '%s' disagree on bit size; distilled base BTF says %d; base BTF says %d\n",
>> +                       name, bits, bbits);
>> +               return -EINVAL;
>> +       }
>> +       return 0;
>> +}
>> +
>> +static int btf_relocate_float(struct btf_relocate *r, const char *name,
>> +                              const struct btf_type *t, const struct btf_type *bt)
>> +{
>> +
>> +       if (t->size != bt->size) {
>> +               pr_warn("float types '%s' disagree on size; distilled base BTF says %d; base BTF says %d\n",
>> +                       name, t->size, bt->size);
>> +               return -EINVAL;
>> +       }
>> +       return 0;
>> +}
>> +
>> +/* ensure each enum[64] value in type t has equivalent in base BTF and that
>> + * values match; we must support matching enum64 to enum and vice versa
>> + * as well as enum to enum and enum64 to enum64.
>> + */
>> +static int btf_relocate_enum(struct btf_relocate *r, const char *name,
>> +                             const struct btf_type *t, const struct btf_type *bt)
>> +{
>> +       struct btf_enum *v = btf_enum(t);
>> +       struct btf_enum *bv = btf_enum(bt);
>> +       struct btf_enum64 *v64 = btf_enum64(t);
>> +       struct btf_enum64 *bv64 = btf_enum64(bt);
>> +       bool found, match, bisenum, isenum;
> 
> is_enum? bisenum is a bit too much to read without underscores (and
> I'd still use base_ prefix)
> 
>> +       const char *vname, *bvname;
>> +       __u32 name_off, bname_off;
>> +       __u64 val = 0, bval = 0;
>> +       int i, j;
>> +
> 
> [...]
> 
>> +               if (!match) {
>> +                       if (t->name_off)
>> +                               pr_warn("ENUM[64] types '%s' disagree on enum value '%s'; distilled base BTF specifies value %lld; base BTF specifies value %lld\n",
>> +                                       name, vname, val, bval);
>> +                       return -EINVAL;
>> +               }
> 
> What's the motivation to check enum values if we don't really do any
> check like this for struct/union? It feels like just checking that
> enum names and byte sizes match would be adequate, no?
> 
> I have similar feelings about INT checks, we assume the kernel module
> was built against valid base BTF in the first place, so as long as
> general memory layout matches, it should be OK to relocate. So I'd
> stick to NAME + size checks.
> 
> If the kernel module was built with an enum definition that's not
> compatible with the base kernel, it's a bigger problem than BTF. Just
> like what we discussed with STRUCT/UNION.
> 
>> +       }
>> +       return 0;
>> +}
>> +
>> +/* relocate base types (int, float, enum, enum64 and fwd) */
>> +static int btf_relocate_base_type(struct btf_relocate *r, __u32 id)
>> +{
>> +       const struct btf_type *t = btf_type_by_id(r->dist_base_btf, id);
>> +       const char *name = btf__name_by_offset(r->dist_base_btf, t->name_off);
>> +       const struct btf_type *bt = NULL;
>> +       __u32 base_id = 0;
>> +       int err = 0;
>> +
>> +       switch (btf_kind(t)) {
>> +       case BTF_KIND_INT:
>> +       case BTF_KIND_ENUM:
>> +       case BTF_KIND_FLOAT:
>> +       case BTF_KIND_ENUM64:
>> +       case BTF_KIND_FWD:
>> +               break;
>> +       default:
>> +               return 0;
> 
> why this is not an error?
> 
>> +       }
>> +
>> +       if (r->map[id] <= BTF_MAX_NR_TYPES)
>> +               return 0;
>> +
>> +       while ((err = btf_relocate_find_next(r, t, &base_id, &bt)) != -ENOENT) {
>> +               bt = btf_type_by_id(r->base_btf, base_id);
>> +               switch (btf_kind(t)) {
>> +               case BTF_KIND_INT:
>> +                       err = btf_relocate_int(r, name, t, bt);
>> +                       break;
>> +               case BTF_KIND_ENUM:
>> +               case BTF_KIND_ENUM64:
>> +                       err = btf_relocate_enum(r, name, t, bt);
>> +                       break;
>> +               case BTF_KIND_FLOAT:
>> +                       err = btf_relocate_float(r, name, t, bt);
>> +                       break;
>> +               case BTF_KIND_FWD:
>> +                       err = 0;
>> +                       break;
>> +               default:
>> +                       return 0;
>> +               }
>> +               if (!err) {
>> +                       r->map[id] = base_id;
>> +                       return 0;
>> +               }
>> +       }
> 
> I'm apprehensive of this linear search (many times) over vmlinux BTF,
> it feels slow and sloppy, tbh
> 
> What if we mandate that distilled base BTF should be sorted by (kind,
> name) by pahole/libbpf (which is simple enough to do), and then we can
> do a single linear pass over vmlinux BTF + quick binary search over
> distilled base BTF, marking (on the side) which base distilled BTF
> type was processed. Then keep a pointer of processed distilled base
> BTF types, and if at the end it doesn't match base distilled BTF
> number of types, we couldn't relocate some of base types.
> 

Hmm, so are you saying something like

	foreach vmlinux type
		binary search for an equivalent distilled base type,
		and record the mapping

? Would be great to just have to iterate once alright.

> Simple and fast, WDYT? Or if we don't want to make pahole/libbpf sort,
> we can build *distilled base* index cheaply in memory, and do
> effectively the same (that's perhaps a bit more robust, but I think we
> can just say that distilled base has to be sorted).
>

Sorting BTF is something that's come up a lot. We should probably do it;
more below..

> For STRUCT/UNION we'd need to do two searches, once for FWD+name and
> if not found (embedded struct/union case) STRUCT/UNION+name, but
> that's still fast with two binary searches.
> 
> A lot of the code below would go away (once we keep only named types
> in distilled base), so I didn't spend much time reviewing it, sorry.
>

The only concern I'd have is that the kernel would I suppose need to be
skeptical of getting sorted data (in distilled base or elsewhere), so
we'd probably need to validate sort order I guess. We could share some
of the mechanics of sorting in btf_common.c to do that specifically for
.BTF.base, but thinking about it, as part of general BTF validation we
could mark BTF as sorted or not. What would be nice about this is that
once we knew BTF was sorted, we could speed up btf_find_by_name_kind()
by using binary search on the sorted BTF.
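
For what it's worth, the validation itself should be cheap. A minimal
sketch, assuming the sort key is (kind, name) and using hypothetical names
(struct named_type, types_are_sorted()) rather than anything that exists in
btf_common.c today:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a named BTF type. */
struct named_type {
	int kind;
	const char *name;  /* never NULL; "" would mean anonymous */
};

/* Order by kind first, then by name. */
static int cmp_kind_name(const struct named_type *x, const struct named_type *y)
{
	if (x->kind != y->kind)
		return x->kind < y->kind ? -1 : 1;
	return strcmp(x->name, y->name);
}

/*
 * Return true if types[] is in non-decreasing (kind, name) order.
 * One linear pass comparing neighbours; an empty or single-element
 * array is trivially sorted.
 */
static bool types_are_sorted(const struct named_type *types, size_t nr)
{
	size_t i;

	assert(types || nr == 0);
	for (i = 1; i < nr; i++) {
		if (cmp_kind_name(&types[i - 1], &types[i]) > 0)
			return false;
	}
	return true;
}
```

The result could be cached as a flag at validation time, so lookups only
take the binary-search path when the flag is set.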


>> +       return err;
>> +}
>> +
> 
> [...]


* Re: [PATCH v2 bpf-next 09/13] libbpf: split BTF relocation
  2024-04-30 16:56     ` Alan Maguire
@ 2024-04-30 17:41       ` Andrii Nakryiko
  2024-05-02 16:39         ` Eduard Zingerman
  0 siblings, 1 reply; 43+ messages in thread
From: Andrii Nakryiko @ 2024-04-30 17:41 UTC (permalink / raw)
  To: Alan Maguire, Mykyta Yatsenko
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On Tue, Apr 30, 2024 at 9:57 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On 30/04/2024 01:14, Andrii Nakryiko wrote:
> > On Wed, Apr 24, 2024 at 8:49 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >>
> >> Map distilled base BTF type ids referenced in split BTF and their
> >> references to the base BTF passed in, and if the mapping succeeds,
> >> reparent the split BTF to the base BTF.
> >>
> >> Relocation rules are
> >>
> >> - base types must match exactly
> >> - enum[64] types should match all value name/value pairs, but the
> >>   to-be-relocated enum[64] can also define additional name/value pairs
> >> - an enum64 can match an enum and vice versa provided the values match
> >>   as described above
> >> - named fwds match to the correspondingly-named struct/union/enum/enum64
> >> - structs with no members match to the correspondingly-named struct/union
> >>   provided their sizes match
> >> - anon struct/unions must have field names/offsets specified in base
> >>   reference BTF matched by those in base BTF we are matching with
> >>
> >> Relocation can not recurse, since it will be used in-kernel also and
> >> we do not want to blow up the kernel stack when carrying out type
> >> compatibility checks.  Hence we use a stack for reference type
> >> relocation rather than recursive function calls.  The approach however
> >> is the same; we use a depth-first search to match the referents
> >> associated with reference types, and work back from there to match
> >> the reference type itself.
> >>
> >> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> >> ---
> >>  tools/lib/bpf/Build             |   2 +-
> >>  tools/lib/bpf/btf.c             |  58 +++
> >>  tools/lib/bpf/btf.h             |   8 +
> >>  tools/lib/bpf/btf_relocate.c    | 601 ++++++++++++++++++++++++++++++++
> >>  tools/lib/bpf/libbpf.map        |   1 +
> >>  tools/lib/bpf/libbpf_internal.h |   2 +
> >>  6 files changed, 671 insertions(+), 1 deletion(-)
> >>  create mode 100644 tools/lib/bpf/btf_relocate.c
> >>
> >> diff --git a/tools/lib/bpf/Build b/tools/lib/bpf/Build
> >> index b6619199a706..336da6844d42 100644
> >> --- a/tools/lib/bpf/Build
> >> +++ b/tools/lib/bpf/Build
> >> @@ -1,4 +1,4 @@
> >>  libbpf-y := libbpf.o bpf.o nlattr.o btf.o libbpf_errno.o str_error.o \
> >>             netlink.o bpf_prog_linfo.o libbpf_probes.o hashmap.o \
> >>             btf_dump.o ringbuf.o strset.o linker.o gen_loader.o relo_core.o \
> >> -           usdt.o zip.o elf.o features.o
> >> +           usdt.o zip.o elf.o features.o btf_relocate.o
> >> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> >> index 9036c1dc45d0..f00a84fea9b5 100644
> >> --- a/tools/lib/bpf/btf.c
> >> +++ b/tools/lib/bpf/btf.c
> >> @@ -5541,3 +5541,61 @@ int btf__distill_base(const struct btf *src_btf, struct btf **new_base_btf,
> >>         errno = -ret;
> >>         return ret;
> >>  }
> >> +
> >> +struct btf_rewrite_strs {
> >> +       struct btf *btf;
> >> +       const struct btf *old_base_btf;
> >> +       int str_start;
> >> +       int str_diff;
> >> +};
> >> +
> >> +static int btf_rewrite_strs(__u32 *str_off, void *ctx)
> >> +{
> >> +       struct btf_rewrite_strs *r = ctx;
> >> +       const char *s;
> >> +       int off;
> >> +
> >> +       if (!*str_off)
> >> +               return 0;
> >> +       if (*str_off >= r->str_start) {
> >> +               *str_off += r->str_diff;
> >> +       } else {
> >> +               s = btf__str_by_offset(r->old_base_btf, *str_off);
> >> +               if (!s)
> >> +                       return -ENOENT;
> >> +               off = btf__add_str(r->btf, s);
> >> +               if (off < 0)
> >> +                       return off;
> >> +               *str_off = off;
> >> +       }
> >> +       return 0;
> >> +}
> >> +
> >> +int btf_set_base_btf(struct btf *btf, struct btf *base_btf)
> >> +{
> >> +       struct btf_rewrite_strs r = {};
> >> +       struct btf_type *t;
> >> +       int i, err;
> >> +
> >> +       r.old_base_btf = btf__base_btf(btf);
> >> +       if (!r.old_base_btf)
> >> +               return -EINVAL;
> >> +       r.btf = btf;
> >> +       r.str_start = r.old_base_btf->hdr->str_len;
> >> +       r.str_diff = base_btf->hdr->str_len - r.old_base_btf->hdr->str_len;
> >> +       btf->base_btf = base_btf;
> >> +       btf->start_id = btf__type_cnt(base_btf);
> >> +       btf->start_str_off = base_btf->hdr->str_len;
> >> +       for (i = 0; i < btf->nr_types; i++) {
> >> +               t = (struct btf_type *)btf__type_by_id(btf, i + btf->start_id);
> >
> > btf_type_by_id()
> >
> >> +               err = btf_type_visit_str_offs(t, btf_rewrite_strs, &r);
> >> +               if (err)
> >> +                       break;
> >> +       }
> >> +       return err;
> >> +}
> >> +
> >> +int btf__relocate(struct btf *btf, const struct btf *base_btf)
> >> +{
> >> +       return btf_relocate(btf, base_btf, NULL);
> >> +}
> >
> > [...]
> >
> >> +               /* either names must match or both be anon. */
> >> +               if (t->name_off && nt->name_off) {
> >> +                       if (strcmp(btf__name_by_offset(r->btf, t->name_off),
> >> +                                  btf__name_by_offset(r->base_btf, nt->name_off)))
> >> +                               continue;
> >> +               } else if (t->name_off != nt->name_off) {
> >> +                       continue;
> >> +               }
> >
> > btf__name_by_offset(0) return "", so you don't need this if/else
> > guard, just do strcmp()
> >
> >> +               *tp = nt;
> >> +               *id = i;
> >> +               return 0;
> >> +       }
> >> +       return -ENOENT;
> >> +}
> >> +
> >> +static int btf_relocate_int(struct btf_relocate *r, const char *name,
> >> +                            const struct btf_type *t, const struct btf_type *bt)
> >> +{
> >> +       __u8 encoding, bencoding, bits, bbits;
> >> +
> >> +       if (t->size != bt->size) {
> >> +               pr_warn("INT types '%s' disagree on size; distilled base BTF says %d; base BTF says %d\n",
> >> +                       name, t->size, bt->size);
> >> +               return -EINVAL;
> >> +       }
> >> +       encoding = btf_int_encoding(t);
> >> +       bencoding = btf_int_encoding(bt);
> >> +       if (encoding != bencoding) {
> >> +               pr_warn("INT types '%s' disagree on encoding; distilled base BTF says '(%s/%s/%s); base BTF says '(%s/%s/%s)'\n",
> >> +                       name,
> >> +                       encoding & BTF_INT_SIGNED ? "signed" : "unsigned",
> >> +                       encoding & BTF_INT_CHAR ? "char" : "nonchar",
> >> +                       encoding & BTF_INT_BOOL ? "bool" : "nonbool",
> >> +                       bencoding & BTF_INT_SIGNED ? "signed" : "unsigned",
> >> +                       bencoding & BTF_INT_CHAR ? "char" : "nonchar",
> >> +                       bencoding & BTF_INT_BOOL ? "bool" : "nonbool");
> >> +               return -EINVAL;
> >> +       }
> >> +       bits = btf_int_bits(t);
> >> +       bbits = btf_int_bits(bt);
> >> +       if (bits != bbits) {
> >
> > nit: this b* prefix is a bit ugly, maybe use enc/base_enc and bits/base_bits?
> >
> >> +               pr_warn("INT types '%s' disagree on bit size; distilled base BTF says %d; base BTF says %d\n",
> >> +                       name, bits, bbits);
> >> +               return -EINVAL;
> >> +       }
> >> +       return 0;
> >> +}
> >> +
> >> +static int btf_relocate_float(struct btf_relocate *r, const char *name,
> >> +                              const struct btf_type *t, const struct btf_type *bt)
> >> +{
> >> +
> >> +       if (t->size != bt->size) {
> >> +               pr_warn("float types '%s' disagree on size; distilled base BTF says %d; base BTF says %d\n",
> >> +                       name, t->size, bt->size);
> >> +               return -EINVAL;
> >> +       }
> >> +       return 0;
> >> +}
> >> +
> >> +/* ensure each enum[64] value in type t has equivalent in base BTF and that
> >> + * values match; we must support matching enum64 to enum and vice versa
> >> + * as well as enum to enum and enum64 to enum64.
> >> + */
> >> +static int btf_relocate_enum(struct btf_relocate *r, const char *name,
> >> +                             const struct btf_type *t, const struct btf_type *bt)
> >> +{
> >> +       struct btf_enum *v = btf_enum(t);
> >> +       struct btf_enum *bv = btf_enum(bt);
> >> +       struct btf_enum64 *v64 = btf_enum64(t);
> >> +       struct btf_enum64 *bv64 = btf_enum64(bt);
> >> +       bool found, match, bisenum, isenum;
> >
> > is_enum? bisenum is a bit too much to read without underscores (and
> > I'd still use base_ prefix)
> >
> >> +       const char *vname, *bvname;
> >> +       __u32 name_off, bname_off;
> >> +       __u64 val = 0, bval = 0;
> >> +       int i, j;
> >> +
> >
> > [...]
> >
> >> +               if (!match) {
> >> +                       if (t->name_off)
> >> +                               pr_warn("ENUM[64] types '%s' disagree on enum value '%s'; distilled base BTF specifies value %lld; base BTF specifies value %lld\n",
> >> +                                       name, vname, val, bval);
> >> +                       return -EINVAL;
> >> +               }
> >
> > What's the motivation to check enum values if we don't really do any
> > check like this for struct/union? It feels like just checking that
> > enum names and byte sizes match would be adequate, no?
> >
> > I have similar feelings about INT checks, we assume the kernel module
> > was built against valid base BTF in the first place, so as long as
> > general memory layout matches, it should be OK to relocate. So I'd
> > stick to NAME + size checks.
> >
> > If the kernel module was built with an enum definition that's not
> > compatible with the base kernel, it's a bigger problem than BTF. Just
> > like what we discussed with STRUCT/UNION.
> >
> >> +       }
> >> +       return 0;
> >> +}
> >> +
> >> +/* relocate base types (int, float, enum, enum64 and fwd) */
> >> +static int btf_relocate_base_type(struct btf_relocate *r, __u32 id)
> >> +{
> >> +       const struct btf_type *t = btf_type_by_id(r->dist_base_btf, id);
> >> +       const char *name = btf__name_by_offset(r->dist_base_btf, t->name_off);
> >> +       const struct btf_type *bt = NULL;
> >> +       __u32 base_id = 0;
> >> +       int err = 0;
> >> +
> >> +       switch (btf_kind(t)) {
> >> +       case BTF_KIND_INT:
> >> +       case BTF_KIND_ENUM:
> >> +       case BTF_KIND_FLOAT:
> >> +       case BTF_KIND_ENUM64:
> >> +       case BTF_KIND_FWD:
> >> +               break;
> >> +       default:
> >> +               return 0;
> >
> > why this is not an error?
> >
> >> +       }
> >> +
> >> +       if (r->map[id] <= BTF_MAX_NR_TYPES)
> >> +               return 0;
> >> +
> >> +       while ((err = btf_relocate_find_next(r, t, &base_id, &bt)) != -ENOENT) {
> >> +               bt = btf_type_by_id(r->base_btf, base_id);
> >> +               switch (btf_kind(t)) {
> >> +               case BTF_KIND_INT:
> >> +                       err = btf_relocate_int(r, name, t, bt);
> >> +                       break;
> >> +               case BTF_KIND_ENUM:
> >> +               case BTF_KIND_ENUM64:
> >> +                       err = btf_relocate_enum(r, name, t, bt);
> >> +                       break;
> >> +               case BTF_KIND_FLOAT:
> >> +                       err = btf_relocate_float(r, name, t, bt);
> >> +                       break;
> >> +               case BTF_KIND_FWD:
> >> +                       err = 0;
> >> +                       break;
> >> +               default:
> >> +                       return 0;
> >> +               }
> >> +               if (!err) {
> >> +                       r->map[id] = base_id;
> >> +                       return 0;
> >> +               }
> >> +       }
> >
> > I'm apprehensive of this linear search (many times) over vmlinux BTF,
> > it feels slow and sloppy, tbh
> >
> > What if we mandate that distilled base BTF should be sorted by (kind,
> > name) by pahole/libbpf (which is simple enough to do), and then we can
> > do a single linear pass over vmlinux BTF + quick binary search over
> > distilled base BTF, marking (on the side) which base distilled BTF
> > type was processed. Then keep a pointer of processed distilled base
> > BTF types, and if at the end it doesn't match base distilled BTF
> > number of types, we couldn't relocate some of base types.
> >
>
> Hmm, so are you saying something like
>
>         foreach vmlinux type
>                 binary search for an equivalent distilled base type,
>                 and record the mapping
>
> ? Would be great to just have to iterate once alright.

Yes. You'd just need an extra quick pass to check which distilled base
types weren't marked, which would be an error condition.

>
> > Simple and fast, WDYT? Or if we don't want to make pahole/libbpf sort,
> > we can build *distilled base* index cheaply in memory, and do
> > effectively the same (that's perhaps a bit more robust, but I think we
> > can just say that distilled base has to be sorted).
> >
>
> Sorting BTF is something that's come up a lot. We should probably do it;
> more below..
>
> > For STRUCT/UNION we'd need to do two searches, once for FWD+name and
> > if not found (embedded struct/union case) STRUCT/UNION+name, but
> > that's still fast with two binary searches.
> >
> > A lot of the code below would go away (once we keep only named types
> > in distilled base), so I didn't spend much time reviewing it, sorry.
> >
>
> The only concern I'd have is that the kernel would I suppose need to be
> skeptical of getting sorted data (in distilled base or elsewhere), so
> we'd probably need to validate sort order I guess. We could share some
> of the mechanics of sorting in btf_common.c to do that specifically for
> .BTF.base, but thinking about it, as part of general BTF validation we
> could mark BTF as sorted or not. What would be nice about this is that
> once we knew BTF was sorted,  we could speed up btf_find_by_name_kind()
> by using binary search on the sorted BTF.

Given distilled base BTF is small, I was thinking the kernel can just
do its own sorting when the module is loaded, it's a few KB of
integers at most, so isn't a problem.
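
Something along these lines, as a userspace illustration only (the kernel
would use its own sort() rather than qsort(), and struct sort_entry and
sort_distilled_index() are hypothetical names). The point is that only a
small side index gets sorted; the distilled BTF data and its type ids stay
untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical index entry: sort key plus the original distilled type id. */
struct sort_entry {
	int kind;
	const char *name;  /* never NULL */
	unsigned int id;   /* id in distilled base BTF, unchanged by sorting */
};

/* Order by kind first, then by name. */
static int cmp_entry(const void *a, const void *b)
{
	const struct sort_entry *x = a, *y = b;

	if (x->kind != y->kind)
		return x->kind < y->kind ? -1 : 1;
	return strcmp(x->name, y->name);
}

/* Sort the side index in place by (kind, name). */
static void sort_distilled_index(struct sort_entry *idx, size_t nr)
{
	assert(idx || nr == 0);
	if (nr > 1)
		qsort(idx, nr, sizeof(*idx), cmp_entry);
}
```

After sorting, a binary search over idx[] yields the original id directly,
so nothing in the split BTF that references distilled ids needs rewriting
at this stage.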

As for generally sorting vmlinux BTF... I think it gets tricky,
because, generally speaking, just KIND+NAME is not enough to define a
unique sort order. What do we do with anon types? How do we deal with
reference types that only have some arbitrary BTF ID (which will get
remapped after sorting, mind you)? It gets hairy. BTF is a graph, and we
are talking about defining some unique order on a graph; it's not a
straightforward problem.

So I'd focus on getting this distilled base thing working fast and
reliably, before trying to improve BTF sorting in general.

Speaking of sorting, Mykyta (cc'ed) is working on teaching *bpftool*
to do a sane ordering of types so that vmlinux.h output is a)
meaningfully (from human POV) sorted and b) vmlinux.h overall is more
"stable" between slight changes to vmlinux BTF itself, so that they
can be more meaningfully diffed. This is in no way related to sorting
vmlinux BTF data itself (sorting is done on the fly before generating
vmlinux.h), but I thought I'd mention that as you are probably
interested in this as well.


>
>
> >> +       return err;
> >> +}
> >> +
> >
> > [...]


* Re: [PATCH v2 bpf-next 02/13] libbpf: add btf__distill_base() creating split BTF with distilled base BTF
  2024-04-24 15:47 ` [PATCH v2 bpf-next 02/13] libbpf: add btf__distill_base() creating split BTF with distilled base BTF Alan Maguire
  2024-04-26 22:57   ` Andrii Nakryiko
@ 2024-04-30 23:06   ` Eduard Zingerman
  2024-05-01 17:29     ` Alan Maguire
  1 sibling, 1 reply; 43+ messages in thread
From: Eduard Zingerman @ 2024-04-30 23:06 UTC (permalink / raw)
  To: Alan Maguire, andrii, ast
  Cc: jolsa, acme, quentin, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan

On Wed, 2024-04-24 at 16:47 +0100, Alan Maguire wrote:

Hi Alan,

Looked through the patch, noted a few minor logical inconsistencies.
Agree with Andrii's comments about memory size allocated for dist.ids.
Otherwise this patch makes sense to me.

[...]

> @@ -5217,3 +5223,301 @@ int btf_ext_visit_str_offs(struct btf_ext *btf_ext, str_off_visit_fn visit, void
>  
>  	return 0;
>  }
> +
> +struct btf_distill_id {
> +	int id;
> +	bool embedded;		/* true if id refers to a struct/union in base BTF
> +				 * that is embedded in a split BTF struct/union.
> +				 */
> +};
> +
> +struct btf_distill {
> +	struct btf_pipe pipe;
> +	struct btf_distill_id *ids;
> +	__u32 query_id;
> +	unsigned int nr_base_types;
> +	unsigned int diff_id;
> +};
> +
> +/* Check if a member of a split BTF struct/union refers to a base BTF
> + * struct/union.  Members can be const/restrict/volatile/typedef
> + * reference types, but if a pointer is encountered, type is no longer
> + * considered embedded.
> + */
> +static int btf_find_embedded_composite_type_ids(__u32 *id, void *ctx)
> +{
> +	struct btf_distill *dist = ctx;
> +	const struct btf_type *t;
> +	__u32 next_id = *id;
> +
> +	do {
> +		if (next_id == 0)
> +			return 0;
> +		t = btf_type_by_id(dist->pipe.src, next_id);
> +		switch (btf_kind(t)) {
> +		case BTF_KIND_CONST:
> +		case BTF_KIND_RESTRICT:
> +		case BTF_KIND_VOLATILE:
> +		case BTF_KIND_TYPEDEF:

I think BTF_KIND_TYPE_TAG is missing.

> +			next_id = t->type;
> +			break;
> +		case BTF_KIND_ARRAY: {
> +			struct btf_array *a = btf_array(t);
> +
> +			next_id = a->type;
> +			break;
> +		}
> +		case BTF_KIND_STRUCT:
> +		case BTF_KIND_UNION:
> +			dist->ids[next_id].embedded = next_id > 0 &&
> +						      next_id <= dist->nr_base_types;

I think next_id can't be zero, otherwise its kind would be UNKN.
Also, should this be 'next_id < dist->nr_base_types'?

__u32 btf__type_cnt(const struct btf *btf)
{
	return btf->start_id + btf->nr_types;
}

static struct btf *btf_new(const void *data, __u32 size, struct btf *base_btf)
{
	...
	btf->nr_types = 0;
	btf->start_id = 1;
	...
	if (base_btf) {
		...
		btf->start_id = btf__type_cnt(base_btf);
		...
	}
	...
}

int btf__distill_base(const struct btf *src_btf, struct btf **new_base_btf,
		      struct btf **new_split_btf)
{
	...
	dist.nr_base_types = btf__type_cnt(btf__base_btf(src_btf));
	...
}

So, suppose there is only one base type:
- its ID would be 1;
- nr_types would be 1;
- nr_base_types would be 2;
- meaning that split BTF ids would start from 2.

Maybe use .split_start_id instead of .nr_base_types to avoid confusion?
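
To make the arithmetic above concrete, here is a minimal sketch. The
toy_* names are hypothetical stand-ins (not libbpf API) that model only
the start_id/nr_types fields quoted above; the point is that the type
count is one past the last base ID, so the base-range check wants a
strict '<'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct btf: only the two fields discussed above. */
struct toy_btf {
	unsigned int start_id;	/* first type ID owned by this BTF */
	unsigned int nr_types;	/* number of types owned by this BTF */
};

/* Mirrors the quoted btf__type_cnt(): one past the last valid type ID. */
static unsigned int toy_type_cnt(const struct toy_btf *btf)
{
	assert(btf != NULL);
	return btf->start_id + btf->nr_types;
}

/* A base type ID satisfies 0 < id < split_start_id (ID 0 is void). */
static bool toy_is_base_id(unsigned int id, unsigned int split_start_id)
{
	return id > 0 && id < split_start_id;
}
```

With a base BTF of { .start_id = 1, .nr_types = 1 }, toy_type_cnt()
gives 2, which is the first split ID rather than a count of base types;
hence the suggestion to name the field split_start_id.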

> +			return 0;
> +		default:
> +			return 0;
> +		}
> +
> +	} while (next_id != 0);
> +
> +	return 0;
> +}


* Re: [PATCH v2 bpf-next 03/13] selftests/bpf: test distilled base, split BTF generation
  2024-04-24 15:47 ` [PATCH v2 bpf-next 03/13] selftests/bpf: test distilled base, split BTF generation Alan Maguire
@ 2024-04-30 23:50   ` Eduard Zingerman
  2024-04-30 23:55   ` Eduard Zingerman
  1 sibling, 0 replies; 43+ messages in thread
From: Eduard Zingerman @ 2024-04-30 23:50 UTC (permalink / raw)
  To: Alan Maguire, andrii, ast
  Cc: jolsa, acme, quentin, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan

On Wed, 2024-04-24 at 16:47 +0100, Alan Maguire wrote:

[...]

> +static void test_distilled_base(void)
> +{

[...]

> +	if (!ASSERT_EQ(0, btf__distill_base(btf2, &btf3, &btf4),
> +		       "distilled_base") ||
> +	    !ASSERT_OK_PTR(btf3, "distilled_base") ||
> +	    !ASSERT_OK_PTR(btf4, "distilled_split"))
> +		goto cleanup;

Maybe also assert the value of btf4->start_id?
Otherwise looks good.

> +
> +	VALIDATE_RAW_BTF(
> +		btf4,
> +		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
> +		"[2] FWD 's1' fwd_kind=struct",
> +		"[3] STRUCT '(anon)' size=12 vlen=2\n"
> +		"\t'f1' type_id=1 bits_offset=0\n"
> +		"\t'f2' type_id=2 bits_offset=32",
> +		"[4] FWD 'u1' fwd_kind=union",
> +		"[5] UNION '(anon)' size=4 vlen=1\n"
> +		"\t'f1' type_id=1 bits_offset=0",
> +		"[6] ENUM 'e1' encoding=UNSIGNED size=4 vlen=0",
> +		"[7] ENUM '(anon)' encoding=UNSIGNED size=4 vlen=1\n"
> +		"\t'av1' val=2",
> +		"[8] ENUM64 'e641' encoding=SIGNED size=8 vlen=0",
> +		"[9] ENUM64 '(anon)' encoding=SIGNED size=8 vlen=1\n"
> +		"\t'v1' val=1025",
> +		"[10] STRUCT 'embedded' size=4 vlen=0",
> +		"[11] FUNC_PROTO '(anon)' ret_type_id=1 vlen=1\n"
> +		"\t'p1' type_id=1",
> +		"[12] ARRAY '(anon)' type_id=1 index_type_id=1 nr_elems=3",
> +		"[13] PTR '(anon)' type_id=2",
> +		"[14] PTR '(anon)' type_id=3",
> +		"[15] CONST '(anon)' type_id=4",
> +		"[16] RESTRICT '(anon)' type_id=5",
> +		"[17] VOLATILE '(anon)' type_id=6",
> +		"[18] TYPEDEF 'et' type_id=7",
> +		"[19] CONST '(anon)' type_id=8",
> +		"[20] PTR '(anon)' type_id=9",
> +		"[21] STRUCT 'with_embedded' size=4 vlen=1\n"
> +		"\t'f1' type_id=10 bits_offset=0",
> +		"[22] FUNC 'fn' type_id=11 linkage=static",
> +		"[23] TYPEDEF 'arraytype' type_id=12");
> +
> +cleanup:
> +	btf__free(btf4);
> +	btf__free(btf3);
> +	btf__free(btf2);
> +	btf__free(btf1);
> +}

[...]


* Re: [PATCH v2 bpf-next 03/13] selftests/bpf: test distilled base, split BTF generation
  2024-04-24 15:47 ` [PATCH v2 bpf-next 03/13] selftests/bpf: test distilled base, split BTF generation Alan Maguire
  2024-04-30 23:50   ` Eduard Zingerman
@ 2024-04-30 23:55   ` Eduard Zingerman
  2024-05-01 17:31     ` Alan Maguire
  1 sibling, 1 reply; 43+ messages in thread
From: Eduard Zingerman @ 2024-04-30 23:55 UTC (permalink / raw)
  To: Alan Maguire, andrii, ast
  Cc: jolsa, acme, quentin, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan

On Wed, 2024-04-24 at 16:47 +0100, Alan Maguire wrote:


[...]

> +static void test_distilled_base(void)
> +{
> 

[...]

> +
> +	VALIDATE_RAW_BTF(
> +		btf1,
> +		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
> +		"[2] PTR '(anon)' type_id=1",
> +		"[3] STRUCT 's1' size=8 vlen=1\n"
> +		"\t'f1' type_id=2 bits_offset=0",
> +		"[4] STRUCT '(anon)' size=12 vlen=2\n"
> +		"\t'f1' type_id=1 bits_offset=0\n"
> +		"\t'f2' type_id=3 bits_offset=32",
> +		"[5] INT 'unsigned int' size=4 bits_offset=0 nr_bits=32 encoding=(none)",
> +		"[6] UNION 'u1' size=12 vlen=2\n"
> +		"\t'f1' type_id=1 bits_offset=0\n"
> +		"\t'f2' type_id=2 bits_offset=0",
> +		"[7] UNION '(anon)' size=4 vlen=1\n"
> +		"\t'f1' type_id=1 bits_offset=0",
> +		"[8] ENUM 'e1' encoding=UNSIGNED size=4 vlen=1\n"
> +		"\t'v1' val=1",
> +		"[9] ENUM '(anon)' encoding=UNSIGNED size=4 vlen=1\n"
> +		"\t'av1' val=2",
> +		"[10] ENUM64 'e641' encoding=SIGNED size=8 vlen=1\n"
> +		"\t'v1' val=1024",
> +		"[11] ENUM64 '(anon)' encoding=SIGNED size=8 vlen=1\n"
> +		"\t'v1' val=1025",
> +		"[12] STRUCT 'unneeded' size=4 vlen=1\n"
> +		"\t'f1' type_id=1 bits_offset=0",
> +		"[13] STRUCT 'embedded' size=4 vlen=1\n"
> +		"\t'f1' type_id=1 bits_offset=0",
> +		"[14] FUNC_PROTO '(anon)' ret_type_id=1 vlen=1\n"
> +		"\t'p1' type_id=1",
> +		"[15] ARRAY '(anon)' type_id=1 index_type_id=1 nr_elems=3");

Sorry, one more thing,
maybe add a FUNC_PROTO referencing a struct and refer to this proto from btf2?
To check that FUNC_PROTOs are visited as appropriate.

> +
> +	btf2 = btf__new_empty_split(btf1);
> +	if (!ASSERT_OK_PTR(btf2, "empty_split_btf"))
> +		goto cleanup;
> +

[...]


* Re: [PATCH v2 bpf-next 04/13] libbpf: add btf__parse_opts() API for flexible BTF parsing
  2024-04-24 15:47 ` [PATCH v2 bpf-next 04/13] libbpf: add btf__parse_opts() API for flexible BTF parsing Alan Maguire
  2024-04-29 23:40   ` Andrii Nakryiko
@ 2024-05-01  0:07   ` Eduard Zingerman
  1 sibling, 0 replies; 43+ messages in thread
From: Eduard Zingerman @ 2024-05-01  0:07 UTC (permalink / raw)
  To: Alan Maguire, andrii, ast
  Cc: jolsa, acme, quentin, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan

On Wed, 2024-04-24 at 16:47 +0100, Alan Maguire wrote:

[...]

> @@ -1301,23 +1302,42 @@ static struct btf *btf_parse(const char *path, struct btf *base_btf, struct btf_
>  	if (btf_ext)
>  		*btf_ext = NULL;
>  
> -	btf = btf_parse_raw(path, base_btf);
> -	err = libbpf_get_error(btf);
> -	if (!err)
> -		return btf;
> -	if (err != -EPROTO)
> -		return ERR_PTR(err);
> -	return btf_parse_elf(path, base_btf, btf_ext);
> +	if (!btf_elf_sec) {
> +		btf = btf_parse_raw(path, base_btf);
> +		err = libbpf_get_error(btf);
> +		if (!err)
> +			return btf;
> +		if (err != -EPROTO)
> +			return ERR_PTR(err);
> +	}
> +	if (!btf_elf_sec)
> +		btf_elf_sec = BTF_ELF_SEC;
> +
> +	return btf_parse_elf(path, btf_elf_sec, base_btf, btf_ext);
> +}
> +
> +struct btf *btf__parse_opts(const char *path, struct btf_parse_opts *opts)
> +{
> +	struct btf *base_btf;
> +	const char *btf_sec;
> +	struct btf_ext **btf_ext;
> +
> +	if (!OPTS_VALID(opts, btf_parse_opts))
> +		return libbpf_err_ptr(-EINVAL);
> +	base_btf = OPTS_GET(opts, base_btf, NULL);
> +	btf_sec = OPTS_GET(opts, btf_sec, NULL);
> +	btf_ext = OPTS_GET(opts, btf_ext, NULL);
> +	return libbpf_ptr(btf_parse(path, btf_sec, base_btf, btf_ext));
>  }
>  

I think that btf_parse() should be inlined into btf__parse_opts() and removed.
As a proxy for passing btf_parse_opts fields as parameters it does not
make much sense.

[...]


* Re: [PATCH v2 bpf-next 06/13] kbuild,bpf: switch to using --btf_features for pahole v1.26 and later
  2024-04-29 23:43   ` Andrii Nakryiko
@ 2024-05-01 17:22     ` Alan Maguire
  0 siblings, 0 replies; 43+ messages in thread
From: Alan Maguire @ 2024-05-01 17:22 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On 30/04/2024 00:43, Andrii Nakryiko wrote:
> On Wed, Apr 24, 2024 at 8:49 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>
>> The btf_features list can be used for pahole v1.26 and later -
>> it is useful because if a feature is not yet implemented it will
>> not exit with a failure message.  This will allow us to add feature
>> requests to the pahole options without having to check pahole versions
>> in future; if the version of pahole supports the feature it will be
>> added.
>>
>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>> ---
>>  scripts/Makefile.btf | 7 +++++--
>>  1 file changed, 5 insertions(+), 2 deletions(-)
>>
> 
> post this patch separately? we can land it sooner, right?
> 
>

sure, will do!

>> diff --git a/scripts/Makefile.btf b/scripts/Makefile.btf
>> index 82377e470aed..8e6a9d4b492e 100644
>> --- a/scripts/Makefile.btf
>> +++ b/scripts/Makefile.btf
>> @@ -12,8 +12,11 @@ pahole-flags-$(call test-ge, $(pahole-ver), 121)     += --btf_gen_floats
>>
>>  pahole-flags-$(call test-ge, $(pahole-ver), 122)       += -j
>>
>> -pahole-flags-$(CONFIG_PAHOLE_HAS_LANG_EXCLUDE)         += --lang_exclude=rust
>> -
>>  pahole-flags-$(call test-ge, $(pahole-ver), 125)       += --skip_encoding_btf_inconsistent_proto --btf_gen_optimized
>>
>> +# Switch to using --btf_features for v1.26 and later.
>> +pahole-flags-$(call test-ge, $(pahole-ver), 126)       = -j --btf_features=encode_force,var,float,enum64,decl_tag,type_tag,optimized_func,consistent_func
>> +
>> +pahole-flags-$(CONFIG_PAHOLE_HAS_LANG_EXCLUDE)         += --lang_exclude=rust
>> +
>>  export PAHOLE_FLAGS := $(pahole-flags-y)
>> --
>> 2.31.1
>>


* Re: [PATCH v2 bpf-next 02/13] libbpf: add btf__distill_base() creating split BTF with distilled base BTF
  2024-04-30 23:06   ` Eduard Zingerman
@ 2024-05-01 17:29     ` Alan Maguire
  2024-05-01 17:43       ` Eduard Zingerman
  0 siblings, 1 reply; 43+ messages in thread
From: Alan Maguire @ 2024-05-01 17:29 UTC (permalink / raw)
  To: Eduard Zingerman, andrii, ast
  Cc: jolsa, acme, quentin, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan

On 01/05/2024 00:06, Eduard Zingerman wrote:
> On Wed, 2024-04-24 at 16:47 +0100, Alan Maguire wrote:
> 
> Hi Alan,
> 
> Looked through the patch, noted a few minor logical inconsistencies.
> Agree with Andrii's comments about memory size allocated for dist.ids.
> Otherwise this patch makes sense to me.
>
thanks for taking a look! I'm working on an updated series incorporating
the approach of limiting distilled base to named struct/union/enum
types, hope to have that ready by the end of the week. It will also OR
in flags to mark types as embedded as per Andrii and your suggestion.
A bit more below..
> [...]
> 
>> @@ -5217,3 +5223,301 @@ int btf_ext_visit_str_offs(struct btf_ext *btf_ext, str_off_visit_fn visit, void
>>  
>>  	return 0;
>>  }
>> +
>> +struct btf_distill_id {
>> +	int id;
>> +	bool embedded;		/* true if id refers to a struct/union in base BTF
>> +				 * that is embedded in a split BTF struct/union.
>> +				 */
>> +};
>> +
>> +struct btf_distill {
>> +	struct btf_pipe pipe;
>> +	struct btf_distill_id *ids;
>> +	__u32 query_id;
>> +	unsigned int nr_base_types;
>> +	unsigned int diff_id;
>> +};
>> +
>> +/* Check if a member of a split BTF struct/union refers to a base BTF
>> + * struct/union.  Members can be const/restrict/volatile/typedef
>> + * reference types, but if a pointer is encountered, type is no longer
>> + * considered embedded.
>> + */
>> +static int btf_find_embedded_composite_type_ids(__u32 *id, void *ctx)
>> +{
>> +	struct btf_distill *dist = ctx;
>> +	const struct btf_type *t;
>> +	__u32 next_id = *id;
>> +
>> +	do {
>> +		if (next_id == 0)
>> +			return 0;
>> +		t = btf_type_by_id(dist->pipe.src, next_id);
>> +		switch (btf_kind(t)) {
>> +		case BTF_KIND_CONST:
>> +		case BTF_KIND_RESTRICT:
>> +		case BTF_KIND_VOLATILE:
>> +		case BTF_KIND_TYPEDEF:
> 
> I think BTF_KIND_TYPE_TAG is missing.
>

It's implicit in the default clause; I can't see a case for a split BTF
type tag that refers to base BTF types, but I might be missing something
there. I can make all the unexpected types explicit if that would be
clearer?


>> +			next_id = t->type;
>> +			break;
>> +		case BTF_KIND_ARRAY: {
>> +			struct btf_array *a = btf_array(t);
>> +
>> +			next_id = a->type;
>> +			break;
>> +		}
>> +		case BTF_KIND_STRUCT:
>> +		case BTF_KIND_UNION:
>> +			dist->ids[next_id].embedded = next_id > 0 &&
>> +						      next_id <= dist->nr_base_types;
> 
> I think next_id can't be zero, otherwise its kind would be UNKN.
> Also, should this be 'next_id < dist->nr_base_types'?
> 
yeah this needs to be fixed; also isn't worth range-checking this as
it's got to be a base type AFAICT.

> __u32 btf__type_cnt(const struct btf *btf)
> {
> 	return btf->start_id + btf->nr_types;
> }
> 
> static struct btf *btf_new(const void *data, __u32 size, struct btf *base_btf)
> {
> 	...
> 	btf->nr_types = 0;
> 	btf->start_id = 1;
> 	...
> 	if (base_btf) {
> 		...
> 		btf->start_id = btf__type_cnt(base_btf);
> 		...
> 	}
> 	...
> }
> 
> int btf__distill_base(const struct btf *src_btf, struct btf **new_base_btf,
> 		      struct btf **new_split_btf)
> {
> 	...
> 	dist.nr_base_types = btf__type_cnt(btf__base_btf(src_btf));
> 	...
> }
> 
> So, suppose there is only one base type:
> - its ID would be 1;
> - nr_types would be 1;
> - nr_base_types would be 2;
> - meaning that split BTF ids would start from 2.
> 
> Maybe use .split_start_id instead of .nr_base_types to avoid confusion?
> 

good idea, will fix. thanks!

>> +			return 0;
>> +		default:
>> +			return 0;
>> +		}
>> +
>> +	} while (next_id != 0);
>> +
>> +	return 0;
>> +}


* Re: [PATCH v2 bpf-next 03/13] selftests/bpf: test distilled base, split BTF generation
  2024-04-30 23:55   ` Eduard Zingerman
@ 2024-05-01 17:31     ` Alan Maguire
  0 siblings, 0 replies; 43+ messages in thread
From: Alan Maguire @ 2024-05-01 17:31 UTC (permalink / raw)
  To: Eduard Zingerman, andrii, ast
  Cc: jolsa, acme, quentin, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan

On 01/05/2024 00:55, Eduard Zingerman wrote:
> On Wed, 2024-04-24 at 16:47 +0100, Alan Maguire wrote:
> 
> 
> [...]
> 
>> +static void test_distilled_base(void)
>> +{
>>
> 
> [...]
> 
>> +
>> +	VALIDATE_RAW_BTF(
>> +		btf1,
>> +		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
>> +		"[2] PTR '(anon)' type_id=1",
>> +		"[3] STRUCT 's1' size=8 vlen=1\n"
>> +		"\t'f1' type_id=2 bits_offset=0",
>> +		"[4] STRUCT '(anon)' size=12 vlen=2\n"
>> +		"\t'f1' type_id=1 bits_offset=0\n"
>> +		"\t'f2' type_id=3 bits_offset=32",
>> +		"[5] INT 'unsigned int' size=4 bits_offset=0 nr_bits=32 encoding=(none)",
>> +		"[6] UNION 'u1' size=12 vlen=2\n"
>> +		"\t'f1' type_id=1 bits_offset=0\n"
>> +		"\t'f2' type_id=2 bits_offset=0",
>> +		"[7] UNION '(anon)' size=4 vlen=1\n"
>> +		"\t'f1' type_id=1 bits_offset=0",
>> +		"[8] ENUM 'e1' encoding=UNSIGNED size=4 vlen=1\n"
>> +		"\t'v1' val=1",
>> +		"[9] ENUM '(anon)' encoding=UNSIGNED size=4 vlen=1\n"
>> +		"\t'av1' val=2",
>> +		"[10] ENUM64 'e641' encoding=SIGNED size=8 vlen=1\n"
>> +		"\t'v1' val=1024",
>> +		"[11] ENUM64 '(anon)' encoding=SIGNED size=8 vlen=1\n"
>> +		"\t'v1' val=1025",
>> +		"[12] STRUCT 'unneeded' size=4 vlen=1\n"
>> +		"\t'f1' type_id=1 bits_offset=0",
>> +		"[13] STRUCT 'embedded' size=4 vlen=1\n"
>> +		"\t'f1' type_id=1 bits_offset=0",
>> +		"[14] FUNC_PROTO '(anon)' ret_type_id=1 vlen=1\n"
>> +		"\t'p1' type_id=1",
>> +		"[15] ARRAY '(anon)' type_id=1 index_type_id=1 nr_elems=3");
> 
> Sorry, one more thing,
> maybe add a FUNC_PROTO referencing a struct and refer to this proto from btf2?
> To check that FUNC_PROTOs are visited as appropriate.
>

good idea, I'll add this. the test will need to be reworked anyway since
ref types etc will move to split BTF.

>> +
>> +	btf2 = btf__new_empty_split(btf1);
>> +	if (!ASSERT_OK_PTR(btf2, "empty_split_btf"))
>> +		goto cleanup;
>> +
> 
> [...]


* Re: [PATCH v2 bpf-next 04/13] libbpf: add btf__parse_opts() API for flexible BTF parsing
  2024-04-29 23:40   ` Andrii Nakryiko
@ 2024-05-01 17:42     ` Alan Maguire
  2024-05-01 17:47       ` Andrii Nakryiko
  0 siblings, 1 reply; 43+ messages in thread
From: Alan Maguire @ 2024-05-01 17:42 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On 30/04/2024 00:40, Andrii Nakryiko wrote:
> On Wed, Apr 24, 2024 at 8:48 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>>
>> Options cover existing parsing scenarios (ELF, raw, retrieving
>> .BTF.ext) and also allow specification of the ELF section name
>> containing BTF.  This will allow consumers to retrieve BTF from
>> .BTF.base sections (BTF_BASE_ELF_SEC) also.
>>
>> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
>> ---
>>  tools/lib/bpf/btf.c      | 50 ++++++++++++++++++++++++++++------------
>>  tools/lib/bpf/btf.h      | 32 +++++++++++++++++++++++++
>>  tools/lib/bpf/libbpf.map |  1 +
>>  3 files changed, 68 insertions(+), 15 deletions(-)
>>
>> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
>> index 419cc4fa2e86..9036c1dc45d0 100644
>> --- a/tools/lib/bpf/btf.c
>> +++ b/tools/lib/bpf/btf.c
>> @@ -1084,7 +1084,7 @@ struct btf *btf__new_split(const void *data, __u32 size, struct btf *base_btf)
>>         return libbpf_ptr(btf_new(data, size, base_btf));
>>  }
>>
>> -static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
>> +static struct btf *btf_parse_elf(const char *path, const char *btf_sec, struct btf *base_btf,
>>                                  struct btf_ext **btf_ext)
>>  {
>>         Elf_Data *btf_data = NULL, *btf_ext_data = NULL;
>> @@ -1146,7 +1146,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
>>                                 idx, path);
>>                         goto done;
>>                 }
>> -               if (strcmp(name, BTF_ELF_SEC) == 0) {
>> +               if (strcmp(name, btf_sec) == 0) {
>>                         btf_data = elf_getdata(scn, 0);
>>                         if (!btf_data) {
>>                                 pr_warn("failed to get section(%d, %s) data from %s\n",
>> @@ -1166,7 +1166,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
>>         }
>>
>>         if (!btf_data) {
>> -               pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
>> +               pr_warn("failed to find '%s' ELF section in %s\n", btf_sec, path);
>>                 err = -ENODATA;
>>                 goto done;
>>         }
>> @@ -1212,12 +1212,12 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
>>
>>  struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext)
>>  {
>> -       return libbpf_ptr(btf_parse_elf(path, NULL, btf_ext));
>> +       return libbpf_ptr(btf_parse_elf(path, BTF_ELF_SEC, NULL, btf_ext));
>>  }
>>
>>  struct btf *btf__parse_elf_split(const char *path, struct btf *base_btf)
>>  {
>> -       return libbpf_ptr(btf_parse_elf(path, base_btf, NULL));
>> +       return libbpf_ptr(btf_parse_elf(path, BTF_ELF_SEC, base_btf, NULL));
>>  }
>>
>>  static struct btf *btf_parse_raw(const char *path, struct btf *base_btf)
>> @@ -1293,7 +1293,8 @@ struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf)
>>         return libbpf_ptr(btf_parse_raw(path, base_btf));
>>  }
>>
>> -static struct btf *btf_parse(const char *path, struct btf *base_btf, struct btf_ext **btf_ext)
>> +static struct btf *btf_parse(const char *path, const char *btf_elf_sec, struct btf *base_btf,
>> +                            struct btf_ext **btf_ext)
>>  {
>>         struct btf *btf;
>>         int err;
>> @@ -1301,23 +1302,42 @@ static struct btf *btf_parse(const char *path, struct btf *base_btf, struct btf_
>>         if (btf_ext)
>>                 *btf_ext = NULL;
>>
>> -       btf = btf_parse_raw(path, base_btf);
>> -       err = libbpf_get_error(btf);
>> -       if (!err)
>> -               return btf;
>> -       if (err != -EPROTO)
>> -               return ERR_PTR(err);
>> -       return btf_parse_elf(path, base_btf, btf_ext);
>> +       if (!btf_elf_sec) {
>> +               btf = btf_parse_raw(path, base_btf);
>> +               err = libbpf_get_error(btf);
>> +               if (!err)
>> +                       return btf;
>> +               if (err != -EPROTO)
>> +                       return ERR_PTR(err);
>> +       }
>> +       if (!btf_elf_sec)
>> +               btf_elf_sec = BTF_ELF_SEC;
>> +
>> +       return btf_parse_elf(path, btf_elf_sec, base_btf, btf_ext);
> 
> nit: btf_elf_sec ?: BTF_ELF_SEC
>

sure, will fix.

> 
>> +}
>> +
>> +struct btf *btf__parse_opts(const char *path, struct btf_parse_opts *opts)
>> +{
>> +       struct btf *base_btf;
>> +       const char *btf_sec;
>> +       struct btf_ext **btf_ext;
>> +
>> +       if (!OPTS_VALID(opts, btf_parse_opts))
>> +               return libbpf_err_ptr(-EINVAL);
>> +       base_btf = OPTS_GET(opts, base_btf, NULL);
>> +       btf_sec = OPTS_GET(opts, btf_sec, NULL);
>> +       btf_ext = OPTS_GET(opts, btf_ext, NULL);
>> +       return libbpf_ptr(btf_parse(path, btf_sec, base_btf, btf_ext));
>>  }
>>
>>  struct btf *btf__parse(const char *path, struct btf_ext **btf_ext)
>>  {
>> -       return libbpf_ptr(btf_parse(path, NULL, btf_ext));
>> +       return libbpf_ptr(btf_parse(path, NULL, NULL, btf_ext));
>>  }
>>
>>  struct btf *btf__parse_split(const char *path, struct btf *base_btf)
>>  {
>> -       return libbpf_ptr(btf_parse(path, base_btf, NULL));
>> +       return libbpf_ptr(btf_parse(path, NULL, base_btf, NULL));
>>  }
>>
>>  static void *btf_get_raw_data(const struct btf *btf, __u32 *size, bool swap_endian);
>> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
>> index 025ed28b7fe8..94dfdfdef617 100644
>> --- a/tools/lib/bpf/btf.h
>> +++ b/tools/lib/bpf/btf.h
>> @@ -18,6 +18,7 @@ extern "C" {
>>
>>  #define BTF_ELF_SEC ".BTF"
>>  #define BTF_EXT_ELF_SEC ".BTF.ext"
>> +#define BTF_BASE_ELF_SEC ".BTF.base"
> 
> Does libbpf code itself use this? If not, let's get rid of it.
> 

We could, but I wonder whether there would be value in keeping it around,
as multiple consumers need to agree on this name (pahole, resolve_btfids,
bpftool)?

>>  #define MAPS_ELF_SEC ".maps"
>>
>>  struct btf;
>> @@ -134,6 +135,37 @@ LIBBPF_API struct btf *btf__parse_elf_split(const char *path, struct btf *base_b
>>  LIBBPF_API struct btf *btf__parse_raw(const char *path);
>>  LIBBPF_API struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf);
>>
>> +struct btf_parse_opts {
>> +       size_t sz;
>> +       /* use base BTF to parse split BTF */
>> +       struct btf *base_btf;
>> +       /* retrieve optional .BTF.ext info */
>> +       struct btf_ext **btf_ext;
>> +       /* BTF section name */
> 
> let's mention that if not set, libbpf will default to trying to parse
> data as raw BTF, and then will fallback to .BTF in ELF. If it is set
> to non-NULL, we'll assume ELF and use that section to fetch BTF data.
>

sure, will do.
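
For reference, the behaviour described above can be sketched as a small
decision function. This is only an illustration under assumptions: the
toy_* names and struct toy_plan are hypothetical, and no actual parsing
is done; it just records which source would be tried first and which
ELF section would be used.

```c
#include <stddef.h>

enum toy_source { TOY_RAW, TOY_ELF };

/* Default ELF section, matching the value of BTF_ELF_SEC in the patch. */
static const char toy_default_sec[] = ".BTF";

struct toy_plan {
	enum toy_source first;	/* format tried first */
	const char *elf_sec;	/* section used whenever ELF is parsed */
	int fallback_to_elf;	/* nonzero: retry as ELF if data is not raw BTF */
};

/* btf_sec == NULL: try raw BTF, then fall back to .BTF in ELF.
 * btf_sec != NULL: assume ELF and fetch BTF from that section only.
 */
static struct toy_plan toy_plan_parse(const char *btf_sec)
{
	struct toy_plan plan;

	if (btf_sec) {
		plan.first = TOY_ELF;
		plan.elf_sec = btf_sec;
		plan.fallback_to_elf = 0;
	} else {
		plan.first = TOY_RAW;
		plan.elf_sec = toy_default_sec;
		plan.fallback_to_elf = 1;
	}
	return plan;
}
```

So toy_plan_parse(NULL) starts with raw parsing and keeps ".BTF" as the
fallback section, while toy_plan_parse(".BTF.base") goes straight to ELF.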

>> +       const char *btf_sec;
>> +       size_t:0;
> 
> nit: size_t :0; (consistency)
> 
>> +};
>> +
>> +#define btf_parse_opts__last_field btf_sec
>> +
>> +/* @brief **btf__parse_opts()** parses BTF information from either a
>> + * raw BTF file (*btf_sec* is NULL) or from the specified BTF section,
>> + * also retrieving  .BTF.ext info if *btf_ext* is non-NULL.  If
>> + * *base_btf* is specified, use it to parse split BTF from the
>> + * specified location.
>> + *
>> + * @return new BTF object instance which has to be eventually freed with
>> + * **btf__free()**
>> + *
>> + * On error, error-code-encoded-as-pointer is returned, not a NULL. To extract
> 
> this is false, we don't encode error as pointer anymore. starting from
> v1.0 it's always NULL + errno.
> 

ah good catch, I must have cut-and-pasted this..
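
As a reminder of what the corrected comment should describe, here is a
minimal sketch of the "NULL + errno" convention from the caller's side.
toy_parse() and toy_try_parse() are hypothetical stand-ins, not libbpf
functions; the only assumption taken from the review is that a
pointer-returning API yields NULL on failure and sets errno.

```c
#include <errno.h>
#include <stddef.h>

struct toy_obj { int dummy; };

static struct toy_obj toy_singleton;

/* Toy constructor: returns NULL and sets errno on error. */
static struct toy_obj *toy_parse(const char *path)
{
	if (!path || !*path) {
		errno = EINVAL;
		return NULL;
	}
	return &toy_singleton;
}

/* Caller side: check for NULL, then read errno; returns 0 or -errno. */
static int toy_try_parse(const char *path, struct toy_obj **out)
{
	struct toy_obj *obj = toy_parse(path);

	if (!obj)
		return -errno;
	*out = obj;
	return 0;
}
```

The caller never needs to decode an error out of the pointer value; a
plain NULL check followed by reading errno is enough.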

Thanks again for all the review help!

Alan

>> + * error code from such a pointer `libbpf_get_error()` should be used. If
>> + * `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is
>> + * returned on error instead. In both cases thread-local `errno` variable is
>> + * always set to error code as well.
>> + */
>> +
>> +LIBBPF_API struct btf *btf__parse_opts(const char *path, struct btf_parse_opts *opts);
>> +
>>  LIBBPF_API struct btf *btf__load_vmlinux_btf(void);
>>  LIBBPF_API struct btf *btf__load_module_btf(const char *module_name, struct btf *vmlinux_btf);
>>
>> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
>> index c4d9bd7d3220..a9151e31dfa9 100644
>> --- a/tools/lib/bpf/libbpf.map
>> +++ b/tools/lib/bpf/libbpf.map
>> @@ -421,6 +421,7 @@ LIBBPF_1.5.0 {
>>         global:
>>                 bpf_program__attach_sockmap;
>>                 btf__distill_base;
>> +               btf__parse_opts;
>>                 ring__consume_n;
>>                 ring_buffer__consume_n;
>>  } LIBBPF_1.4.0;
>> --
>> 2.31.1
>>


* Re: [PATCH v2 bpf-next 02/13] libbpf: add btf__distill_base() creating split BTF with distilled base BTF
  2024-05-01 17:29     ` Alan Maguire
@ 2024-05-01 17:43       ` Eduard Zingerman
  2024-05-02 11:51         ` Alan Maguire
  0 siblings, 1 reply; 43+ messages in thread
From: Eduard Zingerman @ 2024-05-01 17:43 UTC (permalink / raw)
  To: Alan Maguire, andrii, ast
  Cc: jolsa, acme, quentin, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan

On Wed, 2024-05-01 at 18:29 +0100, Alan Maguire wrote:

[...]

> > > +/* Check if a member of a split BTF struct/union refers to a base BTF
> > > + * struct/union.  Members can be const/restrict/volatile/typedef
> > > + * reference types, but if a pointer is encountered, type is no longer
> > > + * considered embedded.
> > > + */
> > > +static int btf_find_embedded_composite_type_ids(__u32 *id, void *ctx)
> > > +{
> > > +	struct btf_distill *dist = ctx;
> > > +	const struct btf_type *t;
> > > +	__u32 next_id = *id;
> > > +
> > > +	do {
> > > +		if (next_id == 0)
> > > +			return 0;
> > > +		t = btf_type_by_id(dist->pipe.src, next_id);
> > > +		switch (btf_kind(t)) {
> > > +		case BTF_KIND_CONST:
> > > +		case BTF_KIND_RESTRICT:
> > > +		case BTF_KIND_VOLATILE:
> > > +		case BTF_KIND_TYPEDEF:
> > 
> > I think BTF_KIND_TYPE_TAG is missing.
> > 
> 
> It's implicit in the default clause; I can't see a case for a split BTF
> type tag that refers to base BTF types, but I might be missing something
> there. I can make all the unexpected types explicit if that would be
> clearer?

I mean, this skips a series of modifiers, e.g.:

struct buz {
  // next_id will get to 'struct bar' eventually
  const volatile struct bar foo;
};

Now, it is legal to have a chain like the one below:

struct buz {
  const volatile __type_tag("quux") struct bar foo;
};

In which case the traversal does not have to stop.
Am I confused?

(Note: at the moment type tags are only applied to pointers but that
 would change in the future, I have a stalled LLVM change for this).
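
A minimal sketch of the traversal I have in mind, using a toy type
table rather than real BTF. All toy_* names are hypothetical and the
kinds are a reduced stand-in for the BTF_KIND_* values; the only point
is that a type tag is skipped like any other modifier, while a pointer
ends the walk.

```c
#include <assert.h>
#include <stddef.h>

enum toy_kind {
	TOY_VOID, TOY_INT, TOY_PTR, TOY_STRUCT,
	TOY_CONST, TOY_VOLATILE, TOY_TYPEDEF, TOY_TYPE_TAG,
};

struct toy_type {
	enum toy_kind kind;
	unsigned int type;	/* referenced type ID, for modifiers and pointers */
};

/* Follow modifiers from 'id'; return the struct ID reached, or 0 if the
 * chain ends in anything else (e.g. a pointer, so not embedded).
 */
static unsigned int toy_embedded_struct_id(const struct toy_type *types,
					   unsigned int nr, unsigned int id)
{
	assert(types != NULL);
	while (id != 0 && id < nr) {
		switch (types[id].kind) {
		case TOY_CONST:
		case TOY_VOLATILE:
		case TOY_TYPEDEF:
		case TOY_TYPE_TAG:	/* skipped like the other modifiers */
			id = types[id].type;
			break;
		case TOY_STRUCT:
			return id;
		default:
			return 0;
		}
	}
	return 0;
}
```

For a chain const -> volatile -> type_tag -> struct the walk reaches the
struct; without the TOY_TYPE_TAG case it would stop at the tag and the
struct would not be marked as embedded.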

[...]


* Re: [PATCH v2 bpf-next 04/13] libbpf: add btf__parse_opts() API for flexible BTF parsing
  2024-05-01 17:42     ` Alan Maguire
@ 2024-05-01 17:47       ` Andrii Nakryiko
  0 siblings, 0 replies; 43+ messages in thread
From: Andrii Nakryiko @ 2024-05-01 17:47 UTC (permalink / raw)
  To: Alan Maguire
  Cc: andrii, ast, jolsa, acme, quentin, eddyz87, mykolal, daniel,
	martin.lau, song, yonghong.song, john.fastabend, kpsingh, sdf,
	haoluo, houtao1, bpf, masahiroy, mcgrof, nathan

On Wed, May 1, 2024 at 10:43 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On 30/04/2024 00:40, Andrii Nakryiko wrote:
> > On Wed, Apr 24, 2024 at 8:48 AM Alan Maguire <alan.maguire@oracle.com> wrote:
> >>
> >> Options cover existing parsing scenarios (ELF, raw, retrieving
> >> .BTF.ext) and also allow specification of the ELF section name
> >> containing BTF.  This will allow consumers to retrieve BTF from
> >> .BTF.base sections (BTF_BASE_ELF_SEC) also.
> >>
> >> Signed-off-by: Alan Maguire <alan.maguire@oracle.com>
> >> ---
> >>  tools/lib/bpf/btf.c      | 50 ++++++++++++++++++++++++++++------------
> >>  tools/lib/bpf/btf.h      | 32 +++++++++++++++++++++++++
> >>  tools/lib/bpf/libbpf.map |  1 +
> >>  3 files changed, 68 insertions(+), 15 deletions(-)
> >>
> >> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> >> index 419cc4fa2e86..9036c1dc45d0 100644
> >> --- a/tools/lib/bpf/btf.c
> >> +++ b/tools/lib/bpf/btf.c
> >> @@ -1084,7 +1084,7 @@ struct btf *btf__new_split(const void *data, __u32 size, struct btf *base_btf)
> >>         return libbpf_ptr(btf_new(data, size, base_btf));
> >>  }
> >>
> >> -static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
> >> +static struct btf *btf_parse_elf(const char *path, const char *btf_sec, struct btf *base_btf,
> >>                                  struct btf_ext **btf_ext)
> >>  {
> >>         Elf_Data *btf_data = NULL, *btf_ext_data = NULL;
> >> @@ -1146,7 +1146,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
> >>                                 idx, path);
> >>                         goto done;
> >>                 }
> >> -               if (strcmp(name, BTF_ELF_SEC) == 0) {
> >> +               if (strcmp(name, btf_sec) == 0) {
> >>                         btf_data = elf_getdata(scn, 0);
> >>                         if (!btf_data) {
> >>                                 pr_warn("failed to get section(%d, %s) data from %s\n",
> >> @@ -1166,7 +1166,7 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
> >>         }
> >>
> >>         if (!btf_data) {
> >> -               pr_warn("failed to find '%s' ELF section in %s\n", BTF_ELF_SEC, path);
> >> +               pr_warn("failed to find '%s' ELF section in %s\n", btf_sec, path);
> >>                 err = -ENODATA;
> >>                 goto done;
> >>         }
> >> @@ -1212,12 +1212,12 @@ static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
> >>
> >>  struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext)
> >>  {
> >> -       return libbpf_ptr(btf_parse_elf(path, NULL, btf_ext));
> >> +       return libbpf_ptr(btf_parse_elf(path, BTF_ELF_SEC, NULL, btf_ext));
> >>  }
> >>
> >>  struct btf *btf__parse_elf_split(const char *path, struct btf *base_btf)
> >>  {
> >> -       return libbpf_ptr(btf_parse_elf(path, base_btf, NULL));
> >> +       return libbpf_ptr(btf_parse_elf(path, BTF_ELF_SEC, base_btf, NULL));
> >>  }
> >>
> >>  static struct btf *btf_parse_raw(const char *path, struct btf *base_btf)
> >> @@ -1293,7 +1293,8 @@ struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf)
> >>         return libbpf_ptr(btf_parse_raw(path, base_btf));
> >>  }
> >>
> >> -static struct btf *btf_parse(const char *path, struct btf *base_btf, struct btf_ext **btf_ext)
> >> +static struct btf *btf_parse(const char *path, const char *btf_elf_sec, struct btf *base_btf,
> >> +                            struct btf_ext **btf_ext)
> >>  {
> >>         struct btf *btf;
> >>         int err;
> >> @@ -1301,23 +1302,42 @@ static struct btf *btf_parse(const char *path, struct btf *base_btf, struct btf_
> >>         if (btf_ext)
> >>                 *btf_ext = NULL;
> >>
> >> -       btf = btf_parse_raw(path, base_btf);
> >> -       err = libbpf_get_error(btf);
> >> -       if (!err)
> >> -               return btf;
> >> -       if (err != -EPROTO)
> >> -               return ERR_PTR(err);
> >> -       return btf_parse_elf(path, base_btf, btf_ext);
> >> +       if (!btf_elf_sec) {
> >> +               btf = btf_parse_raw(path, base_btf);
> >> +               err = libbpf_get_error(btf);
> >> +               if (!err)
> >> +                       return btf;
> >> +               if (err != -EPROTO)
> >> +                       return ERR_PTR(err);
> >> +       }
> >> +       if (!btf_elf_sec)
> >> +               btf_elf_sec = BTF_ELF_SEC;
> >> +
> >> +       return btf_parse_elf(path, btf_elf_sec, base_btf, btf_ext);
> >
> > nit: btf_elf_sec ?: BTF_ELF_SEC
> >
>
> sure, will fix.
>
> >
> >> +}
> >> +
> >> +struct btf *btf__parse_opts(const char *path, struct btf_parse_opts *opts)
> >> +{
> >> +       struct btf *base_btf;
> >> +       const char *btf_sec;
> >> +       struct btf_ext **btf_ext;
> >> +
> >> +       if (!OPTS_VALID(opts, btf_parse_opts))
> >> +               return libbpf_err_ptr(-EINVAL);
> >> +       base_btf = OPTS_GET(opts, base_btf, NULL);
> >> +       btf_sec = OPTS_GET(opts, btf_sec, NULL);
> >> +       btf_ext = OPTS_GET(opts, btf_ext, NULL);
> >> +       return libbpf_ptr(btf_parse(path, btf_sec, base_btf, btf_ext));
> >>  }
> >>
> >>  struct btf *btf__parse(const char *path, struct btf_ext **btf_ext)
> >>  {
> >> -       return libbpf_ptr(btf_parse(path, NULL, btf_ext));
> >> +       return libbpf_ptr(btf_parse(path, NULL, NULL, btf_ext));
> >>  }
> >>
> >>  struct btf *btf__parse_split(const char *path, struct btf *base_btf)
> >>  {
> >> -       return libbpf_ptr(btf_parse(path, base_btf, NULL));
> >> +       return libbpf_ptr(btf_parse(path, NULL, base_btf, NULL));
> >>  }
> >>
> >>  static void *btf_get_raw_data(const struct btf *btf, __u32 *size, bool swap_endian);
> >> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> >> index 025ed28b7fe8..94dfdfdef617 100644
> >> --- a/tools/lib/bpf/btf.h
> >> +++ b/tools/lib/bpf/btf.h
> >> @@ -18,6 +18,7 @@ extern "C" {
> >>
> >>  #define BTF_ELF_SEC ".BTF"
> >>  #define BTF_EXT_ELF_SEC ".BTF.ext"
> >> +#define BTF_BASE_ELF_SEC ".BTF.base"
> >
> > Does libbpf code itself use this? If not, let's get rid of it.
> >
>
> We could, but I wonder whether there would be value in keeping it around,
> as multiple consumers need to agree on this name (pahole, resolve_btfids,
> bpftool)?

OK, I can see how it might be a more generic thing beyond just
kernel use; let's keep it then.

>
> >>  #define MAPS_ELF_SEC ".maps"
> >>
> >>  struct btf;
> >> @@ -134,6 +135,37 @@ LIBBPF_API struct btf *btf__parse_elf_split(const char *path, struct btf *base_b
> >>  LIBBPF_API struct btf *btf__parse_raw(const char *path);
> >>  LIBBPF_API struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf);
> >>
> >> +struct btf_parse_opts {
> >> +       size_t sz;
> >> +       /* use base BTF to parse split BTF */
> >> +       struct btf *base_btf;
> >> +       /* retrieve optional .BTF.ext info */
> >> +       struct btf_ext **btf_ext;
> >> +       /* BTF section name */
> >
> > let's mention that if not set, libbpf will default to trying to parse
> > data as raw BTF, and then will fall back to .BTF in ELF. If it is set
> > to non-NULL, we'll assume ELF and use that section to fetch BTF data.
> >
>
> sure, will do.
>
> >> +       const char *btf_sec;
> >> +       size_t:0;
> >
> > nit: size_t :0; (consistency)
> >
> >> +};
> >> +
> >> +#define btf_parse_opts__last_field btf_sec
> >> +
> >> +/* @brief **btf__parse_opts()** parses BTF information from either a
> >> + * raw BTF file (*btf_sec* is NULL) or from the specified BTF section,
> >> + * also retrieving  .BTF.ext info if *btf_ext* is non-NULL.  If
> >> + * *base_btf* is specified, use it to parse split BTF from the
> >> + * specified location.
> >> + *
> >> + * @return new BTF object instance which has to be eventually freed with
> >> + * **btf__free()**
> >> + *
> >> + * On error, error-code-encoded-as-pointer is returned, not a NULL. To extract
> >
> > This is false; we don't encode errors as pointers anymore. Starting from
> > v1.0 it's always NULL + errno.
> >
>
> ah good catch, I must have cut-and-pasted this..
>
> Thanks again for all the review help!
>
> Alan
>
> >> + * error code from such a pointer `libbpf_get_error()` should be used. If
> >> + * `libbpf_set_strict_mode(LIBBPF_STRICT_CLEAN_PTRS)` is enabled, NULL is
> >> + * returned on error instead. In both cases thread-local `errno` variable is
> >> + * always set to error code as well.
> >> + */
> >> +
> >> +LIBBPF_API struct btf *btf__parse_opts(const char *path, struct btf_parse_opts *opts);
> >> +
> >>  LIBBPF_API struct btf *btf__load_vmlinux_btf(void);
> >>  LIBBPF_API struct btf *btf__load_module_btf(const char *module_name, struct btf *vmlinux_btf);
> >>
> >> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> >> index c4d9bd7d3220..a9151e31dfa9 100644
> >> --- a/tools/lib/bpf/libbpf.map
> >> +++ b/tools/lib/bpf/libbpf.map
> >> @@ -421,6 +421,7 @@ LIBBPF_1.5.0 {
> >>         global:
> >>                 bpf_program__attach_sockmap;
> >>                 btf__distill_base;
> >> +               btf__parse_opts;
> >>                 ring__consume_n;
> >>                 ring_buffer__consume_n;
> >>  } LIBBPF_1.4.0;
> >> --
> >> 2.31.1
> >>
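
To illustrate the section-selection rule discussed in this review, here is a
minimal, self-contained sketch. It does not call libbpf: fake_btf,
parse_raw_stub(), parse_elf_stub() and parse_sketch() are hypothetical
stand-ins invented for illustration, and the stubs only pretend to parse. The
sketch assumes the post-1.0 convention mentioned above (NULL return plus
errno), and shows that a NULL section means "try raw BTF, then fall back to
.BTF in ELF", while a non-NULL section goes straight to ELF.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define DEFAULT_SEC ".BTF"	/* mirrors BTF_ELF_SEC from the patch */

/* Hypothetical stand-in for a parsed BTF object. */
struct fake_btf {
	const char *source;	/* "raw" or "elf", for illustration only */
};

static struct fake_btf raw_obj = { "raw" };
static struct fake_btf elf_obj = { "elf" };
static const char *last_sec;	/* section the ELF stub was last asked for */

/* Stub: pretend that only paths ending in ".raw" contain raw BTF.
 * Anything else fails with NULL and errno set to EPROTO.
 */
static struct fake_btf *parse_raw_stub(const char *path)
{
	size_t n = strlen(path);

	if (n >= 4 && strcmp(path + n - 4, ".raw") == 0)
		return &raw_obj;
	errno = EPROTO;
	return NULL;
}

/* Stub: pretend every path is an ELF file containing the section. */
static struct fake_btf *parse_elf_stub(const char *path, const char *sec)
{
	(void)path;
	last_sec = sec;
	return &elf_obj;
}

/* NULL sec: try raw first, fall back to the default ELF section only
 * when the raw parser reports EPROTO. Non-NULL sec: assume ELF.
 */
static struct fake_btf *parse_sketch(const char *path, const char *sec)
{
	assert(path != NULL);

	if (!sec) {
		struct fake_btf *btf = parse_raw_stub(path);

		if (btf)
			return btf;
		if (errno != EPROTO)
			return NULL;	/* errno already describes the failure */
	}
	/* The reviewer's nit suggests the GNU shorthand "sec ?: DEFAULT_SEC";
	 * the standard C spelling is used here for portability.
	 */
	return parse_elf_stub(path, sec ? sec : DEFAULT_SEC);
}
```

With these stubs, parse_sketch("vmlinux.raw", NULL) takes the raw path,
parse_sketch("module.ko", NULL) falls back to ".BTF", and
parse_sketch("module.ko", ".BTF.base") never attempts the raw parse.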


* Re: [PATCH v2 bpf-next 07/13] resolve_btfids: use .BTF.base ELF section as base BTF if -B option is used
  2024-04-24 15:48 ` [PATCH v2 bpf-next 07/13] resolve_btfids: use .BTF.base ELF section as base BTF if -B option is used Alan Maguire
  2024-04-29 23:45   ` Andrii Nakryiko
@ 2024-05-01 20:39   ` Eduard Zingerman
  2024-05-02 14:53     ` Alan Maguire
  1 sibling, 1 reply; 43+ messages in thread
From: Eduard Zingerman @ 2024-05-01 20:39 UTC (permalink / raw)
  To: Alan Maguire, andrii, ast
  Cc: jolsa, acme, quentin, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan

On Wed, 2024-04-24 at 16:48 +0100, Alan Maguire wrote:

[...]

> @@ -532,11 +533,26 @@ static int symbols_resolve(struct object *obj)
>  	__u32 nr_types;
>  
>  	if (obj->base_btf_path) {
> -		base_btf = btf__parse(obj->base_btf_path, NULL);
> +		LIBBPF_OPTS(btf_parse_opts, optp);
> +		const char *path;
> +
> +		if (obj->base) {
> +			optp.btf_sec = BTF_BASE_ELF_SEC;
> +			path = obj->path;
> +			base_btf = btf__parse_opts(path, &optp);
> +			/* fall back to normal base parsing if no BTF_BASE_ELF_SEC */
> +			if (libbpf_get_error(base_btf))
> +				base_btf = NULL;

Should this be a fatal error?
Since user requested '-B' explicitly?

> +		}
> +		if (!base_btf) {
> +			optp.btf_sec = BTF_ELF_SEC;
> +			path = obj->base_btf_path;
> +			base_btf = btf__parse_opts(path, &optp);
> +		}
>  		err = libbpf_get_error(base_btf);
>  		if (err) {
>  			pr_err("FAILED: load base BTF from %s: %s\n",
> -			       obj->base_btf_path, strerror(-err));
> +			       path, strerror(-err));
>  			return -1;
>  		}
>  	}

[...]


* Re: [PATCH v2 bpf-next 02/13] libbpf: add btf__distill_base() creating split BTF with distilled base BTF
  2024-05-01 17:43       ` Eduard Zingerman
@ 2024-05-02 11:51         ` Alan Maguire
  0 siblings, 0 replies; 43+ messages in thread
From: Alan Maguire @ 2024-05-02 11:51 UTC (permalink / raw)
  To: Eduard Zingerman, andrii, ast
  Cc: jolsa, acme, quentin, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan

On 01/05/2024 18:43, Eduard Zingerman wrote:
> On Wed, 2024-05-01 at 18:29 +0100, Alan Maguire wrote:
> 
> [...]
> 
>>>> +/* Check if a member of a split BTF struct/union refers to a base BTF
>>>> + * struct/union.  Members can be const/restrict/volatile/typedef
>>>> + * reference types, but if a pointer is encountered, type is no longer
>>>> + * considered embedded.
>>>> + */
>>>> +static int btf_find_embedded_composite_type_ids(__u32 *id, void *ctx)
>>>> +{
>>>> +	struct btf_distill *dist = ctx;
>>>> +	const struct btf_type *t;
>>>> +	__u32 next_id = *id;
>>>> +
>>>> +	do {
>>>> +		if (next_id == 0)
>>>> +			return 0;
>>>> +		t = btf_type_by_id(dist->pipe.src, next_id);
>>>> +		switch (btf_kind(t)) {
>>>> +		case BTF_KIND_CONST:
>>>> +		case BTF_KIND_RESTRICT:
>>>> +		case BTF_KIND_VOLATILE:
>>>> +		case BTF_KIND_TYPEDEF:
>>>
>>> I think BTF_KIND_TYPE_TAG is missing.
>>>
>>
>> It's implicit in the default clause; I can't see a case for having a
>> split BTF type tag refer to base BTF types, but I might be missing
>> something there. I can make all the unexpected types explicit if that
>> would be clearer?
> 
> I mean, this skips a series of modifiers, e.g.:
> 
> struct buz {
>   // next_id will get to 'struct bar' eventually
>   const volatile struct bar foo;
> }
> 
> Now, it is legal to have this chain like below:
> 
> struct buz {
>   const volatile __type_tag("quux") struct bar foo;
> }
> 
> In which case the traversal does not have to stop.
> Am I confused?
>

No, sorry, I was! You're absolutely right; BTF_KIND_TYPE_TAG needs to be
accounted for here. Thanks for catching this!

> (Note: at the moment type tags are only applied to pointers but that
>  would change in the future, I have a stalled LLVM change for this).
> 
> [...]
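
The traversal agreed on in this exchange can be sketched outside libbpf as
follows. This is a toy model, not the patch's code: toy_kind, toy_type,
toy_types and embedded_struct_id() are hypothetical names, the kind values do
not match the real BTF_KIND_* constants, and only structs (not unions) are
modelled. The point it shows is that a type tag is skipped like any other
modifier, while a pointer ends the search because the type is then no longer
embedded.

```c
#include <assert.h>
#include <stddef.h>

/* Toy kinds; the real BTF_KIND_* values in linux/btf.h differ. */
enum toy_kind {
	K_VOID, K_STRUCT, K_PTR, K_CONST, K_VOLATILE,
	K_RESTRICT, K_TYPEDEF, K_TYPE_TAG,
};

struct toy_type {
	enum toy_kind kind;
	unsigned int ref;	/* referenced type id for modifiers and pointers */
};

/* A small, acyclic type table indexed by type id. */
static const struct toy_type toy_types[] = {
	[0] = { K_VOID, 0 },
	[1] = { K_STRUCT, 0 },		/* struct bar */
	[2] = { K_TYPE_TAG, 1 },	/* __type_tag("quux") struct bar */
	[3] = { K_VOLATILE, 2 },
	[4] = { K_CONST, 3 },		/* const volatile <tag> struct bar */
	[5] = { K_PTR, 1 },		/* struct bar * */
	[6] = { K_CONST, 5 },		/* struct bar * const */
};

#define TOY_NR (sizeof(toy_types) / sizeof(toy_types[0]))

/* Follow const/volatile/restrict/typedef/type-tag links from id.
 * Returns the id of the struct reached, or 0 if the chain hits a
 * pointer, void, or anything else that is not an embedded struct.
 * Assumes the table has no modifier cycles.
 */
static unsigned int embedded_struct_id(const struct toy_type *types,
				       size_t nr, unsigned int id)
{
	while (id != 0 && id < nr) {
		const struct toy_type *t = &types[id];

		switch (t->kind) {
		case K_CONST:
		case K_VOLATILE:
		case K_RESTRICT:
		case K_TYPEDEF:
		case K_TYPE_TAG:	/* the kind the review found missing */
			id = t->ref;
			break;
		case K_STRUCT:
			return id;
		default:		/* pointer etc.: not embedded */
			return 0;
		}
	}
	return 0;
}
```

In this table, id 4 resolves to struct bar (id 1) because the type tag is
skipped, whereas id 6 resolves to 0 because the chain passes through a pointer.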


* Re: [PATCH v2 bpf-next 07/13] resolve_btfids: use .BTF.base ELF section as base BTF if -B option is used
  2024-05-01 20:39   ` Eduard Zingerman
@ 2024-05-02 14:53     ` Alan Maguire
  0 siblings, 0 replies; 43+ messages in thread
From: Alan Maguire @ 2024-05-02 14:53 UTC (permalink / raw)
  To: Eduard Zingerman, andrii, ast
  Cc: jolsa, acme, quentin, mykolal, daniel, martin.lau, song,
	yonghong.song, john.fastabend, kpsingh, sdf, haoluo, houtao1,
	bpf, masahiroy, mcgrof, nathan

On 01/05/2024 21:39, Eduard Zingerman wrote:
> On Wed, 2024-04-24 at 16:48 +0100, Alan Maguire wrote:
> 
> [...]
> 
>> @@ -532,11 +533,26 @@ static int symbols_resolve(struct object *obj)
>>  	__u32 nr_types;
>>  
>>  	if (obj->base_btf_path) {
>> -		base_btf = btf__parse(obj->base_btf_path, NULL);
>> +		LIBBPF_OPTS(btf_parse_opts, optp);
>> +		const char *path;
>> +
>> +		if (obj->base) {
>> +			optp.btf_sec = BTF_BASE_ELF_SEC;
>> +			path = obj->path;
>> +			base_btf = btf__parse_opts(path, &optp);
>> +			/* fall back to normal base parsing if no BTF_BASE_ELF_SEC */
>> +			if (libbpf_get_error(base_btf))
>> +				base_btf = NULL;
> 
> Should this be a fatal error?
> Since user requested '-B' explicitly?
>

No, the fallback behaviour is intended. The reason is this: if the user
is using an older pahole that does not support the generation of
distilled base BTF, there will be no .BTF.base section in modules. We
will, however, have specified the -B option, so we want to fall back to
normal resolve_btfids behaviour for modules. This avoids the need to
check whether the BTF feature really works; if it doesn't, we carry on with
default resolve_btfids behaviour for modules. Thanks!

Alan

>> +		}
>> +		if (!base_btf) {
>> +			optp.btf_sec = BTF_ELF_SEC;
>> +			path = obj->base_btf_path;
>> +			base_btf = btf__parse_opts(path, &optp);
>> +		}
>>  		err = libbpf_get_error(base_btf);
>>  		if (err) {
>>  			pr_err("FAILED: load base BTF from %s: %s\n",
>> -			       obj->base_btf_path, strerror(-err));
>> +			       path, strerror(-err));
>>  			return -1;
>>  		}
>>  	}
> 
> [...]
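
The intended fallback can be summarised with a small standalone sketch. It is
an illustration only: toy_obj, base_choice and pick_base() are hypothetical
names, and the has_btf_base_sec flag stands in for what the patch actually
does, which is to attempt the parse of .BTF.base and treat failure as "section
absent".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, trimmed-down view of the state the tool works with. */
struct toy_obj {
	const char *path;		/* module being processed */
	const char *base_btf_path;	/* base BTF file, e.g. vmlinux */
	bool want_distilled;		/* -B was given */
	bool has_btf_base_sec;		/* assumption: module carries .BTF.base */
};

struct base_choice {
	const char *path;	/* file to load base BTF from */
	const char *sec;	/* ELF section to read in that file */
};

/* Prefer the module's own .BTF.base when -B was given and the section
 * exists; otherwise fall back to .BTF in the base BTF file, which is
 * what happens with an older pahole that emits no .BTF.base.
 */
static struct base_choice pick_base(const struct toy_obj *obj)
{
	struct base_choice c;

	if (obj->want_distilled && obj->has_btf_base_sec) {
		c.path = obj->path;
		c.sec = ".BTF.base";
	} else {
		c.path = obj->base_btf_path;
		c.sec = ".BTF";
	}
	return c;
}
```

A module built with an older pahole (want_distilled set, no .BTF.base) thus
ends up using the base BTF file exactly as it would without -B.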


* Re: [PATCH v2 bpf-next 09/13] libbpf: split BTF relocation
  2024-04-30 17:41       ` Andrii Nakryiko
@ 2024-05-02 16:39         ` Eduard Zingerman
  0 siblings, 0 replies; 43+ messages in thread
From: Eduard Zingerman @ 2024-05-02 16:39 UTC (permalink / raw)
  To: Andrii Nakryiko, Alan Maguire, Mykyta Yatsenko
  Cc: andrii, ast, jolsa, acme, quentin, mykolal, daniel, martin.lau,
	song, yonghong.song, john.fastabend, kpsingh, sdf, haoluo,
	houtao1, bpf, masahiroy, mcgrof, nathan

On Tue, 2024-04-30 at 10:41 -0700, Andrii Nakryiko wrote:

[...]


> Speaking of sorting, Mykyta (cc'ed) is working on teaching *bpftool*
> to do a sane ordering of types so that vmlinux.h output is a)
> meaningfully (from human POV) sorted and b) vmlinux.h overall is more
> "stable" between slight changes to vmlinux BTF itself, so that they
> can be more meaningfully diffed. This is in no way related to sorting
> vmlinux BTF data itself (sorting is done on the fly before generating
> vmlinux.h), but I thought I'd mention that as you are probably
> interested in this as well.

Oh, well...
I have a sorting pass already; it is here:
https://github.com/eddyz87/dwarves/tree/sort-btf
It would be a bit simpler if it were moved inside libbpf and used internal APIs.



Thread overview: 43+ messages
2024-04-24 15:47 [PATCH v2 bpf-next 00/13] bpf: support resilient split BTF Alan Maguire
2024-04-24 15:47 ` [PATCH v2 bpf-next 01/13] libbpf: add support to btf__add_fwd() for ENUM64 Alan Maguire
2024-04-26 22:56   ` Andrii Nakryiko
2024-04-24 15:47 ` [PATCH v2 bpf-next 02/13] libbpf: add btf__distill_base() creating split BTF with distilled base BTF Alan Maguire
2024-04-26 22:57   ` Andrii Nakryiko
2024-04-30 23:06   ` Eduard Zingerman
2024-05-01 17:29     ` Alan Maguire
2024-05-01 17:43       ` Eduard Zingerman
2024-05-02 11:51         ` Alan Maguire
2024-04-24 15:47 ` [PATCH v2 bpf-next 03/13] selftests/bpf: test distilled base, split BTF generation Alan Maguire
2024-04-30 23:50   ` Eduard Zingerman
2024-04-30 23:55   ` Eduard Zingerman
2024-05-01 17:31     ` Alan Maguire
2024-04-24 15:47 ` [PATCH v2 bpf-next 04/13] libbpf: add btf__parse_opts() API for flexible BTF parsing Alan Maguire
2024-04-29 23:40   ` Andrii Nakryiko
2024-05-01 17:42     ` Alan Maguire
2024-05-01 17:47       ` Andrii Nakryiko
2024-05-01  0:07   ` Eduard Zingerman
2024-04-24 15:47 ` [PATCH v2 bpf-next 05/13] bpftool: support displaying raw split BTF using base BTF section as base Alan Maguire
2024-04-29 23:42   ` Andrii Nakryiko
2024-04-24 15:47 ` [PATCH v2 bpf-next 06/13] kbuild,bpf: switch to using --btf_features for pahole v1.26 and later Alan Maguire
2024-04-29 23:43   ` Andrii Nakryiko
2024-05-01 17:22     ` Alan Maguire
2024-04-24 15:48 ` [PATCH v2 bpf-next 07/13] resolve_btfids: use .BTF.base ELF section as base BTF if -B option is used Alan Maguire
2024-04-29 23:45   ` Andrii Nakryiko
2024-05-01 20:39   ` Eduard Zingerman
2024-05-02 14:53     ` Alan Maguire
2024-04-24 15:48 ` [PATCH v2 bpf-next 08/13] kbuild, bpf: add module-specific pahole/resolve_btfids flags for distilled base BTF Alan Maguire
2024-04-24 15:48 ` [PATCH v2 bpf-next 09/13] libbpf: split BTF relocation Alan Maguire
2024-04-30  0:14   ` Andrii Nakryiko
2024-04-30 16:56     ` Alan Maguire
2024-04-30 17:41       ` Andrii Nakryiko
2024-05-02 16:39         ` Eduard Zingerman
2024-04-24 15:48 ` [PATCH v2 bpf-next 10/13] module, bpf: store BTF base pointer in struct module Alan Maguire
2024-04-24 15:48 ` [PATCH v2 bpf-next 11/13] libbpf,bpf: share BTF relocate-related code with kernel Alan Maguire
2024-04-24 15:48 ` [PATCH v2 bpf-next 12/13] selftests/bpf: extend distilled BTF tests to cover BTF relocation Alan Maguire
2024-04-24 15:48 ` [PATCH v2 bpf-next 13/13] bpftool: support displaying relocated-with-base split BTF Alan Maguire
2024-04-26 22:56 ` [PATCH v2 bpf-next 00/13] bpf: support resilient " Andrii Nakryiko
2024-04-27  0:24   ` Andrii Nakryiko
2024-04-29 15:25     ` Alan Maguire
2024-04-29 17:05       ` Andrii Nakryiko
2024-04-29 17:31         ` Alan Maguire
2024-04-29 18:02           ` Andrii Nakryiko
