bpf.vger.kernel.org archive mirror
* [PATCH bpf-next 00/11] libbpf: split BTF support
@ 2020-10-29  0:58 Andrii Nakryiko
  2020-10-29  0:58 ` [PATCH bpf-next 01/11] libbpf: factor out common operations in BTF writing APIs Andrii Nakryiko
                   ` (11 more replies)
  0 siblings, 12 replies; 49+ messages in thread
From: Andrii Nakryiko @ 2020-10-29  0:58 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team

This patch set adds support for generating and deduplicating split BTF. This
is an enhancement to BTF that allows designating one BTF as the "base BTF"
(e.g., vmlinux BTF) and one or more other BTFs as "split BTF" (e.g., kernel
module BTFs), which build upon and extend the base BTF with extra types and
strings.

Once loaded, split BTF appears as a single unified BTF superset of the base
BTF, with a continuous and transparent numbering scheme. This allows all the
existing users of BTF to work correctly and stay agnostic to the base/split
BTF composition. The only difference is in how split BTF is instantiated: it
requires the base BTF to be already instantiated and passed explicitly to the
btf__new_xxx_split() or btf__parse_xxx_split() "constructors".
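
As an illustration, a minimal usage sketch (the module BTF path is
hypothetical; error handling is reduced to libbpf's ERR_PTR convention):

	struct btf *base, *split;

	/* base BTF, e.g., vmlinux BTF in its raw form */
	base = btf__parse_raw("/sys/kernel/btf/vmlinux");
	if (IS_ERR(base))
		return PTR_ERR(base);

	/* split BTF built on top of base's types and strings;
	 * the module BTF path here is hypothetical
	 */
	split = btf__parse_raw_split("/path/to/module.btf", base);
	if (IS_ERR(split))
		return PTR_ERR(split);

	/* all existing accessors see the unified type ID space */
	printf("total types: %u\n", btf__get_nr_types(split));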

This split approach is necessary if we are to have reasonably-sized kernel
module BTFs. When each kernel module's BTF is deduplicated individually, the
resulting module BTFs contain copies of a lot of kernel types that are already
present in vmlinux BTF. Even single copies of those types add up to a big BTF
size bloat. On my kernel configuration with 700 modules built, the non-split
BTF approach results in 115MB of BTF across all modules. With the split BTF
deduplication approach, the total size is down to 5.2MB, which is on par with
vmlinux BTF itself (at around 4MB). This seems reasonable and practical. As to
why we'd need kernel module BTFs at all, that should be pretty obvious to
anyone using BPF at this point: it allows all the BTF-powered features to be
used with kernel modules: tp_btf, fentry/fexit/fmod_ret, lsm, bpf_iter, etc.

This patch set is a prerequisite to adding split BTF support to pahole, which
in turn is a prerequisite to integrating split BTF into the Linux kernel build
setup to generate BTF for kernel modules. The latter will come as a follow-up
patch series once this series lands in libbpf and pahole makes use of it.

Patch #4 introduces the necessary basic support for split BTF into libbpf
APIs. Patch #8 implements minimal changes to the BTF dedup algorithm to allow
deduplicating split BTFs. Patch #11 adds an extra -B flag to bpftool, allowing
one to specify the path to the base BTF when dumping or inspecting split BTF.
All the rest are refactorings, cleanups, bug fixes, and selftests.
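
For example, once modules ship split BTF, inspecting one with bpftool could
look like this (the module BTF path is hypothetical):

	$ bpftool -B /sys/kernel/btf/vmlinux btf dump file /path/to/module.btf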

Andrii Nakryiko (11):
  libbpf: factor out common operations in BTF writing APIs
  selftest/bpf: relax btf_dedup test checks
  libbpf: unify and speed up BTF string deduplication
  libbpf: implement basic split BTF support
  selftests/bpf: add split BTF basic test
  selftests/bpf: add checking of raw type dump in BTF writer APIs
    selftests
  libbpf: fix BTF data layout checks and allow empty BTF
  libbpf: support BTF dedup of split BTFs
  libbpf: accommodate DWARF/compiler bug with duplicated identical arrays
  selftests/bpf: add split BTF dedup selftests
  tools/bpftool: add bpftool support for split BTF

 tools/bpf/bpftool/btf.c                       |   9 +-
 tools/bpf/bpftool/main.c                      |  15 +-
 tools/bpf/bpftool/main.h                      |   1 +
 tools/lib/bpf/btf.c                           | 814 ++++++++++--------
 tools/lib/bpf/btf.h                           |   8 +
 tools/lib/bpf/libbpf.map                      |   9 +
 tools/testing/selftests/bpf/Makefile          |   2 +-
 tools/testing/selftests/bpf/btf_helpers.c     | 259 ++++++
 tools/testing/selftests/bpf/btf_helpers.h     |  19 +
 tools/testing/selftests/bpf/prog_tests/btf.c  |  34 +-
 .../bpf/prog_tests/btf_dedup_split.c          | 326 +++++++
 .../selftests/bpf/prog_tests/btf_split.c      |  99 +++
 .../selftests/bpf/prog_tests/btf_write.c      |  48 +-
 tools/testing/selftests/bpf/test_progs.h      |  11 +
 14 files changed, 1294 insertions(+), 360 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/btf_helpers.c
 create mode 100644 tools/testing/selftests/bpf/btf_helpers.h
 create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_split.c

-- 
2.24.1



* [PATCH bpf-next 01/11] libbpf: factor out common operations in BTF writing APIs
  2020-10-29  0:58 [PATCH bpf-next 00/11] libbpf: split BTF support Andrii Nakryiko
@ 2020-10-29  0:58 ` Andrii Nakryiko
  2020-10-30  0:36   ` Song Liu
  2020-10-29  0:58 ` [PATCH bpf-next 02/11] selftest/bpf: relax btf_dedup test checks Andrii Nakryiko
                   ` (10 subsequent siblings)
  11 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-10-29  0:58 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team

Factor out committing of appended type data. Also extract fetching the very
last type in the BTF (to append members to). These two operations are common
across many APIs and will be easier to refactor for split BTF if they are
extracted into a single place.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/btf.c | 123 ++++++++++++++++----------------------------
 1 file changed, 43 insertions(+), 80 deletions(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 231b07203e3d..89fecfe5cb2b 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -1560,6 +1560,20 @@ static void btf_type_inc_vlen(struct btf_type *t)
 	t->info = btf_type_info(btf_kind(t), btf_vlen(t) + 1, btf_kflag(t));
 }
 
+static int btf_commit_type(struct btf *btf, int data_sz)
+{
+	int err;
+
+	err = btf_add_type_idx_entry(btf, btf->hdr->type_len);
+	if (err)
+		return err;
+
+	btf->hdr->type_len += data_sz;
+	btf->hdr->str_off += data_sz;
+	btf->nr_types++;
+	return btf->nr_types;
+}
+
 /*
  * Append new BTF_KIND_INT type with:
  *   - *name* - non-empty, non-NULL type name;
@@ -1572,7 +1586,7 @@ static void btf_type_inc_vlen(struct btf_type *t)
 int btf__add_int(struct btf *btf, const char *name, size_t byte_sz, int encoding)
 {
 	struct btf_type *t;
-	int sz, err, name_off;
+	int sz, name_off;
 
 	/* non-empty name */
 	if (!name || !name[0])
@@ -1606,14 +1620,7 @@ int btf__add_int(struct btf *btf, const char *name, size_t byte_sz, int encoding
 	/* set INT info, we don't allow setting legacy bit offset/size */
 	*(__u32 *)(t + 1) = (encoding << 24) | (byte_sz * 8);
 
-	err = btf_add_type_idx_entry(btf, btf->hdr->type_len);
-	if (err)
-		return err;
-
-	btf->hdr->type_len += sz;
-	btf->hdr->str_off += sz;
-	btf->nr_types++;
-	return btf->nr_types;
+	return btf_commit_type(btf, sz);
 }
 
 /* it's completely legal to append BTF types with type IDs pointing forward to
@@ -1631,7 +1638,7 @@ static int validate_type_id(int id)
 static int btf_add_ref_kind(struct btf *btf, int kind, const char *name, int ref_type_id)
 {
 	struct btf_type *t;
-	int sz, name_off = 0, err;
+	int sz, name_off = 0;
 
 	if (validate_type_id(ref_type_id))
 		return -EINVAL;
@@ -1654,14 +1661,7 @@ static int btf_add_ref_kind(struct btf *btf, int kind, const char *name, int ref
 	t->info = btf_type_info(kind, 0, 0);
 	t->type = ref_type_id;
 
-	err = btf_add_type_idx_entry(btf, btf->hdr->type_len);
-	if (err)
-		return err;
-
-	btf->hdr->type_len += sz;
-	btf->hdr->str_off += sz;
-	btf->nr_types++;
-	return btf->nr_types;
+	return btf_commit_type(btf, sz);
 }
 
 /*
@@ -1689,7 +1689,7 @@ int btf__add_array(struct btf *btf, int index_type_id, int elem_type_id, __u32 n
 {
 	struct btf_type *t;
 	struct btf_array *a;
-	int sz, err;
+	int sz;
 
 	if (validate_type_id(index_type_id) || validate_type_id(elem_type_id))
 		return -EINVAL;
@@ -1711,21 +1711,14 @@ int btf__add_array(struct btf *btf, int index_type_id, int elem_type_id, __u32 n
 	a->index_type = index_type_id;
 	a->nelems = nr_elems;
 
-	err = btf_add_type_idx_entry(btf, btf->hdr->type_len);
-	if (err)
-		return err;
-
-	btf->hdr->type_len += sz;
-	btf->hdr->str_off += sz;
-	btf->nr_types++;
-	return btf->nr_types;
+	return btf_commit_type(btf, sz);
 }
 
 /* generic STRUCT/UNION append function */
 static int btf_add_composite(struct btf *btf, int kind, const char *name, __u32 bytes_sz)
 {
 	struct btf_type *t;
-	int sz, err, name_off = 0;
+	int sz, name_off = 0;
 
 	if (btf_ensure_modifiable(btf))
 		return -ENOMEM;
@@ -1748,14 +1741,7 @@ static int btf_add_composite(struct btf *btf, int kind, const char *name, __u32
 	t->info = btf_type_info(kind, 0, 0);
 	t->size = bytes_sz;
 
-	err = btf_add_type_idx_entry(btf, btf->hdr->type_len);
-	if (err)
-		return err;
-
-	btf->hdr->type_len += sz;
-	btf->hdr->str_off += sz;
-	btf->nr_types++;
-	return btf->nr_types;
+	return btf_commit_type(btf, sz);
 }
 
 /*
@@ -1793,6 +1779,11 @@ int btf__add_union(struct btf *btf, const char *name, __u32 byte_sz)
 	return btf_add_composite(btf, BTF_KIND_UNION, name, byte_sz);
 }
 
+static struct btf_type *btf_last_type(struct btf *btf)
+{
+	return btf_type_by_id(btf, btf__get_nr_types(btf));
+}
+
 /*
  * Append new field for the current STRUCT/UNION type with:
  *   - *name* - name of the field, can be NULL or empty for anonymous field;
@@ -1814,7 +1805,7 @@ int btf__add_field(struct btf *btf, const char *name, int type_id,
 	/* last type should be union/struct */
 	if (btf->nr_types == 0)
 		return -EINVAL;
-	t = btf_type_by_id(btf, btf->nr_types);
+	t = btf_last_type(btf);
 	if (!btf_is_composite(t))
 		return -EINVAL;
 
@@ -1849,7 +1840,7 @@ int btf__add_field(struct btf *btf, const char *name, int type_id,
 	m->offset = bit_offset | (bit_size << 24);
 
 	/* btf_add_type_mem can invalidate t pointer */
-	t = btf_type_by_id(btf, btf->nr_types);
+	t = btf_last_type(btf);
 	/* update parent type's vlen and kflag */
 	t->info = btf_type_info(btf_kind(t), btf_vlen(t) + 1, is_bitfield || btf_kflag(t));
 
@@ -1874,7 +1865,7 @@ int btf__add_field(struct btf *btf, const char *name, int type_id,
 int btf__add_enum(struct btf *btf, const char *name, __u32 byte_sz)
 {
 	struct btf_type *t;
-	int sz, err, name_off = 0;
+	int sz, name_off = 0;
 
 	/* byte_sz must be power of 2 */
 	if (!byte_sz || (byte_sz & (byte_sz - 1)) || byte_sz > 8)
@@ -1899,14 +1890,7 @@ int btf__add_enum(struct btf *btf, const char *name, __u32 byte_sz)
 	t->info = btf_type_info(BTF_KIND_ENUM, 0, 0);
 	t->size = byte_sz;
 
-	err = btf_add_type_idx_entry(btf, btf->hdr->type_len);
-	if (err)
-		return err;
-
-	btf->hdr->type_len += sz;
-	btf->hdr->str_off += sz;
-	btf->nr_types++;
-	return btf->nr_types;
+	return btf_commit_type(btf, sz);
 }
 
 /*
@@ -1926,7 +1910,7 @@ int btf__add_enum_value(struct btf *btf, const char *name, __s64 value)
 	/* last type should be BTF_KIND_ENUM */
 	if (btf->nr_types == 0)
 		return -EINVAL;
-	t = btf_type_by_id(btf, btf->nr_types);
+	t = btf_last_type(btf);
 	if (!btf_is_enum(t))
 		return -EINVAL;
 
@@ -1953,7 +1937,7 @@ int btf__add_enum_value(struct btf *btf, const char *name, __s64 value)
 	v->val = value;
 
 	/* update parent type's vlen */
-	t = btf_type_by_id(btf, btf->nr_types);
+	t = btf_last_type(btf);
 	btf_type_inc_vlen(t);
 
 	btf->hdr->type_len += sz;
@@ -2093,7 +2077,7 @@ int btf__add_func(struct btf *btf, const char *name,
 int btf__add_func_proto(struct btf *btf, int ret_type_id)
 {
 	struct btf_type *t;
-	int sz, err;
+	int sz;
 
 	if (validate_type_id(ret_type_id))
 		return -EINVAL;
@@ -2113,14 +2097,7 @@ int btf__add_func_proto(struct btf *btf, int ret_type_id)
 	t->info = btf_type_info(BTF_KIND_FUNC_PROTO, 0, 0);
 	t->type = ret_type_id;
 
-	err = btf_add_type_idx_entry(btf, btf->hdr->type_len);
-	if (err)
-		return err;
-
-	btf->hdr->type_len += sz;
-	btf->hdr->str_off += sz;
-	btf->nr_types++;
-	return btf->nr_types;
+	return btf_commit_type(btf, sz);
 }
 
 /*
@@ -2143,7 +2120,7 @@ int btf__add_func_param(struct btf *btf, const char *name, int type_id)
 	/* last type should be BTF_KIND_FUNC_PROTO */
 	if (btf->nr_types == 0)
 		return -EINVAL;
-	t = btf_type_by_id(btf, btf->nr_types);
+	t = btf_last_type(btf);
 	if (!btf_is_func_proto(t))
 		return -EINVAL;
 
@@ -2166,7 +2143,7 @@ int btf__add_func_param(struct btf *btf, const char *name, int type_id)
 	p->type = type_id;
 
 	/* update parent type's vlen */
-	t = btf_type_by_id(btf, btf->nr_types);
+	t = btf_last_type(btf);
 	btf_type_inc_vlen(t);
 
 	btf->hdr->type_len += sz;
@@ -2188,7 +2165,7 @@ int btf__add_var(struct btf *btf, const char *name, int linkage, int type_id)
 {
 	struct btf_type *t;
 	struct btf_var *v;
-	int sz, err, name_off;
+	int sz, name_off;
 
 	/* non-empty name */
 	if (!name || !name[0])
@@ -2219,14 +2196,7 @@ int btf__add_var(struct btf *btf, const char *name, int linkage, int type_id)
 	v = btf_var(t);
 	v->linkage = linkage;
 
-	err = btf_add_type_idx_entry(btf, btf->hdr->type_len);
-	if (err)
-		return err;
-
-	btf->hdr->type_len += sz;
-	btf->hdr->str_off += sz;
-	btf->nr_types++;
-	return btf->nr_types;
+	return btf_commit_type(btf, sz);
 }
 
 /*
@@ -2244,7 +2214,7 @@ int btf__add_var(struct btf *btf, const char *name, int linkage, int type_id)
 int btf__add_datasec(struct btf *btf, const char *name, __u32 byte_sz)
 {
 	struct btf_type *t;
-	int sz, err, name_off;
+	int sz, name_off;
 
 	/* non-empty name */
 	if (!name || !name[0])
@@ -2267,14 +2237,7 @@ int btf__add_datasec(struct btf *btf, const char *name, __u32 byte_sz)
 	t->info = btf_type_info(BTF_KIND_DATASEC, 0, 0);
 	t->size = byte_sz;
 
-	err = btf_add_type_idx_entry(btf, btf->hdr->type_len);
-	if (err)
-		return err;
-
-	btf->hdr->type_len += sz;
-	btf->hdr->str_off += sz;
-	btf->nr_types++;
-	return btf->nr_types;
+	return btf_commit_type(btf, sz);
 }
 
 /*
@@ -2296,7 +2259,7 @@ int btf__add_datasec_var_info(struct btf *btf, int var_type_id, __u32 offset, __
 	/* last type should be BTF_KIND_DATASEC */
 	if (btf->nr_types == 0)
 		return -EINVAL;
-	t = btf_type_by_id(btf, btf->nr_types);
+	t = btf_last_type(btf);
 	if (!btf_is_datasec(t))
 		return -EINVAL;
 
@@ -2317,7 +2280,7 @@ int btf__add_datasec_var_info(struct btf *btf, int var_type_id, __u32 offset, __
 	v->size = byte_sz;
 
 	/* update parent type's vlen */
-	t = btf_type_by_id(btf, btf->nr_types);
+	t = btf_last_type(btf);
 	btf_type_inc_vlen(t);
 
 	btf->hdr->type_len += sz;
-- 
2.24.1



* [PATCH bpf-next 02/11] selftest/bpf: relax btf_dedup test checks
  2020-10-29  0:58 [PATCH bpf-next 00/11] libbpf: split BTF support Andrii Nakryiko
  2020-10-29  0:58 ` [PATCH bpf-next 01/11] libbpf: factor out common operations in BTF writing APIs Andrii Nakryiko
@ 2020-10-29  0:58 ` Andrii Nakryiko
  2020-10-30 16:43   ` Song Liu
  2020-10-29  0:58 ` [PATCH bpf-next 03/11] libbpf: unify and speed up BTF string deduplication Andrii Nakryiko
                   ` (9 subsequent siblings)
  11 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-10-29  0:58 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team

Remove the requirement that the string section contents match exactly. This
used to hold when string deduplication was done through sorting, but with
string dedup done through a hash table, it's no longer true. So relax the test
harness's string checks and, consequently, the type checks, which now don't
have to use exactly the same string offsets.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/testing/selftests/bpf/prog_tests/btf.c | 34 +++++++++++---------
 1 file changed, 19 insertions(+), 15 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c
index 93162484c2ca..2ccc23b2a36f 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf.c
@@ -6652,7 +6652,7 @@ static void do_test_dedup(unsigned int test_num)
 	const void *test_btf_data, *expect_btf_data;
 	const char *ret_test_next_str, *ret_expect_next_str;
 	const char *test_strs, *expect_strs;
-	const char *test_str_cur, *test_str_end;
+	const char *test_str_cur;
 	const char *expect_str_cur, *expect_str_end;
 	unsigned int raw_btf_size;
 	void *raw_btf;
@@ -6719,12 +6719,18 @@ static void do_test_dedup(unsigned int test_num)
 		goto done;
 	}
 
-	test_str_cur = test_strs;
-	test_str_end = test_strs + test_hdr->str_len;
 	expect_str_cur = expect_strs;
 	expect_str_end = expect_strs + expect_hdr->str_len;
-	while (test_str_cur < test_str_end && expect_str_cur < expect_str_end) {
+	while (expect_str_cur < expect_str_end) {
 		size_t test_len, expect_len;
+		int off;
+
+		off = btf__find_str(test_btf, expect_str_cur);
+		if (CHECK(off < 0, "exp str '%s' not found: %d\n", expect_str_cur, off)) {
+			err = -1;
+			goto done;
+		}
+		test_str_cur = btf__str_by_offset(test_btf, off);
 
 		test_len = strlen(test_str_cur);
 		expect_len = strlen(expect_str_cur);
@@ -6741,15 +6747,8 @@ static void do_test_dedup(unsigned int test_num)
 			err = -1;
 			goto done;
 		}
-		test_str_cur += test_len + 1;
 		expect_str_cur += expect_len + 1;
 	}
-	if (CHECK(test_str_cur != test_str_end,
-		  "test_str_cur:%p != test_str_end:%p",
-		  test_str_cur, test_str_end)) {
-		err = -1;
-		goto done;
-	}
 
 	test_nr_types = btf__get_nr_types(test_btf);
 	expect_nr_types = btf__get_nr_types(expect_btf);
@@ -6775,10 +6774,15 @@ static void do_test_dedup(unsigned int test_num)
 			err = -1;
 			goto done;
 		}
-		if (CHECK(memcmp((void *)test_type,
-				 (void *)expect_type,
-				 test_size),
-			  "type #%d: contents differ", i)) {
+		if (CHECK(btf_kind(test_type) != btf_kind(expect_type),
+			  "type %d kind: exp %d != got %u\n",
+			  i, btf_kind(expect_type), btf_kind(test_type))) {
+			err = -1;
+			goto done;
+		}
+		if (CHECK(test_type->info != expect_type->info,
+			  "type %d info: exp %d != got %u\n",
+			  i, expect_type->info, test_type->info)) {
 			err = -1;
 			goto done;
 		}
-- 
2.24.1



* [PATCH bpf-next 03/11] libbpf: unify and speed up BTF string deduplication
  2020-10-29  0:58 [PATCH bpf-next 00/11] libbpf: split BTF support Andrii Nakryiko
  2020-10-29  0:58 ` [PATCH bpf-next 01/11] libbpf: factor out common operations in BTF writing APIs Andrii Nakryiko
  2020-10-29  0:58 ` [PATCH bpf-next 02/11] selftest/bpf: relax btf_dedup test checks Andrii Nakryiko
@ 2020-10-29  0:58 ` Andrii Nakryiko
  2020-10-30 23:32   ` Song Liu
  2020-11-03  4:59   ` Alexei Starovoitov
  2020-10-29  0:58 ` [PATCH bpf-next 04/11] libbpf: implement basic split BTF support Andrii Nakryiko
                   ` (8 subsequent siblings)
  11 siblings, 2 replies; 49+ messages in thread
From: Andrii Nakryiko @ 2020-10-29  0:58 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team, Andrii Nakryiko

From: Andrii Nakryiko <andriin@fb.com>

Revamp BTF dedup's string deduplication to match the approach of writable BTF
string management. This allows transferring the deduplicated string index back
to the BTF object after deduplication without expensive extra memory copying
and hash map re-construction. It also simplifies the code and speeds it up,
because hashmap-based string deduplication is faster than the sort + unique
approach.
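
The gist of the new scheme, as a rough sketch (condensed from
strs_dedup_remap_str_off() in the diff below): each referenced string is
tentatively appended to the output buffer, and a hashmap insert keyed by its
offset either commits the append or reveals an existing duplicate:

	/* tentatively place the string at the end of the output buffer */
	new_off = d->strs_len;
	p = btf_add_mem(&d->strs_data, &d->strs_cap, 1, new_off,
			BTF_MAX_STR_OFFSET, len);
	memcpy(p, s, len);

	/* HASHMAP_ADD fails with -EEXIST for a duplicate string and
	 * reports the existing copy's offset in old_off
	 */
	err = hashmap__insert(d->strs_hash, (void *)new_off, (void *)new_off,
			      HASHMAP_ADD, (const void **)&old_off, NULL);
	if (err == -EEXIST) {
		*str_off_ptr = old_off;	/* duplicate: reuse existing copy */
	} else if (!err) {
		*str_off_ptr = new_off;	/* new string: commit the append */
		d->strs_len += len;
	}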

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/lib/bpf/btf.c | 265 +++++++++++++++++---------------------------
 1 file changed, 99 insertions(+), 166 deletions(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 89fecfe5cb2b..db9331fea672 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -90,6 +90,14 @@ struct btf {
 	struct hashmap *strs_hash;
 	/* whether strings are already deduplicated */
 	bool strs_deduped;
+	/* extra indirection layer to make strings hashmap work with stable
+	 * string offsets and ability to transparently choose between
+	 * btf->strs_data or btf_dedup->strs_data as a source of strings.
+	 * This is used for BTF strings dedup to transfer deduplicated strings
+	 * data back to struct btf without re-building strings index.
+	 */
+	void **strs_data_ptr;
+
 	/* BTF object FD, if loaded into kernel */
 	int fd;
 
@@ -1363,17 +1371,19 @@ int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
 
 static size_t strs_hash_fn(const void *key, void *ctx)
 {
-	struct btf *btf = ctx;
-	const char *str = btf->strs_data + (long)key;
+	const char ***strs_data_ptr = ctx;
+	const char *strs = **strs_data_ptr;
+	const char *str = strs + (long)key;
 
 	return str_hash(str);
 }
 
 static bool strs_hash_equal_fn(const void *key1, const void *key2, void *ctx)
 {
-	struct btf *btf = ctx;
-	const char *str1 = btf->strs_data + (long)key1;
-	const char *str2 = btf->strs_data + (long)key2;
+	const char ***strs_data_ptr = ctx;
+	const char *strs = **strs_data_ptr;
+	const char *str1 = strs + (long)key1;
+	const char *str2 = strs + (long)key2;
 
 	return strcmp(str1, str2) == 0;
 }
@@ -1418,8 +1428,11 @@ static int btf_ensure_modifiable(struct btf *btf)
 	memcpy(types, btf->types_data, btf->hdr->type_len);
 	memcpy(strs, btf->strs_data, btf->hdr->str_len);
 
+	/* make hashmap below use btf->strs_data as a source of strings */
+	btf->strs_data_ptr = &btf->strs_data;
+
 	/* build lookup index for all strings */
-	hash = hashmap__new(strs_hash_fn, strs_hash_equal_fn, btf);
+	hash = hashmap__new(strs_hash_fn, strs_hash_equal_fn, &btf->strs_data_ptr);
 	if (IS_ERR(hash)) {
 		err = PTR_ERR(hash);
 		hash = NULL;
@@ -2824,19 +2837,11 @@ struct btf_dedup {
 	size_t hypot_cap;
 	/* Various option modifying behavior of algorithm */
 	struct btf_dedup_opts opts;
-};
-
-struct btf_str_ptr {
-	const char *str;
-	__u32 new_off;
-	bool used;
-};
-
-struct btf_str_ptrs {
-	struct btf_str_ptr *ptrs;
-	const char *data;
-	__u32 cnt;
-	__u32 cap;
+	/* temporary strings deduplication state */
+	void *strs_data;
+	size_t strs_cap;
+	size_t strs_len;
+	struct hashmap* strs_hash;
 };
 
 static long hash_combine(long h, long value)
@@ -3063,64 +3068,41 @@ static int btf_for_each_str_off(struct btf_dedup *d, str_off_fn_t fn, void *ctx)
 	return 0;
 }
 
-static int str_sort_by_content(const void *a1, const void *a2)
-{
-	const struct btf_str_ptr *p1 = a1;
-	const struct btf_str_ptr *p2 = a2;
-
-	return strcmp(p1->str, p2->str);
-}
-
-static int str_sort_by_offset(const void *a1, const void *a2)
-{
-	const struct btf_str_ptr *p1 = a1;
-	const struct btf_str_ptr *p2 = a2;
-
-	if (p1->str != p2->str)
-		return p1->str < p2->str ? -1 : 1;
-	return 0;
-}
-
-static int btf_dedup_str_ptr_cmp(const void *str_ptr, const void *pelem)
-{
-	const struct btf_str_ptr *p = pelem;
-
-	if (str_ptr != p->str)
-		return (const char *)str_ptr < p->str ? -1 : 1;
-	return 0;
-}
-
-static int btf_str_mark_as_used(__u32 *str_off_ptr, void *ctx)
+static int strs_dedup_remap_str_off(__u32 *str_off_ptr, void *ctx)
 {
-	struct btf_str_ptrs *strs;
-	struct btf_str_ptr *s;
+	struct btf_dedup *d = ctx;
+	long old_off, new_off, len;
+	const char *s;
+	void *p;
+	int err;
 
 	if (*str_off_ptr == 0)
 		return 0;
 
-	strs = ctx;
-	s = bsearch(strs->data + *str_off_ptr, strs->ptrs, strs->cnt,
-		    sizeof(struct btf_str_ptr), btf_dedup_str_ptr_cmp);
-	if (!s)
-		return -EINVAL;
-	s->used = true;
-	return 0;
-}
+	s = btf__str_by_offset(d->btf, *str_off_ptr);
+	len = strlen(s) + 1;
 
-static int btf_str_remap_offset(__u32 *str_off_ptr, void *ctx)
-{
-	struct btf_str_ptrs *strs;
-	struct btf_str_ptr *s;
+	new_off = d->strs_len;
+	p = btf_add_mem(&d->strs_data, &d->strs_cap, 1, new_off, BTF_MAX_STR_OFFSET, len);
+	if (!p)
+		return -ENOMEM;
 
-	if (*str_off_ptr == 0)
-		return 0;
+	memcpy(p, s, len);
 
-	strs = ctx;
-	s = bsearch(strs->data + *str_off_ptr, strs->ptrs, strs->cnt,
-		    sizeof(struct btf_str_ptr), btf_dedup_str_ptr_cmp);
-	if (!s)
-		return -EINVAL;
-	*str_off_ptr = s->new_off;
+	/* Now attempt to add the string, but only if the string with the same
+	 * contents doesn't exist already (HASHMAP_ADD strategy). If such
+	 * string exists, we'll get its offset in old_off (that's old_key).
+	 */
+	err = hashmap__insert(d->strs_hash, (void *)new_off, (void *)new_off,
+			      HASHMAP_ADD, (const void **)&old_off, NULL);
+	if (err == -EEXIST) {
+		*str_off_ptr = old_off;
+	} else if (err) {
+		return err;
+	} else {
+		*str_off_ptr = new_off;
+		d->strs_len += len;
+	}
 	return 0;
 }
 
@@ -3137,118 +3119,69 @@ static int btf_str_remap_offset(__u32 *str_off_ptr, void *ctx)
  */
 static int btf_dedup_strings(struct btf_dedup *d)
 {
-	char *start = d->btf->strs_data;
-	char *end = start + d->btf->hdr->str_len;
-	char *p = start, *tmp_strs = NULL;
-	struct btf_str_ptrs strs = {
-		.cnt = 0,
-		.cap = 0,
-		.ptrs = NULL,
-		.data = start,
-	};
-	int i, j, err = 0, grp_idx;
-	bool grp_used;
+	char *s;
+	int err;
 
 	if (d->btf->strs_deduped)
 		return 0;
 
-	/* build index of all strings */
-	while (p < end) {
-		if (strs.cnt + 1 > strs.cap) {
-			struct btf_str_ptr *new_ptrs;
-
-			strs.cap += max(strs.cnt / 2, 16U);
-			new_ptrs = libbpf_reallocarray(strs.ptrs, strs.cap, sizeof(strs.ptrs[0]));
-			if (!new_ptrs) {
-				err = -ENOMEM;
-				goto done;
-			}
-			strs.ptrs = new_ptrs;
-		}
-
-		strs.ptrs[strs.cnt].str = p;
-		strs.ptrs[strs.cnt].used = false;
-
-		p += strlen(p) + 1;
-		strs.cnt++;
-	}
-
-	/* temporary storage for deduplicated strings */
-	tmp_strs = malloc(d->btf->hdr->str_len);
-	if (!tmp_strs) {
-		err = -ENOMEM;
-		goto done;
-	}
-
-	/* mark all used strings */
-	strs.ptrs[0].used = true;
-	err = btf_for_each_str_off(d, btf_str_mark_as_used, &strs);
-	if (err)
-		goto done;
-
-	/* sort strings by context, so that we can identify duplicates */
-	qsort(strs.ptrs, strs.cnt, sizeof(strs.ptrs[0]), str_sort_by_content);
+	s = btf_add_mem(&d->strs_data, &d->strs_cap, 1, d->strs_len, BTF_MAX_STR_OFFSET, 1);
+	if (!s)
+		return -ENOMEM;
+	/* initial empty string */
+	s[0] = 0;
+	d->strs_len = 1;
 
-	/*
-	 * iterate groups of equal strings and if any instance in a group was
-	 * referenced, emit single instance and remember new offset
+	/* temporarily switch to use btf_dedup's strs_data for strings for hash
+	 * functions; later we'll just transfer hashmap to struct btf as is,
+	 * along the strs_data
 	 */
-	p = tmp_strs;
-	grp_idx = 0;
-	grp_used = strs.ptrs[0].used;
-	/* iterate past end to avoid code duplication after loop */
-	for (i = 1; i <= strs.cnt; i++) {
-		/*
-		 * when i == strs.cnt, we want to skip string comparison and go
-		 * straight to handling last group of strings (otherwise we'd
-		 * need to handle last group after the loop w/ duplicated code)
-		 */
-		if (i < strs.cnt &&
-		    !strcmp(strs.ptrs[i].str, strs.ptrs[grp_idx].str)) {
-			grp_used = grp_used || strs.ptrs[i].used;
-			continue;
-		}
+	d->btf->strs_data_ptr = &d->strs_data;
 
-		/*
-		 * this check would have been required after the loop to handle
-		 * last group of strings, but due to <= condition in a loop
-		 * we avoid that duplication
-		 */
-		if (grp_used) {
-			int new_off = p - tmp_strs;
-			__u32 len = strlen(strs.ptrs[grp_idx].str);
-
-			memmove(p, strs.ptrs[grp_idx].str, len + 1);
-			for (j = grp_idx; j < i; j++)
-				strs.ptrs[j].new_off = new_off;
-			p += len + 1;
-		}
-
-		if (i < strs.cnt) {
-			grp_idx = i;
-			grp_used = strs.ptrs[i].used;
-		}
+	d->strs_hash = hashmap__new(strs_hash_fn, strs_hash_equal_fn, &d->btf->strs_data_ptr);
+	if (IS_ERR(d->strs_hash)) {
+		err = PTR_ERR(d->strs_hash);
+		d->strs_hash = NULL;
+		goto err_out;
 	}
 
-	/* replace original strings with deduped ones */
-	d->btf->hdr->str_len = p - tmp_strs;
-	memmove(start, tmp_strs, d->btf->hdr->str_len);
-	end = start + d->btf->hdr->str_len;
-
-	/* restore original order for further binary search lookups */
-	qsort(strs.ptrs, strs.cnt, sizeof(strs.ptrs[0]), str_sort_by_offset);
+	/* insert empty string; we won't be looking it up during strings
+	 * dedup, but it's good to have it for generic BTF string lookups
+	 */
+	err = hashmap__insert(d->strs_hash, (void *)0, (void *)0,
+			      HASHMAP_ADD, NULL, NULL);
+	if (err)
+		goto err_out;
 
 	/* remap string offsets */
-	err = btf_for_each_str_off(d, btf_str_remap_offset, &strs);
+	err = btf_for_each_str_off(d, strs_dedup_remap_str_off, d);
 	if (err)
-		goto done;
+		goto err_out;
 
-	d->btf->hdr->str_len = end - start;
+	/* replace BTF string data and hash with deduped ones */
+	free(d->btf->strs_data);
+	hashmap__free(d->btf->strs_hash);
+	d->btf->strs_data = d->strs_data;
+	d->btf->strs_data_cap = d->strs_cap;
+	d->btf->hdr->str_len = d->strs_len;
+	d->btf->strs_hash = d->strs_hash;
+	/* now point strs_data_ptr back to btf->strs_data */
+	d->btf->strs_data_ptr = &d->btf->strs_data;
+
+	d->strs_data = d->strs_hash = NULL;
+	d->strs_len = d->strs_cap = 0;
 	d->btf->strs_deduped = true;
+	return 0;
+
+err_out:
+	free(d->strs_data);
+	hashmap__free(d->strs_hash);
+	d->strs_data = d->strs_hash = NULL;
+	d->strs_len = d->strs_cap = 0;
+
+	/* restore strings pointer for existing d->btf->strs_hash back */
+	d->btf->strs_data_ptr = &d->strs_data;
 
-done:
-	free(tmp_strs);
-	free(strs.ptrs);
 	return err;
 }
 
-- 
2.24.1



* [PATCH bpf-next 04/11] libbpf: implement basic split BTF support
  2020-10-29  0:58 [PATCH bpf-next 00/11] libbpf: split BTF support Andrii Nakryiko
                   ` (2 preceding siblings ...)
  2020-10-29  0:58 ` [PATCH bpf-next 03/11] libbpf: unify and speed up BTF string deduplication Andrii Nakryiko
@ 2020-10-29  0:58 ` Andrii Nakryiko
  2020-11-02 23:23   ` Song Liu
  2020-10-29  0:58 ` [PATCH bpf-next 05/11] selftests/bpf: add split BTF basic test Andrii Nakryiko
                   ` (7 subsequent siblings)
  11 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-10-29  0:58 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team

Support split BTF operation, in which one BTF (base BTF) provides a basic set
of types and strings, while another one (split BTF) builds on top of the
base's types and strings and adds its own new types and strings. From an API
standpoint, the fact that the split BTF is built on top of the base BTF is
transparent.

Type numbering is transparent. If the base BTF's last type ID is #N, then all
types in the split BTF start at type ID N+1. Any type in the split BTF can
reference base BTF types, but not vice versa. Programmatic construction of a
split BTF on top of a base BTF is supported: one can create an empty split BTF
with btf__new_empty_split() and pass base BTF as an input, or pass raw binary
data to btf__new_split(), or use btf__parse_xxx_split() variants to get an
initial set of split types/strings from an ELF file with a .BTF section.
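
As a sketch (mirroring the selftest added later in this series), programmatic
construction looks roughly like this:

	struct btf *base, *split;

	base = btf__new_empty();
	btf__add_int(base, "int", 4, BTF_INT_SIGNED);	/* [1] int */

	split = btf__new_empty_split(base);
	/* split type IDs continue right after base's last type ID */
	btf__add_ptr(split, 1);				/* [2] ptr to base's int */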

String offsets are similarly transparent and are a logical continuation of the
base BTF's strings. When building BTF programmatically and adding a new string
(explicitly with btf__add_str() or implicitly through appending new
types/members), the string to be added is first looked up in the base BTF's
string section and re-used if it's there. If not, it is looked up and/or added
to the split BTF string section. Similarly to type IDs, types in split BTF can
refer to strings from the base BTF absolutely transparently (but not vice
versa, of course, because the base BTF doesn't "know" about the existence of
the split BTF).
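
Continuing the sketch above, the same transparency applies to strings:

	int off;

	/* "int" is already in base BTF's string section, so the
	 * returned offset points into base BTF's strings
	 */
	off = btf__add_str(split, "int");

	/* a genuinely new string lands in split BTF's string section,
	 * at an offset past base BTF's str_len
	 */
	off = btf__add_str(split, "brand_new_str");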

The internal type index is slightly adjusted to be zero-indexed, ignoring the
fake [0] VOID type. This allows handling split/base BTF type lookups
transparently by using the btf->start_id type ID offset, which is always 1 for
base/non-split BTF and equals btf__get_nr_types(base_btf) + 1 for split BTF.

BTF deduplication is not yet supported for split BTF; support for it will be
added in a separate patch.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/btf.c      | 205 ++++++++++++++++++++++++++++++---------
 tools/lib/bpf/btf.h      |   8 ++
 tools/lib/bpf/libbpf.map |   9 ++
 3 files changed, 175 insertions(+), 47 deletions(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index db9331fea672..20c64a8441a8 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -78,10 +78,32 @@ struct btf {
 	void *types_data;
 	size_t types_data_cap; /* used size stored in hdr->type_len */
 
-	/* type ID to `struct btf_type *` lookup index */
+	/* type ID to `struct btf_type *` lookup index
+	 * type_offs[0] corresponds to the first non-VOID type:
+	 *   - for base BTF it's type [1];
+	 *   - for split BTF it's the first non-base BTF type.
+	 */
 	__u32 *type_offs;
 	size_t type_offs_cap;
+	/* number of types in this BTF instance:
+	 *   - doesn't include special [0] void type;
+	 *   - for split BTF counts number of types added on top of base BTF.
+	 */
 	__u32 nr_types;
+	/* if not NULL, points to the base BTF on top of which the current
+	 * split BTF is based
+	 */
+	struct btf *base_btf;
+	/* BTF type ID of the first type in this BTF instance:
+	 *   - for base BTF it's equal to 1;
+	 *   - for split BTF it's equal to biggest type ID of base BTF plus 1.
+	 */
+	int start_id;
+	/* logical string offset of this BTF instance:
+	 *   - for base BTF it's equal to 0;
+	 *   - for split BTF it's equal to total size of base BTF's string section size.
+	 */
+	int start_str_off;
 
 	void *strs_data;
 	size_t strs_data_cap; /* used size stored in hdr->str_len */
@@ -176,7 +198,7 @@ static int btf_add_type_idx_entry(struct btf *btf, __u32 type_off)
 	__u32 *p;
 
 	p = btf_add_mem((void **)&btf->type_offs, &btf->type_offs_cap, sizeof(__u32),
-			btf->nr_types + 1, BTF_MAX_NR_TYPES, 1);
+			btf->nr_types, BTF_MAX_NR_TYPES, 1);
 	if (!p)
 		return -ENOMEM;
 
@@ -252,12 +274,20 @@ static int btf_parse_str_sec(struct btf *btf)
 	const char *start = btf->strs_data;
 	const char *end = start + btf->hdr->str_len;
 
-	if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET ||
-	    start[0] || end[-1]) {
-		pr_debug("Invalid BTF string section\n");
-		return -EINVAL;
+	if (btf->base_btf) {
+		if (hdr->str_len == 0)
+			return 0;
+		if (hdr->str_len - 1 > BTF_MAX_STR_OFFSET || end[-1]) {
+			pr_debug("Invalid BTF string section\n");
+			return -EINVAL;
+		}
+	} else {
+		if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET ||
+		    start[0] || end[-1]) {
+			pr_debug("Invalid BTF string section\n");
+			return -EINVAL;
+		}
 	}
-
 	return 0;
 }
 
@@ -372,19 +402,9 @@ static int btf_parse_type_sec(struct btf *btf)
 	struct btf_header *hdr = btf->hdr;
 	void *next_type = btf->types_data;
 	void *end_type = next_type + hdr->type_len;
-	int err, i = 0, type_size;
-
-	/* VOID (type_id == 0) is specially handled by btf__get_type_by_id(),
-	 * so ensure we can never properly use its offset from index by
-	 * setting it to a large value
-	 */
-	err = btf_add_type_idx_entry(btf, UINT_MAX);
-	if (err)
-		return err;
+	int err, type_size;
 
 	while (next_type + sizeof(struct btf_type) <= end_type) {
-		i++;
-
 		if (btf->swapped_endian)
 			btf_bswap_type_base(next_type);
 
@@ -392,7 +412,7 @@ static int btf_parse_type_sec(struct btf *btf)
 		if (type_size < 0)
 			return type_size;
 		if (next_type + type_size > end_type) {
-			pr_warn("BTF type [%d] is malformed\n", i);
+			pr_warn("BTF type [%d] is malformed\n", btf->start_id + btf->nr_types);
 			return -EINVAL;
 		}
 
@@ -417,7 +437,7 @@ static int btf_parse_type_sec(struct btf *btf)
 
 __u32 btf__get_nr_types(const struct btf *btf)
 {
-	return btf->nr_types;
+	return btf->start_id + btf->nr_types - 1;
 }
 
 /* internal helper returning non-const pointer to a type */
@@ -425,13 +445,14 @@ static struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id)
 {
 	if (type_id == 0)
 		return &btf_void;
-
-	return btf->types_data + btf->type_offs[type_id];
+	if (type_id < btf->start_id)
+		return btf_type_by_id(btf->base_btf, type_id);
+	return btf->types_data + btf->type_offs[type_id - btf->start_id];
 }
 
 const struct btf_type *btf__type_by_id(const struct btf *btf, __u32 type_id)
 {
-	if (type_id > btf->nr_types)
+	if (type_id >= btf->start_id + btf->nr_types)
 		return NULL;
 	return btf_type_by_id((struct btf *)btf, type_id);
 }
@@ -440,9 +461,13 @@ static int determine_ptr_size(const struct btf *btf)
 {
 	const struct btf_type *t;
 	const char *name;
-	int i;
+	int i, n;
 
-	for (i = 1; i <= btf->nr_types; i++) {
+	if (btf->base_btf && btf->base_btf->ptr_sz > 0)
+		return btf->base_btf->ptr_sz;
+
+	n = btf__get_nr_types(btf);
+	for (i = 1; i <= n; i++) {
 		t = btf__type_by_id(btf, i);
 		if (!btf_is_int(t))
 			continue;
@@ -725,7 +750,7 @@ void btf__free(struct btf *btf)
 	free(btf);
 }
 
-struct btf *btf__new_empty(void)
+static struct btf *btf_new_empty(struct btf *base_btf)
 {
 	struct btf *btf;
 
@@ -733,12 +758,21 @@ struct btf *btf__new_empty(void)
 	if (!btf)
 		return ERR_PTR(-ENOMEM);
 
+	btf->nr_types = 0;
+	btf->start_id = 1;
+	btf->start_str_off = 0;
 	btf->fd = -1;
 	btf->ptr_sz = sizeof(void *);
 	btf->swapped_endian = false;
 
+	if (base_btf) {
+		btf->base_btf = base_btf;
+		btf->start_id = btf__get_nr_types(base_btf) + 1;
+		btf->start_str_off = base_btf->hdr->str_len;
+	}
+
 	/* +1 for empty string at offset 0 */
-	btf->raw_size = sizeof(struct btf_header) + 1;
+	btf->raw_size = sizeof(struct btf_header) + (base_btf ? 0 : 1);
 	btf->raw_data = calloc(1, btf->raw_size);
 	if (!btf->raw_data) {
 		free(btf);
@@ -752,12 +786,22 @@ struct btf *btf__new_empty(void)
 
 	btf->types_data = btf->raw_data + btf->hdr->hdr_len;
 	btf->strs_data = btf->raw_data + btf->hdr->hdr_len;
-	btf->hdr->str_len = 1; /* empty string at offset 0 */
+	btf->hdr->str_len = base_btf ? 0 : 1; /* empty string at offset 0 */
 
 	return btf;
 }
 
-struct btf *btf__new(const void *data, __u32 size)
+struct btf *btf__new_empty(void)
+{
+	return btf_new_empty(NULL);
+}
+
+struct btf *btf__new_empty_split(struct btf *base_btf)
+{
+	return btf_new_empty(base_btf);
+}
+
+static struct btf *btf_new(const void *data, __u32 size, struct btf *base_btf)
 {
 	struct btf *btf;
 	int err;
@@ -766,6 +810,16 @@ struct btf *btf__new(const void *data, __u32 size)
 	if (!btf)
 		return ERR_PTR(-ENOMEM);
 
+	btf->nr_types = 0;
+	btf->start_id = 1;
+	btf->start_str_off = 0;
+
+	if (base_btf) {
+		btf->base_btf = base_btf;
+		btf->start_id = btf__get_nr_types(base_btf) + 1;
+		btf->start_str_off = base_btf->hdr->str_len;
+	}
+
 	btf->raw_data = malloc(size);
 	if (!btf->raw_data) {
 		err = -ENOMEM;
@@ -798,7 +852,13 @@ struct btf *btf__new(const void *data, __u32 size)
 	return btf;
 }
 
-struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext)
+struct btf *btf__new(const void *data, __u32 size)
+{
+	return btf_new(data, size, NULL);
+}
+
+static struct btf *btf_parse_elf(const char *path, struct btf *base_btf,
+				 struct btf_ext **btf_ext)
 {
 	Elf_Data *btf_data = NULL, *btf_ext_data = NULL;
 	int err = 0, fd = -1, idx = 0;
@@ -876,7 +936,7 @@ struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext)
 		err = -ENOENT;
 		goto done;
 	}
-	btf = btf__new(btf_data->d_buf, btf_data->d_size);
+	btf = btf_new(btf_data->d_buf, btf_data->d_size, base_btf);
 	if (IS_ERR(btf))
 		goto done;
 
@@ -921,7 +981,17 @@ struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext)
 	return btf;
 }
 
-struct btf *btf__parse_raw(const char *path)
+struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext)
+{
+	return btf_parse_elf(path, NULL, btf_ext);
+}
+
+struct btf *btf__parse_elf_split(const char *path, struct btf *base_btf)
+{
+	return btf_parse_elf(path, base_btf, NULL);
+}
+
+static struct btf *btf_parse_raw(const char *path, struct btf *base_btf)
 {
 	struct btf *btf = NULL;
 	void *data = NULL;
@@ -975,7 +1045,7 @@ struct btf *btf__parse_raw(const char *path)
 	}
 
 	/* finally parse BTF data */
-	btf = btf__new(data, sz);
+	btf = btf_new(data, sz, base_btf);
 
 err_out:
 	free(data);
@@ -984,18 +1054,38 @@ struct btf *btf__parse_raw(const char *path)
 	return err ? ERR_PTR(err) : btf;
 }
 
-struct btf *btf__parse(const char *path, struct btf_ext **btf_ext)
+struct btf *btf__parse_raw(const char *path)
+{
+	return btf_parse_raw(path, NULL);
+}
+
+struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf)
+{
+	return btf_parse_raw(path, base_btf);
+}
+
+static struct btf *btf_parse(const char *path, struct btf *base_btf, struct btf_ext **btf_ext)
 {
 	struct btf *btf;
 
 	if (btf_ext)
 		*btf_ext = NULL;
 
-	btf = btf__parse_raw(path);
+	btf = btf_parse_raw(path, base_btf);
 	if (!IS_ERR(btf) || PTR_ERR(btf) != -EPROTO)
 		return btf;
 
-	return btf__parse_elf(path, btf_ext);
+	return btf_parse_elf(path, base_btf, btf_ext);
+}
+
+struct btf *btf__parse(const char *path, struct btf_ext **btf_ext)
+{
+	return btf_parse(path, NULL, btf_ext);
+}
+
+struct btf *btf__parse_split(const char *path, struct btf *base_btf)
+{
+	return btf_parse(path, base_btf, NULL);
 }
 
 static int compare_vsi_off(const void *_a, const void *_b)
@@ -1179,8 +1269,8 @@ static void *btf_get_raw_data(const struct btf *btf, __u32 *size, bool swap_endi
 
 	memcpy(p, btf->types_data, hdr->type_len);
 	if (swap_endian) {
-		for (i = 1; i <= btf->nr_types; i++) {
-			t = p  + btf->type_offs[i];
+		for (i = 0; i < btf->nr_types; i++) {
+			t = p + btf->type_offs[i];
 			/* btf_bswap_type_rest() relies on native t->info, so
 			 * we swap base type info after we swapped all the
 			 * additional information
@@ -1223,8 +1313,10 @@ const void *btf__get_raw_data(const struct btf *btf_ro, __u32 *size)
 
 const char *btf__str_by_offset(const struct btf *btf, __u32 offset)
 {
-	if (offset < btf->hdr->str_len)
-		return btf->strs_data + offset;
+	if (offset < btf->start_str_off)
+		return btf__str_by_offset(btf->base_btf, offset);
+	else if (offset - btf->start_str_off < btf->hdr->str_len)
+		return btf->strs_data + (offset - btf->start_str_off);
 	else
 		return NULL;
 }
@@ -1461,7 +1553,10 @@ static int btf_ensure_modifiable(struct btf *btf)
 	/* if BTF was created from scratch, all strings are guaranteed to be
 	 * unique and deduplicated
 	 */
-	btf->strs_deduped = btf->hdr->str_len <= 1;
+	if (btf->hdr->str_len == 0)
+		btf->strs_deduped = true;
+	if (!btf->base_btf && btf->hdr->str_len == 1)
+		btf->strs_deduped = true;
 
 	/* invalidate raw_data representation */
 	btf_invalidate_raw_data(btf);
@@ -1493,6 +1588,14 @@ int btf__find_str(struct btf *btf, const char *s)
 	long old_off, new_off, len;
 	void *p;
 
+	if (btf->base_btf) {
+		int ret;
+
+		ret = btf__find_str(btf->base_btf, s);
+		if (ret != -ENOENT)
+			return ret;
+	}
+
 	/* BTF needs to be in a modifiable state to build string lookup index */
 	if (btf_ensure_modifiable(btf))
 		return -ENOMEM;
@@ -1507,7 +1610,7 @@ int btf__find_str(struct btf *btf, const char *s)
 	memcpy(p, s, len);
 
 	if (hashmap__find(btf->strs_hash, (void *)new_off, (void **)&old_off))
-		return old_off;
+		return btf->start_str_off + old_off;
 
 	return -ENOENT;
 }
@@ -1523,6 +1626,14 @@ int btf__add_str(struct btf *btf, const char *s)
 	void *p;
 	int err;
 
+	if (btf->base_btf) {
+		int ret;
+
+		ret = btf__find_str(btf->base_btf, s);
+		if (ret != -ENOENT)
+			return ret;
+	}
+
 	if (btf_ensure_modifiable(btf))
 		return -ENOMEM;
 
@@ -1549,12 +1660,12 @@ int btf__add_str(struct btf *btf, const char *s)
 	err = hashmap__insert(btf->strs_hash, (void *)new_off, (void *)new_off,
 			      HASHMAP_ADD, (const void **)&old_off, NULL);
 	if (err == -EEXIST)
-		return old_off; /* duplicated string, return existing offset */
+		return btf->start_str_off + old_off; /* duplicated string, return existing offset */
 	if (err)
 		return err;
 
 	btf->hdr->str_len += len; /* new unique string, adjust data length */
-	return new_off;
+	return btf->start_str_off + new_off;
 }
 
 static void *btf_add_type_mem(struct btf *btf, size_t add_sz)
@@ -1584,7 +1695,7 @@ static int btf_commit_type(struct btf *btf, int data_sz)
 	btf->hdr->type_len += data_sz;
 	btf->hdr->str_off += data_sz;
 	btf->nr_types++;
-	return btf->nr_types;
+	return btf->start_id + btf->nr_types - 1;
 }
 
 /*
@@ -4167,14 +4278,14 @@ static int btf_dedup_compact_types(struct btf_dedup *d)
 
 		memmove(p, btf__type_by_id(d->btf, i), len);
 		d->hypot_map[i] = next_type_id;
-		d->btf->type_offs[next_type_id] = p - d->btf->types_data;
+		d->btf->type_offs[next_type_id - 1] = p - d->btf->types_data;
 		p += len;
 		next_type_id++;
 	}
 
 	/* shrink struct btf's internal types index and update btf_header */
 	d->btf->nr_types = next_type_id - 1;
-	d->btf->type_offs_cap = d->btf->nr_types + 1;
+	d->btf->type_offs_cap = d->btf->nr_types;
 	d->btf->hdr->type_len = p - d->btf->types_data;
 	new_offs = libbpf_reallocarray(d->btf->type_offs, d->btf->type_offs_cap,
 				       sizeof(*new_offs));
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index 57247240a20a..1093f6fe6800 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -31,11 +31,19 @@ enum btf_endianness {
 };
 
 LIBBPF_API void btf__free(struct btf *btf);
+
 LIBBPF_API struct btf *btf__new(const void *data, __u32 size);
+LIBBPF_API struct btf *btf__new_split(const void *data, __u32 size, struct btf *base_btf);
 LIBBPF_API struct btf *btf__new_empty(void);
+LIBBPF_API struct btf *btf__new_empty_split(struct btf *base_btf);
+
 LIBBPF_API struct btf *btf__parse(const char *path, struct btf_ext **btf_ext);
+LIBBPF_API struct btf *btf__parse_split(const char *path, struct btf *base_btf);
 LIBBPF_API struct btf *btf__parse_elf(const char *path, struct btf_ext **btf_ext);
+LIBBPF_API struct btf *btf__parse_elf_split(const char *path, struct btf *base_btf);
 LIBBPF_API struct btf *btf__parse_raw(const char *path);
+LIBBPF_API struct btf *btf__parse_raw_split(const char *path, struct btf *base_btf);
+
 LIBBPF_API int btf__finalize_data(struct bpf_object *obj, struct btf *btf);
 LIBBPF_API int btf__load(struct btf *btf);
 LIBBPF_API __s32 btf__find_by_name(const struct btf *btf,
diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
index 4ebfadf45b47..29ff4807b909 100644
--- a/tools/lib/bpf/libbpf.map
+++ b/tools/lib/bpf/libbpf.map
@@ -337,3 +337,12 @@ LIBBPF_0.2.0 {
 		perf_buffer__consume_buffer;
 		xsk_socket__create_shared;
 } LIBBPF_0.1.0;
+
+LIBBPF_0.3.0 {
+	global:
+		btf__parse_elf_split;
+		btf__parse_raw_split;
+		btf__parse_split;
+		btf__new_empty_split;
+		btf__new_split;
+} LIBBPF_0.2.0;
-- 
2.24.1



* [PATCH bpf-next 05/11] selftests/bpf: add split BTF basic test
  2020-10-29  0:58 [PATCH bpf-next 00/11] libbpf: split BTF support Andrii Nakryiko
                   ` (3 preceding siblings ...)
  2020-10-29  0:58 ` [PATCH bpf-next 04/11] libbpf: implement basic split BTF support Andrii Nakryiko
@ 2020-10-29  0:58 ` Andrii Nakryiko
  2020-11-02 23:36   ` Song Liu
  2020-10-29  0:58 ` [PATCH bpf-next 06/11] selftests/bpf: add checking of raw type dump in BTF writer APIs selftests Andrii Nakryiko
                   ` (6 subsequent siblings)
  11 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-10-29  0:58 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team

Add a selftest validating the ability to programmatically generate and then
dump split BTF.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../selftests/bpf/prog_tests/btf_split.c      | 99 +++++++++++++++++++
 tools/testing/selftests/bpf/test_progs.h      | 11 +++
 2 files changed, 110 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_split.c

diff --git a/tools/testing/selftests/bpf/prog_tests/btf_split.c b/tools/testing/selftests/bpf/prog_tests/btf_split.c
new file mode 100644
index 000000000000..b032577cded5
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/btf_split.c
@@ -0,0 +1,99 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 Facebook */
+#include <test_progs.h>
+#include <bpf/btf.h>
+
+static char *dump_buf;
+static size_t dump_buf_sz;
+static FILE *dump_buf_file;
+
+static void btf_dump_printf(void *ctx, const char *fmt, va_list args)
+{
+	vfprintf(ctx, fmt, args);
+}
+
+void test_btf_split() {
+	struct btf_dump_opts opts;
+	struct btf_dump *d = NULL;
+	const struct btf_type *t;
+	struct btf *btf1, *btf2 = NULL;
+	int str_off, i, err;
+
+	btf1 = btf__new_empty();
+	if (!ASSERT_OK_PTR(btf1, "empty_main_btf"))
+		return;
+
+	btf__set_pointer_size(btf1, 8); /* enforce 64-bit arch */
+
+	btf__add_int(btf1, "int", 4, BTF_INT_SIGNED);	/* [1] int */
+	btf__add_ptr(btf1, 1);				/* [2] ptr to int */
+
+	btf__add_struct(btf1, "s1", 4);			/* [3] struct s1 { */
+	btf__add_field(btf1, "f1", 1, 0, 0);		/*      int f1; */
+							/* } */
+
+	btf2 = btf__new_empty_split(btf1);
+	if (!ASSERT_OK_PTR(btf2, "empty_split_btf"))
+		goto cleanup;
+
+	/* pointer size should be "inherited" from main BTF */
+	ASSERT_EQ(btf__pointer_size(btf2), 8, "inherit_ptr_sz");
+
+	str_off = btf__find_str(btf2, "int");
+	ASSERT_NEQ(str_off, -ENOENT, "str_int_missing");
+
+	t = btf__type_by_id(btf2, 1);
+	if (!ASSERT_OK_PTR(t, "int_type"))
+		goto cleanup;
+	ASSERT_EQ(btf_is_int(t), true, "int_kind");
+	ASSERT_STREQ(btf__str_by_offset(btf2, t->name_off), "int", "int_name");
+
+	btf__add_struct(btf2, "s2", 16);		/* [4] struct s2 {	*/
+	btf__add_field(btf2, "f1", 3, 0, 0);		/*      struct s1 f1;	*/
+	btf__add_field(btf2, "f2", 1, 32, 0);		/*      int f2;		*/
+	btf__add_field(btf2, "f3", 2, 64, 0);		/*      int *f3;	*/
+							/* } */
+
+	t = btf__type_by_id(btf1, 4);
+	ASSERT_NULL(t, "split_type_in_main");
+
+	t = btf__type_by_id(btf2, 4);
+	if (!ASSERT_OK_PTR(t, "split_struct_type"))
+		goto cleanup;
+	ASSERT_EQ(btf_is_struct(t), true, "split_struct_kind");
+	ASSERT_EQ(btf_vlen(t), 3, "split_struct_vlen");
+	ASSERT_STREQ(btf__str_by_offset(btf2, t->name_off), "s2", "split_struct_name");
+
+	/* BTF-to-C dump of split BTF */
+	dump_buf_file = open_memstream(&dump_buf, &dump_buf_sz);
+	if (!ASSERT_OK_PTR(dump_buf_file, "dump_memstream"))
+		return;
+	opts.ctx = dump_buf_file;
+	d = btf_dump__new(btf2, NULL, &opts, btf_dump_printf);
+	if (!ASSERT_OK_PTR(d, "btf_dump__new"))
+		goto cleanup;
+	for (i = 1; i <= btf__get_nr_types(btf2); i++) {
+		err = btf_dump__dump_type(d, i);
+		ASSERT_OK(err, "dump_type_ok");
+	}
+	fflush(dump_buf_file);
+	dump_buf[dump_buf_sz] = 0; /* some libc implementations don't do this */
+	ASSERT_STREQ(dump_buf,
+"struct s1 {\n"
+"	int f1;\n"
+"};\n"
+"\n"
+"struct s2 {\n"
+"	struct s1 f1;\n"
+"	int f2;\n"
+"	int *f3;\n"
+"};\n\n", "c_dump");
+
+cleanup:
+	if (dump_buf_file)
+		fclose(dump_buf_file);
+	free(dump_buf);
+	btf_dump__free(d);
+	btf__free(btf1);
+	btf__free(btf2);
+}
diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
index 238f5f61189e..d6b14853f3bc 100644
--- a/tools/testing/selftests/bpf/test_progs.h
+++ b/tools/testing/selftests/bpf/test_progs.h
@@ -141,6 +141,17 @@ extern int test__join_cgroup(const char *path);
 	___ok;								\
 })
 
+#define ASSERT_NEQ(actual, expected, name) ({				\
+	static int duration = 0;					\
+	typeof(actual) ___act = (actual);				\
+	typeof(expected) ___exp = (expected);				\
+	bool ___ok = ___act != ___exp;					\
+	CHECK(!___ok, (name),						\
+	      "unexpected %s: actual %lld == expected %lld\n",		\
+	      (name), (long long)(___act), (long long)(___exp));	\
+	___ok;								\
+})
+
 #define ASSERT_STREQ(actual, expected, name) ({				\
 	static int duration = 0;					\
 	const char *___act = actual;					\
-- 
2.24.1



* [PATCH bpf-next 06/11] selftests/bpf: add checking of raw type dump in BTF writer APIs selftests
  2020-10-29  0:58 [PATCH bpf-next 00/11] libbpf: split BTF support Andrii Nakryiko
                   ` (4 preceding siblings ...)
  2020-10-29  0:58 ` [PATCH bpf-next 05/11] selftests/bpf: add split BTF basic test Andrii Nakryiko
@ 2020-10-29  0:58 ` Andrii Nakryiko
  2020-11-03  0:08   ` Song Liu
  2020-10-29  0:58 ` [PATCH bpf-next 07/11] libbpf: fix BTF data layout checks and allow empty BTF Andrii Nakryiko
                   ` (5 subsequent siblings)
  11 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-10-29  0:58 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team, Andrii Nakryiko

From: Andrii Nakryiko <andriin@fb.com>

Add re-usable btf_helpers.{c,h} to provide BTF-related testing routines. Start
by adding raw BTF dumping helpers.

Raw BTF dump is the most succinct and at the same time a very human-friendly
way to validate the exact contents of BTF types. Cross-validate raw BTF dump
and writable BTF in a single selftest. Raw type dump checks also serve as good
self-documentation.
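
For instance, with these helpers a raw dump of an int and a struct type looks
like this (one line per type, with members on indented follow-up lines):

	[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED
	[2] STRUCT 's1' size=4 vlen=1
		'f1' type_id=1 bits_offset=0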

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/testing/selftests/bpf/Makefile          |   2 +-
 tools/testing/selftests/bpf/btf_helpers.c     | 200 ++++++++++++++++++
 tools/testing/selftests/bpf/btf_helpers.h     |  12 ++
 .../selftests/bpf/prog_tests/btf_write.c      |  48 ++++-
 4 files changed, 259 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/btf_helpers.c
 create mode 100644 tools/testing/selftests/bpf/btf_helpers.h

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 542768f5195b..c1772ec0ff5e 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -387,7 +387,7 @@ TRUNNER_TESTS_DIR := prog_tests
 TRUNNER_BPF_PROGS_DIR := progs
 TRUNNER_EXTRA_SOURCES := test_progs.c cgroup_helpers.c trace_helpers.c	\
 			 network_helpers.c testing_helpers.c		\
-			 flow_dissector_load.h
+			 btf_helpers.c	flow_dissector_load.h
 TRUNNER_EXTRA_FILES := $(OUTPUT)/urandom_read				\
 		       $(wildcard progs/btf_dump_test_case_*.c)
 TRUNNER_BPF_BUILD_RULE := CLANG_BPF_BUILD_RULE
diff --git a/tools/testing/selftests/bpf/btf_helpers.c b/tools/testing/selftests/bpf/btf_helpers.c
new file mode 100644
index 000000000000..abc3f6c04cfc
--- /dev/null
+++ b/tools/testing/selftests/bpf/btf_helpers.c
@@ -0,0 +1,200 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 Facebook */
+#include <stdio.h>
+#include <errno.h>
+#include <bpf/btf.h>
+
+static const char * const btf_kind_str_mapping[] = {
+	[BTF_KIND_UNKN]		= "UNKNOWN",
+	[BTF_KIND_INT]		= "INT",
+	[BTF_KIND_PTR]		= "PTR",
+	[BTF_KIND_ARRAY]	= "ARRAY",
+	[BTF_KIND_STRUCT]	= "STRUCT",
+	[BTF_KIND_UNION]	= "UNION",
+	[BTF_KIND_ENUM]		= "ENUM",
+	[BTF_KIND_FWD]		= "FWD",
+	[BTF_KIND_TYPEDEF]	= "TYPEDEF",
+	[BTF_KIND_VOLATILE]	= "VOLATILE",
+	[BTF_KIND_CONST]	= "CONST",
+	[BTF_KIND_RESTRICT]	= "RESTRICT",
+	[BTF_KIND_FUNC]		= "FUNC",
+	[BTF_KIND_FUNC_PROTO]	= "FUNC_PROTO",
+	[BTF_KIND_VAR]		= "VAR",
+	[BTF_KIND_DATASEC]	= "DATASEC",
+};
+
+static const char *btf_kind_str(__u16 kind)
+{
+	if (kind > BTF_KIND_DATASEC)
+		return "UNKNOWN";
+	return btf_kind_str_mapping[kind];
+}
+
+static const char *btf_int_enc_str(__u8 encoding)
+{
+	switch (encoding) {
+	case 0:
+		return "(none)";
+	case BTF_INT_SIGNED:
+		return "SIGNED";
+	case BTF_INT_CHAR:
+		return "CHAR";
+	case BTF_INT_BOOL:
+		return "BOOL";
+	default:
+		return "UNKN";
+	}
+}
+
+static const char *btf_var_linkage_str(__u32 linkage)
+{
+	switch (linkage) {
+	case BTF_VAR_STATIC:
+		return "static";
+	case BTF_VAR_GLOBAL_ALLOCATED:
+		return "global-alloc";
+	default:
+		return "(unknown)";
+	}
+}
+
+static const char *btf_func_linkage_str(const struct btf_type *t)
+{
+	switch (btf_vlen(t)) {
+	case BTF_FUNC_STATIC:
+		return "static";
+	case BTF_FUNC_GLOBAL:
+		return "global";
+	case BTF_FUNC_EXTERN:
+		return "extern";
+	default:
+		return "(unknown)";
+	}
+}
+
+static const char *btf_str(const struct btf *btf, __u32 off)
+{
+	if (!off)
+		return "(anon)";
+	return btf__str_by_offset(btf, off) ?: "(invalid)";
+}
+
+int fprintf_btf_type_raw(FILE *out, const struct btf *btf, __u32 id)
+{
+	const struct btf_type *t;
+	int kind, i;
+	__u32 vlen;
+
+	t = btf__type_by_id(btf, id);
+	if (!t)
+		return -EINVAL;
+
+	vlen = btf_vlen(t);
+	kind = btf_kind(t);
+
+	fprintf(out, "[%u] %s '%s'", id, btf_kind_str(kind), btf_str(btf, t->name_off));
+
+	switch (kind) {
+	case BTF_KIND_INT:
+		fprintf(out, " size=%u bits_offset=%u nr_bits=%u encoding=%s",
+			t->size, btf_int_offset(t), btf_int_bits(t),
+			btf_int_enc_str(btf_int_encoding(t)));
+		break;
+	case BTF_KIND_PTR:
+	case BTF_KIND_CONST:
+	case BTF_KIND_VOLATILE:
+	case BTF_KIND_RESTRICT:
+	case BTF_KIND_TYPEDEF:
+		fprintf(out, " type_id=%u", t->type);
+		break;
+	case BTF_KIND_ARRAY: {
+		const struct btf_array *arr = btf_array(t);
+
+		fprintf(out, " type_id=%u index_type_id=%u nr_elems=%u",
+			arr->type, arr->index_type, arr->nelems);
+		break;
+	}
+	case BTF_KIND_STRUCT:
+	case BTF_KIND_UNION: {
+		const struct btf_member *m = btf_members(t);
+
+		fprintf(out, " size=%u vlen=%u", t->size, vlen);
+		for (i = 0; i < vlen; i++, m++) {
+			__u32 bit_off, bit_sz;
+
+			bit_off = btf_member_bit_offset(t, i);
+			bit_sz = btf_member_bitfield_size(t, i);
+			fprintf(out, "\n\t'%s' type_id=%u bits_offset=%u",
+				btf_str(btf, m->name_off), m->type, bit_off);
+			if (bit_sz)
+				fprintf(out, " bitfield_size=%u", bit_sz);
+		}
+		break;
+	}
+	case BTF_KIND_ENUM: {
+		const struct btf_enum *v = btf_enum(t);
+
+		fprintf(out, " size=%u vlen=%u", t->size, vlen);
+		for (i = 0; i < vlen; i++, v++) {
+			fprintf(out, "\n\t'%s' val=%u",
+				btf_str(btf, v->name_off), v->val);
+		}
+		break;
+	}
+	case BTF_KIND_FWD:
+		fprintf(out, " fwd_kind=%s", btf_kflag(t) ? "union" : "struct");
+		break;
+	case BTF_KIND_FUNC:
+		fprintf(out, " type_id=%u linkage=%s", t->type, btf_func_linkage_str(t));
+		break;
+	case BTF_KIND_FUNC_PROTO: {
+		const struct btf_param *p = btf_params(t);
+
+		fprintf(out, " ret_type_id=%u vlen=%u", t->type, vlen);
+		for (i = 0; i < vlen; i++, p++) {
+			fprintf(out, "\n\t'%s' type_id=%u",
+				btf_str(btf, p->name_off), p->type);
+		}
+		break;
+	}
+	case BTF_KIND_VAR:
+		fprintf(out, " type_id=%u, linkage=%s",
+			t->type, btf_var_linkage_str(btf_var(t)->linkage));
+		break;
+	case BTF_KIND_DATASEC: {
+		const struct btf_var_secinfo *v = btf_var_secinfos(t);
+
+		fprintf(out, " size=%u vlen=%u", t->size, vlen);
+		for (i = 0; i < vlen; i++, v++) {
+			fprintf(out, "\n\ttype_id=%u offset=%u size=%u",
+				v->type, v->offset, v->size);
+		}
+		break;
+	}
+	default:
+		break;
+	}
+
+	return 0;
+}
+
+/* Print raw BTF type dump into a local buffer and return string pointer back.
+ * Buffer *will* be overwritten by subsequent btf_type_raw_dump() calls
+ */
+const char *btf_type_raw_dump(const struct btf *btf, int type_id)
+{
+	static char buf[16 * 1024];
+	FILE *buf_file;
+	int err;
+
+	buf_file = fmemopen(buf, sizeof(buf) - 1, "w");
+	if (!buf_file) {
+		fprintf(stderr, "Failed to open memstream: %d\n", errno);
+		return NULL;
+	}
+
+	err = fprintf_btf_type_raw(buf_file, btf, type_id);
+	fflush(buf_file);
+	fclose(buf_file);
+	/* don't return stale/empty buffer contents on failure */
+	if (err)
+		return NULL;
+
+	return buf;
+}
diff --git a/tools/testing/selftests/bpf/btf_helpers.h b/tools/testing/selftests/bpf/btf_helpers.h
new file mode 100644
index 000000000000..2c9ce1b61dc9
--- /dev/null
+++ b/tools/testing/selftests/bpf/btf_helpers.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (c) 2020 Facebook */
+#ifndef __BTF_HELPERS_H
+#define __BTF_HELPERS_H
+
+#include <stdio.h>
+#include <bpf/btf.h>
+
+int fprintf_btf_type_raw(FILE *out, const struct btf *btf, __u32 id);
+const char *btf_type_raw_dump(const struct btf *btf, int type_id);
+
+#endif
diff --git a/tools/testing/selftests/bpf/prog_tests/btf_write.c b/tools/testing/selftests/bpf/prog_tests/btf_write.c
index 314e1e7c36df..bc1412de1b3d 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf_write.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf_write.c
@@ -2,6 +2,7 @@
 /* Copyright (c) 2020 Facebook */
 #include <test_progs.h>
 #include <bpf/btf.h>
+#include "btf_helpers.h"
 
 static int duration = 0;
 
@@ -11,12 +12,12 @@ void test_btf_write() {
 	const struct btf_member *m;
 	const struct btf_enum *v;
 	const struct btf_param *p;
-	struct btf *btf;
+	struct btf *btf = NULL;
 	int id, err, str_off;
 
 	btf = btf__new_empty();
 	if (CHECK(IS_ERR(btf), "new_empty", "failed: %ld\n", PTR_ERR(btf)))
-		return;
+		goto err_out;
 
 	str_off = btf__find_str(btf, "int");
 	ASSERT_EQ(str_off, -ENOENT, "int_str_missing_off");
@@ -39,6 +40,8 @@ void test_btf_write() {
 	ASSERT_EQ(t->size, 4, "int_sz");
 	ASSERT_EQ(btf_int_encoding(t), BTF_INT_SIGNED, "int_enc");
 	ASSERT_EQ(btf_int_bits(t), 32, "int_bits");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 1),
+		     "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED", "raw_dump");
 
 	/* invalid int size */
 	id = btf__add_int(btf, "bad sz int", 7, 0);
@@ -59,24 +62,32 @@ void test_btf_write() {
 	t = btf__type_by_id(btf, 2);
 	ASSERT_EQ(btf_kind(t), BTF_KIND_PTR, "ptr_kind");
 	ASSERT_EQ(t->type, 1, "ptr_type");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 2),
+		     "[2] PTR '(anon)' type_id=1", "raw_dump");
 
 	id = btf__add_const(btf, 5); /* points forward to restrict */
 	ASSERT_EQ(id, 3, "const_id");
 	t = btf__type_by_id(btf, 3);
 	ASSERT_EQ(btf_kind(t), BTF_KIND_CONST, "const_kind");
 	ASSERT_EQ(t->type, 5, "const_type");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 3),
+		     "[3] CONST '(anon)' type_id=5", "raw_dump");
 
 	id = btf__add_volatile(btf, 3);
 	ASSERT_EQ(id, 4, "volatile_id");
 	t = btf__type_by_id(btf, 4);
 	ASSERT_EQ(btf_kind(t), BTF_KIND_VOLATILE, "volatile_kind");
 	ASSERT_EQ(t->type, 3, "volatile_type");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 4),
+		     "[4] VOLATILE '(anon)' type_id=3", "raw_dump");
 
 	id = btf__add_restrict(btf, 4);
 	ASSERT_EQ(id, 5, "restrict_id");
 	t = btf__type_by_id(btf, 5);
 	ASSERT_EQ(btf_kind(t), BTF_KIND_RESTRICT, "restrict_kind");
 	ASSERT_EQ(t->type, 4, "restrict_type");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 5),
+		     "[5] RESTRICT '(anon)' type_id=4", "raw_dump");
 
 	/* ARRAY */
 	id = btf__add_array(btf, 1, 2, 10); /* int *[10] */
@@ -86,6 +97,8 @@ void test_btf_write() {
 	ASSERT_EQ(btf_array(t)->index_type, 1, "array_index_type");
 	ASSERT_EQ(btf_array(t)->type, 2, "array_elem_type");
 	ASSERT_EQ(btf_array(t)->nelems, 10, "array_nelems");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 6),
+		     "[6] ARRAY '(anon)' type_id=2 index_type_id=1 nr_elems=10", "raw_dump");
 
 	/* STRUCT */
 	err = btf__add_field(btf, "field", 1, 0, 0);
@@ -113,6 +126,10 @@ void test_btf_write() {
 	ASSERT_EQ(m->type, 1, "f2_type");
 	ASSERT_EQ(btf_member_bit_offset(t, 1), 32, "f2_bit_off");
 	ASSERT_EQ(btf_member_bitfield_size(t, 1), 16, "f2_bit_sz");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 7),
+		     "[7] STRUCT 's1' size=8 vlen=2\n"
+		     "\t'f1' type_id=1 bits_offset=0\n"
+		     "\t'f2' type_id=1 bits_offset=32 bitfield_size=16", "raw_dump");
 
 	/* UNION */
 	id = btf__add_union(btf, "u1", 8);
@@ -136,6 +153,9 @@ void test_btf_write() {
 	ASSERT_EQ(m->type, 1, "f1_type");
 	ASSERT_EQ(btf_member_bit_offset(t, 0), 0, "f1_bit_off");
 	ASSERT_EQ(btf_member_bitfield_size(t, 0), 16, "f1_bit_sz");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 8),
+		     "[8] UNION 'u1' size=8 vlen=1\n"
+		     "\t'f1' type_id=1 bits_offset=0 bitfield_size=16", "raw_dump");
 
 	/* ENUM */
 	id = btf__add_enum(btf, "e1", 4);
@@ -156,6 +176,10 @@ void test_btf_write() {
 	v = btf_enum(t) + 1;
 	ASSERT_STREQ(btf__str_by_offset(btf, v->name_off), "v2", "v2_name");
 	ASSERT_EQ(v->val, 2, "v2_val");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 9),
+		     "[9] ENUM 'e1' size=4 vlen=2\n"
+		     "\t'v1' val=1\n"
+		     "\t'v2' val=2", "raw_dump");
 
 	/* FWDs */
 	id = btf__add_fwd(btf, "struct_fwd", BTF_FWD_STRUCT);
@@ -164,6 +188,8 @@ void test_btf_write() {
 	ASSERT_STREQ(btf__str_by_offset(btf, t->name_off), "struct_fwd", "fwd_name");
 	ASSERT_EQ(btf_kind(t), BTF_KIND_FWD, "fwd_kind");
 	ASSERT_EQ(btf_kflag(t), 0, "fwd_kflag");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 10),
+		     "[10] FWD 'struct_fwd' fwd_kind=struct", "raw_dump");
 
 	id = btf__add_fwd(btf, "union_fwd", BTF_FWD_UNION);
 	ASSERT_EQ(id, 11, "union_fwd_id");
@@ -171,6 +197,8 @@ void test_btf_write() {
 	ASSERT_STREQ(btf__str_by_offset(btf, t->name_off), "union_fwd", "fwd_name");
 	ASSERT_EQ(btf_kind(t), BTF_KIND_FWD, "fwd_kind");
 	ASSERT_EQ(btf_kflag(t), 1, "fwd_kflag");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 11),
+		     "[11] FWD 'union_fwd' fwd_kind=union", "raw_dump");
 
 	id = btf__add_fwd(btf, "enum_fwd", BTF_FWD_ENUM);
 	ASSERT_EQ(id, 12, "enum_fwd_id");
@@ -179,6 +207,8 @@ void test_btf_write() {
 	ASSERT_EQ(btf_kind(t), BTF_KIND_ENUM, "enum_fwd_kind");
 	ASSERT_EQ(btf_vlen(t), 0, "enum_fwd_kind");
 	ASSERT_EQ(t->size, 4, "enum_fwd_sz");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 12),
+		     "[12] ENUM 'enum_fwd' size=4 vlen=0", "raw_dump");
 
 	/* TYPEDEF */
 	id = btf__add_typedef(btf, "typedef1", 1);
@@ -187,6 +217,8 @@ void test_btf_write() {
 	ASSERT_STREQ(btf__str_by_offset(btf, t->name_off), "typedef1", "typedef_name");
 	ASSERT_EQ(btf_kind(t), BTF_KIND_TYPEDEF, "typedef_kind");
 	ASSERT_EQ(t->type, 1, "typedef_type");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 13),
+		     "[13] TYPEDEF 'typedef1' type_id=1", "raw_dump");
 
 	/* FUNC & FUNC_PROTO */
 	id = btf__add_func(btf, "func1", BTF_FUNC_GLOBAL, 15);
@@ -196,6 +228,8 @@ void test_btf_write() {
 	ASSERT_EQ(t->type, 15, "func_type");
 	ASSERT_EQ(btf_kind(t), BTF_KIND_FUNC, "func_kind");
 	ASSERT_EQ(btf_vlen(t), BTF_FUNC_GLOBAL, "func_vlen");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 14),
+		     "[14] FUNC 'func1' type_id=15 linkage=global", "raw_dump");
 
 	id = btf__add_func_proto(btf, 1);
 	ASSERT_EQ(id, 15, "func_proto_id");
@@ -214,6 +248,10 @@ void test_btf_write() {
 	p = btf_params(t) + 1;
 	ASSERT_STREQ(btf__str_by_offset(btf, p->name_off), "p2", "p2_name");
 	ASSERT_EQ(p->type, 2, "p2_type");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 15),
+		     "[15] FUNC_PROTO '(anon)' ret_type_id=1 vlen=2\n"
+		     "\t'p1' type_id=1\n"
+		     "\t'p2' type_id=2", "raw_dump");
 
 	/* VAR */
 	id = btf__add_var(btf, "var1", BTF_VAR_GLOBAL_ALLOCATED, 1);
@@ -223,6 +261,8 @@ void test_btf_write() {
 	ASSERT_EQ(btf_kind(t), BTF_KIND_VAR, "var_kind");
 	ASSERT_EQ(t->type, 1, "var_type");
 	ASSERT_EQ(btf_var(t)->linkage, BTF_VAR_GLOBAL_ALLOCATED, "var_type");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 16),
+		     "[16] VAR 'var1' type_id=1, linkage=global-alloc", "raw_dump");
 
 	/* DATASECT */
 	id = btf__add_datasec(btf, "datasec1", 12);
@@ -239,6 +279,10 @@ void test_btf_write() {
 	ASSERT_EQ(vi->type, 1, "v1_type");
 	ASSERT_EQ(vi->offset, 4, "v1_off");
 	ASSERT_EQ(vi->size, 8, "v1_sz");
+	ASSERT_STREQ(btf_type_raw_dump(btf, 17),
+		     "[17] DATASEC 'datasec1' size=12 vlen=1\n"
+		     "\ttype_id=1 offset=4 size=8", "raw_dump");
 
+err_out:
 	btf__free(btf);
 }
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH bpf-next 07/11] libbpf: fix BTF data layout checks and allow empty BTF
  2020-10-29  0:58 [PATCH bpf-next 00/11] libbpf: split BTF support Andrii Nakryiko
                   ` (5 preceding siblings ...)
  2020-10-29  0:58 ` [PATCH bpf-next 06/11] selftests/bpf: add checking of raw type dump in BTF writer APIs selftests Andrii Nakryiko
@ 2020-10-29  0:58 ` Andrii Nakryiko
  2020-11-03  0:51   ` Song Liu
  2020-10-29  0:58 ` [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs Andrii Nakryiko
                   ` (4 subsequent siblings)
  11 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-10-29  0:58 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team

Make data section layout checks stricter, disallowing overlap of types and
strings data.

Additionally, allow BTFs with no type data. There is nothing inherently wrong
with having a BTF with no types (but potentially with some strings). This can
be the case for kernel module BTFs, if a module doesn't introduce any new
type information.

Also fix the invalid offset alignment check for btf->hdr->type_off: the old
check tested (type_off & 0x02), which lets misaligned offsets such as 1 slip
through; check (type_off % 4) instead.
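
The resulting layout constraints boil down to the following (an illustrative
consolidation of the checks, not the literal patch code; offsets are relative
to the end of the BTF header, and meta_left is the number of data bytes after
the header):

	static bool btf_layout_ok(const struct btf_header *hdr, __u32 meta_left)
	{
		if (hdr->type_off % 4)
			return false;	/* type data must be 4-byte aligned */
		if (hdr->type_off + hdr->type_len > hdr->str_off)
			return false;	/* type data must not overlap strings */
		if (hdr->str_off + hdr->str_len > meta_left)
			return false;	/* strings must fit into the data blob */
		return true;		/* note: type_len == 0 (no types) is OK */
	}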

Fixes: 8a138aed4a80 ("bpf: btf: Add BTF support to libbpf")
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/btf.c | 16 ++++++----------
 1 file changed, 6 insertions(+), 10 deletions(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 20c64a8441a8..9b0ef71a03d0 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -245,22 +245,18 @@ static int btf_parse_hdr(struct btf *btf)
 		return -EINVAL;
 	}
 
-	if (meta_left < hdr->type_off) {
-		pr_debug("Invalid BTF type section offset:%u\n", hdr->type_off);
+	if (meta_left < hdr->str_off + hdr->str_len) {
+		pr_debug("Invalid BTF total size:%u\n", btf->raw_size);
 		return -EINVAL;
 	}
 
-	if (meta_left < hdr->str_off) {
-		pr_debug("Invalid BTF string section offset:%u\n", hdr->str_off);
+	if (hdr->type_off + hdr->type_len > hdr->str_off) {
+		pr_debug("Invalid BTF data sections layout: type data at %u + %u, strings data at %u + %u\n",
+			 hdr->type_off, hdr->type_len, hdr->str_off, hdr->str_len);
 		return -EINVAL;
 	}
 
-	if (hdr->type_off >= hdr->str_off) {
-		pr_debug("BTF type section offset >= string section offset. No type?\n");
-		return -EINVAL;
-	}
-
-	if (hdr->type_off & 0x02) {
+	if (hdr->type_off % 4) {
 		pr_debug("BTF type section is not aligned to 4 bytes\n");
 		return -EINVAL;
 	}
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs
  2020-10-29  0:58 [PATCH bpf-next 00/11] libbpf: split BTF support Andrii Nakryiko
                   ` (6 preceding siblings ...)
  2020-10-29  0:58 ` [PATCH bpf-next 07/11] libbpf: fix BTF data layout checks and allow empty BTF Andrii Nakryiko
@ 2020-10-29  0:58 ` Andrii Nakryiko
  2020-11-03  2:49   ` Song Liu
  2020-11-03  5:10   ` Alexei Starovoitov
  2020-10-29  0:59 ` [PATCH bpf-next 09/11] libbpf: accommodate DWARF/compiler bug with duplicated identical arrays Andrii Nakryiko
                   ` (3 subsequent siblings)
  11 siblings, 2 replies; 49+ messages in thread
From: Andrii Nakryiko @ 2020-10-29  0:58 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team

Add support for deduplicating split BTFs. When deduplicating split BTF, base
BTF is considered to be immutable and can't be modified or adjusted. 99% of
BTF deduplication logic is left intact (modulo some type numbering
adjustments). There are only two differences.

First, each type in base BTF gets hashed (except VAR and DATASEC, of course,
as those are always considered to be self-canonical instances) and added into
a table of canonical type candidates. Hashing is a shallow, fast operation,
so this mostly eliminates the overhead of having the entire base BTF be a
part of BTF dedup.
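
In pseudo-C, this preparation step amounts to the following (see
btf_dedup_prep() in the diff below for the real implementation):

	for (type_id = 1; type_id < d->btf->start_id; type_id++) {
		t = btf_type_by_id(d->btf, type_id);
		/* base BTF types are self-canonical by definition */
		d->map[type_id] = type_id;
		/* VAR and DATASEC are never hashed/deduplicated */
		if (btf_is_var(t) || btf_is_datasec(t))
			continue;
		h = btf_hash_xxx(t);	/* shallow, kind-specific hash */
		btf_dedup_table_add(d, h, type_id);
	}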

The second difference is critical and subtle. While deduplicating split BTF
types, it is possible to discover that one of the immutable base BTF
BTF_KIND_FWD types can and should be resolved to a full STRUCT/UNION type
from the split BTF part. This, obviously, can't happen, because we can't
modify the base BTF types anymore. Because of that, any type in split BTF
that directly or indirectly references that newly-to-be-resolved FWD type
can't be considered equivalent to the corresponding canonical types in base
BTF, as that would result in a loss of type resolution information. In such
a case, split BTF types are deduplicated separately, causing some duplication
of type information, which is unavoidable.
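
Concretely, consider this situation (illustrative type IDs, shown in the raw
dump format used by the selftests):

  base BTF (immutable):
    [10] FWD 's2' fwd_kind=struct
    [11] PTR '(anon)' type_id=10
  split BTF:
    [100] STRUCT 's2' size=4 vlen=1
          'f1' type_id=1 bits_offset=0
    [101] PTR '(anon)' type_id=100

Normally dedup would resolve FWD [10] into STRUCT [100], making PTR [101]
a duplicate of PTR [11]. With [10] being immutable, though, [101] has to
stay distinct from [11], and such split BTF types are deduplicated only
among themselves.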

With those two changes, the rest of the algorithm manages to deduplicate
split BTF correctly: all the duplicates are pointed to their canonical
counterparts in base BTF, while whatever unique types are present in split
BTF are deduplicated on their own.

Also, theoretically, split BTF after deduplication could end up with either
an empty type section or an empty string section. This is handled correctly
by libbpf thanks to one of the previous patches in this series.

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/btf.c | 218 +++++++++++++++++++++++++++++++++-----------
 1 file changed, 165 insertions(+), 53 deletions(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index 9b0ef71a03d0..c760a5809d4d 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -2722,6 +2722,7 @@ struct btf_dedup;
 static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
 				       const struct btf_dedup_opts *opts);
 static void btf_dedup_free(struct btf_dedup *d);
+static int btf_dedup_prep(struct btf_dedup *d);
 static int btf_dedup_strings(struct btf_dedup *d);
 static int btf_dedup_prim_types(struct btf_dedup *d);
 static int btf_dedup_struct_types(struct btf_dedup *d);
@@ -2880,6 +2881,11 @@ int btf__dedup(struct btf *btf, struct btf_ext *btf_ext,
 	if (btf_ensure_modifiable(btf))
 		return -ENOMEM;
 
+	err = btf_dedup_prep(d);
+	if (err) {
+		pr_debug("btf_dedup_prep failed:%d\n", err);
+		goto done;
+	}
 	err = btf_dedup_strings(d);
 	if (err < 0) {
 		pr_debug("btf_dedup_strings failed:%d\n", err);
@@ -2942,6 +2948,13 @@ struct btf_dedup {
 	__u32 *hypot_list;
 	size_t hypot_cnt;
 	size_t hypot_cap;
+	/* Whether the hypothetical mapping, if successful, would need to adjust
+	 * already canonicalized types (due to a new forward declaration to
+	 * concrete type resolution). In such a case, during split BTF dedup
+	 * the candidate type would still be considered different, because base
+	 * BTF is considered to be immutable.
+	 */
+	bool hypot_adjust_canon;
 	/* Various option modifying behavior of algorithm */
 	struct btf_dedup_opts opts;
 	/* temporary strings deduplication state */
@@ -2989,6 +3002,7 @@ static void btf_dedup_clear_hypot_map(struct btf_dedup *d)
 	for (i = 0; i < d->hypot_cnt; i++)
 		d->hypot_map[d->hypot_list[i]] = BTF_UNPROCESSED_ID;
 	d->hypot_cnt = 0;
+	d->hypot_adjust_canon = false;
 }
 
 static void btf_dedup_free(struct btf_dedup *d)
@@ -3028,7 +3042,7 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
 {
 	struct btf_dedup *d = calloc(1, sizeof(struct btf_dedup));
 	hashmap_hash_fn hash_fn = btf_dedup_identity_hash_fn;
-	int i, err = 0;
+	int i, err = 0, type_cnt;
 
 	if (!d)
 		return ERR_PTR(-ENOMEM);
@@ -3048,14 +3062,15 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
 		goto done;
 	}
 
-	d->map = malloc(sizeof(__u32) * (1 + btf->nr_types));
+	type_cnt = btf__get_nr_types(btf) + 1;
+	d->map = malloc(sizeof(__u32) * type_cnt);
 	if (!d->map) {
 		err = -ENOMEM;
 		goto done;
 	}
 	/* special BTF "void" type is made canonical immediately */
 	d->map[0] = 0;
-	for (i = 1; i <= btf->nr_types; i++) {
+	for (i = 1; i < type_cnt; i++) {
 		struct btf_type *t = btf_type_by_id(d->btf, i);
 
 		/* VAR and DATASEC are never deduped and are self-canonical */
@@ -3065,12 +3080,12 @@ static struct btf_dedup *btf_dedup_new(struct btf *btf, struct btf_ext *btf_ext,
 			d->map[i] = BTF_UNPROCESSED_ID;
 	}
 
-	d->hypot_map = malloc(sizeof(__u32) * (1 + btf->nr_types));
+	d->hypot_map = malloc(sizeof(__u32) * type_cnt);
 	if (!d->hypot_map) {
 		err = -ENOMEM;
 		goto done;
 	}
-	for (i = 0; i <= btf->nr_types; i++)
+	for (i = 0; i < type_cnt; i++)
 		d->hypot_map[i] = BTF_UNPROCESSED_ID;
 
 done:
@@ -3094,8 +3109,8 @@ static int btf_for_each_str_off(struct btf_dedup *d, str_off_fn_t fn, void *ctx)
 	int i, j, r, rec_size;
 	struct btf_type *t;
 
-	for (i = 1; i <= d->btf->nr_types; i++) {
-		t = btf_type_by_id(d->btf, i);
+	for (i = 0; i < d->btf->nr_types; i++) {
+		t = btf_type_by_id(d->btf, d->btf->start_id + i);
 		r = fn(&t->name_off, ctx);
 		if (r)
 			return r;
@@ -3178,15 +3193,27 @@ static int btf_for_each_str_off(struct btf_dedup *d, str_off_fn_t fn, void *ctx)
 static int strs_dedup_remap_str_off(__u32 *str_off_ptr, void *ctx)
 {
 	struct btf_dedup *d = ctx;
+	__u32 str_off = *str_off_ptr;
 	long old_off, new_off, len;
 	const char *s;
 	void *p;
 	int err;
 
-	if (*str_off_ptr == 0)
+	/* don't touch empty string or string in main BTF */
+	if (str_off == 0 || str_off < d->btf->start_str_off)
 		return 0;
 
-	s = btf__str_by_offset(d->btf, *str_off_ptr);
+	s = btf__str_by_offset(d->btf, str_off);
+	if (d->btf->base_btf) {
+		err = btf__find_str(d->btf->base_btf, s);
+		if (err >= 0) {
+			*str_off_ptr = err;
+			return 0;
+		}
+		if (err != -ENOENT)
+			return err;
+	}
+
 	len = strlen(s) + 1;
 
 	new_off = d->strs_len;
@@ -3203,11 +3230,11 @@ static int strs_dedup_remap_str_off(__u32 *str_off_ptr, void *ctx)
 	err = hashmap__insert(d->strs_hash, (void *)new_off, (void *)new_off,
 			      HASHMAP_ADD, (const void **)&old_off, NULL);
 	if (err == -EEXIST) {
-		*str_off_ptr = old_off;
+		*str_off_ptr = d->btf->start_str_off + old_off;
 	} else if (err) {
 		return err;
 	} else {
-		*str_off_ptr = new_off;
+		*str_off_ptr = d->btf->start_str_off + new_off;
 		d->strs_len += len;
 	}
 	return 0;
@@ -3232,13 +3259,6 @@ static int btf_dedup_strings(struct btf_dedup *d)
 	if (d->btf->strs_deduped)
 		return 0;
 
-	s = btf_add_mem(&d->strs_data, &d->strs_cap, 1, d->strs_len, BTF_MAX_STR_OFFSET, 1);
-	if (!s)
-		return -ENOMEM;
-	/* initial empty string */
-	s[0] = 0;
-	d->strs_len = 1;
-
 	/* temporarily switch to use btf_dedup's strs_data for strings for hash
 	 * functions; later we'll just transfer hashmap to struct btf as is,
 	 * along the strs_data
@@ -3252,13 +3272,22 @@ static int btf_dedup_strings(struct btf_dedup *d)
 		goto err_out;
 	}
 
-	/* insert empty string; we won't be looking it up during strings
-	 * dedup, but it's good to have it for generic BTF string lookups
-	 */
-	err = hashmap__insert(d->strs_hash, (void *)0, (void *)0,
-			      HASHMAP_ADD, NULL, NULL);
-	if (err)
-		goto err_out;
+	if (!d->btf->base_btf) {
+		s = btf_add_mem(&d->strs_data, &d->strs_cap, 1, d->strs_len, BTF_MAX_STR_OFFSET, 1);
+		if (!s)
+			return -ENOMEM;
+		/* initial empty string */
+		s[0] = 0;
+		d->strs_len = 1;
+
+		/* insert empty string; we won't be looking it up during strings
+		 * dedup, but it's good to have it for generic BTF string lookups
+		 */
+		err = hashmap__insert(d->strs_hash, (void *)0, (void *)0,
+				      HASHMAP_ADD, NULL, NULL);
+		if (err)
+			goto err_out;
+	}
 
 	/* remap string offsets */
 	err = btf_for_each_str_off(d, strs_dedup_remap_str_off, d);
@@ -3553,6 +3582,63 @@ static bool btf_compat_fnproto(struct btf_type *t1, struct btf_type *t2)
 	return true;
 }
 
+static int btf_dedup_prep(struct btf_dedup *d)
+{
+	struct btf_type *t;
+	int type_id;
+	long h;
+
+	if (!d->btf->base_btf)
+		return 0;
+
+	for (type_id = 1; type_id < d->btf->start_id; type_id++) {
+		t = btf_type_by_id(d->btf, type_id);
+
+		/* all base BTF types are self-canonical by definition */
+		d->map[type_id] = type_id;
+
+		switch (btf_kind(t)) {
+		case BTF_KIND_VAR:
+		case BTF_KIND_DATASEC:
+			/* VAR and DATASEC are never hash/deduplicated */
+			continue;
+		case BTF_KIND_CONST:
+		case BTF_KIND_VOLATILE:
+		case BTF_KIND_RESTRICT:
+		case BTF_KIND_PTR:
+		case BTF_KIND_FWD:
+		case BTF_KIND_TYPEDEF:
+		case BTF_KIND_FUNC:
+			h = btf_hash_common(t);
+			break;
+		case BTF_KIND_INT:
+			h = btf_hash_int(t);
+			break;
+		case BTF_KIND_ENUM:
+			h = btf_hash_enum(t);
+			break;
+		case BTF_KIND_STRUCT:
+		case BTF_KIND_UNION:
+			h = btf_hash_struct(t);
+			break;
+		case BTF_KIND_ARRAY:
+			h = btf_hash_array(t);
+			break;
+		case BTF_KIND_FUNC_PROTO:
+			h = btf_hash_fnproto(t);
+			break;
+		default:
+			pr_debug("unknown kind %d for type [%d]\n", btf_kind(t), type_id);
+			return -EINVAL;
+		}
+		if (btf_dedup_table_add(d, h, type_id))
+			return -ENOMEM;
+	}
+
+	return 0;
+}
+
 /*
  * Deduplicate primitive types, that can't reference other types, by calculating
  * their type signature hash and comparing them with any possible canonical
@@ -3646,8 +3732,8 @@ static int btf_dedup_prim_types(struct btf_dedup *d)
 {
 	int i, err;
 
-	for (i = 1; i <= d->btf->nr_types; i++) {
-		err = btf_dedup_prim_type(d, i);
+	for (i = 0; i < d->btf->nr_types; i++) {
+		err = btf_dedup_prim_type(d, d->btf->start_id + i);
 		if (err)
 			return err;
 	}
@@ -3837,6 +3923,9 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id,
 		} else {
 			real_kind = cand_kind;
 			fwd_kind = btf_fwd_kind(canon_type);
+			/* we'd need to resolve base FWD to STRUCT/UNION */
+			if (fwd_kind == real_kind && canon_id < d->btf->start_id)
+				d->hypot_adjust_canon = true;
 		}
 		return fwd_kind == real_kind;
 	}
@@ -3874,8 +3963,7 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id,
 			return 0;
 		cand_arr = btf_array(cand_type);
 		canon_arr = btf_array(canon_type);
-		eq = btf_dedup_is_equiv(d,
-			cand_arr->index_type, canon_arr->index_type);
+		eq = btf_dedup_is_equiv(d, cand_arr->index_type, canon_arr->index_type);
 		if (eq <= 0)
 			return eq;
 		return btf_dedup_is_equiv(d, cand_arr->type, canon_arr->type);
@@ -3958,16 +4046,16 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id,
  */
 static void btf_dedup_merge_hypot_map(struct btf_dedup *d)
 {
-	__u32 cand_type_id, targ_type_id;
+	__u32 canon_type_id, targ_type_id;
 	__u16 t_kind, c_kind;
 	__u32 t_id, c_id;
 	int i;
 
 	for (i = 0; i < d->hypot_cnt; i++) {
-		cand_type_id = d->hypot_list[i];
-		targ_type_id = d->hypot_map[cand_type_id];
+		canon_type_id = d->hypot_list[i];
+		targ_type_id = d->hypot_map[canon_type_id];
 		t_id = resolve_type_id(d, targ_type_id);
-		c_id = resolve_type_id(d, cand_type_id);
+		c_id = resolve_type_id(d, canon_type_id);
 		t_kind = btf_kind(btf__type_by_id(d->btf, t_id));
 		c_kind = btf_kind(btf__type_by_id(d->btf, c_id));
 		/*
@@ -3982,9 +4070,26 @@ static void btf_dedup_merge_hypot_map(struct btf_dedup *d)
 		 * stability is not a requirement for STRUCT/UNION equivalence
 		 * checks, though.
 		 */
+
+		/* if it's the split BTF case, we still need to point base FWD
+		 * to STRUCT/UNION in a split BTF, because FWDs from split BTF
+		 * will be resolved against base FWD. If we don't point base
+		 * canonical FWD to the resolved STRUCT/UNION, then all the
+		 * FWDs in split BTF won't be correctly resolved to a proper
+		 * STRUCT/UNION.
+		 */
 		if (t_kind != BTF_KIND_FWD && c_kind == BTF_KIND_FWD)
 			d->map[c_id] = t_id;
-		else if (t_kind == BTF_KIND_FWD && c_kind != BTF_KIND_FWD)
+
+		/* if graph equivalence determined that we'd need to adjust
+		 * base canonical types, then we need to only point base FWDs
+		 * to STRUCTs/UNIONs and do no more modifications. For all
+		 * other purposes the type graphs were not equivalent.
+		 */
+		if (d->hypot_adjust_canon)
+			continue;
+
+		if (t_kind == BTF_KIND_FWD && c_kind != BTF_KIND_FWD)
 			d->map[t_id] = c_id;
 
 		if ((t_kind == BTF_KIND_STRUCT || t_kind == BTF_KIND_UNION) &&
@@ -4068,8 +4173,10 @@ static int btf_dedup_struct_type(struct btf_dedup *d, __u32 type_id)
 			return eq;
 		if (!eq)
 			continue;
-		new_id = cand_id;
 		btf_dedup_merge_hypot_map(d);
+		if (d->hypot_adjust_canon) /* not really equivalent */
+			continue;
+		new_id = cand_id;
 		break;
 	}
 
@@ -4084,8 +4191,8 @@ static int btf_dedup_struct_types(struct btf_dedup *d)
 {
 	int i, err;
 
-	for (i = 1; i <= d->btf->nr_types; i++) {
-		err = btf_dedup_struct_type(d, i);
+	for (i = 0; i < d->btf->nr_types; i++) {
+		err = btf_dedup_struct_type(d, d->btf->start_id + i);
 		if (err)
 			return err;
 	}
@@ -4228,8 +4335,8 @@ static int btf_dedup_ref_types(struct btf_dedup *d)
 {
 	int i, err;
 
-	for (i = 1; i <= d->btf->nr_types; i++) {
-		err = btf_dedup_ref_type(d, i);
+	for (i = 0; i < d->btf->nr_types; i++) {
+		err = btf_dedup_ref_type(d, d->btf->start_id + i);
 		if (err < 0)
 			return err;
 	}
@@ -4253,39 +4360,44 @@ static int btf_dedup_ref_types(struct btf_dedup *d)
 static int btf_dedup_compact_types(struct btf_dedup *d)
 {
 	__u32 *new_offs;
-	__u32 next_type_id = 1;
+	__u32 next_type_id = d->btf->start_id;
+	const struct btf_type *t;
 	void *p;
-	int i, len;
+	int i, id, len;
 
 	/* we are going to reuse hypot_map to store compaction remapping */
 	d->hypot_map[0] = 0;
-	for (i = 1; i <= d->btf->nr_types; i++)
-		d->hypot_map[i] = BTF_UNPROCESSED_ID;
+	/* base BTF types are not renumbered */
+	for (id = 1; id < d->btf->start_id; id++)
+		d->hypot_map[id] = id;
+	for (i = 0, id = d->btf->start_id; i < d->btf->nr_types; i++, id++)
+		d->hypot_map[id] = BTF_UNPROCESSED_ID;
 
 	p = d->btf->types_data;
 
-	for (i = 1; i <= d->btf->nr_types; i++) {
-		if (d->map[i] != i)
+	for (i = 0, id = d->btf->start_id; i < d->btf->nr_types; i++, id++) {
+		if (d->map[id] != id)
 			continue;
 
-		len = btf_type_size(btf__type_by_id(d->btf, i));
+		t = btf__type_by_id(d->btf, id);
+		len = btf_type_size(t);
 		if (len < 0)
 			return len;
 
-		memmove(p, btf__type_by_id(d->btf, i), len);
-		d->hypot_map[i] = next_type_id;
-		d->btf->type_offs[next_type_id - 1] = p - d->btf->types_data;
+		memmove(p, t, len);
+		d->hypot_map[id] = next_type_id;
+		d->btf->type_offs[next_type_id - d->btf->start_id] = p - d->btf->types_data;
 		p += len;
 		next_type_id++;
 	}
 
 	/* shrink struct btf's internal types index and update btf_header */
-	d->btf->nr_types = next_type_id - 1;
+	d->btf->nr_types = next_type_id - d->btf->start_id;
 	d->btf->type_offs_cap = d->btf->nr_types;
 	d->btf->hdr->type_len = p - d->btf->types_data;
 	new_offs = libbpf_reallocarray(d->btf->type_offs, d->btf->type_offs_cap,
 				       sizeof(*new_offs));
-	if (!new_offs)
+	if (d->btf->type_offs_cap && !new_offs)
 		return -ENOMEM;
 	d->btf->type_offs = new_offs;
 	d->btf->hdr->str_off = d->btf->hdr->type_len;
@@ -4417,8 +4529,8 @@ static int btf_dedup_remap_types(struct btf_dedup *d)
 {
 	int i, r;
 
-	for (i = 1; i <= d->btf->nr_types; i++) {
-		r = btf_dedup_remap_type(d, i);
+	for (i = 0; i < d->btf->nr_types; i++) {
+		r = btf_dedup_remap_type(d, d->btf->start_id + i);
 		if (r < 0)
 			return r;
 	}
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH bpf-next 09/11] libbpf: accommodate DWARF/compiler bug with duplicated identical arrays
  2020-10-29  0:58 [PATCH bpf-next 00/11] libbpf: split BTF support Andrii Nakryiko
                   ` (7 preceding siblings ...)
  2020-10-29  0:58 ` [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs Andrii Nakryiko
@ 2020-10-29  0:59 ` Andrii Nakryiko
  2020-11-03  2:52   ` Song Liu
  2020-10-29  0:59 ` [PATCH bpf-next 10/11] selftests/bpf: add split BTF dedup selftests Andrii Nakryiko
                   ` (2 subsequent siblings)
  11 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-10-29  0:59 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team

In some cases the compiler seems to generate distinct DWARF types for
identical arrays within the same CU. That seems like a bug, but it's already
out there and breaks type graph equivalence checks, so accommodate it by
explicitly checking for identical arrays, regardless of their type IDs.
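
A minimal illustration (a sketch; the exact conditions that trigger the
duplication depend on the compiler and DWARF producer):

	struct s {
		int a[16];
		int b[16];	/* may get its own, bit-for-bit identical DWARF array type */
	};

Both fields have the same array type, yet the generated BTF (before dedup)
can contain two distinct ARRAY entries for them, which used to make the
candidate type graph look internally inconsistent during equivalence checks.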

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/btf.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
index c760a5809d4d..4643b0482686 100644
--- a/tools/lib/bpf/btf.c
+++ b/tools/lib/bpf/btf.c
@@ -3786,6 +3786,19 @@ static inline __u16 btf_fwd_kind(struct btf_type *t)
 	return btf_kflag(t) ? BTF_KIND_UNION : BTF_KIND_STRUCT;
 }
 
+/* Check if given two types are identical ARRAY definitions */
+static int btf_dedup_identical_arrays(struct btf_dedup *d, __u32 id1, __u32 id2)
+{
+	struct btf_type *t1, *t2;
+
+	t1 = btf_type_by_id(d->btf, id1);
+	t2 = btf_type_by_id(d->btf, id2);
+	if (!btf_is_array(t1) || !btf_is_array(t2))
+		return 0;
+
+	return btf_equal_array(t1, t2);
+}
+
 /*
  * Check equivalence of BTF type graph formed by candidate struct/union (we'll
  * call it "candidate graph" in this description for brevity) to a type graph
@@ -3896,8 +3909,18 @@ static int btf_dedup_is_equiv(struct btf_dedup *d, __u32 cand_id,
 	canon_id = resolve_fwd_id(d, canon_id);
 
 	hypot_type_id = d->hypot_map[canon_id];
-	if (hypot_type_id <= BTF_MAX_NR_TYPES)
-		return hypot_type_id == cand_id;
+	if (hypot_type_id <= BTF_MAX_NR_TYPES) {
+		/* In some cases compiler will generate different DWARF types
+		 * for *identical* array type definitions and use them for
+		 * different fields within the *same* struct. This breaks type
+		 * equivalence check, which makes an assumption that candidate
+		 * types sub-graph has a consistent and deduped-by-compiler
+		 * types within a single CU. So work around that by explicitly
+		 * allowing identical array types here.
+		 */
+		return hypot_type_id == cand_id ||
+		       btf_dedup_identical_arrays(d, hypot_type_id, cand_id);
+	}
 
 	if (btf_dedup_hypot_map_add(d, canon_id, cand_id))
 		return -ENOMEM;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH bpf-next 10/11] selftests/bpf: add split BTF dedup selftests
  2020-10-29  0:58 [PATCH bpf-next 00/11] libbpf: split BTF support Andrii Nakryiko
                   ` (8 preceding siblings ...)
  2020-10-29  0:59 ` [PATCH bpf-next 09/11] libbpf: accommodate DWARF/compiler bug with duplicated identical arrays Andrii Nakryiko
@ 2020-10-29  0:59 ` Andrii Nakryiko
  2020-11-03  5:35   ` Song Liu
  2020-10-29  0:59 ` [PATCH bpf-next 11/11] tools/bpftool: add bpftool support for split BTF Andrii Nakryiko
  2020-10-30  0:33 ` [PATCH bpf-next 00/11] libbpf: split BTF support Song Liu
  11 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-10-29  0:59 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team

Add selftests validating BTF deduplication for the split BTF case. Add a
helper macro that allows validating the entire BTF with a raw BTF dump, not
just type-by-type. This saves tons of code and complexity.
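
The resulting checks end up looking like this (snippet taken verbatim from
the selftest below):

	VALIDATE_RAW_BTF(
		btf1,
		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
		"[2] PTR '(anon)' type_id=1",
		"[3] STRUCT 's1' size=4 vlen=1\n"
		"\t'f1' type_id=1 bits_offset=0");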

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/testing/selftests/bpf/btf_helpers.c     |  59 ++++
 tools/testing/selftests/bpf/btf_helpers.h     |   7 +
 .../bpf/prog_tests/btf_dedup_split.c          | 326 ++++++++++++++++++
 3 files changed, 392 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c

diff --git a/tools/testing/selftests/bpf/btf_helpers.c b/tools/testing/selftests/bpf/btf_helpers.c
index abc3f6c04cfc..48f90490f922 100644
--- a/tools/testing/selftests/bpf/btf_helpers.c
+++ b/tools/testing/selftests/bpf/btf_helpers.c
@@ -3,6 +3,8 @@
 #include <stdio.h>
 #include <errno.h>
 #include <bpf/btf.h>
+#include <bpf/libbpf.h>
+#include "test_progs.h"
 
 static const char * const btf_kind_str_mapping[] = {
 	[BTF_KIND_UNKN]		= "UNKNOWN",
@@ -198,3 +200,60 @@ const char *btf_type_raw_dump(const struct btf *btf, int type_id)
 
 	return buf;
 }
+
+int btf_validate_raw(struct btf *btf, int nr_types, const char *exp_types[])
+{
+	int i;
+	bool ok = true;
+
+	if (!ASSERT_EQ(btf__get_nr_types(btf), nr_types, "btf_nr_types"))
+		ok = false;
+
+	for (i = 1; i <= nr_types; i++) {
+		if (!ASSERT_STREQ(btf_type_raw_dump(btf, i), exp_types[i - 1], "raw_dump"))
+			ok = false;
+	}
+
+	return ok;
+}
+
+static void btf_dump_printf(void *ctx, const char *fmt, va_list args)
+{
+	vfprintf(ctx, fmt, args);
+}
+
+/* Print BTF-to-C dump into a local buffer and return string pointer back.
+ * Buffer *will* be overwritten by subsequent btf_type_raw_dump() calls
+ */
+const char *btf_type_c_dump(const struct btf *btf)
+{
+	static char buf[16 * 1024];
+	FILE *buf_file;
+	struct btf_dump *d = NULL;
+	struct btf_dump_opts opts = {};
+	int err, i;
+
+	buf_file = fmemopen(buf, sizeof(buf) - 1, "w");
+	if (!buf_file) {
+		fprintf(stderr, "Failed to open memstream: %d\n", errno);
+		return NULL;
+	}
+
+	opts.ctx = buf_file;
+	d = btf_dump__new(btf, NULL, &opts, btf_dump_printf);
+	if (libbpf_get_error(d)) {
+		fprintf(stderr, "Failed to create btf_dump instance: %ld\n", libbpf_get_error(d));
+		fclose(buf_file); /* don't leak the memory stream on error */
+		return NULL;
+	}
+
+	for (i = 1; i <= btf__get_nr_types(btf); i++) {
+		err = btf_dump__dump_type(d, i);
+		if (err) {
+			fprintf(stderr, "Failed to dump type [%d]: %d\n", i, err);
+			btf_dump__free(d);
+			fclose(buf_file);
+			return NULL;
+		}
+	}
+
+	btf_dump__free(d);
+	fflush(buf_file);
+	fclose(buf_file);
+	return buf;
+}
diff --git a/tools/testing/selftests/bpf/btf_helpers.h b/tools/testing/selftests/bpf/btf_helpers.h
index 2c9ce1b61dc9..295c0137d9bd 100644
--- a/tools/testing/selftests/bpf/btf_helpers.h
+++ b/tools/testing/selftests/bpf/btf_helpers.h
@@ -8,5 +8,12 @@
 
 int fprintf_btf_type_raw(FILE *out, const struct btf *btf, __u32 id);
 const char *btf_type_raw_dump(const struct btf *btf, int type_id);
+int btf_validate_raw(struct btf *btf, int nr_types, const char *exp_types[]);
 
+#define VALIDATE_RAW_BTF(btf, raw_types...)				\
+	btf_validate_raw(btf,						\
+			 sizeof((const char *[]){raw_types})/sizeof(void *),\
+			 (const char *[]){raw_types})
+
+const char *btf_type_c_dump(const struct btf *btf);
 #endif
diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
new file mode 100644
index 000000000000..097370a41b60
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
@@ -0,0 +1,326 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 Facebook */
+#include <test_progs.h>
+#include <bpf/btf.h>
+#include "btf_helpers.h"
+
+static void test_split_simple() {
+	const struct btf_type *t;
+	struct btf *btf1, *btf2 = NULL;
+	int str_off, err;
+
+	btf1 = btf__new_empty();
+	if (!ASSERT_OK_PTR(btf1, "empty_main_btf"))
+		return;
+
+	btf__set_pointer_size(btf1, 8); /* enforce 64-bit arch */
+
+	btf__add_int(btf1, "int", 4, BTF_INT_SIGNED);	/* [1] int */
+	btf__add_ptr(btf1, 1);				/* [2] ptr to int */
+	btf__add_struct(btf1, "s1", 4);			/* [3] struct s1 { */
+	btf__add_field(btf1, "f1", 1, 0, 0);		/*      int f1; */
+							/* } */
+
+	VALIDATE_RAW_BTF(
+		btf1,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] PTR '(anon)' type_id=1",
+		"[3] STRUCT 's1' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0");
+
+	ASSERT_STREQ(btf_type_c_dump(btf1), "\
+struct s1 {\n\
+	int f1;\n\
+};\n\n", "c_dump");
+
+	btf2 = btf__new_empty_split(btf1);
+	if (!ASSERT_OK_PTR(btf2, "empty_split_btf"))
+		goto cleanup;
+
+	/* pointer size should be "inherited" from main BTF */
+	ASSERT_EQ(btf__pointer_size(btf2), 8, "inherit_ptr_sz");
+
+	str_off = btf__find_str(btf2, "int");
+	ASSERT_NEQ(str_off, -ENOENT, "str_int_missing");
+
+	t = btf__type_by_id(btf2, 1);
+	if (!ASSERT_OK_PTR(t, "int_type"))
+		goto cleanup;
+	ASSERT_EQ(btf_is_int(t), true, "int_kind");
+	ASSERT_STREQ(btf__str_by_offset(btf2, t->name_off), "int", "int_name");
+
+	btf__add_struct(btf2, "s2", 16);		/* [4] struct s2 {	*/
+	btf__add_field(btf2, "f1", 6, 0, 0);		/*      struct s1 f1;	*/
+	btf__add_field(btf2, "f2", 5, 32, 0);		/*      int f2;		*/
+	btf__add_field(btf2, "f3", 2, 64, 0);		/*      int *f3;	*/
+							/* } */
+
+	/* duplicated int */
+	btf__add_int(btf2, "int", 4, BTF_INT_SIGNED);	/* [5] int */
+
+	/* duplicated struct s1 */
+	btf__add_struct(btf2, "s1", 4);			/* [6] struct s1 { */
+	btf__add_field(btf2, "f1", 5, 0, 0);		/*      int f1; */
+							/* } */
+
+	VALIDATE_RAW_BTF(
+		btf2,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] PTR '(anon)' type_id=1",
+		"[3] STRUCT 's1' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0",
+		"[4] STRUCT 's2' size=16 vlen=3\n"
+		"\t'f1' type_id=6 bits_offset=0\n"
+		"\t'f2' type_id=5 bits_offset=32\n"
+		"\t'f3' type_id=2 bits_offset=64",
+		"[5] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[6] STRUCT 's1' size=4 vlen=1\n"
+		"\t'f1' type_id=5 bits_offset=0");
+
+	ASSERT_STREQ(btf_type_c_dump(btf2), "\
+struct s1 {\n\
+	int f1;\n\
+};\n\
+\n\
+struct s1___2 {\n\
+	int f1;\n\
+};\n\
+\n\
+struct s2 {\n\
+	struct s1___2 f1;\n\
+	int f2;\n\
+	int *f3;\n\
+};\n\n", "c_dump");
+
+	err = btf__dedup(btf2, NULL, NULL);
+	if (!ASSERT_OK(err, "btf_dedup"))
+		goto cleanup;
+
+	VALIDATE_RAW_BTF(
+		btf2,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] PTR '(anon)' type_id=1",
+		"[3] STRUCT 's1' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0",
+		"[4] STRUCT 's2' size=16 vlen=3\n"
+		"\t'f1' type_id=3 bits_offset=0\n"
+		"\t'f2' type_id=1 bits_offset=32\n"
+		"\t'f3' type_id=2 bits_offset=64");
+
+	ASSERT_STREQ(btf_type_c_dump(btf2), "\
+struct s1 {\n\
+	int f1;\n\
+};\n\
+\n\
+struct s2 {\n\
+	struct s1 f1;\n\
+	int f2;\n\
+	int *f3;\n\
+};\n\n", "c_dump");
+
+cleanup:
+	btf__free(btf2);
+	btf__free(btf1);
+}
+
+static void test_split_fwd_resolve() {
+	struct btf *btf1, *btf2 = NULL;
+	int err;
+
+	btf1 = btf__new_empty();
+	if (!ASSERT_OK_PTR(btf1, "empty_main_btf"))
+		return;
+
+	btf__set_pointer_size(btf1, 8); /* enforce 64-bit arch */
+
+	btf__add_int(btf1, "int", 4, BTF_INT_SIGNED);	/* [1] int */
+	btf__add_ptr(btf1, 4);				/* [2] ptr to struct s1 */
+	btf__add_ptr(btf1, 5);				/* [3] ptr to struct s2 */
+	btf__add_struct(btf1, "s1", 16);		/* [4] struct s1 { */
+	btf__add_field(btf1, "f1", 2, 0, 0);		/*      struct s1 *f1; */
+	btf__add_field(btf1, "f2", 3, 64, 0);		/*      struct s2 *f2; */
+							/* } */
+	btf__add_struct(btf1, "s2", 4);			/* [5] struct s2 { */
+	btf__add_field(btf1, "f1", 1, 0, 0);		/*      int f1; */
+							/* } */
+
+	VALIDATE_RAW_BTF(
+		btf1,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] PTR '(anon)' type_id=4",
+		"[3] PTR '(anon)' type_id=5",
+		"[4] STRUCT 's1' size=16 vlen=2\n"
+		"\t'f1' type_id=2 bits_offset=0\n"
+		"\t'f2' type_id=3 bits_offset=64",
+		"[5] STRUCT 's2' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0");
+
+	btf2 = btf__new_empty_split(btf1);
+	if (!ASSERT_OK_PTR(btf2, "empty_split_btf"))
+		goto cleanup;
+
+	btf__add_int(btf2, "int", 4, BTF_INT_SIGNED);	/* [6] int */
+	btf__add_ptr(btf2, 10);				/* [7] ptr to struct s1 */
+	btf__add_fwd(btf2, "s2", BTF_FWD_STRUCT);	/* [8] fwd for struct s2 */
+	btf__add_ptr(btf2, 8);				/* [9] ptr to fwd struct s2 */
+	btf__add_struct(btf2, "s1", 16);		/* [10] struct s1 { */
+	btf__add_field(btf2, "f1", 7, 0, 0);		/*      struct s1 *f1; */
+	btf__add_field(btf2, "f2", 9, 64, 0);		/*      struct s2 *f2; */
+							/* } */
+
+	VALIDATE_RAW_BTF(
+		btf2,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] PTR '(anon)' type_id=4",
+		"[3] PTR '(anon)' type_id=5",
+		"[4] STRUCT 's1' size=16 vlen=2\n"
+		"\t'f1' type_id=2 bits_offset=0\n"
+		"\t'f2' type_id=3 bits_offset=64",
+		"[5] STRUCT 's2' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0",
+		"[6] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[7] PTR '(anon)' type_id=10",
+		"[8] FWD 's2' fwd_kind=struct",
+		"[9] PTR '(anon)' type_id=8",
+		"[10] STRUCT 's1' size=16 vlen=2\n"
+		"\t'f1' type_id=7 bits_offset=0\n"
+		"\t'f2' type_id=9 bits_offset=64");
+
+	err = btf__dedup(btf2, NULL, NULL);
+	if (!ASSERT_OK(err, "btf_dedup"))
+		goto cleanup;
+
+	VALIDATE_RAW_BTF(
+		btf2,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] PTR '(anon)' type_id=4",
+		"[3] PTR '(anon)' type_id=5",
+		"[4] STRUCT 's1' size=16 vlen=2\n"
+		"\t'f1' type_id=2 bits_offset=0\n"
+		"\t'f2' type_id=3 bits_offset=64",
+		"[5] STRUCT 's2' size=4 vlen=1\n"
+		"\t'f1' type_id=1 bits_offset=0");
+
+cleanup:
+	btf__free(btf2);
+	btf__free(btf1);
+}
+
+static void test_split_struct_duped() {
+	struct btf *btf1, *btf2 = NULL;
+	int err;
+
+	btf1 = btf__new_empty();
+	if (!ASSERT_OK_PTR(btf1, "empty_main_btf"))
+		return;
+
+	btf__set_pointer_size(btf1, 8); /* enforce 64-bit arch */
+
+	btf__add_int(btf1, "int", 4, BTF_INT_SIGNED);	/* [1] int */
+	btf__add_ptr(btf1, 5);				/* [2] ptr to struct s1 */
+	btf__add_fwd(btf1, "s2", BTF_FWD_STRUCT);	/* [3] fwd for struct s2 */
+	btf__add_ptr(btf1, 3);				/* [4] ptr to fwd struct s2 */
+	btf__add_struct(btf1, "s1", 16);		/* [5] struct s1 { */
+	btf__add_field(btf1, "f1", 2, 0, 0);		/*      struct s1 *f1; */
+	btf__add_field(btf1, "f2", 4, 64, 0);		/*      struct s2 *f2; */
+							/* } */
+
+	VALIDATE_RAW_BTF(
+		btf1,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] PTR '(anon)' type_id=5",
+		"[3] FWD 's2' fwd_kind=struct",
+		"[4] PTR '(anon)' type_id=3",
+		"[5] STRUCT 's1' size=16 vlen=2\n"
+		"\t'f1' type_id=2 bits_offset=0\n"
+		"\t'f2' type_id=4 bits_offset=64");
+
+	btf2 = btf__new_empty_split(btf1);
+	if (!ASSERT_OK_PTR(btf2, "empty_split_btf"))
+		goto cleanup;
+
+	btf__add_int(btf2, "int", 4, BTF_INT_SIGNED);	/* [6] int */
+	btf__add_ptr(btf2, 10);				/* [7] ptr to struct s1 */
+	btf__add_fwd(btf2, "s2", BTF_FWD_STRUCT);	/* [8] fwd for struct s2 */
+	btf__add_ptr(btf2, 11);				/* [9] ptr to struct s2 */
+	btf__add_struct(btf2, "s1", 16);		/* [10] struct s1 { */
+	btf__add_field(btf2, "f1", 7, 0, 0);		/*      struct s1 *f1; */
+	btf__add_field(btf2, "f2", 9, 64, 0);		/*      struct s2 *f2; */
+							/* } */
+	btf__add_struct(btf2, "s2", 40);		/* [11] struct s2 {	*/
+	btf__add_field(btf2, "f1", 7, 0, 0);		/*      struct s1 *f1;	*/
+	btf__add_field(btf2, "f2", 9, 64, 0);		/*      struct s2 *f2;	*/
+	btf__add_field(btf2, "f3", 6, 128, 0);		/*      int f3;		*/
+	btf__add_field(btf2, "f4", 10, 192, 0);		/*      struct s1 f4;	*/
+							/* } */
+	btf__add_ptr(btf2, 8);				/* [12] ptr to fwd struct s2 */
+	btf__add_struct(btf2, "s3", 8);			/* [13] struct s3 { */
+	btf__add_field(btf2, "f1", 12, 0, 0);		/*      struct s2 *f1; (fwd) */
+							/* } */
+
+	VALIDATE_RAW_BTF(
+		btf2,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] PTR '(anon)' type_id=5",
+		"[3] FWD 's2' fwd_kind=struct",
+		"[4] PTR '(anon)' type_id=3",
+		"[5] STRUCT 's1' size=16 vlen=2\n"
+		"\t'f1' type_id=2 bits_offset=0\n"
+		"\t'f2' type_id=4 bits_offset=64",
+		"[6] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[7] PTR '(anon)' type_id=10",
+		"[8] FWD 's2' fwd_kind=struct",
+		"[9] PTR '(anon)' type_id=11",
+		"[10] STRUCT 's1' size=16 vlen=2\n"
+		"\t'f1' type_id=7 bits_offset=0\n"
+		"\t'f2' type_id=9 bits_offset=64",
+		"[11] STRUCT 's2' size=40 vlen=4\n"
+		"\t'f1' type_id=7 bits_offset=0\n"
+		"\t'f2' type_id=9 bits_offset=64\n"
+		"\t'f3' type_id=6 bits_offset=128\n"
+		"\t'f4' type_id=10 bits_offset=192",
+		"[12] PTR '(anon)' type_id=8",
+		"[13] STRUCT 's3' size=8 vlen=1\n"
+		"\t'f1' type_id=12 bits_offset=0");
+
+	err = btf__dedup(btf2, NULL, NULL);
+	if (!ASSERT_OK(err, "btf_dedup"))
+		goto cleanup;
+
+	VALIDATE_RAW_BTF(
+		btf2,
+		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
+		"[2] PTR '(anon)' type_id=5",
+		"[3] FWD 's2' fwd_kind=struct",
+		"[4] PTR '(anon)' type_id=3",
+		"[5] STRUCT 's1' size=16 vlen=2\n"
+		"\t'f1' type_id=2 bits_offset=0\n"
+		"\t'f2' type_id=4 bits_offset=64",
+		"[6] PTR '(anon)' type_id=8",
+		"[7] PTR '(anon)' type_id=9",
+		"[8] STRUCT 's1' size=16 vlen=2\n"
+		"\t'f1' type_id=6 bits_offset=0\n"
+		"\t'f2' type_id=7 bits_offset=64",
+		"[9] STRUCT 's2' size=40 vlen=4\n"
+		"\t'f1' type_id=6 bits_offset=0\n"
+		"\t'f2' type_id=7 bits_offset=64\n"
+		"\t'f3' type_id=1 bits_offset=128\n"
+		"\t'f4' type_id=8 bits_offset=192",
+		"[10] STRUCT 's3' size=8 vlen=1\n"
+		"\t'f1' type_id=7 bits_offset=0");
+
+cleanup:
+	btf__free(btf2);
+	btf__free(btf1);
+}
+
+void test_btf_dedup_split()
+{
+	if (test__start_subtest("split_simple"))
+		test_split_simple();
+	if (test__start_subtest("split_struct_duped"))
+		test_split_struct_duped();
+	if (test__start_subtest("split_fwd_resolve"))
+		test_split_fwd_resolve();
+}
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH bpf-next 11/11] tools/bpftool: add bpftool support for split BTF
  2020-10-29  0:58 [PATCH bpf-next 00/11] libbpf: split BTF support Andrii Nakryiko
                   ` (9 preceding siblings ...)
  2020-10-29  0:59 ` [PATCH bpf-next 10/11] selftests/bpf: add split BTF dedup selftests Andrii Nakryiko
@ 2020-10-29  0:59 ` Andrii Nakryiko
  2020-11-03  6:03   ` Song Liu
  2020-10-30  0:33 ` [PATCH bpf-next 00/11] libbpf: split BTF support Song Liu
  11 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-10-29  0:59 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team

Add the ability to work with split BTF by providing an extra -B flag, which
allows specifying the path to the base BTF file.
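
Example usage (file paths are illustrative):

  $ bpftool -B vmlinux.btf btf dump file module_split.btf
  $ bpftool --base-btf /sys/kernel/btf/vmlinux btf dump file module_split.btf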

Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/bpf/bpftool/btf.c  |  9 ++++++---
 tools/bpf/bpftool/main.c | 15 ++++++++++++++-
 tools/bpf/bpftool/main.h |  1 +
 3 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
index 8ab142ff5eac..c96b56e8e3a4 100644
--- a/tools/bpf/bpftool/btf.c
+++ b/tools/bpf/bpftool/btf.c
@@ -358,8 +358,12 @@ static int dump_btf_raw(const struct btf *btf,
 		}
 	} else {
 		int cnt = btf__get_nr_types(btf);
+		int start_id = 1;
 
-		for (i = 1; i <= cnt; i++) {
+		if (base_btf)
+			start_id = btf__get_nr_types(base_btf) + 1;
+
+		for (i = start_id; i <= cnt; i++) {
 			t = btf__type_by_id(btf, i);
 			dump_btf_type(btf, i, t);
 		}
@@ -438,7 +442,6 @@ static int do_dump(int argc, char **argv)
 		return -1;
 	}
 	src = GET_ARG();
-
 	if (is_prefix(src, "map")) {
 		struct bpf_map_info info = {};
 		__u32 len = sizeof(info);
@@ -499,7 +502,7 @@ static int do_dump(int argc, char **argv)
 		}
 		NEXT_ARG();
 	} else if (is_prefix(src, "file")) {
-		btf = btf__parse(*argv, NULL);
+		btf = btf__parse_split(*argv, base_btf);
 		if (IS_ERR(btf)) {
 			err = -PTR_ERR(btf);
 			btf = NULL;
diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
index 682daaa49e6a..b86f450e6fce 100644
--- a/tools/bpf/bpftool/main.c
+++ b/tools/bpf/bpftool/main.c
@@ -11,6 +11,7 @@
 
 #include <bpf/bpf.h>
 #include <bpf/libbpf.h>
+#include <bpf/btf.h>
 
 #include "main.h"
 
@@ -28,6 +29,7 @@ bool show_pinned;
 bool block_mount;
 bool verifier_logs;
 bool relaxed_maps;
+struct btf *base_btf;
 struct pinned_obj_table prog_table;
 struct pinned_obj_table map_table;
 struct pinned_obj_table link_table;
@@ -391,6 +393,7 @@ int main(int argc, char **argv)
 		{ "mapcompat",	no_argument,	NULL,	'm' },
 		{ "nomount",	no_argument,	NULL,	'n' },
 		{ "debug",	no_argument,	NULL,	'd' },
+		{ "base-btf",	required_argument, NULL, 'B' },
 		{ 0 }
 	};
 	int opt, ret;
@@ -407,7 +410,7 @@ int main(int argc, char **argv)
 	hash_init(link_table.table);
 
 	opterr = 0;
-	while ((opt = getopt_long(argc, argv, "Vhpjfmnd",
+	while ((opt = getopt_long(argc, argv, "VhpjfmndB:",
 				  options, NULL)) >= 0) {
 		switch (opt) {
 		case 'V':
@@ -441,6 +444,15 @@ int main(int argc, char **argv)
 			libbpf_set_print(print_all_levels);
 			verifier_logs = true;
 			break;
+		case 'B':
+			base_btf = btf__parse(optarg, NULL);
+			if (libbpf_get_error(base_btf)) {
+				p_err("failed to parse base BTF at '%s': %ld\n",
+				      optarg, libbpf_get_error(base_btf));
+				base_btf = NULL;
+				return -1;
+			}
+			break;
 		default:
 			p_err("unrecognized option '%s'", argv[optind - 1]);
 			if (json_output)
@@ -465,6 +477,7 @@ int main(int argc, char **argv)
 		delete_pinned_obj_table(&map_table);
 		delete_pinned_obj_table(&link_table);
 	}
+	btf__free(base_btf);
 
 	return ret;
 }
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index c46e52137b87..76e91641262b 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -90,6 +90,7 @@ extern bool show_pids;
 extern bool block_mount;
 extern bool verifier_logs;
 extern bool relaxed_maps;
+extern struct btf *base_btf;
 extern struct pinned_obj_table prog_table;
 extern struct pinned_obj_table map_table;
 extern struct pinned_obj_table link_table;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 00/11] libbpf: split BTF support
  2020-10-29  0:58 [PATCH bpf-next 00/11] libbpf: split BTF support Andrii Nakryiko
                   ` (10 preceding siblings ...)
  2020-10-29  0:59 ` [PATCH bpf-next 11/11] tools/bpftool: add bpftool support for split BTF Andrii Nakryiko
@ 2020-10-30  0:33 ` Song Liu
  2020-10-30  2:33   ` Andrii Nakryiko
  11 siblings, 1 reply; 49+ messages in thread
From: Song Liu @ 2020-10-30  0:33 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann, Kernel Team



> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> 
> This patch set adds support for generating and deduplicating split BTF. This
> is an enhancement to the BTF, which allows to designate one BTF as the "base
> BTF" (e.g., vmlinux BTF), and one or more other BTFs as "split BTF" (e.g.,
> kernel module BTF), which are building upon and extending base BTF with extra
> types and strings.
> 
> Once loaded, split BTF appears as a single unified BTF superset of base BTF,
> with continuous and transparent numbering scheme. This allows all the existing
> users of BTF to work correctly and stay agnostic to the base/split BTFs
> composition.  The only difference is in how to instantiate split BTF: it
> requires base BTF to be alread instantiated and passed to btf__new_xxx_split()
> or btf__parse_xxx_split() "constructors" explicitly.
> 
> This split approach is necessary if we are to have a reasonably-sized kernel
> module BTFs. By deduping each kernel module's BTF individually, resulting
> module BTFs contain copies of a lot of kernel types that are already present
> in vmlinux BTF. Even those single copies result in a big BTF size bloat. On my
> kernel configuration with 700 modules built, non-split BTF approach results in
> 115MBs of BTFs across all modules. With split BTF deduplication approach,
> total size is down to 5.2MBs total, which is on part with vmlinux BTF (at
> around 4MBs). This seems reasonable and practical. As to why we'd need kernel
> module BTFs, that should be pretty obvious to anyone using BPF at this point,
> as it allows all the BTF-powered features to be used with kernel modules:
> tp_btf, fentry/fexit/fmod_ret, lsm, bpf_iter, etc.

Some high level questions. Do we plan to use split BTF for in-tree modules
(those built together with the kernel) or out-of-tree modules (those built 
separately)? If it is for in-tree modules, is it possible to build split BTF
into vmlinux BTF? 

Thanks,
Song

[...]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 01/11] libbpf: factor out common operations in BTF writing APIs
  2020-10-29  0:58 ` [PATCH bpf-next 01/11] libbpf: factor out common operations in BTF writing APIs Andrii Nakryiko
@ 2020-10-30  0:36   ` Song Liu
  0 siblings, 0 replies; 49+ messages in thread
From: Song Liu @ 2020-10-30  0:36 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, netdev, Alexei Starovoitov, daniel, Kernel Team



> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> 
> Factor out commiting of appended type data. Also extract fetching the very
> last type in the BTF (to append members to). These two operations are common
> across many APIs and will be easier to refactor with split BTF, if they are
> extracted into a single place.
> 
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

[...]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 00/11] libbpf: split BTF support
  2020-10-30  0:33 ` [PATCH bpf-next 00/11] libbpf: split BTF support Song Liu
@ 2020-10-30  2:33   ` Andrii Nakryiko
  2020-10-30  6:45     ` Song Liu
  2020-10-30 12:04     ` Alan Maguire
  0 siblings, 2 replies; 49+ messages in thread
From: Andrii Nakryiko @ 2020-10-30  2:33 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Thu, Oct 29, 2020 at 5:33 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > This patch set adds support for generating and deduplicating split BTF. This
> > is an enhancement to the BTF, which allows to designate one BTF as the "base
> > BTF" (e.g., vmlinux BTF), and one or more other BTFs as "split BTF" (e.g.,
> > kernel module BTF), which are building upon and extending base BTF with extra
> > types and strings.
> >
> > Once loaded, split BTF appears as a single unified BTF superset of base BTF,
> > with continuous and transparent numbering scheme. This allows all the existing
> > users of BTF to work correctly and stay agnostic to the base/split BTFs
> > composition.  The only difference is in how to instantiate split BTF: it
> > requires base BTF to be alread instantiated and passed to btf__new_xxx_split()
> > or btf__parse_xxx_split() "constructors" explicitly.
> >
> > This split approach is necessary if we are to have a reasonably-sized kernel
> > module BTFs. By deduping each kernel module's BTF individually, resulting
> > module BTFs contain copies of a lot of kernel types that are already present
> > in vmlinux BTF. Even those single copies result in a big BTF size bloat. On my
> > kernel configuration with 700 modules built, non-split BTF approach results in
> > 115MBs of BTFs across all modules. With split BTF deduplication approach,
> > total size is down to 5.2MBs total, which is on part with vmlinux BTF (at
> > around 4MBs). This seems reasonable and practical. As to why we'd need kernel
> > module BTFs, that should be pretty obvious to anyone using BPF at this point,
> > as it allows all the BTF-powered features to be used with kernel modules:
> > tp_btf, fentry/fexit/fmod_ret, lsm, bpf_iter, etc.
>
> Some high level questions. Do we plan to use split BTF for in-tree modules
> (those built together with the kernel) or out-of-tree modules (those built
> separately)? If it is for in-tree modules, is it possible to build split BTF
> into vmlinux BTF?

It will be possible to use it for both in-tree and out-of-tree modules.
For in-tree, this will be integrated into the kernel build process. For
out-of-tree, whoever builds their kernel module will need to invoke
pahole -J with an extra flag pointing to the right vmlinux image (I
haven't looked into the exact details of this integration; maybe there
are already scripts in the Linux repo that out-of-tree modules have to
use, in which case we can add this integration there).
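
For out-of-tree builds, I'd expect the invocation to end up looking
roughly like this (the exact flag name is TBD at this point, so treat
it as purely illustrative):

  # encode split BTF for a module, using vmlinux BTF as the base
  pahole -J --btf_base /path/to/vmlinux my_module.ko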

Merging all in-tree modules' BTFs into vmlinux's BTF defeats the
purpose of the split BTF and will just increase the size of vmlinux
BTF unnecessarily.

>
> Thanks,
> Song
>
> [...]


* Re: [PATCH bpf-next 00/11] libbpf: split BTF support
  2020-10-30  2:33   ` Andrii Nakryiko
@ 2020-10-30  6:45     ` Song Liu
  2020-10-30 12:04     ` Alan Maguire
  1 sibling, 0 replies; 49+ messages in thread
From: Song Liu @ 2020-10-30  6:45 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team



> On Oct 29, 2020, at 7:33 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
> On Thu, Oct 29, 2020 at 5:33 PM Song Liu <songliubraving@fb.com> wrote:
>> 
>> 
>> 
>>> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
>>> 
>>> This patch set adds support for generating and deduplicating split BTF. This
>>> is an enhancement to the BTF, which allows to designate one BTF as the "base
>>> BTF" (e.g., vmlinux BTF), and one or more other BTFs as "split BTF" (e.g.,
>>> kernel module BTF), which are building upon and extending base BTF with extra
>>> types and strings.
>>> 
>>> Once loaded, split BTF appears as a single unified BTF superset of base BTF,
>>> with continuous and transparent numbering scheme. This allows all the existing
>>> users of BTF to work correctly and stay agnostic to the base/split BTFs
>>> composition.  The only difference is in how to instantiate split BTF: it
>>> requires base BTF to be alread instantiated and passed to btf__new_xxx_split()
>>> or btf__parse_xxx_split() "constructors" explicitly.
>>> 
>>> This split approach is necessary if we are to have a reasonably-sized kernel
>>> module BTFs. By deduping each kernel module's BTF individually, resulting
>>> module BTFs contain copies of a lot of kernel types that are already present
>>> in vmlinux BTF. Even those single copies result in a big BTF size bloat. On my
>>> kernel configuration with 700 modules built, non-split BTF approach results in
>>> 115MBs of BTFs across all modules. With split BTF deduplication approach,
>>> total size is down to 5.2MBs total, which is on part with vmlinux BTF (at
>>> around 4MBs). This seems reasonable and practical. As to why we'd need kernel
>>> module BTFs, that should be pretty obvious to anyone using BPF at this point,
>>> as it allows all the BTF-powered features to be used with kernel modules:
>>> tp_btf, fentry/fexit/fmod_ret, lsm, bpf_iter, etc.
>> 
>> Some high level questions. Do we plan to use split BTF for in-tree modules
>> (those built together with the kernel) or out-of-tree modules (those built
>> separately)? If it is for in-tree modules, is it possible to build split BTF
>> into vmlinux BTF?
> 
> It will be possible to use it for both in-tree and out-of-tree modules.
> For in-tree, this will be integrated into the kernel build process. For
> out-of-tree, whoever builds their kernel module will need to invoke
> pahole -J with an extra flag pointing to the right vmlinux image (I
> haven't looked into the exact details of this integration; maybe there
> are already scripts in the Linux repo that out-of-tree modules have to
> use, in which case we can add this integration there).

Thanks for the explanation. 

> 
> Merging all in-tree modules' BTFs into vmlinux's BTF defeats the
> purpose of the split BTF and will just increase the size of vmlinux
> BTF unnecessarily.

Is the purpose of split BTF to save memory used by module BTF? In the 
example above, I guess part of those 5.2MB will be loaded at run time, 
so the actual saving is less than 5.2MB. 5.2MB is really small for a 
decent system, e.g. ~0.03% of my laptop's main memory. 

Did I miss anything here? 

Song


* Re: [PATCH bpf-next 00/11] libbpf: split BTF support
  2020-10-30  2:33   ` Andrii Nakryiko
  2020-10-30  6:45     ` Song Liu
@ 2020-10-30 12:04     ` Alan Maguire
  2020-10-30 18:30       ` Andrii Nakryiko
  1 sibling, 1 reply; 49+ messages in thread
From: Alan Maguire @ 2020-10-30 12:04 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Song Liu, Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Thu, 29 Oct 2020, Andrii Nakryiko wrote:

> On Thu, Oct 29, 2020 at 5:33 PM Song Liu <songliubraving@fb.com> wrote:
> >
> >
> >
> > > On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> > >
> > > This patch set adds support for generating and deduplicating split BTF. This
> > > is an enhancement to the BTF, which allows to designate one BTF as the "base
> > > BTF" (e.g., vmlinux BTF), and one or more other BTFs as "split BTF" (e.g.,
> > > kernel module BTF), which are building upon and extending base BTF with extra
> > > types and strings.
> > >
> > > Once loaded, split BTF appears as a single unified BTF superset of base BTF,
> > > with continuous and transparent numbering scheme. This allows all the existing
> > > users of BTF to work correctly and stay agnostic to the base/split BTFs
> > > composition.  The only difference is in how to instantiate split BTF: it
> > > requires base BTF to be alread instantiated and passed to btf__new_xxx_split()
> > > or btf__parse_xxx_split() "constructors" explicitly.
> > >
> > > This split approach is necessary if we are to have a reasonably-sized kernel
> > > module BTFs. By deduping each kernel module's BTF individually, resulting
> > > module BTFs contain copies of a lot of kernel types that are already present
> > > in vmlinux BTF. Even those single copies result in a big BTF size bloat. On my
> > > kernel configuration with 700 modules built, non-split BTF approach results in
> > > 115MBs of BTFs across all modules. With split BTF deduplication approach,
> > > total size is down to 5.2MBs total, which is on part with vmlinux BTF (at
> > > around 4MBs). This seems reasonable and practical. As to why we'd need kernel
> > > module BTFs, that should be pretty obvious to anyone using BPF at this point,
> > > as it allows all the BTF-powered features to be used with kernel modules:
> > > tp_btf, fentry/fexit/fmod_ret, lsm, bpf_iter, etc.
> >
> > Some high level questions. Do we plan to use split BTF for in-tree modules
> > (those built together with the kernel) or out-of-tree modules (those built
> > separately)? If it is for in-tree modules, is it possible to build split BTF
> > into vmlinux BTF?
> 
> It will be possible to use it for both in-tree and out-of-tree modules.
> For in-tree, this will be integrated into the kernel build process. For
> out-of-tree, whoever builds their kernel module will need to invoke
> pahole -J with an extra flag pointing to the right vmlinux image (I
> haven't looked into the exact details of this integration; maybe there
> are already scripts in the Linux repo that out-of-tree modules have to
> use, in which case we can add this integration there).
> 
> Merging all in-tree modules' BTFs into vmlinux's BTF defeats the
> purpose of the split BTF and will just increase the size of vmlinux
> BTF unnecessarily.
>

Again more of a question about how module BTF will be exposed, but
I'm wondering if there will be a way for a consumer to ask for
type info across kernel and module BTF, i.e. something like
libbpf_find_kernel_btf_id() ? Similarly will __builtin_btf_type_id()
work across both vmlinux and modules? I'm thinking of the case where we 
potentially don't know which module a type is defined in.

I realize in some cases type names may refer to different types in 
different modules (not sure how frequent this is in practice?) but
I'm curious how the split model for modules will interact with existing 
APIs and helpers.

In some cases it's likely that modules may share types with
each other that they do not share with vmlinux; in such cases 
will those types get deduplicated also, or is deduplication just
between kernel/module, and not module/module? 

Sorry I know these questions aren't about this patchset in
particular, but I'm just trying to get a sense of the bigger
picture. Thanks!

Alan


* Re: [PATCH bpf-next 02/11] selftest/bpf: relax btf_dedup test checks
  2020-10-29  0:58 ` [PATCH bpf-next 02/11] selftest/bpf: relax btf_dedup test checks Andrii Nakryiko
@ 2020-10-30 16:43   ` Song Liu
  2020-10-30 18:44     ` Andrii Nakryiko
  0 siblings, 1 reply; 49+ messages in thread
From: Song Liu @ 2020-10-30 16:43 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann, Kernel Team

On Thu, Oct 29, 2020 at 1:40 AM Andrii Nakryiko <andrii@kernel.org> wrote:
>
> Remove the requirement of strictly exact string section contents. This used
> to hold when string deduplication was done through sorting, but with string
> dedup done through a hash table, it's no longer true. So relax the test
> harness's string checks and, consequently, its type checks, which now don't
> have to have exactly the same string offsets.
>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
>  tools/testing/selftests/bpf/prog_tests/btf.c | 34 +++++++++++---------
>  1 file changed, 19 insertions(+), 15 deletions(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c
> index 93162484c2ca..2ccc23b2a36f 100644
> --- a/tools/testing/selftests/bpf/prog_tests/btf.c
> +++ b/tools/testing/selftests/bpf/prog_tests/btf.c
> @@ -6652,7 +6652,7 @@ static void do_test_dedup(unsigned int test_num)
>         const void *test_btf_data, *expect_btf_data;
>         const char *ret_test_next_str, *ret_expect_next_str;
>         const char *test_strs, *expect_strs;
> -       const char *test_str_cur, *test_str_end;
> +       const char *test_str_cur;
>         const char *expect_str_cur, *expect_str_end;
>         unsigned int raw_btf_size;
>         void *raw_btf;
> @@ -6719,12 +6719,18 @@ static void do_test_dedup(unsigned int test_num)
>                 goto done;
>         }
>
> -       test_str_cur = test_strs;
> -       test_str_end = test_strs + test_hdr->str_len;
>         expect_str_cur = expect_strs;
>         expect_str_end = expect_strs + expect_hdr->str_len;
> -       while (test_str_cur < test_str_end && expect_str_cur < expect_str_end) {
> +       while (expect_str_cur < expect_str_end) {
>                 size_t test_len, expect_len;
> +               int off;
> +
> +               off = btf__find_str(test_btf, expect_str_cur);
> +               if (CHECK(off < 0, "exp str '%s' not found: %d\n", expect_str_cur, off)) {
> +                       err = -1;
> +                       goto done;
> +               }
> +               test_str_cur = btf__str_by_offset(test_btf, off);
>
>                 test_len = strlen(test_str_cur);
>                 expect_len = strlen(expect_str_cur);
> @@ -6741,15 +6747,8 @@ static void do_test_dedup(unsigned int test_num)
>                         err = -1;
>                         goto done;
>                 }
> -               test_str_cur += test_len + 1;
>                 expect_str_cur += expect_len + 1;
>         }
> -       if (CHECK(test_str_cur != test_str_end,
> -                 "test_str_cur:%p != test_str_end:%p",
> -                 test_str_cur, test_str_end)) {
> -               err = -1;
> -               goto done;
> -       }
>
>         test_nr_types = btf__get_nr_types(test_btf);
>         expect_nr_types = btf__get_nr_types(expect_btf);
> @@ -6775,10 +6774,15 @@ static void do_test_dedup(unsigned int test_num)
>                         err = -1;
>                         goto done;
>                 }
> -               if (CHECK(memcmp((void *)test_type,
> -                                (void *)expect_type,
> -                                test_size),
> -                         "type #%d: contents differ", i)) {

I guess test_size and expect_size are not needed anymore?

> +               if (CHECK(btf_kind(test_type) != btf_kind(expect_type),
> +                         "type %d kind: exp %d != got %u\n",
> +                         i, btf_kind(expect_type), btf_kind(test_type))) {
> +                       err = -1;
> +                       goto done;
> +               }
> +               if (CHECK(test_type->info != expect_type->info,
> +                         "type %d info: exp %d != got %u\n",
> +                         i, expect_type->info, test_type->info)) {

btf_kind() returns part of ->info, so we only need the second check, no?

IIUC, test_type and expect_type may have different name_off now. Shall
we check ->size matches?


>                         err = -1;
>                         goto done;
>                 }
> --
> 2.24.1
>


* Re: [PATCH bpf-next 00/11] libbpf: split BTF support
  2020-10-30 12:04     ` Alan Maguire
@ 2020-10-30 18:30       ` Andrii Nakryiko
  0 siblings, 0 replies; 49+ messages in thread
From: Andrii Nakryiko @ 2020-10-30 18:30 UTC (permalink / raw)
  To: Alan Maguire
  Cc: Song Liu, Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Fri, Oct 30, 2020 at 5:06 AM Alan Maguire <alan.maguire@oracle.com> wrote:
>
> On Thu, 29 Oct 2020, Andrii Nakryiko wrote:
>
> > On Thu, Oct 29, 2020 at 5:33 PM Song Liu <songliubraving@fb.com> wrote:
> > >
> > >
> > >
> > > > On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> > > >
> > > > This patch set adds support for generating and deduplicating split BTF. This
> > > > is an enhancement to the BTF, which allows to designate one BTF as the "base
> > > > BTF" (e.g., vmlinux BTF), and one or more other BTFs as "split BTF" (e.g.,
> > > > kernel module BTF), which are building upon and extending base BTF with extra
> > > > types and strings.
> > > >
> > > > Once loaded, split BTF appears as a single unified BTF superset of base BTF,
> > > > with continuous and transparent numbering scheme. This allows all the existing
> > > > users of BTF to work correctly and stay agnostic to the base/split BTFs
> > > > composition.  The only difference is in how to instantiate split BTF: it
> > > > requires base BTF to be alread instantiated and passed to btf__new_xxx_split()
> > > > or btf__parse_xxx_split() "constructors" explicitly.
> > > >
> > > > This split approach is necessary if we are to have a reasonably-sized kernel
> > > > module BTFs. By deduping each kernel module's BTF individually, resulting
> > > > module BTFs contain copies of a lot of kernel types that are already present
> > > > in vmlinux BTF. Even those single copies result in a big BTF size bloat. On my
> > > > kernel configuration with 700 modules built, non-split BTF approach results in
> > > > 115MBs of BTFs across all modules. With split BTF deduplication approach,
> > > > total size is down to 5.2MBs total, which is on part with vmlinux BTF (at
> > > > around 4MBs). This seems reasonable and practical. As to why we'd need kernel
> > > > module BTFs, that should be pretty obvious to anyone using BPF at this point,
> > > > as it allows all the BTF-powered features to be used with kernel modules:
> > > > tp_btf, fentry/fexit/fmod_ret, lsm, bpf_iter, etc.
> > >
> > > Some high level questions. Do we plan to use split BTF for in-tree modules
> > > (those built together with the kernel) or out-of-tree modules (those built
> > > separately)? If it is for in-tree modules, is it possible to build split BTF
> > > into vmlinux BTF?
> >
> > It will be possible to use it for both in-tree and out-of-tree modules.
> > For in-tree, this will be integrated into the kernel build process. For
> > out-of-tree, whoever builds their kernel module will need to invoke
> > pahole -J with an extra flag pointing to the right vmlinux image (I
> > haven't looked into the exact details of this integration; maybe there
> > are already scripts in the Linux repo that out-of-tree modules have to
> > use, in which case we can add this integration there).
> >
> > Merging all in-tree modules' BTFs into vmlinux's BTF defeats the
> > purpose of the split BTF and will just increase the size of vmlinux
> > BTF unnecessarily.
> >
>
> Again more of a question about how module BTF will be exposed, but
> I'm wondering if there will be a way for a consumer to ask for
> type info across kernel and module BTF, i.e. something like
> libbpf_find_kernel_btf_id() ?

I'm still playing with the options, but I think libbpf will do all the
searching across vmlinux and modules. I'm considering allowing users to
specify a module name as an optional hint, just in case there are
conflicting types/functions with the same name in two different
modules.
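
Roughly along these lines, API-wise (a completely hypothetical
signature at this point, just to make the idea concrete):

  /* find a type ID across vmlinux and module BTFs; module_name is an
   * optional hint, NULL means "search everywhere"
   */
  int libbpf_find_kernel_btf_id(const char *type_name,
                                const char *module_name);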

> Similarly will __builtin_btf_type_id()
> work across both vmlinux and modules? I'm thinking of the case where we
> potentially don't know which module a type is defined in.

I think we'll need another built-in/relocation to specify
module/vmlinux ID. Type ID itself is not unique enough to identify the
module.

Alternatively, we can extend its return type to u64 and put the BTF
object ID in the upper 4 bytes and the BTF type ID in the lower 4
bytes. Need to think about this and discuss it with Yonghong.
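
The packing itself would be trivial either way; a sketch of the u64
variant:

  /* BTF object ID in the upper 4 bytes, BTF type ID in the lower 4 bytes */
  __u64 full_id = ((__u64)btf_obj_id << 32) | btf_type_id;
  __u32 obj_id  = (__u32)(full_id >> 32);
  __u32 type_id = (__u32)full_id;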

>
> I realize in some cases type names may refer to different types in
> different modules (not sure how frequent this is in practice?) but
> I'm curious how the split model for modules will interact with existing
> APIs and helpers.
>
> In some cases it's likely that modules may share types with
> each other that they do not share with vmlinux; in such cases
> will those types get deduplicated also, or is deduplication just
> between kernel/module, and not module/module?

Yes, they will be duplicated in two modules. It's a star schema,
where vmlinux BTF is the base for all kernel modules. It's technically
possible to have a longer chain of BTFs, but we'd need to deal with
dependencies between modules, making sure that dependent BTF is loaded
and available first, etc. That can be added later without breaking
anything, if there is a need.
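
In libbpf terms, each module BTF would then be instantiated directly on
top of vmlinux BTF, roughly like this (file names made up, error
handling omitted):

  struct btf *vmlinux_btf, *mod_btf;

  vmlinux_btf = btf__parse_raw("vmlinux.btf");
  mod_btf = btf__parse_raw_split("module.btf", vmlinux_btf);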

>
> Sorry I know these questions aren't about this patchset in
> particular, but I'm just trying to get a sense of the bigger
> picture. Thanks!

These are fair questions, I just didn't want to go into too many
details in this particular patch set, because it's pretty agnostic to
all of those concerns. The next patch set will deal with all the
details of the kernel/user-space interface.

>
> Alan


* Re: [PATCH bpf-next 02/11] selftest/bpf: relax btf_dedup test checks
  2020-10-30 16:43   ` Song Liu
@ 2020-10-30 18:44     ` Andrii Nakryiko
  2020-10-30 22:30       ` Song Liu
  0 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-10-30 18:44 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Fri, Oct 30, 2020 at 9:43 AM Song Liu <song@kernel.org> wrote:
>
> On Thu, Oct 29, 2020 at 1:40 AM Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > Remove the requirement of strictly exact string section contents. This used
> > to hold when string deduplication was done through sorting, but with string
> > dedup done through a hash table, it's no longer true. So relax the test
> > harness's string checks and, consequently, its type checks, which now don't
> > have to have exactly the same string offsets.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> >  tools/testing/selftests/bpf/prog_tests/btf.c | 34 +++++++++++---------
> >  1 file changed, 19 insertions(+), 15 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/btf.c b/tools/testing/selftests/bpf/prog_tests/btf.c
> > index 93162484c2ca..2ccc23b2a36f 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/btf.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/btf.c
> > @@ -6652,7 +6652,7 @@ static void do_test_dedup(unsigned int test_num)
> >         const void *test_btf_data, *expect_btf_data;
> >         const char *ret_test_next_str, *ret_expect_next_str;
> >         const char *test_strs, *expect_strs;
> > -       const char *test_str_cur, *test_str_end;
> > +       const char *test_str_cur;
> >         const char *expect_str_cur, *expect_str_end;
> >         unsigned int raw_btf_size;
> >         void *raw_btf;
> > @@ -6719,12 +6719,18 @@ static void do_test_dedup(unsigned int test_num)
> >                 goto done;
> >         }
> >
> > -       test_str_cur = test_strs;
> > -       test_str_end = test_strs + test_hdr->str_len;
> >         expect_str_cur = expect_strs;
> >         expect_str_end = expect_strs + expect_hdr->str_len;
> > -       while (test_str_cur < test_str_end && expect_str_cur < expect_str_end) {
> > +       while (expect_str_cur < expect_str_end) {
> >                 size_t test_len, expect_len;
> > +               int off;
> > +
> > +               off = btf__find_str(test_btf, expect_str_cur);
> > +               if (CHECK(off < 0, "exp str '%s' not found: %d\n", expect_str_cur, off)) {
> > +                       err = -1;
> > +                       goto done;
> > +               }
> > +               test_str_cur = btf__str_by_offset(test_btf, off);
> >
> >                 test_len = strlen(test_str_cur);
> >                 expect_len = strlen(expect_str_cur);
> > @@ -6741,15 +6747,8 @@ static void do_test_dedup(unsigned int test_num)
> >                         err = -1;
> >                         goto done;
> >                 }
> > -               test_str_cur += test_len + 1;
> >                 expect_str_cur += expect_len + 1;
> >         }
> > -       if (CHECK(test_str_cur != test_str_end,
> > -                 "test_str_cur:%p != test_str_end:%p",
> > -                 test_str_cur, test_str_end)) {
> > -               err = -1;
> > -               goto done;
> > -       }
> >
> >         test_nr_types = btf__get_nr_types(test_btf);
> >         expect_nr_types = btf__get_nr_types(expect_btf);
> > @@ -6775,10 +6774,15 @@ static void do_test_dedup(unsigned int test_num)
> >                         err = -1;
> >                         goto done;
> >                 }
> > -               if (CHECK(memcmp((void *)test_type,
> > -                                (void *)expect_type,
> > -                                test_size),
> > -                         "type #%d: contents differ", i)) {
>
> I guess test_size and expect_size are not needed anymore?

hm.. they are used just one check above, still needed

>
> > +               if (CHECK(btf_kind(test_type) != btf_kind(expect_type),
> > +                         "type %d kind: exp %d != got %u\n",
> > +                         i, btf_kind(expect_type), btf_kind(test_type))) {
> > +                       err = -1;
> > +                       goto done;
> > +               }
> > +               if (CHECK(test_type->info != expect_type->info,
> > +                         "type %d info: exp %d != got %u\n",
> > +                         i, expect_type->info, test_type->info)) {
>
> btf_kind() returns part of ->info, so we only need the second check, no?

technically yes, but when the kind mismatches, figuring that out from the
raw info field is quite painful, so having a better, more targeted check
is still good.

>
> IIUC, test_type and expect_type may have different name_off now. Shall
> we check ->size matches?

yep, sure, I'll add

>
>
> >                         err = -1;
> >                         goto done;
> >                 }
> > --
> > 2.24.1
> >


* Re: [PATCH bpf-next 02/11] selftest/bpf: relax btf_dedup test checks
  2020-10-30 18:44     ` Andrii Nakryiko
@ 2020-10-30 22:30       ` Song Liu
  0 siblings, 0 replies; 49+ messages in thread
From: Song Liu @ 2020-10-30 22:30 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Fri, Oct 30, 2020 at 11:45 AM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
[...]
> > > @@ -6775,10 +6774,15 @@ static void do_test_dedup(unsigned int test_num)
> > >                         err = -1;
> > >                         goto done;
> > >                 }
> > > -               if (CHECK(memcmp((void *)test_type,
> > > -                                (void *)expect_type,
> > > -                                test_size),
> > > -                         "type #%d: contents differ", i)) {
> >
> > I guess test_size and expect_size are not needed anymore?
>
> hm.. they are used just one check above, still needed

Hmm... I don't know what happened to me back then... Please ignore.

>
> >
> > > +               if (CHECK(btf_kind(test_type) != btf_kind(expect_type),
> > > +                         "type %d kind: exp %d != got %u\n",
> > > +                         i, btf_kind(expect_type), btf_kind(test_type))) {
> > > +                       err = -1;
> > > +                       goto done;
> > > +               }
> > > +               if (CHECK(test_type->info != expect_type->info,
> > > +                         "type %d info: exp %d != got %u\n",
> > > +                         i, expect_type->info, test_type->info)) {
> >
> > btf_kind() returns part of ->info, so we only need the second check, no?
>
> technically yes, but when the kind mismatches, figuring that out from the
> raw info field is quite painful, so having a better, more targeted check
> is still good.

Fair enough. We can have a clearer check.

Thanks,
Song


* Re: [PATCH bpf-next 03/11] libbpf: unify and speed up BTF string deduplication
  2020-10-29  0:58 ` [PATCH bpf-next 03/11] libbpf: unify and speed up BTF string deduplication Andrii Nakryiko
@ 2020-10-30 23:32   ` Song Liu
  2020-11-03  4:51     ` Andrii Nakryiko
  2020-11-03  4:59   ` Alexei Starovoitov
  1 sibling, 1 reply; 49+ messages in thread
From: Song Liu @ 2020-10-30 23:32 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, daniel, Kernel Team,
	Andrii Nakryiko



> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> 
> From: Andrii Nakryiko <andriin@fb.com>
> 
> Revamp BTF dedup's string deduplication to match the approach of writable BTF
> string management. This allows the deduplicated string index to be transferred
> back to the BTF object after deduplication without expensive extra memory
> copying and hash map re-construction. It also simplifies the code and speeds
> it up, because hashmap-based string deduplication is faster than the
> sort + unique approach.
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>

LGTM, with a couple of nitpicks below:

Acked-by: Song Liu <songliubraving@fb.com>

> ---
> tools/lib/bpf/btf.c | 265 +++++++++++++++++---------------------------
> 1 file changed, 99 insertions(+), 166 deletions(-)
> 
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 89fecfe5cb2b..db9331fea672 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -90,6 +90,14 @@ struct btf {
> 	struct hashmap *strs_hash;
> 	/* whether strings are already deduplicated */
> 	bool strs_deduped;
> +	/* extra indirection layer to make strings hashmap work with stable
> +	 * string offsets and ability to transparently choose between
> +	 * btf->strs_data or btf_dedup->strs_data as a source of strings.
> +	 * This is used for BTF strings dedup to transfer deduplicated strings
> +	 * data back to struct btf without re-building strings index.
> +	 */
> +	void **strs_data_ptr;
> +
> 	/* BTF object FD, if loaded into kernel */
> 	int fd;
> 
> @@ -1363,17 +1371,19 @@ int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
> 
> static size_t strs_hash_fn(const void *key, void *ctx)
> {
> -	struct btf *btf = ctx;
> -	const char *str = btf->strs_data + (long)key;
> +	const char ***strs_data_ptr = ctx;
> +	const char *strs = **strs_data_ptr;
> +	const char *str = strs + (long)key;

Can we keep using btf as the ctx here? "char ***" hurts my eyes...

[...]

> -	d->btf->hdr->str_len = end - start;
> +	/* replace BTF string data and hash with deduped ones */
> +	free(d->btf->strs_data);
> +	hashmap__free(d->btf->strs_hash);
> +	d->btf->strs_data = d->strs_data;
> +	d->btf->strs_data_cap = d->strs_cap;
> +	d->btf->hdr->str_len = d->strs_len;
> +	d->btf->strs_hash = d->strs_hash;
> +	/* now point strs_data_ptr back to btf->strs_data */
> +	d->btf->strs_data_ptr = &d->btf->strs_data;
> +
> +	d->strs_data = d->strs_hash = NULL;
> +	d->strs_len = d->strs_cap = 0;
> 	d->btf->strs_deduped = true;
> +	return 0;
> +
> +err_out:
> +	free(d->strs_data);
> +	hashmap__free(d->strs_hash);
> +	d->strs_data = d->strs_hash = NULL;
> +	d->strs_len = d->strs_cap = 0;
> +
> +	/* restore strings pointer for existing d->btf->strs_hash back */
> +	d->btf->strs_data_ptr = &d->strs_data;

We have quite a bit of duplicated code between the err_out and success cases.
How about we add a helper function, like

void free_strs_data(struct btf_dedup *d)
{
	free(d->strs_data);
	hashmap__free(d->strs_hash);
	d->strs_data = d->strs_hash = NULL;
	d->strs_len = d->strs_cap = 0;	
}

?


* Re: [PATCH bpf-next 04/11] libbpf: implement basic split BTF support
  2020-10-29  0:58 ` [PATCH bpf-next 04/11] libbpf: implement basic split BTF support Andrii Nakryiko
@ 2020-11-02 23:23   ` Song Liu
  2020-11-03  5:02     ` Andrii Nakryiko
  0 siblings, 1 reply; 49+ messages in thread
From: Song Liu @ 2020-11-02 23:23 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann, Kernel Team



> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> 

[...]

> 
> BTF deduplication is not yet supported for split BTF and support for it will
> be added in a separate patch.
> 
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

With a couple nits:

> ---
> tools/lib/bpf/btf.c      | 205 ++++++++++++++++++++++++++++++---------
> tools/lib/bpf/btf.h      |   8 ++
> tools/lib/bpf/libbpf.map |   9 ++
> 3 files changed, 175 insertions(+), 47 deletions(-)
> 
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index db9331fea672..20c64a8441a8 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -78,10 +78,32 @@ struct btf {
> 	void *types_data;
> 	size_t types_data_cap; /* used size stored in hdr->type_len */
> 
> -	/* type ID to `struct btf_type *` lookup index */
> +	/* type ID to `struct btf_type *` lookup index
> +	 * type_offs[0] corresponds to the first non-VOID type:
> +	 *   - for base BTF it's type [1];
> +	 *   - for split BTF it's the first non-base BTF type.
> +	 */
> 	__u32 *type_offs;
> 	size_t type_offs_cap;
> +	/* number of types in this BTF instance:
> +	 *   - doesn't include special [0] void type;
> +	 *   - for split BTF counts number of types added on top of base BTF.
> +	 */
> 	__u32 nr_types;

This is a little confusing. Maybe add a void type for every split BTF? 

> +	/* if not NULL, points to the base BTF on top of which the current
> +	 * split BTF is based
> +	 */

[...]

> 
> @@ -252,12 +274,20 @@ static int btf_parse_str_sec(struct btf *btf)
> 	const char *start = btf->strs_data;
> 	const char *end = start + btf->hdr->str_len;
> 
> -	if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET ||
> -	    start[0] || end[-1]) {
> -		pr_debug("Invalid BTF string section\n");
> -		return -EINVAL;
> +	if (btf->base_btf) {
> +		if (hdr->str_len == 0)
> +			return 0;
> +		if (hdr->str_len - 1 > BTF_MAX_STR_OFFSET || end[-1]) {
> +			pr_debug("Invalid BTF string section\n");
> +			return -EINVAL;
> +		}
> +	} else {
> +		if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET ||
> +		    start[0] || end[-1]) {
> +			pr_debug("Invalid BTF string section\n");
> +			return -EINVAL;
> +		}
> 	}
> -
> 	return 0;

I found this function a little difficult to follow. Maybe rearrange it as 

	/* too long, or not \0 terminated */
	if (hdr->str_len - 1 > BTF_MAX_STR_OFFSET || end[-1])
		goto err_out;

	/* for base btf, .... */
	if (!btf->base_btf && (!hdr->str_len || start[0]))
		goto err_out;

	return 0;
err_out:
	pr_debug("Invalid BTF string section\n");
	return -EINVAL;
}
> }
> 
> @@ -372,19 +402,9 @@ static int btf_parse_type_sec(struct btf *btf)
> 	struct btf_header *hdr = btf->hdr;
> 	void *next_type = btf->types_data;
> 	void *end_type = next_type + hdr->type_len;
> -	int err, i = 0, type_size;

[...]



* Re: [PATCH bpf-next 05/11] selftests/bpf: add split BTF basic test
  2020-10-29  0:58 ` [PATCH bpf-next 05/11] selftests/bpf: add split BTF basic test Andrii Nakryiko
@ 2020-11-02 23:36   ` Song Liu
  2020-11-03  5:10     ` Andrii Nakryiko
  0 siblings, 1 reply; 49+ messages in thread
From: Song Liu @ 2020-11-02 23:36 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, netdev, Alexei Starovoitov, daniel, Kernel Team



> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> 
> Add selftest validating ability to programmatically generate and then dump
> split BTF.
> 
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

With a nit:

[...]
> 
> +
> +static void btf_dump_printf(void *ctx, const char *fmt, va_list args)
> +{
> +	vfprintf(ctx, fmt, args);
> +}
> +
> +void test_btf_split() {
> +	struct btf_dump_opts opts;
> +	struct btf_dump *d = NULL;
> +	const struct btf_type *t;
> +	struct btf *btf1, *btf2 = NULL;

No need to initialize btf2 to NULL. 

> +	int str_off, i, err;
> +
> +	btf1 = btf__new_empty();
> +	if (!ASSERT_OK_PTR(btf1, "empty_main_btf"))
> +		return;
> +
> 

[...]



* Re: [PATCH bpf-next 06/11] selftests/bpf: add checking of raw type dump in BTF writer APIs selftests
  2020-10-29  0:58 ` [PATCH bpf-next 06/11] selftests/bpf: add checking of raw type dump in BTF writer APIs selftests Andrii Nakryiko
@ 2020-11-03  0:08   ` Song Liu
  2020-11-03  5:14     ` Andrii Nakryiko
  0 siblings, 1 reply; 49+ messages in thread
From: Song Liu @ 2020-11-03  0:08 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, netdev, Alexei Starovoitov, daniel, Kernel Team, Andrii Nakryiko



> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> 
> From: Andrii Nakryiko <andriin@fb.com>
> 
> Add re-usable btf_helpers.{c,h} to provide BTF-related testing routines. Start
> by adding raw BTF dumping helpers.
> 
> Raw BTF dump is the most succinct and at the same time a very human-friendly
> way to validate the exact contents of BTF types. Cross-validate raw BTF dump
> and writable BTF in a single selftest. Raw type dump checks also serve as good
> self-documentation.
> 
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

with a couple nits:

[...]

> +
> +/* Print raw BTF type dump into a local buffer and return string pointer back.
> + * Buffer *will* be overwritten by subsequent btf_type_raw_dump() calls
> + */
> +const char *btf_type_raw_dump(const struct btf *btf, int type_id)
> +{
> +	static char buf[16 * 1024];
> +	FILE *buf_file;
> +
> +	buf_file = fmemopen(buf, sizeof(buf) - 1, "w");
> +	if (!buf_file) {
> +		fprintf(stderr, "Failed to open memstream: %d\n", errno);
> +		return NULL;
> +	}
> +
> +	fprintf_btf_type_raw(buf_file, btf, type_id);
> +	fflush(buf_file);
> +	fclose(buf_file);
> +
> +	return buf;
> +}
> diff --git a/tools/testing/selftests/bpf/btf_helpers.h b/tools/testing/selftests/bpf/btf_helpers.h
> new file mode 100644
> index 000000000000..2c9ce1b61dc9
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/btf_helpers.h
> @@ -0,0 +1,12 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright (c) 2020 Facebook */
> +#ifndef __BTF_HELPERS_H
> +#define __BTF_HELPERS_H
> +
> +#include <stdio.h>
> +#include <bpf/btf.h>
> +
> +int fprintf_btf_type_raw(FILE *out, const struct btf *btf, __u32 id);
> +const char *btf_type_raw_dump(const struct btf *btf, int type_id);
> +
> +#endif
> diff --git a/tools/testing/selftests/bpf/prog_tests/btf_write.c b/tools/testing/selftests/bpf/prog_tests/btf_write.c
> index 314e1e7c36df..bc1412de1b3d 100644
> --- a/tools/testing/selftests/bpf/prog_tests/btf_write.c
> +++ b/tools/testing/selftests/bpf/prog_tests/btf_write.c
> @@ -2,6 +2,7 @@
> /* Copyright (c) 2020 Facebook */
> #include <test_progs.h>
> #include <bpf/btf.h>
> +#include "btf_helpers.h"
> 
> static int duration = 0;
> 
> @@ -11,12 +12,12 @@ void test_btf_write() {
> 	const struct btf_member *m;
> 	const struct btf_enum *v;
> 	const struct btf_param *p;
> -	struct btf *btf;
> +	struct btf *btf = NULL;

No need to initialize btf. 

> 	int id, err, str_off;
> 
> 	btf = btf__new_empty();
> 	if (CHECK(IS_ERR(btf), "new_empty", "failed: %ld\n", PTR_ERR(btf)))
> -		return;
> +		goto err_out;

err_out is not needed either. 

[...]


* Re: [PATCH bpf-next 07/11] libbpf: fix BTF data layout checks and allow empty BTF
  2020-10-29  0:58 ` [PATCH bpf-next 07/11] libbpf: fix BTF data layout checks and allow empty BTF Andrii Nakryiko
@ 2020-11-03  0:51   ` Song Liu
  2020-11-03  5:18     ` Andrii Nakryiko
  0 siblings, 1 reply; 49+ messages in thread
From: Song Liu @ 2020-11-03  0:51 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann, Kernel Team



> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> 
> Make data section layout checks stricter, disallowing overlap of types and
> strings data.
> 
> Additionally, allow BTFs with no type data. There is nothing inherently wrong
> with having BTF with no types (but potentially with some strings). This could
> be the situation with kernel module BTFs, if a module doesn't introduce any
> new type information.
> 
> Also fix invalid offset alignment check for btf->hdr->type_off.
> 
> Fixes: 8a138aed4a80 ("bpf: btf: Add BTF support to libbpf")
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> ---
> tools/lib/bpf/btf.c | 16 ++++++----------
> 1 file changed, 6 insertions(+), 10 deletions(-)
> 
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 20c64a8441a8..9b0ef71a03d0 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -245,22 +245,18 @@ static int btf_parse_hdr(struct btf *btf)
> 		return -EINVAL;
> 	}
> 
> -	if (meta_left < hdr->type_off) {
> -		pr_debug("Invalid BTF type section offset:%u\n", hdr->type_off);
> +	if (meta_left < hdr->str_off + hdr->str_len) {
> +		pr_debug("Invalid BTF total size:%u\n", btf->raw_size);
> 		return -EINVAL;
> 	}

Can we make this one as 
	if (meta_left != hdr->str_off + hdr->str_len) {

> 
> -	if (meta_left < hdr->str_off) {
> -		pr_debug("Invalid BTF string section offset:%u\n", hdr->str_off);
> +	if (hdr->type_off + hdr->type_len > hdr->str_off) {
> +		pr_debug("Invalid BTF data sections layout: type data at %u + %u, strings data at %u + %u\n",
> +			 hdr->type_off, hdr->type_len, hdr->str_off, hdr->str_len);
> 		return -EINVAL;
> 	}

And this one 
	if (hdr->type_off + hdr->type_len != hdr->str_off) {

?

[...]


* Re: [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs
  2020-10-29  0:58 ` [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs Andrii Nakryiko
@ 2020-11-03  2:49   ` Song Liu
  2020-11-03  5:25     ` Andrii Nakryiko
  2020-11-03  5:10   ` Alexei Starovoitov
  1 sibling, 1 reply; 49+ messages in thread
From: Song Liu @ 2020-11-03  2:49 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, netdev, Alexei Starovoitov, daniel, Kernel Team



> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> 
> Add support for deduplicating split BTFs. When deduplicating split BTF, base
> BTF is considered to be immutable and can't be modified or adjusted. 99% of
> BTF deduplication logic is left intact (modulo some type numbering adjustments).
> There are only two differences.
> 
> First, each type in base BTF gets hashed (except VAR and DATASEC, of course,
> those are always considered to be self-canonical instances) and added into
> a table of canonical type candidates. Hashing is a shallow, fast operation,
> so it mostly eliminates the overhead of having the entire base BTF be a part
> of BTF dedup.
> 
> The second difference is critical and subtle. While deduplicating split BTF
> types, it is possible to discover that one of the immutable base BTF's
> BTF_KIND_FWD types can and should be resolved to a full STRUCT/UNION type
> from the split BTF part. This, obviously, can't happen because we can't
> modify the base BTF types anymore. Because of that, any type in split BTF
> that directly or indirectly references that newly-to-be-resolved FWD type
> can't be considered to be equivalent to the corresponding canonical types in
> base BTF, because that would result in a loss of type resolution information.
> So in such a case, split BTF types will be deduplicated separately and will
> cause some duplication of type information, which is unavoidable.
> 
> With those two changes, the rest of the algorithm manages to deduplicate split
> BTF correctly, pointing all the duplicates to their canonical counterparts in
> base BTF, while also deduplicating whatever unique types are present in split
> BTF on their own.
> 
> Also, theoretically, split BTF after deduplication could end up with either an
> empty type section or an empty string section. This is handled correctly by
> libbpf in one of the previous patches in the series.
> 
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

With some nits:

> ---

[...]

> 
> 	/* remap string offsets */
> 	err = btf_for_each_str_off(d, strs_dedup_remap_str_off, d);
> @@ -3553,6 +3582,63 @@ static bool btf_compat_fnproto(struct btf_type *t1, struct btf_type *t2)
> 	return true;
> }
> 

An overview comment about btf_dedup_prep() would be great.

> +static int btf_dedup_prep(struct btf_dedup *d)
> +{
> +	struct btf_type *t;
> +	int type_id;
> +	long h;
> +
> +	if (!d->btf->base_btf)
> +		return 0;
> +
> +	for (type_id = 1; type_id < d->btf->start_id; type_id++)
> +	{

Move "{" to previous line? 

> +		t = btf_type_by_id(d->btf, type_id);
> +
> +		/* all base BTF types are self-canonical by definition */
> +		d->map[type_id] = type_id;
> +
> +		switch (btf_kind(t)) {
> +		case BTF_KIND_VAR:
> +		case BTF_KIND_DATASEC:
> +			/* VAR and DATASEC are never hash/deduplicated */
> +			continue;

[...]

> 	/* we are going to reuse hypot_map to store compaction remapping */
> 	d->hypot_map[0] = 0;
> -	for (i = 1; i <= d->btf->nr_types; i++)
> -		d->hypot_map[i] = BTF_UNPROCESSED_ID;
> +	/* base BTF types are not renumbered */
> +	for (id = 1; id < d->btf->start_id; id++)
> +		d->hypot_map[id] = id;
> +	for (i = 0, id = d->btf->start_id; i < d->btf->nr_types; i++, id++)
> +		d->hypot_map[id] = BTF_UNPROCESSED_ID;

We don't really need i in the loop; shall we just do
	for (id = d->btf->start_id; id < d->btf->start_id + d->btf->nr_types; id++)
?

> 
> 	p = d->btf->types_data;
> 
> -	for (i = 1; i <= d->btf->nr_types; i++) {
> -		if (d->map[i] != i)
> +	for (i = 0, id = d->btf->start_id; i < d->btf->nr_types; i++, id++) {

ditto

> +		if (d->map[id] != id)
> 			continue;
> 
[...]



* Re: [PATCH bpf-next 09/11] libbpf: accomodate DWARF/compiler bug with duplicated identical arrays
  2020-10-29  0:59 ` [PATCH bpf-next 09/11] libbpf: accomodate DWARF/compiler bug with duplicated identical arrays Andrii Nakryiko
@ 2020-11-03  2:52   ` Song Liu
  0 siblings, 0 replies; 49+ messages in thread
From: Song Liu @ 2020-11-03  2:52 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann, Kernel Team



> On Oct 28, 2020, at 5:59 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> 
> In some cases the compiler seems to generate distinct DWARF types for
> identical arrays within the same CU. That seems like a bug, but it's already
> out there and breaks type graph equivalence checks, so accommodate it anyway
> by checking for identical arrays, regardless of their type ID.
> 
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

[...]



* Re: [PATCH bpf-next 03/11] libbpf: unify and speed up BTF string deduplication
  2020-10-30 23:32   ` Song Liu
@ 2020-11-03  4:51     ` Andrii Nakryiko
  0 siblings, 0 replies; 49+ messages in thread
From: Andrii Nakryiko @ 2020-11-03  4:51 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov, daniel,
	Kernel Team, Andrii Nakryiko

On Fri, Oct 30, 2020 at 4:33 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > From: Andrii Nakryiko <andriin@fb.com>
> >
> > Revamp BTF dedup's string deduplication to match the approach of writable BTF
> > string management. This allows the deduplicated string index to be transferred
> > back to the BTF object after deduplication without expensive extra memory
> > copying and hash map re-construction. It also simplifies the code and speeds
> > it up, because hashmap-based string deduplication is faster than the
> > sort + unique approach.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>
> LGTM, with a couple of nitpicks below:
>
> Acked-by: Song Liu <songliubraving@fb.com>
>
> > ---
> > tools/lib/bpf/btf.c | 265 +++++++++++++++++---------------------------
> > 1 file changed, 99 insertions(+), 166 deletions(-)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index 89fecfe5cb2b..db9331fea672 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -90,6 +90,14 @@ struct btf {
> >       struct hashmap *strs_hash;
> >       /* whether strings are already deduplicated */
> >       bool strs_deduped;
> > +     /* extra indirection layer to make strings hashmap work with stable
> > +      * string offsets and ability to transparently choose between
> > +      * btf->strs_data or btf_dedup->strs_data as a source of strings.
> > +      * This is used for BTF strings dedup to transfer deduplicated strings
> > +      * data back to struct btf without re-building strings index.
> > +      */
> > +     void **strs_data_ptr;
> > +
> >       /* BTF object FD, if loaded into kernel */
> >       int fd;
> >
> > @@ -1363,17 +1371,19 @@ int btf__get_map_kv_tids(const struct btf *btf, const char *map_name,
> >
> > static size_t strs_hash_fn(const void *key, void *ctx)
> > {
> > -     struct btf *btf = ctx;
> > -     const char *str = btf->strs_data + (long)key;
> > +     const char ***strs_data_ptr = ctx;
> > +     const char *strs = **strs_data_ptr;
> > +     const char *str = strs + (long)key;
>
> Can we keep using btf as the ctx here? "char ***" hurts my eyes...
>

yep, changed to struct btf *

> [...]
>
> > -     d->btf->hdr->str_len = end - start;
> > +     /* replace BTF string data and hash with deduped ones */
> > +     free(d->btf->strs_data);
> > +     hashmap__free(d->btf->strs_hash);
> > +     d->btf->strs_data = d->strs_data;
> > +     d->btf->strs_data_cap = d->strs_cap;
> > +     d->btf->hdr->str_len = d->strs_len;
> > +     d->btf->strs_hash = d->strs_hash;
> > +     /* now point strs_data_ptr back to btf->strs_data */
> > +     d->btf->strs_data_ptr = &d->btf->strs_data;
> > +
> > +     d->strs_data = d->strs_hash = NULL;
> > +     d->strs_len = d->strs_cap = 0;
> >       d->btf->strs_deduped = true;
> > +     return 0;
> > +
> > +err_out:
> > +     free(d->strs_data);
> > +     hashmap__free(d->strs_hash);
> > +     d->strs_data = d->strs_hash = NULL;
> > +     d->strs_len = d->strs_cap = 0;
> > +
> > +     /* restore strings pointer for existing d->btf->strs_hash back */
> > +     d->btf->strs_data_ptr = &d->strs_data;
>
> We have quite a bit of duplicated code between the err_out and success cases.
> How about we add a helper function, like

nope, that won't work: it's free(d->strs_data) vs free(d->btf->strs_data)
(same for hashmap__free()), plus there are strict requirements about the
exact sequence of assignments in the success case

>
> void free_strs_data(struct btf_dedup *d)
> {
>         free(d->strs_data);
>         hashmap__free(d->strs_hash);
>         d->strs_data = d->strs_hash = NULL;
>         d->strs_len = d->strs_cap = 0;
> }
>
> ?


* Re: [PATCH bpf-next 03/11] libbpf: unify and speed up BTF string deduplication
  2020-10-29  0:58 ` [PATCH bpf-next 03/11] libbpf: unify and speed up BTF string deduplication Andrii Nakryiko
  2020-10-30 23:32   ` Song Liu
@ 2020-11-03  4:59   ` Alexei Starovoitov
  2020-11-03  6:01     ` Andrii Nakryiko
  1 sibling, 1 reply; 49+ messages in thread
From: Alexei Starovoitov @ 2020-11-03  4:59 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, netdev, ast, daniel, kernel-team, Andrii Nakryiko

On Wed, Oct 28, 2020 at 05:58:54PM -0700, Andrii Nakryiko wrote:
> From: Andrii Nakryiko <andriin@fb.com>
> 
> Revamp BTF dedup's string deduplication to match the approach of writable BTF
> string management. This allows the deduplicated string index to be transferred
> back to the BTF object after deduplication without expensive extra memory
> copying and hash map re-construction. It also simplifies the code and speeds
> it up, because hashmap-based string deduplication is faster than the
> sort + unique approach.
> 
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> ---
>  tools/lib/bpf/btf.c | 265 +++++++++++++++++---------------------------
>  1 file changed, 99 insertions(+), 166 deletions(-)
> 
> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> index 89fecfe5cb2b..db9331fea672 100644
> --- a/tools/lib/bpf/btf.c
> +++ b/tools/lib/bpf/btf.c
> @@ -90,6 +90,14 @@ struct btf {
>  	struct hashmap *strs_hash;
>  	/* whether strings are already deduplicated */
>  	bool strs_deduped;
> +	/* extra indirection layer to make strings hashmap work with stable
> +	 * string offsets and ability to transparently choose between
> +	 * btf->strs_data or btf_dedup->strs_data as a source of strings.
> +	 * This is used for BTF strings dedup to transfer deduplicated strings
> +	 * data back to struct btf without re-building strings index.
> +	 */
> +	void **strs_data_ptr;

I thought one of the ideas of the dedup algo was that strings were deduped
first, so there is no need to rebuild them.
Then split BTF cannot touch base BTF strings and they're immutable.
But the commit log is talking about transfer of strings and
hash map re-construction? Why would split BTF reconstruct anything?
It either finds a string in the base BTF or adds it to its own strings section.
Is it all due to the switch to hashing? The speedup motivation is clear, but
then it sounds like the speedup is causing all these issues.
The strings could have stayed as-is. Just a bit slower?


* Re: [PATCH bpf-next 04/11] libbpf: implement basic split BTF support
  2020-11-02 23:23   ` Song Liu
@ 2020-11-03  5:02     ` Andrii Nakryiko
  2020-11-03  5:41       ` Song Liu
  0 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-11-03  5:02 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Mon, Nov 2, 2020 at 3:24 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> >
>
> [...]
>
> >
> > BTF deduplication is not yet supported for split BTF and support for it will
> > be added in separate patch.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>
> Acked-by: Song Liu <songliubraving@fb.com>
>
> With a couple nits:
>
> > ---
> > tools/lib/bpf/btf.c      | 205 ++++++++++++++++++++++++++++++---------
> > tools/lib/bpf/btf.h      |   8 ++
> > tools/lib/bpf/libbpf.map |   9 ++
> > 3 files changed, 175 insertions(+), 47 deletions(-)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index db9331fea672..20c64a8441a8 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -78,10 +78,32 @@ struct btf {
> >       void *types_data;
> >       size_t types_data_cap; /* used size stored in hdr->type_len */
> >
> > -     /* type ID to `struct btf_type *` lookup index */
> > +     /* type ID to `struct btf_type *` lookup index
> > +      * type_offs[0] corresponds to the first non-VOID type:
> > +      *   - for base BTF it's type [1];
> > +      *   - for split BTF it's the first non-base BTF type.
> > +      */
> >       __u32 *type_offs;
> >       size_t type_offs_cap;
> > +     /* number of types in this BTF instance:
> > +      *   - doesn't include special [0] void type;
> > +      *   - for split BTF counts number of types added on top of base BTF.
> > +      */
> >       __u32 nr_types;
>
> This is a little confusing. Maybe add a void type for every split BTF?

Agree that it's a bit confusing. But I don't want VOID in every BTF,
that seems sloppy (there's no continuity). I'm currently doing similar
changes on the kernel side, and so far everything also works cleanly with
start_id == 0 && nr_types including VOID (for base BTF), and start_id
== base_btf->nr_types && nr_types counting all the added types (for split
BTF). That seems a bit more straightforward, so I'll probably do that
here as well (unless I'm missing something, I'll double check).
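
Either way the invariant is the same: [start_id, start_id + nr_types)
are the BTF's own types, and everything below start_id is resolved
through the base BTF. A sketch of the check (helper name made up for
illustration):

  static bool btf_id_is_own(const struct btf *btf, __u32 id)
  {
          /* IDs below start_id belong to btf->base_btf */
          return id >= btf->start_id && id < btf->start_id + btf->nr_types;
  }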

>
> > +     /* if not NULL, points to the base BTF on top of which the current
> > +      * split BTF is based
> > +      */
>
> [...]
>
> >
> > @@ -252,12 +274,20 @@ static int btf_parse_str_sec(struct btf *btf)
> >       const char *start = btf->strs_data;
> >       const char *end = start + btf->hdr->str_len;
> >
> > -     if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET ||
> > -         start[0] || end[-1]) {
> > -             pr_debug("Invalid BTF string section\n");
> > -             return -EINVAL;
> > +     if (btf->base_btf) {
> > +             if (hdr->str_len == 0)
> > +                     return 0;
> > +             if (hdr->str_len - 1 > BTF_MAX_STR_OFFSET || end[-1]) {
> > +                     pr_debug("Invalid BTF string section\n");
> > +                     return -EINVAL;
> > +             }
> > +     } else {
> > +             if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET ||
> > +                 start[0] || end[-1]) {
> > +                     pr_debug("Invalid BTF string section\n");
> > +                     return -EINVAL;
> > +             }
> >       }
> > -
> >       return 0;
>
> I found this function a little difficult to follow. Maybe rearrange it as
>
>         /* too long, or not \0 terminated */
>         if (hdr->str_len - 1 > BTF_MAX_STR_OFFSET || end[-1])
>                 goto err_out;

this won't work if str_len == 0: str_len - 1 will underflow, and
end[-1] will read garbage

How about this:

if (btf->base_btf && hdr->str_len == 0)
    return 0;

if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET || end[-1])
    return -EINVAL;

if (!btf->base_btf && start[0])
    return -EINVAL;

return 0;

This seems more straightforward, right?


>
>         /* for base btf, .... */
>         if (!btf->base_btf && (!hdr->str_len || start[0]))
>                 goto err_out;
>
>         return 0;
> err_out:
>         pr_debug("Invalid BTF string section\n");
>         return -EINVAL;
> }
> > }
> >
> > @@ -372,19 +402,9 @@ static int btf_parse_type_sec(struct btf *btf)
> >       struct btf_header *hdr = btf->hdr;
> >       void *next_type = btf->types_data;
> >       void *end_type = next_type + hdr->type_len;
> > -     int err, i = 0, type_size;
>
> [...]
>


* Re: [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs
  2020-10-29  0:58 ` [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs Andrii Nakryiko
  2020-11-03  2:49   ` Song Liu
@ 2020-11-03  5:10   ` Alexei Starovoitov
  2020-11-03  6:27     ` Andrii Nakryiko
  1 sibling, 1 reply; 49+ messages in thread
From: Alexei Starovoitov @ 2020-11-03  5:10 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, netdev, ast, daniel, kernel-team

On Wed, Oct 28, 2020 at 05:58:59PM -0700, Andrii Nakryiko wrote:
> @@ -2942,6 +2948,13 @@ struct btf_dedup {
>  	__u32 *hypot_list;
>  	size_t hypot_cnt;
>  	size_t hypot_cap;
> +	/* Whether hypothethical mapping, if successful, would need to adjust
> +	 * already canonicalized types (due to a new forward declaration to
> +	 * concrete type resolution). In such case, during split BTF dedup
> +	 * candidate type would still be considered as different, because base
> +	 * BTF is considered to be immutable.
> +	 */
> +	bool hypot_adjust_canon;

Why is one flag per dedup session enough?
Don't you have a case where some fwds are pointing to base btf and shouldn't
be adjusted while some are in split btf and should be?
It seems that when this flag is set to true it will miss fwds in split btf?

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 05/11] selftests/bpf: add split BTF basic test
  2020-11-02 23:36   ` Song Liu
@ 2020-11-03  5:10     ` Andrii Nakryiko
  0 siblings, 0 replies; 49+ messages in thread
From: Andrii Nakryiko @ 2020-11-03  5:10 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, netdev, Alexei Starovoitov, daniel, Kernel Team

On Mon, Nov 2, 2020 at 3:36 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > Add a selftest validating the ability to programmatically generate and then
> > dump split BTF.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>
> Acked-by: Song Liu <songliubraving@fb.com>
>
> With a nit:
>
> [...]
> >
> > +
> > +static void btf_dump_printf(void *ctx, const char *fmt, va_list args)
> > +{
> > +     vfprintf(ctx, fmt, args);
> > +}
> > +
> > +void test_btf_split() {
> > +     struct btf_dump_opts opts;
> > +     struct btf_dump *d = NULL;
> > +     const struct btf_type *t;
> > +     struct btf *btf1, *btf2 = NULL;
>
> No need to initialize btf2 to NULL.

yep, must be a leftover from an earlier version, I'll remove the initialization.

>
> > +     int str_off, i, err;
> > +
> > +     btf1 = btf__new_empty();
> > +     if (!ASSERT_OK_PTR(btf1, "empty_main_btf"))
> > +             return;
> > +
> >
>
> [...]
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 06/11] selftests/bpf: add checking of raw type dump in BTF writer APIs selftests
  2020-11-03  0:08   ` Song Liu
@ 2020-11-03  5:14     ` Andrii Nakryiko
  0 siblings, 0 replies; 49+ messages in thread
From: Andrii Nakryiko @ 2020-11-03  5:14 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, netdev, Alexei Starovoitov, daniel,
	Kernel Team, Andrii Nakryiko

On Mon, Nov 2, 2020 at 4:08 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > From: Andrii Nakryiko <andriin@fb.com>
> >
> > Add re-usable btf_helpers.{c,h} to provide BTF-related testing routines. Start
> > by adding raw BTF dumping helpers.
> >
> > Raw BTF dump is the most succinct and at the same time a very human-friendly
> > way to validate the exact contents of BTF types. Cross-validate raw BTF dump
> > and writable BTF in a single selftest. Raw type dump checks also serve as good
> > self-documentation.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>
> Acked-by: Song Liu <songliubraving@fb.com>
>
> with a couple nits:
>
> [...]
>
> > +
> > +/* Print raw BTF type dump into a local buffer and return string pointer back.
> > + * Buffer *will* be overwritten by subsequent btf_type_raw_dump() calls
> > + */
> > +const char *btf_type_raw_dump(const struct btf *btf, int type_id)
> > +{
> > +     static char buf[16 * 1024];
> > +     FILE *buf_file;
> > +
> > +     buf_file = fmemopen(buf, sizeof(buf) - 1, "w");
> > +     if (!buf_file) {
> > +             fprintf(stderr, "Failed to open memstream: %d\n", errno);
> > +             return NULL;
> > +     }
> > +
> > +     fprintf_btf_type_raw(buf_file, btf, type_id);
> > +     fflush(buf_file);
> > +     fclose(buf_file);
> > +
> > +     return buf;
> > +}
> > diff --git a/tools/testing/selftests/bpf/btf_helpers.h b/tools/testing/selftests/bpf/btf_helpers.h
> > new file mode 100644
> > index 000000000000..2c9ce1b61dc9
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/btf_helpers.h
> > @@ -0,0 +1,12 @@
> > +/* SPDX-License-Identifier: GPL-2.0 */
> > +/* Copyright (c) 2020 Facebook */
> > +#ifndef __BTF_HELPERS_H
> > +#define __BTF_HELPERS_H
> > +
> > +#include <stdio.h>
> > +#include <bpf/btf.h>
> > +
> > +int fprintf_btf_type_raw(FILE *out, const struct btf *btf, __u32 id);
> > +const char *btf_type_raw_dump(const struct btf *btf, int type_id);
> > +
> > +#endif
> > diff --git a/tools/testing/selftests/bpf/prog_tests/btf_write.c b/tools/testing/selftests/bpf/prog_tests/btf_write.c
> > index 314e1e7c36df..bc1412de1b3d 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/btf_write.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/btf_write.c
> > @@ -2,6 +2,7 @@
> > /* Copyright (c) 2020 Facebook */
> > #include <test_progs.h>
> > #include <bpf/btf.h>
> > +#include "btf_helpers.h"
> >
> > static int duration = 0;
> >
> > @@ -11,12 +12,12 @@ void test_btf_write() {
> >       const struct btf_member *m;
> >       const struct btf_enum *v;
> >       const struct btf_param *p;
> > -     struct btf *btf;
> > +     struct btf *btf = NULL;
>
> No need to initialize btf.
>
> >       int id, err, str_off;
> >
> >       btf = btf__new_empty();
> >       if (CHECK(IS_ERR(btf), "new_empty", "failed: %ld\n", PTR_ERR(btf)))
> > -             return;
> > +             goto err_out;
>
> err_out is not needed either.

eagle eye ;) fixed both

>
> [...]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 07/11] libbpf: fix BTF data layout checks and allow empty BTF
  2020-11-03  0:51   ` Song Liu
@ 2020-11-03  5:18     ` Andrii Nakryiko
  2020-11-03  5:44       ` Song Liu
  0 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-11-03  5:18 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Mon, Nov 2, 2020 at 4:51 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > Make data section layout checks stricter, disallowing overlap of types and
> > strings data.
> >
> > Additionally, allow BTFs with no type data. There is nothing inherently wrong
> > with having BTF with no types (but potentially with some strings). This could
> > be a situation with kernel module BTFs, if a module doesn't introduce any new
> > type information.
> >
> > Also fix invalid offset alignment check for btf->hdr->type_off.
> >
> > Fixes: 8a138aed4a80 ("bpf: btf: Add BTF support to libbpf")
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> > ---
> > tools/lib/bpf/btf.c | 16 ++++++----------
> > 1 file changed, 6 insertions(+), 10 deletions(-)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index 20c64a8441a8..9b0ef71a03d0 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -245,22 +245,18 @@ static int btf_parse_hdr(struct btf *btf)
> >               return -EINVAL;
> >       }
> >
> > -     if (meta_left < hdr->type_off) {
> > -             pr_debug("Invalid BTF type section offset:%u\n", hdr->type_off);
> > +     if (meta_left < hdr->str_off + hdr->str_len) {
> > +             pr_debug("Invalid BTF total size:%u\n", btf->raw_size);
> >               return -EINVAL;
> >       }
>
> Can we make this one as
>         if (meta_left != hdr->str_off + hdr->str_len) {

That would not be forward-compatible, i.e., old libbpf loading new BTF
with extra stuff after the string section. The kernel is necessarily more
strict, but I'd like to keep libbpf more permissive here.

>
> >
> > -     if (meta_left < hdr->str_off) {
> > -             pr_debug("Invalid BTF string section offset:%u\n", hdr->str_off);
> > +     if (hdr->type_off + hdr->type_len > hdr->str_off) {
> > +             pr_debug("Invalid BTF data sections layout: type data at %u + %u, strings data at %u + %u\n",
> > +                      hdr->type_off, hdr->type_len, hdr->str_off, hdr->str_len);
> >               return -EINVAL;
> >       }
>
> And this one
>         if (hdr->type_off + hdr->type_len != hdr->str_off) {
>
> ?

Similarly, libbpf could be a bit more permissive here without
sacrificing correctness (at least for read-only BTF; when rewriting
BTF, extra data will be discarded, of course).

>
> [...]

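To make the layout argument concrete, here is a sketch of the two checks
(the conditions are from the patch; the function boundary is made up):

/* sketch: permissive data section layout validation */
static int sketch_check_layout(const struct btf_header *hdr, __u32 meta_left)
{
	/* sections must fit within the buffer; extra trailing data is
	 * tolerated so that older libbpf can load newer BTF
	 */
	if (meta_left < hdr->str_off + hdr->str_len)
		return -EINVAL;

	/* type data must not overlap string data */
	if (hdr->type_off + hdr->type_len > hdr->str_off)
		return -EINVAL;

	return 0;
}
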
^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs
  2020-11-03  2:49   ` Song Liu
@ 2020-11-03  5:25     ` Andrii Nakryiko
  2020-11-03  5:59       ` Song Liu
  0 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-11-03  5:25 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, netdev, Alexei Starovoitov, daniel, Kernel Team

On Mon, Nov 2, 2020 at 6:49 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > Add support for deduplicating split BTFs. When deduplicating split BTF, base
> > BTF is considered to be immutable and can't be modified or adjusted. 99% of
> > BTF deduplication logic is left intact (modulo some type numbering adjustments).
> > There are only two differences.
> >
> > First, each type in base BTF gets hashed (except VAR and DATASEC, of course,
> > those are always considered to be self-canonical instances) and added into
> > a table of canonical type candidates. Hashing is a shallow, fast operation,
> > so it mostly eliminates the overhead of having the entire base BTF be a part
> > of BTF dedup.
> >
> > The second difference is very critical and subtle. While deduplicating split BTF
> > types, it is possible to discover that one of the immutable base BTF BTF_KIND_FWD
> > types can and should be resolved to a full STRUCT/UNION type from the split
> > BTF part.  This, obviously, can't happen because we can't modify the base
> > BTF types anymore. Because of that, any type in split BTF that directly or
> > indirectly references that newly-to-be-resolved FWD type can't be considered
> > equivalent to the corresponding canonical types in base BTF, because
> > that would result in a loss of type resolution information. In such a case,
> > split BTF types will be deduplicated separately and will cause some
> > duplication of type information, which is unavoidable.
> >
> > With those two changes, the rest of the algorithm manages to deduplicate split
> > BTF correctly, pointing all the duplicates to their canonical counterparts in
> > base BTF, while also deduplicating whatever unique types are present in split
> > BTF on their own.
> >
> > Also, theoretically, split BTF after deduplication could end up with either
> > empty type section or empty string section. This is handled by libbpf
> > correctly in one of previous patches in the series.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>
> Acked-by: Song Liu <songliubraving@fb.com>
>
> With some nits:
>
> > ---
>
> [...]
>
> >
> >       /* remap string offsets */
> >       err = btf_for_each_str_off(d, strs_dedup_remap_str_off, d);
> > @@ -3553,6 +3582,63 @@ static bool btf_compat_fnproto(struct btf_type *t1, struct btf_type *t2)
> >       return true;
> > }
> >
>
> An overview comment about btf_dedup_prep() will be great.

ok

>
> > +static int btf_dedup_prep(struct btf_dedup *d)
> > +{
> > +     struct btf_type *t;
> > +     int type_id;
> > +     long h;
> > +
> > +     if (!d->btf->base_btf)
> > +             return 0;
> > +
> > +     for (type_id = 1; type_id < d->btf->start_id; type_id++)
> > +     {
>
> Move "{" to previous line?

yep, my bad

>
> > +             t = btf_type_by_id(d->btf, type_id);
> > +
> > +             /* all base BTF types are self-canonical by definition */
> > +             d->map[type_id] = type_id;
> > +
> > +             switch (btf_kind(t)) {
> > +             case BTF_KIND_VAR:
> > +             case BTF_KIND_DATASEC:
> > +                     /* VAR and DATASEC are never hash/deduplicated */
> > +                     continue;
>
> [...]
>
> >       /* we are going to reuse hypot_map to store compaction remapping */
> >       d->hypot_map[0] = 0;
> > -     for (i = 1; i <= d->btf->nr_types; i++)
> > -             d->hypot_map[i] = BTF_UNPROCESSED_ID;
> > +     /* base BTF types are not renumbered */
> > +     for (id = 1; id < d->btf->start_id; id++)
> > +             d->hypot_map[id] = id;
> > +     for (i = 0, id = d->btf->start_id; i < d->btf->nr_types; i++, id++)
> > +             d->hypot_map[id] = BTF_UNPROCESSED_ID;
>
> We don't really need i in the loop, shall we just do
>         for (id = d->btf->start_id; id < d->btf->start_id + d->btf->nr_types; id++)
> ?
>

I prefer the loop with i iterating over the count of types; it seems
more "obviously correct". For a simple loop like this I could do

for (i = 0; i < d->btf->nr_types; i++)
    d->hypot_map[d->btf->start_id + i] = ...;

But for the more complicated one below I found that maintaining id as
part of the for loop control block is a bit cleaner. So I just stuck
to a consistent pattern across all of them.

> >
> >       p = d->btf->types_data;
> >
> > -     for (i = 1; i <= d->btf->nr_types; i++) {
> > -             if (d->map[i] != i)
> > +     for (i = 0, id = d->btf->start_id; i < d->btf->nr_types; i++, id++) {
>
> ditto
>
> > +             if (d->map[id] != id)
> >                       continue;
> >
> [...]
>

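As for the btf_dedup_prep() overview comment, something along these
lines, perhaps (exact wording TBD):

/* Collect and hash all types from base BTF, so that types from split
 * BTF can find their canonical candidates among them. Base BTF types
 * are self-canonical by definition and are never remapped; VAR and
 * DATASEC are skipped, as they are never deduplicated.
 */
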
^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 10/11] selftests/bpf: add split BTF dedup selftests
  2020-10-29  0:59 ` [PATCH bpf-next 10/11] selftests/bpf: add split BTF dedup selftests Andrii Nakryiko
@ 2020-11-03  5:35   ` Song Liu
  2020-11-03  6:05     ` Andrii Nakryiko
  0 siblings, 1 reply; 49+ messages in thread
From: Song Liu @ 2020-11-03  5:35 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, netdev, Alexei Starovoitov, daniel, Kernel Team



> On Oct 28, 2020, at 5:59 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> 
> Add selftests validating BTF deduplication for the split BTF case. Add a helper
> macro that allows validating the entire BTF with a raw BTF dump, not just
> type-by-type. This saves tons of code and complexity.
> 
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

with a couple nits:

[...]

> 
> int fprintf_btf_type_raw(FILE *out, const struct btf *btf, __u32 id);
> const char *btf_type_raw_dump(const struct btf *btf, int type_id);
> +int btf_validate_raw(struct btf *btf, int nr_types, const char *exp_types[]);
> 
> +#define VALIDATE_RAW_BTF(btf, raw_types...)				\
> +	btf_validate_raw(btf,						\
> +			 sizeof((const char *[]){raw_types})/sizeof(void *),\
> +			 (const char *[]){raw_types})
> +
> +const char *btf_type_c_dump(const struct btf *btf);
> #endif
> diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
> new file mode 100644
> index 000000000000..097370a41b60
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
> @@ -0,0 +1,326 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2020 Facebook */
> +#include <test_progs.h>
> +#include <bpf/btf.h>
> +#include "btf_helpers.h"
> +
> +
> +static void test_split_simple() {
> +	const struct btf_type *t;
> +	struct btf *btf1, *btf2 = NULL;
> +	int str_off, err;
> +
> +	btf1 = btf__new_empty();
> +	if (!ASSERT_OK_PTR(btf1, "empty_main_btf"))
> +		return;
> +
> +	btf__set_pointer_size(btf1, 8); /* enforce 64-bit arch */
> +
> +	btf__add_int(btf1, "int", 4, BTF_INT_SIGNED);	/* [1] int */
> +	btf__add_ptr(btf1, 1);				/* [2] ptr to int */
> +	btf__add_struct(btf1, "s1", 4);			/* [3] struct s1 { */
> +	btf__add_field(btf1, "f1", 1, 0, 0);		/*      int f1; */
> +							/* } */
> +

nit: two empty lines. 

> +	VALIDATE_RAW_BTF(
> +		btf1,
> +		"[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
> +		"[2] PTR '(anon)' type_id=1",
> +		"[3] STRUCT 's1' size=4 vlen=1\n"
> +		"\t'f1' type_id=1 bits_offset=0");
> +

[...]

> +
> +cleanup:
> +	btf__free(btf2);
> +	btf__free(btf1);
> +}
> +
> +static void test_split_struct_duped() {
> +	struct btf *btf1, *btf2 = NULL;

nit: No need to initialize btf2, for all 3 tests. 

> +	int err;
> +
[...]


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 04/11] libbpf: implement basic split BTF support
  2020-11-03  5:02     ` Andrii Nakryiko
@ 2020-11-03  5:41       ` Song Liu
  2020-11-04 23:51         ` Andrii Nakryiko
  0 siblings, 1 reply; 49+ messages in thread
From: Song Liu @ 2020-11-03  5:41 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team



> On Nov 2, 2020, at 9:02 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
> On Mon, Nov 2, 2020 at 3:24 PM Song Liu <songliubraving@fb.com> wrote:
>> 
>> 
>> 
>>> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
>>> 
>> 
>> [...]
>> 
>>> 
>>> BTF deduplication is not yet supported for split BTF and support for it will
>>> be added in separate patch.
>>> 
>>> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>> 
>> Acked-by: Song Liu <songliubraving@fb.com>
>> 
>> With a couple nits:
>> 
>>> ---
>>> tools/lib/bpf/btf.c      | 205 ++++++++++++++++++++++++++++++---------
>>> tools/lib/bpf/btf.h      |   8 ++
>>> tools/lib/bpf/libbpf.map |   9 ++
>>> 3 files changed, 175 insertions(+), 47 deletions(-)
>>> 
>>> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
>>> index db9331fea672..20c64a8441a8 100644
>>> --- a/tools/lib/bpf/btf.c
>>> +++ b/tools/lib/bpf/btf.c
>>> @@ -78,10 +78,32 @@ struct btf {
>>>      void *types_data;
>>>      size_t types_data_cap; /* used size stored in hdr->type_len */
>>> 
>>> -     /* type ID to `struct btf_type *` lookup index */
>>> +     /* type ID to `struct btf_type *` lookup index
>>> +      * type_offs[0] corresponds to the first non-VOID type:
>>> +      *   - for base BTF it's type [1];
>>> +      *   - for split BTF it's the first non-base BTF type.
>>> +      */
>>>      __u32 *type_offs;
>>>      size_t type_offs_cap;
>>> +     /* number of types in this BTF instance:
>>> +      *   - doesn't include special [0] void type;
>>> +      *   - for split BTF counts number of types added on top of base BTF.
>>> +      */
>>>      __u32 nr_types;
>> 
>> This is a little confusing. Maybe add a void type for every split BTF?
> 
> Agreed, it's a bit confusing. But I don't want VOID in every BTF,
> that seems sloppy (there's no continuity). I'm currently doing similar
> changes on the kernel side, and so far everything also works cleanly with
> start_id == 0 && nr_types including VOID (for base BTF), and start_id
> == base_btf->nr_types && nr_types counting all the added types (for split
> BTF). That seems a bit more straightforward, so I'll probably do that
> here as well (unless I'm missing something, I'll double check).

That sounds good. 

> 
>> 
>>> +     /* if not NULL, points to the base BTF on top of which the current
>>> +      * split BTF is based
>>> +      */
>> 
>> [...]
>> 
>>> 
>>> @@ -252,12 +274,20 @@ static int btf_parse_str_sec(struct btf *btf)
>>>      const char *start = btf->strs_data;
>>>      const char *end = start + btf->hdr->str_len;
>>> 
>>> -     if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET ||
>>> -         start[0] || end[-1]) {
>>> -             pr_debug("Invalid BTF string section\n");
>>> -             return -EINVAL;
>>> +     if (btf->base_btf) {
>>> +             if (hdr->str_len == 0)
>>> +                     return 0;
>>> +             if (hdr->str_len - 1 > BTF_MAX_STR_OFFSET || end[-1]) {
>>> +                     pr_debug("Invalid BTF string section\n");
>>> +                     return -EINVAL;
>>> +             }
>>> +     } else {
>>> +             if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET ||
>>> +                 start[0] || end[-1]) {
>>> +                     pr_debug("Invalid BTF string section\n");
>>> +                     return -EINVAL;
>>> +             }
>>>      }
>>> -
>>>      return 0;
>> 
>> I found this function a little difficult to follow. Maybe rearrange it as
>> 
>>        /* too long, or not \0 terminated */
>>        if (hdr->str_len - 1 > BTF_MAX_STR_OFFSET || end[-1])
>>                goto err_out;
> 
> this won't work if str_len == 0: str_len - 1 will underflow, and
> end[-1] will be reading garbage
> 
> How about this:
> 
> if (btf->base_btf && hdr->str_len == 0)
>    return 0;
> 
> if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET || end[-1])
>    return -EINVAL;
> 
> if (!btf->base_btf && start[0])
>    return -EINVAL;
> 
> return 0;
> 
> This seems more straightforward, right?

Yeah, I like this version. BTW, a short comment for each condition would be
helpful.



^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 07/11] libbpf: fix BTF data layout checks and allow empty BTF
  2020-11-03  5:18     ` Andrii Nakryiko
@ 2020-11-03  5:44       ` Song Liu
  0 siblings, 0 replies; 49+ messages in thread
From: Song Liu @ 2020-11-03  5:44 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team



> On Nov 2, 2020, at 9:18 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
> On Mon, Nov 2, 2020 at 4:51 PM Song Liu <songliubraving@fb.com> wrote:
>> 
>> 
>> 
>>> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
>>> 
>>> Make data section layout checks stricter, disallowing overlap of types and
>>> strings data.
>>> 
>>> Additionally, allow BTFs with no type data. There is nothing inherently wrong
>>> with having BTF with no types (but potentially with some strings). This could
>>> be a situation with kernel module BTFs, if a module doesn't introduce any new
>>> type information.
>>> 
>>> Also fix invalid offset alignment check for btf->hdr->type_off.
>>> 
>>> Fixes: 8a138aed4a80 ("bpf: btf: Add BTF support to libbpf")
>>> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>>> ---
>>> tools/lib/bpf/btf.c | 16 ++++++----------
>>> 1 file changed, 6 insertions(+), 10 deletions(-)
>>> 
>>> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
>>> index 20c64a8441a8..9b0ef71a03d0 100644
>>> --- a/tools/lib/bpf/btf.c
>>> +++ b/tools/lib/bpf/btf.c
>>> @@ -245,22 +245,18 @@ static int btf_parse_hdr(struct btf *btf)
>>>              return -EINVAL;
>>>      }
>>> 
>>> -     if (meta_left < hdr->type_off) {
>>> -             pr_debug("Invalid BTF type section offset:%u\n", hdr->type_off);
>>> +     if (meta_left < hdr->str_off + hdr->str_len) {
>>> +             pr_debug("Invalid BTF total size:%u\n", btf->raw_size);
>>>              return -EINVAL;
>>>      }
>> 
>> Can we make this one as
>>        if (meta_left != hdr->str_off + hdr->str_len) {
> 
> That would not be forward-compatible, i.e., old libbpf loading new BTF
> with extra stuff after the string section. The kernel is necessarily more
> strict, but I'd like to keep libbpf more permissive here.

Yeah, this makes sense. Let's keep both checks AS-IS. 

Thanks,
Song


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs
  2020-11-03  5:25     ` Andrii Nakryiko
@ 2020-11-03  5:59       ` Song Liu
  2020-11-03  6:31         ` Andrii Nakryiko
  0 siblings, 1 reply; 49+ messages in thread
From: Song Liu @ 2020-11-03  5:59 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, netdev, Alexei Starovoitov, daniel, Kernel Team



> On Nov 2, 2020, at 9:25 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
> On Mon, Nov 2, 2020 at 6:49 PM Song Liu <songliubraving@fb.com> wrote:
>> 
>> 
>> 
>>> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
>>> 
>>> [...]
>> 
>> Acked-by: Song Liu <songliubraving@fb.com>
>> 
>> With some nits:
>> 
>>> ---
>> 
>> [...]
>> 
>>> 
>>>      /* remap string offsets */
>>>      err = btf_for_each_str_off(d, strs_dedup_remap_str_off, d);
>>> @@ -3553,6 +3582,63 @@ static bool btf_compat_fnproto(struct btf_type *t1, struct btf_type *t2)
>>>      return true;
>>> }
>>> 
>> 
>> An overview comment about btf_dedup_prep() will be great.
> 
> ok
> 
>> 
>>> +static int btf_dedup_prep(struct btf_dedup *d)
>>> +{
>>> +     struct btf_type *t;
>>> +     int type_id;
>>> +     long h;
>>> +
>>> +     if (!d->btf->base_btf)
>>> +             return 0;
>>> +
>>> +     for (type_id = 1; type_id < d->btf->start_id; type_id++)
>>> +     {
>> 
>> Move "{" to previous line?
> 
> yep, my bad
> 
>> 
>>> +             t = btf_type_by_id(d->btf, type_id);
>>> +
>>> +             /* all base BTF types are self-canonical by definition */
>>> +             d->map[type_id] = type_id;
>>> +
>>> +             switch (btf_kind(t)) {
>>> +             case BTF_KIND_VAR:
>>> +             case BTF_KIND_DATASEC:
>>> +                     /* VAR and DATASEC are never hash/deduplicated */
>>> +                     continue;
>> 
>> [...]
>> 
>>>      /* we are going to reuse hypot_map to store compaction remapping */
>>>      d->hypot_map[0] = 0;
>>> -     for (i = 1; i <= d->btf->nr_types; i++)
>>> -             d->hypot_map[i] = BTF_UNPROCESSED_ID;
>>> +     /* base BTF types are not renumbered */
>>> +     for (id = 1; id < d->btf->start_id; id++)
>>> +             d->hypot_map[id] = id;
>>> +     for (i = 0, id = d->btf->start_id; i < d->btf->nr_types; i++, id++)
>>> +             d->hypot_map[id] = BTF_UNPROCESSED_ID;
>> 
>> We don't really need i in the loop, shall we just do
>>        for (id = d->btf->start_id; id < d->btf->start_id + d->btf->nr_types; id++)
>> ?
>> 
> 
> I prefer the loop with i iterating over the count of types; it seems
> more "obviously correct". For a simple loop like this I could do
> 
> for (i = 0; i < d->btf->nr_types; i++)
>    d->hypot_map[d->btf->start_id + i] = ...;
> 
> But for the more complicated one below I found that maintaining id as
> part of the for loop control block is a bit cleaner. So I just stuck
> to a consistent pattern across all of them.

How about 

	for (i = 0; i < d->btf->nr_types; i++) {
		id = d->btf->start_id + i;
		...
?

I would expect a for loop with two loop variables to do some tricks, like two
termination conditions, or a conditional id++ somewhere in the loop.

Thanks,
Song


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 03/11] libbpf: unify and speed up BTF string deduplication
  2020-11-03  4:59   ` Alexei Starovoitov
@ 2020-11-03  6:01     ` Andrii Nakryiko
  0 siblings, 0 replies; 49+ messages in thread
From: Andrii Nakryiko @ 2020-11-03  6:01 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, Andrii Nakryiko

On Mon, Nov 2, 2020 at 8:59 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Oct 28, 2020 at 05:58:54PM -0700, Andrii Nakryiko wrote:
> > From: Andrii Nakryiko <andriin@fb.com>
> >
> > Revamp BTF dedup's string deduplication to match the approach of writable BTF
> > string management. This allows transferring the deduplicated string index back
> > to the BTF object after deduplication without expensive extra memory copying
> > and hash map re-construction. It also simplifies the code and speeds it up,
> > because hashmap-based string deduplication is faster than the sort + unique
> > approach.
> >
> > Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> > ---
> >  tools/lib/bpf/btf.c | 265 +++++++++++++++++---------------------------
> >  1 file changed, 99 insertions(+), 166 deletions(-)
> >
> > diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> > index 89fecfe5cb2b..db9331fea672 100644
> > --- a/tools/lib/bpf/btf.c
> > +++ b/tools/lib/bpf/btf.c
> > @@ -90,6 +90,14 @@ struct btf {
> >       struct hashmap *strs_hash;
> >       /* whether strings are already deduplicated */
> >       bool strs_deduped;
> > +     /* extra indirection layer to make strings hashmap work with stable
> > +      * string offsets and ability to transparently choose between
> > +      * btf->strs_data or btf_dedup->strs_data as a source of strings.
> > +      * This is used for BTF strings dedup to transfer deduplicated strings
> > +      * data back to struct btf without re-building strings index.
> > +      */
> > +     void **strs_data_ptr;
>
> I thought one of the ideas of the dedup algo was that strings were deduped
> first, so there is no need to rebuild them.

Ugh.. many things to unpack here. Let's try.

Yes, the idea of dedup is to have only unique strings. But we were
always rebuilding strings during dedup; here we are just changing the
algorithm for string dedup from sort+uniq to a hash table.
deduping strings unconditionally because we don't know how the BTF
strings section was created in the first place and if it's already
deduplicated or not. So we had to always do it before.

With BTF write APIs the situation became a bit more nuanced. If we
create BTF programmatically from scratch (btf_new_empty()), then
libbpf guarantees (by construction) that all added strings are
auto-deduped. In such a case btf->strs_deduped will be set to true and
during btf_dedup() we'll skip string deduplication. It's purely a
performance improvement and it benefits the main btf_dedup workflow in
pahole.

But if ready-built BTF was loaded from somewhere first and then
modified with BTF write APIs, then it's a bit different. For existing
strings, when we transition from read-only BTF to writable BTF, we
build string lookup hashmap, but we don't deduplicate and remap string
offsets. So if the loaded BTF had string duplicates, it will continue
having string duplicates. The string lookup index will pick an arbitrary
instance of a duplicated string as the unique key, but the strings data
will still have duplicates, and there will be types that still reference
duplicated strings. Until (and if) we do btf_dedup(). At that time
we'll create another unique hash table *and* will remap all string
offsets across all types.

I did it this way intentionally (not remapping strings when doing
read-only -> writable BTF transition) to not accidentally corrupt
.BTF.ext strings. If I were to do full string dedup for r/o ->
writable transition, I'd need to add APIs to "link" struct btf_ext to
struct btf, so that libbpf could remap .BTF.ext strings transparently.
But I didn't want to add those APIs (yet) and didn't want to deal with
mutable struct btf_ext (yet).

So, in short, for strings dedup fundamentally nothing changed at all.

> Then split BTF cannot touch base BTF strings and they're immutable.

This is exactly the case right now. Nothing in base BTF changes, ever.

> But the commit log is talking about transfer of strings and
> hash map re-construction? Why would split BTF reconstruct anything?

This transfer of strings is for split BTF's strings data only. In the
general case, we have some unknown strings data in split BTF. When we
do dedup, we need to make sure that split BTF strings are deduplicated
(we don't touch base BTF strings at all). For that we need to
construct a new hashmap. Once we constructed it, we have new strings
data with deduplicated strings, so to avoid creating another big copy
for struct btf, we just "transfer" that data to struct btf from struct
btf_dedup. void **strs_data_ptr just allows reusing the same (already
constructed) hashmap, same underlying blob of deduplicated string
data, same hashing and equality functions.

> It either finds a string in a base BTF or adds to its own strings section.
> Is it all due to the switch to hash? The speedup motivation is clear, but then
> it sounds like the speedup is causing all these issues.
> The strings could have stayed as-is. Just a bit slower ?

Previously we were able to rewrite strings in-place and strings data
was never reallocated (because BTF was always read-only). So it was
all a bit simpler. By using double indirection we don't have to build
a third hashmap once we are done with string dedup; we just replace
struct btf's own string lookup hashmap and string data memory. The
alternative is another expensive memory allocation and a potentially
pretty big hashmap copy.

Apart from the double indirection, the algorithm is much simpler now. If I
were writing the original BTF dedup in C++, I'd have used a hashmap approach
back then. But we didn't have a hashmap in libbpf yet, so sort + uniq
was chosen.

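A rough sketch of what the string hashmap callbacks look like with the
double indirection (signatures match libbpf's hashmap.h callback types;
str_hash() is approximate):

/* keys are string offsets; both callbacks reach the actual string data
 * through one extra pointer hop, so pointing *strs_data_ptr at the
 * deduplicated buffer retargets the existing hashmap without
 * rebuilding it
 */
static size_t sketch_str_hash_fn(const void *key, void *ctx)
{
	const struct btf *btf = ctx;
	const char *strs = *btf->strs_data_ptr;

	return str_hash(strs + (size_t)key);
}

static bool sketch_str_equal_fn(const void *key1, const void *key2, void *ctx)
{
	const struct btf *btf = ctx;
	const char *strs = *btf->strs_data_ptr;

	return strcmp(strs + (size_t)key1, strs + (size_t)key2) == 0;
}
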
^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 11/11] tools/bpftool: add bpftool support for split BTF
  2020-10-29  0:59 ` [PATCH bpf-next 11/11] tools/bpftool: add bpftool support for split BTF Andrii Nakryiko
@ 2020-11-03  6:03   ` Song Liu
  0 siblings, 0 replies; 49+ messages in thread
From: Song Liu @ 2020-11-03  6:03 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, netdev, Alexei Starovoitov, daniel, Kernel Team



> On Oct 28, 2020, at 5:59 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> 
> Add the ability to work with split BTF by providing an extra -B flag, which
> allows specifying the path to the base BTF file.
> 
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
> tools/bpf/bpftool/btf.c  |  9 ++++++---
> tools/bpf/bpftool/main.c | 15 ++++++++++++++-
> tools/bpf/bpftool/main.h |  1 +
> 3 files changed, 21 insertions(+), 4 deletions(-)
> 
> diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
> index 8ab142ff5eac..c96b56e8e3a4 100644
> --- a/tools/bpf/bpftool/btf.c
> +++ b/tools/bpf/bpftool/btf.c
> @@ -358,8 +358,12 @@ static int dump_btf_raw(const struct btf *btf,
> 		}
> 	} else {
> 		int cnt = btf__get_nr_types(btf);
> +		int start_id = 1;
> 
> -		for (i = 1; i <= cnt; i++) {
> +		if (base_btf)
> +			start_id = btf__get_nr_types(base_btf) + 1;
> +
> +		for (i = start_id; i <= cnt; i++) {
> 			t = btf__type_by_id(btf, i);
> 			dump_btf_type(btf, i, t);
> 		}
> @@ -438,7 +442,6 @@ static int do_dump(int argc, char **argv)
> 		return -1;
> 	}
> 	src = GET_ARG();
> -
> 	if (is_prefix(src, "map")) {
> 		struct bpf_map_info info = {};
> 		__u32 len = sizeof(info);
> @@ -499,7 +502,7 @@ static int do_dump(int argc, char **argv)
> 		}
> 		NEXT_ARG();
> 	} else if (is_prefix(src, "file")) {
> -		btf = btf__parse(*argv, NULL);
> +		btf = btf__parse_split(*argv, base_btf);
> 		if (IS_ERR(btf)) {
> 			err = -PTR_ERR(btf);
> 			btf = NULL;
> diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
> index 682daaa49e6a..b86f450e6fce 100644
> --- a/tools/bpf/bpftool/main.c
> +++ b/tools/bpf/bpftool/main.c
> @@ -11,6 +11,7 @@
> 
> #include <bpf/bpf.h>
> #include <bpf/libbpf.h>
> +#include <bpf/btf.h>
> 
> #include "main.h"
> 
> @@ -28,6 +29,7 @@ bool show_pinned;
> bool block_mount;
> bool verifier_logs;
> bool relaxed_maps;
> +struct btf *base_btf;
> struct pinned_obj_table prog_table;
> struct pinned_obj_table map_table;
> struct pinned_obj_table link_table;
> @@ -391,6 +393,7 @@ int main(int argc, char **argv)
> 		{ "mapcompat",	no_argument,	NULL,	'm' },
> 		{ "nomount",	no_argument,	NULL,	'n' },
> 		{ "debug",	no_argument,	NULL,	'd' },
> +		{ "base-btf",	required_argument, NULL, 'B' },
> 		{ 0 }
> 	};
> 	int opt, ret;
> @@ -407,7 +410,7 @@ int main(int argc, char **argv)
> 	hash_init(link_table.table);
> 
> 	opterr = 0;
> -	while ((opt = getopt_long(argc, argv, "Vhpjfmnd",
> +	while ((opt = getopt_long(argc, argv, "VhpjfmndB:",
> 				  options, NULL)) >= 0) {
> 		switch (opt) {
> 		case 'V':
> @@ -441,6 +444,15 @@ int main(int argc, char **argv)
> 			libbpf_set_print(print_all_levels);
> 			verifier_logs = true;
> 			break;
> +		case 'B':
> +			base_btf = btf__parse(optarg, NULL);
> +			if (libbpf_get_error(base_btf)) {
> +				p_err("failed to parse base BTF at '%s': %ld\n",
> +				      optarg, libbpf_get_error(base_btf));
> +				base_btf = NULL;
> +				return -1;
> +			}
> +			break;
> 		default:
> 			p_err("unrecognized option '%s'", argv[optind - 1]);
> 			if (json_output)
> @@ -465,6 +477,7 @@ int main(int argc, char **argv)
> 		delete_pinned_obj_table(&map_table);
> 		delete_pinned_obj_table(&link_table);
> 	}
> +	btf__free(base_btf);
> 
> 	return ret;
> }
> diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
> index c46e52137b87..76e91641262b 100644
> --- a/tools/bpf/bpftool/main.h
> +++ b/tools/bpf/bpftool/main.h
> @@ -90,6 +90,7 @@ extern bool show_pids;
> extern bool block_mount;
> extern bool verifier_logs;
> extern bool relaxed_maps;
> +extern struct btf *base_btf;
> extern struct pinned_obj_table prog_table;
> extern struct pinned_obj_table map_table;
> extern struct pinned_obj_table link_table;
> -- 
> 2.24.1
> 

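For reference, a hypothetical invocation of the new flag (paths are
examples only):

  bpftool -B /sys/kernel/btf/vmlinux btf dump file module_split.btf format raw

Without -B, a split BTF file has no way to resolve the type IDs that
live in its base BTF.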

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 10/11] selftests/bpf: add split BTF dedup selftests
  2020-11-03  5:35   ` Song Liu
@ 2020-11-03  6:05     ` Andrii Nakryiko
  2020-11-03  6:30       ` Song Liu
  0 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-11-03  6:05 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, netdev, Alexei Starovoitov, daniel, Kernel Team

On Mon, Nov 2, 2020 at 9:35 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Oct 28, 2020, at 5:59 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> >
> > Add selftests validating BTF deduplication for the split BTF case. Add a helper
> > macro that allows validating the entire BTF with a raw BTF dump, not just
> > type-by-type. This saves tons of code and complexity.
> >
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>
> Acked-by: Song Liu <songliubraving@fb.com>
>
> with a couple nits:
>
> [...]
>
> >
> > int fprintf_btf_type_raw(FILE *out, const struct btf *btf, __u32 id);
> > const char *btf_type_raw_dump(const struct btf *btf, int type_id);
> > +int btf_validate_raw(struct btf *btf, int nr_types, const char *exp_types[]);
> >
> > +#define VALIDATE_RAW_BTF(btf, raw_types...)                          \
> > +     btf_validate_raw(btf,                                           \
> > +                      sizeof((const char *[]){raw_types})/sizeof(void *),\
> > +                      (const char *[]){raw_types})
> > +
> > +const char *btf_type_c_dump(const struct btf *btf);
> > #endif
> > diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
> > new file mode 100644
> > index 000000000000..097370a41b60
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
> > @@ -0,0 +1,326 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2020 Facebook */
> > +#include <test_progs.h>
> > +#include <bpf/btf.h>
> > +#include "btf_helpers.h"
> > +
> > +
> > +static void test_split_simple() {
> > +     const struct btf_type *t;
> > +     struct btf *btf1, *btf2 = NULL;
> > +     int str_off, err;
> > +
> > +     btf1 = btf__new_empty();
> > +     if (!ASSERT_OK_PTR(btf1, "empty_main_btf"))
> > +             return;
> > +
> > +     btf__set_pointer_size(btf1, 8); /* enforce 64-bit arch */
> > +
> > +     btf__add_int(btf1, "int", 4, BTF_INT_SIGNED);   /* [1] int */
> > +     btf__add_ptr(btf1, 1);                          /* [2] ptr to int */
> > +     btf__add_struct(btf1, "s1", 4);                 /* [3] struct s1 { */
> > +     btf__add_field(btf1, "f1", 1, 0, 0);            /*      int f1; */
> > +                                                     /* } */
> > +
>
> nit: two empty lines.

There is a comment on one of them, so I figured it's not an empty line?

>
> > +     VALIDATE_RAW_BTF(
> > +             btf1,
> > +             "[1] INT 'int' size=4 bits_offset=0 nr_bits=32 encoding=SIGNED",
> > +             "[2] PTR '(anon)' type_id=1",
> > +             "[3] STRUCT 's1' size=4 vlen=1\n"
> > +             "\t'f1' type_id=1 bits_offset=0");
> > +
>
> [...]
>
> > +
> > +cleanup:
> > +     btf__free(btf2);
> > +     btf__free(btf1);
> > +}
> > +
> > +static void test_split_struct_duped() {
> > +     struct btf *btf1, *btf2 = NULL;
>
> nit: No need to initialize btf2, for all 3 tests.

yep, fixed all three

>
> > +     int err;
> > +
> [...]
>

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs
  2020-11-03  5:10   ` Alexei Starovoitov
@ 2020-11-03  6:27     ` Andrii Nakryiko
  2020-11-03 17:55       ` Alexei Starovoitov
  0 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-11-03  6:27 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Mon, Nov 2, 2020 at 9:10 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Oct 28, 2020 at 05:58:59PM -0700, Andrii Nakryiko wrote:
> > @@ -2942,6 +2948,13 @@ struct btf_dedup {
> >       __u32 *hypot_list;
> >       size_t hypot_cnt;
> >       size_t hypot_cap;
> > +     /* Whether hypothetical mapping, if successful, would need to adjust
> > +      * already canonicalized types (due to a new forward declaration to
> > +      * concrete type resolution). In such a case, during split BTF dedup
> > +      * the candidate type would still be considered different, because base
> > +      * BTF is considered to be immutable.
> > +      */
> > +     bool hypot_adjust_canon;
>
> Why is one flag per dedup session enough?

So the entire hypot_xxx state is reset before each struct/union type
graph equivalence check. Then, for each struct/union type, we might do
potentially many type graph equivalence checks, one against each
potential canonical (already deduplicated) struct. Let's keep that in
mind for the answer below.

> Don't you have a case where some fwds are pointing to base btf and shouldn't
> be adjusted while some are in split btf and should be?
> It seems that when this flag is set to true it will miss fwds in split btf?

So keeping the above note in mind, let's think about this case. You
are saying that some FWDs would have candidates in base BTF, right?
That means that the canonical type we are checking equivalence against
has to be in the base BTF. That also means that all the canonical type
graph types are in the base BTF, right? Because no base BTF type can
reference types from split BTF. This, subsequently, means that no FWDs
from split BTF graph could have canonical matching types in split BTF,
because we are comparing split types against only base BTF types.

With that, if hypot_adjust_canon is triggered, the *entire graph*
shouldn't be matched. No single type in that (connected) graph should
be matched to base BTF. We essentially pretend that the canonical type
doesn't even exist for us (modulo the subtle bit of still recording
base BTF's FWD mapping to a concrete type in split BTF for FWD-to-FWD
resolution at the very end; we can ignore that here, though, it's
ephemeral bookkeeping discarded after dedup).

In your example you worry about resolving a FWD in split BTF to a concrete
type in split BTF. If that's possible (i.e., we have duplicates and
enough information to infer the FWD-to-STRUCT mapping), then we'll
have another canonical type to compare against, at which point we'll
establish the FWD-to-STRUCT mapping, as usual, and hypot_adjust_canon
will stay false (because we'll be staying with split BTF types only).

But honestly, with graphs it can get so complicated that I wouldn't be
surprised if I'm still missing something. So far, manually checking
the resulting BTF showed that the generated deduped BTF types look
correct. In the few cases where module BTFs had types duplicated from
vmlinux, I was able to easily find where exactly vmlinux had a FWD
while modules had the STRUCT/UNION.

But also, by being conservative with hypot_adjust_canon, the worst
case would be slight duplication of types, which is not the end of the
world. Everything will keep working, no data will be corrupted, libbpf
will still perform CO-RE relocations correctly (because the memory layout
of duplicated structs will be consistent across all copies, just like
it was with task_struct until ring_buffers were renamed).

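Schematically, the mechanics described above amount to something like
this sketch (helper names and exact conditions are approximate, not the
verbatim libbpf code):

/* inside the candidate-vs-canonical type graph walk */
if (canon_kind == BTF_KIND_FWD && cand_kind != BTF_KIND_FWD) {
	/* the canonical FWD would have to be resolved to the candidate
	 * STRUCT/UNION; if that FWD lives in immutable base BTF, poison
	 * this hypothetical mapping instead of adjusting base types
	 */
	if (canon_id < d->btf->start_id)
		d->hypot_adjust_canon = true;
}

/* in the caller, after the graphs were found structurally equivalent */
if (eq > 0 && !d->hypot_adjust_canon)
	btf_dedup_merge_hypot_map(d);	/* accept the mapping */
else
	btf_dedup_reset_hypot_map(d);	/* keep split types separate */
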
^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 10/11] selftests/bpf: add split BTF dedup selftests
  2020-11-03  6:05     ` Andrii Nakryiko
@ 2020-11-03  6:30       ` Song Liu
  0 siblings, 0 replies; 49+ messages in thread
From: Song Liu @ 2020-11-03  6:30 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, netdev, Alexei Starovoitov, daniel, Kernel Team



> On Nov 2, 2020, at 10:05 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
> On Mon, Nov 2, 2020 at 9:35 PM Song Liu <songliubraving@fb.com> wrote:
>> 
>> 
>> 
>>> On Oct 28, 2020, at 5:59 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
>>> 
>>> Add selftests validating BTF deduplication for the split BTF case. Add a helper
>>> macro that allows validating the entire BTF with a raw BTF dump, not just
>>> type-by-type. This saves tons of code and complexity.
>>> 
>>> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>> 
>> Acked-by: Song Liu <songliubraving@fb.com>
>> 
>> with a couple nits:
>> 
>> [...]
>> 
>>> 
>>> int fprintf_btf_type_raw(FILE *out, const struct btf *btf, __u32 id);
>>> const char *btf_type_raw_dump(const struct btf *btf, int type_id);
>>> +int btf_validate_raw(struct btf *btf, int nr_types, const char *exp_types[]);
>>> 
>>> +#define VALIDATE_RAW_BTF(btf, raw_types...)                          \
>>> +     btf_validate_raw(btf,                                           \
>>> +                      sizeof((const char *[]){raw_types})/sizeof(void *),\
>>> +                      (const char *[]){raw_types})
>>> +
>>> +const char *btf_type_c_dump(const struct btf *btf);
>>> #endif
>>> diff --git a/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
>>> new file mode 100644
>>> index 000000000000..097370a41b60
>>> --- /dev/null
>>> +++ b/tools/testing/selftests/bpf/prog_tests/btf_dedup_split.c
>>> @@ -0,0 +1,326 @@
>>> +// SPDX-License-Identifier: GPL-2.0
>>> +/* Copyright (c) 2020 Facebook */
>>> +#include <test_progs.h>
>>> +#include <bpf/btf.h>
>>> +#include "btf_helpers.h"
>>> +
>>> +
>>> +static void test_split_simple() {
>>> +     const struct btf_type *t;
>>> +     struct btf *btf1, *btf2 = NULL;
>>> +     int str_off, err;
>>> +
>>> +     btf1 = btf__new_empty();
>>> +     if (!ASSERT_OK_PTR(btf1, "empty_main_btf"))
>>> +             return;
>>> +
>>> +     btf__set_pointer_size(btf1, 8); /* enforce 64-bit arch */
>>> +
>>> +     btf__add_int(btf1, "int", 4, BTF_INT_SIGNED);   /* [1] int */
>>> +     btf__add_ptr(btf1, 1);                          /* [2] ptr to int */
>>> +     btf__add_struct(btf1, "s1", 4);                 /* [3] struct s1 { */
>>> +     btf__add_field(btf1, "f1", 1, 0, 0);            /*      int f1; */
>>> +                                                     /* } */
>>> +
>> 
>> nit: two empty lines.
> 
> There is a comment on one of them, so I figured it's not an empty line?

Exactly! I missed that one. 

[...]

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs
  2020-11-03  5:59       ` Song Liu
@ 2020-11-03  6:31         ` Andrii Nakryiko
  2020-11-03 17:15           ` Song Liu
  0 siblings, 1 reply; 49+ messages in thread
From: Andrii Nakryiko @ 2020-11-03  6:31 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, netdev, Alexei Starovoitov, daniel, Kernel Team

On Mon, Nov 2, 2020 at 9:59 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Nov 2, 2020, at 9:25 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > On Mon, Nov 2, 2020 at 6:49 PM Song Liu <songliubraving@fb.com> wrote:
> >>
> >>
> >>
> >>> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> >>>
> >>> [...]
> >>
> >> Acked-by: Song Liu <songliubraving@fb.com>
> >>
> >> With some nits:
> >>
> >>> ---
> >>
> >> [...]
> >>
> >>>
> >>>      /* remap string offsets */
> >>>      err = btf_for_each_str_off(d, strs_dedup_remap_str_off, d);
> >>> @@ -3553,6 +3582,63 @@ static bool btf_compat_fnproto(struct btf_type *t1, struct btf_type *t2)
> >>>      return true;
> >>> }
> >>>
> >>
> >> An overview comment about btf_dedup_prep() will be great.
> >
> > ok
> >
> >>
> >>> +static int btf_dedup_prep(struct btf_dedup *d)
> >>> +{
> >>> +     struct btf_type *t;
> >>> +     int type_id;
> >>> +     long h;
> >>> +
> >>> +     if (!d->btf->base_btf)
> >>> +             return 0;
> >>> +
> >>> +     for (type_id = 1; type_id < d->btf->start_id; type_id++)
> >>> +     {
> >>
> >> Move "{" to previous line?
> >
> > yep, my bad
> >
> >>
> >>> +             t = btf_type_by_id(d->btf, type_id);
> >>> +
> >>> +             /* all base BTF types are self-canonical by definition */
> >>> +             d->map[type_id] = type_id;
> >>> +
> >>> +             switch (btf_kind(t)) {
> >>> +             case BTF_KIND_VAR:
> >>> +             case BTF_KIND_DATASEC:
> >>> +                     /* VAR and DATASEC are never hash/deduplicated */
> >>> +                     continue;
> >>
> >> [...]
> >>
> >>>      /* we are going to reuse hypot_map to store compaction remapping */
> >>>      d->hypot_map[0] = 0;
> >>> -     for (i = 1; i <= d->btf->nr_types; i++)
> >>> -             d->hypot_map[i] = BTF_UNPROCESSED_ID;
> >>> +     /* base BTF types are not renumbered */
> >>> +     for (id = 1; id < d->btf->start_id; id++)
> >>> +             d->hypot_map[id] = id;
> >>> +     for (i = 0, id = d->btf->start_id; i < d->btf->nr_types; i++, id++)
> >>> +             d->hypot_map[id] = BTF_UNPROCESSED_ID;
> >>
> >> We don't really need i in the loop, shall we just do
> >>        for (id = d->btf->start_id; id < d->btf->start_id + d->btf->nr_types; id++)
> >> ?
> >>
> >
> > I prefer the loop with i iterating over the count of types; it seems
> > more "obviously correct". For a simple loop like this I could do
> >
> > for (i = 0; i < d->btf->nr_types; i++)
> >    d->hypot_map[d->btf->start_id + i] = ...;
> >
> > But for the more complicated one below I found that maintaining id as
> > part of the for loop control block is a bit cleaner. So I just stuck
> > to a consistent pattern across all of them.
>
> How about
>
>         for (i = 0; i < d->btf->nr_types; i++) {
>                 id = d->btf->start_id + i;
>                 ...
> ?

this would be excessive for that single-line for loop. I'd really like
to keep it consistent and confined within the for () block.

>
> I would expect a for loop with two loop variables to do some tricks, like two
> termination conditions, or a conditional id++ somewhere in the loop.

Libbpf already uses such two-variable loops for things like iterating
over btf_type's members, enums, func args, etc. So it's not an
entirely alien construct. I really appreciate you trying to keep the
code as simple and clean as possible, but I think it's pretty
straightforward in this case and there's no need to simplify it
further.

>
> Thanks,
> Song
>

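For reference, the member-iteration idiom mentioned above looks roughly
like this (btf_vlen()/btf_members() are existing accessors; the loop
body is a placeholder):

const struct btf_member *m;
__u16 vlen = btf_vlen(t);
int i;

for (i = 0, m = btf_members(t); i < vlen; i++, m++)
	process_member(btf, m);	/* placeholder */
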
^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs
  2020-11-03  6:31         ` Andrii Nakryiko
@ 2020-11-03 17:15           ` Song Liu
  0 siblings, 0 replies; 49+ messages in thread
From: Song Liu @ 2020-11-03 17:15 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, netdev, Alexei Starovoitov, daniel, Kernel Team



> On Nov 2, 2020, at 10:31 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> 
> On Mon, Nov 2, 2020 at 9:59 PM Song Liu <songliubraving@fb.com> wrote:
>> 
>> 
>> 
>>> On Nov 2, 2020, at 9:25 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
>>> 
>>> On Mon, Nov 2, 2020 at 6:49 PM Song Liu <songliubraving@fb.com> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
>>>>> 
>>>>> Add support for deduplication split BTFs. When deduplicating split BTF, base
>>>>> BTF is considered to be immutable and can't be modified or adjusted. 99% of
>>>>> BTF deduplication logic is left intact (module some type numbering adjustments).
>>>>> There are only two differences.
>>>>> 
>>>>> First, each type in base BTF gets hashed (expect VAR and DATASEC, of course,
>>>>> those are always considered to be self-canonical instances) and added into
>>>>> a table of canonical table candidates. Hashing is a shallow, fast operation,
>>>>> so mostly eliminates the overhead of having entire base BTF to be a part of
>>>>> BTF dedup.
>>>>> 
>>>>> The second difference is critical and subtle. While deduplicating split BTF
>>>>> types, it is possible to discover that one of the immutable base BTF
>>>>> BTF_KIND_FWD types can and should be resolved to a full STRUCT/UNION type from
>>>>> the split BTF part. This, obviously, can't happen, because we can't modify
>>>>> base BTF types anymore. Because of that, any type in split BTF that directly
>>>>> or indirectly references such a newly-to-be-resolved FWD type can't be
>>>>> considered equivalent to the corresponding canonical types in base BTF, as
>>>>> that would result in a loss of type resolution information. So in such a case,
>>>>> split BTF types will be deduplicated separately and will cause some
>>>>> duplication of type information, which is unavoidable.
>>>>> 
>>>>> With those two changes, the rest of the algorithm manages to deduplicate split
>>>>> BTF correctly, pointing all the duplicates to their canonical counterparts in
>>>>> base BTF, while also deduplicating whatever unique types are present in split
>>>>> BTF on their own.
>>>>> 
>>>>> Also, theoretically, split BTF after deduplication could end up with either an
>>>>> empty type section or an empty string section. This is handled correctly by
>>>>> libbpf in one of the previous patches in the series.
>>>>> 
>>>>> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>>>> 
>>>> Acked-by: Song Liu <songliubraving@fb.com>
>>>> 
>>>> With some nits:
>>>> 
>>>>> ---
>>>> 
>>>> [...]
>>>> 
>>>>> 
>>>>>     /* remap string offsets */
>>>>>     err = btf_for_each_str_off(d, strs_dedup_remap_str_off, d);
>>>>> @@ -3553,6 +3582,63 @@ static bool btf_compat_fnproto(struct btf_type *t1, struct btf_type *t2)
>>>>>     return true;
>>>>> }
>>>>> 
>>>> 
>>>> An overview comment about btf_dedup_prep() would be great.
>>> 
>>> ok
>>> 
>>>> 
>>>>> +static int btf_dedup_prep(struct btf_dedup *d)
>>>>> +{
>>>>> +     struct btf_type *t;
>>>>> +     int type_id;
>>>>> +     long h;
>>>>> +
>>>>> +     if (!d->btf->base_btf)
>>>>> +             return 0;
>>>>> +
>>>>> +     for (type_id = 1; type_id < d->btf->start_id; type_id++)
>>>>> +     {
>>>> 
>>>> Move "{" to previous line?
>>> 
>>> yep, my bad
>>> 
>>>> 
>>>>> +             t = btf_type_by_id(d->btf, type_id);
>>>>> +
>>>>> +             /* all base BTF types are self-canonical by definition */
>>>>> +             d->map[type_id] = type_id;
>>>>> +
>>>>> +             switch (btf_kind(t)) {
>>>>> +             case BTF_KIND_VAR:
>>>>> +             case BTF_KIND_DATASEC:
>>>>> +                     /* VAR and DATASEC are never hash/deduplicated */
>>>>> +                     continue;
>>>> 
>>>> [...]
>>>> 
>>>>>     /* we are going to reuse hypot_map to store compaction remapping */
>>>>>     d->hypot_map[0] = 0;
>>>>> -     for (i = 1; i <= d->btf->nr_types; i++)
>>>>> -             d->hypot_map[i] = BTF_UNPROCESSED_ID;
>>>>> +     /* base BTF types are not renumbered */
>>>>> +     for (id = 1; id < d->btf->start_id; id++)
>>>>> +             d->hypot_map[id] = id;
>>>>> +     for (i = 0, id = d->btf->start_id; i < d->btf->nr_types; i++, id++)
>>>>> +             d->hypot_map[id] = BTF_UNPROCESSED_ID;
>>>> 
>>>> We don't really need i in the loop; shall we just do
>>>>       for (id = d->btf->start_id; id < d->btf->start_id + d->btf->nr_types; id++)
>>>> ?
>>>> 
>>> 
>>> I prefer the loop with i iterating over the count of types; it seems
>>> more "obviously correct". For a simple loop like this I could do
>>> 
>>> for (i = 0; i < d->btf->nr_types; i++)
>>>   d->hypot_map[d->start_id + i] = ...;
>>> 
>>> But for the more complicated one below I found that maintaining id as
>>> part of the for loop control block is a bit cleaner. So I just stuck
>>> to the consistent pattern across all of them.
>> 
>> How about
>> 
>>        for (i = 0; i < d->btf->nr_types; i++) {
>>                id = d->start_id + i;
>>                ...
>> ?
> 
> this would be excessive for that single-line for loop. I'd really like
> to keep it consistent and confined within the for () block.
> 
>> 
>> I would expect a for loop with two loop variables to do some tricks, like two
>> termination conditions, or a conditional id++ somewhere in the loop.
> 
> Libbpf already uses such two-variable loops for things like iterating
> over btf_type's members, enums, func args, etc. So it's not an
> entirely alien construct. I really appreciate you trying to keep the
> code as simple and clean as possible, but I think it's pretty
> straightforward in this case and there's no need to simplify it
> further.

No problem. It was just a nitpick. The loop is totally fine as is. 

Thanks,
Song
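
(For reference, the overview comment requested above might read roughly as
follows; this is a sketch, not necessarily the wording that landed:)

	/* Collect all immutable base BTF types into the dedup state.
	 *
	 * Each base BTF type is self-canonical by definition, so it is
	 * mapped to itself. Every base type, except VAR and DATASEC (which
	 * are never deduplicated), is also shallow-hashed into the table of
	 * canonical type candidates, so that split BTF types can be
	 * deduplicated against base BTF types.
	 */
	static int btf_dedup_prep(struct btf_dedup *d)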

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs
  2020-11-03  6:27     ` Andrii Nakryiko
@ 2020-11-03 17:55       ` Alexei Starovoitov
  0 siblings, 0 replies; 49+ messages in thread
From: Alexei Starovoitov @ 2020-11-03 17:55 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Mon, Nov 02, 2020 at 10:27:20PM -0800, Andrii Nakryiko wrote:
> On Mon, Nov 2, 2020 at 9:10 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Wed, Oct 28, 2020 at 05:58:59PM -0700, Andrii Nakryiko wrote:
> > > @@ -2942,6 +2948,13 @@ struct btf_dedup {
> > >       __u32 *hypot_list;
> > >       size_t hypot_cnt;
> > >       size_t hypot_cap;
> > > +     /* Whether hypothetical mapping, if successful, would need to adjust
> > > +      * already canonicalized types (due to a new forward declaration to
> > > +      * concrete type resolution). In such a case, during split BTF dedup
> > > +      * the candidate type would still be considered different, because base
> > > +      * BTF is considered to be immutable.
> > > +      */
> > > +     bool hypot_adjust_canon;
> >
> > why is one flag per dedup session enough?
> 
> So the entire hypot_xxx state is reset before each struct/union type
> graph equivalence check. Then for each struct/union type we might do
> potentially many type graph equivalence checks against each of
> potential canonical (already deduplicated) struct. Let's keep that in
> mind for the answer below.
> 
> > Don't you have a case where some FWDs point to base BTF and shouldn't
> > be adjusted, while some are in split BTF and should be?
> > It seems that when this flag is set to true it will miss FWDs in split BTF?
> 
> So keeping the above note in mind, let's think about this case. You
> are saying that some FWDs would have candidates in base BTF, right?
> That means that the canonical type we are checking equivalence against
> has to be in the base BTF. That also means that all the canonical type
> graph types are in the base BTF, right? Because no base BTF type can
> reference types from split BTF. This, subsequently, means that no FWDs
> from the split BTF graph could have matching canonical types in split BTF,
> because we are comparing split types only against base BTF types.
> 
> With that, if hypot_adjust_canon is triggered, the *entire graph*
> shouldn't be matched. No single type in that (connected) graph should
> be matched to base BTF. We essentially pretend that the canonical type
> doesn't even exist for us (modulo the subtle bit of still recording
> base BTF's FWD mapping to a concrete type in split BTF for FWD-to-FWD
> resolution at the very end; we can ignore that here, though, as it's
> ephemeral bookkeeping discarded after dedup).
> 
> In your example you worry about resolving a FWD in split BTF to a concrete
> type in split BTF. If that's possible (i.e., we have duplicates and
> enough information to infer the FWD-to-STRUCT mapping), then we'll
> have another canonical type to compare against, at which point we'll
> establish the FWD-to-STRUCT mapping, as usual, and hypot_adjust_canon
> will stay false (because we'll be staying with split BTF types only).
> 
> But honestly, with graphs it can get so complicated that I wouldn't be
> surprised if I'm still missing something. So far, manually checking
> the resulting BTF showed that the generated deduped BTF types look
> correct. In the few cases where module BTFs had types duplicated from
> vmlinux, I was able to easily find where exactly vmlinux had a FWD
> while modules had a STRUCT/UNION.
> 
> But also, by being conservative with hypot_adjust_canon, the worst
> case would be slight duplication of types, which is not the end of the
> world. Everything will keep working, no data will be corrupted, libbpf
> will still perform CO-RE relocation correctly (because memory layout
> of duplicated structs will be consistent across all copies, just like
> it was with task_struct until ring_buffers were renamed).

Yes. That last part is comforting. The explanation also makes sense.
Not worried about it anymore.
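
(To make the scenario concrete, here is a hypothetical sketch, with type IDs
invented for illustration:

	base BTF (immutable):        split BTF (start_id == 100):
	  [5] FWD 'foo'                [100] STRUCT 'foo' { ... }
	  [6] PTR -> [5]               [101] PTR -> [100]

If dedup discovers that base [5] FWD 'foo' should resolve to split [100]
STRUCT 'foo', it can't rewrite the immutable base BTF; hypot_adjust_canon
gets set, [101] is not mapped to base [6], and the split types stay
un-deduplicated against base BTF, at the cost of some duplicated type
information.)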

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH bpf-next 04/11] libbpf: implement basic split BTF support
  2020-11-03  5:41       ` Song Liu
@ 2020-11-04 23:51         ` Andrii Nakryiko
  0 siblings, 0 replies; 49+ messages in thread
From: Andrii Nakryiko @ 2020-11-04 23:51 UTC (permalink / raw)
  To: Song Liu
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Mon, Nov 2, 2020 at 9:41 PM Song Liu <songliubraving@fb.com> wrote:
>
>
>
> > On Nov 2, 2020, at 9:02 PM, Andrii Nakryiko <andrii.nakryiko@gmail.com> wrote:
> >
> > On Mon, Nov 2, 2020 at 3:24 PM Song Liu <songliubraving@fb.com> wrote:
> >>
> >>
> >>
> >>> On Oct 28, 2020, at 5:58 PM, Andrii Nakryiko <andrii@kernel.org> wrote:
> >>>
> >>
> >> [...]
> >>
> >>>
> >>> BTF deduplication is not yet supported for split BTF and support for it will
> >>> be added in separate patch.
> >>>
> >>> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> >>
> >> Acked-by: Song Liu <songliubraving@fb.com>
> >>
> >> With a couple nits:
> >>
> >>> ---
> >>> tools/lib/bpf/btf.c      | 205 ++++++++++++++++++++++++++++++---------
> >>> tools/lib/bpf/btf.h      |   8 ++
> >>> tools/lib/bpf/libbpf.map |   9 ++
> >>> 3 files changed, 175 insertions(+), 47 deletions(-)
> >>>
> >>> diff --git a/tools/lib/bpf/btf.c b/tools/lib/bpf/btf.c
> >>> index db9331fea672..20c64a8441a8 100644
> >>> --- a/tools/lib/bpf/btf.c
> >>> +++ b/tools/lib/bpf/btf.c
> >>> @@ -78,10 +78,32 @@ struct btf {
> >>>      void *types_data;
> >>>      size_t types_data_cap; /* used size stored in hdr->type_len */
> >>>
> >>> -     /* type ID to `struct btf_type *` lookup index */
> >>> +     /* type ID to `struct btf_type *` lookup index
> >>> +      * type_offs[0] corresponds to the first non-VOID type:
> >>> +      *   - for base BTF it's type [1];
> >>> +      *   - for split BTF it's the first non-base BTF type.
> >>> +      */
> >>>      __u32 *type_offs;
> >>>      size_t type_offs_cap;
> >>> +     /* number of types in this BTF instance:
> >>> +      *   - doesn't include special [0] void type;
> >>> +      *   - for split BTF counts number of types added on top of base BTF.
> >>> +      */
> >>>      __u32 nr_types;
> >>
> >> This is a little confusing. Maybe add a void type for every split BTF?
> >
> > Agreed, it is a bit confusing. But I don't want VOID in every BTF;
> > that seems sloppy (there's no continuity). I'm currently doing similar
> > changes on the kernel side, and so far everything also works cleanly with
> > start_id == 0 && nr_types including VOID (for base BTF), and start_id
> > == base_btf->nr_types && nr_types covering all the added types (for split
> > BTF). That seems a bit more straightforward, so I'll probably do that
> > here as well (unless I'm missing something; I'll double-check).
>
> That sounds good.

So I don't think I can do that in the libbpf representation,
unfortunately. I did miss something, it turns out. The difference is
that in the kernel BTF is always immutable, so we can store stable
pointers for id -> btf_type lookups. In libbpf, BTF can be modified, so
pointers could be invalidated. So instead I store offsets relative to
the beginning of the type data array. With such a representation,
having VOID as element #0 is trickier (I actually tried, but it's too
cumbersome). So this representation will have to be slightly different
between the kernel and libbpf. But that's ok, because it's just an
internal implementation detail; the API abstracts all of that.
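
To sketch what the offset-based lookup ends up looking like (simplified, and
assuming the fields discussed above rather than the exact patch code):

	static struct btf_type *btf_type_by_id(struct btf *btf, __u32 type_id)
	{
		if (type_id == 0)
			return &btf_void; /* shared static VOID type */
		if (type_id < btf->start_id)
			/* ID belongs to base BTF; delegate the lookup */
			return btf_type_by_id(btf->base_btf, type_id);
		/* offsets stay valid even if types_data is reallocated */
		return btf->types_data + btf->type_offs[type_id - btf->start_id];
	}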

>
> >
> >>
> >>> +     /* if not NULL, points to the base BTF on top of which the current
> >>> +      * split BTF is based
> >>> +      */
> >>
> >> [...]
> >>
> >>>
> >>> @@ -252,12 +274,20 @@ static int btf_parse_str_sec(struct btf *btf)
> >>>      const char *start = btf->strs_data;
> >>>      const char *end = start + btf->hdr->str_len;
> >>>
> >>> -     if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET ||
> >>> -         start[0] || end[-1]) {
> >>> -             pr_debug("Invalid BTF string section\n");
> >>> -             return -EINVAL;
> >>> +     if (btf->base_btf) {
> >>> +             if (hdr->str_len == 0)
> >>> +                     return 0;
> >>> +             if (hdr->str_len - 1 > BTF_MAX_STR_OFFSET || end[-1]) {
> >>> +                     pr_debug("Invalid BTF string section\n");
> >>> +                     return -EINVAL;
> >>> +             }
> >>> +     } else {
> >>> +             if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET ||
> >>> +                 start[0] || end[-1]) {
> >>> +                     pr_debug("Invalid BTF string section\n");
> >>> +                     return -EINVAL;
> >>> +             }
> >>>      }
> >>> -
> >>>      return 0;
> >>
> >> I found this function a little difficult to follow. Maybe rearrange it as
> >>
> >>        /* too long, or not \0 terminated */
> >>        if (hdr->str_len - 1 > BTF_MAX_STR_OFFSET || end[-1])
> >>                goto err_out;
> >
> > this won't work if str_len == 0: str_len - 1 will underflow, and
> > end[-1] will read garbage.
> >
> > How about this:
> >
> > if (btf->base_btf && hdr->str_len == 0)
> >    return 0;
> >
> > if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET || end[-1])
> >    return -EINVAL;
> >
> > if (!btf->base_btf && start[0])
> >    return -EINVAL;
> >
> > return 0;
> >
> > This seems more straightforward, right?
>
> Yeah, I like this version. BTW, a short comment for each condition would be
> helpful.
>
>
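
Folding Song's suggested per-condition comments into the agreed-upon shape,
the check might end up roughly like this (a sketch based on the snippets
above, not necessarily the final patch):

	/* split BTF is allowed to have a completely empty string section */
	if (btf->base_btf && hdr->str_len == 0)
		return 0;
	/* section must be non-empty, within limits, and \0-terminated */
	if (!hdr->str_len || hdr->str_len - 1 > BTF_MAX_STR_OFFSET || end[-1])
		return -EINVAL;
	/* only base BTF's strings must start with the empty string ("\0") */
	if (!btf->base_btf && start[0])
		return -EINVAL;
	return 0;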

^ permalink raw reply	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2020-11-04 23:53 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-29  0:58 [PATCH bpf-next 00/11] libbpf: split BTF support Andrii Nakryiko
2020-10-29  0:58 ` [PATCH bpf-next 01/11] libbpf: factor out common operations in BTF writing APIs Andrii Nakryiko
2020-10-30  0:36   ` Song Liu
2020-10-29  0:58 ` [PATCH bpf-next 02/11] selftest/bpf: relax btf_dedup test checks Andrii Nakryiko
2020-10-30 16:43   ` Song Liu
2020-10-30 18:44     ` Andrii Nakryiko
2020-10-30 22:30       ` Song Liu
2020-10-29  0:58 ` [PATCH bpf-next 03/11] libbpf: unify and speed up BTF string deduplication Andrii Nakryiko
2020-10-30 23:32   ` Song Liu
2020-11-03  4:51     ` Andrii Nakryiko
2020-11-03  4:59   ` Alexei Starovoitov
2020-11-03  6:01     ` Andrii Nakryiko
2020-10-29  0:58 ` [PATCH bpf-next 04/11] libbpf: implement basic split BTF support Andrii Nakryiko
2020-11-02 23:23   ` Song Liu
2020-11-03  5:02     ` Andrii Nakryiko
2020-11-03  5:41       ` Song Liu
2020-11-04 23:51         ` Andrii Nakryiko
2020-10-29  0:58 ` [PATCH bpf-next 05/11] selftests/bpf: add split BTF basic test Andrii Nakryiko
2020-11-02 23:36   ` Song Liu
2020-11-03  5:10     ` Andrii Nakryiko
2020-10-29  0:58 ` [PATCH bpf-next 06/11] selftests/bpf: add checking of raw type dump in BTF writer APIs selftests Andrii Nakryiko
2020-11-03  0:08   ` Song Liu
2020-11-03  5:14     ` Andrii Nakryiko
2020-10-29  0:58 ` [PATCH bpf-next 07/11] libbpf: fix BTF data layout checks and allow empty BTF Andrii Nakryiko
2020-11-03  0:51   ` Song Liu
2020-11-03  5:18     ` Andrii Nakryiko
2020-11-03  5:44       ` Song Liu
2020-10-29  0:58 ` [PATCH bpf-next 08/11] libbpf: support BTF dedup of split BTFs Andrii Nakryiko
2020-11-03  2:49   ` Song Liu
2020-11-03  5:25     ` Andrii Nakryiko
2020-11-03  5:59       ` Song Liu
2020-11-03  6:31         ` Andrii Nakryiko
2020-11-03 17:15           ` Song Liu
2020-11-03  5:10   ` Alexei Starovoitov
2020-11-03  6:27     ` Andrii Nakryiko
2020-11-03 17:55       ` Alexei Starovoitov
2020-10-29  0:59 ` [PATCH bpf-next 09/11] libbpf: accomodate DWARF/compiler bug with duplicated identical arrays Andrii Nakryiko
2020-11-03  2:52   ` Song Liu
2020-10-29  0:59 ` [PATCH bpf-next 10/11] selftests/bpf: add split BTF dedup selftests Andrii Nakryiko
2020-11-03  5:35   ` Song Liu
2020-11-03  6:05     ` Andrii Nakryiko
2020-11-03  6:30       ` Song Liu
2020-10-29  0:59 ` [PATCH bpf-next 11/11] tools/bpftool: add bpftool support for split BTF Andrii Nakryiko
2020-11-03  6:03   ` Song Liu
2020-10-30  0:33 ` [PATCH bpf-next 00/11] libbpf: split BTF support Song Liu
2020-10-30  2:33   ` Andrii Nakryiko
2020-10-30  6:45     ` Song Liu
2020-10-30 12:04     ` Alan Maguire
2020-10-30 18:30       ` Andrii Nakryiko
