All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH bpf-next v1 3/3] selftests/bpf: add test case for userspace and bpf type size mismatch
  2022-02-10  0:36 [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons Delyan Kratunov
@ 2022-02-10  0:36 ` Delyan Kratunov
  2022-02-10 22:44   ` Andrii Nakryiko
  2022-02-10  0:36 ` [PATCH bpf-next v1 2/3] bpftool: skeleton uses explicitly sized ints Delyan Kratunov
                   ` (3 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Delyan Kratunov @ 2022-02-10  0:36 UTC (permalink / raw)
  To: daniel, ast, andrii, bpf

Multiple test cases already fail if you add a type whose size is
different between userspace and bpf. That said, let's also add an
explicit test that ensures mis-sized reads/writes do not actually
happen. This test case fails before this patch series and passes after:

test_skeleton:FAIL:writes and reads match size unexpected writes
and reads match size: actual 3735928559 != expected 8030895855
test_skeleton:FAIL:skeleton uses underlying type unexpected
skeleton uses underlying type: actual 8 != expected 4

Signed-off-by: Delyan Kratunov <delyank@fb.com>
---
 tools/testing/selftests/bpf/prog_tests/skeleton.c | 6 ++++++
 tools/testing/selftests/bpf/progs/test_skeleton.c | 8 ++++++++
 2 files changed, 14 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/skeleton.c b/tools/testing/selftests/bpf/prog_tests/skeleton.c
index 9894e1b39211..bc07da929566 100644
--- a/tools/testing/selftests/bpf/prog_tests/skeleton.c
+++ b/tools/testing/selftests/bpf/prog_tests/skeleton.c
@@ -97,6 +97,9 @@ void test_skeleton(void)

 	skel->data_read_mostly->read_mostly_var = 123;

+	/* validate apparent 64-bit value is actually 32-bit */
+	skel->data->intest64 = (typeof(skel->data->intest64)) 0xdeadbeefdeadbeefULL;
+
 	err = test_skeleton__attach(skel);
 	if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))
 		goto cleanup;
@@ -126,6 +129,9 @@ void test_skeleton(void)
 	ASSERT_OK_PTR(elf_bytes, "elf_bytes");
 	ASSERT_GE(elf_bytes_sz, 0, "elf_bytes_sz");

+	ASSERT_EQ(skel->data->outtest64, skel->data->intest64, "writes and reads match size");
+	ASSERT_EQ(sizeof(skel->data->intest64), sizeof(u32), "skeleton uses underlying type");
+
 cleanup:
 	test_skeleton__destroy(skel);
 }
diff --git a/tools/testing/selftests/bpf/progs/test_skeleton.c b/tools/testing/selftests/bpf/progs/test_skeleton.c
index 1b1187d2967b..fd1f4910cf42 100644
--- a/tools/testing/selftests/bpf/progs/test_skeleton.c
+++ b/tools/testing/selftests/bpf/progs/test_skeleton.c
@@ -16,6 +16,13 @@ struct s {
 int in1 = -1;
 long long in2 = -1;

+/* declare the int64_t type to actually be 32-bit to ensure the skeleton
+ * uses actual sizes and doesn't just copy the type name
+ */
+typedef __s32 int64_t;
+int64_t intest64 = -1;
+int64_t outtest64 = -1;
+
 /* .bss section */
 char in3 = '\0';
 long long in4 __attribute__((aligned(64))) = 0;
@@ -62,6 +69,7 @@ int handler(const void *ctx)
 	out4 = in4;
 	out5 = in5;
 	out6 = in.in6;
+	outtest64 = intest64;

 	bpf_syscall = CONFIG_BPF_SYSCALL;
 	kern_ver = LINUX_KERNEL_VERSION;
--
2.34.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH bpf-next v1 2/3] bpftool: skeleton uses explicitly sized ints
  2022-02-10  0:36 [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons Delyan Kratunov
  2022-02-10  0:36 ` [PATCH bpf-next v1 3/3] selftests/bpf: add test case for userspace and bpf type size mismatch Delyan Kratunov
@ 2022-02-10  0:36 ` Delyan Kratunov
  2022-02-10 22:42   ` Andrii Nakryiko
  2022-02-10  0:36 ` [PATCH bpf-next v1 1/3] libbpf: btf_dump can produce " Delyan Kratunov
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 16+ messages in thread
From: Delyan Kratunov @ 2022-02-10  0:36 UTC (permalink / raw)
  To: daniel, ast, andrii, bpf

As reported in [0] and [1], kernel and userspace can sometimes disagree
on the definition of a typedef (in particular, the size).
This leads to trouble when userspace maps the memory of a bpf program
and reads/writes to it assuming a different memory layout.

This commit now uses the libbpf sized ints logic when emitting the
skeleton. This resolves int types to int32_t-like equivalents and
ensures that typedefs are not just emitted verbatim.

The drive-by selftest changes fix format specifier issues
due to the definitions of [us]64 and (u)int64_t differing in how
many longs they use (long long int vs long int on x86_64).

  [0]: https://github.com/iovisor/bcc/pull/3777
  [1]: Closes: https://github.com/libbpf/libbpf/issues/433

Signed-off-by: Delyan Kratunov <delyank@fb.com>
---
 tools/bpf/bpftool/gen.c                          |  3 +++
 .../testing/selftests/bpf/prog_tests/skeleton.c  | 16 ++++++++--------
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
index eacfc6a2060d..18c3f755ad88 100644
--- a/tools/bpf/bpftool/gen.c
+++ b/tools/bpf/bpftool/gen.c
@@ -146,6 +146,7 @@ static int codegen_datasec_def(struct bpf_object *obj,
 			.field_name = var_ident,
 			.indent_level = 2,
 			.strip_mods = strip_mods,
+			.sizedints = true,
 		);
 		int need_off = sec_var->offset, align_off, align;
 		__u32 var_type_id = var->type;
@@ -751,6 +752,7 @@ static int do_skeleton(int argc, char **argv)
 		#ifndef %2$s						    \n\
 		#define %2$s						    \n\
 									    \n\
+		#include <inttypes.h>					    \n\
 		#include <stdlib.h>					    \n\
 		#include <bpf/bpf.h>					    \n\
 		#include <bpf/skel_internal.h>				    \n\
@@ -770,6 +772,7 @@ static int do_skeleton(int argc, char **argv)
 		#define %2$s						    \n\
 									    \n\
 		#include <errno.h>					    \n\
+		#include <inttypes.h>					    \n\
 		#include <stdlib.h>					    \n\
 		#include <bpf/libbpf.h>					    \n\
 									    \n\
diff --git a/tools/testing/selftests/bpf/prog_tests/skeleton.c b/tools/testing/selftests/bpf/prog_tests/skeleton.c
index 180afd632f4c..9894e1b39211 100644
--- a/tools/testing/selftests/bpf/prog_tests/skeleton.c
+++ b/tools/testing/selftests/bpf/prog_tests/skeleton.c
@@ -43,13 +43,13 @@ void test_skeleton(void)
 	/* validate values are pre-initialized correctly */
 	CHECK(data->in1 != -1, "in1", "got %d != exp %d\n", data->in1, -1);
 	CHECK(data->out1 != -1, "out1", "got %d != exp %d\n", data->out1, -1);
-	CHECK(data->in2 != -1, "in2", "got %lld != exp %lld\n", data->in2, -1LL);
-	CHECK(data->out2 != -1, "out2", "got %lld != exp %lld\n", data->out2, -1LL);
+	CHECK(data->in2 != -1, "in2", "got %"PRId64" != exp %lld\n", data->in2, -1LL);
+	CHECK(data->out2 != -1, "out2", "got %"PRId64" != exp %lld\n", data->out2, -1LL);

 	CHECK(bss->in3 != 0, "in3", "got %d != exp %d\n", bss->in3, 0);
 	CHECK(bss->out3 != 0, "out3", "got %d != exp %d\n", bss->out3, 0);
-	CHECK(bss->in4 != 0, "in4", "got %lld != exp %lld\n", bss->in4, 0LL);
-	CHECK(bss->out4 != 0, "out4", "got %lld != exp %lld\n", bss->out4, 0LL);
+	CHECK(bss->in4 != 0, "in4", "got %"PRId64" != exp %lld\n", bss->in4, 0LL);
+	CHECK(bss->out4 != 0, "out4", "got %"PRId64" != exp %lld\n", bss->out4, 0LL);

 	CHECK(rodata->in.in6 != 0, "in6", "got %d != exp %d\n", rodata->in.in6, 0);
 	CHECK(bss->out6 != 0, "out6", "got %d != exp %d\n", bss->out6, 0);
@@ -77,9 +77,9 @@ void test_skeleton(void)

 	/* validate pre-setup values are still there */
 	CHECK(data->in1 != 10, "in1", "got %d != exp %d\n", data->in1, 10);
-	CHECK(data->in2 != 11, "in2", "got %lld != exp %lld\n", data->in2, 11LL);
+	CHECK(data->in2 != 11, "in2", "got %"PRId64" != exp %lld\n", data->in2, 11LL);
 	CHECK(bss->in3 != 12, "in3", "got %d != exp %d\n", bss->in3, 12);
-	CHECK(bss->in4 != 13, "in4", "got %lld != exp %lld\n", bss->in4, 13LL);
+	CHECK(bss->in4 != 13, "in4", "got %"PRId64" != exp %lld\n", bss->in4, 13LL);
 	CHECK(rodata->in.in6 != 14, "in6", "got %d != exp %d\n", rodata->in.in6, 14);

 	ASSERT_EQ(rodata_dyn->in_dynarr_sz, 4, "in_dynarr_sz");
@@ -105,9 +105,9 @@ void test_skeleton(void)
 	usleep(1);

 	CHECK(data->out1 != 1, "res1", "got %d != exp %d\n", data->out1, 1);
-	CHECK(data->out2 != 2, "res2", "got %lld != exp %d\n", data->out2, 2);
+	CHECK(data->out2 != 2, "res2", "got %"PRId64" != exp %d\n", data->out2, 2);
 	CHECK(bss->out3 != 3, "res3", "got %d != exp %d\n", (int)bss->out3, 3);
-	CHECK(bss->out4 != 4, "res4", "got %lld != exp %d\n", bss->out4, 4);
+	CHECK(bss->out4 != 4, "res4", "got %"PRId64" != exp %d\n", bss->out4, 4);
 	CHECK(bss->out5.a != 5, "res5", "got %d != exp %d\n", bss->out5.a, 5);
 	CHECK(bss->out5.b != 6, "res6", "got %lld != exp %d\n", bss->out5.b, 6);
 	CHECK(bss->out6 != 14, "res7", "got %d != exp %d\n", bss->out6, 14);
--
2.34.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH bpf-next v1 1/3] libbpf: btf_dump can produce explicitly sized ints
  2022-02-10  0:36 [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons Delyan Kratunov
  2022-02-10  0:36 ` [PATCH bpf-next v1 3/3] selftests/bpf: add test case for userspace and bpf type size mismatch Delyan Kratunov
  2022-02-10  0:36 ` [PATCH bpf-next v1 2/3] bpftool: skeleton uses explicitly sized ints Delyan Kratunov
@ 2022-02-10  0:36 ` Delyan Kratunov
  2022-02-10 22:35   ` Andrii Nakryiko
  2022-02-10 22:18 ` [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons Yonghong Song
  2022-02-10 22:51 ` Yonghong Song
  4 siblings, 1 reply; 16+ messages in thread
From: Delyan Kratunov @ 2022-02-10  0:36 UTC (permalink / raw)
  To: daniel, ast, andrii, bpf

When emitting type declations, btf_dump can now optionally rename
int types (including typedefs) to standard types with explicit sizes.
Types like pid_t get renamed but types like __u32, char, and _Bool
are left alone to preserve cast semantics in as many pre-existing
programs as possible.

This option is useful when generating data structures on a system where
types may differ due to arch differences or just userspace and bpf program
disagreeing on the definition of a typedef.

Signed-off-by: Delyan Kratunov <delyank@fb.com>
---
 tools/lib/bpf/btf.h      |  4 +-
 tools/lib/bpf/btf_dump.c | 80 +++++++++++++++++++++++++++++++++++++++-
 2 files changed, 82 insertions(+), 2 deletions(-)

diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index 951ac7475794..dbd41bf93b13 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -347,9 +347,11 @@ struct btf_dump_emit_type_decl_opts {
 	int indent_level;
 	/* strip all the const/volatile/restrict mods */
 	bool strip_mods;
+	/* normalize int fields to (u)?int(16|32|64)_t types */
+	bool sizedints;
 	size_t :0;
 };
-#define btf_dump_emit_type_decl_opts__last_field strip_mods
+#define btf_dump_emit_type_decl_opts__last_field sizedints

 LIBBPF_API int
 btf_dump__emit_type_decl(struct btf_dump *d, __u32 id,
diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c
index 07ebe70d3a30..56bafacf1cbd 100644
--- a/tools/lib/bpf/btf_dump.c
+++ b/tools/lib/bpf/btf_dump.c
@@ -81,6 +81,7 @@ struct btf_dump {
 	void *cb_ctx;
 	int ptr_sz;
 	bool strip_mods;
+	bool sizedints;
 	bool skip_anon_defs;
 	int last_id;

@@ -1130,7 +1131,9 @@ int btf_dump__emit_type_decl(struct btf_dump *d, __u32 id,
 	fname = OPTS_GET(opts, field_name, "");
 	lvl = OPTS_GET(opts, indent_level, 0);
 	d->strip_mods = OPTS_GET(opts, strip_mods, false);
+	d->sizedints = OPTS_GET(opts, sizedints, false);
 	btf_dump_emit_type_decl(d, id, fname, lvl);
+	d->sizedints = false;
 	d->strip_mods = false;
 	return 0;
 }
@@ -1263,6 +1266,34 @@ static void btf_dump_emit_name(const struct btf_dump *d,
 	btf_dump_printf(d, "%s%s", separate ? " " : "", name);
 }

+/* Encode custom heurstics to find char types since BTF_INT_CHAR is never set. */
+static bool btf_is_char(const struct btf_dump *d, const struct btf_type *t)
+{
+	return btf_is_int(t) &&
+	       t->size == 1 &&
+	       strcmp(btf_name_of(d, t->name_off), "char") == 0;
+}
+
+static bool btf_is_bool(const struct btf_type *t)
+{
+	return btf_is_int(t) && (btf_int_encoding(t) & BTF_INT_BOOL);
+}
+
+/* returns true if type is of the '__[su](8|16|32|64)' type */
+static bool btf_is_kernel_sizedint(const struct btf_dump *d, const struct btf_type *t)
+{
+	const char *name = btf_name_of(d, t->name_off);
+
+	return strcmp(name, "__s8") == 0 ||
+	       strcmp(name, "__u8") == 0 ||
+	       strcmp(name, "__s16") == 0 ||
+	       strcmp(name, "__u16") == 0 ||
+	       strcmp(name, "__s32") == 0 ||
+	       strcmp(name, "__u32") == 0 ||
+	       strcmp(name, "__s64") == 0 ||
+	       strcmp(name, "__u64") == 0;
+}
+
 static void btf_dump_emit_type_chain(struct btf_dump *d,
 				     struct id_stack *decls,
 				     const char *fname, int lvl)
@@ -1277,10 +1308,12 @@ static void btf_dump_emit_type_chain(struct btf_dump *d,
 	 * don't want to prepend space for that last pointer.
 	 */
 	bool last_was_ptr = true;
-	const struct btf_type *t;
+	const struct btf_type *t, *rest;
 	const char *name;
 	__u16 kind;
 	__u32 id;
+	__u8 intenc;
+	int restypeid;

 	while (decls->cnt) {
 		id = decls->ids[--decls->cnt];
@@ -1295,8 +1328,51 @@ static void btf_dump_emit_type_chain(struct btf_dump *d,
 		t = btf__type_by_id(d->btf, id);
 		kind = btf_kind(t);

+		/* If we're asked to produce stdint declarations, we need
+		 * to only do that in the following cases:
+		 *  - int types other than char and _Bool
+		 *  - typedefs to int types (including char and _Bool) except
+		 *    kernel types like __s16/__u32/etc.
+		 *
+		 * If a typedef resolves to char or _Bool, we do want to use
+		 * the resolved type instead of the stdint types (i.e. char
+		 * instead of int8_t) because the stdint types are explicitly
+		 * signed/unsigned, which affects pointer casts.
+		 *
+		 * If the typedef is of the __s32 variety, we leave it as-is
+		 * due to incompatibilities in e.g. s64 vs int64_t definitions
+		 * (one is `long long` on x86_64 and the other is not).
+		 *
+		 * Unfortunately, the BTF type info never includes BTF_INT_CHAR,
+		 * so we use a size comparison to avoid chars and
+		 * BTF_INT_BOOL to avoid bools.
+		 */
+		if (d->sizedints && kind == BTF_KIND_TYPEDEF &&
+		    !btf_is_kernel_sizedint(d, t)) {
+			restypeid = btf__resolve_type(d->btf, id);
+			if (restypeid >= 0) {
+				rest = btf__type_by_id(d->btf, restypeid);
+				if (rest && btf_is_int(rest)) {
+					t = rest;
+					kind = btf_kind(rest);
+				}
+			}
+		}
+
 		switch (kind) {
 		case BTF_KIND_INT:
+			btf_dump_emit_mods(d, decls);
+			if (d->sizedints && !btf_is_bool(t) && !btf_is_char(d, t)) {
+				intenc = btf_int_encoding(t);
+				btf_dump_printf(d,
+						intenc & BTF_INT_SIGNED ?
+						"int%d_t" : "uint%d_t",
+						t->size * 8);
+			} else {
+				name = btf_name_of(d, t->name_off);
+				btf_dump_printf(d, "%s", name);
+			}
+			break;
 		case BTF_KIND_FLOAT:
 			btf_dump_emit_mods(d, decls);
 			name = btf_name_of(d, t->name_off);
@@ -1469,7 +1545,9 @@ static void btf_dump_emit_type_cast(struct btf_dump *d, __u32 id,

 	d->skip_anon_defs = true;
 	d->strip_mods = true;
+	d->sizedints = true;
 	btf_dump_emit_type_decl(d, id, "", 0);
+	d->sizedints = false;
 	d->strip_mods = false;
 	d->skip_anon_defs = false;

--
2.34.1

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons
@ 2022-02-10  0:36 Delyan Kratunov
  2022-02-10  0:36 ` [PATCH bpf-next v1 3/3] selftests/bpf: add test case for userspace and bpf type size mismatch Delyan Kratunov
                   ` (4 more replies)
  0 siblings, 5 replies; 16+ messages in thread
From: Delyan Kratunov @ 2022-02-10  0:36 UTC (permalink / raw)
  To: daniel, ast, andrii, bpf

As reported in [0], kernel and userspace can sometimes disagree
on the definition of a typedef (in particular, the size).
This leads to trouble when userspace maps the memory of a bpf program
and reads/writes to it assuming a different memory layout.

This series resolves most int-like typedefs and rewrites them as
standard int16_t-like types. In particular, we don't touch
__u32-like types, char, and _Bool, as changing them changes cast
semantics and would break too many pre-existing programs. For example,
int8_t* is not convertible to char* because int8_t is explicitly signed.

  [0]: https://github.com/iovisor/bcc/pull/3777

Delyan Kratunov (3):
  libbpf: btf_dump can produce explicitly sized ints
  bpftool: skeleton uses explicitly sized ints
  selftests/bpf: add test case for userspace and bpf type size mismatch

 tools/bpf/bpftool/gen.c                       |  3 +
 tools/lib/bpf/btf.h                           |  4 +-
 tools/lib/bpf/btf_dump.c                      | 80 ++++++++++++++++++-
 .../selftests/bpf/prog_tests/skeleton.c       | 22 +++--
 .../selftests/bpf/progs/test_skeleton.c       |  8 ++
 5 files changed, 107 insertions(+), 10 deletions(-)

--
2.34.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons
  2022-02-10  0:36 [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons Delyan Kratunov
                   ` (2 preceding siblings ...)
  2022-02-10  0:36 ` [PATCH bpf-next v1 1/3] libbpf: btf_dump can produce " Delyan Kratunov
@ 2022-02-10 22:18 ` Yonghong Song
  2022-02-10 22:48   ` Andrii Nakryiko
  2022-02-10 22:51 ` Yonghong Song
  4 siblings, 1 reply; 16+ messages in thread
From: Yonghong Song @ 2022-02-10 22:18 UTC (permalink / raw)
  To: Delyan Kratunov, daniel, ast, andrii, bpf



On 2/9/22 4:36 PM, Delyan Kratunov wrote:
> As reported in [0], kernel and userspace can sometimes disagree
> on the definition of a typedef (in particular, the size).
> This leads to trouble when userspace maps the memory of a bpf program
> and reads/writes to it assuming a different memory layout.
> 
> This series resolves most int-like typedefs and rewrites them as
> standard int16_t-like types. In particular, we don't touch
> __u32-like types, char, and _Bool, as changing them changes cast
> semantics and would break too many pre-existing programs. For example,
> int8_t* is not convertible to char* because int8_t is explicitly signed.

Build with clang (adding LLVM=1 to build kernel and selftests),
several btf_dump subtests failed. Please take a look.

btf_dump_data:PASS:find type id 0 nsec
btf_dump_data:PASS:failed/unexpected type_sz 0 nsec
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure 
expected/actual match: actual '(int32_t)1234' != expected '(int)1234'
btf_dump_data:PASS:find type id 0 nsec
btf_dump_data:PASS:failed/unexpected type_sz 0 nsec
btf_dump_data:PASS:ensure expected/actual match 0 nsec
btf_dump_data:PASS:find type id 0 nsec
btf_dump_data:PASS:failed/unexpected type_sz 0 nsec
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure 
expected/actual match: actual '(int32_t)1234' != expected '(int)1234'
btf_dump_data:PASS:find type id 0 nsec
btf_dump_data:PASS:failed/unexpected type_sz 0 nsec
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure 
expected/actual match: actual '(int32_t)0' != expected '(int)0'
...
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure 
expected/actual match: actual '(uint128_t)0xffffffffffffffff' != 
expected '(unsigned __int128)0xffffffffffffffff'
btf_dump_data:PASS:find type id 0 nsec
btf_dump_data:PASS:failed/unexpected type_sz 0 nsec
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure 
expected/actual match: actual 
'(uint128_t)0xfffffffffffffffffffffffffffffffe' != expected '(unsigned 
__int128)0xfffffffffffffffffffffffffffff'
test_btf_dump_int_data:FAIL:dump unsigned __int128 unexpected error: -14 
(errno 7)
#20/9 btf_dump/btf_dump: int_data:FAIL
...
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure 
expected/actual match: actual '(uint64_t)1' != expected '(u64)1'
btf_dump_data:PASS:find type id 0 nsec
btf_dump_data:PASS:failed/unexpected type_sz 0 nsec
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure 
expected/actual match: actual '(uint64_t)0' != expected '(u64)0'
btf_dump_data:PASS:find type id 0 nsec
...
btf_dump_data:FAIL:ensure expected/actual match unexpected ensure 
expected/actual match: actual '(atomic_t){
         .counter = (int32_t)0,
}' != expected '(atomic_t){
         .counter = (int)0,
}'
btf_dump_data:PASS:find type id 0 nsec
btf_dump_data:PASS:failed to return -E2BIG 0 nsec
btf_dump_data:PASS:ensure expected/actual match 0 nsec
#20/12 btf_dump/btf_dump: typedef_data:FAIL
...
test_btf_dump_struct_data:FAIL:file_operations unexpected 
file_operations: actual '(struct file_operations){
         .owner = (struct module *)0xffffffffffffffff,
         .llseek = (int64_t (*)(struct file *, int64_t, 
int32_t))0xfffffffffff' != expected '(struct file_operations){
         .owner = (struct module *)0xffffffffffffffff,
         .llseek = (loff_t (*)(struct file *, loff_t, 
int))0xffffffffffffffff,'
...
...

> 
>    [0]: https://github.com/iovisor/bcc/pull/3777
> 
> Delyan Kratunov (3):
>    libbpf: btf_dump can produce explicitly sized ints
>    bpftool: skeleton uses explicitly sized ints
>    selftests/bpf: add test case for userspace and bpf type size mismatch
> 
>   tools/bpf/bpftool/gen.c                       |  3 +
>   tools/lib/bpf/btf.h                           |  4 +-
>   tools/lib/bpf/btf_dump.c                      | 80 ++++++++++++++++++-
>   .../selftests/bpf/prog_tests/skeleton.c       | 22 +++--
>   .../selftests/bpf/progs/test_skeleton.c       |  8 ++
>   5 files changed, 107 insertions(+), 10 deletions(-)
> 
> --
> 2.34.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v1 1/3] libbpf: btf_dump can produce explicitly sized ints
  2022-02-10  0:36 ` [PATCH bpf-next v1 1/3] libbpf: btf_dump can produce " Delyan Kratunov
@ 2022-02-10 22:35   ` Andrii Nakryiko
  2022-02-10 23:40     ` Delyan Kratunov
  0 siblings, 1 reply; 16+ messages in thread
From: Andrii Nakryiko @ 2022-02-10 22:35 UTC (permalink / raw)
  To: Delyan Kratunov; +Cc: daniel, ast, andrii, bpf

On Wed, Feb 9, 2022 at 4:37 PM Delyan Kratunov <delyank@fb.com> wrote:
>
> When emitting type declations, btf_dump can now optionally rename
> int types (including typedefs) to standard types with explicit sizes.
> Types like pid_t get renamed but types like __u32, char, and _Bool
> are left alone to preserve cast semantics in as many pre-existing
> programs as possible.
>
> This option is useful when generating data structures on a system where
> types may differ due to arch differences or just userspace and bpf program
> disagreeing on the definition of a typedef.
>
> Signed-off-by: Delyan Kratunov <delyank@fb.com>
> ---
>  tools/lib/bpf/btf.h      |  4 +-
>  tools/lib/bpf/btf_dump.c | 80 +++++++++++++++++++++++++++++++++++++++-
>  2 files changed, 82 insertions(+), 2 deletions(-)
>
> diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
> index 951ac7475794..dbd41bf93b13 100644
> --- a/tools/lib/bpf/btf.h
> +++ b/tools/lib/bpf/btf.h
> @@ -347,9 +347,11 @@ struct btf_dump_emit_type_decl_opts {
>         int indent_level;
>         /* strip all the const/volatile/restrict mods */
>         bool strip_mods;
> +       /* normalize int fields to (u)?int(16|32|64)_t types */
> +       bool sizedints;

let's stick to consistent snake_case naming convention

let's also call it what the comment calls it :) "normalize_ints" ?

>         size_t :0;
>  };
> -#define btf_dump_emit_type_decl_opts__last_field strip_mods
> +#define btf_dump_emit_type_decl_opts__last_field sizedints
>
>  LIBBPF_API int
>  btf_dump__emit_type_decl(struct btf_dump *d, __u32 id,
> diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c
> index 07ebe70d3a30..56bafacf1cbd 100644
> --- a/tools/lib/bpf/btf_dump.c
> +++ b/tools/lib/bpf/btf_dump.c
> @@ -81,6 +81,7 @@ struct btf_dump {
>         void *cb_ctx;
>         int ptr_sz;
>         bool strip_mods;
> +       bool sizedints;
>         bool skip_anon_defs;
>         int last_id;
>
> @@ -1130,7 +1131,9 @@ int btf_dump__emit_type_decl(struct btf_dump *d, __u32 id,
>         fname = OPTS_GET(opts, field_name, "");
>         lvl = OPTS_GET(opts, indent_level, 0);
>         d->strip_mods = OPTS_GET(opts, strip_mods, false);
> +       d->sizedints = OPTS_GET(opts, sizedints, false);
>         btf_dump_emit_type_decl(d, id, fname, lvl);
> +       d->sizedints = false;
>         d->strip_mods = false;
>         return 0;
>  }
> @@ -1263,6 +1266,34 @@ static void btf_dump_emit_name(const struct btf_dump *d,
>         btf_dump_printf(d, "%s%s", separate ? " " : "", name);
>  }
>
> +/* Encode custom heurstics to find char types since BTF_INT_CHAR is never set. */

typo: heuristic

> +static bool btf_is_char(const struct btf_dump *d, const struct btf_type *t)
> +{
> +       return btf_is_int(t) &&
> +              t->size == 1 &&
> +              strcmp(btf_name_of(d, t->name_off), "char") == 0;
> +}
> +
> +static bool btf_is_bool(const struct btf_type *t)
> +{
> +       return btf_is_int(t) && (btf_int_encoding(t) & BTF_INT_BOOL);
> +}
> +
> +/* returns true if type is of the '__[su](8|16|32|64)' type */
> +static bool btf_is_kernel_sizedint(const struct btf_dump *d, const struct btf_type *t)
> +{
> +       const char *name = btf_name_of(d, t->name_off);
> +
> +       return strcmp(name, "__s8") == 0 ||
> +              strcmp(name, "__u8") == 0 ||
> +              strcmp(name, "__s16") == 0 ||
> +              strcmp(name, "__u16") == 0 ||
> +              strcmp(name, "__s32") == 0 ||
> +              strcmp(name, "__u32") == 0 ||
> +              strcmp(name, "__s64") == 0 ||
> +              strcmp(name, "__u64") == 0;
> +}

So I thought about this a bit, I see how there could be a difference
of %lld vs %ld and such, but I think normalize should mean normalize
all ints, without any exceptions for kernel aliases. Let's keep the
rule simple: everything but char and bool gets converted to its
corresponding standard integer alias.

Worst case few users might need to adjust their printf specifier after
seeing a compiler warning.

> +
>  static void btf_dump_emit_type_chain(struct btf_dump *d,
>                                      struct id_stack *decls,
>                                      const char *fname, int lvl)
> @@ -1277,10 +1308,12 @@ static void btf_dump_emit_type_chain(struct btf_dump *d,
>          * don't want to prepend space for that last pointer.
>          */
>         bool last_was_ptr = true;
> -       const struct btf_type *t;
> +       const struct btf_type *t, *rest;
>         const char *name;
>         __u16 kind;
>         __u32 id;
> +       __u8 intenc;
> +       int restypeid;
>

same about variable naming conventions

>         while (decls->cnt) {
>                 id = decls->ids[--decls->cnt];
> @@ -1295,8 +1328,51 @@ static void btf_dump_emit_type_chain(struct btf_dump *d,
>                 t = btf__type_by_id(d->btf, id);
>                 kind = btf_kind(t);

just move it after the if () to not re-assign kind again inside the if

>
> +               /* If we're asked to produce stdint declarations, we need
> +                * to only do that in the following cases:
> +                *  - int types other than char and _Bool

let's make it the only case

> +                *  - typedefs to int types (including char and _Bool) except
> +                *    kernel types like __s16/__u32/etc.
> +                *
> +                * If a typedef resolves to char or _Bool, we do want to use
> +                * the resolved type instead of the stdint types (i.e. char
> +                * instead of int8_t) because the stdint types are explicitly
> +                * signed/unsigned, which affects pointer casts.
> +                *
> +                * If the typedef is of the __s32 variety, we leave it as-is
> +                * due to incompatibilities in e.g. s64 vs int64_t definitions
> +                * (one is `long long` on x86_64 and the other is not).
> +                *
> +                * Unfortunately, the BTF type info never includes BTF_INT_CHAR,
> +                * so we use a size comparison to avoid chars and
> +                * BTF_INT_BOOL to avoid bools.
> +                */
> +               if (d->sizedints && kind == BTF_KIND_TYPEDEF &&
> +                   !btf_is_kernel_sizedint(d, t)) {
> +                       restypeid = btf__resolve_type(d->btf, id);

we use skip_mods_and_typedefs() everywhere for this, it returns
btf_type (and optionally type_id as well), much more convenient (and
it can't fail, so no need for extra error handling)

also please use btf_is_typedef(t)

> +                       if (restypeid >= 0) {
> +                               rest = btf__type_by_id(d->btf, restypeid);
> +                               if (rest && btf_is_int(rest)) {
> +                                       t = rest;
> +                                       kind = btf_kind(rest);
> +                               }
> +                       }
> +               }
> +
>                 switch (kind) {
>                 case BTF_KIND_INT:
> +                       btf_dump_emit_mods(d, decls);
> +                       if (d->sizedints && !btf_is_bool(t) && !btf_is_char(d, t)) {
> +                               intenc = btf_int_encoding(t);
> +                               btf_dump_printf(d,
> +                                               intenc & BTF_INT_SIGNED ?
> +                                               "int%d_t" : "uint%d_t",
> +                                               t->size * 8);
> +                       } else {
> +                               name = btf_name_of(d, t->name_off);
> +                               btf_dump_printf(d, "%s", name);
> +                       }
> +                       break;
>                 case BTF_KIND_FLOAT:
>                         btf_dump_emit_mods(d, decls);
>                         name = btf_name_of(d, t->name_off);
> @@ -1469,7 +1545,9 @@ static void btf_dump_emit_type_cast(struct btf_dump *d, __u32 id,
>
>         d->skip_anon_defs = true;
>         d->strip_mods = true;
> +       d->sizedints = true;

this is a different use case, let's not normalize anything unconditionally

>         btf_dump_emit_type_decl(d, id, "", 0);
> +       d->sizedints = false;
>         d->strip_mods = false;
>         d->skip_anon_defs = false;
>
> --
> 2.34.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v1 2/3] bpftool: skeleton uses explicitly sized ints
  2022-02-10  0:36 ` [PATCH bpf-next v1 2/3] bpftool: skeleton uses explicitly sized ints Delyan Kratunov
@ 2022-02-10 22:42   ` Andrii Nakryiko
  2022-02-10 23:45     ` Delyan Kratunov
  0 siblings, 1 reply; 16+ messages in thread
From: Andrii Nakryiko @ 2022-02-10 22:42 UTC (permalink / raw)
  To: Delyan Kratunov; +Cc: daniel, ast, andrii, bpf

On Wed, Feb 9, 2022 at 4:37 PM Delyan Kratunov <delyank@fb.com> wrote:
>
> As reported in [0] and [1], kernel and userspace can sometimes disagree
> on the definition of a typedef (in particular, the size).
> This leads to trouble when userspace maps the memory of a bpf program
> and reads/writes to it assuming a different memory layout.
>
> This commit now uses the libbpf sized ints logic when emitting the
> skeleton. This resolves int types to int32_t-like equivalents and
> ensures that typedefs are not just emitted verbatim.
>
> The drive-by selftest changes fix format specifier issues
> due to the definitions of [us]64 and (u)int64_t differing in how
> many longs they use (long long int vs long int on x86_64).
>
>   [0]: https://github.com/iovisor/bcc/pull/3777
>   [1]: Closes: https://github.com/libbpf/libbpf/issues/433
>
> Signed-off-by: Delyan Kratunov <delyank@fb.com>
> ---
>  tools/bpf/bpftool/gen.c                          |  3 +++
>  .../testing/selftests/bpf/prog_tests/skeleton.c  | 16 ++++++++--------
>  2 files changed, 11 insertions(+), 8 deletions(-)
>
> diff --git a/tools/bpf/bpftool/gen.c b/tools/bpf/bpftool/gen.c
> index eacfc6a2060d..18c3f755ad88 100644
> --- a/tools/bpf/bpftool/gen.c
> +++ b/tools/bpf/bpftool/gen.c
> @@ -146,6 +146,7 @@ static int codegen_datasec_def(struct bpf_object *obj,
>                         .field_name = var_ident,
>                         .indent_level = 2,
>                         .strip_mods = strip_mods,
> +                       .sizedints = true,
>                 );
>                 int need_off = sec_var->offset, align_off, align;
>                 __u32 var_type_id = var->type;
> @@ -751,6 +752,7 @@ static int do_skeleton(int argc, char **argv)
>                 #ifndef %2$s                                                \n\
>                 #define %2$s                                                \n\
>                                                                             \n\
> +               #include <inttypes.h>                                       \n\

if Alexei's patch set will go in first (very likely), you'll need to
rebase and make sure that you don't include either inttypes.h or
stdint.h for kernel mode, as those headers don't exist there


>                 #include <stdlib.h>                                         \n\
>                 #include <bpf/bpf.h>                                        \n\
>                 #include <bpf/skel_internal.h>                              \n\
> @@ -770,6 +772,7 @@ static int do_skeleton(int argc, char **argv)
>                 #define %2$s                                                \n\
>                                                                             \n\
>                 #include <errno.h>                                          \n\
> +               #include <inttypes.h>                                       \n\

seems like inttypes.h just includes stdint.h, I'd just include stdint.h directly

>                 #include <stdlib.h>                                         \n\
>                 #include <bpf/libbpf.h>                                     \n\
>                                                                             \n\
> diff --git a/tools/testing/selftests/bpf/prog_tests/skeleton.c b/tools/testing/selftests/bpf/prog_tests/skeleton.c
> index 180afd632f4c..9894e1b39211 100644
> --- a/tools/testing/selftests/bpf/prog_tests/skeleton.c
> +++ b/tools/testing/selftests/bpf/prog_tests/skeleton.c
> @@ -43,13 +43,13 @@ void test_skeleton(void)
>         /* validate values are pre-initialized correctly */
>         CHECK(data->in1 != -1, "in1", "got %d != exp %d\n", data->in1, -1);
>         CHECK(data->out1 != -1, "out1", "got %d != exp %d\n", data->out1, -1);
> -       CHECK(data->in2 != -1, "in2", "got %lld != exp %lld\n", data->in2, -1LL);
> -       CHECK(data->out2 != -1, "out2", "got %lld != exp %lld\n", data->out2, -1LL);
> +       CHECK(data->in2 != -1, "in2", "got %"PRId64" != exp %lld\n", data->in2, -1LL);
> +       CHECK(data->out2 != -1, "out2", "got %"PRId64" != exp %lld\n", data->out2, -1LL);

we don't use PRIxxx ugliness anywhere in selftests or libbpf code
base, it would be easier to just convert this to ASSERT_EQ()

>
>         CHECK(bss->in3 != 0, "in3", "got %d != exp %d\n", bss->in3, 0);
>         CHECK(bss->out3 != 0, "out3", "got %d != exp %d\n", bss->out3, 0);
> -       CHECK(bss->in4 != 0, "in4", "got %lld != exp %lld\n", bss->in4, 0LL);
> -       CHECK(bss->out4 != 0, "out4", "got %lld != exp %lld\n", bss->out4, 0LL);
> +       CHECK(bss->in4 != 0, "in4", "got %"PRId64" != exp %lld\n", bss->in4, 0LL);
> +       CHECK(bss->out4 != 0, "out4", "got %"PRId64" != exp %lld\n", bss->out4, 0LL);

[...]

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v1 3/3] selftests/bpf: add test case for userspace and bpf type size mismatch
  2022-02-10  0:36 ` [PATCH bpf-next v1 3/3] selftests/bpf: add test case for userspace and bpf type size mismatch Delyan Kratunov
@ 2022-02-10 22:44   ` Andrii Nakryiko
  2022-02-10 23:48     ` Delyan Kratunov
  0 siblings, 1 reply; 16+ messages in thread
From: Andrii Nakryiko @ 2022-02-10 22:44 UTC (permalink / raw)
  To: Delyan Kratunov; +Cc: daniel, ast, andrii, bpf

On Wed, Feb 9, 2022 at 4:37 PM Delyan Kratunov <delyank@fb.com> wrote:
>
> Multiple test cases already fail if you add a type whose size is
> different between userspace and bpf. That said, let's also add an
> explicit test that ensures mis-sized reads/writes do not actually
> happen. This test case fails before this patch series and passes after:
>
> test_skeleton:FAIL:writes and reads match size unexpected writes
> and reads match size: actual 3735928559 != expected 8030895855
> test_skeleton:FAIL:skeleton uses underlying type unexpected
> skeleton uses underlying type: actual 8 != expected 4
>
> Signed-off-by: Delyan Kratunov <delyank@fb.com>
> ---
>  tools/testing/selftests/bpf/prog_tests/skeleton.c | 6 ++++++
>  tools/testing/selftests/bpf/progs/test_skeleton.c | 8 ++++++++
>  2 files changed, 14 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/skeleton.c b/tools/testing/selftests/bpf/prog_tests/skeleton.c
> index 9894e1b39211..bc07da929566 100644
> --- a/tools/testing/selftests/bpf/prog_tests/skeleton.c
> +++ b/tools/testing/selftests/bpf/prog_tests/skeleton.c
> @@ -97,6 +97,9 @@ void test_skeleton(void)
>
>         skel->data_read_mostly->read_mostly_var = 123;
>
> +       /* validate apparent 64-bit value is actually 32-bit */
> +       skel->data->intest64 = (typeof(skel->data->intest64)) 0xdeadbeefdeadbeefULL;
> +
>         err = test_skeleton__attach(skel);
>         if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))
>                 goto cleanup;
> @@ -126,6 +129,9 @@ void test_skeleton(void)
>         ASSERT_OK_PTR(elf_bytes, "elf_bytes");
>         ASSERT_GE(elf_bytes_sz, 0, "elf_bytes_sz");
>
> +       ASSERT_EQ(skel->data->outtest64, skel->data->intest64, "writes and reads match size");
> +       ASSERT_EQ(sizeof(skel->data->intest64), sizeof(u32), "skeleton uses underlying type");
> +
>  cleanup:
>         test_skeleton__destroy(skel);
>  }
> diff --git a/tools/testing/selftests/bpf/progs/test_skeleton.c b/tools/testing/selftests/bpf/progs/test_skeleton.c
> index 1b1187d2967b..fd1f4910cf42 100644
> --- a/tools/testing/selftests/bpf/progs/test_skeleton.c
> +++ b/tools/testing/selftests/bpf/progs/test_skeleton.c
> @@ -16,6 +16,13 @@ struct s {
>  int in1 = -1;
>  long long in2 = -1;
>
> +/* declare the int64_t type to actually be 32-bit to ensure the skeleton
> + * uses actual sizes and doesn't just copy the type name
> + */
> +typedef __s32 int64_t;
> +int64_t intest64 = -1;
> +int64_t outtest64 = -1;

This will be so confusing... But when you drop __s32 special handling
you can just use __s32 directly, right?


> +
>  /* .bss section */
>  char in3 = '\0';
>  long long in4 __attribute__((aligned(64))) = 0;
> @@ -62,6 +69,7 @@ int handler(const void *ctx)
>         out4 = in4;
>         out5 = in5;
>         out6 = in.in6;
> +       outtest64 = intest64;
>
>         bpf_syscall = CONFIG_BPF_SYSCALL;
>         kern_ver = LINUX_KERNEL_VERSION;
> --
> 2.34.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons
  2022-02-10 22:18 ` [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons Yonghong Song
@ 2022-02-10 22:48   ` Andrii Nakryiko
  0 siblings, 0 replies; 16+ messages in thread
From: Andrii Nakryiko @ 2022-02-10 22:48 UTC (permalink / raw)
  To: Yonghong Song; +Cc: Delyan Kratunov, daniel, ast, andrii, bpf

On Thu, Feb 10, 2022 at 2:18 PM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 2/9/22 4:36 PM, Delyan Kratunov wrote:
> > As reported in [0], kernel and userspace can sometimes disagree
> > on the definition of a typedef (in particular, the size).
> > This leads to trouble when userspace maps the memory of a bpf program
> > and reads/writes to it assuming a different memory layout.
> >
> > This series resolves most int-like typedefs and rewrites them as
> > standard int16_t-like types. In particular, we don't touch
> > __u32-like types, char, and _Bool, as changing them changes cast
> > semantics and would break too many pre-existing programs. For example,
> > int8_t* is not convertible to char* because int8_t is explicitly signed.
>
> Build with clang (adding LLVM=1 to build kernel and selftests),
> several btf_dump subtests failed. Please take a look.
>
> btf_dump_data:PASS:find type id 0 nsec
> btf_dump_data:PASS:failed/unexpected type_sz 0 nsec
> btf_dump_data:FAIL:ensure expected/actual match unexpected ensure
> expected/actual match: actual '(int32_t)1234' != expected '(int)1234'
> btf_dump_data:PASS:find type id 0 nsec
> btf_dump_data:PASS:failed/unexpected type_sz 0 nsec
> btf_dump_data:PASS:ensure expected/actual match 0 nsec
> btf_dump_data:PASS:find type id 0 nsec
> btf_dump_data:PASS:failed/unexpected type_sz 0 nsec
> btf_dump_data:FAIL:ensure expected/actual match unexpected ensure
> expected/actual match: actual '(int32_t)1234' != expected '(int)1234'
> btf_dump_data:PASS:find type id 0 nsec
> btf_dump_data:PASS:failed/unexpected type_sz 0 nsec
> btf_dump_data:FAIL:ensure expected/actual match unexpected ensure
> expected/actual match: actual '(int32_t)0' != expected '(int)0'
> ...
> btf_dump_data:FAIL:ensure expected/actual match unexpected ensure
> expected/actual match: actual '(uint128_t)0xffffffffffffffff' !=
> expected '(unsigned __int128)0xffffffffffffffff'
> btf_dump_data:PASS:find type id 0 nsec
> btf_dump_data:PASS:failed/unexpected type_sz 0 nsec
> btf_dump_data:FAIL:ensure expected/actual match unexpected ensure
> expected/actual match: actual
> '(uint128_t)0xfffffffffffffffffffffffffffffffe' != expected '(unsigned
> __int128)0xfffffffffffffffffffffffffffff'
> test_btf_dump_int_data:FAIL:dump unsigned __int128 unexpected error: -14
> (errno 7)
> #20/9 btf_dump/btf_dump: int_data:FAIL
> ...
> btf_dump_data:FAIL:ensure expected/actual match unexpected ensure
> expected/actual match: actual '(uint64_t)1' != expected '(u64)1'
> btf_dump_data:PASS:find type id 0 nsec
> btf_dump_data:PASS:failed/unexpected type_sz 0 nsec
> btf_dump_data:FAIL:ensure expected/actual match unexpected ensure
> expected/actual match: actual '(uint64_t)0' != expected '(u64)0'
> btf_dump_data:PASS:find type id 0 nsec
> ...
> btf_dump_data:FAIL:ensure expected/actual match unexpected ensure
> expected/actual match: actual '(atomic_t){
>          .counter = (int32_t)0,
> }' != expected '(atomic_t){
>          .counter = (int)0,
> }'
> btf_dump_data:PASS:find type id 0 nsec
> btf_dump_data:PASS:failed to return -E2BIG 0 nsec
> btf_dump_data:PASS:ensure expected/actual match 0 nsec
> #20/12 btf_dump/btf_dump: typedef_data:FAIL
> ...
> test_btf_dump_struct_data:FAIL:file_operations unexpected
> file_operations: actual '(struct file_operations){
>          .owner = (struct module *)0xffffffffffffffff,
>          .llseek = (int64_t (*)(struct file *, int64_t,
> int32_t))0xfffffffffff' != expected '(struct file_operations){
>          .owner = (struct module *)0xffffffffffffffff,
>          .llseek = (loff_t (*)(struct file *, loff_t,
> int))0xffffffffffffffff,'
> ...
> ...
>

This is due to unconditional normalization in
btf_dump_emit_type_cast(), we shouldn't do it. So this will "autofix"
itself when we fix that issue.

> >
> >    [0]: https://github.com/iovisor/bcc/pull/3777
> >
> > Delyan Kratunov (3):
> >    libbpf: btf_dump can produce explicitly sized ints
> >    bpftool: skeleton uses explicitly sized ints
> >    selftests/bpf: add test case for userspace and bpf type size mismatch
> >
> >   tools/bpf/bpftool/gen.c                       |  3 +
> >   tools/lib/bpf/btf.h                           |  4 +-
> >   tools/lib/bpf/btf_dump.c                      | 80 ++++++++++++++++++-
> >   .../selftests/bpf/prog_tests/skeleton.c       | 22 +++--
> >   .../selftests/bpf/progs/test_skeleton.c       |  8 ++
> >   5 files changed, 107 insertions(+), 10 deletions(-)
> >
> > --
> > 2.34.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons
  2022-02-10  0:36 [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons Delyan Kratunov
                   ` (3 preceding siblings ...)
  2022-02-10 22:18 ` [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons Yonghong Song
@ 2022-02-10 22:51 ` Yonghong Song
  2022-02-10 23:14   ` Alexei Starovoitov
  4 siblings, 1 reply; 16+ messages in thread
From: Yonghong Song @ 2022-02-10 22:51 UTC (permalink / raw)
  To: Delyan Kratunov, daniel, ast, andrii, bpf



On 2/9/22 4:36 PM, Delyan Kratunov wrote:
> As reported in [0], kernel and userspace can sometimes disagree
> on the definition of a typedef (in particular, the size).
> This leads to trouble when userspace maps the memory of a bpf program
> and reads/writes to it assuming a different memory layout.

I am thinking whether we can do better since this only resolved
some basic types. But it is totally possible some types in vmlinux.h,
who are kernel internal types, happen in some uapi or user header
as well but with different sizes.

Currently, the exposed bpf program types (in skeleton) are all
from global variables. Since we intend to ensure their size
be equal, and bpf program itself provides the size of the type.

For example, in bpf program, we have following,
    TypeA    variable;

Since TypeA will appear in the skel.h file, user must define it
somehow before skel.h. Let us say TypeA size is 20 from bpf program
BTF type.

So we could insert a
   BUILD_BUG_ON(sizeof(TypeA) != 20)
in the skeleton file to ensure the size match and this applies
to all types.

In the skel.h file, we can have
#ifndef BUILD_BUG_ON
#define BUILD_BUG_ON ...
#endif
to have BUILD_BUG_ON to cause compilation error if the condition is true.

User can define BUILD_BUG_ON before skel.h if they want to
override.

This should apply to all types put in bss/data/rodata sections
by skeleton.

If this indeed happens as in [0], user can detect the problem
and they may look at vmlinux.h and use proper underlying types
to resolve the issue.

WDYT?

> 
> This series resolves most int-like typedefs and rewrites them as
> standard int16_t-like types. In particular, we don't touch
> __u32-like types, char, and _Bool, as changing them changes cast
> semantics and would break too many pre-existing programs. For example,
> int8_t* is not convertible to char* because int8_t is explicitly signed.
> 
>    [0]: https://github.com/iovisor/bcc/pull/3777
> 
> Delyan Kratunov (3):
>    libbpf: btf_dump can produce explicitly sized ints
>    bpftool: skeleton uses explicitly sized ints
>    selftests/bpf: add test case for userspace and bpf type size mismatch
> 
>   tools/bpf/bpftool/gen.c                       |  3 +
>   tools/lib/bpf/btf.h                           |  4 +-
>   tools/lib/bpf/btf_dump.c                      | 80 ++++++++++++++++++-
>   .../selftests/bpf/prog_tests/skeleton.c       | 22 +++--
>   .../selftests/bpf/progs/test_skeleton.c       |  8 ++
>   5 files changed, 107 insertions(+), 10 deletions(-)
> 
> --
> 2.34.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons
  2022-02-10 22:51 ` Yonghong Song
@ 2022-02-10 23:14   ` Alexei Starovoitov
  2022-02-10 23:59     ` Delyan Kratunov
  0 siblings, 1 reply; 16+ messages in thread
From: Alexei Starovoitov @ 2022-02-10 23:14 UTC (permalink / raw)
  To: Yonghong Song; +Cc: Delyan Kratunov, daniel, ast, andrii, bpf

On Thu, Feb 10, 2022 at 2:51 PM Yonghong Song <yhs@fb.com> wrote:
>
>
>
> On 2/9/22 4:36 PM, Delyan Kratunov wrote:
> > As reported in [0], kernel and userspace can sometimes disagree
> > on the definition of a typedef (in particular, the size).
> > This leads to trouble when userspace maps the memory of a bpf program
> > and reads/writes to it assuming a different memory layout.
>
> I am thinking whether we can do better since this only resolved
> some basic types. But it is totally possible some types in vmlinux.h,
> who are kernel internal types, happen in some uapi or user header
> as well but with different sizes.
>
> Currently, the exposed bpf program types (in skeleton) are all
> from global variables. Since we intend to ensure their size
> be equal, and bpf program itself provides the size of the type.
>
> For example, in bpf program, we have following,
>     TypeA    variable;
>
> Since TypeA will appear in the skel.h file, user must define it
> somehow before skel.h. Let us say TypeA size is 20 from bpf program
> BTF type.
>
> So we could insert a
>    BUILD_BUG_ON(sizeof(TypeA) != 20)
> in the skeleton file to ensure the size match and this applies
> to all types.
>
> In the skel.h file, we can have
> #ifndef BUILD_BUG_ON
> #define BUILD_BUG_ON ...
> #endif
> to have BUILD_BUG_ON to cause compilation error if the condition is true.
>
> User can define BUILD_BUG_ON before skel.h if they want to
> override.
>
> This should apply to all types put in bss/data/rodata sections
> by skeleton.
>
> If this indeed happens as in [0], user can detect the problem
> and they may look at vmlinux.h and use proper underlying types
> to resolve the issue.
>
> WDYT?

Going back to https://github.com/iovisor/bcc/pull/3777 issue..
The user space part that included skel.h might have used
dev_t in other places in the code.
When bpftool automagically replaces dev_t in skel.h with int32_t
to match the kernel it only moves the problem further into
the user space where dev_t would still mismatch.
So adding
_Static_assert(sizeof(type_in_skel) == const_size_from_kernel);
to skel.h would force users to pick types that are the same
both in bpf prog and in corresponding user space part.
I think that's a better approach.
Maybe we don't need to add static_assert for all types,
but that's a minor tweak.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v1 1/3] libbpf: btf_dump can produce explicitly sized ints
  2022-02-10 22:35   ` Andrii Nakryiko
@ 2022-02-10 23:40     ` Delyan Kratunov
  0 siblings, 0 replies; 16+ messages in thread
From: Delyan Kratunov @ 2022-02-10 23:40 UTC (permalink / raw)
  To: andrii.nakryiko; +Cc: daniel, ast, andrii, bpf

On Thu, 2022-02-10 at 14:35 -0800, Andrii Nakryiko wrote:
> let's stick to consistent snake_case naming convention
> 
> let's also call it what the comment calls it :) "normalize_ints" ?
> 
Sure, naming is hard.

> So I thought about this a bit, I see how there could be a difference
> of %lld vs %ld and such, but I think normalize should mean normalize
> all ints, without any exceptions for kernel aliases. Let's keep the
> rule simple: everything but char and bool gets converted to its
> corresponding standard integer alias.
> 
> Worst case few users might need to adjust their printf specifier after
> seeing a compiler warning.

When I first tried to do this, there were a couple of ways in which it was
problematic:

1. format specifiers are all over selftests
2. passing uint64_t* to things expecting u64* in the userspace code

Ultimately, this felt like a good compromise between keeping existing code
working and also avoiding the major pitfalls of non-sized types.
Another question here is whether a minor release of libbpf should be allowed to
break the build of applications using -Werror.

We could potentially gate this behavior behind a runtime arg but that's not
exactly a "pit of success" type of design.

> this is a different use case, let's not normalize anything unconditionally

My bad, meant to remove this before submitting.



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v1 2/3] bpftool: skeleton uses explicitly sized ints
  2022-02-10 22:42   ` Andrii Nakryiko
@ 2022-02-10 23:45     ` Delyan Kratunov
  0 siblings, 0 replies; 16+ messages in thread
From: Delyan Kratunov @ 2022-02-10 23:45 UTC (permalink / raw)
  To: andrii.nakryiko; +Cc: daniel, ast, andrii, bpf

On Thu, 2022-02-10 at 14:42 -0800, Andrii Nakryiko wrote:
> if Alexei's patch set will go in first (very likely), you'll need to
> rebase and make sure that you don't include either inttypes.h or
> stdint.h for kernel mode, as those headers don't exist there

Ack.

> seems like inttypes.h just includes stdint.h, I'd just include stdint.h directly

You need inttypes.h for the PRIxxx format specifiers and I figured users of
skels are not unlikely to want to print things. Happy to move it to just
stdint.h though.

> we don't use PRIxxx ugliness anywhere in selftests or libbpf code
> base, it would be easier to just convert this to ASSERT_EQ()
> 

Sure, can do.
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v1 3/3] selftests/bpf: add test case for userspace and bpf type size mismatch
  2022-02-10 22:44   ` Andrii Nakryiko
@ 2022-02-10 23:48     ` Delyan Kratunov
  0 siblings, 0 replies; 16+ messages in thread
From: Delyan Kratunov @ 2022-02-10 23:48 UTC (permalink / raw)
  To: andrii.nakryiko; +Cc: daniel, ast, andrii, bpf

On Thu, 2022-02-10 at 14:44 -0800, Andrii Nakryiko wrote:
> > +/* declare the int64_t type to actually be 32-bit to ensure the skeleton
> > + * uses actual sizes and doesn't just copy the type name
> > + */
> > +typedef __s32 int64_t;
> > +int64_t intest64 = -1;
> > +int64_t outtest64 = -1;
> 
> This will be so confusing... But when you drop __s32 special handling
> you can just use __s32 directly, right?

Actually, no. We need the type to have the textual representation of a 64-bit
type (int64_t) but a different underlying size. The test is ensuring that the
skeleton is using the underlying type and not the apparent type. 

You need a layer of typedefs to introduce that confusion and intXX_t is ideal
because it's not normally transitively available in .bpf.c programs.

`typedef signed char int64_t;` would also work, if you prefer that.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons
  2022-02-10 23:14   ` Alexei Starovoitov
@ 2022-02-10 23:59     ` Delyan Kratunov
  2022-02-11  0:17       ` Alexei Starovoitov
  0 siblings, 1 reply; 16+ messages in thread
From: Delyan Kratunov @ 2022-02-10 23:59 UTC (permalink / raw)
  To: Yonghong Song, alexei.starovoitov; +Cc: daniel, ast, andrii, bpf

On Thu, 2022-02-10 at 15:14 -0800, Alexei Starovoitov wrote:
> So adding
> _Static_assert(sizeof(type_in_skel) == const_size_from_kernel);
> to skel.h would force users to pick types that are the same
> both in bpf prog and in corresponding user space part.

I'm not sure users have this much control over the definition of the types in
their program though. If a kernel and uapi typedef differ in size, this approach
would force the user to use kernel types for the entire program.

If, for example, pid_t is a different size in glibc and the kernel, it would
force you out of using any glibc functions taking pid_t (and potentially all of
glibc depending on how entangled the headers are).

By normalizing to stdint types, we're saying that the contract represented by
the skel does not operate with either uapi or kernel types and it's up to you to
ensure you use the right one (or mix and match, if you can). It feels
fundamentally more permissive to different types of situations than forcing the
skel user to only use kernel types or refactor their program to isolate the
kernel type leakage to only the compilation unit using the skel.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons
  2022-02-10 23:59     ` Delyan Kratunov
@ 2022-02-11  0:17       ` Alexei Starovoitov
  0 siblings, 0 replies; 16+ messages in thread
From: Alexei Starovoitov @ 2022-02-11  0:17 UTC (permalink / raw)
  To: Delyan Kratunov; +Cc: Yonghong Song, daniel, ast, andrii, bpf

On Thu, Feb 10, 2022 at 3:59 PM Delyan Kratunov <delyank@fb.com> wrote:
>
> On Thu, 2022-02-10 at 15:14 -0800, Alexei Starovoitov wrote:
> > So adding
> > _Static_assert(sizeof(type_in_skel) == const_size_from_kernel);
> > to skel.h would force users to pick types that are the same
> > both in bpf prog and in corresponding user space part.
>
> I'm not sure users have this much control over the definition of the types in
> their program though. If a kernel and uapi typedef differ in size, this approach
> would force the user to use kernel types for the entire program.
>
> If, for example, pid_t is a different size in glibc and the kernel, it would
> force you out of using any glibc functions taking pid_t (and potentially all of
> glibc depending on how entangled the headers are).

pid_t is a great example, since its meaning is different
in kernel and in user space.
One is thread and another is process.

static_assert will catch such issues and will force
users to pick u64 and think through the type
conversions. In kernel from pid_t -> u64 in bpf prog,
and in user space pid_t -> u64 in skel.
That would be an explicit action by users that hopefully
make them think about the differences in size and semantics.

> By normalizing to stdint types, we're saying that the contract represented by
> the skel does not operate with either uapi or kernel types and it's up to you to
> ensure you use the right one (or mix and match, if you can). It feels
> fundamentally more permissive to different types of situations than forcing the
> skel user to only use kernel types or refactor their program to isolate the
> kernel type leakage to only the compilation unit using the skel.

static_assert will force users to use compatible types, not kernel types.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-02-11  0:17 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-10  0:36 [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons Delyan Kratunov
2022-02-10  0:36 ` [PATCH bpf-next v1 3/3] selftests/bpf: add test case for userspace and bpf type size mismatch Delyan Kratunov
2022-02-10 22:44   ` Andrii Nakryiko
2022-02-10 23:48     ` Delyan Kratunov
2022-02-10  0:36 ` [PATCH bpf-next v1 2/3] bpftool: skeleton uses explicitly sized ints Delyan Kratunov
2022-02-10 22:42   ` Andrii Nakryiko
2022-02-10 23:45     ` Delyan Kratunov
2022-02-10  0:36 ` [PATCH bpf-next v1 1/3] libbpf: btf_dump can produce " Delyan Kratunov
2022-02-10 22:35   ` Andrii Nakryiko
2022-02-10 23:40     ` Delyan Kratunov
2022-02-10 22:18 ` [PATCH bpf-next v1 0/3] Avoid typedef size mismatches in skeletons Yonghong Song
2022-02-10 22:48   ` Andrii Nakryiko
2022-02-10 22:51 ` Yonghong Song
2022-02-10 23:14   ` Alexei Starovoitov
2022-02-10 23:59     ` Delyan Kratunov
2022-02-11  0:17       ` Alexei Starovoitov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.