bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH bpf-next 0/6] Add libbpf-provided extern variables support
@ 2019-11-17  7:08 Andrii Nakryiko
  2019-11-17  7:08 ` [PATCH bpf-next 1/6] selftests/bpf: ensure no DWARF relocations for BPF object files Andrii Nakryiko
                   ` (6 more replies)
  0 siblings, 7 replies; 20+ messages in thread
From: Andrii Nakryiko @ 2019-11-17  7:08 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

It's often important for BPF program to know kernel version or some specific
config values (e.g., CONFIG_HZ to convert jiffies to seconds) and change or
adjust program logic based on their values. As of today, any such need has to
be resolved by recompiling BPF program for specific kernel and kernel
configuration. In practice this is usually achieved by using BCC and its
embedded LLVM/Clang. With such set up #ifdef CONFIG_XXX and similar
compile-time constructs allow to deal with kernel varieties.

With CO-RE (Compile Once – Run Everywhere) approach, this is not an option,
unfortunately. All such logic variations have to be done as a normal
C language constructs (i.e., if/else, variables, etc), not a preprocessor
directives. This patch series add support for such advanced scenarios through
C extern variables. These extern variables will be recognized by libbpf and
supplied through extra .extern internal map, similarly to global data. This
.extern map is read-only, which allows BPF verifier to track its content
precisely as constants. That gives an opportunity to have pre-compiled BPF
program, which can potentially use BPF functionality (e.g., BPF helpers) or
kernel features (types, fields, etc), that are available only on a subset of
targeted kernels, while effectively eleminating (through verifier's dead code
detection) such unsupported functionality for other kernels (typically, older
versions). Patch #5 contains all the details. Patch #5 explicitly tests
a scenario of using unsupported BPF helper, to validate the approach.

As part of this patch set, libbpf also allows usage of initialized global
(non-static) variables, which provides better Clang semantics, which is closer
and better aligned witht kernel vs userspace BPF map contents sharing.

Outline of the patch set:
- patches #1-#3 do some preliminary refactorings of libbpf relocation logic
  and some more clean ups;
- patch #4 allows non-static variables and converts few tests to use them;
- patch #5 adds support for externs to libbpf;
- patch #6 adds tests for externs.

Andrii Nakryiko (6):
  selftests/bpf: ensure no DWARF relocations for BPF object files
  libbpf: refactor relocation handling
  libbpf: fix various errors and warning reported by checkpatch.pl
  libbpf: support initialized global variables
  libbpf: support libbpf-provided extern variables
  selftests/bpf: add tests for libbpf-provided externs

 tools/lib/bpf/Makefile                        |  17 +-
 tools/lib/bpf/libbpf.c                        | 758 +++++++++++++-----
 tools/lib/bpf/libbpf.h                        |   8 +-
 tools/testing/selftests/bpf/Makefile          |   4 +-
 .../selftests/bpf/prog_tests/core_extern.c    | 186 +++++
 .../selftests/bpf/progs/test_core_extern.c    |  43 +
 .../bpf/progs/test_core_reloc_arrays.c        |   4 +-
 .../progs/test_core_reloc_bitfields_direct.c  |   4 +-
 .../progs/test_core_reloc_bitfields_probed.c  |   4 +-
 .../bpf/progs/test_core_reloc_existence.c     |   4 +-
 .../bpf/progs/test_core_reloc_flavors.c       |   4 +-
 .../bpf/progs/test_core_reloc_ints.c          |   4 +-
 .../bpf/progs/test_core_reloc_kernel.c        |   4 +-
 .../bpf/progs/test_core_reloc_misc.c          |   4 +-
 .../bpf/progs/test_core_reloc_mods.c          |   4 +-
 .../bpf/progs/test_core_reloc_nesting.c       |   4 +-
 .../bpf/progs/test_core_reloc_primitives.c    |   4 +-
 .../bpf/progs/test_core_reloc_ptr_as_arr.c    |   4 +-
 .../bpf/progs/test_core_reloc_size.c          |   4 +-
 19 files changed, 820 insertions(+), 248 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/core_extern.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_core_extern.c

-- 
2.17.1


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH bpf-next 1/6] selftests/bpf: ensure no DWARF relocations for BPF object files
  2019-11-17  7:08 [PATCH bpf-next 0/6] Add libbpf-provided extern variables support Andrii Nakryiko
@ 2019-11-17  7:08 ` Andrii Nakryiko
  2019-11-17  7:08 ` [PATCH bpf-next 2/6] libbpf: refactor relocation handling Andrii Nakryiko
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Andrii Nakryiko @ 2019-11-17  7:08 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Add -mattr=dwarfris attribute to llc to avoid having relocations against DWARF
data. These relocations make it impossible to inspect DWARF contents: all
strings are invalid.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/testing/selftests/bpf/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index b03dc2298fea..5737888cd6a7 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -161,7 +161,7 @@ $(OUTPUT)/flow_dissector_load.o: flow_dissector_load.h
 define CLANG_BPF_BUILD_RULE
 	($(CLANG) $3 -O2 -target bpf -emit-llvm				\
 		-c $1 -o - || echo "BPF obj compilation failed") | 	\
-	$(LLC) -march=bpf -mcpu=probe $4 -filetype=obj -o $2
+	$(LLC) -mattr=dwarfris -march=bpf -mcpu=probe $4 -filetype=obj -o $2
 endef
 # Similar to CLANG_BPF_BUILD_RULE, but using native Clang and bpf LLC
 define CLANG_NATIVE_BPF_BUILD_RULE
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next 2/6] libbpf: refactor relocation handling
  2019-11-17  7:08 [PATCH bpf-next 0/6] Add libbpf-provided extern variables support Andrii Nakryiko
  2019-11-17  7:08 ` [PATCH bpf-next 1/6] selftests/bpf: ensure no DWARF relocations for BPF object files Andrii Nakryiko
@ 2019-11-17  7:08 ` Andrii Nakryiko
  2019-11-17  7:08 ` [PATCH bpf-next 3/6] libbpf: fix various errors and warning reported by checkpatch.pl Andrii Nakryiko
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Andrii Nakryiko @ 2019-11-17  7:08 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Relocation handling code is convoluted and unnecessarily deeply nested. Split
out per-relocation logic into separate function. Also refactor the logic to be
more a sequence of per-relocation type checks and processing steps, making it
simpler to follow control flow. This makes it easier to further extends it to
new kinds of relocations (e.g., support for extern variables).

This patch also makes relocation's section verification more robust.
Previously relocations against not yet supported externs were silently ignored
because of obj->efile.text_shndx was zero, when all BPF programs had custom
section names and there was no .text section. Also, invalid LDIMM64 relocations
against non-map sections were passed through, if they were pointing to a .text
section (or 0, which is invalid section). All these bugs are fixed within this
refactoring and checks are made more appropriate for each type of relocation.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/lib/bpf/libbpf.c | 256 ++++++++++++++++++++++-------------------
 1 file changed, 140 insertions(+), 116 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 15e91a1d6c11..0fdca01f7e57 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -276,8 +276,8 @@ struct bpf_object {
 		struct {
 			GElf_Shdr shdr;
 			Elf_Data *data;
-		} *reloc;
-		int nr_reloc;
+		} *reloc_sects;
+		int nr_reloc_sects;
 		int maps_shndx;
 		int btf_maps_shndx;
 		int text_shndx;
@@ -575,8 +575,8 @@ static void bpf_object__elf_finish(struct bpf_object *obj)
 	obj->efile.rodata = NULL;
 	obj->efile.bss = NULL;
 
-	zfree(&obj->efile.reloc);
-	obj->efile.nr_reloc = 0;
+	zfree(&obj->efile.reloc_sects);
+	obj->efile.nr_reloc_sects = 0;
 	zclose(obj->efile.fd);
 	obj->efile.obj_buf = NULL;
 	obj->efile.obj_buf_sz = 0;
@@ -1693,8 +1693,8 @@ static int bpf_object__elf_collect(struct bpf_object *obj, bool relaxed_maps,
 				pr_debug("skip section(%d) %s\n", idx, name);
 			}
 		} else if (sh.sh_type == SHT_REL) {
-			int nr_reloc = obj->efile.nr_reloc;
-			void *reloc = obj->efile.reloc;
+			int nr_sects = obj->efile.nr_reloc_sects;
+			void *sects = obj->efile.reloc_sects;
 			int sec = sh.sh_info; /* points to other section */
 
 			/* Only do relo for section with exec instructions */
@@ -1704,18 +1704,18 @@ static int bpf_object__elf_collect(struct bpf_object *obj, bool relaxed_maps,
 				continue;
 			}
 
-			reloc = reallocarray(reloc, nr_reloc + 1,
-					     sizeof(*obj->efile.reloc));
-			if (!reloc) {
-				pr_warn("realloc failed\n");
+			sects = reallocarray(sects, nr_sects + 1,
+					     sizeof(*obj->efile.reloc_sects));
+			if (!sects) {
+				pr_warn("reloc_sects realloc failed\n");
 				return -ENOMEM;
 			}
 
-			obj->efile.reloc = reloc;
-			obj->efile.nr_reloc++;
+			obj->efile.reloc_sects = sects;
+			obj->efile.nr_reloc_sects++;
 
-			obj->efile.reloc[nr_reloc].shdr = sh;
-			obj->efile.reloc[nr_reloc].data = data;
+			obj->efile.reloc_sects[nr_sects].shdr = sh;
+			obj->efile.reloc_sects[nr_sects].data = data;
 		} else if (sh.sh_type == SHT_NOBITS && strcmp(name, ".bss") == 0) {
 			obj->efile.bss = data;
 			obj->efile.bss_shndx = idx;
@@ -1780,14 +1780,6 @@ static bool bpf_object__shndx_is_maps(const struct bpf_object *obj,
 	       shndx == obj->efile.btf_maps_shndx;
 }
 
-static bool bpf_object__relo_in_known_section(const struct bpf_object *obj,
-					      int shndx)
-{
-	return shndx == obj->efile.text_shndx ||
-	       bpf_object__shndx_is_maps(obj, shndx) ||
-	       bpf_object__shndx_is_data(obj, shndx);
-}
-
 static enum libbpf_map_type
 bpf_object__section_to_libbpf_map_type(const struct bpf_object *obj, int shndx)
 {
@@ -1801,14 +1793,120 @@ bpf_object__section_to_libbpf_map_type(const struct bpf_object *obj, int shndx)
 		return LIBBPF_MAP_UNSPEC;
 }
 
+static int bpf_program__record_reloc(struct bpf_program *prog,
+				     struct reloc_desc *reloc_desc,
+				     __u32 insn_idx, const char *name,
+				     const GElf_Sym *sym, const GElf_Rel *rel)
+{
+	struct bpf_insn *insn = &prog->insns[insn_idx];
+	size_t map_idx, nr_maps = prog->obj->nr_maps;
+	struct bpf_object *obj = prog->obj;
+	__u32 shdr_idx = sym->st_shndx;
+	enum libbpf_map_type type;
+	struct bpf_map *map;
+
+	/* sub-program call relocation */
+	if (insn->code == (BPF_JMP | BPF_CALL)) {
+		if (insn->src_reg != BPF_PSEUDO_CALL) {
+			pr_warn("incorrect bpf_call opcode\n");
+			return -LIBBPF_ERRNO__RELOC;
+		}
+		/* text_shndx can be 0, if no default "main" program exists */
+		if (!shdr_idx || shdr_idx != obj->efile.text_shndx) {
+			pr_warn("bad call relo against section %u\n", shdr_idx);
+			return -LIBBPF_ERRNO__RELOC;
+		}
+		reloc_desc->type = RELO_CALL;
+		reloc_desc->insn_idx = insn_idx;
+		reloc_desc->text_off = sym->st_value;
+		obj->has_pseudo_calls = true;
+		return 0;
+	}
+
+	if (insn->code != (BPF_LD | BPF_IMM | BPF_DW)) {
+		pr_warn("bpf: relocation: invalid relo for insns[%d].code 0x%x\n",
+			insn_idx, insn->code);
+		return -LIBBPF_ERRNO__RELOC;
+	}
+	if (!shdr_idx || shdr_idx >= SHN_LORESERVE) {
+		pr_warn("relocation: not yet supported relo for non-static global \'%s\' variable in special section (0x%x) found in insns[%d].code 0x%x\n",
+			name, shdr_idx, insn_idx, insn->code);
+		return -LIBBPF_ERRNO__RELOC;
+	}
+
+	type = bpf_object__section_to_libbpf_map_type(obj, shdr_idx);
+
+	/* generic map reference relocation */
+	if (type == LIBBPF_MAP_UNSPEC) {
+		if (!bpf_object__shndx_is_maps(obj, shdr_idx)) {
+			pr_warn("bad map relo against section %u\n",
+				shdr_idx);
+			return -LIBBPF_ERRNO__RELOC;
+		}
+		for (map_idx = 0; map_idx < nr_maps; map_idx++) {
+			map = &obj->maps[map_idx];
+			if (map->libbpf_type != type ||
+			    map->sec_idx != sym->st_shndx ||
+			    map->sec_offset != sym->st_value)
+				continue;
+			pr_debug("found map %zd (%s, sec %d, off %zu) for insn %u\n",
+				 map_idx, map->name, map->sec_idx,
+				 map->sec_offset, insn_idx);
+			break;
+		}
+		if (map_idx >= nr_maps) {
+			pr_warn("map relo failed to find map for sec %u, off %llu\n",
+				shdr_idx, (__u64)sym->st_value);
+			return -LIBBPF_ERRNO__RELOC;
+		}
+		reloc_desc->type = RELO_LD64;
+		reloc_desc->insn_idx = insn_idx;
+		reloc_desc->map_idx = map_idx;
+		return 0;
+	}
+
+	/* global data map relocation */
+	if (!bpf_object__shndx_is_data(obj, shdr_idx)) {
+		pr_warn("bad data relo against section %u\n", shdr_idx);
+		return -LIBBPF_ERRNO__RELOC;
+	}
+	if (GELF_ST_BIND(sym->st_info) == STB_GLOBAL) {
+		pr_warn("relocation: not yet supported relo for non-static global \'%s\' variable found in insns[%d].code 0x%x\n",
+			name, insn_idx, insn->code);
+		return -LIBBPF_ERRNO__RELOC;
+	}
+	if (!obj->caps.global_data) {
+		pr_warn("relocation: kernel does not support global \'%s\' variable access in insns[%d]\n",
+			name, insn_idx);
+		return -LIBBPF_ERRNO__RELOC;
+	}
+	for (map_idx = 0; map_idx < nr_maps; map_idx++) {
+		map = &obj->maps[map_idx];
+		if (map->libbpf_type != type)
+			continue;
+		pr_debug("found data map %zd (%s, sec %d, off %zu) for insn %u\n",
+			 map_idx, map->name, map->sec_idx, map->sec_offset,
+			 insn_idx);
+		break;
+	}
+	if (map_idx >= nr_maps) {
+		pr_warn("data relo failed to find map for sec %u\n",
+			shdr_idx);
+		return -LIBBPF_ERRNO__RELOC;
+	}
+
+	reloc_desc->type = RELO_DATA;
+	reloc_desc->insn_idx = insn_idx;
+	reloc_desc->map_idx = map_idx;
+	return 0;
+}
+
 static int
 bpf_program__collect_reloc(struct bpf_program *prog, GElf_Shdr *shdr,
 			   Elf_Data *data, struct bpf_object *obj)
 {
 	Elf_Data *symbols = obj->efile.symbols;
-	struct bpf_map *maps = obj->maps;
-	size_t nr_maps = obj->nr_maps;
-	int i, nrels;
+	int err, i, nrels;
 
 	pr_debug("collecting relocating info for: '%s'\n", prog->section_name);
 	nrels = shdr->sh_size / shdr->sh_entsize;
@@ -1821,12 +1919,8 @@ bpf_program__collect_reloc(struct bpf_program *prog, GElf_Shdr *shdr,
 	prog->nr_reloc = nrels;
 
 	for (i = 0; i < nrels; i++) {
-		struct bpf_insn *insns = prog->insns;
-		enum libbpf_map_type type;
-		unsigned int insn_idx;
-		unsigned int shdr_idx;
 		const char *name;
-		size_t map_idx;
+		__u32 insn_idx;
 		GElf_Sym sym;
 		GElf_Rel rel;
 
@@ -1834,97 +1928,27 @@ bpf_program__collect_reloc(struct bpf_program *prog, GElf_Shdr *shdr,
 			pr_warn("relocation: failed to get %d reloc\n", i);
 			return -LIBBPF_ERRNO__FORMAT;
 		}
-
 		if (!gelf_getsym(symbols, GELF_R_SYM(rel.r_info), &sym)) {
 			pr_warn("relocation: symbol %"PRIx64" not found\n",
 				GELF_R_SYM(rel.r_info));
 			return -LIBBPF_ERRNO__FORMAT;
 		}
+		if (rel.r_offset % sizeof(struct bpf_insn))
+			return -LIBBPF_ERRNO__FORMAT;
 
+		insn_idx = rel.r_offset / sizeof(struct bpf_insn);
 		name = elf_strptr(obj->efile.elf, obj->efile.strtabidx,
 				  sym.st_name) ? : "<?>";
-
-		pr_debug("relo for %lld value %lld name %d (\'%s\')\n",
-			 (long long) (rel.r_info >> 32),
-			 (long long) sym.st_value, sym.st_name, name);
-
-		shdr_idx = sym.st_shndx;
-		insn_idx = rel.r_offset / sizeof(struct bpf_insn);
-		pr_debug("relocation: insn_idx=%u, shdr_idx=%u\n",
-			 insn_idx, shdr_idx);
-
-		if (shdr_idx >= SHN_LORESERVE) {
-			pr_warn("relocation: not yet supported relo for non-static global \'%s\' variable in special section (0x%x) found in insns[%d].code 0x%x\n",
-				name, shdr_idx, insn_idx,
-				insns[insn_idx].code);
-			return -LIBBPF_ERRNO__RELOC;
-		}
-		if (!bpf_object__relo_in_known_section(obj, shdr_idx)) {
-			pr_warn("Program '%s' contains unrecognized relo data pointing to section %u\n",
-				prog->section_name, shdr_idx);
-			return -LIBBPF_ERRNO__RELOC;
-		}
-
-		if (insns[insn_idx].code == (BPF_JMP | BPF_CALL)) {
-			if (insns[insn_idx].src_reg != BPF_PSEUDO_CALL) {
-				pr_warn("incorrect bpf_call opcode\n");
-				return -LIBBPF_ERRNO__RELOC;
-			}
-			prog->reloc_desc[i].type = RELO_CALL;
-			prog->reloc_desc[i].insn_idx = insn_idx;
-			prog->reloc_desc[i].text_off = sym.st_value;
-			obj->has_pseudo_calls = true;
-			continue;
-		}
-
-		if (insns[insn_idx].code != (BPF_LD | BPF_IMM | BPF_DW)) {
-			pr_warn("bpf: relocation: invalid relo for insns[%d].code 0x%x\n",
-				insn_idx, insns[insn_idx].code);
-			return -LIBBPF_ERRNO__RELOC;
-		}
-
-		if (bpf_object__shndx_is_maps(obj, shdr_idx) ||
-		    bpf_object__shndx_is_data(obj, shdr_idx)) {
-			type = bpf_object__section_to_libbpf_map_type(obj, shdr_idx);
-			if (type != LIBBPF_MAP_UNSPEC) {
-				if (GELF_ST_BIND(sym.st_info) == STB_GLOBAL) {
-					pr_warn("bpf: relocation: not yet supported relo for non-static global \'%s\' variable found in insns[%d].code 0x%x\n",
-						name, insn_idx, insns[insn_idx].code);
-					return -LIBBPF_ERRNO__RELOC;
-				}
-				if (!obj->caps.global_data) {
-					pr_warn("bpf: relocation: kernel does not support global \'%s\' variable access in insns[%d]\n",
-						name, insn_idx);
-					return -LIBBPF_ERRNO__RELOC;
-				}
-			}
-
-			for (map_idx = 0; map_idx < nr_maps; map_idx++) {
-				if (maps[map_idx].libbpf_type != type)
-					continue;
-				if (type != LIBBPF_MAP_UNSPEC ||
-				    (maps[map_idx].sec_idx == sym.st_shndx &&
-				     maps[map_idx].sec_offset == sym.st_value)) {
-					pr_debug("relocation: found map %zd (%s, sec_idx %d, offset %zu) for insn %u\n",
-						 map_idx, maps[map_idx].name,
-						 maps[map_idx].sec_idx,
-						 maps[map_idx].sec_offset,
-						 insn_idx);
-					break;
-				}
-			}
-
-			if (map_idx >= nr_maps) {
-				pr_warn("bpf relocation: map_idx %d larger than %d\n",
-					(int)map_idx, (int)nr_maps - 1);
-				return -LIBBPF_ERRNO__RELOC;
-			}
-
-			prog->reloc_desc[i].type = type != LIBBPF_MAP_UNSPEC ?
-						   RELO_DATA : RELO_LD64;
-			prog->reloc_desc[i].insn_idx = insn_idx;
-			prog->reloc_desc[i].map_idx = map_idx;
-		}
+		pr_debug("relo for shdr %u, symb %llu, value %llu, type %d, bind %d, name %d (\'%s\'), insn %u\n",
+			 (__u32)sym.st_shndx, (__u64)GELF_R_SYM(rel.r_info),
+			 (__u64)sym.st_value, GELF_ST_TYPE(sym.st_info),
+			 GELF_ST_BIND(sym.st_info), sym.st_name, name,
+			 insn_idx);
+
+		err = bpf_program__record_reloc(prog, &prog->reloc_desc[i],
+						insn_idx, name, &sym, &rel);
+		if (err)
+			return err;
 	}
 	return 0;
 }
@@ -3667,9 +3691,9 @@ static int bpf_object__collect_reloc(struct bpf_object *obj)
 		return -LIBBPF_ERRNO__INTERNAL;
 	}
 
-	for (i = 0; i < obj->efile.nr_reloc; i++) {
-		GElf_Shdr *shdr = &obj->efile.reloc[i].shdr;
-		Elf_Data *data = obj->efile.reloc[i].data;
+	for (i = 0; i < obj->efile.nr_reloc_sects; i++) {
+		GElf_Shdr *shdr = &obj->efile.reloc_sects[i].shdr;
+		Elf_Data *data = obj->efile.reloc_sects[i].data;
 		int idx = shdr->sh_info;
 		struct bpf_program *prog;
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next 3/6] libbpf: fix various errors and warning reported by checkpatch.pl
  2019-11-17  7:08 [PATCH bpf-next 0/6] Add libbpf-provided extern variables support Andrii Nakryiko
  2019-11-17  7:08 ` [PATCH bpf-next 1/6] selftests/bpf: ensure no DWARF relocations for BPF object files Andrii Nakryiko
  2019-11-17  7:08 ` [PATCH bpf-next 2/6] libbpf: refactor relocation handling Andrii Nakryiko
@ 2019-11-17  7:08 ` Andrii Nakryiko
  2019-11-17  7:08 ` [PATCH bpf-next 4/6] libbpf: support initialized global variables Andrii Nakryiko
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Andrii Nakryiko @ 2019-11-17  7:08 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Fix a bunch of warnings and errors reported by checkpatch.pl, to make it
easier to spot new problems.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/lib/bpf/libbpf.c | 38 +++++++++++++++++++++-----------------
 1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 0fdca01f7e57..f6232aeefb9e 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -105,7 +105,7 @@ void libbpf_print(enum libbpf_print_level level, const char *format, ...)
 	err = action;			\
 	if (err)			\
 		goto out;		\
-} while(0)
+} while (0)
 
 
 /* Copied from tools/perf/util/util.h */
@@ -965,8 +965,7 @@ static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict)
 		 obj->path, nr_maps, data->d_size);
 
 	if (!data->d_size || nr_maps == 0 || (data->d_size % nr_maps) != 0) {
-		pr_warn("unable to determine map definition size "
-			"section %s, %d maps in %zd bytes\n",
+		pr_warn("unable to determine map definition size section %s, %d maps in %zd bytes\n",
 			obj->path, nr_maps, data->d_size);
 		return -EINVAL;
 	}
@@ -1030,12 +1029,11 @@ static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict)
 			 * incompatible.
 			 */
 			char *b;
+
 			for (b = ((char *)def) + sizeof(struct bpf_map_def);
 			     b < ((char *)def) + map_def_sz; b++) {
 				if (*b != 0) {
-					pr_warn("maps section in %s: \"%s\" "
-						"has unrecognized, non-zero "
-						"options\n",
+					pr_warn("maps section in %s: \"%s\" has unrecognized, non-zero options\n",
 						obj->path, map_name);
 					if (strict)
 						return -EINVAL;
@@ -1073,7 +1071,8 @@ skip_mods_and_typedefs(const struct btf *btf, __u32 id, __u32 *res_id)
  */
 static bool get_map_field_int(const char *map_name, const struct btf *btf,
 			      const struct btf_type *def,
-			      const struct btf_member *m, __u32 *res) {
+			      const struct btf_member *m, __u32 *res)
+{
 	const struct btf_type *t = skip_mods_and_typedefs(btf, m->type, NULL);
 	const char *name = btf__name_by_offset(btf, m->name_off);
 	const struct btf_array *arr_info;
@@ -1387,7 +1386,8 @@ static int bpf_object__init_user_btf_maps(struct bpf_object *obj, bool strict,
 	for (i = 0; i < vlen; i++) {
 		err = bpf_object__init_user_btf_map(obj, sec, i,
 						    obj->efile.btf_maps_shndx,
-						    data, strict, pin_root_path);
+						    data, strict,
+						    pin_root_path);
 		if (err)
 			return err;
 	}
@@ -1673,12 +1673,14 @@ static int bpf_object__elf_collect(struct bpf_object *obj, bool relaxed_maps,
 				if (strcmp(name, ".text") == 0)
 					obj->efile.text_shndx = idx;
 				err = bpf_object__add_program(obj, data->d_buf,
-							      data->d_size, name, idx);
+							      data->d_size,
+							      name, idx);
 				if (err) {
 					char errmsg[STRERR_BUFSIZE];
-					char *cp = libbpf_strerror_r(-err, errmsg,
-								     sizeof(errmsg));
+					char *cp;
 
+					cp = libbpf_strerror_r(-err, errmsg,
+							       sizeof(errmsg));
 					pr_warn("failed to alloc program %s (%s): %s",
 						name, obj->path, cp);
 					return err;
@@ -1824,7 +1826,7 @@ static int bpf_program__record_reloc(struct bpf_program *prog,
 	}
 
 	if (insn->code != (BPF_LD | BPF_IMM | BPF_DW)) {
-		pr_warn("bpf: relocation: invalid relo for insns[%d].code 0x%x\n",
+		pr_warn("invalid relo for insns[%d].code 0x%x\n",
 			insn_idx, insn->code);
 		return -LIBBPF_ERRNO__RELOC;
 	}
@@ -2140,7 +2142,7 @@ bpf_object__probe_global_data(struct bpf_object *obj)
 
 static int bpf_object__probe_btf_func(struct bpf_object *obj)
 {
-	const char strs[] = "\0int\0x\0a";
+	static const char strs[] = "\0int\0x\0a";
 	/* void x(int a) {} */
 	__u32 types[] = {
 		/* int */
@@ -2166,7 +2168,7 @@ static int bpf_object__probe_btf_func(struct bpf_object *obj)
 
 static int bpf_object__probe_btf_datasec(struct bpf_object *obj)
 {
-	const char strs[] = "\0x\0.data";
+	static const char strs[] = "\0x\0.data";
 	/* static int a; */
 	__u32 types[] = {
 		/* int */
@@ -5107,7 +5109,7 @@ int libbpf_prog_type_by_name(const char *name, enum bpf_prog_type *prog_type,
 		*expected_attach_type = section_names[i].expected_attach_type;
 		return 0;
 	}
-	pr_warn("failed to guess program type based on ELF section name '%s'\n", name);
+	pr_warn("failed to guess program type from ELF section '%s'\n", name);
 	type_names = libbpf_get_type_names(false);
 	if (type_names != NULL) {
 		pr_info("supported section(type) names are:%s\n", type_names);
@@ -6333,7 +6335,8 @@ static struct bpf_prog_info_array_desc bpf_prog_info_array_desc[] = {
 
 };
 
-static __u32 bpf_prog_info_read_offset_u32(struct bpf_prog_info *info, int offset)
+static __u32 bpf_prog_info_read_offset_u32(struct bpf_prog_info *info,
+					   int offset)
 {
 	__u32 *array = (__u32 *)info;
 
@@ -6342,7 +6345,8 @@ static __u32 bpf_prog_info_read_offset_u32(struct bpf_prog_info *info, int offse
 	return -(int)offset;
 }
 
-static __u64 bpf_prog_info_read_offset_u64(struct bpf_prog_info *info, int offset)
+static __u64 bpf_prog_info_read_offset_u64(struct bpf_prog_info *info,
+					   int offset)
 {
 	__u64 *array = (__u64 *)info;
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next 4/6] libbpf: support initialized global variables
  2019-11-17  7:08 [PATCH bpf-next 0/6] Add libbpf-provided extern variables support Andrii Nakryiko
                   ` (2 preceding siblings ...)
  2019-11-17  7:08 ` [PATCH bpf-next 3/6] libbpf: fix various errors and warning reported by checkpatch.pl Andrii Nakryiko
@ 2019-11-17  7:08 ` Andrii Nakryiko
  2019-11-17  7:08 ` [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables Andrii Nakryiko
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 20+ messages in thread
From: Andrii Nakryiko @ 2019-11-17  7:08 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Initialized global variables are no different in ELF from static variables,
and don't require any extra support from libbpf. But they are matching
semantics of global data (backed by BPF maps) more closely, preventing
LLVM/Clang from aggressively inlining constant values and not requiring
volatile incantations to prevent those. This patch enables global variables.
It still disables uninitialized variables, which will be put into special COM
(common) ELF section, because BPF doesn't allow uninitialized data to be
accessed.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/lib/bpf/libbpf.c                                   | 9 ++-------
 .../testing/selftests/bpf/progs/test_core_reloc_arrays.c | 4 ++--
 .../bpf/progs/test_core_reloc_bitfields_direct.c         | 4 ++--
 .../bpf/progs/test_core_reloc_bitfields_probed.c         | 4 ++--
 .../selftests/bpf/progs/test_core_reloc_existence.c      | 4 ++--
 .../selftests/bpf/progs/test_core_reloc_flavors.c        | 4 ++--
 tools/testing/selftests/bpf/progs/test_core_reloc_ints.c | 4 ++--
 .../testing/selftests/bpf/progs/test_core_reloc_kernel.c | 4 ++--
 tools/testing/selftests/bpf/progs/test_core_reloc_misc.c | 4 ++--
 tools/testing/selftests/bpf/progs/test_core_reloc_mods.c | 4 ++--
 .../selftests/bpf/progs/test_core_reloc_nesting.c        | 4 ++--
 .../selftests/bpf/progs/test_core_reloc_primitives.c     | 4 ++--
 .../selftests/bpf/progs/test_core_reloc_ptr_as_arr.c     | 4 ++--
 tools/testing/selftests/bpf/progs/test_core_reloc_size.c | 4 ++--
 14 files changed, 28 insertions(+), 33 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index f6232aeefb9e..639c2e17df0b 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -1831,8 +1831,8 @@ static int bpf_program__record_reloc(struct bpf_program *prog,
 		return -LIBBPF_ERRNO__RELOC;
 	}
 	if (!shdr_idx || shdr_idx >= SHN_LORESERVE) {
-		pr_warn("relocation: not yet supported relo for non-static global \'%s\' variable in special section (0x%x) found in insns[%d].code 0x%x\n",
-			name, shdr_idx, insn_idx, insn->code);
+		pr_warn("invalid relo for \'%s\' in special section 0x%x; forgot to initialize global var?..\n",
+			name, shdr_idx);
 		return -LIBBPF_ERRNO__RELOC;
 	}
 
@@ -1872,11 +1872,6 @@ static int bpf_program__record_reloc(struct bpf_program *prog,
 		pr_warn("bad data relo against section %u\n", shdr_idx);
 		return -LIBBPF_ERRNO__RELOC;
 	}
-	if (GELF_ST_BIND(sym->st_info) == STB_GLOBAL) {
-		pr_warn("relocation: not yet supported relo for non-static global \'%s\' variable found in insns[%d].code 0x%x\n",
-			name, insn_idx, insn->code);
-		return -LIBBPF_ERRNO__RELOC;
-	}
 	if (!obj->caps.global_data) {
 		pr_warn("relocation: kernel does not support global \'%s\' variable access in insns[%d]\n",
 			name, insn_idx);
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_arrays.c b/tools/testing/selftests/bpf/progs/test_core_reloc_arrays.c
index 96b1f5f3b07a..89951b684282 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_arrays.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_arrays.c
@@ -8,10 +8,10 @@
 
 char _license[] SEC("license") = "GPL";
 
-static volatile struct data {
+struct {
 	char in[256];
 	char out[256];
-} data;
+} data = {};
 
 struct core_reloc_arrays_output {
 	int a2;
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_bitfields_direct.c b/tools/testing/selftests/bpf/progs/test_core_reloc_bitfields_direct.c
index 738b34b72655..edc0f7c9e56d 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_bitfields_direct.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_bitfields_direct.c
@@ -8,10 +8,10 @@
 
 char _license[] SEC("license") = "GPL";
 
-static volatile struct data {
+struct {
 	char in[256];
 	char out[256];
-} data;
+} data = {};
 
 struct core_reloc_bitfields {
 	/* unsigned bitfields */
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_bitfields_probed.c b/tools/testing/selftests/bpf/progs/test_core_reloc_bitfields_probed.c
index e466e3ab7de4..6c20e433558b 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_bitfields_probed.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_bitfields_probed.c
@@ -8,10 +8,10 @@
 
 char _license[] SEC("license") = "GPL";
 
-static volatile struct data {
+struct {
 	char in[256];
 	char out[256];
-} data;
+} data = {};
 
 struct core_reloc_bitfields {
 	/* unsigned bitfields */
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_existence.c b/tools/testing/selftests/bpf/progs/test_core_reloc_existence.c
index c3cac95a19f1..1b7f0ae49cfb 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_existence.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_existence.c
@@ -8,10 +8,10 @@
 
 char _license[] SEC("license") = "GPL";
 
-static volatile struct data {
+struct {
 	char in[256];
 	char out[256];
-} data;
+} data = {};
 
 struct core_reloc_existence_output {
 	int a_exists;
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_flavors.c b/tools/testing/selftests/bpf/progs/test_core_reloc_flavors.c
index 71fd7cebc9d7..b5dbeef540fd 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_flavors.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_flavors.c
@@ -8,10 +8,10 @@
 
 char _license[] SEC("license") = "GPL";
 
-static volatile struct data {
+struct {
 	char in[256];
 	char out[256];
-} data;
+} data = {};
 
 struct core_reloc_flavors {
 	int a;
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_ints.c b/tools/testing/selftests/bpf/progs/test_core_reloc_ints.c
index ad5c3f59c9c6..c78ab6d28a14 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_ints.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_ints.c
@@ -8,10 +8,10 @@
 
 char _license[] SEC("license") = "GPL";
 
-static volatile struct data {
+struct {
 	char in[256];
 	char out[256];
-} data;
+} data = {};
 
 struct core_reloc_ints {
 	uint8_t		u8_field;
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_kernel.c b/tools/testing/selftests/bpf/progs/test_core_reloc_kernel.c
index a4b5e0562ed5..5d499ebdc4bd 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_kernel.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_kernel.c
@@ -8,10 +8,10 @@
 
 char _license[] SEC("license") = "GPL";
 
-static volatile struct data {
+struct {
 	char in[256];
 	char out[256];
-} data;
+} data = {};
 
 struct core_reloc_kernel_output {
 	int valid[10];
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_misc.c b/tools/testing/selftests/bpf/progs/test_core_reloc_misc.c
index 1a36b0856653..292a5c4ee76a 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_misc.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_misc.c
@@ -8,10 +8,10 @@
 
 char _license[] SEC("license") = "GPL";
 
-static volatile struct data {
+struct {
 	char in[256];
 	char out[256];
-} data;
+} data = {};
 
 struct core_reloc_misc_output {
 	int a, b, c;
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_mods.c b/tools/testing/selftests/bpf/progs/test_core_reloc_mods.c
index 3199fafede2c..0b28bfacc8fd 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_mods.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_mods.c
@@ -8,10 +8,10 @@
 
 char _license[] SEC("license") = "GPL";
 
-static volatile struct data {
+struct {
 	char in[256];
 	char out[256];
-} data;
+} data = {};
 
 struct core_reloc_mods_output {
 	int a, b, c, d, e, f, g, h;
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_nesting.c b/tools/testing/selftests/bpf/progs/test_core_reloc_nesting.c
index 98238cb64fbd..39279bf0c9db 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_nesting.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_nesting.c
@@ -8,10 +8,10 @@
 
 char _license[] SEC("license") = "GPL";
 
-static volatile struct data {
+struct {
 	char in[256];
 	char out[256];
-} data;
+} data = {};
 
 struct core_reloc_nesting_substruct {
 	int a;
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_primitives.c b/tools/testing/selftests/bpf/progs/test_core_reloc_primitives.c
index 4f3ecb9127bb..ea57973cdd19 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_primitives.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_primitives.c
@@ -8,10 +8,10 @@
 
 char _license[] SEC("license") = "GPL";
 
-static volatile struct data {
+struct {
 	char in[256];
 	char out[256];
-} data;
+} data = {};
 
 enum core_reloc_primitives_enum {
 	A = 0,
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_ptr_as_arr.c b/tools/testing/selftests/bpf/progs/test_core_reloc_ptr_as_arr.c
index 27f602f00419..d1eb59d4ea64 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_ptr_as_arr.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_ptr_as_arr.c
@@ -8,10 +8,10 @@
 
 char _license[] SEC("license") = "GPL";
 
-static volatile struct data {
+struct {
 	char in[256];
 	char out[256];
-} data;
+} data = {};
 
 struct core_reloc_ptr_as_arr {
 	int a;
diff --git a/tools/testing/selftests/bpf/progs/test_core_reloc_size.c b/tools/testing/selftests/bpf/progs/test_core_reloc_size.c
index 9a92998d9107..9e091124d3bd 100644
--- a/tools/testing/selftests/bpf/progs/test_core_reloc_size.c
+++ b/tools/testing/selftests/bpf/progs/test_core_reloc_size.c
@@ -8,10 +8,10 @@
 
 char _license[] SEC("license") = "GPL";
 
-static volatile struct data {
+struct {
 	char in[256];
 	char out[256];
-} data;
+} data = {};
 
 struct core_reloc_size_output {
 	int int_sz;
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables
  2019-11-17  7:08 [PATCH bpf-next 0/6] Add libbpf-provided extern variables support Andrii Nakryiko
                   ` (3 preceding siblings ...)
  2019-11-17  7:08 ` [PATCH bpf-next 4/6] libbpf: support initialized global variables Andrii Nakryiko
@ 2019-11-17  7:08 ` Andrii Nakryiko
  2019-11-19  3:21   ` Alexei Starovoitov
  2019-11-17  7:08 ` [PATCH bpf-next 6/6] selftests/bpf: add tests for libbpf-provided externs Andrii Nakryiko
  2019-11-17  7:12 ` [PATCH bpf-next 0/6] Add libbpf-provided extern variables support Andrii Nakryiko
  6 siblings, 1 reply; 20+ messages in thread
From: Andrii Nakryiko @ 2019-11-17  7:08 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Add support for extern variables, provided to BPF program by libbpf. Currently
the following extern variables are supported:
  - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is
    executing, follows KERNEL_VERSION() macro convention;
  - CONFIG_xxx values; a set of values of actual kernel config. Tristate,
    boolean, and integer values are supported. Strings are not supported at
    the moment.

All values are represented as 64-bit integers, with the follow value encoding:
  - for boolean values, y is 1, n or missing value is 0;
  - for tristate values, y is 1, m is 2, n or missing value is 0;
  - for integers, the values is 64-bit integer, sign-extended, if negative; if
    config value is missing, it's represented as 0, which makes explicit 0 and
    missing config value indistinguishable. If this will turn out to be
    a problem in practice, we'll need to deal with it somehow.

Generally speaking, libbpf is not aware of which CONFIG_XXX values is of which
expected type (bool, tristate, int), so it doesn't enforce any specific set of
values and just parses n/y/m as 0/1/2, respectively. CONFIG_XXX values not
found in config file are set to 0. If any of declared
externs are unrecognized by libbpf, error is returned on bpf_object__open().
In the future it will be possible to extend available set of supported externs
to include new libbpf-provided values, if necessary, or linking against
kernel- or other BPF object-provided global variables/functions.

Config file itself is searched in /boot/config-$(uname -r) location with
fallback to /proc/config.gz, unless config path is specified explicitly
through bpf_object_open_opts' kernel_config_path option. Both gzipped and
plain text formats are supported. Libbpf adds explicit dependency on zlib
because of this, but this shouldn't be a problem, given libelf already depends
on zlib.

All detected extern variables, are put into a separate .extern internal map.
It, similarly to .rodata map, is marked as read-only from BPF program side, as
well as is frozen on load. This allows BPF verifier to track extern values as
constants and performe enhanced branch prediction and dead code elimination.
This can be relied upon for doing kernel version/feature detection and using
potentially unsupported field relocations or BPF helpers in a CO-RE-based BPF
program, while still having a single version of BPF program running on old and
new kernels. Selftests are validating this explicitly for unexisting BPF
helper.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/lib/bpf/Makefile               |  17 +-
 tools/lib/bpf/libbpf.c               | 471 ++++++++++++++++++++++-----
 tools/lib/bpf/libbpf.h               |   8 +-
 tools/testing/selftests/bpf/Makefile |   2 +-
 4 files changed, 409 insertions(+), 89 deletions(-)

diff --git a/tools/lib/bpf/Makefile b/tools/lib/bpf/Makefile
index 99425d0be6ff..28020b23c4d2 100644
--- a/tools/lib/bpf/Makefile
+++ b/tools/lib/bpf/Makefile
@@ -56,8 +56,8 @@ ifndef VERBOSE
 endif
 
 FEATURE_USER = .libbpf
-FEATURE_TESTS = libelf libelf-mmap bpf reallocarray
-FEATURE_DISPLAY = libelf bpf
+FEATURE_TESTS = libelf libelf-mmap zlib bpf reallocarray
+FEATURE_DISPLAY = libelf zlib bpf
 
 INCLUDES = -I. -I$(srctree)/tools/include -I$(srctree)/tools/arch/$(ARCH)/include/uapi -I$(srctree)/tools/include/uapi
 FEATURE_CHECK_CFLAGS-bpf = $(INCLUDES)
@@ -159,7 +159,7 @@ all: fixdep
 
 all_cmd: $(CMD_TARGETS) check
 
-$(BPF_IN_SHARED): force elfdep bpfdep bpf_helper_defs.h
+$(BPF_IN_SHARED): force elfdep zdep bpfdep bpf_helper_defs.h
 	@(test -f ../../include/uapi/linux/bpf.h -a -f ../../../include/uapi/linux/bpf.h && ( \
 	(diff -B ../../include/uapi/linux/bpf.h ../../../include/uapi/linux/bpf.h >/dev/null) || \
 	echo "Warning: Kernel ABI header at 'tools/include/uapi/linux/bpf.h' differs from latest version at 'include/uapi/linux/bpf.h'" >&2 )) || true
@@ -177,7 +177,7 @@ $(BPF_IN_SHARED): force elfdep bpfdep bpf_helper_defs.h
 	echo "Warning: Kernel ABI header at 'tools/include/uapi/linux/if_xdp.h' differs from latest version at 'include/uapi/linux/if_xdp.h'" >&2 )) || true
 	$(Q)$(MAKE) $(build)=libbpf OUTPUT=$(SHARED_OBJDIR) CFLAGS="$(CFLAGS) $(SHLIB_FLAGS)"
 
-$(BPF_IN_STATIC): force elfdep bpfdep bpf_helper_defs.h
+$(BPF_IN_STATIC): force elfdep zdep bpfdep bpf_helper_defs.h
 	$(Q)$(MAKE) $(build)=libbpf OUTPUT=$(STATIC_OBJDIR)
 
 bpf_helper_defs.h: $(srctree)/include/uapi/linux/bpf.h
@@ -189,7 +189,7 @@ $(OUTPUT)libbpf.so: $(OUTPUT)libbpf.so.$(LIBBPF_VERSION)
 $(OUTPUT)libbpf.so.$(LIBBPF_VERSION): $(BPF_IN_SHARED)
 	$(QUIET_LINK)$(CC) $(LDFLAGS) \
 		--shared -Wl,-soname,libbpf.so.$(LIBBPF_MAJOR_VERSION) \
-		-Wl,--version-script=$(VERSION_SCRIPT) $^ -lelf -o $@
+		-Wl,--version-script=$(VERSION_SCRIPT) $^ -lelf -lz -o $@
 	@ln -sf $(@F) $(OUTPUT)libbpf.so
 	@ln -sf $(@F) $(OUTPUT)libbpf.so.$(LIBBPF_MAJOR_VERSION)
 
@@ -197,7 +197,7 @@ $(OUTPUT)libbpf.a: $(BPF_IN_STATIC)
 	$(QUIET_LINK)$(RM) $@; $(AR) rcs $@ $^
 
 $(OUTPUT)test_libbpf: test_libbpf.c $(OUTPUT)libbpf.a
-	$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $(INCLUDES) $^ -lelf -o $@
+	$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) $(INCLUDES) $^ -lelf -lz -o $@
 
 $(OUTPUT)libbpf.pc:
 	$(QUIET_GEN)sed -e "s|@PREFIX@|$(prefix)|" \
@@ -280,12 +280,15 @@ clean:
 
 
 
-PHONY += force elfdep bpfdep cscope tags
+PHONY += force elfdep zdep bpfdep cscope tags
 force:
 
 elfdep:
 	@if [ "$(feature-libelf)" != "1" ]; then echo "No libelf found"; exit 1 ; fi
 
+zdep:
+	@if [ "$(feature-zlib)" != "1" ]; then echo "No zlib found"; exit 1 ; fi
+
 bpfdep:
 	@if [ "$(feature-bpf)" != "1" ]; then echo "BPF API too old"; exit 1 ; fi
 
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 639c2e17df0b..d9b70483a2c7 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -44,6 +44,7 @@
 #include <tools/libc_compat.h>
 #include <libelf.h>
 #include <gelf.h>
+#include <zlib.h>
 
 #include "libbpf.h"
 #include "bpf.h"
@@ -146,6 +147,23 @@ struct bpf_capabilities {
 	__u32 array_mmap:1;
 };
 
+enum reloc_type {
+	RELO_LD64,
+	RELO_CALL,
+	RELO_DATA,
+	RELO_EXTERN,
+};
+
+struct reloc_desc {
+	enum reloc_type type;
+	int insn_idx;
+	union {
+		int map_idx;
+		int text_off;
+		int data_off;
+	};
+};
+
 /*
  * bpf_prog should be a better name but it has been used in
  * linux/filter.h.
@@ -164,18 +182,7 @@ struct bpf_program {
 	size_t insns_cnt, main_prog_cnt;
 	enum bpf_prog_type type;
 
-	struct reloc_desc {
-		enum {
-			RELO_LD64,
-			RELO_CALL,
-			RELO_DATA,
-		} type;
-		int insn_idx;
-		union {
-			int map_idx;
-			int text_off;
-		};
-	} *reloc_desc;
+	struct reloc_desc *reloc_desc;
 	int nr_reloc;
 	int log_level;
 
@@ -209,12 +216,14 @@ enum libbpf_map_type {
 	LIBBPF_MAP_DATA,
 	LIBBPF_MAP_BSS,
 	LIBBPF_MAP_RODATA,
+	LIBBPF_MAP_EXTERN,
 };
 
 static const char * const libbpf_type_to_btf_name[] = {
 	[LIBBPF_MAP_DATA]	= ".data",
 	[LIBBPF_MAP_BSS]	= ".bss",
 	[LIBBPF_MAP_RODATA]	= ".rodata",
+	[LIBBPF_MAP_EXTERN]	= ".extern",
 };
 
 struct bpf_map {
@@ -238,6 +247,13 @@ struct bpf_map {
 struct bpf_secdata {
 	void *rodata;
 	void *data;
+	void *extern_data;
+};
+
+struct extern_desc {
+	const char *name;
+	__u32 data_off;
+	int sym_idx;
 };
 
 static LIST_HEAD(bpf_objects_list);
@@ -254,6 +270,10 @@ struct bpf_object {
 	size_t maps_cap;
 	struct bpf_secdata sections;
 
+	struct extern_desc *externs;
+	int nr_extern;
+	int extern_map_idx;
+
 	bool loaded;
 	bool has_pseudo_calls;
 	bool relaxed_core_relocs;
@@ -281,6 +301,7 @@ struct bpf_object {
 		int maps_shndx;
 		int btf_maps_shndx;
 		int text_shndx;
+		int symbols_shndx;
 		int data_shndx;
 		int rodata_shndx;
 		int bss_shndx;
@@ -839,7 +860,8 @@ static struct bpf_map *bpf_object__add_map(struct bpf_object *obj)
 
 static int
 bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
-			      int sec_idx, Elf_Data *data, void **data_buff)
+			      int sec_idx, void *data, size_t data_sz,
+			      void **data_copy)
 {
 	char map_name[BPF_OBJ_NAME_LEN];
 	struct bpf_map_def *def;
@@ -859,27 +881,30 @@ bpf_object__init_internal_map(struct bpf_object *obj, enum libbpf_map_type type,
 		pr_warn("failed to alloc map name\n");
 		return -ENOMEM;
 	}
+	pr_debug("map '%s' (%s): at sec_idx %d, offset %zu.\n", map_name,
+		 libbpf_type_to_btf_name[type], map->sec_idx, map->sec_offset);
 
 	def = &map->def;
 	def->type = BPF_MAP_TYPE_ARRAY;
 	def->key_size = sizeof(int);
-	def->value_size = data->d_size;
+	def->value_size = data_sz;
 	def->max_entries = 1;
-	def->map_flags = type == LIBBPF_MAP_RODATA ? BPF_F_RDONLY_PROG : 0;
+	def->map_flags = type == LIBBPF_MAP_RODATA || type == LIBBPF_MAP_EXTERN
+			 ? BPF_F_RDONLY_PROG : 0;
 	if (obj->caps.array_mmap)
 		def->map_flags |= BPF_F_MMAPABLE;
 
 	pr_debug("map '%s' (global data): at sec_idx %d, offset %zu, flags %x.\n",
 		 map_name, map->sec_idx, map->sec_offset, def->map_flags);
 
-	if (data_buff) {
-		*data_buff = malloc(data->d_size);
-		if (!*data_buff) {
+	if (data_copy) {
+		*data_copy = malloc(data_sz);
+		if (!*data_copy) {
 			zfree(&map->name);
 			pr_warn("failed to alloc map content buffer\n");
 			return -ENOMEM;
 		}
-		memcpy(*data_buff, data->d_buf, data->d_size);
+		memcpy(*data_copy, data, data_sz);
 	}
 
 	pr_debug("map %td is \"%s\"\n", map - obj->maps, map->name);
@@ -898,7 +923,8 @@ static int bpf_object__init_global_data_maps(struct bpf_object *obj)
 	if (obj->efile.data_shndx >= 0) {
 		err = bpf_object__init_internal_map(obj, LIBBPF_MAP_DATA,
 						    obj->efile.data_shndx,
-						    obj->efile.data,
+						    obj->efile.data->d_buf,
+						    obj->efile.data->d_size,
 						    &obj->sections.data);
 		if (err)
 			return err;
@@ -906,7 +932,8 @@ static int bpf_object__init_global_data_maps(struct bpf_object *obj)
 	if (obj->efile.rodata_shndx >= 0) {
 		err = bpf_object__init_internal_map(obj, LIBBPF_MAP_RODATA,
 						    obj->efile.rodata_shndx,
-						    obj->efile.rodata,
+						    obj->efile.rodata->d_buf,
+						    obj->efile.rodata->d_size,
 						    &obj->sections.rodata);
 		if (err)
 			return err;
@@ -914,13 +941,173 @@ static int bpf_object__init_global_data_maps(struct bpf_object *obj)
 	if (obj->efile.bss_shndx >= 0) {
 		err = bpf_object__init_internal_map(obj, LIBBPF_MAP_BSS,
 						    obj->efile.bss_shndx,
-						    obj->efile.bss, NULL);
+						    obj->efile.bss->d_buf,
+						    obj->efile.bss->d_size,
+						    NULL);
 		if (err)
 			return err;
 	}
 	return 0;
 }
 
+
+static int find_extern_by_name(const void *name, const void *_ext)
+{
+	const struct extern_desc *ext = _ext;
+
+	return strcmp(name, ext->name);
+}
+
+static int bpf_object__read_kernel_config(struct bpf_object *obj,
+					  const char *config_path)
+{
+	char buf[PATH_MAX], *sep, *value, *value_end;
+	struct extern_desc *ext;
+	int len, err = 0;
+	__u64 *ext_val;
+	gzFile file;
+
+	if (config_path) {
+		file = gzopen(config_path, "r");
+	} else {
+		struct utsname uts;
+
+		uname(&uts);
+		len = snprintf(buf, PATH_MAX, "/boot/config-%s", uts.release);
+		if (len < 0)
+			return -EINVAL;
+		else if (len >= PATH_MAX)
+			return -ENAMETOOLONG;
+		/* gzopen also accepts uncompressed files. */
+		file = gzopen(buf, "r");
+		if (!file)
+			file = gzopen("/proc/config.gz", "r");
+	}
+	if (!file) {
+		pr_warn("failed to read kernel config at '%s'\n", config_path);
+		return -ENOENT;
+	}
+
+	while (gzgets(file, buf, sizeof(buf))) {
+		if (strncmp(buf, "CONFIG_", 7))
+			continue;
+
+		sep = strchr(buf, '=');
+		if (!sep) {
+			err = -EINVAL;
+			pr_warn("failed to parse '%s': no separator\n", buf);
+			goto out;
+		}
+		/* Trim ending '\n' */
+		buf[strlen(buf) - 1] = '\0';
+		/* Split on '=' and ensure that a value is present. */
+		*sep = '\0';
+		if (!sep[1]) {
+			err = -EINVAL;
+			*sep = '=';
+			pr_warn("failed to parse '%s': no value\n", buf);
+			goto out;
+		}
+
+		ext = bsearch(buf, obj->externs, obj->nr_extern, sizeof(*ext),
+			      find_extern_by_name);
+		if (!ext)
+			continue;
+
+		ext_val = (__u64 *)(obj->sections.extern_data + ext->data_off);
+		value = sep + 1;
+
+		switch (*value) {
+		case 'n':
+			*ext_val = 0;
+			break;
+		case 'y':
+			*ext_val = 1;
+			break;
+		case 'm':
+			*ext_val = 2;
+			break;
+		case '"':
+			pr_warn("extern '%s': strings are not supported\n",
+				ext->name);
+			err = -EINVAL;
+			goto out;
+		default:
+			errno = 0;
+			*ext_val = strtoull(value, &value_end, 10);
+			if (errno) {
+				err = -errno;
+				pr_warn("extern '%s': failed to parse value: %d\n",
+					ext->name, err);
+				goto out;
+			}
+			if (*value_end && *value_end != '\n') {
+				err = -EINVAL;
+				pr_warn("extern '%s': failed to parse value\n",
+					ext->name);
+				goto out;
+			}
+			break;
+		}
+		pr_debug("extern '%s' = %llu\n", ext->name, *ext_val);
+	}
+
+out:
+	gzclose(file);
+	return err;
+}
+
+static int bpf_object__init_extern_map(struct bpf_object *obj,
+				       const char *config_path)
+{
+	bool need_config = false;
+	struct extern_desc *ext;
+	__u64 *ext_val;
+	int err, i;
+
+	if (obj->nr_extern == 0)
+		return 0;
+	if (!obj->caps.global_data)
+		return -ENOTSUP;
+
+	obj->sections.extern_data = calloc(obj->nr_extern, sizeof(__u64));
+	if (!obj->sections.extern_data)
+		return -ENOMEM;
+
+	for (i = 0; i < obj->nr_extern; i++) {
+		ext = &obj->externs[i];
+		ext_val = (__u64 *)(obj->sections.extern_data + ext->data_off);
+
+		if (strcmp(ext->name, "LINUX_KERNEL_VERSION") == 0) {
+			*ext_val = get_kernel_version();
+			if (*ext_val == 0) {
+				pr_warn("failed to get kernel version\n");
+				return -EINVAL;
+			}
+			pr_debug("extern '%s' = 0x%llx\n", ext->name, *ext_val);
+		} else if (strncmp(ext->name, "CONFIG_", 7) == 0) {
+			need_config = true;
+		} else {
+			pr_warn("unrecognized extern: '%s'\n", ext->name);
+			return -EINVAL;
+		}
+	}
+	if (need_config) {
+		err = bpf_object__read_kernel_config(obj, config_path);
+		if (err)
+			return -EINVAL;
+	}
+
+	err = bpf_object__init_internal_map(obj, LIBBPF_MAP_EXTERN,
+					    obj->efile.symbols_shndx,
+					    obj->sections.extern_data,
+					    obj->nr_extern * sizeof(__u64),
+					    NULL);
+	if (!err)
+		obj->extern_map_idx = obj->nr_maps - 1;
+	return err;
+}
+
 static int bpf_object__init_user_maps(struct bpf_object *obj, bool strict)
 {
 	Elf_Data *symbols = obj->efile.symbols;
@@ -1395,12 +1582,17 @@ static int bpf_object__init_user_btf_maps(struct bpf_object *obj, bool strict,
 	return 0;
 }
 
-static int bpf_object__init_maps(struct bpf_object *obj, bool relaxed_maps,
-				 const char *pin_root_path)
+static int bpf_object__init_maps(struct bpf_object *obj,
+				 const struct bpf_object_open_opts *opts)
 {
-	bool strict = !relaxed_maps;
+	const char *pin_root_path, *config_path;
+	bool strict;
 	int err;
 
+	strict = !OPTS_GET(opts, relaxed_maps, false);
+	pin_root_path = OPTS_GET(opts, pin_root_path, NULL);
+	config_path = OPTS_GET(opts, kernel_config_path, NULL);
+
 	err = bpf_object__init_user_maps(obj, strict);
 	if (err)
 		return err;
@@ -1413,6 +1605,10 @@ static int bpf_object__init_maps(struct bpf_object *obj, bool relaxed_maps,
 	if (err)
 		return err;
 
+	err = bpf_object__init_extern_map(obj, config_path);
+	if (err)
+		return err;
+
 	if (obj->nr_maps) {
 		qsort(obj->maps, obj->nr_maps, sizeof(obj->maps[0]),
 		      compare_bpf_map);
@@ -1594,8 +1790,7 @@ static int bpf_object__sanitize_and_load_btf(struct bpf_object *obj)
 	return 0;
 }
 
-static int bpf_object__elf_collect(struct bpf_object *obj, bool relaxed_maps,
-				   const char *pin_root_path)
+static int bpf_object__elf_collect(struct bpf_object *obj)
 {
 	Elf *elf = obj->efile.elf;
 	GElf_Ehdr *ep = &obj->efile.ehdr;
@@ -1667,6 +1862,7 @@ static int bpf_object__elf_collect(struct bpf_object *obj, bool relaxed_maps,
 				return -LIBBPF_ERRNO__FORMAT;
 			}
 			obj->efile.symbols = data;
+			obj->efile.symbols_shndx = idx;
 			obj->efile.strtabidx = sh.sh_link;
 		} else if (sh.sh_type == SHT_PROGBITS && data->d_size > 0) {
 			if (sh.sh_flags & SHF_EXECINSTR) {
@@ -1730,14 +1926,81 @@ static int bpf_object__elf_collect(struct bpf_object *obj, bool relaxed_maps,
 		pr_warn("Corrupted ELF file: index of strtab invalid\n");
 		return -LIBBPF_ERRNO__FORMAT;
 	}
-	err = bpf_object__init_btf(obj, btf_data, btf_ext_data);
-	if (!err)
-		err = bpf_object__init_maps(obj, relaxed_maps, pin_root_path);
-	if (!err)
-		err = bpf_object__sanitize_and_load_btf(obj);
-	if (!err)
-		err = bpf_object__init_prog_names(obj);
-	return err;
+	return bpf_object__init_btf(obj, btf_data, btf_ext_data);
+}
+
+static bool sym_is_extern(const GElf_Sym *sym)
+{
+	/* externs are symbols with type=NOTYPE, bind=GLOBAL, section=UND */
+	return sym->st_shndx == SHN_UNDEF &&
+	       GELF_ST_BIND(sym->st_info) == STB_GLOBAL &&
+	       GELF_ST_TYPE(sym->st_info) == STT_NOTYPE;
+}
+
+static int cmp_extern_by_name(const void *_a, const void *_b)
+{
+	const struct extern_desc *a = _a;
+	const struct extern_desc *b = _b;
+
+	return strcmp(a->name, b->name);
+}
+
+static int bpf_object__collect_externs(struct bpf_object *obj)
+{
+	struct extern_desc *ext;
+	Elf_Scn *scn;
+	GElf_Shdr sh;
+	int i, n;
+
+	if (!obj->efile.symbols)
+		return 0;
+
+	scn = elf_getscn(obj->efile.elf, obj->efile.symbols_shndx);
+	if (!scn)
+		return -LIBBPF_ERRNO__FORMAT;
+	if (gelf_getshdr(scn, &sh) != &sh)
+		return -LIBBPF_ERRNO__FORMAT;
+	n = sh.sh_size / sh.sh_entsize;
+
+	pr_debug("looking for externs among %d symbols...\n", n);
+	for (i = 0; i < n; i++) {
+		const char *sym_name;
+		GElf_Sym sym;
+
+		if (!gelf_getsym(obj->efile.symbols, i, &sym))
+			return -LIBBPF_ERRNO__FORMAT;
+		if (!sym_is_extern(&sym))
+			continue;
+		sym_name = elf_strptr(obj->efile.elf, obj->efile.strtabidx,
+				      sym.st_name);
+		if (!sym_name || !sym_name[0])
+			continue;
+
+		ext = obj->externs;
+		ext = reallocarray(ext, obj->nr_extern + 1, sizeof(*ext));
+		if (!ext)
+			return -ENOMEM;
+		obj->externs = ext;
+
+		ext = &ext[obj->nr_extern];
+		ext->name = strdup(sym_name);
+		ext->sym_idx = i;
+		if (!ext->name)
+			return -ENOMEM;
+		obj->nr_extern++;
+	}
+	pr_debug("collected %d externs total\n", obj->nr_extern);
+
+	/* sort externs by name and calculate their offsets within a map */
+	qsort(obj->externs, obj->nr_extern, sizeof(*ext), cmp_extern_by_name);
+	for (i = 0; i < obj->nr_extern; i++) {
+		ext = &obj->externs[i];
+		ext->data_off = i * sizeof(__u64);
+		pr_debug("extern #%d: symbol %d, off %u, name %s\n",
+			 i, ext->sym_idx, ext->data_off, ext->name);
+	}
+
+	return 0;
 }
 
 static struct bpf_program *
@@ -1791,6 +2054,8 @@ bpf_object__section_to_libbpf_map_type(const struct bpf_object *obj, int shndx)
 		return LIBBPF_MAP_BSS;
 	else if (shndx == obj->efile.rodata_shndx)
 		return LIBBPF_MAP_RODATA;
+	else if (shndx == obj->efile.symbols_shndx)
+		return LIBBPF_MAP_EXTERN;
 	else
 		return LIBBPF_MAP_UNSPEC;
 }
@@ -1830,6 +2095,30 @@ static int bpf_program__record_reloc(struct bpf_program *prog,
 			insn_idx, insn->code);
 		return -LIBBPF_ERRNO__RELOC;
 	}
+
+	if (sym_is_extern(sym)) {
+		int sym_idx = GELF_R_SYM(rel->r_info);
+		int i, n = obj->nr_extern;
+		struct extern_desc *ext;
+
+		for (i = 0; i < n; i++) {
+			ext = &obj->externs[i];
+			if (ext->sym_idx == sym_idx)
+				break;
+		}
+		if (i >= n) {
+			pr_warn("extern relo failed to find extern for sym %d\n",
+				sym_idx);
+			return -LIBBPF_ERRNO__RELOC;
+		}
+		pr_debug("found extern #%d '%s' (sym %d, off %u) for insn %u\n",
+			 i, ext->name, ext->sym_idx, ext->data_off, insn_idx);
+		reloc_desc->type = RELO_EXTERN;
+		reloc_desc->insn_idx = insn_idx;
+		reloc_desc->data_off = ext->data_off;
+		return 0;
+	}
+
 	if (!shdr_idx || shdr_idx >= SHN_LORESERVE) {
 		pr_warn("invalid relo for \'%s\' in special section 0x%x; forgot to initialize global var?..\n",
 			name, shdr_idx);
@@ -2296,23 +2585,41 @@ bpf_object__populate_internal_map(struct bpf_object *obj, struct bpf_map *map)
 	char *cp, errmsg[STRERR_BUFSIZE];
 	int err, zero = 0;
 	__u8 *data;
+	bool freeze;
 
-	/* Nothing to do here since kernel already zero-initializes .bss map. */
-	if (map->libbpf_type == LIBBPF_MAP_BSS)
+	switch (map->libbpf_type) {
+	case LIBBPF_MAP_BSS:
+		/* kernel already zero-initializes .bss map. */
 		return 0;
-
-	data = map->libbpf_type == LIBBPF_MAP_DATA ?
-	       obj->sections.data : obj->sections.rodata;
+	case LIBBPF_MAP_DATA:
+		data = obj->sections.data;
+		freeze = false;
+		break;
+	case LIBBPF_MAP_RODATA:
+		data = obj->sections.rodata;
+		freeze = true;
+		break;
+	case LIBBPF_MAP_EXTERN:
+		data = obj->sections.extern_data;
+		freeze = true;
+		break;
+	case LIBBPF_MAP_UNSPEC:
+	default:
+		return -EINVAL;
+	}
 
 	err = bpf_map_update_elem(map->fd, &zero, data, 0);
-	/* Freeze .rodata map as read-only from syscall side. */
-	if (!err && map->libbpf_type == LIBBPF_MAP_RODATA) {
+	if (err)
+		return -errno;
+
+	/* Freeze .rodata or .externs map as read-only from syscall side. */
+	if (freeze) {
 		err = bpf_map_freeze(map->fd);
 		if (err) {
+			err = -errno;
 			cp = libbpf_strerror_r(errno, errmsg, sizeof(errmsg));
 			pr_warn("Error freezing map(%s) as read-only: %s\n",
 				map->name, cp);
-			err = 0;
 		}
 	}
 	return err;
@@ -3554,9 +3861,6 @@ bpf_program__reloc_text(struct bpf_program *prog, struct bpf_object *obj,
 	size_t new_cnt;
 	int err;
 
-	if (relo->type != RELO_CALL)
-		return -LIBBPF_ERRNO__RELOC;
-
 	if (prog->idx == obj->efile.text_shndx) {
 		pr_warn("relo in .text insn %d into off %d\n",
 			relo->insn_idx, relo->text_off);
@@ -3617,33 +3921,38 @@ bpf_program__relocate(struct bpf_program *prog, struct bpf_object *obj)
 		return 0;
 
 	for (i = 0; i < prog->nr_reloc; i++) {
-		if (prog->reloc_desc[i].type == RELO_LD64 ||
-		    prog->reloc_desc[i].type == RELO_DATA) {
-			bool relo_data = prog->reloc_desc[i].type == RELO_DATA;
-			struct bpf_insn *insns = prog->insns;
-			int insn_idx, map_idx;
+		struct reloc_desc *relo = &prog->reloc_desc[i];
+		struct bpf_insn *insns = prog->insns;
+		int insn_idx = relo->insn_idx;
 
-			insn_idx = prog->reloc_desc[i].insn_idx;
-			map_idx = prog->reloc_desc[i].map_idx;
-
-			if (insn_idx + 1 >= (int)prog->insns_cnt) {
-				pr_warn("relocation out of range: '%s'\n",
-					prog->section_name);
-				return -LIBBPF_ERRNO__RELOC;
-			}
+		if (insn_idx + 1 >= (int)prog->insns_cnt) {
+			pr_warn("relo #%d: insn out of range %d\n", i, insn_idx);
+			return -LIBBPF_ERRNO__RELOC;
+		}
 
-			if (!relo_data) {
-				insns[insn_idx].src_reg = BPF_PSEUDO_MAP_FD;
-			} else {
-				insns[insn_idx].src_reg = BPF_PSEUDO_MAP_VALUE;
-				insns[insn_idx + 1].imm = insns[insn_idx].imm;
-			}
-			insns[insn_idx].imm = obj->maps[map_idx].fd;
-		} else if (prog->reloc_desc[i].type == RELO_CALL) {
-			err = bpf_program__reloc_text(prog, obj,
-						      &prog->reloc_desc[i]);
+		switch (relo->type) {
+		case RELO_LD64:
+			insns[insn_idx].src_reg = BPF_PSEUDO_MAP_FD;
+			insns[insn_idx].imm = obj->maps[relo->map_idx].fd;
+			break;
+		case RELO_DATA:
+			insns[insn_idx].src_reg = BPF_PSEUDO_MAP_VALUE;
+			insns[insn_idx + 1].imm = insns[insn_idx].imm;
+			insns[insn_idx].imm = obj->maps[relo->map_idx].fd;
+			break;
+		case RELO_EXTERN:
+			insns[insn_idx].src_reg = BPF_PSEUDO_MAP_VALUE;
+			insns[insn_idx].imm = obj->maps[obj->extern_map_idx].fd;
+			insns[insn_idx + 1].imm = relo->data_off;
+			break;
+		case RELO_CALL:
+			err = bpf_program__reloc_text(prog, obj, relo);
 			if (err)
 				return err;
+			break;
+		default:
+			pr_warn("relo #%d: bad relo type %d\n", i, relo->type);
+			return -EINVAL;
 		}
 	}
 
@@ -3917,12 +4226,10 @@ static struct bpf_object *
 __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz,
 		   struct bpf_object_open_opts *opts)
 {
-	const char *pin_root_path;
 	struct bpf_program *prog;
 	struct bpf_object *obj;
 	const char *obj_name;
 	char tmp_name[64];
-	bool relaxed_maps;
 	__u32 attach_prog_fd;
 	int err;
 
@@ -3952,16 +4259,19 @@ __bpf_object__open(const char *path, const void *obj_buf, size_t obj_buf_sz,
 		return obj;
 
 	obj->relaxed_core_relocs = OPTS_GET(opts, relaxed_core_relocs, false);
-	relaxed_maps = OPTS_GET(opts, relaxed_maps, false);
-	pin_root_path = OPTS_GET(opts, pin_root_path, NULL);
 	attach_prog_fd = OPTS_GET(opts, attach_prog_fd, 0);
 
-	CHECK_ERR(bpf_object__elf_init(obj), err, out);
-	CHECK_ERR(bpf_object__check_endianness(obj), err, out);
-	CHECK_ERR(bpf_object__probe_caps(obj), err, out);
-	CHECK_ERR(bpf_object__elf_collect(obj, relaxed_maps, pin_root_path),
-		  err, out);
-	CHECK_ERR(bpf_object__collect_reloc(obj), err, out);
+	err = bpf_object__elf_init(obj);
+	err = err ? : bpf_object__check_endianness(obj);
+	err = err ? : bpf_object__probe_caps(obj);
+	err = err ? : bpf_object__elf_collect(obj);
+	err = err ? : bpf_object__collect_externs(obj);
+	err = err ? : bpf_object__init_maps(obj, opts);
+	err = err ? : bpf_object__sanitize_and_load_btf(obj);
+	err = err ? : bpf_object__init_prog_names(obj);
+	err = err ? : bpf_object__collect_reloc(obj);
+	if (err)
+		goto out;
 	bpf_object__elf_finish(obj);
 
 	bpf_object__for_each_program(prog, obj) {
@@ -4681,6 +4991,9 @@ void bpf_object__close(struct bpf_object *obj)
 
 	zfree(&obj->sections.rodata);
 	zfree(&obj->sections.data);
+	zfree(&obj->sections.extern_data);
+	zfree(&obj->externs);
+	obj->nr_extern = 0;
 	zfree(&obj->maps);
 	obj->nr_maps = 0;
 
diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
index 0dbf4bfba0c4..30b920879319 100644
--- a/tools/lib/bpf/libbpf.h
+++ b/tools/lib/bpf/libbpf.h
@@ -103,14 +103,18 @@ struct bpf_object_open_opts {
 	bool relaxed_maps;
 	/* process CO-RE relocations non-strictly, allowing them to fail */
 	bool relaxed_core_relocs;
+	__u32 attach_prog_fd;
 	/* maps that set the 'pinning' attribute in their definition will have
 	 * their pin_path attribute set to a file in this directory, and be
 	 * auto-pinned to that path on load; defaults to "/sys/fs/bpf".
 	 */
 	const char *pin_root_path;
-	__u32 attach_prog_fd;
+	/* kernel config file path override (for CONFIG_ externs); can point
+	 * to either uncompressed text file or .gz file
+	 */
+	const char *kernel_config_path;
 };
-#define bpf_object_open_opts__last_field attach_prog_fd
+#define bpf_object_open_opts__last_field kernel_config_path
 
 LIBBPF_API struct bpf_object *bpf_object__open(const char *path);
 LIBBPF_API struct bpf_object *
diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 5737888cd6a7..df3f42aa9a1a 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -22,7 +22,7 @@ CFLAGS += -g -Wall -O2 $(GENFLAGS) -I$(APIDIR) -I$(LIBDIR) -I$(BPFDIR)	\
 	  -I$(GENDIR) -I$(TOOLSDIR) -I$(CURDIR)				\
 	  -Dbpf_prog_load=bpf_prog_test_load				\
 	  -Dbpf_load_program=bpf_test_load_program
-LDLIBS += -lcap -lelf -lrt -lpthread
+LDLIBS += -lcap -lelf -lz -lrt -lpthread
 
 # Order correspond to 'make run_tests' order
 TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test_progs \
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH bpf-next 6/6] selftests/bpf: add tests for libbpf-provided externs
  2019-11-17  7:08 [PATCH bpf-next 0/6] Add libbpf-provided extern variables support Andrii Nakryiko
                   ` (4 preceding siblings ...)
  2019-11-17  7:08 ` [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables Andrii Nakryiko
@ 2019-11-17  7:08 ` Andrii Nakryiko
  2019-11-17  7:12 ` [PATCH bpf-next 0/6] Add libbpf-provided extern variables support Andrii Nakryiko
  6 siblings, 0 replies; 20+ messages in thread
From: Andrii Nakryiko @ 2019-11-17  7:08 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

Add a set of tests validating libbpf-provided extern variables. One crucial
feature that's tested is dead code elimination together with using invalid BPF
helper. CONFIG_MISSING is not supposed to exist and should always be specified
by libbpf as zero, which allows BPF verifier to correctly do branch pruning
and not fail validation, when invalid BPF helper is called from dead if branch.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 .../selftests/bpf/prog_tests/core_extern.c    | 186 ++++++++++++++++++
 .../selftests/bpf/progs/test_core_extern.c    |  43 ++++
 2 files changed, 229 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/core_extern.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_core_extern.c

diff --git a/tools/testing/selftests/bpf/prog_tests/core_extern.c b/tools/testing/selftests/bpf/prog_tests/core_extern.c
new file mode 100644
index 000000000000..2a87a5f1de4d
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/core_extern.c
@@ -0,0 +1,186 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+#include <sys/mman.h>
+#include <sys/utsname.h>
+#include <linux/version.h>
+
+static size_t roundup_page(size_t sz)
+{
+	long page_size = sysconf(_SC_PAGE_SIZE);
+	return (sz + page_size - 1) / page_size * page_size;
+}
+
+static uint32_t get_kernel_version(void)
+{
+	uint32_t major, minor, patch;
+	struct utsname info;
+
+	uname(&info);
+	if (sscanf(info.release, "%u.%u.%u", &major, &minor, &patch) != 3)
+		return 0;
+	return KERNEL_VERSION(major, minor, patch);
+}
+
+struct data {
+	uint64_t kern_ver;
+	uint64_t tristate_val;
+	uint64_t bool_val;
+	uint64_t int_val;
+	uint64_t missing_val;
+};
+
+static struct test_case {
+	const char *name;
+	const char *cfg;
+	const char *cfg_path;
+	bool fails;
+	struct data data;
+} test_cases[] = {
+	{ .name = "default search path", .cfg_path = NULL },
+	{ .name = "/proc/config.gz", .cfg_path = "/proc/config.gz" },
+	{ .name = "missing config", .fails = true,
+	  .cfg_path = "/proc/invalid-config.gz" },
+	{
+		.name = "custom values",
+		.cfg = "CONFIG_TRISTATE=m\n"
+		       "CONFIG_BOOL=y\n"
+		       "CONFIG_INT=123456\n",
+		.data = {
+			.tristate_val = 2,
+			.bool_val = 1,
+			.int_val = 123456,
+		},
+	},
+	{
+		/* there is no real typing, so any valid value is accepted */
+		.name = "mixed up types",
+		.cfg = "CONFIG_TRISTATE=123\n"
+		       "CONFIG_BOOL=m\n"
+		       "CONFIG_INT=y\n",
+		.data = {
+			.tristate_val = 123,
+			.bool_val = 2,
+			.int_val = 1,
+		},
+	},
+	{
+		/* somewhat weird behavior of strtoull */
+		.name = "negative int",
+		.cfg = "CONFIG_INT=-12\n",
+		.data = { .int_val = (uint64_t)-12 },
+	},
+	{ .name = "bad tristate", .fails = true, .cfg = "CONFIG_TRISTATE=M" },
+	{ .name = "bad bool", .fails = true, .cfg = "CONFIG_BOOL=X" },
+	{ .name = "int (not int)", .fails = true, .cfg = "CONFIG_INT=abc" },
+	{ .name = "int (string)", .fails = true, .cfg = "CONFIG_INT=\"abc\"" },
+	{ .name = "int (empty)", .fails = true, .cfg = "CONFIG_INT=" },
+	{ .name = "int (mixed up 1)", .fails = true, .cfg = "CONFIG_INT=123abc",
+	  .fails = true, },
+	{ .name = "int (mixed up 2)", .fails = true, .cfg = "CONFIG_INT=123abc\n",
+	  .fails = true, },
+	{ .name = "int (too big)", .fails = true,
+	  .cfg = "CONFIG_INT=123456789123456789123\n" },
+};
+
+void test_core_extern(void)
+{
+	const char *file = "test_core_extern.o";
+	const char *probe_name = "raw_tp/sys_enter";
+	const char *tp_name = "sys_enter";
+	const size_t bss_sz = roundup_page(sizeof(struct data));
+	const uint32_t kern_ver = get_kernel_version();
+	int err, duration = 0, i;
+	struct bpf_program *prog;
+	struct bpf_object *obj;
+	struct bpf_link *link = NULL;
+	struct bpf_map *bss_map;
+	void *bss_mmaped = NULL;
+	volatile struct data *data;
+
+	for (i = 0; i < ARRAY_SIZE(test_cases); i++) {
+		char tmp_cfg_path[] = "/tmp/test_core_extern_cfg.XXXXXX";
+		const struct test_case *t = &test_cases[i];
+		DECLARE_LIBBPF_OPTS(bpf_object_open_opts, opts,
+			.kernel_config_path = t->cfg_path,
+		);
+
+		if (!test__start_subtest(t->name))
+			continue;
+
+		if (t->cfg) {
+			size_t n = strlen(t->cfg) + 1;
+			int fd = mkstemp(tmp_cfg_path);
+
+			if (CHECK(fd < 0, "mkstemp", "errno: %d\n", errno))
+				continue;
+			printf("using '%s' as config file\n", tmp_cfg_path);
+			if (CHECK_FAIL(write(fd, t->cfg, n) != n))
+				continue;
+			close(fd);
+			opts.kernel_config_path = tmp_cfg_path;
+		}
+
+		obj = bpf_object__open_file("test_core_extern.o", &opts);
+		if (t->fails) {
+			CHECK(!IS_ERR(obj), "obj_open",
+			      "shouldn't succeed opening '%s'!\n", file);
+			goto cleanup;
+		} else {
+			if (CHECK(IS_ERR(obj), "obj_open",
+				  "failed to open '%s': %ld\n",
+				  file, PTR_ERR(obj)))
+				goto cleanup;
+		}
+		prog = bpf_object__find_program_by_title(obj, probe_name);
+		if (CHECK(!prog, "find_prog", "prog %s missing\n", probe_name))
+			goto cleanup;
+		err = bpf_object__load(obj);
+		if (CHECK(err, "obj_load", "failed to load prog '%s': %d\n",
+			  probe_name, err))
+			goto cleanup;
+		bss_map = bpf_object__find_map_by_name(obj, "test_cor.bss");
+		if (CHECK(!bss_map, "find_bss_map", ".bss map not found\n"))
+			goto cleanup;
+		bss_mmaped = mmap(NULL, bss_sz, PROT_READ | PROT_WRITE,
+				  MAP_SHARED, bpf_map__fd(bss_map), 0);
+		if (CHECK(bss_mmaped == MAP_FAILED, "bss_mmap",
+			  ".bss mmap failed: %d\n", errno)) {
+			bss_mmaped = NULL;
+			goto cleanup;
+		}
+		data = bss_mmaped;
+
+		link = bpf_program__attach_raw_tracepoint(prog, tp_name);
+		if (CHECK(IS_ERR(link), "attach_raw_tp", "err %ld\n", PTR_ERR(link)))
+			goto cleanup;
+
+		usleep(1);
+
+		CHECK(data->kern_ver != kern_ver, "kern_ver",
+		      "exp %x, got %lx\n", kern_ver, data->kern_ver);
+		CHECK(data->missing_val != 0xDEADC0DE, "missing_val",
+		      "exp %x, got %lx\n", 0xDEADC0DE, data->missing_val);
+		CHECK(data->bool_val != t->data.bool_val, "bool_val",
+		      "exp %lx, got %lx\n", t->data.bool_val, data->bool_val);
+		CHECK(data->tristate_val != t->data.tristate_val,
+		      "tristate_val", "exp %lx, got %lx\n",
+		      t->data.tristate_val, data->tristate_val);
+		CHECK(data->int_val != t->data.int_val, "int_val",
+		      "exp %lx, got %lx\n", t->data.int_val, data->int_val);
+
+cleanup:
+		if (t->cfg)
+			unlink(tmp_cfg_path);
+		if (bss_mmaped) {
+			CHECK_FAIL(munmap(bss_mmaped, bss_sz));
+			bss_mmaped = NULL;
+			data = NULL;
+		}
+		if (!IS_ERR_OR_NULL(link)) {
+			bpf_link__destroy(link);
+			link = NULL;
+		}
+		if (!IS_ERR_OR_NULL(obj))
+			bpf_object__close(obj);
+	}
+}
diff --git a/tools/testing/selftests/bpf/progs/test_core_extern.c b/tools/testing/selftests/bpf/progs/test_core_extern.c
new file mode 100644
index 000000000000..adbc74113265
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_core_extern.c
@@ -0,0 +1,43 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2017 Facebook
+
+#include <stdint.h>
+#include <linux/ptrace.h>
+#include <linux/bpf.h>
+#include "bpf_helpers.h"
+
+/* non-existing BPF helper, to test dead code elimination */
+static int (*bpf_missing_helper)(const void *arg1, int arg2) = (void *) 999;
+
+extern uint64_t LINUX_KERNEL_VERSION;
+extern uint64_t CONFIG_TRISTATE;
+extern uint64_t CONFIG_BOOL;
+extern uint64_t CONFIG_INT;
+extern uint64_t CONFIG_MISSING;
+
+volatile struct {
+	uint64_t kern_ver;
+	uint64_t tristate_val;
+	uint64_t bool_val;
+	uint64_t int_val;
+	uint64_t missing_val;
+} out = {};
+
+SEC("raw_tp/sys_enter")
+int handle_sys_enter(struct pt_regs *ctx)
+{
+	out.kern_ver = LINUX_KERNEL_VERSION;
+	out.tristate_val = CONFIG_TRISTATE;
+	out.bool_val = CONFIG_BOOL;
+	out.int_val = CONFIG_INT;
+
+	if (CONFIG_MISSING)
+		/* invalid, but dead code - never executed */
+		out.missing_val = bpf_missing_helper(ctx, 123);
+	else
+		out.missing_val = 0xDEADC0DE;
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 0/6] Add libbpf-provided extern variables support
  2019-11-17  7:08 [PATCH bpf-next 0/6] Add libbpf-provided extern variables support Andrii Nakryiko
                   ` (5 preceding siblings ...)
  2019-11-17  7:08 ` [PATCH bpf-next 6/6] selftests/bpf: add tests for libbpf-provided externs Andrii Nakryiko
@ 2019-11-17  7:12 ` Andrii Nakryiko
  6 siblings, 0 replies; 20+ messages in thread
From: Andrii Nakryiko @ 2019-11-17  7:12 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Networking, Alexei Starovoitov, Daniel Borkmann, Kernel Team

On Sat, Nov 16, 2019 at 11:08 PM Andrii Nakryiko <andriin@fb.com> wrote:
>
> It's often important for BPF program to know kernel version or some specific
> config values (e.g., CONFIG_HZ to convert jiffies to seconds) and change or
> adjust program logic based on their values. As of today, any such need has to
> be resolved by recompiling BPF program for specific kernel and kernel
> configuration. In practice this is usually achieved by using BCC and its
> embedded LLVM/Clang. With such set up #ifdef CONFIG_XXX and similar
> compile-time constructs allow to deal with kernel varieties.
>
> With CO-RE (Compile Once – Run Everywhere) approach, this is not an option,
> unfortunately. All such logic variations have to be done as a normal
> C language constructs (i.e., if/else, variables, etc), not a preprocessor
> directives. This patch series add support for such advanced scenarios through
> C extern variables. These extern variables will be recognized by libbpf and
> supplied through extra .extern internal map, similarly to global data. This
> .extern map is read-only, which allows BPF verifier to track its content
> precisely as constants. That gives an opportunity to have pre-compiled BPF
> program, which can potentially use BPF functionality (e.g., BPF helpers) or
> kernel features (types, fields, etc), that are available only on a subset of
> targeted kernels, while effectively eleminating (through verifier's dead code
> detection) such unsupported functionality for other kernels (typically, older
> versions). Patch #5 contains all the details. Patch #5 explicitly tests
> a scenario of using unsupported BPF helper, to validate the approach.
>
> As part of this patch set, libbpf also allows usage of initialized global
> (non-static) variables, which provides better Clang semantics, which is closer
> and better aligned witht kernel vs userspace BPF map contents sharing.
>
> Outline of the patch set:
> - patches #1-#3 do some preliminary refactorings of libbpf relocation logic
>   and some more clean ups;
> - patch #4 allows non-static variables and converts few tests to use them;
> - patch #5 adds support for externs to libbpf;
> - patch #6 adds tests for externs.

Forgot to mention, this patch set is based off of patch set adding
mmap()-support for BPF maps, because I use that functionality for
selftests. In the worst case scenario, I can switch tests to
old-fashioned bpf_map_update_elem()/bpf_map_lookup_elem() interface.

[...]

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables
  2019-11-17  7:08 ` [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables Andrii Nakryiko
@ 2019-11-19  3:21   ` Alexei Starovoitov
  2019-11-19  6:57     ` Andrii Nakryiko
  0 siblings, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2019-11-19  3:21 UTC (permalink / raw)
  To: Andrii Nakryiko; +Cc: bpf, netdev, ast, daniel, andrii.nakryiko, kernel-team

On Sat, Nov 16, 2019 at 11:08:06PM -0800, Andrii Nakryiko wrote:
> Add support for extern variables, provided to BPF program by libbpf. Currently
> the following extern variables are supported:
>   - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is
>     executing, follows KERNEL_VERSION() macro convention;
>   - CONFIG_xxx values; a set of values of actual kernel config. Tristate,
>     boolean, and integer values are supported. Strings are not supported at
>     the moment.
> 
> All values are represented as 64-bit integers, with the follow value encoding:
>   - for boolean values, y is 1, n or missing value is 0;
>   - for tristate values, y is 1, m is 2, n or missing value is 0;
>   - for integers, the values is 64-bit integer, sign-extended, if negative; if
>     config value is missing, it's represented as 0, which makes explicit 0 and
>     missing config value indistinguishable. If this will turn out to be
>     a problem in practice, we'll need to deal with it somehow.

I read that statement as there is no extensibility for such api.

> Generally speaking, libbpf is not aware of which CONFIG_XXX values is of which
> expected type (bool, tristate, int), so it doesn't enforce any specific set of
> values and just parses n/y/m as 0/1/2, respectively. CONFIG_XXX values not
> found in config file are set to 0.

This is not pretty either.

> +
> +		switch (*value) {
> +		case 'n':
> +			*ext_val = 0;
> +			break;
> +		case 'y':
> +			*ext_val = 1;
> +			break;
> +		case 'm':
> +			*ext_val = 2;
> +			break;
> +		case '"':
> +			pr_warn("extern '%s': strings are not supported\n",
> +				ext->name);
> +			err = -EINVAL;
> +			goto out;
> +		default:
> +			errno = 0;
> +			*ext_val = strtoull(value, &value_end, 10);
> +			if (errno) {
> +				err = -errno;
> +				pr_warn("extern '%s': failed to parse value: %d\n",
> +					ext->name, err);
> +				goto out;
> +			}

BPF has bpf_strtol() helper. I think it would be cleaner to pass whatever
.config has as bytes to the program and let program parse n/y/m, strings and
integers.

LINUX_KERNEL_VERSION is a special case and can stay as u64.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables
  2019-11-19  3:21   ` Alexei Starovoitov
@ 2019-11-19  6:57     ` Andrii Nakryiko
  2019-11-19 15:42       ` Andrii Nakryiko
  0 siblings, 1 reply; 20+ messages in thread
From: Andrii Nakryiko @ 2019-11-19  6:57 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Mon, Nov 18, 2019 at 7:21 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Sat, Nov 16, 2019 at 11:08:06PM -0800, Andrii Nakryiko wrote:
> > Add support for extern variables, provided to BPF program by libbpf. Currently
> > the following extern variables are supported:
> >   - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is
> >     executing, follows KERNEL_VERSION() macro convention;
> >   - CONFIG_xxx values; a set of values of actual kernel config. Tristate,
> >     boolean, and integer values are supported. Strings are not supported at
> >     the moment.
> >
> > All values are represented as 64-bit integers, with the follow value encoding:
> >   - for boolean values, y is 1, n or missing value is 0;
> >   - for tristate values, y is 1, m is 2, n or missing value is 0;
> >   - for integers, the values is 64-bit integer, sign-extended, if negative; if
> >     config value is missing, it's represented as 0, which makes explicit 0 and
> >     missing config value indistinguishable. If this will turn out to be
> >     a problem in practice, we'll need to deal with it somehow.
>
> I read that statement as there is no extensibility for such api.

What do you mean exactly?

Are you worried about 0 vs undefined case? I don't think it's going to
be a problem in practice. Looking at my .config, I see that integer
config values set to their default values are still explicitly
specified with those values. E.g.,

CONFIG_HZ_1000=y
CONFIG_HZ=1000

CONFIG_HZ default is 1000, if CONFIG_HZ_1000==y, but still I see it
set. So while I won't claim that it's the case for any possible
integer config, it seems to be pretty consistent in practice.

Also, I see a lot of values set to explicit 0, like:

CONFIG_BASE_SMALL=0

So it seems like integers are typically spelled out explicitly in real
configs and I think this 0 default is pretty sane.

Next, speaking about extensibility. Once we have BTF type info for
externs, our possibilities are much better. It will be possible to
support bool, int, in64 for the same bool value. Libbpf will be able
to validate the range and fail program load if declared extern type
doesn't match actual value type and value range. So I think
extensibility is there, but right now we are enforcing (logically)
everything to be uin64_t. Unfortunately, with the way externs are done
in ELF, I don't know neither type nor size, so can't be more strict
than that.

If we really need to know whether some config value is defined or not,
regardless of its value, we can have it by convention. E.g.,
CONFIG_DEFINED_XXX will be either 0 or 1, depending if corresponding
CONFIG_XXX is defined explicitly or not. But I don't want to add that
until we really have a use case where it matters.

>
> > Generally speaking, libbpf is not aware of which CONFIG_XXX values is of which
> > expected type (bool, tristate, int), so it doesn't enforce any specific set of
> > values and just parses n/y/m as 0/1/2, respectively. CONFIG_XXX values not
> > found in config file are set to 0.
>
> This is not pretty either.

What exactly: defaulting to zero or not knowing config value's type?
Given all the options, defaulting to zero seems like the best way to
go.

>
> > +
> > +             switch (*value) {
> > +             case 'n':
> > +                     *ext_val = 0;
> > +                     break;
> > +             case 'y':
> > +                     *ext_val = 1;
> > +                     break;
> > +             case 'm':
> > +                     *ext_val = 2;
> > +                     break;

reading some more code from scripts/kconfig/symbol.c, I'll need to
handle N/Y/M and 0x hexadecimals, will add in v2 after collecting some
more feedback on this version.

> > +             case '"':
> > +                     pr_warn("extern '%s': strings are not supported\n",
> > +                             ext->name);
> > +                     err = -EINVAL;
> > +                     goto out;
> > +             default:
> > +                     errno = 0;
> > +                     *ext_val = strtoull(value, &value_end, 10);
> > +                     if (errno) {
> > +                             err = -errno;
> > +                             pr_warn("extern '%s': failed to parse value: %d\n",
> > +                                     ext->name, err);
> > +                             goto out;
> > +                     }
>
> BPF has bpf_strtol() helper. I think it would be cleaner to pass whatever
> .config has as bytes to the program and let program parse n/y/m, strings and
> integers.

Config value is not changing. This is an incredible waste of CPU
resources to re-parse same value over and over again. And it's
incredibly much worse usability as well. Again, once we have BTF for
externs, we can just declare values as const char[] and then user will
be able to do its own parsing. Until then, I think pre-parsing values
into convenient u64 types are much better and handles all the typical
cases.

>
> LINUX_KERNEL_VERSION is a special case and can stay as u64.
>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables
  2019-11-19  6:57     ` Andrii Nakryiko
@ 2019-11-19 15:42       ` Andrii Nakryiko
  2019-11-19 15:58         ` Alexei Starovoitov
  0 siblings, 1 reply; 20+ messages in thread
From: Andrii Nakryiko @ 2019-11-19 15:42 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team

On Mon, Nov 18, 2019 at 10:57 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Mon, Nov 18, 2019 at 7:21 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Sat, Nov 16, 2019 at 11:08:06PM -0800, Andrii Nakryiko wrote:
> > > Add support for extern variables, provided to BPF program by libbpf. Currently
> > > the following extern variables are supported:
> > >   - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is
> > >     executing, follows KERNEL_VERSION() macro convention;
> > >   - CONFIG_xxx values; a set of values of actual kernel config. Tristate,
> > >     boolean, and integer values are supported. Strings are not supported at
> > >     the moment.
> > >
> > > All values are represented as 64-bit integers, with the follow value encoding:
> > >   - for boolean values, y is 1, n or missing value is 0;
> > >   - for tristate values, y is 1, m is 2, n or missing value is 0;
> > >   - for integers, the values is 64-bit integer, sign-extended, if negative; if
> > >     config value is missing, it's represented as 0, which makes explicit 0 and
> > >     missing config value indistinguishable. If this will turn out to be
> > >     a problem in practice, we'll need to deal with it somehow.
> >
> > I read that statement as there is no extensibility for such api.
>
> What do you mean exactly?
>
> Are you worried about 0 vs undefined case? I don't think it's going to
> be a problem in practice. Looking at my .config, I see that integer
> config values set to their default values are still explicitly
> specified with those values. E.g.,
>
> CONFIG_HZ_1000=y
> CONFIG_HZ=1000
>
> CONFIG_HZ default is 1000, if CONFIG_HZ_1000==y, but still I see it
> set. So while I won't claim that it's the case for any possible
> integer config, it seems to be pretty consistent in practice.
>
> Also, I see a lot of values set to explicit 0, like:
>
> CONFIG_BASE_SMALL=0
>
> So it seems like integers are typically spelled out explicitly in real
> configs and I think this 0 default is pretty sane.
>
> Next, speaking about extensibility. Once we have BTF type info for
> externs, our possibilities are much better. It will be possible to
> support bool, int, in64 for the same bool value. Libbpf will be able
> to validate the range and fail program load if declared extern type
> doesn't match actual value type and value range. So I think
> extensibility is there, but right now we are enforcing (logically)
> everything to be uin64_t. Unfortunately, with the way externs are done
> in ELF, I don't know neither type nor size, so can't be more strict
> than that.
>
> If we really need to know whether some config value is defined or not,
> regardless of its value, we can have it by convention. E.g.,
> CONFIG_DEFINED_XXX will be either 0 or 1, depending if corresponding
> CONFIG_XXX is defined explicitly or not. But I don't want to add that
> until we really have a use case where it matters.
>
> >
> > > Generally speaking, libbpf is not aware of which CONFIG_XXX values is of which
> > > expected type (bool, tristate, int), so it doesn't enforce any specific set of
> > > values and just parses n/y/m as 0/1/2, respectively. CONFIG_XXX values not
> > > found in config file are set to 0.
> >
> > This is not pretty either.
>
> What exactly: defaulting to zero or not knowing config value's type?
> Given all the options, defaulting to zero seems like the best way to
> go.
>
> >
> > > +
> > > +             switch (*value) {
> > > +             case 'n':
> > > +                     *ext_val = 0;
> > > +                     break;
> > > +             case 'y':
> > > +                     *ext_val = 1;
> > > +                     break;
> > > +             case 'm':
> > > +                     *ext_val = 2;
> > > +                     break;
>
> reading some more code from scripts/kconfig/symbol.c, I'll need to
> handle N/Y/M and 0x hexadecimals, will add in v2 after collecting some
> more feedback on this version.
>
> > > +             case '"':
> > > +                     pr_warn("extern '%s': strings are not supported\n",
> > > +                             ext->name);
> > > +                     err = -EINVAL;
> > > +                     goto out;
> > > +             default:
> > > +                     errno = 0;
> > > +                     *ext_val = strtoull(value, &value_end, 10);
> > > +                     if (errno) {
> > > +                             err = -errno;
> > > +                             pr_warn("extern '%s': failed to parse value: %d\n",
> > > +                                     ext->name, err);
> > > +                             goto out;
> > > +                     }
> >
> > BPF has bpf_strtol() helper. I think it would be cleaner to pass whatever
> > .config has as bytes to the program and let program parse n/y/m, strings and
> > integers.
>
> Config value is not changing. This is an incredible waste of CPU
> resources to re-parse same value over and over again. And it's
> incredibly much worse usability as well. Again, once we have BTF for
> externs, we can just declare values as const char[] and then user will
> be able to do its own parsing. Until then, I think pre-parsing values
> into convenient u64 types are much better and handles all the typical
> cases.


One more thing I didn't realize I didn't state explicitly, because
I've been thinking and talking about that for so long now, that it
kind of internalized completely.

These externs, including CONFIG_XXX ones, are meant to interoperate
nicely with field relocations within BPF CO-RE concept. They are,
among other things, are meant to disable parts of BPF program logic
through verifier's dead code elimination by doing something like:


if (CONFIG_SOME_FEATURES_ENABLED) {
    BPF_CORE_READ(t, some_extra_field);
    /* or */
    bpf_helper_that_only_present_when_feature_is_enabled();
} else {
    /* fallback logic */
}

With CONFIG_SOME_FEATURES_ENABLED not being a read-only integer
constant when BPF program is loaded, this is impossible. So it
absolutely must be some sort of easy to use integer constant. With BTF
for externs we can have more flexibility in the future, of course.

>
> >
> > LINUX_KERNEL_VERSION is a special case and can stay as u64.
> >

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables
  2019-11-19 15:42       ` Andrii Nakryiko
@ 2019-11-19 15:58         ` Alexei Starovoitov
  2019-11-19 23:32           ` Daniel Borkmann
  2019-11-20  3:44           ` Andrii Nakryiko
  0 siblings, 2 replies; 20+ messages in thread
From: Alexei Starovoitov @ 2019-11-19 15:58 UTC (permalink / raw)
  To: Andrii Nakryiko, Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Networking, Daniel Borkmann, Kernel Team

On 11/19/19 7:42 AM, Andrii Nakryiko wrote:
> On Mon, Nov 18, 2019 at 10:57 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
>>
>> On Mon, Nov 18, 2019 at 7:21 PM Alexei Starovoitov
>> <alexei.starovoitov@gmail.com> wrote:
>>>
>>> On Sat, Nov 16, 2019 at 11:08:06PM -0800, Andrii Nakryiko wrote:
>>>> Add support for extern variables, provided to BPF program by libbpf. Currently
>>>> the following extern variables are supported:
>>>>    - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is
>>>>      executing, follows KERNEL_VERSION() macro convention;
>>>>    - CONFIG_xxx values; a set of values of actual kernel config. Tristate,
>>>>      boolean, and integer values are supported. Strings are not supported at
>>>>      the moment.
>>>>
>>>> All values are represented as 64-bit integers, with the follow value encoding:
>>>>    - for boolean values, y is 1, n or missing value is 0;
>>>>    - for tristate values, y is 1, m is 2, n or missing value is 0;
>>>>    - for integers, the values is 64-bit integer, sign-extended, if negative; if
>>>>      config value is missing, it's represented as 0, which makes explicit 0 and
>>>>      missing config value indistinguishable. If this will turn out to be
>>>>      a problem in practice, we'll need to deal with it somehow.
>>>
>>> I read that statement as there is no extensibility for such api.
>>
>> What do you mean exactly?
>>
>> Are you worried about 0 vs undefined case? I don't think it's going to
>> be a problem in practice. Looking at my .config, I see that integer
>> config values set to their default values are still explicitly
>> specified with those values. E.g.,
>>
>> CONFIG_HZ_1000=y
>> CONFIG_HZ=1000
>>
>> CONFIG_HZ default is 1000, if CONFIG_HZ_1000==y, but still I see it
>> set. So while I won't claim that it's the case for any possible
>> integer config, it seems to be pretty consistent in practice.
>>
>> Also, I see a lot of values set to explicit 0, like:
>>
>> CONFIG_BASE_SMALL=0
>>
>> So it seems like integers are typically spelled out explicitly in real
>> configs and I think this 0 default is pretty sane.
>>
>> Next, speaking about extensibility. Once we have BTF type info for
>> externs, our possibilities are much better. It will be possible to
>> support bool, int, in64 for the same bool value. Libbpf will be able
>> to validate the range and fail program load if declared extern type
>> doesn't match actual value type and value range. So I think
>> extensibility is there, but right now we are enforcing (logically)
>> everything to be uin64_t. Unfortunately, with the way externs are done
>> in ELF, I don't know neither type nor size, so can't be more strict
>> than that.
>>
>> If we really need to know whether some config value is defined or not,
>> regardless of its value, we can have it by convention. E.g.,
>> CONFIG_DEFINED_XXX will be either 0 or 1, depending if corresponding
>> CONFIG_XXX is defined explicitly or not. But I don't want to add that
>> until we really have a use case where it matters.
>>
>>>
>>>> Generally speaking, libbpf is not aware of which CONFIG_XXX values is of which
>>>> expected type (bool, tristate, int), so it doesn't enforce any specific set of
>>>> values and just parses n/y/m as 0/1/2, respectively. CONFIG_XXX values not
>>>> found in config file are set to 0.
>>>
>>> This is not pretty either.
>>
>> What exactly: defaulting to zero or not knowing config value's type?
>> Given all the options, defaulting to zero seems like the best way to
>> go.
>>
>>>
>>>> +
>>>> +             switch (*value) {
>>>> +             case 'n':
>>>> +                     *ext_val = 0;
>>>> +                     break;
>>>> +             case 'y':
>>>> +                     *ext_val = 1;
>>>> +                     break;
>>>> +             case 'm':
>>>> +                     *ext_val = 2;
>>>> +                     break;
>>
>> reading some more code from scripts/kconfig/symbol.c, I'll need to
>> handle N/Y/M and 0x hexadecimals, will add in v2 after collecting some
>> more feedback on this version.
>>
>>>> +             case '"':
>>>> +                     pr_warn("extern '%s': strings are not supported\n",
>>>> +                             ext->name);
>>>> +                     err = -EINVAL;
>>>> +                     goto out;
>>>> +             default:
>>>> +                     errno = 0;
>>>> +                     *ext_val = strtoull(value, &value_end, 10);
>>>> +                     if (errno) {
>>>> +                             err = -errno;
>>>> +                             pr_warn("extern '%s': failed to parse value: %d\n",
>>>> +                                     ext->name, err);
>>>> +                             goto out;
>>>> +                     }
>>>
>>> BPF has bpf_strtol() helper. I think it would be cleaner to pass whatever
>>> .config has as bytes to the program and let program parse n/y/m, strings and
>>> integers.
>>
>> Config value is not changing. This is an incredible waste of CPU
>> resources to re-parse same value over and over again. And it's
>> incredibly much worse usability as well. Again, once we have BTF for
>> externs, we can just declare values as const char[] and then user will
>> be able to do its own parsing. Until then, I think pre-parsing values
>> into convenient u64 types are much better and handles all the typical
>> cases.
> 
> 
> One more thing I didn't realize I didn't state explicitly, because
> I've been thinking and talking about that for so long now, that it
> kind of internalized completely.
> 
> These externs, including CONFIG_XXX ones, are meant to interoperate
> nicely with field relocations within BPF CO-RE concept. They are,
> among other things, are meant to disable parts of BPF program logic
> through verifier's dead code elimination by doing something like:
> 
> 
> if (CONFIG_SOME_FEATURES_ENABLED) {
>      BPF_CORE_READ(t, some_extra_field);
>      /* or */
>      bpf_helper_that_only_present_when_feature_is_enabled();
> } else {
>      /* fallback logic */
> }
> 
> With CONFIG_SOME_FEATURES_ENABLED not being a read-only integer
> constant when BPF program is loaded, this is impossible. So it
> absolutely must be some sort of easy to use integer constant. 

Hmm. what difference do you see between u64 and char[] ?
The const propagation logic in the verifier should work the same way.
If it doesn't it's a bug in the verifier and it's not ok to hack
extern api to workaround the bug.

What you're advocating with libbpf-side of conversion to integers
reminds me of our earlier attempts with cgroup_sysctl hooks where
we started with ints only to realize that in practice it's too
limited. Then bpf_strtol was introduced and api got much cleaner.
Same thing here. Converting char[] into ints or whatever else
is the job of the program. Not of libbpf. The verifier can be taught
to optimize bpf_strtol() into const when const char[] is passed in.

As far as is_enabled() check doing it as 0/1 the way you're proposing
has in-band signaling issues that you admitted in the commit log.
For is_enabled() may be new builtin() on llvm side would be better?
Something like __builtin_preserve_field_info(field, BPF_FIELD_EXISTS)
but can be used on _any_ extern function or variable.
Like __builtin_is_extern_resolved(extern_name);
Then on libbpf side CONFIG_* that are not in config.gz won't be seen
by the program (instead of seen as 0 in your proposal) and the code
will look like:
if (__builtin_is_extern_resolved(CONFIG_NETNS)) {
   ..do things;
} else {
}
The verifier dead code elimination will take care of branches.
The BPF program itself doesn't need to read the value of CONFIG_
it only needs to know whether it was defined.
Such builtin would match semantics better.
If CONFIG_ is tri-state doing
if (*(u8*)CONFIG_FOO == 'y' || *(u8*)CONFIG_FOO == 'm')
is cleaner than *(u64*)CONFIG_FOO == 1 || 2.
and constant propagation in the verifier should work the same way.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables
  2019-11-19 15:58         ` Alexei Starovoitov
@ 2019-11-19 23:32           ` Daniel Borkmann
  2019-11-20  3:17             ` Andrii Nakryiko
  2019-11-20  3:44           ` Andrii Nakryiko
  1 sibling, 1 reply; 20+ messages in thread
From: Daniel Borkmann @ 2019-11-19 23:32 UTC (permalink / raw)
  To: Alexei Starovoitov, Andrii Nakryiko, Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Networking, Kernel Team

On 11/19/19 4:58 PM, Alexei Starovoitov wrote:
> On 11/19/19 7:42 AM, Andrii Nakryiko wrote:
>> On Mon, Nov 18, 2019 at 10:57 PM Andrii Nakryiko
>> <andrii.nakryiko@gmail.com> wrote:
>>> On Mon, Nov 18, 2019 at 7:21 PM Alexei Starovoitov
>>> <alexei.starovoitov@gmail.com> wrote:
>>>> On Sat, Nov 16, 2019 at 11:08:06PM -0800, Andrii Nakryiko wrote:
>>>>> Add support for extern variables, provided to BPF program by libbpf. Currently
>>>>> the following extern variables are supported:
>>>>>     - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is
>>>>>       executing, follows KERNEL_VERSION() macro convention;
>>>>>     - CONFIG_xxx values; a set of values of actual kernel config. Tristate,
>>>>>       boolean, and integer values are supported. Strings are not supported at
>>>>>       the moment.
>>>>>
>>>>> All values are represented as 64-bit integers, with the follow value encoding:
>>>>>     - for boolean values, y is 1, n or missing value is 0;
>>>>>     - for tristate values, y is 1, m is 2, n or missing value is 0;
>>>>>     - for integers, the values is 64-bit integer, sign-extended, if negative; if
>>>>>       config value is missing, it's represented as 0, which makes explicit 0 and
>>>>>       missing config value indistinguishable. If this will turn out to be
>>>>>       a problem in practice, we'll need to deal with it somehow.
>>>>
>>>> I read that statement as there is no extensibility for such api.
>>>
>>> What do you mean exactly?
>>>
>>> Are you worried about 0 vs undefined case? I don't think it's going to
>>> be a problem in practice. Looking at my .config, I see that integer
>>> config values set to their default values are still explicitly
>>> specified with those values. E.g.,
>>>
>>> CONFIG_HZ_1000=y
>>> CONFIG_HZ=1000
>>>
>>> CONFIG_HZ default is 1000, if CONFIG_HZ_1000==y, but still I see it
>>> set. So while I won't claim that it's the case for any possible
>>> integer config, it seems to be pretty consistent in practice.
>>>
>>> Also, I see a lot of values set to explicit 0, like:
>>>
>>> CONFIG_BASE_SMALL=0
>>>
>>> So it seems like integers are typically spelled out explicitly in real
>>> configs and I think this 0 default is pretty sane.
>>>
>>> Next, speaking about extensibility. Once we have BTF type info for
>>> externs, our possibilities are much better. It will be possible to
>>> support bool, int, in64 for the same bool value. Libbpf will be able
>>> to validate the range and fail program load if declared extern type
>>> doesn't match actual value type and value range. So I think
>>> extensibility is there, but right now we are enforcing (logically)
>>> everything to be uin64_t. Unfortunately, with the way externs are done
>>> in ELF, I don't know neither type nor size, so can't be more strict
>>> than that.
>>>
>>> If we really need to know whether some config value is defined or not,
>>> regardless of its value, we can have it by convention. E.g.,
>>> CONFIG_DEFINED_XXX will be either 0 or 1, depending if corresponding
>>> CONFIG_XXX is defined explicitly or not. But I don't want to add that
>>> until we really have a use case where it matters.
>>>
>>>>> Generally speaking, libbpf is not aware of which CONFIG_XXX values is of which
>>>>> expected type (bool, tristate, int), so it doesn't enforce any specific set of
>>>>> values and just parses n/y/m as 0/1/2, respectively. CONFIG_XXX values not
>>>>> found in config file are set to 0.
>>>>
>>>> This is not pretty either.
>>>
>>> What exactly: defaulting to zero or not knowing config value's type?
>>> Given all the options, defaulting to zero seems like the best way to
>>> go.
>>>
>>>>> +
>>>>> +             switch (*value) {
>>>>> +             case 'n':
>>>>> +                     *ext_val = 0;
>>>>> +                     break;
>>>>> +             case 'y':
>>>>> +                     *ext_val = 1;
>>>>> +                     break;
>>>>> +             case 'm':
>>>>> +                     *ext_val = 2;
>>>>> +                     break;
>>>
>>> reading some more code from scripts/kconfig/symbol.c, I'll need to
>>> handle N/Y/M and 0x hexadecimals, will add in v2 after collecting some
>>> more feedback on this version.
>>>
>>>>> +             case '"':
>>>>> +                     pr_warn("extern '%s': strings are not supported\n",
>>>>> +                             ext->name);
>>>>> +                     err = -EINVAL;
>>>>> +                     goto out;
>>>>> +             default:
>>>>> +                     errno = 0;
>>>>> +                     *ext_val = strtoull(value, &value_end, 10);
>>>>> +                     if (errno) {
>>>>> +                             err = -errno;
>>>>> +                             pr_warn("extern '%s': failed to parse value: %d\n",
>>>>> +                                     ext->name, err);
>>>>> +                             goto out;
>>>>> +                     }
>>>>
>>>> BPF has bpf_strtol() helper. I think it would be cleaner to pass whatever
>>>> .config has as bytes to the program and let program parse n/y/m, strings and
>>>> integers.
>>>
>>> Config value is not changing. This is an incredible waste of CPU
>>> resources to re-parse same value over and over again. And it's
>>> incredibly much worse usability as well. Again, once we have BTF for
>>> externs, we can just declare values as const char[] and then user will
>>> be able to do its own parsing. Until then, I think pre-parsing values
>>> into convenient u64 types are much better and handles all the typical
>>> cases.
>>
>> One more thing I didn't realize I didn't state explicitly, because
>> I've been thinking and talking about that for so long now, that it
>> kind of internalized completely.
>>
>> These externs, including CONFIG_XXX ones, are meant to interoperate
>> nicely with field relocations within BPF CO-RE concept. They are,
>> among other things, are meant to disable parts of BPF program logic
>> through verifier's dead code elimination by doing something like:
>>
>> if (CONFIG_SOME_FEATURES_ENABLED) {
>>       BPF_CORE_READ(t, some_extra_field);
b>>       /* or */
>>       bpf_helper_that_only_present_when_feature_is_enabled();
>> } else {
>>       /* fallback logic */
>> }
>>
>> With CONFIG_SOME_FEATURES_ENABLED not being a read-only integer
>> constant when BPF program is loaded, this is impossible. So it
>> absolutely must be some sort of easy to use integer constant.
> 
> Hmm. what difference do you see between u64 and char[] ?
> The const propagation logic in the verifier should work the same way.
> If it doesn't it's a bug in the verifier and it's not ok to hack
> extern api to workaround the bug.
> 
> What you're advocating with libbpf-side of conversion to integers
> reminds me of our earlier attempts with cgroup_sysctl hooks where
> we started with ints only to realize that in practice it's too
> limited. Then bpf_strtol was introduced and api got much cleaner.
> Same thing here. Converting char[] into ints or whatever else
> is the job of the program. Not of libbpf. The verifier can be taught
> to optimize bpf_strtol() into const when const char[] is passed in.
> 
> As far as is_enabled() check doing it as 0/1 the way you're proposing
> has in-band signaling issues that you admitted in the commit log.
> For is_enabled() may be new builtin() on llvm side would be better?
> Something like __builtin_preserve_field_info(field, BPF_FIELD_EXISTS)
> but can be used on _any_ extern function or variable.
> Like __builtin_is_extern_resolved(extern_name);
> Then on libbpf side CONFIG_* that are not in config.gz won't be seen
> by the program (instead of seen as 0 in your proposal) and the code
> will look like:
> if (__builtin_is_extern_resolved(CONFIG_NETNS)) {
>     ..do things;
> } else {
> }
> The verifier dead code elimination will take care of branches.

I sort of like the option of __builtin_is_extern_resolved() better than
plain 0 to provide an option for the developer to explicitly check for
that. But I'd like to take a step back in the discussion on the topic of
bpf_object__init_extern_map(). I'm wondering why it must be part of libbpf
at all to read the kernel config and resolve CONFIG_ / LINUX_KERNEL_VERSION
automatically. This feels a bit too much assumptions and automagic resolving.
Can't the application on top of libbpf pass in a callback where the extern
resolution would be outsourced into application rather than in libbpf?
Reason I'm asking is two-fold: i) this concept feels quite generic and my
take is that this could be applied to many other things as well beyond just
plain kernel config, ii) callback would also allow to experiment first what
would work best in practice wrt kernel config as one specific example, and
in a later step, libbpf could provide this as one built-in callback option
for the user to opt-into if its found to be generic/useful enough.

> The BPF program itself doesn't need to read the value of CONFIG_
> it only needs to know whether it was defined.
> Such builtin would match semantics better.
> If CONFIG_ is tri-state doing
> if (*(u8*)CONFIG_FOO == 'y' || *(u8*)CONFIG_FOO == 'm')

Passing in raw buffer feels more natural, agree, but then, again, if we had
a callback it would be up to the one who's implementing it.

> is cleaner than *(u64*)CONFIG_FOO == 1 || 2.
> and constant propagation in the verifier should work the same way.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables
  2019-11-19 23:32           ` Daniel Borkmann
@ 2019-11-20  3:17             ` Andrii Nakryiko
  0 siblings, 0 replies; 20+ messages in thread
From: Andrii Nakryiko @ 2019-11-20  3:17 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Alexei Starovoitov, Alexei Starovoitov, Andrii Nakryiko, bpf,
	Networking, Kernel Team

On Tue, Nov 19, 2019 at 3:32 PM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 11/19/19 4:58 PM, Alexei Starovoitov wrote:
> > On 11/19/19 7:42 AM, Andrii Nakryiko wrote:
> >> On Mon, Nov 18, 2019 at 10:57 PM Andrii Nakryiko
> >> <andrii.nakryiko@gmail.com> wrote:
> >>> On Mon, Nov 18, 2019 at 7:21 PM Alexei Starovoitov
> >>> <alexei.starovoitov@gmail.com> wrote:
> >>>> On Sat, Nov 16, 2019 at 11:08:06PM -0800, Andrii Nakryiko wrote:
> >>>>> Add support for extern variables, provided to BPF program by libbpf. Currently
> >>>>> the following extern variables are supported:
> >>>>>     - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is
> >>>>>       executing, follows KERNEL_VERSION() macro convention;
> >>>>>     - CONFIG_xxx values; a set of values of actual kernel config. Tristate,
> >>>>>       boolean, and integer values are supported. Strings are not supported at
> >>>>>       the moment.
> >>>>>
> >>>>> All values are represented as 64-bit integers, with the follow value encoding:
> >>>>>     - for boolean values, y is 1, n or missing value is 0;
> >>>>>     - for tristate values, y is 1, m is 2, n or missing value is 0;
> >>>>>     - for integers, the values is 64-bit integer, sign-extended, if negative; if
> >>>>>       config value is missing, it's represented as 0, which makes explicit 0 and
> >>>>>       missing config value indistinguishable. If this will turn out to be
> >>>>>       a problem in practice, we'll need to deal with it somehow.
> >>>>
> >>>> I read that statement as there is no extensibility for such api.
> >>>
> >>> What do you mean exactly?
> >>>
> >>> Are you worried about 0 vs undefined case? I don't think it's going to
> >>> be a problem in practice. Looking at my .config, I see that integer
> >>> config values set to their default values are still explicitly
> >>> specified with those values. E.g.,
> >>>
> >>> CONFIG_HZ_1000=y
> >>> CONFIG_HZ=1000
> >>>
> >>> CONFIG_HZ default is 1000, if CONFIG_HZ_1000==y, but still I see it
> >>> set. So while I won't claim that it's the case for any possible
> >>> integer config, it seems to be pretty consistent in practice.
> >>>
> >>> Also, I see a lot of values set to explicit 0, like:
> >>>
> >>> CONFIG_BASE_SMALL=0
> >>>
> >>> So it seems like integers are typically spelled out explicitly in real
> >>> configs and I think this 0 default is pretty sane.
> >>>
> >>> Next, speaking about extensibility. Once we have BTF type info for
> >>> externs, our possibilities are much better. It will be possible to
> >>> support bool, int, in64 for the same bool value. Libbpf will be able
> >>> to validate the range and fail program load if declared extern type
> >>> doesn't match actual value type and value range. So I think
> >>> extensibility is there, but right now we are enforcing (logically)
> >>> everything to be uin64_t. Unfortunately, with the way externs are done
> >>> in ELF, I don't know neither type nor size, so can't be more strict
> >>> than that.
> >>>
> >>> If we really need to know whether some config value is defined or not,
> >>> regardless of its value, we can have it by convention. E.g.,
> >>> CONFIG_DEFINED_XXX will be either 0 or 1, depending if corresponding
> >>> CONFIG_XXX is defined explicitly or not. But I don't want to add that
> >>> until we really have a use case where it matters.
> >>>
> >>>>> Generally speaking, libbpf is not aware of which CONFIG_XXX values is of which
> >>>>> expected type (bool, tristate, int), so it doesn't enforce any specific set of
> >>>>> values and just parses n/y/m as 0/1/2, respectively. CONFIG_XXX values not
> >>>>> found in config file are set to 0.
> >>>>
> >>>> This is not pretty either.
> >>>
> >>> What exactly: defaulting to zero or not knowing config value's type?
> >>> Given all the options, defaulting to zero seems like the best way to
> >>> go.
> >>>
> >>>>> +
> >>>>> +             switch (*value) {
> >>>>> +             case 'n':
> >>>>> +                     *ext_val = 0;
> >>>>> +                     break;
> >>>>> +             case 'y':
> >>>>> +                     *ext_val = 1;
> >>>>> +                     break;
> >>>>> +             case 'm':
> >>>>> +                     *ext_val = 2;
> >>>>> +                     break;
> >>>
> >>> reading some more code from scripts/kconfig/symbol.c, I'll need to
> >>> handle N/Y/M and 0x hexadecimals, will add in v2 after collecting some
> >>> more feedback on this version.
> >>>
> >>>>> +             case '"':
> >>>>> +                     pr_warn("extern '%s': strings are not supported\n",
> >>>>> +                             ext->name);
> >>>>> +                     err = -EINVAL;
> >>>>> +                     goto out;
> >>>>> +             default:
> >>>>> +                     errno = 0;
> >>>>> +                     *ext_val = strtoull(value, &value_end, 10);
> >>>>> +                     if (errno) {
> >>>>> +                             err = -errno;
> >>>>> +                             pr_warn("extern '%s': failed to parse value: %d\n",
> >>>>> +                                     ext->name, err);
> >>>>> +                             goto out;
> >>>>> +                     }
> >>>>
> >>>> BPF has bpf_strtol() helper. I think it would be cleaner to pass whatever
> >>>> .config has as bytes to the program and let program parse n/y/m, strings and
> >>>> integers.
> >>>
> >>> Config value is not changing. This is an incredible waste of CPU
> >>> resources to re-parse same value over and over again. And it's
> >>> incredibly much worse usability as well. Again, once we have BTF for
> >>> externs, we can just declare values as const char[] and then user will
> >>> be able to do its own parsing. Until then, I think pre-parsing values
> >>> into convenient u64 types are much better and handles all the typical
> >>> cases.
> >>
> >> One more thing I didn't realize I didn't state explicitly, because
> >> I've been thinking and talking about that for so long now, that it
> >> kind of internalized completely.
> >>
> >> These externs, including CONFIG_XXX ones, are meant to interoperate
> >> nicely with field relocations within BPF CO-RE concept. They are,
> >> among other things, are meant to disable parts of BPF program logic
> >> through verifier's dead code elimination by doing something like:
> >>
> >> if (CONFIG_SOME_FEATURES_ENABLED) {
> >>       BPF_CORE_READ(t, some_extra_field);
> b>>       /* or */
> >>       bpf_helper_that_only_present_when_feature_is_enabled();
> >> } else {
> >>       /* fallback logic */
> >> }
> >>
> >> With CONFIG_SOME_FEATURES_ENABLED not being a read-only integer
> >> constant when BPF program is loaded, this is impossible. So it
> >> absolutely must be some sort of easy to use integer constant.
> >
> > Hmm. what difference do you see between u64 and char[] ?
> > The const propagation logic in the verifier should work the same way.
> > If it doesn't it's a bug in the verifier and it's not ok to hack
> > extern api to workaround the bug.
> >
> > What you're advocating with libbpf-side of conversion to integers
> > reminds me of our earlier attempts with cgroup_sysctl hooks where
> > we started with ints only to realize that in practice it's too
> > limited. Then bpf_strtol was introduced and api got much cleaner.
> > Same thing here. Converting char[] into ints or whatever else
> > is the job of the program. Not of libbpf. The verifier can be taught
> > to optimize bpf_strtol() into const when const char[] is passed in.
> >
> > As far as is_enabled() check doing it as 0/1 the way you're proposing
> > has in-band signaling issues that you admitted in the commit log.
> > For is_enabled() may be new builtin() on llvm side would be better?
> > Something like __builtin_preserve_field_info(field, BPF_FIELD_EXISTS)
> > but can be used on _any_ extern function or variable.
> > Like __builtin_is_extern_resolved(extern_name);
> > Then on libbpf side CONFIG_* that are not in config.gz won't be seen
> > by the program (instead of seen as 0 in your proposal) and the code
> > will look like:
> > if (__builtin_is_extern_resolved(CONFIG_NETNS)) {
> >     ..do things;
> > } else {
> > }
> > The verifier dead code elimination will take care of branches.
>
> I sort of like the option of __builtin_is_extern_resolved() better than
> plain 0 to provide an option for the developer to explicitly check for

I think we can actually have both and let user decide which semantics
works better for seem.
If extern is defined as a weak symbol, we'll default to 0 the way that
I proposed initially. If extern is not weak, than it will have to be
guarded with __builtin_extern_resolved(), otherwise read will cause
verifier to reject the read. Does that work?

> that. But I'd like to take a step back in the discussion on the topic of
> bpf_object__init_extern_map(). I'm wondering why it must be part of libbpf
> at all to read the kernel config and resolve CONFIG_ / LINUX_KERNEL_VERSION
> automatically. This feels a bit too much assumptions and automagic resolving.
> Can't the application on top of libbpf pass in a callback where the extern
> resolution would be outsourced into application rather than in libbpf?

What you are describing is already possible today, it's called global
data. Specifically, .rodata would have exactly the same constant
tracking capabilities, yet be completely application-provided. We
definitely need to improve the API to initialize it, though, but it's
separate from this whole extern discussion. I'm going to work on that
soon as well.

The point of externs, though, is to denote something that's not
allocated and populated by BPF program or its userspace control app,
it's something that's provided by some outside "entity": libbpf itself
(e.g., for LINUX_KERNEL_VERSION and kernel config), kernel (for
whatever we are going to expose from kernel eventually), or another
BPF object (as part of static or dynamic linking). Right now kernel-
and other BPF object-provided use cases are not yet developed, but it
fits in this model (even though it will probably have a considerably
different internal implementation). You can think about Kconfig as
being provided by kernel itself, but of course implemented by libbpf.
Think about writing kernel code. You'd expect to have CONFIG_HZ
available to you without any parsing, such that you can just use it in
your expressions. Similarly, #ifdef CONFIG_TASK_IO_ACCOUNTING form is
a norm in kernel code. With these externs, BPF programs will feel even
more like they are extension of kernel, with all the similar
"facilities" at user's disposal.

I don't think it's a good idea to require all application that would
benefit from knowing few CONFIG_XXX values to re-implement their own
config parsing logic. We'll end up with lots of slightly incompatible
implementations. While not complicated, the logic still requires quite
a lot of coding, so we are going to save a bunch of effort for
prospective users of this feature.

On the other hand, kernel config is a pretty well-defined and common
problem, that we can actually solve it nicely and provide a great and
intuitive way to get and use it. It's also extensible, and once we
have BTF types for externs, libbpf will become even more clever,
allowing users to specify whether they want to get value as bool, or
some tristate enum, or integer, or maybe just a raw string
representation. Before we have that, I think defaulting to uniform
int64_t makes a lot of sense and doesn't preclude further improvements
and extensions.

> Reason I'm asking is two-fold: i) this concept feels quite generic and my
> take is that this could be applied to many other things as well beyond just
> plain kernel config, ii) callback would also allow to experiment first what
> would work best in practice wrt kernel config as one specific example, and
> in a later step, libbpf could provide this as one built-in callback option
> for the user to opt-into if its found to be generic/useful enough.
>
> > The BPF program itself doesn't need to read the value of CONFIG_
> > it only needs to know whether it was defined.
> > Such builtin would match semantics better.
> > If CONFIG_ is tri-state doing
> > if (*(u8*)CONFIG_FOO == 'y' || *(u8*)CONFIG_FOO == 'm')
>
> Passing in raw buffer feels more natural, agree, but then, again, if we had
> a callback it would be up to the one who's implementing it.

See above, I think weak vs strong extern would give us the best of
both worlds. Applications not caring to distinguished "undefined" vs
0, would just specify it as a weak symbol. Default strong extern,
though, will enforce that whoever tries to fetch value needs to ensure
that extern was actually found and resolved.

>
> > is cleaner than *(u64*)CONFIG_FOO == 1 || 2.
> > and constant propagation in the verifier should work the same way.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables
  2019-11-19 15:58         ` Alexei Starovoitov
  2019-11-19 23:32           ` Daniel Borkmann
@ 2019-11-20  3:44           ` Andrii Nakryiko
  2019-11-20 21:39             ` Alexei Starovoitov
  1 sibling, 1 reply; 20+ messages in thread
From: Andrii Nakryiko @ 2019-11-20  3:44 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, Networking,
	Daniel Borkmann, Kernel Team

On Tue, Nov 19, 2019 at 7:58 AM Alexei Starovoitov <ast@fb.com> wrote:
>
> On 11/19/19 7:42 AM, Andrii Nakryiko wrote:
> > On Mon, Nov 18, 2019 at 10:57 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> >>
> >> On Mon, Nov 18, 2019 at 7:21 PM Alexei Starovoitov
> >> <alexei.starovoitov@gmail.com> wrote:
> >>>
> >>> On Sat, Nov 16, 2019 at 11:08:06PM -0800, Andrii Nakryiko wrote:
> >>>> Add support for extern variables, provided to BPF program by libbpf. Currently
> >>>> the following extern variables are supported:
> >>>>    - LINUX_KERNEL_VERSION; version of a kernel in which BPF program is
> >>>>      executing, follows KERNEL_VERSION() macro convention;
> >>>>    - CONFIG_xxx values; a set of values of actual kernel config. Tristate,
> >>>>      boolean, and integer values are supported. Strings are not supported at
> >>>>      the moment.
> >>>>
> >>>> All values are represented as 64-bit integers, with the follow value encoding:
> >>>>    - for boolean values, y is 1, n or missing value is 0;
> >>>>    - for tristate values, y is 1, m is 2, n or missing value is 0;
> >>>>    - for integers, the values is 64-bit integer, sign-extended, if negative; if
> >>>>      config value is missing, it's represented as 0, which makes explicit 0 and
> >>>>      missing config value indistinguishable. If this will turn out to be
> >>>>      a problem in practice, we'll need to deal with it somehow.
> >>>
> >>> I read that statement as there is no extensibility for such api.
> >>
> >> What do you mean exactly?
> >>
> >> Are you worried about 0 vs undefined case? I don't think it's going to
> >> be a problem in practice. Looking at my .config, I see that integer
> >> config values set to their default values are still explicitly
> >> specified with those values. E.g.,
> >>
> >> CONFIG_HZ_1000=y
> >> CONFIG_HZ=1000
> >>
> >> CONFIG_HZ default is 1000, if CONFIG_HZ_1000==y, but still I see it
> >> set. So while I won't claim that it's the case for any possible
> >> integer config, it seems to be pretty consistent in practice.
> >>
> >> Also, I see a lot of values set to explicit 0, like:
> >>
> >> CONFIG_BASE_SMALL=0
> >>
> >> So it seems like integers are typically spelled out explicitly in real
> >> configs and I think this 0 default is pretty sane.
> >>
> >> Next, speaking about extensibility. Once we have BTF type info for
> >> externs, our possibilities are much better. It will be possible to
> >> support bool, int, in64 for the same bool value. Libbpf will be able
> >> to validate the range and fail program load if declared extern type
> >> doesn't match actual value type and value range. So I think
> >> extensibility is there, but right now we are enforcing (logically)
> >> everything to be uin64_t. Unfortunately, with the way externs are done
> >> in ELF, I don't know neither type nor size, so can't be more strict
> >> than that.
> >>
> >> If we really need to know whether some config value is defined or not,
> >> regardless of its value, we can have it by convention. E.g.,
> >> CONFIG_DEFINED_XXX will be either 0 or 1, depending if corresponding
> >> CONFIG_XXX is defined explicitly or not. But I don't want to add that
> >> until we really have a use case where it matters.
> >>
> >>>
> >>>> Generally speaking, libbpf is not aware of which CONFIG_XXX values is of which
> >>>> expected type (bool, tristate, int), so it doesn't enforce any specific set of
> >>>> values and just parses n/y/m as 0/1/2, respectively. CONFIG_XXX values not
> >>>> found in config file are set to 0.
> >>>
> >>> This is not pretty either.
> >>
> >> What exactly: defaulting to zero or not knowing config value's type?
> >> Given all the options, defaulting to zero seems like the best way to
> >> go.
> >>
> >>>
> >>>> +
> >>>> +             switch (*value) {
> >>>> +             case 'n':
> >>>> +                     *ext_val = 0;
> >>>> +                     break;
> >>>> +             case 'y':
> >>>> +                     *ext_val = 1;
> >>>> +                     break;
> >>>> +             case 'm':
> >>>> +                     *ext_val = 2;
> >>>> +                     break;
> >>
> >> reading some more code from scripts/kconfig/symbol.c, I'll need to
> >> handle N/Y/M and 0x hexadecimals, will add in v2 after collecting some
> >> more feedback on this version.
> >>
> >>>> +             case '"':
> >>>> +                     pr_warn("extern '%s': strings are not supported\n",
> >>>> +                             ext->name);
> >>>> +                     err = -EINVAL;
> >>>> +                     goto out;
> >>>> +             default:
> >>>> +                     errno = 0;
> >>>> +                     *ext_val = strtoull(value, &value_end, 10);
> >>>> +                     if (errno) {
> >>>> +                             err = -errno;
> >>>> +                             pr_warn("extern '%s': failed to parse value: %d\n",
> >>>> +                                     ext->name, err);
> >>>> +                             goto out;
> >>>> +                     }
> >>>
> >>> BPF has bpf_strtol() helper. I think it would be cleaner to pass whatever
> >>> .config has as bytes to the program and let program parse n/y/m, strings and
> >>> integers.
> >>
> >> Config value is not changing. This is an incredible waste of CPU
> >> resources to re-parse same value over and over again. And it's
> >> incredibly much worse usability as well. Again, once we have BTF for
> >> externs, we can just declare values as const char[] and then user will
> >> be able to do its own parsing. Until then, I think pre-parsing values
> >> into convenient u64 types are much better and handles all the typical
> >> cases.
> >
> >
> > One more thing I didn't realize I didn't state explicitly, because
> > I've been thinking and talking about that for so long now, that it
> > kind of internalized completely.
> >
> > These externs, including CONFIG_XXX ones, are meant to interoperate
> > nicely with field relocations within BPF CO-RE concept. They are,
> > among other things, are meant to disable parts of BPF program logic
> > through verifier's dead code elimination by doing something like:
> >
> >
> > if (CONFIG_SOME_FEATURES_ENABLED) {
> >      BPF_CORE_READ(t, some_extra_field);
> >      /* or */
> >      bpf_helper_that_only_present_when_feature_is_enabled();
> > } else {
> >      /* fallback logic */
> > }
> >
> > With CONFIG_SOME_FEATURES_ENABLED not being a read-only integer
> > constant when BPF program is loaded, this is impossible. So it
> > absolutely must be some sort of easy to use integer constant.
>
> Hmm. what difference do you see between u64 and char[] ?
> The const propagation logic in the verifier should work the same way.
> If it doesn't it's a bug in the verifier and it's not ok to hack
> extern api to workaround the bug.

For some specific subset of cases (mostly bool and tristate), yes,
verifier should be able to track raw byte value, because there is no
transformation applied to those values. With integers it's certainly
not the case.

Imagine something like:

if (CONFIG_HZ > 1000) {
  ... do something ...
else if (CONFIG_HZ > 100) {
  ... something else ...
} else {
  ... yet another fallback ...
}

With CONFIG_HZ being integer, entire if/else if/else will be
eliminated by verifier. If you do bpf_strtoul(), it's not possible,
unless we do those more further optimizations (implementing const
versions of helpers). The latter would be a good addition, if possible
to implement generically, of course, but I'm not sure we need to block
on that.

>
> What you're advocating with libbpf-side of conversion to integers
> reminds me of our earlier attempts with cgroup_sysctl hooks where
> we started with ints only to realize that in practice it's too
> limited. Then bpf_strtol was introduced and api got much cleaner.
> Same thing here. Converting char[] into ints or whatever else
> is the job of the program. Not of libbpf. The verifier can be taught
> to optimize bpf_strtol() into const when const char[] is passed in.

Given config values are constant and won't change throughout lifetime
of kernel, it's much more practical and easier to parse any
complicated value in userspace and pass necessary well-structured and
easy to use (and thus - performant) data to BPF side through global
data. But if BPF developer knows that CONFIG_HZ has to be integer
(because it's defined in Kconfig as having a type int), then it's
going to be integer, there is no doubt about that.

So while I can imagine some extreme cases where we might need parsing
string, I think most such cases can be painlessly solved in userspace.

Having said that, I don't oppose having an option to expose strings
and allow to work with them, either from userspace or BPF side. It is
extension of what I implemented and can be easily added.

>
> As far as is_enabled() check doing it as 0/1 the way you're proposing
> has in-band signaling issues that you admitted in the commit log.
> For is_enabled() may be new builtin() on llvm side would be better?
> Something like __builtin_preserve_field_info(field, BPF_FIELD_EXISTS)
> but can be used on _any_ extern function or variable.
> Like __builtin_is_extern_resolved(extern_name);
> Then on libbpf side CONFIG_* that are not in config.gz won't be seen
> by the program (instead of seen as 0 in your proposal) and the code
> will look like:
> if (__builtin_is_extern_resolved(CONFIG_NETNS)) {
>    ..do things;
> } else {
> }
> The verifier dead code elimination will take care of branches.
> The BPF program itself doesn't need to read the value of CONFIG_
> it only needs to know whether it was defined.
> Such builtin would match semantics better.

I agree such __builtin is useful and we should add it. But also I
believe that a lot of common cases would be much simpler and nicer if
we have this not defined = 0 logic. But I think we can satisfy both
sides without sacrificing anything. If you define extern as weak:

extern __attribute__((weak)) uint64_t CONFIG_MISSING;

Libbpf will set it to zero, if it's not recognized/found. So check
like below will nicely work:

if (CONFIG_MISSING) {
  .. do something ...
}

If the extern is strong, then the above check will fail, and will have
to be written with using __builtin:

if (__builtin_extern_resolved(CONFIG_MISSING)) {
  .. do something ..
}

This puts control over semantics into users hands. WDYT?


> If CONFIG_ is tri-state doing
> if (*(u8*)CONFIG_FOO == 'y' || *(u8*)CONFIG_FOO == 'm')
> is cleaner than *(u64*)CONFIG_FOO == 1 || 2.
> and constant propagation in the verifier should work the same way.

Once we get BTF info for externs, you'll be able to do just that very cleanly:

extern bool CONFIG_FOO; - true, if =y, false if =n, false, if extern
is weak and undefined in config
extern char CONFIG_FOO; will get 'y'/'n'/'m' values
extern enum tristate CONFIG_FOO - YES/NO/MODULE (or whatever we define
in enum tristate)
extern char CONFIG_FOO[]; - will get raw zero-terminated "y\0", "n\0", or "m\0"

Until we have BTF, though, we can dictate that all those should be
defined uniformly as uin64_t and be handled as in my current patch.
Then, with introduction of BTF, libbpf will do necessary extra
transformations and enforcement of type (e.g., if defined as `extern
bool CONFIG_FOO`, but actual value is m (module) - that will be an
error and enforced by libbpf).

Only if currently users will still use something like bool or int,
instead of uint64_t, that might break later because with BTF for
externs libbpf will suddenly start enforcing more restrictions. But I
think it's just going to be a misuse of current API and shouldn't be
considered a breaking change.

So, to summarize, we proceed with uint64_t for everything, with added
bits of weak vs strong handling. Then in parallel we'll work on adding
BTF for externs and __builtin_extern_resolved (and corresponding new
kind of BTF relocation) and will keep making this whole API even
better, while already having something useful and extensible.

As for strings, I'd prefer to add them in a follow up patch, but if
you guys insist, I can add them anyways. One reservation about strings
I do still have is how better to represent strings:

extern const char *CONFIG_SOME_STRING;   /* pointer to a string
literal, stored in soem other place */
/* or */
extern const char CONFIG_SOME_STRING[];  /* inlined string contents */

The latter can be implemented and used (to some degree) today. But the
former would more seamlessly blend with exposing string-based
variables from kernel. Also,

extern __attribute__((weak)) const char *CONFIG_SOME_STRING;

would more naturally resolve to NULL, if not explicitly defined in
kernel config.

But I'm feeling less strongly about strings overall.

Thoughts?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables
  2019-11-20  3:44           ` Andrii Nakryiko
@ 2019-11-20 21:39             ` Alexei Starovoitov
  2019-11-20 22:47               ` Andrii Nakryiko
  0 siblings, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2019-11-20 21:39 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, Networking,
	Daniel Borkmann, Kernel Team

On 11/19/19 7:44 PM, Andrii Nakryiko wrote:
> So, to summarize, we proceed with uint64_t for everything, with added
> bits of weak vs strong handling. Then in parallel we'll work on adding
> BTF for externs and __builtin_extern_resolved (and corresponding new
> kind of BTF relocation) and will keep making this whole API even
> better, while already having something useful and extensible.

I didn't know extern can be weak. That was a good find.
Indeed undefined config_* can be represented as uint64 zero value.
But I still have an issue with 'proceed uint64_t for everything',
since strings and tri-state don't fit into uint64.

How about strtol be applied by libbpf only for things that parse 
successfully (like decimal and hex) and everything else will be 
represented raw ?
Like CONFIG=y will be 121.
CONFIG=m will be 109
CONFIG="abc" will be 0x636261
In other words CONFIG_A is an address and
'extern weak uint64_t CONFIG_A' reads raw bytes from that location.
When that CONFIG_A is undefined in /boot/config.gz the u64 read
from that address will return zero.
u8 read from that address will return zero too.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables
  2019-11-20 21:39             ` Alexei Starovoitov
@ 2019-11-20 22:47               ` Andrii Nakryiko
  2019-11-21  0:18                 ` Alexei Starovoitov
  0 siblings, 1 reply; 20+ messages in thread
From: Andrii Nakryiko @ 2019-11-20 22:47 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, Networking,
	Daniel Borkmann, Kernel Team

On Wed, Nov 20, 2019 at 1:39 PM Alexei Starovoitov <ast@fb.com> wrote:
>
> On 11/19/19 7:44 PM, Andrii Nakryiko wrote:
> > So, to summarize, we proceed with uint64_t for everything, with added
> > bits of weak vs strong handling. Then in parallel we'll work on adding
> > BTF for externs and __builtin_extern_resolved (and corresponding new
> > kind of BTF relocation) and will keep making this whole API even
> > better, while already having something useful and extensible.
>
> I didn't know extern can be weak. That was a good find.
> Indeed undefined config_* can be represented as uint64 zero value.
> But I still have an issue with 'proceed uint64_t for everything',
> since strings and tri-state don't fit into uint64.

well, great that we are at least aligned w.r.t. int/hex :)

not clear about bool, though. They are subset of tristate, so should
they be 'y'/'n' (which is certainly weird, isn't it?) raw characters
(with zero byte termination? Or just plain single char?) or C's 0 and
1?

>
> How about strtol be applied by libbpf only for things that parse
> successfully (like decimal and hex) and everything else will be
> represented raw ?
> Like CONFIG=y will be 121.
> CONFIG=m will be 109

Again, conceptually, bool is subset of tristate. So if we go with real
bools, then 0 and 1 should match to tristate's y/n as well.

For tristate we also can have 8-byte enum, if you think that's better
than short 0/1/2 (or whatever, -1, if you don't like 2):

enum kconfig_tristate {
        NO = 0,
        YES = 1,
        MODULE = 2,
        __KCONFIG_TRISTATE_INVALID = 0xFFFFFFFFFFFFFFFF
};

> CONFIG="abc" will be 0x636261
> In other words CONFIG_A is an address and
> 'extern weak uint64_t CONFIG_A' reads raw bytes from that location.
> When that CONFIG_A is undefined in /boot/config.gz the u64 read
> from that address will return zero.
> u8 read from that address will return zero too.

I'm puzzled on why strings should be represented as "extern int64_t"?
What about strings longer than 7 bytes (+1 zero terminator)? IMO,
strings should be strings, i.e., const char []. Or const char *, if we
had a support (see my previous email). Cramming arbitrary strings into
uint64_t seems weird and I fail to see the reason.


So just to summarize how I understand situation right now. Please let
me know if that's not true.

1. We proceed with extern uint64_t for what's detected as integer/hex
(at runtime).
2. Weak/strong externs determine how we handle unresolved config
values. Weak - default to 0, strong - ensure it can't be read. But
this might be complicated by our further choices below.

Things that I'd like to clarify. And I'll list options we've discussed.

1. For bools:
  a) parse them as 0/1, for now use them as uint64_t. Once BTF for
externs available, users will be able to pick bool instead.
  b) parse them as 0/1 (false, true), represent as single byte bool.
  c) leave them as raw single character: 'y' or 'n'
    c1) represent as single byte?
    c2) represent as 8 bytes to match uint64_t?
  d) leave them as a raw string (zero-terminated)
    d1) represent as fixed 8-byte string?..
    d2) represent as variable-sized raw string?

2. For tristate, pretty much the same set of questions, except there
is one more state 'm'. One extra option: represent as 8-byte enum,
maybe? We can do 4-byte enum as well, but that has all the downsides
of variable sized externs without yet having BTF type info.

3. For strings:
  a) don't add them yet, if string is detected in runtime, return
error (my original proposal);
  b) add them as up to 7 characters fixed strings you proposed? What
do we do with longer strings?
  c) add them as variable-sized zero-terminated strings (libbpf will
determine size at runtime). See below for consequences.

If we pick any combination where fields are not uniform (so, if we add
strings right now, or have bool as 1 byte, or whatever), we'll have to
specify what size value we fall back to if weakly-defined extern is
not resolved. Is it going to be zero uint64_t? This is wasteful for
what's expected to be a string (and dirty). On the other hand,
specifying it as empty single-byte string is not going to work for
CONFIG_XXX that is undefined integer. Generally speaking, having
variable sized representation is very confusing until we have expected
type information or at least expected byte size.

Given all this, I think realistically we can pick few combinations:

1. Only support int/hex as uint64_t. Anything y/n/m or a string will
fail in runtime.
Pros:
  - we get at least something to use in practice (LINUX_KERNEL_VERSION
and int CONFIG_XXXs).
Cons:
  - undefined weak extern will have to be assumed either uint64_t 0
(and succeed) or undefined string (and fail). No clearly best behavior
here.
  - no ability to do "feature detection" logic in BPF program.
2. Stick to uint64_t for bool/tristate/int/hex. Don't support strings
yet. Improve in backwards compatible way once we get BTF type info for
externs (adding strings at that time).
Pros:
  - well-defined and logical behavior
  - easily extensible once we get BTF for externs. No new flags, no
changes in behavior.
Cons:
  - no strings yet
  - Alexei doesn't like y/n/m as 0/1/2
3. Mix of the options for bool/tristate/string I layed out above.
Pros:
  - we can get reasonable strings (or up to 7-byte ones, but not sure why);
  - bool/tristate can be accessed, though, IMO, in an ugly
character-based fashion.
Cons:
  - some arbitrary choices need to be made, that couldn't be changed
even when we have BTF for externs.
  - needs further discussion and specification to narrow down all the downsides.

My preferences would be 2, if not, then 1.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables
  2019-11-20 22:47               ` Andrii Nakryiko
@ 2019-11-21  0:18                 ` Alexei Starovoitov
  2019-11-21  2:13                   ` Andrii Nakryiko
  0 siblings, 1 reply; 20+ messages in thread
From: Alexei Starovoitov @ 2019-11-21  0:18 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, Networking,
	Daniel Borkmann, Kernel Team

On Wed, Nov 20, 2019 at 02:47:58PM -0800, Andrii Nakryiko wrote:
> 
> Given all this, I think realistically we can pick few combinations:
> 
> 1. Only support int/hex as uint64_t. Anything y/n/m or a string will
> fail in runtime.
> Pros:
>   - we get at least something to use in practice (LINUX_KERNEL_VERSION
> and int CONFIG_XXXs).
> Cons:
>   - undefined weak extern will have to be assumed either uint64_t 0
> (and succeed) or undefined string (and fail). No clearly best behavior
> here.

what that uint64 going to be when config_ is defined?
If your answer is 1 than it's not extensible.

>   - no ability to do "feature detection" logic in BPF program.
> 2. Stick to uint64_t for bool/tristate/int/hex. Don't support strings
> yet. Improve in backwards compatible way once we get BTF type info for
> externs (adding strings at that time).
> Pros:
>   - well-defined and logical behavior
>   - easily extensible once we get BTF for externs. No new flags, no
> changes in behavior.

extensible with new flag == not extensible.
The choices for bpf program that we pick for extern must keep working
when llvm starts generating BTF for externs.

> My preferences would be 2, if not, then 1.

I'm proposing something else.
I see libbpf as a tool to pass /boot/config.gz into bpf program. From program
pov CONFIG_FOO is a label. It's not a variable of type u8 or type u64.
Example:
CONFIG_A=100
CONFIG_B=y
CONFIG_C="abcd"
will be a map of one element with value:
char ar[]= { 0x64, 0, 0, 0, 'y', 'a', 'b', 'c', 'd', 0};

CONFIG_A = &ar[0];
CONFIG_B = &ar[4];
CONFIG_C = &ar[5];

libbpf parses config.gz and converts all int/hex into 4 byte or 8 byte
integers depending on number of text digits it sees in config.gz
with alignment. All other strings and characters are passed as-is.
If program says
extern u8 CONFIG_A, CONFIG_B, CONFIG_C;
It will read 1st byte from these three labels.
Later when llvm emits BTF those u8 swill stay as-is and will read the same
things. With BTF if program says 'extern _Bool CONFIG_B;' then it will be an
explicit direction for libbpf to convert that CONFIG_B value into _Bool at
program load time and represent it as sizeof(_Bool) in map element. If program
says 'extern uint32_t CONFIG_C;' the libbpf will keep first 4 bytes of that
string in there. If program says 'extern uint64_t CONFIG_C;' the libbpf will
keep first 4 character plus one byte for zero plus 3 bytes of padding in map
element.
'extern char CONFIG_C[5];' is also ok. Either with or without BTF.
In both cases it will be 5 bytes in map element.
Without BTF doing 'extern char *CONFIG_C;' will read garbage 8 bytes from that
label. With BTF 'extern char *CONFIG_C;' will be converted to pointer to inner
bytes of map element.
In other words BTF types will be directives of how libbpf should convert
strings in text file to be consumed by bpf program. Since we don't have types
now int/hex are the only ones that libbpf will convert unconditionally. The
logic of parsing config.gz is generic. It can parse any file with 'var=value'
lines this way. All names will be preserved.
In that sense LINUX_KERNEL_VERSION is a text file that has one line:
LINUX_KERNEL_VESRION=1234
If digits fit into u32 it should be u32.
This way users can feed any configuration.txt file into libbpf and into their
programs. Without BTF the map element is a collection of raw bytes and the
program needs to be smart about reading them with correct sizes. With BTF
libbpf will be converting .txt into requested types and failing to load
if libbpf cannot convert string from text into requested type.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables
  2019-11-21  0:18                 ` Alexei Starovoitov
@ 2019-11-21  2:13                   ` Andrii Nakryiko
  2019-11-21  4:39                     ` Alexei Starovoitov
  0 siblings, 1 reply; 20+ messages in thread
From: Andrii Nakryiko @ 2019-11-21  2:13 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, Networking,
	Daniel Borkmann, Kernel Team

On Wed, Nov 20, 2019 at 4:18 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Wed, Nov 20, 2019 at 02:47:58PM -0800, Andrii Nakryiko wrote:
> >
> > Given all this, I think realistically we can pick few combinations:
> >
> > 1. Only support int/hex as uint64_t. Anything y/n/m or a string will
> > fail in runtime.
> > Pros:
> >   - we get at least something to use in practice (LINUX_KERNEL_VERSION
> > and int CONFIG_XXXs).
> > Cons:
> >   - undefined weak extern will have to be assumed either uint64_t 0
> > (and succeed) or undefined string (and fail). No clearly best behavior
> > here.
>
> what that uint64 going to be when config_ is defined?
> If your answer is 1 than it's not extensible.

Let's say we have

extern __attribute__((weak)) uint64_t CONFIG_VALUE;

Now let's see how different possible configs will behave:

.config                         uint64_t value
CONFIG_VALUE=123            =>  123
CONFIG_VALUE=y              =>  1
CONFIG_VALUE="abc"          =>  libbpf error (assuming no string support)
# CONFIG_VALUE not defined  =>  0

>
> >   - no ability to do "feature detection" logic in BPF program.
> > 2. Stick to uint64_t for bool/tristate/int/hex. Don't support strings
> > yet. Improve in backwards compatible way once we get BTF type info for
> > externs (adding strings at that time).
> > Pros:
> >   - well-defined and logical behavior
> >   - easily extensible once we get BTF for externs. No new flags, no
> > changes in behavior.
>
> extensible with new flag == not extensible.

please re-read two lines you just quoted: **"No new flags, no changes
in behavior"**.
So extensible by your definition.

> The choices for bpf program that we pick for extern must keep working
> when llvm starts generating BTF for externs.

Agree and that's what I'm claiming: they are going to work.

>
> > My preferences would be 2, if not, then 1.
>
> I'm proposing something else.
> I see libbpf as a tool to pass /boot/config.gz into bpf program. From program
> pov CONFIG_FOO is a label. It's not a variable of type u8 or type u64.

This is not true for usage of CONFIG_XXX values in kernel code. Why
would we have a completely different semantics for BPF programs? When
kernel code does:

int sec = jiffies * CONFIG_HZ;

CONFIG_HZ is not just a label pointing to raw string representation of
100 or 1000, it **IS** a number 1000. You cannot multiply strings in
C, yet you can multiply CONFIG_HZ.

> Example:
> CONFIG_A=100
> CONFIG_B=y
> CONFIG_C="abcd"
> will be a map of one element with value:
> char ar[]= { 0x64, 0, 0, 0, 'y', 'a', 'b', 'c', 'd', 0};
>
> CONFIG_A = &ar[0];
> CONFIG_B = &ar[4];
> CONFIG_C = &ar[5];
>
> libbpf parses config.gz and converts all int/hex into 4 byte or 8 byte
> integers depending on number of text digits it sees in config.gz
> with alignment. All other strings and characters are passed as-is.
> If program says
> extern u8 CONFIG_A, CONFIG_B, CONFIG_C;
> It will read 1st byte from these three labels.
> Later when llvm emits BTF those u8 swill stay as-is and will read the same
> things. With BTF if program says 'extern _Bool CONFIG_B;' then it will be an
> explicit direction for libbpf to convert that CONFIG_B value into _Bool at
> program load time and represent it as sizeof(_Bool) in map element. If program
> says 'extern uint32_t CONFIG_C;' the libbpf will keep first 4 bytes of that
> string in there. If program says 'extern uint64_t CONFIG_C;' the libbpf will
> keep first 4 character plus one byte for zero plus 3 bytes of padding in map
> element.

This set of rules is not consistent, if I interpret what you are
saying correctly.

.config            extern definition              w/o BTF             w/ BTF

CONFIG_VALUE=y     extern bool CONFIG_VALUE;      {'y'}
true             -- change in behavior
CONFIG_VALUE=100   extern uint32_t CONFIG_VALUE;  {0x64, 0, 0, 0}
{0x64, 0, 0, 0}  -- no change, because we parse ints
CONFIG_VALUE="abc" extern uint32_t CONFIG_VALUE;  {'a', 'b', 'c', 0}
??? still {'a', 'b', 'c', 0}? or error?

So for bools we'll start interpreting 'y' as 1 (true), but only when
we get BTF. Even though we can clearly parse y/n today as 1 (true) or
0 (false).

For integers, nothing changes and uint32_t means "I want parsed
integer and at most 4 lower bytes on little-endian or 4 upper bytes on
big-endian". Though we might want to reject it with BTF info, if it
doesn't fit into 32 bits (because now we can?).

For strings, uint32_t suddenly means "give me first 4 raw bytes". I'm
not even clear what we will do with BTF in that case.

This logic also breaks when user declares "extern uint64_t", but real
value fits in just 4 bytes. According to your rules, libbpf will
allocate just 4 bytes, and user will read extra 4 bytes of garbage. In
either low or high 4 bytes, depending on machine endianness.

This strikes me as unnecessarily convoluted behavior.

> 'extern char CONFIG_C[5];' is also ok. Either with or without BTF.
> In both cases it will be 5 bytes in map element.
> Without BTF doing 'extern char *CONFIG_C;' will read garbage 8 bytes from that
> label. With BTF 'extern char *CONFIG_C;' will be converted to pointer to inner
> bytes of map element.

Again, I don't even know how many bytes user expected, so when it's
defined - I don't know how many zeros I should substitute. Or if user
define uint64_T, but value fits just 4 bytes, how do I know user
didn't expect all 8. For strings, it's even more interesting. "a" vs
"abracadabra" will require different amount of storage, so if user
will try to read it as extern uint32_t, he will either read 'a' +
garbage, or 'abra'.

I really struggle to see how what I'm proposing is not more consistent
and logical behavior. I'm saying that before we have BTF, all uses
should be `extern uint64_t CONFIG_BLA`. There is nothing preventing
user to do `extern int CONFIG_BLA`, but we don't have abilities to
enforce that, so we are saying "don't do it because it's not
supported, it might or might not work, with random probability". Now,
assuming people follow documentation and examples, we know that it's
always 8 bytes that people are expecting. So all integers are parsed
and allocated as uint64_t, which makes it consistent for big- and
litte-endian machines. For bools, it's just natural to interpret them
as 0 and 1, I fail to see why this is controversial. y/n is just
Kconfig text representation **in text file** of true/false (bool
type). By extension, y/n/m can be mapped into 0/1/2. Sure 2 is sort of
arbitrary, but it follows y, it keeps bool a strict subset of
tristate. For strings I'd hold off until we have BTF, because then
declaring `extern const char CONFIG_BLA[]` is a clear indication you
expect string. While `extern uint32_t CONFIG_BLA;` is a clear
indication that you expect integer, so if the config value is not
parsable to integer -- that's an error. No need to arbitrarily map
first 4 character and pretend they are integer. If you want that
behavior, just declare `extern u8 CONFIG_BLA[]`.

Now, everything is currently defined as uint64_t. BTF info for extern
arrives. People go fancy and say: "screw it, I don't want to waste 8
bytes for bool". They go and change it to "extern bool CONFIG_BLA;".
Good, libbpf sees BTF, validates it's only y or n in config. If it's m
- fail, if it's 1, 0, 123, "a", "abc", "y" - fail. It's not a bool.
Just strictly better behavior.

Once we have BTF, bam, strings just works, because libbpf knows that
user expects 'const char[]'. All this type information is available to
tooling at that point as well, so BPF object auto-generated skeleton
can take all that into account, because it had guarantees that libbpf
will always allocate that amount of bytes for primitive types.

Before we have BTF type info, it's either 8 bytes for supported types,
or pain for users. I don't know why I'd pick the latter.

> In other words BTF types will be directives of how libbpf should convert
> strings in text file to be consumed by bpf program. Since we don't have types
> now int/hex are the only ones that libbpf will convert unconditionally. The
> logic of parsing config.gz is generic. It can parse any file with 'var=value'
> lines this way. All names will be preserved.
> In that sense LINUX_KERNEL_VERSION is a text file that has one line:
> LINUX_KERNEL_VESRION=1234
> If digits fit into u32 it should be u32.

See above. Libbpf doesn't know that user expects u32, so choosing 4 vs
8 bytes at runtime is, while super easy for libbpf, would be super
frustrating for users. We should just say no until we have types.

> This way users can feed any configuration.txt file into libbpf and into their
> programs. Without BTF the map element is a collection of raw bytes and the
> program needs to be smart about reading them with correct sizes. With BTF
> libbpf will be converting .txt into requested types and failing to load
> if libbpf cannot convert string from text into requested type.

This was never intended as a generic application-provided config
parser, I'm sorry if I made that impression. This is specifically
driven towards kernel config, taking into account kernel config
semantics with its explicitly supported types of values and how it's
exposed to kernel code. This is intended to provide an illusion to BPF
program of having as close environment, as we can get, to just
programming inside the kernel. Example 'var=value' will never happen
with Kconfig. It's invalid config and libbpf should reject that.

If application needs to pass custom config, application will have to
parse it on its own and pass it as global data: either mutable or
read-only. If we deem this necessary, we can provide custom libbpf
helpers to automate that, but I'd refrain from that. There are way too
many different ways to define application configs to reasonably
support anything generic. It's just not the problem that libbpf should
try to solve. Libbpf should make using of global data simpler and more
user-friendly.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables
  2019-11-21  2:13                   ` Andrii Nakryiko
@ 2019-11-21  4:39                     ` Alexei Starovoitov
  0 siblings, 0 replies; 20+ messages in thread
From: Alexei Starovoitov @ 2019-11-21  4:39 UTC (permalink / raw)
  To: Andrii Nakryiko, Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Networking, Daniel Borkmann, Kernel Team

On 11/20/19 6:13 PM, Andrii Nakryiko wrote:
> We should just say no until we have types.

Ok. I'm convinced. None of the proposals are clean enough
until we have BTF with externs.

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2019-11-21  4:40 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-11-17  7:08 [PATCH bpf-next 0/6] Add libbpf-provided extern variables support Andrii Nakryiko
2019-11-17  7:08 ` [PATCH bpf-next 1/6] selftests/bpf: ensure no DWARF relocations for BPF object files Andrii Nakryiko
2019-11-17  7:08 ` [PATCH bpf-next 2/6] libbpf: refactor relocation handling Andrii Nakryiko
2019-11-17  7:08 ` [PATCH bpf-next 3/6] libbpf: fix various errors and warning reported by checkpatch.pl Andrii Nakryiko
2019-11-17  7:08 ` [PATCH bpf-next 4/6] libbpf: support initialized global variables Andrii Nakryiko
2019-11-17  7:08 ` [PATCH bpf-next 5/6] libbpf: support libbpf-provided extern variables Andrii Nakryiko
2019-11-19  3:21   ` Alexei Starovoitov
2019-11-19  6:57     ` Andrii Nakryiko
2019-11-19 15:42       ` Andrii Nakryiko
2019-11-19 15:58         ` Alexei Starovoitov
2019-11-19 23:32           ` Daniel Borkmann
2019-11-20  3:17             ` Andrii Nakryiko
2019-11-20  3:44           ` Andrii Nakryiko
2019-11-20 21:39             ` Alexei Starovoitov
2019-11-20 22:47               ` Andrii Nakryiko
2019-11-21  0:18                 ` Alexei Starovoitov
2019-11-21  2:13                   ` Andrii Nakryiko
2019-11-21  4:39                     ` Alexei Starovoitov
2019-11-17  7:08 ` [PATCH bpf-next 6/6] selftests/bpf: add tests for libbpf-provided externs Andrii Nakryiko
2019-11-17  7:12 ` [PATCH bpf-next 0/6] Add libbpf-provided extern variables support Andrii Nakryiko

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).