bpf.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH bpf-next 0/8] libbpf ksym support and bpftool show PIDs
@ 2020-06-12 22:31 Andrii Nakryiko
  2020-06-12 22:31 ` [RFC PATCH bpf-next 1/8] libbpf: generalize libbpf externs support Andrii Nakryiko
                   ` (7 more replies)
  0 siblings, 8 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2020-06-12 22:31 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko, Hao Luo,
	Arnaldo Carvalho de Melo, Song Liu, Quentin Monnet

This patch set implements libbpf support for a second kind of special externs,
kernel symbols, in addition to existing Kconfig externs. 

Right now, only untyped (const void) externs are supported, which, in
C language, allow only to take their address. In the future, with kernel BTF
getting type info about its own global and per-cpu variables, libbpf will
extend this support with BTF type info, which will allow additionally to
directly access variable's contents and follow internal pointers, similarly to
how it's possible today in fentry/fexit programs.

As a first practical use of this functionality, bpftool gained ability to show
PIDs of processes that have open file descriptors for BPF map/program/link/BTF
object. It relies on iter/task_file BPF iterator program to extract this
information efficiently.

There was a bunch of bpftool refactoring (especially Makefile) necessary to
generalize bpftool's internal BPF program use. This includes generalization of
BPF skeletons support, addition of a vmlinux.h generation, extracting and
building minimal subset of bpftool for bootstrapping.

This RFC seeks feedback on ksym externs support in libbpf. The way it is
designed and supported should be naturally extended with backwards
compatibility in mind for when kernel will get support for kernel global
variables access through BPF.

Please also provide your feedback bpftool support for showing PIDs and and how
it could be done better. I'll update existing bpftool man pages and
bash-completion with examples of output and new flag in v1, after gather
initial feedback.

Cc: Hao Luo <haoluo@google.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Song Liu <songliubraving@fb.com>
Cc: Quentin Monnet <quentin@isovalent.com>

Andrii Nakryiko (8):
  libbpf: generalize libbpf externs support
  libbpf: add support for extracting kernel symbol addresses
  selftests/bpf: add __ksym extern selftest
  tools/bpftool: move map/prog parsing logic into common
  tools/bpftool: minimize bootstrap bpftool
  tools/bpftool: generalize BPF skeleton support and generate vmlinux.h
  libbpf: wrap source argument of BPF_CORE_READ macro in parentheses
  tools/bpftool: show PIDs with FDs open against BPF map/prog/link/btf

 tools/bpf/bpftool/.gitignore                  |   5 +-
 tools/bpf/bpftool/Makefile                    |  60 ++-
 tools/bpf/bpftool/btf.c                       |  39 ++
 tools/bpf/bpftool/common.c                    | 308 ++++++++++++
 tools/bpf/bpftool/link.c                      |  40 ++
 tools/bpf/bpftool/main.c                      |  19 +-
 tools/bpf/bpftool/main.h                      |  47 +-
 tools/bpf/bpftool/map.c                       | 195 ++------
 tools/bpf/bpftool/pids.c                      | 150 ++++++
 tools/bpf/bpftool/prog.c                      | 192 ++------
 tools/bpf/bpftool/skeleton/pid_iter.bpf.c     |  77 +++
 tools/bpf/bpftool/skeleton/profiler.bpf.c     |   3 +-
 tools/bpf/bpftool/skeleton/profiler.h         |  46 --
 tools/build/feature/Makefile                  |   4 +-
 tools/build/feature/test-clang-bpf-co-re.c    |   9 +
 .../build/feature/test-clang-bpf-global-var.c |   4 -
 tools/lib/bpf/bpf_core_read.h                 |   8 +-
 tools/lib/bpf/bpf_helpers.h                   |   1 +
 tools/lib/bpf/btf.h                           |   5 +
 tools/lib/bpf/libbpf.c                        | 464 +++++++++++++-----
 .../testing/selftests/bpf/prog_tests/ksyms.c  |  71 +++
 .../testing/selftests/bpf/progs/test_ksyms.c  |  32 ++
 22 files changed, 1240 insertions(+), 539 deletions(-)
 create mode 100644 tools/bpf/bpftool/pids.c
 create mode 100644 tools/bpf/bpftool/skeleton/pid_iter.bpf.c
 delete mode 100644 tools/bpf/bpftool/skeleton/profiler.h
 create mode 100644 tools/build/feature/test-clang-bpf-co-re.c
 delete mode 100644 tools/build/feature/test-clang-bpf-global-var.c
 create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms.c

-- 
2.24.1


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [RFC PATCH bpf-next 1/8] libbpf: generalize libbpf externs support
  2020-06-12 22:31 [RFC PATCH bpf-next 0/8] libbpf ksym support and bpftool show PIDs Andrii Nakryiko
@ 2020-06-12 22:31 ` Andrii Nakryiko
       [not found]   ` <CA+khW7hAYVdoQX5-j0z1iGEVZeww4BBu4NXzy5eS5OwDRYqe2w@mail.gmail.com>
  2020-06-12 22:31 ` [RFC PATCH bpf-next 2/8] libbpf: add support for extracting kernel symbol addresses Andrii Nakryiko
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2020-06-12 22:31 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko, Hao Luo,
	Arnaldo Carvalho de Melo, Song Liu, Quentin Monnet

Switch existing Kconfig externs to be just one of few possible kinds of more
generic externs. This refactoring is in preparation for ksymbol extern
support, added in the follow up patch. There are no functional changes
intended.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/lib/bpf/libbpf.c | 332 ++++++++++++++++++++++++-----------------
 1 file changed, 199 insertions(+), 133 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 477c679ed945..b1a36c965fc0 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -329,24 +329,35 @@ struct bpf_map {
 
 enum extern_type {
 	EXT_UNKNOWN,
-	EXT_CHAR,
-	EXT_BOOL,
-	EXT_INT,
-	EXT_TRISTATE,
-	EXT_CHAR_ARR,
+	EXT_KCFG,
+};
+
+enum kcfg_type {
+	KCFG_UNKNOWN,
+	KCFG_CHAR,
+	KCFG_BOOL,
+	KCFG_INT,
+	KCFG_TRISTATE,
+	KCFG_CHAR_ARR,
 };
 
 struct extern_desc {
-	const char *name;
+	enum extern_type type;
 	int sym_idx;
 	int btf_id;
-	enum extern_type type;
-	int sz;
-	int align;
-	int data_off;
-	bool is_signed;
-	bool is_weak;
+	int sec_btf_id;
+	const char *name;
 	bool is_set;
+	bool is_weak;
+	union {
+		struct {
+			enum kcfg_type type;
+			int sz;
+			int align;
+			int data_off;
+			bool is_signed;
+		} kcfg;
+	};
 };
 
 static LIST_HEAD(bpf_objects_list);
@@ -1423,11 +1434,11 @@ static struct extern_desc *find_extern_by_name(const struct bpf_object *obj,
 	return NULL;
 }
 
-static int set_ext_value_tri(struct extern_desc *ext, void *ext_val,
-			     char value)
+static int set_kcfg_value_tri(struct extern_desc *ext, void *ext_val,
+			      char value)
 {
-	switch (ext->type) {
-	case EXT_BOOL:
+	switch (ext->kcfg.type) {
+	case KCFG_BOOL:
 		if (value == 'm') {
 			pr_warn("extern %s=%c should be tristate or char\n",
 				ext->name, value);
@@ -1435,7 +1446,7 @@ static int set_ext_value_tri(struct extern_desc *ext, void *ext_val,
 		}
 		*(bool *)ext_val = value == 'y' ? true : false;
 		break;
-	case EXT_TRISTATE:
+	case KCFG_TRISTATE:
 		if (value == 'y')
 			*(enum libbpf_tristate *)ext_val = TRI_YES;
 		else if (value == 'm')
@@ -1443,12 +1454,12 @@ static int set_ext_value_tri(struct extern_desc *ext, void *ext_val,
 		else /* value == 'n' */
 			*(enum libbpf_tristate *)ext_val = TRI_NO;
 		break;
-	case EXT_CHAR:
+	case KCFG_CHAR:
 		*(char *)ext_val = value;
 		break;
-	case EXT_UNKNOWN:
-	case EXT_INT:
-	case EXT_CHAR_ARR:
+	case KCFG_UNKNOWN:
+	case KCFG_INT:
+	case KCFG_CHAR_ARR:
 	default:
 		pr_warn("extern %s=%c should be bool, tristate, or char\n",
 			ext->name, value);
@@ -1458,12 +1469,12 @@ static int set_ext_value_tri(struct extern_desc *ext, void *ext_val,
 	return 0;
 }
 
-static int set_ext_value_str(struct extern_desc *ext, char *ext_val,
-			     const char *value)
+static int set_kcfg_value_str(struct extern_desc *ext, char *ext_val,
+			      const char *value)
 {
 	size_t len;
 
-	if (ext->type != EXT_CHAR_ARR) {
+	if (ext->kcfg.type != KCFG_CHAR_ARR) {
 		pr_warn("extern %s=%s should char array\n", ext->name, value);
 		return -EINVAL;
 	}
@@ -1477,10 +1488,10 @@ static int set_ext_value_str(struct extern_desc *ext, char *ext_val,
 
 	/* strip quotes */
 	len -= 2;
-	if (len >= ext->sz) {
+	if (len >= ext->kcfg.sz) {
 		pr_warn("extern '%s': long string config %s of (%zu bytes) truncated to %d bytes\n",
-			ext->name, value, len, ext->sz - 1);
-		len = ext->sz - 1;
+			ext->name, value, len, ext->kcfg.sz - 1);
+		len = ext->kcfg.sz - 1;
 	}
 	memcpy(ext_val, value + 1, len);
 	ext_val[len] = '\0';
@@ -1507,11 +1518,11 @@ static int parse_u64(const char *value, __u64 *res)
 	return 0;
 }
 
-static bool is_ext_value_in_range(const struct extern_desc *ext, __u64 v)
+static bool is_kcfg_value_in_range(const struct extern_desc *ext, __u64 v)
 {
-	int bit_sz = ext->sz * 8;
+	int bit_sz = ext->kcfg.sz * 8;
 
-	if (ext->sz == 8)
+	if (ext->kcfg.sz == 8)
 		return true;
 
 	/* Validate that value stored in u64 fits in integer of `ext->sz`
@@ -1526,26 +1537,26 @@ static bool is_ext_value_in_range(const struct extern_desc *ext, __u64 v)
 	 *  For unsigned target integer, check that all the (64 - Y) bits are
 	 *  zero.
 	 */
-	if (ext->is_signed)
+	if (ext->kcfg.is_signed)
 		return v + (1ULL << (bit_sz - 1)) < (1ULL << bit_sz);
 	else
 		return (v >> bit_sz) == 0;
 }
 
-static int set_ext_value_num(struct extern_desc *ext, void *ext_val,
-			     __u64 value)
+static int set_kcfg_value_num(struct extern_desc *ext, void *ext_val,
+			      __u64 value)
 {
-	if (ext->type != EXT_INT && ext->type != EXT_CHAR) {
+	if (ext->kcfg.type != KCFG_INT && ext->kcfg.type != KCFG_CHAR) {
 		pr_warn("extern %s=%llu should be integer\n",
 			ext->name, (unsigned long long)value);
 		return -EINVAL;
 	}
-	if (!is_ext_value_in_range(ext, value)) {
+	if (!is_kcfg_value_in_range(ext, value)) {
 		pr_warn("extern %s=%llu value doesn't fit in %d bytes\n",
-			ext->name, (unsigned long long)value, ext->sz);
+			ext->name, (unsigned long long)value, ext->kcfg.sz);
 		return -ERANGE;
 	}
-	switch (ext->sz) {
+	switch (ext->kcfg.sz) {
 		case 1: *(__u8 *)ext_val = value; break;
 		case 2: *(__u16 *)ext_val = value; break;
 		case 4: *(__u32 *)ext_val = value; break;
@@ -1591,15 +1602,15 @@ static int bpf_object__process_kconfig_line(struct bpf_object *obj,
 	if (!ext || ext->is_set)
 		return 0;
 
-	ext_val = data + ext->data_off;
+	ext_val = data + ext->kcfg.data_off;
 	value = sep + 1;
 
 	switch (*value) {
 	case 'y': case 'n': case 'm':
-		err = set_ext_value_tri(ext, ext_val, *value);
+		err = set_kcfg_value_tri(ext, ext_val, *value);
 		break;
 	case '"':
-		err = set_ext_value_str(ext, ext_val, value);
+		err = set_kcfg_value_str(ext, ext_val, value);
 		break;
 	default:
 		/* assume integer */
@@ -1609,12 +1620,12 @@ static int bpf_object__process_kconfig_line(struct bpf_object *obj,
 				ext->name, value);
 			return err;
 		}
-		err = set_ext_value_num(ext, ext_val, num);
+		err = set_kcfg_value_num(ext, ext_val, num);
 		break;
 	}
 	if (err)
 		return err;
-	pr_debug("extern %s=%s\n", ext->name, value);
+	pr_debug("kconfig extern %s=%s\n", ext->name, value);
 	return 0;
 }
 
@@ -1685,16 +1696,20 @@ static int bpf_object__read_kconfig_mem(struct bpf_object *obj,
 
 static int bpf_object__init_kconfig_map(struct bpf_object *obj)
 {
-	struct extern_desc *last_ext;
+	struct extern_desc *last_ext = NULL, *ext;
 	size_t map_sz;
-	int err;
+	int i, err;
 
-	if (obj->nr_extern == 0)
-		return 0;
+	for (i = 0; i < obj->nr_extern; i++) {
+		ext = &obj->externs[i];
+		if (ext->type == EXT_KCFG)
+			last_ext = ext;
+	}
 
-	last_ext = &obj->externs[obj->nr_extern - 1];
-	map_sz = last_ext->data_off + last_ext->sz;
+	if (!last_ext)
+		return 0;
 
+	map_sz = last_ext->kcfg.data_off + last_ext->kcfg.sz;
 	err = bpf_object__init_internal_map(obj, LIBBPF_MAP_KCONFIG,
 					    obj->efile.symbols_shndx,
 					    NULL, map_sz);
@@ -2709,8 +2724,33 @@ static int find_extern_btf_id(const struct btf *btf, const char *ext_name)
 	return -ENOENT;
 }
 
-static enum extern_type find_extern_type(const struct btf *btf, int id,
-					 bool *is_signed)
+static int find_extern_sec_btf_id(struct btf *btf, int ext_btf_id) {
+	const struct btf_var_secinfo *vs;
+	const struct btf_type *t;
+	int i, j, n;
+
+	if (!btf)
+		return -ESRCH;
+
+	n = btf__get_nr_types(btf);
+	for (i = 1; i <= n; i++) {
+		t = btf__type_by_id(btf, i);
+
+		if (!btf_is_datasec(t))
+			continue;
+
+		vs = btf_var_secinfos(t);
+		for (j = 0; j < btf_vlen(t); j++, vs++) {
+			if (vs->type == ext_btf_id)
+				return i;
+		}
+	}
+
+	return -ENOENT;
+}
+
+static enum kcfg_type find_kcfg_type(const struct btf *btf, int id,
+				     bool *is_signed)
 {
 	const struct btf_type *t;
 	const char *name;
@@ -2725,29 +2765,29 @@ static enum extern_type find_extern_type(const struct btf *btf, int id,
 		int enc = btf_int_encoding(t);
 
 		if (enc & BTF_INT_BOOL)
-			return t->size == 1 ? EXT_BOOL : EXT_UNKNOWN;
+			return t->size == 1 ? KCFG_BOOL : KCFG_UNKNOWN;
 		if (is_signed)
 			*is_signed = enc & BTF_INT_SIGNED;
 		if (t->size == 1)
-			return EXT_CHAR;
+			return KCFG_CHAR;
 		if (t->size < 1 || t->size > 8 || (t->size & (t->size - 1)))
-			return EXT_UNKNOWN;
-		return EXT_INT;
+			return KCFG_UNKNOWN;
+		return KCFG_INT;
 	}
 	case BTF_KIND_ENUM:
 		if (t->size != 4)
-			return EXT_UNKNOWN;
+			return KCFG_UNKNOWN;
 		if (strcmp(name, "libbpf_tristate"))
-			return EXT_UNKNOWN;
-		return EXT_TRISTATE;
+			return KCFG_UNKNOWN;
+		return KCFG_TRISTATE;
 	case BTF_KIND_ARRAY:
 		if (btf_array(t)->nelems == 0)
-			return EXT_UNKNOWN;
-		if (find_extern_type(btf, btf_array(t)->type, NULL) != EXT_CHAR)
-			return EXT_UNKNOWN;
-		return EXT_CHAR_ARR;
+			return KCFG_UNKNOWN;
+		if (find_kcfg_type(btf, btf_array(t)->type, NULL) != KCFG_CHAR)
+			return KCFG_UNKNOWN;
+		return KCFG_CHAR_ARR;
 	default:
-		return EXT_UNKNOWN;
+		return KCFG_UNKNOWN;
 	}
 }
 
@@ -2756,23 +2796,29 @@ static int cmp_externs(const void *_a, const void *_b)
 	const struct extern_desc *a = _a;
 	const struct extern_desc *b = _b;
 
-	/* descending order by alignment requirements */
-	if (a->align != b->align)
-		return a->align > b->align ? -1 : 1;
-	/* ascending order by size, within same alignment class */
-	if (a->sz != b->sz)
-		return a->sz < b->sz ? -1 : 1;
-	/* resolve ties by name */
+	if (a->type != b->type)
+		return a->type < b->type ? -1 : 1;
+
+	if (a->type == EXT_KCFG) {
+		/* descending order by alignment requirements */
+		if (a->kcfg.align != b->kcfg.align)
+			return a->kcfg.align > b->kcfg.align ? -1 : 1;
+		/* ascending order by size, within same alignment class */
+		if (a->kcfg.sz != b->kcfg.sz)
+			return a->kcfg.sz < b->kcfg.sz ? -1 : 1;
+		/* resolve ties by name */
+	}
+
 	return strcmp(a->name, b->name);
 }
 
 static int bpf_object__collect_externs(struct bpf_object *obj)
 {
+	struct btf_type *sec, *kcfg_sec = NULL;
 	const struct btf_type *t;
 	struct extern_desc *ext;
-	int i, n, off, btf_id;
-	struct btf_type *sec;
-	const char *ext_name;
+	int i, n, off;
+	const char *ext_name, *sec_name;
 	Elf_Scn *scn;
 	GElf_Shdr sh;
 
@@ -2818,22 +2864,39 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
 		ext->name = btf__name_by_offset(obj->btf, t->name_off);
 		ext->sym_idx = i;
 		ext->is_weak = GELF_ST_BIND(sym.st_info) == STB_WEAK;
-		ext->sz = btf__resolve_size(obj->btf, t->type);
-		if (ext->sz <= 0) {
-			pr_warn("failed to resolve size of extern '%s': %d\n",
-				ext_name, ext->sz);
-			return ext->sz;
-		}
-		ext->align = btf__align_of(obj->btf, t->type);
-		if (ext->align <= 0) {
-			pr_warn("failed to determine alignment of extern '%s': %d\n",
-				ext_name, ext->align);
-			return -EINVAL;
-		}
-		ext->type = find_extern_type(obj->btf, t->type,
-					     &ext->is_signed);
-		if (ext->type == EXT_UNKNOWN) {
-			pr_warn("extern '%s' type is unsupported\n", ext_name);
+
+		ext->sec_btf_id = find_extern_sec_btf_id(obj->btf, ext->btf_id);
+		if (ext->btf_id <= 0) {
+			pr_warn("failed to find BTF for extern '%s' [%d] section: %d\n",
+				ext_name, ext->btf_id, ext->sec_btf_id);
+			return ext->sec_btf_id;
+		}
+		sec = (void *)btf__type_by_id(obj->btf, ext->sec_btf_id);
+		sec_name = btf__name_by_offset(obj->btf, sec->name_off);
+
+		if (strcmp(sec_name, KCONFIG_SEC) == 0) {
+			kcfg_sec = sec;
+			ext->type = EXT_KCFG;
+			ext->kcfg.sz = btf__resolve_size(obj->btf, t->type);
+			if (ext->kcfg.sz <= 0) {
+				pr_warn("failed to resolve size of kconfig extern '%s': %d\n",
+					ext_name, ext->kcfg.sz);
+				return ext->kcfg.sz;
+			}
+			ext->kcfg.align = btf__align_of(obj->btf, t->type);
+			if (ext->kcfg.align <= 0) {
+				pr_warn("failed to determine alignment of kconfig extern '%s': %d\n",
+					ext_name, ext->kcfg.align);
+				return -EINVAL;
+			}
+			ext->kcfg.type = find_kcfg_type(obj->btf, t->type,
+						        &ext->kcfg.is_signed);
+			if (ext->kcfg.type == KCFG_UNKNOWN) {
+				pr_warn("kconfig extern '%s' type is unsupported\n", ext_name);
+				return -ENOTSUP;
+			}
+		} else {
+			pr_warn("unrecognized extern section '%s'\n", sec_name);
 			return -ENOTSUP;
 		}
 	}
@@ -2842,42 +2905,40 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
 	if (!obj->nr_extern)
 		return 0;
 
-	/* sort externs by (alignment, size, name) and calculate their offsets
-	 * within a map */
+	/* sort externs by type, for kcfg ones also by (align, size, name) */
 	qsort(obj->externs, obj->nr_extern, sizeof(*ext), cmp_externs);
-	off = 0;
-	for (i = 0; i < obj->nr_extern; i++) {
-		ext = &obj->externs[i];
-		ext->data_off = roundup(off, ext->align);
-		off = ext->data_off + ext->sz;
-		pr_debug("extern #%d: symbol %d, off %u, name %s\n",
-			 i, ext->sym_idx, ext->data_off, ext->name);
-	}
 
-	btf_id = btf__find_by_name(obj->btf, KCONFIG_SEC);
-	if (btf_id <= 0) {
-		pr_warn("no BTF info found for '%s' datasec\n", KCONFIG_SEC);
-		return -ESRCH;
-	}
+	if (kcfg_sec) {
+		sec = kcfg_sec;
+		/* for kcfg externs calculate their offsets within a .kconfig map */
+		off = 0;
+		for (i = 0; i < obj->nr_extern; i++) {
+			ext = &obj->externs[i];
+			if (ext->type != EXT_KCFG)
+				continue;
 
-	sec = (struct btf_type *)btf__type_by_id(obj->btf, btf_id);
-	sec->size = off;
-	n = btf_vlen(sec);
-	for (i = 0; i < n; i++) {
-		struct btf_var_secinfo *vs = btf_var_secinfos(sec) + i;
-
-		t = btf__type_by_id(obj->btf, vs->type);
-		ext_name = btf__name_by_offset(obj->btf, t->name_off);
-		ext = find_extern_by_name(obj, ext_name);
-		if (!ext) {
-			pr_warn("failed to find extern definition for BTF var '%s'\n",
-				ext_name);
-			return -ESRCH;
+			ext->kcfg.data_off = roundup(off, ext->kcfg.align);
+			off = ext->kcfg.data_off + ext->kcfg.sz;
+			pr_debug("extern #%d (kcfg): symbol %d, off %u, name %s\n",
+				 i, ext->sym_idx, ext->kcfg.data_off, ext->name);
+		}
+		sec->size = off;
+		n = btf_vlen(sec);
+		for (i = 0; i < n; i++) {
+			struct btf_var_secinfo *vs = btf_var_secinfos(sec) + i;
+
+			t = btf__type_by_id(obj->btf, vs->type);
+			ext_name = btf__name_by_offset(obj->btf, t->name_off);
+			ext = find_extern_by_name(obj, ext_name);
+			if (!ext) {
+				pr_warn("failed to find extern definition for BTF var '%s'\n",
+					ext_name);
+				return -ESRCH;
+			}
+			btf_var(t)->linkage = BTF_VAR_GLOBAL_ALLOCATED;
+			vs->offset = ext->kcfg.data_off;
 		}
-		vs->offset = ext->data_off;
-		btf_var(t)->linkage = BTF_VAR_GLOBAL_ALLOCATED;
 	}
-
 	return 0;
 }
 
@@ -3007,11 +3068,11 @@ static int bpf_program__record_reloc(struct bpf_program *prog,
 				sym_idx);
 			return -LIBBPF_ERRNO__RELOC;
 		}
-		pr_debug("found extern #%d '%s' (sym %d, off %u) for insn %u\n",
-			 i, ext->name, ext->sym_idx, ext->data_off, insn_idx);
+		pr_debug("found extern #%d '%s' (sym %d) for insn %u\n",
+			 i, ext->name, ext->sym_idx, insn_idx);
 		reloc_desc->type = RELO_EXTERN;
 		reloc_desc->insn_idx = insn_idx;
-		reloc_desc->sym_off = ext->data_off;
+		reloc_desc->sym_off = i; /* sym_off stores extern index */
 		return 0;
 	}
 
@@ -4928,6 +4989,7 @@ bpf_program__relocate(struct bpf_program *prog, struct bpf_object *obj)
 	for (i = 0; i < prog->nr_reloc; i++) {
 		struct reloc_desc *relo = &prog->reloc_desc[i];
 		struct bpf_insn *insn = &prog->insns[relo->insn_idx];
+		struct extern_desc *ext;
 
 		if (relo->insn_idx + 1 >= (int)prog->insns_cnt) {
 			pr_warn("relocation out of range: '%s'\n",
@@ -4946,9 +5008,10 @@ bpf_program__relocate(struct bpf_program *prog, struct bpf_object *obj)
 			insn[0].imm = obj->maps[relo->map_idx].fd;
 			break;
 		case RELO_EXTERN:
+			ext = &obj->externs[relo->sym_off];
 			insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
 			insn[0].imm = obj->maps[obj->kconfig_map_idx].fd;
-			insn[1].imm = relo->sym_off;
+			insn[1].imm = ext->kcfg.data_off;
 			break;
 		case RELO_CALL:
 			err = bpf_program__reloc_text(prog, obj, relo);
@@ -5572,30 +5635,33 @@ static int bpf_object__resolve_externs(struct bpf_object *obj,
 {
 	bool need_config = false;
 	struct extern_desc *ext;
+	void *kcfg_data;
 	int err, i;
-	void *data;
 
 	if (obj->nr_extern == 0)
 		return 0;
 
-	data = obj->maps[obj->kconfig_map_idx].mmaped;
+	if (obj->kconfig_map_idx >= 0)
+		kcfg_data = obj->maps[obj->kconfig_map_idx].mmaped;
 
 	for (i = 0; i < obj->nr_extern; i++) {
 		ext = &obj->externs[i];
 
-		if (strcmp(ext->name, "LINUX_KERNEL_VERSION") == 0) {
-			void *ext_val = data + ext->data_off;
+		if (ext->type == EXT_KCFG &&
+		    strcmp(ext->name, "LINUX_KERNEL_VERSION") == 0) {
+			void *ext_val = kcfg_data + ext->kcfg.data_off;
 			__u32 kver = get_kernel_version();
 
 			if (!kver) {
 				pr_warn("failed to get kernel version\n");
 				return -EINVAL;
 			}
-			err = set_ext_value_num(ext, ext_val, kver);
+			err = set_kcfg_value_num(ext, ext_val, kver);
 			if (err)
 				return err;
-			pr_debug("extern %s=0x%x\n", ext->name, kver);
-		} else if (strncmp(ext->name, "CONFIG_", 7) == 0) {
+			pr_debug("kconfig extern %s=0x%x\n", ext->name, kver);
+		} else if (ext->type == EXT_KCFG &&
+			   strncmp(ext->name, "CONFIG_", 7) == 0) {
 			need_config = true;
 		} else {
 			pr_warn("unrecognized extern '%s'\n", ext->name);
@@ -5603,20 +5669,20 @@ static int bpf_object__resolve_externs(struct bpf_object *obj,
 		}
 	}
 	if (need_config && extra_kconfig) {
-		err = bpf_object__read_kconfig_mem(obj, extra_kconfig, data);
+		err = bpf_object__read_kconfig_mem(obj, extra_kconfig, kcfg_data);
 		if (err)
 			return -EINVAL;
 		need_config = false;
 		for (i = 0; i < obj->nr_extern; i++) {
 			ext = &obj->externs[i];
-			if (!ext->is_set) {
+			if (ext->type == EXT_KCFG && !ext->is_set) {
 				need_config = true;
 				break;
 			}
 		}
 	}
 	if (need_config) {
-		err = bpf_object__read_kconfig_file(obj, data);
+		err = bpf_object__read_kconfig_file(obj, kcfg_data);
 		if (err)
 			return -EINVAL;
 	}
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH bpf-next 2/8] libbpf: add support for extracting kernel symbol addresses
  2020-06-12 22:31 [RFC PATCH bpf-next 0/8] libbpf ksym support and bpftool show PIDs Andrii Nakryiko
  2020-06-12 22:31 ` [RFC PATCH bpf-next 1/8] libbpf: generalize libbpf externs support Andrii Nakryiko
@ 2020-06-12 22:31 ` Andrii Nakryiko
       [not found]   ` <CA+khW7hFZzp_K_xydSFw0O3LYB22_fC=Z4wG7i9Si+phGHn4cQ@mail.gmail.com>
  2020-06-12 22:31 ` [RFC PATCH bpf-next 3/8] selftests/bpf: add __ksym extern selftest Andrii Nakryiko
                   ` (5 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2020-06-12 22:31 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko, Hao Luo,
	Arnaldo Carvalho de Melo, Song Liu, Quentin Monnet

Add support for another (in addition to existing Kconfig) special kind of
externs in BPF code, kernel symbol externs. Such externs allow BPF code to
"know" kernel symbol address and either use it for comparisons with kernel
data structures (e.g., struct file's f_op pointer, to distinguish different
kinds of file), or, with the help of bpf_probe_user_kernel(), to follow
pointers and read data from global variables. Kernel symbol addresses are
found through /proc/kallsyms, which should be present in the system.

Currently, such kernel symbol variables are typeless: they have to be defined
as `extern const void <symbol>` and the only operation you can do (in C code)
with them is to take its address. Such extern should reside in a special
section '.ksyms'. bpf_helpers.h header provides __ksym macro for this. Strong
vs weak semantics stays the same as with Kconfig externs. If symbol is not
found in /proc/kallsyms, this will be a failure for strong (non-weak) extern,
but will be defaulted to 0 for weak externs.

If the same symbol is defined multiple times in /proc/kallsyms, then it will
be error if any of the associated addresses differs. In that case, address is
ambiguous, so libbpf falls on the side of caution, rather than confusing user
with randomly chosen address.

In the future, once kernel is extended with variables BTF information, such
ksym externs will be supported in a typed version, which will allow BPF
program to read variable's contents directly, similarly to how it's done for
fentry/fexit input arguments.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/lib/bpf/bpf_helpers.h |   1 +
 tools/lib/bpf/btf.h         |   5 ++
 tools/lib/bpf/libbpf.c      | 138 ++++++++++++++++++++++++++++++++++--
 3 files changed, 139 insertions(+), 5 deletions(-)

diff --git a/tools/lib/bpf/bpf_helpers.h b/tools/lib/bpf/bpf_helpers.h
index f67dce2af802..a510d8ed716f 100644
--- a/tools/lib/bpf/bpf_helpers.h
+++ b/tools/lib/bpf/bpf_helpers.h
@@ -75,5 +75,6 @@ enum libbpf_tristate {
 };
 
 #define __kconfig __attribute__((section(".kconfig")))
+#define __ksym __attribute__((section(".ksyms")))
 
 #endif
diff --git a/tools/lib/bpf/btf.h b/tools/lib/bpf/btf.h
index 70c1b7ec2bd0..06cd1731c154 100644
--- a/tools/lib/bpf/btf.h
+++ b/tools/lib/bpf/btf.h
@@ -168,6 +168,11 @@ static inline bool btf_kflag(const struct btf_type *t)
 	return BTF_INFO_KFLAG(t->info);
 }
 
+static inline bool btf_is_void(const struct btf_type *t)
+{
+	return btf_kind(t) == BTF_KIND_UNKN;
+}
+
 static inline bool btf_is_int(const struct btf_type *t)
 {
 	return btf_kind(t) == BTF_KIND_INT;
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index b1a36c965fc0..451fb0b5615b 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -285,6 +285,7 @@ struct bpf_struct_ops {
 #define BSS_SEC ".bss"
 #define RODATA_SEC ".rodata"
 #define KCONFIG_SEC ".kconfig"
+#define KSYMS_SEC ".ksyms"
 #define STRUCT_OPS_SEC ".struct_ops"
 
 enum libbpf_map_type {
@@ -329,6 +330,7 @@ struct bpf_map {
 
 enum extern_type {
 	EXT_UNKNOWN,
+	EXT_KSYM,
 	EXT_KCFG,
 };
 
@@ -357,6 +359,9 @@ struct extern_desc {
 			int data_off;
 			bool is_signed;
 		} kcfg;
+		struct {
+			unsigned long addr;
+		} ksym;
 	};
 };
 
@@ -2812,9 +2817,25 @@ static int cmp_externs(const void *_a, const void *_b)
 	return strcmp(a->name, b->name);
 }
 
+static int find_int_btf_id(const struct btf *btf)
+{
+	const struct btf_type *t;
+	int i, n;
+
+	n = btf__get_nr_types(btf);
+	for (i = 1; i <= n; i++) {
+		t = btf__type_by_id(btf, i);
+
+		if (btf_is_int(t) && btf_int_bits(t) == 32)
+			return i;
+	}
+
+	return 0;
+}
+
 static int bpf_object__collect_externs(struct bpf_object *obj)
 {
-	struct btf_type *sec, *kcfg_sec = NULL;
+	struct btf_type *sec, *kcfg_sec = NULL, *ksym_sec = NULL;
 	const struct btf_type *t;
 	struct extern_desc *ext;
 	int i, n, off;
@@ -2895,6 +2916,17 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
 				pr_warn("kconfig extern '%s' type is unsupported\n", ext_name);
 				return -ENOTSUP;
 			}
+		} else if (strcmp(sec_name, KSYMS_SEC) == 0) {
+			const struct btf_type *vt;
+
+			ksym_sec = sec;
+			ext->type = EXT_KSYM;
+
+			vt = skip_mods_and_typedefs(obj->btf, t->type, NULL);
+			if (!btf_is_void(vt)) {
+				pr_warn("ksym extern '%s' is not typeless (void)\n", ext_name);
+				return -ENOTSUP;
+			}
 		} else {
 			pr_warn("unrecognized extern section '%s'\n", sec_name);
 			return -ENOTSUP;
@@ -2908,6 +2940,43 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
 	/* sort externs by type, for kcfg ones also by (align, size, name) */
 	qsort(obj->externs, obj->nr_extern, sizeof(*ext), cmp_externs);
 
+	/* for .ksyms section, we need to turn all externs into allocated
+	 * variables in BTF to pass kernel verification; we do this by
+	 * pretending that each extern is a 8-byte variable
+	 */
+	if (ksym_sec) {
+		/* find existing 4-byte integer type in BTF to use for fake
+		 * extern variables in DATASEC
+		 */
+		int int_btf_id = find_int_btf_id(obj->btf);
+
+		for (i = 0; i < n; i++) {
+			pr_debug("extern #%d (ksym): symbol %d, name %s\n",
+				 i, ext->sym_idx, ext->name);
+		}
+
+		sec = ksym_sec;
+		n = btf_vlen(sec);
+		for (i = 0, off = 0; i < n; i++, off += sizeof(int)) {
+			struct btf_var_secinfo *vs = btf_var_secinfos(sec) + i;
+			struct btf_type *vt;
+
+			vt = (void *)btf__type_by_id(obj->btf, vs->type);
+			ext_name = btf__name_by_offset(obj->btf, vt->name_off);
+			ext = find_extern_by_name(obj, ext_name);
+			if (!ext) {
+				pr_warn("failed to find extern definition for BTF var '%s'\n",
+					ext_name);
+				return -ESRCH;
+			}
+			btf_var(vt)->linkage = BTF_VAR_GLOBAL_ALLOCATED;
+			vt->type = int_btf_id;
+			vs->offset = off;
+			vs->size = sizeof(int);
+		}
+		sec->size = off;
+	}
+
 	if (kcfg_sec) {
 		sec = kcfg_sec;
 		/* for kcfg externs calculate their offsets within a .kconfig map */
@@ -5009,9 +5078,14 @@ bpf_program__relocate(struct bpf_program *prog, struct bpf_object *obj)
 			break;
 		case RELO_EXTERN:
 			ext = &obj->externs[relo->sym_off];
-			insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
-			insn[0].imm = obj->maps[obj->kconfig_map_idx].fd;
-			insn[1].imm = ext->kcfg.data_off;
+			if (ext->type == EXT_KCFG) {
+				insn[0].src_reg = BPF_PSEUDO_MAP_VALUE;
+				insn[0].imm = obj->maps[obj->kconfig_map_idx].fd;
+				insn[1].imm = ext->kcfg.data_off;
+			} else /* EXT_KSYM */ {
+				insn[0].imm = (__u32)ext->ksym.addr;
+				insn[1].imm = ext->ksym.addr >> 32;
+			}
 			break;
 		case RELO_CALL:
 			err = bpf_program__reloc_text(prog, obj, relo);
@@ -5630,10 +5704,57 @@ static int bpf_object__sanitize_maps(struct bpf_object *obj)
 	return 0;
 }
 
+static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
+{
+	char sym_type, sym_name[256];
+	unsigned long sym_addr;
+	struct extern_desc *ext;
+	int ret, err = 0;
+	FILE *f;
+
+	f = fopen("/proc/kallsyms", "r");
+	if (!f) {
+		err = -errno;
+		pr_warn("failed to open /proc/kallsyms: %d\n", err);
+		return err;
+	}
+
+	while (true) {
+		ret = fscanf(f, "%lx %c %s%*[^\n]\n",
+			     &sym_addr, &sym_type, sym_name);
+		if (ret == EOF && feof(f))
+			break;
+		if (ret != 3) {
+			err = -EINVAL;
+			goto out;
+		}
+
+		ext = find_extern_by_name(obj, sym_name);
+		if (!ext || ext->type != EXT_KSYM)
+			continue;
+
+		if (ext->is_set && ext->ksym.addr != sym_addr) {
+			pr_warn("ksym extern '%s' resolution is ambiguous: 0x%lx or 0x%lx\n",
+				sym_name, ext->ksym.addr, sym_addr);
+			err = -EINVAL;
+			goto out;
+		}
+		if (!ext->is_set) {
+			ext->is_set = true;
+			ext->ksym.addr = sym_addr;
+			pr_debug("ksym extern %s=0x%lx\n", sym_name, sym_addr);
+		}
+	}
+
+out:
+	fclose(f);
+	return err;
+}
+
 static int bpf_object__resolve_externs(struct bpf_object *obj,
 				       const char *extra_kconfig)
 {
-	bool need_config = false;
+	bool need_config = false, need_kallsyms = false;
 	struct extern_desc *ext;
 	void *kcfg_data;
 	int err, i;
@@ -5663,6 +5784,8 @@ static int bpf_object__resolve_externs(struct bpf_object *obj,
 		} else if (ext->type == EXT_KCFG &&
 			   strncmp(ext->name, "CONFIG_", 7) == 0) {
 			need_config = true;
+		} else if (ext->type == EXT_KSYM) {
+			need_kallsyms = true;
 		} else {
 			pr_warn("unrecognized extern '%s'\n", ext->name);
 			return -EINVAL;
@@ -5686,6 +5809,11 @@ static int bpf_object__resolve_externs(struct bpf_object *obj,
 		if (err)
 			return -EINVAL;
 	}
+	if (need_kallsyms) {
+		err = bpf_object__read_kallsyms_file(obj);
+		if (err)
+			return -EINVAL;
+	}
 	for (i = 0; i < obj->nr_extern; i++) {
 		ext = &obj->externs[i];
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH bpf-next 3/8] selftests/bpf: add __ksym extern selftest
  2020-06-12 22:31 [RFC PATCH bpf-next 0/8] libbpf ksym support and bpftool show PIDs Andrii Nakryiko
  2020-06-12 22:31 ` [RFC PATCH bpf-next 1/8] libbpf: generalize libbpf externs support Andrii Nakryiko
  2020-06-12 22:31 ` [RFC PATCH bpf-next 2/8] libbpf: add support for extracting kernel symbol addresses Andrii Nakryiko
@ 2020-06-12 22:31 ` Andrii Nakryiko
       [not found]   ` <CA+khW7jxdS1KRpk2syVGjDqbyn3wAd3Eh_LEMAEhkPUehuXMwg@mail.gmail.com>
  2020-06-12 22:31 ` [RFC PATCH bpf-next 4/8] tools/bpftool: move map/prog parsing logic into common Andrii Nakryiko
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2020-06-12 22:31 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko, Hao Luo,
	Arnaldo Carvalho de Melo, Song Liu, Quentin Monnet

Validate libbpf is able to handle weak and strong kernel symbol externs in BPF
code correctly.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 .../testing/selftests/bpf/prog_tests/ksyms.c  | 71 +++++++++++++++++++
 .../testing/selftests/bpf/progs/test_ksyms.c  | 32 +++++++++
 2 files changed, 103 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms.c

diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms.c b/tools/testing/selftests/bpf/prog_tests/ksyms.c
new file mode 100644
index 000000000000..8f041c9059ed
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/ksyms.c
@@ -0,0 +1,71 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2019 Facebook */
+
+#include <test_progs.h>
+#include "test_ksyms.skel.h"
+#include <sys/stat.h>
+
+static int duration;
+
+static __u64 kallsyms_find(const char *sym)
+{
+	char type, name[256];
+	unsigned long addr, res = 0;
+	FILE *f;
+
+	f = fopen("/proc/kallsyms", "r");
+	if (CHECK(!f, "kallsyms_fopen", "failed to open: %d\n", errno))
+		return 0;
+
+	while (fscanf(f, "%lx %c %s%*[^\n]\n", &addr, &type, name) > 0) {
+		if (strcmp(name, sym) == 0) {
+			res = addr;
+			goto out;
+		}
+	}
+
+	CHECK(false, "not_found", "symbol %s not found\n", sym);
+out:
+	fclose(f);
+	return res;
+}
+
+void test_ksyms(void)
+{
+	__u64 link_fops_addr = kallsyms_find("bpf_link_fops");
+	const char *btf_path = "/sys/kernel/btf/vmlinux";
+	struct test_ksyms *skel;
+	struct test_ksyms__data *data;
+	struct stat st;
+	__u64 btf_size;
+	int err;
+
+	if (CHECK(stat(btf_path, &st), "stat_btf", "err %d\n", errno))
+		return;
+	btf_size = st.st_size;
+
+	skel = test_ksyms__open_and_load();
+	if (CHECK(!skel, "skel_open", "failed to open and load skeleton\n"))
+		return;
+
+	err = test_ksyms__attach(skel);
+	if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))
+		goto cleanup;
+
+	/* trigger tracepoint */
+	usleep(1);
+
+	data = skel->data;
+	CHECK(data->out__bpf_link_fops != link_fops_addr, "bpf_link_fops",
+	      "got 0x%llx, exp 0x%llx\n",
+	      data->out__bpf_link_fops, link_fops_addr);
+	CHECK(data->out__bpf_link_fops1 != 0, "bpf_link_fops1",
+	      "got %llu, exp %llu\n", data->out__bpf_link_fops1, (__u64)0);
+	CHECK(data->out__btf_size != btf_size, "btf_size",
+	      "got %llu, exp %llu\n", data->out__btf_size, btf_size);
+	CHECK(data->out__fixed_percpu_data != 0, "fixed_percpu_data",
+	      "got %llu, exp %llu\n", data->out__fixed_percpu_data, (__u64)0);
+
+cleanup:
+	test_ksyms__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_ksyms.c b/tools/testing/selftests/bpf/progs/test_ksyms.c
new file mode 100644
index 000000000000..598d6749b691
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_ksyms.c
@@ -0,0 +1,32 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2019 Facebook */
+
+#include <stdbool.h>
+#include <linux/bpf.h>
+#include <bpf/bpf_helpers.h>
+
+__u64 out__bpf_link_fops = -1;
+__u64 out__bpf_link_fops1 = -1;
+__u64 out__btf_size = -1;
+__u64 out__fixed_percpu_data = -1;
+
+extern const void bpf_link_fops __ksym;
+/* non-existing symbol, weak, default to zero */
+extern const void bpf_link_fops1 __ksym __weak;
+extern const void __start_BTF __ksym;
+extern const void __stop_BTF __ksym;
+/* fixed_percpu_data is always at zero offset */
+extern const void fixed_percpu_data __ksym;
+
+SEC("raw_tp/sys_enter")
+int handler(const void *ctx)
+{
+	out__bpf_link_fops = (__u64)&bpf_link_fops;
+	out__bpf_link_fops1 = (__u64)&bpf_link_fops1;
+	out__btf_size = (__u64)(&__stop_BTF - &__start_BTF);
+	out__fixed_percpu_data = (__u64)&fixed_percpu_data;
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH bpf-next 4/8] tools/bpftool: move map/prog parsing logic into common
  2020-06-12 22:31 [RFC PATCH bpf-next 0/8] libbpf ksym support and bpftool show PIDs Andrii Nakryiko
                   ` (2 preceding siblings ...)
  2020-06-12 22:31 ` [RFC PATCH bpf-next 3/8] selftests/bpf: add __ksym extern selftest Andrii Nakryiko
@ 2020-06-12 22:31 ` Andrii Nakryiko
  2020-06-12 22:31 ` [RFC PATCH bpf-next 5/8] tools/bpftool: minimize bootstrap bpftool Andrii Nakryiko
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2020-06-12 22:31 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko, Hao Luo,
	Arnaldo Carvalho de Melo, Song Liu, Quentin Monnet

Move functions that parse map and prog by id/tag/name/etc outside of
map.c/prog.c, respectively. These functions are used outside of those files
and are generic enough to be in common. This also makes heavy-weight map.c and
prog.c more decoupled from the rest of bpftool files and facilitates more
lightweight bootstrap bpftool variant.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/bpf/bpftool/common.c | 308 +++++++++++++++++++++++++++++++++++++
 tools/bpf/bpftool/main.h   |   2 +
 tools/bpf/bpftool/map.c    | 156 -------------------
 tools/bpf/bpftool/prog.c   | 152 ------------------
 4 files changed, 310 insertions(+), 308 deletions(-)

diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c
index c47bdc65de8e..6c864c3683fc 100644
--- a/tools/bpf/bpftool/common.c
+++ b/tools/bpf/bpftool/common.c
@@ -581,3 +581,311 @@ print_all_levels(__maybe_unused enum libbpf_print_level level,
 {
 	return vfprintf(stderr, format, args);
 }
+
+static int prog_fd_by_nametag(void *nametag, int **fds, bool tag)
+{
+	unsigned int id = 0;
+	int fd, nb_fds = 0;
+	void *tmp;
+	int err;
+
+	while (true) {
+		struct bpf_prog_info info = {};
+		__u32 len = sizeof(info);
+
+		err = bpf_prog_get_next_id(id, &id);
+		if (err) {
+			if (errno != ENOENT) {
+				p_err("%s", strerror(errno));
+				goto err_close_fds;
+			}
+			return nb_fds;
+		}
+
+		fd = bpf_prog_get_fd_by_id(id);
+		if (fd < 0) {
+			p_err("can't get prog by id (%u): %s",
+			      id, strerror(errno));
+			goto err_close_fds;
+		}
+
+		err = bpf_obj_get_info_by_fd(fd, &info, &len);
+		if (err) {
+			p_err("can't get prog info (%u): %s",
+			      id, strerror(errno));
+			goto err_close_fd;
+		}
+
+		if ((tag && memcmp(nametag, info.tag, BPF_TAG_SIZE)) ||
+		    (!tag && strncmp(nametag, info.name, BPF_OBJ_NAME_LEN))) {
+			close(fd);
+			continue;
+		}
+
+		if (nb_fds > 0) {
+			tmp = realloc(*fds, (nb_fds + 1) * sizeof(int));
+			if (!tmp) {
+				p_err("failed to realloc");
+				goto err_close_fd;
+			}
+			*fds = tmp;
+		}
+		(*fds)[nb_fds++] = fd;
+	}
+
+err_close_fd:
+	close(fd);
+err_close_fds:
+	while (--nb_fds >= 0)
+		close((*fds)[nb_fds]);
+	return -1;
+}
+
+int prog_parse_fds(int *argc, char ***argv, int **fds)
+{
+	if (is_prefix(**argv, "id")) {
+		unsigned int id;
+		char *endptr;
+
+		NEXT_ARGP();
+
+		id = strtoul(**argv, &endptr, 0);
+		if (*endptr) {
+			p_err("can't parse %s as ID", **argv);
+			return -1;
+		}
+		NEXT_ARGP();
+
+		(*fds)[0] = bpf_prog_get_fd_by_id(id);
+		if ((*fds)[0] < 0) {
+			p_err("get by id (%u): %s", id, strerror(errno));
+			return -1;
+		}
+		return 1;
+	} else if (is_prefix(**argv, "tag")) {
+		unsigned char tag[BPF_TAG_SIZE];
+
+		NEXT_ARGP();
+
+		if (sscanf(**argv, BPF_TAG_FMT, tag, tag + 1, tag + 2,
+			   tag + 3, tag + 4, tag + 5, tag + 6, tag + 7)
+		    != BPF_TAG_SIZE) {
+			p_err("can't parse tag");
+			return -1;
+		}
+		NEXT_ARGP();
+
+		return prog_fd_by_nametag(tag, fds, true);
+	} else if (is_prefix(**argv, "name")) {
+		char *name;
+
+		NEXT_ARGP();
+
+		name = **argv;
+		if (strlen(name) > BPF_OBJ_NAME_LEN - 1) {
+			p_err("can't parse name");
+			return -1;
+		}
+		NEXT_ARGP();
+
+		return prog_fd_by_nametag(name, fds, false);
+	} else if (is_prefix(**argv, "pinned")) {
+		char *path;
+
+		NEXT_ARGP();
+
+		path = **argv;
+		NEXT_ARGP();
+
+		(*fds)[0] = open_obj_pinned_any(path, BPF_OBJ_PROG);
+		if ((*fds)[0] < 0)
+			return -1;
+		return 1;
+	}
+
+	p_err("expected 'id', 'tag', 'name' or 'pinned', got: '%s'?", **argv);
+	return -1;
+}
+
+int prog_parse_fd(int *argc, char ***argv)
+{
+	int *fds = NULL;
+	int nb_fds, fd;
+
+	fds = malloc(sizeof(int));
+	if (!fds) {
+		p_err("mem alloc failed");
+		return -1;
+	}
+	nb_fds = prog_parse_fds(argc, argv, &fds);
+	if (nb_fds != 1) {
+		if (nb_fds > 1) {
+			p_err("several programs match this handle");
+			while (nb_fds--)
+				close(fds[nb_fds]);
+		}
+		fd = -1;
+		goto exit_free;
+	}
+
+	fd = fds[0];
+exit_free:
+	free(fds);
+	return fd;
+}
+
+static int map_fd_by_name(char *name, int **fds)
+{
+	unsigned int id = 0;
+	int fd, nb_fds = 0;
+	void *tmp;
+	int err;
+
+	while (true) {
+		struct bpf_map_info info = {};
+		__u32 len = sizeof(info);
+
+		err = bpf_map_get_next_id(id, &id);
+		if (err) {
+			if (errno != ENOENT) {
+				p_err("%s", strerror(errno));
+				goto err_close_fds;
+			}
+			return nb_fds;
+		}
+
+		fd = bpf_map_get_fd_by_id(id);
+		if (fd < 0) {
+			p_err("can't get map by id (%u): %s",
+			      id, strerror(errno));
+			goto err_close_fds;
+		}
+
+		err = bpf_obj_get_info_by_fd(fd, &info, &len);
+		if (err) {
+			p_err("can't get map info (%u): %s",
+			      id, strerror(errno));
+			goto err_close_fd;
+		}
+
+		if (strncmp(name, info.name, BPF_OBJ_NAME_LEN)) {
+			close(fd);
+			continue;
+		}
+
+		if (nb_fds > 0) {
+			tmp = realloc(*fds, (nb_fds + 1) * sizeof(int));
+			if (!tmp) {
+				p_err("failed to realloc");
+				goto err_close_fd;
+			}
+			*fds = tmp;
+		}
+		(*fds)[nb_fds++] = fd;
+	}
+
+err_close_fd:
+	close(fd);
+err_close_fds:
+	while (--nb_fds >= 0)
+		close((*fds)[nb_fds]);
+	return -1;
+}
+
+int map_parse_fds(int *argc, char ***argv, int **fds)
+{
+	if (is_prefix(**argv, "id")) {
+		unsigned int id;
+		char *endptr;
+
+		NEXT_ARGP();
+
+		id = strtoul(**argv, &endptr, 0);
+		if (*endptr) {
+			p_err("can't parse %s as ID", **argv);
+			return -1;
+		}
+		NEXT_ARGP();
+
+		(*fds)[0] = bpf_map_get_fd_by_id(id);
+		if ((*fds)[0] < 0) {
+			p_err("get map by id (%u): %s", id, strerror(errno));
+			return -1;
+		}
+		return 1;
+	} else if (is_prefix(**argv, "name")) {
+		char *name;
+
+		NEXT_ARGP();
+
+		name = **argv;
+		if (strlen(name) > BPF_OBJ_NAME_LEN - 1) {
+			p_err("can't parse name");
+			return -1;
+		}
+		NEXT_ARGP();
+
+		return map_fd_by_name(name, fds);
+	} else if (is_prefix(**argv, "pinned")) {
+		char *path;
+
+		NEXT_ARGP();
+
+		path = **argv;
+		NEXT_ARGP();
+
+		(*fds)[0] = open_obj_pinned_any(path, BPF_OBJ_MAP);
+		if ((*fds)[0] < 0)
+			return -1;
+		return 1;
+	}
+
+	p_err("expected 'id', 'name' or 'pinned', got: '%s'?", **argv);
+	return -1;
+}
+
+int map_parse_fd(int *argc, char ***argv)
+{
+	int *fds = NULL;
+	int nb_fds, fd;
+
+	fds = malloc(sizeof(int));
+	if (!fds) {
+		p_err("mem alloc failed");
+		return -1;
+	}
+	nb_fds = map_parse_fds(argc, argv, &fds);
+	if (nb_fds != 1) {
+		if (nb_fds > 1) {
+			p_err("several maps match this handle");
+			while (nb_fds--)
+				close(fds[nb_fds]);
+		}
+		fd = -1;
+		goto exit_free;
+	}
+
+	fd = fds[0];
+exit_free:
+	free(fds);
+	return fd;
+}
+
+int map_parse_fd_and_info(int *argc, char ***argv, void *info, __u32 *info_len)
+{
+	int err;
+	int fd;
+
+	fd = map_parse_fd(argc, argv);
+	if (fd < 0)
+		return -1;
+
+	err = bpf_obj_get_info_by_fd(fd, info, info_len);
+	if (err) {
+		p_err("can't get map info: %s", strerror(errno));
+		close(fd);
+		return err;
+	}
+
+	return fd;
+}
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index 5cdf0bc049bd..4338ab9d86d4 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -210,7 +210,9 @@ int do_iter(int argc, char **argv);
 
 int parse_u32_arg(int *argc, char ***argv, __u32 *val, const char *what);
 int prog_parse_fd(int *argc, char ***argv);
+int prog_parse_fds(int *argc, char ***argv, int **fds);
 int map_parse_fd(int *argc, char ***argv);
+int map_parse_fds(int *argc, char ***argv, int **fds);
 int map_parse_fd_and_info(int *argc, char ***argv, void *info, __u32 *info_len);
 
 struct bpf_prog_linfo;
diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
index c5fac8068ba1..b9eee19b094c 100644
--- a/tools/bpf/bpftool/map.c
+++ b/tools/bpf/bpftool/map.c
@@ -92,162 +92,6 @@ static void *alloc_value(struct bpf_map_info *info)
 		return malloc(info->value_size);
 }
 
-static int map_fd_by_name(char *name, int **fds)
-{
-	unsigned int id = 0;
-	int fd, nb_fds = 0;
-	void *tmp;
-	int err;
-
-	while (true) {
-		struct bpf_map_info info = {};
-		__u32 len = sizeof(info);
-
-		err = bpf_map_get_next_id(id, &id);
-		if (err) {
-			if (errno != ENOENT) {
-				p_err("%s", strerror(errno));
-				goto err_close_fds;
-			}
-			return nb_fds;
-		}
-
-		fd = bpf_map_get_fd_by_id(id);
-		if (fd < 0) {
-			p_err("can't get map by id (%u): %s",
-			      id, strerror(errno));
-			goto err_close_fds;
-		}
-
-		err = bpf_obj_get_info_by_fd(fd, &info, &len);
-		if (err) {
-			p_err("can't get map info (%u): %s",
-			      id, strerror(errno));
-			goto err_close_fd;
-		}
-
-		if (strncmp(name, info.name, BPF_OBJ_NAME_LEN)) {
-			close(fd);
-			continue;
-		}
-
-		if (nb_fds > 0) {
-			tmp = realloc(*fds, (nb_fds + 1) * sizeof(int));
-			if (!tmp) {
-				p_err("failed to realloc");
-				goto err_close_fd;
-			}
-			*fds = tmp;
-		}
-		(*fds)[nb_fds++] = fd;
-	}
-
-err_close_fd:
-	close(fd);
-err_close_fds:
-	while (--nb_fds >= 0)
-		close((*fds)[nb_fds]);
-	return -1;
-}
-
-static int map_parse_fds(int *argc, char ***argv, int **fds)
-{
-	if (is_prefix(**argv, "id")) {
-		unsigned int id;
-		char *endptr;
-
-		NEXT_ARGP();
-
-		id = strtoul(**argv, &endptr, 0);
-		if (*endptr) {
-			p_err("can't parse %s as ID", **argv);
-			return -1;
-		}
-		NEXT_ARGP();
-
-		(*fds)[0] = bpf_map_get_fd_by_id(id);
-		if ((*fds)[0] < 0) {
-			p_err("get map by id (%u): %s", id, strerror(errno));
-			return -1;
-		}
-		return 1;
-	} else if (is_prefix(**argv, "name")) {
-		char *name;
-
-		NEXT_ARGP();
-
-		name = **argv;
-		if (strlen(name) > BPF_OBJ_NAME_LEN - 1) {
-			p_err("can't parse name");
-			return -1;
-		}
-		NEXT_ARGP();
-
-		return map_fd_by_name(name, fds);
-	} else if (is_prefix(**argv, "pinned")) {
-		char *path;
-
-		NEXT_ARGP();
-
-		path = **argv;
-		NEXT_ARGP();
-
-		(*fds)[0] = open_obj_pinned_any(path, BPF_OBJ_MAP);
-		if ((*fds)[0] < 0)
-			return -1;
-		return 1;
-	}
-
-	p_err("expected 'id', 'name' or 'pinned', got: '%s'?", **argv);
-	return -1;
-}
-
-int map_parse_fd(int *argc, char ***argv)
-{
-	int *fds = NULL;
-	int nb_fds, fd;
-
-	fds = malloc(sizeof(int));
-	if (!fds) {
-		p_err("mem alloc failed");
-		return -1;
-	}
-	nb_fds = map_parse_fds(argc, argv, &fds);
-	if (nb_fds != 1) {
-		if (nb_fds > 1) {
-			p_err("several maps match this handle");
-			while (nb_fds--)
-				close(fds[nb_fds]);
-		}
-		fd = -1;
-		goto exit_free;
-	}
-
-	fd = fds[0];
-exit_free:
-	free(fds);
-	return fd;
-}
-
-int map_parse_fd_and_info(int *argc, char ***argv, void *info, __u32 *info_len)
-{
-	int err;
-	int fd;
-
-	fd = map_parse_fd(argc, argv);
-	if (fd < 0)
-		return -1;
-
-	err = bpf_obj_get_info_by_fd(fd, info, info_len);
-	if (err) {
-		p_err("can't get map info: %s", strerror(errno));
-		close(fd);
-		return err;
-	}
-
-	return fd;
-}
-
 static int do_dump_btf(const struct btf_dumper *d,
 		       struct bpf_map_info *map_info, void *key,
 		       void *value)
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index a5eff83496f2..53d47610ff58 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -86,158 +86,6 @@ static void print_boot_time(__u64 nsecs, char *buf, unsigned int size)
 		strftime(buf, size, "%FT%T%z", &load_tm);
 }
 
-static int prog_fd_by_nametag(void *nametag, int **fds, bool tag)
-{
-	unsigned int id = 0;
-	int fd, nb_fds = 0;
-	void *tmp;
-	int err;
-
-	while (true) {
-		struct bpf_prog_info info = {};
-		__u32 len = sizeof(info);
-
-		err = bpf_prog_get_next_id(id, &id);
-		if (err) {
-			if (errno != ENOENT) {
-				p_err("%s", strerror(errno));
-				goto err_close_fds;
-			}
-			return nb_fds;
-		}
-
-		fd = bpf_prog_get_fd_by_id(id);
-		if (fd < 0) {
-			p_err("can't get prog by id (%u): %s",
-			      id, strerror(errno));
-			goto err_close_fds;
-		}
-
-		err = bpf_obj_get_info_by_fd(fd, &info, &len);
-		if (err) {
-			p_err("can't get prog info (%u): %s",
-			      id, strerror(errno));
-			goto err_close_fd;
-		}
-
-		if ((tag && memcmp(nametag, info.tag, BPF_TAG_SIZE)) ||
-		    (!tag && strncmp(nametag, info.name, BPF_OBJ_NAME_LEN))) {
-			close(fd);
-			continue;
-		}
-
-		if (nb_fds > 0) {
-			tmp = realloc(*fds, (nb_fds + 1) * sizeof(int));
-			if (!tmp) {
-				p_err("failed to realloc");
-				goto err_close_fd;
-			}
-			*fds = tmp;
-		}
-		(*fds)[nb_fds++] = fd;
-	}
-
-err_close_fd:
-	close(fd);
-err_close_fds:
-	while (--nb_fds >= 0)
-		close((*fds)[nb_fds]);
-	return -1;
-}
-
-static int prog_parse_fds(int *argc, char ***argv, int **fds)
-{
-	if (is_prefix(**argv, "id")) {
-		unsigned int id;
-		char *endptr;
-
-		NEXT_ARGP();
-
-		id = strtoul(**argv, &endptr, 0);
-		if (*endptr) {
-			p_err("can't parse %s as ID", **argv);
-			return -1;
-		}
-		NEXT_ARGP();
-
-		(*fds)[0] = bpf_prog_get_fd_by_id(id);
-		if ((*fds)[0] < 0) {
-			p_err("get by id (%u): %s", id, strerror(errno));
-			return -1;
-		}
-		return 1;
-	} else if (is_prefix(**argv, "tag")) {
-		unsigned char tag[BPF_TAG_SIZE];
-
-		NEXT_ARGP();
-
-		if (sscanf(**argv, BPF_TAG_FMT, tag, tag + 1, tag + 2,
-			   tag + 3, tag + 4, tag + 5, tag + 6, tag + 7)
-		    != BPF_TAG_SIZE) {
-			p_err("can't parse tag");
-			return -1;
-		}
-		NEXT_ARGP();
-
-		return prog_fd_by_nametag(tag, fds, true);
-	} else if (is_prefix(**argv, "name")) {
-		char *name;
-
-		NEXT_ARGP();
-
-		name = **argv;
-		if (strlen(name) > BPF_OBJ_NAME_LEN - 1) {
-			p_err("can't parse name");
-			return -1;
-		}
-		NEXT_ARGP();
-
-		return prog_fd_by_nametag(name, fds, false);
-	} else if (is_prefix(**argv, "pinned")) {
-		char *path;
-
-		NEXT_ARGP();
-
-		path = **argv;
-		NEXT_ARGP();
-
-		(*fds)[0] = open_obj_pinned_any(path, BPF_OBJ_PROG);
-		if ((*fds)[0] < 0)
-			return -1;
-		return 1;
-	}
-
-	p_err("expected 'id', 'tag', 'name' or 'pinned', got: '%s'?", **argv);
-	return -1;
-}
-
-int prog_parse_fd(int *argc, char ***argv)
-{
-	int *fds = NULL;
-	int nb_fds, fd;
-
-	fds = malloc(sizeof(int));
-	if (!fds) {
-		p_err("mem alloc failed");
-		return -1;
-	}
-	nb_fds = prog_parse_fds(argc, argv, &fds);
-	if (nb_fds != 1) {
-		if (nb_fds > 1) {
-			p_err("several programs match this handle");
-			while (nb_fds--)
-				close(fds[nb_fds]);
-		}
-		fd = -1;
-		goto exit_free;
-	}
-
-	fd = fds[0];
-exit_free:
-	free(fds);
-	return fd;
-}
-
 static void show_prog_maps(int fd, __u32 num_maps)
 {
 	struct bpf_prog_info info = {};
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH bpf-next 5/8] tools/bpftool: minimize bootstrap bpftool
  2020-06-12 22:31 [RFC PATCH bpf-next 0/8] libbpf ksym support and bpftool show PIDs Andrii Nakryiko
                   ` (3 preceding siblings ...)
  2020-06-12 22:31 ` [RFC PATCH bpf-next 4/8] tools/bpftool: move map/prog parsing logic into common Andrii Nakryiko
@ 2020-06-12 22:31 ` Andrii Nakryiko
  2020-06-12 22:31 ` [RFC PATCH bpf-next 6/8] tools/bpftool: generalize BPF skeleton support and generate vmlinux.h Andrii Nakryiko
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2020-06-12 22:31 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko, Hao Luo,
	Arnaldo Carvalho de Melo, Song Liu, Quentin Monnet

Build minimal "bootstrap mode" bpftool to enable skeleton (and, later,
vmlinux.h generation), instead of building almost complete, but slightly
different (w/o skeletons, etc) bpftool to bootstrap complete bpftool build.

Current approach doesn't scale well (engineering-wise) when adding more BPF
programs to bpftool and other complicated functionality, as it requires
constant adjusting of the code to work in both bootstrapped mode and normal
mode.

So it's better to build only minimal bpftool version that supports only BPF
skeleton code generation and BTF-to-C conversion. Thankfully, this is quite
easy to accomplish due to internal modularity of bpftool commands. This will
also allow to keep adding new functionality to bpftool in general, without the
need to care about bootstrap mode for those new parts of bpftool.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/bpf/bpftool/.gitignore |  2 +-
 tools/bpf/bpftool/Makefile   | 30 +++++++++++++-----------------
 tools/bpf/bpftool/main.c     | 11 +++++++++--
 tools/bpf/bpftool/main.h     | 27 +++++++++++++++------------
 4 files changed, 38 insertions(+), 32 deletions(-)

diff --git a/tools/bpf/bpftool/.gitignore b/tools/bpf/bpftool/.gitignore
index 26cde83e1ca3..ce721adf3161 100644
--- a/tools/bpf/bpftool/.gitignore
+++ b/tools/bpf/bpftool/.gitignore
@@ -1,6 +1,6 @@
 # SPDX-License-Identifier: GPL-2.0-only
 *.d
-/_bpftool
+/bpftool-bootstrap
 /bpftool
 bpftool*.8
 bpf-helpers.*
diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index 9e85f101be85..eec2da4d45d2 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -116,40 +116,36 @@ CFLAGS += -DHAVE_LIBBFD_SUPPORT
 SRCS += $(BFD_SRCS)
 endif
 
+BPFTOOL_BOOTSTRAP := $(if $(OUTPUT),$(OUTPUT)bpftool-bootstrap,./bpftool-bootstrap)
+
+BOOTSTRAP_OBJS = $(addprefix $(OUTPUT),main.o common.o json_writer.o gen.o btf.o)
 OBJS = $(patsubst %.c,$(OUTPUT)%.o,$(SRCS)) $(OUTPUT)disasm.o
-_OBJS = $(filter-out $(OUTPUT)prog.o,$(OBJS)) $(OUTPUT)_prog.o
 
-ifeq ($(feature-clang-bpf-global-var),1)
-	__OBJS = $(OBJS)
-else
-	__OBJS = $(_OBJS)
+ifneq ($(feature-clang-bpf-global-var),1)
+	CFLAGS += -DBPFTOOL_WITHOUT_SKELETONS
 endif
 
-$(OUTPUT)_prog.o: prog.c
-	$(QUIET_CC)$(CC) $(CFLAGS) -c -MMD -DBPFTOOL_WITHOUT_SKELETONS -o $@ $<
-
-$(OUTPUT)_bpftool: $(_OBJS) $(LIBBPF)
-	$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(_OBJS) $(LIBS)
-
 skeleton/profiler.bpf.o: skeleton/profiler.bpf.c $(LIBBPF)
 	$(QUIET_CLANG)$(CLANG) \
 		-I$(srctree)/tools/include/uapi/ \
 		-I$(LIBBPF_PATH) -I$(srctree)/tools/lib \
 		-g -O2 -target bpf -c $< -o $@
 
-profiler.skel.h: $(OUTPUT)_bpftool skeleton/profiler.bpf.o
-	$(QUIET_GEN)$(OUTPUT)./_bpftool gen skeleton skeleton/profiler.bpf.o > $@
+profiler.skel.h: $(BPFTOOL_BOOTSTRAP) skeleton/profiler.bpf.o
+	$(QUIET_GEN)$(BPFTOOL_BOOTSTRAP) gen skeleton skeleton/profiler.bpf.o > $@
 
 $(OUTPUT)prog.o: prog.c profiler.skel.h
-	$(QUIET_CC)$(CC) $(CFLAGS) -c -MMD -o $@ $<
 
 $(OUTPUT)disasm.o: $(srctree)/kernel/bpf/disasm.c
 	$(QUIET_CC)$(CC) $(CFLAGS) -c -MMD -o $@ $<
 
 $(OUTPUT)feature.o: | zdep
 
-$(OUTPUT)bpftool: $(__OBJS) $(LIBBPF)
-	$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(__OBJS) $(LIBS)
+$(BPFTOOL_BOOTSTRAP): $(BOOTSTRAP_OBJS) $(LIBBPF)
+	$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(BOOTSTRAP_OBJS) $(LIBS)
+
+$(OUTPUT)bpftool: $(OBJS) $(LIBBPF)
+	$(QUIET_LINK)$(CC) $(CFLAGS) $(LDFLAGS) -o $@ $(OBJS) $(LIBS)
 
 $(OUTPUT)%.o: %.c
 	$(QUIET_CC)$(CC) $(CFLAGS) -c -MMD -o $@ $<
@@ -157,7 +153,7 @@ $(OUTPUT)%.o: %.c
 clean: $(LIBBPF)-clean
 	$(call QUIET_CLEAN, bpftool)
 	$(Q)$(RM) -- $(OUTPUT)bpftool $(OUTPUT)*.o $(OUTPUT)*.d
-	$(Q)$(RM) -- $(OUTPUT)_bpftool profiler.skel.h skeleton/profiler.bpf.o
+	$(Q)$(RM) -- $(BPFTOOL_BOOTSTRAP) profiler.skel.h skeleton/profiler.bpf.o
 	$(Q)$(RM) -r -- $(OUTPUT)libbpf/
 	$(call QUIET_CLEAN, core-gen)
 	$(Q)$(RM) -- $(OUTPUT)FEATURE-DUMP.bpftool
diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
index 46bd716a9d86..bf4d7487552a 100644
--- a/tools/bpf/bpftool/main.c
+++ b/tools/bpf/bpftool/main.c
@@ -92,9 +92,16 @@ int cmd_select(const struct cmd *cmds, int argc, char **argv,
 	if (argc < 1 && cmds[0].func)
 		return cmds[0].func(argc, argv);
 
-	for (i = 0; cmds[i].func; i++)
-		if (is_prefix(*argv, cmds[i].cmd))
+	for (i = 0; cmds[i].cmd; i++) {
+		if (is_prefix(*argv, cmds[i].cmd)) {
+			if (!cmds[i].func) {
+				p_err("command '%s' is not supported in bootstrap mode",
+				      cmds[i].cmd);
+				return -1;
+			}
 			return cmds[i].func(argc - 1, argv + 1);
+		}
+	}
 
 	help(argc - 1, argv + 1);
 
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index 4338ab9d86d4..aad7be74e8a7 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -194,19 +194,22 @@ int mount_bpffs_for_pin(const char *name);
 int do_pin_any(int argc, char **argv, int (*get_fd_by_id)(int *, char ***));
 int do_pin_fd(int fd, const char *name);
 
-int do_prog(int argc, char **arg);
-int do_map(int argc, char **arg);
-int do_link(int argc, char **arg);
-int do_event_pipe(int argc, char **argv);
-int do_cgroup(int argc, char **arg);
-int do_perf(int argc, char **arg);
-int do_net(int argc, char **arg);
-int do_tracelog(int argc, char **arg);
-int do_feature(int argc, char **argv);
-int do_btf(int argc, char **argv);
+/* commands available in bootstrap mode */
 int do_gen(int argc, char **argv);
-int do_struct_ops(int argc, char **argv);
-int do_iter(int argc, char **argv);
+int do_btf(int argc, char **argv);
+
+/* non-bootstrap only commands */
+int do_prog(int argc, char **arg) __weak;
+int do_map(int argc, char **arg) __weak;
+int do_link(int argc, char **arg) __weak;
+int do_event_pipe(int argc, char **argv) __weak;
+int do_cgroup(int argc, char **arg) __weak;
+int do_perf(int argc, char **arg) __weak;
+int do_net(int argc, char **arg) __weak;
+int do_tracelog(int argc, char **arg) __weak;
+int do_feature(int argc, char **argv) __weak;
+int do_struct_ops(int argc, char **argv) __weak;
+int do_iter(int argc, char **argv) __weak;
 
 int parse_u32_arg(int *argc, char ***argv, __u32 *val, const char *what);
 int prog_parse_fd(int *argc, char ***argv);
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH bpf-next 6/8] tools/bpftool: generalize BPF skeleton support and generate vmlinux.h
  2020-06-12 22:31 [RFC PATCH bpf-next 0/8] libbpf ksym support and bpftool show PIDs Andrii Nakryiko
                   ` (4 preceding siblings ...)
  2020-06-12 22:31 ` [RFC PATCH bpf-next 5/8] tools/bpftool: minimize bootstrap bpftool Andrii Nakryiko
@ 2020-06-12 22:31 ` Andrii Nakryiko
  2020-06-12 22:31 ` [RFC PATCH bpf-next 7/8] libbpf: wrap source argument of BPF_CORE_READ macro in parentheses Andrii Nakryiko
  2020-06-12 22:31 ` [RFC PATCH bpf-next 8/8] tools/bpftool: show PIDs with FDs open against BPF map/prog/link/btf Andrii Nakryiko
  7 siblings, 0 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2020-06-12 22:31 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko, Hao Luo,
	Arnaldo Carvalho de Melo, Song Liu, Quentin Monnet

Adapt Makefile to support BPF skeleton generation beyond single profiler.bpf.c
case. Also add vmlinux.h generation and switch profiler.bpf.c to use it.

clang-bpf-global-var feature is extended and renamed to clang-bpf-co-re to
check for support of preserve_access_index attribute, which, together with BTF
for global variables, is the minimum requirement for modern BPF programs.

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/bpf/bpftool/.gitignore                  |  3 +-
 tools/bpf/bpftool/Makefile                    | 42 ++++++++++++-----
 tools/bpf/bpftool/skeleton/profiler.bpf.c     |  3 +-
 tools/bpf/bpftool/skeleton/profiler.h         | 46 -------------------
 tools/build/feature/Makefile                  |  4 +-
 tools/build/feature/test-clang-bpf-co-re.c    |  9 ++++
 .../build/feature/test-clang-bpf-global-var.c |  4 --
 7 files changed, 45 insertions(+), 66 deletions(-)
 delete mode 100644 tools/bpf/bpftool/skeleton/profiler.h
 create mode 100644 tools/build/feature/test-clang-bpf-co-re.c
 delete mode 100644 tools/build/feature/test-clang-bpf-global-var.c

diff --git a/tools/bpf/bpftool/.gitignore b/tools/bpf/bpftool/.gitignore
index ce721adf3161..3e601bcfd461 100644
--- a/tools/bpf/bpftool/.gitignore
+++ b/tools/bpf/bpftool/.gitignore
@@ -7,4 +7,5 @@ bpf-helpers.*
 FEATURE-DUMP.bpftool
 feature
 libbpf
-profiler.skel.h
+/*.skel.h
+/vmlinux.h
diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index eec2da4d45d2..bdb6e38c6c5c 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -42,6 +42,7 @@ CFLAGS += -O2
 CFLAGS += -W -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers
 CFLAGS += $(filter-out -Wswitch-enum,$(EXTRA_WARNINGS))
 CFLAGS += -DPACKAGE='"bpftool"' -D__EXPORTED_HEADERS__ \
+	-I$(if $(OUTPUT),$(OUTPUT),.) \
 	-I$(srctree)/kernel/bpf/ \
 	-I$(srctree)/tools/include \
 	-I$(srctree)/tools/include/uapi \
@@ -61,9 +62,9 @@ CLANG ?= clang
 
 FEATURE_USER = .bpftool
 FEATURE_TESTS = libbfd disassembler-four-args reallocarray zlib libcap \
-	clang-bpf-global-var
+	clang-bpf-co-re
 FEATURE_DISPLAY = libbfd disassembler-four-args zlib libcap \
-	clang-bpf-global-var
+	clang-bpf-co-re
 
 check_feat := 1
 NON_CHECK_FEAT_TARGETS := clean uninstall doc doc-clean doc-install doc-uninstall
@@ -121,20 +122,38 @@ BPFTOOL_BOOTSTRAP := $(if $(OUTPUT),$(OUTPUT)bpftool-bootstrap,./bpftool-bootstr
 BOOTSTRAP_OBJS = $(addprefix $(OUTPUT),main.o common.o json_writer.o gen.o btf.o)
 OBJS = $(patsubst %.c,$(OUTPUT)%.o,$(SRCS)) $(OUTPUT)disasm.o
 
-ifneq ($(feature-clang-bpf-global-var),1)
-	CFLAGS += -DBPFTOOL_WITHOUT_SKELETONS
-endif
+VMLINUX_BTF_PATHS := $(if $(O),$(O)/vmlinux)				\
+		     $(if $(KBUILD_OUTPUT),$(KBUILD_OUTPUT)/vmlinux)	\
+		     ../../../vmlinux					\
+		     /sys/kernel/btf/vmlinux				\
+		     /boot/vmlinux-$(shell uname -r)
+VMLINUX_BTF := $(abspath $(firstword $(wildcard $(VMLINUX_BTF_PATHS))))
+
+ifneq ($(VMLINUX_BTF),)
+ifeq ($(feature-clang-bpf-co-re),1)
+
+BUILD_BPF_SKELS := 1
+
+$(OUTPUT)vmlinux.h: $(VMLINUX_BTF) $(BPFTOOL_BOOTSTRAP)
+	$(QUIET_GEN)$(BPFTOOL_BOOTSTRAP) btf dump file $< format c > $@
 
-skeleton/profiler.bpf.o: skeleton/profiler.bpf.c $(LIBBPF)
+$(OUTPUT)%.bpf.o: skeleton/%.bpf.c $(OUTPUT)vmlinux.h $(LIBBPF)
 	$(QUIET_CLANG)$(CLANG) \
+		-I$(if $(OUTPUT),$(OUTPUT),.) \
 		-I$(srctree)/tools/include/uapi/ \
-		-I$(LIBBPF_PATH) -I$(srctree)/tools/lib \
+		-I$(LIBBPF_PATH) \
+		-I$(srctree)/tools/lib \
 		-g -O2 -target bpf -c $< -o $@
 
-profiler.skel.h: $(BPFTOOL_BOOTSTRAP) skeleton/profiler.bpf.o
-	$(QUIET_GEN)$(BPFTOOL_BOOTSTRAP) gen skeleton skeleton/profiler.bpf.o > $@
+$(OUTPUT)%.skel.h: $(OUTPUT)%.bpf.o $(BPFTOOL_BOOTSTRAP)
+	$(QUIET_GEN)$(BPFTOOL_BOOTSTRAP) gen skeleton $< > $@
+
+$(OUTPUT)prog.o: $(OUTPUT)profiler.skel.h
+
+endif
+endif
 
-$(OUTPUT)prog.o: prog.c profiler.skel.h
+CFLAGS += $(if BUILD_BPF_SKELS,,-DBPFTOOL_WITHOUT_SKELETONS)
 
 $(OUTPUT)disasm.o: $(srctree)/kernel/bpf/disasm.c
 	$(QUIET_CC)$(CC) $(CFLAGS) -c -MMD -o $@ $<
@@ -153,7 +172,7 @@ $(OUTPUT)%.o: %.c
 clean: $(LIBBPF)-clean
 	$(call QUIET_CLEAN, bpftool)
 	$(Q)$(RM) -- $(OUTPUT)bpftool $(OUTPUT)*.o $(OUTPUT)*.d
-	$(Q)$(RM) -- $(BPFTOOL_BOOTSTRAP) profiler.skel.h skeleton/profiler.bpf.o
+	$(Q)$(RM) -- $(BPFTOOL_BOOTSTRAP) $(OUTPUT)*.skel.h $(OUTPUT)vmlinux.h
 	$(Q)$(RM) -r -- $(OUTPUT)libbpf/
 	$(call QUIET_CLEAN, core-gen)
 	$(Q)$(RM) -- $(OUTPUT)FEATURE-DUMP.bpftool
@@ -188,6 +207,7 @@ FORCE:
 zdep:
 	@if [ "$(feature-zlib)" != "1" ]; then echo "No zlib found"; exit 1 ; fi
 
+.SECONDARY:
 .PHONY: all FORCE clean install uninstall zdep
 .PHONY: doc doc-clean doc-install doc-uninstall
 .DEFAULT_GOAL := all
diff --git a/tools/bpf/bpftool/skeleton/profiler.bpf.c b/tools/bpf/bpftool/skeleton/profiler.bpf.c
index 20034c12f7c5..5207880b9204 100644
--- a/tools/bpf/bpftool/skeleton/profiler.bpf.c
+++ b/tools/bpf/bpftool/skeleton/profiler.bpf.c
@@ -1,7 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 // Copyright (c) 2020 Facebook
-#include "profiler.h"
-#include <linux/bpf.h>
+#include <vmlinux.h>
 #include <bpf/bpf_helpers.h>
 #include <bpf/bpf_tracing.h>
 
diff --git a/tools/bpf/bpftool/skeleton/profiler.h b/tools/bpf/bpftool/skeleton/profiler.h
deleted file mode 100644
index 1f767e9510f7..000000000000
--- a/tools/bpf/bpftool/skeleton/profiler.h
+++ /dev/null
@@ -1,46 +0,0 @@
-/* SPDX-License-Identifier: (LGPL-2.1 OR BSD-2-Clause) */
-#ifndef __PROFILER_H
-#define __PROFILER_H
-
-/* useful typedefs from vmlinux.h */
-
-typedef signed char __s8;
-typedef unsigned char __u8;
-typedef short int __s16;
-typedef short unsigned int __u16;
-typedef int __s32;
-typedef unsigned int __u32;
-typedef long long int __s64;
-typedef long long unsigned int __u64;
-
-typedef __s8 s8;
-typedef __u8 u8;
-typedef __s16 s16;
-typedef __u16 u16;
-typedef __s32 s32;
-typedef __u32 u32;
-typedef __s64 s64;
-typedef __u64 u64;
-
-enum {
-	false = 0,
-	true = 1,
-};
-
-#ifdef __CHECKER__
-#define __bitwise__ __attribute__((bitwise))
-#else
-#define __bitwise__
-#endif
-
-typedef __u16 __bitwise__ __le16;
-typedef __u16 __bitwise__ __be16;
-typedef __u32 __bitwise__ __le32;
-typedef __u32 __bitwise__ __be32;
-typedef __u64 __bitwise__ __le64;
-typedef __u64 __bitwise__ __be64;
-
-typedef __u16 __bitwise__ __sum16;
-typedef __u32 __bitwise__ __wsum;
-
-#endif /* __PROFILER_H */
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile
index 84f845b9627d..26d11a1a9870 100644
--- a/tools/build/feature/Makefile
+++ b/tools/build/feature/Makefile
@@ -68,7 +68,7 @@ FILES=                                          \
          test-llvm-version.bin			\
          test-libaio.bin			\
          test-libzstd.bin			\
-         test-clang-bpf-global-var.bin		\
+         test-clang-bpf-co-re.bin		\
          test-file-handle.bin			\
          test-libpfm4.bin
 
@@ -325,7 +325,7 @@ $(OUTPUT)test-libaio.bin:
 $(OUTPUT)test-libzstd.bin:
 	$(BUILD) -lzstd
 
-$(OUTPUT)test-clang-bpf-global-var.bin:
+$(OUTPUT)test-clang-bpf-co-re.bin:
 	$(CLANG) -S -g -target bpf -o - $(patsubst %.bin,%.c,$(@F)) |	\
 		grep BTF_KIND_VAR
 
diff --git a/tools/build/feature/test-clang-bpf-co-re.c b/tools/build/feature/test-clang-bpf-co-re.c
new file mode 100644
index 000000000000..cb5265bfdd83
--- /dev/null
+++ b/tools/build/feature/test-clang-bpf-co-re.c
@@ -0,0 +1,9 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2020 Facebook
+
+struct test {
+	int a;
+	int b;
+} __attribute__((preserve_access_index));
+
+volatile struct test global_value_for_test = {};
diff --git a/tools/build/feature/test-clang-bpf-global-var.c b/tools/build/feature/test-clang-bpf-global-var.c
deleted file mode 100644
index 221f1481d52e..000000000000
--- a/tools/build/feature/test-clang-bpf-global-var.c
+++ /dev/null
@@ -1,4 +0,0 @@
-// SPDX-License-Identifier: GPL-2.0
-// Copyright (c) 2020 Facebook
-
-volatile int global_value_for_test = 1;
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH bpf-next 7/8] libbpf: wrap source argument of BPF_CORE_READ macro in parentheses
  2020-06-12 22:31 [RFC PATCH bpf-next 0/8] libbpf ksym support and bpftool show PIDs Andrii Nakryiko
                   ` (5 preceding siblings ...)
  2020-06-12 22:31 ` [RFC PATCH bpf-next 6/8] tools/bpftool: generalize BPF skeleton support and generate vmlinux.h Andrii Nakryiko
@ 2020-06-12 22:31 ` Andrii Nakryiko
  2020-06-12 22:31 ` [RFC PATCH bpf-next 8/8] tools/bpftool: show PIDs with FDs open against BPF map/prog/link/btf Andrii Nakryiko
  7 siblings, 0 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2020-06-12 22:31 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko, Hao Luo,
	Arnaldo Carvalho de Melo, Song Liu, Quentin Monnet

Wrap source argument of BPF_CORE_READ family of macros into parentheses to
allow uses like this:

BPF_CORE_READ((struct cast_struct *)src, a, b, c);

Fixes: 7db3822ab991 ("libbpf: Add BPF_CORE_READ/BPF_CORE_READ_INTO helpers")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/lib/bpf/bpf_core_read.h | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tools/lib/bpf/bpf_core_read.h b/tools/lib/bpf/bpf_core_read.h
index 7009dc90e012..eae5cccff761 100644
--- a/tools/lib/bpf/bpf_core_read.h
+++ b/tools/lib/bpf/bpf_core_read.h
@@ -217,7 +217,7 @@ enum bpf_field_info_kind {
  */
 #define BPF_CORE_READ_INTO(dst, src, a, ...)				    \
 	({								    \
-		___core_read(bpf_core_read, dst, src, a, ##__VA_ARGS__)	    \
+		___core_read(bpf_core_read, dst, (src), a, ##__VA_ARGS__)   \
 	})
 
 /*
@@ -227,7 +227,7 @@ enum bpf_field_info_kind {
  */
 #define BPF_CORE_READ_STR_INTO(dst, src, a, ...)			    \
 	({								    \
-		___core_read(bpf_core_read_str, dst, src, a, ##__VA_ARGS__) \
+		___core_read(bpf_core_read_str, dst, (src), a, ##__VA_ARGS__)\
 	})
 
 /*
@@ -254,8 +254,8 @@ enum bpf_field_info_kind {
  */
 #define BPF_CORE_READ(src, a, ...)					    \
 	({								    \
-		___type(src, a, ##__VA_ARGS__) __r;			    \
-		BPF_CORE_READ_INTO(&__r, src, a, ##__VA_ARGS__);	    \
+		___type((src), a, ##__VA_ARGS__) __r;			    \
+		BPF_CORE_READ_INTO(&__r, (src), a, ##__VA_ARGS__);	    \
 		__r;							    \
 	})
 
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [RFC PATCH bpf-next 8/8] tools/bpftool: show PIDs with FDs open against BPF map/prog/link/btf
  2020-06-12 22:31 [RFC PATCH bpf-next 0/8] libbpf ksym support and bpftool show PIDs Andrii Nakryiko
                   ` (6 preceding siblings ...)
  2020-06-12 22:31 ` [RFC PATCH bpf-next 7/8] libbpf: wrap source argument of BPF_CORE_READ macro in parentheses Andrii Nakryiko
@ 2020-06-12 22:31 ` Andrii Nakryiko
  2020-06-13  3:45   ` Alexei Starovoitov
  7 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2020-06-12 22:31 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel
  Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko, Hao Luo,
	Arnaldo Carvalho de Melo, Song Liu, Quentin Monnet

Add bpf_iter-based way to find all the processes that hold open FDs against
BPF object (map, prog, link, btf). Add new flag (-o, for "ownership", given
-p is already taken) to trigger collection and output of these PIDs.

Sample output for each of 4 BPF objects:

$ sudo ./bpftool -o prog show
1992: cgroup_skb  name egress_alt  tag 9ad187367cf2b9e8  gpl
        loaded_at 2020-06-12T14:18:10-0700  uid 0
        xlated 48B  jited 59B  memlock 4096B  map_ids 2074
        btf_id 460
        pids: 913709,913732,913733,913734
2062: cgroup_device  tag 8c42dee26e8cd4c2  gpl
        loaded_at 2020-06-12T14:37:52-0700  uid 0
        xlated 648B  jited 409B  memlock 4096B
        pids: 1

$ sudo ./bpftool -o map show
2074: array  name test_cgr.bss  flags 0x400
        key 4B  value 8B  max_entries 1  memlock 8192B
        btf_id 460
        pids: 913709,913732,913733,913734

$ sudo ./bpftool -o link show
82: cgroup  prog 1992
        cgroup_id 0  attach_type egress
        pids: 913709,913732,913733,913734
86: cgroup  prog 1992
        cgroup_id 0  attach_type egress
        pids: 913709,913732,913733,913734

$ sudo ./bpftool -o btf show
460: size 1527B  prog_ids 1992,1991  map_ids 2074  pids 913709,913732,913733,913734

Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/bpf/bpftool/Makefile                |   2 +
 tools/bpf/bpftool/btf.c                   |  39 ++++++
 tools/bpf/bpftool/link.c                  |  40 ++++++
 tools/bpf/bpftool/main.c                  |   8 +-
 tools/bpf/bpftool/main.h                  |  18 +++
 tools/bpf/bpftool/map.c                   |  39 ++++++
 tools/bpf/bpftool/pids.c                  | 150 ++++++++++++++++++++++
 tools/bpf/bpftool/prog.c                  |  40 ++++++
 tools/bpf/bpftool/skeleton/pid_iter.bpf.c |  77 +++++++++++
 9 files changed, 412 insertions(+), 1 deletion(-)
 create mode 100644 tools/bpf/bpftool/pids.c
 create mode 100644 tools/bpf/bpftool/skeleton/pid_iter.bpf.c

diff --git a/tools/bpf/bpftool/Makefile b/tools/bpf/bpftool/Makefile
index bdb6e38c6c5c..06f436e8191a 100644
--- a/tools/bpf/bpftool/Makefile
+++ b/tools/bpf/bpftool/Makefile
@@ -150,6 +150,8 @@ $(OUTPUT)%.skel.h: $(OUTPUT)%.bpf.o $(BPFTOOL_BOOTSTRAP)
 
 $(OUTPUT)prog.o: $(OUTPUT)profiler.skel.h
 
+$(OUTPUT)pids.o: $(OUTPUT)pid_iter.skel.h
+
 endif
 endif
 
diff --git a/tools/bpf/bpftool/btf.c b/tools/bpf/bpftool/btf.c
index faac8189b285..448c89d1faee 100644
--- a/tools/bpf/bpftool/btf.c
+++ b/tools/bpf/bpftool/btf.c
@@ -809,6 +809,22 @@ show_btf_plain(struct bpf_btf_info *info, int fd,
 			printf("%s%u", n++ == 0 ? "  map_ids " : ",",
 			       obj->obj_id);
 	}
+	if (!hash_empty(pids_table.table)) {
+		struct obj_pids *pids;
+		int i;
+
+		hash_for_each_possible(pids_table.table, pids, node, info->id) {
+			if (pids->id != info->id)
+				continue;
+			if (pids->pid_cnt == 0)
+				break;
+
+			printf("  pids");
+			for (i = 0; i < pids->pid_cnt; i++)
+				printf("%c%d", i == 0 ? ' ' : ',', pids->pids[i]);
+			break;
+		}
+	}
 
 	printf("\n");
 }
@@ -841,6 +857,25 @@ show_btf_json(struct bpf_btf_info *info, int fd,
 			jsonw_uint(json_wtr, obj->obj_id);
 	}
 	jsonw_end_array(json_wtr);	/* map_ids */
+	/* PIDs */
+	if (!hash_empty(pids_table.table)) {
+		struct obj_pids *pids;
+		int i;
+
+		hash_for_each_possible(pids_table.table, pids, node, info->id) {
+			if (pids->id != info->id)
+				continue;
+			if (pids->pid_cnt == 0)
+				break;
+
+			jsonw_name(json_wtr, "pids");
+			jsonw_start_array(json_wtr);
+			for (i = 0; i < pids->pid_cnt; i++)
+				jsonw_int(json_wtr, pids->pids[i]);
+			jsonw_end_array(json_wtr);
+			break;
+		}
+	}
 	jsonw_end_object(json_wtr);	/* btf object */
 }
 
@@ -893,6 +928,8 @@ static int do_show(int argc, char **argv)
 			close(fd);
 		return err;
 	}
+	if (show_pids)
+		build_obj_pids_table(&pids_table, BPF_OBJ_BTF);
 
 	if (fd >= 0) {
 		err = show_btf(fd, &btf_prog_table, &btf_map_table);
@@ -939,6 +976,8 @@ static int do_show(int argc, char **argv)
 exit_free:
 	delete_btf_table(&btf_prog_table);
 	delete_btf_table(&btf_map_table);
+	if (show_pids)
+		delete_obj_pids_table(&pids_table);
 
 	return err;
 }
diff --git a/tools/bpf/bpftool/link.c b/tools/bpf/bpftool/link.c
index fca57ee8fafe..d959ce838dfb 100644
--- a/tools/bpf/bpftool/link.c
+++ b/tools/bpf/bpftool/link.c
@@ -143,6 +143,25 @@ static int show_link_close_json(int fd, struct bpf_link_info *info)
 		}
 		jsonw_end_array(json_wtr);
 	}
+	if (!hash_empty(pids_table.table)) {
+		struct obj_pids *pids;
+		int i;
+
+		hash_for_each_possible(pids_table.table, pids, node, info->id) {
+			if (pids->id != info->id)
+				continue;
+			if (pids->pid_cnt == 0)
+				break;
+
+			jsonw_name(json_wtr, "pids");
+			jsonw_start_array(json_wtr);
+			for (i = 0; i < pids->pid_cnt; i++)
+				jsonw_int(json_wtr, pids->pids[i]);
+			jsonw_end_array(json_wtr);
+			break;
+		}
+	}
+
 	jsonw_end_object(json_wtr);
 
 	return 0;
@@ -212,6 +231,22 @@ static int show_link_close_plain(int fd, struct bpf_link_info *info)
 				printf("\n\tpinned %s", obj->path);
 		}
 	}
+	if (!hash_empty(pids_table.table)) {
+		struct obj_pids *pids;
+		int i;
+
+		hash_for_each_possible(pids_table.table, pids, node, info->id) {
+			if (pids->id != info->id)
+				continue;
+			if (pids->pid_cnt == 0)
+				break;
+
+			printf("\n\tpids:");
+			for (i = 0; i < pids->pid_cnt; i++)
+				printf("%c%d", i == 0 ? ' ' : ',', pids->pids[i]);
+			break;
+		}
+	}
 
 	printf("\n");
 
@@ -257,6 +292,8 @@ static int do_show(int argc, char **argv)
 
 	if (show_pinned)
 		build_pinned_obj_table(&link_table, BPF_OBJ_LINK);
+	if (show_pids)
+		build_obj_pids_table(&pids_table, BPF_OBJ_LINK);
 
 	if (argc == 2) {
 		fd = link_parse_fd(&argc, &argv);
@@ -296,6 +333,9 @@ static int do_show(int argc, char **argv)
 	if (json_output)
 		jsonw_end_array(json_wtr);
 
+	if (show_pids)
+		delete_obj_pids_table(&pids_table);
+
 	return errno == ENOENT ? 0 : -1;
 }
 
diff --git a/tools/bpf/bpftool/main.c b/tools/bpf/bpftool/main.c
index bf4d7487552a..53a546e1d9a8 100644
--- a/tools/bpf/bpftool/main.c
+++ b/tools/bpf/bpftool/main.c
@@ -25,12 +25,14 @@ json_writer_t *json_wtr;
 bool pretty_output;
 bool json_output;
 bool show_pinned;
+bool show_pids;
 bool block_mount;
 bool verifier_logs;
 bool relaxed_maps;
 struct pinned_obj_table prog_table;
 struct pinned_obj_table map_table;
 struct pinned_obj_table link_table;
+struct obj_pids_table pids_table;
 
 static void __noreturn clean_and_exit(int i)
 {
@@ -369,6 +371,7 @@ int main(int argc, char **argv)
 	pretty_output = false;
 	json_output = false;
 	show_pinned = false;
+	show_pids = false;
 	block_mount = false;
 	bin_name = argv[0];
 
@@ -377,7 +380,7 @@ int main(int argc, char **argv)
 	hash_init(link_table.table);
 
 	opterr = 0;
-	while ((opt = getopt_long(argc, argv, "Vhpjfmnd",
+	while ((opt = getopt_long(argc, argv, "Vhpjfomnd",
 				  options, NULL)) >= 0) {
 		switch (opt) {
 		case 'V':
@@ -401,6 +404,9 @@ int main(int argc, char **argv)
 		case 'f':
 			show_pinned = true;
 			break;
+		case 'o':
+			show_pids = true;
+			break;
 		case 'm':
 			relaxed_maps = true;
 			break;
diff --git a/tools/bpf/bpftool/main.h b/tools/bpf/bpftool/main.h
index aad7be74e8a7..286134b05687 100644
--- a/tools/bpf/bpftool/main.h
+++ b/tools/bpf/bpftool/main.h
@@ -127,11 +127,13 @@ static const char * const attach_type_name[__MAX_BPF_ATTACH_TYPE] = {
 extern const char * const map_type_name[];
 extern const size_t map_type_name_size;
 
+/* keep in sync with the definition in skeleton/pid_iter.bpf.c */
 enum bpf_obj_type {
 	BPF_OBJ_UNKNOWN,
 	BPF_OBJ_PROG,
 	BPF_OBJ_MAP,
 	BPF_OBJ_LINK,
+	BPF_OBJ_BTF,
 };
 
 extern const char *bin_name;
@@ -139,12 +141,14 @@ extern const char *bin_name;
 extern json_writer_t *json_wtr;
 extern bool json_output;
 extern bool show_pinned;
+extern bool show_pids;
 extern bool block_mount;
 extern bool verifier_logs;
 extern bool relaxed_maps;
 extern struct pinned_obj_table prog_table;
 extern struct pinned_obj_table map_table;
 extern struct pinned_obj_table link_table;
+extern struct obj_pids_table pids_table;
 
 void __printf(1, 2) p_err(const char *fmt, ...);
 void __printf(1, 2) p_info(const char *fmt, ...);
@@ -168,12 +172,26 @@ struct pinned_obj {
 	struct hlist_node hash;
 };
 
+struct obj_pids_table {
+	DECLARE_HASHTABLE(table, 16);
+};
+
+struct obj_pids {
+	struct hlist_node node;
+	__u32 id;
+	int pid_cnt;
+	int *pids;
+};
+
 struct btf;
 struct bpf_line_info;
 
 int build_pinned_obj_table(struct pinned_obj_table *table,
 			   enum bpf_obj_type type);
 void delete_pinned_obj_table(struct pinned_obj_table *tab);
+__weak int build_obj_pids_table(struct obj_pids_table *table,
+				enum bpf_obj_type type);
+__weak void delete_obj_pids_table(struct obj_pids_table *table);
 void print_dev_plain(__u32 ifindex, __u64 ns_dev, __u64 ns_inode);
 void print_dev_json(__u32 ifindex, __u64 ns_dev, __u64 ns_inode);
 
diff --git a/tools/bpf/bpftool/map.c b/tools/bpf/bpftool/map.c
index b9eee19b094c..1fc8b674049e 100644
--- a/tools/bpf/bpftool/map.c
+++ b/tools/bpf/bpftool/map.c
@@ -508,6 +508,24 @@ static int show_map_close_json(int fd, struct bpf_map_info *info)
 		}
 		jsonw_end_array(json_wtr);
 	}
+	if (!hash_empty(pids_table.table)) {
+		struct obj_pids *pids;
+		int i;
+
+		hash_for_each_possible(pids_table.table, pids, node, info->id) {
+			if (pids->id != info->id)
+				continue;
+			if (pids->pid_cnt == 0)
+				break;
+
+			jsonw_name(json_wtr, "pids");
+			jsonw_start_array(json_wtr);
+			for (i = 0; i < pids->pid_cnt; i++)
+				jsonw_int(json_wtr, pids->pids[i]);
+			jsonw_end_array(json_wtr);
+			break;
+		}
+	}
 
 	jsonw_end_object(json_wtr);
 
@@ -596,6 +614,22 @@ static int show_map_close_plain(int fd, struct bpf_map_info *info)
 	if (frozen)
 		printf("%sfrozen", info->btf_id ? "  " : "");
 
+	if (!hash_empty(pids_table.table)) {
+		struct obj_pids *pids;
+		int i;
+
+		hash_for_each_possible(pids_table.table, pids, node, info->id) {
+			if (pids->id != info->id)
+				continue;
+			if (pids->pid_cnt == 0)
+				break;
+
+			printf("\n\tpids:");
+			for (i = 0; i < pids->pid_cnt; i++)
+				printf("%c%d", i == 0 ? ' ' : ',', pids->pids[i]);
+			break;
+		}
+	}
 	printf("\n");
 	return 0;
 }
@@ -654,6 +688,8 @@ static int do_show(int argc, char **argv)
 
 	if (show_pinned)
 		build_pinned_obj_table(&map_table, BPF_OBJ_MAP);
+	if (show_pids)
+		build_obj_pids_table(&pids_table, BPF_OBJ_MAP);
 
 	if (argc == 2)
 		return do_show_subset(argc, argv);
@@ -697,6 +733,9 @@ static int do_show(int argc, char **argv)
 	if (json_output)
 		jsonw_end_array(json_wtr);
 
+	if (show_pids)
+		delete_obj_pids_table(&pids_table);
+
 	return errno == ENOENT ? 0 : -1;
 }
 
diff --git a/tools/bpf/bpftool/pids.c b/tools/bpf/bpftool/pids.c
new file mode 100644
index 000000000000..fb0a2d7213b4
--- /dev/null
+++ b/tools/bpf/bpftool/pids.c
@@ -0,0 +1,150 @@
+// SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+/* Copyright (C) 2020 Facebook */
+#include <errno.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+#include <bpf/bpf.h>
+
+#include "main.h"
+
+#ifdef BPFTOOL_WITHOUT_SKELETONS
+
+int build_obj_pids_table(struct obj_pids_table *table, enum bpf_obj_type type)
+{
+	p_err("bpftool built without PID iterator support");
+	return -ENOTSUP;
+}
+void delete_obj_pids_table(struct obj_pids_table *table) {}
+
+#else /* BPFTOOL_WITHOUT_SKELETONS */
+
+#include "pid_iter.skel.h"
+
+int build_obj_pids_table(struct obj_pids_table *table, enum bpf_obj_type type)
+{
+	struct obj_pids *pids;
+	struct pid_iter_bpf *skel;
+	FILE *f = NULL;
+	int err, ret, fd = -1, pid, i;
+	__u32 id;
+	bool found_id, found_pid;
+
+	hash_init(table->table);
+	set_max_rlimit();
+
+	skel = pid_iter_bpf__open();
+	if (!skel) {
+		p_err("failed to open PID iterator");
+		return -1;
+	}
+
+	skel->rodata->obj_type = type;
+
+	err = pid_iter_bpf__load(skel);
+	if (err) {
+		p_err("failed to load PID iterator: %d", err);
+		goto out;
+	}
+	err = pid_iter_bpf__attach(skel);
+	if (err) {
+		p_err("failed to attach PID iterator: %d", err);
+		goto out;
+	}
+
+	fd = bpf_iter_create(bpf_link__fd(skel->links.iter));
+	if (fd < 0) {
+		err = -errno;
+		p_err("failed to create PID iterator session: %d", err);
+		goto out;
+	}
+
+	f = fdopen(fd, "r");
+	if (!f) {
+		err = -errno;
+		goto out;
+	}
+
+	while (true) {
+		ret = fscanf(f, "%d %u\n", &pid, &id);
+		if (ret == EOF && feof(f))
+			break;
+		if (ret != 2) {
+			err = -EINVAL;
+			p_err("invalid PID iterator output format");
+			goto out;
+		}
+
+		found_id = false;
+		hash_for_each_possible(table->table, pids, node, id) {
+			if (pids->id != id)
+				continue;
+			found_id = true;
+
+			found_pid = false;
+			for (i = 0; i < pids->pid_cnt; i++) {
+				if (pids->pids[i] == pid) {
+					found_pid = true;
+					break;
+				}
+			}
+			if (!found_pid) {
+				void *tmp;
+
+				tmp = realloc(pids->pids, pids->pid_cnt + 1);
+				if (!tmp) {
+					p_err("failed to re-alloc memory for ID %u, PID %d...",
+					      id, pid);
+					break;
+				}
+				pids->pids = tmp;
+				pids->pids[pids->pid_cnt] = pid;
+				pids->pid_cnt++;
+			}
+			break;
+		}
+		if (!found_id) {
+			pids = calloc(1, sizeof(*pids));
+			if (!pids) {
+				p_err("failed to alloc memory for ID %u, PID %d...", id, pid);
+				continue;
+			}
+
+			pids->id = id;
+			pids->pids = malloc(sizeof(*pids->pids));
+			if (!pids->pids) {
+				free(pids);
+				p_err("failed to alloc memory for ID %u, PID %d...", id, pid);
+				continue;
+			}
+			pids->pids[0] = pid;
+			pids->pid_cnt = 1;
+			hash_add(table->table, &pids->node, id);
+		}
+	}
+	err = 0;
+out:
+	if (f)
+		fclose(f);
+	else if (fd >= 0)
+		close(fd);
+	pid_iter_bpf__destroy(skel);
+	return err;
+}
+
+void delete_obj_pids_table(struct obj_pids_table *table)
+{
+	struct obj_pids *pids;
+	struct hlist_node *tmp;
+	unsigned int bkt;
+
+	hash_for_each_safe(table->table, bkt, tmp, pids, node) {
+		hash_del(&pids->node);
+		free(pids->pids);
+		free(pids);
+	}
+}
+
+#endif
diff --git a/tools/bpf/bpftool/prog.c b/tools/bpf/bpftool/prog.c
index 53d47610ff58..6b430f9af2af 100644
--- a/tools/bpf/bpftool/prog.c
+++ b/tools/bpf/bpftool/prog.c
@@ -190,6 +190,24 @@ static void print_prog_json(struct bpf_prog_info *info, int fd)
 		jsonw_end_array(json_wtr);
 	}
 
+	if (!hash_empty(pids_table.table)) {
+		struct obj_pids *pids;
+		int i;
+
+		hash_for_each_possible(pids_table.table, pids, node, info->id) {
+			if (pids->id != info->id)
+				continue;
+			if (pids->pid_cnt == 0)
+				break;
+
+			jsonw_name(json_wtr, "pids");
+			jsonw_start_array(json_wtr);
+			for (i = 0; i < pids->pid_cnt; i++)
+				jsonw_int(json_wtr, pids->pids[i]);
+			jsonw_end_array(json_wtr);
+			break;
+		}
+	}
 	jsonw_end_object(json_wtr);
 }
 
@@ -256,6 +274,23 @@ static void print_prog_plain(struct bpf_prog_info *info, int fd)
 	if (info->btf_id)
 		printf("\n\tbtf_id %d", info->btf_id);
 
+	if (!hash_empty(pids_table.table)) {
+		struct obj_pids *pids;
+		int i;
+
+		hash_for_each_possible(pids_table.table, pids, node, info->id) {
+			if (pids->id != info->id)
+				continue;
+			if (pids->pid_cnt == 0)
+				break;
+
+			printf("\n\tpids:");
+			for (i = 0; i < pids->pid_cnt; i++)
+				printf("%c%d", i == 0 ? ' ' : ',', pids->pids[i]);
+			break;
+		}
+	}
+
 	printf("\n");
 }
 
@@ -321,6 +356,8 @@ static int do_show(int argc, char **argv)
 
 	if (show_pinned)
 		build_pinned_obj_table(&prog_table, BPF_OBJ_PROG);
+	if (show_pids)
+		build_obj_pids_table(&pids_table, BPF_OBJ_PROG);
 
 	if (argc == 2)
 		return do_show_subset(argc, argv);
@@ -362,6 +399,9 @@ static int do_show(int argc, char **argv)
 	if (json_output)
 		jsonw_end_array(json_wtr);
 
+	if (show_pids)
+		delete_obj_pids_table(&pids_table);
+
 	return err;
 }
 
diff --git a/tools/bpf/bpftool/skeleton/pid_iter.bpf.c b/tools/bpf/bpftool/skeleton/pid_iter.bpf.c
new file mode 100644
index 000000000000..9eedf5e56aa7
--- /dev/null
+++ b/tools/bpf/bpftool/skeleton/pid_iter.bpf.c
@@ -0,0 +1,77 @@
+// SPDX-License-Identifier: GPL-2.0
+// Copyright (c) 2020 Facebook
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_core_read.h>
+#include <bpf/bpf_tracing.h>
+
+/* keep in sync with the definition in main.h */
+enum bpf_obj_type {
+	BPF_OBJ_UNKNOWN,
+	BPF_OBJ_PROG,
+	BPF_OBJ_MAP,
+	BPF_OBJ_LINK,
+	BPF_OBJ_BTF,
+};
+
+extern const void bpf_link_fops __ksym;
+extern const void bpf_map_fops __ksym;
+extern const void bpf_prog_fops __ksym;
+extern const void btf_fops __ksym;
+
+const volatile enum bpf_obj_type obj_type = BPF_OBJ_UNKNOWN;
+
+static __always_inline __u32 get_obj_id(void *ent, enum bpf_obj_type type)
+{
+	switch (type) {
+	case BPF_OBJ_PROG:
+		return BPF_CORE_READ((struct bpf_prog *)ent, aux, id);
+	case BPF_OBJ_MAP:
+		return BPF_CORE_READ((struct bpf_map *)ent, id);
+	case BPF_OBJ_BTF:
+		return BPF_CORE_READ((struct btf *)ent, id);
+	case BPF_OBJ_LINK:
+		return BPF_CORE_READ((struct bpf_link *)ent, id);
+	default:
+		return 0;
+	}
+}
+
+SEC("iter/task_file")
+int iter(struct bpf_iter__task_file *ctx)
+{
+	struct file *file = ctx->file;
+	struct task_struct *task = ctx->task;
+	const void *fops;
+	int id;
+
+	if (!file || !task)
+		return 0;
+
+	switch (obj_type) {
+	case BPF_OBJ_PROG:
+		fops = &bpf_prog_fops;
+		break;
+	case BPF_OBJ_MAP:
+		fops = &bpf_map_fops;
+		break;
+	case BPF_OBJ_BTF:
+		fops = &btf_fops;
+		break;
+	case BPF_OBJ_LINK:
+		fops = &bpf_link_fops;
+		break;
+	default:
+		return 0;
+	}
+
+	if (file->f_op != fops)
+		return 0;
+
+	id = get_obj_id(file->private_data, obj_type);
+	BPF_SEQ_PRINTF(ctx->meta->seq, "%d %d\n", task->tgid, id);
+
+	return 0;
+}
+
+char LICENSE[] SEC("license") = "GPL";
-- 
2.24.1


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH bpf-next 8/8] tools/bpftool: show PIDs with FDs open against BPF map/prog/link/btf
  2020-06-12 22:31 ` [RFC PATCH bpf-next 8/8] tools/bpftool: show PIDs with FDs open against BPF map/prog/link/btf Andrii Nakryiko
@ 2020-06-13  3:45   ` Alexei Starovoitov
  2020-06-13  5:57     ` Andrii Nakryiko
  0 siblings, 1 reply; 21+ messages in thread
From: Alexei Starovoitov @ 2020-06-13  3:45 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, netdev, ast, daniel, andrii.nakryiko, kernel-team, Hao Luo,
	Arnaldo Carvalho de Melo, Song Liu, Quentin Monnet

On Fri, Jun 12, 2020 at 03:31:50PM -0700, Andrii Nakryiko wrote:
> Add bpf_iter-based way to find all the processes that hold open FDs against
> BPF object (map, prog, link, btf). Add new flag (-o, for "ownership", given
> -p is already taken) to trigger collection and output of these PIDs.
> 
> Sample output for each of 4 BPF objects:
> 
> $ sudo ./bpftool -o prog show
> 1992: cgroup_skb  name egress_alt  tag 9ad187367cf2b9e8  gpl
>         loaded_at 2020-06-12T14:18:10-0700  uid 0
>         xlated 48B  jited 59B  memlock 4096B  map_ids 2074
>         btf_id 460
>         pids: 913709,913732,913733,913734
> 2062: cgroup_device  tag 8c42dee26e8cd4c2  gpl
>         loaded_at 2020-06-12T14:37:52-0700  uid 0
>         xlated 648B  jited 409B  memlock 4096B
>         pids: 1
> 
> $ sudo ./bpftool -o map show
> 2074: array  name test_cgr.bss  flags 0x400
>         key 4B  value 8B  max_entries 1  memlock 8192B
>         btf_id 460
>         pids: 913709,913732,913733,913734
> 
> $ sudo ./bpftool -o link show
> 82: cgroup  prog 1992
>         cgroup_id 0  attach_type egress
>         pids: 913709,913732,913733,913734
> 86: cgroup  prog 1992
>         cgroup_id 0  attach_type egress
>         pids: 913709,913732,913733,913734

This is awesome.

Why extra flag though? I think it's so useful that everyone would want to see
this by default. Also the word 'pid' has kernel meaning or user space meaning?
Looks like kernel then bpftool should say 'tid'.
Could you capture comm as well and sort it by comm, like:

$ sudo ./bpftool link show
82: cgroup  prog 1992
        cgroup_id 0  attach_type egress
        systemd(1), firewall(913709 913732), logger(913733 913734)

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH bpf-next 8/8] tools/bpftool: show PIDs with FDs open against BPF map/prog/link/btf
  2020-06-13  3:45   ` Alexei Starovoitov
@ 2020-06-13  5:57     ` Andrii Nakryiko
  2020-06-13 22:14       ` Arnaldo Carvalho de Melo
  0 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2020-06-13  5:57 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, Hao Luo, Arnaldo Carvalho de Melo,
	Song Liu, Quentin Monnet

On Fri, Jun 12, 2020 at 8:45 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> On Fri, Jun 12, 2020 at 03:31:50PM -0700, Andrii Nakryiko wrote:
> > Add bpf_iter-based way to find all the processes that hold open FDs against
> > BPF object (map, prog, link, btf). Add new flag (-o, for "ownership", given
> > -p is already taken) to trigger collection and output of these PIDs.
> >
> > Sample output for each of 4 BPF objects:
> >
> > $ sudo ./bpftool -o prog show
> > 1992: cgroup_skb  name egress_alt  tag 9ad187367cf2b9e8  gpl
> >         loaded_at 2020-06-12T14:18:10-0700  uid 0
> >         xlated 48B  jited 59B  memlock 4096B  map_ids 2074
> >         btf_id 460
> >         pids: 913709,913732,913733,913734
> > 2062: cgroup_device  tag 8c42dee26e8cd4c2  gpl
> >         loaded_at 2020-06-12T14:37:52-0700  uid 0
> >         xlated 648B  jited 409B  memlock 4096B
> >         pids: 1
> >
> > $ sudo ./bpftool -o map show
> > 2074: array  name test_cgr.bss  flags 0x400
> >         key 4B  value 8B  max_entries 1  memlock 8192B
> >         btf_id 460
> >         pids: 913709,913732,913733,913734
> >
> > $ sudo ./bpftool -o link show
> > 82: cgroup  prog 1992
> >         cgroup_id 0  attach_type egress
> >         pids: 913709,913732,913733,913734
> > 86: cgroup  prog 1992
> >         cgroup_id 0  attach_type egress
> >         pids: 913709,913732,913733,913734
>
> This is awesome.

Thanks.

>
> Why extra flag though? I think it's so useful that everyone would want to see

No good reason apart from "being safe by default". If turned on by
default, bpftool would need to probe for bpf_iter support first. I can
add probing and do this by default.

> this by default. Also the word 'pid' has kernel meaning or user space meaning?
> Looks like kernel then bpftool should say 'tid'.

No, its process ID in user-space sense. See task->tgid in
pid_iter.bpf.c. I figured thread ID isn't all that useful.

> Could you capture comm as well and sort it by comm, like:
>
> $ sudo ./bpftool link show
> 82: cgroup  prog 1992
>         cgroup_id 0  attach_type egress
>         systemd(1), firewall(913709 913732), logger(913733 913734)

Yep, comm is useful, I'll add that. Grouping by comm is kind of a
pain, though, plus usually there will be one process only. So let me
start with doing comm (pid) for each PID independently. I think that
will be as good in practice.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH bpf-next 8/8] tools/bpftool: show PIDs with FDs open against BPF map/prog/link/btf
  2020-06-13  5:57     ` Andrii Nakryiko
@ 2020-06-13 22:14       ` Arnaldo Carvalho de Melo
  2020-06-15  9:04         ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 21+ messages in thread
From: Arnaldo Carvalho de Melo @ 2020-06-13 22:14 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, Networking,
	Alexei Starovoitov, Daniel Borkmann, Kernel Team, Hao Luo,
	Song Liu, Quentin Monnet

Em Fri, Jun 12, 2020 at 10:57:59PM -0700, Andrii Nakryiko escreveu:
> On Fri, Jun 12, 2020 at 8:45 PM Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> >
> > On Fri, Jun 12, 2020 at 03:31:50PM -0700, Andrii Nakryiko wrote:
> > > Add bpf_iter-based way to find all the processes that hold open FDs against
> > > BPF object (map, prog, link, btf). Add new flag (-o, for "ownership", given
> > > -p is already taken) to trigger collection and output of these PIDs.
> > >
> > > Sample output for each of 4 BPF objects:
> > >
> > > $ sudo ./bpftool -o prog show
> > > 1992: cgroup_skb  name egress_alt  tag 9ad187367cf2b9e8  gpl
> > >         loaded_at 2020-06-12T14:18:10-0700  uid 0
> > >         xlated 48B  jited 59B  memlock 4096B  map_ids 2074
> > >         btf_id 460
> > >         pids: 913709,913732,913733,913734
> > > 2062: cgroup_device  tag 8c42dee26e8cd4c2  gpl
> > >         loaded_at 2020-06-12T14:37:52-0700  uid 0
> > >         xlated 648B  jited 409B  memlock 4096B
> > >         pids: 1
> > >
> > > $ sudo ./bpftool -o map show
> > > 2074: array  name test_cgr.bss  flags 0x400
> > >         key 4B  value 8B  max_entries 1  memlock 8192B
> > >         btf_id 460
> > >         pids: 913709,913732,913733,913734
> > >
> > > $ sudo ./bpftool -o link show
> > > 82: cgroup  prog 1992
> > >         cgroup_id 0  attach_type egress
> > >         pids: 913709,913732,913733,913734
> > > 86: cgroup  prog 1992
> > >         cgroup_id 0  attach_type egress
> > >         pids: 913709,913732,913733,913734
> >
> > This is awesome.

Indeed.
 
> Thanks.
> 
> >
> > Why extra flag though? I think it's so useful that everyone would want to see

Agreed.
 
> No good reason apart from "being safe by default". If turned on by
> default, bpftool would need to probe for bpf_iter support first. I can
> add probing and do this by default.

I think this is the way to go.
 
> > this by default. Also the word 'pid' has kernel meaning or user space meaning?
> > Looks like kernel then bpftool should say 'tid'.
> 
> No, its process ID in user-space sense. See task->tgid in
> pid_iter.bpf.c. I figured thread ID isn't all that useful.
> 
> > Could you capture comm as well and sort it by comm, like:
> >
> > $ sudo ./bpftool link show
> > 82: cgroup  prog 1992
> >         cgroup_id 0  attach_type egress
> >         systemd(1), firewall(913709 913732), logger(913733 913734)
> 
> Yep, comm is useful, I'll add that. Grouping by comm is kind of a
> pain, though, plus usually there will be one process only. So let me
> start with doing comm (pid) for each PID independently. I think that
> will be as good in practice.

-- 

- Arnaldo

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH bpf-next 8/8] tools/bpftool: show PIDs with FDs open against BPF map/prog/link/btf
  2020-06-13 22:14       ` Arnaldo Carvalho de Melo
@ 2020-06-15  9:04         ` Toke Høiland-Jørgensen
  2020-06-15  9:30           ` Quentin Monnet
  0 siblings, 1 reply; 21+ messages in thread
From: Toke Høiland-Jørgensen @ 2020-06-15  9:04 UTC (permalink / raw)
  To: Arnaldo Carvalho de Melo, Andrii Nakryiko
  Cc: Alexei Starovoitov, Andrii Nakryiko, bpf, Networking,
	Alexei Starovoitov, Daniel Borkmann, Kernel Team, Hao Luo,
	Song Liu, Quentin Monnet

Arnaldo Carvalho de Melo <acme@kernel.org> writes:

> Em Fri, Jun 12, 2020 at 10:57:59PM -0700, Andrii Nakryiko escreveu:
>> On Fri, Jun 12, 2020 at 8:45 PM Alexei Starovoitov
>> <alexei.starovoitov@gmail.com> wrote:
>> >
>> > On Fri, Jun 12, 2020 at 03:31:50PM -0700, Andrii Nakryiko wrote:
>> > > Add bpf_iter-based way to find all the processes that hold open FDs against
>> > > BPF object (map, prog, link, btf). Add new flag (-o, for "ownership", given
>> > > -p is already taken) to trigger collection and output of these PIDs.
>> > >
>> > > Sample output for each of 4 BPF objects:
>> > >
>> > > $ sudo ./bpftool -o prog show
>> > > 1992: cgroup_skb  name egress_alt  tag 9ad187367cf2b9e8  gpl
>> > >         loaded_at 2020-06-12T14:18:10-0700  uid 0
>> > >         xlated 48B  jited 59B  memlock 4096B  map_ids 2074
>> > >         btf_id 460
>> > >         pids: 913709,913732,913733,913734
>> > > 2062: cgroup_device  tag 8c42dee26e8cd4c2  gpl
>> > >         loaded_at 2020-06-12T14:37:52-0700  uid 0
>> > >         xlated 648B  jited 409B  memlock 4096B
>> > >         pids: 1
>> > >
>> > > $ sudo ./bpftool -o map show
>> > > 2074: array  name test_cgr.bss  flags 0x400
>> > >         key 4B  value 8B  max_entries 1  memlock 8192B
>> > >         btf_id 460
>> > >         pids: 913709,913732,913733,913734
>> > >
>> > > $ sudo ./bpftool -o link show
>> > > 82: cgroup  prog 1992
>> > >         cgroup_id 0  attach_type egress
>> > >         pids: 913709,913732,913733,913734
>> > > 86: cgroup  prog 1992
>> > >         cgroup_id 0  attach_type egress
>> > >         pids: 913709,913732,913733,913734
>> >
>> > This is awesome.
>
> Indeed.
>  
>> Thanks.
>> 
>> >
>> > Why extra flag though? I think it's so useful that everyone would want to see
>
> Agreed.
>  
>> No good reason apart from "being safe by default". If turned on by
>> default, bpftool would need to probe for bpf_iter support first. I can
>> add probing and do this by default.
>
> I think this is the way to go.

+1

And also +1 on the awesomeness of this feature! :)

-Toke


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH bpf-next 8/8] tools/bpftool: show PIDs with FDs open against BPF map/prog/link/btf
  2020-06-15  9:04         ` Toke Høiland-Jørgensen
@ 2020-06-15  9:30           ` Quentin Monnet
  0 siblings, 0 replies; 21+ messages in thread
From: Quentin Monnet @ 2020-06-15  9:30 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Toke Høiland-Jørgensen, Arnaldo Carvalho de Melo,
	Alexei Starovoitov, Andrii Nakryiko, bpf, Networking,
	Alexei Starovoitov, Daniel Borkmann, Kernel Team, Hao Luo,
	Song Liu

2020-06-15 11:04 UTC+0200 ~ Toke Høiland-Jørgensen <toke@redhat.com>
> Arnaldo Carvalho de Melo <acme@kernel.org> writes:
> 
>> Em Fri, Jun 12, 2020 at 10:57:59PM -0700, Andrii Nakryiko escreveu:
>>> On Fri, Jun 12, 2020 at 8:45 PM Alexei Starovoitov
>>> <alexei.starovoitov@gmail.com> wrote:
>>>>
>>>> On Fri, Jun 12, 2020 at 03:31:50PM -0700, Andrii Nakryiko wrote:
>>>>> Add bpf_iter-based way to find all the processes that hold open FDs against
>>>>> BPF object (map, prog, link, btf). Add new flag (-o, for "ownership", given
>>>>> -p is already taken) to trigger collection and output of these PIDs.
>>>>>
>>>>> Sample output for each of 4 BPF objects:
>>>>>
>>>>> $ sudo ./bpftool -o prog show
>>>>> 1992: cgroup_skb  name egress_alt  tag 9ad187367cf2b9e8  gpl
>>>>>         loaded_at 2020-06-12T14:18:10-0700  uid 0
>>>>>         xlated 48B  jited 59B  memlock 4096B  map_ids 2074
>>>>>         btf_id 460
>>>>>         pids: 913709,913732,913733,913734
>>>>> 2062: cgroup_device  tag 8c42dee26e8cd4c2  gpl
>>>>>         loaded_at 2020-06-12T14:37:52-0700  uid 0
>>>>>         xlated 648B  jited 409B  memlock 4096B
>>>>>         pids: 1
>>>>>
>>>>> $ sudo ./bpftool -o map show
>>>>> 2074: array  name test_cgr.bss  flags 0x400
>>>>>         key 4B  value 8B  max_entries 1  memlock 8192B
>>>>>         btf_id 460
>>>>>         pids: 913709,913732,913733,913734
>>>>>
>>>>> $ sudo ./bpftool -o link show
>>>>> 82: cgroup  prog 1992
>>>>>         cgroup_id 0  attach_type egress
>>>>>         pids: 913709,913732,913733,913734
>>>>> 86: cgroup  prog 1992
>>>>>         cgroup_id 0  attach_type egress
>>>>>         pids: 913709,913732,913733,913734
>>>>
>>>> This is awesome.
>>
>> Indeed.
>>  
>>> Thanks.
>>>
>>>>
>>>> Why extra flag though? I think it's so useful that everyone would want to see
>>
>> Agreed.
>>  
>>> No good reason apart from "being safe by default". If turned on by
>>> default, bpftool would need to probe for bpf_iter support first. I can
>>> add probing and do this by default.
>>
>> I think this is the way to go.
> 
> +1
> 
> And also +1 on the awesomeness of this feature! :)
> 
> -Toke
> 

Thanks a lot Andrii, the feature looks great indeed.

Thank you for the clean-up and refactoring in bpftool and its Makefile
as well, I am happy to confirm that test_bpftool_build.sh still passes
on my setup with your changes.

Quentin

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH bpf-next 1/8] libbpf: generalize libbpf externs support
       [not found]   ` <CA+khW7hAYVdoQX5-j0z1iGEVZeww4BBu4NXzy5eS5OwDRYqe2w@mail.gmail.com>
@ 2020-06-15 18:55     ` Andrii Nakryiko
  0 siblings, 0 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2020-06-15 18:55 UTC (permalink / raw)
  To: Hao Luo
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, Arnaldo Carvalho de Melo, Song Liu,
	Quentin Monnet

On Mon, Jun 15, 2020 at 9:44 AM Hao Luo <haoluo@google.com> wrote:
>
> Andrii,
>
> Thanks for this patch, it looks very nice! Decoupling kconfig from generic externs is much needed.
>
> On Fri, Jun 12, 2020 at 3:34 PM Andrii Nakryiko <andriin@fb.com> wrote:
>>
>> Switch existing Kconfig externs to be just one of few possible kinds of more
>> generic externs. This refactoring is in preparation for ksymbol extern
>> support, added in the follow up patch. There are no functional changes
>> intended.
>>
>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>> ---
>>  tools/lib/bpf/libbpf.c | 332 ++++++++++++++++++++++++-----------------
>>  1 file changed, 199 insertions(+), 133 deletions(-)
>>

[...]

>> @@ -1443,12 +1454,12 @@ static int set_ext_value_tri(struct extern_desc *ext, void *ext_val,
>>                 else /* value == 'n' */
>>                         *(enum libbpf_tristate *)ext_val = TRI_NO;
>>                 break;
>> -       case EXT_CHAR:
>> +       case KCFG_CHAR:
>>                 *(char *)ext_val = value;
>>                 break;
>> -       case EXT_UNKNOWN:
>> -       case EXT_INT:
>> -       case EXT_CHAR_ARR:
>> +       case KCFG_UNKNOWN:
>> +       case KCFG_INT:
>> +       case KCFG_CHAR_ARR:
>>         default:
>>                 pr_warn("extern %s=%c should be bool, tristate, or char\n",
>>                         ext->name, value);
>
>
> Very minor: pr_warn("kconfig extern ..."); I noticed you have one similar message changed below.
>

yeah, good catch, I'll update

>>
>> @@ -1458,12 +1469,12 @@ static int set_ext_value_tri(struct extern_desc *ext, void *ext_val,
>>         return 0;
>>  }
>>

for the future, please cut irrelevant parts of the patch, makes it
easier to see where your replies are

[...]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH bpf-next 2/8] libbpf: add support for extracting kernel symbol addresses
       [not found]   ` <CA+khW7hFZzp_K_xydSFw0O3LYB22_fC=Z4wG7i9Si+phGHn4cQ@mail.gmail.com>
@ 2020-06-15 19:08     ` Andrii Nakryiko
  2020-06-16  8:05       ` Hao Luo
  0 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2020-06-15 19:08 UTC (permalink / raw)
  To: Hao Luo
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, Arnaldo Carvalho de Melo, Song Liu,
	Quentin Monnet

On Mon, Jun 15, 2020 at 9:44 AM Hao Luo <haoluo@google.com> wrote:
>
> Thanks, Andrii,
>
> This change looks nice! A couple of comments:
>
> 1. A 'void' type variable looks slightly odd from a user's perspective. How about using 'u64' or 'void *'? Or at least, a named type, which aliases to 'void'?

That choice is very deliberate one. `extern const void` is the right
way in C language to access linker-generated symbols, for instance,
which is quite similar to what the intent is her. Having void type is
very explicit that you don't know/care about that value pointed to by
extern address, the only operation you can perform is to get it's
address.

Once we add kernel variables support, that's when types will start to
be specified and libbpf will do extra checks (type matching) and extra
work (generating ldimm64 with BTF ID, for instance), to allow C code
to access data pointed to by extern address.

Switching type to u64 would be misleading in allowing C code to
implicitly dereference value of extern. E.g., there is a big
difference between:

extern u64 bla;

printf("%lld\n", bla); /* de-reference happens here, we get contents
of memory pointed to by "bla" symbol */

printf("%p\n", &bla); /* here we get value of linker symbol/address of
extern variable */

Currently I explicitly support only the latter and want to prevent the
former, until we have kernel variables in BTF. Using `extern void`
makes compiler enforce that only the &bla form is allowed. Everything
else is compilation error.

> 2. About the type size of ksym, IIUC, it looks strange that the values read from kallsyms have 8 bytes but their corresponding vs->size is 4 bytes and vs->type points to 4-byte int. Can we make them of the same size?

That's a bit of a hack on my part. Variable needs to point to some
type, which size will match the size of datasec's varinfo entry. This
is checked and enforced by kernel. I'm looking for 4-byte int, because
it's almost guaranteed that it will be present in program's BTF and I
won't have to explicitly add it (it's because all BPF programs return
int, so it must be in program's BTF already). While 8-byte long is
less likely to be there.

In the future, if we have a nicer way to extend BTF (and we will
soon), we can do this a bit better, but either way that .ksyms DATASEC
type isn't used for anything (there is no map with that DATASEC as a
value type), so it doesn't matter.

>
> Hao
>
> On Fri, Jun 12, 2020 at 3:35 PM Andrii Nakryiko <andriin@fb.com> wrote:
>>
>> Add support for another (in addition to existing Kconfig) special kind of
>> externs in BPF code, kernel symbol externs. Such externs allow BPF code to
>> "know" kernel symbol address and either use it for comparisons with kernel
>> data structures (e.g., struct file's f_op pointer, to distinguish different
>> kinds of file), or, with the help of bpf_probe_user_kernel(), to follow
>> pointers and read data from global variables. Kernel symbol addresses are
>> found through /proc/kallsyms, which should be present in the system.
>>
>> Currently, such kernel symbol variables are typeless: they have to be defined
>> as `extern const void <symbol>` and the only operation you can do (in C code)
>> with them is to take its address. Such extern should reside in a special
>> section '.ksyms'. bpf_helpers.h header provides __ksym macro for this. Strong
>> vs weak semantics stays the same as with Kconfig externs. If symbol is not
>> found in /proc/kallsyms, this will be a failure for strong (non-weak) extern,
>> but will be defaulted to 0 for weak externs.
>>
>> If the same symbol is defined multiple times in /proc/kallsyms, then it will
>> be error if any of the associated addresses differs. In that case, address is
>> ambiguous, so libbpf falls on the side of caution, rather than confusing user
>> with randomly chosen address.
>>
>> In the future, once kernel is extended with variables BTF information, such
>> ksym externs will be supported in a typed version, which will allow BPF
>> program to read variable's contents directly, similarly to how it's done for
>> fentry/fexit input arguments.
>>
>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>> ---
>>  tools/lib/bpf/bpf_helpers.h |   1 +
>>  tools/lib/bpf/btf.h         |   5 ++
>>  tools/lib/bpf/libbpf.c      | 138 ++++++++++++++++++++++++++++++++++--
>>  3 files changed, 139 insertions(+), 5 deletions(-)
>>

[...]

>>
>>  enum extern_type {
>>         EXT_UNKNOWN,
>> +       EXT_KSYM,
>>
>>         EXT_KCFG,
>>  };
>
>
> Minor, let EXT_KSYM come after EXT_KCFG.

I wanted ksym externs to go before KCFG ones, but not sure why. I'll
double check, I don't think it should matter.

>
>>
>>

[...]

>> +static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
>> +{
>> +       char sym_type, sym_name[256];
>> +       unsigned long sym_addr;
>> +       struct extern_desc *ext;
>> +       int ret, err = 0;
>> +       FILE *f;
>> +
>> +       f = fopen("/proc/kallsyms", "r");
>> +       if (!f) {
>> +               err = -errno;
>> +               pr_warn("failed to open /proc/kallsyms: %d\n", err);
>> +               return err;
>> +       }
>> +
>> +       while (true) {
>> +               ret = fscanf(f, "%lx %c %s%*[^\n]\n",
>> +                            &sym_addr, &sym_type, sym_name);
>
>
> Maybe better follow the existing pattern in kernel (scripts/kallsyms.c https://github.com/torvalds/linux/blob/master/scripts/kallsyms.c#L177)


oh, didn't know about this "%499s" trick, will change.

>
>>
>> +               if (ret == EOF && feof(f))
>> +                       break;
>> +               if (ret != 3) {
>> +                       err = -EINVAL;
>> +                       goto out;
>> +               }
>> +

[...]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH bpf-next 3/8] selftests/bpf: add __ksym extern selftest
       [not found]   ` <CA+khW7jxdS1KRpk2syVGjDqbyn3wAd3Eh_LEMAEhkPUehuXMwg@mail.gmail.com>
@ 2020-06-15 19:11     ` Andrii Nakryiko
  0 siblings, 0 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2020-06-15 19:11 UTC (permalink / raw)
  To: Hao Luo
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, Arnaldo Carvalho de Melo, Song Liu,
	Quentin Monnet

On Mon, Jun 15, 2020 at 9:45 AM Hao Luo <haoluo@google.com> wrote:
>
> Andrii, a couple of general comments on fixed_percpu_data.
>
> I think it would be better to check the existence of fixed_percpu_data in kallsyms first. If it's not there, just skip, or maybe warn but not fail.

fixed_percpu_data is always there, but I missed the fact that it's
x86-specific one. I'll switch to some bpf-specific symbol (e.g., like
bpf_prog_fops or something along those lines).

>
> Further, if we really want to be sure that  fixed_percpu_data is the first percpu var, we can read the value of __per_cpu_start, which marks the beginning address of the percpu section. Checking the address of fixed_percpu_data against __per_cpu_start rather than 0 should be more robust, I think, given that fixed_percpu_data exists.

There are assertions in Linux sources that fixed_percpu_data is 0, so
I don't think that it necessary. But it's a moot point, as I'll use
something less x86-specific.

>
> Hao
>
> On Fri, Jun 12, 2020 at 3:35 PM Andrii Nakryiko <andriin@fb.com> wrote:
>>
>> Validate libbpf is able to handle weak and strong kernel symbol externs in BPF
>> code correctly.
>>
>> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
>> ---
>>  .../testing/selftests/bpf/prog_tests/ksyms.c  | 71 +++++++++++++++++++
>>  .../testing/selftests/bpf/progs/test_ksyms.c  | 32 +++++++++
>>  2 files changed, 103 insertions(+)
>>  create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms.c
>>  create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms.c
>>

[...]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH bpf-next 2/8] libbpf: add support for extracting kernel symbol addresses
  2020-06-15 19:08     ` Andrii Nakryiko
@ 2020-06-16  8:05       ` Hao Luo
  2020-06-17  1:24         ` Hao Luo
  0 siblings, 1 reply; 21+ messages in thread
From: Hao Luo @ 2020-06-16  8:05 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, Arnaldo Carvalho de Melo, Song Liu,
	Quentin Monnet

On Mon, Jun 15, 2020 at 12:08 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Mon, Jun 15, 2020 at 9:44 AM Hao Luo <haoluo@google.com> wrote:
> >
> > Thanks, Andrii,
> >
> > This change looks nice! A couple of comments:
> >
> > 1. A 'void' type variable looks slightly odd from a user's perspective. How about using 'u64' or 'void *'? Or at least, a named type, which aliases to 'void'?
>
> That choice is very deliberate one. `extern const void` is the right
> way in C language to access linker-generated symbols, for instance,
> which is quite similar to what the intent is her. Having void type is
> very explicit that you don't know/care about that value pointed to by
> extern address, the only operation you can perform is to get it's
> address.
>
> Once we add kernel variables support, that's when types will start to
> be specified and libbpf will do extra checks (type matching) and extra
> work (generating ldimm64 with BTF ID, for instance), to allow C code
> to access data pointed to by extern address.
>
> Switching type to u64 would be misleading in allowing C code to
> implicitly dereference value of extern. E.g., there is a big
> difference between:
>
> extern u64 bla;
>
> printf("%lld\n", bla); /* de-reference happens here, we get contents
> of memory pointed to by "bla" symbol */
>
> printf("%p\n", &bla); /* here we get value of linker symbol/address of
> extern variable */
>
> Currently I explicitly support only the latter and want to prevent the
> former, until we have kernel variables in BTF. Using `extern void`
> makes compiler enforce that only the &bla form is allowed. Everything
> else is compilation error.
>

Ah, I see. I've been taking the extern variable as an actual variable
that contains the symbol's address, which is the first case. Your
approach makes sense. Thanks for explaining.

> > 2. About the type size of ksym, IIUC, it looks strange that the values read from kallsyms have 8 bytes but their corresponding vs->size is 4 bytes and vs->type points to 4-byte int. Can we make them of the same size?
>
> That's a bit of a hack on my part. Variable needs to point to some
> type, which size will match the size of datasec's varinfo entry. This
> is checked and enforced by kernel. I'm looking for 4-byte int, because
> it's almost guaranteed that it will be present in program's BTF and I
> won't have to explicitly add it (it's because all BPF programs return
> int, so it must be in program's BTF already). While 8-byte long is
> less likely to be there.
>
> In the future, if we have a nicer way to extend BTF (and we will
> soon), we can do this a bit better, but either way that .ksyms DATASEC
> type isn't used for anything (there is no map with that DATASEC as a
> value type), so it doesn't matter.
>

Thanks for explaining, Andrii.

These explanations as comments in the code would be quite helpful, IMHO.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH bpf-next 2/8] libbpf: add support for extracting kernel symbol addresses
  2020-06-16  8:05       ` Hao Luo
@ 2020-06-17  1:24         ` Hao Luo
  2020-06-17  1:36           ` Andrii Nakryiko
  0 siblings, 1 reply; 21+ messages in thread
From: Hao Luo @ 2020-06-17  1:24 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, Arnaldo Carvalho de Melo, Song Liu,
	Quentin Monnet

Andrii,

Do you think we need to put the kernel's variables in one single
DATASEC in vmlinux BTF? It looks like all the ksyms in the program
will be under one ".ksyms" section, so we are not able to tell whether
a ksym is from a percpu section or a .rodata section. Without this
information, if the vmlinux has multiple DATASECs, the loader may need
to traverse all of them. If vmlinux BTF has only one DATASEC, it
matches the object's BTF better.

Right now, the percpu vars are in a ".data..percpu" DATASEC in my
patch and the plan seems that we will introduce more DATASECs to hold
other data.

Please let me know your insights here. Thanks.

Hao

On Tue, Jun 16, 2020 at 1:05 AM Hao Luo <haoluo@google.com> wrote:
>
> On Mon, Jun 15, 2020 at 12:08 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Mon, Jun 15, 2020 at 9:44 AM Hao Luo <haoluo@google.com> wrote:
> > >
> > > Thanks, Andrii,
> > >
> > > This change looks nice! A couple of comments:
> > >
> > > 1. A 'void' type variable looks slightly odd from a user's perspective. How about using 'u64' or 'void *'? Or at least, a named type, which aliases to 'void'?
> >
> > That choice is very deliberate one. `extern const void` is the right
> > way in C language to access linker-generated symbols, for instance,
> > which is quite similar to what the intent is her. Having void type is
> > very explicit that you don't know/care about that value pointed to by
> > extern address, the only operation you can perform is to get it's
> > address.
> >
> > Once we add kernel variables support, that's when types will start to
> > be specified and libbpf will do extra checks (type matching) and extra
> > work (generating ldimm64 with BTF ID, for instance), to allow C code
> > to access data pointed to by extern address.
> >
> > Switching type to u64 would be misleading in allowing C code to
> > implicitly dereference value of extern. E.g., there is a big
> > difference between:
> >
> > extern u64 bla;
> >
> > printf("%lld\n", bla); /* de-reference happens here, we get contents
> > of memory pointed to by "bla" symbol */
> >
> > printf("%p\n", &bla); /* here we get value of linker symbol/address of
> > extern variable */
> >
> > Currently I explicitly support only the latter and want to prevent the
> > former, until we have kernel variables in BTF. Using `extern void`
> > makes compiler enforce that only the &bla form is allowed. Everything
> > else is compilation error.
> >
>
> Ah, I see. I've been taking the extern variable as an actual variable
> that contains the symbol's address, which is the first case. Your
> approach makes sense. Thanks for explaining.
>
> > > 2. About the type size of ksym, IIUC, it looks strange that the values read from kallsyms have 8 bytes but their corresponding vs->size is 4 bytes and vs->type points to 4-byte int. Can we make them of the same size?
> >
> > That's a bit of a hack on my part. Variable needs to point to some
> > type, which size will match the size of datasec's varinfo entry. This
> > is checked and enforced by kernel. I'm looking for 4-byte int, because
> > it's almost guaranteed that it will be present in program's BTF and I
> > won't have to explicitly add it (it's because all BPF programs return
> > int, so it must be in program's BTF already). While 8-byte long is
> > less likely to be there.
> >
> > In the future, if we have a nicer way to extend BTF (and we will
> > soon), we can do this a bit better, but either way that .ksyms DATASEC
> > type isn't used for anything (there is no map with that DATASEC as a
> > value type), so it doesn't matter.
> >
>
> Thanks for explaining, Andrii.
>
> These explanations as comments in the code would be quite helpful, IMHO.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH bpf-next 2/8] libbpf: add support for extracting kernel symbol addresses
  2020-06-17  1:24         ` Hao Luo
@ 2020-06-17  1:36           ` Andrii Nakryiko
  2020-06-18  7:53             ` Hao Luo
  0 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2020-06-17  1:36 UTC (permalink / raw)
  To: Hao Luo
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, Arnaldo Carvalho de Melo, Song Liu,
	Quentin Monnet

On Tue, Jun 16, 2020 at 6:24 PM Hao Luo <haoluo@google.com> wrote:
>
> Andrii,
>
> Do you think we need to put the kernel's variables in one single
> DATASEC in vmlinux BTF? It looks like all the ksyms in the program
> will be under one ".ksyms" section, so we are not able to tell whether
> a ksym is from a percpu section or a .rodata section. Without this
> information, if the vmlinux has multiple DATASECs, the loader may need
> to traverse all of them. If vmlinux BTF has only one DATASEC, it
> matches the object's BTF better.
>
> Right now, the percpu vars are in a ".data..percpu" DATASEC in my
> patch and the plan seems that we will introduce more DATASECs to hold
> other data.
>
> Please let me know your insights here. Thanks.

I think we should keep original DATASECs in vmlinux's BTF, so that
they match ELF sections. Otherwise BTF is going to lie and will cause
confusion down the road in the longer term.

On the BPF program side, though, I think we'll limit it to just two
special sections: .ksyms and .ksyms.percpu. libbpf will have to
traverse all vmlinux DATASECs to find corresponding variables, but
that's ok, it has to do the linear scan either way.

>
> Hao
>
> On Tue, Jun 16, 2020 at 1:05 AM Hao Luo <haoluo@google.com> wrote:
> >
> > On Mon, Jun 15, 2020 at 12:08 PM Andrii Nakryiko
> > <andrii.nakryiko@gmail.com> wrote:
> > >
> > > On Mon, Jun 15, 2020 at 9:44 AM Hao Luo <haoluo@google.com> wrote:
> > > >
> > > > Thanks, Andrii,
> > > >
> > > > This change looks nice! A couple of comments:
> > > >
> > > > 1. A 'void' type variable looks slightly odd from a user's perspective. How about using 'u64' or 'void *'? Or at least, a named type, which aliases to 'void'?
> > >
> > > That choice is very deliberate one. `extern const void` is the right
> > > way in C language to access linker-generated symbols, for instance,
> > > which is quite similar to what the intent is her. Having void type is
> > > very explicit that you don't know/care about that value pointed to by
> > > extern address, the only operation you can perform is to get it's
> > > address.
> > >
> > > Once we add kernel variables support, that's when types will start to
> > > be specified and libbpf will do extra checks (type matching) and extra
> > > work (generating ldimm64 with BTF ID, for instance), to allow C code
> > > to access data pointed to by extern address.
> > >
> > > Switching type to u64 would be misleading in allowing C code to
> > > implicitly dereference value of extern. E.g., there is a big
> > > difference between:
> > >
> > > extern u64 bla;
> > >
> > > printf("%lld\n", bla); /* de-reference happens here, we get contents
> > > of memory pointed to by "bla" symbol */
> > >
> > > printf("%p\n", &bla); /* here we get value of linker symbol/address of
> > > extern variable */
> > >
> > > Currently I explicitly support only the latter and want to prevent the
> > > former, until we have kernel variables in BTF. Using `extern void`
> > > makes compiler enforce that only the &bla form is allowed. Everything
> > > else is compilation error.
> > >
> >
> > Ah, I see. I've been taking the extern variable as an actual variable
> > that contains the symbol's address, which is the first case. Your
> > approach makes sense. Thanks for explaining.
> >
> > > > 2. About the type size of ksym, IIUC, it looks strange that the values read from kallsyms have 8 bytes but their corresponding vs->size is 4 bytes and vs->type points to 4-byte int. Can we make them of the same size?
> > >
> > > That's a bit of a hack on my part. Variable needs to point to some
> > > type, which size will match the size of datasec's varinfo entry. This
> > > is checked and enforced by kernel. I'm looking for 4-byte int, because
> > > it's almost guaranteed that it will be present in program's BTF and I
> > > won't have to explicitly add it (it's because all BPF programs return
> > > int, so it must be in program's BTF already). While 8-byte long is
> > > less likely to be there.
> > >
> > > In the future, if we have a nicer way to extend BTF (and we will
> > > soon), we can do this a bit better, but either way that .ksyms DATASEC
> > > type isn't used for anything (there is no map with that DATASEC as a
> > > value type), so it doesn't matter.
> > >
> >
> > Thanks for explaining, Andrii.
> >
> > These explanations as comments in the code would be quite helpful, IMHO.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [RFC PATCH bpf-next 2/8] libbpf: add support for extracting kernel symbol addresses
  2020-06-17  1:36           ` Andrii Nakryiko
@ 2020-06-18  7:53             ` Hao Luo
  0 siblings, 0 replies; 21+ messages in thread
From: Hao Luo @ 2020-06-18  7:53 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Daniel Borkmann, Kernel Team, Arnaldo Carvalho de Melo, Song Liu,
	Quentin Monnet

Sounds good. Thanks.

On Tue, Jun 16, 2020 at 6:37 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Tue, Jun 16, 2020 at 6:24 PM Hao Luo <haoluo@google.com> wrote:
> >
> > Andrii,
> >
> > Do you think we need to put the kernel's variables in one single
> > DATASEC in vmlinux BTF? It looks like all the ksyms in the program
> > will be under one ".ksyms" section, so we are not able to tell whether
> > a ksym is from a percpu section or a .rodata section. Without this
> > information, if the vmlinux has multiple DATASECs, the loader may need
> > to traverse all of them. If vmlinux BTF has only one DATASEC, it
> > matches the object's BTF better.
> >
> > Right now, the percpu vars are in a ".data..percpu" DATASEC in my
> > patch and the plan seems that we will introduce more DATASECs to hold
> > other data.
> >
> > Please let me know your insights here. Thanks.
>
> I think we should keep original DATASECs in vmlinux's BTF, so that
> they match ELF sections. Otherwise BTF is going to lie and will cause
> confusion down the road in the longer term.
>
> On the BPF program side, though, I think we'll limit it to just two
> special sections: .ksyms and .ksyms.percpu. libbpf will have to
> traverse all vmlinux DATASECs to find corresponding variables, but
> that's ok, it has to do the linear scan either way.
>
> >
> > Hao
> >
> > On Tue, Jun 16, 2020 at 1:05 AM Hao Luo <haoluo@google.com> wrote:
> > >
> > > On Mon, Jun 15, 2020 at 12:08 PM Andrii Nakryiko
> > > <andrii.nakryiko@gmail.com> wrote:
> > > >
> > > > On Mon, Jun 15, 2020 at 9:44 AM Hao Luo <haoluo@google.com> wrote:
> > > > >
> > > > > Thanks, Andrii,
> > > > >
> > > > > This change looks nice! A couple of comments:
> > > > >
> > > > > 1. A 'void' type variable looks slightly odd from a user's perspective. How about using 'u64' or 'void *'? Or at least, a named type, which aliases to 'void'?
> > > >
> > > > That choice is very deliberate one. `extern const void` is the right
> > > > way in C language to access linker-generated symbols, for instance,
> > > > which is quite similar to what the intent is her. Having void type is
> > > > very explicit that you don't know/care about that value pointed to by
> > > > extern address, the only operation you can perform is to get it's
> > > > address.
> > > >
> > > > Once we add kernel variables support, that's when types will start to
> > > > be specified and libbpf will do extra checks (type matching) and extra
> > > > work (generating ldimm64 with BTF ID, for instance), to allow C code
> > > > to access data pointed to by extern address.
> > > >
> > > > Switching type to u64 would be misleading in allowing C code to
> > > > implicitly dereference value of extern. E.g., there is a big
> > > > difference between:
> > > >
> > > > extern u64 bla;
> > > >
> > > > printf("%lld\n", bla); /* de-reference happens here, we get contents
> > > > of memory pointed to by "bla" symbol */
> > > >
> > > > printf("%p\n", &bla); /* here we get value of linker symbol/address of
> > > > extern variable */
> > > >
> > > > Currently I explicitly support only the latter and want to prevent the
> > > > former, until we have kernel variables in BTF. Using `extern void`
> > > > makes compiler enforce that only the &bla form is allowed. Everything
> > > > else is compilation error.
> > > >
> > >
> > > Ah, I see. I've been taking the extern variable as an actual variable
> > > that contains the symbol's address, which is the first case. Your
> > > approach makes sense. Thanks for explaining.
> > >
> > > > > 2. About the type size of ksym, IIUC, it looks strange that the values read from kallsyms have 8 bytes but their corresponding vs->size is 4 bytes and vs->type points to 4-byte int. Can we make them of the same size?
> > > >
> > > > That's a bit of a hack on my part. Variable needs to point to some
> > > > type, which size will match the size of datasec's varinfo entry. This
> > > > is checked and enforced by kernel. I'm looking for 4-byte int, because
> > > > it's almost guaranteed that it will be present in program's BTF and I
> > > > won't have to explicitly add it (it's because all BPF programs return
> > > > int, so it must be in program's BTF already). While 8-byte long is
> > > > less likely to be there.
> > > >
> > > > In the future, if we have a nicer way to extend BTF (and we will
> > > > soon), we can do this a bit better, but either way that .ksyms DATASEC
> > > > type isn't used for anything (there is no map with that DATASEC as a
> > > > value type), so it doesn't matter.
> > > >
> > >
> > > Thanks for explaining, Andrii.
> > >
> > > These explanations as comments in the code would be quite helpful, IMHO.

^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2020-06-18  7:54 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-12 22:31 [RFC PATCH bpf-next 0/8] libbpf ksym support and bpftool show PIDs Andrii Nakryiko
2020-06-12 22:31 ` [RFC PATCH bpf-next 1/8] libbpf: generalize libbpf externs support Andrii Nakryiko
     [not found]   ` <CA+khW7hAYVdoQX5-j0z1iGEVZeww4BBu4NXzy5eS5OwDRYqe2w@mail.gmail.com>
2020-06-15 18:55     ` Andrii Nakryiko
2020-06-12 22:31 ` [RFC PATCH bpf-next 2/8] libbpf: add support for extracting kernel symbol addresses Andrii Nakryiko
     [not found]   ` <CA+khW7hFZzp_K_xydSFw0O3LYB22_fC=Z4wG7i9Si+phGHn4cQ@mail.gmail.com>
2020-06-15 19:08     ` Andrii Nakryiko
2020-06-16  8:05       ` Hao Luo
2020-06-17  1:24         ` Hao Luo
2020-06-17  1:36           ` Andrii Nakryiko
2020-06-18  7:53             ` Hao Luo
2020-06-12 22:31 ` [RFC PATCH bpf-next 3/8] selftests/bpf: add __ksym extern selftest Andrii Nakryiko
     [not found]   ` <CA+khW7jxdS1KRpk2syVGjDqbyn3wAd3Eh_LEMAEhkPUehuXMwg@mail.gmail.com>
2020-06-15 19:11     ` Andrii Nakryiko
2020-06-12 22:31 ` [RFC PATCH bpf-next 4/8] tools/bpftool: move map/prog parsing logic into common Andrii Nakryiko
2020-06-12 22:31 ` [RFC PATCH bpf-next 5/8] tools/bpftool: minimize bootstrap bpftool Andrii Nakryiko
2020-06-12 22:31 ` [RFC PATCH bpf-next 6/8] tools/bpftool: generalize BPF skeleton support and generate vmlinux.h Andrii Nakryiko
2020-06-12 22:31 ` [RFC PATCH bpf-next 7/8] libbpf: wrap source argument of BPF_CORE_READ macro in parentheses Andrii Nakryiko
2020-06-12 22:31 ` [RFC PATCH bpf-next 8/8] tools/bpftool: show PIDs with FDs open against BPF map/prog/link/btf Andrii Nakryiko
2020-06-13  3:45   ` Alexei Starovoitov
2020-06-13  5:57     ` Andrii Nakryiko
2020-06-13 22:14       ` Arnaldo Carvalho de Melo
2020-06-15  9:04         ` Toke Høiland-Jørgensen
2020-06-15  9:30           ` Quentin Monnet

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).