* [PATCH bpf-next v2 0/6] bpf: BTF support for ksyms
@ 2020-09-03 22:33 Hao Luo
  2020-09-03 22:33 ` [PATCH bpf-next v2 1/6] bpf: Introduce pseudo_btf_id Hao Luo
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Hao Luo @ 2020-09-03 22:33 UTC (permalink / raw)
  To: netdev, bpf, linux-kernel, linux-kselftest
  Cc: Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Hao Luo, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

v1 -> v2:
 - Move check_pseudo_btf_id from check_ld_imm() to
   replace_map_fd_with_map_ptr() and rename the latter.
 - Add bpf_this_cpu_ptr().
 - Use bpf_core_types_are_compat() in libbpf.c for checking type
   compatibility.
 - Rewrite typed ksym extern type in BTF with int to save space.
 - Minor revision of bpf_per_cpu_ptr()'s comments.
 - Avoid using long in tests that use skeleton.
 - Refactored test_ksyms.c by moving kallsyms_find() to trace_helpers.c
 - Fold the patches that sync include/linux/uapi and
   tools/include/linux/uapi.

rfc -> v1:
 - Encode VAR's btf_id for PSEUDO_BTF_ID.
 - More checks in verifier. Checking the btf_id passed as
   PSEUDO_BTF_ID is valid VAR, its name and type.
 - Checks in libbpf on type compatibility of ksyms.
 - Add bpf_per_cpu_ptr() to access kernel percpu vars. Introduced
   new ARG and RET types for this helper.

This patch series extends the previously added __ksym externs with
btf support.

Right now the __ksym externs are treated as pure 64-bit scalar values.
Libbpf replaces the ld_imm64 insn of a __ksym with its kernel address at
load time. This patch series extends those externs with their BTF info.
Note that BTF support for __ksym requires a kernel BTF that has VARs
encoded to work properly. The corresponding changes in pahole are
available at [1] (with a fix at [2] for gcc 4.9+).

The first 3 patches in this series add support for general kernel
global variables, which includes verifier checking (01/06), libbpf
support (02/06) and selftests for getting a typed ksym extern's kernel
address (03/06).

The next 3 patches extend that capability further by introducing the
helpers bpf_per_cpu_ptr() and bpf_this_cpu_ptr(), which allow accessing
kernel percpu variables correctly (04/06 and 05/06).

The tests of this feature were performed against pahole extended with
[1] and [2]. For kernel BTF that does not have VARs encoded, the
selftests will be skipped.

[1] https://git.kernel.org/pub/scm/devel/pahole/pahole.git/commit/?id=f3d9054ba8ff1df0fc44e507e3a01c0964cabd42
[2] https://www.spinics.net/lists/dwarves/msg00451.html

Hao Luo (6):
  bpf: Introduce pseudo_btf_id
  bpf/libbpf: BTF support for typed ksyms
  bpf/selftests: ksyms_btf to test typed ksyms
  bpf: Introduce bpf_per_cpu_ptr()
  bpf: Introduce bpf_this_cpu_ptr()
  bpf/selftests: Test for bpf_per_cpu_ptr() and bpf_this_cpu_ptr()

 include/linux/bpf.h                           |   4 +
 include/linux/bpf_verifier.h                  |   4 +
 include/linux/btf.h                           |  26 +++
 include/uapi/linux/bpf.h                      |  69 ++++++-
 kernel/bpf/btf.c                              |  25 ---
 kernel/bpf/verifier.c                         | 176 +++++++++++++++++-
 kernel/trace/bpf_trace.c                      |  32 ++++
 tools/include/uapi/linux/bpf.h                |  69 ++++++-
 tools/lib/bpf/libbpf.c                        | 116 ++++++++++--
 .../testing/selftests/bpf/prog_tests/ksyms.c  |  31 +--
 .../selftests/bpf/prog_tests/ksyms_btf.c      |  73 ++++++++
 .../selftests/bpf/progs/test_ksyms_btf.c      |  49 +++++
 tools/testing/selftests/bpf/trace_helpers.c   |  26 +++
 tools/testing/selftests/bpf/trace_helpers.h   |   4 +
 14 files changed, 615 insertions(+), 89 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_btf.c

-- 
2.28.0.526.ge36021eeef-goog


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH bpf-next v2 1/6] bpf: Introduce pseudo_btf_id
  2020-09-03 22:33 [PATCH bpf-next v2 0/6] bpf: BTF support for ksyms Hao Luo
@ 2020-09-03 22:33 ` Hao Luo
  2020-09-04 19:05   ` Andrii Nakryiko
  2020-09-03 22:33 ` [PATCH bpf-next v2 2/6] bpf/libbpf: BTF support for typed ksyms Hao Luo
                   ` (4 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Hao Luo @ 2020-09-03 22:33 UTC (permalink / raw)
  To: netdev, bpf, linux-kernel, linux-kselftest
  Cc: Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Hao Luo, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

Pseudo_btf_id is a type of ld_imm insn that associates a btf_id with a
ksym so that further dereferences of the ksym can use the BTF info
to validate accesses. Internally, when seeing a pseudo_btf_id ld insn,
the verifier reads the btf_id stored in insn[0]'s imm field and
marks the dst_reg as PTR_TO_BTF_ID. The btf_id points to a KIND_VAR,
which is encoded in btf_vmlinux by pahole. If the VAR is not of a struct
type, the dst reg will be marked as PTR_TO_MEM instead of PTR_TO_BTF_ID
and the mem_size is resolved to the size of the VAR's type.

From the VAR btf_id, the verifier can also read the address of the
ksym's corresponding kernel var from kallsyms and use that to fill
dst_reg.

Therefore, the proper functionality of pseudo_btf_id depends on (1)
kallsyms and (2) the encoding of kernel global VARs in pahole, which
should be available since pahole v1.18.

Signed-off-by: Hao Luo <haoluo@google.com>
---
 include/linux/bpf_verifier.h   |   4 ++
 include/linux/btf.h            |  15 +++++
 include/uapi/linux/bpf.h       |  38 ++++++++---
 kernel/bpf/btf.c               |  15 -----
 kernel/bpf/verifier.c          | 112 ++++++++++++++++++++++++++++++---
 tools/include/uapi/linux/bpf.h |  38 ++++++++---
 6 files changed, 182 insertions(+), 40 deletions(-)

diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index 53c7bd568c5d..a14063f64d96 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -308,6 +308,10 @@ struct bpf_insn_aux_data {
 			u32 map_index;		/* index into used_maps[] */
 			u32 map_off;		/* offset from value base address */
 		};
+		struct {
+			u32 pseudo_btf_id_type; /* type of pseudo_btf_id */
+			u32 pseudo_btf_id_meta; /* memsize or btf_id */
+		};
 	};
 	u64 map_key_state; /* constant (32 bit) key tracking for maps */
 	int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
diff --git a/include/linux/btf.h b/include/linux/btf.h
index a9af5e7a7ece..592373d359b9 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -106,6 +106,21 @@ static inline bool btf_type_is_func_proto(const struct btf_type *t)
 	return BTF_INFO_KIND(t->info) == BTF_KIND_FUNC_PROTO;
 }
 
+static inline bool btf_type_is_var(const struct btf_type *t)
+{
+	return BTF_INFO_KIND(t->info) == BTF_KIND_VAR;
+}
+
+/* union is only a special case of struct:
+ * all its offsetof(member) == 0
+ */
+static inline bool btf_type_is_struct(const struct btf_type *t)
+{
+	u8 kind = BTF_INFO_KIND(t->info);
+
+	return kind == BTF_KIND_STRUCT || kind == BTF_KIND_UNION;
+}
+
 static inline u16 btf_type_vlen(const struct btf_type *t)
 {
 	return BTF_INFO_VLEN(t->info);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index 8dda13880957..ab00ad9b32e5 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -355,18 +355,38 @@ enum bpf_link_type {
 #define BPF_F_SLEEPABLE		(1U << 4)
 
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
- * two extensions:
- *
- * insn[0].src_reg:  BPF_PSEUDO_MAP_FD   BPF_PSEUDO_MAP_VALUE
- * insn[0].imm:      map fd              map fd
- * insn[1].imm:      0                   offset into value
- * insn[0].off:      0                   0
- * insn[1].off:      0                   0
- * ldimm64 rewrite:  address of map      address of map[0]+offset
- * verifier type:    CONST_PTR_TO_MAP    PTR_TO_MAP_VALUE
+ * the following extensions:
+ *
+ * insn[0].src_reg:  BPF_PSEUDO_MAP_FD
+ * insn[0].imm:      map fd
+ * insn[1].imm:      0
+ * insn[0].off:      0
+ * insn[1].off:      0
+ * ldimm64 rewrite:  address of map
+ * verifier type:    CONST_PTR_TO_MAP
  */
 #define BPF_PSEUDO_MAP_FD	1
+/*
+ * insn[0].src_reg:  BPF_PSEUDO_MAP_VALUE
+ * insn[0].imm:      map fd
+ * insn[1].imm:      offset into value
+ * insn[0].off:      0
+ * insn[1].off:      0
+ * ldimm64 rewrite:  address of map[0]+offset
+ * verifier type:    PTR_TO_MAP_VALUE
+ */
 #define BPF_PSEUDO_MAP_VALUE	2
+/*
+ * insn[0].src_reg:  BPF_PSEUDO_BTF_ID
+ * insn[0].imm:      kernel btf id of VAR
+ * insn[1].imm:      0
+ * insn[0].off:      0
+ * insn[1].off:      0
+ * ldimm64 rewrite:  address of the kernel variable
+ * verifier type:    PTR_TO_BTF_ID or PTR_TO_MEM, depending on whether the var
+ *                   is struct/union.
+ */
+#define BPF_PSEUDO_BTF_ID	3
 
 /* when bpf_call->src_reg == BPF_PSEUDO_CALL, bpf_call->imm == pc-relative
  * offset to another bpf function
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index f9ac6935ab3c..5831d9f3f3c5 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -355,16 +355,6 @@ static bool btf_type_nosize_or_null(const struct btf_type *t)
 	return !t || btf_type_nosize(t);
 }
 
-/* union is only a special case of struct:
- * all its offsetof(member) == 0
- */
-static bool btf_type_is_struct(const struct btf_type *t)
-{
-	u8 kind = BTF_INFO_KIND(t->info);
-
-	return kind == BTF_KIND_STRUCT || kind == BTF_KIND_UNION;
-}
-
 static bool __btf_type_is_struct(const struct btf_type *t)
 {
 	return BTF_INFO_KIND(t->info) == BTF_KIND_STRUCT;
@@ -375,11 +365,6 @@ static bool btf_type_is_array(const struct btf_type *t)
 	return BTF_INFO_KIND(t->info) == BTF_KIND_ARRAY;
 }
 
-static bool btf_type_is_var(const struct btf_type *t)
-{
-	return BTF_INFO_KIND(t->info) == BTF_KIND_VAR;
-}
-
 static bool btf_type_is_datasec(const struct btf_type *t)
 {
 	return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC;
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b4e9c56b8b32..3b382c080cfd 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -7398,6 +7398,24 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
 		return 0;
 	}
 
+	if (insn->src_reg == BPF_PSEUDO_BTF_ID) {
+		u32 type = aux->pseudo_btf_id_type;
+		u32 meta = aux->pseudo_btf_id_meta;
+
+		mark_reg_known_zero(env, regs, insn->dst_reg);
+
+		regs[insn->dst_reg].type = type;
+		if (type == PTR_TO_MEM) {
+			regs[insn->dst_reg].mem_size = meta;
+		} else if (type == PTR_TO_BTF_ID) {
+			regs[insn->dst_reg].btf_id = meta;
+		} else {
+			verbose(env, "bpf verifier is misconfigured\n");
+			return -EFAULT;
+		}
+		return 0;
+	}
+
 	map = env->used_maps[aux->map_index];
 	mark_reg_known_zero(env, regs, insn->dst_reg);
 	regs[insn->dst_reg].map_ptr = map;
@@ -9284,6 +9302,74 @@ static int do_check(struct bpf_verifier_env *env)
 	return 0;
 }
 
+/* replace pseudo btf_id with kernel symbol address */
+static int check_pseudo_btf_id(struct bpf_verifier_env *env,
+			       struct bpf_insn *insn,
+			       struct bpf_insn_aux_data *aux)
+{
+	u32 type, id = insn->imm;
+	const struct btf_type *t;
+	const char *sym_name;
+	u64 addr;
+
+	if (!btf_vmlinux) {
+		verbose(env, "%s: btf not available. verifier misconfigured.\n",
+			__func__);
+		return -EINVAL;
+	}
+
+	t = btf_type_by_id(btf_vmlinux, id);
+	if (!t) {
+		verbose(env, "%s: invalid btf_id %d\n", __func__, id);
+		return -ENOENT;
+	}
+
+	if (insn[1].imm != 0) {
+		verbose(env, "%s: BPF_PSEUDO_BTF_ID uses reserved fields\n",
+			__func__);
+		return -EINVAL;
+	}
+
+	if (!btf_type_is_var(t)) {
+		verbose(env, "%s: btf_id %d isn't KIND_VAR\n", __func__, id);
+		return -EINVAL;
+	}
+
+	sym_name = btf_name_by_offset(btf_vmlinux, t->name_off);
+	addr = kallsyms_lookup_name(sym_name);
+	if (!addr) {
+		verbose(env, "%s: failed to find the address of symbol '%s'.\n",
+			__func__, sym_name);
+		return -ENOENT;
+	}
+
+	insn[0].imm = (u32)addr;
+	insn[1].imm = addr >> 32;
+
+	type = t->type;
+	t = btf_type_skip_modifiers(btf_vmlinux, type, NULL);
+	if (!btf_type_is_struct(t)) {
+		const struct btf_type *ret;
+		const char *tname;
+		u32 tsize;
+
+		/* resolve the type size of ksym. */
+		ret = btf_resolve_size(btf_vmlinux, t, &tsize);
+		if (IS_ERR(ret)) {
+			tname = btf_name_by_offset(btf_vmlinux, t->name_off);
+			verbose(env, "unable to resolve the size of type '%s': %ld\n",
+				tname, PTR_ERR(ret));
+			return -EINVAL;
+		}
+		aux->pseudo_btf_id_type = PTR_TO_MEM;
+		aux->pseudo_btf_id_meta = tsize;
+	} else {
+		aux->pseudo_btf_id_type = PTR_TO_BTF_ID;
+		aux->pseudo_btf_id_meta = type;
+	}
+	return 0;
+}
+
 static int check_map_prealloc(struct bpf_map *map)
 {
 	return (map->map_type != BPF_MAP_TYPE_HASH &&
@@ -9394,10 +9480,14 @@ static bool bpf_map_is_cgroup_storage(struct bpf_map *map)
 		map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE);
 }
 
-/* look for pseudo eBPF instructions that access map FDs and
- * replace them with actual map pointers
+/* find and rewrite pseudo imm in ld_imm64 instructions:
+ *
+ * 1. if it accesses map FD, replace it with actual map pointer.
+ * 2. if it accesses btf_id of a VAR, replace it with pointer to the var.
+ *
+ * NOTE: btf_vmlinux is required for converting pseudo btf_id.
  */
-static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env)
+static int replace_pseudo_imm_with_ptr(struct bpf_verifier_env *env)
 {
 	struct bpf_insn *insn = env->prog->insnsi;
 	int insn_cnt = env->prog->len;
@@ -9438,6 +9528,14 @@ static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env)
 				/* valid generic load 64-bit imm */
 				goto next_insn;
 
+			if (insn[0].src_reg == BPF_PSEUDO_BTF_ID) {
+				aux = &env->insn_aux_data[i];
+				err = check_pseudo_btf_id(env, insn, aux);
+				if (err)
+					return err;
+				goto next_insn;
+			}
+
 			/* In final convert_pseudo_ld_imm64() step, this is
 			 * converted into regular 64-bit imm load insn.
 			 */
@@ -11388,10 +11486,6 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 	if (is_priv)
 		env->test_state_freq = attr->prog_flags & BPF_F_TEST_STATE_FREQ;
 
-	ret = replace_map_fd_with_map_ptr(env);
-	if (ret < 0)
-		goto skip_full_check;
-
 	if (bpf_prog_is_dev_bound(env->prog->aux)) {
 		ret = bpf_prog_offload_verifier_prep(env->prog);
 		if (ret)
@@ -11417,6 +11511,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 	if (ret)
 		goto skip_full_check;
 
+	ret = replace_pseudo_imm_with_ptr(env);
+	if (ret < 0)
+		goto skip_full_check;
+
 	ret = check_cfg(env);
 	if (ret < 0)
 		goto skip_full_check;
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index 8dda13880957..ab00ad9b32e5 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -355,18 +355,38 @@ enum bpf_link_type {
 #define BPF_F_SLEEPABLE		(1U << 4)
 
 /* When BPF ldimm64's insn[0].src_reg != 0 then this can have
- * two extensions:
- *
- * insn[0].src_reg:  BPF_PSEUDO_MAP_FD   BPF_PSEUDO_MAP_VALUE
- * insn[0].imm:      map fd              map fd
- * insn[1].imm:      0                   offset into value
- * insn[0].off:      0                   0
- * insn[1].off:      0                   0
- * ldimm64 rewrite:  address of map      address of map[0]+offset
- * verifier type:    CONST_PTR_TO_MAP    PTR_TO_MAP_VALUE
+ * the following extensions:
+ *
+ * insn[0].src_reg:  BPF_PSEUDO_MAP_FD
+ * insn[0].imm:      map fd
+ * insn[1].imm:      0
+ * insn[0].off:      0
+ * insn[1].off:      0
+ * ldimm64 rewrite:  address of map
+ * verifier type:    CONST_PTR_TO_MAP
  */
 #define BPF_PSEUDO_MAP_FD	1
+/*
+ * insn[0].src_reg:  BPF_PSEUDO_MAP_VALUE
+ * insn[0].imm:      map fd
+ * insn[1].imm:      offset into value
+ * insn[0].off:      0
+ * insn[1].off:      0
+ * ldimm64 rewrite:  address of map[0]+offset
+ * verifier type:    PTR_TO_MAP_VALUE
+ */
 #define BPF_PSEUDO_MAP_VALUE	2
+/*
+ * insn[0].src_reg:  BPF_PSEUDO_BTF_ID
+ * insn[0].imm:      kernel btf id of VAR
+ * insn[1].imm:      0
+ * insn[0].off:      0
+ * insn[1].off:      0
+ * ldimm64 rewrite:  address of the kernel variable
+ * verifier type:    PTR_TO_BTF_ID or PTR_TO_MEM, depending on whether the var
+ *                   is struct/union.
+ */
+#define BPF_PSEUDO_BTF_ID	3
 
 /* when bpf_call->src_reg == BPF_PSEUDO_CALL, bpf_call->imm == pc-relative
  * offset to another bpf function
-- 
2.28.0.526.ge36021eeef-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH bpf-next v2 2/6] bpf/libbpf: BTF support for typed ksyms
  2020-09-03 22:33 [PATCH bpf-next v2 0/6] bpf: BTF support for ksyms Hao Luo
  2020-09-03 22:33 ` [PATCH bpf-next v2 1/6] bpf: Introduce pseudo_btf_id Hao Luo
@ 2020-09-03 22:33 ` Hao Luo
  2020-09-04 19:34   ` Andrii Nakryiko
  2020-09-03 22:33 ` [PATCH bpf-next v2 3/6] bpf/selftests: ksyms_btf to test " Hao Luo
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Hao Luo @ 2020-09-03 22:33 UTC (permalink / raw)
  To: netdev, bpf, linux-kernel, linux-kselftest
  Cc: Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Hao Luo, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

If a ksym is defined with a type, libbpf will try to find the ksym's BTF
information in kernel BTF. If a valid BTF entry for the ksym is found,
libbpf passes the found btf_id to the verifier, which validates the
ksym's type and value.

Typeless ksyms (i.e. those defined as 'void') will not have such a
btf_id, but they still have the symbol's address (read from kallsyms),
and their values are treated as raw pointers.

Signed-off-by: Hao Luo <haoluo@google.com>
---
 tools/lib/bpf/libbpf.c | 116 ++++++++++++++++++++++++++++++++++++-----
 1 file changed, 102 insertions(+), 14 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index b688aadf09c5..c7a8d7d72b46 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -361,6 +361,12 @@ struct extern_desc {
 		} kcfg;
 		struct {
 			unsigned long long addr;
+
+			/* target btf_id of the corresponding kernel var. */
+			int vmlinux_btf_id;
+
+			/* local btf_id of the ksym extern's type. */
+			__u32 type_id;
 		} ksym;
 	};
 };
@@ -2479,12 +2485,23 @@ static int bpf_object__load_vmlinux_btf(struct bpf_object *obj)
 {
 	bool need_vmlinux_btf = false;
 	struct bpf_program *prog;
-	int err;
+	int i, err;
 
 	/* CO-RE relocations need kernel BTF */
 	if (obj->btf_ext && obj->btf_ext->core_relo_info.len)
 		need_vmlinux_btf = true;
 
+	/* Support for typed ksyms needs kernel BTF */
+	for (i = 0; i < obj->nr_extern; i++) {
+		const struct extern_desc *ext;
+
+		ext = &obj->externs[i];
+		if (ext->type == EXT_KSYM && ext->ksym.type_id) {
+			need_vmlinux_btf = true;
+			break;
+		}
+	}
+
 	bpf_object__for_each_program(prog, obj) {
 		if (!prog->load)
 			continue;
@@ -3060,16 +3077,10 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
 				return -ENOTSUP;
 			}
 		} else if (strcmp(sec_name, KSYMS_SEC) == 0) {
-			const struct btf_type *vt;
-
 			ksym_sec = sec;
 			ext->type = EXT_KSYM;
-
-			vt = skip_mods_and_typedefs(obj->btf, t->type, NULL);
-			if (!btf_is_void(vt)) {
-				pr_warn("extern (ksym) '%s' is not typeless (void)\n", ext_name);
-				return -ENOTSUP;
-			}
+			skip_mods_and_typedefs(obj->btf, t->type,
+					       &ext->ksym.type_id);
 		} else {
 			pr_warn("unrecognized extern section '%s'\n", sec_name);
 			return -ENOTSUP;
@@ -3119,6 +3130,8 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
 			vt->type = int_btf_id;
 			vs->offset = off;
 			vs->size = sizeof(int);
+			pr_debug("ksym var_secinfo: var '%s', type #%d, size %d, offset %d\n",
+				 ext->name, vt->type, vs->size, vs->offset);
 		}
 		sec->size = off;
 	}
@@ -5724,8 +5737,13 @@ bpf_program__relocate(struct bpf_program *prog, struct bpf_object *obj)
 				insn[0].imm = obj->maps[obj->kconfig_map_idx].fd;
 				insn[1].imm = ext->kcfg.data_off;
 			} else /* EXT_KSYM */ {
-				insn[0].imm = (__u32)ext->ksym.addr;
-				insn[1].imm = ext->ksym.addr >> 32;
+				if (ext->ksym.type_id) { /* typed ksyms */
+					insn[0].src_reg = BPF_PSEUDO_BTF_ID;
+					insn[0].imm = ext->ksym.vmlinux_btf_id;
+				} else { /* typeless ksyms */
+					insn[0].imm = (__u32)ext->ksym.addr;
+					insn[1].imm = ext->ksym.addr >> 32;
+				}
 			}
 			break;
 		case RELO_CALL:
@@ -6462,10 +6480,72 @@ static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
 	return err;
 }
 
+static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj)
+{
+	struct extern_desc *ext;
+	int i, id;
+
+	if (!obj->btf_vmlinux) {
+		pr_warn("support of typed ksyms needs kernel btf.\n");
+		return -ENOENT;
+	}
+
+	for (i = 0; i < obj->nr_extern; i++) {
+		const struct btf_type *targ_var, *targ_type;
+		__u32 targ_type_id, local_type_id;
+		int ret;
+
+		ext = &obj->externs[i];
+		if (ext->type != EXT_KSYM || !ext->ksym.type_id)
+			continue;
+
+		id = btf__find_by_name_kind(obj->btf_vmlinux, ext->name,
+					    BTF_KIND_VAR);
+		if (id <= 0) {
+			pr_warn("no btf entry for ksym '%s' in vmlinux.\n",
+				ext->name);
+			return -ESRCH;
+		}
+
+		/* find target type_id */
+		targ_var = btf__type_by_id(obj->btf_vmlinux, id);
+		targ_type = skip_mods_and_typedefs(obj->btf_vmlinux,
+						   targ_var->type,
+						   &targ_type_id);
+
+		/* find local type_id */
+		local_type_id = ext->ksym.type_id;
+
+		ret = bpf_core_types_are_compat(obj->btf_vmlinux, targ_type_id,
+						obj->btf, local_type_id);
+		if (ret <= 0) {
+			const struct btf_type *local_type;
+			const char *targ_name, *local_name;
+
+			local_type = btf__type_by_id(obj->btf, local_type_id);
+			targ_name = btf__name_by_offset(obj->btf_vmlinux,
+							targ_type->name_off);
+			local_name = btf__name_by_offset(obj->btf,
+							 local_type->name_off);
+
+			pr_warn("ksym '%s' expects type '%s' (vmlinux_btf_id: #%d), "
+				"but got '%s' (btf_id: #%d)\n", ext->name,
+				targ_name, targ_type_id, local_name, local_type_id);
+			return -EINVAL;
+		}
+
+		ext->is_set = true;
+		ext->ksym.vmlinux_btf_id = id;
+		pr_debug("extern (ksym) %s=vmlinux_btf_id(#%d)\n", ext->name, id);
+	}
+	return 0;
+}
+
 static int bpf_object__resolve_externs(struct bpf_object *obj,
 				       const char *extra_kconfig)
 {
-	bool need_config = false, need_kallsyms = false;
+	bool need_kallsyms = false, need_vmlinux_btf = false;
+	bool need_config = false;
 	struct extern_desc *ext;
 	void *kcfg_data = NULL;
 	int err, i;
@@ -6496,7 +6576,10 @@ static int bpf_object__resolve_externs(struct bpf_object *obj,
 			   strncmp(ext->name, "CONFIG_", 7) == 0) {
 			need_config = true;
 		} else if (ext->type == EXT_KSYM) {
-			need_kallsyms = true;
+			if (ext->ksym.type_id)
+				need_vmlinux_btf = true;
+			else
+				need_kallsyms = true;
 		} else {
 			pr_warn("unrecognized extern '%s'\n", ext->name);
 			return -EINVAL;
@@ -6525,6 +6608,11 @@ static int bpf_object__resolve_externs(struct bpf_object *obj,
 		if (err)
 			return -EINVAL;
 	}
+	if (need_vmlinux_btf) {
+		err = bpf_object__resolve_ksyms_btf_id(obj);
+		if (err)
+			return -EINVAL;
+	}
 	for (i = 0; i < obj->nr_extern; i++) {
 		ext = &obj->externs[i];
 
@@ -6557,10 +6645,10 @@ int bpf_object__load_xattr(struct bpf_object_load_attr *attr)
 	}
 
 	err = bpf_object__probe_loading(obj);
+	err = err ? : bpf_object__load_vmlinux_btf(obj);
 	err = err ? : bpf_object__resolve_externs(obj, obj->kconfig);
 	err = err ? : bpf_object__sanitize_and_load_btf(obj);
 	err = err ? : bpf_object__sanitize_maps(obj);
-	err = err ? : bpf_object__load_vmlinux_btf(obj);
 	err = err ? : bpf_object__init_kern_struct_ops_maps(obj);
 	err = err ? : bpf_object__create_maps(obj);
 	err = err ? : bpf_object__relocate(obj, attr->target_btf_path);
-- 
2.28.0.526.ge36021eeef-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH bpf-next v2 3/6] bpf/selftests: ksyms_btf to test typed ksyms
  2020-09-03 22:33 [PATCH bpf-next v2 0/6] bpf: BTF support for ksyms Hao Luo
  2020-09-03 22:33 ` [PATCH bpf-next v2 1/6] bpf: Introduce pseudo_btf_id Hao Luo
  2020-09-03 22:33 ` [PATCH bpf-next v2 2/6] bpf/libbpf: BTF support for typed ksyms Hao Luo
@ 2020-09-03 22:33 ` Hao Luo
  2020-09-04 19:49   ` Andrii Nakryiko
  2020-09-03 22:33 ` [PATCH bpf-next v2 4/6] bpf: Introduce bpf_per_cpu_ptr() Hao Luo
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Hao Luo @ 2020-09-03 22:33 UTC (permalink / raw)
  To: netdev, bpf, linux-kernel, linux-kselftest
  Cc: Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Hao Luo, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

Selftests for typed ksyms. Tests two types of ksyms: one is a struct,
the other is a plain int. This exercises two paths in the kernel:
struct ksyms are converted into PTR_TO_BTF_ID by the verifier, while
int-typed ksyms are converted into PTR_TO_MEM.

Signed-off-by: Hao Luo <haoluo@google.com>
---
 .../testing/selftests/bpf/prog_tests/ksyms.c  | 31 +++------
 .../selftests/bpf/prog_tests/ksyms_btf.c      | 63 +++++++++++++++++++
 .../selftests/bpf/progs/test_ksyms_btf.c      | 23 +++++++
 tools/testing/selftests/bpf/trace_helpers.c   | 26 ++++++++
 tools/testing/selftests/bpf/trace_helpers.h   |  4 ++
 5 files changed, 123 insertions(+), 24 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_btf.c

diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms.c b/tools/testing/selftests/bpf/prog_tests/ksyms.c
index e3d6777226a8..d51ad2aedf64 100644
--- a/tools/testing/selftests/bpf/prog_tests/ksyms.c
+++ b/tools/testing/selftests/bpf/prog_tests/ksyms.c
@@ -7,39 +7,22 @@
 
 static int duration;
 
-static __u64 kallsyms_find(const char *sym)
-{
-	char type, name[500];
-	__u64 addr, res = 0;
-	FILE *f;
-
-	f = fopen("/proc/kallsyms", "r");
-	if (CHECK(!f, "kallsyms_fopen", "failed to open: %d\n", errno))
-		return 0;
-
-	while (fscanf(f, "%llx %c %499s%*[^\n]\n", &addr, &type, name) > 0) {
-		if (strcmp(name, sym) == 0) {
-			res = addr;
-			goto out;
-		}
-	}
-
-	CHECK(false, "not_found", "symbol %s not found\n", sym);
-out:
-	fclose(f);
-	return res;
-}
-
 void test_ksyms(void)
 {
-	__u64 link_fops_addr = kallsyms_find("bpf_link_fops");
 	const char *btf_path = "/sys/kernel/btf/vmlinux";
 	struct test_ksyms *skel;
 	struct test_ksyms__data *data;
+	__u64 link_fops_addr;
 	struct stat st;
 	__u64 btf_size;
 	int err;
 
+	err = kallsyms_find("bpf_link_fops", &link_fops_addr);
+	if (CHECK(err == -ENOENT, "kallsyms_fopen", "failed to open: %d\n", errno))
+		return;
+	if (CHECK(err == -EINVAL, "ksym_find", "symbol 'bpf_link_fops' not found\n"))
+		return;
+
 	if (CHECK(stat(btf_path, &st), "stat_btf", "err %d\n", errno))
 		return;
 	btf_size = st.st_size;
diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
new file mode 100644
index 000000000000..7b6846342449
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
@@ -0,0 +1,63 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 Google */
+
+#include <test_progs.h>
+#include <bpf/libbpf.h>
+#include <bpf/btf.h>
+#include "test_ksyms_btf.skel.h"
+
+static int duration;
+
+void test_ksyms_btf(void)
+{
+	__u64 runqueues_addr, bpf_prog_active_addr;
+	struct test_ksyms_btf *skel;
+	struct test_ksyms_btf__data *data;
+	struct btf *btf;
+	int percpu_datasec;
+	int err;
+
+	err = kallsyms_find("runqueues", &runqueues_addr);
+	if (CHECK(err == -ENOENT, "kallsyms_fopen", "failed to open: %d\n", errno))
+		return;
+	if (CHECK(err == -EINVAL, "ksym_find", "symbol 'runqueues' not found\n"))
+		return;
+
+	err = kallsyms_find("bpf_prog_active", &bpf_prog_active_addr);
+	if (CHECK(err == -EINVAL, "ksym_find", "symbol 'bpf_prog_active' not found\n"))
+		return;
+
+	btf = libbpf_find_kernel_btf();
+	if (CHECK(IS_ERR(btf), "btf_exists", "failed to load kernel BTF: %ld\n",
+		  PTR_ERR(btf)))
+		return;
+
+	percpu_datasec = btf__find_by_name_kind(btf, ".data..percpu",
+						BTF_KIND_DATASEC);
+	if (percpu_datasec < 0) {
+		printf("%s:SKIP:no PERCPU DATASEC in kernel btf\n",
+		       __func__);
+		test__skip();
+		return;
+	}
+
+	skel = test_ksyms_btf__open_and_load();
+	if (CHECK(!skel, "skel_open", "failed to open and load skeleton\n"))
+		return;
+
+	err = test_ksyms_btf__attach(skel);
+	if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))
+		goto cleanup;
+
+	/* trigger tracepoint */
+	usleep(1);
+
+	data = skel->data;
+	CHECK(data->out__runqueues != runqueues_addr, "runqueues",
+	      "got %llu, exp %llu\n", data->out__runqueues, runqueues_addr);
+	CHECK(data->out__bpf_prog_active != bpf_prog_active_addr, "bpf_prog_active",
+	      "got %llu, exp %llu\n", data->out__bpf_prog_active, bpf_prog_active_addr);
+
+cleanup:
+	test_ksyms_btf__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_ksyms_btf.c b/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
new file mode 100644
index 000000000000..e04e31117f84
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
@@ -0,0 +1,23 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2020 Google */
+
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+
+__u64 out__runqueues = -1;
+__u64 out__bpf_prog_active = -1;
+
+extern const struct rq runqueues __ksym; /* struct type global var. */
+extern const int bpf_prog_active __ksym; /* int type global var. */
+
+SEC("raw_tp/sys_enter")
+int handler(const void *ctx)
+{
+	out__runqueues = (__u64)&runqueues;
+	out__bpf_prog_active = (__u64)&bpf_prog_active;
+
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
index 4d0e913bbb22..ade555fe8294 100644
--- a/tools/testing/selftests/bpf/trace_helpers.c
+++ b/tools/testing/selftests/bpf/trace_helpers.c
@@ -90,6 +90,32 @@ long ksym_get_addr(const char *name)
 	return 0;
 }
 
+/* open kallsyms and read symbol addresses on the fly. Without caching all symbols,
+ * this is faster than load + find. */
+int kallsyms_find(const char *sym, unsigned long long *addr)
+{
+	char type, name[500];
+	unsigned long long value;
+	int err = 0;
+	FILE *f;
+
+	f = fopen("/proc/kallsyms", "r");
+	if (!f)
+		return -ENOENT;
+
+	while (fscanf(f, "%llx %c %499s%*[^\n]\n", &value, &type, name) > 0) {
+		if (strcmp(name, sym) == 0) {
+			*addr = value;
+			goto out;
+		}
+	}
+	err = -EINVAL;
+
+out:
+	fclose(f);
+	return err;
+}
+
 void read_trace_pipe(void)
 {
 	int trace_fd;
diff --git a/tools/testing/selftests/bpf/trace_helpers.h b/tools/testing/selftests/bpf/trace_helpers.h
index 25ef597dd03f..f62fdef9e589 100644
--- a/tools/testing/selftests/bpf/trace_helpers.h
+++ b/tools/testing/selftests/bpf/trace_helpers.h
@@ -12,6 +12,10 @@ struct ksym {
 int load_kallsyms(void);
 struct ksym *ksym_search(long key);
 long ksym_get_addr(const char *name);
+
+/* open kallsyms and find addresses on the fly, faster than load + search. */
+int kallsyms_find(const char *sym, unsigned long long *addr);
+
 void read_trace_pipe(void);
 
 #endif
-- 
2.28.0.526.ge36021eeef-goog


^ permalink raw reply related	[flat|nested] 21+ messages in thread

* [PATCH bpf-next v2 4/6] bpf: Introduce bpf_per_cpu_ptr()
  2020-09-03 22:33 [PATCH bpf-next v2 0/6] bpf: BTF support for ksyms Hao Luo
                   ` (2 preceding siblings ...)
  2020-09-03 22:33 ` [PATCH bpf-next v2 3/6] bpf/selftests: ksyms_btf to test " Hao Luo
@ 2020-09-03 22:33 ` Hao Luo
  2020-09-04 20:04   ` Andrii Nakryiko
  2020-09-03 22:33 ` [PATCH bpf-next v2 5/6] bpf: Introduce bpf_this_cpu_ptr() Hao Luo
  2020-09-03 22:33 ` [PATCH bpf-next v2 6/6] bpf/selftests: Test for bpf_per_cpu_ptr() and bpf_this_cpu_ptr() Hao Luo
  5 siblings, 1 reply; 21+ messages in thread
From: Hao Luo @ 2020-09-03 22:33 UTC (permalink / raw)
  To: netdev, bpf, linux-kernel, linux-kselftest
  Cc: Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Hao Luo, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

Add bpf_per_cpu_ptr() to help bpf programs access percpu vars.
bpf_per_cpu_ptr() has the same semantics as per_cpu_ptr() in the kernel,
except that it may return NULL. This happens when the cpu parameter is
out of range, so the caller must check the returned value.

Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Hao Luo <haoluo@google.com>
---
 include/linux/bpf.h            |  3 ++
 include/linux/btf.h            | 11 ++++++
 include/uapi/linux/bpf.h       | 17 +++++++++
 kernel/bpf/btf.c               | 10 ------
 kernel/bpf/verifier.c          | 66 +++++++++++++++++++++++++++++++---
 kernel/trace/bpf_trace.c       | 18 ++++++++++
 tools/include/uapi/linux/bpf.h | 17 +++++++++
 7 files changed, 128 insertions(+), 14 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index c6d9f2c444f4..6b2034f7665e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -292,6 +292,7 @@ enum bpf_arg_type {
 	ARG_PTR_TO_ALLOC_MEM,	/* pointer to dynamically allocated memory */
 	ARG_PTR_TO_ALLOC_MEM_OR_NULL,	/* pointer to dynamically allocated memory or NULL */
 	ARG_CONST_ALLOC_SIZE_OR_ZERO,	/* number of allocated bytes requested */
+	ARG_PTR_TO_PERCPU_BTF_ID,	/* pointer to in-kernel percpu type */
 };
 
 /* type of values returned from helper functions */
@@ -305,6 +306,7 @@ enum bpf_return_type {
 	RET_PTR_TO_SOCK_COMMON_OR_NULL,	/* returns a pointer to a sock_common or NULL */
 	RET_PTR_TO_ALLOC_MEM_OR_NULL,	/* returns a pointer to dynamically allocated memory or NULL */
 	RET_PTR_TO_BTF_ID_OR_NULL,	/* returns a pointer to a btf_id or NULL */
+	RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL, /* returns a pointer to a valid memory or a btf_id or NULL */
 };
 
 /* eBPF function prototype used by verifier to allow BPF_CALLs from eBPF programs
@@ -385,6 +387,7 @@ enum bpf_reg_type {
 	PTR_TO_RDONLY_BUF_OR_NULL, /* reg points to a readonly buffer or NULL */
 	PTR_TO_RDWR_BUF,	 /* reg points to a read/write buffer */
 	PTR_TO_RDWR_BUF_OR_NULL, /* reg points to a read/write buffer or NULL */
+	PTR_TO_PERCPU_BTF_ID,	 /* reg points to percpu kernel type */
 };
 
 /* The information passed from prog-specific *_is_valid_access
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 592373d359b9..07b7de1c05b0 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -71,6 +71,11 @@ btf_resolve_size(const struct btf *btf, const struct btf_type *type,
 	     i < btf_type_vlen(struct_type);			\
 	     i++, member++)
 
+#define for_each_vsi(i, struct_type, member)			\
+	for (i = 0, member = btf_type_var_secinfo(struct_type);	\
+	     i < btf_type_vlen(struct_type);			\
+	     i++, member++)
+
 static inline bool btf_type_is_ptr(const struct btf_type *t)
 {
 	return BTF_INFO_KIND(t->info) == BTF_KIND_PTR;
@@ -155,6 +160,12 @@ static inline const struct btf_member *btf_type_member(const struct btf_type *t)
 	return (const struct btf_member *)(t + 1);
 }
 
+static inline const struct btf_var_secinfo *btf_type_var_secinfo(
+		const struct btf_type *t)
+{
+	return (const struct btf_var_secinfo *)(t + 1);
+}
+
 #ifdef CONFIG_BPF_SYSCALL
 const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id);
 const char *btf_name_by_offset(const struct btf *btf, u32 offset);
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index ab00ad9b32e5..d0ec94d5bdbf 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3596,6 +3596,22 @@ union bpf_attr {
  * 		the data in *dst*. This is a wrapper of copy_from_user().
  * 	Return
  * 		0 on success, or a negative error in case of failure.
+ *
+ * void *bpf_per_cpu_ptr(const void *percpu_ptr, u32 cpu)
+ *     Description
+ *             Take a pointer to a percpu ksym, *percpu_ptr*, and return a
+ *             pointer to the percpu kernel variable on *cpu*. A ksym is an
+ *             extern variable decorated with '__ksym'. For ksym, there is a
+ *             global var (either static or global) defined of the same name
+ *             in the kernel. The ksym is percpu if the global var is percpu.
+ *             The returned pointer points to the global percpu var on *cpu*.
+ *
+ *             bpf_per_cpu_ptr() has the same semantic as per_cpu_ptr() in the
+ *             kernel, except that bpf_per_cpu_ptr() may return NULL. This
+ *             happens if *cpu* is larger than nr_cpu_ids. The caller of
+ *             bpf_per_cpu_ptr() must check the returned value.
+ *     Return
+ *             A generic pointer pointing to the kernel percpu variable on *cpu*.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -3747,6 +3763,7 @@ union bpf_attr {
 	FN(inode_storage_delete),	\
 	FN(d_path),			\
 	FN(copy_from_user),		\
+	FN(bpf_per_cpu_ptr),            \
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 5831d9f3f3c5..4aa22e31774c 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -188,11 +188,6 @@
 	     i < btf_type_vlen(struct_type);				\
 	     i++, member++)
 
-#define for_each_vsi(i, struct_type, member)			\
-	for (i = 0, member = btf_type_var_secinfo(struct_type);	\
-	     i < btf_type_vlen(struct_type);			\
-	     i++, member++)
-
 #define for_each_vsi_from(i, from, struct_type, member)				\
 	for (i = from, member = btf_type_var_secinfo(struct_type) + from;	\
 	     i < btf_type_vlen(struct_type);					\
@@ -513,11 +508,6 @@ static const struct btf_var *btf_type_var(const struct btf_type *t)
 	return (const struct btf_var *)(t + 1);
 }
 
-static const struct btf_var_secinfo *btf_type_var_secinfo(const struct btf_type *t)
-{
-	return (const struct btf_var_secinfo *)(t + 1);
-}
-
 static const struct btf_kind_operations *btf_type_ops(const struct btf_type *t)
 {
 	return kind_ops[BTF_INFO_KIND(t->info)];
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3b382c080cfd..a702600ff581 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -239,6 +239,7 @@ struct bpf_call_arg_meta {
 	int ref_obj_id;
 	int func_id;
 	u32 btf_id;
+	u32 ret_btf_id;
 };
 
 struct btf *btf_vmlinux;
@@ -504,6 +505,7 @@ static const char * const reg_type_str[] = {
 	[PTR_TO_XDP_SOCK]	= "xdp_sock",
 	[PTR_TO_BTF_ID]		= "ptr_",
 	[PTR_TO_BTF_ID_OR_NULL]	= "ptr_or_null_",
+	[PTR_TO_PERCPU_BTF_ID]	= "percpu_ptr_",
 	[PTR_TO_MEM]		= "mem",
 	[PTR_TO_MEM_OR_NULL]	= "mem_or_null",
 	[PTR_TO_RDONLY_BUF]	= "rdonly_buf",
@@ -570,7 +572,9 @@ static void print_verifier_state(struct bpf_verifier_env *env,
 			/* reg->off should be 0 for SCALAR_VALUE */
 			verbose(env, "%lld", reg->var_off.value + reg->off);
 		} else {
-			if (t == PTR_TO_BTF_ID || t == PTR_TO_BTF_ID_OR_NULL)
+			if (t == PTR_TO_BTF_ID ||
+			    t == PTR_TO_BTF_ID_OR_NULL ||
+			    t == PTR_TO_PERCPU_BTF_ID)
 				verbose(env, "%s", kernel_type_name(reg->btf_id));
 			verbose(env, "(id=%d", reg->id);
 			if (reg_type_may_be_refcounted_or_null(t))
@@ -2184,6 +2188,7 @@ static bool is_spillable_regtype(enum bpf_reg_type type)
 	case PTR_TO_RDONLY_BUF_OR_NULL:
 	case PTR_TO_RDWR_BUF:
 	case PTR_TO_RDWR_BUF_OR_NULL:
+	case PTR_TO_PERCPU_BTF_ID:
 		return true;
 	default:
 		return false;
@@ -4003,6 +4008,15 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
 			if (type != expected_type)
 				goto err_type;
 		}
+	} else if (arg_type == ARG_PTR_TO_PERCPU_BTF_ID) {
+		expected_type = PTR_TO_PERCPU_BTF_ID;
+		if (type != expected_type)
+			goto err_type;
+		if (!reg->btf_id) {
+			verbose(env, "Helper has invalid btf_id in R%d\n", regno);
+			return -EACCES;
+		}
+		meta->ret_btf_id = reg->btf_id;
 	} else if (arg_type == ARG_PTR_TO_BTF_ID) {
 		bool ids_match = false;
 
@@ -5002,6 +5016,30 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
 		regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL;
 		regs[BPF_REG_0].id = ++env->id_gen;
 		regs[BPF_REG_0].mem_size = meta.mem_size;
+	} else if (fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL) {
+		const struct btf_type *t;
+
+		mark_reg_known_zero(env, regs, BPF_REG_0);
+		t = btf_type_skip_modifiers(btf_vmlinux, meta.ret_btf_id, NULL);
+		if (!btf_type_is_struct(t)) {
+			u32 tsize;
+			const struct btf_type *ret;
+			const char *tname;
+
+			/* resolve the type size of ksym. */
+			ret = btf_resolve_size(btf_vmlinux, t, &tsize);
+			if (IS_ERR(ret)) {
+				tname = btf_name_by_offset(btf_vmlinux, t->name_off);
+				verbose(env, "unable to resolve the size of type '%s': %ld\n",
+					tname, PTR_ERR(ret));
+				return -EINVAL;
+			}
+			regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL;
+			regs[BPF_REG_0].mem_size = tsize;
+		} else {
+			regs[BPF_REG_0].type = PTR_TO_BTF_ID_OR_NULL;
+			regs[BPF_REG_0].btf_id = meta.ret_btf_id;
+		}
 	} else if (fn->ret_type == RET_PTR_TO_BTF_ID_OR_NULL) {
 		int ret_btf_id;
 
@@ -7407,7 +7445,8 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
 		regs[insn->dst_reg].type = type;
 		if (type == PTR_TO_MEM) {
 			regs[insn->dst_reg].mem_size = meta;
-		} else if (type == PTR_TO_BTF_ID) {
+		} else if (type == PTR_TO_BTF_ID ||
+			   type == PTR_TO_PERCPU_BTF_ID) {
 			regs[insn->dst_reg].btf_id = meta;
 		} else {
 			verbose(env, "bpf verifier is misconfigured\n");
@@ -9307,10 +9346,14 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env,
 			       struct bpf_insn *insn,
 			       struct bpf_insn_aux_data *aux)
 {
-	u32 type, id = insn->imm;
+	u32 datasec_id, type, id = insn->imm;
+	const struct btf_var_secinfo *vsi;
+	const struct btf_type *datasec;
 	const struct btf_type *t;
 	const char *sym_name;
+	bool percpu = false;
 	u64 addr;
+	int i;
 
 	if (!btf_vmlinux) {
 		verbose(env, "%s: btf not available. verifier misconfigured.\n",
@@ -9343,12 +9386,27 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env,
 		return -ENOENT;
 	}
 
+	datasec_id = btf_find_by_name_kind(btf_vmlinux, ".data..percpu",
+					   BTF_KIND_DATASEC);
+	if (datasec_id > 0) {
+		datasec = btf_type_by_id(btf_vmlinux, datasec_id);
+		for_each_vsi(i, datasec, vsi) {
+			if (vsi->type == id) {
+				percpu = true;
+				break;
+			}
+		}
+	}
+
 	insn[0].imm = (u32)addr;
 	insn[1].imm = addr >> 32;
 
 	type = t->type;
 	t = btf_type_skip_modifiers(btf_vmlinux, type, NULL);
-	if (!btf_type_is_struct(t)) {
+	if (percpu) {
+		aux->pseudo_btf_id_type = PTR_TO_PERCPU_BTF_ID;
+		aux->pseudo_btf_id_meta = type;
+	} else if (!btf_type_is_struct(t)) {
 		const struct btf_type *ret;
 		const char *tname;
 		u32 tsize;
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index b2a5380eb187..d474c1530f87 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1144,6 +1144,22 @@ static const struct bpf_func_proto bpf_d_path_proto = {
 	.allowed	= bpf_d_path_allowed,
 };
 
+BPF_CALL_2(bpf_per_cpu_ptr, const void *, ptr, u32, cpu)
+{
+	if (cpu >= nr_cpu_ids)
+		return 0;
+
+	return (u64)per_cpu_ptr(ptr, cpu);
+}
+
+static const struct bpf_func_proto bpf_per_cpu_ptr_proto = {
+	.func		= bpf_per_cpu_ptr,
+	.gpl_only	= false,
+	.ret_type	= RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL,
+	.arg1_type	= ARG_PTR_TO_PERCPU_BTF_ID,
+	.arg2_type	= ARG_ANYTHING,
+};
+
 const struct bpf_func_proto *
 bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -1230,6 +1246,8 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return &bpf_get_task_stack_proto;
 	case BPF_FUNC_copy_from_user:
 		return prog->aux->sleepable ? &bpf_copy_from_user_proto : NULL;
+	case BPF_FUNC_bpf_per_cpu_ptr:
+		return &bpf_per_cpu_ptr_proto;
 	default:
 		return NULL;
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index ab00ad9b32e5..d0ec94d5bdbf 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3596,6 +3596,22 @@ union bpf_attr {
  * 		the data in *dst*. This is a wrapper of copy_from_user().
  * 	Return
  * 		0 on success, or a negative error in case of failure.
+ *
+ * void *bpf_per_cpu_ptr(const void *percpu_ptr, u32 cpu)
+ *     Description
+ *             Take a pointer to a percpu ksym, *percpu_ptr*, and return a
+ *             pointer to the percpu kernel variable on *cpu*. A ksym is an
+ *             extern variable decorated with '__ksym'. For ksym, there is a
+ *             global var (either static or global) defined of the same name
+ *             in the kernel. The ksym is percpu if the global var is percpu.
+ *             The returned pointer points to the global percpu var on *cpu*.
+ *
+ *             bpf_per_cpu_ptr() has the same semantic as per_cpu_ptr() in the
+ *             kernel, except that bpf_per_cpu_ptr() may return NULL. This
+ *             happens if *cpu* is larger than nr_cpu_ids. The caller of
+ *             bpf_per_cpu_ptr() must check the returned value.
+ *     Return
+ *             A generic pointer pointing to the kernel percpu variable on *cpu*.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -3747,6 +3763,7 @@ union bpf_attr {
 	FN(inode_storage_delete),	\
 	FN(d_path),			\
 	FN(copy_from_user),		\
+	FN(bpf_per_cpu_ptr),            \
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.28.0.526.ge36021eeef-goog



* [PATCH bpf-next v2 5/6] bpf: Introduce bpf_this_cpu_ptr()
  2020-09-03 22:33 [PATCH bpf-next v2 0/6] bpf: BTF support for ksyms Hao Luo
                   ` (3 preceding siblings ...)
  2020-09-03 22:33 ` [PATCH bpf-next v2 4/6] bpf: Introduce bpf_per_cpu_ptr() Hao Luo
@ 2020-09-03 22:33 ` Hao Luo
  2020-09-04 20:09   ` Andrii Nakryiko
  2020-09-03 22:33 ` [PATCH bpf-next v2 6/6] bpf/selftests: Test for bpf_per_cpu_ptr() and bpf_this_cpu_ptr() Hao Luo
  5 siblings, 1 reply; 21+ messages in thread
From: Hao Luo @ 2020-09-03 22:33 UTC (permalink / raw)
  To: netdev, bpf, linux-kernel, linux-kselftest
  Cc: Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Hao Luo, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

Add bpf_this_cpu_ptr() to help bpf programs access a percpu var on the
current cpu. This helper always returns a valid pointer, so there is no
need to check the returned value for NULL. Also note that all BPF
programs run with preemption disabled, which means the returned pointer
is stable for the entire execution of the program.

Signed-off-by: Hao Luo <haoluo@google.com>
---
 include/linux/bpf.h            |  1 +
 include/uapi/linux/bpf.h       | 14 ++++++++++++++
 kernel/bpf/verifier.c          | 10 +++++++---
 kernel/trace/bpf_trace.c       | 14 ++++++++++++++
 tools/include/uapi/linux/bpf.h | 14 ++++++++++++++
 5 files changed, 50 insertions(+), 3 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 6b2034f7665e..506fdd5d0463 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -307,6 +307,7 @@ enum bpf_return_type {
 	RET_PTR_TO_ALLOC_MEM_OR_NULL,	/* returns a pointer to dynamically allocated memory or NULL */
 	RET_PTR_TO_BTF_ID_OR_NULL,	/* returns a pointer to a btf_id or NULL */
 	RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL, /* returns a pointer to a valid memory or a btf_id or NULL */
+	RET_PTR_TO_MEM_OR_BTF_ID,	/* returns a pointer to a valid memory or a btf_id */
 };
 
 /* eBPF function prototype used by verifier to allow BPF_CALLs from eBPF programs
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index d0ec94d5bdbf..e7ca91c697ed 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -3612,6 +3612,19 @@ union bpf_attr {
  *             bpf_per_cpu_ptr() must check the returned value.
  *     Return
  *             A generic pointer pointing to the kernel percpu variable on *cpu*.
+ *
+ * void *bpf_this_cpu_ptr(const void *percpu_ptr)
+ *	Description
+ *		Take a pointer to a percpu ksym, *percpu_ptr*, and return a
+ *		pointer to the percpu kernel variable on this cpu. See the
+ *		description of 'ksym' in **bpf_per_cpu_ptr**\ ().
+ *
+ *		bpf_this_cpu_ptr() has the same semantic as this_cpu_ptr() in
+ *		the kernel. Different from **bpf_per_cpu_ptr**\ (), it would
+ *		never return NULL.
+ *	Return
+ *		A generic pointer pointing to the kernel percpu variable on
+ *		this cpu.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -3764,6 +3777,7 @@ union bpf_attr {
 	FN(d_path),			\
 	FN(copy_from_user),		\
 	FN(bpf_per_cpu_ptr),            \
+	FN(bpf_this_cpu_ptr),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index a702600ff581..e070d2abc405 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5016,8 +5016,10 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
 		regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL;
 		regs[BPF_REG_0].id = ++env->id_gen;
 		regs[BPF_REG_0].mem_size = meta.mem_size;
-	} else if (fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL) {
+	} else if (fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL ||
+		   fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID) {
 		const struct btf_type *t;
+		bool not_null = fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID;
 
 		mark_reg_known_zero(env, regs, BPF_REG_0);
 		t = btf_type_skip_modifiers(btf_vmlinux, meta.ret_btf_id, NULL);
@@ -5034,10 +5036,12 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
 					tname, PTR_ERR(ret));
 				return -EINVAL;
 			}
-			regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL;
+			regs[BPF_REG_0].type = not_null ?
+				PTR_TO_MEM : PTR_TO_MEM_OR_NULL;
 			regs[BPF_REG_0].mem_size = tsize;
 		} else {
-			regs[BPF_REG_0].type = PTR_TO_BTF_ID_OR_NULL;
+			regs[BPF_REG_0].type = not_null ?
+				PTR_TO_BTF_ID : PTR_TO_BTF_ID_OR_NULL;
 			regs[BPF_REG_0].btf_id = meta.ret_btf_id;
 		}
 	} else if (fn->ret_type == RET_PTR_TO_BTF_ID_OR_NULL) {
diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
index d474c1530f87..466acf82a9c7 100644
--- a/kernel/trace/bpf_trace.c
+++ b/kernel/trace/bpf_trace.c
@@ -1160,6 +1160,18 @@ static const struct bpf_func_proto bpf_per_cpu_ptr_proto = {
 	.arg2_type	= ARG_ANYTHING,
 };
 
+BPF_CALL_1(bpf_this_cpu_ptr, const void *, percpu_ptr)
+{
+	return (u64)this_cpu_ptr(percpu_ptr);
+}
+
+static const struct bpf_func_proto bpf_this_cpu_ptr_proto = {
+	.func		= bpf_this_cpu_ptr,
+	.gpl_only	= false,
+	.ret_type	= RET_PTR_TO_MEM_OR_BTF_ID,
+	.arg1_type	= ARG_PTR_TO_PERCPU_BTF_ID,
+};
+
 const struct bpf_func_proto *
 bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 {
@@ -1248,6 +1260,8 @@ bpf_tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
 		return prog->aux->sleepable ? &bpf_copy_from_user_proto : NULL;
 	case BPF_FUNC_bpf_per_cpu_ptr:
 		return &bpf_per_cpu_ptr_proto;
+	case BPF_FUNC_bpf_this_cpu_ptr:
+		return &bpf_this_cpu_ptr_proto;
 	default:
 		return NULL;
 	}
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index d0ec94d5bdbf..e7ca91c697ed 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -3612,6 +3612,19 @@ union bpf_attr {
  *             bpf_per_cpu_ptr() must check the returned value.
  *     Return
  *             A generic pointer pointing to the kernel percpu variable on *cpu*.
+ *
+ * void *bpf_this_cpu_ptr(const void *percpu_ptr)
+ *	Description
+ *		Take a pointer to a percpu ksym, *percpu_ptr*, and return a
+ *		pointer to the percpu kernel variable on this cpu. See the
+ *		description of 'ksym' in **bpf_per_cpu_ptr**\ ().
+ *
+ *		bpf_this_cpu_ptr() has the same semantic as this_cpu_ptr() in
+ *		the kernel. Different from **bpf_per_cpu_ptr**\ (), it would
+ *		never return NULL.
+ *	Return
+ *		A generic pointer pointing to the kernel percpu variable on
+ *		this cpu.
  */
 #define __BPF_FUNC_MAPPER(FN)		\
 	FN(unspec),			\
@@ -3764,6 +3777,7 @@ union bpf_attr {
 	FN(d_path),			\
 	FN(copy_from_user),		\
 	FN(bpf_per_cpu_ptr),            \
+	FN(bpf_this_cpu_ptr),		\
 	/* */
 
 /* integer value in 'imm' field of BPF_CALL instruction selects which helper
-- 
2.28.0.526.ge36021eeef-goog



* [PATCH bpf-next v2 6/6] bpf/selftests: Test for bpf_per_cpu_ptr() and bpf_this_cpu_ptr()
  2020-09-03 22:33 [PATCH bpf-next v2 0/6] bpf: BTF support for ksyms Hao Luo
                   ` (4 preceding siblings ...)
  2020-09-03 22:33 ` [PATCH bpf-next v2 5/6] bpf: Introduce bpf_this_cpu_ptr() Hao Luo
@ 2020-09-03 22:33 ` Hao Luo
  2020-09-04 20:15   ` Andrii Nakryiko
  5 siblings, 1 reply; 21+ messages in thread
From: Hao Luo @ 2020-09-03 22:33 UTC (permalink / raw)
  To: netdev, bpf, linux-kernel, linux-kselftest
  Cc: Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Hao Luo, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

Test bpf_per_cpu_ptr() and bpf_this_cpu_ptr(). Two paths in the kernel
are exercised. If the base pointer points to a struct, the returned reg
is of type PTR_TO_BTF_ID, and direct pointer dereference can be applied
to the returned variable. If the base pointer isn't a struct, the
returned reg is of type PTR_TO_MEM, which also supports direct pointer
dereference.

Acked-by: Andrii Nakryiko <andriin@fb.com>
Signed-off-by: Hao Luo <haoluo@google.com>
---
 .../selftests/bpf/prog_tests/ksyms_btf.c      | 10 +++++++
 .../selftests/bpf/progs/test_ksyms_btf.c      | 26 +++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
index 7b6846342449..22cc642dbc0e 100644
--- a/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
+++ b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
@@ -58,6 +58,16 @@ void test_ksyms_btf(void)
 	CHECK(data->out__bpf_prog_active != bpf_prog_active_addr, "bpf_prog_active",
 	      "got %llu, exp %llu\n", data->out__bpf_prog_active, bpf_prog_active_addr);
 
+	CHECK(data->out__rq_cpu == -1, "rq_cpu",
+	      "got %u, exp != -1\n", data->out__rq_cpu);
+	CHECK(data->out__percpu_bpf_prog_active == -1, "percpu_bpf_prog_active",
+	      "got %d, exp != -1\n", data->out__percpu_bpf_prog_active);
+
+	CHECK(data->out__this_rq_cpu == -1, "this_rq_cpu",
+	      "got %u, exp != -1\n", data->out__this_rq_cpu);
+	CHECK(data->out__this_bpf_prog_active == -1, "this_bpf_prog_active",
+	      "got %d, exp != -1\n", data->out__this_bpf_prog_active);
+
 cleanup:
 	test_ksyms_btf__destroy(skel);
 }
diff --git a/tools/testing/selftests/bpf/progs/test_ksyms_btf.c b/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
index e04e31117f84..02d564349892 100644
--- a/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
+++ b/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
@@ -8,15 +8,41 @@
 __u64 out__runqueues = -1;
 __u64 out__bpf_prog_active = -1;
 
+__u32 out__rq_cpu = -1; /* percpu struct fields */
+int out__percpu_bpf_prog_active = -1; /* percpu int */
+
+__u32 out__this_rq_cpu = -1;
+int out__this_bpf_prog_active = -1;
+
 extern const struct rq runqueues __ksym; /* struct type global var. */
 extern const int bpf_prog_active __ksym; /* int type global var. */
 
 SEC("raw_tp/sys_enter")
 int handler(const void *ctx)
 {
+	struct rq *rq;
+	int *active;
+	__u32 cpu;
+
 	out__runqueues = (__u64)&runqueues;
 	out__bpf_prog_active = (__u64)&bpf_prog_active;
 
+	cpu = bpf_get_smp_processor_id();
+
+	/* test bpf_per_cpu_ptr() */
+	rq = (struct rq *)bpf_per_cpu_ptr(&runqueues, cpu);
+	if (rq)
+		out__rq_cpu = rq->cpu;
+	active = (int *)bpf_per_cpu_ptr(&bpf_prog_active, cpu);
+	if (active)
+		out__percpu_bpf_prog_active = *active;
+
+	/* test bpf_this_cpu_ptr */
+	rq = (struct rq *)bpf_this_cpu_ptr(&runqueues);
+	out__this_rq_cpu = rq->cpu;
+	active = (int *)bpf_this_cpu_ptr(&bpf_prog_active);
+	out__this_bpf_prog_active = *active;
+
 	return 0;
 }
 
-- 
2.28.0.526.ge36021eeef-goog



* Re: [PATCH bpf-next v2 1/6] bpf: Introduce pseudo_btf_id
  2020-09-03 22:33 ` [PATCH bpf-next v2 1/6] bpf: Introduce pseudo_btf_id Hao Luo
@ 2020-09-04 19:05   ` Andrii Nakryiko
  2020-09-14  4:55     ` Hao Luo
  0 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2020-09-04 19:05 UTC (permalink / raw)
  To: Hao Luo
  Cc: Networking, bpf, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

On Thu, Sep 3, 2020 at 3:34 PM Hao Luo <haoluo@google.com> wrote:
>
> Pseudo_btf_id is a type of ld_imm insn that associates a btf_id to a
> ksym so that further dereferences on the ksym can use the BTF info
> to validate accesses. Internally, when seeing a pseudo_btf_id ld insn,
> the verifier reads the btf_id stored in the insn[0]'s imm field and
> marks the dst_reg as PTR_TO_BTF_ID. The btf_id points to a VAR_KIND,
> which is encoded in btf_vminux by pahole. If the VAR is not of a struct
> type, the dst reg will be marked as PTR_TO_MEM instead of PTR_TO_BTF_ID
> and the mem_size is resolved to the size of the VAR's type.
>
> From the VAR btf_id, the verifier can also read the address of the
> ksym's corresponding kernel var from kallsyms and use that to fill
> dst_reg.
>
> Therefore, the proper functionality of pseudo_btf_id depends on (1)
> kallsyms and (2) the encoding of kernel global VARs in pahole, which
> should be available since pahole v1.18.
>
> Signed-off-by: Hao Luo <haoluo@google.com>
> ---

Logic looks correct, but I have a few suggestions for naming things
and verifier logs. Please see below.

>  include/linux/bpf_verifier.h   |   4 ++
>  include/linux/btf.h            |  15 +++++
>  include/uapi/linux/bpf.h       |  38 ++++++++---
>  kernel/bpf/btf.c               |  15 -----
>  kernel/bpf/verifier.c          | 112 ++++++++++++++++++++++++++++++---
>  tools/include/uapi/linux/bpf.h |  38 ++++++++---
>  6 files changed, 182 insertions(+), 40 deletions(-)
>
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index 53c7bd568c5d..a14063f64d96 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -308,6 +308,10 @@ struct bpf_insn_aux_data {
>                         u32 map_index;          /* index into used_maps[] */
>                         u32 map_off;            /* offset from value base address */
>                 };
> +               struct {
> +                       u32 pseudo_btf_id_type; /* type of pseudo_btf_id */
> +                       u32 pseudo_btf_id_meta; /* memsize or btf_id */

a bit misleading names, not clear at all in the code what's in there.
This section is for ldimm64 insns that are loading BTF variables,
right? so how about this:

struct {
    u32 reg_type;
    union {
        u32 btf_id;
        u32 memsize;
    };
} btf_var;

In case someone hates non-anonymous structs, I'd still go with
btf_var_reg_type, btf_var_btf_id and btf_var_memsize.

> +               };
>         };
>         u64 map_key_state; /* constant (32 bit) key tracking for maps */
>         int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
> diff --git a/include/linux/btf.h b/include/linux/btf.h
> index a9af5e7a7ece..592373d359b9 100644
> --- a/include/linux/btf.h
> +++ b/include/linux/btf.h
> @@ -106,6 +106,21 @@ static inline bool btf_type_is_func_proto(const struct btf_type *t)
>         return BTF_INFO_KIND(t->info) == BTF_KIND_FUNC_PROTO;
>  }
>

[...]

> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index b4e9c56b8b32..3b382c080cfd 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -7398,6 +7398,24 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
>                 return 0;
>         }
>
> +       if (insn->src_reg == BPF_PSEUDO_BTF_ID) {
> +               u32 type = aux->pseudo_btf_id_type;
> +               u32 meta = aux->pseudo_btf_id_meta;
> +
> +               mark_reg_known_zero(env, regs, insn->dst_reg);
> +
> +               regs[insn->dst_reg].type = type;
> +               if (type == PTR_TO_MEM) {
> +                       regs[insn->dst_reg].mem_size = meta;
> +               } else if (type == PTR_TO_BTF_ID) {
> +                       regs[insn->dst_reg].btf_id = meta;

nit: probably worthwhile to introduce a local variable (dst_reg) to
capture pointer to regs[insn->dst_reg] in this entire function. Then
no real need for type and meta local vars above, everything is going
to be short and sweet.

> +               } else {
> +                       verbose(env, "bpf verifier is misconfigured\n");
> +                       return -EFAULT;
> +               }
> +               return 0;
> +       }
> +
>         map = env->used_maps[aux->map_index];
>         mark_reg_known_zero(env, regs, insn->dst_reg);
>         regs[insn->dst_reg].map_ptr = map;
> @@ -9284,6 +9302,74 @@ static int do_check(struct bpf_verifier_env *env)
>         return 0;
>  }
>
> +/* replace pseudo btf_id with kernel symbol address */
> +static int check_pseudo_btf_id(struct bpf_verifier_env *env,
> +                              struct bpf_insn *insn,
> +                              struct bpf_insn_aux_data *aux)
> +{
> +       u32 type, id = insn->imm;
> +       const struct btf_type *t;
> +       const char *sym_name;
> +       u64 addr;
> +
> +       if (!btf_vmlinux) {
> +               verbose(env, "%s: btf not available. verifier misconfigured.\n",
> +                       __func__);

verifier logs so far haven't used __func__; it's not all that
meaningful to users and might change due to later refactorings.

Also, in this particular case, it's not a verifier misconfiguration,
but rather that the kernel doesn't have a built-in BTF, so suggest
enabling CONFIG_DEBUG_INFO_BTF=y.

"kernel is missing BTF, make sure CONFIG_DEBUG_INFO_BTF=y is specified
in Kconfig"

> +               return -EINVAL;
> +       }
> +
> +       t = btf_type_by_id(btf_vmlinux, id);
> +       if (!t) {
> +               verbose(env, "%s: invalid btf_id %d\n", __func__, id);

"ldimm64 insn specifies invalid btf_id %d"? add similar context below

> +               return -ENOENT;
> +       }
> +
> +       if (insn[1].imm != 0) {
> +               verbose(env, "%s: BPF_PSEUDO_BTF_ID uses reserved fields\n",
> +                       __func__);
> +               return -EINVAL;
> +       }
> +
> +       if (!btf_type_is_var(t)) {
> +               verbose(env, "%s: btf_id %d isn't KIND_VAR\n", __func__, id);
> +               return -EINVAL;
> +       }
> +
> +       sym_name = btf_name_by_offset(btf_vmlinux, t->name_off);
> +       addr = kallsyms_lookup_name(sym_name);
> +       if (!addr) {
> +               verbose(env, "%s: failed to find the address of symbol '%s'.\n",
> +                       __func__, sym_name);
> +               return -ENOENT;
> +       }
> +

[...]

> @@ -9394,10 +9480,14 @@ static bool bpf_map_is_cgroup_storage(struct bpf_map *map)
>                 map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE);
>  }
>
> -/* look for pseudo eBPF instructions that access map FDs and
> - * replace them with actual map pointers
> +/* find and rewrite pseudo imm in ld_imm64 instructions:
> + *
> + * 1. if it accesses map FD, replace it with actual map pointer.
> + * 2. if it accesses btf_id of a VAR, replace it with pointer to the var.
> + *
> + * NOTE: btf_vmlinux is required for converting pseudo btf_id.
>   */
> -static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env)
> +static int replace_pseudo_imm_with_ptr(struct bpf_verifier_env *env)

nit: I'd call this something like "resolve_pseudo_ldimm64" instead.
"ptr" here is an implementation detail of the map-pointer ldimm64 and
doesn't even match what we are doing for pseudo_btf_id.

>  {
>         struct bpf_insn *insn = env->prog->insnsi;
>         int insn_cnt = env->prog->len;
> @@ -9438,6 +9528,14 @@ static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env)
>                                 /* valid generic load 64-bit imm */
>                                 goto next_insn;
>
> +                       if (insn[0].src_reg == BPF_PSEUDO_BTF_ID) {
> +                               aux = &env->insn_aux_data[i];
> +                               err = check_pseudo_btf_id(env, insn, aux);
> +                               if (err)
> +                                       return err;
> +                               goto next_insn;
> +                       }
> +
>                         /* In final convert_pseudo_ld_imm64() step, this is
>                          * converted into regular 64-bit imm load insn.
>                          */

[...]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH bpf-next v2 2/6] bpf/libbpf: BTF support for typed ksyms
  2020-09-03 22:33 ` [PATCH bpf-next v2 2/6] bpf/libbpf: BTF support for typed ksyms Hao Luo
@ 2020-09-04 19:34   ` Andrii Nakryiko
  2020-09-14  4:56     ` Hao Luo
  0 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2020-09-04 19:34 UTC (permalink / raw)
  To: Hao Luo
  Cc: Networking, bpf, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

On Thu, Sep 3, 2020 at 3:34 PM Hao Luo <haoluo@google.com> wrote:
>
> If a ksym is defined with a type, libbpf will try to find the ksym's btf
> information from kernel btf. If a valid btf entry for the ksym is found,
> libbpf can pass in the found btf id to the verifier, which validates the
> ksym's type and value.
>
> Typeless ksyms (i.e. those defined as 'void') will not have such a btf_id,
> but they have the symbol's address (read from kallsyms) and their value is
> treated as a raw pointer.
>
> Signed-off-by: Hao Luo <haoluo@google.com>
> ---

Logic looks correct, but I have complaints about libbpf logging
consistency, please see suggestions below.

>  tools/lib/bpf/libbpf.c | 116 ++++++++++++++++++++++++++++++++++++-----
>  1 file changed, 102 insertions(+), 14 deletions(-)
>

[...]

> @@ -3119,6 +3130,8 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
>                         vt->type = int_btf_id;
>                         vs->offset = off;
>                         vs->size = sizeof(int);
> +                       pr_debug("ksym var_secinfo: var '%s', type #%d, size %d, offset %d\n",
> +                                ext->name, vt->type, vs->size, vs->offset);

debug leftover?

>                 }
>                 sec->size = off;
>         }
> @@ -5724,8 +5737,13 @@ bpf_program__relocate(struct bpf_program *prog, struct bpf_object *obj)
>                                 insn[0].imm = obj->maps[obj->kconfig_map_idx].fd;
>                                 insn[1].imm = ext->kcfg.data_off;
>                         } else /* EXT_KSYM */ {
> -                               insn[0].imm = (__u32)ext->ksym.addr;
> -                               insn[1].imm = ext->ksym.addr >> 32;
> +                               if (ext->ksym.type_id) { /* typed ksyms */
> +                                       insn[0].src_reg = BPF_PSEUDO_BTF_ID;
> +                                       insn[0].imm = ext->ksym.vmlinux_btf_id;
> +                               } else { /* typeless ksyms */
> +                                       insn[0].imm = (__u32)ext->ksym.addr;
> +                                       insn[1].imm = ext->ksym.addr >> 32;
> +                               }
>                         }
>                         break;
>                 case RELO_CALL:
> @@ -6462,10 +6480,72 @@ static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
>         return err;
>  }
>
> +static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj)
> +{
> +       struct extern_desc *ext;
> +       int i, id;
> +
> +       if (!obj->btf_vmlinux) {
> +               pr_warn("support of typed ksyms needs kernel btf.\n");
> +               return -ENOENT;
> +       }

This check shouldn't be needed, you'd either successfully load
btf_vmlinux by now or will fail earlier, because BTF is required but
not found.

> +
> +       for (i = 0; i < obj->nr_extern; i++) {
> +               const struct btf_type *targ_var, *targ_type;
> +               __u32 targ_type_id, local_type_id;
> +               int ret;
> +
> +               ext = &obj->externs[i];
> +               if (ext->type != EXT_KSYM || !ext->ksym.type_id)
> +                       continue;
> +
> +               id = btf__find_by_name_kind(obj->btf_vmlinux, ext->name,
> +                                           BTF_KIND_VAR);
> +               if (id <= 0) {
> +                       pr_warn("no btf entry for ksym '%s' in vmlinux.\n",
> +                               ext->name);

please try to stick to a consistent style of log messages:

"extern (ksym) '%s': failed to find BTF ID in vmlinux BTF" or
something like that


> +                       return -ESRCH;
> +               }
> +
> +               /* find target type_id */
> +               targ_var = btf__type_by_id(obj->btf_vmlinux, id);
> +               targ_type = skip_mods_and_typedefs(obj->btf_vmlinux,
> +                                                  targ_var->type,
> +                                                  &targ_type_id);
> +
> +               /* find local type_id */
> +               local_type_id = ext->ksym.type_id;
> +
> +               ret = bpf_core_types_are_compat(obj->btf_vmlinux, targ_type_id,
> +                                               obj->btf, local_type_id);

you reversed the order, it's always local btf/id, then target btf/id.

> +               if (ret <= 0) {
> +                       const struct btf_type *local_type;
> +                       const char *targ_name, *local_name;
> +
> +                       local_type = btf__type_by_id(obj->btf, local_type_id);
> +                       targ_name = btf__name_by_offset(obj->btf_vmlinux,
> +                                                       targ_type->name_off);
> +                       local_name = btf__name_by_offset(obj->btf,
> +                                                        local_type->name_off);

it's a bit unfortunate that we get the name of an already resolved
type, because if you have a typedef to anon struct, this will give you
an empty string. I don't know how much of a problem that would be, so
I think it's fine to leave it as is, and fix it if it's a problem in
practice.

> +
> +                       pr_warn("ksym '%s' expects type '%s' (vmlinux_btf_id: #%d), "
> +                               "but got '%s' (btf_id: #%d)\n", ext->name,
> +                               targ_name, targ_type_id, local_name, local_type_id);

same thing, please stay consistent in logging format. Check
bpf_core_dump_spec() for how BTF type info is usually emitted
throughout libbpf:

"extern (ksym): incompatible types, expected [%d] %s %s, but kernel
has [%d] %s %s\n"

there is a btf_kind_str() helper to resolve kind to a string representation.


> +                       return -EINVAL;
> +               }
> +
> +               ext->is_set = true;
> +               ext->ksym.vmlinux_btf_id = id;
> +               pr_debug("extern (ksym) %s=vmlinux_btf_id(#%d)\n", ext->name, id);

"extern (ksym) '%s': resolved to [%d] %s %s\n", similar to above
suggestion. This "[%d]" format is very consistently used for BTF IDs
throughout, so it will be familiar and recognizable for people that
had to deal with this in libbpf logs.

> +       }
> +       return 0;
> +}
> +
>  static int bpf_object__resolve_externs(struct bpf_object *obj,
>                                        const char *extra_kconfig)
>  {
> -       bool need_config = false, need_kallsyms = false;
> +       bool need_kallsyms = false, need_vmlinux_btf = false;
> +       bool need_config = false;

nit: doesn't make sense to change the existing source code line at
all. Just add `bool need_vmlinux_btf = false;` on a new line? Or we
can split all these bools into 3 separate lines, if you prefer.

>         struct extern_desc *ext;
>         void *kcfg_data = NULL;
>         int err, i;

[...]

* Re: [PATCH bpf-next v2 3/6] bpf/selftests: ksyms_btf to test typed ksyms
  2020-09-03 22:33 ` [PATCH bpf-next v2 3/6] bpf/selftests: ksyms_btf to test " Hao Luo
@ 2020-09-04 19:49   ` Andrii Nakryiko
  2020-09-14  4:58     ` Hao Luo
  0 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2020-09-04 19:49 UTC (permalink / raw)
  To: Hao Luo
  Cc: Networking, bpf, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

On Thu, Sep 3, 2020 at 3:35 PM Hao Luo <haoluo@google.com> wrote:
>
> Selftests for typed ksyms. Tests two types of ksyms: one is a struct,
> the other is a plain int. This tests two paths in the kernel. Struct
> ksyms will be converted into PTR_TO_BTF_ID by the verifier while int
> typed ksyms will be converted into PTR_TO_MEM.
>
> Signed-off-by: Hao Luo <haoluo@google.com>
> ---
>  .../testing/selftests/bpf/prog_tests/ksyms.c  | 31 +++------
>  .../selftests/bpf/prog_tests/ksyms_btf.c      | 63 +++++++++++++++++++
>  .../selftests/bpf/progs/test_ksyms_btf.c      | 23 +++++++
>  tools/testing/selftests/bpf/trace_helpers.c   | 26 ++++++++
>  tools/testing/selftests/bpf/trace_helpers.h   |  4 ++
>  5 files changed, 123 insertions(+), 24 deletions(-)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_btf.c
>

[...]

> diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
> new file mode 100644
> index 000000000000..7b6846342449
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
> @@ -0,0 +1,63 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2020 Google */
> +
> +#include <test_progs.h>
> +#include <bpf/libbpf.h>
> +#include <bpf/btf.h>
> +#include "test_ksyms_btf.skel.h"
> +
> +static int duration;
> +
> +void test_ksyms_btf(void)
> +{
> +       __u64 runqueues_addr, bpf_prog_active_addr;
> +       struct test_ksyms_btf *skel;
> +       struct test_ksyms_btf__data *data;
> +       struct btf *btf;
> +       int percpu_datasec;
> +       int err;
> +
> +       err = kallsyms_find("runqueues", &runqueues_addr);
> +       if (CHECK(err == -ENOENT, "kallsyms_fopen", "failed to open: %d\n", errno))
> +               return;
> +       if (CHECK(err == -EINVAL, "ksym_find", "symbol 'runqueues' not found\n"))
> +               return;
> +
> +       err = kallsyms_find("bpf_prog_active", &bpf_prog_active_addr);
> +       if (CHECK(err == -EINVAL, "ksym_find", "symbol 'bpf_prog_active' not found\n"))
> +               return;
> +
> +       btf = libbpf_find_kernel_btf();
> +       if (CHECK(IS_ERR(btf), "btf_exists", "failed to load kernel BTF: %ld\n",
> +                 PTR_ERR(btf)))
> +               return;
> +
> +       percpu_datasec = btf__find_by_name_kind(btf, ".data..percpu",
> +                                               BTF_KIND_DATASEC);
> +       if (percpu_datasec < 0) {
> +               printf("%s:SKIP:no PERCPU DATASEC in kernel btf\n",
> +                      __func__);
> +               test__skip();

leaking btf here

> +               return;
> +       }
> +
> +       skel = test_ksyms_btf__open_and_load();
> +       if (CHECK(!skel, "skel_open", "failed to open and load skeleton\n"))

here

> +               return;
> +
> +       err = test_ksyms_btf__attach(skel);
> +       if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))
> +               goto cleanup;
> +
> +       /* trigger tracepoint */
> +       usleep(1);
> +
> +       data = skel->data;
> +       CHECK(data->out__runqueues != runqueues_addr, "runqueues",
> +             "got %llu, exp %llu\n", data->out__runqueues, runqueues_addr);
> +       CHECK(data->out__bpf_prog_active != bpf_prog_active_addr, "bpf_prog_active",
> +             "got %llu, exp %llu\n", data->out__bpf_prog_active, bpf_prog_active_addr);

u64 is not %llu on some arches, please cast explicitly to (unsigned long long)

> +
> +cleanup:

... and here (I suggest just jumping to this cleanup label from all those locations)

> +       test_ksyms_btf__destroy(skel);
> +}
> diff --git a/tools/testing/selftests/bpf/progs/test_ksyms_btf.c b/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
> new file mode 100644
> index 000000000000..e04e31117f84
> --- /dev/null
> +++ b/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
> @@ -0,0 +1,23 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/* Copyright (c) 2020 Google */
> +
> +#include "vmlinux.h"
> +
> +#include <bpf/bpf_helpers.h>
> +
> +__u64 out__runqueues = -1;
> +__u64 out__bpf_prog_active = -1;

these are addresses, not values, so an _addr suffix would make that clearer.

> +
> +extern const struct rq runqueues __ksym; /* struct type global var. */
> +extern const int bpf_prog_active __ksym; /* int type global var. */

When we add non-per-CPU kernel variables, I wonder if having both
per-CPU and global kernel variables under the same __ksym section
would cause any problems or confusion. It's not clear to me whether we
need a special __percpu_ksym section or not...

> +
> +SEC("raw_tp/sys_enter")
> +int handler(const void *ctx)
> +{
> +       out__runqueues = (__u64)&runqueues;
> +       out__bpf_prog_active = (__u64)&bpf_prog_active;
> +
> +       return 0;
> +}
> +
> +char _license[] SEC("license") = "GPL";
> diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
> index 4d0e913bbb22..ade555fe8294 100644
> --- a/tools/testing/selftests/bpf/trace_helpers.c
> +++ b/tools/testing/selftests/bpf/trace_helpers.c
> @@ -90,6 +90,32 @@ long ksym_get_addr(const char *name)
>         return 0;
>  }
>
> +/* open kallsyms and read symbol addresses on the fly. Without caching all symbols,
> + * this is faster than load + find. */
> +int kallsyms_find(const char *sym, unsigned long long *addr)
> +{
> +       char type, name[500];
> +       unsigned long long value;
> +       int err = 0;
> +       FILE *f;
> +
> +       f = fopen("/proc/kallsyms", "r");
> +       if (!f)
> +               return -ENOENT;
> +
> +       while (fscanf(f, "%llx %c %499s%*[^\n]\n", &value, &type, name) > 0) {
> +               if (strcmp(name, sym) == 0) {
> +                       *addr = value;
> +                       goto out;
> +               }
> +       }
> +       err = -EINVAL;

These error codes seem backward to me. If you fail to open
/proc/kallsyms, that's an unexpected and invalid situation, so -EINVAL
makes a bit more sense there. But -ENOENT is clearly for cases where
you didn't find what you were looking for, which is exactly this case.


> +
> +out:
> +       fclose(f);
> +       return err;
> +}
> +
>  void read_trace_pipe(void)
>  {
>         int trace_fd;
> diff --git a/tools/testing/selftests/bpf/trace_helpers.h b/tools/testing/selftests/bpf/trace_helpers.h
> index 25ef597dd03f..f62fdef9e589 100644
> --- a/tools/testing/selftests/bpf/trace_helpers.h
> +++ b/tools/testing/selftests/bpf/trace_helpers.h
> @@ -12,6 +12,10 @@ struct ksym {
>  int load_kallsyms(void);
>  struct ksym *ksym_search(long key);
>  long ksym_get_addr(const char *name);
> +
> +/* open kallsyms and find addresses on the fly, faster than load + search. */
> +int kallsyms_find(const char *sym, unsigned long long *addr);
> +
>  void read_trace_pipe(void);
>
>  #endif
> --
> 2.28.0.526.ge36021eeef-goog
>

* Re: [PATCH bpf-next v2 4/6] bpf: Introduce bpf_per_cpu_ptr()
  2020-09-03 22:33 ` [PATCH bpf-next v2 4/6] bpf: Introduce bpf_per_cpu_ptr() Hao Luo
@ 2020-09-04 20:04   ` Andrii Nakryiko
  2020-09-14  5:01     ` Hao Luo
  0 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2020-09-04 20:04 UTC (permalink / raw)
  To: Hao Luo
  Cc: Networking, bpf, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

On Thu, Sep 3, 2020 at 3:35 PM Hao Luo <haoluo@google.com> wrote:
>
> Add bpf_per_cpu_ptr() to help bpf programs access percpu vars.
> bpf_per_cpu_ptr() has the same semantic as per_cpu_ptr() in the kernel
> except that it may return NULL. This happens when the cpu parameter is
> out of range. So the caller must check the returned value.
>
> Acked-by: Andrii Nakryiko <andriin@fb.com>
> Signed-off-by: Hao Luo <haoluo@google.com>
> ---
>  include/linux/bpf.h            |  3 ++
>  include/linux/btf.h            | 11 ++++++
>  include/uapi/linux/bpf.h       | 17 +++++++++
>  kernel/bpf/btf.c               | 10 ------
>  kernel/bpf/verifier.c          | 66 +++++++++++++++++++++++++++++++---
>  kernel/trace/bpf_trace.c       | 18 ++++++++++
>  tools/include/uapi/linux/bpf.h | 17 +++++++++
>  7 files changed, 128 insertions(+), 14 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index c6d9f2c444f4..6b2034f7665e 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -292,6 +292,7 @@ enum bpf_arg_type {
>         ARG_PTR_TO_ALLOC_MEM,   /* pointer to dynamically allocated memory */
>         ARG_PTR_TO_ALLOC_MEM_OR_NULL,   /* pointer to dynamically allocated memory or NULL */
>         ARG_CONST_ALLOC_SIZE_OR_ZERO,   /* number of allocated bytes requested */
> +       ARG_PTR_TO_PERCPU_BTF_ID,       /* pointer to in-kernel percpu type */
>  };
>
>  /* type of values returned from helper functions */
> @@ -305,6 +306,7 @@ enum bpf_return_type {
>         RET_PTR_TO_SOCK_COMMON_OR_NULL, /* returns a pointer to a sock_common or NULL */
>         RET_PTR_TO_ALLOC_MEM_OR_NULL,   /* returns a pointer to dynamically allocated memory or NULL */
>         RET_PTR_TO_BTF_ID_OR_NULL,      /* returns a pointer to a btf_id or NULL */
> +       RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL, /* returns a pointer to a valid memory or a btf_id or NULL */
>  };
>
>  /* eBPF function prototype used by verifier to allow BPF_CALLs from eBPF programs
> @@ -385,6 +387,7 @@ enum bpf_reg_type {
>         PTR_TO_RDONLY_BUF_OR_NULL, /* reg points to a readonly buffer or NULL */
>         PTR_TO_RDWR_BUF,         /* reg points to a read/write buffer */
>         PTR_TO_RDWR_BUF_OR_NULL, /* reg points to a read/write buffer or NULL */
> +       PTR_TO_PERCPU_BTF_ID,    /* reg points to percpu kernel type */
>  };
>
>  /* The information passed from prog-specific *_is_valid_access
> diff --git a/include/linux/btf.h b/include/linux/btf.h
> index 592373d359b9..07b7de1c05b0 100644
> --- a/include/linux/btf.h
> +++ b/include/linux/btf.h
> @@ -71,6 +71,11 @@ btf_resolve_size(const struct btf *btf, const struct btf_type *type,
>              i < btf_type_vlen(struct_type);                    \
>              i++, member++)
>
> +#define for_each_vsi(i, struct_type, member)                   \

datasec_type?

> +       for (i = 0, member = btf_type_var_secinfo(struct_type); \
> +            i < btf_type_vlen(struct_type);                    \
> +            i++, member++)
> +
>  static inline bool btf_type_is_ptr(const struct btf_type *t)
>  {
>         return BTF_INFO_KIND(t->info) == BTF_KIND_PTR;
> @@ -155,6 +160,12 @@ static inline const struct btf_member *btf_type_member(const struct btf_type *t)
>         return (const struct btf_member *)(t + 1);
>  }
>
> +static inline const struct btf_var_secinfo *btf_type_var_secinfo(
> +               const struct btf_type *t)
> +{
> +       return (const struct btf_var_secinfo *)(t + 1);
> +}
> +
>  #ifdef CONFIG_BPF_SYSCALL
>  const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id);
>  const char *btf_name_by_offset(const struct btf *btf, u32 offset);
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index ab00ad9b32e5..d0ec94d5bdbf 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -3596,6 +3596,22 @@ union bpf_attr {
>   *             the data in *dst*. This is a wrapper of copy_from_user().
>   *     Return
>   *             0 on success, or a negative error in case of failure.
> + *
> + * void *bpf_per_cpu_ptr(const void *percpu_ptr, u32 cpu)
> + *     Description
> + *             Take a pointer to a percpu ksym, *percpu_ptr*, and return a
> + *             pointer to the percpu kernel variable on *cpu*. A ksym is an
> + *             extern variable decorated with '__ksym'. For ksym, there is a
> + *             global var (either static or global) defined of the same name
> + *             in the kernel. The ksym is percpu if the global var is percpu.
> + *             The returned pointer points to the global percpu var on *cpu*.
> + *
> + *             bpf_per_cpu_ptr() has the same semantic as per_cpu_ptr() in the
> + *             kernel, except that bpf_per_cpu_ptr() may return NULL. This
> + *             happens if *cpu* is larger than nr_cpu_ids. The caller of
> + *             bpf_per_cpu_ptr() must check the returned value.
> + *     Return
> + *             A generic pointer pointing to the kernel percpu variable on *cpu*.

Or NULL, if *cpu* is invalid.

>   */
>  #define __BPF_FUNC_MAPPER(FN)          \
>         FN(unspec),                     \
> @@ -3747,6 +3763,7 @@ union bpf_attr {
>         FN(inode_storage_delete),       \
>         FN(d_path),                     \
>         FN(copy_from_user),             \
> +       FN(bpf_per_cpu_ptr),            \
>         /* */
>

[...]

> @@ -4003,6 +4008,15 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
>                         if (type != expected_type)
>                                 goto err_type;
>                 }
> +       } else if (arg_type == ARG_PTR_TO_PERCPU_BTF_ID) {
> +               expected_type = PTR_TO_PERCPU_BTF_ID;
> +               if (type != expected_type)
> +                       goto err_type;
> +               if (!reg->btf_id) {
> +                       verbose(env, "Helper has invalid btf_id in R%d\n", regno);
> +                       return -EACCES;
> +               }
> +               meta->ret_btf_id = reg->btf_id;
>         } else if (arg_type == ARG_PTR_TO_BTF_ID) {
>                 bool ids_match = false;
>
> @@ -5002,6 +5016,30 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
>                 regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL;
>                 regs[BPF_REG_0].id = ++env->id_gen;
>                 regs[BPF_REG_0].mem_size = meta.mem_size;
> +       } else if (fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL) {

Given this is an internal implementation detail, this return type is
fine, but I'm wondering if it would be better to just make
PTR_TO_BTF_ID allow not just structs? E.g., if we have an int, just
allow reading those 4 bytes.

Not sure what the implications are in terms of implementation, but
conceptually that shouldn't be a problem, given we do have BTF type ID
describing size and all.

> +               const struct btf_type *t;
> +
> +               mark_reg_known_zero(env, regs, BPF_REG_0);
> +               t = btf_type_skip_modifiers(btf_vmlinux, meta.ret_btf_id, NULL);
> +               if (!btf_type_is_struct(t)) {
> +                       u32 tsize;
> +                       const struct btf_type *ret;
> +                       const char *tname;
> +
> +                       /* resolve the type size of ksym. */
> +                       ret = btf_resolve_size(btf_vmlinux, t, &tsize);
> +                       if (IS_ERR(ret)) {
> +                               tname = btf_name_by_offset(btf_vmlinux, t->name_off);
> +                               verbose(env, "unable to resolve the size of type '%s': %ld\n",
> +                                       tname, PTR_ERR(ret));
> +                               return -EINVAL;
> +                       }
> +                       regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL;
> +                       regs[BPF_REG_0].mem_size = tsize;
> +               } else {
> +                       regs[BPF_REG_0].type = PTR_TO_BTF_ID_OR_NULL;
> +                       regs[BPF_REG_0].btf_id = meta.ret_btf_id;
> +               }
>         } else if (fn->ret_type == RET_PTR_TO_BTF_ID_OR_NULL) {
>                 int ret_btf_id;
>

[...]

> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index b2a5380eb187..d474c1530f87 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -1144,6 +1144,22 @@ static const struct bpf_func_proto bpf_d_path_proto = {
>         .allowed        = bpf_d_path_allowed,
>  };
>
> +BPF_CALL_2(bpf_per_cpu_ptr, const void *, ptr, u32, cpu)
> +{
> +       if (cpu >= nr_cpu_ids)
> +               return 0;
> +
> +       return (u64)per_cpu_ptr(ptr, cpu);

not sure, but on 32-bit arches this might cause a compilation warning;
cast to (unsigned long) instead?

> +}
> +
> +static const struct bpf_func_proto bpf_per_cpu_ptr_proto = {
> +       .func           = bpf_per_cpu_ptr,
> +       .gpl_only       = false,
> +       .ret_type       = RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL,
> +       .arg1_type      = ARG_PTR_TO_PERCPU_BTF_ID,
> +       .arg2_type      = ARG_ANYTHING,
> +};
> +

[...]

* Re: [PATCH bpf-next v2 5/6] bpf: Introduce bpf_this_cpu_ptr()
  2020-09-03 22:33 ` [PATCH bpf-next v2 5/6] bpf: Introduce bpf_this_cpu_ptr() Hao Luo
@ 2020-09-04 20:09   ` Andrii Nakryiko
  2020-09-14  5:04     ` Hao Luo
  0 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2020-09-04 20:09 UTC (permalink / raw)
  To: Hao Luo
  Cc: Networking, bpf, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

On Thu, Sep 3, 2020 at 3:35 PM Hao Luo <haoluo@google.com> wrote:
>
> Add bpf_this_cpu_ptr() to help access percpu var on this cpu. This
> helper always returns a valid pointer, therefore no need to check
> returned value for NULL. Also note that all programs run with
> preemption disabled, which means that the returned pointer is stable
> during all the execution of the program.
>
> Signed-off-by: Hao Luo <haoluo@google.com>
> ---

looks good, few small things, but otherwise:

Acked-by: Andrii Nakryiko <andriin@fb.com>

>  include/linux/bpf.h            |  1 +
>  include/uapi/linux/bpf.h       | 14 ++++++++++++++
>  kernel/bpf/verifier.c          | 10 +++++++---
>  kernel/trace/bpf_trace.c       | 14 ++++++++++++++
>  tools/include/uapi/linux/bpf.h | 14 ++++++++++++++
>  5 files changed, 50 insertions(+), 3 deletions(-)
>
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 6b2034f7665e..506fdd5d0463 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -307,6 +307,7 @@ enum bpf_return_type {
>         RET_PTR_TO_ALLOC_MEM_OR_NULL,   /* returns a pointer to dynamically allocated memory or NULL */
>         RET_PTR_TO_BTF_ID_OR_NULL,      /* returns a pointer to a btf_id or NULL */
>         RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL, /* returns a pointer to a valid memory or a btf_id or NULL */
> +       RET_PTR_TO_MEM_OR_BTF_ID,       /* returns a pointer to a valid memory or a btf_id */
>  };
>
>  /* eBPF function prototype used by verifier to allow BPF_CALLs from eBPF programs
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index d0ec94d5bdbf..e7ca91c697ed 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -3612,6 +3612,19 @@ union bpf_attr {
>   *             bpf_per_cpu_ptr() must check the returned value.
>   *     Return
>   *             A generic pointer pointing to the kernel percpu variable on *cpu*.
> + *
> + * void *bpf_this_cpu_ptr(const void *percpu_ptr)
> + *     Description
> + *             Take a pointer to a percpu ksym, *percpu_ptr*, and return a
> + *             pointer to the percpu kernel variable on this cpu. See the
> + *             description of 'ksym' in **bpf_per_cpu_ptr**\ ().
> + *
> + *             bpf_this_cpu_ptr() has the same semantic as this_cpu_ptr() in
> + *             the kernel. Different from **bpf_per_cpu_ptr**\ (), it would
> + *             never return NULL.
> + *     Return
> + *             A generic pointer pointing to the kernel percpu variable on

what's "a generic pointer"? Is it as opposed to an sk_buff pointer or something?

> + *             this cpu.
>   */
>  #define __BPF_FUNC_MAPPER(FN)          \
>         FN(unspec),                     \
> @@ -3764,6 +3777,7 @@ union bpf_attr {
>         FN(d_path),                     \
>         FN(copy_from_user),             \
>         FN(bpf_per_cpu_ptr),            \
> +       FN(bpf_this_cpu_ptr),           \
>         /* */
>
>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index a702600ff581..e070d2abc405 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -5016,8 +5016,10 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
>                 regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL;
>                 regs[BPF_REG_0].id = ++env->id_gen;
>                 regs[BPF_REG_0].mem_size = meta.mem_size;
> -       } else if (fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL) {
> +       } else if (fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL ||
> +                  fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID) {
>                 const struct btf_type *t;
> +               bool not_null = fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID;

nit: this is fine, but I'd inline it below

>
>                 mark_reg_known_zero(env, regs, BPF_REG_0);
>                 t = btf_type_skip_modifiers(btf_vmlinux, meta.ret_btf_id, NULL);
> @@ -5034,10 +5036,12 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
>                                         tname, PTR_ERR(ret));
>                                 return -EINVAL;
>                         }
> -                       regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL;
> +                       regs[BPF_REG_0].type = not_null ?
> +                               PTR_TO_MEM : PTR_TO_MEM_OR_NULL;
>                         regs[BPF_REG_0].mem_size = tsize;
>                 } else {
> -                       regs[BPF_REG_0].type = PTR_TO_BTF_ID_OR_NULL;
> +                       regs[BPF_REG_0].type = not_null ?
> +                               PTR_TO_BTF_ID : PTR_TO_BTF_ID_OR_NULL;
>                         regs[BPF_REG_0].btf_id = meta.ret_btf_id;
>                 }
>         } else if (fn->ret_type == RET_PTR_TO_BTF_ID_OR_NULL) {
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index d474c1530f87..466acf82a9c7 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -1160,6 +1160,18 @@ static const struct bpf_func_proto bpf_per_cpu_ptr_proto = {
>         .arg2_type      = ARG_ANYTHING,
>  };
>
> +BPF_CALL_1(bpf_this_cpu_ptr, const void *, percpu_ptr)
> +{
> +       return (u64)this_cpu_ptr(percpu_ptr);

see previous comment, this might trigger unnecessary compilation
warnings on 32-bit arches

> +}
> +
> +static const struct bpf_func_proto bpf_this_cpu_ptr_proto = {
> +       .func           = bpf_this_cpu_ptr,
> +       .gpl_only       = false,
> +       .ret_type       = RET_PTR_TO_MEM_OR_BTF_ID,
> +       .arg1_type      = ARG_PTR_TO_PERCPU_BTF_ID,
> +};
> +

[...]

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH bpf-next v2 6/6] bpf/selftests: Test for bpf_per_cpu_ptr() and bpf_this_cpu_ptr()
  2020-09-03 22:33 ` [PATCH bpf-next v2 6/6] bpf/selftests: Test for bpf_per_cpu_ptr() and bpf_this_cpu_ptr() Hao Luo
@ 2020-09-04 20:15   ` Andrii Nakryiko
  2020-09-14 16:59     ` Hao Luo
  0 siblings, 1 reply; 21+ messages in thread
From: Andrii Nakryiko @ 2020-09-04 20:15 UTC (permalink / raw)
  To: Hao Luo
  Cc: Networking, bpf, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

On Thu, Sep 3, 2020 at 3:35 PM Hao Luo <haoluo@google.com> wrote:
>
> Test bpf_per_cpu_ptr() and bpf_this_cpu_ptr(). Test two paths in the
> kernel. If the base pointer points to a struct, the returned reg is
> of type PTR_TO_BTF_ID. Direct pointer dereference can be applied on
> the returned variable. If the base pointer isn't a struct, the
> returned reg is of type PTR_TO_MEM, which also supports direct pointer
> dereference.
>
> Acked-by: Andrii Nakryiko <andriin@fb.com>
> Signed-off-by: Hao Luo <haoluo@google.com>
> ---
>  .../selftests/bpf/prog_tests/ksyms_btf.c      | 10 +++++++
>  .../selftests/bpf/progs/test_ksyms_btf.c      | 26 +++++++++++++++++++
>  2 files changed, 36 insertions(+)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
> index 7b6846342449..22cc642dbc0e 100644
> --- a/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
> +++ b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
> @@ -58,6 +58,16 @@ void test_ksyms_btf(void)
>         CHECK(data->out__bpf_prog_active != bpf_prog_active_addr, "bpf_prog_active",
>               "got %llu, exp %llu\n", data->out__bpf_prog_active, bpf_prog_active_addr);
>
> +       CHECK(data->out__rq_cpu == -1, "rq_cpu",
> +             "got %u, exp != -1\n", data->out__rq_cpu);
> +       CHECK(data->out__percpu_bpf_prog_active == -1, "percpu_bpf_prog_active",
> +             "got %d, exp != -1\n", data->out__percpu_bpf_prog_active);
> +
> +       CHECK(data->out__this_rq_cpu == -1, "this_rq_cpu",
> +             "got %u, exp != -1\n", data->out__this_rq_cpu);
> +       CHECK(data->out__this_bpf_prog_active == -1, "this_bpf_prog_active",
> +             "got %d, exp != -1\n", data->out__this_bpf_prog_active);

see below for a few suggestions to make these tests more specific

out__this_bpf_prog_active should always be > 0, no?

> +
>  cleanup:
>         test_ksyms_btf__destroy(skel);
>  }
> diff --git a/tools/testing/selftests/bpf/progs/test_ksyms_btf.c b/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
> index e04e31117f84..02d564349892 100644
> --- a/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
> +++ b/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
> @@ -8,15 +8,41 @@
>  __u64 out__runqueues = -1;
>  __u64 out__bpf_prog_active = -1;
>
> +__u32 out__rq_cpu = -1; /* percpu struct fields */
> +int out__percpu_bpf_prog_active = -1; /* percpu int */
> +
> +__u32 out__this_rq_cpu = -1;
> +int out__this_bpf_prog_active = -1;
> +
>  extern const struct rq runqueues __ksym; /* struct type global var. */
>  extern const int bpf_prog_active __ksym; /* int type global var. */
>
>  SEC("raw_tp/sys_enter")
>  int handler(const void *ctx)
>  {
> +       struct rq *rq;
> +       int *active;
> +       __u32 cpu;
> +
>         out__runqueues = (__u64)&runqueues;
>         out__bpf_prog_active = (__u64)&bpf_prog_active;
>
> +       cpu = bpf_get_smp_processor_id();
> +
> +       /* test bpf_per_cpu_ptr() */
> +       rq = (struct rq *)bpf_per_cpu_ptr(&runqueues, cpu);
> +       if (rq)
> +               out__rq_cpu = rq->cpu;
> +       active = (int *)bpf_per_cpu_ptr(&bpf_prog_active, cpu);
> +       if (active)
> +               out__percpu_bpf_prog_active = *active;

this is equivalent to using bpf_this_cpu_ptr(), so:

1. you can compare the value with out__this_xxx in user-space

2. it's interesting to also test that you can read a value from some
other CPU. Can you add another variable and get the value from CPU #0
always? E.g., out__cpu_0_rq_cpu should always be zero, right?

> +
> +       /* test bpf_this_cpu_ptr */
> +       rq = (struct rq *)bpf_this_cpu_ptr(&runqueues);
> +       out__this_rq_cpu = rq->cpu;
> +       active = (int *)bpf_this_cpu_ptr(&bpf_prog_active);
> +       out__this_bpf_prog_active = *active;
> +
>         return 0;
>  }
>
> --
> 2.28.0.526.ge36021eeef-goog
>


* Re: [PATCH bpf-next v2 1/6] bpf: Introduce pseudo_btf_id
  2020-09-04 19:05   ` Andrii Nakryiko
@ 2020-09-14  4:55     ` Hao Luo
  0 siblings, 0 replies; 21+ messages in thread
From: Hao Luo @ 2020-09-14  4:55 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

Andrii,

Sorry for the late reply. Your suggestions are concrete and helpful. I
can apply them in v3.

Thanks!
Hao

On Fri, Sep 4, 2020 at 12:05 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Sep 3, 2020 at 3:34 PM Hao Luo <haoluo@google.com> wrote:
> >
> > Pseudo_btf_id is a type of ld_imm insn that associates a btf_id to a
> > ksym so that further dereferences on the ksym can use the BTF info
> > to validate accesses. Internally, when seeing a pseudo_btf_id ld insn,
> > the verifier reads the btf_id stored in the insn[0]'s imm field and
> > marks the dst_reg as PTR_TO_BTF_ID. The btf_id points to a VAR_KIND,
> > which is encoded in btf_vminux by pahole. If the VAR is not of a struct
> > type, the dst reg will be marked as PTR_TO_MEM instead of PTR_TO_BTF_ID
> > and the mem_size is resolved to the size of the VAR's type.
> >
> > From the VAR btf_id, the verifier can also read the address of the
> > ksym's corresponding kernel var from kallsyms and use that to fill
> > dst_reg.
> >
> > Therefore, the proper functionality of pseudo_btf_id depends on (1)
> > kallsyms and (2) the encoding of kernel global VARs in pahole, which
> > should be available since pahole v1.18.
> >
> > Signed-off-by: Hao Luo <haoluo@google.com>
> > ---
>
> Logic looks correct, but I have a few suggestions for naming things
> and verifier logs. Please see below.
>
> >  include/linux/bpf_verifier.h   |   4 ++
> >  include/linux/btf.h            |  15 +++++
> >  include/uapi/linux/bpf.h       |  38 ++++++++---
> >  kernel/bpf/btf.c               |  15 -----
> >  kernel/bpf/verifier.c          | 112 ++++++++++++++++++++++++++++++---
> >  tools/include/uapi/linux/bpf.h |  38 ++++++++---
> >  6 files changed, 182 insertions(+), 40 deletions(-)
> >
> > diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> > index 53c7bd568c5d..a14063f64d96 100644
> > --- a/include/linux/bpf_verifier.h
> > +++ b/include/linux/bpf_verifier.h
> > @@ -308,6 +308,10 @@ struct bpf_insn_aux_data {
> >                         u32 map_index;          /* index into used_maps[] */
> >                         u32 map_off;            /* offset from value base address */
> >                 };
> > +               struct {
> > +                       u32 pseudo_btf_id_type; /* type of pseudo_btf_id */
> > +                       u32 pseudo_btf_id_meta; /* memsize or btf_id */
>
> these names are a bit misleading; it's not clear from the code what's in there.
> This section is for ldimm64 insns that are loading BTF variables,
> right? so how about this:
>
> struct {
>     u32 reg_type;
>     union {
>         u32 btf_id;
>         u32 memsize;
>     };
> } btf_var;
>
> In case someone hates non-anonymous structs, I'd still go with
> btf_var_reg_type, btf_var_btf_id and btf_var_memsize.
>
> > +               };
> >         };
> >         u64 map_key_state; /* constant (32 bit) key tracking for maps */
> >         int ctx_field_size; /* the ctx field size for load insn, maybe 0 */
> > diff --git a/include/linux/btf.h b/include/linux/btf.h
> > index a9af5e7a7ece..592373d359b9 100644
> > --- a/include/linux/btf.h
> > +++ b/include/linux/btf.h
> > @@ -106,6 +106,21 @@ static inline bool btf_type_is_func_proto(const struct btf_type *t)
> >         return BTF_INFO_KIND(t->info) == BTF_KIND_FUNC_PROTO;
> >  }
> >
>
> [...]
>
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index b4e9c56b8b32..3b382c080cfd 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -7398,6 +7398,24 @@ static int check_ld_imm(struct bpf_verifier_env *env, struct bpf_insn *insn)
> >                 return 0;
> >         }
> >
> > +       if (insn->src_reg == BPF_PSEUDO_BTF_ID) {
> > +               u32 type = aux->pseudo_btf_id_type;
> > +               u32 meta = aux->pseudo_btf_id_meta;
> > +
> > +               mark_reg_known_zero(env, regs, insn->dst_reg);
> > +
> > +               regs[insn->dst_reg].type = type;
> > +               if (type == PTR_TO_MEM) {
> > +                       regs[insn->dst_reg].mem_size = meta;
> > +               } else if (type == PTR_TO_BTF_ID) {
> > +                       regs[insn->dst_reg].btf_id = meta;
>
> nit: probably worthwhile to introduce a local variable (dst_reg) to
> capture pointer to regs[insn->dst_reg] in this entire function. Then
> no real need for type and meta local vars above; everything is going
> to be short and sweet.
>
> > +               } else {
> > +                       verbose(env, "bpf verifier is misconfigured\n");
> > +                       return -EFAULT;
> > +               }
> > +               return 0;
> > +       }
> > +
> >         map = env->used_maps[aux->map_index];
> >         mark_reg_known_zero(env, regs, insn->dst_reg);
> >         regs[insn->dst_reg].map_ptr = map;
> > @@ -9284,6 +9302,74 @@ static int do_check(struct bpf_verifier_env *env)
> >         return 0;
> >  }
> >
> > +/* replace pseudo btf_id with kernel symbol address */
> > +static int check_pseudo_btf_id(struct bpf_verifier_env *env,
> > +                              struct bpf_insn *insn,
> > +                              struct bpf_insn_aux_data *aux)
> > +{
> > +       u32 type, id = insn->imm;
> > +       const struct btf_type *t;
> > +       const char *sym_name;
> > +       u64 addr;
> > +
> > +       if (!btf_vmlinux) {
> > +               verbose(env, "%s: btf not available. verifier misconfigured.\n",
> > +                       __func__);
>
> verifier logs so far haven't used __func__; it's not all that
> meaningful to users, and it might change due to later refactorings.
>
> Also, in this particular case, it's not a verifier misconfiguration,
> but rather that the kernel doesn't have a built-in BTF, so suggest
> enabling CONFIG_DEBUG_INFO_BTF=y.
>
> "kernel is missing BTF, make sure CONFIG_DEBUG_INFO_BTF=y is specified
> in Kconfig"
>
> > +               return -EINVAL;
> > +       }
> > +
> > +       t = btf_type_by_id(btf_vmlinux, id);
> > +       if (!t) {
> > +               verbose(env, "%s: invalid btf_id %d\n", __func__, id);
>
> "ldimm64 insn specifies invalid btf_id %d"? add similar context below
>
> > +               return -ENOENT;
> > +       }
> > +
> > +       if (insn[1].imm != 0) {
> > +               verbose(env, "%s: BPF_PSEUDO_BTF_ID uses reserved fields\n",
> > +                       __func__);
> > +               return -EINVAL;
> > +       }
> > +
> > +       if (!btf_type_is_var(t)) {
> > +               verbose(env, "%s: btf_id %d isn't KIND_VAR\n", __func__, id);
> > +               return -EINVAL;
> > +       }
> > +
> > +       sym_name = btf_name_by_offset(btf_vmlinux, t->name_off);
> > +       addr = kallsyms_lookup_name(sym_name);
> > +       if (!addr) {
> > +               verbose(env, "%s: failed to find the address of symbol '%s'.\n",
> > +                       __func__, sym_name);
> > +               return -ENOENT;
> > +       }
> > +
>
> [...]
>
> > @@ -9394,10 +9480,14 @@ static bool bpf_map_is_cgroup_storage(struct bpf_map *map)
> >                 map->map_type == BPF_MAP_TYPE_PERCPU_CGROUP_STORAGE);
> >  }
> >
> > -/* look for pseudo eBPF instructions that access map FDs and
> > - * replace them with actual map pointers
> > +/* find and rewrite pseudo imm in ld_imm64 instructions:
> > + *
> > + * 1. if it accesses map FD, replace it with actual map pointer.
> > + * 2. if it accesses btf_id of a VAR, replace it with pointer to the var.
> > + *
> > + * NOTE: btf_vmlinux is required for converting pseudo btf_id.
> >   */
> > -static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env)
> > +static int replace_pseudo_imm_with_ptr(struct bpf_verifier_env *env)
>
> nit: I'd call this something like "resolve_pseudo_ldimm64" instead.
> ptr here is an implementation detail of map pointers ldimm64 and
> doesn't even match what we are doing for pseudo_btf_id
>
> >  {
> >         struct bpf_insn *insn = env->prog->insnsi;
> >         int insn_cnt = env->prog->len;
> > @@ -9438,6 +9528,14 @@ static int replace_map_fd_with_map_ptr(struct bpf_verifier_env *env)
> >                                 /* valid generic load 64-bit imm */
> >                                 goto next_insn;
> >
> > +                       if (insn[0].src_reg == BPF_PSEUDO_BTF_ID) {
> > +                               aux = &env->insn_aux_data[i];
> > +                               err = check_pseudo_btf_id(env, insn, aux);
> > +                               if (err)
> > +                                       return err;
> > +                               goto next_insn;
> > +                       }
> > +
> >                         /* In final convert_pseudo_ld_imm64() step, this is
> >                          * converted into regular 64-bit imm load insn.
> >                          */
>
> [...]


* Re: [PATCH bpf-next v2 2/6] bpf/libbpf: BTF support for typed ksyms
  2020-09-04 19:34   ` Andrii Nakryiko
@ 2020-09-14  4:56     ` Hao Luo
  0 siblings, 0 replies; 21+ messages in thread
From: Hao Luo @ 2020-09-14  4:56 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

Will follow the libbpf logging convention. Thanks for the suggestions.

On Fri, Sep 4, 2020 at 12:34 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Sep 3, 2020 at 3:34 PM Hao Luo <haoluo@google.com> wrote:
> >
> > If a ksym is defined with a type, libbpf will try to find the ksym's btf
> > information from kernel btf. If a valid btf entry for the ksym is found,
> > libbpf can pass in the found btf id to the verifier, which validates the
> > ksym's type and value.
> >
> > Typeless ksyms (i.e. those defined as 'void') will not have such btf_id,
> > but they have the symbol's address (read from kallsyms) and their value is
> > treated as a raw pointer.
> >
> > Signed-off-by: Hao Luo <haoluo@google.com>
> > ---
>
> Logic looks correct, but I have complaints about libbpf logging
> consistency, please see suggestions below.
>
> >  tools/lib/bpf/libbpf.c | 116 ++++++++++++++++++++++++++++++++++++-----
> >  1 file changed, 102 insertions(+), 14 deletions(-)
> >
>
> [...]
>
> > @@ -3119,6 +3130,8 @@ static int bpf_object__collect_externs(struct bpf_object *obj)
> >                         vt->type = int_btf_id;
> >                         vs->offset = off;
> >                         vs->size = sizeof(int);
> > +                       pr_debug("ksym var_secinfo: var '%s', type #%d, size %d, offset %d\n",
> > +                                ext->name, vt->type, vs->size, vs->offset);
>
> debug leftover?
>

I was thinking we should leave a debug message when some entries in
BTF are modified, but thinking about it now, it's probably unnecessary.
I will remove this in v3.

> >                 }
> >                 sec->size = off;
> >         }
> > @@ -5724,8 +5737,13 @@ bpf_program__relocate(struct bpf_program *prog, struct bpf_object *obj)
> >                                 insn[0].imm = obj->maps[obj->kconfig_map_idx].fd;
> >                                 insn[1].imm = ext->kcfg.data_off;
> >                         } else /* EXT_KSYM */ {
> > -                               insn[0].imm = (__u32)ext->ksym.addr;
> > -                               insn[1].imm = ext->ksym.addr >> 32;
> > +                               if (ext->ksym.type_id) { /* typed ksyms */
> > +                                       insn[0].src_reg = BPF_PSEUDO_BTF_ID;
> > +                                       insn[0].imm = ext->ksym.vmlinux_btf_id;
> > +                               } else { /* typeless ksyms */
> > +                                       insn[0].imm = (__u32)ext->ksym.addr;
> > +                                       insn[1].imm = ext->ksym.addr >> 32;
> > +                               }
> >                         }
> >                         break;
> >                 case RELO_CALL:
> > @@ -6462,10 +6480,72 @@ static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
> >         return err;
> >  }
> >
> > +static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj)
> > +{
> > +       struct extern_desc *ext;
> > +       int i, id;
> > +
> > +       if (!obj->btf_vmlinux) {
> > +               pr_warn("support of typed ksyms needs kernel btf.\n");
> > +               return -ENOENT;
> > +       }
>
> This check shouldn't be needed, you'd either successfully load
> btf_vmlinux by now or will fail earlier, because BTF is required but
> not found.
>
> > +
> > +       for (i = 0; i < obj->nr_extern; i++) {
> > +               const struct btf_type *targ_var, *targ_type;
> > +               __u32 targ_type_id, local_type_id;
> > +               int ret;
> > +
> > +               ext = &obj->externs[i];
> > +               if (ext->type != EXT_KSYM || !ext->ksym.type_id)
> > +                       continue;
> > +
> > +               id = btf__find_by_name_kind(obj->btf_vmlinux, ext->name,
> > +                                           BTF_KIND_VAR);
> > +               if (id <= 0) {
> > +                       pr_warn("no btf entry for ksym '%s' in vmlinux.\n",
> > +                               ext->name);
>
> please try to stick to consistent style of comments:
>
> "extern (ksym) '%s': failed to find BTF ID in vmlinux BTF" or
> something like that
>
>
> > +                       return -ESRCH;
> > +               }
> > +
> > +               /* find target type_id */
> > +               targ_var = btf__type_by_id(obj->btf_vmlinux, id);
> > +               targ_type = skip_mods_and_typedefs(obj->btf_vmlinux,
> > +                                                  targ_var->type,
> > +                                                  &targ_type_id);
> > +
> > +               /* find local type_id */
> > +               local_type_id = ext->ksym.type_id;
> > +
> > +               ret = bpf_core_types_are_compat(obj->btf_vmlinux, targ_type_id,
> > +                                               obj->btf, local_type_id);
>
> you reversed the order, it's always local btf/id, then target btf/id.
>
> > +               if (ret <= 0) {
> > +                       const struct btf_type *local_type;
> > +                       const char *targ_name, *local_name;
> > +
> > +                       local_type = btf__type_by_id(obj->btf, local_type_id);
> > +                       targ_name = btf__name_by_offset(obj->btf_vmlinux,
> > +                                                       targ_type->name_off);
> > +                       local_name = btf__name_by_offset(obj->btf,
> > +                                                        local_type->name_off);
>
> it's a bit unfortunate that we get the name of an already resolved
> type, because if you have a typedef to anon struct, this will give you
> an empty string. I don't know how much of a problem that would be, so
> I think it's fine to leave it as is, and fix it if it's a problem in
> practice.
>
> > +
> > +                       pr_warn("ksym '%s' expects type '%s' (vmlinux_btf_id: #%d), "
> > +                               "but got '%s' (btf_id: #%d)\n", ext->name,
> > +                               targ_name, targ_type_id, local_name, local_type_id);
>
> same thing, please stay consistent in logging format. Check
> bpf_core_dump_spec() for how BTF type info is usually emitted
> throughout libbpf:
>
> "extern (ksym): incompatible types, expected [%d] %s %s, but kernel
> has [%d] %s %s\n"
>
> there is a btf_kind_str() helper to resolve kind to a string representation.
>
>
> > +                       return -EINVAL;
> > +               }
> > +
> > +               ext->is_set = true;
> > +               ext->ksym.vmlinux_btf_id = id;
> > +               pr_debug("extern (ksym) %s=vmlinux_btf_id(#%d)\n", ext->name, id);
>
> "extern (ksym) '%s': resolved to [%d] %s %s\n", similar to above
> suggestion. This "[%d]" format is very consistently used for BTF IDs
> throughout, so it will be familiar and recognizable for people that
> had to deal with this in libbpf logs.
>
> > +       }
> > +       return 0;
> > +}
> > +
> >  static int bpf_object__resolve_externs(struct bpf_object *obj,
> >                                        const char *extra_kconfig)
> >  {
> > -       bool need_config = false, need_kallsyms = false;
> > +       bool need_kallsyms = false, need_vmlinux_btf = false;
> > +       bool need_config = false;
>
> nit: doesn't make sense to change the existing source code line at
> all. Just add `bool need_vmlinux_btf = false;` on a new line? Or we
> can split all these bools into 3 separate lines, if you prefer.
>
> >         struct extern_desc *ext;
> >         void *kcfg_data = NULL;
> >         int err, i;
>
> [...]


* Re: [PATCH bpf-next v2 3/6] bpf/selftests: ksyms_btf to test typed ksyms
  2020-09-04 19:49   ` Andrii Nakryiko
@ 2020-09-14  4:58     ` Hao Luo
  2020-09-14 22:06       ` Andrii Nakryiko
  0 siblings, 1 reply; 21+ messages in thread
From: Hao Luo @ 2020-09-14  4:58 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

Thanks for taking a look, Andrii.

On Fri, Sep 4, 2020 at 12:49 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Sep 3, 2020 at 3:35 PM Hao Luo <haoluo@google.com> wrote:
> >
> > Selftests for typed ksyms. Tests two types of ksyms: one is a struct,
> > the other is a plain int. This tests two paths in the kernel. Struct
> > ksyms will be converted into PTR_TO_BTF_ID by the verifier while int
> > typed ksyms will be converted into PTR_TO_MEM.
> >
> > Signed-off-by: Hao Luo <haoluo@google.com>
> > ---
> >  .../testing/selftests/bpf/prog_tests/ksyms.c  | 31 +++------
> >  .../selftests/bpf/prog_tests/ksyms_btf.c      | 63 +++++++++++++++++++
> >  .../selftests/bpf/progs/test_ksyms_btf.c      | 23 +++++++
> >  tools/testing/selftests/bpf/trace_helpers.c   | 26 ++++++++
> >  tools/testing/selftests/bpf/trace_helpers.h   |  4 ++
> >  5 files changed, 123 insertions(+), 24 deletions(-)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_btf.c
> >

[...]

> > +       percpu_datasec = btf__find_by_name_kind(btf, ".data..percpu",
> > +                                               BTF_KIND_DATASEC);
> > +       if (percpu_datasec < 0) {
> > +               printf("%s:SKIP:no PERCPU DATASEC in kernel btf\n",
> > +                      __func__);
> > +               test__skip();
>
> leaking btf here
>
> > +               return;
> > +       }
> > +
> > +       skel = test_ksyms_btf__open_and_load();
> > +       if (CHECK(!skel, "skel_open", "failed to open and load skeleton\n"))
>
> here
>

Oops. Good catches. Will fix.

> > +               return;
> > +
> > +       err = test_ksyms_btf__attach(skel);
> > +       if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))
> > +               goto cleanup;
> > +
> > +       /* trigger tracepoint */
> > +       usleep(1);
> > +
> > +       data = skel->data;
> > +       CHECK(data->out__runqueues != runqueues_addr, "runqueues",
> > +             "got %llu, exp %llu\n", data->out__runqueues, runqueues_addr);
> > +       CHECK(data->out__bpf_prog_active != bpf_prog_active_addr, "bpf_prog_active",
> > +             "got %llu, exp %llu\n", data->out__bpf_prog_active, bpf_prog_active_addr);
>
> u64 is not %llu on some arches, please cast explicitly to (unsigned long long)
>

Ack.

> > +
> > +cleanup:
>
> ... and here (I suggest to just jump from all those locations here for cleanup)
>

Makes sense. Will do.

> > +       test_ksyms_btf__destroy(skel);
> > +}
> > diff --git a/tools/testing/selftests/bpf/progs/test_ksyms_btf.c b/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
> > new file mode 100644
> > index 000000000000..e04e31117f84
> > --- /dev/null
> > +++ b/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
> > @@ -0,0 +1,23 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/* Copyright (c) 2020 Google */
> > +
> > +#include "vmlinux.h"
> > +
> > +#include <bpf/bpf_helpers.h>
> > +
> > +__u64 out__runqueues = -1;
> > +__u64 out__bpf_prog_active = -1;
>
> this is addresses, not values, so _addr part would make it clearer.
>

Ack.

> > +
> > +extern const struct rq runqueues __ksym; /* struct type global var. */
> > +extern const int bpf_prog_active __ksym; /* int type global var. */
>
> When we add non-per-CPU kernel variables, I wonder if the fact that we
> have both per-CPU and global kernel variables under the same __ksym
> section would cause any problems and confusion? It's not clear to me
> if we need to have a special __percpu_ksym section or not?..
>

Yeah. Totally agree. I thought about this. I think a separate
__percpu_ksym attribute is *probably* clearer, though I'm not sure. How
about we introduce a "__percpu_ksym" and make it an alias of "__ksym"
for now? If needed, we can make an actual section for it in the future.

> > +
> > +SEC("raw_tp/sys_enter")
> > +int handler(const void *ctx)
> > +{
> > +       out__runqueues = (__u64)&runqueues;
> > +       out__bpf_prog_active = (__u64)&bpf_prog_active;
> > +
> > +       return 0;
> > +}
> > +
> > +char _license[] SEC("license") = "GPL";
> > diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
> > index 4d0e913bbb22..ade555fe8294 100644
> > --- a/tools/testing/selftests/bpf/trace_helpers.c
> > +++ b/tools/testing/selftests/bpf/trace_helpers.c
> > @@ -90,6 +90,32 @@ long ksym_get_addr(const char *name)
> >         return 0;
> >  }
> >
> > +/* open kallsyms and read symbol addresses on the fly. Without caching all symbols,
> > + * this is faster than load + find. */
> > +int kallsyms_find(const char *sym, unsigned long long *addr)
> > +{
> > +       char type, name[500];
> > +       unsigned long long value;
> > +       int err = 0;
> > +       FILE *f;
> > +
> > +       f = fopen("/proc/kallsyms", "r");
> > +       if (!f)
> > +               return -ENOENT;
> > +
> > +       while (fscanf(f, "%llx %c %499s%*[^\n]\n", &value, &type, name) > 0) {
> > +               if (strcmp(name, sym) == 0) {
> > +                       *addr = value;
> > +                       goto out;
> > +               }
> > +       }
> > +       err = -EINVAL;
>
> These error codes seem backward to me. If you fail to open
> /proc/kallsyms, that's an unexpected and invalid situation, so EINVAL
> makes a bit more sense there. But -ENOENT is clearly for cases where
> you didn't find what you were looking for, which is exactly this case.
>
>

I thought about it. I used -ENOENT for the fopen failure because
-ENOENT is the errno for a missing file or directory, which seemed
reasonable for describing an fopen error. But your proposal also makes
sense, and it is what I originally had. It's not a big deal; I can swap
them in v3.

> > +
> > +out:
> > +       fclose(f);
> > +       return err;
> > +}
> > +
> >  void read_trace_pipe(void)
> >  {
> >         int trace_fd;
> > diff --git a/tools/testing/selftests/bpf/trace_helpers.h b/tools/testing/selftests/bpf/trace_helpers.h
> > index 25ef597dd03f..f62fdef9e589 100644
> > --- a/tools/testing/selftests/bpf/trace_helpers.h
> > +++ b/tools/testing/selftests/bpf/trace_helpers.h
> > @@ -12,6 +12,10 @@ struct ksym {
> >  int load_kallsyms(void);
> >  struct ksym *ksym_search(long key);
> >  long ksym_get_addr(const char *name);
> > +
> > +/* open kallsyms and find addresses on the fly, faster than load + search. */
> > +int kallsyms_find(const char *sym, unsigned long long *addr);
> > +
> >  void read_trace_pipe(void);
> >
> >  #endif
> > --
> > 2.28.0.526.ge36021eeef-goog
> >


* Re: [PATCH bpf-next v2 4/6] bpf: Introduce bpf_per_cpu_ptr()
  2020-09-04 20:04   ` Andrii Nakryiko
@ 2020-09-14  5:01     ` Hao Luo
  2020-09-14 18:09       ` Andrii Nakryiko
  0 siblings, 1 reply; 21+ messages in thread
From: Hao Luo @ 2020-09-14  5:01 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

Thanks for review, Andrii.

One question: should I add bpf_{per, this}_cpu_ptr() to
bpf_base_func_proto() in kernel/bpf/helpers.c?

On Fri, Sep 4, 2020 at 1:04 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Sep 3, 2020 at 3:35 PM Hao Luo <haoluo@google.com> wrote:
> >
> > Add bpf_per_cpu_ptr() to help bpf programs access percpu vars.
> > bpf_per_cpu_ptr() has the same semantic as per_cpu_ptr() in the kernel
> > except that it may return NULL. This happens when the cpu parameter is
> > out of range. So the caller must check the returned value.
> >
> > Acked-by: Andrii Nakryiko <andriin@fb.com>
> > Signed-off-by: Hao Luo <haoluo@google.com>
> > ---
> >  include/linux/bpf.h            |  3 ++
> >  include/linux/btf.h            | 11 ++++++
> >  include/uapi/linux/bpf.h       | 17 +++++++++
> >  kernel/bpf/btf.c               | 10 ------
> >  kernel/bpf/verifier.c          | 66 +++++++++++++++++++++++++++++++---
> >  kernel/trace/bpf_trace.c       | 18 ++++++++++
> >  tools/include/uapi/linux/bpf.h | 17 +++++++++
> >  7 files changed, 128 insertions(+), 14 deletions(-)
> >
> > diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> > index c6d9f2c444f4..6b2034f7665e 100644
> > --- a/include/linux/bpf.h
> > +++ b/include/linux/bpf.h
> > @@ -292,6 +292,7 @@ enum bpf_arg_type {
> >         ARG_PTR_TO_ALLOC_MEM,   /* pointer to dynamically allocated memory */
> >         ARG_PTR_TO_ALLOC_MEM_OR_NULL,   /* pointer to dynamically allocated memory or NULL */
> >         ARG_CONST_ALLOC_SIZE_OR_ZERO,   /* number of allocated bytes requested */
> > +       ARG_PTR_TO_PERCPU_BTF_ID,       /* pointer to in-kernel percpu type */
> >  };
> >
> >  /* type of values returned from helper functions */
> > @@ -305,6 +306,7 @@ enum bpf_return_type {
> >         RET_PTR_TO_SOCK_COMMON_OR_NULL, /* returns a pointer to a sock_common or NULL */
> >         RET_PTR_TO_ALLOC_MEM_OR_NULL,   /* returns a pointer to dynamically allocated memory or NULL */
> >         RET_PTR_TO_BTF_ID_OR_NULL,      /* returns a pointer to a btf_id or NULL */
> > +       RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL, /* returns a pointer to a valid memory or a btf_id or NULL */
> >  };
> >
> >  /* eBPF function prototype used by verifier to allow BPF_CALLs from eBPF programs
> > @@ -385,6 +387,7 @@ enum bpf_reg_type {
> >         PTR_TO_RDONLY_BUF_OR_NULL, /* reg points to a readonly buffer or NULL */
> >         PTR_TO_RDWR_BUF,         /* reg points to a read/write buffer */
> >         PTR_TO_RDWR_BUF_OR_NULL, /* reg points to a read/write buffer or NULL */
> > +       PTR_TO_PERCPU_BTF_ID,    /* reg points to percpu kernel type */
> >  };
> >
> >  /* The information passed from prog-specific *_is_valid_access
> > diff --git a/include/linux/btf.h b/include/linux/btf.h
> > index 592373d359b9..07b7de1c05b0 100644
> > --- a/include/linux/btf.h
> > +++ b/include/linux/btf.h
> > @@ -71,6 +71,11 @@ btf_resolve_size(const struct btf *btf, const struct btf_type *type,
> >              i < btf_type_vlen(struct_type);                    \
> >              i++, member++)
> >
> > +#define for_each_vsi(i, struct_type, member)                   \
>
> datasec_type?
>

Hmmm, right. It seems the name was carried over when this was
copy-pasted from "for_each_member".

> > +       for (i = 0, member = btf_type_var_secinfo(struct_type); \
> > +            i < btf_type_vlen(struct_type);                    \
> > +            i++, member++)
> > +
> >  static inline bool btf_type_is_ptr(const struct btf_type *t)
> >  {
> >         return BTF_INFO_KIND(t->info) == BTF_KIND_PTR;
> > @@ -155,6 +160,12 @@ static inline const struct btf_member *btf_type_member(const struct btf_type *t)
> >         return (const struct btf_member *)(t + 1);
> >  }
> >
> > +static inline const struct btf_var_secinfo *btf_type_var_secinfo(
> > +               const struct btf_type *t)
> > +{
> > +       return (const struct btf_var_secinfo *)(t + 1);
> > +}
> > +
> >  #ifdef CONFIG_BPF_SYSCALL
> >  const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id);
> >  const char *btf_name_by_offset(const struct btf *btf, u32 offset);
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index ab00ad9b32e5..d0ec94d5bdbf 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -3596,6 +3596,22 @@ union bpf_attr {
> >   *             the data in *dst*. This is a wrapper of copy_from_user().
> >   *     Return
> >   *             0 on success, or a negative error in case of failure.
> > + *
> > + * void *bpf_per_cpu_ptr(const void *percpu_ptr, u32 cpu)
> > + *     Description
> > + *             Take a pointer to a percpu ksym, *percpu_ptr*, and return a
> > + *             pointer to the percpu kernel variable on *cpu*. A ksym is an
> > + *             extern variable decorated with '__ksym'. For ksym, there is a
> > + *             global var (either static or global) defined of the same name
> > + *             in the kernel. The ksym is percpu if the global var is percpu.
> > + *             The returned pointer points to the global percpu var on *cpu*.
> > + *
> > + *             bpf_per_cpu_ptr() has the same semantic as per_cpu_ptr() in the
> > + *             kernel, except that bpf_per_cpu_ptr() may return NULL. This
> > + *             happens if *cpu* is larger than nr_cpu_ids. The caller of
> > + *             bpf_per_cpu_ptr() must check the returned value.
> > + *     Return
> > + *             A generic pointer pointing to the kernel percpu variable on *cpu*.
>
> Or NULL, if *cpu* is invalid.
>

Ack.
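
For reference, the must-check contract discussed above can be modeled
in plain userspace C. This is only an illustration of the helper's
semantics, not kernel code: MODEL_NR_CPU_IDS, model_prog_active and
model_per_cpu_ptr() are made-up stand-ins for the kernel's nr_cpu_ids
and per-CPU areas.

```c
#include <stddef.h>

/* One copy of the variable per possible CPU; an out-of-range cpu
 * index yields NULL, which the caller must check before dereferencing.
 */
#define MODEL_NR_CPU_IDS 4

static int model_prog_active[MODEL_NR_CPU_IDS];

static int *model_per_cpu_ptr(int *base, unsigned int cpu)
{
	if (cpu >= MODEL_NR_CPU_IDS)
		return NULL;	/* caller must check, as the doc says */
	return &base[cpu];
}
```

A BPF program using the real helper has to perform the same NULL check,
which is what the verifier enforces through the _OR_NULL return type.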

> >   */
> >  #define __BPF_FUNC_MAPPER(FN)          \
> >         FN(unspec),                     \
> > @@ -3747,6 +3763,7 @@ union bpf_attr {
> >         FN(inode_storage_delete),       \
> >         FN(d_path),                     \
> >         FN(copy_from_user),             \
> > +       FN(bpf_per_cpu_ptr),            \
> >         /* */
> >
>
> [...]
>
> > @@ -4003,6 +4008,15 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 arg,
> >                         if (type != expected_type)
> >                                 goto err_type;
> >                 }
> > +       } else if (arg_type == ARG_PTR_TO_PERCPU_BTF_ID) {
> > +               expected_type = PTR_TO_PERCPU_BTF_ID;
> > +               if (type != expected_type)
> > +                       goto err_type;
> > +               if (!reg->btf_id) {
> > +                       verbose(env, "Helper has invalid btf_id in R%d\n", regno);
> > +                       return -EACCES;
> > +               }
> > +               meta->ret_btf_id = reg->btf_id;
> >         } else if (arg_type == ARG_PTR_TO_BTF_ID) {
> >                 bool ids_match = false;
> >
> > @@ -5002,6 +5016,30 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
> >                 regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL;
> >                 regs[BPF_REG_0].id = ++env->id_gen;
> >                 regs[BPF_REG_0].mem_size = meta.mem_size;
> > +       } else if (fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL) {
>
> Given this is internal implementation detail, this return type is
> fine, but I'm wondering if it would be better to just make
> PTR_TO_BTF_ID to allow not just structs? E.g., if we have an int, just
> allow reading those 4 bytes.
>
> Not sure what the implications are in terms of implementation, but
> conceptually that shouldn't be a problem, given we do have BTF type ID
> describing size and all.
>

Yeah. Totally agree. I looked at it initially. My take is that
PTR_TO_BTF_ID is currently meant for struct types, and it would require
some code refactoring to break this assumption. I can add it to my TODO
list and investigate whether this makes more sense.

> > +               const struct btf_type *t;
> > +
> > +               mark_reg_known_zero(env, regs, BPF_REG_0);
> > +               t = btf_type_skip_modifiers(btf_vmlinux, meta.ret_btf_id, NULL);
> > +               if (!btf_type_is_struct(t)) {
> > +                       u32 tsize;
> > +                       const struct btf_type *ret;
> > +                       const char *tname;
> > +
> > +                       /* resolve the type size of ksym. */
> > +                       ret = btf_resolve_size(btf_vmlinux, t, &tsize);
> > +                       if (IS_ERR(ret)) {
> > +                               tname = btf_name_by_offset(btf_vmlinux, t->name_off);
> > +                               verbose(env, "unable to resolve the size of type '%s': %ld\n",
> > +                                       tname, PTR_ERR(ret));
> > +                               return -EINVAL;
> > +                       }
> > +                       regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL;
> > +                       regs[BPF_REG_0].mem_size = tsize;
> > +               } else {
> > +                       regs[BPF_REG_0].type = PTR_TO_BTF_ID_OR_NULL;
> > +                       regs[BPF_REG_0].btf_id = meta.ret_btf_id;
> > +               }
> >         } else if (fn->ret_type == RET_PTR_TO_BTF_ID_OR_NULL) {
> >                 int ret_btf_id;
> >
>
> [...]
>
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index b2a5380eb187..d474c1530f87 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -1144,6 +1144,22 @@ static const struct bpf_func_proto bpf_d_path_proto = {
> >         .allowed        = bpf_d_path_allowed,
> >  };
> >
> > +BPF_CALL_2(bpf_per_cpu_ptr, const void *, ptr, u32, cpu)
> > +{
> > +       if (cpu >= nr_cpu_ids)
> > +               return 0;
> > +
> > +       return (u64)per_cpu_ptr(ptr, cpu);
>
> not sure, but on 32-bit arches this might cause compilation warning,
> case to (unsigned long) instead?
>

Ah, I see, good catch! Will fix, thanks.
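
For the record, a userspace illustration of the issue (not the actual
kernel change): on a 32-bit arch a pointer is 32 bits wide, so casting
it straight to u64 trips gcc's pointer-to-int-cast warning. Casting
through unsigned long, which matches the pointer size on Linux targets,
widens cleanly on both 32- and 64-bit.

```c
#include <stdint.h>

/* Widening a pointer through unsigned long first avoids
 * -Wpointer-to-int-cast on 32-bit targets, where unsigned long and
 * pointers are both 32 bits. */
static uint64_t ptr_to_u64(const void *p)
{
	return (uint64_t)(unsigned long)p;
}
```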


> > +}
> > +
> > +static const struct bpf_func_proto bpf_per_cpu_ptr_proto = {
> > +       .func           = bpf_per_cpu_ptr,
> > +       .gpl_only       = false,
> > +       .ret_type       = RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL,
> > +       .arg1_type      = ARG_PTR_TO_PERCPU_BTF_ID,
> > +       .arg2_type      = ARG_ANYTHING,
> > +};
> > +
>
> [...]


* Re: [PATCH bpf-next v2 5/6] bpf: Introduce bpf_this_cpu_ptr()
  2020-09-04 20:09   ` Andrii Nakryiko
@ 2020-09-14  5:04     ` Hao Luo
  0 siblings, 0 replies; 21+ messages in thread
From: Hao Luo @ 2020-09-14  5:04 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

Thanks for taking a look!

On Fri, Sep 4, 2020 at 1:09 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Sep 3, 2020 at 3:35 PM Hao Luo <haoluo@google.com> wrote:
> >
> > Add bpf_this_cpu_ptr() to help access percpu var on this cpu. This
> > helper always returns a valid pointer, therefore no need to check
> > returned value for NULL. Also note that all programs run with
> > preemption disabled, which means that the returned pointer is stable
> > during all the execution of the program.
> >
> > Signed-off-by: Hao Luo <haoluo@google.com>
> > ---
>
> looks good, few small things, but otherwise:
>
> Acked-by: Andrii Nakryiko <andriin@fb.com>
>
[...]
> >
> >  /* eBPF function prototype used by verifier to allow BPF_CALLs from eBPF programs
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index d0ec94d5bdbf..e7ca91c697ed 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -3612,6 +3612,19 @@ union bpf_attr {
> >   *             bpf_per_cpu_ptr() must check the returned value.
> >   *     Return
> >   *             A generic pointer pointing to the kernel percpu variable on *cpu*.
> > + *
> > + * void *bpf_this_cpu_ptr(const void *percpu_ptr)
> > + *     Description
> > + *             Take a pointer to a percpu ksym, *percpu_ptr*, and return a
> > + *             pointer to the percpu kernel variable on this cpu. See the
> > + *             description of 'ksym' in **bpf_per_cpu_ptr**\ ().
> > + *
> > + *             bpf_this_cpu_ptr() has the same semantic as this_cpu_ptr() in
> > + *             the kernel. Different from **bpf_per_cpu_ptr**\ (), it would
> > + *             never return NULL.
> > + *     Return
> > + *             A generic pointer pointing to the kernel percpu variable on
>
> what's "a generic pointer"? is it as opposed to sk_buff pointer or something?
>

Ack. "A pointer" should be good enough. I wrote "generic pointer"
because per_cpu_ptr() in the kernel is a macro whose return value is a
typed pointer, IIUC, whereas here the type is lost. That is another
difference between this helper and per_cpu_ptr(), but it may not
matter.

> >  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
> > diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> > index a702600ff581..e070d2abc405 100644
> > --- a/kernel/bpf/verifier.c
> > +++ b/kernel/bpf/verifier.c
> > @@ -5016,8 +5016,10 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
> >                 regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL;
> >                 regs[BPF_REG_0].id = ++env->id_gen;
> >                 regs[BPF_REG_0].mem_size = meta.mem_size;
> > -       } else if (fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL) {
> > +       } else if (fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL ||
> > +                  fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID) {
> >                 const struct btf_type *t;
> > +               bool not_null = fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID;
>
> nit: this is fine, but I'd inline it below
>

Ack.

> >
> >                 mark_reg_known_zero(env, regs, BPF_REG_0);
> >                 t = btf_type_skip_modifiers(btf_vmlinux, meta.ret_btf_id, NULL);
> > @@ -5034,10 +5036,12 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
> >                                         tname, PTR_ERR(ret));
> >                                 return -EINVAL;
> >                         }
> > -                       regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL;
> > +                       regs[BPF_REG_0].type = not_null ?
> > +                               PTR_TO_MEM : PTR_TO_MEM_OR_NULL;
> >                         regs[BPF_REG_0].mem_size = tsize;
> >                 } else {
> > -                       regs[BPF_REG_0].type = PTR_TO_BTF_ID_OR_NULL;
> > +                       regs[BPF_REG_0].type = not_null ?
> > +                               PTR_TO_BTF_ID : PTR_TO_BTF_ID_OR_NULL;
> >                         regs[BPF_REG_0].btf_id = meta.ret_btf_id;
> >                 }
> >         } else if (fn->ret_type == RET_PTR_TO_BTF_ID_OR_NULL) {
> > diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> > index d474c1530f87..466acf82a9c7 100644
> > --- a/kernel/trace/bpf_trace.c
> > +++ b/kernel/trace/bpf_trace.c
> > @@ -1160,6 +1160,18 @@ static const struct bpf_func_proto bpf_per_cpu_ptr_proto = {
> >         .arg2_type      = ARG_ANYTHING,
> >  };
> >
> > +BPF_CALL_1(bpf_this_cpu_ptr, const void *, percpu_ptr)
> > +{
> > +       return (u64)this_cpu_ptr(percpu_ptr);
>
> see previous comment, this might trigger unnecessary compilation
> warnings on 32-bit arches
>

Ack. Will cast to "unsigned long". Thanks for catching this!


> > +}
> > +
> > +static const struct bpf_func_proto bpf_this_cpu_ptr_proto = {
> > +       .func           = bpf_this_cpu_ptr,
> > +       .gpl_only       = false,
> > +       .ret_type       = RET_PTR_TO_MEM_OR_BTF_ID,
> > +       .arg1_type      = ARG_PTR_TO_PERCPU_BTF_ID,
> > +};
> > +
>
> [...]


* Re: [PATCH bpf-next v2 6/6] bpf/selftests: Test for bpf_per_cpu_ptr() and bpf_this_cpu_ptr()
  2020-09-04 20:15   ` Andrii Nakryiko
@ 2020-09-14 16:59     ` Hao Luo
  0 siblings, 0 replies; 21+ messages in thread
From: Hao Luo @ 2020-09-14 16:59 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: Networking, bpf, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

Thanks for taking a look!

On Fri, Sep 4, 2020 at 1:15 PM Andrii Nakryiko
<andrii.nakryiko@gmail.com> wrote:
>
> On Thu, Sep 3, 2020 at 3:35 PM Hao Luo <haoluo@google.com> wrote:
> >
> > Test bpf_per_cpu_ptr() and bpf_this_cpu_ptr(). Test two paths in the
> > kernel. If the base pointer points to a struct, the returned reg is
> > of type PTR_TO_BTF_ID. Direct pointer dereference can be applied on
> > the returned variable. If the base pointer isn't a struct, the
> > returned reg is of type PTR_TO_MEM, which also supports direct pointer
> > dereference.
> >
> > Acked-by: Andrii Nakryiko <andriin@fb.com>
> > Signed-off-by: Hao Luo <haoluo@google.com>
> > ---
> >  .../selftests/bpf/prog_tests/ksyms_btf.c      | 10 +++++++
> >  .../selftests/bpf/progs/test_ksyms_btf.c      | 26 +++++++++++++++++++
> >  2 files changed, 36 insertions(+)
> >
> > diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
> > index 7b6846342449..22cc642dbc0e 100644
> > --- a/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
> > +++ b/tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
> > @@ -58,6 +58,16 @@ void test_ksyms_btf(void)
> >         CHECK(data->out__bpf_prog_active != bpf_prog_active_addr, "bpf_prog_active",
> >               "got %llu, exp %llu\n", data->out__bpf_prog_active, bpf_prog_active_addr);
> >
> > +       CHECK(data->out__rq_cpu == -1, "rq_cpu",
> > +             "got %u, exp != -1\n", data->out__rq_cpu);
> > +       CHECK(data->out__percpu_bpf_prog_active == -1, "percpu_bpf_prog_active",
> > +             "got %d, exp != -1\n", data->out__percpu_bpf_prog_active);
> > +
> > +       CHECK(data->out__this_rq_cpu == -1, "this_rq_cpu",
> > +             "got %u, exp != -1\n", data->out__this_rq_cpu);
> > +       CHECK(data->out__this_bpf_prog_active == -1, "this_bpf_prog_active",
> > +             "got %d, exp != -1\n", data->out__this_bpf_prog_active);
>
> see below for few suggestions to make these test more specific
>
> out__this_bpf_prog_active it should always be > 0, no?
>

I could be wrong, but I remember raw_tracepoint is not tracked by
bpf_prog_active. So I used bpf_prog_active >= 0 to be safe.

> > +
> >  cleanup:
> >         test_ksyms_btf__destroy(skel);
> >  }
> > diff --git a/tools/testing/selftests/bpf/progs/test_ksyms_btf.c b/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
> > index e04e31117f84..02d564349892 100644
> > --- a/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
> > +++ b/tools/testing/selftests/bpf/progs/test_ksyms_btf.c
> > @@ -8,15 +8,41 @@
> >  __u64 out__runqueues = -1;
> >  __u64 out__bpf_prog_active = -1;
> >
> > +__u32 out__rq_cpu = -1; /* percpu struct fields */
> > +int out__percpu_bpf_prog_active = -1; /* percpu int */
> > +
> > +__u32 out__this_rq_cpu = -1;
> > +int out__this_bpf_prog_active = -1;
> > +
> >  extern const struct rq runqueues __ksym; /* struct type global var. */
> >  extern const int bpf_prog_active __ksym; /* int type global var. */
> >
> >  SEC("raw_tp/sys_enter")
> >  int handler(const void *ctx)
> >  {
> > +       struct rq *rq;
> > +       int *active;
> > +       __u32 cpu;
> > +
> >         out__runqueues = (__u64)&runqueues;
> >         out__bpf_prog_active = (__u64)&bpf_prog_active;
> >
> > +       cpu = bpf_get_smp_processor_id();
> > +
> > +       /* test bpf_per_cpu_ptr() */
> > +       rq = (struct rq *)bpf_per_cpu_ptr(&runqueues, cpu);
> > +       if (rq)
> > +               out__rq_cpu = rq->cpu;
> > +       active = (int *)bpf_per_cpu_ptr(&bpf_prog_active, cpu);
> > +       if (active)
> > +               out__percpu_bpf_prog_active = *active;
>
> this is equivalent to using bpf_this_cpu_ptr(), so:
>
> 1. you can compare value with out__this_xxx in user-space
>
> 2. it's interesting to also test that you can read value from some
> other CPU. Can you add another variable and get value from CPU #0
> always? E.g., for out__cpu_0_rq_cpu it should always be zero, right?
>

Ack. That makes sense. You are right, out__cpu_0_rq_cpu is always zero.

> > +
> > +       /* test bpf_this_cpu_ptr */
> > +       rq = (struct rq *)bpf_this_cpu_ptr(&runqueues);
> > +       out__this_rq_cpu = rq->cpu;
> > +       active = (int *)bpf_this_cpu_ptr(&bpf_prog_active);
> > +       out__this_bpf_prog_active = *active;
> > +
> >         return 0;
> >  }
> >
> > --
> > 2.28.0.526.ge36021eeef-goog
> >


* Re: [PATCH bpf-next v2 4/6] bpf: Introduce bpf_per_cpu_ptr()
  2020-09-14  5:01     ` Hao Luo
@ 2020-09-14 18:09       ` Andrii Nakryiko
  0 siblings, 0 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2020-09-14 18:09 UTC (permalink / raw)
  To: Hao Luo
  Cc: Networking, bpf, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

On Sun, Sep 13, 2020 at 10:01 PM Hao Luo <haoluo@google.com> wrote:
>
> Thanks for review, Andrii.
>
> One question, should I add bpf_{per, this}_cpu_ptr() to the
> bpf_base_func_proto() in kernel/bpf/helpers.c?

Yes, probably, but given that it allows poking at kernel memory, it
probably needs to be guarded by a perfmon_capable() check, similar to
bpf_get_current_task_proto.

>
> On Fri, Sep 4, 2020 at 1:04 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Sep 3, 2020 at 3:35 PM Hao Luo <haoluo@google.com> wrote:
> > >
> > > Add bpf_per_cpu_ptr() to help bpf programs access percpu vars.
> > > bpf_per_cpu_ptr() has the same semantic as per_cpu_ptr() in the kernel
> > > except that it may return NULL. This happens when the cpu parameter is
> > > out of range. So the caller must check the returned value.
> > >
> > > Acked-by: Andrii Nakryiko <andriin@fb.com>
> > > Signed-off-by: Hao Luo <haoluo@google.com>
> > > ---
> > >  include/linux/bpf.h            |  3 ++
> > >  include/linux/btf.h            | 11 ++++++
> > >  include/uapi/linux/bpf.h       | 17 +++++++++
> > >  kernel/bpf/btf.c               | 10 ------
> > >  kernel/bpf/verifier.c          | 66 +++++++++++++++++++++++++++++++---
> > >  kernel/trace/bpf_trace.c       | 18 ++++++++++
> > >  tools/include/uapi/linux/bpf.h | 17 +++++++++
> > >  7 files changed, 128 insertions(+), 14 deletions(-)
> > >

[...]

> > > @@ -5002,6 +5016,30 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
> > >                 regs[BPF_REG_0].type = PTR_TO_MEM_OR_NULL;
> > >                 regs[BPF_REG_0].id = ++env->id_gen;
> > >                 regs[BPF_REG_0].mem_size = meta.mem_size;
> > > +       } else if (fn->ret_type == RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL) {
> >
> > Given this is internal implementation detail, this return type is
> > fine, but I'm wondering if it would be better to just make
> > PTR_TO_BTF_ID to allow not just structs? E.g., if we have an int, just
> > allow reading those 4 bytes.
> >
> > Not sure what the implications are in terms of implementation, but
> > conceptually that shouldn't be a problem, given we do have BTF type ID
> > describing size and all.
> >
>
> Yeah. Totally agree. I looked at it initially. My take is
> PTR_TO_BTF_ID is meant for struct types. It required some code
> refactoring to break this assumption. I can add it to my TODO list and
> investigate it if this makes more sense.

PTR_TO_BTF_ID was *implemented* for struct, but at least naming-wise
nothing suggests it has to be restricted to structs. But yeah, this
should be a separate change, don't block your patches on that.

>
> > > +               const struct btf_type *t;
> > > +

[...]


* Re: [PATCH bpf-next v2 3/6] bpf/selftests: ksyms_btf to test typed ksyms
  2020-09-14  4:58     ` Hao Luo
@ 2020-09-14 22:06       ` Andrii Nakryiko
  0 siblings, 0 replies; 21+ messages in thread
From: Andrii Nakryiko @ 2020-09-14 22:06 UTC (permalink / raw)
  To: Hao Luo
  Cc: Networking, bpf, open list, open list:KERNEL SELFTEST FRAMEWORK,
	Shuah Khan, Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	KP Singh, Quentin Monnet, Steven Rostedt, Ingo Molnar,
	Andrey Ignatov, Jakub Sitnicki

On Sun, Sep 13, 2020 at 9:58 PM Hao Luo <haoluo@google.com> wrote:
>
> Thanks for taking a look, Andrii.
>
> On Fri, Sep 4, 2020 at 12:49 PM Andrii Nakryiko
> <andrii.nakryiko@gmail.com> wrote:
> >
> > On Thu, Sep 3, 2020 at 3:35 PM Hao Luo <haoluo@google.com> wrote:
> > >
> > > Selftests for typed ksyms. Tests two types of ksyms: one is a struct,
> > > the other is a plain int. This tests two paths in the kernel. Struct
> > > ksyms will be converted into PTR_TO_BTF_ID by the verifier while int
> > > typed ksyms will be converted into PTR_TO_MEM.
> > >
> > > Signed-off-by: Hao Luo <haoluo@google.com>
> > > ---
> > >  .../testing/selftests/bpf/prog_tests/ksyms.c  | 31 +++------
> > >  .../selftests/bpf/prog_tests/ksyms_btf.c      | 63 +++++++++++++++++++
> > >  .../selftests/bpf/progs/test_ksyms_btf.c      | 23 +++++++
> > >  tools/testing/selftests/bpf/trace_helpers.c   | 26 ++++++++
> > >  tools/testing/selftests/bpf/trace_helpers.h   |  4 ++
> > >  5 files changed, 123 insertions(+), 24 deletions(-)
> > >  create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms_btf.c
> > >  create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_btf.c
> > >
>

[...]

> > > +
> > > +extern const struct rq runqueues __ksym; /* struct type global var. */
> > > +extern const int bpf_prog_active __ksym; /* int type global var. */
> >
> > When we add non-per-CPU kernel variables, I wonder if the fact that we
> > have both per-CPU and global kernel variables under the same __ksym
> > section would cause any problems and confusion? It's not clear to me
> > if we need to have a special __percpu_ksym section or not?..
> >
>
Yeah. Totally agree. I thought about this. I think a separate
__percpu_ksym attribute is *probably* more clear. Not sure though. How
about we introduce a "__percpu_ksym" and make it an alias of "__ksym"
for now? If needed, we can give it an actual section in the future.

Let's keep it in __ksym as is. The verifier will have enough insight to
produce a meaningful error message; it won't be easy to misuse this
feature.

>
> > > +
> > > +SEC("raw_tp/sys_enter")
> > > +int handler(const void *ctx)
> > > +{
> > > +       out__runqueues = (__u64)&runqueues;
> > > +       out__bpf_prog_active = (__u64)&bpf_prog_active;
> > > +
> > > +       return 0;
> > > +}
> > > +
> > > +char _license[] SEC("license") = "GPL";
> > > diff --git a/tools/testing/selftests/bpf/trace_helpers.c b/tools/testing/selftests/bpf/trace_helpers.c
> > > index 4d0e913bbb22..ade555fe8294 100644
> > > --- a/tools/testing/selftests/bpf/trace_helpers.c
> > > +++ b/tools/testing/selftests/bpf/trace_helpers.c
> > > @@ -90,6 +90,32 @@ long ksym_get_addr(const char *name)
> > >         return 0;
> > >  }
> > >
> > > +/* open kallsyms and read symbol addresses on the fly. Without caching all symbols,
> > > + * this is faster than load + find. */
> > > +int kallsyms_find(const char *sym, unsigned long long *addr)
> > > +{
> > > +       char type, name[500];
> > > +       unsigned long long value;
> > > +       int err = 0;
> > > +       FILE *f;
> > > +
> > > +       f = fopen("/proc/kallsyms", "r");
> > > +       if (!f)
> > > +               return -ENOENT;
> > > +
> > > +       while (fscanf(f, "%llx %c %499s%*[^\n]\n", &value, &type, name) > 0) {
> > > +               if (strcmp(name, sym) == 0) {
> > > +                       *addr = value;
> > > +                       goto out;
> > > +               }
> > > +       }
> > > +       err = -EINVAL;
> >
> > These error codes seem backward to me. If you fail to open
> > /proc/kallsyms, that's an unexpected and invalid situation, so EINVAL
> > makes a bit more sense there. But -ENOENT is clearly for cases where
> > you didn't find what you were looking for, which is exactly this case.
> >
> >
>
I thought about it. I used -ENOENT for fopen failure because I found
-ENOENT is for the case when a file/directory is not found, which is
more reasonable for describing an fopen() error. But your proposal also
makes sense, and it is what I originally had. It doesn't sound like a
big deal; I can swap them in v3.

For me, ENOENT is about the logical entity the function is working
with. For fopen() that would be a file, so if it's not found -- ENOENT.
But here, for kallsyms_find, it's a ksym. If /proc/kallsyms isn't there
or can't be opened -- that's unexpected (EINVAL). But if /proc/kallsyms
was opened but didn't contain the entity we are looking for (requested
ksym) -- that's ENOENT.
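
Concretely, applying that convention to kallsyms_find() would look like
the sketch below. The path argument is added here only to keep the
sketch self-contained and easy to exercise; the selftest helper opens
/proc/kallsyms directly.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Sketch of kallsyms_find() with the suggested error codes: -EINVAL if
 * the symbol file cannot be opened at all (unexpected), -ENOENT if it
 * was read but the requested ksym is not in it. */
static int kallsyms_find_in(const char *path, const char *sym,
			    unsigned long long *addr)
{
	char type, name[500];
	unsigned long long value;
	int err = -ENOENT;	/* not found until proven otherwise */
	FILE *f;

	f = fopen(path, "r");
	if (!f)
		return -EINVAL;	/* could not open: unexpected */

	while (fscanf(f, "%llx %c %499s%*[^\n]\n", &value, &type, name) > 0) {
		if (strcmp(name, sym) == 0) {
			*addr = value;
			err = 0;
			break;
		}
	}
	fclose(f);
	return err;
}
```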

>
> > > +
> > > +out:
> > > +       fclose(f);
> > > +       return err;
> > > +}
> > > +
> > >  void read_trace_pipe(void)
> > >  {
> > >         int trace_fd;
> > > diff --git a/tools/testing/selftests/bpf/trace_helpers.h b/tools/testing/selftests/bpf/trace_helpers.h
> > > index 25ef597dd03f..f62fdef9e589 100644
> > > --- a/tools/testing/selftests/bpf/trace_helpers.h
> > > +++ b/tools/testing/selftests/bpf/trace_helpers.h
> > > @@ -12,6 +12,10 @@ struct ksym {
> > >  int load_kallsyms(void);
> > >  struct ksym *ksym_search(long key);
> > >  long ksym_get_addr(const char *name);
> > > +
> > > +/* open kallsyms and find addresses on the fly, faster than load + search. */
> > > +int kallsyms_find(const char *sym, unsigned long long *addr);
> > > +
> > >  void read_trace_pipe(void);
> > >
> > >  #endif
> > > --
> > > 2.28.0.526.ge36021eeef-goog
> > >


end of thread, other threads:[~2020-09-14 22:06 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-09-03 22:33 [PATCH bpf-next v2 0/6] bpf: BTF support for ksyms Hao Luo
2020-09-03 22:33 ` [PATCH bpf-next v2 1/6] bpf: Introduce pseudo_btf_id Hao Luo
2020-09-04 19:05   ` Andrii Nakryiko
2020-09-14  4:55     ` Hao Luo
2020-09-03 22:33 ` [PATCH bpf-next v2 2/6] bpf/libbpf: BTF support for typed ksyms Hao Luo
2020-09-04 19:34   ` Andrii Nakryiko
2020-09-14  4:56     ` Hao Luo
2020-09-03 22:33 ` [PATCH bpf-next v2 3/6] bpf/selftests: ksyms_btf to test " Hao Luo
2020-09-04 19:49   ` Andrii Nakryiko
2020-09-14  4:58     ` Hao Luo
2020-09-14 22:06       ` Andrii Nakryiko
2020-09-03 22:33 ` [PATCH bpf-next v2 4/6] bpf: Introduce bpf_per_cpu_ptr() Hao Luo
2020-09-04 20:04   ` Andrii Nakryiko
2020-09-14  5:01     ` Hao Luo
2020-09-14 18:09       ` Andrii Nakryiko
2020-09-03 22:33 ` [PATCH bpf-next v2 5/6] bpf: Introduce bpf_this_cpu_ptr() Hao Luo
2020-09-04 20:09   ` Andrii Nakryiko
2020-09-14  5:04     ` Hao Luo
2020-09-03 22:33 ` [PATCH bpf-next v2 6/6] bpf/selftests: Test for bpf_per_cpu_ptr() and bpf_this_cpu_ptr() Hao Luo
2020-09-04 20:15   ` Andrii Nakryiko
2020-09-14 16:59     ` Hao Luo
