[PATCH v3 bpf-next 0/7] Support kernel module ksym variables

* [PATCH v3 bpf-next 0/7] Support kernel module ksym variables
@ 2021-01-12  7:55 Andrii Nakryiko
  2021-01-12  7:55 ` [PATCH v3 bpf-next 1/7] bpf: add bpf_patch_call_args prototype to include/linux/bpf.h Andrii Nakryiko
                   ` (6 more replies)
  0 siblings, 7 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2021-01-12  7:55 UTC
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team, Hao Luo

Add support for using kernel module global variables (__ksym externs in BPF
program). BPF verifier will now support ldimm64 with src_reg=BPF_PSEUDO_BTF_ID
and non-zero insn[1].imm field, specifying module BTF's FD. In such case,
module BTF object, similarly to BPF maps referenced from ldimm64 with
src_reg=BPF_PSEUDO_MAP_FD, will be recorded in bpf_prog's auxiliary data
and refcnt will be increased for both BTF object itself and its kernel module.
This makes sure kernel module won't be unloaded from under active attached BPF
program. These refcounts will be dropped when BPF program is unloaded.

New selftest validates all this is working as intended. bpf_testmod.ko is
extended with per-CPU variable. Selftests expect the latest pahole changes
(soon to be released as v1.20) to generate per-CPU variable BTF info for
kernel module.

v2->v3:
  - added comments, addressed feedback (Yonghong, Hao);
v1->v2:
  - fixed a few compiler warnings, posted as separate pre-patches;
rfc->v1:
  - use sys_membarrier(MEMBARRIER_CMD_GLOBAL) (Alexei).

Cc: Hao Luo <haoluo@google.com>

Andrii Nakryiko (7):
  bpf: add bpf_patch_call_args prototype to include/linux/bpf.h
  bpf: avoid warning when re-casting __bpf_call_base into
    __bpf_call_base_args
  bpf: declare __bpf_free_used_maps() unconditionally
  selftests/bpf: sync RCU before unloading bpf_testmod
  bpf: support BPF ksym variables in kernel modules
  libbpf: support kernel module ksym externs
  selftests/bpf: test kernel module ksym externs

 include/linux/bpf.h                           |  18 +-
 include/linux/bpf_verifier.h                  |   3 +
 include/linux/btf.h                           |   3 +
 include/linux/filter.h                        |   2 +-
 kernel/bpf/btf.c                              |  31 +++-
 kernel/bpf/core.c                             |  23 +++
 kernel/bpf/verifier.c                         | 154 ++++++++++++++----
 tools/lib/bpf/libbpf.c                        |  50 ++++--
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   |   3 +
 .../selftests/bpf/prog_tests/btf_map_in_map.c |  33 ----
 .../selftests/bpf/prog_tests/ksyms_module.c   |  31 ++++
 .../selftests/bpf/progs/test_ksyms_module.c   |  26 +++
 tools/testing/selftests/bpf/test_progs.c      |  11 ++
 tools/testing/selftests/bpf/test_progs.h      |   1 +
 14 files changed, 305 insertions(+), 84 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms_module.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_module.c

-- 
2.24.1



* [PATCH v3 bpf-next 1/7] bpf: add bpf_patch_call_args prototype to include/linux/bpf.h
  2021-01-12  7:55 [PATCH v3 bpf-next 0/7] Support kernel module ksym variables Andrii Nakryiko
@ 2021-01-12  7:55 ` Andrii Nakryiko
  2021-01-12  7:55 ` [PATCH v3 bpf-next 2/7] bpf: avoid warning when re-casting __bpf_call_base into __bpf_call_base_args Andrii Nakryiko
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2021-01-12  7:55 UTC
  To: bpf, netdev, ast, daniel
  Cc: andrii, kernel-team, Hao Luo, kernel test robot, Yonghong Song

Add bpf_patch_call_args() prototype. This function is called from BPF verifier
and only if CONFIG_BPF_JIT_ALWAYS_ON is not defined. This fixes compiler
warning about missing prototype in some kernel configurations.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 1ea47e01ad6e ("bpf: add support for bpf_call to interpreter")
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/linux/bpf.h | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 07cb5d15e743..ef9309604b3e 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1403,7 +1403,10 @@ static inline void bpf_long_memcpy(void *dst, const void *src, u32 size)
 /* verify correctness of eBPF program */
 int bpf_check(struct bpf_prog **fp, union bpf_attr *attr,
 	      union bpf_attr __user *uattr);
+
+#ifndef CONFIG_BPF_JIT_ALWAYS_ON
 void bpf_patch_call_args(struct bpf_insn *insn, u32 stack_depth);
+#endif
 
 struct btf *bpf_get_btf_vmlinux(void);
 
-- 
2.24.1



* [PATCH v3 bpf-next 2/7] bpf: avoid warning when re-casting __bpf_call_base into __bpf_call_base_args
  2021-01-12  7:55 [PATCH v3 bpf-next 0/7] Support kernel module ksym variables Andrii Nakryiko
  2021-01-12  7:55 ` [PATCH v3 bpf-next 1/7] bpf: add bpf_patch_call_args prototype to include/linux/bpf.h Andrii Nakryiko
@ 2021-01-12  7:55 ` Andrii Nakryiko
  2021-01-12  7:55 ` [PATCH v3 bpf-next 3/7] bpf: declare __bpf_free_used_maps() unconditionally Andrii Nakryiko
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2021-01-12  7:55 UTC
  To: bpf, netdev, ast, daniel
  Cc: andrii, kernel-team, Hao Luo, kernel test robot, Yonghong Song

BPF interpreter uses extra input argument, so re-casts __bpf_call_base into
__bpf_call_base_args. Avoid compiler warning about incompatible function
prototypes by casting to void * first.
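
The cast pattern can be tried outside the kernel. Below is a minimal userspace
sketch, not the kernel code: call_base(), call_base_args_fn and
call_via_roundtrip() are hypothetical stand-ins for __bpf_call_base and the
macro in filter.h. It assumes GCC or Clang, which accept converting a function
pointer through void * as an extension. To stay on well-defined ground the
sketch converts the pointer back to its real type before calling it.

```c
#include <stdint.h>

typedef uint64_t u64;

/* Hypothetical stand-in for __bpf_call_base(): five u64 arguments. */
u64 call_base(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5)
{
	return r1 + r2 + r3 + r4 + r5;
}

/* The real five-argument type and a six-argument variant with an extra
 * trailing pointer, similar in shape to what the interpreter wants.
 */
typedef u64 (*call_base_fn)(u64, u64, u64, u64, u64);
typedef u64 (*call_base_args_fn)(u64, u64, u64, u64, u64, const void *);

/* A direct (call_base_args_fn)call_base cast may trigger
 * -Wcast-function-type; going through void * first avoids the warning.
 */
#define call_base_args ((call_base_args_fn)(void *)call_base)

/* Convert to the wide type and back, then call with the original type. */
u64 call_via_roundtrip(u64 a, u64 b)
{
	call_base_args_fn wide = call_base_args;
	call_base_fn narrow = (call_base_fn)(void *)wide;

	return narrow(a, b, 0, 0, 0);
}
```

Compiling this with -Wextra and then again with the (void *) step removed
should show the difference in diagnostics on compilers that implement
-Wcast-function-type.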

Reported-by: kernel test robot <lkp@intel.com>
Fixes: 1ea47e01ad6e ("bpf: add support for bpf_call to interpreter")
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/linux/filter.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/filter.h b/include/linux/filter.h
index 29c27656165b..5edf2b660881 100644
--- a/include/linux/filter.h
+++ b/include/linux/filter.h
@@ -886,7 +886,7 @@ void sk_filter_uncharge(struct sock *sk, struct sk_filter *fp);
 u64 __bpf_call_base(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
 #define __bpf_call_base_args \
 	((u64 (*)(u64, u64, u64, u64, u64, const struct bpf_insn *)) \
-	 __bpf_call_base)
+	 (void *)__bpf_call_base)
 
 struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *prog);
 void bpf_jit_compile(struct bpf_prog *prog);
-- 
2.24.1



* [PATCH v3 bpf-next 3/7] bpf: declare __bpf_free_used_maps() unconditionally
  2021-01-12  7:55 [PATCH v3 bpf-next 0/7] Support kernel module ksym variables Andrii Nakryiko
  2021-01-12  7:55 ` [PATCH v3 bpf-next 1/7] bpf: add bpf_patch_call_args prototype to include/linux/bpf.h Andrii Nakryiko
  2021-01-12  7:55 ` [PATCH v3 bpf-next 2/7] bpf: avoid warning when re-casting __bpf_call_base into __bpf_call_base_args Andrii Nakryiko
@ 2021-01-12  7:55 ` Andrii Nakryiko
  2021-01-12  7:55 ` [PATCH v3 bpf-next 4/7] selftests/bpf: sync RCU before unloading bpf_testmod Andrii Nakryiko
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2021-01-12  7:55 UTC
  To: bpf, netdev, ast, daniel
  Cc: andrii, kernel-team, Hao Luo, kernel test robot, Yonghong Song

__bpf_free_used_maps() is always defined in kernel/bpf/core.c, while
include/linux/bpf.h is guarding it behind CONFIG_BPF_SYSCALL. Move it out of
that guard region and fix compiler warning.

Reported-by: kernel test robot <lkp@intel.com>
Fixes: a2ea07465c8d ("bpf: Fix missing prog untrack in release_maps")
Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/linux/bpf.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index ef9309604b3e..6e585dbc10df 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1206,8 +1206,6 @@ void bpf_prog_sub(struct bpf_prog *prog, int i);
 void bpf_prog_inc(struct bpf_prog *prog);
 struct bpf_prog * __must_check bpf_prog_inc_not_zero(struct bpf_prog *prog);
 void bpf_prog_put(struct bpf_prog *prog);
-void __bpf_free_used_maps(struct bpf_prog_aux *aux,
-			  struct bpf_map **used_maps, u32 len);
 
 void bpf_prog_free_id(struct bpf_prog *prog, bool do_idr_lock);
 void bpf_map_free_id(struct bpf_map *map, bool do_idr_lock);
@@ -1676,6 +1674,9 @@ static inline struct bpf_prog *bpf_prog_get_type(u32 ufd,
 	return bpf_prog_get_type_dev(ufd, type, false);
 }
 
+void __bpf_free_used_maps(struct bpf_prog_aux *aux,
+			  struct bpf_map **used_maps, u32 len);
+
 bool bpf_prog_get_ok(struct bpf_prog *, enum bpf_prog_type *, bool);
 
 int bpf_prog_offload_compile(struct bpf_prog *prog);
-- 
2.24.1



* [PATCH v3 bpf-next 4/7] selftests/bpf: sync RCU before unloading bpf_testmod
  2021-01-12  7:55 [PATCH v3 bpf-next 0/7] Support kernel module ksym variables Andrii Nakryiko
                   ` (2 preceding siblings ...)
  2021-01-12  7:55 ` [PATCH v3 bpf-next 3/7] bpf: declare __bpf_free_used_maps() unconditionally Andrii Nakryiko
@ 2021-01-12  7:55 ` Andrii Nakryiko
  2021-01-12  7:55 ` [PATCH v3 bpf-next 5/7] bpf: support BPF ksym variables in kernel modules Andrii Nakryiko
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2021-01-12  7:55 UTC
  To: bpf, netdev, ast, daniel
  Cc: andrii, kernel-team, Hao Luo, Alexei Starovoitov, Yonghong Song

If some of the subtests use module BTFs through ksyms, they will cause
bpf_prog to take a refcount on bpf_testmod module, which will prevent it from
successfully unloading. Module's refcnt is decremented when bpf_prog is freed,
which generally happens in RCU callback. So we need to trigger
synchronize_rcu() in the kernel, which can be achieved nicely with
membarrier(MEMBARRIER_CMD_GLOBAL) syscall. So do that in kern_sync_rcu() and
make it available to other tests inside test_progs. This synchronize_rcu()
is called before attempting to unload bpf_testmod.
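
For readers who want to try the syscall in isolation, here is a small
standalone sketch of the same idea. The wrapper name sync_kernel_rcu() is
made up for this example; the patch itself names the helper kern_sync_rcu()
and returns the raw syscall result. It assumes a Linux system whose headers
define MEMBARRIER_CMD_GLOBAL, and it returns -errno on failure so a caller can
tell a missing or filtered syscall apart from success.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/membarrier.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Ask the kernel for a global memory barrier. Per the commit message above,
 * this is used as a cheap way to make the kernel run synchronize_rcu().
 * Returns 0 on success, negative errno on failure.
 */
int sync_kernel_rcu(void)
{
	if (syscall(__NR_membarrier, MEMBARRIER_CMD_GLOBAL, 0, 0))
		return -errno;
	return 0;
}
```

A caller that is about to remove a module can treat a non-zero return as a
warning only, as unload_bpf_testmod() does in the diff below, since the module
removal may still succeed.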

Fixes: 9f7fa225894c ("selftests/bpf: Add bpf_testmod kernel module for testing")
Suggested-by: Alexei Starovoitov <ast@kernel.org>
Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Hao Luo <haoluo@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../selftests/bpf/prog_tests/btf_map_in_map.c | 33 -------------------
 tools/testing/selftests/bpf/test_progs.c      | 11 +++++++
 tools/testing/selftests/bpf/test_progs.h      |  1 +
 3 files changed, 12 insertions(+), 33 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c b/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c
index 76ebe4c250f1..eb90a6b8850d 100644
--- a/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c
+++ b/tools/testing/selftests/bpf/prog_tests/btf_map_in_map.c
@@ -20,39 +20,6 @@ static __u32 bpf_map_id(struct bpf_map *map)
 	return info.id;
 }
 
-/*
- * Trigger synchronize_rcu() in kernel.
- *
- * ARRAY_OF_MAPS/HASH_OF_MAPS lookup/update operations trigger synchronize_rcu()
- * if looking up an existing non-NULL element or updating the map with a valid
- * inner map FD. Use this fact to trigger synchronize_rcu(): create map-in-map,
- * create a trivial ARRAY map, update map-in-map with ARRAY inner map. Then
- * cleanup. At the end, at least one synchronize_rcu() would be called.
- */
-static int kern_sync_rcu(void)
-{
-	int inner_map_fd, outer_map_fd, err, zero = 0;
-
-	inner_map_fd = bpf_create_map(BPF_MAP_TYPE_ARRAY, 4, 4, 1, 0);
-	if (CHECK(inner_map_fd < 0, "inner_map_create", "failed %d\n", -errno))
-		return -1;
-
-	outer_map_fd = bpf_create_map_in_map(BPF_MAP_TYPE_ARRAY_OF_MAPS, NULL,
-					     sizeof(int), inner_map_fd, 1, 0);
-	if (CHECK(outer_map_fd < 0, "outer_map_create", "failed %d\n", -errno)) {
-		close(inner_map_fd);
-		return -1;
-	}
-
-	err = bpf_map_update_elem(outer_map_fd, &zero, &inner_map_fd, 0);
-	if (err)
-		err = -errno;
-	CHECK(err, "outer_map_update", "failed %d\n", err);
-	close(inner_map_fd);
-	close(outer_map_fd);
-	return err;
-}
-
 static void test_lookup_update(void)
 {
 	int map1_fd, map2_fd, map3_fd, map4_fd, map5_fd, map1_id, map2_id;
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index 7d077d48cadd..e3fbca25696c 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -11,6 +11,7 @@
 #include <signal.h>
 #include <string.h>
 #include <execinfo.h> /* backtrace */
+#include <linux/membarrier.h>
 
 #define EXIT_NO_TEST		2
 #define EXIT_ERR_SETUP_INFRA	3
@@ -370,8 +371,18 @@ static int delete_module(const char *name, int flags)
 	return syscall(__NR_delete_module, name, flags);
 }
 
+/*
+ * Trigger synchronize_rcu() in kernel.
+ */
+int kern_sync_rcu(void)
+{
+	return syscall(__NR_membarrier, MEMBARRIER_CMD_GLOBAL, 0, 0);
+}
+
 static void unload_bpf_testmod(void)
 {
+	if (kern_sync_rcu())
+		fprintf(env.stderr, "Failed to trigger kernel-side RCU sync!\n");
 	if (delete_module("bpf_testmod", 0)) {
 		if (errno == ENOENT) {
 			if (env.verbosity > VERBOSE_NONE)
diff --git a/tools/testing/selftests/bpf/test_progs.h b/tools/testing/selftests/bpf/test_progs.h
index 115953243f62..e49e2fdde942 100644
--- a/tools/testing/selftests/bpf/test_progs.h
+++ b/tools/testing/selftests/bpf/test_progs.h
@@ -219,6 +219,7 @@ int bpf_find_map(const char *test, struct bpf_object *obj, const char *name);
 int compare_map_keys(int map1_fd, int map2_fd);
 int compare_stack_ips(int smap_fd, int amap_fd, int stack_trace_len);
 int extract_build_id(char *build_id, size_t size);
+int kern_sync_rcu(void);
 
 #ifdef __x86_64__
 #define SYS_NANOSLEEP_KPROBE_NAME "__x64_sys_nanosleep"
-- 
2.24.1



* [PATCH v3 bpf-next 5/7] bpf: support BPF ksym variables in kernel modules
  2021-01-12  7:55 [PATCH v3 bpf-next 0/7] Support kernel module ksym variables Andrii Nakryiko
                   ` (3 preceding siblings ...)
  2021-01-12  7:55 ` [PATCH v3 bpf-next 4/7] selftests/bpf: sync RCU before unloading bpf_testmod Andrii Nakryiko
@ 2021-01-12  7:55 ` Andrii Nakryiko
  2021-01-12 16:27   ` Daniel Borkmann
  2021-01-12  7:55 ` [PATCH v3 bpf-next 6/7] libbpf: support kernel module ksym externs Andrii Nakryiko
  2021-01-12  7:55 ` [PATCH v3 bpf-next 7/7] selftests/bpf: test " Andrii Nakryiko
  6 siblings, 1 reply; 13+ messages in thread
From: Andrii Nakryiko @ 2021-01-12  7:55 UTC
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team, Hao Luo, Yonghong Song

Add support for directly accessing kernel module variables from BPF programs
using special ldimm64 instructions. This functionality builds upon vmlinux
ksym support, but extends ldimm64 with src_reg=BPF_PSEUDO_BTF_ID to allow
specifying kernel module BTF's FD in insn[1].imm field.
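
As an illustration of the instruction layout described above, the following
sketch fills in the two 8-byte halves of such an ldimm64. struct insn is a
local copy of the instruction layout written for this example (assumed to
match the uapi struct bpf_insn), and emit_ksym_ldimm64() is a hypothetical
helper, not a kernel or libbpf function. The constant 3 for
BPF_PSEUDO_BTF_ID and 0x18 for BPF_LD | BPF_IMM | BPF_DW are taken from the
uapi header as I understand it.

```c
#include <stdint.h>
#include <string.h>

/* Local mirror of the BPF instruction layout (assumption, see lead-in). */
struct insn {
	uint8_t code;
	uint8_t dst_reg:4;
	uint8_t src_reg:4;
	int16_t off;
	int32_t imm;
};

#define LD_IMM64_OPCODE	0x18	/* BPF_LD | BPF_IMM | BPF_DW */
#define PSEUDO_BTF_ID	3	/* BPF_PSEUDO_BTF_ID */

/* Encode "load address of kernel variable btf_id from BTF object btf_fd".
 * btf_fd == 0 means vmlinux BTF, non-zero means a module BTF's FD.
 */
void emit_ksym_ldimm64(struct insn insn[2], int dst_reg,
		       int32_t btf_id, int32_t btf_fd)
{
	memset(insn, 0, 2 * sizeof(*insn));
	insn[0].code = LD_IMM64_OPCODE;
	insn[0].dst_reg = dst_reg;
	insn[0].src_reg = PSEUDO_BTF_ID;
	insn[0].imm = btf_id;	/* BTF type ID of the VAR */
	insn[1].imm = btf_fd;	/* previously reserved, had to be 0 */
}
```

After successful verification the kernel overwrites both imm fields with the
low and high 32 bits of the resolved symbol address, as the verifier diff
below shows.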

During BPF program load time, verifier will resolve FD to BTF object and will
take reference on BTF object itself and, for module BTFs, corresponding module
as well, to make sure it won't be unloaded from under running BPF program. The
mechanism used is similar to how bpf_prog keeps track of used bpf_maps.
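
The bookkeeping can be modelled in plain C. The sketch below is a simplified
userspace model with made-up types (mock_btf, mock_env) and plain integer
counters instead of real refcounts; it only illustrates the "take a reference,
deduplicate, drop everything on release" flow and is not the verifier code.

```c
#include <stddef.h>

#define MAX_USED 4	/* small stand-in for MAX_USED_BTFS (64 in the patch) */

/* Hypothetical model of a BTF object and its owning module. */
struct mock_btf {
	int refcnt;		/* references on the BTF object */
	int is_module;		/* non-zero for module BTF */
	int module_refcnt;	/* references on the owning module */
};

struct mock_env {
	struct mock_btf *used[MAX_USED];
	int used_cnt;
};

/* Record btf as used by the program. Returns 0 on success, -1 if full. */
int record_btf(struct mock_env *env, struct mock_btf *btf)
{
	int i;

	btf->refcnt++;	/* reference taken while resolving the FD */

	for (i = 0; i < env->used_cnt; i++) {
		if (env->used[i] == btf) {
			btf->refcnt--;	/* already recorded, drop extra ref */
			return 0;
		}
	}

	if (env->used_cnt >= MAX_USED) {
		btf->refcnt--;
		return -1;
	}

	env->used[env->used_cnt++] = btf;
	if (btf->is_module)
		btf->module_refcnt++;	/* pin the module as well */
	return 0;
}

/* Drop all references held on behalf of the program. */
void release_recorded_btfs(struct mock_env *env)
{
	int i;

	for (i = 0; i < env->used_cnt; i++) {
		if (env->used[i]->is_module)
			env->used[i]->module_refcnt--;
		env->used[i]->refcnt--;
	}
	env->used_cnt = 0;
}
```

Recording the same BTF object twice leaves exactly one BTF reference and one
module reference, which is the property that keeps the release path simple.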

One interesting change is also in how per-CPU variable is determined. The
logic is to find .data..percpu data section in provided BTF, but both vmlinux
and module each have their own .data..percpu entries in BTF. So for module's
case, the search for DATASEC record needs to look at only module's added BTF
types. This is implemented with custom search function.
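
To make the ID-range argument concrete, here is a toy model of the search.
The types are mock structures invented for this example (mock_type,
mock_kind); real BTF is more involved. The only point shown is that a module's
BTF is layered on top of vmlinux types, so starting the scan at the first
module-owned type ID finds the module's own ".data..percpu" DATASEC rather
than the vmlinux one.

```c
#include <string.h>

enum mock_kind { KIND_VOID, KIND_INT, KIND_VAR, KIND_DATASEC };

/* Hypothetical flattened view of a BTF type: kind plus name. */
struct mock_type {
	enum mock_kind kind;
	const char *name;
};

/* Scan type IDs in [start, total) for a ".data..percpu" DATASEC.
 * For vmlinux, start is 1. For a module, start is the number of vmlinux
 * types, so inherited vmlinux types are skipped.
 * Returns the type ID, or -1 if none is found.
 */
int find_percpu_datasec(const struct mock_type *types, int total, int start)
{
	int i;

	for (i = start; i < total; i++) {
		if (types[i].kind != KIND_DATASEC)
			continue;
		if (strcmp(types[i].name, ".data..percpu") == 0)
			return i;
	}
	return -1;
}
```

In the diff below the starting index for a module comes from
btf_nr_types(btf_vmlinux), and the end of the range from btf_nr_types(btf).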

Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Hao Luo <haoluo@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 include/linux/bpf.h          |  10 +++
 include/linux/bpf_verifier.h |   3 +
 include/linux/btf.h          |   3 +
 kernel/bpf/btf.c             |  31 ++++++-
 kernel/bpf/core.c            |  23 ++++++
 kernel/bpf/verifier.c        | 154 ++++++++++++++++++++++++++++-------
 6 files changed, 194 insertions(+), 30 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 6e585dbc10df..1aac2af12fed 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -761,9 +761,15 @@ struct bpf_ctx_arg_aux {
 	u32 btf_id;
 };
 
+struct btf_mod_pair {
+	struct btf *btf;
+	struct module *module;
+};
+
 struct bpf_prog_aux {
 	atomic64_t refcnt;
 	u32 used_map_cnt;
+	u32 used_btf_cnt;
 	u32 max_ctx_offset;
 	u32 max_pkt_offset;
 	u32 max_tp_access;
@@ -802,6 +808,7 @@ struct bpf_prog_aux {
 	const struct bpf_prog_ops *ops;
 	struct bpf_map **used_maps;
 	struct mutex used_maps_mutex; /* mutex for used_maps and used_map_cnt */
+	struct btf_mod_pair *used_btfs;
 	struct bpf_prog *prog;
 	struct user_struct *user;
 	u64 load_time; /* ns since boottime */
@@ -1668,6 +1675,9 @@ bpf_base_func_proto(enum bpf_func_id func_id)
 }
 #endif /* CONFIG_BPF_SYSCALL */
 
+void __bpf_free_used_btfs(struct bpf_prog_aux *aux,
+			  struct btf_mod_pair *used_btfs, u32 len);
+
 static inline struct bpf_prog *bpf_prog_get_type(u32 ufd,
 						 enum bpf_prog_type type)
 {
diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
index e941fe1484e5..dfe6f85d97dd 100644
--- a/include/linux/bpf_verifier.h
+++ b/include/linux/bpf_verifier.h
@@ -340,6 +340,7 @@ struct bpf_insn_aux_data {
 };
 
 #define MAX_USED_MAPS 64 /* max number of maps accessed by one eBPF program */
+#define MAX_USED_BTFS 64 /* max number of BTFs accessed by one BPF program */
 
 #define BPF_VERIFIER_TMP_LOG_SIZE	1024
 
@@ -398,7 +399,9 @@ struct bpf_verifier_env {
 	struct bpf_verifier_state_list **explored_states; /* search pruning optimization */
 	struct bpf_verifier_state_list *free_list;
 	struct bpf_map *used_maps[MAX_USED_MAPS]; /* array of map's used by eBPF program */
+	struct btf_mod_pair used_btfs[MAX_USED_BTFS]; /* array of BTF's used by BPF program */
 	u32 used_map_cnt;		/* number of used maps */
+	u32 used_btf_cnt;		/* number of used BTF objects */
 	u32 id_gen;			/* used to generate unique reg IDs */
 	bool allow_ptr_leaks;
 	bool allow_ptr_to_map_access;
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 4c200f5d242b..7fabf1428093 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -91,6 +91,9 @@ int btf_type_snprintf_show(const struct btf *btf, u32 type_id, void *obj,
 int btf_get_fd_by_id(u32 id);
 u32 btf_obj_id(const struct btf *btf);
 bool btf_is_kernel(const struct btf *btf);
+bool btf_is_module(const struct btf *btf);
+struct module *btf_try_get_module(const struct btf *btf);
+u32 btf_nr_types(const struct btf *btf);
 bool btf_member_is_reg_int(const struct btf *btf, const struct btf_type *s,
 			   const struct btf_member *m,
 			   u32 expected_offset, u32 expected_size);
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 8d6bdb4f4d61..7ccc0133723a 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -458,7 +458,7 @@ static bool btf_type_is_datasec(const struct btf_type *t)
 	return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC;
 }
 
-static u32 btf_nr_types_total(const struct btf *btf)
+u32 btf_nr_types(const struct btf *btf)
 {
 	u32 total = 0;
 
@@ -476,7 +476,7 @@ s32 btf_find_by_name_kind(const struct btf *btf, const char *name, u8 kind)
 	const char *tname;
 	u32 i, total;
 
-	total = btf_nr_types_total(btf);
+	total = btf_nr_types(btf);
 	for (i = 1; i < total; i++) {
 		t = btf_type_by_id(btf, i);
 		if (BTF_INFO_KIND(t->info) != kind)
@@ -5743,6 +5743,11 @@ bool btf_is_kernel(const struct btf *btf)
 	return btf->kernel_btf;
 }
 
+bool btf_is_module(const struct btf *btf)
+{
+	return btf->kernel_btf && strcmp(btf->name, "vmlinux") != 0;
+}
+
 static int btf_id_cmp_func(const void *a, const void *b)
 {
 	const int *pa = a, *pb = b;
@@ -5877,3 +5882,25 @@ static int __init btf_module_init(void)
 
 fs_initcall(btf_module_init);
 #endif /* CONFIG_DEBUG_INFO_BTF_MODULES */
+
+struct module *btf_try_get_module(const struct btf *btf)
+{
+	struct module *res = NULL;
+#ifdef CONFIG_DEBUG_INFO_BTF_MODULES
+	struct btf_module *btf_mod, *tmp;
+
+	mutex_lock(&btf_module_mutex);
+	list_for_each_entry_safe(btf_mod, tmp, &btf_modules, list) {
+		if (btf_mod->btf != btf)
+			continue;
+
+		if (try_module_get(btf_mod->module))
+			res = btf_mod->module;
+
+		break;
+	}
+	mutex_unlock(&btf_module_mutex);
+#endif
+
+	return res;
+}
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
index 261f8692d0d2..69c3c308de5e 100644
--- a/kernel/bpf/core.c
+++ b/kernel/bpf/core.c
@@ -2119,6 +2119,28 @@ static void bpf_free_used_maps(struct bpf_prog_aux *aux)
 	kfree(aux->used_maps);
 }
 
+void __bpf_free_used_btfs(struct bpf_prog_aux *aux,
+			  struct btf_mod_pair *used_btfs, u32 len)
+{
+#ifdef CONFIG_BPF_SYSCALL
+	struct btf_mod_pair *btf_mod;
+	u32 i;
+
+	for (i = 0; i < len; i++) {
+		btf_mod = &used_btfs[i];
+		if (btf_mod->module)
+			module_put(btf_mod->module);
+		btf_put(btf_mod->btf);
+	}
+#endif
+}
+
+static void bpf_free_used_btfs(struct bpf_prog_aux *aux)
+{
+	__bpf_free_used_btfs(aux, aux->used_btfs, aux->used_btf_cnt);
+	kfree(aux->used_btfs);
+}
+
 static void bpf_prog_free_deferred(struct work_struct *work)
 {
 	struct bpf_prog_aux *aux;
@@ -2126,6 +2148,7 @@ static void bpf_prog_free_deferred(struct work_struct *work)
 
 	aux = container_of(work, struct bpf_prog_aux, work);
 	bpf_free_used_maps(aux);
+	bpf_free_used_btfs(aux);
 	if (bpf_prog_is_dev_bound(aux))
 		bpf_prog_offload_destroy(aux->prog);
 #ifdef CONFIG_PERF_EVENTS
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 17270b8404f1..1f077d9adc67 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -9703,6 +9703,36 @@ static int do_check(struct bpf_verifier_env *env)
 	return 0;
 }
 
+static int find_btf_percpu_datasec(struct btf *btf)
+{
+	const struct btf_type *t;
+	const char *tname;
+	int i, n;
+
+	/*
+	 * Both vmlinux and module each have their own ".data..percpu"
+	 * DATASECs in BTF. So for module's case, we need to skip vmlinux BTF
+	 * types to look at only module's own BTF types.
+	 */
+	n = btf_nr_types(btf);
+	if (btf_is_module(btf))
+		i = btf_nr_types(btf_vmlinux);
+	else
+		i = 1;
+
+	for(; i < n; i++) {
+		t = btf_type_by_id(btf, i);
+		if (BTF_INFO_KIND(t->info) != BTF_KIND_DATASEC)
+			continue;
+
+		tname = btf_name_by_offset(btf, t->name_off);
+		if (!strcmp(tname, ".data..percpu"))
+			return i;
+	}
+
+	return -ENOENT;
+}
+
 /* replace pseudo btf_id with kernel symbol address */
 static int check_pseudo_btf_id(struct bpf_verifier_env *env,
 			       struct bpf_insn *insn,
@@ -9710,48 +9740,57 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env,
 {
 	const struct btf_var_secinfo *vsi;
 	const struct btf_type *datasec;
+	struct btf_mod_pair *btf_mod;
 	const struct btf_type *t;
 	const char *sym_name;
 	bool percpu = false;
 	u32 type, id = insn->imm;
+	struct btf *btf;
 	s32 datasec_id;
 	u64 addr;
-	int i;
-
-	if (!btf_vmlinux) {
-		verbose(env, "kernel is missing BTF, make sure CONFIG_DEBUG_INFO_BTF=y is specified in Kconfig.\n");
-		return -EINVAL;
-	}
+	int i, btf_fd, err;
 
-	if (insn[1].imm != 0) {
-		verbose(env, "reserved field (insn[1].imm) is used in pseudo_btf_id ldimm64 insn.\n");
-		return -EINVAL;
+	btf_fd = insn[1].imm;
+	if (btf_fd) {
+		btf = btf_get_by_fd(btf_fd);
+		if (IS_ERR(btf)) {
+			verbose(env, "invalid module BTF object FD specified.\n");
+			return -EINVAL;
+		}
+	} else {
+		if (!btf_vmlinux) {
+			verbose(env, "kernel is missing BTF, make sure CONFIG_DEBUG_INFO_BTF=y is specified in Kconfig.\n");
+			return -EINVAL;
+		}
+		btf = btf_vmlinux;
+		btf_get(btf);
 	}
 
-	t = btf_type_by_id(btf_vmlinux, id);
+	t = btf_type_by_id(btf, id);
 	if (!t) {
 		verbose(env, "ldimm64 insn specifies invalid btf_id %d.\n", id);
-		return -ENOENT;
+		err = -ENOENT;
+		goto err_put;
 	}
 
 	if (!btf_type_is_var(t)) {
-		verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR.\n",
-			id);
-		return -EINVAL;
+		verbose(env, "pseudo btf_id %d in ldimm64 isn't KIND_VAR.\n", id);
+		err = -EINVAL;
+		goto err_put;
 	}
 
-	sym_name = btf_name_by_offset(btf_vmlinux, t->name_off);
+	sym_name = btf_name_by_offset(btf, t->name_off);
 	addr = kallsyms_lookup_name(sym_name);
 	if (!addr) {
 		verbose(env, "ldimm64 failed to find the address for kernel symbol '%s'.\n",
 			sym_name);
-		return -ENOENT;
+		err = -ENOENT;
+		goto err_put;
 	}
 
-	datasec_id = btf_find_by_name_kind(btf_vmlinux, ".data..percpu",
-					   BTF_KIND_DATASEC);
+	datasec_id = find_btf_percpu_datasec(btf);
 	if (datasec_id > 0) {
-		datasec = btf_type_by_id(btf_vmlinux, datasec_id);
+		datasec = btf_type_by_id(btf, datasec_id);
 		for_each_vsi(i, datasec, vsi) {
 			if (vsi->type == id) {
 				percpu = true;
@@ -9764,10 +9803,10 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env,
 	insn[1].imm = addr >> 32;
 
 	type = t->type;
-	t = btf_type_skip_modifiers(btf_vmlinux, type, NULL);
+	t = btf_type_skip_modifiers(btf, type, NULL);
 	if (percpu) {
 		aux->btf_var.reg_type = PTR_TO_PERCPU_BTF_ID;
-		aux->btf_var.btf = btf_vmlinux;
+		aux->btf_var.btf = btf;
 		aux->btf_var.btf_id = type;
 	} else if (!btf_type_is_struct(t)) {
 		const struct btf_type *ret;
@@ -9775,21 +9814,54 @@ static int check_pseudo_btf_id(struct bpf_verifier_env *env,
 		u32 tsize;
 
 		/* resolve the type size of ksym. */
-		ret = btf_resolve_size(btf_vmlinux, t, &tsize);
+		ret = btf_resolve_size(btf, t, &tsize);
 		if (IS_ERR(ret)) {
-			tname = btf_name_by_offset(btf_vmlinux, t->name_off);
+			tname = btf_name_by_offset(btf, t->name_off);
 			verbose(env, "ldimm64 unable to resolve the size of type '%s': %ld\n",
 				tname, PTR_ERR(ret));
-			return -EINVAL;
+			err = -EINVAL;
+			goto err_put;
 		}
 		aux->btf_var.reg_type = PTR_TO_MEM;
 		aux->btf_var.mem_size = tsize;
 	} else {
 		aux->btf_var.reg_type = PTR_TO_BTF_ID;
-		aux->btf_var.btf = btf_vmlinux;
+		aux->btf_var.btf = btf;
 		aux->btf_var.btf_id = type;
 	}
+
+	/* check whether we recorded this BTF (and maybe module) already */
+	for (i = 0; i < env->used_btf_cnt; i++) {
+		if (env->used_btfs[i].btf == btf) {
+			btf_put(btf);
+			return 0;
+		}
+	}
+
+	if (env->used_btf_cnt >= MAX_USED_BTFS) {
+		err = -E2BIG;
+		goto err_put;
+	}
+
+	btf_mod = &env->used_btfs[env->used_btf_cnt];
+	btf_mod->btf = btf;
+	btf_mod->module = NULL;
+
+	/* if we reference variables from kernel module, bump its refcount */
+	if (btf_is_module(btf)) {
+		btf_mod->module = btf_try_get_module(btf);
+		if (!btf_mod->module) {
+			err = -ENXIO;
+			goto err_put;
+		}
+	}
+
+	env->used_btf_cnt++;
+
 	return 0;
+err_put:
+	btf_put(btf);
+	return err;
 }
 
 static int check_map_prealloc(struct bpf_map *map)
@@ -10086,6 +10158,13 @@ static void release_maps(struct bpf_verifier_env *env)
 			     env->used_map_cnt);
 }
 
+/* drop refcnt of BTFs used by the rejected program */
+static void release_btfs(struct bpf_verifier_env *env)
+{
+	__bpf_free_used_btfs(env->prog->aux, env->used_btfs,
+			     env->used_btf_cnt);
+}
+
 /* convert pseudo BPF_LD_IMM64 into generic BPF_LD_IMM64 */
 static void convert_pseudo_ld_imm64(struct bpf_verifier_env *env)
 {
@@ -12098,7 +12177,10 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 		goto err_release_maps;
 	}
 
-	if (ret == 0 && env->used_map_cnt) {
+	if (ret)
+		goto err_release_maps;
+
+	if (env->used_map_cnt) {
 		/* if program passed verifier, update used_maps in bpf_prog_info */
 		env->prog->aux->used_maps = kmalloc_array(env->used_map_cnt,
 							  sizeof(env->used_maps[0]),
@@ -12112,15 +12194,29 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 		memcpy(env->prog->aux->used_maps, env->used_maps,
 		       sizeof(env->used_maps[0]) * env->used_map_cnt);
 		env->prog->aux->used_map_cnt = env->used_map_cnt;
+	}
+	if (env->used_btf_cnt) {
+		/* if program passed verifier, update used_btfs in bpf_prog_aux */
+		env->prog->aux->used_btfs = kmalloc_array(env->used_btf_cnt,
+							  sizeof(env->used_btfs[0]),
+							  GFP_KERNEL);
+		if (!env->prog->aux->used_btfs) {
+			ret = -ENOMEM;
+			goto err_release_maps;
+		}
 
+		memcpy(env->prog->aux->used_btfs, env->used_btfs,
+		       sizeof(env->used_btfs[0]) * env->used_btf_cnt);
+		env->prog->aux->used_btf_cnt = env->used_btf_cnt;
+	}
+	if (env->used_map_cnt || env->used_btf_cnt) {
 		/* program is valid. Convert pseudo bpf_ld_imm64 into generic
 		 * bpf_ld_imm64 instructions
 		 */
 		convert_pseudo_ld_imm64(env);
 	}
 
-	if (ret == 0)
-		adjust_btf_func(env);
+	adjust_btf_func(env);
 
 err_release_maps:
 	if (!env->prog->aux->used_maps)
@@ -12128,6 +12224,8 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr,
 		 * them now. Otherwise free_used_maps() will release them.
 		 */
 		release_maps(env);
+	if (!env->prog->aux->used_btfs)
+		release_btfs(env);
 
 	/* extension progs temporarily inherit the attach_type of their targets
 	   for verification purposes, so set it back to zero before returning
-- 
2.24.1



* [PATCH v3 bpf-next 6/7] libbpf: support kernel module ksym externs
  2021-01-12  7:55 [PATCH v3 bpf-next 0/7] Support kernel module ksym variables Andrii Nakryiko
                   ` (4 preceding siblings ...)
  2021-01-12  7:55 ` [PATCH v3 bpf-next 5/7] bpf: support BPF ksym variables in kernel modules Andrii Nakryiko
@ 2021-01-12  7:55 ` Andrii Nakryiko
  2021-01-12  7:55 ` [PATCH v3 bpf-next 7/7] selftests/bpf: test " Andrii Nakryiko
  6 siblings, 0 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2021-01-12  7:55 UTC
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team, Hao Luo, Yonghong Song

Add support for searching for ksym externs not just in vmlinux BTF, but across
all module BTFs, similarly to how it's done for CO-RE relocations. Kernels
that expose module BTFs through sysfs are assumed to support new ldimm64
instruction extension with BTF FD provided in insn[1].imm field, so no extra
feature detection is performed.
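
The lookup order can be sketched with mock data. Everything in the block
below (mock_btf_obj, find_var(), resolve_ksym(), the variable names used in
any caller) is invented for illustration and is not libbpf API. It only
models the fallback described above: try vmlinux first with FD 0, then each
module BTF in turn, and remember which BTF object FD the match came from.

```c
#include <string.h>

/* Hypothetical BTF object: an FD plus the names of its VAR types. */
struct mock_btf_obj {
	int fd;			/* 0 for vmlinux, >0 for module BTF */
	const char **var_names;
	int var_cnt;
};

struct ksym_target {
	int btf_fd;	/* goes into insn[1].imm */
	int btf_id;	/* goes into insn[0].imm */
};

/* Return a 1-based ID of the named variable, or -1 if absent. */
static int find_var(const struct mock_btf_obj *btf, const char *name)
{
	int i;

	for (i = 0; i < btf->var_cnt; i++)
		if (strcmp(btf->var_names[i], name) == 0)
			return i + 1;
	return -1;
}

/* Resolve name against vmlinux first, then against each module BTF.
 * Returns 0 and fills *out on success, -1 if no BTF has the variable.
 */
int resolve_ksym(const struct mock_btf_obj *vmlinux,
		 const struct mock_btf_obj *mods, int mod_cnt,
		 const char *name, struct ksym_target *out)
{
	int j, id, fd = 0;

	id = find_var(vmlinux, name);
	for (j = 0; id < 0 && j < mod_cnt; j++) {
		id = find_var(&mods[j], name);
		fd = mods[j].fd;
	}
	if (id < 0)
		return -1;

	out->btf_fd = fd;
	out->btf_id = id;
	return 0;
}
```

A vmlinux match always yields btf_fd 0, which is what lets older kernels that
reject a non-zero insn[1].imm keep working for vmlinux-only ksyms.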

Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Hao Luo <haoluo@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 tools/lib/bpf/libbpf.c | 50 +++++++++++++++++++++++++++---------------
 1 file changed, 32 insertions(+), 18 deletions(-)

diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
index 6ae748f6ea11..2abbc3800568 100644
--- a/tools/lib/bpf/libbpf.c
+++ b/tools/lib/bpf/libbpf.c
@@ -395,7 +395,8 @@ struct extern_desc {
 			unsigned long long addr;
 
 			/* target btf_id of the corresponding kernel var. */
-			int vmlinux_btf_id;
+			int kernel_btf_obj_fd;
+			int kernel_btf_id;
 
 			/* local btf_id of the ksym extern's type. */
 			__u32 type_id;
@@ -6162,7 +6163,8 @@ bpf_object__relocate_data(struct bpf_object *obj, struct bpf_program *prog)
 			} else /* EXT_KSYM */ {
 				if (ext->ksym.type_id) { /* typed ksyms */
 					insn[0].src_reg = BPF_PSEUDO_BTF_ID;
-					insn[0].imm = ext->ksym.vmlinux_btf_id;
+					insn[0].imm = ext->ksym.kernel_btf_id;
+					insn[1].imm = ext->ksym.kernel_btf_obj_fd;
 				} else { /* typeless ksyms */
 					insn[0].imm = (__u32)ext->ksym.addr;
 					insn[1].imm = ext->ksym.addr >> 32;
@@ -7319,7 +7321,8 @@ static int bpf_object__read_kallsyms_file(struct bpf_object *obj)
 static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj)
 {
 	struct extern_desc *ext;
-	int i, id;
+	struct btf *btf;
+	int i, j, id, btf_fd, err;
 
 	for (i = 0; i < obj->nr_extern; i++) {
 		const struct btf_type *targ_var, *targ_type;
@@ -7331,10 +7334,25 @@ static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj)
 		if (ext->type != EXT_KSYM || !ext->ksym.type_id)
 			continue;
 
-		id = btf__find_by_name_kind(obj->btf_vmlinux, ext->name,
-					    BTF_KIND_VAR);
+		btf = obj->btf_vmlinux;
+		btf_fd = 0;
+		id = btf__find_by_name_kind(btf, ext->name, BTF_KIND_VAR);
+		if (id == -ENOENT) {
+			err = load_module_btfs(obj);
+			if (err)
+				return err;
+
+			for (j = 0; j < obj->btf_module_cnt; j++) {
+				btf = obj->btf_modules[j].btf;
+				/* we assume module BTF FD is always >0 */
+				btf_fd = obj->btf_modules[j].fd;
+				id = btf__find_by_name_kind(btf, ext->name, BTF_KIND_VAR);
+				if (id != -ENOENT)
+					break;
+			}
+		}
 		if (id <= 0) {
-			pr_warn("extern (ksym) '%s': failed to find BTF ID in vmlinux BTF.\n",
+			pr_warn("extern (ksym) '%s': failed to find BTF ID in kernel BTF(s).\n",
 				ext->name);
 			return -ESRCH;
 		}
@@ -7343,24 +7361,19 @@ static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj)
 		local_type_id = ext->ksym.type_id;
 
 		/* find target type_id */
-		targ_var = btf__type_by_id(obj->btf_vmlinux, id);
-		targ_var_name = btf__name_by_offset(obj->btf_vmlinux,
-						    targ_var->name_off);
-		targ_type = skip_mods_and_typedefs(obj->btf_vmlinux,
-						   targ_var->type,
-						   &targ_type_id);
+		targ_var = btf__type_by_id(btf, id);
+		targ_var_name = btf__name_by_offset(btf, targ_var->name_off);
+		targ_type = skip_mods_and_typedefs(btf, targ_var->type, &targ_type_id);
 
 		ret = bpf_core_types_are_compat(obj->btf, local_type_id,
-						obj->btf_vmlinux, targ_type_id);
+						btf, targ_type_id);
 		if (ret <= 0) {
 			const struct btf_type *local_type;
 			const char *targ_name, *local_name;
 
 			local_type = btf__type_by_id(obj->btf, local_type_id);
-			local_name = btf__name_by_offset(obj->btf,
-							 local_type->name_off);
-			targ_name = btf__name_by_offset(obj->btf_vmlinux,
-							targ_type->name_off);
+			local_name = btf__name_by_offset(obj->btf, local_type->name_off);
+			targ_name = btf__name_by_offset(btf, targ_type->name_off);
 
 			pr_warn("extern (ksym) '%s': incompatible types, expected [%d] %s %s, but kernel has [%d] %s %s\n",
 				ext->name, local_type_id,
@@ -7370,7 +7383,8 @@ static int bpf_object__resolve_ksyms_btf_id(struct bpf_object *obj)
 		}
 
 		ext->is_set = true;
-		ext->ksym.vmlinux_btf_id = id;
+		ext->ksym.kernel_btf_obj_fd = btf_fd;
+		ext->ksym.kernel_btf_id = id;
 		pr_debug("extern (ksym) '%s': resolved to [%d] %s %s\n",
 			 ext->name, id, btf_kind_str(targ_var), targ_var_name);
 	}
-- 
2.24.1
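
The lookup order in the patch above can be sketched in plain user-space C,
as a rough illustration only. Everything prefixed with toy_ is hypothetical
and stands in for libbpf internals; it is not the libbpf API. The sketch
assumes, as the patch comment does, that vmlinux BTF is represented by FD 0
and that module BTF FDs are always greater than zero.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for BTF objects; not the real libbpf types. */
struct toy_var { const char *name; int id; };
struct toy_btf {
	const char *owner;		/* "vmlinux" or a module name */
	int fd;				/* 0 for vmlinux, >0 for module BTF */
	const struct toy_var *vars;
	int nr_vars;
};
struct toy_ksym { int btf_obj_fd; int btf_id; };
struct toy_insn { int imm; };

/* Return the variable's ID, or -1 if this BTF does not define it. */
static int toy_find_var(const struct toy_btf *btf, const char *name)
{
	assert(btf && name);
	for (int i = 0; i < btf->nr_vars; i++)
		if (strcmp(btf->vars[i].name, name) == 0)
			return btf->vars[i].id;
	return -1;
}

/* Try vmlinux first; scan module BTFs only when vmlinux has no match. */
static int toy_resolve(const struct toy_btf *vmlinux,
		       const struct toy_btf *mods, int nr_mods,
		       const char *name, struct toy_ksym *out)
{
	int id = toy_find_var(vmlinux, name);
	int fd = 0;

	for (int j = 0; id < 0 && j < nr_mods; j++) {
		id = toy_find_var(&mods[j], name);
		fd = mods[j].fd;
	}
	if (id <= 0)
		return -1;

	out->btf_obj_fd = fd;
	out->btf_id = id;
	return 0;
}

/* Mirror the relocation step: ID in insn[0].imm, BTF FD in insn[1].imm. */
static void toy_patch_ldimm64(struct toy_insn insn[2],
			      const struct toy_ksym *ksym)
{
	insn[0].imm = ksym->btf_id;
	insn[1].imm = ksym->btf_obj_fd;
}
```

With a vmlinux table that lacks the symbol and one module table that has
it, toy_resolve() reports the module's FD, and toy_patch_ldimm64() places
that FD in the second instruction slot. A symbol found in vmlinux leaves
insn[1].imm at zero, which matches the pre-existing vmlinux-only encoding.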



* [PATCH v3 bpf-next 7/7] selftests/bpf: test kernel module ksym externs
  2021-01-12  7:55 [PATCH v3 bpf-next 0/7] Support kernel module ksym variables Andrii Nakryiko
                   ` (5 preceding siblings ...)
  2021-01-12  7:55 ` [PATCH v3 bpf-next 6/7] libbpf: support kernel module ksym externs Andrii Nakryiko
@ 2021-01-12  7:55 ` Andrii Nakryiko
  2021-01-13  1:29   ` Alexei Starovoitov
  6 siblings, 1 reply; 13+ messages in thread
From: Andrii Nakryiko @ 2021-01-12  7:55 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii, kernel-team, Hao Luo, Yonghong Song

Add a per-CPU variable to bpf_testmod.ko and use it from a new selftest to
validate that module ksym externs work end-to-end.

Acked-by: Yonghong Song <yhs@fb.com>
Acked-by: Hao Luo <haoluo@google.com>
Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
---
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   |  3 ++
 .../selftests/bpf/prog_tests/ksyms_module.c   | 31 +++++++++++++++++++
 .../selftests/bpf/progs/test_ksyms_module.c   | 26 ++++++++++++++++
 3 files changed, 60 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/ksyms_module.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_ksyms_module.c

diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index 2df19d73ca49..0b991e115d1f 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -3,6 +3,7 @@
 #include <linux/error-injection.h>
 #include <linux/init.h>
 #include <linux/module.h>
+#include <linux/percpu-defs.h>
 #include <linux/sysfs.h>
 #include <linux/tracepoint.h>
 #include "bpf_testmod.h"
@@ -10,6 +11,8 @@
 #define CREATE_TRACE_POINTS
 #include "bpf_testmod-events.h"
 
+DEFINE_PER_CPU(int, bpf_testmod_ksym_percpu) = 123;
+
 noinline ssize_t
 bpf_testmod_test_read(struct file *file, struct kobject *kobj,
 		      struct bin_attribute *bin_attr,
diff --git a/tools/testing/selftests/bpf/prog_tests/ksyms_module.c b/tools/testing/selftests/bpf/prog_tests/ksyms_module.c
new file mode 100644
index 000000000000..4c232b456479
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/ksyms_module.c
@@ -0,0 +1,31 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Facebook */
+
+#include <test_progs.h>
+#include <bpf/libbpf.h>
+#include <bpf/btf.h>
+#include "test_ksyms_module.skel.h"
+
+static int duration;
+
+void test_ksyms_module(void)
+{
+	struct test_ksyms_module *skel;
+	int err;
+
+	skel = test_ksyms_module__open_and_load();
+	if (CHECK(!skel, "skel_open", "failed to open skeleton\n"))
+		return;
+
+	err = test_ksyms_module__attach(skel);
+	if (CHECK(err, "skel_attach", "skeleton attach failed: %d\n", err))
+		goto cleanup;
+
+	usleep(1);
+
+	ASSERT_EQ(skel->bss->triggered, true, "triggered");
+	ASSERT_EQ(skel->bss->out_mod_ksym_global, 123, "global_ksym_val");
+
+cleanup:
+	test_ksyms_module__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_ksyms_module.c b/tools/testing/selftests/bpf/progs/test_ksyms_module.c
new file mode 100644
index 000000000000..d6a0b3086b90
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_ksyms_module.c
@@ -0,0 +1,26 @@
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (c) 2021 Facebook */
+
+#include "vmlinux.h"
+
+#include <bpf/bpf_helpers.h>
+
+extern const int bpf_testmod_ksym_percpu __ksym;
+
+int out_mod_ksym_global = 0;
+bool triggered = false;
+
+SEC("raw_tp/sys_enter")
+int handler(const void *ctx)
+{
+	int *val;
+	__u32 cpu;
+
+	val = (int *)bpf_this_cpu_ptr(&bpf_testmod_ksym_percpu);
+	out_mod_ksym_global = *val;
+	triggered = true;
+
+	return 0;
+}
+
+char LICENSE[] SEC("license") = "GPL";
-- 
2.24.1



* Re: [PATCH v3 bpf-next 5/7] bpf: support BPF ksym variables in kernel modules
  2021-01-12  7:55 ` [PATCH v3 bpf-next 5/7] bpf: support BPF ksym variables in kernel modules Andrii Nakryiko
@ 2021-01-12 16:27   ` Daniel Borkmann
  2021-01-12 20:38     ` Andrii Nakryiko
  2021-01-12 23:18     ` Alexei Starovoitov
  0 siblings, 2 replies; 13+ messages in thread
From: Daniel Borkmann @ 2021-01-12 16:27 UTC (permalink / raw)
  To: Andrii Nakryiko, bpf, netdev, ast; +Cc: kernel-team, Hao Luo, Yonghong Song

On 1/12/21 8:55 AM, Andrii Nakryiko wrote:
> Add support for directly accessing kernel module variables from BPF programs
> using special ldimm64 instructions. This functionality builds upon vmlinux
> ksym support, but extends ldimm64 with src_reg=BPF_PSEUDO_BTF_ID to allow
> specifying kernel module BTF's FD in insn[1].imm field.
> 
> During BPF program load time, verifier will resolve FD to BTF object and will
> take reference on BTF object itself and, for module BTFs, corresponding module
> as well, to make sure it won't be unloaded from under running BPF program. The
> mechanism used is similar to how bpf_prog keeps track of used bpf_maps.
> 
> One interesting change is also in how per-CPU variable is determined. The
> logic is to find .data..percpu data section in provided BTF, but both vmlinux
> and module each have their own .data..percpu entries in BTF. So for module's
> case, the search for DATASEC record needs to look at only module's added BTF
> types. This is implemented with custom search function.
> 
> Acked-by: Yonghong Song <yhs@fb.com>
> Acked-by: Hao Luo <haoluo@google.com>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
[...]
> +
> +struct module *btf_try_get_module(const struct btf *btf)
> +{
> +	struct module *res = NULL;
> +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES
> +	struct btf_module *btf_mod, *tmp;
> +
> +	mutex_lock(&btf_module_mutex);
> +	list_for_each_entry_safe(btf_mod, tmp, &btf_modules, list) {
> +		if (btf_mod->btf != btf)
> +			continue;
> +
> +		if (try_module_get(btf_mod->module))
> +			res = btf_mod->module;

One more thought (follow-up would be okay I'd think) ... when a module references
a symbol from another module, it similarly needs to bump the refcount of the module
that is owning it and thus disallowing to unload for that other module's lifetime.
That usage dependency is visible via /proc/modules however, so if unload doesn't work
then lsmod allows a way to introspect that to the user. This seems to be achieved via
resolve_symbol() where it records its dependency/usage. Would be great if we could at
some point also include the BPF prog name into that list so that this is more obvious.
Wdyt?

> +		break;
> +	}
> +	mutex_unlock(&btf_module_mutex);
> +#endif
> +
> +	return res;
> +}
> diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> index 261f8692d0d2..69c3c308de5e 100644
> --- a/kernel/bpf/core.c
> +++ b/kernel/bpf/core.c
> @@ -2119,6 +2119,28 @@ static void bpf_free_used_maps(struct bpf_prog_aux *aux)
>   	kfree(aux->used_maps);
>   }
>   
> +void __bpf_free_used_btfs(struct bpf_prog_aux *aux,
> +			  struct btf_mod_pair *used_btfs, u32 len)
> +{
> +#ifdef CONFIG_BPF_SYSCALL
> +	struct btf_mod_pair *btf_mod;
> +	u32 i;
> +
> +	for (i = 0; i < len; i++) {
> +		btf_mod = &used_btfs[i];
> +		if (btf_mod->module)
> +			module_put(btf_mod->module);
> +		btf_put(btf_mod->btf);
> +	}
> +#endif
> +}
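
The get/put pairing described in the commit message can be modeled in user
space, as a sketch only. All toy_ names are hypothetical and are not kernel
APIs: toy_try_get() loosely models try_module_get(), toy_free_used_btfs()
loosely models __bpf_free_used_btfs(), and toy_unload() stands in for a
module unload attempt that is refused while references are held.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of the objects being refcounted. */
struct toy_module { int refcnt; bool going; };
struct toy_btf { int refcnt; };
struct toy_btf_mod_pair { struct toy_btf *btf; struct toy_module *module; };

/* Fails once the module has started unloading. */
static bool toy_try_get(struct toy_module *m)
{
	if (m->going)
		return false;
	m->refcnt++;
	return true;
}

/* Unload only succeeds when nobody holds a reference. */
static bool toy_unload(struct toy_module *m)
{
	if (m->refcnt > 0)
		return false;
	m->going = true;
	return true;
}

/* Program load: pin the BTF object and, for module BTF, the module too. */
static bool toy_track_btf(struct toy_btf_mod_pair *slot,
			  struct toy_btf *btf, struct toy_module *mod)
{
	btf->refcnt++;
	if (mod && !toy_try_get(mod)) {
		btf->refcnt--;	/* undo the BTF reference on failure */
		return false;
	}
	slot->btf = btf;
	slot->module = mod;
	return true;
}

/* Program free: drop every reference taken at load time. */
static void toy_free_used_btfs(struct toy_btf_mod_pair *used, size_t len)
{
	for (size_t i = 0; i < len; i++) {
		if (used[i].module) {
			assert(used[i].module->refcnt > 0);
			used[i].module->refcnt--;
		}
		used[i].btf->refcnt--;
	}
}
```

In this model an unload attempt fails between toy_track_btf() and
toy_free_used_btfs() and succeeds afterwards, which is the property the
series relies on to keep a module from disappearing from under an attached
program.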


* Re: [PATCH v3 bpf-next 5/7] bpf: support BPF ksym variables in kernel modules
  2021-01-12 16:27   ` Daniel Borkmann
@ 2021-01-12 20:38     ` Andrii Nakryiko
  2021-01-12 23:18     ` Alexei Starovoitov
  1 sibling, 0 replies; 13+ messages in thread
From: Andrii Nakryiko @ 2021-01-12 20:38 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Andrii Nakryiko, bpf, Networking, Alexei Starovoitov,
	Kernel Team, Hao Luo, Yonghong Song

On Tue, Jan 12, 2021 at 8:27 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 1/12/21 8:55 AM, Andrii Nakryiko wrote:
> > Add support for directly accessing kernel module variables from BPF programs
> > using special ldimm64 instructions. This functionality builds upon vmlinux
> > ksym support, but extends ldimm64 with src_reg=BPF_PSEUDO_BTF_ID to allow
> > specifying kernel module BTF's FD in insn[1].imm field.
> >
> > During BPF program load time, verifier will resolve FD to BTF object and will
> > take reference on BTF object itself and, for module BTFs, corresponding module
> > as well, to make sure it won't be unloaded from under running BPF program. The
> > mechanism used is similar to how bpf_prog keeps track of used bpf_maps.
> >
> > One interesting change is also in how per-CPU variable is determined. The
> > logic is to find .data..percpu data section in provided BTF, but both vmlinux
> > and module each have their own .data..percpu entries in BTF. So for module's
> > case, the search for DATASEC record needs to look at only module's added BTF
> > types. This is implemented with custom search function.
> >
> > Acked-by: Yonghong Song <yhs@fb.com>
> > Acked-by: Hao Luo <haoluo@google.com>
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> [...]
> > +
> > +struct module *btf_try_get_module(const struct btf *btf)
> > +{
> > +     struct module *res = NULL;
> > +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES
> > +     struct btf_module *btf_mod, *tmp;
> > +
> > +     mutex_lock(&btf_module_mutex);
> > +     list_for_each_entry_safe(btf_mod, tmp, &btf_modules, list) {
> > +             if (btf_mod->btf != btf)
> > +                     continue;
> > +
> > +             if (try_module_get(btf_mod->module))
> > +                     res = btf_mod->module;
>
> One more thought (follow-up would be okay I'd think) ... when a module references
> a symbol from another module, it similarly needs to bump the refcount of the module
> that is owning it and thus disallowing to unload for that other module's lifetime.
> That usage dependency is visible via /proc/modules however, so if unload doesn't work
> then lsmod allows a way to introspect that to the user. This seems to be achieved via
> resolve_symbol() where it records its dependency/usage. Would be great if we could at
> some point also include the BPF prog name into that list so that this is more obvious.
> Wdyt?
>

Yeah, it's definitely nice to see dependent bpf progs. There is struct
module_use, which is used to record these dependencies, but the
assumption there is that dependencies could be only other modules. So
one way is to somehow extend that or add another set of bpf_prog
dependencies. The first is a bit intrusive, while the second sucks even
more, IMO.

Alternatively, we can rely on bpf_link info to emit module info, if
the BPF program is attached to BTF type from the module. Then with
bpftool it would be easy to see this, but it's not as
readily-available info as /proc/modules, of course.

Any preferences?

> > +             break;
> > +     }
> > +     mutex_unlock(&btf_module_mutex);
> > +#endif
> > +
> > +     return res;
> > +}
> > diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c
> > index 261f8692d0d2..69c3c308de5e 100644
> > --- a/kernel/bpf/core.c
> > +++ b/kernel/bpf/core.c
> > @@ -2119,6 +2119,28 @@ static void bpf_free_used_maps(struct bpf_prog_aux *aux)
> >       kfree(aux->used_maps);
> >   }
> >
> > +void __bpf_free_used_btfs(struct bpf_prog_aux *aux,
> > +                       struct btf_mod_pair *used_btfs, u32 len)
> > +{
> > +#ifdef CONFIG_BPF_SYSCALL
> > +     struct btf_mod_pair *btf_mod;
> > +     u32 i;
> > +
> > +     for (i = 0; i < len; i++) {
> > +             btf_mod = &used_btfs[i];
> > +             if (btf_mod->module)
> > +                     module_put(btf_mod->module);
> > +             btf_put(btf_mod->btf);
> > +     }
> > +#endif
> > +}


* Re: [PATCH v3 bpf-next 5/7] bpf: support BPF ksym variables in kernel modules
  2021-01-12 16:27   ` Daniel Borkmann
  2021-01-12 20:38     ` Andrii Nakryiko
@ 2021-01-12 23:18     ` Alexei Starovoitov
  2021-01-13 22:55       ` Daniel Borkmann
  1 sibling, 1 reply; 13+ messages in thread
From: Alexei Starovoitov @ 2021-01-12 23:18 UTC (permalink / raw)
  To: Daniel Borkmann
  Cc: Andrii Nakryiko, bpf, Network Development, Alexei Starovoitov,
	Kernel Team, Hao Luo, Yonghong Song

On Tue, Jan 12, 2021 at 8:30 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>
> On 1/12/21 8:55 AM, Andrii Nakryiko wrote:
> > Add support for directly accessing kernel module variables from BPF programs
> > using special ldimm64 instructions. This functionality builds upon vmlinux
> > ksym support, but extends ldimm64 with src_reg=BPF_PSEUDO_BTF_ID to allow
> > specifying kernel module BTF's FD in insn[1].imm field.
> >
> > During BPF program load time, verifier will resolve FD to BTF object and will
> > take reference on BTF object itself and, for module BTFs, corresponding module
> > as well, to make sure it won't be unloaded from under running BPF program. The
> > mechanism used is similar to how bpf_prog keeps track of used bpf_maps.
> >
> > One interesting change is also in how per-CPU variable is determined. The
> > logic is to find .data..percpu data section in provided BTF, but both vmlinux
> > and module each have their own .data..percpu entries in BTF. So for module's
> > case, the search for DATASEC record needs to look at only module's added BTF
> > types. This is implemented with custom search function.
> >
> > Acked-by: Yonghong Song <yhs@fb.com>
> > Acked-by: Hao Luo <haoluo@google.com>
> > Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
> [...]
> > +
> > +struct module *btf_try_get_module(const struct btf *btf)
> > +{
> > +     struct module *res = NULL;
> > +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES
> > +     struct btf_module *btf_mod, *tmp;
> > +
> > +     mutex_lock(&btf_module_mutex);
> > +     list_for_each_entry_safe(btf_mod, tmp, &btf_modules, list) {
> > +             if (btf_mod->btf != btf)
> > +                     continue;
> > +
> > +             if (try_module_get(btf_mod->module))
> > +                     res = btf_mod->module;
>
> One more thought (follow-up would be okay I'd think) ... when a module references
> a symbol from another module, it similarly needs to bump the refcount of the module
> that is owning it and thus disallowing to unload for that other module's lifetime.
> That usage dependency is visible via /proc/modules however, so if unload doesn't work
> then lsmod allows a way to introspect that to the user. This seems to be achieved via
> resolve_symbol() where it records its dependency/usage. Would be great if we could at
> some point also include the BPF prog name into that list so that this is more obvious.
> Wdyt?

I thought about it as well, but plenty of kernel things just grab the ref of ko
and don't add any way to introspect what piece of kernel is holding ko.
So this case won't be the first.
Also if we add it for bpf progs it could be confusing in lsmod.
Since it currently only shows other ko-s in there.
Long ago I had an awk script to parse that output to rmmod dependent modules
before rmmoding the main one. If somebody is doing something like this,
bpf prog names in the same place may break things.
So I think there are more cons than pros.
That is certainly a follow up if we agree on the direction.
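
To make the scripting concern concrete, here is a rough sketch of the kind
of parsing such a script does, written in C rather than awk. It assumes the
usual /proc/modules line layout of name, size, refcount, a comma-separated
"used by" list (or "-" when empty), state and address. The toy_ names and
the sample lines are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TOY_MAX_USERS 8

/* The fields of interest from one /proc/modules line. */
struct toy_mod_line {
	char name[64];
	int refcnt;
	char users[TOY_MAX_USERS][64];
	int nr_users;
};

/* Parse one line; returns 0 on success, -1 if the line is malformed. */
static int toy_parse_modules_line(const char *line, struct toy_mod_line *out)
{
	char deps[256];
	unsigned long size;
	char *tok;

	assert(line && out);
	out->nr_users = 0;
	if (sscanf(line, "%63s %lu %d %255s",
		   out->name, &size, &out->refcnt, deps) != 4)
		return -1;
	if (strcmp(deps, "-") == 0)
		return 0;	/* no named users, refcnt may still be >0 */

	/* strtok() skips the empty token after the trailing comma. */
	for (tok = strtok(deps, ",");
	     tok && out->nr_users < TOY_MAX_USERS;
	     tok = strtok(NULL, ",")) {
		snprintf(out->users[out->nr_users],
			 sizeof(out->users[0]), "%s", tok);
		out->nr_users++;
	}
	return 0;
}
```

A script built on this treats every entry in users[] as a module name it
can pass to rmmod, so a BPF program name in that list would be
misinterpreted. The flip side is also visible: a line with a non-zero
refcount and "-" in the user list gives no hint about who holds the
reference.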


* Re: [PATCH v3 bpf-next 7/7] selftests/bpf: test kernel module ksym externs
  2021-01-12  7:55 ` [PATCH v3 bpf-next 7/7] selftests/bpf: test " Andrii Nakryiko
@ 2021-01-13  1:29   ` Alexei Starovoitov
  0 siblings, 0 replies; 13+ messages in thread
From: Alexei Starovoitov @ 2021-01-13  1:29 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Network Development, Alexei Starovoitov, Daniel Borkmann,
	Kernel Team, Hao Luo, Yonghong Song

On Tue, Jan 12, 2021 at 3:41 AM Andrii Nakryiko <andrii@kernel.org> wrote:
>
> Add a per-CPU variable to bpf_testmod.ko and use it from a new selftest to
> validate that module ksym externs work end-to-end.
>
> Acked-by: Yonghong Song <yhs@fb.com>
> Acked-by: Hao Luo <haoluo@google.com>
> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>

Applied.

FYI for everyone. This test needs the latest pahole.


* Re: [PATCH v3 bpf-next 5/7] bpf: support BPF ksym variables in kernel modules
  2021-01-12 23:18     ` Alexei Starovoitov
@ 2021-01-13 22:55       ` Daniel Borkmann
  0 siblings, 0 replies; 13+ messages in thread
From: Daniel Borkmann @ 2021-01-13 22:55 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Andrii Nakryiko, bpf, Network Development, Alexei Starovoitov,
	Kernel Team, Hao Luo, Yonghong Song

On 1/13/21 12:18 AM, Alexei Starovoitov wrote:
> On Tue, Jan 12, 2021 at 8:30 AM Daniel Borkmann <daniel@iogearbox.net> wrote:
>> On 1/12/21 8:55 AM, Andrii Nakryiko wrote:
>>> Add support for directly accessing kernel module variables from BPF programs
>>> using special ldimm64 instructions. This functionality builds upon vmlinux
>>> ksym support, but extends ldimm64 with src_reg=BPF_PSEUDO_BTF_ID to allow
>>> specifying kernel module BTF's FD in insn[1].imm field.
>>>
>>> During BPF program load time, verifier will resolve FD to BTF object and will
>>> take reference on BTF object itself and, for module BTFs, corresponding module
>>> as well, to make sure it won't be unloaded from under running BPF program. The
>>> mechanism used is similar to how bpf_prog keeps track of used bpf_maps.
>>>
>>> One interesting change is also in how per-CPU variable is determined. The
>>> logic is to find .data..percpu data section in provided BTF, but both vmlinux
>>> and module each have their own .data..percpu entries in BTF. So for module's
>>> case, the search for DATASEC record needs to look at only module's added BTF
>>> types. This is implemented with custom search function.
>>>
>>> Acked-by: Yonghong Song <yhs@fb.com>
>>> Acked-by: Hao Luo <haoluo@google.com>
>>> Signed-off-by: Andrii Nakryiko <andrii@kernel.org>
>> [...]
>>> +
>>> +struct module *btf_try_get_module(const struct btf *btf)
>>> +{
>>> +     struct module *res = NULL;
>>> +#ifdef CONFIG_DEBUG_INFO_BTF_MODULES
>>> +     struct btf_module *btf_mod, *tmp;
>>> +
>>> +     mutex_lock(&btf_module_mutex);
>>> +     list_for_each_entry_safe(btf_mod, tmp, &btf_modules, list) {
>>> +             if (btf_mod->btf != btf)
>>> +                     continue;
>>> +
>>> +             if (try_module_get(btf_mod->module))
>>> +                     res = btf_mod->module;
>>
>> One more thought (follow-up would be okay I'd think) ... when a module references
>> a symbol from another module, it similarly needs to bump the refcount of the module
>> that is owning it and thus disallowing to unload for that other module's lifetime.
>> That usage dependency is visible via /proc/modules however, so if unload doesn't work
>> then lsmod allows a way to introspect that to the user. This seems to be achieved via
>> resolve_symbol() where it records its dependency/usage. Would be great if we could at
>> some point also include the BPF prog name into that list so that this is more obvious.
>> Wdyt?
> 
> I thought about it as well, but plenty of kernel things just grab the ref of ko
> and don't add any way to introspect what piece of kernel is holding ko.
> So this case won't be the first.
> Also if we add it for bpf progs it could be confusing in lsmod.
> Since it currently only shows other ko-s in there.
> Long ago I had an awk script to parse that output to rmmod dependent modules
> before rmmoding the main one. If somebody is doing something like this,
> bpf prog names in the same place may break things.
> So I think there are more cons than pros.

Hm, true that scripting could break in this case if we were to add bpf prog names in
there. :/ I don't have a better suggestion atm.. we could potentially add something
for the bpf prog info dump via bpftool, but it's a non-obvious location for people who
are used to checking deps via lsmod. Also true that we bump ref from plenty of other
locations where it's not directly shown either apart from just the refcnt (e.g. socket
using tcp congctl module etc).


Thread overview: 13+ messages
2021-01-12  7:55 [PATCH v3 bpf-next 0/7] Support kernel module ksym variables Andrii Nakryiko
2021-01-12  7:55 ` [PATCH v3 bpf-next 1/7] bpf: add bpf_patch_call_args prototype to include/linux/bpf.h Andrii Nakryiko
2021-01-12  7:55 ` [PATCH v3 bpf-next 2/7] bpf: avoid warning when re-casting __bpf_call_base into __bpf_call_base_args Andrii Nakryiko
2021-01-12  7:55 ` [PATCH v3 bpf-next 3/7] bpf: declare __bpf_free_used_maps() unconditionally Andrii Nakryiko
2021-01-12  7:55 ` [PATCH v3 bpf-next 4/7] selftests/bpf: sync RCU before unloading bpf_testmod Andrii Nakryiko
2021-01-12  7:55 ` [PATCH v3 bpf-next 5/7] bpf: support BPF ksym variables in kernel modules Andrii Nakryiko
2021-01-12 16:27   ` Daniel Borkmann
2021-01-12 20:38     ` Andrii Nakryiko
2021-01-12 23:18     ` Alexei Starovoitov
2021-01-13 22:55       ` Daniel Borkmann
2021-01-12  7:55 ` [PATCH v3 bpf-next 6/7] libbpf: support kernel module ksym externs Andrii Nakryiko
2021-01-12  7:55 ` [PATCH v3 bpf-next 7/7] selftests/bpf: test " Andrii Nakryiko
2021-01-13  1:29   ` Alexei Starovoitov
