bpf.vger.kernel.org archive mirror
* [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers
@ 2021-10-30 14:46 Kumar Kartikeya Dwivedi
  2021-10-30 14:46 ` [PATCH RFC bpf-next v1 1/6] bpf: Refactor bpf_check_mod_kfunc_call Kumar Kartikeya Dwivedi
                   ` (6 more replies)
  0 siblings, 7 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2021-10-30 14:46 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, netdev, netfilter-devel

This series adds unstable conntrack lookup helpers using BPF kfunc support. The
patch adding the lookup helper is based on Maxim's recent patch [0], adjusted to
work with kfunc support, to aid in rebasing their series on top of this one.

This is an RFC series, as I'm unsure whether the reference tracking for
PTR_TO_BTF_ID will be accepted. If not, we can go back to doing it the typical
way, with a PTR_TO_NF_CONN type guarded by #if IS_ENABLED(CONFIG_NF_CONNTRACK).

Also, I want to understand whether it would make sense to introduce
check_helper_call-style, bpf_func_proto-based argument checking for kfuncs, or
to continue with the current approach, since it doesn't seem correct that
PTR_TO_MEM can be passed where PTR_TO_BTF_ID may be expected; only PTR_TO_CTX
is enforced.

[0]: https://lore.kernel.org/bpf/20211019144655.3483197-8-maximmi@nvidia.com
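
As a rough illustration of the intended usage, here is a minimal sketch of a
TC-BPF program calling one of the lookup kfuncs (signatures as added in patch
5; the SEC name, tuple contents and availability of vmlinux.h-style type
definitions are assumptions for illustration):

  extern struct nf_conn *
  bpf_skb_ct_lookup_tcp(struct __sk_buff *skb_ctx, struct bpf_sock_tuple *bpf_tuple,
			__u32 tuple_len, __u64 netns_id, __u64 *flags_err) __ksym;
  extern void bpf_ct_release(struct nf_conn *ct) __ksym;

  SEC("tc")
  int ct_lookup_example(struct __sk_buff *skb)
  {
	struct bpf_sock_tuple tuple = {};	/* fill in saddr/daddr/ports */
	__u64 flags_err = 0;			/* IP_CT_DIR_ORIGINAL */
	struct nf_conn *ct;

	ct = bpf_skb_ct_lookup_tcp(skb, &tuple, sizeof(tuple.ipv4),
				   -1 /* negative: caller's netns */, &flags_err);
	if (!ct)
		return TC_ACT_OK;	/* flags_err holds the -errno */
	/* ... inspect ct (e.g. ct->status) via BTF-based access ... */
	bpf_ct_release(ct);
	return TC_ACT_OK;
  }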

Kumar Kartikeya Dwivedi (6):
  bpf: Refactor bpf_check_mod_kfunc_call
  bpf: Remove DEFINE_KFUNC_BTF_ID_SET
  bpf: Extend kfunc with PTR_TO_CTX and PTR_TO_MEM arguments
  bpf: Add reference tracking support to kfunc returned PTR_TO_BTF_ID
  net: netfilter: Add unstable CT lookup helper for XDP and TC-BPF
  selftests/bpf: Add referenced PTR_TO_BTF_ID selftest

 include/linux/bpf.h                           |  29 +-
 include/linux/btf.h                           |  54 +++-
 kernel/bpf/btf.c                              | 188 ++++++++++---
 kernel/bpf/verifier.c                         | 101 ++++++-
 net/bpf/test_run.c                            |  55 ++++
 net/core/filter.c                             |  56 ++++
 net/core/net_namespace.c                      |   1 +
 net/ipv4/tcp_bbr.c                            |   5 +-
 net/ipv4/tcp_cubic.c                          |   5 +-
 net/ipv4/tcp_dctcp.c                          |   5 +-
 net/netfilter/nf_conntrack_core.c             | 255 ++++++++++++++++++
 tools/testing/selftests/bpf/Makefile          |   2 +-
 tools/testing/selftests/bpf/bpf_tcp_helpers.h |   5 +
 .../selftests/bpf/bpf_testmod/bpf_testmod.c   |   5 +-
 .../selftests/bpf/prog_tests/kfunc_call.c     |  18 +-
 .../selftests/bpf/progs/kfunc_call_test.c     |  21 ++
 16 files changed, 741 insertions(+), 64 deletions(-)

-- 
2.33.1


* [PATCH RFC bpf-next v1 1/6] bpf: Refactor bpf_check_mod_kfunc_call
  2021-10-30 14:46 [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers Kumar Kartikeya Dwivedi
@ 2021-10-30 14:46 ` Kumar Kartikeya Dwivedi
  2021-10-30 14:46 ` [PATCH RFC bpf-next v1 2/6] bpf: Remove DEFINE_KFUNC_BTF_ID_SET Kumar Kartikeya Dwivedi
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2021-10-30 14:46 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, netdev, netfilter-devel

Future commits will add more callbacks that follow the same pattern: match
the module owner of a kfunc_btf_id_set, then operate on data in the struct.

Since all call sites in the verifier hold a reference to the struct module
parameter 'owner', it is safe to release the mutex lock and still reference
the struct pointer. This can be consolidated into a common helper, given that
the reference to the owner module is always held.

Since removal from the list only happens once the module reference drops to
zero, it is safe to assume the set stays registered as long as the caller
holds a reference.
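
As an illustration, a future callback built on top of this helper could follow
the sketch below (bpf_is_mod_foo_kfunc and the is_foo_kfunc member are made-up
names; the real callbacks are added in later patches):

/* The caller's reference to 'owner' keeps the set from being unregistered
 * while we use the pointer returned by the common helper.
 */
bool bpf_is_mod_foo_kfunc(struct kfunc_btf_id_list *klist, u32 kfunc_id,
			  struct module *owner)
{
	struct kfunc_btf_id_set *s;

	s = __get_kfunc_btf_id_set(klist, owner);
	return s && s->is_foo_kfunc && s->is_foo_kfunc(kfunc_id);
}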

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/btf.c | 29 +++++++++++++++++++++--------
 1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index dbc3ad07e21b..be1082270455 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6371,22 +6371,35 @@ void unregister_kfunc_btf_id_set(struct kfunc_btf_id_list *l,
 }
 EXPORT_SYMBOL_GPL(unregister_kfunc_btf_id_set);
 
-bool bpf_check_mod_kfunc_call(struct kfunc_btf_id_list *klist, u32 kfunc_id,
-			      struct module *owner)
+/* Caller must hold reference to module 'owner' */
+struct kfunc_btf_id_set *__get_kfunc_btf_id_set(struct kfunc_btf_id_list *klist,
+						struct module *owner)
 {
-	struct kfunc_btf_id_set *s;
+	struct kfunc_btf_id_set *s, *ret = NULL;
 
 	if (!owner)
-		return false;
+		return NULL;
 	mutex_lock(&klist->mutex);
 	list_for_each_entry(s, &klist->list, list) {
-		if (s->owner == owner && btf_id_set_contains(s->set, kfunc_id)) {
-			mutex_unlock(&klist->mutex);
-			return true;
+		if (s->owner == owner) {
+			ret = s;
+			break;
 		}
 	}
 	mutex_unlock(&klist->mutex);
-	return false;
+	return ret;
+}
+
+bool bpf_check_mod_kfunc_call(struct kfunc_btf_id_list *klist, u32 kfunc_id,
+			      struct module *owner)
+{
+	struct kfunc_btf_id_set *s;
+	bool ret = false;
+
+	s = __get_kfunc_btf_id_set(klist, owner);
+	if (s)
+		ret = btf_id_set_contains(s->set, kfunc_id);
+	return ret;
 }
 
 #endif
-- 
2.33.1


* [PATCH RFC bpf-next v1 2/6] bpf: Remove DEFINE_KFUNC_BTF_ID_SET
  2021-10-30 14:46 [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers Kumar Kartikeya Dwivedi
  2021-10-30 14:46 ` [PATCH RFC bpf-next v1 1/6] bpf: Refactor bpf_check_mod_kfunc_call Kumar Kartikeya Dwivedi
@ 2021-10-30 14:46 ` Kumar Kartikeya Dwivedi
  2021-10-30 14:46 ` [PATCH RFC bpf-next v1 3/6] bpf: Extend kfunc with PTR_TO_CTX and PTR_TO_MEM arguments Kumar Kartikeya Dwivedi
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2021-10-30 14:46 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, netdev, netfilter-devel

The only reason to keep it was to initialize the list head, but future
commits will introduce more members that need to be set, which is more
convenient to do with a designated initializer.

Hence, remove the macro, convert its users, and initialize the list head
inside register_kfunc_btf_id_set.
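
With the macro gone, a module-defined set boils down to a designated
initializer plus register/unregister calls in the module init/exit paths,
roughly as below (my_kfunc_ids, my_test_kfunc and the function names are
placeholders for illustration):

BTF_SET_START(my_kfunc_ids)
BTF_ID(func, my_test_kfunc)
BTF_SET_END(my_kfunc_ids)

static struct kfunc_btf_id_set my_kfunc_btf_set = {
	.owner = THIS_MODULE,
	.set   = &my_kfunc_ids,
	/* future members can simply be added here by name */
};

static int __init my_module_init(void)
{
	/* the list head is now initialized inside the register call */
	register_kfunc_btf_id_set(&prog_test_kfunc_list, &my_kfunc_btf_set);
	return 0;
}

static void __exit my_module_exit(void)
{
	unregister_kfunc_btf_id_set(&prog_test_kfunc_list, &my_kfunc_btf_set);
}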

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/btf.h                                   | 4 ----
 kernel/bpf/btf.c                                      | 1 +
 net/ipv4/tcp_bbr.c                                    | 5 ++++-
 net/ipv4/tcp_cubic.c                                  | 5 ++++-
 net/ipv4/tcp_dctcp.c                                  | 5 ++++-
 tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c | 5 ++++-
 6 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/include/linux/btf.h b/include/linux/btf.h
index 203eef993d76..1da108e35042 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -270,10 +270,6 @@ static inline bool bpf_check_mod_kfunc_call(struct kfunc_btf_id_list *klist,
 }
 #endif
 
-#define DEFINE_KFUNC_BTF_ID_SET(set, name)                                     \
-	struct kfunc_btf_id_set name = { LIST_HEAD_INIT(name.list), (set),     \
-					 THIS_MODULE }
-
 extern struct kfunc_btf_id_list bpf_tcp_ca_kfunc_list;
 extern struct kfunc_btf_id_list prog_test_kfunc_list;
 
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index be1082270455..1773f91fff10 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6356,6 +6356,7 @@ struct kfunc_btf_id_list {
 void register_kfunc_btf_id_set(struct kfunc_btf_id_list *l,
 			       struct kfunc_btf_id_set *s)
 {
+	INIT_LIST_HEAD(&s->list);
 	mutex_lock(&l->mutex);
 	list_add(&s->list, &l->list);
 	mutex_unlock(&l->mutex);
diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index ec5550089b4d..280dada5d1ae 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -1169,7 +1169,10 @@ BTF_ID(func, bbr_set_state)
 #endif
 BTF_SET_END(tcp_bbr_kfunc_ids)
 
-static DEFINE_KFUNC_BTF_ID_SET(&tcp_bbr_kfunc_ids, tcp_bbr_kfunc_btf_set);
+static struct kfunc_btf_id_set tcp_bbr_kfunc_btf_set = {
+	.owner = THIS_MODULE,
+	.set   = &tcp_bbr_kfunc_ids,
+};
 
 static int __init bbr_register(void)
 {
diff --git a/net/ipv4/tcp_cubic.c b/net/ipv4/tcp_cubic.c
index 5e9d9c51164c..70384a8040c5 100644
--- a/net/ipv4/tcp_cubic.c
+++ b/net/ipv4/tcp_cubic.c
@@ -497,7 +497,10 @@ BTF_ID(func, cubictcp_acked)
 #endif
 BTF_SET_END(tcp_cubic_kfunc_ids)
 
-static DEFINE_KFUNC_BTF_ID_SET(&tcp_cubic_kfunc_ids, tcp_cubic_kfunc_btf_set);
+static struct kfunc_btf_id_set tcp_cubic_kfunc_btf_set = {
+	.owner = THIS_MODULE,
+	.set   = &tcp_cubic_kfunc_ids,
+};
 
 static int __init cubictcp_register(void)
 {
diff --git a/net/ipv4/tcp_dctcp.c b/net/ipv4/tcp_dctcp.c
index 0d7ab3cc7b61..ac2a47eb89d8 100644
--- a/net/ipv4/tcp_dctcp.c
+++ b/net/ipv4/tcp_dctcp.c
@@ -251,7 +251,10 @@ BTF_ID(func, dctcp_state)
 #endif
 BTF_SET_END(tcp_dctcp_kfunc_ids)
 
-static DEFINE_KFUNC_BTF_ID_SET(&tcp_dctcp_kfunc_ids, tcp_dctcp_kfunc_btf_set);
+static struct kfunc_btf_id_set tcp_dctcp_kfunc_btf_set = {
+	.owner = THIS_MODULE,
+	.set   = &tcp_dctcp_kfunc_ids,
+};
 
 static int __init dctcp_register(void)
 {
diff --git a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
index 5d52ea2768df..a437086e1860 100644
--- a/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
+++ b/tools/testing/selftests/bpf/bpf_testmod/bpf_testmod.c
@@ -93,7 +93,10 @@ BTF_SET_START(bpf_testmod_kfunc_ids)
 BTF_ID(func, bpf_testmod_test_mod_kfunc)
 BTF_SET_END(bpf_testmod_kfunc_ids)
 
-static DEFINE_KFUNC_BTF_ID_SET(&bpf_testmod_kfunc_ids, bpf_testmod_kfunc_btf_set);
+static struct kfunc_btf_id_set bpf_testmod_kfunc_btf_set = {
+	.owner = THIS_MODULE,
+	.set   = &bpf_testmod_kfunc_ids,
+};
 
 static int bpf_testmod_init(void)
 {
-- 
2.33.1


* [PATCH RFC bpf-next v1 3/6] bpf: Extend kfunc with PTR_TO_CTX and PTR_TO_MEM arguments
  2021-10-30 14:46 [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers Kumar Kartikeya Dwivedi
  2021-10-30 14:46 ` [PATCH RFC bpf-next v1 1/6] bpf: Refactor bpf_check_mod_kfunc_call Kumar Kartikeya Dwivedi
  2021-10-30 14:46 ` [PATCH RFC bpf-next v1 2/6] bpf: Remove DEFINE_KFUNC_BTF_ID_SET Kumar Kartikeya Dwivedi
@ 2021-10-30 14:46 ` Kumar Kartikeya Dwivedi
  2021-10-30 14:46 ` [PATCH RFC bpf-next v1 4/6] bpf: Add reference tracking support to kfunc returned PTR_TO_BTF_ID Kumar Kartikeya Dwivedi
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2021-10-30 14:46 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, netdev, netfilter-devel

Allow passing PTR_TO_CTX if the kfunc expects a matching struct type, and
punt to the ptr_to_mem block if reg->type does not fall into one of the BTF ID
types. This will be used by future commits to get access to the XDP and TC
PTR_TO_CTX.

Also add a btf_mod module parameter, so that the kfunc_id's owner module can
be used in the future commit adding reference tracking for PTR_TO_BTF_ID.
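
Concretely, for a kernel function declared like the CT lookup kfunc added
later in the series, the mapping is as sketched in the comments below (the
declaration is taken from patch 5; the comments reflect the intent of this
patch):

/* xdp_ctx   -> the program must pass PTR_TO_CTX (struct xdp_md in BTF)
 * bpf_tuple -> not a BTF ID register, so it falls through to the ptr_to_mem
 *              check against sizeof(struct bpf_sock_tuple)
 */
struct nf_conn *bpf_xdp_ct_lookup_tcp(struct xdp_md *xdp_ctx,
				      struct bpf_sock_tuple *bpf_tuple,
				      u32 tuple_len, u64 netns_id,
				      u64 *flags_err);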

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf.h   |  3 ++-
 kernel/bpf/btf.c      | 61 +++++++++++++++++++++----------------------
 kernel/bpf/verifier.c |  4 +--
 3 files changed, 34 insertions(+), 34 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 6deebf8bf78f..f8be80f748fc 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1665,7 +1665,8 @@ int btf_check_subprog_arg_match(struct bpf_verifier_env *env, int subprog,
 				struct bpf_reg_state *regs);
 int btf_check_kfunc_arg_match(struct bpf_verifier_env *env,
 			      const struct btf *btf, u32 func_id,
-			      struct bpf_reg_state *regs);
+			      struct bpf_reg_state *regs,
+			      struct module *btf_mod);
 int btf_prepare_func_args(struct bpf_verifier_env *env, int subprog,
 			  struct bpf_reg_state *reg);
 int btf_check_type_match(struct bpf_verifier_log *log, const struct bpf_prog *prog,
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 1773f91fff10..9099ef64b077 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5566,7 +5566,8 @@ static u32 *reg2btf_ids[__BPF_REG_TYPE_MAX] = {
 static int btf_check_func_arg_match(struct bpf_verifier_env *env,
 				    const struct btf *btf, u32 func_id,
 				    struct bpf_reg_state *regs,
-				    bool ptr_to_mem_ok)
+				    bool ptr_to_mem_ok,
+				    struct module *btf_mod)
 {
 	struct bpf_verifier_log *log = &env->log;
 	const char *func_name, *ref_tname;
@@ -5602,8 +5603,10 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
 	 * verifier sees.
 	 */
 	for (i = 0; i < nargs; i++) {
+		const struct btf_type *resolve_ret;
 		u32 regno = i + 1;
 		struct bpf_reg_state *reg = &regs[regno];
+		u32 type_size;
 
 		t = btf_type_skip_modifiers(btf, args[i].type, NULL);
 		if (btf_type_is_scalar(t)) {
@@ -5621,19 +5624,25 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
 
 		ref_t = btf_type_skip_modifiers(btf, t->type, &ref_id);
 		ref_tname = btf_name_by_offset(btf, ref_t->name_off);
-		if (btf_is_kernel(btf)) {
+		if (btf_get_prog_ctx_type(log, btf, t,
+					  env->prog->type, i)) {
+			/* If function expects ctx type in BTF check that caller
+			 * is passing PTR_TO_CTX.
+			 */
+			if (reg->type != PTR_TO_CTX) {
+				bpf_log(log,
+					"arg#%d expected pointer to ctx, but got %s\n",
+					i, btf_type_str(t));
+				return -EINVAL;
+			}
+			if (check_ctx_reg(env, reg, regno))
+				return -EINVAL;
+		} else if (btf_is_kernel(btf)) {
 			const struct btf_type *reg_ref_t;
 			const struct btf *reg_btf;
 			const char *reg_ref_tname;
 			u32 reg_ref_id;
 
-			if (!btf_type_is_struct(ref_t)) {
-				bpf_log(log, "kernel function %s args#%d pointer type %s %s is not supported\n",
-					func_name, i, btf_type_str(ref_t),
-					ref_tname);
-				return -EINVAL;
-			}
-
 			if (reg->type == PTR_TO_BTF_ID) {
 				reg_btf = reg->btf;
 				reg_ref_id = reg->btf_id;
@@ -5641,9 +5650,13 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
 				reg_btf = btf_vmlinux;
 				reg_ref_id = *reg2btf_ids[reg->type];
 			} else {
-				bpf_log(log, "kernel function %s args#%d expected pointer to %s %s but R%d is not a pointer to btf_id\n",
-					func_name, i,
-					btf_type_str(ref_t), ref_tname, regno);
+				goto ptr_to_mem;
+			}
+
+			if (!btf_type_is_struct(ref_t)) {
+				bpf_log(log, "kernel function %s args#%d pointer type %s %s is not supported\n",
+					func_name, i, btf_type_str(ref_t),
+					ref_tname);
 				return -EINVAL;
 			}
 
@@ -5660,23 +5673,8 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
 					reg_ref_tname);
 				return -EINVAL;
 			}
-		} else if (btf_get_prog_ctx_type(log, btf, t,
-						 env->prog->type, i)) {
-			/* If function expects ctx type in BTF check that caller
-			 * is passing PTR_TO_CTX.
-			 */
-			if (reg->type != PTR_TO_CTX) {
-				bpf_log(log,
-					"arg#%d expected pointer to ctx, but got %s\n",
-					i, btf_type_str(t));
-				return -EINVAL;
-			}
-			if (check_ctx_reg(env, reg, regno))
-				return -EINVAL;
 		} else if (ptr_to_mem_ok) {
-			const struct btf_type *resolve_ret;
-			u32 type_size;
-
+ptr_to_mem:
 			resolve_ret = btf_resolve_size(btf, ref_t, &type_size);
 			if (IS_ERR(resolve_ret)) {
 				bpf_log(log,
@@ -5723,7 +5721,7 @@ int btf_check_subprog_arg_match(struct bpf_verifier_env *env, int subprog,
 		return -EINVAL;
 
 	is_global = prog->aux->func_info_aux[subprog].linkage == BTF_FUNC_GLOBAL;
-	err = btf_check_func_arg_match(env, btf, btf_id, regs, is_global);
+	err = btf_check_func_arg_match(env, btf, btf_id, regs, is_global, NULL);
 
 	/* Compiler optimizations can remove arguments from static functions
 	 * or mismatched type can be passed into a global function.
@@ -5736,9 +5734,10 @@ int btf_check_subprog_arg_match(struct bpf_verifier_env *env, int subprog,
 
 int btf_check_kfunc_arg_match(struct bpf_verifier_env *env,
 			      const struct btf *btf, u32 func_id,
-			      struct bpf_reg_state *regs)
+			      struct bpf_reg_state *regs,
+			      struct module *btf_mod)
 {
-	return btf_check_func_arg_match(env, btf, func_id, regs, false);
+	return btf_check_func_arg_match(env, btf, func_id, regs, false, btf_mod);
 }
 
 /* Convert BTF of a function into bpf_reg_state if possible
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index 3c8aa7df1773..ca4627f81b75 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -6713,8 +6713,8 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn)
 	}
 
 	/* Check the arguments */
-	err = btf_check_kfunc_arg_match(env, desc_btf, func_id, regs);
-	if (err)
+	err = btf_check_kfunc_arg_match(env, desc_btf, func_id, regs, btf_mod);
+	if (err < 0)
 		return err;
 
 	for (i = 0; i < CALLER_SAVED_REGS; i++)
-- 
2.33.1


* [PATCH RFC bpf-next v1 4/6] bpf: Add reference tracking support to kfunc returned PTR_TO_BTF_ID
  2021-10-30 14:46 [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers Kumar Kartikeya Dwivedi
                   ` (2 preceding siblings ...)
  2021-10-30 14:46 ` [PATCH RFC bpf-next v1 3/6] bpf: Extend kfunc with PTR_TO_CTX and PTR_TO_MEM arguments Kumar Kartikeya Dwivedi
@ 2021-10-30 14:46 ` Kumar Kartikeya Dwivedi
  2021-10-30 14:46 ` [PATCH RFC bpf-next v1 5/6] net: netfilter: Add unstable CT lookup helper for XDP and TC-BPF Kumar Kartikeya Dwivedi
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2021-10-30 14:46 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, netdev, netfilter-devel

This patch adds verifier support for treating a PTR_TO_BTF_ID returned by a
kfunc as a reference, by reusing the acquire_reference/release_reference
support for existing in-kernel BPF helpers.

Callbacks are added to the module's kfunc_btf_id_set so that it can report to
the verifier whether a given kfunc_id, module pair is an acquire/release
kfunc, and the state of the returned PTR_TO_BTF_ID is tracked accordingly.

For now, we also add get_kfunc_return_type, since we don't need the entire
bpf_func_proto. The enforcement is only done when a valid return type is
reported; __BPF_RET_TYPE_MAX is used by default to skip the check when the
kfunc doesn't need it.

The following restriction currently exists for acquire functions:
only PTR_TO_BTF_ID or PTR_TO_BTF_ID_OR_NULL can be the return value.

The following restriction currently exists for release functions:
only one argument may be a referenced PTR_TO_BTF_ID.

The verifier returns an error if these restrictions are broken. Later tests
ensure that a reference obtained from an acquire kfunc must be released.
Right now, PTR_TO_BTF_ID is not passed to any existing built-in release
helpers. Once this changes, an additional check on reg->btf_id is needed to
ensure the right BTF ID is being released. If a PTR_TO_BTF_ID of the same BTF
ID may be passed to different release functions in the future, additional
state needs to be tracked for the register so that it can be paired with the
correct release function.

For now, we can rely on the btf_struct_ids_match check to ensure we get the
pointer to the expected struct type.
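
For example, with the prog_test kfuncs added later in the series, the verifier
enforces the acquire/release pairing roughly as in the sketch below (the
surrounding program body is illustrative only):

	pt = bpf_kfunc_call_test_acquire(p);	/* RET_PTR_TO_BTF_ID_OR_NULL */
	if (pt) {
		/* pt is a referenced PTR_TO_BTF_ID here; returning without
		 * calling the release kfunc is rejected as a reference leak,
		 * and passing pt to a non-release kfunc is also an error.
		 */
		bpf_kfunc_call_test_release(pt);
	}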

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf.h   |  4 ++
 include/linux/btf.h   | 49 ++++++++++++++++++++++
 kernel/bpf/btf.c      | 95 +++++++++++++++++++++++++++++++++++++++++-
 kernel/bpf/verifier.c | 97 ++++++++++++++++++++++++++++++++++++++++---
 4 files changed, 238 insertions(+), 7 deletions(-)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index f8be80f748fc..6fb34f0a2758 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -359,6 +359,7 @@ enum bpf_return_type {
 	RET_PTR_TO_MEM_OR_BTF_ID_OR_NULL, /* returns a pointer to a valid memory or a btf_id or NULL */
 	RET_PTR_TO_MEM_OR_BTF_ID,	/* returns a pointer to a valid memory or a btf_id */
 	RET_PTR_TO_BTF_ID,		/* returns a pointer to a btf_id */
+	__BPF_RET_TYPE_MAX,
 };
 
 /* eBPF function prototype used by verifier to allow BPF_CALLs from eBPF programs
@@ -515,6 +516,9 @@ struct bpf_verifier_ops {
 				 enum bpf_access_type atype,
 				 u32 *next_btf_id);
 	bool (*check_kfunc_call)(u32 kfunc_btf_id, struct module *owner);
+	bool (*is_acquire_kfunc)(u32 kfunc_btf_id, struct module *owner);
+	bool (*is_release_kfunc)(u32 kfunc_btf_id, struct module *owner);
+	enum bpf_return_type (*get_kfunc_return_type)(u32 kfunc_btf_id, struct module *owner);
 };
 
 struct bpf_prog_offload_ops {
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 1da108e35042..464f22bf7d5f 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -4,6 +4,7 @@
 #ifndef _LINUX_BTF_H
 #define _LINUX_BTF_H 1
 
+#include <linux/bpf.h>
 #include <linux/types.h>
 #include <linux/bpfptr.h>
 #include <uapi/linux/btf.h>
@@ -243,6 +244,15 @@ struct kfunc_btf_id_set {
 	struct list_head list;
 	struct btf_id_set *set;
 	struct module *owner;
+
+	bool (*is_acquire_kfunc)(u32 kfunc_id);
+	bool (*is_release_kfunc)(u32 kfunc_id);
+	enum bpf_return_type (*get_kfunc_return_type)(u32 kfunc_id);
+	int  (*btf_struct_access)(struct bpf_verifier_log *log,
+				  const struct btf *btf,
+				  const struct btf_type *t, int off,
+				  int size, enum bpf_access_type atype,
+				  u32 *next_btf_id);
 };
 
 struct kfunc_btf_id_list;
@@ -254,6 +264,19 @@ void unregister_kfunc_btf_id_set(struct kfunc_btf_id_list *l,
 				 struct kfunc_btf_id_set *s);
 bool bpf_check_mod_kfunc_call(struct kfunc_btf_id_list *klist, u32 kfunc_id,
 			      struct module *owner);
+bool bpf_is_mod_acquire_kfunc(struct kfunc_btf_id_list *klist, u32 kfunc_id,
+			      struct module *owner);
+bool bpf_is_mod_release_kfunc(struct kfunc_btf_id_list *klist, u32 kfunc_id,
+			      struct module *owner);
+enum bpf_return_type bpf_get_mod_kfunc_return_type(struct kfunc_btf_id_list *klist,
+						   u32 kfunc_btf_id, struct module *owner);
+int bpf_btf_mod_struct_access(struct kfunc_btf_id_list *klist,
+			      struct module *owner,
+			      struct bpf_verifier_log *log,
+			      const struct btf *btf,
+			      const struct btf_type *t, int off,
+			      int size, enum bpf_access_type atype,
+			      u32 *next_btf_id);
 #else
 static inline void register_kfunc_btf_id_set(struct kfunc_btf_id_list *l,
 					     struct kfunc_btf_id_set *s)
@@ -268,6 +291,32 @@ static inline bool bpf_check_mod_kfunc_call(struct kfunc_btf_id_list *klist,
 {
 	return false;
 }
+static inline bool bpf_is_mod_acquire_kfunc(struct kfunc_btf_id_list *klist,
+					    u32 kfunc_id, struct module *owner)
+{
+	return false;
+}
+static inline bool bpf_is_mod_release_kfunc(struct kfunc_btf_id_list *klist,
+					    u32 kfunc_id, struct module *owner)
+{
+	return false;
+}
+static inline enum bpf_return_type
+bpf_get_mod_kfunc_return_type(struct kfunc_btf_id_list *klist, u32 kfunc_btf_id,
+			      struct module *owner)
+{
+	return __BPF_RET_TYPE_MAX;
+}
+static inline int bpf_btf_mod_struct_access(struct kfunc_btf_id_list *klist,
+					    struct module *owner,
+					    struct bpf_verifier_log *log,
+					    const struct btf *btf,
+					    const struct btf_type *t, int off,
+					    int size, enum bpf_access_type atype,
+					    u32 *next_btf_id)
+{
+	return __BPF_REG_TYPE_MAX;
+}
 #endif
 
 extern struct kfunc_btf_id_list bpf_tcp_ca_kfunc_list;
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 9099ef64b077..8b3c15f4359d 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -5570,10 +5570,12 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
 				    struct module *btf_mod)
 {
 	struct bpf_verifier_log *log = &env->log;
+	u32 i, nargs, ref_id, ref_obj_id = 0;
 	const char *func_name, *ref_tname;
 	const struct btf_type *t, *ref_t;
 	const struct btf_param *args;
-	u32 i, nargs, ref_id;
+	int ref_regno = 0;
+	bool rel = false;
 
 	t = btf_type_by_id(btf, func_id);
 	if (!t || !btf_type_is_func(t)) {
@@ -5646,6 +5648,17 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
 			if (reg->type == PTR_TO_BTF_ID) {
 				reg_btf = reg->btf;
 				reg_ref_id = reg->btf_id;
+				/* Ensure only one argument is referenced PTR_TO_BTF_ID */
+				if (reg->ref_obj_id) {
+					if (ref_obj_id) {
+						bpf_log(log, "verifier internal error: more than one arg with ref_obj_id R%d %u %u\n",
+							regno, reg->ref_obj_id, ref_obj_id);
+						return -EFAULT;
+					}
+					/* Allow passing referenced PTR_TO_BTF_ID to acquire function */
+					ref_regno = regno;
+					ref_obj_id = reg->ref_obj_id;
+				}
 			} else if (reg2btf_ids[reg->type]) {
 				reg_btf = btf_vmlinux;
 				reg_ref_id = *reg2btf_ids[reg->type];
@@ -5691,7 +5704,24 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
 		}
 	}
 
-	return 0;
+	if (btf_is_kernel(btf)) {
+		rel = env->ops->is_release_kfunc &&
+			env->ops->is_release_kfunc(func_id, btf_mod);
+
+		/* We make sure ref_obj_id is only set if only one reg->type == PTR_TO_BTF_ID */
+		if (rel && !ref_obj_id) {
+			bpf_log(log, "release kernel function %s expects refcounted PTR_TO_BTF_ID\n",
+				func_name);
+			return -EINVAL;
+		}
+		if (!rel && ref_obj_id) {
+			bpf_log(log, "refcounted PTR_TO_BTF_ID passed to non-release kernel function %s\n",
+				func_name);
+			return -EINVAL;
+		}
+	}
+	/* returns argument register number > 0 in case of reference release kfunc */
+	return ref_regno;
 }
 
 /* Compare BTF of a function with given bpf_reg_state.
@@ -6402,6 +6432,67 @@ bool bpf_check_mod_kfunc_call(struct kfunc_btf_id_list *klist, u32 kfunc_id,
 	return ret;
 }
 
+bool bpf_is_mod_acquire_kfunc(struct kfunc_btf_id_list *klist, u32 kfunc_id,
+			      struct module *owner)
+{
+	struct kfunc_btf_id_set *s;
+	bool ret = false;
+
+	s = __get_kfunc_btf_id_set(klist, owner);
+	if (s) {
+		ret = s->is_acquire_kfunc &&
+		      s->is_acquire_kfunc(kfunc_id);
+	}
+	return ret;
+}
+
+bool bpf_is_mod_release_kfunc(struct kfunc_btf_id_list *klist, u32 kfunc_id,
+			      struct module *owner)
+{
+	struct kfunc_btf_id_set *s;
+	bool ret = false;
+
+	s = __get_kfunc_btf_id_set(klist, owner);
+	if (s) {
+		ret = s->is_release_kfunc &&
+		      s->is_release_kfunc(kfunc_id);
+	}
+	return ret;
+}
+
+enum bpf_return_type bpf_get_mod_kfunc_return_type(struct kfunc_btf_id_list *klist,
+						   u32 kfunc_id, struct module *owner)
+{
+	enum bpf_return_type ret_type = __BPF_RET_TYPE_MAX;
+	struct kfunc_btf_id_set *s;
+
+	s = __get_kfunc_btf_id_set(klist, owner);
+	if (s) {
+		if (s->get_kfunc_return_type)
+			ret_type = s->get_kfunc_return_type(kfunc_id);
+	}
+	return ret_type;
+}
+
+int bpf_btf_mod_struct_access(struct kfunc_btf_id_list *klist,
+			      struct module *owner,
+			      struct bpf_verifier_log *log,
+			      const struct btf *btf,
+			      const struct btf_type *t, int off,
+			      int size, enum bpf_access_type atype,
+			      u32 *next_btf_id)
+{
+	struct kfunc_btf_id_set *s;
+
+	s = __get_kfunc_btf_id_set(klist, owner);
+	if (s) {
+		if (s->btf_struct_access)
+			return s->btf_struct_access(log, btf, t, off, size, atype,
+						    next_btf_id);
+	}
+	return __BPF_REG_TYPE_MAX;
+}
+
 #endif
 
 #define DEFINE_KFUNC_BTF_ID_LIST(name)                                         \
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index ca4627f81b75..4ac091a946c2 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -5789,6 +5789,28 @@ static int check_func_proto(const struct bpf_func_proto *fn, int func_id)
 	       check_refcount_ok(fn, func_id) ? 0 : -EINVAL;
 }
 
+static int check_kfunc_proto(struct bpf_verifier_env *env,
+			     enum bpf_return_type ret_type, u32 kfunc_id,
+			     bool acquire)
+{
+	switch (ret_type) {
+	case RET_VOID:
+	case RET_INTEGER:
+	case RET_PTR_TO_BTF_ID:
+	case RET_PTR_TO_BTF_ID_OR_NULL:
+	case __BPF_RET_TYPE_MAX:
+		break;
+	default:
+		verbose(env, "kfunc %u unsupported return type %d\n", kfunc_id,
+			ret_type);
+		return -EINVAL;
+	}
+	/* refcount_ok check is done later in bpf_check_kfunc_arg_match,
+	 * as only PTR_TO_BTF_ID can be refcounted
+	 */
+	return 0;
+}
+
 /* Packet data might have moved, any old PTR_TO_PACKET[_META,_END]
  * are now invalid, so turn them into unknown SCALAR_VALUE.
  */
@@ -6681,16 +6703,19 @@ static void mark_btf_func_reg_size(struct bpf_verifier_env *env, u32 regno,
 	}
 }
 
-static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn)
+static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn,
+			    int *insn_idx_p)
 {
 	const struct btf_type *t, *func, *func_proto, *ptr_type;
+	enum bpf_return_type ret_type = __BPF_RET_TYPE_MAX;
 	struct bpf_reg_state *regs = cur_regs(env);
 	const char *func_name, *ptr_type_name;
 	u32 i, nargs, func_id, ptr_type_id;
+	int err, insn_idx = *insn_idx_p;
 	struct module *btf_mod = NULL;
 	const struct btf_param *args;
+	bool acq, ret = false;
 	struct btf *desc_btf;
-	int err;
 
 	/* skip for now, but return error when we find this in fixup_kfunc_call */
 	if (!insn->imm)
@@ -6712,17 +6737,51 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn)
 		return -EACCES;
 	}
 
+	acq = env->ops->is_acquire_kfunc &&
+		env->ops->is_acquire_kfunc(func_id, btf_mod);
+
+	if (env->ops->get_kfunc_return_type) {
+		ret_type = env->ops->get_kfunc_return_type(func_id, btf_mod);
+		err = check_kfunc_proto(env, ret_type, func_id, acq);
+		if (err)
+			return err;
+		/* Skip check if not specified */
+		ret = ret_type != __BPF_RET_TYPE_MAX;
+	}
+
 	/* Check the arguments */
 	err = btf_check_kfunc_arg_match(env, desc_btf, func_id, regs, btf_mod);
 	if (err < 0)
 		return err;
+	/* In case of release function, we get register number of refcounted
+	 * PTR_TO_BTF_ID back from kfunc_arg_match, do the release now
+	 */
+	if (err > 0) {
+		err = release_reference(env, regs[err].ref_obj_id);
+		if (err) {
+			verbose(env, "kfunc %s#%d reference has not been acquired before\n",
+				func_name, func_id);
+			return err;
+		}
+	}
 
 	for (i = 0; i < CALLER_SAVED_REGS; i++)
 		mark_reg_not_init(env, regs, caller_saved[i]);
 
 	/* Check return type */
 	t = btf_type_skip_modifiers(desc_btf, func_proto->type, NULL);
+	/* allow helper to return refcounted PTR_TO_BTF_ID */
+	if (acq && !btf_type_is_ptr(t)) {
+		verbose(env, "acquire kernel function returns non-PTR_TO_BTF_ID\n");
+		return -EINVAL;
+	}
+
 	if (btf_type_is_scalar(t)) {
+		if (ret && ret_type != RET_INTEGER) {
+			verbose(env, "kfunc %s#%d func_proto ret type mismatch\n",
+				func_name, func_id);
+			return -EINVAL;
+		}
 		mark_reg_unknown(env, regs, BPF_REG_0);
 		mark_btf_func_reg_size(env, BPF_REG_0, t->size);
 	} else if (btf_type_is_ptr(t)) {
@@ -6736,12 +6795,40 @@ static int check_kfunc_call(struct bpf_verifier_env *env, struct bpf_insn *insn)
 				ptr_type_name);
 			return -EINVAL;
 		}
+		if (ret && ret_type != RET_PTR_TO_BTF_ID_OR_NULL &&
+		    ret_type != RET_PTR_TO_BTF_ID) {
+			verbose(env, "kfunc %s#%d func_proto ret type mismatch\n",
+				func_name, func_id);
+			return -EINVAL;
+		}
+
+		if (ret && ret_type == RET_PTR_TO_BTF_ID_OR_NULL) {
+			regs[BPF_REG_0].type = PTR_TO_BTF_ID_OR_NULL;
+			/* When more return types support _OR_NULL, move this
+			 * under a generic reg_type_may_be_null check.
+			 */
+			regs[BPF_REG_0].id = ++env->id_gen;
+		} else {
+			regs[BPF_REG_0].type = PTR_TO_BTF_ID;
+		}
 		mark_reg_known_zero(env, regs, BPF_REG_0);
 		regs[BPF_REG_0].btf = desc_btf;
-		regs[BPF_REG_0].type = PTR_TO_BTF_ID;
 		regs[BPF_REG_0].btf_id = ptr_type_id;
 		mark_btf_func_reg_size(env, BPF_REG_0, sizeof(void *));
-	} /* else { add_kfunc_call() ensures it is btf_type_is_void(t) } */
+		if (acq) {
+			int id = acquire_reference_state(env, insn_idx);
+
+			if (id < 0)
+				return id;
+			regs[BPF_REG_0].id = id;
+			regs[BPF_REG_0].ref_obj_id = id;
+		}
+	} else if (ret && ret_type != RET_VOID) {
+		/* add_kfunc_call() ensures it is btf_type_is_void(t) */
+		verbose(env, "kfunc %s#%d func_proto ret type mismatch\n",
+			func_name, func_id);
+		return -EINVAL;
+	}
 
 	nargs = btf_type_vlen(func_proto);
 	args = (const struct btf_param *)(func_proto + 1);
@@ -11308,7 +11395,7 @@ static int do_check(struct bpf_verifier_env *env)
 				if (insn->src_reg == BPF_PSEUDO_CALL)
 					err = check_func_call(env, insn, &env->insn_idx);
 				else if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL)
-					err = check_kfunc_call(env, insn);
+					err = check_kfunc_call(env, insn, &env->insn_idx);
 				else
 					err = check_helper_call(env, insn, &env->insn_idx);
 				if (err)
-- 
2.33.1


* [PATCH RFC bpf-next v1 5/6] net: netfilter: Add unstable CT lookup helper for XDP and TC-BPF
  2021-10-30 14:46 [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers Kumar Kartikeya Dwivedi
                   ` (3 preceding siblings ...)
  2021-10-30 14:46 ` [PATCH RFC bpf-next v1 4/6] bpf: Add reference tracking support to kfunc returned PTR_TO_BTF_ID Kumar Kartikeya Dwivedi
@ 2021-10-30 14:46 ` Kumar Kartikeya Dwivedi
  2021-10-31 19:10   ` Florian Westphal
  2021-10-30 14:46 ` [PATCH RFC bpf-next v1 6/6] selftests/bpf: Add referenced PTR_TO_BTF_ID selftest Kumar Kartikeya Dwivedi
  2021-11-02 23:16 ` [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers Alexei Starovoitov
  6 siblings, 1 reply; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2021-10-30 14:46 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, netdev, netfilter-devel

This change adds conntrack lookup helpers using the unstable kfunc call
interface for the XDP and TC-BPF hooks.

For the TC hook, also add acquire/release test functions (randomly returning
NULL) that exercise the RET_PTR_TO_BTF_ID_OR_NULL path, so that the BPF
program caller has to check for NULL before dereferencing the pointer. These
will be used in the selftest.

Export get_net_ns_by_id and btf_type_by_id as nf_conntrack needs to call
them.

[ NOTE: Currently the btf_type check does not work, due to the problem
described in the thread mentioned in the comments ]
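
For reference, tuple_len selects the address family and flags_err carries the
conntrack direction in and a -errno out. A sketch of the XDP caller side,
assuming iph/tcph point at already-parsed headers:

	struct bpf_sock_tuple tuple = {
		.ipv4.saddr = iph->saddr,
		.ipv4.daddr = iph->daddr,
		.ipv4.sport = tcph->source,
		.ipv4.dport = tcph->dest,
	};
	__u64 flags_err = 0;	/* IP_CT_DIR_ORIGINAL */
	struct nf_conn *ct;

	/* sizeof(tuple.ipv4) -> AF_INET, sizeof(tuple.ipv6) -> AF_INET6;
	 * a negative netns_id means the caller's netns in this RFC
	 */
	ct = bpf_xdp_ct_lookup_tcp(ctx, &tuple, sizeof(tuple.ipv4), -1, &flags_err);
	if (!ct)
		return XDP_PASS;	/* flags_err now holds the -errno */
	/* ... */
	bpf_ct_release(ct);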

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/linux/bpf.h               |  22 +++
 include/linux/btf.h               |   1 +
 kernel/bpf/btf.c                  |   2 +
 net/bpf/test_run.c                |  55 +++++++
 net/core/filter.c                 |  56 +++++++
 net/core/net_namespace.c          |   1 +
 net/netfilter/nf_conntrack_core.c | 255 ++++++++++++++++++++++++++++++
 7 files changed, 392 insertions(+)

diff --git a/include/linux/bpf.h b/include/linux/bpf.h
index 6fb34f0a2758..b4f82aaf68bd 100644
--- a/include/linux/bpf.h
+++ b/include/linux/bpf.h
@@ -1647,6 +1647,10 @@ int bpf_prog_test_run_sk_lookup(struct bpf_prog *prog,
 				const union bpf_attr *kattr,
 				union bpf_attr __user *uattr);
 bool bpf_prog_test_check_kfunc_call(u32 kfunc_id, struct module *owner);
+bool bpf_prog_test_is_acquire_kfunc(u32 kfunc_id, struct module *owner);
+bool bpf_prog_test_is_release_kfunc(u32 kfunc_id, struct module *owner);
+enum bpf_return_type bpf_prog_test_get_kfunc_return_type(u32 kfunc_id,
+							 struct module *owner);
 bool btf_ctx_access(int off, int size, enum bpf_access_type type,
 		    const struct bpf_prog *prog,
 		    struct bpf_insn_access_aux *info);
@@ -1874,6 +1878,24 @@ static inline bool bpf_prog_test_check_kfunc_call(u32 kfunc_id,
 	return false;
 }
 
+static inline bool bpf_prog_test_is_acquire_kfunc(u32 kfunc_id,
+						  struct module *owner)
+{
+	return false;
+}
+
+static inline bool bpf_prog_test_is_release_kfunc(u32 kfunc_id,
+						  struct module *owner)
+{
+	return false;
+}
+
+static inline enum bpf_return_type
+bpf_prog_test_get_kfunc_return_type(u32 kfunc_id, struct module *owner)
+{
+	return __BPF_RET_TYPE_MAX;
+}
+
 static inline void bpf_map_put(struct bpf_map *map)
 {
 }
diff --git a/include/linux/btf.h b/include/linux/btf.h
index 464f22bf7d5f..87019548e37c 100644
--- a/include/linux/btf.h
+++ b/include/linux/btf.h
@@ -321,5 +321,6 @@ static inline int bpf_btf_mod_struct_access(struct kfunc_btf_id_list *klist,
 
 extern struct kfunc_btf_id_list bpf_tcp_ca_kfunc_list;
 extern struct kfunc_btf_id_list prog_test_kfunc_list;
+extern struct kfunc_btf_id_list xdp_kfunc_list;
 
 #endif
diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index 8b3c15f4359d..d0e5101b8c2d 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -735,6 +735,7 @@ const struct btf_type *btf_type_by_id(const struct btf *btf, u32 type_id)
 		return NULL;
 	return btf->types[type_id];
 }
+EXPORT_SYMBOL_GPL(btf_type_by_id);
 
 /*
  * Regular int is not a bit field and it must be either
@@ -6502,3 +6503,4 @@ int bpf_btf_mod_struct_access(struct kfunc_btf_id_list *klist,
 
 DEFINE_KFUNC_BTF_ID_LIST(bpf_tcp_ca_kfunc_list);
 DEFINE_KFUNC_BTF_ID_LIST(prog_test_kfunc_list);
+DEFINE_KFUNC_BTF_ID_LIST(xdp_kfunc_list);
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 46dd95755967..a678ddc97e0f 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -232,6 +232,28 @@ struct sock * noinline bpf_kfunc_call_test3(struct sock *sk)
 	return sk;
 }
 
+struct prog_test_ref_kfunc {
+	int a;
+	int b;
+};
+
+static struct prog_test_ref_kfunc prog_test_struct;
+
+noinline struct prog_test_ref_kfunc *bpf_kfunc_call_test_acquire(char *ptr)
+{
+	/* randomly return NULL */
+	if (get_jiffies_64() % 2)
+		return NULL;
+	prog_test_struct.a = 42;
+	prog_test_struct.b = 108;
+	return &prog_test_struct;
+}
+
+noinline void bpf_kfunc_call_test_release(struct prog_test_ref_kfunc *p)
+{
+	return;
+}
+
 __diag_pop();
 
 ALLOW_ERROR_INJECTION(bpf_modify_return_test, ERRNO);
@@ -240,8 +262,14 @@ BTF_SET_START(test_sk_kfunc_ids)
 BTF_ID(func, bpf_kfunc_call_test1)
 BTF_ID(func, bpf_kfunc_call_test2)
 BTF_ID(func, bpf_kfunc_call_test3)
+BTF_ID(func, bpf_kfunc_call_test_acquire)
+BTF_ID(func, bpf_kfunc_call_test_release)
 BTF_SET_END(test_sk_kfunc_ids)
 
+BTF_ID_LIST(test_sk_acq_rel)
+BTF_ID(func, bpf_kfunc_call_test_acquire)
+BTF_ID(func, bpf_kfunc_call_test_release)
+
 bool bpf_prog_test_check_kfunc_call(u32 kfunc_id, struct module *owner)
 {
 	if (btf_id_set_contains(&test_sk_kfunc_ids, kfunc_id))
@@ -249,6 +277,33 @@ bool bpf_prog_test_check_kfunc_call(u32 kfunc_id, struct module *owner)
 	return bpf_check_mod_kfunc_call(&prog_test_kfunc_list, kfunc_id, owner);
 }
 
+bool bpf_prog_test_is_acquire_kfunc(u32 kfunc_id, struct module *owner)
+{
+	if (!owner) /* bpf_kfunc_call_test_acquire */
+		return kfunc_id == test_sk_acq_rel[0];
+	return bpf_is_mod_acquire_kfunc(&prog_test_kfunc_list, kfunc_id, owner);
+}
+
+bool bpf_prog_test_is_release_kfunc(u32 kfunc_id, struct module *owner)
+{
+	if (!owner) /* bpf_kfunc_call_test_release */
+		return kfunc_id == test_sk_acq_rel[1];
+	return bpf_is_mod_release_kfunc(&prog_test_kfunc_list, kfunc_id, owner);
+}
+
+enum bpf_return_type bpf_prog_test_get_kfunc_return_type(u32 kfunc_id,
+							 struct module *owner)
+{
+	if (!owner) {
+		if (kfunc_id == test_sk_acq_rel[0])
+			return RET_PTR_TO_BTF_ID_OR_NULL;
+		else
+			return __BPF_RET_TYPE_MAX;
+	}
+	return bpf_get_mod_kfunc_return_type(&prog_test_kfunc_list, kfunc_id,
+					     owner);
+}
+
 static void *bpf_test_init(const union bpf_attr *kattr, u32 size,
 			   u32 headroom, u32 tailroom)
 {
diff --git a/net/core/filter.c b/net/core/filter.c
index 8e8d3b49c297..4e320de4472d 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -9948,6 +9948,12 @@ const struct bpf_prog_ops sk_filter_prog_ops = {
 	.test_run		= bpf_prog_test_run_skb,
 };
 
+static int xdp_btf_struct_access(struct bpf_verifier_log *log,
+				 const struct btf *btf,
+				 const struct btf_type *t, int off,
+				 int size, enum bpf_access_type atype,
+				 u32 *next_btf_id);
+
 const struct bpf_verifier_ops tc_cls_act_verifier_ops = {
 	.get_func_proto		= tc_cls_act_func_proto,
 	.is_valid_access	= tc_cls_act_is_valid_access,
@@ -9955,17 +9961,67 @@ const struct bpf_verifier_ops tc_cls_act_verifier_ops = {
 	.gen_prologue		= tc_cls_act_prologue,
 	.gen_ld_abs		= bpf_gen_ld_abs,
 	.check_kfunc_call	= bpf_prog_test_check_kfunc_call,
+	.is_acquire_kfunc	= bpf_prog_test_is_acquire_kfunc,
+	.is_release_kfunc	= bpf_prog_test_is_release_kfunc,
+	.get_kfunc_return_type  = bpf_prog_test_get_kfunc_return_type,
+	/* reuse the callback, there is nothing XDP-specific in it */
+	.btf_struct_access      = xdp_btf_struct_access,
 };
 
 const struct bpf_prog_ops tc_cls_act_prog_ops = {
 	.test_run		= bpf_prog_test_run_skb,
 };
 
+static bool xdp_is_acquire_kfunc(u32 kfunc_id, struct module *owner)
+{
+	return bpf_is_mod_acquire_kfunc(&xdp_kfunc_list, kfunc_id, owner);
+}
+
+static bool xdp_is_release_kfunc(u32 kfunc_id, struct module *owner)
+{
+	return bpf_is_mod_release_kfunc(&xdp_kfunc_list, kfunc_id, owner);
+}
+
+static enum bpf_return_type xdp_get_kfunc_return_type(u32 kfunc_id,
+						      struct module *owner)
+{
+	return bpf_get_mod_kfunc_return_type(&xdp_kfunc_list, kfunc_id, owner);
+}
+
+static int xdp_btf_struct_access(struct bpf_verifier_log *log,
+				 const struct btf *btf,
+				 const struct btf_type *t, int off,
+				 int size, enum bpf_access_type atype,
+				 u32 *next_btf_id)
+{
+	int ret = __BPF_REG_TYPE_MAX;
+	struct module *mod;
+
+	if (atype != BPF_READ)
+		return -EACCES;
+
+	if (btf_is_module(btf)) {
+		mod = btf_try_get_module(btf);
+		if (!mod)
+			return -ENXIO;
+		ret = bpf_btf_mod_struct_access(&xdp_kfunc_list, mod, log, btf, t, off, size,
+						atype, next_btf_id);
+		module_put(mod);
+	}
+	if (ret == __BPF_REG_TYPE_MAX)
+		return btf_struct_access(log, btf, t, off, size, atype, next_btf_id);
+	return ret;
+}
+
 const struct bpf_verifier_ops xdp_verifier_ops = {
 	.get_func_proto		= xdp_func_proto,
 	.is_valid_access	= xdp_is_valid_access,
 	.convert_ctx_access	= xdp_convert_ctx_access,
 	.gen_prologue		= bpf_noop_prologue,
+	.is_acquire_kfunc	= xdp_is_acquire_kfunc,
+	.is_release_kfunc	= xdp_is_release_kfunc,
+	.get_kfunc_return_type  = xdp_get_kfunc_return_type,
+	.btf_struct_access	= xdp_btf_struct_access,
 };
 
 const struct bpf_prog_ops xdp_prog_ops = {
diff --git a/net/core/net_namespace.c b/net/core/net_namespace.c
index 202fa5eacd0f..7b4bfe793002 100644
--- a/net/core/net_namespace.c
+++ b/net/core/net_namespace.c
@@ -299,6 +299,7 @@ struct net *get_net_ns_by_id(const struct net *net, int id)
 
 	return peer;
 }
+EXPORT_SYMBOL_GPL(get_net_ns_by_id);
 
 /*
  * setup_net runs the initializers for the network namespace object.
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 770a63103c7a..69450b5c32f0 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -11,6 +11,9 @@
 
 #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
 
+#include <linux/bpf.h>
+#include <linux/btf.h>
+#include <linux/btf_ids.h>
 #include <linux/types.h>
 #include <linux/netfilter.h>
 #include <linux/module.h>
@@ -2451,6 +2454,252 @@ static int kill_all(struct nf_conn *i, void *data)
 	return net_eq(nf_ct_net(i), data);
 }
 
+/* Unstable Kernel Helpers for XDP hook */
+static struct nf_conn *__bpf_nf_ct_lookup(struct net *net,
+					  struct bpf_sock_tuple *bpf_tuple,
+					  u32 tuple_len, u8 protonum,
+					  u64 netns_id, u64 flags)
+{
+	struct nf_conntrack_tuple_hash *hash;
+	struct nf_conntrack_tuple tuple;
+
+	if (flags != IP_CT_DIR_ORIGINAL && flags != IP_CT_DIR_REPLY)
+		return ERR_PTR(-EINVAL);
+
+	memset(&tuple, 0, sizeof(tuple));
+
+	switch (tuple_len) {
+	case sizeof(bpf_tuple->ipv4):
+		tuple.src.l3num = AF_INET;
+		tuple.src.u3.ip = bpf_tuple->ipv4.saddr;
+		tuple.src.u.tcp.port = bpf_tuple->ipv4.sport;
+		tuple.dst.u3.ip = bpf_tuple->ipv4.daddr;
+		tuple.dst.u.tcp.port = bpf_tuple->ipv4.dport;
+		break;
+	case sizeof(bpf_tuple->ipv6):
+		tuple.src.l3num = AF_INET6;
+		memcpy(tuple.src.u3.ip6, bpf_tuple->ipv6.saddr, sizeof(bpf_tuple->ipv6.saddr));
+		tuple.src.u.tcp.port = bpf_tuple->ipv6.sport;
+		memcpy(tuple.dst.u3.ip6, bpf_tuple->ipv6.daddr, sizeof(bpf_tuple->ipv6.daddr));
+		tuple.dst.u.tcp.port = bpf_tuple->ipv6.dport;
+		break;
+	default:
+		return ERR_PTR(-EAFNOSUPPORT);
+	}
+
+	tuple.dst.protonum = protonum;
+	tuple.dst.dir = flags;
+
+	if ((s32)netns_id >= 0) {
+		if ((s32)netns_id > S32_MAX)
+			return ERR_PTR(-EINVAL);
+		net = get_net_ns_by_id(net, netns_id);
+		if (!net)
+			return ERR_PTR(-EINVAL);
+	}
+
+	hash = nf_conntrack_find_get(net, &nf_ct_zone_dflt, &tuple);
+	if ((s32)netns_id >= 0)
+		put_net(net);
+	if (!hash)
+		return ERR_PTR(-ENOENT);
+	return nf_ct_tuplehash_to_ctrack(hash);
+}
+
+static struct nf_conn *bpf_xdp_ct_lookup(struct xdp_md *xdp_ctx,
+					 struct bpf_sock_tuple *bpf_tuple,
+					 u32 tuple_len, u8 protonum,
+					 u64 netns_id, u64 *flags_err)
+{
+	struct xdp_buff *ctx = (struct xdp_buff *)xdp_ctx;
+	struct net *caller_net;
+	struct nf_conn *nfct;
+
+	if (!flags_err)
+		return NULL;
+	if (!bpf_tuple) {
+		*flags_err = -EINVAL;
+		return NULL;
+	}
+	caller_net = dev_net(ctx->rxq->dev);
+	nfct = __bpf_nf_ct_lookup(caller_net, bpf_tuple, tuple_len, protonum,
+				  netns_id, *flags_err);
+	if (IS_ERR(nfct)) {
+		*flags_err = PTR_ERR(nfct);
+		return NULL;
+	}
+	return nfct;
+}
+
+static struct nf_conn *bpf_skb_ct_lookup(struct __sk_buff *skb_ctx,
+					 struct bpf_sock_tuple *bpf_tuple,
+					 u32 tuple_len, u8 protonum,
+					 u64 netns_id, u64 *flags_err)
+{
+	struct sk_buff *skb = (struct sk_buff *)skb_ctx;
+	struct net *caller_net;
+	struct nf_conn *nfct;
+
+	if (!flags_err)
+		return NULL;
+	if (!bpf_tuple) {
+		*flags_err = -EINVAL;
+		return NULL;
+	}
+	caller_net = skb->dev ? dev_net(skb->dev) : sock_net(skb->sk);
+	nfct = __bpf_nf_ct_lookup(caller_net, bpf_tuple, tuple_len, protonum,
+				  netns_id, *flags_err);
+	if (IS_ERR(nfct)) {
+		*flags_err = PTR_ERR(nfct);
+		return NULL;
+	}
+	return nfct;
+}
+
+struct nf_conn *bpf_xdp_ct_lookup_tcp(struct xdp_md *xdp_ctx,
+				      struct bpf_sock_tuple *bpf_tuple,
+				      u32 tuple_len, u64 netns_id,
+				      u64 *flags_err)
+{
+	return bpf_xdp_ct_lookup(xdp_ctx, bpf_tuple, tuple_len, IPPROTO_TCP,
+				 netns_id, flags_err);
+}
+
+struct nf_conn *bpf_xdp_ct_lookup_udp(struct xdp_md *xdp_ctx,
+				      struct bpf_sock_tuple *bpf_tuple,
+				      u32 tuple_len, u64 netns_id,
+				      u64 *flags_err)
+{
+	return bpf_xdp_ct_lookup(xdp_ctx, bpf_tuple, tuple_len, IPPROTO_UDP,
+				 netns_id, flags_err);
+}
+
+struct nf_conn *bpf_skb_ct_lookup_tcp(struct __sk_buff *skb_ctx,
+				      struct bpf_sock_tuple *bpf_tuple,
+				      u32 tuple_len, u64 netns_id,
+				      u64 *flags_err)
+{
+	return bpf_skb_ct_lookup(skb_ctx, bpf_tuple, tuple_len, IPPROTO_TCP,
+				 netns_id, flags_err);
+}
+
+struct nf_conn *bpf_skb_ct_lookup_udp(struct __sk_buff *skb_ctx,
+				      struct bpf_sock_tuple *bpf_tuple,
+				      u32 tuple_len, u64 netns_id,
+				      u64 *flags_err)
+{
+	return bpf_skb_ct_lookup(skb_ctx, bpf_tuple, tuple_len, IPPROTO_UDP,
+				 netns_id, flags_err);
+}
+
+void bpf_ct_release(struct nf_conn *nfct)
+{
+	if (!nfct)
+		return;
+	nf_ct_put(nfct);
+}
+
+BTF_SET_START(nf_conntrack_xdp_ids)
+BTF_ID(func, bpf_xdp_ct_lookup_tcp)
+BTF_ID(func, bpf_xdp_ct_lookup_udp)
+BTF_ID(func, bpf_ct_release)
+BTF_SET_END(nf_conntrack_xdp_ids)
+
+BTF_SET_START(nf_conntrack_skb_ids)
+BTF_ID(func, bpf_skb_ct_lookup_tcp)
+BTF_ID(func, bpf_skb_ct_lookup_udp)
+BTF_ID(func, bpf_ct_release)
+BTF_SET_END(nf_conntrack_skb_ids)
+
+BTF_ID_LIST(nf_conntrack_ids)
+BTF_ID(func, bpf_xdp_ct_lookup_tcp)
+BTF_ID(func, bpf_xdp_ct_lookup_udp)
+BTF_ID(func, bpf_skb_ct_lookup_tcp)
+BTF_ID(func, bpf_skb_ct_lookup_udp)
+BTF_ID(func, bpf_ct_release)
+BTF_ID(struct, nf_conn)
+
+bool nf_is_acquire_kfunc(u32 kfunc_id)
+{
+	return kfunc_id == nf_conntrack_ids[0] ||
+	       kfunc_id == nf_conntrack_ids[1] ||
+	       kfunc_id == nf_conntrack_ids[2] ||
+	       kfunc_id == nf_conntrack_ids[3];
+}
+
+bool nf_is_release_kfunc(u32 kfunc_id)
+{
+	return kfunc_id == nf_conntrack_ids[4];
+}
+
+enum bpf_return_type nf_get_kfunc_return_type(u32 kfunc_id)
+{
+	if (kfunc_id == nf_conntrack_ids[0] ||
+	    kfunc_id == nf_conntrack_ids[1] ||
+	    kfunc_id == nf_conntrack_ids[2] ||
+	    kfunc_id == nf_conntrack_ids[3])
+		return RET_PTR_TO_BTF_ID_OR_NULL;
+	return __BPF_RET_TYPE_MAX;
+}
+
+static int nf_btf_struct_access(struct bpf_verifier_log *log,
+				const struct btf *btf,
+				const struct btf_type *t, int off,
+				int size, enum bpf_access_type atype,
+				u32 *next_btf_id)
+{
+	const struct btf_type *nf_conn_type;
+	size_t end;
+
+	nf_conn_type = btf_type_by_id(btf, nf_conntrack_ids[5]);
+	if (!nf_conn_type)
+		return -EACCES;
+	/* This won't work (not even with btf_struct_ids_match for off == 0),
+	 * see below for the reason:
+	 * https://lore.kernel.org/bpf/20211028014428.rsuq6rkfvqzq23tg@apollo.localdomain
+	 */
+	if (t != nf_conn_type) /* skip */
+		return __BPF_REG_TYPE_MAX;
+
+	if (atype != BPF_READ)
+		return -EACCES;
+
+	switch (off) {
+	case offsetof(struct nf_conn, status):
+		end = offsetofend(struct nf_conn, status);
+		break;
+	/* TODO(v2): We should do it per field offset */
+	case bpf_ctx_range(struct nf_conn, proto):
+		end = offsetofend(struct nf_conn, proto);
+		break;
+	default:
+		return -EACCES;
+	}
+
+	if (off + size > end)
+		return -EACCES;
+
+	return NOT_INIT;
+}
+
+static struct kfunc_btf_id_set nf_ct_xdp_kfunc_set = {
+	.owner                 = THIS_MODULE,
+	.set                   = &nf_conntrack_xdp_ids,
+	.is_acquire_kfunc      = nf_is_acquire_kfunc,
+	.is_release_kfunc      = nf_is_release_kfunc,
+	.get_kfunc_return_type = nf_get_kfunc_return_type,
+	.btf_struct_access     = nf_btf_struct_access,
+};
+
+static struct kfunc_btf_id_set nf_ct_skb_kfunc_set = {
+	.owner                 = THIS_MODULE,
+	.set                   = &nf_conntrack_skb_ids,
+	.is_acquire_kfunc      = nf_is_acquire_kfunc,
+	.is_release_kfunc      = nf_is_release_kfunc,
+	.get_kfunc_return_type = nf_get_kfunc_return_type,
+	.btf_struct_access     = nf_btf_struct_access,
+};
+
 void nf_conntrack_cleanup_start(void)
 {
 	conntrack_gc_work.exiting = true;
@@ -2459,6 +2708,9 @@ void nf_conntrack_cleanup_start(void)
 
 void nf_conntrack_cleanup_end(void)
 {
+	unregister_kfunc_btf_id_set(&xdp_kfunc_list, &nf_ct_xdp_kfunc_set);
+	unregister_kfunc_btf_id_set(&prog_test_kfunc_list, &nf_ct_skb_kfunc_set);
+
 	RCU_INIT_POINTER(nf_ct_hook, NULL);
 	cancel_delayed_work_sync(&conntrack_gc_work.dwork);
 	kvfree(nf_conntrack_hash);
@@ -2745,6 +2997,9 @@ int nf_conntrack_init_start(void)
 	conntrack_gc_work_init(&conntrack_gc_work);
 	queue_delayed_work(system_power_efficient_wq, &conntrack_gc_work.dwork, HZ);
 
+	register_kfunc_btf_id_set(&prog_test_kfunc_list, &nf_ct_skb_kfunc_set);
+	register_kfunc_btf_id_set(&xdp_kfunc_list, &nf_ct_xdp_kfunc_set);
+
 	return 0;
 
 err_proto:
-- 
2.33.1


* [PATCH RFC bpf-next v1 6/6] selftests/bpf: Add referenced PTR_TO_BTF_ID selftest
  2021-10-30 14:46 [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers Kumar Kartikeya Dwivedi
                   ` (4 preceding siblings ...)
  2021-10-30 14:46 ` [PATCH RFC bpf-next v1 5/6] net: netfilter: Add unstable CT lookup helper for XDP and TC-BPF Kumar Kartikeya Dwivedi
@ 2021-10-30 14:46 ` Kumar Kartikeya Dwivedi
  2021-11-02 23:16 ` [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers Alexei Starovoitov
  6 siblings, 0 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2021-10-30 14:46 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, netdev, netfilter-devel

Use the prog_test kfuncs to test the referenced PTR_TO_BTF_ID kfunc
support.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 tools/testing/selftests/bpf/Makefile          |  2 +-
 tools/testing/selftests/bpf/bpf_tcp_helpers.h |  5 +++++
 .../selftests/bpf/prog_tests/kfunc_call.c     | 18 ++++++++++------
 .../selftests/bpf/progs/kfunc_call_test.c     | 21 +++++++++++++++++++
 4 files changed, 39 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index f49cb5fc85af..36c284464599 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -322,7 +322,7 @@ SKEL_BLACKLIST := btf__% test_pinning_invalid.c test_sk_assign.c
 LINKED_SKELS := test_static_linked.skel.h linked_funcs.skel.h		\
 		linked_vars.skel.h linked_maps.skel.h
 
-LSKELS := kfunc_call_test.c fentry_test.c fexit_test.c fexit_sleep.c \
+LSKELS := fentry_test.c fexit_test.c fexit_sleep.c \
 	test_ringbuf.c atomics.c trace_printk.c trace_vprintk.c
 # Generate both light skeleton and libbpf skeleton for these
 LSKELS_EXTRA := test_ksyms_module.c test_ksyms_weak.c
diff --git a/tools/testing/selftests/bpf/bpf_tcp_helpers.h b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
index b1ede6f0b821..f7fe596d2c7e 100644
--- a/tools/testing/selftests/bpf/bpf_tcp_helpers.h
+++ b/tools/testing/selftests/bpf/bpf_tcp_helpers.h
@@ -185,6 +185,11 @@ struct tcp_congestion_ops {
 	void *owner;
 };
 
+struct prog_test_ref_kfunc {
+	int a;
+	int b;
+} __attribute__((preserve_access_index));
+
 #define min(a, b) ((a) < (b) ? (a) : (b))
 #define max(a, b) ((a) > (b) ? (a) : (b))
 #define min_not_zero(x, y) ({			\
diff --git a/tools/testing/selftests/bpf/prog_tests/kfunc_call.c b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
index 5c9c0176991b..358b905ea9b5 100644
--- a/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
+++ b/tools/testing/selftests/bpf/prog_tests/kfunc_call.c
@@ -2,31 +2,37 @@
 /* Copyright (c) 2021 Facebook */
 #include <test_progs.h>
 #include <network_helpers.h>
-#include "kfunc_call_test.lskel.h"
+#include "kfunc_call_test.skel.h"
 #include "kfunc_call_test_subprog.skel.h"
 
 static void test_main(void)
 {
-	struct kfunc_call_test_lskel *skel;
+	struct kfunc_call_test *skel;
 	int prog_fd, retval, err;
 
-	skel = kfunc_call_test_lskel__open_and_load();
+	skel = kfunc_call_test__open_and_load();
 	if (!ASSERT_OK_PTR(skel, "skel"))
 		return;
 
-	prog_fd = skel->progs.kfunc_call_test1.prog_fd;
+	prog_fd = bpf_program__fd(skel->progs.kfunc_call_test1);
 	err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),
 				NULL, NULL, (__u32 *)&retval, NULL);
 	ASSERT_OK(err, "bpf_prog_test_run(test1)");
 	ASSERT_EQ(retval, 12, "test1-retval");
 
-	prog_fd = skel->progs.kfunc_call_test2.prog_fd;
+	prog_fd = bpf_program__fd(skel->progs.kfunc_call_test2);
 	err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),
 				NULL, NULL, (__u32 *)&retval, NULL);
 	ASSERT_OK(err, "bpf_prog_test_run(test2)");
 	ASSERT_EQ(retval, 3, "test2-retval");
 
-	kfunc_call_test_lskel__destroy(skel);
+	prog_fd = bpf_program__fd(skel->progs.kfunc_call_test_ref_btf_id);
+	err = bpf_prog_test_run(prog_fd, 1, &pkt_v4, sizeof(pkt_v4),
+				NULL, NULL, (__u32 *)&retval, NULL);
+	ASSERT_OK(err, "bpf_prog_test_run(test_ref_btf_id)");
+	ASSERT_EQ(retval, 0, "test_ref_btf_id-retval");
+
+	kfunc_call_test__destroy(skel);
 }
 
 static void test_subprog(void)
diff --git a/tools/testing/selftests/bpf/progs/kfunc_call_test.c b/tools/testing/selftests/bpf/progs/kfunc_call_test.c
index 8a8cf59017aa..045f157309b6 100644
--- a/tools/testing/selftests/bpf/progs/kfunc_call_test.c
+++ b/tools/testing/selftests/bpf/progs/kfunc_call_test.c
@@ -7,6 +7,8 @@
 extern int bpf_kfunc_call_test2(struct sock *sk, __u32 a, __u32 b) __ksym;
 extern __u64 bpf_kfunc_call_test1(struct sock *sk, __u32 a, __u64 b,
 				  __u32 c, __u64 d) __ksym;
+extern struct prog_test_ref_kfunc *bpf_kfunc_call_test_acquire(char *packet) __ksym;
+extern void bpf_kfunc_call_test_release(struct prog_test_ref_kfunc *p) __ksym;
 
 SEC("tc")
 int kfunc_call_test2(struct __sk_buff *skb)
@@ -44,4 +46,23 @@ int kfunc_call_test1(struct __sk_buff *skb)
 	return ret;
 }
 
+SEC("tc")
+int kfunc_call_test_ref_btf_id(struct __sk_buff *skb)
+{
+	struct bpf_sock_tuple tuple = {};
+	struct prog_test_ref_kfunc *pt;
+	struct nf_conn *ct = NULL;
+	char *p = &(char){0};
+	__u64 flags_err = 0;
+	int ret = 0;
+
+	pt = bpf_kfunc_call_test_acquire(p);
+	if (pt) {
+		if (pt->a != 42 || pt->b != 108)
+			ret = -1;
+		bpf_kfunc_call_test_release(pt);
+	}
+	return ret;
+}
+
 char _license[] SEC("license") = "GPL";
-- 
2.33.1


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC bpf-next v1 5/6] net: netfilter: Add unstable CT lookup helper for XDP and TC-BPF
  2021-10-30 14:46 ` [PATCH RFC bpf-next v1 5/6] net: netfilter: Add unstable CT lookup helper for XDP and TC-BPF Kumar Kartikeya Dwivedi
@ 2021-10-31 19:10   ` Florian Westphal
  2021-11-01 19:49     ` Toke Høiland-Jørgensen
  2021-11-02 23:19     ` Alexei Starovoitov
  0 siblings, 2 replies; 18+ messages in thread
From: Florian Westphal @ 2021-10-31 19:10 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, netdev, netfilter-devel

Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
> This change adds conntrack lookup helpers using the unstable kfunc call
> interface for the XDP and TC-BPF hooks.
> 
> Also add acquire/release functions (randomly returning NULL), and also
> exercise the RET_PTR_TO_BTF_ID_OR_NULL path so that BPF program caller
> has to check for NULL before dereferencing the pointer, for the TC hook.
> These will be used in selftest.
> 
> Export get_net_ns_by_id and btf_type_by_id as nf_conntrack needs to call
> them.

It would be good to get a summary on how this is useful.

I tried to find a use case but I could not.
Entry will time out soon once packets stop appearing, so it can't be
used for stack bypass.  Is it for something else?  If so, what?

For UDP it will work to let a packet pass through classic forward
path once in a while, but this will not work for tcp, depending
on conntrack settings (loose mode, liberal pickup, etc.).

> +/* Unstable Kernel Helpers for XDP hook */
> +static struct nf_conn *__bpf_nf_ct_lookup(struct net *net,
> +					  struct bpf_sock_tuple *bpf_tuple,
> +					  u32 tuple_len, u8 protonum,
> +					  u64 netns_id, u64 flags)
> +{
> +	struct nf_conntrack_tuple_hash *hash;
> +	struct nf_conntrack_tuple tuple;
> +
> +	if (flags != IP_CT_DIR_ORIGINAL && flags != IP_CT_DIR_REPLY)
> +		return ERR_PTR(-EINVAL);

The flags argument is not needed.

> +	tuple.dst.dir = flags;

.dir can be 0, it's not used by nf_conntrack_find_get().

> +	hash = nf_conntrack_find_get(net, &nf_ct_zone_dflt, &tuple);

Ok, so default zone. Depending on meaning of "unstable helper" this
is ok and can be changed in incompatible way later.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC bpf-next v1 5/6] net: netfilter: Add unstable CT lookup helper for XDP and TC-BPF
  2021-10-31 19:10   ` Florian Westphal
@ 2021-11-01 19:49     ` Toke Høiland-Jørgensen
  2021-11-02 20:43       ` Florian Westphal
  2021-11-02 23:19     ` Alexei Starovoitov
  1 sibling, 1 reply; 18+ messages in thread
From: Toke Høiland-Jørgensen @ 2021-11-01 19:49 UTC (permalink / raw)
  To: Florian Westphal, Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Florian Westphal, Jesper Dangaard Brouer,
	netdev, netfilter-devel

Florian Westphal <fw@strlen.de> writes:

> Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>> This change adds conntrack lookup helpers using the unstable kfunc call
>> interface for the XDP and TC-BPF hooks.
>> 
>> Also add acquire/release functions (randomly returning NULL), and also
>> exercise the RET_PTR_TO_BTF_ID_OR_NULL path so that BPF program caller
>> has to check for NULL before dereferencing the pointer, for the TC hook.
>> These will be used in selftest.
>> 
>> Export get_net_ns_by_id and btf_type_by_id as nf_conntrack needs to call
>> them.
>
> It would be good to get a summary on how this is useful.
>
> I tried to find a use case but I could not.
> Entry will time out soon once packets stop appearing, so it can't be
> used for stack bypass.  Is it for something else?  If so, what?

I think Maxim's use case was to implement a SYN proxy in XDP, where the
XDP program just needs to answer the question "do I have state for this
flow already". For TCP flows terminating on the local box this can be
done via a socket lookup, but for a middlebox, a conntrack lookup is
useful. Maxim, please correct me if I got your use case wrong.

> For UDP it will work to let a packet pass through classic forward
> path once in a while, but this will not work for tcp, depending
> on conntrack settings (loose mode, liberal pickup, etc.).

The idea is certainly to follow up with some kind of 'update' helper. At
a minimum a "keep this entry alive" update, but potentially more
complicated stuff as well. Details TBD, input welcome :)

>> +/* Unstable Kernel Helpers for XDP hook */
>> +static struct nf_conn *__bpf_nf_ct_lookup(struct net *net,
>> +					  struct bpf_sock_tuple *bpf_tuple,
>> +					  u32 tuple_len, u8 protonum,
>> +					  u64 netns_id, u64 flags)
>> +{
>> +	struct nf_conntrack_tuple_hash *hash;
>> +	struct nf_conntrack_tuple tuple;
>> +
>> +	if (flags != IP_CT_DIR_ORIGINAL && flags != IP_CT_DIR_REPLY)
>> +		return ERR_PTR(-EINVAL);
>
> The flags argument is not needed.
>
>> +	tuple.dst.dir = flags;
>
> .dir can be 0, it's not used by nf_conntrack_find_get().
>
>> +	hash = nf_conntrack_find_get(net, &nf_ct_zone_dflt, &tuple);
>
> Ok, so default zone. Depending on meaning of "unstable helper" this
> is ok and can be changed in incompatible way later.

I'm not sure about the meaning of "unstable" either, TBH, but in either
case I'd rather avoid changing things if we don't have to, so I think
adding the zone as an argument from the get-go may be better...
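
Purely as an illustration (the helper name and most of the argument list are
only placeholders for whatever the RFC ends up exporting; the point is just
where a zone id could go):

	struct nf_conn *bpf_xdp_ct_lookup_tcp(struct xdp_md *xdp_ctx,
					      struct bpf_sock_tuple *bpf_tuple,
					      u32 tuple_len, u16 zone_id,
					      u64 netns_id, u64 flags);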

-Toke


^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC bpf-next v1 5/6] net: netfilter: Add unstable CT lookup helper for XDP and TC-BPF
  2021-11-01 19:49     ` Toke Høiland-Jørgensen
@ 2021-11-02 20:43       ` Florian Westphal
  2021-11-05 20:48         ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 18+ messages in thread
From: Florian Westphal @ 2021-11-02 20:43 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Florian Westphal, Kumar Kartikeya Dwivedi, bpf,
	Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Jesper Dangaard Brouer, netdev,
	netfilter-devel

Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> > I tried to find a use case but I could not.
> > Entry will time out soon once packets stop appearing, so it can't be
> > used for stack bypass.  Is it for something else?  If so, what?
> 
> I think Maxim's use case was to implement a SYN proxy in XDP, where the
> XDP program just needs to answer the question "do I have state for this
> flow already". For TCP flows terminating on the local box this can be
> done via a socket lookup, but for a middlebox, a conntrack lookup is
> useful. Maxim, please correct me if I got your use case wrong.

Looked at
https://netdevconf.info/0x15/slides/30/Netdev%200x15%20Accelerating%20synproxy%20with%20XDP.pdf

seems that's right, it's only a "does it exist".

> > For UDP it will work to let a packet pass through classic forward
> > path once in a while, but this will not work for tcp, depending
> > on conntrack settings (loose mode, liberal pickup, etc.).
> 
> The idea is certainly to follow up with some kind of 'update' helper. At
> a minimum a "keep this entry alive" update, but potentially more
> complicated stuff as well. Details TBD, input welcome :)

Depends on use case.  For bypass infra I'd target the flowtable
infra rather than conntrack because it gets rid of the "early time out"
problem, plus you get the output interface/dst entry.

Not trivial for xdp because existing code assumes sk_buff.
But I think it can be refactored to allow raw buffers, similar
to flow dissector.

> >> +	hash = nf_conntrack_find_get(net, &nf_ct_zone_dflt, &tuple);
> >
> > Ok, so default zone. Depending on meaning of "unstable helper" this
> > is ok and can be changed in incompatible way later.
> 
> I'm not sure about the meaning of "unstable" either, TBH, but in either
> case I'd rather avoid changing things if we don't have to, so I think
> adding the zone as an argument from the get-go may be better...

Another thing I just noted:
The above gives a nf_conn with incremented reference count.

For Maxim's use case, that's unnecessary overhead. Existence can be
determined without a reference increment.  The caveat is that the pointer
cannot be used after the last rcu_read_unlock().

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers
  2021-10-30 14:46 [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers Kumar Kartikeya Dwivedi
                   ` (5 preceding siblings ...)
  2021-10-30 14:46 ` [PATCH RFC bpf-next v1 6/6] selftests/bpf: Add referenced PTR_TO_BTF_ID selftest Kumar Kartikeya Dwivedi
@ 2021-11-02 23:16 ` Alexei Starovoitov
  2021-11-04 12:55   ` Kumar Kartikeya Dwivedi
  6 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2021-11-02 23:16 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, netdev, netfilter-devel

On Sat, Oct 30, 2021 at 08:16:03PM +0530, Kumar Kartikeya Dwivedi wrote:
> This series adds unstable conntrack lookup helpers using BPF kfunc support.  The
> patch adding the lookup helper is based off of Maxim's recent patch to aid in
> rebasing their series on top of this, all adjusted to work with kfunc support
> [0].
> 
> This is an RFC series, as I'm unsure whether the reference tracking for
> PTR_TO_BTF_ID will be accepted.

Yes. The patches look good overall.
Please don't do __BPF_RET_TYPE_MAX signalling. It's an ambiguous name.
_MAX is typically used for a different purpose. Just give it an explicit name.
I don't fully understand why that skip is needed though.
Why it's not one of existing RET_*. Duplication of return and
being lazy to propagate the correct ret value into get_kfunc_return_type ?

> If not, we can go back to doing it the typical
> way with PTR_TO_NF_CONN type, guarded with #if IS_ENABLED(CONFIG_NF_CONNTRACK).

Please don't. We already have a ton of special and custom types in the verifier.
refcnted PTR_TO_BTF_ID sounds as good way to scale it.

> Also, I want to understand whether it would make sense to introduce
> check_helper_call style bpf_func_proto based argument checking for kfuncs, or
> continue with how it is right now, since it doesn't seem correct that PTR_TO_MEM
> can be passed where PTR_TO_BTF_ID may be expected. Only PTR_TO_CTX is enforced.

Do we really allow to pass PTR_TO_MEM argument into a function that expects PTR_TO_BTF_ID ?
That sounds like a bug that we need to fix.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC bpf-next v1 5/6] net: netfilter: Add unstable CT lookup helper for XDP and TC-BPF
  2021-10-31 19:10   ` Florian Westphal
  2021-11-01 19:49     ` Toke Høiland-Jørgensen
@ 2021-11-02 23:19     ` Alexei Starovoitov
  1 sibling, 0 replies; 18+ messages in thread
From: Alexei Starovoitov @ 2021-11-02 23:19 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Kumar Kartikeya Dwivedi, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, Maxim Mikityanskiy,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen, netdev,
	netfilter-devel

On Sun, Oct 31, 2021 at 08:10:45PM +0100, Florian Westphal wrote:
> 
> Depending on meaning of "unstable helper" this
> is ok and can be changed in incompatible way later.

"unstable helper" means that it's not part of uapi.
All kfuncs are like that. They can and will change.
bpf progs that call them would have to adjust.
Just like bpf progs that read kernel data and walk kernel types.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers
  2021-11-02 23:16 ` [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers Alexei Starovoitov
@ 2021-11-04 12:55   ` Kumar Kartikeya Dwivedi
  2021-11-05 20:49     ` Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2021-11-04 12:55 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, netdev, netfilter-devel

On Wed, Nov 03, 2021 at 04:46:42AM IST, Alexei Starovoitov wrote:
> On Sat, Oct 30, 2021 at 08:16:03PM +0530, Kumar Kartikeya Dwivedi wrote:
> > This series adds unstable conntrack lookup helpers using BPF kfunc support.  The
> > patch adding the lookup helper is based off of Maxim's recent patch to aid in
> > rebasing their series on top of this, all adjusted to work with kfunc support
> > [0].
> >
> > This is an RFC series, as I'm unsure whether the reference tracking for
> > PTR_TO_BTF_ID will be accepted.
>
> Yes. The patches look good overall.
> Please don't do __BPF_RET_TYPE_MAX signalling. It's an ambiguous name.
> _MAX is typically used for a different purpose. Just give it an explicit name.
> I don't fully understand why that skip is needed though.

I needed a sentinel to skip return type checking (otherwise check that return
type and prototype match) since existing kfunc don't have a
get_kfunc_return_type callback, but if we add bpf_func_proto support to kfunc
then we can probably convert existing kfuncs to that as well and skip all this
logic. Mostly needed it for RET_PTR_TO_BTF_ID_OR_NULL.

Extending to support bpf_func_proto seemed like a bit of work so I wanted to get
some feedback first on all this, before working on it.

> Why it's not one of existing RET_*. Duplication of return and
> being lazy to propagate the correct ret value into get_kfunc_return_type ?
>
> > If not, we can go back to doing it the typical
> > way with PTR_TO_NF_CONN type, guarded with #if IS_ENABLED(CONFIG_NF_CONNTRACK).
>
> Please don't. We already have a ton of special and custom types in the verifier.
> refcnted PTR_TO_BTF_ID sounds as good way to scale it.
>

Understood.

> > Also, I want to understand whether it would make sense to introduce
> > check_helper_call style bpf_func_proto based argument checking for kfuncs, or
> > continue with how it is right now, since it doesn't seem correct that PTR_TO_MEM
> > can be passed where PTR_TO_BTF_ID may be expected. Only PTR_TO_CTX is enforced.
>
> Do we really allow to pass PTR_TO_MEM argument into a function that expects PTR_TO_BTF_ID ?

Sorry, that's poorly phrased. Current kfunc doesn't support PTR_TO_MEM. I meant
it would be allowed now, with the way I implemented things, but there also isn't
a way to signal whether PTR_TO_BTF_ID is expected (hence the question about
bpf_func_proto). I did not understand why that was not done originally (maybe it
was lack of usecase). PTR_TO_CTX works because the type is matched with prog
type, so you can't pass something else there. For other cases the type of
register is considered.

> That sounds like a bug that we need to fix.

--
Kartikeya

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC bpf-next v1 5/6] net: netfilter: Add unstable CT lookup helper for XDP and TC-BPF
  2021-11-02 20:43       ` Florian Westphal
@ 2021-11-05 20:48         ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2021-11-05 20:48 UTC (permalink / raw)
  To: Florian Westphal
  Cc: Toke Høiland-Jørgensen, bpf, Alexei Starovoitov,
	Daniel Borkmann, Andrii Nakryiko, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, Maxim Mikityanskiy,
	Jesper Dangaard Brouer, netdev, netfilter-devel

On Wed, Nov 03, 2021 at 02:13:58AM IST, Florian Westphal wrote:
> Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> > > I tried to find a use case but I could not.
> > > Entry will time out soon once packets stop appearing, so it can't be
> > > used for stack bypass.  Is it for something else?  If so, what?
> >
> > I think Maxim's use case was to implement a SYN proxy in XDP, where the
> > XDP program just needs to answer the question "do I have state for this
> > flow already". For TCP flows terminating on the local box this can be
> > done via a socket lookup, but for a middlebox, a conntrack lookup is
> > useful. Maxim, please correct me if I got your use case wrong.
>
> Looked at
> https://netdevconf.info/0x15/slides/30/Netdev%200x15%20Accelerating%20synproxy%20with%20XDP.pdf
>
> seems thats right, its only a "does it exist".
>

FYI, there's also an example in the original series (grep for bpf_ct_lookup_tcp):
https://lore.kernel.org/bpf/20211019144655.3483197-11-maximmi@nvidia.com

> > > For UDP it will work to let a packet pass through classic forward
> > > path once in a while, but this will not work for tcp, depending
> > > on conntrack settings (loose mode, liberal pickup, etc.).
> >
> > The idea is certainly to follow up with some kind of 'update' helper. At
> > a minimum a "keep this entry alive" update, but potentially more
> > complicated stuff as well. Details TBD, input welcome :)
>
> Depends on use case.  For bypass infra I'd target the flowtable
> infra rather than conntrack because it gets rid of the "early time out"
> problem, plus you get the output interface/dst entry.
>
> Not trivial for xdp because existing code assumes sk_buff.
> But I think it can be refactored to allow raw buffers, similar
> to flow dissector.
>
> > >> +	hash = nf_conntrack_find_get(net, &nf_ct_zone_dflt, &tuple);
> > >
> > > Ok, so default zone. Depending on meaning of "unstable helper" this
> > > is ok and can be changed in incompatible way later.
> >
> > I'm not sure about the meaning of "unstable" either, TBH, but in either
> > case I'd rather avoid changing things if we don't have to, so I think
> > adding the zone as an argument from the get-go may be better...
>
> Another thing I just noted:
> The above gives a nf_conn with incremented reference count.
>
> For Maxim's use case, that's unnecessary overhead. Existence can be
> determined without a reference increment.  The caveat is that the pointer
> cannot be used after the last rcu_read_unlock().

From my reading, it would be safe but not correct to use (as in dereference) the
pointer without going through nf_conntrack_find_get: even though freeing of the
underlying memory is done with SLAB_TYPESAFE_BY_RCU (formerly
SLAB_DESTROY_BY_RCU), the nf_conn itself may not correspond to the same tuple
in the RCU read section unless a reference is taken. So doing what the example
currently does (checking ct->status & IPS_CONFIRMED_BIT) is not safe without
raising the reference, even though the XDP program invocation is under RCU
protection. Returning a PTR_TO_BTF_ID for the nf_conn wouldn't really work
without getting a reference on it, since the object can be recycled.
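
(For illustration, the BPF program side would then look roughly like the below,
assuming vmlinux.h for the kernel types; the lookup/release helper names are
placeholders for whatever patch 5 ends up exporting, and the tuple setup is
elided:)

	SEC("xdp")
	int ct_exists(struct xdp_md *ctx)
	{
		struct bpf_sock_tuple tuple = {};
		struct nf_conn *ct;
		int verdict = XDP_DROP;

		/* ... fill tuple.ipv4 from the parsed packet ... */

		ct = bpf_xdp_ct_lookup_tcp(ctx, &tuple, sizeof(tuple.ipv4), 0, 0);
		if (ct) {
			/* reference held, so ct still corresponds to the tuple
			 * and reading ct->status is meaningful here
			 */
			if (ct->status & IPS_CONFIRMED)
				verdict = XDP_PASS;
			bpf_ct_release(ct);
		}
		return verdict;
	}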

--
Kartikeya

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers
  2021-11-04 12:55   ` Kumar Kartikeya Dwivedi
@ 2021-11-05 20:49     ` Alexei Starovoitov
  2021-11-05 21:13       ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2021-11-05 20:49 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, netdev, netfilter-devel

On Thu, Nov 04, 2021 at 06:25:03PM +0530, Kumar Kartikeya Dwivedi wrote:
> On Wed, Nov 03, 2021 at 04:46:42AM IST, Alexei Starovoitov wrote:
> > On Sat, Oct 30, 2021 at 08:16:03PM +0530, Kumar Kartikeya Dwivedi wrote:
> > > This series adds unstable conntrack lookup helpers using BPF kfunc support.  The
> > > patch adding the lookup helper is based off of Maxim's recent patch to aid in
> > > rebasing their series on top of this, all adjusted to work with kfunc support
> > > [0].
> > >
> > > This is an RFC series, as I'm unsure whether the reference tracking for
> > > PTR_TO_BTF_ID will be accepted.
> >
> > Yes. The patches look good overall.
> > Please don't do __BPF_RET_TYPE_MAX signalling. It's an ambiguous name.
> > _MAX is typically used for a different purpose. Just give it an explicit name.
> > I don't fully understand why that skip is needed though.
> 
> I needed a sentinel to skip return type checking (otherwise check that return
> type and prototype match) since existing kfunc don't have a
> get_kfunc_return_type callback, but if we add bpf_func_proto support to kfunc
> then we can probably convert existing kfuncs to that as well and skip all this
> logic. Mostly needed it for RET_PTR_TO_BTF_ID_OR_NULL.

So it's just to special case r0=PTR_TO_BTF_ID_OR_NULL instead of
PTR_TO_BTF_ID that it's doing by default now?
Then could you use a btf_id list to whitelist all such funcs that needs _OR_NULL
variant and just do a search in that list in check_kfunc_call() ?
Instead of adding get_kfunc_return_type() callback.
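
For example (purely a sketch of the idea; the set name and its contents are
made up, and module kfuncs would need an equivalent set registered from the
module side):

	BTF_SET_START(ret_null_kfunc_set)
	BTF_ID(func, bpf_kfunc_call_test_acquire)
	BTF_SET_END(ret_null_kfunc_set)

	/* in check_kfunc_call(), where the PTR_TO_BTF_ID return is set up;
	 * func_id is the kfunc's BTF id (insn->imm)
	 */
	if (btf_id_set_contains(&ret_null_kfunc_set, func_id))
		regs[BPF_REG_0].type = PTR_TO_BTF_ID_OR_NULL;
	else
		regs[BPF_REG_0].type = PTR_TO_BTF_ID;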

> Extending to support bpf_func_proto seemed like a bit of work so I wanted to get
> some feedback first on all this, before working on it.

No need to hack into bpf_func_proto. All kernel funcs have BTF. It's all we need.
The _OR_NULL part we will eventually be able to express with btf_tag when
it's supported by both gcc and clang.

> > > Also, I want to understand whether it would make sense to introduce
> > > check_helper_call style bpf_func_proto based argument checking for kfuncs, or
> > > continue with how it is right now, since it doesn't seem correct that PTR_TO_MEM
> > > can be passed where PTR_TO_BTF_ID may be expected. Only PTR_TO_CTX is enforced.
> >
> > Do we really allow to pass PTR_TO_MEM argument into a function that expects PTR_TO_BTF_ID ?
> 
> Sorry, that's poorly phrased. Current kfunc doesn't support PTR_TO_MEM. I meant
> it would be allowed now, with the way I implemented things, but there also isn't
> a way to signal whether PTR_TO_BTF_ID is expected (hence the question about
> bpf_func_proto). I did not understand why that was not done originally (maybe it
> was lack of usecase). PTR_TO_CTX works because the type is matched with prog
> type, so you can't pass something else there. For other cases the type of
> register is considered.

Right. btf_check_kfunc_arg_match doesn't allow ptr_to_mem yet.
There is no signalling needed.
All args passed by the program into kfunc have to be either exact
PTR_TO_BTF_ID or conversions from PTR_TO_SOCK*.

Passing rX=PTR_TO_CTX into kfunc should not work. If I'm reading the code
correctly it's not allowed. I'm not sure why you're saying it can be done.
It's possible to pass PTR_TO_CTX into another bpf prog's global function.
The same btf_check_func_arg_match() helper checks both cases (global funcs and kfuncs).
Maybe that's where the confusion comes from?

Same with if (ptr_to_mem_ok). It's only for passing PTR_TO_MEM
into bpf prog's global function.
We can extend the verifier and allow PTR_TO_MEM into kfunc that
has 'long *' prototype, for example.
But it doesn't sound like the use case you have in mind.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers
  2021-11-05 20:49     ` Alexei Starovoitov
@ 2021-11-05 21:13       ` Kumar Kartikeya Dwivedi
  2021-11-06 18:13         ` Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2021-11-05 21:13 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, netdev, netfilter-devel

On Sat, Nov 06, 2021 at 02:19:08AM IST, Alexei Starovoitov wrote:
> On Thu, Nov 04, 2021 at 06:25:03PM +0530, Kumar Kartikeya Dwivedi wrote:
> > On Wed, Nov 03, 2021 at 04:46:42AM IST, Alexei Starovoitov wrote:
> > > On Sat, Oct 30, 2021 at 08:16:03PM +0530, Kumar Kartikeya Dwivedi wrote:
> > > > This series adds unstable conntrack lookup helpers using BPF kfunc support.  The
> > > > patch adding the lookup helper is based off of Maxim's recent patch to aid in
> > > > rebasing their series on top of this, all adjusted to work with kfunc support
> > > > [0].
> > > >
> > > > This is an RFC series, as I'm unsure whether the reference tracking for
> > > > PTR_TO_BTF_ID will be accepted.
> > >
> > > Yes. The patches look good overall.
> > > Please don't do __BPF_RET_TYPE_MAX signalling. It's an ambiguous name.
> > > _MAX is typically used for a different purpose. Just give it an explicit name.
> > > I don't fully understand why that skip is needed though.
> >
> > I needed a sentinel to skip return type checking (otherwise check that return
> > type and prototype match) since existing kfunc don't have a
> > get_kfunc_return_type callback, but if we add bpf_func_proto support to kfunc
> > then we can probably convert existing kfuncs to that as well and skip all this
> > logic. Mostly needed it for RET_PTR_TO_BTF_ID_OR_NULL.
>
> So it's just to special case r0=PTR_TO_BTF_ID_OR_NULL instead of
> PTR_TO_BTF_ID that it's doing by default now?
> Then could you use a btf_id list to whitelist all such funcs that needs _OR_NULL
> variant and just do a search in that list in check_kfunc_call() ?
> Instead of adding get_kfunc_return_type() callback.
>

Hm, good idea, that should work for now.

> > Extending to support bpf_func_proto seemed like a bit of work so I wanted to get
> > some feedback first on all this, before working on it.
>
> No need to hack into bpf_func_proto. All kernel funcs have BTF. It's all we need.
> The _OR_NULL part we will eventually be able to express with btf_tag when
> it's supported by both gcc and clang.
>

More on this below.

> > > > Also, I want to understand whether it would make sense to introduce
> > > > check_helper_call style bpf_func_proto based argument checking for kfuncs, or
> > > > continue with how it is right now, since it doesn't seem correct that PTR_TO_MEM
> > > > can be passed where PTR_TO_BTF_ID may be expected. Only PTR_TO_CTX is enforced.
> > >
> > > Do we really allow to pass PTR_TO_MEM argument into a function that expects PTR_TO_BTF_ID ?
> >
> > Sorry, that's poorly phrased. Current kfunc doesn't support PTR_TO_MEM. I meant
> > it would be allowed now, with the way I implemented things, but there also isn't
> > a way to signal whether PTR_TO_BTF_ID is expected (hence the question about
> > bpf_func_proto). I did not understand why that was not done originally (maybe it
> > was lack of usecase). PTR_TO_CTX works because the type is matched with prog
> > type, so you can't pass something else there. For other cases the type of
> > register is considered.
>
> Right. btf_check_kfunc_arg_match doesn't allow ptr_to_mem yet.
> There is no signalling needed.
> All args passed by the program into kfunc have to be either exact
> PTR_TO_BTF_ID or conversions from PTR_TO_SOCK*.
>

I should have been clearer again :). Sorry for that.

Right now only PTR_TO_BTF_ID and PTR_TO_SOCK and scalars are supported, as you
noted, for kfunc arguments.

So in 3/6 I move the PTR_TO_CTX block before btf_is_kernel check, that means if
reg type is PTR_TO_CTX and it matches the argument for the program, it will use
that, otherwise it moves to btf_is_kernel(btf) block, which checks if reg->type
is PTR_TO_BTF_ID or one of PTR_TO_SOCK* and does struct match for those. Next, I
punt to ptr_to_mem for the rest of the cases, which I think is problematic,
since now you may pass PTR_TO_MEM where some kfunc wants a PTR_TO_BTF_ID.

But without bpf_func_proto, I am not sure we can decide what is expected in the
kfunc. For something like bpf_sock_tuple, we'd want a PTR_TO_MEM, but taking in
a PTR_TO_BTF_ID also isn't problematic since it is just data, but for a struct
embedding pointers or other cases, it may be a problem.

For PTR_TO_CTX in kfunc case, based on my reading and testing, it will reject
any attempts to pass anything other than PTR_TO_CTX due to btf_get_prog_ctx_type
for that argument. So that works fine.

To me it seems like extending with some limited argument checking is necessary,
either using tagging as you mentioned or bpf_func_proto, or some other hardcoded
checking for now since the number of helpers needing this support is low.

Please correct me if I'm wrong about any of this.

> Passing rX=PTR_TO_CTX into kfunc should not work. If I'm reading the code
> correctly it's not allowed. I'm not sure why you're saying it can be done.
> It's possible to pass PTR_TO_CTX into another bpf prog's global function.
> The same btf_check_func_arg_match() helper checks both cases (global funcs and kfuncs).
> Maybe that's where the confusion comes from?
>
> Same with if (ptr_to_mem_ok). It's only for passing PTR_TO_MEM
> into bpf prog's global function.
> We can extend the verifier and allow PTR_TO_MEM into kfunc that
> has 'long *' prototype, for example.
> But it doesn't sound like the use case you have in mind.

--
Kartikeya

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers
  2021-11-05 21:13       ` Kumar Kartikeya Dwivedi
@ 2021-11-06 18:13         ` Alexei Starovoitov
  2021-11-07 15:44           ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2021-11-06 18:13 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, netdev, netfilter-devel

On Sat, Nov 06, 2021 at 02:43:12AM +0530, Kumar Kartikeya Dwivedi wrote:
> 
> Right now only PTR_TO_BTF_ID and PTR_TO_SOCK and scalars are supported, as you
> noted, for kfunc arguments.
> 
> So in 3/6 I move the PTR_TO_CTX block before btf_is_kernel check, that means if
> reg type is PTR_TO_CTX and it matches the argument for the program, it will use
> that, otherwise it moves to btf_is_kernel(btf) block, which checks if reg->type
> is PTR_TO_BTF_ID or one of PTR_TO_SOCK* and does struct match for those. Next, I
> punt to ptr_to_mem for the rest of the cases, which I think is problematic,
> since now you may pass PTR_TO_MEM where some kfunc wants a PTR_TO_BTF_ID.
> 
> But without bpf_func_proto, I am not sure we can decide what is expected in the
> kfunc. For something like bpf_sock_tuple, we'd want a PTR_TO_MEM, but taking in
> a PTR_TO_BTF_ID also isn't problematic since it is just data, but for a struct
> embedding pointers or other cases, it may be a problem.
> 
> For PTR_TO_CTX in kfunc case, based on my reading and testing, it will reject
> any attempts to pass anything other than PTR_TO_CTX due to btf_get_prog_ctx_type
> for that argument. So that works fine.
> 
> To me it seems like extending with some limited argument checking is necessary,
> either using tagging as you mentioned or bpf_func_proto, or some other hardcoded
> checking for now since the number of helpers needing this support is low.

Got it. The patch 3 commit log was too terse for me to comprehend.
Even with detailed explanation above it took me awhile to understand the
consequences of the patch... and 'goto ptr_to_mem' I misunderstood completely.
I think now we're on the same page :)

Agree that allowing PTR_TO_CTX into kfunc is safe to do in all cases.
Converting PTR_TO_MEM to PTR_TO_BTF_ID is also safe when kernel side 'struct foo'
contains only scalars. The patches don't have this check yet (as far as I can see).
That's the only missing piece.
With that in place 'struct bpf_sock_tuple' can be defined on the kernel side.
The bpf prog can do include "vmlinux.h" to use it to pass as PTR_TO_MEM
into kfunc. The patch 5 kernel function bpf_skb_ct_lookup can stay as-is.
So no tagging or extensions to bpf_func_proto are necessary.
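
Roughly, inside kernel/bpf/btf.c something like the below could gate that
conversion (a one-level sketch only, using the existing btf_type_* accessors;
a real check would also have to walk nested structs/unions and scalar arrays
so that e.g. struct bpf_sock_tuple qualifies):

	/* true if every direct member of the kernel-side struct is a scalar */
	static bool btf_struct_members_are_scalar(const struct btf *btf,
						  const struct btf_type *t)
	{
		const struct btf_member *member;
		u16 i;

		if (!btf_type_is_struct(t))
			return false;

		for (i = 0, member = btf_type_member(t);
		     i < btf_type_vlen(t); i++, member++) {
			const struct btf_type *mt;

			mt = btf_type_skip_modifiers(btf, member->type, NULL);
			if (!btf_type_is_scalar(mt))
				return false;
		}
		return true;
	}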

The piece I'm still missing is why you need two additional *btf_struct_access.
Why do you want to restrict read access?
The bpf-tcp infra has bpf_tcp_ca_btf_struct_access() to allow-list
few safe fields for writing.
Is there a use case to write into 'struct nf_conn' from bpf prog? Probably not yet.
Then let's keep the default btf_struct_access() behavior for now.
The patch 5 will be defining bpf_xdp_ct_lookup_tcp/bpf_skb_ct_lookup_tcp
and no callbacks at all.
acquire/release are probably cleaner as explicit btf_id_list-s.
Similar to btf_id_list for PTR_TO_BTF_ID_OR_NULL vs PTR_TO_BTF_ID return type.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers
  2021-11-06 18:13         ` Alexei Starovoitov
@ 2021-11-07 15:44           ` Kumar Kartikeya Dwivedi
  0 siblings, 0 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2021-11-07 15:44 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Song Liu, Yonghong Song, John Fastabend,
	Maxim Mikityanskiy, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, netdev, netfilter-devel

On Sat, Nov 06, 2021 at 11:43:28PM IST, Alexei Starovoitov wrote:
> On Sat, Nov 06, 2021 at 02:43:12AM +0530, Kumar Kartikeya Dwivedi wrote:
> >
> > Right now only PTR_TO_BTF_ID and PTR_TO_SOCK and scalars are supported, as you
> > noted, for kfunc arguments.
> >
> > So in 3/6 I move the PTR_TO_CTX block before btf_is_kernel check, that means if
> > reg type is PTR_TO_CTX and it matches the argument for the program, it will use
> > that, otherwise it moves to btf_is_kernel(btf) block, which checks if reg->type
> > is PTR_TO_BTF_ID or one of PTR_TO_SOCK* and does struct match for those. Next, I
> > punt to ptr_to_mem for the rest of the cases, which I think is problematic,
> > since now you may pass PTR_TO_MEM where some kfunc wants a PTR_TO_BTF_ID.
> >
> > But without bpf_func_proto, I am not sure we can decide what is expected in the
> > kfunc. For something like bpf_sock_tuple, we'd want a PTR_TO_MEM, but taking in
> > a PTR_TO_BTF_ID also isn't problematic since it is just data, but for a struct
> > embedding pointers or other cases, it may be a problem.
> >
> > For PTR_TO_CTX in kfunc case, based on my reading and testing, it will reject
> > any attempts to pass anything other than PTR_TO_CTX due to btf_get_prog_ctx_type
> > for that argument. So that works fine.
> >
> > To me it seems like extending with some limited argument checking is necessary,
> > either using tagging as you mentioned or bpf_func_proto, or some other hardcoded
> > checking for now since the number of helpers needing this support is low.
>
> Got it. The patch 3 commit log was too terse for me to comprehend.
> Even with detailed explanation above it took me awhile to understand the
> consequences of the patch... and 'goto ptr_to_mem' I misunderstood completely.
> I think now we're on the same page :)
>
> Agree that allowing PTR_TO_CTX into kfunc is safe to do in all cases.
> Converting PTR_TO_MEM to PTR_TO_BTF_ID is also safe when kernel side 'struct foo'
> contains only scalars. The patches don't have this check yet (as far as I can see).
> That's the only missing piece.

This is a great idea! I think this does address the thing I was worried about.

> With that in place 'struct bpf_sock_tuple' can be defined on the kernel side.
> The bpf prog can do include "vmlinux.h" to use it to pass as PTR_TO_MEM
> into kfunc. The patch 5 kernel function bpf_skb_ct_lookup can stay as-is.
> So no tagging or extensions to bpf_func_proto are necessary.
>
> The piece I'm still missing is why you need two additional *btf_struct_access.
> Why do you want to restrict read access?
> The bpf-tcp infra has bpf_tcp_ca_btf_struct_access() to allow-list
> few safe fields for writing.
> Is there a use case to write into 'struct nf_conn' from bpf prog? Probably not yet.
> Then let's keep the default btf_struct_access() behavior for now.
> The patch 5 will be defining bpf_xdp_ct_lookup_tcp/bpf_skb_ct_lookup_tcp
> and no callbacks at all.
> acquire/release are probably cleaner as explicit btf_id_list-s.
> Similar to btf_id_list for PTR_TO_BTF_ID_OR_NULL vs PTR_TO_BTF_ID return type.

I agree with everything. I'll rework the BPF stuff like this. Thanks!

--
Kartikeya

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread, other threads:[~2021-11-07 15:44 UTC | newest]

Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-10-30 14:46 [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers Kumar Kartikeya Dwivedi
2021-10-30 14:46 ` [PATCH RFC bpf-next v1 1/6] bpf: Refactor bpf_check_mod_kfunc_call Kumar Kartikeya Dwivedi
2021-10-30 14:46 ` [PATCH RFC bpf-next v1 2/6] bpf: Remove DEFINE_KFUNC_BTF_ID_SET Kumar Kartikeya Dwivedi
2021-10-30 14:46 ` [PATCH RFC bpf-next v1 3/6] bpf: Extend kfunc with PTR_TO_CTX and PTR_TO_MEM arguments Kumar Kartikeya Dwivedi
2021-10-30 14:46 ` [PATCH RFC bpf-next v1 4/6] bpf: Add reference tracking support to kfunc returned PTR_TO_BTF_ID Kumar Kartikeya Dwivedi
2021-10-30 14:46 ` [PATCH RFC bpf-next v1 5/6] net: netfilter: Add unstable CT lookup helper for XDP and TC-BPF Kumar Kartikeya Dwivedi
2021-10-31 19:10   ` Florian Westphal
2021-11-01 19:49     ` Toke Høiland-Jørgensen
2021-11-02 20:43       ` Florian Westphal
2021-11-05 20:48         ` Kumar Kartikeya Dwivedi
2021-11-02 23:19     ` Alexei Starovoitov
2021-10-30 14:46 ` [PATCH RFC bpf-next v1 6/6] selftests/bpf: Add referenced PTR_TO_BTF_ID selftest Kumar Kartikeya Dwivedi
2021-11-02 23:16 ` [PATCH RFC bpf-next v1 0/6] Introduce unstable CT lookup helpers Alexei Starovoitov
2021-11-04 12:55   ` Kumar Kartikeya Dwivedi
2021-11-05 20:49     ` Alexei Starovoitov
2021-11-05 21:13       ` Kumar Kartikeya Dwivedi
2021-11-06 18:13         ` Alexei Starovoitov
2021-11-07 15:44           ` Kumar Kartikeya Dwivedi
