* [PATCH bpf-next v5 0/8] New nf_conntrack kfuncs for insertion, changing timeout, status
@ 2022-06-23 19:26 Kumar Kartikeya Dwivedi
  2022-06-23 19:26 ` [PATCH bpf-next v5 1/8] bpf: Add support for forcing kfunc args to be referenced Kumar Kartikeya Dwivedi
                   ` (7 more replies)
  0 siblings, 8 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-06-23 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Pablo Neira Ayuso, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, Lorenzo Bianconi, netdev,
	netfilter-devel

Introduce the following new kfuncs:
 - bpf_{xdp,skb}_ct_alloc
 - bpf_ct_insert_entry
 - bpf_ct_{set,change}_timeout
 - bpf_ct_{set,change}_status

The setting of timeout and status on an allocated or inserted/looked up
CT is the same as via the ctnetlink interface, hence that code is
refactored and shared with the kfuncs. It is ensured that an allocated
CT cannot be passed to kfuncs that expect an inserted CT, and vice
versa. Please see the individual patches for details.

Changelog:
----------
v4 -> v5:
v4: https://lore.kernel.org/bpf/cover.1653600577.git.lorenzo@kernel.org

 * Drop read-only PTR_TO_BTF_ID approach, use struct nf_conn___init (Alexei)
 * Drop acquire release pair code that is no longer required (Alexei)
 * Disable writes into nf_conn, use dedicated helpers (Florian, Alexei)
 * Refactor and share ctnetlink code for setting timeout and status
 * Do strict type matching on finding __ref suffix on argument to
   prevent passing nf_conn___init as nf_conn (offset = 0, match on walk)
 * Remove bpf_ct_opts parameter from bpf_ct_insert_entry
 * Update selftests for new additions, add more negative tests

v3 -> v4:
v3: https://lore.kernel.org/bpf/cover.1652870182.git.lorenzo@kernel.org

 * split bpf_xdp_ct_add into bpf_xdp_ct_alloc/bpf_skb_ct_alloc and
   bpf_ct_insert_entry
 * add verifier code to properly populate/configure ct entry
 * improve selftests

v2 -> v3:
v2: https://lore.kernel.org/bpf/cover.1652372970.git.lorenzo@kernel.org

 * add bpf_xdp_ct_add and bpf_ct_refresh_timeout kfunc helpers
 * remove conntrack dependency from selftests
 * add support for forcing kfunc args to be referenced and related selftests

v1 -> v2:
v1: https://lore.kernel.org/bpf/1327f8f5696ff2bc60400e8f3b79047914ccc837.1651595019.git.lorenzo@kernel.org

 * add bpf_ct_refresh_timeout kfunc selftest

Kumar Kartikeya Dwivedi (5):
  bpf: Add support for forcing kfunc args to be referenced
  net: netfilter: Deduplicate code in bpf_{xdp,skb}_ct_lookup
  net: netfilter: Add kfuncs to set and change CT timeout
  selftests/bpf: Add verifier tests for forced kfunc ref args
  selftests/bpf: Add negative tests for new nf_conntrack kfuncs

Lorenzo Bianconi (3):
  net: netfilter: Add kfuncs to allocate and insert CT
  net: netfilter: Add kfuncs to set and change CT status
  selftests/bpf: Add tests for new nf_conntrack kfuncs

 include/net/netfilter/nf_conntrack_core.h     |  19 +
 kernel/bpf/btf.c                              |  48 ++-
 net/bpf/test_run.c                            |   5 +
 net/netfilter/nf_conntrack_bpf.c              | 330 +++++++++++++++---
 net/netfilter/nf_conntrack_core.c             |  62 ++++
 net/netfilter/nf_conntrack_netlink.c          |  54 +--
 .../testing/selftests/bpf/prog_tests/bpf_nf.c |  64 +++-
 .../testing/selftests/bpf/progs/test_bpf_nf.c |  85 ++++-
 .../selftests/bpf/progs/test_bpf_nf_fail.c    | 134 +++++++
 tools/testing/selftests/bpf/verifier/calls.c  |  53 +++
 10 files changed, 727 insertions(+), 127 deletions(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_bpf_nf_fail.c

-- 
2.36.1



* [PATCH bpf-next v5 1/8] bpf: Add support for forcing kfunc args to be referenced
  2022-06-23 19:26 [PATCH bpf-next v5 0/8] New nf_conntrack kfuncs for insertion, changing timeout, status Kumar Kartikeya Dwivedi
@ 2022-06-23 19:26 ` Kumar Kartikeya Dwivedi
  2022-06-29  3:23   ` Alexei Starovoitov
  2022-06-23 19:26 ` [PATCH bpf-next v5 2/8] net: netfilter: Deduplicate code in bpf_{xdp,skb}_ct_lookup Kumar Kartikeya Dwivedi
                   ` (6 subsequent siblings)
  7 siblings, 1 reply; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-06-23 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Pablo Neira Ayuso, Florian Westphal,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen,
	Lorenzo Bianconi, netdev, netfilter-devel

Similar to how we detect mem, size pairs in kfuncs, teach the verifier
to treat a __ref suffix on an argument name to imply that it must be a
referenced pointer when passed to the kfunc. This is required to ensure
that kfuncs that operate on some object only work on acquired pointers,
and not on a normal PTR_TO_BTF_ID of the same type that can be obtained
by pointer walking. Release functions need not specify such a suffix on
their release arguments, as they are already expected to receive one
referenced argument.

Note that we use strict type matching when a __ref suffix is present on
the argument.
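
For example, with the test kfunc added below, a sketch of what the
verifier accepts and rejects (__ksym declarations assumed, as in the
selftests):

	struct prog_test_ref_kfunc *p;
	unsigned long sp = 0;

	p = bpf_kfunc_call_test_acquire(&sp);	/* acquired: ref_obj_id > 0 */
	if (!p)
		return 0;
	bpf_kfunc_call_test_ref(p);		/* ok: p is referenced */
	bpf_kfunc_call_test_ref(p->next);	/* rejected: "R1 must be referenced" */
	bpf_kfunc_call_test_release(p);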

Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 kernel/bpf/btf.c   | 48 ++++++++++++++++++++++++++++++++++------------
 net/bpf/test_run.c |  5 +++++
 2 files changed, 41 insertions(+), 12 deletions(-)

diff --git a/kernel/bpf/btf.c b/kernel/bpf/btf.c
index f08037c31dd7..7b4128e3359a 100644
--- a/kernel/bpf/btf.c
+++ b/kernel/bpf/btf.c
@@ -6136,18 +6136,13 @@ static bool __btf_type_is_scalar_struct(struct bpf_verifier_log *log,
 	return true;
 }
 
-static bool is_kfunc_arg_mem_size(const struct btf *btf,
-				  const struct btf_param *arg,
-				  const struct bpf_reg_state *reg)
+static bool btf_param_match_suffix(const struct btf *btf,
+				   const struct btf_param *arg,
+				   const char *suffix)
 {
-	int len, sfx_len = sizeof("__sz") - 1;
-	const struct btf_type *t;
+	int len, sfx_len = strlen(suffix);
 	const char *param_name;
 
-	t = btf_type_skip_modifiers(btf, arg->type, NULL);
-	if (!btf_type_is_scalar(t) || reg->type != SCALAR_VALUE)
-		return false;
-
 	/* In the future, this can be ported to use BTF tagging */
 	param_name = btf_name_by_offset(btf, arg->name_off);
 	if (str_is_empty(param_name))
@@ -6156,22 +6151,41 @@ static bool is_kfunc_arg_mem_size(const struct btf *btf,
 	if (len < sfx_len)
 		return false;
 	param_name += len - sfx_len;
-	if (strncmp(param_name, "__sz", sfx_len))
+	if (strncmp(param_name, suffix, sfx_len))
 		return false;
 
 	return true;
 }
 
+static bool is_kfunc_arg_ref(const struct btf *btf,
+			     const struct btf_param *arg)
+{
+	return btf_param_match_suffix(btf, arg, "__ref");
+}
+
+static bool is_kfunc_arg_mem_size(const struct btf *btf,
+				  const struct btf_param *arg,
+				  const struct bpf_reg_state *reg)
+{
+	const struct btf_type *t;
+
+	t = btf_type_skip_modifiers(btf, arg->type, NULL);
+	if (!btf_type_is_scalar(t) || reg->type != SCALAR_VALUE)
+		return false;
+
+	return btf_param_match_suffix(btf, arg, "__sz");
+}
+
 static int btf_check_func_arg_match(struct bpf_verifier_env *env,
 				    const struct btf *btf, u32 func_id,
 				    struct bpf_reg_state *regs,
 				    bool ptr_to_mem_ok)
 {
 	enum bpf_prog_type prog_type = resolve_prog_type(env->prog);
+	bool rel = false, kptr_get = false, arg_ref = false;
 	struct bpf_verifier_log *log = &env->log;
 	u32 i, nargs, ref_id, ref_obj_id = 0;
 	bool is_kfunc = btf_is_kernel(btf);
-	bool rel = false, kptr_get = false;
 	const char *func_name, *ref_tname;
 	const struct btf_type *t, *ref_t;
 	const struct btf_param *args;
@@ -6231,6 +6245,15 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
 			return -EINVAL;
 		}
 
+		/* Check if argument must be a referenced pointer, args + i has
+		 * been verified to be a pointer (after skipping modifiers).
+		 */
+		arg_ref = is_kfunc_arg_ref(btf, args + i);
+		if (is_kfunc && arg_ref && !reg->ref_obj_id) {
+			bpf_log(log, "R%d must be referenced\n", regno);
+			return -EINVAL;
+		}
+
 		ref_t = btf_type_skip_modifiers(btf, t->type, &ref_id);
 		ref_tname = btf_name_by_offset(btf, ref_t->name_off);
 
@@ -6332,7 +6355,8 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env,
 			reg_ref_tname = btf_name_by_offset(reg_btf,
 							   reg_ref_t->name_off);
 			if (!btf_struct_ids_match(log, reg_btf, reg_ref_id,
-						  reg->off, btf, ref_id, rel && reg->ref_obj_id)) {
+						  reg->off, btf, ref_id,
+						  arg_ref || (rel && reg->ref_obj_id))) {
 				bpf_log(log, "kernel function %s args#%d expected pointer to %s %s but R%d has a pointer to %s %s\n",
 					func_name, i,
 					btf_type_str(ref_t), ref_tname,
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index 2ca96acbc50a..4314b8172b52 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -691,6 +691,10 @@ noinline void bpf_kfunc_call_test_mem_len_fail2(u64 *mem, int len)
 {
 }
 
+noinline void bpf_kfunc_call_test_ref(struct prog_test_ref_kfunc *p__ref)
+{
+}
+
 __diag_pop();
 
 ALLOW_ERROR_INJECTION(bpf_modify_return_test, ERRNO);
@@ -714,6 +718,7 @@ BTF_ID(func, bpf_kfunc_call_test_fail3)
 BTF_ID(func, bpf_kfunc_call_test_mem_len_pass1)
 BTF_ID(func, bpf_kfunc_call_test_mem_len_fail1)
 BTF_ID(func, bpf_kfunc_call_test_mem_len_fail2)
+BTF_ID(func, bpf_kfunc_call_test_ref)
 BTF_SET_END(test_sk_check_kfunc_ids)
 
 BTF_SET_START(test_sk_acquire_kfunc_ids)
-- 
2.36.1



* [PATCH bpf-next v5 2/8] net: netfilter: Deduplicate code in bpf_{xdp,skb}_ct_lookup
  2022-06-23 19:26 [PATCH bpf-next v5 0/8] New nf_conntrack kfuncs for insertion, changing timeout, status Kumar Kartikeya Dwivedi
  2022-06-23 19:26 ` [PATCH bpf-next v5 1/8] bpf: Add support for forcing kfunc args to be referenced Kumar Kartikeya Dwivedi
@ 2022-06-23 19:26 ` Kumar Kartikeya Dwivedi
  2022-06-23 19:26 ` [PATCH bpf-next v5 3/8] net: netfilter: Add kfuncs to allocate and insert CT Kumar Kartikeya Dwivedi
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-06-23 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Pablo Neira Ayuso, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, Lorenzo Bianconi, netdev,
	netfilter-devel

Move the common checks inside the common function, so that the only
difference between the two callers is how the struct net * is obtained
from the ctx. No functional change intended.
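
With this patch, both wrappers reduce to the following shape (sketch of
the shared tail):

	/* XDP: caller_net = dev_net(ctx->rxq->dev);
	 * TC:  caller_net = skb->dev ? dev_net(skb->dev) : sock_net(skb->sk);
	 */
	nfct = __bpf_nf_ct_lookup(caller_net, bpf_tuple, tuple__sz, opts, opts__sz);
	if (IS_ERR(nfct)) {
		if (opts)
			opts->error = PTR_ERR(nfct);
		return NULL;
	}
	return nfct;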

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 net/netfilter/nf_conntrack_bpf.c | 52 +++++++++++---------------------
 1 file changed, 18 insertions(+), 34 deletions(-)

diff --git a/net/netfilter/nf_conntrack_bpf.c b/net/netfilter/nf_conntrack_bpf.c
index bc4d5cd63a94..5cb1820054fb 100644
--- a/net/netfilter/nf_conntrack_bpf.c
+++ b/net/netfilter/nf_conntrack_bpf.c
@@ -57,16 +57,19 @@ enum {
 
 static struct nf_conn *__bpf_nf_ct_lookup(struct net *net,
 					  struct bpf_sock_tuple *bpf_tuple,
-					  u32 tuple_len, u8 protonum,
-					  s32 netns_id, u8 *dir)
+					  u32 tuple_len, struct bpf_ct_opts *opts,
+					  u32 opts_len)
 {
 	struct nf_conntrack_tuple_hash *hash;
 	struct nf_conntrack_tuple tuple;
 	struct nf_conn *ct;
 
-	if (unlikely(protonum != IPPROTO_TCP && protonum != IPPROTO_UDP))
+	if (!opts || !bpf_tuple || opts->reserved[0] || opts->reserved[1] ||
+	    opts_len != NF_BPF_CT_OPTS_SZ)
+		return ERR_PTR(-EINVAL);
+	if (unlikely(opts->l4proto != IPPROTO_TCP && opts->l4proto != IPPROTO_UDP))
 		return ERR_PTR(-EPROTO);
-	if (unlikely(netns_id < BPF_F_CURRENT_NETNS))
+	if (unlikely(opts->netns_id < BPF_F_CURRENT_NETNS))
 		return ERR_PTR(-EINVAL);
 
 	memset(&tuple, 0, sizeof(tuple));
@@ -89,23 +92,22 @@ static struct nf_conn *__bpf_nf_ct_lookup(struct net *net,
 		return ERR_PTR(-EAFNOSUPPORT);
 	}
 
-	tuple.dst.protonum = protonum;
+	tuple.dst.protonum = opts->l4proto;
 
-	if (netns_id >= 0) {
-		net = get_net_ns_by_id(net, netns_id);
+	if (opts->netns_id >= 0) {
+		net = get_net_ns_by_id(net, opts->netns_id);
 		if (unlikely(!net))
 			return ERR_PTR(-ENONET);
 	}
 
 	hash = nf_conntrack_find_get(net, &nf_ct_zone_dflt, &tuple);
-	if (netns_id >= 0)
+	if (opts->netns_id >= 0)
 		put_net(net);
 	if (!hash)
 		return ERR_PTR(-ENOENT);
 
 	ct = nf_ct_tuplehash_to_ctrack(hash);
-	if (dir)
-		*dir = NF_CT_DIRECTION(hash);
+	opts->dir = NF_CT_DIRECTION(hash);
 
 	return ct;
 }
@@ -138,20 +140,11 @@ bpf_xdp_ct_lookup(struct xdp_md *xdp_ctx, struct bpf_sock_tuple *bpf_tuple,
 	struct net *caller_net;
 	struct nf_conn *nfct;
 
-	BUILD_BUG_ON(sizeof(struct bpf_ct_opts) != NF_BPF_CT_OPTS_SZ);
-
-	if (!opts)
-		return NULL;
-	if (!bpf_tuple || opts->reserved[0] || opts->reserved[1] ||
-	    opts__sz != NF_BPF_CT_OPTS_SZ) {
-		opts->error = -EINVAL;
-		return NULL;
-	}
 	caller_net = dev_net(ctx->rxq->dev);
-	nfct = __bpf_nf_ct_lookup(caller_net, bpf_tuple, tuple__sz, opts->l4proto,
-				  opts->netns_id, &opts->dir);
+	nfct = __bpf_nf_ct_lookup(caller_net, bpf_tuple, tuple__sz, opts, opts__sz);
 	if (IS_ERR(nfct)) {
-		opts->error = PTR_ERR(nfct);
+		if (opts)
+			opts->error = PTR_ERR(nfct);
 		return NULL;
 	}
 	return nfct;
@@ -181,20 +174,11 @@ bpf_skb_ct_lookup(struct __sk_buff *skb_ctx, struct bpf_sock_tuple *bpf_tuple,
 	struct net *caller_net;
 	struct nf_conn *nfct;
 
-	BUILD_BUG_ON(sizeof(struct bpf_ct_opts) != NF_BPF_CT_OPTS_SZ);
-
-	if (!opts)
-		return NULL;
-	if (!bpf_tuple || opts->reserved[0] || opts->reserved[1] ||
-	    opts__sz != NF_BPF_CT_OPTS_SZ) {
-		opts->error = -EINVAL;
-		return NULL;
-	}
 	caller_net = skb->dev ? dev_net(skb->dev) : sock_net(skb->sk);
-	nfct = __bpf_nf_ct_lookup(caller_net, bpf_tuple, tuple__sz, opts->l4proto,
-				  opts->netns_id, &opts->dir);
+	nfct = __bpf_nf_ct_lookup(caller_net, bpf_tuple, tuple__sz, opts, opts__sz);
 	if (IS_ERR(nfct)) {
-		opts->error = PTR_ERR(nfct);
+		if (opts)
+			opts->error = PTR_ERR(nfct);
 		return NULL;
 	}
 	return nfct;
-- 
2.36.1



* [PATCH bpf-next v5 3/8] net: netfilter: Add kfuncs to allocate and insert CT
  2022-06-23 19:26 [PATCH bpf-next v5 0/8] New nf_conntrack kfuncs for insertion, changing timeout, status Kumar Kartikeya Dwivedi
  2022-06-23 19:26 ` [PATCH bpf-next v5 1/8] bpf: Add support for forcing kfunc args to be referenced Kumar Kartikeya Dwivedi
  2022-06-23 19:26 ` [PATCH bpf-next v5 2/8] net: netfilter: Deduplicate code in bpf_{xdp,skb}_ct_lookup Kumar Kartikeya Dwivedi
@ 2022-06-23 19:26 ` Kumar Kartikeya Dwivedi
  2022-06-23 19:26 ` [PATCH bpf-next v5 4/8] net: netfilter: Add kfuncs to set and change CT timeout Kumar Kartikeya Dwivedi
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-06-23 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Lorenzo Bianconi, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Pablo Neira Ayuso, Florian Westphal,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen, netdev,
	netfilter-devel

From: Lorenzo Bianconi <lorenzo@kernel.org>

Introduce bpf_xdp_ct_alloc, bpf_skb_ct_alloc and bpf_ct_insert_entry
kfuncs in order to insert a new entry from XDP and TC programs.
Introduce bpf_nf_ct_tuple_parse utility routine to consolidate common
code.

Extract a helper, __nf_ct_set_timeout, used by both the ctnetlink and
nf_conntrack_bpf code, into nf_conntrack_core, so that nf_conntrack_bpf
doesn't need a dependency on CONFIG_NF_CT_NETLINK. Later, this helper
will be reused to set the timeout of an allocated but not yet inserted
CT entry.

The allocation functions return struct nf_conn___init instead of
nf_conn, to distinguish allocated CT from an already inserted or looked
up CT. This is later used to enforce restrictions on what kfuncs
allocated CT can be used with.
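
For instance, the distinct return type lets the verifier reject passing
a freshly allocated entry to a kfunc that expects an inserted one
(sketch, mirroring the negative selftests added later in this series):

	/* The BPF program may declare the alloc kfunc as returning
	 * struct nf_conn *, but kernel BTF types it as nf_conn___init,
	 * which is what the verifier tracks. */
	ct = bpf_skb_ct_alloc(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
	if (!ct)
		return 0;
	/* Rejected: args#0 expected pointer to STRUCT nf_conn. The
	 * allocated reference must be consumed with bpf_ct_insert_entry(). */
	bpf_ct_release(ct);
	return 0;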

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/net/netfilter/nf_conntrack_core.h |  15 ++
 net/netfilter/nf_conntrack_bpf.c          | 213 +++++++++++++++++++---
 net/netfilter/nf_conntrack_netlink.c      |   8 +-
 3 files changed, 209 insertions(+), 27 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_core.h b/include/net/netfilter/nf_conntrack_core.h
index 37866c8386e2..83a60c684e6c 100644
--- a/include/net/netfilter/nf_conntrack_core.h
+++ b/include/net/netfilter/nf_conntrack_core.h
@@ -84,4 +84,19 @@ void nf_conntrack_lock(spinlock_t *lock);
 
 extern spinlock_t nf_conntrack_expect_lock;
 
+/* ctnetlink code shared by both ctnetlink and nf_conntrack_bpf */
+
+#if (IS_BUILTIN(CONFIG_NF_CONNTRACK) && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) || \
+    (IS_MODULE(CONFIG_NF_CONNTRACK) && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES) || \
+    IS_ENABLED(CONFIG_NF_CT_NETLINK))
+
+static inline void __nf_ct_set_timeout(struct nf_conn *ct, u64 timeout)
+{
+	if (timeout > INT_MAX)
+		timeout = INT_MAX;
+	WRITE_ONCE(ct->timeout, nfct_time_stamp + (u32)timeout);
+}
+
+#endif
+
 #endif /* _NF_CONNTRACK_CORE_H */
diff --git a/net/netfilter/nf_conntrack_bpf.c b/net/netfilter/nf_conntrack_bpf.c
index 5cb1820054fb..1d3c1d1e2d8b 100644
--- a/net/netfilter/nf_conntrack_bpf.c
+++ b/net/netfilter/nf_conntrack_bpf.c
@@ -55,6 +55,94 @@ enum {
 	NF_BPF_CT_OPTS_SZ = 12,
 };
 
+static int bpf_nf_ct_tuple_parse(struct bpf_sock_tuple *bpf_tuple,
+				 u32 tuple_len, u8 protonum, u8 dir,
+				 struct nf_conntrack_tuple *tuple)
+{
+	union nf_inet_addr *src = dir ? &tuple->dst.u3 : &tuple->src.u3;
+	union nf_inet_addr *dst = dir ? &tuple->src.u3 : &tuple->dst.u3;
+	union nf_conntrack_man_proto *sport = dir ? (void *)&tuple->dst.u
+						  : &tuple->src.u;
+	union nf_conntrack_man_proto *dport = dir ? &tuple->src.u
+						  : (void *)&tuple->dst.u;
+
+	if (unlikely(protonum != IPPROTO_TCP && protonum != IPPROTO_UDP))
+		return -EPROTO;
+
+	memset(tuple, 0, sizeof(*tuple));
+
+	switch (tuple_len) {
+	case sizeof(bpf_tuple->ipv4):
+		tuple->src.l3num = AF_INET;
+		src->ip = bpf_tuple->ipv4.saddr;
+		sport->tcp.port = bpf_tuple->ipv4.sport;
+		dst->ip = bpf_tuple->ipv4.daddr;
+		dport->tcp.port = bpf_tuple->ipv4.dport;
+		break;
+	case sizeof(bpf_tuple->ipv6):
+		tuple->src.l3num = AF_INET6;
+		memcpy(src->ip6, bpf_tuple->ipv6.saddr, sizeof(bpf_tuple->ipv6.saddr));
+		sport->tcp.port = bpf_tuple->ipv6.sport;
+		memcpy(dst->ip6, bpf_tuple->ipv6.daddr, sizeof(bpf_tuple->ipv6.daddr));
+		dport->tcp.port = bpf_tuple->ipv6.dport;
+		break;
+	default:
+		return -EAFNOSUPPORT;
+	}
+	tuple->dst.protonum = protonum;
+	tuple->dst.dir = dir;
+
+	return 0;
+}
+
+static struct nf_conn *
+__bpf_nf_ct_alloc_entry(struct net *net, struct bpf_sock_tuple *bpf_tuple,
+			u32 tuple_len, struct bpf_ct_opts *opts, u32 opts_len,
+			u32 timeout)
+{
+	struct nf_conntrack_tuple otuple, rtuple;
+	struct nf_conn *ct;
+	int err;
+
+	if (!opts || !bpf_tuple || opts->reserved[0] || opts->reserved[1] ||
+	    opts_len != NF_BPF_CT_OPTS_SZ)
+		return ERR_PTR(-EINVAL);
+
+	if (unlikely(opts->netns_id < BPF_F_CURRENT_NETNS))
+		return ERR_PTR(-EINVAL);
+
+	err = bpf_nf_ct_tuple_parse(bpf_tuple, tuple_len, opts->l4proto,
+				    IP_CT_DIR_ORIGINAL, &otuple);
+	if (err < 0)
+		return ERR_PTR(err);
+
+	err = bpf_nf_ct_tuple_parse(bpf_tuple, tuple_len, opts->l4proto,
+				    IP_CT_DIR_REPLY, &rtuple);
+	if (err < 0)
+		return ERR_PTR(err);
+
+	if (opts->netns_id >= 0) {
+		net = get_net_ns_by_id(net, opts->netns_id);
+		if (unlikely(!net))
+			return ERR_PTR(-ENONET);
+	}
+
+	ct = nf_conntrack_alloc(net, &nf_ct_zone_dflt, &otuple, &rtuple,
+				GFP_ATOMIC);
+	if (IS_ERR(ct))
+		goto out;
+
+	memset(&ct->proto, 0, sizeof(ct->proto));
+	__nf_ct_set_timeout(ct, timeout * HZ);
+	ct->status |= IPS_CONFIRMED;
+
+out:
+	if (opts->netns_id >= 0)
+		put_net(net);
+
+	return ct;
+}
+
 static struct nf_conn *__bpf_nf_ct_lookup(struct net *net,
 					  struct bpf_sock_tuple *bpf_tuple,
 					  u32 tuple_len, struct bpf_ct_opts *opts,
@@ -63,6 +151,7 @@ static struct nf_conn *__bpf_nf_ct_lookup(struct net *net,
 	struct nf_conntrack_tuple_hash *hash;
 	struct nf_conntrack_tuple tuple;
 	struct nf_conn *ct;
+	int err;
 
 	if (!opts || !bpf_tuple || opts->reserved[0] || opts->reserved[1] ||
 	    opts_len != NF_BPF_CT_OPTS_SZ)
@@ -72,27 +161,10 @@ static struct nf_conn *__bpf_nf_ct_lookup(struct net *net,
 	if (unlikely(opts->netns_id < BPF_F_CURRENT_NETNS))
 		return ERR_PTR(-EINVAL);
 
-	memset(&tuple, 0, sizeof(tuple));
-	switch (tuple_len) {
-	case sizeof(bpf_tuple->ipv4):
-		tuple.src.l3num = AF_INET;
-		tuple.src.u3.ip = bpf_tuple->ipv4.saddr;
-		tuple.src.u.tcp.port = bpf_tuple->ipv4.sport;
-		tuple.dst.u3.ip = bpf_tuple->ipv4.daddr;
-		tuple.dst.u.tcp.port = bpf_tuple->ipv4.dport;
-		break;
-	case sizeof(bpf_tuple->ipv6):
-		tuple.src.l3num = AF_INET6;
-		memcpy(tuple.src.u3.ip6, bpf_tuple->ipv6.saddr, sizeof(bpf_tuple->ipv6.saddr));
-		tuple.src.u.tcp.port = bpf_tuple->ipv6.sport;
-		memcpy(tuple.dst.u3.ip6, bpf_tuple->ipv6.daddr, sizeof(bpf_tuple->ipv6.daddr));
-		tuple.dst.u.tcp.port = bpf_tuple->ipv6.dport;
-		break;
-	default:
-		return ERR_PTR(-EAFNOSUPPORT);
-	}
-
-	tuple.dst.protonum = opts->l4proto;
+	err = bpf_nf_ct_tuple_parse(bpf_tuple, tuple_len, opts->l4proto,
+				    IP_CT_DIR_ORIGINAL, &tuple);
+	if (err < 0)
+		return ERR_PTR(err);
 
 	if (opts->netns_id >= 0) {
 		net = get_net_ns_by_id(net, opts->netns_id);
@@ -116,6 +188,43 @@ __diag_push();
 __diag_ignore_all("-Wmissing-prototypes",
 		  "Global functions as their definitions will be in nf_conntrack BTF");
 
+struct nf_conn___init {
+	struct nf_conn ct;
+};
+
+/* bpf_xdp_ct_alloc - Allocate a new CT entry
+ *
+ * Parameters:
+ * @xdp_ctx	- Pointer to ctx (xdp_md) in XDP program
+ *		    Cannot be NULL
+ * @bpf_tuple	- Pointer to memory representing the tuple to look up
+ *		    Cannot be NULL
+ * @tuple__sz	- Length of the tuple structure
+ *		    Must be one of sizeof(bpf_tuple->ipv4) or
+ *		    sizeof(bpf_tuple->ipv6)
+ * @opts	- Additional options for allocation (documented above)
+ *		    Cannot be NULL
+ * @opts__sz	- Length of the bpf_ct_opts structure
+ *		    Must be NF_BPF_CT_OPTS_SZ (12)
+ */
+struct nf_conn___init *
+bpf_xdp_ct_alloc(struct xdp_md *xdp_ctx, struct bpf_sock_tuple *bpf_tuple,
+		 u32 tuple__sz, struct bpf_ct_opts *opts, u32 opts__sz)
+{
+	struct xdp_buff *ctx = (struct xdp_buff *)xdp_ctx;
+	struct nf_conn *nfct;
+
+	nfct = __bpf_nf_ct_alloc_entry(dev_net(ctx->rxq->dev), bpf_tuple, tuple__sz,
+				       opts, opts__sz, 10);
+	if (IS_ERR(nfct)) {
+		if (opts)
+			opts->error = PTR_ERR(nfct);
+		return NULL;
+	}
+
+	return (struct nf_conn___init *)nfct;
+}
+
 /* bpf_xdp_ct_lookup - Lookup CT entry for the given tuple, and acquire a
  *		       reference to it
  *
@@ -150,6 +259,40 @@ bpf_xdp_ct_lookup(struct xdp_md *xdp_ctx, struct bpf_sock_tuple *bpf_tuple,
 	return nfct;
 }
 
+/* bpf_skb_ct_alloc - Allocate a new CT entry
+ *
+ * Parameters:
+ * @skb_ctx	- Pointer to ctx (__sk_buff) in TC program
+ *		    Cannot be NULL
+ * @bpf_tuple	- Pointer to memory representing the tuple to look up
+ *		    Cannot be NULL
+ * @tuple__sz	- Length of the tuple structure
+ *		    Must be one of sizeof(bpf_tuple->ipv4) or
+ *		    sizeof(bpf_tuple->ipv6)
+ * @opts	- Additional options for allocation (documented above)
+ *		    Cannot be NULL
+ * @opts__sz	- Length of the bpf_ct_opts structure
+ *		    Must be NF_BPF_CT_OPTS_SZ (12)
+ */
+struct nf_conn___init *
+bpf_skb_ct_alloc(struct __sk_buff *skb_ctx, struct bpf_sock_tuple *bpf_tuple,
+		 u32 tuple__sz, struct bpf_ct_opts *opts, u32 opts__sz)
+{
+	struct sk_buff *skb = (struct sk_buff *)skb_ctx;
+	struct nf_conn *nfct;
+	struct net *net;
+
+	net = skb->dev ? dev_net(skb->dev) : sock_net(skb->sk);
+	nfct = __bpf_nf_ct_alloc_entry(net, bpf_tuple, tuple__sz, opts, opts__sz, 10);
+	if (IS_ERR(nfct)) {
+		if (opts)
+			opts->error = PTR_ERR(nfct);
+		return NULL;
+	}
+
+	return (struct nf_conn___init *)nfct;
+}
+
 /* bpf_skb_ct_lookup - Lookup CT entry for the given tuple, and acquire a
  *		       reference to it
  *
@@ -184,6 +327,26 @@ bpf_skb_ct_lookup(struct __sk_buff *skb_ctx, struct bpf_sock_tuple *bpf_tuple,
 	return nfct;
 }
 
+/* bpf_ct_insert_entry - Add the provided entry into a CT map
+ *
+ * This must be invoked for referenced PTR_TO_BTF_ID.
+ *
+ * @nfct__ref	 - Pointer to referenced nf_conn___init object, obtained
+ *		   using bpf_xdp_ct_alloc or bpf_skb_ct_alloc.
+ */
+struct nf_conn *bpf_ct_insert_entry(struct nf_conn___init *nfct__ref)
+{
+	struct nf_conn *nfct = (struct nf_conn *)nfct__ref;
+	int err;
+
+	err = nf_conntrack_hash_check_insert(nfct);
+	if (err < 0) {
+		nf_conntrack_free(nfct);
+		return NULL;
+	}
+	return nfct;
+}
+
 /* bpf_ct_release - Release acquired nf_conn object
  *
  * This must be invoked for referenced PTR_TO_BTF_ID, and the verifier rejects
@@ -204,21 +367,29 @@ void bpf_ct_release(struct nf_conn *nfct)
 __diag_pop()
 
 BTF_SET_START(nf_ct_xdp_check_kfunc_ids)
+BTF_ID(func, bpf_xdp_ct_alloc)
 BTF_ID(func, bpf_xdp_ct_lookup)
+BTF_ID(func, bpf_ct_insert_entry)
 BTF_ID(func, bpf_ct_release)
 BTF_SET_END(nf_ct_xdp_check_kfunc_ids)
 
 BTF_SET_START(nf_ct_tc_check_kfunc_ids)
+BTF_ID(func, bpf_skb_ct_alloc)
 BTF_ID(func, bpf_skb_ct_lookup)
+BTF_ID(func, bpf_ct_insert_entry)
 BTF_ID(func, bpf_ct_release)
 BTF_SET_END(nf_ct_tc_check_kfunc_ids)
 
 BTF_SET_START(nf_ct_acquire_kfunc_ids)
+BTF_ID(func, bpf_xdp_ct_alloc)
 BTF_ID(func, bpf_xdp_ct_lookup)
+BTF_ID(func, bpf_skb_ct_alloc)
 BTF_ID(func, bpf_skb_ct_lookup)
+BTF_ID(func, bpf_ct_insert_entry)
 BTF_SET_END(nf_ct_acquire_kfunc_ids)
 
 BTF_SET_START(nf_ct_release_kfunc_ids)
+BTF_ID(func, bpf_ct_insert_entry)
 BTF_ID(func, bpf_ct_release)
 BTF_SET_END(nf_ct_release_kfunc_ids)
 
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 722af5e309ba..0729b2f0d44f 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -2025,9 +2025,7 @@ static int ctnetlink_change_timeout(struct nf_conn *ct,
 {
 	u64 timeout = (u64)ntohl(nla_get_be32(cda[CTA_TIMEOUT])) * HZ;
 
-	if (timeout > INT_MAX)
-		timeout = INT_MAX;
-	WRITE_ONCE(ct->timeout, nfct_time_stamp + (u32)timeout);
+	__nf_ct_set_timeout(ct, timeout);
 
 	if (test_bit(IPS_DYING_BIT, &ct->status))
 		return -ETIME;
@@ -2292,9 +2290,7 @@ ctnetlink_create_conntrack(struct net *net,
 		goto err1;
 
 	timeout = (u64)ntohl(nla_get_be32(cda[CTA_TIMEOUT])) * HZ;
-	if (timeout > INT_MAX)
-		timeout = INT_MAX;
-	ct->timeout = (u32)timeout + nfct_time_stamp;
+	__nf_ct_set_timeout(ct, timeout);
 
 	rcu_read_lock();
  	if (cda[CTA_HELP]) {
-- 
2.36.1



* [PATCH bpf-next v5 4/8] net: netfilter: Add kfuncs to set and change CT timeout
  2022-06-23 19:26 [PATCH bpf-next v5 0/8] New nf_conntrack kfuncs for insertion, changing timeout, status Kumar Kartikeya Dwivedi
                   ` (2 preceding siblings ...)
  2022-06-23 19:26 ` [PATCH bpf-next v5 3/8] net: netfilter: Add kfuncs to allocate and insert CT Kumar Kartikeya Dwivedi
@ 2022-06-23 19:26 ` Kumar Kartikeya Dwivedi
  2022-06-23 19:26 ` [PATCH bpf-next v5 5/8] net: netfilter: Add kfuncs to set and change CT status Kumar Kartikeya Dwivedi
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-06-23 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Lorenzo Bianconi, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Pablo Neira Ayuso, Florian Westphal,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen, netdev,
	netfilter-devel

Introduce bpf_ct_set_timeout and bpf_ct_change_timeout kfunc helpers in
order to change the nf_conn timeout. This is the same operation as
ctnetlink_change_timeout, hence the code is shared between the two by
extracting it out to __nf_ct_change_timeout. It is also updated to
return an error when IPS_FIXED_TIMEOUT_BIT is set in ct->status, as that
check was missing.

It is required to introduce two kfuncs taking nf_conn___init and nf_conn
instead of sharing one because the __ref suffix on the parameter name
causes strict type checking. This would disallow passing nf_conn___init
to a kfunc taking nf_conn, and vice versa. We cannot remove the __ref
suffix, as we only want to accept refcounted pointers and not e.g.
ct->master.

Apart from this, bpf_ct_set_timeout is only called for a newly allocated
CT, so it doesn't need to inspect the status field just yet. Even if
sharing one helper were possible, it would make the timeout-setting
helper sensitive to the order in which status and timeout are set after
allocation.

Hence, the bpf_ct_set_* kfuncs are meant to be used on an allocated CT,
and the bpf_ct_change_* kfuncs on an inserted or looked up CT entry.
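
A sketch of the two call sites from a BPF program (declarations as in
the selftests):

	/* ct: nf_conn___init from bpf_xdp_ct_alloc()/bpf_skb_ct_alloc();
	 * timeout is in msecs and is converted with msecs_to_jiffies(). */
	bpf_ct_set_timeout(ct, 10000);

	/* ins: nf_conn from bpf_ct_insert_entry() or bpf_*_ct_lookup();
	 * may fail with -EPERM (IPS_FIXED_TIMEOUT_BIT set) or -ETIME
	 * (IPS_DYING_BIT set). */
	err = bpf_ct_change_timeout(ins, 10000);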

Co-developed-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/net/netfilter/nf_conntrack_core.h |  2 ++
 net/netfilter/nf_conntrack_bpf.c          | 34 +++++++++++++++++++++++
 net/netfilter/nf_conntrack_core.c         | 22 +++++++++++++++
 net/netfilter/nf_conntrack_netlink.c      |  9 +-----
 4 files changed, 59 insertions(+), 8 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_core.h b/include/net/netfilter/nf_conntrack_core.h
index 83a60c684e6c..3b0f7d0eebae 100644
--- a/include/net/netfilter/nf_conntrack_core.h
+++ b/include/net/netfilter/nf_conntrack_core.h
@@ -97,6 +97,8 @@ static inline void __nf_ct_set_timeout(struct nf_conn *ct, u64 timeout)
 	WRITE_ONCE(ct->timeout, nfct_time_stamp + (u32)timeout);
 }
 
+int __nf_ct_change_timeout(struct nf_conn *ct, u64 cta_timeout);
+
 #endif
 
 #endif /* _NF_CONNTRACK_CORE_H */
diff --git a/net/netfilter/nf_conntrack_bpf.c b/net/netfilter/nf_conntrack_bpf.c
index 1d3c1d1e2d8b..db04874da950 100644
--- a/net/netfilter/nf_conntrack_bpf.c
+++ b/net/netfilter/nf_conntrack_bpf.c
@@ -364,6 +364,36 @@ void bpf_ct_release(struct nf_conn *nfct)
 	nf_ct_put(nfct);
 }
 
+/* bpf_ct_set_timeout - Set timeout of allocated nf_conn
+ *
+ * Sets the default timeout of newly allocated nf_conn before insertion.
+ * This helper must be invoked for refcounted pointer to nf_conn___init.
+ *
+ * Parameters:
+ * @nfct__ref    - Pointer to referenced nf_conn object, obtained using
+ *                 bpf_xdp_ct_alloc or bpf_skb_ct_alloc.
+ * @timeout      - Timeout in msecs.
+ */
+void bpf_ct_set_timeout(struct nf_conn___init *nfct__ref, u32 timeout)
+{
+	__nf_ct_set_timeout((struct nf_conn *)nfct__ref, msecs_to_jiffies(timeout));
+}
+
+/* bpf_ct_change_timeout - Change timeout of inserted nf_conn
+ *
+ * Change timeout associated of the inserted or looked up nf_conn.
+ * This helper must be invoked for refcounted pointer to nf_conn.
+ *
+ * Parameters:
+ * @nfct__ref    - Pointer to referenced nf_conn object, obtained using
+ *		   bpf_ct_insert_entry, bpf_xdp_ct_lookup, or bpf_skb_ct_lookup.
+ * @timeout      - New timeout in msecs.
+ */
+int bpf_ct_change_timeout(struct nf_conn *nfct__ref, u32 timeout)
+{
+	return __nf_ct_change_timeout(nfct__ref, msecs_to_jiffies(timeout));
+}
+
 __diag_pop()
 
 BTF_SET_START(nf_ct_xdp_check_kfunc_ids)
@@ -371,6 +401,8 @@ BTF_ID(func, bpf_xdp_ct_alloc)
 BTF_ID(func, bpf_xdp_ct_lookup)
 BTF_ID(func, bpf_ct_insert_entry)
 BTF_ID(func, bpf_ct_release)
+BTF_ID(func, bpf_ct_set_timeout);
+BTF_ID(func, bpf_ct_change_timeout);
 BTF_SET_END(nf_ct_xdp_check_kfunc_ids)
 
 BTF_SET_START(nf_ct_tc_check_kfunc_ids)
@@ -378,6 +410,8 @@ BTF_ID(func, bpf_skb_ct_alloc)
 BTF_ID(func, bpf_skb_ct_lookup)
 BTF_ID(func, bpf_ct_insert_entry)
 BTF_ID(func, bpf_ct_release)
+BTF_ID(func, bpf_ct_set_timeout);
+BTF_ID(func, bpf_ct_change_timeout);
 BTF_SET_END(nf_ct_tc_check_kfunc_ids)
 
 BTF_SET_START(nf_ct_acquire_kfunc_ids)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 082a2fd8d85b..572f59a5e936 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -2786,3 +2786,25 @@ int nf_conntrack_init_net(struct net *net)
 	free_percpu(net->ct.stat);
 	return ret;
 }
+
+#if (IS_BUILTIN(CONFIG_NF_CONNTRACK) && IS_ENABLED(CONFIG_DEBUG_INFO_BTF)) || \
+    (IS_MODULE(CONFIG_NF_CONNTRACK) && IS_ENABLED(CONFIG_DEBUG_INFO_BTF_MODULES) || \
+    IS_ENABLED(CONFIG_NF_CT_NETLINK))
+
+/* ctnetlink code shared by both ctnetlink and nf_conntrack_bpf */
+
+int __nf_ct_change_timeout(struct nf_conn *ct, u64 timeout)
+{
+	if (test_bit(IPS_FIXED_TIMEOUT_BIT, &ct->status))
+		return -EPERM;
+
+	__nf_ct_set_timeout(ct, timeout);
+
+	if (test_bit(IPS_DYING_BIT, &ct->status))
+		return -ETIME;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(__nf_ct_change_timeout);
+
+#endif
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index 0729b2f0d44f..b1de07c73845 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -2023,14 +2023,7 @@ static int ctnetlink_change_helper(struct nf_conn *ct,
 static int ctnetlink_change_timeout(struct nf_conn *ct,
 				    const struct nlattr * const cda[])
 {
-	u64 timeout = (u64)ntohl(nla_get_be32(cda[CTA_TIMEOUT])) * HZ;
-
-	__nf_ct_set_timeout(ct, timeout);
-
-	if (test_bit(IPS_DYING_BIT, &ct->status))
-		return -ETIME;
-
-	return 0;
+	return __nf_ct_change_timeout(ct, (u64)ntohl(nla_get_be32(cda[CTA_TIMEOUT])) * HZ);
 }
 
 #if defined(CONFIG_NF_CONNTRACK_MARK)
-- 
2.36.1



* [PATCH bpf-next v5 5/8] net: netfilter: Add kfuncs to set and change CT status
  2022-06-23 19:26 [PATCH bpf-next v5 0/8] New nf_conntrack kfuncs for insertion, changing timeout, status Kumar Kartikeya Dwivedi
                   ` (3 preceding siblings ...)
  2022-06-23 19:26 ` [PATCH bpf-next v5 4/8] net: netfilter: Add kfuncs to set and change CT timeout Kumar Kartikeya Dwivedi
@ 2022-06-23 19:26 ` Kumar Kartikeya Dwivedi
  2022-06-23 19:26 ` [PATCH bpf-next v5 6/8] selftests/bpf: Add verifier tests for forced kfunc ref args Kumar Kartikeya Dwivedi
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-06-23 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Lorenzo Bianconi, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Pablo Neira Ayuso, Florian Westphal,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen, netdev,
	netfilter-devel

From: Lorenzo Bianconi <lorenzo@kernel.org>

Introduce bpf_ct_set_status and bpf_ct_change_status kfunc helpers in
order to set the status field of an allocated entry or update the status
field of an existing inserted entry. Use nf_ct_change_status_common to
share the permitted status field changes between the netlink and BPF
sides by refactoring ctnetlink_change_status.

It is required to introduce two kfuncs taking nf_conn___init and nf_conn
instead of sharing one because the __ref suffix on the parameter name
causes strict type checking. This would disallow passing nf_conn___init
to a kfunc taking nf_conn, and vice versa. We cannot remove the __ref
suffix, as we only want to accept refcounted pointers and not e.g.
ct->master.

Hence, the bpf_ct_set_* kfuncs are meant to be used on an allocated CT,
and the bpf_ct_change_* kfuncs on an inserted or looked up CT entry.
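
A sketch of the two call sites (declarations as in the selftests):

	/* ct: nf_conn___init, set the initial status before insertion. */
	bpf_ct_set_status(ct, IPS_CONFIRMED);

	/* ins: inserted or looked up nf_conn; the change goes through
	 * nf_ct_change_status_common(), which returns -EBUSY if an
	 * unchangeable bit (IPS_EXPECTED/IPS_CONFIRMED/IPS_DYING) would
	 * flip. */
	err = bpf_ct_change_status(ins, IPS_SEEN_REPLY);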

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Co-developed-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 include/net/netfilter/nf_conntrack_core.h |  2 ++
 net/netfilter/nf_conntrack_bpf.c          | 37 +++++++++++++++++++++
 net/netfilter/nf_conntrack_core.c         | 40 +++++++++++++++++++++++
 net/netfilter/nf_conntrack_netlink.c      | 39 ++--------------------
 4 files changed, 81 insertions(+), 37 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_core.h b/include/net/netfilter/nf_conntrack_core.h
index 3b0f7d0eebae..3cd3a6e631aa 100644
--- a/include/net/netfilter/nf_conntrack_core.h
+++ b/include/net/netfilter/nf_conntrack_core.h
@@ -98,6 +98,8 @@ static inline void __nf_ct_set_timeout(struct nf_conn *ct, u64 timeout)
 }
 
 int __nf_ct_change_timeout(struct nf_conn *ct, u64 cta_timeout);
+void __nf_ct_change_status(struct nf_conn *ct, unsigned long on, unsigned long off);
+int nf_ct_change_status_common(struct nf_conn *ct, unsigned int status);
 
 #endif
 
diff --git a/net/netfilter/nf_conntrack_bpf.c b/net/netfilter/nf_conntrack_bpf.c
index db04874da950..6975dda77173 100644
--- a/net/netfilter/nf_conntrack_bpf.c
+++ b/net/netfilter/nf_conntrack_bpf.c
@@ -394,6 +394,39 @@ int bpf_ct_change_timeout(struct nf_conn *nfct__ref, u32 timeout)
 	return __nf_ct_change_timeout(nfct__ref, msecs_to_jiffies(timeout));
 }
 
+/* bpf_ct_set_status - Set status field of allocated nf_conn
+ *
+ * Set the status field of the newly allocated nf_conn before insertion.
+ * This must be invoked for referenced PTR_TO_BTF_ID to nf_conn___init.
+ *
+ * Parameters:
+ * @nfct__ref    - Pointer to referenced nf_conn object, obtained using
+ *		   bpf_xdp_ct_alloc or bpf_skb_ct_alloc.
+ * @status       - New status value.
+ */
+int bpf_ct_set_status(const struct nf_conn___init *nfct__ref, u32 status)
+{
+	return nf_ct_change_status_common((struct nf_conn *)nfct__ref, status);
+}
+
+/* bpf_ct_change_status - Change status of inserted nf_conn
+ *
+ * Change the status field of the provided connection tracking entry.
+ * This must be invoked for referenced PTR_TO_BTF_ID to nf_conn.
+ *
+ * Parameters:
+ * @nfct__ref    - Pointer to referenced nf_conn object, obtained using
+ *		   bpf_ct_insert_entry, bpf_xdp_ct_lookup or bpf_skb_ct_lookup.
+ * @status       - New status value.
+ */
+int bpf_ct_change_status(struct nf_conn *nfct__ref, u32 status)
+{
+	/* We need a different kfunc because __ref suffix makes type matching
+	 * strict, so normal nf_conn cannot be passed to bpf_ct_set_status.
+	 */
+	return nf_ct_change_status_common(nfct__ref, status);
+}
+
 __diag_pop()
 
 BTF_SET_START(nf_ct_xdp_check_kfunc_ids)
@@ -403,6 +436,8 @@ BTF_ID(func, bpf_ct_insert_entry)
 BTF_ID(func, bpf_ct_release)
 BTF_ID(func, bpf_ct_set_timeout);
 BTF_ID(func, bpf_ct_change_timeout);
+BTF_ID(func, bpf_ct_set_status);
+BTF_ID(func, bpf_ct_change_status);
 BTF_SET_END(nf_ct_xdp_check_kfunc_ids)
 
 BTF_SET_START(nf_ct_tc_check_kfunc_ids)
@@ -412,6 +447,8 @@ BTF_ID(func, bpf_ct_insert_entry)
 BTF_ID(func, bpf_ct_release)
 BTF_ID(func, bpf_ct_set_timeout);
 BTF_ID(func, bpf_ct_change_timeout);
+BTF_ID(func, bpf_ct_set_status);
+BTF_ID(func, bpf_ct_change_status);
 BTF_SET_END(nf_ct_tc_check_kfunc_ids)
 
 BTF_SET_START(nf_ct_acquire_kfunc_ids)
diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c
index 572f59a5e936..66a0aa8dbc3b 100644
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -2807,4 +2807,44 @@ int __nf_ct_change_timeout(struct nf_conn *ct, u64 timeout)
 }
 EXPORT_SYMBOL_GPL(__nf_ct_change_timeout);
 
+void __nf_ct_change_status(struct nf_conn *ct, unsigned long on, unsigned long off)
+{
+	unsigned int bit;
+
+	/* Ignore these unchangable bits */
+	on &= ~IPS_UNCHANGEABLE_MASK;
+	off &= ~IPS_UNCHANGEABLE_MASK;
+
+	for (bit = 0; bit < __IPS_MAX_BIT; bit++) {
+		if (on & (1 << bit))
+			set_bit(bit, &ct->status);
+		else if (off & (1 << bit))
+			clear_bit(bit, &ct->status);
+	}
+}
+EXPORT_SYMBOL_GPL(__nf_ct_change_status);
+
+int nf_ct_change_status_common(struct nf_conn *ct, unsigned int status)
+{
+	unsigned long d;
+
+	d = ct->status ^ status;
+
+	if (d & (IPS_EXPECTED|IPS_CONFIRMED|IPS_DYING))
+		/* unchangeable */
+		return -EBUSY;
+
+	if (d & IPS_SEEN_REPLY && !(status & IPS_SEEN_REPLY))
+		/* SEEN_REPLY bit can only be set */
+		return -EBUSY;
+
+	if (d & IPS_ASSURED && !(status & IPS_ASSURED))
+		/* ASSURED bit can only be set */
+		return -EBUSY;
+
+	__nf_ct_change_status(ct, status, 0);
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nf_ct_change_status_common);
+
 #endif
diff --git a/net/netfilter/nf_conntrack_netlink.c b/net/netfilter/nf_conntrack_netlink.c
index b1de07c73845..e02832ef9b9f 100644
--- a/net/netfilter/nf_conntrack_netlink.c
+++ b/net/netfilter/nf_conntrack_netlink.c
@@ -1890,45 +1890,10 @@ ctnetlink_parse_nat_setup(struct nf_conn *ct,
 }
 #endif
 
-static void
-__ctnetlink_change_status(struct nf_conn *ct, unsigned long on,
-			  unsigned long off)
-{
-	unsigned int bit;
-
-	/* Ignore these unchangable bits */
-	on &= ~IPS_UNCHANGEABLE_MASK;
-	off &= ~IPS_UNCHANGEABLE_MASK;
-
-	for (bit = 0; bit < __IPS_MAX_BIT; bit++) {
-		if (on & (1 << bit))
-			set_bit(bit, &ct->status);
-		else if (off & (1 << bit))
-			clear_bit(bit, &ct->status);
-	}
-}
-
 static int
 ctnetlink_change_status(struct nf_conn *ct, const struct nlattr * const cda[])
 {
-	unsigned long d;
-	unsigned int status = ntohl(nla_get_be32(cda[CTA_STATUS]));
-	d = ct->status ^ status;
-
-	if (d & (IPS_EXPECTED|IPS_CONFIRMED|IPS_DYING))
-		/* unchangeable */
-		return -EBUSY;
-
-	if (d & IPS_SEEN_REPLY && !(status & IPS_SEEN_REPLY))
-		/* SEEN_REPLY bit can only be set */
-		return -EBUSY;
-
-	if (d & IPS_ASSURED && !(status & IPS_ASSURED))
-		/* ASSURED bit can only be set */
-		return -EBUSY;
-
-	__ctnetlink_change_status(ct, status, 0);
-	return 0;
+	return nf_ct_change_status_common(ct, ntohl(nla_get_be32(cda[CTA_STATUS])));
 }
 
 static int
@@ -2825,7 +2790,7 @@ ctnetlink_update_status(struct nf_conn *ct, const struct nlattr * const cda[])
 	 * unchangeable bits but do not error out. Also user programs
 	 * are allowed to clear the bits that they are allowed to change.
 	 */
-	__ctnetlink_change_status(ct, status, ~status);
+	__nf_ct_change_status(ct, status, ~status);
 	return 0;
 }
 
-- 
2.36.1



* [PATCH bpf-next v5 6/8] selftests/bpf: Add verifier tests for forced kfunc ref args
  2022-06-23 19:26 [PATCH bpf-next v5 0/8] New nf_conntrack kfuncs for insertion, changing timeout, status Kumar Kartikeya Dwivedi
                   ` (4 preceding siblings ...)
  2022-06-23 19:26 ` [PATCH bpf-next v5 5/8] net: netfilter: Add kfuncs to set and change CT status Kumar Kartikeya Dwivedi
@ 2022-06-23 19:26 ` Kumar Kartikeya Dwivedi
  2022-06-23 19:26 ` [PATCH bpf-next v5 7/8] selftests/bpf: Add tests for new nf_conntrack kfuncs Kumar Kartikeya Dwivedi
  2022-06-23 19:26 ` [PATCH bpf-next v5 8/8] selftests/bpf: Add negative " Kumar Kartikeya Dwivedi
  7 siblings, 0 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-06-23 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Pablo Neira Ayuso, Florian Westphal,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen,
	Lorenzo Bianconi, netdev, netfilter-devel

Make sure the verifier rejects the bad cases and the good case keeps
working. The selftests make use of the bpf_kfunc_call_test_ref kfunc
added in the previous patch only for verification.

Acked-by: Yonghong Song <yhs@fb.com>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 tools/testing/selftests/bpf/verifier/calls.c | 53 ++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/tools/testing/selftests/bpf/verifier/calls.c b/tools/testing/selftests/bpf/verifier/calls.c
index 743ed34c1238..3fb4f69b1962 100644
--- a/tools/testing/selftests/bpf/verifier/calls.c
+++ b/tools/testing/selftests/bpf/verifier/calls.c
@@ -218,6 +218,59 @@
 	.result = REJECT,
 	.errstr = "variable ptr_ access var_off=(0x0; 0x7) disallowed",
 },
+{
+	"calls: invalid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID",
+	.insns = {
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8),
+	BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, 0),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, 0),
+	BPF_LDX_MEM(BPF_DW, BPF_REG_1, BPF_REG_6, 16),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, 0),
+	BPF_MOV64_IMM(BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
+	.fixup_kfunc_btf_id = {
+		{ "bpf_kfunc_call_test_acquire", 3 },
+		{ "bpf_kfunc_call_test_ref", 8 },
+		{ "bpf_kfunc_call_test_ref", 10 },
+	},
+	.result_unpriv = REJECT,
+	.result = REJECT,
+	.errstr = "R1 must be referenced",
+},
+{
+	"calls: valid kfunc call: referenced arg needs refcounted PTR_TO_BTF_ID",
+	.insns = {
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
+	BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -8),
+	BPF_ST_MEM(BPF_DW, BPF_REG_1, 0, 0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, 0),
+	BPF_JMP_IMM(BPF_JNE, BPF_REG_0, 0, 1),
+	BPF_EXIT_INSN(),
+	BPF_MOV64_REG(BPF_REG_6, BPF_REG_0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_0),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, 0),
+	BPF_MOV64_REG(BPF_REG_1, BPF_REG_6),
+	BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, BPF_PSEUDO_KFUNC_CALL, 0, 0),
+	BPF_MOV64_IMM(BPF_REG_0, 0),
+	BPF_EXIT_INSN(),
+	},
+	.prog_type = BPF_PROG_TYPE_SCHED_CLS,
+	.fixup_kfunc_btf_id = {
+		{ "bpf_kfunc_call_test_acquire", 3 },
+		{ "bpf_kfunc_call_test_ref", 8 },
+		{ "bpf_kfunc_call_test_release", 10 },
+	},
+	.result_unpriv = REJECT,
+	.result = ACCEPT,
+},
 {
 	"calls: basic sanity",
 	.insns = {
-- 
2.36.1



* [PATCH bpf-next v5 7/8] selftests/bpf: Add tests for new nf_conntrack kfuncs
  2022-06-23 19:26 [PATCH bpf-next v5 0/8] New nf_conntrack kfuncs for insertion, changing timeout, status Kumar Kartikeya Dwivedi
                   ` (5 preceding siblings ...)
  2022-06-23 19:26 ` [PATCH bpf-next v5 6/8] selftests/bpf: Add verifier tests for forced kfunc ref args Kumar Kartikeya Dwivedi
@ 2022-06-23 19:26 ` Kumar Kartikeya Dwivedi
  2022-06-23 19:26 ` [PATCH bpf-next v5 8/8] selftests/bpf: Add negative " Kumar Kartikeya Dwivedi
  7 siblings, 0 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-06-23 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Lorenzo Bianconi, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Pablo Neira Ayuso, Florian Westphal,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen, netdev,
	netfilter-devel

From: Lorenzo Bianconi <lorenzo@kernel.org>

Introduce selftests for the following kfunc helpers:
- bpf_xdp_ct_alloc
- bpf_skb_ct_alloc
- bpf_ct_insert_entry
- bpf_ct_set_timeout
- bpf_ct_change_timeout
- bpf_ct_set_status
- bpf_ct_change_status

Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 .../testing/selftests/bpf/prog_tests/bpf_nf.c |  8 ++
 .../testing/selftests/bpf/progs/test_bpf_nf.c | 85 ++++++++++++++++---
 2 files changed, 81 insertions(+), 12 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_nf.c b/tools/testing/selftests/bpf/prog_tests/bpf_nf.c
index dd30b1e3a67c..6d53686a7e46 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_nf.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_nf.c
@@ -39,6 +39,14 @@ void test_bpf_nf_ct(int mode)
 	ASSERT_EQ(skel->bss->test_enonet_netns_id, -ENONET, "Test ENONET for bad but valid netns_id");
 	ASSERT_EQ(skel->bss->test_enoent_lookup, -ENOENT, "Test ENOENT for failed lookup");
 	ASSERT_EQ(skel->bss->test_eafnosupport, -EAFNOSUPPORT, "Test EAFNOSUPPORT for invalid len__tuple");
+	ASSERT_EQ(skel->data->test_alloc_entry, 0, "Test for alloc new entry");
+	ASSERT_EQ(skel->data->test_insert_entry, 0, "Test for insert new entry");
+	ASSERT_EQ(skel->data->test_succ_lookup, 0, "Test for successful lookup");
+	/* allow some tolerance for test_delta_timeout value to avoid races. */
+	ASSERT_GT(skel->bss->test_delta_timeout, 9, "Test for min ct timeout update");
+	ASSERT_LE(skel->bss->test_delta_timeout, 10, "Test for max ct timeout update");
+	/* expected status is IPS_SEEN_REPLY */
+	ASSERT_EQ(skel->bss->test_status, 2, "Test for ct status update ");
 end:
 	test_bpf_nf__destroy(skel);
 }
diff --git a/tools/testing/selftests/bpf/progs/test_bpf_nf.c b/tools/testing/selftests/bpf/progs/test_bpf_nf.c
index f00a9731930e..196cd8dfe42a 100644
--- a/tools/testing/selftests/bpf/progs/test_bpf_nf.c
+++ b/tools/testing/selftests/bpf/progs/test_bpf_nf.c
@@ -8,6 +8,8 @@
 #define EINVAL 22
 #define ENOENT 2
 
+extern unsigned long CONFIG_HZ __kconfig;
+
 int test_einval_bpf_tuple = 0;
 int test_einval_reserved = 0;
 int test_einval_netns_id = 0;
@@ -16,6 +18,11 @@ int test_eproto_l4proto = 0;
 int test_enonet_netns_id = 0;
 int test_enoent_lookup = 0;
 int test_eafnosupport = 0;
+int test_alloc_entry = -EINVAL;
+int test_insert_entry = -EAFNOSUPPORT;
+int test_succ_lookup = -ENOENT;
+u32 test_delta_timeout = 0;
+u32 test_status = 0;
 
 struct nf_conn;
 
@@ -26,31 +33,44 @@ struct bpf_ct_opts___local {
 	u8 reserved[3];
 } __attribute__((preserve_access_index));
 
+struct nf_conn *bpf_xdp_ct_alloc(struct xdp_md *, struct bpf_sock_tuple *, u32,
+				 struct bpf_ct_opts___local *, u32) __ksym;
 struct nf_conn *bpf_xdp_ct_lookup(struct xdp_md *, struct bpf_sock_tuple *, u32,
 				  struct bpf_ct_opts___local *, u32) __ksym;
+struct nf_conn *bpf_skb_ct_alloc(struct __sk_buff *, struct bpf_sock_tuple *, u32,
+				 struct bpf_ct_opts___local *, u32) __ksym;
 struct nf_conn *bpf_skb_ct_lookup(struct __sk_buff *, struct bpf_sock_tuple *, u32,
 				  struct bpf_ct_opts___local *, u32) __ksym;
+struct nf_conn *bpf_ct_insert_entry(struct nf_conn *) __ksym;
 void bpf_ct_release(struct nf_conn *) __ksym;
+void bpf_ct_set_timeout(struct nf_conn *, u32) __ksym;
+int bpf_ct_change_timeout(struct nf_conn *, u32) __ksym;
+int bpf_ct_set_status(struct nf_conn *, u32) __ksym;
+int bpf_ct_change_status(struct nf_conn *, u32) __ksym;
 
 static __always_inline void
-nf_ct_test(struct nf_conn *(*func)(void *, struct bpf_sock_tuple *, u32,
-				   struct bpf_ct_opts___local *, u32),
+nf_ct_test(struct nf_conn *(*lookup_fn)(void *, struct bpf_sock_tuple *, u32,
+					struct bpf_ct_opts___local *, u32),
+	   struct nf_conn *(*alloc_fn)(void *, struct bpf_sock_tuple *, u32,
+				       struct bpf_ct_opts___local *, u32),
 	   void *ctx)
 {
 	struct bpf_ct_opts___local opts_def = { .l4proto = IPPROTO_TCP, .netns_id = -1 };
 	struct bpf_sock_tuple bpf_tuple;
 	struct nf_conn *ct;
+	int err;
 
 	__builtin_memset(&bpf_tuple, 0, sizeof(bpf_tuple.ipv4));
 
-	ct = func(ctx, NULL, 0, &opts_def, sizeof(opts_def));
+	ct = lookup_fn(ctx, NULL, 0, &opts_def, sizeof(opts_def));
 	if (ct)
 		bpf_ct_release(ct);
 	else
 		test_einval_bpf_tuple = opts_def.error;
 
 	opts_def.reserved[0] = 1;
-	ct = func(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def, sizeof(opts_def));
+	ct = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def,
+		       sizeof(opts_def));
 	opts_def.reserved[0] = 0;
 	opts_def.l4proto = IPPROTO_TCP;
 	if (ct)
@@ -59,21 +79,24 @@ nf_ct_test(struct nf_conn *(*func)(void *, struct bpf_sock_tuple *, u32,
 		test_einval_reserved = opts_def.error;
 
 	opts_def.netns_id = -2;
-	ct = func(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def, sizeof(opts_def));
+	ct = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def,
+		       sizeof(opts_def));
 	opts_def.netns_id = -1;
 	if (ct)
 		bpf_ct_release(ct);
 	else
 		test_einval_netns_id = opts_def.error;
 
-	ct = func(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def, sizeof(opts_def) - 1);
+	ct = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def,
+		       sizeof(opts_def) - 1);
 	if (ct)
 		bpf_ct_release(ct);
 	else
 		test_einval_len_opts = opts_def.error;
 
 	opts_def.l4proto = IPPROTO_ICMP;
-	ct = func(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def, sizeof(opts_def));
+	ct = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def,
+		       sizeof(opts_def));
 	opts_def.l4proto = IPPROTO_TCP;
 	if (ct)
 		bpf_ct_release(ct);
@@ -81,37 +104,75 @@ nf_ct_test(struct nf_conn *(*func)(void *, struct bpf_sock_tuple *, u32,
 		test_eproto_l4proto = opts_def.error;
 
 	opts_def.netns_id = 0xf00f;
-	ct = func(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def, sizeof(opts_def));
+	ct = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def,
+		       sizeof(opts_def));
 	opts_def.netns_id = -1;
 	if (ct)
 		bpf_ct_release(ct);
 	else
 		test_enonet_netns_id = opts_def.error;
 
-	ct = func(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def, sizeof(opts_def));
+	ct = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def,
+		       sizeof(opts_def));
 	if (ct)
 		bpf_ct_release(ct);
 	else
 		test_enoent_lookup = opts_def.error;
 
-	ct = func(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4) - 1, &opts_def, sizeof(opts_def));
+	ct = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4) - 1, &opts_def,
+		       sizeof(opts_def));
 	if (ct)
 		bpf_ct_release(ct);
 	else
 		test_eafnosupport = opts_def.error;
+
+	bpf_tuple.ipv4.saddr = bpf_get_prandom_u32(); /* src IP */
+	bpf_tuple.ipv4.daddr = bpf_get_prandom_u32(); /* dst IP */
+	bpf_tuple.ipv4.sport = bpf_get_prandom_u32(); /* src port */
+	bpf_tuple.ipv4.dport = bpf_get_prandom_u32(); /* dst port */
+
+	ct = alloc_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4), &opts_def,
+		      sizeof(opts_def));
+	if (ct) {
+		struct nf_conn *ct_ins;
+
+		bpf_ct_set_timeout(ct, 10000);
+		bpf_ct_set_status(ct, IPS_CONFIRMED);
+
+		ct_ins = bpf_ct_insert_entry(ct);
+		if (ct_ins) {
+			struct nf_conn *ct_lk;
+
+			ct_lk = lookup_fn(ctx, &bpf_tuple, sizeof(bpf_tuple.ipv4),
+					  &opts_def, sizeof(opts_def));
+			if (ct_lk) {
+				/* update ct entry timeout */
+				bpf_ct_change_timeout(ct_lk, 10000);
+				test_delta_timeout = ct_lk->timeout - bpf_jiffies64();
+				test_delta_timeout /= CONFIG_HZ;
+				test_status = IPS_SEEN_REPLY;
+				bpf_ct_change_status(ct_lk, IPS_SEEN_REPLY);
+				bpf_ct_release(ct_lk);
+				test_succ_lookup = 0;
+			}
+			bpf_ct_release(ct_ins);
+			test_insert_entry = 0;
+		}
+		test_alloc_entry = 0;
+	}
 }
 
 SEC("xdp")
 int nf_xdp_ct_test(struct xdp_md *ctx)
 {
-	nf_ct_test((void *)bpf_xdp_ct_lookup, ctx);
+	nf_ct_test((void *)bpf_xdp_ct_lookup, (void *)bpf_xdp_ct_alloc, ctx);
 	return 0;
 }
 
 SEC("tc")
 int nf_skb_ct_test(struct __sk_buff *ctx)
 {
-	nf_ct_test((void *)bpf_skb_ct_lookup, ctx);
+	nf_ct_test((void *)bpf_skb_ct_lookup, (void *)bpf_skb_ct_alloc, ctx);
 	return 0;
 }
 
-- 
2.36.1



* [PATCH bpf-next v5 8/8] selftests/bpf: Add negative tests for new nf_conntrack kfuncs
  2022-06-23 19:26 [PATCH bpf-next v5 0/8] New nf_conntrack kfuncs for insertion, changing timeout, status Kumar Kartikeya Dwivedi
                   ` (6 preceding siblings ...)
  2022-06-23 19:26 ` [PATCH bpf-next v5 7/8] selftests/bpf: Add tests for new nf_conntrack kfuncs Kumar Kartikeya Dwivedi
@ 2022-06-23 19:26 ` Kumar Kartikeya Dwivedi
  7 siblings, 0 replies; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-06-23 19:26 UTC (permalink / raw)
  To: bpf
  Cc: Alexei Starovoitov, Andrii Nakryiko, Daniel Borkmann,
	Pablo Neira Ayuso, Florian Westphal, Jesper Dangaard Brouer,
	Toke Høiland-Jørgensen, Lorenzo Bianconi, netdev,
	netfilter-devel

Add test cases for the misuse patterns we care about and ensure improper
usage is caught and rejected by the verifier.

Signed-off-by: Kumar Kartikeya Dwivedi <memxor@gmail.com>
---
 .../testing/selftests/bpf/prog_tests/bpf_nf.c |  56 +++++++-
 .../selftests/bpf/progs/test_bpf_nf_fail.c    | 134 ++++++++++++++++++
 2 files changed, 189 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/bpf/progs/test_bpf_nf_fail.c

diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_nf.c b/tools/testing/selftests/bpf/prog_tests/bpf_nf.c
index 6d53686a7e46..69877a16f42d 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_nf.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_nf.c
@@ -2,13 +2,29 @@
 #include <test_progs.h>
 #include <network_helpers.h>
 #include "test_bpf_nf.skel.h"
+#include "test_bpf_nf_fail.skel.h"
+
+static char log_buf[1024 * 1024];
+
+struct {
+	const char *prog_name;
+	const char *err_msg;
+} test_bpf_nf_fail_tests[] = {
+	{ "alloc_release", "kernel function bpf_ct_release args#0 expected pointer to STRUCT nf_conn but" },
+	{ "insert_insert", "kernel function bpf_ct_insert_entry args#0 expected pointer to STRUCT nf_conn___init but" },
+	{ "lookup_insert", "kernel function bpf_ct_insert_entry args#0 expected pointer to STRUCT nf_conn___init but" },
+	{ "set_timeout_after_insert", "kernel function bpf_ct_set_timeout args#0 expected pointer to STRUCT nf_conn___init but" },
+	{ "set_status_after_insert", "kernel function bpf_ct_set_status args#0 expected pointer to STRUCT nf_conn___init but" },
+	{ "change_timeout_after_alloc", "kernel function bpf_ct_change_timeout args#0 expected pointer to STRUCT nf_conn but" },
+	{ "change_status_after_alloc", "kernel function bpf_ct_change_status args#0 expected pointer to STRUCT nf_conn but" },
+};
 
 enum {
 	TEST_XDP,
 	TEST_TC_BPF,
 };
 
-void test_bpf_nf_ct(int mode)
+static void test_bpf_nf_ct(int mode)
 {
 	struct test_bpf_nf *skel;
 	int prog_fd, err;
@@ -51,10 +67,48 @@ void test_bpf_nf_ct(int mode)
 	test_bpf_nf__destroy(skel);
 }
 
+static void test_bpf_nf_ct_fail(const char *prog_name, const char *err_msg)
+{
+	LIBBPF_OPTS(bpf_object_open_opts, opts, .kernel_log_buf = log_buf,
+						.kernel_log_size = sizeof(log_buf),
+						.kernel_log_level = 1);
+	struct test_bpf_nf_fail *skel;
+	struct bpf_program *prog;
+	int ret;
+
+	skel = test_bpf_nf_fail__open_opts(&opts);
+	if (!ASSERT_OK_PTR(skel, "test_bpf_nf_fail__open"))
+		return;
+
+	prog = bpf_object__find_program_by_name(skel->obj, prog_name);
+	if (!ASSERT_OK_PTR(prog, "bpf_object__find_program_by_name"))
+		goto end;
+
+	bpf_program__set_autoload(prog, true);
+
+	ret = test_bpf_nf_fail__load(skel);
+	if (!ASSERT_ERR(ret, "test_bpf_nf_fail__load must fail"))
+		goto end;
+
+	if (!ASSERT_OK_PTR(strstr(log_buf, err_msg), "expected error message")) {
+		fprintf(stderr, "Expected: %s\n", err_msg);
+		fprintf(stderr, "Verifier: %s\n", log_buf);
+	}
+
+end:
+	test_bpf_nf_fail__destroy(skel);
+}
+
 void test_bpf_nf(void)
 {
+	int i;
 	if (test__start_subtest("xdp-ct"))
 		test_bpf_nf_ct(TEST_XDP);
 	if (test__start_subtest("tc-bpf-ct"))
 		test_bpf_nf_ct(TEST_TC_BPF);
+	for (i = 0; i < ARRAY_SIZE(test_bpf_nf_fail_tests); i++) {
+		if (test__start_subtest(test_bpf_nf_fail_tests[i].prog_name))
+			test_bpf_nf_ct_fail(test_bpf_nf_fail_tests[i].prog_name,
+					    test_bpf_nf_fail_tests[i].err_msg);
+	}
 }
diff --git a/tools/testing/selftests/bpf/progs/test_bpf_nf_fail.c b/tools/testing/selftests/bpf/progs/test_bpf_nf_fail.c
new file mode 100644
index 000000000000..bf79af15c808
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_bpf_nf_fail.c
@@ -0,0 +1,134 @@
+// SPDX-License-Identifier: GPL-2.0
+#include <vmlinux.h>
+#include <bpf/bpf_tracing.h>
+#include <bpf/bpf_helpers.h>
+#include <bpf/bpf_core_read.h>
+
+struct nf_conn;
+
+struct bpf_ct_opts___local {
+	s32 netns_id;
+	s32 error;
+	u8 l4proto;
+	u8 reserved[3];
+} __attribute__((preserve_access_index));
+
+struct nf_conn *bpf_skb_ct_alloc(struct __sk_buff *, struct bpf_sock_tuple *, u32,
+				 struct bpf_ct_opts___local *, u32) __ksym;
+struct nf_conn *bpf_skb_ct_lookup(struct __sk_buff *, struct bpf_sock_tuple *, u32,
+				  struct bpf_ct_opts___local *, u32) __ksym;
+struct nf_conn *bpf_ct_insert_entry(struct nf_conn *) __ksym;
+void bpf_ct_release(struct nf_conn *) __ksym;
+void bpf_ct_set_timeout(struct nf_conn *, u32) __ksym;
+int bpf_ct_change_timeout(struct nf_conn *, u32) __ksym;
+int bpf_ct_set_status(struct nf_conn *, u32) __ksym;
+int bpf_ct_change_status(struct nf_conn *, u32) __ksym;
+
+SEC("?tc")
+int alloc_release(struct __sk_buff *ctx)
+{
+	struct bpf_ct_opts___local opts = {};
+	struct bpf_sock_tuple tup = {};
+	struct nf_conn *ct;
+
+	ct = bpf_skb_ct_alloc(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
+	if (!ct)
+		return 0;
+	bpf_ct_release(ct);
+	return 0;
+}
+
+SEC("?tc")
+int insert_insert(struct __sk_buff *ctx)
+{
+	struct bpf_ct_opts___local opts = {};
+	struct bpf_sock_tuple tup = {};
+	struct nf_conn *ct;
+
+	ct = bpf_skb_ct_alloc(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
+	if (!ct)
+		return 0;
+	ct = bpf_ct_insert_entry(ct);
+	if (!ct)
+		return 0;
+	ct = bpf_ct_insert_entry(ct);
+	return 0;
+}
+
+SEC("?tc")
+int lookup_insert(struct __sk_buff *ctx)
+{
+	struct bpf_ct_opts___local opts = {};
+	struct bpf_sock_tuple tup = {};
+	struct nf_conn *ct;
+
+	ct = bpf_skb_ct_lookup(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
+	if (!ct)
+		return 0;
+	bpf_ct_insert_entry(ct);
+	return 0;
+}
+
+SEC("?tc")
+int set_timeout_after_insert(struct __sk_buff *ctx)
+{
+	struct bpf_ct_opts___local opts = {};
+	struct bpf_sock_tuple tup = {};
+	struct nf_conn *ct;
+
+	ct = bpf_skb_ct_alloc(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
+	if (!ct)
+		return 0;
+	ct = bpf_ct_insert_entry(ct);
+	if (!ct)
+		return 0;
+	bpf_ct_set_timeout(ct, 0);
+	return 0;
+}
+
+SEC("?tc")
+int set_status_after_insert(struct __sk_buff *ctx)
+{
+	struct bpf_ct_opts___local opts = {};
+	struct bpf_sock_tuple tup = {};
+	struct nf_conn *ct;
+
+	ct = bpf_skb_ct_alloc(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
+	if (!ct)
+		return 0;
+	ct = bpf_ct_insert_entry(ct);
+	if (!ct)
+		return 0;
+	bpf_ct_set_status(ct, 0);
+	return 0;
+}
+
+SEC("?tc")
+int change_timeout_after_alloc(struct __sk_buff *ctx)
+{
+	struct bpf_ct_opts___local opts = {};
+	struct bpf_sock_tuple tup = {};
+	struct nf_conn *ct;
+
+	ct = bpf_skb_ct_alloc(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
+	if (!ct)
+		return 0;
+	bpf_ct_change_timeout(ct, 0);
+	return 0;
+}
+
+SEC("?tc")
+int change_status_after_alloc(struct __sk_buff *ctx)
+{
+	struct bpf_ct_opts___local opts = {};
+	struct bpf_sock_tuple tup = {};
+	struct nf_conn *ct;
+
+	ct = bpf_skb_ct_alloc(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
+	if (!ct)
+		return 0;
+	bpf_ct_change_status(ct, 0);
+	return 0;
+}
+
+char _license[] SEC("license") = "GPL";
-- 
2.36.1
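
For anyone trying these out locally: the new cases run under the bpf_nf
test in the BPF selftests. Assuming a kernel with CONFIG_NF_CONNTRACK
enabled and the selftests built, an invocation along these lines should
exercise both the positive subtests and the negative ones added above:

  cd tools/testing/selftests/bpf
  ./test_progs -t bpf_nf                 # all bpf_nf subtests
  ./test_progs -t bpf_nf/lookup_insert   # a single subtest by name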



* Re: [PATCH bpf-next v5 1/8] bpf: Add support for forcing kfunc args to be referenced
  2022-06-23 19:26 ` [PATCH bpf-next v5 1/8] bpf: Add support for forcing kfunc args to be referenced Kumar Kartikeya Dwivedi
@ 2022-06-29  3:23   ` Alexei Starovoitov
  2022-07-03  5:24     ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2022-06-29  3:23 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Pablo Neira Ayuso, Florian Westphal,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen,
	Lorenzo Bianconi, netdev, netfilter-devel

On Fri, Jun 24, 2022 at 12:56:30AM +0530, Kumar Kartikeya Dwivedi wrote:
> Similar to how we detect mem, size pairs in kfuncs, teach the verifier
> to treat a __ref suffix on an argument name to imply that it must be a
> referenced pointer when passed to the kfunc. This is required to ensure
> that kfuncs that operate on some object only work on acquired pointers,
> and not on a normal PTR_TO_BTF_ID of the same type that can be obtained
> by pointer walking. Release functions need not specify such a suffix on
> release arguments, as they are already expected to receive one
> referenced argument.
> 
> Note that we use strict type matching when a __ref suffix is present on
> the argument.
... 
> +		/* Check if argument must be a referenced pointer, args + i has
> +		 * been verified to be a pointer (after skipping modifiers).
> +		 */
> +		arg_ref = is_kfunc_arg_ref(btf, args + i);
> +		if (is_kfunc && arg_ref && !reg->ref_obj_id) {
> +			bpf_log(log, "R%d must be referenced\n", regno);
> +			return -EINVAL;
> +		}
> +

imo this suffix will be confusing to use.
If I understand the intent, __ref should only be used
in acquire (and other) kfuncs that also do a release.
Adding __ref to an actual release kfunc will be a nop:
it will be checked, but it's not necessary.

At the end
+struct nf_conn *bpf_ct_insert_entry(struct nf_conn___init *nfct__ref)
will behave like kptr_xchg, with the exception that kptr_xchg takes
any btf_id while here it's fixed.

The code:
 if (rel && reg->ref_obj_id)
        arg_type |= OBJ_RELEASE;
should probably be updated with '|| arg_ref'
to make sure reg->off == 0?
That looks like a small bug.

But stepping back... why is __ref needed?
We can add bpf_ct_insert_entry to acq and rel sets and it should work?
I'm assuming you're doing the orthogonal cleanup of resolve_btfid,
so we will have a single kfunc set where bpf_ct_insert_entry will
have both acq and rel flags.
I'm surely missing something.


* Re: [PATCH bpf-next v5 1/8] bpf: Add support for forcing kfunc args to be referenced
  2022-06-29  3:23   ` Alexei Starovoitov
@ 2022-07-03  5:24     ` Kumar Kartikeya Dwivedi
  2022-07-03  5:34       ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-07-03  5:24 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Pablo Neira Ayuso, Florian Westphal,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen,
	Lorenzo Bianconi, netdev, netfilter-devel

On Wed, 29 Jun 2022 at 08:53, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> [...]
>
> The code:
>  if (rel && reg->ref_obj_id)
>         arg_type |= OBJ_RELEASE;
> should probably be updated with '|| arg_ref'
> to make sure reg->off == 0?
> That looks like a small bug.
>

Indeed, I missed that. Thanks for catching it.

> But stepping back... why is __ref needed?
> We can add bpf_ct_insert_entry to acq and rel sets and it should work?
> I'm assuming you're doing the orthogonal cleanup of resolve_btfid,
> so we will have a single kfunc set where bpf_ct_insert_entry will
> have both acq and rel flags.
> I'm surely missing something.

It is needed to prevent the case where someone might do:
ct = bpf_xdp_ct_alloc(...);
bpf_ct_set_timeout(ct->master, ...);

Or just obtain PTR_TO_BTF_ID by pointer walking and try to pass it in
to bpf_ct_set_timeout.

__ref allows an argument on a non-release kfunc to have the same checks
as a release argument, i.e. it must be refcounted and have reg->off == 0
(var_off is already checked to be 0), so it must be the original pointer
that was obtained from an acquire kfunc. As you noted, it isn't strictly
needed on a release kfunc (like bpf_ct_insert_entry) because the same
checks happen for it anyway. But both the timeout and status helpers
should use it if they "operate" on the acquired ct (from alloc, insert,
or lookup).


* Re: [PATCH bpf-next v5 1/8] bpf: Add support for forcing kfunc args to be referenced
  2022-07-03  5:24     ` Kumar Kartikeya Dwivedi
@ 2022-07-03  5:34       ` Kumar Kartikeya Dwivedi
  2022-07-06 18:44         ` Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-07-03  5:34 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Pablo Neira Ayuso, Florian Westphal,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen,
	Lorenzo Bianconi, netdev, netfilter-devel

On Sun, 3 Jul 2022 at 10:54, Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
>
> [...]
>
> It is needed to prevent the case where someone might do:
> ct = bpf_xdp_ct_alloc(...);
> bpf_ct_set_timeout(ct->master, ...);
>

A better illustration is probably bpf_xdp_ct_lookup and
bpf_ct_change_timeout, since here the type for ct->master won't match
with bpf_ct_set_timeout, but the point is the same.
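
To make that illustration concrete, here is a minimal sketch, reusing
the __ksym declarations from test_bpf_nf_fail.c earlier in the thread
(illustrative only, not code from the patches):

SEC("tc")
int ref_arg_sketch(struct __sk_buff *ctx)
{
	struct bpf_ct_opts___local opts = {};
	struct bpf_sock_tuple tup = {};
	struct nf_conn *ct;

	ct = bpf_skb_ct_lookup(ctx, &tup, sizeof(tup.ipv4), &opts, sizeof(opts));
	if (!ct)
		return 0;
	/* Accepted: ct is the referenced pointer returned by the acquire
	 * kfunc itself (ref_obj_id is set, reg->off == 0).
	 */
	bpf_ct_change_timeout(ct, 10000);
	/* Rejected by the new check ("R%d must be referenced"): ct->master
	 * has the same type, but it is a plain PTR_TO_BTF_ID obtained by
	 * pointer walking, not a referenced pointer:
	 *
	 *	bpf_ct_change_timeout(ct->master, 10000);
	 */
	bpf_ct_release(ct);
	return 0;
}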


* Re: [PATCH bpf-next v5 1/8] bpf: Add support for forcing kfunc args to be referenced
  2022-07-03  5:34       ` Kumar Kartikeya Dwivedi
@ 2022-07-06 18:44         ` Alexei Starovoitov
  2022-07-06 19:21           ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2022-07-06 18:44 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Pablo Neira Ayuso, Florian Westphal,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen,
	Lorenzo Bianconi, netdev, netfilter-devel

On Sun, Jul 03, 2022 at 11:04:22AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Sun, 3 Jul 2022 at 10:54, Kumar Kartikeya Dwivedi <memxor@gmail.com> wrote:
> > [...]
> > It is needed to prevent the case where someone might do:
> > ct = bpf_xdp_ct_alloc(...);
> > bpf_ct_set_timeout(ct->master, ...);
> >
> 
> A better illustration is probably bpf_xdp_ct_lookup and
> bpf_ct_change_timeout, since here the type for ct->master won't match
> with bpf_ct_set_timeout, but the point is the same.

Sorry, I'm still not following.
Didn't we make pointer walking 'untrusted' so ct->master cannot be
passed into any kfunc?


* Re: [PATCH bpf-next v5 1/8] bpf: Add support for forcing kfunc args to be referenced
  2022-07-06 18:44         ` Alexei Starovoitov
@ 2022-07-06 19:21           ` Kumar Kartikeya Dwivedi
  2022-07-06 21:29             ` Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-07-06 19:21 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Pablo Neira Ayuso, Florian Westphal,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen,
	Lorenzo Bianconi, netdev, netfilter-devel

On Thu, 7 Jul 2022 at 00:14, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> [...]
> > > It is needed to prevent the case where someone might do:
> > > ct = bpf_xdp_ct_alloc(...);
> > > bpf_ct_set_timeout(ct->master, ...);
> > >
> >
> > A better illustration is probably bpf_xdp_ct_lookup and
> > bpf_ct_change_timeout, since here the type for ct->master won't match
> > with bpf_ct_set_timeout, but the point is the same.
>
> Sorry, I'm still not following.
> Didn't we make pointer walking 'untrusted' so ct->master cannot be
> passed into any kfunc?
>

I don't believe that is the case; it is only true for kptrs loaded
from BPF maps (and only those loaded with BPF_LDX, not the ones
obtained with kptr_xchg). There we had a chance to do things
differently. For normal PTR_TO_BTF_ID obtained from kfuncs/BPF
helpers, there is no untrusted flag set on them, nor is it set when
walking them.

I also think we discussed switching to this mode: making many cases
untrusted by default and using annotation to allow cases, making
pointers trusted at one level (like args for tracing/lsm progs, where
the next deref becomes untrusted). But admittedly it may not cover
enough ground, and you didn't like it much either, so I stopped
pursuing it.
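
As a rough sketch of that distinction (assuming the __kptr/__kptr_ref
type tags and the bpf_kptr_xchg() helper from the map-kptr work; the
map layout and program below are illustrative, not part of this
series):

struct map_value {
	struct nf_conn __kptr *ct;	   /* loaded with BPF_LDX: untrusted */
	struct nf_conn __kptr_ref *ct_ref; /* moved only via bpf_kptr_xchg() */
};

struct {
	__uint(type, BPF_MAP_TYPE_ARRAY);
	__uint(max_entries, 1);
	__type(key, int);
	__type(value, struct map_value);
} kptr_map SEC(".maps");

SEC("tc")
int kptr_trust_sketch(struct __sk_buff *ctx)
{
	struct map_value *v;
	struct nf_conn *p, *r;
	int key = 0;

	v = bpf_map_lookup_elem(&kptr_map, &key);
	if (!v)
		return 0;
	p = v->ct;
	if (p)
		/* Untrusted PTR_TO_BTF_ID: loads through it are allowed,
		 * but passing it to a kfunc is rejected.
		 */
		bpf_printk("ct status %lx", p->status);
	r = bpf_kptr_xchg(&v->ct_ref, NULL);
	if (r)
		/* Referenced, hence trusted: must be released (or stored
		 * back with another bpf_kptr_xchg()).
		 */
		bpf_ct_release(r);
	return 0;
}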


* Re: [PATCH bpf-next v5 1/8] bpf: Add support for forcing kfunc args to be referenced
  2022-07-06 19:21           ` Kumar Kartikeya Dwivedi
@ 2022-07-06 21:29             ` Alexei Starovoitov
  2022-07-06 22:04               ` Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2022-07-06 21:29 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Pablo Neira Ayuso, Florian Westphal,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen,
	Lorenzo Bianconi, netdev, netfilter-devel

On Thu, Jul 07, 2022 at 12:51:15AM +0530, Kumar Kartikeya Dwivedi wrote:
> On Thu, 7 Jul 2022 at 00:14, Alexei Starovoitov
> <alexei.starovoitov@gmail.com> wrote:
> > [...]
> > > A better illustration is probably bpf_xdp_ct_lookup and
> > > bpf_ct_change_timeout, since here the type for ct->master won't match
> > > with bpf_ct_set_timeout, but the point is the same.
> >
> > Sorry, I'm still not following.
> > Didn't we make pointer walking 'untrusted' so ct->master cannot be
> > passed into any kfunc?
> >
> 
> I don't believe that is the case; it is only true for kptrs loaded
> from BPF maps (and only those loaded with BPF_LDX, not the ones
> obtained with kptr_xchg). There we had a chance to do things
> differently. For normal PTR_TO_BTF_ID obtained from kfuncs/BPF
> helpers, there is no untrusted flag set on them, nor is it set when
> walking them.
>
> I also think we discussed switching to this mode: making many cases
> untrusted by default and using annotation to allow cases, making
> pointers trusted at one level (like args for tracing/lsm progs, where
> the next deref becomes untrusted). But admittedly it may not cover
> enough ground, and you didn't like it much either, so I stopped
> pursuing it.

Ahh. Now I remember. Thanks for reminding :)
Could you please summarize this thread and add all of it as a big comment
in the source code next to the __ref handling, to explain the motivation
and give an example of when and how this __ref suffix should be used.
Otherwise somebody, like me, will forget the context soon.

I was thinking of a better name than __ref, but couldn't come up with one.
__ref fits this use case the best.


* Re: [PATCH bpf-next v5 1/8] bpf: Add support for forcing kfunc args to be referenced
  2022-07-06 21:29             ` Alexei Starovoitov
@ 2022-07-06 22:04               ` Alexei Starovoitov
  2022-07-13 12:13                 ` Kumar Kartikeya Dwivedi
  0 siblings, 1 reply; 18+ messages in thread
From: Alexei Starovoitov @ 2022-07-06 22:04 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Pablo Neira Ayuso, Florian Westphal,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen,
	Lorenzo Bianconi, Network Development, netfilter-devel

On Wed, Jul 6, 2022 at 2:29 PM Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> [...]
>
> Ahh. Now I remember. Thanks for reminding :)
> Could you please summarize this thread and add all of it as a big comment
> in the source code next to the __ref handling, to explain the motivation
> and give an example of when and how this __ref suffix should be used.
> Otherwise somebody, like me, will forget the context soon.
>
> I was thinking of a better name than __ref, but couldn't come up with one.
> __ref fits this use case the best.

Actually, maybe a kfunc flag will be better?
Like REF_ARGS
that would apply to all arguments of the kfunc
(not only those with __ref suffix).

We have three types of ptr_btf_id:
- ref counted
- untrusted
- old legacy that we cannot break due to backward compat

In the future we'll probably be adding new kfuncs where we'd want
every argument to be trusted. In our naming convention these are
the refcounted ptr_to_btf_id that come from lookup-like kfuncs.
To consume them in the release kfunc they have to be refcounted,
but non-release kfuncs (like set_timeout) also want a trusted ptr.
So the simple way to describe the intent would be:
BTF_ID(func, bpf_ct_release, RELEASE)
BTF_ID(func, bpf_ct_set_timeout, REF_ARGS)

or maybe TRUSTED_ARGS would be a better flag name.
wdyt?
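
For illustration, the registration could then look something along
these lines (pseudocode with the flag names proposed above, not an
existing macro API; the kfunc flags infrastructure that eventually
landed spells these KF_ACQUIRE, KF_RELEASE, KF_RET_NULL and
KF_TRUSTED_ARGS, registered via BTF_ID_FLAGS):

BTF_SET_START(nf_ct_kfunc_set)
BTF_ID(func, bpf_xdp_ct_alloc, ACQUIRE | RET_NULL)
BTF_ID(func, bpf_xdp_ct_lookup, ACQUIRE | RET_NULL)
BTF_ID(func, bpf_ct_insert_entry, ACQUIRE | RELEASE | RET_NULL)
BTF_ID(func, bpf_ct_release, RELEASE)
BTF_ID(func, bpf_ct_set_timeout, TRUSTED_ARGS)
BTF_ID(func, bpf_ct_change_timeout, TRUSTED_ARGS)
BTF_SET_END(nf_ct_kfunc_set)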


* Re: [PATCH bpf-next v5 1/8] bpf: Add support for forcing kfunc args to be referenced
  2022-07-06 22:04               ` Alexei Starovoitov
@ 2022-07-13 12:13                 ` Kumar Kartikeya Dwivedi
  2022-07-13 21:53                   ` Alexei Starovoitov
  0 siblings, 1 reply; 18+ messages in thread
From: Kumar Kartikeya Dwivedi @ 2022-07-13 12:13 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: bpf, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Pablo Neira Ayuso, Florian Westphal,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen,
	Lorenzo Bianconi, Network Development, netfilter-devel

On Thu, 7 Jul 2022 at 00:04, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
>
> [...]
>
> Actually, maybe a kfunc flag will be better?
> Like REF_ARGS
> that would apply to all arguments of the kfunc
> (not only those with __ref suffix).
>
> We have three types of ptr_btf_id:
> - ref counted
> - untrusted
> - old legacy that we cannot break due to backward compat
>
> In the future we'll probably be adding new kfuncs where we'd want
> every argument to be trusted. In our naming convention these are
> the refcounted ptr_to_btf_id that come from lookup-like kfuncs.
> To consume them in the release kfunc they have to be refcounted,
> but non-release kfuncs (like set_timeout) also want a trusted ptr.
> So the simple way to describe the intent would be:
> BTF_ID(func, bpf_ct_release, RELEASE)
> BTF_ID(func, bpf_ct_set_timeout, REF_ARGS)
>
> or maybe TRUSTED_ARGS would be a better flag name.
> wdyt?

Ok, I've implemented the kfunc flags and kept TRUSTED_ARGS as the
name. Just need to do a little bit of testing and will post it
together with this.

Just to confirm, should I still keep __ref or drop it? I think
TRUSTED_ARGS has its use but it may be too coarse. I already have the
patch so if you like we can add both ways now.


* Re: [PATCH bpf-next v5 1/8] bpf: Add support for forcing kfunc args to be referenced
  2022-07-13 12:13                 ` Kumar Kartikeya Dwivedi
@ 2022-07-13 21:53                   ` Alexei Starovoitov
  0 siblings, 0 replies; 18+ messages in thread
From: Alexei Starovoitov @ 2022-07-13 21:53 UTC (permalink / raw)
  To: Kumar Kartikeya Dwivedi
  Cc: bpf, Yonghong Song, Alexei Starovoitov, Andrii Nakryiko,
	Daniel Borkmann, Pablo Neira Ayuso, Florian Westphal,
	Jesper Dangaard Brouer, Toke Høiland-Jørgensen,
	Lorenzo Bianconi, Network Development, netfilter-devel

On Wed, Jul 13, 2022 at 5:13 AM Kumar Kartikeya Dwivedi
<memxor@gmail.com> wrote:
> > > Ahh. Now I remember. Thanks for reminding :)
> > > Could you please summarize this thread and add all of it as a big comment
> > > in the source code next to the __ref handling, to explain the motivation
> > > and give an example of when and how this __ref suffix should be used.
> > > Otherwise somebody, like me, will forget the context soon.
> > >
> > > I was thinking of a better name than __ref, but couldn't come up with one.
> > > __ref fits this use case the best.
> >
> > Actually, maybe a kfunc flag will be better?
> > Like REF_ARGS
> > that would apply to all arguments of the kfunc
> > (not only those with __ref suffix).
> >
> > We have three types of ptr_btf_id:
> > - ref counted
> > - untrusted
> > - old legacy that we cannot break due to backward compat
> >
> > In the future we'll probably be adding new kfuncs where we'd want
> > every argument to be trusted. In our naming convention these are
> > the refcounted ptr_to_btf_id that come from lookup-like kfuncs.
> > To consume them in the release kfunc they have to be refcounted,
> > but non-release kfuncs (like set_timeout) also want a trusted ptr.
> > So the simple way to describe the intent would be:
> > BTF_ID(func, bpf_ct_release, RELEASE)
> > BTF_ID(func, bpf_ct_set_timeout, REF_ARGS)
> >
> > or maybe TRUSTED_ARGS would be a better flag name.
> > wdyt?
>
> Ok, I've implemented the kfunc flags and kept TRUSTED_ARGS as the
> name. Just need to do a little bit of testing and will post it
> together with this.

Awesome!

> Just to confirm, should I still keep __ref or drop it? I think
> TRUSTED_ARGS has its use but it may be too coarse. I already have the
> patch so if you like we can add both ways now.

TRUSTED_ARGS may become too coarse, but let's cross that bridge
when there is actual need.
If we land __ref support right now there won't be any users
and the code will start to bit rot. So let's delay it.
Pls post that patch as an extra RFC patch anyway, so
it won't get lost.

