All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct
@ 2023-11-21 12:27 Florian Westphal
  2023-11-21 12:27 ` [PATCH nf-next 1/8] netfilter: flowtable: move nf_flowtable out of container structures Florian Westphal
                   ` (8 more replies)
  0 siblings, 9 replies; 22+ messages in thread
From: Florian Westphal @ 2023-11-21 12:27 UTC (permalink / raw)
  To: netfilter-devel; +Cc: lorenzo, netdev, Florian Westphal

This series detaches nf_flowtable from the two existing container
structures.

Allocation and freeing is moved to the flowtable core.
Then, memory release is changed so it passes through another
synchronize_rcu() call.

Next, a new nftables flowtable flag is introduced to mark a flowtable
for explicit XDP-based offload.

Such flowtables have more restrictions,
in particular, if two flowtables are tagged as 'xdp offloaded', they
cannot share any net devices.

It would be possible to avoid such new 'xdp flag', but I see no way
to do so without breaking backwards compatbility: at this time the same
net_device can be part of any number of flowtables, this is very
inefficient from an XDP point of view: it would have to perform lookups
in all associated flowtables in a loop until a match is found.

This is hardly desirable.

Last two patches expose the hash table mapping and make utility
function available for XDP.

The XDP kfunc will be added in a followup patch.

Florian Westphal (8):
  netfilter: flowtable: move nf_flowtable out of container structures
  netfilter: nf_flowtable: replace init callback with a create one
  netfilter: nf_flowtable: make free a real free function
  netfilter: nf_flowtable: delay flowtable release a second time
  netfilter: nf_tables: reject flowtable hw offload for same device
  netfilter: nf_tables: add xdp offload flag
  netfilter: nf_tables: add flowtable map for xdp offload
  netfilter: nf_tables: permit duplicate flowtable mappings

 include/net/netfilter/nf_flow_table.h    |  15 ++-
 include/net/netfilter/nf_tables.h        |  15 ++-
 include/uapi/linux/netfilter/nf_tables.h |   5 +-
 net/netfilter/nf_flow_table_core.c       |  39 ++++--
 net/netfilter/nf_flow_table_inet.c       |   6 +-
 net/netfilter/nf_flow_table_offload.c    | 157 ++++++++++++++++++++++-
 net/netfilter/nf_tables_api.c            | 113 +++++++++++-----
 net/netfilter/nft_flow_offload.c         |   4 +-
 net/sched/act_ct.c                       |  37 +++---
 9 files changed, 315 insertions(+), 76 deletions(-)

-- 
2.41.0


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH nf-next 1/8] netfilter: flowtable: move nf_flowtable out of container structures
  2023-11-21 12:27 [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct Florian Westphal
@ 2023-11-21 12:27 ` Florian Westphal
  2023-11-23 13:52   ` Simon Horman
  2023-11-21 12:27 ` [PATCH nf-next 2/8] netfilter: nf_flowtable: replace init callback with a create one Florian Westphal
                   ` (7 subsequent siblings)
  8 siblings, 1 reply; 22+ messages in thread
From: Florian Westphal @ 2023-11-21 12:27 UTC (permalink / raw)
  To: netfilter-devel; +Cc: lorenzo, netdev, Florian Westphal

struct nf_flowtable is currently wholly embedded in either nft_flowtable
or tcf_ct_flow_table.

In order to allow flowtable acceleration via XDP, the XDP program will
need to map struct net_device to struct nf_flowtable.

To make this work reliably, make a clear separation of the frontend
(nft, tc) and backend (nf_flowtable) representation.

In this first patch, amke it so nft_flowtable and tcf_ct_flow_table
only store pointers to an nf_flowtable structure.

The main goal is to have follow patches that allow us to keep the
nf_flowtable structure around for a bit longer (e.g. until after
an rcu grace period has elapesed) when the frontend(s) are tearing the
structures down.

At this time, things are fine, but when xdp programs might be using
the nf_flowtable structure as well we will need a way to ensure that
no such users exist anymore.

Right now there is inufficient guarantee: nftables only ensures
that the netfilter hooks are unregistered, and tc only ensures the
tc actions have been removed.

Any future kfunc might still be called in parallel from an XDP
program.  The easies way to resolve this is to let the nf_flowtable
core handle release and module reference counting itself.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/nf_tables.h | 15 +++++---
 net/netfilter/nf_tables_api.c     | 62 +++++++++++++++++--------------
 net/netfilter/nft_flow_offload.c  |  4 +-
 net/sched/act_ct.c                | 33 +++++++++-------
 4 files changed, 66 insertions(+), 48 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index b157c5cafd14..362eca5d0451 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -1380,8 +1380,14 @@ void nft_unregister_obj(struct nft_object_type *obj_type);
  *	@use: number of references to this flow table
  * 	@handle: unique object handle
  *	@dev_name: array of device names
- *	@data: rhashtable and garbage collector
- * 	@ops: array of hooks
+ *	@hook_list: list of struct nft_hook
+ *	@ft: pointer to underlying nf_flowtable
+ *
+ *	This structure represents the low-level
+ *	nf_flowtable within the nf_tables framework.
+ *
+ *	nf_flowtable itself has no concept of 'tables', 'transactions',
+ *	etc. They do not even have names.
  */
 struct nft_flowtable {
 	struct list_head		list;
@@ -1392,9 +1398,8 @@ struct nft_flowtable {
 	u32				genmask:2;
 	u32				use;
 	u64				handle;
-	/* runtime data below here */
-	struct list_head		hook_list ____cacheline_aligned;
-	struct nf_flowtable		data;
+	struct list_head		hook_list;
+	struct nf_flowtable		*ft;
 };
 
 struct nft_flowtable *nft_flowtable_lookup(const struct nft_table *table,
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index c0a42989b982..ee0de8d9b49c 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -8100,11 +8100,11 @@ static int nft_flowtable_parse_hook(const struct nft_ctx *ctx,
 
 		if (tb[NFTA_FLOWTABLE_HOOK_PRIORITY]) {
 			priority = ntohl(nla_get_be32(tb[NFTA_FLOWTABLE_HOOK_PRIORITY]));
-			if (priority != flowtable->data.priority)
+			if (priority != flowtable->ft->priority)
 				return -EOPNOTSUPP;
 		}
 
-		flowtable_hook->priority	= flowtable->data.priority;
+		flowtable_hook->priority	= flowtable->ft->priority;
 		flowtable_hook->num		= flowtable->hooknum;
 	}
 
@@ -8121,8 +8121,8 @@ static int nft_flowtable_parse_hook(const struct nft_ctx *ctx,
 		hook->ops.pf		= NFPROTO_NETDEV;
 		hook->ops.hooknum	= flowtable_hook->num;
 		hook->ops.priority	= flowtable_hook->priority;
-		hook->ops.priv		= &flowtable->data;
-		hook->ops.hook		= flowtable->data.type->hook;
+		hook->ops.priv		= flowtable->ft;
+		hook->ops.hook		= flowtable->ft->type->hook;
 	}
 
 	return err;
@@ -8164,8 +8164,8 @@ static void nft_unregister_flowtable_hook(struct net *net,
 					  struct nft_hook *hook)
 {
 	nf_unregister_net_hook(net, &hook->ops);
-	flowtable->data.type->setup(&flowtable->data, hook->ops.dev,
-				    FLOW_BLOCK_UNBIND);
+	flowtable->ft->type->setup(flowtable->ft, hook->ops.dev,
+				   FLOW_BLOCK_UNBIND);
 }
 
 static void __nft_unregister_flowtable_net_hooks(struct net *net,
@@ -8212,17 +8212,15 @@ static int nft_register_flowtable_net_hooks(struct net *net,
 			}
 		}
 
-		err = flowtable->data.type->setup(&flowtable->data,
-						  hook->ops.dev,
-						  FLOW_BLOCK_BIND);
+		err = flowtable->ft->type->setup(flowtable->ft, hook->ops.dev,
+						 FLOW_BLOCK_BIND);
 		if (err < 0)
 			goto err_unregister_net_hooks;
 
 		err = nf_register_net_hook(net, &hook->ops);
 		if (err < 0) {
-			flowtable->data.type->setup(&flowtable->data,
-						    hook->ops.dev,
-						    FLOW_BLOCK_UNBIND);
+			flowtable->ft->type->setup(flowtable->ft, hook->ops.dev,
+						   FLOW_BLOCK_UNBIND);
 			goto err_unregister_net_hooks;
 		}
 
@@ -8284,13 +8282,13 @@ static int nft_flowtable_update(struct nft_ctx *ctx, const struct nlmsghdr *nlh,
 			err = -EOPNOTSUPP;
 			goto err_flowtable_update_hook;
 		}
-		if ((flowtable->data.flags & NFT_FLOWTABLE_HW_OFFLOAD) ^
+		if ((flowtable->ft->flags & NFT_FLOWTABLE_HW_OFFLOAD) ^
 		    (flags & NFT_FLOWTABLE_HW_OFFLOAD)) {
 			err = -EOPNOTSUPP;
 			goto err_flowtable_update_hook;
 		}
 	} else {
-		flags = flowtable->data.flags;
+		flags = flowtable->ft->flags;
 	}
 
 	err = nft_register_flowtable_net_hooks(ctx->net, ctx->table,
@@ -8402,18 +8400,24 @@ static int nf_tables_newflowtable(struct sk_buff *skb,
 		goto err2;
 	}
 
+	flowtable->ft = kzalloc(sizeof(*flowtable->ft), GFP_KERNEL_ACCOUNT);
+	if (!flowtable->ft) {
+		err = -ENOMEM;
+		goto err3;
+	}
+
 	if (nla[NFTA_FLOWTABLE_FLAGS]) {
-		flowtable->data.flags =
+		flowtable->ft->flags =
 			ntohl(nla_get_be32(nla[NFTA_FLOWTABLE_FLAGS]));
-		if (flowtable->data.flags & ~NFT_FLOWTABLE_MASK) {
+		if (flowtable->ft->flags & ~NFT_FLOWTABLE_MASK) {
 			err = -EOPNOTSUPP;
 			goto err3;
 		}
 	}
 
-	write_pnet(&flowtable->data.net, net);
-	flowtable->data.type = type;
-	err = type->init(&flowtable->data);
+	write_pnet(&flowtable->ft->net, net);
+	flowtable->ft->type = type;
+	err = type->init(flowtable->ft);
 	if (err < 0)
 		goto err3;
 
@@ -8423,7 +8427,7 @@ static int nf_tables_newflowtable(struct sk_buff *skb,
 		goto err4;
 
 	list_splice(&flowtable_hook.list, &flowtable->hook_list);
-	flowtable->data.priority = flowtable_hook.priority;
+	flowtable->ft->priority = flowtable_hook.priority;
 	flowtable->hooknum = flowtable_hook.num;
 
 	err = nft_register_flowtable_net_hooks(ctx.net, table,
@@ -8448,8 +8452,9 @@ static int nf_tables_newflowtable(struct sk_buff *skb,
 		kfree_rcu(hook, rcu);
 	}
 err4:
-	flowtable->data.type->free(&flowtable->data);
+	flowtable->ft->type->free(flowtable->ft);
 err3:
+	kfree(flowtable->ft);
 	module_put(type->owner);
 err2:
 	kfree(flowtable->name);
@@ -8603,14 +8608,14 @@ static int nf_tables_fill_flowtable_info(struct sk_buff *skb, struct net *net,
 	}
 
 	if (nla_put_be32(skb, NFTA_FLOWTABLE_USE, htonl(flowtable->use)) ||
-	    nla_put_be32(skb, NFTA_FLOWTABLE_FLAGS, htonl(flowtable->data.flags)))
+	    nla_put_be32(skb, NFTA_FLOWTABLE_FLAGS, htonl(flowtable->ft->flags)))
 		goto nla_put_failure;
 
 	nest = nla_nest_start_noflag(skb, NFTA_FLOWTABLE_HOOK);
 	if (!nest)
 		goto nla_put_failure;
 	if (nla_put_be32(skb, NFTA_FLOWTABLE_HOOK_NUM, htonl(flowtable->hooknum)) ||
-	    nla_put_be32(skb, NFTA_FLOWTABLE_HOOK_PRIORITY, htonl(flowtable->data.priority)))
+	    nla_put_be32(skb, NFTA_FLOWTABLE_HOOK_PRIORITY, htonl(flowtable->ft->priority)))
 		goto nla_put_failure;
 
 	nest_devs = nla_nest_start_noflag(skb, NFTA_FLOWTABLE_HOOK_DEVS);
@@ -8825,15 +8830,16 @@ static void nf_tables_flowtable_destroy(struct nft_flowtable *flowtable)
 {
 	struct nft_hook *hook, *next;
 
-	flowtable->data.type->free(&flowtable->data);
+	flowtable->ft->type->free(flowtable->ft);
 	list_for_each_entry_safe(hook, next, &flowtable->hook_list, list) {
-		flowtable->data.type->setup(&flowtable->data, hook->ops.dev,
-					    FLOW_BLOCK_UNBIND);
+		flowtable->ft->type->setup(flowtable->ft, hook->ops.dev,
+					   FLOW_BLOCK_UNBIND);
 		list_del_rcu(&hook->list);
 		kfree(hook);
 	}
 	kfree(flowtable->name);
-	module_put(flowtable->data.type->owner);
+	module_put(flowtable->ft->type->owner);
+	kfree(flowtable->ft);
 	kfree(flowtable);
 }
 
@@ -10164,7 +10170,7 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
 			break;
 		case NFT_MSG_NEWFLOWTABLE:
 			if (nft_trans_flowtable_update(trans)) {
-				nft_trans_flowtable(trans)->data.flags =
+				nft_trans_flowtable(trans)->ft->flags =
 					nft_trans_flowtable_flags(trans);
 				nf_tables_flowtable_notify(&trans->ctx,
 							   nft_trans_flowtable(trans),
diff --git a/net/netfilter/nft_flow_offload.c b/net/netfilter/nft_flow_offload.c
index ab3362c483b4..83b631470bab 100644
--- a/net/netfilter/nft_flow_offload.c
+++ b/net/netfilter/nft_flow_offload.c
@@ -196,7 +196,7 @@ static void nft_dev_forward_path(struct nf_flow_route *route,
 	int i;
 
 	if (nft_dev_fill_forward_path(route, dst, ct, dir, ha, &stack) >= 0)
-		nft_dev_path_info(&stack, &info, ha, &ft->data);
+		nft_dev_path_info(&stack, &info, ha, ft->ft);
 
 	if (!info.indev || !nft_flowtable_find_dev(info.indev, ft))
 		return;
@@ -293,7 +293,7 @@ static void nft_flow_offload_eval(const struct nft_expr *expr,
 				  const struct nft_pktinfo *pkt)
 {
 	struct nft_flow_offload *priv = nft_expr_priv(expr);
-	struct nf_flowtable *flowtable = &priv->flowtable->data;
+	struct nf_flowtable *flowtable = priv->flowtable->ft;
 	struct tcphdr _tcph, *tcph = NULL;
 	struct nf_flow_route route = {};
 	enum ip_conntrack_info ctinfo;
diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index b3f4a503ee2b..2a6362becd71 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -45,7 +45,7 @@ struct tcf_ct_flow_table {
 	struct rhash_head node; /* In zones tables */
 
 	struct rcu_work rwork;
-	struct nf_flowtable nf_ft;
+	struct nf_flowtable *nf_ft;
 	refcount_t ref;
 	u16 zone;
 
@@ -305,6 +305,7 @@ static int tcf_ct_flow_table_get(struct net *net, struct tcf_ct_params *params)
 	ct_ft = kzalloc(sizeof(*ct_ft), GFP_KERNEL);
 	if (!ct_ft)
 		goto err_alloc;
+
 	refcount_set(&ct_ft->ref, 1);
 
 	ct_ft->zone = params->zone;
@@ -312,24 +313,29 @@ static int tcf_ct_flow_table_get(struct net *net, struct tcf_ct_params *params)
 	if (err)
 		goto err_insert;
 
-	ct_ft->nf_ft.type = &flowtable_ct;
-	ct_ft->nf_ft.flags |= NF_FLOWTABLE_HW_OFFLOAD |
-			      NF_FLOWTABLE_COUNTER;
-	err = nf_flow_table_init(&ct_ft->nf_ft);
+	ct_ft->nf_ft = kzalloc(sizeof(*ct_ft->nf_ft), GFP_KERNEL);
+	if (!ct_ft->nf_ft)
+		goto err_alloc;
+
+	ct_ft->nf_ft->type = &flowtable_ct;
+	ct_ft->nf_ft->flags |= NF_FLOWTABLE_HW_OFFLOAD |
+			       NF_FLOWTABLE_COUNTER;
+	err = nf_flow_table_init(ct_ft->nf_ft);
 	if (err)
 		goto err_init;
-	write_pnet(&ct_ft->nf_ft.net, net);
+	write_pnet(&ct_ft->nf_ft->net, net);
 
 	__module_get(THIS_MODULE);
 out_unlock:
 	params->ct_ft = ct_ft;
-	params->nf_ft = &ct_ft->nf_ft;
+	params->nf_ft = ct_ft->nf_ft;
 	mutex_unlock(&zones_mutex);
 
 	return 0;
 
 err_init:
 	rhashtable_remove_fast(&zones_ht, &ct_ft->node, zones_params);
+	kfree(ct_ft->nf_ft);
 err_insert:
 	kfree(ct_ft);
 err_alloc:
@@ -345,16 +351,17 @@ static void tcf_ct_flow_table_cleanup_work(struct work_struct *work)
 
 	ct_ft = container_of(to_rcu_work(work), struct tcf_ct_flow_table,
 			     rwork);
-	nf_flow_table_free(&ct_ft->nf_ft);
+	nf_flow_table_free(ct_ft->nf_ft);
 
 	/* Remove any remaining callbacks before cleanup */
-	block = &ct_ft->nf_ft.flow_block;
-	down_write(&ct_ft->nf_ft.flow_block_lock);
+	block = &ct_ft->nf_ft->flow_block;
+	down_write(&ct_ft->nf_ft->flow_block_lock);
 	list_for_each_entry_safe(block_cb, tmp_cb, &block->cb_list, list) {
 		list_del(&block_cb->list);
 		flow_block_cb_free(block_cb);
 	}
-	up_write(&ct_ft->nf_ft.flow_block_lock);
+	up_write(&ct_ft->nf_ft->flow_block_lock);
+	kfree(ct_ft->nf_ft);
 	kfree(ct_ft);
 
 	module_put(THIS_MODULE);
@@ -417,7 +424,7 @@ static void tcf_ct_flow_table_add(struct tcf_ct_flow_table *ct_ft,
 		tcf_ct_flow_tc_ifidx(entry, act_ct_ext, FLOW_OFFLOAD_DIR_REPLY);
 	}
 
-	err = flow_offload_add(&ct_ft->nf_ft, entry);
+	err = flow_offload_add(ct_ft->nf_ft, entry);
 	if (err)
 		goto err_add;
 
@@ -625,7 +632,7 @@ static bool tcf_ct_flow_table_lookup(struct tcf_ct_params *p,
 				     struct sk_buff *skb,
 				     u8 family)
 {
-	struct nf_flowtable *nf_ft = &p->ct_ft->nf_ft;
+	struct nf_flowtable *nf_ft = p->ct_ft->nf_ft;
 	struct flow_offload_tuple_rhash *tuplehash;
 	struct flow_offload_tuple tuple = {};
 	enum ip_conntrack_info ctinfo;
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH nf-next 2/8] netfilter: nf_flowtable: replace init callback with a create one
  2023-11-21 12:27 [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct Florian Westphal
  2023-11-21 12:27 ` [PATCH nf-next 1/8] netfilter: flowtable: move nf_flowtable out of container structures Florian Westphal
@ 2023-11-21 12:27 ` Florian Westphal
  2023-11-21 12:27 ` [PATCH nf-next 3/8] netfilter: nf_flowtable: make free a real free function Florian Westphal
                   ` (6 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Florian Westphal @ 2023-11-21 12:27 UTC (permalink / raw)
  To: netfilter-devel; +Cc: lorenzo, netdev, Florian Westphal

Let the flowtable core allocate the nf_flowtable structure itself.
While at it, move the net and type assignment into the flowtable core.

The returned nf_flowtable pointer can still be released with plain kfree(),
this will be changed in a followup patch.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/nf_flow_table.h |  5 +++--
 net/netfilter/nf_flow_table_core.c    | 19 ++++++++++++++-----
 net/netfilter/nf_flow_table_inet.c    |  6 +++---
 net/netfilter/nf_tables_api.c         |  8 +-------
 net/sched/act_ct.c                    | 12 ++++--------
 5 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index 36351e441316..d365eabd4a3c 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -52,7 +52,8 @@ struct nf_flow_rule {
 struct nf_flowtable_type {
 	struct list_head		list;
 	int				family;
-	int				(*init)(struct nf_flowtable *ft);
+	struct nf_flowtable *		(*create)(struct net *net,
+						  const struct nf_flowtable_type *type);
 	bool				(*gc)(const struct flow_offload *flow);
 	int				(*setup)(struct nf_flowtable *ft,
 						 struct net_device *dev,
@@ -279,7 +280,7 @@ void nf_flow_table_gc_cleanup(struct nf_flowtable *flowtable,
 			      struct net_device *dev);
 void nf_flow_table_cleanup(struct net_device *dev);
 
-int nf_flow_table_init(struct nf_flowtable *flow_table);
+struct nf_flowtable *nf_flow_table_create(struct net *net, const struct nf_flowtable_type *type);
 void nf_flow_table_free(struct nf_flowtable *flow_table);
 
 void flow_offload_teardown(struct flow_offload *flow);
diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index 920a5a29ae1d..375fc9c24149 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -531,29 +531,38 @@ void nf_flow_dnat_port(const struct flow_offload *flow, struct sk_buff *skb,
 }
 EXPORT_SYMBOL_GPL(nf_flow_dnat_port);
 
-int nf_flow_table_init(struct nf_flowtable *flowtable)
+struct nf_flowtable *nf_flow_table_create(struct net *net, const struct nf_flowtable_type *type)
 {
+	struct nf_flowtable *flowtable = kzalloc(sizeof(*flowtable), GFP_KERNEL_ACCOUNT);
 	int err;
 
+	if (!flowtable)
+		return NULL;
+
 	INIT_DELAYED_WORK(&flowtable->gc_work, nf_flow_offload_work_gc);
 	flow_block_init(&flowtable->flow_block);
 	init_rwsem(&flowtable->flow_block_lock);
 
 	err = rhashtable_init(&flowtable->rhashtable,
 			      &nf_flow_offload_rhash_params);
-	if (err < 0)
-		return err;
+	if (err < 0) {
+		kfree(flowtable);
+		return NULL;
+	}
 
 	queue_delayed_work(system_power_efficient_wq,
 			   &flowtable->gc_work, HZ);
 
+	write_pnet(&flowtable->net, net);
+	flowtable->type = type;
+
 	mutex_lock(&flowtable_lock);
 	list_add(&flowtable->list, &flowtables);
 	mutex_unlock(&flowtable_lock);
 
-	return 0;
+	return flowtable;
 }
-EXPORT_SYMBOL_GPL(nf_flow_table_init);
+EXPORT_SYMBOL_GPL(nf_flow_table_create);
 
 static void nf_flow_table_do_cleanup(struct nf_flowtable *flow_table,
 				     struct flow_offload *flow, void *data)
diff --git a/net/netfilter/nf_flow_table_inet.c b/net/netfilter/nf_flow_table_inet.c
index 9505f9d188ff..7b20a073573d 100644
--- a/net/netfilter/nf_flow_table_inet.c
+++ b/net/netfilter/nf_flow_table_inet.c
@@ -63,7 +63,7 @@ static int nf_flow_rule_route_inet(struct net *net,
 
 static struct nf_flowtable_type flowtable_inet = {
 	.family		= NFPROTO_INET,
-	.init		= nf_flow_table_init,
+	.create		= nf_flow_table_create,
 	.setup		= nf_flow_table_offload_setup,
 	.action		= nf_flow_rule_route_inet,
 	.free		= nf_flow_table_free,
@@ -73,7 +73,7 @@ static struct nf_flowtable_type flowtable_inet = {
 
 static struct nf_flowtable_type flowtable_ipv4 = {
 	.family		= NFPROTO_IPV4,
-	.init		= nf_flow_table_init,
+	.create		= nf_flow_table_create,
 	.setup		= nf_flow_table_offload_setup,
 	.action		= nf_flow_rule_route_ipv4,
 	.free		= nf_flow_table_free,
@@ -83,7 +83,7 @@ static struct nf_flowtable_type flowtable_ipv4 = {
 
 static struct nf_flowtable_type flowtable_ipv6 = {
 	.family		= NFPROTO_IPV6,
-	.init		= nf_flow_table_init,
+	.create		= nf_flow_table_create,
 	.setup		= nf_flow_table_offload_setup,
 	.action		= nf_flow_rule_route_ipv6,
 	.free		= nf_flow_table_free,
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index ee0de8d9b49c..cce82fef4488 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -8400,7 +8400,7 @@ static int nf_tables_newflowtable(struct sk_buff *skb,
 		goto err2;
 	}
 
-	flowtable->ft = kzalloc(sizeof(*flowtable->ft), GFP_KERNEL_ACCOUNT);
+	flowtable->ft = type->create(net, type);
 	if (!flowtable->ft) {
 		err = -ENOMEM;
 		goto err3;
@@ -8415,12 +8415,6 @@ static int nf_tables_newflowtable(struct sk_buff *skb,
 		}
 	}
 
-	write_pnet(&flowtable->ft->net, net);
-	flowtable->ft->type = type;
-	err = type->init(flowtable->ft);
-	if (err < 0)
-		goto err3;
-
 	err = nft_flowtable_parse_hook(&ctx, nla, &flowtable_hook, flowtable,
 				       extack, true);
 	if (err < 0)
diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index 2a6362becd71..80869cc52348 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -313,17 +313,13 @@ static int tcf_ct_flow_table_get(struct net *net, struct tcf_ct_params *params)
 	if (err)
 		goto err_insert;
 
-	ct_ft->nf_ft = kzalloc(sizeof(*ct_ft->nf_ft), GFP_KERNEL);
+	ct_ft->nf_ft->type = &flowtable_ct;
+	ct_ft->nf_ft = nf_flow_table_create(net, &flowtable_ct);
 	if (!ct_ft->nf_ft)
-		goto err_alloc;
+		goto err_create;
 
-	ct_ft->nf_ft->type = &flowtable_ct;
 	ct_ft->nf_ft->flags |= NF_FLOWTABLE_HW_OFFLOAD |
 			       NF_FLOWTABLE_COUNTER;
-	err = nf_flow_table_init(ct_ft->nf_ft);
-	if (err)
-		goto err_init;
-	write_pnet(&ct_ft->nf_ft->net, net);
 
 	__module_get(THIS_MODULE);
 out_unlock:
@@ -333,7 +329,7 @@ static int tcf_ct_flow_table_get(struct net *net, struct tcf_ct_params *params)
 
 	return 0;
 
-err_init:
+err_create:
 	rhashtable_remove_fast(&zones_ht, &ct_ft->node, zones_params);
 	kfree(ct_ft->nf_ft);
 err_insert:
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH nf-next 3/8] netfilter: nf_flowtable: make free a real free function
  2023-11-21 12:27 [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct Florian Westphal
  2023-11-21 12:27 ` [PATCH nf-next 1/8] netfilter: flowtable: move nf_flowtable out of container structures Florian Westphal
  2023-11-21 12:27 ` [PATCH nf-next 2/8] netfilter: nf_flowtable: replace init callback with a create one Florian Westphal
@ 2023-11-21 12:27 ` Florian Westphal
  2023-11-21 12:27 ` [PATCH nf-next 4/8] netfilter: nf_flowtable: delay flowtable release a second time Florian Westphal
                   ` (5 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Florian Westphal @ 2023-11-21 12:27 UTC (permalink / raw)
  To: netfilter-devel; +Cc: lorenzo, netdev, Florian Westphal

The nf_flowtable 'free' function only frees the internal data
structures, e.g. the rhashtable.

Make it so it releases the entire nf_flowtable.

This wasn't done before because the nf_flowtable structure used to be
embedded into another frontend-representation struct.

This is no longer the case, nf_flowtable gets allocated by ->create(),
and therefore should also be released via ->free().

This also moves the module_put call into the nf_flowtable core.

A followup patch will delay the actual freeing until another
rcu grace period has elapsed.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_flow_table_core.c |  2 ++
 net/netfilter/nf_tables_api.c      | 12 ++++--------
 net/sched/act_ct.c                 |  6 ++----
 3 files changed, 8 insertions(+), 12 deletions(-)

diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index 375fc9c24149..70cc4e0d5ac9 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -612,6 +612,8 @@ void nf_flow_table_free(struct nf_flowtable *flow_table)
 	nf_flow_table_gc_run(flow_table);
 	nf_flow_table_offload_flush_cleanup(flow_table);
 	rhashtable_destroy(&flow_table->rhashtable);
+	module_put(flow_table->type->owner);
+	kfree(flow_table);
 }
 EXPORT_SYMBOL_GPL(nf_flow_table_free);
 
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index cce82fef4488..e779e275d694 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -8402,8 +8402,9 @@ static int nf_tables_newflowtable(struct sk_buff *skb,
 
 	flowtable->ft = type->create(net, type);
 	if (!flowtable->ft) {
+		module_put(type->owner);
 		err = -ENOMEM;
-		goto err3;
+		goto err2;
 	}
 
 	if (nla[NFTA_FLOWTABLE_FLAGS]) {
@@ -8411,7 +8412,7 @@ static int nf_tables_newflowtable(struct sk_buff *skb,
 			ntohl(nla_get_be32(nla[NFTA_FLOWTABLE_FLAGS]));
 		if (flowtable->ft->flags & ~NFT_FLOWTABLE_MASK) {
 			err = -EOPNOTSUPP;
-			goto err3;
+			goto err4;
 		}
 	}
 
@@ -8447,9 +8448,6 @@ static int nf_tables_newflowtable(struct sk_buff *skb,
 	}
 err4:
 	flowtable->ft->type->free(flowtable->ft);
-err3:
-	kfree(flowtable->ft);
-	module_put(type->owner);
 err2:
 	kfree(flowtable->name);
 err1:
@@ -8824,7 +8822,6 @@ static void nf_tables_flowtable_destroy(struct nft_flowtable *flowtable)
 {
 	struct nft_hook *hook, *next;
 
-	flowtable->ft->type->free(flowtable->ft);
 	list_for_each_entry_safe(hook, next, &flowtable->hook_list, list) {
 		flowtable->ft->type->setup(flowtable->ft, hook->ops.dev,
 					   FLOW_BLOCK_UNBIND);
@@ -8832,8 +8829,7 @@ static void nf_tables_flowtable_destroy(struct nft_flowtable *flowtable)
 		kfree(hook);
 	}
 	kfree(flowtable->name);
-	module_put(flowtable->ft->type->owner);
-	kfree(flowtable->ft);
+	flowtable->ft->type->free(flowtable->ft);
 	kfree(flowtable);
 }
 
diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
index 80869cc52348..dc17b313c175 100644
--- a/net/sched/act_ct.c
+++ b/net/sched/act_ct.c
@@ -321,6 +321,7 @@ static int tcf_ct_flow_table_get(struct net *net, struct tcf_ct_params *params)
 	ct_ft->nf_ft->flags |= NF_FLOWTABLE_HW_OFFLOAD |
 			       NF_FLOWTABLE_COUNTER;
 
+	/* released via nf_flow_table_free() */
 	__module_get(THIS_MODULE);
 out_unlock:
 	params->ct_ft = ct_ft;
@@ -347,7 +348,6 @@ static void tcf_ct_flow_table_cleanup_work(struct work_struct *work)
 
 	ct_ft = container_of(to_rcu_work(work), struct tcf_ct_flow_table,
 			     rwork);
-	nf_flow_table_free(ct_ft->nf_ft);
 
 	/* Remove any remaining callbacks before cleanup */
 	block = &ct_ft->nf_ft->flow_block;
@@ -357,10 +357,8 @@ static void tcf_ct_flow_table_cleanup_work(struct work_struct *work)
 		flow_block_cb_free(block_cb);
 	}
 	up_write(&ct_ft->nf_ft->flow_block_lock);
-	kfree(ct_ft->nf_ft);
+	nf_flow_table_free(ct_ft->nf_ft);
 	kfree(ct_ft);
-
-	module_put(THIS_MODULE);
 }
 
 static void tcf_ct_flow_table_put(struct tcf_ct_flow_table *ct_ft)
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH nf-next 4/8] netfilter: nf_flowtable: delay flowtable release a second time
  2023-11-21 12:27 [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct Florian Westphal
                   ` (2 preceding siblings ...)
  2023-11-21 12:27 ` [PATCH nf-next 3/8] netfilter: nf_flowtable: make free a real free function Florian Westphal
@ 2023-11-21 12:27 ` Florian Westphal
  2023-11-21 12:27 ` [PATCH nf-next 5/8] netfilter: nf_tables: reject flowtable hw offload for same device Florian Westphal
                   ` (4 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Florian Westphal @ 2023-11-21 12:27 UTC (permalink / raw)
  To: netfilter-devel; +Cc: lorenzo, netdev, Florian Westphal

At this time the frontends (tc, nftables) ensure that the nf_flowtable
is removed after the frontend hooks are gone (tc action, netfilter hooks).

In both cases the nf_flowtable can be safely free'd as no packets will
be traversing these hooks anymore.

However, the upcoming nf_flowtable kfunc for XDP will still have a
pointer to the nf_flowtable in its own net_device -> nf_flowtable
mapping.

This mapping is removed via the flow_block UNBIND callback.

This callback however comes after an rcu grace period, not before.

Therefore defer the real freeing via call_rcu so that no kfunc can
possibly be using the nf_flowtable (or flow entries within) anymore.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/nf_flow_table.h |  2 ++
 net/netfilter/nf_flow_table_core.c    | 18 ++++++++++++++----
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index d365eabd4a3c..6598ac455d17 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -83,6 +83,8 @@ struct nf_flowtable {
 	struct flow_block		flow_block;
 	struct rw_semaphore		flow_block_lock; /* Guards flow_block */
 	possible_net_t			net;
+
+	struct rcu_work			rwork;
 };
 
 static inline bool nf_flowtable_hw_offload(struct nf_flowtable *flowtable)
diff --git a/net/netfilter/nf_flow_table_core.c b/net/netfilter/nf_flow_table_core.c
index 70cc4e0d5ac9..cae27f8f0f68 100644
--- a/net/netfilter/nf_flow_table_core.c
+++ b/net/netfilter/nf_flow_table_core.c
@@ -599,11 +599,11 @@ void nf_flow_table_cleanup(struct net_device *dev)
 }
 EXPORT_SYMBOL_GPL(nf_flow_table_cleanup);
 
-void nf_flow_table_free(struct nf_flowtable *flow_table)
+static void nf_flow_table_free_rwork(struct work_struct *work)
 {
-	mutex_lock(&flowtable_lock);
-	list_del(&flow_table->list);
-	mutex_unlock(&flowtable_lock);
+	struct nf_flowtable *flow_table;
+
+	flow_table = container_of(to_rcu_work(work), struct nf_flowtable, rwork);
 
 	cancel_delayed_work_sync(&flow_table->gc_work);
 	nf_flow_table_offload_flush(flow_table);
@@ -615,6 +615,16 @@ void nf_flow_table_free(struct nf_flowtable *flow_table)
 	module_put(flow_table->type->owner);
 	kfree(flow_table);
 }
+
+void nf_flow_table_free(struct nf_flowtable *flow_table)
+{
+	mutex_lock(&flowtable_lock);
+	list_del(&flow_table->list);
+	mutex_unlock(&flowtable_lock);
+
+	INIT_RCU_WORK(&flow_table->rwork, nf_flow_table_free_rwork);
+	queue_rcu_work(system_power_efficient_wq, &flow_table->rwork);
+}
 EXPORT_SYMBOL_GPL(nf_flow_table_free);
 
 static int nf_flow_table_init_net(struct net *net)
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH nf-next 5/8] netfilter: nf_tables: reject flowtable hw offload for same device
  2023-11-21 12:27 [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct Florian Westphal
                   ` (3 preceding siblings ...)
  2023-11-21 12:27 ` [PATCH nf-next 4/8] netfilter: nf_flowtable: delay flowtable release a second time Florian Westphal
@ 2023-11-21 12:27 ` Florian Westphal
  2023-11-21 12:27 ` [PATCH nf-next 6/8] netfilter: nf_tables: add xdp offload flag Florian Westphal
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Florian Westphal @ 2023-11-21 12:27 UTC (permalink / raw)
  To: netfilter-devel; +Cc: lorenzo, netdev, Florian Westphal

The existing check is not sufficient, as it only considers flowtables
within the same table.

In case of HW offload, we need check all flowtables that exist in
the same net namespace.

We can skip flowtables that are slated for removal (not active
in next gen).

Ideally this check would supersede the existing one, but this is
probably too risky and might prevent existing configs from working.

As is, you can do all of the following:

table ip t { flowtable f { devices = { lo  } } }
table ip6 t { flowtable f { devices = { lo  } } }
table inet t { flowtable f { devices = { lo  } } }

... but IMO this should not be possible in the first place.

Disable this for HW offload.

This is related to XDP flowtable work, the idea is to keep a small
hashtable that has a 'struct net_device := struct nf_flowtable' map.

This mapping must be unique.  The idea is to add a "XDP OFFLOAD"
flag to nftables api and then have this function run for 'xdp offload'
case too.

This is useful, because it would permit the "xdp offload" hashtable
to tolerate duplicate keys -- they would only occur during transactional
updates, e.g. a flush of the current table combined with atomic reload.

Without this change, the nf_flowtable core cannot tell when flowtable
is a real duplicate, or just a temporary artefact of the
two-phase-commit protocol (i.e., the clashing entry is queued for removal).

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_tables_api.c | 36 +++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index e779e275d694..7437b997ca7e 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -8189,6 +8189,37 @@ static void nft_unregister_flowtable_net_hooks(struct net *net,
 	__nft_unregister_flowtable_net_hooks(net, hook_list, false);
 }
 
+static bool nft_flowtable_offload_clash(struct net *net,
+					const struct nft_hook *hook,
+					struct nft_flowtable *flowtable)
+{
+	const struct nftables_pernet *nft_net;
+	struct nft_flowtable *existing_ft;
+	const struct nft_table *table;
+
+	/* No offload requested, no need to validate */
+	if (!nf_flowtable_hw_offload(flowtable->ft))
+		return false;
+
+	nft_net = nft_pernet(net);
+
+	list_for_each_entry(table, &nft_net->tables, list) {
+		list_for_each_entry(existing_ft, &table->flowtables, list) {
+			const struct nft_hook *hook2;
+
+			if (!nft_is_active_next(net, existing_ft))
+				continue;
+
+			list_for_each_entry(hook2, &existing_ft->hook_list, list) {
+				if (hook->ops.dev == hook2->ops.dev)
+					return true;
+			}
+		}
+	}
+
+	return false;
+}
+
 static int nft_register_flowtable_net_hooks(struct net *net,
 					    struct nft_table *table,
 					    struct list_head *hook_list,
@@ -8199,6 +8230,11 @@ static int nft_register_flowtable_net_hooks(struct net *net,
 	int err, i = 0;
 
 	list_for_each_entry(hook, hook_list, list) {
+		if (nft_flowtable_offload_clash(net, hook, flowtable)) {
+			err = -EEXIST;
+			goto err_unregister_net_hooks;
+		}
+
 		list_for_each_entry(ft, &table->flowtables, list) {
 			if (!nft_is_active_next(net, ft))
 				continue;
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH nf-next 6/8] netfilter: nf_tables: add xdp offload flag
  2023-11-21 12:27 [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct Florian Westphal
                   ` (4 preceding siblings ...)
  2023-11-21 12:27 ` [PATCH nf-next 5/8] netfilter: nf_tables: reject flowtable hw offload for same device Florian Westphal
@ 2023-11-21 12:27 ` Florian Westphal
  2023-11-21 12:27 ` [PATCH nf-next 7/8] netfilter: nf_tables: add flowtable map for xdp offload Florian Westphal
                   ` (2 subsequent siblings)
  8 siblings, 0 replies; 22+ messages in thread
From: Florian Westphal @ 2023-11-21 12:27 UTC (permalink / raw)
  To: netfilter-devel; +Cc: lorenzo, netdev, Florian Westphal

Also make sure this flag cannot be set or cleared, just like
with the regular hw offload flag.

Otherwise we'd have to add more complexity and explicitly
withdraw or add the device to the netdev -> flowtable mapping.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/nf_flow_table.h    |  1 +
 include/uapi/linux/netfilter/nf_tables.h |  5 ++++-
 net/netfilter/nf_tables_api.c            | 14 ++++++++++++--
 3 files changed, 17 insertions(+), 3 deletions(-)

diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index 6598ac455d17..11985d9b8370 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -70,6 +70,7 @@ struct nf_flowtable_type {
 enum nf_flowtable_flags {
 	NF_FLOWTABLE_HW_OFFLOAD		= 0x1,	/* NFT_FLOWTABLE_HW_OFFLOAD */
 	NF_FLOWTABLE_COUNTER		= 0x2,	/* NFT_FLOWTABLE_COUNTER */
+	NF_FLOWTABLE_XDP_OFFLOAD	= 0x4,	/* NFT_FLOWTABLE_XDP_OFFLOAD */
 };
 
 struct nf_flowtable {
diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h
index ca30232b7bc8..ed297dc77288 100644
--- a/include/uapi/linux/netfilter/nf_tables.h
+++ b/include/uapi/linux/netfilter/nf_tables.h
@@ -1675,12 +1675,15 @@ enum nft_object_attributes {
  *
  * @NFT_FLOWTABLE_HW_OFFLOAD: flowtable hardware offload is enabled
  * @NFT_FLOWTABLE_COUNTER: enable flow counters
+ * @NFT_FLOWTABLE_XDP_OFFLOAD: flowtable xdp offload is enabled
  */
 enum nft_flowtable_flags {
 	NFT_FLOWTABLE_HW_OFFLOAD	= 0x1,
 	NFT_FLOWTABLE_COUNTER		= 0x2,
+	NFT_FLOWTABLE_XDP_OFFLOAD	= 0x4,
 	NFT_FLOWTABLE_MASK		= (NFT_FLOWTABLE_HW_OFFLOAD |
-					   NFT_FLOWTABLE_COUNTER)
+					   NFT_FLOWTABLE_COUNTER |
+					   NFT_FLOWTABLE_XDP_OFFLOAD),
 };
 
 /**
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 7437b997ca7e..4e21311ec768 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -8288,6 +8288,15 @@ static void nft_hooks_destroy(struct list_head *hook_list)
 	}
 }
 
+static bool nft_flowtable_flag_changes(const struct nf_flowtable *ft,
+				       unsigned int new_flags, enum nft_flowtable_flags flag)
+{
+	if ((ft->flags & flag) ^ (new_flags & flag))
+		return true;
+
+	return false;
+}
+
 static int nft_flowtable_update(struct nft_ctx *ctx, const struct nlmsghdr *nlh,
 				struct nft_flowtable *flowtable,
 				struct netlink_ext_ack *extack)
@@ -8318,8 +8327,9 @@ static int nft_flowtable_update(struct nft_ctx *ctx, const struct nlmsghdr *nlh,
 			err = -EOPNOTSUPP;
 			goto err_flowtable_update_hook;
 		}
-		if ((flowtable->ft->flags & NFT_FLOWTABLE_HW_OFFLOAD) ^
-		    (flags & NFT_FLOWTABLE_HW_OFFLOAD)) {
+
+		if (nft_flowtable_flag_changes(flowtable->ft, flags, NFT_FLOWTABLE_HW_OFFLOAD) ||
+		    nft_flowtable_flag_changes(flowtable->ft, flags, NFT_FLOWTABLE_XDP_OFFLOAD)) {
 			err = -EOPNOTSUPP;
 			goto err_flowtable_update_hook;
 		}
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH nf-next 7/8] netfilter: nf_tables: add flowtable map for xdp offload
  2023-11-21 12:27 [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct Florian Westphal
                   ` (5 preceding siblings ...)
  2023-11-21 12:27 ` [PATCH nf-next 6/8] netfilter: nf_tables: add xdp offload flag Florian Westphal
@ 2023-11-21 12:27 ` Florian Westphal
  2023-11-21 14:25   ` Lorenzo Bianconi
  2023-11-24 10:59   ` Toke Høiland-Jørgensen
  2023-11-21 12:27 ` [PATCH nf-next 8/8] netfilter: nf_tables: permit duplicate flowtable mappings Florian Westphal
  2023-11-24  9:50 ` [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct Pablo Neira Ayuso
  8 siblings, 2 replies; 22+ messages in thread
From: Florian Westphal @ 2023-11-21 12:27 UTC (permalink / raw)
  To: netfilter-devel; +Cc: lorenzo, netdev, Florian Westphal

This adds a small internal mapping table so that a new bpf (xdp) kfunc
can perform lookups in a flowtable.

As-is, xdp program has access to the device pointer, but no way to do a
lookup in a flowtable -- there is no way to obtain the needed struct
without questionable stunts.

This allows to obtain an nf_flowtable pointer given a net_device
structure.

A device cannot be added to multiple flowtables, the mapping needs
to be unique.  This is enforced when a flowtables with the
NF_FLOWTABLE_XDP_OFFLOAD was added.

Exposure of this NF_FLOWTABLE_XDP_OFFLOAD in UAPI could be avoided,
iff the 'net_device maps to 0 or 1 flowtable' paradigm is enforced
regardless of offload-or-not flag.

HOWEVER, that does break existing behaviour.

An alternative would be to repurpose the hw offload flag by allowing
XDP fallback when hw offload cannot be done due to lack of ndo
callbacks.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 include/net/netfilter/nf_flow_table.h |   7 ++
 net/netfilter/nf_flow_table_offload.c | 131 +++++++++++++++++++++++++-
 net/netfilter/nf_tables_api.c         |   3 +-
 3 files changed, 139 insertions(+), 2 deletions(-)

diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
index 11985d9b8370..b8b7fcb98732 100644
--- a/include/net/netfilter/nf_flow_table.h
+++ b/include/net/netfilter/nf_flow_table.h
@@ -93,6 +93,11 @@ static inline bool nf_flowtable_hw_offload(struct nf_flowtable *flowtable)
 	return flowtable->flags & NF_FLOWTABLE_HW_OFFLOAD;
 }
 
+static inline bool nf_flowtable_xdp_offload(struct nf_flowtable *flowtable)
+{
+	return flowtable->flags & NF_FLOWTABLE_XDP_OFFLOAD;
+}
+
 enum flow_offload_tuple_dir {
 	FLOW_OFFLOAD_DIR_ORIGINAL = IP_CT_DIR_ORIGINAL,
 	FLOW_OFFLOAD_DIR_REPLY = IP_CT_DIR_REPLY,
@@ -299,6 +304,8 @@ struct flow_ports {
 	__be16 source, dest;
 };
 
+struct nf_flowtable *nf_flowtable_by_dev(const struct net_device *dev);
+
 unsigned int nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb,
 				     const struct nf_hook_state *state);
 unsigned int nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb,
diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
index a010b25076ca..9ec7aa4ad2e5 100644
--- a/net/netfilter/nf_flow_table_offload.c
+++ b/net/netfilter/nf_flow_table_offload.c
@@ -17,6 +17,92 @@ static struct workqueue_struct *nf_flow_offload_add_wq;
 static struct workqueue_struct *nf_flow_offload_del_wq;
 static struct workqueue_struct *nf_flow_offload_stats_wq;
 
+struct flow_offload_xdp {
+	struct hlist_node hnode;
+
+	unsigned long net_device_addr;
+	struct nf_flowtable *ft;
+
+	struct rcu_head	rcuhead;
+};
+
+#define NF_XDP_HT_BITS	4
+static DEFINE_HASHTABLE(nf_xdp_hashtable, NF_XDP_HT_BITS);
+static DEFINE_MUTEX(nf_xdp_hashtable_lock);
+
+/* caller must hold rcu read lock */
+struct nf_flowtable *nf_flowtable_by_dev(const struct net_device *dev)
+{
+	unsigned long key = (unsigned long)dev;
+	const struct flow_offload_xdp *cur;
+
+	hash_for_each_possible_rcu(nf_xdp_hashtable, cur, hnode, key) {
+		if (key == cur->net_device_addr)
+			return cur->ft;
+	}
+
+	return NULL;
+}
+
+static int nf_flowtable_by_dev_insert(struct nf_flowtable *ft,
+				      const struct net_device *dev)
+{
+	unsigned long key = (unsigned long)dev;
+	struct flow_offload_xdp *cur;
+	int err = 0;
+
+	mutex_lock(&nf_xdp_hashtable_lock);
+	hash_for_each_possible(nf_xdp_hashtable, cur, hnode, key) {
+		if (key != cur->net_device_addr)
+			continue;
+		err = -EEXIST;
+		break;
+	}
+
+	if (err == 0) {
+		struct flow_offload_xdp *new;
+
+		new = kzalloc(sizeof(*new), GFP_KERNEL_ACCOUNT);
+		if (new) {
+			new->net_device_addr = key;
+			new->ft = ft;
+
+			hash_add_rcu(nf_xdp_hashtable, &new->hnode, key);
+		} else {
+			err = -ENOMEM;
+		}
+	}
+
+	mutex_unlock(&nf_xdp_hashtable_lock);
+
+	DEBUG_NET_WARN_ON_ONCE(err == 0 && nf_flowtable_by_dev(dev) != ft);
+
+	return err;
+}
+
+static void nf_flowtable_by_dev_remove(const struct net_device *dev)
+{
+	unsigned long key = (unsigned long)dev;
+	struct flow_offload_xdp *cur;
+	bool found = false;
+
+	mutex_lock(&nf_xdp_hashtable_lock);
+
+	hash_for_each_possible(nf_xdp_hashtable, cur, hnode, key) {
+		if (key != cur->net_device_addr)
+			continue;
+
+		hash_del_rcu(&cur->hnode);
+		kfree_rcu(cur, rcuhead);
+		found = true;
+		break;
+	}
+
+	mutex_unlock(&nf_xdp_hashtable_lock);
+
+	WARN_ON_ONCE(!found);
+}
+
 struct flow_offload_work {
 	struct list_head	list;
 	enum flow_cls_command	cmd;
@@ -1183,6 +1269,44 @@ static int nf_flow_table_offload_cmd(struct flow_block_offload *bo,
 	return 0;
 }
 
+static int nf_flow_offload_xdp_setup(struct nf_flowtable *flowtable,
+				     struct net_device *dev,
+				     enum flow_block_command cmd)
+{
+	if (!nf_flowtable_xdp_offload(flowtable))
+		return 0;
+
+	switch (cmd) {
+	case FLOW_BLOCK_BIND:
+		return nf_flowtable_by_dev_insert(flowtable, dev);
+	case FLOW_BLOCK_UNBIND:
+		nf_flowtable_by_dev_remove(dev);
+		return 0;
+	}
+
+	WARN_ON_ONCE(1);
+	return 0;
+}
+
+static void nf_flow_offload_xdp_cancel(struct nf_flowtable *flowtable,
+				       struct net_device *dev,
+				       enum flow_block_command cmd)
+{
+	if (!nf_flowtable_xdp_offload(flowtable))
+		return;
+
+	switch (cmd) {
+	case FLOW_BLOCK_BIND:
+		nf_flowtable_by_dev_remove(dev);
+		return;
+	case FLOW_BLOCK_UNBIND:
+		/* We do not re-bind in case hw offload would report error
+		 * on *unregister*.
+		 */
+		break;
+	}
+}
+
 int nf_flow_table_offload_setup(struct nf_flowtable *flowtable,
 				struct net_device *dev,
 				enum flow_block_command cmd)
@@ -1191,6 +1315,9 @@ int nf_flow_table_offload_setup(struct nf_flowtable *flowtable,
 	struct flow_block_offload bo;
 	int err;
 
+	if (nf_flow_offload_xdp_setup(flowtable, dev, cmd))
+		return -EBUSY;
+
 	if (!nf_flowtable_hw_offload(flowtable))
 		return 0;
 
@@ -1200,8 +1327,10 @@ int nf_flow_table_offload_setup(struct nf_flowtable *flowtable,
 	else
 		err = nf_flow_table_indr_offload_cmd(&bo, flowtable, dev, cmd,
 						     &extack);
-	if (err < 0)
+	if (err < 0) {
+		nf_flow_offload_xdp_cancel(flowtable, dev, cmd);
 		return err;
+	}
 
 	return nf_flow_table_block_setup(flowtable, &bo, cmd);
 }
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 4e21311ec768..223ca4d0e2a5 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -8198,7 +8198,8 @@ static bool nft_flowtable_offload_clash(struct net *net,
 	const struct nft_table *table;
 
 	/* No offload requested, no need to validate */
-	if (!nf_flowtable_hw_offload(flowtable->ft))
+	if (!nf_flowtable_hw_offload(flowtable->ft) &&
+	    !nf_flowtable_xdp_offload(flowtable->ft))
 		return false;
 
 	nft_net = nft_pernet(net);
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* [PATCH nf-next 8/8] netfilter: nf_tables: permit duplicate flowtable mappings
  2023-11-21 12:27 [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct Florian Westphal
                   ` (6 preceding siblings ...)
  2023-11-21 12:27 ` [PATCH nf-next 7/8] netfilter: nf_tables: add flowtable map for xdp offload Florian Westphal
@ 2023-11-21 12:27 ` Florian Westphal
  2023-11-24  9:50 ` [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct Pablo Neira Ayuso
  8 siblings, 0 replies; 22+ messages in thread
From: Florian Westphal @ 2023-11-21 12:27 UTC (permalink / raw)
  To: netfilter-devel; +Cc: lorenzo, netdev, Florian Westphal

The core ensures that no duplicate mapping (net_device x is always
assigned to exactly one flowtable, or none at all) exists.

Only exception: its assigned to a flowtable that is going away
in this transaction.

Therefore relax the existing check to permit the duplicate
entry, it is only temporary.

The existing entry will shadow the new one until the transaction
is committed (old entry is removed) or aborted (new entry is removed).

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nf_flow_table_offload.c | 36 +++++++++++++++++++++++----
 1 file changed, 31 insertions(+), 5 deletions(-)

diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
index 9ec7aa4ad2e5..b6503fd45871 100644
--- a/net/netfilter/nf_flow_table_offload.c
+++ b/net/netfilter/nf_flow_table_offload.c
@@ -49,13 +49,14 @@ static int nf_flowtable_by_dev_insert(struct nf_flowtable *ft,
 {
 	unsigned long key = (unsigned long)dev;
 	struct flow_offload_xdp *cur;
+	bool duplicate = false;
 	int err = 0;
 
 	mutex_lock(&nf_xdp_hashtable_lock);
 	hash_for_each_possible(nf_xdp_hashtable, cur, hnode, key) {
 		if (key != cur->net_device_addr)
 			continue;
-		err = -EEXIST;
+		duplicate = true;
 		break;
 	}
 
@@ -67,7 +68,20 @@ static int nf_flowtable_by_dev_insert(struct nf_flowtable *ft,
 			new->net_device_addr = key;
 			new->ft = ft;
 
-			hash_add_rcu(nf_xdp_hashtable, &new->hnode, key);
+			if (duplicate) {
+				u32 index = hash_min(key, HASH_BITS(nf_xdp_hashtable));
+
+				/* nf_tables_api.c makes sure that only a single flowtable
+				 * has this device.
+				 *
+				 * Only exception: the flowtable is about to be removed.
+				 * Thus we tolerate the duplicated entry, the 'old' entry
+				 * will be unhashed after the transaction completes.
+				 */
+				hlist_add_tail_rcu(&new->hnode, &nf_xdp_hashtable[index]);
+			} else {
+				hash_add_rcu(nf_xdp_hashtable, &new->hnode, key);
+			}
 		} else {
 			err = -ENOMEM;
 		}
@@ -80,7 +94,8 @@ static int nf_flowtable_by_dev_insert(struct nf_flowtable *ft,
 	return err;
 }
 
-static void nf_flowtable_by_dev_remove(const struct net_device *dev)
+static void nf_flowtable_by_dev_remove(const struct nf_flowtable *ft,
+				       const struct net_device *dev)
 {
 	unsigned long key = (unsigned long)dev;
 	struct flow_offload_xdp *cur;
@@ -92,6 +107,17 @@ static void nf_flowtable_by_dev_remove(const struct net_device *dev)
 		if (key != cur->net_device_addr)
 			continue;
 
+		/* duplicate entry, happens when current transaction
+		 * removes flowtable f1 and adds flowtable f2, where both
+		 * have *dev assigned to them.
+		 *
+		 * nf_tables_api.c makes sure that at most one
+		 * flowtable,dev pair with 'xdp' flag enabled can exist
+		 * in the same generation.
+		 */
+		if (cur->ft != ft)
+			continue;
+
 		hash_del_rcu(&cur->hnode);
 		kfree_rcu(cur, rcuhead);
 		found = true;
@@ -1280,7 +1306,7 @@ static int nf_flow_offload_xdp_setup(struct nf_flowtable *flowtable,
 	case FLOW_BLOCK_BIND:
 		return nf_flowtable_by_dev_insert(flowtable, dev);
 	case FLOW_BLOCK_UNBIND:
-		nf_flowtable_by_dev_remove(dev);
+		nf_flowtable_by_dev_remove(flowtable, dev);
 		return 0;
 	}
 
@@ -1297,7 +1323,7 @@ static void nf_flow_offload_xdp_cancel(struct nf_flowtable *flowtable,
 
 	switch (cmd) {
 	case FLOW_BLOCK_BIND:
-		nf_flowtable_by_dev_remove(dev);
+		nf_flowtable_by_dev_remove(flowtable, dev);
 		return;
 	case FLOW_BLOCK_UNBIND:
 		/* We do not re-bind in case hw offload would report error
-- 
2.41.0


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* Re: [PATCH nf-next 7/8] netfilter: nf_tables: add flowtable map for xdp offload
  2023-11-21 12:27 ` [PATCH nf-next 7/8] netfilter: nf_tables: add flowtable map for xdp offload Florian Westphal
@ 2023-11-21 14:25   ` Lorenzo Bianconi
  2023-11-24 10:59   ` Toke Høiland-Jørgensen
  1 sibling, 0 replies; 22+ messages in thread
From: Lorenzo Bianconi @ 2023-11-21 14:25 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel, netdev

[-- Attachment #1: Type: text/plain, Size: 7529 bytes --]

> This adds a small internal mapping table so that a new bpf (xdp) kfunc
> can perform lookups in a flowtable.
> 
> As-is, xdp program has access to the device pointer, but no way to do a
> lookup in a flowtable -- there is no way to obtain the needed struct
> without questionable stunts.
> 
> This allows to obtain an nf_flowtable pointer given a net_device
> structure.
> 
> A device cannot be added to multiple flowtables, the mapping needs
> to be unique.  This is enforced when a flowtables with the
> NF_FLOWTABLE_XDP_OFFLOAD was added.
> 
> Exposure of this NF_FLOWTABLE_XDP_OFFLOAD in UAPI could be avoided,
> iff the 'net_device maps to 0 or 1 flowtable' paradigm is enforced
> regardless of offload-or-not flag.
> 
> HOWEVER, that does break existing behaviour.
> 
> An alternative would be to repurpose the hw offload flag by allowing
> XDP fallback when hw offload cannot be done due to lack of ndo
> callbacks.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>

Tested-by: Lorenzo Bianconi <lorenzo@kernel.org>

> ---
>  include/net/netfilter/nf_flow_table.h |   7 ++
>  net/netfilter/nf_flow_table_offload.c | 131 +++++++++++++++++++++++++-
>  net/netfilter/nf_tables_api.c         |   3 +-
>  3 files changed, 139 insertions(+), 2 deletions(-)
> 
> diff --git a/include/net/netfilter/nf_flow_table.h b/include/net/netfilter/nf_flow_table.h
> index 11985d9b8370..b8b7fcb98732 100644
> --- a/include/net/netfilter/nf_flow_table.h
> +++ b/include/net/netfilter/nf_flow_table.h
> @@ -93,6 +93,11 @@ static inline bool nf_flowtable_hw_offload(struct nf_flowtable *flowtable)
>  	return flowtable->flags & NF_FLOWTABLE_HW_OFFLOAD;
>  }
>  
> +static inline bool nf_flowtable_xdp_offload(struct nf_flowtable *flowtable)
> +{
> +	return flowtable->flags & NF_FLOWTABLE_XDP_OFFLOAD;
> +}
> +
>  enum flow_offload_tuple_dir {
>  	FLOW_OFFLOAD_DIR_ORIGINAL = IP_CT_DIR_ORIGINAL,
>  	FLOW_OFFLOAD_DIR_REPLY = IP_CT_DIR_REPLY,
> @@ -299,6 +304,8 @@ struct flow_ports {
>  	__be16 source, dest;
>  };
>  
> +struct nf_flowtable *nf_flowtable_by_dev(const struct net_device *dev);
> +
>  unsigned int nf_flow_offload_ip_hook(void *priv, struct sk_buff *skb,
>  				     const struct nf_hook_state *state);
>  unsigned int nf_flow_offload_ipv6_hook(void *priv, struct sk_buff *skb,
> diff --git a/net/netfilter/nf_flow_table_offload.c b/net/netfilter/nf_flow_table_offload.c
> index a010b25076ca..9ec7aa4ad2e5 100644
> --- a/net/netfilter/nf_flow_table_offload.c
> +++ b/net/netfilter/nf_flow_table_offload.c
> @@ -17,6 +17,92 @@ static struct workqueue_struct *nf_flow_offload_add_wq;
>  static struct workqueue_struct *nf_flow_offload_del_wq;
>  static struct workqueue_struct *nf_flow_offload_stats_wq;
>  
> +struct flow_offload_xdp {
> +	struct hlist_node hnode;
> +
> +	unsigned long net_device_addr;
> +	struct nf_flowtable *ft;
> +
> +	struct rcu_head	rcuhead;
> +};
> +
> +#define NF_XDP_HT_BITS	4
> +static DEFINE_HASHTABLE(nf_xdp_hashtable, NF_XDP_HT_BITS);
> +static DEFINE_MUTEX(nf_xdp_hashtable_lock);
> +
> +/* caller must hold rcu read lock */
> +struct nf_flowtable *nf_flowtable_by_dev(const struct net_device *dev)
> +{
> +	unsigned long key = (unsigned long)dev;
> +	const struct flow_offload_xdp *cur;
> +
> +	hash_for_each_possible_rcu(nf_xdp_hashtable, cur, hnode, key) {
> +		if (key == cur->net_device_addr)
> +			return cur->ft;
> +	}
> +
> +	return NULL;
> +}
> +
> +static int nf_flowtable_by_dev_insert(struct nf_flowtable *ft,
> +				      const struct net_device *dev)
> +{
> +	unsigned long key = (unsigned long)dev;
> +	struct flow_offload_xdp *cur;
> +	int err = 0;
> +
> +	mutex_lock(&nf_xdp_hashtable_lock);
> +	hash_for_each_possible(nf_xdp_hashtable, cur, hnode, key) {
> +		if (key != cur->net_device_addr)
> +			continue;
> +		err = -EEXIST;
> +		break;
> +	}
> +
> +	if (err == 0) {
> +		struct flow_offload_xdp *new;
> +
> +		new = kzalloc(sizeof(*new), GFP_KERNEL_ACCOUNT);
> +		if (new) {
> +			new->net_device_addr = key;
> +			new->ft = ft;
> +
> +			hash_add_rcu(nf_xdp_hashtable, &new->hnode, key);
> +		} else {
> +			err = -ENOMEM;
> +		}
> +	}
> +
> +	mutex_unlock(&nf_xdp_hashtable_lock);
> +
> +	DEBUG_NET_WARN_ON_ONCE(err == 0 && nf_flowtable_by_dev(dev) != ft);
> +
> +	return err;
> +}
> +
> +static void nf_flowtable_by_dev_remove(const struct net_device *dev)
> +{
> +	unsigned long key = (unsigned long)dev;
> +	struct flow_offload_xdp *cur;
> +	bool found = false;
> +
> +	mutex_lock(&nf_xdp_hashtable_lock);
> +
> +	hash_for_each_possible(nf_xdp_hashtable, cur, hnode, key) {
> +		if (key != cur->net_device_addr)
> +			continue;
> +
> +		hash_del_rcu(&cur->hnode);
> +		kfree_rcu(cur, rcuhead);
> +		found = true;
> +		break;
> +	}
> +
> +	mutex_unlock(&nf_xdp_hashtable_lock);
> +
> +	WARN_ON_ONCE(!found);
> +}
> +
>  struct flow_offload_work {
>  	struct list_head	list;
>  	enum flow_cls_command	cmd;
> @@ -1183,6 +1269,44 @@ static int nf_flow_table_offload_cmd(struct flow_block_offload *bo,
>  	return 0;
>  }
>  
> +static int nf_flow_offload_xdp_setup(struct nf_flowtable *flowtable,
> +				     struct net_device *dev,
> +				     enum flow_block_command cmd)
> +{
> +	if (!nf_flowtable_xdp_offload(flowtable))
> +		return 0;
> +
> +	switch (cmd) {
> +	case FLOW_BLOCK_BIND:
> +		return nf_flowtable_by_dev_insert(flowtable, dev);
> +	case FLOW_BLOCK_UNBIND:
> +		nf_flowtable_by_dev_remove(dev);
> +		return 0;
> +	}
> +
> +	WARN_ON_ONCE(1);
> +	return 0;
> +}
> +
> +static void nf_flow_offload_xdp_cancel(struct nf_flowtable *flowtable,
> +				       struct net_device *dev,
> +				       enum flow_block_command cmd)
> +{
> +	if (!nf_flowtable_xdp_offload(flowtable))
> +		return;
> +
> +	switch (cmd) {
> +	case FLOW_BLOCK_BIND:
> +		nf_flowtable_by_dev_remove(dev);
> +		return;
> +	case FLOW_BLOCK_UNBIND:
> +		/* We do not re-bind in case hw offload would report error
> +		 * on *unregister*.
> +		 */
> +		break;
> +	}
> +}
> +
>  int nf_flow_table_offload_setup(struct nf_flowtable *flowtable,
>  				struct net_device *dev,
>  				enum flow_block_command cmd)
> @@ -1191,6 +1315,9 @@ int nf_flow_table_offload_setup(struct nf_flowtable *flowtable,
>  	struct flow_block_offload bo;
>  	int err;
>  
> +	if (nf_flow_offload_xdp_setup(flowtable, dev, cmd))
> +		return -EBUSY;
> +
>  	if (!nf_flowtable_hw_offload(flowtable))
>  		return 0;
>  
> @@ -1200,8 +1327,10 @@ int nf_flow_table_offload_setup(struct nf_flowtable *flowtable,
>  	else
>  		err = nf_flow_table_indr_offload_cmd(&bo, flowtable, dev, cmd,
>  						     &extack);
> -	if (err < 0)
> +	if (err < 0) {
> +		nf_flow_offload_xdp_cancel(flowtable, dev, cmd);
>  		return err;
> +	}
>  
>  	return nf_flow_table_block_setup(flowtable, &bo, cmd);
>  }
> diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
> index 4e21311ec768..223ca4d0e2a5 100644
> --- a/net/netfilter/nf_tables_api.c
> +++ b/net/netfilter/nf_tables_api.c
> @@ -8198,7 +8198,8 @@ static bool nft_flowtable_offload_clash(struct net *net,
>  	const struct nft_table *table;
>  
>  	/* No offload requested, no need to validate */
> -	if (!nf_flowtable_hw_offload(flowtable->ft))
> +	if (!nf_flowtable_hw_offload(flowtable->ft) &&
> +	    !nf_flowtable_xdp_offload(flowtable->ft))
>  		return false;
>  
>  	nft_net = nft_pernet(net);
> -- 
> 2.41.0
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH nf-next 1/8] netfilter: flowtable: move nf_flowtable out of container structures
  2023-11-21 12:27 ` [PATCH nf-next 1/8] netfilter: flowtable: move nf_flowtable out of container structures Florian Westphal
@ 2023-11-23 13:52   ` Simon Horman
  2023-11-23 14:10     ` Florian Westphal
  0 siblings, 1 reply; 22+ messages in thread
From: Simon Horman @ 2023-11-23 13:52 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel, lorenzo, netdev

On Tue, Nov 21, 2023 at 01:27:44PM +0100, Florian Westphal wrote:
> struct nf_flowtable is currently wholly embedded in either nft_flowtable
> or tcf_ct_flow_table.
> 
> In order to allow flowtable acceleration via XDP, the XDP program will
> need to map struct net_device to struct nf_flowtable.
> 
> To make this work reliably, make a clear separation of the frontend
> (nft, tc) and backend (nf_flowtable) representation.
> 
> In this first patch, amke it so nft_flowtable and tcf_ct_flow_table
> only store pointers to an nf_flowtable structure.
> 
> The main goal is to have follow patches that allow us to keep the
> nf_flowtable structure around for a bit longer (e.g. until after
> an rcu grace period has elapesed) when the frontend(s) are tearing the
> structures down.
> 
> At this time, things are fine, but when xdp programs might be using
> the nf_flowtable structure as well we will need a way to ensure that
> no such users exist anymore.
> 
> Right now there is inufficient guarantee: nftables only ensures
> that the netfilter hooks are unregistered, and tc only ensures the
> tc actions have been removed.
> 
> Any future kfunc might still be called in parallel from an XDP
> program.  The easies way to resolve this is to let the nf_flowtable
> core handle release and module reference counting itself.
> 
> Signed-off-by: Florian Westphal <fw@strlen.de>

...

> @@ -312,24 +313,29 @@ static int tcf_ct_flow_table_get(struct net *net, struct tcf_ct_params *params)
>  	if (err)
>  		goto err_insert;
>  
> -	ct_ft->nf_ft.type = &flowtable_ct;
> -	ct_ft->nf_ft.flags |= NF_FLOWTABLE_HW_OFFLOAD |
> -			      NF_FLOWTABLE_COUNTER;
> -	err = nf_flow_table_init(&ct_ft->nf_ft);
> +	ct_ft->nf_ft = kzalloc(sizeof(*ct_ft->nf_ft), GFP_KERNEL);
> +	if (!ct_ft->nf_ft)
> +		goto err_alloc;

Hi Florian,

This branch will cause the function to return err, but err is 0 here.
Perhaps it should be set to a negative error value instead?

Flagged by Smatch.

> +
> +	ct_ft->nf_ft->type = &flowtable_ct;
> +	ct_ft->nf_ft->flags |= NF_FLOWTABLE_HW_OFFLOAD |
> +			       NF_FLOWTABLE_COUNTER;
> +	err = nf_flow_table_init(ct_ft->nf_ft);
>  	if (err)
>  		goto err_init;
> -	write_pnet(&ct_ft->nf_ft.net, net);
> +	write_pnet(&ct_ft->nf_ft->net, net);
>  
>  	__module_get(THIS_MODULE);
>  out_unlock:
>  	params->ct_ft = ct_ft;
> -	params->nf_ft = &ct_ft->nf_ft;
> +	params->nf_ft = ct_ft->nf_ft;
>  	mutex_unlock(&zones_mutex);
>  
>  	return 0;
>  
>  err_init:
>  	rhashtable_remove_fast(&zones_ht, &ct_ft->node, zones_params);
> +	kfree(ct_ft->nf_ft);
>  err_insert:
>  	kfree(ct_ft);
>  err_alloc:

...

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH nf-next 1/8] netfilter: flowtable: move nf_flowtable out of container structures
  2023-11-23 13:52   ` Simon Horman
@ 2023-11-23 14:10     ` Florian Westphal
  2023-11-25  8:26       ` Simon Horman
  0 siblings, 1 reply; 22+ messages in thread
From: Florian Westphal @ 2023-11-23 14:10 UTC (permalink / raw)
  To: Simon Horman; +Cc: Florian Westphal, netfilter-devel, lorenzo, netdev

Simon Horman <horms@kernel.org> wrote:
> > -	err = nf_flow_table_init(&ct_ft->nf_ft);
> > +	ct_ft->nf_ft = kzalloc(sizeof(*ct_ft->nf_ft), GFP_KERNEL);
> > +	if (!ct_ft->nf_ft)
> > +		goto err_alloc;
> 
> Hi Florian,
> 
> This branch will cause the function to return err, but err is 0 here.
> Perhaps it should be set to a negative error value instead?

Yes, this should fail with -ENOMEM.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct
  2023-11-21 12:27 [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct Florian Westphal
                   ` (7 preceding siblings ...)
  2023-11-21 12:27 ` [PATCH nf-next 8/8] netfilter: nf_tables: permit duplicate flowtable mappings Florian Westphal
@ 2023-11-24  9:50 ` Pablo Neira Ayuso
  2023-11-24  9:55   ` Florian Westphal
  2023-11-24 10:48   ` Toke Høiland-Jørgensen
  8 siblings, 2 replies; 22+ messages in thread
From: Pablo Neira Ayuso @ 2023-11-24  9:50 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel, lorenzo, netdev

Hi Florian,

Sorry for taking a long while.

On Tue, Nov 21, 2023 at 01:27:43PM +0100, Florian Westphal wrote:
> This series detaches nf_flowtable from the two existing container
> structures.
> 
> Allocation and freeing is moved to the flowtable core.
> Then, memory release is changed so it passes through another
> synchronize_rcu() call.
> 
> Next, a new nftables flowtable flag is introduced to mark a flowtable
> for explicit XDP-based offload.

If XDP uses the hardware offload infrastructure, then I don't see how
would it be possible to combine a software dataplane with hardware
offload, ie. assuming XDP for software acceleration and hardware
offload, because it takes a while for the flowtable hw offload
workqueue to set up things and meanwhile that happens, the software
path is exercised.

> Such flowtables have more restrictions, in particular, if two
> flowtables are tagged as 'xdp offloaded', they cannot share any net
> devices.
> 
> It would be possible to avoid such new 'xdp flag', but I see no way
> to do so without breaking backwards compatbility: at this time the same
> net_device can be part of any number of flowtables, this is very
> inefficient from an XDP point of view: it would have to perform lookups
> in all associated flowtables in a loop until a match is found.
> 
> This is hardly desirable.
> 
> Last two patches expose the hash table mapping and make utility
> function available for XDP.
> 
> The XDP kfunc will be added in a followup patch.

What is the plan to support for stackable device? eg. VLAN, or even
tunneling drivers such as VxLAN. I have (incomplete) patches to use
dev_fill_forward_path() to discover the path then configure the
flowtable datapath forwarding.

My understand is that XDP is all about programmibility, if user
decides to go for XDP then simply fully implement the fast path is the
XDP framework? I know of software already does so and they are
perfectly fine with this approach.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct
  2023-11-24  9:50 ` [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct Pablo Neira Ayuso
@ 2023-11-24  9:55   ` Florian Westphal
  2023-11-24 10:10     ` Pablo Neira Ayuso
  2023-11-24 10:48   ` Toke Høiland-Jørgensen
  1 sibling, 1 reply; 22+ messages in thread
From: Florian Westphal @ 2023-11-24  9:55 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel, lorenzo, netdev

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > Next, a new nftables flowtable flag is introduced to mark a flowtable
> > for explicit XDP-based offload.
> 
> If XDP uses the hardware offload infrastructure, then I don't see how
> would it be possible to combine a software dataplane with hardware
> offload, ie. assuming XDP for software acceleration and hardware
> offload, because it takes a while for the flowtable hw offload
> workqueue to set up things and meanwhile that happens, the software
> path is exercised.

Lorenzo adds a kfunc that gets called from the xdp program
to do a lookup in the flowtable.

This patchset prepares for the kfunc by adding a function that
returns the flowtable based on net_device pointer.

The work queue for hw offload (or ndo ops) are not used.

> > The XDP kfunc will be added in a followup patch.
> 
> What is the plan to support for stackable device? eg. VLAN, or even
> tunneling drivers such as VxLAN. I have (incomplete) patches to use
> dev_fill_forward_path() to discover the path then configure the
> flowtable datapath forwarding.

If the xdp program can't handle it packet will be pushed up the stack,
i.e. nf ingress hook will handle it next.

> My understand is that XDP is all about programmibility, if user
> decides to go for XDP then simply fully implement the fast path is the
> XDP framework? I know of software already does so and they are
> perfectly fine with this approach.

I don't understand, you mean no integration at all?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct
  2023-11-24  9:55   ` Florian Westphal
@ 2023-11-24 10:10     ` Pablo Neira Ayuso
  2023-11-24 10:16       ` Florian Westphal
  0 siblings, 1 reply; 22+ messages in thread
From: Pablo Neira Ayuso @ 2023-11-24 10:10 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel, lorenzo, netdev

On Fri, Nov 24, 2023 at 10:55:12AM +0100, Florian Westphal wrote:
> Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > > Next, a new nftables flowtable flag is introduced to mark a flowtable
> > > for explicit XDP-based offload.
> > 
> > If XDP uses the hardware offload infrastructure, then I don't see how
> > would it be possible to combine a software dataplane with hardware
> > offload, ie. assuming XDP for software acceleration and hardware
> > offload, because it takes a while for the flowtable hw offload
> > workqueue to set up things and meanwhile that happens, the software
> > path is exercised.
> 
> Lorenzo adds a kfunc that gets called from the xdp program
> to do a lookup in the flowtable.
> 
> This patchset prepares for the kfunc by adding a function that
> returns the flowtable based on net_device pointer.
> 
> The work queue for hw offload (or ndo ops) are not used.

OK, but is it possible to combine this XDP approach with hardware
offload?

> > > The XDP kfunc will be added in a followup patch.
> > 
> > What is the plan to support for stackable device? eg. VLAN, or even
> > tunneling drivers such as VxLAN. I have (incomplete) patches to use
> > dev_fill_forward_path() to discover the path then configure the
> > flowtable datapath forwarding.
> 
> If the xdp program can't handle it packet will be pushed up the stack,
> i.e. nf ingress hook will handle it next.

Then, only very simple scenarios will benefit from this acceleration.

> > My understand is that XDP is all about programmibility, if user
> > decides to go for XDP then simply fully implement the fast path is the
> > XDP framework? I know of software already does so and they are
> > perfectly fine with this approach.
> 
> I don't understand, you mean no integration at all?

I mean, fully implement a fastpath in XDP/BPF using the datastructures
that it provides.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct
  2023-11-24 10:10     ` Pablo Neira Ayuso
@ 2023-11-24 10:16       ` Florian Westphal
  0 siblings, 0 replies; 22+ messages in thread
From: Florian Westphal @ 2023-11-24 10:16 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: Florian Westphal, netfilter-devel, lorenzo, netdev

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > The work queue for hw offload (or ndo ops) are not used.
> 
> OK, but is it possible to combine this XDP approach with hardware
> offload?

Yes.  We could disallow it if you prefer.

Ordering is, for ingress packet processing:
HW -> XDP -> nf flowtable -> classic forward path

instead of:

HW -> nf flowtable -> classic forward path

For the existing design.

> > If the xdp program can't handle it packet will be pushed up the stack,
> > i.e. nf ingress hook will handle it next.
> 
> Then, only very simple scenarios will benefit from this acceleration.

Yes.  I don't see a reason to worry about more complex things right now.
E.g. PPPoE encap can be added later.

Or do you think this has to be added right from the very beginning?

I hope not.

> > > My understand is that XDP is all about programmibility, if user
> > > decides to go for XDP then simply fully implement the fast path is the
> > > XDP framework? I know of software already does so and they are
> > > perfectly fine with this approach.
> > 
> > I don't understand, you mean no integration at all?
> 
> I mean, fully implement a fastpath in XDP/BPF using the datastructures
> that it provides.

I think its very bad for netfilter.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct
  2023-11-24  9:50 ` [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct Pablo Neira Ayuso
  2023-11-24  9:55   ` Florian Westphal
@ 2023-11-24 10:48   ` Toke Høiland-Jørgensen
  1 sibling, 0 replies; 22+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-11-24 10:48 UTC (permalink / raw)
  To: Pablo Neira Ayuso, Florian Westphal; +Cc: netfilter-devel, lorenzo, netdev

> My understand is that XDP is all about programmibility, if user
> decides to go for XDP then simply fully implement the fast path is the
> XDP framework? I know of software already does so and they are
> perfectly fine with this approach.

Yes, you can do that. But if you're reimplementing everything anyway,
why bother with XDP at all? Might as well go with DPDK and full bypass
then.

The benefit of XDP as a data path is the integration with the kernel
infrastructure: we have robust implementations of a bunch of protocols,
a control plane API that works with a bunch of userspace utilities
(e.g., routing daemons), and lots of data battle-tested data structures
for various things (e.g., the routing table fib). With XDP, you can use
this infrastructure in a pick-and-choose manner and implement your fast
path using just the features you care about for your use case, gaining
performance while still using the kernel path for the slow path to get
full functionality.

The first example of this paradigm was the bpf_fib_lookup() helper. With
this you can accelerate the forwarding fast path and still have the
kernel stack handle neighbour lookup, etc. Adding flowtable lookup
support is a natural extension of this, adding another integration point
you can use for a more complete forwarding acceleration, while still
integrating with the rest of the stack.

This was the "making XDP a magical go faster button" thing I was talking
about at Netconf (and again at Netdevconf), BTW: we should work towards
making XDP a complete (forwarding) acceleration solution, so we can
replace all the crappy hardware "fast path" and kernel bypass
implementations in the world :)

-Toke

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH nf-next 7/8] netfilter: nf_tables: add flowtable map for xdp offload
  2023-11-21 12:27 ` [PATCH nf-next 7/8] netfilter: nf_tables: add flowtable map for xdp offload Florian Westphal
  2023-11-21 14:25   ` Lorenzo Bianconi
@ 2023-11-24 10:59   ` Toke Høiland-Jørgensen
  2023-11-30 13:53     ` Florian Westphal
  1 sibling, 1 reply; 22+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-11-24 10:59 UTC (permalink / raw)
  To: Florian Westphal, netfilter-devel; +Cc: lorenzo, netdev, Florian Westphal

Florian Westphal <fw@strlen.de> writes:

> A device cannot be added to multiple flowtables, the mapping needs
> to be unique.  This is enforced when a flowtables with the
> NF_FLOWTABLE_XDP_OFFLOAD was added.
>
> Exposure of this NF_FLOWTABLE_XDP_OFFLOAD in UAPI could be avoided,
> iff the 'net_device maps to 0 or 1 flowtable' paradigm is enforced
> regardless of offload-or-not flag.
>
> HOWEVER, that does break existing behaviour.

I am not a huge fan of this flag, especially not as UAPI. Using the XDP
offload functionality is already an explicit opt-in by userspace (you
need to load the XDP program). So adding a second UAPI flag that you
have to set for the flowtable to be compatible with XDP seems to just
constrain things needlessly (and is bound to lead to bugs)?

If we can't change the behaviour, we could change the lookup mechanism?
BPF is pretty flexible, nothing says it has to use an ifindex as the
lookup key? The neatest thing would be to have some way for userspace to
directly populate a reference to the flowtable struct in a map, but a
simpler solution would be to just introduce an opaque ID for each
flowtable instance and use that as the lookup key (userspace could
trivially put that into a map for the BPF program to find)?

-Toke

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH nf-next 1/8] netfilter: flowtable: move nf_flowtable out of container structures
  2023-11-23 14:10     ` Florian Westphal
@ 2023-11-25  8:26       ` Simon Horman
  2023-11-25  8:36         ` Simon Horman
  0 siblings, 1 reply; 22+ messages in thread
From: Simon Horman @ 2023-11-25  8:26 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel, lorenzo, netdev

On Thu, Nov 23, 2023 at 03:10:51PM +0100, Florian Westphal wrote:
> Simon Horman <horms@kernel.org> wrote:
> > > -	err = nf_flow_table_init(&ct_ft->nf_ft);
> > > +	ct_ft->nf_ft = kzalloc(sizeof(*ct_ft->nf_ft), GFP_KERNEL);
> > > +	if (!ct_ft->nf_ft)
> > > +		goto err_alloc;
> > 
> > Hi Florian,
> > 
> > This branch will cause the function to return err, but err is 0 here.
> > Perhaps it should be set to a negative error value instead?
> 
> Yes, this should fail with -ENOMEM.

Thanks, I will send a patch.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH nf-next 1/8] netfilter: flowtable: move nf_flowtable out of container structures
  2023-11-25  8:26       ` Simon Horman
@ 2023-11-25  8:36         ` Simon Horman
  0 siblings, 0 replies; 22+ messages in thread
From: Simon Horman @ 2023-11-25  8:36 UTC (permalink / raw)
  To: Florian Westphal; +Cc: netfilter-devel, lorenzo, netdev

On Sat, Nov 25, 2023 at 08:26:01AM +0000, Simon Horman wrote:
> On Thu, Nov 23, 2023 at 03:10:51PM +0100, Florian Westphal wrote:
> > Simon Horman <horms@kernel.org> wrote:
> > > > -	err = nf_flow_table_init(&ct_ft->nf_ft);
> > > > +	ct_ft->nf_ft = kzalloc(sizeof(*ct_ft->nf_ft), GFP_KERNEL);
> > > > +	if (!ct_ft->nf_ft)
> > > > +		goto err_alloc;
> > > 
> > > Hi Florian,
> > > 
> > > This branch will cause the function to return err, but err is 0 here.
> > > Perhaps it should be set to a negative error value instead?
> > 
> > Yes, this should fail with -ENOMEM.
> 
> Thanks, I will send a patch.

Ooops, for some reason I thought this had been accepted, but I don't see it
in nf-next.  So I guess I don't need to send a patch for now.

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH nf-next 7/8] netfilter: nf_tables: add flowtable map for xdp offload
  2023-11-24 10:59   ` Toke Høiland-Jørgensen
@ 2023-11-30 13:53     ` Florian Westphal
  2023-11-30 14:17       ` Toke Høiland-Jørgensen
  0 siblings, 1 reply; 22+ messages in thread
From: Florian Westphal @ 2023-11-30 13:53 UTC (permalink / raw)
  To: Toke Høiland-Jørgensen
  Cc: Florian Westphal, netfilter-devel, lorenzo, netdev

Toke Høiland-Jørgensen <toke@toke.dk> wrote:
> I am not a huge fan of this flag, especially not as UAPI. Using the XDP
> offload functionality is already an explicit opt-in by userspace (you
> need to load the XDP program). So adding a second UAPI flag that you
> have to set for the flowtable to be compatible with XDP seems to just
> constrain things needlessly (and is bound to lead to bugs)?

I can remove it.  But it leads to issues, for example one flowtable
can shadow another one.

I'd prefer to handle this from control plane and reject such config.
Alternative is to ignore this and handle it as "self sabotage, don't
care" combined with "do not do that, then".

> If we can't change the behaviour, we could change the lookup mechanism?
> BPF is pretty flexible, nothing says it has to use an ifindex as the
> lookup key? The neatest thing would be to have some way for userspace to
> directly populate a reference to the flowtable struct in a map, but a
> simpler solution would be to just introduce an opaque ID for each
> flowtable instance and use that as the lookup key (userspace could
> trivially put that into a map for the BPF program to find)?

Won't that complicate things?  Userspace will have to use netlink
events to discover when a flowtable is removed, no?

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH nf-next 7/8] netfilter: nf_tables: add flowtable map for xdp offload
  2023-11-30 13:53     ` Florian Westphal
@ 2023-11-30 14:17       ` Toke Høiland-Jørgensen
  0 siblings, 0 replies; 22+ messages in thread
From: Toke Høiland-Jørgensen @ 2023-11-30 14:17 UTC (permalink / raw)
  To: Florian Westphal; +Cc: Florian Westphal, netfilter-devel, lorenzo, netdev

Florian Westphal <fw@strlen.de> writes:

> Toke Høiland-Jørgensen <toke@toke.dk> wrote:
>> I am not a huge fan of this flag, especially not as UAPI. Using the XDP
>> offload functionality is already an explicit opt-in by userspace (you
>> need to load the XDP program). So adding a second UAPI flag that you
>> have to set for the flowtable to be compatible with XDP seems to just
>> constrain things needlessly (and is bound to lead to bugs)?
>
> I can remove it.  But it leads to issues, for example one flowtable
> can shadow another one.
>
> I'd prefer to handle this from control plane and reject such config.
> Alternative is to ignore this and handle it as "self sabotage, don't
> care" combined with "do not do that, then".

I do see your point about avoiding invalid configurations, but, well XDP
is already very much a "use it right or it will break on you" kind of
thing, so I think that bit is kinda unavoidable. As in, upon loading the
XDP program that does the lookup, you can validate the configuration and
reject loading if it's setup in a way that your program can support.
Whereas if you have to set a flag on the flowtable itself, that means
you have to make changes to the nft ruleset itself to be compatible with
XDP acceleration (right?), you can't just go "accelerate my existing
ruleset".

>> If we can't change the behaviour, we could change the lookup mechanism?
>> BPF is pretty flexible, nothing says it has to use an ifindex as the
>> lookup key? The neatest thing would be to have some way for userspace to
>> directly populate a reference to the flowtable struct in a map, but a
>> simpler solution would be to just introduce an opaque ID for each
>> flowtable instance and use that as the lookup key (userspace could
>> trivially put that into a map for the BPF program to find)?
>
> Won't that complicate things?  Userspace will have to use netlink
> events to discover when a flowtable is removed, no?

Well, I am kinda assuming that userspace is the entity doing the
removing, in which case it should already know this, right? I must admit
to being a little fuzzy on the details of when a flowtable object is
replaced, though. For instance, does reloading an nft ruleset always
replace the flowtable with a new one (even if there's no change to the
flowtable config itself)?

-Toke

^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2023-11-30 14:17 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-21 12:27 [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct Florian Westphal
2023-11-21 12:27 ` [PATCH nf-next 1/8] netfilter: flowtable: move nf_flowtable out of container structures Florian Westphal
2023-11-23 13:52   ` Simon Horman
2023-11-23 14:10     ` Florian Westphal
2023-11-25  8:26       ` Simon Horman
2023-11-25  8:36         ` Simon Horman
2023-11-21 12:27 ` [PATCH nf-next 2/8] netfilter: nf_flowtable: replace init callback with a create one Florian Westphal
2023-11-21 12:27 ` [PATCH nf-next 3/8] netfilter: nf_flowtable: make free a real free function Florian Westphal
2023-11-21 12:27 ` [PATCH nf-next 4/8] netfilter: nf_flowtable: delay flowtable release a second time Florian Westphal
2023-11-21 12:27 ` [PATCH nf-next 5/8] netfilter: nf_tables: reject flowtable hw offload for same device Florian Westphal
2023-11-21 12:27 ` [PATCH nf-next 6/8] netfilter: nf_tables: add xdp offload flag Florian Westphal
2023-11-21 12:27 ` [PATCH nf-next 7/8] netfilter: nf_tables: add flowtable map for xdp offload Florian Westphal
2023-11-21 14:25   ` Lorenzo Bianconi
2023-11-24 10:59   ` Toke Høiland-Jørgensen
2023-11-30 13:53     ` Florian Westphal
2023-11-30 14:17       ` Toke Høiland-Jørgensen
2023-11-21 12:27 ` [PATCH nf-next 8/8] netfilter: nf_tables: permit duplicate flowtable mappings Florian Westphal
2023-11-24  9:50 ` [PATCH nf-next 0/8] netfilter: make nf_flowtable lifetime differ from container struct Pablo Neira Ayuso
2023-11-24  9:55   ` Florian Westphal
2023-11-24 10:10     ` Pablo Neira Ayuso
2023-11-24 10:16       ` Florian Westphal
2023-11-24 10:48   ` Toke Høiland-Jørgensen

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.