All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] netfilter: nf_tables: set transactions
@ 2015-03-25 14:08 Patrick McHardy
  2015-03-25 14:08 ` [PATCH 1/4] netfilter: nf_tables: consolide set element destruction Patrick McHardy
                   ` (4 more replies)
  0 siblings, 5 replies; 9+ messages in thread
From: Patrick McHardy @ 2015-03-25 14:08 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel

These patches add transaction support to set elements. It pretty much
resembles the existing transaction support, the changelogs explain it
in more detail.

As a by-product the patches get rid of the RCU grace period for every
nft_hash element that is destroyed, we now have only the single global
transaction grace perdiod independantly of what we destroy.

Please apply, thanks!


Patrick McHardy (4):
  netfilter: nf_tables: consolide set element destruction
  netfilter: nf_tables: return set extensions from ->lookup()
  netfilter: nf_tables: add transaction helper functions
  netfilter: nf_tables: implement set transaction support

 include/net/netfilter/nf_tables.h | 67 +++++++++++++++++++++++++++----
 net/netfilter/nf_tables_api.c     | 84 +++++++++++++++++++++------------------
 net/netfilter/nf_tables_core.c    |  6 +--
 net/netfilter/nft_hash.c          | 62 ++++++++++++++++-------------
 net/netfilter/nft_lookup.c        |  6 ++-
 net/netfilter/nft_rbtree.c        | 69 +++++++++++++++++++-------------
 6 files changed, 188 insertions(+), 106 deletions(-)

-- 
2.1.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/4] netfilter: nf_tables: consolide set element destruction
  2015-03-25 14:08 [PATCH 0/4] netfilter: nf_tables: set transactions Patrick McHardy
@ 2015-03-25 14:08 ` Patrick McHardy
  2015-03-25 14:08 ` [PATCH 2/4] netfilter: nf_tables: return set extensions from ->lookup() Patrick McHardy
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Patrick McHardy @ 2015-03-25 14:08 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel

With the conversion to set extensions, it is now possible to consolidate
the different set element destruction functions.

The set implementations' ->remove() functions are changed to only take
the element out of their internal data structures. Elements will be freed
in a batched fashion after the global transaction's completion RCU grace
period.

This reduces the amount of grace periods required for nft_hash from N
to zero additional ones, additionally this guarantees that the set
elements' extensions of all implementations can be used under RCU
protection.

Signed-off-by: Patrick McHardy <kaber@trash.net>
---
 include/net/netfilter/nf_tables.h |  2 ++
 net/netfilter/nf_tables_api.c     | 34 ++++++++++++++++++++--------------
 net/netfilter/nft_hash.c          | 18 ++++--------------
 net/netfilter/nft_rbtree.c        | 14 +-------------
 4 files changed, 27 insertions(+), 41 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index ef3457c..6ac6332 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -423,6 +423,8 @@ static inline struct nft_set_ext *nft_set_elem_ext(const struct nft_set *set,
 	return elem + set->ops->elemsize;
 }
 
+void nft_set_elem_destroy(const struct nft_set *set, void *elem);
+
 /**
  *	struct nft_expr_type - nf_tables expression type
  *
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 99cb884..b35512f 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -3155,6 +3155,18 @@ static void *nft_set_elem_init(const struct nft_set *set,
 	return elem;
 }
 
+void nft_set_elem_destroy(const struct nft_set *set, void *elem)
+{
+	struct nft_set_ext *ext = nft_set_elem_ext(set, elem);
+
+	nft_data_uninit(nft_set_ext_key(ext), NFT_DATA_VALUE);
+	if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA))
+		nft_data_uninit(nft_set_ext_data(ext), set->dtype);
+
+	kfree(elem);
+}
+EXPORT_SYMBOL_GPL(nft_set_elem_destroy);
+
 static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 			    const struct nlattr *attr)
 {
@@ -3596,6 +3608,10 @@ static void nf_tables_commit_release(struct nft_trans *trans)
 	case NFT_MSG_DELSET:
 		nft_set_destroy(nft_trans_set(trans));
 		break;
+	case NFT_MSG_DELSETELEM:
+		nft_set_elem_destroy(nft_trans_elem_set(trans),
+				     nft_trans_elem(trans).priv);
+		break;
 	}
 	kfree(trans);
 }
@@ -3605,7 +3621,6 @@ static int nf_tables_commit(struct sk_buff *skb)
 	struct net *net = sock_net(skb->sk);
 	struct nft_trans *trans, *next;
 	struct nft_trans_elem *te;
-	struct nft_set_ext *ext;
 
 	/* Bump generation counter, invalidate any dump in progress */
 	while (++net->nft.base_seq == 0);
@@ -3690,18 +3705,12 @@ static int nf_tables_commit(struct sk_buff *skb)
 			break;
 		case NFT_MSG_DELSETELEM:
 			te = (struct nft_trans_elem *)trans->data;
-			ext = nft_set_elem_ext(te->set, te->elem.priv);
 
 			nf_tables_setelem_notify(&trans->ctx, te->set,
 						 &te->elem,
 						 NFT_MSG_DELSETELEM, 0);
 			te->set->ops->get(te->set, &te->elem);
-			nft_data_uninit(&te->elem.key, NFT_DATA_VALUE);
-			if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA))
-				nft_data_uninit(nft_set_ext_data(ext),
-						te->set->dtype);
 			te->set->ops->remove(te->set, &te->elem);
-			nft_trans_destroy(trans);
 			break;
 		}
 	}
@@ -3733,6 +3742,10 @@ static void nf_tables_abort_release(struct nft_trans *trans)
 	case NFT_MSG_NEWSET:
 		nft_set_destroy(nft_trans_set(trans));
 		break;
+	case NFT_MSG_NEWSETELEM:
+		nft_set_elem_destroy(nft_trans_elem_set(trans),
+				     nft_trans_elem(trans).priv);
+		break;
 	}
 	kfree(trans);
 }
@@ -3742,7 +3755,6 @@ static int nf_tables_abort(struct sk_buff *skb)
 	struct net *net = sock_net(skb->sk);
 	struct nft_trans *trans, *next;
 	struct nft_trans_elem *te;
-	struct nft_set_ext *ext;
 
 	list_for_each_entry_safe(trans, next, &net->nft.commit_list, list) {
 		switch (trans->msg_type) {
@@ -3804,15 +3816,9 @@ static int nf_tables_abort(struct sk_buff *skb)
 		case NFT_MSG_NEWSETELEM:
 			nft_trans_elem_set(trans)->nelems--;
 			te = (struct nft_trans_elem *)trans->data;
-			ext = nft_set_elem_ext(te->set, te->elem.priv);
 
 			te->set->ops->get(te->set, &te->elem);
-			nft_data_uninit(&te->elem.key, NFT_DATA_VALUE);
-			if (nft_set_ext_exists(ext, NFT_SET_EXT_DATA))
-				nft_data_uninit(nft_set_ext_data(ext),
-						te->set->dtype);
 			te->set->ops->remove(te->set, &te->elem);
-			nft_trans_destroy(trans);
 			break;
 		case NFT_MSG_DELSETELEM:
 			nft_trans_elem_set(trans)->nelems++;
diff --git a/net/netfilter/nft_hash.c b/net/netfilter/nft_hash.c
index 15951a8..94bf25d 100644
--- a/net/netfilter/nft_hash.c
+++ b/net/netfilter/nft_hash.c
@@ -96,23 +96,12 @@ static int nft_hash_insert(const struct nft_set *set,
 					    nft_hash_params);
 }
 
-static void nft_hash_elem_destroy(const struct nft_set *set,
-				  struct nft_hash_elem *he)
-{
-	nft_data_uninit(nft_set_ext_key(&he->ext), NFT_DATA_VALUE);
-	if (set->flags & NFT_SET_MAP)
-		nft_data_uninit(nft_set_ext_data(&he->ext), set->dtype);
-	kfree(he);
-}
-
 static void nft_hash_remove(const struct nft_set *set,
 			    const struct nft_set_elem *elem)
 {
 	struct nft_hash *priv = nft_set_priv(set);
 
 	rhashtable_remove_fast(&priv->ht, elem->cookie, nft_hash_params);
-	synchronize_rcu();
-	kfree(elem->cookie);
 }
 
 static int nft_hash_get(const struct nft_set *set, struct nft_set_elem *elem)
@@ -208,16 +197,17 @@ static int nft_hash_init(const struct nft_set *set,
 	return rhashtable_init(&priv->ht, &params);
 }
 
-static void nft_free_element(void *ptr, void *arg)
+static void nft_hash_elem_destroy(void *ptr, void *arg)
 {
-	nft_hash_elem_destroy((const struct nft_set *)arg, ptr);
+	nft_set_elem_destroy((const struct nft_set *)arg, ptr);
 }
 
 static void nft_hash_destroy(const struct nft_set *set)
 {
 	struct nft_hash *priv = nft_set_priv(set);
 
-	rhashtable_free_and_destroy(&priv->ht, nft_free_element, (void *)set);
+	rhashtable_free_and_destroy(&priv->ht, nft_hash_elem_destroy,
+				    (void *)set);
 }
 
 static bool nft_hash_estimate(const struct nft_set_desc *desc, u32 features,
diff --git a/net/netfilter/nft_rbtree.c b/net/netfilter/nft_rbtree.c
index ebf6e60..332c6af 100644
--- a/net/netfilter/nft_rbtree.c
+++ b/net/netfilter/nft_rbtree.c
@@ -72,17 +72,6 @@ out:
 	return false;
 }
 
-static void nft_rbtree_elem_destroy(const struct nft_set *set,
-				    struct nft_rbtree_elem *rbe)
-{
-	nft_data_uninit(nft_set_ext_key(&rbe->ext), NFT_DATA_VALUE);
-	if (set->flags & NFT_SET_MAP &&
-	    nft_set_ext_exists(&rbe->ext, NFT_SET_EXT_DATA))
-		nft_data_uninit(nft_set_ext_data(&rbe->ext), set->dtype);
-
-	kfree(rbe);
-}
-
 static int __nft_rbtree_insert(const struct nft_set *set,
 			       struct nft_rbtree_elem *new)
 {
@@ -133,7 +122,6 @@ static void nft_rbtree_remove(const struct nft_set *set,
 	spin_lock_bh(&nft_rbtree_lock);
 	rb_erase(&rbe->node, &priv->root);
 	spin_unlock_bh(&nft_rbtree_lock);
-	kfree(rbe);
 }
 
 static int nft_rbtree_get(const struct nft_set *set, struct nft_set_elem *elem)
@@ -213,7 +201,7 @@ static void nft_rbtree_destroy(const struct nft_set *set)
 	while ((node = priv->root.rb_node) != NULL) {
 		rb_erase(node, &priv->root);
 		rbe = rb_entry(node, struct nft_rbtree_elem, node);
-		nft_rbtree_elem_destroy(set, rbe);
+		nft_set_elem_destroy(set, rbe);
 	}
 }
 
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/4] netfilter: nf_tables: return set extensions from ->lookup()
  2015-03-25 14:08 [PATCH 0/4] netfilter: nf_tables: set transactions Patrick McHardy
  2015-03-25 14:08 ` [PATCH 1/4] netfilter: nf_tables: consolide set element destruction Patrick McHardy
@ 2015-03-25 14:08 ` Patrick McHardy
  2015-03-25 14:08 ` [PATCH 3/4] netfilter: nf_tables: add transaction helper functions Patrick McHardy
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 9+ messages in thread
From: Patrick McHardy @ 2015-03-25 14:08 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel

Return the extension area from the ->lookup() function to allow to
consolidate common actions.

Signed-off-by: Patrick McHardy <kaber@trash.net>
---
 include/net/netfilter/nf_tables.h | 4 +++-
 net/netfilter/nft_hash.c          | 6 +++---
 net/netfilter/nft_lookup.c        | 6 +++++-
 net/netfilter/nft_rbtree.c        | 7 +++----
 4 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 6ac6332..f190d26 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -200,6 +200,8 @@ struct nft_set_estimate {
 	enum nft_set_class	class;
 };
 
+struct nft_set_ext;
+
 /**
  *	struct nft_set_ops - nf_tables set operations
  *
@@ -218,7 +220,7 @@ struct nft_set_estimate {
 struct nft_set_ops {
 	bool				(*lookup)(const struct nft_set *set,
 						  const struct nft_data *key,
-						  struct nft_data *data);
+						  const struct nft_set_ext **ext);
 	int				(*get)(const struct nft_set *set,
 					       struct nft_set_elem *elem);
 	int				(*insert)(const struct nft_set *set,
diff --git a/net/netfilter/nft_hash.c b/net/netfilter/nft_hash.c
index 94bf25d..5bee821 100644
--- a/net/netfilter/nft_hash.c
+++ b/net/netfilter/nft_hash.c
@@ -66,7 +66,7 @@ static inline int nft_hash_cmp(struct rhashtable_compare_arg *arg,
 
 static bool nft_hash_lookup(const struct nft_set *set,
 			    const struct nft_data *key,
-			    struct nft_data *data)
+			    const struct nft_set_ext **ext)
 {
 	struct nft_hash *priv = nft_set_priv(set);
 	const struct nft_hash_elem *he;
@@ -76,8 +76,8 @@ static bool nft_hash_lookup(const struct nft_set *set,
 	};
 
 	he = rhashtable_lookup_fast(&priv->ht, &arg, nft_hash_params);
-	if (he && set->flags & NFT_SET_MAP)
-		nft_data_copy(data, nft_set_ext_data(&he->ext));
+	if (he != NULL)
+		*ext = &he->ext;
 
 	return !!he;
 }
diff --git a/net/netfilter/nft_lookup.c b/net/netfilter/nft_lookup.c
index 9615b8b..a5f30b8 100644
--- a/net/netfilter/nft_lookup.c
+++ b/net/netfilter/nft_lookup.c
@@ -31,9 +31,13 @@ static void nft_lookup_eval(const struct nft_expr *expr,
 {
 	const struct nft_lookup *priv = nft_expr_priv(expr);
 	const struct nft_set *set = priv->set;
+	const struct nft_set_ext *ext;
 
-	if (set->ops->lookup(set, &data[priv->sreg], &data[priv->dreg]))
+	if (set->ops->lookup(set, &data[priv->sreg], &ext)) {
+		if (set->flags & NFT_SET_MAP)
+			nft_data_copy(&data[priv->dreg], nft_set_ext_data(ext));
 		return;
+	}
 	data[NFT_REG_VERDICT].verdict = NFT_BREAK;
 }
 
diff --git a/net/netfilter/nft_rbtree.c b/net/netfilter/nft_rbtree.c
index 332c6af..cbba755 100644
--- a/net/netfilter/nft_rbtree.c
+++ b/net/netfilter/nft_rbtree.c
@@ -31,7 +31,7 @@ struct nft_rbtree_elem {
 
 static bool nft_rbtree_lookup(const struct nft_set *set,
 			      const struct nft_data *key,
-			      struct nft_data *data)
+			      const struct nft_set_ext **ext)
 {
 	const struct nft_rbtree *priv = nft_set_priv(set);
 	const struct nft_rbtree_elem *rbe, *interval = NULL;
@@ -55,10 +55,9 @@ found:
 			    *nft_set_ext_flags(&rbe->ext) &
 			    NFT_SET_ELEM_INTERVAL_END)
 				goto out;
-			if (set->flags & NFT_SET_MAP)
-				nft_data_copy(data, nft_set_ext_data(&rbe->ext));
-
 			spin_unlock_bh(&nft_rbtree_lock);
+
+			*ext = &rbe->ext;
 			return true;
 		}
 	}
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/4] netfilter: nf_tables: add transaction helper functions
  2015-03-25 14:08 [PATCH 0/4] netfilter: nf_tables: set transactions Patrick McHardy
  2015-03-25 14:08 ` [PATCH 1/4] netfilter: nf_tables: consolide set element destruction Patrick McHardy
  2015-03-25 14:08 ` [PATCH 2/4] netfilter: nf_tables: return set extensions from ->lookup() Patrick McHardy
@ 2015-03-25 14:08 ` Patrick McHardy
  2015-03-25 14:08 ` [PATCH 4/4] netfilter: nf_tables: implement set transaction support Patrick McHardy
  2015-03-26 11:09 ` [PATCH 0/4] netfilter: nf_tables: set transactions Pablo Neira Ayuso
  4 siblings, 0 replies; 9+ messages in thread
From: Patrick McHardy @ 2015-03-25 14:08 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel

Add some helper functions for building the genmask as preparation for
set transactions.

Also add a little documentation how this stuff actually works.

Signed-off-by: Patrick McHardy <kaber@trash.net>
---
 include/net/netfilter/nf_tables.h | 28 ++++++++++++++++++++++++++++
 net/netfilter/nf_tables_api.c     | 17 ++++++-----------
 net/netfilter/nf_tables_core.c    |  6 +-----
 3 files changed, 35 insertions(+), 16 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index f190d26..4c46a32 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -720,6 +720,34 @@ void nft_unregister_expr(struct nft_expr_type *);
 #define MODULE_ALIAS_NFT_SET() \
 	MODULE_ALIAS("nft-set")
 
+/*
+ * The gencursor defines two generations, the currently active and the
+ * next one. Objects contain a bitmask of 2 bits specifying the generations
+ * they're active in. A set bit means they're inactive in the generation
+ * represented by that bit.
+ *
+ * New objects start out as inactive in the current and active in the
+ * next generation. When committing the ruleset the bitmask is cleared,
+ * meaning they're active in all generations. When removing an object,
+ * it is set inactive in the next generation. After committing the ruleset,
+ * the objects are removed.
+ */
+static inline unsigned int nft_gencursor_next(const struct net *net)
+{
+	return net->nft.gencursor + 1 == 1 ? 1 : 0;
+}
+
+static inline u8 nft_genmask_next(const struct net *net)
+{
+	return 1 << nft_gencursor_next(net);
+}
+
+static inline u8 nft_genmask_cur(const struct net *net)
+{
+	/* Use ACCESS_ONCE() to prevent refetching the value for atomicity */
+	return 1 << ACCESS_ONCE(net->nft.gencursor);
+}
+
 /**
  *	struct nft_trans - nf_tables object update in transaction
  *
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index b35512f..66fa5e9 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -198,36 +198,31 @@ static int nft_delchain(struct nft_ctx *ctx)
 static inline bool
 nft_rule_is_active(struct net *net, const struct nft_rule *rule)
 {
-	return (rule->genmask & (1 << net->nft.gencursor)) == 0;
-}
-
-static inline int gencursor_next(struct net *net)
-{
-	return net->nft.gencursor+1 == 1 ? 1 : 0;
+	return (rule->genmask & nft_genmask_cur(net)) == 0;
 }
 
 static inline int
 nft_rule_is_active_next(struct net *net, const struct nft_rule *rule)
 {
-	return (rule->genmask & (1 << gencursor_next(net))) == 0;
+	return (rule->genmask & nft_genmask_next(net)) == 0;
 }
 
 static inline void
 nft_rule_activate_next(struct net *net, struct nft_rule *rule)
 {
 	/* Now inactive, will be active in the future */
-	rule->genmask = (1 << net->nft.gencursor);
+	rule->genmask = nft_genmask_cur(net);
 }
 
 static inline void
 nft_rule_deactivate_next(struct net *net, struct nft_rule *rule)
 {
-	rule->genmask = (1 << gencursor_next(net));
+	rule->genmask = nft_genmask_next(net);
 }
 
 static inline void nft_rule_clear(struct net *net, struct nft_rule *rule)
 {
-	rule->genmask &= ~(1 << gencursor_next(net));
+	rule->genmask &= ~nft_genmask_next(net);
 }
 
 static int
@@ -3626,7 +3621,7 @@ static int nf_tables_commit(struct sk_buff *skb)
 	while (++net->nft.base_seq == 0);
 
 	/* A new generation has just started */
-	net->nft.gencursor = gencursor_next(net);
+	net->nft.gencursor = nft_gencursor_next(net);
 
 	/* Make sure all packets have left the previous generation before
 	 * purging old rules.
diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c
index 4429008..ef4dfcb 100644
--- a/net/netfilter/nf_tables_core.c
+++ b/net/netfilter/nf_tables_core.c
@@ -121,11 +121,7 @@ nft_do_chain(struct nft_pktinfo *pkt, const struct nf_hook_ops *ops)
 	struct nft_jumpstack jumpstack[NFT_JUMP_STACK_SIZE];
 	struct nft_stats *stats;
 	int rulenum;
-	/*
-	 * Cache cursor to avoid problems in case that the cursor is updated
-	 * while traversing the ruleset.
-	 */
-	unsigned int gencursor = ACCESS_ONCE(net->nft.gencursor);
+	unsigned int gencursor = nft_genmask_cur(net);
 
 do_chain:
 	rulenum = 0;
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 4/4] netfilter: nf_tables: implement set transaction support
  2015-03-25 14:08 [PATCH 0/4] netfilter: nf_tables: set transactions Patrick McHardy
                   ` (2 preceding siblings ...)
  2015-03-25 14:08 ` [PATCH 3/4] netfilter: nf_tables: add transaction helper functions Patrick McHardy
@ 2015-03-25 14:08 ` Patrick McHardy
  2015-03-26 11:09 ` [PATCH 0/4] netfilter: nf_tables: set transactions Pablo Neira Ayuso
  4 siblings, 0 replies; 9+ messages in thread
From: Patrick McHardy @ 2015-03-25 14:08 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel

Set elements are the last object type not supporting transaction support.
Implement similar to the existing rule transactions:

The global transaction counter keeps track of two generations, current
and next. Each element contains a bitmask specifying in which generations
it is inactive.

New elements start out as inactive in the current generation and active
in the next. On commit, the previous next generation becomes the current
generation and the element becomes active. The bitmask is then cleared
to indicate that the element is active in all future generations. If the
transaction is aborted, the element is removed from the set before it
becomes active.

When removing an element, it gets marked as inactive in the next generation.
On commit the next generation becomes active and the therefor the element
inactive. It is then taken out of then set and released. On abort, the
element is marked as active for the next generation again.

Lookups ignore elements not active in the current generation.

The current set types (hash/rbtree) both use a field in the extension area
to store the generation mask. This (currently) does not require any
additional memory since we have some free space in there.

Signed-off-by: Patrick McHardy <kaber@trash.net>
---
 include/net/netfilter/nf_tables.h | 33 +++++++++++++++++++++------
 net/netfilter/nf_tables_api.c     | 33 ++++++++++++++++-----------
 net/netfilter/nft_hash.c          | 38 +++++++++++++++++++++++--------
 net/netfilter/nft_rbtree.c        | 48 +++++++++++++++++++++++++++++++--------
 4 files changed, 112 insertions(+), 40 deletions(-)

diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h
index 4c46a32..b8cd60d 100644
--- a/include/net/netfilter/nf_tables.h
+++ b/include/net/netfilter/nf_tables.h
@@ -138,15 +138,10 @@ struct nft_userdata {
 /**
  *	struct nft_set_elem - generic representation of set elements
  *
- *	@cookie: implementation specific element cookie
  *	@key: element key
  *	@priv: element private data and extensions
- *
- *	The cookie can be used to store a handle to the element for subsequent
- *	removal.
  */
 struct nft_set_elem {
-	void			*cookie;
 	struct nft_data		key;
 	void			*priv;
 };
@@ -207,6 +202,8 @@ struct nft_set_ext;
  *
  *	@lookup: look up an element within the set
  *	@insert: insert new element into set
+ *	@activate: activate new element in the next generation
+ *	@deactivate: deactivate element in the next generation
  *	@remove: remove element from set
  *	@walk: iterate over all set elemeennts
  *	@privsize: function to return size of set private data
@@ -221,10 +218,12 @@ struct nft_set_ops {
 	bool				(*lookup)(const struct nft_set *set,
 						  const struct nft_data *key,
 						  const struct nft_set_ext **ext);
-	int				(*get)(const struct nft_set *set,
-					       struct nft_set_elem *elem);
 	int				(*insert)(const struct nft_set *set,
 						  const struct nft_set_elem *elem);
+	void				(*activate)(const struct nft_set *set,
+						    const struct nft_set_elem *elem);
+	void *				(*deactivate)(const struct nft_set *set,
+						      const struct nft_set_elem *elem);
 	void				(*remove)(const struct nft_set *set,
 						  const struct nft_set_elem *elem);
 	void				(*walk)(const struct nft_ctx *ctx,
@@ -261,6 +260,7 @@ void nft_unregister_set(struct nft_set_ops *ops);
  * 	@nelems: number of elements
  *	@policy: set parameterization (see enum nft_set_policies)
  * 	@ops: set ops
+ * 	@pnet: network namespace
  * 	@flags: set flags
  * 	@klen: key length
  * 	@dlen: data length
@@ -277,6 +277,7 @@ struct nft_set {
 	u16				policy;
 	/* runtime data below here */
 	const struct nft_set_ops	*ops ____cacheline_aligned;
+	possible_net_t			pnet;
 	u16				flags;
 	u8				klen;
 	u8				dlen;
@@ -355,10 +356,12 @@ struct nft_set_ext_tmpl {
 /**
  *	struct nft_set_ext - set extensions
  *
+ *	@genmask: generation mask
  *	@offset: offsets of individual extension types
  *	@data: beginning of extension data
  */
 struct nft_set_ext {
+	u8	genmask;
 	u8	offset[NFT_SET_EXT_NUM];
 	char	data[0];
 };
@@ -748,6 +751,22 @@ static inline u8 nft_genmask_cur(const struct net *net)
 	return 1 << ACCESS_ONCE(net->nft.gencursor);
 }
 
+/*
+ * Set element transaction helpers
+ */
+
+static inline bool nft_set_elem_active(const struct nft_set_ext *ext,
+				       u8 genmask)
+{
+	return !(ext->genmask & genmask);
+}
+
+static inline void nft_set_elem_change_active(const struct nft_set *set,
+					      struct nft_set_ext *ext)
+{
+	ext->genmask ^= nft_genmask_next(read_pnet(&set->pnet));
+}
+
 /**
  *	struct nft_trans - nf_tables object update in transaction
  *
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index 66fa5e9..5604c2d 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -2690,6 +2690,7 @@ static int nf_tables_newset(struct sock *nlsk, struct sk_buff *skb,
 		goto err2;
 
 	INIT_LIST_HEAD(&set->bindings);
+	write_pnet(&set->pnet, net);
 	set->ops   = ops;
 	set->ktype = ktype;
 	set->klen  = desc.klen;
@@ -3221,10 +3222,6 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 	if (d1.type != NFT_DATA_VALUE || d1.len != set->klen)
 		goto err2;
 
-	err = -EEXIST;
-	if (set->ops->get(set, &elem) == 0)
-		goto err2;
-
 	nft_set_ext_add(&tmpl, NFT_SET_EXT_KEY);
 
 	if (nla[NFTA_SET_ELEM_DATA] != NULL) {
@@ -3266,6 +3263,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set,
 	if (trans == NULL)
 		goto err4;
 
+	ext->genmask = nft_genmask_cur(ctx->net);
 	err = set->ops->insert(set, &elem);
 	if (err < 0)
 		goto err5;
@@ -3353,19 +3351,24 @@ static int nft_del_setelem(struct nft_ctx *ctx, struct nft_set *set,
 	if (desc.type != NFT_DATA_VALUE || desc.len != set->klen)
 		goto err2;
 
-	err = set->ops->get(set, &elem);
-	if (err < 0)
-		goto err2;
-
 	trans = nft_trans_elem_alloc(ctx, NFT_MSG_DELSETELEM, set);
 	if (trans == NULL) {
 		err = -ENOMEM;
 		goto err2;
 	}
 
+	elem.priv = set->ops->deactivate(set, &elem);
+	if (elem.priv == NULL) {
+		err = -ENOENT;
+		goto err3;
+	}
+
 	nft_trans_elem(trans) = elem;
 	list_add_tail(&trans->list, &ctx->net->nft.commit_list);
 	return 0;
+
+err3:
+	kfree(trans);
 err2:
 	nft_data_uninit(&elem.key, desc.type);
 err1:
@@ -3692,9 +3695,11 @@ static int nf_tables_commit(struct sk_buff *skb)
 					     NFT_MSG_DELSET, GFP_KERNEL);
 			break;
 		case NFT_MSG_NEWSETELEM:
-			nf_tables_setelem_notify(&trans->ctx,
-						 nft_trans_elem_set(trans),
-						 &nft_trans_elem(trans),
+			te = (struct nft_trans_elem *)trans->data;
+
+			te->set->ops->activate(te->set, &te->elem);
+			nf_tables_setelem_notify(&trans->ctx, te->set,
+						 &te->elem,
 						 NFT_MSG_NEWSETELEM, 0);
 			nft_trans_destroy(trans);
 			break;
@@ -3704,7 +3709,6 @@ static int nf_tables_commit(struct sk_buff *skb)
 			nf_tables_setelem_notify(&trans->ctx, te->set,
 						 &te->elem,
 						 NFT_MSG_DELSETELEM, 0);
-			te->set->ops->get(te->set, &te->elem);
 			te->set->ops->remove(te->set, &te->elem);
 			break;
 		}
@@ -3812,11 +3816,14 @@ static int nf_tables_abort(struct sk_buff *skb)
 			nft_trans_elem_set(trans)->nelems--;
 			te = (struct nft_trans_elem *)trans->data;
 
-			te->set->ops->get(te->set, &te->elem);
 			te->set->ops->remove(te->set, &te->elem);
 			break;
 		case NFT_MSG_DELSETELEM:
+			te = (struct nft_trans_elem *)trans->data;
+
 			nft_trans_elem_set(trans)->nelems++;
+			te->set->ops->activate(te->set, &te->elem);
+
 			nft_trans_destroy(trans);
 			break;
 		}
diff --git a/net/netfilter/nft_hash.c b/net/netfilter/nft_hash.c
index 5bee821..c7e1a9d 100644
--- a/net/netfilter/nft_hash.c
+++ b/net/netfilter/nft_hash.c
@@ -35,6 +35,7 @@ struct nft_hash_elem {
 struct nft_hash_cmp_arg {
 	const struct nft_set		*set;
 	const struct nft_data		*key;
+	u8				genmask;
 };
 
 static const struct rhashtable_params nft_hash_params;
@@ -61,6 +62,8 @@ static inline int nft_hash_cmp(struct rhashtable_compare_arg *arg,
 
 	if (nft_data_cmp(nft_set_ext_key(&he->ext), x->key, x->set->klen))
 		return 1;
+	if (!nft_set_elem_active(&he->ext, x->genmask))
+		return 1;
 	return 0;
 }
 
@@ -71,6 +74,7 @@ static bool nft_hash_lookup(const struct nft_set *set,
 	struct nft_hash *priv = nft_set_priv(set);
 	const struct nft_hash_elem *he;
 	struct nft_hash_cmp_arg arg = {
+		.genmask = nft_genmask_cur(read_pnet(&set->pnet)),
 		.set	 = set,
 		.key	 = key,
 	};
@@ -88,6 +92,7 @@ static int nft_hash_insert(const struct nft_set *set,
 	struct nft_hash *priv = nft_set_priv(set);
 	struct nft_hash_elem *he = elem->priv;
 	struct nft_hash_cmp_arg arg = {
+		.genmask = nft_genmask_next(read_pnet(&set->pnet)),
 		.set	 = set,
 		.key	 = &elem->key,
 	};
@@ -96,30 +101,39 @@ static int nft_hash_insert(const struct nft_set *set,
 					    nft_hash_params);
 }
 
-static void nft_hash_remove(const struct nft_set *set,
-			    const struct nft_set_elem *elem)
+static void nft_hash_activate(const struct nft_set *set,
+			      const struct nft_set_elem *elem)
 {
-	struct nft_hash *priv = nft_set_priv(set);
+	struct nft_hash_elem *he = elem->priv;
 
-	rhashtable_remove_fast(&priv->ht, elem->cookie, nft_hash_params);
+	nft_set_elem_change_active(set, &he->ext);
 }
 
-static int nft_hash_get(const struct nft_set *set, struct nft_set_elem *elem)
+static void *nft_hash_deactivate(const struct nft_set *set,
+				 const struct nft_set_elem *elem)
 {
 	struct nft_hash *priv = nft_set_priv(set);
 	struct nft_hash_elem *he;
 	struct nft_hash_cmp_arg arg = {
+		.genmask = nft_genmask_next(read_pnet(&set->pnet)),
 		.set	 = set,
 		.key	 = &elem->key,
 	};
 
 	he = rhashtable_lookup_fast(&priv->ht, &arg, nft_hash_params);
-	if (!he)
-		return -ENOENT;
+	if (he != NULL)
+		nft_set_elem_change_active(set, &he->ext);
 
-	elem->priv = he;
+	return he;
+}
 
-	return 0;
+static void nft_hash_remove(const struct nft_set *set,
+			    const struct nft_set_elem *elem)
+{
+	struct nft_hash *priv = nft_set_priv(set);
+	struct nft_hash_elem *he = elem->priv;
+
+	rhashtable_remove_fast(&priv->ht, &he->node, nft_hash_params);
 }
 
 static void nft_hash_walk(const struct nft_ctx *ctx, const struct nft_set *set,
@@ -129,6 +143,7 @@ static void nft_hash_walk(const struct nft_ctx *ctx, const struct nft_set *set,
 	struct nft_hash_elem *he;
 	struct rhashtable_iter hti;
 	struct nft_set_elem elem;
+	u8 genmask = nft_genmask_cur(read_pnet(&set->pnet));
 	int err;
 
 	err = rhashtable_walk_init(&priv->ht, &hti);
@@ -155,6 +170,8 @@ static void nft_hash_walk(const struct nft_ctx *ctx, const struct nft_set *set,
 
 		if (iter->count < iter->skip)
 			goto cont;
+		if (!nft_set_elem_active(&he->ext, genmask))
+			goto cont;
 
 		elem.priv = he;
 
@@ -241,8 +258,9 @@ static struct nft_set_ops nft_hash_ops __read_mostly = {
 	.estimate	= nft_hash_estimate,
 	.init		= nft_hash_init,
 	.destroy	= nft_hash_destroy,
-	.get		= nft_hash_get,
 	.insert		= nft_hash_insert,
+	.activate	= nft_hash_activate,
+	.deactivate	= nft_hash_deactivate,
 	.remove		= nft_hash_remove,
 	.lookup		= nft_hash_lookup,
 	.walk		= nft_hash_walk,
diff --git a/net/netfilter/nft_rbtree.c b/net/netfilter/nft_rbtree.c
index cbba755..42d0ca4 100644
--- a/net/netfilter/nft_rbtree.c
+++ b/net/netfilter/nft_rbtree.c
@@ -29,6 +29,7 @@ struct nft_rbtree_elem {
 	struct nft_set_ext	ext;
 };
 
+
 static bool nft_rbtree_lookup(const struct nft_set *set,
 			      const struct nft_data *key,
 			      const struct nft_set_ext **ext)
@@ -36,6 +37,7 @@ static bool nft_rbtree_lookup(const struct nft_set *set,
 	const struct nft_rbtree *priv = nft_set_priv(set);
 	const struct nft_rbtree_elem *rbe, *interval = NULL;
 	const struct rb_node *parent;
+	u8 genmask = nft_genmask_cur(read_pnet(&set->pnet));
 	int d;
 
 	spin_lock_bh(&nft_rbtree_lock);
@@ -51,6 +53,10 @@ static bool nft_rbtree_lookup(const struct nft_set *set,
 			parent = parent->rb_right;
 		else {
 found:
+			if (!nft_set_elem_active(&rbe->ext, genmask)) {
+				parent = parent->rb_left;
+				continue;
+			}
 			if (nft_set_ext_exists(&rbe->ext, NFT_SET_EXT_FLAGS) &&
 			    *nft_set_ext_flags(&rbe->ext) &
 			    NFT_SET_ELEM_INTERVAL_END)
@@ -77,6 +83,7 @@ static int __nft_rbtree_insert(const struct nft_set *set,
 	struct nft_rbtree *priv = nft_set_priv(set);
 	struct nft_rbtree_elem *rbe;
 	struct rb_node *parent, **p;
+	u8 genmask = nft_genmask_next(read_pnet(&set->pnet));
 	int d;
 
 	parent = NULL;
@@ -91,8 +98,11 @@ static int __nft_rbtree_insert(const struct nft_set *set,
 			p = &parent->rb_left;
 		else if (d > 0)
 			p = &parent->rb_right;
-		else
-			return -EEXIST;
+		else {
+			if (nft_set_elem_active(&rbe->ext, genmask))
+				return -EEXIST;
+			p = &parent->rb_left;
+		}
 	}
 	rb_link_node(&new->node, parent, p);
 	rb_insert_color(&new->node, &priv->root);
@@ -116,18 +126,28 @@ static void nft_rbtree_remove(const struct nft_set *set,
 			      const struct nft_set_elem *elem)
 {
 	struct nft_rbtree *priv = nft_set_priv(set);
-	struct nft_rbtree_elem *rbe = elem->cookie;
+	struct nft_rbtree_elem *rbe = elem->priv;
 
 	spin_lock_bh(&nft_rbtree_lock);
 	rb_erase(&rbe->node, &priv->root);
 	spin_unlock_bh(&nft_rbtree_lock);
 }
 
-static int nft_rbtree_get(const struct nft_set *set, struct nft_set_elem *elem)
+static void nft_rbtree_activate(const struct nft_set *set,
+				const struct nft_set_elem *elem)
+{
+	struct nft_rbtree_elem *rbe = elem->priv;
+
+	nft_set_elem_change_active(set, &rbe->ext);
+}
+
+static void *nft_rbtree_deactivate(const struct nft_set *set,
+				   const struct nft_set_elem *elem)
 {
 	const struct nft_rbtree *priv = nft_set_priv(set);
 	const struct rb_node *parent = priv->root.rb_node;
 	struct nft_rbtree_elem *rbe;
+	u8 genmask = nft_genmask_cur(read_pnet(&set->pnet));
 	int d;
 
 	while (parent != NULL) {
@@ -140,12 +160,15 @@ static int nft_rbtree_get(const struct nft_set *set, struct nft_set_elem *elem)
 		else if (d > 0)
 			parent = parent->rb_right;
 		else {
-			elem->cookie = rbe;
-			elem->priv   = rbe;
-			return 0;
+			if (!nft_set_elem_active(&rbe->ext, genmask)) {
+				parent = parent->rb_left;
+				continue;
+			}
+			nft_set_elem_change_active(set, &rbe->ext);
+			return rbe;
 		}
 	}
-	return -ENOENT;
+	return NULL;
 }
 
 static void nft_rbtree_walk(const struct nft_ctx *ctx,
@@ -156,13 +179,17 @@ static void nft_rbtree_walk(const struct nft_ctx *ctx,
 	struct nft_rbtree_elem *rbe;
 	struct nft_set_elem elem;
 	struct rb_node *node;
+	u8 genmask = nft_genmask_cur(read_pnet(&set->pnet));
 
 	spin_lock_bh(&nft_rbtree_lock);
 	for (node = rb_first(&priv->root); node != NULL; node = rb_next(node)) {
+		rbe = rb_entry(node, struct nft_rbtree_elem, node);
+
 		if (iter->count < iter->skip)
 			goto cont;
+		if (!nft_set_elem_active(&rbe->ext, genmask))
+			goto cont;
 
-		rbe = rb_entry(node, struct nft_rbtree_elem, node);
 		elem.priv = rbe;
 
 		iter->err = iter->fn(ctx, set, iter, &elem);
@@ -228,7 +255,8 @@ static struct nft_set_ops nft_rbtree_ops __read_mostly = {
 	.destroy	= nft_rbtree_destroy,
 	.insert		= nft_rbtree_insert,
 	.remove		= nft_rbtree_remove,
-	.get		= nft_rbtree_get,
+	.deactivate	= nft_rbtree_deactivate,
+	.activate	= nft_rbtree_activate,
 	.lookup		= nft_rbtree_lookup,
 	.walk		= nft_rbtree_walk,
 	.features	= NFT_SET_INTERVAL | NFT_SET_MAP,
-- 
2.1.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/4] netfilter: nf_tables: set transactions
  2015-03-26 11:09 ` [PATCH 0/4] netfilter: nf_tables: set transactions Pablo Neira Ayuso
@ 2015-03-26 11:07   ` Patrick McHardy
  2015-03-26 11:15     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 9+ messages in thread
From: Patrick McHardy @ 2015-03-26 11:07 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel

On 26.03, Pablo Neira Ayuso wrote:
> On Wed, Mar 25, 2015 at 02:08:46PM +0000, Patrick McHardy wrote:
> > These patches add transaction support to set elements. It pretty much
> > resembles the existing transaction support, the changelogs explain it
> > in more detail.
> 
> Thanks a lot for finishing this support.
> 
> > As a by-product the patches get rid of the RCU grace period for every
> > nft_hash element that is destroyed, we now have only the single global
> > transaction grace perdiod independantly of what we destroy.
> 
> Great work.
> 
> Applied, thanks Patrick!

Thanks! Next batch will follow shortly :)

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/4] netfilter: nf_tables: set transactions
  2015-03-25 14:08 [PATCH 0/4] netfilter: nf_tables: set transactions Patrick McHardy
                   ` (3 preceding siblings ...)
  2015-03-25 14:08 ` [PATCH 4/4] netfilter: nf_tables: implement set transaction support Patrick McHardy
@ 2015-03-26 11:09 ` Pablo Neira Ayuso
  2015-03-26 11:07   ` Patrick McHardy
  4 siblings, 1 reply; 9+ messages in thread
From: Pablo Neira Ayuso @ 2015-03-26 11:09 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netfilter-devel

On Wed, Mar 25, 2015 at 02:08:46PM +0000, Patrick McHardy wrote:
> These patches add transaction support to set elements. It pretty much
> resembles the existing transaction support, the changelogs explain it
> in more detail.

Thanks a lot for finishing this support.

> As a by-product the patches get rid of the RCU grace period for every
> nft_hash element that is destroyed, we now have only the single global
> transaction grace perdiod independantly of what we destroy.

Great work.

Applied, thanks Patrick!

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/4] netfilter: nf_tables: set transactions
  2015-03-26 11:15     ` Pablo Neira Ayuso
@ 2015-03-26 11:14       ` Patrick McHardy
  0 siblings, 0 replies; 9+ messages in thread
From: Patrick McHardy @ 2015-03-26 11:14 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel

On 26.03, Pablo Neira Ayuso wrote:
> On Thu, Mar 26, 2015 at 11:07:01AM +0000, Patrick McHardy wrote:
> > On 26.03, Pablo Neira Ayuso wrote:
> > > On Wed, Mar 25, 2015 at 02:08:46PM +0000, Patrick McHardy wrote:
> > > > These patches add transaction support to set elements. It pretty much
> > > > resembles the existing transaction support, the changelogs explain it
> > > > in more detail.
> > > 
> > > Thanks a lot for finishing this support.
> > > 
> > > > As a by-product the patches get rid of the RCU grace period for every
> > > > nft_hash element that is destroyed, we now have only the single global
> > > > transaction grace perdiod independantly of what we destroy.
> > > 
> > > Great work.
> > > 
> > > Applied, thanks Patrick!
> > 
> > Thanks! Next batch will follow shortly :)
> 
> I'm going to send a pull request to try to keep the batch relatively
> small, so the next batch will just sit for a while on the list, OK?

Sure, no problem.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/4] netfilter: nf_tables: set transactions
  2015-03-26 11:07   ` Patrick McHardy
@ 2015-03-26 11:15     ` Pablo Neira Ayuso
  2015-03-26 11:14       ` Patrick McHardy
  0 siblings, 1 reply; 9+ messages in thread
From: Pablo Neira Ayuso @ 2015-03-26 11:15 UTC (permalink / raw)
  To: Patrick McHardy; +Cc: netfilter-devel

On Thu, Mar 26, 2015 at 11:07:01AM +0000, Patrick McHardy wrote:
> On 26.03, Pablo Neira Ayuso wrote:
> > On Wed, Mar 25, 2015 at 02:08:46PM +0000, Patrick McHardy wrote:
> > > These patches add transaction support to set elements. It pretty much
> > > resembles the existing transaction support, the changelogs explain it
> > > in more detail.
> > 
> > Thanks a lot for finishing this support.
> > 
> > > As a by-product the patches get rid of the RCU grace period for every
> > > nft_hash element that is destroyed, we now have only the single global
> > > transaction grace perdiod independantly of what we destroy.
> > 
> > Great work.
> > 
> > Applied, thanks Patrick!
> 
> Thanks! Next batch will follow shortly :)

I'm going to send a pull request to try to keep the batch relatively
small, so the next batch will just sit for a while on the list, OK?

Thanks!

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2015-03-26 11:14 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-03-25 14:08 [PATCH 0/4] netfilter: nf_tables: set transactions Patrick McHardy
2015-03-25 14:08 ` [PATCH 1/4] netfilter: nf_tables: consolide set element destruction Patrick McHardy
2015-03-25 14:08 ` [PATCH 2/4] netfilter: nf_tables: return set extensions from ->lookup() Patrick McHardy
2015-03-25 14:08 ` [PATCH 3/4] netfilter: nf_tables: add transaction helper functions Patrick McHardy
2015-03-25 14:08 ` [PATCH 4/4] netfilter: nf_tables: implement set transaction support Patrick McHardy
2015-03-26 11:09 ` [PATCH 0/4] netfilter: nf_tables: set transactions Pablo Neira Ayuso
2015-03-26 11:07   ` Patrick McHardy
2015-03-26 11:15     ` Pablo Neira Ayuso
2015-03-26 11:14       ` Patrick McHardy

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.