Netfilter-Devel Archive on lore.kernel.org
 help / color / Atom feed
* [PATCH 0/9] Netfilter fixes for net
@ 2018-12-29 12:57 Pablo Neira Ayuso
  2018-12-29 12:57 ` [PATCH 1/9] netfilter: nf_tables: fix a missing check of nla_put_failure Pablo Neira Ayuso
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: Pablo Neira Ayuso @ 2018-12-29 12:57 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter fixes for net, specifically
fixes for the nf_conncount infrastructure which is causing troubles
since 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc
worker, and RCU for init tree search"). Patches aim to simplify this
infrastructure while fixing up the problems:

1) Use fixed size CONNCOUNT_SLOTS in nf_conncount, from Shawn Bohrer.

2) Incorrect signedness in age calculation from find_or_evict(),
   from Florian Westphal.

3) Proper locking for the garbage collector workqueue callback,
   first make a patch to count how many nodes can be collected
   without holding locks, then grab lock and release them. Also
   from Florian.

4) Restart node lookup from the insertion path, after releasing nodes
   via packet path garbage collection. Shawn Bohrer described a scenario
   that may result in inserting a connection in an already dead list
   node. Patch from Florian.

5) Merge lookup and add function to avoid a hold release and re-grab.
   From Florian.

6) Be safe and iterate over the node lists under the spinlock.

7) Speculative list nodes removal via garbage collection, check if
   list node got a connection while it was scheduled for deletion
   via gc.

8) Accidental argument swap in find_next_bit() that leads to more
   frequent scheduling of the workqueue. From Florian Westphal.

And one patch that falls within the miscelanea category in this batch:

9) Missing error path for nla_nest_start(), from Kangjie Lu.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Thanks!

----------------------------------------------------------------

The following changes since commit a3c9311f62b4943228ae90f769775dd3bcbfa7c0:

  include/linux/phy/phy.h: fix minor kerneldoc errors (2018-12-27 16:31:10 -0800)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git HEAD

for you to fetch changes up to a007232066f6839d6f256bab21e825d968f1a163:

  netfilter: nf_conncount: fix argument order to find_next_bit (2018-12-29 02:45:22 +0100)

----------------------------------------------------------------
Florian Westphal (5):
      netfilter: nf_conncount: don't skip eviction when age is negative
      netfilter: nf_conncount: split gc in two phases
      netfilter: nf_conncount: restart search when nodes have been erased
      netfilter: nf_conncount: merge lookup and add functions
      netfilter: nf_conncount: fix argument order to find_next_bit

Kangjie Lu (1):
      netfilter: nf_tables: fix a missing check of nla_put_failure

Pablo Neira Ayuso (2):
      netfilter: nf_conncount: move all list iterations under spinlock
      netfilter: nf_conncount: speculative garbage collection on empty lists

Shawn Bohrer (1):
      netfilter: nf_conncount: replace CONNCOUNT_LOCK_SLOTS with CONNCOUNT_SLOTS

 include/net/netfilter/nf_conntrack_count.h |  19 +-
 net/netfilter/nf_conncount.c               | 290 +++++++++++++----------------
 net/netfilter/nf_tables_api.c              |   2 +
 net/netfilter/nft_connlimit.c              |  14 +-
 4 files changed, 136 insertions(+), 189 deletions(-)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 1/9] netfilter: nf_tables: fix a missing check of nla_put_failure
  2018-12-29 12:57 [PATCH 0/9] Netfilter fixes for net Pablo Neira Ayuso
@ 2018-12-29 12:57 ` Pablo Neira Ayuso
  2018-12-29 12:57 ` [PATCH 2/9] netfilter: nf_conncount: replace CONNCOUNT_LOCK_SLOTS with CONNCOUNT_SLOTS Pablo Neira Ayuso
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Pablo Neira Ayuso @ 2018-12-29 12:57 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Kangjie Lu <kjlu@umn.edu>

If nla_nest_start() may fail. The fix checks its return value and goes
to nla_put_failure if it fails.

Signed-off-by: Kangjie Lu <kjlu@umn.edu>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_tables_api.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c
index fec814dace5a..2b0a93300dd7 100644
--- a/net/netfilter/nf_tables_api.c
+++ b/net/netfilter/nf_tables_api.c
@@ -5727,6 +5727,8 @@ static int nf_tables_fill_flowtable_info(struct sk_buff *skb, struct net *net,
 		goto nla_put_failure;
 
 	nest = nla_nest_start(skb, NFTA_FLOWTABLE_HOOK);
+	if (!nest)
+		goto nla_put_failure;
 	if (nla_put_be32(skb, NFTA_FLOWTABLE_HOOK_NUM, htonl(flowtable->hooknum)) ||
 	    nla_put_be32(skb, NFTA_FLOWTABLE_HOOK_PRIORITY, htonl(flowtable->priority)))
 		goto nla_put_failure;
-- 
2.11.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 2/9] netfilter: nf_conncount: replace CONNCOUNT_LOCK_SLOTS with CONNCOUNT_SLOTS
  2018-12-29 12:57 [PATCH 0/9] Netfilter fixes for net Pablo Neira Ayuso
  2018-12-29 12:57 ` [PATCH 1/9] netfilter: nf_tables: fix a missing check of nla_put_failure Pablo Neira Ayuso
@ 2018-12-29 12:57 ` Pablo Neira Ayuso
  2018-12-29 12:57 ` [PATCH 3/9] netfilter: nf_conncount: don't skip eviction when age is negative Pablo Neira Ayuso
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Pablo Neira Ayuso @ 2018-12-29 12:57 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Shawn Bohrer <sbohrer@cloudflare.com>

Most of the time these were the same value anyway, but when
CONFIG_LOCKDEP was enabled we would use a smaller number of locks to
reduce overhead.  Unfortunately having two values is confusing and not
worth the complexity.

This fixes a bug where tree_gc_worker() would only GC up to
CONNCOUNT_LOCK_SLOTS trees which meant when CONFIG_LOCKDEP was enabled
not all trees would be GCed by tree_gc_worker().

Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_conncount.c | 19 +++++--------------
 1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 9cd180bda092..3271a4e00500 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -33,12 +33,6 @@
 
 #define CONNCOUNT_SLOTS		256U
 
-#ifdef CONFIG_LOCKDEP
-#define CONNCOUNT_LOCK_SLOTS	8U
-#else
-#define CONNCOUNT_LOCK_SLOTS	256U
-#endif
-
 #define CONNCOUNT_GC_MAX_NODES	8
 #define MAX_KEYLEN		5
 
@@ -60,7 +54,7 @@ struct nf_conncount_rb {
 	struct rcu_head rcu_head;
 };
 
-static spinlock_t nf_conncount_locks[CONNCOUNT_LOCK_SLOTS] __cacheline_aligned_in_smp;
+static spinlock_t nf_conncount_locks[CONNCOUNT_SLOTS] __cacheline_aligned_in_smp;
 
 struct nf_conncount_data {
 	unsigned int keylen;
@@ -353,7 +347,7 @@ insert_tree(struct net *net,
 	unsigned int count = 0, gc_count = 0;
 	bool node_found = false;
 
-	spin_lock_bh(&nf_conncount_locks[hash % CONNCOUNT_LOCK_SLOTS]);
+	spin_lock_bh(&nf_conncount_locks[hash]);
 
 	parent = NULL;
 	rbnode = &(root->rb_node);
@@ -430,7 +424,7 @@ insert_tree(struct net *net,
 	rb_link_node_rcu(&rbconn->node, parent, rbnode);
 	rb_insert_color(&rbconn->node, root);
 out_unlock:
-	spin_unlock_bh(&nf_conncount_locks[hash % CONNCOUNT_LOCK_SLOTS]);
+	spin_unlock_bh(&nf_conncount_locks[hash]);
 	return count;
 }
 
@@ -499,7 +493,7 @@ static void tree_gc_worker(struct work_struct *work)
 	struct rb_node *node;
 	unsigned int tree, next_tree, gc_count = 0;
 
-	tree = data->gc_tree % CONNCOUNT_LOCK_SLOTS;
+	tree = data->gc_tree % CONNCOUNT_SLOTS;
 	root = &data->root[tree];
 
 	rcu_read_lock();
@@ -621,10 +615,7 @@ static int __init nf_conncount_modinit(void)
 {
 	int i;
 
-	BUILD_BUG_ON(CONNCOUNT_LOCK_SLOTS > CONNCOUNT_SLOTS);
-	BUILD_BUG_ON((CONNCOUNT_SLOTS % CONNCOUNT_LOCK_SLOTS) != 0);
-
-	for (i = 0; i < CONNCOUNT_LOCK_SLOTS; ++i)
+	for (i = 0; i < CONNCOUNT_SLOTS; ++i)
 		spin_lock_init(&nf_conncount_locks[i]);
 
 	conncount_conn_cachep = kmem_cache_create("nf_conncount_tuple",
-- 
2.11.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 3/9] netfilter: nf_conncount: don't skip eviction when age is negative
  2018-12-29 12:57 [PATCH 0/9] Netfilter fixes for net Pablo Neira Ayuso
  2018-12-29 12:57 ` [PATCH 1/9] netfilter: nf_tables: fix a missing check of nla_put_failure Pablo Neira Ayuso
  2018-12-29 12:57 ` [PATCH 2/9] netfilter: nf_conncount: replace CONNCOUNT_LOCK_SLOTS with CONNCOUNT_SLOTS Pablo Neira Ayuso
@ 2018-12-29 12:57 ` Pablo Neira Ayuso
  2018-12-29 12:57 ` [PATCH 4/9] netfilter: nf_conncount: split gc in two phases Pablo Neira Ayuso
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Pablo Neira Ayuso @ 2018-12-29 12:57 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Florian Westphal <fw@strlen.de>

age is signed integer, so result can be negative when the timestamps
have a large delta.  In this case we want to discard the entry.

Instead of using age >= 2 || age < 0, just make it unsigned.

Fixes: b36e4523d4d56 ("netfilter: nf_conncount: fix garbage collection confirm race")
Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_conncount.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 3271a4e00500..8bb4ed85c262 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -155,7 +155,7 @@ find_or_evict(struct net *net, struct nf_conncount_list *list,
 	const struct nf_conntrack_tuple_hash *found;
 	unsigned long a, b;
 	int cpu = raw_smp_processor_id();
-	__s32 age;
+	u32 age;
 
 	found = nf_conntrack_find_get(net, &conn->zone, &conn->tuple);
 	if (found)
-- 
2.11.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 4/9] netfilter: nf_conncount: split gc in two phases
  2018-12-29 12:57 [PATCH 0/9] Netfilter fixes for net Pablo Neira Ayuso
                   ` (2 preceding siblings ...)
  2018-12-29 12:57 ` [PATCH 3/9] netfilter: nf_conncount: don't skip eviction when age is negative Pablo Neira Ayuso
@ 2018-12-29 12:57 ` Pablo Neira Ayuso
  2018-12-29 12:57 ` [PATCH 5/9] netfilter: nf_conncount: restart search when nodes have been erased Pablo Neira Ayuso
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Pablo Neira Ayuso @ 2018-12-29 12:57 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Florian Westphal <fw@strlen.de>

The lockless workqueue garbage collector can race with packet path
garbage collector to delete list nodes, as it calls tree_nodes_free()
with the addresses of nodes that might have been free'd already from
another cpu.

To fix this, split gc into two phases.

One phase to perform gc on the connections: From a locking perspective,
this is the same as count_tree(): we hold rcu lock, but we do not
change the tree, we only change the nodes' contents.

The second phase acquires the tree lock and reaps empty nodes.
This avoids a race condition of the garbage collection vs.  packet path:
If a node has been free'd already, the second phase won't find it anymore.

This second phase is, from locking perspective, same as insert_tree().

The former only modifies nodes (list content, count), latter modifies
the tree itself (rb_erase or rb_insert).

Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_conncount.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 8bb4ed85c262..753132e4afa8 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -500,16 +500,32 @@ static void tree_gc_worker(struct work_struct *work)
 	for (node = rb_first(root); node != NULL; node = rb_next(node)) {
 		rbconn = rb_entry(node, struct nf_conncount_rb, node);
 		if (nf_conncount_gc_list(data->net, &rbconn->list))
-			gc_nodes[gc_count++] = rbconn;
+			gc_count++;
 	}
 	rcu_read_unlock();
 
 	spin_lock_bh(&nf_conncount_locks[tree]);
+	if (gc_count < ARRAY_SIZE(gc_nodes))
+		goto next; /* do not bother */
 
-	if (gc_count) {
-		tree_nodes_free(root, gc_nodes, gc_count);
+	gc_count = 0;
+	node = rb_first(root);
+	while (node != NULL) {
+		rbconn = rb_entry(node, struct nf_conncount_rb, node);
+		node = rb_next(node);
+
+		if (rbconn->list.count > 0)
+			continue;
+
+		gc_nodes[gc_count++] = rbconn;
+		if (gc_count >= ARRAY_SIZE(gc_nodes)) {
+			tree_nodes_free(root, gc_nodes, gc_count);
+			gc_count = 0;
+		}
 	}
 
+	tree_nodes_free(root, gc_nodes, gc_count);
+next:
 	clear_bit(tree, data->pending_trees);
 
 	next_tree = (tree + 1) % CONNCOUNT_SLOTS;
-- 
2.11.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 5/9] netfilter: nf_conncount: restart search when nodes have been erased
  2018-12-29 12:57 [PATCH 0/9] Netfilter fixes for net Pablo Neira Ayuso
                   ` (3 preceding siblings ...)
  2018-12-29 12:57 ` [PATCH 4/9] netfilter: nf_conncount: split gc in two phases Pablo Neira Ayuso
@ 2018-12-29 12:57 ` Pablo Neira Ayuso
  2018-12-29 12:58 ` [PATCH 6/9] netfilter: nf_conncount: merge lookup and add functions Pablo Neira Ayuso
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Pablo Neira Ayuso @ 2018-12-29 12:57 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Florian Westphal <fw@strlen.de>

Shawn Bohrer reported a following crash:
 |RIP: 0010:rb_erase+0xae/0x360
 [..]
 Call Trace:
  nf_conncount_destroy+0x59/0xc0 [nf_conncount]
  cleanup_match+0x45/0x70 [ip_tables]
  ...

Shawn tracked this down to bogus 'parent' pointer:
Problem is that when we insert a new node, then there is a chance that
the 'parent' that we found was also passed to tree_nodes_free() (because
that node was empty) for erase+free.

Instead of trying to be clever and detect when this happens, restart
the search if we have evicted one or more nodes.  To prevent frequent
restarts, do not perform gc on the second round.

Also, unconditionally schedule the gc worker.
The condition

  gc_count > ARRAY_SIZE(gc_nodes))

cannot be true unless tree grows very large, as the height of the tree
will be low even with hundreds of nodes present.

Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Reported-by: Shawn Bohrer <sbohrer@cloudflare.com>
Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_conncount.c | 18 +++++++-----------
 1 file changed, 7 insertions(+), 11 deletions(-)

diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 753132e4afa8..0a83c694a8f1 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -346,9 +346,10 @@ insert_tree(struct net *net,
 	struct nf_conncount_tuple *conn;
 	unsigned int count = 0, gc_count = 0;
 	bool node_found = false;
+	bool do_gc = true;
 
 	spin_lock_bh(&nf_conncount_locks[hash]);
-
+restart:
 	parent = NULL;
 	rbnode = &(root->rb_node);
 	while (*rbnode) {
@@ -381,21 +382,16 @@ insert_tree(struct net *net,
 		if (gc_count >= ARRAY_SIZE(gc_nodes))
 			continue;
 
-		if (nf_conncount_gc_list(net, &rbconn->list))
+		if (do_gc && nf_conncount_gc_list(net, &rbconn->list))
 			gc_nodes[gc_count++] = rbconn;
 	}
 
 	if (gc_count) {
 		tree_nodes_free(root, gc_nodes, gc_count);
-		/* tree_node_free before new allocation permits
-		 * allocator to re-use newly free'd object.
-		 *
-		 * This is a rare event; in most cases we will find
-		 * existing node to re-use. (or gc_count is 0).
-		 */
-
-		if (gc_count >= ARRAY_SIZE(gc_nodes))
-			schedule_gc_worker(data, hash);
+		schedule_gc_worker(data, hash);
+		gc_count = 0;
+		do_gc = false;
+		goto restart;
 	}
 
 	if (node_found)
-- 
2.11.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 6/9] netfilter: nf_conncount: merge lookup and add functions
  2018-12-29 12:57 [PATCH 0/9] Netfilter fixes for net Pablo Neira Ayuso
                   ` (4 preceding siblings ...)
  2018-12-29 12:57 ` [PATCH 5/9] netfilter: nf_conncount: restart search when nodes have been erased Pablo Neira Ayuso
@ 2018-12-29 12:58 ` Pablo Neira Ayuso
  2018-12-29 12:58 ` [PATCH 7/9] netfilter: nf_conncount: move all list iterations under spinlock Pablo Neira Ayuso
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Pablo Neira Ayuso @ 2018-12-29 12:58 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Florian Westphal <fw@strlen.de>

'lookup' is always followed by 'add'.
Merge both and make the list-walk part of nf_conncount_add().

This also avoids one unneeded unlock/re-lock pair.

Extra care needs to be taken in count_tree, as we only hold rcu
read lock, i.e. we can only insert to an existing tree node after
acquiring its lock and making sure it has a nonzero count.

As a zero count should be rare, just fall back to insert_tree()
(which acquires tree lock).

This issue and its solution were pointed out by Shawn Bohrer
during patch review.

Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_conntrack_count.h |  18 +---
 net/netfilter/nf_conncount.c               | 146 +++++++++++++----------------
 net/netfilter/nft_connlimit.c              |  14 +--
 3 files changed, 72 insertions(+), 106 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_count.h b/include/net/netfilter/nf_conntrack_count.h
index 4b2b2baf8ab4..aa66775c15f4 100644
--- a/include/net/netfilter/nf_conntrack_count.h
+++ b/include/net/netfilter/nf_conntrack_count.h
@@ -5,12 +5,6 @@
 
 struct nf_conncount_data;
 
-enum nf_conncount_list_add {
-	NF_CONNCOUNT_ADDED, 	/* list add was ok */
-	NF_CONNCOUNT_ERR,	/* -ENOMEM, must drop skb */
-	NF_CONNCOUNT_SKIP,	/* list is already reclaimed by gc */
-};
-
 struct nf_conncount_list {
 	spinlock_t list_lock;
 	struct list_head head;	/* connections with the same filtering key */
@@ -29,18 +23,12 @@ unsigned int nf_conncount_count(struct net *net,
 				const struct nf_conntrack_tuple *tuple,
 				const struct nf_conntrack_zone *zone);
 
-void nf_conncount_lookup(struct net *net, struct nf_conncount_list *list,
-			 const struct nf_conntrack_tuple *tuple,
-			 const struct nf_conntrack_zone *zone,
-			 bool *addit);
+int nf_conncount_add(struct net *net, struct nf_conncount_list *list,
+		     const struct nf_conntrack_tuple *tuple,
+		     const struct nf_conntrack_zone *zone);
 
 void nf_conncount_list_init(struct nf_conncount_list *list);
 
-enum nf_conncount_list_add
-nf_conncount_add(struct nf_conncount_list *list,
-		 const struct nf_conntrack_tuple *tuple,
-		 const struct nf_conntrack_zone *zone);
-
 bool nf_conncount_gc_list(struct net *net,
 			  struct nf_conncount_list *list);
 
diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index 0a83c694a8f1..ce7f7d1212a6 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -83,38 +83,6 @@ static int key_diff(const u32 *a, const u32 *b, unsigned int klen)
 	return memcmp(a, b, klen * sizeof(u32));
 }
 
-enum nf_conncount_list_add
-nf_conncount_add(struct nf_conncount_list *list,
-		 const struct nf_conntrack_tuple *tuple,
-		 const struct nf_conntrack_zone *zone)
-{
-	struct nf_conncount_tuple *conn;
-
-	if (WARN_ON_ONCE(list->count > INT_MAX))
-		return NF_CONNCOUNT_ERR;
-
-	conn = kmem_cache_alloc(conncount_conn_cachep, GFP_ATOMIC);
-	if (conn == NULL)
-		return NF_CONNCOUNT_ERR;
-
-	conn->tuple = *tuple;
-	conn->zone = *zone;
-	conn->cpu = raw_smp_processor_id();
-	conn->jiffies32 = (u32)jiffies;
-	conn->dead = false;
-	spin_lock_bh(&list->list_lock);
-	if (list->dead == true) {
-		kmem_cache_free(conncount_conn_cachep, conn);
-		spin_unlock_bh(&list->list_lock);
-		return NF_CONNCOUNT_SKIP;
-	}
-	list_add_tail(&conn->node, &list->head);
-	list->count++;
-	spin_unlock_bh(&list->list_lock);
-	return NF_CONNCOUNT_ADDED;
-}
-EXPORT_SYMBOL_GPL(nf_conncount_add);
-
 static void __conn_free(struct rcu_head *h)
 {
 	struct nf_conncount_tuple *conn;
@@ -177,11 +145,10 @@ find_or_evict(struct net *net, struct nf_conncount_list *list,
 	return ERR_PTR(-EAGAIN);
 }
 
-void nf_conncount_lookup(struct net *net,
-			 struct nf_conncount_list *list,
-			 const struct nf_conntrack_tuple *tuple,
-			 const struct nf_conntrack_zone *zone,
-			 bool *addit)
+static int __nf_conncount_add(struct net *net,
+			      struct nf_conncount_list *list,
+			      const struct nf_conntrack_tuple *tuple,
+			      const struct nf_conntrack_zone *zone)
 {
 	const struct nf_conntrack_tuple_hash *found;
 	struct nf_conncount_tuple *conn, *conn_n;
@@ -189,9 +156,6 @@ void nf_conncount_lookup(struct net *net,
 	unsigned int collect = 0;
 	bool free_entry = false;
 
-	/* best effort only */
-	*addit = tuple ? true : false;
-
 	/* check the saved connections */
 	list_for_each_entry_safe(conn, conn_n, &list->head, node) {
 		if (collect > CONNCOUNT_GC_MAX_NODES)
@@ -201,21 +165,19 @@ void nf_conncount_lookup(struct net *net,
 		if (IS_ERR(found)) {
 			/* Not found, but might be about to be confirmed */
 			if (PTR_ERR(found) == -EAGAIN) {
-				if (!tuple)
-					continue;
-
 				if (nf_ct_tuple_equal(&conn->tuple, tuple) &&
 				    nf_ct_zone_id(&conn->zone, conn->zone.dir) ==
 				    nf_ct_zone_id(zone, zone->dir))
-					*addit = false;
-			} else if (PTR_ERR(found) == -ENOENT)
+					return 0; /* already exists */
+			} else {
 				collect++;
+			}
 			continue;
 		}
 
 		found_ct = nf_ct_tuplehash_to_ctrack(found);
 
-		if (tuple && nf_ct_tuple_equal(&conn->tuple, tuple) &&
+		if (nf_ct_tuple_equal(&conn->tuple, tuple) &&
 		    nf_ct_zone_equal(found_ct, zone, zone->dir)) {
 			/*
 			 * We should not see tuples twice unless someone hooks
@@ -223,7 +185,8 @@ void nf_conncount_lookup(struct net *net,
 			 *
 			 * Attempt to avoid a re-add in this case.
 			 */
-			*addit = false;
+			nf_ct_put(found_ct);
+			return 0;
 		} else if (already_closed(found_ct)) {
 			/*
 			 * we do not care about connections which are
@@ -237,8 +200,38 @@ void nf_conncount_lookup(struct net *net,
 
 		nf_ct_put(found_ct);
 	}
+
+	if (WARN_ON_ONCE(list->count > INT_MAX))
+		return -EOVERFLOW;
+
+	conn = kmem_cache_alloc(conncount_conn_cachep, GFP_ATOMIC);
+	if (conn == NULL)
+		return -ENOMEM;
+
+	conn->tuple = *tuple;
+	conn->zone = *zone;
+	conn->cpu = raw_smp_processor_id();
+	conn->jiffies32 = (u32)jiffies;
+	list_add_tail(&conn->node, &list->head);
+	list->count++;
+	return 0;
 }
-EXPORT_SYMBOL_GPL(nf_conncount_lookup);
+
+int nf_conncount_add(struct net *net,
+		     struct nf_conncount_list *list,
+		     const struct nf_conntrack_tuple *tuple,
+		     const struct nf_conntrack_zone *zone)
+{
+	int ret;
+
+	/* check the saved connections */
+	spin_lock_bh(&list->list_lock);
+	ret = __nf_conncount_add(net, list, tuple, zone);
+	spin_unlock_bh(&list->list_lock);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(nf_conncount_add);
 
 void nf_conncount_list_init(struct nf_conncount_list *list)
 {
@@ -339,13 +332,11 @@ insert_tree(struct net *net,
 	    const struct nf_conntrack_tuple *tuple,
 	    const struct nf_conntrack_zone *zone)
 {
-	enum nf_conncount_list_add ret;
 	struct nf_conncount_rb *gc_nodes[CONNCOUNT_GC_MAX_NODES];
 	struct rb_node **rbnode, *parent;
 	struct nf_conncount_rb *rbconn;
 	struct nf_conncount_tuple *conn;
 	unsigned int count = 0, gc_count = 0;
-	bool node_found = false;
 	bool do_gc = true;
 
 	spin_lock_bh(&nf_conncount_locks[hash]);
@@ -363,20 +354,15 @@ insert_tree(struct net *net,
 		} else if (diff > 0) {
 			rbnode = &((*rbnode)->rb_right);
 		} else {
-			/* unlikely: other cpu added node already */
-			node_found = true;
-			ret = nf_conncount_add(&rbconn->list, tuple, zone);
-			if (ret == NF_CONNCOUNT_ERR) {
+			int ret;
+
+			ret = nf_conncount_add(net, &rbconn->list, tuple, zone);
+			if (ret)
 				count = 0; /* hotdrop */
-			} else if (ret == NF_CONNCOUNT_ADDED) {
+			else
 				count = rbconn->list.count;
-			} else {
-				/* NF_CONNCOUNT_SKIP, rbconn is already
-				 * reclaimed by gc, insert a new tree node
-				 */
-				node_found = false;
-			}
-			break;
+			tree_nodes_free(root, gc_nodes, gc_count);
+			goto out_unlock;
 		}
 
 		if (gc_count >= ARRAY_SIZE(gc_nodes))
@@ -394,9 +380,6 @@ insert_tree(struct net *net,
 		goto restart;
 	}
 
-	if (node_found)
-		goto out_unlock;
-
 	/* expected case: match, insert new node */
 	rbconn = kmem_cache_alloc(conncount_rb_cachep, GFP_ATOMIC);
 	if (rbconn == NULL)
@@ -431,7 +414,6 @@ count_tree(struct net *net,
 	   const struct nf_conntrack_tuple *tuple,
 	   const struct nf_conntrack_zone *zone)
 {
-	enum nf_conncount_list_add ret;
 	struct rb_root *root;
 	struct rb_node *parent;
 	struct nf_conncount_rb *rbconn;
@@ -444,7 +426,6 @@ count_tree(struct net *net,
 	parent = rcu_dereference_raw(root->rb_node);
 	while (parent) {
 		int diff;
-		bool addit;
 
 		rbconn = rb_entry(parent, struct nf_conncount_rb, node);
 
@@ -454,24 +435,29 @@ count_tree(struct net *net,
 		} else if (diff > 0) {
 			parent = rcu_dereference_raw(parent->rb_right);
 		} else {
-			/* same source network -> be counted! */
-			nf_conncount_lookup(net, &rbconn->list, tuple, zone,
-					    &addit);
+			int ret;
 
-			if (!addit)
+			if (!tuple) {
+				nf_conncount_gc_list(net, &rbconn->list);
 				return rbconn->list.count;
+			}
 
-			ret = nf_conncount_add(&rbconn->list, tuple, zone);
-			if (ret == NF_CONNCOUNT_ERR) {
-				return 0; /* hotdrop */
-			} else if (ret == NF_CONNCOUNT_ADDED) {
-				return rbconn->list.count;
-			} else {
-				/* NF_CONNCOUNT_SKIP, rbconn is already
-				 * reclaimed by gc, insert a new tree node
-				 */
+			spin_lock_bh(&rbconn->list.list_lock);
+			/* Node might be about to be free'd.
+			 * We need to defer to insert_tree() in this case.
+			 */
+			if (rbconn->list.count == 0) {
+				spin_unlock_bh(&rbconn->list.list_lock);
 				break;
 			}
+
+			/* same source network -> be counted! */
+			ret = __nf_conncount_add(net, &rbconn->list, tuple, zone);
+			spin_unlock_bh(&rbconn->list.list_lock);
+			if (ret)
+				return 0; /* hotdrop */
+			else
+				return rbconn->list.count;
 		}
 	}
 
diff --git a/net/netfilter/nft_connlimit.c b/net/netfilter/nft_connlimit.c
index b90d96ba4a12..af1497ab9464 100644
--- a/net/netfilter/nft_connlimit.c
+++ b/net/netfilter/nft_connlimit.c
@@ -30,7 +30,6 @@ static inline void nft_connlimit_do_eval(struct nft_connlimit *priv,
 	enum ip_conntrack_info ctinfo;
 	const struct nf_conn *ct;
 	unsigned int count;
-	bool addit;
 
 	tuple_ptr = &tuple;
 
@@ -44,19 +43,12 @@ static inline void nft_connlimit_do_eval(struct nft_connlimit *priv,
 		return;
 	}
 
-	nf_conncount_lookup(nft_net(pkt), &priv->list, tuple_ptr, zone,
-			    &addit);
-	count = priv->list.count;
-
-	if (!addit)
-		goto out;
-
-	if (nf_conncount_add(&priv->list, tuple_ptr, zone) == NF_CONNCOUNT_ERR) {
+	if (nf_conncount_add(nft_net(pkt), &priv->list, tuple_ptr, zone)) {
 		regs->verdict.code = NF_DROP;
 		return;
 	}
-	count++;
-out:
+
+	count = priv->list.count;
 
 	if ((count > priv->limit) ^ priv->invert) {
 		regs->verdict.code = NFT_BREAK;
-- 
2.11.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 7/9] netfilter: nf_conncount: move all list iterations under spinlock
  2018-12-29 12:57 [PATCH 0/9] Netfilter fixes for net Pablo Neira Ayuso
                   ` (5 preceding siblings ...)
  2018-12-29 12:58 ` [PATCH 6/9] netfilter: nf_conncount: merge lookup and add functions Pablo Neira Ayuso
@ 2018-12-29 12:58 ` Pablo Neira Ayuso
  2018-12-29 12:58 ` [PATCH 8/9] netfilter: nf_conncount: speculative garbage collection on empty lists Pablo Neira Ayuso
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Pablo Neira Ayuso @ 2018-12-29 12:58 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Two CPUs may race to remove a connection from the list, the existing
conn->dead will result in a use-after-free. Use the per-list spinlock to
protect list iterations.

As all accesses to the list now happen while holding the per-list lock,
we no longer need to delay free operations with rcu.

Joint work with Florian.

Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_conncount.c | 46 +++++++++++++++++++-------------------------
 1 file changed, 20 insertions(+), 26 deletions(-)

diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index ce7f7d1212a6..d0fd195b19a8 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -43,8 +43,6 @@ struct nf_conncount_tuple {
 	struct nf_conntrack_zone	zone;
 	int				cpu;
 	u32				jiffies32;
-	bool				dead;
-	struct rcu_head			rcu_head;
 };
 
 struct nf_conncount_rb {
@@ -83,36 +81,21 @@ static int key_diff(const u32 *a, const u32 *b, unsigned int klen)
 	return memcmp(a, b, klen * sizeof(u32));
 }
 
-static void __conn_free(struct rcu_head *h)
-{
-	struct nf_conncount_tuple *conn;
-
-	conn = container_of(h, struct nf_conncount_tuple, rcu_head);
-	kmem_cache_free(conncount_conn_cachep, conn);
-}
-
 static bool conn_free(struct nf_conncount_list *list,
 		      struct nf_conncount_tuple *conn)
 {
 	bool free_entry = false;
 
-	spin_lock_bh(&list->list_lock);
-
-	if (conn->dead) {
-		spin_unlock_bh(&list->list_lock);
-		return free_entry;
-	}
+	lockdep_assert_held(&list->list_lock);
 
 	list->count--;
-	conn->dead = true;
-	list_del_rcu(&conn->node);
+	list_del(&conn->node);
 	if (list->count == 0) {
 		list->dead = true;
 		free_entry = true;
 	}
 
-	spin_unlock_bh(&list->list_lock);
-	call_rcu(&conn->rcu_head, __conn_free);
+	kmem_cache_free(conncount_conn_cachep, conn);
 	return free_entry;
 }
 
@@ -242,7 +225,7 @@ void nf_conncount_list_init(struct nf_conncount_list *list)
 }
 EXPORT_SYMBOL_GPL(nf_conncount_list_init);
 
-/* Return true if the list is empty */
+/* Return true if the list is empty. Must be called with BH disabled. */
 bool nf_conncount_gc_list(struct net *net,
 			  struct nf_conncount_list *list)
 {
@@ -253,12 +236,18 @@ bool nf_conncount_gc_list(struct net *net,
 	bool free_entry = false;
 	bool ret = false;
 
+	/* don't bother if other cpu is already doing GC */
+	if (!spin_trylock(&list->list_lock))
+		return false;
+
 	list_for_each_entry_safe(conn, conn_n, &list->head, node) {
 		found = find_or_evict(net, list, conn, &free_entry);
 		if (IS_ERR(found)) {
 			if (PTR_ERR(found) == -ENOENT)  {
-				if (free_entry)
+				if (free_entry) {
+					spin_unlock(&list->list_lock);
 					return true;
+				}
 				collected++;
 			}
 			continue;
@@ -271,23 +260,24 @@ bool nf_conncount_gc_list(struct net *net,
 			 * closed already -> ditch it
 			 */
 			nf_ct_put(found_ct);
-			if (conn_free(list, conn))
+			if (conn_free(list, conn)) {
+				spin_unlock(&list->list_lock);
 				return true;
+			}
 			collected++;
 			continue;
 		}
 
 		nf_ct_put(found_ct);
 		if (collected > CONNCOUNT_GC_MAX_NODES)
-			return false;
+			break;
 	}
 
-	spin_lock_bh(&list->list_lock);
 	if (!list->count) {
 		list->dead = true;
 		ret = true;
 	}
-	spin_unlock_bh(&list->list_lock);
+	spin_unlock(&list->list_lock);
 
 	return ret;
 }
@@ -478,6 +468,7 @@ static void tree_gc_worker(struct work_struct *work)
 	tree = data->gc_tree % CONNCOUNT_SLOTS;
 	root = &data->root[tree];
 
+	local_bh_disable();
 	rcu_read_lock();
 	for (node = rb_first(root); node != NULL; node = rb_next(node)) {
 		rbconn = rb_entry(node, struct nf_conncount_rb, node);
@@ -485,6 +476,9 @@ static void tree_gc_worker(struct work_struct *work)
 			gc_count++;
 	}
 	rcu_read_unlock();
+	local_bh_enable();
+
+	cond_resched();
 
 	spin_lock_bh(&nf_conncount_locks[tree]);
 	if (gc_count < ARRAY_SIZE(gc_nodes))
-- 
2.11.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 8/9] netfilter: nf_conncount: speculative garbage collection on empty lists
  2018-12-29 12:57 [PATCH 0/9] Netfilter fixes for net Pablo Neira Ayuso
                   ` (6 preceding siblings ...)
  2018-12-29 12:58 ` [PATCH 7/9] netfilter: nf_conncount: move all list iterations under spinlock Pablo Neira Ayuso
@ 2018-12-29 12:58 ` Pablo Neira Ayuso
  2018-12-29 12:58 ` [PATCH 9/9] netfilter: nf_conncount: fix argument order to find_next_bit Pablo Neira Ayuso
  2018-12-29 22:33 ` [PATCH 0/9] Netfilter fixes for net David Miller
  9 siblings, 0 replies; 11+ messages in thread
From: Pablo Neira Ayuso @ 2018-12-29 12:58 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Instead of removing a empty list node that might be reintroduced soon
thereafter, tentatively place the empty list node on the list passed to
tree_nodes_free(), then re-check if the list is empty again before erasing
it from the tree.

[ Florian: rebase on top of pending nf_conncount fixes ]

Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_conntrack_count.h |  1 -
 net/netfilter/nf_conncount.c               | 47 ++++++++++--------------------
 2 files changed, 15 insertions(+), 33 deletions(-)

diff --git a/include/net/netfilter/nf_conntrack_count.h b/include/net/netfilter/nf_conntrack_count.h
index aa66775c15f4..f32fc8289473 100644
--- a/include/net/netfilter/nf_conntrack_count.h
+++ b/include/net/netfilter/nf_conntrack_count.h
@@ -9,7 +9,6 @@ struct nf_conncount_list {
 	spinlock_t list_lock;
 	struct list_head head;	/* connections with the same filtering key */
 	unsigned int count;	/* length of list */
-	bool dead;
 };
 
 struct nf_conncount_data *nf_conncount_init(struct net *net, unsigned int family,
diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index d0fd195b19a8..f0b05dfebc6e 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -81,27 +81,20 @@ static int key_diff(const u32 *a, const u32 *b, unsigned int klen)
 	return memcmp(a, b, klen * sizeof(u32));
 }
 
-static bool conn_free(struct nf_conncount_list *list,
+static void conn_free(struct nf_conncount_list *list,
 		      struct nf_conncount_tuple *conn)
 {
-	bool free_entry = false;
-
 	lockdep_assert_held(&list->list_lock);
 
 	list->count--;
 	list_del(&conn->node);
-	if (list->count == 0) {
-		list->dead = true;
-		free_entry = true;
-	}
 
 	kmem_cache_free(conncount_conn_cachep, conn);
-	return free_entry;
 }
 
 static const struct nf_conntrack_tuple_hash *
 find_or_evict(struct net *net, struct nf_conncount_list *list,
-	      struct nf_conncount_tuple *conn, bool *free_entry)
+	      struct nf_conncount_tuple *conn)
 {
 	const struct nf_conntrack_tuple_hash *found;
 	unsigned long a, b;
@@ -121,7 +114,7 @@ find_or_evict(struct net *net, struct nf_conncount_list *list,
 	 */
 	age = a - b;
 	if (conn->cpu == cpu || age >= 2) {
-		*free_entry = conn_free(list, conn);
+		conn_free(list, conn);
 		return ERR_PTR(-ENOENT);
 	}
 
@@ -137,14 +130,13 @@ static int __nf_conncount_add(struct net *net,
 	struct nf_conncount_tuple *conn, *conn_n;
 	struct nf_conn *found_ct;
 	unsigned int collect = 0;
-	bool free_entry = false;
 
 	/* check the saved connections */
 	list_for_each_entry_safe(conn, conn_n, &list->head, node) {
 		if (collect > CONNCOUNT_GC_MAX_NODES)
 			break;
 
-		found = find_or_evict(net, list, conn, &free_entry);
+		found = find_or_evict(net, list, conn);
 		if (IS_ERR(found)) {
 			/* Not found, but might be about to be confirmed */
 			if (PTR_ERR(found) == -EAGAIN) {
@@ -221,7 +213,6 @@ void nf_conncount_list_init(struct nf_conncount_list *list)
 	spin_lock_init(&list->list_lock);
 	INIT_LIST_HEAD(&list->head);
 	list->count = 0;
-	list->dead = false;
 }
 EXPORT_SYMBOL_GPL(nf_conncount_list_init);
 
@@ -233,7 +224,6 @@ bool nf_conncount_gc_list(struct net *net,
 	struct nf_conncount_tuple *conn, *conn_n;
 	struct nf_conn *found_ct;
 	unsigned int collected = 0;
-	bool free_entry = false;
 	bool ret = false;
 
 	/* don't bother if other cpu is already doing GC */
@@ -241,15 +231,10 @@ bool nf_conncount_gc_list(struct net *net,
 		return false;
 
 	list_for_each_entry_safe(conn, conn_n, &list->head, node) {
-		found = find_or_evict(net, list, conn, &free_entry);
+		found = find_or_evict(net, list, conn);
 		if (IS_ERR(found)) {
-			if (PTR_ERR(found) == -ENOENT)  {
-				if (free_entry) {
-					spin_unlock(&list->list_lock);
-					return true;
-				}
+			if (PTR_ERR(found) == -ENOENT)
 				collected++;
-			}
 			continue;
 		}
 
@@ -260,10 +245,7 @@ bool nf_conncount_gc_list(struct net *net,
 			 * closed already -> ditch it
 			 */
 			nf_ct_put(found_ct);
-			if (conn_free(list, conn)) {
-				spin_unlock(&list->list_lock);
-				return true;
-			}
+			conn_free(list, conn);
 			collected++;
 			continue;
 		}
@@ -273,10 +255,8 @@ bool nf_conncount_gc_list(struct net *net,
 			break;
 	}
 
-	if (!list->count) {
-		list->dead = true;
+	if (!list->count)
 		ret = true;
-	}
 	spin_unlock(&list->list_lock);
 
 	return ret;
@@ -291,6 +271,7 @@ static void __tree_nodes_free(struct rcu_head *h)
 	kmem_cache_free(conncount_rb_cachep, rbconn);
 }
 
+/* caller must hold tree nf_conncount_locks[] lock */
 static void tree_nodes_free(struct rb_root *root,
 			    struct nf_conncount_rb *gc_nodes[],
 			    unsigned int gc_count)
@@ -300,8 +281,10 @@ static void tree_nodes_free(struct rb_root *root,
 	while (gc_count) {
 		rbconn = gc_nodes[--gc_count];
 		spin_lock(&rbconn->list.list_lock);
-		rb_erase(&rbconn->node, root);
-		call_rcu(&rbconn->rcu_head, __tree_nodes_free);
+		if (!rbconn->list.count) {
+			rb_erase(&rbconn->node, root);
+			call_rcu(&rbconn->rcu_head, __tree_nodes_free);
+		}
 		spin_unlock(&rbconn->list.list_lock);
 	}
 }
@@ -318,7 +301,6 @@ insert_tree(struct net *net,
 	    struct rb_root *root,
 	    unsigned int hash,
 	    const u32 *key,
-	    u8 keylen,
 	    const struct nf_conntrack_tuple *tuple,
 	    const struct nf_conntrack_zone *zone)
 {
@@ -327,6 +309,7 @@ insert_tree(struct net *net,
 	struct nf_conncount_rb *rbconn;
 	struct nf_conncount_tuple *conn;
 	unsigned int count = 0, gc_count = 0;
+	u8 keylen = data->keylen;
 	bool do_gc = true;
 
 	spin_lock_bh(&nf_conncount_locks[hash]);
@@ -454,7 +437,7 @@ count_tree(struct net *net,
 	if (!tuple)
 		return 0;
 
-	return insert_tree(net, data, root, hash, key, keylen, tuple, zone);
+	return insert_tree(net, data, root, hash, key, tuple, zone);
 }
 
 static void tree_gc_worker(struct work_struct *work)
-- 
2.11.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* [PATCH 9/9] netfilter: nf_conncount: fix argument order to find_next_bit
  2018-12-29 12:57 [PATCH 0/9] Netfilter fixes for net Pablo Neira Ayuso
                   ` (7 preceding siblings ...)
  2018-12-29 12:58 ` [PATCH 8/9] netfilter: nf_conncount: speculative garbage collection on empty lists Pablo Neira Ayuso
@ 2018-12-29 12:58 ` Pablo Neira Ayuso
  2018-12-29 22:33 ` [PATCH 0/9] Netfilter fixes for net David Miller
  9 siblings, 0 replies; 11+ messages in thread
From: Pablo Neira Ayuso @ 2018-12-29 12:58 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

From: Florian Westphal <fw@strlen.de>

Size and 'next bit' were swapped, this bug could cause worker to
reschedule itself even if system was idle.

Fixes: 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc worker, and RCU for init tree search")
Reviewed-by: Shawn Bohrer <sbohrer@cloudflare.com>
Signed-off-by: Florian Westphal <fw@strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nf_conncount.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c
index f0b05dfebc6e..7554c56b2e63 100644
--- a/net/netfilter/nf_conncount.c
+++ b/net/netfilter/nf_conncount.c
@@ -488,7 +488,7 @@ static void tree_gc_worker(struct work_struct *work)
 	clear_bit(tree, data->pending_trees);
 
 	next_tree = (tree + 1) % CONNCOUNT_SLOTS;
-	next_tree = find_next_bit(data->pending_trees, next_tree, CONNCOUNT_SLOTS);
+	next_tree = find_next_bit(data->pending_trees, CONNCOUNT_SLOTS, next_tree);
 
 	if (next_tree < CONNCOUNT_SLOTS) {
 		data->gc_tree = next_tree;
-- 
2.11.0

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 0/9] Netfilter fixes for net
  2018-12-29 12:57 [PATCH 0/9] Netfilter fixes for net Pablo Neira Ayuso
                   ` (8 preceding siblings ...)
  2018-12-29 12:58 ` [PATCH 9/9] netfilter: nf_conncount: fix argument order to find_next_bit Pablo Neira Ayuso
@ 2018-12-29 22:33 ` David Miller
  9 siblings, 0 replies; 11+ messages in thread
From: David Miller @ 2018-12-29 22:33 UTC (permalink / raw)
  To: pablo; +Cc: netfilter-devel, netdev

From: Pablo Neira Ayuso <pablo@netfilter.org>
Date: Sat, 29 Dec 2018 13:57:54 +0100

> The following patchset contains Netfilter fixes for net, specifically
> fixes for the nf_conncount infrastructure which is causing troubles
> since 5c789e131cbb9 ("netfilter: nf_conncount: Add list lock and gc
> worker, and RCU for init tree search"). Patches aim to simplify this
> infrastructure while fixing up the problems:
 ...
> And one patch that falls within the miscelanea category in this batch:
> 
> 9) Missing error path for nla_nest_start(), from Kangjie Lu.
> 
> You can pull these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf.git

Pulled, thank you.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, back to index

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-29 12:57 [PATCH 0/9] Netfilter fixes for net Pablo Neira Ayuso
2018-12-29 12:57 ` [PATCH 1/9] netfilter: nf_tables: fix a missing check of nla_put_failure Pablo Neira Ayuso
2018-12-29 12:57 ` [PATCH 2/9] netfilter: nf_conncount: replace CONNCOUNT_LOCK_SLOTS with CONNCOUNT_SLOTS Pablo Neira Ayuso
2018-12-29 12:57 ` [PATCH 3/9] netfilter: nf_conncount: don't skip eviction when age is negative Pablo Neira Ayuso
2018-12-29 12:57 ` [PATCH 4/9] netfilter: nf_conncount: split gc in two phases Pablo Neira Ayuso
2018-12-29 12:57 ` [PATCH 5/9] netfilter: nf_conncount: restart search when nodes have been erased Pablo Neira Ayuso
2018-12-29 12:58 ` [PATCH 6/9] netfilter: nf_conncount: merge lookup and add functions Pablo Neira Ayuso
2018-12-29 12:58 ` [PATCH 7/9] netfilter: nf_conncount: move all list iterations under spinlock Pablo Neira Ayuso
2018-12-29 12:58 ` [PATCH 8/9] netfilter: nf_conncount: speculative garbage collection on empty lists Pablo Neira Ayuso
2018-12-29 12:58 ` [PATCH 9/9] netfilter: nf_conncount: fix argument order to find_next_bit Pablo Neira Ayuso
2018-12-29 22:33 ` [PATCH 0/9] Netfilter fixes for net David Miller

Netfilter-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/netfilter-devel/0 netfilter-devel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 netfilter-devel netfilter-devel/ https://lore.kernel.org/netfilter-devel \
		netfilter-devel@vger.kernel.org
	public-inbox-index netfilter-devel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.netfilter-devel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git