netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [iptables PATCH 0/3] nft: Fix transaction refreshing
@ 2020-10-05 14:48 Phil Sutter
  2020-10-05 14:48 ` [iptables PATCH 1/3] nft: Make batch_add_chain() return the added batch object Phil Sutter
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Phil Sutter @ 2020-10-05 14:48 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

With iptables-nft-restore in --noflush mode, the created batch job list
may need to be adjusted to a changing ruleset in kernel. In particular,
an input line like ':FOO - [0:0]' either means "flush chain FOO" or
"create chain FOO" depending on whether it exists already or not. Patch
3 contains a test case provoking this peculiar situation and fixes the
transaction prepare and refresh logic in that case. Patch 1 is a simple
preparation change, patch 2 a somewhat related fix for error reporting
with refreshed transactions.

Phil Sutter (3):
  nft: Make batch_add_chain() return the added batch object
  nft: Fix error reporting for refreshed transactions
  nft: Fix for concurrent noflush restore calls

 iptables/nft.c                                | 96 ++++++++++---------
 .../ipt-restore/0016-concurrent-restores_0    | 53 ++++++++++
 2 files changed, 102 insertions(+), 47 deletions(-)
 create mode 100755 iptables/tests/shell/testcases/ipt-restore/0016-concurrent-restores_0

-- 
2.28.0


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [iptables PATCH 1/3] nft: Make batch_add_chain() return the added batch object
  2020-10-05 14:48 [iptables PATCH 0/3] nft: Fix transaction refreshing Phil Sutter
@ 2020-10-05 14:48 ` Phil Sutter
  2020-10-05 21:07   ` Florian Westphal
  2020-10-05 14:48 ` [iptables PATCH 2/3] nft: Fix error reporting for refreshed transactions Phil Sutter
  2020-10-05 14:48 ` [iptables PATCH 3/3] nft: Fix for concurrent noflush restore calls Phil Sutter
  2 siblings, 1 reply; 13+ messages in thread
From: Phil Sutter @ 2020-10-05 14:48 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

Do this so in a later patch the 'skip' field can be adjusted.

While being at it, simplify a few callers and eliminate the need for a
'ret' variable.

Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 iptables/nft.c | 35 +++++++++++++++++------------------
 1 file changed, 17 insertions(+), 18 deletions(-)

diff --git a/iptables/nft.c b/iptables/nft.c
index 775ace385c184..09421cf4eaaec 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -388,10 +388,11 @@ batch_set_add(struct nft_handle *h, enum obj_update_type type,
 	return batch_add(h, type, s);
 }
 
-static int batch_chain_add(struct nft_handle *h, enum obj_update_type type,
+static struct obj_update *
+batch_chain_add(struct nft_handle *h, enum obj_update_type type,
 			   struct nftnl_chain *c)
 {
-	return batch_add(h, type, c) ? 0 : -1;
+	return batch_add(h, type, c);
 }
 
 static struct obj_update *
@@ -905,7 +906,6 @@ int nft_chain_set(struct nft_handle *h, const char *table,
 		  const struct xt_counters *counters)
 {
 	struct nftnl_chain *c = NULL;
-	int ret;
 
 	nft_fn = nft_chain_set;
 
@@ -932,10 +932,11 @@ int nft_chain_set(struct nft_handle *h, const char *table,
 	if (c == NULL)
 		return 0;
 
-	ret = batch_chain_add(h, NFT_COMPAT_CHAIN_UPDATE, c);
+	if (!batch_chain_add(h, NFT_COMPAT_CHAIN_UPDATE, c))
+		return 0;
 
 	/* the core expects 1 for success and 0 for error */
-	return ret == 0 ? 1 : 0;
+	return 1;
 }
 
 static int __add_match(struct nftnl_expr *e, struct xt_entry_match *m)
@@ -1725,7 +1726,6 @@ int nft_chain_user_add(struct nft_handle *h, const char *chain, const char *tabl
 {
 	struct nftnl_chain_list *list;
 	struct nftnl_chain *c;
-	int ret;
 
 	nft_fn = nft_chain_user_add;
 
@@ -1745,14 +1745,15 @@ int nft_chain_user_add(struct nft_handle *h, const char *chain, const char *tabl
 	if (h->family == NFPROTO_BRIDGE)
 		nftnl_chain_set_u32(c, NFTNL_CHAIN_POLICY, NF_ACCEPT);
 
-	ret = batch_chain_add(h, NFT_COMPAT_CHAIN_USER_ADD, c);
+	if (!batch_chain_add(h, NFT_COMPAT_CHAIN_USER_ADD, c))
+		return 0;
 
 	list = nft_chain_list_get(h, table, chain);
 	if (list)
 		nftnl_chain_list_add(c, list);
 
 	/* the core expects 1 for success and 0 for error */
-	return ret == 0 ? 1 : 0;
+	return 1;
 }
 
 int nft_chain_restore(struct nft_handle *h, const char *chain, const char *table)
@@ -1760,7 +1761,6 @@ int nft_chain_restore(struct nft_handle *h, const char *chain, const char *table
 	struct nftnl_chain_list *list;
 	struct nftnl_chain *c;
 	bool created = false;
-	int ret;
 
 	nft_xt_builtin_init(h, table);
 
@@ -1787,14 +1787,15 @@ int nft_chain_restore(struct nft_handle *h, const char *chain, const char *table
 	if (!created)
 		return 1;
 
-	ret = batch_chain_add(h, NFT_COMPAT_CHAIN_USER_ADD, c);
+	if (!batch_chain_add(h, NFT_COMPAT_CHAIN_USER_ADD, c))
+		return 0;
 
 	list = nft_chain_list_get(h, table, chain);
 	if (list)
 		nftnl_chain_list_add(c, list);
 
 	/* the core expects 1 for success and 0 for error */
-	return ret == 0 ? 1 : 0;
+	return 1;
 }
 
 /* From linux/netlink.h */
@@ -1812,7 +1813,6 @@ static int __nft_chain_user_del(struct nftnl_chain *c, void *data)
 {
 	struct chain_user_del_data *d = data;
 	struct nft_handle *h = d->handle;
-	int ret;
 
 	/* don't delete built-in chain */
 	if (nft_chain_builtin(c))
@@ -1824,8 +1824,7 @@ static int __nft_chain_user_del(struct nftnl_chain *c, void *data)
 
 	/* XXX This triggers a fast lookup from the kernel. */
 	nftnl_chain_unset(c, NFTNL_CHAIN_HANDLE);
-	ret = batch_chain_add(h, NFT_COMPAT_CHAIN_USER_DEL, c);
-	if (ret)
+	if (!batch_chain_add(h, NFT_COMPAT_CHAIN_USER_DEL, c))
 		return -1;
 
 	nftnl_chain_list_del(c);
@@ -1900,7 +1899,6 @@ int nft_chain_user_rename(struct nft_handle *h,const char *chain,
 {
 	struct nftnl_chain *c;
 	uint64_t handle;
-	int ret;
 
 	nft_fn = nft_chain_user_rename;
 
@@ -1929,10 +1927,11 @@ int nft_chain_user_rename(struct nft_handle *h,const char *chain,
 	nftnl_chain_set_str(c, NFTNL_CHAIN_NAME, newname);
 	nftnl_chain_set_u64(c, NFTNL_CHAIN_HANDLE, handle);
 
-	ret = batch_chain_add(h, NFT_COMPAT_CHAIN_RENAME, c);
+	if (!batch_chain_add(h, NFT_COMPAT_CHAIN_RENAME, c))
+		return 0;
 
 	/* the core expects 1 for success and 0 for error */
-	return ret == 0 ? 1 : 0;
+	return 1;
 }
 
 bool nft_table_find(struct nft_handle *h, const char *tablename)
@@ -3296,7 +3295,7 @@ static int __nft_chain_zero_counters(struct nftnl_chain *c, void *data)
 		nftnl_chain_set_u64(c, NFTNL_CHAIN_PACKETS, 0);
 		nftnl_chain_set_u64(c, NFTNL_CHAIN_BYTES, 0);
 		nftnl_chain_unset(c, NFTNL_CHAIN_HANDLE);
-		if (batch_chain_add(h, NFT_COMPAT_CHAIN_ZERO, c))
+		if (!batch_chain_add(h, NFT_COMPAT_CHAIN_ZERO, c))
 			return -1;
 	}
 
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [iptables PATCH 2/3] nft: Fix error reporting for refreshed transactions
  2020-10-05 14:48 [iptables PATCH 0/3] nft: Fix transaction refreshing Phil Sutter
  2020-10-05 14:48 ` [iptables PATCH 1/3] nft: Make batch_add_chain() return the added batch object Phil Sutter
@ 2020-10-05 14:48 ` Phil Sutter
  2020-10-05 21:13   ` Florian Westphal
  2020-10-05 14:48 ` [iptables PATCH 3/3] nft: Fix for concurrent noflush restore calls Phil Sutter
  2 siblings, 1 reply; 13+ messages in thread
From: Phil Sutter @ 2020-10-05 14:48 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

When preparing a batch from the list of batch objects in nft_action(),
the sequence number used for each object is stored within that object
for later matching against returned error messages. Though if the
transaction has to be refreshed, some of those objects may be skipped,
other objects take over their sequence number and errors are matched to
skipped objects. Avoid this by resetting the skipped object's sequence
number to zero.

Fixes: 58d7de0181f61 ("xtables: handle concurrent ruleset modifications")
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 iptables/nft.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/iptables/nft.c b/iptables/nft.c
index 09421cf4eaaec..70be9ba908edc 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -2729,9 +2729,10 @@ retry:
 	h->nft_genid++;
 
 	list_for_each_entry(n, &h->obj_list, head) {
-
-		if (n->skip)
+		if (n->skip) {
+			n->seq = 0;
 			continue;
+		}
 
 		n->seq = seq++;
 		switch (n->type) {
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [iptables PATCH 3/3] nft: Fix for concurrent noflush restore calls
  2020-10-05 14:48 [iptables PATCH 0/3] nft: Fix transaction refreshing Phil Sutter
  2020-10-05 14:48 ` [iptables PATCH 1/3] nft: Make batch_add_chain() return the added batch object Phil Sutter
  2020-10-05 14:48 ` [iptables PATCH 2/3] nft: Fix error reporting for refreshed transactions Phil Sutter
@ 2020-10-05 14:48 ` Phil Sutter
  2020-10-12 12:54   ` Pablo Neira Ayuso
  2 siblings, 1 reply; 13+ messages in thread
From: Phil Sutter @ 2020-10-05 14:48 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

Transaction refresh was broken with regards to nft_chain_restore(): It
created a rule flush batch object only if the chain was found in cache
and a chain add object only if the chain was not found. Yet with
concurrent ruleset updates, one has to expect both situations:

* If a chain vanishes, the rule flush job must be skipped and instead
  the chain add job become active.

* If a chain appears, the chain add job must be skipped and instead
  rules flushed.

Change the code accordingly: Create both batch objects and set their
'skip' field depending on the situation in cache and adjust both in
nft_refresh_transaction().

As a side-effect, the implicit rule flush becomes explicit and all
handling of implicit batch jobs is dropped along with the related field
indicating such.

Reuse the 'implicit' parameter of __nft_rule_flush() to control the
initial 'skip' field value instead.

A subtle caveat is vanishing of existing chains: Creating the chain add
job based on the chain in cache causes a netlink message containing that
chain's handle which the kernel dislikes. Therefore unset the chain's
handle in that case.

Fixes: 58d7de0181f61 ("xtables: handle concurrent ruleset modifications")
Signed-off-by: Phil Sutter <phil@nwl.cc>
---
 iptables/nft.c                                | 58 ++++++++++---------
 .../ipt-restore/0016-concurrent-restores_0    | 53 +++++++++++++++++
 2 files changed, 83 insertions(+), 28 deletions(-)
 create mode 100755 iptables/tests/shell/testcases/ipt-restore/0016-concurrent-restores_0

diff --git a/iptables/nft.c b/iptables/nft.c
index 70be9ba908edc..9424c181d17cc 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -265,7 +265,6 @@ struct obj_update {
 	struct list_head	head;
 	enum obj_update_type	type:8;
 	uint8_t			skip:1;
-	uint8_t			implicit:1;
 	unsigned int		seq;
 	union {
 		struct nftnl_table	*table;
@@ -1634,7 +1633,7 @@ struct nftnl_set *nft_set_batch_lookup_byid(struct nft_handle *h,
 
 static void
 __nft_rule_flush(struct nft_handle *h, const char *table,
-		 const char *chain, bool verbose, bool implicit)
+		 const char *chain, bool verbose, bool skip)
 {
 	struct obj_update *obj;
 	struct nftnl_rule *r;
@@ -1656,7 +1655,7 @@ __nft_rule_flush(struct nft_handle *h, const char *table,
 		return;
 	}
 
-	obj->implicit = implicit;
+	obj->skip = skip;
 }
 
 struct nft_rule_flush_data {
@@ -1759,19 +1758,14 @@ int nft_chain_user_add(struct nft_handle *h, const char *chain, const char *tabl
 int nft_chain_restore(struct nft_handle *h, const char *chain, const char *table)
 {
 	struct nftnl_chain_list *list;
+	struct obj_update *obj;
 	struct nftnl_chain *c;
 	bool created = false;
 
 	nft_xt_builtin_init(h, table);
 
 	c = nft_chain_find(h, table, chain);
-	if (c) {
-		/* Apparently -n still flushes existing user defined
-		 * chains that are redefined.
-		 */
-		if (h->noflush)
-			__nft_rule_flush(h, table, chain, false, true);
-	} else {
+	if (!c) {
 		c = nftnl_chain_alloc();
 		if (!c)
 			return 0;
@@ -1779,20 +1773,26 @@ int nft_chain_restore(struct nft_handle *h, const char *chain, const char *table
 		nftnl_chain_set_str(c, NFTNL_CHAIN_TABLE, table);
 		nftnl_chain_set_str(c, NFTNL_CHAIN_NAME, chain);
 		created = true;
-	}
 
-	if (h->family == NFPROTO_BRIDGE)
-		nftnl_chain_set_u32(c, NFTNL_CHAIN_POLICY, NF_ACCEPT);
+		list = nft_chain_list_get(h, table, chain);
+		if (list)
+			nftnl_chain_list_add(c, list);
+	} else {
+		/* If the chain should vanish meanwhile, kernel genid changes
+		 * and the transaction is refreshed enabling the chain add
+		 * object. With the handle still set, kernel interprets it as a
+		 * chain replace job and errors since it is not found anymore.
+		 */
+		nftnl_chain_unset(c, NFTNL_CHAIN_HANDLE);
+	}
 
-	if (!created)
-		return 1;
+	__nft_rule_flush(h, table, chain, false, created);
 
-	if (!batch_chain_add(h, NFT_COMPAT_CHAIN_USER_ADD, c))
+	obj = batch_chain_add(h, NFT_COMPAT_CHAIN_USER_ADD, c);
+	if (!obj)
 		return 0;
 
-	list = nft_chain_list_get(h, table, chain);
-	if (list)
-		nftnl_chain_list_add(c, list);
+	obj->skip = !created;
 
 	/* the core expects 1 for success and 0 for error */
 	return 1;
@@ -2649,11 +2649,6 @@ static void nft_refresh_transaction(struct nft_handle *h)
 	h->error.lineno = 0;
 
 	list_for_each_entry_safe(n, tmp, &h->obj_list, head) {
-		if (n->implicit) {
-			batch_obj_del(h, n);
-			continue;
-		}
-
 		switch (n->type) {
 		case NFT_COMPAT_TABLE_FLUSH:
 			tablename = nftnl_table_get_str(n->table, NFTNL_TABLE_NAME);
@@ -2679,14 +2674,22 @@ static void nft_refresh_transaction(struct nft_handle *h)
 
 			c = nft_chain_find(h, tablename, chainname);
 			if (c) {
-				/* -restore -n flushes existing rules from redefined user-chain */
-				__nft_rule_flush(h, tablename,
-						 chainname, false, true);
 				n->skip = 1;
 			} else if (!c) {
 				n->skip = 0;
 			}
 			break;
+		case NFT_COMPAT_RULE_FLUSH:
+			tablename = nftnl_rule_get_str(n->rule, NFTNL_RULE_TABLE);
+			if (!tablename)
+				continue;
+
+			chainname = nftnl_rule_get_str(n->rule, NFTNL_RULE_CHAIN);
+			if (!chainname)
+				continue;
+
+			n->skip = !nft_chain_find(h, tablename, chainname);
+			break;
 		case NFT_COMPAT_TABLE_ADD:
 		case NFT_COMPAT_CHAIN_ADD:
 		case NFT_COMPAT_CHAIN_ZERO:
@@ -2698,7 +2701,6 @@ static void nft_refresh_transaction(struct nft_handle *h)
 		case NFT_COMPAT_RULE_INSERT:
 		case NFT_COMPAT_RULE_REPLACE:
 		case NFT_COMPAT_RULE_DELETE:
-		case NFT_COMPAT_RULE_FLUSH:
 		case NFT_COMPAT_SET_ADD:
 		case NFT_COMPAT_RULE_LIST:
 		case NFT_COMPAT_RULE_CHECK:
diff --git a/iptables/tests/shell/testcases/ipt-restore/0016-concurrent-restores_0 b/iptables/tests/shell/testcases/ipt-restore/0016-concurrent-restores_0
new file mode 100755
index 0000000000000..53ec12fa368af
--- /dev/null
+++ b/iptables/tests/shell/testcases/ipt-restore/0016-concurrent-restores_0
@@ -0,0 +1,53 @@
+#!/bin/bash
+
+set -e
+
+RS="*filter
+:INPUT ACCEPT [12024:3123388]
+:FORWARD ACCEPT [0:0]
+:OUTPUT ACCEPT [12840:2144421]
+:FOO - [0:0]
+:BAR0 - [0:0]
+:BAR1 - [0:0]
+:BAR2 - [0:0]
+:BAR3 - [0:0]
+:BAR4 - [0:0]
+:BAR5 - [0:0]
+:BAR6 - [0:0]
+:BAR7 - [0:0]
+:BAR8 - [0:0]
+:BAR9 - [0:0]
+"
+
+RS1="$RS
+-X BAR3
+-X BAR6
+-X BAR9
+-A FOO -s 9.9.0.1/32 -j BAR1
+-A FOO -s 9.9.0.2/32 -j BAR2
+-A FOO -s 9.9.0.4/32 -j BAR4
+-A FOO -s 9.9.0.5/32 -j BAR5
+-A FOO -s 9.9.0.7/32 -j BAR7
+-A FOO -s 9.9.0.8/32 -j BAR8
+COMMIT
+"
+
+RS2="$RS
+-X BAR2
+-X BAR5
+-X BAR7
+-A FOO -s 9.9.0.1/32 -j BAR1
+-A FOO -s 9.9.0.3/32 -j BAR3
+-A FOO -s 9.9.0.4/32 -j BAR4
+-A FOO -s 9.9.0.6/32 -j BAR6
+-A FOO -s 9.9.0.8/32 -j BAR8
+-A FOO -s 9.9.0.9/32 -j BAR9
+COMMIT
+"
+
+for n in $(seq 1 10); do
+	$XT_MULTI iptables-restore --noflush -w <<< "$RS1" &
+	$XT_MULTI iptables-restore --noflush -w <<< "$RS2" &
+	wait -n
+	wait -n
+done
-- 
2.28.0


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [iptables PATCH 1/3] nft: Make batch_add_chain() return the added batch object
  2020-10-05 14:48 ` [iptables PATCH 1/3] nft: Make batch_add_chain() return the added batch object Phil Sutter
@ 2020-10-05 21:07   ` Florian Westphal
  0 siblings, 0 replies; 13+ messages in thread
From: Florian Westphal @ 2020-10-05 21:07 UTC (permalink / raw)
  To: Phil Sutter; +Cc: Pablo Neira Ayuso, netfilter-devel, Florian Westphal

Phil Sutter <phil@nwl.cc> wrote:
> Do this so in a later patch the 'skip' field can be adjusted.
> 
> While being at it, simplify a few callers and eliminate the need for a
> 'ret' variable.

Reviewed-by: Florian Westphal <fw@strlen.de>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [iptables PATCH 2/3] nft: Fix error reporting for refreshed transactions
  2020-10-05 14:48 ` [iptables PATCH 2/3] nft: Fix error reporting for refreshed transactions Phil Sutter
@ 2020-10-05 21:13   ` Florian Westphal
  0 siblings, 0 replies; 13+ messages in thread
From: Florian Westphal @ 2020-10-05 21:13 UTC (permalink / raw)
  To: Phil Sutter; +Cc: Pablo Neira Ayuso, netfilter-devel, Florian Westphal

Phil Sutter <phil@nwl.cc> wrote:
> When preparing a batch from the list of batch objects in nft_action(),
> the sequence number used for each object is stored within that object
> for later matching against returned error messages. Though if the
> transaction has to be refreshed, some of those objects may be skipped,
> other objects take over their sequence number and errors are matched to
> skipped objects. Avoid this by resetting the skipped object's sequence
> number to zero.

Reviewed-by: Florian Westphal <fw@strlen.de>

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [iptables PATCH 3/3] nft: Fix for concurrent noflush restore calls
  2020-10-05 14:48 ` [iptables PATCH 3/3] nft: Fix for concurrent noflush restore calls Phil Sutter
@ 2020-10-12 12:54   ` Pablo Neira Ayuso
  2020-10-13 10:08     ` Phil Sutter
  0 siblings, 1 reply; 13+ messages in thread
From: Pablo Neira Ayuso @ 2020-10-12 12:54 UTC (permalink / raw)
  To: Phil Sutter; +Cc: netfilter-devel, Florian Westphal

On Mon, Oct 05, 2020 at 04:48:58PM +0200, Phil Sutter wrote:
> Transaction refresh was broken with regards to nft_chain_restore(): It
> created a rule flush batch object only if the chain was found in cache
> and a chain add object only if the chain was not found. Yet with
> concurrent ruleset updates, one has to expect both situations:
> 
> * If a chain vanishes, the rule flush job must be skipped and instead
>   the chain add job become active.
> 
> * If a chain appears, the chain add job must be skipped and instead
>   rules flushed.
> 
> Change the code accordingly: Create both batch objects and set their
> 'skip' field depending on the situation in cache and adjust both in
> nft_refresh_transaction().
> 
> As a side-effect, the implicit rule flush becomes explicit and all
> handling of implicit batch jobs is dropped along with the related field
> indicating such.
> 
> Reuse the 'implicit' parameter of __nft_rule_flush() to control the
> initial 'skip' field value instead.
> 
> A subtle caveat is vanishing of existing chains: Creating the chain add
> job based on the chain in cache causes a netlink message containing that
> chain's handle which the kernel dislikes. Therefore unset the chain's
> handle in that case.
> 
> Fixes: 58d7de0181f61 ("xtables: handle concurrent ruleset modifications")
> Signed-off-by: Phil Sutter <phil@nwl.cc>
> ---
>  iptables/nft.c                                | 58 ++++++++++---------
>  .../ipt-restore/0016-concurrent-restores_0    | 53 +++++++++++++++++
>  2 files changed, 83 insertions(+), 28 deletions(-)
>  create mode 100755 iptables/tests/shell/testcases/ipt-restore/0016-concurrent-restores_0
> 
> diff --git a/iptables/nft.c b/iptables/nft.c
> index 70be9ba908edc..9424c181d17cc 100644
> --- a/iptables/nft.c
> +++ b/iptables/nft.c
> @@ -265,7 +265,6 @@ struct obj_update {
>  	struct list_head	head;
>  	enum obj_update_type	type:8;
>  	uint8_t			skip:1;
> -	uint8_t			implicit:1;
>  	unsigned int		seq;
>  	union {
>  		struct nftnl_table	*table;
> @@ -1634,7 +1633,7 @@ struct nftnl_set *nft_set_batch_lookup_byid(struct nft_handle *h,
>  
>  static void
>  __nft_rule_flush(struct nft_handle *h, const char *table,
> -		 const char *chain, bool verbose, bool implicit)
> +		 const char *chain, bool verbose, bool skip)
>  {
>  	struct obj_update *obj;
>  	struct nftnl_rule *r;
> @@ -1656,7 +1655,7 @@ __nft_rule_flush(struct nft_handle *h, const char *table,
>  		return;
>  	}
>  
> -	obj->implicit = implicit;
> +	obj->skip = skip;
>  }
>  
>  struct nft_rule_flush_data {
> @@ -1759,19 +1758,14 @@ int nft_chain_user_add(struct nft_handle *h, const char *chain, const char *tabl
>  int nft_chain_restore(struct nft_handle *h, const char *chain, const char *table)
>  {
>  	struct nftnl_chain_list *list;
> +	struct obj_update *obj;
>  	struct nftnl_chain *c;
>  	bool created = false;
>  
>  	nft_xt_builtin_init(h, table);
>  
>  	c = nft_chain_find(h, table, chain);
> -	if (c) {
> -		/* Apparently -n still flushes existing user defined
> -		 * chains that are redefined.
> -		 */
> -		if (h->noflush)
> -			__nft_rule_flush(h, table, chain, false, true);
> -	} else {
> +	if (!c) {
>  		c = nftnl_chain_alloc();
>  		if (!c)
>  			return 0;
> @@ -1779,20 +1773,26 @@ int nft_chain_restore(struct nft_handle *h, const char *chain, const char *table
>  		nftnl_chain_set_str(c, NFTNL_CHAIN_TABLE, table);
>  		nftnl_chain_set_str(c, NFTNL_CHAIN_NAME, chain);
>  		created = true;
> -	}
>  
> -	if (h->family == NFPROTO_BRIDGE)
> -		nftnl_chain_set_u32(c, NFTNL_CHAIN_POLICY, NF_ACCEPT);
> +		list = nft_chain_list_get(h, table, chain);
> +		if (list)
> +			nftnl_chain_list_add(c, list);
> +	} else {
> +		/* If the chain should vanish meanwhile, kernel genid changes
> +		 * and the transaction is refreshed enabling the chain add
> +		 * object. With the handle still set, kernel interprets it as a
> +		 * chain replace job and errors since it is not found anymore.
> +		 */
> +		nftnl_chain_unset(c, NFTNL_CHAIN_HANDLE);
> +	}
>  
> -	if (!created)
> -		return 1;
> +	__nft_rule_flush(h, table, chain, false, created);

I like this trick.

If I understood correct, you always place an "add chain" command in
first place before the flush, whether the chain exists or not, so the
flush always succeeds.

> -	if (!batch_chain_add(h, NFT_COMPAT_CHAIN_USER_ADD, c))
> +	obj = batch_chain_add(h, NFT_COMPAT_CHAIN_USER_ADD, c);
> +	if (!obj)
>  		return 0;
>  
> -	list = nft_chain_list_get(h, table, chain);
> -	if (list)
> -		nftnl_chain_list_add(c, list);
> +	obj->skip = !created;
>  
>  	/* the core expects 1 for success and 0 for error */
>  	return 1;
> @@ -2649,11 +2649,6 @@ static void nft_refresh_transaction(struct nft_handle *h)
>  	h->error.lineno = 0;
>  
>  	list_for_each_entry_safe(n, tmp, &h->obj_list, head) {
> -		if (n->implicit) {
> -			batch_obj_del(h, n);
> -			continue;
> -		}
> -
>  		switch (n->type) {
>  		case NFT_COMPAT_TABLE_FLUSH:
>  			tablename = nftnl_table_get_str(n->table, NFTNL_TABLE_NAME);
> @@ -2679,14 +2674,22 @@ static void nft_refresh_transaction(struct nft_handle *h)
>  
>  			c = nft_chain_find(h, tablename, chainname);
>  			if (c) {
> -				/* -restore -n flushes existing rules from redefined user-chain */
> -				__nft_rule_flush(h, tablename,
> -						 chainname, false, true);
>  				n->skip = 1;
>  			} else if (!c) {
>  				n->skip = 0;
>  			}
>  			break;
> +		case NFT_COMPAT_RULE_FLUSH:
> +			tablename = nftnl_rule_get_str(n->rule, NFTNL_RULE_TABLE);
> +			if (!tablename)
> +				continue;
> +
> +			chainname = nftnl_rule_get_str(n->rule, NFTNL_RULE_CHAIN);
> +			if (!chainname)
> +				continue;
> +
> +			n->skip = !nft_chain_find(h, tablename, chainname);

So n->skip equals true if the chain does not exists in the cache, so
this flush does not fail (after this chain is gone because someone
else have remove it).

Patch LGTM, thanks Phil.

What I don't clearly see yet is what scenario is triggering the bug in
the existing code, if you don't mind to explain.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [iptables PATCH 3/3] nft: Fix for concurrent noflush restore calls
  2020-10-12 12:54   ` Pablo Neira Ayuso
@ 2020-10-13 10:08     ` Phil Sutter
  2020-10-13 10:15       ` Pablo Neira Ayuso
  0 siblings, 1 reply; 13+ messages in thread
From: Phil Sutter @ 2020-10-13 10:08 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

On Mon, Oct 12, 2020 at 02:54:50PM +0200, Pablo Neira Ayuso wrote:
[...]
> > -	if (h->family == NFPROTO_BRIDGE)
> > -		nftnl_chain_set_u32(c, NFTNL_CHAIN_POLICY, NF_ACCEPT);
> > +		list = nft_chain_list_get(h, table, chain);
> > +		if (list)
> > +			nftnl_chain_list_add(c, list);
> > +	} else {
> > +		/* If the chain should vanish meanwhile, kernel genid changes
> > +		 * and the transaction is refreshed enabling the chain add
> > +		 * object. With the handle still set, kernel interprets it as a
> > +		 * chain replace job and errors since it is not found anymore.
> > +		 */
> > +		nftnl_chain_unset(c, NFTNL_CHAIN_HANDLE);
> > +	}
> >  
> > -	if (!created)
> > -		return 1;
> > +	__nft_rule_flush(h, table, chain, false, created);
> 
> I like this trick.
> 
> If I understood correct, you always place an "add chain" command in
> first place before the flush, whether the chain exists or not, so the
> flush always succeeds.

Yes, this was already done for "add table". So I'm "merely" extending
Florian's refresh transaction logic with regards to 'skip' flag.

> > -	if (!batch_chain_add(h, NFT_COMPAT_CHAIN_USER_ADD, c))
> > +	obj = batch_chain_add(h, NFT_COMPAT_CHAIN_USER_ADD, c);
> > +	if (!obj)
> >  		return 0;
> >  
> > -	list = nft_chain_list_get(h, table, chain);
> > -	if (list)
> > -		nftnl_chain_list_add(c, list);
> > +	obj->skip = !created;
> >  
> >  	/* the core expects 1 for success and 0 for error */
> >  	return 1;
> > @@ -2649,11 +2649,6 @@ static void nft_refresh_transaction(struct nft_handle *h)
> >  	h->error.lineno = 0;
> >  
> >  	list_for_each_entry_safe(n, tmp, &h->obj_list, head) {
> > -		if (n->implicit) {
> > -			batch_obj_del(h, n);
> > -			continue;
> > -		}
> > -
> >  		switch (n->type) {
> >  		case NFT_COMPAT_TABLE_FLUSH:
> >  			tablename = nftnl_table_get_str(n->table, NFTNL_TABLE_NAME);
> > @@ -2679,14 +2674,22 @@ static void nft_refresh_transaction(struct nft_handle *h)
> >  
> >  			c = nft_chain_find(h, tablename, chainname);
> >  			if (c) {
> > -				/* -restore -n flushes existing rules from redefined user-chain */
> > -				__nft_rule_flush(h, tablename,
> > -						 chainname, false, true);
> >  				n->skip = 1;
> >  			} else if (!c) {
> >  				n->skip = 0;
> >  			}
> >  			break;
> > +		case NFT_COMPAT_RULE_FLUSH:
> > +			tablename = nftnl_rule_get_str(n->rule, NFTNL_RULE_TABLE);
> > +			if (!tablename)
> > +				continue;
> > +
> > +			chainname = nftnl_rule_get_str(n->rule, NFTNL_RULE_CHAIN);
> > +			if (!chainname)
> > +				continue;
> > +
> > +			n->skip = !nft_chain_find(h, tablename, chainname);
> 
> So n->skip equals true if the chain does not exists in the cache, so
> this flush does not fail (after this chain is gone because someone
> else have remove it).

Yes, the tricky bit is being able to adjust the batch object list no
matter how kernel ruleset changes. The only way is to assume the most
overhead and temporary disable individual jobs if not needed. This way
we can enable/disable while refreshing the transaction.

> Patch LGTM, thanks Phil.
> 
> What I don't clearly see yet is what scenario is triggering the bug in
> the existing code, if you don't mind to explain.

See the test case attached to the patch: An other iptables-restore
process may add references (i.e., jumps) to a chain the own
iptables-restore process wants to delete. This should not be a problem
because these references are added to a chain that is being flushed by
the own process as well. But if that chain doesn't exist while the own
process fetches kernel's ruleset, this flush job is not created.

Cheers, Phil

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [iptables PATCH 3/3] nft: Fix for concurrent noflush restore calls
  2020-10-13 10:08     ` Phil Sutter
@ 2020-10-13 10:15       ` Pablo Neira Ayuso
  2020-10-14  9:46         ` Phil Sutter
  0 siblings, 1 reply; 13+ messages in thread
From: Pablo Neira Ayuso @ 2020-10-13 10:15 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel, Florian Westphal

On Tue, Oct 13, 2020 at 12:08:03PM +0200, Phil Sutter wrote:
[...]
> On Mon, Oct 12, 2020 at 02:54:50PM +0200, Pablo Neira Ayuso wrote:
> [...]
> > Patch LGTM, thanks Phil.
> > 
> > What I don't clearly see yet is what scenario is triggering the bug in
> > the existing code, if you don't mind to explain.
> 
> See the test case attached to the patch: An other iptables-restore
> process may add references (i.e., jumps) to a chain the own
> iptables-restore process wants to delete. This should not be a problem
> because these references are added to a chain that is being flushed by
> the own process as well. But if that chain doesn't exist while the own
> process fetches kernel's ruleset, this flush job is not created.

Let me rephrase this:

1) process A fetches the ruleset, finds no chain C (no flush job then)
2) process B adds new chain C, flush job is present
3) process B adds the ruleset
4) process A appends rules to the existing chain C (because there is
   no flush job)

Is this the scenario? If so, I wonder why the generation ID is not
helping to refresh and retry.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [iptables PATCH 3/3] nft: Fix for concurrent noflush restore calls
  2020-10-13 10:15       ` Pablo Neira Ayuso
@ 2020-10-14  9:46         ` Phil Sutter
  2020-10-16 15:28           ` Pablo Neira Ayuso
  0 siblings, 1 reply; 13+ messages in thread
From: Phil Sutter @ 2020-10-14  9:46 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

On Tue, Oct 13, 2020 at 12:15:02PM +0200, Pablo Neira Ayuso wrote:
> On Tue, Oct 13, 2020 at 12:08:03PM +0200, Phil Sutter wrote:
> [...]
> > On Mon, Oct 12, 2020 at 02:54:50PM +0200, Pablo Neira Ayuso wrote:
> > [...]
> > > Patch LGTM, thanks Phil.
> > > 
> > > What I don't clearly see yet is what scenario is triggering the bug in
> > > the existing code, if you don't mind to explain.
> > 
> > See the test case attached to the patch: An other iptables-restore
> > process may add references (i.e., jumps) to a chain the own
> > iptables-restore process wants to delete. This should not be a problem
> > because these references are added to a chain that is being flushed by
> > the own process as well. But if that chain doesn't exist while the own
> > process fetches kernel's ruleset, this flush job is not created.
> 
> Let me rephrase this:
> 
> 1) process A fetches the ruleset, finds no chain C (no flush job then)
> 2) process B adds new chain C, flush job is present
> 3) process B adds the ruleset
> 4) process A appends rules to the existing chain C (because there is
>    no flush job)
> 
> Is this the scenario? If so, I wonder why the generation ID is not
> helping to refresh and retry.

Not quite, let me try to put this more clearly:

* Dump A:
  | *filter
  | :FOO - [0:0] # flush chain FOO
  | -X BAR       # remove chain BAR
  | COMMIT

* Dump B:
  | *filter
  | -A FOO -j BAR # reference BAR from a rule in FOO
  | COMMIT

* Kernel ruleset:
  | *filter
  | :BAR - [0:0]
  | COMMIT

* Process A:
  * read dump A
  * fetch cache

* Process B:
  * read dump B
  * fetch ruleset
  * commit to kernel

* Process A:
  * skip flush chain FOO job: not present
  * add delete chain BAR job: chain exists
  * commit fails (genid outdated)
  * refresh transaction:
    * delete chain BAR job remains active
    * genid updated
  * commit fails: can't remove chain BAR: EBUSY

I realize the test case is not quite effective, ruleset should be
emptied upon each iteration of concurrent restore job startup.

Cheers, Phil

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [iptables PATCH 3/3] nft: Fix for concurrent noflush restore calls
  2020-10-14  9:46         ` Phil Sutter
@ 2020-10-16 15:28           ` Pablo Neira Ayuso
  2020-10-26 16:31             ` Phil Sutter
  0 siblings, 1 reply; 13+ messages in thread
From: Pablo Neira Ayuso @ 2020-10-16 15:28 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel, Florian Westphal

On Wed, Oct 14, 2020 at 11:46:40AM +0200, Phil Sutter wrote:
> On Tue, Oct 13, 2020 at 12:15:02PM +0200, Pablo Neira Ayuso wrote:
> > On Tue, Oct 13, 2020 at 12:08:03PM +0200, Phil Sutter wrote:
> > [...]
> > > On Mon, Oct 12, 2020 at 02:54:50PM +0200, Pablo Neira Ayuso wrote:
> > > [...]
> > > > Patch LGTM, thanks Phil.
> > > > 
> > > > What I don't clearly see yet is what scenario is triggering the bug in
> > > > the existing code, if you don't mind to explain.
> > > 
> > > See the test case attached to the patch: An other iptables-restore
> > > process may add references (i.e., jumps) to a chain the own
> > > iptables-restore process wants to delete. This should not be a problem
> > > because these references are added to a chain that is being flushed by
> > > the own process as well. But if that chain doesn't exist while the own
> > > process fetches kernel's ruleset, this flush job is not created.
> > 
> > Let me rephrase this:
> > 
> > 1) process A fetches the ruleset, finds no chain C (no flush job then)
> > 2) process B adds new chain C, flush job is present
> > 3) process B adds the ruleset
> > 4) process A appends rules to the existing chain C (because there is
> >    no flush job)
> > 
> > Is this the scenario? If so, I wonder why the generation ID is not
> > helping to refresh and retry.
> 
> Not quite, let me try to put this more clearly:
> 
> * Dump A:
>   | *filter
>   | :FOO - [0:0] # flush chain FOO
>   | -X BAR       # remove chain BAR
>   | COMMIT
> 
> * Dump B:
>   | *filter
>   | -A FOO -j BAR # reference BAR from a rule in FOO
>   | COMMIT
> 
> * Kernel ruleset:
>   | *filter
>   | :BAR - [0:0]
>   | COMMIT
> 
> * Process A:
>   * read dump A
>   * fetch cache
> 
> * Process B:
>   * read dump B
>   * fetch ruleset
>   * commit to kernel
> 
> * Process A:
>   * skip flush chain FOO job: not present
>   * add delete chain BAR job: chain exists
>   * commit fails (genid outdated)
>   * refresh transaction:
>     * delete chain BAR job remains active
>     * genid updated
>   * commit fails: can't remove chain BAR: EBUSY

Makes sense. Thanks a lot for explaining. Probably you can include
this in the commit description for the record.

> I realize the test case is not quite effective, ruleset should be
> emptied upon each iteration of concurrent restore job startup.

Please, update the test and revamp.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [iptables PATCH 3/3] nft: Fix for concurrent noflush restore calls
  2020-10-16 15:28           ` Pablo Neira Ayuso
@ 2020-10-26 16:31             ` Phil Sutter
  2020-10-26 16:36               ` Pablo Neira Ayuso
  0 siblings, 1 reply; 13+ messages in thread
From: Phil Sutter @ 2020-10-26 16:31 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, Florian Westphal

On Fri, Oct 16, 2020 at 05:28:50PM +0200, Pablo Neira Ayuso wrote:
[...]
> Makes sense. Thanks a lot for explaining. Probably you can include
> this in the commit description for the record.
> 
> > I realize the test case is not quite effective, ruleset should be
> > emptied upon each iteration of concurrent restore job startup.
> 
> Please, update the test and revamp.

I pushed the commit already when you wrote "LGTM" in your first reply,
sorry. Yet to cover for the above, I just submitted a patch which adds a
bit of documentation to the test case (apart from improving its
effectiveness).

Cheers, Phil

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [iptables PATCH 3/3] nft: Fix for concurrent noflush restore calls
  2020-10-26 16:31             ` Phil Sutter
@ 2020-10-26 16:36               ` Pablo Neira Ayuso
  0 siblings, 0 replies; 13+ messages in thread
From: Pablo Neira Ayuso @ 2020-10-26 16:36 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel, Florian Westphal

On Mon, Oct 26, 2020 at 05:31:02PM +0100, Phil Sutter wrote:
> On Fri, Oct 16, 2020 at 05:28:50PM +0200, Pablo Neira Ayuso wrote:
> [...]
> > Makes sense. Thanks a lot for explaining. Probably you can include
> > this in the commit description for the record.
> > 
> > > I realize the test case is not quite effective, ruleset should be
> > > emptied upon each iteration of concurrent restore job startup.
> > 
> > Please, update the test and revamp.
> 
> I pushed the commit already when you wrote "LGTM" in your first reply,
> sorry. Yet to cover for the above, I just submitted a patch which adds a
> bit of documentation to the test case (apart from improving its
> effectiveness).

No problem. Thanks for following up.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2020-10-26 16:36 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-10-05 14:48 [iptables PATCH 0/3] nft: Fix transaction refreshing Phil Sutter
2020-10-05 14:48 ` [iptables PATCH 1/3] nft: Make batch_add_chain() return the added batch object Phil Sutter
2020-10-05 21:07   ` Florian Westphal
2020-10-05 14:48 ` [iptables PATCH 2/3] nft: Fix error reporting for refreshed transactions Phil Sutter
2020-10-05 21:13   ` Florian Westphal
2020-10-05 14:48 ` [iptables PATCH 3/3] nft: Fix for concurrent noflush restore calls Phil Sutter
2020-10-12 12:54   ` Pablo Neira Ayuso
2020-10-13 10:08     ` Phil Sutter
2020-10-13 10:15       ` Pablo Neira Ayuso
2020-10-14  9:46         ` Phil Sutter
2020-10-16 15:28           ` Pablo Neira Ayuso
2020-10-26 16:31             ` Phil Sutter
2020-10-26 16:36               ` Pablo Neira Ayuso

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).