netfilter-devel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH iptables RFC 0/4] revisit RESTART log
@ 2019-05-20 14:41 Pablo Neira Ayuso
  2019-05-20 14:41 ` [PATCH iptables RFC 1/4] nft: don't check for table existence from __nft_table_flush() Pablo Neira Ayuso
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-05-20 14:41 UTC (permalink / raw)
  To: netfilter-devel; +Cc: fw, phil

This still does not work, still searching for more bugs here.

Patch 1) Remove skip logic from __nft_table_flush(), before we
	 hit ERESTART. Better do not preventively skip table flush.

Patch 2) Keeps the original cache, while we introduce a new cache
         that is used when we hit ERESTART.

Patch 3) Remove NFT_COMPAT_TABLE_ADD case from refresh transaction,
         I don't find a scenario for this.

Patch 4) Reevaluate based on the existing cache, not on the previous
         object state. Original commit doesn't mention, but
	 NFT_COMPAT_CHAIN_USER_ADD only makes sense to me to do
	 the special handling from h->noflush.

I can still see the test still fails most of the time with:

line 5: CHAIN_USER_ADD failed (File exists): chain UC-0

which should not happen if table exists, because a flush should have
happened before.

Pablo Neira Ayuso (4):
  nft: don't check for table existence from __nft_table_flush()
  nft: keep original cache in case of ERESTART
  nft: don't skip table addition from ERESTART
  nft: don't care about previous state in RESTART

 iptables/nft.c | 77 +++++++++++++++++++++++++++++++---------------------------
 iptables/nft.h |  3 ++-
 2 files changed, 43 insertions(+), 37 deletions(-)

-- 
2.11.0


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH iptables RFC 1/4] nft: don't check for table existence from __nft_table_flush()
  2019-05-20 14:41 [PATCH iptables RFC 0/4] revisit RESTART log Pablo Neira Ayuso
@ 2019-05-20 14:41 ` Pablo Neira Ayuso
  2019-05-20 14:41 ` [PATCH iptables RFC 2/4] nft: keep original cache in case of ERESTART Pablo Neira Ayuso
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-05-20 14:41 UTC (permalink / raw)
  To: netfilter-devel; +Cc: fw, phil

This is a partial revert of d3e378b4a93f ("xtables: add skip flag to
objects"). This should be handled from the ERESTART case.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 iptables/nft.c | 25 +++++++++----------------
 1 file changed, 9 insertions(+), 16 deletions(-)

diff --git a/iptables/nft.c b/iptables/nft.c
index 172beec9ae27..0f0492bc200c 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -2103,10 +2103,9 @@ int nft_for_each_table(struct nft_handle *h,
 	return 0;
 }
 
-static int __nft_table_flush(struct nft_handle *h, const char *table, bool exists)
+static int __nft_table_flush(struct nft_handle *h, const char *table)
 {
 	const struct builtin_table *_t;
-	struct obj_update *obj;
 	struct nftnl_table *t;
 
 	t = nftnl_table_alloc();
@@ -2115,14 +2114,7 @@ static int __nft_table_flush(struct nft_handle *h, const char *table, bool exist
 
 	nftnl_table_set_str(t, NFTNL_TABLE_NAME, table);
 
-	obj = batch_table_add(h, NFT_COMPAT_TABLE_FLUSH, t);
-	if (!obj) {
-		nftnl_table_free(t);
-		return -1;
-	}
-
-	if (!exists)
-		obj->skip = 1;
+	batch_table_add(h, NFT_COMPAT_TABLE_FLUSH, t);
 
 	_t = nft_table_builtin_find(h, table);
 	assert(_t);
@@ -2138,7 +2130,6 @@ int nft_table_flush(struct nft_handle *h, const char *table)
 	struct nftnl_table_list_iter *iter;
 	struct nftnl_table_list *list;
 	struct nftnl_table *t;
-	bool exists = false;
 	int ret = 0;
 
 	nft_fn = nft_table_flush;
@@ -2160,15 +2151,17 @@ int nft_table_flush(struct nft_handle *h, const char *table)
 		const char *table_name =
 			nftnl_table_get_str(t, NFTNL_TABLE_NAME);
 
-		if (strcmp(table_name, table) == 0) {
-			exists = true;
-			break;
-		}
+		if (strcmp(table_name, table) != 0)
+			goto next;
 
+		ret = __nft_table_flush(h, table);
+		if (ret < 0)
+			goto err_table_iter;
+next:
 		t = nftnl_table_list_iter_next(iter);
 	}
 
-	ret = __nft_table_flush(h, table, exists);
+err_table_iter:
 	nftnl_table_list_iter_destroy(iter);
 err_table_list:
 err_out:
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH iptables RFC 2/4] nft: keep original cache in case of ERESTART
  2019-05-20 14:41 [PATCH iptables RFC 0/4] revisit RESTART log Pablo Neira Ayuso
  2019-05-20 14:41 ` [PATCH iptables RFC 1/4] nft: don't check for table existence from __nft_table_flush() Pablo Neira Ayuso
@ 2019-05-20 14:41 ` Pablo Neira Ayuso
  2019-05-20 14:41 ` [PATCH iptables RFC 3/4] nft: don't skip table addition from ERESTART Pablo Neira Ayuso
  2019-05-20 14:41 ` [PATCH iptables RFC 4/4] nft: don't care about previous state in RESTART Pablo Neira Ayuso
  3 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-05-20 14:41 UTC (permalink / raw)
  To: netfilter-devel; +Cc: fw, phil

Phil Sutter says:

"The problem is that data in h->obj_list potentially sits in cache, too.
At least rules have to be there so insert with index works correctly. If
the cache is flushed before regenerating the batch, use-after-free
occurs which crashes the program."

This patch keeps around the original cache until we have refreshed the
batch.

Fixes: 862818ac3a0de ("xtables: add and use nft_build_cache")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 iptables/nft.c | 25 +++++++++++++++++++++----
 iptables/nft.h |  3 ++-
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/iptables/nft.c b/iptables/nft.c
index 0f0492bc200c..a03a84c9a6d1 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -811,7 +811,7 @@ int nft_init(struct nft_handle *h, const struct builtin_table *t)
 
 	h->portid = mnl_socket_get_portid(h->nl);
 	h->tables = t;
-	h->cache = &h->__cache;
+	h->cache = &h->__cache[0];
 
 	INIT_LIST_HEAD(&h->obj_list);
 	INIT_LIST_HEAD(&h->err_list);
@@ -1618,14 +1618,30 @@ void nft_build_cache(struct nft_handle *h)
 		__nft_build_cache(h);
 }
 
-static void nft_rebuild_cache(struct nft_handle *h)
+static void __nft_flush_cache(struct nft_handle *h)
 {
-	if (!h->have_cache)
+	if (!h->cache_index) {
+		h->cache_index++;
+		h->cache = &h->__cache[h->cache_index];
+	} else {
 		flush_chain_cache(h, NULL);
+	}
+}
+
+static void nft_rebuild_cache(struct nft_handle *h)
+{
+	if (h->have_cache)
+		__nft_flush_cache(h);
 
 	__nft_build_cache(h);
 }
 
+static void nft_release_cache(struct nft_handle *h)
+{
+	if (h->cache_index)
+		flush_cache(&h->__cache[0], h->tables, NULL);
+}
+
 struct nftnl_chain_list *nft_chain_list_get(struct nft_handle *h,
 					    const char *table)
 {
@@ -1762,7 +1778,7 @@ __nft_rule_flush(struct nft_handle *h, const char *table,
 		return;
 	}
 
-	obj->implicit = 1;
+	obj->implicit = implicit;
 }
 
 int nft_rule_flush(struct nft_handle *h, const char *chain, const char *table,
@@ -2950,6 +2966,7 @@ retry:
 		batch_obj_del(h, n);
 	}
 
+	nft_release_cache(h);
 	mnl_batch_reset(h->batch);
 
 	if (i)
diff --git a/iptables/nft.h b/iptables/nft.h
index dc0797d302b8..43eb8a39dd9c 100644
--- a/iptables/nft.h
+++ b/iptables/nft.h
@@ -48,7 +48,8 @@ struct nft_handle {
 	struct list_head	err_list;
 	struct nft_family_ops	*ops;
 	const struct builtin_table *tables;
-	struct nft_cache	__cache;
+	unsigned int		cache_index;
+	struct nft_cache	__cache[2];
 	struct nft_cache	*cache;
 	bool			have_cache;
 	bool			restore;
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH iptables RFC 3/4] nft: don't skip table addition from ERESTART
  2019-05-20 14:41 [PATCH iptables RFC 0/4] revisit RESTART log Pablo Neira Ayuso
  2019-05-20 14:41 ` [PATCH iptables RFC 1/4] nft: don't check for table existence from __nft_table_flush() Pablo Neira Ayuso
  2019-05-20 14:41 ` [PATCH iptables RFC 2/4] nft: keep original cache in case of ERESTART Pablo Neira Ayuso
@ 2019-05-20 14:41 ` Pablo Neira Ayuso
  2019-05-20 14:41 ` [PATCH iptables RFC 4/4] nft: don't care about previous state in RESTART Pablo Neira Ayuso
  3 siblings, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-05-20 14:41 UTC (permalink / raw)
  To: netfilter-devel; +Cc: fw, phil

I don't find a scenario that trigger this case.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 iptables/nft.c | 10 +---------
 1 file changed, 1 insertion(+), 9 deletions(-)

diff --git a/iptables/nft.c b/iptables/nft.c
index a03a84c9a6d1..c1a079b734cf 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -2787,15 +2787,6 @@ static void nft_refresh_transaction(struct nft_handle *h)
 			else if (!n->skip && !exists)
 				n->skip = 1;
 			break;
-		case NFT_COMPAT_TABLE_ADD:
-			tablename = nftnl_table_get_str(n->table, NFTNL_TABLE_NAME);
-			if (!tablename)
-				continue;
-
-			exists = nft_table_find(h, tablename);
-			if (n->skip && !exists)
-				n->skip = 0;
-			break;
 		case NFT_COMPAT_CHAIN_USER_ADD:
 			tablename = nftnl_chain_get_str(n->chain, NFTNL_CHAIN_TABLE);
 			if (!tablename)
@@ -2815,6 +2806,7 @@ static void nft_refresh_transaction(struct nft_handle *h)
 				n->skip = 0;
 			}
 			break;
+		case NFT_COMPAT_TABLE_ADD:
 		case NFT_COMPAT_CHAIN_ADD:
 		case NFT_COMPAT_CHAIN_ZERO:
 		case NFT_COMPAT_CHAIN_USER_DEL:
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH iptables RFC 4/4] nft: don't care about previous state in RESTART
  2019-05-20 14:41 [PATCH iptables RFC 0/4] revisit RESTART log Pablo Neira Ayuso
                   ` (2 preceding siblings ...)
  2019-05-20 14:41 ` [PATCH iptables RFC 3/4] nft: don't skip table addition from ERESTART Pablo Neira Ayuso
@ 2019-05-20 14:41 ` Pablo Neira Ayuso
  2019-05-20 14:49   ` Pablo Neira Ayuso
  3 siblings, 1 reply; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-05-20 14:41 UTC (permalink / raw)
  To: netfilter-devel; +Cc: fw, phil

We need to re-evalute based on the existing cache generation.

Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 iptables/nft.c | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/iptables/nft.c b/iptables/nft.c
index c1a079b734cf..bc3847d7ea47 100644
--- a/iptables/nft.c
+++ b/iptables/nft.c
@@ -2782,10 +2782,10 @@ static void nft_refresh_transaction(struct nft_handle *h)
 			if (!tablename)
 				continue;
 			exists = nft_table_find(h, tablename);
-			if (n->skip && exists)
-				n->skip = 0;
-			else if (!n->skip && !exists)
+			if (exists)
 				n->skip = 1;
+			else
+				n->skip = 0;
 			break;
 		case NFT_COMPAT_CHAIN_USER_ADD:
 			tablename = nftnl_chain_get_str(n->chain, NFTNL_CHAIN_TABLE);
@@ -2796,13 +2796,16 @@ static void nft_refresh_transaction(struct nft_handle *h)
 			if (!chainname)
 				continue;
 
+			if (!h->noflush)
+				break;
+
 			c = nft_chain_find(h, tablename, chainname);
-			if (c && !n->skip) {
+			if (c) {
 				/* -restore -n flushes existing rules from redefined user-chain */
-				if (h->noflush)
-					__nft_rule_flush(h, tablename,
-							 chainname, false, true);
-			} else if (!c && n->skip) {
+				__nft_rule_flush(h, tablename,
+						 chainname, false, true);
+				n->skip = 1;
+			} else if (!c) {
 				n->skip = 0;
 			}
 			break;
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH iptables RFC 4/4] nft: don't care about previous state in RESTART
  2019-05-20 14:41 ` [PATCH iptables RFC 4/4] nft: don't care about previous state in RESTART Pablo Neira Ayuso
@ 2019-05-20 14:49   ` Pablo Neira Ayuso
  2019-05-20 14:59     ` Pablo Neira Ayuso
  2019-05-20 15:06     ` Pablo Neira Ayuso
  0 siblings, 2 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-05-20 14:49 UTC (permalink / raw)
  To: netfilter-devel; +Cc: fw, phil

On Mon, May 20, 2019 at 04:41:15PM +0200, Pablo Neira Ayuso wrote:
> We need to re-evalute based on the existing cache generation.
> 
> Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> ---
>  iptables/nft.c | 19 +++++++++++--------
>  1 file changed, 11 insertions(+), 8 deletions(-)
> 
> diff --git a/iptables/nft.c b/iptables/nft.c
> index c1a079b734cf..bc3847d7ea47 100644
> --- a/iptables/nft.c
> +++ b/iptables/nft.c
> @@ -2782,10 +2782,10 @@ static void nft_refresh_transaction(struct nft_handle *h)
>  			if (!tablename)
>  				continue;
>  			exists = nft_table_find(h, tablename);
> -			if (n->skip && exists)
> -				n->skip = 0;
> -			else if (!n->skip && !exists)
> +			if (exists)
>  				n->skip = 1;
> +			else
> +				n->skip = 0;

Actually, this should be the opposite:

 			if (exists)
 				n->skip = 0;
			else
				n->skip = 1;

So we only skip the flush if the table does not exist.

Still not working though, hitting EEXIST on CHAIN_USER_ADD.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH iptables RFC 4/4] nft: don't care about previous state in RESTART
  2019-05-20 14:49   ` Pablo Neira Ayuso
@ 2019-05-20 14:59     ` Pablo Neira Ayuso
  2019-05-20 15:06     ` Pablo Neira Ayuso
  1 sibling, 0 replies; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-05-20 14:59 UTC (permalink / raw)
  To: netfilter-devel; +Cc: fw, phil

On Mon, May 20, 2019 at 04:49:38PM +0200, Pablo Neira Ayuso wrote:
> On Mon, May 20, 2019 at 04:41:15PM +0200, Pablo Neira Ayuso wrote:
> > We need to re-evalute based on the existing cache generation.
> > 
> > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> > ---
> >  iptables/nft.c | 19 +++++++++++--------
> >  1 file changed, 11 insertions(+), 8 deletions(-)
> > 
> > diff --git a/iptables/nft.c b/iptables/nft.c
> > index c1a079b734cf..bc3847d7ea47 100644
> > --- a/iptables/nft.c
> > +++ b/iptables/nft.c
> > @@ -2782,10 +2782,10 @@ static void nft_refresh_transaction(struct nft_handle *h)
> >  			if (!tablename)
> >  				continue;
> >  			exists = nft_table_find(h, tablename);
> > -			if (n->skip && exists)
> > -				n->skip = 0;
> > -			else if (!n->skip && !exists)
> > +			if (exists)
> >  				n->skip = 1;
> > +			else
> > +				n->skip = 0;
> 
> Actually, this should be the opposite:
> 
>  			if (exists)
>  				n->skip = 0;
> 			else
> 				n->skip = 1;
> 
> So we only skip the flush if the table does not exist.
> 
> Still not working though, hitting EEXIST on CHAIN_USER_ADD.

Hm.

I also occasionally see "Message too long" errors, so looks like a few
more bugs ahead.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH iptables RFC 4/4] nft: don't care about previous state in RESTART
  2019-05-20 14:49   ` Pablo Neira Ayuso
  2019-05-20 14:59     ` Pablo Neira Ayuso
@ 2019-05-20 15:06     ` Pablo Neira Ayuso
  2019-05-20 15:12       ` Florian Westphal
  1 sibling, 1 reply; 9+ messages in thread
From: Pablo Neira Ayuso @ 2019-05-20 15:06 UTC (permalink / raw)
  To: netfilter-devel; +Cc: fw, phil

On Mon, May 20, 2019 at 04:49:38PM +0200, Pablo Neira Ayuso wrote:
> On Mon, May 20, 2019 at 04:41:15PM +0200, Pablo Neira Ayuso wrote:
> > We need to re-evalute based on the existing cache generation.
> >
> > Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
> > ---
> >  iptables/nft.c | 19 +++++++++++--------
> >  1 file changed, 11 insertions(+), 8 deletions(-)
> >
> > diff --git a/iptables/nft.c b/iptables/nft.c
> > index c1a079b734cf..bc3847d7ea47 100644
> > --- a/iptables/nft.c
> > +++ b/iptables/nft.c
> > @@ -2782,10 +2782,10 @@ static void nft_refresh_transaction(struct nft_handle *h)
> >  			if (!tablename)
> >  				continue;
> >  			exists = nft_table_find(h, tablename);
> > -			if (n->skip && exists)
> > -				n->skip = 0;
> > -			else if (!n->skip && !exists)
> > +			if (exists)
> >  				n->skip = 1;
> > +			else
> > +				n->skip = 0;
>
> Actually, this should be the opposite:
>
>  			if (exists)
>  				n->skip = 0;
> 			else
> 				n->skip = 1;
>
> So we only skip the flush if the table does not exist.
>
> Still not working though, hitting EEXIST on CHAIN_USER_ADD.

The nfnl_unlock(subsys_id); is released after check the generation ID
in nfnetlink.

This is rendering the generation ID useless. We need a kernel fix for
this.

The per-netns mutex net->nft.commit_mutex should be handled from the
nfnetlink core.

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH iptables RFC 4/4] nft: don't care about previous state in RESTART
  2019-05-20 15:06     ` Pablo Neira Ayuso
@ 2019-05-20 15:12       ` Florian Westphal
  0 siblings, 0 replies; 9+ messages in thread
From: Florian Westphal @ 2019-05-20 15:12 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel, fw, phil

Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > So we only skip the flush if the table does not exist.
> >
> > Still not working though, hitting EEXIST on CHAIN_USER_ADD.
> 
> The nfnl_unlock(subsys_id); is released after check the generation ID
> in nfnetlink.
> 
> This is rendering the generation ID useless. We need a kernel fix for
> this.

-v, the subsys mutex is released, but we do hold the transaction mutex.

parallel batch that is incoming will block in
nf_tables_valid_genid() until current transaction completes, then it
will fail due to genid mismatch.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2019-05-20 15:13 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-05-20 14:41 [PATCH iptables RFC 0/4] revisit RESTART log Pablo Neira Ayuso
2019-05-20 14:41 ` [PATCH iptables RFC 1/4] nft: don't check for table existence from __nft_table_flush() Pablo Neira Ayuso
2019-05-20 14:41 ` [PATCH iptables RFC 2/4] nft: keep original cache in case of ERESTART Pablo Neira Ayuso
2019-05-20 14:41 ` [PATCH iptables RFC 3/4] nft: don't skip table addition from ERESTART Pablo Neira Ayuso
2019-05-20 14:41 ` [PATCH iptables RFC 4/4] nft: don't care about previous state in RESTART Pablo Neira Ayuso
2019-05-20 14:49   ` Pablo Neira Ayuso
2019-05-20 14:59     ` Pablo Neira Ayuso
2019-05-20 15:06     ` Pablo Neira Ayuso
2019-05-20 15:12       ` Florian Westphal

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).