All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets
@ 2019-12-05 18:07 Pablo Neira Ayuso
  2019-12-05 22:04 ` Phil Sutter
  0 siblings, 1 reply; 6+ messages in thread
From: Pablo Neira Ayuso @ 2019-12-05 18:07 UTC (permalink / raw)
  To: netfilter-devel

The existing rbtree implementation might store consecutive elements
where the closing element and the opening element might overlap, eg.

	[ a, a+1) [ a+1, a+2)

This patch removes the optimization for non-anonymous sets in the exact
matching case, where it is assumed to stop searching in case that the
closing element is found. Instead, invalidate candidate interval and
keep looking further in the tree.

This patch fixes the lookup and get operations.

Fixes: e701001e7cbe ("netfilter: nft_rbtree: allow adjacent intervals with dynamic updates")
Fixes: ba0e4d9917b4 ("netfilter: nf_tables: get set elements via netlink")
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 net/netfilter/nft_set_rbtree.c | 19 +++++++++++++++----
 1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index 57123259452f..510169e28065 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -74,8 +74,13 @@ static bool __nft_rbtree_lookup(const struct net *net, const struct nft_set *set
 				parent = rcu_dereference_raw(parent->rb_left);
 				continue;
 			}
-			if (nft_rbtree_interval_end(rbe))
-				goto out;
+			if (nft_rbtree_interval_end(rbe)) {
+				if (set->flags & NFT_SET_ANONYMOUS)
+					return false;
+				parent = rcu_dereference_raw(parent->rb_left);
+				interval = NULL;
+				continue;
+			}
 
 			*ext = &rbe->ext;
 			return true;
@@ -88,7 +93,7 @@ static bool __nft_rbtree_lookup(const struct net *net, const struct nft_set *set
 		*ext = &interval->ext;
 		return true;
 	}
-out:
+
 	return false;
 }
 
@@ -141,6 +146,8 @@ static bool __nft_rbtree_get(const struct net *net, const struct nft_set *set,
 		} else {
 			if (!nft_set_elem_active(&rbe->ext, genmask))
 				parent = rcu_dereference_raw(parent->rb_left);
+				continue;
+			}
 
 			if (!nft_set_ext_exists(&rbe->ext, NFT_SET_EXT_FLAGS) ||
 			    (*nft_set_ext_flags(&rbe->ext) & NFT_SET_ELEM_INTERVAL_END) ==
@@ -148,7 +155,11 @@ static bool __nft_rbtree_get(const struct net *net, const struct nft_set *set,
 				*elem = rbe;
 				return true;
 			}
-			return false;
+
+			if (nft_rbtree_interval_end(rbe))
+				interval = NULL;
+
+			parent = rcu_dereference_raw(parent->rb_left);
 		}
 	}
 
-- 
2.11.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets
  2019-12-05 18:07 [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets Pablo Neira Ayuso
@ 2019-12-05 22:04 ` Phil Sutter
  2019-12-06 19:26   ` Pablo Neira Ayuso
  0 siblings, 1 reply; 6+ messages in thread
From: Phil Sutter @ 2019-12-05 22:04 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel

Hi Pablo,

On Thu, Dec 05, 2019 at 07:07:06PM +0100, Pablo Neira Ayuso wrote:
> The existing rbtree implementation might store consecutive elements
> where the closing element and the opening element might overlap, eg.
> 
> 	[ a, a+1) [ a+1, a+2)
> 
> This patch removes the optimization for non-anonymous sets in the exact
> matching case, where it is assumed to stop searching in case that the
> closing element is found. Instead, invalidate candidate interval and
> keep looking further in the tree.
> 
> This patch fixes the lookup and get operations.

I didn't get what the actual problem is?

[...]
> diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
> index 57123259452f..510169e28065 100644
> --- a/net/netfilter/nft_set_rbtree.c
> +++ b/net/netfilter/nft_set_rbtree.c
[...]
> @@ -141,6 +146,8 @@ static bool __nft_rbtree_get(const struct net *net, const struct nft_set *set,
>  		} else {
>  			if (!nft_set_elem_active(&rbe->ext, genmask))
>  				parent = rcu_dereference_raw(parent->rb_left);
> +				continue;
> +			}
>  
>  			if (!nft_set_ext_exists(&rbe->ext, NFT_SET_EXT_FLAGS) ||
>  			    (*nft_set_ext_flags(&rbe->ext) & NFT_SET_ELEM_INTERVAL_END) ==

Are you sure about that chunk? It adds a closing brace without a
matching opening one. Either this patch ignores whitespace change or
there's something fishy. :)

Cheers, Phil

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets
  2019-12-05 22:04 ` Phil Sutter
@ 2019-12-06 19:26   ` Pablo Neira Ayuso
  2019-12-06 19:39     ` Pablo Neira Ayuso
  0 siblings, 1 reply; 6+ messages in thread
From: Pablo Neira Ayuso @ 2019-12-06 19:26 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel

On Thu, Dec 05, 2019 at 11:04:08PM +0100, Phil Sutter wrote:
> Hi Pablo,
> 
> On Thu, Dec 05, 2019 at 07:07:06PM +0100, Pablo Neira Ayuso wrote:
> > The existing rbtree implementation might store consecutive elements
> > where the closing element and the opening element might overlap, eg.
> > 
> > 	[ a, a+1) [ a+1, a+2)
> > 
> > This patch removes the optimization for non-anonymous sets in the exact
> > matching case, where it is assumed to stop searching in case that the
> > closing element is found. Instead, invalidate candidate interval and
> > keep looking further in the tree.
> > 
> > This patch fixes the lookup and get operations.
> 
> I didn't get what the actual problem is?

The lookup/get results false, while there is an element in the rbtree.
Moreover, the get operation returns true as if a+2 would be in the
tree. This happens with named sets after several set updates, I could
reproduce the issue with several elements mixed with insertion and
deletions in one batch.

I managed to trigger the problem using the test file coming in this
patch: https://patchwork.ozlabs.org/patch/1204779/

I can extend the patch description if you like.

> [...]
> > diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
> > index 57123259452f..510169e28065 100644
> > --- a/net/netfilter/nft_set_rbtree.c
> > +++ b/net/netfilter/nft_set_rbtree.c
> [...]
> > @@ -141,6 +146,8 @@ static bool __nft_rbtree_get(const struct net *net, const struct nft_set *set,
> >  		} else {
> >  			if (!nft_set_elem_active(&rbe->ext, genmask))
> >  				parent = rcu_dereference_raw(parent->rb_left);
> > +				continue;
> > +			}
> >  
> >  			if (!nft_set_ext_exists(&rbe->ext, NFT_SET_EXT_FLAGS) ||
> >  			    (*nft_set_ext_flags(&rbe->ext) & NFT_SET_ELEM_INTERVAL_END) ==
> 
> Are you sure about that chunk? It adds a closing brace without a
> matching opening one. Either this patch ignores whitespace change or
> there's something fishy. :)

Yes, I botched it when squashing my patches before submission. I just
sent a v2.

Thanks.



^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets
  2019-12-06 19:26   ` Pablo Neira Ayuso
@ 2019-12-06 19:39     ` Pablo Neira Ayuso
  2019-12-07 11:03       ` Phil Sutter
  0 siblings, 1 reply; 6+ messages in thread
From: Pablo Neira Ayuso @ 2019-12-06 19:39 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel

On Fri, Dec 06, 2019 at 08:26:47PM +0100, Pablo Neira Ayuso wrote:
> On Thu, Dec 05, 2019 at 11:04:08PM +0100, Phil Sutter wrote:
> > Hi Pablo,
> > 
> > On Thu, Dec 05, 2019 at 07:07:06PM +0100, Pablo Neira Ayuso wrote:
> > > The existing rbtree implementation might store consecutive elements
> > > where the closing element and the opening element might overlap, eg.
> > > 
> > > 	[ a, a+1) [ a+1, a+2)
> > > 
> > > This patch removes the optimization for non-anonymous sets in the exact
> > > matching case, where it is assumed to stop searching in case that the
> > > closing element is found. Instead, invalidate candidate interval and
> > > keep looking further in the tree.
> > > 
> > > This patch fixes the lookup and get operations.
> > 
> > I didn't get what the actual problem is?
> 
> The lookup/get results false, while there is an element in the rbtree.
> Moreover, the get operation returns true as if a+2 would be in the
> tree. This happens with named sets after several set updates, I could
> reproduce the issue with several elements mixed with insertion and
> deletions in one batch.

To extend the problem description: The issue is that the existing
lookup optimization (that only works for the anonymous sets) might not
reach the opening [ a+1, ... element if the closing ... , a+1) is
found in first place when walking over the rbtree. Hence, walking the
full tree in that case is needed.

> I managed to trigger the problem using the test file coming in this
> patch: https://patchwork.ozlabs.org/patch/1204779/
> 
> I can extend the patch description if you like.
> 
> > [...]
> > > diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
> > > index 57123259452f..510169e28065 100644
> > > --- a/net/netfilter/nft_set_rbtree.c
> > > +++ b/net/netfilter/nft_set_rbtree.c
> > [...]
> > > @@ -141,6 +146,8 @@ static bool __nft_rbtree_get(const struct net *net, const struct nft_set *set,
> > >  		} else {
> > >  			if (!nft_set_elem_active(&rbe->ext, genmask))
> > >  				parent = rcu_dereference_raw(parent->rb_left);
> > > +				continue;
> > > +			}
> > >  
> > >  			if (!nft_set_ext_exists(&rbe->ext, NFT_SET_EXT_FLAGS) ||
> > >  			    (*nft_set_ext_flags(&rbe->ext) & NFT_SET_ELEM_INTERVAL_END) ==
> > 
> > Are you sure about that chunk? It adds a closing brace without a
> > matching opening one. Either this patch ignores whitespace change or
> > there's something fishy. :)
> 
> Yes, I botched it when squashing my patches before submission. I just
> sent a v2.
> 
> Thanks.
> 
> 

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets
  2019-12-06 19:39     ` Pablo Neira Ayuso
@ 2019-12-07 11:03       ` Phil Sutter
  2019-12-07 18:38         ` Pablo Neira Ayuso
  0 siblings, 1 reply; 6+ messages in thread
From: Phil Sutter @ 2019-12-07 11:03 UTC (permalink / raw)
  To: Pablo Neira Ayuso; +Cc: netfilter-devel

Hi Pablo,

On Fri, Dec 06, 2019 at 08:39:38PM +0100, Pablo Neira Ayuso wrote:
> On Fri, Dec 06, 2019 at 08:26:47PM +0100, Pablo Neira Ayuso wrote:
> > On Thu, Dec 05, 2019 at 11:04:08PM +0100, Phil Sutter wrote:
> > > Hi Pablo,
> > > 
> > > On Thu, Dec 05, 2019 at 07:07:06PM +0100, Pablo Neira Ayuso wrote:
> > > > The existing rbtree implementation might store consecutive elements
> > > > where the closing element and the opening element might overlap, eg.
> > > > 
> > > > 	[ a, a+1) [ a+1, a+2)
> > > > 
> > > > This patch removes the optimization for non-anonymous sets in the exact
> > > > matching case, where it is assumed to stop searching in case that the
> > > > closing element is found. Instead, invalidate candidate interval and
> > > > keep looking further in the tree.
> > > > 
> > > > This patch fixes the lookup and get operations.
> > > 
> > > I didn't get what the actual problem is?
> > 
> > The lookup/get results false, while there is an element in the rbtree.
> > Moreover, the get operation returns true as if a+2 would be in the
> > tree. This happens with named sets after several set updates, I could
> > reproduce the issue with several elements mixed with insertion and
> > deletions in one batch.
> 
> To extend the problem description: The issue is that the existing
> lookup optimization (that only works for the anonymous sets) might not
> reach the opening [ a+1, ... element if the closing ... , a+1) is
> found in first place when walking over the rbtree. Hence, walking the
> full tree in that case is needed.

Ah! Thanks a lot for your elaborations. It was really hard to grasp what
all this is about from the initial commit message. :)

Sometimes I wonder if we should do more set optimizations under the hood
when adding elements. Right now, we don't touch existing ones although
it would make sense. And we could be more intelligent for example if a
set contains 20-30 and a user adds 25-35.

Cheers, Phil

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets
  2019-12-07 11:03       ` Phil Sutter
@ 2019-12-07 18:38         ` Pablo Neira Ayuso
  0 siblings, 0 replies; 6+ messages in thread
From: Pablo Neira Ayuso @ 2019-12-07 18:38 UTC (permalink / raw)
  To: Phil Sutter, netfilter-devel

On Sat, Dec 07, 2019 at 12:03:07PM +0100, Phil Sutter wrote:
> Hi Pablo,
> 
> On Fri, Dec 06, 2019 at 08:39:38PM +0100, Pablo Neira Ayuso wrote:
> > On Fri, Dec 06, 2019 at 08:26:47PM +0100, Pablo Neira Ayuso wrote:
> > > On Thu, Dec 05, 2019 at 11:04:08PM +0100, Phil Sutter wrote:
> > > > Hi Pablo,
> > > > 
> > > > On Thu, Dec 05, 2019 at 07:07:06PM +0100, Pablo Neira Ayuso wrote:
> > > > > The existing rbtree implementation might store consecutive elements
> > > > > where the closing element and the opening element might overlap, eg.
> > > > > 
> > > > > 	[ a, a+1) [ a+1, a+2)
> > > > > 
> > > > > This patch removes the optimization for non-anonymous sets in the exact
> > > > > matching case, where it is assumed to stop searching in case that the
> > > > > closing element is found. Instead, invalidate candidate interval and
> > > > > keep looking further in the tree.
> > > > > 
> > > > > This patch fixes the lookup and get operations.
> > > > 
> > > > I didn't get what the actual problem is?
> > > 
> > > The lookup/get results false, while there is an element in the rbtree.
> > > Moreover, the get operation returns true as if a+2 would be in the
> > > tree. This happens with named sets after several set updates, I could
> > > reproduce the issue with several elements mixed with insertion and
> > > deletions in one batch.
> > 
> > To extend the problem description: The issue is that the existing
> > lookup optimization (that only works for the anonymous sets) might not
> > reach the opening [ a+1, ... element if the closing ... , a+1) is
> > found in first place when walking over the rbtree. Hence, walking the
> > full tree in that case is needed.
> 
> Ah! Thanks a lot for your elaborations. It was really hard to grasp what
> all this is about from the initial commit message. :)

I'll append this to the description before applying, no problem,
thanks for asking.

> Sometimes I wonder if we should do more set optimizations under the hood
> when adding elements. Right now, we don't touch existing ones although
> it would make sense. And we could be more intelligent for example if a
> set contains 20-30 and a user adds 25-35.

Yes, if 'automerge' is set on, then nft should start doing some more
intelligent stuff. Basically the idea is: check if this interval
overlaps with an existing one, if so transaction looks like this:

        [ delete 25-35 | add 20-35 ]

With the existing transaction infrastructure in place, I think it
should not be too hard. There's already a routine to check for
overlaps. If this is a data mapping, this needs to be careful to not
merge intervals that have different data on the right hand side.

There's another problem, I think incremental userspace cache is still
incomplete. I can probably scratch time and look into this, this might
not happen before 2020 though.

Thanks.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-12-07 18:39 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-05 18:07 [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets Pablo Neira Ayuso
2019-12-05 22:04 ` Phil Sutter
2019-12-06 19:26   ` Pablo Neira Ayuso
2019-12-06 19:39     ` Pablo Neira Ayuso
2019-12-07 11:03       ` Phil Sutter
2019-12-07 18:38         ` Pablo Neira Ayuso

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.