* [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets @ 2019-12-05 18:07 Pablo Neira Ayuso 2019-12-05 22:04 ` Phil Sutter 0 siblings, 1 reply; 6+ messages in thread From: Pablo Neira Ayuso @ 2019-12-05 18:07 UTC (permalink / raw) To: netfilter-devel The existing rbtree implementation might store consecutive elements where the closing element and the opening element might overlap, eg. [ a, a+1) [ a+1, a+2) This patch removes the optimization for non-anonymous sets in the exact matching case, where it is assumed to stop searching in case that the closing element is found. Instead, invalidate candidate interval and keep looking further in the tree. This patch fixes the lookup and get operations. Fixes: e701001e7cbe ("netfilter: nft_rbtree: allow adjacent intervals with dynamic updates") Fixes: ba0e4d9917b4 ("netfilter: nf_tables: get set elements via netlink") Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org> --- net/netfilter/nft_set_rbtree.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-) diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c index 57123259452f..510169e28065 100644 --- a/net/netfilter/nft_set_rbtree.c +++ b/net/netfilter/nft_set_rbtree.c @@ -74,8 +74,13 @@ static bool __nft_rbtree_lookup(const struct net *net, const struct nft_set *set parent = rcu_dereference_raw(parent->rb_left); continue; } - if (nft_rbtree_interval_end(rbe)) - goto out; + if (nft_rbtree_interval_end(rbe)) { + if (set->flags & NFT_SET_ANONYMOUS) + return false; + parent = rcu_dereference_raw(parent->rb_left); + interval = NULL; + continue; + } *ext = &rbe->ext; return true; @@ -88,7 +93,7 @@ static bool __nft_rbtree_lookup(const struct net *net, const struct nft_set *set *ext = &interval->ext; return true; } -out: + return false; } @@ -141,6 +146,8 @@ static bool __nft_rbtree_get(const struct net *net, const struct nft_set *set, } else { if (!nft_set_elem_active(&rbe->ext, genmask)) parent = rcu_dereference_raw(parent->rb_left); + continue; + } if (!nft_set_ext_exists(&rbe->ext, NFT_SET_EXT_FLAGS) || (*nft_set_ext_flags(&rbe->ext) & NFT_SET_ELEM_INTERVAL_END) == @@ -148,7 +155,11 @@ static bool __nft_rbtree_get(const struct net *net, const struct nft_set *set, *elem = rbe; return true; } - return false; + + if (nft_rbtree_interval_end(rbe)) + interval = NULL; + + parent = rcu_dereference_raw(parent->rb_left); } } -- 2.11.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets 2019-12-05 18:07 [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets Pablo Neira Ayuso @ 2019-12-05 22:04 ` Phil Sutter 2019-12-06 19:26 ` Pablo Neira Ayuso 0 siblings, 1 reply; 6+ messages in thread From: Phil Sutter @ 2019-12-05 22:04 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: netfilter-devel Hi Pablo, On Thu, Dec 05, 2019 at 07:07:06PM +0100, Pablo Neira Ayuso wrote: > The existing rbtree implementation might store consecutive elements > where the closing element and the opening element might overlap, eg. > > [ a, a+1) [ a+1, a+2) > > This patch removes the optimization for non-anonymous sets in the exact > matching case, where it is assumed to stop searching in case that the > closing element is found. Instead, invalidate candidate interval and > keep looking further in the tree. > > This patch fixes the lookup and get operations. I didn't get what the actual problem is? [...] > diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c > index 57123259452f..510169e28065 100644 > --- a/net/netfilter/nft_set_rbtree.c > +++ b/net/netfilter/nft_set_rbtree.c [...] > @@ -141,6 +146,8 @@ static bool __nft_rbtree_get(const struct net *net, const struct nft_set *set, > } else { > if (!nft_set_elem_active(&rbe->ext, genmask)) > parent = rcu_dereference_raw(parent->rb_left); > + continue; > + } > > if (!nft_set_ext_exists(&rbe->ext, NFT_SET_EXT_FLAGS) || > (*nft_set_ext_flags(&rbe->ext) & NFT_SET_ELEM_INTERVAL_END) == Are you sure about that chunk? It adds a closing brace without a matching opening one. Either this patch ignores whitespace change or there's something fishy. :) Cheers, Phil ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets 2019-12-05 22:04 ` Phil Sutter @ 2019-12-06 19:26 ` Pablo Neira Ayuso 2019-12-06 19:39 ` Pablo Neira Ayuso 0 siblings, 1 reply; 6+ messages in thread From: Pablo Neira Ayuso @ 2019-12-06 19:26 UTC (permalink / raw) To: Phil Sutter, netfilter-devel On Thu, Dec 05, 2019 at 11:04:08PM +0100, Phil Sutter wrote: > Hi Pablo, > > On Thu, Dec 05, 2019 at 07:07:06PM +0100, Pablo Neira Ayuso wrote: > > The existing rbtree implementation might store consecutive elements > > where the closing element and the opening element might overlap, eg. > > > > [ a, a+1) [ a+1, a+2) > > > > This patch removes the optimization for non-anonymous sets in the exact > > matching case, where it is assumed to stop searching in case that the > > closing element is found. Instead, invalidate candidate interval and > > keep looking further in the tree. > > > > This patch fixes the lookup and get operations. > > I didn't get what the actual problem is? The lookup/get results false, while there is an element in the rbtree. Moreover, the get operation returns true as if a+2 would be in the tree. This happens with named sets after several set updates, I could reproduce the issue with several elements mixed with insertion and deletions in one batch. I managed to trigger the problem using the test file coming in this patch: https://patchwork.ozlabs.org/patch/1204779/ I can extend the patch description if you like. > [...] > > diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c > > index 57123259452f..510169e28065 100644 > > --- a/net/netfilter/nft_set_rbtree.c > > +++ b/net/netfilter/nft_set_rbtree.c > [...] > > @@ -141,6 +146,8 @@ static bool __nft_rbtree_get(const struct net *net, const struct nft_set *set, > > } else { > > if (!nft_set_elem_active(&rbe->ext, genmask)) > > parent = rcu_dereference_raw(parent->rb_left); > > + continue; > > + } > > > > if (!nft_set_ext_exists(&rbe->ext, NFT_SET_EXT_FLAGS) || > > (*nft_set_ext_flags(&rbe->ext) & NFT_SET_ELEM_INTERVAL_END) == > > Are you sure about that chunk? It adds a closing brace without a > matching opening one. Either this patch ignores whitespace change or > there's something fishy. :) Yes, I botched it when squashing my patches before submission. I just sent a v2. Thanks. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets 2019-12-06 19:26 ` Pablo Neira Ayuso @ 2019-12-06 19:39 ` Pablo Neira Ayuso 2019-12-07 11:03 ` Phil Sutter 0 siblings, 1 reply; 6+ messages in thread From: Pablo Neira Ayuso @ 2019-12-06 19:39 UTC (permalink / raw) To: Phil Sutter, netfilter-devel On Fri, Dec 06, 2019 at 08:26:47PM +0100, Pablo Neira Ayuso wrote: > On Thu, Dec 05, 2019 at 11:04:08PM +0100, Phil Sutter wrote: > > Hi Pablo, > > > > On Thu, Dec 05, 2019 at 07:07:06PM +0100, Pablo Neira Ayuso wrote: > > > The existing rbtree implementation might store consecutive elements > > > where the closing element and the opening element might overlap, eg. > > > > > > [ a, a+1) [ a+1, a+2) > > > > > > This patch removes the optimization for non-anonymous sets in the exact > > > matching case, where it is assumed to stop searching in case that the > > > closing element is found. Instead, invalidate candidate interval and > > > keep looking further in the tree. > > > > > > This patch fixes the lookup and get operations. > > > > I didn't get what the actual problem is? > > The lookup/get results false, while there is an element in the rbtree. > Moreover, the get operation returns true as if a+2 would be in the > tree. This happens with named sets after several set updates, I could > reproduce the issue with several elements mixed with insertion and > deletions in one batch. To extend the problem description: The issue is that the existing lookup optimization (that only works for the anonymous sets) might not reach the opening [ a+1, ... element if the closing ... , a+1) is found in first place when walking over the rbtree. Hence, walking the full tree in that case is needed. > I managed to trigger the problem using the test file coming in this > patch: https://patchwork.ozlabs.org/patch/1204779/ > > I can extend the patch description if you like. > > > [...] > > > diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c > > > index 57123259452f..510169e28065 100644 > > > --- a/net/netfilter/nft_set_rbtree.c > > > +++ b/net/netfilter/nft_set_rbtree.c > > [...] > > > @@ -141,6 +146,8 @@ static bool __nft_rbtree_get(const struct net *net, const struct nft_set *set, > > > } else { > > > if (!nft_set_elem_active(&rbe->ext, genmask)) > > > parent = rcu_dereference_raw(parent->rb_left); > > > + continue; > > > + } > > > > > > if (!nft_set_ext_exists(&rbe->ext, NFT_SET_EXT_FLAGS) || > > > (*nft_set_ext_flags(&rbe->ext) & NFT_SET_ELEM_INTERVAL_END) == > > > > Are you sure about that chunk? It adds a closing brace without a > > matching opening one. Either this patch ignores whitespace change or > > there's something fishy. :) > > Yes, I botched it when squashing my patches before submission. I just > sent a v2. > > Thanks. > > ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets 2019-12-06 19:39 ` Pablo Neira Ayuso @ 2019-12-07 11:03 ` Phil Sutter 2019-12-07 18:38 ` Pablo Neira Ayuso 0 siblings, 1 reply; 6+ messages in thread From: Phil Sutter @ 2019-12-07 11:03 UTC (permalink / raw) To: Pablo Neira Ayuso; +Cc: netfilter-devel Hi Pablo, On Fri, Dec 06, 2019 at 08:39:38PM +0100, Pablo Neira Ayuso wrote: > On Fri, Dec 06, 2019 at 08:26:47PM +0100, Pablo Neira Ayuso wrote: > > On Thu, Dec 05, 2019 at 11:04:08PM +0100, Phil Sutter wrote: > > > Hi Pablo, > > > > > > On Thu, Dec 05, 2019 at 07:07:06PM +0100, Pablo Neira Ayuso wrote: > > > > The existing rbtree implementation might store consecutive elements > > > > where the closing element and the opening element might overlap, eg. > > > > > > > > [ a, a+1) [ a+1, a+2) > > > > > > > > This patch removes the optimization for non-anonymous sets in the exact > > > > matching case, where it is assumed to stop searching in case that the > > > > closing element is found. Instead, invalidate candidate interval and > > > > keep looking further in the tree. > > > > > > > > This patch fixes the lookup and get operations. > > > > > > I didn't get what the actual problem is? > > > > The lookup/get results false, while there is an element in the rbtree. > > Moreover, the get operation returns true as if a+2 would be in the > > tree. This happens with named sets after several set updates, I could > > reproduce the issue with several elements mixed with insertion and > > deletions in one batch. > > To extend the problem description: The issue is that the existing > lookup optimization (that only works for the anonymous sets) might not > reach the opening [ a+1, ... element if the closing ... , a+1) is > found in first place when walking over the rbtree. Hence, walking the > full tree in that case is needed. Ah! Thanks a lot for your elaborations. It was really hard to grasp what all this is about from the initial commit message. :) Sometimes I wonder if we should do more set optimizations under the hood when adding elements. Right now, we don't touch existing ones although it would make sense. And we could be more intelligent for example if a set contains 20-30 and a user adds 25-35. Cheers, Phil ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets 2019-12-07 11:03 ` Phil Sutter @ 2019-12-07 18:38 ` Pablo Neira Ayuso 0 siblings, 0 replies; 6+ messages in thread From: Pablo Neira Ayuso @ 2019-12-07 18:38 UTC (permalink / raw) To: Phil Sutter, netfilter-devel On Sat, Dec 07, 2019 at 12:03:07PM +0100, Phil Sutter wrote: > Hi Pablo, > > On Fri, Dec 06, 2019 at 08:39:38PM +0100, Pablo Neira Ayuso wrote: > > On Fri, Dec 06, 2019 at 08:26:47PM +0100, Pablo Neira Ayuso wrote: > > > On Thu, Dec 05, 2019 at 11:04:08PM +0100, Phil Sutter wrote: > > > > Hi Pablo, > > > > > > > > On Thu, Dec 05, 2019 at 07:07:06PM +0100, Pablo Neira Ayuso wrote: > > > > > The existing rbtree implementation might store consecutive elements > > > > > where the closing element and the opening element might overlap, eg. > > > > > > > > > > [ a, a+1) [ a+1, a+2) > > > > > > > > > > This patch removes the optimization for non-anonymous sets in the exact > > > > > matching case, where it is assumed to stop searching in case that the > > > > > closing element is found. Instead, invalidate candidate interval and > > > > > keep looking further in the tree. > > > > > > > > > > This patch fixes the lookup and get operations. > > > > > > > > I didn't get what the actual problem is? > > > > > > The lookup/get results false, while there is an element in the rbtree. > > > Moreover, the get operation returns true as if a+2 would be in the > > > tree. This happens with named sets after several set updates, I could > > > reproduce the issue with several elements mixed with insertion and > > > deletions in one batch. > > > > To extend the problem description: The issue is that the existing > > lookup optimization (that only works for the anonymous sets) might not > > reach the opening [ a+1, ... element if the closing ... , a+1) is > > found in first place when walking over the rbtree. Hence, walking the > > full tree in that case is needed. > > Ah! Thanks a lot for your elaborations. It was really hard to grasp what > all this is about from the initial commit message. :) I'll append this to the description before applying, no problem, thanks for asking. > Sometimes I wonder if we should do more set optimizations under the hood > when adding elements. Right now, we don't touch existing ones although > it would make sense. And we could be more intelligent for example if a > set contains 20-30 and a user adds 25-35. Yes, if 'automerge' is set on, then nft should start doing some more intelligent stuff. Basically the idea is: check if this interval overlaps with an existing one, if so transaction looks like this: [ delete 25-35 | add 20-35 ] With the existing transaction infrastructure in place, I think it should not be too hard. There's already a routine to check for overlaps. If this is a data mapping, this needs to be careful to not merge intervals that have different data on the right hand side. There's another problem, I think incremental userspace cache is still incomplete. I can probably scratch time and look into this, this might not happen before 2020 though. Thanks. ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2019-12-07 18:39 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2019-12-05 18:07 [PATCH nf] netfilter: nft_set_rbtree: bogus lookup/get on consecutive elements in named sets Pablo Neira Ayuso 2019-12-05 22:04 ` Phil Sutter 2019-12-06 19:26 ` Pablo Neira Ayuso 2019-12-06 19:39 ` Pablo Neira Ayuso 2019-12-07 11:03 ` Phil Sutter 2019-12-07 18:38 ` Pablo Neira Ayuso
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.