Netfilter-Devel Archive on lore.kernel.org
 help / color / Atom feed
* Moving from ipset to nftables: Sets not ready for prime time yet?
@ 2020-07-02 22:30 Timo Sigurdsson
  2020-07-03  9:28 ` Stefano Brivio
  2020-07-30 19:27 ` Pablo Neira Ayuso
  0 siblings, 2 replies; 6+ messages in thread
From: Timo Sigurdsson @ 2020-07-02 22:30 UTC (permalink / raw)
  To: netfilter-devel

Hi,

I'm currently migrating my various iptables/ipset setups to nftables. The nftables syntax is a pleasure and for the most part the transition of my rulesets has been smooth. Moving my ipsets to nftables sets, however, has proven to be a major pain point - to a degree where I started wondering whether nftables sets are actually ready to replace existing ipset workflows yet.

Before I go into the various issues I encountered with nftables sets, let me briefly explain what my ipset workflow looked like. On gateways that forward traffic, I use ipsets for blacklisting. I fetch blacklists from various sources regularly, convert them to files that can be loaded with `ipset restore', load them into a new ipset and then replace the old ipset with the new one with `ipset swap`. Since some of my blacklists may contain the same addresses or ranges, I use ipsets' -exist switch when loading multiple blacklists into one ipset. This approach has worked for me for quite some time.

Now, let's get to the issues I encountered:

1) Auto-merge issues
Initially, I intended to use the auto-merge feature as a means of dealing with duplicate addresses in the various source lists I use. The first issue I encountered was that it's currently not possible to add an element to a set if it already exists in the set or is part or an interval in the set, despite the auto-merge flag set. This has been reported already by someone else [1] and the only workaround seems to be to add all addresses at once (within one 'add element' statement).

Another issue I stumbled upon was that auto-merge may actually generate wrong/incomplete intervals if you have multiple 'add element' statements within an nftables script file. I consider this a serious issue if you can't be sure whether the addresses or intervals you add to a set actually end up in the set. I reported this here [2]. The workaround for it is - again - to add all addresses in a single statement.

The third auto-merge issue I encountered is another one that has been reported already by someone else [3]. It is that the auto-merge flag actually makes it impossible to update the set atomically. Oh, well, let's abandon auto-merge altogether for now...
 
2) Atomic reload of large sets unbearably slow
Moving on without the auto-merge feature, I started testing sets with actual lists I use. The initial setup (meaning populating the sets for the first time) went fine. But when I tried to update them atomically, i.e. use a script file that would have a 'flush set' statement in the beginning and then an 'add element' statement with all the addresses I wanted to add to it, the system seemed to lock up. As it turns out, updating existing large sets is excessively slow - to a point where it becomes unusable if you work with multiple large sets. I reported the details including an example and performance indicators here [4]. The only workaround for this (that keeps atomicity) I found so far is to reload the complete firewall configuration including the set definitions. But that has other unwanted side-effects such as resetting all counters and so on.

3) Referencing sets within a set not possible
As a workaround for the auto-merge issues described above (and also for another use case), I was looking into the possibility to reference sets within a set so I could create a set for each source list I use and reference them in a single set so I could match them all at once without duplicating rules for multiple sets. To be clear, I'm not really sure whether this is supposed to work all. I found some commits which suggested to me it might be possible [5][6]. Nevertheless, I couldn't get this to work.

Summing up:
Well, that's quite a number of issues to run into as an nftables newbie. I wouldn't have expected this at all. And frankly, I actually converted my rules first and thought adjusting my scripts around ipset to achieve the same with nftables sets would be straightforward and simple... Maybe my approach or understanding of nftables is wrong. But I don't think that the use case is that extraordinary that it should be that difficult.

In any case, if anyone has any tips or workarounds to speed up the atomic reload of large sets, I'd be happy to hear (or read) them. Same goes for referencing sets within sets. If this should be possible to do, I'd appreaciate any hints to the correct syntax to do so.
Are there better approaches to deal with large sets regularly updated from various sources?


Cheers,

Timo


[1] https://www.spinics.net/lists/netfilter/msg58937.html
[2] https://bugzilla.netfilter.org/show_bug.cgi?id=1438
[3] https://bugzilla.netfilter.org/show_bug.cgi?id=1404
[4] https://bugzilla.netfilter.org/show_bug.cgi?id=1439
[5] http://git.netfilter.org/nftables/commit/?h=v0.9.0&id=a6b75b837f5e851c80f8f2dc508b11f1693af1b3
[6] http://git.netfilter.org/nftables/commit/?h=v0.9.0&id=bada2f9c182dddf72a6d3b7b00c9eace7eb596c3

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Moving from ipset to nftables: Sets not ready for prime time yet?
  2020-07-02 22:30 Moving from ipset to nftables: Sets not ready for prime time yet? Timo Sigurdsson
@ 2020-07-03  9:28 ` Stefano Brivio
  2020-07-03 10:24   ` Jozsef Kadlecsik
  2020-07-03 14:03   ` Timo Sigurdsson
  2020-07-30 19:27 ` Pablo Neira Ayuso
  1 sibling, 2 replies; 6+ messages in thread
From: Stefano Brivio @ 2020-07-03  9:28 UTC (permalink / raw)
  To: Timo Sigurdsson; +Cc: netfilter-devel, Phil Sutter, Jozsef Kadlecsik

Hi Timo,

On Fri,  3 Jul 2020 00:30:10 +0200 (CEST)
"Timo Sigurdsson" <public_timo.s@silentcreek.de> wrote:

> Another issue I stumbled upon was that auto-merge may actually
> generate wrong/incomplete intervals if you have multiple 'add
> element' statements within an nftables script file. I consider this a
> serious issue if you can't be sure whether the addresses or intervals
> you add to a set actually end up in the set. I reported this here
> [2]. The workaround for it is - again - to add all addresses in a
> single statement.

Practically speaking I think it's a bug, but I can't find a formal,
complete definition of automerge, so one can also say it "adds items up
to and including the first conflicting one", and there you go, it's
working as intended.

In general, when we discussed this "automerge" feature for
multi-dimensional sets in nftables (not your case, but I aimed at
consistency), I thought it was a mistake to introduce it altogether,
because it's hard to define it and whatever definition one comes up
with might not match what some users think. Consider this example:

# ipset create s hash:net,net
# ipset add s 10.0.1.1/30,192.168.1.1/24
# ipset add s 10.0.0.1/24,172.16.0.1
# ipset list s
[...]
Members:
10.0.1.0/30,192.168.1.0/24
10.0.0.0/24,172.16.0.1

good, ipset has no notion of automerge, so it won't try to do anything
bad here: the set of address pairs denoted by <10.0.1.1/30,
192.168.1.1/24> is disjoint from the set of address pairs denoted by
<10.0.0.1/24, 172.16.0.1>. Then:

# ipset add s 10.0.0.1/16,192.168.0.0/16
# ipset list s
[...]
Members:
10.0.1.0/30,192.168.1.0/24
10.0.0.0/16,192.168.0.0/16
10.0.0.0/24,172.16.0.1

and, as expected with ipset, we have entirely overlapping entries added
to the set. Is that a problem? Not really, ipset doesn't support maps,
so it doesn't matter which entry is actually matched.

# nft add table t
# nft add set t s '{ type ipv4_addr . ipv4_addr; flags interval ; }'
# nft add element t s '{ 10.0.1.1/30 . 192.168.1.1/24 }'
# nft add element t s '{ 10.0.0.1/24 . 172.16.0.1 }'
# nft add element t s '{ 10.0.0.1/16 . 192.168.0.0/16 }'
# nft list ruleset
table ip t {
	set s {
		type ipv4_addr . ipv4_addr
		flags interval
		elements = { 10.0.1.0/30 . 192.168.1.0/24,
			     10.0.0.0/24 . 172.16.0.1,
			     10.0.0.0/16 . 192.168.0.0/16 }
	}
}

also fine: the least generic entry is added first, so it matches first.
Let's try to reorder the insertions:

# nft add element t s '{ 10.0.0.1/16 . 192.168.0.0/16 }'
# nft add element t s '{ 10.0.0.1/24 . 172.16.0.1 }'
# nft add element t s '{ 10.0.1.1/30 . 192.168.1.1/24 }'
Error: Could not process rule: File exists
add element t s { 10.0.1.1/30 . 192.168.1.1/24 }
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

...because that entry would never match anything: it's inserted after a
more generic one that already covers it completely, and we'd like to
tell the user that it doesn't make sense.

Now, this is pretty much the only advantage of not allowing overlaps:
telling the user that some insertion doesn't make sense, and thus it
was probably not what the user wanted to do.

So... I wouldn't know how deal with your use case, even in theory, in a
consistent way. Should we rather introduce a flag that allows any type
of overlapping (default with ipset), which is a way for the user to
tell us they don't actually care about entries not having any effect?

And, in that case, would you expect the entry to be listed in the
resulting set, in case of full overlap (where one set is a subset, not
necessarily proper, of the other one)?

> [...]
>
> Summing up:
> Well, that's quite a number of issues to run into as an nftables
> newbie. I wouldn't have expected this at all. And frankly, I actually
> converted my rules first and thought adjusting my scripts around
> ipset to achieve the same with nftables sets would be straightforward
> and simple... Maybe my approach or understanding of nftables is
> wrong. But I don't think that the use case is that extraordinary that
> it should be that difficult.

I don't think so either, still I kind of expect to see the issues you
report as these features seem to start being heavily used just recently.

And I (maybe optimistically) think that all we need to iron out the
most apparent issues on the subject is a few reports like yours, so
thanks for sharing it.

-- 
Stefano


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Moving from ipset to nftables: Sets not ready for prime time yet?
  2020-07-03  9:28 ` Stefano Brivio
@ 2020-07-03 10:24   ` Jozsef Kadlecsik
  2020-07-03 13:38     ` Stefano Brivio
  2020-07-03 14:03   ` Timo Sigurdsson
  1 sibling, 1 reply; 6+ messages in thread
From: Jozsef Kadlecsik @ 2020-07-03 10:24 UTC (permalink / raw)
  To: Stefano Brivio; +Cc: Timo Sigurdsson, netfilter-devel, Phil Sutter

Hi Stefano,

On Fri, 3 Jul 2020, Stefano Brivio wrote:

> On Fri,  3 Jul 2020 00:30:10 +0200 (CEST)
> "Timo Sigurdsson" <public_timo.s@silentcreek.de> wrote:
> 
> > Another issue I stumbled upon was that auto-merge may actually
> > generate wrong/incomplete intervals if you have multiple 'add
> > element' statements within an nftables script file. I consider this a
> > serious issue if you can't be sure whether the addresses or intervals
> > you add to a set actually end up in the set. I reported this here
> > [2]. The workaround for it is - again - to add all addresses in a
> > single statement.
> 
> Practically speaking I think it's a bug, but I can't find a formal,
> complete definition of automerge, so one can also say it "adds items up
> to and including the first conflicting one", and there you go, it's
> working as intended.
> 
> In general, when we discussed this "automerge" feature for
> multi-dimensional sets in nftables (not your case, but I aimed at
> consistency), I thought it was a mistake to introduce it altogether,
> because it's hard to define it and whatever definition one comes up
> with might not match what some users think. Consider this example:
> 
> # ipset create s hash:net,net
> # ipset add s 10.0.1.1/30,192.168.1.1/24
> # ipset add s 10.0.0.1/24,172.16.0.1
> # ipset list s
> [...]
> Members:
> 10.0.1.0/30,192.168.1.0/24
> 10.0.0.0/24,172.16.0.1
> 
> good, ipset has no notion of automerge, so it won't try to do anything
> bad here: the set of address pairs denoted by <10.0.1.1/30,
> 192.168.1.1/24> is disjoint from the set of address pairs denoted by
> <10.0.0.1/24, 172.16.0.1>. Then:
> 
> # ipset add s 10.0.0.1/16,192.168.0.0/16
> # ipset list s
> [...]
> Members:
> 10.0.1.0/30,192.168.1.0/24
> 10.0.0.0/16,192.168.0.0/16
> 10.0.0.0/24,172.16.0.1
> 
> and, as expected with ipset, we have entirely overlapping entries added
> to the set. Is that a problem? Not really, ipset doesn't support maps,
> so it doesn't matter which entry is actually matched.

Actually, the flags, extensions (nomatch, timeout, skbinfo, etc.) in ipset 
are some kind of mappings and do matter which entry is matched and which 
flags, extensions are applied to the matching packets.

Therefore the matching in the net kind of sets follow a strict ordering: 
most specific match wins and in the case of multiple dimensions (like 
net,net above) it goes from left to right to find the best most specific 
match.

> # nft add table t
> # nft add set t s '{ type ipv4_addr . ipv4_addr; flags interval ; }'
> # nft add element t s '{ 10.0.1.1/30 . 192.168.1.1/24 }'
> # nft add element t s '{ 10.0.0.1/24 . 172.16.0.1 }'
> # nft add element t s '{ 10.0.0.1/16 . 192.168.0.0/16 }'
> # nft list ruleset
> table ip t {
> 	set s {
> 		type ipv4_addr . ipv4_addr
> 		flags interval
> 		elements = { 10.0.1.0/30 . 192.168.1.0/24,
> 			     10.0.0.0/24 . 172.16.0.1,
> 			     10.0.0.0/16 . 192.168.0.0/16 }
> 	}
> }
> 
> also fine: the least generic entry is added first, so it matches first.
> Let's try to reorder the insertions:
> 
> # nft add element t s '{ 10.0.0.1/16 . 192.168.0.0/16 }'
> # nft add element t s '{ 10.0.0.1/24 . 172.16.0.1 }'
> # nft add element t s '{ 10.0.1.1/30 . 192.168.1.1/24 }'
> Error: Could not process rule: File exists
> add element t s { 10.0.1.1/30 . 192.168.1.1/24 }
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> ...because that entry would never match anything: it's inserted after a
> more generic one that already covers it completely, and we'd like to
> tell the user that it doesn't make sense.

I think sets should not store information about which order the entries 
were added. That should totally be indifferent. The input of the sets may 
come from countless sources and if the order of adding the entries matters 
then a preordering is required, which is sometimes non-trivial.
 
> Now, this is pretty much the only advantage of not allowing overlaps:
> telling the user that some insertion doesn't make sense, and thus it
> was probably not what the user wanted to do.

This makes also impossible to make exceptions in the sets in nftables - 
with the "nomatch" flag in ipset one can easily create exceptions in 
intentionally overlapping entries (in whatever deep nestings) in a single 
set. In practice it comes quite handy to say

ipset create access_to_servers hash:ip,port,net
ipset add access_to_servers your_ssh_server,22,x.y.z.0/24
ipset add access_to_servers your_ssh_server,22,x.y.z.32/27 nomatch
...

and exclude access to some parts of a given subnet.

However, the internals of the sets in nftables are totally different from 
ipset, so I'm pretty sure it's absolutely not trivial (and sometimes 
impossible) to provide exactly the same behaviour.

> So... I wouldn't know how deal with your use case, even in theory, in a
> consistent way. Should we rather introduce a flag that allows any type
> of overlapping (default with ipset), which is a way for the user to
> tell us they don't actually care about entries not having any effect?
> 
> And, in that case, would you expect the entry to be listed in the
> resulting set, in case of full overlap (where one set is a subset, not
> necessarily proper, of the other one)?
> 
> > [...]
> >
> > Summing up:
> > Well, that's quite a number of issues to run into as an nftables
> > newbie. I wouldn't have expected this at all. And frankly, I actually
> > converted my rules first and thought adjusting my scripts around
> > ipset to achieve the same with nftables sets would be straightforward
> > and simple... Maybe my approach or understanding of nftables is
> > wrong. But I don't think that the use case is that extraordinary that
> > it should be that difficult.
> 
> I don't think so either, still I kind of expect to see the issues you
> report as these features seem to start being heavily used just recently.
> 
> And I (maybe optimistically) think that all we need to iron out the
> most apparent issues on the subject is a few reports like yours, so
> thanks for sharing it.

Best regards,
Jozsef
-
E-mail  : kadlec@blackhole.kfki.hu, kadlecsik.jozsef@wigner.hu
PGP key : https://wigner.hu/~kadlec/pgp_public_key.txt
Address : Wigner Research Centre for Physics
          H-1525 Budapest 114, POB. 49, Hungary

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Moving from ipset to nftables: Sets not ready for prime time yet?
  2020-07-03 10:24   ` Jozsef Kadlecsik
@ 2020-07-03 13:38     ` Stefano Brivio
  0 siblings, 0 replies; 6+ messages in thread
From: Stefano Brivio @ 2020-07-03 13:38 UTC (permalink / raw)
  To: Jozsef Kadlecsik; +Cc: Timo Sigurdsson, netfilter-devel, Phil Sutter

Hi József,

On Fri, 3 Jul 2020 12:24:03 +0200 (CEST)
Jozsef Kadlecsik <kadlec@netfilter.org> wrote:

> On Fri, 3 Jul 2020, Stefano Brivio wrote:
> 
> > On Fri,  3 Jul 2020 00:30:10 +0200 (CEST)
> > "Timo Sigurdsson" <public_timo.s@silentcreek.de> wrote:
> >   
> > > Another issue I stumbled upon was that auto-merge may actually
> > > generate wrong/incomplete intervals if you have multiple 'add
> > > element' statements within an nftables script file. I consider this a
> > > serious issue if you can't be sure whether the addresses or intervals
> > > you add to a set actually end up in the set. I reported this here
> > > [2]. The workaround for it is - again - to add all addresses in a
> > > single statement.  
> > 
> > Practically speaking I think it's a bug, but I can't find a formal,
> > complete definition of automerge, so one can also say it "adds items up
> > to and including the first conflicting one", and there you go, it's
> > working as intended.
> > 
> > In general, when we discussed this "automerge" feature for
> > multi-dimensional sets in nftables (not your case, but I aimed at
> > consistency), I thought it was a mistake to introduce it altogether,
> > because it's hard to define it and whatever definition one comes up
> > with might not match what some users think. Consider this example:
> > 
> > # ipset create s hash:net,net
> > # ipset add s 10.0.1.1/30,192.168.1.1/24
> > # ipset add s 10.0.0.1/24,172.16.0.1
> > # ipset list s
> > [...]
> > Members:
> > 10.0.1.0/30,192.168.1.0/24
> > 10.0.0.0/24,172.16.0.1
> > 
> > good, ipset has no notion of automerge, so it won't try to do anything
> > bad here: the set of address pairs denoted by <10.0.1.1/30,  
> > 192.168.1.1/24> is disjoint from the set of address pairs denoted by  
> > <10.0.0.1/24, 172.16.0.1>. Then:
> > 
> > # ipset add s 10.0.0.1/16,192.168.0.0/16
> > # ipset list s
> > [...]
> > Members:
> > 10.0.1.0/30,192.168.1.0/24
> > 10.0.0.0/16,192.168.0.0/16
> > 10.0.0.0/24,172.16.0.1
> > 
> > and, as expected with ipset, we have entirely overlapping entries added
> > to the set. Is that a problem? Not really, ipset doesn't support maps,
> > so it doesn't matter which entry is actually matched.  
> 
> Actually, the flags, extensions (nomatch, timeout, skbinfo, etc.) in ipset 
> are some kind of mappings and do matter which entry is matched and which 
> flags, extensions are applied to the matching packets.

Oh, I didn't consider that.

> Therefore the matching in the net kind of sets follow a strict ordering: 
> most specific match wins and in the case of multiple dimensions (like 
> net,net above) it goes from left to right to find the best most specific 
> match.

And I didn't know about this either. Well, this looks a bit arbitrary
to me, also because there's no such thing as hash:port,net, so forcing
the left-to-right precedence won't cover all the possible cases anyway.

In nftables, as sets now support an arbitrary number of dimensions, in
an arbitrary order, that would require an explicit evaluation ordering,
which is actually not too hard to implement. I just doubt the usage
would be practical.

> > # nft add table t
> > # nft add set t s '{ type ipv4_addr . ipv4_addr; flags interval ; }'
> > # nft add element t s '{ 10.0.1.1/30 . 192.168.1.1/24 }'
> > # nft add element t s '{ 10.0.0.1/24 . 172.16.0.1 }'
> > # nft add element t s '{ 10.0.0.1/16 . 192.168.0.0/16 }'
> > # nft list ruleset
> > table ip t {
> > 	set s {
> > 		type ipv4_addr . ipv4_addr
> > 		flags interval
> > 		elements = { 10.0.1.0/30 . 192.168.1.0/24,
> > 			     10.0.0.0/24 . 172.16.0.1,
> > 			     10.0.0.0/16 . 192.168.0.0/16 }
> > 	}
> > }
> > 
> > also fine: the least generic entry is added first, so it matches first.
> > Let's try to reorder the insertions:
> > 
> > # nft add element t s '{ 10.0.0.1/16 . 192.168.0.0/16 }'
> > # nft add element t s '{ 10.0.0.1/24 . 172.16.0.1 }'
> > # nft add element t s '{ 10.0.1.1/30 . 192.168.1.1/24 }'
> > Error: Could not process rule: File exists
> > add element t s { 10.0.1.1/30 . 192.168.1.1/24 }
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > 
> > ...because that entry would never match anything: it's inserted after a
> > more generic one that already covers it completely, and we'd like to
> > tell the user that it doesn't make sense.  
> 
> I think sets should not store information about which order the entries 
> were added. That should totally be indifferent. The input of the sets may 
> come from countless sources and if the order of adding the entries matters 
> then a preordering is required, which is sometimes non-trivial.

As it comes for free, I think it's nice to leave this possibility open
for simple combinations. It doesn't introduce any ambiguity. It's not
an usage I would recommend anyway, but I don't see the harm.

> > Now, this is pretty much the only advantage of not allowing overlaps:
> > telling the user that some insertion doesn't make sense, and thus it
> > was probably not what the user wanted to do.  
> 
> This makes also impossible to make exceptions in the sets in nftables - 
> with the "nomatch" flag in ipset one can easily create exceptions in 
> intentionally overlapping entries (in whatever deep nestings) in a single 
> set. In practice it comes quite handy to say
> 
> ipset create access_to_servers hash:ip,port,net
> ipset add access_to_servers your_ssh_server,22,x.y.z.0/24
> ipset add access_to_servers your_ssh_server,22,x.y.z.32/27 nomatch
> ...
> 
> and exclude access to some parts of a given subnet.
> 
> However, the internals of the sets in nftables are totally different from 
> ipset, so I'm pretty sure it's absolutely not trivial (and sometimes 
> impossible) to provide exactly the same behaviour.

It's actually kind of trivial for nft_set_pipapo, for nft_set_hash it
doesn't apply (it doesn't implement intervals), and I'm not sure about
nft_set_rbtree right now.

However, does this really provide any value compared to having a
separate set for exceptions matched earlier in a chain?

If it really does, I think it could and should be done in userspace by
splitting the intervals. The kernel back-ends shouldn't be overloaded
with complexity that doesn't *need* to live there, and no matter what,
this is going to have a performance impact on the lookup (it should be
doable to avoid an explicit branch for this, but we can't avoid
fetching more bits per element).

Ideally, I would even like to drop the need for timeout and validity
checks as part of the lookup, because they are quite heavy (fetching
the 'extension' pointer, branches, etc.). It involves some internal API
refactoring and is actually on my motionless to-do list, but too far
from the surface to have any practical value.

-- 
Stefano


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Moving from ipset to nftables: Sets not ready for prime time yet?
  2020-07-03  9:28 ` Stefano Brivio
  2020-07-03 10:24   ` Jozsef Kadlecsik
@ 2020-07-03 14:03   ` Timo Sigurdsson
  1 sibling, 0 replies; 6+ messages in thread
From: Timo Sigurdsson @ 2020-07-03 14:03 UTC (permalink / raw)
  To: sbrivio; +Cc: netfilter-devel, phil, kadlec

Hi Stefano,

Stefano Brivio schrieb am 03.07.2020 11:28 (GMT +02:00):

> On Fri,  3 Jul 2020 00:30:10 +0200 (CEST)
> "Timo Sigurdsson" <public_timo.s@silentcreek.de> wrote:
> 
>> Another issue I stumbled upon was that auto-merge may actually
>> generate wrong/incomplete intervals if you have multiple 'add
>> element' statements within an nftables script file. I consider this a
>> serious issue if you can't be sure whether the addresses or intervals
>> you add to a set actually end up in the set. I reported this here
>> [2]. The workaround for it is - again - to add all addresses in a
>> single statement.
>> ...
>> [2] https://bugzilla.netfilter.org/show_bug.cgi?id=1438
> 
> Practically speaking I think it's a bug, but I can't find a formal,
> complete definition of automerge, so one can also say it "adds items up
> to and including the first conflicting one", and there you go, it's
> working as intended.

Actually, I think it's a bug regardless of how auto-merge is defined exactly. Simply because I don't think that this
  add element family table myset { A }
  add element family table myset { B }
should give a different result compared to this
  add element family table myset { A, B }

But that's basically what's happening in my example.

> In general, when we discussed this "automerge" feature for
> multi-dimensional sets in nftables (not your case, but I aimed at
> consistency), I thought it was a mistake to introduce it altogether,
> because it's hard to define it and whatever definition one comes up
> with might not match what some users think. 

I understand that depending on the use case, one may have different expectations and that merging entries may cause issues. One example I read before was about adding a single IP address to a set which already contains the full /24 interval. If the addition would be simply ignored, you either wouldn't be able to delete the entry again or you'd have to break up the interval when doing so. So, I understand that it's impossible to make everybody happy.

> Let's try to reorder the insertions:
> 
> # nft add element t s '{ 10.0.0.1/16 . 192.168.0.0/16 }'
> # nft add element t s '{ 10.0.0.1/24 . 172.16.0.1 }'
> # nft add element t s '{ 10.0.1.1/30 . 192.168.1.1/24 }'
> Error: Could not process rule: File exists
> add element t s { 10.0.1.1/30 . 192.168.1.1/24 }
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> 
> ...because that entry would never match anything: it's inserted after a
> more generic one that already covers it completely, and we'd like to
> tell the user that it doesn't make sense.
> 
> Now, this is pretty much the only advantage of not allowing overlaps:
> telling the user that some insertion doesn't make sense, and thus it
> was probably not what the user wanted to do.

Two thoughts here:
1) If there is a real conflict, say in a verdict map with diverging entries, then sure, refusing to accept the conflicting entries is certainly useful. But if there is an overlap that bears no meaning other than you'd either have an additional entry in your set that would never be matched, then I think there should be a way to just allow this or ignore the addition altogether.

2) Another problem here in practice is that nft doesn't emit a warning but fails entirely if you load a large set from a script with a duplicate entry (without auto-merge). You could avoid this by adding each element individually, but that's very slow for large sets, so not really an option either.
 
> So... I wouldn't know how deal with your use case, even in theory, in a
> consistent way. Should we rather introduce a flag that allows any type
> of overlapping (default with ipset), which is a way for the user to
> tell us they don't actually care about entries not having any effect?

Giving advice here is difficult, since I'm not in any position to make judgements about other use cases that might be affected by this. But, generally speaking, yes I think some option that makes overlaps ignorable like `ipset -exist' would be useful as well as good documentation what is to be expected from each flag/option, so the user at least knows about the limitations. And if there is a way to determine this programmatically, there could be a distinction along the lines what I described above between conflicting statements (leading to an error) and statements that simply overlap or are superfluous (leading to a simple warning unless a flag/option is used to allow and ignore them).

> And, in that case, would you expect the entry to be listed in the
> resulting set, in case of full overlap (where one set is a subset, not
> necessarily proper, of the other one)?

I would have expected the entries to be merged, so if I add 192.168.0.0/24 and then 192.168.0.2, I'd only expect to get 192.168.0.0/24 returned when listing the set. But even if both entries were added and the second one would never match, that would be fine for my use case. The problem is, when working with multiple blacklists, I cannot keep track of which entries are already added to the set, especially if one list may contain complete networks and another list contains individual IP addresses.

Thanks and regards,

Timo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: Moving from ipset to nftables: Sets not ready for prime time yet?
  2020-07-02 22:30 Moving from ipset to nftables: Sets not ready for prime time yet? Timo Sigurdsson
  2020-07-03  9:28 ` Stefano Brivio
@ 2020-07-30 19:27 ` Pablo Neira Ayuso
  1 sibling, 0 replies; 6+ messages in thread
From: Pablo Neira Ayuso @ 2020-07-30 19:27 UTC (permalink / raw)
  To: Timo Sigurdsson; +Cc: netfilter-devel

On Fri, Jul 03, 2020 at 12:30:10AM +0200, Timo Sigurdsson wrote:
> Hi,
> 
> I'm currently migrating my various iptables/ipset setups to nftables. The nftables syntax is a pleasure and for the most part the transition of my rulesets has been smooth. Moving my ipsets to nftables sets, however, has proven to be a major pain point - to a degree where I started wondering whether nftables sets are actually ready to replace existing ipset workflows yet.
[...]
> 2) Atomic reload of large sets unbearably slow
> Moving on without the auto-merge feature, I started testing sets with actual lists I use. The initial setup (meaning populating the sets for the first time) went fine. But when I tried to update them atomically, i.e. use a script file that would have a 'flush set' statement in the beginning and then an 'add element' statement with all the addresses I wanted to add to it, the system seemed to lock up. As it turns out, updating existing large sets is excessively slow - to a point where it becomes unusable if you work with multiple large sets. I reported the details including an example and performance indicators here [4]. The only workaround for this (that keeps atomicity) I found so far is to reload the complete firewall configuration including the set definitions. But that has other unwanted side-effects such as resetting all counters and so on.
> 
> 3) Referencing sets within a set not possible
> As a workaround for the auto-merge issues described above (and also for another use case), I was looking into the possibility to reference sets within a set so I could create a set for each source list I use and reference them in a single set so I could match them all at once without duplicating rules for multiple sets. To be clear, I'm not really sure whether this is supposed to work all. I found some commits which suggested to me it might be possible [5][6]. Nevertheless, I couldn't get this to work.

For the record, these two issues are now fixed in git.

Thank you for reporting.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, back to index

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-07-02 22:30 Moving from ipset to nftables: Sets not ready for prime time yet? Timo Sigurdsson
2020-07-03  9:28 ` Stefano Brivio
2020-07-03 10:24   ` Jozsef Kadlecsik
2020-07-03 13:38     ` Stefano Brivio
2020-07-03 14:03   ` Timo Sigurdsson
2020-07-30 19:27 ` Pablo Neira Ayuso

Netfilter-Devel Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/netfilter-devel/0 netfilter-devel/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 netfilter-devel netfilter-devel/ https://lore.kernel.org/netfilter-devel \
		netfilter-devel@vger.kernel.org
	public-inbox-index netfilter-devel

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.netfilter-devel


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git